Introduction to the topic
This site offers interactive simulations for exploring how the parameters of two Gaussian distributions influence the Bayesian decision boundary. Depending on the case, this boundary can be a straight line, a conic (such as an ellipse), or a more complex shape. The aim is to provide a visual and intuitive understanding of these principles.
What is a decision boundary?
Adjusting the input values
- Mean: This sets the center of the distribution. Moving the mean shifts the entire distribution left or right along the number line.
- Variance: This controls the spread of the distribution. The larger the variance, the farther data tends to fall from the mean.
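As a rough illustration of these two inputs, the sketch below evaluates a one-dimensional Gaussian density for a few parameter choices. The function name `normal_pdf` and the sample values are illustrative assumptions, not part of the site's own code.

```python
import math

def normal_pdf(x, mean, variance):
    """Density of a one-dimensional Gaussian at x (illustrative helper)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

# Shifting the mean moves the peak without changing its height.
peak_at_0 = normal_pdf(0.0, mean=0.0, variance=1.0)
peak_at_3 = normal_pdf(3.0, mean=3.0, variance=1.0)

# Increasing the variance flattens the curve: the peak gets lower
# and points far from the mean become more likely.
wide_peak = normal_pdf(0.0, mean=0.0, variance=4.0)
wide_tail = normal_pdf(3.0, mean=0.0, variance=4.0)
narrow_tail = normal_pdf(3.0, mean=0.0, variance=1.0)
```

Here `peak_at_0` and `peak_at_3` are equal, `wide_peak` is lower than `peak_at_0`, and `wide_tail` is higher than `narrow_tail`.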
1) Case Σᵢ = σᵢ · I
Dataset 1 :
Dataset 2 :
2) Case σ₁ ≠ σ₂
Dataset 1 :
Dataset 2 :
Note: If you entered a negative value for sigma, it was automatically converted to its absolute (positive) value, as negative values are not valid.
Recall: The multivariate Gaussian distribution
The density of a Gaussian distribution of dimension \( D \) is given by:
\[ N(\mathbf{x} | \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp \left( -\frac{1}{2} (\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu) \right) \]
where
- \( \mu \) is the mean vector,
- \( \Sigma \) is the covariance matrix.
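The formula above can be evaluated directly. The following is a minimal sketch using NumPy, assuming \( \Sigma \) is symmetric and invertible; the function name `gaussian_pdf` is an illustrative choice, not something defined by this site.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density N(x | mu, sigma) of a D-dimensional Gaussian at point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Solve sigma @ y = diff instead of forming the inverse explicitly.
    mahalanobis_sq = diff @ np.linalg.solve(sigma, diff)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * mahalanobis_sq) / norm

# Standard 2D Gaussian evaluated at its mean.
value_at_mean = gaussian_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

For the standard two-dimensional Gaussian, the value at the mean is \( 1 / (2\pi) \), roughly 0.159.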
Shape of the decision boundary
The decision boundary separates the regions where one class is more probable than the other. Assuming equal prior probabilities for the two classes, it is the set of points where the probability densities of the two distributions are equal:
\[ N(\mathbf{x} \mid \mu_1, \Sigma_1) = N(\mathbf{x} \mid \mu_2, \Sigma_2) \]
This equation simplifies to a linear or quadratic equation in \( \mathbf{x} \), whose coefficients are determined by the inputs, i.e. the means and variances of the two Gaussian models. This website considers the following cases:
- If \( \Sigma_1 = \Sigma_2 \) (identical covariances): the boundary is a straight line.
- If \( \Sigma_1 \neq \Sigma_2 \) (different covariances): the boundary is a quadratic curve, such as a circle or an ellipse.
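To see where the two cases come from, take the logarithm of both sides of the density equality and expand. With symmetric covariance matrices this gives

\[ \mathbf{x}^T (\Sigma_1^{-1} - \Sigma_2^{-1}) \mathbf{x} - 2 (\Sigma_1^{-1} \mu_1 - \Sigma_2^{-1} \mu_2)^T \mathbf{x} + \mu_1^T \Sigma_1^{-1} \mu_1 - \mu_2^T \Sigma_2^{-1} \mu_2 + \ln \frac{|\Sigma_1|}{|\Sigma_2|} = 0 \]

The quadratic term vanishes exactly when \( \Sigma_1 = \Sigma_2 \), which leaves a linear equation. The sketch below computes these coefficients with NumPy. The function name `boundary_coefficients` and the symbols `A`, `b`, `c` are illustrative assumptions, and the code assumes equal class priors and invertible covariances.

```python
import numpy as np

def boundary_coefficients(mu1, sigma1, mu2, sigma2):
    """Return (A, b, c) with the boundary given by x^T A x + b^T x + c = 0.

    Assumes equal class priors and symmetric, invertible covariances.
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    sigma1 = np.asarray(sigma1, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    p1 = np.linalg.inv(sigma1)  # precision (inverse covariance) matrices
    p2 = np.linalg.inv(sigma2)
    A = p1 - p2                       # quadratic term
    b = -2.0 * (p1 @ mu1 - p2 @ mu2)  # linear term
    # slogdet returns (sign, log|det|); we keep only the log-determinant.
    logdet1 = np.linalg.slogdet(sigma1)[1]
    logdet2 = np.linalg.slogdet(sigma2)[1]
    c = mu1 @ p1 @ mu1 - mu2 @ p2 @ mu2 + logdet1 - logdet2
    return A, b, c

# Identical covariances: A vanishes, so the boundary is a line.
A_lin, b_lin, c_lin = boundary_coefficients([0, 0], np.eye(2), [2, 0], np.eye(2))

# Different isotropic covariances: A is a multiple of I, so the boundary is a circle.
A_circ, b_circ, c_circ = boundary_coefficients([0, 0], np.eye(2), [0, 0], 4 * np.eye(2))
```

In the first example the boundary is the vertical line through \( (1, 0) \), the perpendicular bisector of the segment joining the two means. In the second, the boundary is a circle centered on the shared mean.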