machine-learning linear-algebra statistics dimensionality-reduction
Definition
Principal Component Analysis
Principal Component Analysis (PCA) is a deterministic linear dimensionality reduction technique that identifies a lower-dimensional subspace capturing the maximum variance of the data. Formally, for a centred data matrix $X \in \mathbb{R}^{n \times d}$ with covariance matrix $S = \frac{1}{n} X^\top X$, PCA seeks a direction $u_1$ with $u_1^\top u_1 = 1$ that maximises the projected variance:
$$\max_{u_1} \; u_1^\top S u_1 \quad \text{subject to} \quad u_1^\top u_1 = 1.$$
The solution $u_1$ is the eigenvector corresponding to the largest eigenvalue $\lambda_1$ of the covariance matrix $S$.
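The following is a minimal NumPy sketch of this definition, assuming a small synthetic data set (the data, sizes, and variable names are illustrative only): centre the data, form the covariance matrix, and take the eigenvector of the largest eigenvalue as the first principal direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 200 x 3 data with correlated features (purely illustrative).
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.2]])

# Centre the data so that S = X^T X / n is the covariance matrix.
Xc = X - X.mean(axis=0)
S = (Xc.T @ Xc) / len(Xc)

# Eigendecomposition of the symmetric covariance matrix (ascending eigenvalues).
eigvals, eigvecs = np.linalg.eigh(S)
u1 = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue

print("largest eigenvalue :", eigvals[-1])
print("projected variance :", u1 @ S @ u1)   # matches the largest eigenvalue
```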
Optimisation: The Lagrangian Approach
The objective is to maximise $u_1^\top S u_1$ subject to the constraint $u_1^\top u_1 = 1$. Using the method of Lagrange multipliers, the Lagrangian is
$$\mathcal{L}(u_1, \lambda_1) = u_1^\top S u_1 + \lambda_1 \left(1 - u_1^\top u_1\right).$$
Taking the derivative with respect to $u_1$ and equating it to zero yields
$$\frac{\partial \mathcal{L}}{\partial u_1} = 2 S u_1 - 2 \lambda_1 u_1 = 0 \quad\Longrightarrow\quad S u_1 = \lambda_1 u_1,$$
so $u_1$ must be an eigenvector of $S$ with eigenvalue $\lambda_1$. Substituting this back into the variance term gives $u_1^\top S u_1 = \lambda_1 u_1^\top u_1 = \lambda_1$. Thus, the variance is maximised by selecting $u_1$ as the eigenvector associated with the largest eigenvalue $\lambda_1$.
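As a quick numerical check of this derivation, the sketch below (synthetic data; all sizes and names chosen for illustration) verifies that the top eigenvector satisfies $S u_1 = \lambda_1 u_1$, that its projected variance equals $\lambda_1$, and that no randomly drawn unit direction achieves a higher variance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic centred data with unequal feature scales (illustrative only).
X = rng.normal(size=(500, 4)) * np.array([3.0, 1.5, 0.7, 0.2])
Xc = X - X.mean(axis=0)
S = (Xc.T @ Xc) / len(Xc)

eigvals, eigvecs = np.linalg.eigh(S)
lam1, u1 = eigvals[-1], eigvecs[:, -1]

# Stationarity condition from the Lagrangian: S u1 = lambda_1 u1.
assert np.allclose(S @ u1, lam1 * u1)

# The variance attained at the optimum is exactly lambda_1 ...
assert np.isclose(u1 @ S @ u1, lam1)

# ... and no random unit direction does better (Rayleigh quotient is bounded by lambda_1).
dirs = rng.normal(size=(10_000, 4))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
assert (np.einsum("ij,jk,ik->i", dirs, S, dirs) <= lam1 + 1e-9).all()

print("all checks passed; lambda_1 =", lam1)
```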
Structural Properties
Orthogonality: The principal components (the eigenvectors of $S$, i.e. the directions onto which the data are projected) are mutually orthogonal and form a new orthonormal basis for the feature space.
Reconstruction Error: Maximising the variance is mathematically equivalent to minimising the mean squared reconstruction error between the original data and its projection onto the principal subspace (illustrated in the sketch after these properties).
Linearity: As a linear method, PCA identifies subspaces (planes or hyperplanes). It may fail to capture non-linear structures, for which t-SNE or spectral clustering are more appropriate.
Numerical Stability: The quality of the principal components depends on the distribution of the eigenvalues. If the eigenvalues are roughly equal ($\lambda_1 \approx \lambda_2 \approx \dots \approx \lambda_d$), no single direction dominates the variance. In such cases, the principal components become unstable, as many different orthogonal directions capture approximately the same amount of information, making the model highly sensitive to small perturbations.
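To illustrate the reconstruction-error property, the following sketch (synthetic data; the dimensionality and the choice of $k = 2$ retained components are assumptions made for the example) checks that the mean squared reconstruction error of the rank-$k$ projection equals the sum of the discarded eigenvalues, which is exactly why retaining the highest-variance directions minimises the error.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data with 5 features of decreasing scale (illustrative only).
X = rng.normal(size=(1000, 5)) * np.array([4.0, 2.0, 1.0, 0.5, 0.1])
Xc = X - X.mean(axis=0)
S = (Xc.T @ Xc) / len(Xc)

eigvals, eigvecs = np.linalg.eigh(S)   # ascending eigenvalues
k = 2
U = eigvecs[:, -k:]                    # top-k principal directions

# Project onto the k-dimensional principal subspace and reconstruct.
X_rec = (Xc @ U) @ U.T
mse = np.mean(np.sum((Xc - X_rec) ** 2, axis=1))

# The mean squared reconstruction error equals the sum of the discarded eigenvalues,
# so keeping the largest-variance directions minimises it.
print(mse, eigvals[:-k].sum())         # numerically identical
```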