machine-learning linear-algebra optimisation
Definition
Normal Equations
The normal equations provide a closed-form analytical solution for the optimal parameters that minimise the sum of squared errors in a linear regression task. Formally, given a design matrix $X$ and a target vector $y$, the optimal weight vector is:

$$w^* = (X X^\top)^{-1} X y$$

where $X \in \mathbb{R}^{d \times n}$ is the matrix of input features (with samples as columns) and $y \in \mathbb{R}^{n}$ is the target vector.
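As a minimal sketch of the computation, assuming NumPy and illustrative names (`X`, `y`, `w_star`): solving the linear system $X X^\top w = X y$ directly is numerically preferable to forming the inverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 3, 100                                 # features, samples
X = rng.normal(size=(d, n))                   # design matrix, samples as columns
w_true = np.array([2.0, -1.0, 0.5])           # illustrative ground-truth weights
y = X.T @ w_true + 0.1 * rng.normal(size=n)   # noisy targets

# Normal equations: (X X^T) w = X y.
# Solve the d x d linear system rather than inverting X X^T.
w_star = np.linalg.solve(X @ X.T, X @ y)

print(w_star)  # close to w_true
```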
Operational Properties
Exact Minimisation: Unlike iterative methods such as gradient descent, the normal equations provide the global minimum of the quadratic loss function in a single computational step, requiring no hyperparameters such as a learning rate (provided $X X^\top$ is invertible, i.e. the features are linearly independent).
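To see why the solution is exact, note that the loss is a convex quadratic, so the unique point where its gradient vanishes is the global minimum:

$$L(w) = \lVert X^\top w - y \rVert^2, \qquad \nabla_w L(w) = 2\,X(X^\top w - y) = 0 \;\Longrightarrow\; X X^\top w = X y$$

which is exactly the system solved by the formula above.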
Computational Complexity: The solution requires the inversion of the $d \times d$ matrix $X X^\top$, which has a complexity of $O(d^3)$. This makes the method highly efficient for low-dimensional feature spaces but computationally prohibitive as the number of features increases.
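The cubic cost applies to the $d \times d$ Gram matrix, not to the sample count; forming $X X^\top$ itself costs $O(d^2 n)$. A rough timing sketch (sizes here are purely illustrative, and actual runtimes depend on the BLAS implementation) makes the scaling in $d$ visible:

```python
import time
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # number of samples, held fixed

for d in (200, 400, 800, 1600):   # doubling the feature count each step
    X = rng.normal(size=(d, n))
    y = rng.normal(size=n)
    G = X @ X.T                   # Gram matrix, d x d, costs O(d^2 n) to form
    b = X @ y

    t0 = time.perf_counter()
    np.linalg.solve(G, b)         # O(d^3): roughly 8x slower per doubling of d
    print(f"d={d:5d}: solve in {time.perf_counter() - t0:.4f}s")
```

In practice, library routines such as `np.linalg.lstsq`, which rely on QR or SVD factorisations, are usually preferred over working with $X X^\top$ directly, both for numerical stability and because they handle rank-deficient designs.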