machine-learning linear-algebra optimisation

Definition

Normal Equations

The normal equations provide a closed-form analytical solution for the parameters that minimise the sum of squared errors in a linear regression task. Formally, given a design matrix $X \in \mathbb{R}^{n \times d}$ and a target vector $\mathbf{y} \in \mathbb{R}^{n}$, the optimal weight vector $\mathbf{w}^{*}$ is:

$$\mathbf{w}^{*} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}$$

where $X$ is the matrix of input features (with samples as rows and features as columns) and $\mathbf{y}$ is the target vector. The name comes from the underlying linear system $X^{\top} X \mathbf{w} = X^{\top} \mathbf{y}$, obtained by setting the gradient of the squared-error loss $\lVert X\mathbf{w} - \mathbf{y} \rVert^{2}$ to zero.
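As a minimal illustration, the following NumPy sketch solves a small synthetic regression problem via the normal equations; the dimensions, seed, and weights are arbitrary choices for the example. It solves the linear system $X^{\top} X \mathbf{w} = X^{\top} \mathbf{y}$ directly rather than forming the explicit inverse, which is the numerically preferable route.

```python
import numpy as np

# Hypothetical synthetic problem: n = 100 samples, d = 3 features.
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))                 # design matrix, samples as rows
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)   # targets with small noise

# Normal equations: solve X^T X w = X^T y instead of inverting X^T X.
w_star = np.linalg.solve(X.T @ X, X.T @ y)
print(w_star)  # should land close to w_true
```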

Operational Properties

Exact Minimisation: Unlike iterative methods such as gradient descent, the normal equations yield the global minimum of the quadratic loss in a single computational step (provided $X^{\top} X$ is invertible), with no hyperparameters such as a learning rate to tune.
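A short sketch of the contrast, under the same synthetic setup as above (the learning rate and iteration budget are illustrative, not tuned): the closed form is a single linear solve, while gradient descent approaches the same minimiser iteratively.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 3
X = rng.normal(size=(n, d))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

# Closed form: one linear solve, no hyperparameters.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the same quadratic loss: needs a learning rate
# and an iteration budget, and only converges towards the minimum.
w_gd = np.zeros(d)
lr = 0.01                                # illustrative learning rate
for _ in range(5000):
    grad = X.T @ (X @ w_gd - y) / n      # gradient of the mean squared error
    w_gd -= lr * grad

print(np.allclose(w_closed, w_gd, atol=1e-4))  # True once GD has converged
```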

Computational Complexity: The solution requires forming and inverting the $d \times d$ matrix $X^{\top} X$, where $d$ is the number of features; the inversion (or an equivalent linear solve) costs $\mathcal{O}(d^{3})$, and forming the matrix itself costs $\mathcal{O}(n d^{2})$ for $n$ samples. This makes the method highly efficient for low-dimensional feature spaces but computationally prohibitive as the number of features increases.
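As a rough empirical check of the cubic term (a sketch only; absolute timings depend on hardware and the underlying BLAS), the following times just the $d \times d$ solve, whose cost should grow roughly eightfold each time $d$ doubles:

```python
import time
import numpy as np

rng = np.random.default_rng(2)
n = 2000
for d in (250, 500, 1000):
    X = rng.normal(size=(n, d))
    y = rng.normal(size=n)
    A = X.T @ X                    # Gram matrix: O(n d^2) to form
    b = X.T @ y
    t0 = time.perf_counter()
    np.linalg.solve(A, b)          # the O(d^3) step
    print(f"d={d}: {time.perf_counter() - t0:.4f} s")
```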