Tags: machine-learning, optimisation, calculus
Gradient Descent

Definition
Gradient descent is a first-order iterative optimisation algorithm used to find a local minimum of a differentiable function. In the context of model training, it minimises a loss function $L(\theta)$ by updating the parameters $\theta$ in the direction of steepest descent, i.e. the negative gradient.
The update rule at iteration $t$ is:

$$\theta_{t+1} = \theta_t - \eta \, \nabla L(\theta_t)$$

where $\eta > 0$ is the learning rate.
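A minimal sketch of this update, assuming a NumPy-style parameter vector; the names (`gradient_descent`, `grad`, `theta0`, `eta`, `n_steps`) and the quadratic example loss are purely illustrative, not taken from any specific library:

```python
import numpy as np

def gradient_descent(grad, theta0, eta=0.1, n_steps=100):
    """Plain gradient descent: repeatedly step against the gradient.

    grad    -- function returning the gradient of the loss at theta
    theta0  -- initial parameter vector
    eta     -- learning rate (step size)
    n_steps -- number of update iterations
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        # theta_{t+1} = theta_t - eta * grad L(theta_t)
        theta = theta - eta * grad(theta)
    return theta

# Illustrative example: minimise L(theta) = ||theta - 3||^2, whose gradient is 2 * (theta - 3).
theta_star = gradient_descent(lambda th: 2.0 * (th - 3.0), theta0=[0.0], eta=0.1, n_steps=200)
print(theta_star)  # approaches [3.0], the minimiser of this loss
```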
Convergence Properties
Learning Rate Impact: The learning rate $\eta$ (step size) is a critical hyperparameter; values that are too large may cause divergence, while values that are too small lead to slow convergence (see the sketch after this list).
Local Optima: For non-convex objective functions, the algorithm may converge to a local minimum or a saddle point rather than the global minimum.
Guarantees: For convex $L$-smooth functions, convergence to the global minimum is guaranteed provided the learning rate is sufficiently small (e.g. $\eta \le 1/L$).
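A small sketch illustrating the learning-rate trade-off and the $\eta \le 1/L$ condition on a one-dimensional convex quadratic with known smoothness constant; the function names and numeric values are illustrative only:

```python
def run_gd(grad, theta0, eta, n_steps=50):
    """Run plain gradient descent and return the final iterate."""
    theta = theta0
    for _ in range(n_steps):
        theta = theta - eta * grad(theta)
    return theta

# Convex quadratic L(theta) = 0.5 * L_smooth * theta**2, with gradient L_smooth * theta.
# Its smoothness constant is L_smooth, so eta <= 1 / L_smooth guarantees convergence.
L_smooth = 10.0
grad = lambda theta: L_smooth * theta

for eta in (0.21, 0.001, 1.0 / L_smooth):  # too large, too small, within the guarantee
    final = run_gd(grad, theta0=1.0, eta=eta)
    print(f"eta={eta:.3f}: final theta = {final:.3e}")

# eta=0.21   -> the iterates oscillate and grow, since |1 - eta * L_smooth| > 1 (divergence);
# eta=0.001  -> the iterates shrink toward 0 very slowly (slow convergence);
# eta=1/L    -> reaches the minimiser theta = 0 in a single step (specific to this quadratic).
```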