machine-learning intelligence

Definition

Generalisation

Generalisation is the ability of a learning algorithm to perform accurately on new, previously unseen data sampled from the same distribution as the training data. It is the primary objective of model fitting: a model that merely memorises the training set without generalising is said to overfit.

Formally, generalisation performance is quantified by the gap between the empirical risk (training error) and the true risk (expected error on the full distribution):

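$$\text{Generalisation Error} = R(h) - R_{\text{emp}}(h)$$

where $R(h)$ is the true risk of a hypothesis $h$ and $R_{\text{emp}}(h)$ is its empirical risk on the training set.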

The goal of machine learning is to identify a hypothesis $h$ that minimises the true risk $R(h)$. Since the true distribution is unknown, the model is instead optimised on the empirical risk, and its generalisation capability is estimated using a disjoint test set that remains strictly unseen during the training and hyperparameter tuning phases.
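As an illustration, the following minimal sketch estimates the gap between training error and held-out error for a 1-nearest-neighbour classifier, which memorises its training set. Everything in it is an assumption made for the example rather than part of the definition: the synthetic data generator `make_data`, the 20% label-noise rate, the sample sizes, and the use of 0-1 loss. The test error is only an estimate of the true risk, since the true distribution is unknown in practice.

```python
import random


def make_data(n, rng, noise=0.2):
    """Sample n pairs (x, y): x is uniform on [0, 1], y = 1 if x > 0.5,
    and each label is flipped with probability `noise` (hypothetical setup)."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def error_rate(train, data):
    """Average 0-1 loss of the 1-NN rule (built from `train`) on `data`."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(200, rng)     # used to build the model
test = make_data(2000, rng)     # disjoint sample, never used for fitting

empirical_risk = error_rate(train, train)  # training error
test_risk = error_rate(train, test)        # estimate of the true risk
gap = test_risk - empirical_risk           # estimated generalisation error

print(f"empirical risk: {empirical_risk:.3f}")
print(f"test risk:      {test_risk:.3f}")
print(f"estimated gap:  {gap:.3f}")
```

Because each training point is its own nearest neighbour, the empirical risk here is zero, while the noisy labels keep the held-out error well above zero. This is the memorisation-without-generalisation pattern described above.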