machine-learning statistics

Definition

Empirical Risk

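Empirical risk, denoted $\hat{R}_S(h)$, is the average loss of a hypothesis $h$ over a finite training dataset $S = \{(x_i, y_i)\}_{i=1}^{n}$. It serves as a computable proxy for the theoretical true risk, which is the expected loss under the unknown data distribution. Formally:

$$\hat{R}_S(h) = \frac{1}{n} \sum_{i=1}^{n} \ell\big(h(x_i), y_i\big)$$

where $\ell$ is the chosen loss function.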
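
As a minimal sketch of the definition above, the empirical risk is simply the mean of the per-example losses. The names `empirical_risk` and `squared_loss`, the toy sample, and the choice of squared loss are all assumptions made for illustration; any loss function with the same signature could be substituted.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[float, float]  # one (input, target) pair


def squared_loss(prediction: float, target: float) -> float:
    """Squared error, used here as one possible choice of loss."""
    return (prediction - target) ** 2


def empirical_risk(
    hypothesis: Callable[[float], float],
    sample: Sequence[Example],
    loss: Callable[[float, float], float],
) -> float:
    """Average loss of `hypothesis` over a finite, non-empty sample."""
    if not sample:
        raise ValueError("empirical risk is undefined for an empty sample")
    return sum(loss(hypothesis(x), y) for x, y in sample) / len(sample)


# Toy sample generated by y = 2x (illustrative data, not from any real dataset).
sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]

# A hypothesis that matches the sample exactly incurs zero loss on every example.
risk_exact = empirical_risk(lambda x: 2.0 * x, sample, squared_loss)

# The identity hypothesis has per-example losses 0, 1 and 4, so its risk is 5/3.
risk_identity = empirical_risk(lambda x: x, sample, squared_loss)
```

Passing the loss as an argument keeps the function independent of any particular task: the same code covers regression and classification losses.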

Risk Minimisation

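The fundamental principle of Empirical Risk Minimisation (ERM) is to select a hypothesis $h$ from the hypothesis class $\mathcal{H}$ that minimises the empirical risk $\hat{R}_S(h)$ on the observed sample $S$:

$$h_{\mathrm{ERM}} \in \arg\min_{h \in \mathcal{H}} \hat{R}_S(h)$$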
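
For a finite hypothesis class, the ERM rule can be sketched as an exhaustive search. The example below assumes a hypothetical class of one-dimensional threshold classifiers and the 0-1 loss; the helper names (`erm`, `make_threshold`) and the data are illustrative only. Practical hypothesis classes are usually too large to enumerate and are searched with an optimiser instead.

```python
from typing import Callable, Dict, Sequence, Tuple

Example = Tuple[float, int]  # one (input, binary label) pair
Hypothesis = Callable[[float], int]


def zero_one_loss(prediction: int, target: int) -> float:
    """1.0 for a misclassification, 0.0 otherwise."""
    return 0.0 if prediction == target else 1.0


def empirical_risk(h: Hypothesis, sample: Sequence[Example]) -> float:
    """Fraction of the sample that `h` misclassifies."""
    return sum(zero_one_loss(h(x), y) for x, y in sample) / len(sample)


def make_threshold(t: float) -> Hypothesis:
    """Classifier that predicts 1 when x >= t and 0 otherwise."""
    return lambda x: 1 if x >= t else 0


def erm(hypothesis_class: Dict[float, Hypothesis],
        sample: Sequence[Example]) -> float:
    """Return the key of a hypothesis with the lowest empirical risk.

    Ties are broken in favour of the first minimiser encountered.
    """
    return min(hypothesis_class,
               key=lambda t: empirical_risk(hypothesis_class[t], sample))


# Illustrative sample: the label flips from 0 to 1 between x = 0.4 and x = 0.6.
sample = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

# Finite hypothesis class, indexed by threshold value.
hypothesis_class = {t: make_threshold(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)}

best_threshold = erm(hypothesis_class, sample)
best_risk = empirical_risk(hypothesis_class[best_threshold], sample)
```

On this sample only the threshold 0.5 classifies every point correctly, so the search selects it with an empirical risk of zero.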

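Although empirical risk can be computed directly from data, minimising it alone, without regularisation or a restriction on the hypothesis class, can lead to overfitting: the model achieves low empirical risk by capturing sample-specific noise rather than the underlying distribution, and so may still have high true risk.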
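
The gap between empirical and true risk can be illustrated with a deliberately contrived sketch. It assumes a hypothetical "memoriser" that stores the training labels in a lookup table and predicts 0 for unseen inputs, a training sample with one corrupted label, and a small noise-free held-out sample standing in for the underlying distribution. All names and data points are invented for illustration.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[float, int]  # one (input, binary label) pair


def empirical_risk(h: Callable[[float], int],
                   sample: Sequence[Example]) -> float:
    """Fraction of the sample that `h` misclassifies (0-1 loss)."""
    return sum(1.0 for x, y in sample if h(x) != y) / len(sample)


# Assumed ground truth: label is 1 when x >= 0.5. The label at x = 0.3 is
# deliberately corrupted to mimic sample-specific noise.
train = [(0.1, 0), (0.3, 1), (0.7, 1), (0.9, 1)]

# Noise-free held-out points, used here as a stand-in for the true distribution.
held_out = [(0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1)]

# Unrestricted hypothesis: memorise every training label, default to 0 elsewhere.
lookup = dict(train)


def memoriser(x: float) -> int:
    return lookup.get(x, 0)


# Restricted hypothesis: a single threshold at 0.5.
def threshold(x: float) -> int:
    return 1 if x >= 0.5 else 0


memoriser_train = empirical_risk(memoriser, train)        # fits the noise exactly
memoriser_held_out = empirical_risk(memoriser, held_out)  # fails on unseen inputs
threshold_train = empirical_risk(threshold, train)        # pays for the noisy label
threshold_held_out = empirical_risk(threshold, held_out)
```

Here the memoriser wins on empirical risk (0.0 against 0.25) yet loses on the held-out points (0.5 against 0.0), which is the pattern that regularisation or a smaller hypothesis class is meant to prevent.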