True Risk

Definition

True risk (or generalisation error), denoted $R(h)$, is the expected loss of a hypothesis $h$ under the joint probability distribution $P(X, Y)$. Formally:
$R(h) = \mathbb{E}_{(x, y) \sim P(X, Y)} [L(h(x), y)] = \int_{\mathcal{X} \times \mathcal{Y}} L(h(x), y) \, dP(x, y)$
where $L$ is the loss function and $\mathcal{X} \times \mathcal{Y}$ is the space of input-output pairs.
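
As a small worked instance of the definition, assume a classification setting with the 0-1 loss (this choice of loss is an assumption for illustration, not something the note fixes). The expectation of an indicator is a probability, so the true risk reduces to the probability of misclassification:

```latex
L(h(x), y) = \mathbf{1}[h(x) \neq y]
\quad \Longrightarrow \quad
R(h) = \mathbb{E}_{(x, y) \sim P(X, Y)} \big[ \mathbf{1}[h(x) \neq y] \big]
     = P\big( h(X) \neq Y \big)
```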

Relation to Empirical Risk

In practice, the true risk cannot be computed because the underlying distribution $P(X, Y)$ is unknown. It represents the theoretical performance of the model on unseen data. The objective of machine learning is to find a hypothesis that minimises $R(h)$, typically by using the empirical risk, the average loss over a finite sample, as a computable proxy. The difference between true and empirical risk is the generalisation gap: the smaller it is, the better the empirical risk reflects performance on unseen data.
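
The relationship can be made concrete with a minimal sketch on a toy problem where, unlike in practice, the distribution is known. Everything here is an assumption chosen for illustration: $x$ is uniform on $[0, 1]$, the clean label is $1$ when $x > 0.5$, and each label is flipped with probability `NOISE = 0.1`. For the hypothesis `h` below and the 0-1 loss, the true risk is then exactly 0.1, so the empirical risk on a small sample can be compared against it.

```python
import random

NOISE = 0.1  # assumed label-flip probability of the toy distribution


def sample(n, rng):
    """Draw n pairs (x, y) from the toy joint distribution P(X, Y)."""
    data = []
    for _ in range(n):
        x = rng.random()              # x ~ Uniform[0, 1)
        y = int(x > 0.5)              # clean label
        if rng.random() < NOISE:      # flip the label with probability NOISE
            y = 1 - y
        data.append((x, y))
    return data


def h(x):
    """Hypothesis under evaluation: threshold at 0.5."""
    return int(x > 0.5)


def zero_one_loss(y_pred, y_true):
    """0-1 loss: 1.0 for a wrong prediction, 0.0 for a correct one."""
    return float(y_pred != y_true)


def empirical_risk(hyp, data, loss):
    """Average loss of hyp over a finite sample."""
    return sum(loss(hyp(x), y) for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is reproducible
true_risk = NOISE       # known only because the distribution is synthetic
small = empirical_risk(h, sample(50, rng), zero_one_loss)
large = empirical_risk(h, sample(200_000, rng), zero_one_loss)

print(f"true risk:                  {true_risk:.4f}")
print(f"empirical risk, n = 50:     {small:.4f}")
print(f"empirical risk, n = 200000: {large:.4f}")
print(f"gap at n = 50:              {abs(true_risk - small):.4f}")
```

The small-sample estimate can land noticeably away from 0.1, while the large-sample estimate should sit close to it. With real data the true value is unavailable, which is why held-out data is commonly used as a stand-in.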
