machine-learning learning-theory
Definition
Estimation Error
The estimation error is the gap between the empirical-risk minimiser $\hat{h}$ returned from a finite training set and the best model its hypothesis class $\mathcal{H}$ could express. Writing
$$h^* = \arg\min_{h \in \mathcal{H}} R(h)$$
for the best-in-class hypothesis, where $R$ denotes the true risk, it is
$$\varepsilon_{\text{est}} = R(\hat{h}) - R(h^*).$$
It is the cost of choosing from one sample rather than knowing the data distribution exactly: different training sets yield different $\hat{h}$, and the spread of those picks is the learning-theory counterpart of variance. Together with the approximation error, it partitions the excess risk over the Bayes optimum (see the decomposition).
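For reference, the decomposition mentioned above can be written out in one line. The notation is an assumption made here for illustration: $\hat{h}$ is the empirical-risk minimiser, $h^*$ the best-in-class hypothesis, $R$ the true risk and $R^*$ the Bayes risk.

```latex
\[
\underbrace{R(\hat{h}) - R^*}_{\text{excess risk}}
  \;=\;
\underbrace{R(\hat{h}) - R(h^*)}_{\text{estimation error}}
  \;+\;
\underbrace{R(h^*) - R^*}_{\text{approximation error}}
\]
```

Both terms on the right are non-negative: $h^*$ minimises $R$ over the class, and $R^*$ is the smallest risk any predictor can achieve.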
What drives it
Estimation error falls as the sample grows: the empirical risk concentrates around the true risk, so the empirical-risk minimiser $\hat{h}$ approaches the best-in-class $h^*$ in risk. It rises with the capacity of the hypothesis class $\mathcal{H}$: a richer class fits any given sample more closely, so $\hat{h}$ varies more across samples. When this term dominates, the model tracks the particular sample rather than the underlying pattern, the familiar symptom of overfitting.
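Both effects can be seen in a small simulation. The sketch below is a toy setup, not a general recipe, and everything in it is an assumption chosen for illustration: inputs are uniform on [0, 1), the clean label is 1 when x >= 0.5, labels are flipped with probability `NOISE`, and the hypothesis class is the set of classifiers constant on `k` equal bins. For even `k` the Bayes classifier lies in every such class, so the approximation error is zero and the measured gap is pure estimation error. The function names are hypothetical.

```python
import random

NOISE = 0.2  # assumed label-flip probability; the Bayes risk equals NOISE


def sample(n, rng):
    """Draw n points: x ~ Uniform[0, 1), y = 1[x >= 0.5], flipped w.p. NOISE."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x >= 0.5)
        if rng.random() < NOISE:
            y = 1 - y
        data.append((x, y))
    return data


def erm_histogram(data, k):
    """ERM over classifiers constant on k equal bins: majority vote per bin."""
    votes = [0] * k  # (number of 1-labels) - (number of 0-labels) per bin
    for x, y in data:
        votes[min(int(x * k), k - 1)] += 1 if y == 1 else -1
    return [int(v > 0) for v in votes]  # ties and empty bins predict 0


def true_risk(labels):
    """Exact 0-1 risk of a histogram classifier (number of bins must be even)."""
    k = len(labels)
    risk = 0.0
    for j, label in enumerate(labels):
        bayes_label = int(j >= k // 2)  # bin j lies wholly on one side of 0.5
        risk += (NOISE if label == bayes_label else 1 - NOISE) / k
    return risk


def mean_estimation_error(n, k, trials=200, seed=0):
    """Average of R(h_hat) - R(h_star) over independent training sets of size n."""
    rng = random.Random(seed)
    best_in_class = true_risk([int(j >= k // 2) for j in range(k)])
    total = 0.0
    for _ in range(trials):
        total += true_risk(erm_histogram(sample(n, rng), k)) - best_in_class
    return total / trials


if __name__ == "__main__":
    for k in (2, 8, 32):  # more bins = richer class
        row = [round(mean_estimation_error(n, k), 3) for n in (50, 200, 800)]
        print(f"k={k:>2} bins, n=50/200/800: {row}")
```

Reading the printed table across a row shows the effect of sample size, and reading it down a column shows the effect of capacity. Because the best-in-class risk is the same for every `k` here, the differences come from how far the chosen classifier strays from it.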