machine-learning

Definition

Zero-One Loss

The Zero-One Loss is a loss function used in classification tasks to measure prediction error. It is the most direct metric for misclassification, assigning a penalty of 1 for an incorrect prediction and 0 for a correct one.

Let $\hat{y}$ be the model’s predicted label for an input $x$, and let $y$ be the true label. The zero-one loss, denoted $\ell_{0\text{-}1}(\hat{y}, y)$, is defined as:

$$\ell_{0\text{-}1}(\hat{y}, y) = \begin{cases} 0 & \text{if } \hat{y} = y \\ 1 & \text{if } \hat{y} \neq y \end{cases}$$

This can be written more compactly using an indicator function $\mathbb{1}[\cdot]$:

$$\ell_{0\text{-}1}(\hat{y}, y) = \mathbb{1}[\hat{y} \neq y]$$
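
As a concrete illustration, here is a minimal sketch of the pointwise zero-one loss in Python (the function name zero_one_loss is chosen here for illustration, not taken from any particular library):

```python
def zero_one_loss(y_pred, y_true):
    """Pointwise zero-one loss: 1 if the prediction is wrong, 0 if it is correct."""
    return 0 if y_pred == y_true else 1

# Example: one correct and one incorrect prediction.
print(zero_one_loss("cat", "cat"))  # 0
print(zero_one_loss("cat", "dog"))  # 1
```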

Probabilistic View

Indicator Expectation (Misclassification Probability): Under 0-1 loss, the loss random variable equals an indicator of the error event. For a classifier $h$ and a random labelled example $(X, Y)$,

$$\ell_{0\text{-}1}(h(X), Y) = \mathbb{1}[h(X) \neq Y],$$

so its expectation is the probability of misclassification:

$$\mathbb{E}\big[\mathbb{1}[h(X) \neq Y]\big] = \Pr[h(X) \neq Y]$$

Risk Identity (True vs Empirical): This yields a direct interpretation of both true risk and empirical risk. The true risk is

$$R(h) = \Pr_{(X,Y)\sim \mathcal{D}}\big[h(X) \neq Y\big] = \mathbb{E}\big[\ell_{0\text{-}1}(h(X), Y)\big],$$

while for a sample $S = \{(x_i, y_i)\}_{i=1}^{n}$ the empirical risk is

$$\hat{R}_S(h) = \frac{1}{n}\sum_{i=1}^{n} \mathbb{1}[h(x_i) \neq y_i],$$

which is exactly the observed misclassification rate.
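
A small sketch of the empirical risk computation over a sample, using plain NumPy (empirical_risk is an illustrative name, not a library function):

```python
import numpy as np

def empirical_risk(y_pred, y_true):
    """Empirical 0-1 risk: the fraction of predictions that disagree with the true labels."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    return np.mean(y_pred != y_true)

# Example: 2 mistakes out of 5 predictions -> empirical risk (error rate) of 0.4.
print(empirical_risk([0, 1, 1, 0, 1], [0, 1, 0, 0, 0]))  # 0.4
```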

Complement Event (Accuracy Form): Since correctness and error are complementary events,

$$\Pr[h(X) = Y] = 1 - \Pr[h(X) \neq Y].$$

This identity is the key probabilistic step used in finite-class realisable PAC proofs, where one bounds the probability that a bad hypothesis is consistent on all sampled examples.
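
As a brief sketch of that step (the standard finite-class realisable argument, stated here for context): if a hypothesis $h$ has true error $\Pr[h(X) \neq Y] > \varepsilon$, then by the complement identity it classifies a single random example correctly with probability less than $1 - \varepsilon$, so the probability that it is consistent with all $m$ i.i.d. training examples is less than

$$(1 - \varepsilon)^m \le e^{-\varepsilon m},$$

and a union bound over a finite hypothesis class $\mathcal{H}$ bounds the probability that any such bad hypothesis remains consistent by $|\mathcal{H}|\, e^{-\varepsilon m}$.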

Properties and Role

  • Direct Interpretation: The empirical risk calculated with zero-one loss is exactly the model’s error rate or misclassification rate on the dataset.
  • Optimisation Challenge: The zero-one loss is non-convex and non-differentiable; it is piecewise constant, so its gradient is zero almost everywhere and gives no useful descent direction to standard gradient-based optimisation algorithms. Moreover, minimising it directly is computationally intractable (NP-hard) in general.
  • Practical Use: While minimising zero-one loss is the ultimate theoretical goal of classification, in practice algorithms optimise a continuous, convex surrogate loss function (such as the hinge loss or cross-entropy) instead, as sketched below. The zero-one loss is then used as a final evaluation metric to report the model’s performance.
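
A minimal sketch contrasting the zero-one loss with two common convex surrogates, written in terms of the margin $y \cdot f(x)$ for labels $y \in \{-1, +1\}$ and a real-valued score $f(x)$ (plain NumPy; the helper names are illustrative):

```python
import numpy as np

def zero_one(margin):
    """Zero-one loss in margin form: 1 when the sign of the score is wrong."""
    return (margin <= 0).astype(float)

def hinge(margin):
    """Hinge loss (SVM surrogate): max(0, 1 - margin), convex and piecewise linear."""
    return np.maximum(0.0, 1.0 - margin)

def logistic(margin):
    """Logistic / cross-entropy loss in margin form: log(1 + exp(-margin)), smooth and convex."""
    return np.log1p(np.exp(-margin))

margins = np.array([-2.0, -0.5, 0.5, 2.0])
print(zero_one(margins))  # [1. 1. 0. 0.]
print(hinge(margins))     # [3.  1.5 0.5 0. ]
print(logistic(margins))  # approx. [2.13 0.97 0.47 0.13]
```

The surrogates decrease smoothly as the margin grows, which is what makes them amenable to gradient-based optimisation, while the zero-one loss only records whether the final decision was right or wrong.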