Definition
Zero-One Loss
The Zero-One Loss is a loss function used in classification tasks to measure prediction error. It is the most direct metric for misclassification, assigning a penalty of 1 for an incorrect prediction and 0 for a correct one.
Let $\hat{y}$ be the model’s predicted label for an input $x$, and let $y$ be the true label. The zero-one loss, denoted $L_{0\text{-}1}(y, \hat{y})$, is defined as:

$$
L_{0\text{-}1}(y, \hat{y}) =
\begin{cases}
0 & \text{if } \hat{y} = y \\
1 & \text{if } \hat{y} \neq y
\end{cases}
$$

This can be written more compactly using an indicator function $\mathbb{1}[\cdot]$:

$$
L_{0\text{-}1}(y, \hat{y}) = \mathbb{1}[\hat{y} \neq y]
$$
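As a concrete illustration, here is a minimal Python sketch that computes the per-example zero-one loss and averages it into the misclassification rate discussed below. The function name `zero_one_loss` and the label values are illustrative choices, not taken from the original text.

```python
import numpy as np

def zero_one_loss(y_true, y_pred):
    """Per-example zero-one loss: 1 for a misclassification, 0 otherwise."""
    return (np.asarray(y_true) != np.asarray(y_pred)).astype(int)

# Illustrative true labels and predictions (hypothetical values).
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 1, 1, 0, 0])

losses = zero_one_loss(y_true, y_pred)   # per-example losses: [0 1 0 1 0]
error_rate = losses.mean()               # empirical risk = misclassification rate
print(losses, error_rate)                # [0 1 0 1 0] 0.4
```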
Properties and Role
- Direct Interpretation: The empirical risk calculated with zero-one loss is exactly the model’s error rate or misclassification rate on the dataset.
- Optimisation Challenge: The function is non-convex and non-differentiable (its gradient is zero almost everywhere), so it cannot be minimised with standard gradient-based optimisation algorithms; moreover, directly minimising the empirical zero-one risk is NP-hard in general.
- Practical Use: While minimising the zero-one loss is the ultimate goal of classification, in practice algorithms optimise a continuous, convex surrogate loss function (such as the hinge loss or cross-entropy) instead. The zero-one loss is then used as the final evaluation metric to report the model’s performance.
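To make the contrast between the zero-one loss and its surrogates concrete, the following sketch evaluates all three as functions of the margin $m = y \cdot f(x)$. It assumes the common convention of labels in $\{-1, +1\}$ and treats a zero margin as an error; the specific margin values are illustrative only.

```python
import numpy as np

# Margin formulation with labels in {-1, +1}: for a score f(x), the margin is
# m = y * f(x), and a prediction is correct when m > 0 (convention assumed here).
margins = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

zero_one = (margins <= 0).astype(float)     # step function: non-convex, non-differentiable
hinge = np.maximum(0.0, 1.0 - margins)      # convex surrogate used by SVMs
log_loss = np.log1p(np.exp(-margins))       # convex surrogate (logistic / cross-entropy form)

for m, z, h, c in zip(margins, zero_one, hinge, log_loss):
    print(f"margin={m:+.1f}  zero-one={z:.0f}  hinge={h:.2f}  log-loss={c:.2f}")
```

Both surrogates are convex in the margin and decrease smoothly as the margin grows, which is what makes them amenable to gradient-based training even though the zero-one step function is not.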