Lukas' Notes

machine-learning classification optimisation

Definition

Binary Cross-Entropy Loss

Binary cross-entropy loss is a loss function for binary classification that measures how well a predicted Bernoulli probability matches a binary label.

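For a true label $y \in \{0, 1\}$ and a predicted probability $\hat{p}$ for class $1$, it is defined by

$$\mathcal{L}(y, \hat{p}) = -\bigl[\, y \log \hat{p} + (1 - y) \log(1 - \hat{p}) \,\bigr].$$

Here $\hat{p}$ is usually interpreted as the conditional probability $P(y = 1 \mid x)$ predicted by the model.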
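As a minimal sketch of the definition, the per-example loss can be written in plain Python. The names `binary_cross_entropy`, `mean_bce` and the clipping constant `eps` are illustrative choices, not part of the definition; the clipping is only an assumption made here to keep the logarithm finite when a prediction is exactly 0 or 1.

```python
import math

def binary_cross_entropy(y, p_hat, eps=1e-12):
    """Loss for one example: y is 0 or 1, p_hat is the predicted P(y = 1 | x)."""
    # Clip the prediction so log() stays finite at exactly 0 or 1 (assumed safeguard).
    p = min(max(p_hat, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mean_bce(labels, probs):
    """Average loss over paired labels and predicted probabilities."""
    losses = [binary_cross_entropy(y, p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)
```

An uninformative prediction of 0.5 costs $\log 2$ regardless of the label, which is a convenient reference value when reading training curves.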

Cases

For a positive example $y = 1$, the loss becomes

$$\mathcal{L}(1, \hat{p}) = -\log \hat{p}.$$

The model is penalised when it assigns low probability to the positive class: the loss is zero when that probability is 1 and grows without bound as it approaches 0.

For a negative example $y = 0$, the loss becomes

$$\mathcal{L}(0, \hat{p}) = -\log(1 - \hat{p}).$$

The model is penalised when it assigns high probability to the positive class: the loss is zero when that probability is 0 and grows without bound as it approaches 1.
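To make the two cases concrete, the short sketch below tabulates both losses for a few predicted probabilities of the positive class. The helper names `positive_loss` and `negative_loss` and the probe values are chosen here purely for illustration.

```python
import math

def positive_loss(p_hat):
    """Loss when the label is 1: -log(p_hat)."""
    return -math.log(p_hat)

def negative_loss(p_hat):
    """Loss when the label is 0: -log(1 - p_hat)."""
    return -math.log(1.0 - p_hat)

# Print both losses side by side for a range of predictions.
for p_hat in (0.99, 0.9, 0.5, 0.1, 0.01):
    print(f"p_hat={p_hat:<5} y=1: {positive_loss(p_hat):.3f}  y=0: {negative_loss(p_hat):.3f}")
```

The asymmetry is the point: for a positive example, a prediction of 0.9 costs about 0.105 while a prediction of 0.1 costs about 2.303, so confident mistakes dominate the total loss.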

Interpretation

Binary cross-entropy is the negative log-likelihood of a Bernoulli distribution. If the model predicts

$$P(y \mid x) = \hat{p}^{\,y} (1 - \hat{p})^{1 - y},$$

then minimising binary cross-entropy is equivalent to maximising the likelihood of the observed labels.
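The equivalence can be checked numerically. The sketch below assumes independent examples and uses made-up labels and probabilities: the negative log of the Bernoulli likelihood coincides with the summed binary cross-entropy.

```python
import math

labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.2, 0.7, 0.6, 0.4]  # illustrative predicted P(y = 1 | x)

# Likelihood of the observed labels under independent Bernoulli predictions.
likelihood = 1.0
for y, p in zip(labels, probs):
    likelihood *= p ** y * (1.0 - p) ** (1 - y)

# Summed binary cross-entropy over the same examples.
total_bce = sum(
    -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    for y, p in zip(labels, probs)
)

# The two quantities agree up to floating-point error.
assert math.isclose(-math.log(likelihood), total_bce)
```

Because the logarithm is monotonic, any parameters that lower the summed loss raise the likelihood by the same token.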

It is the standard loss for logistic regression and binary neural classifiers with a sigmoid output.
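With a sigmoid output, the loss is often computed directly from the pre-sigmoid score (the logit) rather than from the probability, which avoids taking the log of a value that has rounded to 0 or 1. The following is a sketch of that rearrangement, not any particular library's implementation; `bce_with_logits` and `grad_wrt_logit` are hypothetical names.

```python
import math

def sigmoid(z):
    # Branch on the sign so exp() is only called on non-positive arguments.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bce_with_logits(y, z):
    """Binary cross-entropy of sigmoid(z) against the label y, from the logit z."""
    # Algebraically equal to -[y log s + (1 - y) log(1 - s)] with s = sigmoid(z).
    return max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))

def grad_wrt_logit(y, z):
    """Derivative of the loss with respect to the logit: sigmoid(z) - y."""
    return sigmoid(z) - y
```

The gradient with respect to the logit is simply the prediction error, which is one reason this pairing of sigmoid and cross-entropy trains well with gradient descent.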

Relation to KL Divergence

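For a soft target distribution with true conditional probability $q$ and predicted probability $\hat{p}$, the KL divergence is

$$D_{\mathrm{KL}}(q \,\|\, \hat{p}) = q \log \frac{q}{\hat{p}} + (1 - q) \log \frac{1 - q}{1 - \hat{p}}.$$

This equals the cross-entropy between $q$ and $\hat{p}$ minus the entropy of $q$. When the target is a hard label $q = y \in \{0, 1\}$, that entropy is zero (with the convention $0 \log 0 = 0$), so the divergence equals the binary cross-entropy and minimising one is equivalent to minimising the other.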
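A small numerical check of this relation, written as a sketch with hypothetical helper names and arbitrary example values: the KL divergence equals cross-entropy minus the target's entropy, and for a hard label the entropy term vanishes.

```python
import math

def xlogy(x, y):
    # Uses the convention 0 * log(anything) = 0, needed for hard labels.
    return 0.0 if x == 0 else x * math.log(y)

def cross_entropy(q, p_hat):
    """Cross-entropy between a Bernoulli target q and a prediction p_hat."""
    return -(xlogy(q, p_hat) + xlogy(1 - q, 1 - p_hat))

def entropy(q):
    """Entropy of a Bernoulli distribution with parameter q."""
    return cross_entropy(q, q)

def kl_divergence(q, p_hat):
    """KL divergence from the prediction p_hat to the target q."""
    return xlogy(q, q / p_hat) + xlogy(1 - q, (1 - q) / (1 - p_hat))

# Soft target: KL = cross-entropy - entropy.
assert math.isclose(kl_divergence(0.3, 0.6),
                    cross_entropy(0.3, 0.6) - entropy(0.3), abs_tol=1e-12)

# Hard label: entropy is zero, so KL and binary cross-entropy coincide.
assert math.isclose(kl_divergence(1, 0.8), cross_entropy(1, 0.8))
```

For soft targets the entropy term is nonzero but does not depend on the prediction, so the two objectives still share the same minimiser.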