machine-learning deep-learning analysis

Definition

Rectified Linear Unit

The Rectified Linear Unit (ReLU) is a non-linear activation function that outputs the input directly if it is positive, and zero otherwise. Formally:

$$\mathrm{ReLU}(x) = \max(0, x)$$
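
As a concrete illustration, here is a minimal NumPy sketch of the element-wise operation (the function name `relu` and the sample inputs are illustrative, not from any particular library):

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: returns x where x > 0, and 0 otherwise."""
    return np.maximum(0.0, x)

# Negative inputs are zeroed; positive inputs pass through unchanged.
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))  # -> [0.  0.  0.  0.5 2. ]
```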

Operational Advantages

Gradient Flow: Unlike saturating activation functions such as the Sigmoid, ReLU does not suffer from the vanishing gradient problem for positive inputs, since its gradient there is constant ($\partial\,\mathrm{ReLU}(x)/\partial x = 1$ for $x > 0$). This property is fundamental to the successful training of deep artificial neural networks.
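
A brief sketch contrasting the two gradients (the helper names `relu_grad` and `sigmoid_grad` are illustrative):

```python
import numpy as np

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

def sigmoid_grad(x):
    """Derivative of the sigmoid, which saturates toward 0 for large |x|."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

x = np.array([0.5, 5.0, 50.0])
print(relu_grad(x))     # [1. 1. 1.]            -> constant gradient for positive inputs
print(sigmoid_grad(x))  # [~0.235, ~0.0066, ~0] -> vanishes as inputs grow
```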

Sparsity: Since any negative input results in a zero activation, ReLU naturally induces sparsity in the hidden layers, which can lead to more robust and computationally efficient representations.
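
For example, with zero-centred random pre-activations roughly half of the units are exactly zero after ReLU. A minimal sketch (the shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
pre_activations = rng.standard_normal((4, 8))   # hypothetical hidden-layer pre-activations
activations = np.maximum(0.0, pre_activations)  # apply ReLU

# Fraction of exactly-zero activations: roughly 0.5 for zero-centred inputs.
sparsity = np.mean(activations == 0.0)
print(f"sparsity: {sparsity:.2f}")
```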