machine-learning deep-learning analysis
Definition
Rectified Linear Unit
The Rectified Linear Unit (ReLU) is a non-linear activation function that outputs the input directly if it is positive, and zero otherwise. Formally, $f(x) = \max(0, x)$.
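A minimal sketch of this definition, assuming NumPy is available; the function name relu is illustrative rather than taken from any particular library:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: returns x where x > 0, and 0 otherwise."""
    return np.maximum(0.0, x)

# Positive inputs pass through unchanged; negative inputs are clipped to zero.
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))  # [0.  0.  0.  0.5 2. ]
```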
Operational Advantages
Gradient Flow: Unlike saturating activation functions such as the sigmoid, ReLU does not suffer from the vanishing gradient problem for positive inputs, as the gradient there is constant (equal to 1). This property is fundamental to the successful training of deep artificial neural networks (see the first sketch after this list).
Sparsity: Since any negative input results in a zero activation, ReLU naturally induces sparsity in the hidden layers, which can lead to more robust and computationally efficient representations (see the second sketch after this list).
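A small sketch of the constant-gradient behaviour, again assuming NumPy; relu_grad and sigmoid_grad are illustrative helpers, and the sigmoid is included only for contrast:

```python
import numpy as np

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (0 taken at x = 0)."""
    return (x > 0).astype(float)

def sigmoid_grad(x):
    """Derivative of the sigmoid, for contrast: it shrinks toward 0 as |x| grows."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

x = np.array([-5.0, -1.0, 1.0, 5.0, 50.0])
print(relu_grad(x))     # [0. 0. 1. 1. 1.]  -- constant gradient of 1 on positive inputs
print(sigmoid_grad(x))  # values approach 0 for large |x| (saturation)
```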
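And a sketch of the induced sparsity; the layer sizes and random weights are assumptions for illustration. With zero-centred inputs and weights, roughly half of the pre-activations are negative and are therefore zeroed out:

```python
import numpy as np

rng = np.random.default_rng(0)

# One hypothetical hidden layer: random inputs and weights centred at zero.
x = rng.standard_normal((32, 128))         # batch of 32 inputs, 128 features
W = rng.standard_normal((128, 256)) * 0.1  # illustrative weight matrix
pre_activations = x @ W
activations = np.maximum(0.0, pre_activations)

# Fraction of units that are exactly zero after ReLU (typically around 0.5 here).
sparsity = np.mean(activations == 0.0)
print(f"fraction of zero activations: {sparsity:.2f}")
```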