machine-learning deep-learning
Definition
Multi-Layer Perceptron
A Multi-Layer Perceptron (MLP) is a feedforward artificial neural network consisting of an input layer, at least one hidden layer, and an output layer. Formally, the activation vector $\mathbf{a}^{(\ell)}$ of layer $\ell$ is defined by the recursive transformation (a code sketch follows the list below):

$$\mathbf{a}^{(\ell)} = \sigma\!\left(\mathbf{z}^{(\ell)}\right), \qquad \mathbf{z}^{(\ell)} = \mathbf{W}^{(\ell)} \mathbf{a}^{(\ell-1)} + \mathbf{b}^{(\ell)}$$

where:
- $\mathbf{z}^{(\ell)}$ is the pre-activation vector of layer $\ell$.
- $\mathbf{W}^{(\ell)}$ is the weight matrix connecting layer $\ell-1$ to layer $\ell$.
- $\mathbf{b}^{(\ell)}$ is the bias vector of the transformation.
- $\sigma$ is a non-linear activation function.
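As a concrete illustration, here is a minimal NumPy sketch of this forward recursion. The layer sizes, the use of `tanh` as $\sigma$, and the name `mlp_forward` are assumptions made for the example, not part of the definition.

```python
import numpy as np

def mlp_forward(x, weights, biases, sigma=np.tanh):
    """Apply a^(l) = sigma(W^(l) a^(l-1) + b^(l)) layer by layer."""
    a = x  # a^(0) is the input vector
    for W, b in zip(weights, biases):
        z = W @ a + b  # pre-activation z^(l)
        a = sigma(z)   # activation a^(l)
    return a

# Hypothetical 3-4-2 network with random parameters.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
biases = [rng.standard_normal(4), rng.standard_normal(2)]
print(mlp_forward(rng.standard_normal(3), weights, biases))
```

Note that the sketch applies $\sigma$ uniformly at every layer to mirror the recursion above; in practice the output layer often uses a task-specific activation, or none at all.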
Functional Expressivity
Universal Approximation: According to the Universal Approximation Theorem, an MLP with a single hidden layer containing a finite number of neurons can approximate any continuous function on compact subsets of $\mathbb{R}^n$ to arbitrary accuracy, provided the activation function is non-constant, bounded, and continuous.
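The theorem is an existence result and says nothing about how to find the weights. Purely as an illustration, the sketch below fits a single hidden layer of $\tanh$ units to $\sin(x)$ by fixing random hidden parameters and solving for the output weights with least squares; this random-feature shortcut is an assumption made for brevity, not the theorem's construction or standard MLP training.

```python
import numpy as np

# One hidden layer of tanh units approximating sin(x) on [-pi, pi].
rng = np.random.default_rng(1)
n_hidden = 50
W = rng.normal(scale=2.0, size=(n_hidden, 1))  # random hidden weights
b = rng.uniform(-np.pi, np.pi, size=n_hidden)  # random hidden biases

x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
H = np.tanh(x @ W.T + b)  # hidden activations, shape (200, n_hidden)
v, *_ = np.linalg.lstsq(H, np.sin(x).ravel(), rcond=None)  # output weights

max_err = np.max(np.abs(H @ v - np.sin(x).ravel()))
print(f"max |error| on the grid: {max_err:.4f}")
```

Increasing `n_hidden` typically drives the error toward zero, consistent with the theorem's guarantee that enough hidden units suffice.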
Linearity Collapse: Without the inclusion of non-linear activation functions, any stack of multiple layers collapses into a single linear transformation, as the composition of affine mappings is itself affine:

$$\mathbf{W}^{(2)}\!\left(\mathbf{W}^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right) + \mathbf{b}^{(2)} = \left(\mathbf{W}^{(2)}\mathbf{W}^{(1)}\right)\mathbf{x} + \left(\mathbf{W}^{(2)}\mathbf{b}^{(1)} + \mathbf{b}^{(2)}\right).$$
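A quick numerical check of this identity (the shapes and random values below are arbitrary assumptions for the demo):

```python
import numpy as np

# Two affine layers with no activation collapse into one affine layer.
rng = np.random.default_rng(2)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal(2)
x = rng.standard_normal(3)

stacked = W2 @ (W1 @ x + b1) + b2           # two layers, no sigma
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)  # single equivalent affine map
print(np.allclose(stacked, collapsed))      # True
```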