machine-learning deep-learning

Definition

Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) is a feedforward artificial neural network consisting of an input layer, at least one hidden layer, and an output layer. Formally, the activation vector $\mathbf{a}^{(l)}$ of layer $l$ is defined by the recursive transformation (illustrated in code after the symbol list below):

$$\mathbf{a}^{(l)} = \sigma\!\left(\mathbf{z}^{(l)}\right), \qquad \mathbf{z}^{(l)} = W^{(l)}\,\mathbf{a}^{(l-1)} + \mathbf{b}^{(l)}$$

where:

  • $\mathbf{z}^{(l)}$ is the pre-activation vector of layer $l$.
  • $W^{(l)}$ is the weight matrix connecting layer $l-1$ to layer $l$.
  • $\mathbf{b}^{(l)}$ is the bias vector of the transformation.
  • $\sigma$ is a non-linear activation function.
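
As a concrete illustration of this recursion, the following NumPy sketch computes the forward pass layer by layer. The layer sizes, tanh activation, and random initialisation are illustrative assumptions, not part of the definition:

```python
import numpy as np

def mlp_forward(x, weights, biases, activation=np.tanh):
    """Compute a^(L) given per-layer weights W^(l) and biases b^(l).

    x       : input vector a^(0), shape (n_0,)
    weights : list of matrices W^(l), each of shape (n_l, n_{l-1})
    biases  : list of vectors b^(l), each of shape (n_l,)
    """
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b      # pre-activation z^(l) = W^(l) a^(l-1) + b^(l)
        a = activation(z)  # activation a^(l) = sigma(z^(l))
    return a

# Illustrative shapes: 3 inputs -> 4 hidden units -> 2 outputs.
# (In practice the output layer is often left linear; here sigma is
# applied at every layer, matching the recursive definition above.)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [np.zeros(4), np.zeros(2)]
print(mlp_forward(rng.normal(size=3), weights, biases))
```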

Functional Expressivity

Universal Approximation: According to the Universal Approximation Theorem, an MLP with a single hidden layer containing a finite number of neurons can approximate any continuous function on compact subsets of $\mathbb{R}^n$ to arbitrary accuracy, provided the activation function is non-constant, bounded, and continuous.
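
The theorem guarantees existence of such an approximation but not a recipe for finding it; in practice the weights are fit by gradient descent. The sketch below, a hypothetical demonstration rather than a proof, trains a single-hidden-layer tanh network (tanh is non-constant, bounded, and continuous) to approximate $\sin(x)$ on $[-\pi, \pi]$; the hidden width, learning rate, and step count are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-np.pi, np.pi, 256)[:, None]  # (N, 1) inputs
Y = np.sin(X)                                 # target continuous function

H = 32                                        # finite hidden width
W1 = rng.normal(scale=1.0, size=(1, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(H, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(5000):
    Z = X @ W1 + b1        # (N, H) hidden pre-activations
    A = np.tanh(Z)         # (N, H) hidden activations
    P = A @ W2 + b2        # (N, 1) predictions (linear output layer)
    err = P - Y
    # Gradients of mean squared error via backpropagation
    dP = 2 * err / len(X)
    dW2 = A.T @ dP;  db2 = dP.sum(axis=0)
    dZ = (dP @ W2.T) * (1 - A**2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dZ;  db1 = dZ.sum(axis=0)
    W1 -= lr * dW1;  b1 -= lr * db1
    W2 -= lr * dW2;  b2 -= lr * db2

print("final MSE:", float(np.mean(err**2)))  # small -> close approximation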

Linearity Collapse: Without the inclusion of non-linear activation functions, any stack of multiple layers collapses into a single linear transformation, as the composition of affine mappings is itself affine: $W^{(2)}\!\left(W^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right) + \mathbf{b}^{(2)} = \left(W^{(2)} W^{(1)}\right)\mathbf{x} + \left(W^{(2)}\mathbf{b}^{(1)} + \mathbf{b}^{(2)}\right)$.
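
A minimal numerical check of this identity, with arbitrarily chosen shapes, confirms that two stacked affine layers without an activation are equivalent to one:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

stacked = W2 @ (W1 @ x + b1) + b2            # two affine layers, no activation
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)   # the single equivalent affine map
print(np.allclose(stacked, collapsed))       # True
```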