machine-learning deep-learning

Definition

Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) is a feedforward artificial neural network consisting of an input layer, at least one hidden layer, and an output layer. Formally, the activation vector $\mathbf{a}^{(l)}$ of layer $l$ is defined by the recursive transformation (illustrated in code after the symbol list below):

$$\mathbf{a}^{(l)} = \sigma\!\left(\mathbf{z}^{(l)}\right), \qquad \mathbf{z}^{(l)} = W^{(l)}\,\mathbf{a}^{(l-1)} + \mathbf{b}^{(l)}$$

where:

  • $\mathbf{z}^{(l)}$ is the pre-activation vector of layer $l$.
  • $W^{(l)}$ is the weight matrix connecting layer $l-1$ to layer $l$.
  • $\mathbf{b}^{(l)}$ is the bias vector of the transformation.
  • $\sigma$ is a non-linear activation function.
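
As a concrete illustration of this recursion, the following NumPy sketch computes the forward pass layer by layer. The layer sizes, tanh activation, and random initialisation are illustrative assumptions, not part of the definition:

```python
import numpy as np

def mlp_forward(x, weights, biases, activation=np.tanh):
    """Compute a^(L) given per-layer weights W^(l) and biases b^(l).

    x       : input vector a^(0), shape (n_0,)
    weights : list of matrices W^(l), each of shape (n_l, n_{l-1})
    biases  : list of vectors b^(l), each of shape (n_l,)
    """
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b      # pre-activation z^(l) = W^(l) a^(l-1) + b^(l)
        a = activation(z)  # activation a^(l) = sigma(z^(l))
    return a

# Illustrative shapes: 3 inputs -> 4 hidden units -> 2 outputs.
# (In practice the output layer is often left linear; here sigma is
# applied at every layer, matching the recursive definition above.)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [np.zeros(4), np.zeros(2)]
print(mlp_forward(rng.normal(size=3), weights, biases))
```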

Functional Expressivity

Universal Approximation: According to the Universal Approximation Theorem, an MLP with a single hidden layer containing a finite number of neurons can approximate any continuous function on compact subsets of $\mathbb{R}^n$ to arbitrary accuracy, provided the activation function is non-constant, bounded, and continuous.
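
The theorem guarantees existence of such an approximation but not a recipe for finding it; in practice the weights are fit by gradient descent. The sketch below, a hypothetical demonstration rather than a proof, trains a single-hidden-layer tanh network (tanh is non-constant, bounded, and continuous) to approximate $\sin(x)$ on $[-\pi, \pi]$; the hidden width, learning rate, and step count are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-np.pi, np.pi, 256)[:, None]  # (N, 1) inputs
Y = np.sin(X)                                 # target continuous function

H = 32                                        # finite hidden width
W1 = rng.normal(scale=1.0, size=(1, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(H, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(5000):
    Z = X @ W1 + b1        # (N, H) hidden pre-activations
    A = np.tanh(Z)         # (N, H) hidden activations
    P = A @ W2 + b2        # (N, 1) predictions (linear output layer)
    err = P - Y
    # Gradients of mean squared error via backpropagation
    dP = 2 * err / len(X)
    dW2 = A.T @ dP;  db2 = dP.sum(axis=0)
    dZ = (dP @ W2.T) * (1 - A**2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dZ;  db1 = dZ.sum(axis=0)
    W1 -= lr * dW1;  b1 -= lr * db1
    W2 -= lr * dW2;  b2 -= lr * db2

print("final MSE:", float(np.mean(err**2)))  # small -> close approximation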

Linearity Collapse: Without the inclusion of non-linear activation functions, any stack of multiple layers collapses into a single linear transformation, as the composition of affine mappings is itself affine: $W^{(2)}\!\left(W^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right) + \mathbf{b}^{(2)} = \left(W^{(2)} W^{(1)}\right)\mathbf{x} + \left(W^{(2)}\mathbf{b}^{(1)} + \mathbf{b}^{(2)}\right)$.
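
A minimal numerical check of this identity, with arbitrarily chosen shapes, confirms that two stacked affine layers without an activation are equivalent to one:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

stacked = W2 @ (W1 @ x + b1) + b2            # two affine layers, no activation
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)   # the single equivalent affine map
print(np.allclose(stacked, collapsed))       # True
```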