Definition
Universal Function Approximation Theorem
The Universal Function Approximation Theorem states that a feedforward artificial neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function to arbitrary precision on a compact subset of $\mathbb{R}^n$.
Formally, for any continuous function $f : K \to \mathbb{R}$ defined on a compact set $K \subset \mathbb{R}^n$ and any $\varepsilon > 0$, there exists a single-hidden-layer neural network $\hat{f}$ such that:

$$\sup_{x \in K} \left| f(x) - \hat{f}(x) \right| < \varepsilon.$$
This holds provided that the activation function used in the hidden layer is non-constant, bounded, and continuous; later results relax this condition, requiring only that the activation be non-polynomial.
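A small numerical sketch can make the statement concrete. The Python/NumPy snippet below is illustrative rather than canonical: it builds single-hidden-layer tanh networks of increasing width by drawing random hidden weights and fitting only the output weights by least squares, then reports the empirical sup-norm error against an arbitrarily chosen continuous target on the compact interval $[-2, 2]$. The target function, the widths, and the weight scale are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # An arbitrary continuous function on the compact interval [-2, 2].
    return np.sin(3 * x) + 0.5 * x**2

def fit_one_hidden_layer(x, y, width):
    # Single hidden layer of `width` tanh units. Hidden weights and
    # biases are drawn at random and kept fixed; only the output-layer
    # weights are fitted, by ordinary least squares.
    W = rng.normal(scale=3.0, size=(width, 1))
    b = rng.normal(scale=3.0, size=width)
    hidden = np.tanh(x[:, None] @ W.T + b)
    out_weights, *_ = np.linalg.lstsq(hidden, y, rcond=None)
    return lambda z: np.tanh(z[:, None] @ W.T + b) @ out_weights

x = np.linspace(-2.0, 2.0, 2000)   # dense grid over the compact set
y = target(x)

for width in (4, 16, 64, 256):
    f_hat = fit_one_hidden_layer(x, y, width)
    sup_error = np.max(np.abs(f_hat(x) - y))   # empirical sup-norm error
    print(f"width={width:4d}   sup|f - f_hat| ~ {sup_error:.4f}")
```

Fixing the hidden weights at random is only a convenience for the sketch; the theorem itself is an existence statement and says nothing about how an approximating network is found or trained.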
Implications for Depth
While the theorem guarantees that a sufficiently “wide” network (many neurons in one layer) can approximate complex functions, it says nothing about how many neurons are needed; the required width can grow exponentially with the input dimension or the desired precision. In practice, deep architectures (many hidden layers) are often far more parameter-efficient at capturing hierarchical structure, and depth-separation results show that for certain families of functions a deep network achieves an approximation accuracy that any shallow network would need exponentially more units to match. A classic construction of this kind is sketched below.
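One standard way to make this point concrete is the ReLU “sawtooth” construction used in depth-separation arguments (in the spirit of Telgarsky’s result). The Python/NumPy sketch below is one illustration under that construction, not a general proof: composing a two-unit ReLU “tent” block $k$ times produces a piecewise-linear function with on the order of $2^k$ linear pieces from only $O(k)$ parameters, whereas a single hidden ReLU layer needs roughly one unit per linear piece, which is where the exponential gap comes from.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent(x):
    # The "tent map" on [0, 1], written as one hidden layer of two
    # ReLU units: tent(x) = 2*relu(x) - 4*relu(x - 0.5).
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Composing the tent block `depth` times gives a network with
    # `depth` hidden layers and only 2*depth hidden units in total.
    for _ in range(depth):
        x = tent(x)
    return x

x = np.linspace(0.0, 1.0, 100001)
for depth in (1, 2, 4, 8):
    y = deep_sawtooth(x, depth)
    # The composed map oscillates between 0 and 1, crossing the level
    # 0.5 about 2**depth times: exponentially many oscillations from a
    # parameter count that is only linear in the depth.
    crossings = np.count_nonzero(np.diff(np.sign(y - 0.5)))
    print(f"depth={depth}:  {crossings} crossings of 0.5  "
          f"({2 * depth} hidden units)")
```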