machine-learning linear-algebra

Definition

Mercer's Theorem

Let be a compact metric space and let be a continuous, symmetric function. Then is a valid kernel — i.e. there exists a Hilbert space and a feature map such that

— if and only if is positive semi-definite. That is, for every finite set of points and every choice of coefficients :

Consequence

Mercer’s theorem guarantees that the “kernel trick” is mathematically sound. Whenever a kernel satisfies the positive semi-definiteness condition, one can construct an implicit high-dimensional feature space and run linear algorithms there without ever computing explicitly.

Eigenfunction Expansion

When the conditions hold, the kernel admits a spectral decomposition:

where are non-negative eigenvalues and are the corresponding orthonormal eigenfunctions in . The feature map can then be taken as .