machine-learning linear-algebra
Definition
Mercer's Theorem
Let be a compact metric space and let be a continuous, symmetric function. Then is a valid kernel — i.e. there exists a Hilbert space and a feature map such that
— if and only if is positive semi-definite. That is, for every finite set of points and every choice of coefficients :
Consequence
Mercer’s theorem guarantees that the “kernel trick” is mathematically sound. Whenever a kernel satisfies the positive semi-definiteness condition, one can construct an implicit high-dimensional feature space and run linear algorithms there without ever computing explicitly.
Eigenfunction Expansion
When the conditions hold, the kernel admits a spectral decomposition:
where are non-negative eigenvalues and are the corresponding orthonormal eigenfunctions in . The feature map can then be taken as .