machine-learning learning-theory statistics

Vapnik-Chervonenkis Dimension

Definition

The Vapnik-Chervonenkis (VC) dimension, denoted $\mathrm{VCdim}(\mathcal{H})$, is a measure of the capacity or complexity of a hypothesis class $\mathcal{H}$. A set of $n$ points is said to be shattered by $\mathcal{H}$ if every one of the $2^n$ binary labellings of the points is realised by some hypothesis in $\mathcal{H}$. The VC dimension is defined as the cardinality of the largest set of points that can be shattered by $\mathcal{H}$. If $\mathcal{H}$ can shatter sets of arbitrary size, the VC dimension is infinite.
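
To make the definition concrete, shattering can be brute-forced for small finite classes. The sketch below is illustrative only: the toy interval class, the endpoint grid, and the `shatters` helper are assumptions of this example, not part of any standard API.

```python
from itertools import product

def shatters(hypotheses, points):
    """Check whether `hypotheses` shatters `points`: every one of the
    2^n binary labellings must be realised by some hypothesis."""
    achieved = {tuple(h(x) for x in points) for h in hypotheses}
    return len(achieved) == 2 ** len(points)

# Hypothetical toy class: indicator functions of intervals [a, b]
# with endpoints on a small grid.
grid = [i / 4 for i in range(9)]  # 0.0, 0.25, ..., 2.0
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in grid for b in grid if a <= b]

print(shatters(intervals, [0.5, 1.5]))       # True: intervals shatter any 2 points
print(shatters(intervals, [0.5, 1.0, 1.5]))  # False: labelling (1, 0, 1) is impossible
```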

PAC Learnability

The VC dimension provides a necessary and sufficient condition for PAC learnability in the binary classification setting: a class $\mathcal{H}$ is PAC-learnable if and only if $\mathrm{VCdim}(\mathcal{H}) < \infty$. Writing $d = \mathrm{VCdim}(\mathcal{H})$, the sample complexity in the realizable case is then bounded in terms of the VC dimension:

$$m_{\mathcal{H}}(\epsilon, \delta) = O\left(\frac{d \log(1/\epsilon) + \log(1/\delta)}{\epsilon}\right).$$
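
To get a feel for how the bound scales with $d$, here is a small numeric sketch; setting the absolute constant to 1 is an assumption of this sketch (the true bound carries an unspecified constant factor).

```python
import math

def pac_sample_bound(d, eps, delta):
    """Realizable-case upper bound, with the absolute constant
    set to 1 purely for illustration."""
    return (d * math.log(1 / eps) + math.log(1 / delta)) / eps

# Scaling with VC dimension at eps = 0.1, delta = 0.05:
for d in (1, 4, 10):
    print(d, round(pac_sample_bound(d, 0.1, 0.05)))
```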

Examples

Thresholds on the Real Line: For the class $\mathcal{H} = \{x \mapsto \mathbb{1}[x \le a] : a \in \mathbb{R}\}$, the VC dimension is 1: any single point is shattered, but for two points $x_1 < x_2$ the labelling $(0, 1)$ would require $x_1 > a$ and $x_2 \le a$, which is impossible.
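
A brute-force check of both claims; the small grid of thresholds and the `shatters` helper are assumptions of this sketch (a finite grid suffices when the test points lie on it).

```python
from itertools import product

def shatters(hypotheses, points):
    achieved = {tuple(h(x) for x in points) for h in hypotheses}
    return len(achieved) == 2 ** len(points)

# Thresholds h_a(x) = 1[x <= a] over a coarse grid of values of a.
thresholds = [lambda x, a=a: int(x <= a) for a in [-1, 0, 0.5, 1, 2]]

print(shatters(thresholds, [0]))     # True: VC dimension >= 1
print(shatters(thresholds, [0, 1]))  # False: (0, 1) needs a < 0 and a >= 1
```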

Axis-aligned Rectangles: In $\mathbb{R}^2$, the class of axis-aligned rectangles can shatter at most 4 points (e.g., four points in a diamond configuration), so $\mathrm{VCdim} = 4$. No set of 5 points can be shattered: the bounding rectangle of the four extreme points (leftmost, rightmost, topmost, bottommost) necessarily contains a fifth, non-extreme point, so labelling the extremes positive and that point negative is unrealisable.
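
The bounding-box observation gives a direct test: a labelling is realisable by a rectangle exactly when the bounding box of the positive points excludes every negative point. A sketch using that characterisation, with point sets chosen for illustration:

```python
from itertools import product

def rect_achievable(points, labels):
    """A labelling is realisable by an axis-aligned rectangle iff the
    bounding box of the positive points contains no negative point."""
    pos = [p for p, y in zip(points, labels) if y]
    if not pos:
        return True  # a degenerate rectangle away from all points
    xs, ys = zip(*pos)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    return all(not (lo_x <= x <= hi_x and lo_y <= y <= hi_y)
               for (x, y), lab in zip(points, labels) if not lab)

def shattered(points):
    return all(rect_achievable(points, labs)
               for labs in product([0, 1], repeat=len(points)))

diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]
print(shattered(diamond))             # True: 4 points in diamond position
print(shattered(diamond + [(0, 0)])) # False: the centre lies in the extremes' box
```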

Linear Separators: In $d$-dimensional space $\mathbb{R}^d$, the VC dimension of linear classifiers (halfspaces with an offset term) is $d + 1$.
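
One way to machine-check the lower bound is to test strict linear separability of every labelling of $d + 1$ standard points with a small feasibility LP; this formulation, and the use of SciPy's `linprog`, are choices of this sketch rather than a canonical method.

```python
from itertools import product

import numpy as np
from scipy.optimize import linprog

def linearly_realisable(X, labels):
    """Feasibility LP: find (w, b) with y_i (w . x_i + b) >= 1 for all i,
    where y_i is +1/-1; such a pair exists iff the labelling is
    strictly linearly separable."""
    y = np.where(np.array(labels) == 1, 1.0, -1.0)
    A_ub = -(y[:, None] * np.hstack([X, np.ones((len(X), 1))]))
    res = linprog(np.zeros(X.shape[1] + 1), A_ub=A_ub, b_ub=-np.ones(len(X)),
                  bounds=[(None, None)] * (X.shape[1] + 1))
    return res.success

d = 3
# The origin plus the d standard basis vectors: d + 1 points that
# halfspaces with an offset term shatter.
X = np.vstack([np.zeros(d), np.eye(d)])
print(all(linearly_realisable(X, labs)
          for labs in product([0, 1], repeat=d + 1)))  # expected: True
```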

Convex Polygons: For the class of convex polygons in $\mathbb{R}^2$, the VC dimension is infinite. Any number of points can be placed in convex position (e.g., on a circle), and for any binary labelling, the polygon whose vertices are exactly the positively labelled points contains those points and no others.
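
The circle construction can likewise be verified mechanically: a negative point would be misclassified only if it fell inside the convex hull of the positive points, which can be tested with a small LP. The hull test and point count below are choices of this sketch.

```python
from itertools import product
import math

import numpy as np
from scipy.optimize import linprog

def in_hull(q, vertices):
    """q lies in conv(vertices) iff some convex combination of the
    vertices equals q; checked as an LP feasibility problem."""
    V = np.array(vertices)
    n = len(V)
    A_eq = np.vstack([V.T, np.ones(n)])
    b_eq = np.append(np.array(q, dtype=float), 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success

n = 6
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]

# Every labelling of the n circle points is realised by the polygon whose
# vertices are exactly the positive points: no negative point falls inside.
ok = True
for labs in product([0, 1], repeat=n):
    pos = [p for p, y in zip(circle, labs) if y]
    neg = [p for p, y in zip(circle, labs) if not y]
    # The all-negative labelling is realised by a polygon away from all
    # points, so only labellings with at least one positive need checking.
    if pos and any(in_hull(q, pos) for q in neg):
        ok = False
print(ok)  # expected: True, for any choice of n
```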