VC Dimension

categories: Data Science, Definition

Definition:
The Vapnik-Chervonenkis (VC) dimension is a measure of the capacity or complexity of a hypothesis class in statistical learning theory. Specifically, it is the size of the largest set of points that can be shattered by the hypothesis class $H$.

A set of points is shattered by $H$ if, for every possible binary labeling of those points, there exists a hypothesis $h \in H$ that assigns exactly those labels to the points.

Formal Definition:
The VC dimension of a hypothesis class $H$ , denoted $VC (H)$ , is the size of the largest finite set $S$ such that $S$ can be shattered by $H$ . If $H$ can shatter arbitrarily large sets, $VC (H) = \infty$ .
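The definition can be checked by brute force when both the domain and the hypothesis class are small and finite. The sketch below is only an illustration under that assumption: the helper names `is_shattered` and `vc_dimension`, and the toy classes of thresholds and intervals on the integers 0 to 4, are hypothetical and not part of the note. The value it reports is the VC dimension restricted to the given finite domain.

```python
from itertools import combinations

def is_shattered(subset, hypotheses):
    """True if the hypotheses realize all 2^n labelings of the subset."""
    patterns = {tuple(h(x) for x in subset) for h in hypotheses}
    return len(patterns) == 2 ** len(subset)

def vc_dimension(domain, hypotheses):
    """Size of the largest subset of `domain` shattered by `hypotheses`."""
    d = 0
    for n in range(1, len(domain) + 1):
        if any(is_shattered(s, hypotheses) for s in combinations(domain, n)):
            d = n
        else:
            # If no set of size n is shattered, no larger set can be either,
            # because every subset of a shattered set is itself shattered.
            break
    return d

domain = list(range(5))

# Thresholds: h_t(x) = 1 if x >= t (toy class, assumed for illustration).
thresholds = [lambda x, t=t: int(x >= t) for t in range(6)]

# Intervals: h_{a,b}(x) = 1 if a <= x <= b (toy class, assumed for illustration).
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in range(5) for b in range(a, 5)]

print(vc_dimension(domain, thresholds))  # 1
print(vc_dimension(domain, intervals))   # 2
```

Thresholds cannot label a smaller point 1 and a larger point 0, so no pair is shattered; intervals cannot produce the pattern 1, 0, 1 on three ordered points.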

Key Concepts:

Shattering:
A hypothesis class $H$ shatters a set $S = \{x_{1}, x_{2}, \dots, x_{n}\}$ if for every possible labeling of $S$ (i.e., all $2^{n}$ labelings), there exists a hypothesis $h \in H$ that perfectly separates the points according to that labeling.
Examples:
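- A hypothesis class of linear classifiers in $R^{2}$ can shatter any 3 non-collinear points but cannot shatter any set of 4 points (for every 4 points, at least one labeling, such as the XOR pattern on a square, is not linearly separable). Thus, $VC (H) = 3$.
- A hypothesis class of all binary functions on an infinite domain can shatter any finite set, so $VC (H) = \infty$.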
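The linear-classifier example can be reproduced with a small experiment. The sketch below uses a perceptron with a bias term as the separability test, which is an assumption of this illustration rather than part of the definition: if the perceptron has not converged after `max_epochs` passes, the labeling is treated as not separable, which is a heuristic in general. For the two tiny point sets used here the outcome is reliable, because the perceptron converges within a few updates on separable labelings and can never converge on the XOR labeling. The function names are hypothetical.

```python
from itertools import product

def linearly_separable(points, labels, max_epochs=1000):
    """Perceptron test: True if a separating line is found within the cap."""
    w1 = w2 = b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            if y * (w1 * x1 + w2 * x2 + b) <= 0:  # misclassified or on the line
                w1 += y * x1
                w2 += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:
            return True
    return False  # no separator found within the cap (heuristic verdict)

def shattered_by_lines(points):
    """True if every +1/-1 labeling of the points passes the perceptron test."""
    return all(linearly_separable(points, labels)
               for labels in product((-1, 1), repeat=len(points)))

triangle = [(0, 0), (1, 0), (0, 1)]          # 3 non-collinear points
square = [(0, 0), (1, 0), (0, 1), (1, 1)]    # 4 points; XOR labeling fails
collinear = [(0, 0), (1, 0), (2, 0)]         # 3 collinear points

print(shattered_by_lines(triangle))   # True
print(shattered_by_lines(square))     # False
print(shattered_by_lines(collinear))  # False
```

The collinear case shows why the non-collinearity condition matters: the labeling +, -, + along a line cannot be produced by any half-plane.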

Properties:

Relation to Learning:
The VC dimension quantifies the capacity of $H$ .
- A higher VC dimension implies a more expressive hypothesis class, which can fit more complex data but is more prone to overfitting.
- A lower VC dimension implies a less expressive hypothesis class, which may underfit.
Generalization Bound:
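If $VC (H) = d$, then with probability at least $1 - \delta$ over the draw of the training sample, any hypothesis in $H$ that is consistent with the sample (the realizable case) has generalization error $\epsilon$ bounded, up to a constant factor, as:
$\epsilon \leq O\left(\frac{d \log (m / d) + \log (1 / \delta)}{m}\right)$
where $m$ is the number of training samples and $\delta$ is the allowed failure probability (so $1 - \delta$ is the confidence level).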
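To get a feel for how the bound behaves, the expression can be evaluated for a few sample sizes. This is a rough sketch: the function name `vc_bound` is hypothetical, constant factors are ignored, and the numbers only show the trend in $m$ and $d$, not actual error rates.

```python
import math

def vc_bound(d, m, delta):
    """Evaluate (d * log(m / d) + log(1 / delta)) / m, ignoring constants."""
    if m <= d:
        raise ValueError("the expression is only meaningful for m > d")
    return (d * math.log(m / d) + math.log(1 / delta)) / m

# The bound shrinks roughly like (d log m) / m as the sample grows.
for m in (100, 1000, 10000):
    print(m, round(vc_bound(d=3, m=m, delta=0.05), 4))

# A richer class (larger d) needs more samples for the same guarantee.
print(vc_bound(d=10, m=1000, delta=0.05) > vc_bound(d=3, m=1000, delta=0.05))
```

The dependence on $\delta$ is only logarithmic, so asking for higher confidence costs far less than increasing the VC dimension.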
Finite VC Dimension:
If $H$ has finite VC dimension, then $H$ is PAC (Probably Approximately Correct) learnable. For binary classification the converse also holds: a class with infinite VC dimension is not PAC learnable.

Applications:

Model Selection:
Helps compare the complexity of hypothesis classes (e.g., linear classifiers, decision trees).
Generalization Analysis:
Used to derive theoretical guarantees on a model’s ability to generalize beyond training data.
Support Vector Machines (SVMs):
For separating hyperplanes, the effective capacity shrinks as the required margin of separation grows, which is one motivation for maximizing the margin in SVMs.

Examples:

Linear Classifiers in $R^{2}$ :
- VC dimension = 3 (3 points in general position can be shattered).
Axis-Aligned Rectangles in $R^{2}$ :
- VC dimension = 4 (4 points in a diamond arrangement can be shattered, but no set of 5 points can).
Polynomial Classifiers of Degree $k$ :
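- VC dimension = $\binom{k + 2}{2} = \frac{(k + 1)(k + 2)}{2}$ in $R^{2}$ (the number of monomials of degree at most $k$ in two variables, i.e., the number of parameters). For $k = 1$ this gives 3, matching the linear case.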
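The axis-aligned rectangle example is easy to verify by brute force. The sketch below relies on one observation: a labeling is realizable by a rectangle exactly when the bounding box of the positive points contains no negative point. The function names and the specific diamond coordinates are assumptions made for this illustration.

```python
from itertools import product

def rectangle_realizes(points, labels):
    """True if some axis-aligned rectangle contains exactly the points labeled 1."""
    positives = [p for p, label in zip(points, labels) if label == 1]
    if not positives:
        return True  # an empty rectangle labels every point 0
    lo_x = min(x for x, _ in positives)
    hi_x = max(x for x, _ in positives)
    lo_y = min(y for _, y in positives)
    hi_y = max(y for _, y in positives)
    # The bounding box is the smallest candidate rectangle; if it already
    # captures a negative point, every larger rectangle does too.
    return not any(lo_x <= x <= hi_x and lo_y <= y <= hi_y
                   for (x, y), label in zip(points, labels) if label == 0)

def shattered_by_rectangles(points):
    """True if every 0/1 labeling of the points is realized by a rectangle."""
    return all(rectangle_realizes(points, labels)
               for labels in product((0, 1), repeat=len(points)))

diamond = [(0, 1), (1, 0), (0, -1), (-1, 0)]

print(shattered_by_rectangles(diamond))             # True
print(shattered_by_rectangles(diamond + [(0, 0)]))  # False
```

Adding the center point breaks shattering because any rectangle containing the four outer points must also contain the center. The same argument applies to any 5 points: the rectangle spanned by the extreme points in each direction contains the remaining one.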
