Random Projection

machine-learning dimensionality-reduction probability

Definition

Random Projection

Random projection is a computationally efficient dimensionality reduction technique that maps high-dimensional data points $x \in R^{p}$ into a lower-dimensional space $R^{k}$ with $k \ll p$ by multiplying them with a random $k \times p$ matrix $R$ . Formally, the projection $f : R^{p} \to R^{k}$ is defined as:

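$f (x) = \frac{1}{\sqrt{k}} R x$

where the entries of $R$ are sampled independently from a zero-mean, unit-variance distribution (e.g., $R_{ij} \sim N (0, 1)$ ). The factor $\frac{1}{\sqrt{k}}$ normalizes the projection so that squared norms are preserved in expectation: $E ∥ f (x) ∥^{2} = ∥ x ∥^{2}$ .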
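The definition can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not a tuned implementation: the helper names `random_projection_matrix` and `project`, the seed, and the sizes `p = 1000`, `k = 200` are all assumptions chosen for the example. It uses the scaling $1 / \sqrt{k}$, the choice under which $E ∥ f (x) ∥^{2} = ∥ x ∥^{2}$ for standard normal entries.

```python
import math
import random


def random_projection_matrix(k, p, seed=0):
    """Return a k x p matrix (list of rows) with i.i.d. N(0, 1) entries."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(k)]


def project(R, x):
    """Compute f(x) = (1 / sqrt(k)) * R x for a single point x of length p."""
    k = len(R)
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(r_ij * x_j for r_ij, x_j in zip(row, x)) for row in R]


# Hypothetical sizes: reduce a 1000-dimensional point to 200 dimensions.
p, k = 1000, 200
R = random_projection_matrix(k, p)
x = [1.0] * p
y = project(R, x)

print(len(y))  # 200
# The squared norm of y should be close to that of x (ratio near 1).
print(sum(v * v for v in y) / sum(v * v for v in x))
```

For real workloads a vectorized library would replace the nested loops; the structure (draw $R$ once, then multiply every point by it) stays the same.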

Distance Preservation

Johnson-Lindenstrauss Lemma

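For any set of $n$ points in $R^{p}$ and any $ϵ \in (0, 1)$ , there exists a linear map $f : R^{p} \to R^{k}$ with $k = O (\frac{\ln n}{ϵ ^{2}})$ such that the pairwise squared Euclidean distances between all points $u, v$ in the set are preserved within a factor of $1 \pm ϵ$ . A suitably scaled random Gaussian projection satisfies this with high probability, and the required $k$ depends only on $n$ and $ϵ$ , not on the original dimension $p$ :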

$(1 - ϵ) ∥ u - v ∥^{2} \leq ∥ f (u) - f (v) ∥^{2} \leq (1 + ϵ) ∥ u - v ∥^{2}$
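The bound can be checked empirically on a small example. The sketch below is only an illustration under assumed settings (the function name `distortion_range`, Gaussian test points, and the sizes `n = 10`, `p = 300`, `k = 200` are all hypothetical choices): it projects the points with a Gaussian matrix scaled by $1 / \sqrt{k}$ and reports the smallest and largest ratio $∥ f (u) - f (v) ∥^{2} / ∥ u - v ∥^{2}$ over all pairs.

```python
import itertools
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def distortion_range(n=10, p=300, k=200, seed=0):
    """Return (min, max) of projected/original squared-distance ratios."""
    rng = random.Random(seed)
    # n random test points in R^p (an arbitrary choice for the demo).
    points = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # k x p Gaussian projection matrix, scaled by 1 / sqrt(k) when applied.
    R = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    projected = [
        [scale * sum(r * x for r, x in zip(row, pt)) for row in R]
        for pt in points
    ]
    ratios = [
        sq_dist(projected[i], projected[j]) / sq_dist(points[i], points[j])
        for i, j in itertools.combinations(range(n), 2)
    ]
    return min(ratios), max(ratios)


lo, hi = distortion_range()
print(f"squared-distance ratios lie in [{lo:.3f}, {hi:.3f}]")
```

If every ratio falls inside $[1 - ϵ, 1 + ϵ]$ for the chosen $ϵ$ , the projection satisfies the inequality for that point set. Increasing `k` should tighten the interval around 1.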

Computational Efficiency

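PCA has a complexity of $O (p^{2} n + p^{3})$ : $O (p^{2} n)$ to compute the $p \times p$ covariance matrix from $n$ points and $O (p^{3})$ for its eigendecomposition. Random projection needs only $O (p kn)$ time, the cost of multiplying $n$ points by a $k \times p$ matrix, which is significantly faster when $k \ll p$ . Because the matrix $R$ is drawn without looking at the data, no fitting step is required. The method is particularly effective for extremely high-dimensional datasets where the covariance matrix calculation is computationally prohibitive.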
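To get a feel for the gap, the two operation counts can be evaluated for concrete sizes. The numbers below are hypothetical ( $p = 10^{4}$ , $n = 10^{5}$ , $k = 500$ ), and since big-O notation hides constant factors, the result is an order-of-magnitude comparison rather than a runtime prediction.

```python
def pca_ops(p, n):
    """Operation count from O(p^2 n + p^3): covariance plus eigendecomposition."""
    return p**2 * n + p**3


def random_projection_ops(p, k, n):
    """Operation count from O(p k n): multiply n points by a k x p matrix."""
    return p * k * n


# Hypothetical problem size chosen for illustration.
p, n, k = 10_000, 100_000, 500

pca = pca_ops(p, n)
rp = random_projection_ops(p, k, n)

print(pca)                 # 11000000000000
print(rp)                  # 500000000000
print(f"{pca / rp:.1f}x")  # 22.0x
```

When $n$ dominates, the ratio is roughly $p / k$ , so the advantage grows as the original dimension increases relative to the target dimension.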

Lukas' Notes
