t-distributed Stochastic Neighbour Embedding

machine-learning unsupervised-learning visualisation

Definition


t-distributed Stochastic Neighbour Embedding (t-SNE) is a non-linear, stochastic dimensionality reduction technique specifically designed for the visualisation of high-dimensional datasets in 2D or 3D space. Formally, it converts the high-dimensional Euclidean distances between points into conditional probabilities that represent similarities:

$p_{j \mid i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}$
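As a rough illustration of the formula above, the following sketch computes the conditional probabilities for a tiny dataset. The function name `conditional_probabilities`, the sample points, and the fixed bandwidths are assumptions made for the example; in practice each $\sigma_i$ is tuned per point rather than set by hand.

```python
import math

def conditional_probabilities(points, sigmas):
    """Return a matrix P where P[i][j] holds p_{j|i} (hypothetical helper)."""
    n = len(points)
    P = []
    for i in range(n):
        # Unnormalised Gaussian affinity of every other point k to point i.
        weights = [0.0] * n
        for k in range(n):
            if k != i:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
                weights[k] = math.exp(-sq_dist / (2.0 * sigmas[i] ** 2))
        total = sum(weights)
        # Normalise so that each row is a probability distribution over j != i.
        P.append([w / total for w in weights])
    return P

# Two nearby points and one distant point (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
P = conditional_probabilities(points, sigmas=[1.0, 1.0, 1.0])
```

Each row of `P` sums to one, the diagonal stays at zero because the sum excludes $k = i$, and the nearby point receives far more probability mass than the distant one.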

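The algorithm then seeks a low-dimensional embedding $\{y_{1}, \dots, y_{n}\}$ that minimises the Kullback–Leibler divergence $KL(P \parallel Q) = \sum_{i \neq j} p_{ij} \log (p_{ij} / q_{ij})$ between the high-dimensional joint distribution $P$, obtained by symmetrising the conditionals as $p_{ij} = (p_{j \mid i} + p_{i \mid j}) / 2n$, and a low-dimensional distribution $Q$ defined by a Student's t kernel with one degree of freedom: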

$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k} \sum_{l \neq k} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$
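To make the low-dimensional side concrete, here is a minimal sketch that evaluates $q_{ij}$ for a candidate embedding and the Kullback–Leibler cost against a given $P$. The helper names, the embedding `Y`, and the uniform target `P` are all hypothetical and chosen only to keep the example small.

```python
import math

def low_dim_affinities(Y):
    """Return Q where Q[i][j] holds q_ij under a Student-t kernel (hypothetical helper)."""
    n = len(Y)
    W = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if k != l:
                sq_dist = sum((a - b) ** 2 for a, b in zip(Y[k], Y[l]))
                W[k][l] = 1.0 / (1.0 + sq_dist)
    # A single normaliser over all ordered pairs makes Q a joint distribution.
    total = sum(sum(row) for row in W)
    return [[w / total for w in row] for row in W]

def kl_divergence(P, Q):
    """KL(P || Q), summed over ordered pairs i != j with non-zero p_ij."""
    n = len(P)
    return sum(P[i][j] * math.log(P[i][j] / Q[i][j])
               for i in range(n) for j in range(n)
               if i != j and P[i][j] > 0.0)

# Made-up 2D embedding of three points.
Y = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
Q = low_dim_affinities(Y)

# Hypothetical target: uniform over the six ordered pairs.
P = [[0.0 if i == j else 1.0 / 6.0 for j in range(3)] for i in range(3)]
cost = kl_divergence(P, Q)
```

The cost is zero only when `Q` matches `P` exactly, so any mismatch between the embedding's affinities and the target shows up as a positive value.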

Comparison with PCA

Unlike PCA, which is a deterministic, linear projection that preserves as much global variance as possible, t-SNE is stochastic and non-linear. It prioritises local structure: points that are close in the high-dimensional space tend to remain close in the low-dimensional embedding, which often makes it better suited than PCA to revealing clusters and sub-manifolds. The trade-off is that distances between well-separated clusters, and the apparent sizes of clusters, are not reliable in a t-SNE plot.

Computational Considerations

Stochastic Nature: The embedding is usually initialised at random and the KL objective is non-convex, so gradient descent can settle in different local minima. Multiple runs of t-SNE may therefore produce different visual representations unless the random seed is fixed.
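The run-to-run variation can be reproduced with a deliberately stripped-down sketch. It assumes a single shared bandwidth, plain gradient descent with a fixed step size, and no momentum or early exaggeration, so it is not a faithful t-SNE implementation. The names `joint_p` and `embed`, the data, and the hyperparameters are all illustrative.

```python
import math
import random

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def joint_p(X, sigma=1.0):
    """Symmetrised Gaussian affinities p_ij with one shared bandwidth (a simplification)."""
    n = len(X)
    cond = []
    for i in range(n):
        w = [0.0 if k == i else math.exp(-sq_dist(X[i], X[k]) / (2.0 * sigma ** 2))
             for k in range(n)]
        total = sum(w)
        cond.append([v / total for v in w])
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)] for i in range(n)]

def embed(X, seed, steps=200, lr=1.0):
    """Plain gradient descent on KL(P || Q) from a seeded random start."""
    rng = random.Random(seed)
    n = len(X)
    P = joint_p(X)
    Y = [[rng.gauss(0.0, 1e-2), rng.gauss(0.0, 1e-2)] for _ in range(n)]
    for _ in range(steps):
        # Unnormalised Student-t weights; q_ij = W[i][j] / total.
        W = [[0.0 if i == j else 1.0 / (1.0 + sq_dist(Y[i], Y[j])) for j in range(n)]
             for i in range(n)]
        total = sum(map(sum, W))
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i != j:
                    # dC/dy_i = 4 * sum_j (p_ij - q_ij) * (1 + |y_i - y_j|^2)^-1 * (y_i - y_j)
                    coeff = 4.0 * (P[i][j] - W[i][j] / total) * W[i][j]
                    for d in range(2):
                        grad[i][d] += coeff * (Y[i][d] - Y[j][d])
        Y = [[Y[i][d] - lr * grad[i][d] for d in range(2)] for i in range(n)]
    return Y

# Two small, well-separated groups of points (made-up data).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
run_a = embed(X, seed=0)
run_b = embed(X, seed=0)  # same seed: identical coordinates
run_c = embed(X, seed=1)  # different seed: different coordinates
```

Fixing the seed makes a run repeatable, while changing it alters the coordinates even though the data and the objective are unchanged.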

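Parameter Sensitivity: The results depend strongly on the perplexity parameter, which can be read as a smooth measure of the effective number of neighbours of each point. Each bandwidth $\sigma_i$ is chosen so that the conditional distribution $P_i$ has the requested perplexity $2^{H(P_i)}$, where $H$ is the Shannon entropy in bits. Low values emphasise local structure and high values emphasise more global structure; values between 5 and 50 are typical.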
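One way to see how perplexity controls the neighbourhood size is to search for the bandwidth that yields a requested perplexity for a single point. The sketch below uses bisection on a log scale. The function names, the search bounds, and the squared distances are assumptions for illustration, not taken from any particular library.

```python
import math

def perplexity(probs):
    """Perplexity 2**H of a discrete distribution, with H its Shannon entropy in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy

def conditional_row(sq_dists, sigma):
    """p_{j|i} for one point i, given its squared distances to all other points."""
    # Subtracting the smallest distance leaves the normalised result unchanged
    # but keeps the largest weight at 1, which avoids underflow for small sigma.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def find_sigma(sq_dists, target, lo=1e-3, hi=1e3, iters=60):
    """Bisection on sigma; relies on perplexity increasing with sigma."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # midpoint on a log scale
        if perplexity(conditional_row(sq_dists, mid)) > target:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)

# Squared distances from one point to its five neighbours (made-up values).
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
sigma_small = find_sigma(sq_dists, target=1.5)
sigma_large = find_sigma(sq_dists, target=4.0)
```

A larger target perplexity forces a wider Gaussian, so more neighbours receive non-negligible probability; with five neighbours the perplexity can never exceed 5.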

Lukas' Notes
