machine-learning

Definition

Cosine Similarity

Cosine similarity is a measure of similarity between two non-zero vectors that evaluates the cosine of the angle between them. Formally, for two vectors , the cosine similarity is defined as the inner product of the vectors normalised by the product of their magnitudes:

The resulting value resides in the interval :

  • 1: Identical direction (maximally similar).
  • 0: Orthogonal (decorrelated).
  • -1: Opposite direction (maximally dissimilar).

Properties

  • Scale Invariance: Unlike Euclidean distance, cosine similarity is invariant to the magnitude of the vectors, making it suitable for high-dimensional data such as text embeddings (where document length varies).
  • Metric Relation: It is related to the angular distance, but is not a formal metric itself as it violates the triangle inequality.