k-Nearest Neighbour

machine-learning classification regression

Definition

The k-nearest neighbour (k-NN) algorithm is a non-parametric, distance-based method used for classification and regression. For a query point $x$, the algorithm identifies the $k$ closest instances $\{x_{1}, \dots, x_{k}\}$ in the training set according to a specified metric $d$.
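The neighbour search in the definition can be sketched as a brute-force scan. This is a minimal illustration, not an efficient implementation: the function names `euclidean` and `k_nearest`, the choice of the Euclidean distance for $d$, and the toy data are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance, used here as one possible choice of the metric d."""
    return math.dist(a, b)

def k_nearest(query, points, k, metric=euclidean):
    """Return the indices of the k training points closest to `query`.

    Brute force: every distance is computed, then the indices are sorted.
    `sorted` is stable, so equidistant points keep their original order.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of training points")
    order = sorted(range(len(points)), key=lambda i: metric(query, points[i]))
    return order[:k]

# Hypothetical toy training set in two dimensions.
train = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.0, 2.0)]
neighbours = k_nearest((0.0, 0.5), train, k=2)
print(neighbours)
```

Any other metric can be passed through the `metric` argument, as long as it takes two points and returns a non-negative number.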

Prediction Rules

Classification: The predicted label is determined by a majority vote among the $k$ neighbours, i.e. the most frequent class among them wins. If $k = 1$, the query point is simply assigned the class of its single nearest neighbour.

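Regression: The predicted value is the arithmetic mean of the target values of the $k$ neighbours, $\hat{y} = \frac{1}{k} \sum_{i=1}^{k} y_{i}$, where $y_{i}$ is the target value of neighbour $x_{i}$.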
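Both rules reduce to a few lines once the neighbours' labels or target values are known. The sketch below assumes they are passed in ordered from nearest to farthest; the function names and the tie-breaking behaviour are illustrative choices, not part of the definition.

```python
from collections import Counter
from statistics import mean

def knn_classify(neighbour_labels):
    """Most frequent label among the neighbours.

    Assumption: ties go to the label encountered first, which is the
    nearest neighbour's label when the input is ordered by distance.
    """
    return Counter(neighbour_labels).most_common(1)[0][0]

def knn_regress(neighbour_targets):
    """Arithmetic mean of the neighbours' target values."""
    return mean(neighbour_targets)

label = knn_classify(["a", "b", "a"])   # two votes for "a", one for "b"
value = knn_regress([1.0, 2.0, 6.0])    # (1 + 2 + 6) / 3
print(label, value)
```

With $k = 1$ both functions simply return the label or target of the single nearest neighbour.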

Properties

Lazy Learning: k-NN is a memory-based learner that requires no explicit training phase; instead, it stores the entire dataset and performs all computations during the inference step.

Sensitivity to Scale: Because its predictions depend entirely on distances, k-NN is highly sensitive to the relative scales of the features: a feature with a large numeric range dominates the distance calculation. Features are therefore usually rescaled first, for example with Standardisation or Min-Max Scaling, so that all dimensions contribute comparably.
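The effect of scale can be seen on a small, made-up example in which one feature (income) has a much larger numeric range than the other (age). The data, the helper names `standardise` and `nearest`, and the use of the population standard deviation are assumptions for illustration only.

```python
import math
from statistics import mean, pstdev

def standardise(rows):
    """Z-score each feature; also return the per-feature (mean, std).

    Assumes no feature is constant, otherwise the std would be zero.
    """
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    scaled = [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]
    return scaled, stats

def nearest(query, points):
    """Index of the single nearest point under the Euclidean distance."""
    return min(range(len(points)), key=lambda i: math.dist(query, points[i]))

# Hypothetical data: (income, age in years).
train = [(30000.0, 25.0), (31000.0, 60.0), (60000.0, 26.0)]
query = (32000.0, 27.0)

# Raw features: income differences swamp age differences.
raw_nearest = nearest(query, train)

# Standardised features: the query is scaled with the training statistics.
scaled_train, stats = standardise(train)
scaled_query = tuple((v - m) / s for v, (m, s) in zip(query, stats))
scaled_nearest = nearest(scaled_query, scaled_train)

print(raw_nearest, scaled_nearest)
```

On the raw data the nearest neighbour is the 60-year-old with a similar income; after standardisation it is the 25-year-old, who is close in both features. Note that the query is transformed with the mean and standard deviation computed on the training set, not with its own statistics.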
