machine-learning classification regression
Definition
k-Nearest Neighbour
The k-nearest neighbour (k-NN) algorithm is a non-parametric, distance-based method used for classification and regression. For a query point $x$, the algorithm identifies the $k$ closest instances in the training set according to a specified metric $d$.
Prediction Rules
Classification: The predicted label is determined by a majority vote among the $k$ neighbours. If $k = 1$, the object is simply assigned the class of its single nearest neighbour.
Regression: The predicted value is the arithmetic mean of the target values of the neighbours.
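The two prediction rules can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the function names (`k_nearest`, `knn_classify`, `knn_regress`), the toy data, the default Euclidean metric, and the tie-breaking behaviour are all assumptions made for the example.

```python
import math
from collections import Counter


def k_nearest(query, X_train, k, metric=math.dist):
    """Return the indices of the k training points closest to `query`."""
    if not 1 <= k <= len(X_train):
        raise ValueError("k must be between 1 and the number of training points")
    # Brute force: rank every stored point by its distance to the query.
    order = sorted(range(len(X_train)), key=lambda i: metric(query, X_train[i]))
    return order[:k]


def knn_classify(query, X_train, y_train, k, metric=math.dist):
    """Classification: majority vote among the k nearest neighbours."""
    neighbours = k_nearest(query, X_train, k, metric)
    votes = Counter(y_train[i] for i in neighbours)
    # Assumption of this sketch: on a tied vote, the label seen first
    # (i.e. belonging to the nearer neighbour) wins.
    return votes.most_common(1)[0][0]


def knn_regress(query, X_train, y_train, k, metric=math.dist):
    """Regression: arithmetic mean of the k nearest neighbours' targets."""
    neighbours = k_nearest(query, X_train, k, metric)
    return sum(y_train[i] for i in neighbours) / len(neighbours)


# Hypothetical toy data: two loose clusters in the plane.
X_train = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
y_class = ["a", "a", "b", "b"]
y_value = [1.0, 2.0, 5.0, 6.0]

query = (1.2, 1.4)
print(knn_classify(query, X_train, y_class, k=3))  # "a": two of three neighbours vote "a"
print(knn_regress(query, X_train, y_value, k=3))   # mean of 1.0, 2.0 and 5.0
print(knn_classify(query, X_train, y_class, k=1))  # "a": label of the single nearest point
```

Note that nothing is fitted before the query arrives: the functions only read the stored training data at prediction time. The `metric` parameter accepts any callable taking two points, so a different distance can be swapped in without touching the voting or averaging logic.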
Properties
Lazy Learning: k-NN is a memory-based learner that requires no explicit training phase; instead, it stores the entire dataset and performs all computations during the inference step.
Sensitivity to Scale: As a distance-based method, k-NN is highly sensitive to the relative scales of features: a feature measured in large units can dominate the distance and effectively drown out the others. Features are therefore usually rescaled with Standardisation or Min-Max Scaling so that each dimension contributes comparably to the distance calculation.
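The effect of scale can be seen on a small, made-up example. The sketch below assumes Euclidean distance and two hypothetical features (age in years, income in currency units); the helper names `fit_standardiser` and `nearest_index` are illustrative only. The scaling statistics are computed from the stored training data and then reused for the query.

```python
import math
import statistics


def fit_standardiser(X):
    """Return a function that z-scores a row using the column statistics of X."""
    columns = list(zip(*X))
    means = [statistics.fmean(c) for c in columns]
    # Population standard deviation; a constant column falls back to 1.0
    # to avoid division by zero (an assumption of this sketch).
    stds = [statistics.pstdev(c) or 1.0 for c in columns]

    def transform(row):
        return tuple((v - m) / s for v, m, s in zip(row, means, stds))

    return transform


def nearest_index(query, X):
    """Index of the single nearest training point under Euclidean distance."""
    return min(range(len(X)), key=lambda i: math.dist(query, X[i]))


# Hypothetical data: (age in years, income).
X_train = [(25, 52_000), (60, 50_100), (40, 30_000), (45, 70_000)]
query = (26, 50_000)

# Raw features: income differences are in the thousands, so they swamp age.
print(nearest_index(query, X_train))  # 1: the 60-year-old with a similar income

# Standardised features: both columns are now on a comparable scale.
transform = fit_standardiser(X_train)
X_scaled = [transform(row) for row in X_train]
print(nearest_index(transform(query), X_scaled))  # 0: the 25-year-old
```

On the raw data the nearest neighbour is chosen almost entirely by income; after standardisation the 34-year age gap counts, and the neighbour changes.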