machine-learning supervised-learning
Definition
Supervised Learning
Supervised learning is a paradigm of machine learning where the objective is to learn a mapping from an instance space to a label space based on provided input-output pairs. Formally, given a training dataset sampled i.i.d. from a joint distribution , the learner aims to identify a hypothesis that minimises the expected risk for a given loss function .
Data Representation
The supervised setting assumes a structural relationship between input and output, defined by the geometry of the data and the capacity of the learner. The learning process operates on an instance space (typically ) and a label space . The nature of distinguishes between tasks: discrete labels define classification, while continuous values define regression.
Inductive Bias
To prevent an unconstrained search, the learner is restricted to a hypothesis class . This restriction encodes the inductive bias, allowing the model to generalise from finite samples to the full distribution.
Empirical Risk Minimisation
Since the true distribution is unknown, the model is trained by identifying a hypothesis that minimises the loss over the available data. This principle, known as Empirical Risk Minimisation (ERM), seeks to approximate the optimal mapping by penalising discrepancies between the predicted and true labels.
Definition
Link to originalSupervised Learning Algorithm
A supervised learning algorithm is a learning algorithm that maps a set of training data to a specific hypothesis:
where:
- is the instance space.
- is the label space.
- is the hypothesis class.