Hindley–Milner Type System

Definition

The Hindley–Milner type system is a static type system for a lambda calculus with let, where every typable expression has a principal type scheme and where type inference is decidable by unification.

Its central judgement has the form
$Γ ⊢ e : τ,$
meaning that expression $e$ has type $τ$ under type environment $Γ$ .

Object Language

HM is usually presented over an unannotated lambda calculus extended with let:

e ::= x ∣ c ∣ λ x . e ∣ e_{1} e_{2} ∣ let x = e_{1} in e_{2} .

Here:

$x \in V$ is a variable;
$c \in C$ is a constant with a predefined type;
$λ x . e$ introduces a function;
$e_{1} e_{2}$ applies a function;
let is the source of parametric polymorphism.
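The grammar above can be mirrored as a small abstract syntax tree. The following is a minimal sketch in Python; the class names (`Var`, `Const`, `Lam`, `App`, `Let`) and the `show` helper are illustrative assumptions, not part of the notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # x
    name: str

@dataclass(frozen=True)
class Const:          # c
    value: object

@dataclass(frozen=True)
class Lam:            # λ x . e
    param: str
    body: object

@dataclass(frozen=True)
class App:            # e1 e2
    fn: object
    arg: object

@dataclass(frozen=True)
class Let:            # let x = e1 in e2
    name: str
    bound: object
    body: object

def show(e):
    """Render an expression in a notation close to the grammar."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Const):
        return repr(e.value)
    if isinstance(e, Lam):
        return f"(λ{e.param}. {show(e.body)})"
    if isinstance(e, App):
        return f"({show(e.fn)} {show(e.arg)})"
    return f"let {e.name} = {show(e.bound)} in {show(e.body)}"

# let id = λ x . x in id id
example = Let("id", Lam("x", Var("x")), App(Var("id"), Var("id")))
print(show(example))
```

Frozen dataclasses are used here so that terms are immutable and comparable by structure; any other tagged representation would serve equally well.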

Types

Monotypes

A monotype is an ordinary type with no explicit universal quantifier:

τ ::= α ∣ b ∣ τ_{1} \to τ_{2} .

Here:

$α \in A$ is a type variable;
$b \in B$ is a base type, such as $Int$ or $Bool$ ;
$τ_{1} \to τ_{2}$ is a function type.

Extensions often add product types, sum types, lists, records, and algebraic data types, but the core mechanism does not require them.

Type Schemes

A type scheme is a type with universal quantification over some type variables:

σ ::= \forall α_{1} \dots α_{n} . τ .

The scheme

\forall α . α \to α

means: for every type $α$ , the expression may be used as a function from $α$ to $α$ .

HM is rank-1 polymorphic: universal quantifiers appear only at the outermost level of type schemes stored in $Γ$ .
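As a rough illustration, monotypes and type schemes can be modelled as separate data types, so that a quantifier can only ever sit at the top of a scheme. This is a sketch under assumed names (`TVar`, `TCon`, `TFun`, `Scheme`, `show`), none of which come from the notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:           # type variable α
    name: str

@dataclass(frozen=True)
class TCon:           # base type b, e.g. Int or Bool
    name: str

@dataclass(frozen=True)
class TFun:           # function type τ1 -> τ2
    arg: object
    res: object

@dataclass(frozen=True)
class Scheme:         # ∀ α1 ... αn . τ
    vars: tuple       # names of the quantified variables
    body: object      # a monotype; by convention never another Scheme

def show(t):
    """Pretty-print a monotype or a scheme."""
    if isinstance(t, (TVar, TCon)):
        return t.name
    if isinstance(t, TFun):
        left = show(t.arg)
        if isinstance(t.arg, TFun):   # arrows associate to the right
            left = f"({left})"
        return f"{left} -> {show(t.res)}"
    prefix = f"forall {' '.join(t.vars)}. " if t.vars else ""
    return prefix + show(t.body)

identity = Scheme(("a",), TFun(TVar("a"), TVar("a")))
print(show(identity))
```

Because `Scheme.body` is meant to hold only monotypes, a type such as `(forall a. a -> a) -> Int` has no representation here, which is the rank-1 restriction in data-structure form.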

Type Environments

A type environment maps term variables to type schemes:

Γ : V ⇀ Σ.

For example:

Γ = {id : \forall α . α \to α, n : Int} .

A variable looked up from $Γ$ may be instantiated before use.

Free Type Variables

Let $ftv (τ)$ be the set of free type variables in $τ$ :

ftv (α) = {α},
ftv (b) = \emptyset,
ftv (τ_{1} \to τ_{2}) = ftv (τ_{1}) \cup ftv (τ_{2}) .

For type schemes and environments:

ftv (\forall α_{1} \dots α_{n} . τ) = ftv (τ) ∖ {α_{1}, \dots, α_{n}},
ftv (Γ) = \bigcup_{x : σ \in Γ} ftv (σ) .

Free type variables are exactly the type variables that are not locally bound by a $\forall$ .
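The definitions of $ftv$ translate almost line by line into code. The sketch below assumes a hypothetical representation (`TVar`, `TCon`, `TFun`, `Scheme`, and an environment as a plain `dict` from variable names to schemes); only the equations themselves come from the notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

@dataclass(frozen=True)
class Scheme:
    vars: tuple
    body: object

def ftv(t):
    """Free type variables of a monotype or a type scheme."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TCon):
        return set()
    if isinstance(t, TFun):
        return ftv(t.arg) | ftv(t.res)
    if isinstance(t, Scheme):
        return ftv(t.body) - set(t.vars)   # quantified variables are bound
    raise TypeError(f"not a type: {t!r}")

def ftv_env(env):
    """Union of the free type variables of every scheme in the environment."""
    out = set()
    for scheme in env.values():
        out |= ftv(scheme)
    return out

env = {
    "id": Scheme(("a",), TFun(TVar("a"), TVar("a"))),  # a is bound here
    "y": Scheme((), TVar("b")),                        # b is free here
}
print(ftv_env(env))
```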

Generalisation and Instantiation

HM has two complementary operations.

Generalisation

After inferring a type for a let-bound expression, HM quantifies over type variables that are not fixed by the surrounding environment:

gen (Γ, τ) = \forall \overline{α} . τ where \overline{α} = ftv (τ) ∖ ftv (Γ) .

Only variables that do not occur free in $Γ$ may be generalised; a variable that is free in $Γ$ is still constrained by the surrounding context.
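A minimal sketch of $gen$, assuming the same hypothetical representation as elsewhere (`TVar`, `TFun`, `Scheme`, environments as `dict`). Sorting the quantified names is a choice made here only to get deterministic output.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

@dataclass(frozen=True)
class Scheme:
    vars: tuple
    body: object

def ftv(t):
    """Free type variables of a monotype or scheme."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TFun):
        return ftv(t.arg) | ftv(t.res)
    return ftv(t.body) - set(t.vars)       # Scheme

def gen(env, t):
    """Quantify over ftv(t) minus ftv(env)."""
    env_vars = set()
    for scheme in env.values():
        env_vars |= ftv(scheme)
    return Scheme(tuple(sorted(ftv(t) - env_vars)), t)

a, b = TVar("a"), TVar("b")
env = {"y": Scheme((), b)}                 # b is free in the environment
print(gen(env, TFun(a, b)))                # only a is quantified
print(gen({}, TFun(a, b)))                 # both a and b are quantified
```

In the first call `b` stays unquantified because the environment still refers to it; generalising it would let `y` and the new binding disagree about what `b` is.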

Instantiation

When a polymorphic variable is used, HM replaces its quantified variables by fresh type variables:

inst (\forall α_{1} \dots α_{n} . τ) = τ [β_{1} / α_{1}, \dots, β_{n} / α_{n}],

where $β_{1}, \dots, β_{n}$ are fresh.

Thus each use of a let-bound polymorphic variable receives its own copy of the type.
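Instantiation can be sketched as a substitution of fresh variables for the quantified ones. The names below (`fresh`, `substitute`, `inst`, and the generated variable names `t0`, `t1`, ...) are assumptions for illustration; the sketch also assumes that no user-written type variable uses those generated names.

```python
from dataclasses import dataclass
import itertools

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

@dataclass(frozen=True)
class Scheme:
    vars: tuple
    body: object

_ids = itertools.count()

def fresh():
    """Return a type variable that has not been handed out before."""
    return TVar(f"t{next(_ids)}")

def substitute(mapping, t):
    """Replace type variables in t according to mapping (name -> monotype)."""
    if isinstance(t, TVar):
        return mapping.get(t.name, t)
    if isinstance(t, TFun):
        return TFun(substitute(mapping, t.arg), substitute(mapping, t.res))
    return t

def inst(scheme):
    """Replace every quantified variable by a fresh one."""
    mapping = {a: fresh() for a in scheme.vars}
    return substitute(mapping, scheme.body)

identity = Scheme(("a",), TFun(TVar("a"), TVar("a")))
first, second = inst(identity), inst(identity)
print(first)
print(second)
```

The two instances share no variables, so constraining one use of `identity` cannot affect the other.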

Typing Rules

Variable

\frac{x : σ \in Γ \qquad τ = inst ( σ )}{Γ ⊢ x : τ}_{(var)}

A variable is looked up and instantiated.

Constant

\frac{c : σ \in Γ _{0} \qquad τ = inst ( σ )}{Γ ⊢ c : τ}_{(const)}

Constants are treated like predefined variables in a global environment $Γ_{0}$ .

Abstraction

\frac{Γ , x : τ _{1} ⊢ e : τ _{2}}{Γ ⊢ λ x . e : τ _{1} \to τ _{2}}_{(abs)}

The parameter $x$ is monomorphic inside the abstraction.

Application

\frac{Γ ⊢ e _{1} : τ _{1} \to τ _{2} \qquad Γ ⊢ e _{2} : τ _{1}}{Γ ⊢ e _{1} e _{2} : τ _{2}}_{(app)}

Application forces the function input type to match the argument type.

Let

\frac{Γ ⊢ e _{1} : τ _{1} \qquad Γ , x : gen ( Γ , τ _{1} ) ⊢ e _{2} : τ _{2}}{Γ ⊢ let x = e _{1} in e _{2} : τ _{2}}_{(let)}

This is the key HM rule: infer $e_{1}$ , generalise it, then type $e_{2}$ with $x$ available polymorphically.

Inference Principle

Algorithm W is a standard inference algorithm for HM. It computes either failure or a pair

W (Γ, e) = (S, τ),

where:

$S$ is a type substitution;
$τ$ is the inferred monotype;
$S Γ ⊢ e : τ$ if inference succeeds.

A type substitution maps type variables to monotypes, where $T$ denotes the set of monotypes:

S : A ⇀ T .

For application, inference creates a fresh type variable and solves the equality constraint by a most general unifier:

W (Γ, e_{1}) = (S_{1}, τ_{1}),
W (S_{1} Γ, e_{2}) = (S_{2}, τ_{2}),
U = mgu (S_{2} τ_{1}, τ_{2} \to α),
W (Γ, e_{1} e_{2}) = (U \circ S_{2} \circ S_{1}, U α) .

The occurs check rejects equations such as

α = α \to β,

because they would require an infinite type.
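A most general unifier with an occurs check can be sketched as follows. This is one common first-order unification scheme, not necessarily the exact formulation the notes have in mind; the names `apply`, `compose`, `occurs`, `bind`, `mgu` and `UnificationError` are assumptions, and substitutions are represented as plain `dict`s from variable names to monotypes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

class UnificationError(Exception):
    pass

def apply(s, t):
    """Apply substitution s to monotype t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, TFun):
        return TFun(apply(s, t.arg), apply(s, t.res))
    return t

def compose(s2, s1):
    """s2 after s1: apply s1 first, then s2."""
    return {**s2, **{a: apply(s2, t) for a, t in s1.items()}}

def occurs(a, t):
    """Does the variable named a occur in t?"""
    if isinstance(t, TVar):
        return t.name == a
    if isinstance(t, TFun):
        return occurs(a, t.arg) or occurs(a, t.res)
    return False

def bind(a, t):
    if occurs(a, t):                       # would need an infinite type
        raise UnificationError(f"occurs check: {a} in {t}")
    return {a: t}

def mgu(t1, t2):
    if t1 == t2:
        return {}
    if isinstance(t1, TVar):
        return bind(t1.name, t2)
    if isinstance(t2, TVar):
        return bind(t2.name, t1)
    if isinstance(t1, TFun) and isinstance(t2, TFun):
        s1 = mgu(t1.arg, t2.arg)
        s2 = mgu(apply(s1, t1.res), apply(s1, t2.res))
        return compose(s2, s1)
    raise UnificationError(f"cannot unify {t1} with {t2}")

b, g, d = TVar("b"), TVar("g"), TVar("d")
result = mgu(TFun(b, b), TFun(TFun(g, g), d))
print(result)
```

The example unifies `b -> b` with `(g -> g) -> d` and binds both `b` and `d` to `g -> g`. Calling `mgu(TVar("a"), TFun(TVar("a"), TVar("b")))` raises `UnificationError`, which is the occurs check rejecting the equation shown above.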

Principal Type Property

If an expression $e$ is typable in HM under $Γ$ , then there exists a monotype $τ_{0}$ such that:

$Γ ⊢ e : τ_{0}$ ;

for every $τ$ with $Γ ⊢ e : τ$ , there is a substitution $S$ with $S τ_{0} = τ$ .

Thus $gen (Γ, τ_{0})$ is the most general type scheme for $e$ .

The principal type property is what makes annotation-free inference practical: the system does not have to guess among unrelated valid types.[^damas-milner]

Worked Example

Consider:

let id = λ x . x in id id .

First infer the bound expression:

x : α,
λ x . x : α \to α .

Since $α \notin ftv (\emptyset)$ , generalise:

id : \forall α . α \to α .

In the body, each occurrence is instantiated separately:

id_{1} : β \to β,
id_{2} : γ \to γ .

Application requires

β \to β = (γ \to γ) \to δ .

Unification gives

β = γ \to γ, δ = γ \to γ .

Therefore:

let id = λ x . x in id id : γ \to γ .

The whole expression has principal type scheme:

\forall γ . γ \to γ .
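The worked example can be replayed with a compact sketch of Algorithm W. Everything here is an illustrative assumption rather than a reference implementation: the node classes are built with `dataclasses.make_dataclass` only to save space, constants and base types are omitted, and fresh variables are named `t0`, `t1`, ... instead of Greek letters.

```python
from dataclasses import make_dataclass
import itertools

def node(name, fields):
    return make_dataclass(name, fields.split(), frozen=True)

TVar, TFun = node("TVar", "name"), node("TFun", "arg res")
Scheme = node("Scheme", "vars body")
Var, Lam = node("Var", "name"), node("Lam", "param body")
App, Let = node("App", "fn arg"), node("Let", "name bound body")

_ids = itertools.count()

def fresh():
    return TVar(f"t{next(_ids)}")

def apply(s, t):
    """Apply substitution s (dict: name -> monotype) to a monotype."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    return TFun(apply(s, t.arg), apply(s, t.res))

def apply_env(s, env):
    def on_scheme(sc):  # never substitute for quantified variables
        inner = {a: t for a, t in s.items() if a not in sc.vars}
        return Scheme(sc.vars, apply(inner, sc.body))
    return {x: on_scheme(sc) for x, sc in env.items()}

def compose(s2, s1):
    """s2 after s1."""
    return {**s2, **{a: apply(s2, t) for a, t in s1.items()}}

def ftv(t):
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TFun):
        return ftv(t.arg) | ftv(t.res)
    return ftv(t.body) - set(t.vars)       # Scheme

def gen(env, t):
    env_vars = set().union(*(ftv(sc) for sc in env.values()))
    return Scheme(tuple(sorted(ftv(t) - env_vars)), t)

def inst(sc):
    return apply({a: fresh() for a in sc.vars}, sc.body)

def mgu(t1, t2):
    if t1 == t2:
        return {}
    if isinstance(t1, TVar) or isinstance(t2, TVar):
        v, t = (t1, t2) if isinstance(t1, TVar) else (t2, t1)
        if v.name in ftv(t):
            raise TypeError(f"occurs check: {v.name} in {t}")
        return {v.name: t}
    s1 = mgu(t1.arg, t2.arg)
    s2 = mgu(apply(s1, t1.res), apply(s1, t2.res))
    return compose(s2, s1)

def W(env, e):
    if isinstance(e, Var):
        if e.name not in env:
            raise TypeError(f"unbound variable {e.name}")
        return {}, inst(env[e.name])
    if isinstance(e, Lam):                 # parameter stays monomorphic
        a = fresh()
        s, t = W({**env, e.param: Scheme((), a)}, e.body)
        return s, TFun(apply(s, a), t)
    if isinstance(e, App):
        s1, t1 = W(env, e.fn)
        s2, t2 = W(apply_env(s1, env), e.arg)
        a = fresh()
        u = mgu(apply(s2, t1), TFun(t2, a))
        return compose(u, compose(s2, s1)), apply(u, a)
    s1, t1 = W(env, e.bound)               # Let: infer, generalise, continue
    env1 = apply_env(s1, env)
    s2, t2 = W({**env1, e.name: gen(env1, t1)}, e.body)
    return compose(s2, s1), t2

example = Let("id", Lam("x", Var("x")), App(Var("id"), Var("id")))
subst, ty = W({}, example)
scheme = gen({}, ty)
print(scheme)
```

The inferred type has the shape `t -> t` for a single variable `t`, matching $γ \to γ$ above up to renaming. Replacing the let by a lambda, as in `Lam("f", App(Var("f"), Var("f")))`, makes `W` fail with the occurs check, because a lambda-bound `f` is never generalised.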

Boundary

HM is deliberately small. In its classical form it excludes:

higher-rank polymorphism, where $\forall$ appears inside function arguments;
ad-hoc overloading, unless added by a separate mechanism such as type classes;
general subtyping;
unrestricted polymorphic recursion;
dependent types.

These features can be added to programming languages, but usually at the cost of the simple principal-type theorem, complete inference, or both.

Lukas' Notes
