data-structures

Definition

Hash Function

A hash function is a deterministic function used in computer systems to compute a position in a data structure from a key without the need to search.

Note that does not need to be surjective, see quality.

Quality

An hash function is called optimal if it has no collisions. However, since resources are limited, this is not always feasible for general domains.

Methods

Modulo Method

Let be the size of a data structure, e.g. an array and be a key, then a hash function ca be defined as:

Multiplication Method

Let be a key and be an irrational number, then a hash function can be defined as:

However, a more optimal hash distribution can be achieved using the reciprocal of the golden ratio:

Collision

Let be two different keys. Two keys are called colliding if:

meaning that both keys have the same hash values and thus same positions in e.g. an array.

This must be addressed by the data structure. For instance, an hash table achieves this by storing a singly linked list at each position of the array (separate chaining), meaning multiple values can be stored at each position.

See probing.