Union-Find Data Structure

Definition

The union-find data structure stores a collection of pairwise disjoint sets ($\forall i \neq j : S_{i} \cap S_{j} = \emptyset$) and provides the following three operations:

$makeset (v)$ : Creates a new disjoint set $S_{v} = \{v\}$, where $v$ is called the representative of $S_{v}$.

$union (v, w)$ : Merges two disjoint sets, $S_{v}$ and $S_{w}$ , where $v$ and $w$ are the representatives of their respective sets.

$findset (v)$ : Finds the set to which $v$ belongs, identified by its representative.

Implementation

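The main idea of the implementation is to represent each disjoint set as a tree, i.e. one connected component of a forest, in which every node points to its parent and the root is the representative of the set: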

The above shows two disjoint sets $S_{i}$ and $S_{h}$ . We notice that

$findset (c) = findset (i) = S_{i} = \{i, c\}$
$findset (g) = findset (f) = findset (h) = S_{h} = \{h, f, g\}$

Trivially, creating a new disjoint set using $makeset (z)$ would result in a subgraph with a single node $z$ .
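The parent-pointer representation can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes a dictionary `parent` in place of an array, and the edges c → i and g → f → h are an assumption, since the text only fixes the members of the two sets and their representatives.

```python
# Parent pointers for the two example sets; a representative points to itself.
# The concrete edges are an assumption: only the set members and the
# representatives i and h are given in the text.
parent = {"i": "i", "c": "i", "h": "h", "f": "h", "g": "f"}


def makeset(v):
    """Create the singleton set {v} with v as its own representative."""
    parent[v] = v


def findset(v):
    """Follow parent pointers until the representative (the root) is reached."""
    while parent[v] != v:
        v = parent[v]
    return v


makeset("z")  # adds a single node z that is its own representative
print(findset("c"), findset("g"), findset("z"))
```

Two elements belong to the same set exactly when `findset` returns the same representative for both.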

To merge those two sets, we only have to set the parent of $i$ to $h$ (without loss of generality):

Merging two disjoint sets using $union (v, w)$ has a time complexity of $O (1)$, since we only need to update $parent[i] = h$, with $parent$ being an array.

Creating a set using $makeset (v)$ also has a time complexity of $O (1)$ .
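As a rough illustration of why both operations are constant-time, the sketch below implements `makeset` and `union` as single assignments. It assumes a dictionary `parent` instead of an array and assumes that `union` is only ever called with two representatives, as in the definition; the helper `findset` is the plain, unoptimised pointer walk and is included only to check the result.

```python
parent = {}


def makeset(v):
    parent[v] = v  # O(1): v becomes its own representative


def union(v, w):
    # Assumption: v and w are the representatives of two different sets.
    parent[v] = w  # O(1): hang v's whole set under w


def findset(v):
    # Unoptimised walk up to the representative, used here only for checking.
    while parent[v] != v:
        v = parent[v]
    return v


# Rebuild the example sets S_i = {i, c} and S_h = {h, f, g} by hand.
for node in "icfgh":
    makeset(node)
parent["c"] = "i"
parent["f"] = "h"
parent["g"] = "h"

union("i", "h")  # the single update parent[i] = h
print(parent["i"], findset("c"))
```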

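However, $findset(v)$ has to follow the chain of parent pointers from $v$ up to the representative, which can take $O (n)$ steps for a set of $n$ elements if the tree degenerates into a long chain. Perhaps counter-intuitively, its amortised time complexity can still be brought down to nearly $O (1)$: while iterating over the chain, we change the parent of each visited node to the representative of that disjoint set (path compression), and when merging we always attach the lower-depth subgraph to the greater-depth subgraph (union by rank). Thus, once every node of the merged set has been visited:

$parent[c] = parent[f] = parent[h] = parent[i] = parent[g] = h$

where each array access has a time complexity of $O (1)$. Therefore, the amortised time complexity of $findset (v)$ is $O (\alpha (n))$, where $\alpha$ is the inverse Ackermann function, which grows so slowly that it is effectively $O (1)$ in practice.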
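The two optimisations described above, re-pointing every visited node to the representative (often called path compression) and attaching the shallower tree under the deeper one (often called union by rank), might be combined as in the following sketch. The class name `UnionFind`, the `rank` dictionary and the choice to let `union` accept arbitrary elements rather than only representatives are assumptions made for illustration.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}  # upper bound on the depth of the tree rooted at a node

    def makeset(self, v):
        self.parent[v] = v
        self.rank[v] = 0

    def findset(self, v):
        # First pass: walk up the chain to locate the representative.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the chain directly at the root.
        while self.parent[v] != root:
            next_node = self.parent[v]
            self.parent[v] = root
            v = next_node
        return root

    def union(self, v, w):
        # Convenience assumption: look up the representatives first.
        rv, rw = self.findset(v), self.findset(w)
        if rv == rw:
            return rv  # already in the same set
        if self.rank[rv] < self.rank[rw]:
            rv, rw = rw, rv  # make rv the root of the deeper tree
        self.parent[rw] = rv  # attach the shallower tree under the deeper one
        if self.rank[rv] == self.rank[rw]:
            self.rank[rv] += 1
        return rv


uf = UnionFind()
for node in "icfgh":
    uf.makeset(node)
uf.union("i", "c")  # S_i = {i, c}
uf.union("h", "f")
uf.union("h", "g")  # S_h = {h, f, g}
uf.union("h", "i")  # merge both sets
print({v: uf.findset(v) for v in "cfhig"})
```

After `findset` has been called on every element, each entry of `uf.parent` points directly at the representative, so later lookups take a single step.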

Lukas' Notes
