Definition
Union-Find Data Structure
The union-find data structure is a data structure that stores a collection of disjoint sets () and provides the following three major functions:
- : Creates a new disjoint set where is called representative of .
- : Merges two disjoint sets, and , where and are the representatives of their respective sets.
- : Finds the set to which belongs.
Implementation
The main idea of the implementation is to represent the disjoint sets as strongly connected graph components:
The above shows two disjoint sets and . We notice that
Trivially, creating a new disjoint set using would result in a subgraph with a single node .
To union those two sets, we just have to change the parent of to (without loss of generality):
Merging two disjoint sets using have a time complexity of since we just need to update , with being an array.
Creating a set using also has a time complexity of .
However, contra-intuitively, has an amortised time complexity of since when iterating over the chain, we can just change the parent of each node in the disjoint set to the representative of that disjoint set, and always attach the lower-depth subgraph to the greater-depth subgraph. Thus:
where each array access has a time complexity of . Therefore, the amortised time complexity of is .