# This content is archived!

For the 2018-2019 school year, we have switched to using the WLMOJ judge for all MCPT related content. This is an archive of our old website and will not be updated.# Disjoint Set

This is an example of a disjoint set:`{1, 2, 4}, {5}, {7, 8}`

. There are 3 sets in total, wich each one containing various elements. Note that no element is present in more than one set, making the collection of sets ‘disjoint’ from one another.

A disjoint set supports the following operations:

Operation | Description |
---|---|

`find(e)` |
Determine which set the given element e is in |

`union(e1, e2)` |
Merge the sets containing the elements e1 and e2 into one set |

Disjoint sets can be used to solve various problems quickly, an example given below:

Given an undirected graph G, perform Q operations, either of Type A or B.

Type A: Create a bidirectional edge (U,V) connecting vertices U, V

Type B: Determine whether vertices U,V are connected to each other

# Explanation

To keep track of the various elements in each set, a disjoint set appoints an arbitrary element from each set to act as the ‘representative’ of the entire set. Each set can now be represented as a rooted tree, with the representative of the set acting as the root.

In the tree, each element is connected to another element, with the exception of the root, which is connected to itself. Only elements belonging to the same set will be connected to each other. To keep track of the various connections, a `parent`

array is used, where `parent[e]`

represents the parent element of e.

An example is shown below, representing the collection of sets `{1, 2, 4}, {5}, {7, 8}`

:

# Implementation

When initializing the disjoint set, it is assumed that all elements are initially in their own set. This can be implemented as follows:

Using this ‘rooted tree’ structure, it is simple to find which set a given element belongs to, i.e. the `find()`

operation. Using DFS, we can traverse up the tree of the given element until the root is found, and returned.

Note that the `find()`

operation modifies the elements to link directly to the root of the tree for faster access in the future. For example, if `find(4)`

is called on the above graph, the tree is modified as illustrated below:

Implementing the `union()`

operation is also relatively easy. To merge the sets containing `e1`

and `e2`

, the roots of the trees containing `e1`

and `e2`

must be found. If the roots are identical, `e1`

and `e2`

are already in the same set and no work needs to be done.

Otherwise, the root of the larger tree will become the root of the smaller tree. This is done to improve the performance of the `find()`

operation on elements of the smaller tree (as less `parent`

values will have to be updated to account for the new root of the tree).

But how do we know which tree has more elements? In order to always assign the smaller tree to the root of the larger tree, a second array, `rank`

, will be used, where `rank[e]`

represents the ‘depth’ of the tree rooted at `e`

(relative to the other trees).

If `rank[e1]`

> `rank[e2]`

, this means that `e1`

is part of a larger tree than `e2`

, and that `parent[find(e2)]`

should be assigned to `find(e1)`

.

The result of the operation `union(4, 7)`

on the original example is shown below:

The ideas discussed above are implemented below:

## Time Complexity

**Construction**: O(N), where N is the total number of elements.

**Find operation**: nearly O(1) amortized, explanation of this is beyond the scope of this lesson.

**Union operation**: nearly O(1) amortized, explanation of this is beyond the scope of this lesson.

## Space Complexity

O(N), where N is the total number of elements.

# Conclusion

The disjoint set is a powerful data structure which can perform find() and union() operations in near constant time. Using disjoint sets, the solution to the given problem can be implemented in approximately O(N+Q) time.