# Union-Find Algorithm for Disjoint Sets (with code)

## What is the Union-Find Algorithm?

The union-find algorithm is a powerful tool in computer science used to maintain a collection of disjoint sets. But what is a disjoint set? It is a set of subsets of a larger set where each element belongs to exactly one subset. In other words, the subsets are disjoint and their union forms the original set.

Let's take an example to understand this concept. Consider the set {1, 2, 3, 4, 5}. We can partition this set into two subsets: {1, 2, 3} and {4, 5}. These subsets are disjoint because they have no elements in common, and their union forms the original set.

The basic idea of the union-find algorithm is to partition a set into disjoint subsets, where each subset contains one or more elements. These subsets are usually represented as trees, where each node in a tree represents an element in the set, and the parent node represents the parent set.

The union-find algorithm supports two fundamental operations, namely union and find. The union operation combines two sets into a single set, while the find operation returns the representative element of a set.

The implementation of the union-find algorithm involves maintaining a set of parent pointers and a rank for each element in the set. The parent pointer of an element points to the parent node of its corresponding set, while the rank is a measure of the height of the tree rooted at that node.

### The union operation

The union operation involves finding the root nodes of the two sets to be combined and then attaching the smaller tree to the larger tree by setting the root node of the smaller tree as a child of the root node of the larger tree. This ensures that the rank of the resulting tree is not more than one greater than the rank of the larger tree.

### The find operation

The find operation involves recursively following the parent pointers from a given node until the root node is reached. This operation is usually implemented with path compression, where the parent pointers of all nodes along the path to the root node are updated to point directly to the root node. This technique ensures that subsequent find operations are faster as the depth of the tree is reduced.

### Union-Find Algorithm Example

Let's walk through an example of the Union-Find algorithm.

Suppose we have a set of 7 elements: {0, 1, 2, 3, 4, 5, 6}. Initially, each element belongs to its own set. We can represent each set as a tree, with each node representing an element in the set and the root node representing the set itself. Initially, each node is its own root.

Now, let's say we want to merge the sets containing elements 1 and 2. We find the roots of their sets, which are 1 and 2 respectively. We then make the root of the smaller set (1) a child of the root of the larger set (2), by setting the parent of node 1 to 2.

Now, the set containing elements {1, 2} has been merged into the set containing element 2.

Next, let's say we want to merge the sets containing elements 2 and 4. We find the roots of their sets, which are 2 and 4 respectively. Since the roots are different, we can make the root of the smaller set (2) a child of the root of the larger set (4), by setting the parent of node 2 to 4.

Now, the set containing elements {1, 2, 4} has been merged into the set containing element 4.

Finally, let's say we want to check whether elements 0 and 5 belong to the same set. We find the roots of their sets, which are 0 and 5 respectively. Since the roots are different, we can conclude that elements 0 and 5 do not belong to the same set.

This is a simple example of how the Union-Find algorithm can be used to merge query sets efficiently. The algorithm has a time complexity of O(alpha(N)), where N is the number of elements in the set and alpha(N) is the inverse Ackermann function, which grows very slowly and can be considered constant for most practical purposes.

### C++ Implementation of Union-Find Algorithm

Below is the code in C++ to implement it:

```#include<iostream>
#include<vector

using namespace std;

class UnionFind {
private:
vector<int> parent, rank;
public:
UnionFind(int n) {
parent.resize(n);
rank.resize(n, 0);
for (int i = 0; i < n; i++) {
parent[i] = i;
}
}

int find(int x) {
if (parent[x] == x) {
return x;
}
return parent[x] = find(parent[x]); // path compression
}

void merge(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
};

int main() {
UnionFind uf(5);
uf.merge(0, 1);
uf.merge(1, 2);
uf.merge(3, 4);
cout << uf.find(2) << endl; // output: 0
cout << uf.find(3) << endl; // output: 3
uf.merge(2, 3);
cout << uf.find(4) << endl; // output: 0
return 0;
}
```

Output:

```0
3
0```

In this implementation, we define a `UnionFind` class with private member variables `parent` and `rank`. The constructor initializes the parent array with each element pointing to itself, and the rank array with all zeros.

The `find` function follows the parent pointers until it reaches the root node, and implements path compression by setting the parent of each node along the path to the root.

The `merge` function finds the root nodes of the two sets to be merged, and attaches the smaller tree to the larger tree to ensure that the resulting tree has a balanced height.

In the main function, we create a new instance of the `UnionFind` class with 5 elements and perform some merges and finds to demonstrate the functionality of the algorithm.

### Time & Space Complexity

The time complexity of a sequence of m find and unite operations is O(m * log(n)) with path compression and weighted quick-union. This is because the time complexity of the find operation is O(log(n)) and unite operation is O(log(n)).

The space complexity of the Union-Find data structure is O(n), where n is the number of elements. This is because we need to store an array of size n for the parent array and another array of size n for the rank array.

The biggest advantage of the union-find algorithm is its scalability. The Union-Find algorithm can be easily parallelized, making it an excellent choice for distributed systems. Also, it is relatively simple to understand and implement.

But on the other side, space complexity is too high. The Union-Find algorithm is designed only to solve the dynamic connectivity problem. It cannot be used for more general graph problems, such as finding shortest paths or minimum spanning trees.

## Conclusion

The Union-Find algorithm has a wide range of applications, primarily in problems related to image processing graph theory, social networks, and network connectivity. You can identify and segment objects in an image with it or identify clusters of individuals who are connected to each other.

### FavTutor - 24x7 Live Coding Help from Expert Tutors! 