Union-Find, also known as Disjoint Set Union (DSU) is a powerful data structure for solving graph connectivity problems like checking whether two nodes are connected, detecting cycles, or grouping elements efficiently.
Using union find, we can easily tell whether two elements belong to the same group and if not, it lets us merge groups together.
In this chapter, I’ll cover:
Union-Find is a data structure that helps us manage a collection of elements divided into disjoint sets meaning no two sets overlap.
Every element belongs to exactly one set, and sets can be merged when a connection is discovered.
Think of it like tracking friend groups:
Union-Find supports two fundamental operations:
find(Alice) and find(Bob) will return the same root.These two operations make Union-Find extremely powerful for solving graph connectivity problems, such as:
Now that we understand what Union-Find does, let’s see how to implement it.
The simplest way to represent Union-Find is with an array called parent.
parent array. Initially, every node is its own parent, meaning each element is its own setStep 2: Find Operation
Step 3: Union Operation
Let’s say we start with 5 elements: [0, 1, 2, 3, 4].
union(0, 1), elements 0 and 1 share the same root.union(2, 3), elements 2 and 3 share the same root.find(1) and find(0), they’ll both return 0, meaning they’re in the same set.In this basic version:
The basic implementation of Union-Find works — but it can get slow if the parent pointers form tall chains. In the worst case, operations can take O(n) time.
To fix this, we use two powerful optimizations: Path Compression and Union by Rank (or Size).
With both applied, all operations run in almost constant time — O(α(n)), where α is the inverse Ackermann function (effectively constant for all practical input sizes).
Lets start with path compression.
Path compression flattens the structure of the tree whenever we call find.
find(x) climbs one parent at a time until it reaches the root.Imagine a long chain of 10 nodes. Without path compression, you’d walk through all 10 nodes every time. With path compression, after just one find, all 10 point directly to the root — making future finds O(1).
Next optimization is Union by rank.
This optimization helps keep the tree shallow when merging sets.
When path compression and union by rank are used together, Union-Find operations (find and union) run in near O(1) time.