Disjoint-set data structure: Difference between revisions

Content deleted Content added
External links: Java implementation
Line 125:
 
In an efficient implementation, tree height is controlled using '''union by size''' or '''union by rank'''. Both of these require a node to store information besides just its parent pointer. This information is used to decide which root becomes the new parent. Both strategies ensure that trees do not become too deep.
 
====Union by size====
 
In the case of union by size, a node stores its size, which is simply its number of descendants (including the node itself). When the trees with roots {{mvar|x}} and {{mvar|y}} are merged, the node with more descendants becomes the parent. If the two nodes have the same number of descendants, then either one can become the parent. In both cases, the size of the new parent node is set to its new total number of descendants.
Line 150 ⟶ 152:
 
The number of bits necessary to store the size is clearly the number of bits necessary to store {{mvar|n}}. This adds a constant factor to the forest's required storage.
 
====Union by rank====
 
For union by rank, a node stores its {{em|rank}}, which is an upper bound for its height. When a node is initialized, its rank is set to zero. To merge trees with roots {{mvar|x}} and {{mvar|y}}, first compare their ranks. If the ranks are different, then the larger rank tree becomes the parent, and the ranks of {{mvar|x}} and {{mvar|y}} do not change. If the ranks are the same, then either one can become the parent, but the new parent's rank is incremented by one. While the rank of a node is clearly related to its height, storing ranks is more efficient than storing heights. The height of a node can change during a <code>Find</code> operation, so storing ranks avoids the extra effort of keeping the height correct. In pseudocode, union by rank is: