Expected linear time MST algorithm: Difference between revisions

Content deleted Content added
FrescoBot (talk | contribs)
m Bot: fixing section wikilinks and minor changes
Yobot (talk | contribs)
m WP:CHECKWIKI error fixes using AWB (9075)
Line 29:
| booktitle = Proceedings of the 4th International Workshop on Algorithms and Data Structures
| year = 1995
| pages = 440-448440–448
| url = http://dl.acm.org/citation.cfm?id=645930.672859
| publisher = Springer-Verlag
Line 50:
Output: The edges selected in step 1 and the contracted graph ''G'''
A Borůvka Step is equivalent to the inner loop of [[Borůvka's algorithm]] which runs in ''O''(''m'') time where ''m'' is the number of edges in ''G''. Furthermore, since each edge can be selected at most twice (once by each incident vertex) the maximum number of disconnected components after step 1 is equal to half the number of vertices. Thus, a Borůvka step reduces the number of vertices in the graph by at least a factor of two and deletes at least ''n''/2 edges where ''n'' is the number of vertices in ''G''.
 
<br />
<br />
Example Execution of a Borůvka Step
{| border=1 cellspacing=2 cellpadding=5 class="wikitable"
Line 85 ⟶ 84:
 
==Correctness==
Correctness is proved by induction on the number of vertices in the graph. The base case is trivially true. Let ''T*'' be the minimum spanning tree of ''G''. Every edge selected in a Borůvka step is in ''T*'' by the [[Minimum_spanning_tree#Cut_property|cut property]] and none of the edges removed to form the contracted graph are in ''T*'' by the [[Minimum_spanning_tree#Cut_property|cut property]] (for redundant edges) and the [[Minimum_spanning_tree#Cycle_property|cycle property]] (for self loops). The remaining edges of ''T*'' not selected in step 2 form the [[minimum spanning tree]] of the contracted graph by the [[Minimum_spanning_tree#Cut_property|cut property]] (let each cut be a supernode). Every '''[[#F-heavy and F-light Edges|'''F-heavy edge]]''']] deleted is not in the minimum spanning tree by the [[Minimum_spanning_tree#Cycle_property|cycle property]]. Finally ''F<nowiki>'</nowiki>'' is the minimum spanning tree of the contracted graph by the inductive hypothesis. Thus ''F<nowiki>'</nowiki>'' and the edges contracted edges from the Borůvka steps form the minimum spanning tree.
 
==Performance==
The expected performance is a result of the random sampling step. The effectiveness of the random sampling step is described by the following lemma which places a bound on the number of '''[[#F-heavy and F-light Edges|'''F-light]]''']] edges in ''G'' thereby restricting the size of the second subproblem.
 
===Random Sampling Lemma===
'''Lemma'''- Let ''H'' be a subgraph of ''G'' formed by including each edge of ''G'' independently with probability ''p'' and let ''F'' be the minimum spanning forest of ''H''. The [[Expected value|expected number]] of '''[[#F-heavy and F-light Edges|'''F-light]]''']] edges in ''G'' is at most ''n/p'' where ''n'' is the number of vertices in ''G''
 
To prove the lemma examine the edges of ''G'' as they are being added to ''H''. The number of [[#F-heavy and F-light Edges|F-light]] edges in ''G'' is independent of the order in which the edges of ''H'' are selected since the minimum spanning forest of ''H'' is the same for all selection orders. For the sake of the proof consider selecting edges for ''H'' by taking the edges of ''G'' one at a time in order of edge weight from lightest to heaviest. Let ''e'' be the current edge being considered. If the endpoints of ''e'' are in two disconnected components of ''H'' then ''e'' is the lightest edge connecting those components and if it is added to ''H'' it will be in ''F'' by the [[Minimum_spanning_tree#Cut_property|cut property]]. This also means ''e'' is [[#F-heavy and F-light Edges|F-light]] regardless of whether or not it is added to ''H'' since only heavier edges are subsequently considered. If both endpoints of ''e'' are in the same component of ''H'' then it is (and always will be) F-heavy by the [[Minimum_spanning_tree#Cycle_property|cycle property]]. Edge ''e'' is then added to ''H'' with probability ''p''.