Content deleted Content added
→top: fine, put optimality of 2d in lead |
→Dynamic and kinetic: wlnk |
||
(31 intermediate revisions by 11 users not shown) | |||
Line 1:
{{Short description|Shortest network connecting
{{good article}}
[[Image:Euclidean minimum spanning tree.svg|thumb|300px|right|Euclidean minimum spanning tree of 25 random points in the plane]]
Line 7 ⟶ 8:
==Definition and related problems==
A Euclidean minimum spanning tree, for a set of <math>n</math> points in the [[Euclidean plane]] or [[Euclidean space]], is a system of [[line segment]]s, having only the given points as their endpoints, whose union includes all of the points in a [[connected set]], and which has the minimum possible total length of any such system. Such a network cannot contain a [[polygon
Publications on the Euclidean minimum spanning tree commonly abbreviate it as "EMST".{{r|aesw|mrg}} They may also be called "geometric minimum spanning trees",{{r|clarkson|monsur}} but that term may be used more generally for geometric spaces with non-Euclidean distances, such as [[Lp space|{{math|''L''<sup>''p''</sup>}} spaces]].{{r|nzz}} When the context of Euclidean point sets is clear, they may be called simply "minimum spanning trees".{{r|frakau|king|kln}}
Line 14 ⟶ 15:
*The [[Steiner tree problem]] again seeks a system of line segments connecting all given points, but without requiring the segments to start and end only at given points. In this problem, additional points may be added as segment endpoints, allowing the Steiner tree to be shorter than the minimum spanning tree.{{r|gilpol}}
*In the Euclidean [[Travelling salesman problem|traveling salesperson path problem]], the connecting line segments must start and end at the given points, like the spanning tree and unlike the Steiner tree; additionally, each point can touch at most two line segments, so the result forms a [[polygonal chain]]. Because of this restriction, the optimal path may be longer than the Euclidean minimum spanning tree, but is at most twice as long.{{r|shahoe}}
*[[Geometric spanner]]s are low-weight networks that, like the minimum spanning tree, connect all of the points. Unlike the minimum spanning tree, all of these connecting paths are required to be short, having length proportional to the distance between the points they connect. To achieve this property, these networks generally have cycles
==Properties==
Line 21 ⟶ 22:
Whenever two edges of a Euclidean minimum spanning tree meet at a vertex, they must form an angle of 60° or more, with equality only when they form two sides of an [[equilateral triangle]]. This is because, for two edges forming any sharper angle, one of the two edges could be replaced by the third, shorter edge of the triangle they form, forming a tree with smaller total length.{{r|1steiner}} In comparison, the Steiner tree problem has a stronger angle bound: an optimal Steiner tree has all angles at least 120°.{{r|gilpol}}
The same 60° angle bound also occurs in the [[kissing number]] problem, of finding the maximum number of [[unit sphere]]s in Euclidean space that can be tangent to a central unit sphere without any two spheres intersecting (beyond a point of tangency). The center points of these spheres have a minimum spanning tree in the form of a [[star (graph theory)|star]], with the central point adjacent to all other points. Conversely, for any vertex <math>v</math> of any minimum spanning tree, one can construct non-overlapping unit spheres centered at <math>v</math> and at points two units along each of its edges, with a tangency for each neighbor of <math>v</math>. Therefore, in <math>n</math>-dimensional space the maximum possible [[degree (graph theory)|degree]] of a vertex (the number of spanning tree edges
For points generated at random from a given continuous distribution, the minimum spanning tree is [[almost surely]] unique. The numbers of vertices of any given degree
===Empty regions===
Line 29 ⟶ 30:
For any edge <math>uv</math> of any Euclidean minimum spanning tree, the [[Lens (geometry)|lens]] (or [[vesica piscis]]) formed by intersecting the two circles with <math>uv</math> as their radii cannot have any other given vertex <math>w</math> in its interior. Put another way, if any tree has an edge <math>uv</math> whose lens contains a third point <math>w</math>, then it is not of minimum length. For, by the geometry of the two circles, <math>w</math> would be closer to both <math>u</math> and <math>v</math> than they are to each other. If edge <math>uv</math> were removed from the tree, <math>w</math> would remain connected to one of <math>u</math> and <math>v</math>, but not the other. Replacing the removed edge <math>uv</math> by <math>uw</math> or <math>vw</math> (whichever of these two edges reconnects <math>w</math> to the vertex from which it was disconnected) would produce a shorter tree.{{r|gilpol}}
For any edge <math>uv</math> of any Euclidean minimum spanning tree, the [[rhombus]] with angles of 60° and 120°, having <math>uv</math> as its long diagonal, is disjoint from the rhombi formed analogously by all other edges. Two edges sharing an endpoint cannot have overlapping rhombi, because that would imply
===Supergraphs===
Line 42 ⟶ 43:
===Total length===
For <math>n</math> points in the [[unit square]] (or any other fixed shape), the total length of the minimum spanning tree edges is <math>O(\sqrt n)</math>. Some sets of points, such as points evenly spaced in a <math>\sqrt n \times \sqrt n</math> grid, attain this bound.{{r|gilpol}} For points in a unit hypercube in <math>d</math>-dimensional space, the corresponding bound is <math>O(n^{(d-1)/d})</math>.{{r|ss89}} The same bound applies to the expected total length of the minimum spanning tree for <math>n</math> points chosen uniformly and independently from a unit square or unit hypercube.{{r|steele}} Returning to the unit square, the sum of squared edge lengths of the minimum spanning tree is <math>O(1)</math>. This bound follows from the observation that the edges have disjoint rhombi, with area proportional to the edge lengths squared. The <math>O(\sqrt n)</math> bound on total length
Another interpretation of these results is that the average edge length for any set of points in a unit square is <math>O(1/\sqrt n)</math>, at most proportional to the spacing of points in a regular grid; and that for ''random'' points in a unit square the average length is proportional to <math>1/\sqrt n</math>. However, in the random case, with high probability the longest edge has length approximately <math display=block>\sqrt{\frac{\log n}{\pi
Any [[geometric spanner]], a subgraph of a complete geometric graph whose [[Shortest path problem|shortest paths]] approximate the Euclidean distance, must have total edge length at least as large as the minimum spanning tree, and one of the standard quality measures for a geometric spanner is the ratio between its total length and of the minimum spanning tree for the same points. Several methods for constructing spanners, such as the [[greedy geometric spanner]], achieve a constant bound for this ratio.{{r|eppstein}} It has been conjectured that the [[Steiner ratio]], the largest possible ratio between the total length of a minimum spanning tree and Steiner tree for the same set of points in the plane, is <math>2/\sqrt{3}\approx 1.1547</math>, the ratio for three points in an [[equilateral triangle]].{{r|gilpol}}
Line 54 ⟶ 55:
For points in any dimension, the minimum spanning tree can be constructed in time <math>O(n^2)</math> by constructing a [[complete graph]] with an edge between every pair of points, weighted by [[Euclidean distance]], and then applying a graph minimum spanning tree algorithm such as the [[Prim's algorithm|Prim–Dijkstra–Jarník algorithm]] or [[Borůvka's algorithm]] on it. These algorithms can be made to take time <math>O(n^2)</math> on complete graphs, unlike another common choice, [[Kruskal's algorithm]], which is slower because it involves sorting all distances.{{r|eppstein}} For points in low-dimensional spaces, the problem may be solved more quickly, as detailed below.
Computing Euclidean distances involves a [[square root]] calculation. In any comparison of edge weights,
===Two dimensions===
Line 61 ⟶ 62:
# Label each edge with its (squared) length.
# Run a graph minimum spanning tree algorithm. Since there are <math>O(n)</math> edges, this requires <math>O(n\log n)</math> time using any of the standard minimum spanning tree algorithms.
The
If the input coordinates are integers and can be used as [[array index|array indices]], faster algorithms are possible: the Delaunay triangulation can be constructed by a [[randomized algorithm]] in <math>O(n\log\log n)</math> expected time.{{r|bm09}} Additionally, since the Delaunay triangulation is a [[planar graph]], its minimum spanning tree can be found in [[linear time]] by a variant of Borůvka's algorithm that removes all but the cheapest edge between each pair of components after each stage of the algorithm.{{r|eppstein|mares}} Therefore, the total expected time for this algorithm is <math>O(n\log\log n)</math>.{{r|bm09}} In the other direction, the Delaunay triangulation can be constructed from the minimum spanning tree in the near-linear time bound <math>O(n\log^* n)</math>, where <math>\log^*</math> denotes the [[iterated logarithm]].{{r|devillers}}
Line 70 ⟶ 71:
for any <math>\varepsilon>0</math>—faster than the quadratic time bound for the complete graph and Delaunay triangulation algorithms.{{r|aesw}}
The optimal time complexity for higher-dimensional minimum spanning trees remains unknown,{{r|topp}} but
For uniformly random point sets in any bounded dimension, the [[Yao graph]]{{r|yao82}} or Delaunay triangulation have linear expected numbers of edges, are guaranteed to contain the minimum spanning tree, and can be constructed in linear expected time.{{r|bwy|clarkson|dwyer}}
A [[well-separated pair decomposition]] is a family of pairs of subsets of the given points, so that every pair of points belong to one of these pairs of subsets, and so that all pairs of points coming from the same pair of subsets have approximately the same length. It is possible to find a well-separated pair decomposition with a linear number of subsets, and a representative pair of points for each subset, in time <math>O(n\log n)</math>. The minimum spanning tree of the graph formed by these representative pairs is then an approximation to the minimum spanning tree. Using these ideas,
===Dynamic and kinetic===
The Euclidean minimum spanning tree has been generalized in many different ways to systems of moving or changing points:
*If a set of points undergoes a sequence of dynamic insertions or deletions of points, each of these updates
*For <math>n</math> points moving linearly with constant speed, or with more general algebraic motions, the minimum spanning tree will change by a sequence of swaps, in which one edge is removed and another replaces it at a point in time where both have equal length.{{r|kti}} For linear motions, the number of changes is at most slightly larger than <math>n^{25/9}</math>.{{r|chalev}} For more general algebraic motions, there is a near-cubic upper bound on the number of swaps
*The ''minimum moving spanning tree problem'' again concerns points moving linearly with constant speed, over an interval of time, and seeks a single tree that minimizes the maximum sum of weights occurring at any instant during this interval. It is [[NP-hard]] to compute exactly, but can be approximated to within a factor of two in polynomial time.{{r|abb}}
*The [[kinetic Euclidean minimum spanning tree]] problem asks for a [[kinetic data structure]] that can maintain the minimum spanning tree as its points undergo both continuous motions and insertions and deletions. Several papers have studied such structures,{{r|bgz|aegh|rzkin|rakwz|msvw}} and a kinetic structure for algebraically moving points with near-cubic total time, nearly matching the bound on the number of swaps, is known.{{r|rakwz}}
===Lower bound===
An asymptotic lower bound of <math>\Omega(n\log n)</math> of the Euclidean minimum spanning tree problem can be established in restricted models of computation. These include the [[algebraic decision tree]] and [[algebraic computation tree]] models, in which the algorithm has access to the input points only through certain restricted primitives that perform simple algebraic computations on their coordinates. In these models, the [[closest pair of points problem]] requires <math>\Omega(n\log n)</math> time, but the closest pair is necessarily an edge of the minimum spanning tree, so the minimum spanning tree also requires this much time. Therefore, algorithms for constructing the planar minimum
== Applications ==
An obvious application of Euclidean minimum spanning trees is to find the cheapest network of wires or pipes to connect a set of places, assuming the links cost a fixed amount per unit length. The first publications on minimum spanning trees more generally concerned a geographic version of the problem, involving the design of an [[electrical grid]] for [[South Moravian Region|southern Moravia]],{{r|history}} and an application to minimizing wire lengths in circuits was described in 1957 by Loberman and Weinberger.{{r|lobwei}}
Minimum spanning trees are closely related to [[single-linkage clustering]], one of several methods for [[hierarchical clustering]]. The edges of a minimum spanning tree, sorted by their length, give the order in which to merge clusters into larger clusters in this clustering method. Once these edges have been found, by any algorithm, they may be used to construct the single-linkage clustering in time <math>O(n\log n)</math>.{{r|gowros}} Although the long thin cluster shapes produced by single-linkage clustering can be a bad fit for certain types of data, such as [[Mixture model|mixtures of Gaussian distributions]], it can be a good choice in applications where the clusters themselves are expected to have long thin shapes, such as in modeling the [[dark matter halo]]s of [[galaxy|galaxies]].{{r|mrg}} In [[geographic information science]], several researcher groups
Minimum spanning trees have also been used to infer the shape of curves in the plane, given points sampled along the curve. For a smooth curve, sampled more finely than its [[local feature size]], the minimum spanning tree will form a path connecting consecutive points along the curve.
Another application of minimum spanning trees is a [[constant-factor approximation algorithm]] for the [[Euclidean traveling salesman problem]], the problem of finding the shortest [[polygonalization]] of a point set. Walking around the boundary of the minimum spanning tree can approximate the optimal traveling salesman tour within a factor of two of the optimal length.{{r|shahoe}} However, more accurate [[polynomial time approximation scheme|polynomial-time approximation scheme]]s are known for this problem.{{r|bargot}} In [[wireless ad hoc network]]s, [[Broadcasting (networking)|broadcasting]] messages along paths in a minimum spanning tree can be an accurate approximation to the minimum-energy broadcast routing, which is, again, hard to compute exactly.{{r|wclf|chrvp|fknp|ambuhl}}
==Realization==
The ''realization problem'' for Euclidean minimum spanning trees takes an abstract [[tree (graph theory)|tree]] as input
==See also==
Line 133 ⟶ 134:
| title = Algorithms and Data Structures: 17th International Symposium, WADS 2021, Virtual Event, August 9–11, 2021, Proceedings
| volume = 12808
|
| year = 2021| s2cid = 234599877
<ref name=abcfks>{{citation
Line 149 ⟶ 152:
| title = On the area requirements of Euclidean minimum spanning trees
| volume = 47
| year = 2014
<ref name=aegh>{{citation
Line 161 ⟶ 165:
| publisher = IEEE Computer Society
| title = 39th Annual Symposium on Foundations of Computer Science, FOCS '98, November 8–11, 1998, Palo Alto, California, USA
| year = 1998| isbn = 0-8186-9172-7 | s2cid = 2559456 | url = https://infoscience.epfl.ch/record/99344/files/AgarwalEGH98.pdf }}</ref>
▲ | year = 1998}}</ref>
<ref name=aesw>{{citation
Line 181 ⟶ 185:
| last = Ambühl | first = Christoph
| editor1-last = Caires | editor1-first = Luís
| editor2-last = Italiano | editor2-first = Giuseppe F. | editor2-link = Giuseppe F. Italiano
| editor3-last = Monteiro | editor3-first = Luís
| editor4-last = Palamidessi | editor4-first = Catuscia | editor4-link = Catuscia Palamidessi
| editor5-last = Yung | editor5-first = Moti
| contribution = An optimal bound for the MST algorithm to compute energy efficient broadcast trees in wireless networks
Line 192 ⟶ 196:
| title = Automata, Languages and Programming, 32nd International Colloquium, ICALP 2005, Lisbon, Portugal, July 11-15, 2005, Proceedings
| volume = 3580
| isbn = 978-3-540-27580-0
| year = 2005}}</ref>
Line 203 ⟶ 208:
| pages = 1220–1233
| title = Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016
| year = 2016
| isbn = 978-1-61197-433-1
<ref name=bargot>{{citation
Line 213 ⟶ 220:
| pages = 698–706
| title = 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA
| year = 2013
| s2cid = 17514182
| citeseerx = 10.1.1.409.1291
}}</ref>
<ref name=bdek>{{citation
Line 252 ⟶ 262:
| publisher = Association for Computing Machinery
| title = Proceedings of the Thirteenth Annual Symposium on Computational Geometry, Nice, France, June 4–6, 1997
| year = 1997
| doi-access = free
| isbn = 0-89791-878-9
}}</ref>
<ref name=bm09>{{citation
Line 264 ⟶ 277:
| title = Delaunay triangulations in {{mvar|O}}(sort({{mvar|n}})) time and more
| volume = 58
| year = 2011
}}</ref>
<ref name=bwy>{{citation
Line 277 ⟶ 291:
| title = Optimal expected-time algorithms for closest point problems
| volume = 6
| year = 1980| s2cid = 17238717 | url = https://figshare.com/articles/journal_contribution/6608108 | doi-access = free
▲ | year = 1980}}</ref>
}}</ref>
<ref name=cck>{{citation
Line 291 ⟶ 306:
| title = Experimental Algorithms: 9th International Symposium, SEA 2010, Ischia Island, Naples, Italy, May 20-22, 2010, Proceedings
| volume = 6049
| isbn = 978-3-642-13192-9
| year = 2010}}</ref>
Line 302 ⟶ 318:
| title = A dynamic data structure for 3-D convex hulls and 2-D nearest neighbor queries
| volume = 57
| year = 2010| s2cid = 47454142 }}</ref>
<ref name=chalev>{{citation
Line 313 ⟶ 329:
| title = On levels in arrangements of curves
| volume = 29
| year = 2003
}}</ref>
<ref name=chrvp>{{citation
Line 326 ⟶ 343:
| publisher = IEEE Computer Society
| title = 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 22-26 April 2003, Nice, France, Proceedings
| year = 2003
| s2cid = 17863487
}}</ref>
<ref name=clarkson>{{citation
| last = Clarkson | first = Kenneth L. | author-link = Kenneth L. Clarkson
| doi = 10.1007/BF01553902
| issue =
| journal = [[Algorithmica]]
| mr = 1019387
Line 337 ⟶ 356:
| title = An algorithm for geometric minimum spanning trees requiring nearly linear expected time
| volume = 4
| year = 1989| s2cid = 22176641 }}</ref>
<ref name=comment>{{citation
Line 347 ⟶ 366:
| title = Comment: Algorithms for computing relative neighborhood graph
| volume = 16
| year = 1980| bibcode = 1980ElL....16..860T }}; reply by Urquhart, [[doi:10.1049/el:19800612|pp. 860–861]]</ref>
<ref name=devillers>{{citation
Line 359 ⟶ 378:
| url = https://hal.inria.fr/inria-00167206/file/hal.pdf
| volume = 2
| year = 1992
}}</ref>
<ref name=dwyer>{{citation
Line 370 ⟶ 390:
| title = Higher-dimensional Voronoi diagrams in linear expected time
| volume = 6
| year = 1991
}}</ref>
<ref name=eppstein>{{citation
Line 405 ⟶ 426:
| title = Dynamic Euclidean minimum spanning trees and extrema of binary functions
| volume = 13
| year = 1995
}}</ref>
<ref name=fknp>{{citation
Line 419 ⟶ 441:
| title = Improved approximation results for the minimum energy broadcasting problem
| volume = 49
| year = 2007
}}</ref>
<ref name=frakau>{{citation
Line 431 ⟶ 454:
| title = Polynomial area bounds for MST embeddings of trees
| volume = 44
| year = 2011
| doi-access = free
}}</ref>
<ref name=gilpol>{{citation
Line 443 ⟶ 468:
| title = Steiner minimal trees
| volume = 16
| year = 1968| issue = 1 }}</ref>
<ref name=gowros>{{citation
Line 455 ⟶ 480:
| title = Minimum spanning trees and single linkage cluster analysis
| volume = 18
| year = 1969| issue = 1
}}</ref> <ref name=history>{{citation
Line 467 ⟶ 493:
| title = On the history of the minimum spanning tree problem
| volume = 7
| year = 1985| s2cid = 10555375 }}</ref>
<ref name=king>{{citation
Line 490 ⟶ 516:
| title = A randomized linear-time algorithm to find minimum spanning trees
| volume = 42
| year = 1995
}}</ref>
<ref name=kln>{{citation
Line 515 ⟶ 542:
| title = On minimum and maximum spanning trees of linearly moving points
| volume = 13
| year = 1995
}}</ref>
<ref name=lee>{{citation
Line 526 ⟶ 554:
| title = Curve reconstruction from unorganized points
| volume = 17
| year = 2000
}}</ref>
<ref name=lobwei>{{citation
Line 537 ⟶ 566:
| pages = 428–437
| title = Formal procedures for connecting terminals with a minimum total wire length
| volume = 4
| doi-access = free
}}</ref>
<ref name=mares>{{citation
Line 560 ⟶ 591:
| title = Transitions in geometric minimum spanning trees
| volume = 8
| year = 1992
| doi-access = free
}}</ref>
<ref name=mrg>{{citation
Line 574 ⟶ 607:
| pages = 603–612
| title = Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 25-28, 2010
| year = 2010
}}</ref>
<ref name=msvw>{{citation
Line 591 ⟶ 625:
| title = LATIN 2018: Theoretical Informatics – 13th Latin American Symposium, Buenos Aires, Argentina, April 16–19, 2018, Proceedings
| volume = 10807
|
| year = 2018| s2cid = 4709616
| url = https://research.tue.nl/nl/publications/bde45eae-314a-42b4-8dca-9a0a420f38e1
}}</ref>
<ref name=nzz>{{citation
Line 611 ⟶ 648:
| title = Offline algorithms for dynamic minimum spanning tree problems
| volume = 17
| year = 1994| url = https://escholarship.org/uc/item/6j46j8nc }}</ref>
<ref name=penrose>{{citation
Line 622 ⟶ 659:
| title = The longest edge of the random minimal spanning tree
| volume = 7
| year = 1997
}}</ref>
<ref name=presha>{{citation
Line 634 ⟶ 672:
| series = Texts and Monographs in Computer Science
| title = Computational Geometry: An Introduction
| year = 1985| s2cid = 206656565 }}</ref>
<ref name=rakwz>{{citation
Line 649 ⟶ 687:
| title = A simple, faster method for kinetic proximity problems
| volume = 48
| year = 2015
| doi-access = free
| arxiv = 1311.2032
}}</ref>
<ref name=robsal>{{citation
Line 661 ⟶ 702:
| title = Low-degree minimum spanning trees
| volume = 14
| year = 1995
| doi-access = free
}}</ref>
<ref name=rzcomb>{{citation
Line 681 ⟶ 724:
| title = Kinetic Euclidean minimum spanning tree in the plane
| volume = 16
| year = 2012
}}</ref>
<ref name=shahoe>{{citation
Line 693 ⟶ 737:
| title = 16th Annual Symposium on Foundations of Computer Science, Berkeley, California, USA, October 13-15, 1975
| title-link = Symposium on Foundations of Computer Science
| year = 1975| s2cid = 40615455 }}</ref>
<ref name=ss89>{{citation
Line 705 ⟶ 749:
| title = Worst-case growth rates of some classical problems of combinatorial optimization
| volume = 18
| year = 1989| url = https://repository.upenn.edu/cgi/viewcontent.cgi?article=1048&context=oid_papers }}</ref>
▲ | year = 1989}}</ref>
<ref name=sse>{{citation
Line 718 ⟶ 762:
| title = On the number of leaves of a Euclidean minimal spanning tree
| volume = 24
| year = 1987| jstor = 3214207 | s2cid = 29026025 }}</ref>
<ref name=steele>{{citation
Line 748 ⟶ 792:
| pages = 450–475
| title = An extended minimum spanning tree method for characterizing local urban patterns
| volume = 32
}}</ref>
<ref name=wclf>{{citation
Line 761 ⟶ 806:
| title = Minimum-energy broadcasting in static ad hoc wireless networks
| volume = 8
| year = 2002
}}</ref>
<ref name=yao82>{{citation
|