Library sort

|data=[[Array data structure|Array]]
|time=<math>O(n^2)</math>
|best-time=<math>O(n\log n)</math>
|average-time=<math>O(n\log n)</math>
|space=<math>O(n)</math>
The algorithm was proposed by [[Michael A. Bender]], [[Martín Farach-Colton]], and [[Miguel Mosteiro]] in 2004<ref>{{cite arxiv |arxiv=cs/0407003 |title=Insertion Sort is O(n log n) |date=1 July 2004 |last1=Bender |first1=Michael A. |last2=Farach-Colton |first2=Martín |authorlink2=Martin Farach-Colton |last3=Mosteiro |first3=Miguel A.}}</ref> and was published in 2006.<ref name="definition">{{cite journal | journal=Theory of Computing Systems | volume=39 | issue=3 | pages=391–397 | date=June 2006 | last1=Bender | first1=Michael A. | last2=Farach-Colton | first2=Martín | authorlink2=Martin Farach-Colton | last3=Mosteiro | first3=Miguel A. | title=Insertion Sort is O(n log n) | doi=10.1007/s00224-005-1237-z | url=http://csis.pace.edu/~mmosteiro/pub/paperToCS06.pdf | arxiv=cs/0407003 | access-date=2017-09-07 | archive-url=https://web.archive.org/web/20170908070035/http://csis.pace.edu/~mmosteiro/pub/paperToCS06.pdf | archive-date=2017-09-08 | url-status=dead }}</ref>
 
Like the insertion sort it is based on, library sort is a [[comparison sort]] and can be run as an [[online algorithm]]; however, it was shown to have a high probability of running in O(n log n) time (comparable to [[quicksort]]), rather than an insertion sort's O(n<sup>2</sup>). The paper gives no full implementation, nor the exact algorithms of important parts such as insertion and rebalancing, so further information would be needed to judge how the efficiency of library sort compares with that of other sorting methods in practice.
 
Compared to basic insertion sort, the drawback of library sort is that it requires extra space for the gaps. The amount and distribution of that space are implementation dependent. In the paper the size of the needed array is ''(1 + ε)n'',<ref name="definition" /> but with no further recommendation on how to choose ε. Moreover, the algorithm is neither adaptive nor stable: to guarantee the with-high-probability time bounds, it requires the input to be randomly permuted, which changes the relative order of equal elements and shuffles any presorted input. The algorithm also uses binary search to find the insertion point for each element, which does not benefit from presorted input.
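
For illustration only, the following minimal sketch (not from the paper) sets up the gapped array described above, assuming that gaps are represented by <code>None</code>, that ε is a tuning parameter left to the implementer, and using the hypothetical helper name <code>gapped_array_setup</code>:

<syntaxhighlight lang="python">
import random

def gapped_array_setup(items, epsilon=1.0):
    # Extra space: the backing array has roughly (1 + epsilon) * n slots;
    # the surplus slots are the gaps left between inserted elements.
    n = len(items)
    capacity = int((1 + epsilon) * n) + 1
    slots = [None] * capacity          # None marks an empty space (gap)

    # The with-high-probability time bound assumes the input is randomly
    # permuted first, which breaks stability and discards presortedness.
    shuffled = list(items)
    random.shuffle(shuffled)
    return slots, shuffled
</syntaxhighlight>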
 
One weakness of [[insertion sort]] is that it may require a high number of swap operations and be costly if memory writes are expensive. Library sort may improve that somewhat in the insertion step, as fewer elements need to move to make room, but it also adds the extra cost of the rebalancing step. In addition, locality of reference will be poor compared to [[mergesort]], as each insertion from a random data set may access memory that is no longer in the cache, especially with large data sets.
 
# '''Binary Search''': Finding the position of insertion by applying binary search within the already inserted elements. When a probe lands on an empty space, the search moves linearly towards the left or right side of the array until it finds an element to compare with.
# '''Insertion''': Inserting the element in the position found and shifting the following elements right by one position until an empty space is hit. With high probability, this is done in logarithmic time.
# '''Re-Balancing''': Inserting spaces (gaps) between each pair of elements in the array. The cost of one rebalancing pass is linear in the number of elements already inserted; since the number of inserted elements doubles with each round, the total cost of rebalancing over all rounds is also linear. A simplified sketch of these steps is given below.
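
The following is a simplified sketch of these steps in Python. It is not the implementation from the paper (none is given there); the gap representation, the function names such as <code>library_sort</code>, <code>find_position</code> and <code>rebalance</code>, the default value of ε, and the decision to rebalance over the whole backing array each round are assumptions made here for illustration.

<syntaxhighlight lang="python">
import random


def library_sort(items, epsilon=1.0):
    """Gapped insertion sort, following the three steps above.

    Illustrative simplification: gaps are None entries, epsilon sets the
    extra space, and each rebalancing round spreads the elements over the
    whole backing array.
    """
    n = len(items)
    if n == 0:
        return []

    capacity = int((1 + epsilon) * n) + 1
    slots = [None] * capacity          # backing array with gaps

    # Random permutation: required for the with-high-probability bound,
    # and the reason the algorithm is neither stable nor adaptive.
    items = list(items)
    random.shuffle(items)

    def rebalance():
        # Step 3: spread the inserted elements evenly, recreating gaps.
        # One pass costs time linear in the number inserted so far.
        elems = [e for e in slots if e is not None]
        for i in range(capacity):
            slots[i] = None
        step = capacity / (len(elems) + 1)
        for i, e in enumerate(elems):
            slots[int(i * step)] = e

    def find_position(key):
        # Step 1: binary search over the gapped array; when a probe
        # lands on a gap, scan left to the nearest element to compare.
        lo, hi = 0, capacity
        while lo < hi:
            mid = (lo + hi) // 2
            probe = mid
            while probe > lo and slots[probe] is None:
                probe -= 1
            if slots[probe] is None or slots[probe] <= key:
                lo = mid + 1           # key belongs to the right of mid
            else:
                hi = probe             # key belongs to the left of probe
        return lo

    def insert(key):
        # Step 2: shift the following elements right by one position
        # until a gap is hit, then place the key in the freed slot.
        pos = find_position(key)
        right = pos
        while right < capacity and slots[right] is not None:
            right += 1
        if right < capacity:
            while right > pos:
                slots[right] = slots[right - 1]
                right -= 1
            slots[pos] = key
        else:
            # Corner case: no gap to the right, use the nearest gap on the left.
            left = pos - 1
            while slots[left] is not None:
                left -= 1
            while left < pos - 1:
                slots[left] = slots[left + 1]
                left += 1
            slots[pos - 1] = key

    inserted, next_round = 0, 1
    for x in items:
        if inserted == next_round:     # round sizes double: 1, 2, 4, ...
            rebalance()
            next_round *= 2
        insert(x)
        inserted += 1

    return [e for e in slots if e is not None]


print(library_sort([5, 3, 8, 1, 9, 2, 7, 4, 6]))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
</syntaxhighlight>

The left-shift fallback in <code>insert</code> only handles the corner case in which no gap remains to the right of the insertion point; with enough extra space this is rare, and the common path is the right shift to the nearest gap described in step 2.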
 
===Pseudocode===