Smoothsort: differenze tra le versioni

Naviga nella cronologia in modo interattivo

Contenuto cancellato Contenuto aggiunto

VisualeWikitesto

Versione delle 11:15, 4 mag 2010 modifica Leo72 (discussione \| contributi) Utenti autoverificati 9 247 modifiche ←Nuova pagina: {{Infobox Algoritmo \|class=Algoritmo di ordinamento \|image=[[Image:Smoothsort.gif\|none\|Lo Smoothsort durante l'ordinamento di una lista già abbastanza ordinata ma con...		Versione attuale delle 11:32, 10 nov 2022 modifica annulla StefanoTrv (discussione \| contributi) Utenti autoverificati 8 522 modifiche m Rimuovo articolo in inglese nascosto in commento presente dalla creazione della pagina nel 2010: se si vuole continuare la traduzione (dopo 12 anni) si può usare una versione più recente.
(18 versioni intermedie di 11 utenti non mostrate)
Riga 1: {{~~Infobox~~ Algoritmo \|~~class~~classe = [[Algoritmo di ordinamento]] \|~~image~~immagine =~~[[Image:~~ Smoothsort.gif \|~~none\|~~didascalia = Lo Smoothsort durante l'ordinamento di una lista già abbastanza ordinata ma con qualche elemento fuori sequenza.]] \|struttura dati = [[Array]] ~~\|caption=Lo Smoothsort durante l'ordinamento di una lista già abbastanza ordinata ma con qualche elemento fuori sequenza.~~ \|tempo = <math>O(n\log n)</math> ~~\|data=[[Array]]~~ \|~~time~~tempo migliore = <math>O(~~n\log~~ n)</math> \|~~best-time~~spazio = <math>O(n)</math> totale, <math>O(1)</math> ausiliario \|ottimale = Quando i dati sono già ordinati ~~\|space=<math>O(n)</math> totale, <math>O(1)</math> ausiliario~~ \|completo = ~~\|optimal=Quando i dati sono già ordinati~~ ~~\|complete=~~ }} In [[informatica]] lo '''Smoothsort'''<ref>~~http~~[https://www.cs.utexas.edu/users/EWD/ewd07xx/EWD796a.PDF ewd796a<!-- Titolo generato automaticamente -->]</ref><ref>~~http~~[https://www.cs.utexas.edu/~EWD/transcriptions/EWD07xx/EWD796a.html E.W.Dijkstra Archive: Smoothsort, an alternative for sorting in situ (EWD 796a)<!-- Titolo generato automaticamente -->]</ref> (metodo) è un [[algoritmo di ordinamento]] particolarmente indicato per ordinare liste di dati già parzialmente ordinate. Lo Smoothsort è una variante dell'[[Heap sort]] sviluppata da [[Edsger Dijkstra]] nel [[1981]]: come l'Heap sort anche lo Smoothsort presenta il limite computazionale massimo pari a [[O-grande\|O]](''n'' log ''n''). Lo Smoothsort, però, si avvicina ad un tempo O(''n'') se [[algoritmi di ordinamento adattivi\|i dati in ingresso sono già parzialmente ordinati]], mentre l'Heap sort mediamente impiega O(''n'' log ''n''), indifferentemente dal livello di ordinamento iniziale. ~~<!--~~ ~~==Overview==~~ The array to be sorted is divided up into a string of heaps, each heap having a size equal to one of the [[Leonardo numbers]] L(n). The division process is simple - the leftmost nodes of the array are made into the largest heap possible, and the remainder is likewise divided up. It can be proved that ==Analisi== * Any array of any length can so be divided up into sections of size L(x). La lista da ordinare viene divisa in una stringa di [[Heap (struttura dati)\|heap]], ognuna delle quali di dimensione pari ad uno dei [[numero di Leonardo\|numeri di Leonardo]] L(n). Il processo di divisione è semplice: i nodi più a sinistra della lista sono divisi nell'heap più grande possibile, ed i rimanenti sono divisi allo stesso modo. Si può dimostrare che: : Proof: trivially - the smallest L(0) is 1. No two heaps will have the same size. The string will therefore be a string of heaps strictly descending in size. Also, a bit array can be used to keep track of which Leonardo numbers are being used as heap sizes (most implementations use bit-fiddling to accomplish this). : Proof: The Leonardo numbers L(n) grow more slowly than 2<sup>n</sup>, and so for any array there will always be an L(n) that is greater than half the size of the array. The exception is the array of size 2, but can be divided into two heaps of size L(1) and L(0), which are both 1 by definition. No two heaps will have sizes that are consecutive Leonardo numbers, except for possibly the final two. : Proof: if there is anything left - even a single element - after using two consecutive Leonardo numbers L(x+1) and L(x), we could have combined them to make a larger chunk of size L(x+2). Since we didn't, there can be nothing left after two heaps of size L(x+1) and L(x). qualsiasi lista di qualsiasi dimensione può essere divisa in sezioni di dimensione L(x). Each heap, having a size of L(x), is structured from left to right as a sub-heap of size L(x-1), a sub-heap of size L(x-2), and a root node, with the exception of heaps with a size of L(1) and L(0) (which by definition are 1). Each heap maintains the heap property that a root node is always >= the root nodes of its child heaps (and therefore >= all nodes in its child heaps), and the string of heaps as a whole maintains the string property that the root node of each heap is >= the root node of the heap to the left. : Verifica: banalmente, la più piccola sezione L(0) è 1. Non possono esserci 2 heap con la stessa dimensione. La stringa sarà dunque una stringa di heap con dimensioni forzatamente in ordine decrescente. Si può inoltre utilizzare un [[array]] di bit per tenere traccia di quali numeri di Leonardo sono usati come dimensioni degli heap (molte implementazioni utilizzano a tale scopo i bit meno significativi). : Verifica: i numeri di Leonardo L(n) crescono meno rapidamente di 2<sup>n</sup>, per cui per ogni array ci sarà sempre un L(n) la cui dimensione sarà più grande della metà di quella dell'array. L'eccezione è l'array di dimensione 2, ma che può essere diviso in 2 heap di dimensione L(1) e L(0) che sono, per definizione, entrambi di valore 1. Non possono esserci 2 heap le cui dimensioni siano due numeri consecutivi di Leonardo, eccetto per gli ultimi due. * Verifica: se fosse rimasto qualcosa, anche un singolo elemento, dopo aver usato due numeri di Leonardo consecutivi L(x+1) e L(x), potremmo averli combinati insieme per formare una porzione più grande di dimensione L(x+2). Ma siccome non lo abbiamo fatto, non può essere rimasto nulla dopo due heap di dimensione L(x+1) e L(x). Ogni heap, la cui dimensione è L(x), è strutturata da sinistra a destra come un heap secondario di dimensione L(x-1), un heap secondario di dimensione L(x-2) ed un nodo radice, ad eccezione degli heap di dimensione L(1) e L(0) (che hanno valore 1 per definizione). Ogni heap mantiene la proprietà degli heap per cui un nodo radice è sempre maggiore o uguale ai nodi radice dei suoi heap figli (e quindi maggiore o uguale a tutti i nodi nei suoi heap figli), e la stringa di heap in toto mantiene la proprietà delle stringhe per cui il nodo radice di ogni heap è maggiore o uguale al nodo radice dell'heap alla sua sinistra. The consequence of this is that the rightmost node in the string will always be >= all of the other nodes, and importantly - an array that is already sorted needs no rearrangement to be made into a valid series of heaps. This is the source of the adaptive qualities of the algorithm. Come conseguenza di ciò si ha che il nodo più a destra nella stringa è sempre maggiore o uguale a tutti gli altri nodi e, molto importante, un array che è già ordinato non richiede aggiustamenti per essere distribuito in una serie di heap validi. Questa è la caratteristica adattiva dell'algoritmo. The algorithm is simple. We start by dividing up our unsorted array into a single heap of one element, followed by an unsorted portion. A one-element array is trivially a valid sequence of heaps. This sequence is then grown by adding one element at a time to the right, performing swaps to keep the sequence property and the heap property, until it fills the entire original array. L'algoritmo è semplice. Si inizia dividendo la nostra lista non ordinata in un singolo heap di un elemento, seguito da una porzione non ordinata. Una lista di un elemento è banalmente una sequenza valida di heap: questa sequenza viene poi incrementata aggiungendo un elemento per volta alla sua destra, effettuando gli scambi del caso per tenere la proprietà della sequenza e la proprietà dell'heap, finché non ricopre l'intera lista iniziale. From this point on, the rightmost element of the sequence of heaps will be the largest element in any of the heaps, and will therefore be in its correct, final position. We then reduce the series of heaps back down to a single heap of one element by removing the rightmost node (which stays in place) and performing re-arrangements to restore the heap condition. When we are back down to a single heap of one element, the array is sorted. A questo punto l'elemento più a destra della sequenza di heap sarà l'elemento più grande in qualsiasi heap e sarà perciò nella sua posizione corretta e definitiva. Si riducono quindi le serie di heap fino ad un singolo heap di un solo elemento rimuovendo il nodo più a destra (che sta a posto) ed effettuando un riarrangiamento per ripristinare la condizione di heap. Quando siamo tornati nella condizione di un singolo heap con un solo elemento, allora la lista è ordinata. ~~==Operations==~~ ==Note== Ignoring (for the moment) Dijkstra's optimisations, two operations are necessary - increase the string by adding one element to the right, and decrease the string by removing the right most element (the root of the last heap), preserving the heap and string conditions. <references/> ==Collegamenti esterni== ~~===Grow the string by adding an element to the right===~~ * {{cita web\|http://www.enterag.ch/hartwig/order/smoothsort.pdf\|Commented transcription of EWD796a}} {{ordinamento}} * If the last two heaps are of size L(x+1) and L(x) (ie: consecutive leonardo numbers), the new element becomes the root node of a bigger heap of size L(x+2). This heap will not necessarily have the heap property. {{portale\|informatica}} * If the last two heaps of the string are not consecutive Leonardo numbers, then the rightmost element becomes a new heap of size 1. This 1 is taken to be L(1), unless the rightmost heap already has size L(1), in which case the new one-element heap is taken to be of size L(0). [[Categoria:Algoritmi di ordinamento]] ~~After this, the heap and string properties must be restored. To do this,~~ [[Categoria:Heap]] ~~# the rightmost heap (the one that has just been created) becomes the "current" heap~~ ~~# while there is a heap to the left of the current heap and its root is larger than the current root '''and''' both of its child heap roots~~ ~~#* then swap the new root with the root on the heap to the left (this will not disturb the heap property of the current heap). That heap then becomes the current heap.~~ ~~# Perform a "filter" operation on the current heap to establish the heap property:~~ ~~#while the current heap has a size greater than 1 and either child heap of the current heap has a root node greater than the root of the current heap~~ ~~#* Swap the greater child root with the current root. That child heap becomes the current heap.~~ The filter operation is greatly simplified by the use of leonardo numbers, as a heap will always either be a single node, or will have two child heaps. One does not need to manage the condition of one of the child heaps not being present. ~~====Optimisation====~~ * If the new heap is going to become part of a larger heap by the time we are done, then don't bother establishing the string property: it only needs to be done when a heap has reached its final size. ** To do this, look at how many elements are left after the new heap of size L(x). If there are more than L(x-1)+1, then this new heap is going to be merged. * Do not maintain the heap property of the rightmost heap. If that heap becomes one of the final heaps of the string, then maintaining the string property will restore the heap property. Of course, whenever a new heap is created, then the rightmost heap is no longer the rightmost and the heap property needs to be restored. ~~===Shrink the string by removing the rightmost element===~~ ~~If the rightmost heap has a size of 1 (ie, L(1) or L(0)), then nothing needs to be done. Simply remove that rightmost heap.~~ ~~If the rightmost heap does not have a size of 1, then remove the root, exposing the two sub-heaps as members of the string. Restore the string property first on the left one and then on the right one.~~ ~~====Optimisation====~~ * When restoring the string property, we do not need to compare the root of the heap to the left with the two child nodes of the heaps that have just been exposed, because we know that these newly exposed heaps have the heap property. Just compare it to the root. ~~==Java implementation==~~ ~~This code uses '''lo''' and '''hi''' as the bounds of the array ''inclusive''. Note that this is not the usual convention.~~ ~~<source lang="java5">~~ ~~// by keeping these constants, we can avoid the tiresome business~~ ~~// of keeping track of Dijkstra's b and c. Instead of keeping~~ ~~// b and c, I will keep an index into this array.~~ ~~static final int LP[] = { 1, 1, 3, 5, 9, 15, 25, 41, 67, 109,~~ ~~177, 287, 465, 753, 1219, 1973, 3193, 5167, 8361, 13529, 21891,~~ ~~35421, 57313, 92735, 150049, 242785, 392835, 635621, 1028457,~~ ~~1664079, 2692537, 4356617, 7049155, 11405773, 18454929, 29860703,~~ ~~48315633, 78176337, 126491971, 204668309, 331160281, 535828591,~~ ~~866988873 // the next number is > 31 bits.~~ }; ~~public static <C extends Comparable<? super C>> void sort(C[] m,~~ ~~int lo, int hi) {~~ ~~int head = lo; // the offset of the first element of the prefix into m~~ ~~// These variables need a little explaining. If our string of heaps~~ ~~// is of length 38, then the heaps will be of size 25+9+3+1, which are~~ ~~// Leonardo numbers 6, 4, 2, 1.~~ ~~// Turning this int a binary number, we get b01010110 = x56. We represent~~ ~~// this number as a pair of numbers by right-shifting all the zeros and~~ ~~// storing the mantissa and exponent as "p" and "pshift".~~ ~~// This is handy, because the exponent is the index into L[] giving the~~ ~~// size of the rightmost heap, and because we can instantly find out if~~ ~~// the rightmost two heaps are consecutive leonardo numbers by checking~~ ~~// (p&3)==3~~ ~~int p = 1; // the bitmap of the current standard concatenation >> pshift~~ ~~int pshift = 1;~~ ~~while (head < hi) {~~ ~~if ((p & 3) == 3) {~~ ~~// Add 1 by merging the first two blocks into a larger one.~~ ~~// The next Leonardo num is one bigger.~~ ~~sift(m, pshift, head);~~ ~~p >>>= 2;~~ ~~pshift += 2;~~ ~~} else {~~ ~~// adding a new block of length 1~~ ~~if (LP[pshift - 1] >= hi - head) {~~ ~~// this block is its final size.~~ ~~trinkle(m, p, pshift, head, false);~~ ~~} else {~~ ~~// this block will get merged. Just make it trusty.~~ ~~sift(m, pshift, head);~~ } ~~if (pshift == 1) {~~ ~~// LP[1] is being used, so we add use LP[0]~~ ~~p <<= 1;~~ ~~pshift--;~~ ~~} else {~~ ~~// shift out to position 1, add LP[1]~~ ~~p <<= (pshift - 1);~~ ~~pshift = 1;~~ } } ~~p \|= 1;~~ ~~head++;~~ } ~~trinkle(m, p, pshift, head, false);~~ ~~while (pshift != 1 \|\| p != 1) {~~ ~~if (pshift <= 1) {~~ ~~// block of length 1. No fiddling needed~~ ~~int trail = Integer.numberOfTrailingZeros(p & ~1);~~ ~~p >>>= trail;~~ ~~pshift += trail;~~ ~~} else {~~ ~~p <<= 2;~~ ~~p ^= 7;~~ ~~pshift -= 2;~~ ~~// ok. This block gets broken into three bits. The rightmost~~ ~~// bit is a block of length 1. The left hand part is split into~~ ~~// two, a block of length LP[pshift+1] and one of LP[pshift].~~ ~~// Both these two are appropriately heapified, but the root~~ ~~// nodes are not nessesarily in order. We therefore semitrinkle~~ ~~// both~~ ~~// of them~~ ~~trinkle(m, p >>> 1, pshift + 1, head - LP[pshift] - 1, true);~~ ~~trinkle(m, p, pshift, head - 1, true);~~ } ~~head--;~~ } } ~~private static <C extends Comparable<? super C>> void sift(C[] m, int pshift,~~ ~~int head) {~~ ~~// we do not use Floyd's improvements to the heapsort sift, because we~~ ~~// are not doing what heapsort does - always moving nodes from near~~ ~~// the bottom of the tree to the root.~~ ~~C val = m[head];~~ ~~while (pshift > 1) {~~ ~~int rt = head - 1;~~ ~~int lf = head - 1 - LP[pshift - 2];~~ ~~if (val.compareTo(m[lf]) >= 0 && val.compareTo(m[rt]) >= 0)~~ ~~break;~~ ~~if (m[lf].compareTo(m[rt]) >= 0) {~~ ~~m[head] = m[lf];~~ ~~head = lf;~~ ~~pshift -= 1;~~ ~~} else {~~ ~~m[head] = m[rt];~~ ~~head = rt;~~ ~~pshift -= 2;~~ } } ~~m[head] = val;~~ } ~~private static <C extends Comparable<? super C>> void trinkle(C[] m, int p,~~ ~~int pshift, int head, boolean isTrusty) {~~ ~~C val = m[head];~~ ~~while (p != 1) {~~ ~~int stepson = head - LP[pshift];~~ ~~if (m[stepson].compareTo(val) <= 0)~~ ~~break; // current node is greater than head. Sift.~~ ~~// no need to check this if we know the current node is trusty,~~ ~~// because we just checked the head (which is val, in the first~~ ~~// iteration)~~ ~~if (!isTrusty && pshift > 1) {~~ ~~int rt = head - 1;~~ ~~int lf = head - 1 - LP[pshift - 2];~~ ~~if (m[rt].compareTo(m[stepson]) >= 0~~ ~~\|\| m[lf].compareTo(m[stepson]) >= 0)~~ ~~break;~~ } ~~m[head] = m[stepson];~~ ~~head = stepson;~~ ~~int trail = Integer.numberOfTrailingZeros(p & ~1);~~ ~~p >>>= trail;~~ ~~pshift += trail;~~ ~~isTrusty = false;~~ } ~~if (!isTrusty) {~~ ~~m[head] = val;~~ ~~sift(m, pshift, head);~~ } } ~~</source>~~ ~~==Notes==~~ ~~{{reflist}}~~ * [http://www.enterag.ch/hartwig/order/smoothsort.pdf Commented transcription of EWD796a] ~~{{sorting}}~~ ~~[[Category:Sorting algorithms]]~~ ~~[[Category:Comparison sorts]]~~ ~~[[Category:Heaps (structure)]]~~ ~~[[Category:Articles with example Java code]]~~ ~~[[de:Smoothsort]]~~ ~~[[fa:مرتب‌سازی روان]]~~ ~~[[fr:Smoothsort]]~~ ~~[[tr:Rahat sıralama]]~~ ~~-->~~