Burrows–Wheeler transform: Difference between revisions

Content deleted Content added
Remove claim about the bijective variant that appears to be false (and contradicted by the very example that followed it). See https://en.wikipedia.org/wiki/Talk:Burrows%E2%80%93Wheeler_transform#%22the_transformed_text_will_differ_from_the_result_of_BWT_by_only_one_character_per_Lyndon_word%22_-_uh,_no?
Tweak intro to and 1. correct claim about the inventor and date of invention - B&W's own paper says the transform was "discovered" by Wheeler all the way back in 1983, and simply wasn't published until later. 2. fix conflation of BSLDCA with BWT - see Talk:Burrows–Wheeler_transform#compression? and 3. reduce word count
Line 7:
| data = string
}}
The '''Burrows–Wheeler transform''' ('''BWT''', also called '''block-sorting compression''') rearranges a [[character string (computer science)|character string]] into runs of similar characters., Thisin isa usefulmanner forthat [[datacan compression|compression]], since itbe tendsreversed to berecover easythe to compress aoriginal string. thatSince has[[data runs of repeated characters bycompression|compression]] techniques such as [[move-to-front transform]] and [[run-length encoding]]. are Importantly,more theeffective transformationwhen issuch ''reversible'',runs withoutare needing to store any additional data exceptpresent, the position of the first original character. The BWT can thus be used as a "free" preparatory step to improve the efficiency of a text compression algorithm, costing only some additional computation, and is used this way in software such as [[bzip2]]. The algorithm can be implemented efficiently using a [[suffix array]] thus reaching linear time complexity.
 
It was invented by [[MichaelDavid BurrowsWheeler (computer scientist)|MichaelDavid BurrowsWheeler]] in 1983, and later published by him and [[DavidMichael WheelerBurrows (computer scientist)|DavidMichael WheelerBurrows]] in 1994. whileTheir Burrowspaper wasincluded workinga atcompression [[DECalgorithm, Systemscalled Researchthe Center]]'''Block-sorting inLossless [[PaloData Alto]],Compression California.Algorithm''' Itor is'''BSLDCA''', basedthat oncompresses a previously unpublished transformation discovereddata by Wheelerusing inthe 1983.BWT Thefollowed by algorithmmove-to-front cancoding beand implemented[[Huffman efficientlycoding]] using aor [[suffixarithmetic arraycoding]] thus reaching linear time complexity.<ref name=Burrows1994>{{citation
| first1 = Michael
| last1 = Burrows
Line 23:
| archiveurl=http://web.archive.org/web/20030105080431/https://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.html
| archivedate=January 5, 2003
}}</ref><ref name="u822">{{cite conference | last=Arnavut | first=Z. | last2=Magliveras | first2=S.S. | title=Block sorting and compression | publisher=IEEE Comput. Soc. Press | date=1997 | isbn=978-0-8186-7761-8 | doi=10.1109/DCC.1997.582009 | url=https://ieeexplore.ieee.org/document/582009/ | access-date=2025-05-07 | page=181–190}}</ref>
}}</ref>
 
==Description==