Suffix array: Difference between revisions

Content deleted Content added
Bender the Bot (talk | contribs)
m External links: HTTP to HTTPS for SourceForge
 
(5 intermediate revisions by 2 users not shown)
Line 28:
 
Enhanced suffix arrays (ESAs) are suffix arrays with additional tables that reproduce the full functionality of suffix trees preserving the same time and memory complexity.{{sfn|Abouelhoda|Kurtz|Ohlebusch|2004}}
TheA suffixsorted array forof aonly subsetsome of(rather than all) suffixes of a string is called a [[sparse suffix array]].{{sfn|I|Kärkkäinen|Kempa|2014}} Multiple probabilistic algorithms have been developed to minimize the additional memory usage including an optimal time and memory algorithm.{{sfn|Gawrychowski|Kociumaka|2017}}
 
== Definition ==
Line 176:
Recent work by {{harvtxt|Salson|Lecroq|Léonard|Mouchard|2010}} proposes an algorithm for updating the suffix array of a text that has been edited instead of rebuilding a new suffix array from scratch. Even if the theoretical worst-case time complexity is <math>\mathcal{O}(n \log n)</math>, it appears to perform well in practice: experimental results from the authors showed that their implementation of dynamic suffix arrays is generally more efficient than rebuilding when considering the insertion of a reasonable number of letters in the original text.
 
In practical [[open source]] work, a commonly used routine for suffix array construction was qsufsort, based on the 1999 Larsson-Sadakane algorithm.<ref>{{cite journal |last1=Larsson |first1=N. Jesper |last2=Sadakane |first2=Kunihiko |title=Faster suffix sorting |journal=Theoretical Computer Science |date=22 November 2007 |volume=387 |issue=3 |pages=258–272 |doi=10.1016/j.tcs.2007.07.017 |language=en |issn=0304-3975|doi-access=free }}</ref> This routine has been superseded by Yuta Mori's DivSufSort, "the fastest known suffix sorting algorithm in main memory" as of 2017. It too can be modified to compute an LCP array. It uses a induced copying combined with Itoh-Tanaka.<ref>{{cite journal |last1=Fischer |first1=Johannes |last2=Kurpicz |first2=Florian |title=Dismantling DivSufSort |arxiv=1710.01896 |date=5 October 2017 |journal=Proceedings of the Prague Stringology Conference 2017}}</ref> In 2021 a faster implementation of the algorithm was presented by Ilya Grebnov <ref>{{Cite web|title=New saca and bwt library (libsais)|url=https://encode.su/threads/3579-New-saca-and-bwt-library-(libsais)|access-date=2021-10-03|website=encode.su}}</ref> which in average showed 65% performance improvement over DivSufSort implementation on the [[Silesia Corpuscorpus]].<ref>{{Citation|last=Grebnov|first=Ilya|title=libsais|date=2021-09-22|url=https://github.com/IlyaGrebnov/libsais|access-date=2021-10-02}}</ref>
 
== Generalized suffix array ==
Line 288:
| author2-link = Gene Myers
}}
* {{Cite journal|last1=Gawrychowski|first1=Paweł|last2=Kociumaka|first2=Tomasz|date=January 2017 |title=Sparse Suffix Tree Construction in Optimal Time and Space|journal=Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms|___location=Philadelphia, PA|publisher=Society for Industrial and Applied Mathematics|pages=425–439|isbn=9781611974782|arxiv=1608.00865|doi=10.1137/1.9781611974782.27 |s2cid=6608776}}
* {{cite conference |last1=Li|first1=Zhize|last2=Li|first2=Jian|last3=Huo|first3=Hongwei
|doi = 10.1007/978-3-030-00479-8_22
Line 377 ⟶ 376:
* [http://code.google.com/p/compression-code/downloads/list Suffix sorting module for BWT in C code]
* [http://www.codeodor.com/index.cfm/2007/12/24/The-Suffix-Array/1845 Suffix Array Implementation in Ruby]
* [httphttps://sary.sourceforge.net/index.html.en Suffix array library and tools]
* [http://pizzachili.dcc.uchile.cl/ Project containing various Suffix Array c/c++ Implementations with a unified interface]
* [https://github.com/y-256/libdivsufsort A fast, lightweight, and robust C API library to construct the suffix array]