Content deleted Content added
IznoRepeat (talk | contribs) m →External links: add WP:TEMPLATECAT to remove from template; genfixes |
|||
(21 intermediate revisions by 17 users not shown) | |||
Line 1:
'''Sequitur''' (or '''Nevill-
| author =
| author-link = Craig Nevill-Manning
|author2=Witten, I.H.▼
| author2 = Witten, I.H.
| title = Identifying Hierarchical Structure in Sequences: A linear-time algorithm
| arxiv = cs/9709102
}}</ref> that infers a hierarchical structure ([[context-free grammar]]) from a sequence of discrete symbols. The algorithm operates in linear space and time. It can be used in [[data compression]] software applications.▼
| bibcode= 1997cs........9102N
▲ }}</ref> that infers a hierarchical structure ([[context-free grammar]]) from a sequence of discrete symbols. The algorithm operates in linear space and time. It can be used in [[data compression]] software applications.<ref name=Nevill-manning1997_2>{{cite book
| doi = 10.1109/DCC.1997.581951
| author = Nevill-Manning, C.G.
| author-link = Craig Nevill-Manning
| title = Proceedings DCC '97. Data Compression Conference
| pages = 3–11
▲ | author2=Witten, I.H.
| year = 1997
| chapter = Linear-Time, Incremental Hierarchy Inference for Compression
| isbn = 978-0-8186-7761-8
| citeseerx = 10.1.1.30.2305
| s2cid = 14324301
}}</ref>
== Constraints ==
: S→abcab, the algorithm will produce : While scanning the input sequence,
=== Digram
Whenever a new symbol is scanned from the sequence, it is appended with the last scanned symbol to form a new [[Bigram|digram]]. If this digram has been formed earlier then a new rule is made to replace both
Therefore, it ensures that no digram occurs more than once in the grammar. For example, in the sequence '''S→abaaba''', when the first four symbols are already scanned, digrams formed are
=== Rule
This constraint ensures that all the rules are used more than once in the right
: '''S→BB, B→aba'''. This constraint helps
== Method summary ==
The algorithm works by scanning a sequence of [[terminal symbol]]s
==See also==
* [[Context-free grammar]]
* [[Straight-line grammar]]▼
* [[Lossless compression|Lossless data compression]]▼
* [[Data compression]]
▲* [[Lossless compression|Lossless data compression]]
▲* [[Straight-line grammar]]
* [[Byte pair encoding]]
==References==
{{
==External links==
* [http://www.sequitur.info/ sequitur.info
{{Compression methods}}
[[Category:Lossless compression algorithms]]
[[Category:Data compression]]
|