CYK algorithm: Difference between revisions

Content deleted Content added
Valiant's algorithm
Line 14:
 
The importance of the CYK algorithm stems from the fact that it constructively proves that the [[membership problem]] for context-free languages is [[decision problem|decidable]] (see also: [[theory of computation]]) and the fact that it does so quite efficiently.
Using [[Landau symbol]]s, the [[Analysis of algorithms|worst case [[asymptoticrunning time]] time complexity of CYK is <math>\Theta(n^3 \cdot |G|)</math>, where ''n'' is the length of the parsed string and ''|G|'' is the size of the CNF grammar ''G''. This makes it one of the most efficient (in those terms) algorithms for recognizing general context-free languages. Specialized and faster algorithms exist for certain subsets of the context-free languages.
 
==Standard form==
Line 72:
===Parsing weighted context-free grammars===
It is also possible to extend the CYK algorithm to parse strings using [[weighted context-free grammar|weighted]] and [[stochastic context-free grammar]]s. Weights (probabilities) are then stored in the table P instead of booleans, so P[i,j,A] will contain the minimum weight (maximum probability) that the substring from i to j can be derived from A. Further extensions of the algorithm allow all parses of a string to be enumerated from lowest to highest weight (highest to lowest probability).
 
===Valiant's algorithm===
Using [[Landau symbol]]s, the [[Analysis of algorithms|worst case running time]] of CYK is <math>\Theta(n^3 \cdot |G|)</math>, where ''n'' is the length of the parsed string and ''|G|'' is the size of the CNF grammar ''G''. This makes it one of the most efficient algorithms for recognizing general context-free languages in practice. Valiant (1975) gave an extension of the CYK algorithm. His algorithm computes the same parsing table
as the CYK algorithm; yet he showed that Boolean [[Matrix multiplication#Algorithms for efficient matrix multiplication|matrix multiplication]] can be utilized for performing this computation.
 
Using the [[Coppersmith–Winograd algorithm]] for multiplying Boolean matrices, this gives an asymptotic worst-case running time of <math>O(n^{2.38} \cdot |G|)</math>. However, the constant term hidden by the [[Big O Notation]] is so large that the Coppersmith–Winograd algorithm is only worthwhile for matrices that are too large to handle on present-day computers.<ref>Knuth, 1997</ref>
 
==See also==
Line 82 ⟶ 88:
* [[Tadao Kasami|T. Kasami]] (1965). An efficient recognition and syntax-analysis algorithm for context-free languages. Scientific report AFCRL-65-758, Air Force Cambridge Research Lab, [[Bedford, MA]].
* Daniel H. Younger (1967). Recognition and parsing of context-free languages in time ''n''<sup>3</sup>. ''Information and Control'' 10(2): 189&ndash;208.
* D.E. Knuth. ''The Art of Computer Programming Volume 2: Seminumerical Algorithms''. Addison-Wesley Professional; 3 edition (November 14, 1997). ISBN 978-0201896848. pp. 501.
* Martin Lange and Hans Leiß. [http://www.informatica-didactica.de/InformaticaDidactica/LangeLeiss2009 ''To CNF or not to CNF? An Efficient Yet Presentable Version of the CYK Algorithm.''.] Informatica Didactica 8, 2009. [http://www.informatica-didactica.de/InformaticaDidactica/LangeLeiss2009.pdf (pdf)]
* Leslie G. Valiant. ''General context-free recognition in less than cubic time''. [[Journal of Computer and System Sciences]], 10:2 (1975), 308–314.
 
 
==External links==
* [http://homepages.uni-tuebingen.de/student/martin.lazarov/demos/cky.html CYK parsing demo in JavaScript]<br/>
* [http://www.informatik.uni-leipzig.de/alg/lehre/ss08/AUTO-SPRACHEN/Java-Applets/CYK-Algorithmus.html Interactive Applet from the University of Leipzig to demonstrate the CYK-Algorithm (Site is in german)]<br />
* [http://www.swisseduc.ch/compscience/exorciser/ Exorciser is a Java application to generate exercises in the CYK algorithm as well as Finite State Machines, Markov algorithms etc (this page is in English)]
 
[[Category:Parsing algorithms]]