CYK algorithm: Difference between revisions

External links: the former link didn't work
Citation bot
Removed URL that duplicated identifier. | #UCB_CommandLine
 
(12 intermediate revisions by 10 users not shown)
Line 1:
{{Short description|Parsing algorithm for context-free grammars}}
{{Redirect|CYK||Cyk (disambiguation)}}
{{Infobox algorithm
|name=Cocke–Younger–Kasami algorithm (CYK)
Line 8 ⟶ 10:
}}
 
In [[computer science]], the '''Cocke–Younger–Kasami algorithm''' (alternatively called '''CYK''', or '''CKY''') is a [[parsing]] [[algorithm]] for [[context-free grammar]]s published by Itiroo Sakai in 1961.<ref>{{cite book |last1=Grune |first1=Dick |title=Parsing techniques : a practical guide |date=2008 |publisher=Springer |location=New York |page=579 |isbn=978-0-387-20248-8 |edition=2nd}}</ref><ref>Itiroo Sakai, “Syntax in universal translation”. In Proceedings 1961 International Conference on Machine Translation of Languages and Applied Language Analysis, Her Majesty’s Stationery Office, London, pp. 593–608, 1962.</ref> The algorithm is named after some of its rediscoverers: [[John Cocke (computer scientist)|John Cocke]], Daniel Younger, [[Tadao Kasami]], and [[Jacob T. Schwartz]]. It employs [[bottom-up parsing]] and [[dynamic programming]].
 
The standard version of CYK operates only on context-free grammars given in [[Chomsky normal form]] (CNF). However, any context-free grammar may be algorithmically transformed (after convention) into a CNF grammar expressing the same language {{harv|Sipser|1997}}.
 
The importance of the CYK algorithm stems from its high efficiency in certain situations. Using [[Big O notation|big ''O'' notation]], the [[Analysis of algorithms|worst case running time]] of CYK is <math>\mathcal{O}\left( n^3 \cdot \left| G \right| \right)</math>, where <math>n</math> is the length of the parsed string and <math>\left| G \right|</math> is the size of the CNF grammar <math>G</math> {{harv|Hopcroft|Ullman|1979|p=140}}. This makes it one of the most efficient {{Citation needed|reason=cubic time does not seem efficient at all; other algorithms claim linear execution time|date=August 2023}} parsing algorithms in terms of worst-case [[asymptotic complexity]], although other algorithms exist with better average running time in many practical scenarios.
 
==Standard form==
 
The [[dynamic programming]] algorithm requires the context-free grammar to be rendered into [[Chomsky normal form]] (CNF), because it tests for possibilities to split the current sequence into two smaller sequences. Any context-free grammar that does not generate the empty string can be represented in CNF using only [[Formal grammar#The syntax of grammars|production rules]] of the forms <math>A\rightarrow \alpha</math> and <math>A\rightarrow B C</math>; to allow for the empty string, one can explicitly allow <math>S\to \varepsilon</math>, where <math>S</math> is the start symbol.<ref>{{Cite book |last=Sipser |first=Michael |url=https://www.worldcat.org/oclc/58544333 |title=Introduction to the theory of computation |date=2006 |publisher=Thomson Course Technology |isbn=0-534-95097-3 |edition=2nd |location=Boston |at=Definition 2.8 |oclc=58544333}}</ref>
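
The role of the two rule forms can be illustrated with a minimal recognizer sketch in Python. It is not taken from the cited sources: the function name <code>cyk_recognize</code>, the dictionary encoding of the grammar, and the small example grammar are assumptions made for illustration, and the optional rule <math>S\to \varepsilon</math> is ignored. Rules of the form <math>A\rightarrow \alpha</math> fill the table entries for substrings of length 1, and rules of the form <math>A\rightarrow B C</math> are tried at every point where a longer substring can be split in two.

```python
from itertools import product


def cyk_recognize(words, terminal_rules, binary_rules, start="S"):
    """Return True if `words` can be derived from `start` in a CNF grammar.

    terminal_rules: maps a terminal to the set of A with a rule A -> terminal.
    binary_rules:   maps a pair (B, C) to the set of A with a rule A -> B C.
    (Both encodings are assumptions of this sketch.)
    """
    n = len(words)
    if n == 0:
        return False  # the optional S -> epsilon rule is not handled here

    # table[i][l - 1] holds the nonterminals deriving words[i : i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Substrings of length 1 come from rules of the form A -> terminal.
    for i, word in enumerate(words):
        table[i][0] = set(terminal_rules.get(word, ()))

    # Longer substrings: try every split into two shorter substrings.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for b, c in product(left, right):
                    table[i][length - 1] |= binary_rules.get((b, c), set())

    return start in table[0][n - 1]


# Hypothetical example grammar in CNF:
#   S -> A B | B C,  A -> B A | a,  B -> C C | b,  C -> A B | a
TERMINAL_RULES = {"a": {"A", "C"}, "b": {"B"}}
BINARY_RULES = {
    ("A", "B"): {"S", "C"},
    ("B", "C"): {"S"},
    ("B", "A"): {"A"},
    ("C", "C"): {"B"},
}

print(cyk_recognize(list("baaba"), TERMINAL_RULES, BINARY_RULES))  # True
print(cyk_recognize(list("aa"), TERMINAL_RULES, BINARY_RULES))     # False
```

The three nested loops over substring length, start position, and split point account for the <math>n^3</math> factor in the running time, while the rule lookups contribute the dependence on the grammar size.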
 
==Algorithm==
Line 156 ⟶ 158:
== Sources ==
*{{cite conference |title= Syntax in universal translation |last= Sakai |first= Itiroo |date= 1962 |location= London |publisher= Her Majesty’s Stationery Office |volume= II |pages= 593–608 |conference= 1961 International Conference on Machine Translation of Languages and Applied Language Analysis, Teddington, England}}
*{{cite tech report |last1=Cocke |first1=John |author-link1=John Cocke (computer scientist) |last2=Schwartz |first2=Jacob T. |date=April 1970 |title=Programming languages and their compilers: Preliminary notes |edition=2nd revised |publisher=[[Courant Institute of Mathematical Sciences|CIMS]], [[New York University|NYU]] |url=http://www.softwarepreservation.org/projects/FORTRAN/CockeSchwartz_ProgLangCompilers.pdf}}
* {{cite book | isbn=0-201-02988-X | first1=John E. | last1=Hopcroft | author1-link=John E. Hopcroft | first2=Jeffrey D. | last2=Ullman | author2-link=Jeffrey D. Ullman | title=Introduction to Automata Theory, Languages, and Computation | location=Reading/MA | publisher=Addison-Wesley | year=1979 | url=https://archive.org/details/introductiontoau00hopc }}
*{{cite tech report |last1=Kasami |first1=T. |author-link1=Tadao Kasami |year=1965 |title=An efficient recognition and syntax-analysis algorithm for context-free languages |number=65-758 |publisher=[[Air Force Cambridge Research Laboratories|AFCRL]]}}
*{{cite book |last1=Knuth |first1=Donald E. |author-link1=Donald Knuth |title=The Art of Computer Programming Volume 2: Seminumerical Algorithms |publisher=Addison-Wesley Professional |edition=3rd |date=November 14, 1997 |isbn=0-201-89684-2 |pages=501 }}
*{{cite journal |last1=Lang |first1=Bernard |title=Recognition can be harder than parsing |journal=[[Computational Intelligence (journal)|Comput. Intell.]] |year=1994 |volume=10 |issue=4 |pages=486–494 |citeseerx=10.1.1.50.6982 |doi=10.1111/j.1467-8640.1994.tb00011.x |s2cid=5873640 }}
Line 168 ⟶ 170:
 
==External links==
* [https://raw.org/tool/cyk-algorithm/ Interactive Visualization of the CYK algorithm]
* [https://martinlaz.github.io/demos/cky.html CYK parsing demo in JavaScript]
* [https://www.swisseduc.ch/informatik/exorciser/ Exorciser is a Java application to generate exercises in the CYK algorithm as well as Finite State Machines, Markov algorithms etc]