Talk:CYK algorithm: Difference between revisions

Content deleted Content added
Teetooan (talk | contribs)
Citation from 2009?: Some remarks on what should be cited
Implementing WP:PIQA (Task 26)
 
(8 intermediate revisions by 4 users not shown)
Line 1:
{{WikiProject Computerbanner Science|importance=midshell|class=C}}|
{{WikiProject Computer science|importance=mid}}
 
}}
 
==I don't like the article==
Line 11 ⟶ 12:
:The example given is for ''parsing'' sentences into NP, VP and so on, which is more than [[part-of-speech tagging]]. It may be more common to introduce CYK with artificial examples, but I don't see anything wrong with using an example from an application area. [[Computational linguistics]] is one of the areas where CYK is applied, and CYK is sometimes studied as part of a CL curriculum. But I've reconstructed the grammar from the example and added it to the article. It should now make more sense. -- [[User:UKoch|UKoch]] ([[User talk:UKoch|talk]]) 17:25, 20 December 2011 (UTC)
 
== I think there is still much room for improvement ==
==I don't like the article either ==
 
The fact that binary form (no RHS longer than 2) should be used, rather than CNF has been known for about 40 years. It was not published, because it was considered obvious, as the complexity results from the length of right-hand sides. It is also the only way to preserve conveniently the shape of parse trees.
Beau Shiel had a nice paper about it at Coling 76. The empty word is a problem that cannot be escaped, if parse-trees are to be preserved, which is implied by the name parser (but sometimes not described with the algorithm). If no parse tree is kept, it is a recognizer.
 
The size of a grammar should be defined explicitly, or a reference given.[[User:Teetooan|Teetooan]] ([[User talk:Teetooan|talk]]) 20:32, 27 February 2014 (UTC)
For building the parse-tree it is always necessary to store a list of nodes with at least one for each non-terminal found.
 
The proper way to write for an encyclopedia is to give the modern and simpler description first. Then to give some historical details such as the use of CNF in the first versions. On the other hand, the CNF is easier to explain. It should be noted that it loses infinite ambiguity.
 
In 1974, Valiant used the Strassen algorithm (1969) to reduce the complexity, but the complexity of matrix multiplication was reduced since then, first by Coppersmith–Winograd (1990), then by Williams (2011). It should be better to discuss usability on the basis of Strassen's algorithm, which is the only one that has a chance of being meaningful. <small><span class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Teetooan|Teetooan]] ([[User talk:Teetooan|talk]] • [[Special:Contributions/Teetooan|contribs]]) 00:45, 26 February 2014 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
[[User:Teetooan|Teetooan]] ([[User talk:Teetooan|talk]]) 00:49, 26 February 2014 (UTC)
 
It might be nice to underscore that CYK is the only cubic algorithm that can run directly on the grammar (up to the binary transformation, that is both trivial an unescapable).[[User:Teetooan|Teetooan]] ([[User talk:Teetooan|talk]]) 13:09, 26 February 2014 (UTC)
 
Parsing weighted context-free grammars is not specific of CYK and is available in a similar way to all general CF parsers. It is a special case of decorating the rules with values from a semi-ring.[[User:Teetooan|Teetooan]] ([[User talk:Teetooan|talk]]) 20:45, 27 February 2014 (UTC)
 
==Explain practical parse tree generation!==
Line 40 ⟶ 45:
 
== Another article on CYK ==
 
There is another, almost empty article on the CYK parser: http://en.wikipedia.org/wiki/CYK_%28algorithm%29
The two should be merged (the former should be deleted.)
<br><small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:89.133.45.42|89.133.45.42]] ([[User talk:89.133.45.42|talk]] • [[Special:Contributions/89.133.45.42|contribs]]) 21:33, 21 June 2007</span></small><!-- Template:Unsigned -->
 
: Done. [[User:Ceroklis|Ceroklis]] 10:01, 26 September 2007 (UTC)
Line 125 ⟶ 126:
 
:I changed "the most efficient parsing algorithm" to "one of the most efficient parsing algorithms". -- [[User:UKoch|UKoch]] ([[User talk:UKoch|talk]]) 14:08, 20 December 2011 (UTC)
 
 
== External Links ==
 
I tried to fix the broken link in the external links section as well as adding a visualization I published on my site: https://www.xarg.org/tools/cyk-algorithm/
 
Unfortunately, the change was reverted automatically as I got explained here: https://en.wikipedia.org/wiki/User_talk:Xarg
 
However, as I wrote, I don't think this is an advertising other than bringing in a valuable reference to Wikipedia and so I would be glad if you add the link permanently if the resource meeds your quality demands. I assure that this link will be available and that I own the site. <!-- Template:Unsigned --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Xarg|Xarg]] ([[User talk:Xarg#top|talk]] • [[Special:Contributions/Xarg|contribs]]) 03:04, 21 July 2018 (UTC)</small> <!--Autosigned by SineBot-->
 
== Pseudocode Difficult to Read ==
 
Aside from the fact that one-letter variable names obfuscate meaning and are difficult to follow, having a lowercase letter L in a font where it looks nearly identical for the numeral for "one" (cf. l and 1) only compounds the issue. [[User:Antigravity711|Antigravity711]] ([[User talk:Antigravity711|talk]]) 17:06, 27 January 2020 (UTC)