Talk:CYK algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Add topic

Revision as of 17:25, 20 December 2011 edit UKoch (talk \| contribs) Extended confirmed users 2,904 edits →I don't like the article ← Previous edit		Latest revision as of 04:00, 28 January 2024 edit undo Qwerfjkl (bot) (talk \| contribs) Bots, Mass message senders 4,080,841 edits Implementing WP:PIQA (Task 26) Tag: Talk banner shell conversion
(13 intermediate revisions by 4 users not shown)
Line 1: {{WikiProject ~~Computer~~banner ~~Science\|importance=mid~~shell\|class=C}}\| {{WikiProject Computer science\|importance=mid}} }} ==I don't like the article== Line 10 ⟶ 11: :The example given is for ''parsing'' sentences into NP, VP and so on, which is more than [[part-of-speech tagging]]. It may be more common to introduce CYK with artificial examples, but I don't see anything wrong with using an example from an application area. [[Computational linguistics]] is one of the areas where CYK is applied, and CYK is sometimes studied as part of a CL curriculum. But I've reconstructed the grammar from the example and added it to the article. It should now make more sense. -- [[User:UKoch\|UKoch]] ([[User talk:UKoch\|talk]]) 17:25, 20 December 2011 (UTC) == I think there is still much room for improvement == The fact that binary form (no RHS longer than 2) should be used, rather than CNF has been known for about 40 years. It was not published, because it was considered obvious, as the complexity results from the length of right-hand sides. It is also the only way to preserve conveniently the shape of parse trees. Beau Shiel had a nice paper about it at Coling 76. The empty word is a problem that cannot be escaped, if parse-trees are to be preserved, which is implied by the name parser (but sometimes not described with the algorithm). If no parse tree is kept, it is a recognizer. The size of a grammar should be defined explicitly, or a reference given.[[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 20:32, 27 February 2014 (UTC) The proper way to write for an encyclopedia is to give the modern and simpler description first. Then to give some historical details such as the use of CNF in the first versions. On the other hand, the CNF is easier to explain. It should be noted that it loses infinite ambiguity. In 1974, Valiant used the Strassen algorithm (1969) to reduce the complexity, but the complexity of matrix multiplication was reduced since then, first by Coppersmith–Winograd (1990), then by Williams (2011). It should be better to discuss usability on the basis of Strassen's algorithm, which is the only one that has a chance of being meaningful. [[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 00:49, 26 February 2014 (UTC) It might be nice to underscore that CYK is the only cubic algorithm that can run directly on the grammar (up to the binary transformation, that is both trivial an unescapable).[[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 13:09, 26 February 2014 (UTC) Parsing weighted context-free grammars is not specific of CYK and is available in a similar way to all general CF parsers. It is a special case of decorating the rules with values from a semi-ring.[[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 20:45, 27 February 2014 (UTC) ==Explain practical parse tree generation!== Line 28 ⟶ 45: == Another article on CYK == ~~There is another, almost empty article on the CYK parser: http://en.wikipedia.org/wiki/CYK_%28algorithm%29~~ ~~The two should be merged (the former should be deleted.)~~ <br><small><span class="autosigned">—Preceding [[Wikipedia:Signatures\|unsigned]] comment added by [[User:89.133.45.42\|89.133.45.42]] ([[User talk:89.133.45.42\|talk]] • [[Special:Contributions/89.133.45.42\|contribs]]) 21:33, 21 June 2007</span></small><!-- Template:Unsigned --> : Done. [[User:Ceroklis\|Ceroklis]] 10:01, 26 September 2007 (UTC) Line 69 ⟶ 82: == Citation from 2009? == Is that paper from 2009 really notable enough to warrant a citation in Wikipedia? It is used to make a fairly technical point. Since all other sources (besides Knuth, which is a reference work) are from the 60s, 70s, it makes the impression as if the latest paper would be the only continuation of the study of this algorithm, which is difficult to believe... that is, IMO it inflates the importance of the paper and the reference should be removed. :I support this removal [[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 00:56, 26 February 2014 (UTC) : :While I understand your feelings about giving undue weight to that paper, I would suggest to add more references of recent date about the CYK algorithm. For instance, the CYK algorithm, as well as Valiant's improvement have been recently generalized from context-free grammars to the case of [[Boolean grammar]]s. Removing content from the article would certainly be a step into the wrong direction, given the size of the article. If there are any recent papers that deserve to be included, please [[Wikipedia:Be_bold\|be bold]] and just go ahead. I am confident that the article will further expanded over time, given that the article's subject is covered in many textbooks and courses on automata theory. [[User:Hermel\|Hermel]] ([[User talk:Hermel\|talk]]) 21:29, 20 July 2010 (UTC) There is no reason to add recent references on CYK if there was no significant change in know-how regarding this algorithm. The best would rather be an account of actual use, and of current role. Some references could be added from the nineties that give original results on the relationship between all the general CF parsers (and some non CF ones).[[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 07:57, 26 February 2014 (UTC) In the previous comment, do you mean the Lange & Leiß paper? I just read that paper, without going deep into the math, and found it incredibly informative, since the main aim is pedagogical. ▼ ▲In the previous comment, do you mean the Lange & Leiß paper? I just read that paper, without going deep into the math, and found it incredibly informative, since the main aim is pedagogical. :So are lots of other papers on the web. But it is no significant contribution [[User:Teetooan\|Teetooan]] ([[User talk:Teetooan\|talk]]) 00:56, 26 February 2014 (UTC)) The paragraph in this article summarizing this paper is weak, and would benefit from being written in more specific terms concerning the three operations (BIN, DEL, UNIT) necessary to transform to CNF, the effect of ordering these operations on size explosion (exponential if DEL precedes BIN), and that DEL and UNIT can be internalized into the CYK algorithm at no extra cost, leaving only a linear transformation on grammar size into 2NF form (BIN). This papers omits TERM because it offers no advantage for the CYK algorithm. Line 108 ⟶ 126: :I changed "the most efficient parsing algorithm" to "one of the most efficient parsing algorithms". -- [[User:UKoch\|UKoch]] ([[User talk:UKoch\|talk]]) 14:08, 20 December 2011 (UTC) == External Links == I tried to fix the broken link in the external links section as well as adding a visualization I published on my site: https://www.xarg.org/tools/cyk-algorithm/ Unfortunately, the change was reverted automatically as I got explained here: https://en.wikipedia.org/wiki/User_talk:Xarg However, as I wrote, I don't think this is an advertising other than bringing in a valuable reference to Wikipedia and so I would be glad if you add the link permanently if the resource meeds your quality demands. I assure that this link will be available and that I own the site. <!-- Template:Unsigned --><small class="autosigned">— Preceding [[Wikipedia:Signatures\|unsigned]] comment added by [[User:Xarg\|Xarg]] ([[User talk:Xarg#top\|talk]] • [[Special:Contributions/Xarg\|contribs]]) 03:04, 21 July 2018 (UTC)</small> <!--Autosigned by SineBot--> == Pseudocode Difficult to Read == Aside from the fact that one-letter variable names obfuscate meaning and are difficult to follow, having a lowercase letter L in a font where it looks nearly identical for the numeral for "one" (cf. l and 1) only compounds the issue. [[User:Antigravity711\|Antigravity711]] ([[User talk:Antigravity711\|talk]]) 17:06, 27 January 2020 (UTC)