m General fixes and Typo fixing, typos fixed: occuring → occurring, orignial → original using AWB (7069) |
Line 23:
==Simplified Lesk Algorithm==
In the Simplified Lesk algorithm, the correct meaning of each word in a given context is determined individually by locating the sense whose dictionary definition overlaps most with that context. Rather than simultaneously determining the meanings of all words in a given context, this approach tackles each word individually, independent of the meaning of the other words.
"A comparative evaluation performed by Vasilescu et al. (2004)<ref>Florentina Vasilescu, Philippe Langlais, and Guy Lapalme. 2004. Evaluating Variants of the Lesk Approach for Disambiguating Words. LREC, Portugal.</ref> has shown that the simplified Lesk algorithm can significantly outperform the original definition of the algorithm, both in terms of precision and efficiency. By evaluating the disambiguation algorithms on the Senseval-2 English all-words data, they measure a 58% precision using the simplified Lesk algorithm compared to the only 42% under the original algorithm.
Note: Vasilescu et al.'s implementation considers a back-off strategy for words not covered by the algorithm, consisting of the most frequent sense defined in WordNet. This means that words for which all their possible meanings lead to zero overlap with current context or with other word definitions are by default assigned sense number one in WordNet."<ref>Agirre, Eneko & Philip Edmonds (eds.). 2006. Word Sense Disambiguation: Algorithms and Applications. Dordrecht: Springer. www.wsdbook.org</ref>
{| class="wikitable"
|-
|
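:'''function''' SIMPLIFIED LESK(''word, sentence'') '''returns''' ''best sense of word''
:''best-sense <- most frequent sense for word''
:''max-overlap <- 0''
:''context <- set of words in sentence''
:'''for each''' ''sense'' '''in''' ''senses of word'' '''do'''
::''signature <- set of words in the gloss and examples of sense''
::''overlap'' <- COMPUTEOVERLAP (''signature, context'')
::'''if''' ''overlap > max-overlap'' '''then'''
::::''max-overlap <- overlap''
::::''best-sense <- sense''
:'''end'''
:'''return''' (''best-sense'')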
|}
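The pseudocode above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not a reference implementation: the sense inventory <code>TOY_SENSES</code>, its sense labels, the glosses, the stop list <code>STOP_WORDS</code>, and the helper <code>tokenize</code> are all hypothetical stand-ins for a real dictionary such as WordNet. The sketch also assumes that senses are listed most frequent first, so that the first sense serves as the back-off, and that the target word itself is excluded from the context.

```python
# Hypothetical stop list; a real system would use a much larger one.
STOP_WORDS = frozenset({
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "that",
    "is", "are", "for", "by", "as", "with", "it", "can", "will",
})

# Hypothetical toy sense inventory: word -> [(sense label, gloss and examples)],
# assumed to be ordered from most to least frequent sense.
TOY_SENSES = {
    "bank": [
        ("bank#1", "sloping land beside a body of water; "
                   "they pulled the canoe up on the bank"),
        ("bank#2", "a financial institution that accepts deposits and "
                   "channels the money into lending activities; "
                   "that bank holds the mortgage on my home"),
    ],
}


def tokenize(text):
    """Lowercase the text, split on whitespace, strip edge punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return {w for w in words if w}


def compute_overlap(signature, context):
    """Count the words shared by both sets, ignoring stop-list words."""
    return len((signature & context) - STOP_WORDS)


def simplified_lesk(word, sentence, inventory):
    """Return the label of the best sense of `word`, or None if unknown."""
    senses = inventory.get(word.lower(), [])
    if not senses:
        return None
    best_sense = senses[0][0]  # back-off: most frequent sense
    max_overlap = 0
    # Assumption: the target word itself does not count as context.
    context = tokenize(sentence) - {word.lower()}
    for sense_label, gloss in senses:
        signature = tokenize(gloss)
        overlap = compute_overlap(signature, context)
        if overlap > max_overlap:
            max_overlap = overlap
            best_sense = sense_label
    return best_sense


sentence = ("The bank can guarantee deposits will eventually cover future "
            "tuition costs because it invests in adjustable-rate mortgage "
            "securities.")
print(simplified_lesk("bank", sentence, TOY_SENSES))
```

In this toy inventory the financial sense shares the words "deposits" and "mortgage" with the sentence, while the other sense shares none, so the financial sense is selected even though it is not listed first. When no sense overlaps with the context, the function falls back to the first listed sense.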
The COMPUTEOVERLAP function returns the number of words in common between two sets, ignoring function words or other words on a stop list. The
==Criticisms and other Lesk-based methods==
Line 76:
[[Category:Computational linguistics]]
[[Category:Word sense disambiguation]]
{{ling-stub}}