Content deleted Content added
No edit summary |
|||
Line 5:
==Overview==
The Lesk algorithm is based on the assumption that words in a given neighbourhood will tend to share a common topic. A simplified version of the Lesk algorithm is to compare the dictionary definition of an ambiguous word with the terms contained of the neighbourhood. Versions have been adapted to [[WordNet]]<ref>Satanjeev Banerjee and Ted Pedersen. ''[http://www.cs.cmu.edu/~banerjee/Publications/cicling2002.ps.gz An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet]''
</ref>. It would be like this:
# for every sense of the word being disambiguated one should count the amount of words that are in both neighbourhood of that word and in the definition of each sense in a dictionary
Line 24:
Unfortunately, Lesk’s approach is very sensitive to the exact wording of definitions, so the absence of a certain word can radically change the results. Further, the algorithm determines overlaps only among the glosses of the senses being considered. This is a significant limitation in that dictionary glosses tend to be fairly short and do not provide sufficient vocabulary to relate fine-grained sense distinctions.
Recently, a lot of works appeared which offer different modifications of this algorithm. These works uses other resources for analysis (thesauruses, synonyms dictionaries or morphological and syntactic models): for instance, it may use such information as synonyms, different derivatives, or words from definitions of words from definitions<ref>Alexander Gelbukh, Grigori Sidorov. Automatic resolution of ambiguity of word senses in dictionary definitions (in Russian). J.
There are a lot of studies concerning Lesk and its extensions<ref>Roberto Navigli. [http://www.dsi.uniroma1.it/~navigli/pubs/ACM_Survey_2009_Navigli.pdf ''Word Sense Disambiguation: A Survey]'', ACM Computing Surveys, 41(2), 2009, pp. 1–69.</ref>:
|