Computational measures of semantic relatedness include:
- Latent semantic analysis (LSA): (+) vector-based; vectors can be added to measure multi-word terms; (-) non-incremental vocabulary, long pre-processing times
- Pointwise mutual information (PMI): (+) large vocabulary, because it can use any search engine (such as Google); (-) cannot measure relatedness between whole sentences or documents
- Generalized latent semantic analysis (GLSA): (+) vector-based; vectors can be added to measure multi-word terms; (-) non-incremental vocabulary, long pre-processing times
- ICAN (Incremental Construction of an Associative Network): (+) incremental, network-based measure, well suited to spreading activation, accounts for second-order relatedness; (-) cannot measure relatedness between multi-word terms, long pre-processing times
- Normalized Google distance (NGD): (+) large vocabulary, because it can use any search engine (such as Google); (-) cannot measure relatedness between whole sentences or documents
- WordNet: (+) humanly constructed, so its senses and relations are curated; (-) humanly constructed (not automatically learned), cannot measure relatedness between multi-word terms, non-incremental vocabulary
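As a concrete illustration of the corpus-statistics measures above, the following sketch computes PMI and NGD from term and co-occurrence counts. The formulas are the standard ones; the function names and the toy counts are my own, and a real system would obtain the counts from a search engine or corpus index:

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information of two terms.

    count_xy: documents (or windows) containing both terms
    count_x, count_y: documents containing each term
    total: total number of documents
    Positive values mean the terms co-occur more often than chance.
    """
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))

def ngd(f_x, f_y, f_xy, n):
    """Normalized Google distance from hit counts f(x), f(y), f(x,y)
    and index size n. 0 means the terms always co-occur together;
    larger values mean they are less related.
    """
    log_fx, log_fy, log_fxy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))

# Toy counts (hypothetical): two terms that frequently co-occur
# score as closer than two that rarely do.
close = ngd(1000, 1000, 900, 10**6)
far = ngd(1000, 1000, 10, 10**6)
print(close < far)  # prints True
```

Note how both measures need only frequency counts for the query terms, which is why they scale to a search-engine-sized vocabulary but offer no direct way to compare whole sentences or documents.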
External links
- "The Latent Semantic Indexing home page".

References
- Miller, George A.; Fellbaum, Christiane; Miller, Katherine J. (August 1993). Five Papers on WordNet. Retrieved May 4, 2005.
- Landauer, Thomas; Foltz, P. W.; Laham, D. (1998). "Introduction to Latent Semantic Analysis" (PDF). Discourse Processes. 25: 259–284.