→Chatterbots: linking to the main list of chatbots |
mNo edit summary |
Line 238:
* [[Truecasing]] –
* [[Word segmentation]] – divides a chunk of continuous text into individual words. For a language like [[English language|English]], this is fairly trivial, since words are usually separated by spaces. However, some written languages, such as [[Chinese language|Chinese]], [[Japanese language|Japanese]] and [[Thai language|Thai]], do not mark word boundaries in this way, and in those languages word segmentation is a significant task requiring knowledge of the [[vocabulary]] and [[morphology (linguistics)|morphology]] of words in the language.
* [[Word
** [[Word-sense induction]] – open problem of natural language processing concerning the automatic identification of the senses (i.e. meanings) of a word. Because its output is a set of senses for the target word (a sense inventory), the task is closely related to word-sense disambiguation (WSD), which relies on a predefined sense inventory and aims to resolve the ambiguity of words in context.
** [[Automatic acquisition of sense-tagged corpora]] –
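To illustrate why word segmentation needs vocabulary knowledge when boundaries are not marked, the following is a minimal sketch of greedy forward maximum matching, one simple dictionary-based approach. The function name `max_match` and the toy vocabulary `VOCAB` are hypothetical and chosen only for illustration; English text with its spaces removed stands in for a language that does not mark word boundaries.

```python
def max_match(text, vocabulary):
    """Greedy forward maximum matching.

    At each position, take the longest vocabulary entry that matches;
    if nothing matches, fall back to a single character.
    """
    max_len = max((len(word) for word in vocabulary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy vocabulary (an assumption, not taken from any real lexicon).
VOCAB = {"the", "theta", "table", "bled", "down", "own", "there"}

print(max_match("thetabledownthere", VOCAB))
# -> ['theta', 'bled', 'own', 'there']
```

The greedy result differs from the intended reading "the table down there", which suggests why purely dictionary-driven matching is not enough and why practical segmenters also draw on context or statistical models.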