reference needed |
Undid revision 599956549 by 161.64.89.126 (talk) ner crf in 2013 not notable / relevant here |
Line 12:
In [[English language|English]] and many other languages using some form of the [[Latin alphabet]], the [[Space (punctuation)|space]] is a good approximation of a [[word divider]] (word [[delimiter]]). (The space character alone is not always sufficient; for example, a contraction such as ''can't'' for ''cannot'' is written without an internal space.)
However, an equivalent of this character is not found in all written scripts, and without it word segmentation is a difficult problem. Languages that do not have a trivial word segmentation process include [[Chinese language|Chinese]]
In some writing systems, however, such as the [[Ge'ez script]] used for [[Amharic]] and [[Tigrinya]] among other languages, words are explicitly delimited (at least historically) with a non-[[Space (punctuation)|whitespace]] character.
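
The contrast between these cases can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names <code>split_on_space</code> and <code>split_on_wordspace</code> and the sample strings are hypothetical, and the Ge'ez delimiter is assumed to be the Unicode character U+1361 (ETHIOPIC WORDSPACE). It is not a real segmenter for any of these languages.

```python
# Assumed delimiter for Ge'ez-script text: U+1361 ETHIOPIC WORDSPACE.
ETHIOPIC_WORDSPACE = "\u1361"


def split_on_space(text):
    """Naive segmentation: treat runs of whitespace as word dividers."""
    return text.split()


def split_on_wordspace(text):
    """Split on the explicit, non-whitespace Ethiopic word divider."""
    return [word for word in text.split(ETHIOPIC_WORDSPACE) if word]


# English: the space is a good approximation, but the contraction
# "can't" stays a single token.
english = split_on_space("the cat can't sleep")

# Chinese: there are no spaces, so the whole sentence comes back as
# one unsegmented token.
chinese = split_on_space("\u6211\u559c\u6b22\u732b")

# Two illustrative Ge'ez-script words joined by the wordspace character.
geez = split_on_wordspace("\u1230\u120b\u121d" + ETHIOPIC_WORDSPACE + "\u12d3\u1208\u121d")

print(english)
print(len(chinese))
print(len(geez))
```

Splitting on whitespace leaves the Chinese sample as a single token, which is why such languages generally need dictionary-based or statistical segmentation, whereas an explicit delimiter makes segmentation as simple as it is for spaced text.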