'''Text segmentation''' is the process of dividing written text into
The problem may appear relatively trivial for written languages that have explicit word boundary markers, such as the word spaces of written [[English language|English]] or the distinctive initial, medial and final letter shapes of [[Arabic language|Arabic]]. When such cues are not consistently available, the task often requires more sophisticated techniques, such as statistical decision-making, large dictionaries, and consideration of syntactic and semantic constraints.
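
As a rough illustration of the dictionary-based approach mentioned above, the following sketch applies greedy forward maximum matching to a string with no word spaces. It is a minimal example under stated assumptions, not a production segmenter: the function name <code>max_match</code> and the toy word list <code>TOY_DICTIONARY</code> are hypothetical and chosen only for illustration.

```python
def max_match(text, dictionary):
    """Greedy forward maximum matching over an unspaced string.

    `dictionary` is a set of known words (a hypothetical toy lexicon here).
    """
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No dictionary word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy lexicon; a real system would use a large dictionary.
TOY_DICTIONARY = {"the", "table", "down", "there", "them", "me"}

print(max_match("thetabledownthere", TOY_DICTIONARY))
# prints ['the', 'table', 'down', 'there']
```

Because the matcher always commits to the longest word at each position, it can choose a locally plausible but globally wrong split, which is one reason practical systems tend to combine dictionaries with statistical scoring and syntactic or semantic constraints.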