Text segmentation: Difference between revisions

{{Refimprove|date=October 2011}}
 
'''Text segmentation''' is the process of dividing [[writing|written text]] into meaningful units, such as [[word]]s, [[Sentence (linguistics)|sentence]]s, or [[topic (linguistics)|topic]]s. The term applies both to [[human mind|mental]] processes used by humans when reading text, and to artificial processes implemented in [[computers]], which are the subject of [[natural language processing]]. The problem is non-trivial: some written languages have explicit word boundary markers, such as the word spaces of written [[English language|English]] and the distinctive initial, medial and final letter shapes of [[Arabic language|Arabic]], but such signals are sometimes ambiguous and are not present in all written languages.
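The difficulty can be illustrated with a minimal Python sketch. It is not a reference implementation: the function names and the toy lexicon are hypothetical, chosen only for illustration. Splitting on spaces works when boundaries are marked, whereas text without boundary markers needs some other strategy. Greedy longest-match against a word list, used here as one simple possibility, can choose the wrong boundaries when several readings are possible.

```python
def split_on_spaces(text):
    """Segment text into words using whitespace as the boundary marker."""
    return text.split()


def greedy_longest_match(text, lexicon):
    """Segment unspaced text by repeatedly taking the longest lexicon entry.

    Characters that start no lexicon entry are emitted as one-character tokens.
    """
    tokens = []
    i = 0
    max_len = max(map(len, lexicon), default=1)
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No lexicon entry starts here: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"the", "them", "me", "men", "dine", "in", "here"}

spaced = split_on_spaces("the men dine here")
unspaced = greedy_longest_match("themendinehere", LEXICON)

print(spaced)    # ['the', 'men', 'dine', 'here']
print(unspaced)  # ['them', 'e', 'n', 'dine', 'here']
```

In the unspaced case the greedy strategy commits to "them" and cannot recover the intended reading "the men", which shows why boundary detection is ambiguous when explicit markers are absent.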
 
Compare [[speech segmentation]], the process of dividing speech into linguistically meaningful portions.