Text segmentation: Difference between revisions

张开旭 (talk | contribs)
m simplify and make less assumptions about the reader's background knowledge
Line 1:
'''Text segmentation''' is the process of dividing written text into meaningful units, such as [[sentence]]s or [[topic]]s. The term applies to [[human mind|mental]] processes used by humans when reading text, and to artificial processes implemented in [[computers]], which are the subject of [[natural language processing]]. The problem is non-trivial, because while some written languages have explicit word boundary markers, such as the word spaces of written [[English language|English]] and the distinctive initial, medial and final letter shapes of [[Arabic language|Arabic]], such signals are sometimes ambiguous and not present in all written languages.
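
The limits of explicit boundary markers can be illustrated with a minimal sketch. It assumes a naive baseline that treats whitespace as the only word boundary; the function name <code>whitespace_segment</code> and both sample strings are hypothetical illustrations, not taken from any particular library or from the article's sources.

```python
def whitespace_segment(text: str) -> list[str]:
    """Naive baseline: treat every run of whitespace as a word boundary."""
    return text.split()


# English has word spaces, but they are an imperfect signal:
# "New York" is one name split in two, and "far." keeps its full stop.
english_tokens = whitespace_segment("New York isn't far.")

# Chinese is written without word spaces (illustrative example),
# so the whole sentence comes back as a single unsegmented unit.
chinese_tokens = whitespace_segment("我喜欢读书")

print(english_tokens)
print(chinese_tokens)
```

In this sketch the English sentence yields four pieces that do not line up with its meaningful units, while the Chinese sentence is not divided at all, which is why practical segmenters rely on more than spacing.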
 
Compare [[speech segmentation]], the process of dividing speech into linguistically meaningful portions.