Text normalization: Difference between revisions

While this may be done manually, and usually is in the case of ad hoc and personal documents, many [[programming language]]s support mechanisms which enable text normalization.
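
As one illustration of such a mechanism, the sketch below uses the <code>unicodedata</code> module from Python's standard library to bring two differently encoded spellings of the same word into a single form. The helper name <code>normalize_unicode</code> and the sample strings are assumptions made for this example. The choice of NFC combined with case folding is only one reasonable option, not a prescribed method.

```python
import unicodedata


def normalize_unicode(text: str) -> str:
    """Return text in NFC form, case-folded for caseless comparison.

    Hypothetical helper for illustration only.
    """
    # NFC composes base characters and combining marks where possible.
    composed_form = unicodedata.normalize("NFC", text)
    # casefold() is a more aggressive lower() intended for comparisons.
    return composed_form.casefold()


composed = "Caf\u00e9"      # 'é' stored as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# The raw strings differ, but their normalized forms compare as equal.
same_after_normalizing = normalize_unicode(composed) == normalize_unicode(decomposed)
```

Without normalization the two strings compare as unequal, even though they are rendered identically.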
 
Text normalization is useful, for example, for comparing two sequences of characters that mean the same thing but are represented differently. Examples of this kind of normalization include, but are not limited to, "don't" vs. "do not", "I'm" vs. "I am", and "can't" vs. "cannot".
 
Similarly, "1" and "one" denote the same value, and "1st" is equivalent to "first". Rather than treating such strings as different, text processing can map them to a common form so that they compare as equal.
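
The equivalences described above can be sketched with a small lookup table. The table <code>EQUIVALENTS</code>, the pattern <code>TOKEN</code>, and the function <code>normalize</code> are hypothetical names chosen for this example, and the table covers only the pairs mentioned in the text. A practical system would need a much larger vocabulary and some handling of context.

```python
import re

# Hypothetical, deliberately tiny table covering only the pairs named above.
EQUIVALENTS = {
    "don't": "do not",
    "i'm": "i am",
    "can't": "cannot",
    "1": "one",
    "1st": "first",
}

# A token is a run of letters or digits, optionally followed by an
# apostrophe and more letters (as in "don't").
TOKEN = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def normalize(text: str) -> str:
    """Lower-case the text, split it into tokens, and expand known variants."""
    # Treat the typographic apostrophe (U+2019) like the ASCII one.
    cleaned = text.lower().replace("\u2019", "'")
    tokens = TOKEN.findall(cleaned)
    # Tokens without an entry in the table are kept unchanged.
    return " ".join(EQUIVALENTS.get(token, token) for token in tokens)
```

With this sketch, <code>normalize("Don't")</code> and <code>normalize("do not")</code> return the same string, so the two spellings compare as equal after normalization.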
 
[[Category:Unicode]]