Text normalization: Difference between revisions

m And linked directly so that the automatic redirection is bypassed.
Line 12:
* removing [[stopwords]] or "too common" words
* [[stemming]]
* [[canonicalization]]
 
Although text normalization can be done manually, and usually is for ad hoc and personal documents, many [[programming language]]s provide mechanisms that automate it.
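
As an illustration only, the three steps listed above can be sketched in Python using nothing but the standard library. The stopword set, the suffix list, and the function names `canonicalize`, `naive_stem`, and `normalize` are all assumptions made for this example. The stemmer in particular is a crude suffix stripper and not an established stemming algorithm, and the tokenizer keeps only ASCII letters and digits for simplicity.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are"}

# Hypothetical suffixes for the naive stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def canonicalize(text):
    """Map equivalent spellings to one form: NFKC folds compatibility
    characters (e.g. full-width letters), casefold removes case differences."""
    return unicodedata.normalize("NFKC", text).casefold()


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters.
    This is a crude stand-in for a real stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def normalize(text):
    """Canonicalize, tokenize on ASCII letters/digits, drop stopwords, stem."""
    tokens = re.findall(r"[a-z0-9]+", canonicalize(text))
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


print(normalize("The cats are JUMPING in the garden"))
# → ['cat', 'jump', 'garden']
```

Canonicalization runs first so that the stopword lookup and the suffix checks see a single, lowercase form of each word. A production system would more likely rely on a dedicated library for stopword lists and stemming than on hand-written rules like these.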