added 'unreferenced' template |
Line 8:
* converting all letters to lower or upper case
* removing punctuation
* removing
* expanding abbreviations
* removing [[stopwords]] or "too common" words
Although normalization can be done manually, as it usually is for ad hoc and personal documents, many [[programming language]]s provide mechanisms that automate it.
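
As a rough illustration of how several of the steps listed above might be combined, the following Python sketch lowercases the text, expands abbreviations, strips punctuation, and drops stop words. The function name <code>normalize</code> and the two lookup tables are hypothetical and chosen only for this example; real applications typically use much larger, language-specific lists.

```python
import string

# Hypothetical lookup tables, for illustration only (not from any standard source).
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "approx.": "approximately"}
STOPWORDS = {"a", "an", "the", "of", "on", "is", "and", "for"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> list[str]:
    """Return a list of normalized tokens for the given text."""
    # 1. Convert all letters to lower case and split on whitespace.
    tokens = text.lower().split()
    # 2. Expand abbreviations before punctuation is removed,
    #    because the table keys include the trailing period.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in tokens]
    # 3. Remove punctuation from each token.
    tokens = [tok.translate(_PUNCT_TABLE) for tok in tokens]
    # 4. Drop empty tokens and stop words.
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


if __name__ == "__main__":
    print(normalize("Dr. Smith lives on Main St. near the Park!"))
```

The order of the steps matters in this sketch: abbreviations are matched as whole whitespace-separated tokens, so an abbreviation immediately followed by other punctuation (for example <code>St.,</code>) would not be expanded. A more robust approach would need a proper tokenizer.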