Document-term matrix: Difference between revisions

Janislaw (talk | contribs)
Danielx (talk | contribs)
improved the intro based on the description found in http://en.wikipedia.org/wiki/Latent_semantic_analysis#Occurrence_matrix
Line 1:
A '''document-term matrix''' is a mathematical [[Matrix (mathematics)|matrix]] that describes the frequency of terms that occur in a collection of documents. Each row corresponds to a document in the collection, and each column corresponds to a word or term; the transposed layout, with terms as rows and documents as columns, is usually called a term-document matrix. There are various schemes for determining the value that each entry in the matrix should take. One such scheme is [[tf-idf]]. Such matrices are useful in the field of [[natural language processing]].
A '''document-term matrix''' is used in [[natural language processing]] programs. It represents natural language documents as a mathematical object (a [[matrix (mathematics)|matrix]]) and makes it possible to process them as a whole.
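
As a rough illustration of the idea, the sketch below builds a small matrix of raw term counts and then reweights it with one common tf-idf variant. It is only a minimal example under stated assumptions: the function names <code>build_document_term_matrix</code> and <code>tf_idf</code> are hypothetical, tokenization is a naive lowercase whitespace split, the layout uses one row per document and one column per term (transpose it for the term-document layout), and the weighting <code>tf * log(N / df)</code> is just one of several schemes in use.

```python
from collections import Counter
import math


def build_document_term_matrix(documents):
    """Return (vocabulary, matrix): one row per document, one column per term."""
    # Assumption for illustration: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives every term a stable column index.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for terms that do not occur in this document.
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def tf_idf(matrix):
    """Reweight raw counts as tf * log(N / df), one common tf-idf variant."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [
        [row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for row in matrix
    ]


docs = ["I like databases", "I hate databases"]
vocabulary, dtm = build_document_term_matrix(docs)
weighted = tf_idf(dtm)

print(vocabulary)
for row in dtm:
    print(row)
```

With this weighting, terms that appear in every document (here "i" and "databases") receive a weight of zero, while terms that distinguish one document from another keep a positive weight.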
 
==General Concept==