Vector space model: Difference between revisions

 
==Definitions==
In this section we consider a particular vector space model based on the [[Bag-of-words model|bag-of-words]] representation. Documents and queries are represented as vectors:
 
:<math>d_j = ( w_{1,j} ,w_{2,j} , \dotsc ,w_{n,j} )</math>
The definition of ''term'' depends on the application. Typically terms are single words, [[keyword (linguistics)|keyword]]s, or longer phrases. If words are chosen to be the terms, the dimensionality of the vector is the number of words in the vocabulary (the number of distinct words occurring in the [[text corpus|corpus]]).
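
As an illustration of the dimensionality statement above, the following minimal sketch builds a vocabulary from a small corpus and maps each document to a vector with one entry per distinct word. It assumes that terms are lowercase, whitespace-separated words and that the weights are raw term counts; the helper names <code>build_vocabulary</code> and <code>to_vector</code> and the two-document corpus are hypothetical and chosen only for this example.

```python
from collections import Counter


def build_vocabulary(corpus):
    """Return the sorted list of distinct words occurring in the corpus."""
    return sorted({word for doc in corpus for word in doc.lower().split()})


def to_vector(text, vocabulary):
    """Map a text to raw term counts, one entry per vocabulary term.

    Raw counts are an assumption made for this sketch; other weighting
    schemes can be substituted without changing the vector length.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical two-document corpus used only for illustration.
corpus = ["the cat sat", "the dog sat down"]
vocabulary = build_vocabulary(corpus)
vectors = [to_vector(doc, vocabulary) for doc in corpus]
```

Every vector has the same length as <code>vocabulary</code>, so adding a document that contains a new word increases the dimensionality of all vectors by one.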
 
Vector operations can be used to compare documents with queries.<ref>{{Cite book |last=Büttcher |first=Stefan |title=Information retrieval: implementing and evaluating search engines |last2=Clarke |first2=Charles L. A. |last3=Cormack |first3=Gordon V. |date=2016 |publisher=The MIT Press |isbn=978-0-262-52887-0 |edition=First MIT Press paperback edition |location=Cambridge, Massachusetts London, England}}</ref>
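
One such vector operation is the cosine of the angle between two vectors. The sketch below is a minimal example under stated assumptions, not a reference implementation: the function name <code>cosine_similarity</code>, the four-term weight vectors and the document labels <code>d1</code> and <code>d2</code> are hypothetical. It ranks the documents by their similarity to a query vector.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector shares no terms with anything; treat as no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical weight vectors over a four-term vocabulary.
documents = {"d1": [1, 1, 0, 0], "d2": [0, 1, 1, 1]}
query = [1, 1, 0, 0]

# Sort document labels from most to least similar to the query.
ranking = sorted(
    documents,
    key=lambda name: cosine_similarity(documents[name], query),
    reverse=True,
)
```

Because the cosine depends only on the direction of the vectors, a long document and a short document with the same relative term weights receive the same score for a given query.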
 
==Applications==