Document-term matrix: Difference between revisions

WikiCleanerBot (talk | contribs)
m v2.04b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation)
Line 14:
|'''D2'''||1||0||1||1
|}
which shows which documents contain which terms and how many times each term appears. The approach is similar to using an [[incidence matrix]] to analyse the sentences of a corpus.<ref>Bryan Bischof. Higher order co-occurrence tensors for hypergraphs via face-splitting. Published 15 February 2020, Mathematics, Computer Science, [https://arxiv.org/abs/2002.06285 ArXiv]</ref>
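As a minimal sketch of how such a count matrix could be built, the following Python code tokenizes each document on whitespace and counts term occurrences. The two example documents, the function name <code>document_term_matrix</code>, and the column order (terms in order of first appearance) are assumptions chosen for illustration, not part of any particular library.

```python
from collections import Counter


def document_term_matrix(documents):
    """Return (vocabulary, rows), where rows[i][j] counts term j in document i."""
    # Naive whitespace tokenization; real pipelines usually normalize text first.
    tokenized = [doc.split() for doc in documents]

    # Collect terms in order of first appearance so the columns are reproducible.
    vocabulary = []
    for tokens in tokenized:
        for token in tokens:
            if token not in vocabulary:
                vocabulary.append(token)

    # One row per document, one column per vocabulary term.
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Hypothetical two-document corpus used only for illustration.
documents = ["I like databases", "I dislike databases"]
vocabulary, matrix = document_term_matrix(documents)
```

Each row of <code>matrix</code> corresponds to one document and each column to one entry of <code>vocabulary</code>; a term that does not occur in a document simply gets a count of 0.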
 
The raw counts can be replaced by more sophisticated weights; a typical example is [[tf-idf]].
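To illustrate such a weighting, the sketch below rescales a count matrix with one common tf-idf variant: raw term count multiplied by <math>\log(N / \mathrm{df})</math>, where <math>N</math> is the number of documents and <math>\mathrm{df}</math> is the number of documents containing the term. Other variants (smoothed, normalized) exist, and the function name <code>tf_idf</code> and the input matrix are assumptions made for this example.

```python
import math


def tf_idf(matrix):
    """Weight a document-term count matrix by tf * log(N / df)."""
    n_docs = len(matrix)
    n_terms = len(matrix[0])

    # Document frequency: in how many documents does each term occur?
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]

    # Inverse document frequency; terms that never occur get weight 0.
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]

    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


# Hypothetical counts for two documents over four terms.
counts = [[1, 1, 1, 0],
          [1, 0, 1, 1]]
weighted = tf_idf(counts)
```

With this variant, a term that occurs in every document receives weight 0, because it does not help to tell the documents apart, while a term confined to a single document keeps a positive weight.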