Document clustering: Difference between revisions

m I added a header describing the difference between Clustering and Classification methods, under supervised v. unsupervised learning processes. I also contributed an additional reference: "Introduction to Information Retrieval" by Manning et al.
BG19bot (talk | contribs)
m Clustering v. Classifying: WP:CHECKWIKI error fix for #61. Punctuation goes before References. Do general fixes if a problem exists. - using AWB
Line 26:
 
== Clustering v. Classifying ==
Clustering algorithms in computational text analysis group documents into subsets called ''clusters''; the algorithm's goal is to create internally coherent clusters that are distinct from one another.<ref>{{Cite web|url=http://nlp.stanford.edu/IR-book/|title=Introduction to Information Retrieval|website=nlp.stanford.edu|pages=349|access-date=2016-05-03}}</ref> Classification, on the other hand, is a form of [[supervised learning]] in which a human coder defines the categories in advance, based on [[Inductive reasoning|inductive]], [[Deductive reasoning|deductive]], or [[Abductive reasoning|abductive]] reasoning, and documents are then assigned to those categories. Clustering, by contrast, relies on no supervisory teacher imposing previously derived categories upon the data; it relies only on a distance measure between documents, of which the most commonly used is [[Euclidean distance|Euclidean]].<ref>{{Cite web|url=http://nlp.stanford.edu/IR-book/|title=Introduction to Information Retrieval|website=nlp.stanford.edu|pages=349–50|access-date=2016-05-03}}</ref>
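
The contrast can be illustrated with a minimal sketch. It assumes documents are already represented as term-count vectors over a small hypothetical vocabulary, and it uses a basic k-means procedure and a nearest-centroid classifier as stand-ins; neither choice is prescribed by the cited source, and the function names and toy data are illustrative only. The clustering function receives no labels and works from Euclidean distances alone, while the classifier depends on categories supplied in advance.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    # Component-wise mean (centroid) of a non-empty list of vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(docs, k, iterations=10):
    """Unsupervised: groups documents using distances only, with no labels."""
    centroids = [list(d) for d in docs[:k]]  # naive initialisation (assumption)
    assignment = [0] * len(docs)
    for _ in range(iterations):
        # Assign each document to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: euclidean(d, centroids[c])) for d in docs
        ]
        # Recompute each centroid from its current members.
        for c in range(k):
            members = [d for d, a in zip(docs, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment

def nearest_centroid_classify(train_docs, train_labels, doc):
    """Supervised: the categories come from labels defined in advance."""
    by_label = {}
    for d, label in zip(train_docs, train_labels):
        by_label.setdefault(label, []).append(d)
    centroids = {label: mean(vs) for label, vs in by_label.items()}
    return min(centroids, key=lambda label: euclidean(doc, centroids[label]))

# Toy term counts over a hypothetical vocabulary:
# ["goal", "match", "vote", "election"]
docs = [[3, 2, 0, 0], [2, 3, 0, 1], [0, 0, 3, 2], [0, 1, 2, 3]]

clusters = kmeans(docs, k=2)  # cluster ids carry no meaning of their own
label = nearest_centroid_classify(
    docs, ["sport", "sport", "politics", "politics"], [1, 2, 0, 0]
)
```

In this sketch the cluster identifiers returned by <code>kmeans</code> are arbitrary integers, whereas <code>nearest_centroid_classify</code> returns one of the category names that were imposed on the training documents.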
 
== References ==