Nearest centroid classifier: Difference between revisions


Revision as of 22:24, 17 August 2015 by Sweepy (edit summary: supplement)
Latest revision as of 03:09, 17 April 2025 by Citation bot (edit summary: Added bibcode. Suggested by Dominic3203. Linked from User:LinguisticMystic/cs/outline.)
(7 intermediate revisions by 5 users not shown)
Line 1:
{{Short description|A classification model in machine learning based on centroids}}
[[Image:Rocchioclassgraph.jpg|thumb|right|250px|Rocchio Classification]]
In [[machine learning]], a '''nearest centroid classifier''' or '''nearest prototype classifier''' is a [[statistical classification|classification model]] that assigns to observations the label of the class of training samples whose [[mean]] ([[centroid]]) is closest to the observation.

When applied to [[text classification]] using [[vector space model|word vectors]] containing [[tfidf]] weights to represent documents, the nearest centroid classifier is known as the '''Rocchio classifier''' because of its similarity to the [[Rocchio algorithm]] for [[relevance feedback]].<ref>{{cite book
| last1 = Manning
| first1 = Christopher
Line 34 ⟶ 33:
| year = 2002
| doi = 10.1073/pnas.082099299
| pages=6567–6572
| pmid=12011421
| pmc=124443
| doi-access = free
| bibcode = 2002PNAS...99.6567T
}}</ref>

== Algorithm ==

===Training===
Given labeled training samples <math>\textstyle\{(\vec{x}_1, y_1), \dots, (\vec{x}_n, y_n)\}</math> with class labels <math>y_i \in \mathbf{Y}</math>, compute the per-class centroids <math>\textstyle\vec{\mu}_\ell = \frac{1}{|C_\ell|}\underset{i \in C_\ell}{\sum} \vec{x}_i</math> where <math>C_\ell</math> is the set of indices of samples belonging to class <math>\ell \in \mathbf{Y}</math>.
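The per-class centroid computation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name <code>fit_centroids</code> and the toy data are assumptions made for this example and do not come from the article or any library.

```python
import numpy as np

def fit_centroids(X, y):
    """Return (labels, centroids) with one mean vector per distinct class label.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = np.unique(y)  # the label set Y, in sorted order
    # Row k is the mean of all samples whose label equals labels[k],
    # i.e. the sum over the index set C_l divided by |C_l|.
    centroids = np.stack([X[y == label].mean(axis=0) for label in labels])
    return labels, centroids

# Toy data (made up): two classes in the plane.
X = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]]
y = ["a", "a", "b", "b"]
labels, centroids = fit_centroids(X, y)
```

On this toy data, class <code>"a"</code> gets the centroid (1, 0) and class <code>"b"</code> gets (5, 4). Each centroid is computed once, so training takes a single pass over the samples of each class.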
===Prediction===
The class assigned to an observation <math>\vec{x}</math> is <math>\hat{y} = {\arg\min}_{\ell \in \mathbf{Y}} \|\vec{\mu}_\ell - \vec{x}\|</math>.

== See also ==
* [[Cluster hypothesis]]
* [[K-means clustering|''k''-means clustering]]
* [[K-nearest neighbors algorithm|''k''-nearest neighbor algorithm]]
* [[Linear discriminant analysis]]