Revision as of 01:16, 28 July 2016 by Oshwah: Reverted edits by 186.223.223.194 (HG) (3.1.21)
Revision as of 05:17, 30 November 2016 by 65.96.156.161: No edit summary (Tags: references removed, Visual edit)
Line 1:
{{machine learning bar}}
In [[machine learning]], a '''probabilistic classifier''' is a [[statistical classification\|classifier]] that is able to predict, given a sample input, a [[probability distribution]] over a [[Set (mathematics)\|set]] of classes, rather than only outputting the most likely class that the sample should belong to. Probabilistic classifiers provide classification with a degree of certainty, which can be useful in its own right,<ref>{{cite book \|first1=Trevor \|last1=Hastie \|first2=Robert \|last2=Tibshirani \|first3=Jerome \|last3=Friedman \|year=2009 \|title=The Elements of Statistical Learning \|url=http://statweb.stanford.edu/~tibs/ElemStatLearn/ \|page=348 \|quote=[I]n [[data mining]] applications the interest is often more in the class probabilities <math>p_\ell(x), \ell = 1, \dots, K</math> themselves, rather than in performing a class assignment.}}</ref> or when combining classifiers into [[ensemble classifier\|ensembles]].

==Types of classification==

Line 28 ⟶ 27:
For the [[binary classification\|binary]] case, a common approach is to apply [[Platt scaling]], which learns a [[logistic regression]] model on the scores.<ref name="platt99">{{cite journal \|last=Platt \|first=John \|authorlink=John Platt (computer scientist) \|title=Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods \|journal=Advances in large margin classifiers \|volume=10 \|issue=3 \|year=1999 \|pages=61–74 \|url=http://www.researchgate.net/publication/2594015_Probabilistic_Outputs_for_Support_Vector_Machines_and_Comparisons_to_Regularized_Likelihood_Methods/file/504635154cff5262d6.pdf}}</ref> An alternative method using [[isotonic regression]]<ref>{{Cite book \| last1 = Zadrozny \| first1 = Bianca\| last2 = Elkan \| first2 = Charles\| doi = 10.1145/775047.775151 \| chapter = Transforming classifier scores into accurate multiclass probability estimates \| url = 
http://www.cs.cornell.edu/courses/cs678/2007sp/ZadroznyElkan.pdf\| title = Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02 \| pages = 694–699\| year = 2002 \| isbn = 1-58113-567-X\| pmid = \| pmc = \| id = [[CiteSeerX]]: {{URL\|1=citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.7457\|2=10.1.1.13.7457}}}}</ref> is generally superior to Platt's method when sufficient training data is available.<ref name="Niculescu"/> In the [[multiclass classification\|multiclass]] case, one can use a reduction to binary tasks, followed by univariate calibration with one of the algorithms described above, and then apply the pairwise coupling algorithm of Hastie and Tibshirani.<ref>{{Cite journal \| last1 = Hastie \| first1 = Trevor\| last2 = Tibshirani \| first2 = Robert\| doi = 10.1214/aos/1028144844 \| title = Classification by pairwise coupling \| journal = [[The Annals of Statistics]] \| volume = 26 \| issue = 2 \| pages = 451–471\| year = 1998 \| pmid = \| pmc = \| zbl = 0932.62071\| id = [[CiteSeerX]]: {{URL\|1=citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.6032\|2=10.1.1.46.6032}}}}</ref>
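
The two binary calibration approaches mentioned above can be illustrated with a minimal sketch. It is not taken from the cited sources: the function names (<code>fit_platt</code>, <code>platt_predict</code>, <code>fit_isotonic</code>), the toy scores and labels, and the learning-rate settings are all assumptions made for illustration. The Platt-style fit here uses plain gradient descent on the log loss and omits the target smoothing and the optimizer described in Platt's paper; the isotonic fit uses the pool-adjacent-violators procedure, a standard way to compute an isotonic regression.

```python
import math

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p(y=1 | s) = 1 / (1 + exp(-(a*s + b))) by gradient descent on log loss.

    Simplified sketch: no target smoothing, no second-order optimizer.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s   # d(log loss)/da for this sample
            grad_b += (p - y)       # d(log loss)/db for this sample
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_predict(a, b, score):
    """Map a raw classifier score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: non-decreasing fit of labels against sorted scores.

    Returns the sorted scores and one fitted probability per sorted score.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block holds [sum_of_labels, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted

# Hypothetical uncalibrated scores (e.g. SVM margins) with true binary labels.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
sorted_scores, iso_probs = fit_isotonic(scores, labels)
```

With this toy data the logistic fit yields a smooth, strictly increasing map from score to probability, while the isotonic fit yields a step function that pools the two out-of-order samples in the middle. In practice the calibration map would be fitted on held-out data rather than on the data used to train the classifier.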

Probabilistic classification: Difference between revisions