==''k''-NN outlier==
The distance to the ''k''th nearest neighbor can also be interpreted as a local density estimate, and is therefore a popular outlier score in [[anomaly detection]]: the larger the distance to the ''k''th nearest neighbor, the lower the local density and the more likely the query point is an outlier.<ref>{{cite conference | doi = 10.1145/342009.335437| title = Efficient algorithms for mining outliers from large data sets| conference = Proceedings of the 2000 ACM SIGMOD international conference on Management of data – SIGMOD '00| pages = 427| year = 2000| last1 = Ramaswamy | first1 = S. | last2 = Rastogi | first2 = R. | last3 = Shim | first3 = K. | isbn = 1-58113-217-4}}</ref> Despite its simplicity, this outlier model, along with another classic data mining method, [[local outlier factor]], performs well in comparison to more recent and more complex approaches, according to a large-scale experimental analysis.<ref name="CamposZimek2016">{{cite journal|last1=Campos|first1=Guilherme O.|last2=Zimek|first2=Arthur|last3=Sander|first3=Jörg|last4=Campello|first4=Ricardo J. G. B.|last5=Micenková|first5=Barbora|last6=Schubert|first6=Erich|last7=Assent|first7=Ira|last8=Houle|first8=Michael E.|title=On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study|journal=Data Mining and Knowledge Discovery|year=2016|issn=1384-5810|doi=10.1007/s10618-015-0444-8}}</ref>
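
The score described above can be sketched as follows. This is a minimal brute-force illustration, not the algorithm of the cited paper (which uses indexing and pruning for efficiency); the function name and the toy data are hypothetical.

```python
import numpy as np

def knn_outlier_scores(X, k):
    """Outlier score of each point = distance to its k-th nearest neighbor.

    Brute-force O(n^2) sketch; in practice a k-d tree or ball tree
    would be used for larger data sets.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row ascending; index 0 is the point itself (distance 0),
    # so index k holds the distance to the k-th nearest *other* point.
    dist.sort(axis=1)
    return dist[:, k]

# Usage: a tight cluster plus one distant point.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
scores = knn_outlier_scores(points, k=2)
# The isolated point has the lowest local density, hence the largest score.
print(int(np.argmax(scores)))  # → 4
```

The choice of ''k'' controls how local the density estimate is: a small ''k'' reacts to single isolated points, while a larger ''k'' smooths over small clusters of anomalies.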
 
==Validation of results==