==''k''-NN outlier==
The distance to the ''k''th nearest neighbor can also be interpreted as a local density estimate, and is therefore a popular outlier score in [[anomaly detection]]: the larger the distance to the ''k''th nearest neighbor, the lower the local density and the more likely the query point is an outlier.<ref>{{cite conference | doi = 10.1145/342009.335437| title = Efficient algorithms for mining outliers from large data sets| conference = Proceedings of the 2000 ACM SIGMOD international conference on Management of data – SIGMOD '00| pages = 427| year = 2000| last1 = Ramaswamy | first1 = S. | last2 = Rastogi | first2 = R. | last3 = Shim | first3 = K. | isbn = 1-58113-217-4}}</ref> Despite its simplicity, this outlier model, along with another classic data mining method, [[local outlier factor]], performs well in comparison to more recent and more complex approaches, according to a large-scale experimental analysis.<ref name="CamposZimek2016">{{cite journal|last1=Campos|first1=Guilherme O.|last2=Zimek|first2=Arthur|last3=Sander|first3=Jörg|last4=Campello|first4=Ricardo J. G. B.|last5=Micenková|first5=Barbora|last6=Schubert|first6=Erich|last7=Assent|first7=Ira|last8=Houle|first8=Michael E.|title=On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study|journal=Data Mining and Knowledge Discovery|year=2016|issn=1384-5810|doi=10.1007/s10618-015-0444-8}}</ref>
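
The score described above can be sketched as follows. This is a minimal brute-force illustration, not the algorithm of the cited paper (which uses indexing and pruning for efficiency); the function name and the toy data are hypothetical.

```python
import numpy as np

def knn_outlier_scores(X, k):
    """Outlier score of each point = distance to its k-th nearest neighbor.

    Brute-force O(n^2) sketch; in practice a k-d tree or ball tree
    would be used for larger data sets.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row ascending; index 0 is the point itself (distance 0),
    # so index k holds the distance to the k-th nearest *other* point.
    dist.sort(axis=1)
    return dist[:, k]

# Usage: a tight cluster plus one distant point.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
scores = knn_outlier_scores(points, k=2)
# The isolated point has the lowest local density, hence the largest score.
print(int(np.argmax(scores)))  # → 4
```

The choice of ''k'' controls how local the density estimate is: a small ''k'' reacts to single isolated points, while a larger ''k'' smooths over small clusters of anomalies.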
 
==Validation of results==