<ref>Beyer, Kevin, et al. [https://minds.wisconsin.edu/bitstream/handle/1793/60174/TR1377.pdf?sequence=1 "When is 'nearest neighbor' meaningful?"]. Database Theory—ICDT'99, 1999, pp. 217–235.</ref>
 
The curse of dimensionality in the ''k''-NN context means that [[Euclidean distance]] becomes unhelpful in high dimensions because all vectors are almost equidistant from the query vector (imagine many points lying roughly on a circle with the query point at the center: the distance from the query to every data point in the search space is nearly the same).<ref>{{Cite journal|last=Angiulli|first=Fabrizio|date=2018|title=On the Behavior of Intrinsically High-Dimensional Spaces: Distances, Direct and Reverse Nearest Neighbors, and Hubness|url=http://jmlr.csail.mit.edu/papers/v18/17-151.html|journal=Journal of Machine Learning Research (JMLR)|volume=18|issue=170|pages=1–60}}</ref>
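This concentration of distances can be illustrated with a short NumPy sketch; the uniform sampling over the unit hypercube and the sample sizes below are illustrative assumptions, not part of the ''k''-NN algorithm itself. As the dimension grows, the ratio between the farthest and the nearest Euclidean distance from a query point tends toward 1.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Compare the spread of distances in low- vs. high-dimensional space.
# Points and the query are drawn uniformly from the unit hypercube
# (an illustrative choice for this demonstration only).
for dim in (2, 10, 100, 1000):
    points = rng.random((10_000, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    # As dim grows, the ratio approaches 1: all points look almost equidistant.
    print(f"dim={dim:5d}  farthest/nearest distance ratio = "
          f"{dists.max() / dists.min():.2f}")
</syntaxhighlight>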
 
[[Feature extraction]] and dimension reduction can be combined in one step using [[Principal Component Analysis|principal component analysis]] (PCA), [[linear discriminant analysis]] (LDA), or [[Canonical correlation|canonical correlation analysis]] (CCA) techniques as a pre-processing step, followed by clustering by ''k''-NN on [[Feature (machine learning)|feature vectors]] in reduced-dimension space. In [[machine learning]], this process is also called low-dimensional [[embedding]].<ref>Shaw, Blake, and Tony Jebara. "Structure preserving embedding". Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009.</ref>
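A minimal sketch of this pre-processing pattern, assuming the scikit-learn library and its bundled digits dataset (both illustrative choices, not prescribed above): PCA projects the 64-dimensional feature vectors to a lower-dimensional space before the ''k''-NN step is applied.

<syntaxhighlight lang="python">
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# 64-dimensional feature vectors (8x8 pixel images) and their class labels.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Reduce the dimension with PCA, then run k-NN on the projected feature vectors.
# n_components and n_neighbors are illustrative values.
model = make_pipeline(PCA(n_components=16), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
</syntaxhighlight>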