[[Feature extraction]] and dimension reduction can be combined in one step using [[Principal Component Analysis|principal component analysis]] (PCA), [[linear discriminant analysis]] (LDA), or [[Canonical correlation|canonical correlation analysis]] (CCA) techniques as a pre-processing step, followed by ''k''-NN classification on [[Feature (machine learning)|feature vectors]] in reduced-dimension space. This process is also called low-dimensional [[embedding]].<ref>{{citation |last1=Shaw |first1=Blake |last2=Jebara |first2=Tony |title=Structure preserving embedding |work=Proceedings of the 26th Annual International Conference on Machine Learning |year=2009 |pages=1–8 |publication-date=June 2009 |url=http://www.cs.columbia.edu/~jebara/papers/spe-icml09.pdf |doi=10.1145/1553374.1553494 |isbn=9781605585161 |s2cid=8522279}}</ref>
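For illustration, the following sketch combines PCA with a ''k''-NN classifier using the [[scikit-learn]] library; the dataset, the number of retained components, and ''k'' are arbitrary example choices, not recommendations:

<syntaxhighlight lang="python">
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Illustrative dataset: 1797 samples with 64 features each.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Project the 64 original features onto a 16-dimensional PCA embedding
# (illustrative choice), then classify in the reduced space with k-NN (k = 5).
model = make_pipeline(PCA(n_components=16), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on the held-out split
</syntaxhighlight>

Because the pipeline fits PCA on the training split only, test points are projected with an embedding learned without them, mirroring how the method would be applied to genuinely new queries.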
For very-high-dimensional datasets (e.g. when performing a similarity search on live video streams, DNA data or high-dimensional [[time series]]), running a fast '''approximate''' ''k''-NN search using [[Locality-sensitive hashing|locality-sensitive hashing]], "random projections",<ref>{{citation |last1=Bingham |first1=Ella |last2=Mannila |first2=Heikki |title=Random projection in dimensionality reduction: applications to image and text data |work=Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining |publisher=ACM |year=2001 |url=https://citeseerx.ist.psu.edu/doc_view/pid/aed77346f737b0ed5890b61ad02e5eb4ab2f3dc6}}</ref> "sketches",<ref>{{citation |editor-last=Ryan |editor-first=Donna |title=High Performance Discovery in Time Series |location=Berlin |publisher=Springer |year=2004 |isbn=0-387-00857-8}}</ref> or other high-dimensional similarity search techniques from the [[VLDB conference|VLDB]] toolbox might be the only feasible option.
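As an illustrative sketch of the random-projection idea (not a production index), the following [[NumPy]]-only example hashes each point by the signs of a small number of random projections, so that points separated by a small angle tend to fall into the same bucket, and then performs an exact search only inside the query's bucket; the data and all parameter values are placeholders:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n, d, bits = 10_000, 128, 8          # placeholder corpus size, dimension, hash length
data = rng.standard_normal((n, d))   # placeholder data

# Sign-of-random-projection hashing: each point maps to the sign pattern of
# `bits` random projections, so points with a small angle tend to collide.
planes = rng.standard_normal((d, bits))

def hash_points(x):
    # Pack the projection signs of each row of x into a single integer.
    return (x @ planes > 0) @ (1 << np.arange(bits))

buckets = {}
for i, h in enumerate(hash_points(data)):
    buckets.setdefault(int(h), []).append(i)

def approx_knn(query, k=5):
    # Exact distance computation restricted to the query's hash bucket.
    cand = np.array(buckets.get(int(hash_points(query[None, :])[0]), []))
    if cand.size == 0:
        return cand
    dists = np.linalg.norm(data[cand] - query, axis=1)
    return cand[np.argsort(dists)[:k]]

print(approx_knn(data[0]))  # indices of approximate nearest neighbours of point 0
</syntaxhighlight>

A practical locality-sensitive hashing index would maintain several independent hash tables and probe neighbouring buckets, trading extra candidate comparisons for higher recall.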
== Decision boundary ==