K-nearest neighbors algorithm

:* In ''k-NN classification'', the output is a class membership. An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its ''k'' nearest neighbors (''k'' is a positive [[integer]], typically small). If ''k'' = 1, then the object is simply assigned to the class of that single nearest neighbor.
 
:* In ''k-NN regression'', the output is the property value for the object. This value is the average of the values of the ''k'' nearest neighbors. If ''k'' = 1, then the output is simply assigned to the value of that single nearest neighbor.
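The two decision rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the choice of Euclidean distance are assumptions for the example, not prescribed by the article.

```python
from collections import Counter
import math

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, query, k):
    # train: list of (feature_vector, class_label) pairs.
    # Classify by plurality vote among the k nearest neighbors.
    neighbors = sorted(train, key=lambda p: euclidean(p[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def knn_regress(train, query, k):
    # train: list of (feature_vector, value) pairs.
    # Predict the average of the values of the k nearest neighbors.
    neighbors = sorted(train, key=lambda p: euclidean(p[0], query))[:k]
    return sum(value for _, value in neighbors) / k
```

With ''k'' = 1 both functions reduce to copying the label or value of the single nearest neighbor, matching the special case described above.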
 
''k''-NN is a type of [[instance-based learning]], or [[lazy learning]], where the function is only approximated locally and all computation is deferred until function evaluation. Since this algorithm relies on distance for classification, if the features represent different physical units or come in vastly different scales, then [[Normalization (statistics)|normalizing]] the training data can improve its accuracy dramatically.<ref name=":0">{{Cite journal|last1=Piryonesi S. Madeh|last2=El-Diraby Tamer E.|date=2020-06-01|title=Role of Data Analytics in Infrastructure Asset Management: Overcoming Data Size and Quality Problems|journal=Journal of Transportation Engineering, Part B: Pavements|volume=146|issue=2|pages=04020022|doi=10.1061/JPEODX.0000175|s2cid=216485629 }}</ref><ref>{{Cite book|last=Hastie, Trevor.|title=The elements of statistical learning : data mining, inference, and prediction : with 200 full-color illustrations|date=2001|publisher=Springer|others=Tibshirani, Robert., Friedman, J. H. (Jerome H.)|isbn=0-387-95284-5|___location=New York|oclc=46809224}}</ref>
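One common normalization choice is z-score standardization, sketched below. This function is an illustration under assumed conventions (population standard deviation, tuple-of-tuples data), not an implementation from the cited sources.

```python
import math

def zscore_normalize(rows):
    # Column-wise z-score normalization: each feature is shifted to mean 0
    # and scaled to standard deviation 1, so that no single feature dominates
    # the Euclidean distance merely because of its units.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds))
            for row in rows]
```

Without such scaling, a feature measured in the thousands would swamp one measured in fractions when distances are computed on the raw values, distorting which neighbors count as "nearest".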