Content deleted Content added
m Reverted edits by 2409:40F2:44:355:8000:0:0:0 (talk) (HG) (3.4.12) |
Citation bot (talk | contribs) Add: s2cid. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox2 | #UCB_webform_linked 460/999 |
||
Line 107:
The naive version of the algorithm is easy to implement by computing the distances from the test example to all stored examples, but it is computationally intensive for large training sets. Using an approximate [[nearest neighbor search]] algorithm makes ''k-''NN computationally tractable even for large data sets. Many nearest neighbor search algorithms have been proposed over the years; these generally seek to reduce the number of distance evaluations actually performed.
''k-''NN has some strong [[consistency (statistics)|consistency]] results. As the amount of data approaches infinity, the two-class ''k-''NN algorithm is guaranteed to yield an error rate no worse than twice the [[Bayes error rate]] (the minimum achievable error rate given the distribution of the data).<ref name=":1">{{cite journal |url=http://ssg.mit.edu/cal/abs/2000_spring/np_dens/classification/cover67.pdf |doi=10.1109/TIT.1967.1053964 |author-link1=Thomas M. Cover |last1=Cover |first1=Thomas M. |author-link2=Peter E. Hart |last2=Hart |first2=Peter E. |title=Nearest neighbor pattern classification |journal=IEEE Transactions on Information Theory |year=1967 |volume=13 |issue=1 |pages=21–27 | citeseerx=10.1.1.68.2616 |s2cid=5246200 }}</ref> Various improvements to the ''k''-NN speed are possible by using proximity graphs.<ref>{{cite journal |doi=10.1142/S0218195905001622 |last=Toussaint |first=Godfried T. |title=Geometric proximity graphs for improving nearest neighbor methods in instance-based learning and data mining |journal=International Journal of Computational Geometry and Applications |volume=15 |issue=2 |date=April 2005 |pages=101–150 }}</ref>
For multi-class ''k-''NN classification, [[Thomas M. Cover|Cover]] and [[Peter E. Hart|Hart]] (1967) prove an upper bound error rate of
|