Ho–Kashyap rule

{{Use dmy dates|date=May 2024}}
 
The Ho–Kashyap algorithm is an iterative method in [[machine learning]] for finding a [[linear classifier|linear decision boundary]] that separates two [[Linear separability|linearly separable]] classes. It was developed by [[Yu-Chi Ho]] and [[Rangasami L. Kashyap]] in 1965,<ref>{{Cite journal |last=Ho |first=Y-C. |last2=Kashyap |first2=R. L. |date=October 1965 |title=An Algorithm for Linear Inequalities and its Applications |url=http://ieeexplore.ieee.org/document/4038553/ |journal=IEEE Transactions on Electronic Computers |volume=EC-14 |issue=5 |pages=683–688 |doi=10.1109/PGEC.1965.264207 |issn=0367-7508}}</ref><ref>{{Cite journal |last=Ho |first=Yu-Chi |last2=Kashyap |first2=R. L. |date=February 1966 |title=A Class of Iterative Procedures for Linear Inequalities |url=http://epubs.siam.org/doi/10.1137/0304010 |journal=SIAM Journal on Control |language=en |volume=4 |issue=1 |pages=112–115 |doi=10.1137/0304010 |issn=0036-1402}}</ref> and is usually presented as a problem in [[linear programming]].
 
== Setup ==
Given a training set consisting of samples from two classes, the Ho–Kashyap algorithm seeks to find a weight vector <math>\mathbf{w}</math> and a margin vector <math>\mathbf{b}</math> such that:
<math display="block"> \mathbf{Yw} = \mathbf{b} </math>
where <math>\mathbf{Y}</math> is the augmented data matrix with samples from both classes (with appropriate sign conventions, e.g., samples from class 2 are negated), <math>\mathbf{w}</math> is the weight vector to be determined, and <math>\mathbf{b}</math> is a positive margin vector.
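For illustration, assuming the common convention of augmenting each sample with a leading 1 for the bias term, two samples <math>\mathbf{x}_1, \mathbf{x}_2</math> from class 1 and one sample <math>\mathbf{x}_3</math> from class 2 give
<math display="block"> \mathbf{Y} = \begin{bmatrix} 1 & \mathbf{x}_1^\mathsf{T} \\ 1 & \mathbf{x}_2^\mathsf{T} \\ -1 & -\mathbf{x}_3^\mathsf{T} \end{bmatrix}, </math>
so that each row of <math>\mathbf{Yw} = \mathbf{b}</math> requires the corresponding sample to lie on the positive side of the hyperplane with margin <math>b_i > 0</math>.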
 
== Algorithm ==
The idea of the Ho–Kashyap rule is as follows:
 
* Given any <math>\mathbf{b}</math>, the corresponding <math>\mathbf{w}</math> is known: It is simply <math>\mathbf{w} = \mathbf{Y}^+ \mathbf{b}</math>, where <math>\mathbf{Y}^+</math> denotes the [[Moore–Penrose inverse|Moore–Penrose pseudoinverse]] of <math>\mathbf{Y}</math>.
* Therefore, it only remains to find <math>\mathbf{b}</math> by gradient descent.
* However, gradient descent may sometimes ''decrease'' some coordinates of <math>\mathbf{b}</math>, which could make them negative, and that is undesirable. Therefore, coordinates of <math>\mathbf{b}</math> that would decrease are left unchanged, while coordinates that would increase are updated as usual (see the sketch below).
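A minimal [[NumPy]] sketch of this iteration is given below. The learning rate <math>\eta</math>, the initial margin vector, and the stopping test are illustrative choices rather than part of the original formulation; the margin update <math>\mathbf{b}(k+1) = \mathbf{b}(k) + 2\eta\,\mathbf{e}^+(k)</math>, where <math>\mathbf{e}^+(k)</math> is the positive part of the error <math>\mathbf{e}(k) = \mathbf{Y}\mathbf{w}(k) - \mathbf{b}(k)</math>, follows the standard textbook presentation (e.g. Duda, Hart & Stork 2001).
<syntaxhighlight lang="python">
import numpy as np

def ho_kashyap(Y, eta=0.5, b_init=1.0, max_iter=1000, tol=1e-6):
    """Illustrative Ho–Kashyap iteration.

    Y is the sign-adjusted, augmented data matrix (one row per sample).
    Returns a weight vector w and a margin vector b with Y w ≈ b > 0
    when the classes are linearly separable.
    """
    n_samples = Y.shape[0]
    b = np.full(n_samples, b_init)      # positive initial margin vector
    Y_pinv = np.linalg.pinv(Y)          # Moore–Penrose pseudoinverse of Y
    w = Y_pinv @ b                      # w is determined by b
    for _ in range(max_iter):
        e = Y @ w - b                   # error vector e(k)
        e_plus = 0.5 * (e + np.abs(e))  # keep only the positive components
        b = b + 2 * eta * e_plus        # b never decreases in any coordinate
        w = Y_pinv @ b                  # recompute w from the updated b
        if np.abs(e).max() < tol:       # stop once Y w ≈ b
            break
    return w, b
</syntaxhighlight>
The factor 2 in the margin update comes from the gradient of <math>\|\mathbf{Yw} - \mathbf{b}\|^2</math> with respect to <math>\mathbf{b}</math>; some presentations absorb it into the learning rate.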
 
== Relationship to other algorithms ==
* [[Perceptron]] algorithm: Both seek linear separators. The perceptron updates weights incrementally based on individual misclassified samples, while Ho–Kashyap is a batch method that processes all samples to compute the pseudoinverse and updates based on an overall error vector.
* [[Linear discriminant analysis]] (LDA): LDA assumes underlying Gaussian distributions with equal covariances for the classes and derives the decision boundary from these statistical assumptions. Ho–Kashyap makes no explicit distributional assumptions and instead tries to solve a system of linear inequalities directly.
* [[Support vector machine]]s (SVM): For linearly separable data, SVMs aim to find the maximum-margin hyperplane. The Ho–Kashyap algorithm finds ''a'' separating hyperplane but not necessarily the one with the maximum margin. If the data is not separable, soft-margin SVMs allow for some misclassifications by optimizing a trade-off between margin size and misclassification penalty, while Ho–Kashyap provides a least-squares solution.
 
== Variants ==
The modified Ho–Kashyap algorithm changes the weight calculation step from <math>\mathbf{w}(k+1) = \mathbf{Y}^+ \mathbf{b}(k+1)</math> to <math>\mathbf{w}(k+1) = \mathbf{w}(k) + \eta_k \mathbf{Y}^+ |\mathbf{e}(k)|</math>, where <math>\mathbf{e}(k)</math> is the error vector and <math>\eta_k</math> is a step size.
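Expressed against a sketch like the one above, and assuming the margin update for <math>\mathbf{b}</math> is kept unchanged, one modified iteration could look as follows; the step size <math>\eta_k</math> is illustrative.
<syntaxhighlight lang="python">
import numpy as np

def modified_ho_kashyap_step(Y, Y_pinv, w, b, eta_k):
    """One illustrative iteration of the modified rule: b is updated as in the
    standard algorithm (an assumption here), while w is moved incrementally
    instead of being recomputed as Y⁺ b."""
    e = Y @ w - b                               # error vector e(k)
    e_plus = 0.5 * (e + np.abs(e))              # positive part of the error
    b_next = b + 2 * eta_k * e_plus             # margin update, assumed unchanged
    w_next = w + eta_k * (Y_pinv @ np.abs(e))   # w(k+1) = w(k) + η_k Y⁺ |e(k)|
    return w_next, b_next
</syntaxhighlight>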
 
Kernel Ho–Kashyap algorithm: Applies [[kernel method]]s (the "[[kernel trick]]") to the Ho–Kashyap framework to enable non-linear classification by implicitly mapping data to a higher-dimensional feature space.<ref>{{cite journal |last=Łęski |first=Jacek |date=2004 |title=Kernel Ho–Kashyap classifier with generalization control |journal=International Journal of Applied Mathematics and Computer Science |volume=14 |issue=1 |pages=53–61}}</ref>
 
== See also ==
== References ==
<references />
* {{cite book |last1=Duda |first1=R. O. |last2=Hart |first2=P. E. |last3=Stork |first3=D. G. |year=2001 |title=Pattern Classification |edition=2nd |chapter=5.9. The Ho–Kashyap Procedures |publisher=Wiley-Interscience |isbn=978-0-471-05669-0}}
* {{cite book |last1=Bishop |first1=C. M. |year=2006 |title=Pattern Recognition and Machine Learning |publisher=Springer |isbn=978-0-387-31073-2}}
* {{cite book |last1=Theodoridis |first1=S. |last2=Koutroumbas |first2=K. |year=2008 |title=Pattern Recognition |edition=4th |publisher=Academic Press |isbn=978-1-59749-272-0}}