Minimum relevant variables in linear system: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 12:09, 7 January 2019 edit Erel Segal (talk \| contribs) Extended confirmed users, IP block exemptions 14,576 edits →Related problems Tag: Visual edit ← Previous edit		Latest revision as of 01:52, 22 March 2024 edit undo Bkell (talk \| contribs) Administrators 64,798 edits m →Related problems: again, MOS:ACRO
(15 intermediate revisions by 12 users not shown)
Line 1: '''~~MINimum~~Minimum ~~Relevant~~relevant ~~Variables~~variables in ~~Linear~~linear ~~System~~system''' ('''Min-RVLS''') is a problem in [[mathematical optimization]]. Given a [[linear program]], it is required to find a feasible solution in which the number of non-zero variables is as small as possible. The problem is known to be [[NP-hardness\|NP-hard]] and even hard to approximate. == Definition == A Min-RVLS problem is defined by:<ref name=":1">{{~~Cite~~cite journal \|~~date~~last1=~~1998-12-06~~Amaldi \|first1=Edoardo \|last2=Kann \|first2=Viggo \|title=On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems~~\|url=https://www.sciencedirect.com/science/article/pii/S0304397597001151~~ \|journal=Theoretical Computer Science \|~~language~~date=enDecember 1998 \|volume=209 \|issue=~~1-2~~1–2 \|pages=237–260 \|doi=10.1016/S0304-3975(97)00115-1 \|~~issn=0304~~doi-~~3975~~access=free }}</ref> * A binary relation ''R'', which is one of {=, ≥, >, ≠}; Line 15: == Special case == The problem Min-RVLS[=] was presented by Garey and Johnson,<ref>{{~~Cite~~cite book ~~web~~\|~~url~~last1=~~https://www~~Garey \|first1=Michael R.~~semanticscholar.org/paper/Computers-and-Intractability:-A-Guide-to-the-Theory-Garey-~~ \|last2=Johnson~~/1e3b61f29e5317ef59d367e1a53ba407912d240e~~ \|first2=David S. \|title=Computers and Intractability: A Guide to the Theory of NP-~~Completeness\|last=Johnson\|first=David~~completeness ~~S.\|last2=Garey\|first2=M. R.~~\|date=1979 \|~~website~~publisher=~~www~~W.~~semanticscholar~~ H.~~org~~ Freeman \|~~language~~isbn=~~en\|access~~978-~~date=2019~~0-017167-071044-8 }}</ref> who called it "minimum weight solution to linear equations". They proved it was NP-hard, but did not consider approximations. == Applications == The Min-RVLS problem is important in [[machine learning]] and [[linear discriminant analysis]]. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them.<ref>{{~~Cite~~cite journal \|~~last~~last1=Koehler \|~~first~~first1=Gary J.~~\|date=1991-11-01~~ \|title=Linear Discriminant Functions Determined by Genetic Search~~\|url=https://pubsonline.informs.org/doi/abs/10.1287/ijoc.3.4.345~~ \|journal=ORSA Journal on Computing \|date=November 1991 \|volume=3 \|issue=4 \|pages=345–357 \|doi=10.1287/ijoc.3.4.345~~\|issn=0899-1499~~ }}</ref> The problem is known as the [[minimum feature set problem]]. An algorithm that approximates Min-RVLS within a factor of <math>O(\log(m))</math> could substantially reduce the number of training samples required to attain a given accuracy level. <ref>{{~~Cite~~cite journal \|~~last~~last1=Van Horn \|~~first~~first1=Kevin S. \|last2=Martinez \|first2=Tony R.~~\|date=1994-03-01~~ \|title=The ~~Minimum~~minimum ~~Feature~~feature ~~Set~~set problem ~~Problem\|url=http://dx.doi.org/10.1016/0893-6080(94)90082-5~~\|journal=Neural ~~Netw.~~Networks \|date=January 1994 \|volume=7 \|issue=3 \|pages=491–494 \|doi=10.1016/0893-6080(94)90082-5~~\|issn=0893-6080\|via=~~ }}</ref> The [[shortest codeword problem]] in [[coding theory]] is the same problem as Min-RVLS[=] when the coefficients are in GF(2). == Related problems == In '''~~MINimum~~minimum ~~Unsatisfied~~unsatisfied ~~Linear~~linear ~~Relations~~relations''' ('''Min-ULR'''), we are given a binary relation ''R'' and a linear system ''A x'' ''R'' ''b'', which is now assumed to be ''infeasible''. The goal is to find a vector ''x'' that violates as few relations as possible, while satisfying all the others. Min-ULR[≠] is trivially solvable, since any system with real variables and a finite number of inequality constraints is feasible. As for the other three variants: * Min-ULR[=,>,≥] are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {1,-1}). <ref name=":0">{{cite journal \|last1=Amaldi \|first1=Edoardo \|last2=Kann \|first2=Viggo \|title=The complexity and approximability of finding maximum feasible subsystems of linear relations \|journal=Theoretical Computer Science \|date=August 1995 \|volume=147 \|issue=1–2 \|pages=181–210 \|doi=10.1016/0304-3975(94)00254-G \|doi-access=free }}</ref> * The NP-complete problem [[Minimum feedback arc set]] reduces to Min-ULR[≥], with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1. <ref name=":2">{{~~Cite~~cite journal \|~~date~~last1=~~1993-02-01~~Sankaran \|first1=Jayaram K \|title=A note on resolving infeasibility in linear programs by constraint relaxation~~\|url=https://www.sciencedirect.com/science/article/pii/016763779390079V~~ \|journal=Operations Research Letters \|~~language~~date=enFebruary 1993 \|volume=13 \|issue=1 \|pages=19–20 \|doi=10.1016/0167-6377(93)90079-V~~\|issn=0167-6377~~ }}</ref> * Min-ULR[=,>,≥] are polynomial if the number of variables ''n'' is constant: they can be solved polynomially using an algorithm of Greer<ref>{{~~Cite~~cite book \|~~url~~last1=~~https://books~~Greer \|first1=R.google.co.il/books?hl=iw&lr=&id=DRxEqwzbb2UC&oi=fnd&pg=PP1&dq=Trees+and+hills:+Methodology+for+maximizing+functions+of+systems+of+linear+relations,&ots=50EXPaJ_VC&sig=uL5nVIhh146iH1sYf--cLHvntcM&redir_esc=y#v=onepage&q=Trees%20and%20hills:%20Methodology%20for%20maximizing%20functions%20of%20systems%20of%20linear%20relations,&f=false \|title=Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations~~\|last=Greer\|first=R.~~ \|date=2011~~-08-18~~ \|publisher=Elsevier \|isbn=~~9780080872070~~978-0-08-087207-0 }}{{pn\|~~language~~date=enOctober 2021}}</ref> in time <math>O(n\cdot m^n / 2^{n-1})</math>. * Min-ULR[=,>,≥] are linear if the number of constraints ''m'' is constant, since all subsystems can be checked in time ''O''(''n''). Min-ULR[≥] is polynomial in some special case.<ref name=":2" /> Min-ULR[=,>,≥] can be approximated within ''n'' + 1 in polynomial time.<ref name=":1" /> Min-ULR[>,≥] are [[Minimum dominating set\|minimum-dominating-set]]-hard, even with homogeneous systems and ternary coefficients (in {-1−1,0,1}). Min-ULR[=] cannot be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref>{{cite journal \|last1=Arora \|first1=Sanjeev \|last2=Babai \|first2=László \|last3=Stern \|first3=Jacques \|last4=Sweedyk \|first4=Z \|title=The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations \|journal=Journal of Computer and System Sciences \|date=April 1997 \|volume=54 \|issue=2 \|pages=317–331 \|doi=10.1006/jcss.1997.1472 \|doi-access=free }}</ref> In the complementary problem '''maximum feasible linear subsystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0" /> * Max-FLS[≠] can be solved in polynomial time. ▼ * Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients. ▼ * . With integer coefficients, it is hard to approximate within <math>m^{\~~epsilon~~varepsilon}</math>. With coefficients over GF[q], it is ''q''-approximable. ▼ * Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor. ▼ == Hardness of approximation == All four variants of Min-RVLS are hard to approximate. In particular all four variants cannot be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref name=":1" />{{Rp\|247-250}} The hardness is proved by reductions: * There is a reduction from Min-ULR[=] to Min-RVLS[=]. It also applies to Min-RVLS[≥] and Min-RVLS[>], since each equation can be replaced by two complementary inequalities. In the complementary problem '''MAXimum Feasible Linear Subystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0">{{Cite journal\|date=1995-08-07\|title=The complexity and approximability of finding maximum feasible subsystems of linear relations\|url=https://www.sciencedirect.com/science/article/pii/030439759400254G\|journal=Theoretical Computer Science\|language=en\|volume=147\|issue=1-2\|pages=181–210\|doi=10.1016/0304-3975(94)00254-G\|issn=0304-3975}}</ref> * There is a reduction from [[Minimum dominating set\|minimum-dominating-set]] to Min-RVLS[≠]. On the other hand, there is a reduction from Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities. ▲* Max-FLS[≠] can be solved in polynomial time. ▲* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients. Therefore, when R is in {=,>,≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness. ▲* . With integer coefficients, it is hard to approximate within <math>m^{\epsilon}</math>. With coefficients over GF[q], it is ''q''-approximable. ▲* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor. == References == Line 45 ⟶ 58: [[Category:Linear programming]] [[Category:NP-hard problems]] [[Category:Approximation algorithms]] [[Category:Combinatorial optimization]]