Minimum relevant variables in linear system: Difference between revisions

Content deleted Content added
 
(15 intermediate revisions by 12 users not shown)
Line 1:
'''MINimumMinimum Relevantrelevant Variablesvariables in Linearlinear Systemsystem''' ('''Min-RVLS''') is a problem in [[mathematical optimization]]. Given a [[linear program]], it is required to find a feasible solution in which the number of non-zero variables is as small as possible.
 
The problem is known to be [[NP-hardness|NP-hard]] and even hard to approximate.
 
== Definition ==
A Min-RVLS problem is defined by:<ref name=":1">{{Citecite journal |datelast1=1998-12-06Amaldi |first1=Edoardo |last2=Kann |first2=Viggo |title=On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems|url=https://www.sciencedirect.com/science/article/pii/S0304397597001151 |journal=Theoretical Computer Science |languagedate=enDecember 1998 |volume=209 |issue=1-21–2 |pages=237–260 |doi=10.1016/S0304-3975(97)00115-1 |issn=0304doi-3975access=free }}</ref>
 
* A binary relation ''R'', which is one of {=, ≥, >, ≠};
Line 15:
 
== Special case ==
The problem Min-RVLS[=] was presented by Garey and Johnson,<ref>{{Citecite book web|urllast1=https://wwwGarey |first1=Michael R.semanticscholar.org/paper/Computers-and-Intractability:-A-Guide-to-the-Theory-Garey- |last2=Johnson/1e3b61f29e5317ef59d367e1a53ba407912d240e |first2=David S. |title=Computers and Intractability: A Guide to the Theory of NP-Completeness|last=Johnson|first=Davidcompleteness S.|last2=Garey|first2=M. R.|date=1979 |websitepublisher=wwwW.semanticscholar H.org Freeman |languageisbn=en|access978-date=20190-017167-071044-8 }}</ref> who called it "minimum weight solution to linear equations". They proved it was NP-hard, but did not consider approximations.
 
== Applications ==
The Min-RVLS problem is important in [[machine learning]] and [[linear discriminant analysis]]. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them.<ref>{{Citecite journal |lastlast1=Koehler |firstfirst1=Gary J.|date=1991-11-01 |title=Linear Discriminant Functions Determined by Genetic Search|url=https://pubsonline.informs.org/doi/abs/10.1287/ijoc.3.4.345 |journal=ORSA Journal on Computing |date=November 1991 |volume=3 |issue=4 |pages=345–357 |doi=10.1287/ijoc.3.4.345|issn=0899-1499 }}</ref> The problem is known as the [[minimum feature set problem]]. An algorithm that approximates Min-RVLS within a factor of <math>O(\log(m))</math> could substantially reduce the number of training samples required to attain a given accuracy level. <ref>{{Citecite journal |lastlast1=Van Horn |firstfirst1=Kevin S. |last2=Martinez |first2=Tony R.|date=1994-03-01 |title=The Minimumminimum Featurefeature Setset problem Problem|url=http://dx.doi.org/10.1016/0893-6080(94)90082-5|journal=Neural Netw.Networks |date=January 1994 |volume=7 |issue=3 |pages=491–494 |doi=10.1016/0893-6080(94)90082-5|issn=0893-6080|via= }}</ref>
 
The [[shortest codeword problem]] in [[coding theory]] is the same problem as Min-RVLS[=] when the coefficients are in GF(2).
 
== Related problems ==
In '''MINimumminimum Unsatisfiedunsatisfied Linearlinear Relationsrelations''' ('''Min-ULR'''), we are given a binary relation ''R'' and a linear system ''A x'' ''R'' ''b'', which is now assumed to be ''infeasible''. The goal is to find a vector ''x'' that violates as few relations as possible, while satisfying all the others.
 
Min-ULR[≠] is trivially solvable, since any system with real variables and a finite number of inequality constraints is feasible. As for the other three variants:
 
* Min-ULR[=,>,≥] are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {1,-1}). <ref name=":0">{{cite journal |last1=Amaldi |first1=Edoardo |last2=Kann |first2=Viggo |title=The complexity and approximability of finding maximum feasible subsystems of linear relations |journal=Theoretical Computer Science |date=August 1995 |volume=147 |issue=1–2 |pages=181–210 |doi=10.1016/0304-3975(94)00254-G |doi-access=free }}</ref>
* The NP-complete problem [[Minimum feedback arc set]] reduces to Min-ULR[≥], with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1. <ref name=":2">{{Citecite journal |datelast1=1993-02-01Sankaran |first1=Jayaram K |title=A note on resolving infeasibility in linear programs by constraint relaxation|url=https://www.sciencedirect.com/science/article/pii/016763779390079V |journal=Operations Research Letters |languagedate=enFebruary 1993 |volume=13 |issue=1 |pages=19–20 |doi=10.1016/0167-6377(93)90079-V|issn=0167-6377 }}</ref>
* Min-ULR[=,>,≥] are polynomial if the number of variables ''n'' is constant: they can be solved polynomially using an algorithm of Greer<ref>{{Citecite book |urllast1=https://booksGreer |first1=R.google.co.il/books?hl=iw&lr=&id=DRxEqwzbb2UC&oi=fnd&pg=PP1&dq=Trees+and+hills:+Methodology+for+maximizing+functions+of+systems+of+linear+relations,&ots=50EXPaJ_VC&sig=uL5nVIhh146iH1sYf--cLHvntcM&redir_esc=y#v=onepage&q=Trees%20and%20hills:%20Methodology%20for%20maximizing%20functions%20of%20systems%20of%20linear%20relations,&f=false |title=Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations|last=Greer|first=R. |date=2011-08-18 |publisher=Elsevier |isbn=9780080872070978-0-08-087207-0 }}{{pn|languagedate=enOctober 2021}}</ref> in time <math>O(n\cdot m^n / 2^{n-1})</math>.
* Min-ULR[=,>,≥] are linear if the number of constraints ''m'' is constant, since all subsystems can be checked in time ''O''(''n'').
*Min-ULR[≥] is polynomial in some special case.<ref name=":2" />
*Min-ULR[=,>,≥] can be approximated within ''n''&nbsp;+&nbsp;1 in polynomial time.<ref name=":1" />
*Min-ULR[>,≥] are [[Minimum dominating set|minimum-dominating-set]]-hard, even with homogeneous systems and ternary coefficients (in {-1−1,0,1}).
*Min-ULR[=] cannot be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref>{{cite journal |last1=Arora |first1=Sanjeev |last2=Babai |first2=László |last3=Stern |first3=Jacques |last4=Sweedyk |first4=Z |title=The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations |journal=Journal of Computer and System Sciences |date=April 1997 |volume=54 |issue=2 |pages=317–331 |doi=10.1006/jcss.1997.1472 |doi-access=free }}</ref>
 
In the complementary problem '''maximum feasible linear subsystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0" />
 
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients.
 
* . With integer coefficients, it is hard to approximate within <math>m^{\epsilonvarepsilon}</math>. With coefficients over GF[q], it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
 
== Hardness of approximation ==
All four variants of Min-RVLS are hard to approximate. In particular all four variants cannot be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref name=":1" />{{Rp|247-250}} The hardness is proved by reductions:
 
* There is a reduction from Min-ULR[=] to Min-RVLS[=]. It also applies to Min-RVLS[≥] and Min-RVLS[>], since each equation can be replaced by two complementary inequalities.
In the complementary problem '''MAXimum Feasible Linear Subystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0">{{Cite journal|date=1995-08-07|title=The complexity and approximability of finding maximum feasible subsystems of linear relations|url=https://www.sciencedirect.com/science/article/pii/030439759400254G|journal=Theoretical Computer Science|language=en|volume=147|issue=1-2|pages=181–210|doi=10.1016/0304-3975(94)00254-G|issn=0304-3975}}</ref>
* There is a reduction from [[Minimum dominating set|minimum-dominating-set]] to Min-RVLS[≠].
 
On the other hand, there is a reduction from Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities.
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients.
 
Therefore, when R is in {=,>,≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.
* . With integer coefficients, it is hard to approximate within <math>m^{\epsilon}</math>. With coefficients over GF[q], it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
 
== References ==
Line 45 ⟶ 58:
[[Category:Linear programming]]
[[Category:NP-hard problems]]
[[Category:Approximation algorithms]]
[[Category:Combinatorial optimization]]