In [[mathematical optimization]], the problem of '''minimum relevant variables in a linear system''' ('''Min-RVLS''') is to find a feasible solution of a given linear system in which the number of nonzero variables is as small as possible. The problem is known to be [[NP-hardness|NP-hard]] and even hard to approximate.
== Definition ==
A Min-RVLS problem is defined by:<ref name=":1">{{cite journal |last1=Amaldi |first1=Edoardo |last2=Kann |first2=Viggo |title=On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems |journal=Theoretical Computer Science |date=December 1998 |volume=209 |issue=1–2 |pages=237–260 |doi=10.1016/S0304-3975(97)00115-1 |doi-access=free }}</ref>
* A binary relation ''R'', which is one of {=, ≥, >, ≠};
* An ''m''-by-''n'' matrix ''A'' (where ''m'' is the number of constraints and ''n'' the number of variables);
* An ''m''-by-1 vector ''b''.
The linear system is given by ''A x R b''. It is assumed to be feasible (i.e., satisfied by at least one ''x''). Depending on ''R'', there are four different variants of this system: ''A x = b'', ''A x ≥ b'', ''A x > b'', ''A x ≠ b''.
The goal is to find an ''n''-by-1 vector ''x'' that satisfies the system ''A x R b'' and, subject to that, contains as few nonzero elements as possible.
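For illustration, the following is a minimal brute-force sketch of Min-RVLS[=] (the function name, the example data, and the use of NumPy are illustrative and not taken from the cited literature). It enumerates candidate supports in increasing order of size, so it runs in exponential time and only serves to make the definition concrete.

<syntaxhighlight lang="python">
import itertools
import numpy as np

def min_rvls_eq(A, b, tol=1e-9):
    """Brute-force Min-RVLS[=]: return an x with A @ x = b (up to tol)
    that has as few nonzero entries as possible.  Exponential in n,
    for illustration only, since the problem is NP-hard."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape
    for k in range(n + 1):                       # supports of increasing size
        for support in itertools.combinations(range(n), k):
            x = np.zeros(n)
            if k > 0:
                cols = list(support)
                y, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
                x[cols] = y                      # only these entries may be nonzero
            if np.allclose(A @ x, b, atol=tol):
                return x                         # first hit has minimum support
    return None                                  # unreachable for a feasible system

# Example: x = (0, 0, 1) solves the system with a single nonzero variable.
print(min_rvls_eq([[1, 0, 1], [0, 1, 1]], [1, 1]))
</syntaxhighlight>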
== Special case ==
The problem Min-RVLS[=] was presented by Garey and Johnson,<ref>{{cite book |last1=Garey |first1=Michael R. |last2=Johnson |first2=David S. |title=Computers and Intractability: A Guide to the Theory of NP-completeness |date=1979 |publisher=W. H. Freeman |isbn=978-0-7167-1044-8 }}</ref> who called it "minimum weight solution to linear equations". They proved it was NP-hard, but did not consider approximations.
== Applications ==
The Min-RVLS problem is important in [[machine learning]] and [[linear discriminant analysis]]. Given a set of positive and negative examples, the goal is to minimize the number of features needed to classify them correctly.<ref>{{cite journal |last1=Koehler |first1=Gary J. |title=Linear Discriminant Functions Determined by Genetic Search |journal=ORSA Journal on Computing |date=November 1991 |volume=3 |issue=4 |pages=345–357 |doi=10.1287/ijoc.3.4.345 }}</ref> The problem is known as the [[minimum feature set problem]]. An algorithm that approximates Min-RVLS within a factor of <math>O(\log(m))</math> could substantially reduce the number of training samples required to attain a given accuracy level.<ref>{{cite journal |last1=Van Horn |first1=Kevin S. |last2=Martinez |first2=Tony R. |title=The minimum feature set problem |journal=Neural Networks |date=January 1994 |volume=7 |issue=3 |pages=491–494 |doi=10.1016/0893-6080(94)90082-5 }}</ref>
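A minimal sketch of this formulation is given below, assuming a homogeneous (threshold-free) linear classifier; the function name and example data are illustrative. Each labelled example <math>(x_i, y_i)</math> with <math>y_i \in \{-1,+1\}</math> contributes the row <math>y_i x_i</math> of a homogeneous system, a correct classifier is any weight vector ''w'' that makes all row products strictly positive, and the minimum feature set is then a minimum-support solution of a Min-RVLS[>] instance:

<syntaxhighlight lang="python">
import itertools
import numpy as np
from scipy.optimize import linprog

def min_feature_set(X, y):
    """Sketch: minimum feature set as Min-RVLS[>].
    Find a weight vector w with as few nonzero entries as possible such that
    (y_i * x_i) . w > 0 for every example, i.e. every example is classified
    correctly by a homogeneous linear classifier.  Brute-forces the support,
    so the running time is exponential in the number of features."""
    A = np.asarray(y, dtype=float)[:, None] * np.asarray(X, dtype=float)
    m, n = A.shape
    for k in range(1, n + 1):
        for support in itertools.combinations(range(n), k):
            cols = list(support)
            # By homogeneity, A_S w > 0 is feasible iff A_S w >= 1 is feasible.
            res = linprog(np.zeros(k), A_ub=-A[:, cols], b_ub=-np.ones(m),
                          bounds=[(None, None)] * k)
            if res.success:
                w = np.zeros(n)
                w[cols] = res.x
                return w
    return None                                  # no homogeneous separator exists

# Example: the first feature alone already separates the two classes.
print(min_feature_set([[1, 2], [3, 1], [-2, -1]], [1, 1, -1]))
</syntaxhighlight>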
The [[shortest codeword problem]] in [[coding theory]] is the same problem as Min-RVLS[=] when the coefficients are in GF(2).
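As a concrete illustration of the GF(2) case (a brute-force sketch analogous to the one above; the function name and the small parity-check matrix are illustrative), a minimum-weight 0/1 solution of a parity-check system can be found by enumerating supports in order of increasing Hamming weight:

<syntaxhighlight lang="python">
import itertools
import numpy as np

def min_rvls_eq_gf2(H, s):
    """Brute-force Min-RVLS[=] over GF(2): a 0/1 vector x with H x = s (mod 2)
    and as few ones as possible."""
    H = np.asarray(H, dtype=int) % 2
    s = np.asarray(s, dtype=int) % 2
    m, n = H.shape
    for k in range(n + 1):                       # Hamming weight, smallest first
        for ones in itertools.combinations(range(n), k):
            x = np.zeros(n, dtype=int)
            x[list(ones)] = 1
            if np.array_equal(H.dot(x) % 2, s):
                return x
    return None

# Example: a weight-1 solution exists for this small parity-check system.
print(min_rvls_eq_gf2([[1, 1, 0], [0, 1, 1]], [1, 0]))
</syntaxhighlight>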
== Related problems ==
In '''minimum unsatisfied linear relations''' ('''Min-ULR'''), we are given a binary relation ''R'' and a linear system ''A x'' ''R'' ''b'', which is now assumed to be ''infeasible''. The goal is to find a vector ''x'' that violates as few relations as possible, while satisfying all the others.
Min-ULR[≠] is trivially solvable, since any finite system of "≠" relations over real-valued variables is feasible. As for the other three variants:
* Min-ULR[=,>,≥] are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {1,−1}).<ref name=":0">{{cite journal |last1=Amaldi |first1=Edoardo |last2=Kann |first2=Viggo |title=The complexity and approximability of finding maximum feasible subsystems of linear relations |journal=Theoretical Computer Science |date=August 1995 |volume=147 |issue=1–2 |pages=181–210 |doi=10.1016/0304-3975(94)00254-G |doi-access=free }}</ref>
* The NP-complete problem [[Minimum feedback arc set]] reduces to Min-ULR[≥], with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1.<ref name=":2">{{cite journal |last1=Sankaran |first1=Jayaram K |title=A note on resolving infeasibility in linear programs by constraint relaxation |journal=Operations Research Letters |date=February 1993 |volume=13 |issue=1 |pages=19–20 |doi=10.1016/0167-6377(93)90079-V }}</ref>
* Min-ULR[=,>,≥] are polynomial if the number of variables ''n'' is constant: they can be solved by an algorithm of Greer<ref>{{cite book |last1=Greer |first1=R. |title=Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations |date=2011 |publisher=Elsevier |isbn=978-0-08-087207-0 }}{{pn|date=October 2021}}</ref> in time <math>O(n\cdot m^n / 2^{n-1})</math>.
* Min-ULR[=,>,≥] are linear if the number of constraints ''m'' is constant, since all subsystems can be checked in time ''O''(''n''); a brute-force sketch of this subsystem enumeration appears after this list.
*Min-ULR[≥] is polynomial in some special case.<ref name=":2" />
*Min-ULR[=,>,≥] can be approximated within ''n'' + 1 in polynomial time.<ref name=":1" />
*Min-ULR[>,≥] are [[Minimum dominating set|minimum-dominating-set]]-hard, even with homogeneous systems and ternary coefficients (in {−1,0,1}).
*Min-ULR[=] cannot be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref>{{cite journal |last1=Arora |first1=Sanjeev |last2=Babai |first2=László |last3=Stern |first3=Jacques |last4=Sweedyk |first4=Z |title=The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations |journal=Journal of Computer and System Sciences |date=April 1997 |volume=54 |issue=2 |pages=317–331 |doi=10.1006/jcss.1997.1472 |doi-access=free }}</ref>
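The subsystem enumeration behind the constant-''m'' results can be sketched as follows (a brute-force illustration only: the function name and the example are illustrative, and an off-the-shelf LP solver is used for each feasibility check rather than the ''O''(''n'') checks mentioned above). It keeps as many "≥" relations as possible and reports how many must be violated:

<syntaxhighlight lang="python">
import itertools
import numpy as np
from scipy.optimize import linprog

def min_ulr_geq(A, b):
    """Brute-force Min-ULR[>=]: the minimum number of relations of A x >= b
    that must be violated.  Enumerates subsystems, so it is exponential in
    the number of constraints m, but for constant m only a constant number
    of feasibility checks is needed."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape
    for violated in range(m + 1):                # violate as few rows as possible
        for keep in itertools.combinations(range(m), m - violated):
            keep = list(keep)
            if not keep:                         # nothing left to satisfy
                return violated
            # Feasibility of A[keep] x >= b[keep], written as -A[keep] x <= -b[keep].
            res = linprog(np.zeros(n), A_ub=-A[keep], b_ub=-b[keep],
                          bounds=[(None, None)] * n)
            if res.success:
                return violated
    return m

# Example: x >= 1, x <= 0 and x <= -1 cannot all hold, but dropping the first
# relation leaves a feasible pair, so the answer is 1.
print(min_ulr_geq([[1.0], [-1.0], [-1.0]], [1.0, 0.0, 1.0]))
</syntaxhighlight>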
In the complementary problem '''maximum feasible linear subsystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0" />
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients. With integer coefficients, it is hard to approximate within <math>m^{\varepsilon}</math>. With coefficients over GF(''q''), it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
== Hardness of approximation ==
All four variants of Min-RVLS are hard to approximate. In particular, none of them can be approximated within a factor of <math>2^{\log^{1-\varepsilon}n}</math>, for any <math>\varepsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{\operatorname{polylog}(n)}</math>).<ref name=":1" />{{Rp|247–250}} The hardness is proved by reductions:
* There is a reduction from Min-ULR[=] to Min-RVLS[=]. It also applies to Min-RVLS[≥] and Min-RVLS[>], since each equation can be replaced by two complementary inequalities.
* There is a reduction from [[Minimum dominating set|minimum-dominating-set]] to Min-RVLS[≠].
On the other hand, there is a reduction from Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities.
Therefore, when R is in {=,>,≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.
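For the "≥" variants, for example, the splitting of each equation into two complementary inequalities is the elementary equivalence

<math display=block>a_i \cdot x = b_i \iff \left( a_i \cdot x \ge b_i \ \text{ and } \ -a_i \cdot x \ge -b_i \right),</math>

where <math>a_i</math> is the ''i''-th row of ''A''; this leaves the set of feasible vectors, and hence their supports, unchanged.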
== References ==
{{Reflist}}
[[Category:Linear programming]]
[[Category:NP-hard problems]]
[[Category:Approximation algorithms]]
[[Category:Combinatorial optimization]]