Minimum relevant variables in linear system: Difference between revisions

Content deleted Content added
No edit summary
Line 19:
== Applications ==
The Min-RVLS problem is important in [[machine learning]] and [[linear discriminant analysis]]. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them.<ref>{{Cite journal|last=Koehler|first=Gary J.|date=1991-11-01|title=Linear Discriminant Functions Determined by Genetic Search|url=https://pubsonline.informs.org/doi/abs/10.1287/ijoc.3.4.345|journal=ORSA Journal on Computing|volume=3|issue=4|pages=345–357|doi=10.1287/ijoc.3.4.345|issn=0899-1499}}</ref> The problem is known as the [[minimum feature set problem]]. An algorithm that approximates Min-RVLS within a factor of <math>O(\log(m))</math>could substantially reduce the number of training samples required to attain a given accuracy level. <ref>{{Cite journal|last=Van Horn|first=Kevin S.|last2=Martinez|first2=Tony R.|date=1994-03-01|title=The Minimum Feature Set Problem|url=http://dx.doi.org/10.1016/0893-6080(94)90082-5|journal=Neural Netw.|volume=7|issue=3|pages=491–494|doi=10.1016/0893-6080(94)90082-5|issn=0893-6080|via=}}</ref>
 
The [[shortest codeword problem]] in [[coding theory]] is the same problem as Min-RVLS[=] when the coefficients are in GF(2).
 
== Related problems ==
Line 25 ⟶ 27:
Min-ULR[≠] is trivially solvable, since any system with real variables and a finite number of inequality constraints is feasible. As for the other three variants:
 
* Min-ULR[=,>,≥] are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {1,-1}). <ref name=":0">{{Cite journal|date=1995-08-07|title=The complexity and approximability of finding maximum feasible subsystems of linear relations|url=https://www.sciencedirect.com/science/article/pii/030439759400254G|journal=Theoretical Computer Science|language=en|volume=147|issue=1-2|pages=181–210|doi=10.1016/0304-3975(94)00254-G|issn=0304-3975}}</ref>
* The NP-complete problem [[Minimum feedback arc set]] reduces to Min-ULR[≥], with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1. <ref name=":2">{{Cite journal|date=1993-02-01|title=A note on resolving infeasibility in linear programs by constraint relaxation|url=https://www.sciencedirect.com/science/article/pii/016763779390079V|journal=Operations Research Letters|language=en|volume=13|issue=1|pages=19–20|doi=10.1016/0167-6377(93)90079-V|issn=0167-6377}}</ref>
* Min-ULR[=,>,≥] are polynomial if the number of variables ''n'' is constant: they can be solved polynomially using an algorithm of Greer<ref>{{Cite book|url=https://books.google.co.il/books?hl=iw&lr=&id=DRxEqwzbb2UC&oi=fnd&pg=PP1&dq=Trees+and+hills:+Methodology+for+maximizing+functions+of+systems+of+linear+relations,&ots=50EXPaJ_VC&sig=uL5nVIhh146iH1sYf--cLHvntcM&redir_esc=y#v=onepage&q=Trees%20and%20hills:%20Methodology%20for%20maximizing%20functions%20of%20systems%20of%20linear%20relations,&f=false|title=Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations|last=Greer|first=R.|date=2011-08-18|publisher=Elsevier|isbn=9780080872070|language=en}}</ref> in time <math>O(n\cdot m^n / 2^{n-1})</math>.
Line 31 ⟶ 33:
*Min-ULR[≥] is polynomial in some special case.<ref name=":2" />
*Min-ULR[=,>,≥] can be approximated within ''n''+1 in polynomial time.<ref name=":1" />
*Min-ULR[>,≥] are [[Minimum dominating set|minimum-dominating-set]]-hard, even with homogeneous systems and ternary coefficients (in {-1,0,1}).
*Min-ULR[=] cannot be approximated within a factor of <math>2^{\log^{1-\epsilon}n}</math>, for any <math>\epsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{polylog(n)}</math>). <ref>{{Cite journal|date=1997-04-01|title=The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations|url=https://www.sciencedirect.com/science/article/pii/S0022000097914720|journal=Journal of Computer and System Sciences|language=en|volume=54|issue=2|pages=317–331|doi=10.1006/jcss.1997.1472|issn=0022-0000}}</ref>
 
In the complementary problem '''MAXimum Feasible Linear Subystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0" />
 
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients.
 
* . With integer coefficients, it is hard to approximate within <math>m^{\epsilon}</math>. With coefficients over GF[q], it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
 
== Hardness of approximation ==
All four variants of Min-RVLS are hard to approximate. In particular all four variants cannot be approximated within a factor of <math>2^{\log^{1-\epsilon}n}</math>, for any <math>\epsilon>0</math>, unless NP is contained in [[DTIME]](<math>n^{polylog(n)}</math>).<ref name=":1" />{{Rp|247-250}} The hardness is proved by reductions:
 
* There is a reduction from Min-ULR[=] to Min-RVLS[=]. It also applies to Min-RVLS[≥] and Min-RVLS[>], since each equation can be replaced by two complementary inequalities.
In the complementary problem '''MAXimum Feasible Linear Subystem''' ('''Max-FLS'''), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.<ref name=":0">{{Cite journal|date=1995-08-07|title=The complexity and approximability of finding maximum feasible subsystems of linear relations|url=https://www.sciencedirect.com/science/article/pii/030439759400254G|journal=Theoretical Computer Science|language=en|volume=147|issue=1-2|pages=181–210|doi=10.1016/0304-3975(94)00254-G|issn=0304-3975}}</ref>
* There is a reduction from [[Minimum dominating set|minimum-dominating-set]] to Min-RVLS[≠].
 
On the other hand, there is a reduction fro Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities.
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients.
 
Therefore, when R is in {=,>,≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.
* . With integer coefficients, it is hard to approximate within <math>m^{\epsilon}</math>. With coefficients over GF[q], it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
 
== References ==