Maximum coverage problem: Difference between revisions

Content deleted Content added
m Correct typo can -> cannot
 
(40 intermediate revisions by 24 users not shown)
Line 1:
The '''maximum coverage problem''' is a classical question in [[computer science]] and, [[computational complexity theory]], and [[operations research]].
It is a problem that is widely taught in [[approximation algorithms]].
 
Line 8:
 
Formally, (unweighted) Maximum Coverage
: Instance: A number <math> k </math> and a collection of sets <math> S = \{S_1, S_2, \ldots, S_m\} </math>.
: Objective: Find a subset <math> S^{'} \subseteq S</math> of sets, such that <math> \left| S^{'} \right| \leq k</math> and the number of covered elements <math> \left| \bigcup_{S_i \in S^{'}}{S_i} \right| </math> is maximized.
The maximum coverage problem is [[NP-hard]], and cannot be approximated to within <math>1 - \frac{1}{e} + o(1) \approx 0.632</math> under standard assumptions.
This result essentially matches the approximation ratio achieved by the generic greedy algorithm used for [[Submodular_set_functionSubmodular set function#Optimization_problemsOptimization problems|maximization of submodular functions with a cardinality constraint]].<ref name="NVF"> [[George Nemhauser|G. L. Nemhauser]], L. A. Wolsey and M. L. Fisher. An analysis of approximations for maximizing submodular set functions I, Mathematical Programming 14 (1978), 265–294</ref>
 
==ILP formulation==
The maximum coverage problem can be formulated as the following [[integer linear program]].
 
{| cellpadding="5"
:maximize <math>\sum_{e_j \in E} y_j</math>. (maximizing the sum of covered elements).
:subject| tomaximize || <math> \sumsum_{x_i} e_j \leqin kE} y_j</math>; || (nomaximizing morethe than <math>k</math>sum setsof arecovered selectedelements).
|-
::<math> \sum_{\, e_j \in S_i} x_i \geq y_j </math>; (if <math>y_j > 0 </math> then at least one set <math>e_j \in S_i</math> is selected).
::| subject to || <math>0 \leq y_jsum{x_i} \leq 1k</math>; || (ifno <math>y_j=1</math>more thenthan <math>e_jk</math> issets are coveredselected)
|-
::<math>x_i \in \{0,1\}</math> (if <math>x_i=1</math> then <math>S_i</math> is selected for the cover).
::| || <math> \sum_{\, e_j \in S_i} x_i \geq y_j </math>; || (if <math>y_j > 0 </math> then at least one set <math>e_j \in S_i</math> is selected).
|-
| || <math>y_j \in \{0,1\}</math> || (if <math>y_j=1</math> then <math>e_j</math> is covered)
|-
::| || <math>x_i \in \{0,1\}</math> || (if <math>x_i=1</math> then <math>S_i</math> is selected for the cover).
|}
 
== Greedy algorithm ==
The [[greedy algorithm]] for maximum coverage chooses sets according to one rule: at each stage, choose a set which contains the largest number of uncovered elements. It can be shown that this algorithm achieves an approximation ratio of <math>1 - \frac{1}{e}</math>.<ref>{{cite book | last=Hochbaum, D| first=Dorit S. | author-link=Dorit S. (Hochbaum | editor-first=Dorit S. | editor-last=Hochbaum | year=1997), "| chapter=Approximating coveringCovering and packingPacking problemsProblems: Set coverCover, vertexVertex coverCover, independentIndependent setSet, and relatedRelated problems",Problems in| title=Approximation algorithmsAlgorithms for NP-hardHard problems,Problems | publisher=PWS Publishing Company, | ___location=Boston, 94| isbn=978-143.053494968-6 | pages=94–143}}</ref> ln-approximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for maximum coverage unless <math>P = NP</math>.<ref>{{cite journal | last = Feige | first = Uriel | author-link = Uriel Feige | title = A Threshold of ln ''n'' for Approximating Set Cover | journal = Journal of the ACM | volume = 45 | number = 4 |date=July 1998 | issn = 0004-5411 | pages = 634–652 | doi = 10.1145/285055.285059 | publisher = Association for Computing Machinery | ___location = New York, NY, USA| s2cid = 52827488 | doi-access = free }}</ref>
Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation
algorithm for maximum coverage.<ref>Feige, U., "A threshold of ln n for approximating set cover", J. ACM 45, 634-652. </ref>
 
== Known extensions ==
The inapproximability results apply to all extensions of the maximum coverage problem since they hold the maximum coverage problem as a special case.
 
The Maximum Coverage Problem can be applied to road traffic situations; one such example is selecting which bus routes in a public transportation network should be installed with pothole detectors to maximise coverage, when only a limited number of sensors is available. This problem is a known extension of the Maximum Coverage Problem and was first explored in literature by Junade Ali and Vladimir Dyo.<ref>{{cite book|last1=Ali|first1=Junade|last2=Dyo|first2=Vladimir|title=Proceedings of the 14th International Joint Conference on e-Business and Telecommunications |chapter=Coverage and Mobile Sensor Placement for Vehicles on Predetermined Routes: A Greedy Heuristic Approach |date=2017|volume=2: WINSYS|pages=83–88|doi=10.5220/0006469800830088|url=http://www.scitepress.org/DigitalLibrary/PublicationsDetail.aspx?ID=ddWw1NMB3VI%3d|isbn=978-989-758-261-5}}</ref>
 
== Weighted version ==
Line 35 ⟶ 41:
:maximize <math>\sum_{e \in E} w(e_j) \cdot y_j </math>. (maximizing the weighted sum of covered elements).
:subject to <math> \sum{x_i} \leq k </math>; (no more than <math>k</math> sets are selected).
::<math> \sum_{e_j \in S_i} x_i \geq y_j </math>; (if <math>y_j \geq> 0 </math> then at least one set <math>e_j \in S_i</math> is selected).
::<math>0y_j \leq y_jin \leq {0,1\}</math>; (if <math>y_j=1</math> then <math>e_j</math> is covered)
::<math>x_i \in \{0,1\}</math> (if <math>x_i=1</math> then <math>S_i</math> is selected for the cover).
 
The greedy algorithm for the weighted maximum coverage at each stage chooses a set whichthat contains the maximum weight of uncovered elements. This algorithm achieves an approximation ratio of <math>1 - \frac{1}{e}</math>.<ref name="NVF"/>
 
== Budgeted maximum coverage ==
In the budgeted maximum coverage version, not only does every element <math> e_j </math> have a weight <math>w(e_j)</math>, but also every set <math>S_i</math> has a cost <math>c(S_i)</math>. Instead of <math>k</math> that limits the number of sets in the cover a budget <math>B</math> is given. This budget <math>B</math> limits the weighttotal cost of the cover that can be chosen.
 
:maximize <math>\sum_{e \in E} w(e_j) \cdot y_j </math>. (maximizing the weighted sum of covered elements).
:subject to <math> \sum{c(S_i) \cdot x_i} \leq B </math>; (the cost of the selected sets cannot exceed <math>B</math>).
::<math> \sum_{e_j \in S_i} x_i \geq y_j </math>; (if <math>y_j \geq> 0 </math> then at least one set <math>e_j \in S_i</math> is selected).
::<math>0y_j \leq y_jin \leq {0,1\}</math>; (if <math>y_j=1</math> then <math>e_j</math> is covered)
::<math>x_i \in \{0,1\}</math> (if <math>x_i=1</math> then <math>S_i</math> is selected for the cover).
 
A greedy algorithm will no longer produce solutions with a performance guarantee. Namely, the worst case behavior of this algorithm might be very far from the optimal solution. The approximation algorithm is extended by the following way. First, after findingdefine a solution using themodified greedy algorithm, returnthat selects the betterset of<math>S_i</math> that has the greedybest algorithm'sratio solutionof andweighted theuncovered setelements to cost. Second, among covers of largestcardinality weight.<math>1, Call2, this..., algorithmk-1</math>, find the modifiedbest greedycover algorithmthat does not violate the budget. Second,Call startingthis withcover all<math>H_1</math>. possibleThird, familiesfind ofall setscovers of sizescardinality from<math>k</math> onethat todo (atnot least)violate three,the augmentbudget. Using these solutionscovers withof cardinality <math>k</math> as starting points, apply the modified greedy algorithm. Third, returnmaintaining the best outcover found so far. Call this cover <math>H_2</math>. At the end of allthe augmentedprocess, solutionsthe approximate best cover will be either <math>H_1</math> or <math>H_2</math>. This algorithm achieves an approximation ratio of <math>1- {1/ \over e}</math> for values of <math>k \geq 3</math>. This is the best possible approximation ratio unless <math>NP \subseteq DTIME(n^{O(\log\log n)})</math>.<ref>Khuller,{{Cite S.,journal Moss,|doi A.,= and Naor, J. 1999. [http://dx.doi.org/10.1016/S0020-0190(99)00031-9|title = The budgeted maximum coverage problem].|journal ''Inf.= Process.Information Lett''.Processing Letters|volume = 70,|pages 1= (Apr.39–45|year = 1999|last1 = Khuller|first1 = Samir|last2 = Moss|first2 = Anna|last3 = Naor|first3 = Joseph (Seffi),|citeseerx 39-45= 10.1.1.49.5784}}</ref>
 
== Generalized maximum coverage ==
Line 61 ⟶ 68:
:subject to <math> \sum{c_i(e_j) \cdot y_{ij}} + \sum{c(S_i) \cdot x_i} \leq B </math>; (the cost of the selected sets cannot exceed <math>B</math>).
::<math> \sum_{i} y_{ij} \leq 1 </math>; (element <math>e_j=1</math> can only be covered by at most one set).
::<math> \sum_{S_i} x_i \geq y_{ij} </math>; (if <math>y_j \geq> 0 </math> then at least one set <math>e_j \in S_i</math> is selected).
::<math>y_{ij} \in \{0,1\} </math>; (if <math>y_{ij}=1</math> then <math>e_j</math> is covered by set <math>S_i</math>)
::<math>x_i \in \{0,1\}</math> (if <math>x_i=1</math> then <math>S_i</math> is selected for the cover).
 
=== Generalized maximum coverage algorithm ===
The algorithm uses the concept of residual cost/weight. The residual cost/weight is measured against a tentative solution and itsit is the difference of the cost/weight from the cost/weight gained by a tentative solution.
 
The algorithm has several stages. First, find a solution using greedy algorithm. In each iteration of the greedy algorithm the tentative solution is added the set which contains the maximum residual weight of elements divided by the residual cost of these elements along with the residual cost of the set. Second, compare the solution gained by the first step to the best solution which uses a small number of sets. Third, return the best out of all examined solutions. This algorithm achieves an approximation ratio of <math>1-1/e - o(1)</math>.<ref>Cohen,{{Cite R.journal and|doi Katzir,= L. 2008. [http://dx.doi.org/10.1016/j.ipl.2008.03.017|title = The Generalized Maximum Coverage Problem].|journal ''Inf.= Process.Information Lett''.Processing Letters|volume = 108,|pages 1= (Sep.15–22|year = 2008),|last1 15-22= Cohen|first1 = Reuven|last2 = Katzir|first2 = Liran|citeseerx = 10.1.1.156.2073}}</ref>
 
== Related problems ==
* [[Set cover problem]] is to cover all elements with as few sets as possible.
 
== ReferencesNotes ==
{{Reflist}}
* {{Cite book | last=Vazirani | first=Vijay V. | authorlink=Vijay Vazirani | title=Approximation Algorithms | year=2001 | publisher=Springer-Verlag | isbn=3-540-65367-8 | pages=}}
* [[Uriel Feige]], ''A Threshold of ln <math>n</math> for Approximating Set Cover'', Journal of the ACM (JACM), v.45 n.4, p.&nbsp;634 - 652, July 1998.
 
== External linksReferences ==
* {{Cite book | last=Vazirani | first=Vijay V. | authorlinkauthor-link=Vijay Vazirani | title=Approximation Algorithms | year=2001 | publisher=Springer-Verlag | isbn=978-3-540-65367-87 | pages=}}
 
[[Category:SetFamilies familiesof sets]]
[[Category:NP-complete problems]]