|
|
(421 intermediate revisions by more than 100 users not shown) |
Line 1: |
Line 1: |
| The '''set covering problem''' ('''SCP''') is a classical question in [[combinatorics]], [[computer science]] and [[Computational complexity theory|complexity theory]].
| | This is a preview for the new '''MathML rendering mode''' (with SVG fallback), which is availble in production for registered users. |
|
| |
|
| It is a problem "whose study has led to the development of fundamental techniques for the entire field" of [[approximation algorithms]].<ref>{{harvtxt|Vazirani|2001|p=15}}</ref> It was also one of [[Karp's 21 NP-complete problems]] shown to be [[NP-complete]] in 1972.
| | If you would like use the '''MathML''' rendering mode, you need a wikipedia user account that can be registered here [[https://en.wikipedia.org/wiki/Special:UserLogin/signup]] |
| | * Only registered users will be able to execute this rendering mode. |
| | * Note: you need not enter a email address (nor any other private information). Please do not use a password that you use elsewhere. |
|
| |
|
| Given a set of elements <math>\{1,2,...,m\}</math> (called the universe) and a set <math>S</math> of <math>n</math> sets whose union equals the universe, the set cover problem is to identify the smallest subset of <math>S</math> whose union equals the universe. For example, consider the universe <math>U = \{1, 2, 3, 4, 5\}</math> and the set of sets <math>S = \{\{1, 2, 3\}, \{2, 4\}, \{3, 4\}, \{4, 5\}\}</math>. Clearly the union of <math>S</math> is <math>U</math>. However, we can cover all of the elements with the following, smaller number of sets: <math>\{\{1, 2, 3\}, \{4, 5\}\}</math>.
| | Registered users will be able to choose between the following three rendering modes: |
|
| |
|
| More formally, given a universe <math>\mathcal{U}</math> and a family <math>\mathcal{S}</math> of subsets of <math>\mathcal{U}</math>,
| | '''MathML''' |
| a ''cover'' is a subfamily <math>\mathcal{C}\subseteq\mathcal{S}</math> of sets whose union is <math>\mathcal{U}</math>. In the set covering [[decision problem]], the input is a pair <math>(\mathcal{U},\mathcal{S})</math> and an integer <math>k</math>; the question is whether
| | :<math forcemathmode="mathml">E=mc^2</math> |
| there is a set covering of size <math>k</math> or less. In the set covering [[optimization problem]], the input is a pair <math>(\mathcal{U},\mathcal{S})</math>, and the task is to find a set covering that uses the fewest sets.
| |
|
| |
|
| The decision version of set covering is [[NP-complete]], and the optimization version of set cover is [[NP-hard]] .{{sfn |Korte|Vygen|2012|p=414}}
| | <!--'''PNG''' (currently default in production) |
| | :<math forcemathmode="png">E=mc^2</math> |
|
| |
|
| If additionally, you want to minimize the cost of the sets, it becomes Weighted Set Cover Problem. Otherwise, its an Un-weighted Set Cover Problem with all costs equal to one.
| | '''source''' |
| | :<math forcemathmode="source">E=mc^2</math> --> |
|
| |
|
| {{Covering-Packing Problem Pairs}}
| | <span style="color: red">Follow this [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering link] to change your Math rendering settings.</span> You can also add a [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering-skin Custom CSS] to force the MathML/SVG rendering or select different font families. See [https://www.mediawiki.org/wiki/Extension:Math#CSS_for_the_MathML_with_SVG_fallback_mode these examples]. |
|
| |
|
| ==Integer linear program formulation== | | ==Demos== |
| The minimum set cover problem can be formulated as the following [[integer linear program]] (ILP).<ref>{{harvtxt|Vazirani|2001|p=108}}</ref>
| |
| {|
| |
| | minimize
| |
| | <math>\sum_{S \in \mathcal S} x_S</math>
| |
| |
| |
| | (minimize the number of sets)
| |
| |-
| |
| | subject to
| |
| | <math>\sum_{S\colon e \in S} x_S \geqslant 1 </math>
| |
| | for all <math>e\in \mathcal U</math>
| |
| | (cover every element of the universe)
| |
| |-
| |
| |
| |
| | <math>x_S \in \{0,1\}</math>
| |
| | for all <math>S\in \mathcal S</math>.
| |
| | (every set is either in the set cover or not)
| |
| |}
| |
| This ILP belongs to the more general class of ILPs for [[covering problem]]s.
| |
| The [[Linear programming relaxation#Approximation and integrality gap|integrality gap]] of this ILP is at most <math>\scriptstyle \log n</math>, so its [[Linear programming relaxation|relaxation]] gives a factor-<math>\scriptstyle \log n</math> [[approximation algorithm]] for the minimum set cover problem (where <math>\scriptstyle n</math> is the size of the universe).<ref>{{harvtxt|Vazirani|2001|pp=110–112}}</ref>
| |
|
| |
|
| == Hitting set formulation ==
| | Here are some [https://commons.wikimedia.org/w/index.php?title=Special:ListFiles/Frederic.wang demos]: |
| Set covering is equivalent to the '''hitting set problem'''. It is easy to see this by observing that an instance of set covering can
| |
| be viewed as an arbitrary [[bipartite graph]], with sets represented by vertices on the left, the universe represented by vertices on the
| |
| right, and edges representing the inclusion of elements in sets. The task is then to find a minimum cardinality subset of left-vertices which covers all of the right-vertices. In the Hitting set problem, the objective is to cover the left-vertices using a minimum subset of the right vertices. Converting from one problem to the other is therefore achieved by interchanging the two sets of vertices.
| |
|
| |
|
| == Greedy algorithm ==
| |
|
| |
|
| The [[greedy algorithm]] for set covering chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. It can be shown<ref>Chvatal, V. [http://www.jstor.org/stable/3689577 A Greedy Heuristic for the Set-Covering Problem]. Mathematics of Operations Research
| | * accessibility: |
| Vol. 4, No. 3 (Aug., 1979), pp. 233-235</ref> that this algorithm achieves an approximation ratio of <math>H(s)</math>, where <math>s</math> is the size of the set to be covered, <math>H(n)</math> is the <math>n</math>-th [[harmonic number]]:
| | ** Safari + VoiceOver: [https://commons.wikimedia.org/wiki/File:VoiceOver-Mac-Safari.ogv video only], [[File:Voiceover-mathml-example-1.wav|thumb|Voiceover-mathml-example-1]], [[File:Voiceover-mathml-example-2.wav|thumb|Voiceover-mathml-example-2]], [[File:Voiceover-mathml-example-3.wav|thumb|Voiceover-mathml-example-3]], [[File:Voiceover-mathml-example-4.wav|thumb|Voiceover-mathml-example-4]], [[File:Voiceover-mathml-example-5.wav|thumb|Voiceover-mathml-example-5]], [[File:Voiceover-mathml-example-6.wav|thumb|Voiceover-mathml-example-6]], [[File:Voiceover-mathml-example-7.wav|thumb|Voiceover-mathml-example-7]] |
| | ** [https://commons.wikimedia.org/wiki/File:MathPlayer-Audio-Windows7-InternetExplorer.ogg Internet Explorer + MathPlayer (audio)] |
| | ** [https://commons.wikimedia.org/wiki/File:MathPlayer-SynchronizedHighlighting-WIndows7-InternetExplorer.png Internet Explorer + MathPlayer (synchronized highlighting)] |
| | ** [https://commons.wikimedia.org/wiki/File:MathPlayer-Braille-Windows7-InternetExplorer.png Internet Explorer + MathPlayer (braille)] |
| | ** NVDA+MathPlayer: [[File:Nvda-mathml-example-1.wav|thumb|Nvda-mathml-example-1]], [[File:Nvda-mathml-example-2.wav|thumb|Nvda-mathml-example-2]], [[File:Nvda-mathml-example-3.wav|thumb|Nvda-mathml-example-3]], [[File:Nvda-mathml-example-4.wav|thumb|Nvda-mathml-example-4]], [[File:Nvda-mathml-example-5.wav|thumb|Nvda-mathml-example-5]], [[File:Nvda-mathml-example-6.wav|thumb|Nvda-mathml-example-6]], [[File:Nvda-mathml-example-7.wav|thumb|Nvda-mathml-example-7]]. |
| | ** Orca: There is ongoing work, but no support at all at the moment [[File:Orca-mathml-example-1.wav|thumb|Orca-mathml-example-1]], [[File:Orca-mathml-example-2.wav|thumb|Orca-mathml-example-2]], [[File:Orca-mathml-example-3.wav|thumb|Orca-mathml-example-3]], [[File:Orca-mathml-example-4.wav|thumb|Orca-mathml-example-4]], [[File:Orca-mathml-example-5.wav|thumb|Orca-mathml-example-5]], [[File:Orca-mathml-example-6.wav|thumb|Orca-mathml-example-6]], [[File:Orca-mathml-example-7.wav|thumb|Orca-mathml-example-7]]. |
| | ** From our testing, ChromeVox and JAWS are not able to read the formulas generated by the MathML mode. |
|
| |
|
| :<math> H(n) = \sum_{k=1}^{n} \frac{1}{k} \le \ln{n} +1</math>
| | ==Test pages == |
|
| |
|
| This greedy algorithm actually achieves an approximation ratio of <math>H(s^\prime)</math> where <math>s^\prime</math> is the maximum cardinality set of <math>S</math>.
| | To test the '''MathML''', '''PNG''', and '''source''' rendering modes, please go to one of the following test pages: |
| | *[[Displaystyle]] |
| | *[[MathAxisAlignment]] |
| | *[[Styling]] |
| | *[[Linebreaking]] |
| | *[[Unique Ids]] |
| | *[[Help:Formula]] |
|
| |
|
| [[Image:SetCoverGreedy.gif|frame|Tight example for the greedy algorithm with k=3]] | | *[[Inputtypes|Inputtypes (private Wikis only)]] |
| There is a standard example on which the greedy algorithm achieves an approximation ratio of <math>\log_2(n)/2</math>.
| | *[[Url2Image|Url2Image (private Wikis only)]] |
| The universe consists of <math>n=2^{(k+1)}-2</math> elements. The set system consists of <math>k</math> pairwise disjoint sets
| | ==Bug reporting== |
| <math>S_1,\ldots,S_k</math> with sizes <math>2,4,8,\ldots,2^k</math> respectively, as well as two additional disjoint sets <math>T_0,T_1</math>,
| | If you find any bugs, please report them at [https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensions&component=Math&version=master&short_desc=Math-preview%20rendering%20problem Bugzilla], or write an email to math_bugs (at) ckurs (dot) de . |
| each of which contains half of the elements from each <math>S_i</math>. On this input, the greedy algorithm takes the sets
| |
| <math>S_k,\ldots,S_1</math>, in that order, while the optimal solution consists only of <math>T_0</math> and <math>T_1</math>.
| |
| An example of such an input for <math>k=3</math> is pictured on the right.
| |
| | |
| Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover
| |
| (see [[Set cover problem#Inapproximability results|Inapproximability results]] below), under plausible complexity assumptions.
| |
| | |
| == Low-frequency systems ==
| |
| | |
| If each element occurs in at most ''f'' sets, then a solution can be found in polynomial time that approximates the optimum to within a factor of ''f'' using [[Linear programming relaxation|LP relaxation]].<ref>{{harvtxt|Vazirani|2001|pp=118–119}}</ref>
| |
| | |
| == Inapproximability results ==
| |
| | |
| When <math> n</math> refers to the size of the universe, {{harvtxt |Lund|Yannakakis|1994}} showed that set covering cannot be approximated in polynomial time to within a factor of <math>\tfrac{1}{2}\log_2{n} \approx 0.72\ln{n}</math>, unless '''NP''' has [[quasi-polynomial time]] algorithms. Feige (1998) improved this lower bound to <math>\bigl(1-o(1)\bigr)\cdot\ln{n}</math> under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm. {{harvtxt |Raz|Safra|1997}} established a lower bound
| |
| of <math>c\cdot\ln{n}</math>, where <math>c</math> is a constant, under the weaker assumption that '''P'''<math>\not=</math>'''NP'''.
| |
| A similar result with a higher value of <math>c</math> was recently proved by {{harvtxt |Alon|Moshkovitz|Safra|2006}}.
| |
| | |
| == Related problems ==
| |
| * Hitting set is an equivalent reformulation of Set Cover.
| |
| * [[Vertex cover problem|Vertex cover]] is a special case of Hitting Set.
| |
| * [[Edge cover problem|Edge cover]] is a special case of Set Cover. | |
| * [[Set packing]] is the dual problem of Set Cover.
| |
| * [[Maximum coverage problem]] is to choose at most k sets to cover as many elements as possible.
| |
| * [[Dominating set]] is the problem of selecting a set of vertices (the dominating set) in a graph such that all other vertices are adjacent to at least one vertex in the dominating set. The Dominating set problem was shown to be NP complete through a reduction from Set cover.
| |
| * [[Exact cover problem]] is to choose a set cover with no element included in more than one covering set.
| |
| * [[Closest pair of points problem]]
| |
| * [[Nearest neighbor search]]
| |
| | |
| ==See also== | |
| *[[MinHash]]
| |
| *[[Sequence assembly]]
| |
| | |
| ==Notes==
| |
| | |
| {{reflist}}
| |
| | |
| == References ==
| |
| | |
| * {{Citation | last1=Alon | first1=Noga | author1-link=Noga Alon | last2=Moshkovitz | first2=Dana | last3=Safra | first3=Shmuel | author3-link=Shmuel Safra | title=Algorithmic construction of sets for k-restrictions | publisher=ACM | year=2006 | journal=ACM Trans. Algorithms | issn=1549-6325 | volume=2 | issue=2 | pages=153–177 | doi=10.1145/1150334.1150336}}.
| |
| * {{Citation
| |
| | last=Cormen | first=Thomas H. | authorlink=Thomas H. Cormen
| |
| | last2=Leiserson | first2=Charles E. | authorlink2=Charles E. Leiserson
| |
| | last3=Rivest | first3=Ronald L. | authorlink3=Ronald L. Rivest
| |
| | last4=Stein | first4=Clifford | authorlink4=Clifford Stein
| |
| | title=Introduction to Algorithms | year=2001 | publisher=MIT Press and McGraw-Hill | location=Cambridge, Mass.
| |
| | isbn=0-262-03293-7 | pages=1033–1038}}
| |
| * {{Citation | last1=Feige | first1=Uriel | author1-link=Uriel Feige | title=A threshold of ln n for approximating set cover | publisher=ACM | year=1998 | journal=[[Journal of the ACM]] | issn=0004-5411 | volume=45 | issue=4 | pages=634–652 | doi=10.1145/285055.285059}}.
| |
| * {{Citation | last1=Lund | first1=Carsten | author1-link=Carsten Lund | last2=Yannakakis | first2=Mihalis | author2-link=Mihalis Yannakakis | title=On the hardness of approximating minimization problems | publisher=ACM | year=1994 | journal=[[Journal of the ACM]] | issn=0004-5411 | volume=41 | issue=5 | pages=960–981 | doi=10.1145/185675.306789}}.
| |
| * {{Citation | last1=Raz | first1=Ran | author1-link=Ran Raz | last2=Safra | first2=Shmuel | author2-link=Shmuel Safra | title=STOC '97: Proceedings of the twenty-ninth annual ACM symposium on Theory of computing | publisher=ACM | isbn=978-0-89791-888-6 | year=1997 | chapter=A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP | pages=475–484}}.
| |
| * {{cite book
| |
| | last = Vazirani | first = Vijay V.
| |
| | authorlink = Vijay Vazirani
| |
| | title = Approximation Algorithms
| |
| | year = 2001
| |
| | publisher = Springer-Verlag
| |
| | isbn = 3-540-65367-8
| |
| | url = http://www.cc.gatech.edu/fac/Vijay.Vazirani/book.pdf
| |
| | ref = harv
| |
| }}
| |
| * {{cite book
| |
| | last1 = Korte | first1 = Bernhard
| |
| | last2 = Vygen | first2 = Jens
| |
| | title = Combinatorial Optimization: Theory and Algorithms
| |
| | year = 2012
| |
| | isbn = 978-3-642-24487-2
| |
| | edition = 5
| |
| | publisher = Springer
| |
| | ref = harv
| |
| }}
| |
| | |
| == External links ==
| |
| * [http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/set-benchmarks.htm Benchmarks with Hidden Optimum Solutions for Set Covering, Set Packing and Winner Determination]
| |
| * [http://www.csc.kth.se/~viggo/wwwcompendium/node146.html A compendium of NP optimization problems - Minimum Set Cover]
| |
| | |
| {{DEFAULTSORT:Set Cover Problem}}
| |
| [[Category:Set families]]
| |
| [[Category:NP-complete problems]]
| |