In [[mathematics]] and [[computer science]], the '''method of conditional probabilities'''<ref name='spencer'>{{Citation
| title=Ten lectures on the probabilistic method
| last=Spencer | first=Joel H. | authorlink=Joel Spencer
| url=https://books.google.com/books?id=Kz0B0KkwfVAC
| year=1987
| publisher=SIAM
| isbn=978-0-89871-325-1}}</ref><ref name='raghavan'>{{Citation
| title=Probabilistic construction of deterministic algorithms: approximating packing integer programs
| first=Prabhakar | last=Raghavan | authorlink=Prabhakar Raghavan
| journal=[[Journal of Computer and System Sciences]]
| volume=37
| issue=2
| pages=130–143
| year=1988
| doi=10.1016/0022-0000(88)90003-7
| doi-access=free}}</ref> is a systematic method for converting [[non-constructive]] probabilistic existence proofs into efficient [[Deterministic algorithm|deterministic algorithms]] that explicitly construct the desired object.<ref>[http://algnotes.info/on/background/probabilistic-method/method-of-conditional-probabilities/ The probabilistic method — method of conditional probabilities], blog entry by Neal E. Young, accessed 19/04/2012 and 14/09/2023.</ref>
 
Often, the [[probabilistic method]] is used to prove the existence of mathematical objects with some desired combinatorial properties. The proofs in that method work by showing that a random object, chosen from some probability distribution, has the desired properties with positive probability. Consequently, they are [[nonconstructive proof|nonconstructive]] — they don't explicitly describe an efficient method for computing the desired objects.
 
The method of conditional probabilities {{harv|Erdős|Selfridge|1973}}, {{harv|Spencer|1987}}, {{harv|Raghavan|1988}} converts such a proof, in a "very precise sense", into an efficient [[deterministic algorithm]], one that is guaranteed to compute an object with the desired properties. That is, the method [[derandomization|derandomizes]] the proof. The basic idea is to replace each random choice in a random experiment by a deterministic choice, so as to keep the conditional probability of failure, given the choices so far, below 1.
 
The method is particularly relevant in the context of [[randomized rounding]] (which uses the probabilistic method to design [[approximation algorithm]]s).
 
== Overview ==
Raghavan<ref name='raghavan'/> gives this description:
 
: ''We first show the existence of a provably good approximate solution using the [[probabilistic method]]... [We then] show that the probabilistic existence proof can be converted, in a very precise sense, into a deterministic approximation algorithm.''
 
(Raghavan is discussing the method in the context of [[randomized rounding]], but it works with the probabilistic method in general.)
 
[[File:Method of conditional probabilities.png|thumb|450px|border|right|The method of conditional probabilities]]
 
To apply the method to a probabilistic proof, the randomly chosen object in the proof must be one that can be chosen by a random experiment consisting of a sequence of "small" random choices.
== Efficiency ==
 
In a typical application of the method, the goal is to be able to implement the resulting deterministic process by a reasonably efficient algorithm (the word "efficient" usually means one that runs in time [[polynomial time|polynomial]] in the input size), even though typically the number of possible outcomes is huge (exponentially large). For example, consider the task with coin flipping, but extended to ''n'' flips for large ''n''.
 
In the ideal case, given a partial state (a node in the tree), the conditional probability of failure (the label on the node) can be efficiently and exactly computed. (The example above is like this.) If this is so, then the algorithm can select the next node to go to by computing the conditional probabilities at each of the children of the current node, then moving to any child whose conditional probability is less than 1. As discussed above, there is guaranteed to be such a node.
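
As an illustration, here is a minimal Python sketch of this ideal case for the coin-flipping experiment (the function names and the success criterion of getting at least <code>need</code> tails are illustrative assumptions, not taken from the article): at each step it computes the exact conditional failure probability of each child of the current node and moves to the child whose probability is smaller, which is always below 1.

<syntaxhighlight lang="python">
from math import comb

def derandomized_flips(n, need):
    """Deterministically fix n coin flips so that at least `need` of them
    come up tails, by walking down the tree of partial outcomes while
    keeping the conditional probability of failure below 1."""

    def failure_prob(tails_so_far, flips_left):
        # Exact conditional probability of ending with fewer than `need`
        # tails: a binomial tail sum over the remaining flips.
        still_needed = need - tails_so_far
        if still_needed <= 0:
            return 0.0
        if still_needed > flips_left:
            return 1.0
        return sum(comb(flips_left, k) for k in range(still_needed)) / 2 ** flips_left

    tails, outcome = 0, []
    for i in range(n):
        flips_left = n - i - 1
        # At least one child of the current node has conditional failure
        # probability below 1 (the minimum cannot exceed the parent's
        # average); move to the child with the smaller probability.
        if failure_prob(tails + 1, flips_left) <= failure_prob(tails, flips_left):
            outcome.append("tails")
            tails += 1
        else:
            outcome.append("heads")
    return outcome

print(derandomized_flips(3, 2))  # ['tails', 'tails', 'heads']
</syntaxhighlight>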
Unfortunately, in most applications, the conditional probability of failure is not easy to compute efficiently. There are two standard and related techniques for dealing with this:
 
=== Using a conditional expectation ===
Many probabilistic proofs work as follows: they implicitly define a random variable ''Q'', show that (i) the expectation of ''Q'' is at most (or at least) some threshold value, and (ii) in any outcome where ''Q'' is at most (at least) this threshold, the outcome is a success. Then (i) implies that there exists an outcome where ''Q'' is at most (at least) the threshold, and this and (ii) imply that there is a successful outcome. (In the example above, ''Q'' is the number of tails, which should be at least the threshold 1.5. In many applications, ''Q'' is the number of "bad" events (not necessarily disjoint) that occur in a given outcome, where each bad event corresponds to one way the experiment can fail, and the expected number of bad events that occur is less than 1.)
 
In this case, to keep the conditional probability of failure below 1, it suffices to keep the conditional expectation of ''Q'' below (or above) the threshold. To do this, instead of computing the conditional probability of failure, the algorithm computes the conditional expectation of ''Q'' and proceeds accordingly: at each interior node, there is some child whose conditional expectation is at most (at least) the node's conditional expectation; the algorithm moves from the current node to such a child, thus keeping the conditional expectation below (above) the threshold.
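
A tiny Python sketch of this technique on the coin-flipping example (an illustrative assumption, since the choice of ''Q'' depends on the proof at hand): ''Q'' is the number of tails, and its conditional expectation, namely the tails so far plus half the remaining flips, is kept at or above the threshold at every step.

<syntaxhighlight lang="python">
def fix_flips_by_conditional_expectation(n, threshold):
    """Fix n coin flips so that the number of tails Q meets `threshold`,
    assuming the proof's premise E[Q] = n/2 >= threshold holds."""
    tails, outcome = 0, []
    for i in range(n):
        flips_left = n - i - 1
        # Conditional expectation of Q at each child of the current node:
        exp_if_tails = (tails + 1) + flips_left / 2
        exp_if_heads = tails + flips_left / 2
        # Some child's conditional expectation is at least the current
        # node's; move to that child (ties broken toward tails).
        if exp_if_tails >= exp_if_heads:
            outcome.append("tails")
            tails += 1
        else:
            outcome.append("heads")
    assert tails >= threshold  # guaranteed whenever threshold <= n/2
    return outcome

print(fix_flips_by_conditional_expectation(3, 1.5))
</syntaxhighlight>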
 
=== Using a pessimistic estimator ===
In some cases, as a proxy for the exact conditional expectation of the quantity ''Q'', one uses an appropriately tight bound called a ''pessimistic estimator''. The pessimistic estimator is a function of the current state. It should be an upper (or lower) bound for the conditional expectation of ''Q'' given the current state, and it should be non-increasing (or non-decreasing) in expectation with each random step of the experiment. Typically, a good pessimistic estimator can be computed by precisely deconstructing the logic of the original proof.
 
== Example using conditional expectations ==
This example demonstrates the method using conditional expectations. Consider the following lemma: in any graph ''G'' = (''V'', ''E''), the vertices can be colored black or white so that at least half of the edges are ''cut'' (have endpoints of different colors). The probabilistic proof colors each vertex black or white by flipping an independent fair coin; each edge is then cut with probability 1/2, so by [[Expected value#Linearity|linearity of expectation]] the expected number of cut edges is |''E''|/2, and some coloring must cut at least |''E''|/2 edges. To derandomize the proof, first model the random experiment as a sequence of small steps, each of which colors a single vertex.
Next, replace the random choice at each step by a deterministic choice, so as to keep the conditional probability of failure, given the vertices colored so far, below 1. (Here ''failure'' means that, in the end, fewer than |''E''|/2 edges are cut.)
 
In this case, the conditional probability of failure is not easy to calculate. Indeed, the original proof did not calculate the probability of failure directly; instead, the proof worked by showing that the expected number of cut edges was at least |''E''|/2.
 
Let random variable ''Q'' be the number of edges cut. To keep the conditional probability of failure below 1, it suffices to keep the conditional expectation of ''Q'' at or above the threshold |''E''|/2. (This is because, as long as the conditional expectation of ''Q'' is at least |''E''|/2, there must be some still-reachable outcome where ''Q'' is at least |''E''|/2, so the conditional probability of reaching such an outcome is positive.) To keep the conditional expectation of ''Q'' at |''E''|/2 or above, the algorithm will, at each step, color the vertex under consideration so as to ''maximize'' the resulting conditional expectation of ''Q''. This suffices, because there must be some child whose conditional expectation is at least the current state's conditional expectation (and thus at least |''E''|/2).
 
Given that some of the vertices are colored already, what is this conditional expectation? Following the logic of the original proof, the conditional expectation of the number of cut edges is

: (the number of edges cut so far) + (1/2)(the number of edges with at least one endpoint not yet colored).
To maximize this quantity when coloring a vertex ''u'', it suffices to color ''u'' opposite to the majority of its already-colored neighbors. This gives the following algorithm:
 
1. For each vertex ''u'' in ''V'' (in any order):
2. Consider the already-colored neighboring vertices of ''u''.
3. Among these vertices, if more are black than white, then color ''u'' white.
4. Otherwise, color ''u'' black.
 
Because of its derivation, this deterministic algorithm is guaranteed to cut at least half the edges of the given graph. (This makes it a [[Maximum cut#Approximation algorithms|0.5-approximation algorithm for Max-cut]].)
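
Here is a short Python sketch of this algorithm (the graph representation as a vertex list and a list of edge pairs is an assumption made for the example):

<syntaxhighlight lang="python">
def derandomized_max_cut(vertices, edges):
    """Color each vertex opposite to the majority of its already-colored
    neighbors; this keeps the conditional expectation of the number of
    cut edges at or above |E|/2."""
    neighbors = {u: [] for u in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    color = {}
    for u in vertices:  # any fixed order works
        black = sum(1 for w in neighbors[u] if color.get(w) == "black")
        white = sum(1 for w in neighbors[u] if color.get(w) == "white")
        # Coloring u opposite to the majority maximizes the resulting
        # conditional expectation of the number of cut edges.
        color[u] = "white" if black > white else "black"
    cut = sum(1 for u, v in edges if color[u] != color[v])
    assert 2 * cut >= len(edges)  # at least half the edges are cut
    return color, cut

# Example: on a triangle, the algorithm cuts 2 of the 3 edges.
print(derandomized_max_cut([1, 2, 3], [(1, 2), (2, 3), (1, 3)]))
</syntaxhighlight>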
 
== Example using pessimistic estimators ==
This example demonstrates the use of pessimistic estimators by derandomizing a probabilistic proof of the [[Turán's theorem|Turán bound]]: any graph ''G'' = (''V'', ''E'') with average degree ''D'' = 2|''E''|/|''V''| contains an [[Independent set (graph theory)|independent set]] of size at least |''V''|/(''D''+1). The probabilistic proof considers the following random process for constructing an independent set ''S'':
1. Initialize ''S'' to be the empty set.
2. For each vertex ''u'' in ''V'' in random order:
3. If no neighbors of ''u'' are in ''S'', add ''u'' to ''S''.
4. Return ''S''.
Clearly the process computes an independent set. Any vertex ''u'' that is considered before all of its neighbors will be added to ''S''. Thus, letting ''d''(''u'') denote the degree of ''u'', the probability that ''u'' is added to ''S'' is at least 1/(''d''(''u'')+1). By [[Expected value#Linearity|linearity of expectation]], the expected size of ''S'' is at least
 
: <math>\sum_{u\in V} \frac{1}{d(u)+1} ~\ge~ \frac{|V|}{D+1}.</math>
 
(The inequality above follows because 1/(''x''+1) is [[Convex function|convex]] in ''x'', so the left-hand side is minimized, subject to the sum of the degrees being fixed at 2|''E''|, when each ''d''(''u'') = ''D'' = 2|''E''|/|''V''|.) ''QED''
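
A minimal Python sketch of this random process (the adjacency-list representation is an assumption made for the example):

<syntaxhighlight lang="python">
import random

def random_greedy_independent_set(neighbors):
    """The random process from the proof: scan the vertices in a
    uniformly random order, adding each vertex none of whose neighbors
    is already in S. `neighbors` maps each vertex to its adjacency list."""
    order = list(neighbors)
    random.shuffle(order)  # the only random choice in the experiment
    S = set()
    for u in order:
        if not any(w in S for w in neighbors[u]):
            S.add(u)
    return S
</syntaxhighlight>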
To derandomize the proof, each random choice is replaced by a deterministic one that keeps the pessimistic estimator <math>|S^{(t)}| + \sum_{w\in R^{(t)}} \frac{1}{d(w)+1}</math> from decreasing, where ''S''<sup>(''t'')</sup> is the independent set after ''t'' steps, ''R''<sup>(''t'')</sup> is the set of not-yet-considered vertices with no neighbor in ''S''<sup>(''t'')</sup>, and ''N''<sup>(''t'')</sup>(''u'') denotes the neighbors of ''u'' in ''R''<sup>(''t'')</sup>. This gives the following algorithm:
1. Initialize ''S'' to be the empty set.
2. While there exists a not-yet-considered vertex ''u'' with no neighbor in ''S'':
3. Add such a vertex ''u'' to ''S'' where ''u'' minimizes <math>\sum_{w\in N^{(t)}(u)\cup\{u\}} \frac{1}{d(w)+1}</math>.
4. Return ''S''.
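
A Python sketch of this derandomized process (same assumed adjacency-list representation; the hypothetical helper <code>cost(u)</code> computes the sum that the algorithm minimizes):

<syntaxhighlight lang="python">
def derandomized_independent_set(neighbors):
    """Derandomized process: repeatedly add the available vertex u
    minimizing the sum of 1/(d(w)+1) over u and its not-yet-considered
    neighbors, where d is the initial degree and R holds the
    not-yet-considered vertices with no neighbor in S."""
    d = {u: len(ws) for u, ws in neighbors.items()}  # initial degrees
    S = set()
    R = set(neighbors)

    def cost(u):
        # Loss in the pessimistic estimator's sum if u is chosen.
        return sum(1 / (d[w] + 1) for w in (set(neighbors[u]) & R) | {u})

    while R:
        u = min(R, key=cost)  # the minimum cost is always at most 1
        S.add(u)
        R.discard(u)
        R -= set(neighbors[u])  # u's neighbors can no longer join S
    return S
</syntaxhighlight>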
 
Each choice keeps the pessimistic estimator from decreasing, so the returned set has size at least |''V''|/(''D''+1). In this example, the estimator need not be computed explicitly: choosing ''u'' to minimize its initial degree ''d''(''u'') among the available vertices suffices, because then each of the at most ''d''(''u'')+1 terms in the sum is at most 1/(''d''(''u'')+1), so the sum is at most 1. The algorithm simplifies to:
1. Initialize ''S'' to be the empty set.
2. While there exists a vertex ''u'' in the graph with no neighbor in ''S'':
3. Add such a vertex ''u'' to ''S'', where ''u'' minimizes ''d''(''u'') (the initial degree of ''u'').
4. Return ''S''.

By a similar argument, the algorithm can instead choose ''u'' to have minimum degree in the graph that ''remains'' after deleting the chosen vertices and their neighbors, giving the familiar greedy algorithm:
1. Initialize ''S'' to be the empty set.
2. While the remaining graph is not empty:
3. Add a vertex ''u'' to ''S'', where ''u'' has minimum degree in the ''remaining'' graph.
4. Delete ''u'' and all of its neighbors from the graph.
5. Return ''S''.
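
A Python sketch of this final greedy algorithm (a dictionary-of-sets adjacency representation is assumed):

<syntaxhighlight lang="python">
def greedy_independent_set(neighbors):
    """Greedy algorithm: repeatedly take a vertex of minimum degree in
    the remaining graph, then delete it and its neighbors. Returns an
    independent set of size at least |V|/(D+1)."""
    remaining = {u: set(ws) for u, ws in neighbors.items()}
    S = []
    while remaining:
        u = min(remaining, key=lambda v: len(remaining[v]))  # min remaining degree
        S.append(u)
        removed = remaining.pop(u) | {u}  # delete u and its neighbors
        for w in list(remaining):
            if w in removed:
                del remaining[w]
            else:
                remaining[w] -= removed
    return S

# Example: a path a-b-c-d has average degree 3/2; the greedy algorithm
# returns an independent set of size 2 >= 4/(3/2 + 1).
print(greedy_independent_set({"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}))
</syntaxhighlight>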
 
== See also ==
* [[Probabilistic method]]
* [[Derandomization]]
* [[Randomized rounding]]
 
 
== References ==
{{reflist}}
 
* {{Citation
| title = On a combinatorial game
| first1 = Paul | last1 = Erdős | authorlink1= Paul Erdős
| first2 = J. L. | last2 = Selfridge
| journal = [[Journal of Combinatorial Theory, Series A]]
| volume = 14
| issue = 3
| pages = 298–301
| year = 1973
| doi = 10.1016/0097-3165(73)90005-8}}.
 
* {{Citation
| title=Ten lectures on the probabilistic method
| last=Spencer|first=Joel H.|authorlink=Joel Spencer
| url=https://books.google.com/books?id=Kz0B0KkwfVAC
| year=1987
| publisher=SIAM
| isbn=978-0-89871-325-1}}.
 
* {{Citation
| title= Probabilistic construction of deterministic algorithms: approximating packing integer programs
| first = Prabhakar | last = Raghavan | authorlink=Prabhakar Raghavan
| journal=[[Journal of Computer and System Sciences]]
| volume=37
| issue=2
| pages=130–143
| year = 1988
| doi = 10.1016/0022-0000(88)90003-7}}.
 
== Further reading ==
The method of conditional probabilities is explained in several textbooks:
 
* {{Cite book |first1=Noga |last1=Alon |authorlink1=Noga Alon
| first2=Joel |last2=Spencer |authorlink2=Joel Spencer
| series=Wiley-Interscience Series in Discrete Mathematics and Optimization
| title=The probabilistic method
| url=https://books.google.com/books?id=q3lUjheWiMoC&q=%22method+of+conditional+probabilities%22#v=snippet&q=%22method%20of%20conditional%20probabilities%22&f=false
| year=2008
| edition=Third
| ___location=Hoboken, NJ
| isbn=978-0-470-17020-5
| pages=250 et seq.
| mr=2437651}} (cited pages are in the 2nd edition, {{ISBN|9780471653981}})
 
* {{Cite book
| first1=Rajeev |last1=Motwani |authorlink1=Rajeev Motwani
| first2=Prabhakar |last2=Raghavan |authorlink2=Prabhakar Raghavan
| title=Randomized algorithms
|date=25 August 1995 | url=https://books.google.com/books?id=QKVY4mDivBEC&q=%22method+of+conditional+probabilities%22#v=snippet&q=%22method%20of%20conditional%20probabilities%22&f=false
| publisher=[[Cambridge University Press]]
| pages=120–
| isbn=978-0-521-47465-8}}
 
* {{Citation
| first=Vijay |last=Vazirani
| authorlink=Vijay Vazirani
| title=Approximation algorithms
|date=5 December 2002
| url=https://books.google.com/books?id=EILqAmzKgYIC&q=%22method+of+conditional%22#v=snippet&q=%22method%20of%20conditional%22&f=false
| publisher=[[Springer Verlag]]
| pages=130–
| isbn=978-3-540-65367-7}}
<!-- |url=https://books.google.com/books?id=EILqAmzKgYIC -->