In [[mathematics]] and [[computer science]], the [[probabilistic method]] is used to prove the existence of mathematical objects with desired combinatorial properties. The proofs are probabilistic — they work by showing that a random object, chosen from some probability distribution, has the desired properties with positive probability. Consequently, they are [[nonconstructive proof|nonconstructive]] — they don't explicitly describe an efficient method for computing the desired objects.
The '''method of conditional probabilities''' {{harv|Erdös|Selfridge|1973}}, {{harv|Spencer|1987}}, {{harv|Raghavan|1988}} converts such a proof, in a "very precise sense", into an efficient [[deterministic algorithm]], one that is guaranteed to compute an object with the desired properties. That is, the method [[derandomization|derandomizes]] the proof. The basic idea is to replace each random choice in a random experiment by a deterministic choice, so as to keep the conditional probability of failure, given the choices so far, below 1.
The method is particularly relevant in the context of [[randomized rounding]] (which uses the probabilistic method to design [[approximation algorithm]]s).
== Overview ==
{{harv|Raghavan|1988}} gives this description:
: ''We first show the existence of a provably good approximate solution using the [[probabilistic method]]... [We then] show that the probabilistic existence proof can be converted, in a very precise sense, into a deterministic approximation algorithm.''
(Raghavan is discussing the method in the context of [[randomized rounding]], but it works with the probabilistic method in general.)
Here is a trivial example to illustrate the principle.
: '''Lemma:''' ''It is possible to flip three coins so that the number of tails is at least 2.''
: ''Probabilistic proof.'' If the three coins are flipped randomly, the expected number of tails is 1.5. Thus, there must be some outcome (way of flipping the coins) so that the number of tails is at least 1.5. Since the number of tails is an integer, in such an outcome there are at least 2 tails. ''QED''

In this example the random experiment consists of flipping three fair coins. The experiment is illustrated by the rooted tree in the diagram to the right. There are eight outcomes, each corresponding to a leaf in the tree. A trial of the random experiment corresponds to taking a random walk from the root (the top node in the tree, where no coins have been flipped) to a leaf. The successful outcomes are those in which at least two coins came up tails. The interior nodes in the tree correspond to partially determined outcomes, where only 0, 1, or 2 of the coins have been flipped so far.
The method of conditional probabilities replaces the random root-to-leaf walk in the random experiment by a deterministic root-to-leaf walk, where each step is chosen to inductively maintain the following invariant:
:: ''the conditional probability of failure, given the current state, is less than 1.''
In this way, the walk is guaranteed to arrive at a leaf whose label (its conditional probability of failure) is 0, that is, at a successful outcome.
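For illustration, here is a minimal sketch (in Python, not part of the original presentation) of the deterministic walk for the three-coin example: each flip is fixed to whichever outcome keeps the conditional expectation of the number of tails from decreasing. The function names are illustrative.

<syntaxhighlight lang="python">
def conditional_expected_tails(fixed_flips, coins_left):
    """Conditional expectation of the number of tails, given the flips fixed
    so far (1 = tails, 0 = heads) and the number of fair coins left to flip."""
    return sum(fixed_flips) + 0.5 * coins_left

def derandomized_flips(n=3):
    flips = []
    for i in range(n):
        left = n - i - 1
        # Fix the next flip to the outcome whose conditional expectation is
        # largest; this keeps the expectation at or above its initial value n/2.
        tails = conditional_expected_tails(flips + [1], left)
        heads = conditional_expected_tails(flips + [0], left)
        flips.append(1 if tails >= heads else 0)
    return flips

print(derandomized_flips())  # [1, 1, 1]: three tails, so certainly at least 2
</syntaxhighlight>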
== Efficiency ==
In a typical application of the method, the goal is to be able to implement the resulting deterministic process by a reasonably efficient algorithm (formally, one taking [[polynomial time|time polynomial]] in the input size), even though typically the number of possible outcomes is huge (exponentially large). (E.g., consider the example above, but extended to ''n'' flips for large ''n''.)
In this case, to keep the conditional probability of failure below 1, it suffices to keep the conditional expectation of ''Q'' below (or above) the threshold. To do this, instead of computing the conditional probability of failure, the algorithm computes the conditional expectation of ''Q'' and proceeds accordingly: at each interior node, there is some child whose conditional expectation is at most (at least) the node's conditional expectation; the algorithm moves from the current node to such a child, thus keeping the conditional expectation below (above) the threshold.
* '''Using a pessimistic estimator:''' In some cases, as a proxy for the exact conditional expectation of the quantity ''Q'', one uses an appropriately tight bound called a [[pessimistic estimator]]. The pessimistic estimator is a function of the current state. It should upper (or lower) bound the conditional expectation of ''Q'' given the current state, and it should be non-increasing (or non-decreasing) in expectation with each random step of the experiment. Typically, a good pessimistic estimator can be computed by precisely deconstructing the logic of the original proof.
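In outline the resulting deterministic procedure is the same in all of these variants; only the quantity that guides each choice changes. The following is a generic sketch (in Python; the function and parameter names are illustrative assumptions, not standard terminology), written for the case where the guiding quantity is to be kept from increasing.

<syntaxhighlight lang="python">
def derandomize(initial_state, steps, apply_choice, estimator):
    """Replace each random step of an experiment by a deterministic choice.

    `steps` lists, for each step, the available choices; `apply_choice(state, c)`
    returns the state after choice c; `estimator` is the conditional expectation
    to be kept small, or a pessimistic estimator for it.  Choosing the option
    that minimizes the estimator keeps it from increasing: for an exact
    conditional expectation the current value equals the average over a random
    choice, and for a pessimistic estimator it is at least that average, so the
    minimum over the choices cannot exceed it.
    """
    state = initial_state
    for choices in steps:
        state = min((apply_choice(state, c) for c in choices), key=estimator)
    return state
</syntaxhighlight>

For a quantity that must be kept from ''decreasing'' (as in the examples below), <code>min</code> is replaced by <code>max</code>.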
== Example using conditional expectations ==
This example demonstrates the method of conditional probabilities using a conditional expectation.
=== Max-Cut Lemma ===
Given any undirected [[Graph (mathematics)|graph]] ''G'' = (''V'', ''E''), the [[Max cut]] problem is to color each vertex of the graph with one of two colors (say black or white) so as to maximize the number of edges whose endpoints have different colors. (Say such an edge is ''cut''.)
: '''Lemma:''' In any graph ''G'' = (''V'', ''E''), some coloring cuts at least |''E''|/2 edges.
=== Probabilistic proof of Max-Cut lemma ===
Color each vertex black or white by flipping a fair coin. By calculation, for any edge ''e'' in ''E'', the probability that it is cut is 1/2. Thus, by [[Expected value#Linearity|linearity of expectation]], the expected number of edges cut is |''E''|/2. Hence, there exists a coloring that cuts at least |''E''|/2 edges. ''QED''
=== The method of conditional probabilities with conditional expectations ===
To apply the method of conditional probabilities, first model the random experiment as a sequence of small random steps. In this case it is natural to consider each step to be the choice of color for a particular vertex (so there are |''V''| steps).
Given that some of the vertices are colored already, what is this conditional expectation? Following the logic of the original proof, the conditional expectation of the number of cut edges is
:: (''the number of edges cut so far'', that is, edges with both endpoints colored, and colored differently)
:: + (1/2)*(''the number of edges with at least one endpoint not yet colored'').
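Concretely, this conditional expectation can be computed directly from a partial coloring. The following sketch (in Python) assumes the graph is given as a list of edges and the partial coloring as a dictionary from colored vertices to their colors; these representations are illustrative.

<syntaxhighlight lang="python">
def conditional_expected_cut(edges, coloring):
    """Edges already cut, plus half an edge for each edge that still has an
    uncolored endpoint (such an edge is cut with probability 1/2)."""
    total = 0.0
    for u, v in edges:
        if u in coloring and v in coloring:
            total += 1 if coloring[u] != coloring[v] else 0
        else:
            total += 0.5
    return total
</syntaxhighlight>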
=== Algorithm ===
The algorithm colors each vertex to maximize the resulting value of the above conditional expectation. This is guaranteed to keep the conditional expectation at |''E''|/2 or above, and so is guaranteed to keep the conditional probability of failure below 1, which in turn guarantees a successful outcome. By calculation, the algorithm simplifies to the following:
1. For each vertex ''u'' of ''V'' (in any order):
2. Consider the already-colored neighboring vertices of ''u''.
3. If more of these neighbors are colored black than white, color ''u'' white.
4. Otherwise, color ''u'' black.
Because of its derivation, this deterministic algorithm is guaranteed to cut at least half the edges of the given graph. (This makes it a 0.5-[[approximation algorithm]] for Max-cut.)
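The following sketch (in Python, assuming the graph is given as an adjacency list, that is, a dictionary mapping each vertex to its neighbors; the representation and names are illustrative) illustrates this simplified algorithm.

<syntaxhighlight lang="python">
def greedy_cut_coloring(adj):
    """Color each vertex opposite to the majority color among its
    already-colored neighbors (ties go to black); cuts at least |E|/2 edges."""
    color = {}
    for u in adj:  # any fixed order of the vertices works
        black = sum(1 for w in adj[u] if color.get(w) == 'black')
        white = sum(1 for w in adj[u] if color.get(w) == 'white')
        color[u] = 'white' if black > white else 'black'
    return color

# Example: a 4-cycle a-b-c-d-a; the returned coloring cuts all four edges.
adj = {'a': ['b', 'd'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c', 'a']}
print(greedy_cut_coloring(adj))
</syntaxhighlight>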
== Example using pessimistic estimators ==
The next example demonstrates the use of pessimistic estimators.
=== Turán's theorem ===
One way of stating [[Turán's theorem]] is the following:
: Any graph ''G'' = (''V'', ''E'') contains an [[Independent set (graph theory)|independent set]] of size at least |''V''|/(''D''+1), where ''D'' = 2|''E''|/|''V''| is the average degree of the graph.
=== Probabilistic proof of Turán's theorem ===
Consider the following random process for constructing an independent set ''S'':
1. Initialize ''S'' to be the empty set.
2. For each vertex ''u'' of ''V'', in a uniformly random order:
3. If no neighbors of ''u'' are in ''S'', add ''u'' to ''S''.
4. Return ''S''.
Clearly the process computes an independent set. Any vertex ''u'' that is considered before all of its neighbors will be added to ''S''. Thus, letting ''d''(''u'') denote the degree of ''u'', the probability that ''u'' is added to ''S'' is at least <math>1/(d(u)+1)</math>. By [[Expected value#Linearity|linearity of expectation]], the expected size of ''S'' is at least
: <math>\sum_{u\in V} \frac{1}{d(u)+1} ~\ge~\frac{|V|}{D+1}.</math>
(The inequality above follows because 1/(''x''+1) is [[Convex function|convex]] in ''x'', so the left-hand side is minimized, subject to the sum of the degrees being fixed at 2|''E''|, when each ''d''(''u'') = ''D'' = 2|''E''|/|''V''|.) ''QED''
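The random process itself is easy to simulate. The following sketch (in Python, assuming an adjacency-list representation; the example graph is illustrative) estimates the expected size of ''S'' and compares it with the bound |''V''|/(''D''+1).

<syntaxhighlight lang="python">
import random

def random_greedy_independent_set(adj):
    """Consider the vertices in a uniformly random order, adding a vertex to S
    whenever none of its neighbors is already in S."""
    S = set()
    order = list(adj)
    random.shuffle(order)
    for u in order:
        if not any(w in S for w in adj[u]):
            S.add(u)
    return S

# Example: the 5-cycle, where D = 2, so the bound is 5/3.
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [0, 3]}
D = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
average = sum(len(random_greedy_independent_set(adj)) for _ in range(1000)) / 1000
print(average, ">=", len(adj) / (D + 1))
</syntaxhighlight>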
=== The method of conditional probabilities using pessimistic estimators ===
In this case, the random process has |''V''| steps. Each step considers some not-yet considered vertex ''u'' and adds ''u'' to ''S'' if none of its neighbors have yet been added. Let random variable ''Q'' be the number of vertices added to ''S''. The proof shows that ''E''[''Q''] ≥ |''V''|/(''D''+1).
Given that the first ''t'' steps have been taken, let <math>S^{(t)}</math> denote the vertices added so far. Let <math>R^{(t)}</math> denote those vertices that have not yet been considered, and that have no neighbors in <math>S^{(t)}</math>. Given the first ''t'' steps, following the reasoning in the original proof, any given vertex ''w'' in <math>R^{(t)}</math> has conditional probability at least <math>1/(d(w)+1)</math> of being added to ''S'', so the conditional expectation of ''Q'' is ''at least''
: <math>|S^{(t)}| ~+~ \sum_{w\in R^{(t)}} \frac{1}{d(w)+1}. </math>
Let <math>Q^{(t)}</math> denote the above quantity, which is called a '''pessimistic estimator''' for the conditional expectation.
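As a concrete illustration, the pessimistic estimator can be computed as follows (a sketch in Python, assuming an adjacency-list graph, the set ''S'' of vertices added so far, and the set ''R'' of vertices that are still available; these names are illustrative).

<syntaxhighlight lang="python">
def pessimistic_estimator(adj, S, R):
    """|S| plus the sum of 1/(d(w)+1) over the vertices w that have not yet
    been considered and have no neighbor in S, where d(w) is the degree of w
    in the original graph."""
    return len(S) + sum(1.0 / (len(adj[w]) + 1) for w in R)
</syntaxhighlight>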
=== Algorithm maximizing the pessimistic estimator ===
The algorithm below chooses each vertex ''u'' to maximize the resulting pessimistic estimator. By the previous considerations, this keeps the pessimistic estimator from decreasing and guarantees a successful outcome.
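The following sketch (in Python, assuming an adjacency-list graph; the names are illustrative) implements this rule: at each step it adds the available vertex whose addition removes the smallest total weight <math>\textstyle\sum_w 1/(d(w)+1)</math> from the estimator. Some available vertex always contributes total weight at most 1 (this follows from the analysis in the next subsection), so this choice never decreases the estimator.

<syntaxhighlight lang="python">
def independent_set_max_estimator(adj):
    """Repeatedly add the available vertex that maximizes the resulting
    pessimistic estimator; returns an independent set of size >= |V|/(D+1)."""
    deg = {u: len(adj[u]) for u in adj}   # degrees in the original graph
    S = set()
    R = set(adj)                          # not yet considered, no neighbor in S

    def lost_weight(u):
        # terms that drop out of the estimator when u is added to S
        return sum(1.0 / (deg[w] + 1) for w in (set(adj[u]) & R) | {u})

    while R:
        u = min(R, key=lost_weight)       # maximizes the resulting estimator
        S.add(u)
        R -= set(adj[u]) | {u}            # u and its neighbors are no longer available
    return S
</syntaxhighlight>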
=== Algorithms that don't maximize the pessimistic estimator ===
For the method of conditional probabilities to work, it suffices if the algorithm keeps the pessimistic estimator from decreasing (or increasing, as appropriate). The algorithm does not necessarily have to maximize (or minimize) the pessimistic estimator. This gives some flexibility in deriving the algorithm. The next two algorithms illustrate this.
The first algorithm:

1. Initialize ''S'' to be the empty set.
2. While there exists a not-yet-considered vertex ''u'' with no neighbor in ''S'':
3. Among such vertices, add to ''S'' a vertex ''u'' that minimizes ''d''(''u''), its degree in the original graph.
4. Return ''S''.

The second algorithm:

1. Initialize ''S'' to be the empty set.
2. While the remaining graph is not empty:
3. Add to ''S'' a vertex ''u'' of minimum degree in the remaining graph.
4. Delete ''u'' and all of its neighbors from the remaining graph.
5. Return ''S''.
Each algorithm is analyzed with the same pessimistic estimator as before. With each step of either algorithm, the net increase in the pessimistic estimator is
: <math>1 - \sum_{w\in N^{(t)}(u)\cup\{u\}} \frac{1}{d(w)+1},</math>
where <math>N^{(t)}(u)</math> denotes the neighbors of ''u'' in the remaining graph (that is, in <math>R^{(t)}</math>).
For the first algorithm, the net increase is non-negative because, by the choice of ''u'',
: <math>\sum_{w\in N^{(t)}(u)\cup\{u\}} \frac{1}{d(w)+1} \le (d(u)+1) \frac{1}{d(u)+1} = 1 </math>,
where ''d''(''u'') is the degree of ''u'' in the original graph.
For the second algorithm, the net increase is non-negative because, by the choice of ''u'',
: <math>\sum_{w\in N^{(t)}(u)\cup\{u\}} \frac{1}{d(w)+1} \le (d'(u)+1) \frac{1}{d'(u)+1} = 1 </math>,
where <math>d'(u)</math> is the degree of ''u'' in the remaining graph.
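For example, the second of these algorithms (repeatedly taking a vertex of minimum degree in the remaining graph) can be implemented as follows (a sketch in Python, assuming an adjacency-list graph; the names are illustrative).

<syntaxhighlight lang="python">
def min_degree_independent_set(adj):
    """Repeatedly add a vertex of minimum degree in the remaining graph to S,
    then delete it and its neighbors from the remaining graph."""
    remaining = {u: set(adj[u]) for u in adj}   # adjacency of the remaining graph
    S = set()
    while remaining:
        u = min(remaining, key=lambda v: len(remaining[v]))
        S.add(u)
        deleted = remaining.pop(u) | {u}
        for v in list(remaining):
            if v in deleted:
                del remaining[v]
            else:
                remaining[v] -= deleted         # remove edges to deleted vertices
    return S
</syntaxhighlight>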
== See also ==
* [[Probabilistic method]]
* [[Derandomization]]
== References ==
* {{Citation
| title = On a combinatorial game
| first1 = Paul | last1 = Erdös | authorlink1= Paul Erdös
| first2 = J. L. | last2 = Selfridge
| journal = [[Journal of Combinatorial Theory, Series A]]
| volume = 14
| issue = 3
| year = 1973
| doi = 10.1016/0097-3165(73)90005-8}}.
* {{Citation
| title=Ten lectures on the probabilistic method
| last=Spencer|first=Joel H.|authorlink=Joel Spencer
| url=http://books.google.com/books?id=Kz0B0KkwfVAC
| year=1987
| publisher=SIAM
| isbn=978-0-89871-325-1}}
* {{Citation
| title= Probabilistic construction of deterministic algorithms: approximating packing integer programs
| first = Prabhakar | last = Raghavan | authorlink=Prabhakar Raghavan
| journal=[[Journal of Computer and System Sciences]]
| volume=37
| issue=2
| pages=130–143
| year = 1988
| doi = 10.1016/0022-0000(88)90003-7}}.
== Further reading ==
* {{Cite book |first1=Noga |last1= Alon |authorlink1=Noga Alon
| first2=Joel |last2=Spencer |authorlink2=Joel Spencer
| series=Wiley-Interscience Series in Discrete Mathematics and Optimization
| title=The probabilistic method
| url=http://books.google.com/books?id=q3lUjheWiMoC&q=%22method+of+conditional+probabilities%22#v=snippet&q=%22method%20of%20conditional%20probabilities%22&f=false
| year=2008
| edition=third
| publisher=John Wiley and Sons
| ___location=Hoboken, NJ
| isbn=978-0-470-17020-5
| pages=250 et seq. (second edition)
| mr=2437651 }}
* {{Cite book
| first1=Rajeev |last1=Motwani |authorlink1=Rajeev Motwani
| first2=Prabhakar |last2=Raghavan |authorlink2=Prabhakar Raghavan
| title=Randomized algorithms
| url=http://books.google.com/books?id=QKVY4mDivBEC&q=%22method+of+conditional+probabilities%22#v=snippet&q=%22method%20of%20conditional%20probabilities%22&f=false
| publisher=[[Cambridge University Press]]
| pages=120-
| isbn=978-0-521-47465-8}}
* {{Citation
| first=Vijay | last=Vazirani
| authorlink=Vijay Vazirani
| title=Approximation algorithms
| url=http://books.google.com/books?id=EILqAmzKgYIC&q=%22method+of+conditional%22#v=snippet&q=%22method%20of%20conditional%22&f=false
| publisher=[[Springer Verlag]]
| pages=130-
| isbn=978-3-540-65367-7}} <!-- |url=http://books.google.com/books?id=EILqAmzKgYIC -->
<!-- book references generated by http://reftag.appspot.com -->
== External links ==
* [http://greedyalgs.info/blog/method-of-conditional-probabilities/ The probabilistic method — method of conditional probabilities], blog entry by Neal E. Young, accessed 19/04/2012.
{{DEFAULTSORT:Method Of Conditional Probabilities}}
[[Category:Algorithms]]
[[Category:Probabilistic arguments]]