Talk:Monty Hall problem/Arguments/Archive 8
 
: The simple solution shows that always switching is a much better strategy than always staying. The conditional solution shows that nothing is better still. But of course nobody would imagine this anyway. This is true if all doors are initially equally likely to hide the car. It doesn't matter whether or not the host has some bias. It's easy to see that the more biased the host is, the more favourable it is to the player. And with a maximally biased host you can easily see (just work out the two possibilities in the Monty Crawl game) that switching is never unfavourable. Yet even in Monty Crawl, always switching only has success rate 2/3. This proves rigorously that there is no better strategy than always switching. (This argument was discovered by Lambiam, it is inspired by game theory, and it makes the explicit computations of Morgan et al. superfluous and allows you also to explain the biased-host case to your Grandma or Grandpa. I have written it up in the longer version [http://www.math.leidenuniv.nl/~gill/mhp-statprob.pdf] of my article in the StatProb encyclopedia [http://statprob.com/encyclopedia/MontyHallProblem.html]. It's also written out on my University of Leiden home page. @Lambiam got a prize from me for finding this proof.) [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 16:48, 14 April 2011 (UTC)
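A minimal simulation sketch of the Monty Crawl claim above (illustrative only; the door numbering, function names and trial counts are my own assumptions, not taken from the discussion): the host always opens the lowest-numbered door he may open, and always switching still wins about 2/3 of the time, while always staying wins about 1/3.

<syntaxhighlight lang="python">
import random

def simulate(always_switch, trials=100_000):
    """Monty Crawl variant: the host always opens the lowest-numbered
    door that is neither the player's door nor the car door."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)   # car hidden uniformly at random
        pick = 0                    # player's initial door (numbering is illustrative)
        opened = min(d for d in range(3) if d != pick and d != car)
        final = pick
        if always_switch:
            final = next(d for d in range(3) if d not in (pick, opened))
        wins += (final == car)
    return wins / trials

print("always switch:", simulate(True))   # ~0.667 even with a maximally biased host
print("always stay:  ", simulate(False))  # ~0.333
</syntaxhighlight>
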
== Course on probability ==
 
Sorry, didn't know of this page. Moved it here.
 
In my course on probability the problem is treated and solved by computing, with Bayes' formula, the conditional probability of the car's location given the number of the door chosen by the contestant and the number of the door opened by Monty. As far as I know this is done in all probability courses. Is the point of discussion here that a different approach is possible? [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 22:38, 6 April 2011 (UTC)
: Most probability courses do this, I believe, because the Monty Hall problem has become a popular example with which to illustrate Bayes' theorem. It then becomes necessary to make assumptions which may or may not be well justifiable, and whose justifiability might well depend on whether you think of probability in a frequentist or subjective manner. Marilyn vos Savant, who made the problem famous, and many others, including mathematicians, probabilists, statisticians, game theorists and mathematical economists, show that the player who always switches wins with probability 2/3, under weaker assumptions than the usual assumptions needed to apply Bayes' theorem. And with arguments which ordinary people can understand and appreciate. So in short: not only probabilists are interested in the Monty Hall problem. It is popular with a large number of people, many of whom have no education in probability at all.
 
: From the mathematical point of view, showing that no conditional probability favours switching is equivalent to showing that the unconditional win chance of 2/3 (attained by always switching) cannot be improved upon. In the fully symmetric case favoured by most probability textbooks, symmetry shows immediately that the unconditional and conditional probabilities must be the same. I think that ordinary folk who are not familiar with conditional probability can also appreciate this argument, though most people would find it not worth the bother. Always switching gives 2/3, always staying gives 1/3. It's pretty inconceivable that a mixed strategy could do better than always switching. There is a short proof of this fact (in the situation where the three doors are equally likely to hide the car, but the host is not equally likely to open either door when he has a choice) which uses the fact that knowing the outcome of the (possibly biased) coin toss which Monty would make if he needed to make one can't be unfavourable to the player. Knowing the outcome puts the player in the Monty Crawl problem: the problem where, if you choose door 1 and the car happened to be behind it, you know which door Monty would open. Now it's easy to see by inspection of the two possible cases (that he opens the door which you would expect in that case, or the door which you do not expect in that case) that also in the Monty Crawl problem it is never unfavourable to switch. But always switching in the Monty Crawl problem still only gives an overall success chance of 2/3. Consequently 2/3 can't be beaten.
 
: In short, there are a lot of reliable sources who think that the MHP doesn't have to be solved by going to Bayes' theorem (including the lady who made it famous), a few reliable sources who think that people who only give the unconditional probability are wrong and Mrs vos Savant is stupid, a lot of probability textbooks which do the conditional probability problem as an exercise to illustrate Bayes' theorem, and a whole lot of other ways to get whichever results you like under whichever assumptions you find natural.
: If you must do it with Bayes, I'd recommend you teach your students Bayes' rule, which not only gets the right answer even faster but also gives insight into why conditional and unconditional probabilities are equal (in the symmetric case), as well as being a powerful and internalizable tool and a beautiful result ... while Bayes' theorem is just the definition of conditional probability applied twice plus the law of total probability. It's easier to figure it out from first principles than to remember it. Bayes' rule, on the other hand, is memorable and surprising by its utter simplicity. You just have to pick up the notion of odds. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 23:11, 6 April 2011 (UTC)
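For reference, a worked sketch of the textbook Bayes'-theorem computation referred to in this thread, under the usual symmetric assumptions (car placed uniformly at random, host equally likely to open either door when he has a choice). The notation is only illustrative: <math>C_i</math> is "car behind door <math>i</math>", <math>X_1</math> is "player picks door 1", <math>H_3</math> is "host opens door 3".

<math>
P(C_2 \mid X_1, H_3) = \frac{P(H_3 \mid C_2, X_1)\,P(C_2)}{\sum_{i=1}^{3} P(H_3 \mid C_i, X_1)\,P(C_i)}
= \frac{1 \cdot \tfrac13}{\tfrac12 \cdot \tfrac13 + 1 \cdot \tfrac13 + 0 \cdot \tfrac13} = \tfrac23 .
</math>

The same answer falls out of the odds form of Bayes' rule in one line: prior odds 2:1 against door 1, Bayes factor 1:1, posterior odds still 2:1 against.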
 
 
Thanks for your extensive answer. I'm not teaching, but just a student. Our professor said that because you know the door numbers of the chosen and the opened door, you have to use conditional probabilities. How come? [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 21:48, 7 April 2011 (UTC)
:There's probability teachers for you. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 21:54, 7 April 2011 (UTC)
 
:: He should tell you *why* you have to, or rather, why it could be wise to do so. There is no *must*. Mathematics advises. It is not about (legal) law, and it is not about morals; it's about wisdom. The reason it would be wise to do so is that this way you can be sure of doing the best you can, overall. The chance of winning the car can be broken down by the situation in which you find yourself. You have the best chance overall if you have the best chance in each situation.
 
:: There are many many other ways of showing that 2/3 overall win chance is the best you can possibly get. If you use one of these other ways, it's a waste of time to use Bayes to do the same thing.
 
:: There is one interpretation of probability in which a sort-of "must" can be deduced. In the Bayesian world we are continually getting information, and if we are self-consistent we are continually updating our beliefs about uncertain things according to the laws of probability. (If we don't update by Bayes we could be tricked with a Dutch book into accepting simultaneously different bets such that whatever happened, we'd lose money.) A Bayesian first chooses a door. Probability 2/3 of missing the car, since all doors are equally likely - according to his knowledge. The host opens a door and shows a goat, but the player doesn't take any notice of the number. Still probability 2/3 that there's no car behind door 1, because whether or not the car is behind door 1, the host is certain to do what he did. Next the player takes notice that it was door 3 the host opened. Still probability 2/3 that there's no car behind door 1, since whether or not the car is behind door 1, the host is for him equally likely to open door 2 or door 3, by symmetry of the player's knowledge.
 
:: Actually, a very smart Bayesian knows in advance that the door numbers will be irrelevant to whether or not his initially chosen door hides the car, by symmetry. So he'll just pick any door and thereafter switch, without taking any notice of the number of his door or of the number of the door opened by the host.
 
:: This brings the smart Bayesian into harmony with the smart frequentist. The smart frequentist knows that he doesn't know the probabilities of the location of the car, nor does he know the probabilities by which the host opens a door. He does know game theory, though. So he'll initially pick his door completely at random and thereafter switch, and he'll get the car with overall probability 2/3. He doesn't know the conditional probability that he got the car (if he did), and he doesn't care about it either. Both of them choose at random and switch because they know nothing. That's wisdom.
 
:: Sorry for another extensive answer. There are in fact many answers. And it's a rather important question.
 
:: And indeed, the problem is almost as simple as 1+1=2. In fact it's just 3=1+2. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 06:32, 10 April 2011 (UTC)
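As a small illustration of the "smart frequentist" strategy described above, a simulation sketch (hypothetical code; the car location and host bias passed in stand for whatever unknown choices the show makes):

<syntaxhighlight lang="python">
import random

def randomize_and_switch(car_door, host_bias, trials=100_000):
    """Pick a door uniformly at random, then always switch.
    car_door and host_bias are unknown to the player; they are fixed here
    only to show that the 2/3 guarantee does not depend on them."""
    wins = 0
    for _ in range(trials):
        pick = random.randrange(3)
        goats = [d for d in range(3) if d != pick and d != car_door]
        # a biased host: when he has a choice, he opens the lower-numbered
        # goat door with probability host_bias, otherwise the other one
        if len(goats) == 1 or random.random() < host_bias:
            opened = min(goats)
        else:
            opened = max(goats)
        final = next(d for d in range(3) if d not in (pick, opened))
        wins += (final == car_door)
    return wins / trials

print(randomize_and_switch(car_door=1, host_bias=0.9))  # ~0.667 whatever the arguments
</syntaxhighlight>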
 
:More explanation is available if you wish. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 21:54, 7 April 2011 (UTC)
 
I'm a little confused by Richard Gill's lengthy explanation. It seems he shows that the average contestant wins with probability 2/3 when switching. That's no surprise, and he might as well have proven that 1+1=2. Neither leads logically to the conditional probability being 2/3. What point am I missing? [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 22:06, 7 April 2011 (UTC)
:I was actually referring to an explanation of my remark above, but although I cannot speak for Gill, I will give you my take on his argument. The point is that, in the normal interpretation of the problem, the situation is perfectly symmetrical with respect to door number. The numbers on the doors tell us nothing, thus we can change 'the host opens door 3' to 'the host opens door 2' without changing the answer, because we have no information that allows us to distinguish between the two doors. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 23:04, 7 April 2011 (UTC)
 
:If you know the door numbers you can ask (for example) what is the conditional probability of winning if a player picks door 1 and the host opens door 3 - and this probability may or may not be the same as the overall average. You should use conditional probabilities in this case because you can do an experiment that will in the limit show this probability (randomly select a door, see which door the host opens, and throw this sample away unless the player picked door 1 and the host opened door 3). If you do this you are experimentally determining the conditional probability, NOT the average probability. If you think that because the average is 2/3 this experiment should come out with the same answer, you might be surprised. If you figure out the conditional probability, and see what it is sensitive to (the initial car location AND how the host picks if the player initially picks the car), you might run your experiment differently. The experiment vos Savant suggested in her third column did not control how the host selects in this case, so unless the results are undifferentiated (by specific case) the experimental answer can be anything from probability 1/2 to probability 1 of winning by switching. However if all 6 of the conditional probabilities are 2/3 the average (clearly) must be 2/3. Richard is inverting this - if the average is 2/3 ''and all the conditionals must be equal'' then all the conditionals must be 2/3 as well. What it means for all the conditionals to be equal is that the problem is symmetrical, so if you know the problem is symmetrical and you know the average probability is 2/3, then you know all the conditionals must be 2/3 as well (it's really just a minor shortcut to figure out the conditionals). -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 23:12, 7 April 2011 (UTC)
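A simulation sketch of the experiment described above (illustrative code, not from a source): keep only the trials in which the player picked door 1 and the host opened door 3, and vary the assumed host behaviour <code>q</code> = P(host opens door 3 | car behind door 1, player picked door 1).

<syntaxhighlight lang="python">
import random

def conditional_estimate(q, trials=300_000):
    """Estimate P(win by switching | player picks door 1, host opens door 3)
    for a host who opens door 3 with probability q when the car is behind door 1."""
    kept = won = 0
    for _ in range(trials):
        car = random.randrange(1, 4)       # car behind door 1, 2 or 3, uniformly
        # player picks door 1; host must show a goat behind door 2 or 3
        if car == 1:
            opened = 3 if random.random() < q else 2
        else:
            opened = 2 if car == 3 else 3
        if opened != 3:
            continue                        # discard: not the case we condition on
        kept += 1
        won += (car == 2)                   # switching from door 1 to door 2 wins
    return won / kept

for q in (0.0, 0.5, 1.0):
    print(q, round(conditional_estimate(q), 3))   # roughly 1.0, 0.667, 0.5 -- i.e. 1/(1+q)
</syntaxhighlight>

The overall (unconditional) win rate of always switching stays at 2/3 in every case; only the conditional estimate moves between 1/2 and 1, as described above.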
 
:::"Minor short cut to figure out the conditionals"! For some reliable sources, it's a <i>major</i> insight which makes even thinking about them completely superfluous! And Rick, you write that you "should" use conditional probabilities... where does this imperative come from??? Mathematics does not tell you what you *must* do. It can only give you advice on what might be wise. You can have good reasons to ignore the advice in many situations. Certainly, you are not obliged to use it. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 08:09, 11 April 2011 (UTC)
 
::::If I understand you correctly, from the symmetry you conclude that all the relevant conditional probabilities are the same, and hence have the same value as the average probability. But why do you want this conclusion if, on the other hand, you state that there is no need to consider the conditional probabilities? And if you want to use the symmetry for some purpose, you would have to mention this in your argument, and I get the impression you don't want to mention this. Can you explain this? I take the opportunity to make another remark. I discussed the problem with my fellow students and we have the impression that the solutions mentioned in the simple solution section, with reference to some Devlin, are logically nonsense. The two remaining doors do indeed have probability 2/3 of hiding the car, but this is not only because the chosen door has probability 1/3; it is also because each of these remaining doors itself has probability 1/3 of hiding the car. It seems impossible to conclude from this that the remaining unopened door should have probability 2/3 of hiding the car. Are we right? [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 20:33, 11 April 2011 (UTC)
 
:::::Pardon the intrusion: Handy2000, you are right to say that each of the three doors originally had a chance of 1/3, so the chance of the pair of the ''two remaining host's doors together is 2/3.'' And you know that at least one of those two host's doors definitely must hide a goat, as there is only one car. So you could say that if the host shows a goat behind one of his two doors, that's no news, that it didn't influence the 1/3 chance of your originally chosen door, and that its chance ''has remained 1/3.'' That's right, so in 2 out of 3 cases the host's second door will hide the car, and the probability to win by switching is 2/3, as long as you got no "additional information". That's a fact. And in only 1 out of 3 cases will there be the second goat. That's a fact also.
 
:::::But now comes "the crux" (conditionalists call it "the core") of the MHP: the host "could" have been telling you additional, if bootless, info in showing you a goat; he could have been giving you ''some closer'' knowledge of the "actual" probability to win by switching. If, for example, he always opens just one special door when he has got two goats - say he always opens his door with the lowest number, if ever possible, and never his door with the higher number - then, in exceptionally opening his "other, strictly avoided door with the higher number", he will have shown you by that action that he was ''unable'' to open his preferred door with the lower number, and that it is very likely that his usually preferred door with the lower number "actually" hides the car! And that your chance by switching actually is exceptionally "higher than 2/3", and the chance of your originally chosen door dropped below 1/3 as a consequence. And vice versa. That's the argument that conditionalists like to present: that it could be possible that the host has given us "additional information" on the actual location of the car. But as the host will forever be out of position to indicate any "closer" probability to win by switching outside the fixed range of "at least 1/2 (but never less!) to at most 1 (but never more :-)", all of that will forever remain "bootless and useless closer information" without any relevance to the decision to switch or to stay, and so conditionalists are indeed unable to give you better advice than always to switch, also. Because staying never can be better (see my "Question" below). [[User:Gerhardvalentin|Gerhardvalentin]] ([[User talk:Gerhardvalentin|talk]]) 22:49, 11 April 2011 (UTC)
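A worked-equation sketch of the extreme case sketched above, assuming the player has chosen door 1 and the host always opens door 2 whenever he is free to choose (so he opens door 3 only when forced); here <math>C_i</math> = car behind door <math>i</math> and <math>H_j</math> = host opens door <math>j</math>:

<math>
P(\text{switch wins}\mid H_3)=\frac{P(C_2)P(H_3\mid C_2)}{P(C_1)P(H_3\mid C_1)+P(C_2)P(H_3\mid C_2)}=\frac{\tfrac13\cdot 1}{\tfrac13\cdot 0+\tfrac13\cdot 1}=1,
\qquad
P(\text{switch wins}\mid H_2)=\frac{P(C_3)P(H_2\mid C_3)}{P(C_1)P(H_2\mid C_1)+P(C_3)P(H_2\mid C_3)}=\frac{\tfrac13\cdot 1}{\tfrac13\cdot 1+\tfrac13\cdot 1}=\tfrac12 .
</math>

Both values are at least 1/2, so switching never hurts, and since <math>P(H_2)=\tfrac23</math> and <math>P(H_3)=\tfrac13</math>, always switching still wins overall with probability <math>\tfrac23\cdot\tfrac12+\tfrac13\cdot 1=\tfrac23</math>.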
 
::::::Thank you for your explanation. However, it does not explain why the referred reasoning should be correct. It is not about the answer of 2/3, but about the way it is derived. You say the chance for the originally chosen door "has remained" 1/3. What do you mean by that? Of course all the chances remain the same. Maybe you mean that the conditional probability after the initial choice and the action of the host also has the value 1/3. Am I right? Still, the way Devlin reasons seems incorrect to us. [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 08:51, 12 April 2011 (UTC)
:::::::Handy2000, I just answered on your [http://en.wikipedia.org/w/index.php?title=User_talk:Handy2000 talk page]. Kind regards, [[User:Gerhardvalentin|Gerhardvalentin]] ([[User talk:Gerhardvalentin|talk]]) 13:08, 12 April 2011 (UTC)
 
::::::::On my talk page Gerhardvalentin explains that Devlin is merely trying to give a way of understanding rather than a complete solution. He admits that the resulting probabilities of 2/3 and 0 for the remaining and the opened door are not the original probabilities, but conditional probabilities. The question remains whether Devlin also considers these probabilities to be conditional. At least it looks somewhat misleading to give a way of understanding and present it as a solution, without referring to the conditional nature of the resulting probabilities. Agree? [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 12:33, 13 April 2011 (UTC)
 
::<del>Oh dear Rick, we are back your misunderstanding of what probability means. I am not sure if we are allowed to discus this any more, perhaps it is OK on this page.</del>
 
::Rick, perhaps you could tell me what you understand the term 'probability' to mean.
 
::Richard confirms above that there are two interpretations of probability, Bayesian and frequentist. Although they both should give the same answer to any exact question, it is very confusing to flip interpretation in the middle of an argument. I am happy to discuss the MHP using either model, but let us start with the Bayesian perspective.
 
::To a Bayesian, probability is a state of knowledge. That is the definition of what Bayesian probability means. In the Bayesian model, what is not known does not exist. That is not something that I have just made up; I am sure Richard will confirm it, or you can confirm it using WP or any good textbook. So, if the player does not know where the car is placed, or the host's door-opening strategy, the probability of winning by switching is exactly 2/3. It may be that the car is always placed behind door 2 and the host always opens door 3 but, if we do not know that, it makes no difference to our (Bayesian) probability calculation, which depends ''by definition'' only on the information that we have. The concept of average probability makes no sense; there is just 'the probability', which is based ''only'' on our state of knowledge. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 11:26, 8 April 2011 (UTC)
 
:::<del>Re: [quote deleted], No, that is not allowed here. I have placed a warning on your user page. I strongly advise anyone else who may be tempted to respond in a non-civil manner to resist the temptation.</del> (issue resolved) [[User:Guymacon|Guy Macon]] ([[User talk:Guymacon|talk]]) 06:19, 9 April 2011 (UTC)
 
::::I apologise and have struck out the offending comment and have replaced it with a more appropriate intro. I hope Rick will now respond to the content of my post, which explains much of the pointless disagreement here. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 08:21, 9 April 2011 (UTC)
 
:::@Martin - we have discussed this to death and back numerous times before. In the spirit of a fresh start, I'll try once (only once) more.
 
:::Point 1: We're not talking about interpretations of probability here, but rather what probability is of interest. Is it P(win by switching), or P(win by switching|player picks door 1), or P(win by switching|player picks door 1 and host opens door 3)? These are each distinct probabilities whether we're using frequentist or Bayesian interpretations of probability, i.e. this difference has absolutely no relevance to this question. P(win by switching) is what I'm calling the "average" probability. It is what Carlton, and Grinstead and Snell (and others) call the success of a "switching strategy" (i.e. decide before even picking a door that you're going to switch). In a Bayesian sense (as you seem to be thinking about it) it is the probability of winning by switching knowing the rules of the game, ''before'' going on the show. In a frequentist sense, it is the limit of (number of players who switch and win, plus number of players who stay and lose) / (total number of players) as the number of players grows large. It is what the simplest solution (you have a 2/3 chance of initially selecting a goat, if your initial choice is a goat and you switch you end up with the car, so if you switch you have a 2/3 chance of winning the car) computes - and this probability is 2/3 (assuming the player's initial choice is independent of the initial car location, and the host must open a door showing a goat and must make the offer to switch) whether you're interpreting probability as a frequentist or a Bayesian.
 
:::Point 2: vos Savant's solution (which enumerates three equally likely cases assuming the player picks door 1, with switching losing in one and winning in 2) computes P(win by switching|player picks door 1) - and this probability is 2/3 (assuming the car is uniformly randomly located and the host must open a door showing a goat and the host must make the offer to switch) whether you're a frequentist or a Bayesian. As a Bayesian, this means you're deciding to switch ''after'' picking door 1 but before seeing which door the host opens. As a frequentist, it is the limit of (players who pick door 1 and switch and win, plus players who pick door 1 and stay and lose) / (players who pick door 1).
 
:::Point 3: If you're putting the point of the player's choice ''after'' the host opens a door, the state of knowledge of the contestant includes BOTH the number of the door she originally chose and the number of the door the host opens. This means, for a frequentist or a Bayesian, the probability of interest (assuming the player has picked door 1 and the host has opened door 3) is P(win by switching|player picks door 1 and host opens door 3), NOT simply P(win by switching) and NOT P(win by switching|player picks door 1).
 
:::Point 4: Per Puza et al. (Teaching Statistics, vol 27 number 1, Spring 2007 - quite a nice paper BTW), the (clearly Bayesian) probability of winning by switching given the player picks door 1 and the host opens door 3 is (their notation):
 
::: ( P(C<sub>2</sub>)P(S<sub>1</sub>|C<sub>2</sub>)P(H<sub>3</sub>|C<sub>2</sub>S<sub>1</sub>) ) / ( P(C<sub>1</sub>)P(S<sub>1</sub>|C<sub>1</sub>)P(H<sub>3</sub>|C<sub>1</sub>S<sub>1</sub>) + P(C<sub>2</sub>)P(S<sub>1</sub>|C<sub>2</sub>)P(H<sub>3</sub>|C<sub>2</sub>S<sub>1</sub>) )
 
:::which varies from 0 to 1 depending on what assumptions you make. Once again, this is the probability regardless of whether you're a Bayesian or a frequentist. As a Bayesian it means you don't know for sure your probability of winning by switching if you've picked door 1 and have seen the host open door 3 - unless you know or make some assumption about all of the probabilities involved. As a frequentist, it means (players who pick door 1 and see the host open door 3 and switch and win, plus players who pick door 1 and see the host open door 3 and stay and lose) / (players who pick door 1 and see the host open door 3) will approach something between 0 and 1, but you could do a number of experiments to determine the individual probabilities involved.
 
:::Point 5: (also per Puza et al.) Assuming what they (referring to Morgan et al.) call the "vos Savant" scenario (i.e. the car is uniformly randomly placed, the host must open a door showing a goat, and the host must make the offer to switch) they show this reduces to the 1/(1+''q'') answer, with ''q''=P(H<sub>3</sub>|C<sub>1</sub>S<sub>1</sub>) (i.e. the probability the host opens door 3 if the car is behind door 1 and the player initially chose door 1). Once again, this is the probability regardless of whether you're a Bayesian or a frequentist. As a Bayesian it means you don't know for sure your probability of winning by switching if you've picked door 1 and have seen the host open door 3 (even with these assumptions), but you do know it's between 1/2 and 1. As a frequentist, it means with these assumptions that (players who pick door 1 and see the host open door 3 and switch and win, plus players who pick door 1 and see the host open door 3 and stay and lose) / (players who pick door 1 and see the host open door 3) will approach something between 1/2 and 1.
 
:::Point 6: Further assuming an uninformative prior (all values of this ''q'' are equally likely), Puza et al. end up with 2/3. This is the only thing in this whole discussion where Bayesians and frequentists might conceivably disagree - although even here there is a distinct equivalence. As a frequentist, this would mean that, uniformly randomly selecting ''q'' per trial, (players who pick door 1 and see the host open door 3 and switch and win, plus players who pick door 1 and see the host open door 3 and stay and lose) / (players who pick door 1 and see the host open door 3) will approach 2/3.
 
:::Do you agree with all of these points? If not, please explain (references would be nice). -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 17:05, 9 April 2011 (UTC)
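A small sketch of Points 5 and 6 above (the code and function names are mine, not from Puza et al.): the closed-form conditional probability 1/(1+''q''), and a frequentist-style check that drawing ''q'' uniformly per trial sends the conditional relative frequency to 2/3.

<syntaxhighlight lang="python">
import random

def conditional_win_prob(q):
    """Point 5: under the 'vos Savant' scenario, P(win by switching |
    picked door 1, host opened door 3) = 1/(1+q), with
    q = P(host opens door 3 | car behind door 1, player picked door 1)."""
    return 1 / (1 + q)

def long_run_with_random_q(trials=500_000):
    """Point 6, frequentist reading: draw q uniformly per trial, keep only
    the (picked door 1, host opened door 3) trials, count switching wins."""
    kept = won = 0
    for _ in range(trials):
        q = random.random()
        car = random.randrange(1, 4)
        if car == 1:
            opened = 3 if random.random() < q else 2
        else:
            opened = 2 if car == 3 else 3
        if opened == 3:
            kept += 1
            won += (car == 2)
    return won / kept

print(conditional_win_prob(0.5))   # 2/3 in the symmetric case
print(long_run_with_random_q())    # ~2/3, matching Point 6
</syntaxhighlight>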
 
::::: Rick, this is all fine, but not terribly important, since for most readers it is already exciting enough to learn that you can win with unconditional probability 2/3 rather than unconditional probability 1/3. You are talking about ways to prove rigorously that you cannot do better than 2/3. Some people might find this interesting, others not. One way to do this is through a derivation and analysis of the formula which you just showed us. Most readers of the wikipedia page will have no use for that whatsoever. Fortunately, another way is simply to remark that by symmetry the specific numbers on the door chosen by the player and the door opened by the host are irrelevant. You can't use the knowledge of the numbers on the doors to improve your strategy since they have no bearing whatsoever on the question whether the car is behind your door or the other closed door!
 
::::: Here I am taking the point of view of almost all ordinary readers, who will instinctively use probability in the subjective sense - which is perhaps the only way possible, given vos Savant's question, since she doesn't tell us anything about the procedure whereby the door hiding the car is determined nor the door opened by the host. Her solution is perfectly adequate to her question, and the world-wide school-children's experiments confirmed its correctness in precisely the same way. Moreover there are academic sources which do it vos Savant's way (Georgii, Gill, and no doubt others too). The fact that certain formulas are duplicated again and again by writers of elementary texts in probability and statistics ''in the chapter where they prove and illustrate Bayes' theorem'' does not give these formulas some kind of academic superiority. The Monty Hall problem is treated by textbook after textbook on game theory and optimization, and there you will see completely different formulas. Fortunately, formulas are not necessary at all. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 11:29, 10 April 2011 (UTC)
 
::::Intrusion: Rick, that was no ''"reply to Martin",'' even if you call it that. Martin's question was:
 
::::''"Rick, perhaps you could tell me what you understand the term 'probability' to mean."'' – And he added<br />''"there are two interpretations of probability, Bayesian and frequentist. Although they both should give the same answer to any exact question it is very confusing to flip interpretation in the middle of an argument. I am happy to discuss the MHP using either model but let us start with the Bayesian perspective first. [...] In the Bayesian model, what is not known does not exist. [...] (Bayesian) probability calculation, which depends by definition only on the information that we have [...] there is just 'the probability' which is based only on our state of knowledge."''
 
::::May I assume that your (pointless) "citing of sources" means that you do not even think of answering Martin's question? Not paying regard to Martin's distinction between "Bayesian" and "frequentist" – "state of knowledge" vs. "all kinds of possibilities might exist in the world out there and ''could be assumed''"?
::::I think Martin showed us a way that really could be of help for the article, if we just "knew" what we are (contradictorily) talking about, and how to distinguish the different perspectives of the sources. Regards, [[User:Gerhardvalentin|Gerhardvalentin]] ([[User talk:Gerhardvalentin|talk]]) 18:42, 9 April 2011 (UTC)
 
:::::Gerhard - I believe my points above are very clear. The overriding point is that the distinction between frequentist and Bayesian has nothing whatsoever to do with what we keep arguing about. -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 23:48, 9 April 2011 (UTC)
 
::::::Rick, in defining the different meanings of the term "probability", Martin just said to you ''"In the Bayesian model, what is not known does not exist"'', and he added ''"there is just 'the probability' which is based only on our state of knowledge."'' – Sticking to ''that'' denotation means that [http://en.wikipedia.org/w/index.php?title=User_talk:Nijdam/Discussion&diff=next&oldid=368142766 "Before and After"] gives no "additional information". Whereas your accounting from the outset for "doors no. 1 and no. 3" shows that you have converted the actual denotation of the term "probability", transforming it to include a lot of things that might be given in the world out there - "what we actually do not know, but what could be assumed as any additional information that might exist in the world out there." I guess you do not like to stick for a moment to the denotation of "probability" that Martin actually was talking about. Please try. [[User:Gerhardvalentin|Gerhardvalentin]] ([[User talk:Gerhardvalentin|talk]]) 01:19, 10 April 2011 (UTC)
 
:::::::Gerhard - Perhaps there's a language barrier here, please try to understand what I'm saying. A player, Bayesian or frequentist, who knows she has picked door 1 and has seen the host open door 3 knows the identity of the two doors. For this player, Bayesian or frequentist, the probability of interest is the ''conditional'' probability P(win by switching|player picks door 1 and host opens door 3), NOT P(win by switching). Whether you're interested in the conditional probability or not has NOTHING to do with whether you're a Bayesian or a frequentist. In both interpretations there is the concept of conditional probability, and in both it follows Bayes' rule. The Puza et al. paper presents a completely general, Bayesian, analysis. The conditional probability of winning, to a Bayesian, is the expression above (from the Puza et al. paper). You don't need to take my word for this - read the Puza et al. paper. -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 02:38, 10 April 2011 (UTC)
 
:::::::::Rick, I do not see any language barrier in what Gerhard says, but I am puzzled by your reluctance to address my point. I am not proposing a long discussion on philosophy, just that you make clear what definition of probability you are using in your arguments. If you do not do this it is impossible to discuss the issues logically. As Gerhard says above, based only on a Bayesian interpretation of Whitaker's question the difference between the 'conditional' and 'unconditional' formulations becomes extremely pedantic. We cannot revise our probability estimate on seeing a specific door opened because we are given no additional information by the number of that door.
 
:::::::::In your response to Handy2000 above you talk of running an experiment. I therefore assume that you are talking about a frequentist interpretation of probability. If, purely for the sake of clarity of discussion, you would confirm this, I would like to explain where I think the weaknesses in your argument are. If you are not willing to do this (and I can see no logical reason for not wanting to) we are doomed to go on forever failing to understand each other's arguments. [[User:Martin Hogbin|Martin Hogbin]] ([[User talk:Martin Hogbin|talk]]) 21:00, 10 April 2011 (UTC)
 
::::::::::If you're doing an experiment, you're either thinking of probability from a frequentist viewpoint or validating your Bayesian result. In my response to Handy2000 the interpretation of probability you're using makes no difference. Whether you're a Bayesian or a frequentist, the probability ''before'' the host opens a door is (typically) 1/3 per door. If you're a Bayesian or a frequentist and have chosen door 1 and have seen the host open door 3, after the host opens this door the probability of door 3 is definitely 0. If you're a Bayesian or a frequentist, the probabilities of the other two doors may or may not be the same as they were before the host opened door 3 - and you can express the ''after'' probabilities as conditional probabilities. The knowledge you've gained as a Bayesian is the identity of the open door (it is this knowledge that allows you to update its probability to 0). This same knowledge allows you (requires you) to update the probabilities of the other two doors. To avoid confusion, and so we can understand each other's arguments, I think we should use different names for these ''before'' and ''after'' probabilities. Both Bayesian and frequentist interpretations of probability use the language of ''conditional probability'' for this. I can see no logical reason for not using this terminology here. If you'd like to point out weaknesses and think the specific interpretation of probability makes a difference, please feel free to clarify what interpretation your comment pertains to. My assumption is it makes no difference, so unless it's clarified I'm assuming we're always talking about both. -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 13:59, 11 April 2011 (UTC)
 
::::::::::: The term "conditional probability" is technical. It belongs to mathematics and it has a precise mathematical definition, namely as the definition which makes the chain rule true (prob of A and B equals prob of B times prob of A given B). Ordinary language does know the intuitive concept of "the probability of A given B". Ordinary logic acknowledges the chain rule. Ordinary language and ordinary logic know the concept of independence. Ordinary folk can agree that the identity (car or goat) of the object behind door 1 (the door chosen by the player) is independent of the identity (door 2 or 3) of the door opened by the host. This is intuitively true when we are using intuitive or subjective probability.
 
::::::::::: The difference between a subjectivist and a frequentist analysis of the MHP is that for a subjectivist, the host *is* equally likely to open either door if the host has a choice (because we are subjectivist), while a frequentist doesn't know anything about the probabilities of his choices. Hence the frequentist either needs to leave them unknown (Morgan et al.) or has to be told "out of nowhere" that the host chooses randomly. The frequentist also has to be told that the door hiding the car is chosen at random. Alternatively, the frequentist finesses all discussion of unknowable host-side randomness by choosing his own door at random and switching. The frequentist who chooses at random and switches gets the car with probability 2/3. He doesn't know and doesn't care what the conditional probability is. The subjectivist chooses door 1 since that is his favourite number, and because he knows nothing about how the car is hidden, it's equally likely to be behind any of the three doors. He'll switch and he knows he now has a 2/3 chance of getting the car. He knows this probability, because he knows nothing at all. The frequentist doesn't know this probability, because he knows nothing at all! (I find that rather amusing.) The frequentist who chose his door at random has got a strong guarantee that he'll get the car with (unconditional) probability 2/3 according to anyone's interpretation of probability. He's not interested in the conditional probability. The subjectivist has no guarantees whatsoever. Once off, he'll probably be OK, but how well he would do in many games is anybody's guess. And my subjective probability that he'll get the car is less than 2/3, since I think Monty is smarter than he is.
 
::::::::::: Of course, subjective and objective probabilities satisfy the same rules, so you can check the results of calculations of subjective probabilities by actually doing many repetitions using "real" randomness. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 16:43, 24 April 2011 (UTC)
 
::::::::::: In abstract mathematics, the conditional probability of A given B is *defined* by demanding that P(A and B)=P(A|B)P(B). In the real world, ordinary people who are frequentists define the probability of A given B by imagining many repetitions of the probability experiment in question, and imagining how often A occurs within that subsequence of repetitions in which B happened to occur. In the real world, ordinary people who are subjectivists define the probability of A given B by imagining a betting situation in which they are to bet on A versus not A, where the bet is only called if B happens to happen. So if you would be prepared to bet at most 5 Euros against my 2 Euros on A happening against it not happening, where the bet is settled if and only if B happens to occur (otherwise we just keep our own money), then for you the odds on A (against not A) given B are 5 to 2, and your conditional probability is 5/7. It's a nice exercise to check that the frequentist's conditional probability satisfies the chain rule, and that the subjectivist's conditional probability satisfies the chain rule. The different kinds of probability follow the same calculus. The same formal rules. When teaching probability to mathematicians we emphasize the formal rules and down-play the interpretation (since this has been a field of controversy for at least 300 years). [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 07:22, 25 April 2011 (UTC)
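As a sketch of the "nice exercise" mentioned above for the frequentist definition: if in <math>N</math> repetitions <math>N_B</math> counts those in which <math>B</math> occurs and <math>N_{AB}</math> those in which both <math>A</math> and <math>B</math> occur, then

<math>
P(A\mid B)\,P(B)\;\approx\;\frac{N_{AB}}{N_B}\cdot\frac{N_B}{N}\;=\;\frac{N_{AB}}{N}\;\approx\;P(A\text{ and }B),
</math>

so the chain rule holds in the limit. In the betting example, odds of 5 to 2 on <math>A</math> given <math>B</math> correspond to <math>P(A\mid B)=5/(5+2)=5/7</math>, as stated.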
 
== Devlin and others ==
 
As I wrote under "Course on probability", Gerhardvalentin tried to explain to me the reasoning of Devlin and others. This goes along the lines that the chance the contestant initially picks the car is 1/3, and as this chance is not changed when a door is opened by the host, the remaining 2/3 chance is on the two other doors. As the opened door has chance 0, the other must have chance 2/3. Gerhardvalentin calls it a way of understanding, not a full solution. I wonder why it is nevertheless presented as a solution, rather than as a way of understanding. Even then, as a way of understanding, it seems not a correct way of reasoning. Strangely, Richard Gill - from his user page I discovered he is a professor in statistics - also reasons along these lines on the talk page of this article. I would say that after door 3 has been opened by the host, a new (conditional) probability law governs the situation. Hence you can't say that the chance for the chosen door 1 does not change. In fact no chance whatsoever ever changes. You should argue that the new probability has the same value for the first door as the initial probability. And hence, as the new probability for door 3 is 0, door 2 must have a new probability of 2/3. I think this is a serious mistake in the arguing of Devlin and others. [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 09:23, 14 April 2011 (UTC)
:Handy2000, I have to reject sharply your untrue and false statement about my words. I never said the thing you claim I said. Stop immediately spreading your misinterpretation of what I have "told" you. [[User:Gerhardvalentin|Gerhardvalentin]] ([[User talk:Gerhardvalentin|talk]]) 10:32, 14 April 2011 (UTC)
::Handy2000, I hope your probability teacher will also teach you Bayes' rule. Suppose we are interested in two possible scenarios or hypotheses (e.g. the car is behind door 1, the car is not behind door 1). Our relative belief in the truth of those two scenarios can be measured by the [[odds]] of the one scenario relative to the other. For instance, initially the odds are 2 to 1 that the car is not behind Door 1. Bayes' rule says that every time some new piece of info comes into your possession, you can find the conditional odds, i.e. the ratio of the conditional probabilities of the scenarios given the information, by just multiplying the prior odds by the so-called Bayes factor: the ratio of the probability of the information under scenario 1 to the probability of the information under scenario 2. You can do this in many steps. At each step, already incorporated information is included alongside the two scenarios, so the Bayes factor is always: the ratio of the probability of the new information under scenario 1 and the old info, to the probability of the new info under scenario 2 and the old info.
:: Now think of the host first opening a door revealing a goat, and only afterwards telling the player (he was looking the other way) which door it was that the host opened (2 or 3). Two scenarios: car behind Door 1, car not behind Door 1. Initial odds: one to two. First info: a goat is revealed behind one of the two other doors, we don't know yet which. Bayes factor: one to one, since the host is certain to do this under either scenario. Odds that car is behind door 1 after opening of another door: 2 to 1 against. Now we get the info it was door 3. What is the chance it was door 3 when the car was actually behind door 1? 50-50 (host can choose, either choice equally likely in the eyes of the player). What is the chance it was door 3 when the car was actually not behind door 1? 50-50 (car is equally likely behind door 2 or 3, host opens 3 or 2). The Bayes factor is therefore 0.50 to 0.50 or 1:1. The final odds are still 2 to 1 against. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 16:30, 14 April 2011 (UTC)
:: PS, here's the proof of Bayes' rule. Let D stand for the info (data), and let H and K stand for the two scenarios (hypotheses). We know P(H|D)=P(H & D)/P(D) = P(H)P(D|H)/P(D) by applying the chain rule (=definition of conditional probability) twice. Similarly P(K|D) = P(K)P(D|K)/P(D). Divide the one equality by the other. P(H|D):P(K|D) = P(H):P(K) * P(D|H):P(D|K). Posterior odds equals prior odds times likelihood ratio. Whoever thought of this was a genius. I have no idea who it was. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 16:37, 14 April 2011 (UTC)
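A compact restatement of the two-step odds computation above, under the symmetric assumptions, with <math>\bar C_1</math> = car not behind door 1 and <math>H_3</math> = host opens door 3:

<math>
\frac{P(C_1\mid H_3)}{P(\bar C_1\mid H_3)}
=\frac{P(C_1)}{P(\bar C_1)}\times\frac{P(H_3\mid C_1)}{P(H_3\mid \bar C_1)}
=\frac{1}{2}\times\frac{1/2}{1/2}
=\frac{1}{2},
</math>

i.e. the odds stay 2 to 1 against the car being behind door 1, so the probability of winning by switching is still 2/3.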
 
:::Richard - the confusion here is clearly the conditional sounding reasoning applied at the step where you don't yet know which door is open. Saying the probability of the open (but still unknown) door is 0 and reasoning that the probability of the remaining door is 2/3 is completely and entirely different from saying that the probability of door 3 is 0 and reasoning that the probability of door 2 must be 2/3. The first of these is valid, since the probability of the host opening "a door" is 1, so the "original" probability of door 1 is not changed. However, the second (applying this same reasoning to door 3 and door 2) is not valid, since knowing which door the host opens at least potentially affects the probabilities of all the doors (so you cannot assert without some sort of argument that the original probability of door 1 is unchanged).
 
:::Handy2000 - Devlin published a follow-up column [http://www.maa.org/devlin/devlin_12_05.html] where he says "it may be easier to find the relevant mathematical formula and simply plug in the appropriate values without worrying what it all means". Smashing advice in this case. -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 17:36, 14 April 2011 (UTC)
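A worked-equation sketch of the distinction Rick draws above, using the notation of the Puza et al. discussion earlier, with <math>q=P(H_3\mid C_1 S_1)</math>: conditioning on "the host opens some goat door" leaves door 1 untouched, <math>P(C_1\mid\text{a goat door is opened})=P(C_1)=\tfrac13</math>, whereas conditioning on the specific door gives

<math>
P(C_1\mid H_3)=\frac{\tfrac13\,q}{\tfrac13\,q+\tfrac13\cdot 1}=\frac{q}{1+q},
</math>

which equals 1/3 only when <math>q=\tfrac12</math>. That is why the second step needs a symmetry (or some other) argument.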
 
:::: *My* argument only considered the question whether or not the door you first chose hides the car. I don't think there is any confusion there. Devlin in his second article admits that his first article jumped over one issue, namely the question whether or not *which* door is opened has any relevance. He was scared by his mistake, didn't know how to get out of it, and advised falling back on insight-less accountancy. Pity he didn't see that Bayes' rule would help him over the difficulty. Not a difficulty at all if you actually know probability. Moreover it solves the problem of "what it all means", since it shows explicitly exactly what it all means. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 21:07, 14 April 2011 (UTC)
 
:::::Gerhardvalentin: Sorry, I misunderstood what you've written on my talk page. Maybe you can clarify what you meant.
:::::RickBlock: Thank you for pointing to the Devlin article. Although he gives a correct solution in the latter part, he still maintains his wrong argument in the first part. Somehow it seems to me he tries to justify this error in the last part of his article.
:::::Richard Gill: We learn Bayes' theorem, if that's what you mean, and with this theorem we calculate the conditional probability. We do not speak of odds, but it looks as if you compare the probability of chosen door 1, opened door 3 and car behind door 2 with the probability of chosen door 1, opened door 3 and car behind door 1. This is, as far as I can see, equivalent to the calculation of the conditional probabilities.
:::::My conclusion: the Devlin solution, as well as Cecil Adams's, mentioned in the solution section under "Simple solutions", is indeed incorrect. And I'm surprised that this wrong way of reasoning is still presented as a solution. [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 23:26, 14 April 2011 (UTC)
:::::: I agree, Handy2000, solutions whose reasoning is actually wrong ought not to be included! We had a long discussion about Devlin some months ago. Part of the reason he screwed up was that he apparently doesn't know about Bayes' rule. I think the other part of the reason Devlin stumbled is that if you want to give a "full solution" you have to take account of the chances that the host opens either door when he has a choice. And this forces you to think about what probability means. If you are a subjectivist then this is 50-50, because you don't know anything. If you are a frequentist you are stuck.
 
:::::: Regarding Bayes' rule and Bayes' theorem: of course both of these are trivialities and they are completely equivalent. Bayes' theorem is an application, twice, of the definition of conditional probability (the chain rule). Bayes' rule is obtained by dividing the formula of Bayes' theorem for two different events both conditioned on the same "given". You can derive Bayes' theorem from Bayes' rule by applying Bayes' rule to a collection of events which are mutually exclusive and exhaustive. So you can think of them as equivalent; both are trivial.
 
:::::: The difference is that Bayes' theorem is a formula, while Bayes' rule is expressed in words: ''posterior odds equals prior odds times likelihood ratio'' (aka Bayes factor). You have to remember Bayes' theorem (a formula) by remembering how to prove it, while you remember Bayes' rule by remembering the words. And the words contain a collection of truly important concepts. You need the concept of [[odds]] and you have to introduce the concept of [[likelihood ratio]] or [[Bayes factor]]. It's a very simple thing and easy to remember. Bayes' rule shows how your uncertainty is changed by getting new information, showing precisely how this depends both on the prior knowledge and the new information. And this dependence is the simplest form you can possibly imagine. Bayes' rule is a gift from the Gods.
 
:::::: My experience in talking about probability and statistics to lawyers and medics and journalists - very intelligent but mathematics-challenged people - is that you can explain Bayes' rule but not Bayes' theorem to them. You can explain the concepts. You can go on to do numerical examples. I must say this is a little easier for native English speakers, since the concept of [[odds]] is native; it's part of ordinary language. The English love gambling. On mainland Europe, there is not a word for [[odds]] in any of the languages I know. The Dutch for instance preferred to make money by offering bets to English sailors, and the concept of a [[Dutch book]] is a collection of bets which a bettor foolishly accepts and loses money on whatever the outcome. The Dutch think that betting is a sin but making money from foolish people is a virtue.
 
:::::: Rosenthal in his article and book uses Bayes' rule. His experience too is that this is the way to explain conditional probability to ordinary people. It's such a pity Devlin didn't know Bayes' rule. I repeat, he's not from probability and statistics.
 
:::::: I'm offering a prize to whoever can tell me who first discovered Bayes' rule. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 07:03, 15 April 2011 (UTC)
 
:::::::What prize? Thank you for your long explanation; it surely clarified things for me. [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 07:49, 15 April 2011 (UTC)
 
:::::::: I'm glad the explanation was useful. It could surely have been shorter, but it's difficult to predict how much detail, and where, is needed by the intended reader. Prize: a bottle of good wine, or equivalent value as Amazon.com gift token, or Paypal cash, or donation to your favorite charity... [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 17:57, 15 April 2011 (UTC)
:::::::::By "discovered", do you mean who first stated the rule (obviously, [[Thomas Bayes]]), or who discovered it lost amongst his unpublished papers after he died and sent it to the Royal Society ([[Richard Price]])? --[[User:Almightybob101|almightybob]] ([[User_talk:Almightybob101|pray]]) 09:50, 25 April 2011 (UTC)
:::::::::Unless, of course, you accept [[Stephen Stigler|Professor Stigler]]'s words, in which case it's [[Nicholas Saunderson]].
:::::::::Failing that... I dunno. Probably some [[Pythagoras|Ancient Greek guy]]. They did all the best maths. --[[User:Almightybob101|almightybob]] ([[User_talk:Almightybob101|pray]]) 10:10, 25 April 2011 (UTC)
 
:::::::::: Bayes' rule, not Bayes theorem. Who first said and used ''posterior odds equals prior odds times likelihood ratio'', or if you prefer ''posterior is proportional to prior times likelihood''. Stephen Stigler told me that he doesn't know, but he's sure it's only well into the 20th century. [[User:Gill110951|Richard Gill]] ([[User talk:Gill110951|talk]]) 18:22, 25 April 2011 (UTC)
 
:::::::::::I noticed that Devlin's way of arguing is in fact the way the pictures alongside the simple solution section explain the solution. The pictures are wrong, and so is Devlin's argument. Why is this still kept as a kind of solution, instead of as an example of false reasoning? Sorry, forgot to log in. [[User:Handy2000|Handy2000]] ([[User talk:Handy2000|talk]]) 14:34, 5 May 2011 (UTC)
 
::::::::::::This seems like a topic for [[talk:Monty Hall problem]], not here. I would encourage you to bring this up there. -- [[user:Rick Block|Rick Block]] <small>([[user talk:Rick Block|talk]])</small> 15:08, 5 May 2011 (UTC)