Conditional probability distribution: Difference between revisions

subst about-to-be-deleted template
Link suggestions feature: 3 links added.
 
(6 intermediate revisions by 5 users not shown)
Line 3:
In [[probability theory]] and [[statistics]], the conditional probability distribution is a probability distribution that describes the probability of an outcome given the occurrence of a particular event. Given two [[joint probability distribution|jointly distributed]] [[random variable]]s <math>X</math> and <math>Y</math>, the '''conditional probability distribution''' of <math>Y</math> given <math>X</math> is the [[probability distribution]] of <math>Y</math> when <math>X</math> is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value <math>x</math> of <math>X</math> as a parameter. When both <math>X</math> and <math>Y</math> are [[categorical variable]]s, a [[conditional probability table]] is typically used to represent the conditional probability. The conditional distribution contrasts with the [[marginal distribution]] of a random variable, which is its distribution without reference to the value of the other variable.
 
If the conditional distribution of <math>Y</math> given <math>X</math> is a [[continuous distribution]], then its [[probability density function]] is known as the '''conditional density function'''.{{sfnp|Ross|1993|pp=88–91}} The properties of a conditional distribution, such as the [[Moment (mathematics)|moments]], are often referred to by corresponding names such as the [[conditional mean]] and [[conditional variance]].
 
More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional [[joint distribution]] of the included variables.
 
==Conditional discrete distributions==
For [[discrete random variable]]s, the conditional [[probability mass function]] of <math>Y</math> given <math>X=x</math> can be written according to its definition as:
 
{{Equation box 1
Line 36:
|}
 
Then the unconditional probability that <math>X=1</math> is 3/6 = 1/2 (since there are six possible rolls of a die, of which three are even), whereas the probability that <math>X=1</math> conditional on <math>Y=1</math> is 1/3 (since there are three possible [[prime number]] rolls—2, 3, and 5—of which one is even).
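
The arithmetic in this example can be checked by enumerating the six faces. The following Python sketch assumes a fair six-sided die, with <math>X=1</math> when the roll is even and <math>Y=1</math> when the roll is prime, as read off from the paragraph above; the helper names <code>prob</code> and <code>cond_prob</code> are illustrative only.

```python
from fractions import Fraction

FACES = range(1, 7)      # outcomes of one roll of a fair die (assumed)
PRIMES = {2, 3, 5}

def x(face):
    """Indicator that the roll is even."""
    return 1 if face % 2 == 0 else 0

def y(face):
    """Indicator that the roll is prime."""
    return 1 if face in PRIMES else 0

def prob(event):
    """Probability of an event (a predicate on faces) under the uniform law."""
    return Fraction(sum(1 for f in FACES if event(f)), len(FACES))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given), assuming P(given) > 0."""
    return prob(lambda f: event(f) and given(f)) / prob(given)

p_x1 = prob(lambda f: x(f) == 1)
p_x1_given_y1 = cond_prob(lambda f: x(f) == 1, lambda f: y(f) == 1)
print(p_x1, p_x1_given_y1)  # prints: 1/2 1/3
```

Exact rational arithmetic is used so that the results can be compared directly with the fractions quoted in the text.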
 
==Conditional continuous distributions==
Similarly for [[continuous random variable]]s, the conditional [[probability density function]] of <math>Y</math> given the occurrence of the value <math>x</math> of <math>X</math> can be written as{{sfnp|Park|2018|p=99}}
 
{{Equation box 1
Line 65:
 
==Relation to independence==
Random variables <math>X</math>, <math>Y</math> are [[Statistical independence|independent]] [[if and only if]] the conditional distribution of <math>Y</math> given <math>X</math> is, for all possible realizations of <math>X</math>, equal to the unconditional distribution of <math>Y</math>. For discrete random variables this means <math>P(Y=y|X=x) = P(Y=y)</math> for all possible <math>y</math> and <math>x</math> with <math>P(X=x)>0</math>. For continuous random variables <math>X</math> and <math>Y</math>, having a [[joint density function]], it means <math>f_Y(y|X=x) = f_Y(y)</math> for all possible <math>y</math> and <math>x</math> with <math>f_X(x)>0</math>.
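
For discrete variables with finitely many values, the criterion above can be tested mechanically. The sketch below is one possible check, assuming the joint distribution is supplied as a dictionary mapping pairs <code>(x, y)</code> to probabilities; the function names and the two example tables are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Return the marginal PMFs of X and Y from a joint PMF {(x, y): p}."""
    px, py = {}, {}
    for (xv, yv), p in joint.items():
        px[xv] = px.get(xv, 0) + p
        py[yv] = py.get(yv, 0) + p
    return px, py

def is_independent(joint):
    """Check P(Y=y | X=x) == P(Y=y) for every y and every x with P(X=x) > 0."""
    px, py = marginals(joint)
    for xv, yv in product(px, py):
        if px[xv] == 0:
            continue  # conditioning on a zero-probability value is skipped
        conditional = joint.get((xv, yv), 0) / px[xv]
        if conditional != py[yv]:
            return False
    return True

# Two fair coin flips: the conditional and marginal laws agree.
coins = {(a, b): Fraction(1, 4) for a in (0, 1) for b in (0, 1)}

# One fair die roll with X = "even" and Y = "prime": they do not agree.
die = {(0, 0): Fraction(1, 6),   # roll 1
       (1, 1): Fraction(1, 6),   # roll 2
       (0, 1): Fraction(2, 6),   # rolls 3 and 5
       (1, 0): Fraction(2, 6)}   # rolls 4 and 6

print(is_independent(coins), is_independent(die))
```

Fractions are used instead of floating-point numbers so that the equality test is exact.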
 
==Properties==
Line 73:
 
==Measure-theoretic formulation==
Let <math>(\Omega, \mathcal{F}, P)</math> be a [[probability space]], <math>\mathcal{G} \subseteq \mathcal{F}</math> a <math>\sigma</math>-field in <math>\mathcal{F}</math>. Given <math>A\in \mathcal{F}</math>, the [[Radon–Nikodym theorem]] implies that there is{{sfnp|Billingsley|1995|p=430}} a <math>\mathcal{G}</math>-measurable random variable <math>P(A\mid\mathcal{G}):\Omega\to \mathbb{R}</math>, called the [[conditional probability]], such that <math display="block">\int_G P(A\mid\mathcal{G})(\omega) \, dP(\omega)=P(A\cap G)</math> for every <math>G\in \mathcal{G}</math>, and such a random variable is uniquely defined up to sets of probability zero. A conditional probability is called [[Regular conditional probability|'''regular''']] if <math> \operatorname{P}(\cdot\mid\mathcal{G})(\omega) </math> is a [[probability measure]] on <math>(\Omega, \mathcal{F})</math> for almost every <math>\omega \in \Omega</math>.
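
On a finite sample space the defining integral reduces to a sum, which makes the definition easy to verify by hand or by machine. The following Python sketch assumes a uniform probability on six points and a <math>\sigma</math>-field generated by a three-block partition; the sample space, the partition, and the event <code>A</code> are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = range(6)
P = {w: Fraction(1, 6) for w in OMEGA}   # uniform probability (assumed)
# Partition generating the sub-sigma-field G (assumed for the example).
BLOCKS = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]

def measure(event):
    """P(event) for a subset of OMEGA."""
    return sum((P[w] for w in event), Fraction(0))

def cond_prob(A, w):
    """A version of P(A | G)(w): constant on the block that contains w."""
    block = next(b for b in BLOCKS if w in b)
    return measure(A & block) / measure(block)

def sigma_field():
    """Yield every set in G, that is, every union of blocks."""
    for r in range(len(BLOCKS) + 1):
        for chosen in combinations(BLOCKS, r):
            yield frozenset().union(*chosen)

A = frozenset({1, 2, 5})

# Defining property: the integral of P(A | G) over G equals P(A and G).
defining_property_holds = all(
    sum((cond_prob(A, w) * P[w] for w in G), Fraction(0)) == measure(A & G)
    for G in sigma_field()
)
print(defining_property_holds)
```

In this finite setting every block has positive probability, so the version computed here is unique and is a probability measure in <code>A</code> for each <math>\omega</math>, which is the regular case.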
 
Special cases:
Line 81:
Let <math>X : \Omega \to E</math> be a <math>(E, \mathcal{E})</math>-valued random variable. For each <math>B \in \mathcal{E}</math>, define <math display="block">\mu_{X \, | \, \mathcal{G}} (B \, |\, \mathcal{G}) = \mathrm{P} (X^{-1}(B) \, | \, \mathcal{G}).</math>For any <math>\omega \in \Omega</math>, the function <math>\mu_{X \, | \mathcal{G}}(\cdot \, | \mathcal{G}) (\omega) : \mathcal{E} \to \mathbb{R}</math> is called the '''conditional probability distribution''' of <math>X</math> given <math>\mathcal{G}</math>. If it is a probability measure on <math>(E, \mathcal{E})</math>, then it is called [[Regular conditional probability|'''regular''']].
 
For a real-valued random variable (with respect to the Borel <math>\sigma</math>-field <math>\mathcal{R}^1</math> on <math>\mathbb{R}</math>), every conditional probability distribution is regular.{{sfnp|Billingsley|1995|p=439}} In this case, <math>E[X \mid \mathcal{G}] = \int_{-\infty}^\infty x \, \mu_{X \mid \mathcal{G}}(d x, \cdot)</math> almost surely.
 
=== Relation to conditional expectation ===
Line 98:
An expectation of a random variable with respect to a regular conditional probability is equal to its conditional expectation.
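
A finite example makes this statement concrete. The Python sketch below assumes a uniform probability on six points, a <math>\sigma</math>-field generated by a partition, and an arbitrary random variable <code>X</code>; all of these, and the name <code>kernel</code> for the regular conditional probability, are hypothetical choices for illustration. It integrates <code>X</code> against the conditional measure and then checks the defining property of conditional expectation on each block.

```python
from fractions import Fraction

OMEGA = range(6)
P = {w: Fraction(1, 6) for w in OMEGA}   # uniform probability (assumed)
BLOCKS = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]
X = {w: w * w for w in OMEGA}            # an arbitrary random variable

def block_of(w):
    """The block of the partition that contains w."""
    return next(b for b in BLOCKS if w in b)

def kernel(w, v):
    """Regular conditional probability of the singleton {v} given G, at w."""
    b = block_of(w)
    return P[v] / sum(P[u] for u in b) if v in b else Fraction(0)

def cond_expectation(w):
    """Integrate X against the conditional measure kernel(w, .)."""
    return sum(X[v] * kernel(w, v) for v in OMEGA)

# Defining property of E[X | G]: both sides agree when integrated over
# each generating block, and hence over every set in G.
checks = [
    sum(cond_expectation(w) * P[w] for w in b) == sum(X[w] * P[w] for w in b)
    for b in BLOCKS
]
print(all(checks))
```

Because each set in the generated <math>\sigma</math>-field is a disjoint union of blocks, agreement on the blocks is enough to establish agreement on the whole <math>\sigma</math>-field.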
 
=== Interpretation of conditioning on a sigma field ===
Consider the probability space <math>(\Omega, \mathcal{F}, \mathbb{P})</math>
and a sub-sigma field <math>\mathcal{A} \subset \mathcal{F}</math>.
Line 105:
Also recall that an event <math>B</math> is independent of a sub-sigma field <math>\mathcal{A}</math> if <math>\mathbb{P}(B | A) = \mathbb{P}(B)</math> for all <math>A \in \mathcal{A}</math>. It is incorrect to conclude in general that the information in <math>\mathcal{A}</math> does not tell us anything about the probability of event <math>B</math> occurring. This can be shown with a counter-example:
 
Consider a probability space on the [[unit interval]], <math>\Omega = [0, 1]</math>. Let <math>\mathcal{G}</math> be the sigma-field of all countable sets and sets whose complement is countable. So each set in <math>\mathcal{G}</math> has measure <math>0</math> or <math>1</math> and so is independent of each event in <math>\mathcal{F}</math>. However, notice that <math>\mathcal{G}</math> also contains all the singleton events in <math>\mathcal{F}</math> (those sets which contain only a single <math>\omega \in \Omega</math>). So knowing which of the events in <math>\mathcal{G}</math> occurred is equivalent to knowing exactly which <math>\omega \in \Omega</math> occurred! So in one sense, <math>\mathcal{G}</math> contains no information about <math>\mathcal{F}</math> (it is independent of it), and in another sense it contains all the information in <math>\mathcal{F}</math>.{{sfnp|Billingsley|2012}}{{Page needed|date=May 2025}}
 
== See also ==
* [[Conditioning (probability)]]
* [[Conditional probability]]
* [[Regular conditional probability]]
* [[Bayes' theorem]]
 
== References ==
=== Citations ===
{{Reflist}}
 
=== Sources ===
{{refbegin}}
* {{cite book |last= Billingsley |first= Patrick |date= 1995 |title= Probability and Measure |edition= 3rd |publisher= John Wiley and Sons |location= New York |isbn= 0-471-00710-2 |author-link= Patrick Billingsley |url= https://books.google.com/books?id=a3gavZbxyJcC }}
* {{cite book |last= Billingsley |first= Patrick |date= 2012 |title= Probability and Measure |edition= Anniversary |publisher= Wiley |location= Hoboken, New Jersey |isbn= 978-1-118-12237-2 }}
* {{cite book |last= Park |first= Kun Il |date= 2018 |title= Fundamentals of Probability and Stochastic Processes with Applications to Communications |publisher= Springer |isbn= 978-3-319-68074-3 }}
* {{cite book |last= Ross |first= Sheldon M. |date= 1993 |title= Introduction to Probability Models |edition= 5th |location= San Diego |publisher= Academic Press |isbn= 0-12-598455-3 |author-link= Sheldon M. Ross }}
{{refend}}