Conditional probability distribution: Difference between revisions

More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional [[joint distribution]] of the included variables.
 
==Conditional cumulative distribution==
Given a random variable <math>X</math> and an [[event (probability theory)|event]] <math>A</math>, the conditional cumulative distribution function of <math>X</math> given <math>A</math> is defined by<ref name=KunIlPark>{{cite book | author=Park, Kun Il| title=Fundamentals of Probability and Stochastic Processes with Applications to Communications| publisher=Springer | year=2018 | isbn=978-3-319-68074-3}}</ref>{{rp|p. 97}}
 
:<math>F_{X|A}(x) \triangleq \frac{P(\{X \leq x\} \cap A)}{P(A)}</math>
 
for <math>P(A) > 0</math>.
 
Given another random variable <math>Y</math>, one can also condition on the event <math>\{Y \leq y \}</math>. This yields
 
:<math>F_{X|Y \leq y}(x|y) = \frac{P(\{X \leq x\} \cap \{Y \leq y\})}{P(Y \leq y)}</math>
 
which can be written as
 
:<math>F_{X|Y \leq y}(x|y) = \frac{F_{X,Y}(x,y)}{F_Y(y)}</math>
 
where <math>F_{X,Y}(x,y)</math> denotes the joint cumulative distribution function of <math>X</math> and <math>Y</math> and <math>F_Y(y)</math> is the cumulative distribution function of <math>Y</math>.
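The ratio defining <math>F_{X|Y \leq y}(x|y)</math> can be estimated by simple counting on a sample from the joint distribution. The sketch below uses a hypothetical toy distribution (not taken from the cited source) in which <math>Y</math> depends on <math>X</math>:

```python
import random

random.seed(0)

# Toy joint distribution (an assumption for illustration): X ~ Uniform[0, 1],
# Y = X + U with U ~ Uniform[0, 1], so X and Y are positively dependent.
n = 100_000
xs = [random.random() for _ in range(n)]
ys = [x + random.random() for x in xs]

def cond_cdf(x, y):
    """Empirical F_{X | Y <= y}(x) = P(X <= x and Y <= y) / P(Y <= y)."""
    joint = sum(1 for xi, yi in zip(xs, ys) if xi <= x and yi <= y)
    marginal = sum(1 for yi in ys if yi <= y)
    return joint / marginal

# Conditioning on {Y <= 0.5} forces X <= 0.5 in this construction
# (since U >= 0), so the conditional CDF reaches 1 already at x = 0.5,
# whereas the unconditional CDF of X at 0.5 is only 0.5.
print(cond_cdf(0.5, 0.5))  # 1.0 for this construction
```

The dependence shows up as a conditional CDF that differs from the unconditional one; for an independent pair the two would agree up to sampling error.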
==Conditional discrete distributions==
For [[discrete random variable]]s, the [[conditional probability]] mass function of <math>Y</math> given the occurrence of the value <math>x</math> of <math>X</math> can be written, according to its definition, as:
 
{{Equation box 1
|indent =
|title=
|equation = <math>p_{Y|X}(y \mid x) \triangleq P(Y = y \mid X = x) = \frac{P(\{X=x\} \cap \{Y=y\})}{P(X=x)}</math>
|cellpadding= 6
|border
The relation with the probability distribution of <math>X</math> given <math>Y</math> is:
 
:<math>P(Y=y \mid X=x) P(X=x) = P(\{X=x\} \cap \{Y=y\}) = P(X=x \mid Y=y)P(Y=y).</math>
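Both factorizations in this identity recover the same joint probability, which can be verified numerically. The sketch below uses an assumed toy joint pmf on <math>\{0,1\}\times\{0,1\}</math> (illustrative values, not from the article):

```python
# Assumed toy joint pmf of (X, Y); the four probabilities sum to 1.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def p_X(x):
    """Marginal pmf of X: sum the joint pmf over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_Y(y):
    """Marginal pmf of Y: sum the joint pmf over x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def p_Y_given_X(y, x):
    """Conditional pmf P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint[(x, y)] / p_X(x)

def p_X_given_Y(x, y):
    """Conditional pmf P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    return joint[(x, y)] / p_Y(y)

# Check the identity P(Y=y|X=x) P(X=x) = P(X=x, Y=y) = P(X=x|Y=y) P(Y=y).
for (x, y), p in joint.items():
    assert abs(p_Y_given_X(y, x) * p_X(x) - p) < 1e-12
    assert abs(p_X_given_Y(x, y) * p_Y(y) - p) < 1e-12
```

Rearranging the same identity gives Bayes' rule for pmfs: <math>P(X=x \mid Y=y) = P(Y=y \mid X=x)\,P(X=x)/P(Y=y)</math>.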
 
===Example===
 
==Conditional continuous distributions==
Similarly, for [[continuous random variable]]s, the conditional [[probability density function]] of <math>Y</math> given the occurrence of the value <math>x</math> of <math>X</math> can be written as<ref name=KunIlPark/>{{rp|p. 99}}
 
{{Equation box 1
|indent =
|title=
|equation = <math>f_{Y\mid X}(y \mid x) = \frac{f_{X, Y}(x, y)}{f_X(x)}</math>
|cellpadding= 6
|border
 
The relation with the probability distribution of <math>X</math> given <math>Y</math> is given by:
:<math>f_{Y\mid X}(y \mid x)\,f_X(x) = f_{X,Y}(x, y) = f_{X\mid Y}(x \mid y)\,f_Y(y). </math>
 
The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: [[Borel's paradox]] shows that conditional probability density functions need not be invariant under coordinate transformations.
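As a concrete check of the density formula, the sketch below uses a standard textbook joint density (an assumed example, not from the article), <math>f_{X,Y}(x,y) = x + y</math> on the unit square, whose marginal is <math>f_X(x) = x + 1/2</math>:

```python
# Assumed joint density f_{X,Y}(x, y) = x + y on [0,1] x [0,1].
def f_joint(x, y):
    return x + y

# Marginal of X, obtained by integrating the joint density over y in [0, 1]:
# f_X(x) = integral of (x + y) dy = x + 1/2.
def f_X(x):
    return x + 0.5

# Conditional density f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x).
def f_Y_given_X(y, x):
    return f_joint(x, y) / f_X(x)

def integral(f, a, b, n=10_000):
    """Midpoint-rule numerical integration of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# For any fixed x, the conditional density integrates to 1 over y.
x0 = 0.3
print(integral(lambda y: f_Y_given_X(y, x0), 0.0, 1.0))
```

Note that <math>f_{Y\mid X}(y \mid x)</math> here depends on <math>x</math>, reflecting the dependence of <math>Y</math> on <math>X</math> under this joint density.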
 
==Relation to independence==
Random variables <math>X</math>, <math>Y</math> are [[Statistical independence|independent]] if and only if the conditional distribution of <math>Y</math> given <math>X</math> is, for all possible realizations of <math>X</math>, equal to the unconditional distribution of <math>Y</math>. For discrete random variables this means <math>P(Y=y|X=x) = P(Y=y)</math> for all possible <math>x</math> and <math>y</math> with <math>P(X=x)>0</math>. For continuous random variables <math>X</math> and <math>Y</math>, having a [[joint density function]], it means <math>f_{Y\mid X}(y\mid x) = f_Y(y)</math> for all possible <math>x</math> and <math>y</math> with <math>f_X(x)>0</math>.
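This characterization is easy to verify in the discrete case: when the joint pmf factorizes into the product of the marginals, conditioning on <math>X</math> leaves the distribution of <math>Y</math> unchanged. A sketch with assumed toy marginals:

```python
# Assumed independent pair: the joint pmf is the product of the marginals.
pX = {0: 0.25, 1: 0.75}
pY = {0: 0.6, 1: 0.4}
joint = {(x, y): pX[x] * pY[y] for x in pX for y in pY}

def p_Y_given_X(y, x):
    """P(Y = y | X = x) computed from the joint pmf."""
    return joint[(x, y)] / sum(joint[(x, yy)] for yy in pY)

# Independence: conditioning on X does not change the distribution of Y.
for x in pX:
    for y in pY:
        assert abs(p_Y_given_X(y, x) - pY[y]) < 1e-12
```

Running the same check on a non-product joint pmf (such as the toy pmf in the discrete section above) would make at least one of these assertions fail.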
 
==Properties==
Seen as a function of <math>y</math> for given <math>x</math>, <math>P(Y=y|X=x)</math> is a probability mass function and so the sum over all <math>y</math> (or integral if it is a conditional probability density) is 1. Seen as a function of <math>x</math> for given <math>y</math>, it is a [[likelihood function]], so that the sum over all <math>x</math> need not be 1.
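The asymmetry between the two readings can be shown numerically with an assumed toy joint pmf (illustrative values only): fixing <math>x</math> and summing over <math>y</math> gives 1, while fixing <math>y</math> and summing over <math>x</math> generally does not.

```python
# Assumed toy joint pmf of (X, Y) on {0,1} x {0,1}.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def p(y, x):
    """Conditional pmf P(Y = y | X = x)."""
    return joint[(x, y)] / sum(joint[(x, yy)] for yy in (0, 1))

# As a function of y for fixed x: a pmf, so it sums to 1.
print(p(0, 0) + p(1, 0))  # 1.0

# As a function of x for fixed y: a likelihood, so the sum over x
# need not be 1 (here 0.1/0.4 + 0.2/0.6, which is less than 1).
print(p(0, 0) + p(0, 1))
```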
 
==Measure-theoretic formulation==