Given two jointly distributed [[random variable]]s ''X'' and ''Y'', the '''conditional probability distribution''' of ''Y'' given ''X'' (written "''Y'' | ''X''") is the [[probability distribution]] of ''Y'' when ''X'' is known to take a particular value.
For [[discrete random variable]]s, the [[conditional probability]] mass function can be written as ''P''(''Y'' = ''y'' | ''X'' = ''x''). By the definition of conditional probability, and then by [[Bayes' theorem]], this is
:<math>P(Y=y|X=x) = \frac{P(X=x,Y=y)}{P(X=x)}= \frac{P(X=x|Y=y) P(Y=y)}{P(X=x)}</math>
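The two equalities above can be checked numerically on a small table. The following is a minimal sketch in Python; the joint probabilities and the helper names (<code>marginal_x</code>, <code>cond_y_given_x</code>, and so on) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y) for x in {0, 1} and y in {0, 1, 2}.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}

def marginal_x(x):
    """P(X=x): sum the joint pmf over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """P(Y=y): sum the joint pmf over all x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_y_given_x(y, x):
    """P(Y=y | X=x) = P(X=x, Y=y) / P(X=x)."""
    return joint[(x, y)] / marginal_x(x)

def cond_x_given_y(x, y):
    """P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y)."""
    return joint[(x, y)] / marginal_y(y)

# First equality: joint probability divided by the marginal of X.
direct = cond_y_given_x(1, 0)

# Second equality (Bayes' theorem): P(X=x | Y=y) P(Y=y) / P(X=x).
via_bayes = cond_x_given_y(0, 1) * marginal_y(1) / marginal_x(0)

print(direct, via_bayes)
```

Exact fractions are used instead of floating-point numbers so that the two routes can be compared with <code>==</code> without rounding error.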
Similarly for [[continuous random variable]]s, the conditional [[probability density function]] can be written as ''p''<sub>''Y''|''X''</sub>(''y'' | ''x'') and this is
:<math>p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)}= \frac{p_{X|Y}(x|y)p_Y(y)}{p_X(x)}</math>
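where ''p''<sub>''X'',''Y''</sub>(''x'', ''y'') is the [[joint distribution|joint density]] of ''X'' and ''Y'', and ''p''<sub>''X''</sub>(''x'') is the [[marginal distribution|marginal density]] of ''X''. The conditional density is defined only where ''p''<sub>''X''</sub>(''x'') > 0.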
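As a small worked illustration (the density here is an assumed example, chosen only because the integrals are easy), let the joint density be ''x'' + ''y'' on the unit square 0 ≤ ''x'' ≤ 1, 0 ≤ ''y'' ≤ 1 and zero elsewhere. Then
:<math>p_X(x) = \int_0^1 (x+y)\,dy = x + \tfrac{1}{2}, \qquad p_{Y|X}(y|x) = \frac{x+y}{x+\tfrac{1}{2}} \quad (0 \le y \le 1)</math>
and, for each fixed ''x'', the conditional density integrates to 1 over ''y'':
:<math>\int_0^1 \frac{x+y}{x+\tfrac{1}{2}}\,dy = \frac{x+\tfrac{1}{2}}{x+\tfrac{1}{2}} = 1</math>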
The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: [[Borel's paradox]] shows that conditional probability density functions need not be invariant under coordinate transformations.
If, for discrete random variables, ''P''(''Y'' = ''y'' | ''X'' = ''x'') = ''P''(''Y'' = ''y'') for all ''x'' and ''y'', or, for continuous random variables, ''p''<sub>''Y''|''X''</sub>(''y'' | ''x'') = ''p''<sub>''Y''</sub>(''y'') for all ''x'' and ''y'', then ''Y'' is said to be [[Statistical independence|independent]] of ''X'' (and this implies that ''X'' is also independent of ''Y'').
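For a finite table, the independence condition can be tested directly. The sketch below is illustrative only: the function name <code>is_independent</code> and both example tables are hypothetical, and the function assumes the table lists every (''x'', ''y'') pair, including pairs with probability zero.

```python
from fractions import Fraction as F
from itertools import product

def is_independent(joint):
    """Return True if P(Y=y | X=x) == P(Y=y) for every x with P(X=x) > 0.

    `joint` maps (x, y) pairs to probabilities and must contain every pair.
    """
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
    return all(
        joint[(x, y)] / p_x[x] == p_y[y]
        for x, y in product(xs, ys)
        if p_x[x] > 0
    )

# A joint pmf built as a product of marginals is independent by construction.
marg_x = {0: F(1, 4), 1: F(3, 4)}
marg_y = {0: F(1, 3), 1: F(2, 3)}
independent = {(x, y): marg_x[x] * marg_y[y] for x, y in product(marg_x, marg_y)}

# Here Y always equals X, so knowing X determines Y completely.
dependent = {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(0), (1, 1): F(1, 2)}

print(is_independent(independent), is_independent(dependent))
```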
Seen as a function of ''y'' for given ''x'', ''P''(''Y'' = ''y'' | ''X'' = ''x'') is a probability mass function, so its sum over all ''y'' (or its integral, in the case of a conditional density) is 1. Seen as a function of ''x'' for given ''y'', it is a [[likelihood]] function, and its sum over all ''x'' need not be 1.
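The contrast between the two readings can be seen on a concrete table. This is a minimal sketch with a hypothetical joint distribution; the names <code>sum_over_y</code> and <code>sum_over_x</code> are illustrative.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y) for x in {0, 1} and y in {0, 1, 2}.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}
xs = [0, 1]
ys = [0, 1, 2]

def cond_y_given_x(y, x):
    """P(Y=y | X=x) = P(X=x, Y=y) / P(X=x)."""
    p_x = sum(joint[(x, yi)] for yi in ys)
    return joint[(x, y)] / p_x

# Fix x = 0 and vary y: a probability mass function, so the values sum to 1.
sum_over_y = sum(cond_y_given_x(y, 0) for y in ys)

# Fix y = 1 and vary x: a likelihood, whose values need not sum to 1.
sum_over_x = sum(cond_y_given_x(1, x) for x in xs)

print(sum_over_y, sum_over_x)
```

In this table the sum over ''y'' is exactly 1, while the sum over ''x'' is 1/2 + 1/6 = 2/3.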
[[Category:Probability distributions]]