{{Short description|Variance of a random variable given value of other variables}}
In [[probability theory]] and [[statistics]], a '''conditional variance''' is the [[variance]] of a [[random variable]] given the value(s) of one or more other variables.
Particularly in [[econometrics]], the conditional variance is also known as the '''scedastic function''' or '''skedastic function'''.<ref>{{cite book |first=Aris |last=Spanos |chapter=Conditioning and regression |title=Probability Theory and Statistical Inference |location=New York |publisher=Cambridge University Press |year=1999 |isbn=0-521-42408-9 |pages=339–356 [p. 342] |url=https://books.google.com/books?id=G0_HxBubGAwC&pg=PA342 }}</ref> Conditional variances are important parts of [[autoregressive conditional heteroskedasticity]] (ARCH) models.
 
==Definition==
The conditional variance of a [[random variable]] ''Y'' given another random variable ''X'' is

:<math>\operatorname{Var}(Y\mid X) = \operatorname{E}\Big(\big(Y - \operatorname{E}(Y\mid X)\big)^{2}\;\Big|\; X\Big).</math>

The conditional variance tells us how much variance is left if we use <math>\operatorname{E}(Y\mid X)</math> to "predict" ''Y''.
Here, as usual, <math>\operatorname{E}(Y\mid X)</math> stands for the [[conditional expectation]] of ''Y'' given ''X'',
which, as we may recall, is itself a random variable (a function of ''X'', determined up to probability one).
As a result, <math>\operatorname{Var}(Y\mid X)</math> itself is a random variable (and is a function of ''X'').
 
==Explanation, relation to least-squares ==
{{main|least-squares}}
Recall that variance is the expected squared deviation between a random variable (say, ''Y'') and its expected value.
The expected value can be thought of as a reasonable prediction of the outcomes of the random experiment (in particular, the expected value is the best constant prediction when predictions are assessed by expected squared prediction error). Thus, one interpretation of variance is that it gives the smallest possible expected squared prediction error. If we have the knowledge of another random variable (''X'') that we can use to predict ''Y'', we can potentially use this knowledge to reduce the expected squared error. As it turns out, the best prediction of ''Y'' given ''X'' is the conditional expectation. In particular, for any <math>f: \mathbb{R} \to \mathbb{R}</math> measurable,
 
:<math>
\begin{align}
\operatorname{E}[ (Y-f(X))^2 ]
&= \operatorname{E}[ (Y-\operatorname{E}(Y|X)\,\,+\,\, \operatorname{E}(Y|X)-f(X) )^2 ] \\
&= \operatorname{E}[ \operatorname{E}\{ (Y-\operatorname{E}(Y|X)\,\,+\,\, \operatorname{E}(Y|X)-f(X) )^2|X\} ] \\
&= \operatorname{E}[\operatorname{Var}( Y| X )] + \operatorname{E}[(\operatorname{E}(Y|X)-f(X))^2]\,.
\end{align}
</math>
 
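By selecting <math>f(X)=\operatorname{E}(Y|X)</math>, the second, nonnegative term becomes zero, which proves the claim.
Here, the second equality used the [[law of total expectation]], and the third holds because, conditionally on ''X'', the cross term <math>(Y-\operatorname{E}(Y|X))(\operatorname{E}(Y|X)-f(X))</math> has zero expectation.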
We also see that the expected conditional variance of ''Y'' given ''X'' shows up as the irreducible error of predicting ''Y'' given only the knowledge of ''X''.
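
The decomposition above can be checked exactly on a small discrete example. The following is a minimal sketch, not a general implementation: the joint distribution <code>joint</code> and the competing predictor ''f''(''x'') = ''x'' are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y); the numbers are illustrative only.
joint = {
    (0, 0): F(1, 4), (0, 1): F(1, 4),
    (1, 0): F(1, 8), (1, 2): F(3, 8),
}

def cond_mean(x):
    """E(Y | X=x), computed from the joint pmf."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(p * y for (xi, y), p in joint.items() if xi == x) / p_x

def mse(f):
    """Expected squared prediction error E[(Y - f(X))^2]."""
    return sum(p * (y - f(x)) ** 2 for (x, y), p in joint.items())

# Error of the conditional expectation; this equals E[Var(Y|X)].
best = mse(cond_mean)
# Error of an arbitrary competing predictor f(x) = x (an assumption).
other = mse(lambda x: x)
# Second term of the decomposition: E[(E(Y|X) - f(X))^2].
gap = sum(p * (cond_mean(x) - x) ** 2 for (x, _), p in joint.items())

print(best, other, gap)
```

In this example the error of the competing predictor equals the error of the conditional expectation plus the nonnegative second term, as the decomposition states.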
 
==Special cases, variations==
===Conditioning on discrete random variables===
When ''X'' takes on countably many values <math>S = \{x_1,x_2,\dots\}</math> with positive probability, i.e., it is a [[discrete random variable]], we can introduce <math>\operatorname{Var}(Y|X=x)</math>, the conditional variance of ''Y'' given that ''X=x'', for any ''x'' from ''S'' as follows:
 
:<math>\operatorname{Var}(Y|X=x) = \operatorname{E}((Y - \operatorname{E}(Y\mid X=x))^{2}\mid X=x)=\operatorname{E}(Y^2|X=x)-\operatorname{E}(Y|X=x)^2,</math>
 
where <math>\operatorname{E}(Z\mid X=x)</math> denotes the [[Conditional_expectation#Conditional_expectation_with_respect_to_a_random_variable|conditional expectation of ''Z'' given that ''X=x'']], which is well-defined for <math>x\in S</math>.
An alternative notation for <math>\operatorname{Var}(Y|X=x)</math> is <math>\operatorname{Var}_{Y\mid X}(Y|x).</math>
 
Note that here <math>\operatorname{Var}(Y|X=x)</math> defines a constant for each possible value of ''x''; in particular, <math>\operatorname{Var}(Y|X=x)</math> is ''not'' a random variable.
 
The connection of this definition to <math>\operatorname{Var}(Y|X)</math> is as follows:
Let ''S'' be as above and define the function <math>v: S \to \mathbb{R}</math> as <math>v(x) = \operatorname{Var}(Y|X=x)</math>. Then, <math>v(X) = \operatorname{Var}(Y|X)</math> [[almost surely]].
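
As an illustration of the discrete case, the conditional variances can be computed directly from a joint probability mass function. This is a minimal sketch under stated assumptions: the table <code>joint</code> and the helper names <code>cond_moment</code> and <code>cond_var</code> are hypothetical and not part of any library.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y), chosen only for illustration.
joint = {
    (0, 0): F(1, 4), (0, 1): F(1, 4),
    (1, 0): F(1, 8), (1, 2): F(3, 8),
}

def cond_moment(x, power):
    """E(Y**power | X=x), computed from the joint pmf."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(p * y ** power for (xi, y), p in joint.items() if xi == x) / p_x

def cond_var(x):
    """Var(Y | X=x) = E(Y^2 | X=x) - E(Y | X=x)^2."""
    return cond_moment(x, 2) - cond_moment(x, 1) ** 2

# The function v: S -> R from the text, tabulated over the support S of X.
support = sorted({x for x, _ in joint})
v = {x: cond_var(x) for x in support}
print(v)
```

Each entry of <code>v</code> is a constant; evaluating the function at the random value of ''X'' yields the random variable <math>\operatorname{Var}(Y|X)</math>.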
 
===Definition using conditional distributions===
The "conditional expectation of ''Y'' given ''X=x''" can also be defined more generally
using the [[conditional distribution]] of ''Y'' given ''X'' (this exists here because both ''X'' and ''Y'' are real-valued).
 
In particular, letting <math>P_{Y|X}</math> be the (regular) [[conditional distribution]] of ''Y'' given ''X'', i.e., <math>P_{Y|X}:\mathcal{B} \times \mathbb{R}\to [0,1]</math> (the intention is that <math>P_{Y|X}(U,x) = P(Y\in U|X=x)</math> almost surely over the support of ''X''), we can define
 
:<math> \operatorname{Var}(Y|X=x) = \int \left(y- \int y' P_{Y|X}(dy'|x)\right)^2 P_{Y|X}(dy|x). </math>
 
This can, of course, be specialized to when ''Y'' is discrete itself (replacing the integrals with sums), and also when the [[conditional density]] of ''Y'' given ''X=x'' with respect to some underlying distribution exists.
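
As a worked instance of the conditional-density case, suppose (purely as an illustrative assumption) that, given ''X=x'' with ''x'' > 0, ''Y'' is uniformly distributed on the interval [0, ''x''], so that its conditional density is 1/''x'' on that interval. The integrals above then evaluate to:

```latex
\begin{align}
\operatorname{E}(Y\mid X=x) &= \int_0^x y\,\frac{1}{x}\,dy = \frac{x}{2}, \\
\operatorname{Var}(Y\mid X=x) &= \int_0^x \left(y-\frac{x}{2}\right)^2 \frac{1}{x}\,dy = \frac{x^2}{12}.
\end{align}
```

In this example the conditional variance grows with ''x'', so <math>\operatorname{Var}(Y|X)</math> is a non-constant function of ''X''.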
 
==Components of variance==
The [[law of total variance]] says
 
:<math>\operatorname{Var}(Y) = \operatorname{E}(\operatorname{Var}(Y\mid X))+\operatorname{Var}(\operatorname{E}(Y\mid X)).</math>
 
In words: the variance of ''Y'' is the sum of the expected conditional variance of ''Y'' given ''X'' and the variance of the conditional expectation of ''Y'' given ''X''. The first term captures the variation left after "using ''X'' to predict ''Y''", while the second term captures the variation due to the mean of the prediction of ''Y'' due to the randomness of ''X''.
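
The two components can be computed exactly for a small discrete distribution. The sketch below assumes a hypothetical joint probability mass function <code>joint</code>; the helper names are likewise illustrative.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X=x, Y=y); the numbers are illustrative only.
joint = {
    (0, 0): F(1, 4), (0, 1): F(1, 4),
    (1, 0): F(1, 8), (1, 2): F(3, 8),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def cond_expect(g, x):
    """E[g(Y) | X=x], computed from the joint pmf."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(p * g(y) for (xi, y), p in joint.items() if xi == x) / p_x

def cond_mean(x):
    return cond_expect(lambda y: y, x)

def cond_var(x):
    return cond_expect(lambda y: y * y, x) - cond_mean(x) ** 2

# Var(Y), computed directly.
total_var = expect(lambda x, y: y * y) - expect(lambda x, y: y) ** 2
# E[Var(Y|X)]: the variation left after using X to predict Y.
within = expect(lambda x, y: cond_var(x))
# Var(E(Y|X)): the variation of the prediction itself.
between = (expect(lambda x, y: cond_mean(x) ** 2)
           - expect(lambda x, y: cond_mean(x)) ** 2)

print(total_var, within, between)
```

For this distribution the two components sum to the directly computed variance of ''Y'', in agreement with the law of total variance.
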
In the law of total variance, the inner expectation or variance is taken with respect to ''Y'' conditional on ''X'', while the outer expectation or variance is taken with respect to ''X''.

==See also==
*[[Mixed model]]
*[[Random effects model]]
 
==References==
{{Reflist}}
 
==Further reading==
* {{cite book |first=George |last=Casella |first2=Roger L. |last2=Berger |title=Statistical Inference |publisher=Wadsworth |edition=Second |year=2002 |isbn=0-534-24312-6 |pages=151–52 |url=https://books.google.com/books?id=0x_vAAAAMAAJ&pg=PA151 }}
 
[[Category:Statistical deviation and dispersion]]
[[Category:Statistical terminology]]
[[Category:Theory of probability distributions]]
[[Category:Conditional probability]]
 
 
{{statistics-stub}}