Continuous mapping theorem

{{Short description|Probability theorem}}
{{Distinguish|text=the [[contraction mapping theorem]]}}
In [[probability theory]], the '''continuous mapping theorem''' states that continuous functions [[Continuous function#Heine definition of continuity|preserve limits]] even if their arguments are sequences of random variables. A continuous function, in [[Continuous function#Heine definition of continuity|Heine's definition]], is one that maps convergent sequences into convergent sequences: if ''x<sub>n</sub>'' → ''x'' then ''g''(''x<sub>n</sub>'') → ''g''(''x''). The ''continuous mapping theorem'' states that this remains true if the deterministic sequence {''x<sub>n</sub>''} is replaced by a sequence of random variables {''X<sub>n</sub>''}, and the standard notion of convergence of real numbers “→” is replaced by one of the types of [[convergence of random variables]].
 
This theorem was first proved by [[Henry Mann]] and [[Abraham Wald]] in 1943,<ref>{{cite journal | doi = 10.1214/aoms/1177731415 | last1 = Mann |first1=H. B. | last2=Wald |first2=A. | year = 1943 | title = On Stochastic Limit and Order Relationships | journal = [[Annals of Mathematical Statistics]] | volume = 14 | issue = 3 | pages = 217–226 | jstor = 2235800 | doi-access = free }}</ref> and it is therefore sometimes called the '''Mann–Wald theorem'''.<ref>{{cite book | last = Amemiya | first = Takeshi | author-link = Takeshi Amemiya | year = 1985 | title = Advanced Econometrics | publisher = Harvard University Press | ___location = Cambridge, MA | isbn = 0-674-00560-0 | url = https://books.google.com/books?id=0bzGQE14CwEC&pg=pA88 |page=88 }}</ref> Meanwhile, [[Denis Sargan]] refers to it as the '''general transformation theorem'''.<ref>{{cite book |first=Denis |last=Sargan |title=Lectures on Advanced Econometric Theory |___location=Oxford |publisher=Basil Blackwell |year=1988 |isbn=0-631-14956-2 |pages=4–8 }}</ref>
 
==Statement==
Let {''X<sub>n</sub>''}, ''X'' be [[random element]]s defined on a [[metric space]] ''S''. Suppose a function {{nowrap|''g'': ''S''→''S′''}} (where ''S′'' is another metric space) has the set of [[Discontinuity (mathematics)|discontinuity points]] ''D<sub>g</sub>'' such that {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}}. Then<ref>{{cite book | last = Billingsley | first = Patrick | author-link = Patrick Billingsley | title = Convergence of Probability Measures | year = 1969 | publisher = John Wiley & Sons | isbn = 0-471-07242-7|page=31 (Corollary 1) }}</ref><ref>{{cite book | last = van der Vaart | first = A. W. | title = Asymptotic Statistics | year = 1998 | publisher = Cambridge University Press | ___location = New York | isbn = 0-521-49603-9 | url =https://books.google.com/books?id=UEuQEM5RjWgC&pg=PA7 |page=7 (Theorem 2.3) }}</ref>
 
: <math>
\begin{align}
X_n \ \xrightarrow{\text{d}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{d}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{p}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{p}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{a.s.}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{a.s.}}\ g(X).
\end{align}
</math>
where the labels "d", "p", and "a.s." above the arrows denote [[convergence in distribution]], [[convergence in probability]], and [[almost sure convergence]], respectively.
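
For example, suppose ''X<sub>n</sub>'' converges in distribution to a standard [[normal distribution|normal]] random variable ''X''. Applying the theorem with the everywhere-continuous map {{nowrap|1=''g''(''x'') = ''x''<sup>2</sup>}} (so that {{nowrap|1=''D<sub>g</sub>'' = ∅}} and the condition on discontinuity points holds trivially) shows that ''X<sub>n</sub>''<sup>2</sup> converges in distribution to ''X''<sup>2</sup>, which follows the [[chi-squared distribution]] with one degree of freedom.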
 
==Proof==
<div style="text-align:right"><small>This proof has been adapted from {{harv|van der Vaart|1998|loc=Theorem 2.3}}</small></div>
 
The spaces ''S'' and ''S′'' are each equipped with a metric. For simplicity we will denote both metrics by |''x''&nbsp;−&nbsp;''y''|, even though the metrics may be arbitrary and not necessarily Euclidean.
 
===Convergence in distribution===
We will need a particular statement from the [[portmanteau theorem]]: that convergence in distribution <math>X_n\xrightarrow{\text{d}}X</math> is equivalent to
: <math> \mathbb E f(X_n) \to \mathbb E f(X)</math> for every bounded continuous function ''f''.
 
It therefore suffices to prove that <math> \mathbb E f(g(X_n)) \to \mathbb E f(g(X))</math> for every bounded continuous function ''f''. For simplicity, assume that ''g'' is continuous everywhere. Then <math> F = f \circ g</math> is itself a bounded continuous function, and so the claim follows from the statement above. The general case, in which ''g'' may be discontinuous on a set of probability zero, is slightly more technical.
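
The equivalence above also suggests a numerical illustration. The following minimal Monte Carlo sketch (illustrative only, and not part of the proof; the particular choices of ''f'', ''g'', the sample sizes, and the variable names are arbitrary) estimates <math>\mathbb E f(g(X_n))</math> for a sequence ''X<sub>n</sub>'' obeying the [[central limit theorem]] and compares the estimates with <math>\mathbb E f(g(X))</math> for the normal limit ''X'':

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(sample, f, g):
    """Monte Carlo estimate of E[f(g(X))] from a sample drawn from X."""
    return f(g(sample)).mean()

f = np.tanh    # a bounded continuous function, as in the portmanteau criterion
g = np.square  # a continuous mapping, g(x) = x^2

m = 200_000    # number of Monte Carlo replications
for n in (1, 10, 100, 1000):
    # X_n: scaled mean of n uniform variables with mean 0 and variance 1;
    # by the central limit theorem, X_n converges in distribution to N(0, 1).
    u = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(m, n))
    x_n = np.sqrt(n) * u.mean(axis=1)
    print(f"n = {n:4d}:  E f(g(X_n)) ~ {mc_estimate(x_n, f, g):.4f}")

x = rng.standard_normal(m)  # a sample from the limit X ~ N(0, 1)
print(f"limit:     E f(g(X))   ~ {mc_estimate(x, f, g):.4f}")
</syntaxhighlight>

As ''n'' grows, the estimates of <math>\mathbb E f(g(X_n))</math> approach the estimate of <math>\mathbb E f(g(X))</math>, consistent with ''g''(''X<sub>n</sub>'') converging in distribution to ''g''(''X'').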
 
===Convergence in probability===
Fix an arbitrary ''ε''&nbsp;>&nbsp;0. Then for any ''δ''&nbsp;>&nbsp;0 consider the set ''B<sub>δ</sub>'' defined as
: <math>
B_\delta = \big\{x\in S\setminus D_g \,\big|\, \exists y\in S:\ |x-y|<\delta,\ |g(x)-g(y)|>\varepsilon\big\}.
</math>
This is the set of continuity points ''x'' of the function ''g''(·) for which it is possible to find, within the ''δ''-neighborhood of ''x'', a point which maps outside the ''ε''-neighborhood of ''g''(''x''). By definition of continuity, this set shrinks as ''δ'' goes to zero, so that lim<sub>''δ''&nbsp;→&nbsp;0</sub>''B<sub>δ</sub>''&nbsp;=&nbsp;∅.
 
Now suppose that |''g''(''X'')&nbsp;−&nbsp;''g''(''X<sub>n</sub>'')|&nbsp;>&nbsp;''ε''. This implies that at least one of the following is true: either |''X''−''X<sub>n</sub>''|&nbsp;≥&nbsp;''δ'', or ''X''&nbsp;∈&nbsp;''D<sub>g</sub>'', or ''X''∈''B<sub>δ</sub>''. In terms of probabilities this can be written as
: <math>
\Pr\big(\big|g(X_n)-g(X)\big|>\varepsilon\big) \leq
\Pr\big(|X_n-X|\geq\delta\big) + \Pr(X\in B_\delta) + \Pr(X\in D_g).
</math>
 
On the right-hand side, the first term converges to zero as ''n''&nbsp;→&nbsp;∞ for any fixed ''δ'', by the definition of convergence in probability of the sequence {''X<sub>n</sub>''}. The second term converges to zero as ''δ''&nbsp;→&nbsp;0, since the set ''B<sub>δ</sub>'' shrinks to an empty set. And the last term is identically equal to zero by assumption of the theorem. Therefore, the conclusion is that
: <math>
\lim_{n\to\infty}\Pr \big(\big|g(X_n)-g(X)\big|>\varepsilon\big) = 0,
</math>
which means that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in probability.
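
A standard application of this part of the theorem: if the [[weak law of large numbers]] gives <math>\bar X_n \xrightarrow{\text{p}} \mu</math> for some constant {{nowrap|''μ'' ≠ 0}}, then taking {{nowrap|1=''g''(''x'') = 1/''x''}}, whose only discontinuity point is {{nowrap|1=''x'' = 0}}, yields <math>1/\bar X_n \xrightarrow{\text{p}} 1/\mu</math>; the condition {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}} is satisfied because the limit is the constant ''μ'', which lies in ''D<sub>g</sub>'' with probability zero.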
 
=== Almost sure convergence ===
By definition of the continuity of the function ''g''(·),
: <math>
\lim_{n\to\infty}X_n(\omega) = X(\omega) \quad\Rightarrow\quad \lim_{n\to\infty}g(X_n(\omega)) = g(X(\omega))
</math>
at each point ''X''(''ω'') where ''g''(·) is continuous. Therefore,
: <math>\begin{align}
\Pr\left(\lim_{n\to\infty}g(X_n) = g(X)\right)
&\geq \Pr\left(\lim_{n\to\infty}g(X_n) = g(X),\ X\notin D_g\right) \\
&\geq \Pr\left(\lim_{n\to\infty}X_n = X,\ X\notin D_g\right) = 1,
\end{align}</math>
because the intersection of two almost sure events is almost sure.
 
By the definition of almost sure convergence, we conclude that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') almost surely.
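
Similarly, if the [[strong law of large numbers]] gives <math>\bar X_n \xrightarrow{\text{a.s.}} \mu</math>, then for any function ''g'' that is continuous at ''μ'', for instance {{nowrap|1=''g''(''x'') = ''e''<sup>''x''</sup>}}, the theorem yields <math>g(\bar X_n) \xrightarrow{\text{a.s.}} g(\mu)</math>.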
 
==See also==
* [[Slutsky's theorem]]
* [[Portmanteau theorem]]
* [[Pushforward measure]]
 
==References==
{{reflist}}
 
[[Category:Theorems in probability theory]]
[[Category:Theorems in statistics]]
[[Category:Articles containing proofs]]