{{Short description|Probability theorem}}
{{Distinguish|text=the [[contraction mapping theorem]]}}
In [[probability theory]], the '''continuous mapping theorem''' states that continuous functions [[Continuous function#Heine definition of continuity|preserve limits]] even if their arguments are sequences of random variables. A continuous function, in [[Continuous function#Heine definition of continuity|Heine's definition]], is a function that maps convergent sequences into convergent sequences: if ''x<sub>n</sub>'' → ''x'' then ''g''(''x<sub>n</sub>'') → ''g''(''x''). The ''continuous mapping theorem'' states that this remains true if we replace the deterministic sequence {''x<sub>n</sub>''} with a sequence of random variables {''X<sub>n</sub>''}, and replace the standard notion of convergence of real numbers “→” with one of the types of [[convergence of random variables]].
 
This theorem was first proved by [[Henry Mann]] and [[Abraham Wald]] in 1943,<ref>{{cite journal | doi = 10.1214/aoms/1177731415 | last1 = Mann |first1=H. B. | last2=Wald |first2=A. | year = 1943 | title = On Stochastic Limit and Order Relationships | journal = [[Annals of Mathematical Statistics]] | volume = 14 | issue = 3 | pages = 217–226 | jstor = 2235800 | doi-access = free }}</ref> and it is therefore sometimes called the '''Mann–Wald theorem'''.<ref>{{cite book | last = Amemiya | first = Takeshi | author-link = Takeshi Amemiya | year = 1985 | title = Advanced Econometrics | publisher = Harvard University Press | location = Cambridge, MA | isbn = 0-674-00560-0 | url = https://books.google.com/books?id=0bzGQE14CwEC&pg=pA88 |page=88 }}</ref> Meanwhile, [[Denis Sargan]] refers to it as the '''general transformation theorem'''.<ref>{{cite book |first=Denis |last=Sargan |title=Lectures on Advanced Econometric Theory |location=Oxford |publisher=Basil Blackwell |year=1988 |isbn=0-631-14956-2 |pages=4–8 }}</ref>
 
==Statement==
Let {''X<sub>n</sub>''}, ''X'' be [[random element]]s taking values in a [[metric space]] ''S''. Suppose a function {{nowrap|''g'': ''S''→''S′''}} (where ''S′'' is another metric space) has the set of [[Discontinuity (mathematics)|discontinuity points]] ''D<sub>g</sub>'' such that {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}}. Then<ref>{{harvnb|Van der Vaart|1998|loc=Theorem 2.3, page 7}}</ref><ref>{{harvnb|Billingsley|1969|page=31, Corollary 1}}</ref><ref>{{harvnb|Billingsley|1999|page=21, Theorem 2.7}}</ref>
 
: <math>
\begin{align}
X_n \ \xrightarrow{\text{d}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{d}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{p}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{p}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\!\!\text{a.s.}\!\!}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\!\!\text{a.s.}\!\!}\ g(X).
\end{align}
</math>
where the superscripts "d", "p", and "a.s." denote [[convergence in distribution]], [[convergence in probability]], and [[almost sure convergence]], respectively.
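
The first implication can be illustrated numerically. The following is a minimal sketch, not part of the cited sources: it assumes that ''X<sub>n</sub>'' is the standardized mean of ''n'' independent Uniform(0,&nbsp;1) draws (which converges in distribution to a standard normal variable ''Z'' by the central limit theorem) and that ''g''(''x'')&nbsp;=&nbsp;''x''<sup>2</sup>. The function names, the sample sizes, and the seed are illustrative choices.

```python
import math
import random


def g(x):
    """Continuous map applied to the standardized mean (illustrative choice)."""
    return x * x


def standardized_mean(n, rng):
    """Z_n = sqrt(n) * (mean of n Uniform(0,1) draws - 1/2) / sqrt(1/12)."""
    total = sum(rng.random() for _ in range(n))
    return math.sqrt(n) * (total / n - 0.5) / math.sqrt(1.0 / 12.0)


def estimate_cdf_of_g(n, t, reps, seed=0):
    """Monte Carlo estimate of Pr[g(Z_n) <= t]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(reps) if g(standardized_mean(n, rng)) <= t)
    return hits / reps


# Limit value: Pr[Z^2 <= 1] = Pr[|Z| <= 1] for a standard normal Z.
limit_value = math.erf(1.0 / math.sqrt(2.0))
estimate = estimate_cdf_of_g(n=200, t=1.0, reps=4000)
print(f"estimate = {estimate:.3f}, limit = {limit_value:.3f}")
```

Under these assumptions the estimated probability should be close to Pr[''Z''<sup>2</sup>&nbsp;≤&nbsp;1], up to Monte Carlo error.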
 
==Proof==
<div style="text-align:right"><small>This proof has been adopted from {{harv|van der Vaart|1998|loc=Theorem 2.3}}</small></div>
 
Spaces ''S'' and ''S′'' are equipped with certain metrics. For simplicity we will denote both of these metrics using the |''x''&nbsp;−&nbsp;''y''| notation, even though the metrics may be arbitrary and not necessarily Euclidean.
 
===Convergence in distribution===
We will need two statements from the [[portmanteau theorem]]: convergence in distribution <math>X_n\xrightarrow{d}X</math> is equivalent to
: <math> \mathbb E f(X_n) \to \mathbb E f(X)</math> for every bounded continuous functional ''f'',
and also to
: <math> \limsup_{n\to\infty}\operatorname{Pr}(X_n \in F) \leq \operatorname{Pr}(X\in F) \text{ for every closed set } F.</math>
 
In terms of expectations, it suffices to prove that <math> \mathbb E f(g(X_n)) \to \mathbb E f(g(X))</math> for every bounded continuous functional ''f''. Suppose first that ''g'' is continuous. Then the composition <math>f \circ g</math> is itself a bounded continuous functional, so the claim follows from the statement above applied to <math>f \circ g</math>. The general case, in which ''g'' may be discontinuous on the set ''D<sub>g</sub>'', is slightly more technical.
Fix an arbitrary closed set ''F''⊂''S′''. Denote by ''g''<sup>−1</sup>(''F'') the pre-image of ''F'' under the mapping ''g'': the set of all points ''x''&nbsp;∈&nbsp;''S'' such that ''g''(''x'')∈''F''. Consider a sequence {''x<sub>k</sub>''} such that ''g''(''x<sub>k</sub>'')&nbsp;∈&nbsp;''F'' and ''x<sub>k</sub>''&nbsp;→&nbsp;''x''. Then this sequence lies in ''g''<sup>−1</sup>(''F''), and its limit point ''x'' belongs to the [[closure (topology)|closure]] of this set, <span style="text-decoration:overline">''g''<sup>−1</sup>(''F'')</span> (by definition of the closure). The point ''x'' may be either:
* a continuity point of ''g'', in which case ''g''(''x<sub>k</sub>'')&nbsp;→&nbsp;''g''(''x''), and hence ''g''(''x'')∈''F'' because ''F'' is a closed set, and therefore in this case ''x'' belongs to the pre-image of ''F'', or
* a discontinuity point of ''g'', so that ''x''&nbsp;∈&nbsp;''D<sub>g</sub>''.
Thus the following relationship holds:
: <math>
\overline{g^{-1}(F)} \ \subset\ g^{-1}(F) \cup D_g\ .
</math>
 
Consider the event {''g''(''X<sub>n</sub>'')∈''F''}. The probability of this event can be estimated as
: <math>
\operatorname{Pr}\big(g(X_n)\in F\big) = \operatorname{Pr}\big(X_n\in g^{-1}(F)\big) \leq \operatorname{Pr}\big(X_n\in \overline{g^{-1}(F)}\big),
</math>
and by the portmanteau theorem the [[limsup]] of the last expression is less than or equal to Pr(''X''&nbsp;∈&nbsp;<span style="text-decoration:overline">''g''<sup>−1</sup>(''F'')</span>). Using the formula we derived in the previous paragraph, this can be written as
: <math>\begin{align}
& \operatorname{Pr}\big(X\in \overline{g^{-1}(F)}\big) \leq
\operatorname{Pr}\big(X\in g^{-1}(F)\cup D_g\big) \leq \\
& \operatorname{Pr}\big(X \in g^{-1}(F)\big) + \operatorname{Pr}(X\in D_g) =
\operatorname{Pr}\big(g(X) \in F\big) + 0.
\end{align}</math>
 
On plugging this back into the original expression, it can be seen that
: <math>
\limsup_{n\to\infty} \Pr \big(g(X_n)\in F\big) \leq \Pr \big(g(X) \in F\big),
</math>
which, by the portmanteau theorem, implies that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in distribution.
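
The step Pr(''X''&nbsp;∈&nbsp;''D<sub>g</sub>'')&nbsp;=&nbsp;0 in the bound above cannot be dropped. A small deterministic sketch, using an illustrative choice of ''g'' and of the sequence that is not taken from the cited sources, shows what can go wrong: take the degenerate sequence ''X<sub>n</sub>''&nbsp;=&nbsp;1/''n'', which converges to ''X''&nbsp;=&nbsp;0, and let ''g'' be the indicator function of (0,&nbsp;∞), so that ''D<sub>g</sub>''&nbsp;=&nbsp;{0} and Pr(''X''&nbsp;∈&nbsp;''D<sub>g</sub>'')&nbsp;=&nbsp;1.

```python
def g(x):
    """Indicator of (0, infinity); its only discontinuity is at x = 0."""
    return 1.0 if x > 0 else 0.0


# Degenerate sequence X_n = 1/n, which converges to X = 0 in every mode.
values_along_sequence = [g(1.0 / n) for n in range(1, 1001)]
value_at_limit = g(0.0)

# g(X_n) stays at 1 while g(X) is 0, so g(X_n) does not converge to g(X).
print(set(values_along_sequence), value_at_limit)
```

Here ''g''(''X<sub>n</sub>'') is constant and different from ''g''(''X''), so none of the three modes of convergence is preserved.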
 
===Convergence in probability===
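Fix an arbitrary ''ε''&nbsp;>&nbsp;0. Then for any ''δ''&nbsp;>&nbsp;0 consider the set ''B<sub>δ</sub>'' defined as
: <math>
B_\delta = \big\{x\in S \mid x\notin D_g:\ \exists y\in S:\ |x-y|<\delta,\, |g(x)-g(y)|>\varepsilon\big\}.
</math>
This is the set of continuity points ''x'' of ''g'' for which the ''δ''-neighborhood of ''x'' contains a point that ''g'' maps outside the ''ε''-neighborhood of ''g''(''x''). By definition of continuity, this set shrinks as ''δ'' goes to zero, so that lim<sub>''δ''&nbsp;→&nbsp;0</sub>''B<sub>δ</sub>''&nbsp;=&nbsp;∅.

Now suppose that |''g''(''X'')&nbsp;−&nbsp;''g''(''X<sub>n</sub>'')|&nbsp;>&nbsp;''ε''. Then at least one of the following holds: |''X''&nbsp;−&nbsp;''X<sub>n</sub>''|&nbsp;≥&nbsp;''δ'', or ''X''&nbsp;∈&nbsp;''D<sub>g</sub>'', or ''X''&nbsp;∈&nbsp;''B<sub>δ</sub>''. In terms of probabilities this can be written as
: <math>
\Pr\big(\big|g(X_n)-g(X)\big|>\varepsilon\big) \leq \Pr\big(|X_n-X|\geq\delta\big) + \Pr(X\in B_\delta) + \Pr(X\in D_g).
</math>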
 
On the right-hand side, the first term converges to zero as ''n''&nbsp;→&nbsp;∞ for any fixed ''δ'', by the definition of convergence in probability of the sequence {''X<sub>n</sub>''}. The second term converges to zero as ''δ''&nbsp;→&nbsp;0, since the set ''B<sub>δ</sub>'' shrinks to an empty set. And the last term is identically equal to zero by assumption of the theorem. Therefore, the conclusion is that
: <math>
\lim_{n\to\infty}\Pr \big(\big|g(X_n)-g(X)\big|>\varepsilon\big) = 0,
</math>
which means that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in probability.
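
The limit above can be checked by simulation. The sketch below is an illustration under stated assumptions rather than part of the proof: ''X<sub>n</sub>'' is taken to be the mean of ''n'' independent Uniform(0,&nbsp;1) draws, which converges in probability to the constant ''X''&nbsp;=&nbsp;1/2 by the weak law of large numbers, and ''g''(''x'')&nbsp;=&nbsp;''x''<sup>2</sup>. The tolerance ''ε''&nbsp;=&nbsp;0.05, the number of repetitions, and the function names are arbitrary choices.

```python
import random


def g(x):
    """Continuous map (illustrative choice)."""
    return x * x


def sample_mean(n, rng):
    """X_n: mean of n independent Uniform(0,1) draws."""
    return sum(rng.random() for _ in range(n)) / n


def exceedance_probability(n, eps, reps, seed=0):
    """Monte Carlo estimate of Pr[|g(X_n) - g(X)| > eps] with X = 1/2."""
    rng = random.Random(seed)
    target = g(0.5)
    count = sum(
        1 for _ in range(reps) if abs(g(sample_mean(n, rng)) - target) > eps
    )
    return count / reps


probabilities = {
    n: exceedance_probability(n, eps=0.05, reps=1000) for n in (10, 100, 1000)
}
for n, p in probabilities.items():
    print(f"n = {n:5d}: estimated probability = {p:.3f}")
```

The estimated probabilities should decrease toward zero as ''n'' grows, in line with the conclusion of this section.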
 
===Almost sure convergence===
By definition of the continuity of the function ''g''(·),
: <math>
\lim_{n\to\infty}X_n(\omega) = X(\omega) \quad\Rightarrow\quad \lim_{n\to\infty}g(X_n(\omega)) = g(X(\omega))
</math>
at each point ''X''(''ω'') where ''g''(·) is continuous. Therefore,
: <math>\begin{align}
\operatorname{Pr}\left(\lim_{n\to\infty}g(X_n) = g(X)\right)
&\geq \operatorname{Pr}\left(\lim_{n\to\infty}g(X_n) = g(X),\ X\notin D_g\right) \\
&\geq \operatorname{Pr}\left(\lim_{n\to\infty}X_n = X,\ X\notin D_g\right) = 1,
\end{align}</math>
because the intersection of two almost sure events is almost sure.
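
Because almost sure convergence is a statement about individual sample paths, it can be illustrated on a single simulated path. The following sketch rests on assumptions that are not in the cited sources: ''X<sub>n</sub>'' is the running mean of fair coin flips, which converges almost surely to 1/2 by the strong law of large numbers, and ''g''(''x'')&nbsp;=&nbsp;''e<sup>x</sup>''. The path length and the seed are arbitrary.

```python
import math
import random


def g(x):
    """Continuous map (illustrative choice)."""
    return math.exp(x)


def running_means(n, seed=0):
    """Running means X_1, ..., X_n of fair coin flips along one sample path."""
    rng = random.Random(seed)
    total = 0
    means = []
    for k in range(1, n + 1):
        total += 1 if rng.random() < 0.5 else 0  # one fair coin flip
        means.append(total / k)
    return means


n = 100_000
path = running_means(n)
limit = g(0.5)

# Largest deviation of g(X_k) from g(1/2) over the second half of the path.
tail_sup = max(abs(g(m) - limit) for m in path[n // 2:])
print(f"sup over the tail of |g(X_k) - g(1/2)| = {tail_sup:.4f}")
```

A finite path cannot prove convergence, but the small deviation over the tail of the path is consistent with ''g''(''X<sub>n</sub>'') converging to ''g''(1/2) along that path.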
 
 
==See also==
* [[Slutsky's theorem]]
* [[Portmanteau theorem]]
* [[Pushforward measure]]
 
==References==
{{reflist}}
 
==Further reading==
* {{cite book
| last = Amemiya
| first = Takeshi
| authorlink = Takeshi Amemiya
| year = 1985
| title = Advanced Econometrics
| publisher = Harvard University Press
| location = Cambridge, MA
| isbn = 0-674-00560-0
| url = https://books.google.com/books?id=0bzGQE14CwEC
| ref = harv
}}
* {{cite book
| last = Billingsley
| first = Patrick
| authorlink = Patrick Billingsley
| title = Convergence of Probability Measures
| year = 1969
| publisher = John Wiley & Sons
| isbn = 0-471-07242-7
| ref = harv
}}
* {{cite book
| last = Billingsley
| first = Patrick
| title = Convergence of Probability Measures
| year = 1999
| publisher = John Wiley & Sons
| edition = 2nd
| isbn = 0-471-19745-9
| url = https://books.google.com/books?id=QY06uAAACAAJ
| ref = harv
}}
* {{cite journal
| doi = 10.1214/aoms/1177731415
| last = Mann |first=H. B.
| authorlink = Henry Mann
| last2=Wald |first2=A.
| authorlink2 = Abraham Wald
| year = 1943
| title = On Stochastic Limit and Order Relationships
| journal = [[Annals of Mathematical Statistics]]
| volume = 14
| issue = 3
| pages = 217–226
| jstor = 2235800
| ref = CITEREFMannWald1943
}}
* {{cite book
| last = Van der Vaart
| first = A. W.
| title = Asymptotic statistics
| year = 1998
| publisher = Cambridge University Press
| location = New York
| isbn = 0-521-49603-9
| url = https://books.google.com/books?id=UEuQEM5RjWgC
| ref = CITEREFVan_der_Vaart1998
}}
 
[[Category:Probability theorems]]
[[Category:Statistical theorems]]
[[Category:Articles containing proofs]]