Continuous mapping theorem

{{Short description|Probability theorem}}
{{Distinguish|text=the [[contraction mapping theorem]]}}

In [[probability theory]], the '''continuous mapping theorem''' states that continuous functions [[Continuous function#Heine definition of continuity|preserve limits]] even if their arguments are sequences of random variables. A continuous function, in [[Continuous function#Heine definition of continuity|Heine's definition]], is a function that maps convergent sequences into convergent sequences: if ''x<sub>n</sub>'' → ''x'' then ''g''(''x<sub>n</sub>'') → ''g''(''x''). The ''continuous mapping theorem'' states that this remains true if the deterministic sequence {''x<sub>n</sub>''} is replaced with a sequence of random variables {''X<sub>n</sub>''}, and the standard notion of convergence of real numbers “→” is replaced with one of the types of [[convergence of random variables]].

This theorem was first proved by [[Henry Mann]] and [[Abraham Wald]] in 1943,<ref>{{cite journal | doi = 10.1214/aoms/1177731415 | last1 = Mann |first1=H. B. | last2=Wald |first2=A.
| year = 1943 | title = On Stochastic Limit and Order Relationships | journal = [[Annals of Mathematical Statistics]] | volume = 14 | issue = 3 | pages = 217–226 | jstor = 2235800 | doi-access = free }}</ref> and it is therefore sometimes called the '''Mann–Wald theorem'''.<ref>{{cite book | last = Amemiya | first = Takeshi | author-link = Takeshi Amemiya | year = 1985 | title = Advanced Econometrics | publisher = Harvard University Press | location = Cambridge, MA | isbn = 0-674-00560-0 | url = https://books.google.com/books?id=0bzGQE14CwEC&pg=pA88 |page=88 }}</ref> [[Denis Sargan]] refers to it as the '''general transformation theorem'''.<ref>{{cite book |first=Denis |last=Sargan |title=Lectures on Advanced Econometric Theory |location=Oxford |publisher=Basil Blackwell |year=1988 |isbn=0-631-14956-2 |pages=4–8 }}</ref>

==Statement==
Let {''X<sub>n</sub>''}, ''X'' be [[random element]]s taking values in a [[metric space]] ''S''. Suppose a function {{nowrap|''g'': ''S''→''S′''}} (where ''S′'' is another metric space) has the set of [[Discontinuity (mathematics)|discontinuity points]] ''D<sub>g</sub>'' such that {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}}. Then<ref>{{cite book | last = Billingsley | first = Patrick | author-link = Patrick Billingsley | title = Convergence of Probability Measures | year = 1969 | publisher = John Wiley & Sons | isbn = 0-471-07242-7 |page=31 (Corollary 1) }}</ref><ref>{{cite book | last = van der Vaart | first = A. W.
| title = Asymptotic Statistics | year = 1998 | publisher = Cambridge University Press | location = New York | isbn = 0-521-49603-9 | url =https://books.google.com/books?id=UEuQEM5RjWgC&pg=PA7 |page=7 (Theorem 2.3) }}</ref>
: <math>
\begin{align}
X_n \ \xrightarrow{\text{d}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{d}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{p}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{p}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\!\!\text{a.s.}\!\!}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\!\!\text{a.s.}\!\!}\ g(X).
\end{align}
</math>
where the superscripts "d", "p", and "a.s." denote [[convergence in distribution]], [[convergence in probability]], and [[almost sure convergence]], respectively.

==Proof==
<div style="text-align:right"><small>This proof has been adapted from {{harv|van der Vaart|1998|loc=Theorem 2.3}}</small></div>

Spaces ''S'' and ''S′'' are equipped with certain metrics. For simplicity we will denote both of these metrics using the |''x'' − ''y''| notation, even though the metrics may be arbitrary and not necessarily Euclidean.

===Convergence in distribution===
We will need a particular statement from the [[portmanteau theorem]]: that convergence in distribution <math>X_n\xrightarrow{d}X</math> is equivalent to
: <math> \mathbb E f(X_n) \to \mathbb E f(X) </math> for every bounded continuous functional ''f''.

So it suffices to prove that <math> \mathbb E f(g(X_n)) \to \mathbb E f(g(X))</math> for every bounded continuous functional ''f''.
For simplicity, first assume that ''g'' is continuous. Then the composition <math>f \circ g</math> is itself a bounded continuous functional, and the claim follows from the statement above.

The general case is slightly more technical. It relies on another equivalent statement from the portmanteau theorem: <math>X_n\xrightarrow{d}X</math> if and only if <math>\limsup_{n\to\infty}\Pr(X_n \in F) \leq \Pr(X \in F)</math> for every closed set ''F''.

Fix an arbitrary closed set ''F'' ⊂ ''S′''. Denote by ''g''<sup>−1</sup>(''F'') the pre-image of ''F'' under the mapping ''g'': the set of all points ''x'' ∈ ''S'' such that ''g''(''x'') ∈ ''F''. Take any point ''x'' in the [[closure (topology)|closure]] of this set, <span style="text-decoration:overline">''g''<sup>−1</sup>(''F'')</span>. Since ''S'' is a metric space, there is a sequence {''x<sub>k</sub>''} such that ''g''(''x<sub>k</sub>'') ∈ ''F'' and ''x<sub>k</sub>'' → ''x''. The point ''x'' may be either:
* a continuity point of ''g'', in which case ''g''(''x<sub>k</sub>'') → ''g''(''x''), and hence ''g''(''x'') ∈ ''F'' because ''F'' is a closed set, so that ''x'' belongs to the pre-image of ''F'', or
* a discontinuity point of ''g'', so that ''x'' ∈ ''D<sub>g</sub>''.

Thus the following relationship holds:
: <math> \overline{g^{-1}(F)} \ \subset\ g^{-1}(F) \cup D_g\ . </math>

Consider the event {''g''(''X<sub>n</sub>'') ∈ ''F''}. The probability of this event can be estimated as
: <math> \Pr\big(g(X_n)\in F\big) = \Pr\big(X_n\in g^{-1}(F)\big) \leq \Pr\big(X_n\in \overline{g^{-1}(F)}\big), </math>
and by the portmanteau theorem the [[limsup]] of the last expression is less than or equal to Pr(''X'' ∈ <span style="text-decoration:overline">''g''<sup>−1</sup>(''F'')</span>).
Using the inclusion <math>\overline{g^{-1}(F)} \subset g^{-1}(F) \cup D_g</math> derived above, this can be written as
: <math>\begin{align}
& \Pr\big(X\in \overline{g^{-1}(F)}\big) \leq \Pr\big(X\in g^{-1}(F)\cup D_g\big) \leq \\
& \Pr\big(X \in g^{-1}(F)\big) + \Pr(X\in D_g) = \Pr\big(g(X) \in F\big) + 0.
\end{align}</math>

On plugging this back into the original expression, it can be seen that
: <math> \limsup_{n\to\infty} \Pr\big(g(X_n)\in F\big) \leq \Pr\big(g(X) \in F\big), </math>
which, by the portmanteau theorem, implies that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in distribution.

===Convergence in probability===
Fix an arbitrary ''ε'' > 0. Then for any ''δ'' > 0 consider the set ''B<sub>δ</sub>'' defined as
: <math> B_\delta = \big\{x\in S \mid x\notin D_g:\ \exists y\in S:\ |x-y|<\delta,\, |g(x)-g(y)|>\varepsilon\big\}. </math>

This is the set of continuity points ''x'' of the function ''g''(·) for which it is possible to find, within the ''δ''-neighborhood of ''x'', a point which maps outside the ''ε''-neighborhood of ''g''(''x''). By definition of continuity, this set shrinks as ''δ'' goes to zero, so that lim<sub>''δ'' → 0</sub> ''B<sub>δ</sub>'' = ∅.

Now suppose that |''g''(''X'') − ''g''(''X<sub>n</sub>'')| > ''ε''. This implies that at least one of the following is true: either |''X'' − ''X<sub>n</sub>''| ≥ ''δ'', or ''X'' ∈ ''D<sub>g</sub>'', or ''X'' ∈ ''B<sub>δ</sub>''. In terms of probabilities this can be written as
: <math> \Pr\big(\big|g(X_n)-g(X)\big|>\varepsilon\big) \leq \Pr\big(|X_n-X|\geq\delta\big) + \Pr(X\in B_\delta) + \Pr(X\in D_g). </math>

On the right-hand side, the first term converges to zero as ''n'' → ∞ for any fixed ''δ'', by the definition of convergence in probability of the sequence {''X<sub>n</sub>''}.
The second term converges to zero as ''δ'' → 0, since the set ''B<sub>δ</sub>'' shrinks to an empty set. The last term is identically equal to zero by assumption of the theorem. Therefore, letting first ''n'' → ∞ and then ''δ'' → 0, the conclusion is that
: <math> \lim_{n\to\infty}\Pr \big(\big|g(X_n)-g(X)\big|>\varepsilon\big) = 0, </math>
which means that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in probability.

===Almost sure convergence===
By definition of the continuity of the function ''g''(·),
: <math> \lim_{n\to\infty}X_n(\omega) = X(\omega) \quad\Rightarrow\quad \lim_{n\to\infty}g(X_n(\omega)) = g(X(\omega)) </math>
at each point ''X''(''ω'') where ''g''(·) is continuous. Therefore,
: <math>\begin{align}
\Pr\left(\lim_{n\to\infty}g(X_n) = g(X)\right)
&\geq \Pr\left(\lim_{n\to\infty}g(X_n) = g(X),\ X\notin D_g\right) \\
&\geq \Pr\left(\lim_{n\to\infty}X_n = X,\ X\notin D_g\right) = 1,
\end{align}</math>
because the intersection of two almost sure events is almost sure.

By definition, we conclude that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') almost surely.
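
==Numerical illustration==
The convergence-in-probability statement can be checked by simulation. The following is a minimal sketch, not taken from the cited sources, under illustrative assumptions: ''X<sub>n</sub>'' is the mean of ''n'' independent Uniform(0, 1) draws, which converges in probability to ''μ'' = 0.5 by the [[law of large numbers]]; ''g'' is the exponential function, which is continuous everywhere; and ''ε'' = 0.05. The helper name <code>exceedance_probability</code>, the seed, and the number of trials are hypothetical choices made only for this example.

```python
import math
import random


def exceedance_probability(g, n, mu, eps, trials, rng):
    """Monte Carlo estimate of Pr(|g(X_n) - g(mu)| > eps).

    X_n is the mean of n independent Uniform(0, 1) draws
    (an assumption of this sketch, not of the theorem).
    """
    exceed = 0
    for _ in range(trials):
        x_n = sum(rng.random() for _ in range(n)) / n
        if abs(g(x_n) - g(mu)) > eps:
            exceed += 1
    return exceed / trials


rng = random.Random(0)  # fixed seed so that the run is reproducible
mu = 0.5                # mean of the Uniform(0, 1) distribution

# Estimate the exceedance probability for growing sample sizes n.
estimates = {
    n: exceedance_probability(math.exp, n, mu, eps=0.05, trials=2000, rng=rng)
    for n in (10, 100, 1000)
}

for n, p in estimates.items():
    print(f"n = {n:5d}  estimated Pr(|g(X_n) - g(mu)| > eps) = {p:.3f}")
```

Consistent with the theorem, the estimated probabilities should decrease towards zero as ''n'' grows. The exact values depend on the seed and on the number of trials.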

==See also==
* [[Slutsky's theorem]]
* [[Portmanteau theorem]]
* [[Pushforward measure]]

==References==
{{reflist}}

[[Category:Theorems in probability theory]]
[[Category:Theorems in statistics]]
[[Category:Articles containing proofs]]