Continuous mapping theorem

{{Short description|Probability theorem}}
{{Distinguish|text=the [[contraction mapping theorem]]}}
In [[probability theory]], the '''continuous mapping theorem''' states that continuous functions [[Continuous function#Heine definition of continuity|preserve limits]] even if their arguments are sequences of random variables. A continuous function, in [[Continuous function#Heine definition of continuity|Heine's definition]], is one that maps convergent sequences into convergent sequences: if ''x<sub>n</sub>'' → ''x'' then ''g''(''x<sub>n</sub>'') → ''g''(''x''). The ''continuous mapping theorem'' states that this remains true if the deterministic sequence {''x<sub>n</sub>''} is replaced by a sequence of random variables {''X<sub>n</sub>''}, and the standard notion of convergence of real numbers “→” is replaced by one of the types of [[convergence of random variables]].
 
This theorem was first proved by [[Henry Mann]] and [[Abraham Wald]] in 1943,<ref>{{cite journal | doi = 10.1214/aoms/1177731415 | last1 = Mann |first1=H. B. | last2=Wald |first2=A. | year = 1943 | title = On Stochastic Limit and Order Relationships | journal = [[Annals of Mathematical Statistics]] | volume = 14 | issue = 3 | pages = 217–226 | jstor = 2235800 | doi-access = free }}</ref> and it is therefore sometimes called the '''Mann–Wald theorem'''.<ref>{{cite book | last = Amemiya | first = Takeshi | author-link = Takeshi Amemiya | year = 1985 | title = Advanced Econometrics | publisher = Harvard University Press | ___location = Cambridge, MA | isbn = 0-674-00560-0 | url = https://books.google.com/books?id=0bzGQE14CwEC&pg=pA88 |page=88 }}</ref> Meanwhile, [[Denis Sargan]] refers to it as the '''general transformation theorem'''.<ref>{{cite book |first=Denis |last=Sargan |title=Lectures on Advanced Econometric Theory |___location=Oxford |publisher=Basil Blackwell |year=1988 |isbn=0-631-14956-2 |pages=4–8 }}</ref>
 
==Statement==
Let {''X<sub>n</sub>''}, ''X'' be [[random element]]s defined on a [[metric space]] ''S''. Suppose a function {{nowrap|''g'': ''S''→''S′''}} (where ''S′'' is another metric space) has the set of [[Discontinuity (mathematics)|discontinuity points]] ''D<sub>g</sub>'' such that {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}}. Then<ref>{{cite book | last = Billingsley | first = Patrick | author-link = Patrick Billingsley | title = Convergence of Probability Measures | year = 1969 | publisher = John Wiley & Sons | isbn = 0-471-07242-7|page=31 (Corollary 1) }}</ref><ref>{{cite book | last = van der Vaart | first = A. W. | title = Asymptotic Statistics | year = 1998 | publisher = Cambridge University Press | ___location = New York | isbn = 0-521-49603-9 | url =https://books.google.com/books?id=UEuQEM5RjWgC&pg=PA7 |page=7 (Theorem 2.3) }}</ref>
 
: <math>
\begin{align}
X_n \ \xrightarrow{\text{d}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{d}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{p}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{p}}\ g(X); \\[6pt]
X_n \ \xrightarrow{\text{a.s.}}\ X \quad & \Rightarrow\quad g(X_n)\ \xrightarrow{\text{a.s.}}\ g(X).
\end{align}
</math>
where the labels "d", "p", and "a.s." above the arrows denote [[convergence in distribution]], [[convergence in probability]], and [[almost sure convergence]], respectively.
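
For example, suppose ''X<sub>n</sub>'' converges in distribution to a standard [[normal distribution|normal]] random variable ''X''. Applying the theorem with the everywhere-continuous map {{nowrap|1=''g''(''x'') = ''x''<sup>2</sup>}} (so that {{nowrap|1=''D<sub>g</sub>'' = ∅}} and the condition on discontinuity points holds trivially) shows that ''X<sub>n</sub>''<sup>2</sup> converges in distribution to ''X''<sup>2</sup>, which follows the [[chi-squared distribution]] with one degree of freedom.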
 
==Proof==
<div style="text-align:right"><small>This proof has been adapted from {{harv|van der Vaart|1998|loc=Theorem 2.3}}</small></div>
 
The spaces ''S'' and ''S′'' are each equipped with a metric. For simplicity we will denote both metrics by |''x''&nbsp;−&nbsp;''y''|, even though the metrics may be arbitrary and not necessarily Euclidean.
 
===Convergence in distribution===
We will need a particular statement from the [[portmanteau theorem]]: that convergence in distribution <math>X_n\xrightarrow{\text{d}}X</math> is equivalent to
: <math> \mathbb E f(X_n) \to \mathbb E f(X)</math> for every bounded continuous function ''f''.
 
It therefore suffices to prove that <math> \mathbb E f(g(X_n)) \to \mathbb E f(g(X))</math> for every bounded continuous function ''f''. For simplicity, assume that ''g'' is continuous everywhere. Then <math> F = f \circ g</math> is itself a bounded continuous function, and so the claim follows from the statement above. The general case, in which ''g'' may be discontinuous on a set of probability zero, is slightly more technical.
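
The equivalence above also suggests a numerical illustration. The following minimal Monte Carlo sketch (illustrative only, and not part of the proof; the particular choices of ''f'', ''g'', the sample sizes, and the variable names are arbitrary) estimates <math>\mathbb E f(g(X_n))</math> for a sequence ''X<sub>n</sub>'' obeying the [[central limit theorem]] and compares the estimates with <math>\mathbb E f(g(X))</math> for the normal limit ''X'':

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(sample, f, g):
    """Monte Carlo estimate of E[f(g(X))] from a sample drawn from X."""
    return f(g(sample)).mean()

f = np.tanh    # a bounded continuous function, as in the portmanteau criterion
g = np.square  # a continuous mapping, g(x) = x^2

m = 200_000    # number of Monte Carlo replications
for n in (1, 10, 100, 1000):
    # X_n: scaled mean of n uniform variables with mean 0 and variance 1;
    # by the central limit theorem, X_n converges in distribution to N(0, 1).
    u = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(m, n))
    x_n = np.sqrt(n) * u.mean(axis=1)
    print(f"n = {n:4d}:  E f(g(X_n)) ~ {mc_estimate(x_n, f, g):.4f}")

x = rng.standard_normal(m)  # a sample from the limit X ~ N(0, 1)
print(f"limit:     E f(g(X))   ~ {mc_estimate(x, f, g):.4f}")
</syntaxhighlight>

As ''n'' grows, the estimates of <math>\mathbb E f(g(X_n))</math> approach the estimate of <math>\mathbb E f(g(X))</math>, consistent with ''g''(''X<sub>n</sub>'') converging in distribution to ''g''(''X'').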
 
===Convergence in probability===
Fix an arbitrary ''ε''&nbsp;>&nbsp;0. Then for any ''δ''&nbsp;>&nbsp;0 consider the set ''B<sub>δ</sub>'' defined as
: <math>
B_\delta = \big\{x\in S\setminus D_g \,\big|\, \exists y\in S:\ |x-y|<\delta,\ |g(x)-g(y)|>\varepsilon\big\}.
</math>
This is the set of continuity points ''x'' of the function ''g''(·) for which it is possible to find, within the ''δ''-neighborhood of ''x'', a point which maps outside the ''ε''-neighborhood of ''g''(''x''). By definition of continuity, this set shrinks as ''δ'' goes to zero, so that lim<sub>''δ''&nbsp;→&nbsp;0</sub>''B<sub>δ</sub>''&nbsp;=&nbsp;∅.
 
Now suppose that |''g''(''X'')&nbsp;−&nbsp;''g''(''X<sub>n</sub>'')|&nbsp;>&nbsp;''ε''. This implies that at least one of the following is true: either |''X''−''X<sub>n</sub>''|&nbsp;≥&nbsp;''δ'', or ''X''&nbsp;∈&nbsp;''D<sub>g</sub>'', or ''X''∈''B<sub>δ</sub>''. In terms of probabilities this can be written as
: <math>
\Pr\big(\big|g(X_n)-g(X)\big|>\varepsilon\big) \leq
\Pr\big(|X_n-X|\geq\delta\big) + \Pr(X\in B_\delta) + \Pr(X\in D_g).
</math>
 
On the right-hand side, the first term converges to zero as ''n''&nbsp;→&nbsp;∞ for any fixed ''δ'', by the definition of convergence in probability of the sequence {''X<sub>n</sub>''}. The second term converges to zero as ''δ''&nbsp;→&nbsp;0, since the set ''B<sub>δ</sub>'' shrinks to an empty set. And the last term is identically equal to zero by assumption of the theorem. Therefore, the conclusion is that
: <math>
\lim_{n\to\infty}\Pr \big(\big|g(X_n)-g(X)\big|>\varepsilon\big) = 0,
</math>
which means that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') in probability.
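
A standard application of this part of the theorem: if the [[weak law of large numbers]] gives <math>\bar X_n \xrightarrow{\text{p}} \mu</math> for some constant {{nowrap|''μ'' ≠ 0}}, then taking {{nowrap|1=''g''(''x'') = 1/''x''}}, whose only discontinuity point is {{nowrap|1=''x'' = 0}}, yields <math>1/\bar X_n \xrightarrow{\text{p}} 1/\mu</math>; the condition {{nowrap|1=Pr[''X'' ∈ ''D<sub>g</sub>''] = 0}} is satisfied because the limit is the constant ''μ'', which lies in ''D<sub>g</sub>'' with probability zero.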
 
=== Almost sure convergence ===
By definition of the continuity of the function ''g''(·),
: <math>
\lim_{n\to\infty}X_n(\omega) = X(\omega) \quad\Rightarrow\quad \lim_{n\to\infty}g(X_n(\omega)) = g(X(\omega))
</math>
at each point ''X''(''ω'') where ''g''(·) is continuous. Therefore,
: <math>\begin{align}
\Pr\left(\lim_{n\to\infty}g(X_n) = g(X)\right)
&\geq \Pr\left(\lim_{n\to\infty}g(X_n) = g(X),\ X\notin D_g\right) \\
&\geq \Pr\left(\lim_{n\to\infty}X_n = X,\ X\notin D_g\right) = 1,
\end{align}</math>
because the intersection of two almost sure events is almost sure.
 
By the definition of almost sure convergence, we conclude that ''g''(''X<sub>n</sub>'') converges to ''g''(''X'') almost surely.
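
Similarly, if the [[strong law of large numbers]] gives <math>\bar X_n \xrightarrow{\text{a.s.}} \mu</math>, then for any function ''g'' that is continuous at ''μ'', for instance {{nowrap|1=''g''(''x'') = ''e''<sup>''x''</sup>}}, the theorem yields <math>g(\bar X_n) \xrightarrow{\text{a.s.}} g(\mu)</math>.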
 
==See also==
* [[Slutsky's theorem]]
* [[Portmanteau theorem]]
* [[Pushforward measure]]
 
==References==
{{reflist}}
 
[[Category:Theorems in probability theory]]
[[Category:Theorems in statistics]]
[[Category:Articles containing proofs]]