Banach fixed-point theorem: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 04:24, 13 October 2023 edit 5.212.164.135 (talk) →Statement Tags: Mobile edit Mobile web edit ← Previous edit		Latest revision as of 19:58, 29 January 2025 edit undo Tito Omburo (talk \| contribs) Extended confirmed users 3,038 edits →Generalizations: compact case
(22 intermediate revisions by 13 users not shown)
Line 1: {{Short description\|Theorem about metric spaces}} In [[mathematics]], the '''Banach fixed-point theorem''' (also known as the '''contraction mapping theorem''' or '''contractive mapping theorem''' or '''~~Banach-Caccioppoli~~Banach–Caccioppoli theorem''') is an important [[Convergence proof techniques#contraction mapping\|tool]] in the theory of [[metric space]]s; it guarantees the existence and uniqueness of [[fixed point (mathematics)\|fixed points]] of certain self-maps of metric spaces, and provides a constructive method to find those fixed points. It can be understood as an abstract formulation of [[Fixed-point iteration\|Picard's method of successive approximations]].<ref>{{cite book \|first1=David \|last1=Kinderlehrer \|author-link=David Kinderlehrer \|first2=Guido \|last2=Stampacchia \|author-link2=Guido Stampacchia \|chapter=Variational Inequalities in '''R'''<sup>N</sup> \|title=An Introduction to Variational Inequalities and Their Applications \|___location=New York \|publisher=Academic Press \|year=1980 \|isbn=0-12-407350-6 \|pages=7–22 \|chapter-url=https://books.google.com/books?id=eCDnoB3Np5oC&pg=PA7 }}</ref> The theorem is named after [[Stefan Banach]] (1892–1945) who first stated it in 1922.<ref>{{cite journal \|last=Banach\|first= Stefan\|author-link=Stefan Banach\| title=Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales\|journal=[[Fundamenta Mathematicae]]\|volume= 3\|year=1922\|pages= 133–181 \|url=http://matwbn.icm.edu.pl/ksiazki/or/or2/or215.pdf \|archive-url=https://web.archive.org/web/20110607002842/http://matwbn.icm.edu.pl/ksiazki/or/or2/or215.pdf \|archive-date=2011-06-07 \|url-status=live \|doi=10.4064/fm-3-1-133-181}}</ref><ref>{{cite journal \|first=Krzysztof \|last=Ciesielski \|title=On Stefan Banach and some of his results \|journal=Banach J. Math. Anal. \|volume=1 \|year=2007 \|issue=1 \|pages=1–10 \|url=http://www.emis.de/journals/BJMA/tex_v1_n1_a1.pdf \|archive-url=https://web.archive.org/web/20090530012258/http://www.emis.de/journals/BJMA/tex_v1_n1_a1.pdf \|archive-date=2009-05-30 \|url-status=live \|doi=10.15352/bjma/1240321550 \|doi-access=free }}</ref> ==Statement== Line 7: for all <math>x, y \in X.</math> <blockquote> ~~<blockquote>~~'''Banach ~~Fixed Point~~fixed-point ~~Theorem~~theorem.''' Let <math>(X, d)</math> be a non-[[Empty set\|~~non-~~empty]] [[complete metric space]] with a contraction mapping <math>T : X \to X.</math> Then ''T'' admits a unique [[Fixed point (mathematics)\|fixed-point]] <math>x^</math> in ''X'' (i.e. <math>T(x^) = x^</math>). Furthermore, <math>x^</math> can be found as follows: start with an arbitrary element <math>x_0 \in X</math> and define a [[sequence]] <math>(x_n)_{n\in\mathbb N}</math> by <math>x_n = T(x_{n-1})</math> for <math>n \geq 1.</math> Then <math>\lim_{n \to \infty} x_n = x^</math>.</blockquote> ''Remark 1.'' The following inequalities are equivalent and describe the [[Rate of convergence\|speed of convergence]]: Line 13 ⟶ 14: :<math> \begin{align} d(x^, x_n) & \leq \frac{q^n}{1-q} d(x_1,x_0), \\[5pt] d(x^, x_{n+1}) & \leq \frac{q}{1-q} d(x_{n+1},x_n), \\[5pt] d(x^, x_{n+1}) & \leq q d(x^,x_n). \end{align} Line 23 ⟶ 24: ''Remark 2.'' <math>d(T(x),T(y))<d(x,y)</math> for all <math>x \neq y</math> is in general not enough to ensure the existence of a fixed point, as is shown by the map :<math>T : [1,\infty) \to [1,\infty), \,\, T(x)=x+\tfrac{1}{x}\,,</math> which lacks a fixed point. However, if <math>X</math> is [[Compact space\|compact]], then this weaker assumption does imply the existence and uniqueness of a fixed point, that can be easily found as a minimizer of <math>d(x,T(x))</math>, indeed, a minimizer exists by compactness, and has to be a fixed point of <math>T.</math>. It then easily follows that the fixed point is the limit of any sequence of iterations of <math>T.</math>. ''Remark 3.'' When using the theorem in practice, the most difficult part is typically to define <math>X</math> properly so that <math>T(X) \subseteq X.</math> Line 29 ⟶ 30: ==Proof== Let <math>x_0 \in X</math> be arbitrary and define a [[sequence]] <math>(x_n)_{n\in\mathbb N}</math> by setting ~~''x~~<~~sub~~math>~~n</sub>''~~x_n = ''T''(~~''x''<sub>''~~x_{n~~''−1~~-1})</~~sub~~math>). We first note that for all <math>n \in \N,</math> we have the inequality :<math>d(x_{n+1}, x_n) \le q^n d(x_1, x_0).</math> This follows by [[Principle of mathematical induction\|induction]] on ''<math>n''</math>, using the fact that ''<math>T''</math> is a contraction mapping. Then we can show that <math>(x_n)_{n\in\mathbb N}</math> is a [[Cauchy sequence]]. In particular, let <math>m, n \in \N</math> such that ''<math>m'' > ''n'' </math>: : <math>\begin{align} d(x_m, x_n) & \leq d(x_m, x_{m-1}) + d(x_{m-1}, x_{m-2}) + \cdots + d(x_{n+1}, x_n) \\[5pt] & \leq q^{m-1}d(x_1, x_0) + q^{m-2}d(x_1, x_0) + \cdots + q^nd(x_1, x_0) \\[5pt] & = q^n d(x_1, x_0) \sum_{k=0}^{m-n-1} q^k \\[5pt] & \leq q^n d(x_1, x_0) \sum_{k=0}^\infty q^k \\[5pt] & = q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ). \end{align}</math> Let ε<math>\varepsilon > 0</math> be arbitrary. Since ''<math>q'' ∈\in [0, 1)</math>, we can find a large <math>N \in \N</math> so that :<math>q^N < \frac{\varepsilon(1-q)}{d(x_1, x_0)}.</math> Therefore, by choosing ''<math>m''</math> and ''<math>n''</math> greater than ''<math>N''</math> we may write: :<math>d(x_m, x_n) \leq q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ) < \left (\frac{\varepsilon(1-q)}{d(x_1, x_0)} \right ) d(x_1, x_0) \left ( \frac{1}{1-q} \right ) = \varepsilon.</math> This proves that the sequence <math>(x_n)_{n\in\mathbb N}</math> is Cauchy. By completeness of <math>(''X'','' d'')</math>, the sequence has a limit <math>x^ \in X.</math> Furthermore, <math>x^</math> must be a [[Fixed point (mathematics)\|fixed point]] of ''<math>T''</math>: :<math>x^=\lim_{n\to\infty} x_n = \lim_{n\to\infty} T(x_{n-1}) = T\left(\lim_{n\to\infty} x_{n-1} \right) = T(x^). </math> As a contraction mapping, ''<math>T''</math> is continuous, so bringing the limit inside ''<math>T''</math> was justified. Lastly, ''<math>T''</math> cannot have more than one fixed point in <math>(''X'','' d'')</math>, since any pair of distinct fixed points ~~''p~~<~~sub~~math>1p_1</~~sub~~math>'' and ~~''p~~<~~sub~~math>2p_2</~~sub~~math>'' would contradict the contraction of ''<math>T''</math>: :<math> d(T(p_1),T(p_2)) = d(p_1,p_2) > q d(p_1, p_2).</math> ==Applications== A standard application is the proof of the [[Picard–Lindelöf theorem]] about the existence and uniqueness of solutions to certain [[ordinary differential equation]]s. The sought solution of the differential equation is expressed as a fixed point of a suitable integral operator ~~which~~on ~~changes~~the space of continuous functions ~~into~~under ~~continuous~~the ~~functions~~[[uniform norm]]. The Banach fixed-point theorem is then used to show that this integral operator has a unique fixed point. * One consequence of the Banach fixed-point theorem is that small Lipschitz perturbations of the identity are [[Lipschitz continuity#Definitions\|bi-lipschitz]] homeomorphisms. Let Ω be an open set of a Banach space ''E''; let {{nobr\|''I'' : Ω → ''E''}} denote the identity (inclusion) map and let ''g'' : Ω → ''E'' be a Lipschitz map of constant ''k'' < 1. Then # Ω′ := (''I'' + ''g'')(Ω) is an open subset of ''E'': precisely, for any ''x'' in Ω such that {{nobr\|''B''(''x'', ''r'') ⊂ Ω}} one has {{nobr\|''B''((''I'' + ''g'')(''x''), ''r''(1−1 − ''k'')) ⊂ Ω′; }} # ''I'' + ''g'' : Ω → Ω′ is a bi-~~lipschitz~~Lipschitz homeomorphism; : precisely, (''I'' + ''g'')<sup>−1</sup> is still of the form {{nobr\|''I'' + ''h'' : Ω → Ω′}} with ''h'' a Lipschitz map of constant ''k''/(1−1 − ''k''). A direct consequence of this result yields the proof of the [[inverse function theorem]]. * It can be used to give sufficient conditions under which Newton's method of successive approximations is guaranteed to work, and similarly for Chebyshev's third -order method. * It can be used to prove existence and uniqueness of solutions to integral equations. * It can be used to give a proof to the [[Nash embedding theorem]].<ref>{{cite journal \|first=Matthias\|last=Günther\|title=Zum Einbettungssatz von J. Nash \| trans-title=On the embedding theorem of J. Nash \| language=de \| journal=[[Mathematische Nachrichten]]\|volume= 144 \|year=1989\|pages= 165–187\|doi=10.1002/mana.19891440113 \| mr=1037168}}</ref> * It can be used to prove existence and uniqueness of solutions to value iteration, policy iteration, and policy evaluation of [[reinforcement learning]].<ref>{{cite book \|first1=Frank L. \|last1=Lewis \|first2=Draguna \|last2=Vrabie \|first3=Vassilis L. \|last3=Syrmos \|title=Optimal Control \|chapter=Reinforcement Learning and Optimal Adaptive Control \|___location=New York \|publisher=John Wiley & Sons \|year=2012 \|isbn=978-1-118-12272-3 \|pages=461–517 [p. 474] \|chapter-url=https://books.google.com/books?id=U3Gtlot_hYEC&pg=PA474 }}</ref> * It can be used to prove existence and uniqueness of an equilibrium in [[Cournot competition]],<ref>{{cite journal \|first1=Ngo Van \|last1=Long \|first2=Antoine \|last2=Soubeyran \|title=Existence and Uniqueness of Cournot Equilibrium: A Contraction Mapping Approach \|journal=[[Economics Letters]] \|volume=67 \|issue=3 \|year=2000 \|pages=345–348 \|doi=10.1016/S0165-1765(00)00211-1 \|url=https://www.cirano.qc.ca/pdf/publication/99s-22.pdf \|archive-url=https://web.archive.org/web/20041230225125/http://www.cirano.qc.ca/pdf/publication/99s-22.pdf \|archive-date=2004-12-30 \|url-status=live }}</ref> and other dynamic economic models.<ref>{{cite book \|first1=Nancy L. \|last1=Stokey\|author1-link=Nancy Stokey \|first2=Robert E. Jr. \|last2=Lucas \|author-link2=Robert Lucas Jr. \|title=Recursive Methods in Economic Dynamics \|___location=Cambridge \|publisher=Harvard University Press \|year=1989 \|isbn=0-674-75096-9 \|pages=508–516 \|url=https://books.google.com/books?id=BgQ3AwAAQBAJ&pg=PA508 }}</ref> ==Converses== Line 83 ⟶ 84: Let ''T'' : ''X'' → ''X'' be a map on a complete non-empty metric space. Then, for example, some generalizations of the Banach fixed-point theorem are: Assume that some iterate ''T<sup>n</sup>'' of ''T'' is a contraction. Then ''T'' has a unique fixed point. Assume that for each ''n'', there exist ''c<sub>n</sub>'' such that ''d''(''T''<sup>''n''</sup>(''x''), ''T''<sup>''n''</sup>(''y'')) ≤ ''c''<sub>''n''</sub>''d''(''x'', ''y)'') for all ''x'' and ''y'', and that ::<math>\sum\nolimits_n c_n <\infty.</math> :Then ''T'' has a unique fixed point. In applications, the existence and uniqueness of a fixed point often can be shown directly with the standard Banach fixed point theorem, by a suitable choice of the metric that makes the map ''T'' a contraction. Indeed, the above result by Bessaga strongly suggests to look for such a metric. See also the article on [[fixed point theorems in infinite-dimensional spaces]] for generalizations. In a non-empty [[compact metric space]], any function <math>T</math> satisfying <math>d(T(x),T(y))<d(x,y)</math> for all distinct <math>x,y</math>, has a unique fixed point. The proof is simpler than the Banach theorem, because the function <math>d(T(x),x)</math> is continuous, and therefore assumes a minimum, which is easily shown to be zero. A different class of generalizations arise from suitable generalizations of the notion of [[metric space]], e.g. by weakening the defining axioms for the notion of metric.<ref>{{cite book \|first1=Pascal \|last1=Hitzler \| author-link1=Pascal Hitzler\|first2=Anthony \|last2=Seda \|title=Mathematical Aspects of Logic Programming Semantics \|publisher=Chapman and Hall/CRC \|year=2010 \|isbn=978-1-4398-2961-5 }}</ref> Some of these have applications, e.g., in the theory of programming semantics in theoretical computer science.<ref>{{cite journal \|first1=Anthony K. \|last1=Seda \|first2=Pascal \|last2=Hitzler \| author-link2=Pascal Hitzler\|title=Generalized Distance Functions in the Theory of Computation \|journal=The Computer Journal \|volume=53 \|issue=4 \|pages=443–464 \|year=2010 \|doi=10.1093/comjnl/bxm108 }}</ref> ==Example== An application of the Banach fixed-point theorem and fixed-point iteration can be used to quickly obtain an approximation of {{pi}} with high accuracy. Consider the function <math>f(x)=\sin(x)+x</math>. It can be verified that {{pi}} is a fixed point of ''f'', and that ''f'' maps the interval <math>\left[3\pi/4,5\pi/4\right]</math> to itself. Moreover, <math>f'(x)=1+\cos(x)</math>, and it can be verified that :<math>0\leq1+\cos(x)\leq1-\frac{1}{\sqrt{2}}<1</math> on this interval. Therefore, by an application of the [[mean value theorem]], ''f'' has a Lipschitz constant less than 1 (namely <math>1-1/\sqrt{2}</math>). Applying the Banach fixed-point theorem shows that the fixed point {{pi}} is the unique fixed point on the interval, allowing for fixed-point iteration to be used. For example, the value 3 may be chosen to start the fixed-point iteration, as <math>3\pi/4\leq3\leq5\pi/4</math>. The Banach fixed-point theorem may be used to conclude that : <math>\pi=f(f(f(\cdots f(3)\cdots)))).</math> Applying ''f'' to 3 only three times already yields an expansion of {{pi}} accurate to 33 digits: : <math>f(f(f(3)))=3.141592653589793238462643383279502\ldots\,.</math> ==See also== Line 114 ⟶ 133: {{PlanetMath attribution \|urlname=banachfixedpointtheorem \|title=Banach fixed point theorem }} {{Metric spaces}} {{Topology}} Line 119 ⟶ 139: [[Category:Articles containing proofs]] [[Category:Eponymous theorems of mathematics]] [[Category:Fixed-point theorems]] [[Category:Metric geometry]]