Banach fixed-point theorem: Difference between revisions

Content deleted Content added
Tags: Mobile edit Mobile web edit
Generalizations: compact case
 
(22 intermediate revisions by 13 users not shown)
Line 1:
{{Short description|Theorem about metric spaces}}
In [[mathematics]], the '''Banach fixed-point theorem''' (also known as the '''contraction mapping theorem''' or '''contractive mapping theorem''' or '''Banach-CaccioppoliBanach–Caccioppoli theorem''') is an important [[Convergence proof techniques#contraction mapping|tool]] in the theory of [[metric space]]s; it guarantees the existence and uniqueness of [[fixed point (mathematics)|fixed points]] of certain self-maps of metric spaces, and provides a constructive method to find those fixed points. It can be understood as an abstract formulation of [[Fixed-point iteration|Picard's method of successive approximations]].<ref>{{cite book |first1=David |last1=Kinderlehrer |author-link=David Kinderlehrer |first2=Guido |last2=Stampacchia |author-link2=Guido Stampacchia |chapter=Variational Inequalities in '''R'''<sup>N</sup> |title=An Introduction to Variational Inequalities and Their Applications |___location=New York |publisher=Academic Press |year=1980 |isbn=0-12-407350-6 |pages=7–22 |chapter-url=https://books.google.com/books?id=eCDnoB3Np5oC&pg=PA7 }}</ref> The theorem is named after [[Stefan Banach]] (1892–1945) who first stated it in 1922.<ref>{{cite journal |last=Banach|first= Stefan|author-link=Stefan Banach| title=Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales|journal=[[Fundamenta Mathematicae]]|volume= 3|year=1922|pages= 133–181 |url=http://matwbn.icm.edu.pl/ksiazki/or/or2/or215.pdf |archive-url=https://web.archive.org/web/20110607002842/http://matwbn.icm.edu.pl/ksiazki/or/or2/or215.pdf |archive-date=2011-06-07 |url-status=live |doi=10.4064/fm-3-1-133-181}}</ref><ref>{{cite journal |first=Krzysztof |last=Ciesielski |title=On Stefan Banach and some of his results |journal=Banach J. Math. Anal. |volume=1 |year=2007 |issue=1 |pages=1–10 |url=http://www.emis.de/journals/BJMA/tex_v1_n1_a1.pdf |archive-url=https://web.archive.org/web/20090530012258/http://www.emis.de/journals/BJMA/tex_v1_n1_a1.pdf |archive-date=2009-05-30 |url-status=live |doi=10.15352/bjma/1240321550 |doi-access=free }}</ref>
 
==Statement==
Line 7:
for all <math>x, y \in X.</math>
 
<blockquote>
<blockquote>'''Banach Fixed Pointfixed-point Theoremtheorem.''' Let <math>(X, d)</math> be a non-[[Empty set|non-empty]] [[complete metric space]] with a contraction mapping <math>T : X \to X.</math> Then ''T'' admits a unique [[Fixed point (mathematics)|fixed-point]] <math>x^*</math> in ''X'' (i.e. <math>T(x^*) = x^*</math>). Furthermore, <math>x^*</math> can be found as follows: start with an arbitrary element <math>x_0 \in X</math> and define a [[sequence]] <math>(x_n)_{n\in\mathbb N}</math> by <math>x_n = T(x_{n-1})</math> for <math>n \geq 1.</math> Then <math>\lim_{n \to \infty} x_n = x^*</math>.</blockquote>
 
''Remark 1.'' The following inequalities are equivalent and describe the [[Rate of convergence|speed of convergence]]:
Line 13 ⟶ 14:
:<math>
\begin{align}
d(x^*, x_n) & \leq \frac{q^n}{1-q} d(x_1,x_0), \\[5pt]
d(x^*, x_{n+1}) & \leq \frac{q}{1-q} d(x_{n+1},x_n), \\[5pt]
d(x^*, x_{n+1}) & \leq q d(x^*,x_n).
\end{align}
Line 23 ⟶ 24:
''Remark 2.'' <math>d(T(x),T(y))<d(x,y)</math> for all <math>x \neq y</math> is in general not enough to ensure the existence of a fixed point, as is shown by the map
:<math>T : [1,\infty) \to [1,\infty), \,\, T(x)=x+\tfrac{1}{x}\,,</math>
which lacks a fixed point. However, if <math>X</math> is [[Compact space|compact]], then this weaker assumption does imply the existence and uniqueness of a fixed point, that can be easily found as a minimizer of <math>d(x,T(x))</math>, indeed, a minimizer exists by compactness, and has to be a fixed point of <math>T.</math>. It then easily follows that the fixed point is the limit of any sequence of iterations of <math>T.</math>.
 
''Remark 3.'' When using the theorem in practice, the most difficult part is typically to define <math>X</math> properly so that <math>T(X) \subseteq X.</math>
Line 29 ⟶ 30:
==Proof==
 
Let <math>x_0 \in X</math> be arbitrary and define a [[sequence]] <math>(x_n)_{n\in\mathbb N}</math> by setting ''x<submath>n</sub>''x_n = ''T''(''x''<sub>''x_{n''−1-1})</submath>). We first note that for all <math>n \in \N,</math> we have the inequality
 
:<math>d(x_{n+1}, x_n) \le q^n d(x_1, x_0).</math>
 
This follows by [[Principle of mathematical induction|induction]] on ''<math>n''</math>, using the fact that ''<math>T''</math> is a contraction mapping. Then we can show that <math>(x_n)_{n\in\mathbb N}</math> is a [[Cauchy sequence]]. In particular, let <math>m, n \in \N</math> such that ''<math>m'' > ''n'' </math>:
 
: <math>\begin{align}
d(x_m, x_n) & \leq d(x_m, x_{m-1}) + d(x_{m-1}, x_{m-2}) + \cdots + d(x_{n+1}, x_n) \\[5pt]
& \leq q^{m-1}d(x_1, x_0) + q^{m-2}d(x_1, x_0) + \cdots + q^nd(x_1, x_0) \\[5pt]
& = q^n d(x_1, x_0) \sum_{k=0}^{m-n-1} q^k \\[5pt]
& \leq q^n d(x_1, x_0) \sum_{k=0}^\infty q^k \\[5pt]
& = q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ).
\end{align}</math>
 
Let ε<math>\varepsilon > 0</math> be arbitrary. Since ''<math>q'' \in [0, 1)</math>, we can find a large <math>N \in \N</math> so that
 
:<math>q^N < \frac{\varepsilon(1-q)}{d(x_1, x_0)}.</math>
 
Therefore, by choosing ''<math>m''</math> and ''<math>n''</math> greater than ''<math>N''</math> we may write:
 
:<math>d(x_m, x_n) \leq q^n d(x_1, x_0) \left ( \frac{1}{1-q} \right ) < \left (\frac{\varepsilon(1-q)}{d(x_1, x_0)} \right ) d(x_1, x_0) \left ( \frac{1}{1-q} \right ) = \varepsilon.</math>
 
This proves that the sequence <math>(x_n)_{n\in\mathbb N}</math> is Cauchy. By completeness of <math>(''X'','' d'')</math>, the sequence has a limit <math>x^* \in X.</math> Furthermore, <math>x^*</math> must be a [[Fixed point (mathematics)|fixed point]] of ''<math>T''</math>:
 
:<math>x^*=\lim_{n\to\infty} x_n = \lim_{n\to\infty} T(x_{n-1}) = T\left(\lim_{n\to\infty} x_{n-1} \right) = T(x^*). </math>
 
As a contraction mapping, ''<math>T''</math> is continuous, so bringing the limit inside ''<math>T''</math> was justified. Lastly, ''<math>T''</math> cannot have more than one fixed point in <math>(''X'','' d'')</math>, since any pair of distinct fixed points ''p<submath>1p_1</submath>'' and ''p<submath>2p_2</submath>'' would contradict the contraction of ''<math>T''</math>:
 
:<math> d(T(p_1),T(p_2)) = d(p_1,p_2) > q d(p_1, p_2).</math>
 
==Applications==
* A standard application is the proof of the [[Picard–Lindelöf theorem]] about the existence and uniqueness of solutions to certain [[ordinary differential equation]]s. The sought solution of the differential equation is expressed as a fixed point of a suitable integral operator whichon changesthe space of continuous functions intounder continuousthe functions[[uniform norm]]. The Banach fixed-point theorem is then used to show that this integral operator has a unique fixed point.
* One consequence of the Banach fixed-point theorem is that small Lipschitz perturbations of the identity are [[Lipschitz continuity#Definitions|bi-lipschitz]] homeomorphisms. Let Ω be an open set of a Banach space ''E''; let {{nobr|''I'' : Ω → ''E''}} denote the identity (inclusion) map and let ''g'' : Ω → ''E'' be a Lipschitz map of constant ''k'' < 1. Then
# Ω′ := (''I'' + ''g'')(Ω) is an open subset of ''E'': precisely, for any ''x'' in Ω such that {{nobr|''B''(''x'', ''r'') ⊂ Ω}} one has {{nobr|''B''((''I'' + ''g'')(''x''), ''r''(1−1 − ''k'')) ⊂ Ω′; }}
# ''I'' + ''g'' : Ω → Ω′ is a bi-lipschitzLipschitz homeomorphism;
: precisely, (''I'' + ''g'')<sup>−1</sup> is still of the form {{nobr|''I'' + ''h'' : Ω → Ω′}} with ''h'' a Lipschitz map of constant ''k''/(1−1&nbsp;−&nbsp;''k''). A direct consequence of this result yields the proof of the [[inverse function theorem]].
* It can be used to give sufficient conditions under which Newton's method of successive approximations is guaranteed to work, and similarly for Chebyshev's third -order method.
* It can be used to prove existence and uniqueness of solutions to integral equations.
* It can be used to give a proof to the [[Nash embedding theorem]].<ref>{{cite journal |first=Matthias|last=Günther|title=Zum Einbettungssatz von J. Nash | trans-title=On the embedding theorem of J. Nash | language=de | journal=[[Mathematische Nachrichten]]|volume= 144 |year=1989|pages= 165–187|doi=10.1002/mana.19891440113 | mr=1037168}}</ref>
* It can be used to prove existence and uniqueness of solutions to value iteration, policy iteration, and policy evaluation of [[reinforcement learning]].<ref>{{cite book |first1=Frank L. |last1=Lewis |first2=Draguna |last2=Vrabie |first3=Vassilis L. |last3=Syrmos |title=Optimal Control |chapter=Reinforcement Learning and Optimal Adaptive Control |___location=New York |publisher=John Wiley & Sons |year=2012 |isbn=978-1-118-12272-3 |pages=461–517 [p. 474] |chapter-url=https://books.google.com/books?id=U3Gtlot_hYEC&pg=PA474 }}</ref>
* It can be used to prove existence and uniqueness of an equilibrium in [[Cournot competition]],<ref>{{cite journal |first1=Ngo Van |last1=Long |first2=Antoine |last2=Soubeyran |title=Existence and Uniqueness of Cournot Equilibrium: A Contraction Mapping Approach |journal=[[Economics Letters]] |volume=67 |issue=3 |year=2000 |pages=345–348 |doi=10.1016/S0165-1765(00)00211-1 |url=https://www.cirano.qc.ca/pdf/publication/99s-22.pdf |archive-url=https://web.archive.org/web/20041230225125/http://www.cirano.qc.ca/pdf/publication/99s-22.pdf |archive-date=2004-12-30 |url-status=live }}</ref> and other dynamic economic models.<ref>{{cite book |first1=Nancy L. |last1=Stokey|author1-link=Nancy Stokey |first2=Robert E. Jr. |last2=Lucas |author-link2=Robert Lucas Jr. |title=Recursive Methods in Economic Dynamics |___location=Cambridge |publisher=Harvard University Press |year=1989 |isbn=0-674-75096-9 |pages=508–516 |url=https://books.google.com/books?id=BgQ3AwAAQBAJ&pg=PA508 }}</ref>
 
==Converses==
Line 83 ⟶ 84:
Let ''T'' : ''X'' → ''X'' be a map on a complete non-empty metric space. Then, for example, some generalizations of the Banach fixed-point theorem are:
*Assume that some iterate ''T<sup>n</sup>'' of ''T'' is a contraction. Then ''T'' has a unique fixed point.
*Assume that for each ''n'', there exist ''c<sub>n</sub>'' such that ''d''(''T''<sup>''n''</sup>(''x''), ''T''<sup>''n''</sup>(''y'')) ≤ ''c''<sub>''n''</sub>''d''(''x'', ''y)'') for all ''x'' and ''y'', and that
::<math>\sum\nolimits_n c_n <\infty.</math>
:Then ''T'' has a unique fixed point.
In applications, the existence and uniqueness of a fixed point often can be shown directly with the standard Banach fixed point theorem, by a suitable choice of the metric that makes the map ''T'' a contraction. Indeed, the above result by Bessaga strongly suggests to look for such a metric. See also the article on [[fixed point theorems in infinite-dimensional spaces]] for generalizations.
 
In a non-empty [[compact metric space]], any function <math>T</math> satisfying <math>d(T(x),T(y))<d(x,y)</math> for all distinct <math>x,y</math>, has a unique fixed point. The proof is simpler than the Banach theorem, because the function <math>d(T(x),x)</math> is continuous, and therefore assumes a minimum, which is easily shown to be zero.
 
A different class of generalizations arise from suitable generalizations of the notion of [[metric space]], e.g. by weakening the defining axioms for the notion of metric.<ref>{{cite book |first1=Pascal |last1=Hitzler | author-link1=Pascal Hitzler|first2=Anthony |last2=Seda |title=Mathematical Aspects of Logic Programming Semantics |publisher=Chapman and Hall/CRC |year=2010 |isbn=978-1-4398-2961-5 }}</ref> Some of these have applications, e.g., in the theory of programming semantics in theoretical computer science.<ref>{{cite journal |first1=Anthony K. |last1=Seda |first2=Pascal |last2=Hitzler | author-link2=Pascal Hitzler|title=Generalized Distance Functions in the Theory of Computation |journal=The Computer Journal |volume=53 |issue=4 |pages=443–464 |year=2010 |doi=10.1093/comjnl/bxm108 }}</ref>
 
==Example==
 
An application of the Banach fixed-point theorem and fixed-point iteration can be used to quickly obtain an approximation of {{pi}} with high accuracy. Consider the function <math>f(x)=\sin(x)+x</math>. It can be verified that {{pi}} is a fixed point of ''f'', and that ''f'' maps the interval <math>\left[3\pi/4,5\pi/4\right]</math> to itself. Moreover, <math>f'(x)=1+\cos(x)</math>, and it can be verified that
 
:<math>0\leq1+\cos(x)\leq1-\frac{1}{\sqrt{2}}<1</math>
 
on this interval. Therefore, by an application of the [[mean value theorem]], ''f'' has a Lipschitz constant less than 1 (namely <math>1-1/\sqrt{2}</math>). Applying the Banach fixed-point theorem shows that the fixed point {{pi}} is the unique fixed point on the interval, allowing for fixed-point iteration to be used.
 
For example, the value 3 may be chosen to start the fixed-point iteration, as <math>3\pi/4\leq3\leq5\pi/4</math>. The Banach fixed-point theorem may be used to conclude that
 
: <math>\pi=f(f(f(\cdots f(3)\cdots)))).</math>
 
Applying ''f'' to 3 only three times already yields an expansion of {{pi}} accurate to 33 digits:
 
: <math>f(f(f(3)))=3.141592653589793238462643383279502\ldots\,.</math>
 
==See also==
Line 114 ⟶ 133:
{{PlanetMath attribution |urlname=banachfixedpointtheorem |title=Banach fixed point theorem }}
 
{{Metric spaces}}
{{Topology}}
 
Line 119 ⟶ 139:
 
[[Category:Articles containing proofs]]
[[Category:Eponymous theorems of mathematics]]
[[Category:Fixed-point theorems]]
[[Category:Metric geometry]]