Inverse function theorem

{{Use dmy dates|date=December 2023}}
{{Calculus}}
In [[real analysis]], a branch of [[mathematics]], the '''inverse function theorem''' is a [[theorem]] that asserts that, if a [[real function]] ''f'' has a [[continuously differentiable function|continuous derivative]] near a point where its derivative is nonzero, then, near this point, ''f'' has an [[inverse function]]. The inverse function is also [[differentiable function|differentiable]], and the ''[[inverse function rule]]'' expresses its derivative as the [[multiplicative inverse]] of the derivative of ''f''.
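
To illustrate with a standard example: for <math>f(x) = x^2</math> and the point <math>a = 1</math>, the derivative <math>f'(1) = 2</math> is nonzero, so <math>f</math> has the local inverse <math>f^{-1}(y) = \sqrt{y}</math> near <math>y = 1</math>, and the inverse function rule gives
:<math>(f^{-1})'(y) = \frac{1}{2\sqrt{y}} = \frac{1}{f'\left(f^{-1}(y)\right)}.</math>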
 
The theorem applies verbatim to [[complex-valued function]]s of a [[complex number|complex variable]]. It generalizes to functions from ''n''-[[tuples]] (of real or complex numbers) to ''n''-tuples, and to functions between [[vector space]]s of the same finite dimension, by replacing "derivative" with "[[Jacobian matrix]]" and "nonzero derivative" with "nonzero [[Jacobian determinant]]".

If the function of the theorem belongs to a higher [[differentiability class]], the same is true for the inverse function. There are also versions of the inverse function theorem for [[holomorphic function]]s, for differentiable maps between [[manifold]]s, for differentiable functions between [[Banach space]]s, and so forth.
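
For instance, in two variables, the map <math>f(x, y) = (x + y, xy)</math> has Jacobian matrix
:<math>\begin{bmatrix} 1 & 1 \\ y & x \end{bmatrix}</math>
with determinant <math>x - y</math>, so the theorem yields a local differentiable inverse near every point where <math>x \ne y</math>; indeed, there <math>x</math> and <math>y</math> can be recovered as the two roots of <math>t^2 - (x + y)t + xy = 0</math>.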
 
The theorem was first established by [[Émile Picard|Picard]] and [[Édouard Goursat|Goursat]] using an iterative scheme: the basic idea is to prove a [[fixed point theorem]] using the [[contraction mapping theorem]].
 
There are two variants of the inverse function theorem.<ref name="Hörmander" /> Given a continuously differentiable map <math>f : U \to \mathbb{R}^m</math>, the first is
*The derivative <math>f'(a)</math> is surjective (i.e., the Jacobian matrix representing it has rank <math>m</math>) if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>f \circ g = I</math> near <math>b</math>,
and the second is
*The derivative <math>f'(a)</math> is injective if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>g \circ f = I</math> near <math>a</math>.
 
In the first case (when <math>f'(a)</math> is surjective), the point <math>b = f(a)</math> is called a [[regular value]]. Since <math>m = \dim \ker(f'(a)) + \dim \operatorname{im}(f'(a))</math>, the first case is equivalent to saying that <math>b = f(a)</math> is not the image of a [[Critical point (mathematics)#Critical point of a differentiable map|critical point]] <math>a</math> (a critical point is a point <math>a</math> such that the kernel of <math>f'(a)</math> is nonzero). The statement in the first case is a special case of the [[submersion theorem]].
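
A simple pair of examples (illustrative, not from the cited source): for the projection <math>f : \mathbb{R}^2 \to \mathbb{R}</math>, <math>f(x, y) = x</math>, the derivative at any point is surjective and <math>g(y) = (y, 0)</math> is a continuously differentiable map with <math>f \circ g = I</math>; for the inclusion <math>f : \mathbb{R} \to \mathbb{R}^2</math>, <math>f(x) = (x, 0)</math>, the derivative is injective and <math>g(x, y) = x</math> satisfies <math>g \circ f = I</math>.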
== Example ==
Consider the [[vector-valued function]] <math>F: \mathbb{R}^2 \to \mathbb{R}^2</math> defined by:
:<math>
F(x,y)=
\begin{bmatrix}
{e^x \cos y}\\
{e^x \sin y}\\
\end{bmatrix}.
</math>
Its Jacobian matrix at <math>(x, y)</math> is:
:<math>
JF(x,y)=
\begin{bmatrix}
{e^x \cos y} & {-e^x \sin y}\\
{e^x \sin y} & {e^x \cos y}
\end{bmatrix}
</math>
with the determinant:
:<math>
\det JF(x,y)=
e^{2x} \cos^2 y + e^{2x} \sin^2 y=
e^{2x}.
Yet another proof uses [[Newton's method]], which has the advantage of providing an [[effective method|effective version]] of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.<ref name="hubbard_hubbard">{{cite book |first1=John H. |last1=Hubbard |author-link=John H. Hubbard |first2=Barbara Burke |last2=Hubbard|author2-link=Barbara Burke Hubbard |title=Vector Analysis, Linear Algebra, and Differential Forms: A Unified Approach |edition=Matrix |year=2001 }}</ref>
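
In outline (a sketch of the iteration, with the quantitative hypotheses suppressed): to solve <math>f(x) = y</math> for <math>y</math> close to <math>f(a)</math>, one starts from <math>x_0 = a</math> and iterates
:<math>x_{n+1} = x_n - f'(x_n)^{-1}\bigl(f(x_n) - y\bigr),</math>
and bounds on <math>f'</math> and its modulus of continuity make this iteration a contraction near <math>a</math>, so the iterates converge to the unique solution there.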
 
=== Proof for single-variable functions ===
We want to prove the following: ''Let <math>D \subseteq \R</math> be an open set with <math>x_0 \in D</math>, let <math>f: D \to \R</math> be a continuously differentiable function defined on <math>D</math>, and suppose that <math>f'(x_0) \ne 0</math>. Then there exists an open interval <math>I</math> with <math>x_0 \in I</math> such that <math>f</math> maps <math>I</math> bijectively onto the open interval <math>J = f(I)</math>. Moreover, the inverse function <math>f^{-1} : J \to I</math> is continuously differentiable, and for any <math>y \in J</math>, if <math>x \in I</math> is such that <math>f(x) = y</math>, then <math>(f^{-1})'(y) = \dfrac{1}{f'(x)}</math>.''
 
We may without loss of generality assume that <math>f'(x_0) > 0</math>. Given that <math>D</math> is an open set and <math>f'</math> is continuous at <math>x_0</math>, there exists <math>r > 0</math> such that <math>(x_0 - r, x_0 + r) \subseteq D</math> and<math display="block">|f'(x) - f'(x_0)| < \dfrac{f'(x_0)}{2} \qquad \text{for all } |x - x_0| < r.</math>
 
In particular,<math display="block">f'(x) > \dfrac{f'(x_0)}{2} >0 \qquad \text{for all } |x - x_0| < r.</math>
 
This shows that <math>f</math> is strictly increasing on <math>(x_0 - r, x_0 + r)</math>. Let <math>\delta > 0</math> be such that <math>\delta < r</math>. Then <math>[x_0 - \delta, x_0 + \delta] \subseteq (x_0 - r, x_0 + r)</math>. By the intermediate value theorem, we find that <math>f</math> maps the interval <math>[x_0 - \delta, x_0 + \delta]</math> bijectively onto <math>[f(x_0 - \delta), f(x_0 + \delta)]</math>. Write <math>I = (x_0-\delta, x_0+\delta)</math> and <math>J = (f(x_0 - \delta),f(x_0 + \delta))</math>. Then <math>f: I \to J</math> is a bijection and the inverse <math>f^{-1}: J \to I</math> exists. The fact that <math>f^{-1}: J \to I</math> is differentiable follows from the differentiability of <math>f</math>. In particular, the result follows from the fact that if <math>f: I \to \R</math> is a strictly monotonic and continuous function that is differentiable at <math>x_0 \in I</math> with <math>f'(x_0) \ne 0</math>, then <math>f^{-1}: f(I) \to \R</math> is differentiable with <math>(f^{-1})'(y_0) = \dfrac{1}{f'(x_0)}</math>, where <math>y_0 = f(x_0)</math> (a standard result in analysis). This completes the proof.
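
As a concrete instance of this construction (an added illustration): for <math>f(x) = \sin x</math> and <math>x_0 = 0</math>, we have <math>f'(x_0) = 1</math>, and <math>|\cos x - 1| < \tfrac{1}{2}</math> exactly when <math>|x| < \pi/3</math>, so one may take <math>r = \pi/3</math>; on <math>(-\pi/3, \pi/3)</math> the derivative satisfies <math>f' > \tfrac{1}{2}</math>, <math>f</math> is strictly increasing, and the resulting local inverse is <math>f^{-1} = \arcsin</math> with
:<math>(f^{-1})'(y) = \frac{1}{\cos(\arcsin y)} = \frac{1}{\sqrt{1 - y^2}}.</math>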
 
=== A proof using successive approximation ===
To check that <math>g=f^{-1}</math> is C<sup>1</sup>, write <math>g(y+k) = x+h</math> so that
<math>f(x+h)=f(x)+k</math>. By the inequalities above, <math>\|h-k\| <\|h\|/2</math> so that <math>\|h\|/2<\|k\| < 2\|h\|</math>.
On the other hand, if <math>A=f^\prime(x)</math>, then <math>\|A-I\|<1/2</math>. Writing <math>B=I-A</math>, the [[geometric series]] <math>A^{-1}=(I-B)^{-1}=\sum_{n \ge 0} B^n</math> gives <math>\|A^{-1}\| \le (1-\|B\|)^{-1} < 2</math>. But then
 
:<math> {\|g(y+k) -g(y) - f^\prime(g(y))^{-1}k \| \over \|k\|}
*given a map <math>f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m</math>, if <math>f(a, b) = 0</math>, <math>f</math> is continuously differentiable in a neighborhood of <math>(a, b)</math> and the derivative of <math>y \mapsto f(a, y)</math> at <math>b</math> is invertible, then there exists a differentiable map <math>g : U \to V</math> for some neighborhoods <math>U, V</math> of <math>a, b</math> such that <math>f(x, g(x)) = 0</math>. Moreover, if <math>f(x, y) = 0, x \in U, y \in V</math>, then <math>y = g(x)</math>; i.e., <math>g(x)</math> is a unique solution.
To see this, consider the map <math>F(x, y) = (x, f(x, y))</math>. By the inverse function theorem, <math>F : U \times V \to W</math> has the inverse <math>G</math> for some neighborhoods <math>U, V, W</math>. We then have:
:<math>(x, y) = F(G_1(x, y), G_2(x, y)) = (G_1(x, y), f(G_1(x, y), G_2(x, y))),</math>
implying <math>x = G_1(x, y)</math> and <math>y = f(x, G_2(x, y)).</math> Thus <math>g(x) = G_2(x, 0)</math> has the required property. <math>\square</math>
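
For example, take <math>f(x, y) = x^2 + y^2 - 1</math> and <math>(a, b) = (0, 1)</math>: then <math>f(a, b) = 0</math> and the derivative of <math>y \mapsto f(0, y)</math> at <math>b = 1</math> equals <math>2 \ne 0</math>, so the theorem produces the unique local solution
:<math>g(x) = \sqrt{1 - x^2},</math>
which indeed satisfies <math>f(x, g(x)) = 0</math> for <math>x</math> near <math>0</math>.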
 
The lemma implies the following (a sort of) global version of the inverse function theorem:
 
{{math_theorem|name=Inverse function theorem|math_statement=<ref>Ch. I., § 3, Exercise 10. and § 8, Exercise 14. in V. Guillemin, A. Pollack. "Differential Topology". Prentice-Hall Inc., 1974. ISBN 0-13-212605-2.</ref> Let <math>f : U \to V</math> be a map between open subsets of <math>\mathbb{R}^n, \mathbb{R}^m</math> or more generally of manifolds. Assume <math>f</math> is continuously differentiable (or is <math>C^k</math>). If <math>f</math> is injective on a closed subset <math>A \subset U</math> and if the Jacobian matrix of <math>f</math> is invertible at each point of <math>A</math>, then <math>f</math> is injective on a neighborhood <math>A'</math> of <math>A</math> and <math>f^{-1} : f(A') \to A'</math> is continuously differentiable (or is <math>C^k</math>).}}
 
Note that if <math>A</math> is a point, then the above is the usual inverse function theorem.
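
As an illustration (not taken from the cited exercises): identifying <math>\mathbb{C}</math> with <math>\mathbb{R}^2</math>, the map <math>f(z) = z^2</math> is injective on the closed set <math>A = \{z : \operatorname{Re} z \ge 1\}</math> (since <math>z^2 = w^2</math> forces <math>w = \pm z</math>, and <math>-z \notin A</math> when <math>z \in A</math>), and its Jacobian determinant <math>4|z|^2</math> is nonzero on <math>A</math>; the theorem then guarantees that <math>f</math> is injective on some neighborhood <math>A'</math> of <math>A</math> with continuously differentiable inverse on <math>f(A')</math>.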
 
=== Over a real closed field ===
The inverse function theorem also holds over a [[real closed field]] ''k'' (or an [[o-minimal structure]]).<ref>Chapter 7, Theorem 2.11. in {{cite book |doi=10.1017/CBO9780511525919|title=Tame Topology and O-minimal Structures. London Mathematical Society lecture note series, no. 248|year=1998 |last1=Dries |first1=L. P. D. van den |authorlink = Lou van den Dries|isbn=9780521598385|publisher=Cambridge University Press|___location=Cambridge, New York, and Oakleigh, Victoria }}</ref> Precisely, the theorem holds for a semialgebraic (or definable) map between open subsets of <math>k^n</math> that is continuously differentiable.
 
The usual proof of the IFT uses Banach's fixed point theorem, which relies on Cauchy completeness. That part of the argument is replaced by the use of the [[extreme value theorem]], which does not need completeness. Explicitly, in {{section link||A_proof_using_the_contraction_mapping_principle}}, Cauchy completeness is used only to establish the inclusion <math>B(0, r/2) \subset f(B(0, r))</math>. Here, we shall directly show <math>B(0, r/4) \subset f(B(0, r))</math> instead (which is enough). Given a point <math>y</math> in <math>B(0, r/4)</math>, consider the function <math>P(x) = |f(x) - y|^2</math> defined on a neighborhood of <math>\overline{B}(0, r)</math>. If <math>P'(x) = 0</math>, then <math>0 = P'(x) = 2[f_1(x) - y_1 \cdots f_n(x) - y_n]f'(x)</math> and so <math>f(x) = y</math>, since <math>f'(x)</math> is invertible. Now, by the extreme value theorem, <math>P</math> attains a minimum at some point <math>x_0</math> on the closed ball <math>\overline{B}(0, r)</math>, which can be shown to lie in <math>B(0, r)</math> using <math>2^{-1}|x| \le |f(x)|</math>. Since <math>P'(x_0) = 0</math>, <math>f(x_0) = y</math>, which proves the claimed inclusion. <math>\square</math>
 
Alternatively, one can deduce the theorem from the corresponding theorem over the real numbers by [[Tarski's principle]].{{citation needed|date=December 2024}}
 
==See also==