Inverse function theorem

{{Use dmy dates|date=December 2023}}
{{Calculus}}
In [[real analysis]], a branch of [[mathematics]], the '''inverse function theorem''' is a [[theorem]] that asserts that, if a [[real function]] ''f'' has a [[continuously differentiable function|continuous derivative]] near a point where its derivative is nonzero, then, near this point, ''f'' has an [[inverse function]]. The inverse function is also [[differentiable function|differentiable]], and the ''[[inverse function rule]]'' expresses its derivative as the [[multiplicative inverse]] of the derivative of ''f''.
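
To illustrate with a standard example: for <math>f(x) = x^2</math> and the point <math>a = 1</math>, the derivative <math>f'(1) = 2</math> is nonzero, so <math>f</math> has the local inverse <math>f^{-1}(y) = \sqrt{y}</math> near <math>y = 1</math>, and the inverse function rule gives
:<math>(f^{-1})'(y) = \frac{1}{2\sqrt{y}} = \frac{1}{f'\left(f^{-1}(y)\right)}.</math>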
 
The theorem applies verbatim to [[complex-valued function]]s of a [[complex number|complex variable]]. It generalizes to functions from ''n''-[[tuples]] (of real or complex numbers) to ''n''-tuples, and to functions between [[vector space]]s of the same finite dimension, by replacing "derivative" with "[[Jacobian matrix]]" and "nonzero derivative" with "nonzero [[Jacobian determinant]]".

If the function of the theorem belongs to a higher [[differentiability class]], the same is true for the inverse function. There are also versions of the inverse function theorem for [[holomorphic function]]s, for differentiable maps between [[manifold]]s, for differentiable functions between [[Banach space]]s, and so forth.
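
For instance, in two variables, the map <math>f(x, y) = (x + y, xy)</math> has Jacobian matrix
:<math>\begin{bmatrix} 1 & 1 \\ y & x \end{bmatrix}</math>
with determinant <math>x - y</math>, so the theorem yields a local differentiable inverse near every point where <math>x \ne y</math>; indeed, there <math>x</math> and <math>y</math> can be recovered as the two roots of <math>t^2 - (x + y)t + xy = 0</math>.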
 
The theorem was first established by [[Émile Picard|Picard]] and [[Édouard Goursat|Goursat]] using an iterative scheme: the basic idea is to prove a [[fixed point theorem]] using the [[contraction mapping theorem]].
 
There are two variants of the inverse function theorem.<ref name="Hörmander" /> Given a continuously differentiable map <math>f : U \to \mathbb{R}^m</math>, the first is
*The derivative <math>f'(a)</math> is surjective (i.e., the Jacobian matrix representing it has rank <math>m</math>) if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>f \circ g = I</math> near <math>b</math>,
and the second is
*The derivative <math>f'(a)</math> is injective if and only if there exists a continuously differentiable function <math>g</math> on a neighborhood <math>V</math> of <math>b = f(a)</math> such that <math>g \circ f = I</math> near <math>a</math>.
 
In the first case (when <math>f'(a)</math> is surjective), the point <math>b = f(a)</math> is called a [[regular value]]. Since <math>m = \dim \ker(f'(a)) + \dim \operatorname{im}(f'(a))</math>, the first case is equivalent to saying that <math>b = f(a)</math> is not the image of a [[Critical point (mathematics)#Critical point of a differentiable map|critical point]] <math>a</math> (a critical point is a point <math>a</math> such that the kernel of <math>f'(a)</math> is nonzero). The statement in the first case is a special case of the [[submersion theorem]].
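
A simple pair of examples (illustrative, not from the cited source): for the projection <math>f : \mathbb{R}^2 \to \mathbb{R}</math>, <math>f(x, y) = x</math>, the derivative at any point is surjective and <math>g(y) = (y, 0)</math> is a continuously differentiable map with <math>f \circ g = I</math>; for the inclusion <math>f : \mathbb{R} \to \mathbb{R}^2</math>, <math>f(x) = (x, 0)</math>, the derivative is injective and <math>g(x, y) = x</math> satisfies <math>g \circ f = I</math>.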
== Example ==
Consider the [[vector-valued function]] <math>F: \mathbb{R}^2 \to \mathbb{R}^2</math> defined by:
:<math>
F(x,y)=
\begin{bmatrix}
{e^x \cos y}\\
{e^x \sin y}\\
\end{bmatrix}.
</math>
Its Jacobian matrix at <math>(x, y)</math> is:
:<math>
JF(x,y)=
\begin{bmatrix}
{e^x \cos y} & {-e^x \sin y}\\
{e^x \sin y} & {e^x \cos y}
\end{bmatrix}
</math>
with the determinant:
:<math>
\det JF(x,y)=
e^{2x} \cos^2 y + e^{2x} \sin^2 y=
e^{2x}.
Yet another proof uses [[Newton's method]], which has the advantage of providing an [[effective method|effective version]] of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.<ref name="hubbard_hubbard">{{cite book |first1=John H. |last1=Hubbard |author-link=John H. Hubbard |first2=Barbara Burke |last2=Hubbard|author2-link=Barbara Burke Hubbard |title=Vector Analysis, Linear Algebra, and Differential Forms: A Unified Approach |edition=Matrix |year=2001 }}</ref>
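
In outline (a sketch of the iteration, with the quantitative hypotheses suppressed): to solve <math>f(x) = y</math> for <math>y</math> close to <math>f(a)</math>, one starts from <math>x_0 = a</math> and iterates
:<math>x_{n+1} = x_n - f'(x_n)^{-1}\bigl(f(x_n) - y\bigr),</math>
and bounds on <math>f'</math> and its modulus of continuity make this iteration a contraction near <math>a</math>, so the iterates converge to the unique solution there.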
 
=== Proof for single-variable functions ===
We want to prove the following: ''Let <math>D \subseteq \R</math> be an open set with <math>x_0 \in D</math>, let <math>f: D \to \R</math> be a continuously differentiable function defined on <math>D</math>, and suppose that <math>f'(x_0) \ne 0</math>. Then there exists an open interval <math>I</math> with <math>x_0 \in I</math> such that <math>f</math> maps <math>I</math> bijectively onto the open interval <math>J = f(I)</math>. Moreover, the inverse function <math>f^{-1} : J \to I</math> is continuously differentiable, and for any <math>y \in J</math>, if <math>x \in I</math> is such that <math>f(x) = y</math>, then <math>(f^{-1})'(y) = \dfrac{1}{f'(x)}</math>.''
 
We may without loss of generality assume that <math>f'(x_0) > 0</math>. Given that <math>D</math> is an open set and <math>f'</math> is continuous at <math>x_0</math>, there exists <math>r > 0</math> such that <math>(x_0 - r, x_0 + r) \subseteq D</math> and<math display="block">|f'(x) - f'(x_0)| < \dfrac{f'(x_0)}{2} \qquad \text{for all } |x - x_0| < r.</math>
 
In particular,<math display="block">f'(x) > \dfrac{f'(x_0)}{2} >0 \qquad \text{for all } |x - x_0| < r.</math>
 
This shows that <math>f</math> is strictly increasing on <math>(x_0 - r, x_0 + r)</math>. Let <math>\delta > 0</math> be such that <math>\delta < r</math>. Then <math>[x_0 - \delta, x_0 + \delta] \subseteq (x_0 - r, x_0 + r)</math>. By the intermediate value theorem, we find that <math>f</math> maps the interval <math>[x_0 - \delta, x_0 + \delta]</math> bijectively onto <math>[f(x_0 - \delta), f(x_0 + \delta)]</math>. Write <math>I = (x_0-\delta, x_0+\delta)</math> and <math>J = (f(x_0 - \delta),f(x_0 + \delta))</math>. Then <math>f: I \to J</math> is a bijection and the inverse <math>f^{-1}: J \to I</math> exists. The fact that <math>f^{-1}: J \to I</math> is differentiable follows from the differentiability of <math>f</math>. In particular, the result follows from the fact that if <math>f: I \to \R</math> is a strictly monotonic and continuous function that is differentiable at <math>x_0 \in I</math> with <math>f'(x_0) \ne 0</math>, then <math>f^{-1}: f(I) \to \R</math> is differentiable with <math>(f^{-1})'(y_0) = \dfrac{1}{f'(x_0)}</math>, where <math>y_0 = f(x_0)</math> (a standard result in analysis). This completes the proof.
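
As a concrete instance of this construction (an added illustration): for <math>f(x) = \sin x</math> and <math>x_0 = 0</math>, we have <math>f'(x_0) = 1</math>, and <math>|\cos x - 1| < \tfrac{1}{2}</math> exactly when <math>|x| < \pi/3</math>, so one may take <math>r = \pi/3</math>; on <math>(-\pi/3, \pi/3)</math> the derivative satisfies <math>f' > \tfrac{1}{2}</math>, <math>f</math> is strictly increasing, and the resulting local inverse is <math>f^{-1} = \arcsin</math> with
:<math>(f^{-1})'(y) = \frac{1}{\cos(\arcsin y)} = \frac{1}{\sqrt{1 - y^2}}.</math>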
 
=== A proof using successive approximation ===
To check that <math>g=f^{-1}</math> is C<sup>1</sup>, write <math>g(y+k) = x+h</math> so that
<math>f(x+h)=f(x)+k</math>. By the inequalities above, <math>\|h-k\| <\|h\|/2</math> so that <math>\|h\|/2<\|k\| < 2\|h\|</math>.
On the other hand, if <math>A=f^\prime(x)</math>, then <math>\|A-I\|<1/2</math>. Writing <math>B=I-A</math>, the [[geometric series]] <math>A^{-1}=(I-B)^{-1}=\sum_{n \ge 0} B^n</math> gives <math>\|A^{-1}\| \le (1-\|B\|)^{-1} < 2</math>. But then
 
:<math> {\|g(y+k) -g(y) - f^\prime(g(y))^{-1}k \| \over \|k\|}
*given a map <math>f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m</math>, if <math>f(a, b) = 0</math>, <math>f</math> is continuously differentiable in a neighborhood of <math>(a, b)</math> and the derivative of <math>y \mapsto f(a, y)</math> at <math>b</math> is invertible, then there exists a differentiable map <math>g : U \to V</math> for some neighborhoods <math>U, V</math> of <math>a, b</math> such that <math>f(x, g(x)) = 0</math>. Moreover, if <math>f(x, y) = 0, x \in U, y \in V</math>, then <math>y = g(x)</math>; i.e., <math>g(x)</math> is a unique solution.
To see this, consider the map <math>F(x, y) = (x, f(x, y))</math>. By the inverse function theorem, <math>F : U \times V \to W</math> has the inverse <math>G</math> for some neighborhoods <math>U, V, W</math>. We then have:
:<math>(x, y) = F(G_1(x, y), G_2(x, y)) = (G_1(x, y), f(G_1(x, y), G_2(x, y))),</math>
implying <math>x = G_1(x, y)</math> and <math>y = f(x, G_2(x, y)).</math> Thus <math>g(x) = G_2(x, 0)</math> has the required property. <math>\square</math>
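
For example, take <math>f(x, y) = x^2 + y^2 - 1</math> and <math>(a, b) = (0, 1)</math>: then <math>f(a, b) = 0</math> and the derivative of <math>y \mapsto f(0, y)</math> at <math>b = 1</math> equals <math>2 \ne 0</math>, so the theorem produces the unique local solution
:<math>g(x) = \sqrt{1 - x^2},</math>
which indeed satisfies <math>f(x, g(x)) = 0</math> for <math>x</math> near <math>0</math>.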
 
The lemma implies the following (a sort of) global version of the inverse function theorem:
 
{{math_theorem|name=Inverse function theorem|math_statement=<ref>Ch. I., § 3, Exercise 10. and § 8, Exercise 14. in V. Guillemin, A. Pollack. "Differential Topology". Prentice-Hall Inc., 1974. ISBN 0-13-212605-2.</ref> Let <math>f : U \to V</math> be a map between open subsets of <math>\mathbb{R}^n, \mathbb{R}^m</math> or more generally of manifolds. Assume <math>f</math> is continuously differentiable (or is <math>C^k</math>). If <math>f</math> is injective on a closed subset <math>A \subset U</math> and if the Jacobian matrix of <math>f</math> is invertible at each point of <math>A</math>, then <math>f</math> is injective on a neighborhood <math>A'</math> of <math>A</math> and <math>f^{-1} : f(A') \to A'</math> is continuously differentiable (or is <math>C^k</math>).}}
 
Note that if <math>A</math> is a point, then the above is the usual inverse function theorem.
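
As an illustration (not taken from the cited exercises): identifying <math>\mathbb{C}</math> with <math>\mathbb{R}^2</math>, the map <math>f(z) = z^2</math> is injective on the closed set <math>A = \{z : \operatorname{Re} z \ge 1\}</math> (since <math>z^2 = w^2</math> forces <math>w = \pm z</math>, and <math>-z \notin A</math> when <math>z \in A</math>), and its Jacobian determinant <math>4|z|^2</math> is nonzero on <math>A</math>; the theorem then guarantees that <math>f</math> is injective on some neighborhood <math>A'</math> of <math>A</math> with continuously differentiable inverse on <math>f(A')</math>.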
 
=== Over a real closed field ===
The inverse function theorem also holds over a [[real closed field]] ''k'' (or an [[o-minimal structure]]).<ref>Chapter 7, Theorem 2.11. in {{cite book |doi=10.1017/CBO9780511525919|title=Tame Topology and O-minimal Structures. London Mathematical Society lecture note series, no. 248|year=1998 |last1=Dries |first1=L. P. D. van den |authorlink = Lou van den Dries|isbn=9780521598385|publisher=Cambridge University Press|___location=Cambridge, New York, and Oakleigh, Victoria }}</ref> Precisely, the theorem holds for a semialgebraic (or definable) map between open subsets of <math>k^n</math> that is continuously differentiable.
 
The usual proof of the IFT uses Banach's fixed point theorem, which relies on Cauchy completeness. That part of the argument is replaced by the use of the [[extreme value theorem]], which does not need completeness. Explicitly, in {{section link||A_proof_using_the_contraction_mapping_principle}}, Cauchy completeness is used only to establish the inclusion <math>B(0, r/2) \subset f(B(0, r))</math>. Here, we shall directly show <math>B(0, r/4) \subset f(B(0, r))</math> instead (which is enough). Given a point <math>y</math> in <math>B(0, r/4)</math>, consider the function <math>P(x) = |f(x) - y|^2</math> defined on a neighborhood of <math>\overline{B}(0, r)</math>. If <math>P'(x) = 0</math>, then <math>0 = P'(x) = 2[f_1(x) - y_1 \cdots f_n(x) - y_n]f'(x)</math> and so <math>f(x) = y</math>, since <math>f'(x)</math> is invertible. Now, by the extreme value theorem, <math>P</math> attains a minimum at some point <math>x_0</math> on the closed ball <math>\overline{B}(0, r)</math>, which can be shown to lie in <math>B(0, r)</math> using <math>2^{-1}|x| \le |f(x)|</math>. Since <math>P'(x_0) = 0</math>, <math>f(x_0) = y</math>, which proves the claimed inclusion. <math>\square</math>
 
Alternatively, one can deduce the theorem from the corresponding theorem over the real numbers by [[Tarski's principle]].{{citation needed|date=December 2024}}
 
==See also==