Convex function: Difference between revisions

Revision as of 15:50, 27 November 2023 by Erel Segal (→Strongly convex functions; tag: Visual edit)
Latest revision as of 19:23, 1 August 2025 by Knief (minor edit, no edit summary)
(21 intermediate revisions by 14 users not shown)
Line 5:
[[File:Convex vs. Not-convex.jpg|thumb|right|300px|Convex vs. Not convex]]
In [[mathematics]], a [[real-valued function]] is called '''convex''' if the [[line segment]] between any two distinct points on the [[graph of a function|graph of the function]] lies above or on the graph between the two points. Equivalently, a function is convex if its [[epigraph (mathematics)|''epigraph'']] (the set of points on or above the graph of the function) is a [[convex set]]. In simple terms, the graph of a convex function is shaped like a cup <math>\cup</math> (or a straight line, as for a linear function), while the graph of a [[concave function]] is shaped like a cap <math>\cap</math>.

Line 22:
<li>For all <math>0 \leq t \leq 1</math> and all <math>x_1, x_2 \in X</math>: <math display=block>f\left(t x_1 + (1-t) x_2\right) \leq t f\left(x_1\right) + (1-t) f\left(x_2\right)</math> The right hand side represents the straight line between <math>\left(x_1, f\left(x_1\right)\right)</math> and <math>\left(x_2, f\left(x_2\right)\right)</math> in the graph of <math>f</math> as a function of <math>t;</math> increasing <math>t</math> from <math>0</math> to <math>1</math> or decreasing <math>t</math> from <math>1</math> to <math>0</math> sweeps this line. 
Similarly, the argument of the function <math>f</math> in the left hand side represents the straight line between <math>x_1</math> and <math>x_2</math> in <math>X</math> or the <math>x</math>-axis of the graph of <math>f.</math> So, this condition requires the straight line between any pair of points on the curve of <math>f</math> to lie above or just meet the graph.<ref>{{Cite web|last=|first=|date=|title=Concave Upward and Downward|url=https://www.mathsisfun.com/calculus/concave-up-down-convex.html|url-status=live|archive-url=https://web.archive.org/web/20131218034748/http://www.mathsisfun.com:80/calculus/concave-up-down-convex.html |archive-date=2013-12-18 |access-date=|website=}}</ref> </li>
<li>For all <math>0 < t < 1</math> and all <math>x_1, x_2 \in X</math> such that <math>x_1 \neq x_2</math>: <math display=block>f\left(t x_1 + (1-t) x_2\right) \leq t f\left(x_1\right) + (1-t) f\left(x_2\right)</math>

Line 37:
<math display=block>f\left(t x_1 + (1-t) x_2\right) < t f\left(x_1\right) + (1-t) f\left(x_2\right)</math>

A strictly convex function <math>f</math> is a function such that the straight line between any pair of points on the curve <math>f</math> is above the curve <math>f</math> except for the intersection points between the straight line and the curve.

An example of a function which is convex but not strictly convex is <math>f(x,y) = x^2 + y</math>. This function is not strictly convex because, for any two points sharing an x coordinate, the graph between them is a straight line segment, so the defining inequality holds with equality. For any two points that do not share an x coordinate, the straight line between them lies strictly above the graph.

The function <math>f</math> is said to be '''{{em|[[Concave function|concave]]}}''' (resp. '''{{em|strictly concave}}''') if <math>-f</math> (<math>f</math> multiplied by −1) is convex (resp. strictly convex). 
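The example <math>f(x,y) = x^2 + y</math> can be checked numerically. The following is a minimal Python sketch, not part of the article's sources; the helper name <code>chord_gap</code> and the sample points are illustrative assumptions. It computes the right-hand side minus the left-hand side of the defining inequality, which is zero when the inequality holds with equality and positive when it is strict.

```python
def f(p):
    """The example function f(x, y) = x^2 + y."""
    x, y = p
    return x * x + y


def chord_gap(func, p1, p2, t):
    """Hypothetical helper: t*f(p1) + (1-t)*f(p2) - f(t*p1 + (1-t)*p2).

    The value is non-negative for every admissible input when func is convex.
    """
    point = tuple(t * a + (1 - t) * b for a, b in zip(p1, p2))
    return t * func(p1) + (1 - t) * func(p2) - func(point)


# Two points sharing an x coordinate: the inequality holds with equality.
same_x = chord_gap(f, (1.0, 0.0), (1.0, 4.0), 0.5)

# Two points with different x coordinates: the inequality is strict.
diff_x = chord_gap(f, (0.0, 0.0), (2.0, 0.0), 0.5)

print(same_x, diff_x)
```

A zero gap for the first pair is what rules out strict convexity, while a non-negative gap for every pair is what convexity requires. A finite set of sample points can illustrate these conditions but does not prove them.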
Line 50:
=== Functions of one variable ===
* Suppose <math>f</math> is a function of one [[real number|real]] variable defined on an interval, and let <math display=block>R(x_1, x_2) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}</math> (note that <math>R(x_1, x_2)</math> is the slope of the purple line in the first drawing; the function <math>R</math> is [[Symmetric function|symmetric]] in <math>(x_1, x_2),</math> which means that <math>R</math> does not change when <math>x_1</math> and <math>x_2</math> are exchanged). <math>f</math> is convex if and only if <math>R(x_1, x_2)</math> is [[monotonically non-decreasing]] in <math>x_1,</math> for every fixed <math>x_2</math> (or vice versa). This characterization of convexity is quite useful to prove the following results.
* A convex function <math>f</math> of one real variable defined on some [[open interval]] <math>C</math> is [[Continuous function|continuous]] on <math>C.</math> Moreover, <math>f</math> admits [[Semi-differentiability|left and right derivatives]], and these are [[monotonically non-decreasing]]. In addition, the left derivative is left-continuous and the right-derivative is right-continuous. As a consequence, <math>f</math> is [[differentiable function|differentiable]] at all but at most [[countable|countably many]] points; the set on which <math>f</math> is not differentiable can, however, still be dense. If <math>C</math> is closed, then <math>f</math> may fail to be continuous at the endpoints of <math>C</math> (an example is shown in the [[#Examples|examples section]]).
* A [[differentiable function|differentiable]] function of one variable is convex on an interval if and only if its [[derivative]] is [[monotonically non-decreasing]] on that interval. If a function is differentiable and convex then it is also [[continuously differentiable]]. 
* A differentiable function of one variable is convex on an interval if and only if its graph lies above all of its [[tangent]]s:<ref name="boyd">{{cite book| title=Convex Optimization| first1=Stephen P.|last1=Boyd |first2=Lieven| last2=Vandenberghe | year = 2004 |publisher=Cambridge University Press| isbn=978-0-521-83378-3| url= https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf#page=83 |format=pdf | access-date=October 15, 2011}}</ref>{{rp|69}} <math display=block>f(x) \geq f(y) + f'(y) (x-y)</math> for all <math>x</math> and <math>y</math> in the interval.

Line 58:
* If <math>f</math> is a convex function of one real variable, and <math>f(0)\le 0</math>, then <math>f</math> is [[Superadditivity|superadditive]] on the [[positive reals]], that is <math>f(a + b) \geq f(a) + f(b)</math> for positive real numbers <math>a</math> and <math>b</math>.
{{math proof|proof=
Since <math>f</math> is convex, by using one of the convex function definitions above and letting <math>x_2 = 0,</math> it follows that for all real <math>0 \leq t \leq 1,</math>
<math display=block>
\begin{align}
f(tx_1) & = f(t x_1 + (1-t) \cdot 0) \\
& \leq t f(x_1) + (1-t) f(0) \\
& \leq t f(x_1), \\
\end{align}
</math>
where the last step uses <math>f(0) \leq 0</math> and <math>1 - t \geq 0.</math> Applying <math>f(tx_1)\leq t f(x_1)</math> with <math>x_1 = a+b</math> and with <math>t = \tfrac{a}{a+b}</math> and <math>t = \tfrac{b}{a+b}</math> in turn, it follows that
<math display=block>
\begin{align}
f(a) + f(b) & = f \left((a+b) \frac{a}{a+b} \right) + f \left((a+b) \frac{b}{a+b} \right) \\
& \leq \frac{a}{a+b} f(a+b) + \frac{b}{a+b} f(a+b) \\
& = f(a+b).\\
\end{align}
</math>
Namely, <math>f(a) + f(b) \leq f(a+b)</math>.
}}

Line 65 ⟶ 77:
=== Functions of several variables ===
* A function that is marginally convex in each individual variable is not necessarily (jointly) convex. For example, the function <math>f(x, y) = x y</math> is [[bilinear map|marginally linear]], and thus marginally convex, in each variable, but not (jointly) convex. 
* A function <math>f : X \to [-\infty, \infty]</math> valued in the [[extended real numbers]] <math>[-\infty, \infty] = \R \cup \{\pm\infty\}</math> is convex if and only if its [[Epigraph (mathematics)|epigraph]] <math display=block>\{(x, r) \in X \times \R ~:~ r \geq f(x)\}</math> is a convex set.
* A differentiable function <math>f</math> defined on a convex domain is convex if and only if <math>f(x) \geq f(y) + \nabla f(y)^T \cdot (x-y)</math> holds for all <math>x, y</math> in the domain.

Line 132 ⟶ 145:
Strongly convex functions are in general easier to work with than convex or strictly convex functions, since they are a smaller class. Like strictly convex functions, strongly convex functions have unique minima on compact sets.

=== Properties of strongly-convex functions ===
If ''f'' is a strongly-convex function with parameter ''m'', then:<ref name=":0">{{Cite web |last=Nemirovsky and Ben-Tal |date=2023 |title=Optimization III: Convex Optimization |url=http://www2.isye.gatech.edu/~nemirovs/OPTIIILN2023Spring.pdf}}</ref>{{Rp|location=Prop.6.1.4}}
* For every real number ''r'', the [[level set]] {''x'' | ''f''(''x'') ≤ ''r''} is [[Compact space|compact]].
* The function ''f'' has a unique [[global minimum]] on ''R<sup>n</sup>''.

== Uniformly convex functions ==
A uniformly convex function,<ref name="Zalinescu">{{cite book|title=Convex Analysis in General Vector Spaces|author=C. Zalinescu|publisher=World Scientific|year=2002|isbn=9812380671}}</ref><ref name="Bauschke">{{cite book|page=[https://archive.org/details/convexanalysismo00hhba/page/n161 144]|title=Convex Analysis and Monotone Operator Theory in Hilbert Spaces |url=https://archive.org/details/convexanalysismo00hhba|url-access=limited|author=H. Bauschke and P. L. Combettes |publisher=Springer |year=2011 |isbn=978-1-4419-9467-7}}</ref> with modulus <math>\phi</math>, is a function <math>f</math> that, for all <math>x, y</math> in the domain and <math>t \in [0, 1],</math> satisfies <math display="block">f(tx+(1-t)y) \le t f(x)+(1-t)f(y) - t(1-t) \phi(\|x-y\|)</math> where <math>\phi</math> is a function that is non-negative and vanishes only at 0. This is a generalization of the concept of strongly convex function; by taking <math>\phi(\alpha) = \tfrac{m}{2} \alpha^2</math> we recover the definition of strong convexity.

Line 146 ⟶ 164:
* The [[absolute value]] function <math>f(x)=|x|</math> is convex (as reflected in the [[triangle inequality]]), even though it does not have a derivative at the point <math>x = 0.</math> It is not strictly convex.
* The function <math>f(x)=|x|^p</math> for <math>p \ge 1</math> is convex.
* The [[exponential function]] <math>f(x)=e^x</math> is convex. It is also strictly convex, since <math>f''(x)=e^x >0 </math>, but it is not strongly convex since the second derivative can be arbitrarily close to zero. More generally, the function <math>g(x) = e^{f(x)}</math> is [[Logarithmically convex function|logarithmically convex]] if <math>f</math> is a convex function. The term "superconvex" is sometimes used instead.<ref>{{Cite journal | last1 = Kingman | first1 = J. F. C. | doi = 10.1093/qmath/12.1.283 | title = A Convexity Property of Positive Matrices | journal = The Quarterly Journal of Mathematics | volume = 12 | pages = 283–284 | year = 1961 | bibcode = 1961QJMat..12..283K }}</ref>
* The function <math>f</math> with domain [0,1] defined by <math>f(0) = f(1) = 1, f(x) = 0</math> for <math>0 < x < 1</math> is convex; it is continuous on the open interval <math>(0, 1),</math> but not continuous at 0 and 1. 
* The function <math>x^3</math> has second derivative <math>6 x</math>; thus it is convex on the set where <math>x \geq 0</math> and [[concave function|concave]] on the set where <math>x \leq 0.</math>
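
The behaviour of <math>x^3</math> on the two half-lines can be illustrated with exact rational arithmetic. The Python sketch below is an illustration only; the helper <code>chord_gap</code>, the grids, and the sample points are assumptions introduced here, and a check on finitely many points is not a proof. It evaluates the defining inequality on a small grid of non-negative points and on the mirrored grid of non-positive points, and then on one pair of points taken from opposite sides of zero.

```python
from fractions import Fraction
from itertools import product


def cube(x):
    """The example function f(x) = x^3."""
    return x ** 3


def chord_gap(func, x1, x2, t):
    """Hypothetical helper: t*f(x1) + (1-t)*f(x2) - f(t*x1 + (1-t)*x2)."""
    return t * func(x1) + (1 - t) * func(x2) - func(t * x1 + (1 - t) * x2)


ts = [Fraction(k, 4) for k in range(5)]      # t = 0, 1/4, 1/2, 3/4, 1
nonneg = [Fraction(k, 2) for k in range(7)]  # x = 0, 1/2, ..., 3
nonpos = [-x for x in nonneg]                # x = 0, -1/2, ..., -3

# Convexity on x >= 0: the gap is never negative on the sampled grid.
convex_on_nonneg = all(
    chord_gap(cube, a, b, t) >= 0 for a, b, t in product(nonneg, nonneg, ts)
)

# Concavity on x <= 0: the gap is never positive on the sampled grid.
concave_on_nonpos = all(
    chord_gap(cube, a, b, t) <= 0 for a, b, t in product(nonpos, nonpos, ts)
)

# Points on opposite sides of zero: the convexity inequality can fail.
mixed_gap = chord_gap(cube, Fraction(-2), Fraction(1), Fraction(1, 2))

print(convex_on_nonneg, concave_on_nonpos, mixed_gap)
```

The negative gap for the mixed pair shows that <math>x^3</math> is not convex on the whole real line, consistent with the sign change of the second derivative <math>6x</math> at zero.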