Content deleted Content added
Erel Segal (talk | contribs) |
mNo edit summary |
||
(21 intermediate revisions by 14 users not shown) | |||
Line 5:
[[File:Convex vs. Not-convex.jpg|thumb|right|300px|Convex vs. Not convex]]
In [[mathematics]], a [[real-valued function]] is called '''convex''' if the [[line segment]] between any two distinct points on the [[graph of a function|graph of the function]] lies above or on the graph between the two points. Equivalently, a function is convex if its [[epigraph (mathematics)|''epigraph'']] (the set of points on or above the graph of the function) is a [[convex set]].
In simple terms, a convex function graph is shaped like a cup <math>\cup</math> (or a straight line like a linear function), while a [[concave function]]'s graph is shaped like a cap <math>\cap</math>.
Line 22:
<li>For all <math>0 \leq t \leq 1</math> and all <math>x_1, x_2 \in X</math>:
<math display=block>f\left(t x_1 + (1-t) x_2\right) \leq t f\left(x_1\right) + (1-t) f\left(x_2\right)</math>
The right hand side represents the straight line between <math>\left(x_1, f\left(x_1\right)\right)</math> and <math>\left(x_2, f\left(x_2\right)\right)</math> in the graph of <math>f</math> as a function of <math>t;</math> increasing <math>t</math> from <math>0</math> to <math>1</math> or decreasing <math>t</math> from <math>1</math> to <math>0</math> sweeps this line. Similarly, the argument of the function <math>f</math> in the left hand side represents the straight line between <math>x_1</math> and <math>x_2</math> in <math>X</math> or the <math>x</math>-axis of the graph of <math>f.</math> So, this condition requires that the straight line between any pair of points on the curve of <math>f</math>
</li>
<li>For all <math>0 < t < 1</math> and all <math>x_1, x_2 \in X</math> such that <math>x_1
<math display=block>f\left(t x_1 + (1-t) x_2\right) \leq t f\left(x_1\right) + (1-t) f\left(x_2\right)</math>
Line 37:
<math display=block>f\left(t x_1 + (1-t) x_2\right) < t f\left(x_1\right) + (1-t) f\left(x_2\right)</math>
A strictly convex function <math>f</math> is a function such that the straight line between any pair of points on the curve <math>f</math> is above the curve <math>f</math> except for the intersection points between the straight line and the curve. An example of a function which is convex but not strictly convex is <math>f(x,y) = x^2 + y</math>. This function is not strictly convex because any two points sharing an x coordinate will have a straight line between them, while any two points NOT sharing an x coordinate will have a greater value of the function than the points between them.
The function <math>f</math> is said to be '''{{em|[[Concave function|concave]]}}''' (resp. '''{{em|strictly concave}}''') if <math>-f</math> (<math>f</math> multiplied by −1) is convex (resp. strictly convex).
Line 50:
=== Functions of one variable ===
* Suppose <math>f</math> is a function of one [[real number|real]] variable defined on an interval, and let <math display=block>R(x_1, x_2) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}</math> (note that <math>R(x_1, x_2)</math> is the slope of the purple line in the
* A convex function <math>f</math> of one real variable defined on some [[open interval]] <math>C</math> is [[Continuous function|continuous]] on <math>C
* A [[differentiable function|differentiable]] function of one variable is convex on an interval if and only if its [[derivative]] is [[monotonically non-decreasing]] on that interval. If a function is differentiable and convex then it is also [[continuously differentiable]].
* A differentiable function of one variable is convex on an interval if and only if its graph lies above all of its [[tangent]]s:<ref name="boyd">{{cite book| title=Convex Optimization| first1=Stephen P.|last1=Boyd |first2=Lieven| last2=Vandenberghe | year = 2004 |publisher=Cambridge University Press| isbn=978-0-521-83378-3| url= https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf#page=83 |format=pdf | access-date=October 15, 2011}}</ref>{{rp|69}} <math display=block>f(x) \geq f(y) + f'(y) (x-y)</math> for all <math>x</math> and <math>y</math> in the interval.
Line 58:
* If <math>f</math> is a convex function of one real variable, and <math>f(0)\le 0</math>, then <math>f</math> is [[Superadditivity|superadditive]] on the [[positive reals]], that is <math>f(a + b) \geq f(a) + f(b)</math> for positive real numbers <math>a</math> and <math>b</math>.
{{math proof|proof=
Since <math>f</math> is convex, by using one of the convex function definitions above and letting <math>x_2 = 0,</math> it follows that for all real <math>0 \leq t \leq 1,</math> <math display=block>
\begin{align} f(tx_1) & = f(t x_1 + (1-t) \cdot 0) \\ & \leq t f(x_1) + (1-t) f(0) \\ & \leq t f(x_1). \ \end{align} </math> From <math>f(tx_1)\leq t f(x_1) \begin{align} f(a) + f(b) & = f \left((a+b) \frac{a}{a+b} \right) + f \left((a+b) \frac{b}{a+b} \right) \\ & \leq \frac{a}{a+b} f(a+b) + \frac{b}{a+b} & = f(a+b).\\
\end{align}</math>
Namely, <math>f(a) + f(b) \leq f(a+b)</math>.
}}
Line 65 ⟶ 77:
=== Functions of several variables ===
* A function that is marginally convex in each individual variable is not necessarily (jointly) convex. For example, the function <math>f(x, y) = x y</math> is [[bilinear map|marginally linear]], and thus marginally convex, in each variable, but not (jointly) convex.
* A function <math>f : X \to [-\infty, \infty]</math> valued in the [[extended real numbers]] <math>[-\infty, \infty] = \R \cup \{\pm\infty\}</math> is convex if and only if its [[Epigraph (mathematics)|epigraph]] <math display=block>\{(x, r) \in X \times \R ~:~ r \geq f(x)\}</math> is a convex set.
* A differentiable function <math>f</math> defined on a convex ___domain is convex if and only if <math>f(x) \geq f(y) + \nabla f(y)^T \cdot (x-y)</math> holds for all <math>x, y</math> in the ___domain.
Line 132 ⟶ 145:
Strongly convex functions are in general easier to work with than convex or strictly convex functions, since they are a smaller class. Like strictly convex functions, strongly convex functions have unique minima on compact sets.
===
If ''f'' is a strongly-convex function with parameter ''m'', then:<ref name=":0">{{Cite web |last=Nemirovsky and Ben-Tal |date=2023 |title=Optimization III: Convex Optimization |url=http://www2.isye.gatech.edu/~nemirovs/OPTIIILN2023Spring.pdf}}</ref>{{Rp|___location=Prop.6.1.4}}
* For every real number ''r'', the [[level set]] {''x'' | ''f''(''x'') ≤ ''r''} is [[Compact space|compact]].
* The function ''f'' has a unique [[global minimum]] on ''R<sup>n</sup>''.
== Uniformly convex functions ==
A uniformly convex function,<ref name="Zalinescu">{{cite book|title=Convex Analysis in General Vector Spaces|author=C. Zalinescu|publisher=World Scientific|year=2002|isbn=9812380671}}</ref><ref name="Bauschke">{{cite book|page=[https://archive.org/details/convexanalysismo00hhba/page/n161 144]|title=Convex Analysis and Monotone Operator Theory in Hilbert Spaces |url=https://archive.org/details/convexanalysismo00hhba|url-access=limited|author=H. Bauschke and P. L. Combettes |publisher=Springer |year=2011 |isbn=978-1-4419-9467-7}}</ref> with modulus <math>\phi</math>, is a function <math>f</math> that, for all <math>x, y</math> in the ___domain and <math>t \in [0, 1],</math> satisfies
<math display="block">f(tx+(1-t)y) \le t f(x)+(1-t)f(y) - t(1-t) \phi(\|x-y\|)</math>
where <math>\phi</math> is a function that is non-negative and vanishes only at 0. This is a generalization of the concept of strongly convex function; by taking <math>\phi(\alpha) = \tfrac{m}{2} \alpha^2</math> we recover the definition of strong convexity.
Line 146 ⟶ 164:
* The [[absolute value]] function <math>f(x)=|x|</math> is convex (as reflected in the [[triangle inequality]]), even though it does not have a derivative at the point <math>x = 0.</math> It is not strictly convex.
* The function <math>f(x)=|x|^p</math> for <math>p \ge 1</math> is convex.
* The [[exponential function]] <math>f(x)=e^x</math> is convex. It is also strictly convex, since <math>f''(x)=e^x >0 </math>, but it is not strongly convex since the second derivative can be arbitrarily close to zero. More generally, the function <math>g(x) = e^{f(x)}</math> is [[Logarithmically convex function|logarithmically convex]] if <math>f</math> is a convex function. The term "superconvex" is sometimes used instead.<ref>{{Cite journal | last1 = Kingman | first1 = J. F. C. | doi = 10.1093/qmath/12.1.283 | title = A Convexity Property of Positive Matrices | journal = The Quarterly Journal of Mathematics | volume = 12 | pages = 283–284 | year = 1961 | bibcode = 1961QJMat..12..283K }}</ref>
* The function <math>f</math> with ___domain [0,1] defined by <math>f(0) = f(1) = 1, f(x) = 0</math> for <math>0 < x < 1</math> is convex; it is continuous on the open interval <math>(0, 1),</math> but not continuous at 0 and 1.
* The function <math>x^3</math> has second derivative <math>6 x</math>; thus it is convex on the set where <math>x \geq 0</math> and [[concave function|concave]] on the set where <math>x \leq 0.</math>
|