convex function

ECE1505H Convex Optimization. Lecture 7: Examples of convex and concave functions, local and global minimums. Taught by Prof. Stark Draper

February 2, 2017 Uncategorized No comments , , , , , , ,

[Click here for a PDF of this post with nicer formatting]

Disclaimer

Peeter’s lecture notes from class. These may be incoherent and rough.

These are notes for the UofT course ECE1505H, Convex Optimization, taught by Prof. Stark Draper, from [1].

Today

  • Local and global optimality
  • Compositions of functions
  • Examples

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:20}
\begin{aligned}
F(x) &= x^2 \\
F”(x) &= 2 > 0
\end{aligned}
\end{equation}

strictly convex.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:40}
\begin{aligned}
F(x) &= x^3 \\
F”(x) &= 6 x.
\end{aligned}
\end{equation}

Not always non-negative, so not convex. However \( x^3 \) is convex on \( \textrm{dom} F = \mathbb{R}_{+} \).

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:60}
\begin{aligned}
F(x) &= x^\alpha \\
F'(x) &= \alpha x^{\alpha-1} \\
F”(x) &= \alpha(\alpha-1) x^{\alpha-2}.
\end{aligned}
\end{equation}

 

fig. 1. Powers of x.

This is convex on \( \mathbb{R}_{+} \), if \( \alpha \ge 1 \), or \( \alpha \le 0 \).

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:80}
\begin{aligned}
F(x) &= \log x \\
F'(x) &= \inv{x} \\
F”(x) &= -\inv{x^2} \le 0
\end{aligned}
\end{equation}

This is concave.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:100}
\begin{aligned}
F(x) &= x\log x \\
F'(x) &= \log x + x \inv{x} = 1 + \log x \\
F”(x) &= \inv{x}
\end{aligned}
\end{equation}

This is strictly convex on
\( \mathbb{R}_{++} \), where
\( F”(x) \ge 0 \).

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:120}
\begin{aligned}
F(x) &= e^{\alpha x} \\
F'(x) &= \alpha e^{\alpha x} \\
F”(x) &= \alpha^2 e^{\alpha x} \ge 0
\end{aligned}
\end{equation}

fig. 2. Exponential.

Such functions are plotted in fig. 2, and are convex function for all \( \alpha \).

Example:

For symmetric \( P \in S^n \)

\begin{equation}\label{eqn:convexOptimizationLecture7:140}
\begin{aligned}
F(\Bx) &= \Bx^\T P \Bx + 2 \Bq^\T \Bx + r \\
\spacegrad F &= (P + P^\T) \Bx + 2 \Bq = 2 P \Bx + 2 \Bq \\
\spacegrad^2 F &= 2 P.
\end{aligned}
\end{equation}

This is convex(concave) if \( P \ge 0 \) (\( P \le 0\)).

Example:

A quadratic function

\begin{equation}\label{eqn:convexOptimizationLecture7:780}
F(x, y) = x^2 + y^2 + 3 x y,
\end{equation}

that is neither convex nor concave is plotted in fig 3.

fig 3. Function with saddle point (3d and contours)

This function can be put in matrix form

\begin{equation}\label{eqn:convexOptimizationLecture7:160}
F(x, y) = x^2 + y^2 + 3 x y
=
\begin{bmatrix}
x & y
\end{bmatrix}
\begin{bmatrix}
1 & 1.5 \\
1.5 & 1
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix},
\end{equation}

and has the Hessian

\begin{equation}\label{eqn:convexOptimizationLecture7:180}
\begin{aligned}
\spacegrad^2 F
&=
\begin{bmatrix}
\partial_{xx} F & \partial_{xy} F \\
\partial_{yx} F & \partial_{yy} F \\
\end{bmatrix} \\
&=
\begin{bmatrix}
2 & 3 \\
3 & 2
\end{bmatrix} \\
&= 2 P.
\end{aligned}
\end{equation}

From the plot we know that this is not PSD, but this can be confirmed by checking the eigenvalues

\begin{equation}\label{eqn:convexOptimizationLecture7:200}
\begin{aligned}
0
&=
\det ( P – \lambda I ) \\
&=
(1 – \lambda)^2 – 1.5^2,
\end{aligned}
\end{equation}

which has solutions

\begin{equation}\label{eqn:convexOptimizationLecture7:220}
\lambda = 1 \pm \frac{3}{2} = \frac{3}{2}, -\frac{1}{2}.
\end{equation}

This is not PSD nor negative semi-definite, because it has one positive and one negative eigenvalues. This is neither convex nor concave.

Along \( y = -x \),

\begin{equation}\label{eqn:convexOptimizationLecture7:240}
\begin{aligned}
F(x,y)
&=
F(x,-x) \\
&=
2 x^2 – 3 x^2 \\
&=
– x^2,
\end{aligned}
\end{equation}

so it is concave along this line. Along \( y = x \)

\begin{equation}\label{eqn:convexOptimizationLecture7:260}
\begin{aligned}
F(x,y)
&=
F(x,x) \\
&=
2 x^2 + 3 x^2 \\
&=
5 x^2,
\end{aligned}
\end{equation}

so it is convex along this line.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:280}
F(\Bx) = \sqrt{ x_1 x_2 },
\end{equation}

on \( \textrm{dom} F = \setlr{ x_1 \ge 0, x_2 \ge 0 } \)

For the Hessian
\begin{equation}\label{eqn:convexOptimizationLecture7:300}
\begin{aligned}
\PD{x_1}{F} &= \frac{1}{2} x_1^{-1/2} x_2^{1/2} \\
\PD{x_2}{F} &= \frac{1}{2} x_2^{-1/2} x_1^{1/2}
\end{aligned}
\end{equation}

The Hessian components are

\begin{equation}\label{eqn:convexOptimizationLecture7:320}
\begin{aligned}
\PD{x_1}{} \PD{x_1}{F} &= -\frac{1}{4} x_1^{-3/2} x_2^{1/2} \\
\PD{x_1}{} \PD{x_2}{F} &= \frac{1}{4} x_2^{-1/2} x_1^{-1/2} \\
\PD{x_2}{} \PD{x_1}{F} &= \frac{1}{4} x_1^{-1/2} x_2^{-1/2} \\
\PD{x_2}{} \PD{x_2}{F} &= -\frac{1}{4} x_2^{-3/2} x_1^{1/2}
\end{aligned}
\end{equation}

or
\begin{equation}\label{eqn:convexOptimizationLecture7:340}
\spacegrad^2 F
=
-\frac{\sqrt{x_1 x_2}}{4}
\begin{bmatrix}
\inv{x_1^2} & -\inv{x_1 x_2} \\
-\inv{x_1 x_2} & \inv{x_2^2}
\end{bmatrix}.
\end{equation}

Checking this for PSD against \( \Bv = (v_1, v_2) \), we have
\begin{equation}\label{eqn:convexOptimizationLecture7:360}
\begin{aligned}
\begin{bmatrix}
v_1 & v_2
\end{bmatrix}
\begin{bmatrix}
\inv{x_1^2} & -\inv{x_1 x_2} \\
-\inv{x_1 x_2} & \inv{x_2^2}
\end{bmatrix}
\begin{bmatrix}
v_1 \\ v_2
\end{bmatrix}
&=
\begin{bmatrix}
v_1 & v_2
\end{bmatrix}
\begin{bmatrix}
\inv{x_1^2} v_1 -\inv{x_1 x_2} v_2 \\
-\inv{x_1 x_2} v_1 + \inv{x_2^2} v_2
\end{bmatrix} \\
&=
\lr{ \inv{x_1^2} v_1 -\inv{x_1 x_2} v_2 } v_1 +
\lr{ -\inv{x_1 x_2} v_1 + \inv{x_2^2} v_2 } v_2
\\
&=
\inv{x_1^2} v_1^2
+ \inv{x_2^2} v_2^2
-2 \inv{x_1 x_2} v_1 v_2 \\
&=
\lr{
\frac{v_1}{x_1}
-\frac{v_2}{x_2}
}^2 \\
&\ge 0,
\end{aligned}
\end{equation}

so \( \spacegrad^2 F \le 0 \). This is a negative semi-definite function (concave). Observe that this check required checking PSD for all values of \( \Bx \).

This is an example of a more general result

\begin{equation}\label{eqn:convexOptimizationLecture7:380}
F(x) = \lr{ \prod_{i = 1}^n x_i }^{1/n},
\end{equation}

which is concave (prove on homework).

Summary.

If \( F \) is differentiable in \R{n}, then check the curvature of the function along all lines. i.e. At all locations and in all directions.

If the Hessian is PSD at all \( \Bx \in \textrm{dom} F \), that is

\begin{equation}\label{eqn:convexOptimizationLecture7:400}
\spacegrad^2 F \ge 0 \, \forall \Bx \in \textrm{dom} F,
\end{equation}

then the function is convex.

more examples of convex, but not necessarily differentiable functions

Example:

Over \( \textrm{dom} F = \mathbb{R}^n \)

\begin{equation}\label{eqn:convexOptimizationLecture7:420}
F(\Bx) = \max_{i = 1}^n x_i
\end{equation}

i.e.
\begin{equation}\label{eqn:convexOptimizationLecture7:440}
\begin{aligned}
F((1,2) &= 2 \\
F((3,-1) &= 3
\end{aligned}
\end{equation}

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:460}
F(\Bx) = \max_{i = 1}^n F_i(\Bx),
\end{equation}

where

\begin{equation}\label{eqn:convexOptimizationLecture7:480}
F_i(\Bx)
=
… ?
\end{equation}

max of a set of convex functions is a convex function.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:500}
F(x) =
x_{[1]} +
x_{[2]} +
x_{[3]}
\end{equation}

where

\( x_{[k]} \) is the k-th largest number in the list

Write

\begin{equation}\label{eqn:convexOptimizationLecture7:520}
F(x) = \max x_i + x_j + x_k
\end{equation}

\begin{equation}\label{eqn:convexOptimizationLecture7:540}
(i,j,k) \in \binom{n}{3}
\end{equation}

Example:

For \( \Ba \in \mathbb{R}^n \) and \( b_i \in \mathbb{R} \)

\begin{equation}\label{eqn:convexOptimizationLecture7:560}
\begin{aligned}
F(\Bx)
&= \sum_{i = 1}^n \log( b_i – \Ba^\T \Bx )^{-1} \\
&= -\sum_{i = 1}^n \log( b_i – \Ba^\T \Bx )
\end{aligned}
\end{equation}

This \( b_i – \Ba^\T \Bx \) is an affine function of \( \Bx \) so it doesn’t affect convexity.

Since \( \log \) is concave, \( -\log \) is convex. Convex functions of affine function of \( \Bx \) is convex function of \( \Bx \).

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:580}
F(\Bx) = \sup_{\By \in C} \Norm{ \Bx – \By }
\end{equation}

 

fig. 3. Max length function

 

Here \( C \subseteq \mathbb{R}^n \) is not necessarily convex. We are using \( \sup \) here because the set \( C \) may be open. This function is the length of the line from \( \Bx \) to the point in \( C \) that is furthest from \( \Bx \).

  • \( \Bx – \By \) is linear in \( \Bx \)
  • \( g_\By(\Bx) = \Norm{\Bx – \By} \) is convex in \( \Bx \) since norms are convex functions.
  • \( F(\Bx) = \sup_{\By \in C} \Norm{ \Bx – \By } \). Each \( \By \) index is a convex function. Taking max of those.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture7:600}
F(\Bx) = \inf_{\By \in C} \Norm{ \Bx – \By }.
\end{equation}

Min and max of two convex functions are plotted in fig. 4.

fig. 4. Min and max

The max is observed to be convex, whereas the min is not necessarily so.

\begin{equation}\label{eqn:convexOptimizationLecture7:800}
F(\Bz) = F(\theta \Bx + (1-\theta) \By) \ge \theta F(\Bx) + (1-\theta)F(\By).
\end{equation}

This is not necessarily convex for all sets \( C \subseteq \mathbb{R}^n \), because the \( \inf \) of a bunch of convex function is not necessarily convex. However, if \( C \) is convex, then \( F(\Bx) \) is convex.

Consequences of convexity for differentiable functions

  • Think about unconstrained functions \( \textrm{dom} F = \mathbb{R}^n \).
  • By first order condition \( F \) is convex iff the domain is convex and
    \begin{equation}\label{eqn:convexOptimizationLecture7:620}
    F(\Bx) \ge \lr{ \spacegrad F(\Bx)}^\T (\By – \Bx) \, \forall \Bx, \By \in \textrm{dom} F.
    \end{equation}

If \( F \) is convex and one can find an \( \Bx^\conj \in \textrm{dom} F \) such that

\begin{equation}\label{eqn:convexOptimizationLecture7:640}
\spacegrad F(\Bx^\conj) = 0,
\end{equation}

then

\begin{equation}\label{eqn:convexOptimizationLecture7:660}
F(\By) \ge F(\Bx^\conj) \, \forall \By \in \textrm{dom} F.
\end{equation}

If you can find the point where the gradient is zero (which can’t always be found), then \( \Bx^\conj\) is a global minimum of \( F \).

Conversely, if \( \Bx^\conj \) is a global minimizer of \( F \), then \( \spacegrad F(\Bx^\conj) = 0 \) must hold. If that were not the case, then you would be able to find a direction to move downhill, contracting the optimality of \( \Bx^\conj\).

Local vs Global optimum

 

fig. 6. Global and local minimums

Definition: Local optimum
\( \Bx^\conj \) is a local optimum of \( F \) if \( \exists \epsilon > 0 \) such that \( \forall \Bx \), \( \Norm{\Bx – \Bx^\conj} < \epsilon \), we have

\begin{equation*}
F(\Bx^\conj) \le F(\Bx)
\end{equation*}

 

fig. 5. min length function

Theorem:
Suppose \( F \) is twice continuously differentiable (not necessarily convex)

  • If \( \Bx^\conj\) is a local optimum then\begin{equation*}
    \begin{aligned}
    \spacegrad F(\Bx^\conj) &= 0 \\
    \spacegrad^2 F(\Bx^\conj) \ge 0
    \end{aligned}
    \end{equation*}
  • If
    \begin{equation*}
    \begin{aligned}
    \spacegrad F(\Bx^\conj) &= 0 \\
    \spacegrad^2 F(\Bx^\conj) \ge 0
    \end{aligned},
    \end{equation*}then \( \Bx^\conj\) is a local optimum.

Proof:

  • Let \( \Bx^\conj \) be a local optimum. Pick any \( \Bv \in \mathbb{R}^n \).\begin{equation}\label{eqn:convexOptimizationLecture7:720}
    \lim_{t \rightarrow 0} \frac{ F(\Bx^\conj + t \Bv) – F(\Bx^\conj)}{t}
    = \lr{ \spacegrad F(\Bx^\conj) }^\T \Bv
    \ge 0.
    \end{equation}

Here the fraction is \( \ge 0 \) since \( \Bx^\conj \) is a local optimum.

Since the choice of \( \Bv \) is arbitrary, the only case that you can ensure that \( \ge 0, \forall \Bv \) is

\begin{equation}\label{eqn:convexOptimizationLecture7:740}
\spacegrad F = 0,
\end{equation}

( or else could pick \( \Bv = -\spacegrad F(\Bx^\conj) \).

This means that \( \spacegrad F(\Bx^\conj) = 0 \) if \( \Bx^\conj \) is a local optimum.

Consider the 2nd order derivative

\begin{equation}\label{eqn:convexOptimizationLecture7:760}
\begin{aligned}
\lim_{t \rightarrow 0} \frac{ F(\Bx^\conj + t \Bv) – F(\Bx^\conj)}{t^2}
&=
\lim_{t \rightarrow 0} \inv{t^2}
\lr{
F(\Bx^\conj) + t \lr{ \spacegrad F(\Bx^\conj) }^\T \Bv + \inv{2} t^2 \Bv^\T \spacegrad^2 F(\Bx^\conj) \Bv + O(t^3)
– F(\Bx^\conj)
} \\
&=
\inv{2} \Bv^\T \spacegrad^2 F(\Bx^\conj) \Bv \\
&\ge 0.
\end{aligned}
\end{equation}

Here the \( \ge \) condition also comes from the fraction, based on the optimiality of \( \Bx^\conj \). This is true for all choice of \( \Bv \), thus \( \spacegrad^2 F(\Bx^\conj) \).

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.

ECE1505H Convex Optimization. Lecture 5: Sets, epigraphs, quasi-convexity, and sublevel sets. Taught by Prof. Stark Draper

January 26, 2017 ece1505 No comments , , , , , , , ,

[Click here for a PDF of this post with nicer formatting]

Disclaimer

Peeter’s lecture notes from class. These may be incoherent and rough.

These are notes for the UofT course ECE1505H, Convex Optimization, taught by Prof. Stark Draper, from [1].

Last time

  • examples of sets: planes, half spaces, balls, ellipses, cone of positive semi-definite matrices
  • generalized inequalities
  • examples of convexity preserving operations

Today

  • more examples of convexity preserving operations
  • separating and supporting hyperplanes
  • basic definitions of convex functions
  • epigraphs, quasi-convexity, sublevel sets
  • first and second order conditions for convexity of differentiable functions.

Operations that preserve convexity

If \( S_\alpha \) is convex \( \forall \alpha \in A \), then

\begin{equation}\label{eqn:convexOptimizationLecture5:40}
\cup_{\alpha \in A} S_\alpha,
\end{equation}

is convex.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture5:60}
F(\Bx) = A \Bx + \Bb
\end{equation}

\begin{equation}\label{eqn:convexOptimizationLecture5:80}
\begin{aligned}
\Bx &\in \mathbb{R}^n \\
A &\in \mathbb{R}^{m \times n} \\
F &: \mathbb{R}^{n} \rightarrow \mathbb{R}^m \\
\Bb &\in \mathbb{R}^m
\end{aligned}
\end{equation}

  1. If \( S \in \mathbb{R}^n \) is convex, then\begin{equation}\label{eqn:convexOptimizationLecture5:100}
    F(S) = \setlr{ F(\Bx) | \Bx \in S }
    \end{equation}is convex if \( F \) is affine.
  2. If \( S \in \mathbb{R}^m \) is convex, then\begin{equation}\label{eqn:convexOptimizationLecture5:120}
    F^{-1}(S) = \setlr{ \Bx | F(\Bx) \in S }
    \end{equation}

    is convex.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture5:140}
\setlr{ \By | \By = A \Bx + \Bb, \Norm{\Bx} \le 1}
\end{equation}

is convex. Here \( A \Bx + \Bb \) is an affine function (\(F(\Bx)\). This is the image of a (convex) unit ball, through an affine map.

Earlier saw when defining ellipses

\begin{equation}\label{eqn:convexOptimizationLecture5:160}
\By = P^{1/2} \Bx + \Bx_c
\end{equation}

Example :

\begin{equation}\label{eqn:convexOptimizationLecture5:180}
\setlr{ \Bx | \Norm{ A \Bx + \Bb } \le 1 },
\end{equation}

is convex. This can be seen by writing

\begin{equation}\label{eqn:convexOptimizationLecture5:200}
\begin{aligned}
\setlr{ \Bx | \Norm{ A \Bx + \Bb } \le 1 }
&=
\setlr{ \Bx | \Norm{ F(\Bx) } \le 1 } \\
&=
\setlr{ \Bx | F(\Bx) \in \mathcal{B} },
\end{aligned}
\end{equation}

where \( \mathcal{B} = \setlr{ \By | \Norm{\By} \le 1 } \). This is the pre-image (under \(F()\)) of a unit norm ball.

Example:

\begin{equation}\label{eqn:convexOptimizationLecture5:220}
\setlr{ \Bx \in \mathbb{R}^n | x_1 A_1 + x_2 A_2 + \cdots x_n A_n \le \mathcal{B} }
\end{equation}

where \( A_i \in S^m \) and \( \mathcal{B} \in S^m \), and the inequality is a matrix inequality. This is a convex set. The constraint is a “linear matrix inequality” (LMI).

This has to do with an affine map:

\begin{equation}\label{eqn:convexOptimizationLecture5:240}
F(\Bx) = B – 1 x_1 A_1 – x_2 A_2 – \cdots x_n A_n \ge 0
\end{equation}

(positive semi-definite inequality). This is a mapping

\begin{equation}\label{eqn:convexOptimizationLecture5:480}
F : \mathbb{R}^n \rightarrow S^m,
\end{equation}

since all \( A_i \) and \( B \) are in \( S^m \).

This \( F(\Bx) = B – A(\Bx) \) is a constant and a factor linear in x, so is affine. Can be written

\begin{equation}\label{eqn:convexOptimizationLecture5:260}
\setlr{ \Bx | B – A(\Bx) \ge 0 }
=
\setlr{ \Bx | B – A(\Bx) \in S^m_{+} }
\end{equation}

This is a pre-image of a cone of PSD matrices, which is convex. Therefore, this is a convex set.

Separating hyperplanes

Theorem: Separating hyperplanes

If \( S, T \subseteq \mathbb{R}^n \) are convex and disjoint
i.e. \( S \cup T = 0\), then
there exists on \( \Ba \in \mathbb{R}^n \) \( \Ba \ne 0 \) and a \( \Bb \in \mathbb{R}^n \) such that

\begin{equation*}
\Ba^\T \Bx \ge \Bb \, \forall \Bx \in S
\end{equation*}

and
\begin{equation*}
\Ba^\T \Bx < \Bb \,\forall \Bx \in T.
\end{equation*}

An example of a hyperplanes that separates two sets and two sets that are not separable is sketched in fig 1.1

Proof in the book.

Theorem: Supporting hyperplane
If \( S \) is convex then \( \forall x_0 \in \partial S = \textrm{cl}(S) \ \textrm{int}(S) \), where
\( \partial S \) is the boundary of \( S \), then \( \exists \) an \( \Ba \ne 0 \in \mathbb{R}^n \) such that \( \Ba^\T \Bx \le \Ba^\T x_0 \, \forall \Bx \in S \).

Here \( \ \) denotes “without”.

An example is sketched in fig. 3, for which

fig. 3. Supporting hyperplane.

  • The vector \( \Ba \) perpendicular to tangent plane.
  • inner product \( \Ba^\T (\Bx – \Bx_0) \le 0 \).

A set with a supporting hyperplane is sketched in fig 4a whereas fig 4b shows that there is not necessarily a unique supporting hyperplane at any given point, even if \( S \) is convex.

 

fig 4a. Set with supporting hyperplane.

 

fig 4b. No unique supporting hyperplane possible.

basic definitions of convex functions

Theorem: Convex functions
If \( F : \mathbb{R}^n \rightarrow \mathbb{R} \) is defined on a convex domain (i.e. \( \textrm{dom} F \subseteq \mathbb{R}^n \) is a convex set), then \( F \) is convex if \( \forall \Bx, \By \in \textrm{dom} F \), \( \forall \theta \in [0,1] \in \mathbb{R} \)

\begin{equation}\label{eqn:convexOptimizationLecture5:340}
F( \theta \Bx + (1-\theta) \By \le \theta F(\Bx) + (1-\theta) F(\By)
\end{equation}

An example is sketched in fig. 5.

 

fig. 5. Example of convex function.

Remarks

  • Require \( \textrm{dom} F \) to be a convex set. This is required so that the function at the point \( \theta u + (1-\theta) v \) can be evaluated. i.e. so that \( F(\theta u + (1-\theta) v) \) is well defined. Example: \( \textrm{dom} F = (-\infty, 0] \cup [1, \infty) \) is not okay, because a linear combination in \( (0,1) \) would be undesirable.
  • Parameter \( \theta \) is “how much up” the line segment connecting \( (u, F(u) \) and \( (v, F(v) \). This line segment never below the bottom of the bowl.
    The function is \underlineAndIndex{concave}, if \( -F \) is convex.
    i.e. If the convex function is flipped upside down. That is\begin{equation}\label{eqn:convexOptimizationLecture5:360}
    F(\theta \Bx + (1-\theta) \By ) \ge \theta F(\Bx) + (1-\theta) F(\By) \,\forall \Bx,\By \in \textrm{dom} F, \theta \in [0,1].
    \end{equation}
  • a “strictly” convex function means \(\forall \theta \in [0,1] \)\begin{equation}\label{eqn:convexOptimizationLecture5:380}
    F(\theta \Bx + (1-\theta) \By ) < \theta F(\Bx) + (1-theta) F(\By).
    \end{equation}
  • Strictly concave function \( F \) means \( -F \) is strictly convex.
  • Examples:\imageFigure{../figures/ece1505-convex-optimization/l5Fig6a}{}{fig:l5:l5Fig6a}{0.2}

    fig 6a. Not convex or concave.

     

    fig 6b. Not strictly convex

Definition: Epigraph of a function

The epigraph \( \textrm{epi} F \) of a function \( F : \mathbb{R}^n \rightarrow \mathbb{R} \) is

\begin{equation*}
\textrm{epi} F = \setlr{ (\Bx,t) \in \mathbb{R}^{n +1} | \Bx \in \textrm{dom} F, t \ge F(\Bx) },
\end{equation*}

where \( \Bx \in \mathbb{R}^n, t \in \mathbb{R} \).

 

fig. 7. Epigraph.

Theorem: Convexity and epigraph.
If \( F \) is convex implies \( \textrm{epi} F \) is a convex set.

Proof:

For convex function, a line segment connecting any 2 points on function is above the function. i.e. it is \( \textrm{epi} F \).

Many authors will go the other way around, showing \ref{dfn:convexOptimizationLecture5:400} from \ref{thm:convexOptimizationLecture5:420}. That is:

Pick any 2 points in \( \textrm{epi} F \), \( (\Bx,\mu) \in \textrm{epi} F\) and \( (\By, \nu) \in \textrm{epi} F \). Consider convex combination

\begin{equation}\label{eqn:convexOptimizationLecture5:420}
\theta( \Bx, \mu ) + (1-\theta) (\By, \nu) =
(\theta \Bx (1-\theta) \By, \theta \mu (1-\theta) \nu )
\in \textrm{epi} F,
\end{equation}

since \( \textrm{epi} F \) is a convex set.

By definition of \( \textrm{epi} F \)

\begin{equation}\label{eqn:convexOptimizationLecture5:440}
F( \theta \Bx (1-\theta) \By ) \le \theta \mu (1-\theta) \nu.
\end{equation}

Picking \( \mu = F(\Bx), \nu = F(\By) \) gives
\begin{equation}\label{eqn:convexOptimizationLecture5:460}
F( \theta \Bx (1-\theta) \By ) \le \theta F(\Bx) (1-\theta) F(\By).
\end{equation}

Extended value function

Sometimes convenient to work with “extended value function”

\begin{equation}\label{eqn:convexOptimizationLecture5:500}
\tilde{F}(\Bx) =
\left\{
\begin{array}{l l}
F(\Bx) & \quad \mbox{If \( \Bx \in \textrm{dom} F\)} \\
\infty & \quad \mbox{otherwise.}
\end{array}
\right.
\end{equation}

Examples:

  • Linear (affine) functions (fig. 8) are both convex and concave.

    fig. 8. Linear functions.

  • \( x^2 \) is convex, sketched in fig. 9.

    fig. 9. Convex (quadratic.)

  • \( \log x, \textrm{dom} F = \mathbb{R}_{+} \) concave, sketched in fig. 10.

    fig. 10. Concave (logarithm.)

  • \( \Norm{\Bx} \) is convex. \( \Norm{ \theta \Bx + (1-\theta) \By } \le \theta \Norm{ \Bx } + (1-\theta) \Norm{\By } \).
  • \( 1/x \) is convex on \( \setlr{ x | x > 0 } = \textrm{dom} F \), and concave on \( \setlr{ x | x < 0 } = \textrm{dom} F \). \begin{equation}\label{eqn:convexOptimizationLecture5:520} \tilde{F}(x) = \left\{ \begin{array}{l l} \inv{x} & \quad \mbox{If \( x > 0 \)} \\
    \infty & \quad \mbox{else.}
    \end{array}
    \right.
    \end{equation}

Definition: Sublevel

The sublevel set of a function \( F : \mathbb{R}^n \rightarrow \mathbb{R} \) is

\begin{equation*}
C(\alpha) = \setlr{ \Bx \in \textrm{dom} F | F(\Bx) \le \alpha }
\end{equation*}

 

Convex sublevel

 

Non-convex sublevel.

Theorem:
If \( F \) is convex then \( C(\alpha) \) is a convex set \( \forall \alpha \).

This is not an if and only if condition, as illustrated in fig. 12.

 

fig. 12. Convex sublevel does not imply convexity.

There \( C(\alpha) \) is convex, but the function itself is not.

Proof:

Since \( F \) is convex, then \( \textrm{epi} F \) is a convex set.

  • Let\begin{equation}\label{eqn:convexOptimizationLecture5:580}
    \mathcal{A} = \setlr{ (\Bx,t) | t = \alpha }
    \end{equation}is a convex set.
  • \( \mathcal{A} \cap \textrm{epi} F \)is a convex set since it is the intersection of convex sets.
  • Project \( \mathcal{A} \cap \textrm{epi} F \) onto \R{n} (i.e. domain of \( F \) ). The projection is an affine mapping. Image of a convex set through affine mapping is a convex set.

Definition: Quasi-convex.

A function is quasi-convex if \underline{all} of its sublevel sets are convex.

Composing convex functions

Properties of convex functions:

  • If \( F \) is convex, then \( \alpha F \) is convex \( \forall \alpha > 0 \).
  • If \( F_1, F_2 \) are convex, then the sum \( F_1 + F_2 \) is convex.
  • If \( F \) is convex, then \( g(\Bx) = F(A \Bx + \Bb) \) is convex \( \forall \Bx \in \setlr{ \Bx | A \Bx + \Bb \in \textrm{dom} F } \).

Note: for the last

\begin{equation}\label{eqn:convexOptimizationLecture5:620}
\begin{aligned}
g &: \mathbb{R}^m \rightarrow \mathbb{R} \\
F &: \mathbb{R}^n \rightarrow \mathbb{R} \\
\Bx &\in \mathbb{R}^m \\
A &\in \mathbb{R}^{n \times m} \\
\Bb &\in \mathbb{R}^n
\end{aligned}
\end{equation}

Proof (of last):

\begin{equation}\label{eqn:convexOptimizationLecture5:640}
\begin{aligned}
g( \theta \Bx + (1-\theta) \By )
&=
F( \theta (A \Bx + \Bb) + (1-\theta) (A \By + \Bb) ) \\
&\le
\theta F( A \Bx + \Bb) + (1-\theta) F (A \By + \Bb) \\
&= \theta g(\Bx) + (1-\theta) g(\By).
\end{aligned}
\end{equation}

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.