天天看点

Optimization Week 2: Convex Functions1 Definition of convex function2 Operations preserving convexity3 Examples

Week 2: Convex Functions

  • 1 Definition of convex function
    • 1.1 Basic definition
    • 1.2 Definition of differentiable function
    • 1.3 Definition with monotone gradient
  • 2 Operations preserving convexity
    • 2.1 Non-negative sum
    • 2.2 Affine map
    • 2.3 Pointwise max
    • 2.4 Minimizing out a subset of vars
  • 3 Examples
    • 3.1 Linear, Exponential, Negative entrophy (use hessian)
    • 3.2 All norms
    • 3.3 Maximum singular value of X X X
    • 3.4 Pointwise max: Sum of largest n components, Piece wise linear functions

1 Definition of convex function

1.1 Basic definition

A function f : R d → R f: \mathbb{R}^d\rightarrow \mathbb{R} f:Rd→R is convex if for ∀ \forall ∀ x 1 , x 2 x_1,x_2 x1​,x2​ and ∀ \forall ∀ λ ∈ [ 0 , 1 ] \lambda \in [0,1] λ∈[0,1]:

f ( λ x 1 + ( 1 − λ ) x 2 ) ≤ λ f ( x 1 ) + ( 1 − λ ) f ( x 2 ) f(\lambda x_1+(1-\lambda)x_2)\leq \lambda f(x_1)+(1-\lambda)f(x_2) f(λx1​+(1−λ)x2​)≤λf(x1​)+(1−λ)f(x2​)

1.2 Definition of differentiable function

  • If f f f is differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x and y y y, f ( y ) ≥ f ( x ) + ∇ f ( x ) T ( y − x ) f(y)\geq f(x)+\nabla f(x)^T(y-x) f(y)≥f(x)+∇f(x)T(y−x)
  • If f f f is twice differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x, ∇ 2 f ( x ) ≥ 0 \nabla^2f(x)\geq 0 ∇2f(x)≥0

1.3 Definition with monotone gradient

  • If f f f is differentiable, f f f is convex ⇔ \Leftrightarrow ⇔ ∀ \forall ∀ x x x and y y y, ( ∇ f ( x ) − ∇ f ( y ) ) ( x − y ) ≥ 0 (\nabla f(x)-\nabla f(y))(x-y)\geq 0 (∇f(x)−∇f(y))(x−y)≥0

2 Operations preserving convexity

2.1 Non-negative sum

f = a 1 f 1 ( x ) + a 2 f 2 ( x ) f=a_1f_1(x)+a_2f_2(x) f=a1​f1​(x)+a2​f2​(x)

2.2 Affine map

f = f 1 ( A x + b ) f=f_1(Ax+b) f=f1​(Ax+b)

2.3 Pointwise max

f = max ⁡ { f 1 ( x ) , f 2 ( x ) } f=\max\{f_1(x),f_2(x)\} f=max{f1​(x),f2​(x)}

2.4 Minimizing out a subset of vars

f ( x ) = min ⁡ y f 1 ( x , y ) f(x)=\min_y f_1(x,y) f(x)=ymin​f1​(x,y)

3 Examples

3.1 Linear, Exponential, Negative entrophy (use hessian)

f ( x ) = a x + b f(x)=ax+b f(x)=ax+b f ( x ) = e a T x f(x)=e^{a^T x} f(x)=eaTx f ( x ) = x log ⁡ x f(x)=x \log x f(x)=xlogx

3.2 All norms

f ( x ) = ∣ x ∣ p , p ≥ 1 f(x)=|x|_p,p\geq 1 f(x)=∣x∣p​,p≥1

3.3 Maximum singular value of X X X

f ( X ) = max  λ  of  X f(X)=\text{max } \lambda \text{ of } X f(X)=max λ of X

3.4 Pointwise max: Sum of largest n components, Piece wise linear functions

继续阅读