Transcript
  • Slide 1: Numerical geometry of non-rigid shapes: Numerical Optimization

    Alexander Bronstein, Michael Bronstein

    © 2008 All rights reserved. Web: tosca.cs.technion.ac.il

  • Slide 2

    Longest, shortest, largest, smallest, minimal, maximal, fastest, slowest.

    Common denominator: optimization problems

  • Slide 3: Optimization problems

    Generic unconstrained minimization problem:

    $\min_{x \in X} f(x)$

    The vector space $X$ is the search space.

    $f : X \to \mathbb{R}$ is a cost (or objective) function.

    A solution $x^* = \arg\min_{x \in X} f(x)$ is the minimizer of $f$.

    The value $f(x^*)$ is the minimum.

  • Slide 4: Local vs. global minimum

    [Figure: a cost function with a local minimum and the global minimum]

    Find the minimum by analyzing the local behavior of the cost function.

  • Slide 5: Local vs. global in real life

    [Photo: Broad Peak (K3), the 12th highest mountain on Earth:
    main summit 8,047 m; false summit 8,030 m]

  • Slide 6: Convex functions

    A function $f$ defined on a convex set $C$ is called convex if

    $f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y)$

    for any $x, y \in C$ and $\alpha \in [0, 1]$.

    [Figure: a convex function vs. a non-convex function]

    For a convex function, a local minimum is also the global minimum.

  • Slide 7: One-dimensional optimality conditions

    Point $x^*$ is a local minimizer of a $C^2$-function $f$ if
    $f'(x^*) = 0$ and $f''(x^*) > 0$.

    Approximate the function around $x^*$ as a parabola using the Taylor expansion:

    $f(x) \approx f(x^*) + f'(x^*)(x - x^*) + \tfrac{1}{2} f''(x^*)(x - x^*)^2$

    $f'(x^*) = 0$ guarantees the minimum at $x^*$; $f''(x^*) > 0$ guarantees
    the parabola is convex.

  • Slide 8: Gradient

    In the multidimensional case, linearization of the function according to Taylor,

    $f(x + dx) = f(x) + \langle \nabla f(x), dx \rangle + o(\|dx\|)$,

    gives a multidimensional analogy of the derivative.

    The function $\nabla f(x)$, denoted $\nabla f$, is called the gradient of $f$.

    In the one-dimensional case, it reduces to the standard definition of the derivative.

  • Slide 9: Gradient

    In Euclidean space ($X = \mathbb{R}^n$), $\nabla f$ can be represented in the standard basis
    $e_i = (0, \dots, 0, 1, 0, \dots, 0)^\top$ (1 in the $i$-th place) in the following way:

    $(\nabla f(x))_i = \langle \nabla f(x), e_i \rangle = \dfrac{\partial f}{\partial x_i}$,

    which gives

    $\nabla f(x) = \left( \dfrac{\partial f}{\partial x_1}, \dots, \dfrac{\partial f}{\partial x_n} \right)^\top$

  • Slide 10: Example 1: gradient of a matrix function

    Given $\mathbb{R}^{n \times m}$ (the space of real $n \times m$ matrices) with the standard inner
    product $\langle A, B \rangle = \operatorname{trace}(A^\top B)$, compute the gradient of the function
    $f(X)$, where $X$ is an $n \times m$ matrix.

    For square matrices, $\operatorname{trace}(AB) = \operatorname{trace}(BA)$.

  • Slide 11: Example 2: gradient of a matrix function

    Compute the gradient of the function $f(X)$, where $X$ is an $n \times m$ matrix.
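
    The specific functions in Examples 1 and 2 are not recoverable from this
    transcript; as representative computations of the same kind (assumed examples,
    using only the trace inner product above), one can derive:

    For $f(X) = \operatorname{trace}(AX)$:
    $f(X + dX) = \operatorname{trace}(AX) + \operatorname{trace}(A\, dX) = f(X) + \langle A^\top, dX \rangle$,
    hence $\nabla f(X) = A^\top$.

    For $f(X) = \operatorname{trace}(X^\top A X)$:
    $f(X + dX) = f(X) + \operatorname{trace}(dX^\top A X) + \operatorname{trace}(X^\top A\, dX) + O(\|dX\|^2)
    = f(X) + \langle (A + A^\top) X, dX \rangle$,
    hence $\nabla f(X) = (A + A^\top) X$.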

  • Slide 12: Hessian

    Linearization of the gradient,

    $\nabla f(x + dx) = \nabla f(x) + \nabla^2 f(x)\, dx + o(\|dx\|)$,

    gives a multidimensional analogy of the second-order derivative.

    The function $\nabla^2 f(x)$, denoted $\nabla^2 f$, is called the Hessian of $f$.

    In the standard basis, the Hessian is a symmetric matrix of mixed second-order
    derivatives, $(\nabla^2 f)_{ij} = \dfrac{\partial^2 f}{\partial x_i \partial x_j}$.

    [Portrait: Ludwig Otto Hesse (1811-1874)]

  • Slide 13: Optimality conditions, bis

    Point $x^*$ is a local minimizer of a $C^2$-function $f$ if $\nabla f(x^*) = 0$ and

    $\langle \nabla^2 f(x^*)\, d, d \rangle > 0$ for all $d \ne 0$, i.e., the Hessian is a positive definite
    matrix (denoted $\nabla^2 f(x^*) \succ 0$).

    Approximate the function around $x^*$ as a parabola using the Taylor expansion:

    $f(x^* + d) \approx f(x^*) + \langle \nabla f(x^*), d \rangle + \tfrac{1}{2} \langle \nabla^2 f(x^*)\, d, d \rangle$

    $\nabla f(x^*) = 0$ guarantees the minimum at $x^*$; $\nabla^2 f(x^*) \succ 0$ guarantees
    the parabola is convex.

  • Slide 14: Optimization algorithms

    Two ingredients: descent direction and step size.

  • Slide 15: Generic optimization algorithm

    Start with some $x_0$; set $k = 0$. Repeat until convergence:

    Determine a descent direction $d_k$.

    Choose a step size $\alpha_k$ such that $f(x_k + \alpha_k d_k) < f(x_k)$.

    Update the iterate: $x_{k+1} = x_k + \alpha_k d_k$.

    Increment the iteration counter: $k \leftarrow k + 1$.

    Solution: $x^* \approx x_k$.

    Design choices: descent direction, step size, stopping criterion. A sketch of
    the loop follows below.
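
    As a concrete illustration of this loop (an editor's sketch, not from the
    slides; descent_direction and line_search are placeholder callables standing
    in for the ingredients discussed on the following slides), in Python/NumPy:

    import numpy as np

    def minimize(f, grad, descent_direction, line_search, x0,
                 tol=1e-6, max_iter=1000):
        """Generic descent loop: direction, step size, update, stopping test."""
        x = np.asarray(x0, dtype=float)
        for k in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:        # stopping criterion: small gradient
                break
            d = descent_direction(x, g)        # e.g. -g for steepest descent
            alpha = line_search(f, x, d, g)    # step size with sufficient decrease
            x = x + alpha * d
        return x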

  • Slide 16: Stopping criteria

    Near a local minimum, $\nabla f(x) \approx 0$ (or equivalently, $x \approx x^*$).

    Stop when the gradient norm becomes small: $\|\nabla f(x_k)\| < \varepsilon$.

    Stop when the step size becomes small: $\|x_{k+1} - x_k\| < \varepsilon$.

    Stop when the relative objective change becomes small:
    $|f(x_{k+1}) - f(x_k)| / |f(x_k)| < \varepsilon$.

  • Slide 17: Line search

    The optimal step size can be found by solving a one-dimensional optimization
    problem:

    $\alpha_k = \arg\min_{\alpha \ge 0} f(x_k + \alpha d_k)$

    One-dimensional optimization algorithms for finding the optimal step size
    are generically called exact line search.

  • Slide 18: Armijo [ar-mi-xo] rule

    The function sufficiently decreases if

    $f(x_k + \alpha d_k) - f(x_k) \le \sigma \alpha \langle \nabla f(x_k), d_k \rangle$, with $0 < \sigma < 1$.

    Armijo rule (Larry Armijo, 1966): start with some $\alpha$ and decrease it by
    multiplying by some $\beta \in (0, 1)$ until the function sufficiently decreases.
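
    A minimal sketch of the Armijo rule in Python/NumPy (the constants
    sigma = 1e-4 and beta = 0.5 are typical textbook choices, assumed here; the
    slide does not fix them):

    import numpy as np

    def armijo(f, x, d, g, alpha0=1.0, sigma=1e-4, beta=0.5, max_backtracks=50):
        """Backtracking line search: shrink alpha until the sufficient-decrease
        condition f(x + a*d) - f(x) <= sigma * a * <g, d> holds."""
        alpha, fx, slope = alpha0, f(x), np.dot(g, d)  # slope < 0 for a descent direction
        for _ in range(max_backtracks):
            if f(x + alpha * d) - fx <= sigma * alpha * slope:
                break
            alpha *= beta
        return alpha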

  • Slide 19: Descent direction

    [Figure: Devils Tower and its topographic map]

    How to descend in the fastest way? Go in the direction in which the height
    lines are the densest.

    http://en.wikipedia.org/wiki/Image:Devil's_tower.gif

  • Slide 20: Steepest descent

    Directional derivative: how much $f$ changes in the direction $d$
    (negative for a descent direction):

    $f'(x; d) = \langle \nabla f(x), d \rangle$

    Find a unit-length direction minimizing the directional derivative:

    $d = -\dfrac{\nabla f(x)}{\|\nabla f(x)\|}$

  • Slide 21: Steepest descent

    Normalized steepest descent depends on the norm used to measure unit length.

    $L_2$ norm: $d = -\nabla f(x) / \|\nabla f(x)\|_2$.

    $L_1$ norm: coordinate descent (step along the coordinate axis in which
    the descent is maximal).

  • Slide 22: Steepest descent algorithm

    Start with some $x_0$; set $k = 0$. Repeat until convergence:

    Compute the steepest descent direction $d_k = -\nabla f(x_k)$.

    Choose a step size $\alpha_k$ using line search.

    Update the iterate: $x_{k+1} = x_k + \alpha_k d_k$.

    Increment the iteration counter: $k \leftarrow k + 1$.
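
    A sketch of the algorithm, reusing the armijo helper above (illustrative;
    the quadratic test problem at the bottom is invented for the demo):

    import numpy as np

    def steepest_descent(f, grad, x0, tol=1e-6, max_iter=5000):
        """Steepest descent: d_k = -grad f(x_k), step by backtracking line search."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            d = -g                              # steepest descent direction (L2 norm)
            x = x + armijo(f, x, d, g) * d
        return x

    # Example: a simple quadratic f(x) = 0.5 x^T A x - b^T x
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    xmin = steepest_descent(lambda x: 0.5 * x @ A @ x - b @ x,
                            lambda x: A @ x - b, np.zeros(2))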

  • Slide 23: MATLAB intermezzo: steepest descent

  • Slide 24: Condition number

    [Figure: level sets of quadratic functions on $[-1, 1]^2$ with small and large
    condition numbers]

    The condition number is the ratio of the maximal and minimal eigenvalues of
    the Hessian:

    $\kappa = \lambda_{\max}(\nabla^2 f) / \lambda_{\min}(\nabla^2 f)$

    A problem with a large condition number is called ill-conditioned.

    The steepest descent convergence rate is slow for ill-conditioned problems.
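
    The slowdown can be seen on a toy quadratic $f(x) = \tfrac{1}{2} x^\top A x$ with
    $A = \operatorname{diag}(1, \kappa)$ (an invented demo; for a quadratic, exact
    line search has the closed form $\alpha = g^\top g / g^\top A g$):

    import numpy as np

    def iterations_to_converge(kappa, tol=1e-8, max_iter=100000):
        """Steepest descent with exact line search on f(x) = 0.5 x^T A x,
        where A = diag(1, kappa) has condition number kappa."""
        A = np.diag([1.0, kappa])
        x = np.array([1.0, 1.0])
        for k in range(max_iter):
            g = A @ x
            if np.linalg.norm(g) < tol:
                return k
            alpha = (g @ g) / (g @ A @ g)   # exact line search for a quadratic
            x = x - alpha * g
        return max_iter

    print(iterations_to_converge(2.0))     # well-conditioned: a handful of iterations
    print(iterations_to_converge(100.0))   # ill-conditioned: many more iterations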

  • Slide 25: Q-norm

    Measuring length with the Q-norm $\|x\|_Q = \sqrt{x^\top Q x}$ ($Q$ symmetric positive
    definite) amounts to the change of coordinates $\tilde{x} = Q^{1/2} x$:

    Function:           $f(x)$             becomes   $\tilde{f}(\tilde{x}) = f(Q^{-1/2} \tilde{x})$
    Gradient:           $\nabla f(x)$      becomes   $\nabla \tilde{f}(\tilde{x}) = Q^{-1/2} \nabla f(x)$
    Descent direction:  $d = -\nabla f(x)$   becomes   $d = -Q^{-1} \nabla f(x)$

  • Slide 26: Preconditioning

    Using the Q-norm for steepest descent can be regarded as a change of
    coordinates, called preconditioning.

    The preconditioner $Q$ should be chosen to improve the condition number of
    the Hessian in the proximity of the solution.

    In the $\tilde{x} = Q^{1/2} x$ system of coordinates, the Hessian at the solution is

    $Q^{-1/2}\, \nabla^2 f(x^*)\, Q^{-1/2}$

    Choosing $Q = \nabla^2 f(x^*)$ would make it the identity (a dream).

  • Slide 27: Newton method as optimal preconditioner

    The best theoretically possible preconditioner is $Q = \nabla^2 f(x^*)$, giving the
    ideal condition number $\kappa = 1$ and the descent direction

    $d = -\left(\nabla^2 f(x^*)\right)^{-1} \nabla f(x)$

    Problem: the solution $x^*$ is unknown in advance.

    Newton direction: use the Hessian at the current iterate as a preconditioner
    at each iteration,

    $d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k)$

  • Slide 28: Another derivation of the Newton method

    Approximate the function as a quadratic using the second-order Taylor expansion:

    $f(x_k + d) \approx f(x_k) + \langle \nabla f(x_k), d \rangle + \tfrac{1}{2} \langle \nabla^2 f(x_k)\, d, d \rangle$

    (a quadratic function in $d$); minimizing it over $d$ gives the Newton direction.

    Close to the solution the function looks like a quadratic function; the Newton
    method converges fast.

  • Slide 29: Newton method

    Start with some $x_0$; set $k = 0$. Repeat until convergence:

    Compute the Newton direction $d_k = -\left(\nabla^2 f(x_k)\right)^{-1} \nabla f(x_k)$.

    Choose a step size $\alpha_k$ using line search.

    Update the iterate: $x_{k+1} = x_k + \alpha_k d_k$.

    Increment the iteration counter: $k \leftarrow k + 1$.
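
    A hedged sketch of damped Newton in Python/NumPy (reuses the armijo helper
    above; assumes the Hessian is positive definite, so the Newton direction is
    a descent direction):

    import numpy as np

    def newton(f, grad, hess, x0, tol=1e-8, max_iter=100):
        """Newton's method: solve H d = -g for the direction, then line search."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            d = np.linalg.solve(hess(x), -g)   # Newton direction
            x = x + armijo(f, x, d, g) * d     # damped Newton step
        return x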

  • Slide 30: Frozen Hessian

    Observation: close to the optimum, the Hessian does not change significantly.

    Reduce the number of Hessian inversions by keeping the Hessian from previous
    iterations and updating it once in a few iterations.

    Such a method is called Newton with frozen Hessian.

  • Slide 31: Cholesky factorization

    Decompose the Hessian as $\nabla^2 f(x_k) = L L^\top$, where $L$ is a lower triangular matrix.

    Solve the Newton system $\nabla^2 f(x_k)\, d = -\nabla f(x_k)$ in two steps:

    Forward substitution:  $L y = -\nabla f(x_k)$

    Backward substitution: $L^\top d = y$

    Complexity: about $n^3/3$ operations for the factorization plus $O(n^2)$ for the
    substitutions, better than straightforward matrix inversion.

    [Portrait: Andre Louis Cholesky (1875-1918)]
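
    The two-step solve, sketched with SciPy's triangular solvers:

    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def newton_direction_cholesky(H, g):
        """Solve H d = -g with H = L L^T, avoiding an explicit inverse."""
        L = cholesky(H, lower=True)                   # H = L L^T
        y = solve_triangular(L, -g, lower=True)       # forward:  L y = -g
        return solve_triangular(L.T, y, lower=False)  # backward: L^T d = y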

  • Slide 32: Truncated Newton

    Solve the Newton system approximately.

    A few iterations of conjugate gradients or another algorithm for the solution
    of linear systems can be used.

    Such a method is called truncated or inexact Newton.
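
    A sketch of the approximate solve with a small, fixed number of
    conjugate-gradient iterations (SciPy's cg; the cap of 10 iterations is an
    arbitrary illustrative choice):

    import numpy as np
    from scipy.sparse.linalg import cg

    def truncated_newton_direction(H, g, maxiter=10):
        """Approximately solve H d = -g; stopping CG early gives inexact Newton."""
        d, info = cg(H, -g, maxiter=maxiter)  # info > 0 just means CG was truncated
        return d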

  • Slide 33: Non-convex optimization

    [Figure: non-convex function with a local minimum and the global minimum]

    Using convex optimization methods with non-convex functions does not
    guarantee global convergence!

    There is no theoretically guaranteed global optimization, just heuristics:
    good initialization, multiresolution.

  • Slide 34: Iterative majorization

    Construct a majorizing function $g(x, x_k)$ satisfying $g(x_k, x_k) = f(x_k)$.

    Majorizing inequality: $g(x, x_k) \ge f(x)$ for all $x$.

    $g$ is convex or easier to optimize w.r.t. $x$.

  • Slide 35: Iterative majorization

    Start with some $x_0$; set $k = 0$. Repeat until convergence:

    Find $x_{k+1}$ such that $g(x_{k+1}, x_k) \le g(x_k, x_k) = f(x_k)$
    (e.g., $x_{k+1} = \arg\min_x g(x, x_k)$).

    Update the iterate; increment the iteration counter $k \leftarrow k + 1$.

    Solution: $x^* \approx x_k$. A concrete sketch follows below.
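
    One standard concrete choice (an assumption for illustration, not the slides'
    example): if $\nabla f$ is Lipschitz with constant $L$, then
    $g(x, x_k) = f(x_k) + \langle \nabla f(x_k), x - x_k \rangle + \tfrac{L}{2}\|x - x_k\|^2$
    majorizes $f$, and minimizing it yields a gradient step:

    import numpy as np

    def majorize_minimize(grad, lip, x0, n_iter=100):
        """MM with the quadratic majorizer
        g(x, xk) = f(xk) + <grad(xk), x - xk> + (lip/2)*||x - xk||^2,
        whose exact minimizer is the gradient step x = xk - grad(xk)/lip."""
        x = np.asarray(x0, dtype=float)
        for _ in range(n_iter):
            x = x - grad(x) / lip   # argmin_x g(x, x_k)
        return x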


  • Slide 36: Constrained optimization

    [Photo: warning sign reading "MINEFIELD / CLOSED ZONE"]

  • Slide 37: Constrained optimization problems

    Generic constrained minimization problem:

    $\min_{x \in X} f(x)$  subject to  $g_i(x) \le 0$ and $h_j(x) = 0$,

    where $g_i$ are inequality constraints and $h_j$ are equality constraints.

    The subset of the search space in which the constraints hold is called the
    feasible set.

    A point belonging to the feasible set is called a feasible solution.

    A minimizer of the unconstrained problem may be infeasible!

  • Slide 38: An example

    [Figure: feasible set bounded by an inequality constraint and an equality
    constraint]

    An inequality constraint $g_i$ is active at point $x$ if $g_i(x) = 0$, inactive otherwise.

    A point $x$ is regular if the gradients of the equality constraints and of the
    active inequality constraints are linearly independent at $x$.

  • Slide 39: Lagrange multipliers

    Main idea to solve constrained problems: arrange the objective and constraints
    into a single function,

    $L(x, \lambda, \mu) = f(x) + \sum_i \lambda_i g_i(x) + \sum_j \mu_j h_j(x)$,

    and minimize it as an unconstrained problem.

    $L$ is called the Lagrangian; $\lambda_i$ and $\mu_j$ are called Lagrange multipliers.

  • Slide 40: KKT conditions

    If $x^*$ is a regular point and a local minimum, there exist Lagrange multipliers
    $\lambda^*$ and $\mu^*$ such that

    $\nabla f(x^*) + \sum_i \lambda_i^* \nabla g_i(x^*) + \sum_j \mu_j^* \nabla h_j(x^*) = 0$,

    with $\lambda_i^* \ge 0$ for all $i$ and arbitrary $\mu_j^*$ for all $j$,

    such that $\lambda_i^*$ may be positive only for active constraints and is zero for
    inactive constraints (complementary slackness: $\lambda_i^* g_i(x^*) = 0$).

    Known as the Karush-Kuhn-Tucker conditions.

    Necessary but not sufficient!
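
    A tiny invented example to make the conditions concrete: minimize
    $f(x) = x_1^2 + x_2^2$ subject to $h(x) = x_1 + x_2 - 1 = 0$. Stationarity of
    the Lagrangian gives $x^* = (\tfrac{1}{2}, \tfrac{1}{2})$ and $\mu^* = -1$,
    which the snippet verifies:

    import numpy as np

    x = np.array([0.5, 0.5]); mu = -1.0
    grad_f = 2 * x                        # gradient of x1^2 + x2^2
    grad_h = np.array([1.0, 1.0])         # gradient of x1 + x2 - 1
    print(np.allclose(grad_f + mu * grad_h, 0))   # stationarity of the Lagrangian
    print(np.isclose(x.sum() - 1, 0))             # feasibility (equality constraint)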


  • Slide 41: KKT conditions

    Sufficient conditions: if the objective $f$ is convex, the inequality constraints
    $g_i$ are convex, and the equality constraints $h_j$ are affine, and

    $\nabla f(x^*) + \sum_i \lambda_i^* \nabla g_i(x^*) + \sum_j \mu_j^* \nabla h_j(x^*) = 0$,

    with $\lambda_i^* \ge 0$ for all $i$ and arbitrary $\mu_j^*$ for all $j$, where $\lambda_i^*$ may be
    positive only for active constraints and is zero for inactive constraints,

    then $x^*$ is the solution of the constrained problem (the global constrained
    minimizer).

  • Slide 42: Geometric interpretation

    Consider a simpler problem with a single equality constraint:

    $\min_x f(x)$  subject to  $h(x) = 0$

    The gradients of the objective and the constraint must line up at the
    solution: $\nabla f(x^*) = -\mu^* \nabla h(x^*)$.

  • Slide 43: Penalty methods

    Define a penalty aggregate

    $P(x; \rho) = f(x) + \sum_i \varphi_\rho(g_i(x)) + \sum_j \psi_\rho(h_j(x))$,

    where $\varphi_\rho$ and $\psi_\rho$ are parametric penalty functions.

    For larger values of the parameter $\rho$, the penalty on the constraint
    violation is stronger.

  • Slide 44: Penalty methods

    [Figure: shapes of the inequality penalty $\varphi_\rho$ and the equality
    penalty $\psi_\rho$ for increasing values of $\rho$]

  • Slide 45: Penalty methods

    Start with some $x_0$ and an initial value of $\rho_0$; set $k = 0$. Repeat until
    convergence:

    Find $x_{k+1} = \arg\min_x P(x; \rho_k)$ by solving an unconstrained optimization
    problem initialized with $x_k$.

    Set $\rho_{k+1} > \rho_k$ (strengthen the penalty).

    Increment the iteration counter: $k \leftarrow k + 1$.

    Solution: $x^* \approx x_k$.
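
    A minimal sketch of this loop on the equality-constrained toy problem from
    the KKT slide, with a quadratic penalty $P(x; \rho) = f(x) + \rho\, h(x)^2$
    (a standard penalty assumed here, not taken from the slides; reuses the
    steepest_descent helper from above):

    import numpy as np

    f = lambda x: x @ x                   # objective x1^2 + x2^2
    h = lambda x: x[0] + x[1] - 1.0       # equality constraint h(x) = 0

    def penalty_method(x0, rho=1.0, rho_growth=10.0, n_outer=6):
        """Minimize P(x; rho) = f(x) + rho*h(x)^2 for increasing rho,
        warm-starting each unconstrained solve at the previous iterate."""
        x = np.asarray(x0, dtype=float)
        for _ in range(n_outer):
            P = lambda x, r=rho: f(x) + r * h(x) ** 2
            gP = lambda x, r=rho: 2 * x + 2 * r * h(x) * np.array([1.0, 1.0])
            x = steepest_descent(P, gP, x)    # unconstrained solve, warm start
            rho *= rho_growth                 # strengthen the penalty
        return x

    print(penalty_method(np.zeros(2)))        # approaches the solution (0.5, 0.5)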