
Introduction to introduction to introduction to … Optimization

Feb 20, 2016

Transcript
Page 1: Introduction to introduction to introduction to … Optimization

… since the fabric of the universe is most perfect and the work of a most wise Creator, nothing at all takes place in the universe in which some rule of maximum or minimum does not appear.

Introduction to introduction to introduction to …

Optimization

Leonhard Euler

Page 2

[Figure: a joke plot of Boredom and Understanding against lecture time, from 1 min to 1 hour]

Optimal listening time for a talk: 8 minutes 25 seconds *

Page 3

[Figure: height vs. time plot of a trajectory]

Action at a point := Kinetic Energy – Potential Energy.
Action for a path := Integrate action at points over the path.

Nature chooses the path of “least” action! Pierre Louis Moreau de Maupertuis

Page 4

A + BC → AB + C

Reference: ‘Holy grails of Chemistry’, Pushpendu Kumar Das. Acknowledgement: Deepika Viswanathan, PhD student, IPC, IISc

Page 5
Page 6

The path taken between two points by a ray of light is the path that can be traversed in the least time.

Fermat

For all thermodynamic processes between the same initial and final state, the delivery of work is a maximum for a reversible process

Gibbs

Among competing hypotheses, the hypothesis with the fewest assumptions should be selected.

William of Ockham

Page 7

Travelling Salesman Problem (TSP)

Courtesy: xkcd

Page 8

• A hungry cow is at position (2,2) in an open field.
• It is tied to a rope that is 1 unit long.
• Grass is at position (4,3).
• A vertical electric fence (the line x = 2.5) passes through the point (2.5,2).
• How close can the cow get to the fodder?

[Figure: 5×5 grid showing the cow at (2,2), the grass at (4,3), and the fence at x = 2.5]

• What do we want to find?
Position of cow: let (x,y) be the solution.

• What do we want the solution to satisfy?
Grass: min (x-4)^2 + (y-3)^2

• What restrictions does the cow have?
Rope: (x-2)^2 + (y-2)^2 <= 1
Fence: x <= 2.5

Page 9

Variables: (x,y) (position of cow)

Objective : (x-4)^2 + (y-3)^2 (distance from grass)

Constraints:

(x-2)^2 + (y-2)^2 <= 1 (rope)
x <= 2.5 (fence)

Framework

minimize/maximize Objective (a function of Variables) subject to Constraints (functions of Variables)
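The framework can be written down directly as code. A minimal Python sketch of the cow problem (the function names `objective` and `feasible` are our own, not from the slides):

```python
def objective(x, y):
    # Squared distance from the cow at (x, y) to the grass at (4, 3)
    return (x - 4) ** 2 + (y - 3) ** 2

def feasible(x, y):
    # Rope: the cow must stay within 1 unit of its anchor at (2, 2)
    rope_ok = (x - 2) ** 2 + (y - 2) ** 2 <= 1
    # Fence: the cow cannot cross the vertical fence at x = 2.5
    fence_ok = x <= 2.5
    return rope_ok and fence_ok

# The starting position (2, 2) satisfies both constraints
print(feasible(2, 2), objective(2, 2))  # → True 5
```

Any candidate position can now be checked against the rope and fence constraints before its objective values are compared.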

Page 10

How? Unconstrained case:

min (x-4)^2 + (y-3)^2

- Cow starts at (2,2).
- Does not know where the grass is; knows only its distance from the grass.
- Needs a ‘good’ direction to move from the current point.

[Figure: the same 5×5 grid with level sets of the objective centred on the grass at (4,3)]

The key question in optimization is ‘What is a good direction?’

Page 11

Y = X^2+2

[Figure: plot of Y = X^2 + 2 with the points Cur_X = 0.8 and Cur_X = -0.5 marked]

In general:

New_X = Cur_X + d

If Cur_X > 0, want d < 0; if Cur_X < 0, want d > 0.

How can you choose d? Derivative!

Page 12

Y = X^2+2


New_X = Cur_X + d

If Cur_X > 0, want d < 0; if Cur_X < 0, want d > 0.

Derivative at Cur_X: 2(Cur_X)

Negative of the derivative does the trick !

Page 13

Y = X^2+2


Example: Cur_X = 0.5
d = Negative derivative(Cur_X) = -2(0.5) = -1
New_X = Cur_X + d = 0.5 - 1 = -0.5
Update: Cur_X = New_X = -0.5

d = Negative derivative(Cur_X) = -2(-0.5) = 1
New_X = Cur_X + d = -0.5 + 1 = 0.5

What was the problem? “Step Size”

Think: How should you modify step size at every step to avoid this problem?
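The oscillation is easy to reproduce in a few lines; a sketch, with step sizes chosen purely for illustration:

```python
def derivative(x):
    # d/dx of Y = X^2 + 2
    return 2 * x

def gradient_descent(start, step_size, iterations):
    cur_x = start
    for _ in range(iterations):
        # New_X = Cur_X + d, with d = -step_size * derivative(Cur_X)
        cur_x = cur_x - step_size * derivative(cur_x)
    return cur_x

# Step size 1 overshoots: 0.5 -> -0.5 -> 0.5 -> ... forever
print(gradient_descent(0.5, 1.0, 10))   # → 0.5
# Step size 0.1 takes smaller steps and converges towards the minimum at 0
print(gradient_descent(0.5, 0.1, 100))
```

Shrinking the step size (or decaying it over iterations) removes the oscillation at the cost of more steps.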

Page 14

[Figure: the 5×5 grid showing negative-gradient directions at points such as (2,2) and (1,5)]

Objective : (x-4)^2 + (y-3)^2


Algorithm: Gradient descent
1. Start at any position Cur.
2. Find gradient at Cur.
3. Cur = Cur – (stepSize)*gradient
4. Repeat.

Gradient is the generalization of derivative to higher dimensions

Negative gradient at (2,2) = (4,2). Points towards grass!
Negative gradient at (1,5) = (6,-4). Points towards grass!
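The four steps above fit in a few lines of Python; a minimal sketch for the cow's objective (the step size and iteration count are illustrative choices):

```python
def gradient(x, y):
    # Gradient of the objective (x-4)^2 + (y-3)^2
    return (2 * (x - 4), 2 * (y - 3))

def gradient_descent(start, step_size=0.1, iterations=100):
    cur_x, cur_y = start
    for _ in range(iterations):
        gx, gy = gradient(cur_x, cur_y)
        cur_x -= step_size * gx   # Cur = Cur - stepSize * gradient
        cur_y -= step_size * gy
    return cur_x, cur_y

# Starting from the cow's position (2, 2), the iterates head for the grass
x, y = gradient_descent((2, 2))
print(round(x, 4), round(y, 4))  # → 4.0 3.0
```

Note this is the unconstrained version: it walks straight to the grass at (4,3) and ignores the rope and the fence.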

Page 15

• Gradient descent is the simplest unconstrained optimization procedure. Easy to implement.

• If stepSize is chosen properly, it will provably converge to a local minimum

Think: Why doesn’t the gradient descent algorithm always converge to a global minimum?

Think: How to modify the algorithm to find a local maximum?

• A host of other methods pick the ‘direction’ differently.

Think: Can you come up with a method that picks a different direction than just the negative gradient?

Gradient descent - summary

Page 16

Constrained Optimization

[Figure: the 5×5 grid with the rope circle around (2,2) and the fence at x = 2.5]

Real world problems are rarely unconstrained!

We need to understand gradients better to understand how to solve them.

Page 17

Let us begin with the Taylor series expansion of a function.

For small enough d, we have f(x + d) ≈ f(x) + f'(x) d.

What should the value of d be such that f(x + d) is as small as possible? The first-order change f'(x) d is most negative when d points along -f'(x).

The negative derivative is the direction of maximum descent.

Important: Any direction d such that f'(x) d < 0 is a descent direction!

Functions of one variable

Page 18

Functions of many variables

Any direction d such that ∇f(x)·d < 0 is a descent direction.
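The condition is easy to verify numerically. A small sketch on the cow's objective, with an arbitrary direction d that is not the negative gradient (the direction and step are our choices for illustration):

```python
def f(x, y):
    return (x - 4) ** 2 + (y - 3) ** 2

def gradient(x, y):
    return (2 * (x - 4), 2 * (y - 3))

x, y = 2.0, 2.0
gx, gy = gradient(x, y)          # (-4.0, -2.0)
d = (1.0, 0.0)                   # not the negative gradient, but still...
dot = gx * d[0] + gy * d[1]      # gradient · d = -4 < 0, so d is a descent direction
eps = 0.01
decreased = f(x + eps * d[0], y + eps * d[1]) < f(x, y)
print(dot, decreased)  # → -4.0 True
```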

Page 19

Constrained Optimization

Minimize f(x)

Such that g(x) = 0

Say somebody gives you a point x* and claims it is the solution to this problem.

What properties should this point satisfy?

- Must be feasible: g(x*) = 0
- There must NOT be a descent direction that is also feasible!

Given a point x,

Descent direction: Any direction which will lead to a point x’ such that f(x’) < f(x)

Feasible direction: Any direction which will lead to a point x’ such that g(x’) = 0

Page 20

There must NOT be a descent direction that is also feasible!

Minimize f(x)

Such that g(x) = 0

Page 21

Minimize f(x)

Such that g(x) = 0

Minimize f(x) + λ g(x) over (x, λ)

Constrained Optimization Problem → Unconstrained Optimization Problem

λ is called the Lagrange multiplier.
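As a concrete sketch (this specific instance is our choice, not from the slides), take the cow's objective f = (x-4)^2 + (y-3)^2 with the fence as an equality constraint g(x,y) = x - 2.5 = 0. Setting the gradient of f + λg to zero gives 2(x-4) + λ = 0 and 2(y-3) = 0, which together with g = 0 yield x = 2.5, y = 3, λ = 3. A quick numerical check:

```python
def grad_lagrangian(x, y, lam):
    # Gradient of f(x, y) + lam * g(x, y) with respect to (x, y, lam),
    # where f = (x-4)^2 + (y-3)^2 and g = x - 2.5
    return (2 * (x - 4) + lam, 2 * (y - 3), x - 2.5)

# The candidate (x, y, lam) = (2.5, 3, 3) makes all three components vanish:
# stationary in x and y, and feasible (g = 0)
print(grad_lagrangian(2.5, 3.0, 3.0))  # → (0.0, 0.0, 0.0)
```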

Page 22

What we did not cover

• Constrained optimization with multiple constraints

• Constrained optimization with inequality constraints

• Karush-Kuhn-Tucker (KKT) conditions

• Linear Programs
• Convex optimization
• Duality theory

• etc etc etc

Page 23

Summary

• Optimization is a very useful branch of applied mathematics

• Very well studied, yet there are numerous problems to work on

• If interested, we can talk more

Thank you !