Lesson 25: Unconstrained Optimization I

Lesson 25 (Chapter 17): Unconstrained Optimization I

Math 20

November 19, 2007

Announcements

- Problem Set 9 is on the website. Due November 21.

- There will be class on November 21, and homework is due November 28.

- Next office hours: Monday 1-2pm, Tuesday 3-4pm.

- Midterm II: Thursday, 12/6, 7-8:30pm in Hall A.

Outline

- Single-variable recollections

- From one to two dimensions
  - Critical points
  - The Hessian

- The second derivative test

- More examples
  - The discriminating monopolist

Maximum and Minimum Value in single-variable calculus

Theorem (Fermat’s Theorem)

Let $f$ be a function of one variable. If $f$ has a local maximum or minimum at $a$ and $f'(a)$ exists, then $f'(a) = 0$.

Theorem (Theorem 9.2, a/k/a The Second Derivative Test)

Let $f$ be a function of one variable, and suppose $f'(a) = 0$.

- If $f''(a) > 0$, then $f$ has a local minimum at $a$.

- If $f''(a) < 0$, then $f$ has a local maximum at $a$.

- If $f''(a) = 0$, this theorem has nothing to say.
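As a small illustration of the test, the following Python sketch classifies the critical points of $f(x) = x^3 - 3x$. The function and the helper name `classify_1d` are illustrative assumptions, not taken from the lecture.

```python
def classify_1d(f2_at_a):
    """Second derivative test at a point a where f'(a) = 0."""
    if f2_at_a > 0:
        return "local minimum"
    if f2_at_a < 0:
        return "local maximum"
    return "inconclusive"  # f''(a) = 0: the test has nothing to say


# Illustrative function (an assumption, not from the lecture): f(x) = x^3 - 3x
def f_prime(x):
    return 3 * x**2 - 3


def f_double_prime(x):
    return 6 * x


# f'(x) = 0 exactly at x = -1 and x = 1
results = {a: classify_1d(f_double_prime(a)) for a in (-1, 1)}
print(results)
```

Here $f''(-1) = -6 < 0$ and $f''(1) = 6 > 0$, so the sketch reports a local maximum at $-1$ and a local minimum at $1$.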

Justification of 2DT

Using Taylor's Theorem,

$$f(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + R(x),$$

where $\frac{R(x)}{(x - a)^2} \to 0$ as $x \to a$. (See Sections 5.5 and 7.4.) Since $f'(a) = 0$ at a critical point, near $a$ the function $f(x)$ "looks like" a parabola with vertex at $(a, f(a))$. The sign of $f''(a)$ is what determines whether this parabola opens up or down.
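The claim that the remainder is small compared to $(x - a)^2$ can be checked numerically. The sketch below uses $f(x) = \cos x$ at $a = 0$, an illustrative choice that is not from the lecture, and prints $R(x)/(x - a)^2$ for shrinking step sizes.

```python
import math

a = 0.0
# Illustrative function (an assumption): f(x) = cos(x),
# with f(a) = 1, f'(a) = 0, f''(a) = -1 at a = 0.


def remainder_ratio(h):
    """Return R(a + h) / h^2, where R is f minus its second-order Taylor polynomial at a."""
    x = a + h
    taylor2 = 1.0 + 0.0 * (x - a) + 0.5 * (-1.0) * (x - a) ** 2
    return (math.cos(x) - taylor2) / h**2


ratios = [remainder_ratio(h) for h in (0.1, 0.01, 0.001)]
print(ratios)  # the ratios shrink toward 0 as h shrinks
```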

How do we generalize this to functions of two variables?

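The first derivative $f'(x)$ is replaced by the gradient, written as a row vector $Df$ or a column vector $\nabla f$:

$$Df = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix}, \qquad \nabla f = \begin{pmatrix} \dfrac{\partial f}{\partial x} \\[1ex] \dfrac{\partial f}{\partial y} \end{pmatrix}$$

Let $f(x, y)$ be a function of two variables. If $f$ has a local maximum or minimum at $(a, b)$, and is differentiable at $(a, b)$, then

$$\frac{\partial f}{\partial x}(a, b) = 0, \qquad \frac{\partial f}{\partial y}(a, b) = 0.$$

As in one variable, we'll call these points critical points.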


Example

Let $f(x, y) = 8x^3 - 24xy + y^3$. Find the critical points of $f$.

Solution

We have

$$\frac{\partial f}{\partial x} = 24x^2 - 24y = 24(x^2 - y), \qquad \frac{\partial f}{\partial y} = -24x + 3y^2 = -3(8x - y^2).$$

Both of these are zero if $x^2 - y = 0$ and $8x - y^2 = 0$. Substituting $y = x^2$ from the first into the second gives

$$0 = (x^2)^2 - 8x = x^4 - 8x = x(x^3 - 8) = x(x - 2)(x^2 + 2x + 4),$$

and the solutions are $x = 0$ and $x = 2$ (the third factor has no real roots). If $x = 0$ then $y = 0$, and if $x = 2$ then $y = 4$. So the critical points are $(0, 0)$ and $(2, 4)$.
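A quick way to double-check the answer is to evaluate the gradient at the two points. The sketch below does this in plain Python; the helper name `grad_f` is an illustrative assumption, and the partial derivatives are entered by hand.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = 8x^3 - 24xy + y^3, computed by hand."""
    df_dx = 24 * x**2 - 24 * y
    df_dy = -24 * x + 3 * y**2
    return (df_dx, df_dy)


critical_points = [(0, 0), (2, 4)]
gradients = [grad_f(x, y) for x, y in critical_points]
print(gradients)     # both partial derivatives vanish at each point
print(grad_f(1, 1))  # a point that is not critical, for contrast
```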


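The second derivative $f''(x)$ is replaced by a matrix, the Hessian of $f$:

$$Hf = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[1ex] \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix}$$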


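Compare and contrast the Hessians at $(0, 0)$ for these functions:

(i) $f(x, y) = x^2 + y^2$

(ii) $f(x, y) = 1 - x^2 - y^2$

(iii) $f(x, y) = x^2 - y^2$

(iv) $f(x, y) = xy$

How are they alike and how are they different?

(i) $Hf = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$

(ii) $Hf = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$

(iii) $Hf = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$

(iv) $Hf = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$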
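One possible way to organize the comparison is to tabulate symmetry, determinant, and top-left entry for each matrix. The sketch below does this in plain Python; the helper names `det2` and `is_symmetric` are illustrative assumptions.

```python
# Hessians at (0, 0) of the four functions above, as nested lists
hessians = {
    "x^2 + y^2": [[2, 0], [0, 2]],
    "1 - x^2 - y^2": [[-2, 0], [0, -2]],
    "x^2 - y^2": [[2, 0], [0, -2]],
    "xy": [[0, 1], [1, 0]],
}


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_symmetric(m):
    """True when the two off-diagonal entries agree."""
    return m[0][1] == m[1][0]


determinants = {name: det2(m) for name, m in hessians.items()}
for name, m in hessians.items():
    print(name, "symmetric:", is_symmetric(m), "det:", det2(m), "top-left:", m[0][0])
```

All four matrices are symmetric. The first two have positive determinant and differ in the sign of the top-left entry, while the last two have negative determinant.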

Second order Taylor polynomials in two dimensions

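The two-variable analog of

$$f(x) \approx f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2$$

is

$$\begin{aligned} f(x, y) \approx{} & f(a, b) + f'_x(a, b)(x - a) + f'_y(a, b)(y - b) \\ & + \tfrac{1}{2} f''_{xx}(a, b)(x - a)^2 + f''_{xy}(a, b)(x - a)(y - b) \\ & + \tfrac{1}{2} f''_{yy}(a, b)(y - b)^2 \end{aligned}$$

or, in vector form,

$$f(\mathbf{x}) \approx f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \tfrac{1}{2} (\mathbf{x} - \mathbf{a}) \cdot H(\mathbf{a})(\mathbf{x} - \mathbf{a}).$$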
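To see the approximation at work, the sketch below compares $f(x, y) = 8x^3 - 24xy + y^3$ with its second-order Taylor polynomial at $(2, 4)$. The helper name `taylor2` and the sample points are illustrative assumptions, and the partial derivatives are entered by hand.

```python
def f(x, y):
    return 8 * x**3 - 24 * x * y + y**3


def taylor2(x, y, a=2.0, b=4.0):
    """Second-order Taylor polynomial of f at (a, b), with hand-computed derivatives."""
    fx = 24 * a**2 - 24 * b       # first partial in x
    fy = -24 * a + 3 * b**2       # first partial in y
    fxx, fxy, fyy = 48 * a, -24.0, 6 * b  # second partials
    dx, dy = x - a, y - b
    return (f(a, b) + fx * dx + fy * dy
            + 0.5 * fxx * dx**2 + fxy * dx * dy + 0.5 * fyy * dy**2)


for step in (0.1, 0.01):
    x, y = 2.0 + step, 4.0 - step
    print(step, f(x, y), taylor2(x, y), abs(f(x, y) - taylor2(x, y)))
```

Because $f$ is a cubic polynomial, the error is exactly the third-order part $8(x - 2)^3 + (y - 4)^3$, which shrinks much faster than the quadratic terms.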

Recall

This was the big fact about quadratic forms in two variables:

Fact

Let $f(x, y) = ax^2 + 2bxy + cy^2$ be a quadratic form.

- If $a > 0$ and $ac - b^2 > 0$, then $f$ is positive definite.

- If $a < 0$ and $ac - b^2 > 0$, then $f$ is negative definite.

- If $ac - b^2 < 0$, then $f$ is indefinite.
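The fact can be spot-checked by sampling a quadratic form on the unit circle and looking at the signs of its values. The sketch below does this for three forms whose coefficients are chosen purely for illustration; the helper name `form_values` is likewise an assumption.

```python
import math


def form_values(a, b, c, samples=360):
    """Values of a x^2 + 2 b x y + c y^2 at evenly spaced points of the unit circle."""
    values = []
    for k in range(samples):
        t = 2 * math.pi * k / samples
        x, y = math.cos(t), math.sin(t)
        values.append(a * x**2 + 2 * b * x * y + c * y**2)
    return values


pos = form_values(2, 1, 3)    # a > 0 and ac - b^2 = 5 > 0
neg = form_values(-2, 1, -3)  # a < 0 and ac - b^2 = 5 > 0
ind = form_values(1, 2, 1)    # ac - b^2 = -3 < 0

print(min(pos) > 0, max(neg) < 0, min(ind) < 0 < max(ind))
```

Sampling is only a sanity check, not a proof, but the sign patterns match the three cases of the fact.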

Theorem (The Second Derivative Test)

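Let $f(x, y)$ be a function of two variables, and let $(a, b)$ be a critical point of $f$. Then

- If $\dfrac{\partial^2 f}{\partial x^2} \dfrac{\partial^2 f}{\partial y^2} - \left( \dfrac{\partial^2 f}{\partial x\,\partial y} \right)^2 > 0$ and $\dfrac{\partial^2 f}{\partial x^2} > 0$, the critical point is a local minimum.

- If $\dfrac{\partial^2 f}{\partial x^2} \dfrac{\partial^2 f}{\partial y^2} - \left( \dfrac{\partial^2 f}{\partial x\,\partial y} \right)^2 > 0$ and $\dfrac{\partial^2 f}{\partial x^2} < 0$, the critical point is a local maximum.

- If $\dfrac{\partial^2 f}{\partial x^2} \dfrac{\partial^2 f}{\partial y^2} - \left( \dfrac{\partial^2 f}{\partial x\,\partial y} \right)^2 < 0$, the critical point is a saddle point.

All derivatives are evaluated at the critical point $(a, b)$.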
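The theorem translates directly into a small decision procedure. The sketch below is one way to write it in Python; the function name `second_derivative_test` is an illustrative assumption, and the inputs are the second partials already evaluated at the critical point.

```python
def second_derivative_test(fxx, fxy, fyy):
    """Classify a critical point from second partials evaluated at that point."""
    det = fxx * fyy - fxy**2
    if det > 0 and fxx > 0:
        return "local minimum"
    if det > 0 and fxx < 0:
        return "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"  # det = 0: the theorem does not apply


# Second partials at (0, 0) for four simple functions
print(second_derivative_test(2, 0, 2))    # x^2 + y^2
print(second_derivative_test(-2, 0, -2))  # 1 - x^2 - y^2
print(second_derivative_test(2, 0, -2))   # x^2 - y^2
print(second_derivative_test(0, 1, 0))    # xy
```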

Return to the example

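Let $f(x, y) = 8x^3 - 24xy + y^3$. Classify the critical points.

$$\frac{\partial^2 f}{\partial x^2} = 48x, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = -24, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = -24, \qquad \frac{\partial^2 f}{\partial y^2} = 6y$$

- $Hf(0, 0) = \begin{pmatrix} 0 & -24 \\ -24 & 0 \end{pmatrix}$, which has negative determinant. Hence $(0, 0)$ is a saddle point.

- $Hf(2, 4) = 24 \begin{pmatrix} 4 & -1 \\ -1 & 1 \end{pmatrix}$, which, since the determinant is positive and the top left entry is positive, indicates a local minimum.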
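For reference, the two determinants used in this classification work out as follows (a worked computation from the second partials of $f$ at each point):

```latex
\begin{aligned}
\det Hf(0, 0) &= (0)(0) - (-24)^2 = -576 < 0, \\
\det Hf(2, 4) &= (96)(24) - (-24)^2 = 2304 - 576 = 1728 > 0,
\qquad \frac{\partial^2 f}{\partial x^2}(2, 4) = 96 > 0.
\end{aligned}
```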

Plotting the function

[Figure: surface plot and contour plot of f(x, y) = 8x^3 − 24xy + y^3]

Online Demo

Try this site (thanks to Tony Pino):

http://www.slu.edu/classes/maymk/banchoff/LevelCurve.html

Launch the applet and enter:

- f (x , y) = x^3 - 3 * x * y + y^3/8 (1/8 of f from the example)

- x from −1 to 10 in 50 steps

- y from −1 to 10 in 50 steps

- z from −10 to 10 in 50 steps

Remarks

- The Hessian matrix will always be symmetric in our cases.

- If the Hessian has determinant zero, nothing can be said from this theorem:
  - $f(x, y) = x^4 + y^4$ has a local min at $(0, 0)$
  - $f(x, y) = -x^4 - y^4$ has a local max at $(0, 0)$
  - $f(x, y) = x^4 - y^4$ has a saddle point at $(0, 0)$

  In each case $Hf(x, y) = \begin{pmatrix} \pm 12x^2 & 0 \\ 0 & \pm 12y^2 \end{pmatrix}$, so $Hf(0, 0)$ is the zero matrix.
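The three behaviors can be seen by evaluating each function at a few points near the origin, where all three equal 0. The sketch below samples only four points along the axes, with a step size `h` chosen for illustration, so it is a sanity check rather than a proof.

```python
functions = {
    "x^4 + y^4": lambda x, y: x**4 + y**4,
    "-x^4 - y^4": lambda x, y: -x**4 - y**4,
    "x^4 - y^4": lambda x, y: x**4 - y**4,
}

h = 0.1
# Four nearby points along the axes; each function is 0 at (0, 0)
nearby = [(h, 0.0), (-h, 0.0), (0.0, h), (0.0, -h)]

value_range = {}
for name, g in functions.items():
    values = [g(x, y) for x, y in nearby]
    value_range[name] = (min(values), max(values))
    print(name, value_range[name])
```

The first function is nonnegative at the sampled points, the second is nonpositive, and the third takes both signs, even though all three have the same zero Hessian at the origin.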

Example

A firm sells a product in two separate areas with distinct linear demand curves, and has monopoly power to decide how much to sell in each area. How does its maximal profit depend on the demand in each area?

Let the demand curves be given by

$$P_1 = a_1 - b_1 Q_1, \qquad P_2 = a_2 - b_2 Q_2$$

and the cost function by $C = \alpha(Q_1 + Q_2)$. The profit is therefore

$$\begin{aligned} \pi &= P_1 Q_1 + P_2 Q_2 - \alpha(Q_1 + Q_2) \\ &= (a_1 - b_1 Q_1) Q_1 + (a_2 - b_2 Q_2) Q_2 - \alpha(Q_1 + Q_2) \\ &= (a_1 - \alpha) Q_1 - b_1 Q_1^2 + (a_2 - \alpha) Q_2 - b_2 Q_2^2 \end{aligned}$$

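$$\pi(Q_1, Q_2) = (a_1 - \alpha) Q_1 - b_1 Q_1^2 + (a_2 - \alpha) Q_2 - b_2 Q_2^2$$

Solution

We have

$$\frac{\partial \pi}{\partial Q_1} = a_1 - \alpha - 2 b_1 Q_1, \qquad \frac{\partial \pi}{\partial Q_2} = a_2 - \alpha - 2 b_2 Q_2.$$

So

$$Q_1^* = \frac{a_1 - \alpha}{2 b_1}, \qquad Q_2^* = \frac{a_2 - \alpha}{2 b_2}$$

is the critical point. Also,

$$H\pi = \begin{pmatrix} -2 b_1 & 0 \\ 0 & -2 b_2 \end{pmatrix}.$$

For downward-sloping demand curves ($b_1, b_2 > 0$), the determinant $4 b_1 b_2$ is positive and the top left entry $-2 b_1$ is negative. So the critical point $(Q_1^*, Q_2^*)$ is a local maximum.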
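To connect the solution back to the original question, the sketch below plugs in hypothetical parameter values (chosen only for illustration, not from the lecture) and evaluates the profit at the critical point. Substituting $Q_i^* = (a_i - \alpha)/(2 b_i)$ into $\pi$ gives the closed form $\frac{(a_1 - \alpha)^2}{4 b_1} + \frac{(a_2 - \alpha)^2}{4 b_2}$, which the code computes alongside the direct evaluation.

```python
# Hypothetical parameter values, chosen only for illustration
a1, b1 = 10.0, 1.0  # demand in area 1: P1 = a1 - b1*Q1
a2, b2 = 8.0, 2.0   # demand in area 2: P2 = a2 - b2*Q2
alpha = 2.0         # constant marginal cost


def profit(q1, q2):
    return (a1 - alpha) * q1 - b1 * q1**2 + (a2 - alpha) * q2 - b2 * q2**2


# Critical point from setting both partial derivatives to zero
q1_star = (a1 - alpha) / (2 * b1)
q2_star = (a2 - alpha) / (2 * b2)
max_profit = profit(q1_star, q2_star)

# Closed form obtained by substituting the critical point into the profit
closed_form = (a1 - alpha) ** 2 / (4 * b1) + (a2 - alpha) ** 2 / (4 * b2)

print(q1_star, q2_star, max_profit, closed_form)
```

In the closed form, each area contributes a term that grows with the gap $a_i - \alpha$ between the demand intercept and the marginal cost, and shrinks as the slope $b_i$ grows.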

Example

Find the critical points of $f(x, y) = \dfrac{x}{x^2 + y^2 + 1}$ and classify them.

Solution

The derivatives are

$$f'_x = \frac{1 - x^2 + y^2}{(1 + x^2 + y^2)^2}, \qquad f'_y = -\frac{2xy}{(1 + x^2 + y^2)^2}.$$

The only way these can both be zero is if $y = 0$ and $x = \pm 1$. The second derivatives are

$$f''_{xx} = \frac{2x\left(x^2 - 3(y^2 + 1)\right)}{(x^2 + y^2 + 1)^3}$$

$$f''_{xy} = \frac{2y\left(3x^2 - y^2 - 1\right)}{(x^2 + y^2 + 1)^3}$$

$$f''_{yy} = -\frac{2\left(x^3 - 3y^2 x + x\right)}{(x^2 + y^2 + 1)^3}$$

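So

$$Hf(1, 0) = \begin{pmatrix} -\frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix}, \qquad Hf(-1, 0) = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}.$$

So we have a local max at $(1, 0)$ and a local min at $(-1, 0)$.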
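As a numerical sanity check of this classification, the sketch below compares the value of $f$ at each critical point with its values at a few neighboring points. The offsets of $\pm 0.1$ are an arbitrary illustrative choice.

```python
def f(x, y):
    return x / (x**2 + y**2 + 1)


# Eight neighbors around a point, at offsets of 0.1 in each direction
offsets = [(dx, dy) for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)
           if (dx, dy) != (0.0, 0.0)]

near_max = [f(1 + dx, dy) for dx, dy in offsets]
near_min = [f(-1 + dx, dy) for dx, dy in offsets]

print(f(1, 0), max(near_max))   # value at (1, 0) versus its largest neighbor
print(f(-1, 0), min(near_min))  # value at (-1, 0) versus its smallest neighbor
```

Every sampled neighbor of $(1, 0)$ lies below $f(1, 0) = \tfrac{1}{2}$, and every sampled neighbor of $(-1, 0)$ lies above $f(-1, 0) = -\tfrac{1}{2}$.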


[Figure: surface plot and contour plot of f(x, y) = x/(x^2 + y^2 + 1)]

Technology