Math 320 – Exam I. Friday Feb 13, 09:55-10:45

Answers

I. (60 points.) (a) Find x = x(t) if dx/dt + x/(1 + t²) = 0 and x(0) = x0.

Answer: By separation of variables

ln|x| = ∫ dx/x = −∫ dt/(1 + t²) = −tan⁻¹(t) + C

so (as |x| = ±x) we get

x = ±exp(−tan⁻¹(t) + C) = c exp(−tan⁻¹(t)),   c = ±e^C.

As tan⁻¹(0) = 0, the initial condition gives x0 = c, so

x = x0 exp(−tan⁻¹(t)).

(b) Find y = y(t) if dy/dt + y/(1 + t²) = exp(−tan⁻¹ t) and y(0) = y0.

Answer: This is an inhomogeneous linear equation, and the general solution

x(t) = cΦ(t),   Φ(t) := exp(−tan⁻¹ t)

of the corresponding homogeneous linear equation was found in part (a). We use as Ansatz

y(t) = c(t)Φ(t).

Then

dy/dt + y/(1 + t²) = (c′(t)Φ(t) + c(t)Φ′(t)) + c(t)Φ(t)/(1 + t²)
                   = c′(t)Φ(t) + c(t)(Φ′(t) + Φ(t)/(1 + t²))
                   = c′(t)Φ(t)
                   = exp(−tan⁻¹ t)   if c′(t) = 1,

i.e. if c(t) = t + constant. Evaluating at t = 0 shows that the constant must be y0, so c(t) = t + y0 and the solution is

y(t) = (t + y0) exp(−tan⁻¹ t).
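As a quick check, here is a minimal SymPy sketch (assuming SymPy is available; the symbol names are illustrative only) that reproduces both answers:

# Illustrative SymPy check of Problem I (not part of the original exam solution).
import sympy as sp

t, x0, y0 = sp.symbols('t x0 y0')
x, y = sp.Function('x'), sp.Function('y')

# (a)  dx/dt + x/(1+t^2) = 0,  x(0) = x0
sol_a = sp.dsolve(sp.Eq(x(t).diff(t) + x(t)/(1 + t**2), 0), x(t), ics={x(0): x0})
print(sol_a.rhs)                      # expected: x0*exp(-atan(t))

# (b)  dy/dt + y/(1+t^2) = exp(-atan(t)),  y(0) = y0
sol_b = sp.dsolve(sp.Eq(y(t).diff(t) + y(t)/(1 + t**2), sp.exp(-sp.atan(t))), y(t), ics={y(0): y0})
print(sp.simplify(sol_b.rhs))         # expected: (t + y0)*exp(-atan(t))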
II. (40 points.) (a) State the Existence and Uniqueness Theorem for Ordinary Differential Equations.

Answer: If f(x, y) has continuous partial derivatives, then the initial value problem

dy/dx = f(x, y),   y(x0) = y0

has a unique solution y = y(x).

(b) State the Exactness Criterion for the differential equation

M(x, y) dx + N(x, y) dy = 0.

Answer: If M(x, y) and N(x, y) have continuous partial derivatives, then there is a function F = F(x, y) solving the equations

∂F/∂x = M,   ∂F/∂y = N

if and only if

∂M/∂y = ∂N/∂x.

(c) Does the differential equation

(1 + x² + x⁴y⁶) dx + (1 + x² + y²) dy = 0

have a solution y = y(x) satisfying the initial condition y(0) = 5? (Explain your answer.)

Answer: Yes. The differential equation can be written as

dy/dx = f(x, y),   f(x, y) = −(1 + x² + x⁴y⁶)/(1 + x² + y²),

so the Existence and Uniqueness Theorem applies. (The Exactness Criterion is irrelevant here.)

III. (30 points.) Consider the differential equation

dx/dt = (1 − x)x(1 + x).
(a) Draw a phase diagram.

Answer:

  interval:               x < −1    −1 < x < 0    0 < x < 1    1 < x
  sign of (1 − x)x(1 + x):  > 0         < 0           > 0        < 0

Phase line (arrows give the direction of increasing t):

  ──→  −1  ←──  0  ──→  1  ←──

(b) Determine the limit lim_{t→∞} x(t) if x(t) is the solution with x(0) = −0.5.

Answer: From the phase diagram, lim_{t→∞} x(t) = −1.

(c) Determine the limit lim_{t→∞} x(t) if x(t) is the solution with x(0) = −1.

Answer: The constant function x(t) = −1 satisfies both the differential equation and the initial condition x(0) = −1. Therefore it is the only solution, by the Existence and Uniqueness Theorem, so lim_{t→∞} x(t) = lim_{t→∞} −1 = −1. Similarly, if x(0) = 0 then lim_{t→∞} x(t) = lim_{t→∞} 0 = 0.

IV. (60 points.) A 1200 gallon tank initially holds 900 gallons of salt water with a concentration of 0.5 pounds of salt per gallon. Salt water with a concentration of 11 pounds of salt per gallon flows into the tank at a rate of 8 gallons per minute, and the well stirred mixture flows out of the tank at a rate of 3 gallons per minute. Write a differential equation for the amount x = x(t) of salt in the tank after t minutes. (You need not solve the differential equation but do give the initial condition.) Show your reasoning.

Answer: After t minutes, 8t gallons of saltwater has flowed into the tank and 3t gallons has flowed out, so the volume of the saltwater in the tank is V = 900 + 5t. The concentration of salt in this saltwater is x/V pounds per gallon. In a tiny time interval of size dt the amount of saltwater flowing out of the tank is 3 dt gallons, and the amount of salt in this saltwater is (x/V) × (3 dt) pounds. In this same tiny time interval the amount of saltwater flowing in is 8 dt gallons, and the amount of salt in that saltwater is 11 × 8 dt = 88 dt pounds. Hence the net change in the amount of salt is

dx = 88 dt − (3x/V) dt = (88 − 3x/(900 + 5t)) dt.

Initially V = 900 gallons, so x(0) = 0.5 × 900 pounds. Thus the ODE is

dx/dt = 88 − 3x/(900 + 5t),   x(0) = 450.

The fact that the tank holds 1200 gallons means that it is full after 60 minutes. This problem (with different numbers) is Example 5 on page 53 of the text.
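As a sanity check (not required by the problem), here is a minimal SymPy sketch, assuming SymPy is available, that solves the initial value problem and evaluates the amount of salt when the tank becomes full at t = 60:

# Illustrative only: solve dx/dt = 88 - 3x/(900 + 5t), x(0) = 450 symbolically.
import sympy as sp

t = sp.symbols('t', positive=True)
x = sp.Function('x')

ode = sp.Eq(x(t).diff(t), 88 - 3*x(t)/(900 + 5*t))
sol = sp.dsolve(ode, x(t), ics={x(0): 450})
print(sp.simplify(sol.rhs))           # a closed form of the shape 11*(900+5*t) - c*(900+5*t)**(-3/5)
print(sol.rhs.subs(t, 60).evalf())    # roughly 5.2e3 pounds of salt when the tank is full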

V. (60 points.) A projectile is launched straight upward from its initial position y0 with initial velocity v0 > 0. Air resistance exerts a force proportional to the square of the projectile's velocity, so that Newton's second law gives

dv/dt = (F_G + F_R)/m = −1 − v|v|.

(To simplify the problem we chose units where the gravitational acceleration is g = 1.) The projectile goes up for 0 ≤ t < T and then goes down.

(a) Find a formula for v(t) for 0 < t < T.

Answer:

tan⁻¹ v = ∫ dv/(1 + v²) = −∫ dt = −t + tan⁻¹ v0,

so

v = tan(−t + tan⁻¹ v0) = (v0 − tan t)/(1 + v0 tan t).

(b) Find a formula for v(t) for t > T.

Answer:

tanh⁻¹ v = ∫ dv/(1 − v²) = −∫ dt = −t + T,

as v = 0 when t = T, so

v = tanh(T − t).

(c) Find a formula for T.

Answer: From part (a), v0 − tan T = 0, so T = tan⁻¹(v0).
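The formulas in (a)-(c) can be verified symbolically; the following SymPy sketch (illustrative only, assuming SymPy is available) checks that each expression satisfies the relevant equation:

# Illustrative check of Problem V: the rising and falling velocity formulas.
import sympy as sp

t, T, v0 = sp.symbols('t T v0', positive=True)

v_up = sp.tan(sp.atan(v0) - t)                       # part (a), valid for 0 < t < T
print(sp.simplify(v_up.diff(t) + 1 + v_up**2))       # 0, i.e. dv/dt = -1 - v^2 while v > 0

v_down = sp.tanh(T - t)                              # part (b), valid for t > T
print(sp.simplify(v_down.diff(t) + 1 - v_down**2))   # 0, i.e. dv/dt = -1 + v^2 while v < 0

print(v_up.subs(t, sp.atan(v0)))                     # 0, so the turning time is T = atan(v0), part (c)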
Table of Integrals and Identities¹

(a + u)/(a − u) = e^{2aw}   ⟺   u/a = (e^{aw} − e^{−aw})/(e^{aw} + e^{−aw})

∫ du/(a² + u²) = (1/a) tan⁻¹(u/a) + C.

∫ du/(a² − u²) = (1/(2a)) ln|(u + a)/(u − a)| + C.

sin(t) = (e^{it} − e^{−it})/(2i)                      sinh(t) = (e^t − e^{−t})/2

cos(t) = (e^{it} + e^{−it})/2                         cosh(t) = (e^t + e^{−t})/2

i tan(t) = (e^{it} − e^{−it})/(e^{it} + e^{−it})      tanh(t) = (e^t − e^{−t})/(e^t + e^{−t})

cos²(t) + sin²(t) = 1                                 cosh²(t) − sinh²(t) = 1

d sin(t) = cos(t) dt                                  d sinh(t) = cosh(t) dt

d cos(t) = −sin(t) dt                                 d cosh(t) = sinh(t) dt

tan(t + s) = (tan(t) + tan(s))/(1 − tan(t) tan(s))    tanh(t + s) = (tanh(t) + tanh(s))/(1 + tanh(t) tanh(s))

Remark (added on answer sheet). The formulas

∫ dv/(a² + v²) = (1/a) tan⁻¹(v/a) + C,

∫ du/(a² − u²) = (1/a) tanh⁻¹(u/a) + C

can be related as follows. The formula

∫ du/(a² − u²) = (1/(2a)) ln|(u + a)/(u − a)| + C

can be proved with partial fractions. Assume that −a < u < a, so that (a + u)/(a − u) > 0, and introduce the abbreviation

w := (1/(2a)) ln((a + u)/(a − u)).

Multiply by 2a and exponentiate to get

e^{2aw} = (a + u)/(a − u),

so a e^{2aw} − u e^{2aw} = a + u, so a(e^{2aw} − 1) = u(e^{2aw} + 1), so

u/a = (e^{2aw} − 1)/(e^{2aw} + 1) = (e^{aw} − e^{−aw})/(e^{aw} + e^{−aw}) = tanh(aw)

by high school algebra. Thus tanh⁻¹(u/a) = aw, so

w = (1/a) tanh⁻¹(u/a).

Now the trig functions and hyperbolic functions are related by the formulas

i sin(t) = sinh(it),   cos(t) = cosh(it),   i tan(t) = tanh(it),

so the substitution u = iv, du = i dv gives

∫ du/(a² + u²) = ∫ i dv/(a² − v²) = (i/a) tanh⁻¹(v/a) + C = (1/a) tan⁻¹(u/a) + C,

the last step because tanh⁻¹(v/a) = tanh⁻¹(−iu/a) = −i tan⁻¹(u/a).

¹This is the same table that was emailed to the class yesterday morning.
Math 320 – Exam II. Friday Mar 27, 09:55-10:45

Answers

I. (50 points.) Complete the definition.

(i) A subset W ⊆ V of a vector space V is called a subspace iff

Answer: A subset W ⊆ V of a vector space V is called a subspace iff it is closed under the vector space operations, i.e. iff

(i) 0 ∈ W,
(ii) u, v ∈ W ⟹ u + v ∈ W, and
(iii) c ∈ R, u ∈ W ⟹ cu ∈ W.

(ii) The vectors v1, v2, . . . , vk are said to be linearly independent iff

Answer: The vectors v1, v2, . . . , vk are said to be linearly independent iff the only solution of the equation

x1 v1 + x2 v2 + · · · + xk vk = 0

is the trivial solution x1 = x2 = · · · = xk = 0.

(iii) The vectors v1, v2, . . . , vk are said to span V iff they lie in V and

Answer: The vectors v1, v2, . . . , vk are said to span V iff they lie in V and every vector in V is a linear combination of v1, v2, . . . , vk, i.e. iff for every vector v in V there exist numbers x1, x2, . . . , xk such that

v = x1 v1 + x2 v2 + · · · + xk vk.
II. (50 points.) I have been assigned the task of finding matrices P, P⁻¹ and W so that W is in reduced echelon form and PA = W, where (rows are separated by semicolons)

A = [ 1 2 −3 −2 ; −2 −4 6 4 ; 5 10 −14 −8 ].

After some calculation I found matrices B and N such that NA = B and

B = [ 1 2 0 4 ; 0 0 1 2 ; 0 0 1 2 ],   N = [ 0 7 3 ; 3 4 1 ; 1 3 1 ],   N⁻¹ = [ 1 2 −5 ; −2 −3 9 ; 5 7 −21 ].

Only one elementary row operation to go! Find P, P⁻¹, and W.

Answer: Let E be the elementary matrix E = [ 1 0 0 ; 0 1 0 ; 0 −1 1 ] (subtract row 2 from row 3), so E⁻¹ = [ 1 0 0 ; 0 1 0 ; 0 1 1 ], and

W = ENA = EB = [ 1 2 0 4 ; 0 0 1 2 ; 0 0 0 0 ]

is in reduced echelon form. We can take

P = EN = [ 1 0 0 ; 0 1 0 ; 0 −1 1 ] [ 0 7 3 ; 3 4 1 ; 1 3 1 ] = [ 0 7 3 ; 3 4 1 ; −2 −1 0 ]

so

P⁻¹ = N⁻¹E⁻¹ = [ 1 2 −5 ; −2 −3 9 ; 5 7 −21 ] [ 1 0 0 ; 0 1 0 ; 0 1 1 ] = [ 1 −3 −5 ; −2 6 9 ; 5 −14 −21 ].
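A quick numerical check of the answer (illustrative only, assuming NumPy is available):

# Verify that P*A is the reduced echelon form W and that P*P^{-1} is the identity.
import numpy as np

A = np.array([[1, 2, -3, -2], [-2, -4, 6, 4], [5, 10, -14, -8]])
N = np.array([[0, 7, 3], [3, 4, 1], [1, 3, 1]])
E = np.array([[1, 0, 0], [0, 1, 0], [0, -1, 1]])     # subtract row 2 from row 3

P = E @ N
Pinv = np.array([[1, -3, -5], [-2, 6, 9], [5, -14, -21]])
print(P @ A)        # rows (1 2 0 4), (0 0 1 2), (0 0 0 0): this is W
print(P @ Pinv)     # the 3x3 identity, confirming P^{-1}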
III. (50 points.) Here are the matrices from Problem II again.

A = [ 1 2 −3 −2 ; −2 −4 6 4 ; 5 10 −14 −8 ].

B = [ 1 2 0 4 ; 0 0 1 2 ; 0 0 1 2 ],   N = [ 0 7 3 ; 3 4 1 ; 1 3 1 ],   N⁻¹ = [ 1 2 −5 ; −2 −3 9 ; 5 7 −21 ].

(They still satisfy the equation NA = B.) True or false? The second and fourth columns of A form a basis for its column space. Justify your answer.

Answer: This is true. First note that the second and fourth columns of B = [ b1 b2 b3 b4 ] form a basis for the column space of B. This is because b1 = (1/2)b2 + 0·b4 and b3 = (1/2)b4 − b2. Hence the span of b1, b2, b3, b4 (i.e. the column space of B) is the same as the span of b2, b4. The vectors b2, b4 are independent, since the submatrix [ 2 4 ; 0 2 ] of [ b2 b4 ] has nonzero determinant, so the only solution of x·b2 + y·b4 = 0 is x = y = 0. Because ai = N⁻¹bi we have a1 = (1/2)a2 + 0·a4 and a3 = (1/2)a4 − a2, so the span of a1, a2, a3, a4 (i.e. the column space of A) is the same as the span of a2, a4. If x·a2 + y·a4 = 0 then x·Na2 + y·Na4 = x·b2 + y·b4 = 0, so x = y = 0, so a2, a4 are independent. Thus a2, a4 is a basis for the column space of A, and the dimension of this column space is two. (Both the algorithm from the book on pages 250-260 and the proof of Theorem 73 in the notes tell us that the first and third columns also form a basis, and the arguments we just used are the same as the arguments used there.)
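The rank computation behind this argument can be reproduced numerically (illustrative only, assuming NumPy is available):

# Columns 2 and 4 of A already span the (two-dimensional) column space of A.
import numpy as np

A = np.array([[1, 2, -3, -2], [-2, -4, 6, 4], [5, 10, -14, -8]])
print(np.linalg.matrix_rank(A))              # 2, so the column space is two-dimensional
print(np.linalg.matrix_rank(A[:, [1, 3]]))   # 2, so columns 2 and 4 are independent, hence a basis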
IV. (50 points.) (i) Find the inverse of the matrix [ 7 3 ; 2 1 ]. Hint: You can use the formula.

Answer:

[ a b ; c d ]⁻¹ = (1/(ad − bc)) [ d −b ; −c a ],

so

[ 7 3 ; 2 1 ]⁻¹ = [ 1 −3 ; −2 7 ].
(ii) The vector (−4, 1, b) is in the span of the vectors (7, 2, 3) and (3, 1, 4). What is b? Hint: You can save a little bit of work by using part (i).

Answer: If

(−4, 1, b) = x1 (7, 2, 3) + x2 (3, 1, 4)

then, looking at the first two coordinates,

(−4, 1) = x1 (7, 2) + x2 (3, 1) = [ 7 3 ; 2 1 ] (x1, x2),

so

(x1, x2) = [ 7 3 ; 2 1 ]⁻¹ (−4, 1) = [ 1 −3 ; −2 7 ] (−4, 1) = (−7, 15),

so b = 3x1 + 4x2 = −21 + 60 = 39.
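The same computation done numerically (illustrative only, assuming NumPy is available):

# Solve the 2x2 system from part (ii) and read off b.
import numpy as np

x1, x2 = np.linalg.solve(np.array([[7, 3], [2, 1]]), np.array([-4, 1]))
print(x1, x2)          # -7.0  15.0
print(3*x1 + 4*x2)     # 39.0, the value of b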
V. (50 points.) Find a basis for the subspace of R⁵ consisting of all vectors w which are orthogonal to the vectors (1, 0, 2, 3, 4) and (0, 1, 4, 5, 6). What is the dimension of this subspace?

Answer: This is the same as the null space of the matrix [ 1 0 2 3 4 ; 0 1 4 5 6 ]. The three vectors (−2, −4, 1, 0, 0), (−3, −5, 0, 1, 0), and (−4, −6, 0, 0, 1) form a basis, and so the dimension is three.
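The basis can be reproduced with a computer algebra system; a minimal SymPy sketch (illustrative only, assuming SymPy is available):

# Null space of the 2x5 coefficient matrix from Problem V.
import sympy as sp

M = sp.Matrix([[1, 0, 2, 3, 4], [0, 1, 4, 5, 6]])
for v in M.nullspace():
    print(v.T)     # (-2, -4, 1, 0, 0), (-3, -5, 0, 1, 0), (-4, -6, 0, 0, 1)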
Math 320 (part 1) : First-Order Differential Equations (by Evan Dummit, 2012, v. 1.01)

Contents

1 First-Order Differential Equations 1

1.1 Introduction and Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Some Motivating Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.3 First-Order: Separable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3.1 The Logistic Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.4 First-Order: Linear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.5 Substitution Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5.1 Bernoulli Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5.2 Homogeneous Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.6 First Order: Exact Equations and Integrating Factors . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.7 First Order: General Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.8 First Order: General Problems and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.8.1 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.8.2 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1 First-Order Differential Equations

1.1 Introduction and Terminology

• A differential equation is merely an equation involving a derivative (or several derivatives) of a function or functions. In every branch of science from physics to chemistry to biology (as well as 'other' fields such as engineering, economics, and demography) virtually any interesting kind of process is modeled by a differential equation, or a system of differential equations.

The reason for this is that most anything interesting involves change of some kind, and the derivative expresses the rate of change.

Thus, anything that can be measured numerically with values that change in some known manner will give rise to a differential equation.

Example: The populations of several species in an ecosystem which affect one another in some way.

Example: The position, velocity, and acceleration of a physical object which has external forces actingon it.

Example: The concentrations of molecules involved in a chemical reaction.

Example: The production of goods, availability of labor, prices of supplies, and many other quantitiesover time in economic processes.

• Here are some examples of single differential equations and systems of differential equations, with and without additional conditions.

Example: y′ + y = 0.

∗ Answer: y(x) = Ce−x for any constant C.

Example: y′′ + 2y′ + y = 3x2, with y(0) = y′(0) = 1

∗ Answer: y(x) = 3x² − 12x + 18 − 4xe⁻ˣ − 17e⁻ˣ.

Example: f′′ · f = (f′)², with f(1) = f′(1) = 1.

∗ Answer: f(x) = e^{x−1}.

Example: f′ = 2f − g and g′ = f + 2g, with f(0) = g(0) = √2.

∗ Answer: f(x) = 2e^{2x} cos(x + π/4) and g(x) = 2e^{2x} sin(x + π/4).

Example: df/ds + df/dt = s + t.

∗ Answer: Many solutions. Two examples are f(s, t) = st and f(s, t) = (1/2)s² + (1/2)t².

• Most dierential equations (very much unlike the carefully chosen ones above) are dicult if not impossibleto nd exact solutions to, in the same way that most random integrals or innite series are hard to evaluateexactly.

Prototypical example: (f ′′)7 − 2ef′ − f = sin20(x) + e−x.

However, it is generally possible to nd very accurate approximate solutions using numerical techniquesor by using Taylor series.

In this course we will only cover how to solve some of the more basic types of equations and systems:rst-order separable, linear, and exact equations; higher-order linear equations with constant coecients;and systems of rst-order linear equations with constant coecients.

• If a dierential equation involves functions of only a single variable (i.e., if y is a function only of x) then itis called an ordinary dierential equation (or ODE).

We will only talk about ODEs in this course. But for completeness, dierential equations involving func-tions of several variables are called partial dierential equations, or PDEs. (Recall that the derivativesof functions of more than one variable are called partial derivatives, hence the name.)

PDEs, obviously, arise when functions depend on more than one variable. They occur often in physics(with functions that depend on space and time) and economics (with functions that depend on time andother parameters).

• An nth order dierential equation is an equation in which the highest derivative is the nth derivative.

Example: The equations y′ + xy = 3x2 and y′ · y = 2 are rst-order.

Example: The equation y′′ + y′ + y = 0 is second-order.

Example: The equation ey = y′′′′ is fourth-order.

• A dierential equation is linear if it is linear in the terms involving y and its derivatives. In other words, ifthere are no terms like y2, or (y′)3, or y · y′, or ln(y), or ey.

Example: The equations y′ + xy = 3x2 and y′′ + y′ + y = 0 are linear.

Example: The equations y′ · y = 3x2, x2 + (y′)2 = 1, and y′′ = − sin(y) are not linear.

1.2 Some Motivating Applications

• Simple motivating example: A population (unrestricted by space or resources) tends to grow at a rate proportional to its size. [Reason: imagine each male pairing off with a female and having a fixed number of offspring each year.]

In symbols, this means that dP/dt = k·P, where P(t) is the population at time t and k is the growth rate.

This is a homogeneous first-order linear differential equation with constant coefficients.

It's not hard to see that one population model that works is P(t) = e^{k·t}; hence, exponential growth.

• More complicated example: The Happy Sunshine Valley is home to Cute Mice and Adorable Kittens. The Cute Mice grow at a rate proportional to their population, minus the number of Mice that are eaten by their predators, the Kittens. The population of Adorable Kittens grows proportional to the number of Mice (since they have to catch Mice to survive and reproduce).

Symbolically, this says dM/dt = k1·M − k2·K and dK/dt = k3·M, where M(t) and K(t) are the populations of Mice and Kittens, and k1, k2, k3 are some constants.

This is a system of two linear differential equations; we will learn how to solve a system like this later in the course.

The conditions here are fairly natural for a simple predator-prey system. But in general, there could be non-linear terms too: perhaps when two Kittens meet, they fight with each other and cause injury, which might change the equation to dK/dt = k3·M − k4·K².

This system would get even more difficult if we added additional species, each of which interacts in some way with the others.

• Non-linear example: A simple pendulum consists of a weight suspended on a string, with gravity the only force acting on the weight. If θ is the angle the pendulum's string makes with a vertical line, then the horizontal force on the weight toward the vertical is proportional to sin(θ).

Symbolically, this says d²θ/dt² = −k·sin(θ). This is a non-linear second-order differential equation.

This equation cannot be solved exactly for the function θ(t). However, a reasonably good approximation can be found by using the rough estimate sin(θ) ≈ θ, which turns the problem into the linear second-order differential equation d²θ/dt² = −k·θ, whose solutions are much easier to find.

1.3 First-Order: Separable

• A separable equation is of the form y′ = f(x) ·g(y) for some functions f(x) and g(y), or an equation equivalentto something of this form.

• Here is the method for solving such equations:

Step 1: Replace y′ with dy/dx, and then cross-multiply as necessary to get all the y-stuff on one side (including dy/dx) and all of the x-stuff on the other.

Step 2: Integrate both sides (indefinitely, with respect to x).

∗ For the integral involving y terms, remember that you can cancel (dy/dx)·dx to get dy and an integral in terms of y only.

∗ Don't forget to put the +C on the side with the x-terms.

Step 3: If given, plug in the initial condition to solve for the constant C. (Otherwise, just leave it whereit is.)

Step 4: Solve for y as a function of x, if possible.

• Example: Solve y′ = k · y, where k is some constant.

Step 1: Rewrite as (1/y)(dy/dx) = k.

Step 2: Integrate to get ∫ (1/y)(dy/dx) dx = ∫ k dx, or ∫ (1/y) dy = ∫ k dx. Evaluate to get ln(y) = kx + C.

Step 4: Exponentiate to get y = e^{kx+C} = C·e^{kx}.

• Example: Solve the differential equation y′ = e^{x−y}.

Step 1: Rewrite as e^y (dy/dx) = e^x.

Step 2: Integrate to get ∫ e^y (dy/dx) dx = ∫ e^x dx. Simplify and then evaluate to get e^y = e^x + C.

Step 4: Take the natural logarithm to get y = ln(e^x + C).

• Example: Find y given that y′ = x + xy² and y(0) = 1.

Step 1: Rewrite as dy/(1 + y²) = x dx.

Step 2: Integrate to get ∫ 1/(1 + y²) (dy/dx) dx = ∫ x dx, then simplify and evaluate to get tan⁻¹(y) = (1/2)x² + C.

Step 3: Plug in the initial condition to get tan⁻¹(1) = C, hence C = π/4.

Step 4: Take the tangent of both sides to get y = tan((1/2)x² + π/4).

1.3.1 The Logistic Equation

• Example: Solve the differential equation P′ = aP(b − P), where a and b are positive constants.

Step 1: Rewrite as dP/(P(b − P)) = a dt.

Step 2: Integrate both sides to obtain ∫ dP/(P(b − P)) = ∫ a dt. To evaluate the P-integral, use partial fraction decomposition: 1/(P(b − P)) = (1/b)/P + (1/b)/(b − P). Evaluating the integrals therefore yields (1/b) ln(P) − (1/b) ln(b − P) = at + C.

Step 4: Multiply both sides by b and then combine the logarithms to obtain ln(P/(b − P)) = abt + C; now exponentiate to get P/(b − P) = Ce^{abt}. Solving for P yields, finally, P(t) = b/(1 + Ce^{−abt}).

Note: If we want to satisfy the initial condition P(0) = P0, then plugging in shows C = b/P0 − 1. Then the solution can be rewritten in the form P(t) = bP0/(P0 + (b − P0)e^{−abt}).

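Note: the closed form just obtained can be checked symbolically; the short SymPy sketch below (illustrative only, assuming SymPy is available) confirms that it satisfies the logistic equation and the initial condition.

# Check that P(t) = b*P0/(P0 + (b - P0)*exp(-a*b*t)) solves P' = a*P*(b - P) with P(0) = P0.
import sympy as sp

t, a, b, P0 = sp.symbols('t a b P0', positive=True)
P = b*P0 / (P0 + (b - P0)*sp.exp(-a*b*t))

print(sp.simplify(P.diff(t) - a*P*(b - P)))   # 0
print(P.subs(t, 0))                           # P0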
• Remark: Dierential equations of this form are called logistic equations. With the explicit solution given here,we can observe some properties of the solution curves.

For example, as t → ∞, as long as the starting population P0 is positive, the population P (t) tendstoward the carrying capacity of b.

We can also see directly from the original dierential equation P ′ = aP (b− P ) that if the population isless than b, then P ′ > 0, so that P is increasing but that as P approaches b, the value of P ′ shrinks to0. (This can also be seen from graphing some of the solution curves.)

Similarly, if the population is greater than b, then P ′ < 0 and P is decreasing. And if the population isexactly b, then P ′ = 0 and the population is constant.

Thus we can see that the value b is an attracting point for the population, because as time goes on,nearby values of P all get pushed closer toward b. This is an example of a stable critical point.

By doing a similar analysis near the value P = 0 we can see that the value 0 is a repulsing point,because as time goes on, nearby values of P all get pushed farther away from 0. This is an example ofan unstable critical point.


1.4 First-Order: Linear

• The general form for a rst-order linear dierential equation is (upon dividing by the coecient of y′) givenby y′ + P (x) · y = Q(x), where P (x) and Q(x) are some functions of x.

• We would really like it if we could just integrate both sides to solve the equation. However, in general, wecannot: the y′ term is easy to integrate, but the P (x) · y term causes trouble.

• To solve this equation we use an integrating factor: we multiply by a function I(x) which will turn theleft-hand side into the derivative of a single function.

What we would like to happen is for I(x) · y′ + I(x)P (x) · y to be the derivative of something nice.

When written this way, this sum looks sort of like the output of the product rule. If we can nd I(x) so

that the derivative of I(x) is I(x)P (x), then this sum will be the derivatived

dx[I(x) · y].

What we want is I(x)P (x) = I ′(x). This is now a (very easy) separable equation for the function I(x),and the solution is I(x) = e

´P (x) dx.

• Motivated by the above logic, here is the method for solving rst-order linear equations:

Step 1: Put the equation into the form y′ + P (x) · y = Q(x).

Step 2: Multiply both sides by the integrating factor e´P (x) dx to get e

´P (x) dxy′ + e

´P (x) dxP (x) · y =

e´P (x) dxQ(x).

Step 3: Observe that the left-hand side is d/dx [ e^{∫P(x) dx} · y ], and take the antiderivative on both sides. Don't forget the constant of integration C.

Step 4: If given, plug in the initial condition to solve for the constant C. (Otherwise, just leave it whereit is.)

Step 5: Solve for y as a function of x.

• Example: Find y given that y′ + 2xy = x and y(0) = 1.

Step 1: We have P(x) = 2x and Q(x) = x.

Step 2: Multiply both sides by e^{∫P(x) dx} = e^{x²} to get e^{x²} y′ + e^{x²} · 2x · y = x · e^{x²}.

Step 3: Taking the antiderivative on both sides yields e^{x²} y = (1/2)e^{x²} + C.

Step 4: Plugging in yields e⁰ · 1 = (1/2)e⁰ + C, hence C = 1/2.

Step 5: Solving for y gives y = 1/2 + (1/2)e^{−x²}.
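Note: the answer can be verified directly; a short SymPy sketch (illustrative only, assuming SymPy is available):

# Check that y = 1/2 + (1/2)e^{-x^2} satisfies y' + 2xy = x and y(0) = 1.
import sympy as sp

x = sp.symbols('x')
y = sp.Rational(1, 2) + sp.Rational(1, 2)*sp.exp(-x**2)

print(sp.simplify(y.diff(x) + 2*x*y - x))   # 0
print(y.subs(x, 0))                          # 1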

• Example: Find all functions y for which xy′ = x4 − 4y.

Step 1: We have y′ +4

xy = x3, so P (x) =

4

xand Q(x) = x3.

Step 2: Multiply both sides by e´P (x) dx = e4 ln(x) = x4 to get x4y′ + 4x3y = x7,

Step 3: Taking the antiderivative on both sides yields x4y =1

8x8 + C.

Step 5: Solving for y gives y =1

8x4 + C · x−4 .

• Example: Find y given that y′ · cot(x) = y + 2 cos(x) and y(0) = −1

2.

Step 1: We have y′ − y tan(x) = 2 sin(x), with P (x) = − tan(x) and Q(x) = 2 sin(x).

Step 2: Multiply both sides by e´P (x) dx = eln(cos(x)) = cos(x) to get y′ ·cos(x)−y ·sin(x) = 2 sin(x) cos(x).


Step 3: Taking the antiderivative on both sides yields [y · cos(x)] = −1

2cos(2x) + C.

Step 4: Plugging in yields −1

2= −1

2· 1 + C hence C = 0.

Step 5: Solving for y gives y = − cos(2x)

2 cos(x).

1.5 Substitution Methods

• Just like with integration, sometimes we come across dierential equations which we cannot obviously solve,but which, if we change variables, will turn into a form we know how to solve.

• Determining what substitutions to try is a matter of practice, in much the same way as in integral calculus.In general, there are two kinds of substitutions: obvious ones that arise from the form of the dierentialequation, and formulaic ones which are standard substitutions to use if a dierential equation has a particularform.

• The general procedure is the following:

Step 1: Express the new variable v in terms of y and x.

Step 2: Finddv

dxin terms of y′, y, and x using implicit dierentiation.

Step 3: Rewrite the original dierential equation in y as a dierential equation in v.

Step 4: Solve the new equation in v. (The hope is, after making the substitution, the new equation is ina form that can be solved with one of the other methods.)

Step 5: Substitute back for y.

• Example: Solve the equation y′ = (x + y)².

This equation is not linear, nor is it separable as written. The obstruction is that the term x + y involves both x and y.

Step 1: Let us try substituting v = x + y.

Step 2: Differentiating yields dv/dx = 1 + dy/dx, so y′ = v′ − 1.

Step 3: The new equation is v′ − 1 = v², or v′ = v² + 1.

Step 4: The equation in v is separable. Separating it gives ∫ dv/(v² + 1) = ∫ 1 dx, so that tan⁻¹(v) = x + C, or v = tan(x + C).

Step 5: Substituting back yields y = tan(x + C) − x.
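Note: the result is easy to verify; a short SymPy sketch (illustrative only, assuming SymPy is available):

# Check that y = tan(x + C) - x satisfies y' = (x + y)^2.
import sympy as sp

x, C = sp.symbols('x C')
y = sp.tan(x + C) - x
print(sp.simplify(y.diff(x) - (x + y)**2))   # 0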

1.5.1 Bernoulli Equations

• An equation of the form y′ + P (x)y = Q(x) · yn for some integer n 6= 0, 1 is called a Bernoulli equation. (Therestriction that n not be 0 or 1 is not really a restriction, because if n = 0 then the equation is rst-orderlinear, and if n = 1 then the equation is the same as y′ = (Q(x)− P (x))y, which is separable.)

As with rst-order linear equations, sometimes Bernoulli equations can be hidden in a slightly dierentform.

The trick for solving a Bernoulli equation is to make the substitution v = y1−n. The algebra is simpliedif we rst multiply both sides of the original equation by (1− n) · y−n, and then make the substitution.

So we have (1− n)y′ · y−n + (1− n)P (x) · y1−n = (1− n)Q(x).

For v = y1−n we have v′ = (1− n)y−n · y′.


Substituting in to the original equation then yields the rst-order linear equation v′ + (1− n)P (x) · v =(1− n)Q(x) for v.

• Example: Solve the equation y′ + 2xy = xy³.

This equation is of Bernoulli type, with P(x) = 2x, Q(x) = x, and n = 3. Making the substitution v = y⁻² thus results in the equation v′ − 4xv = −2x. Next, we compute the integrating factor I(x) = e^{∫−4x dx} = e^{−2x²}.

Scaling by the integrating factor gives e^{−2x²} v′ − 4x e^{−2x²} v = −2x e^{−2x²}.

Taking the antiderivative on both sides then yields e^{−2x²} v = (1/2)e^{−2x²} + C, so that v = 1/2 + Ce^{2x²}.

Finally, solving for y gives y = (1/2 + Ce^{2x²})^{−1/2}.
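Note: the final formula can be checked directly; a short SymPy sketch (illustrative only, assuming SymPy is available):

# Check that y = (1/2 + C*exp(2x^2))^{-1/2} satisfies the Bernoulli equation y' + 2xy = xy^3.
import sympy as sp

x, C = sp.symbols('x C')
y = (sp.Rational(1, 2) + C*sp.exp(2*x**2))**sp.Rational(-1, 2)
print(sp.simplify(y.diff(x) + 2*x*y - x*y**3))   # 0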

• Example: Solve the equation y2y′ = ex − y3.

The equation as written is not of Bernoulli type. However, if to both sides we add y3 and then divideby y2, we obtain the equation y′ + y = exy−2, which is now Bernoulli with P (x) = 1, Q(x) = ex, andn = −2. Making the substitution v = y3 results in the equation v′ + 3v = 3ex.

The integrating factor is I(x) = e´3 dx = e3x, so the new equation is e3xv′ + 3e3xv = 3e4x.

Taking the antiderivative then gives e3xv =3

4e4x+C, so v =

3

4ex+Ce−3x and y =

(3

4ex + Ce−3x

)1/3

.

1.5.2 Homogeneous Equations

• An equation of the form y′ = f(yx

)for some function f is called a homogeneous equation.

The trick to solving an equation of this form is to make the substitution v =y

x, or equivalently to set

y = vx.

Then dierentiating y = vx shows y′ = v + xv′, hence the equation becomes v + xv′ = f(v), which is

separable once written in the formv′

f(v)− v=

1

x.

• Example: Solve the dierential equation 2x2y′ = x2 + y2.

This equation is not separable nor linear, and it is not a Bernoulli equation. If we divide both sides by

2x2 then we obtain y′ =1

2+

1

2

(yx

)2, which is homogeneous.

Setting v = y/x yields the equation xv′ =1

2v2 − v + 1

2, and rearranging gives

2v′

(v − 1)2=

1

x.

Then integrating yields´ 2dv

(v − 1)2=´ 1

xdx, so

−2v − 1

= ln(x)+C. Solving for v gives v = 1− 2

ln(x) + C,

so y = x− 2x

ln(x) + C.

• Example: Solve the dierential equation y′ =x2 + y2

xy.

If we divide the numerator and denominator of the fraction by x2, we obtain y′ =1 + (y/x)2

(y/x), which is

homogeneous.


Setting v = y/x yields xv′ =1 + v2

v− v =

1

v.

Separating and integrating yields´v dv =

´ 1

xdx, so that

1

2v2 = ln(x) + C, so v =

√2 ln(x) + C and

then y = x√2 ln(x) + C .

1.6 First Order: Exact Equations and Integrating Factors

• Theorem (Exact Equations): For functions M(x, y) and N(x, y) with My = Nx (on some rectangle), thereexists a function F (x, y) with Fx =M and Fy = N (on that rectangle). Then the solutions to the dierentialequation M(x, y) +N(x, y) y′ = 0 are given (implicitly) by F (x, y) = C where C is an arbitrary constant.

My denotes the partial derivative of M with respect to y, namely∂M

∂y, and similarly for the other

functions.

The equation M(x, y) + N(x, y) y′ = 0 is also sometimes written M(x, y) dx + N(x, y) dy = 0. In thisform, it is more symmetric between the variables x and y. I will generally do this.

Fun Fact: The part of the theorem stating that My = Nx implies the existence of a function F suchthat Fx =M and Gy = N is a theorem from vector calculus: the criterion My = Nx is equivalent to thevector eld 〈M,N〉 being conservative. The function F is the corresponding potential function, with∇F = 〈M,N〉. The rest of the theorem is really just an application of this result.

Remark: Note that if M = f(x) is a function only of x and N = − 1

g(y)is a function only of y, then our

equation looks like f(x)− 1

g(y)y′ = 0. Rearranging it gives the general form y′ = f(x) g(y) of a separable

equation. Since My = 0 = Nx in this case, separable equations are a special case of exact equations.

• We can use the theorem to solve exact equations, where My = Nx. If the partial derivatives are not equal,we are not necessarily out of luck like with rst-order linear equations, there may exist an integrating factorI(x, y) which we can multiply the equation by, in order to make the equation exact.

Unfortunately, we don't really get much for free: trying to solve for the integrating factor is often as hard

as solving the original equation. Finding I(x, y), in general, requires solving the PDE∂I

∂y·M − ∂I

∂x·N +

I · (My −Nx) = 0, which is just as tricky to solve as the original equation. Only in a few special casesare there methods for computing the integrating factor I(x, y).

Case 1: Suppose we want to see if there exists an integrating factor that depends only on x (and not

on y). Then∂I

∂ywould be zero, since I does not depend on y, and so I(x) would need to satisfy

I ′

I=My −Nx

N. This can only happen if the ratio

My −Nx

Nis a function P (x) only of x (and not y);

then I(x) = e´P (x) dx.

∗ The form of this integrating factor should look familiar it is the same as the one from a rst-orderlinear equation. There is a very good reason for this; namely, a rst-order linear equation is a specialcase of this form of equation.

Case 2: We could also look to see if there is an integrating factor that depends only on y and not on x.

We can do the same calculation, this time using∂I

∂x= 0, to see that such an integrating factor exists if

the ratioNx −My

Mis a function Q(y) only of y (and not x); then I(y) = e

´Q(y) dy.

Remark: There is no really good reason only to consider these cases, aside from the fact that they're theeasiest. We could just as well try to look for integrating factors that are a function of the variable t = xy.Or of v = x/y. Or of w = y + ln(x). In each case we'd end up with some other kind of condition. Butwe won't think about those things we really just care about the two kinds of integrating factors above.

• Example: Solve for y(x), if (4y² + 2x) + (8xy)y′ = 0.

There is no obvious substitution to make, and it is not separable, linear, homogeneous, or Bernoulli. So we must check for exactness.

In differential form the equation is (4y² + 2x) dx + 8xy dy = 0. Therefore, M = 4y² + 2x and N = 8xy.

Therefore we have My = 8y and Nx = 8y. Since these are equal, the equation is exact.

So we want to find F with Fx = M and Fy = N. Taking the anti-partial-derivative of M with respect to x yields F(x, y) = 4xy² + x² + g(y) for some function g(y). Checking then shows Fy = 8xy + g′(y), so g′(y) = 0.

Therefore, our solutions are given implicitly by 4xy² + x² = C.

• Example: Solve for y(x), if (2xy² − 4y) + (3x²y − 8x)y′ = 0.

There is no obvious substitution to make, and it is not separable, linear, homogeneous, or Bernoulli. So we must check for exactness.

In differential form the equation is (2xy² − 4y) dx + (3x²y − 8x) dy = 0. Therefore, M = 2xy² − 4y and N = 3x²y − 8x.

Therefore we have My = 4xy − 4 and Nx = 6xy − 8. These are not equal, so the equation isn't exact.

We look for integrating factors using the two criteria we know.

∗ First, we have (My − Nx)/N = (−2xy + 4)/(3x²y − 8x), which is not a function of x only.

∗ Second, we have (Nx − My)/M = (2xy − 4)/(2xy² − 4y) = 1/y, which is a function of y only. Therefore we need to multiply by the integrating factor I(y) = e^{∫(1/y) dy} = y.

Our new equation is therefore (2xy³ − 4y²) + (3x²y² − 8xy)y′ = 0.

Now we want to find F with Fx = 2xy³ − 4y² and Fy = 3x²y² − 8xy. Taking the anti-partial-derivative of the first equation gives F(x, y) = x²y³ − 4xy² + f(y), and checking in the second equation shows f′(y) = 0.

Therefore, our solutions are given implicitly by x²y³ − 4xy² = C.
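Note: the exactness test and the potential function F can also be computed mechanically; a short SymPy sketch for the second example (illustrative only, assuming SymPy is available):

# After multiplying by the integrating factor y, test exactness and recover F.
import sympy as sp

x, y = sp.symbols('x y')
M = (2*x*y**2 - 4*y) * y       # new M = 2xy^3 - 4y^2
N = (3*x**2*y - 8*x) * y       # new N = 3x^2y^2 - 8xy

print(sp.simplify(sp.diff(M, y) - sp.diff(N, x)))   # 0, so the scaled equation is exact
F = sp.integrate(M, x)                               # x^2*y^3 - 4*x*y^2, up to a function of y
print(F, sp.simplify(sp.diff(F, y) - N))             # second value is 0, so F_y = N already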

1.7 First Order: General Procedure

• We can combine all of the techniques for solving rst-order dierential equations into a handy list of steps. Forthe purposes of this course, if the equation cannot be simplied via a substitution (an obvious substitution, orif it is homogeneous or Bernoulli) then it is either exact, or can be made exact by multiplying by an integratingfactor. (If it's not one of those, then in this course we have no idea how to solve it.)

Note: It is not really necessary to check ahead of time whether the equation is separable or a rst-orderlinear equation. Separable equations are exact, and rst-order linear equations can be made exact after

multiplying by an integrating factor, which will be detected using theMy −Nx

Ntest. I check for these

two special types at the beginning only because it's faster to solve it using the usual methods.

• Here is the general procedure to follow to solve rst-order equations:

Step 1: Write the equation in the two standard forms y′ = f(x, y) and M(x, y) + N(x, y) · y′ = 0 andcheck to see if it is rst-order linear or separable.

∗ Step 1a: If the equation is rst-order linear namely, of the form y′+P (x)y = Q(x) then multiplyby the integrating factor I(x) = e

´P (x) dx and then take the antiderivative of both sides.

∗ Step 1b: If the equation is separable namely, of the form y′ = f(x) · g(y) then separate they-terms and x-terms on opposite sides of the equation and then take the antiderivative of both sides.

Step 2: Look for possible substitutions (generally, using the y′ = f(x, y) form).

∗ Step 2a: Check to see if there is any 'obvious' substitution that would simplify the equation.


∗ Step 2b: Check to see if the equation is of Bernoulli type namely, of the form y′+P (x)y = Q(x)·yn.If so, multiply both sides by (1 − n) · y−n and then make the substitution v = y1−n to obtain a

rst-order linear equationdv

dx+ (1− n)P (x) · v = (1− n)Q(x).

∗ Step 2c: Check to see if the equation is homogeneous namely, of the form y′ = F(yx

)for some

function F . If so, make the substitution v =y

xto obtain a separable equation x · dv

dx= F (v)− v.

Step 3: If the equation is not of a special type, use the M(x, y)+N(x, y) · y′ = 0 form to nd the partialderivatives My and Nx.

Step 4: If My = Nx, no integrating factor is needed. Otherwise, if My 6= Nx, look for an integratingfactor I to multiply both sides of the equation by.

∗ Step 3a: ComputeMy −Nx

N. If it is a function P (x) only of x, then the integrating factor is

I(x) = e´P (x) dx.

∗ Step 3b: ComputeNx −My

M. If it is a function Q(y) only of y, then the integrating factor is

I(y) = e´Q(y) dy.

∗ If neither of these methods works, you're out of luck unless you can nd an integrating factor someother way.

Step 5: Take antiderivatives to nd the function F (x, y) with Fx = M and Fy = N , and write thesolutions as F (x, y) = C.

1.8 First Order: General Problems and Solutions

• Part of the diculty of seeing rst-order dierential equations outside of a homework set (e.g., on exams) isthat it is not always immediately obvious which method or methods will solve the problem. Thus, it is goodto practice problems without being told which method to use.

1.8.1 Problems

• Solve the equation xy′ = y +√xy.

• Solve the equation y′ =y − 2xy2

3x2y − 2x.

• Solve the equation xy′ = y +√x.

• Solve the equation y′ − 1 = y2 + x3 + x3y2.

• Solve the equation y′ = −4x3y2 + y

2x4y + x.

• Solve the equation y′ = xy3 − 6xy.

• Solve the equation y′ = −2xy + 2x

x2 + 1.

1.8.2 Solutions

• Solve the equation xy′ = y +√xy.

Step 1: The two standard forms are y′ =y

x+

√y

xand (−y − √xy) + xy′ = 0. The equation is not

separable or rst-order linear.

Step 2: We go down the list and recognize that y′ =y

x+

√y

xis a homogeneous equation.


Setting v = y/x (with y = vx and y′ = xv′ + v) yields xv′ =√v.

This equation is separable: we have´ dv√

v=´ 1

xdx hence 2v1/2 = ln(x) + C, so v =

(ln(x)

2+ C

)2

.

Solving for y gives y = x

(ln(x)

2+ C

)2

.

Note: The equation is also of Bernoulli type, and could be solved that way too. Of course, it will givethe same answer.

• Solve the equation y′ =y − 2xy2

3x2y − 2x.

Step 1+2: The other standard form is (2xy2 − y) + (3x2y − 2x) y′ = 0. The equation is not separable orlinear, nor is it homogeneous or Bernoulli.

Step 3: We have M = 2xy2 − y and N = 3x2y − 2x so My = 4xy − 1 and Nx = 6xy − 2.

Step 4: We need to look for an integrating factor, because My 6= Nx. We haveMy −Nx

N=−2xy + 1

3x2y − 2x,

which is not a function of x alone. Next we tryNx −My

M=

2xy − 1

2xy2 − y=

1

y, so the integrating factor is

I(y) = e

´ 1y

dy

= y.

Step 5: The new equation is (2xy3 − y2) + (3x2y2 − 2xy) y′ = 0. Taking the anti-partial of the newM with respect to x gives F (x, y) = x2y3 + xy2 + f(y), and checking shows that f ′(y) = 0. Hence the

solutions are x2y3 + xy2 = C .

• Solve the equation xy′ = y +√x.

Step 1: The two standard forms are y′ =y

x+

1√xand (−y −

√x) + xy′ = 0. The equation is rst-order

linear.

Rewrite in the usual rst-order linear form y′ − (x−1)y = x−1/2.

We have the integrating factor I(x) = e´−x−1 dx = e− ln(x) = eln(x

−1) = x−1.

Thus the new equation is x−1y′ − x−2y = x−3/2.

Taking the antiderivative on both sides yields x−1y = −1

2x−1/2 + C, so y = −1

2x1/2 + Cx .

• Solve the equation y′ − 1 = y2 + x3 + x3y2.

Step 1: Adding 1 and then factoring the right-hand side gives y′ = (y2 + 1)(x3 + 1). This equation isseparable.

Separating it givesy′

y2 + 1= x3 + 1.

Integrating yields´ dy

y2 + 1=´(x3+1) dx, so tan−1(y) =

x4

4+x2

2+C. Then y = tan

(x4

4+x2

2+ C

).

• Solve the equation y′ = −(4x³y² + y)/(2x⁴y + x).

Step 1+2: The other standard form is (4x³y² + y) + (2x⁴y + x) y′ = 0. The equation is not separable or linear, nor is it homogeneous or Bernoulli.

Step 3: We have M = 4x³y² + y and N = 2x⁴y + x, so My = 8x³y + 1 and Nx = 8x³y + 1.

Step 4: No integrating factor is needed since My = Nx.

Step 5: Taking the anti-partial of M with respect to x gives F(x, y) = x⁴y² + xy + f(y), and checking shows that f′(y) = 0. Hence the solutions are x⁴y² + xy = C.

• Solve the equation y′ = xy3 − 6xy.

Step 1: The equation is of Bernoulli type when written as y′ + 6xy = xy3.

Multiply both sides by −2y−3 to get −2y−3y′ − 12

xy−2 = −2x.

Making the substitution v = y−2 with v′ = −2y−3y′ then yields the linear equation v′ − 12

xv = −2x.

The integrating factor is e´−(12/x) dx = e−12 ln(x) = x−12.

The new equation is x−12v′ − 12x−13v = −2x−11.

Taking the antiderivative on both sides yields x−12v =1

5x−10+C, so v =

1

5x2+Cx12 and y =

(1

5x2 + Cx12

)−1/2.

• Solve the equation y′ = −2xy + 2x

x2 + 1.

(method #1)

Step 1: The equation is separable, since after factoring we see that y′ = − 2x

x2 + 1(y + 1).

Separating and integrating gives´ dy

y + 1= −´ 2x

x2 + 1dx, so that ln(y + 1) = − ln(x2 + 1) + C.

Exponentiating yields y + 1 = e− ln(x2+1)+C =C

x2 + 1, so y =

C

x2 + 1− 1 .

(method #2)

Step 1: The other standard form is (2xy + 2x) + (x2 + 1)y′ = 0.

Step 3: We have M = 2xy + 2x and N = x2 + 1 so My = 2x and Nx = 2x.

Step 4: We have My = Nx, so the equation is exact.

Step 5: Taking the anti-partial of M with respect to x gives F (x, y) = x2y + x2 + f(y), and checking

shows that f ′(y) = 1 so f(y) = y. Hence the solutions are x2y + x2 + y = C .

Note of course that we can solve for y explicitly, and we obtain exactly the same expression as in theother solution.

Well, you're at the end of my handout. Hope it was helpful.

Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 1a) : First-Order Supplement (by Evan Dummit, 2012, v. 1.00)

Contents

0.1 First Order: Some Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

0.1.1 The General Mixing Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

0.2 Autonomous Equations, Equilibria, and Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

0.3 Euler's Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

0.1 First Order: Some Applications

0.1.1 The General Mixing Problem

• Setup:

We have some reservoir (pool, lake, ocean, planet, room) of liquid (water, gas) which has some substance(pollution, solute) dissolved in it.

The reservoir starts at an initial volume V0 and there is an initial amount of substance y0 in the reservoir.

We have some amount of liquid In(t) owing in with a given concentration k(t) of the substance, andsome other amount of liquid Out(t) owing out.

We assume that the substance is uniformly and perfectly mixed in the reservoir, and want to know theamount y(t) of the substance that remains in the reservoir after time t.

• Note that this is the general setup. In more specic examples, the amount of liquid owing in or out may beconstants (i.e., not depending on time), and similarly the concentration of the liquid owing in could also bea constant. The solution is the same, of course.

• Solution:

Let V (t) be the total volume of the reservoir. Then if y(t) is the total amount of substance in the

reservoir, the concentration of substance in the reservoir isy(t)

V (t). Thus the total amount of substance

moving in is k · In(t) and the total amount of substance moving out is the concentration of substance

times the volume moving out, ory(t)

V (t)·Out(t).

Thus we have V ′(t) = In(t) − Out(t), and y′(t) = k(t) · In(t) − y(t)

V (t)· Out(t). For clarity, refer to the

diagram:

Now to solve this system, we integrate to nd V (t) explicitly.

Then we can rewrite the other equation as y′ +Out(t)

V (t)· y = k(t) · In(t), which we can now solve because

it is rst-order linear.


0.2 Autonomous Equations, Equilibria, and Stability

• An autonomous equation is a first-order equation of the form dy/dt = f(y) for some function f.

An equation of this form is separable, and thus solvable in theory.

However, sometimes the function f(y) is sufficiently complicated that we cannot actually solve the equation explicitly.

Nonetheless, we would like to be able to say something about what the solutions look like, without actually solving the equation. Happily, this is possible.

• An equilibrium solution, also called a steady state solution or a critical point, is a solution of the form y(t) = c,for some constant c. (In other words, it is just a constant-valued solution.)

Clearly, if y(t) is constant, then y′(t) is zero everywhere. Thus, in order to nd the equilibrium solutionsto an autonomous equation y′ = f(y), we just need to solve f(y) = 0. (And this is not generally sohard.)

• For equilibrium solutions, we have some notions of stability:

An equilibrium solution y = c is stable from above if, when we solve y′ = f(y) with the initial conditiony(0) = c+ε for some small but positive ε, the solution y(t) moves toward c as t increases. This statementis equivalent to f(c+ ε) < 0.

A solution y = c is stable from below if when we solve y′ = f(y) with the initial condition y(0) = c−ε forsome small but positive ε, the solution y(t) moves toward c as t increases. This statement is equivalentto f(c− ε) > 0.

A solution y = c is unstable from above if when we solve y′ = f(y) with the initial condition y(0) = c+ εfor some small but positive ε, the solution y(t) moves away from c as t increases. This statement isequivalent to f(c+ ε) > 0.

A solution y = c is unstable from below if when we solve y′ = f(y) with the initial condition y(0) = c− εfor some small but positive ε, the solution y(t) moves away from c as t increases. This statement isequivalent to f(c− ε) < 0.

• We say a solution is stable if it is stable from above and from below. We say it is unstable if it unstablefrom above and from below. Otherwise (if it is stable from one side and unstable from the other) we say it issemistable.

• From the equivalent conditions about the sign of f , here are the steps to follow to nd and classify theequilibrium states of y′ = f(y):

Step 1: Find all values of c for which f(c) = 0, to nd the equilibrium states.

Step 2: Mark all the equilibrium values on a number line, and then in each interval between two criticalpoints, plug in a test value to f to determine whether f is positive or negative on that interval.

Step 3: On each interval where f is positive, draw right-arrows, and on each interval where f is negative,draw left-arrows.

Step 4: Using the arrows, classify each critical point: if the arrows point toward it from both sides, itis stable. If the arrows point away, it is unstable. If the arrows both point left or both point right, it issemistable.

Step 5 (optional): Draw some solution curves, either by solving the equation or by using the stabilityinformation.

• Example: Find the equilibrium states of y′ = y and determine stability.

Step 1: We have f(y) = y, which obviously is zero only when y = 0.

Step 2: We draw the line and plug in 2 test points (or just think for a second) to see that the sign diagram looks like  ⊖ |0 ⊕ .

Step 3: Changing the diagram to arrows gives  ← |0 → .

Step 4: So we can see from the diagram that the only equilibrium point 0 is unstable.

Step 5: We can of course solve the equation to see that the solutions are of the form y(t) = C eᵗ, and indeed, the equilibrium solution y = 0 is unstable.

• Example: Find the equilibrium states of y′ = y²(y − 1)(y − 2) and determine stability.

Step 1: We have f(y) = y²(y − 1)(y − 2), which conveniently is factored. We see it is zero when y = 0, y = 1, and y = 2.

Step 2: We draw the line and plug in 4 test points (or just think for a second) to see that the sign diagram looks like  ⊕ |0 ⊕ |1 ⊖ |2 ⊕ .

Step 3: Changing the diagram to arrows gives  → |0 → |1 ← |2 → .

Step 4: So we can see from the diagram that 0 is semistable, 1 is stable, and 2 is unstable.

Step 5: In this case, it is possible to obtain an implicit solution by integration; however, an explicit solution does not exist. However, we can graph some solution curves to see that, indeed, our classification is accurate.
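The sign-testing procedure of Steps 2-4 is easy to automate; here is a small Python sketch (illustrative only) that classifies the equilibria of this example:

# Classify the equilibria of y' = y^2 (y - 1)(y - 2) by checking signs between critical points.
def f(y):
    return y**2 * (y - 1) * (y - 2)

crit = [0, 1, 2]
tests = [-1, 0.5, 1.5, 3]                                # one test value in each interval
signs = ['+' if f(y) > 0 else '-' for y in tests]
print(signs)                                             # ['+', '+', '-', '+']

for left, right, c in zip(signs, signs[1:], crit):
    kind = {('+', '-'): 'stable', ('-', '+'): 'unstable'}.get((left, right), 'semistable')
    print(c, kind)                                       # 0 semistable, 1 stable, 2 unstable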


0.3 Euler's Method

• If we have exhausted all of our techniques trying to solve a rst-order initial value problem y′ = f(x, y) withy(a) = y0, we are sad. However, perhaps we would like to be able to nd an approximate solution on someinterval [a, b].

• One method we can use to nd an approximate solution is Euler's Method, named after the Swiss mathemati-cian Leonhard Euler (pronounced oiler).

The general idea behind Euler's Method, which should bring back memories of basic calculus, is to breakup the interval [a, b] into many small pieces, and then to use a linear approximation to the function y(x)on each interval to trace a rough solution to the equation.

• Here is the method, more formally:

Step 1: Choose the number of subintervals n, and let h = (b − a)/n be the width of the subintervals.

Step 2: Define the x-values x0 = a, x1 = x0 + h, x2 = x1 + h, ... , xn = x(n−1) + h = b.

Step 3: Take y0 to be the given initial value. Then compute, iteratively, the values y1 = y0 + h·f(x0, y0), y2 = y1 + h·f(x1, y1), ... , yn = y(n−1) + h·f(x(n−1), y(n−1)). It is easiest to organize this information in a table.

Step 4 (optional): Plot the points (x0, y0), (x1, y1), . . . , (xn, yn) and connect them with a smooth curve.

• Example: Use Euler's Method to find an approximate solution on the interval [1, 2] to the differential equation y′ = ln(x + y) with y(1) = 1.

Step 1: Let's take 10 subintervals. Then h = 0.1.

Steps 2+3: We organize our information in the table below. We fill out the first row with the x-values. Then we fill in the empty columns one at a time: to start the next column, we add the y-value and the h·f(x, y) value to get the next y-value.

x          1       1.1     1.2     1.3     1.4     1.5     1.6     1.7     1.8     1.9     2.0
y          1       1.0693  1.1467  1.2320  1.3249  1.4251  1.5324  1.6466  1.7674  1.8946  2.0280
f(x, y)    0.693   0.774   0.853   0.929   1.002   1.073   1.142   1.208   1.272   1.334   -
h·f(x, y)  0.0693  0.0774  0.0853  0.0929  0.1002  0.1073  0.1142  0.1208  0.1272  0.1334  -

Step 4: Finally, we can plot the points, and (for comparison) the actual solution curve obtained using a computer. As can be seen from the graph, the approximation is very good.
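The whole procedure fits in a few lines of code; here is a small Python sketch (illustrative only) that reproduces the table above:

# Euler's Method for y' = ln(x + y), y(1) = 1, with 10 subintervals on [1, 2].
import math

def euler(f, a, b, y0, n):
    h = (b - a) / n
    xs, ys = [a], [y0]
    for _ in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))   # y_{k+1} = y_k + h*f(x_k, y_k)
        xs.append(xs[-1] + h)
    return xs, ys

xs, ys = euler(lambda x, y: math.log(x + y), 1.0, 2.0, 1.0, 10)
for xk, yk in zip(xs, ys):
    print(f"{xk:.1f}  {yk:.4f}")    # final line is approximately 2.0  2.0280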

Well, you're at the end of my handout. Hope it was helpful.

Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 2) : Matrices (by Evan Dummit, 2012, v. 1.00)

Contents

1 Matrices 1

1.1 Linear Equations, Gauss-Jordan Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Matrix Operations: Addition, Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.3 Determinants and Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 Matrices and Systems of Linear Equations, revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1 Matrices

• An m × n matrix is an array of numbers with m rows and n columns. For example,

[ 4 1 −1 ; 3 2 0 ]

is a 2 × 3 matrix, and

[ π 0 0 ; 0 π 0 ; 0 0 9 ]

is a 3 × 3 matrix (rows are separated by semicolons).

A square matrix is one with the same number of rows and columns; i.e., an n× n matrix for some n.

• Matrices originally arose as a way to simplify the algebra involved in solving systems of linear equations.

1.1 Linear Equations, Gauss-Jordan Elimination

• Reminder: A linear equation is an equation of the form a1x1 + a2x2 + · · · + anxn = b, for some constantsa1, . . . an, b, and variables x1, . . . , xn.

• The traditional method for solving a system of linear equations (typically covered in basic algebra) is byelimination i.e., by solving one equation for the rst variable x1, then plugging in the result to all the otherequations to obtain a reduced system involving fewer variables. Eventually, the system will simplify either toa contradiction (e.g., 1 = 0), a unique solution, or an innite family of solutions.

Example: Given the systemx + y = 72x − 2y = −2

we can solve the rst equation for x to obtain x = 7 − y. Then plugging in this relation to the secondequation gives 2(7 − y) − 2y = −2, or 14 − 4y = −2, so that y = 4 . Then since x = 7 − y we obtain

x = 3 .

• Another way to perform elimination is to add and subtract multiples of the equations, so to eliminate variables(and remove the need to solve for each individual variable before eliminating it). In the example above, insteadof solving the rst equation for x, we could multiply the rst equation by −2 and then add it to the secondequation, so as to eliminate x from the second equation. This yields the same overall result, but is lesscomputationally dicult.

Example: Given the system

x + y + 3z = 4
2x + 3y − z = 1
−x + 2y + 2z = 1

if we label the equations #1, #2, #3, then we can eliminate x by taking [#2] − 2[#1] and [#3] + [#1]. This gives

x + y + 3z = 4
    y − 7z = −7
   3y + 5z = 5

and now we can eliminate y by taking [#3] − 3[#2]. This yields

x + y + 3z = 4
    y − 7z = −7
       26z = 26.

Now the third equation gives z = 1, and then the second equation requires y = 0, and the first equation gives x = 1.

• Now, this procedure of elimination can be simplied even more, because we don't really need to write thevariables down every time. We only need to keep track of the coecients, which we can do by putting theminto a matrix.

Example: The systemx + y + 3z = 42x + 3y − z = 1−x + 2y + 2z = 1

in matrix form becomes 1 1 32 3 −1−1 2 2

∣∣∣∣∣∣411

Note: When working with a coecient matrix, I generally draw a line to separate the coecients of thevariables from the constant terms.

• Now we can perform the elimination operations on the rows of the matrix, (the elementary row operations)in order to reduce the matrix to row-echelon form. The three basic row operations on the coecient matrixwhich will not change the solutions to the equations are the following:

1. Interchange two rows.

2. Multiply all entries in a row by a nonzero constant.

3. Add a constant multiply of one row to another row.

• Here are the steps to follow to solve a system of linear equations by reducing the coecient matrix to row-echelon form. This procedure is known as Gauss-Jordan Elimination:

Step 1: Convert the system to its coecient matrix.

Step 2: If all entries in the rst column are zero, then the variable corresponding to that column is afree variable. Ignore this column and repeat this step until an entry in the rst column is nonzero. Swaprows, if necessary, so that the upper-left entry in the rst column is nonzero.

Step 3: Use row operation #3 (and #2, if it will simplify arithmetic) to clear out all entries in the rstcolumn below the rst row.

Step 4: Repeat steps 2 and 3 on the submatrix obtained by ignoring the rst row and rst column, untilall remaining rows have all entries equal to zero.

• After following the steps, the matrix will now be in row-echelon form, and the system can be fairly easilysolved. To put the matrix in reduced row-echelon form, we have an optional extra step:

Step 5: Identify the pivotal columns (columns containing a leading row-term), and then perform rowoperations to clear out all non-leading entries in each pivotal column.


• Reduced row-echelon form is useful because, when the matrix is converted back into a system of equations, it expresses each of the determined variables in terms of the free variables. It is also theoretically useful because it can be proven that every matrix has a unique reduced row-echelon form.

• Note: It is rather distressingly difficult to write a clear description of how to reduce a matrix to row-echelon form. Examples illustrate the procedure much better.

• Example: Solve the system x + y + 3z = 4, 2x + 3y − z = 1, −x + 2y + 2z = 1.

Given the system
x + y + 3z = 4
2x + 3y − z = 1
−x + 2y + 2z = 1
we write it in matrix form to get
[ 1  1  3 | 4 ]
[ 2  3 −1 | 1 ]
[−1  2  2 | 1 ]
and now we repeatedly apply elementary row operations to clear out the first column:

R2 − 2R1:
[ 1  1  3 |  4 ]
[ 0  1 −7 | −7 ]
[−1  2  2 |  1 ]

R3 + R1:
[ 1  1  3 |  4 ]
[ 0  1 −7 | −7 ]
[ 0  3  5 |  5 ]

Now we are done with the first column and look at the second column (ignoring the first row):

R3 − 3R2:
[ 1  1  3 |  4 ]
[ 0  1 −7 | −7 ]
[ 0  0 26 | 26 ]

(1/26) R3:
[ 1  1  3 |  4 ]
[ 0  1 −7 | −7 ]
[ 0  0  1 |  1 ]

Now the system is in row-echelon form. To put it in reduced row-echelon form, we can work from the bottom up:

R2 + 7R3:
[ 1  1  3 | 4 ]
[ 0  1  0 | 0 ]
[ 0  0  1 | 1 ]

R1 − R2:
[ 1  0  3 | 4 ]
[ 0  1  0 | 0 ]
[ 0  0  1 | 1 ]

R1 − 3R3:
[ 1  0  0 | 1 ]
[ 0  1  0 | 0 ]
[ 0  0  1 | 1 ]

And now from here it is very easy to read off the solution to the system: x = 1, y = 0, z = 1, which is the same answer we got when we did the row operations without using a matrix.
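• Aside: this reduction procedure is easy to mechanize. The following is a small Python sketch of Steps 1-5 (my own code and function name, nothing official from these notes); it works on an augmented matrix stored as a list of rows and uses exact fractions to avoid rounding.

    from fractions import Fraction

    def rref(rows):
        # Reduce a matrix (a list of rows) to reduced row-echelon form using only
        # the three elementary row operations; exact fractions avoid rounding error.
        m = [[Fraction(x) for x in row] for row in rows]
        nrows, ncols = len(m), len(m[0])
        pivot_row = 0
        for col in range(ncols):
            # find a row at or below pivot_row with a nonzero entry in this column
            pivot = next((r for r in range(pivot_row, nrows) if m[r][col] != 0), None)
            if pivot is None:
                continue                                    # no pivot here: free variable
            m[pivot_row], m[pivot] = m[pivot], m[pivot_row]             # operation 1: swap
            m[pivot_row] = [x / m[pivot_row][col] for x in m[pivot_row]]  # operation 2: rescale
            for r in range(nrows):                          # operation 3: clear the column
                if r != pivot_row and m[r][col] != 0:
                    factor = m[r][col]
                    m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
            pivot_row += 1
            if pivot_row == nrows:
                break
        return m

    # The augmented matrix of x + y + 3z = 4, 2x + 3y - z = 1, -x + 2y + 2z = 1:
    print(rref([[1, 1, 3, 4], [2, 3, -1, 1], [-1, 2, 2, 1]]))
    # prints (as exact Fractions) the rows [1,0,0,1], [0,1,0,0], [0,0,1,1]: x = 1, y = 0, z = 1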

• Example: Solve the system x + y − 2z = 3, −x + 3y − 5z = 1, 3x − y + z = 2.

The coefficient matrix is
[ 1  1 −2 | 3 ]
[−1  3 −5 | 1 ]
[ 3 −1  1 | 2 ]
Now put it in row-echelon form:

R2 + R1:
[ 1  1 −2 | 3 ]
[ 0  4 −7 | 4 ]
[ 3 −1  1 | 2 ]

R3 − 3R1:
[ 1  1 −2 |  3 ]
[ 0  4 −7 |  4 ]
[ 0 −4  7 | −7 ]

R3 + R2:
[ 1  1 −2 |  3 ]
[ 0  4 −7 |  4 ]
[ 0  0  0 | −3 ]

At this stage we have reached a contradiction, since the bottom row reads 0 = −3. Therefore, there is no solution.

• Example: Find all solutions to the system x − y + z − t + w = 5, −x + y − z + t = 4, 2t + 3w = 3.

The coefficient matrix is
[ 1 −1  1 −1  1 | 5 ]
[−1  1 −1  1  0 | 4 ]
[ 0  0  0  2  3 | 3 ]
Now put it in row-echelon form:

R2 + R1:
[ 1 −1  1 −1  1 | 5 ]
[ 0  0  0  0  1 | 9 ]
[ 0  0  0  2  3 | 3 ]

R2 ⇔ R3:
[ 1 −1  1 −1  1 | 5 ]
[ 0  0  0  2  3 | 3 ]
[ 0  0  0  0  1 | 9 ]

Now put it in reduced row-echelon form (the pivotal columns are 1, 4, and 5):

R2 − 3R3:
[ 1 −1  1 −1  1 |   5 ]
[ 0  0  0  2  0 | −24 ]
[ 0  0  0  0  1 |   9 ]

(1/2) R2:
[ 1 −1  1 −1  1 |   5 ]
[ 0  0  0  1  0 | −12 ]
[ 0  0  0  0  1 |   9 ]

R1 − R3:
[ 1 −1  1 −1  0 |  −4 ]
[ 0  0  0  1  0 | −12 ]
[ 0  0  0  0  1 |   9 ]

R1 + R2:
[ 1 −1  1  0  0 | −16 ]
[ 0  0  0  1  0 | −12 ]
[ 0  0  0  0  1 |   9 ]

And now we see that the general solution to the system is (x, y, z, t, w) = (y − z − 16, y, z, −12, 9), where y and z are our arbitrary free variables.

1.2 Matrix Operations: Addition, Multiplication

• Notation: If A is a matrix, we will denote by a_{i,j} the entry of A in the i-th column and j-th row of A. This entry will also be called the (i, j)-entry of A.

Memory aid: If you tend to get the coordinates mixed up, you can think of i and j as analogous to the x and y coordinates in the Cartesian plane: relative to the entry in the top left, the value of i indicates the horizontal position, and the value of j indicates the vertical position.

• We say two matrices are equal if all of their entries are equal.

• Like with vectors, we can add matrices of the same dimension, and multiply a matrix by a scalar. Each of these operations is done componentwise: to add, we just add corresponding entries of the two matrices; to multiply by a scalar, we just multiply each entry by that scalar.

Example: If A = [1 6; 2 2] and B = [3 0; 0 2], then A + B = [1+3 6+0; 2+0 2+2] = [4 6; 2 4], 2A = [2·1 2·6; 2·2 2·2] = [2 12; 4 4], and A − (1/3)B = [0 6; 2 4/3].

• We also have a transposition operation, where we interchange the rows and columns of the matrix. More explicitly, given an n × m matrix A, the transpose of A, denoted A^T, is the m × n matrix whose (i, j)-entry is equal to the (j, i)-entry of A.

Example: If A = [1 2 3; 4 5 6], then A^T = [1 4; 2 5; 3 6].

• Matrix multiplication, however, is NOT performed componentwise. Instead, the product of two matrices is the row-column product. Explicitly, if A is an m × n matrix and B is an n × q matrix, then the product AB is the m × q matrix whose (i, j)-entry is the dot product of the jth row of A and the ith column of B (where the rows and columns are thought of as vectors of length n).

More explicitly, in the column-row indexing convention above, the (i, j)-entry of AB is given by the sum of products (AB)_{i,j} = Σ_{k=1}^{n} a_{k,j} b_{i,k}.


Note: In order for the matrix product to exist, the number of columns of A must equal the number of rows of B. In particular, if A and B are the same size, their product exists only if they are square matrices. Also, if AB exists, then BA may not necessarily exist.
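• Aside: the row-column product is easy to phrase as code. Here is a small Python sketch (my own helper and function name, not part of the text); the comment states the rule in the usual row-by-column form.

    def mat_mult(A, B):
        # Row-column product: the entry in row r, column c of AB is the dot
        # product of row r of A with column c of B.
        n = len(B)                                   # rows of B = columns of A
        assert all(len(row) == n for row in A), "columns of A must match rows of B"
        return [[sum(A[r][k] * B[k][c] for k in range(n))
                 for c in range(len(B[0]))]
                for r in range(len(A))]

    print(mat_mult([[1, 2], [3, 4]], [[2, 3], [4, 5]]))   # [[10, 13], [22, 29]]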

• If you are wondering why matrix multiplication is defined this way, the general answer is: in order to make compositions of linear transformations of vector spaces work correctly. The more specific answer is: in order to solidify the relationship between solving systems of linear equations and matrices.

Motivating Example #1: The system of equations
x + y = 7
2x − 2y = −2
can be rewritten as a matrix equation
[1  1]   [x]   [ 7]
[2 −2] · [y] = [−2]
since the product on the left-hand side is the column matrix [x + y; 2x − 2y].

It is very useful to rewrite systems of equations as a single matrix equation; we will do this frequently.

Motivating Example #2: Consider what happens if we are given the equations x1 = y1 + y2, x2 = 2y1 − y2 and the equations y1 = 3z1 − z2, y2 = z1 − z2, and want to express x1 and x2 in terms of z1 and z2. You can just plug in and check that x1 = 4z1 − 2z2 and x2 = 5z1 − z2.

∗ In terms of matrices this says [x1; x2] = [1 1; 2 −1]·[y1; y2] and [y1; y2] = [3 −1; 1 −1]·[z1; z2].

∗ So we would want to be able to say [x1; x2] = [1 1; 2 −1]·[3 −1; 1 −1]·[z1; z2].

∗ Indeed, we have the matrix product [1 1; 2 −1]·[3 −1; 1 −1] = [4 −2; 5 −1]. So the definition of matrix multiplication makes everything consistent with what we'd want to happen.

• Example: If A = [−1 1 2; 0 1 1] and B = [1 2; 2 0; 3 3], then AB is defined and is a 2 × 2 matrix.

The (1, 1) entry of AB equals the dot product 〈−1, 1, 2〉 · 〈1, 2, 3〉 = (−1)(1) + (1)(2) + (2)(3) = 7.

The (2, 1) entry of AB equals the dot product 〈−1, 1, 2〉 · 〈2, 0, 3〉 = (−1)(2) + (1)(0) + (2)(3) = 4.

The (1, 2) entry of AB equals the dot product 〈0, 1, 1〉 · 〈1, 2, 3〉 = (0)(1) + (1)(2) + (1)(3) = 5.

The (2, 2) entry of AB equals the dot product 〈0, 1, 1〉 · 〈2, 0, 3〉 = (0)(2) + (1)(0) + (1)(3) = 3.

Putting all of this together gives AB = [7 4; 5 3].

• Example: If A = [−1 1 2; 0 1 1] and B = [1 2; 2 0; 3 3], then BA is also defined and is a 3 × 3 matrix.

The (1, 1) entry of BA equals the dot product 〈1, 2〉 · 〈−1, 0〉 = −1.

The (2, 1) entry of BA equals the dot product 〈1, 2〉 · 〈1, 1〉 = 3.

The (3, 1) entry of BA equals the dot product 〈1, 2〉 · 〈2, 1〉 = 4.

The (1, 2) entry of BA equals the dot product 〈2, 0〉 · 〈−1, 0〉 = −2.

The (2, 2) entry of BA equals the dot product 〈2, 0〉 · 〈1, 1〉 = 2.

The (3, 2) entry of BA equals the dot product 〈2, 0〉 · 〈2, 1〉 = 4.

The (1, 3) entry of BA equals the dot product 〈3, 3〉 · 〈−1, 0〉 = −3.

The (2, 3) entry of BA equals the dot product 〈3, 3〉 · 〈1, 1〉 = 6.

The (3, 3) entry of BA equals the dot product 〈3, 3〉 · 〈2, 1〉 = (3)(2) + (3)(1) = 9.

Putting all of this together gives BA = [−1 3 4; −2 2 4; −3 6 9].

• If we restrict our attention to square matrices, then matrices under addition and multiplication obey some, but not all, of the algebraic properties that real numbers do.

In general, multiplication is NOT commutative: AB typically isn't equal to BA, even if A and B are both square matrices.

∗ Example: A = [1 2; 3 4] and B = [2 3; 4 5]. Then AB = [10 13; 22 29] while BA = [11 16; 19 28].

Matrix multiplication distributes over addition, on both sides: (A + B)C = AC + BC and A(B + C) = AB + AC.

Matrix multiplication is associative: (AB)C = A(BC), if A, B, C are of the proper dimensions.

∗ In particular, taking the nth power of a square matrix is well-defined for every positive integer n.

∗ This is not easy to see from the definition.

The transpose of the product of two matrices is the product of their transposes, in reverse order: (AB)^T = B^T A^T.

If A is an n × n matrix, then there is a zero matrix Z_n which has the properties Z_n + A = A and Z_n · A = A · Z_n = Z_n. This matrix Z_n is the matrix whose entries are all zeroes.

∗ Example: The 2 × 2 zero matrix is [0 0; 0 0].

∗ Semi-Related Interesting Example: If A = [0 1; 0 0], then A is not the zero matrix, but A^2 is the zero matrix. This is in contrast to the real (or complex) numbers, where x^2 = 0 implies x = 0.

If A is an n × n matrix, then there is an n × n identity matrix I_n which has the property that I_n · A = A · I_n = A. This matrix I_n is the matrix whose diagonal entries are 1s and whose other entries are 0s.

∗ Example: The 2 × 2 identity matrix is I_2 = [1 0; 0 1]. It is not hard to check that [1 0; 0 1]·[a b; c d] = [a b; c d] = [a b; c d]·[1 0; 0 1] for any 2 × 2 matrix [a b; c d].

1.3 Determinants and Inverses

• Given a square n × n matrix A, we might like to know whether it has a multiplicative inverse: namely, a matrix A^(-1) (necessarily of the same dimension) such that A · A^(-1) = A^(-1) · A = I_n, where I_n is the n × n identity matrix.

Motivating Example: Suppose we have a system of linear equations, written in matrix form
A · ~x = ~c
for a matrix of coefficients
A = [ a_{1,1} · · · a_{n,1} ]
    [   ...    . . .   ...  ]
    [ a_{1,n} · · · a_{n,n} ]
a column vector ~x = (x1, . . . , xn) of variables, and a column vector ~c = (c1, . . . , cn) of constants.

∗ If we knew that A were invertible, then we could multiply both sides of the equation on the left by A^(-1) to see that A^(-1)A · ~x = A^(-1) · ~c.

∗ Now since A^(-1)A is the identity matrix we have A^(-1)A · ~x = ~x, so we see ~x = A^(-1) · ~c.

∗ In other words, if we knew the matrix A^(-1), we would immediately be able to write down the solution to the system.

• If A and B are invertible matrices with inverses A^(-1) and B^(-1), then AB is also an invertible matrix, with inverse B^(-1)A^(-1): observe that (AB)(B^(-1)A^(-1)) = A(BB^(-1))A^(-1) = A(I_n)A^(-1) = AA^(-1) = I_n, and similarly for the product in the other order.

By induction, one can verify that (A_1 A_2 · · · A_n)^(-1) = (A_n)^(-1) · · · (A_2)^(-1) (A_1)^(-1), provided that each of A_1, A_2, . . . , A_n is invertible.

• Not every matrix has a multiplicative inverse. We say a matrix is singular if it is not invertible, and we say it is nonsingular if it is invertible.

Obviously, [0 0; 0 0] does not have an inverse. (But we wouldn't expect the zero matrix to be invertible.)

But we can also check that [0 1; 0 0]·[a b; c d] = [c d; 0 0], so [0 1; 0 0] does not have an inverse either.

• Theorem: An n × n matrix A is invertible if and only if it is row-equivalent to the identity matrix.

To prove this theorem, simply consider the reduced row-echelon form of the matrix A. Because the matrix is square, the reduced row-echelon form is either the identity matrix, or a matrix with a row of all zeroes.

Suppose A is row-equivalent to the identity matrix. Then, because each elementary row operation corresponds to left-multiplication by an invertible matrix, we can write A as a product A = A_1 A_2 · · · A_k, where each of the A_i is an invertible matrix corresponding to an elementary row operation. Then A^(-1) = (A_k)^(-1) · · · (A_2)^(-1) (A_1)^(-1).

If A is not row-equivalent to the identity matrix, then its reduced row-echelon form A_red must contain a row of all zero entries. Clearly A_red cannot be invertible, and since A = A_1 A_2 · · · A_k A_red, if A had an inverse then so would A_red.

• In order to compute the inverse of an n × n matrix A using row reduction (or to see that it is non-invertible), use the following procedure:

Step 1: Set up the double matrix [A | I_n], where I_n is the identity matrix.

Step 2: Perform row operations to put A in reduced row-echelon form. (Carry the computations through on the entire matrix, but only pay attention to the A-side.)

Step 3: If A can be reduced to the n × n identity matrix, then A^(-1) will be what appears on the I_n-side of the double matrix. If A cannot be row-reduced to the n × n identity matrix, then A is not invertible.

• Among other things, this algorithm shows that the inverse of a matrix is unique.
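• Aside: Steps 1-3 can be automated by reusing the rref sketch from the Gauss-Jordan example earlier in these notes (again my own code): glue I_n onto the right of A, row-reduce, and read off the right half.

    def inverse(A):
        # Invert a square matrix by row-reducing the double matrix [A | I_n],
        # using rref() from the Gauss-Jordan sketch earlier in these notes.
        n = len(A)
        identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        doubled = [list(A[i]) + identity[i] for i in range(n)]
        reduced = rref(doubled)
        left, right = [row[:n] for row in reduced], [row[n:] for row in reduced]
        if left != identity:
            return None          # A is not row-equivalent to I_n, hence not invertible
        return right

    print(inverse([[1, 0, -1], [2, -1, 1], [0, 2, -5]]))
    # prints (as exact Fractions) the rows [-3, 2, 1], [-10, 5, 3], [-4, 2, 1]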

• Example: Find the inverse of the matrix A = [1 0 −1; 2 −1 1; 0 2 −5].

We set up the starting double matrix
[ 1  0 −1 | 1 0 0 ]
[ 2 −1  1 | 0 1 0 ]
[ 0  2 −5 | 0 0 1 ]
Now we perform row-reductions:

R2 − 2R1:
[ 1  0 −1 |  1 0 0 ]
[ 0 −1  3 | −2 1 0 ]
[ 0  2 −5 |  0 0 1 ]

R3 + 2R2:
[ 1  0 −1 |  1 0 0 ]
[ 0 −1  3 | −2 1 0 ]
[ 0  0  1 | −4 2 1 ]

R2 − 3R3:
[ 1  0 −1 |  1  0  0 ]
[ 0 −1  0 | 10 −5 −3 ]
[ 0  0  1 | −4  2  1 ]

R1 + R3:
[ 1  0  0 | −3  2  1 ]
[ 0 −1  0 | 10 −5 −3 ]
[ 0  0  1 | −4  2  1 ]

−R2:
[ 1  0  0 |  −3  2  1 ]
[ 0  1  0 | −10  5  3 ]
[ 0  0  1 |  −4  2  1 ]

We've reduced A to the identity matrix, and so A is invertible, with
A^(-1) = [ −3   2  1 ]
         [ −10  5  3 ]
         [ −4   2  1 ]

• We can work out the calculations explicitly in small cases. For example, the matrix [a b; c d] is invertible if and only if ad − bc ≠ 0, and if so, the inverse is given by (1/(ad − bc))·[d −b; −c a].
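For instance, [1 2; 3 4] has ad − bc = −2, so its inverse is (1/(−2))·[4 −2; −3 1] = [−2 1; 3/2 −1/2], and multiplying this against [1 2; 3 4] (in either order) does give the identity matrix.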

• We might like to know, without performing all of the row-reductions, whether A is invertible. This motivates the idea of the determinant, which will tell us precisely when a matrix is invertible.

• Definition: The determinant of a square matrix A, denoted det(A) or |A|, is defined inductively. For a 1 × 1 matrix [a] it is just the constant a. For an n × n matrix we compute the determinant via cofactor expansion: define A_{(i,j)} to be the matrix obtained from A by deleting the ith column and jth row. Then we set det(A) = Σ_{k=1}^{n} (−1)^{k+1} a_{k,1} det(A_{(k,1)}), which (in the indexing convention above) is an expansion along the first row of A.

Note: The calculation of the determinant this way is called expansion by minors. It can be shown that the same value results from expanding along any row or column.
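• Aside: the inductive definition translates directly into a short recursive Python sketch (my own code, expanding along the first row):

    def det(A):
        # Determinant by cofactor expansion along the first row (the inductive
        # definition above).
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for k in range(n):
            minor = [row[:k] + row[k+1:] for row in A[1:]]   # delete first row, column k
            total += (-1) ** k * A[0][k] * det(minor)
        return total

    print(det([[1, 2, 4], [-1, 1, 0], [-2, 1, 3]]))    # 13, as in the example below
    print(det([[1, 0, -1], [2, -1, 1], [0, 2, -5]]))   # -1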

The best way to understand determinants is to work out some examples.

• Example: The 2 × 2 determinant is given by |a b; c d| = ad − bc.

So, as particular cases, |1 2; 3 4| = (1)(4) − (2)(3) = −2 and |1 1; 2 2| = (1)(2) − (1)(2) = 0.

• Example: The general 3 × 3 determinant is given by
|a1 a2 a3; b1 b2 b3; c1 c2 c3| = a1·|b2 b3; c2 c3| − a2·|b1 b3; c1 c3| + a3·|b1 b2; c1 c2|.
As a particular case, |1 2 4; −1 1 0; −2 1 3| = 1·|1 0; 1 3| − 2·|−1 0; −2 3| + 4·|−1 1; −2 1| = 1(3) − 2(−3) + 4(1) = 13.

• Here are some very useful properties of the determinant:

Interchanging two rows multiplies the determinant by −1.

Multiplying all entries in one row by a constant scales the determinant by the same constant.

Adding or subtracting a scalar multiple of one row to another leaves the determinant unchanged.

If a matrix has a row of all zeroes, its determinant is zero. More generally, if one row is a scalar multiple of another, then its determinant is zero.

The determinant is multiplicative: det(AB) = det(A) det(B).

The determinant of the transpose matrix is the same as the original determinant: det(A^T) = det(A).

The determinant of any upper-triangular matrix (a matrix whose entries below the diagonal are all zeroes) is equal to the product of the diagonal entries. In particular, the determinant of the identity matrix is 1.

∗ Example: |6 −1 3; 0 2 0; 0 0 3| = (6)(2)(3) = 36.

A matrix A is invertible precisely when det(A) ≠ 0.

∗ We can see this by putting the matrix in reduced row-echelon form and applying the properties above.


∗ If A is invertible, then det(A^(-1)) = 1/det(A).

• With some more effort, we can even write down a formula for the inverse:

• Theorem: The matrix A is invertible precisely when det(A) ≠ 0, and in that case, A^(-1) = (1/det(A))·[adj(A)]^T, where adj(A) is the matrix whose (i, j)-entry is given by (−1)^{i+j} det(A_{(i,j)}).

The name adj(A) is short for adjugate.

Remember that [adj(A)]^T is the transpose of adj(A), and A_{(i,j)} is the matrix obtained from A by deleting the row and column containing the (i, j)-entry of A.
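• Aside: a small Python sketch of this cofactor formula (my own code, written with the usual row-first indexing and reusing det() from the determinant sketch above):

    from fractions import Fraction

    def adjugate_inverse(A):
        # Inverse via the cofactor formula: (1/det A) times the transpose of the
        # cofactor matrix.  Uses det() from the determinant sketch above.
        n = len(A)
        d = det(A)
        if d == 0:
            return None                      # det(A) = 0: the matrix is singular
        # cofactor[i][j] = (-1)^(i+j) * det of A with row i and column j deleted
        cof = [[(-1) ** (i + j) * det([row[:j] + row[j+1:]
                                       for r, row in enumerate(A) if r != i])
                for j in range(n)] for i in range(n)]
        # transpose the cofactor matrix and divide by the determinant
        return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

    print(adjugate_inverse([[1, 0, -1], [2, -1, 1], [0, 2, -5]]))
    # the same matrix [-3, 2, 1], [-10, 5, 3], [-4, 2, 1] found by row-reduction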

• Example: Use the formula to compute the inverse of the matrix A = [1 0 −1; 2 −1 1; 0 2 −5].

First, we have det(A) = 1·|−1 1; 2 −5| − 0·|2 1; 0 −5| − 1·|2 −1; 0 2| = 3 − 0 − 4 = −1.

Now we compute all of the entries of the adjugate matrix:

∗ The (1,1)-entry is +|−1 1; 2 −5| = 3, the (2,1)-entry is −|2 1; 0 −5| = 10, the (3,1)-entry is +|2 −1; 0 2| = 4.

∗ The (1,2)-entry is −|0 −1; 2 −5| = −2, the (2,2)-entry is +|1 −1; 0 −5| = −5, the (3,2)-entry is −|1 0; 0 2| = −2.

∗ The (1,3)-entry is +|0 −1; −1 1| = −1, the (2,3)-entry is −|1 −1; 2 1| = −3, the (3,3)-entry is +|1 0; 2 −1| = −1.

Thus, we have adj(A) = [3 10 4; −2 −5 −2; −1 −3 −1]. Since det(A) = −1, we thus obtain A^(-1) = (1/det(A))·[adj(A)]^T = [−3 2 1; −10 5 3; −4 2 1].

Note that this is the same answer we obtained by doing row-reductions.

1.4 Matrices and Systems of Linear Equations, revisited

• As an application of the utility of the matrix approach, let us revisit systems of linear equations.

• First suppose we have a homogeneous system of k equations in n variables, of the form
a_{1,1} x1 + · · · + a_{n,1} xn = 0
...
a_{1,k} x1 + · · · + a_{n,k} xn = 0

Let
A = [ a_{1,1} · · · a_{n,1} ]
    [   ...    . . .   ...  ]
    [ a_{1,k} · · · a_{n,k} ]
be the matrix of coefficients, ~x = (x1, . . . , xn) the column vector of variables, and ~0 = (0, . . . , 0) the zero column vector.


Then in matrix form, the system takes the much simpler form A · ~x = ~0.

Clearly, ~x = ~0 (i.e., all variables equal to zero) is a solution.

If we have any solution ~v (so that A · ~v = ~0), then for any constant c, the vector c~v is also a solution, because A · (c~v) = c(A · ~v) = c(~0) = ~0.

∗ In other words, any scalar multiple of a solution to a homogeneous system is also a solution to the system.

∗ Explicit example: Given the system x1 + x2 + x3 = 0, with the solution x1 = 1, x2 = 1, x3 = −2, then we claim that x1 = c, x2 = c, x3 = −2c is also a solution. (Which it is.)

If we have any two solutions ~v and ~w (so that A · ~v = A · ~w = ~0), then ~v + ~w is also a solution, because A · (~v + ~w) = A · ~v + A · ~w = ~0 + ~0 = ~0.

∗ In other words, the sum of any two solutions to a homogeneous system is also a solution to the system.

∗ Explicit example: Given the system x1 + x2 + x3 = 0, with the solutions (x1, x2, x3) = (1, 1, −2) and (x1, x2, x3) = (−3, −1, 4), then we claim that (x1, x2, x3) = (1 − 3, 1 − 1, −2 + 4) = (−2, 0, 2) is also a solution. (Which it is.)

This means that the set of solutions to the homogeneous linear system forms a vector space (which we will discuss in much more detail later).

• Now suppose we have a general system of n linear equations in n variables, written in matrix form
A · ~x = ~c
for a square matrix of coefficients A (with entries a_{i,j}, as above), a column vector ~x = (x1, . . . , xn) of variables, and a column vector ~c = (c1, . . . , cn) of constants.

We claim that every solution to this system (if there is one) is of the form ~x = ~v_particular + ~v_homogeneous, where ~v_particular is any one solution to the general system A · ~x = ~c, and ~v_homogeneous is a solution to the homogeneous system A · ~x = ~0.

∗ To see this, first observe that if A · ~v_particular = ~c and A · ~v_homogeneous = ~0, then
A · (~v_particular + ~v_homogeneous) = A · ~v_particular + A · ~v_homogeneous = ~c + ~0 = ~c,
and so ~v_particular + ~v_homogeneous is also a solution to the original system.

∗ Conversely, if ~v and ~w are two solutions to the original system, then A · (~v − ~w) = A · ~v − A · ~w = ~c − ~c = ~0, so that ~v − ~w is a solution to the homogeneous system.

∗ The upshot of this result is: if we can find one solution to the original system, then we can find all of them just by solving the homogeneous system.

If A is invertible, then we can multiply both sides of the equation on the left by A^(-1) to see that ~x = A^(-1) · ~c. In particular, the system has a unique solution.

If A is not invertible, then the homogeneous system has infinitely many solutions (as the reduced row-echelon form of A must have a row of all zeroes, and hence at least one free variable). Then the original system either has no solutions, or infinitely many solutions.
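• Aside: in the invertible case, the recipe ~x = A^(-1) · ~c is one line of code on top of the earlier sketches (the right-hand side ~c below is my own example vector, not one from the text):

    # Solving A x = c as x = A^(-1) c, reusing inverse() and mat_mult() from the
    # sketches above.
    A = [[1, 0, -1], [2, -1, 1], [0, 2, -5]]
    c = [[1], [2], [3]]                      # c as a 3 x 1 column matrix
    print(mat_mult(inverse(A), c))           # the unique solution: (4, 9, 3)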

Well, you're at the end of my handout. Hope it was helpful.

Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 3) : Vector Spaces (by Evan Dummit, 2012, v. 1.00)

Contents

1 Vector Spaces
1.1 Review of Vectors in R^n
1.2 Formal Definition of Vector Spaces
1.3 Subspaces
1.4 Span, Independence, Bases, Dimension
1.4.1 Linear Combinations and Span
1.4.2 Linear Independence
1.4.3 Bases and Dimension

1 Vector Spaces

1.1 Review of Vectors in Rn

• A vector, as we typically think of it, is a quantity which has both a magnitude and a direction. This is in contrast to a scalar, which carries only a magnitude.

Real-valued vectors are extremely useful in just about every aspect of the physical sciences, since just about everything in Newtonian physics is a vector: position, velocity, acceleration, forces, and so on. There is also vector calculus (calculus in the context of vector fields), which is typically part of a multivariable calculus course; it has many applications to physics as well.

• We often think of vectors geometrically, as directed line segments (having a starting point and an ending point).

• We denote the n-dimensional vector from the origin to the point (a1, a2, · · · , an) as ~v = 〈a1, a2, · · · , an〉, where the ai are scalars.

Some vectors: 〈1, 2〉, 〈3, 5, −1〉, 〈−π, e^2, 27, 3, 4/3, 0, 0, −1〉.

Notation: I prefer to use angle brackets 〈·〉 rather than parentheses (·) so as to draw a visual distinction between a vector and the coordinates of a point in space. I also draw arrows above vectors or typeset them in boldface (thus ~v or v), in order to set them apart from scalars. This is not standard notation everywhere; many other authors use regular parentheses for vectors.

• Note/Warning: Vectors are a little bit different from directed line segments, because we don't care where a vector starts: we only care about the difference between the starting and ending positions. Thus the directed segment whose start is (0, 0) and end is (1, 1), and the segment starting at (1, 1) and ending at (2, 2), represent the same vector, 〈1, 1〉.

• We can add vectors (provided they are of the same length!) in the obvious way, one component at a time: if ~v = 〈a1, · · · , an〉 and ~w = 〈b1, · · · , bn〉 then ~v + ~w = 〈a1 + b1, · · · , an + bn〉.

We can justify this using our geometric idea of what a vector does: ~v moves us from the origin to the point (a1, · · · , an). Then ~w tells us to add 〈b1, · · · , bn〉 to the coordinates of our current position, and so ~w moves us from (a1, · · · , an) to (a1 + b1, · · · , an + bn). So the net result is that the sum vector ~v + ~w moves us from the origin to (a1 + b1, · · · , an + bn), meaning that it is just the vector 〈a1 + b1, · · · , an + bn〉.

Another way (though it's really the same way) to think of vector addition is via the parallelogram diagram, whose pairs of parallel sides are ~v and ~w, and whose long diagonal is ~v + ~w.


• We can also 'scale' a vector by a scalar, one component at a time: if r is a scalar, then we have r·~v = 〈r a1, · · · , r an〉.

Again, we can justify this by our geometric idea of what a vector does: if ~v moves us some amount in a direction, then (1/2)~v should move us half as far in that direction. Analogously, 2~v should move us twice as far in that direction, and −~v should move us exactly as far, but in the opposite direction.

• Example: If ~v = 〈−1, 2, 2〉 and ~w = 〈3, 0, −4〉 then 2~w = 〈6, 0, −8〉 and ~v + ~w = 〈2, 2, −2〉. Furthermore, ~v − 2~w = 〈−7, 2, 10〉.

• The arithmetic of vectors in R^n satisfies several algebraic properties (which follow more or less directly from the definition):

Addition of vectors is commutative and associative.

There is a zero vector (namely, the vector with all entries zero) which acts as an additive identity, and every vector has an additive inverse.

Scalar multiplication distributes over addition of both vectors and scalars.

1.2 Formal Definition of Vector Spaces

• The two operations of addition and scalar multiplication (and the various algebraic properties they satisfy) are the key properties of vectors in R^n. We would like to investigate other collections of things which possess the same properties as vectors in R^n.

• Definition: A (real) vector space is a collection V of vectors together with two binary operations, addition of vectors (+) and scalar multiplication of a vector by a real number (·), satisfying the following axioms:

Let ~v, ~v1, ~v2, ~v3 be any vectors and α, α1, α2 be any (real number) scalars.

Note: The statement that + and · are binary operations means that ~v1 + ~v2 and α · ~v are always defined.

[A1] Addition is commutative: ~v1 + ~v2 = ~v2 + ~v1.

[A2] Addition is associative: (~v1 + ~v2) + ~v3 = ~v1 + (~v2 + ~v3).

[A3] There exists a zero vector ~0, with ~v + ~0 = ~v.

[A4] Every vector ~v has an additive inverse −~v, with ~v + (−~v) = ~0.

[M1] Scalar multiplication is consistent with regular multiplication: α1 · (α2 · ~v) = (α1 α2) · ~v.

[M2] Addition of scalars distributes: (α1 + α2) · ~v = α1 · ~v + α2 · ~v.

[M3] Addition of vectors distributes: α · (~v1 + ~v2) = α · ~v1 + α · ~v2.

[M4] The scalar 1 acts like the identity on vectors: 1 · ~v1 = ~v1.

• Important Remark: One may also consider vector spaces where the collection of scalars is something other than the real numbers; for example, there exists an equally important notion of a complex vector space, whose scalars are the complex numbers. (The axioms are the same.)

We will principally consider real vector spaces, in which the scalars are the real numbers.

The most general notion of a vector space involves scalars from a field, which is a collection of numbers which possess addition and multiplication operations which are commutative, associative, and distributive, with an additive identity 0 and multiplicative identity 1, such that every element has an additive inverse and every nonzero element has a multiplicative inverse.

Aside from the real and complex numbers, another example of a field is the rational numbers (i.e., fractions). One can formulate an equally interesting theory of vector spaces over the rational numbers.

• Examples: Here are some examples of vector spaces:

The vectors in R^n are a vector space, for any n > 0. (This had better be true!)

∗ In particular, if we take n = 1, then we see that the real numbers themselves are a vector space.

∗ Note: For simplicity I will demonstrate all of the axioms for vectors in R^2; there, the vectors are of the form 〈x, y〉 and scalar multiplication is defined as α · 〈x, y〉 = 〈αx, αy〉.

∗ [A1]: We have 〈x1, y1〉 + 〈x2, y2〉 = 〈x1 + x2, y1 + y2〉 = 〈x2, y2〉 + 〈x1, y1〉.

∗ [A2]: We have (〈x1, y1〉 + 〈x2, y2〉) + 〈x3, y3〉 = 〈x1 + x2 + x3, y1 + y2 + y3〉 = 〈x1, y1〉 + (〈x2, y2〉 + 〈x3, y3〉).

∗ [A3]: The zero vector is 〈0, 0〉, and clearly 〈x, y〉 + 〈0, 0〉 = 〈x, y〉.

∗ [A4]: The additive inverse of 〈x, y〉 is 〈−x, −y〉, since 〈x, y〉 + 〈−x, −y〉 = 〈0, 0〉.

∗ [M1]: We have α1 · (α2 · 〈x, y〉) = 〈α1α2 x, α1α2 y〉 = (α1α2) · 〈x, y〉.

∗ [M2]: We have (α1 + α2) · 〈x, y〉 = 〈(α1 + α2)x, (α1 + α2)y〉 = α1 · 〈x, y〉 + α2 · 〈x, y〉.

∗ [M3]: We have α · (〈x1, y1〉 + 〈x2, y2〉) = 〈α(x1 + x2), α(y1 + y2)〉 = α · 〈x1, y1〉 + α · 〈x2, y2〉.

∗ [M4]: Finally, we have 1 · 〈x, y〉 = 〈x, y〉.

The zero space with a single element ~0, with ~0 +~0 = ~0 and α ·~0 = ~0 for every α, is a vector space.

∗ All of the axioms in this case eventually boil down to ~0 = ~0.

∗ This space is rather boring: since it only contains one element, there's really not much to say about it.

The set of m × n matrices, for any m and any n, forms a vector space.

∗ The various algebraic properties we know about matrix addition give [A1] and [A2] along with [M1], [M2], [M3], and [M4].

∗ The zero vector in this vector space is the zero matrix (with all entries zero), and [A3] and [A4] follow easily.

∗ Note of course that in some cases we can also multiply matrices by other matrices. However, the requirements for being a vector space don't care that we can multiply matrices by other matrices! (All we need to be able to do is add them and multiply them by scalars.)

The complex numbers a + bi, where i^2 = −1, are a vector space.

∗ The axioms all follow from the standard properties of complex numbers. As might be expected, the zero vector is just the complex number 0 = 0 + 0i.

∗ Again, note that the complex numbers have more structure to them, because we can also multiply two complex numbers, and the multiplication is also commutative, associative, and distributive over addition. However, the requirements for being a vector space don't care that the complex numbers have these additional properties.

The collection of all real-valued functions on any part of the real line is a vector space, where we define the sum of two functions as (f + g)(x) = f(x) + g(x) for every x, and scalar multiplication as (α · f)(x) = α f(x).

∗ To illustrate: if f(x) = x and g(x) = x^2, then f + g is the function with (f + g)(x) = x + x^2, and 2f is the function with (2f)(x) = 2x.

∗ The axioms follow from the properties of functions and real numbers. The zero vector in this space is the zero function; namely, the function z which has z(x) = 0 for every x.

∗ For example (just to demonstrate a few of the axioms), for any value x and any functions f and g, we have

· [A1]: (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).

· [M3]: α · (f + g)(x) = α f(x) + α g(x) = (αf)(x) + (αg)(x).

· [M4]: (1 · f)(x) = f(x).

• There are many simple algebraic properties that can be derived from the axioms (and thus are true in every vector space), using some amount of cleverness. For example:

1. Addition has a cancellation law: for any vector ~v, if ~v + ~a = ~v + ~b then ~a = ~b.

Idea: Add −~v to both sides and then use [A1]-[A4] to rearrange (~v + ~a) + (−~v) = (~v + ~b) + (−~v) to ~a = ~b.

2. The zero vector is unique: for any vector ~v, if ~v + ~a = ~v, then ~a = ~0.

Idea: Use property (1) applied when ~b = ~0.

3. The additive inverse is unique: for any vector ~v, if ~v + ~a = ~0 then ~a = −~v.

Idea: Use property (1) applied when ~b = −~v.

4. The scalar 0 times any vector gives the zero vector: 0 · ~v = ~0 for any vector ~v.

Idea: Expand ~v = (1 + 0) · ~v = ~v + 0 · ~v via [M2] and [M4], and then apply property (2).

5. Any scalar times the zero vector is the zero vector: α · ~0 = ~0 for any scalar α.

Idea: Expand α · ~0 = α · (~0 + ~0) = α · ~0 + α · ~0 via [M3] and then apply property (1).

6. The scalar −1 times any vector gives the additive inverse: (−1) · ~v = −~v for any vector ~v.

Idea: Use property (4) and [M2], [M4] to write ~0 = 0 · ~v = (1 + (−1)) · ~v = ~v + (−1) · ~v, and then use property (1) with ~a = −~v.

7. The additive inverse of the additive inverse is the original vector: −(−~v) = ~v for any vector ~v.

Idea: Use property (6) and [M1], [M4] to write −(−~v) = (−1)^2 · ~v = 1 · ~v = ~v.

1.3 Subspaces

• Definition: A subspace W of a vector space V is a subset of the vector space V which, under the same addition and scalar multiplication operations as V, is itself a vector space.

Very often, if we want to check that something is a vector space, it is much easier to verify that it is a subspace of something else we already know is a vector space.

∗ We will make use of this idea when we talk about the solutions to a homogeneous linear differential equation (see the examples below), and prove that the solutions form a vector space merely by checking that they are a subspace of the set of all functions, rather than going through all of the axioms.

We are aided by the following criterion, which tells us exactly what properties a subspace must satisfy:

• Theorem (Subspace Criterion): To check that W is a subspace of V, it is enough to check the following three properties:

[S1] W contains the zero vector of V.

[S2] W is closed under addition: For any ~w1 and ~w2 in W, the vector ~w1 + ~w2 is also in W.

[S3] W is closed under scalar multiplication: For any scalar α and ~w in W, the vector α · ~w is also in W.

• The reason we don't need to check everything to verify that a collection of vectors forms a subspace is that most of the axioms will automatically be satisfied in W because they're true in V.

As long as all of the operations are defined, axioms [A1]-[A2] and [M1]-[M4] will hold in W because they hold in V. But we need to make sure we can always add and scalar-multiply, which is why we need [S2] and [S3].

In order to get axiom [A3] for W, we need to know that the zero vector is in W, which is why we need [S1].

In order to get axiom [A4] for W, we can use the result that (−1) · ~w = −~w to see that closure under scalar multiplication automatically gives additive inverses.

• Remark: Any vector space automatically has two easy subspaces: the entire space V, and the trivial subspace consisting only of the zero vector.

• Examples: Here is a rather long list of examples of less trivial subspaces (of vector spaces which are of interest to us):

The vectors of the form 〈t, t, t〉 are a subspace of R^3. [This is the line x = y = z.]

∗ [S1]: The zero vector is of this form: take t = 0.

∗ [S2]: We have 〈t1, t1, t1〉 + 〈t2, t2, t2〉 = 〈t1 + t2, t1 + t2, t1 + t2〉, which is again of the same form if we take t = t1 + t2.

∗ [S3]: We have α · 〈t1, t1, t1〉 = 〈αt1, αt1, αt1〉, which is again of the same form if we take t = αt1.

The vectors of the form 〈s, t, 0〉 are a subspace of R^3. [This is the xy-plane, aka the plane z = 0.]

∗ [S1]: The zero vector is of this form: take s = t = 0.

∗ [S2]: We have 〈s1, t1, 0〉 + 〈s2, t2, 0〉 = 〈s1 + s2, t1 + t2, 0〉, which is again of the same form, if we take s = s1 + s2 and t = t1 + t2.

∗ [S3]: We have α · 〈s1, t1, 0〉 = 〈αs1, αt1, 0〉, which is again of the same form, if we take s = αs1 and t = αt1.

The vectors 〈x, y, z〉 with 2x − y + z = 0 are a subspace of R^3.

∗ [S1]: The zero vector is of this form, since 2(0) − 0 + 0 = 0.

∗ [S2]: If 〈x1, y1, z1〉 and 〈x2, y2, z2〉 have 2x1 − y1 + z1 = 0 and 2x2 − y2 + z2 = 0, then adding the equations shows that the sum 〈x1 + x2, y1 + y2, z1 + z2〉 also lies in the space.

∗ [S3]: If 〈x1, y1, z1〉 has 2x1 − y1 + z1 = 0, then scaling the equation by α shows that 〈αx1, αy1, αz1〉 also lies in the space.

More generally, the collection of solution vectors 〈x1, · · · , xn〉 to any homogeneous equation, or system of m homogeneous equations, forms a subspace of R^n.

∗ It is possible to check this directly by working with equations. But it is much easier to use matrices: write the system in matrix form, as A~x = ~0, where ~x = 〈x1, · · · , xn〉 is a solution vector.

∗ [S1]: We have A~0 = ~0, by the properties of the zero vector.

∗ [S2]: If ~x and ~y are two solutions, the properties of matrix arithmetic imply A(~x + ~y) = A~x + A~y = ~0 + ~0 = ~0, so that ~x + ~y is also a solution.

∗ [S3]: If α is a scalar and ~x is a solution, then A(α · ~x) = α · (A~x) = α · ~0 = ~0, so that α · ~x is also a solution.

The collection of 2 × 2 matrices of the form [a b; 0 a] is a subspace of the space of all 2 × 2 matrices.

∗ [S1]: The zero matrix is of this form, with a = b = 0.

∗ [S2]: We have [a1 b1; 0 a1] + [a2 b2; 0 a2] = [a1 + a2, b1 + b2; 0, a1 + a2], which is also of this form.

∗ [S3]: We have α · [a1 b1; 0 a1] = [αa1 αb1; 0 αa1], which is also of this form.

The collection of complex numbers of the form a + 2πai is a subspace of the complex numbers.

∗ The three requirements should be second nature by now!

The collection of continuous functions on [a, b] is a subspace of the space of all functions on [a, b].

∗ [S1]: The zero function is continuous.

∗ [S2]: The sum of two continuous functions is continuous, from basic calculus.

∗ [S3]: The product of continuous functions is continuous, so in particular a constant times a continuous function is continuous.

The collection of n-times differentiable functions on [a, b] is a subspace of the space of continuous functions on [a, b], for any positive integer n.

∗ The zero function is differentiable, as is the sum of any two functions which are differentiable n times, and any constant multiple of such a function.

The collection of all polynomials is a vector space.

∗ Observe that polynomials are functions on the entire real line. Therefore, it is sufficient to verify the subspace criteria.

∗ The zero function is a polynomial, as is the sum of two polynomials, and any scalar multiple of a polynomial.

The collection of solutions to the (homogeneous, linear) differential equation y′′ + 6y′ + 5y = 0 forms a vector space.

∗ We show this by verifying that the solutions form a subspace of the space of all functions.

∗ [S1]: The zero function is a solution.

∗ [S2]: If y1 and y2 are solutions, then y1′′ + 6y1′ + 5y1 = 0 and y2′′ + 6y2′ + 5y2 = 0, so adding and using properties of derivatives shows that (y1 + y2)′′ + 6(y1 + y2)′ + 5(y1 + y2) = 0, so y1 + y2 is also a solution.

∗ [S3]: If α is a scalar and y1 is a solution, then scaling y1′′ + 6y1′ + 5y1 = 0 by α and using properties of derivatives shows that (αy1)′′ + 6(αy1)′ + 5(αy1) = 0, so αy1 is also a solution.

∗ Note: Observe that we can say something about what the set of solutions to this equation looks like, namely that it is a vector space, without actually solving it!

· For completeness, the solutions are y = A e^(−x) + B e^(−5x) for any constants A and B. From here, if we wanted to, we could directly verify that such functions form a vector space.

The collection of solutions to any nth-order homogeneous linear differential equation y^(n) + P_n(x)·y^(n−1) + · · · + P_2(x)·y′ + P_1(x)·y = 0, for continuous functions P_1(x), · · · , P_n(x), forms a vector space.

∗ Note that y^(n) means the nth derivative of y.

∗ As in the previous example, we show this by verifying that the solutions form a subspace of the space of all functions.

∗ [S1]: The zero function is a solution.

∗ [S2]: If y1 and y2 are solutions, then adding the equations y1^(n) + P_n(x)·y1^(n−1) + · · · + P_1(x)·y1 = 0 and y2^(n) + P_n(x)·y2^(n−1) + · · · + P_1(x)·y2 = 0 and using properties of derivatives shows that (y1 + y2)^(n) + P_n(x)·(y1 + y2)^(n−1) + · · · + P_1(x)·(y1 + y2) = 0, so y1 + y2 is also a solution.

∗ [S3]: If α is a scalar and y1 is a solution, then scaling y1^(n) + P_n(x)·y1^(n−1) + · · · + P_2(x)·y1′ + P_1(x)·y1 = 0 by α and using properties of derivatives shows that (αy1)^(n) + P_n(x)·(αy1)^(n−1) + · · · + P_1(x)·(αy1) = 0, so αy1 is also a solution.

∗ Note: This example is a fairly significant amount of the reason we are interested in linear algebra (as it relates to differential equations): because the solutions to homogeneous linear differential equations form a vector space. In general, for arbitrary functions P_1(x), · · · , P_n(x), it is not possible to solve the differential equation explicitly for y; nonetheless, we can still say something about what the solutions look like.

1.4 Span, Independence, Bases, Dimension

• One thing we would like to know, now that we have the definition of a vector space and a subspace, is what else we can say about elements of a vector space; i.e., we would like to know what kind of structure the elements of a vector space have.

• In some of the earlier examples we saw that, in R^n and a few other vector spaces, subspaces could all be written down in terms of one or more parameters. In order to discuss this idea more precisely, we first need some terminology.

1.4.1 Linear Combinations and Span

• Definition: Given a set ~v1, · · · , ~vn of vectors, we say a vector ~w is a linear combination of ~v1, · · · , ~vn if there exist scalars a1, · · · , an such that ~w = a1 · ~v1 + · · · + an · ~vn.

Example: In R^2, the vector 〈1, 1〉 is a linear combination of 〈1, 0〉 and 〈0, 1〉, because 〈1, 1〉 = 1 · 〈1, 0〉 + 1 · 〈0, 1〉.

Example: In R^4, the vector 〈4, 0, 5, 9〉 is a linear combination of 〈1, −1, 2, 3〉, 〈0, 1, 0, 0〉, and 〈1, 1, 1, 2〉, because 〈4, 0, 5, 9〉 = 1 · 〈1, −1, 2, 3〉 − 2 · 〈0, 1, 0, 0〉 + 3 · 〈1, 1, 1, 2〉.

Non-Example: In R^3, the vector 〈0, 0, 1〉 is not a linear combination of 〈1, 1, 0〉 and 〈0, 1, 1〉 because there exist no scalars a1 and a2 for which a1 · 〈1, 1, 0〉 + a2 · 〈0, 1, 1〉 = 〈0, 0, 1〉: this would require a common solution to the three equations a1 = 0, a1 + a2 = 0, and a2 = 1, and this system has no solution.
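• Aside: deciding whether ~w is a linear combination of given vectors is exactly the problem of solving a linear system, so it can be checked mechanically with the rref sketch from the earlier matrix notes (the helper below and its name are my own):

    def in_span(vs, w):
        # Is w a linear combination of the vectors in vs?  Solve for the scalars:
        # the columns of the coefficient matrix are the vs, the constants are w.
        # Uses rref() from the Gauss-Jordan sketch in the matrix notes.
        rows = [[v[i] for v in vs] + [w[i]] for i in range(len(w))]
        reduced = rref(rows)
        # the system is inconsistent exactly when some row reads 0 = nonzero
        return not any(all(x == 0 for x in row[:-1]) and row[-1] != 0
                       for row in reduced)

    print(in_span([[1, 1, 0], [0, 1, 1]], [0, 0, 1]))   # False, as in the non-example above
    print(in_span([[1, 0, 0], [0, 1, 0]], [3, -2, 0]))  # True: 3*(1,0,0) + (-2)*(0,1,0)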


• Definition: We define the span of vectors ~v1, · · · , ~vn, denoted span(~v1, · · · , ~vn), to be the set W of all vectors which are linear combinations of ~v1, · · · , ~vn. Explicitly, the span is the set of vectors of the form a1 · ~v1 + · · · + an · ~vn, for some scalars a1, · · · , an.

Remark 1: The span is always a subspace: the zero vector can be written as 0 · ~v1 + · · · + 0 · ~vn, and the span is closed under addition and scalar multiplication.

Remark 2: The span is, in fact, the smallest subspace W containing the vectors ~v1, · · · , ~vn: for any scalars a1, · · · , an, closure under scalar multiplication requires each of a1~v1, a2~v2, · · · , an~vn to be in W. Then closure under vector addition forces the sum a1~v1 + · · · + an~vn to be in W.

Remark 3: For technical reasons, we define the span of the empty set to be {~0}, the set containing just the zero vector.

Example: The span of the vectors 〈1, 0, 0〉 and 〈0, 1, 0〉 in R^3 is the set of vectors of the form a · 〈1, 0, 0〉 + b · 〈0, 1, 0〉 = 〈a, b, 0〉.

∗ Equivalently, the span of these vectors is the set of vectors whose z-coordinate is zero, i.e., the plane z = 0.

• Definition: Given a vector space V, if the span of vectors ~v1, · · · , ~vn is all of V, we say that ~v1, · · · , ~vn are a generating set for V, or that they generate V.

Example: The three vectors 〈1, 0, 0〉, 〈0, 1, 0〉, and 〈0, 0, 1〉 generate R^3, since for any vector 〈a, b, c〉 we can write 〈a, b, c〉 = a · 〈1, 0, 0〉 + b · 〈0, 1, 0〉 + c · 〈0, 0, 1〉.

1.4.2 Linear Independence

• Definition: We say a finite set of vectors ~v1, · · · , ~vn is linearly independent if a1 · ~v1 + · · · + an · ~vn = ~0 implies a1 = · · · = an = 0. (Otherwise, we say the collection is linearly dependent.)

Note: For an infinite set of vectors, we say it is linearly independent if every finite subset is linearly independent (per the definition above); otherwise (if some finite subset displays a dependence) we say it is dependent.

In other words, ~v1, · · · , ~vn are linearly independent precisely when the only way to form the zero vector as a linear combination of ~v1, · · · , ~vn is to have all the scalars equal to zero.

An equivalent way of thinking of linear (in)dependence is that a set is dependent if one of the vectors is a linear combination of the others, i.e., it depends on the others. Explicitly, if a1 · ~v1 + a2 · ~v2 + · · · + an · ~vn = ~0 and a1 ≠ 0, then we can rearrange to see that ~v1 = −(1/a1)·(a2 · ~v2 + · · · + an · ~vn).

Example: The vectors 〈1, 1, 0〉 and 〈0, 2, 1〉 in R^3 are linearly independent, because if we have scalars a and b with a · 〈1, 1, 0〉 + b · 〈0, 2, 1〉 = 〈0, 0, 0〉, then comparing the two sides requires a = 0, a + 2b = 0, b = 0, which has only the solution a = b = 0.

Example: The vectors 〈1, 1, 0〉 and 〈2, 2, 0〉 in R^3 are linearly dependent, because we can write 2 · 〈1, 1, 0〉 + (−1) · 〈2, 2, 0〉 = 〈0, 0, 0〉. Or, in the equivalent formulation, we have 〈2, 2, 0〉 = 2 · 〈1, 1, 0〉.

Example: The vectors 〈1, 0, 2, 2〉, 〈2, −2, 0, 3〉, 〈0, 3, 3, 1〉, and 〈0, 4, 2, 1〉 in R^4 are linearly dependent, because we can write 2 · 〈1, 0, 2, 2〉 + (−1) · 〈2, −2, 0, 3〉 + (−2) · 〈0, 3, 3, 1〉 + 1 · 〈0, 4, 2, 1〉 = 〈0, 0, 0, 0〉.

• Theorem: The vectors ~v1, · · · , ~vn are linearly independent if and only if every vector ~w in the span of ~v1, · · · , ~vn may be uniquely written as a sum ~w = a1 · ~v1 + a2 · ~v2 + · · · + an · ~vn.

For one direction, if the decomposition is always unique, then a1 · ~v1 + · · · + an · ~vn = ~0 implies a1 = · · · = an = 0, because 0 · ~v1 + · · · + 0 · ~vn = ~0 is by assumption the only decomposition of ~0.

For the other direction, suppose we had two different ways of decomposing a vector ~w, say as ~w = a1 · ~v1 + · · · + an · ~vn and ~w = b1 · ~v1 + · · · + bn · ~vn.

Then subtracting and rearranging the difference between these two equations yields ~w − ~w = (a1 − b1) · ~v1 + · · · + (an − bn) · ~vn.

Now ~w − ~w is the zero vector, so we have (a1 − b1) · ~v1 + · · · + (an − bn) · ~vn = ~0.

But now because ~v1, · · · , ~vn are linearly independent, we see that all of the scalar coefficients a1 − b1, · · · , an − bn are zero. But this says a1 = b1, a2 = b2, . . . , an = bn, which is to say, the two decompositions are actually the same.


1.4.3 Bases and Dimension

• Definition: A linearly independent set of vectors which generates V is called a basis for V.

Terminology Note: The plural form of the (singular) word basis is bases.

Example: The three vectors 〈1, 0, 0〉, 〈0, 1, 0〉, and 〈0, 0, 1〉 generate R^3, as we saw above. They are also linearly independent, since a · 〈1, 0, 0〉 + b · 〈0, 1, 0〉 + c · 〈0, 0, 1〉 is the zero vector only when a = b = c = 0. Thus, these three vectors are a basis for R^3.

Example: More generally, in R^n, the standard unit vectors e1, e2, · · · , en (where ej has a 1 in the jth coordinate and 0s elsewhere) are a basis.

Non-Example: The vectors 〈1, 1, 0〉 and 〈0, 2, 1〉 in R^3 are not a basis, as they fail to generate V: it is not possible to obtain the vector 〈1, 0, 0〉 as a linear combination of 〈1, 1, 0〉 and 〈0, 2, 1〉.

Non-Example: The vectors 〈1, 0, 0〉, 〈0, 1, 0〉, 〈0, 0, 1〉, and 〈1, 1, 1〉 in R^3 are not a basis, as they are linearly dependent: we have 1 · 〈1, 0, 0〉 + 1 · 〈0, 1, 0〉 + 1 · 〈0, 0, 1〉 + (−1) · 〈1, 1, 1〉 = 〈0, 0, 0〉.

Example: The polynomials 1, x, x^2, x^3, · · · are a basis for the vector space of all polynomials.

∗ First observe that 1, x, x^2, x^3, · · · certainly generate the set of all polynomials (by definition of a polynomial).

∗ Now we want to see that these polynomials are linearly independent. So suppose we had scalars a0, a1, · · · , an such that a0 · 1 + a1 · x + · · · + an · x^n = 0, for all values of x.

∗ Then if we take the nth derivative of both sides (which is allowable because a0 · 1 + a1 · x + · · · + an · x^n = 0 is assumed to be true for all x), we obtain n! · an = 0, from which we see that an = 0.

∗ Then repeat by taking the (n−1)st derivative to see an−1 = 0, and so on, until finally we are left with just a0 = 0. Hence the only way to form the zero function as a linear combination of 1, x, x^2, · · · , x^n is with all coefficients zero, which says that 1, x, x^2, x^3, · · · is a linearly independent set.

• Theorem: A collection of n vectors ~v1, · · · , ~vn in R^n is a basis if and only if the n × n matrix B, whose columns are the vectors ~v1, · · · , ~vn, is an invertible matrix.

The idea behind the theorem is to multiply out and compare coordinates, and then analyze the resulting system of equations.

So suppose we are looking for scalars a1, · · · , an such that a1~v1 + · · · + an~vn = ~w, for some vector ~w in R^n.

This vector equation is the same as the matrix equation B · ~a = ~w, where B is the matrix whose columns are the vectors ~v1, · · · , ~vn, ~a is the column vector whose entries are the scalars a1, · · · , an, and ~w is thought of as a column vector.

Now from what we know about matrix equations, we know that B is an invertible matrix precisely when B · ~a = ~w has a unique solution for every ~w.

But having a unique way to write any vector as a linear combination of vectors in a set is precisely the statement that the set is a basis. So we are done.
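• Aside: this gives a very concrete test. A small Python check (my own example vectors, reusing the det() sketch from the matrix notes): put the proposed basis vectors into the columns of B and see whether det(B) is nonzero.

    v1, v2, v3 = [1, 1, 0], [0, 2, 1], [1, 0, 0]
    B = [[v1[i], v2[i], v3[i]] for i in range(3)]   # columns of B are v1, v2, v3
    print(det(B))    # 1, which is nonzero, so v1, v2, v3 form a basis of R^3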

• Theorem: Every vector space V has a basis. Any two bases of V contain the same number of elements. Any generating set for V contains a basis. Any linearly independent set of vectors can be extended to a basis.

Remark: If you only remember one thing about vector spaces, remember that every vector space has a basis!

Remark: That a basis always exists is really, really, really useful. It is without a doubt the most useful fact about vector spaces: vector spaces in the abstract are very hard to think about, but a vector space with a basis is something very concrete (since then we know exactly what the elements of the vector space look like).

To show the first and last parts of the theorem, we show that we can build any set of linearly independent vectors into a basis:

∗ Start with S being some set of linearly independent vectors. (In any vector space, the empty set is always linearly independent.)

1. If S spans V, then we are done, because then S is a linearly independent generating set, i.e., a basis.

2. If S does not span V, there is an element v of V which is not in the span of S. Then if we put v in S, the new S is still linearly independent. Then start over.

∗ It can be proven (to justify this statement in general, some fairly technical and advanced machinery may be needed) that we will eventually land in case (1).

· If V has dimension n (see below), then we will always be able to construct a basis in at most n steps; it is in the case when V has infinite dimension that things get tricky and confusing, and require the use of what is called the axiom of choice.

To show the third part of the theorem, the idea is to imagine going through the list of elements in a generating set and removing elements until it becomes linearly independent.

∗ This idea is not so easy to formulate with an infinite list, but if we have a finite generating set, then we can go through the elements of the generating set one at a time, throwing out an element if it is linearly dependent with the elements that came before it. Then, once we have gotten to the end of the generating set, the collection of elements which we have not thrown away will still be a generating set (since removing a dependent element will not change the span), but the collection will also now be linearly independent (since we threw away elements which were dependent).

To show the second part of the theorem, we will show that if A is a set of vectors with m elements and B is a basis with n elements, with m > n, then A is linearly dependent.

∗ To see this, since B is a basis, we can write every element a_i in A as a linear combination of the elements of B, say as a_i = Σ_{j=1}^{n} c_{i,j} · b_j for 1 ≤ i ≤ m.

∗ Now suppose we have a linear combination of the a_i which is the zero vector. We would like to see that there is some choice of scalars d_k, not all zero, such that Σ_{k=1}^{m} d_k · a_k = ~0.

∗ If we substitute in for the vectors in B, then we obtain a linear combination of the elements of B equalling the zero vector. Since B is a basis, this means each coefficient of b_j in the resulting expression must be zero.

∗ If we tabulate the resulting system, we can check that it is equivalent to the matrix equation C · ~d = ~0, where C is the matrix of coefficients with entries c_{i,j}, and ~d is the column vector with entries the scalars d_1, . . . , d_m.

∗ Now this homogeneous system has n equations (one for each b_j) in the m unknowns d_k; since m > n, there are more unknowns than equations, so the system C · ~d = ~0 has a solution vector ~d which is not the zero vector.

∗ But then we have Σ_{k=1}^{m} d_k · a_k = ~0 for scalars d_k not all zero, so the set A is linearly dependent.

• Definition: We define the number of elements in any basis of V to be the dimension of V.

The theorem above assures us that this quantity is always well-defined.

Example: The dimension of R^n is n, since the n standard unit vectors form a basis.

∗ This says that the term dimension is reasonable, since it is the same as our usual notion of dimension.

Example: The dimension of the vector space of m × n matrices is mn, because there is a basis consisting of the mn matrices E_{i,j}, where E_{i,j} is the matrix with a 1 in the (i, j)-entry and 0s elsewhere.

Example: The dimension of the vector space of all polynomials is ∞, because the (infinite list of) polynomials 1, x, x^2, x^3, · · · are a basis for the space.

Well, you're at the end of my handout. Hope it was helpful.

Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 3a) : Linear Transformations supplement (by Evan Dummit, 2012, v. 2.00)

This supplement discusses the basic ideas of linear transformations. We do not officially cover linear transformations in this class, for reasons which remain unclear to me. However, I think that knowing about linear transformations is useful, because among other things, the topic ties together vector spaces and matrices very nicely.

Contents

0.1 Linear Transformations
0.1.1 Kernel and Image
0.1.2 Isomorphisms of Vector Spaces
0.1.3 The Derivative as a Linear Transformation

0.1 Linear Transformations

• Now that we have a reasonably good idea of what the structure of a vector space is, the next natural question is: what do maps from one vector space to another look like?

• It turns out that we don't want to ask about arbitrary functions, but about functions from one vector space to another which preserve the structure (namely, addition and scalar multiplication) of the vector space.

The analogy to the real numbers is: once we know what the real numbers look like, what can we say about arbitrary real-valued functions?

The answer is: not much, unless we specify that the functions preserve the structure of the real numbers, which is abstract math-speak for saying that we want to talk about continuous functions, which turn out to behave much more nicely.

• This is the idea behind the definition of a linear transformation: it is a map that preserves the structure of a vector space.

• Definition: If V and W are vector spaces, we say a map T from V to W (denoted T : V → W) is a linear transformation if, for any vectors ~v, ~v1, ~v2 and scalar α, we have the two properties

[T1] The map respects addition of vectors: T(~v1 + ~v2) = T(~v1) + T(~v2).

[T2] The map respects scalar multiplication: T(α · ~v) = α · T(~v).

• Remark: Like with the definition of a vector space, one can show a few simple algebraic properties of linear transformations; for example, any linear transformation sends the zero vector (of V) to the zero vector (of W).

• Example: If V = W = R^2, then the map T which sends 〈x, y〉 to 〈x, x + y〉 is a linear transformation.

Let ~v = 〈x, y〉, ~v1 = 〈x1, y1〉, and ~v2 = 〈x2, y2〉, so that ~v1 + ~v2 = 〈x1 + x2, y1 + y2〉.

[T1]: We have T(~v1 + ~v2) = 〈x1 + x2, x1 + x2 + y1 + y2〉 = 〈x1, x1 + y1〉 + 〈x2, x2 + y2〉 = T(~v1) + T(~v2).

[T2]: We have T(α · ~v) = 〈αx, αx + αy〉 = α · 〈x, x + y〉 = α · T(~v).

• More General Example: If V = W = R^2, then the map T which sends 〈x, y〉 to 〈ax + by, cx + dy〉 is a linear transformation.

Just like in the previous example, we can work out the calculations explicitly.

But another way we can think of this map is as a matrix map: T sends the column vector [x; y] to the column vector [ax + by; cx + dy] = [a b; c d]·[x; y].

So, in fact, this map T is really just (left) multiplication by the matrix [a b; c d].

When we think of the map in this way, it is easier to see what is happening:

[T1]: We have T(~v1 + ~v2) = [a b; c d]·(~v1 + ~v2) = [a b; c d]·~v1 + [a b; c d]·~v2 = T(~v1) + T(~v2).

[T2]: Also, T(α · ~v) = [a b; c d]·(α~v) = α·([a b; c d]·~v) = α · T(~v).

• Really General Example: If V = R^m (thought of as m × 1 matrices) and W = R^n (thought of as n × 1 matrices) and A is any n × m matrix, then the map T sending ~v to A · ~v is a linear transformation.

The verification is exactly the same as in the previous example.

[T1]: We have T(~v1 + ~v2) = A · (~v1 + ~v2) = A · ~v1 + A · ~v2 = T(~v1) + T(~v2).

[T2]: Also, T(α · ~v) = A · (α~v) = α · (A · ~v) = α · T(~v).

• This last example is very general: in fact, it is so general that every linear transformation from Rm to Rn isof this form! Namely, if T is a linear transformation from Rm to Rn, then there is some m× n matrix A suchthat T (~v) = A · ~v (where we think of ~v as a column matrix).

The reason is actually very simple, and it is easy to write down what the matrix A is: it is just the m×nmatrix whose rows are the vectors T (e1), T (e2), . . . , T (em), where e1, · · · , em are the standard basiselements of Rm (ej is the vector with a 1 in the jth position and 0s elsewhere).

To see that this choice of A works, note that every vector ~v in R^m can be written as a unique linear combination ~v = a1 · e1 + a2 · e2 + · · · + am · em of the basis elements. Then, after applying the two properties of a linear transformation, we obtain T(~v) = a1 · T(e1) + a2 · T(e2) + · · · + am · T(em). If we write down this map one coordinate at a time, we see that it agrees with the result of computing the matrix product of the matrix A with the coordinates of ~v.

Tangential Remark: If we write down the map T explicitly, we see that the term in each coordinate in W is a linear function of the coordinates in V: e.g., if A = [a b ; c d], then the linear functions are ax + by and cx + dy. This is the reason linear transformations are so named: they are really just linear functions, in the traditional sense.

• In fact, something far more general is true: if we take any m-dimensional vector space V and any n-dimensional vector space W and choose bases for each space, then a linear transformation T from V to W behaves just like multiplication by (some) n × m matrix A.

The proof in this general case is the same as for linear transformations from R^m to R^n: first choose a basis ~v1, ~v2, · · · , ~vm for V and a basis ~w1, ~w2, · · · , ~wn for W, and then look at how the map T behaves.

Let ~v be any element of V. The claim is that if we write ~v and T(~v) as linear combinations of the basis elements ~v1, ~v2, · · · , ~vm and ~w1, ~w2, · · · , ~wn (respectively) then, suitably interpreted, the relation between the coefficients of ~v and T(~v) will be multiplication by a matrix.

By the hypothesis that the ~vj are a basis for V, every element of V can be written uniquely as a linear combination ~v = a1 · ~v1 + a2 · ~v2 + · · · + am · ~vm. Then, by using the fact that T is a linear transformation, we obtain T(~v) = a1 · T(~v1) + a2 · T(~v2) + · · · + am · T(~vm).

Now we can also express each element T(~vj) uniquely in terms of the basis elements of W, say as T(~vj) = c_{j,1} · ~w1 + c_{j,2} · ~w2 + · · · + c_{j,n} · ~wn. Plugging all of these expressions into T(~v) = a1 · T(~v1) + · · · + am · T(~vm) then gives us an explicit expression for T(~v) as a linear combination T(~v) = b1 · ~w1 + b2 · ~w2 + · · · + bn · ~wn of the basis elements ~w1, · · · , ~wn.


If we multiply out, we will (eventually) end up with a system of equations equivalent to the matrix equality C · ~a = ~b, where ~a = (a1, a2, . . . , am), ~b = (b1, b2, . . . , bn), and C is the n × m matrix whose (k, j) entry is c_{j,k}; in other words, the jth column of C holds the coefficients of T(~vj) with respect to the basis ~w1, · · · , ~wn.

Remark 1: This result underlines one of the reasons that matrices and vector spaces (which initially seem like they have almost nothing to do with one another) are in fact very closely related: matrices describe the maps from one vector space to another.

Remark 2: One can also use this relationship between maps on vector spaces and matrices to providealmost trivial proofs of some of the algebraic properties of matrix multiplication which are hard to proveby direct computation.

∗ For example: the composition of linear transformations is associative (because linear transformationsare functions, and function composition is associative). Multiplication of matrices is the same ascomposition of functions. Hence multiplication of matrices is associative.

0.1.1 Kernel and Image

• Definition: If T : V → W is a linear transformation, then the kernel of T, denoted ker(T), is the set of elements ~v in V with T(~v) = ~0. The image of T, denoted im(T), is the set of elements ~w in W such that there exists a ~v in V with T(~v) = ~w.

Intuitively, the kernel is the set of elements which are sent to zero by T, and the image is the set of elements in W which are hit by T (the range of T).

Essentially (see below), the kernel measures how far from one-to-one the map T is, and the image measures how far from onto the map T is.

One of the reasons we care about these subspaces is that (for example) the set of solutions to a system of homogeneous linear equations A · ~x = ~0 is the kernel of the linear transformation T of multiplication by A, and the set of vectors ~b for which there exists a solution to A · ~x = ~b is the image of the linear transformation T of multiplication by A.

• The kernel is a subspace of V .

[S1] We have T (~0) = ~0, by simple properties of linear transformations.

[S2] If ~v1 and ~v2 are in the kernel, then T (~v1) = ~0 and T (~v2) = ~0. Therefore, T (~v1+~v2) = T (~v1)+T (~v2) =~0 +~0 = ~0.

[S3] If ~v is in the kernel, then T (~v) = ~0. Hence T (α · ~v) = α · T (~v) = α ·~0 = ~0.

• The image is a subspace of W .

[S1] We have T (~0) = ~0, by simple properties of linear transformations.

[S2] If ~w1 and ~w2 are in the image, then there exist ~v1 and ~v2 are such that T (~v1) = ~w1 and T (~v2) = ~w2.Then T (~v1 + ~v2) = T (~v1) + T (~v2) = ~w1 + ~w2, so that ~w1 + ~w2 is also in the image.

[S3] If ~w is in the image, then there exists ~v with T (~v) = ~w. Then T (α · ~v) = α · T (~v) = α · ~w, so α · ~w isalso in the image.

• Theorem: The kernel ker(T ) consists of only the zero vector if and only if the map T is one-to-one. The imageim(T ) consists of all of W if and only if the map T is onto.

The statement about the image is just the denition of onto.

If T is one-to-one, then (at most) one element of V maps to ~0. But since the zero vector is taken to thezero vector, we see that T cannot send anything else to ~0. Thus ker(T ) = ~0.

If ker(T ) is only the zero vector, then since T is a linear transformation, the statement T (~v1) = T (~v2) isequivalent to the statement that T (~v1)− T (~v2) = T (~v1 −~v2) is the zero vector. But, by the denition ofthe kernel, T (~v1 − ~v2) = ~0 precisely when ~v1 − ~v2 is in the kernel. However, this means ~v1 − ~v2 = ~0, so~v1 = ~v2. Hence T (~v1) = T (~v2) implies ~v1 = ~v2, which means T is one-to-one.


• Definitions: The dimension of ker(T) is called the nullity of T, and the dimension of im(T) is called the rank of T.

A linear transformation with a large nullity has a large kernel, which means it sends many elements tozero (hence nullity).

• Theorem (Rank-Nullity): For any linear transformation T : V →W , dim(ker(T )) + dim(im(T )) = dim(V ).

The idea behind this theorem is that if we have a basis for im(T), say ~w1, · · · , ~wk, then there exist ~v1, · · · , ~vk with T(~v1) = ~w1, . . . , T(~vk) = ~wk. Then if ~a1, · · · , ~al is a basis for ker(T), the goal is to show that the set of vectors ~v1, · · · , ~vk, ~a1, · · · , ~al is a basis for V.

To do this, given any ~v, write T(~v) = β1 · ~w1 + · · · + βk · ~wk = β1 · T(~v1) + · · · + βk · T(~vk) = T(β1 · ~v1 + · · · + βk · ~vk), where the βj are unique.

Then subtraction shows that T(~v − (β1 · ~v1 + · · · + βk · ~vk)) = ~0, so that ~v − (β1 · ~v1 + · · · + βk · ~vk) is in ker(T), hence can be written as a sum γ1 · ~a1 + · · · + γl · ~al, where the γi are unique.

Putting all this together shows ~v = β1 · ~v1 + · · · + βk · ~vk + γ1 · ~a1 + · · · + γl · ~al for unique scalars βj and γi, which says that ~v1, · · · , ~vk, ~a1, · · · , ~al is a basis for V.

• Remark: Here is another way of interpreting and proving the nullity-rank theorem.

If we fix a basis for V and for W and view the linear transformation as a matrix A, then the kernel of the transformation is the solution space of the homogeneous system A · ~x = ~0, and the image is the space of vectors ~b such that there exists a solution to A · ~x = ~b.

The value of dim(ker(T)) is the size of a basis for the solutions to the homogeneous equation, which we know is the number of nonpivotal columns in the reduced row-echelon form of A.

The value of dim(im(T)) is the size of a basis for the collection of column vectors of A, since the columns of A span the image. So the dimension of im(T) is the number of pivotal columns in the reduced row-echelon form of A.

Therefore, the sum of these two numbers is the number of columns of the matrix A (since every column is either pivotal or nonpivotal). But this is just dim(V).
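As a quick illustration (not part of the original handout), here is a short sympy sketch checking the rank-nullity count for an arbitrary example matrix.

```python
import sympy as sp

# An arbitrary example matrix, viewed as a map T : R^4 -> R^3 with T(v) = A*v.
A = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 1],
               [1, 3, 1, 2]])

kernel_basis = A.nullspace()     # basis for ker(T)
image_basis  = A.columnspace()   # basis for im(T)

nullity = len(kernel_basis)
rank    = len(image_basis)
print(nullity, rank, A.cols)     # nullity + rank should equal the number of columns, i.e. dim(V)
```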

0.1.2 Isomorphisms of Vector Spaces

• Definition: A linear transformation T : V → W is called an isomorphism if T is also one-to-one and onto.

Equivalently, T is an isomorphism if ker(T) = 0 and im(T) = W.

We say that two vector spaces are isomorphic if there exists an isomorphism between them.

Saying that two spaces are isomorphic is a very strong statement: it says that the spaces V and W have exactly the same structure, as vector spaces.

∗ Informally, this means that if we used T to relabel the elements of V to have the same names as the elements of W, we wouldn't be able to tell V and W apart at all, as vector spaces.

• Example: The space R^4 is isomorphic to the space M_{2×2} of 2 × 2 matrices, with an isomorphism T given by T(x1, x2, x3, x4) = [x1 x2 ; x3 x4].

This map is a linear transformation; it clearly is additive and respects scalar multiplication.


Also, ker(T ) = 0 since the only element mapping to the zero matrix is (0, 0, 0, 0). And it is also clearthat im(T ) =M2×2.

Thus T is an isomorphism.

• Isomorphisms preserve linear independence: if T is an isomorphism, the vectors ~v1, · · · , ~vn are linearly independent if and only if T(~v1), · · · , T(~vn) are.

Because T is a linear transformation, we have a1 · T(~v1) + · · · + an · T(~vn) = T(a1 · ~v1 + · · · + an · ~vn).

To see that ~v1, · · · , ~vn independent implies T(~v1), · · · , T(~vn) independent:

∗ If a1 · T(~v1) + · · · + an · T(~vn) = ~0, then by the above we have T(a1 · ~v1 + · · · + an · ~vn) = ~0.

∗ But now since ker(T) = 0, we get a1 · ~v1 + · · · + an · ~vn = ~0, and independence of ~v1, · · · , ~vn then gives a1 = · · · = an = 0.

∗ So T(~v1), · · · , T(~vn) are independent.

To see that T(~v1), · · · , T(~vn) independent implies ~v1, · · · , ~vn independent:

∗ If a1 · ~v1 + · · · + an · ~vn = ~0, then a1 · T(~v1) + · · · + an · T(~vn) = T(a1 · ~v1 + · · · + an · ~vn) = T(~0) = ~0.

∗ But now the independence of T(~v1), · · · , T(~vn) gives a1 = · · · = an = 0, so ~v1, · · · , ~vn are independent.

• If T is an isomorphism, then (because T is one-to-one and onto) there exists an inverse function T^{−1} : W → V, with T^{−1}(T(~v)) = ~v and T(T^{−1}(~w)) = ~w for any ~v in V and ~w in W.

As we might hope, the inverse map T^{−1} is actually a linear transformation, too:

∗ [T1] If T(~v1) = ~w1 and T(~v2) = ~w2, then because T(~v1 + ~v2) = ~w1 + ~w2, we have T^{−1}(~w1 + ~w2) = ~v1 + ~v2 = T^{−1}(~w1) + T^{−1}(~w2).

∗ [T2] If T(~v) = ~w, then because T(α · ~v) = α · ~w, we have T^{−1}(α · ~w) = α · ~v = α · T^{−1}(~w).

• Theorem: Two (finite-dimensional) vector spaces V and W are isomorphic if they have the same dimension. In particular, any finite-dimensional vector space is isomorphic to R^n for some value of n.

To show the result, choose a basis ~v1, · · · , ~vn for V and a basis ~w1, · · · , ~wn for W .

We claim the map T defined by T(a1 · ~v1 + · · · + an · ~vn) = a1 · ~w1 + · · · + an · ~wn is an isomorphism between V and W.

We need to check five things: that T is unambiguously defined, that T respects addition, that T respects scalar multiplication, that T is one-to-one, and that T is onto.

∗ [Well-defined]: We need to make sure that we have not made the definition of T ambiguous; i.e., that we have defined T on every element of V, and that we haven't tried to send one element of V to two different elements of W. However, we are safe because ~v1, · · · , ~vn is a basis, which means that for every ~v in V, we have a unique way of writing ~v as a linear combination of ~v1, · · · , ~vn.

∗ [Addition]: If ~v = a1 · ~v1 + · · · + an · ~vn and ~v' = b1 · ~v1 + · · · + bn · ~vn, then T(~v + ~v') = (a1 + b1) · ~w1 + · · · + (an + bn) · ~wn = T(~v) + T(~v') by the distributive law.

∗ [Multiplication]: For any scalar β we have T(β · ~v) = (βa1) · ~w1 + · · · + (βan) · ~wn = β · T(~v) by consistency of multiplication.

∗ [One-to-one]: Since ~w1, · · · , ~wn are linearly independent, the only way that a1 · ~w1 + · · · + an · ~wn can be the zero vector is if a1 = a2 = · · · = an = 0, which means ker(T) = 0.

∗ [Onto]: Since ~w1, · · · , ~wn span W, every element ~w in W can be written as ~w = a1 · ~w1 + · · · + an · ~wn for some scalars a1, · · · , an. Then for ~v = a1 · ~v1 + · · · + an · ~vn, we have T(~v) = ~w.

Remark 1: Isomorphisms preserve linear independence, hence they also preserve dimension. So the theorem could be strengthened a little to say that two vector spaces are isomorphic if and only if they have the same dimension.

Remark 2: I think this result is rather unexpected, at least the first time: it certainly doesn't seem obvious, just from the eight axioms of a vector space, that all finite-dimensional vector spaces are really the same as R^n. But they are!


0.1.3 The Derivative as a Linear Transformation

• Example: If V and W are any vector spaces, and T1 and T2 are any linear transformations from V to W, then T1 + T2 and β · T1 are also linear transformations, for any scalar β.

These follow from the criteria. (They are somewhat confusing to follow when written down, so I won't bother.)

• Example: If V is the vector space of real-valued functions and W = R, then the evaluation-at-0 map taking f to the value f(0) is a linear transformation.

[T1]: We have T(f1 + f2) = (f1 + f2)(0) = f1(0) + f2(0) = T(f1) + T(f2).

[T2]: Also, T(α · f) = (αf)(0) = α · f(0) = α · T(f).

Note of course that being a linear transformation has nothing to do with the fact that we are evaluating at 0. We could just as well evaluate at 1, or π, and the map would still be a linear transformation.

• Example: If V and W are both the vector space of real-valued functions and P (x) is any real-valued function,then the map taking f(x) to the function P (x)f(x) is a linear transformation.

[T1]: We have T (f1 + f2) = P (x)(f1 + f2)(x) = P (x)f1(x) + P (x)f2(x) = T (f1) + T (f2).

[T2]: Also, T (α · f) = P (x)(αf)(x) = αP (x)f(x) = α · T (f).

• Example: If V is the vector space of all n-times differentiable functions and W is the vector space of all functions, then the nth derivative map, taking f(x) to its nth derivative f^(n)(x), is a linear transformation.

[T1]: The nth derivative of the sum is the sum of the nth derivatives, so we have T(f1 + f2) = (f1 + f2)^(n)(x) = f1^(n)(x) + f2^(n)(x) = T(f1) + T(f2).

[T2]: Also, T(α · f) = (αf)^(n)(x) = α · f^(n)(x) = α · T(f).

• If we combine the results from the previous four examples, we can show that if V is the vector space of all n-times differentiable functions, then the map T which sends a function y to the function y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y is a linear transformation, for any functions Pn(x), · · · , P1(x).

In particular, the kernel of this linear transformation is the collection of all functions y such that y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0; i.e., the set of solutions to this differential equation.

Note that since we know the kernel is a vector space (as it is a subspace of V), we see that the set of solutions to y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0 forms a vector space. (Of course, we could just show this statement directly, by checking the subspace criteria.)

However, it is very useful to be able to think of this linear differential operator, sending y to y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y, as a linear transformation.



Math 320 (part 4): Linear Differential Equations (by Evan Dummit, 2012, v. 1.10)

Contents

1 Linear Differential Equations
  1.1 Terminology
  1.2 General Theory of Linear Differential Equations
  1.3 Homogeneous Linear Equations with Constant Coefficients
  1.4 Non-Homogeneous Linear Equations with Constant Coefficients
      1.4.1 Undetermined Coefficients
      1.4.2 Variation of Parameters
  1.5 Second-Order Equations: Applications to Newtonian Mechanics
      1.5.1 Spring Problems and Damping
      1.5.2 Resonance and Forcing

1 Linear Differential Equations

• The general nth-order linear differential equation is of the form y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = Q(x), for some functions Pn(x), · · · , P2(x), P1(x), and Q(x). (Note that y^(n) denotes the nth derivative of y.)

The goal is to study the behavior of the solutions to these equations.

• Solving general linear differential equations explicitly is generally hard, unless we are lucky and the equation has a particularly nice form.

We can only give a method for writing down the full set of solutions for a small class of linear equations: namely, linear differential equations with constant coefficients.

There are a few equation types (e.g., Euler equations like x^2 y′′ + xy′ + y = 0) which can be reduced to constant-coefficient equations via substitution.

• Thus, we will spend most of our effort discussing how to solve linear equations with constant coefficients, because we can always solve them; these are equations of the form y^(n) + a_{n−1}y^(n−1) + · · · + a_1 y′ + a_0 y = Q(x) for some constants a_{n−1}, · · · , a_0 and some function Q(x).

• Even harder than the general linear differential equation are non-linear equations of higher order, such as y′′ = e^y − y or y′′ · y^2 = y′ + 1.

We will not discuss non-linear equations at all.

Compare with the example of first-order equations: we can solve any first-order linear equation, but very few types of first-order nonlinear equations.

1.1 Terminology

• The standard form of a differential equation is when it is written with all terms involving y or higher derivatives on one side, and functions of the variable on the other side.

Example: The equation y′′ + y′ + y = 0 is in standard form.

Example: The equation y′ = 3x2 − xy is not in standard form.


• An equation is homogeneous if, when it is put into standard form, the x-side is zero. An equation is nonhomogeneous otherwise.

Example: The equation y′′ + y′ + y = 0 is homogeneous.

Example: The equation y′ + xy = 3x2 is nonhomogeneous.

• An nth order differential equation is an equation in which the highest derivative is the nth derivative.

Example: The equations y′ + xy = 3x^2 and y′ · y = 2 are first-order.

Example: The equation y′′ + y′ + y = 0 is second-order.

• A differential equation is linear if it is a linear combination of y and its derivatives. (Note that the coefficients are allowed to be functions of x.) In other words, there are no terms like y^2, or (y′)^3, or y · y′, or e^y.

Example: The equations y′ + xy = 3x2 and y′′ + y′ + y = 0 are linear.

Example: The equations y′ · y = 3x2 and y′′ + ey = 0 are not linear.

• We say a linear differential equation has constant coefficients if the coefficients of y, y′, y′′, ... are all constants.

Example: The equation y′′ + y′ + y = 0 has constant coecients.

Example: The equation y′ + xy = 3x2 does not have constant coecients.

• For n functions y1, y2, · · · , yn which are each differentiable (n − 1) times, the Wronskian W(y1, y2, · · · , yn) of the functions is defined to be the determinant of the n × n matrix whose jth column contains yj and its first n − 1 derivatives:

[ y1        y2        · · ·  yn
  y1′       y2′       · · ·  yn′
  ⋮          ⋮         ⋱     ⋮
  y1^(n−1)  y2^(n−1)  · · ·  yn^(n−1) ]

Note that the Wronskian will also be a function of x.

The purpose of the Wronskian is to provide a way to show that functions are linearly independent.

• Theorem: A collection of n functions y1, y2, · · · , yn, each n-times differentiable, is linearly independent (in the vector space of n-times differentiable functions) if their Wronskian is not the zero function.

To prove the theorem, note that if the functions are linearly dependent with a1 y1 + · · · + an yn = 0, then by differentiating the appropriate number of times we see that a1 y1^(i) + · · · + an yn^(i) = 0 for each 0 ≤ i ≤ n − 1. Hence, in particular, the columns of the matrix are linearly dependent (as vectors), and so the determinant of the matrix is zero. Therefore, if the determinant is not zero, the functions cannot be dependent.

Remark: The theorem becomes an if-and-only-if statement (i.e., the functions are linearly independent if and only if the Wronskian is nonzero) if we know that the functions y1, y2, · · · , yn are infinitely differentiable. The proof of the other direction is significantly more difficult.

Example: The functions 1 and x are linearly independent, because we can compute W(1, x) = det[ 1 x ; 0 1 ] = 1.

Example: The functions sin(x) and cos(x) are linearly independent, as W(sin(x), cos(x)) = det[ sin(x) cos(x) ; cos(x) −sin(x) ] = −1.

Example: The functions 1, x, and 1 + x are (rather clearly) linearly dependent. We have W(1, x, 1 + x) = det[ 1 x x+1 ; 0 1 1 ; 0 0 0 ] = 0, by expanding along the bottom row.


1.2 General Theory of Linear Differential Equations

• Theorem (Homogeneous Linear Equations): If Pn(x), · · · , P1(x) are continuous functions, then the set of solutions to the homogeneous nth order equation y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0 is an n-dimensional vector space.

The fact that the set of solutions is a vector space is not so hard to show using the subspace criteria.

The real content of this theorem, which is analogous to the existence-uniqueness theorem for first-order equations, is that the set of solutions is n-dimensional.

• Example: Consider the homogeneous equation y′′(x) = 0.

We can just integrate twice to see that the solutions are y(x) = Ax+B , for any constants A and B.

Indeed, as the theorem dictates, the solution functions form a two-dimensional space, spanned by the two basis elements 1 and x.

• Theorem (Existence-Uniqueness for Linear Equations): If Pn(x), · · · , P1(x) and Q(x) are functions continuous on an interval containing a, then there is a unique solution (possibly on a smaller interval) to the initial value problem y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = Q(x), for any initial conditions y(a) = b1, y′(a) = b2, · · · , and y^(n−1)(a) = bn. Additionally, every solution y_gen to the general nth order equation may be written as y_gen = y_par + y_hom, where y_par is any one particular solution to the equation, and y_hom is a solution to the homogeneous equation y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0.

What this theorem says is: in order to solve the general equation y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = Q(x), it is enough to find one solution to this equation along with the general solution to the homogeneous equation.

The existence-uniqueness part of the theorem is hard, but the second part is fairly simple to show: if y1 and y2 are solutions to the general equation, then their difference y1 − y2 is a solution to the homogeneous equation; to see this, just subtract the resulting equations and apply derivative rules.

∗ A more advanced way to see the same thing is to use the fact that the map L sending y to y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y is a linear transformation.

∗ Then L(y1) = L(y2) says L(y1 − y2) = 0, so that y1 − y2 is a solution to the homogeneous equation.

• Example: In order to solve the equation y′′(x) = e^x, the theorem says we only need to find one function which is a solution, and then solve the homogeneous equation y′′(x) = 0.

We can just try simple functions until we discover that y(x) = e^x has y′′(x) = e^x.

Then we need only solve the homogeneous equation y′′(x) = 0, whose solutions we know are Ax + B.

Thus the general solution to the general equation y′′(x) = e^x is y(x) = e^x + Ax + B.

We can also verify that if we impose the initial conditions y(0) = c1 and y′(0) = c2, then (as the theorem dictates) there is the unique solution y = e^x + (c2 − 1)x + (c1 − 1).
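As a quick illustration (not part of the original handout), here is a sympy sketch of this example; the output should be equivalent to the formulas above, although sympy may print the terms in a different order.

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
ode = sp.Eq(y(x).diff(x, 2), sp.exp(x))

# General solution: one particular solution plus the homogeneous solutions A*x + B.
print(sp.dsolve(ode, y(x)))   # y(x) = C1 + C2*x + exp(x)

# Imposing y(0) = c1 and y'(0) = c2 pins down the constants.
c1, c2 = sp.symbols('c1 c2')
print(sp.dsolve(ode, y(x), ics={y(0): c1, y(x).diff(x).subs(x, 0): c2}))
# expected to be equivalent to y(x) = exp(x) + (c2 - 1)*x + (c1 - 1)
```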

1.3 Homogeneous Linear Equations with Constant Coefficients

• The general linear homogeneous differential equation with constant coefficients is y^(n) + a_{n−1}y^(n−1) + · · · + a_1 y′ + a_0 y = 0, where a_{n−1}, · · · , a_0 are some constants.

From the existence-uniqueness theorem we know that the set of solutions is an n-dimensional vector space.

Based on solving first-order linear homogeneous equations (i.e., y′ + ky = 0), we might expect the solutions to involve exponentials. If we try setting y = e^{rx}, then after some arithmetic we end up with r^n e^{rx} + a_{n−1} r^{n−1} e^{rx} + · · · + a_1 r e^{rx} + a_0 e^{rx} = 0. Multiplying both sides by e^{−rx} and cancelling yields the characteristic equation r^n + a_{n−1} r^{n−1} + · · · + a_1 r + a_0 = 0.


If we can find n values of r satisfying this nth-degree polynomial (i.e., if we can factor the polynomial and see that it has n distinct roots), then the theorem tells us we will have found all of the solutions. If we are unlucky and the polynomial has a repeated root, then we need to try something else.

If there are non-real roots (note they will come in complex conjugate pairs!) r1 = α + βi and r2 = α − βi, then we would end up with e^{r1 x} and e^{r2 x} as our solutions. But we really want real-valued solutions, and e^{r1 x} and e^{r2 x} have complex numbers in the exponents. However, we can just write out the real and imaginary parts using Euler's Theorem, and take linear combinations to obtain the two real-valued solutions e^{αx} sin(βx) = (1/(2i))[e^{r1 x} − e^{r2 x}] and e^{αx} cos(βx) = (1/2)[e^{r1 x} + e^{r2 x}].

Taking motivation from the case of y^(k) = 0, whose characteristic equation is r^k = 0 (with the k-fold repeated root 0) and whose solutions are y(x) = A1 + A2 x + A3 x^2 + · · · + Ak x^{k−1}, we guess wildly that if other roots are repeated, we want to multiply the corresponding exponentials e^{rx} by a power of x.

If we put all of these ideas together we can prove that this general outline will, in fact, give us n linearly independent functions, and hence gives the general solution to any homogeneous linear differential equation with constant coefficients.

Advanced Remark: Another way to organize this information is to think of everything in terms of linear transformations. If D represents the linear transformation sending a function to its derivative, then we are looking for functions y with the property that L(y) = 0, where L is the linear operator D^n + a_{n−1}D^{n−1} + · · · + a_1 D + a_0.

∗ Note that D^2, for instance, means taking the derivative twice in a row; i.e., D^2 is the second derivative.

∗ Then what we are secretly doing is factoring the polynomial D^n + a_{n−1}D^{n−1} + · · · + a_1 D + a_0 into a product of linear terms (D − r1)^{k1} · · · (D − rj)^{kj}, solving each of the simpler equations (D − ri)^{ki} y = 0, and adding up the resulting terms.

• To solve a linear homogeneous differential equation with constant coefficients, follow these steps:

Step 1: Rewrite the differential equation in the standard form y^(n) + a_{n−1}y^(n−1) + · · · + a_1 y′ + a_0 y = 0 (if necessary).

Step 2: Factor the characteristic equation r^n + a_{n−1}r^{n−1} + · · · + a_1 r + a_0 = 0.

Step 3: For each irreducible factor in the characteristic equation, write down the corresponding terms in the solution:

∗ For factors (r − α)^k where α is a real number, add the terms of the form e^{αx}, x e^{αx}, · · · , x^{k−1} e^{αx}.

∗ For irreducible factors (r^2 + cr + d)^k with roots r = α ± βi, add the terms of the form e^{αx} sin(βx), x e^{αx} sin(βx), · · · , x^{k−1} e^{αx} sin(βx) and e^{αx} cos(βx), x e^{αx} cos(βx), · · · , x^{k−1} e^{αx} cos(βx).

Step 4: If given additional conditions on the solution, solve for the coefficients (if necessary).

• Example: Find all functions y such that y′′ + y′ − 6y = 0.

Step 2: The characteristic equation is r^2 + r − 6 = 0, which has roots r = 2 and r = −3.

Step 3: We have two distinct real roots, so our terms are e^{2x} and e^{−3x}. So the general solution is y = C1 e^{2x} + C2 e^{−3x}.

• Example: Find all functions y such that y′′ − 2y′ + y = 0, with y(0) = 1 and y′(0) = 2.

Step 2: The characteristic equation is r^2 − 2r + 1 = 0, which has only the solution r = 1.

Step 3: There is a double root at r = 1, so our terms are e^x and x e^x. Hence the general solution is y = C1 e^x + C2 x e^x.

Step 4: Plugging in the two conditions gives 1 = C1 · e^0 + C2 · 0 and 2 = C1 e^0 + C2 [(0 + 1)e^0], from which C1 = 1 and C2 = 1. Hence the particular solution requested is y = e^x + x e^x.

• Example: Find all real-valued functions y such that y′′ = −4y.

Step 1: The standard form here is y′′ + 4y = 0.


Step 2: The characteristic equation is r^2 + 4 = 0, which has roots r = 2i and r = −2i.

Step 3: We have two complex-conjugate roots. Since the problem asks for real-valued functions, we write e^{r1 x} = cos(2x) + i sin(2x) and e^{r2 x} = cos(2x) − i sin(2x) to see that the general solution is y = C1 cos(2x) + C2 sin(2x).

• Example: Find all real-valued functions y such that y^(5) + 5y^(4) + 10y′′′ + 10y′′ + 5y′ + y = 0.

Step 2: The characteristic equation is r^5 + 5r^4 + 10r^3 + 10r^2 + 5r + 1 = 0, which factors as (r + 1)^5 = 0.

Step 3: We have a 5-fold repeated root r = −1. Thus the terms are e^{−x}, x e^{−x}, x^2 e^{−x}, x^3 e^{−x}, and x^4 e^{−x}. Hence the general solution is y = C1 e^{−x} + C2 x e^{−x} + C3 x^2 e^{−x} + C4 x^3 e^{−x} + C5 x^4 e^{−x}.

• Example: Find all real-valued functions y whose fourth derivative is the same as y.

Step 1: This is the equation y′′′′ = y, or in standard form, y′′′′ − y = 0.

Step 2: The characteristic equation is r^4 − 1 = 0, which factors as (r + 1)(r − 1)(r + i)(r − i) = 0.

Step 3: We have the four roots 1, −1, i, −i. Thus the terms are e^x, e^{−x}, sin(x), and cos(x). Hence the general solution is y = C1 e^x + C2 e^{−x} + C3 sin(x) + C4 cos(x).

1.4 Non-Homogeneous Linear Equations with Constant Coefficients

• The general linear differential equation with constant coefficients is of the form y^(n) + a_{n−1}y^(n−1) + · · · + a_1 y′ + a_0 y = Q(x), where a_{n−1}, · · · , a_0 are some constants and Q(x) is some function of x.

• From the general theory, all we need to do is find one solution to the general equation, and find all solutions to the homogeneous equation. Since we know how to solve the homogeneous equation in full generality, we just need to develop some techniques for finding one solution to the general equation.

• There are essentially two ways of doing this.

The Method of Undetermined Coefficients is just a fancy way of making an educated guess about what the form of the solution will be and then checking whether it works. It will work whenever the function Q(x) is a linear combination of terms of the form x^k e^{αx} (where k is an integer and α is a complex number): thus, for example, we could use the method for something like Q(x) = x^3 e^{8x} cos(x) − 4 sin(x) + x^{10}, but not for something like Q(x) = tan(x).

Variation of Parameters is a more complicated method which uses some linear algebra and cleverness to use the solutions of the homogeneous equation to find a solution to the non-homogeneous equation. It will always work, for any function Q(x), but generally requires more setup and computation.

1.4.1 Undetermined Coefficients

• The idea behind the method of undetermined coefficients is that we can 'guess' what our solution should look like (up to some coefficients we have to solve for), if Q(x) involves sums and products of polynomials, exponentials, and trigonometric functions. Specifically, we try a solution y = [stuff], where the 'stuff' is a sum of things similar to the terms in Q(x).

• Here is the procedure for generating the trial solution:

Step 1: Generate the first guess for the trial solution as follows:

∗ Replace all numerical coefficients of terms in Q(x) with variable coefficients. If there is a sine (or cosine) term, add in the companion cosine (or sine) terms, if they are missing. Then group terms of Q(x) into blocks of terms which are the same up to a power of x, and add in any missing lower-degree terms in each block.

∗ Thus, if a term of the form x^n e^{rx} appears in Q(x), fill in the terms of the form e^{rx} · [A0 + A1 x + · · · + An x^n], and if a term of the form x^n e^{αx} sin(βx) or x^n e^{αx} cos(βx) appears in Q(x), fill in the terms of the form e^{αx} cos(βx) · [D0 + D1 x + · · · + Dn x^n] + e^{αx} sin(βx) · [E0 + E1 x + · · · + En x^n].


Step 2: Solve the homogeneous equation, and write down the general solution.

Step 3: Compare the first guess for the trial solution with the solutions to the homogeneous equation. If any terms overlap, multiply all terms in the overlapping block by the appropriate power of x which will remove the duplication.

• Here is a series of examples demonstrating the procedure for generating the trial solution:

Example: y′′ − y = x.

∗ Step 1: We fill in the missing constant term in Q(x) to get D0 + D1 x.

∗ Step 2: The general homogeneous solution is A1 e^x + A2 e^{−x}.

∗ Step 3: There is no overlap, so the trial solution is D0 + D1 x.

Example: y′′ + y′ = x − 2.

∗ Step 1: We have D0 + D1 x.

∗ Step 2: The general homogeneous solution is A + B e^{−x}.

∗ Step 3: There is an overlap (the solution D0), so we multiply the corresponding trial solution terms by x to get D0 x + D1 x^2. Now there is no overlap, so D0 x + D1 x^2 is the trial solution.

Example: y′′ − y = e^x.

∗ Step 1: We have D0 e^x.

∗ Step 2: The general homogeneous solution is A e^x + B e^{−x}.

∗ Step 3: There is an overlap (the solution D0 e^x), so we multiply the trial solution term by x to get D0 x e^x. Now there is no overlap, so D0 x e^x is the trial solution.

Example: y′′ − 2y′ + y = 3e^x.

∗ Step 1: We have D0 e^x.

∗ Step 2: The general homogeneous solution is A e^x + B x e^x.

∗ Step 3: There is an overlap (the solution D0 e^x), so we multiply the trial solution term by x^2 to get rid of the overlap, giving us the trial solution D0 x^2 e^x.

Example: y′′ − 2y′ + y = x^3 e^x.

∗ Step 1: We fill in the lower-degree terms to get D0 e^x + D1 x e^x + D2 x^2 e^x + D3 x^3 e^x.

∗ Step 2: The general homogeneous solution is A0 e^x + A1 x e^x.

∗ Step 3: There is an overlap (namely D0 e^x + D1 x e^x), so we multiply the trial solution terms by x^2 to get D0 x^2 e^x + D1 x^3 e^x + D2 x^4 e^x + D3 x^5 e^x as the trial solution.

Example: y′′ + y = sin(x).

∗ Step 1: We fill in the missing cosine term to get D0 cos(x) + E0 sin(x).

∗ Step 2: The general homogeneous solution is A cos(x) + B sin(x).

∗ Step 3: There is an overlap (all of D0 cos(x) + E0 sin(x)), so we multiply the trial solution terms by x to get D0 x cos(x) + E0 x sin(x). There is now no overlap, so D0 x cos(x) + E0 x sin(x) is the trial solution.

Example: y′′ + y = x sin(x).

∗ Step 1: We fill in the missing cosine term and then all the lower-degree terms to get D0 cos(x) + E0 sin(x) + D1 x cos(x) + E1 x sin(x).

∗ Step 2: The general homogeneous solution is A cos(x) + B sin(x).

∗ Step 3: There is an overlap (all of D0 cos(x) + E0 sin(x)), so we multiply the trial solution terms in that group by x to get D0 x cos(x) + E0 x sin(x) + D1 x^2 cos(x) + E1 x^2 sin(x), which is the trial solution since now there is no overlap.

Example: y′′′ − y′′ = x + x e^x.

∗ Step 1: We fill in the lower-degree term for x e^x and the lower-degree term for x, to get A0 + A1 x + B0 e^x + B1 x e^x.

∗ Step 2: The general homogeneous solution is C0 + C1 x + D e^x.

∗ Step 3: There are overlaps in both groups of terms: A0 + A1 x and B0 e^x each overlap, so we multiply the x group by x^2 and the e^x group by x to get rid of the overlaps. There are now no additional overlapping terms, so the trial solution is A0 x^2 + A1 x^3 + B0 x e^x + B1 x^2 e^x.

Example: y′′′′ + 2y′′ + y = x e^x + x cos(x).

∗ Step 1: We fill in the lower-degree term for x e^x, then the missing sine term for x cos(x), and then the lower-degree terms for x cos(x) and x sin(x), to get A0 e^x + A1 x e^x + D0 cos(x) + E0 sin(x) + D1 x cos(x) + E1 x sin(x).

∗ Step 2: The general homogeneous solution is B0 cos(x) + C0 sin(x) + B1 x cos(x) + C1 x sin(x).

∗ Step 3: There is an overlap (namely, all of D0 cos(x) + E0 sin(x) + D1 x cos(x) + E1 x sin(x)), so we multiply that group by x^2 to get rid of the overlap. There are no additional overlapping terms, so the trial solution is A0 e^x + A1 x e^x + D0 x^2 cos(x) + E0 x^2 sin(x) + D1 x^3 cos(x) + E1 x^3 sin(x).

• Here is a series of examples finding the general trial solution and then solving for the coefficients:

Example: Find a function y such that y′′ + y′ + y = x.

∗ The procedure produces our trial solution as y = D0 + D1 x, because there is no overlap with the solutions to the homogeneous equation.

∗ We plug in and get 0 + (D1) + (D1 x + D0) = x, so that D1 = 1 and D0 = −1.

∗ So our solution is y = x − 1.

Example: Find a function y such that y′′ − y = 2e^x.

∗ The procedure gives the trial solution as y = D0 x e^x, since D0 e^x overlaps with the solutions to the homogeneous equation.

∗ If y = D0 x e^x then y′′ = D0 (x + 2) e^x, so plugging in yields y′′ − y = [D0 (x + 2) e^x] − [D0 x e^x] = 2D0 e^x = 2e^x. Solving yields D0 = 1, so our solution is y = x e^x.

Example: Find a function y such that y′′ − 2y′ + y = x + sin(x).

∗ The procedure gives the trial solution as y = (D0 + D1 x) + (D2 cos(x) + D3 sin(x)), by filling in the missing constant term and cosine term, and because there is no overlap with the solutions to the homogeneous equation.

∗ Then we have y′′ = −D2 cos(x) − D3 sin(x) and y′ = D1 − D2 sin(x) + D3 cos(x), so plugging in yields y′′ − 2y′ + y = [−D2 cos(x) − D3 sin(x)] − 2[D1 − D2 sin(x) + D3 cos(x)] + [D0 + D1 x + D2 cos(x) + D3 sin(x)]. Setting this equal to x + sin(x) then requires D1 = 1, D0 − 2D1 = 0, −2D3 = 0, and 2D2 = 1, so our solution is y = x + 2 + (1/2) cos(x).

Example: Find all functions y such that y′′ + y = sin(x).

∗ The solutions to the homogeneous equation y′′ + y = 0 are y = C1 cos(x) + C2 sin(x).

∗ Then the procedure gives the trial solution for the non-homogeneous equation as y = D0 x cos(x) + D1 x sin(x), by filling in the missing cosine term and then multiplying both by x due to the overlap with the solutions to the homogeneous equation.

∗ We can compute (eventually) that y′′ = −D0 x cos(x) − 2D0 sin(x) − D1 x sin(x) + 2D1 cos(x).

∗ Plugging in yields y′′ + y = (−D0 x cos(x) − 2D0 sin(x) − D1 x sin(x) + 2D1 cos(x)) + (D0 x cos(x) + D1 x sin(x)) = −2D0 sin(x) + 2D1 cos(x), and so setting this equal to sin(x), we obtain D0 = −1/2 and D1 = 0.

∗ Therefore the set of solutions is y = −(1/2) x cos(x) + C1 cos(x) + C2 sin(x), for constants C1 and C2.

• Advanced Remark: The formal idea behind the method of undetermined coefficients is that the terms on the right-hand side are themselves solutions of differential equations, and hence are sent to zero by a polynomial in the differential operator D sending a function to its derivative. (Thus, for example, D(x^2) = 2x and D(e^x) = e^x.)


Then if we apply that polynomial in D which sends Q(x) to zero, to both sides of the original equation, we will end up with a homogeneous equation whose characteristic polynomial is the product of the original characteristic polynomial and the characteristic polynomial for Q(x).

Example: Find the form of a solution to y′′ − y = x^2.

∗ If we differentiate both sides 3 times (i.e., apply the differential operator D^3 to both sides) in order to kill off the x^2 term on the right-hand side, then we get y^(5) − y^(3) = 0, which has characteristic polynomial r^5 − r^3 = (r^2 − 1) · (r^3). Observe that this polynomial is the product of the characteristic polynomial r^2 − 1 of the original equation and the polynomial r^3 corresponding to D^3.

∗ Now y^(5) − y^(3) = 0 is homogeneous, so we can write down the general solution to obtain y = C1 + C2 x + C3 x^2 + C4 e^x + C5 e^{−x}.

∗ This is the same solution form that the method of undetermined coefficients gives, though with some extra lower-degree terms (which are solutions to the homogeneous equation y′′ − y = 0).

Example: Find the form of a solution to y′′ + y = x + sin(x).

∗ This is the same as the equation (D^2 + 1) · y = x + sin(x).

∗ We want to apply the operator D^2 to kill the x term, and the operator D^2 + 1 to kill the sin(x) term.

∗ The new differential equation, after we are done, is (D^2 + 1)(D^2)(D^2 + 1) · y = 0.

∗ The characteristic polynomial is (r^2 + 1)^2 r^2, which has a double root at each of r = 0 and r = ±i.

∗ Solving the resulting homogeneous equation gives the solutions as y = C1 sin(x) + C2 x sin(x) + C3 cos(x) + C4 x cos(x) + C5 + C6 x, which is the same thing that the method of undetermined coefficients gives, up to some extra terms.

1.4.2 Variation of Parameters

• Variation of Parameters will solve any non-homogeneous linear equation provided that the solutions to the homogeneous equation are known. However, the derivation is not entirely enlightening, so I will just give the steps to follow to solve y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = Q(x). (The method requires being able to solve the homogeneous equation, so we will typically apply it to the constant-coefficient equation y^(n) + a_{n−1}y^(n−1) + · · · + a_1 y′ + a_0 y = Q(x).)

Step 1: Solve the corresponding homogeneous equation y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0 and find n (linearly independent) solutions y1, · · · , yn.

Step 2: Look for functions v1, · · · , vn making yp = v1 · y1 + v2 · y2 + · · · + vn · yn a solution to the original equation: do this by requiring v1′, v2′, · · · , vn′ to satisfy the system of equations

v1′ · y1 + v2′ · y2 + · · · + vn′ · yn = 0
v1′ · y1′ + v2′ · y2′ + · · · + vn′ · yn′ = 0
  ⋮
v1′ · y1^(n−2) + v2′ · y2^(n−2) + · · · + vn′ · yn^(n−2) = 0
v1′ · y1^(n−1) + v2′ · y2^(n−1) + · · · + vn′ · yn^(n−1) = Q(x)

Solve these relations for v1′, v2′, · · · , vn′ using Cramer's Rule (or any other method). This yields vi′ = Wi(x)/W(x), where W is the Wronskian of the functions y1, y2, · · · , yn and Wi is the same Wronskian determinant except with the ith column replaced by the column vector (0, . . . , 0, Q(x))^T.

Step 3: Integrate each of these relations to find v1, · · · , vn. (Ignore constants of integration.)

Step 4: Write down the particular solution to the nonhomogeneous equation, yp = v1 · y1 + v2 · y2 + · · · + vn · yn.


Step 5: If asked, add the particular solution to the general solution of the homogeneous equation, to find all solutions of the nonhomogeneous equation. This will yield y = yp + C1 y1 + · · · + Cn yn. Plug in any extra conditions given to solve for the coefficients.

• Example: Find all functions y for which y′′ + y = sec(x).

Step 1: The homogeneous equation is y′′ + y = 0, which has two independent solutions y1 = cos(x) and y2 = sin(x).

Step 2: We have Q(x) = sec(x).

∗ We have W = det[ cos(x) sin(x) ; −sin(x) cos(x) ] = 1.

∗ Also, W1 = det[ 0 sin(x) ; sec(x) cos(x) ] = −sin(x) · sec(x).

∗ Finally, W2 = det[ cos(x) 0 ; −sin(x) sec(x) ] = cos(x) · sec(x) = 1.

∗ Thus plugging in to the formulas gives v1′ = −sin(x) · sec(x) = −tan(x) and v2′ = cos(x) · sec(x) = 1.

Step 3: Integrating yields v1 = ln(cos(x)) and v2 = x.

Step 4: We obtain the particular solution yp = ln(cos(x)) · cos(x) + x · sin(x).

Step 5: The general solution is, therefore, given by y = [ln(cos(x)) · cos(x) + x · sin(x)] + C1 sin(x) + C2 cos(x).
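As a quick illustration (not part of the original handout), here is a sympy sketch carrying out the Cramer's-rule computation for this example symbolically.

```python
import sympy as sp

x = sp.symbols('x')
y1, y2, Q = sp.cos(x), sp.sin(x), sp.sec(x)

W  = sp.Matrix([[y1, y2], [y1.diff(x), y2.diff(x)]]).det()
W1 = sp.Matrix([[0,  y2], [Q,          y2.diff(x)]]).det()
W2 = sp.Matrix([[y1, 0 ], [y1.diff(x), Q         ]]).det()

v1 = sp.integrate(sp.simplify(W1 / W), x)   # expect log(cos(x))
v2 = sp.integrate(sp.simplify(W2 / W), x)   # expect x
yp = sp.simplify(v1*y1 + v2*y2)
print(yp)                                    # expect log(cos(x))*cos(x) + x*sin(x)
```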

• Example: Find all functions y for which y′′′ − y′′ + y′ − y = e^x.

We could use undetermined coefficients to solve this, but let's use variation of parameters.

Step 1: The homogeneous equation is y′′′ − y′′ + y′ − y = 0, which has characteristic polynomial r^3 − r^2 + r − 1 = (r − 1)(r^2 + 1). So three independent solutions are y1 = cos(x), y2 = sin(x), and y3 = e^x.

Step 2: We have Q(x) = e^x.

∗ We have W = det[ cos(x) sin(x) e^x ; −sin(x) cos(x) e^x ; −cos(x) −sin(x) e^x ] = det[ cos(x) sin(x) e^x ; −sin(x) cos(x) e^x ; 0 0 2e^x ] = 2e^x, after the row operation R3 + R1.

∗ Next, W1 = det[ 0 sin(x) e^x ; 0 cos(x) e^x ; e^x −sin(x) e^x ] = e^{2x}(sin(x) − cos(x)).

∗ Also, W2 = det[ cos(x) 0 e^x ; −sin(x) 0 e^x ; −cos(x) e^x e^x ] = −e^{2x}(cos(x) + sin(x)).

∗ Finally, W3 = det[ cos(x) sin(x) 0 ; −sin(x) cos(x) 0 ; −cos(x) −sin(x) e^x ] = e^x.

∗ Thus plugging in to the formulas gives v1′ = (1/2)e^x(sin(x) − cos(x)), v2′ = −(1/2)e^x(cos(x) + sin(x)), and v3′ = 1/2.

Step 3: Integrating yields v1 = −(1/2)e^x cos(x), v2 = −(1/2)e^x sin(x), and v3 = (1/2)x.

Step 4: The particular solution is yp = −(1/2)e^x cos^2(x) − (1/2)e^x sin^2(x) + (1/2)x e^x = −(1/2)e^x + (1/2)x e^x.

Step 5: The general solution is, therefore, given by y = (1/2)x e^x + C1 cos(x) + C2 sin(x) + C3 e^x. (Note that we absorbed the −(1/2)e^x term from the particular solution into C3 e^x.)

• Theorem (Abel's Identity): If y1, y2, · · · , yn are any solutions to the homogeneous linear differential equation y^(n) + Pn(x) y^(n−1) + · · · + P2(x) y′ + P1(x) y = 0, then the Wronskian W(y1, · · · , yn) is equal to C · e^{−∫ Pn(x) dx}, for some constant C.


Abel's Identity allows one to compute the Wronskian (up to a constant factor) without actually finding the solutions y1, y2, · · · , yn.

Another consequence of Abel's Identity is that the Wronskian is either zero everywhere (if C = 0) or nonzero everywhere (if C ≠ 0, since exponentials are never zero).

Abel's Identity provides some relationships between the solution functions, which can sometimes be used to deduce information about them.

∗ For example, for a second-order linear differential equation with two solutions y1 and y2, the Wronskian is W(x) = y2′ y1 − y1′ y2.

∗ Therefore, if one solution y1 is known, Abel's Identity gives the value of W(x) (up to a value for C, which we can take to be 1 by rescaling), and therefore yields a first-order linear differential equation for the other solution y2, which can then be solved for y2.

∗ This procedure is known in general as reduction of order.

The (very clever) proof of Abel's Identity involves differentiating the determinant that defines W, and then using row operations to show that W satisfies the differential equation W′ = −Pn(x) · W.

1.5 Second-Order Equations: Applications to Newtonian Mechanics

• One of the applications we somewhat care about is the use of second-order differential equations to solve certain physics problems. Most of the examples involve springs, because springs are easy to talk about.

Note: Second-order linear equations also arise often in basic circuit problems in physics and electrical engineering. All of the discussion of the behaviors of the solutions to these second-order equations also carries over to that setup.

1.5.1 Spring Problems and Damping

• Basic setup: An object is attached to one end of a spring whose other end is fixed. The mass is displaced some amount from the equilibrium position, and the problem is to find the object's position as a function of time.

Various modifications to this basic setup include any or all of: (i) the object slides across a surface, thus adding a force (friction) depending on the object's velocity or position, (ii) the object hangs vertically, thus adding a constant gravitational force, (iii) a motor or other device imparts some additional nonconstant force (varying with time) to the object.

• In order to solve problems like this one, follow these steps:

Step 1: Draw a diagram and label the quantity or quantities of interest (typically, the position of a moving object) and identify and label all forces acting on those quantities.

Step 2: Find the values of the forces involved, and then use Newton's Second Law (F = ma) to write down a differential equation modeling the problem. Also use any information given to write down initial conditions.

∗ In the above, F is the net force on the object, i.e., the sum of each of the individual forces acting on the mass (with the proper sign), while m is the mass of the object, and a is the object's acceleration.

∗ Remember that acceleration is the second derivative of position with respect to time; thus, if y(t) is the object's position, acceleration is y′′(t).

∗ You may need to do additional work to solve for unknown constants (e.g., for a spring constant, if it is not explicitly given to you) before you can fully set up the problem.

Step 3: Solve the differential equation and find its general solution.

Step 4: Plug in any initial conditions to find the specific solution.

Step 5: Check that the answer obtained makes sense in the physical context of the problem.


∗ In other words, if you have an object attached to a fixed spring sliding on a frictionless surface, you should expect the position to be sinusoidal, something like C1 sin(ωt) + C2 cos(ωt) + D for some constants C1, C2, ω, D.

∗ If you have an object on a spring sliding on a surface imparting friction, you should expect the position to tend to some equilibrium value as t grows to ∞, since the object should be 'slowing down' as time goes on.

• Basic Example: An object of mass m is attached to a spring of spring constant k whose other end is fixed. The object is displaced a distance d from the equilibrium position of the spring, and is let go with velocity v0 at time t = 0. If the object is restricted to sliding horizontally on a frictionless surface, find the position of the object as a function of time.

Step 1: Take y(t) to be the displacement of the object from the equilibrium position. The only force acting on the object is from the spring, F_spring.

Step 2: We know that F_spring = −k · y from Hooke's Law (aka, the only thing we know about springs). Therefore we have the differential equation −k · y = m · y′′. We are also given the initial conditions y(0) = d and y′(0) = v0.

Step 3: We can rewrite the differential equation as m · y′′ + k · y = 0, or as y′′ + (k/m) · y = 0. The characteristic equation is then r^2 + k/m = 0, with roots r = ±√(k/m) i. Hence the general solution is y = C1 cos(ωt) + C2 sin(ωt), where ω = √(k/m).

Step 4: The initial conditions give d = y(0) = C1 and v0 = y′(0) = ωC2, hence C1 = d and C2 = v0/ω. Hence the solution we want is y = d · cos(ωt) + (v0/ω) · sin(ωt).

Step 5: The solution we have obtained makes sense in the context of this problem, since on a frictionless surface we should expect that the object's motion would be purely oscillatory: it should just bounce back and forth along the spring forever, since there is nothing to slow its motion. We can even see that the form of the solution agrees with our intuition: the fact that the frequency ω = √(k/m) increases with bigger spring constant but decreases with bigger mass makes sense, since a stronger spring with larger k should pull back harder on the object and cause it to oscillate more quickly, while a heavier object should resist the spring's force and oscillate more slowly.

• Most General Example: An object of mass m is attached to a spring of spring constant k whose other end is fixed. The object is displaced a distance d from the equilibrium position of the spring, and is let go with velocity v0 at time t = 0. A motor attached to the object imparts a force along its direction of motion given by R(t). If the object is restricted to sliding horizontally on a surface which imparts a frictional force of µ times the velocity of the object (opposite to the object's motion), set up a differential equation modeling the problem.

(Diagram omitted: an object on a horizontal spring, with the spring, friction, and motor forces labeled.)

As before we take y(t) to be the displacement of the object from the equilibrium position. The forces acting on the object are from the spring, F_spring, from friction, F_friction, and from the motor, F_motor.


We know that F_spring = −k · y from Hooke's Law (aka, the only thing we know about springs). We are also given that F_friction = −µ · y′, since the force acts opposite to the direction of motion and velocity is given by y′. And we are just given F_motor = R(t).

Plugging in gives us the differential equation −k · y − µ · y′ + R(t) = m · y′′, which in standard form is m · y′′ + µ · y′ + k · y = R(t). We are also given the initial conditions y(0) = d and y′(0) = v0.

• Some Terminology: If we were to solve the differential equation m · y′′ + µ · y′ + k · y = 0 (here we assume that there is no outside force acting on the object, other than the spring and friction), we would observe a few different kinds of behavior depending on the parameters m, µ, and k.

Overdamped Case: If µ^2 − 4mk > 0 and R(t) = 0, we would end up with general solutions of the form C1 e^{−r1 t} + C2 e^{−r2 t}, which when graphed is just a sum of two exponentially-decaying functions. Physically, as we can see from the condition µ^2 − 4mk > 0, this means we have 'too much' friction: we can see from the form of the solution function that the position of the object will just slide back towards its equilibrium at y = 0 without oscillating at all. This is the overdamped case. [Overdamped because there is 'too much' damping.]

Critically Damped Case: If µ^2 − 4mk = 0 and R(t) = 0, we would end up with general solutions of the form (C1 + C2 t)e^{−rt}, which when graphed is a slightly-slower-decaying exponential function that still does not oscillate, but could possibly cross the position y = 0 once, depending on the values of C1 and C2. This is the critically damped case. [Critically because we give the name 'critical' to values where some kind of behavior transitions from one thing to another.]

Underdamped Case: If µ^2 − 4mk < 0 and R(t) = 0, we end up with general solutions of the form e^{−αt} · [C1 cos(ωt) + C2 sin(ωt)], where α = µ/(2m) and ω^2 = (4mk − µ^2)/(4m^2). When graphed this is a sine curve times an exponentially-decaying function. Physically, this means that there is some friction (the exponential), but 'not enough' friction to eliminate the oscillations entirely: the position of the object will still tend toward y = 0, but the sine and cosine terms will ensure that it continues oscillating. This is the underdamped case. [Underdamped because there's not enough damping.]

Undamped Case: If there is no friction (i.e., µ = 0), we saw earlier that the solutions are of the form y = C1 cos(ωt) + C2 sin(ωt), where ω^2 = k/m. Since there is no friction, it is not a surprise that this is referred to as the undamped case.

1.5.2 Resonance and Forcing

• Suppose an object of mass m (sliding on a frictionless surface) is oscillating on a spring with frequency ω. Examine what happens to the object's motion if an external force R(t) = A cos(ωt) is applied which oscillates at the same frequency ω.

From the solution to the Basic Example above, we know that ω = √(k/m), so we must have k = m · ω^2.

Then if y(t) is the position of the object once we add in this new force R(t) = A cos(ωt), Newton's Second Law now gives −k · y + R(t) = m · y′′, or m · y′′ + k · y = R(t).

If we divide through by m and put in k = m · ω^2, we get y′′ + ω^2 y = (A/m) cos(ωt).

Now we use the method of undetermined coefficients to find a solution to this differential equation.

We would like to try something of the form y = D1 cos(ωt) + D2 sin(ωt), but this will not work because functions of that form are already solutions to the homogeneous equation y′′ + ω^2 y = 0.

Instead the method instructs that the appropriate solution will be of the form y = D1 t · cos(ωt) + D2 t · sin(ωt). We can use a trigonometric formula (the sum-to-product formula) to rewrite this as y = D t · cos(ωt + φ), where φ is a phase shift. (We can solve for the coefficients in terms of A, m, ω, but it will not be so useful.)

We can see from this formula that as t grows, so does the amplitude D · t: in other words, as time goes on, the object will continue oscillating with frequency ω around its equilibrium point, but the swings back and forth will get larger and larger.


You can observe this phenomenon for yourself if you sit in a rocking chair, or swing an object back and forth: you will quickly find that the most effective way to rock the chair or swing the object is to push back and forth at the same frequency that the object is already moving at.

• We may work out the same computation with an external force F(t) = A cos(ω1 t) oscillating at a frequency ω1 ≠ ω.

In this case (using the same argument as above) we have y′′ + ω^2 y = (A/m) cos(ω1 t).

The trial solution (again by undetermined coefficients) is y(t) = B cos(ω1 t), where B = (A/m)/(ω^2 − ω1^2).

Thus the overall solution is B cos(ω1 t), plus a solution to the homogeneous system.

Now as we can see, if ω1 and ω are far apart (i.e., the driving force is oscillating at a very different frequency from the frequency of the original system) then B will be small, and so the overall change B cos(ω1 t) that the driving force adds will be relatively small.

However, if ω1 and ω are very close to one another (i.e., the driving force is oscillating at a frequency close to that of the original system) then B will be large, and so the driving force will cause the system to oscillate with a much bigger amplitude.

As ω1 approaches ω, the amplitude B will go to ∞, which agrees with the behavior seen in the previous example (where we took ω1 = ω).
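As a quick numerical illustration (not part of the original handout), here is a short Python sketch of the amplitude formula B = (A/m)/(ω^2 − ω1^2); the values of A, m, and ω are hypothetical.

```python
# Hypothetical values: forcing amplitude A, mass m, natural frequency omega.
A, m, omega = 1.0, 1.0, 2.0

def amplitude(omega1):
    """Forced-response amplitude B for y'' + omega^2 y = (A/m) cos(omega1 t), omega1 != omega."""
    return (A / m) / (omega**2 - omega1**2)

for w1 in [0.5, 1.0, 1.9, 1.99, 1.999]:
    print(w1, amplitude(w1))   # |B| grows without bound as omega1 approaches omega = 2
```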

• Important Remark: Understanding how resonance arises (and how to minimize it!) is a very, very important application of differential equations to structural engineering.

A poor understanding of resonance is something which, at several times in the not-too-distant past, has caused bridges to fall down, airplanes to crash, and buildings to fall over.

We can see from the two examples that resonance arises when an external force acts on a system at (or very close to) the same frequency that the system is already oscillating at.

Of course, resonance is not always bad. The general principle, of applying an external driving force at (one of) a system's natural resonance frequencies, is the underlying physical idea behind the construction of many types of musical instruments.

Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 5): Eigenvalues and Eigenvectors (by Evan Dummit, 2012, v. 1.00)

Contents

1 Eigenvalues and Eigenvectors

1.1 The Basic Setup

1.2 Some Slightly More Advanced Results About Eigenvalues

1.3 Theory of Similarity and Diagonalization

1.4 How To Diagonalize A Matrix (if possible)

1 Eigenvalues and Eigenvectors

• We have discussed (perhaps excessively) the correspondence between solving a system of homogeneous linear equations and solving the matrix equation A·~x = ~0, for A an n×n matrix and ~x and ~0 each n×1 column vectors.

• For reasons that will become more apparent soon, a more general version of this question which is also of interest is to solve the matrix equation A·~x = λ~x, where λ is a scalar. (The original homogeneous system problem corresponds to λ = 0.)

• In the language of linear transformations, this says the following: given a linear transformation T : V → V from a vector space V to itself, on what vectors ~x does T act as multiplication by a constant λ?

1.1 The Basic Setup

• Definition: For A an n×n matrix, a nonzero vector ~x with A·~x = λ~x is called an eigenvector of A, and the corresponding scalar λ is called an eigenvalue of A.

Important note: We do not consider the zero vector ~0 an eigenvector.

For a fixed value of λ, the set Sλ whose elements are the eigenvectors ~x with A·~x = λ~x, together with the zero vector, is a subspace of V. (This set Sλ is called the eigenspace associated to the eigenvalue λ.)

∗ [S1]: Sλ contains the zero vector.

∗ [S2]: Sλ is closed under addition, because if A · ~x1 = λ~x1 and A · ~x2 = λ~x2, then A · (~x1 + ~x2) =λ(~x1 + ~x2).

∗ [S3]: Sλ is closed under scalar multiplication, because for any scalar β, A · (β~x) = β(A ·~x) = β(λ~x) =λ(β~x).

• It turns out that it is fairly straightforward to find all of the eigenvalues: because λ~x = (λI)·~x where I is the n×n identity matrix, we can rewrite the eigenvalue equation A·~x = λ~x = (λI)·~x as (λI − A)·~x = ~0. But we know precisely when there will be a nonzero vector ~x with (λI − A)·~x = ~0: it is when the matrix (λI − A) is not invertible, or, in other words, when det(λI − A) = 0.

• Definition: When we expand the determinant det(tI − A), we will obtain a polynomial of degree n in the variable t. This polynomial is called the characteristic polynomial p(t) of the matrix A, and its roots are precisely the eigenvalues of A.

Notation 1: Some authors instead define the characteristic polynomial as the determinant of the matrix A − tI rather than tI − A. I define it this way because then the coefficient of t^n will always be 1, rather than (−1)^n.


Notation 2: It is often customary, when referring to the eigenvalues of a matrix, to include an eigenvalue the appropriate number of extra times if the eigenvalue is a multiple root of the characteristic polynomial. Thus, for the characteristic polynomial t^2·(t − 1)^3, we could say the eigenvalues are λ = 0, 0, 1, 1, 1 if we wanted to emphasize that the eigenvalues occurred more than once.

Remark: The characteristic polynomial may have non-real numbers as roots. Non-real eigenvalues are absolutely acceptable; the only wrinkle is that the eigenvectors for these eigenvalues will also necessarily contain non-real entries. (If A has real number entries, then any non-real roots of the characteristic polynomial will come in complex conjugate pairs. The eigenvectors for one root will be complex conjugates of the eigenvectors for the other root.)

• Proposition: The eigenvalues of an upper-triangular matrix are the diagonal entries.

This statement follows from the observation that the determinant of an upper-triangular matrix is theproduct of the diagonal entries, combined with the observation that if A is upper-triangular, then tI−Ais also upper-triangular. (If diagonal entries are repeated, the eigenvalues are repeated the same numberof times.)

Example: The eigenvalues of the upper-triangular matrix [ 1, i, √3 ; 0, 3, −8 ; 0, 0, π ] (written here with rows separated by semicolons) are 1, 3, and π, and the eigenvalues of [ 2, 0, 1 ; 0, 3, 2 ; 0, 0, 2 ] are 2, 2, and 3.

• To find all the eigenvalues (and eigenvectors) of a matrix A, follow these steps:

Step 1: Write down the matrix tI − A and compute its determinant (using any method) to obtain the characteristic polynomial p(t).

Step 2: Set p(t) equal to zero and solve. The roots are precisely the eigenvalues λ of A.

Step 3: For each eigenvalue λ, solve for all vectors ~x satisfying A·~x = λ~x. (Either do this directly, or by solving the homogeneous system (λI − A)·~x = ~0 via row-reduction.) The resulting solution vectors ~x form the eigenspace associated to λ, and the nonzero vectors in that space are the eigenvectors corresponding to λ.
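For readers who want a quick numerical check of these steps, here is a minimal sketch (not part of the handout), assuming numpy is available; it uses the 2×2 matrix from one of the worked examples further below:

```python
# Numerical check of the eigenvalue/eigenvector procedure with numpy.
import numpy as np

A = np.array([[2.0, 2.0],
              [3.0, 1.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are eigenvectors
print(eigenvalues)                             # 4 and -1, in some order
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(lam, np.allclose(A @ v, lam * v))    # verify A v = lambda v for each pair
```

Note that numpy returns eigenvectors normalized to length 1, so they will be scalar multiples of the hand-computed basis vectors.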

• Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [ 1, 0 ; 0, 1 ].

Step 1: We have tI − A = [ t−1, 0 ; 0, t−1 ], so p(t) = det(tI − A) = (t − 1)^2.

Step 2: The characteristic equation (t − 1)^2 = 0 has a double root t = 1, so the eigenvalues are λ = 1, 1. (Alternatively, we could have used the fact that the matrix is upper-triangular.)

Step 3: We want to find the vectors with [ 1, 0 ; 0, 1 ]·[ a ; b ] = [ a ; b ]. Clearly, all vectors [ a ; b ] have this property. Therefore, a basis for the eigenspace with λ = 1 is given by [ 1 ; 0 ] and [ 0 ; 1 ].

• Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [ 1, 1 ; 0, 1 ].

Step 1: We have tI − A = [ t−1, −1 ; 0, t−1 ], so p(t) = det(tI − A) = (t − 1)^2.

Step 2: The characteristic equation (t − 1)^2 = 0 has a double root t = 1, so the eigenvalues are λ = 1, 1. (Alternatively, we could have used the fact that the matrix is upper-triangular.)

Step 3: We want to find the vectors with [ 1, 1 ; 0, 1 ]·[ a ; b ] = [ a ; b ]. This requires [ a+b ; b ] = [ a ; b ], which means a can be arbitrary and b = 0. So the vectors we want are those of the form [ a ; 0 ], and a basis for the eigenspace with λ = 1 is given by [ 1 ; 0 ].


Remark: Note that this matrix [ 1, 1 ; 0, 1 ] and the identity matrix [ 1, 0 ; 0, 1 ] have the same characteristic polynomial and eigenvalues, but do not have the same eigenvectors. In fact, for λ = 1, the eigenspace for [ 1, 1 ; 0, 1 ] is 1-dimensional, while the eigenspace for [ 1, 0 ; 0, 1 ] is 2-dimensional.

• Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [ 2, 2 ; 3, 1 ].

Step 1: We have tI − A = [ t−2, −2 ; −3, t−1 ], so p(t) = det(tI − A) = (t−2)(t−1) − (−2)(−3) = t^2 − 3t − 4.

Step 2: Since p(t) = t^2 − 3t − 4 = (t − 4)(t + 1), the eigenvalues are λ = −1, 4.

Step 3:

∗ For λ = −1 we want [ 2, 2 ; 3, 1 ]·[ a ; b ] = −[ a ; b ], so we need [ 2a+2b ; 3a+b ] = [ −a ; −b ], which reduces to a = −(2/3)b. So the vectors we want are those of the form [ −(2/3)b ; b ], so a basis is given by [ −2 ; 3 ].

∗ For λ = 4 we want [ 2, 2 ; 3, 1 ]·[ a ; b ] = 4[ a ; b ], so we need [ 2a+2b ; 3a+b ] = [ 4a ; 4b ], which reduces to a = b. So the vectors we want are those of the form [ b ; b ], so a basis is given by [ 1 ; 1 ].

• Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [ 0, 0, 0 ; 1, 0, −1 ; 0, 1, 0 ].

Step 1: We have tI − A = [ t, 0, 0 ; −1, t, 1 ; 0, −1, t ], so p(t) = det(tI − A) = t·det[ t, 1 ; −1, t ] = t·(t^2 + 1).

Step 2: Since p(t) = t·(t^2 + 1), the eigenvalues are λ = 0, i, −i.

Step 3:

∗ For λ = 0 we want [ 0, 0, 0 ; 1, 0, −1 ; 0, 1, 0 ]·[ a ; b ; c ] = 0·[ a ; b ; c ], so we need [ 0 ; a−c ; b ] = [ 0 ; 0 ; 0 ], so a = c and b = 0. So the vectors we want are those of the form [ a ; 0 ; a ], so a basis is given by [ 1 ; 0 ; 1 ].

∗ For λ = i we want [ 0, 0, 0 ; 1, 0, −1 ; 0, 1, 0 ]·[ a ; b ; c ] = i·[ a ; b ; c ], so we need [ 0 ; a−c ; b ] = [ ia ; ib ; ic ], so a = 0 and b = ic. So the vectors we want are those of the form [ 0 ; ic ; c ], so a basis is given by [ 0 ; i ; 1 ].

∗ For λ = −i we want [ 0, 0, 0 ; 1, 0, −1 ; 0, 1, 0 ]·[ a ; b ; c ] = −i·[ a ; b ; c ], so we need [ 0 ; a−c ; b ] = [ −ia ; −ib ; −ic ], so a = 0 and b = −ic. So the vectors we want are those of the form [ 0 ; −ic ; c ], so a basis is given by [ 0 ; −i ; 1 ].


• Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [ 1, 0, 1 ; −1, 1, 3 ; −1, 0, 3 ].

Step 1: We have tI − A = [ t−1, 0, −1 ; 1, t−1, −3 ; 1, 0, t−3 ], so expanding along the first row gives p(t) = (t−1)·det[ t−1, −3 ; 0, t−3 ] + (−1)·det[ 1, t−1 ; 1, 0 ] = (t−1)^2(t−3) + (t−1).

Step 2: Since p(t) = (t−1)·[(t−1)(t−3) + 1] = (t−1)(t−2)^2, the eigenvalues are λ = 1, 2, 2.

Step 3:

∗ For λ = 1 we want [ 1, 0, 1 ; −1, 1, 3 ; −1, 0, 3 ]·[ a ; b ; c ] = 1·[ a ; b ; c ], so we need [ a+c ; −a+b+3c ; −a+3c ] = [ a ; b ; c ], so c = 0 and a = 0. So the vectors we want are those of the form [ 0 ; b ; 0 ], so a basis is given by [ 0 ; 1 ; 0 ].

∗ For λ = 2 we want [ 1, 0, 1 ; −1, 1, 3 ; −1, 0, 3 ]·[ a ; b ; c ] = 2·[ a ; b ; c ], so we need [ a+c ; −a+b+3c ; −a+3c ] = [ 2a ; 2b ; 2c ], so a = c and b = 2c. So the vectors we want are those of the form [ c ; 2c ; c ], so a basis is given by [ 1 ; 2 ; 1 ].

1.2 Some Slightly More Advanced Results About Eigenvalues

• Theorem: If λ is an eigenvalue of the matrix A which appears exactly k times as a root of the characteristicpolynomial, then the dimension of the eigenspace corresponding to λ is at least 1 and at most k.

Remark: The number of times that λ appears as a root of the characteristic polynomial is called thealgebraic multiplicity of λ, and the dimension of the eigenspace corresponding to λ is called the geometricmultiplicity of λ. So what the theorem says is that the geometric multiplicity is at most the algebraicmultiplicity.

Example: If the characteristic polynomial is (t − 1)3(t − 3)2, then the eigenspace for λ = 1 is at most3-dimensional, and the eigenspace for λ = 3 is at most 2-dimensional.

Proof: The statement that the eigenspace has dimension at least 1 is immediate, because (by assumption)λ is a root of the characteristic polynomial and therefore has at least one nonzero eigenvector associatedto it.

∗ For the statement that the dimension is at most k, the idea is to look at the homogeneous system(λI −A) · ~x = ~0.

∗ If λ appears k times as a root of the characteristic polynomial, then when we put the matrix λI −Ainto its reduced row-echelon form B, then B must have at most k rows of all zeroes.

∗ Otherwise, the matrix B (and hence λI −A too, although this requires a check) would have 0 as aneigenvalue more than k times, because B is in echelon form and therefore upper-triangular.

∗ But the number of rows of all zeroes in a square matrix is the same as the number of nonpivotalcolumns, which is the number of free variables, which is the dimension of the solution space.

∗ So, putting all the statements together, we see that the dimension of the eigenspace is at most k.

• Theorem: If ~v1, ~v2, . . . , ~vn are eigenvectors ofA associated to distinct eigenvalues λ1, λ2, . . . , λn, then ~v1, ~v2, . . . , ~vnare linearly independent.


Proof: Suppose we had a nontrivial dependence relation between ~v1, . . . , ~vn, say a1~v1 + · · · + an~vn = ~0.(Note that at least two coecients have to be nonzero, because none of ~v1, . . . , ~vn is the zero vector.)

∗ Multiply both sides by the matrix A: this gives A · (a1~v1 + · · ·+ an~vn) = A ·~0 = ~0.

∗ Now since ~v1, . . . , ~vn are eigenvectors this says a1(λ1~v1) + · · ·+ an(λn~vn) = ~0.

∗ But now if we scale the original equation by λ1 and subtract (to eliminate ~v1), we obtain a2(λ2 −λ1)~v2 + a3(λ3 − λ1)~v3 + · · ·+ an(λn − λ1)~vn = ~0.

∗ Since by assumption all of the eigenvalues λ1, λ2, ..., λn were different, this dependence is still nontrivial, since each of λj − λ1 is nonzero, and at least one of a2, ..., an is nonzero.

∗ But now we can repeat the process to eliminate each of ~v2, ~v3, ..., ~v(n−1) in turn. Eventually we are left with the equation b·~vn = ~0 for some nonzero b. But this is impossible, because it would say that ~vn = ~0, contradicting our definition saying that the zero vector is not an eigenvector.

∗ So there cannot be a nontrivial dependence relation, meaning that ~v1, . . . , ~vn are linearly independent.

• Corollary: If A is an n × n matrix with n distinct eigenvalues λ1, λ2, . . . , λn, and ~v1, ~v2, . . . , ~vn are (any)eigenvectors associated to those eigenvalues, then ~v1, ~v2, . . . , ~vn are a basis for Rn.

This result follows from the previous theorem: it guarantees that ~v1, ~v2, . . . , ~vn are linearly independent,so since they are vectors in the n-dimensional vector space Rn, they are a basis.

• Theorem: The product of the eigenvalues of A is the determinant of A.

Proof: If we expand out the product p(t) = (t − λ1)·(t − λ2)···(t − λn), we see that the constant term is equal to (−1)^n·λ1·λ2···λn. But the constant term is also just p(0), and since p(t) = det(tI − A) we have p(0) = det(−A) = (−1)^n·det(A). Thus, setting the two expressions equal shows that the product of the eigenvalues equals the determinant of A.

• Theorem: The sum of the eigenvalues of A equals the trace of A.

Note: The trace of a matrix is defined to be the sum of its diagonal entries.

Proof: If we expand out the product p(t) = (t − λ1)·(t − λ2)···(t − λn), we see that the coefficient of t^(n−1) is equal to −(λ1 + ... + λn). If we expand out the determinant det(tI − A) to find the coefficient of t^(n−1), we can show (with a little bit of effort) that the coefficient is the negative of the sum of the diagonal entries of A. Therefore, setting the two expressions equal shows that the sum of the eigenvalues equals the trace of A.

1.3 Theory of Similarity and Diagonalization

• Definition: We say two n×n matrices A and B are similar (or conjugate) if there exists an invertible n×n matrix P such that B = P^(−1)AP. (We refer to P^(−1)AP as the conjugation of A by P.)

Example: The matrices A = [ 3, −1 ; 2, −1 ] and B = [ 1, 2 ; 1, 1 ] are similar: with P = [ 2, 3 ; 1, 2 ], so that P^(−1) = [ 2, −3 ; −1, 2 ], we can verify that [ 2, −3 ; −1, 2 ]·[ 3, −1 ; 2, −1 ]·[ 2, 3 ; 1, 2 ] = [ 1, 2 ; 1, 1 ], so that P^(−1)AP = B.

Remark: The matrix Q = [ 0, −1 ; 1, −2 ] also has B = Q^(−1)AQ. In general, if two matrices A and B are similar, then there can be many different matrices Q with Q^(−1)AQ = B.

• Similar matrices have quite a few useful algebraic properties (which justify the name similar). If B = P−1APand D = P−1CP , then we have the following:

The sum of the conjugates is the conjugate of the sum: B +D = P−1AP + P−1CP = P−1(A+ C)P .

The product of the conjugates is the conjugate of the product: BD = P−1AP · P−1CP = P−1(AC)P .

The inverse of the conjugate is the conjugate of the inverse: A−1 exists if and only if B−1 exists, andB−1 = P−1A−1P .


The determinant of the conjugate is equal to the original determinant: det(B) = det(P−1AP ) =det(P−1) det(A) det(P ) = det(A) det(P−1P ) = det(A).

The conjugate has the same characteristic polynomial as the original matrix: det(tI − B) = det(P^(−1)·tI·P − P^(−1)·A·P) = det(P^(−1)(tI − A)P) = det(tI − A).

∗ In particular, a matrix and its conjugate have the same eigenvalues (with the same multiplicities).

∗ Also, by using the fact that the trace is equal both to the sum of the diagonal elements and a coefficient in the characteristic polynomial, we see that a matrix and its conjugate have the same trace.

If ~x is an eigenvector of A with eigenvalue λ, then P−1 · ~x is an eigenvector of B with eigenvalue λ: ifA · ~x = λ~x then B · (P−1~x) = P−1A(PP−1)~x = P−1A · ~x = P−1 · (λ~x) = λ(P−1~x).

∗ This is also true in reverse: if ~y is an eigenvector of B then P · ~y is an eigenvector of A (with thesame eigenvalue).

∗ In particular, the eigenspaces for B have the same dimensions as the eigenspaces for A.

• One question we might have about similarity is: given a matrix A, what is the simplest matrix B that A is similar to?

As observed above, any matrix similar to A has the same eigenvalues as A. So, if the eigenvalues are λ1, ..., λn, the simplest form we could plausibly hope for would be the diagonal matrix diag(λ1, ..., λn) whose diagonal elements are the eigenvalues of A.

• Definition: We say that a matrix A is diagonalizable if it is similar to a diagonal matrix D; that is, if there exists an invertible matrix P with D = P^(−1)AP.

Example: The matrix A = [ −2, −6 ; 3, 7 ] is diagonalizable. We can check that for P = [ −1, 2 ; 1, −1 ] and P^(−1) = [ 1, 2 ; 1, 1 ], we have P^(−1)AP = [ 4, 0 ; 0, 1 ] = D.

• If we know that A is diagonalizable and have D = P^(−1)AP, then it is very easy to compute any power of A:

Since D is diagonal, D^k is the diagonal matrix whose diagonal entries are the kth powers of the diagonal entries of D.

Then D^k = (P^(−1)AP)^k = P^(−1)(A^k)P, so A^k = P·D^k·P^(−1).

Example: With A = [ −2, −6 ; 3, 7 ] as above, we have D^k = [ 4^k, 0 ; 0, 1 ], so that A^k = [ −1, 2 ; 1, −1 ]·[ 4^k, 0 ; 0, 1 ]·[ 1, 2 ; 1, 1 ] = [ 2 − 4^k, 2 − 2·4^k ; −1 + 4^k, −1 + 2·4^k ].

∗ Observation: This formula also makes sense for values of k which are not positive integers. For example, if k = −1 we get the matrix [ 7/4, 3/2 ; −3/4, −1/2 ], which is actually the inverse matrix A^(−1). And if we set k = 1/2 we get the matrix B = [ 0, −2 ; 1, 3 ], whose square satisfies B^2 = [ −2, −6 ; 3, 7 ] = A.
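These computations are easy to verify numerically. Here is a small sketch (not from the handout), assuming numpy is available, checking both A^k = P·D^k·P^(−1) and the "square root" matrix B:

```python
# Verify the diagonalization-power formula for A = [[-2,-6],[3,7]].
import numpy as np

A = np.array([[-2.0, -6.0], [3.0, 7.0]])
P = np.array([[-1.0, 2.0], [1.0, -1.0]])
Pinv = np.linalg.inv(P)

k = 5
Ak = P @ np.diag([4.0**k, 1.0**k]) @ Pinv
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True

B = P @ np.diag([2.0, 1.0]) @ Pinv                     # the k = 1/2 case
print(B)                                               # [[0,-2],[1,3]]
print(np.allclose(B @ B, A))                           # True
```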

• Theorem: An n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors. In particular, every matrix whose eigenvalues are distinct is diagonalizable.

Proof: If A has n linearly independent eigenvectors ~v1, ..., ~vn with respective eigenvalues λ1, ..., λn, then consider the matrix P whose columns are the eigenvectors ~v1, ..., ~vn.

∗ Because ~v1, ..., ~vn are eigenvectors, the columns of A·P are A~v1, ..., A~vn, which equal λ1~v1, ..., λn~vn.

∗ But the matrix whose columns are λ1~v1, ..., λn~vn is exactly P·D, where D = diag(λ1, ..., λn).

∗ Therefore, A·P = P·D. Now since the eigenvectors are linearly independent, P is invertible, and we can therefore write D = P^(−1)AP, as desired.

For the other direction, if D = P^(−1)AP then (as above) we can rewrite this to say AP = PD.

∗ If the columns of P are ~v1, ..., ~vn, then AP = PD says that the columns A~v1, ..., A~vn equal λ1~v1, ..., λn~vn, which (by comparing columns) says that A~v1 = λ1~v1, ..., A~vn = λn~vn. Thus the columns ~v1, ..., ~vn of P are eigenvectors, and (because P is invertible) they are linearly independent.

Finally, the last statement in the theorem follows because (as shown earlier) a matrix with n distinct eigenvalues has n linearly independent eigenvectors.

• Advanced Remark: As the theorem demonstrates, if we are trying to diagonalize a matrix, we can run into trouble if the matrix has repeated eigenvalues. However, we might still like to know the simplest form to which a non-diagonalizable matrix is similar.

The answer is given by what is called the Jordan Canonical Form (of a matrix): every matrix is similar to a block-diagonal matrix diag(J1, J2, ..., Jn), where each J1, ..., Jn is a square Jordan block matrix: a matrix with a single value λ on every diagonal entry, 1's directly above the diagonal, and zeroes elsewhere.

Example: The non-diagonalizable matrix [ 2, 1, 0 ; 0, 2, 0 ; 0, 0, 3 ] is in Jordan Canonical Form, with J1 = [ 2, 1 ; 0, 2 ] and J2 = [ 3 ].

The existence and uniqueness of the Jordan Canonical Form can be proven using a careful analysis of generalized eigenvectors: vectors ~x satisfying (λI − A)^k·~x = ~0 for some positive integer k. (Regular eigenvectors would correspond to k = 1.)

∗ Roughly speaking, the idea is to use certain carefully-chosen generalized eigenvectors to fill in for the missing eigenvectors; doing this causes the appearance of the extra 1's above the diagonal in the Jordan blocks.

• Theorem (Cayley-Hamilton): If p(x) is the characteristic polynomial of a matrix A, then p(A) is the zero matrix 0 (where, in applying a polynomial to a matrix, we replace the constant term with that constant times the identity matrix).

Example: For the matrix A = [ 2, 2 ; 3, 1 ], we have det(tI − A) = det[ t−2, −2 ; −3, t−1 ] = (t − 1)(t − 2) − 6 = t^2 − 3t − 4. We can compute A^2 = [ 10, 6 ; 9, 7 ], and then indeed we have A^2 − 3A − 4I = [ 10, 6 ; 9, 7 ] − [ 6, 6 ; 9, 3 ] − [ 4, 0 ; 0, 4 ] = [ 0, 0 ; 0, 0 ].
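This example is easy to confirm numerically; here is a brief sketch (not part of the handout), assuming numpy is available:

```python
# Cayley-Hamilton check for A = [[2,2],[3,1]] with p(t) = t^2 - 3t - 4.
import numpy as np

A = np.array([[2.0, 2.0], [3.0, 1.0]])
pA = A @ A - 3 * A - 4 * np.eye(2)
print(pA)              # the zero matrix (up to rounding)
print(np.poly(A))      # characteristic polynomial coefficients, approximately [1, -3, -4]
```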


Proof (if A is diagonalizable): If A is diagonalizable, then let D = P^(−1)AP with D diagonal, and let p(x) be the characteristic polynomial of A.

∗ The diagonal entries of D are the eigenvalues λ1, ..., λn of A, hence are roots of the characteristic polynomial of A. So p(λ1) = ... = p(λn) = 0.

∗ Then, because raising D to a power just raises all of its diagonal entries to that power, we can see that p(D) = p(diag(λ1, ..., λn)) = diag(p(λ1), ..., p(λn)) = diag(0, ..., 0) = 0.

∗ Now, by conjugating each term and adding the results, we see that 0 = p(D) = p(P^(−1)AP) = P^(−1)[p(A)]P. So by conjugating back, we see that p(A) = P·0·P^(−1) = 0.

In the case where A is not diagonalizable, the proof is more difficult. One way is to use the Jordan Canonical Form J of A in place of the diagonal matrix D; then (one can verify) p(J) = 0, and then the remainder of the argument is the same.

1.4 How To Diagonalize A Matrix (if possible)

• In order to determine whether a matrix A is diagonalizable (and if it is, how to find a diagonalization D = P^(−1)AP), follow these steps:

Step 1: Find the characteristic polynomial and eigenvalues of A.

Step 2: Find a basis for each eigenspace of A.

Step 3a: Determine whether A is diagonalizable: if each eigenspace has the proper dimension (namely, the number of times the corresponding eigenvalue appears as a root of the characteristic polynomial), then the matrix is diagonalizable. Otherwise, the matrix is not diagonalizable.

Step 3b: If the matrix is diagonalizable, then D is the diagonal matrix whose diagonal entries are the eigenvalues of A (with appropriate multiplicities), and then P can be taken to be the matrix whose columns are linearly independent eigenvectors of A, in the same order as the eigenvalues appear in D.
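As a side note, the same recipe can be carried out numerically; the following is a minimal sketch (not part of the handout), assuming numpy is available, using the matrix from the first worked example below:

```python
# np.linalg.eig returns the eigenvalues and the columns of P at once;
# D = P^{-1} A P should come out diagonal when A is diagonalizable.
import numpy as np

A = np.array([[0.0, -2.0],
              [3.0,  5.0]])
eigvals, P = np.linalg.eig(A)             # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))                    # diag(2, 3), in some order
print(np.allclose(D, np.diag(eigvals)))   # True
```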

• Example: For A = [ 0, −2 ; 3, 5 ], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(−1)AP, and if so, find them.

Step 1: We have tI − A = [ t, 2 ; −3, t−5 ], so det(tI − A) = t(t − 5) + 6 = t^2 − 5t + 6 = (t − 2)(t − 3). The eigenvalues are therefore λ = 2, 3.

Step 2:

∗ For λ = 2 we need to solve [ 0, −2 ; 3, 5 ]·[ a ; b ] = 2[ a ; b ], so [ −2b ; 3a + 5b ] = [ 2a ; 2b ] and thus a = −b. The eigenvectors are of the form [ −b ; b ], so a basis for the λ = 2 eigenspace is [ −1 ; 1 ].

∗ For λ = 3 we need to solve [ 0, −2 ; 3, 5 ]·[ a ; b ] = 3[ a ; b ], so [ −2b ; 3a + 5b ] = [ 3a ; 3b ] and thus a = −(2/3)b. The eigenvectors are of the form [ −(2/3)b ; b ], so a basis for the λ = 3 eigenspace is [ −2 ; 3 ].

Step 3: Since the eigenvalues are distinct we know that A is diagonalizable, and D = [ 2, 0 ; 0, 3 ]. We have two linearly independent eigenvectors, and so we can take P = [ −1, −2 ; 1, 3 ].

To check: we have P^(−1) = [ −3, −2 ; 1, 1 ], so P^(−1)AP = [ −3, −2 ; 1, 1 ]·[ 0, −2 ; 3, 5 ]·[ −1, −2 ; 1, 3 ] = [ 2, 0 ; 0, 3 ] = D.


Note: We could also take D = [ 3, 0 ; 0, 2 ] if we wanted. There is no particular reason to care much about which diagonal matrix we use, as long as we make sure to arrange the eigenvectors in the correct order.

• Example: For A = [ 1, −1, 0 ; 0, 2, 0 ; 0, 2, 1 ], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(−1)AP, and if so, find them.

Step 1: We have tI − A = [ t−1, 1, 0 ; 0, t−2, 0 ; 0, −2, t−1 ], so det(tI − A) = (t−1)·det[ t−2, 0 ; −2, t−1 ] = (t−1)^2(t−2). The eigenvalues are therefore λ = 1, 1, 2.

Step 2:

∗ For λ = 1 we need to solve [ 1, −1, 0 ; 0, 2, 0 ; 0, 2, 1 ]·[ a ; b ; c ] = [ a ; b ; c ], so [ a−b ; 2b ; 2b+c ] = [ a ; b ; c ] and thus b = 0. The eigenvectors are of the form [ a ; 0 ; c ], so a basis for the λ = 1 eigenspace is [ 1 ; 0 ; 0 ], [ 0 ; 0 ; 1 ].

∗ For λ = 2 we need to solve [ 1, −1, 0 ; 0, 2, 0 ; 0, 2, 1 ]·[ a ; b ; c ] = 2[ a ; b ; c ], so [ a−b ; 2b ; 2b+c ] = [ 2a ; 2b ; 2c ] and thus a = −b and c = 2b. The eigenvectors are of the form [ −b ; b ; 2b ], so a basis for the λ = 2 eigenspace is [ −1 ; 1 ; 2 ].

Step 3: Since the eigenspace for λ = 1 is 2-dimensional, the matrix A is diagonalizable, and D = [ 1, 0, 0 ; 0, 1, 0 ; 0, 0, 2 ]. We have three linearly independent eigenvectors, so we can take P = [ 1, 0, −1 ; 0, 0, 1 ; 0, 1, 2 ].

To check: we have P^(−1) = [ 1, 1, 0 ; 0, −2, 1 ; 0, 1, 0 ], so P^(−1)AP = [ 1, 0, 0 ; 0, 1, 0 ; 0, 0, 2 ] = D.

• Example: For A = [ 1, 1, 1 ; 0, 1, 1 ; 0, 0, 1 ], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(−1)AP, and if so, find them.

Step 1: We have tI − A = [ t−1, −1, −1 ; 0, t−1, −1 ; 0, 0, t−1 ], so det(tI − A) = (t−1)^3 since tI − A is upper-triangular. The eigenvalues are therefore λ = 1, 1, 1.

Step 2:

∗ For λ = 1 we need to solve [ 1, 1, 1 ; 0, 1, 1 ; 0, 0, 1 ]·[ a ; b ; c ] = [ a ; b ; c ], so [ a+b+c ; b+c ; c ] = [ a ; b ; c ] and thus b = c = 0. The eigenvectors are of the form [ a ; 0 ; 0 ], so a basis for the λ = 1 eigenspace is [ 1 ; 0 ; 0 ].

Step 3: Since the eigenspace for λ = 1 is 1-dimensional but the eigenvalue appears 3 times as a root of the characteristic polynomial, the matrix A is not diagonalizable.
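The dimension count can also be checked numerically via the rank of A − λI; here is a quick sketch (not part of the handout), assuming numpy is available:

```python
# The geometric multiplicity of lambda = 1 is the nullity of A - I.
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
r = np.linalg.matrix_rank(A - 1.0 * np.eye(3))
print("geometric multiplicity =", 3 - r)   # 1, while the algebraic multiplicity is 3
```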

Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 6): Supplement (by Evan Dummit, 2012, v. 1.00)

0.1 Repeated-Eigenvalue Systems

• We would like to be able to solve systems ~y′ = A·~y where A is an n×n matrix which does not have n linearly independent eigenvectors (equivalently: for which A is not diagonalizable).

Recall that if ~v is an eigenvector with eigenvalue λ, then ~y = ~v·e^(λt) is a solution to the differential equation.

By the existence-uniqueness theorem we know that the system ~y′ = A·~y has an n-dimensional solution space.

So if A has n linearly independent eigenvectors, then we can write down the general solution to the system directly.

If A has fewer than n linearly independent eigenvectors, we are still missing some of the solutions, and need to construct the missing ones.

We do this via generalized eigenvectors: vectors which are not eigenvectors, but are close enough that we can use them to write down more solutions to the system ~y′ = A·~y.

If λ is a root of the characteristic equation k times, we say that λ has multiplicity k. If the eigenspace for λ has dimension less than k, we say that λ is defective.

• Here is the procedure to follow to solve a system ~y′ = A·~y whose coefficient matrix may have defective eigenvalues:

Step 1: Find the eigenvalues of the n×n matrix A.

Step 2: For each eigenvalue λ (appearing as a root of the characteristic polynomial with multiplicity m), find a basis ~v1, ..., ~vk for the λ-eigenspace.

∗ Step 2a: If the eigenspace is not defective (in other words, if k = m), then the solutions to the system ~y′ = A·~y coming from this eigenspace are e^(λt)·~v1, ..., e^(λt)·~vk.

∗ Step 2b: If the eigenspace is defective (in other words, if k < m), then above each eigenvector ~w in the basis ~v1, ..., ~vk, look for a chain of generalized eigenvectors satisfying

(A − λI)·~w2 = ~w
(A − λI)·~w3 = ~w2
...
(A − λI)·~wl = ~w(l−1).

Then the solutions to the system coming from this chain are e^(λt)·~w, e^(λt)·[t·~w + ~w2], e^(λt)·[(t^2/2)·~w + t·~w2 + ~w3], ..., e^(λt)·[(t^(l−1)/(l−1)!)·~w + (t^(l−2)/(l−2)!)·~w2 + ... + t·~w(l−1) + ~wl].

∗ Note: If the λ-eigenspace is 1-dimensional, then there is only one chain, and it will always be of length m, where m is the multiplicity of λ. If the λ-eigenspace has dimension greater than 1, then the chains may have different lengths, and it may be necessary to toss out some elements of some chains, as they may lead to linearly dependent solutions.

Step 3: If ~y1, ..., ~yn are the n solution functions obtained in Step 2, then the general solution to the system ~y′ = A·~y is ~y = c1~y1 + c2~y2 + ... + cn~yn.

∗ If there are complex-conjugate eigenvalues then we generally want to write the solutions as real-valued functions. To obtain real-valued solutions to the system from a pair of complex-conjugate solutions, replace them with Re(y) and Im(y), the real and imaginary parts of either one of them.

• Example: Find the general solution to the system y1′ = 5y1 − 9y2, y2′ = 4y1 − 7y2.

Step 1: In matrix form this is ~y′ = A·~y, where A = [ 5, −9 ; 4, −7 ].

∗ We have A − tI = [ 5−t, −9 ; 4, −7−t ], so det(A − tI) = (5−t)(−7−t) − (4)(−9) = 1 + 2t + t^2 = (t + 1)^2.

∗ Thus there is a double eigenvalue λ = −1.

∗ To compute the eigenvectors for λ = −1, we want [ 6, −9 ; 4, −6 ]·[ a ; b ] = [ 0 ; 0 ], so that 2a − 3b = 0. So the eigenvectors are of the form [ a ; (2/3)a ], and the eigenspace is 1-dimensional with a basis given by [ 3 ; 2 ].

Step 2: There is only one independent eigenvector for the double eigenvalue λ = −1, so we need to compute a chain of generalized eigenvectors to find the remaining solution to the system.

∗ We start with ~w = [ 3 ; 2 ], and also have A − λI = [ 6, −9 ; 4, −6 ].

∗ We want to find ~w2 = [ a ; b ] with [ 6, −9 ; 4, −6 ]·[ a ; b ] = [ 3 ; 2 ].

· Dividing the first row by 3 and the second row by 2 yields the single condition 2a − 3b = 1, so (for example) we can take a = 2 and b = 1.

· This gives our choice of ~w2 = [ 2 ; 1 ].

∗ Now we have a chain of the proper length (namely 2), so we can write down the two solutions for this eigenspace: they are [ 3 ; 2 ]·e^(−t) and ([ 3 ; 2 ]·t + [ 2 ; 1 ])·e^(−t).

Step 3: We thus obtain the general solution [ y1 ; y2 ] = c1·[ 3 ; 2 ]·e^(−t) + c2·([ 3 ; 2 ]·t·e^(−t) + [ 2 ; 1 ]·e^(−t)).

∗ Slightly more explicitly, this is y1 = (3c1 + 2c2 + 3c2·t)·e^(−t), y2 = (2c1 + c2 + 2c2·t)·e^(−t).
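The defining properties of the chain found above are easy to confirm numerically; here is a brief sketch (not part of the handout), assuming numpy is available:

```python
# Check the generalized-eigenvector chain for A = [[5,-9],[4,-7]], lambda = -1.
import numpy as np

A = np.array([[5.0, -9.0], [4.0, -7.0]])
lam = -1.0
w  = np.array([3.0, 2.0])    # eigenvector
w2 = np.array([2.0, 1.0])    # generalized eigenvector

M = A - lam * np.eye(2)
print(np.allclose(M @ w, 0))    # True:  (A - lambda I) w  = 0
print(np.allclose(M @ w2, w))   # True:  (A - lambda I) w2 = w
```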

• Example: Find the general solution to the system y1′ = 4y1 − y2 − 2y3, y2′ = 2y1 + y2 − 2y3, y3′ = 5y1 − 3y3.

Step 1: In matrix form this is ~y′ = A·~y, where A = [ 4, −1, −2 ; 2, 1, −2 ; 5, 0, −3 ].

∗ We have A − tI = [ 4−t, −1, −2 ; 2, 1−t, −2 ; 5, 0, −3−t ], so expanding along the bottom row gives det(A − tI) = 5·det[ −1, −2 ; 1−t, −2 ] + (−3−t)·det[ 4−t, −1 ; 2, 1−t ] = 2 − t + 2t^2 − t^3 = (2 − t)(1 + t^2).

∗ Thus the eigenvalues are λ = 2, i, −i.

∗ λ = 2: We want [ 2, −1, −2 ; 2, −1, −2 ; 5, 0, −5 ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ], so that 2a − b − 2c = 0 and 5a − 5c = 0. Hence c = a and b = 2a − 2c = 0, so the eigenvectors are of the form [ a ; 0 ; a ]. The eigenspace is 1-dimensional, with a basis given by [ 1 ; 0 ; 1 ].

∗ λ = i: We want [ 4−i, −1, −2 ; 2, 1−i, −2 ; 5, 0, −3−i ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ]. Subtracting the first row from the second row (and simplifying) yields −a + b = 0, so b = a; then the third row gives 5a + (−3 − i)c = 0, or c = 5a/(3 + i) = (5(3 − i)/10)·a = ((3 − i)/2)·a. So the eigenvectors are of the form [ a ; a ; ((3−i)/2)·a ], and the eigenspace is 1-dimensional with a basis given by [ 2 ; 2 ; 3−i ].

∗ λ = −i: We can just take the conjugate of the eigenvectors for λ = i, so a basis is given by [ 2 ; 2 ; 3+i ].

Step 2: The eigenspaces are all the proper sizes, so we do not need to compute any generalized eigenvectors.

Step 3: The general solution is [ y1 ; y2 ; y3 ] = c1·[ 1 ; 0 ; 1 ]·e^(2t) + c2·[ 2 ; 2 ; 3−i ]·e^(it) + c3·[ 2 ; 2 ; 3+i ]·e^(−it).

∗ With real-valued functions: [ y1 ; y2 ; y3 ] = c1·[ 1 ; 0 ; 1 ]·e^(2t) + c2·[ 2 cos(t) ; 2 cos(t) ; 3 cos(t) + sin(t) ] + c3·[ 2 sin(t) ; 2 sin(t) ; −cos(t) + 3 sin(t) ].

∗ Slightly more explicitly, this is y1 = c1·e^(2t) + 2c2 cos(t) + 2c3 sin(t), y2 = 2c2 cos(t) + 2c3 sin(t), y3 = c1·e^(2t) + c2·(3 cos(t) + sin(t)) + c3·(−cos(t) + 3 sin(t)).

• Example: Find the general solution to the system y1′ = 4y1 − y3, y2′ = 2y1 + 2y2 − y3, y3′ = 3y1 + y2.

Step 1: In matrix form this is ~y′ = A·~y, where A = [ 4, 0, −1 ; 2, 2, −1 ; 3, 1, 0 ].

∗ We have A − tI = [ 4−t, 0, −1 ; 2, 2−t, −1 ; 3, 1, −t ], so expanding along the first row gives det(A − tI) = (4−t)·det[ 2−t, −1 ; 1, −t ] + (−1)·det[ 2, 2−t ; 3, 1 ] = 8 − 12t + 6t^2 − t^3 = (2 − t)^3.

∗ Thus there is a triple eigenvalue λ = 2.

∗ To compute the eigenvectors for λ = 2, we want [ 2, 0, −1 ; 2, 0, −1 ; 3, 1, −2 ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ], so that 2a − c = 0 and 3a + b − 2c = 0. Hence c = 2a and b = 2c − 3a = a, so the eigenvectors are of the form [ a ; a ; 2a ].

∗ So the eigenspace is 1-dimensional, and has a basis given by [ 1 ; 1 ; 2 ].

Step 2: There is only one independent eigenvector for the triple eigenvalue λ = 2, so we need to compute a chain of generalized eigenvectors to find the remaining 2 solutions to the system.

∗ We start with ~w = [ 1 ; 1 ; 2 ], and also have A − λI = [ 2, 0, −1 ; 2, 0, −1 ; 3, 1, −2 ].

∗ First we want to find ~w2 = [ a ; b ; c ] with (A − λI)·~w2 = ~w = [ 1 ; 1 ; 2 ]. Row-reducing the augmented system gives the conditions −a + b = 0 and 2a − c = 1, so one possibility is ~w2 = [ 1 ; 1 ; 1 ].

∗ Next we want to find ~w3 = [ d ; e ; f ] with (A − λI)·~w3 = ~w2 = [ 1 ; 1 ; 1 ]. The same row-reduction gives the conditions −d + e = −1 and 2d − f = 1, so one possibility is ~w3 = [ 1 ; 0 ; 1 ].

∗ Now we have a chain of the proper length (namely 3), so we can write down the three solutions for this eigenspace: they are [ 1 ; 1 ; 2 ]·e^(2t), ([ 1 ; 1 ; 2 ]·t + [ 1 ; 1 ; 1 ])·e^(2t), and ([ 1 ; 1 ; 2 ]·(t^2/2) + [ 1 ; 1 ; 1 ]·t + [ 1 ; 0 ; 1 ])·e^(2t).

Step 3: We thus obtain the general solution as the (rather unwieldy and complicated) expression [ y1 ; y2 ; y3 ] = c1·[ 1 ; 1 ; 2 ]·e^(2t) + c2·([ 1 ; 1 ; 2 ]·t + [ 1 ; 1 ; 1 ])·e^(2t) + c3·([ 1 ; 1 ; 2 ]·(t^2/2) + [ 1 ; 1 ; 1 ]·t + [ 1 ; 0 ; 1 ])·e^(2t).

∗ Slightly more explicitly, this is y1 = (c1 + c2 + c3 + c2·t + c3·t + (1/2)c3·t^2)·e^(2t), y2 = (c1 + c2 + c2·t + c3·t + (1/2)c3·t^2)·e^(2t), y3 = (2c1 + c2 + c3 + 2c2·t + c3·t + c3·t^2)·e^(2t).

Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Math 320 (part 6): Systems of First-Order Differential Equations (by Evan Dummit, 2012, v. 1.00)

Contents

1 Systems of First-Order Linear Differential Equations

1.1 General Theory of (First-Order) Linear Systems

1.2 Eigenvalue Method (for diagonalizable coefficient matrices)

1 Systems of First-Order Linear Differential Equations

• In many (perhaps most) applications of differential equations, we have not one but several quantities which change over time and interact with one another: populations in an ecosystem, economic quantities, concentrations of molecules in a reaction, etc. Naturally, we would like to develop methods to solve such systems.

1.1 General Theory of (First-Order) Linear Systems

• Before we start our discussion of systems of linear differential equations, we first observe that we can reduce any system of linear differential equations to a system of first-order linear differential equations (in more variables): if we define new variables equal to the higher-order derivatives of our old variables, then we can rewrite the old system as a system of first-order equations (in more variables).

Example: Consider the single 3rd-order equation y′′′ + y′ = 0.

∗ If we define new variables z = y′ and w = y′′ = z′, then the original equation tells us that y′′′ = −y′, so w′ = y′′′ = −y′ = −z.

∗ Thus, this single 3rd-order equation is equivalent to the first-order system y′ = z, z′ = w, w′ = −z.

Example: Consider the system y1′′ + y1 − y2 = 0 and y2′′ + y1′ + y2′·sin(x) = e^x.

∗ If we define new variables z1 = y1′ and z2 = y2′, then z1′ = y1′′ = −y1 + y2 and z2′ = y2′′ = e^x − y1′ − y2′·sin(x) = e^x − z1 − z2·sin(x).

∗ So this system is equivalent to the first-order system y1′ = z1, y2′ = z2, z1′ = −y1 + y2, z2′ = e^x − z1 − z2·sin(x).

• Thus, whatever we can show about solutions of systems of first-order linear equations will carry over to arbitrary systems of linear differential equations. So we will talk only about systems of first-order linear differential equations from now on.

• A system of first-order linear differential equations (with unknown functions y1, ..., yn) has the general form

y1′ = a1,1(x)·y1 + a1,2(x)·y2 + ... + a1,n(x)·yn + p1(x)
y2′ = a2,1(x)·y1 + a2,2(x)·y2 + ... + a2,n(x)·yn + p2(x)
...
yn′ = an,1(x)·y1 + an,2(x)·y2 + ... + an,n(x)·yn + pn(x)

for some functions ai,j(x) and pj(x), where 1 ≤ i, j ≤ n.

Most of the time we will be dealing with systems with constant coefficients, in which all of the ai,j(x) are constant functions.

We say a first-order system is homogeneous if each of p1(x), p2(x), ..., pn(x) is zero.

An initial condition for this system consists of n pieces of information: y1(x0) = b1, y2(x0) = b2, ..., yn(x0) = bn, where x0 is the starting value for x and the bi are constants.


• Many of the theorems about general systems of first-order linear equations are very similar to the theorems about nth-order linear equations.

• Theorem (Homogeneous Systems): If the coefficient functions ai,j(x) are continuous, then the set of solutions (y1, y2, ..., yn) to the homogeneous system

y1′ = a1,1(x)·y1 + a1,2(x)·y2 + ... + a1,n(x)·yn
...
yn′ = an,1(x)·y1 + an,2(x)·y2 + ... + an,n(x)·yn

is an n-dimensional vector space.

The fact that the set of solutions forms a vector space is not so hard to show using the subspace criteria.

The real result of this theorem, which follows from the existence-uniqueness theorem below, is that the set of solutions is n-dimensional.

• Theorem (Existence-Uniqueness): For a system of first-order linear differential equations, if the coefficient functions ai,j(x) and nonhomogeneous terms pj(x) are each continuous in an interval around x = x0, then the system

y1′ = a1,1(x)·y1 + a1,2(x)·y2 + ... + a1,n(x)·yn + p1(x)
...
yn′ = an,1(x)·y1 + an,2(x)·y2 + ... + an,n(x)·yn + pn(x)

with initial conditions y1(x0) = b1, ..., yn(x0) = bn has a unique solution (y1, y2, ..., yn) in some (possibly smaller) interval around x = x0.

Example: The system y′ = e^x·y + sin(x)·y, z′ = 3x^2·y has a unique solution for every initial condition y(a) = b1, z(a) = b2.

• Definition: Given n vectors s1 = (y1,1, y1,2, ..., y1,n), s2 = (y2,1, y2,2, ..., y2,n), ..., sn = (yn,1, yn,2, ..., yn,n) with functions as entries, their Wronskian is defined as the determinant W of the n×n matrix whose (i, j) entry is yi,j. The n vectors s1, ..., sn are linearly independent if their Wronskian is nonzero.

1.2 Eigenvalue Method (for diagonalizable coefficient matrices)

• We now restrict our discussion to homogeneous first-order systems with constant coefficients: those of the form

y1′ = a1,1·y1 + a1,2·y2 + ... + a1,n·yn
y2′ = a2,1·y1 + a2,2·y2 + ... + a2,n·yn
...
yn′ = an,1·y1 + an,2·y2 + ... + an,n·yn

• We can rewrite this system in matrix form as ~y′ = A·~y, where ~y is the n×1 column vector with entries y1, y2, ..., yn and A is the n×n matrix whose (i, j) entry is ai,j.


• The idea behind the so-called Eigenvalue Method is the following observation:

Observation: If ~v is an eigenvector of A with eigenvalue λ, then ~y = ~v·e^(λx) is a solution to ~y′ = A·~y.

Proof: Differentiating ~y = e^(λx)·~v with respect to x gives ~y′ = λe^(λx)·~v = λ~y = A·~y.

• Theorem: If A has n linearly independent eigenvectors ~v1, ..., ~vn with eigenvalues λ1, ..., λn, then the solutions to the matrix differential system ~y′ = A·~y are given by ~y = c1·e^(λ1·x)·~v1 + c2·e^(λ2·x)·~v2 + ... + cn·e^(λn·x)·~vn, where c1, ..., cn are arbitrary constants.

Important Remark: The statement that A has n linearly independent eigenvectors ~v1, ..., ~vn with eigenvalues λ1, ..., λn is equivalent to the statement that A is diagonalizable, with D = P^(−1)AP where the diagonal elements of D are λ1, ..., λn and the columns of P are the vectors ~v1, ..., ~vn.

Proof: By the observation above, each of e^(λ1·x)·~v1, e^(λ2·x)·~v2, ..., e^(λn·x)·~vn is a solution to ~y′ = A·~y. We claim that they are a basis for the solution space.

∗ We can compute the Wronskian of these solutions; after factoring out the exponentials from each column, we obtain W = e^((λ1 + ... + λn)x)·det(P), where P is the matrix whose columns are ~v1, ..., ~vn. This product is nonzero because the exponential is nonzero and the vectors ~v1, ..., ~vn are linearly independent. Hence e^(λ1·x)·~v1, e^(λ2·x)·~v2, ..., e^(λn·x)·~vn are linearly independent.

∗ We also know by the existence-uniqueness theorem that the set of solutions to the system ~y′ = A·~y is n-dimensional.

∗ So since we have n linearly independent elements e^(λ1·x)·~v1, ..., e^(λn·x)·~vn in an n-dimensional vector space, they are a basis.

∗ Finally, since these solutions are a basis, all solutions are of the form ~y = c1·e^(λ1·x)·~v1 + c2·e^(λ2·x)·~v2 + ... + cn·e^(λn·x)·~vn, where c1, ..., cn are arbitrary constants.

• By the remark, the theorem allows us to solve all homogeneous systems of linear differential equations whose coefficient matrix A is diagonalizable. To do this, follow these steps:

Step 1: Write the system in the form ~y′ = A·~y for an n×1 column matrix ~y and an n×n matrix A (if the system is not already in this form). If the system has equations which are not first order, introduce new variables to make the system first-order.

Step 2: Find the eigenvalues and eigenvectors of A, and check that A is diagonalizable. If A is diagonalizable, generate the list of n linearly independent eigenvectors ~v1, ..., ~vn with corresponding eigenvalues λ1, ..., λn.

∗ If there is a pair of complex-conjugate eigenvalues, then the eigenvectors for one are the complex conjugates of the eigenvectors for the other.

Step 3: Write down the general solution to the system: ~y = c1·e^(λ1·x)·~v1 + c2·e^(λ2·x)·~v2 + ... + cn·e^(λn·x)·~vn, where c1, ..., cn are arbitrary constants.

∗ Note: If there are complex-conjugate eigenvalues then we generally want to write the solutions as real-valued functions.

∗ To do this, we take linear combinations: if λ = a + bi has an eigenvector ~v = ~w1 + i·~w2, then the conjugate eigenvalue a − bi has the conjugate eigenvector ~w1 − i·~w2.

∗ Then to obtain real-valued solutions to the system, replace the two complex-valued solutions coming from this pair with the two real-valued solutions e^(ax)·(~w1·cos(bx) − ~w2·sin(bx)) and e^(ax)·(~w1·sin(bx) + ~w2·cos(bx)).

Step 4 (if necessary): Plug in any initial conditions and solve for c1, ..., cn.
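For readers who like to sanity-check the eigenvalue method against a black-box numerical solver, here is a short sketch (not part of the handout), assuming numpy and scipy are available; it uses the matrix from the first example below and an assumed, illustrative initial condition:

```python
# Compare the eigenvalue-method solution of y' = A y with a numerical integrator.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, -3.0],
              [1.0,  5.0]])
y0 = np.array([1.0, 0.0])                      # illustrative initial condition

# Eigenvalue method: y(x) = c1 e^{2x} (-3,1) + c2 e^{4x} (-1,1)
V = np.array([[-3.0, -1.0], [1.0, 1.0]])       # eigenvectors as columns
c = np.linalg.solve(V, y0)                     # match the initial condition
exact = lambda x: V @ (c * np.exp(np.array([2.0, 4.0]) * x))

sol = solve_ivp(lambda x, y: A @ y, (0.0, 1.0), y0, rtol=1e-10, atol=1e-12)
print(np.allclose(sol.y[:, -1], exact(sol.t[-1]), atol=1e-6))   # True
```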

• Example: Find all functions y1 and y2 such that y1′ = y1 − 3y2 and y2′ = y1 + 5y2.

Step 1: The system is ~y′ = A·~y, with ~y = [ y1 ; y2 ] and A = [ 1, −3 ; 1, 5 ].

Step 2: The characteristic polynomial of A is det(tI − A) = det[ t−1, 3 ; −1, t−5 ] = (t−1)(t−5) + 3 = t^2 − 6t + 8, so the eigenvalues are λ = 2, 4.

∗ For λ = 2 we want [ 1, −3 ; 1, 5 ]·[ a ; b ] = [ 2a ; 2b ], so that [ a − 3b ; a + 5b ] = [ 2a ; 2b ]. This yields a = −3b, so [ −3 ; 1 ] is an eigenvector.

∗ For λ = 4 we want [ 1, −3 ; 1, 5 ]·[ a ; b ] = [ 4a ; 4b ], so that [ a − 3b ; a + 5b ] = [ 4a ; 4b ]. This yields a = −b, so [ −1 ; 1 ] is an eigenvector.

Step 3: The general solution is [ y1 ; y2 ] = c1·[ −3 ; 1 ]·e^(2x) + c2·[ −1 ; 1 ]·e^(4x) = [ −3c1·e^(2x) − c2·e^(4x) ; c1·e^(2x) + c2·e^(4x) ].

• Example: Find all real-valued functions y1 and y2 such that y1′ = y2 and y2′ = −y1.

Step 1: The system is ~y′ = A·~y, with ~y = [ y1 ; y2 ] and A = [ 0, 1 ; −1, 0 ].

Step 2: The characteristic polynomial of A is det(tI − A) = det[ t, −1 ; 1, t ] = t^2 + 1, so the eigenvalues are λ = ±i.

∗ For λ = i we want [ 0, 1 ; −1, 0 ]·[ a ; b ] = [ ia ; ib ], so b = ia and thus [ 1 ; i ] is an eigenvector.

∗ For λ = −i we can take the complex conjugate of the eigenvector for λ = i to see that [ 1 ; −i ] is an eigenvector.

Step 3: The general solution is [ y1 ; y2 ] = c1·[ 1 ; i ]·e^(ix) + c2·[ 1 ; −i ]·e^(−ix).

∗ But we want real-valued solutions, so we need to replace the complex-valued solutions [ 1 ; i ]·e^(ix) and [ 1 ; −i ]·e^(−ix) with real-valued ones.

∗ We have λ = i and ~v = [ 1 ; 0 ] + [ 0 ; 1 ]·i, so that ~w1 = [ 1 ; 0 ] and ~w2 = [ 0 ; 1 ].

∗ Plugging into the formula in the note gives us the equivalent real-valued solutions [ 1 ; 0 ]·cos(x) − [ 0 ; 1 ]·sin(x) = [ cos(x) ; −sin(x) ] and [ 1 ; 0 ]·sin(x) + [ 0 ; 1 ]·cos(x) = [ sin(x) ; cos(x) ].

∗ This gives the solution to the system as [ y1 ; y2 ] = c1·[ cos(x) ; −sin(x) ] + c2·[ sin(x) ; cos(x) ] = [ c1·cos(x) + c2·sin(x) ; −c1·sin(x) + c2·cos(x) ].

• Remark: If the coefficient matrix is not diagonalizable, life is more difficult, as we cannot generate a basis for the solution space using eigenvectors alone. Solving systems with non-diagonalizable coefficient matrices requires introducing the exponential of a matrix and developing methods for computing it. We do not cover such techniques in this course.

Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


Some selected proof-based homework problems

• Suppose A is an n × n matrix with only one eigenvalue and n linearly independent eigenvectors. Show thatevery vector in Rn is an eigenvector of A.

Solution:

∗ Say the only eigenvalue of A is λ, and consider the eigenspace for λ.

∗ We know that the eigenspace is a subspace of Rn.

∗ Because λ is the only eigenvalue, every eigenvector has eigenvalue λ. So since A has n linearlyindependent eigenvectors, the λ-eigenspace contains n linearly independent vectors.

∗ But the only subspace of Rn which can contain n linearly independent vectors is Rn itself (as everyset of n linearly independent vectors spans Rn).

∗ Therefore the λ-eigenspace is all of Rn.

∗ But this means every vector in Rn is an eigenvector of A with eigenvalue λ.

∗ Note: In fact, the argument tells us that A acts by multiplying every vector by λ. But this meansthat A = λI (where I is the identity matrix).

• Suppose that A is an invertible matrix. Show that 0 is not an eigenvalue of A.

Solution 1:

∗ By denition, 0 is an eigenvalue precisely when there is a nonzero vector ~v such that A ·~v = ~0 (since0~v = ~0).

∗ But if A is invertible, then A~v = ~0 only has the trivial solution ~v = ~0.

· Reason: we have ~v = (A−1A)~v = A−1(A~v) = A−1~0 = ~0.

∗ So, if A is invertible, then 0 is not an eigenvalue.

Solution 2:

∗ A is invertible precisely when det(A) ≠ 0.

∗ By definition, the characteristic polynomial is p(t) = det(A − tI). In particular, setting t = 0 shows that det(A) = p(0).

∗ Since they are equal, we see that det(A) ≠ 0 if and only if p(0) ≠ 0.

∗ But the roots of the characteristic polynomial are precisely the eigenvalues. So p(0) ≠ 0 is exactly the same as saying that 0 is not a root of p(t), and hence that 0 is not an eigenvalue.

Note: Both solutions work both ways, and show that if A is not invertible, then 0 is an eigenvalue.

• Prove that if A and B are n × n matrices and there exists an invertible n × n matrix P with B = P−1AP ,then det(A) = det(B). Also show that if either A or B is invertible, then the other is invertible, andB−1 = P−1A−1P .

Solution:

∗ Taking the determinant of both sides of B = P−1AP yields det(B) = det(P−1) det(A) det(P ).

∗ Since determinants are scalars we can rearrange them to get det(B) = det(A) · det(P−1) det(P ).

∗ Using multiplicativity of determinants again we have det(B) = det(A) ·det(P−1P ) = det(A) ·det(I).∗ Since the determinant of the identity matrix is 1, we obtain det(B) = det(A), as desired.

∗ The equality of the determinants means that det(B) ≠ 0 precisely when det(A) ≠ 0.

∗ But since a determinant is nonzero exactly when that matrix is invertible, we see that A is invertibleif and only if B is invertible.

∗ Thus, if either is invertible, the other is as well.

∗ Finally, if both are invertible, then we can write B−1 = (P−1AP )−1 = P−1A−1(P−1)−1, since theinverse of a product is the product of the inverses in the opposite order.

∗ Since (P−1)−1 = P we obtain B−1 = P−1A−1P , as desired.


• Prove that if λ is an eigenvalue of A, then λn is an eigenvalue of An for every n > 1.

Solution 1:

∗ Saying λ is an eigenvalue of A means that there is a nonzero vector ~v such that A · ~v = λ~v.

∗ If we multiply both sides by A then we have A(A · ~v) = A(λ~v).

∗ Since λ is a scalar, A(λ~v) = λ(A·~v) = λ(λ~v) = λ^2·~v, where in the middle we used the fact that A·~v = λ~v.

∗ Therefore we have A^2·~v = λ^2·~v.

∗ If we multiply both sides by A again, then we can do the same rearrangement to see that A^3·~v = λ^3·~v.

∗ By continuing this process, we will eventually end up with A^n·~v = λ^n·~v, for each n > 1.

∗ But this says precisely that ~v is an eigenvector of A^n with eigenvalue λ^n.

∗ In particular, λ^n is an eigenvalue of A^n.

∗ Note: To be fully rigorous, one should set this up as an induction on n.

· The base case n = 1 is immediate, since A1 · ~v = λ1~v.

· The inductive step is: given An−1~v = λn−1~v, multiply both sides by A to get An~v = A(λn−1~v) =λn−1(A~v) = λn−1(λ~v) = λn~v.

Solution 2:

∗ Saying λ is an eigenvalue of A means that det(A − λI) = 0. We want to see that det(A^n − λ^n·I) = 0, since this is the same as saying that λ^n is an eigenvalue of A^n.

∗ We can factor A^n − λ^n·I as (A − λI)·[A^(n−1) + A^(n−2)·(λI) + A^(n−3)·(λI)^2 + ... + A·(λI)^(n−2) + (λI)^(n−1)].

∗ If we write B = A^(n−1) + A^(n−2)·(λI) + A^(n−3)·(λI)^2 + ... + A·(λI)^(n−2) + (λI)^(n−1), then A^n − λ^n·I = (A − λI)·B.

∗ If we write B = An−1+An−2(λI)+An−3(λI)2+· · ·+A(λI)n−2+(λI)n−1, then An−λnI = (A−λI)·B.∗ Taking determinants says det(An − λnI) = det(A− λI) · det(B).

∗ But since det(A− λI) = 0, we see that det(An − λnI) = 0 as well.

∗ Therefore, since det(An − λnI) = 0, we see that λn is an eigenvalue of An.

• Suppose U and W are subspaces of a vector space V . Let S be the set of all vectors in V of the form ~u+ ~w,where ~u is a vector in U and ~w is a vector in W . Prove that S is also a subspace of V .

Solution: We need to show three things: that S contains the zero vector, that S is closed under addition,and that S is closed under scalar multiplication.

[S1]: S contains ~0.

∗ Because U and W are subspaces, we know that each of them contains the zero vector ~0. Therefore,S contains ~0 +~0 = ~0, so S contains the zero vector.

[S2]: S is closed under addition.

∗ Suppose ~v1 and ~v2 are vectors in S. We want to show that ~v1 + ~v2 is also in S.

∗ By denition of S, we can write ~v1 = ~u1 + ~w1 and ~v2 = ~u2 + ~w2, where ~u1 and ~u2 are in U and ~w1

and ~w2 are in W .

∗ Then ~v1 + ~v2 = ~u1 + ~w1 + ~u2 + ~w2.

∗ We can rearrange this to read ~v1 + ~v2 = (~u1 + ~u2) + (~w1 + ~w2).

∗ But now since U is a subspace, ~u1 + ~u2 is also in U . Similarly, ~w1 + ~w2 is in W .

∗ So we have written ~v1 + ~v2 as the sum of a vector in U and a vector in W .

∗ Therefore, ~v1 + ~v2 is in S, by the denition of S.

[S3]: S is closed under scalar multiplication.

∗ Suppose ~v is a vector in S, and α is a scalar. We want to show that α · ~v is also in S.

∗ By denition of S, we can write ~v = ~u+ ~w, where ~u is in U and ~w is in W .

∗ Then by the distributive law we have α · ~v = α · (~u+ ~w) = (α · ~u) + (α · ~w).∗ But now since U is a subspace, α · ~u is also in U . Similarly, α · ~w is in W .

∗ So we have written α · ~v as the sum of a vector in U and a vector in W .

∗ Therefore,α · ~v is in S, by the denition of S.


• If ~v1, · · · , ~vk, ~vk+1 span a vector space V , and ~vk+1 is a linear combination of ~v1, · · · , ~vk, show that ~v1, · · · , ~vkspan V .

Solution:

∗ The statement that ~v1, · · · , ~vk, ~vk+1 span V says that any vector ~w in V can be written as a linearcombination of ~v1, · · · , ~vk, ~vk+1: say ~w = a1~v1 + a2~v2 + · · ·+ ak~vk + ak+1~vk+1.

∗ We are also told that ~vk+1 is a linear combination of ~v1, · · · , ~vk: say as ~vk+1 = b1~v1+b2~v2+· · ·+bk~vk.∗ Now we can just substitute this expression for ~vk+1 into the expression for ~w: this gives ~w =a1~v1 + a2~v2 + · · ·+ ak~vk + ak+1 (b1~v1 + b2~v2 + · · ·+ bk~vk).

∗ If we expand out the product and collect terms, we obtain the equivalent expression ~w = (a1 + ak+1b1)~v1+(a2 + ak+1b2)~v2 + · · ·+ (ak + ak+1bk)~vk.

∗ This expresses ~w as a linear combination of ~v1, · · · , ~vk. Since ~w was arbitrary, this says every vectorin V is a linear combination of ~v1, · · · , ~vk which is to say, ~v1, · · · , ~vk span V .

Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.


MATH 320: HOMEWORK 2

1.5: 6, 17, 30, 38

2.1: 6, 21, 30

2.2: 10, 20, 24

2.3: 12, 20

EP1.5.6 Solve the initial value problem

xy′ + 5y = 7x^2, y(2) = 5.

Solution. The first step is to rewrite the problem appropriately:

y′ + (5/x)y = 7x.

Now we can apply what we know of integrating factors. First, our integrating factor is

ρ(x) = exp(∫ 5/x dx) = exp(log x^5) = x^5.

The general theory of first order linear ODEs gives us

x^5·y = ∫ x^5·7x dx = x^7 + C ⇒ y(x) = (x^7 + C)/x^5.

Applying the initial condition results in

5 = (128 + C)/32 ⇒ C = 5·32 − 128 = 32,

so the solution is y(x) = (x^7 + 32)/x^5.

EP1.5.17 Solve the initial value problem

(1 + x)y′ + y = cos x, y(0) = 1.

Solution. Again, rewriting is the key first step:

y′ + (1/(1 + x))y = (cos x)/(1 + x).

This gives the integrating factor

ρ(x) = exp(log(1 + x)) = 1 + x.

Using this in our original ODE gives

(1 + x)y = ∫ cos x dx ⇒ y(x) = (sin x + C)/(1 + x).

Now we apply the initial condition to find

1 = (0 + C)/(1 + 0) ⇒ C = 1.

Thus the solution is y(x) = (1 + sin x)/(1 + x).


EP1.5.30 Express the solution of the initial value problem

2x·(dy/dx) = y + 2x·cos x, y(1) = 0

as an integral as in Example 3 of this section.

Solution. As per usual, start with a rewrite:

dy/dx − (1/(2x))y = cos x.

Now we get the integrating factor:

ρ(x) = exp(−∫_1^x 1/(2t) dt) = 1/√x.

Apply this to the ODE to get

y(x) = √x·(y0 + ∫_1^x (cos t)/√t dt) = √x·∫_1^x (cos t)/√t dt.

Ta-da.

EP1.5.38 Consider the cascade of two tanks shown in Fig. 1.5.5, with V1 = 100 (gal) and V2 = 200 (gal) the volumes of brine in the two tanks. Each tank also initially contains 50 lb of salt. The three flow rates indicated in the figure are 5 gal/min of pure water flowing into tank 1, 5 gal/min flowing from tank 1 to tank 2, and 5 gal/min flowing out of tank 2.

(a) Find the amount x(t) of salt in tank 1 at time t.

(b) Suppose that y(t) is the amount of salt in tank 2 at time t. Show first that

dy/dt = 5x/100 − 5y/200

and then solve for y(t), using the function x(t) found in part (a).

(c) Finally, find the maximum amount of salt ever in tank 2.

Solution. We must set up the appropriate IVP first. Since we are losing 5 gal/min of solution and there is x/100 lb/gal of salt in the solution, the rate at which we are losing salt is

dx/dt = −5x/100

and we start with x(0) = 50.

(a) Solving the ODE gives x(t) = D·e^(−t/20). Applying the initial condition results in x(t) = 50e^(−t/20).

(b) Since tank 1 is losing salt at the rate 5x/100, that is the rate at which tank 2 is gaining salt. Similarly to our reasoning for tank 1, this gives an outflow rate of 5y/200 lb/min of salt. Using the previous solution for x(t) we get

dy/dt = (250e^(−t/20))/100 − 5y/200.

This can be rewritten as the linear ODE

dy/dt + 5y/200 = (250e^(−t/20))/100.

The integrating factor is

ρ(t) = exp(∫ 1/40 dt) = exp(t/40).

Using the usual formula for linear ODEs,

ρ(t)·y(t) = ∫ (5/2)·exp((t − 2t)/40) dt ⇒ y(t) = (−100·exp(−t/40) + C)/exp(t/40) ⇒ y(t) = −100e^(−t/20) + Ce^(−t/40).

Now applying the initial condition gives

50 = −100 + C ⇒ C = 150.

So the solution is y(t) = 150e^(−t/40) − 100e^(−t/20).

(c) The max occurs when y′ = 0, so that means

0 = 5x/100 − 5y/200 ⇒ y = 2x.

Using our formulas for y and x will show that the max must occur at t = −40·log(3/4). Plugging this back into y(t) shows the max is 225/4 = 56.25.
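A quick numerical sanity check of part (c) is below (a sketch, not part of the assignment), assuming numpy is available:

```python
# Maximize y(t) = 150 e^{-t/40} - 100 e^{-t/20} on a fine grid and compare
# with the closed-form answers t = 40 ln(4/3) and y_max = 225/4.
import numpy as np

t = np.linspace(0.0, 200.0, 200001)
y = 150 * np.exp(-t / 40) - 100 * np.exp(-t / 20)
i = np.argmax(y)
print(t[i], 40 * np.log(4 / 3))   # both approximately 11.51
print(y[i], 225 / 4)              # both approximately 56.25
```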

EP2.1.6 Solve the initial value problem

dx/dt = 3x(x − 5), x(0) = 2.

Use the exact solution or a computer-generated slope field to sketch graphs of several solutions of the ODE, and highlight the particular solution of the IVP.

Solution. This is a separable ODE. It goes like this:

dx/(x(x − 5)) = 3 dt.

A direct integration would be difficult, so we use partial fractions to break down the left hand side.

1/(x(x − 5)) = A/x + B/(x − 5) ⇒ 1 = A(x − 5) + Bx

x = 0 : 1 = −5A ⇒ A = −1/5

x = 5 : 1 = 5B ⇒ B = 1/5

So we can rewrite the ODE as

(−1/(5x) + 1/(5(x − 5))) dx = 3 dt

(1/5)·log((x − 5)/x) = 3t + C.

Doing a little bit of algebra yields

x(t) = 5/(1 − De^(15t)).

Applying the initial conditions results in

2 = 5/(1 − D) ⇒ D = −3/2.

Substituting D = −3/2 and multiplying the numerator and denominator by 2 gives the particular solution

x(t) = 10/(2 + 3e^(15t)).

I recommend using DFIELD for the figure portion of this.

EP2.1.21 Suppose that the population P (t) of a country satisfies the differential equation P ′ =

kP (200−P ) with k constant. Its population in 1940 was 100 million and was then growing

at the rate of 1 million per year. Predict this country’s population for the year 2000.

Solution. The first step is to solve this separable ODE using partial fractions. The separation is

dP/(P(200 − P)) = k dt.

For the partial fractions,

1/(P(200 − P)) = A/P + B/(200 − P)   ⇒   1 = A(200 − P) + BP,
P = 0:    1 = 200A  ⇒  A = 1/200,
P = 200:  1 = 200B  ⇒  B = 1/200.

Now doing the integration gives

(1/200) log( P/(200 − P) ) = kt + C.

Straightforward algebra gives

P(t) = 200De^(200kt) / (1 + De^(200kt)),

and applying the initial condition P(0) = 100 gives

100 = 200D/(1 + D)   ⇒   D = 1.


Since the initial growth rate is 1 (million per year) with population 100 million, we can find the growth constant k from

1 = k · 100(200 − 100)   ⇒   k = 10^(−4)   ⇒   P(t) = 200/(1 + e^(−2·10^(−2) t)).

To predict the population in the year 2000 we evaluate at t = 60:

P(60) = 200/(1 + e^(−6/5)) ≈ 153.7 million,

which completes the problem.
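As a quick numerical sanity check (not required by the problem, Python assumed), the prediction can be evaluated directly:

    import math
    P = lambda t: 200 / (1 + math.exp(-0.02 * t))   # P(t) from above, t in years after 1940
    print(P(60))                                     # approximately 153.7 (million)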

EP2.1.30 A tumor may be regarded as a population of multiplying cells. It is found empirically that the "birth rate" of the cells in a tumor decreases exponentially with time, so that β(t) = β_0 e^(−αt) (where α and β_0 are positive constants), and hence

dP/dt = β_0 e^(−αt) P,   P(0) = P_0.

Solve this IVP for

P(t) = P_0 exp( (β_0/α)(1 − e^(−αt)) ).

Observe that P(t) approaches the finite limiting population P_0 e^(β_0/α).

Solution. We start by separating the variables:

dP/P = β_0 e^(−αt) dt   ⇒   log P = −(β_0/α) e^(−αt) + C   ⇒   P(t) = exp( −(β_0/α) e^(−αt) + C ).

We can apply the initial condition at the middle step in the line above:

C = β_0/α + log P_0.

Now inserting this at the end,

P(t) = exp( −(β_0/α) e^(−αt) + β_0/α + log P_0 ) = P_0 exp( (β_0/α)(1 − e^(−αt)) ),

which verifies the solution. As t → ∞ we have 1 − e^(−αt) → 1, so the solution tends to P_0 e^(β_0/α).

EP2.2.10 For the autonomous ODE (x′ = f(x))

dx/dt = 7x − x² − 10,

find the critical points. Use the sign (positive or negative) of f(x) to determine each critical point's stability and construct a phase diagram for the ODE. Then solve the ODE explicitly for x(t). Finally, use either the exact solution or a computer-generated slope field to sketch the typical solution curves, and verify visually the stability of each critical point.

Solution. To find the critical points we solve the quadratic equation:

0 = 7x − x² − 10   ⇒   0 = −(x − 2)(x − 5)   ⇒   x = 2, 5.

Now for the stability analysis.


(1) In the region (−∞, 2) the derivative x′ is negative (evaluate at x = 0 to see this), so the solution x(t) is decreasing.

(2) In the region (2, 5) the derivative x′ is positive (x = 3: x′ = −(1)(−2) = 2 > 0), so the solution x is increasing.

(3) In the region (5, ∞) the derivative x′ is negative (x = 6: x′ = −(4)(1) = −4 < 0), so the solution is decreasing.

Using this or the phase diagram it is clear that x = 2 is unstable and x = 5 is stable.

For the exact solution we separate variables and use partial fractions:

dx/((x − 2)(x − 5)) = −dt,

1/((x − 2)(x − 5)) = A/(x − 2) + B/(x − 5),   1 = A(x − 5) + B(x − 2),
x = 2:  1 = −3A  ⇒  A = −1/3,
x = 5:  1 = 3B   ⇒  B = 1/3.

Now we can integrate to find

(1/3) log( (x − 5)/(x − 2) ) = −t + C,

and some algebra leads to

x(t) = ( 5 − 2De^(−3t) ) / ( 1 − De^(−3t) ).

For the graphing portion I would use DFIELD.

EP2.2.20 The ODE x′ = x(x − 5)/100 + s models a population with stocking rate s. Determine the dependence of the number of critical points c on the parameter s, and then construct the corresponding bifurcation diagram in the sc-plane.

Solution. The number of critical points is given by the number of solutions to the equation

0 = c(c − 5)/100 + s.

This is a quadratic equation:

0 = (1/100)c² − (5/100)c + (100/100)s   ⇒   0 = c² − 5c + 100s.

Now we use the quadratic formula:

c = ( 5 ± √(25 − 4(1)(100s)) )/2 = (5/2)( 1 ± √(1 − 16s) ).

There are three regions of solution to this equation.

(1) The formula gives no (real) solutions if 1 − 16s < 0, which is equivalent to s > 1/16.


(2) There is exactly one solution if 1 − 16s = 0, which occurs for s = 1/16.

(3) There are two solutions if 1 − 16s > 0, which is equivalent to s < 1/16.

We can graph this in the sc-plane by graphing s = −c(c − 5)/100. The graph is provided.
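A bifurcation diagram of this kind can also be reproduced with a few lines of Python; this is a hypothetical plotting sketch (matplotlib assumed), not the original figure.

    import numpy as np
    import matplotlib.pyplot as plt

    c = np.linspace(-1, 6, 400)
    s = -c * (c - 5) / 100            # the curve of critical points in the sc-plane
    plt.plot(s, c)
    plt.axvline(1 / 16, ls='--')      # fold at s = 1/16 (one critical point, c = 5/2)
    plt.xlabel('s (stocking rate)'); plt.ylabel('c (critical point)')
    plt.show()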

EP2.2.24 Separate the variables in the logistic harvesting equation x′ = k(N − x)(x − H) and then use partial fractions to derive the solution

x(t) = [ N(x_0 − H) − H(x_0 − N)e^(−k(N−H)t) ] / [ (x_0 − H) − (x_0 − N)e^(−k(N−H)t) ].

Solution. Separating the variables we get

dx/((N − x)(x − H)) = k dt.

The partial fraction decomposition goes like this:

1/((N − x)(x − H)) = A/(N − x) + B/(x − H)   ⇒   1 = A(x − H) + B(N − x),
x = N:  1 = A(N − H)  ⇒  A = 1/(N − H),
x = H:  1 = B(N − H)  ⇒  B = 1/(N − H).

The integration yields

(1/(N − H)) ( log(x − H) − log(N − x) ) = kt + C,

and the resulting algebra gives

x(t) = ( DN + He^(−k(N−H)t) ) / ( D + e^(−k(N−H)t) ).

Apply the initial condition x(0) = x_0 and do the algebra to find

D = (H − x_0)/(x_0 − N) = −(x_0 − H)/(x_0 − N).


Plug this into our general solution, then multiply numerator and denominator by −(x_0 − N) to arrive at

x(t) = [ N(x_0 − H) − H(x_0 − N)e^(−k(N−H)t) ] / [ (x_0 − H) − (x_0 − N)e^(−k(N−H)t) ].

And we’re done.

EP2.3.12 It is proposed to dispose of nuclear wastes (in drums with weight W = 640 lb and volume 8 ft³) by dropping them into the ocean (v_0 = 0). The force equation for a drum falling through water is

m dv/dt = −W + B + F_R,

where the buoyant force B is equal to the weight (at 62.5 lb/ft³) of the volume of water displaced by the drum (Archimedes' principle) and F_R is the force of water resistance, found empirically to be 1 lb for each foot per second of the velocity of the drum. If the drums are likely to burst upon an impact of more than 75 ft/s, what is the maximum depth they can be dropped in the ocean without likelihood of bursting?

Solution. Note that the relation between mass and weight is W = mg, so m = W/g = 640/32 = 20 (slugs). Plugging in the relevant constants, the law of motion is

20 dv/dt = −640 + 62.5 · 8 − v   ⇒   dv/dt = −( 7 + v/20 ).

This is solved by separation of variables:

dv/(7 + v/20) = −dt   ⇒   20 log( 7 + v/20 ) = −t + C   ⇒   v(t) = De^(−t/20) − 140.

The initial condition gives D = 140. The drum will burst if it reaches a downward velocity of 75 ft/s, so

−75 = 140( e^(−t/20) − 1 )   ⇒   t = −20 log(65/140)

is the time when the drum would reach that velocity. The depth at that time is given by integrating the velocity:

x( −20 log(65/140) ) = ∫_0^(−20 log(65/140)) v(t) dt
                     = 140 ∫_0^(−20 log(65/140)) ( e^(−t/20) − 1 ) dt
                     = 140 [ −20e^(−t/20) − t ]_0^(−20 log(65/140))
                     ≈ −648.

Provided we drop the drums to a depth of less than 648 feet, they should not burst on impact.

EP2.3.20 An arrow is shot straight upward from the ground with an initial velocity of 160 ft/s. It experiences both the deceleration of gravity and a deceleration v²/800 due to air resistance. How high in the air does it go?


Solution. The ODE for the law of motion is

dv/dt = −g − v²/800.

We can change this into an ODE in which the independent variable is x, since we want to find the max height. We use dv/dt = (dv/dx)(dx/dt) = v dv/dx. We can then solve using separation of variables:

v dv/(g + v²/800) = −dx   ⇒   400 log( g + v²/800 ) = −x + C   ⇒   g + v²/800 = De^(−x/400).

Applying the initial condition v(0) = 160 and g = 32, we get that D = 64. The maximum height is reached when the velocity is 0. Applying this to the above equation results in

32 = 64e^(−x/400)   ⇒   x = −400 log(0.5) ≈ 277.26 ft

as the maximum height.
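The same maximum height can be recovered numerically from v dv/dx = −(g + v²/800) by integrating v/(g + v²/800) over 0 ≤ v ≤ 160; the snippet below (an illustrative check, scipy assumed) agrees with 400 log 2.

    import numpy as np
    from scipy.integrate import quad

    g = 32.0
    x_max, _ = quad(lambda v: v / (g + v**2 / 800), 0, 160)
    print(x_max, 400 * np.log(2))   # both are about 277.26 ft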


Solutions to Midterm 2 Practice Problems

Evan Dummit

April 13, 2012

1.

2. For the matrices A = [1 3; 1 2] and B = [1 3 −1; 2 1 1], either compute, or explain why it does not exist:

(a) A² = [1 3; 1 2]·[1 3; 1 2] = [4 9; 3 7].

(b) A⁻¹ = [−2 3; 1 −1], either by the formula or by a row-reduction.

(c) det(A) = −1.

(d) B² does not exist (B is not square).

(e) B⁻¹ does not exist (B is not square).

(f) det(B) does not exist (B is not square).

(g) AB = [1 3; 1 2]·[1 3 −1; 2 1 1] = [7 6 2; 5 5 1].

(h) BA does not exist (the dimensions don't work: B is 2×3 and A is 2×2).

(i) det(A¹⁰⁰) = det(A)¹⁰⁰ = (−1)¹⁰⁰ = 1.
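All of this arithmetic is easy to double-check with numpy; the snippet below is purely illustrative and not part of the original solutions.

    import numpy as np

    A = np.array([[1, 3], [1, 2]])
    B = np.array([[1, 3, -1], [2, 1, 1]])

    print(A @ A)                     # [[4 9] [3 7]]
    print(np.linalg.inv(A))          # [[-2.  3.] [ 1. -1.]]
    print(np.linalg.det(A))          # -1.0 (up to rounding)
    print(A @ B)                     # [[7 6 2] [5 5 1]]
    print(np.linalg.det(A) ** 100)   # 1.0, consistent with det(A^100)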

3. Consider the set S of matrices of the form [a b; −b a], where a and b are real numbers.

(a) Show that S is a subspace of the vector space V of 2×2 matrices with real number entries.

• We check the subspace conditions:
  i. S contains the zero matrix (take a = b = 0).
  ii. S is closed under addition: [a b; −b a] + [c d; −d c] = [a+c, b+d; −(b+d), a+c], which is of the same form.
  iii. S is closed under scalar multiplication: r·[a b; −b a] = [ra rb; −rb ra], which is of the same form.

(b) Find a basis for S.

• We can write [a b; −b a] = a[1 0; 0 1] + b[0 1; −1 0], so one basis is [1 0; 0 1] and [0 1; −1 0].

(c) Show that the (matrix) product of two elements in S is also in S.

• We have [a b; −b a]·[c d; −d c] = [ac−bd, ad+bc; −(ad+bc), ac−bd], so the product is also of this form.

4.


(a) Find a basis for the solution space to the system of equations

u − v − 2w − x − 2y − z = 0
w − 3x − y − z = 0
x + y − z = 0

• In matrix form this is

[ 1 −1 −2 −1 −2 −1 | 0 ]
[ 0  0  1 −3 −1 −1 | 0 ]
[ 0  0  0  1  1 −1 | 0 ]

• We see that the pivotal columns (columns containing a leading row term) are the first, third, and fourth columns. Hence the nonpivotal columns are the second, fifth, and sixth columns, so the corresponding variables v, y, and z are the free variables.

• So if we set v = t1, y = t2, and z = t3, then we can write all variables in terms of the arbitrary parameters t1, t2, and t3:

• The last equation gives x = −y + z, so x = −t2 + t3.

• The middle equation gives w = 3x + y + z, so w = 3(−t2 + t3) + t2 + t3 = −2t2 + 4t3.

• The first equation gives u = v + 2w + x + 2y + z, so u = t1 + 2(−2t2 + 4t3) + (−t2 + t3) + 2t2 + t3 = t1 − 3t2 + 10t3.

• So our solution vector 〈u, v, w, x, y, z〉 is 〈t1 − 3t2 + 10t3, t1, −2t2 + 4t3, −t2 + t3, t2, t3〉.

• To find the basis we split the vector apart: 〈t1 − 3t2 + 10t3, t1, −2t2 + 4t3, −t2 + t3, t2, t3〉 = t1·〈1, 1, 0, 0, 0, 0〉 + t2·〈−3, 0, −2, −1, 1, 0〉 + t3·〈10, 0, 4, 1, 0, 1〉.

• Then the basis is 〈1, 1, 0, 0, 0, 0〉, 〈−3, 0, −2, −1, 1, 0〉, 〈10, 0, 4, 1, 0, 1〉. (A quick numerical check appears after this list.)

(b) Show that, for any a, b, and c, there exists a solution to

u − v − 2w − x − 2y − z = a
w − 3x − y − z = b
x + y − z = c

[Hint: There is a fairly easy solution in terms of a, b, c.]

• The idea is to set all of the free variables equal to zero, and then solve for the non-free variables.

• Doing this eventually leads us to the solution v = y = z = 0, x = c, w = 3c + b, u = 7c + 2b + a, or, as a vector, 〈u, v, w, x, y, z〉 = 〈7c + 2b + a, 0, 3c + b, c, 0, 0〉.

• If we were interested in the general solution, we would add this particular solution to the general solution of the homogeneous system, which we found in the previous part.

5.

(a) Are the vectors 〈−2, −1, 0〉, 〈−1, 0, 1〉, and 〈0, 1, 2〉 linearly independent? If so, explain why; if not, find an explicit dependence between them.

• No, they are not linearly independent.

• If we had a dependence a·〈−2, −1, 0〉 + b·〈−1, 0, 1〉 + c·〈0, 1, 2〉 = 〈0, 0, 0〉 then we would need −2a − b = 0, −a + c = 0, and b + 2c = 0. The first equation gives b = −2a and the second gives c = a, and then the third reduces to 0 = 0.

• Thus there is a nontrivial dependence: for example, 1·〈−2, −1, 0〉 − 2·〈−1, 0, 1〉 + 1·〈0, 1, 2〉 = 〈0, 0, 0〉.

• Another way (which is really the same way) is to compute the determinant of the matrix

[ −2 −1 0 ]
[ −1  0 1 ]
[  0  1 2 ]

and see that it is zero. The coefficients in the dependence are then a nonzero solution to

[ −2 −1 0 ]   [ a ]   [ 0 ]
[ −1  0 1 ] · [ b ] = [ 0 ]
[  0  1 2 ]   [ c ]   [ 0 ].


(b) Are the vectors 〈1, 0, −1, −1〉, 〈0, 1, −1, 1〉, and 〈1, 2, 0, 2〉 linearly independent? If so, explain why; if not, find an explicit dependence between them.

• Yes, they are linearly independent.

• If we had a dependence a·〈1, 0, −1, −1〉 + b·〈0, 1, −1, 1〉 + c·〈1, 2, 0, 2〉 = 〈0, 0, 0, 0〉 then we would need a + c = 0, b + 2c = 0, −a − b = 0, and −a + b + 2c = 0. The first equation gives c = −a and the third gives b = −a, and then the second gives −a − 2a = 0, so that a = b = c = 0.

• So the only way for a·〈1, 0, −1, −1〉 + b·〈0, 1, −1, 1〉 + c·〈1, 2, 0, 2〉 = 〈0, 0, 0, 0〉 to be true is with a = b = c = 0, which means the vectors are linearly independent.

6. Suppose that the two vectors ~v1 and ~v2 are linearly dependent. Show that one of the vectors is a scalar multiple (possibly by zero) of the other.

• If the vectors are dependent then we have a relation a~v1 + b~v2 = ~0 with at least one of a and b not equal to zero.

• If a is not zero, then we can solve the dependence for ~v1:  a~v1 = −b~v2, so ~v1 = (−b/a)~v2.

• If b is not zero, then we can solve the dependence for ~v2:  b~v2 = −a~v1, so ~v2 = (−a/b)~v1.

7. Suppose that the three vectors ~v1, ~v2, and ~v3 are a basis for a vector space V.

(a) Show that the three vectors ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are linearly independent.

• Suppose we had a dependence relation a1(~v1) + a2(~v1 + ~v2) + a3(~v1 + ~v2 + ~v3) = ~0. We want to show that the only possibility is a1 = a2 = a3 = 0.

• By the distributive laws of scalar multiplication, the equation above is the same as a1~v1 + a2~v1 + a2~v2 + a3~v1 + a3~v2 + a3~v3 = ~0, which we can regroup and write as (a1 + a2 + a3)~v1 + (a2 + a3)~v2 + (a3)~v3 = ~0.

• But we know that ~v1, ~v2, and ~v3 are linearly independent (because they are a basis). So each of the coefficients a1 + a2 + a3, a2 + a3, and a3 must be zero.

• Since a3 = 0 and a2 + a3 = 0, we must have a2 = 0. And then since a1 + a2 + a3 = 0 and a2 = a3 = 0 we get a1 = 0 as well.

• So we conclude that in fact a1 = a2 = a3 = 0 is the only way to satisfy the equation a1(~v1) + a2(~v1 + ~v2) + a3(~v1 + ~v2 + ~v3) = ~0.

• But this means precisely that ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are linearly independent.

(b) Show that the three vectors ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 span V.

• Suppose that w is a vector in V: we want to write w as a linear combination of ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3, say w = c1~v1 + c2(~v1 + ~v2) + c3(~v1 + ~v2 + ~v3).

• Since ~v1, ~v2, and ~v3 span V (because they are a basis), we know that w = a1~v1 + a2~v2 + a3~v3 for some scalars a1, a2, and a3.

• Setting the two expressions equal gives a1~v1 + a2~v2 + a3~v3 = c1~v1 + c2(~v1 + ~v2) + c3(~v1 + ~v2 + ~v3). Our goal is to find values for c1, c2, and c3.

• The most natural thing to do is compare coefficients of ~v1, ~v2, and ~v3 on both sides. Doing this (after expanding with the distributive laws as in the previous part) yields a1 = c1 + c2 + c3, a2 = c2 + c3, and a3 = c3.

• Substituting gives c3 = a3, c2 = a2 − a3, and c1 = a1 − a2.

• So w = (a1 − a2)~v1 + (a2 − a3)(~v1 + ~v2) + a3(~v1 + ~v2 + ~v3).

• We have written a general vector w in V as a linear combination of ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3, which shows that ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 span V.

(c) Show that the three vectors ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are also a basis for V.

• By the previous two parts, ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are a linearly independent spanning set for V, which means they are a basis for V.

8. Consider the function T from R⁴ to R³ which sends the vector 〈w, x, y, z〉 to the vector 〈w + x, y + z, w + x + y + z〉.


(a) Find ker(T): all vectors 〈w, x, y, z〉 in R⁴ which are sent to the zero vector 〈0, 0, 0〉 by T.

• The vectors sent to 〈0, 0, 0〉 are those with w + x = 0, y + z = 0, and w + x + y + z = 0. We can then see that the vectors sent to 〈0, 0, 0〉 are those 〈w, x, y, z〉 of the form 〈s, −s, t, −t〉 for any values of s and t.

(b) Find im(T): all vectors 〈a, b, c〉 in R³ for which there exists a vector 〈w, x, y, z〉 with T(w, x, y, z) = 〈a, b, c〉.

• We have 〈w + x, y + z, w + x + y + z〉 = 〈a, b, c〉, which means that a, b, c must satisfy the relation c = a + b. Then it is clear that a and b can be anything, and so the image of T is the set of vectors of the form 〈s, t, s + t〉 for any values of s and t.


Quiz #2 Solutions, Math 320, Sections 352/353/354/355, February 14/16.

• Find the requested solutions to the given differential equations (implicit solutions are okay):

◦ The function ♥(x) such that x♥′ + 3♥ = 3 and ♥(1) = 2.

∗ Solution 1 (Linear): Putting the equation in the form

♥′ + (3/x)♥ = 3/x

shows that it is linear. The integrating factor is then I(x) = e^(∫(3/x) dx) = e^(3 ln(x)) = e^(ln(x³)) = x³. Multiplying by it and then integrating both sides gives

∫ (x³♥′ + 3x²♥) dx = ∫ 3x² dx.

Evaluating the integrals yields

x³♥ = x³ + C,

so that ♥ = 1 + Cx⁻³. Plugging in the initial condition and solving yields C = 1, so the answer is ♥(x) = 1 + x⁻³.

∗ Solution 2 (Separable): Rearranging the equation as ♥′ = (3 − 3♥)/x shows that it is separable. Separating it yields

♥′/(3 − 3♥) = 1/x,

and integrating both sides gives

∫ 1/(3 − 3♥) d♥ = ∫ (1/x) dx.

Evaluating both sides yields −(1/3) ln(3 − 3♥) = ln(x) + C, so that ln(3 − 3♥) = −3 ln(x) + C = ln(x⁻³) + C. Exponentiating and rearranging then shows ♥(x) = 1 + Cx⁻³, and plugging in the initial condition gives C = 1, so ♥(x) = 1 + x⁻³.

◦ The function y(x) with y + y′ = 8e^(2x)/y and y(0) = 1.

∗ Solution (Bernoulli): Rewrite in the form y + y′ = 8e^(2x)·y⁻¹ to see the equation is of Bernoulli type, with n = −1. Set v = y², so that v′ = 2yy′. Multiplying both sides of the original equation by 2y gives

2yy′ + 2y² = 16e^(2x),

and rewriting in terms of v gives the first-order linear equation

v′ + 2v = 16e^(2x).

The integrating factor is I(x) = e^(∫2 dx) = e^(2x), so scaling by it and then integrating yields

∫ (e^(2x)v′ + 2e^(2x)v) dx = ∫ 16e^(4x) dx
e^(2x)v = 4e^(4x) + C,

so that v = 4e^(2x) + Ce^(−2x). Since y(0) = 1, v(0) = 1² = 1. Therefore, we see C = −3; then solving for y yields y(x) = √(4e^(2x) − 3e^(−2x)).


◦ All functions P(t) satisfying e^t · dP/dt = e^P.

∗ Solution (Separable): The equation is separable. Rearranging and then integrating gives

∫ e^(−P) dP = ∫ e^(−t) dt,

and evaluating gives

−e^(−P) = −e^(−t) + C.

This is a correct (implicit) answer, although we can solve for P to obtain P = −ln( e^(−t) + C ).

◦ All functions y(x) satisfying y′ = −(4x³y − y²)/(x⁴ − 2xy).

∗ Solution (Exact): Clearing the denominators and rewriting gives the equation as (4x³y − y²) + (x⁴ − 2xy)y′ = 0. We then have M = 4x³y − y² and N = x⁴ − 2xy, so that

M_y = 4x³ − 2y,
N_x = 4x³ − 2y.

These are equal, so the equation is exact. We want F_x = M, so F = ∫ M dx = x⁴y − xy² + g(y) for some function g(y). Differentiating gives N = F_y = x⁴ − 2xy + g′(y), so we can take g(y) = 0. Therefore, the general solution is F(x, y) = x⁴y − xy² = C.

◦ The function w(x) with w′ = (x² + w²)/(xw) and w(1) = 4.

∗ Solution 1 (Homogeneous): Dividing the fraction gives

w′ = x/w + w/x,

which is a homogeneous equation. Setting v = w/x, so that w′ = v + xv′, transforms the equation into

v + xv′ = v + 1/v,

so that xv′ = 1/v. Separating and integrating yields

∫ v dv = ∫ (1/x) dx,

so that (1/2)v² = ln(x) + C, or v = ±√(2 ln(x) + C), so w = ±x√(2 ln(x) + C). Finally, plugging in the initial condition shows we want C = 16 and the plus sign, so w = x√(2 ln(x) + 16).

∗ Solution 2 (Bernoulli): Dividing the fraction and rearranging gives

w′ − (1/x)w = xw⁻¹,

which is Bernoulli with n = −1. Setting v = w² gives v′ = 2ww′, so multiplying the original equation by 2w yields

2ww′ − (2/x)w² = 2x,

or, in terms of v,

v′ − (2/x)v = 2x.

The integrating factor is I(x) = e^(∫(−2/x) dx) = e^(−2 ln(x)) = e^(ln(x⁻²)) = x⁻², so scaling and integrating yields

∫ (x⁻²v′ − 2x⁻³v) dx = ∫ 2x⁻¹ dx.

Evaluating gives

x⁻²v = 2 ln(x) + C,

so that v = x²(2 ln(x) + C) and w = ±x√(2 ln(x) + C). As before the initial condition yields the answer as w = x√(2 ln(x) + 16).

∗ Solution 3 (Almost-Exact): Clearing the fraction and rearranging gives

(−x² − w²) + (xw)w′ = 0,

so that M = −x² − w² and N = xw. Thus we have

M_w = −2w,
N_x = w,

so the equation is not exact. However, we can observe that (M_w − N_x)/N = −3w/(xw) = −3/x, which is a function of x only. Hence we have an integrating factor of I(x) = e^(∫(−3/x) dx) = e^(−3 ln(x)) = e^(ln(x⁻³)) = x⁻³, and scaling by it gives

(−x⁻¹ − x⁻³w²) + (x⁻²w)w′ = 0.

We want F_x = −x⁻¹ − x⁻³w², so F = ∫ (−x⁻¹ − x⁻³w²) dx = −ln(x) + (1/2)x⁻²w² + g(w) for some function g(w). Differentiating gives F_w = x⁻²w + g′(w), so we can take g(w) = 0. Then the general solution is −ln(x) + (1/2)x⁻²w² = C. Plugging in the initial condition shows C = 8, so the solution is given by

−ln(x) + (1/2)x⁻²w² = 8.

(Note that we can solve for w, and would recover the same answer as in the other solutions.)

◦ The function y(x) satisfying (2e^x y + y⁴) + (e^x + 4y³)y′ = 0 which has y(0) = 1.

∗ Solution (Almost-Exact): We have M = 2e^x y + y⁴ and N = e^x + 4y³, so

M_y = 2e^x + 4y³,
N_x = e^x,

so the equation is not exact. However, we can observe that (M_y − N_x)/N = (e^x + 4y³)/(e^x + 4y³) = 1, which is a function of x only. Hence we have an integrating factor of I(x) = e^(∫1 dx) = e^x, and scaling by it gives

(2e^(2x)y + e^x y⁴) + (e^(2x) + 4e^x y³)y′ = 0.

We want F_x = 2e^(2x)y + e^x y⁴, so F = ∫ (2e^(2x)y + e^x y⁴) dx = e^(2x)y + e^x y⁴ + g(y) for some function g(y). Differentiating gives F_y = e^(2x) + 4e^x y³ + g′(y), so we can take g(y) = 0. Thus the general solution is e^(2x)y + e^x y⁴ = C. Now plugging in the initial condition shows C = 2, so the solution is given by

e^(2x)y + e^x y⁴ = 2.


Quiz #3 Solutions, Math 320, Sections 352/353/354/355, March 20/22.

• (5 × 2.5) Decide whether the following statements are true or false, and circle T or F respectively. If you wish to receive possible partial credit in case of a wrong answer, you may explain your answer (and work out computations, if necessary) in the space below the problem.

T F  If A = [1 2; 0 1] and B = [−1 1; 4 1], then AB = [7 3; 4 2].

∗ False. The correct product is AB = [7 3; 4 1].

T F  The inverse of the matrix [3 4; 1 1] is [−1 4; 1 −3].

∗ True. To check, multiply [3 4; 1 1]·[−1 4; 1 −3] to see that the product is [1 0; 0 1].

T F  If A, B, C are 2×4, 2×3, and 3×2 matrices respectively, then the product CAB is defined.

∗ False. The product AB is not defined (the middle dimensions do not match), so CAB is not defined either.

T F  The matrix

[ 1 0 −1 π 1 ]
[ 0 1  0 0 0 ]
[ 0 0  0 1 1 ]
[ 0 0  0 0 0 ]

is in row-echelon form but not reduced row-echelon form.

∗ True. The matrix has the proper staircase shape for echelon form, but is not reduced because of the π.

T F  The matrix

[ 2  1  0 ]
[ 1  0 −1 ]
[ 0 −1 −2 ]

is invertible.

∗ False. This matrix has determinant zero, and therefore is not invertible.


• (6 × 2.5) Decide whether the following statements are true or false, and circle T or F respectively. If you wish to receive possible partial credit in case of a wrong answer, you may explain your answer (and work out computations, if necessary) in the space below the problem.

T F  If A and B are square matrices of the same size, (A + B)² = A² + 2AB + B².

∗ False. The correct statement is (A + B)² = A² + AB + BA + B²; this is not equal to A² + 2AB + B² because matrix multiplication is not necessarily commutative.

T F  If A and B are invertible matrices of the same size, then (A + B)⁻¹ = A⁻¹ + B⁻¹.

∗ False. If A and B are invertible, then A + B does not need to be invertible (example: A = I, B = −I). Even if A + B is invertible, its inverse is generally not A⁻¹ + B⁻¹ (example: A = B = I).

T F  If A is an n×n matrix and ~b is an n×1 matrix such that there exist two different n×1 matrices ~x1 and ~x2 such that A·~x1 = ~b and A·~x2 = ~b, then det(A) = 0.

∗ True. If det(A) were nonzero, then A⁻¹ would exist. Then we could multiply A·~x1 = ~b = A·~x2 on the left by A⁻¹ to see ~x1 = A⁻¹·~b = ~x2, which can't happen because ~x1 and ~x2 are different.

∗ Another way to phrase the problem statement is: if a system of n linear equations in n variables has two different solutions, then the coefficient matrix has determinant zero. (Written this way, the statement might seem more familiar.)

T F  If A = [1 2; 0 1], B = [−1 1; 1 1], and C = [3 2; −1 1], then det(CBABA²CA) = 100.

∗ True. The determinant is multiplicative, so

det(CBABA²CA) = det(C) det(B) det(A) det(B) det(A)² det(C) det(A) = det(A)⁴ det(B)² det(C)².

Since det(A) = 1, det(B) = −2, and det(C) = 5, we see that det(CBABA²CA) = (1)⁴(−2)²(5)² = 100.

∗ The wrong way to do this problem is to actually perform the seven matrix multiplications to find CBABA²CA = [−6 14; −8 2], and then take the determinant. (This does work, but it is a massive waste of time.)

T F  If A and B are matrices such that AB is invertible, then BA is also invertible.

∗ False. Here is a counter-example: A = [1 2], B = [−1; 1]. Then AB = [1] is invertible, but BA = [−1 −2; 1 2] is not invertible.

∗ The statement is true if the matrices A and B are square and of the same size: in that case, we can take determinants to see that det(AB) = det(A)·det(B) = det(B)·det(A) = det(BA).

T F  Matrices are interesting.

∗ True. This is self-evidently a true statement.


Quiz #4 Solutions, Math 320, Sections 352/353/354/355, March 20/22.

• (2,2,2) Suppose that ~v1, · · · , ~vn are vectors in a vector space V.

◦ Define, in one sentence, the span of ~v1, · · · , ~vn.

∗ The span is the subspace W of all linear combinations of the vectors ~v1, · · · , ~vn.
∗ More explicitly, the span is the collection of vectors w with w = a1~v1 + · · · + an~vn for some scalars a1, · · · , an.
∗ Equivalently, the span is the smallest subspace of V which contains each of the vectors ~v1, · · · , ~vn.

◦ Define, in one sentence, what it means for ~v1, · · · , ~vn to be linearly independent.

∗ The vectors ~v1, · · · , ~vn are linearly independent when the only scalars for which a1~v1 + · · · + an~vn = ~0 are a1 = · · · = an = 0.
∗ Equivalently, the vectors are linearly independent if none of them can be written as a linear combination of the others.

◦ Define, in one sentence, what it means for ~v1, · · · , ~vn to be a basis of V.

∗ A basis is a linearly independent spanning set: so, to be a basis, the vectors ~v1, · · · , ~vn must be linearly independent and span V.
∗ Equivalently, ~v1, · · · , ~vn is a basis if every vector w in V can be written as a unique linear combination w = a1~v1 + · · · + an~vn.

• (2,2,2) For each of the following pairs (V, S), determine whether S is a subspace of V, and briefly explain why or why not.

Note: The goal of all three parts of this problem was to check the three pieces of the subspace criterion: (i) that the zero vector is in S, (ii) that the sum of two vectors in S is also in S, and (iii) that any scalar multiple of a vector in S is also in S. If any of these conditions fail, then S is not a subspace.

◦ V = R³, S is all vectors whose sum of coordinates is 3.

∗ In this case, S is the set of vectors 〈x, y, z〉 where x + y + z = 3.
∗ We go through the parts of the subspace criterion:
  1. The zero vector 〈0, 0, 0〉 is not in S.
  2. S is not closed under addition: for example, 〈3, 0, 0〉 and 〈0, 3, 0〉 are both in S, but their sum 〈3, 3, 0〉 is not.
  3. S is not closed under scalar multiplication: for example, 〈1, 1, 1〉 is in S, but 2·〈1, 1, 1〉 = 〈2, 2, 2〉 is not.
∗ Therefore, this S is not a subspace of V: none of the three parts of the subspace criterion is satisfied.

◦ V = R²⁰¹², S is all vectors which have at least 2010 zeroes in them.

∗ We go through the parts of the subspace criterion:
  1. The zero vector 〈0, 0, . . . , 0〉 is in S, since it has 2012 zeroes in it.
  2. S is not closed under addition: for example, if v1 = 〈1, 1, 0, 0, · · · , 0〉 is the vector with first two entries equal to 1 and v2 = 〈0, 0, · · · , 0, 0, 1, 1〉 is the vector with last two entries equal to 1, then their sum v1 + v2 = 〈1, 1, 0, 0, · · · , 0, 0, 1, 1〉 only has 2008 zeroes, not 2010.
  3. S is closed under scalar multiplication, since scaling a vector will not reduce the number of zero entries it has.
∗ Therefore, this S is not a subspace of V, because S is not closed under addition.

◦ V is all polynomials in the variable x, S is all polynomials p(x) such that p(0) = 0.

∗ Some examples of elements in S are x, x², x¹⁰ − 3x², and 0: when x = 0, each of these polynomials is zero.
∗ We go through the parts of the subspace criterion:
  1. The zero vector here is the zero polynomial, and it is in S.
  2. S is closed under addition: if r(x) = p(x) + q(x) and p(0) = q(0) = 0, then r(0) = p(0) + q(0) = 0 + 0 = 0.
  3. S is closed under scalar multiplication: if r(x) = α·p(x) and p(0) = 0, then r(0) = α·0 = 0.
∗ Therefore, this S is a subspace of V, because S satisfies all three of the subspace conditions.

• (5) Find a basis for the set of solutions 〈v, w, x, y, z〉 to the system

v − w + 2x − y − z = 0
w − x − 2y − 3z = 0
y + 2z = 0

◦ The goal is to find all solutions to this homogeneous system, and then extract the basis from the collection of solutions.

◦ The system in matrix form is

[ 1 −1  2 −1 −1 | 0 ]
[ 0  1 −1 −2 −3 | 0 ]
[ 0  0  0  1  2 | 0 ]

This matrix is already in echelon form, so we need only identify the free variables and then find the general solution.

◦ The first, second, and fourth columns are pivotal columns (because they contain leading row terms). Thus the third and fifth columns are nonpivotal columns, and so their corresponding variables (x and z) are the free variables.

◦ So if we set x = s and z = t, then we can substitute one at a time into the equations to write all the variables in terms of our free parameters s and t.

∗ The last equation y + 2z = 0 gives y = −2z = −2t.
∗ The middle equation w − x − 2y − 3z = 0 gives w = x + 2y + 3z = s − 4t + 3t = s − t.
∗ The first equation v − w + 2x − y − z = 0 gives v = w − 2x + y + z = (s − t) − 2s − 2t + t = −s − 2t.

◦ Hence all the solutions are given by 〈v, w, x, y, z〉 = 〈−s − 2t, s − t, s, −2t, t〉.

◦ To extract the basis we just split apart this solution vector: 〈−s − 2t, s − t, s, −2t, t〉 = 〈−s, s, s, 0, 0〉 + 〈−2t, −t, 0, −2t, t〉 = s·〈−1, 1, 1, 0, 0〉 + t·〈−2, −1, 0, −2, 1〉.

◦ We have written every solution to the system as an explicit linear combination of the vectors 〈−1, 1, 1, 0, 0〉 and 〈−2, −1, 0, −2, 1〉: therefore, 〈−1, 1, 1, 0, 0〉 and 〈−2, −1, 0, −2, 1〉 are a basis for the solution space.

• (5) Determine (with justification) whether 〈2, 0, 1〉, 〈1, 1, 1〉, and 〈1, 3, 2〉 form a basis for R³.

◦ In general, a collection of n vectors ~v1, · · · , ~vn in Rⁿ is a basis precisely when the matrix A whose columns are the coordinates of ~v1, · · · , ~vn is invertible.

∗ The reason is: if we wanted to determine whether there was a dependence between our vectors, we would need to look for scalars x1, · · · , xn (not all zero) such that x1~v1 + · · · + xn~vn = ~0. But this is the same as finding a nonzero solution to the matrix equation A·~x = ~0, where ~x = (x1, . . . , xn), and such a solution exists precisely when A is non-invertible.
∗ Because det(A) = det(Aᵀ), one could just as well use the matrix whose rows are the vectors ~v1, · · · , ~vn.

◦ Thus we need to determine whether the matrix

A = [ 2 1 1 ]
    [ 0 1 3 ]
    [ 1 1 2 ]

is invertible. To do this we compute the determinant by expanding down the first column:

det(A) = 2·det[1 3; 1 2] + 1·det[1 1; 1 3] = 2·(−1) + 1·2 = 0.

◦ Since the determinant is zero, A is not invertible, meaning that these vectors are not a basis because they are linearly dependent.


◦ Note: We could also solve the problem just by finding an explicit dependence: 1·〈2, 0, 1〉 − 3·〈1, 1, 1〉 + 1·〈1, 3, 2〉 = 〈0, 0, 0〉.

• (4) Suppose that the vectors ~v1, ~v2, and ~v3 are linearly independent. Show that the vectors ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are also linearly independent.

◦ Suppose we had a dependence relation a1(~v1) + a2(~v1 + ~v2) + a3(~v1 + ~v2 + ~v3) = ~0. We want to show that the only possibility is a1 = a2 = a3 = 0.

◦ By the distributive laws of scalar multiplication, the equation above is the same as a1~v1 + a2~v1 + a2~v2 + a3~v1 + a3~v2 + a3~v3 = ~0, which we can regroup and write as (a1 + a2 + a3)~v1 + (a2 + a3)~v2 + (a3)~v3 = ~0.

◦ But we know that ~v1, ~v2, and ~v3 are linearly independent. So each of the coefficients a1 + a2 + a3, a2 + a3, and a3 must be zero.

◦ Since a3 = 0 and a2 + a3 = 0, we must have a2 = 0. And then since a1 + a2 + a3 = 0 and a2 = a3 = 0 we get a1 = 0 as well.

◦ So we conclude that in fact a1 = a2 = a3 = 0 is the only way to satisfy the equation a1(~v1) + a2(~v1 + ~v2) + a3(~v1 + ~v2 + ~v3) = ~0.

◦ But this means precisely that ~v1, ~v1 + ~v2, and ~v1 + ~v2 + ~v3 are linearly independent.


Quiz #5, Math 320, Sections 352/353/354/355, May 2± 1.

• (2,2,2) Find the general solution to each homogeneous second-order linear differential equation:

◦ y″ + 6y′ + 10y = 0.

∗ The characteristic equation is r² + 6r + 10 = 0, which has roots ( −6 ± √(6² − 4·10) )/2 = −3 ± i by the quadratic formula. The general solution is y = Ae^((−3+i)x) + Be^((−3−i)x), or, using real-valued functions, y = Ae^(−3x) cos(x) + Be^(−3x) sin(x).

◦ y″ + 6y′ + 5y = 0.

∗ The characteristic equation is r² + 6r + 5 = 0, which has roots ( −6 ± √(6² − 4·5) )/2 = −3 ± 2, namely r = −5, −1. The general solution is y = Ae^(−5x) + Be^(−x).

◦ y″ + 6y′ + 9y = 0.

∗ The characteristic equation is r² + 6r + 9 = 0, which has roots ( −6 ± √(6² − 4·9) )/2 = −3 ± 0, namely r = −3, −3. The general solution is y = Ae^(−3x) + Bxe^(−3x).

• (4) Find one solution to the non-homogeneous equation y″ + 4y = x² + e^(3x) + sin(2x).

◦ The homogeneous equation is y″ + 4y = 0, which has general solution y = C1 cos(2x) + C2 sin(2x).

◦ The non-homogeneous part of the original equation is x² + e^(3x) + sin(2x).

∗ We replace all coefficients with variables; then we fill in the missing cosine term, and finally add in the missing lower-degree terms, to obtain a first guess of A2x² + A1x + A0 + Be^(3x) + D sin(2x) + E cos(2x).
∗ There is an overlap (the sine and cosine terms) with the solutions of the homogeneous equation, so we scale the overlapping terms by x.
∗ This gives us the correct guess ypar = A2x² + A1x + A0 + Be^(3x) + Dx sin(2x) + Ex cos(2x).

◦ Now we compute the second derivative

y″par = 2A2 + 9Be^(3x) + D( 4 cos(2x) − 4x sin(2x) ) + E( −4 sin(2x) − 4x cos(2x) )

[note: on the actual quiz, the second derivatives of x sin(2x) and x cos(2x) were given] to obtain

y″par + 4ypar = 4A2x² + 4A1x + (2A2 + 4A0) + 13Be^(3x) + 4D cos(2x) − 4E sin(2x).

◦ Equating y″par + 4ypar with x² + e^(3x) + sin(2x) yields A2 = 1/4, A1 = 0, A0 = −1/8, B = 1/13, D = 0, E = −1/4.

◦ Thus the desired answer is ypar = (1/4)x² − 1/8 + (1/13)e^(3x) − (1/4)x cos(2x).

• (4) Show that the functions 1, x, and e^x are linearly independent functions on the real line. [Hint: W.]

◦ We compute the Wronskian

W(1, x, e^x) = det[ 1 x e^x ; 0 1 e^x ; 0 0 e^x ] = e^x.

(Note the matrix is upper-triangular, so the determinant is just the product of the diagonal entries 1, 1, e^x.)

◦ The Wronskian is nonzero, hence the functions are linearly independent.


• (4) Find a homogeneous linear differential equation which has sin(t) − 1337 cos(t) and t² + t + π⁶⁶⁶ as solutions.

◦ The idea is to think about what factors have to be in the characteristic polynomial of the equation, in order to get these two functions as solutions.

◦ In order to have a function A sin(t) + B cos(t) as a solution to an equation with constant coefficients, the characteristic polynomial p(r) should have a factor r² + 1.

◦ In order to have a function At² + Bt + C as a solution to an equation with constant coefficients, the characteristic polynomial p(r) should have a factor r³.

◦ Thus, in order to get both functions as solutions, the characteristic polynomial should be divisible by both r² + 1 and r³.

◦ The easiest polynomial with this property is just the product, r³(r² + 1) = r⁵ + r³.

◦ The corresponding differential equation is y⁽⁵⁾ + y‴ = 0.

◦ Note: There are many other differential equations which will work (though this one is the simplest). For example, any constant-coefficient equation with characteristic polynomial divisible by r⁵ + r³ will also work. There are also equations with non-constant coefficients which work.

• (4) Find one function yp(x) such that y″ + (1/x)y′ − (1/x²)y = x, given that two solutions to the homogeneous equation y″ + (1/x)y′ − (1/x²)y = 0 are y1 = x and y2 = 1/x.

◦ This is a variation of parameters problem (although it is possible to guess a solution via undetermined coefficients).

◦ We are given two independent solutions y1 = x and y2 = 1/x to the homogeneous equation.

◦ The variation of parameters setup then says to take our particular solution as ypar = v1y1 + v2y2, where

∗ v1′ = W1(x)/W(x) and v2′ = W2(x)/W(x), with

∗ W(x) = det[ x 1/x ; 1 −1/x² ] = (x)·(−1/x²) − (1)·(1/x) = −2/x,

∗ W1(x) = det[ 0 1/x ; x −1/x² ] = 0 − (x)·(1/x) = −1, and

∗ W2(x) = det[ x 0 ; 1 x ] = x².

◦ Thus we get v1′ = −1/(−2/x) = x/2 and v2′ = x²/(−2/x) = −x³/2.

◦ Integrating gives v1 = x²/4 and v2 = −x⁴/8.

◦ Thus we obtain ypar = v1y1 + v2y2 = (x²/4)·(x) + (−x⁴/8)·(1/x) = x³/8.

• (4) Find the eigenvalues of the matrix A = [ −1 −5 ; −1 3 ].

◦ The eigenvalues are the zeroes of the characteristic polynomial p(t) = det(tI − A).

◦ We have tI − A = [ t+1 5 ; 1 t−3 ], so p(t) = det(tI − A) = (t + 1)(t − 3) − 5 = t² − 2t − 8 = (t − 4)(t + 2).

◦ Therefore the eigenvalues are the solutions to (λ − 4)(λ + 2) = 0, i.e., λ = −2 and 4.
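For a quick numerical cross-check (not part of the quiz, numpy assumed), the same eigenvalues come out of a one-liner:

    import numpy as np
    print(np.linalg.eigvals(np.array([[-1, -5], [-1, 3]])))   # -2 and 4, in some order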


Quiz #6, Math 320, Sections 352/353/354/355, April 25±1.

• (1 × 9) Find general solutions for each of the following differential equations or systems:

◦ P′ = (1/100)P(5 − P).

∗ This is a logistic equation. The equation is separable:

∫ dP/(P(5 − P)) = ∫ (1/100) dt
∫ ( (1/5)/P + (1/5)/(5 − P) ) dP = t/100 + C
(1/5) ln(P) − (1/5) ln(5 − P) = t/100 + C

∗ Rewriting gives ln( P/(5 − P) ) = t/20 + C, so by exponentiating we have P/(5 − P) = De^(t/20).

∗ Solving for P gives P = 5/(1 + De^(−t/20)). (Alternatively, one could use the formula.)

◦ y′ = 2xy + x³.

∗ This is first-order linear; write it in the form y′ − 2xy = x³.

∗ The integrating factor is e^(∫(−2x) dx) = e^(−x²).

∗ So we have e^(−x²)y′ − 2xe^(−x²)y = x³e^(−x²).

∗ Integrating both sides (substitute u = −x² and then integrate by parts on the right-hand side) yields e^(−x²)y = −(1/2)(x² + 1)e^(−x²) + C, so that y = −(1/2)(x² + 1) + Ce^(x²).

◦ y′ + 4xy = 8xy².

∗ This is a Bernoulli substitution, with n = 2, so that v = y⁻¹ and v′ = −y⁻²y′.

∗ Scaling by −y⁻² gives −y⁻²y′ − 4xy⁻¹ = −8x, or v′ − 4xv = −8x, which is now first-order linear.

∗ The integrating factor is e^(∫−4x dx) = e^(−2x²).

∗ So we have e^(−2x²)v′ − 4xe^(−2x²)v = −8xe^(−2x²).

∗ Integrating both sides (substitute u = −x² on the right-hand side) yields e^(−2x²)v = 2e^(−2x²) + C, so that v = 2 + Ce^(2x²). Therefore, y = ( 2 + Ce^(2x²) )⁻¹.

◦ (6x + y) + (x + 3y²)y′ = 0.

∗ This is an exact equation: with M = 6x + y and N = x + 3y² we have M_y = 1 = N_x.

∗ So there exists a function f(x, y) with f_x = 6x + y and f_y = x + 3y².

∗ Taking the anti-partial with respect to x gives f = 3x² + xy + g(y) for some g(y). Then f_y = x + g′(y) must equal x + 3y², so g′(y) = 3y², hence we can take g(y) = y³.

∗ Therefore the solutions are 3x² + xy + y³ = C.

◦ y⁗ − 4y‴ + 4y″ = 0.

∗ This is a homogeneous linear equation with constant coefficients.

∗ The characteristic equation is r⁴ − 4r³ + 4r² = 0, which factors as r²(r − 2)² = 0, with roots r = 0, 0, 2, 2.

∗ So the general solution is y = A + Bx + Ce^(2x) + Dxe^(2x).

◦ y″ − 6y′ + 9y = e^x + e^(2x) + e^(3x).

∗ This is a non-homogeneous equation with constant coefficients. We can use undetermined coefficients.

∗ The homogeneous equation is y″ − 6y′ + 9y = 0, which has characteristic equation r² − 6r + 9 = 0 with roots r = 3, 3. So the general homogeneous solution is yhom = C1e^(3x) + C2xe^(3x).

∗ Now we look for a solution to the non-homogeneous equation: the first guess is A1e^x + A2e^(2x) + A3e^(3x), but there is an overlap (namely e^(3x)) with the homogeneous solutions, so we must multiply that term by x² in order to avoid an overlap.

∗ So our trial solution is ypar = A1e^x + A2e^(2x) + A3x²e^(3x). We compute

y′par = A1e^x + 2A2e^(2x) + 2A3xe^(3x) + 3A3x²e^(3x)
y″par = A1e^x + 4A2e^(2x) + 2A3e^(3x) + 12A3xe^(3x) + 9A3x²e^(3x),

so

y″par − 6y′par + 9ypar = 4A1e^x + A2e^(2x) + 2A3e^(3x),

whence A1 = 1/4, A2 = 1, and A3 = 1/2.

∗ Hence the general solution is ygen = ypar + yhom = ( (1/4)e^x + e^(2x) + (1/2)x²e^(3x) ) + C1e^(3x) + C2xe^(3x).

∗ Note: One can also use variation of parameters to solve this problem. It will, of course, lead to the same general solution.

◦ y″ + 4y = tan(x).

∗ This is a non-homogeneous equation with constant coefficients. We cannot use undetermined coefficients because of the tan(x) term, so we use variation of parameters.

∗ The homogeneous equation is y″ + 4y = 0, which has characteristic equation r² + 4 = 0 with roots r = 2i, −2i. The general homogeneous solution is thus C1 cos(2x) + C2 sin(2x).

∗ We then take y1 = cos(2x) and y2 = sin(2x), and want to construct ypar = v1y1 + v2y2, where v1′ = W1(x)/W(x) and v2′ = W2(x)/W(x), with

· W(x) = det[ cos(2x) sin(2x) ; −2 sin(2x) 2 cos(2x) ] = 2 cos²(2x) + 2 sin²(2x) = 2,

· W1(x) = det[ 0 sin(2x) ; tan(x) 2 cos(2x) ] = −tan(x) sin(2x) = −( sin(x)/cos(x) )·2 sin(x) cos(x) = −2 sin²(x),

· W2(x) = det[ cos(2x) 0 ; −2 sin(2x) tan(x) ] = tan(x) cos(2x) = ( sin(x)/cos(x) )·( 2 cos²(x) − 1 ) = sin(2x) − tan(x).

∗ Thus v1′ = −sin²(x) = −(1 − cos(2x))/2, and v2′ = (1/2) sin(2x) − (1/2) tan(x).

∗ Integrating gives v1 = −x/2 + (1/4) sin(2x) and v2 = −(1/4) cos(2x) + (1/2) ln(cos(x)).

∗ Thus we have ypar = v1y1 + v2y2 = [ −x/2 + (1/4) sin(2x) ] cos(2x) + [ −(1/4) cos(2x) + (1/2) ln(cos(x)) ] sin(2x). We can cancel some terms to simplify this to ypar = −(x/2) cos(2x) + (1/2) ln(cos(x)) sin(2x).

∗ Finally, ygen = [ −(x/2) cos(2x) + (1/2) ln(cos(x)) sin(2x) ] + C1 cos(2x) + C2 sin(2x).

◦ The system y′ = 3y + 2z, z′ = y + 2z.

∗ This is a first-order linear system; we use the eigenvalue method. The coefficient matrix is A = [ 3 2 ; 1 2 ].

∗ We have det(tI − A) = det[ t−3 −2 ; −1 t−2 ] = (t − 3)(t − 2) − 2 = t² − 5t + 4 = (t − 1)(t − 4), so the eigenvalues are λ = 1, 4.

· For λ = 1 we solve [ 3 2 ; 1 2 ]·[ a ; b ] = 1·[ a ; b ], so that [ 3a + 2b ; a + 2b ] = [ a ; b ], or b = −a. So a basis for the λ = 1 eigenspace is [ 1 ; −1 ].

· For λ = 4 we solve [ 3 2 ; 1 2 ]·[ a ; b ] = 4·[ a ; b ], so that [ 3a + 2b ; a + 2b ] = [ 4a ; 4b ], or a = 2b. So a basis for the λ = 4 eigenspace is [ 2 ; 1 ].

∗ We have the proper number of linearly independent eigenvectors (namely, 2), so the general solution to the system is

[ y ; z ] = c1 [ 1 ; −1 ] e^t + c2 [ 2 ; 1 ] e^(4t),

or y = c1e^t + 2c2e^(4t), z = −c1e^t + c2e^(4t).

• (1) Given y′ = sin(x + y) and y(1) = 1, find an approximation of y(2).

◦ The idea is to use one of the approximation methods (Euler's method, or one of the improvements to it).

◦ Here is the setup for the standard Euler's method with a step size of h = 0.1 and f(x, y) = sin(x + y). We fill in all the x-values, and then the y, f(x, y), and h·f(x, y) values one column at a time, generating the new y-values using the recursion yn = yn−1 + h·f(xn−1, yn−1).

x         | 1      | 1.1    | 1.2    | 1.3    | 1.4    | 1.5    | 1.6    | 1.7    | 1.8     | 1.9     | 2.0
y         | 1      | 1.0909 | 1.1723 | 1.2419 | 1.2983 | 1.3412 | 1.3708 | 1.3878 | 1.3932  | 1.3880  | 1.3734
f(x, y)   | 0.909  | 0.814  | 0.696  | 0.564  | 0.429  | 0.296  | 0.170  | 0.054  | -0.052  | -0.146  | -
h·f(x, y) | 0.0909 | 0.0814 | 0.0696 | 0.0564 | 0.0429 | 0.0296 | 0.0170 | 0.0054 | -0.0052 | -0.0146 | -

◦ The approximation thus gives y(2) ≈ 1.37.

◦ Note: It is possible to solve this differential equation explicitly for y via a substitution: the result is y = 2 cot⁻¹( 2/(x + C) − 1 ) − x for an appropriate branch of the inverse cotangent, where C = (1 − cot(1))/(1 + cot(1)). This gives the (correct) value y(2) ≈ 1.3375.
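The tabulated Euler computation is the kind of thing that is natural to script; the following Python sketch (illustrative only, not the hand computation the quiz asks for) reproduces the y(2) ≈ 1.37 estimate.

    import math

    def euler(f, x0, y0, x_end, h):
        x, y = x0, y0
        while x < x_end - 1e-12:          # guard against floating-point drift in x
            y += h * f(x, y)
            x += h
        return y

    print(euler(lambda x, y: math.sin(x + y), 1.0, 1.0, 2.0, 0.1))   # about 1.373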

• (1,1) Find all initial conditions y(a) = b for which these differential equations are guaranteed to have unique solutions (in some region containing (a, b)):

◦ y′ = (x − y)^(1/3)

∗ The existence-uniqueness theorem for first-order equations states that the initial value problem y′ = f(x, y) with y(a) = b has exactly one solution (on some interval containing a) if the partial derivative ∂f/∂y is continuous on a rectangle containing (a, b).

∗ In this case, f(x, y) = (x − y)^(1/3), so ∂f/∂y = −(1/3)(x − y)^(−2/3) = −1/( 3(x − y)^(2/3) ). Since (x − y)^(2/3) is defined and continuous for all (x, y), we see that ∂f/∂y is defined and continuous provided the denominator is nonzero: that is, as long as x ≠ y.

∗ Hence the pairs (a, b) for which the IVP is guaranteed to have a unique solution are those (a, b) such that a ≠ b.

∗ Notes: If a ≠ b, this equation can be solved by a substitution, yielding y = ( (2/3)x + C )^(3/2) − x, with C = (2/3)(a + b)^(3/2) − a. If a = b the equation also has the solution y = 0 (among others).

◦ y″ + (1/x)y′ + (1/x²)y = 0

∗ The initial value problem should have two conditions, y(a) = b1 and y′(a) = b2, in order to specify a unique solution. The correct answer to the problem is thus no pairs (a, b), because no initial condition y(a) = b contains enough information to specify a unique solution.


∗ Partial credit was also given for the following answer:

· The existence-uniqueness theorem for general linear equations states that the initial value problem y⁽ⁿ⁾ + Pn(x) y⁽ⁿ⁻¹⁾ + · · · + P2(x) y′ + P1(x) y = Q(x), with y(a) = b1 and y′(a) = b2, has a unique solution (on some interval containing a) if Pn(x), · · · , P1(x) and Q(x) are continuous on an interval containing a.

· Since the coefficient functions are 1, 1/x, 1/x², and 0, they are continuous everywhere except where x = 0.

· Hence the collections (a, b1, b2) for which the IVP is guaranteed to have a unique solution are those (a, b1, b2) such that a ≠ 0.

∗ Note: If a ≠ 0, this equation is of Euler type and can be solved by the substitution v = ln|x|, to obtain y = c1 cos(ln|x|) + c2 sin(ln|x|), for appropriate values of c1 and c2 determined by the initial conditions. If a = 0, then since the coefficients are not defined when x = 0, the IVP cannot have a solution.

• (1) Find the Wronskian of ln(x) and ln(x²), for x > 0. Are the functions linearly independent?

◦ The functions are not independent, because 2 ln(x) = ln(x²).

◦ The Wronskian is

W = det[ ln(x) ln(x²) ; 1/x 2/x ] = (2/x) ln(x) − (1/x) ln(x²) = (1/x)[ 2 ln(x) − ln(x²) ] = (1/x)·0 = 0.

◦ Remark: After finding W, one can plug some easy values of x into W = (1/x)[ 2 ln(x) − ln(x²) ], like x = 1 and x = e, to see that W = 0 for each of those values of x; this was intended to lead to the observation that 2 ln(x) = ln(x²).

• (2) Find all solutions to the system

3x1 + x2 = 5
x1 + x2 + 2x3 = 9
4x1 − 2x2 − 10x3 = −30.

◦ The procedure is to put the augmented matrix in reduced row-echelon form, identify any free variables, and then write down the general solution.

◦ In matrix form we have

[ 3  1   0 |   5 ]              [ 1  1   2 |   9 ]
[ 1  1   2 |   9 ]   R1↔R2  →   [ 3  1   0 |   5 ]
[ 4 −2 −10 | −30 ]              [ 4 −2 −10 | −30 ]

◦ Now clear out the first column:

R2 − 3R1, then R3 − 4R1  →   [ 1  1   2 |   9 ]
                             [ 0 −2  −6 | −22 ]
                             [ 0 −6 −18 | −66 ]

◦ Now clear out the second column, and then rescale the second row:

R3 − 3R2, then −(1/2)·R2  →   [ 1  1  2 |  9 ]
                              [ 0  1  3 | 11 ]
                              [ 0  0  0 |  0 ]

◦ Finally, we put it in reduced row-echelon form:

R1 − R2  →   [ 1  0 −1 | −2 ]
             [ 0  1  3 | 11 ]
             [ 0  0  0 |  0 ]

◦ So the system does have a solution (since the bottom row is not a contradiction), and we have one free variable, x3. If we set x3 = t we obtain x2 = 11 − 3t and x1 = −2 + t.

◦ So the solutions of the system are 〈x1, x2, x3〉 = 〈−2 + t, 11 − 3t, t〉.

• (1×8) Let A = [ 2 −2 5 ; 0 −1 0 ; −1 0 −2 ]. Compute the determinant, inverse, characteristic polynomial, eigenvalues, eigenvectors (one in each eigenspace), diagonalization D, conjugating matrix P with D = P⁻¹AP, and 2012th power of A.

◦ Determinant of A

∗ Expand down the first column: det(A) = 2·det[ −1 0 ; 0 −2 ] + (−1)·det[ −2 5 ; −1 0 ] = 4 − 5 = −1.

◦ Inverse of A

∗ We write down the matrix [A | I] and then row-reduce A to the identity; the result will be [I | A⁻¹]:

[  2 −2  5 | 1 0 0 ]            [ −1  0 −2 | 0 0 1 ]                 [ 1  0  2 | 0  0 −1 ]
[  0 −1  0 | 0 1 0 ]  R1↔R3 →   [  0 −1  0 | 0 1 0 ]  (−1)·R1, R2 →  [ 0  1  0 | 0 −1  0 ]
[ −1  0 −2 | 0 0 1 ]            [  2 −2  5 | 1 0 0 ]                 [ 2 −2  5 | 1  0  0 ]

R3 − 2R1 →  [ 1  0  2 | 0  0 −1 ]   R3 + 2R2 →  [ 1 0 2 | 0  0 −1 ]   R1 − 2R3 →  [ 1 0 0 | −2  4 −5 ]
            [ 0  1  0 | 0 −1  0 ]               [ 0 1 0 | 0 −1  0 ]               [ 0 1 0 |  0 −1  0 ]
            [ 0 −2  1 | 1  0  2 ]               [ 0 0 1 | 1 −2  2 ]               [ 0 0 1 |  1 −2  2 ]

∗ So we see that A⁻¹ = [ −2 4 −5 ; 0 −1 0 ; 1 −2 2 ]. To verify this we can multiply A by this matrix, and we indeed get the identity matrix (as we should).

◦ Characteristic Polynomial of A

∗ The characteristic polynomial p(t) is given by p(t) = det(tI − A).

∗ Since tI − A = [ t−2 2 −5 ; 0 t+1 0 ; 1 0 t+2 ], expanding down the first column gives det(tI − A) = (t − 2)·det[ t+1 0 ; 0 t+2 ] + 1·det[ 2 −5 ; t+1 0 ] = (t − 2)(t + 1)(t + 2) + 5(t + 1) = (t + 1)(t² + 1).

∗ So p(t) = (t + 1)(t² + 1) = t³ + t² + t + 1.

∗ Note: If one uses p(t) = det(A − tI) instead, the resulting polynomial is p(t) = −t³ − t² − t − 1.

◦ Eigenvalues of A

∗ The eigenvalues are the zeroes of the characteristic polynomial p(t) = (t + 1)(t² + 1).

∗ Setting p(λ) = 0 and solving yields λ = −1, i, −i.

◦ Eigenvectors of A

∗ For λ = −1 we want [ −3 2 −5 ; 0 0 0 ; 1 0 1 ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ], so that a = −c and b = c. Hence a basis for the λ = −1 eigenspace is [ −1 ; 1 ; 1 ].

∗ For λ = i we want [ i−2 2 −5 ; 0 i+1 0 ; 1 0 i+2 ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ], so that b = 0 and a = −(i + 2)c. Hence a basis for the λ = i eigenspace is [ −i−2 ; 0 ; 1 ].

∗ For λ = −i we want [ −i−2 2 −5 ; 0 −i+1 0 ; 1 0 −i+2 ]·[ a ; b ; c ] = [ 0 ; 0 ; 0 ], so that b = 0 and a = −(−i + 2)c. Hence a basis for the λ = −i eigenspace is [ i−2 ; 0 ; 1 ].

◦ Diagonalization of A

∗ Since the eigenspaces of A have the maximal sizes (the eigenvalues are distinct), we see that A is diagonalizable.

∗ The diagonal entries of a diagonalization of A are the eigenvalues of A.

∗ So, for example, we can take D = [ −1 0 0 ; 0 i 0 ; 0 0 −i ].

∗ Note: Any ordering of the eigenvalues along the diagonal is acceptable.

◦ Conjugating matrix for A

∗ The conjugating matrix P has columns given by independent eigenvectors of A, listed in the same order as the eigenvalues in D.

∗ So for the D given above, one P would be P = [ −1 −i−2 i−2 ; 1 0 0 ; 1 1 1 ].

◦ 2012th power of A

∗ We know that D = P⁻¹AP, so D²⁰¹² = P⁻¹A²⁰¹²P, or A²⁰¹² = P D²⁰¹² P⁻¹.

∗ Since D is diagonal we can compute D²⁰¹² = diag( (−1)²⁰¹², i²⁰¹², (−i)²⁰¹² ) = diag(1, 1, 1) = I.

∗ Then A²⁰¹² = P D²⁰¹² P⁻¹ = P P⁻¹ = I = [ 1 0 0 ; 0 1 0 ; 0 0 1 ].
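A numerical cross-check with numpy (illustrative only): since every eigenvalue of A satisfies λ⁴ = 1, already A⁴ = I, and therefore A²⁰¹² = (A⁴)⁵⁰³ = I.

    import numpy as np

    A = np.array([[2, -2, 5], [0, -1, 0], [-1, 0, -2]])
    print(np.linalg.eigvals(A))                       # -1, i, -i (up to ordering and rounding)
    print(np.round(np.linalg.matrix_power(A, 4)))     # the identity matrix
    print(np.round(np.linalg.matrix_power(A, 2012)))  # also the identity matrix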

• (1,1) Let S be the vectors 〈x, y, z, w〉 in R⁴ satisfying x + y + z + w = 0.

◦ Show that S is a subspace of R⁴.

∗ We check the three parts of the subspace criterion (contains ~0, closed under addition, closed under scalar multiplication).

∗ [S1]: The zero vector 〈0, 0, 0, 0〉 satisfies the condition. So ~0 is in S.

∗ [S2]: Suppose that 〈x1, y1, z1, w1〉 and 〈x2, y2, z2, w2〉 are in S: then x1 + y1 + z1 + w1 = 0 and x2 + y2 + z2 + w2 = 0. Then because (x1 + x2) + (y1 + y2) + (z1 + z2) + (w1 + w2) = (x1 + y1 + z1 + w1) + (x2 + y2 + z2 + w2) = 0 + 0 = 0, we see that the vector 〈x1, y1, z1, w1〉 + 〈x2, y2, z2, w2〉 is also in S.

∗ [S3]: Suppose that 〈x1, y1, z1, w1〉 is in S: then x1 + y1 + z1 + w1 = 0. Then because αx1 + αy1 + αz1 + αw1 = α(x1 + y1 + z1 + w1) = α·0 = 0, we see that the vector α·〈x1, y1, z1, w1〉 is also in S.

∗ Note: Another acceptable answer is to observe that these vectors are the solutions to a homogeneous system of linear equations (in this case, the single equation x + y + z + w = 0). Thus they form a subspace, because the solutions to any homogeneous system form a subspace.

◦ Find a basis for S.

∗ We seek a basis for the set of solutions to the (single) homogeneous equation x + y + z + w = 0.

∗ The corresponding coefficient matrix is [ 1 1 1 1 ], which (rather obviously) is already in reduced row-echelon form. The first column is pivotal and the other three are nonpivotal: thus there are three free variables, y, z, and w.

∗ Setting y = t1, z = t2, and w = t3 gives x = −t1 − t2 − t3, so the solutions are precisely 〈x, y, z, w〉 = 〈−t1 − t2 − t3, t1, t2, t3〉 = t1〈−1, 1, 0, 0〉 + t2〈−1, 0, 1, 0〉 + t3〈−1, 0, 0, 1〉.

∗ Therefore, one basis of S is 〈−1, 1, 0, 0〉, 〈−1, 0, 1, 0〉, 〈−1, 0, 0, 1〉. (Note that there are many others.)

• (1,1) Decide whether the given collections of vectors are linearly independent:

◦ 〈−1, 1, 1〉, 〈1, −1, 1〉, and 〈−1, 1, 0〉 in R³.

∗ Not independent: an explicit dependence is (−1)·〈−1, 1, 1〉 + 1·〈1, −1, 1〉 + 2·〈−1, 1, 0〉 = 〈0, 0, 0〉.

∗ Alternatively, one could see they are dependent by checking that det[ −1 1 −1 ; 1 −1 1 ; 1 1 0 ] = 0.

◦ 〈2, 1, 1, 1, 1〉, 〈1, 2, 1, 1, 1〉, 〈1, 1, 2, 1, 1〉, and 〈1, 1, 1, 2, 1〉 in R⁵.

∗ Independent: Suppose a〈2, 1, 1, 1, 1〉 + b〈1, 2, 1, 1, 1〉 + c〈1, 1, 2, 1, 1〉 + d〈1, 1, 1, 2, 1〉 = 〈0, 0, 0, 0, 0〉.

∗ Then 2a + b + c + d = 0, a + 2b + c + d = 0, a + b + 2c + d = 0, a + b + c + 2d = 0, and a + b + c + d = 0.

∗ Subtracting the last equation from each of the other four gives a = 0, b = 0, c = 0, and d = 0.

∗ Thus there is no nontrivial linear combination of these vectors giving the zero vector, so they are linearly independent.


Chapter 7 Notes

David Seal
Spring 2009

6. Eigenvalues and Eigenvectors

A scalar¹ λ ∈ C is an eigenvalue of the n×n matrix A if there exists a non-zero vector ~v ∈ Rⁿ such that

A~v = λ~v.

If we subtract λ~v from both sides, the above equation is equivalent to

(A − λI)~v = ~0

for some non-zero ~v. Since ~v ≠ ~0, we require A − λI to be non-invertible; otherwise the only solution to the last equation is ~v = ~0. That is, we can perform the following algorithm to find every eigenvalue and eigenvector:

(1) Set det(A − λI) = |A − λI| = 0 and solve for λ. This gives us every possible eigenvalue.
(2) For each eigenvalue computed in step (1), solve (A − λI)~v = ~0 for ~v. Note that this is just a restatement of the equation A~v = λ~v.

7. Linear Systems of Differential Equations

We oftentimes run into matrices that do not have 'real'-valued eigenvalues. In such cases, we have two options: 1) give up and say this matrix has no eigenvalues, or 2) use complex numbers to factor the characteristic polynomial. The term imaginary is a bit of a misnomer because there is nothing 'imaginary' about complex (imaginary) eigenvalues, so I will try to stick to the term complex.

7.3. In this section, we are interested in solving the problem

(1)   A~x = ~ẋ

where A is an n×n square matrix. (I'm going to use the notation (d/dt)~x ≡ ~ẋ because it's easier to type. This is very common notation, but not used in our textbook.)

7.3.1. Preliminary Theory. If ~x1, . . . , ~xk are solutions to A~x = ~ẋ, then so is

(2)   ~x(t) = c1~x1(t) + · · · + ck~xk(t).

Our goal is to find as many solutions ~xi(t) as possible, then take linear combinations to form the general solution.

The function ~xi(t) = e^(λit)~vi solves A~x = ~ẋ where λi is any eigenvalue of A with eigenvector ~vi. You can see this if you just plug it into the equation:

A~xi(t) = e^(λit)A~vi = e^(λit)λi~vi = d/dt( e^(λit)~vi ).

Taking linear combinations of the solutions ~xi, equation (2) becomes

(3)   ~x(t) = c1e^(λ1t)~v1 + · · · + cke^(λkt)~vk.

If we have k = n distinct eigenvalues, then this completely solves the problem. In section 7.5 we handle the case when we don't have 'enough' eigenvectors from these eigenvalues.

¹ Sometimes we take scalars from R, and sometimes we take them from C; the definitions and theory we learn are identical for either choice.


Note that this theory doesn't care whether our eigenvalues are real or complex: the above formula holds for any eigenvalue/eigenvector pair!

7.3.2. Complex Valued Solutions. Suppose we have a complex-valued solution ~y(t) = ~x1(t) + i~x2(t). If we plug this into the differential equation, we can see that the real and imaginary parts (Re(~y) = ~x1 and Im(~y) = ~x2) both solve the differential equation. On one hand,

A~y = A(~x1(t) + i~x2(t)) = A~x1(t) + iA~x2(t).

On the other hand,

~ẏ = ~ẋ1 + i~ẋ2.

Since A~y = ~ẏ, the real and imaginary parts² must be equal, and hence

~ẋ1 = A~x1,   ~ẋ2 = A~x2

are two solutions of (1). So from one complex-valued solution of the problem, we obtain two real-valued solutions.

If we have a complex eigenvector ~v with associated eigenvalue λ = α + iβ, then we know from equation (2) that the function

~y(t) = e^(λt)~v = e^(αt)e^(iβt)~v = e^(αt)( cos(βt) + i sin(βt) )~v

is a solution to the problem. Hence we have two solutions (they will be linearly independent) from one complex eigenvalue/eigenvector pair by setting

~x1(t) := Re(~y)   and   ~x2(t) := Im(~y).

7.4. Second-Order Systems and Mechanical Vibrations.

7.5. Multiple Eigenvalue Solutions. Suppose we have a 3×3 matrix A with eigenvalues λ = 2 (mult. ×2) and λ = 1 (mult. ×1). With the eigenvalue λ = 1, we expect exactly 1 eigenvector ~v1. With the eigenvalue λ = 2, we can have either one or two eigenvectors.

If λ = 2 produces two eigenvectors ~v2, ~v3, then we say this eigenvalue is complete and the solution to (1) is given by equation (3):

~x(t) = c1e^t~v1 + c2e^(2t)~v2 + c3e^(2t)~v3.

This is the good case: A has a complete set of eigenvectors and hence A is also diagonalizable.

If λ = 2 produces only one eigenvector (it's going to produce at least one!), then we say this eigenvalue is defective. The number of 'missing' eigenvectors is its degeneracy, here d = 1. The way to handle this case is to perform the following algorithm:

(1) Solve (A − 2I)²~u2 = ~0 for ~u2. If (A − 2I)² is the zero matrix, then any vector ~u2 works, and so you can usually get away with choosing the easiest one: ~u2 = (1, 0, 0).
(2) Set ~u1 = (A − 2I)~u2. Doing this actually forces ~u1 to be an eigenvector, since then

(A − 2I)~u1 = (A − 2I)²~u2 = ~0,

and hence A~u1 = 2~u1.
(3) Consider the two solutions ~x1(t) = e^(2t)~u1 and ~x2(t) = e^(2t)( t~u1 + ~u2 ). The general solution is given by equation (2) with ~x3 = e^t~v1.

² Complex numbers, just like vectors, are equal if and only if each component is equal.
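As an illustration of the eigenvalue method in the complete (distinct-eigenvalue) case, here is a short Python sketch; the matrix is the one from the Quiz #6 system above, and the function name is hypothetical, not from the textbook.

    import numpy as np

    def solve_linear_system(A, x0, t):
        # Solves x' = A x with x(0) = x0, assuming A has a full set of eigenvectors.
        lam, V = np.linalg.eig(A)             # columns of V are eigenvectors
        c = np.linalg.solve(V, x0)            # coefficients in x0 = V c
        return V @ (c * np.exp(lam * t))      # x(t) = sum_i c_i e^(lam_i t) v_i

    A = np.array([[3.0, 2.0], [1.0, 2.0]])    # the system y' = 3y + 2z, z' = y + 2z
    x0 = np.array([1.0, 0.0])
    # For real A with complex eigenvalues the result may carry a tiny imaginary part.
    print(solve_linear_system(A, x0, 1.0))    # matches c1 e^t [1,-1] + c2 e^(4t) [2,1]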


CHAPTER 1

First-Order Differential Equations

1. Diff Eqns and Math Models

Know what it means for a function to be a 'solution' to a differential equation. In order to figure out if y = y(x) is a 'solution' to the differential equation, we plug this into the differential equation and see if it solves it.

For example, in algebra we may be faced with an equation like

x2 − 5x + 10 = 2x.

We can verify that x = 2 is a solution to this equation by plugging it in andverifying that it solves the equation. Here is my proof that x = 2 is the solution tothe previous equation:

LHS = x2 − 5x + 10 = 22 − 5(2) + 10 = 4 − 10 + 10 = 4.

RHS = 2x = 2(2) = 4.

Since LHS = RHS, x = 2 is a solution. When verifying your solution, do NOTmanipulate both sides of the equation. For example, the following is NOT a validproof that x = 2 is a solution:

22 − 5(2) + 10 = 2(2)

4 − 10 + 10 = 4

4 − 10 = 4 − 10

−6 = −6.

For practice reviewing derivatives, you can look at 1, 4, 7, 10 and 17, 20, 23.

2. Integrals as General and Particular Solutions

This section is intended to be more practice with integration. You should be able to solve 1–8 and 10 without thinking about them too much. Remember integration by parts is your friend:

∫ u dv = uv − ∫ v du.

Try using this tool on problem 10. Another integration technique that is extremely useful is Partial Fractions. In order to do the partial fractions setup for the function f(x) = 1/(x^3 + 3x^2) we first need to completely factor the denominator:

f(x) = 1/(x^3 + 3x^2) = 1/(x^2(x + 3)).

Once you have all the factors, just set up the partial fractions. If any factor has degree higher than 1, you need to put a polynomial of one less degree in the numerator:

1/(x^2(x + 3)) = A/(x + 3) + (Bx + C)/x^2.
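If you want to double-check a partial fractions setup (or your solved coefficients), sympy can do the decomposition symbolically; a small sketch:

import sympy as sp

x = sp.symbols('x')
f = 1 / (x**3 + 3*x**2)

print(sp.factor(x**3 + 3*x**2))   # x**2*(x + 3)
print(sp.apart(f, x))             # equals 1/(9*(x + 3)) - 1/(9*x) + 1/(3*x**2)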

The velocity problems are covered in much more detail in section 2-3. Problems 11–18 are good practice with integration and require you to know the relations between a, v and x. You should be able to do these without thinking about the problems too much.

3. Slope Fields and Solution Curves

You have one theorem whose statement you need to memorize:

Theorem 1. If f(x, y) and the partial derivative ∂f/∂y (x, y) are continuous 'near' the point (x0, y0), then the initial value problem

dy/dx = f(x, y),   y(x0) = y0

has a unique solution on some (possibly small) interval containing x0.

If the hypotheses of the theorem are not satisfied, then anything goes. You may have one solution, infinitely many solutions or no solutions whatsoever.

For practice with this theorem, you may want to consider trying problems: 11, 12, 13, 14, 17, 18.

As a variation on this theorem, you can drop the hypothesis on the continuity of ∂f/∂y and obtain existence (possibly without uniqueness). See

http://en.wikipedia.org/wiki/Peano_existence_theorem.

4. Separable Equations and Applications

The technique of separating variables allows us to solve a whole new class of problems that aren't covered in the standard 222 course. Any problem from 1–28 will be extremely good practice for the exam.

In addition, this section introduces a few new models not previously covered:

(1) Population Growth: dP/dt = kP; k > 0 is the growth constant.
(2) Radioactive Decay: dN/dt = −kN; k > 0 is the decay constant.
(3) Newton's Law of Cooling: dT/dt = k(A − T); k > 0 is a constant that has to do with how well insulated the system is. A ≡ constant is the ambient air temperature.

Now that we know about phase diagrams, you should be able to sketch one of these for each of these equations. Note that the first two equations are essentially the same, dy/dt = const · y; after you're supplied with enough initial conditions, you'll be able to determine the correct sign for the constant.

For practice, you may want to consider trying problems 33–36, 43, 49, 65.

5. Linear First-Order Equations

In this section we learned how to solve any linear first order differential equation. These equations are anything that can be written in what I call 'standard form':

dy/dx + P(x)y = Q(x).

In order to solve these equations, we use the INTEGRATING FACTOR METHOD. Memorize the method given on page 47. You should probably add in a step 0, which says 'write in standard form'. The most important thing to remember is how to compute the integrating factor: ρ = exp(∫ P(x) dx).

Practice problems 1–19 (every third) until you become comfortable with this method. It's VERY important to understand this method because it is one of the only two methods that we learn in the first two chapters.

The integrating factor method allows us to solve Mixing Problems. The standard setup is given by the picture on page 51. I think it helps to do the dimensional analysis to come up with the terms:

dx/dt = stuff in − stuff out.

Since [dx/dt] = g salt/sec, we know that 'stuff in' has units of g salt/sec. Therefore if we take [ri] = L/sec and multiply it by [ci] = g salt/L we'll have the correct units:

dx/dt = ri ci − ro co.

The rate in, ri, and ci are usually given to us. To find co we need to compute it using what we know about the problem. The volume V(t) of the tank can usually be explicitly computed, and hence co = x/V. The differential equation becomes:

dx/dt = ri ci − ro co = ri ci − (x(t)/V(t)) ro.

For practice with this problem redo your homework problem 33, and try solving 36 and 37 on your own.
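For example (a made-up tank, not one of the book's problems), the resulting equation can also be checked numerically with scipy; a sketch:

import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical tank: V = 100 L (constant, since r_i = r_o), inflow 2 L/s at 3 g/L,
# outflow 2 L/s of the well-mixed solution, x(0) = 0 g of salt.
ri, ci, ro, V = 2.0, 3.0, 2.0, 100.0

def dxdt(t, x):
    return ri * ci - ro * (x / V)      # stuff in - stuff out

sol = solve_ivp(dxdt, (0.0, 300.0), [0.0], dense_output=True)
print(sol.sol(300.0)[0])               # approaches ci * V = 300 g as t grows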


CHAPTER 2

Mathematical Models and Numerical Methods

1. Population Models

Here we encounter a more sophisticated population equation called the logistic equation. It can be found on page 79, equation 3:

dP/dt = kP(M − P).

One can think of this as an extension of the unbounded population growth model dP/dt = kP, obtained by subtracting another term that is proportional to P^2. Thus, when P is small, the kP term dominates, and when P is large, the P^2 term dominates.

From a phase diagram, you should be able to immediately see what the limiting population is.

For review of partial fractions, you may want to consider looking at problems 1–8. For practice with some population models, try problems 9 and 21.

2. Equilibrium Solutions and Stability

When we’re studying first order differential equations of the form

dx

dt= f(x)

where f = f(x) is a function of x only, (note that t is the independent variable,and in this context, x is the dependent variable) we can oftentimes derive qual-itative information about what happens as the solution evolves over time from agiven initial condition x0.

Know how to find a critical point. (set f(x) = 0, and solve for x). Know howto determine if your critical point is stable/unstable/semistable.

For practice, try problems 1, 3, 5 and 9. Know how to analyze the stabilityand long term behavior for the logistics equation from section 2-1 as well as thevariation that includes the harvesting parameter h ≥ 0:

dP

dt= kP (M − P ) − h.

What qualitative behavior changes as h is increased? Is there a special point whereh drastically changes the behavior of the solutions?
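One way to explore that question (a sketch only, with made-up values k = 1, M = 4) is to watch how the critical points of the harvested equation move as h grows:

import sympy as sp

P = sp.symbols('P', real=True)
k, M = 1, 4   # illustrative values, not from the text

# Critical points solve k*P*(M - P) - h = 0; count the real roots as h grows.
for h in (3, 4, 5):
    pts = sp.solveset(k*P*(M - P) - h, P, domain=sp.S.Reals)
    print(h, pts)
# h = 3: {1, 3}   h = 4: {2}   h = 5: EmptySet
# The two equilibria merge and disappear at h = k*M**2/4; past that value every
# solution eventually crashes.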

3. Acceleration-Velocity Models

3.1. Gravity. Newton’s second law (abbreviated N2L) states that F = ma

where F is the force applied on the object, m is the mass of the object, and

a = dvdt

= d2y

dt2is the objects acceleration. Recall the relations between position,

velocity and acceleration.

5

Page 121: MATH320

In general F is actually a vector, so it has direction and magnitude. Since we have only ever worked in 1 dimension, the only options for F are positive or negative.

When N2L is applied to gravity, we have |FG| = ma. One of the fundamental assumptions concerning gravity is |FG| = GM·m/d^2, where d is the distance between the two objects, G is the gravitational constant and M, m are the masses of the two bodies. After dividing by m, we have

|a(t)| = GM/d^2,

so that in fact the acceleration due to the force of gravity is independent of the object's mass!

When d ≈ R doesn't vary too much (i.e. for an object which stays near the surface of the earth), then we can say that GM/d^2 ≈ g is a constant. Hence:

dv/dt = −g.

The minus sign is to account for the direction of the force of gravity. When we're using units of ft, we have that g = 32. When we're measuring distance in terms of meters, g = 9.8. For practice, I suggest you review your homework problems: 25, 30. In addition I suggest looking at problems 3 and 20 for extra practice.

4. Numerical Approximation: Euler’s Method

Euler’s method is the most basic and fundamental Numerical Technique forsolving differential equations. I should emphasize that when one studies any realworld applied problem, one usually needs to resort to numerical techniques forsolving a differential equation.

The best way to derive Euler’s method is to start with what we’ll call thediscrete derivative. To see where this comes from recall the definition of thederivative:

y′(x) = limh→0

y(x + h) − y(x)

h.

From here it makes sense that y′(x) ≈ y(x+h)=y(x)

hwhen h is very small. Now if we

write yn+1 = y(xn + h); yn = y(xn); xn+1 = xn + h, we have the approximation

y′(xn) ≈yn+1 − yn

h.

If we’re trying to solve the equation

dy

dx= f(x, y)

then we just set this discrete derivative equal to the right hand side function f :

yn+1 − yn

h= f(xn, yn).

Solving this equation for yn+1 gives us the updating formula. Initialize y0 and x0

from the initial conditions. For n = 0, 1, 2, 3, ... do:

yn+1 = yn + hf(xn, yn), xn+1 = xn + h.

For practice try problems 3 and 6.
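The updating formula translates directly into a few lines of code; here is a minimal sketch (the test equation y′ = y, y(0) = 1 is made up for illustration):

import math

def euler(f, x0, y0, h, n):
    """Approximate y at x0 + n*h for dy/dx = f(x, y), y(x0) = y0."""
    x, y = x0, y0
    for _ in range(n):
        y = y + h * f(x, y)   # y_{n+1} = y_n + h f(x_n, y_n)
        x = x + h             # x_{n+1} = x_n + h
    return y

# Test on y' = y, y(0) = 1, whose exact solution is e^x.
approx = euler(lambda x, y: y, 0.0, 1.0, 0.01, 100)
print(approx, math.e)   # about 2.7048 vs 2.7183; the error shrinks as h -> 0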


PRACTICE PROBLEMS FOR EXAM 2

1. Matrix Operations and Inverses

(1) Problems 31–38 in Section 3.4 of Edwards and Penney.
(2) Problems 31, 33, 34, 44 in Section 3.5 of Edwards and Penney.
(3) Suppose A is a 2 × 1 matrix and B is a 1 × 2 matrix. If C = AB, show that C is not invertible.

Proof. If

A = [ a11 ]      B = [ b11  b12 ],
    [ a21 ]

then the product AB is

AB = [ a11b11  a11b12 ]
     [ a21b11  a21b12 ].

If we perform the row operation −(a21/a11)R1 + R2 → R2 we get the matrix

[ a11b11  a11b12 ]
[   0       0    ].

Since AB is row-equivalent to a matrix with a row of zeros, it is not invertible.

2. Determinants

(1) Problems 47, 52, 60 in Section 3.6 of Edwards and Penney.
(2) Let A be a matrix with dimensions (2k + 1) × (2k + 1). This matrix is a skew-symmetric matrix, which means A^T = −A. Prove that det A = 0.

Proof. Since det A = det A^T, we have det A = det(−A). Since −A means we are multiplying each row by −1, we can factor −1 out of det(−A) once for each row. That is, det(−A) = (−1)^{2k+1} det(A) = −det(A). Thus det(A) = −det(A), which is only possible if det A = 0.
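A quick numerical sanity check of this fact (a sketch using a random odd-dimensional skew-symmetric matrix):

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M - M.T                      # A is 5x5 and skew-symmetric: A.T == -A

print(np.allclose(A.T, -A))      # True
print(np.linalg.det(A))          # 0 up to floating point roundoff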

3. Basic Vector Space Properties and Subspaces

(1) Problems 24, 28–31 in Section 4.2 of Edwards and Penney.
(2) If V is a vector space, verify that

(v1 + v2) + (v3 + v4) = [v2 + (v3 + v1)] + v4

for all vectors v1, v2, v3, v4 in V. Use only the definition of a vector space.

Date: November 1, 2012.


Proof. This is an exercise in the use of associativity and commutativity.

(v1 + v2) + (v3 + v4) = [(v1 + v2) + v3] + v4
                      = [(v2 + v1) + v3] + v4
                      = [v2 + (v1 + v3)] + v4
                      = [v2 + (v3 + v1)] + v4.

Easy as pie.

(3) Let R^n denote the usual vectors of n-tuples of real numbers. However, we now define a new type of vector addition and scalar multiplication:

v1 ⊕ v2 = v1 − v2,
c ⊙ v1 = −c v1,

where the right hand sides of the equations are defined in the usual way for vectors in R^n. Which properties of a vector space are satisfied by ⊕, ⊙?

Solution. We go through all the properties.
(a) v1 ⊕ v2 ≠ v2 ⊕ v1 for all vectors. Thus commutativity for addition fails.
(b) Now we test for associativity.

u ⊕ (v ⊕ w) = u ⊕ (v − w) = u − (v − w)
            = (u − v) + w = (u ⊕ v) ⊕ (−w).

So the operation ⊕ is not associative.
(c) There is no zero element for this operation. The usual zero vector is a right-side identity,

u ⊕ 0 = u − 0 = u,

but there is no left-side identity,

0 − u = −u ≠ u.

(d) We cannot even talk about inverses, because there is no identity element for ⊕!
(e) Let b, c be scalars. Then

b ⊙ (c ⊙ u) = b ⊙ (−c u) = (−b)(−c u) = cb u = (−c)(−b u) = c ⊙ (b ⊙ u).

So ⊙ is commutative.
(f) There is an identity element for this operation, however it is −1 rather than 1:

(−1) ⊙ u = −(−1)u = u.

(g) The first distributive law

a ⊙ (u ⊕ v) = a ⊙ (u − v) = (−a)(u − v) = (−a)u − (−a)v = a ⊙ u ⊕ a ⊙ v

holds.
(h) The second distributive law

(a + b) ⊙ u = −(a + b)u = −a u − b u = a ⊙ u ⊕ [(−b) ⊙ u]

holds.


(4) Let R^2 denote all pairs of real numbers. We define vector addition to be

(x, y) ⊕ (u, v) = (x + u, 0)

and scalar multiplication as

c ⊙ (x, y) = (cx, 0).

Is R^2 with ⊕, ⊙ as the operations a vector space?

Solution. We go through all the properties.
(a) We first check commutativity. We have

(x1, x2) ⊕ (y1, y2) = (x1 + y1, 0) = (y1 + x1, 0) = (y1, y2) ⊕ (x1, x2)

for all vectors. Thus commutativity for addition is true.
(b) Now we test for associativity.

u ⊕ (v ⊕ w) = (u1, u2) ⊕ (v1 + w1, 0) = (u1 + v1 + w1, 0)
            = (u1 + v1, 0) ⊕ (w1, 0) = (u ⊕ v) ⊕ w.

So the operation ⊕ is associative.
(c) There is no identity element for this operation:

(u1, u2) ⊕ v = (u1 + v1, 0) ≠ u

regardless of how we choose v, provided u2 ≠ 0.

Since one of the properties has failed, these operations cannot form a vector space. If we had planned better we could have gone straight to property (c), then stopped once that property failed.

(5) Let F be the vector space of real-valued functions. Is the subset {f ∈ F : f(0) = f(1)} a subspace of F?

Solution. We test for the two subspace properties.
(a) Let f, g be in the subset above. In other words, f(0) = f(1) and g(0) = g(1). Then

(f + g)(0) = f(0) + g(0) = f(1) + g(1) = (f + g)(1)

so f + g is also in the subset.
(b) Let c be a scalar and f a function in the subset. Then

(cf)(0) = c f(0) = c f(1) = (cf)(1)

so cf is in the subset.

Both of the subspace properties are satisfied so the subset is a subspace.

(6) It is a fact that R^{n×n}, the set of all n × n matrices, is a vector space with the usual definition of A + B and cA for any real c.
(a) Is the set of all invertible matrices a subspace?
(b) Let B be a given matrix in R^{n×n}. Is the set of all matrices A such that AB = BA a subspace of R^{n×n}?

Solution. (a) No, this is not a subspace. It fails the first subspace property. In particular, the identity matrix I and −I are both invertible, but I + (−I) = 0, where 0 is the matrix of all zeros, is not invertible.
(b) Yes. We verify both properties.
(i) Let A, C be matrices in the above subset, so AB = BA and CB = BC. Then

(A + C)B = AB + CB = BA + BC = B(A + C)

so A + C is in the subset.


(ii) Let c be a scalar. Then

(cA)B = c(AB) = c(BA) = B(cA)

so cA is also in the subset.

Because both properties are satisfied, the subspace theorem tells us this subset is a subspace.

4. Span and Bases

(1) Problems 23–32 of Section 4.3 in Edwards and Penney.
(2) Problems 24, 34, 35 of Section 4.4 in Edwards and Penney.
(3) Suppose two vectors u and v are linearly dependent. Prove that one of them is a scalar multiple of the other.

Proof. If one of these vectors is ~0 this is trivially true. So we assume both u and v are nonzero vectors. Then if they are linearly dependent we have

c1 u + c2 v = ~0

with c1, c2 not both zero; in fact both must be nonzero, since if one coefficient were zero the equation would force the other to be zero as well (u and v are nonzero). Then

v = −(c1/c2) u

so the vectors are scalar multiples of each other.

(4) Find three vectors in R^3 which are linearly dependent, but any two of them are linearly independent (between just the two of them).

Solution. Visually, this corresponds to 3 vectors lying in the same plane. So we can pick the vectors (1, 0, 0), (0, 1, 0), and (1, 1, 0). We must actually prove the solution is correct, so first we verify the vectors are pairwise linearly independent.

Pair 1. First we show (1, 0, 0) and (0, 1, 0) are LI. If

c1(1, 0, 0) + c2(0, 1, 0) = (0, 0, 0)
⇒ (c1, 0, 0) + (0, c2, 0) = (0, 0, 0)
⇒ (c1, c2, 0) = (0, 0, 0)

so c1 = c2 = 0. Thus they are linearly independent.

Pair 2. Next we show (1, 0, 0) and (1, 1, 0) are LI. If

c1(1, 0, 0) + c2(1, 1, 0) = (0, 0, 0)
⇒ (c1, 0, 0) + (c2, c2, 0) = (0, 0, 0)
⇒ (c1 + c2, c2, 0) = (0, 0, 0)

so c2 = 0 and c1 + c2 = c1 = 0. Thus they are linearly independent.

Pair 3. Finally we show (0, 1, 0) and (1, 1, 0) are LI. If

c1(0, 1, 0) + c2(1, 1, 0) = (0, 0, 0)
⇒ (0, c1, 0) + (c2, c2, 0) = (0, 0, 0)
⇒ (c2, c1 + c2, 0) = (0, 0, 0)

so c2 = 0 and c1 + c2 = c1 = 0. Thus they are linearly independent.

Now we must show that together they are linearly dependent.

c1(1, 0, 0) + c2(0, 1, 0) + c3(1, 1, 0) = (0, 0, 0)
⇒ (c1 + c3, c2 + c3, 0) = (0, 0, 0)

so c1 = −c3 and c2 = −c3 with no restrictions on c3. Thus there are nonzero values c1, c2, c3 such that the vectors add to ~0, so they are linearly dependent.
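A quick way to double-check this kind of claim numerically is to compare matrix ranks (a sketch using numpy; rank 2 for a pair of vectors means the pair is independent, rank 2 for the trio of three vectors means they are dependent):

import numpy as np

v1, v2, v3 = np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([1, 1, 0])

# Each pair has rank 2, so each pair is linearly independent...
for pair in [(v1, v2), (v1, v3), (v2, v3)]:
    print(np.linalg.matrix_rank(np.column_stack(pair)))      # 2, 2, 2

# ...but all three together still only have rank 2, so they are dependent.
print(np.linalg.matrix_rank(np.column_stack((v1, v2, v3))))  # 2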

5. Inner Products

(1) Problems 26, 30, 32 of Section 4.6 in Edwards and Penney.
(2) Consider the vector space R^{n×n} of n × n matrices with real entries. Determine if

(A, B) = tr(AB^T)

is an inner product on R^{n×n}, where tr A is the sum of the diagonal entries of A.

Proof. We must verify (A, B) has the inner product properties.
(a) The trace is unchanged by taking the transpose because the transpose leaves the diagonal unchanged, so tr A = tr A^T. Thus

(A, B) = tr(AB^T) = tr((AB^T)^T) = tr((B^T)^T A^T) = tr(BA^T) = (B, A).

So the first property is satisfied.
(b) The trace is also linear, so tr(A + B) = tr A + tr B. Then

(A, B + C) = tr(A(B + C)^T) = tr(AB^T + AC^T) = tr(AB^T) + tr(AC^T) = (A, B) + (A, C)

so the second property holds.
(c) By the aforementioned linearity of the trace, we have

(cA, B) = tr(cAB^T) = c tr(AB^T) = c(A, B).

The third property is satisfied.
(d) This one seems like a doozy. The expanded definition of matrix multiplication gives the diagonal entries

[AB]_{jj} = Σ_{i=1}^{n} a_{ji} b_{ij}.

In our case B = A^T, so b_{ij} = a_{ji}. Thus the sum above is

[AA^T]_{jj} = Σ_{i=1}^{n} a_{ji}^2.

From this we get

(A, A) = Σ_{j=1}^{n} [AA^T]_{jj} = Σ_{j=1}^{n} Σ_{i=1}^{n} a_{ji}^2 ≥ 0

because each a_{ji}^2 ≥ 0. In fact, we only have (A, A) = 0 if each of those entries is zero, so the fourth and final property holds.

Thus it defines an inner product.
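In other words, (A, B) = tr(AB^T) is just the entry-by-entry dot product of the two matrices (a standard fact, not stated in the notes); a quick numerical check of that observation:

import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

lhs = np.trace(A @ B.T)        # tr(A B^T)
rhs = np.sum(A * B)            # sum over all entries a_ij * b_ij
print(np.isclose(lhs, rhs))    # True

print(np.trace(A @ A.T) >= 0)  # (A, A) is a sum of squares, hence >= 0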


EXAM 2 REVIEW

DAVID SEAL

3. Linear Systems and Matrices

3.2. Matrices and Gaussian Elimination. At this point in the course, you all have had plenty of practice with Gaussian Elimination. Be able to row reduce any matrix you're given. My advice to you concerning not making mistakes is the following:

(1) Avoid fractions! Use least common multiples rather than deal with them. Computers don't make mistakes adding fractions, we do.
(2) Take small steps. When I do these problems, you will never see me write down the operation R2 → −2R1 + 3R2. Instead I would break this up into three steps:

1.) R2 → 3R2
2.) R1 → −2R1
3.) R2 → R1 + R2.

In fact you can kind of cheat and do both operations 1 and 2 at once. This isn't an elementary row operation, but it doesn't hurt to do that.
(3) Use back substitution after you have your matrix in triangular (row echelon) form. You don't need your matrix in reduced row echelon form to find a solution, just row echelon form.

What I mean by this third piece of advice is the following. Suppose you've already performed the following row operations:

[ A  ~b ]  → (row ops) →  [ 4  2  0 | 0 ]
                          [ 0  6  1 | 1 ]
                          [ 0  0  2 | 2 ].

This is now in row echelon form, but not reduced row echelon form (see next section) because the off-diagonal entries are nonzero. But as far as we care, this is enough information to solve for all the variables. Write down the equation described by the 3rd row: 2z = 2, so z = 1. Then write down the equation described by the 2nd row: 6y + z = 1, and solve for y.

For practice see problems 11, 12 and 23, 24. Practice enough of 11–18 until you don't have to think about doing these problems; it just becomes mechanical.

3.3. Reduced Row-Echelon Matrices. The most important theorem from this section is

Theorem 1 (The Three Possibilities Theorem). Every linear system has either a unique solution, infinitely many solutions or no solutions.

Date: Updated: March 29, 2009.


Exercise: If A is an n × n matrix, what are the possible outcomes of solving A~x = ~0? Hint: What could A's reduced row echelon form look like?

Know how to put a matrix in reduced row echelon form. Usually if you're interested in solving an equation you don't do this, but this vocabulary term could come up and you should know it. For practice try problems 1–4.

3.4. Matrix Operations. Know if and when multiplication/addition is defined for matrices and how to do it. These are easy problems to do, but in case they show up on the exam you want to make sure you don't make any arithmetic mistakes! I'm a bit reluctant to suggest problems from this section, but you should be able to do any of 1–16 with your eyes closed.

See problems 17, 18, 19. Do these look familiar now? Can you produce a vector/vectors that uniquely describe the kernel of a particular matrix for each of these problems? What are the dimensions of these linear subspaces? (Hint: the dimension is the number of free variables that are necessary to describe the space.)

3.5. Inverses of Matrices. The important definition:

Definition. An n × n matrix A (square matrices only!) is said to be invertible if there exists a matrix B such that AB = BA = I_{n×n}, where I_{n×n} is the identity matrix.

Fact. Inverses, when they exist, are unique. This is a very nice feature to have when you define something! You wouldn't want to get in a fight with your friend over who found the 'better' inverse. We denote the (unique) inverse matrix of A by A^{−1}.

You DO NOT need to be prepared to find the inverse of a matrix using row operations! Prof. Bertrand said this type of problem is too long for an exam situation.

One problem you might run into is: given two matrices, are they inverses of each other? To check this you need to know how to multiply two matrices together and the definition of the inverse. The following problem is short enough to appear on an exam.

Exercise: If A and B are invertible matrices, is AB an invertible matrix? If so, what's the inverse matrix? Can you prove this?

3.6. Determinants. We'll first begin with an extremely important fact. If A is a square matrix, then A is invertible if and only if det(A) ≠ 0. You can add this to Theorem 7 given on page 193. See your notes from discussion section about what I had to suggest in how to go about finding the determinant of a matrix. You can always do this through cofactor expansion; the thing to keep in mind is the checkerboard pattern that shows up when evaluating the sign of each coefficient.

See problems 1, 2. I don't imagine you'll be asked to evaluate the determinant for any large (i.e. larger than 4 × 4) matrix.

One other trick to keep in mind is what sort of 'row operations' you can do to a matrix when evaluating the determinant. See property 5 listed in the text. This says you're allowed to add a multiple of any row to another row and this doesn't change the determinant. You have to be very careful not to misuse this property! This doesn't mean you can arbitrarily multiply a row by a number like we're used to with systems of equations.

See problems 7, 9.


Exercise: If A is an n × n matrix, express det(λA) in terms of λ and the matrix A. Hint: The answer is not λ det(A).

4. Vector Spaces

4.2. The Vector Space R^n and Subspaces. If we have a vector space V, and we have a subset W ⊂ V, a natural question to ask is whether or not W itself forms a vector space. This means it needs to satisfy all the properties of a vector space that are listed on page 236 of your text. The bad news is this is quite a long list, but the good news is we don't have to check every property on the list, because most of them are inherited from the original vector space V. In short, in order to see if W is a vector space, we need only check if W passes the following test.

Theorem 2. If V is a vector space and W ⊂ V is a non-empty subset, then W

itself is a vector space if and only if it satisfies the following two conditions:

(1) Additive Closure If ~a ∈ W and ~b ∈ W , then ~a +~b ∈ W .

(2) Multiplicative Closure If λ ∈ R and ~a ∈ W , then λ~a ∈ W .

The statement of this theorem has the term 'non-empty' as one hypothesis for the theorem to be true. In most applications of this theorem, we actually replace the statement non-empty with requiring that ~0 ∈ W. I.e., the theorem from above is equivalent to the following theorem.

Theorem 3. If V is a vector space and W ⊂ V , then W itself is a vector space if

and only if it satisfies the following two conditions:

(1) Additive Closure If ~a ∈ W and ~b ∈ W , then ~a +~b ∈ W .

(2) Multiplicative Closure If λ ∈ R and ~a ∈ W , then λ~a ∈ W .

(3) Non Empty ~0 ∈ W .

Note that these are two properties that are on the long laundry list of propertieswe require for a set to be a vector space.

Example. Consider W := {~a = (x, y) ∈ R^2 : x = 2y}. Since W ⊂ R^2, we may be interested in whether W itself forms a vector space. To answer this question we need only check two items:

(1) Additive Closure: An arbitrary element in W can be described by (2y, y) where y ∈ R. Let (2y, y), (2z, z) ∈ W. Then (2y, y) + (2z, z) = (2y + 2z, y + z) ∈ W since 2y + 2z = 2(y + z).
(2) Multiplicative Closure: We need to check that if λ ∈ R and ~a ∈ W, then λ~a ∈ W. Again, an arbitrary element in W can be described by (2y, y) where y ∈ R. Let λ ∈ R and (2y, y) ∈ W. Then λ(2y, y) = (2λy, λy) ∈ W since the first coordinate is exactly twice the second.

Note: it is possible to write this set as the kernel of a matrix. In fact, you can check that W = ker(A), where A_{1×2} = [ 1  −2 ]. We actually have a theorem that says the kernel of any matrix is indeed a linear subspace.

Example. Consider W := {~a = (x, y, z) ∈ R^3 : z ≥ 0}. In order for this to be a linear subspace of R^3, it needs to pass two tests. In fact, this set passes the additive closure test, but it doesn't pass multiplicative closure! For example, (0, 0, 5) ∈ W, but (−1) · (0, 0, 5) = (0, 0, −5) ∉ W.


Definition. If A_{m×n} is a matrix, we define ker(A) := {x ∈ R^n : Ax = 0}. This is also called the nullspace of A.

Note that ker(A) lives in R^n.

Definition. If A_{m×n} is a matrix, we define Image(A) := {y ∈ R^m : Ax = y for some x ∈ R^n}. This is also called the range of A.

Note that Image(A) lives in R^m.

Theorem 4. If A_{m×n} is a matrix, then ker(A) is a linear subspace of R^n and Image(A) is a linear subspace of R^m.

Exercise: show this theorem is true. In lieu of section 4-4, I think likely candidates from this section will be proving a set of elements is not a vector space. For example see problems 7, 9, 10, 13.

Problems 15–22 serve as excellent practice for leading up to section 4-4, but you should get the gist after doing problems from that section.

4.3. Linear Combinations and Independence of Vectors. If we have a collection of vectors ~v1, ~v2, . . . , ~vk, we can form many vectors by taking linear combinations of these vectors. We call this space the span of a collection of vectors, and we have the following theorem:

Theorem 5. If ~v1, ~v2, . . . , ~vk is a collection of vectors in some vector space V, then

span{~v1, ~v2, . . . , ~vk} := {~w : ~w = c1~v1 + c2~v2 + · · · + ck~vk, for some scalars ci ∈ R}

is a linear subspace of V.

For a concrete example, we can take two vectors ~v1 = (1, 1, 0) and ~v2 = (1, 0, 0) which both lie in R^3. Then the set W = span{(1, 1, 0), (1, 0, 0)} describes a plane that lives in R^3. This set is a linear subspace by the previous theorem. In fact, we can be a bit more descriptive and write W = {(x, y, z) ∈ R^3 : z = 0}.

If we continue with this example, it is possible to write W in many other ways. In fact, we could have written

W = span{(1, 1, 0), (1, 0, 0), (−5, 1, 0)} = span{(−10, 1, 0), (2, 1, 0)}.

These examples illustrate the fact that our choice of vectors need not be unique. What is unique is the least number of vectors that are required to describe the set. In fact this is so important we give it a name, and call it the dimension of a vector space. This is the content of section 4.4. In our example, dim(W) = 2, but right now we don't have enough tools to show this.

In order to make this statement 'least' precise, we need to introduce the following important definition.

Definition. Vectors ~v1, ~v2, . . . , ~vk are said to be linearly independent if whenever

c1~v1 + c2~v2 + · · · + ck~vk = 0

for some scalars ci, it must follow that ci = 0 for each i.

OK, so definitions are all fine and good, but how do we check if vectors are linearly independent? The nice thing about this definition is it always boils down to solving a linear system.

Example. As a concrete example, let's check if the vectors ~v1, ~v2 are linearly independent, where ~v1 = (4, −2, 6, −4) and ~v2 = (2, 6, −1, 4).


We need to solve the problem

c1~v1 + c2~v2 = 0.

This reduces to asking what are the solutions to

c1(4, −2, 6, −4) + c2(2, 6, −1, 4) = (0, 0, 0, 0).

We can write this problem as a matrix equation A~c = ~0 where

A = [  4   2 ]
    [ −2   6 ]
    [  6  −1 ]
    [ −4   4 ],      ~c = (c1, c2),

and solve it using Gaussian elimination:

[  4   2 | 0 ]                 [ 1  0 | 0 ]
[ −2   6 | 0 ]   → (row ops)   [ 0  1 | 0 ]
[  6  −1 | 0 ]                 [ 0  0 | 0 ]
[ −4   4 | 0 ]                 [ 0  0 | 0 ].

Thus c1 = c2 = 0 is the only solution to this problem, and so these two vectors are linearly independent.
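If you want to check this kind of computation, numpy will do the row reduction for you implicitly via the rank (a sketch; rank equal to the number of vectors means independent):

import numpy as np

A = np.array([[ 4,  2],
              [-2,  6],
              [ 6, -1],
              [-4,  4]])

# rank(A) == number of columns  <=>  A c = 0 has only the trivial solution.
print(np.linalg.matrix_rank(A))   # 2, so v1 and v2 are linearly independent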

To demonstrate that a collection of vectors is not linearly independent, it suffices to find a non-trivial combination of these vectors and show they sum to ~0. For example, see example 6 in the textbook.

We have another method for checking if vectors are linearly independent. This method is more complicated to apply, so I encourage you to become familiar with using Gaussian Elimination (row operations) for checking independence. But to be complete, I'll include this other method as well. Essentially what happened in this last problem was we were able to do row operations to a matrix, and get the 'identity' matrix in a part of it. Being row equivalent to the identity matrix is equivalent to being invertible, and this is equivalent to having a non-zero determinant. What makes this tricky is what part of what matrix we are considering, because determinants are only defined for square matrices. We'll do the special case first:

Theorem 6 (Independence of n Vectors in R^n). The n vectors ~v1, ~v2, . . . , ~vn are linearly independent if and only if det(A) ≠ 0, where (as usual)

A = [ ~v1  ~v2  . . .  ~vn ]

is the (square n × n) matrix given by putting the vectors ~vi as columns.

Note: this theorem currently only applies to a collection of n vectors in R^n, but we can easily extend this to other collections of vectors. We'll do this in a minute. What this theorem doesn't give us is which combination gives a sum that's ~0.

I do think seeing a proof of this is instructive. It tells you why we care about even taking determinants here, as well as being a good illustration of how one goes about checking for independence.


Proof: (If det(A) ≠ 0, then the vectors are independent). Suppose c1~v1 + · · · + cn~vn = ~0 for some scalars ci. Then as usual, this leads to solving the system A~c = ~0 where

A = [ ~v1  ~v2  . . .  ~vn ],      ~c = (c1, c2, . . . , cn).

Since det(A) ≠ 0, we know A is invertible, which also means A is row equivalent to the identity matrix. This means we can do row operations to the augmented system [ A | ~0 ] and turn it into [ I | ~0 ]. Evidently, c1 = c2 = · · · = cn = 0, so the vectors are linearly independent.

To summarize. Know the definition of linear independence. Know that checking for independence always results in asking what are the solutions to the equation A~c = ~0. If ~c = ~0 is the only solution, then they are independent, and if there's a non-zero ~c that solves this, they are dependent. Finding ~c gives you the coefficients c1, c2, . . . that demonstrate linear dependence. Gaussian elimination (row operations) is a method that will give you the coefficients.

Exercise. Is the set of vectors (0, 0, 1), (1, 0, 1), (0, 0, 0) linearly independent?
Exercise. Is the set of vectors (1, 1, 1), (1, 5, −1), (−10, 17, 0), (0, 1, 0) linearly independent? Hint: There is a one line solution to this, or you can write down the full linear system and solve it.

Exercise

5. Bases and Dimension for Vector Spaces

We’ll begin with the major definition of the section.

Definition. A collection of vectors ~v1, ~v2, . . . , ~vn are a basis for a vector space

V if they satisfy

(1) ~v1, ~v2, . . . , ~vn are linearly independent.

(2) V = span ~v1, ~v2, . . . , ~vn.

Fact: The number of vectors in any basis for a finite dimensional vector spaceis unique. We define dim(V ) = n where n is the number of vectors in a basis.

Theorem 7. If dim(V ) = n, and S = ~v1, ~v2, . . . , ~vn is a collection of n linearly

independent vectors, then S is a basis for V .

Fact: dim(Rn) = n. This is a deep, non-trivial result that depends on the previ-ous theorem! Why is this true? For R3, we have the basis S = (1, 1, 1), (1, 1, 0), (1, 0, 0)and there are 3 vectors here.

I’m guessing a problem very similar to your homework problems will show upwith a high probability. All of your problems asked you to find a basis for the kernelof a matrix. Know how to do this.


Practice 12-20 until you know how to do this!

5.1. Good Luck!


Lecture Notes of Math 320, Fall 2012

Bing Wang

1 September 5th, 2012; Differential Equation

An equation relating an unknown function and one or more of its derivatives is called a differential equation.

Notation: use prime or dot to denote derivatives.

Example 1.1. The movement of a free falling body:

x″ = −g.

Example 1.2. Harmonic Oscillator. When the body is displaced from its equilibrium position, x = 0, it experiences a restoring force, F, proportional to the displacement, x: F = −kx for some positive k. Hence

x″ + kx = 0.

Example 1.3. Newton's law of cooling. The time rate of change of the temperature T(t) of a body is proportional to the difference between T and the temperature A of the surrounding medium:

dT/dt = −k(T − A).

Example 1.4. The time rate of change of a population P(t) with constant birth and death rates is described by the following equation:

dP/dt = kP.

Terminology: order.

The order of a differential equation is the order of the highest derivative that appears in it.

Then check the order of the previous example and the following equation:

y′′′ + (y′)^5 + y = 0.


Answer: 3.

Note that all the differential equations we will study this semester contain only one independent variable. Such a differential equation is called an ordinary differential equation, or ODE for short. If the number of independent variables is more than one, then the differential equation is called a partial differential equation, abbreviated as PDE.

Return to Newton's cooling rule. Suppose the domain Ω is occupied by some material, which is put in an environment with temperature A. Then temperature is a function of both position and time. We denote the temperature function by T = T(~x, t). It satisfies the following equation

∂T/∂t = k (∂^2T/∂x1^2 + ∂^2T/∂x2^2 + ∂^2T/∂x3^2).

Moreover, T satisfies the boundary condition T(~x, t) = A whenever ~x is on the boundary of Ω. We will not discuss PDE in the future. As you know, ODE is strongly related to linear algebra. In some sense, PDE is also related to some advanced version of linear algebra.

Many equations have no solution at all. For example, (y′)^2 = −1 has no real solution. For equations that do have solutions, how do we compute them? If a solution exists, is it unique?

In many cases, the ODE can be solved formally: the solution can be written down explicitly. The methods of finding exact solutions include, but are not limited to, direct integration, separation of variables, solution of linear equations, elementary substitution methods, etc. However, for most ODEs, it is impossible to find the precise solution. In this case, we will use numerical methods to find approximate solutions.

The equations that can be solved includes:

1. Integrals: use the example of free falling body.

2. Separable Equations: use the example of Newton’s cooling rule.

3. Linear First Order equations: use the example of population model.

4. Substitution Methods and Exact Equations.

In this class and the next, we shall discuss simple examples of the first 3 types, in particular the Separable Equations and Linear First Order Equations.

At the end, discuss the solution of the harmonic oscillator. We don't know how to solve it now. That's the reason why we need to study linear algebra.

2 September 7th, 2012; Separable Equations and First Order Linear Equations

Separable ODE

Example 2.1. Solve the initial value problem

dy/dx = −8xy,   y(0) = 2.

It is easy to see that

dy/y = −8x dx  ⇒  log y = −4x^2 + C1  ⇒  y = e^{−4x^2 + C1} = C2 e^{−4x^2}.

Now decide the value of C2. Putting in x = 0, y = 2, we have

2 = C2 e^0  ⇒  C2 = 2.

Therefore, the final solution is y = 2e^{−4x^2}.

Analyze the general rule:

dy/dx = H(x, y) = g(x)h(y) = g(x)/f(y).

Then we have

∫ f(y) dy = ∫ g(x) dx.

Note that we obtain the relationship between y and x. However, generally this is given implicitly.

Then we return to solve the equation

dy/dx = (4 − 2x)/(3y^2 − 5),   y(1) = 3.

We can rewrite the equation in the form

(3y^2 − 5) dy = (4 − 2x) dx  ⇒  y^3 − 5y = 4x − x^2 + C.

Now we decide the value of C. Putting x = 1, y = 3 into the last equation, we obtain

12 = 3 + C  ⇒  C = 9.

Therefore, we obtain the solution y^3 − 5y + x^2 − 4x − 9 = 0.

First order linear ODE.

Example 2.2. Solve the initial value problem

dy/dx − y = (11/8) e^{−x/3},   y(0) = −1.

Multiply both sides by e^{−x}. Applying the product rule in differentiation, we see that

Left = e^{−x}(dy/dx − y) = e^{−x} dy/dx − e^{−x} y = e^{−x} dy/dx + (d e^{−x}/dx) y = d/dx (e^{−x} y),

Right = (11/8) e^{−4x/3} = d/dx (−(33/32) e^{−4x/3}).

Integration on both sides yields that

e^{−x} y = −(33/32) e^{−4x/3} + C  ⇒  y = −(33/32) e^{−x/3} + C e^{x}.

Now we decide the constant C. Putting in x = 0, y = −1, we have

−1 = −33/32 + C  ⇒  C = 1/32.

Therefore, the solution is y = (1/32)(e^x − 33 e^{−x/3}).

From this example, we can see the general method to solve the first order linear ODE, which has the form

dy/dx + P(x)y = Q(x).

Let ρ(x) = e^{∫P(x)dx}. Then we have

d/dx ( y e^{∫P(x)dx} ) = Q(x) e^{∫P(x)dx},

and integrating both sides,

y(x) = e^{−∫P(x)dx} ( ∫ Q(x) e^{∫P(x)dx} dx + C ).

Let's go back to examples to check this general method.

Example 2.3. Solve the initial value problem

x^2 dy/dx + xy = sin x,   y(1) = y0.

First, we can write this equation as a first order linear ODE:

dy/dx + y/x = (sin x)/x^2.

The integrating factor should be e^{∫(1/x)dx} = e^{log x} = x. Now the equation becomes

x dy/dx + y = (sin x)/x  ⇒  d/dx (xy) = (sin x)/x.

Denote Si(x) = ∫_0^x (sin t)/t dt. Then we see that

xy = Si(x) + C.

Putting x = 1, y = y0 into the formula, we have y0 = Si(1) + C. Therefore, we have

y = Si(x)/x + (y0 − Si(1))/x = (1/x)(Si(x) − Si(1) + y0).


3 September 10th, 2012; Exact Equations and Substitution Methods

Exact Differential Equation

Let’s see the example

x dy + y dx = 0  ⇒  d(xy) = 0  ⇒  xy = C.

In general, if we have F(x, y(x)) = C, then taking derivatives yields

dF(x, y(x)) = 0  ⇒  ∂F/∂x + (∂F/∂y)(∂y/∂x) = 0.

Example 3.1. Solve the initial value problem

2xy^3 dx + 3x^2y^2 dy = 0,   y(1) = 2.

Solution. The equation can be written as

d(x^2y^3) = 0  ⇒  x^2y^3 = C.

Putting in x = 1, y = 2, we obtain C = 8. So the solution is x^2y^3 = 8.

In general, suppose we have a differential equation in the form

M(x, y) dx + N(x, y) dy = 0;

can we use the previous method to solve it? To be precise, can we find a function F = F(x, y) such that dF = M dx + N dy? In other words, is the differential equation exact?

Actually, there is a necessary condition. If the differential equation is exact, then we have

M = ∂F/∂x,   N = ∂F/∂y   ⇒   ∂M/∂y = ∂N/∂x = ∂^2F/∂x∂y.

Now check that 8x dx + 9xy dy = 0 is not exact.

The amazing thing is that ∂M/∂y = ∂N/∂x is also a sufficient condition if the domain where the differential equation is solved is not too complicated.

Theorem 3.1. Suppose Ω is a rectangular domain of the plane. Then the differential equation

M(x, y) dx + N(x, y) dy = 0

is exact if and only if

∂M/∂y = ∂N/∂x

at each point of Ω.


Example 3.2. Solve the differential equation

(2x sin y + 3x^2y) dx + (x^3 + x^2 cos y + y^2) dy = 0.

Solution. Note that M = 2x sin y + 3x^2y, N = x^3 + x^2 cos y + y^2. Check:

∂M/∂y = 2x cos y + 3x^2 = ∂N/∂x.

This equation is exact by Theorem 3.1. So there is a function F = F(x, y) such that

dF = (∂F/∂x) dx + (∂F/∂y) dy = (2x sin y + 3x^2y) dx + (x^3 + x^2 cos y + y^2) dy = 0.

It follows that

∂F/∂x = 2x sin y + 3x^2y  ⇒  F = x^2 sin y + x^3y + f(y);
∂F/∂y = x^3 + x^2 cos y + y^2  ⇒  F = x^3y + x^2 sin y + y^3/3 + g(x).

This forces f(y) = y^3/3 + C and g(x) ≡ C. Therefore, the solution is

x^3y + x^2 sin y + y^3/3 + C = 0,

where C is an arbitrary constant.

Substitution Methods

Consider the equation dy/dx = (x + y)^2. It is not separable. Let u = x + y, where u is the new variable; then we have

du/dx = 1 + dy/dx = 1 + u^2.

This new equation is separable and can be solved:

du/(1 + u^2) = dx  ⇒  tan^{−1} u = ∫ du/(1 + u^2) = ∫ dx = x + C  ⇒  u = tan(x + C).

Therefore, we have the solution of the original equation

y = u − x = tan(x + C) − x.

There are numerous substitutions, each based on a smart observation. In this class, we focus on two basic substitutions: Homogeneous Equations and Bernoulli Equations.

Homogeneous Equation


Consider the equation dy/dx = (y/x)^2 + 1. Let u = y/x; then y = ux ⇒ dy/dx = u + x du/dx. Therefore, we have

u + x du/dx = u^2 + 1  ⇒  x du/dx = u^2 − u + 1  ⇒  du/(u^2 − u + 1) = dx/x.

So we arrive at a separable equation which can be solved.

In general, a differential equation in the form dy/dx = F(y/x) is called a homogeneous equation. For a homogeneous equation, the standard substitution is to let u = y/x. Then we have

x du/dx = F(u) − u,

a separable equation.

Example 3.3. Solve the initial value problem

x dy/dx = y + √(x^2 − y^2),   y(1) = 0.

Solution. Dividing both sides by x gives us

dy/dx = y/x + √(1 − (y/x)^2),

which is a homogeneous equation. Letting u = y/x, we have

x du/dx + u = u + √(1 − u^2)  ⇒  du/√(1 − u^2) = dx/x  ⇒  sin^{−1} u = log x + C.

Putting in x = 1, y = 0, which is the same as x = 1, u = 0, we obtain C = 0. Therefore, the solution is y/x = u = sin(log x), i.e.,

y = x sin(log x).

Bernoulli Equations

A differential equation in the form

dy/dx + P(x)y = Q(x)y^n     (1)

is called a Bernoulli Equation, where n is a constant satisfying n ≠ 0, 1.


4 September 12th, 2012; Slope Fields and Solution Curves

Continue the discussion of Bernoulli equation.

The Bernoulli equation can be written as

y^{−n} dy/dx + P(x)y^{1−n} = Q(x).

Let u = y^{1−n}; then du/dx = (1 − n)y^{−n} dy/dx. Consequently, we have

(1/(1 − n)) du/dx + P(x)u = Q(x)  ⇒  du/dx + (1 − n)P(x)u = (1 − n)Q(x).

The last equation is a first order linear equation, which can be solved.

Example 4.1. Solve the equation

x dy/dx + 6y = 3xy^2.

Solution. Rewrite the equation as

dy/dx + 6y/x = 3y^2,

which is a Bernoulli equation with n = 2. Multiplying both sides by y^{−2}, we have

y^{−2} dy/dx + 6/(xy) = 3  ⇒  −du/dx + (6/x)u = 3  ⇒  du/dx − (6/x)u = −3,

where u = y^{−1}. For this linear first order equation, the integrating factor is e^{∫−6/x dx} = x^{−6}. Multiplying both sides of the last equation by this factor implies

d/dx (x^{−6}u) = −3x^{−6}  ⇒  x^{−6}u = (3/5)x^{−5} + C  ⇒  u = (3/5)x + Cx^6.

The solution is

y = 1 / ((3/5)x + Cx^6).

The meaning of slope field:

Given the differential equation dy/dx = f(x, y), there is a simple geometric way to think about its solutions. At each point (x, y), the value f(x, y) determines a slope. Therefore, the solution curve of the differential equation dy/dx = f(x, y) is a curve in the xy-plane whose tangent line at each point (x, y) has slope f(x, y).

This geometric viewpoint suggests a graphical method for constructing approximate solutions of the differential equation dy/dx = f(x, y).

Example 4.2. Construct a slope field for the differential equation dy/dx = x − y and use it to sketch an approximate solution curve that passes through the point (−4, 4). Use this solution curve to estimate y(0).

It follows from the table and picture on page 21 of the textbook. So y(0) ≈ −0.9.

Does a solution of a differential equation always exist?

Consider the example dy/dx = −x^{−2}. The initial value problem y(0) = 1 has no solution at all.

Is the solution of a differential equation unique?

Consider the example dy/dx = y^{1/3}. There are at least two solution curves passing through (0, 0):

y ≡ 0,   y = ((2/3)x)^{3/2}.

However, we do have an existence and uniqueness theorem for differential equations.

Theorem 4.1. Suppose f and ∂f/∂y are continuous on some rectangle R in the xy-plane that contains the point (a, b) in its interior. Then, for some open interval I containing the point a, the initial value problem

dy/dx = f(x, y),   y(a) = b

has one and only one solution that is defined on the interval I.

In this theorem, we don’t know how large I is, we only know it is some interval containing a.Return to previous examples to illustrate this theorem.

Mention the idea of the proof: transform the differential equation into an integral equation. Give a sketchy proof if time permits.

5 September 14th, 2012; Some Word Problems and Population Mod-els

Radioactive decay.

Let N(t) be the number of atoms of a certain radioactive isotope at time t. It has been observed that N obeys the following differential equation

dN/dt = −kN

for some positive k, which depends on the particular radioactive isotope. Clearly, we have N = N0 e^{−kt}.

The half-life is the time needed for half of the material to decay, which we denote by τ. Then

(1/2) N0 = N0 e^{−kτ}  ⇒  kτ = log 2  ⇒  τ = (log 2)/k.

For C14, the half-life is τ = (log 2)/0.0001216 ≈ 5700 years.

Example 5.1. A specimen of charcoal turns out to contain 63% as much C14 as a sample of present-day charcoal of equal mass. What is the age of the sample?

We have

0.63 N0 = N0 e^{−kt}  ⇒  t = −(log 0.63)/0.0001216 ≈ 3800.

Example 5.2. A tank contains 1000 liters of a solution consisting of 100 kg of salt dissolved in water. Pure water is pumped into the tank at the rate of 5 L/s, and the mixture, kept uniform by stirring, is pumped out at the same rate. How long will it be until only 10 kg of salt remains in the tank?

First, we set up the differential equation with initial conditions. Letting y be the amount (in kg) of salt in the tank, we see that

y′ = −(5/1000) y,   y(0) = 100.

This equation can be solved easily:

y = 100 e^{−0.005t}.

Setting y = 10, we have

10 = 100 e^{−0.005t}  ⇒  −0.005t = log(1/10)  ⇒  t = 200 log 10 ≈ 461 (s).

Note that in the simplest model, the population growth satisfies an equation which is the same as that of radioactive decay:

dP/dt = αP,

where P = P(t) is the population of the world at time t. The solution is P(t) = P0 e^{αt}. This solution is not very satisfactory since

lim_{t→∞} P(t) = lim_{t→∞} P0 e^{αt} = ∞,

which is impossible since our world is limited. It is reasonable that the world population should have an upper bound.

The following differential equation describes a population which cannot tend to infinity:

dP/dt = kP(M − P),     (2)

where both k and M are positive numbers. Clearly, if P > M, then the population is decreasing. Equation (2) is called the Logistic equation.


Note that equation (2) is separable. Suppose P0 = M; then P ≡ M is a trivial solution. So we assume that P0 ≠ M. We have

dP/(P(M − P)) = k dt  ⇒  (1/P + 1/(M − P)) dP = kM dt,
log|P| − log|M − P| = kMt + C,
log|P/(M − P)| = kMt + C,
P/(M − P) = ±C1 e^{kMt}.

Plugging in t = 0, P = P0, we have

P/(M − P) = (P0/(M − P0)) e^{kMt}  ⇒  P (1 + (P0/(M − P0)) e^{kMt}) = (M P0/(M − P0)) e^{kMt}  ⇒  P = M P0 e^{kMt} / (M + P0(e^{kMt} − 1)).

Clearly, letting t → ∞, we obtain lim_{t→∞} P(t) = M. This M is called the carrying capacity of the environment.

Quickly review Chapter 1 if time permitted.

6 September 17th, 2012; Population Models

Let’s use the following example to recall the equation we learned in previous section.

Example 6.1. Suppose that in 1885 the population of a certain country was 50 million and was growing at the rate of 750,000 people per year. Suppose also that in 1940 its population was 100 million and was then growing at the rate of 1 million per year. Assume that this population satisfies the logistic equation. Predict the population for the year 2000.

The logistic equation is

dP/dt = kP(M − P),

with solution

P(t) = M / (1 + (M/P0 − 1) e^{−kMt}).     (3)

The population can be calculated whenever P0, k and M are known.

The given conditions imply

0.75 = 50k(M − 50),   1.00 = 100k(M − 100)   ⇒   M = 200,   k = 1/10000.

We set t = 0 at year 1940 (Why don't we choose t = 0 at 1885?), so P0 = 100. Therefore, at 2000, when t = 60, we have

P(60) = 200 / (1 + (2 − 1)e^{−0.0001·200·60}) ≈ 153.7.
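A one-line check of this arithmetic (a sketch in Python):

import math

P0, M, k, t = 100.0, 200.0, 1e-4, 60.0
print(M / (1 + (M/P0 - 1) * math.exp(-k*M*t)))   # about 153.7 (million)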

The situations where the Logistic equation applies: limited resources, competition, joint proportion (spread of disease). The joint proportion interpretation needs more attention. The spreading speed of a disease is proportional to the chance of encounters between healthy people and infected people. Here is an important observation: the population growth is proportional to the encounters. Using this observation, we can deal with a different population model.

Let P be the population of some unsophisticated animals. Suppose the death rate is a constant and the birth rate is proportional to the chance of encounters between males and females. In view of this assumption, we have the following equation

dP/dt = −δP + k0 (P/2)·(P/2) = −δP + kP^2 = kP(P − δ/k) = kP(P − M),

where M = δ/k.

Solve this equation:

dP/(P(P − M)) = k dt,
(1/(P − M) − 1/P) dP = kM dt  ⇒  log(|P − M|/P) = kMt + C0,
⇒  |P − M|/P = e^{kMt + C0}  ⇒  (P − M)/P = ±C e^{kMt}.

Plugging in the initial condition, we have

1 − M/P = (1 − M/P0) e^{kMt},

which in turn implies the general solution

P = M / (1 + (M/P0 − 1) e^{kMt}).     (4)

It is interesting to compare (4) with (3).

Example 6.2. Consider an animal population P(t) that is modeled by the equation

dP/dt = 0.0004 P(P − 150).

Find P(t) if (a) P(0) = 200; (b) P(0) = 100.

In case (a), we have

P(t) = 150 / (1 − 0.25 e^{0.06t}).

As t → (log 4)/0.06 from the left, we see that P(t) blows up. This time t = (log 4)/0.06 is the doomsday, when the population explosion happens.

In case (b), we have

P(t) = 150 / (1 + 0.5 e^{0.06t}).

The solution exists for all time and lim_{t→∞} P(t) = 0.

Return to the general setting. According to (4), we see that if P0 < M, then the solution exists for all time and tends to zero as time goes to infinity. If P0 > M, then the solution exists only until the doomsday time T = log(P0/(P0 − M)) / (kM).

Draw the typical solution curves for both equations P′ = kP(M − P) (Logistic equation) andP′ = kP(P − M) (Explosion-Extinction Equation). Explain the stability of the solution P ≡ M.For the first equation, P ≡ M is a stable solution. However, for the second equation, P ≡ M is nota stable solution.

It is clear that one can predict the stability of solutions by writing down every solution ex-plicitly. However, there are many equations whose solutions cannot be written down explicitly.However, the stability of solutions can also be analyzed in many cases.

7 September 19th, 2012;Equilibrium Solutions and Stability

We have seen the basic idea of stability of solutions last time. In this class, we shall define thestability more rigorously.

The differential equation of the form

dxdt

= f (x) (5)

is called an autonomous first-order differential equation. Note that the independent variable t doesnot appear explicitly.

We say x = c is a critical point of (5) if f (c) = 0. Clearly, x ≡ c is then a constant solution of(5). It is also called an equilibrium solution. Draw the phase diagram for each case.

Example 7.1. Calculate the critical points of the following differential equations.

dx/dt = x(x − 1):   x = 0, x = 1.
dx/dt = x(1 − x):   x = 0, x = 1.
dx/dt = x^3 + 1:    x = −1.
dx/dt = e^x − 1:    x = 0.

13

Page 147: MATH320

Definition 7.1. A critical point x = c is said to be stable provided that, if the initial value x0 issufficiently close to c, then x(t) remains close to c for all t > 0. More precisely, the critical pointc is stable if, for each ε > 0, there exists δ > 0 such that |x(t) − c| < ε for every t > 0 whenever|x0 − c| < δ.

The critical point x = c is unstable if it is not stable.

Come back to the Logistic equation dx/dt = x(1 − x) and the Explosion-Extinction equation dx/dt = x(x − 1) to check the concept of stability. Draw pictures of typical solution curves.

Phase diagram: A diagram which includes x-axis and the moving direction of x according tothe position of x.

Check stability on the Phase diagram: A point is stable if it is the “converging” point in thephase diagram. The point is unstable if it is not a “converging” point.

Example 7.2. Use the phase diagram method to analyze the stability of the critical points of theequations in Example 7.1.

Harvesting a logistic equation.

Example 7.3. Solve the differential equation

dPdt

= P(4 − P) − h

whenever h = 3, 4, 5. Discuss the stability of critical points in each case.

If h = 3, critical points are P = 1 and P = 3. We have

dPP2 − 4P + 3

=dP

(P − 1)(P − 3)= −dt,⇒

dP2

(1

P − 3−

1P − 1

)= −dt,

⇒ log∣∣∣∣∣P − 3P − 1

∣∣∣∣∣ = −2dt,⇒P − 3P − 1

=P0 − 3P0 − 1

e−2t.

⇒ P(1 −

P0 − 3P0 − 1

e−2t)

= 3 −P0 − 3P0 − 1

e−2t,

⇒ P =3 − P0−3

P0−1 e−2t

1 − P0−3P0−1 e−2t

.

The behavior of P depends on the initial value P0.

(a). P0 > 3. Note that 0 < P0−3P0−1 < 1. Therefore, 1 − P0−3

P0−1 e−2t > 0 always. So P(t) always existsand limt→∞ P(t) = 3.

(b). 1 < P0 < 3. We have P0−3P0−1 < 0. Clearly 1 − P0−3

P0−1 e−2t > 0 always and P(t) always existslimt→∞ P(t) = 3.

(c). P0 < 1. Let T = 12 log P0−3

P0−1 . Then 1 − P0−3P0−1 e−2t < 0 whenever t < T . So the solution exists

whenever t < T , and limt→T− P(t) = −∞.

14

Page 148: MATH320

If h = 4, critical points are P = 2. We have

dPdt

= P(4 − P) − 4 = −(P − 2)2,⇒ −dP

(P − 2)2 = dt,⇒ d(P − 2)−1 = dt,⇒1

P − 2−

1P0 − 2

= t,

⇒ P = 2 +1

t + 1P0−2

= 2 +P0 − 2

1 + (P0 − 2)t.

Therefore, if P0 > 2, then 1 + (P0 − 2)t > 1 > 0. This means the solution P(t) exists for everand limt→∞ P(t) = 2. However, if P0 < 2, then the denomenator 1 + (P0 − 2)t approaches 0 as tapproaches T = 1

2−P0. Furthermore, limt→T− P(t) = −∞.

If h = 5, no critical points. we have

dPdt

= −P2 + 4P − 5 = −(P − 2)2 − 1,⇒dP

(P − 2)2 + 1= −dt,⇒ tan−1(P − 2) = −t + C,

⇒ P − 2 = tan(C − t),⇒ P = 2 + tan(C − t),

where C = tan−1(P0 − 2).

Draw the solution curve for each case and mention the effect of parameters to the solutionsof differential equations. Draw the phase diagram for each case and analyze the stability of eachcritical point.

8 September 21th, 2012; Acceleration-Velocity Models

1. The simplest model,dvdt

= −g = −9.8m/s2.

2. There exists resistance.

(a). Resistance is proportional to velocity.

dvdt

= −ρv − g.

This is a separable first-order differential equation. It can be solved as

dvv +

= −ρdt,⇒ logv +

v0 +gρ

= −ρt,⇒ v(t) =

(v0 +

)e−ρt −

gρ.

As t → ∞, there is a limit finite speed, gρ , which is the terminal speed. Integrate the veloclity

function, we obtain the position function

y(t) = y0 +v0ρ + gρ2 (1 − e−ρt) −

t.

(b). Resistance is proportional to square of velocity.

15

Page 149: MATH320

dvdt

= −g − ρv|v|.

If v > 0, or the body is moving upward, we have

dvdt

= −g(1 +

ρ

gv2

),

Using the substitution such that u2 =ρg v2, or u = v

√ρg . Then du =

√ρg dv. It follows that

√gρ

dudt

= −g(1 + u2),⇒du

1 + u2 = −√ρgdt,⇒ tan−1 u − tan−1 u0 = −

√ρgt,⇒ u = tan

(tan−1 u0 −

√ρgt

).

Recall that v =√

gρu. Then

v =

√gρ

tan(tan−1

(v0

√ρ

g

)−√ρgt

)=

√gρ

tan(C1 −

√ρgt

),

where we denote tan−1(v0

√ρg

)by constant C1 for convenience. Note that

∫tan udu =

∫sin ucos u

du =

∫d cos ucos u

= − log | cos u| + C. Integration of the velocity function gives us

y(t) = y0 +

∫ s

0

√gρ

tan(C1 −

√ρgs

)ds = y0 −

∫ s

0tan(C1 −

√ρgs)d(C1 −

√ρgs)

= y0 +1ρ

log

∣∣∣∣∣∣cos(C1 −√ρgt)

cos C1

∣∣∣∣∣∣ .If v < 0, or the body is moving downward. In this case, we have

dvdt

= −g + ρv2,

v(t) =

√gρ

tanh(C2 −√ρgt), C2 = tanh−1

(√ρ

gv0

).

y(t) = y0 −1ρ

log

∣∣∣∣∣∣cosh(C2 −√ρgt)

cosh C2

∣∣∣∣∣∣ .3. Escape Velocity(if time permitted).

The movement of a body in the gravitational field of a planet is described by the following sim-ple model. We assume the direction of the velocity is parallel to the direction of the gravitationalforce. So

dvdt

= −GMr2 .

16

Page 150: MATH320

Note that dr = vdt. We have

vdvdr

= −GMr2 ,⇒

dv2

dr= −

2GMr2 ,⇒ v2(r) − v2(r0) = 2GM(

1r−

1r0

).

Since v(r) > 0 always according to our requirement, we have

v2(r0) − v2(r) = 2GM(1r0−

1r

),⇒ v2(r0) ≥ limr→∞

2GM(1r−

1r0

) =2GM

r0,⇒ v(r0) ≥

√2GM

r0.

The number√

2GMr0

is the escape velocity.

9 September 24th, 2012; Euler’s Method

A general differential equation cannot be solved explicitly. Therefore, numerical method are nec-essary for the solution of general differential equation.

How do you drive? You follow the instruction of the road sign. Actually, to draw a solutioncurve is similar to driving. We adjust the drawing direction according to the differential equation.

Suppose we have the equation dydt = f (y, t). Starting from the point (t0, y0), we have a solution

curve y = y(t). How can we calculate the value of y(t0 + 1)? The simplest way is to set the“driving” direction at time t0, then “drive” directly to the time t0 + 1 without adjusting direction.In this case, we approximate y(t0 + 1) by the value y0 + f (t0, y0) × 1 = y0 + f (t0, y0). Of course,this approximation may be far away from the real precise value of y(t0 + 1). We divide the time 1into N-pieces of equal length, with each length 1

N . So

t0 = t0, t1 = t0 +1N, · · · , tN = t0 + 1.

Now we will drive more carefully, at each time tk, we adjust the “driving” direction according tothe “road sign”, or the direction given by the differential equation. So we have

y1 = y0 +1N× f (t0, y0),

y2 = y1 +1N× f (t1, y1),

· · ·

yN = yN−1 +1N× f (tN−1, yN−1).

Then yN is a better approximation of y(t0+1). The larger the N is, the smaller the error |yN−y(t0+1)|is.

If we plot our driving record on the y− t plane, we obtain a sequence of straight line segments,which are approximations of the real solution curves. The method described above is called Eu-ler’s Method, the number 1

N is called the step size. Generally, the step size could be any positivenumber h and the algorithm of Euler’s method can be described as follows.

17

Page 151: MATH320

Given the initial value problem

dydt

= f (t, y), y(t0) = y0,

Euler’s method with step size h consists of applying the iterative formula

yn+1 = yn + h · f (tn, yn), (n ≥ 0)

to calculate successive approximations y1, y2, y3, · · · to the true values y(t1), y(t2), · · · of the solu-tion y = y(x) at the points t1, t2, t3, · · · respectively.

Example 9.1. Apply Euler’s method to approximate the solution of the initial value problem

dydt

= t + y, y(0) = 5,

(a) first with step size h = 1 on the interval [0, 3] to approximate y(3). (b) then with step sizeh = 0.2 on the interval [0, 1] to approximate y(1).

(a). We have

t0 = 0, y0 = 5;

t1 = 1, y1 = 5 + 1 ∗ (0 + 5) = 10;

t2 = 2, y2 = 10 + 1 ∗ (1 + 10) = 21;

t3 = 3, y3 = 21 + 1 ∗ (2 + 21) = 44.

Therefore, y(3) is approximated by 44.

(b). We have

t0 = 0, y0 = 5;

t1 = 0.2, y1 = 5 + 0.2 ∗ (0 + 5) = 6;

t2 = 0.4, y2 = 6 + 0.2 ∗ (0.2 + 6) = 7.24;

t3 = 0.6, y3 = 7.24 + 0.2 ∗ (0.4 + 7.24) = 7.24 + 1.528 = 8.768;

t4 = 0.8, y4 = 8.768 + 0.2 ∗ (0.6 + 8.768) = 10.6416;

t5 = 1.0, y5 = 10.6416 + 0.2 ∗ (0.8 + 10.6416) = 12.92992.

So y(1) is approximated by 12.92992.

The error for the Euler’s method: local error, accumulative error and round off error in thecomputation.

The disadvantage of Euler’s method: there exists example such that the appximation values donot converge as h, the step size, tends to zero.

Example 9.2. Solve the intial value problem

dydt

= 1 + y2, y(0) = 0.

What is limt→ π2− y(t)?

18

Page 152: MATH320

It is not hard to solve this equation to obtain that

y(t) = tan t.

Therefore, we have limt→ π2− = ∞. However, we can also use Euler’s method to approximate the

solution curve. Let h = π2N for a very large N. One can see that y(π2 ) can always be approximated

by

yN = yN−1 +π

2N· (1 + y2

N−1).

No matter how large N is, the value of yN is always finite. So the difference between yN andlim

t→ π2−

y(t) is infinite.

10 September 26th, 2012; More on Euler Method

The estimate of local error.

Using h as step size, we apply Euler’s method to the differential equation

dydt

= f (t, y), y(t0) = y0.

In the first step, we want to find the difference between the approximation value y1 and the precisevalue y(t1).

Clearly, we have

y1 = y0 + h f (t0, y0),

Recall the mean value theorem of calculus. It states that given an arc between two endpoints, thereis at least one point at which the tangent to the arc is parallel to the secant through its endpoints.Applying mean value theorem to the solution curve, we have

y(t1) = y0 + hy′(t0 + ξ), 0 ≤ ξ ≤ h.

So the error term is

|yactual(t1) − yapproximation(t1)| = |y(t1) − y1| = h|y′(t0 + ξ) − f (t0, y0)| = h| f (t0 + ξ, y(t0 + ξ)) − f (t0, y0)|

Note that

f (t0 + ξ, y(t0 + ξ)) − f (t0, y0) = f (t0 + ξs, y(t0 + ξs))|s=1s=0

=

∫ 1

0

dds

f (t0 + ξs, y(t0 + ξs))ds

=

∫ 1

0

(∂ f∂t· ξ +

∂ f∂y·∂y∂t· ξ

)ds

=

∫ 1

0

(∂ f∂t· ξ +

∂ f∂y· f · ξ

)ds

= ξ

∫ 1

0

(∂ f∂t

+∂ f∂y

f)

ds.

19

Page 153: MATH320

If we know both the actual solution and the aprroximated solution locates in the compact rect-angular region [a, b] × [c, d], we can define

C = sup[a,b]×[c,d]

|∂ f∂t| + |

∂ f∂y|| f |.

Then we have

Local error ≤ Cξh ≤ Ch2.

So the error in the first step is bounded by Ch2. If we want to approximate y(t0 + 1) by Euler’smethod with step size h, we need 1

h steps. The total error is then expected to by bounded byCh2 × 1

h ≤ Ch, i.e.,

|y(t0 + 1) − yN | ≤ Ch, (6)

where N = 1h . Because of (6), we say the error of Euler’s method is of order h.

In practice, we need more precise method, whose error is of order higher than h. We nowintrouce one of them: the improved Euler Method.

Given the intial value problem

dydt

= f (t, y), y(t0) = y0,

the improved Euler method with step size h constists in applying the iterative formulas

un+1 = yn + h f (tn, yn),

yn+1 = yn + h ·12

( f (tn, yn) + f (tn+1, un+1)).

Draw a picture to show this.

Example 10.1. Apply Improved Euler’s method to approximate the solution of the initial valueproblem

dydt

= t + y, y(0) = 5,

with step size h = 1 on the interval [0, 3].

We have

t0 = 0, y0 = 5;

t1 = 1, u1 = y0 + 1 · (t0 + y0) = 10; y1 = y0 +12· 1 · ((0 + 5) + (1 + 10)) = 5 + 8 = 13;

t2 = 2, u2 = y1 + 1 · (t1 + y1) = 27; y2 = 13 +12· 1 · ((1 + 13) + (2 + 27)) = 34.5;

t3 = 3, u3 = y2 + 1 · (t2 + y2) = 71; y3 = 34.5 +12· 1 · ((2 + 34.5) + (3 + 71)) = 34.5 + 55.25 = 89.75.

20

Page 154: MATH320

The exact solution can be solved as

dydt− y = t,

ddt

(e−ty) = e−tt,⇒ e−ty − 5 =

∫ t

0e−ssds = −te−t + 1 − e−t,⇒ y = 6et − t − 1.

Therefore, y(3) = 6e3 − 3 − 1 = 116.51. The appxoximation given by Euler’s method is 44.Clearly, the method of improved Euler’s method is more accurate. In general, the error term of theimproved Euler’s method is bounded by Ch2. In other words, the error of the improved Euler’smethod is of order h2.

11 September 28th, 2012; Linear Systems

Now we move to the second big topic in this course: Linear Algebra.

Linear Algebra is the branch of mathematics concerning vector spaces, as well as linear map-pings between such spaces. It originates from the solution linear equations in several unknowns.

A linear equation of two variables is an equation in the form

ax + by = c.

Note that the solution set of the above equation is a straight line. This is the reason why thisequation is called linear. A linear equation of three variables is an equation in the form

ax + by + cz = d.

Put several linear equations together, we obtain a linear system. The solution of a linear systemis the intersection of the solution set of each equation in the system.

Example 11.1. Solve the linear system of two variables.

x + y = 1,

2x + y = 3.

Example 11.2. Solve the linear system of two variables.

x + y = 1,

2x + 2y = 3.

Example 11.3. Solve the linear system of two variables.

x + y = 1,

2x + 2y = 2.

21

Page 155: MATH320

These three systems correspond to the following three pictures: intersection of two lines, par-rallel two lines, coincidence to two lines.

The geomtric picture is helpful in solving system of linear equations. However, if the unknownvariables’ number is very big, then the geometric picture becomes not very clear. So we need tostudy the solution in a more abstract way.

Let’s see how do we solve the following equation.

Example 11.4. Solve the lienar system

x + 2y + z = 4,

3x + 8y + 7z = 20,

2x + 7y + 9z = 23.

We repeatedly use several operations to simplify the system. These operations are called theelementary operations:

1. Multiply one equation by a nonzero constant.

2. Interchange two equations.

3. Add a constant multiple of one equation to another equation.

Conceptually, the solution of the given linear system is very simple. We repeatedly apply the 3elementary operations until the system arrives the form of

x = · · · ,

y = · · · ,

z = · · · .

It is not surprising that the solution process of the previous example is the same as the solutionprocess of the following system.

u + 2v + w = 4,

3u + 8v + 7w = 20,

2u + 7v + 9w = 23.

So the name of the variables are not important. The important things are the coefficient numbers.For simplicity, we can store these numbers in a rectangle by order of the variables. This leads usto the coefficient matrix 1 2 1

3 8 72 7 9

and augmented coefficient matrix 1 2 1 4

3 8 7 202 7 9 23

22

Page 156: MATH320

of the given linear system.

For augmented matrix, we obviously have 3 elementary row operations. Keep in mind thatthere is a one-to-one correspondence between linear system and its augemented matrix. So the3 elementary row operations is the basis for solving a linear system. A solution can be read outdirectly if the augmented matrix reaches a very simple form. Return the previous example to showthis.

12 October 1st, 2012; Gaussian Elimination and Reduced Row-echelonMatrices, I

In this class and the next, we focus on this problem: how to solve a linear system by the matrixmethod? As we know, the basic tools we can use are the three elementary row operations. Usingthese operations repeatedly, we can simplify a given augmented matrix to a “very simply” form,this very simple form is called the “reduced echelon form”, or “reduced echelon matrix”.

Definition 12.1. A matrix is called a reduced row-echelon matrix, or a reduced row-echelon form,if the following properties are satisfied.

• If a row has nonzero entries, then the first nonzero entry is 1, which is called the leading 1of this row.

• If a column contains a leading 1, then all other entries in the column are 0.

• If a row contains a leading 1, then each row above contains a leading 1 on the further left.

Let’s justify if the given matrices are reduced echelon forms.(1 0 00 2 0

)No! The second row has no leading 1.1 1 1

0 1 00 0 1

No! On the column of the second row leading 1, there exists another nonzero term

(0 1 21 0 0

)No! The first row’s leading 1 is on the right side of the second row’s leading 1.1 0 0 1

0 1 0 20 0 1 3

Yes!

Why reduced-row echelon form is important. Because it is simplest and one can read the solutionout directly from reduced-row echelon form. For example, the linear equation corresponding tothe last matrix above is

x = 1,y = 2,z = 3,

23

Page 157: MATH320

which is a linear system with an obvious solution.

So our task to solve a linear system is to transform the augmented matrix to its reduced row-echelon form. Why can we do this? We are guanranteed by the following theorem.

Theorem 12.1. Every matrix is row equivalent to one and only one reduced row-echelon matrix.Here, we say two matrices are row equivalent if one can be obtained from the other by finite stepsof elementary row operations.

Now we fix notations for the elementary row operations:

Multiply row p by constant c cRp,

Interchange row p and row q S WAP(Rp,Rq),

Add c times row p to row q cRp + Rq.

Using these notations, we study the following examples.

Example 12.1. Solve the linear system

2x + 8y + 4z = 2,

2x + 5y + z = 5,

4x + 10y − z = 1.

The augmented matrix is 2 8 4 22 5 1 54 10 −1 1

It can be transformed into a reduced row-echelon form by the following process2 8 4 2

2 5 1 54 10 −1 1

12 R1−→

1 4 2 12 5 1 54 10 −1 1

−2R1+R2−→

1 4 2 10 −3 −3 34 10 −1 1

−4R1+R3−→

1 4 2 10 −3 −3 30 −6 −9 −3

− 1

3 R2−→

1 4 2 10 1 1 −10 −6 −9 −3

6R2+R3−→

1 4 2 10 1 1 −10 0 −3 −9

− 13 R3−→

1 4 2 10 1 1 −10 0 1 3

−R3+R2−→

1 4 2 10 1 0 −40 0 1 3

−2R3+R1−→

1 4 0 −50 1 0 −40 0 1 3

−4R2+R1−→

1 0 0 110 1 0 −40 0 1 3

.From the reduced row-echelon form, we can read the solution out.

x = 11,

y = −4,

z = 3.

24

Page 158: MATH320

13 October 3rd, 2012; Gaussian Elimination and Reduced Row-EchelonMatrices, II

The process to obtain the reduced row-echelon form is called the Gauss-Jordan Elimination.

1. Use the elementary row operations to transform the matrix into a matrix of shape uppertriangular, and each nonzero row has a leading 1.

2. Use back substitution to obtain reduced row-echelon form.

Example 13.1. Solve the linear system

x + y + z = −1,

2x + 2y + 5z = −8,

4x + 6y + 8z = −14.

The augmented matrix is 1 1 1 −12 2 5 −84 6 8 −14

One easily arrives 1 1 1 −1

0 0 3 −60 2 4 −10

It seems we have trouble in the second row and second column: 0 can not be made into a leading1. This difficulty can be solved by swapping the second and third row.1 1 1 −1

0 2 4 −100 0 3 −6

12 R2−→

1 1 1 −10 1 2 −50 0 3 −6

13 R3−→

1 1 1 −10 1 2 −50 0 1 −2

So we have arrived a matrix of upper triangular shape, each nonzero row start from a leading 1.Now we are able to use back-substitution to clear all the columns with leading one. This processis done from bottom to the top, from right to the top. This is the reason why do we call it “back”substitution.1 1 1 −1

0 1 2 −50 0 1 −2

−2R3+R2−→

1 1 1 −10 1 0 −10 0 1 −2

−R3+R1−→

1 1 0 10 1 0 −10 0 1 −2

−R2+R1−→

1 0 0 00 1 0 −10 0 1 −2

See more complicated examples.

25

Page 159: MATH320

Example 13.2. Solve the linear system

x3 − x4 − x5 = 4,

2x1 + 4x2 + 2x3 + 4x4 + 2x5 = 4,

2x1 + 4x2 + 3x3 + 3x4 + 3x5 = 4,

3x1 + 6x2 + 6x3 + 3x4 + 6x5 = 6.

Example 13.3. Solve the linear system

x3 + x4 = 3,

x1 + x2 = 1,

x2 + x3 = 2,

x1 + 3x2 + 3x3 + x4 = 10.

14 October 5th, 2012; Matrix Operations

A linear system is called consistent if it has at least one solutions. Otherwise, it is called inconsis-tent. Check the consistency of previous examples.

How to justify if a linear system is consistent? Suppose A is the reduced row-echolon of someaugmented matrix of a linear system. Then the linear system is consistent if and only if it does notcontain a row of (0, 0, · · · , 0, 1).

If a linear system is consistent, there are two possibilities.

1. Each variable column has a leading 1. There is a unique solution.

2. Some variable column has no leading 1. Such a variable is called a free variable, which canbe arbitrary number. So in this case the linear system has infinitely many solutions.

Then apply the above discussion to the study of Homogeneous linear system.

Introduce the matrix operation. Why do we needed it? One of the important reason is for thesimplicity of notation.

Start from inner product. The innner product of two vectors ~v and ~wcan be written as theproduct of a row matrix ~vτ and a column matrix ~w.

Then the simpest linear system can be written as a matrix product. For example,

x + 2y + 3z = 1

can be written as A~v = 1 where

A = (1, 2, 3), ~v =

xyz

.26

Page 160: MATH320

Now we increase the number of equations. If we define correct relationship, the form A~v = ~b willrepresent the whole system. For example, the linear system

x + 2y + 3z = 1,

2x + y + 3z = 2,

x + 3y − 5z = 3,

10x + y − 9z = 4.

can be written as A~v = ~b where

A =

1 2 32 1 31 3 −510 1 −9

, ~v =

xyz

~b =

1234

Then let’s give the formal definition of matrix product. Note that the matrix product is defined forAB if and only if the column number of A equals the row number of B. We say the matrix A is an×m matrix if it has n rows and m columns. Therefore, if A is a n×m matrix, B is a p× q matrix,then AB is well defined if and only if m = p. Note that AB is defined does not mean BA is defined.

Then show the matrix sum. This is very easy, two matrices can be summed if and only if thereare of the same dimension, or the same shape.

Then introduce the scalar product of a matrix and a number.

There is a special type of matrices whose row number equals the column number. Such matricesare called square matrices.

Give examples of A, B such that AB , BA.

Give examples of A, B such that AB = 0, hower A , 0, B , 0.

Actually, there is one example satisfying the previous two requirements.

A =

(0 10 1

), B =

(1 10 0

)Then we have

AB =

(0 00 0

), BA =

(0 20 0

).

27

Page 161: MATH320

15 October 8th, 2012; Matrix Operations, II

16 October 10th, 2012; Midterm Exam, I

17 October 12th, 2012; Review for Midterm Exam

18 October 15th, 2012; Inverse matrices

19 October 17th, 2012; Determinants

The meaning of determinant: the volume with orientation.

For the matrix (a bc d

)The first column vector gives us a line t : 7→ (at, ct), or

ay − cx = 0.

The distance from this line to the point (b, d) is given by

|ad − bc|√

a2 + c2

The length of the vector (a, c)τ is√

a2 + c2. It follows that the area spanned by (a, c)τ and (b, d)τ

is

Area =√

a2 + c2 ·|ad − bc|√

a2 + c2= |ad − bc|

For two matrices, we define

detA = ad − bc,

it has the geometric meaning of “area with orientation”.

In dimension 3, give the cofactor expansion definition. Then explain that the expansion alongthe first row is the same as the expansion along the first column. This will imply that the detA =

detAτ.

Then explain from geometric point of view the change of determinant under the fundamentalrow transformations.

Then use the reduced row echelon form to explain that detA = detEk · detEk−1 · ·detE1. Thisthen implies that

detAB = detAdetB.

28

Page 162: MATH320

A matrix is invertible if and only if the row reduced echelon form is Id, if and only if det A , 0.

Calculate the determinants of the following example.

det

−2 5 45 3 11 4 5

, det

1 2 1 −10 1 −2 0−2 3 −2 30 −3 3 3

20 October 19th, 2012; Determinants, II

Using fundamental row(column) transformation to transform a matrix to an uptrianglular matrix.In each transformation, record the change of the determinant.

21 October 22th, 2012; The Vector Space and subspaces

In this class, we study the explicit vector space R3.

A vector in R3 is simply an ordered triple (x, y, z) of real numbers. For simplicity of notation, avector is denoted by ~v. The sum of vectors and the multiplication of a vector by a scalar is definedin an obious way.

R3 is the set of all the vectors ~v = (x, y, z).

A subset V ⊂ R3 is called a subspace if the following properties are satisfied.

(1). If ~v ∈ V , then k~v ∈ V for arbitrary k ∈ R.

(2). If ~u ∈ V and ~v ∈ V , then ~u + ~v ∈ V .

By definition, if V , ∅, then ~0 ∈ V .

Example 21.1. Check if the following set are linear subspace.

(a). V = (x, y, z)|z ≥ 0.

(b). V = (x, y, z)|x + y + z = 1.

(c). V = (x, y, z)|x + y + z = 0.

(d). V = (x, y, z)|x + y + z = 0, x + 2y + 3z = 0.

The sets V defined in (c) and (d) are linear subspaces. However, they are different in geometryby the dimensions. In order to illustrate the dimension rigorously, we need the concept of linearindpendence.

Two vectors ~u and~v are called linearly independent if and only if the linear system (for variablesa an b)

a~u + b~v = ~0 (7)

29

Page 163: MATH320

has a unique solution

a = 0, b = 0.

Two vectors ~u and ~v are called linearly dependent if and only if they are not linearly indepen-dent,i.e., there exists a solution of (7) such that either a , 0 or b , 0.

Suppose ~u and ~v are linearly dependent, then

a~u + b~v = ~0

has a nonzero solution. Without loss of generality, we assume that a , 0, then we have

~u = −ba~v.

This means that ~u is generated by ~v. So ~u is redundant.

Example 21.2. Check if the following vectors are linearly independent.

(a). ~u = (1, 0, 1), ~v = (−3, 0,−3).

(b). ~u = (2, 1, 3), ~v = (1, 2, 3).

Three vectors ~u, ~v and ~w are called linearly independent if and only if the linear system (forvariables a an b)

a~u + b~v + c~w = ~0, (8)

has a unique solution

a = 0, b = 0, c = 0.

Three vectors ~u, ~v and ~w are called linearly dependent if and only if they are not linearly indepen-dent,i.e., there exists a solution of (7) such that either a , 0 or b , 0, or c , 0.

Suppose ~u, ~v and ~w are linearly dependent, then

a~u + b~v + c~w = ~0

has a nonzero solution. Without loss of generality, we assume that a , 0, then we have

~u = −ba~v −

ca~w.

This means that ~u is generated by ~v and ~w. So ~u is redundant.

Example 21.3. Check if the following vectors are linearly independent.

(a). ~u = (1, 1, 0),~v = (−1, 1, 1), ~w = (−1, 3, 2).

(b). ~u = (1, 2, 3),~v = (1, 4, 9), ~w = (1, 8, 27).

30

Page 164: MATH320

22 October 24th, 2012; Linear combinations and independence ofvectors

We discuss the general Euclidean space Rn. One can regard Rn as the collection of all n-vectors,or n-tuples, (x1, x2, · · · , xn). Some times, we will also write vectors in column for convenience.

One can easily define linear subspace of Rn.

Definition 22.1. A set V in Rn is called a linear subspace of Rn if the following properties aresatisfied.

(a). If ~v ∈ V, then k~v ∈ V for every k ∈ R.

(b). If ~u,~v ∈ V, then ~u + ~v ∈ V.

Note that the set~0

is a linear subspace and it is contained in every other non-empty linearsubspace.

Many interesting set V appears as the solution set of a linear system. Not every solution set isa linear subspace. For example, let V = (x1, x2, x3, x4)|x1 − x2 + x3 + 2x4 = 1. It is not a linearsubspace since ~0 is not in V . However, if the linear system is homogeneous, then the solution setmust be a linear subspace.

Theorem 22.1. Suppose A is a m × n matrix, ~v is a column vector in Rn. Let

V =~v|A~v = ~0

.

Then V is a linear subspace of Rn. V is called the kernel space of the matrix A.

The key of the proof is the following. If A~v = ~0, then A(k~v) = ~0 for arbitrary real number k. IfA~u = ~0 and A~v = ~0, then A(~u + ~v) = ~0.

Example 22.1. Find V =~v|A~v = 0

, where ~v ∈ R4 and

A =

1 3 8 −11 −3 −10 51 4 11 −2

It is easy to compute that

rre f (A) =

1 0 −1 20 1 3 −10 0 0 0

As column 3 and 4 have no leading one’s, we see x3 and x4 are free variables. So let x3 = s, x4 = t,we have the solution

x1 = s − 2t,

x2 = −3s + t,

x3 = s,

x4 = t.

31

Page 165: MATH320

Now we write every solution as a vectorx1x2x3x4

= s

1−310

+ t

−2101

So every solution vector is the linear combination of ~v1 = (1,−3, 1, 0)τ, ~v2 = (−2, 1, 0, 1)τ. In

other words, the solution space is spanned by the vectors ~v1 and ~v2.

Definition 22.2. The vectors ~v1,~v2, · · · ,~vk are called linearly independent if the linear system

c1~v1 + c2~vn + · · · cn~vn = ~0

has a unique solution c1 = c2 = · · · = ck = 0. They are called linearly dependent if they are notlinearly independent.

Vectors ~v1,~v2, · · · ,~vk are linearly dependent if and only if one of them can be expressed as thelinear combination of the others. On one hand, if one ~vi can be expressed as the linear combinationof the others, we can easily see that ~v1, · · · ,~vk are linearly dependent. For example, let’s assume

~v1 = a2~v2 + a3~v3 + · · · ak~vk,

then

−~v1 + a2~v2 + a3~v3 + · · · ak~vk = ~0.

So we have a nontrivial solution c1 = −1, c2 = a2, · · · , ck = ak. It means that~v1, · · · ,~vk are linearlyindependent. On the other hand, if~v1, · · · ,~vk are linearly dependent, then we have a solution ci

ki=1

such that some ci is nonzero. For simplicity, let’s assume c1 , 0, then

c1~v1 = −c2~v2 + · · · + −ck~vk,⇒ ~v1 = −c2

c1~v2 + · · · −

ck

c1~vk.

So ~v1 can be expressed as the linear combination of the remainded vectors.

23 October 26th, 2012; Bases of Linear Spaces

The basis of a linear subspace V .

Definition 23.1. Vectors~v1, · · · ,~vk

is called a basis of V if the following properties are satisfied.

(a).~v1, · · · ,~vk

are linearly independent.

(b). V is spanned by~v1, · · · ,~vk

.

The number of vectors in a basis is independent of the choice of bases.

Theorem 23.1. Suppose V has two bases~v1, · · · ,~vk

and

~u1, · · · , ~ul

. Then k = l.

32

Page 166: MATH320

The point of the proof is to translate this to a problem of solving linear system by the definitionof linear independence. Then one can obtain contradiction if k , l.

Therefore, the number of vectors in a basis of V reveals some important property of the linearsubspace V . It exactly means the smallest number of vectors one need to span V . This number iscalled the dimension of V . Clearly, the dimension of a straight line is 1, the dimension of a planeis 2, the dimension of Rn is n.

For Rn, we define

~e1 = (1, 0, 0, · · · , 0),

~e2 = (0, 1, 0, · · · , 0),

· · ·

~en = (0, 0, 0 · · · , 1).

One can easily check~e1, · · · , ~en

form a basis of Rn. So dimension of Rn is n. Therefore, for n

vectors~v1, · · · ,~vn

in Rn, they form a basis if and only if they are linearly independent.

Theorem 23.2. The vectors ~u = (u1, u2, u3), ~v = (v1, v2, v3) and ~w = (w1,w2,w3) are linearlyindependent if and only if

det

u1 v1 w1u2 v2 w2u3 v3 w3

, 0.

Give a sketchy proof and then show an example.

Check if the vectors

~u = (1, 2, 3), ~v = (2, 3, 4), ~w = (3, 4, 5)

are linearly independent. So they form a basis of R3.

In general, we have the following theorem.

Theorem 23.3. Suppose ~v1, · · · ,~vn are n column vectors in Rn, then they form a basis of Rn if andonly if

det(~v1, · · · ,~vn) , 0.

Note the determinant symmetry det A = det Aτ.

24 October 29th, 2012; Second order linear equations

Study the special example of harmonic oscillator: F = −kx, with the equation of displacement xbeing

md2xdt2 + c

dxdt

+ kx = 0.

33

Page 167: MATH320

Note the following property of the solutions: If x1 and x2 are solutions, then for every pair ofconstants c1 and c2, the linear combination c1x1 + c2x2 is a new solution.

The second order linear equation is a differential equation in the form

y′′ + p(t)y′ + q(t)y = f (t).

The equation is called homogeneous if f (t) ≡ 0.

Lemma 24.1. The solution set of a homogeneous second order linear equation is a linear space,i.e., if y1 and y2 are two solutions, then c1y1 + c2y2 is a new solution for every two constants c1and c2.

The proof is to check by definition. If y1 and y2 are solutions, we have

y′′1 + p(x)y′1 + q(x)y1 = 0,

y′′2 + p(x)y2 + q(x)y2 = 0.

Multiply the first equation by c1, the second equation by c2. Then add them together, we have

(c1y1 + c2y2)′′ + p(x)(c1y1 + c2y2)′ + q(x)(c1y1 + c2y2) = 0,

which means c1y1 + c2y2 is a solution.

Let’s get some feeling from the following example.

Example 24.1. Check the solutions of the following differential equations.

y′′ + y = 0, y = c1 cos t + c2 sin t.

y′′ − y = 0, y = c1et + c2e−t.

y′′ − 2y′ + y = 0, y = c1et + c2tet.

In order to show y′′ + y = 0 has solutions c1 cos t + c2 sin t, we only need to show that cos t andsin t are solutions by Lemma 24.1. Actually, we see

(cos t)′′ + cos t = (− sin t)′ + cos t = − cos t + cos t = 0,

(sin t)′′ + sin t = (cos t)′ + sin t = − sin t + sin t = 0.

It is expected that second order linear equation has more than one solutions. However, how toexpress all the solutions of a given equation. This is guaranteed by the following existence anduniqueness theorem.

Theorem 24.1. Suppose that the functions p, q and f are continuous on the open interval I con-taining the point a. Then given any two numbers b0 and b1, the equation

y′′ + p(x)y′ + q(x)y = f (x)

has a unique solution on the entire interval I that satisfies the initial conditions

y(a) = b0, y′(a) = b1.

34

Page 168: MATH320

Note that one need two initial conditions to fix the solution: the initial value and initial deriva-tive. This is natural. Just think Newton’s equation F = ma = mx. When F is known, one needinitial position and velocity to predict the position at every time.

If f ≡ 0, or the equation is homogeneous, for the initial conditions y(a) = 0, y′(a) = 0, there isa unique solution y ≡ 0.

Fix two constants c1 and c2. For the initial condition y(a) = 1, y′(a) = 0, there is a solutiony1. For the initial condition y(a) = 0, y′(a) = 1, there is another solution y2. Therefore, thesolution Y = c1y1 + c2y2 satisfies the initial condition Y(a) = c1,Y ′(a) = c2. By Theorem 24.1,Y = c1y1 + c2y2 is the unique solution satisfying the given initial condition.

The above argument can be generalized. Suppose y1, y2 are two solutions such that

~v1 =

(y1(a)y′1(a)

), ~v2 =

(y2(a)y′2(a)

).

They form a basis of R2, so every initial condition vector(y(a)y′(a)

)can be expressed as a linear combination of ~v1 and ~v2. For every solution Y , we can write

~v =

(Y(a)Y ′(a)

)= c1~v1 + c2~v2.

Clearly, the solution Z = Y − c1y1 − c2y2 satisfies the initial condition Z(a) = 0,Z′(a) = 0.Therefore, by the existence and uniqueness theorem, we have Z ≡ 0. It follows that Y = c1y1+c2y2.

Recall that two vectors form a basis of R2 if and only if they form a matrix with nonzerodeterminant. Define

W(y1, y2) = det(y1 y2y′1 y′2

)The above argument show that if W(y1, y2)(a) , 0, then every solution can be expressed as thelinear combination of y1, y2. On the other hand, if W(y1, y2)(a) = 0, then we have ~v1, ~v2 linearlydependent. Therefore, we can assume that ~v2 = λ~v1. Then the solution Z = y2 − λy1 satisfies theinitial condition Z(a) = 0,Z′(a) = 0. Therefore, we have Z ≡ 0, which is the same as y2 = λy1.Then we see that

W(y1, y2) = det(y1 y2y′1 y′2

)= det

(y1 λy1y′1 λy′1

)= 0.

Therefore, we have checked the following statement.

Lemma 24.2. Suppose y1, y2 are two solutions of a homogeneous equation y′′+ p(x)y′+q(x)y = 0,then either W(y1, y2) , 0 every where, or W(y1, y2) ≡ 0.

For the given examples, calculate their Wronskian.

35

Page 169: MATH320

25 October 31st, 2012; Second order linear equations, II

Calculate the solutions for the examples.

Example 25.1. Find solutions for the equation

y′′ − 2y′ + y = 0.

Find solutions for the equation

y′′ − 2y′ + (1 − δ2)y = 0.

Solve the auxiliary equation

k2 − 2k + (1 − δ2) = 0,⇒ (k − 1 − δ)(k − 1 + δ) = 0.

So we have solutions

y = c1e(1+δ)t + c2e(1−δ)t.

In particular, there is a solution

y =e(1+δ)t − e(1−δ)t

2δ∼ tet.

Therefore, it is not surprising that tet is a solution of y′′ − 2y′ + y = 0.

In general, for y′′ − 2ay′ + a2y = 0, we have common solutions

y = c1eat + c2teat.

How do we know we have obtained all the solutions? We have the following theorem.

Theorem 25.1. Suppose y1 and y2 are two solutions of the homogeneous second-order linearequation

y′′ + p(x)y′ + q(x)y = 0

on an open interval I on which p and q are continuous. If W(y1, y2) , 0 at some point, then everysolution can be expressed as a linear combination of y1 and y2. In other words, y1 and y2 form abasis for the solution space.

Two functions f1, f2 are linearly independent over the interval I if the equality

c1 f1 + c2 f2 ≡ 0

hold for only c1 = c2 = 0. Note that the right hand side is zero function.

36

Page 170: MATH320

Recall that the Wronskian of two functions f1, f2 is defined as

W( f1, f2) = det(

f1 f2f ′1 f ′2

).

If f1, f2 are linearly dependent, then W( f1, f2) ≡ 0. Actually, we can assume f2 = λ f1 without lossof generality, then we have W( f1, λ f1) = 0 by direct calculation. We have

f1, f2 linearly dependent⇒ W( f1, f2) ≡ 0?⇒ f1, f2 linearly dependent.

If y1, y2 are solutions of a homogeneous equation, we have more information. The prevous lemmashows the following.

y1, y2 linearly dependent⇒ W( f1, f2) ≡ 0⇒ y1, y2 linearly dependent.

We also know that W(y1, y2) is either a zero function, or a function never being zero. So we have

y1, y2 linearly dependent⇒ W( f1, f2)(a) = 0⇒ y1, y2 linearly dependent.

26 November 2nd, 2012; High order Linear Equations

Look at the example

y′′′ − 6y′′ + 11y′ + 6y = 0.

Find solutions and show that they are all the possible solutions. Generalize the previous dis-cussion in 2nd order to higher order.

For linear differential equation of order n with continuous initial conditions, we have existenceand uniqueness theorem for the solutions. Note that for an order n differential equation, the initialconditions contain n terms: the initial value y(a), initial derivative y′(a), · · · , untial the initial(n − 1)-derivative y(n−1)(a).

27 November 5th, 2012; Homogeneous Equations with constant co-efficients

Suppose the characteristic equation of a differential equation is

rn + pn−1rn−1 + · · · p1r + p0 = 0.

Suppose it has roots r1 of multiplicity m1, r2 of multiplicity m2, · · · , rq of multiplicity mq. Supposethat m1 + m2 + · · ·mq = n. Then every solution of the differential equation is of the form(

c1,1er1t + c1,2ter1t + · · · c1,m1 tm1−1er1t)

+ · · ·(cq,1er1t + cq,2ter2t + · · · cq,mq tmq−1erqt

)

37

Page 171: MATH320

28 November 7th, 2012; Homogeneous Equations with constant co-efficients, II

Recall the example of the harmonic oscillator:

x +km

x = 0. (9)

For simplicity, we assume km = 1 and obtain the general solution

c1 cos t + c2 cos t

by checking that cos t and sin t are linearly independent solutions of this second order homoge-neous equation.

Note that the characteristic equation is

r2 + 1 = 0,

which has solution r = ±i where i is the imaginary number such that i2 = −1. Therefore, formally,the solution of (9) is

c1eit + c2e−it.

Actually, this is true if we allow c1, c2 also be complex number.

Recall that for any real number t, we Taylor expansions

et =

∞∑k=0

tk

k!

cos t =

∞∑k=0

(−1)kt2k

(2k)!

sin t =

∞∑k=0

(−1)kt2k+1

(2k + 1)!

Now we can define the complex exponential by the above formal series. So we have

eit =

∞∑k=0

iktk

k!= 1 + it −

t2

2−

it3

3!+

t4

4!+ · · ·+ =

∞∑p=0

(−1)pt2p

(2p)!+ i

∞∑q=0

(−1)qt2q+1

(2q + 1)!= cos t + i sin t.

Let t = π, we obtain the famous Euler’s formula

eiπ + 1 = 0.

Clearly, cos t and sin t is the real part and imaginary part of eit. Note that the real part and theimaginary part of e−it are cos t and − sin t.

38

Page 172: MATH320

It can be checked by formal series expansion that ez1 · ez2 = ez1+z2 . Therefore, let z = a + ib,then

ea+ib = ea · eib = ea (cos b + i sin b) = ea cos b + iea sin b.

So the real part and imaginary part of ea+ib are ea cos b, ea sin b respectively.

Look at the examples

y′′ + 4y′ + 5y = 0.

The characteristic equation has solutions r = −2 ± i. The real part and imaginary part of e(−2+i)t

are

e−2t cos t, e−2t sin t.

The general solution is

c1e−2t cos t + c2e−2t sin t.

Example 28.1. Find the general solutions of

y(4) + 4y(3) + 4y′′ + 8y′ + 4y = 0.

Characteristic equation

r4 + 4r3 + 4y2 + 8y + 4 = (r2 + 2r + 2)2 = [(r + 1 + i)(r + 1 − i)]2 = (r + 1 + i)2(r + 1 − i)2.

The root r = −1 + i has multiplicity 2. The real and imaginary part of e(−1+i)t are e−t cos t ande−t sin t. Therefore, the general solutions are

c1e−t cos t + c2te−t cos t + c3e−t sin t + c4te−t sin t.

29 November 9th, 2012; Mechanical Vibrations

Recall the free harmonic oscillator: mx + kx = 0. For simplicity, we write ω0 =

√km . The general

solution is

x = A cosω0t + B sinω0t =√

A2 + B2

(A

√A2 + B2

cosω0t +B

√A2 + B2

sinω0t).

Let α = tan−1 BA , then we have cosα = A√

A2+B2, sinα = B√

A2+B2. So

x =√

A2 + B2 (cosω0t cosα + sinω0t + sinα) =√

A2 + B2 cos(ω0t − α) = C cos(ω0t − α),

39

Page 173: MATH320

where C =√

A2 + B2 is called the amplitude, ω0 is called the circular frequency and α is calledthe phase angle.

Now we assume the existence of external force, the harmonical oscillator becomes the dampedharmonic oscillator.

x +km

x + 2px = 0,

where p = c2m .

Solutions of characteristic equation. Three cases (depending on coefficients): two distinct realsolutions; one real solution with multiplicity 2; two complex valued solutions.

Example 29.1. The following equations describes the motion of damped harmonic oscillator. Findthe precise solution for each equation.

(1). x + 4x + 2x = 0, x(0) = 0, x(0) = 4,

(2). x + 4x + 4x = 0, x(0) = 0, x(0) = 4,

(3). x + 4x + 8x = 0, x(0) = 0, x(0) = 4.

For the first example, the characteristic equation is r2+4r+2 = 0. Therefore, r1 = −2+√

2, r2 =

−2 −√

2. A general solution is of the form

x(t) = c1e(−2+√

2)t + c2e(−2−√

2)t,⇒ x′(t) = (−2 +√

2)c1e(−2+√

2)t + (−2 −√

2)c2e(−2−√

2)t.

Plug in the intial condition, we have

0 = x(0) = c1 + c2;

4 = x′(0) = (−2 +√

2)c1 + (−2 −√

2)c2.

This yields c1 =√

2, c2 = −√

2. So the solution is

x(t) =√

2e(−2+√

2)t −√

2e(−2−√

2)t.

The second example. The characteristic equation is r2 + 4r + 4 = (r + 2)2 = 0, which has rootr = −2 of multiplicity 2. So the general solution is

x(t) = c1e−2t + c2te−2t,⇒ x(t) = −2c1e−2t + c2e−2t − 2c2te−2t.

Plugging in the condition x(0) = 0, x(0) = 4, we have

0 = c1,

4 = −2c1 + c2.

Therefore, c1 = 0, c2 = 4. The solution is

x(t) = 4te−2t.

40

Page 174: MATH320

The third example. The charactristic equation is r2 + 4r + 8 = (r + 2)2 + 4 = 0, which has rootsr = −2±2i. It follows that e−2t cos 2t and e−2t sin 2t are linearly independent solutions. So generalsolution is

x(t) = c1e−2t cos 2t + c2e−2t sin 2t,⇒ x(t) = −2c1e−2t(cos 2t + sin 2t) + 2c2e−2t(cos 2t − sin 2t).

It follows that

0 = x(0) = c1;

4 = x(0) = −2c1 + 2c2.

So c1 = 0 and c2 = 2, we obtain the solution

x(t) = 2e−2t sin 2t.

Draw the picture of each solution and generalize the above example. For the damped harmonicoscillator

x + 2px +km

x = 0.

Note that p ≥ 0.

(a). p2 − km > 0, we have two nonpositive distinct roots for the characteristic equation. No

oscillation behavior. This is the Overdamped case.

(b). p2 − km = 0, we have one real root of multiplicity two. No oscillation behavior. This is

called the Critically Damped case.

(c). p2 − km < 0, we have a pair of conjugate imaginary roots for the characteristic equation.

There are oscillation behavior. This is the Underdamped case.

Discuss more examples of using solutions to construct differential equations.

30 November 12th, 2012; Nonhomogeneous equations

Suppose we have two solutions y1 and y2 for the non-homogeneous equations

y(n) + pn−1y(n−1) + pn−2y(n−2) + · · · p1y′ + p0 = f (t). (10)

Clearly, y1 − y2 satisfies the equation

y(n) + pn−1y(n−1) + pn−2y(n−2) + · · · p1y′ + p0 = 0. (11)

Therefore, every solution of (10) can be written as yc + yp where yp is a solution of (10), yc is thegeneral solution of (11).

In order to solve a non-homogeneous equation, we only need one more step than the homoge-neous equation: to find a particular solution.

41

Page 175: MATH320

Example 30.1. Check that t + 1 is the solution of

y′′ + 6y′ + 5y = 5t + 11. (12)

Write down the general solution of this equation.

yp = t + 1, yc = c1e−t + c2e−5t, so the general solution

y = t + 1 + c1e−t + c2e−5t.

Example 30.2. Find particular solutions for the following equations

(1).y′′ − 4y = 10e3t;

(2).y′′ − 4y = 4e2t.

(1). Try y = Ae3t. Then calculate to obtain A = 2.

(2). Note that y = e2t solves the corresponding homogeneous equation y′′ − 4y = 0. So we tryy = Ate2t. Calculate to obtain A = 1.

If f (x) is the linear combination of product functions of polynomial, erx and cos rx, sin rx,we can always “try” particular solutions. If f (x) itself is not a complementary solution, i.e., thesolution of the associated homogeneous linear equation, then we try directly according to the formof f (x). If f (x) solves the homogeneous equation, then one should consider multiplying extra tterms.

Example 30.3. Solve the initial value problem

y′′ + y = cos t, y(0) = 0, y′(0) = 1.

Since f (t) = cos t, the first try for particular solution is y = A cos t + B sin t. However, it is easyto see that

(A cos t + B sin t)′′ + (A cos t + B sin t) = 0,

no matter how do we choose A, B. Actually, cos t and sin t are two linearly independent solutionsof the homogeneous equation y′′ + y = 0.

The trick is to replace cos t by t cos t, sin t by t sin t. Let y = At cos t + Bt sin t, we have

y′ = A cos t − At sin t + B sin t + Bt cos t;

y′′ = −A sin t − A sin t − At cos t + B cos t + B cos t − Bt sin t = −t(A cos t + B sin t) + (−2A sin t + 2B cos t).

It follows that

cos t = y′′ + y = −2A sin t + 2B cos t,⇒ A = 0, B =12.

42

Page 176: MATH320

We obtain a particular solution yp = 12 t sin t. On the other hand, the solutions for the homogeneous

equation y′′ + y = 0 is yc = c1 cos t + c2 sin t. Therefore, the general solution is

y =12

t sin t + c1 cos t + c2 sin t,⇒ y′ =12

t cos t + (12− c1) sin t + c2 cos t.

Plug in the value t = 0, we have

0 = y(0) = c1,

1 = y′(0) = c2.

So the solution is

y =12

t sin t + sin t.

31 November 19th, 2012; Eigenvalues and Eigenvectors, I

In many cases, we will meet the problem of finding Ak~v where A is an n × n matrix and ~v is acolumn vector in Rn. If we do the calculation from the definition, then the calculation becomesvery complicated when n and k are large. However, in some special case, the calculation is verysimple. For example, if

A =

1 1 11 1 11 1 1

, ~v =

111

.Then we have

A~v =

333

= 3~v, A2~v = A(3~v) = 3A~v = 32~v, · · · , Ak~v = A(Ak−1~v) = A(3k−1~v) = 3k~v.

The multiplication of A to ~v behaves like the muliplication of a number to ~v. Therefore, thecomputation is much easier. The above example suggests us to study the following importantconcepts: eigenvector and eigenvalue.

Definition 31.1. Suppose A is an n × n square matrix. A number λ is called an eigenvalue of Aprovided there exists a nonzero vector ~v ∈ Rn such that

A~v = λ~v.

This ~v is called an eigenvector associated with the eigenvalue λ.

Note that the requirement of ~v , 0 is necessary since A~0 = λ~0 for arbitrary λ.

In the above example, we see that for the given A, ~v is an eigenvector associated with theeigenvalue λ = 3.

43

Page 177: MATH320

More examples for eigenvalues. Let

A =

(2 34 6

).

It’s easy to see that 8 is an eigenvalue. The vector ~v = (1, 2)τ is an eigenvector associated with theeigenvalue 8.

What is the general method to search eigenvalues and eigenvectors?

Suppose λ is an eigenvalue, ~v is an eigenvector associated with λ. Then

A~v = λ~v,⇔ A~v = λI~v,⇔ (A − λI)~v = 0.

Note that ~v , 0. So the linear system (A − λI)~x = ~0 has at least one non-zero solution ~v. Thismeans that this system has at least one free variables. So the reduced row echelon form of A − λIhas at most n − 1 leading 1’s. Therefore, the matrix A − λI is not invertible, which is equivalentto say det(A − λI) = 0. After expansion, we see that det(A − λI) = 0 is a polynomial equation ofvariable λ. This equation is called the characteristic equation.

Example 31.1. Find the characteristic equation for the following matrices.

A =

(2 34 6

)

We see that

det(A − λI) = det(2 − λ 3

4 6 − λ

)= (2 − λ)(6 − λ) − 12 = λ2 − 8λ.

The characteristic equation is λ2−8λ = 0. Clearly, the eigenvalues are the roots of the characteristicequations: λ = 0 and λ = 8.

Once the eigenvalues are decided, one can calculate the eigenvectors by solving the linearsystem (A − λI)~x = ~0. Let’s continue to calculate the eigenvectors for Example 31.1. For λ = 0,we solve (

2 34 6

) (x1x2

)=

(00

)⇒

(1 3

20 0

) (x1x2

)=

(00

)So the associated eigenvectors are (− 3

2 t, t)τ for arbitrary nonzero t.

For λ = 8, we solve (−6 34 −2

) (x1x2

)=

(00

),⇒

(1 − 1

20 0

) (x1x2

)=

(00

)Therefore, the associated eigenvectors are ( 1

2 t, t)τ for arbitrary nonzero t.

Note that for every eigenvalue, the eigenvector is NOT unique.

44

Page 178: MATH320

Example 31.2. Find eigenvalues and eigenvectors for the following matrices.

A =

(0 4−1 0

), B =

(3 00 3

), C =

(1 20 1

).

For A, λ = ±2i, with eigenvectors being multiples of (∓2i, 1)τ.

For B, λ = 3, with eigenvectors being all nonzero vectors.

For C, λ = 1, with eigenvectors being multiples of (1, 0)τ.

Example 31.1 and Example 31.2 illustrate all the four possibilities for eigenvalues and eigen-vectors of 2 × 2 matrix A.

32 November 21st, 2012; Eigenvalues and Eigenvectors, II

First, let’s calculate the following example

Example 32.1. Find the eigenvalues and the associated eigenvectors for the matrix

A =

1 2 11 0 10 0 1

Calculate characteristic equation.

det(A − λI) = det

1 − λ 2 11 −λ 10 0 1 − λ

= (1 − λ) det

(1 − λ 2

1 −λ

)= (1 − λ)(λ2 − λ − 2)

= −(λ − 2)(λ − 1)(λ + 1).

So the eigenvalues are λ1 = 2, λ2 = 1, λ3 = −1.

For each eigenvalue, we can calculate the eigenvectors.

For λ = 2, we have

A − λ1I =

−1 2 11 −2 10 0 −1

,⇒ rre f (A − λ1I) =

1 −2 00 0 10 0 0

So the solution space of (A − λ1I)~x = 0 is spanned by ~v1 = (2, 1, 0)τ. The eigenvectors associatedwith λ = 2 are t~v1 for arbitrary t , 0.

45

Page 179: MATH320

For λ2 = 1, we have

A − λ2I =

0 2 11 −1 10 0 0

⇒ rre f (A − λ2I) =

1 0 32

0 1 12

0 0 0

The solution space is spanned by ~v2 =

(− 3

2 ,−12 , 1

)τ. The eigenvectors associated with λ = 1 are

all the vectors t~v2 for arbitrary t , 0.

For λ3 = −1, we have

A − λ3I =

2 2 11 1 10 0 2

⇒ rre f (A − λ3I) =

1 1 00 0 10 0 0

The solution space is spanned by ~v3 = (0, 1, 0)τ. The eigenvectors associated with λ = −1 are allthe vectors t~v3 for arbitrary t , 0.

Then show the following general fact: every 3 × 3 matrix has at least one real eigenvalues.Actually, after fundamental calculation, it is easy to see that f (λ) = det(A− λI) is a polynomial ofdegree 3. The leading term is (−1)3λ3. It follows that

limλ→∞

f (λ) = −∞, limλ→∞

f (λ) = ∞.

By the continuity of f , there must exist some λ0 ∈ (−∞,∞) such that f (λ0) = 0.

The same argument applies when A is an odd-dimensional square matrix. So every odd dimen-sional square matrix has at least one real eigenvalues.

We already know for a given eigenvalue, the associated eigenvectors are not unique. However,all the eigenvectors plus zero form a linear subspace, which is called the eigenspace of λ.

Definition 32.1. Suppose λ is an eigenvalue of A, then the solution space of (A−λI)~x = ~0 is calledthe eigenspace of A associated with the eigenvalue λ.

Example 32.2. Calculate the eigenspaces of the matrix

A =

4 −2 12 0 12 −2 3

Calculate characteristic equation

det(A − λI) = det

4 − λ −2 12 −λ 12 −2 3 − λ

= · · · = λ3 − 7λ2 + 16λ − 12.

We observe an integer solution λ = 2, then use long division to obtain the factorization.

det(A − λI) = (λ − 2)2(λ − 3).

46

Page 180: MATH320

For λ = 2. We obtain the eigenspace is span v1, v2 where

~v1 =

110

, ~v2 =

−102

For λ = 3. We obtain the eigenspace is span

~v3

where

~v3 =

111

Then use the eigenvectors and eigenvalues to calculate A4~v in the above example, where ~v =

(2, 3, 4)τ.

Actually, note that

~v = ~v1 + ~v2 + 2~v3,

A~v = 2~v1 + 2~v2 + 2 · 3~v3,

A2~v = 22~v1 + 22~v2 + 2 · 32~v3,

· · ·

A4~v = 24~v1 + 24~v2 + 2 · 34~v3.

Therefore,

A4~v = 16(~v1 + ~v2) + 162~v3 = 16

012

+ 162

111

=

162178194

.33 November 26th, 2012; First-Order Linear Systems and Matrices

Return to the harmonic oscillator and translate it to a first order linear system.

Therefore, we can solve a first order linear system by translating it to a second order equation.

Examples of translating a high-order equation to linear system.

Example 33.1. Using substitution to translate the following equation into a first order linearsystem:

y(3) + 10y′′ + 3y′ − 10y = et.

Solve the first order linear system:

x′ = y, y′ = −x.

47

Page 181: MATH320

This system is the same as x′′ + x = 0. The characteristic equation for this second order, homoge-neous equation is

r2 + 1 = 0,⇒ r = ±i⇒ x = c1 cos t + c2 sin t.

In general, a first order differential equation system is

y′1 = p11(t)y1 + · · · p1nyn + f1(t),

y′2 = p21(t)y1 + · · · p2nyn + f2(t),

· · ·

y′n = pn1(t)y1 + · · · pnnyn + fn(t).

Then suggest the existence and uniqueness theorem: if each function pi j and fk are continuouson the open interval I containing the point a. Then given the n numbers b1, b2, · · · , bn, the abovesystem has a unique solution on the entire interval I satisfying the n initial conditions

y1(a) = b1, · · · , yn(a) = bn

.

We can put y1, y2, · · · , yn into one vector ~y, the coefficients pi j(t) into one matrix P(t), then wehave

d~ydt

= P~y + ~f .

34 November 28th, 2012; First-Order Linear Systems and Matrices

Linear property.

Let ~y1, · · · , ~yn be n solutions of the homogeneous linear equation on the open interval I. Ifc1, c2, · · · , cn are constants, then the linear combination

~y(t) = c1~y1(t) + · · · + cn~yn(t)

is also a solution.

The vector-valued functions ~y1, · · · , yn are linearly dependent on the interval I if there existconstants c1, c2, · · · , cn, not all zero, such that

c1~y1(t) + · · · cnyn(t) = ~0

for all t in I.

Suppose each ~yi is n-dimensional vector valued functions on the inteval I, then the Wronskianof ~y1, · · · , ~yn is

W(t) = det

y11(t) y12(t) · · · y1n(t)y21(t) y22(t) · · · y2n(t)· · · · · · · · · · · ·

yn1(t) yn2(t) · · · ynn(t)

48

Page 182: MATH320

Note that if ~y1, · · · , ~yn are linearly dependent, then the Wronskian is a zero function.

Note the difference of Wronskian here and the Wronskian of a higher order differential equa-tion.

By the existence and uniqueness theorem, we see that if ~y1, · · · , ~yn are solutions of the homoge-neous equation, then the Wronskian is zero fundtion if and only if the Wronskian is zero at somepoint.

By the existence and uniqueness theorem, the solution of a homogeneous equation is totallydecided by the initial value at the point a ∈ I. At the point a, the initial value can be arbitraryvector in ~b ∈ Rn. So the solution space has the same structure as Rn.

Theorem 34.1. Suppose ~y1, · · · , ~yn be n linearly independent solutions of the homogeneous linearequation ~y′ = P(t)~y on an open interval I where P(t) is continuous. Then every solution can bewritten as

~y(t) = c1~y1(t) + · · · + cn~yn(t).

Example 34.1. Verify that ~y1(t) = (3e2t, 2e2t)τ, ~y2(t) = (e−5t, 3e−5t)τ are solutions of the first-ordersystem

y′1 = 4y1 − 3y2,

y′2 = 6y1 − 7y2.

Check the Wronskian, the linear independence, then every solution can be written as c1~y1(t) +

c2~y2(t).

Example 34.2. Verify that ~y1(t) = (2et, 2et, et)τ, ~y2(t) = (2e3t, 0,−e3t)τ, ~y3(t) = (2e5t,−2e5t, e5t)τ

are solutions of

d~ydt

=

3 −2 0−1 3 −20 −1 3

~y

Then with initial value ~y(0) = (0, 2, 6)τ.

Nonhomogeneous equation.

35 November 30th, 2012; First Linear Equations and Matrices(II)

Consider the simple Homogeneous system:

ddt~y = A~y, ~A =

(0 −11 0

).

49

Page 183: MATH320

Write ~y = (y1, y2)τ. Then the system can be easily written as

ddt

y1 = −y2,

ddt

y2 = y1.

Clearly, both

~ya =

(cos tsin t

), ~yb =

(− sin tcos t

)The Wronskian

W(t) = det(cos t − sin tsin t cos t

)= cos2 t + sin2 t = 1 , 0.

So ~ya and ~yb form a basis of the solution space. Every solution can be written as

~y = ca~ya + cb~yb =

(ca cos t − cb sin tca sin t + cb cos t

),

where ca, cb are arbitrary constants.

The method discussed above is not general. The general method is to use eigenvalues andeigenvectors. The characteristic equation

det(−λ −11 −λ

)= λ2 + 1 = 0.

So λ = ±i.

For i, calculate the corresponding eigenspace (i.e. the solution space, or kernel space of A−λI).

A − λI = A − iI =

(−i −11 −i

)R1∗(−i)+R2−→

(−i −10 0

)R1∗i−→

(1 −i0 0

)It follows that the eigenspace is spanned by (i, 1)τ. So the following vector-valued function is asolution:

eit(i1

)=

(ieit

eit

)=

(− sin t + i cos tcos t + i sin t

)=

(− sin tcos t

)+ i

(cos tsin t

)Take real and imaginary part, we obtain two independent solutions

~y1 =

(− sin tcos t

), ~y2 =

(cos tsin t

).

So every solution can be written as

~y = c1~y1 + c2~y2 =

(−c1 sin t + c2 cos tc1 cos t + c2 sin t

).

Note that the particular expression of solutions as linear combination of basis vectors may bedifferent. However, the solution space is independent of the choice of basis.

50

Page 184: MATH320

Example 35.1. Solve the initial value problem:

ddt~y = A~y, ~y(0) =

(11

)where

A =

(4 −33 4

).

Characteristic equation (λ − 4)2 + 9 = 0. Eigenvalue λ = 4 ± 3i.

For the eigenvalue λ = 4 + 3i, the eigenspace is the kernel space of

A − λI =

(−3i −33 −3i

)R1∗(−i)+R2−→

(−3i −30 0

)R1/(−3i)−→

(1 −i0 0

),

which is spanned by (i, 1)τ. So we have

~veλt = e(4+3i)t(i1

)= e4t

(− sin 3t + i cos 3tcos 3t + i sin 3t

)= e4t

(− sin 3tcos 3t

)+ ie4t

(cos 3tsin 3t

)The real part and the imaginary part of this complex-valued solution are two independent realsolutions. Therefore, every solution is

~y = c1e4t(− sin 3tcos 3t

)+ c2e4t

(cos 3tsin 3t

)Plug in the initial value ~y(0) = (1, 1)τ. So(

11

)= ~y(0) = c1

(01

)+ c2

(10

)⇒ c1 = 1, c2 = 1.

So the solution of the initial value problem is

y = e4t(cos 3t − sin 3tcos 3t + sin 3t

).

36 December 3rd, 2012; Multiple Eigenvalue solutions

Suppose

A =

(1 10 1

).

Then what are the general solutions of ddt~y = A~y?

51

Page 185: MATH320

The characteristic equation is (λ−1)2 = 0. There is a unique eigenvalue λ = 1 with multiplicity2. The eigenspace is spanned by (1, 0)τ. So we have a solution

et(10

).

However, the solution space of ddt~y = A~y is two-dimensional? Where is the other independent

solution? An easy guess is

tet(10

).

However, direct calculations show that it is not a solution. However, we can change it to be asolution by adding an extra term et~w. Let ~v = (1, 0)τ. Suppose ~y = (t~v + ~w)et is a solution. Then

ddt~y = (t~v + ~w)et + ~vet = et (1 + t)~v + ~w

,

A~y = et(tA~v + A~w) = et (t~v + A~w).

So ~y is a solution if and only ~w satisfies

(A − I)~w = ~v.

Choose ~w = (0, 1)τ. Then we see that

et (t~v + ~w)

= et(t1

)It is easy to see this is a solution independent of et~v. So the general solutions are

~y = c1et~v + c2et(t~v + ~w) = et(c1 + c2t

c2

).

Note that (A − I)~w = ~v, (A − I)~v = ~0. Therefore, we can also solve ~w by (A − I)2~w = ~0.

Then we start the general discussion.

An eigenvalue is called of multiplicity k if it is a k-fold root of the characteristic equation. Itis possible that for some eigenvalue λ with multiplicity k, the dimension of the eigenspace is lessthan k. If we denote the dimension of eigenspace as p. Then define d = k − p, which is calleddefect. An eigenvalue λ of mulitplicity k is called complete if the dimension of the correspondingeigenspace is of dimension k. So λ is complete if and only if d = 0. Note that if k = 1, then λ isalways complete. An eigenvalue λ of multiplicity k > 1 is called defective if it is not complete,i.e. d > 0. If d > 0, we have to use similar method as we have discussed in the previous exampleto find new solutions.

If A is 2 × 2 and λ is defective, there is only one possibility: k = 2, p = 1, d = 1. In this case,there is a uniform way to solve the system d

dt~y = A~y.

1. Solve a ~v2 such that (A − λI)2~v2 = 0 and (A − λI)~v2 , 0. Let ~v1 = (A − λI)~v2.

2. Then solutions are

~v1eλt,(~v1t + ~v2

)eλt.

52

Page 186: MATH320

37 December 5th; Multiple Eigenvalue solutions(II)

A more complicated example.

Example 37.1. Find the eigenvalue and eigenspace of the matrix A.

A =

(1 −33 7

)

The characteristic equation is (λ − 4)2 = 0.

For 3-dimensional case.

Example 37.2. Find the general solutions of

ddt~y = A~y, A =

5 −1 11 3 0−3 2 1

Calculate characteristic equation (3 − λ)3 = 0. The eigenvalue λ = 3 has multiplicity 3.

Note that (A − λI)3 = 0. So ~v1 can be chosen as arbitrary vector. Without loss of generality, let

~v1 =

100

Then calculate ~v2 = (A − λI)~v1,~v3 = (A − λI)~v2. We have

~v2 =

21−3

,~v3 =

022

Then one can write down the generalized solutions.

Discuss the general situation. For an eigenvalue with multiplicity k and defect d. Only need totry from (A − λI)d+1.

38 December 7th, Matrix Exponentials and Linear Systems

Suppose ~y1(t), · · · , ~yn(t) form a basis of the solution space of the homogeneous system ddt~y = A~y.

We can combine the ~yi’s to obtain a matrix

Φ(t) = (~y1(t), · · · , ~yn(t)),

53

Page 187: MATH320

which is called the fundamental matrix. It’s easy to see that Φ(t) solve the differential equation ofmatrices:

ddt

Φ(t) = AΦ(t).

As a corollary, the initial value problem

ddt~y = A~y, ~y(0) = ~y0,

has the solution

~y(t) = Φ(t)Φ(0)−1~y0.

Example 38.1. Solve the initial value problem in Example 35.1 by fundamental matrix method.

According to the solution of Example 35.1 , we have

Φ(t) =

(−e4t sin 3t e4t cos 3te4t cos 3t e4t sin 3t

)So we have

Φ(0) =

(0 11 0

), Φ(0)−1 =

(0 11 0

)Recall that initial condition

~y0 =

(11

)So we obtain

~y(t) = Φ(t)Φ(0)−1~y0 =

(−e4t sin 3t e4t cos 3te4t cos 3t e4t sin 3t

) (0 11 0

) (11

)=

(e4t(− sin 3t + cos 3t)e4t (cos 3t + sin 3t)

)Since Φ(t) is a solution of

ddt

Y(t) = AY(t),

where Y(t) a square-matrix-valued function. We can guess that the solution of this equation is

Y(t) = eAtY(0),

by formally look Y as a number. Actually, this makes sense if we define the exponential of squarematrix by formal series.

For every square matrix B, define

eB =

∞∑k=0

Bk

k!

54

Page 188: MATH320

By this definition, we can claculate the derivative of eAt formally.

ddt

eAt =ddt

∞∑k=0

(At)k

k!=

ddt

(I + At +

A2t2

2!+

A3t3

3!+ · · ·

)= A + A2t +

A3t2

2!

= A(I + At +

A2t2

2!+ · · ·

)= AeAt.

Therefore, we can easily check that eAtΦ(0) is a solution of ddt Y = AY:

ddt

eAtΦ(0)

=

ddt

eAt

Φ(0) = AeAtY(0) = AeAtΦ(0)

.

Note that eAt = I. So eAtΦ(0) is a solution of the matrix differential equation with initial valueΦ(0). On the other hand, Φ(t) is another solution of the same equation with the same initialconditions. So we have

Φ(t) = Φ(0)eAt.

Calculate some examples of eAt.

39 December 10th, Nonhomogeneous Linear Systems

Consider the linear system

ddt~y = P(t)~y + ~f (t). (13)

It is easy to see the solution has the decomposition

~y(t) = ~yc(t) + ~yp(t),

where ~yp(t) is a particular solution of (13), ~yc(t) is a general solution of the associated homoge-neous system d

dt~y = A~y.

How to find ~yp? There is a general method called variation of parameters.

Let Φ(t) be the fundamental matrix of the associated homogeneous system. Then for everyconstant vector ~c, we see that Φ(t)~c is a solution of the homogeneous differential equation. If ~cis not a constant vector, but a vector function, then the derivative of ~c will come into effect. Bychoose good ~c(t), it is possible to let the derivative of Φ(t)~c(t) be exactly ~f (t):

ddt

Φ(t)~c(t)

=

(ddt

Φ(t))~c(t) + Φ(t)

ddt~c(t) = P(t)Φ(t)~c(t) + Φ(t)

ddt~c(t) = ~f (t).

55

Page 189: MATH320

Thereofore, the choice of ~c(t) is given by the restriction

~f (t) = Φ(t)ddt~c(t)⇔

ddt~c(t) = Φ(t)−1 ~f (t).

So ~c(t) is any antiderivative of Φ(t)−1 ~f (t). For example, we can choose

~c(t) =

∫ t

aΦ(s)−1 ~f (s)ds.

So we obtain a particular solution

~yp = Φ(t)∫ t

aΦ(s)−1 ~f (s)ds.

Theorem 39.1. If Φ(t) is a fundamental matrix for the homogeneous system ddt~y = P(t)~y on some

interval where P(t) and ~f (t) are continuous, then a particular solution of the nonhomogeneouslysystem

ddt~y = P(t)~y + ~f (t)

is given by

~yp(t) = Φ(t)∫ t

aΦ−1(s) ~f (s)ds.

Suppose ~y(a) is given, then the solution is

~y = ~yc + ~yp = Φ(t)Φ(a)−1~y(a) + Φ(t)∫ t

aΦ(s)−1 ~f (s)ds.

In particular, if P(t) = A, then we can choose Φ(t) = eAt. Then Φ(t)−1 = e−At. So

~y = eA(t−a)~y(a) + eAt∫ t

ae−As ~f (s)ds.

Example 39.1. Solve the initial value problem

ddt~y = A~y + ~f , ~y(0) =

(00

)where

A =

(4 −33 4

), ~f =

(11

)

56

Page 190: MATH320

The solution is

~y = eAt~y(0) + eAt∫ t

0e−As ~f ds = eAt

∫ t

0e−As ~f ds.

Note that

eAt = e4t(cos 3t − sin 3tsin 3t cos 3t

),

e−As = e−4s(

cos 3s sin 3s− sin 3s cos 3s

),

e−As ~f = e−4s(

cos 3s sin 3s− sin 3s cos 3s

) (11

)= e−4s

(cos 3s + sin 3s− sin 3s + cos 3s

)Note that∫ t

0e−4s cos 3sds =

13

∫ t

0e−4sd sin 3s =

13

e−4s sin 3s|t0 −13

∫ t

0sin 3s · (−4)e−4sds

=13

e−4t sin 3t + 4

∫ t

0e−4s sin 3sds

=

13

e−4t sin 3t + 4

cos 3s3

e−4s|t0 −43

∫ t

0e−4s cos 3sds

=

13

e−4t sin 3t +49

(1 − e−4t cos 3t

)−

169

∫ t

0e−4s cos 3sds

It follows that

259

∫ t

0e−4s cos 3sds =

13

e−4t sin 3t +49

(1 − e−4t cos 3t

),∫ t

0e−4s cos 3sds =

325

e−4t sin 3t +425

(1 − e−4t cos 3t

).

Similarly, we can calculate∫ t

0e−4s sin 3sds =

34

∫ t

0e−4s cos 3sds −

14

e−4t sin 3t

= −425

e−4t sin 3t +325

(1 − e−4t cos 3t

).

So we have ∫ t

0e−As ~f ds =

∫ t

0

(e−4s cos 3s + e−4s sin 3s−e−4s sin 3s + e−4s cos 3s

)ds

=e−4t

25

(− sin 3t + 7(e4t − cos 3t)7 sin 3t + (e4t − cos 3t)

)=

e−4t

25

((−7 cos 3t − sin 3t) + 7e4t

(− cos 3t + 7 sin 3t) + e4t

)

57

Page 191: MATH320

It follows that

~y(t) = eAt∫ t

0e−As ~f ds

= e4t(cos 3t − sin 3tsin 3t cos 3t

e−4t

25

((−7 cos 3t − sin 3t) + 7e4t

(− cos 3t + 7 sin 3t) + e4t

)=

125

(−7 + e4t(7 cos 3t − sin 3t)−1 + e4t(cos 3t + 7 sin 3t)

).

This is the final solution. It is not hard to verify that it satisfies the equation.

40 December 12th, Stability and Phase Plane

Learn how to draw the phase diagram.

A first order system of the form

ddt

x = F(x, y),

ddt

y = G(x, y).

is called autonomous system. In an autonomous system, the variable t does not appear in the righthand side.

Starting from time t0, the solution x(t), y(t) corresponds to a curve in the xy-plane, which iscalled the phase plane. This curve is called a trajectory. A critical point is a point (x∗, y∗) such that

F(x∗, y∗) = G(x∗, y∗) = 0.

Find the critical points of the following autonomous system.

dxdt

= 14x − 2x2 − xy,

dydt

= 16y − 2y2 − xy.

The critical points are (0, 0), (0, 8), (7, 0), (4, 6).

A phase portrait is a picture on the phase plane (xy-plane). It consists of the critical points andtypical solution curves.

Example 40.1. Draw the phase diagram for the following linear system.

ddt~z = A~z,

where ~z = (x, y)τ. A is (1 00 1

),

(2 00 3

),

(−1 00 −2

),

(0 −11 0

),

(0 10 0

)

58

Page 192: MATH320

41 December 14th, Review for the final

The final will cover the following sections.

• Chapter 4.1, 4.2, 4.3, 4.4.

• Chapter 5.1, 5.2, 5.3, 5.4, 5.5.

• Chapter 6.1.

• Chapter 7.1, 7.2, 7.3, 7.5.

• Chapter 8.1, 8.2.

You can go to the Math library to request previous exams for practice. Some limited exams areonline:

http://math.library.wisc.edu/reserve/320.html

However, you need to make sure wheter the problems in the previous exams are covered by theabove list.

The final will take place in Social Science 5206, 7:25-9:25pm, 12/18/2012.

59

Page 193: MATH320

MATH 320: HOMEWORK 1

EP1.1.4 Verify by substitution that each given function is a solution of the differential equation:

y′′ = 9y; y1 = e3x, y2 = e−3x.

Solution. This is a second order ODE, so we need second derivatives:

y′1 = 3e3x, y′2 = −3e3x

y′′1 = 9e3x, y′′2 = 9e3x.

From this we get

y′′1 = 9y1, y′′2 = 9y2.

Thus both functions are solutions.

EP1.1.22 Consider the IVP

eyy′ = 1; y(0) = 0.

First verify y(x) = ln(x + C) satisfies the ODE. Then determine a value of the constant

C so that y(x) satisfies the initial condition. Sketch several typical solutions of the ODE,

then highlight the one that satisfies the initial condition.

Solution. We only need the first derivative of the given function,

y′ =1

x + C.

Substituting this into the ODE, we get

eln(x+C) 1

x + C= (x + C) · 1

x + C= 1.

Thus y(x) is a solution to the ODE.

For the initial condition we solve some basic algebra.

y(0) = 0⇔ ln(0 + C) = 0⇔ C = e0 = 1.

So with C = 1 the initial condition is satisfied.

I have not included the necessary sketches here.

EP1.1.32 Write a differential equation that models the following: The time rate of change of a pop-

ulation P is proportional to the square root of P .

Solution. dPdt = k

√P

EP1.1.33 Write a differential equation that models the following: The time rate of change of the

velocity v of a coasting motorboat is proportional to the square of v.

Solution. dvdt = kv2

1

Page 194: MATH320

2 MATH 320: HOMEWORK 1

EP1.1.34 Write a differential equation that models the following: The acceleration v of a Lamborghini

is proportional to the difference between 250 km/h and the velocity of the car.

Solution. v = k(250− v).

EP1.2.5 Find a function y = f(x) satisfying the given ODE and initial condition

dy

dx=

1√x + 2

; y(2) = −1.

Solution. We integrate directly:

f(x) = −1 +

∫ x

2

1√t + 2

dt = −1 + [log(t + 2)]x2

= −1 + (log(x + 2)− log 4)

and we are done.

EP1.2.20 A particle starts at the origin and travels along the x-axis with the velocity function v(t)

whose graph is shown. Sketch the graph of the resulting position function x(t) for 0 ≤ t ≤10.

Solution. This is just like one of those calculus problems, but they give you a starting point.

Your sketch should look something like the included figure.

EP1.2.43 Arthur Clarke’s The Wind from the Sun describes Diana, a spacecraft propelled by the solar

wind. Its aluminized sail provides it with a constant acceleration of .0001 g = .0098m/s2.

Suppose this spacecraft starts from rest at time t = 0 and simultaneously fires a projectile

(straight ahead in the same direction) that travels at one-tenth of the speed c = 3.0×108m/s

of light. How long will it take the spacecraft to catch up with the projectile, and how far

will it have traveled by then?

Solution. Let xp(t) denote the position of the projectile and xD(t) denote the position of

the spacecraft Diana at time t. The frame of reference that is our best choice places Diana

at position zero at time zero, i.e., xD(0) = 0, and we say the projectile was fired at time

Page 195: MATH320

MATH 320: HOMEWORK 1 3

zero, xp(0) = 0. Since the projectile travels at a constant velocity .1c, its position is given

by xp(t) = (.1c)t. We must find an expression for the position of Diana and find when the

two meet.

We have only Diana’s acceleration, .0001 g, and that its initial velocity is zero (it starts

from rest). From this we can conclude that the velocity at time t is vD(t) = (.0001 g)t.

Velocity is the derivative of position, so we have the IVP

dxDdt

= .0001gt, xD(0) = 0

which is solved by integration:

xD(t) = 0 +

∫ t

0.0001gs ds =

[.0001gs2

2

]t0

= .0005gt2.

The two paths cross when xD(t) = xp(t), so

.0005gt2 = .1ct⇒ t =.1c

.0005g≈ 193878.625 years.

That’s a really long time.

EP1.3.26 Suppose the deer population P (t) in a small forest satisfies the logistic equation

dP

dt= .0225P − .0003P 2.

Construct a slope field and appropriate solution curve to answer the following questions:

If there are 25 deer at time t = 0 and t is measured in months, how long will it take the

number of deer to double? What will be the limiting deer population?

Solution. You can use DFIELD to get an idea of the slope field. By the slope field I have,

it should take about 62 months for the the deer population to double. It also appears that

the limiting population should be 75 deer.

EP1.4.26 Find a particular solution for the IVP

dy

dx= 2xy2 + 3x2y2, y(1) = −1.

Page 196: MATH320

4 MATH 320: HOMEWORK 1

Solution. I like breaking problems into separate pieces.

Step 1: Find the general solution. We separate the variables like so:

1

y2dy = (2x + 3x2) dx.

This integrates easily:

−y−1 = x2 + x3 + C ⇔ y(x) = − 1

x2 + x3 + C.

Step 2: Apply the initial conditions. This leads to an algebraic computation.

−1 =1

12 + 13 + C⇒ C = −3.

Conclusion. The solution to the given initial value problem is y(x) = 1x2+x3−3 .

EP1.4.45 The intensity I of light at a depth of x meters below the surface of a lake staisfies the ODE

dI

dx= −1.4I.

(a) At what depth is the intensity half the intensity I0 at the surface (where x = 0)?

(b) What is the intensity at a depth of 10 m (as a fraction of I0)?

(c) At what depth will the intensity by 1% of that at the surface?

Solution. Before moving to any of the individual parts of the problem note that we will

need to find I(x) by solving the initial value problem. So let’s do that first.

Step 1: Find the general solution. We can separate the variables:

1

IdI = −1.4 dx⇒ log I = −1.4x + C ⇒ I(x) = D exp(−1.4x)

where D is the parameter determined by the initial condition.

Step 2: Apply the initial conditions. In this case we are only given that I(0) = I0.

Substituting this into our expression for I(x) we get

I0 = D exp(−1.4 · 0) = D exp(0) = D.

Conclusion. The function which solves the initial value problem if I(x) = I0 exp(−1.4x).

(a) This question is really asking for what value of x do we have I(x) = I0/2. This is just

solving the equation

I0 exp(−1.4x) = I0/2⇒ −1.4x = log

(1

2

)⇒ x =

log 2

1.4≈ .495.

(b) This is asking for I(10) = I0 exp(−1.4 · 10) ≈ 8.3× 10−7I0.

(c) Very similar to (a). I(x) = .01I0 leads to

exp(−1.4x) = .01⇒ x =log .01

−1.4≈ 1.43.

So the light intensity is 1% of its surface intensity at only 1.43 meters deep.

That about does it.

Page 197: MATH320

MATH 320: EXAM 1 ADDITIONAL PROBLEMS

(1) Verify the function x(t) = 5e2t is a solution to the differential equation x− 3x+ 2x = 0. If

we add the initial condition x(0) = 4, is x(t) a solution to the IVP?

(2) Integration Review. Compute the following integrals:

(a)∫

sin2(x) dx Hint. Use a trig identity

(b)∫ex cosx dx Hint. Use integration by parts twice

Solution.

(a)

∫sin2(x) dx =

∫1

2(1− cos 2x) dx =

1

2

(x− 1

2sin 2x

)+ C

(b)

∫ex cosx dx = ex sinx−

∫ex sinx dx = ex sinx−

(−ex cosx−

∫−ex cosx

)

⇒ 2

∫ex cosx dx = ex(sinx+ cosx) + C ⇒

∫ex cosx dx =

1

2ex(sinx+ cosx) + C

(3) How do we change a first order linear differential equation to begin solving it? (This is just

one line.)

Solution. We multiply the ODE by a generic function, µ(t). That is,

µ(t)y′(t) + µ(t)a(t)y(t) = µ(t)b(t).

(4) Determine the order of the following ODEs, and classify them as linear or nonlinear.

(a) y′′′ + 3y′ − 7y = 0,

(b) x+ 5tx = tan t,

(c) d4ydx4

+ sin y = dydx .

(5) Verify the given function(s) is a solution to the ODE or IVP.

(a) y′′ − y = 0; y(x) = coshx

Solution. y′ = sinhx, y′′ = coshx. Therefore,

y′′ − y = coshx− coshx = 0.

(b) tx− x = t2, x(0) = 0; x(t) = 3t+ t2

1

Page 198: MATH320

2 MATH 320: EXAM 1 ADDITIONAL PROBLEMS

Solution. x = 3 + 2t. Thus

t(3 + 2t)− (3t+ t2) = t2,

so this is a solution to the ODE. Since x(0) = 3 · 0 + 02 = 0, this is a solution to the

IVP.

(c) t2y′′ + 5ty′ + 4y = 0, y(e) = e−2; y1(t) = t−2, y2(t) = t−2 log t.

Solution. y′1 = −2t−3, y′′1 = 6t−4. Therefore,

t2(6t−4) + 5t(−2t−3) + 4(t−2) = (6− 10 + 4)t−2 = 0

so y1 is a solution to the ODE, and y1(e) = e−2 so it solves the IVP.

y′2 = −2t−3 log t+ t−2 · t−1, y′′2 = 6t−4 log t− 2t−3t−1 − 3t−4 = 6t−4 log t− 5t−4. So

t2(6t−4 log t− 5t−4) + 5t(−2t−3 log t+ t−2 · t−1) + 4(t−2 log t)

= (6− 10 + 4)t−2 log t+ (−5 + 5)t−2 = 0.

Therefore, y2 is a solution to the ODE, and y2(e) = e−2 log e = e−2 so it solves the

IVP.

(6) Sketch a direction field and a few integral curves for the following ODEs.

(a) y′ = y(4− y)

(b) y′ = y(y − 4)

(c) y′ = y(y − 3)2

(d) y′ = sin y

(7) Research: Find two ODEs (of any order) from your major and explain the system the

describe. Do not solve these equations or explain how they were derived, simply tell me

what system they are used to model.

(8) Solve the following initial value problems:

(a) t2(1 + x2) + 2xdxdt = 0; x(0) = 1,

Solution. We do a little algebra to rewrite:

2x

1 + x2dx

dt= −t2 ⇒ 2x

1 + x2dx = −t2dt.

Since this is separable, we directly integrate.∫2x

1 + x2dx = −

∫t2dt⇒ log(1 + x2) = −t3/3 + C

⇒ 1 + x2 = De−t3/3 ⇒ x(t) =

√De−t3/3 − 1.

Awesome, now we determine the coefficient D using the initial conditions.

x(0) = 1 =√De−03/3 − 1⇒ 1 = D − 1⇒ D = 2.

Thus, the solution is x(t) =√

2e−t3/3 − 1.

Page 199: MATH320

MATH 320: EXAM 1 ADDITIONAL PROBLEMS 3

(b) dydt + ty = t; y(0) = 2.

Solution. This equation is linear, so we employ those techniques. You multiply by an

integrating factor:

µdy

dt+ µty = t.

We want to take advantage of the product rule, hoping that

d

dt[µy] = µy + µy = µt.

This requires µy + µy = µy + µty ⇒ µ = µt. We solve this by separation, and find

that a solution is µ(t) = exp(t2/2). With this integrating factor, we know that

d

dt

[et

2/2y]

= et2/2t⇒ et

2/2y = et2/2 + C

⇒ y(t) = 1 + Ce−t2/2.

Now, the initial condition requires y(0) = 2 = 1 + Ce0, so C = 1.

(9) What is the equation of the logistic growth model? How should we solve it?

Solution. There are a number of different forms, but I prefer

dP

dt= rP

(1− P

K

).

This is a separable ODE, which we know how to solve.

(10) What are the ideas from basic calculus we take advantage of to solve separable equations

and linear equations? (Each answer is only two words long).

Solution. The solution of separable equations is based on the chain rule.

The solution of separable equations is based on the product rule.

(11) In this course, I presented a theorem on the existence and uniqueness of solutions to the

IVPdy

dt= f(t, y), y(0) = y0

assuming some smoothness conditions for f . The theorem is quite general, and it certainly

includes linear first order ODEs. However, there is a different theorem specific to linear first

order ODEs. What is the advantage of the theorem for linear first order ODEs compared

to the more general theorem?

Solution. Both existence and uniqueness theorems tells us that our solution to the IVP

is only valid only a limited interval. The theorem for linear ODEs has the advantage that

we know exactly how big the interval is. In particular, it states that the solution is valid

in the interval in which a(x) and b(x) are both continuous. The more general merely says

that the interval exists. It may be incredibly tiny, or quite large. The general theorem says

nothing one way or the other.

BD2.2.3 Solve the ODE y′ + y2 sinx = 0.

Solution. We do a little algebra to rearrange the equation as

y−2 dy = − sinx dx.

Page 200: MATH320

4 MATH 320: EXAM 1 ADDITIONAL PROBLEMS

Then we integrate:

−y−1 = cosx+ C.

Now do more algebra to arrive at

y(x) = − 1

cosx+ C.

BD2.2.4 Solve the ODE y′ = (3x2 − 1)/(3 + 2y).

Solution. We do a little algebra to rearrange the equation as

(3 + 2y) dy = (3x2 − 1) dx.

Then we integrate:

3y + y2 = x3 − x+ C.

This is actually a quadratic equation in y if we rewrite as

y2 + 3y + (x− x3 +D) = 0,

so we can solve for y to get

y(x) =−3±

√9− 4x+ x3 + E

2.

BD2.2.5 Solve the ODE y′ = (cos2 x)(cos2 2y).

Solution. We do a little algebra to rearrange the equation as

1

cos2 2ydy = cos2 x dx.

Now use a trig identity to get

sec2 2y dy =1

2(1 + cos 2x) dx.

Then we integrate:1

2tan 2y =

x

2− 1

4sin 2x+ C.

Now do more algebra to arrive at

y(x) =1

2arctan

(x− 1

2sin 2x+D

).

BD2.2.6 Solve the ODE xy′ = (1− y2)1/2.Solution. We do a little algebra to rearrange the equation as

1√1− y2

dy =1

xdx.

Then we integrate:

arcsin y = log x+ C.

Now do more algebra to arrive at

y(x) = sin(log x+ C).

Page 201: MATH320

MATH 320: EXAM 1 ADDITIONAL PROBLEMS 5

BD2.2.21 Solve the IVP y′ = (1 + 3x2)/(3y2 − 6y), y(0) = 1 and determine the interval in which the

solution is valid.

Solution. We do a little algebra to rearrange the equation as

3y2 − 6y dy = 1 + 3x2 dx.

Then we integrate:

y3 − 3y2 = x+ x3 + C.

Applying the initial condition, we get

13 − 3 · 12 = 0 + 0 + C ⇒ C − 2.

So our solution is

y3 − 3y2 = x+ x3 − 2.

We will consider this implicit solution good enough. Now we know this solution is only

valid in the interval between two vertical tangents. These vertical tangents occur when

dy

dx=∞⇒ 3y2 − 6y = 0⇒ y = 0 or y = 2.

So the question is, where on our specific solution does y = 0 and y = 2? We substitute

these values in to find that

03 − 3 · 02 = x+ x3 − 2⇒ x = 1

and

23 − 3 · 22 = x+ x3 − 2⇒ x = −1.

So the interval in which the solution is valid is −1 < x < 1.

BD2.5.3 Sketch f(y) versus y, determine the critical (equilibrium) points, and classify each as stable

or unstable. Draw the phase line, and sketch several graphs of solutions in the ty-plane.

dy

dt= y(y − 1)(y − 2), y0 ≥ 0

Solution. First, the graph of f .

Page 202: MATH320

6 MATH 320: EXAM 1 ADDITIONAL PROBLEMS

The critical points occur when f(y) = 0, and that is y = 0, y = 1, and y = 2. The points

are unstable, stable, and unstable. Now, the phase line and some solutions.

BD2.5.4 Sketch f(y) versus y, determine the critical (equilibrium) points, and classify each one

asymptotically stable or unstable. Draw the phase line, and sketch several graphs of solu-

tions in the ty-plane.

dy

dt= ey − 1, −∞ < y0 <∞

Solution. First, the graph of f .

The critical points occur when f(y) = 0, and that is y = 0. The point is unstable. Now,

the phase line and some solutions.

Page 203: MATH320

MATH 320: EXAM 1 ADDITIONAL PROBLEMS 7

BD2.5.22 Suppose that a given population can be divided into two parts: those who have a given

disease and can infect others, and those who do not have it but are susceptible. Let x be the

proportion of susceptible individuals and y the proportion of infectious individuals; then

x+y = 1. Assume that the disease spreads by contact between sick and well members of the

population and that the rate of spread dydt is proportional to the number of such contacts.

Further, assume that members of both groups move about freely among each other, so the

number of contacts is proportional to the product of x and y. Since x = 1 − y, we obtain

the IVPdy

dt= αy(1− y), y(0) = y0,

where α is a positive proportionality factor, and y0 is the initial proportion of infectious

individuals.

(a) Find the equilibrium points for the differential equation and determine whether each

is asymptotically stable, semistable, or unstable.

(b) Solve the IVP and verify that the conclusions you reached in part (a) are correct. Show

that y(t)→ 1 as t→∞, which means that ultimately the disease spreads through the

entire population.

Solution. Here we go.

(a) The equilibrium points occur when y(y − 1) = 0, so for y = 0 and y = 1. Doing a

phase line we can determine y = 1 is stable and y = 0 is unstable.

(b) This is a separable equation:

dy

y(1− y)= αdt.

Page 204: MATH320

8 MATH 320: EXAM 1 ADDITIONAL PROBLEMS

Now we integrate, using partial fractions.∫1

y− 1

y − 1dy = αt+ C ⇒ log

(y

y − 1

)= αt+ C

⇒ y =Deαt

Deαt + 1

Now applying the initial conditions, we find that

y(0) = y0 =D

D − 1⇒ D =

y0y0 − 1

.

Thus, we can do a little algebra to find the specific solution to the IVP

y(t) =y0e

αt

y0eαt − y0 + 1=

y0y0 − (y0 − 1)e−αt

.

We said in (a) that the only stable equilibrium point is 1, so all solutions should tend

to 1 (that is for positive y0, and this is a population problem). To verify this we note

that the limit as t→∞ is

limt→∞

y(t) =

1 : y0 > 0

0 : y0 = 0

BD2.5.28 A second order chemical reaction involves the interaction (collision) of one molecule of

a substance P with one molecule of a substance Q to produce one molecule of a new

substance X; this is denoted P + Q → X. Suppose that p and q, where p 6= q, are the

initial concentrations of P and Q, respectively, and let x(t) be the concentration of X at

time t. Then p − x(t) and q − x(t) are the concentrations of P and Q at time t, and the

rate at which the reaction occurs is given by the equation

dx

dt= α(p− x)(q − x),

where α is a positive constant.

(a) If x(0) = 0, determine the limiting value of x(t) as t → ∞ without solving the differ-

ential equation. Then solve the IVP and x(t) for any t.

(b) If the substances P and Q are the same, the p = q and the above ODE is replaced by

dx

dt= α(p− x)2.

If x(0) = 0, determine the limiting value of x(t) as t → ∞ without solving the differ-

ential equation. Then solve the IVP and x(t) for any t.

Solution. I recommend making a phase line.

(a) The equilibrium points are x = p and x = q. Without loss of generality we may assume

p < q. In this case the solution to the IVP has the limit

limt→∞

x(t) = p.

Page 205: MATH320

MATH 320: EXAM 1 ADDITIONAL PROBLEMS 9

We solve using separation of variables and partial fractions.

dx

(p− x)(q − x) = αdt⇒ 1

q − p(log(p− x)− log(q − x)) = αt+ C

⇒ x(t) =qDe(q−p)αt − pDe(q−p)αt − 1

Now we apply the initial condition.

x(0) = 0 =qD − pD − 1

⇒ D =p

q

So our final formula is

x(t) =pe(q−p)αt − pqe(q−p)αt/q − 1

=pq(e(q−p)αt − 1)

qe(q−p)αt − p.

(b) The equilibrium point is x = p. This point is semistable, particularly for an initial

condition starting below x = p, such as our initial condition x(0) = 0. In other words,

limt→∞

x(t) = p.

Now we solve our ODE:

dx

(p− x)2= αdt⇒ 1

p− x= αt+ C

Applying our IC right now, we find that

1

p= C

and go back to solving our ODE. Making this substitution, we solve for x and find

x(t) =p2αt+ p

pαt+ 1

Page 206: MATH320

MATH 320: PRACTICE EXAM 1

• This is merely a practice exam. It is not meant to be your sole source of study.

• You should aim to complete this in 50 minutes without consulting any references.

• Just to reiterate: THIS CANNOT BE YOUR ONLY RESOURCE FOR STUDYING. Con-

sult your class notes and textbook for additional examples and practice problems.

(1) Find the general/specific solutions to the following ODE/IVPs.

(a) y′ − 7y = sin 2x

(b) dxdt = 2t(x2 + 9)

(c) 2xe2t + (1 + e2t)dxdt = 0, x(1) = −2

Solution. The tricky part is choosing the correct method.

(a) We may use our formulas for linear equations directly in this case:

µ(x) = exp

(∫−7 dx

)= e−7x

and

y(x) =

∫e−7x sin 2x dx

µ(x)= − 1

53(7 sin(2x) + 2 cos(2x)) + Ce−7x.

(b) Separate the variables:

dx

x2 + 9= 2t dt⇒ 1

3arctan

(x3

)= t2 + C.

Now do the algebra for an explicit solution:

x(t) = 3 tan(3t2 + C.)

(c) This equation is separable

dx

x= − 2e2t

1 + e2tdt

⇒ log x = − log(1 + e2t) + C

⇒ x(t) = Delog(1+e2t)−1=

D

1 + e2t.

Applying the initial condition results in

−2 =D

1 + e2⇒ D = −2(1 + e2).

Thus the solution is

x(t) =−2(1 + e2)

1 + e2t

for any t.

1

Page 207: MATH320

2 MATH 320: PRACTICE EXAM 1

(2) What formula from basic calculus is being invoked each time separation of variables is used

to solve a separable ODE? Which formula is invoked for the use of an integrating factor for

1st order linear ODEs?

Solution. Separation of variables uses the chain rule, and the product rule is used for 1st

order linear ODEs.

(3) Consider the ODEdy

dt= y(y − 1)(y − 3).

Find the equilibrium points of this autonomous ODE and create a phase line to classify

their stability. Sketch a few solution curves for various initial conditions. Then solve the

ODE with initial condition y(0) = 3.

Solution. The equilibrium points are y = 0, y = 1, and y = 3. Using f(y) = y(y− 1)(y− 3)

as a guide for our phase line, we find have the following phase line:

We can use this phase line as a guide to sketch a few solution curves:

For the explicit solution we use separation of variables:

dy

y(y − 1)(y − 3)= dt⇒

(1

3y− 1

2(y − 1)+

1

6(y − 3)

)dy = dt

Page 208: MATH320

MATH 320: PRACTICE EXAM 1 3

Note that we used partial fractions here. Now integrate to find

1

3log y − 1

2log(y − 1) +

1

6log(y − 3) = t+ C

⇒ 2 log y − 3 log(y − 1) + log(y − 3) = 6t+ C.

A bit of algebra leads toy2(y − 3)

(y − 1)3= De6t.

Applying the initial condition gives

4(2− 3)

(2− 1)

3

= De0 ⇒ D = −4.

So the (implicit) specific solution is

y2(y − 3)

(y − 1)3= −4e6t.

An important fact to observe here is that determining stability of equilibrium points using

this solution would be extremely difficult. In this instance it seems not only easier, but

necessary to use the phase line to examine such issues.

(4) Suppose we have a plate of enchiladas at 300 in a room that is 75. After a minute out of

the oven, the temperature has cooled to 250.

(a) Assume the rate at which the temperature changes is directly proportional to the

difference between the room temperature and enchilada temperature. Write down the

differential equation describing the time evolution of the temperature of the enchiladas.

Define any variables you introduce.

(b) Find the formula that gives the temperature of the enchiladas for any time, t.

(c) How long until the enchiladas cool to 125? Based on this, would you say we have an

accurate temperature model?

Solution. This follows my previous lectures.

(a) dTedt = −k(Te − 75). Here Te is the temperature of the enchiladas and k is simply a

constant of proportionality. Note that the equation comes from Newton’s cooling law.

(b) This means solving the above equation using separation of variables.

dTeTe − 75

= −k dt⇒ log(Te − 75) = −kt+ C

⇒ Te(t) = De−kt + 75.

Applying the initial condition, we find that

Te(0) = 300 = De0 + 75⇒ D = 225.

To determine k, we apply what we know from a minute later:

Te(1) = 250 = 225e−k + 75⇒ k = − log

(7

9

).

Page 209: MATH320

4 MATH 320: PRACTICE EXAM 1

Putting these pieces together, we see that

Te(t) = 225et log(7/9) + 75 = 225

(7

9

)t

+ 75.

(c) This is now just a little algebra:

225et log(7/9) + 75 = 125⇒ t log(7/9) = log(2/9)⇒ t =log(2/9)

log(7/9)≈ 5.98.

That seems like a reasonable amount of time, so I would say the model is okay.

(5) Consider the linear system

x1 + 2x2 + x3 = 0

−2x1 − 3x2 + x3 = 0

3x1 + 5x2 = 0.

Rewrite this system as a matrix, and use Gaussian elimination to find the solutions to this

system.

Solution. The matrix form of this system and the row operations are

1 2 1 0

−2 −3 1 0

3 5 0 0

2R1+R2→R2−−−−−−−−→1 2 1 0

0 1 3 0

3 5 0 0

−3R1+R3→R3−−−−−−−−−→1 2 1 0

0 1 3 0

0 −1 −3 0

R2+R3→R3−−−−−−−→1 2 1 0

0 1 3 0

0 0 0 0

The system must have infinitely many solutions, so we start by parameterizing x3 = t.

Using row 2 of the matrix we have x2 = −3x3 = −3t. Going back up to row 1 gives

x1 = −2x2 − x3 = 6t+ t = 7t. So

x1 = 7t

x2 = −3t

x3 = t

Page 210: MATH320

Math 320 Spring 2009Part III – Linear Systems of Diff EQ

JWR

May 14, 2009

1 Monday March 30

The Existence and Uniqueness Theorem for Ordinary Differential Equationswhich we studied in the first part of the course has a vector version which issill valid. Here is the version (for a single equation) from the first part of thecourse, followed by the vector version.

Theorem 1 (Existence and Uniqueness Theorem). Suppose that f(t, y) is acontinuous function of two variables defined in a region R in (t, y) plane andthat the partial ∂f/∂y exists and is continuous everyhere in R. Let (t0, y0)be a point in R. Then there is a solution y = y(t) to the initial value problem

dy

dt= f(t, y), y(t0) = y0

defined on some interval I about t0. The solution is unique in the sense thatany two such solutions of the initial value problem are equal where both aredefined.

Theorem 2 (Existence and Uniqueness Theorem for Systems). Assume thatf(t,x) is a (possibly time dependent) vector field on Rn, i.e. a function whichassigns to each time t and each vector x = (x1, . . . , xn) in Rn a vector f(t,x)in Rn. Assume that f(t,x) is continuous in (t,x) and that the partial deriva-tives in the variables xi are continuous. Then for each initial time t0 andeach point x0 in Rn the initial value problem

dx

dt= f(t,x), x(t0) = x0

1

Page 211: MATH320

defined on some interval I about t0. The solution is unique in the sense thatany two such solutions of the initial value problem are equal where both aredefined.

Proof. See Theorem 1 on page 682 in Appendix A.3 and Theorem 1 onpage 683 in Appendix A.4 of the text.

3. For the rest of this course we will study the special case of linear systemwhere the vector field f(t,x) has the following form In that case the vectorfield f(t,x) has the form

f(t,x) = −A(t)x + b(t)

where A(t) is a continous n × n matrix valued function of t and b(t) is acontinuous vector valued function of t with values in Rn. A system of thisform is called a linear system of differential equations. We shall movethe term −A(t)x to the other side so the system takes the form

dx

dt+ A(t)x = b(t) (1)

In case the the right hand side b is identically zero we say that the systemis homogeneous otherwise it is called inhomogeneous or non homoge-neous. If

x =

x1

x2...

xn

, A =

a11 a12 . . . a1n

a21 a22 . . . a2n...

.... . .

...an1 an2 . . . ann

, b =

b1

b2...bn

.

Then f = (f1, . . . , fn) where fi = −ai1(t)x1 − ai2(t)x2 − · · · − ain(t)xn + bi(t)and the system (1) can be written as n equations

dxi

dt+ ai1(t)x1 + ai2(t)x2 + · · ·+ ain(t)xn = bi(t), i = 1, 2, . . . n (2)

in n unknowns x1, . . . , xn where the aij and bi(t) are given functions of t. Forthe most part we shall study the case where the coefficients aij are constant.Using matrix notation makes the theory of equation (1) look very much likethe theory of linear first order differential equations from the first part of thiscourse.

2

Page 212: MATH320

4. In the case of linear systems of form (2) the partial derivatives are au-tomatically continuous. This is because ∂fi/∂xj = −aij(t) and the matrixA(t) is assumed to be a continuous function of t. Hence the Existence andUniqueness Theorem 2 apples. But something even better happens. In thegeneral case of a nonlinear system solutions can become infinite in finite time.For example (with n = 1) the solution to the nonlinear equation dx/dt = x2

is x = x0/(1− x0t) which becomes infinite when t = 1/x0. But the followingtheorem says that in the linear case this doesn’t happen.

Theorem 5 (Existence and Uniqueness for Linear Systems). Let t0 be areal number and x0 be a point in Rn then the differential equation (1) has aunique solution x defined for all t satisfying the initial condition x(t0) = x0.

Proof. See Theorem 1 on page 399 of the text and Theorem 1 page 681 ofAppendix A.2.

Definition 6. A nth order linear differential equation is of form

dny

dtn+ p1(t)

dn−1y

dtn−1+ · · ·+ pn−1(t)

dy

dt+ pn(t)y = f(t) (3)

where the functions p1(t), . . . , pn(t), and f(t) are given and the function y isthe unknown. If the function f(t) vanishes identically, the equation is calledhomogeneous otherwise it is called inhomogeneous. For the most partwe shall study the case where the coefficients p1, . . . , pn are constant.

7. The text treats nth order equations in chapter 5 and systems in chapter 7,but really the former is a special case of the latter. This is because afterintroducing the variables

xi =di−1y

dti−1

the equation (3) becomes the system

dxi

dt− xi+1 = 0, i = 1, . . . , n− 1

anddxn

dt+ p1(t)xn−1 + · · ·+ pn−2(t)x2 + pn(t)x1 = f(t).

For example the 2nd order equation

d2y

dt2+ p1(t)

dy

dt+ p2(t)y = f(t)

3

Page 213: MATH320

becomes the linear systemdx1

dt− x2 = 0

dx2

dt+ p1(t)x2 + p2(t)x1 = f(t)

in the new variables x1 = y, x2 = dy/dt. In matrix notation this linearsystem is

d

dt

[x1

x2

]+

[0 −1

p2(t) p1(t)

] [x1

x2

]=

[0

f(t)

].

For this reason the terminology and theory in chapter 5 is essentially thesame as that in chapter 7.

Theorem 8 (Existence and Uniqueness for Higher Order ODE’s). If thefunctions p1(t), . . . , pn−1(t), pn(t), f(t) are continuous then for any given num-bers t0, y0, . . . , yn−1 the nth order system (3) has a unique solution definedfor all t satisfying the initial condition

y(t0) = y0, y′(t0) = y1, . . . , y(n−1)(t0) = yn−1

Proof. This is a corollary of Theorem 5. See Theorem 2 page 285, Theorem 2page 297.

Theorem 9 (Superposition). Suppose A(t) is a continuous n × n matrixvalued function of t. Then the solutions of the homogeneous linear system

dx

dt+ A(t)x = 0 (4)

of differential equations form a vector space In particular, the solutions of ahigher order homogeneous linear differential equation form a vector space.

Proof. The show that the set of solutions is a vector space we must checkthree things:

1. The constant function 0 is a solution.

2. The sum x1 + x2 of two solutions x1 and x2 is a solution.

3. The product cx of a constant c and a solution x is a solution.

4

Page 214: MATH320

(See Theorem 1 page 283, Theorem 1 page 296, and Theorem 1 page 406 ofthe text.)

Theorem 10 (Principle of the Particular Solution). Let xp is a solution isa particular solution of the non homogeneous linear system

dx

dt+ A(t)x = b(t).

Then if x is a solution of the corresponding homogeneous linear system

dx

dt+ A(t)x = 0

then x+xp solves the nonhomogeneous system and conversely every solutionof the nonhomogeneous system has this form.

Proof. The proof is the same as the proof of the superposition principle.The text (see page 490) says it a bit differently: The general solution ofthe nonhomogeneous system is a particilar solution of the nonhomogeneoussystem plus the general solution of the corresponding homogeneous system.

Corollary 11. Let yp be a solution is a particular solution of the non homo-geneous higher order differential equation

dny

dtn+ p1(t)

dn−1y

dtn−1+ · · ·+ pn−1(t)

dy

dt+ pn(t)y = f(t).

Then if y is a solution of the corresponding homogeneous higher order differ-ential equation

dny

dtn+ p1(t)

dn−1y

dtn−1+ · · ·+ pn−1(t)

dy

dt+ pn(t)y = 0

then y + yp solves the nonhomogeneous differential equation and converselyevery solution of the nonhomogeneous differential equation has this form.

Proof. Theorem 4 page 411.

Theorem 12. The vector space of solutions of (4) has dimension n. Inparticular, the vector space of solutions of (3) has dimension n.

5

Page 215: MATH320

Proof. Let e1, e2, . . . , en be the standard basis of Rn. By Theorem 2 there isa unique solution xi of (4) satisfying xi(0) = ei. We show that these solutionsform a basis for the vector space of all solutions.

The solutions x1,x2, . . . ,xn span the space of solutions. If x is any solu-tion, the vector x(0) is a vector in Rn and is therefore a linear combinationx(0) = c1e1 + c2e2 + · · · + cnen of e1, e2, . . . , en. Then c1x1(t) + c2x2(t) +· · · + cnxn(t) is a solution (by the Superposition Principle) and agrees withx at t = 0 (by construction) so it must equal x(t) for all t (by the Existenceand Uniqueness Theorem).

The solutions x1,x2, . . . ,xn are independent. If c1x1(t) + c2x2(t) + · · ·+cnxn(t) = 0 then evaluating at t = 0 gives c1e1 + c2e2 + · · · + cnen = 0 soc1 = c2 = · · · = cn = 0 as e1, e2, . . . , en are independent.

2 Wednesday April 1, Friday April 3

13. To proceed we need to understand the complex exponential function.Euler noticed that when you substitute z = iθ into the power series

ez =∞∑

n=0

zn

n!(5)

you get

eiθ =∞∑

n=0

inθn

n!

=∞∑

k=0

i2kθ2k

(2k)!+

∞∑k=0

i2k+1θ2k+1

(2k + 1)!

=∞∑

k=0

(−1)kθ2k

(2k)!+ i

∞∑k=0

(−1)kθ2k+1

(2k + 1)!

= cos θ + i sin θ.

This provides a handy way of remembering the trigonometric addition for-mulas:

ei(α+β) = cos(α + β) + i sin(α + β)

and

eiαeiβ =(cos α + i sin α

)(cos +β + i sin β

)= (cos α cos β − sin α sin β) + i(sin α cos β + cos α sin β)

6

Page 216: MATH320

so equating the real and imaginary parts we get

cos(α + β) = cos α cos β − sin α sin β, sin(α + β) = sin α cos β + cos α sin β.

Because cos(−θ) = cos θ and sin(−θ) = − sin θ we have

cos θ =eiθ + e−iθ

2, sin θ =

eiθ − e−iθ

2i.

Note the similarity to

cosh t =et + et

2, sinh t =

et − e−t

2.

It is not difficult to prove that the series (5) converges for complex numbersz and that ez+w = ezew. In particular, if z = x + iy where x and y are realthen

ez = exeiy = ex(cos y + i sin y) = ex cos y + iex sin y

so the real and imaginary parts of ez are given

<ez = ex cos y =ez + ez

2, =ez = ex sin y =

ez − ez

2i.

14. We will use the complex exponential to find solutions to the nth orderlinear homgeneous constant coefficient differential equation

dny

dtn+ p1

dn−1y

dtn−1+ · · ·+ pn−1

dy

dt+ pny = 0 (6)

As a warmup let’s find all the solutions of the linear homogeneous 2nd orderequation

ady

dt2+ b

dy

dt+ cy = 0

where a, b, c are constants and a 6= 0. (Dividing by a puts the equation inthe form (3) with n = 1 and p1 = b/a and p2 = c/a constant.) As an ansatzwe seek a solution of form y = ert. Substituting gives

(ar2 + br + c)ert = 0

so y = ert is a solution iff

ar2 + br + c = 0, (7)

7

Page 217: MATH320

i.e. if r = r1 or r = r2 where

r1 =−b +

√b2 − 4ac

2a, r2 =

−b−√

b2 − 4ac

2a.

The equation (7) is called the charactistic equation (or sometimes theauxilliary equation). There are three cases.

1. b2 − 4ac > 0. In this case the solutions r1 and r2 are distinct and realand for any real numbers c1 and c2 the function y = c1e

r1t + c2er2t

satisfies the differential equation.

2. b2 − 4ac < 0. In this case the roots are complex conjugates:

r1 = r =−b

2a+ i

√4ac− b2

2a, r2 = r =

−b

2a− i

√4ac− b2

2a.

The functions ert and ert are still solutions of the equation because cal-culus and algebra works the same way for real numbers as for complexnumbers. The equation is linear so linear combinations of solutions aresolutions so the real and imaginary parts

y1 =ert + ert

2, y2 =

ert − ert

2i

of solutions are solutions and for any real numbers c1 and c2 the functiony = c1y1 + c2y2 is a real solution.

3. b2 − 4ac = 0. In this case the characteristic equation (7) has a dou-ble root so aD2 + bD + c = a(D − r)2 where r = b/(2a). (For themoment interpret D as an indeterminate; we’ll give another interpre-tation later.) It is easy to check that both ert and tert are solutions soy = c1e

rt + c2tert is a solution for any constants c1 and c2.

Example 15. For any real numbers c1 and c2 the function

y = c1e2t + c2e

3t

is a solution the differential equation

d2y

dt2− 5

dy

dt+ 6y = 0.

8

Page 218: MATH320

Example 16. For any real numbers c1 and c2 the function

y = c1et cos t + c2e

t sin t

is a solution the differential equation

d2y

dt2− 2

dy

dt+ 2y = 0.

Example 17. For any real numbers c1 and c2 the function

y = c1et + c2te

t

is a solution the differential equation

d2y

dt2− 2

dy

dt+ y = 0.

3 Friday April 3 – Wednesday April 8

18. It is handy to introduce operator notation. If

p(r) = rn + p1rn−1 + · · ·+ pn−1r + pn

and y is a function of t, then

p(D)y :=dny

dtn+ p1

dn−1y

dtn−1+ · · ·+ pn−1

dy

dt+ pny

denotes the left hand side of equation (6). This gives the differential equationthe handy form p(D)y = 0. The characteristic equation of this differentialequation is the algebraic equation p(r) = 0, i.e.

y = ert =⇒ p(D)y = p(D)ert = p(r)ert = p(r)y

so y = ert is a solution when p(r) = 0. When p(x) = xk we have thatp(D)y = Dky, i.e.

Dky =dky

dtk=

(d

dt

)k

y

is the result of differentiating k times. Here is what makes the whole theorywork:

9

Page 219: MATH320

Theorem 19. Let pq denote the product of the polynomial p and the poly-nomial q, i.e.

(pq)(r) = p(r)q(r).

Then for any function y of t we have

(pq)(D)y = p(D)q(D)y

where D = d/dt.

Proof. This is because D(cy) = cDy if c is a constant.

Corollary 20. p(D)q(D)y = q(D)p(D)y.

Proof. pq = qp.

21. If q(D) = 0 then certainly pq(D)y = 0 by the theorem and if p(D)y = 0then pq(D)y = 0 by the theorem and the corollary. This means that wecan solve a homogeneous linear constant coefficient equation by factoringthe characteristic polynomial.

Theorem 22 (Fundamental Theorem of Algebra). Every polynomial has acomplex root.

Corollary 23. Every real polynomial p(r) factors as a product of (possiblyrepeated) linear factors r− c where c is real and quadratic factors ar2 + br+ cwhere b2 − 4ac < 0.

24. A basis for the solution space of (D − r)ky = 0 is

ert, tert, . . . , tk−1ert.

A basis for the solution space of (aD2 + bD + c)ky = 0 is

ept cos(qt), ept sin(qt), . . . , tk−1ept cos(qt), tk−1ept sin(qt)

where r = p± qi are the roots of ar2 + br + c, i.e. p = −b2a

and q =√

b2−4ac2a

.

Example 25. A basis for the solutions of the equation

(D2 + 2D + 2)2(D − 1)3y = 0

iset cos t, et sin t, tet cos t, tet sin t, et, tet, t2et.

10

Page 220: MATH320

26. This reasoning can also be used to solve inhomogeneous constant coeffi-cient linear differential equation

p(D)y = f(t)

where the homogeneous term f(t) itself solves solves a homogeneous constantcoefficient linear differential equation

q(D)f = 0.

This is because any solution y of p(D)y = f(t) will then solve the homo-geneous equation (qp)(D)y = 0 and we can compute which solutions of(qp)(D)y = 0 also satisfy p(D)y = f(t) by the method of undeterminedcoefficients. Here’s an

Example 27. Solve the initial value problem

dy

dt+ 2y = e3t, y(0) = 7.

Since e3t solves the problem (D − 3)e3t = 0 we can look for the solutions to(D + 2)y = e3t among the solutions of (D − 3)(D + 2)y = 0. These all havethe form

y = c1e−2t + c2e

3t

but not every solution homogeneous second order equation solves the originalfirst order equation. To see which do, we plug in

(D + 2)(c1e−2t + c2e

3t) = (D + 2)c2e3t = (3c2 + 2c2)e

3t = e3t

if and only if c2 = 1/5. Thus yp = e3t/5 is a particular solution of theinhomogeneous equation. By the Principle of the Particular Solution abovethe general solution to the inhomogeneous equation is the particular solutionplus the general solution to the homogeneous problem, i.e.

y = c1e−2t +

e3t

5

To satisfy the initial condition y(0) = 7 we must have 7 = c1+1/5 or c1 = 6.8.the solution we want is y = 6.8e−2t + 0.2e3t.

11

Page 221: MATH320

4 Friday April 10, Monday April 13

28. The displacement from equilibrium y of a object suspended fromthe ceiling by a spring is governed by a second order differential equation

md2y

dt2+ c

dy

dt+ ky = F (t) (8)

where m is the mass of the object, c is a constant called the dampingconstant, k is a constant of proportionality called the spring constant,and F (t) is a (generally time dependent) external force. The constants mand k are positive and c is nonnegative. This is of course Newton’s law

Ftot = ma, a :=d2y

dt2

where the total force Ftot is the sum Ftot = FS +FR +FE of three terms, viz.the spring force

FS = −ky, (9)

the resistive force

FR = −cdy

dt, (10)

and the external forceFE = F (t).

The spring force FS is the combined force of gravity and the force the springexerts to restore the mass to equilibrium. Equation (9) says that the sign ofthe spring force FS is opposite to the sign of the displacement y and henceFS pushes the object towards equilibrium. The resistive FR is the resistiveforce proportional to the velocity of the object. Equation (10) says that thesign of the resistive force FR is opposite to the sign of the velocity dy/dtand hence FR slows the object.. In the text, the resistive force FR is oftenascribed to a dashpot (e.g. the device which prevents a screen door fromslamming) but in problems it might be ascribed to friction or to the viscosityof the surrounding media. (For example, if the body is submersed in oil,the resistive force FR might be significant.) The external force will generallyarise from some oscillation, e.g.

F (t) = F0 cos(ωt) (11)

and might be caused by the oscillation of the ceiling or of a flywheel. (Seefigure 5.6.1 in the text.)

12

Page 222: MATH320

29. Equation 9 is called Hooke’s law. It can be understood as follows. Theforce FS depends solely of the position y of the spring. For y near y0 theFundamental Idea of Calculus says that the force FS(y) is well approximatedby the linear function whose graph is the tangent line to the graph of FS aty0, i.e.

FS(y) = FS(y0) + F ′S(y0)(y − y0) + o(y − y0) (12)

where the error term o(y − y0) is small in the sense that

limy→y0

o(y − y0)

y − y0

= 0.

The fact that equilibrium occurs at y = y0 means that FS(y0) = 0, i.e.(assuming FE is zero) i.e. if the object is at rest at position y = y0 thenNewton’s law F = ma holds. The assertion that y is displacement fromequilibrium means that y0 = 0, i.e. we are measuring the position of theobject as its signed distance from its equilibrium position as opposed to sayits height above the floor or distance from the ceiling. From this point ofview Hooke’s law is the approximation that arises when we ignore the errorterm o(y− y0) in (12). This is the same reasoning that leads to the equation

d2θ

dtt+

g

Lθ = 0

as an approximation to the equation of motion

d2θ

dtt+

g

Lsin θ = 0

for the simple pendulum. (See page 322 of the text.)On the other hand, some of the assigned problems begin with a sentence

like A weight of 5 pounds stretches a spring 2 feet. In this problem there areapparently two equilibria, one where the weight is not present and anotherwhere it is. In this situation the student is supposed to assume that theforce FS is a linear (not just approximately linear) function of the position,so that the spring constant k is 5/2 (i.e. the slope of the line) and that theequilibrium position occurs where the weight is suspended at rest.

30. The energy is defined as the sum

E :=mv2

2+

ky2

2, v :=

dy

dt

13

Page 223: MATH320

of the kinetic energy mvv2/2 and the potential energy ky2/2. If theexternal force FE is zero, the energy satisfies

dE

dt= mv

dv

dt+ ky

dy

dt= v

(m

d2y

de2+ ky

)= −c

(dy

dt

)2

When c (and FE) are zero, dE/dt = 0 so E is constant (energy is conserved)while if c is positive, dE/dt < 0 so E is decreasing (energy is dissipated).When we solve the equation (with FE = 0) we will see that the motion isperiodic if c = 0 and tends to equilibrium as t becomes infinite if c > 0.

31. It is important to keep track of the units in doing problems of this sortif for no other reason than that it helps avoid errors in computation. Wenever add two terms if they have different units and whenever a quantityappears in a nonlinear function like the sine or the exponential function,that quantity must be unitless. In the metric system the various terms haveunits as follows:

• y has the units of length: meters (m).

• dy

dthas the units of velocity: meters/sec.

• d2y

dt2has the units of acceleration: meters/sec2.

• m has the units of mass: kilograms (kg).

• F has the units of force: newtons = kg·m/sec2.

Using the principle that quantities can be added or equated only if they havethe same units and that the units of a product (or ratio) of two quantitiesis the product (or ratio) of the units of the quantities we see that c has theunits of kg·m/sec and that k has the units of kg/sec2. The quantity

ω0 :=

√k

m

thus has the units of frequency: sec−1 which is good news: When c = 0 andFE = 0 equation (8) becomes the harmonic oscillator equation

d2y

dt2+ ω2

0y = 0 (13)

14

Page 224: MATH320

(we divided by m) and the solutions are

y = A cos(ω0t) + B sin(ω0t). (14)

(The input ω0t to the trigonometric functions is unitless.)

Remark 32. When using English units (lb, ft, etc.) you need to be a bitcareful with equations involving mass. Pounds (lb) is a unit of force, notmass. Using mg = F and g=32 ft/sec2 we see that an object at the surfaceof the earth which weighs 32 lbs (i.e. the force on it is 32 lbs) will have amass of 1 slug1 So one slug weighs 32 lbs at the surface of the earth (or lb =(1/32)·slug·ft/sec2). When using metric units, kilogram is a unit of mass notforce or weight. A 1 kilogram mass will weigh 9.8 newtons on the surface ofthe earth. (g= 9.8 m/sec2 and newton = kg·m/sec2 ). Saying that a mass“weighs” 1 kilogram is technically incorrect usage, but it is often used. Whatone really means is that it has 1 kilogram of mass and therefore weighs 9.8newtons.

33. Consider the case where the external force FE is not present. In thiscase equation (8) reduces to the homogeneous problem

md2y

dt2+ c

dy

dt+ ky = 0 (15)

The roots of the characteristic equation are

r1 =−c +

√c2 − 4mk

2m, r1 =

−c−√

c2 − 4mk

2m.

We distinguish four cases.

1. Undamped: c = 0. The general solution is

y = A cos(ω0t) + B sin(ω0t), ω0 :=

√k

m

2. Under damped: c2 − 4mk < 0. The general solution is

y = e−pt (A cos(ω1t) + B sin(ω0t)) , p :=c

2m, ω1 :=

√ω2

0 − p2

1The unit of mass in the English units is called the slug – really!

15

Page 225: MATH320

3. Critically damped: c2 − 4mk = 0. The general solution is

y = e−pt (c1 + c2t)

4. Over damped: c2 − 4mk. The general solution is

y = c1er1t + c2e

r2t.

In the undamped case (where c = 0) the motion is oscillatory and the limit ofy(t) as t becomes infinite does not exists (except of course when A = B = 0).In the three remaining cases (where c > 0) we have limt→∞ y = 0. In case 3this is because

limt→∞

te−pt = 0

(in a struggle between an exponential and a polynomial the exponential wins)while case 4 we have r2 < r1 < 0 because

√c2 − 4mk < c. In cases 1 and 2

we can define α and C by

C :=√

A2 + B2, cos α =A

C, sin α =

B

C

and the solution takes the form

y = Ce−pt cos(ω1t− α).

In the undamped case p = 0 and the y is a (shifted) sine wave with ampli-tude C, while in the under damped case the graph bounces back and forthbetween the two graphs y = ±e−pt.

34. Now consider the case of forced oscillation where the external forceis given by (11). The right hand side F (t) = F0 cos(ωt) of (8) solves theODE (D2 + ω2)F = 0 so we can solve using the method of undeterminedcoefficients. We write equation (8) in the form

(mD2 + cD + k)y = F0 cos ωt (16)

and observe that every solution of this inhomogeneous equation satisfies thehomogeneous equation

(D2 + ω2)(mD2 + cD + k)y = 0.

The solutions of this last equation all can be written as y +yp where (mD2 +cD + k)y = 0 and (D2 + ω2)yp. We know all the solutions of the former

16

Page 226: MATH320

equation by the previous paragraph and the most general solution of thelatter is

yp = b1 cos ωt + b2 sin ωt.

We consider three cases.

(i) c = 0 and ω 6= ω0. The function yp satisfies (16) iff

(k −mω2)b1 cos ωt + (k −mω2)b2 sin ωt = F0 cos ωt

which (since the functions cos ωt and sin ωt are linearly independent) canonly be true if b2 = 0 and b1 = F0/(k−mω2). Our particular solution is thus

yp =F0 cos ωt

k −mω2=

F0 cos ωt

m(ω20 − ω2)

(ii) c = 0 and ω = ω0. In this case we can look for a solution ihe formyp = b1t cos ω0t+ b2t sin ω0t but it is easier to let ω tend to ω0 in the solutionwe found in part (i). Then by l’Hopital’s rule (differentiate top and bottomwith respect to ω) we get

yp = limω→ω0

F0 cos ωt

m(ω20 − ω2)

=F0t sin ω0t

2mω0

The solution yp bounces back and forth between the two lines y = ±F0t/(2m).The general solution in this case is (by the principle of the particular solution)the general solution of the homogeneous system plus the solution yp and theformer remains bounded. Thus every solution oscillates wildly as t becomesinfinite. This is the phenomenon of resonance and is the cause of manyengineering disasters. (See the text page 352.)

(iii) c > 0. The high school algebra in this case is the most complicatedbut at least we know that there are no repeated roots since the roots ofr2 + ω2 = 0 are pure imaginary and the roots of mr2 + cr + k = 0 are not.The function yp satisfies (16) iff(

(k −mω2)b1 + cωb2

)cos ωt +

(−cωb1 + (k −mω2)b2

)sin ωt = F0 cos ωt

so

(k −mω2)b1 + cωb2 = F0 and (−cωb1 + (k −mω2)b2 = 0.

17

Page 227: MATH320

In matrix form this becomes[k −mω2 cω−ω k −mω2

] [b1

b2

]=

[F0

0

]and the solution is[

b1

b2

]=

[k −mω2 cω−cω k −mω2

]−1 [F0

0

]=

1

(k −mω2)2 + (cω)2

[k −mω2 −cω

cω k −mω2

] [F0

0

]=

F0

(k −mω2)2 + (cω)2

[k −mω2

]so our particular solution is

yp =F0

(k −mω2)2 + (cω)2

((k −mω2) cos ωt + cω sin ωt

).

As above this can writen as

yp =F0 cos(ωt− α)√

(k −mω2)2 + (cω)2

where

cos α =k −mω2√

(k −mω2)2 + (cω)2, sin α =

cω√(k −mω2)2 + (cω)2

.

The general solution in this case is (by the principle of the particular solution)the general solution of the homogeneous system plus the solution yp. Theformer decays to zero as t becomes infinite, and the latter has the samefrequency as the external force F0 cos ωt but has a smaller amplitude and a“phase shift” α. (See the text page 355.)

5 Wednesday April 15 - Friday April 17

35. It’s easy to solve the initial value problem

dx1

dt= 3x1,

dx2

dt= 5x2, x1(0) = 4, x2(0) = 7.

18

Page 228: MATH320

The answer isx1 = 4e3t, x2 = 7e5t.

The reason this is easy is that this really isn’t a system of two equations intwo unknowns, it is two equations each in one unknown. When we write thissystem in matrix notation we get

dx

dt= Dx, x =

[x1

x2

], D =

[3 00 5

].

The problem is easy because the matrix D is diagonal.

36. Here is a system which isn’t so easy.

dy1

dt= y1 + 4y2,

dy2

dt= −2y1 + 7y2, y1(0) = 15, y2(0) = 11.

To solve it we make the change of variables

y1 = 2x1 + x2, y2 = x1 + x2; x1 = y1 − y2, x2 = −y1 + 2y2.

Then

dx1

dt=

dy1

dt− dy2

dt= (y1 + 4y2)− (−2y1 + 7y2) = 3y1 − 3y2 = 3x1,

dx2

dt= −dy1

dt+ 2

dy2

dt= −(y1 + 4y2) + 2(−2y1 + 7y2) = 5(−y1 + 2y2) = 5x2,

x1(0) = y1(0)− y2(0) = 4, x2(0) = −y1(0) + 2y2(0) = 7.

The not so easy problem 36 has been transformed to the easy problem 35and the solution is

y1 = 2x1 + x2 = 8e3t + 7e5t, y2 = x1 + x2 = 4e3t + 7e5t.

It’s magic!

37. To see how to find the magic change of variables rewrite problem 36 inmatrix form

dy

dt= Ay, A =

[1 4

−2 7

], y =

[y1

y2

]19

Page 229: MATH320

and the change of variables as

y = Px, x = P−1y.

In the new variables we have

dx

dt= P−1dy

dt= P−1Ay = P−1APx.

so we want to find a matrix P so that the matrix

D := P−1AP =

[λ1 00 λ2

](17)

is diagonal. Once we have done this the not so easy initial value problem

dy

dt= Ay, y(0) = y0 :=

[1511

]is transformed into the easy intial value problem

dx

dt= Dx, x(0) = x0 := P−1y0

as follows. Let x be the solution to the easy problem and define y := Px.Since the matrix P is constant we can differentiate the relation y = Px toget

dy

dt= P

dx

dt= P−1Dx = P−1DPy = Ay.

Also since the equation y := Px holds for all t it holds in particular for t = 0,i.e.

y(0) = Px(0) = Px0 = PP−1y0 = bfy0.

This shows that y := Px solves the not so easy problem when x solves theeasy problem.

38. So how do we find a matrices P and D satisfying (17)? Multiplying theequation A = PDP−1 on the right by P gives AP = DP. Let v and wbe the columns of P so that P =

[v w

]. Then as matrix multiplication

distributes over concatentation we have

AP =[

Av Aw].

20

Page 230: MATH320

But

PD =

[v1 w1

v2 w2

] [λ1 00 λ2

]=

[λ1v1 λ2w1

λ1v2 λ2w2

]=[

λ1v λ2w].

Thus the equation AP = DP can be written as[Av Aw

]=[

λ1v λ2w],

i.e. the single matrix equation AP = DP becomes to two vector equationsAv = λ1v and Aw = λ2w.

Definition 39. Let A be an n × n matrix. We say that λ is an eigenvalue of A iff there is a nonzero vector v such that
$$Av = \lambda v. \tag{18}$$
Any nonzero solution v of equation (18) is called an eigenvector of A corresponding to the eigenvalue λ. The set of all solutions to (18) (including v = 0) is called the eigenspace corresponding to the eigenvalue λ. Equation (18) can be rewritten as the homogeneous system
$$(\lambda I - A)v = 0,$$
which has a nonzero solution v if and only if
$$\det(\lambda I - A) = 0.$$
The polynomial det(λI − A) is called the characteristic polynomial of A, and so

The eigenvalues of a matrix are the roots of its characteristic polynomial.

40. We now find the eigenvalues of the matrix A from section 37. The characteristic polynomial is
$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda-1 & -4 \\ 2 & \lambda-7 \end{bmatrix} = (\lambda-1)(\lambda-7) + 8 = \lambda^2 - 8\lambda + 15,$$
so the eigenvalues are λ_1 = 3 and λ_2 = 5. The eigenvectors v corresponding to the eigenvalue λ_1 = 3 are the solutions of the homogeneous system
$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 3-1 & -4 \\ 2 & 3-7 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 2 & -4 \\ 2 & -4 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix},$$
i.e. the multiples of the column vector v = (2, 1). The eigenvectors w corresponding to the eigenvalue λ_2 = 5 are the solutions of the homogeneous system
$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 5-1 & -4 \\ 2 & 5-7 \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 4 & -4 \\ 2 & -2 \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix},$$
i.e. the multiples of the column vector w = (1, 1). A solution to (17) is given by
$$P = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}, \qquad D = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix},$$
and the change of variables y = Px used in problem 36 is
$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1 + x_2 \\ x_1 + x_2 \end{bmatrix}.$$
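The same computation can be done with the built-in function eig. A minimal Octave/Matlab sketch (not in the original notes): eig returns unit-length eigenvectors, so its P differs from the one above by nonzero column scalings (see Remark 41 below), but the defining relations still hold.

A = [1 4; -2 7];
[P, D] = eig(A)      % columns of P are eigenvectors, D is diagonal
norm(A*P - P*D)      % essentially zero: each column satisfies A*v = lambda*v
norm(A - P*D/P)      % P*D*inv(P) reproduces A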

Remark 41. The solution of the diagonalization problem (17) is never unique, because we can always find another solution by replacing each eigenvector (i.e. column of P) by a nonzero multiple of itself. For example, if c_1 ≠ 0 and c_2 ≠ 0, Av = λ_1 v, Aw = λ_2 w, then also A(c_1 v) = λ_1(c_1 v) and A(c_2 w) = λ_2(c_2 w), so the matrix [c_1 v  c_2 w] should work as well as the matrix P = [v w] used above. Indeed
$$\begin{bmatrix} c_1 v & c_2 w \end{bmatrix} = PC, \qquad \text{where } C = \begin{bmatrix} c_1 & 0 \\ 0 & c_2 \end{bmatrix},$$
and CD = DC (since both are diagonal), so
$$(PC)D(PC)^{-1} = PCDC^{-1}P^{-1} = PDCC^{-1}P^{-1} = PDP^{-1},$$
which shows that if A = PDP^{-1} then also A = (PC)D(PC)^{-1}.


6 Monday April 20 - Wednesday April 22

Definition 42. The matrix A is similar to the matrix B iff there is an invertible matrix P such that A = PBP^{-1}. A square matrix D is said to be diagonal iff its off-diagonal entries are zero, i.e. iff it has the form
$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$
A square matrix is said to be diagonalizable iff it is similar to a diagonal matrix.

Remark 43. (i) Every matrix is similar to itself and hence a diagonal matrixis diagonalizable. (ii) If A is similar to B, then B is similar to A (becauseA = PBP−1 =⇒ B = P−1AP). (iii) If A is similar to B and B is similarto C, then A is similar to C (because A = PBP−1 and B = QCQ−1 =⇒A = (PQ)C(PQ)−1 ).

Theorem 44. Similar matrices have the same characteristic polynomial.

Proof. If A = PBP^{-1} then
$$\lambda I - A = \lambda I - PBP^{-1} = \lambda PIP^{-1} - PBP^{-1} = P(\lambda I - B)P^{-1},$$
so
$$\det(\lambda I - A) = \det\bigl(P(\lambda I - B)P^{-1}\bigr) = \det(P)\det(\lambda I - B)\det(P)^{-1} = \det(\lambda I - B),$$
as the determinant of a product is the product of the determinants and the determinant of the inverse is the inverse of the determinant.

Remark 45. It follows that similar matrices have the same eigenvalues, as the eigenvalues are the roots of the characteristic polynomial. Of course they don't necessarily have the same eigenvectors, but if A = PBP^{-1} and w is an eigenvector for B with eigenvalue λ, then Pw is an eigenvector for A:
$$A(Pw) = (PBP^{-1})(Pw) = P(Bw) = P(\lambda w) = \lambda(Pw).$$

Theorem 46. An n × n matrix A is diagonalizable iff there is a linearly independent sequence (v_1, v_2, ..., v_n) of eigenvectors of A.

Proof. The matrix P = [v_1 v_2 · · · v_n] is invertible if and only if its columns v_1, v_2, ..., v_n are linearly independent, and the matrix equation AP = PD holds with D diagonal if and only if the columns of P are eigenvectors of A. (See Theorem 2 on page 376 of the text.)

Theorem 47. Assume that v_1, v_2, ..., v_k are nonzero eigenvectors of A corresponding to distinct eigenvalues λ_1, λ_2, ..., λ_k, i.e. Av_i = λ_i v_i, v_i ≠ 0, and λ_i ≠ λ_j for i ≠ j. Then the vectors v_1, v_2, ..., v_k are linearly independent.

Proof. By induction on k. The one-element sequence v_1 is independent because we are assuming v_1 is nonzero. Assume as the hypothesis of induction that v_2, ..., v_k are independent. We must show that v_1, v_2, ..., v_k are independent. For this assume that
$$c_1v_1 + c_2v_2 + \cdots + c_kv_k = 0.$$
(We must show that c_1 = c_2 = · · · = c_k = 0.) Multiply by λ_1 I − A:
$$c_1(\lambda_1 I - A)v_1 + c_2(\lambda_1 I - A)v_2 + \cdots + c_k(\lambda_1 I - A)v_k = 0.$$
Since Av_i = λ_i v_i this becomes
$$c_1(\lambda_1-\lambda_1)v_1 + c_2(\lambda_1-\lambda_2)v_2 + \cdots + c_k(\lambda_1-\lambda_k)v_k = 0.$$
Since λ_1 − λ_1 = 0 this becomes
$$c_2(\lambda_1-\lambda_2)v_2 + \cdots + c_k(\lambda_1-\lambda_k)v_k = 0.$$
As v_2, ..., v_k are independent (by the induction hypothesis) we get
$$c_2(\lambda_1-\lambda_2) = \cdots = c_k(\lambda_1-\lambda_k) = 0.$$
As the eigenvalues are distinct we have λ_1 − λ_i ≠ 0 for i > 1, so c_2 = · · · = c_k = 0. But now c_1v_1 = 0, so c_1 = 0 as well (since v_1 ≠ 0), so the c_i are all zero, as required. (See Theorem 2 on page 376 of the text.)

Corollary 48. If an n × n matrix A has n distinct real eigenvalues, then it is diagonalizable.

Proof. Choose a nonzero eigenvector for each eigenvalue. By Theorem 47 these n eigenvectors are linearly independent; equivalently, the square matrix whose columns they form is invertible, since a square matrix is invertible if and only if its columns are independent. Hence A is diagonalizable by Theorem 46. (See Theorem 3 on page 377 of the text.)


Remark 49. The analog of the corollary remains true if we allow complexeigenvalues and assume that all matrices are complex.

Example 50. The characteristic polynomial of the matrix
$$A = \begin{bmatrix} 3 & 4 \\ -4 & 3 \end{bmatrix}$$
is
$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda-3 & -4 \\ 4 & \lambda-3 \end{bmatrix} = (\lambda-3)^2 + 16 = \lambda^2 - 6\lambda + 25,$$
and the roots are 3 ± 4i. The eigenvalues aren't real, so the eigenvectors can't be real either. However, the matrix A can be diagonalized if we use complex numbers. For λ = 3 + 4i the solutions of
$$(\lambda I - A)v = \begin{bmatrix} 4i & -4 \\ 4 & 4i \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 4\begin{bmatrix} iv_1 - v_2 \\ v_1 + iv_2 \end{bmatrix} = 0$$
are (v_1, v_2) = c(1, i), while for λ = 3 − 4i the solutions of
$$(\lambda I - A)v = \begin{bmatrix} -4i & -4 \\ 4 & -4i \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 4\begin{bmatrix} -iv_1 - v_2 \\ v_1 - iv_2 \end{bmatrix} = 0$$
are (v_1, v_2) = c(1, −i). Hence we have A = PDP^{-1} where
$$D = \begin{bmatrix} 3+4i & 0 \\ 0 & 3-4i \end{bmatrix}, \qquad P = \begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}, \qquad P^{-1} = \frac{1}{2i}\begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix}.$$
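A quick Octave/Matlab sanity check (not in the original notes): eig handles the complex eigenvalues and eigenvectors automatically, so the relation A = PDP^{-1} can be verified numerically.

A = [3 4; -4 3];
[P, D] = eig(A);
diag(D)            % the complex eigenvalues 3+4i and 3-4i (order may differ)
norm(A - P*D/P)    % essentially zero, even though P and D are complex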

Example 51. Not every matrix is diagonalizable, even if we use complex numbers. For example, the matrix
$$N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$
is not diagonalizable. This is because N^2 = 0 but N ≠ 0. If N = PDP^{-1} then
$$0 = N^2 = (PDP^{-1})(PDP^{-1}) = PD^2P^{-1},$$
so D^2 = 0. But if D is diagonal, then
$$D^2 = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}^2 = \begin{bmatrix} \lambda_1^2 & 0 \\ 0 & \lambda_2^2 \end{bmatrix}$$
is also diagonal, so λ_1^2 = λ_2^2 = 0, so λ_1 = λ_2 = 0, so D = 0, so N = 0, contradicting N ≠ 0.


7 Friday April 24

52. The Wronskian of n functions x_1(t), x_2(t), ..., x_n(t) taking values in R^n is the determinant
$$W(t) = \det\bigl(x_1(t), x_2(t), \ldots, x_n(t)\bigr)$$
of the n × n matrix whose ith column is x_i(t). Of course, for each value of t it is the case that W(t) ≠ 0 if and only if the vectors x_1(t), x_2(t), ..., x_n(t) are linearly independent, and certainly it can happen that W(t) is zero for some values of t and nonzero for other values of t. But if the functions x_i are solutions of a matrix differential equation, this is not so:

Theorem 53. If x_1(t), x_2(t), ..., x_n(t) are solutions of the homogeneous linear system dx/dt = A(t)x, then either W(t) ≠ 0 for all t or W(t) = 0 for all t.

Proof. This is an immediate consequence of the Existence and Uniqueness Theorem. If W(t_0) = 0 then there are constants c_1, c_2, ..., c_n, not all zero, such that c_1x_1(t_0) + c_2x_2(t_0) + · · · + c_nx_n(t_0) = 0. Now x(t) := c_1x_1(t) + c_2x_2(t) + · · · + c_nx_n(t) satisfies the equation and x(t_0) = 0, so (by uniqueness) x(t) = 0 for all t. Hence the columns are linearly dependent for every t, and W(t) = 0 for all t.

Remark 54. As explained in paragraph 7, this specializes to higher order differential equations. The Wronskian of n functions x_1(t), x_2(t), ..., x_n(t) is the Wronskian of the corresponding sequence
$$x_i = \bigl(x_i,\ x_i',\ x_i'',\ \ldots,\ x_i^{(n-1)}\bigr)$$
of vectors. For example, for n = 2 the Wronskian of x_1(t), x_2(t) is
$$W(x_1, x_2) = \det\begin{bmatrix} x_1 & x_2 \\ x_1' & x_2' \end{bmatrix} = x_1x_2' - x_2x_1',$$
where x' := dx/dt.

Definition 55. The trace Tr(A) of a square matrix A is the sum of its diagonal entries.

Theorem 56. Let x_1(t), x_2(t), ..., x_n(t) be solutions of the homogeneous linear system dx/dt = A(t)x. Then the Wronskian W(t) satisfies the differential equation
$$\frac{dW}{dt} = \mathrm{Tr}\bigl(A(t)\bigr)\,W(t).$$


Proof. We do the 2 × 2 case
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad x_1 = \begin{bmatrix} x_{11} \\ x_{21} \end{bmatrix}, \qquad x_2 = \begin{bmatrix} x_{12} \\ x_{22} \end{bmatrix}.$$
Let x' = dx/dt. Writing out the equations x_1' = Ax_1 and x_2' = Ax_2 gives
$$x_{11}' = a_{11}x_{11} + a_{12}x_{21}, \qquad x_{12}' = a_{11}x_{12} + a_{12}x_{22},$$
$$x_{21}' = a_{21}x_{11} + a_{22}x_{21}, \qquad x_{22}' = a_{21}x_{12} + a_{22}x_{22}.$$
Since W = x_{11}x_{22} − x_{12}x_{21} we get
$$\begin{aligned}
W' &= x_{11}'x_{22} + x_{11}x_{22}' - x_{12}'x_{21} - x_{12}x_{21}' \\
&= (a_{11}x_{11}+a_{12}x_{21})x_{22} + x_{11}(a_{21}x_{12}+a_{22}x_{22}) - (a_{11}x_{12}+a_{12}x_{22})x_{21} - x_{12}(a_{21}x_{11}+a_{22}x_{21}) \\
&= (a_{11}+a_{22})(x_{11}x_{22} - x_{12}x_{21}) \\
&= \mathrm{Tr}(A)\,W,
\end{aligned}$$
as claimed.
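For a constant-coefficient system the theorem gives det(exp(tA)) = e^{t Tr(A)}, since the columns of exp(tA) solve the system and W(0) = det(I) = 1. A small Octave/Matlab illustration (not in the original notes; the matrix is an arbitrary sample):

A = [1 4; -2 7];        % sample matrix, Tr(A) = 8
t = 0.3;
det(expm(t*A))          % the Wronskian of the columns of exp(tA)
exp(trace(A)*t)         % agrees, as predicted by Theorem 56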

8 Monday April 27

57. Consider a system of tanks containing brine (salt and water) connectedby pipes through which brine flows from one tank to another.

9 Wednesday April 29

58. Consider a collection of masses on a track, each connected to the next by a spring, with the first and last connected to opposite walls.

[Diagram: wall — spring k_1 — mass m_1 — spring k_2 — mass m_2 — spring k_3 — mass m_3 — spring k_4 — wall]

There are n masses lying on the x-axis. The left wall is at x = a, the right wall at x = b, and the ith mass is at X_i, so a < X_1 < X_2 < · · · < X_n < b. The spring constant of the ith spring is k_i and the ith mass is m_i. The first spring connects the first mass to the left wall, the last ((n+1)st) spring connects the last mass to the right wall, and the (i+1)st spring (i = 1, 2, ..., n−1) connects the ith mass to the (i+1)st mass. We assume that there is an equilibrium configuration a < X_{0,1} < X_{0,2} < · · · < X_{0,n} < b where the masses are at rest, and define the displacements from equilibrium x_1, x_2, ..., x_n by
$$x_i := X_i - X_{0,i}.$$
Note that the distance X_{i+1} − X_i between two adjacent masses is related to the difference x_{i+1} − x_i of the displacements by the formula
$$X_{i+1} - X_i = (X_{0,i+1} - X_{0,i}) + (x_{i+1} - x_i). \tag{19}$$
With n = 3 (as in the diagram above) the equations of motion for this system are
$$m_1\frac{d^2x_1}{dt^2} = -(k_1+k_2)x_1 + k_2x_2,$$
$$m_2\frac{d^2x_2}{dt^2} = k_2x_1 - (k_2+k_3)x_2 + k_3x_3,$$
$$m_3\frac{d^2x_3}{dt^2} = k_3x_2 - (k_3+k_4)x_3,$$
or in matrix notation
$$M\frac{d^2x}{dt^2} = Kx \tag{20}$$
with
$$M = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad K = \begin{bmatrix} -(k_1+k_2) & k_2 & 0 \\ k_2 & -(k_2+k_3) & k_3 \\ 0 & k_3 & -(k_3+k_4) \end{bmatrix}.$$
In the general case (arbitrary n) the matrix M is diagonal with diagonal entries m_1, m_2, ..., m_n and the matrix K is symmetric "tridiagonal" with entries −(k_1 + k_2), ..., −(k_n + k_{n+1}) on the diagonal and entries k_2, ..., k_n on the sub- and superdiagonal.
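The general M and K are easy to assemble with diag. Here is a minimal Octave/Matlab sketch (not in the original notes); the masses and spring constants are arbitrary sample values.

n = 3;
m = [1 2 1];             % sample masses m_1 .. m_n
k = [1 3 2 1];           % sample spring constants k_1 .. k_{n+1}
M = diag(m);
K = diag(-(k(1:n) + k(2:n+1))) + diag(k(2:n), 1) + diag(k(2:n), -1);
eig(K)                   % all negative, as asserted in paragraph 61 below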


59. To derive the system of equations from first principles,² let T_i (i = 1, 2, ..., n+1) denote the tension in the ith spring. This means that the force exerted by the ith spring on the mass attached to its left end is T_i and the force exerted by the ith spring on the mass attached to its right end is −T_i. The tension depends on the length X_i − X_{i−1} of the ith spring, so by linear approximation and (19)
$$T_i = T_{0,i} + k_i(x_i - x_{i-1}) + o(x_i - x_{i-1}),$$
where T_{0,i} denotes the tension in the ith spring when the system is in equilibrium and k_i is the derivative of the tension at equilibrium. (The tension is assumed positive, meaning that each spring is trying to contract, so the mass on its left is pulled to the right and the mass on its right is pulled to the left.) We ignore the small error term o(x_i − x_{i−1}). The net force on the ith mass is
$$T_{i+1} - T_i = T_{0,i+1} + k_{i+1}(x_{i+1} - x_i) - T_{0,i} - k_i(x_i - x_{i-1}).$$
At equilibrium the net force on each mass is zero: T_{0,i+1} − T_{0,i} = 0, so T_{i+1} − T_i simplifies to
$$T_{i+1} - T_i = k_ix_{i-1} - (k_{i+1} + k_i)x_i + k_{i+1}x_{i+1}.$$

Remark 60. If the masses are hung in a line from the ceiling with the lowest mass attached only to the mass directly above, the system of equations is essentially the same: one takes k_{n+1} = 0.

61. Assume for the moment that all the masses are equal to one. Then the system takes the form
$$\frac{d^2x}{dt^2} = Kx. \tag{21}$$
The eigenvalues of K are negative, so we may write them as the negatives of squares of real numbers. If
$$Kv = -\omega^2 v,$$
then for any constants A and B the function
$$x = (A\cos\omega t + B\sin\omega t)\,v$$
is a solution of (21). The following theorem says that K is diagonalizable, so this gives 2n independent solutions of (21).

2This is not in the text: I worked it out to fulfill an inner need of my own.


Theorem 62. Assume A is a symmetric real matrix, i.e. A^T = A. Then A is diagonalizable. If in addition ⟨Av, v⟩ > 0 for all v ≠ 0, then the eigenvalues are positive.

Remark 63. This theorem is sometimes called the Spectral Theorem. It is often proved in Math 340 but I haven't found it in our text. It is also true that for a symmetric matrix, eigenvectors belonging to distinct eigenvalues are orthogonal. In fact there is an orthonormal basis of eigenvectors, i.e. a basis v_1, ..., v_n so that |v_i| = 1 and ⟨v_i, v_j⟩ = 0 for i ≠ j. (We can always make a nonzero eigenvector into a unit vector by dividing it by its length.)

Remark 64. We can always make a change of variables to convert (20) to (21) as follows. Let M^{1/2} denote the diagonal matrix whose entries are √m_i. Then multiplying (20) by the inverse M^{−1/2} of M^{1/2} gives
$$M^{1/2}\frac{d^2x}{dt^2} = M^{-1/2}Kx.$$
Now make the change of variables y = M^{1/2}x to get
$$\frac{d^2y}{dt^2} = M^{1/2}\frac{d^2x}{dt^2} = M^{-1/2}Kx = \bigl(M^{-1/2}KM^{-1/2}\bigr)y.$$
It is easy to see that M^{−1/2}KM^{−1/2} (the "new" K) is again symmetric.

10 Friday May 1

65. You can plug a square matrix into a polynomial (or more generally a power series) just as if it were a number. For example, if f(x) = 3x^2 + 5 then f(A) = 3A^2 + 5I. Since you add or multiply diagonal matrices by adding or multiplying corresponding diagonal entries, we have
$$f\left(\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}\right) = \begin{bmatrix} f(\lambda) & 0 \\ 0 & f(\mu) \end{bmatrix}. \tag{22}$$
Finally, since P(A + B)P^{-1} = PAP^{-1} + PBP^{-1} and (PAP^{-1})^k = PA^kP^{-1}, we have
$$f(PAP^{-1}) = Pf(A)P^{-1}. \tag{23}$$


66. This even works for power series. For numbers the exponential function has the power series expansion
$$e^x = \sum_{k=0}^{\infty}\frac{x^k}{k!},$$
so for square matrices we make the definition
$$\exp(A) := \sum_{k=0}^{\infty}\frac{1}{k!}A^k.$$
Replacing A by tA gives
$$\exp(tA) = \sum_{k=0}^{\infty}\frac{t^k}{k!}A^k.$$
(Some people write e^{tA}.) Differentiating term by term gives
$$\frac{d}{dt}\exp(tA) = \sum_{k=0}^{\infty}\frac{kt^{k-1}}{k!}A^k = \sum_{k=1}^{\infty}\frac{t^{k-1}}{(k-1)!}A^k = \sum_{j=0}^{\infty}\frac{t^j}{j!}A^{j+1} = A\left(\sum_{j=0}^{\infty}\frac{t^j}{j!}A^j\right) = A\exp(tA).$$
Since exp(tA) = I when t = 0, this means that the solution to the initial value problem
$$\frac{dx}{dt} = Ax, \qquad x(0) = x_0$$
is
$$x = \exp(tA)\,x_0.$$
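In Octave/Matlab the matrix exponential is the built-in function expm, and the power series definition above can be checked by truncating the series. A small sketch (not in the original notes; A is an arbitrary sample matrix):

A = [1 4; -2 7];
S = eye(2); term = eye(2);
for k = 1:30
  term = term*A/k;         % term is now A^k / k!
  S = S + term;
end
norm(S - expm(A))          % small: the truncated series approximates expm(A)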


67. The moral of the story is that matrix algebra is just like ordinary algebra and matrix calculus is just like ordinary calculus, except that the commutative law doesn't always hold. However, the commutative law does hold for powers of a single matrix:
$$A^pA^q = A^{p+q} = A^{q+p} = A^qA^p.$$

68. You can compute the exponential of a matrix using equations (22) and (23). If
$$A = PDP^{-1}, \qquad D = \begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix},$$
then
$$\exp(tA) = P\exp(tD)P^{-1}, \qquad \exp(tD) = \begin{bmatrix} e^{\lambda t} & 0 \\ 0 & e^{\mu t} \end{bmatrix}.$$

Example 69. In paragraph 40 we saw how to diagonalize the matrix
$$A = \begin{bmatrix} 1 & 4 \\ -2 & 7 \end{bmatrix}.$$
We found that A = PDP^{-1} where
$$P = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}, \qquad P^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}, \qquad D = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix}.$$
Now
$$\exp(tD) = \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{5t} \end{bmatrix},$$
so exp(tA) = P exp(tD) P^{-1} is computed by matrix multiplication:
$$\exp(tA) = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} e^{3t} & 0 \\ 0 & e^{5t} \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} e^{3t} & -e^{3t} \\ -e^{5t} & 2e^{5t} \end{bmatrix} = \begin{bmatrix} 2e^{3t} - e^{5t} & -2e^{3t} + 2e^{5t} \\ e^{3t} - e^{5t} & -e^{3t} + 2e^{5t} \end{bmatrix}.$$
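A quick Octave/Matlab comparison of this closed form with expm at a sample value of t (not in the original notes):

A = [1 4; -2 7];  t = 0.7;
E = [2*exp(3*t) - exp(5*t), -2*exp(3*t) + 2*exp(5*t);
       exp(3*t) - exp(5*t),   -exp(3*t) + 2*exp(5*t)];
norm(expm(t*A) - E)     % essentially zero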

Example 70. If N^2 = 0 then
$$\exp(tN) = \sum_{k=0}^{\infty}\frac{t^k}{k!}N^k = \sum_{k=0}^{1}\frac{t^k}{k!}N^k = I + tN.$$
In particular,
$$\exp\left(\begin{bmatrix} 0 & t \\ 0 & 0 \end{bmatrix}\right) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}.$$

Theorem 71. If AB = BA then exp(A + B) = exp(A) exp(B).

Proof. The formula e^{a+b} = e^a e^b holds for numbers. Here is a proof using power series. By the Binomial Theorem
$$(a+b)^k = \sum_{p+q=k}\frac{k!}{p!\,q!}\,a^pb^q,$$
so
$$\sum_{k=0}^{\infty}\frac{(a+b)^k}{k!} = \sum_{k=0}^{\infty}\sum_{p+q=k}\frac{a^pb^q}{p!\,q!} = \left(\sum_{p=0}^{\infty}\frac{a^p}{p!}\right)\left(\sum_{q=0}^{\infty}\frac{b^q}{q!}\right).$$
If AB = BA the same proof works to prove the theorem.

Example 72. (Problem 4, page 487 of the text.) We compute exp(tA) where
$$A = \begin{bmatrix} 3 & -1 \\ 1 & 1 \end{bmatrix}.$$
The characteristic polynomial is
$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda-3 & 1 \\ -1 & \lambda-1 \end{bmatrix} = (\lambda-3)(\lambda-1) + 1 = (\lambda-2)^2$$
and has a double root. But the null space of (2I − A) is one dimensional, so we can't diagonalize the matrix. However,
$$(A-2I)^2 = \begin{bmatrix} 3-2 & -1 \\ 1 & 1-2 \end{bmatrix}^2 = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}^2 = 0,$$
so exp(t(A − 2I)) = I + t(A − 2I) as in Example 70. But the matrices 2tI and t(A − 2I) commute (the identity matrix commutes with every matrix) and tA = 2tI + t(A − 2I), so
$$\exp(tA) = \exp(2tI)\exp\bigl(t(A-2I)\bigr) = e^{2t}\bigl(I + t(A-2I)\bigr),$$
i.e.
$$\exp(tA) = e^{2t}\begin{bmatrix} 1+t & -t \\ t & 1-t \end{bmatrix}.$$
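expm works even for this non-diagonalizable matrix, so the closed form can be checked numerically. A small Octave/Matlab sketch (not in the original notes):

A = [3 -1; 1 1];
t = 0.5;
norm(expm(t*A) - exp(2*t)*(eye(2) + t*(A - 2*eye(2))))   % essentially zero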


11 Monday May 4

73. A solution of a two dimensional system
$$\frac{dx}{dt} = F(x)$$
is a parametric curve in the plane R^2. The collection of all solutions is called the phase portrait of the system. When we draw the phase portrait we only draw a few representative solutions. We put arrows on the solutions to indicate the direction of the parameterization, just as we did when we drew the phase line in the first part of this course. We shall only draw phase portraits for linear systems
$$F(x) = Ax,$$
where A is a (constant) 2 × 2 matrix.
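One crude way to draw such a phase portrait in Octave/Matlab is to plot a few solution curves x(t) = exp(tA)x_0 for representative starting points. A minimal sketch (not in the original notes; the matrix is an arbitrary sample and arrowheads are omitted):

A = [1 4; -2 7];                        % sample matrix
t = linspace(-1, 1, 200);
hold on
for th = 0 : pi/6 : 2*pi - pi/6         % twelve starting points on the unit circle
  x0 = [cos(th); sin(th)];
  X = zeros(2, numel(t));
  for j = 1:numel(t)
    X(:, j) = expm(t(j)*A) * x0;        % the solution through x0
  end
  plot(X(1,:), X(2,:))
end
axis equal, xlabel('x_1'), ylabel('x_2')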

12 Wednesday May 6

74. Consider the inhomogeneous system
$$\frac{dx}{dt} = A(t)x + f(t) \tag{24}$$
where A(t) is a continuous n × n matrix valued function, f(t) is a continuous function with values in R^n, and the unknown x also takes values in R^n. We shall call the system
$$\frac{dv}{dt} = A(t)v \tag{25}$$
the homogeneous system corresponding to the inhomogeneous system (24).

75. Let v_1, v_2, ..., v_n be n linearly independent solutions of the homogeneous system (25) and form the matrix
$$\Phi(t) := \begin{bmatrix} v_1(t) & v_2(t) & \cdots & v_n(t) \end{bmatrix}.$$
The matrix Φ is called a fundamental matrix for the system (25); this means that the columns are solutions of (25) and they form a basis for R^n for some (and hence³ every) value of t.

³by Theorem 53


Theorem 76. A fundamental matrix satisfies the matrix differential equation
$$\frac{d\Phi}{dt} = A\Phi.$$

Proof.
$$\frac{d}{dt}\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} \dfrac{dv_1}{dt} & \dfrac{dv_2}{dt} & \cdots & \dfrac{dv_n}{dt} \end{bmatrix} = \begin{bmatrix} Av_1 & Av_2 & \cdots & Av_n \end{bmatrix} = A\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}.$$

Theorem 77. If Φ is a fundamental matrix for the system (25) then the function
$$v(t) = \Phi(t)\,\Phi(t_0)^{-1}v_0$$
is the solution of the initial value problem
$$\frac{dv}{dt} = A(t)v, \qquad v(t_0) = v_0.$$

Proof. The columns of Φ(t)Φ(t_0)^{-1} are linear combinations of the columns of Φ(t) (see Remark 79 below) and hence are solutions of the homogeneous system. The initial condition holds because Φ(t_0)Φ(t_0)^{-1} = I.

Theorem 78. If the matrix A is constant, then the matrix
$$\Phi(t) = \exp(tA)$$
is a fundamental matrix for the system dv/dt = Av.

Proof. The columns of exp(tA) solve the system by paragraph 66, and det(Φ(0)) = det(exp(0·A)) = det(I) = 1 ≠ 0.

Remark 79. The proof of Theorem 77 asserted that the columns of PC are linear combinations of the columns of P. Because
$$P\begin{bmatrix} c_1 & c_2 & \cdots & c_k \end{bmatrix} = \begin{bmatrix} Pc_1 & Pc_2 & \cdots & Pc_k \end{bmatrix}$$
it is enough to see that this is true when C is a single column, i.e. when C is n × 1. In this case it is the definition of matrix multiplication.


80. Now we show how to solve the inhomogeneous system (24) once we have a fundamental matrix for the corresponding homogeneous system (25). By the Superposition Principle (more precisely the Principle of the Particular Solution) it is enough to find a particular solution x_p of (24), for then the general solution of (24) is the particular solution plus the general solution of (25). The method we use is called variation of parameters. We already saw a one dimensional example of this in the first part of the course. We use the Ansatz
$$x_p(t) = \Phi(t)\,u(t).$$
Then
$$\frac{dx_p}{dt} = \frac{d\Phi}{dt}u + \Phi\frac{du}{dt} = A\Phi u + \Phi\frac{du}{dt} = Ax_p + \Phi\frac{du}{dt},$$
which solves (24) if
$$\Phi\frac{du}{dt} = f,$$
so we can solve by integration:
$$u(t) = u(0) + \int_0^t u'(\tau)\,d\tau = u(0) + \int_0^t \Phi(\tau)^{-1}f(\tau)\,d\tau.$$
(Since we only want one solution, not all the solutions, we can take u(0) to be anything.) The solution of the initial value problem
$$\frac{dx}{dt} = A(t)x + f, \qquad x(0) = x_0$$
is
$$x = \Phi(t)u(t) + \Phi(t)\bigl(\Phi(0)^{-1}x_0 - u(0)\bigr). \tag{26}$$

Example 81. We solve the inhomogeneous system
$$\frac{dx}{dt} = Ax + f$$
where
$$A = \begin{bmatrix} 1 & 4 \\ -2 & 7 \end{bmatrix}, \qquad f = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
In Example 69 we found the fundamental matrix
$$\Phi(t) = \exp(tA) = \begin{bmatrix} 2e^{3t} - e^{5t} & -2e^{3t} + 2e^{5t} \\ e^{3t} - e^{5t} & -e^{3t} + 2e^{5t} \end{bmatrix}$$
for the corresponding homogeneous system. We take
$$\frac{du}{dt} = \Phi(t)^{-1}f = \exp(tA)^{-1}f = \exp(-tA)f = \begin{bmatrix} 2e^{-3t} - e^{-5t} \\ e^{-3t} - e^{-5t} \end{bmatrix}.$$
Integrating gives
$$u = \begin{bmatrix} -\dfrac{2e^{-3t}}{3} + \dfrac{e^{-5t}}{5} \\[2mm] -\dfrac{e^{-3t}}{3} + \dfrac{e^{-5t}}{5} \end{bmatrix} = \frac{1}{15}\begin{bmatrix} -10e^{-3t} + 3e^{-5t} \\ -5e^{-3t} + 3e^{-5t} \end{bmatrix},$$
so a particular solution is
$$x_p = \exp(tA)\,u = \begin{bmatrix} 2e^{3t} - e^{5t} & -2e^{3t} + 2e^{5t} \\ e^{3t} - e^{5t} & -e^{3t} + 2e^{5t} \end{bmatrix}\cdot\frac{1}{15}\begin{bmatrix} -10e^{-3t} + 3e^{-5t} \\ -5e^{-3t} + 3e^{-5t} \end{bmatrix}$$
$$= \frac{1}{15}\begin{bmatrix} (2e^{3t} - e^{5t})(-10e^{-3t} + 3e^{-5t}) + (-2e^{3t} + 2e^{5t})(-5e^{-3t} + 3e^{-5t}) \\ (e^{3t} - e^{5t})(-10e^{-3t} + 3e^{-5t}) + (-e^{3t} + 2e^{5t})(-5e^{-3t} + 3e^{-5t}) \end{bmatrix}$$
$$= \frac{1}{15}\begin{bmatrix} (-23 + 6e^{-2t} + 10e^{2t}) + (16 - 6e^{-2t} - 10e^{2t}) \\ (-13 + 3e^{-2t} + 10e^{2t}) + (11 - 3e^{-2t} - 10e^{2t}) \end{bmatrix} = \frac{1}{15}\begin{bmatrix} -7 \\ -2 \end{bmatrix}.$$
Whew! This means that, if we haven't made a mistake, the functions
$$x_1 = \frac{-7}{15}, \qquad x_2 = \frac{-2}{15}$$
should satisfy
$$\frac{dx_1}{dt} = x_1 + 4x_2 + 1, \qquad \frac{dx_2}{dt} = -2x_1 + 7x_2.$$
Let's check:
$$\frac{dx_1}{dt} = 0 = \left(\frac{-7}{15}\right) + 4\left(\frac{-2}{15}\right) + 1 = x_1 + 4x_2 + 1,$$
$$\frac{dx_2}{dt} = 0 = -2\left(\frac{-7}{15}\right) + 7\left(\frac{-2}{15}\right) = -2x_1 + 7x_2.$$


Using equation (26) we see that the solution of the initial value problem
$$\frac{dx_1}{dt} = x_1 + 4x_2 + 1, \qquad \frac{dx_2}{dt} = -2x_1 + 7x_2, \qquad x_1(0) = 17,\ x_2(0) = 29$$
is x = exp(tA)u(t) + exp(tA)(x_0 − u(0)), i.e.
$$x_1 = -\tfrac{7}{15} + (2e^{3t} - e^{5t})\bigl(17 + \tfrac{7}{15}\bigr) + (-2e^{3t} + 2e^{5t})\bigl(29 + \tfrac{2}{15}\bigr),$$
$$x_2 = -\tfrac{2}{15} + (e^{3t} - e^{5t})\bigl(17 + \tfrac{7}{15}\bigr) + (-e^{3t} + 2e^{5t})\bigl(29 + \tfrac{2}{15}\bigr).$$
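Since f is constant and A is invertible here, the constant particular solution is simply x_p = −A^{-1}f, which gives a compact numerical check of the answer. A small Octave/Matlab sketch (not in the original notes):

A = [1 4; -2 7]; f = [1; 0]; x0 = [17; 29];
xp = -(A\f)                  % equals [-7/15; -2/15], so A*xp + f = 0
t = 0.4;
x_matrix  = xp + expm(t*A)*(x0 - xp);           % the solution in matrix form
x_display = [-7/15 + (2*exp(3*t)-exp(5*t))*(17+7/15) + (-2*exp(3*t)+2*exp(5*t))*(29+2/15);
             -2/15 + (  exp(3*t)-exp(5*t))*(17+7/15) + (  -exp(3*t)+2*exp(5*t))*(29+2/15)];
norm(x_matrix - x_display)   % essentially zero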

Remark 82. This particular problem can be more easily solved by differentiating the equation x' = Ax + f. Since f is constant we get x'' = Ax', which is a homogeneous equation.

Final Exam 07:45 A.M. THU. MAY 14


VECTOR SPACE NOTES: CHAPTER 4

DAVID SEAL

4.2. Vector Space Rn and Subspaces. Vector Spaces are (in the abstract sense) setsof elements, called vectors, that are endowed with a certain group of properties. You canadd two vectors and multiply vectors by scalars. They satisfy the usual commutative,associative and distributive laws you would expect. These properties are listed on page236 of your textbook.

Many problems we encounter can actually be viewed as vector spaces. In fact, this maycome as a surprise to you, but functions are vectors. In fact, solutions to linear differentialequations can also be thought of as a linear subspace of a certain class of functions. Thisis the motivation for studying vectors in the abstract sense.

If we have a vector space V , and we have a subset W ⊂ V , a natural question to askis whether or not W itself forms a vector space. This means it needs to satisfy all theproperties of a vector space from above! The bad news is this is quite a long list, howeverthe good news is we don’t have to check every property on the list, because most of themare inherited from the original vector space V . In short, in order to see if W is a vectorspace, we need only check if W passes the following test.

Theorem 1. If V is a vector space and W ⊂ V is a non-empty subset, then W itself isa vector space provided it satisfies the following two conditions:

a. Additive Closure If ~a ∈ W and ~b ∈ W , then ~a +~b ∈ W .b. Multiplicative Closure If λ ∈ R and ~a ∈ W , then λ~a ∈ W .

Note that these are two properties that are on the long laundry list of properties werequire for a set to be a vector space.

Example. Consider W := {(x, y) ∈ R² : x = 2y}. Since W ⊂ R², we may be interested in whether W itself forms a vector space. To answer this question we need only check two items:

a. Additive Closure: An arbitrary element of W can be described by (2y, y) where y ∈ R. Let (2y, y), (2z, z) ∈ W. Then (2y, y) + (2z, z) = (2y + 2z, y + z) ∈ W since 2y + 2z = 2(y + z).

b. Multiplicative Closure: We need to check that if λ ∈ R and ~a ∈ W, then λ~a ∈ W. Again, an arbitrary element of W can be described by (2y, y) where y ∈ R. Let λ ∈ R and (2y, y) ∈ W. Then λ(2y, y) = (2λy, λy) ∈ W since the first coordinate is exactly twice the second.

Note: it is possible to write this set as the kernel of a matrix. In fact, you can check that W = ker(A), where A₁ₓ₂ = [1 −2]. We actually have a theorem that says the kernel of any matrix is indeed a linear subspace.

Date: Updated: March 19, 2009.


Example. Consider W := {(x, y, z) ∈ R³ : z ≥ 0}. In order for this to be a linear subspace of R³, it needs to pass two tests. In fact, this set passes the additive closure test, but it doesn't pass multiplicative closure! For example, (0, 0, 5) ∈ W, but (−1) · (0, 0, 5) = (0, 0, −5) ∉ W.

Definition. If A_{m×n} is a matrix, we define ker(A) := {x ∈ Rⁿ : Ax = 0}. This is also called the nullspace of A.

Note that ker(A) lives in Rⁿ.

Definition. If A_{m×n} is a matrix, we define Image(A) := {y ∈ Rᵐ : Ax = y for some x ∈ Rⁿ}. This is also called the range of A.

Note that Image(A) lives in Rᵐ.

Theorem 2. If A_{m×n} is a matrix, then ker(A) is a linear subspace of Rⁿ and Image(A) is a linear subspace of Rᵐ.

4.3. Linear Combinations and Independence of Vectors. If we have a collection of vectors ~v1, ~v2, . . . , ~vk, we can form many vectors by taking linear combinations of these vectors. We call this space the span of the collection, and we have the following theorem:

Theorem 3. If ~v1, ~v2, . . . , ~vk is a collection of vectors in some vector space V, then
W := span{~v1, ~v2, . . . , ~vk} := {~w : ~w = c1~v1 + c2~v2 + · · · + ck~vk for some scalars ci}
is a linear subspace of V.

For a concrete example, we can take two vectors ~v1 = (1, 1, 0) and ~v2 = (1, 0, 0), which both lie in R³. Then the set W = span{(1, 1, 0), (1, 0, 0)} describes a plane that lives in R³. This set is a linear subspace by the previous theorem. In fact, we can be a bit more descriptive and write W = {(x, y, z) ∈ R³ : z = 0}.

If we continue with this example, it is possible to write W in many other ways. In fact, we could have written
W = span{(1, 1, 0), (1, 0, 0), (−5, 1, 0)} = span{(−10, 1, 0), (2, 1, 0)}.
These examples illustrate the fact that our choice of vectors need not be unique. What is unique is the least number of vectors required to describe the set. In fact this is so important that we give it a name and call it the dimension of a vector space. This is the content of section 4.4. In our example, dim(W) = 2, but right now we don't have enough tools to show this.

In order to make this statement 'least' precise, we need to introduce a definition.

Definition. Vectors ~v1, ~v2, . . . , ~vk are said to be linearly independent if whenever
c1~v1 + c2~v2 + · · · + ck~vk = 0
for some scalars ci, it must follow that ci = 0 for each i.

OK, so definitions are all fine and good, but how do we check if vectors are linearly independent? The nice thing about this definition is that it always boils down to solving a linear system.

Example. As a concrete example, let's check whether the vectors ~v1, ~v2 are linearly independent, where ~v1 = (4, −2, 6, −4) and ~v2 = (2, 6, −1, 4).


We need to solve the problem
$$c_1\vec{v}_1 + c_2\vec{v}_2 = \vec{0}.$$
This reduces to asking what are the solutions to
$$c_1\begin{bmatrix} 4 \\ -2 \\ 6 \\ -4 \end{bmatrix} + c_2\begin{bmatrix} 2 \\ 6 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.$$
We can write this problem as a matrix equation A~c = ~0 where
$$A = \begin{bmatrix} 4 & 2 \\ -2 & 6 \\ 6 & -1 \\ -4 & 4 \end{bmatrix}, \qquad \vec{c} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix},$$
and solve this using Gaussian elimination:
$$\begin{bmatrix} 4 & 2 & 0 \\ -2 & 6 & 0 \\ 6 & -1 & 0 \\ -4 & 4 & 0 \end{bmatrix} \xrightarrow{\text{row ops}} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus c_1 = c_2 = 0 is the only solution to this problem, and so these two vectors are linearly independent.

To demonstrate that a collection of vectors is not linearly independent, it suffices to find a non-trivial combination of these vectors and show they sum to ~0. For example, see Example 6 in the textbook.
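In Octave/Matlab the same independence test can be run with rank or rref. A small sketch (not part of these notes): the columns are independent exactly when the rank equals the number of columns.

v1 = [4; -2; 6; -4];  v2 = [2; 6; -1; 4];
A  = [v1 v2];
rank(A)      % 2, so v1 and v2 are linearly independent
rref(A)      % reduces to [1 0; 0 1; 0 0; 0 0], as in the elimination above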


Math 320 Spring 2009Part II – Linear Algebra

JWR

April 24, 2009

1 Monday February 16

1. The equation ax = b has

• a unique solution x = b/a if a ≠ 0,

• no solution if a = 0 and b 6= 0,

• infinitely many solutions (namely any x) if a = b = 0.

2. The graph of the equation ax+ by = c is a line (assuming that a and b arenot both zero). Two lines intersect in a unique point if they have differentslopes, do not intersect at all if they have the same slope but are not thesame line (i.e. if they are parallel), and intersect in infinitely many points ifthey are the same line. In other words, the linear system

a11x + a12y = b1, a21x + a22y = b2

• has a unique solution if a11a22 ≠ a12a21,

• no solution if a11a22 = a12a21 but neither equation is a multiple of theother,

• infinitely many solutions if one equation is a multiple of the other.

3. The graph of the equation ax + by + cz = d is a plane (assuming that a, b, c are not all zero). Two planes intersect in a line unless they are parallel or identical, and a line and a plane intersect in a point unless the line is parallel to the plane or lies in the plane. Hence the linear system
$$a_{11}x + a_{12}y + a_{13}z = b_1, \qquad a_{21}x + a_{22}y + a_{23}z = b_2, \qquad a_{31}x + a_{32}y + a_{33}z = b_3$$
has either a unique solution, no solution, or infinitely many solutions.

4. A linear system of m equations in n unknowns x_1, x_2, ..., x_n has the form
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned} \tag{\dag}$$
The system is called inconsistent iff it has no solution and consistent iff it is not inconsistent. Some authors call the system underdetermined iff it has infinitely many solutions, i.e. if the equations do not contain enough information to determine a unique solution. The system is called homogeneous iff b_1 = b_2 = · · · = b_m = 0. A homogeneous system is always consistent because x_1 = x_2 = · · · = x_n = 0 is a solution.

5. The following operations leave the set of solutions unchanged as they canbe undone by another operation of the same kind.

Swap. Interchange two of the equations.

Scale. Multiply an equation by a nonzero number.

Shear. Add a multiple of one equation to a different equation.

It is easy to see that the elementary row operations do not change the set of solutions of the system (†): each operation can be undone by another operation of the same type. Swapping two equations twice returns to the original system, scaling a row by c and then scaling it again by c⁻¹ returns to the original system, and finally adding a multiple of one row to another and then subtracting the multiple returns to the original system.

2

Page 253: MATH320

2 Wednesday February 18

6. A matrix is an m × n array
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
of numbers. One says that A has size m × n or shape m × n, or that A has m rows and n columns. The augmented matrix
$$M := [A\ b] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{bmatrix} \tag{\ddag}$$
represents the system of linear equations (†) in section 4. The elementary row operations described above to transform a system into an equivalent system may be represented as

Swap. M([p,q],:)=M([q,p],:) Interchange the pth and qth rows.

Scale. M(p,:)=c*M(p,:) Multiply the pth row by c.

Shear. M(p,:)=M(p,:)+c*M(q,:) Add c times qth row to pth row.

The notations used here are those of the computer language Matlab.1 Theequal sign denotes assignment, not equality, i.e. after the command X=Y isexecuted the old value of X is replaced by the value of Y. For example, if thevalue of a variable x is 7, the effect of the command x=x+2 is to change thevalue of x to 9.

1An implementation of Matlab called Octave is available free on the internet. (Do aGoogle search on Octave Matlab.) I believe it was written here at UW. A more primitiveversion of Matlab called MiniMat (short for Minimal Matlab) which I wrote in 1989is available on my website. There is a link to it on the Moodle main page for this course.It is adequate for everything in this course. Its advantage is that it is a Java Applet anddoesn’t need to be downloaded to and installed on your computer. (It does require theJava Plugin for your web browser.)


Definition 7. A matrix is said to be in echelon form iff

(i) all zero rows (if any) occur at the bottom, and

(ii) the leading entry (i.e. the first nonzero entry) in any row occurs tothe right of the leading entry in any row above.

It is said to be in reduced echelon form iff it is in echelon form and inaddition

(iii) each leading entry is one, and

(iv) any other entry in the same column as a leading entry is zero.

When the matrix represents a system of linear equations, the variables corresponding to the leading entries are called leading variables and the other variables are called free variables.

Remark 8. In military lingo an echelon formation is a formation of troops,ships, aircraft, or vehicles in parallel rows with the end of each row projectingfarther than the one in front. Some books use the term row echelon form forechelon form and reduced row echelon form or Gauss Jordan normal form forreduced echelon form.

Definition 9. Two matrices are said to be row equivalent iff one can betransformed to the other by elementary row operations.

Theorem 10. If the augmented coefficient matrices (‡) of two linear sys-tems (†) are row equivalent, then the two systems have the same solutionset.

Proof. See Paragraph 5 above.

Remark 11. It is not hard to prove that the converse of Theorem 10 is true if the linear systems are consistent, in particular if the linear systems are homogeneous. Any two inconsistent systems have the same solution set (namely the empty set) but need not have row equivalent augmented coefficient matrices. For example, the augmented coefficient matrices
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
corresponding to the two inconsistent systems
$$x_1 = 0,\ 0x_2 = 1 \qquad\text{and}\qquad 0x_1 = 1,\ 0x_2 = 0$$
are not row equivalent.


Theorem 12. Every matrix is row equivalent to exactly one reduced echelonmatrix.

Proof. The Gauss Jordan Elimination Algorithm described in the text2

on page 165 proves “at least one”. Figure 2 shows an implementation of thisalgorithm in the Matlab programming language. “At most one” is tricky.A proof appears in my book3 on page 182 (see also page 105).

Remark 13. Theorem 12 says that it doesn’t matter which elementary rowoperations you apply to a matrix to transform it to reduced echelon form;you always get the same reduced echelon form.

14. Once we find an equivalent system whose augmented coefficient matrix is in reduced echelon form it is easy to say what all the solutions to the system are: the free variables can take any values and then the other variables are uniquely determined. If the last nonzero row is [0 0 · · · 0 1] (corresponding to an equation 0x_1 + 0x_2 + · · · + 0x_n = 1) then the system is inconsistent. For example, the system corresponding to the reduced echelon form
$$\begin{bmatrix} 0 & 1 & 5 & 0 & 6 & 7 \\ 0 & 0 & 0 & 1 & 8 & 5 \end{bmatrix}$$
is
$$x_2 + 5x_3 + 6x_5 = 7, \qquad x_4 + 8x_5 = 5.$$
The free variables are x_1, x_3, x_5 and the general solution is
$$x_2 = 7 - 5x_3 - 6x_5, \qquad x_4 = 5 - 8x_5,$$
where x_1, x_3, x_5 are arbitrary.
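The "free variables determine the rest" recipe is easy to test in Octave/Matlab. A small sketch (not in the original notes; the values chosen for the free variables are arbitrary):

% The system above: x2 + 5*x3 + 6*x5 = 7 and x4 + 8*x5 = 5.
x1 = 2; x3 = -1; x5 = 3;          % free variables: any values work
x2 = 7 - 5*x3 - 6*x5;
x4 = 5 - 8*x5;
x  = [x1; x2; x3; x4; x5];
R  = [0 1 5 0 6; 0 0 0 1 8];      % coefficient part of the reduced echelon form
R*x                               % returns [7; 5], the right-hand side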

3 Friday February 20

Now we define the operations of matrix algebra. This algebra is very usefulfor (among other things) manipulating linear systems. The crucial point isthat all the usual laws of arithmetic hold except for the commutative law.

2Edwards & Penny: Differential Equations & Linear Algebra, 2nd ed.3Robbin: Matrix Algebra Using MINImal MATlab


Figure 1: Reduced Echelon (Gauss Jordan) Form

function [R, lead, free] = gj(A)

[m n] = size(A);

R=A; lead=zeros(1,0); free=zeros(1,0);

r = 0; % rank of first k columns

for k=1:n

if r==m, free=[free, k:n]; return; end

[y,h] = max(abs(R(r+1:m, k))); h=r+h; % (*)

if (y < 1.0E-9) % (i.e if y == 0)

free = [free, k];

else

lead = [lead, k]; r=r+1;

R([r h],:) = R([h r],:); % swap

R(r,:) = R(r,:)/R(r,k); % scale

for i = [1:r-1,r+1:m] % shear

R(i,:) = R(i,:) - R(i,k)*R(r,:);

end

end % if

end % for

(The effect of the line marked (*) in the program is to test that the columnbeing considered contains a leading entry. The swap means that the subse-quent rescaling is by the largest possible entry; this minimizes the relativeroundoff error in the calculation.)


Definition 15. Two matrices are equal iff they have the same size (SIZE MATTERS!) and corresponding entries are equal, i.e. A = B iff A and B are both m × n and
$$\mathrm{entry}_{ij}(A) = \mathrm{entry}_{ij}(B)$$
for i = 1, 2, ..., m and j = 1, 2, ..., n. The text sometimes writes A = [a_{ij}] to indicate that a_{ij} = entry_{ij}(A).

Definition 16. Matrix Addition. Two matrices may be added only ifthey are the same size; addition is performed elementwise, i.e if A and B arem× n matrices then

entryij(A + B) := entryij(A) + entryij(B)

for i = 1, 2, . . . , m and j = 1, 2, . . . , n. The zero matrix (of whatever size)is the matrix whose entries are all zero and is denoted by 0. Subtraction isdefined by

A−B := A + (−B), −B := (−1)B.

Definition 17. Scalar Multiplication. A matrix can be multiplied by anumber (scalar); every entry is multiplied by that number, i.e. if A is anm× n matrix and c is a number, then

entryij(cA) := c entryij(A)

for i = 1, 2, . . . , m and j = 1, 2, . . . , n.

18. The operations of matrix addition and scalar multiplication satisfy the following laws:

(A + B) + C = A + (B + C).  (Additive Associative Law)
A + B = B + A.  (Additive Commutative Law)
A + 0 = A.  (Additive Identity)
A + (−A) = 0.  (Additive Inverse)
c(A + B) = cA + cB, (b + c)A = bA + cA.  (Distributive Laws)
(bc)A = b(cA).  (Scalar Associative Law)
1A = A.  (Scalar Unit)
0A = 0, c0 = 0.  (Multiplication by Zero)

In terms of lingo we will meet later in the semester, these laws say that the set of all m × n matrices forms a vector space.


Definition 19. The product of the matrix A and the matrix B is defined only if the number of columns in A is the same as the number of rows in B, and in that case the product AB is defined by
$$\mathrm{entry}_{ik}(AB) = \sum_{j=1}^{n}\mathrm{entry}_{ij}(A)\,\mathrm{entry}_{jk}(B)$$
for i = 1, 2, ..., m and k = 1, 2, ..., p, where A is m × n and B is n × p. Note that the ith row of A is a 1 × n matrix, the kth column of B is an n × 1 matrix, and
$$\mathrm{entry}_{ik}(AB) = \mathrm{row}_i(A)\,\mathrm{column}_k(B).$$

20. With the notations
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix},$$
the linear system (†) of section 4 may be succinctly written
$$Ax = b.$$

21. A square matrix is one with the same number of rows as columns, i.e. of size n × n. A diagonal matrix is a square matrix D of the form
$$D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix},$$
i.e. all the nonzero entries are on the diagonal. The identity matrix is the square matrix whose diagonal entries are all 1. We denote the identity matrix (of any size) by I. The jth column of the identity matrix is denoted by e_j and is called the jth basic unit vector. Thus
$$I = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}.$$

22. The matrix operations satisfy the following laws:

(AB)C = A(BC), (aB)C = a(BC).  (Associative Laws)
C(A + B) = CA + CB.  (Left Distributive Law)
(A + B)C = AC + BC.  (Right Distributive Law)
IA = A, AI = A.  (Multiplicative Identity)
0A = 0, A0 = 0.  (Multiplication by Zero)

23. The commutative law for matrix multiplication is in general false. Two matrices A and B are said to commute if AB = BA. This can only happen when both A and B are square and have the same size, but even then it can be false. For example,
$$\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \qquad\text{but}\qquad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$

4 Monday February 23

Definition 24. An elementary matrix is a matrix which results from theidentity matrix by performing a single elementary row operation.

Theorem 25 (Elementary Matrices and Row Operations). Let A be an m × n matrix and E be an m × m elementary matrix. Then the product EA is equal to the matrix which results from applying to A the same elementary row operation as was used to produce E from I.

26. Suppose a matrix A is transformed to a matrix R by elementary row operations, i.e.
$$R = E_k\cdots E_2E_1A,$$
where each E_j is elementary. Thus R = MA where M = E_k · · · E_2E_1. Because of the general rule
$$E\begin{bmatrix} A & B \end{bmatrix} = \begin{bmatrix} EA & EB \end{bmatrix}$$
(we might say that matrix multiplication distributes over concatenation) we can find M via the formula
$$M\begin{bmatrix} A & I \end{bmatrix} = \begin{bmatrix} MA & MI \end{bmatrix} = \begin{bmatrix} R & M \end{bmatrix}.$$
The Matlab program shown in Figure 2 implements this algorithm.


Figure 2: Reduced Echelon Form and Multiplier

function [M, R] = gjm(A)

[m,n] = size(A);

RaM = gj([A eye(m)]);

R = RaM(:,1:n);

M = RaM(:,n+1:n+m);

Definition 27. A matrix B is called a right inverse to the matrix A iffAB = I. A matrix C is called a left inverse to the matrix A iff CA = I.The matrix A is called invertible iff it has a left inverse and a right inverse.

Theorem 28 (Uniqueness of the Inverse). If a matrix has both a left inverseand a right inverse then they are equal. Hence if A is invertible there isexactly one matrix denoted A−1 such that

AA−1 = A−1A = I.

Proof. C = CI = C(AB) = (CA)B = IB = B.

Definition 29. The matrix A−1 is called the (not an) inverse of A.

Remark 30. The example
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
shows that a nonsquare matrix can have a one-sided inverse. Since
$$\begin{bmatrix} 1 & 0 & c_{13} \\ 0 & 1 & c_{23} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
we see that left inverses are not unique. Since
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
we see that right inverses are not unique. Theorem 28 says two-sided inverses are unique. Below we will prove that an invertible matrix must be square.


5 Wednesday February 25

31. Here is what usually (but not always) happens when we transform an m × n matrix A to a matrix R in reduced echelon form. (The phrase "usually but not always" means that this is what happens if the matrix is chosen at random using (say) the Matlab command A=rand(m,n).)

Case 1: (More columns than rows.) If m < n then (usually but not always) the matrix R has no zero rows on the bottom and has the form
$$R = \begin{bmatrix} I & F \end{bmatrix}$$
where I denotes the m × m identity matrix. In this case the homogeneous system Ax = 0 has nontrivial (i.e. nonzero) solutions, the inhomogeneous system Ax = b is consistent for every b, and both the matrix A and the matrix R have infinitely many right inverses. In this case the last n − m variables are the free variables. The homogeneous system Ax = 0 takes the form
$$Rx = \begin{bmatrix} I & F \end{bmatrix}\begin{bmatrix} x' \\ x'' \end{bmatrix} = 0, \qquad x' = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \qquad x'' = \begin{bmatrix} x_{m+1} \\ x_{m+2} \\ \vdots \\ x_n \end{bmatrix},$$
and the general solution of the homogeneous system is given by
$$x' = -Fx''.$$
(The free variables x'' determine the other variables x'.)

Case 2: (More rows than columns.) If n < m then (usually but not always) the matrix R has m − n zero rows on the bottom and has the form
$$R = \begin{bmatrix} I \\ 0 \end{bmatrix}$$
where I denotes the n × n identity matrix. In this case the homogeneous system Ax = 0 has no nontrivial (i.e. nonzero) solutions, the inhomogeneous system Ax = b is inconsistent for infinitely many b, and both the matrix A and the matrix R have infinitely many left inverses. The system Ax = b may be written as Rx = MAx = Mb, or
$$\begin{bmatrix} I \\ 0 \end{bmatrix}x = \begin{bmatrix} x \\ 0 \end{bmatrix} = Mb = \begin{bmatrix} b' \\ b'' \end{bmatrix}, \qquad b' = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, \qquad b'' = \begin{bmatrix} b_{n+1} \\ b_{n+2} \\ \vdots \\ b_m \end{bmatrix},$$
which is inconsistent unless b'' = 0.

Case 3: (Square matrix.) If n = m then (usually but not always) the matrix R is the identity and the matrix A is invertible. In this case the inverse matrix A⁻¹ is the multiplier M found by the algorithm in paragraph 26 and Figure 2. In this case the homogeneous system Ax = 0 has only the trivial solution x = 0 and the inhomogeneous system Ax = b has the unique solution x = A⁻¹b.

Theorem 32 (Invertible Matrices and Elementary Matrices). Elementary matrices are invertible. A matrix is invertible if and only if it is row equivalent to the identity, i.e. if and only if it is a product of elementary matrices.

Proof. See Case(3) of paragraph 31 and paragraph 35 below.

Theorem 33 (Algebra of Inverse Matrices). The invertible matrices satisfy the following three properties:

1. The identity matrix I is invertible and I⁻¹ = I.

2. The inverse of an invertible matrix A is invertible and (A⁻¹)⁻¹ = A.

3. The product AB of two invertible matrices is invertible, and (AB)⁻¹ = B⁻¹A⁻¹.

Proof. That I⁻¹ = I follows from IC = CI = I if C = I. That (A⁻¹)⁻¹ = A follows from CA = AC = I if C = A⁻¹. To prove (AB)⁻¹ = B⁻¹A⁻¹ let C = B⁻¹A⁻¹. Then C(AB) = B⁻¹A⁻¹AB = B⁻¹IB = B⁻¹B = I and (AB)C = ABB⁻¹A⁻¹ = AIA⁻¹ = AA⁻¹ = I.


Remark 34. Note the structure of the last proof. We are proving “If P, thenQ” where “if P” is the statement “if A and B are invertible” and “then Q” isthe statement “then AB is invertible”. The first step is “Assume that A andB are invertible.” The second step is “Let C = B−1A−1.” Then there is somecalculation. The penultimate step is “Therefore C(AB) = (AB)C = I” andthe last step is “Therefore AB is invertible”. Each step is either a hypothesis(like the first step) or introduces notation (like the second step) or followsfrom earlier steps. The last step follows from the penultimate step by thedefinition of what it means for a matrix to be invertible.

35. The algorithm in paragraph 26 can be used to compute the inverse A⁻¹ of an invertible matrix A as follows. We form the n × 2n matrix [A I]. Performing elementary row operations produces a sequence
$$\begin{bmatrix} A & I \end{bmatrix} = \begin{bmatrix} A_0 & B_0 \end{bmatrix},\ \begin{bmatrix} A_1 & B_1 \end{bmatrix},\ \cdots,\ \begin{bmatrix} A_m & B_m \end{bmatrix} = \begin{bmatrix} I & M \end{bmatrix},$$
where each matrix in the sequence is obtained from the previous one by multiplication by an elementary matrix:
$$\begin{bmatrix} A_{k+1} & B_{k+1} \end{bmatrix} = E_k\begin{bmatrix} A_k & B_k \end{bmatrix} = \begin{bmatrix} E_kA_k & E_kB_k \end{bmatrix}.$$
Hence there is an invariant relation
$$A_{k+1}^{-1}B_{k+1} = (E_kA_k)^{-1}(E_kB_k) = A_k^{-1}E_k^{-1}E_kB_k = A_k^{-1}B_k,$$
i.e. the matrix A_k⁻¹B_k doesn't change during the algorithm. Hence
$$A^{-1} = A^{-1}I = A_0^{-1}B_0 = A_m^{-1}B_m = I^{-1}M = M.$$
This proves that the algorithm computes the inverse A⁻¹ when A is invertible. See Case 3 of paragraph 31.
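The same computation can be run in Octave/Matlab with rref. A small sketch (not part of the original notes; A is an arbitrary invertible sample matrix):

A = [2 1; 1 1];                 % a sample invertible matrix
n = size(A, 1);
RM = rref([A eye(n)]);          % row reduce the n x 2n matrix [A I]
M  = RM(:, n+1:2*n)             % the right half is A^(-1)
norm(M - inv(A))                % essentially zero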

36. If n is an integer and A is a square matrix, we define
$$A^n := \underbrace{AA\cdots A}_{n}\ \text{ for } n \ge 0, \qquad A^{-n} := \bigl(A^{-1}\bigr)^n.$$
The power laws
$$A^{m+n} = A^mA^n, \qquad A^0 = I$$
follow from these definitions and the associative law.


6 Friday February 27 and Monday March 2

37. In this section A denotes an n × n matrix and a_j denotes the jth column of A. We indicate this by writing
$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}.$$
We also use the notation e_j for the jth column of the identity matrix I:
$$I = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}.$$

Theorem 38. There is a unique function called the determinant⁴ which assigns a number det(A) to each square matrix A and has the following properties.

(1) The determinant of the identity matrix is one: det(I) = 1.

(2) The determinant is additive in each column:
$$\det\bigl([\,\cdots\ a_j' + a_j''\ \cdots]\bigr) = \det\bigl([\,\cdots\ a_j'\ \cdots]\bigr) + \det\bigl([\,\cdots\ a_j''\ \cdots]\bigr).$$

(3) Rescaling a column multiplies the determinant by the same factor:
$$\det\bigl([a_1\ \cdots\ ca_j\ \cdots\ a_n]\bigr) = c\,\det\bigl([a_1\ \cdots\ a_j\ \cdots\ a_n]\bigr).$$

(4) The determinant is skew symmetric in the columns: interchanging two columns reverses the sign:
$$\det\bigl([\,\cdots\ a_i\ \cdots\ a_j\ \cdots]\bigr) = -\det\bigl([\,\cdots\ a_j\ \cdots\ a_i\ \cdots]\bigr).$$

⁴The text uses the notation |A| where I have written det(A).

Lemma 39. The following properties of the determinant function follow from properties (2–4):

(5) Adding a multiple of one column to a different column leaves the determinant unchanged:
$$\det\bigl([\,\cdots\ a_i + ca_j\ \cdots]\bigr) = \det\bigl([\,\cdots\ a_i\ \cdots]\bigr) \qquad (i \ne j).$$

(6) If a matrix has two identical columns its determinant is zero:
$$i \ne j,\ a_i = a_j \implies \det\bigl([\,\cdots\ a_i\ \cdots\ a_j\ \cdots]\bigr) = 0.$$

Proof. Item (6) is easy: interchanging the two columns leaves the matrix unchanged (because the columns are identical) and reverses the sign by item (4). To prove (5),
$$\begin{aligned}
\det\bigl([\,\cdots\ a_i + ca_j\ \cdots]\bigr) &= \det\bigl([\,\cdots\ a_i\ \cdots]\bigr) + \det\bigl([\,\cdots\ ca_j\ \cdots]\bigr) \\
&= \det\bigl([\,\cdots\ a_i\ \cdots]\bigr) + c\,\det\bigl([\,\cdots\ a_j\ \cdots]\bigr) \\
&= \det\bigl([\,\cdots\ a_i\ \cdots]\bigr)
\end{aligned}$$
by (2), (3), and (6) respectively.

Remark 40. The theorem defines the determinant implicitly by saying thatthere is only one function satisfying the properties (1-4). The text gives aninductive definition of the determinant on page 201. “Inductive” means thatthe determinant of an n×n matrix is defined in terms of other determinants(called minors) of certain (n− 1)× (n− 1) matrices. Other definitions aregiven in other textbooks. We won’t prove Theorem 38 but will instead showhow it gives an algorithm for computing the determinant. (This essentiallyproves the uniqueness part of Theorem 38.)

Example 41. The determinant of a 2 × 2 matrix is given by
$$\det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
The determinant of a 3 × 3 matrix is given by
$$\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} - a_{13}a_{22}a_{31}.$$

The student should check that (with these definitions) the properties of thedeterminant listed in Theorem 38 hold.

Theorem 42 (Elementary Matrices and Column Operations). Let A be an m × n matrix and E be an n × n elementary matrix. Then the product AE is equal to the matrix which results from applying to A the same elementary column operation as was used to produce E from I. (The elementary column operations are swapping two columns, rescaling a column by a nonzero factor, and adding a multiple of one column to another.)

Proof. This is just like Theorem 25. (The student should write out the prooffor 2× 2 matrices.)

Theorem 43. If E is an elementary matrix, and A is a square matrix ofthe same size, then the determinant of the product AE is given by

(Swap) det(AE) = − det(A) if (right multiplication by) E swaps two columns;

(Scale) det(AE) = c det(A) if E rescales a column by c;

(Shear) det(AE) = det(A) if E adds a multiple of one column to another.

Proof. These are properties (2-4) in Theorem 38.

Theorem 44. The determinant of a product is the product of the determi-nants:

det(AB) = det(A) det(B).

Hence a matrix is invertible if and only if its determinant is nonzero anddet(A−1) = det(A)−1.

Proof. An invertible matrix is a product of elementary matrices, so this follows from Theorem 43 if A and B are invertible. Just as a noninvertible square matrix can be transformed to a matrix with a row of zeros by elementary row operations, so also a noninvertible square matrix can be transformed to a matrix with a column of zeros by elementary column operations. A matrix with a column of zeros has determinant zero because of part (3) of Theorem 38: multiplying the zero column by 2 leaves the matrix unchanged and multiplies the determinant by 2, so the determinant must be zero. Hence the determinant is zero if either A or B (and hence also AB) is not invertible. The formula det(A⁻¹) = det(A)⁻¹ follows as det(A⁻¹) det(A) = det(A⁻¹A) = det(I) = 1. The fact that a matrix is invertible if and only if its determinant is nonzero follows from the facts that an invertible matrix is a product of elementary matrices (Theorem 32), the determinant of an elementary matrix is not zero (by Theorem 43 with A = I), and the determinant of a matrix with a zero column is zero.


Remark 45. The text contains a formula for the inverse of a matrix in terms of determinants (the transposed matrix of cofactors) and a related formula (Cramer's Rule) for the solution of the inhomogeneous system Ax = b where A is invertible. We will skip this, except that the student should memorize the formula
$$A^{-1} = \frac{1}{\det(A)}\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$
for the inverse of the 2 × 2 matrix
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}.$$
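A one-line Octave/Matlab check of the memorized formula on an arbitrary sample matrix (not part of the original notes):

A = [5 2; 3 4];                                        % sample 2x2 matrix
Ainv = (1/det(A)) * [A(2,2), -A(1,2); -A(2,1), A(1,1)];
norm(A*Ainv - eye(2))                                  % essentially zero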

Corollary 46. If E is an elementary matrix, and A is a square matrix ofthe same size, then the determinant of the product EA is given by

(Swap) det(EA) = − det(A) if (left multiplication by) E swaps two rows;

(Scale) det(EA) = c det(A) if E rescales a row by c;

(Shear) det(EA) = det(A) if E adds a multiple of one row to another.

47. Figure 3 shows a Matlab program which uses this corollary to compute the determinant at the same time as it computes the reduced echelon form. The algorithm can be understood as follows. If E is an elementary matrix then
$$\det(EA) = c\det(A)$$
where c is the scale factor if multiplication by E rescales a row, c = −1 if multiplication by E swaps two rows, and c = 1 if multiplication by E subtracts a multiple of one row from another. We initialize a variable d to 1 and as we transform A we update d so that the relation d·det(A) = k always holds with k constant, i.e. k = det(A). (This is called an invariant relation in computer science lingo.) Thus when we rescale a row by c⁻¹ we replace d by dc, when we swap two rows we replace d by −d, and when we subtract one row from another we leave d unchanged. (The matrix A changes but k does not.) At the end we have replaced A by I, so d·det(I) = k, so d = k = det(A).


Figure 3: Computing the Determinant and Row Operations

function d = det(A)

% invariant relation d*det(A) = constant

[m n] = size(A); d=1;

for k=1:n

[y,h] = max(abs(A(k:m, k))); h=k-1+h;

if y < 1.0E-9 % (i.e if y == 0)

d=0; return

else

if (k~=h)

A([k h],:) = A([h k],:); % swap

d=-d;

end

c = A(k,k);

A(k,:) = A(k,:)/c; % scale

d=c*d;

for i = k+1:m % shear

A(i,:) = A(i,:) - A(i,k)*A(k,:);

end

end % if

end % for


7 Monday March 2

Definition 48. The transpose of an m × n matrix A is the n × m matrix A^T defined by
$$\mathrm{entry}_{ij}(A^T) = \mathrm{entry}_{ji}(A)$$
for i = 1, ..., n, j = 1, ..., m.

49. The following properties of the transpose operation (see page 206 of the text) are easy to prove:

(i) (A^T)^T = A;
(ii) (A + B)^T = A^T + B^T;
(iii) (cA)^T = cA^T;
(iv) (AB)^T = B^TA^T.

For example, to prove (iv),
$$\mathrm{entry}_{ij}\bigl((AB)^T\bigr) = \mathrm{entry}_{ji}(AB) = \sum_k \mathrm{entry}_{jk}(A)\,\mathrm{entry}_{ki}(B) = \sum_k \mathrm{entry}_{ik}(B^T)\,\mathrm{entry}_{kj}(A^T) = \mathrm{entry}_{ij}(B^TA^T).$$
Note also that the transpose of an elementary matrix is again an elementary matrix. For example,
$$\begin{bmatrix} 1 & c \\ 0 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 \\ c & 1 \end{bmatrix}, \qquad \begin{bmatrix} c & 0 \\ 0 & 1 \end{bmatrix}^T = \begin{bmatrix} c & 0 \\ 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
Finally, a matrix A is invertible if and only if its transpose A^T is invertible (because BA = AB = I ⟹ A^TB^T = B^TA^T = I^T = I) and the inverse of the transpose is the transpose of the inverse:
$$\bigl(A^T\bigr)^{-1} = \bigl(A^{-1}\bigr)^T.$$


Remark 50. The text (see page 235) does not distinguish R^n and R^{n×1} and sometimes uses parentheses in place of square brackets for typographical reasons. It also uses the transpose notation for the same purpose, so
$$x = (x_1, x_2, \ldots, x_n) = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$

Theorem 51. A matrix and its transpose have the same determinant:

det(AT) = det(A).

Proof. The theorem is true for elementary matrices and every invertible ma-trix is a product of elementary matrices. Hence it holds for invertible matricesby Theorem 44. If A is not invertible then det(AT) = 0 and det(A) = 0 soagain det(AT) = det(A).

8 Friday March 6

Definition 52. A vector space is a set V whose elements are called vectors and which is equipped with

(i) an element 0 called the zero vector,
(ii) a binary operation called vector addition which assigns to each pair (u, v) of vectors another vector u + v, and
(iii) an operation called scalar multiplication which assigns to each number c and each vector v another vector cv,

such that the following properties hold:

(u + v) + w = u + (v + w).  (Additive Associative Law)
u + v = v + u.  (Additive Commutative Law)
u + 0 = u.  (Additive Identity)
u + (−1)u = 0.  (Additive Inverse)
c(u + v) = cu + cv, (b + c)u = bu + cu.  (Distributive Laws)
(bc)u = b(cu).  (Scalar Associative Law)
1u = u.  (Scalar Unit)
0u = 0, c0 = 0.  (Multiplication by Zero)

Example 53. As noted above in paragraph 18, the set of all m × n matrices with the operations defined there is a vector space. This vector space is denoted by M_{mn} in the text (see page 272); other textbooks denote it by R^{m×n}. (R denotes the set of real numbers.) Note that the text (see page 235 and Remark 50 above) uses R^n as a synonym for R^{n×1} and has three notations for the elements of R^n:
$$x = (x_1, x_2, \ldots, x_n) = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$

Examples 54. Here are some examples of vector spaces.

(i) The set F of all real valued functions of a real variable.
(ii) The set P of all polynomials with real coefficients.
(iii) The set P_n of all polynomials of degree ≤ n.
(iv) The set of all solutions of the homogeneous linear differential equation
$$\frac{d^2x}{dt^2} + x = 0.$$
(v) The set of all solutions of any homogeneous linear differential equation.

The zero vector in F is the constant function whose value is zero, and the operations of addition and scalar multiplication are defined pointwise, i.e. by
$$(f+g)(x) := f(x) + g(x), \qquad (cf)(x) := cf(x).$$
The set P is a subspace of F (a polynomial is a function); in fact, all these vector spaces are subspaces of F. The zero polynomial has zero coefficients, adding two polynomials of degree ≤ n is the same as adding the coefficients:
$$(a_0 + a_1x + \cdots + a_nx^n) + (b_0 + b_1x + \cdots + b_nx^n) = (a_0+b_0) + (a_1+b_1)x + \cdots + (a_n+b_n)x^n,$$
and multiplying a polynomial by a number c is the same as multiplying each coefficient by c:
$$c(a_0 + a_1x + \cdots + a_nx^n) = ca_0 + ca_1x + \cdots + ca_nx^n.$$


Definition 55. A subset W ⊆ V of a vector space V is called a subspace iff it is closed under the vector space operations, i.e. iff

(i) 0 ∈ W,
(ii) u, v ∈ W ⟹ u + v ∈ W, and
(iii) c ∈ R, u ∈ W ⟹ cu ∈ W.

Remark 56. The definition of subspace on page 237 of the text appearsnot to require the condition (i) that 0 ∈ W . However that definition doesspecify that W is non empty; this implies that 0 ∈ W as follows. There isan element u ∈ W since W is nonempty. Hence (−1)u ∈ W by (iii) and0 = u + (−1)u ∈ W by (ii). Conversely, if 0 ∈ W , then the set W isnonempty as it contains the element 0.

The student is cautioned not to confuse the vector 0 with the empty set.The latter is usually denoted by ∅. The empty set is characterized by thefact that it has no elements, i.e. the statement x ∈ ∅ is always false. Inparticular, 0 /∈ ∅. The student should also take care to distinguish the wordssubset and subspace. A subspace of a vector space V is a subset of V withcertain properties, and not every subset of V is a subspace.

57. A subspace of a vector space is itself a vector space. To decide if a subsetW of a vector space V is a subspace you must check that the three propertiesin Definition 55 hold.

Example 58. The set Pn of ploynomials of degree ≤ n is a subset of thevector space P (its elements are polynomials) and the set Pn is a subspaceof Pm if n ≤ m (if n ≤ m then a p polynomial of degree ≤ n has degree≤ m). These are also subspace because they are closed under the vectorspace operations.

9 Monday March 9 – Wednesday March 11

Definition 59. Let v1, v2, . . . , vk be vectors in a vector space V. The vector w in V is said to be a linear combination of the vectors v1, v2, . . . , vk iff there exist numbers x1, x2, . . . , xk such that

w = x1v1 + x2v2 + · · · + xkvk.

The set of all linear combinations of v1, v2, . . . , vk is called the span of v1, v2, . . . , vk. The vectors v1, v2, . . . , vk are said to span V iff V is the span of v1, v2, . . . , vk, i.e. iff every vector in V is a linear combination of v1, v2, . . . , vk.

Theorem 60. Let v1, v2, . . . , vk be vectors in a vector space V. Then the span of v1, v2, . . . , vk is a subspace of V.

Proof. (See Theorem 1 page 243 of the text.) The theorem says that a linear combination of linear combinations is a linear combination. Here are the details of the proof.

(i) 0 is in the span since 0 = 0v1 + 0v2 + · · · + 0vk.

(ii) If v and w are in the span, there are numbers a1, . . . , ak and b1, . . . , bk such that v = a1v1 + a2v2 + · · · + akvk and w = b1v1 + b2v2 + · · · + bkvk so v + w = (a1 + b1)v1 + (a2 + b2)v2 + · · · + (ak + bk)vk so v + w is in the span.

(iii) If c is a number and v is in the span then there are numbers a1, . . . , ak such that v = a1v1 + a2v2 + · · · + akvk so cv = ca1v1 + ca2v2 + · · · + cakvk so cv is in the span.

Thus we have proved that the span satisfies the three conditions in the definition of subspace so the span is a subspace.

Example 61. If A = [ a1  a2  · · ·  an ] is an m × n matrix and b ∈ R^m, then b is a linear combination of a1, a2, . . . , an if and only if the linear system b = Ax is consistent, i.e. has a solution x. This is because of the formula

Ax = x1a1 + x2a2 + · · · + xnan

for x = (x1, x2, . . . , xn). The span of the columns a1, a2, . . . , an is called the column space of A.
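The consistency test in Example 61 is easy to check by machine: b is in the column space exactly when adjoining b to A does not increase the rank. A small sketch (assuming Python with numpy is available; the matrix and vector below are made up for illustration):

    import numpy as np

    # Is b in the column space of A?  (Made-up 3x2 example.)
    A = np.array([[1., 2.],
                  [3., 6.],
                  [0., 1.]])
    b = np.array([1., 3., 2.])

    # b = Ax is consistent exactly when adjoining b does not increase the rank.
    rank_A  = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    print(rank_A, rank_Ab, rank_A == rank_Ab)   # equal ranks <=> b is in the column space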

Definition 62. Let v1, v2, . . . , vk be vectors in a vector space V. The vectors v1, v2, . . . , vk are said to be independent⁵ iff the only solution of the equation

x1v1 + x2v2 + · · · + xkvk = 0  (∗)

is the trivial solution x1 = x2 = · · · = xk = 0. The vectors v1, v2, . . . , vk are said to be dependent iff they are not independent, i.e. iff there are numbers x1, x2, . . . , xk not all zero which satisfy (∗).

⁵The more precise term linearly independent is usually used. We will use the shorter term since this is the only kind of independence we will study in this course.

Theorem 63. The vectors v1, v2, . . . , vk are dependent if and only if one of them is in the span of the others.

Proof. Assume that v1, v2, . . . , vk are dependent. Then there are numbers x1, x2, . . . , xk not all zero such that x1v1 + x2v2 + · · · + xkvk = 0. Since the numbers x1, x2, . . . , xk are not all zero, one of them, say xi, is not zero so

vi = −(x1/xi)v1 − · · · − (xi−1/xi)vi−1 − (xi+1/xi)vi+1 − · · · − (xk/xi)vk,

i.e. vi is a linear combination of v1, . . . , vi−1, vi+1, . . . , vk. Suppose conversely that vi is a linear combination of v1, . . . , vi−1, vi+1, . . . , vk. Then there are numbers c1, . . . , ci−1, ci+1, . . . , ck such that vi = c1v1 + · · · + ci−1vi−1 + ci+1vi+1 + · · · + ckvk. Then x1v1 + x2v2 + · · · + xkvk = 0 where xj = cj for j ≠ i and xi = −1. Since −1 ≠ 0 the numbers x1, x2, . . . , xk are not all zero and so the vectors v1, v2, . . . , vk are dependent.

Remark 64. (A pedantic quibble.) The text says things like "the set of vectors v1, v2, . . . , vk is independent" but it is better to use the word "sequence" instead of "set". The sets {v, v} and {v} are the same (both consist of the single element v) but if v ≠ 0 the sequence whose one and only element is v is independent (since cv = 0 only if c = 0) whereas the two element sequence v, v (same vector repeated) is always dependent since c1v + c2v = 0 if c1 = 1 and c2 = −1.

Definition 65. A basis for a vector space V is a sequence v1, v2, . . . , vn of vectors in V which both spans V and is independent.

Theorem 66. If v1, v2, . . . , vn is a basis for V and w1, w2, . . . , wm is a basis for V, then m = n.

This is Theorem 2 on page 251 of the text. We will prove it next time. It justifies the following

Definition 67. The dimension of a vector space is the number of elements in some (and hence every) basis.

Remark 68. It can happen that there are arbitrarily long independent sequences in V. For example, this is the case if V = P, the space of all polynomials: for every n the vectors 1, x, x^2, . . . , x^n are independent. In this case we say that V is infinite dimensional.


10 Friday March 13

Proof of Theorem 66. Let w1, w2, . . . , wm and v1, v2, . . . , vn be two sequences of vectors in a vector space V. It is enough to prove

(†) If w1, w2, . . . , wm span V, and v1, v2, . . . , vn are independent, then n ≤ m.

To deduce Theorem 66 from this we argue as follows: If both sequences w1, w2, . . . , wm and v1, v2, . . . , vn are bases then the former spans and the latter is independent so n ≤ m. Reversing the roles gives m ≤ n. If n ≤ m and m ≤ n, then m = n. To prove the assertion (†) it is enough to prove the contrapositive:

If w1, w2, . . . , wm span V and n > m, then v1, v2, . . . , vn are dependent.

To prove the contrapositive note that because w1, w2, . . . , wm span there are (for each j = 1, . . . , n) constants a1j, . . . , amj such that

vj = Σ_{i=1}^m aij wi.

This implies that for any numbers x1, x2, . . . , xn we have

Σ_{j=1}^n xj vj = Σ_{j=1}^n xj ( Σ_{i=1}^m aij wi ) = Σ_{i=1}^m ( Σ_{j=1}^n aij xj ) wi.  (#)

Since n > m the homogeneous linear system

Σ_{j=1}^n aij xj = 0,  i = 1, 2, . . . , m  (♭)

has more unknowns than equations so there is a nontrivial solution x = (x1, x2, . . . , xn). The left hand side of (♭) is the coefficient of wi in (#) so (♭) implies that Σ_{j=1}^n xj vj = 0, i.e. that v1, v2, . . . , vn are dependent.

Definition 69. For an m× n matrix A

(i) The row space is the span of the rows of A.


(ii) The column space is the span of the columns of A.

(iii) The null space is the set of all solutions x ∈ R^n of the homogeneous system Ax = 0.

The dimension of the row space of A is called the rank of A. The text calls the dimension of the row space the row rank and the dimension of the column space the column rank, but Theorem 74 below says that these are equal.

Theorem 70 (Equivalent matrices). Suppose that A and B are equivalent m × n matrices. Then

(i) A and B have the same null space.

(ii) A and B have the same row space.

Proof. Assume that A and B are equivalent. Then B = MA where M = E1E2 · · · Ek is a product of elementary matrices. If Ax = 0 then Bx = MAx = M0 = 0. Similarly if Bx = 0 then Ax = M^{-1}Bx = M^{-1}0 = 0. Hence Ax = 0 ⇐⇒ Bx = 0 which shows that A and B have the same null space. Another way to look at it is that performing an elementary row operation doesn't change the space of solutions of the corresponding homogeneous linear system. This proves (i).

Similarly performing an elementary row operation doesn't change the row space. This is because if E is an elementary matrix then each row of EA is either a row of A or is a linear combination of two rows of A, so a linear combination of rows of EA is also a linear combination of rows of A (and vice versa since E^{-1} is also an elementary matrix). This proves (ii).

Theorem 71. The rank of a matrix A is the number r of nonzero rows in the reduced echelon form of A.

Proof. By part (ii) of Theorem 70 it is enough to prove this for a matrix which is in reduced echelon form. The nonzero rows clearly span the row space (by the definition of the row space) and they are independent since the identity matrix appears as an r × r submatrix.

Theorem 72. The null space of an m × n matrix has dimension n − r where r is the rank of the matrix.


Proof. The algorithm on page 254 of the text finds a basis for the null space. You put the matrix in reduced echelon form. The number of leading variables is r so there are n − r free variables. A basis consists of the solutions of the system obtained by setting one of the free variables to one and the others to zero.
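The leading/free variable count in this proof can be checked symbolically. A short sketch (assuming Python with sympy is available; the 3 × 4 matrix below is made up for illustration):

    from sympy import Matrix

    # Rank r and a basis of the null space (dimension n - r) of a made-up matrix.
    A = Matrix([[1, 2, 0, 3],
                [2, 4, 1, 7],
                [1, 2, 1, 4]])

    R, pivots = A.rref()          # reduced echelon form and the pivot (leading) columns
    r = len(pivots)               # rank = number of nonzero rows of R
    null_basis = A.nullspace()    # one basis vector per free variable
    print(r, A.cols - r, len(null_basis))   # expect r, n - r, n - r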

Spring recess. Mar 14-22 (S-N)

11 Monday March 23

73. Theorem 70 says that equivalent matrices have the same row space, but they need not have the same column space. The matrices

    A = [ 1  0 ]        B = [ 1  0 ]
        [ 1  0 ]            [ 0  0 ]

are equivalent and the row space of each is the set of multiples of the row [ 1  0 ], but the column spaces are different: the column space of A consists of all multiples of the column (1, 1) while the column space of B consists of all multiples of the column (1, 0). However

Theorem 74. The row rank equals the column rank, i.e. the column space and row space of an m × n matrix A have the same dimension.

Proof. Theorem 63 says that if a sequence v1, . . . , vn of vectors is dependent then one of them is a linear combination of the others. This vector can be deleted without changing the span. In particular, if the columns of a matrix are dependent we can delete one of them without changing the column space. This process can be repeated until the vectors that remain are independent. The remaining vectors then form a basis. Thus a basis for the column space of A can be selected from the columns of A. The algorithm in the text on page 259 tells us that these can be the pivot columns of A: these are the columns corresponding to the leading variables in the reduced echelon form.

Let ak be the kth column of A and rk be the kth column of the reduced echelon form R of A. Then

A = [ a1  a2  · · ·  an ],   R = [ r1  r2  · · ·  rn ],

and MA = R where M is the invertible matrix which is the product of the elementary matrices used to transform A to its reduced echelon form R. Now matrix multiplication distributes over concatenation:

MA = [ Ma1  Ma2  · · ·  Man ] = R = [ r1  r2  · · ·  rn ],

so

Mak = rk  and  ak = M^{-1}rk

for k = 1, 2, . . . , n. After rearranging the columns of R and rearranging the columns of A the same way we may assume that the first r columns of R are the first r columns e1, e2, . . . , er of the identity matrix and the last n − r rows of R are zero. Then each of the last n − r columns of R is a linear combination of the first r columns so (multiplying by M^{-1}) each of the last n − r columns of A is a linear combination of the first r columns (with the same coefficients). Hence the first r columns of A span the column space of A. If some linear combination of the first r columns of A is zero, then (multiplying by M) the same linear combination of the first r columns of R is zero. But the first r columns of R are the first r columns of the identity matrix so the coefficients must be zero. Hence the first r columns of A are independent.

Example 75. The following matrices were computer generated.

    A = [  1   3   19   23 ]      R = [ 1  0  4  5 ]
        [ -1  -2  -14  -17 ]          [ 0  1  5  6 ]
        [ -2  -6  -38  -46 ]          [ 0  0  0  0 ]
        [ -2  -7  -43  -52 ]          [ 0  0  0  0 ]

    M = [ -2  -1  -3   2 ]     M^{-1} = [  1   3   10  -29 ]
        [ -6   0  -2  -1 ]              [ -1  -2   -7   21 ]
        [  5   3  -2   3 ]              [ -2  -6  -19   55 ]
        [  1   1  -1   1 ]              [ -2  -7  -22   64 ]

The matrix R is the reduced echelon form of A and MA = R. The pivot columns are the first two columns. The third column of R is 4e1 + 5e2 and the third column of A is a3 = 4a1 + 5a2. The fourth column of R is 5e1 + 6e2 and the fourth column of A is a4 = 5a1 + 6a2. The first two columns of A are the same as the first two columns of M^{-1}.
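The claims in Example 75 are easy to verify by machine. A quick check (assuming Python with numpy is available):

    import numpy as np

    A = np.array([[ 1,  3,  19,  23],
                  [-1, -2, -14, -17],
                  [-2, -6, -38, -46],
                  [-2, -7, -43, -52]], dtype=float)
    M = np.array([[-2, -1, -3,  2],
                  [-6,  0, -2, -1],
                  [ 5,  3, -2,  3],
                  [ 1,  1, -1,  1]], dtype=float)
    R = np.array([[1, 0, 4, 5],
                  [0, 1, 5, 6],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]], dtype=float)

    print(np.allclose(M @ A, R))                             # MA = R
    print(np.allclose(A[:, 2], 4*A[:, 0] + 5*A[:, 1]))       # a3 = 4a1 + 5a2
    print(np.allclose(A[:, 3], 5*A[:, 0] + 6*A[:, 1]))       # a4 = 5a1 + 6a2
    print(np.allclose(np.linalg.inv(M)[:, :2], A[:, :2]))    # first two columns of M^{-1} are a1, a2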


11.1 Wednesday March 25

The following material is treated in Section 4.6 of the text. We may not have time to cover it in class so you should learn it on your own.

Definition 76. The inner product of two vectors u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) in R^n is denoted 〈u,v〉 and defined by

〈u,v〉 = u1v1 + u2v2 + · · · + unvn.

It was called the dot product in Math 222. It can also be expressed in terms of the transpose operation as

〈u,v〉 = u^T v.

The length |u| of the vector u is defined as

|u| := √〈u,u〉.

Two vectors are called orthogonal iff their inner product is zero.

77. The inner product satisfies the following.

(i) 〈u,v〉 = 〈v,u〉.
(ii) 〈u,v + w〉 = 〈u,v〉 + 〈u,w〉.
(iii) 〈cu,v〉 = c〈u,v〉.
(iv) 〈u,u〉 ≥ 0 and 〈u,u〉 = 0 ⇐⇒ u = 0.
(v) |〈u,v〉| ≤ |u| |v|.
(vi) |u + v| ≤ |u| + |v|.

The inequality (v) is called the Cauchy-Schwarz Inequality. It justifies defining the angle θ between two nonzero vectors u and v by the formula

〈u,v〉 = |u| |v| cos θ.

Thus two vectors are orthogonal iff the angle between them is π/2. The inequality (vi) is called the triangle inequality.
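These definitions are straightforward to compute with. A small sketch (assuming Python with numpy is available; the two vectors are made up for illustration):

    import numpy as np

    u = np.array([1., 2., 2.])
    v = np.array([2., 1., -2.])

    ip = np.dot(u, v)                          # <u, v>
    len_u, len_v = np.linalg.norm(u), np.linalg.norm(v)
    theta = np.arccos(ip / (len_u * len_v))    # angle defined by <u,v> = |u||v| cos(theta)

    print(abs(ip) <= len_u * len_v)                  # Cauchy-Schwarz inequality
    print(np.linalg.norm(u + v) <= len_u + len_v)    # triangle inequality
    print(ip, theta)                                 # here <u,v> = 0, so theta = pi/2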


Theorem 78. Suppose that the vectors v1, v2, . . . , vk are nonzero and pairwise orthogonal, i.e. 〈vi,vj〉 = 0 for i ≠ j. Then the sequence v1, v2, . . . , vk is independent.

Definition 79. Let V be a subspace of R^n. The orthogonal complement is the set V⊥ of vectors which are orthogonal to all the vectors in V, in other words

w ∈ V⊥ ⇐⇒ 〈v, w〉 = 0 for all v ∈ V.

Theorem 80. The column space of A^T is the orthogonal complement to the null space of A.
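Theorem 80 can be illustrated numerically: any vector in the null space of A has inner product zero with every row of A, i.e. with every column of A^T. A small sketch (assuming Python with numpy is available; the matrix and the null-space vector are made up for illustration):

    import numpy as np

    # Made-up 2x3 matrix.
    A = np.array([[1., 2., 3.],
                  [0., 1., 1.]])

    # One vector in the null space of A (found by hand): w = (-1, -1, 1).
    w = np.array([-1., -1., 1.])
    print(np.allclose(A @ w, 0))                 # w is in the null space of A
    print(np.dot(A[0], w), np.dot(A[1], w))      # w is orthogonal to each row of A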

Exam II. Friday Mar 27


Math 320 Spring 2009
Part I – Differential Equations

JWR

March 9, 2009

The text is Differential Equations & Linear Algebra (Second Edition) by Edwards & Penney.

1 Wednesday Jan 21

1. In first year calculus you learned to solve a linear differential equation like

dy/dt = 2y + 3,  y(0) = 5.  (1)

This semester you will learn to solve a system of linear differential equations like:

dx/dt = 3x + y + 7,  dy/dt = x + 5y − 2,  (x(0), y(0)) = (4, 8).  (2)

Note that if you can solve systems of equations like (2) you can also solve higher order equations like

d^2y/dt^2 = 3 dy/dt + y + 7,  y(0) = 4,  dy/dt|_{t=0} = 8.  (3)

You can change (3) into a system:

dy/dt = v,  dv/dt = 3v + y + 7,  y(0) = 4,  v(0) = 8.  (4)
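Systems like (2) and (4) can also be solved numerically. A sketch for the system (4) (assuming Python with scipy is available; the final time t = 1 is chosen arbitrarily):

    import numpy as np
    from scipy.integrate import solve_ivp

    # The second order equation (3) rewritten as the first order system (4).
    def rhs(t, u):
        y, v = u
        return [v, 3*v + y + 7]

    sol = solve_ivp(rhs, (0.0, 1.0), [4.0, 8.0], rtol=1e-8, atol=1e-8)
    print(sol.t[-1], sol.y[0, -1], sol.y[1, -1])   # y(1) and y'(1)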


2. An ODE (ordinary differential equation) of order n looks like

F(t, y, y′, y′′, . . . , y^(n)) = 0.  (5)

The unknown is a function y = y(t) of the independent variable t and

y′ := dy/dt,  y′′ := d^2y/dt^2,  . . . ,  y^(n) := d^n y/dt^n.

When the equation looks like

y^(n) = G(t, y, y′, y′′, . . . , y^(n−1))  (6)

we say it is in normal form. It may be impossible to rewrite equation (5) as equation (6). A system of differential equations is the same thing with the single unknown y replaced by the vector y := (y1, y2, . . . , ym).

Remark 3. As our first examples will show, the independent variable often has the interpretation of time which is why the letter t is used. In this case the ODE represents the time evolution of a dynamical system. For example the 2nd order system

m d^2r/dt^2 = −(GMm/r^3) r

describes the motion of a planet of mass m moving about a sun of mass M. The sun is at the origin, r is the position vector of the planet, and r = |r| is the length of r, i.e. the distance from the planet to the sun. Sometimes the ODE has a geometric interpretation in which case the letter x is often used for the independent variable.

Example 4. Swimmer crossing a river (Text page 15.) Let the banks of a river be the vertical lines x = ±a in the (x, y) plane and suppose that the river flows up so that the velocity vR of the river at the point (x, y) is

vR = v0 (1 − x^2/a^2).

The formula says that vR = 0 on the banks where x = ±a and vR = v0 in the center of the river where x = 0 (the y-axis). The swimmer swims with constant velocity vS towards the closest point on the opposite shore. The system

dx/dt = vS,  dy/dt = vR = v0 (1 − x^2/a^2)

is a dynamical system describing the position of the swimmer. Dividing the two equations and using dy/dx = (dy/dt)/(dx/dt) gives the geometric equation

dy/dx = (v0/vS)(1 − x^2/a^2)

which describes the trajectory of the swimmer.

Example 5. Newton's law of cooling. This says that the rate of change of the temperature T of a body (e.g. a cup of coffee) is proportional to the difference A − T between the ambient temperature (i.e. room temperature) and the temperature of the body. The ODE is

dT/dt = k(A − T).

In a tiny time from t to t + h of duration ∆t = (t + h) − t = h the change in the temperature is ∆T = T(t + h) − T(t) so the rate of change is ∆T/∆t. By Newton's law of cooling we have (approximately)

∆T/∆t ≈ k(A − T).

It doesn't matter much if we use T = T(t) or T = T(t + h) on the right hand side because T is continuous and h is small. By the definition of the derivative

dT/dt = lim_{∆t→0} ∆T/∆t = lim_{h→0} (T(t + h) − T(t))/h

so we get the exact form of Newton's law of cooling as the limit as h → 0 of the approximate form.

2 Friday Jan 23

Theorem 6 (Existence and Uniqueness Theorem). Suppose that f(t, y) is a continuous function of two variables defined in a region R in the (t, y) plane and that the partial derivative ∂f/∂y exists and is continuous everywhere in R. Let (t0, y0) be a point in R. Then there is a solution y = y(t) to the initial value problem

dy/dt = f(t, y),  y(t0) = y0

defined on some interval I about t0. The solution is unique in the sense that any two such solutions of the initial value problem are equal where both are defined.

Remark 7. The theorem is stated on page 23 of the text and proved in an appendix. The same theorem holds for systems and hence higher order equations. We usually solve an ODE by doing an integration. Then an arbitrary constant C arises and we choose it to satisfy the initial condition y(t0) = y0. The Existence and Uniqueness Theorem tells us that this is the only answer.

8. The first order ODE

dx/dt = f(t, x)

has an important special case, where the function f(t, x) factors as a product

f(t, x) = g(x)h(t)

of a function g(x) of x and a function h(t) of t. Then we can write the ODE dx/dt = g(x)h(t) as dx/g(x) = h(t) dt, integrate to get

∫ dx/g(x) = ∫ h(t) dt,

and solve for x in terms of t. When g(x) is identically one, the equation is dx/dt = h(t) so the answer is x = ∫ h(t) dt. When h(t) is identically one, the system is autonomous, i.e. dx/dt = g(x). In this case we can find out a lot about the solutions from the phase diagram.¹

¹We'll study this later in Section 2.2 of the text. See Figure 2.2.7 on page 93.

Example 9. Braking a car. A car going at speed v0 skids to a stop at a constant deceleration k in time T leaving skid marks of length L. We find each of the four quantities in terms of the other three. Let the brakes be applied at time t = 0, so the car stops at time t = T, and let v = v(t) denote the velocity at time t, and x = x(t) denote the distance travelled over the time interval [0, t]. Then the statement of the problem translates into the equations

dv/dt = −k,  v = dx/dt,  v(0) = v0,  v(T) = 0,  x(0) = 0,  x(T) = L.

Integrating the first differential equation gives

∫ (dv/dt) dt = ∫ −k dt = −kt + C,

so C = v(0) = v0 and v(t) = v0 − kt, so 0 = v(T) = v0 − kT so v0 = kT, k = v0/T, and T = v0/k. Integrating the second differential equation gives

L = x(T) − x(0) = ∫_0^T (dx/dt) dt = ∫_0^T v(t) dt = ∫_0^T (v0 − kt) dt = v0 T − (1/2)kT^2.

From T = v0/k we get L = v0^2/k − (1/2)v0^2/k = (1/2)v0^2/k. (See problems 30-32 page 17 of the text.)

Remark 10. Mathematically this is the same problem as the problem of a falling body on earth: If y is the height of the body, v = dy/dt is its speed, a = dv/dt = d^2y/dt^2 is the acceleration, then Newton's second law is F = ma = −mg where g = 32 ft/sec^2 = 9.8 m/sec^2 is the acceleration due to gravity, so

v = dy/dt = −gt + v0,  y = −gt^2/2 + v0 t + y0.

Example 11. Population equation (exponential growth and decay). The differential equation

dP/dt = kP

says that the rate of growth (or decay if k < 0) of a quantity P is proportional to its size. We solve by separation of variables: dP/P = k dt so

ln P = ∫ dP/P = ∫ k dt = kt + C = kt + ln P0

so P = P0 e^{kt}.

12. A single linear homogeneous equation. The more general equation

dy/dt = R(t)y

is solved the same way: dy/y = R(t) dt so

ln y = ∫ dy/y = ∫ R(t) dt

and exponentiating this equation gives

y = e^{∫ R(t) dt}.

Note that the additive constant in ∫ R(t) dt becomes a multiplicative constant after exponentiating. For example, integrating the equation

dy/dt = ty

gives

ln y = ∫ dy/y = ∫ t dt = (1/2)t^2 + C

so exponentiating gives

y = exp((1/2)t^2 + C) = exp((1/2)t^2) exp(C) = y0 exp((1/2)t^2)

where y0 = e^C. (For typographical reasons the exponential function is often denoted as exp(x) := e^x.)

Example 13. Consider the function f(y) = |y|^p. On the region where y ≠ 0 the derivative f′(y) is continuous so the Existence and Uniqueness Theorem applies to solutions which stay in this region. We solve dy/dt = |y|^p by separation of variables: where y > 0,

y^{1−p}/(1 − p) = ∫ dy/y^p = t − c,

so (as long as t − c > 0)

y = [(1 − p)(t − c)]^{1/(1−p)}.

When y < 0 we have |y| = −y and

y = −[(1 − p)|t − c|]^{1/(1−p)}.

Funny things happen when y = 0. If p > 1 the derivative f′(y) is continuous so by the Existence and Uniqueness Theorem the only solution with y(t0) = 0 is y ≡ 0. (This is reflected in the fact that the above formula for y becomes infinite when t = c.) If 0 < p < 1 however a solution can remain at zero for a finite amount of time and follow one of the above solutions to the left and right. For example for p = 1/2 and any choice of c1 < 0 and c2 > 0 the function

y(t) = −(1/4)(c1 − t)^2   for t < c1,
y(t) = 0                  for c1 ≤ t ≤ c2,
y(t) = (1/4)(t − c2)^2    for c2 < t

solves the ODE and the initial condition y(0) = 0, so the solution is not unique. This is essentially the example of Remark 2 on page 23 of the text.

3 Monday Jan 26

14. Slope fields and phase diagrams. To draw the slope field of an ODE dy/dx = f(x, y) draw a little line segment of slope f(x, y) at many points (x, y) in the (x, y)-plane. The curves tangent to these little line segments are the graphs of the solution curves. This is a lot of work unless you have a computer and it is often not very helpful. In the case of an autonomous ODE dy/dt = f(y) the phase diagram (see e.g. Figures 2.2.8 and 2.2.9 on page 94 of the text) is more helpful. This is a line representing the y-axis with the zeros of f indicated and the intervals in between the zeros marked with an arrow indicating the sign of f(y) for y in that interval.

Example 15. Swimmer crossing river. Recall from last Wednesday the dynamical system

dx/dt = vS,  dy/dt = vR = v0 (1 − x^2/a^2)

describing the position of the swimmer. Dividing the two equations and using dy/dx = (dy/dt)/(dx/dt) gives the geometric equation

dy/dx = k (1 − x^2/a^2),  k := v0/vS,

which describes the trajectory of the swimmer. Take k = 1, a = 1. The solution curves are

y = x − x^3/3 + C.

They are vertical translates of one another. The solution with C = 0 starts at the point (x, y) = (−1, −2/3) and ends at (x, y) = (1, 2/3).
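The trajectory can be checked numerically against the formula y = x − x^3/3. A sketch (assuming Python with numpy and scipy are available):

    import numpy as np
    from scipy.integrate import solve_ivp

    # Swimmer trajectory with k = a = 1:  dy/dx = 1 - x^2, starting at (-1, -2/3).
    sol = solve_ivp(lambda x, y: [1 - x**2], (-1.0, 1.0), [-2.0/3.0],
                    dense_output=True, rtol=1e-10, atol=1e-10)
    x = np.linspace(-1, 1, 5)
    print(sol.sol(x)[0])          # numerical solution
    print(x - x**3/3)             # exact solution curve y = x - x^3/3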

Example 16. The slope field for dy/dx = x − y. The slope is horizontal on the line y = x, negative to the left and positive to the right. The picture in the text (page 20) suggests that the solutions are asymptotic as x → ∞. We'll check this in the next lecture.

17. The phase diagram for dy/dt = (y − a)(y − b). Assume that a < b so dy/dt > 0 for y < a and for b < y while dy/dt < 0 for a < y < b. The phase diagram is a line with the equilibria a and b marked; the arrows point toward a from both sides and away from b on both sides, so a is stable and b is unstable. From the diagram we can see that

lim_{t→∞} y(t) = a                                if y(0) < a,
y(t) = a for all t                                if y(0) = a,
lim_{t→∞} y(t) = a  and  lim_{t→−∞} y(t) = b      if a < y(0) < b,
y(t) = b for all t                                if y(0) = b,
lim_{t→T1−} y(t) = ∞                              if b < y(0),
lim_{t→T2+} y(t) = −∞                             if y(0) < a.


The diagram does not tell us whether T1 and T2 are finite. For this we will solve the equation by separation of variables and partial fractions.

1/((y − a)(y − b)) = (1/(b − a)) ( 1/(y − b) − 1/(y − a) )

∫ dy/((y − a)(y − b)) = ∫ dt

ln( |y − b| / |y − a| ) = (b − a)t + c.

|y − b| / |y − a| = Ce^{(b−a)t},  C := e^c.

What to do about the absolute values? Well certainly

(y − b)/(y − a) = ±Ce^{(b−a)t},

y = y0 when t = 0, and the exponential is positive so we must have

±C = (y0 − b)/(y0 − a),  y0 := y(0).

Now we can solve for y. We introduce the abbreviation u := ±Ce^{(b−a)t} to save writing:

(y − b)/(y − a) = u  =⇒  y = (y − a)u + b  =⇒  y(1 − u) = b − au  =⇒  y = (b − au)/(1 − u).

Now plug back in the values of u and ±C and multiply top and bottom of the resulting fraction by y0 − a to simplify:

y = ( (y0 − a)b − a(y0 − b)e^{(b−a)t} ) / ( (y0 − a) − (y0 − b)e^{(b−a)t} ).

As a check we plug in t = 0. We get

y = ( (y0 − a)b − a(y0 − b) ) / ( (y0 − a) − (y0 − b) ) = (b − a)y0 / (b − a) = y0,

as expected. Now

lim_{t→∞} y(t) = −a(y0 − b)/(−(y0 − b)) = a,   lim_{t→−∞} y(t) = (y0 − a)b/(y0 − a) = b,

but if y0 < a there is a negative value of t (namely t = T2 above) where the denominator vanishes and similarly if y0 > b there is a positive value of t (namely t = T1 above) where the denominator vanishes.
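The explicit formula for y can be compared with a numerical solution. A sketch (assuming Python with numpy and scipy are available; the values a = 1, b = 3, y0 = 2 are made up for illustration):

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, y0 = 1.0, 3.0, 2.0       # made-up values with a < y0 < b

    def exact(t):
        e = np.exp((b - a) * t)
        return ((y0 - a)*b - a*(y0 - b)*e) / ((y0 - a) - (y0 - b)*e)

    sol = solve_ivp(lambda t, y: [(y[0] - a)*(y[0] - b)], (0.0, 5.0), [y0],
                    dense_output=True, rtol=1e-10, atol=1e-10)
    t = np.linspace(0.0, 5.0, 6)
    print(np.max(np.abs(sol.sol(t)[0] - exact(t))))   # should be tiny
    print(exact(50.0))                                # close to a for large t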


4 Wednesday Jan 28

18. Three ways to solve dy/dt + 2y = 3. A linear first order ODE is one of the form

dy/dt + P(t)y = Q(t).  (1)

If P and Q are constants we can solve by separation of variables. For example to solve dy/dt + 2y = 3 we write

(1/2) ln|2y − 3| = ∫ dy/(2y − 3) = −∫ dt = −t + c

so 2y − 3 = Ce^{−2t} (where C = ±e^{2c}) and hence y = (3 + Ce^{−2t})/2. This doesn't work if either P or Q is not a constant. In the method of integrating factors we multiply the ODE (1) by a function ρ to get

ρ(t) dy/dt + ρ(t)P(t)y = ρ(t)Q(t)

and then choose ρ so that

dρ/dt = ρ(t)P(t).  (2)

The ODE (1) then takes the form

d/dt (ρy) = ρQ  (3)

which can be solved by integration. In the method of variation of parameters we look for a solution of the form

y = Φ(t)u(t)

so the ODE (1) takes the form

(dΦ/dt)u + Φ(du/dt) + PΦu = Q.

Then once we solve

dΦ/dt + PΦ = 0  (4)

the ODE (1) simplifies to

du/dt = Φ^{−1}Q.  (5)

In either method we first reduce to a homogeneous linear ODE (either (2) or (4)) and then do an integration problem (either (3) or (5)).


Remark 19. Because equation (4) is the homogeneous linear ODE corresponding to the inhomogeneous linear ODE (1), the general solution of (4) is of the form Φ(t)C where C is an arbitrary constant. Having solved this problem by separating variables we solve (1) by trying to find a solution where the constant C is replaced by a variable u. For this reason the method of variation of parameters is also called the method of variation of constants. The text uses the method of integrating factors for a single ODE in section 1.5 page 50 and the method of variation of constants for systems in section 8.2 page 493.

Example 20. To solve dy/dx = x − y rewrite it as dy/dx + y = x. Multiply by ρ(x) = e^x to get

(dy/dx)e^x + ye^x = xe^x.

Then

d/dx (ye^x) = (dy/dx)e^x + ye^x = xe^x

so, integrating by parts,

ye^x = ∫ xe^x dx = xe^x − e^x + C

so

y = x − 1 + Ce^{−x}.

Note that the general solution is asymptotic to the particular (C = 0) solution y = x − 1.

21. The Superposition Principle. Important! If

dy1/dt + P(t)y1 = Q1(t)  and  dy2/dt + P(t)y2 = Q2(t)

and if y = y1 + y2 and Q = Q1 + Q2, then

dy/dt + P(t)y = Q(t).

In particular (take Q2 = 0 and Q = Q1) this shows that the general solution of an inhomogeneous linear equation dy/dt + P(t)y = Q(t) is the general solution of the corresponding homogeneous equation du/dt + P(t)u = 0 plus a particular solution of the inhomogeneous linear equation.


Example 22. When we discussed the slope field of dy/dx = x − y (text figure 1.3.6 page 20) we observed that it looks like all the solutions are asymptotic. Indeed if dy1/dx = x − y1 and dy2/dx = x − y2 then

d/dx (y1 − y2) = −(y1 − y2)

so y1 − y2 = Ce^{−x} so lim_{x→∞}(y1 − y2) = 0. This proves that all the solutions are asymptotic without solving the equation. The argument works more generally if x is replaced by Q(x), i.e. for the equation dy/dx = Q(x) − y.

5 Friday January 30

23. Mixture problems. Let x denote the amount of solute in a volume of size V and c denote its concentration. Then

c = x/V.

In a mixture problem, any of these may vary in time. Thus if a fluid with concentration cin (units = mass/volume) flows into a tank at a rate of rin (units = volume/time) the amount of solute added in time dt is cin rin dt. Similarly if a fluid with concentration cout (units = mass/volume) flows out of the tank at a rate of rout (units = volume/time) the amount of solute removed in time dt is cout rout dt. (The book uses the subscript i as an abbreviation for in and the subscript o as an abbreviation for out.) Hence the differential equation

dx/dt = cin rin − cout rout.

In such problems one generally assumes that cin, rin, and rout are constant but x, cout, and possibly also the volume V of the tank vary.

Example 24. A tank contains V liters of pure water. A solution that contains cin kg of sugar per liter enters the tank at the rate rin liters/min. The solution is mixed and drains from the tank at the same rate.

(a) How much sugar is in the tank initially?

(b) Find the amount of sugar x in the tank after t minutes.

(c) Find the concentration of sugar in the solution in the tank after 78 minutes.


In this problem rin = rout so the volume V of the tank is constant. In a time interval dt, cin rin dt kg of sugar enters the tank and (x(t)/V) rout dt kg of sugar leaves the tank so we have an inhomogeneous linear ODE

dx/dt = cin rin − (x/V) rout

with initial value x(0) = 0. To save writing we abbreviate c := cin, r := rin = rout so the ODE is

dx/dt = (c − x/V) r.

Solve by separation of variables:

−V ln(Vc − x) = ∫ V dx/(Vc − x) = ∫ r dt = rt + K.

Since the tank initially holds pure water we have x = 0 when t = 0, hence K = −V ln(Vc) so −K/V = ln(Vc). Solving for x gives

ln(Vc − x) = −rt/V + ln(Vc)  =⇒  x = Vc(1 − exp(−rt/V)).
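With the numbers from the WeBWorK instance mentioned in Remark 26 below (V = 2780, c = 0.06, r = 3) the answer can be evaluated directly. A sketch (assuming Python with numpy is available):

    import numpy as np

    V, c, r = 2780.0, 0.06, 3.0
    x = lambda t: V*c*(1 - np.exp(-r*t/V))

    print(x(0.0))             # 0: the tank starts with pure water
    print(x(78.0))            # amount of sugar after 78 minutes
    print(x(78.0) / V)        # concentration after 78 minutes (part (c))
    print(x(1e6), V*c)        # the limiting amount of sugar is cV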

Remark 25. When x is small, the term x/V is even smaller so the equation is roughly dx/dt = cin rin and the answer for small values of t is roughly x = (cin rin)t. For small values of t the amount of sugar x is also small and the approximation x = (cin rin)t is very accurate – so accurate that it may fool WeBWorK – but it is obviously wrong for large values of t. The reason is that lim_{t→∞}(cin rin)t = ∞ whereas lim_{t→∞} x = cin V so that the limiting concentration of the sugar in the tank is the same as the concentration of the solution flowing in.

Remark 26. One student was assigned this problem in WeBWorK with values of V = 2780, c = 0.06 and r = 3 and complained to me that WeBWorK rejected the answer. I typed

2780*0.06[1-exp(-3t/2780)]

and WeBWorK accepted the answer. The student had typed the value

(-2780/3)(exp((-3(t+1589))/2780)-.18)

and WeBWorK rejected that answer. The two answers would agree if

exp(-3*1589/2780)=0.18

but this isn't exactly true. I typed exp(-3*1589/2780) into the answer box for part 1 of the question to see what WeBWorK thinks is the value and WeBWorK said the value is 0.180009041024602. (The answer to part 1 is 0, but when I hit the Preview Button WeBWorK did the computation.) I replaced 0.18 by this value in the student's answer and WeBWorK accepted it.

6 Monday February 2

Here are some tricks for solving special equations. The real trick is to find a trick for remembering the trick.

27. Linear substitutions. To solve

dy/dx = (ax + by + c)^p

try v = ax + by + c so

dv/dx = a + b dy/dx = a + bv^p.

28. Homogeneous equations. A linear equation is called homogeneous if a scalar multiple of a solution is again a solution. A function h(x, y) is called homogeneous of degree n if

h(λx, λy) = λ^n h(x, y).

In particular, f is homogeneous of degree 0 iff f(λx, λy) = f(x, y). Then

f(x, y) = F(y/x),  F(u) := f(1, u).

To solve

dy/dx = F(y/x)

try the substitution v = y/x.


29. Bernoulli equations. This is an equation of the form

dy/dx + P(x)y = Q(x)y^n.

Try y = v^p and solve for a value of p which makes the equation simpler.

30. Exact equations. The equation

M(x, y) dx + N(x, y) dy = 0

is exact if there is a function F(x, y) such that

∂F/∂x = M,  ∂F/∂y = N.  (∗∗)

If (∗∗) holds then

∂M/∂y = ∂N/∂x  (∗)

because

∂²F/∂y∂x = ∂²F/∂x∂y.

In Math 234 you learn that the converse is true (if M and N are defined for all (x, y)). Exactness implies that the solutions to the ODE

M(x, y) + N(x, y) dy/dx = 0

are the curves

F(x, y) = c

for various values of c. To find F(x, y) satisfying (∗∗) choose (x0, y0) and integrate from (x0, y0) along any path joining (x0, y0) to (x, y). Condition (∗) guarantees that the integral is independent of the choice of the path.

Example 31. Write the ODE

3x^2 y^5 + 5x^3 y^4 dy/dx = 0

as

3x^2 y^5 dx + 5x^3 y^4 dy = 0.


The exactness condition (∗) holds as

∂/∂y (3x^2 y^5) = 15x^2 y^4 = ∂/∂x (5x^3 y^4).

Let (x0, y0) = (0, 0) and compute F(x, y) by integrating first along the y-axis (where dx = 0) from (0, 0) to (0, y) and then along the horizontal line from (0, y) to (x, y) (where dy = 0). We get

F(x, y) = ∫_{t=0}^{t=y} N(0, t) dt + ∫_{t=0}^{t=x} M(t, y) dt
        = ∫_{t=0}^{t=y} 5(0^3)t^4 dt + ∫_{t=0}^{t=x} 3t^2 y^5 dt
        = 0 + x^3 y^5 = x^3 y^5.

so the solutions of the ODE are the curves x^3 y^5 = C. Because the exactness condition holds it doesn't matter which path we use to compute F(x, y) so long as it goes from (0, 0) to (x, y). For example, integrating first along the x-axis (where dy = 0) from (0, 0) to (x, 0) and then along the vertical line from (x, 0) to (x, y) (where dx = 0) gives

F(x, y) = ∫_{t=0}^{t=y} N(x, t) dt + ∫_{t=0}^{t=x} M(t, 0) dt
        = ∫_{t=0}^{t=y} 5x^3 t^4 dt + ∫_{t=0}^{t=x} 3t^2 (0^5) dt
        = x^3 y^5 + 0 = x^3 y^5.

Along the diagonal line from (0, 0) to (x, y) we have dx = x dt and dy = y dt with t running from 0 to 1 so

F(x, y) = ∫_{t=0}^{t=1} N(tx, ty) y dt + ∫_{t=0}^{t=1} M(tx, ty) x dt
        = ∫_{t=0}^{t=1} 5t^7 x^3 y^4 y dt + ∫_{t=0}^{t=1} 3t^7 x^2 y^5 x dt
        = (5/8)x^3 y^5 + (3/8)x^3 y^5 = x^3 y^5.
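The exactness condition and the formula F(x, y) = x^3 y^5 can also be checked symbolically. A sketch (assuming Python with sympy is available):

    from sympy import symbols, diff, integrate, simplify

    x, y, t = symbols('x y t')
    M = 3*x**2*y**5
    N = 5*x**3*y**4

    print(simplify(diff(M, y) - diff(N, x)))     # 0: the exactness condition (*) holds

    # F along the path of the example: up the y-axis from (0,0) to (0,y), then across to (x,y).
    F = integrate(N.subs({x: 0, y: t}), (t, 0, y)) + integrate(M.subs(x, t), (t, 0, x))
    print(simplify(F))                            # x**3*y**5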

7 Wednesday February 4

32. Reducible second order equations. A second order ODE where either the unknown x or the independent variable t is missing can be reduced to a first order equation. If x is missing the equation is already first order in dx/dt. The case where both t and dx/dt are missing is like a conservative force field in physics, i.e. a force field which is the negative gradient of a potential energy function U, so Newton's second law takes the form

m d^2x/dt^2 = −∇U.

In this case the energy

E := m|v|^2/2 + U,  v := dx/dt,

is conserved (constant along solutions). When the number of dimensions is one (but not in higher dimensions) every force field is a gradient and we can use this fact to reduce the order. To solve

d^2x/dt^2 = f(x)

take U = −∫ f(x) dx and v = dx/dt so the equation becomes

(1/2)(dx/dt)^2 + U(x) = E

which can be solved by separation of variables.

Example 33. Consider the equation

m d^2x/dt^2 = −kx.

Define the velocity v and the total energy E by

v := dx/dt,  E := mv^2/2 + kx^2/2.

(The total energy is the sum of the kinetic energy mv^2/2 and the potential energy U(x) := kx^2/2.) Now

dE/dt = mv dv/dt + kx dx/dt = (m dv/dt + kx)v = 0,


so the total energy E is constant along solutions. Then

dx/dt = v = ±√((2E − kx^2)/m).

We solve the initial value problem v(0) = 0 and x(0) = x0. Then 2E = kx0^2 so

dx/dt = µ√(x0^2 − x^2),  µ := ±√(k/m),

so

dx/√(x0^2 − x^2) = µ dt

so

−cos^{−1}(x/x0) = ∫ dx/√(x0^2 − x^2) = ∫ µ dt = µt + C.

When t = 0, we have x = x0 so x/x0 = 1 so (since cos(0) = 1) C = 0 and hence x = x0 cos(µt).
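The answer x = x0 cos(µt) and the conservation of energy can be checked numerically. A sketch (assuming Python with numpy and scipy are available; the values m = 2, k = 8, x0 = 1.5 are made up for illustration, giving µ = 2):

    import numpy as np
    from scipy.integrate import solve_ivp

    m, k, x0 = 2.0, 8.0, 1.5
    mu = np.sqrt(k/m)

    def rhs(t, u):                     # u = (x, v), m x'' = -k x
        x, v = u
        return [v, -k*x/m]

    sol = solve_ivp(rhs, (0.0, 10.0), [x0, 0.0], dense_output=True,
                    rtol=1e-10, atol=1e-10)
    t = np.linspace(0.0, 10.0, 11)
    print(np.max(np.abs(sol.sol(t)[0] - x0*np.cos(mu*t))))   # matches x = x0 cos(mu t)

    x_num, v_num = sol.sol(t)
    E = m*v_num**2/2 + k*x_num**2/2
    print(np.max(np.abs(E - E[0])))                          # energy is (numerically) constant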

Remark 34. On page 70 of the text, the problem is treated a little differently. The unknown x is viewed as the independent variable and the substitution

v = dx/dt,  d^2x/dt^2 = dv/dt = (dv/dx)·(dx/dt) = (dv/dx)·v

is used to transform the equation

m d^2x/dt^2 + kx = 0

into the equation

m (dv/dx)v + kx = 0.

Solving this by separation of variables gives

m ∫ v dv + k ∫ x dx = 0

which is the conservation law (1/2)mv^2 + (1/2)kx^2 = E from before. The book uses the letters x, y, p where I have used t, x, v. (I deviated from the book's notation to emphasize the connection with physics.)


8 Monday February 9

35. Peak Oil. In 1957 a geologist named M. K. Hubbert plotted the annual percentage rate of increase in US oil production against total cumulative US oil production and discovered that the data points fell (more or less) on a straight line. Specifically

(dQ/dt)/(aQ) + Q/b = 1

where Q = Q(t) is the total amount of oil (in billions of barrels) produced by year t, a = 0.055, b = 220, with the initial condition Q(1958) = 60.² The ODE for Q can be written as

dQ/dt = aQ − kQ^2,  k = a/b.

This equation is called the Logistic Equation. (We solved a similar equation dy/dt = (y − a)(y − b) above.) By solving this equation Hubbert predicted that annual US oil production would peak (i.e. the production rate dQ/dt would begin to decrease) in the year 1975. The peak actually occurred in 1970 but this went unnoticed because by this time the US had begun to import much of its oil. A similar calculation for world oil production produced a prediction of a peak in the year 2005.
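Hubbert's prediction can be reproduced by integrating the logistic equation with the figures quoted above. A sketch (assuming Python with numpy and scipy are available):

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, Q0 = 0.055, 220.0, 60.0      # Hubbert's figures as quoted above
    k = a / b

    sol = solve_ivp(lambda t, Q: [a*Q[0] - k*Q[0]**2], (1958.0, 2050.0), [Q0],
                    dense_output=True, rtol=1e-10, atol=1e-10)
    t = np.linspace(1958.0, 2050.0, 2000)
    Q = sol.sol(t)[0]
    rate = a*Q - k*Q**2                # dQ/dt along the solution
    print(t[np.argmax(rate)])          # the production rate peaks around 1975 (where Q = b/2)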

36. First order autonomous quadratic equations. Consider the equation

dx/dt = Ax^2 + Bx + C.

The right hand side will have either two zeros, one (double) zero, or no (real) zeros depending on whether B^2 − 4AC is positive, zero, or negative. If there are two zeros, say

p := (−B + √(B^2 − 4AC))/(2A),  q := (−B − √(B^2 − 4AC))/(2A),

then the equation may be written as

dx/dt = A(x − p)(x − q)

²I got these figures from page 155 (see also page 201) of the very entertaining book: Kenneth S. Deffeyes, Hubbert's Peak, Princeton University Press, 2001. I estimated the initial condition from the graph, so it may not be exactly right.

and the limiting behavior can be determined from the phase diagram as we did last week. If there are no zeros, all solutions reach ±∞ in finite time. After completing the square and rescaling x and t the equation has one of the following three forms:

Example 37. Example with no zeros. We solve the ODE

dx/dt = 1 + x^2,  x(0) = x0.

We separate variables and integrate:

tan^{−1}(x) = ∫ dx/(1 + x^2) = ∫ dt = t + c,  c := tan^{−1}(x0),

so for −π/2 < t + c < π/2 we have

x = tan(t + c) = (tan t + tan c)/(1 − tan t tan c) = (tan t + x0)/(1 − x0 tan t).

The solution becomes infinite when t = tan^{−1}(1/x0).

Example 38. Example with two zeros. We solve the ODE

dx/dt = 1 − x^2,  x(0) = x0.

We separate variables and integrate:

tanh^{−1}(x) = ∫ dx/(1 − x^2) = ∫ dt = t + c,  c := tanh^{−1}(x0),

so (when −1 < x0 < 1) we have for all t

x = tanh(t + c) = (tanh t + tanh c)/(1 + tanh t tanh c) = (tanh t + x0)/(1 + x0 tanh t).

For −1 < x0 < 1 we have lim_{t→∞} x = 1 and lim_{t→−∞} x = −1.

Example 39. Example with a double zero. The solution of the ODE

dx/dt = x^2,  x(0) = x0

is x = x0/(1 − x0 t). If x0 ≠ 0 it becomes infinite when t = 1/x0.


40. After a change of variables, every quadratic ODE

dy/ds = Ay^2 + By + C

takes one of these three forms. Divide by A and complete the square:

(1/A) dy/ds = (y + B/(2A))^2 − (B^2 − 4AC)/(4A^2).

Let u = y + (B/2A) and k^2 = |(B^2 − 4AC)/(4A^2)|:

(1/A) du/ds = u^2 ± k^2.

Finally (if k ≠ 0) let x = u/k and t = Aks; then dx/dt = x^2 ± 1, which (after reversing the direction of time if necessary) is one of the three forms dx/dt = 1 + x^2, dx/dt = 1 − x^2, dx/dt = x^2 above. (If k = 0, take x = u and t = As to arrive at dx/dt = x^2.)

41. Trig functions and hyperbolic functions.

sin(t) = (e^{it} − e^{−it})/(2i)                        sinh(t) = (e^t − e^{−t})/2
cos(t) = (e^{it} + e^{−it})/2                           cosh(t) = (e^t + e^{−t})/2
i tan(t) = (e^{it} − e^{−it})/(e^{it} + e^{−it})        tanh(t) = (e^t − e^{−t})/(e^t + e^{−t})
cos^2(t) + sin^2(t) = 1                                 cosh^2(t) − sinh^2(t) = 1
d sin(t) = cos(t) dt                                    d sinh(t) = cosh(t) dt
d cos(t) = −sin(t) dt                                   d cosh(t) = sinh(t) dt
tan(t + s) = (tan t + tan s)/(1 − tan t tan s)          tanh(t + s) = (tanh t + tanh s)/(1 + tanh t tanh s)

42. Bifurcation and dependence on parameters. The differential equation

dx/dt = x(4 − x) − h

models a logistic population equation with harvesting rate h. The equilibrium points are

H = 2 − √(4 − h),  N = 2 + √(4 − h)

if h < 4. There is a bifurcation at h = 4. This means that the qualitative behavior of the system changes as h increases past 4. When h = 4 there is a double root (H = N) and for h > 4 there is no real root and all solutions reach −∞ in finite time.


43. Air resistance proportional to v. The equation of motion is F = ma where the force is F = FG + FR with

a = dv/dt,  v = dy/dt,  FG = −mg,  FR = −kv.

This can be solved by separation of variables and there is a terminal velocity

v∞ := lim_{t→∞} v = −mg/k

which is independent of the initial velocity.

44. Air resistance proportional to v^2. The equation of motion is F = ma where the force is F = FG + FR with

a = dv/dt,  v = dy/dt,  FG = −mg,  FR = −kv|v|.

Thus

m dv/dt = −mg − kv^2  when v > 0,
m dv/dt = −mg + kv^2  when v < 0.

After rescaling (i.e. a change of variable) we can suppose m = g = k = 1 and use the above. To find the height y we need to choose the constants of integration correctly.

45. Escape velocity. A spaceship of mass m is attracted to a planet of mass M by a gravitational force of magnitude GMm/r^2 so that (after cancelling m) the equation of motion (if gravity is the only force acting on the spaceship) is

dv/dt = d^2r/dt^2 = −GM/r^2

where r is the distance of the spaceship to the center of the planet and v = dr/dt is the velocity of the spaceship. As above, the energy

E := mv^2/2 − GMm/r

is a constant of the motion so if r(0) = r0 and v(0) = v0 we have (after dividing by m/2)

v^2 − 2GM/r = v0^2 − 2GM/r0

from which follows

v^2 > v0^2 − 2GM/r0.

The quantity √(2GM/r0) is called the escape velocity. If v0 is greater than the escape velocity then

r(t) − r0 = ∫_0^t (dr/dt) dt = ∫_0^t v dt > ∫_0^t √(v0^2 − 2GM/r0) dt = t √(v0^2 − 2GM/r0)

so r(t) → ∞ as t → ∞, i.e. the spaceship escapes.

9 Wednesday February 11

46. Monthly Investing. Mary starts a savings account. She plans to invest 100 + t dollars t months after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. Then S(0) = 100 and S(t + 1) = S(t) + interest + deposit, i.e.

S(t + 1) = S(t) + (0.06/12)S(t) + (100 + t).

This equation can be written in the form

S(t + h) = S(t) + f(t, S(t))h

where h = 1 and f(t, S) = (0.06/12)S + (100 + t). It can also be written

∆S = f(t, S(t))∆t

where ∆S = S(t + h) − S(t) and ∆t = h = 1.

47. Daily Investing. Donald starts a savings account. He plans to invest daily at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. This is n = 30t days. One day is h months where h = 1/30. Then S(0) = 100 and S(t + h) = S(t) + one day's interest + one day's deposit, i.e.

S(t + h) = S(t) + (0.06/12)S(t)h + (100 + t)h,  h = 1/30,  t = nh.

This equation can be written in the form

S(t + h) = S(t) + f(t, S(t))h

where h = 1/30 and f(t, S) = (0.06/12)S + (100 + t). It can also be written

∆S = f(t, S(t))∆t

where ∆S = S(t + h) − S(t) and ∆t = h = 1/30.

48. Hourly Investing. Harold starts a savings account. He plans to invest hourly at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. This is n = 720t hours. One hour is h months where h = 1/720. Then S(0) = 100 and S(t + h) = S(t) + one hour's interest + one hour's deposit, i.e.

S(t + h) = S(t) + (0.06/12)S(t)h + (100 + t)h,  h = 1/720,  t = nh.

This equation can be written in the form

S(t + h) = S(t) + f(t, S(t))h

where h = 1/720 and f(t, S) = (0.06/12)S + (100 + t). It can also be written

∆S = f(t, S(t))∆t

where ∆S = S(t + h) − S(t) and ∆t = h = 1/720.

49. Continuous Investing. Cynthia starts a savings account. She plans to invest continuously at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. Then S(0) = 100 and the change dS in the account in an infinitesimal time interval of size dt at time t is

dS = (0.06/12)S(t) dt + (100 + t) dt.

This equation can be written in the form

dS = f(t, S(t)) dt

where f(t, S) = (0.06/12)S + (100 + t).


Remark 50. Mary is getting an annual interest rate of 6% compounded monthly. Donald is getting an annual interest rate of 6% compounded daily and is investing a little more each month than is Mary. Harold is getting an annual interest rate of 6% compounded hourly and is investing a little more each month than is Donald. Cynthia is getting an annual interest rate of 6% compounded continuously and is investing a little more each month than is Harold. The point is that all the answers are about the same. Here's why:

Theorem 51 (The Error in Euler's Method). Assume that f(t, y) is continuously differentiable. Let y = y(t) be the solution to the initial value problem

dy/dt = f(t, y),  y(0) = y0

and yn be the solution to the difference equation

y_{n+1} = yn + f(nh, yn)h.

Then there is a constant C = C(f, T) (dependent on T and f but independent of h) such that

|y(t) − yn| ≤ Ch

for t = nh and 0 ≤ t ≤ T.

Remark 52. This theorem is stated on page 122 of the text. When I get a chance, I will put a formula for C in these notes and provide a proof. (Only motivated students should try to learn the proof.)
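Even without a formula for C, the O(h) behavior in Theorem 51 can be seen numerically on the investing example above: take dS/dt = 0.005 S + (100 + t), S(0) = 100 (Cynthia's account), whose exact solution is S(t) = 60100 e^{0.005t} − 200t − 60000 (a closed form obtained by the superposition/integrating factor methods of paragraph 18; this particular computation is not carried out in the notes). A sketch (assuming Python with numpy is available):

    import numpy as np

    # Euler's method for dS/dt = 0.005 S + (100 + t), S(0) = 100,
    # compared with the exact solution S(t) = 60100 e^(0.005 t) - 200 t - 60000.
    r, T = 0.06/12, 12.0

    def euler(h):
        n = int(round(T/h))
        S = 100.0
        for i in range(n):
            t = i*h
            S += (r*S + (100 + t)) * h
        return S

    exact = 60100*np.exp(r*T) - 200*T - 60000
    for h in [1.0, 1/30, 1/720]:          # monthly, daily, hourly steps
        print(h, abs(euler(h) - exact))   # the error shrinks roughly in proportion to h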
