Notes on Calculus Tom Read September, 2014
Notes on Calculus

Tom Read

September, 2014


Abstract

“Question everything.”

Meg Ryan

as mathematician Catherine Boyd in the movie “I.Q.”


CONTENTS

1 Functions - A Framework for Problem Solving
    1.1 Introduction
    1.2 Functions and Their Representations
        1.2.1 Ordered Pairs
        1.2.2 Rules
        1.2.3 Machines
        1.2.4 Representations of Functions
        1.2.5 Function Notation
        1.2.6 The Domain of a Function
    1.3 Linear Functions
    1.4 Exponential Functions
        1.4.1 Terminology
        1.4.2 Confession
        1.4.3 Compound Interest, Inflation Rates, etc.
    1.5 Trigonometric Functions
        1.5.1 Definitions
        1.5.2 Elementary Properties
        1.5.3 Addition and Subtraction Formulas
    1.6 Operations on Functions
        1.6.1 Transforming the Domain of a Function
        1.6.2 What is an Asymptote?
        1.6.3 Composition
    1.7 Graphing and Machinery; Scales
    1.8 Inverse Functions, Logarithms, Roots
        1.8.1 Inverse Functions
        1.8.2 Units
        1.8.3 Logarithms
        1.8.4 Using Logarithms
        1.8.5 How Many Exponential Functions Are There?
        1.8.6 Accuracy, Exactness, and Calculator Technique
        1.8.7 How Many Logarithm Functions Are There?
    1.9 Comparing Families of Functions
        1.9.1 Little Oh Notation
        1.9.2 Comparing Power Functions and Exponentials
        1.9.3 Comparing Power Functions and Logarithms
        1.9.4 Behavior as x → 0+
        1.9.5 Section Summary
    1.10 Parametrized Curves

2 Rates of Change
    2.1 Introduction
    2.2 The Derivative of a Function at a Point
        2.2.1 The Graphical Interpretation - Secant Lines and Tangent Lines
        2.2.2 Interpreting the Derivative
        2.2.3 Numerical Approximation of the Derivative
    2.3 Average and Instantaneous Rates of Change, Limits
    2.4 Local Linearity
        2.4.1 What is the Point of Local Linearity?
    2.5 Two Fine Points
        2.5.1 Is Every Function Differentiable?
        2.5.2 Why Does a Have to Be an Endpoint?
    2.6 Continuity
        2.6.1 Plugging Holes in Graphs
    2.7 Mean Value Theorem
        2.7.1 The Catch

3 Computing Derivatives
    3.1 Introduction
    3.2 Derivatives of Trigonometric Functions
    3.3 Derivatives of Exponential Functions
    3.4 The Product and Chain Rules
        3.4.1 The Product Rule
        3.4.2 The Chain Rule
    3.5 First Applications of the Chain Rule
        3.5.1 Derivative of b^x
        3.5.2 Differentiating Inverse Functions
    3.6 Summary of the Rules of Differentiation
    3.7 Velocity Along a Parametrized Curve
        3.7.1 Measuring Change of Position
        3.7.2 Velocity in the Plane
        3.7.3 Acceleration in the Plane
    3.8 Components of Vectors - The Dot Product
    3.9 Curvature
    3.10 Components of Acceleration

4 Applications of Differentiation
    4.1 Introduction
    4.2 Simple Applications of the Chain Rule
        4.2.1 Implicit Differentiation
        4.2.2 Related Rates
    4.3 Graphing and Calculus
    4.4 Optimization
    4.5 Families of Functions
    4.6 Newton’s Method

5 Integration
    5.1 Introduction
        5.1.1 A General Formula for the Sums in Example 1
        5.1.2 Review: Sigma Notation for Sums
    5.2 The Definite Integral
        5.2.1 Definition of the Definite Integral
        5.2.2 Which Functions Are Riemann Integrable?
    5.3 Calculating Integrals With Sums
    5.4 Numerical Integration
        5.4.1 Midpoint and Trapezoidal Rules
        5.4.2 Simpson’s Rule
        5.4.3 Errors of the Rules
        5.4.4 More on Errors of the Rules - A Partial Proof
    5.5 Definite Integrals and Areas
        5.5.1 What is area?
    5.6 Using Areas to Evaluate Integrals
    5.7 Average Value of a Function
    5.8 The Fundamental Theorem of Calculus
    5.9 The Area Function as Antiderivative
        5.9.1 Composition Involving the Area Function
    5.10 Finding Antiderivatives Pictorially with Slope Fields
        5.10.1 Drawing Slope Fields with the TI-89
    5.11 Finding Antiderivatives Symbolically
        5.11.1 Substitution
        5.11.2 Substitution with Definite Integrals
        5.11.3 An Area Interpretation of Substitution
        5.11.4 Integration by Parts
        5.11.5 A Short Table of Antiderivatives
    5.12 Improper Integrals
        5.12.1 The Basic Idea
        5.12.2 Comparison
        5.12.3 Another Kind of Improper Integral

6 Modeling with Integrals
    6.1 Finding Areas and Volumes
    6.2 Finding Physical Quantities
    6.3 Finding the Total Change
    6.4 Distribution Functions and Probability
        6.4.1 Averages and Spread
        6.4.2 Probability

7 Differential Equations
    7.1 Introduction
        7.1.1 Classifying Differential Equations
    7.2 Modeling with Differential Equations
    7.3 Slope Fields, Equilibrium Solutions, Autonomous Equations
        7.3.1 Slope Fields
        7.3.2 Equilibrium Solutions
        7.3.3 Autonomous Equations
    7.4 Using the TI-89 Plus to Graph Solutions and Slope Fields
        7.4.1 Basic Setup
        7.4.2 Working With an Equation
        7.4.3 Initial Conditions
    7.5 Existence & Uniqueness of Solutions and What It Tells Us
        7.5.1 Existence and Uniqueness
        7.5.2 What It Tells Us
        7.5.3 Phase Lines for Autonomous Equations
        7.5.4 Have I Found All Solutions?
    7.6 Separable Equations; Symbolic Solutions
    7.7 More Modeling
        7.7.1 Motion
        7.7.2 Mixing
    7.8 Systems of Differential Equations
        7.8.1 Direction Fields
        7.8.2 Direction Fields Using the TI-89
        7.8.3 Second Order Equations
        7.8.4 An Epidemic Model

Index

1. FUNCTIONS - A FRAMEWORK FOR PROBLEM SOLVING

1.1 Introduction

Successful problem solving depends to a surprising degree on accumulated experience. For problems involving calculus, one important facet of the relevant experience is a knowledge of functions. This includes both general experience with the representations of functions (symbolic, graphical, and on occasion numerical) and with translating between different representations, and more specific knowledge of the behavior of particular functions, as classified into families according to various kinds of properties.

Background knowledge is useful in many ways. It is needed to classify a problem, to suggest ways (preferably more than one) by which it might be solved, and to bring to mind the possible outcomes. As you work on a problem, your expectation of what is possible is your single best protection against error. If you are coming up with a solution that you believe to be impossible, this is a very strong message to review what you have done in order to find the point at which you have gone off the track. Failure to pay attention to such clues is one of the most common forms of “preventable error.” Every once in a while the outcome is not one we realized was possible; such events, perhaps a bit embarrassing at the time, are one important way in which we refine and improve our expertise.

This chapter summarizes some essential information about operations on functions and about the families of functions that arise most often in calculus. Despite the fact that the specific functions discussed will all be familiar, some of the material is likely to be new. The intention of the chapter is to concentrate on aspects of this material that may be less familiar; it should not be dismissed as “just review.”

1.2 Functions and Their Representations

The function concept carries a lot of baggage, much of which is often not made explicit. The many different ways of looking at or thinking about functions, each with its own advantages and disadvantages, each particularly useful in some contexts but not in others, may be one reason why working with functions can at times be confusing.

1.2.1 Ordered Pairs

Characteristically, in making a formal definition of a function, mathematicians have opted to eliminate both ambiguity and interpretation. The usual formal definition of a function from the real numbers to the real numbers is this:

    A function f is a set of ordered pairs of real numbers, no two of which have the same first component.

Or more informally,

    A function f is a pairing of the values of one variable quantity with the values of another variable quantity in such a way that each value of the first variable is paired with exactly one value of the second variable.

Now the set of ordered pairs (x, f(x)) where x is in the domain of f (the set of all numbers we apply f to, or the set of all first components of the ordered pairs) is what we are used to calling the graph of f, so this definition amounts to defining a function to be its graph.

The strength of this definition is that it is precise: in using it to decide whether an object is a function or not, there should be no need for “judgment calls”; it should be clear whether the object is a function or not.

The weakness of this definition is that it has abandoned nearly all the ways in which we normally think about functions. In fact this is deliberate, since none of the interpretations is really wide enough to include all objects that most mathematicians would now like to consider to be functions. But this does mean that we do not often use this definition in “everyday mathematical life.”
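The ordered-pair definition is mechanical enough to check by program. The following is a minimal sketch, not part of the original notes: the helper name `is_function` and the two sample sets are illustrative assumptions, and only finite sets of pairs are handled.

```python
def is_function(pairs):
    """Return True if no two distinct pairs share a first component."""
    seen = {}  # maps each first component to the second component seen with it
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # x is paired with two different values
        seen[x] = y
    return True


# Hypothetical sample data: a few points of y = x^2, and four points of a circle.
squares = {(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)}
circle_points = {(0, 1), (0, -1), (1, 0), (-1, 0)}

print(is_function(squares))        # True
print(is_function(circle_points))  # False: 0 is paired with both 1 and -1
```

Note that two pairs may share a second component (as (-1, 1) and (1, 1) do) without any conflict with the definition; only repeated first components matter.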

1.2.2 Rules

A definition more commonly met in calculus books is something like this:

    A function f is a rule that associates with each number x in its domain a well-defined number f(x).

So we still have x and f(x), but what is a rule? Normally we think of a rule as some regular, organized procedure for producing f(x) from x. In fact, normally we think of the rule as being expressed by some formula, such as $f(x) = x^3 - 4x + 27$. Sometimes the rule may be much more complicated in a somewhat hidden way. (Try to explain the rule $f(x) = \sin x$; that is, if I give you x and don’t allow you to just punch a button on your calculator, what construction do you have to go through to come up with $\sin x$? Can you carry it out for x = 0.32?) But at the very least, we generally think of a rule as something other than just a list of all the ordered pairs (x, f(x)).

What if the connection between x and f(x) is less regular? I have just rolled a die 10 times with the following results:

    No. of roll   1   2   3   4   5   6   7   8   9   10
    No. on die    1   5   6   6   3   2   6   3   5   5

Is this table of results a function? Can you find a rule here? Despite the absence of any easily explainable rule, mathematicians would say that this is a function, because it is a list of ordered pairs. (They are (1, 1), (2, 5), (3, 6), ....) You may be tempted to protest that it is a very artificial function, and quite unlike any “real” function. Yes, it is artificial. Its only virtue is that it was easy for me to make. But it is just an oversimplified version of a very familiar and important situation.

Whenever we collect data, by doing an experiment, or studying the weather, or watching the stock market for a period, the result is something very like this simple function. Thus most “real” functions in science or social science are of this type. “Real” here means directly observed from some outside source.

Of course the point of calculus is to model reality by creating and studying functions of the more familiar “formula” variety which we believe represent, at least approximately, the behavior of whatever phenomenon we are studying. That is, we create and study an idealized version of reality in order to help us better understand the world that we observe. The test for whether our model is a good one is whether the model functions match the observed functions closely enough to be useful. Usually, at least in the sciences, we mean by this that the model functions have predictive value: not only do they match reasonably well all previously known values of the observed functions, but also if we gather some new data this too will agree with the corresponding values of the model function. (Or, if we wish to create a physical system that will behave in some particular way, say a rocket that will put a satellite into a prescribed orbit, we can use the model to tell us how to construct the system.)
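As a small illustration of a “data function,” the die-roll table can be stored directly as a lookup table. This is only a sketch: the variable names are illustrative assumptions, and a Python dictionary is used because it allows each key (roll number) to carry exactly one value, which matches the ordered-pair definition.

```python
# Die-roll results from the table: roll number -> number shown on the die.
rolls = {1: 1, 2: 5, 3: 6, 4: 6, 5: 3, 6: 2, 7: 6, 8: 3, 9: 5, 10: 5}

pairs = sorted(rolls.items())              # the ordered pairs (x, f(x))
domain = sorted(rolls)                     # the first components
value_range = sorted(set(rolls.values()))  # the distinct second components

print(pairs[:3])     # [(1, 1), (2, 5), (3, 6)]
print(domain)        # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(value_range)   # [1, 2, 3, 5, 6]
print(rolls[4])      # 6, the value of the function at x = 4
```

There is no formula here, only a list of pairs, yet every question one can ask of a function (its domain, its range, its value at a point) has a definite answer.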

1.2.3 Machines

Another common description of a function is as a kind of machine. You feed a number x in one side, it operates on the number somehow, and spits out a number f(x) on the other side. This description is a little different from the rule description, partly because there is less of an implication that there is a simple formula behind what is going on, but more importantly because it conveys the idea of the function as active: it changes numbers into other numbers. When we want to think of a function this way, we often call it a mapping. That is, “mapping” is a synonym for “function,” but it is a word that we use when we want to think of an active or dynamic process. We will see that there are situations where this is a very useful device, for instance when thinking of inverses or of the composition of two functions.

1.2.4 Representations of Functions

We can draw diagrams that represent functions in several ways. The most familiar is the graph of a function, which is just a diagram of some or all of the points (x, f(x)) for x in the domain of f. For a “formula function” you expect to see a curve, or perhaps several pieces of curve if f has discontinuities. For a “data function” the graph will be a set of discrete points. We will be studying graphs a lot in this course, so there’s no need to say more here.

A representation that is probably much less familiar is the mapping diagram. This is a diagrammatic version of the function as machine idea. We have two parallel copies of the real line, and we draw segments connecting values of x on the left hand line to the corresponding f(x) on the right hand line. Here are some examples for the functions $f(x) = x + 3$, $g(x) = 2x$, $h(x) = x^2$.

[Figure: three mapping diagrams, one each for f(x) = x + 3, g(x) = 2x, and h(x) = x^2]

Mapping diagrams are more often useful for thinking about general properties of functions than for studying specific functions, but we will find them helpful from time to time. For instance, note that $f(x) = x + 3$ moves each number but preserves the distance between numbers, that is, $f(b) - f(a) = b - a$ for any real numbers a and b, while $g(x) = 2x$ moves pairs of numbers to be twice as far apart: $g(b) - g(a) = 2(b - a)$ for any a and b. We can think of g as magnifying the distance between numbers by a factor of 2. The nonlinear function $x^2$ magnifies by different amounts near different points in its domain. We will return to this magnification interpretation when we start to discuss the derivative.
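The magnification idea can be checked numerically. The sketch below is an illustration only: the helper `magnification` and the sample pairs (a, b) are assumptions introduced here, not notation from the notes. It computes the ratio of the output distance f(b) − f(a) to the input distance b − a.

```python
def f(x):
    return x + 3


def g(x):
    return 2 * x


def h(x):
    return x ** 2


def magnification(func, a, b):
    """Ratio of the output distance to the input distance for the pair a, b."""
    return (func(b) - func(a)) / (b - a)


# Hypothetical sample pairs, chosen at different places on the number line.
for a, b in [(0, 1), (1, 2), (5, 7)]:
    print(a, b, magnification(f, a, b), magnification(g, a, b), magnification(h, a, b))
# f always gives 1.0 and g always gives 2.0, while h gives 1.0, 3.0 and 12.0
```

The constant factors for f and g, and the changing factor for h, are exactly the distinction the paragraph above draws between the linear functions and the nonlinear one.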

1.2.5 Function Notation

What is the difference between f and f(x)? Mathematicians regard f as the name of a function, and f(x) as the number obtained by applying that function to the number x, just as f(3) is the number obtained by applying the function to the number 3. In elementary textbooks this distinction is often ignored, and f(x) is used for the name of a function. (But then shouldn’t f(y) be regarded as a different function, since part of the name is different?)

In terms of the rule interpretation of functions, f (or g or h or ...) is the name of the rule; x (or y or t ...), if it has any real role at all, just helps to tell you how the rule is applied to a number in its domain. Thus

$$f(x) = x^2$$
$$f(y) = y^2$$
$$f(\square) = \square^2$$

are really all the same statement: that f is the squaring function.

I will try to be somewhat careful about distinguishing between f and f(x), though there are situations where it is very tempting to use f(x) to mean f, and I will sometimes allow myself to be tempted.
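Programming languages make the same distinction between a function and its value. In the sketch below (the names `f` and `f_with_other_letter` are illustrative assumptions), the function is one object and the result of applying it to 3 is another, and the letter used for the input has no effect on which function is defined.

```python
def f(x):
    return x ** 2


def f_with_other_letter(y):
    # Same rule as f; only the placeholder letter differs.
    return y ** 2


print(callable(f))  # True: f by itself is the function
print(f(3))         # 9: f(3) is a number
print(all(f(t) == f_with_other_letter(t) for t in range(-5, 6)))  # True
```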

1.2.6 The Domain of a Function

One of the slight complications of a calculus course for experienced students is that the usual conventions of calculus are not exactly the same as those of more formal mathematics. For instance, in calculus we nearly always consider the domain of a function defined by a formula to be the set of all numbers for which the formula makes sense, and sometimes even a little more. In more formal mathematics we fuss more about domains and are more careful to consider functions with different domains to be different functions even if their values agree almost everywhere. (Since the set of ordered pairs is different, this is required by the formal definition of function stated above.) For example, consider the function

$$f(x) = \frac{x^2 - 4}{x - 2}.$$

Clearly the natural domain of this function is all real numbers except x = 2. Now for all such numbers we can factor $x^2 - 4 = (x - 2)(x + 2)$ and cancel the factors of x − 2 to see

$$\frac{x^2 - 4}{x - 2} = x + 2,$$

and x + 2 is of course defined and continuous for all real numbers. In calculus this situation is generally described by saying that $\frac{x^2 - 4}{x - 2}$ has a removable discontinuity at x = 2 and is illustrated by a diagram like the one on the left, where the circle indicates that the point (2, 4) is not part of the graph.

[Figure: two graphs on x-y axes. Left panel: “Removable Singularity”. Right panel: “Removable Discontinuity”.]
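A quick numerical check makes the point about the natural domain concrete. This is a sketch under the assumption that ordinary floating-point arithmetic is accurate enough for illustration; the sample inputs are arbitrary choices near 2.

```python
def f(x):
    return (x ** 2 - 4) / (x - 2)


# Away from x = 2 the values agree with x + 2 (up to rounding error).
for x in [1.9, 1.99, 2.01, 2.1]:
    print(x, f(x), x + 2)

# At x = 2 the formula asks for 0/0, so 2 is not in the natural domain.
try:
    f(2)
except ZeroDivisionError:
    print("f(2) is undefined")
```

The values approach 4 from both sides, which is why the open circle in the graph sits at (2, 4).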

In more formal mathematics the view is that $\frac{x^2 - 4}{x - 2}$ is continuous on its entire domain. It cannot have a discontinuity at x = 2 because it is not defined there. Instead the function is said to have a removable singularity at x = 2. See for instance the Wikipedia article Classification of Discontinuities. However the very similar function in the right hand graph above

$$g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \neq 2, \\ 1, & x = 2, \end{cases}$$

is defined at x = 2, is discontinuous there, and does have a removable discontinuity since if we just change the value at x = 2 to 4, the resulting function is continuous.

In practice this kind of distinction will not bother us very often, but it is well to preserve in the back of your mind the realization that the conventions of calculus are generally those of nineteenth century (or earlier) mathematics and so not always consistent with modern usage.

EXERCISES.

1. A flight from Sea-Tac to San Francisco has to circle San Francisco for an hour before being allowed to land. If the actual flight (not counting the circling) takes two hours and the distance between airports is 900 miles, sketch a graph of the flight’s distance from Sea-Tac as a function of time from takeoff to landing.

2. If you rent a compact car from EZRent Cars for one day and drive it x miles, they will charge f(x) dollars where

$$f(x) = \begin{cases} 30, & 0 \le x \le 100, \\ 30 + 0.07(x - 100), & 100 < x. \end{cases}$$

Describe EZRent’s pricing policy in plain English. Your description should include interpretations of the numbers 30, 100 and 0.07.

3. An open box is to be made by cutting squares s inches on a side from the four corners of a rectangular piece of cardboard measuring 24 inches by 32 inches and then folding up the sides. Express the volume of the box as a function of s.

4. Sketch the graphs of the functions $f(x) = \frac{1}{x}$ and $g(x) = \frac{|x + 1|}{x - 1}$ on the same set of axes, and describe in words what is the same about the graphs, what is different, and (using the properties of the absolute value function) how the difference comes about. Use a scale on the x-axis that displays the interesting part of the graphs clearly.

5. For each part, sketch the graph of a function consistent with the statement.
(a) The cost of a house continues to increase at an increasing rate.
(b) The population is growing more slowly than it was five years ago.

6. (a) Sketch a graph of temperature as a function of time that is consistent with the following sentence. “The child’s temperature has been rising for the last 2 hours, but the rise has been slowing since we gave her the antibiotic an hour ago.” Make t = 0 the time when the sentence was spoken. What is the domain of this function?
(b) Explain what feature of the graph corresponds to “The child’s temperature has been rising for the last 2 hours” and what feature corresponds to “but the rise has been slowing since we gave her the antibiotic an hour ago.”

7. You are blowing up a small beach ball on the beach at Waikiki. The ball’s volume is 5 liters. You blow 1 liter of air into it with each breath. This takes 2 seconds. Then you need a second and a half to get ready for the next breath. Sketch a reasonable graph of the rate you are blowing air into the ball as a function of time, t, from the time you start blowing the first breath until the ball is full of air. Be sure that your vertical scale is consistent with all of the information in the problem, and with a reasonable interpretation of how you blow an object up.

8. State the domain and range of each of these functions. (Assume that you can see all of the domain.)

(a) [Graph on x-y axes, not reproduced]    (b) [Graph on x-y axes, not reproduced]

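9. The sign function, sign(x), is defined for all real numbers x by

$$\operatorname{sign}(x) = \begin{cases} 1, & x > 0, \\ 0, & x = 0, \\ -1, & x < 0. \end{cases}$$

Sketch the graph of each of these functions.

(a) $x \cdot \operatorname{sign}(x)$    (b) $\frac{1 + \operatorname{sign}(x)}{2}$    (c) $x \cdot \frac{1 + \operatorname{sign}(x)}{2}$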

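10. The floor function, $\lfloor x \rfloor$, is the greatest integer less than or equal to x for every real number x. Thus

$$\lfloor 2.5 \rfloor = 2, \quad \lfloor -2.5 \rfloor = -3, \quad \lfloor 2 \rfloor = 2.$$

Give the domain and range of each of these functions and sketch the graph. Be sure to indicate carefully the value of the function at any point of discontinuity.

(a) $\lfloor x \rfloor$    (b) $x - \lfloor x \rfloor$    (c) $\lfloor x \rfloor (x - \lfloor x \rfloor)$    (d) $\left\lfloor \frac{1}{x} \right\rfloor$    (e) $\frac{1}{\lfloor x \rfloor}$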

11. Similarly, the ceiling function, $\lceil x \rceil$, is the least integer greater than or equal to x. Thus

$$\lceil 2.5 \rceil = 3, \quad \lceil -2.5 \rceil = -2, \quad \lceil 2 \rceil = 2.$$

Give the domain and range of each of these functions and sketch the graph. Be sure to indicate carefully the value of the function at any point of discontinuity.

(a) $\lceil x \rceil$    (b) $\lceil x \rceil - x$    (c) $\lceil x \rceil - \lfloor x \rfloor$    (d) $\left\lceil \frac{1}{x} \right\rceil$

12. Make a mapping diagram for each of the following functions.

(a) $f(x) = 2x - 1$    (b) $f(x) = -2x$    (c) $f(x) = \sqrt{x}$    (d) $f(x) = \sqrt{|x|}$

13. Construct a function whose domain is all real numbers and whose range consists of the two numbers $\pi$ and $\sqrt{2}$.

14. Construct a function whose domain is all real numbers and whose range consists of all real numbers except 0.

15. Construct a non-decreasing function (that means that if x ≤ y then f(x) ≤ f(y)) whose domain is all real numbers and whose range consists of all real numbers y such that −1 ≤ y ≤ 1.

16. Construct a function whose domain is all real numbers x in the interval $-\frac{\pi}{2} \le x \le \frac{\pi}{2}$ and whose range is all real numbers. Is it possible for this function to be non-decreasing? Explain why or why not.

17. Even in the graphing calculator age, it is important to have a sense of what kind of formula might produce a given shape of graph. For each of the graph shapes below, list all of the formulas from this list which could produce a graph with that shape with appropriate choices for the constants. Constants are assumed to be nonzero, but not necessarily positive, and not necessarily different from one another. Dashed lines indicate asymptotes.

    $ax$    $ax + b$    $a\sqrt{x}$    $a\sqrt{x + b} + c$    $ax^{1/3} + b$
    $ax^2$    $ax^2 + c$    $ax(x - b)$    $a(x - b)(x - c)$    $ax^2 + bx + c$
    $ax^3$    $ax^3 + c$    $ax^3 + bx^2 + cx + d$    $a(x - a)(x - b)(x - c)$
    $\frac{a}{x}$    $\frac{a}{x + b}$    $\frac{ax}{x + b}$    $\frac{ax}{x^2 + b}$    $\frac{a + bx^2}{x^2 + c}$    $\frac{a}{(x - b)(x - c)} + d$
    $\ln(ax)$    $k\ln(ax + b)$    $ca^{kx}$    $ca^{kx} + b$    $a\cosh(bx + c) + d$
    $a\sin(kx + b) + c$    $a\cos(kx + b) + c$    $a\tan(kx + b)$    $a\sec(kx + b)$

[Figure: twenty graph shapes labeled (a) through (t), not reproduced]

1.3 Linear Functions

What is a linear function? One answer is that a function is linear if and only if its graph is a straight line. What is it that distinguishes a straight line from other graphs? Well, that it’s straight instead of curvy, of course. One way to describe straightness more precisely is to say that a curve is a straight line if it intersects every horizontal line at the same angle. (This presumes we know what it means to talk about the angle between a curve C and a line L at a point P. We mean by this the angle between L and the line tangent to C at P. It is one of the important tasks of differential calculus to describe this tangent line precisely in terms of the properties of the original curve, but for now the diagrams below give enough of an idea.)

[Figure: diagrams showing the angle θ between a line and a horizontal line]

So a straight line is defined by the property that it intersects every horizontal straight line at the same angle θ. Moreover, it is clear that this angle specifies uniquely the direction of the line. It is actually more convenient to talk about the tangent of this angle, called, as you know, the slope of the line. From the general right triangle definition, $\tan\theta = \frac{\text{opposite side}}{\text{adjacent side}}$. Since here the opposite side is vertical and the adjacent one is horizontal, we could also say here that $\tan\theta = \frac{\text{rise}}{\text{run}}$. Thus, if $(x_1, y_1)$ and $(x_2, y_2)$ are any two points on the line, then the slope of the line is

$$m = \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{y_2 - y_1}{x_2 - x_1}. \tag{1.1}$$

For instance, the slope of the line through (2, −1) and (4, 5) is

$$\frac{5 - (-1)}{4 - 2} = \frac{6}{2} = 3.$$
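Formula (1.1) translates directly into a few lines of code. The sketch below is an illustration, with the function name `slope` chosen here for convenience; it uses exact fractions so that no rounding is involved, and it takes the two points in either order, with the same order used in the numerator and the denominator.

```python
from fractions import Fraction


def slope(p, q):
    """Slope of the line through points p and q, following formula (1.1)."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    return Fraction(y2 - y1, x2 - x1)


print(slope((2, -1), (4, 5)))  # 3
print(slope((4, 5), (2, -1)))  # 3, since the two minus signs cancel
```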

Notice that we could also have taken the points in the opposite order because the minus signs cancel out: $\frac{(-1) - 5}{2 - 4} = \frac{-6}{-2} = 3$. However we do have to be careful to use the same order in the numerator and denominator.

So far this seems pretty obvious. But the point is that equation (1.1) contains almost everything you need to know about lines, as long as you can handle it well enough. For instance, let’s see how several forms of the equation of a line come immediately from it.

Suppose we know the slope, m, of a line and that it goes through the point $(x_0, y_0)$. Then the test for a point (x, y) to be on the line is that the slope of the line through $(x_0, y_0)$ and (x, y) is m. (For instance, (5, 8) is also on the line through (2, −1) and (4, 5) because $\frac{8 - (-1)}{5 - 2} = 3$, but (3, 3) is not, because $\frac{3 - (-1)}{3 - 2} = 4 \neq 3$.) Expressed as an equation, this says that (x, y) is on the line if and only if

$$m = \frac{y - y_0}{x - x_0}.$$

[Figure: a line through $(x_0, y_0)$ and (x, y) with a right triangle whose horizontal leg is $x - x_0$ and vertical leg is $y - y_0$, and the angle θ marked. Caption: Slope = $\tan\theta = \frac{y - y_0}{x - x_0}$]

Multiplying this equation through by $x - x_0$, we get

$$y - y_0 = m(x - x_0). \tag{1.2}$$

This is usually called the point-slope form of the equation of a line. Multiplying out and moving all the constants to the right hand side,

$$y = mx + (y_0 - mx_0) = mx + b \tag{1.3}$$

where b is an abbreviation for the quantity $y_0 - mx_0$. This is the other most frequently useful form of the equation of a line.

For instance, the equation of the line through (4, 5) with slope 3 is y − 5 = 3(x − 4), or y = 3x − 7. Note that for the last form of the equation it does not matter which point on the line we use; the equation of the line through (2, −1) with slope 3 is y − (−1) = 3(x − 2) which also simplifies to y = 3x − 7.
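The two forms (1.2) and (1.3) can be checked against the worked numbers. This is a minimal sketch; the helper names `intercept` and `on_line` are assumptions introduced for the illustration. The membership test uses the multiplied-out form (1.2), so it never divides by zero.

```python
def intercept(m, x0, y0):
    """The constant b = y0 - m*x0 in the form y = m*x + b of equation (1.3)."""
    return y0 - m * x0


def on_line(m, x0, y0, x, y):
    """Test (x, y) against the point-slope form (1.2): y - y0 = m (x - x0)."""
    return y - y0 == m * (x - x0)


print(intercept(3, 4, 5))       # -7, from the point (4, 5)
print(intercept(3, 2, -1))      # -7, the same line from the point (2, -1)
print(on_line(3, 2, -1, 5, 8))  # True: (5, 8) is on the line
print(on_line(3, 2, -1, 3, 3))  # False: (3, 3) is not
```

Both starting points give b = −7, which is the statement that the form y = mx + b does not depend on which point of the line is used.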

Of course you have known all this about lines for a long time. The real point of this explanation is to illustrate the general truth that useful formulas, like (1.1) or (1.2), are expressed explicitly, or at least implicitly, by diagrams like the one above.

Thus formulas have meaning; they express in symbols a concept that can also be expressed by a diagram. They are not just mysterious arrangements of symbols that must be memorized, as a child memorizes the order of the letters of the alphabet. In this course, the emphasis will be on understanding concepts, and on making connections between different ways of expressing them, rather than on rote use of memorized formulas.

It may seem like a lot more work to understand a concept expressed pictorially instead of just memorizing a few letters. In fact, it usually is more work at the beginning. But there are several big payoffs. First, memories of formulas tend to fade unless you are using them every day. But memories of diagrams and concepts, once acquired, are much easier to retain. Then the formula can be retrieved from them whenever you need it. Moreover, effective use of the formula, especially in a slightly unfamiliar setting, depends on understanding the meaning, and this generally requires that you understand the diagram. Understanding, especially pictorial understanding and the relationship between diagrams and formulas, will be a major emphasis of this course.

There are two more remarks to make about lines. First, another way to phrase the “straightness” property of a line is

A function is linear if and only if for constant differences in \(x\), there are constant differences between the corresponding \(y\).

In this form, linearity is best expressed (and best recognized) by a table of \(x\) and \(y\) values. For instance, using the same line as before, we have the table of values

\[ \begin{array}{c|rrrrr} x & 0 & 1 & 2 & 3 & 4 \\ \hline y & -7 & -4 & -1 & 2 & 5 \end{array} \]

If we change the size of the constant \(x\)-difference, it will change the size of the constant \(y\)-difference, but not the fact of constant differences. (And the ratio of the \(y\)-differences to the \(x\)-differences will still be 3!)

\[ \begin{array}{c|rrrrr} x & 1/3 & 2/3 & 1 & 4/3 & 5/3 \\ \hline y & -6 & -5 & -4 & -3 & -2 \end{array} \]


The \(x\)-differences can even be negative. The \(y\)-differences will change sign, but they will still be constant, and the ratio of the differences, the slope, will still be the same.

\[ \begin{array}{c|rrrrr} x & 5 & 3 & 1 & -1 & -3 \\ \hline y & 8 & 2 & -4 & -10 & -16 \end{array} \]

This property of constant \(y\)-differences is often a useful way to recognize that a function is linear, especially when the function is expressed as a collection of values in a table.

And finally, if we think about a linear function as a mapping of the real line to the line, then the slope \(m\) of the linear function is a “signed magnification factor” in the sense that for every pair of numbers in the domain, the corresponding pair of numbers in the range is \(|m|\) times as far apart, in the same order if \(m > 0\) but in the opposite order if \(m < 0\). Equivalently, the interval between the numbers in the domain has been “magnified” into an interval in the range that is \(|m|\) times as long. (For the tables just above for a line with slope 3 we see that any interval, say \([1, 3]\), is carried to an interval three times as long, in this case \([-4, 2]\).)

For the line \(y = -3x + 7\) the comparable table would be

\[ \begin{array}{c|rrrrr} x & 0 & 1 & 2 & 3 & 4 \\ \hline y & 7 & 4 & 1 & -2 & -5 \end{array} \]

so that \([1, 3]\) is still carried to an interval of length 6, \([-2, 4]\), but now the left end of the domain interval is carried to the right end of the range interval, and vice versa.
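The constant-differences test lends itself to a quick check on tabulated data. The sketch below is an illustration in Python rather than anything from the notes themselves; the helper name constant_difference_slope is hypothetical, and it assumes the table is given as two equally long lists of exact numbers.

```python
from fractions import Fraction


def constant_difference_slope(xs, ys):
    """Return the slope if the x-differences and the y-differences are both constant, else None."""
    dx = [b - a for a, b in zip(xs, xs[1:])]
    dy = [b - a for a, b in zip(ys, ys[1:])]
    if len(set(dx)) != 1 or len(set(dy)) != 1 or dx[0] == 0:
        return None
    # The ratio of the constant y-difference to the constant x-difference is the slope.
    return Fraction(dy[0]) / Fraction(dx[0])


thirds = [Fraction(k, 3) for k in range(1, 6)]
print(constant_difference_slope([0, 1, 2, 3, 4], [-7, -4, -1, 2, 5]))
print(constant_difference_slope(thirds, [-6, -5, -4, -3, -2]))
print(constant_difference_slope([5, 3, 1, -1, -3], [8, 2, -4, -10, -16]))
print(constant_difference_slope([0, 1, 2, 3, 4], [7, 4, 1, -2, -5]))
```

The first three tables come from the same line, so they should report the same slope even though the step sizes differ; the last table belongs to the line with slope of opposite sign.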

EXERCISES

1. Match the graphs with the equations.

(a) \(y = x - 4\)   (b) \(y = 4\)   (c) \(y = x + 4\)

(d) \(y = -x + 4\)   (e) \(y = -x - 4\)   (f) \(y = x/4\)

[Figure: six graphs, labeled (I) through (VI).]


2. Find the linear equation used to generate this table:

\[ \begin{array}{c|rrrrr} x & 6.3 & 6.4 & 6.5 & 6.6 & 6.7 \\ \hline y & 23.2 & 24.5 & 25.8 & 27.1 & 28.4 \end{array} \]

3. Find a linear function mapping \([-1, 2]\) onto \([2, 8]\) so that \(-1\) is mapped onto 2. Find another linear function that maps \([-1, 2]\) onto \([2, 8]\) so that \(-1\) is mapped onto 8. Draw mapping diagrams for both functions.

4. The density of mercury varies linearly with temperature. It is 13.6 grams/cm\(^3\) at 0°C and decreases by 0.1 grams/cm\(^3\) for each increase of 40°C in temperature. What is the density of mercury at 80°C? At 10°C? (Note that it is not necessary or even desirable to find the equation of the linear function involved.)

5. The speed of sound in air varies linearly with temperature. It is 330 meters/sec at 0°C and 355 meters/sec at 40°C. What is the speed of sound at 20°C? At 35°C?

6. EZ Rent car rental company offers cars at $40 a day plus 15 cents a mile. Its competitor ItRunz Rental has cars at $50 a day and 10 cents a mile.

(a) For each company, write a function giving the cost of renting a car for a day as a function of the distance driven.

(b) Graph both functions on the same set of axes.

(c) Describe how to use your answer to (b) to determine when EZ Rent is cheaper and when ItRunz Rental is cheaper. (And when is each cheaper?)

7. You are driving at constant speed from Bellingham to Portland, a distance of about 250 miles. About 150 miles south of Bellingham, you pass through Olympia. Sketch a graph of your distance from Olympia as a function of time. (You can have a vertical scale on your graph, but not a horizontal scale. Why not?)

8. Temperature in degrees Fahrenheit, °F, is a linear function of temperature in degrees Celsius, °C. You know that the freezing temperature of water is 0°C and 32°F, and that the boiling temperature of water is 100°C and 212°F.

(a) What is the slope of this linear function?

(b) Find a formula for the function.


(c) What temperature is the same number of degrees in the Fahrenheit and Celsius systems? Can you draw a diagram that finds this point graphically? (Hint: Graph two lines.)

9. Two non-parallel lines intersect at a point \(P\). Show that the vertical distance \(dy\) between the lines is proportional to the horizontal distance \(dx\) from \(P\) to the position where the vertical distance is measured. Relate the constant of proportionality to properties of the lines.

[Figure: two lines crossing at \(P\), with the horizontal distance \(dx\) from \(P\) and the vertical distance \(dy\) between the lines marked.]

10. Explain why the slopes of a pair of perpendicular lines are the negative reciprocals of one another. (Yes, horizontal and vertical lines are a special case.)

11. If you decide to earn a little money by selling keychains on vendors row, your weekly costs will be \(R\) dollars to rent the space and \(p\) dollars for each keychain that you sell. Write a formula for your weekly costs \(C\) as a function of the number \(N\) of keychains that you sell. Note that your formula will contain two kinds of letters: the variables \(C\) and \(N\) and the constant amounts \(R\) and \(p\). Such constants, which may have different values in different circumstances, are often called parameters.

12. When you buy a latte on vendors row, its temperature starts to fall immediately. Newton’s Law of Cooling predicts that the rate \(R\) of cooling is proportional to the temperature difference between the coffee and the surrounding air. (Thus, fastest when the coffee is hot and slower as the coffee cools.) Suppose that the temperature in Red Square is 15°C.

(a) Write a formula giving \(R\) as a function of the temperature \(T\) of the coffee. (Of course \(T\) also depends on the time \(t\) somehow, but don’t consider that. Think of \(T\) as the independent variable. Your formula will also involve a constant (a parameter) whose value you do not know. If you think of \(R\) as negative because the temperature of the coffee is falling, what is the sign of this constant?)

(b) Sketch a qualitative graph of \(R\) against \(T\). Think of \(R\) as negative as in part (a).


1.4 Exponential Functions

What is an exponential function, and how can we recognise one? There are several ways to characterize exponential functions. When we study them using calculus, the fundamental property will be that the instantaneous rate of change of an exponential function is proportional to its size. It is because so many natural systems behave approximately in this way that exponential functions are so important. For now, an alternate characterization, which is still often useful even after studying calculus, is based on a table of values, and is parallel to the description of linear functions:

A function is exponential if and only if for constant differences in \(x\), there are constant ratios between the corresponding \(y\).

For instance, the following table represents an exponential function.

\[ \begin{array}{c|rrrrr} x & 0 & 1 & 2 & 3 & 4 \\ \hline y & 3 & 6 & 12 & 24 & 48 \end{array} \]

Here the constant ratio between successive \(y\)-values is \(\frac{6}{3} = \frac{12}{6} = \frac{24}{12} = \frac{48}{24} = 2\).

The function this table represents is \(y = 3 \cdot 2^x\). In a similar way to linear functions, if we alter the constant difference between the \(x\) values, we will alter the value of the constant ratio, but not the fact that there is a constant ratio. For instance, each of the following tables still represents the function \(y = 3 \cdot 2^x\).

\[ \begin{array}{c|rrrrr} x & 1 & 3/2 & 2 & 5/2 & 3 \\ \hline y & 6 & 6\sqrt{2} & 12 & 12\sqrt{2} & 24 \end{array} \]

\[ \begin{array}{c|rrrrr} x & 5 & 3 & 1 & -1 & -3 \\ \hline y & 96 & 24 & 6 & 3/2 & 3/8 \end{array} \]

In the first case the common \(x\)-difference is \(\frac{1}{2}\) and the common \(y\)-ratio is \(\sqrt{2}\) \((= 2^{1/2})\); in the second case, the common \(x\)-difference is \(-2\) and the common \(y\)-ratio is \(\frac{1}{4}\) \((= 2^{-2})\).

This property can easily be related to the usual formula \(y = ca^x\) for an exponential function. If we calculate a table of values for \(x = 0, 1, 2, \ldots\) we see

\[ \begin{array}{c|ccccc} x & 0 & 1 & 2 & 3 & 4 \\ \hline y & ca^0 & ca^1 & ca^2 & ca^3 & ca^4 \end{array} \]

Thus the number \(c\) represents the value of the exponential function at \(x = 0\). More importantly, the number \(a\) represents the common \(y\)-ratio for \(x\)-differences of 1. For other \(x\)-differences, the common \(y\)-ratio will be a corresponding power of \(a\). As the second and third tables above show for \(a = 2\), an \(x\)-difference of \(\frac{1}{2}\) gives a common \(y\)-ratio of \(a^{1/2}\); an \(x\)-difference of \(-2\) gives a common \(y\)-ratio of \(a^{-2}\), and so forth.
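The constant-ratio test can be tried on the three tables for \(y = 3 \cdot 2^x\). This is a rough Python sketch, not something the notes provide; the name common_ratio is hypothetical, and because the entries involve \(\sqrt{2}\) it assumes floating-point values compared with a small tolerance.

```python
import math


def common_ratio(xs, ys, rel_tol=1e-9):
    """Return the common y-ratio if xs are equally spaced and successive ys have a constant ratio, else None."""
    dx = [b - a for a, b in zip(xs, xs[1:])]
    if not all(math.isclose(d, dx[0], rel_tol=rel_tol) for d in dx):
        return None
    ratios = [b / a for a, b in zip(ys, ys[1:])]
    if not all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios):
        return None
    return ratios[0]


def f(x):
    """The function y = 3 * 2**x used in the tables."""
    return 3 * 2 ** x


for xs in ([0, 1, 2, 3, 4], [1, 1.5, 2, 2.5, 3], [5, 3, 1, -1, -3]):
    ys = [f(x) for x in xs]
    step = xs[1] - xs[0]
    # The common ratio should equal 2 raised to the common x-difference.
    print(step, common_ratio(xs, ys), 2 ** step)
```

A power function such as \(x^2\) tabulated at equally spaced points should make the helper return None, since its successive ratios are not constant.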


Use of the ratio characterization can make the setting up and solving of simple problems involving exponential functions much easier than you are probably used to in your previous work with exponentials. The basic idea is to identify a natural common ratio from the given information and to build the exponential function accordingly. Here are some examples.

Example 1. The population of a bacteria colony is 10,000 now, and will double every 3 days. What will the population be 5 days from now?

Solution. The natural common ratio for this problem is 2, because we are told about doubling the population. Since the population now (\(t = 0\)) is 10,000, the population at any time is of the form

\[ P = 10{,}000 \cdot 2^{\text{something}}. \]

We need to choose the “something” (some function of \(t\)), so that the population will double every 3 days. To put it another way, we need a “something” function that is 0 at \(t = 0\) and increases by 1 every time \(t\) (the number of days) increases by 3. This is clearly a linear function (constant \(y\)-differences for constant \(t\)-differences!) with slope (= rise/run, remember) 1/3. In fact it is easily seen to be \(t/3\), that is, the population at time \(t\) is

\[ P = 10{,}000 \cdot 2^{t/3}. \]

The population after 5 days can now be found from a calculator:

\[ P(5) = 10{,}000 \cdot 2^{5/3} \approx 31{,}748. \]

Note that this is a reasonable answer since in 5 days the population will have more than doubled once (after 3 days), but will not yet have doubled twice, so the answer must be between 20,000 and 40,000.

Example 2. The half-life of strontium-90 (which was in the fallout from nuclear tests in the 1950’s and 1960’s) is 25 years. (That means, after 25 years half of the amount of the element present initially will have decayed.) How much of a sample of 3 grams of strontium-90 from 1950 would be left today?

Solution. Here the natural ratio is 1/2; the amount left will have the form \(A = 3 \cdot \left(\frac{1}{2}\right)^{\text{something}}\), where this time the “something” function should increase by 1 when \(t\) increases by 25. So the amount left \(t\) years after 1950 is

\[ A = 3\left(\frac{1}{2}\right)^{t/25}. \]

In 2010, \(t = 60\) so the amount left would be

\[ A = 3\left(\frac{1}{2}\right)^{60/25} \approx 0.568 \text{ g}. \]

Again the answer is reasonable since 60 years is a little more than two half-lives and after two half-lives one quarter of the original amount would remain.
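Both examples follow one recipe: pick the natural ratio, then divide \(t\) by the time it takes for that ratio to apply once. A small Python sketch of the recipe is given here as an illustration only; the helper exponential_model and its parameter names are assumptions, not notation from the notes.

```python
def exponential_model(initial, ratio, step):
    """Return f with f(t) = initial * ratio**(t / step), so f is multiplied by `ratio` every `step` units of t."""
    def f(t):
        return initial * ratio ** (t / step)
    return f


population = exponential_model(10_000, 2, 3)   # Example 1: doubles every 3 days
strontium = exponential_model(3, 0.5, 25)      # Example 2: halves every 25 years

print(round(population(5)))       # population 5 days from now
print(round(strontium(60), 3))    # grams left 60 years after 1950
```

The sanity checks from the text carry over directly: population(3) should be twice the starting value, and strontium(50) should be one quarter of the original 3 grams.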


Notice that in both examples making a suitable choice of common ratio (the base of the exponential function) made finding the exponent quite easy. We could have used another base, say 10 or the famous number \(e\) (much more about \(e\) later in the course), but then we would have had to play around with logarithms and algebra to find the exponent.

Long ago it was very difficult to do actual calculations with any base other than 10 or \(e\), so it was reasonable to put up with the logarithms and algebra to put problems into a standard form. Since the invention of calculators with an \(x^y\) button, however, there has been little reason to do so for problems like this. We will see when we discuss calculus that the base \(e\) does have a privileged position for some very important purposes, but not for problems like these.

To pursue this a little bit farther, using the ideas above amounts to finding exponential formulas in the general form

\[ f(x) = Cb^{kx} \]

where the base \(b\) is the positive constant ratio, the constant \(k\) is determined by the change in independent variable associated with the ratio (more specifically \(k\) is the reciprocal of the \(x\)-difference needed for the value \(f(x)\) to be multiplied by the common ratio \(b\)) and \(C\) is the value \(f(0)\). We don’t really need to manipulate all three of the constants \(b\), \(k\), and \(C\) in order to produce all exponential functions, and many textbooks do not mention all of the possibilities. Sometimes only the base \(e\) is introduced. The text we use in Math 124-5 and 134-5 mentions only the two forms \(Ce^{kx}\) and \(Ca^x\). I have not used either of these forms in any of the examples in this section because it seems to me that allowing the maximum flexibility is the simplest and most natural way to think about exponential growth, but it is not mathematically incorrect to restrict yourself to some narrower range of possible formulas.
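To see that the three forms describe the same functions, one can rewrite Example 1 each way and compare values. The Python sketch below is only an illustration; it gets ahead of the story by using a logarithm (math.log) to build the base-\(e\) form, and the function names are made up for this purpose.

```python
import math

C, b, k = 10_000, 2, 1 / 3   # Example 1: P = 10,000 * 2**(t/3)


def general_form(x):
    """C * b**(k*x), the flexible form used in this section."""
    return C * b ** (k * x)


def base_e_form(x):
    """C * e**(k2*x), where k2 = k * ln(b) absorbs the change of base."""
    return C * math.exp(k * math.log(b) * x)


def single_base_form(x):
    """C * a**x, where a = b**k is the ratio for an x-difference of 1."""
    a = b ** k
    return C * a ** x


for x in (0, 1, 5, -2.5):
    print(x, general_form(x), base_e_form(x), single_base_form(x))
```

Up to rounding, the three columns should agree, which is the sense in which restricting to one form loses nothing mathematically.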

When an exponential growth or decay problem asks you to find the time for the quantity to reach some specified level, a symbolic solution will usually inescapably involve logarithms. We will come to that shortly. Here I will point out that we can also solve such a problem graphically or numerically without using logarithms.

Example 3. When will the population of the bacteria colony of Example 1 reach 100,000?

Solution. We found in Example 1 that the population satisfies \(P = 10{,}000 \cdot 2^{t/3}\). Thus we must solve the equation \(10{,}000 \cdot 2^{t/3} = 100{,}000\), or, dividing through by 10,000, \(2^{t/3} = 10\). I will show how to solve this equation in two different ways: graphical and numerical. In a later section we will do it symbolically with logarithms as you are used to doing.

(i) For a graphical solution, a rough value for \(t\) can be found simply by graphing the equation y1 = 2^(x/3) (I have used the TI-89 syntax) and observing, say by tracing the curve, where it crosses \(y = 10\). For much better accuracy with less effort, also graph the horizontal line y2 = 10 and find the intersection of the two curves using the Intersection command on the MATH menu (F5 when the graph is showing). You should see on your screen something like the picture below, where I have used the viewing rectangle \(9 \le x \le 12\), \(5 \le y \le 15\).

[Figure: graphs of y1 = 2^(x/3) and y2 = 10 in the viewing rectangle \(9 \le x \le 12\), \(5 \le y \le 15\).]

(I chose the \(x\)-limits by noticing that \(2^3 < 10 < 2^4\), so \(3 < x/3 < 4\), or \(9 < x < 12\).) The result is (shifting back to the original variable \(t\)) \(t \approx 9.9657842847\), that is, the population will reach 100,000 in slightly less than 10 days. If you do the problem symbolically (or wait until we do it that way in a couple of sections), you will see that this value for \(t\) is just as accurate as a decimal obtained by finding the solution symbolically and converting to a decimal. It is to emphasize that fact that I have written it here to so many decimal places.

(ii) Getting a solution numerically is more awkward than the graphical method unless we resort to button pushing. I already used the rough numerical estimate \(9 < t < 12\) to choose a viewing rectangle. We could close in on a solution by successively guessing a value for \(t\), plugging it into the function \(2^{t/3}\), and using the result to select the next guess. For instance, we might generate the following table.

\[ \begin{array}{r|r} t & 2^{t/3} \\ \hline 9 & 8 \\ 10 & 10.079 \\ 9.9 & 9.849 \\ 9.97 & 10.0097 \\ 9.965 & 9.9982 \\ 9.966 & 10.0005 \end{array} \]

Clearly this is much more tedious than the graphical method, for less result.

The button pushing version of this is to use the Solve command (first entry in the Algebra menu, F2). Enter solve(2^(t/3) = 10, t) on the entry line and push “≈” (green diamond + enter). (If you just press enter it tells you \(\frac{3 \ln 10}{\ln 2}\), which is getting ahead of the story.) The calculator returns 9.96578.


In general the difficulty with Solve commands is that equations often have more than one solution and you want some particular one. The Solver on the TI-84 will tell you one solution, but will not tell you whether it is the one you want or whether there are others. This can often lead to errors. The Solve command on the TI-89 is much more sophisticated and will tell you all of the solutions that it finds, sometimes even when there are infinitely many. (Try solve(sin(x) = 1/2, x). Note that solve(sin(x) = .5, x) produces a slightly different result.) However, even it will sometimes fail to find all solutions; see the following example. In this particular case we know that \(2^{t/3}\) is always increasing so the equation \(2^{t/3} = 10\) can have only one solution. Thus Solve has found the right one. In general, however, when using Solve you should always use other knowledge (usually graphical) to check that what Solve has found is what you want to know.
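The guess-and-refine table in part (ii) of Example 3 can be automated by repeatedly halving the interval \(9 < t < 12\). The sketch below uses plain bisection in Python as a stand-in for the calculator; it relies on the fact noted above that \(2^{t/3}\) is increasing, and the function name solve_increasing is an illustrative assumption.

```python
def solve_increasing(f, target, lo, hi, tol=1e-10):
    """Bisection for f(t) = target, assuming f is increasing and f(lo) < target < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid      # the solution lies to the right of mid
        else:
            hi = mid      # the solution lies at or to the left of mid
    return (lo + hi) / 2


t = solve_increasing(lambda t: 2 ** (t / 3), 10, 9, 12)
print(t)                          # time in days for the ratio 2**(t/3) to reach 10
print(10_000 * 2 ** (t / 3))      # population at that time
```

Because the function is increasing, each comparison with the target says unambiguously which half of the interval to keep, which is exactly the reasoning used to pick each new guess in the table.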

Cautionary Example. If you use the TI-89 to investigate when the ratio \(x^{10}/2^x\) is relatively small with solve(x^10/2^x = .1, x) it will find two solutions: \(x \approx -.754\) and \(x \approx .842\). There is, however, a second positive solution that is somewhat larger than these. (We will discuss in a later section how we can be certain that it exists even before looking for it.) If we try to help Solve out by insisting it look only at values of \(x\) greater than 1 with solve(x^10/2^x = .1, x)|x > 1, it still can’t find one. However you can easily find one graphically as in the previous example. (You should do this.) The moral is that you should never believe that a calculator or computer has found all that there is to find unless you can produce some reason, graphical or otherwise, that it is so.

A sophisticated calculator or computer will usually tell the whole story, provided that you ask it the right question, but every once in a while it will fail to do so, and then your bridge will collapse, your satellite will fall out of the sky, or your patient will die unless you are prepared for this possibility. (And like all machinery it will always choose a moment when you are not on your guard if it can. If accidents happened only when we were expecting them, they would be easier to avoid.)
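One way to act on this moral is to scan a wide interval for sign changes of \(x^{10}/2^x - 0.1\) and refine each one, instead of trusting a single solver call. The following Python sketch does that; the scan range \(-5 \le x \le 100\) and the step 0.25 are assumptions chosen for illustration, and a coarser or narrower scan could itself miss solutions, which is the same caution in another guise.

```python
def g(x):
    """x**10 / 2**x - 0.1, which is zero exactly where the ratio equals 0.1."""
    return x ** 10 / 2 ** x - 0.1


def bisect(f, lo, hi, tol=1e-10):
    """Refine a bracket [lo, hi] on which f changes sign."""
    flo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid   # sign change is in the right half
        else:
            hi = mid              # sign change is in the left half
    return (lo + hi) / 2


def roots_by_scanning(f, start, stop, step):
    """Look for sign changes on a grid and refine each one by bisection."""
    roots = []
    n = int(round((stop - start) / step))
    for i in range(n):
        a = start + i * step
        b = a + step
        if (f(a) < 0) != (f(b) < 0):
            roots.append(bisect(f, a, b))
    return roots


roots = roots_by_scanning(g, -5.0, 100.0, 0.25)
print(roots)
```

The scan should report the two small solutions quoted above together with the larger positive one that Solve missed.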

1.4.1 Terminology

One of the minor difficulties of doing mathematics is that math terms are often words shared with the ordinary language, but with different meanings or usages. To make it worse the differences are often subtle enough that they are not easily apparent. For instance, in ordinary usage if we see a flat surface we often say that it has no slope. However it is not correct to say that a horizontal line has no slope. It does have a slope (ratio of rise to run) and that slope is 0. There can be a similar confusion over the word exponential. For us an exponential function is one with the constant ratio property. Functions like \(x^2\) have an exponent in their formula, but they are not exponential functions because, as we will verify in the exercises, they do not have the constant ratio property. Functions of the form \(x^n\) for some constant exponent \(n\) are power functions. Similarly, the phrase “growing exponentially” is often used in casual language to mean “growing fast”, but this is not proper usage in mathematics.

1.4.2 Confession

Just what does the symbol \(2^x\) mean? If \(x\) is an integer it is easy to answer this question; the exponent just counts the number of times you multiply 2 times itself: \(2^2 = 2 \times 2\), \(2^3 = 2 \times 2 \times 2\) and so forth.

If \(x\) is a fraction, it is partly easy in terms of the preceding paragraph. \(2^{1/2}\) is the number whose square is 2, that is, the number such that \(2^{1/2} \times 2^{1/2} = 2\). Similarly, \(2^{1/3}\) is the number such that \(2^{1/3} \times 2^{1/3} \times 2^{1/3} = 2\) and then \(2^{2/3} = \left(2^{1/3}\right)^2\). Proceeding in this way we can define \(2^{p/q}\) for any non-zero integers \(p\) and \(q\).

It is only partly easy to define \(2^{p/q}\) in this way because this construction implicitly assumes that there is a number whose square is 2, that is, a solution to the equation \(x^2 = 2\), and strictly speaking we should be able to justify that assumption. (Of course we can use a calculator to find a number whose square is very close to 2 such as 1.41421356237, but that does not prove that there is a number whose square is exactly 2, which is what the symbol \(2^{1/2}\) is supposed to mean.) Justifying the existence of the square root of 2 is worth doing because we know that some equations just don’t have any solutions at all (like \(2^x = 0\), though there are values for \(x\) so that \(2^x\) is very close to 0) and others, like \(x^2 = -1\), don’t have any solutions that are real numbers. It turns out that while the existence of the square root of 2 can indeed be justified very carefully, such a justification depends first on defining the set of real numbers very carefully (not easy) and then doing quite a lot of reasoning. So in this course we will just go on believing in \(\sqrt{2}\) (I hope you believe) without any precise justification.

And what about \(2^x\) when \(x\) is an irrational number, for instance \(2^\pi\)? Now there is not even a simple interpretation in terms of roots and integer powers. There is a fairly natural intuitive idea: once \(2^x\) has been defined for all rational numbers \(x\), it should be possible to fill in the values of \(2^x\) for the irrational numbers \(x\) that you can think of as the “gaps in the number line left between the rational numbers” by just extending \(2^x\) so that it makes a smoothly increasing graph. Proving carefully that this can be done is still, however, a fairly lengthy process, even after we have a careful definition of the real numbers to work with.

In practice, this lack of a really careful definition of exponential functions shouldn’t cause any difficulty in this class. The intuitive ideas, reinforced by what you see on your calculator, are all that you will need here. But nailing down these ideas more precisely is something you might want to put on your list of things to watch out for in the future.
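The “filling in the gaps” idea can at least be watched numerically. The Python sketch below evaluates \(2^{p/q}\) at the truncated decimal expansions 3, 3.1, 3.14, ... of \(\pi\), using the root-then-power construction described above. It is an illustration under floating-point arithmetic, not a proof, and the helper name two_to_rational is made up here.

```python
import math
from fractions import Fraction


def two_to_rational(p, q):
    """2**(p/q) computed as the p-th power of the q-th root of 2."""
    return (2.0 ** (1.0 / q)) ** p


# Rational numbers 3, 31/10, 314/100, ... that creep up on pi from below.
approximations = [Fraction(int(math.pi * 10 ** d), 10 ** d) for d in range(7)]
values = [two_to_rational(r.numerator, r.denominator) for r in approximations]

for r, v in zip(approximations, values):
    print(float(r), v)
print("built-in 2**pi:", 2 ** math.pi)
```

The printed values should increase steadily and settle toward the calculator’s value of \(2^\pi\), which is the behaviour a smoothly increasing extension would require.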

1.4.3 Compound Interest, Inflation Rates, etc.

Recall that if you deposit \(P_0 = \$100\) at 6% interest compounded annually (not likely these days), then after one year you will have \(P_1 = P_0 + .06P_0 = 1.06P_0 = \$106\). After another year you will have \(P_2 = P_1 + .06P_1 = 1.06P_1 = 1.06^2 P_0 \approx \$112.36\). In the second year you got $6.36 interest instead of just $6 because you also got interest on the first year’s interest. Continuing, after \(n\) years you would have \(P_n = (1.06)^n P_0\) dollars. The amount of interest you get each year increases in proportion to your balance at the beginning of the year.

Suppose instead that the 6% interest is compounded monthly. That means you get interest every month at \(\frac{1}{12}\) the annual rate or 0.5%. Now after a year you will have gotten 0.5% interest 12 times, each time on your balance at the beginning of that month. Thus you will have \((1.005)^{12} P_0 \approx \$106.17\). Compounding monthly has given you 17 cents more interest for the year than compounding annually.

Now suppose that the 6% interest is compounded daily. For some reason banks usually consider that there are 360 days in a year, so after one year you will have gotten \(\frac{6}{360}\)% interest 360 times, that is, your initial deposit will have grown to

\[ \left(1 + \frac{.06}{360}\right)^{360} P_0 \approx \$106.18. \]

It seems clear that it is to your advantage for the interest to be compounded more frequently. It is not so clear whether there is much point in being greedy and looking for a bank with a 6% interest rate that compounds the interest even more frequently than daily. The extra interest that you get may be so small that it’s not worth asking around to get it.

This is really a question about a limit. What is the limit as \(n \to \infty\) of \(\left(1 + \frac{.06}{n}\right)^n\), the amount that $1, invested at 6% interest compounded \(n\) times per year, will grow to in one year? We will perhaps see a bit later that the limit is \(e^{.06} \approx 1.0618365\). Since \(\left(1 + \frac{.06}{360}\right)^{360} \approx 1.0618312\), there is not much to be gained by compounding more often than daily.
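The effect of compounding more and more often is easy to tabulate. Here is a short Python sketch, offered as an illustration; the function name balance_after_one_year and the particular choices of \(n\) beyond 360 (hourly and per-minute compounding on a 360-day year) are assumptions made for the demonstration.

```python
import math


def balance_after_one_year(principal, annual_rate, periods_per_year):
    """Balance after one year when interest is compounded periods_per_year times."""
    return principal * (1 + annual_rate / periods_per_year) ** periods_per_year


# Annual, monthly, daily (360-day year), hourly, and per-minute compounding of $100 at 6%.
for n in (1, 12, 360, 8640, 518400):
    print(n, balance_after_one_year(100, 0.06, n))

# The limiting value for continuous compounding.
print("limit", 100 * math.exp(0.06))
```

The balances should rise with \(n\) but stay below the limiting value, with the gain beyond daily compounding amounting to a small fraction of a cent on $100.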

EXERCISES.

1. Find possible exponential formulas for these graphs. Don’t guess at the coordinates of points where the value is not obvious.

(a) [Figure: graph passing through the labeled points (0, 3) and (2, 12).]

(b) [Figure: graph passing through the labeled points (1, 3) and (2, 12).]

(c) [Figure: graph passing through the labeled points (−2, 8) and (1, 2).]

2. A recent study suggests that human tooth size continues to decrease at a rate of 1% per 1000 years. What will be the tooth size of our descendants 10,000 years from now if this rate persists (as a percentage of our present tooth size)? In about how many years will human teeth be half their present size?

3. A colony of bacteria in a Petri dish doubles in size every 10 minutes under optimal conditions. At noon on a certain day, the Petri dish is completely covered with bacteria. For each of these percentages, at what time (to the nearest hour, minute, and second) was that percentage of the dish covered with bacteria?

(a) 50%

(b) 25%

(c) 5%

(d) 1%

4. Recently on NPR the chief financial guru for a prominent Washington think-tank said, “If the American economy grows by 4% per year for the next 25 years, national wealth will have doubled.” Is this statement correct? If not, correct it.

5. Manufacturing companies just inside Mexico but owned by United States corporations are called maquiladoras. In 1994 it was predicted that gross production in the maquila industry would grow by 19% per year between 1993 and 1998 and that employment would grow by 9% per year over the same period.

(a) By what factors will production and employment increase over the entire five year period?

(b) By what percentage will they increase over the same period?

6. Atmospheric pressure decreases by 0.4% for each 100 feet of altitude gain.


(a) If the pressure at sea level is 30 inches of mercury, what is the pressure at 1700 feet?

(b) When the Olympic Games were held in Mexico City (altitude 7340 feet) in 1968 there was much discussion of the effect the altitude had on athletic performances. (Some world records in the sprints were not beaten for quite a few years; on the other hand the times in distance events were quite slow.) By what percentage is air pressure reduced in Mexico City?

7. A typical ream of paper (500 sheets) is about 2 inches thick, so we may estimate the thickness of a single sheet as 0.004 inches.

(a) If you fold a single sheet of paper in half five times in succession (i.e. first in half, then in half again, then...) what would be the total thickness of the folded piece of paper?

(b) Do it: fold a piece of paper in half five times, measure the thickness, and compare to your answer to (a). Note that the thickness will depend on where you measure the folded piece. Why is it reasonable that the smallest measurement should be the closest to your answer to (a)?

(c) Suppose you could fold the paper in half 50 times. What would the thickness of the folded piece be then? Name a real-life distance that is approximately the same as this. (You will need to convert inches to some other units.)

8. The median price of a new house in the US was about $25,000 in 1970 and about $125,000 in 1990.

(a) Which kind of growth rate seems most reasonable to expect for house prices: linear growth (constant amount per year) or exponential growth (constant percentage increase per year)? Explain your choice.

(b) Assuming that house prices increase at a linear rate, find the linear function and fill in the top row of the table below.

(c) Assuming that house prices increase exponentially, find the exponential function and fill in the second row of the table below.

(d) Graph both functions on the same set of axes.

(e) The actual median price for a new house in the US in 2009 was $217,000 (down from $248,000 in 2007). Which kind of growth fits the data better? (For more data points consult www.census.gov/const/uspriceann.pdf.)

\[ \begin{array}{l|cccccc} & 1970 & 1980 & 1990 & 2000 & 2010 & 2020 \\ \hline \text{Linear} & 25{,}000 & & 125{,}000 & & & \\ \text{Exponential} & 25{,}000 & & 125{,}000 & & & \end{array} \]


9. Chess is supposed to have been invented (in a somewhat different form) in India in the 6th century AD. Legend has it that the game was introduced to the ruler of the time by a poor itinerant mathematician. The ruler was so pleased that he offered to give the mathematician any reward he wanted. The mathematician asked for a single grain of wheat for the first square on the chessboard, two grains of wheat for the second square, four grains for the third square, 8 for the fourth square and so forth for all 64 squares of the \(8 \times 8\) chessboard. After objecting to such a paltry award the ruler asked for a sack of wheat to be brought and the squares to be filled according to the mathematician’s desire.

(a) How many grains of wheat would be required to fulfill the mathematician’s request?

(b) If each grain has a volume of two cubic millimeters, what would the total volume of the wheat be? Translate your answer into cubic meters.

(c) World wheat production for 2010 is estimated at \(2.4 \times 10^{11}\) bushels. The volume of one bushel is about 0.035 cubic meters. How many squares of the mathematician’s chess board would this cover?

10. In 1989 the U.S. inflation rate was 4.6% per year. This means that on the average any price at the beginning of 1990 was 1.046 times the price of the same good at the beginning of 1989. Also in 1989, the inflation rate in Argentina was 33% per month.

(a) What was the annual inflation rate in Argentina? (The real question here is how do you convert a monthly inflation rate to an annual inflation rate. Think about ratios and differences. You will also have to think about the relationship between multiplicative factors and inflation rates.)

(b) What was the monthly inflation rate in the US? (An inverse problem!)

11. The October 5, 2002 issue of the Economist states that when Fernando Cardoso, outgoing president of Brazil, started his first term in 1994, Brazil’s annual inflation rate was 10,445%.

(a) Suppose that this were the current US inflation rate. What would be the price one year from now of a loaf of bread that now costs $4? (Even that seems like a lot to me!)

(b) What would be the monthly inflation rate?

(c) What would the loaf of bread cost 6 months from now?


(d) Suppose that we did a savings-bank like computation for (b), taking the monthly inflation rate to be 1/12 of the annual rate, and then converted back to the annual rate by applying the monthly rate 12 times. What annual rate would we get now? (Bigger, but how much bigger?)

12. We have seen that exponential functions are characterized by the “equal ratios” property. What happens if we compute ratios for other functions? For \(f(x) = x^2\), compute \(f(n + 1)/f(n)\) for \(n = 1, 2, 3, 10, 100, 1000\). (In each case this is the ratio corresponding to a change of one in \(x\).) Is the ratio of \(y\)-values constant? Does the ratio at least seem to be approaching some definite value? If so, what?

13. Give a symbolic argument (expanding and simplifying \((n + 1)^2/n^2\)) that the ratios in #12 approach the limit you guessed.

14. Repeat #12 and #13 for \(f(x) = x^3\).

1.5 Trigonometric Functions

1.5.1 Definitions

1. Triangle definition (for 0 ≤ θ ≤ π/2). If θ is an acute angle in a right triangle, then

$$\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}}; \qquad \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}}; \qquad \tan\theta = \frac{\text{opposite}}{\text{adjacent}}.$$

2. Circle definition (for any real number θ). The coordinates of the point where the unit circle $x^2 + y^2 = 1$ intersects the ray from the origin making an angle of θ with the positive x-axis are (cos θ, sin θ). According to the formula s = rθ relating arc length on a circle to the angle subtended by the arc, the point (cos θ, sin θ) is also the point on the unit circle a distance θ counterclockwise along the circle from (1, 0). Then tangent can be defined by tan θ = sin θ/cos θ.

[Figures: “Triangle definition” (a right triangle with angle θ) and “Circle definition” (the unit circle with the point (cos θ, sin θ) marked).]

Some facility in manipulating the circle definition is often useful, especially when dealing with angles not between 0 and π/2. Many properties of trig functions are evident from this picture.

Recall that for any integer k (positive or negative), θ and θ + 2kπ represent the same angle.

The other three trig functions, secant, cosecant, and cotangent are much less important. For the record, they are defined by

sec θ = 1/cos θ; csc θ = 1/sin θ; cot θ = 1/tan θ.

Secant arises from time to time in identities based on tan, but csc and cot are hardly ever really useful.

1.5.2 Elementary Properties

1. A function f defined for all real numbers is periodic of period P if f(x + P) = f(x) for all real x. The trig functions sine and cosine are periodic with period 2π, that is, sin(x + 2π) = sin x and similarly for cosine. Tangent has period π, as we will see in section 1.5.3. In general any periodic function has infinitely many periods: if f(x + p) = f(x) then also f(x + 2p) = f(x), f(x + 3p) = f(x), f(x − p) = f(x) and so forth. We will use the phrase “f has period p” to mean that p is the smallest positive number such that f(x + p) = f(x) for all x.

2. Cosine is an even function (that is, cos(−x) = cos x for all real x); sine and tangent are odd functions (that is, sin(−x) = −sin x and tan(−x) = −tan x for all x where they are defined). (Obvious from the circle definition.) More generally, we say a function f is even around a number c if f(c − x) = f(c + x) for all x and f is odd around c if f(c − x) = −f(c + x). Thus sin is even around −π/2, π/2, 3π/2 and more generally π/2 + kπ for each integer k, and odd around kπ for each integer k, while cos is even around kπ for each k and odd around π/2 + kπ for each k.

[Figures: sin x is even around x = π/2; cos x is odd around x = π/2.]

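3. −1 ≤ sin x ≤ 1, −1 ≤ cos x ≤ 1. (We say that the amplitude of sine and cosine is 1.) Moreover, for each integer k,

$$\sin x = \begin{cases} 0, & x = k\pi, \\ 1, & x = \left(2k + \tfrac{1}{2}\right)\pi, \\ -1, & x = \left(2k - \tfrac{1}{2}\right)\pi, \end{cases}$$

$$\cos x = \begin{cases} 0, & x = \left(\tfrac{2k+1}{2}\right)\pi, \\ 1, & x = 2k\pi, \\ -1, & x = (2k+1)\pi, \end{cases}$$

$$\tan x = \begin{cases} 0, & x = k\pi, \\ 1, & x = \left(k + \tfrac{1}{4}\right)\pi, \\ -1, & x = \left(k - \tfrac{1}{4}\right)\pi. \end{cases}$$

(All these are obvious from the circle definition.)

4. $\sin^2 x + \cos^2 x = 1$. (This is Pythagoras in the triangle definition and the equation of the circle in the circle definition.)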
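Properties 1 through 4 can also be spot-checked numerically. The following is only a sketch in Python, using the standard `math` module; the sample points and the helper `is_close` are arbitrary choices made for illustration, and a handful of sample points is evidence, not a proof.

```python
import math

def is_close(a, b, tol=1e-9):
    """Compare two floats up to a small tolerance (illustrative helper)."""
    return abs(a - b) <= tol

# Sample points chosen away from odd multiples of pi/2, where tan is undefined.
samples = [-2.3, -0.7, 0.4, 1.1, 2.9]

# Property 1: sine and cosine repeat after 2*pi.
periodic = all(
    is_close(math.sin(x + 2 * math.pi), math.sin(x))
    and is_close(math.cos(x + 2 * math.pi), math.cos(x))
    for x in samples
)

# Property 2: cosine is even; sine and tangent are odd.
parity = all(
    is_close(math.cos(-x), math.cos(x))
    and is_close(math.sin(-x), -math.sin(x))
    and is_close(math.tan(-x), -math.tan(x))
    for x in samples
)

# Property 2, generalized: sine is even around pi/2, cosine is odd around pi/2.
c = math.pi / 2
shifted = all(
    is_close(math.sin(c - x), math.sin(c + x))
    and is_close(math.cos(c - x), -math.cos(c + x))
    for x in samples
)

# Property 3: sine and cosine stay between -1 and 1.
bounded = all(abs(math.sin(x)) <= 1 and abs(math.cos(x)) <= 1 for x in samples)

# Property 4: the Pythagorean identity.
pythagoras = all(is_close(math.sin(x) ** 2 + math.cos(x) ** 2, 1.0) for x in samples)

print(periodic, parity, shifted, bounded, pythagoras)
```

Each flag should come out true; changing a sign in any of the identities makes the corresponding flag false, which is a useful way to test your memory of them.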

Periods and Amplitude

What is the period of f(x) = sin 3x? One way to think about this is to note that f will go through one period as the expression that sine acts on increases from 0 to 2π. Now 3x = 0 when x = 0 and 3x = 2π when x = 2π/3. Thus the period of f is 2π/3 as you can verify by graphing f. Similarly, for any k > 0 the period of f(x) = sin kx is 2π/k since kx increases from 0 to 2π when x increases from 0 to 2π/k. Do not memorize this formula! Instead remember the method of the paragraph and work out the period whenever you need it.

The range of g(x) = 5 sin 3x is −5 ≤ y ≤ 5 with the extreme values occurring when sin 3x = ±1. We say that the amplitude of g is 5, since its graph has the shape of a sine curve but varying between ±5 instead of ±1. The graph of h(x) = 2 + 5 sin 3x is similar but two units higher. The amplitude is still 5. In general, for any A > 0 and any constant c, the amplitude of g(x) = c + A sin kx is A, meaning that the graph varies by A up and down from y = c.

[Figures: graphs of 5 sin 3x and 2 + 5 sin 3x.]
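The period and amplitude reasoning above can be checked by sampling. This is a rough Python sketch for h(x) = 2 + 5 sin 3x; the function name `g`, its default parameters, and the grid size are illustrative assumptions, not anything prescribed by the notes.

```python
import math

def g(x, c=2.0, A=5.0, k=3.0):
    """Illustrative helper: c + A*sin(k*x); the defaults give 2 + 5 sin 3x."""
    return c + A * math.sin(k * x)

k = 3.0
# k*x runs from 0 to 2*pi exactly when x runs from 0 to 2*pi/k.
period = 2 * math.pi / k

# Sample one full period on a fine grid and read off the extreme values.
n = 3000
values = [g(period * i / n) for i in range(n + 1)]
amplitude = (max(values) - min(values)) / 2   # half the total swing
midline = (max(values) + min(values)) / 2     # the constant c

# The function repeats after one period, but not after half a period.
test_points = (0.1, 0.5, 1.7)
repeats = all(abs(g(x + period) - g(x)) < 1e-9 for x in test_points)
half_repeats = all(abs(g(x + period / 2) - g(x)) < 1e-9 for x in test_points)

print(round(period, 4), round(amplitude, 4), round(midline, 4), repeats, half_repeats)
```

The sampled swing recovers the amplitude 5 and the midline y = 2, and the half-period test fails, which is consistent with 2π/3 being the smallest positive period.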

1.5.3 Addition and Subtraction Formulas

These are the single most important (because most useful) trig identities. To put it another way, they are worth remembering, and are just about the only trig formulas worth trying to remember because most other trig identities are easily derived from them. (A few samples below.) An ability to manipulate them is essential for many problems. They are

cos(x + y) = cos x cos y − sin x sin y,
cos(x − y) = cos x cos y + sin x sin y,
sin(x + y) = sin x cos y + cos x sin y,
sin(x − y) = sin x cos y − cos x sin y.

It’s best just to remember these, but in case of doubt they can be recovered without too much trouble. The subtraction formula for cosine follows from the circle definition and the distance formula. (Use the distance formula to compute the square of the length of the chord in a circular sector of angle x − y in two different ways. Then expand both, set them equal, and simplify. See the diagrams below.) Then the addition formula for cosine is immediate from writing x + y = x − (−y) and using the subtraction formula and the fact that cosine is even and sine is odd.

[Figures: left, the chord from (cos x, sin x) to (cos y, sin y) on the unit circle; right, the chord from (cos(x − y), sin(x − y)) to (1, 0).]

Finally, the addition and subtraction formulas for sine can be derived from these by using the identities cos(π/2 − x) = sin x, sin(π/2 − x) = cos x which are obvious from the circle definition. (Why?) For instance,

sin(x + y) = cos(π/2 − (x + y)) = cos((π/2 − x) − y).

Now expand using the subtraction formula for cosine and simplify.
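The chord argument behind the subtraction formula for cosine can be tested numerically before doing the algebra. In this Python sketch the helper names `chord_sq` and `check` and the sample angles are illustrative assumptions; the two squared chord lengths should agree, and each should match its expanded form.

```python
import math

def chord_sq(p, q):
    """Squared distance between two points p and q in the plane."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def check(x, y):
    # Chord between the points at angles x and y on the unit circle.
    first = chord_sq((math.cos(x), math.sin(x)), (math.cos(y), math.sin(y)))
    # The same chord rotated so that one end sits at (1, 0).
    second = chord_sq((math.cos(x - y), math.sin(x - y)), (1.0, 0.0))
    # Expanding `first` gives 2 - 2(cos x cos y + sin x sin y);
    # expanding `second` gives 2 - 2 cos(x - y).
    expanded_first = 2 - 2 * (math.cos(x) * math.cos(y) + math.sin(x) * math.sin(y))
    expanded_second = 2 - 2 * math.cos(x - y)
    return first, second, expanded_first, expanded_second

results = [check(x, y) for x, y in [(1.2, 0.5), (2.5, -0.8), (-1.0, 3.0)]]
agree = all(max(r) - min(r) < 1e-12 for r in results)
print(agree)
```

Setting the two expanded forms equal and cancelling the 2's is exactly the subtraction formula cos(x − y) = cos x cos y + sin x sin y.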

Examples of the Use of the Addition Formulas

1. We say that a horizontal translate of a trig function is a phase shift. For instance, comparing the graphs of sin x and cos x

[Figure: graphs of sin x and cos x on the same axes.]

we see that the graph of sin x is a translation of the graph of cos x to the right by π/2. Symbolically, sin x = cos(x − π/2). To verify this, use the subtraction formula for cosine:

$$\cos(x - \pi/2) = \cos x \cos\frac{\pi}{2} + \sin x \sin\frac{\pi}{2} = \cos x \cdot 0 + \sin x \cdot 1 = \sin x.$$

2. Set x = y to get the double angle formulas

$$\sin 2x = \sin(x + x) = 2 \sin x \cos x$$

and

$$\cos 2x = \cos(x + x) = \cos^2 x - \sin^2 x = 2\cos^2 x - 1,$$

or equivalently

$$\cos^2 x = (1 + \cos 2x)/2.$$

3. Add the addition and subtraction formulas for cos to get

$$\cos x \cos y = \frac{1}{2}\left[\cos(x + y) + \cos(x - y)\right].$$

4. To see that tangent has period π,

$$\tan(x + \pi) = \frac{\sin(x + \pi)}{\cos(x + \pi)} = \frac{(\sin x) \cdot (-1) + 0 \cdot (\cos x)}{(\cos x) \cdot (-1) - 0 \cdot (\sin x)} = \frac{\sin x}{\cos x} = \tan x.$$

Again, it is a waste of effort and a possible source of error to try to remember these formulas. Instead remember how to get them from the addition and subtraction formulas.
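If you rederive one of these formulas and are unsure of a sign, a numerical spot check is cheap. The sketch below, in Python, tests the four examples at a few arbitrarily chosen points; the variable names and tolerances are illustrative assumptions.

```python
import math

xs = [0.3, 1.0, 2.2, -1.4]
ys = [0.7, -0.5, 1.9, 2.6]
tol = 1e-12

# Example 1: sin x is cos x shifted right by pi/2.
phase = all(abs(math.sin(x) - math.cos(x - math.pi / 2)) < tol for x in xs)

# Example 2: the double angle formulas.
double = all(
    abs(math.sin(2 * x) - 2 * math.sin(x) * math.cos(x)) < tol
    and abs(math.cos(2 * x) - (2 * math.cos(x) ** 2 - 1)) < tol
    for x in xs
)

# Example 3: a product of cosines written as a sum.
product = all(
    abs(math.cos(x) * math.cos(y) - 0.5 * (math.cos(x + y) + math.cos(x - y))) < tol
    for x, y in zip(xs, ys)
)

# Example 4: tangent repeats after pi (looser tolerance, since tan can be large).
tan_period = all(abs(math.tan(x + math.pi) - math.tan(x)) < 1e-9 for x in xs)

print(phase, double, product, tan_period)
```

A wrong sign in any derived formula shows up immediately as a failed check at almost any sample point.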


Law of Cosines

The Law of Cosines generalizes Pythagoras to any triangle. It says that if a triangle has sides a, b, c with angle θ between “legs” a and b, then

$$c^2 = a^2 + b^2 - 2ab\cos\theta.$$

To prove this, drop the perpendicular BD in the triangle below. Note that the lengths |BD| = a sin θ, |DC| = b − a cos θ. Now use Pythagoras on ΔBDC and simplify.

[Figure: triangle ABC with sides a, b, c, the angle θ, and the foot D of the perpendicular from B.]

Law of Sines

If α, β, γ are the angles opposite sides a, b, c respectively, then

$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}.$$

To prove this, consider BD again in the triangle above. (You have to relabel the angles so that, for instance, γ is the angle labeled θ above.) The length of BD is a sin γ and also c sin α. Thus a sin γ = c sin α, or $\frac{\sin\alpha}{a} = \frac{\sin\gamma}{c}$. Consider a different perpendicular to get the other equality.
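Both laws can be checked on a concrete triangle built from coordinates. In this Python sketch the side lengths, the angle, the point names `O`, `P`, `Q`, and the helper `angle_at` are all illustrative assumptions; the vertex with angle θ is placed at the origin with side b along the x-axis.

```python
import math

a, b, theta = 3.0, 5.0, math.radians(40)   # arbitrary illustrative values

# Vertex with angle theta at the origin; side b along the x-axis, side a at angle theta.
O = (0.0, 0.0)
P = (b, 0.0)                                      # far end of side b
Q = (a * math.cos(theta), a * math.sin(theta))    # far end of side a

# Third side c, measured directly and from the Law of Cosines.
c_direct = math.hypot(P[0] - Q[0], P[1] - Q[1])
c_formula = math.sqrt(a**2 + b**2 - 2 * a * b * math.cos(theta))

def angle_at(v, p, q):
    """Angle at vertex v between the rays from v toward p and toward q."""
    ux, uy = p[0] - v[0], p[1] - v[1]
    wx, wy = q[0] - v[0], q[1] - v[1]
    return abs(math.atan2(ux * wy - uy * wx, ux * wx + uy * wy))

alpha = angle_at(P, O, Q)   # opposite side a (the side from O to Q)
beta = angle_at(Q, O, P)    # opposite side b (the side from O to P)
gamma = theta               # opposite side c

# Law of Sines: the three ratios should coincide.
ratios = [math.sin(alpha) / a, math.sin(beta) / b, math.sin(gamma) / c_direct]

print(c_direct, c_formula, ratios)
```

The two values of c agree, the three ratios agree, and the angles add up to π, as they must in any triangle.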

EXERCISES.

1. Indicate whether each function is even, odd, or neither. Explain, graphically or otherwise.

(a) 1 + cos x (b) 1 + sin x (c) x + sin x (d) |sin x| (e) $\sin^2 x$ (f) 2cosx (g) 2sinx.

2. Find the period and amplitude of each function. You should check your answers graphically.

(a) cos x − 2 (b) 3 sin(x/2) (c) −4 cos(2x) + 3 (d) 1 + 2sinx (e) 6 |sin x| (f) 7 |cos(3x)|.


3. The voltage, V, of an electrical outlet is given as a function of time, t, by the function $V = V_0 \cos(120\pi t)$.

(a) What is the period of this function?

(b) What does $V_0$ represent?

(c) Sketch the graph of V. Label the important points on the axes.

4. Show that tan x is an odd function around kπ/2 for each integer k. Illustrate with a diagram for an odd value of k and a nonzero even value of k.

5. Around which real numbers c is f(x) = sin 2x an odd function? An even function? Illustrate with a diagram.

6. Construct an example of each kind of function:

(a) a nonconstant periodic function with period 1

(b) a nonconstant periodic function with period 1 which is not a trig function. A piecewise definition is ok.

(c) a nonconstant periodic function with period 1 which is not a trig function and is differentiable at every point.

7. Show that any function f that is even (around 0) and also even around $\tfrac{1}{2}$ is periodic. What is the largest possible period for f? Find an example of a function with these properties.

8. Each of the following graphs is the graph of a function of the form f(x) = c + a sin(bx). Identify a, b, and c for each graph.

(a) [Graph not reproduced; labeled point (2π, −2).]

(b) [Graph not reproduced; labeled point (2π, 2).]

9. Each of the following graphs is the graph of a function of the form f(x) = c + a cos(x + b). Identify a, b, and c for each graph.

(a) [Graph not reproduced; labeled point (π, −1).]

(b) [Graph not reproduced; labeled point (2, 2).]

10. Let s > 0 be fixed. Let $r_n$ be the distance from the center of a regular n-gon of side $s/n$ (a regular n-gon is a polygon with n sides, all of the same length) to one of the n vertices. Determine, with justification, $\lim_{n\to\infty} r_n$. (You will need a fact about a limit of some trig function.)

11. What is the angle (measured counterclockwise from the positive x-axis as usual) between the x-axis and the graph of $y = 4 - x^2$ at the point (2, 0)? (You can remember calculus for this.) Give the angle in both radians and degrees.

1.6 Operations on Functions

It is often useful to think of a function as having been put together from one or more simpler functions by some sequence of operations, for instance adding or multiplying functions, shifting or stretching, or composing several functions. The single most useful differentiation rule, the chain rule, depends on regarding a complicated function as a composition of members of the standard list of functions whose derivatives are known.

In this section I will concentrate first on shifting and stretching, then on the important transformation x ↦ 1/x, and finally on composition in general. When manipulating function graphs it is easy to get confused, so the moral of this section is: always check by plugging in some points.

1.6.1 Transforming the Domain of a Function

These operations often arise when we are trying to find a formula for a graph that looks rather familiar but is not quite in “standard position.” Here are some examples of the use of shifting and stretching in this context.

Example 1. Find a formula for the parabola on the left below.

[Figures: left, a parabola with labeled point (2, −1); right, the shifted parabola with labeled points (2, 1) and (−2, 1).]

To do this graphically, note first that if we shift the graph two units to the left and one unit up, we get the graph on the right. That clearly has the formula $y = kx^2$ for some constant k, and plugging in the point (2, 1) gives that k = 1/4. Thus the graph on the right is $y = \tfrac{1}{4}x^2$.

Now we shift it back. To move the vertex to x = 2, we replace x by x − 2: $y = \tfrac{1}{4}(x - 2)^2$. (How do we know this is right? Well, where does the right side have its smallest value? Clearly when x − 2 = 0 or x = 2. It is a lot safer to reason like this than to just try to remember rules about which way the graph moves if you replace x with x − 2 or x + 2.) Of course we also have to move the vertex down one unit. Thus the equation of the original parabola is

$$y = \tfrac{1}{4}(x - 2)^2 - 1 = \tfrac{1}{4}x^2 - x = \tfrac{1}{4}x(x - 4).$$

The factored form serves also as a check: we can see from the original diagram that the parabola crosses the x-axis at x = 0 and x = 4, so the formula must contain factors of x and x − 4.

Example 2. Find a formula for the graph on the left:

[Figures: left, the graph of f(x) with labeled point (1, 4); right, the graph of f(x) − 3 with labeled point (1, 1).]

This one is a little ambiguous, so the best we can do is to come up with one formula that seems to fit, recognising that there may be other reasonable answers. The graph decreases to a horizontal asymptote as x → ∞. What kind of functions behave that way? Well, decreasing exponentials, for one, so we will try to make a formula involving exponential functions. (But rational functions like 1/x do also, so we could find other reasonable choices. See Exercise 7.)

As a first step, if we shift the graph down 3 units, we get a graph decreasing to 0 as decreasing exponentials do. Since this new graph goes through the points (0, 2) and (1, 1) it has a ratio of 1/2 corresponding to an x-difference of 1. Thus it is $2 \cdot \left(\tfrac{1}{2}\right)^x = 2^{1-x}$. This makes the original graph

$$y = 2^{1-x} + 3.$$

One could phrase this a little more formally as follows: if we are looking for y = f(x), then the formula for the shifted graph is f(x) − 3. So we found $f(x) - 3 = 2^{1-x}$ or $f(x) = 2^{1-x} + 3$.

Example 3. (Fancier version of Ex. 2) Find a formula for the graph on the left.

[Figures: left, the graph of f(x) with labeled point (1, 1); right, the graph of 3 − f(x) with labeled point (1, 2).]

Again it approaches a horizontal asymptote as x → ∞, and so could be made from an exponential function. This time, though, the approach is from below, so in order to get it into “standard form” we will have to flip the graph over as well as move it vertically. One way to do this is simply to graph the difference between y = 3 and y = f(x), that is, y = 3 − f(x). This is the graph on the right. We could get to the same place in two steps by first moving the graph down 3 units to get the asymptote on the x-axis for the function f(x) − 3 (which is below the axis) and then multiplying by −1 to flip the graph over. Then we have y = −(f(x) − 3) = 3 − f(x) as before.

The new graph passes through (0, 3) and (1, 2) so the common ratio is $\tfrac{2}{3}$ and a formula for it is $3 \cdot \left(\tfrac{2}{3}\right)^x$. Thus we have $3 - f(x) = 3 \cdot \left(\tfrac{2}{3}\right)^x$ or $f(x) = 3 - 3 \cdot \left(\tfrac{2}{3}\right)^x = 3\left(1 - \left(\tfrac{2}{3}\right)^x\right)$, that is, the original graph is

$$y = 3\left(1 - \left(\frac{2}{3}\right)^x\right).$$
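In the spirit of “always check by plugging in some points,” the three formulas from Examples 1 through 3 can be tested against the points read off the graphs. This Python sketch uses the illustrative function names `parabola`, `decay`, and `growth`; the choice x = 60 as a stand-in for “large x” is an arbitrary assumption.

```python
def parabola(x):
    """Example 1: y = (1/4)(x - 2)^2 - 1."""
    return 0.25 * (x - 2) ** 2 - 1

def decay(x):
    """Example 2: y = 2^(1 - x) + 3."""
    return 2 ** (1 - x) + 3

def growth(x):
    """Example 3: y = 3(1 - (2/3)^x)."""
    return 3 * (1 - (2 / 3) ** x)

checks = {
    "parabola has its vertex at (2, -1)": parabola(2) == -1,
    "parabola crosses the x-axis at 0 and 4": parabola(0) == 0 and parabola(4) == 0,
    "decay passes through (0, 5) and (1, 4)": decay(0) == 5 and decay(1) == 4,
    "decay approaches 3 for large x": abs(decay(60) - 3) < 1e-9,
    "growth passes through (0, 0) and (1, 1)": abs(growth(0)) < 1e-12
    and abs(growth(1) - 1) < 1e-12,
    "growth approaches 3 from below": growth(60) < 3 and 3 - growth(60) < 1e-9,
}

for description, passed in checks.items():
    print(description, passed)
```

A formula that fails even one of these checks has been shifted or flipped the wrong way somewhere.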

A more complicated way to transform the domain of a function defined for x > 0 is by means of the mapping x ↦ 1/x. This function maps 1 to itself, and maps any positive number less than 1 to a positive number more than 1 and vice versa. Numbers close to 1 get mapped to other numbers close to 1, and as you get farther away from 1 on one side, the image gets farther from 1 on the other side. You can think of this as a kind of reflection across x = 1, though not a symmetric one, since the numbers in the interval 0 < x < 1 get stretched out so that their images cover the infinite interval x > 1, while in the other direction all numbers x > 1 get squeezed into the interval 0 < x < 1. It is one of the minor “paradoxes” of the infinite that this mapping is one-to-one, that is, there are “just as many” real numbers between 0 and 1 as in the entire infinite interval x > 1.

So how do the graphs of a function f(x) and the transformed function f(1/x) compare on the half-line x > 0? Clearly they agree at the “pivot point” x = 1. Then the two sides of the graph get swapped and resized. For instance, if $f(x) = 2 + x^2$, then $f(1/x) = 2 + (1/x)^2$. Here are the respective graphs where I have included the “line of reflection” x = 1 and some pairs of corresponding points. Note that the value f(0) (= 2 in this case) becomes the value of the horizontal asymptote as x → ∞ for f(1/x).

[Figure: graphs of $2 + x^2$ and $2 + (1/x)^2$.]

Here is a more complicated example where f is the cubic f(x) = 2(x + 1)(x − 1)(x − 2).

[Figure: graphs of f(x) and f(1/x) for f(x) = 2(x + 1)(x − 1)(x − 2).]
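The “reflection” picture for x ↦ 1/x can be made concrete with a few numbers. The following Python sketch uses the first example, f(x) = 2 + x²; the name `f_recip` and the sample points are illustrative assumptions.

```python
def f(x):
    """The example from the text: f(x) = 2 + x^2."""
    return 2 + x ** 2

def f_recip(x):
    """The transformed function f(1/x), defined for x > 0."""
    return f(1 / x)

# The two graphs agree at the pivot point x = 1.
pivot_agrees = f(1) == f_recip(1)

# Corresponding points: the height of f at x equals the height of f(1/x) at 1/x.
pairs = [(x, 1 / x) for x in (0.25, 0.5, 2.0, 4.0)]
swapped = all(f(x) == f_recip(x_image) for x, x_image in pairs)

# As x grows, f(1/x) settles toward f(0) = 2, the horizontal asymptote.
tail = [f_recip(x) for x in (10.0, 100.0, 1000.0)]

print(pivot_agrees, swapped, tail)
```

The sample points are powers of 2 so that the reciprocals are exact in floating point; with other points the comparison would need a small tolerance.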


1.6.2 What is an Asymptote?

We have seen a number of examples of asymptotes in the examples in this section. In each case a curve approaches a horizontal or vertical line from one side as x or y approaches +∞. What other possibilities are there? Well, we could certainly have x or y approaching −∞ instead. For instance the x-axis is a horizontal asymptote for $f(x) = e^x$ since $e^x$ decreases to 0 as x → −∞ and the vertical line x = −1 is a vertical asymptote for $g(x) = -\frac{1}{(x+1)^2}$. See the graphs below.

[Figures: graphs of $e^x$ as x → −∞ and of $-1/(x+1)^2$.]

But what about less obvious examples, say $f(x) = \frac{\sin x}{x^2}$ as x → ∞ or $g(x) = \sqrt{x}$ as x → 0, as illustrated below?

[Figures: graphs of $\sin(x)/x^2$ and $\sqrt{x}$.]

Here is the formal definition: we say that a line L is an asymptote for the graph of f if the distance from the graph to L, measured perpendicularly to L, approaches 0 as we move “out to infinity” along the line in one or both directions. According to this definition, the x-axis is an asymptote for $\frac{\sin x}{x^2}$ since there is no requirement that the graph approach the line from one side only, but the y-axis is not an asymptote for $\sqrt{x}$ since the approach to the y-axis is as we approach the x-axis from above, not as we “move out to infinity.”

Asymptotes do not need to be horizontal or vertical: the arms of the hyperbola $y^2 - x^2 = 1$ are asymptotic to the lines y = ±x as x → ±∞. (Is a parabola asymptotic to any line?)

[Figure: the hyperbola $y^2 - x^2 = 1$.]
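The definition can be explored numerically by measuring the perpendicular distance from a point on the graph to the candidate line. In this Python sketch the function names and sample values of x are illustrative assumptions; for the line y = x the perpendicular distance from a point (x, y) is |y − x|/√2.

```python
import math

def dist_to_x_axis(x):
    """Perpendicular distance from the point (x, sin(x)/x^2) to the x-axis."""
    return abs(math.sin(x) / x ** 2)

def dist_to_diagonal(x):
    """Perpendicular distance from (x, sqrt(1 + x^2)), a point on the upper
    arm of y^2 - x^2 = 1, to the line y = x."""
    y = math.sqrt(1 + x ** 2)
    return abs(y - x) / math.sqrt(2)

xs = [10.0, 100.0, 1000.0, 10000.0]

# |sin x| <= 1 gives |sin(x)/x^2| <= 1/x^2, which shrinks to 0 as x grows,
# even though the graph keeps crossing the axis.
axis_distances = [dist_to_x_axis(x) for x in xs]
bounds = [1 / x ** 2 for x in xs]

# The hyperbola's arm closes in on y = x steadily.
diagonal_distances = [dist_to_diagonal(x) for x in xs]

print(axis_distances)
print(diagonal_distances)
```

For √x the analogous distance to the y-axis is just x, which does go to 0 as x → 0, but the point (x, √x) is then heading to the origin rather than out to infinity, so the definition is not satisfied.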


1.6.3 Composition

It is often useful to think of a function as resulting from a series of simpler functions performed one after the other. The function $\sin(x^2)$, for instance, can be thought of as resulting from first applying the squaring function to x, and then taking the sine of the result. The machine interpretation of a function is helpful here. We are thinking of the function $\sin(x^2)$ as resulting from the activities of two machines. We feed x into the squaring machine and get out $x^2$; then we feed this output into the sine machine and get out $\sin(x^2)$. This can be indicated in a mapping diagram with three lines; the squaring function controls the mapping between the first two lines, the sine function controls the mapping between the second two lines, and the composite function $\sin(x^2)$ is the result of doing first one and then the other.

[Figure: mapping diagram for $\sin(x^2)$, drawn on three number lines each marked from −2 to 2.]

The combined function is called the composition of the two individual functions. The symbol ∘ is sometimes used to denote the composition of two functions, that is, if $f(x) = x^2$ and g(x) = sin x, then $g \circ f(x) = g(f(x)) = \sin(x^2)$ and $f \circ g(x) = f(g(x)) = (\sin x)^2$. Note that the symbol g ∘ f has to be read from right to left, since it is the function on the right that is the “inside function,” the one that is done first.

The shifted and stretched functions of Exercise 1 at the end of the section can be interpreted as compositions. The functions $G\left(\tfrac{1}{2}x\right)$ and $\tfrac{1}{2}G(x)$ there are both compositions of G with the function $h(x) = \tfrac{1}{2}x$, but in the opposite order: $G\left(\tfrac{1}{2}x\right) = G \circ h(x)$ while $\tfrac{1}{2}G(x) = h \circ G(x)$. The function $P(x) = \tfrac{1}{4}(x - 2)^2 - 1$ from Example 1 of subsection 1.6.1 can be regarded as the composition of three functions: $P(x) = f \circ g \circ h(x)$ where $f(x) = x - 1$, $g(x) = \tfrac{1}{4}x^2$, $h(x) = x - 2$. The method of the examples of the first subsection was essentially to strip away the parts that are linear functions (f and h in Example 1) in order to concentrate on the “main part” of P, the quadratic function g.

Notice that the order in which functions are composed is important. If we put the squaring function and the sine function together in the opposite order, we get $(\sin x)^2$ which is clearly different from $\sin(x^2)$. (For instance, $\sin(x^2)$ has negative values for some x, while $(\sin x)^2$ can never be negative.)

It can sometimes be hard to predict the appearance of the graph of g ∘ f even when you are familiar with the individual graphs of f and g. This is worth being able to do, even in situations when you can just pull out your graphing calculator, for the usual reason that having some sense of what the answer should be is your best insurance against making mistakes. There are, however, some simple principles. First, thinking of the two machine idea, the output of g ∘ f is actually an output of g, since that is the function done last. So values of $\sin(x^2)$, for instance, must all lie in the range −1 ≤ y ≤ 1, since that is true of the values of the sine function. Similarly, values of $(\sin x)^2$ are all squares, so they are non-negative, but can be arbitrarily large. (Or can they? What numbers does the squaring function get as inputs here?)
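The two-machine picture translates directly into code. In this Python sketch the helper `compose` and the names built from it are illustrative assumptions; `compose(g, f)` plays the role of g ∘ f, with f done first.

```python
import math

def compose(g, f):
    """Return the function g o f, which applies f first and then g."""
    return lambda x: g(f(x))

def square(x):
    return x ** 2

sin_of_square = compose(math.sin, square)   # x -> sin(x^2)
square_of_sin = compose(square, math.sin)   # x -> (sin x)^2

# Order matters: at x = 2 the first is sin(4), which is negative,
# while the second is a square and so cannot be negative.
print(sin_of_square(2.0), square_of_sin(2.0))

# P(x) = (1/4)(x - 2)^2 - 1 as a composition of three simpler functions.
shift_down = lambda x: x - 1           # f in the text
quarter_square = lambda x: 0.25 * x ** 2   # g in the text
shift_right = lambda x: x - 2          # h in the text
P = compose(shift_down, compose(quarter_square, shift_right))

print(P(2), P(0), P(4))
```

Reading `compose(shift_down, compose(quarter_square, shift_right))` from right to left gives the order in which the machines run, just as with the symbol f ∘ g ∘ h.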

EXERCISES

1. The function G has the graph in the diagram below. Sketch the graph of each of the following functions. Be sure to label all important points. (Corners, for instance, and the maximum point.)

(a) $G\left(\tfrac{1}{2}x\right)$ (b) $\tfrac{1}{2}G(x)$ (c) G(x − 2) (d) G(x) − 2 (e) G(−x) (f) −G(x)

[Graph of G not reproduced.]

2. The curve $y = 3x^5 + (2x + 3)^6$ is translated two units to the left. What is the equation of the translated curve?

3. Find a possible formula for a function whose graph passes through (0, 0) and (2, 2) and which has y = 4 as a horizontal asymptote as x → ∞.


4. Find a possible formula for an increasing function whose graph passes through (0, 0) and (2, 2), which is defined for all x and which is concave down, but has no horizontal asymptote. See the definition of concave up and down at the beginning of section 4.3. (No line segments allowed.)

5. Find a possible formula for an increasing function whose graph passes through (0, 0), which is defined for all x and which is concave down, but has no horizontal or vertical asymptote.

6. Find a possible formula for the function with this graph:

[Graph not reproduced; labeled point (−2, 3).]

7. Find a different function, say a rational function, that has a graph like that in Example 2. It should pass through the points (0, 5) and (1, 4) and have the line y = 3 as a horizontal asymptote.

8. Graph your solution to #7, the exponential function from Example 2 and the line y = 3 on the same axes. Which function approaches the asymptote more quickly? Is this reasonable? Why?

9. Find a possible formula for the function with this graph, where the dashed line is a horizontal asymptote:

[Graph not reproduced; labeled points (−4, −5) and (−1, −3).]


10. For each given function f, graph f(x) and f(1/x) on the same axes for 0 < x < a where you choose a to be an appropriate value.

(a) $f(x) = x^3$ (b) $f(x) = (x - 1)^3$ (c) $f(x) = e^x$.

11. Here is the graph of a linear function f(x) for 0 ≤ x ≤ 1.

[Graph not reproduced.]

(a) Extend the graph of f to the interval 1 ≤ x ≤ 4 so that f(x) = f(1/x) for all x > 1. Your sketch should be accurate enough so that the values f(2), f(3), and f(4) are correct. Connect each by a dotted horizontal line with the corresponding value for x in [0, 1].

(b) If we continued the graph formed as in (a) for all positive x, what would we see as x → ∞?

12. Sketch the graphs of f(x) = 2(x + 1)(x − 1)(x − 2) and f(1/x) for x < 0. Explain why the line x = −1 functions to the left of the y-axis in the same way that x = 1 functions to the right of the y-axis.

13. Let $f(x) = \frac{1}{1 + x^2}$.

(a) Sketch the graphs of f(x) and f(1/x) on the same axes. Do they have the same domain?

(b) The two graphs in (a) appear to have the same shape except that the second is an upside down copy of the first, that is, one is a shifted and reflected copy of the other. Find an algebraic relationship between the formulas that shows that the graphs have this relationship to one another.

14. Sketch the graphs of f(x) = sin(πx) and f(1/x) = sin(π/x) for x > 0. It is probably best to use different sets of axes, but think about starting with the graph of sin(πx) (where does the line x = 1 cross this graph?) and “reflecting” as in the examples above. The fact that sin(πx) does not approach a limit as x → ∞ means that the graph of sin(π/x) gets rather complicated as x → 0. In fact your calculator will not do a very good job no matter how you adjust the scales. This is an example of a graph that you have to draw in your mind and then impressionistically on your paper. It may help to think about where the graph of sin(π/x) crosses the x-axis.

15. Find a possible formula for each graph.

[Graphs (a) through (e) not reproduced. Surviving axis labels: (a) 2; (b) −8 and 20π; (c) 4 and 18; (d) 4, 18 and −6; (e) 4, 13 and −11.]

16. Use these graphs for the exercises that follow. Don’t try to guess formulas for the graphs. Just proceed graphically.

[Figures: graphs of f, g, and h.]

(a) Sketch the graphs of f and f ∘ f on the same axes.

(b) Sketch the graphs of g and g ∘ g on the same axes.

(c) Sketch the graphs of f ∘ g and g ∘ f on the same axes. Be sure to label which is which!

(d) Sketch the graphs of g ∘ h and h ∘ g on the same axes. Be sure to label which is which!

17. The graphs of g ∘ h and h ∘ g in the previous problem are parallel but distinct lines. Does this always happen? Either show algebraically that the compositions of two linear functions in opposite order are always parallel or find an example where they are not.

1.7 Graphing and Machinery; Scales

One might think that using a graphing calculator like the TI-89 automatically solves all graphing problems; you just plug in the formula defining the function and the calculator does the rest. This is only partly right. A calculator or a computer graphing utility will show you whatever portion of the graph appears in the graphing window you select, but it will not tell you whether that window is the right one, the window that displays all the interesting features of the graph in the most revealing way. This is normal behavior for technology; it will do its best to answer any question that you ask, but if you do not ask the correct question, you may not find out what you really want to know. Thus technology does not replace thought; it just relocates the thought process to thinking about what questions to ask.

What graphing window is most appropriate? It depends on the question, and one of the main parts of your job is to determine an appropriate window for the task you are trying to accomplish. Consider, for instance, the following

EXAMPLE. Show graphically the points of intersection of each of these pairs of functions: $x^2$ and $x^3$, $x^2$ and $10x^3$, $100x^2$ and $x^3$.

[Figures: three graphs. $x^2$ and $x^3$ for −1.4 ≤ x ≤ 1.4, −3 ≤ y ≤ 3; $x^2$ and $10x^3$ for −0.15 ≤ x ≤ 0.15, −0.03 ≤ y ≤ 0.03; $100x^2$ and $x^3$ for −140 ≤ x ≤ 140, $-3 \times 10^6 \le y \le 3 \times 10^6$.]

Notice that although the formulas for the functions don’t look that different, and the three pictures look exactly the same, the scales are very different. There is a factor of 1000 difference between the horizontal scales of the second and third graphs, and a factor of $10^8 = 100{,}000{,}000$ difference in the vertical scales.

If you are tempted to think that such large factors are just mathematician’s inventions, and that in real life numbers are a more reasonable size, recall that the national debt is well over $\$10^{12}$. And for even a bigger contrast, the mass of the sun is about $2 \times 10^{30}$ kg; the mass of an electron is about $9.1 \times 10^{-31}$ kg. Thus the mass of the sun is equal to the mass of about $2 \times 10^{60}$ electrons. Another small quantity currently in the news is the attosecond ($10^{-18}$ seconds). In 3 attoseconds a beam of light travels from one side of a water molecule to the other. An article in the March 27, 2010 Science News describes how attosecond scale X-ray pulses are being used to “freeze” the location of an electron, a capability that it is hoped will improve human understanding of chemical reactions.

Just to make life even more complicated, some functions exhibit different behaviors on different scales, so that even for a single function which scale to use depends on what you are trying to do.
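For the three pairs in the example above, the nonzero crossing point of $ax^2$ and $bx^3$ is x = a/b, and a window can be sized around it. The Python sketch below does this; the function names and the rule of thumb (a margin of 1.4 past the crossing) are illustrative assumptions, chosen only because they roughly reproduce the windows used in the three graphs.

```python
def crossing(a, b):
    """Nonzero solution of a*x^2 = b*x^3, namely x = a/b (assumes a, b > 0)."""
    return a / b

def window(a, b, margin=1.4):
    """Hypothetical rule of thumb: go a bit past the crossing point in x,
    and use the cubic's height at the window edge for y."""
    x_max = margin * crossing(a, b)
    y_max = b * x_max ** 3
    return x_max, y_max

pairs = {
    "x^2 and x^3": (1, 1),
    "x^2 and 10x^3": (1, 10),
    "100x^2 and x^3": (100, 1),
}

for name, (a, b) in pairs.items():
    x_max, y_max = window(a, b)
    print(f"{name}: cross at x = {crossing(a, b):g}, |x| <= {x_max:g}, |y| <= {y_max:g}")

# Scale factors between the second and third windows.
x_factor = window(100, 1)[0] / window(1, 10)[0]
y_factor = window(100, 1)[1] / window(1, 10)[1]
print(x_factor, y_factor)
```

The two scale factors come out near 1000 and $10^8$, matching the comparison of the second and third graphs made in the text.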

EXAMPLE. Graph $y = 15x^4 - 142x^3 + 21x^2 + 24x - 10$, showing the essential features of its behavior.

What are the essential features? Well, if we choose a horizontal scale of −20 ≤ x ≤ 20, we see

[Figure: graph of $15x^4 - 142x^3 + 21x^2 + 24x - 10$ for −20 ≤ x ≤ 20; the vertical scale runs to about $3 \times 10^6$.]

showing that for large negative and positive x, the function increases rapidly. At this scale, though, you really can’t tell what’s happening for 0 ≤ x ≤ 10.

If we choose a smaller scale of −5 ≤ x ≤ 10, we see more detail of the behavior for smaller x;

[Figure: graph of $15x^4 - 142x^3 + 21x^2 + 24x - 10$ for −5 ≤ x ≤ 10; the vertical scale runs from about −10000 to 20000.]

in particular we see that this function does have some negative values between x = 0 and x = 10 with a minimum near x = 7. But we still can’t really see what’s happening near the origin.

To understand the behavior near x = 0 we must choose a much smaller scale yet. Here is the graph for −0.8 ≤ x ≤ 0.8.

[Figure: graph of $15x^4 - 142x^3 + 21x^2 + 24x - 10$ for −0.8 ≤ x ≤ 0.8; the vertical scale runs from about −40 to 60.]

Now we can see that there is a local minimum near x = −0.2 and a local maximum near x = 0.3.

Could we capture all of this on a single graph? Not really. The difference in y values between the local minimum and local maximum near x = 0 is about 10. But the y-value at x = 7, near the other minimum, is −11504, and the vertical scale in the first graph that showed the behavior for larger x was in the millions. So there is already a factor of 1000 difference between the vertical scales of the second graph and the third one, even if we don’t worry about the first graph. There is no hope of showing both phenomena on the same graph. (Unless, of course, we draw the graph by hand impressionistically and “cheat” in order to make everything show up. This is often useful; sometimes hand drawn graphs are better than machine drawn ones.)
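The mismatch of scales can be seen directly by evaluating the polynomial at the features named above. This is a small Python sketch; the function name `p` and the variable names are illustrative assumptions, and the x-values −0.2, 0.3, and 7 are the approximate locations read from the graphs, not exact critical points.

```python
def p(x):
    """The quartic from the example: 15x^4 - 142x^3 + 21x^2 + 24x - 10."""
    return 15 * x**4 - 142 * x**3 + 21 * x**2 + 24 * x - 10

# Heights at the features named in the text, plus the edges of the widest window.
for x in (-20, -0.2, 0.3, 7, 20):
    print(x, p(x))

near_origin_swing = p(0.3) - p(-0.2)   # rise from the local minimum to the local maximum
deep_minimum = p(7)                    # depth near the minimum around x = 7
edge_height = p(-20)                   # height at the left edge of the first window

# Ratios of the vertical scales needed to see each feature.
print(abs(deep_minimum) / near_origin_swing, edge_height / abs(deep_minimum))
```

The swing near the origin is on the order of 10, the minimum near x = 7 is on the order of 10,000 deep, and the edge of the widest window is in the millions, so no single vertical scale shows all three.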

EXERCISES.

1. (a) Determine an appropriate $y$ range so that the functions $x^3$, $x^4$, $x^5$ can easily be distinguished on the interval $-0.1 \le x \le 0.1$. Sketch the result.

(b) Determine an appropriate $y$ range so that the functions $x^3$, $x^4$, $x^5$ can easily be distinguished on the interval $-100 \le x \le 100$. Sketch the result.

In each case be sure to indicate which function is which. Is the same function on top in both cases?

2. Graph $y = x^5 - 10x^3 + x^2 + 1$, showing the essential features of its behavior. In particular, be sure to show all changes of direction.

3. Graph each of these functions, showing the essential features of its behavior.

(a) $y = |x - 2| + |x + 3|$

(b) Repeat (a) for $|x - 20| + |x + 30|$. Your scale on the $x$-axis should be chosen intelligently and labelled clearly.

(c) Repeat (a) for $|x - 0.2| + |x + 0.3|$.


(d) If you use equal $x$ and $y$ scales for each of these three functions, in what way or ways do the graphs have the same shape? In what way or ways are they different?

4. Let $P(x)$ be the number of Americans whose height is less than or equal to $x$ inches. Here $x$ is a positive real number (not just an integer) and since the total current US population is about 310 million, it is not too unreasonable (at least in thinking how a graph will look) to regard $P(x)$ as a real number also instead of an integer. Choose and label reasonable scales and sketch a rough graph of $P(x)$. Don’t bother to look up additional demographic data; just make a reasonable guess about the shape of the graph.

5. Solve |2x+ 2|− |x− 1| = 7. Sketch the graphs first!

6. Graph y = −1 + 3√1− x, showing the essential features of its behavior.

Discuss.

7. Graph $y = \dfrac{x^5 - x^4 - 5x + 5}{x^2 - 100}$, showing the essential features of its behavior. Discuss.

8. Find the minimum value of $2^x - x^8$ for $x > 0$ graphically.

9. How many times does the graph of $y = 2^x - x^8$ change direction for $x > 0$? (Do this graphically, don’t try to use calculus; it wouldn’t work out neatly anyway.) Can you show all the direction changes on a single calculator or computer drawn graph?

10. Determine the viewing window that will produce each of these graphs of the two functions $x^4$ and $3^x$. Indicate which function is which for each diagram.

(a) (b) (c) [Figures: three views of the graphs of $x^4$ and $3^x$.]

(d) Do you think that there is any single viewing window that will show all of the essential features of the relationship between these two graphs? Explain.


1.8 Inverse Functions, Logarithms, Roots

1.8.1 Inverse Functions

In the machine interpretation of a function, its inverse function is represented by the machine working in reverse: the inverse undoes the action of the original function. The inverse to putting a sock on would be taking the sock off. The inverse of the cubing function is the cube root function. In the mapping diagram interpretation of functions, the diagram for the inverse function is obtained by just reversing the positions of the two vertical lines, so that the diagram for the inverse function is the mirror image of the diagram for the original function.

[Figures: graphs of $x^3$ and $x^{1/3}$, followed by the mapping diagrams for $x^3$ and $x^{1/3}$.]

Thus the inverse function just reverses the roles of the domain and range of the original function, provided that the result is still a function. This proviso is the source of most of the fuss concerning inverse functions. In order for the inverse to be a function, each element of the domain of the inverse (the range of the original function) must correspond to just one element in the range of the inverse (the domain of the original function). Restated in terms of the original function, for its inverse to be a function, no element of its range can correspond to more than one element of its domain, that is, different elements of its domain must be mapped by the function to different elements of its range. Such a function is said to be one-to-one (or 1-1 for short).


[Figures: graphs and mapping diagrams of a 1-1 function ($x^3$) and of a function that is not 1-1 ($x^2$).]

In terms of ordered pairs, reversing the roles of domain and range corresponds to reversing first and second components. The graph of the cubing function includes the points $(-2, -8)$, $\left(-\tfrac12, -\tfrac18\right)$, $\left(\tfrac12, \tfrac18\right)$, $(3, 27)$, ... while the graph of the cube root function includes the points $(-8, -2)$, $\left(-\tfrac18, -\tfrac12\right)$, $\left(\tfrac18, \tfrac12\right)$, $(27, 3)$, .... In the plane, the points $(x, y)$ and $(y, x)$ are the mirror images of one another across the line $y = x$. (How would you show that?) Thus the graphs of a function and of its inverse function (if it has one) are reflections of one another across $y = x$ as in the following diagram showing a small piece of the graphs of $x^3$ and $x^{1/3}$.


[Figure: $x^3$ and $x^{1/3}$ for $-2 \le x \le 2$, drawn with equal scales on both axes.]
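The ordered-pair description of the inverse can be tried out directly. The sketch below is in Python, which is an assumption of this illustration rather than a tool used in these notes; the names `cube`, `cube_pairs` and `cube_root_pairs` are hypothetical, and exact fractions are used so that the pairs match the ones listed for the cubing function.

```python
from fractions import Fraction

def cube(x):
    return x**3

# Sample inputs matching the points quoted for the cubing function.
xs = [Fraction(-2), Fraction(-1, 2), Fraction(1, 2), Fraction(3)]
cube_pairs = [(x, cube(x)) for x in xs]

# Reversing first and second components gives points on the graph
# of the inverse, the cube root function.
cube_root_pairs = [(y, x) for (x, y) in cube_pairs]

for pair in cube_root_pairs:
    print(pair)
```

The swap only produces a function because no second component is repeated among the pairs, which is the one-to-one condition in miniature.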

Keep in mind, though, that this picture is dependent on using the same scales on the $x$ and $y$ axes. If the values of $x$ and $f(x)$ are not essentially the same size so that the scales are different, as in the graphs at the beginning of this section, then the graph of the line $y = x$ is not where you would expect to find it. Here is the graph from the beginning of the section of $x^3$ for $-5 \le x \le 5$ with $x^{1/3}$ and $x$ added.

[Figure: $x^3$ and $x^{1/3}$ for $-5 \le x \le 5$; a label marks where the graphs of $x$ and $x^{1/3}$ appear at this vertical scale.]

A function is one-to-one if and only if its graph intersects each horizontal line at most once; this is the horizontal line test. The graph of a function passes this test if and only if its inverse passes the vertical line test, that is, the graph of the inverse meets each vertical line at most once. The vertical line test is just the graphical version of the requirement for a function that each element of its domain correspond to only one element of the range.

The cubing function is one-to-one because different numbers have different cubes; the squaring function is not because any non-zero number and its negative have the same square. One might say the squaring function is two-to-one, except for zero. (See the diagram above.)

The squaring function is more typical: most functions do not have inverses, at least not when defined over their entire domain. For a function whose graph is a continuous curve, it is easy to believe that it can be one-to-one only if it is monotonic, that is, increasing or decreasing, since as soon as the graph changes direction, the function starts to repeat values. For the squaring function, for instance, this happens around $x = 0$ where the function changes from decreasing to increasing.

As you know, we try to rescue the inverse idea for many functions by restricting their domain to an interval in which they are monotonic. For the squaring function there are two possibilities: restricting the domain to $[0, \infty) = \{x : x \ge 0\}$ or to $(-\infty, 0] = \{x : x \le 0\}$. The respective inverse functions are of course the positive square root function and the negative square root function. In each case we get a function with domain $[0, \infty)$ (the range of the squaring function), but the ranges are completely separate from one another except for 0.

For the trig functions sine and cosine the situation is somewhat more complicated.

[Figure: graph of $\sin x$ for $-10 \le x \le 10$.]

For each one there are infinitely many intervals of length $\pi$ on which the function is monotonic, so there are infinitely many possible inverse functions, each with domain $[-1, 1]$, the range of sine and cosine. It is customary to use the restricted domains $\left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right]$ for sine and $[0, \pi]$ for cosine to define the named inverses arcsin and arccos, but the other possibilities cannot be completely ignored: if you want to know the angle in the second quadrant whose sine is .5, this is really a question about the inverse of the restriction of sine to $\left[\tfrac{\pi}{2}, \tfrac{3\pi}{2}\right]$. You cannot get the answer from the arcsin button on your calculator; it will tell you $\arcsin(.5) = \dfrac{\pi}{6}$ (or the decimal equivalent) when what you need is $\dfrac{5\pi}{6}$, the corresponding angle in the second quadrant. On the TI-89 you can do better with Solve: solve(sin x = 1/2, x) produces a complete list of solutions $x = 2k\pi + \dfrac{5\pi}{6}$ or $x = 2k\pi + \dfrac{\pi}{6}$, though you still have to pick out the right one.

[Figures: two views of the solution $x = 5\pi/6$ of $\sin x = .5$.]
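The arcsin button's behavior is easy to reproduce in software. The following Python sketch is an illustration only (the notes themselves work with the TI-89); it relies on the standard identity $\sin(\pi - \theta) = \sin\theta$ to reach the second-quadrant angle, and the helper name `sine_solutions` is hypothetical.

```python
import math

# What an arcsin button returns: the angle in [-pi/2, pi/2], here pi/6.
principal = math.asin(0.5)

# The second-quadrant angle with the same sine, using sin(pi - t) = sin(t).
second_quadrant = math.pi - principal

def sine_solutions(k):
    """The two solutions of sin x = 1/2 in the period indexed by k."""
    return (2 * k * math.pi + math.pi / 6,
            2 * k * math.pi + 5 * math.pi / 6)

print(principal, second_quadrant)
print(sine_solutions(0))
```

As with the calculator's Solve output, the code lists candidates; choosing the one in the quadrant you care about is still up to you.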

EXAMPLE AND COMMENT: What is the domain of $\arcsin(\sin x)$? What is the graph of this function? The first question is ambiguous, and the answer to the second question changes significantly depending on how the first question is interpreted.

In view of what I have just been saying about a function and its inverse swapping domains and ranges, and about the arcsine function being the inverse of a restriction of the sine function, one interpretation would be that the $\sin x$ in $\arcsin(\sin x)$ refers to this restricted function. In that case the domain of $\arcsin(\sin x)$ is the domain of the restricted $\sin x$, that is $-\tfrac{\pi}{2} \le x \le \tfrac{\pi}{2}$. Then for each such $x$, $\arcsin(\sin x)$ by definition maps $\sin x$ back to $x$. In this interpretation, the graph of $\arcsin(\sin x)$ is just a segment of the line $y = x$ as you would expect.

But in calculus it is normal to take the domain of any function defined by a formula to be the set of all $x$ for which the formula makes sense. The formula $\arcsin(\sin x)$ actually makes sense for all real numbers $x$, so the customary answer to the first question would be that the domain of $\arcsin(\sin x)$ is all real numbers. But then the graph does not lie entirely on the line $y = x$. (See problem 4 below.)

To sum up, the ambiguity here is that we are using the symbol $\sin x$ to mean two different things: a function defined on the entire real line, and also the restriction of that function to the interval $-\tfrac{\pi}{2} \le x \le \tfrac{\pi}{2}$. Strictly speaking, these are different functions (remember that a function is “really” a set of ordered pairs and the first version of $\sin x$ contains many more ordered pairs than the second) and so they should have different names. In more advanced areas of mathematics where the domain of a function plays a more crucial role, it is indeed the practice to use different names for the same rule applied to different domains. In calculus it is customary to let the notation be ambiguous and to leave it to the reader to supply the correct interpretation based on the context. As this example demonstrates, this is not always quite obvious.
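A few sample values show the two interpretations parting company. This is a minimal Python sketch, assuming the library `asin` plays the role of the calculator's arcsin (it returns values in $[-\pi/2, \pi/2]$); the function name `arcsin_of_sin` and the sample points are chosen only for illustration.

```python
import math

def arcsin_of_sin(x):
    """arcsin(sin x), with sin taken on the whole real line."""
    return math.asin(math.sin(x))

# Inside [-pi/2, pi/2] the composition returns x itself;
# outside that interval it returns some other number.
for x in (-1.0, 0.5, 1.5, 2.0, 3.0, 4.0):
    print(f"x = {x:4.1f}   arcsin(sin x) = {arcsin_of_sin(x): .4f}")
```

Whatever the input, the output stays between $-\pi/2$ and $\pi/2$, which is why the graph cannot follow the line $y = x$ forever.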


1.8.2 Units

In applied problems where there are different units associated with the domain and range of a function, you need to be sure to remember to reverse the units when considering the inverse function. This can be quite helpful in thinking about the meaning of an inverse function and the meaning of its derivative. For instance, if $P(t)$ is the population of Bellingham in year $t$ (an increasing function, at least for quite a while), then $P$ is a function that you feed a time into and get a population out of, e.g. $P(2009) = 80{,}000$. Its inverse, $P^{-1}$, is a function that you feed populations into and get times out of, e.g. $P^{-1}(80{,}000) = 2009$. The derivative of $P$ is the rate of increase of population per unit of time: if $P'(2009) = 1500$, then Bellingham is currently gaining fifteen hundred people per year. The derivative of $P^{-1}$ would be the rate that time increases per unit of population growth: $1/1500$ of a year per person in 2009, that is, $\left(P^{-1}\right)'(80{,}000) = 1/1500$, which means that Bellingham is gaining about one new person every fourth of a day, or four people per day.
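The unit conversion at the end of this paragraph can be spelled out step by step. The Python sketch below is only an illustration; the variable names are hypothetical, and a 365-day year is assumed.

```python
DAYS_PER_YEAR = 365          # assumption: ignore leap years

dP_dt = 1500                 # people per year, P'(2009) from the text
dt_dP = 1 / dP_dt            # years per person, the derivative of the inverse

days_per_person = dt_dP * DAYS_PER_YEAR
people_per_day = 1 / days_per_person

print(f"{days_per_person:.3f} days per new person")
print(f"{people_per_day:.2f} new people per day")
```

Reversing the function reverses the units: people per year becomes years per person, and the two rates are reciprocals of one another.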

1.8.3 Logarithms

We know that exponential functions are always monotonic: in particular, $a^x$ is increasing if $a > 1$ and decreasing if $0 < a < 1$. In each case the domain of $a^x$ is $\mathbb{R}$, the set of all real numbers, and the range is $(0, \infty)$, the set of all positive numbers. Thus all of these functions have inverses with domain $(0, \infty)$ and range $\mathbb{R}$. The inverse of $a^x$ is, of course, the logarithm to the base $a$, denoted $\log_a x$. For example,

$$5^3 = 125 \quad\text{and}\quad \log_5 125 = 3$$

are inverse statements of the same relationship, corresponding to the points $(3, 125)$ and $(125, 3)$ on the respective graphs. Since $a^0 = 1$ for all $a > 0$, $\log_a 1 = 0$ for all $a > 0$ with $a \ne 1$. Graphically, since all exponential functions $a^x$ pass through the point $(0, 1)$, all logarithm functions $\log_a x$ pass through $(1, 0)$.

[Figure: graphs of $10^x$ and $\log_{10} x$.]


In other words, the values of a logarithm function are exponents where the base of the exponential function is the base of the logarithm function: 5 in the example just above. In principle this is all there is to know about logarithms, and all the properties of logarithms are just properties of exponents written in a different notation. For instance,

$$\log_a xy = \log_a x + \log_a y$$

is just a restatement of the fact that you multiply exponentials with a common base by adding the exponents. (Adding logarithms = adding exponents = multiplying the numbers.) $2^3 \cdot 2^5 = 2^{3+5} = 2^8 = 256$ is just the same statement as $\log_2 256 = \log_2 2^8 = \log_2 2^3 + \log_2 2^5 = \log_2 8 + \log_2 32$.

One difference in the use of exponential and logarithm functions is that while it can be very convenient to use whichever exponential function is most suited to a given problem, it is unusual to use any logarithm function except base 10 (common logarithms) or base $e$ (natural logarithms). This is partly because logarithms tend to be used as a manipulative device to simplify equations that have already been written down, while we have seen that a good choice of exponential function can greatly simplify writing the equation down in the first place. As a manipulative device it mostly doesn’t matter which base is used for logarithms, and the fact that calculators are generally set up only for common and natural logs makes these more convenient than logs to other bases.

1.8.4 Using Logarithms

You are already familiar with using logarithms to manipulate equations where the unknown quantity is in an exponent. As a first example, here is a symbolic solution of the equation $2^{t/3} = 10$ that came up in Example 3 in the Exponential section.

Example 1. To solve $2^{t/3} = 10$ symbolically, take the logarithm of both sides of the equation to obtain

$$\frac{t}{3}\log 2 = \log 2^{t/3} = \log 10$$

or

$$t = \frac{3\log 10}{\log 2} = \frac{3}{\log 2} \approx 9.9657842847.$$

Note that the answer is the same as the graphical solution to 10 decimal places. (Actually 10 decimal places of accuracy aren’t particularly appropriate here, but I wanted to demonstrate that the graphical solution can be as accurate as the symbolic one.)


In the calculation above I have used logarithms to the base 10. (In this course we will use the usual abbreviations $\log x$ for $\log_{10} x$ and $\ln x$ for $\log_e x$.) We could do the algebra with logarithms to any positive base $b$, but most calculators are set up so that only base 10 and base $e$ logarithms are easy to compute. If we had used natural logs, both the calculation and the result would have been the same:

$$t = \frac{3\ln 10}{\ln 2} \approx 9.9657842847.$$
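The claim that the base of the logarithm does not affect the answer can be confirmed numerically. This is a short Python sketch, offered as an illustration rather than as part of the notes' calculator workflow; the variable names are hypothetical.

```python
import math

# Solve 2**(t/3) = 10 with common logs and with natural logs.
t_common = 3 * math.log10(10) / math.log10(2)
t_natural = 3 * math.log(10) / math.log(2)

print(t_common)
print(t_natural)

# Substituting back should recover 10 (up to rounding).
print(2 ** (t_common / 3))
```

The two values agree to within floating-point rounding, as the algebra predicts.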

This is the symbolic version that the Solve command on the TI-89 will give you.

Here is a second example.

Example 2. Suppose that 10 grams of Plutonium-239 were released in the Chernobyl nuclear accident. Take the half-life of Pu-239 to be 24,360 years. How long will it be until only one gram of this Pu-239 remains?

Solution. The amount remaining $t$ years after the accident is $10 \cdot \left(\tfrac12\right)^{t/24360}$. We must solve the equation $10 \cdot \left(\tfrac12\right)^{t/24360} = 1$, or $\left(\tfrac12\right)^{t/24360} = 0.1$. To estimate the solution first, note $\tfrac18 = \left(\tfrac12\right)^3$ is more than 0.1, and $\tfrac{1}{16} = \left(\tfrac12\right)^4$ is less than 0.1. Thus the answer is between 3 and 4 half-lives, or $73{,}080 \le t \le 97{,}440$.

To solve the equation symbolically, take logs on both sides to get

$$\frac{t}{24360}\log\frac12 = \log 0.1 = -1.$$

Then, using $\log\tfrac12 = -\log 2$,

$$t = \frac{24360\,(-1)}{-\log 2} = \frac{24360}{\log 2} \approx 80{,}922.17 \text{ years.}$$

Of course we can also get this result graphically. Graphing y1 = .5^(x/24360) and y2 = 0.1 with a viewing rectangle of $70{,}000 \le x \le 100{,}000$ (using the rough estimate), $.05 \le y \le .15$ gives a picture something like this:

[Figure: graphs of y1 = .5^(x/24360) and y2 = 0.1 for $70{,}000 \le x \le 100{,}000$.]

Using Intersection gives $t \approx 80{,}922.17$ years.
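Both the rough estimate and the symbolic answer of Example 2 can be checked in a few lines. The following Python sketch is an illustration under the example's own numbers (10 grams, half-life 24,360 years); the names `HALF_LIFE`, `remaining` and `t_one_gram` are hypothetical.

```python
import math

HALF_LIFE = 24360  # years, the value used in Example 2

def remaining(t, initial=10.0):
    """Grams left after t years, starting from `initial` grams."""
    return initial * 0.5 ** (t / HALF_LIFE)

# Symbolic solution of 10 * (1/2)**(t/24360) = 1, using common logs.
t_one_gram = HALF_LIFE / math.log10(2)

print(f"t = {t_one_gram:,.2f} years")
print("between 3 and 4 half-lives:", 3 * HALF_LIFE <= t_one_gram <= 4 * HALF_LIFE)
print("amount left at that time:", remaining(t_one_gram))
```

The rough bracket of 3 to 4 half-lives is a useful sanity check on the more precise figure.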


1.8.5 How Many Exponential Functions Are There?

I have defined an exponential function to be anything of the form

$$c\,a^{kx}$$

where $a > 0$ and $c$ and $k$ are any real numbers. Many calculus texts say that an exponential function is something of the more restricted form $c\,a^x$. Some others say that an exponential function is something of the form $c\,e^{kx}$, that is, the base has to be the number $e$. Is there any difference in the collection of functions described by these families, or is it just a difference in naming?

It turns out to be just a difference in naming. In fact, exponential functions can be written in many different forms, with all three of the constants $c$, $a$, $k$ interacting. For instance, $5 \cdot 2^{3x} = 5 \cdot (2^3)^x = 5 \cdot 8^x$. We could change the base to $e$ by noting

$$8^x = e^{\ln(8^x)} = e^{x\ln 8} \quad\text{or}\quad 2^{3x} = e^{3x\ln 2}.$$

Thus the combinations $a = 2$, $k = 3$ or $a = 8$, $k = 1$ or $a = e$, $k = 3\ln 2$ all describe the same function. We could even bring the constant 5 into the exponent by writing $5 = e^{\ln 5}$, so

$$5 \cdot 2^{3x} = e^{\ln 5 + 3x\ln 2}.$$

There are two points to keep in mind:

(1) any exponential function can be written in many different forms: you can choose the base to be any positive number except 1, and then determine the appropriate value of the constant $k$ in the exponent, and can even accommodate a multiplicative constant into the exponential as an additive constant in the exponent. If you can recognise the most convenient form in a given situation, it can save you a lot of effort.

(2) some skill in translating between different exponential forms is actually useful: when talking about calculus (though not otherwise), $e$ is nearly always the most convenient base. The best way to remember how to differentiate $8^x$ is to remember how to convert it to $e^{x\ln 8}$ and then remember how to differentiate that.
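That the different-looking forms really are one function can be spot-checked numerically. The Python sketch below is an illustration only; the four function names are hypothetical labels for the four ways of writing $5 \cdot 2^{3x}$ discussed in this section.

```python
import math

def form_base2(x):
    return 5 * 2 ** (3 * x)                          # a = 2, k = 3

def form_base8(x):
    return 5 * 8 ** x                                # a = 8, k = 1

def form_base_e(x):
    return 5 * math.exp(3 * x * math.log(2))         # a = e, k = 3 ln 2

def form_all_in_exponent(x):
    return math.exp(math.log(5) + 3 * x * math.log(2))  # constant 5 moved into the exponent

for x in (-1.5, 0.0, 0.3, 2.0):
    print(x, form_base2(x), form_base8(x), form_base_e(x), form_all_in_exponent(x))
```

Each row of output should show the same value four times, up to floating-point rounding.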

1.8.6 Accuracy, Exactness, and Calculator Technique

When changing the base of an exponential function, it is almost always best to restrain your impulse to replace constants like $\ln 2$ by decimal approximations, because the decimal will introduce unnecessary errors. These can be larger than you might expect. For instance, $2^{10} = 1024$, and $\ln 2 \approx .69$ to two decimal places, but if we write $2^x = e^{x\ln 2} \approx e^{.69x}$ and then plug in $x = 10$, we find

$$1024 = 2^{10} \approx e^{6.9} \approx 992.27.$$


Thus the substitution of .69 at this point for $\ln 2$ in the design of almost any object would probably cause the object to fail to perform correctly.

The moral here is that “approximately equal” is a much more complicated concept than “equal”. If we do the same thing to two equal quantities, we will get the same result. As we have just seen, if we do the same thing to two quantities that are merely close to equal, such as $\ln 2$ and .69, the results may be more different than we are comfortable with. Numerical analysis is a difficult subject precisely because computations on a calculator or computer nearly always involve some error, and unless great care is taken the cumulative effects of even very small errors may be unacceptably large.

This is one big reason why mathematicians like exact answers when they can get them, and in any case like to distinguish between “exactly” and “nearly”. The situation in science is very different. Few numbers are exact, and the distinction between “equal” and “approximately equal” is often blurred. In particular, this is why math teachers (like me) will try to get you to give answers like $\pi$ or $\frac{\sqrt{3} - 1}{2}$ rather than whatever decimal equivalent you happen to come up with. It is not that approximation is necessarily evil, just that it is irreversible, and most mathematicians believe that exactness should not be surrendered without a good reason. In these notes I will normally indicate “approximately equal” with the symbol ≈ and save = for true equality.

To bring this discussion down to the mundane level of good calculator technique, consider the problem of finding $x$ so that $2^x = 3$. A little manipulation yields that $x = \ln 3/\ln 2$. If we actually want a decimal approximation, bad technique would be to compute $\ln 3 \approx 1.10$, $\ln 2 \approx .69$ and so $x \approx 1.10/.69 \approx 1.5942$. On the other hand, if instead of approximating $\ln 2$ and $\ln 3$ separately and then dividing, I just enter $\ln 3/\ln 2$ into my calculator and evaluate, I get 1.58496. Thus, in addition to involving more steps, approximating the individual natural logs to two decimal places has yielded an approximation that is not correct to two decimal places.

In general you should always enter all of a complex expression into a calculator at once and evaluate in a single step rather than breaking it into pieces to evaluate separately. This will be more accurate, faster, and less subject to silly copying errors.
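The two rounding experiments from this section are easy to replay. The following Python sketch is an illustration, with hypothetical variable names; it contrasts values computed from two-decimal approximations with values computed in a single step.

```python
import math

# Experiment 1: replacing ln 2 by .69 before exponentiating.
exact_power = 2 ** 10
rounded_power = math.exp(0.69 * 10)

# Experiment 2: solving 2**x = 3 from rounded logs versus in one step.
x_rounded = 1.10 / 0.69
x_single_step = math.log(3) / math.log(2)

print(exact_power, round(rounded_power, 2))
print(round(x_rounded, 4), round(x_single_step, 5))
```

In both cases an input error in the third decimal place grows into an output error that shows up in the second decimal place or earlier.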

1.8.7 How Many Logarithm Functions Are There?

Another reason that we don’t use many different logarithm functions is that they are all constant multiples of one another. To see this, consider, say, $\ln x$ and $\log x$, that is, logs to the respective bases $e$ and 10. For any given $x$, let $y = \ln x$ and $z = \log x$. Then interpreting $y$ and $z$ as exponents,

$$x = e^y = 10^z.$$


Solving $e^y = 10^z$ for $y$, $y = \ln(e^y) = \ln(10^z) = z\ln 10$, that is, $\ln x = (\ln 10)\log x$. The two functions $\ln x$ and $\log x$ are just constant multiples of one another, with the factor of proportionality being $\ln 10$.

We can compare any other pair of log functions in the same way. For instance if $y = \log_2 x$ and $z = \log_3 x$, then

$$x = 2^y = 3^z \tag{1.4}$$

and solving $2^y = 3^z$ by taking logs of both sides (say logs to the base 10),

$$y\log 2 = z\log 3 \quad\text{or}\quad y = \frac{\log 3}{\log 2}\,z.$$

Thus again the log functions are proportional: $\log_2 x = \dfrac{\log 3}{\log 2}\log_3 x$.

It is clear that this process can be repeated for any other pair of log functions, and you should be able to guess the constant of proportionality linking $\log_a x$ and $\log_b x$ for any numbers $a$ and $b$ greater than 1. This constant is not worth remembering, but the ability to manipulate logarithms well enough to be able to reproduce the calculation above if you ever need the result is definitely worth cultivating.
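The two proportionality statements derived above can be spot-checked at a few sample points. This Python sketch is an illustration only and checks just the two specific cases worked in the text; the function name `check_proportionality` is hypothetical.

```python
import math

def check_proportionality(x):
    """Return True if both identities from the text hold at x (up to rounding)."""
    # ln x = (ln 10) * log x
    ln_vs_log = math.isclose(math.log(x), math.log(10) * math.log10(x))
    # log_2 x = (log 3 / log 2) * log_3 x
    ratio = math.log10(3) / math.log10(2)
    log2_vs_log3 = math.isclose(math.log2(x), ratio * math.log(x, 3))
    return ln_vs_log and log2_vs_log3

for x in (0.5, 2, 7, 1000):
    print(x, check_proportionality(x))
```

A numerical check at a handful of points is not a proof, but it is a quick way to catch a constant written upside down.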

EXERCISES.

1. For each part, decide whether the function $f$ is invertible. Explain briefly.

(a) $f(t)$ is the number of gallons of fuel your car has used $t$ minutes after leaving Bellingham to drive to Mt. Baker.

(b) $f(x)$ is the volume in liters of $x$ kilograms of water at 4°C.

(c) $f(z)$ is the cost of mailing a letter that weighs $z$ ounces.

2. Let $f(x)$ be equal to the Fahrenheit temperature when the column of mercury in a particular thermometer is $x$ inches long. What does $f^{-1}(75)$ represent in terms of this description?

3. A kilogram weighs about 2.205 pounds.

(a) Write a formula for the function, $f$, which gives an object’s mass, $k$, in kilograms as a function of its weight, $p$, in pounds. (Check with a simple example to be sure you’ve got it the right way around!)

(b) Find a formula for the inverse function of $f$. What does this function tell you in practical terms?


4. What is the graph of $\arcsin(\sin x)$ as a function defined on the entire real line? If you use the TI-89, try to predict what you will see before punching the buttons. (What will the range of this function be, for instance?) Sketch the graph and explain what you see.

5. What is the largest interval containing $x = 3$ on which $|\sin x|$ is invertible? What is the domain of the inverse? The range of the inverse? Sketch the graph of the inverse on its domain. What is the value of the inverse function at $x = .5$?

6. The hyperbolic cosine function, $\cosh x$, is defined by $\cosh x = \dfrac{e^x + e^{-x}}{2}$. What is the largest interval containing $x = 1$ on which $\cosh x$ is invertible? What is the domain of the inverse? Sketch the graph of the inverse on its domain. What is the value of the inverse function at $x = 2$? (Suggestion: don’t try to find a formula for the inverse function. Do this graphically and/or numerically.)

7. Let $f$ be an invertible function on an interval $a \le x \le b$ with inverse function $g$.

(a) Explain, using a diagram, why if the graph of $f$ touches or crosses the line $y = x$ at $x = c$, then the graphs of $f$ and its inverse $g$ intersect at $x = c$.

(b) Show by making up a specific example (or at least a picture of one) that the converse is not true: it is possible to have $f(c) = g(c)$ without the point of intersection lying on the line $y = x$.

(c) Is it still possible if $f$ is an increasing function?

8. We’ll use the properties of $\arctan x$ and $|x|$ to study the composition $\arctan\left(\dfrac{1}{|x|}\right)$.

(a) What are the domain and range of $\arctan x$? What function, with what domain, is it the inverse of? (It is not enough to just say “$\tan x$”.)

(b) What is the domain of $\arctan\left(\dfrac{1}{|x|}\right)$? Is $x = 0$ in the domain? Sketch the graph of this function.

(c) Does $\arctan\left(\dfrac{1}{|x|}\right)$ have a vertical asymptote at $x = 0$? (You will have to think carefully about what a vertical asymptote is, and look very carefully at the graph near $x = 0$. It might help to make a table of values for values of $x$ approaching 0, say $x = \pm.1, \pm.01, \pm.001, \ldots$) If the $y$-axis is not a vertical asymptote, then describe the behavior of $\arctan\left(\dfrac{1}{|x|}\right)$ near $x = 0$ as well as you can. For instance, what about the slope of the graph near $x = 0$?

9. The rate at which a certain chemical dissolves in water is proportional to the amount still undissolved. (This is just like radioactive decay.) If 8 grams of the substance are placed in water and 3 grams dissolve in 5 minutes, when will the chemical be 99% dissolved? (Don’t use $e$! Use a base more natural for this problem.)

10. Forty percent of a radioactive substance disappears in 100 years. What is its half-life? (Again $e$ is the wrong base to use.)

11. Carbon-14 dating of organic objects is based on the fact that the ratio of radioactive $^{14}$C to normal $^{12}$C in a living organism is equal to the atmospheric concentration of $^{14}$C and thus is approximately constant over time. When an organism dies and no longer replaces its $^{14}$C, the concentration of $^{14}$C decreases according to its half-life of 5730 years. A parchment fragment is found to contain 74% as much $^{14}$C as living plant material. Estimate its age. Think about how many significant figures make sense for your answer.

12. Lois Lane is now a scientist. She needs one gram of kryptonite for her next experiment. She has 15 grams of kryptonite now, but has not yet set up the experiment. Kryptonite is radioactive and 24 hours from now she will have 12 grams of kryptonite left. How long does she have to set up her experiment until there is not enough kryptonite left for her to run it? Give your answer correct to the nearest hour.

13. Show that for any positive $a$ and $b$ there is a number $k$ so that $\log_a x = k\log_b x$. (Suggestion: Convert to an equation involving exponents.)

14. For any $a > 1$, what is the relationship between $\log_a x$ and $\log_{1/a} x$? Sketch the two graphs for $a = 2$.

15. Show that $\log_2 3$ is an irrational number. Can you generalize? (Suggestion: Suppose that $\log_2 3 = p/q$ where $p$ and $q$ are integers. Find a contradiction using the definition of a logarithm to rewrite the given information as an equation involving powers of integers.)


1.9 Comparing Families of Functions

Almost any graph of a function that you draw, or that a machine draws for you, will show only a part of the domain of the function. If you choose the window wisely, you will see the “important part” of the graph. Making a wise choice depends on knowing enough about the qualitative behavior of the basic families of functions; in particular, it is often useful to know about behavior as “$x$ approaches $\infty$” or “$x$ approaches $-\infty$”, that is, as $x$ gets more and more positive or more and more negative.

Much of this is familiar. For instance, any power function $x^a$, for $a > 0$, gets larger and larger without any upper limit as $x \to \infty$. (We often abbreviate this by writing $x^a \to \infty$, but remember that $\infty$ is not a number, it’s just an abbreviation for “gets larger and larger without any upper limit.”) Moreover, the greater the exponent $a$, the faster the growth of $x^a$. In fact this follows easily from the previous statement, since for example

$$\frac{x^2}{x^3} = \frac{1}{x} \to 0 \text{ as } x \to \infty,$$

that is, $x^3$ grows faster than $x^2$ as $x$ gets large. More generally, for any two numbers $a$ and $b$ (positive or negative), if $a < b$, then

$$\frac{x^a}{x^b} = \frac{1}{x^{b-a}} \to 0 \text{ as } x \to \infty.$$

Note that this applies to fractional exponents as well as integers; for instance $\dfrac{x^{1/2}}{x^{2/3}} = \dfrac{1}{x^{1/6}} \to 0$ as $x \to \infty$, so $x^{2/3}$ grows faster than $x^{1/2}$ as $x \to \infty$. We will describe this situation by saying that if $b > a$, then $x^b$ dominates $x^a$ as $x \to \infty$.

Similarly, exponential functions $a^x \to \infty$ as $x \to \infty$ whenever $a > 1$. (Of course $\left(\tfrac12\right)^x \to 0$ as $x \to \infty$ and in general $a^x \to 0$ as $x \to \infty$ whenever $0 < a < 1$. In fact this is really the same statement as in the previous sentence, since for instance $\left(\tfrac12\right)^x = \dfrac{1}{2^x}$ can be thought of as a fraction whose numerator is always 1 and whose denominator gets large as $x \to \infty$.)

Different exponential functions can be compared by looking at quotients, just as we compared power functions. We just have to remember some rules of exponents. For instance

$$\frac{2^x}{3^x} = \left(\frac{2}{3}\right)^x \to 0 \text{ as } x \to \infty$$

since $\tfrac23 < 1$. Thus $3^x$ grows faster than $2^x$ as $x \to \infty$. In fact, whenever $a$ and $b$ are positive and $a < b$,

$$\frac{a^x}{b^x} = \left(\frac{a}{b}\right)^x \to 0 \text{ as } x \to \infty$$


since $\tfrac{a}{b} < 1$. Thus $b^x$ dominates $a^x$ as $x \to \infty$ whenever $b > a > 0$.

We can also incorporate constant multiples for both power functions and exponentials. For any constant $K > 0$, no matter how large, if $a < b$ are positive numbers, then $\dfrac{Kx^a}{x^b} \to 0$ as $x \to \infty$ and $\dfrac{Ka^x}{b^x} \to 0$ as $x \to \infty$. For instance, $10^{10} \cdot 2^x < 3^x$ for all “large” $x$, where here “large” means roughly $x > 57$ (not as large as you expected?) and $10^{10}x^2 < x^3$ for $x > 10^{10}$.
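The surprisingly small threshold for the exponential comparison can be found by direct search. The Python sketch below is an illustration only; it uses exact integer arithmetic, so no rounding is involved, and the function name `first_x_where_exponential_wins` is hypothetical.

```python
def first_x_where_exponential_wins(K=10**10):
    """Smallest positive integer x with K * 2**x < 3**x."""
    x = 1
    while K * 2**x >= 3**x:
        x += 1
    return x

threshold = first_x_where_exponential_wins()
print("K * 2^x < 3^x first holds at x =", threshold)

# The power-function comparison: K * x^2 < x^3 exactly when x > K.
K = 10**10
print(K * K**2 < K**3, K * (K + 1)**2 < (K + 1)**3)
```

Because $(3/2)^x$ is increasing, once the inequality holds it holds for every larger $x$, so the search stops at the right place. The contrast with the power functions, which need $x$ beyond $10^{10}$, is the point of the example.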

Comment on measuring: Notice that we have compared functions above by looking at their ratio rather than their difference. Ratios are a more effective way of measuring the relative size of functions. Think of how the graphs would look if we zoomed way out and looked at the “big picture.” The ratio of $x^3$ and $x^2$ is $x$ which means that on a graph showing both functions, as in the one below left, $x^3$ will appear very much higher than $x^2$ for all reasonably large $x$. The difference between $x^3 + x$ and $x^3$ is also $x$, but now if we zoom out, it will become very difficult to tell the difference between the graphs, as in the graph below right.

[Figures: left, $x^3$ (solid) and $x^2$ (dashed); right, $x^3 + x$ (solid) and $x^3$ (dashed); both for $0 \le x \le 10$.]

1.9.1 Little Oh Notation

Mathematicians have a convenient shorthand notation for comparing the size of functions. We write

$$f(x) = o(g(x)) \text{ as } x \to \infty$$

(this is read “$f(x)$ is little oh of $g(x)$”) if $g$ dominates $f$ as $x \to \infty$, that is, if

$$\frac{f(x)}{g(x)} \to 0$$

as $x \to \infty$. Thus the following are all correct statements (and have already been made above):

$$x^2 = o(x^3) \text{ as } x \to \infty,$$

$$x^{1/2} = o(x^{2/3}) \text{ as } x \to \infty,$$

$$2^x = o(3^x) \text{ as } x \to \infty,$$

$$x^a = o(x^b) \text{ as } x \to \infty \text{ if } a < b,$$

$$a^x = o(b^x) \text{ as } x \to \infty \text{ if } 0 < a < b.$$

In particular, a function $f$ is $o(1)$ as $x \to \infty$ if $\dfrac{f(x)}{1} = f(x) \to 0$ as $x \to \infty$. For instance, $\dfrac{1}{x} = o(1)$ as $x \to \infty$, and in fact $x^{-c} = o(1)$ as $x \to \infty$ for each $c > 0$. This statement is actually equivalent to the statement $x^a = o(x^b)$ if $a < b$ if we make the substitution $c = b - a$ so that $x^{-c} = \dfrac{x^a}{x^b}$. Thus any statement in little oh language comparing two functions $f$ and $g$ can be made in either of two ways: one can write either $f(x) = o(g(x))$ or $f(x)/g(x) = o(1)$ with exactly the same meaning.

In practice I will mostly avoid using the little oh terminology to avoid our getting too bogged down in jargon. I will tend to write out what it means explicitly whenever I can do that without too much awkwardness.

1.9.2 Comparing Power Functions and Exponentials

Which grows faster, power functions or exponential functions? Power functions or logarithm functions? This is a little harder to decide than comparing power functions among themselves or exponential functions among themselves.

It’s best to begin with an experiment, say to compare $f(x) = 2^x$ and $g(x) = x^{10}$. You should do this now, if you haven’t already: Make a chart of values, say for $x = 1, 2, 3, 10, 100, 1000$, expressing all the later entries as powers of 10. It is a good exercise to avoid using your calculator for this; instead remember that $2^{10} = 1024 \approx 10^3$ and use this and properties of exponents to estimate the high powers of 2. Where (very roughly) do $f$ and $g$ intersect? Which function is greater between points of intersection? How can you be certain that the largest point of intersection that you found is really the largest one that there is?

The last question is the most important one, and the most difficult to give a good answer to, though the numbers in a chart of values can actually be pretty convincing once one begins to get a feeling for what is happening.

Another way to approach comparing the sizes of $2^x$ and $x^{10}$ as $x$ grows large is to use the Limit command on the Calc menu (F3) of the TI-89. If we enter limit(x^10/2^x, x, ∞) (note that ∞ is an allowable “point” at which to compute a limit) the calculator returns 0, that is, it tells us that $x^{10}$ is dominated by $2^x$ as $x \to \infty$. (Recall, however, from section 1.4 that when we asked the Solve command for the solutions to $x^{10}/2^x = 0.1$ it missed the positive solution beyond which this quotient is less than 0.1. Thus the Limit command “knows” that the exponential function is eventually much bigger than the power function, but the Solve command does not. To repeat yet again, don’t rely on any machine to do your thinking for you.)
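After doing the chart by hand, one way to check your estimates is a short Python sketch like the one below. It works with logarithms (so nothing overflows) and uses plain bisection. The function names and the search brackets $[1, 2]$ and $[50, 70]$ are assumptions chosen for illustration, and the result is numerical evidence only: it does not by itself answer the question of whether there is a still larger crossing.

```python
import math

def log_gap(x):
    """ln(2**x) - ln(x**10) for x > 0; positive exactly when 2**x > x**10."""
    return x * math.log(2) - 10 * math.log(x)

def bisect(h, lo, hi, tol=1e-10):
    """Plain bisection; assumes h(lo) and h(hi) have opposite signs."""
    sign_lo = h(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (h(mid) > 0) == sign_lo:
            lo = mid   # the sign change is still to the right of mid
        else:
            hi = mid   # the sign change is to the left of mid
    return (lo + hi) / 2

# Brackets picked by looking at a chart of values (an assumption of this sketch).
small_crossing = bisect(log_gap, 1.0, 2.0)
large_crossing = bisect(log_gap, 50.0, 70.0)
print(small_crossing, large_crossing)
```

Between the two crossings the power function $x^{10}$ is the larger one; beyond the second crossing $2^x$ is larger at every point the sketch samples.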

Of course the relative sizes of power functions and exponentials can also be tackled algebraically. One approach is to go back to the characterization of exponential functions as functions that increase by a constant ratio as $x$ increases by a constant amount, and to compare this with the situation for power functions.

I’ll illustrate this by working through the calculations for the functions $f(x) = 3^x$ and $g(x) = x^4$ of exercise 10 from section 1.7. The idea will be to look at the ratios $\frac{f(n+1)}{f(n)}$ and $\frac{g(n+1)}{g(n)}$ and to use the fact that the first ratio is constantly equal to 3 while the second ratio starts out large but then approaches 1.

To begin with, look at a table of values:

n    f(n)     f(n+1)/f(n)    g(n)    g(n+1)/g(n)
1    3        3              1       16
2    9        3              16      5.06
3    27       3              81      3.16
4    81       3              256     2.44
5    243      3              625     2.07
6    729      3              1296    1.85
7    2187     3              2401    1.71
8    6561     3              4096    1.60
9    19683    3              6561    1.52

We can see from the table what we have already seen graphically in section 1.7: the graphs of $3^x$ and $x^4$ cross between $x = 1$ and $x = 2$ and again between $x = 7$ and $x = 8$. We believe that there are no more crossings, but how can we justify this?

One way is this. From the right hand column of the table, we see that $\frac{g(n+1)}{g(n)} = \frac{(n+1)^4}{n^4} < 2$ for all $n \ge 6$: $\frac{7^4}{6^4} \approx 1.85$ and so forth. (We could also justify this algebraically by solving the inequality $\frac{(x+1)^4}{x^4} = \left(1 + \frac{1}{x}\right)^4 < 2$ for $x$ to find $x > \left(2^{1/4} - 1\right)^{-1} \approx 5.29$.) We can use this to find an exponential function that is larger than $x^4$ but smaller than $3^x$ and is easier to compare to each of these functions than they are to compare directly to each other.


Here are the first couple of steps. Each step uses the fact that $(n+1)^4 < 2n^4$ for the values of $n$ being considered.

$7^4 < 2 \cdot 6^4$

$8^4 < 2 \cdot 7^4 < 2 \cdot (2 \cdot 6^4) = 2^2 \cdot 6^4$ (substituting from previous line)

$9^4 < 2 \cdot 8^4 < 2 \cdot (2^2 \cdot 6^4) = 2^3 \cdot 6^4$ (substituting from previous line)

$10^4 < 2 \cdot 9^4 < 2^4 \cdot 6^4$ (substituting from previous line)

and so forth. The pattern is that for any $n \ge 7$,

$$n^4 < 2^{n-6} \cdot 6^4 = \left(6^4/2^6\right) 2^n.$$

Since the expression in parentheses is a constant ($= \frac{81}{4} = 20.25$, though this doesn’t really matter), we now have that $n^4 < 20.25 \cdot 2^n$ for all $n \ge 7$. But we know that for all large enough $n$, $20.25 \cdot 2^n < 3^n$. (In this specific case, a little experimentation reveals $20.25 \cdot 2^8 = 5184 < 3^8 = 6561$ so the second inequality is true for all $n \ge 8$.) We have then the double inequality

$$n^4 < 20.25 \cdot 2^n < 3^n$$

where the first inequality is true for all $n \ge 7$ and the second is true for all $n \ge 8$. Thus both parts of the inequality are true for all $n \ge 8$.

The graphs of these three functions tell the same story: the auxiliary function $20.25 \cdot 2^x$ that we introduced lies between $x^4$ and $3^x$ for all $x \ge 8$.

[Figure: graphs of $x^4$, $20.25 \cdot 2^x$ and $3^x$ for $0 \le x \le 10$, $0 \le y \le 15000$.]
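The double inequality can also be spot-checked by machine. The following Python sketch uses exact integer arithmetic (writing $20.25$ as $81/4$ and multiplying through by 4). The function name and the upper limit of 500 are assumptions made for illustration; a finite check like this supports the algebraic argument but does not replace it.

```python
def sandwich_holds(n):
    """True when n**4 < 20.25 * 2**n < 3**n, checked exactly in integers.

    20.25 = 81/4, so every term is multiplied by 4 to avoid floating point.
    """
    return 4 * n**4 < 81 * 2**n < 4 * 3**n

# Which small n fail at least one of the two inequalities?
failures_below_8 = [n for n in range(1, 8) if not sandwich_holds(n)]

# Does the double inequality hold for every n from 8 up to 500?
holds_from_8 = all(sandwich_holds(n) for n in range(8, 501))

print(failures_below_8, holds_from_8)
```

At $n = 6$ the first inequality fails only because $4 \cdot 6^4$ and $81 \cdot 2^6$ are equal, which reflects how the constant $20.25$ was built from $6^4/2^6$.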

This shows that $n^4 < 3^n$ for all $n \ge 8$ (no matter how large $n$ is, that is, there are no surprises for tremendously large numbers) and in fact since all of these functions are increasing regularly, we don’t have to stick to integer values of the original functions, that is, we now know $x^4 < 3^x$ for all real numbers $x \ge 8$. In fact it shows more than that. We already know that $2^x$ is dominated by $3^x$ as $x \to \infty$ so that also $20.25 \cdot 2^x$ is dominated by $3^x$ as $x \to \infty$. Since $x^4 < 20.25 \cdot 2^x$ for $x \ge 8$, it must be also that $x^4$ is dominated by $3^x$ as $x \to \infty$, which is what we were trying to show.

SIDE REMARK: The overall strategy here is a very common one when trying to compare two functions that “look different.” We have compared the two by finding a third, intermediate, function that is reasonably easy to compare to each of the two we are really interested in. It is usually the case, as it is here, that there are many possible choices of intermediate function (here we could have used a multiple of $1.5^x$ or $\left(\sqrt{2}\right)^x$ or ... in place of $2^x$; can you describe some whole class of possibilities?) and you use whatever comes to mind first.

The upshot of all this is the following “principle”:

Increasing exponential functions always dominate power functions as x→∞.

Stated more formally, for any $b > 1$ and any positive $k, K, M$,

$$\frac{K x^M}{b^{kx}} \to 0 \text{ as } x \to \infty.$$
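The principle says nothing about how large $x$ must be before the exponential takes over, and that can be very large indeed. As a rough illustration, this Python sketch computes $\log_{10}$ of the quotient $K x^M / b^{kx}$ using logarithms, so that no huge numbers are ever formed. The default values $K = 1$, $M = 1000$, $b = 1.1$, $k = 1$ and the sample points are assumptions chosen for the example.

```python
import math

def log10_ratio(x, K=1.0, M=1000.0, b=1.1, k=1.0):
    """log10 of K * x**M / b**(k*x), computed via logarithms to avoid overflow."""
    return math.log10(K) + M * math.log10(x) - k * x * math.log10(b)

for x in (1e3, 1e4, 1e5, 1e6, 1e7):
    # A positive value means the power function is still ahead at this x;
    # a negative value means the exponential has overtaken it.
    print(f"x = {x:.0e}: log10(ratio) = {log10_ratio(x):.1f}")
```

With these sample values the quotient is still astronomically large at $x = 10^5$ and only drops below 1 somewhere before $x = 10^6$.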

1.9.3 Comparing Power Functions and Logarithms

Which grows faster as $x \to \infty$, power functions or logarithms? Since we know the graphs of logarithm functions are concave down, it makes sense to compare them to fractional powers. So to start with, construct a table of values for $h(x) = x^{1/10}$ and $\ell(x) = \log_{10} x$ for $x = 1, 2, 3, 10, 100, 1000$. Where do $h$ and $\ell$ intersect? Which function is greater between points of intersection? How can you be certain that the largest point of intersection that you found is really the largest one that there is?

As before, the last question is the most important one. As before, the table can give you a pretty good idea of what the answer is, but to really establish it carefully takes something more. One way to proceed would be to do something similar to what we did in comparing power functions and exponentials. We will do something different, however, partly because it is more interesting to vary strategies, but mostly because it will be easier to make use of some things we have already done than to start from scratch again.

The point is that these functions are inverses of functions we have already studied. The inverse of $h(x) = x^{1/10}$ (for $x \ge 0$) is $g(x) = x^{10}$. The inverse of $\ell(x) = \log_{10} x$ is $f(x) = 10^x$. We know that $x^{10} = o(10^x)$ as $x \to \infty$. What does that tell us about the relationship between $x^{1/10}$ and $\log x$? Well, we know that the graphs of a pair of inverse functions are reflections of each other across the line $y = x$. So we have two pairs of reflected graphs, giving a diagram like this:

[Figure: graphs of $x^3$, $2^x$, $x$, $x^{1/3}$ and $\log_2 x$ for $0 \le x \le 5$, $0 \le y \le 5$.]

Actually this is a diagram of $x^3$, $2^x$ and their inverses $x^{1/3}$, $\log_2 x$ since they make a picture that is much easier to see. There is, however, a problem: note that as they disappear out the top, $x^3 > 2^x$, while we know that eventually the opposite must be true. (To see why I chose a scale of 0 to 5 on both axes, note that the next intersection of $2^x$ and $x^3$ is near the point $(10, 1000)$ and look at what your calculator gives you when you set both scales to the range 0 to 1000.) The graph does make the following point: reflecting across $y = x$ reverses which function is “on top.” That is, the fact that $x^3 > 2^x$ for the largest values we can see corresponds to the fact that $x^{1/3} < \log_2 x$ for the largest values we can see.

To see the “real” long-term picture for $x^{10}$, $10^x$, $x^{1/10}$ and $\log_{10} x$ we will have to look at the two parts separately, and not use equal scales for the axes. The functions $x^{10}$ and $10^x$ intersect at $x \approx 1.3713$ and at $x = 10$. The actual points of intersection are $(1.3713, 23.5119)$ and $(10, 10^{10})$. Thus to show the second intersection we will need scales something like $0 \le x \le 11$ and $0 \le y \le 2 \times 10^{10}$. This produces the diagram on the left below. The diagram on the right shows the inverse functions $x^{1/10}$ and $\log x$ on the scales $0 \le x \le 2 \times 10^{10}$, $0 \le y \le 11$.

[Figure, left: $10^x$ and $x^{10}$ for $0 \le x \le 11$, $0 \le y \le 2 \times 10^{10}$. Right: $\log x$ and $x^{1/10}$ for $0 \le x \le 2 \times 10^{10}$, $0 \le y \le 11$.]

It is clear that the fact that $10^x > x^{10}$ for all large $x$, that is, the graph of $f$ is above the graph of $g$, reflects across $y = x$ into the fact that the graph of $\ell = f^{-1}$ is below the graph of $h = g^{-1}$ for all large $x$. So we know that $\log_{10} x$ is less than $x^{1/10}$ for all large $x$. But is it true that $\log_{10} x$ is dominated by $x^{1/10}$ as $x \to \infty$? This is harder to read from the diagram by reflection. That $x^{10}$ is dominated by $10^x$ as $x \to \infty$ is a statement about the relative size of two vertical distances at $x$. (This becomes quite clear visually if you extend the $x$ scale to 15 or so.) By reflection, this becomes a statement about horizontal distances between the y-axis and the graphs of $\log x$ and $x^{1/10}$. This is not what we want to know about. In fact it can turn out that for two increasing functions $f$ and $g$, $f = o(g)$ as $x \to \infty$, but $g^{-1} \ne o(f^{-1})$ as $x \to \infty$. (There is an exercise below about this.) So we will need to be a little more careful.

Fortunately, we can still use the inverse relationship, just in a symbolic way instead of a graphical way. Suppose we try to solve the equation

$$\log_{10} x = \frac{1}{k}\, x^{1/10}. \qquad (1.5)$$

The statement that $\log_{10} x$ is dominated by $x^{1/10}$ is equivalent to the statement that the displayed equation is true for successively larger values of $k$ (without any upper bound) as $x$ grows. (That is, we first fix $x$ and then find $k$ to make equation (1.5) true. For $\log_{10} x$ to be dominated by $x^{1/10}$ would mean that as we fix larger and larger values of $x$, we will get larger and larger values of $k$.)

But if we set $u = \log_{10} x = \frac{1}{k} x^{1/10}$ and solve each of the individual equations $u = \log_{10} x$ and $u = \frac{1}{k} x^{1/10}$ for $x$, we get $x = 10^u$ and $x = (ku)^{10}$, that is, the number $u$ satisfies the equation $(ku)^{10} = 10^u$ or $u^{10} = \frac{1}{k^{10}} 10^u$. Since $u^{10}$ is dominated by $10^u$, this equation is true for successively larger values of $k$ (without any upper bound) as $u$ grows. But the $k$ that satisfies this equation for a given $u$ also satisfies equation (1.5) for the corresponding $x = 10^u$. Thus $\frac{\log x}{x^{1/10}} = 1/k$ does approach 0 as $x$ grows, and it is true that $\log x$ is dominated by $x^{1/10}$.
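The substitution argument can be followed numerically. Solving $(ku)^{10} = 10^u$ for $k$ gives $k = 10^{u/10}/u$ for $u > 0$, and the Python sketch below tabulates this $k$ together with the quotient $1/k = \log_{10} x / x^{1/10}$ at $x = 10^u$. The function name and the sample values of $u$ are assumptions made for illustration.

```python
def k_for(u):
    """The k > 0 solving (k*u)**10 == 10**u, namely k = 10**(u/10) / u (for u > 0)."""
    return 10 ** (u / 10) / u

us = [10, 20, 50, 100, 200]        # sample exponents; the corresponding x is 10**u
ks = [k_for(u) for u in us]
ratios = [1 / k for k in ks]       # equals log10(x) / x**(1/10) at x = 10**u

for u, k, r in zip(us, ks, ratios):
    print(f"u = {u:>3}: k = {k:.3g}, log10(x)/x^(1/10) = {r:.3g}")
```

The row $u = 10$ gives $k = 1$, which is the intersection of $\log_{10} x$ and $x^{1/10}$ at $x = 10^{10}$; after that $k$ grows without any sign of levelling off, so the quotient heads to 0.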


1.9.4 Behavior as x→ 0+

We also often need to consider the behavior of a function as $x \to 0$. For the functions being considered in this section, the situation is fairly simple, but we will get to more complicated situations later (the derivative, for instance!).

Any power function $x^c$, where $c > 0$, approaches 0 as $x \to 0$. This is familiar, and can easily be checked with a graphing calculator. The only thing to watch out for is that for “awkward” values of the exponent $c$, $x^c$ is defined only for positive values of $x$, while for “simple” exponents, $x^c$ is defined for all real numbers.

Which exponents are simple? Well, integers, and some fractions: $x^{1/3}$ is defined for all real $x$, but $x^{1/2}$ is not. In general, odd roots of negative numbers are defined, so $x^{p/q}$ makes sense for all real numbers whenever $q$ is an odd integer, but not when $q$ is an even integer. And when $c$ is irrational, $x^c$ is also defined only for positive values of $x$.

To avoid cases, then, for this subsection we will consider only positive values of $x$. We will abbreviate this situation by writing $x \to 0+$ ($x$ approaches 0 from the positive side).

Comparing two different power functions is easy: if $a < b$, then $x^b/x^a = x^{b-a} \to 0$. By analogy with the situation as $x \to \infty$ we will say that $x^a$ (the larger function) dominates $x^b$ (the smaller function) as $x \to 0+$ (equivalently, $x^b$ is dominated by $x^a$) or, more informally,

larger powers of $x$ approach 0 faster as $x \to 0+$.

For instance, $x^2$ is dominated by $x$ as $x \to 0+$, just the opposite of the behavior as $x \to \infty$. Graphically, this just means that the graph of $x^2$ is much flatter than the graph of $x$ as $x \to 0+$. (And the graph of $x^3$ is flatter still....) The graphs look like this, where the second is a closeup of the first.

[Figure, left: $x^{1/2}$, $x$, $x^2$, $x^3$, $x^{10}$ for $0 \le x \le 1$, $0 \le y \le 1$. Right: closeup of $x^2$ and $x^3$ for $0 \le x \le 0.01$, $0 \le y \le 2 \times 10^{-5}$.]

Notice that this behavior is backwards from the behavior of power functions as $x \to \infty$. This is why the “as $x \to \infty$” or “as $x \to 0+$” part of the statement is important: the relative behavior of functions often depends on where you are.

What about exponentials and logs? Well, exponentials don’t approach 0 as $x \to 0+$; in fact for any $a > 0$, $a^x \to 1$ as $x \to 0+$. And logarithm functions don’t have a limit as $x \to 0+$; recall that $\log_{10} x$ has a graph like this.

[Figure: graph of $\log_{10} x$ for $0 < x \le 5$.]

What about polynomials instead of just power functions? As $x \to \infty$ it is the leading term, the one with the highest exponent, that determines the polynomial’s rate of growth. But as $x \to 0+$ it is the terms in a polynomial with lower exponents that become progressively more important. For instance, the graph of $p(x) = 1 + x - x^2$ is a parabola opening downwards, but the closer you get to 0, the more its graph looks like the graph of a line with slope 1. (As if the $-x^2$ part of the formula wasn’t there.) Here are three views with different scales of the graph of $1 + x - x^2$. Note that the horizontal axis on the two right-hand graphs is not the x-axis. As you look from right to left (that is, look at a progressively larger part of the domain surrounding $x = 0$) you can see the three terms in the expression for $p$ expressed one at a time. The constant term gives the value at $x = 0$, the linear term gives the behavior very near $x = 0$ as in the right graph and the quadratic term governs the behavior as you get farther from $x = 0$ as in the left graph.

[Figure: three views of the graph of $1 + x - x^2$, for $-4 \le x \le 4$ (left), $-0.4 \le x \le 0.4$ (middle) and $-0.04 \le x \le 0.04$ (right).]
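The zooming effect in the three views can be put into numbers. The Python sketch below measures the largest gap between $p(x) = 1 + x - x^2$ and the line $1 + x$ over each of the three windows and compares it with the rise $2h$ of the line across a window of half-width $h$. The helper names and the sampling density are assumptions made for illustration.

```python
def p(x):
    return 1 + x - x**2

def linear_part(x):
    return 1 + x

def worst_gap(half_width, samples=201):
    """Largest |p(x) - (1 + x)| over evenly spaced x in [-half_width, half_width]."""
    xs = [-half_width + 2 * half_width * i / (samples - 1) for i in range(samples)]
    return max(abs(p(x) - linear_part(x)) for x in xs)

for h in (4, 0.4, 0.04):
    gap = worst_gap(h)
    # The line 1 + x rises by 2*h across the window, so gap / (2*h) indicates
    # how visible the quadratic term is at that zoom level.
    print(f"half-width {h}: worst gap {gap:.4g}, relative to rise {gap / (2 * h):.4g}")
```

The gap is just $x^2$ at the edge of the window, so its size relative to the rise of the line is $h/2$, which shrinks in step with the window.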


1.9.5 Section Summary

Behavior As $x \to \infty$ (Faster growth higher on list)

Family              General relation ($b > a$)                        Examples
Exponentials        $a^x = o(b^x)$                                    $2^x = o(3^x)$, $3^x = o(\pi^x)$
Power functions*    $x^a = o(x^b)$                                    $x^2 = o(x^3)$, $x^3 = o(x^\pi)$
Logarithms          All log functions are multiples of one another    $\log x = (\log e) \ln x$

* Polynomials behave as $x \to \infty$ like their term with the highest exponent: $5x^3 + 1345x^2 + 10^{10}$ behaves like $5x^3$ for sufficiently large positive values of $x$. (“Sufficiently large” in this example is about $10^4$.)

Behavior As $x \to 0+$

Here we are generally interested in the rate at which a function approaches 0 as $x \to 0+$. Of the families above, only power functions approach 0.

General relation ($b > a$)    Examples
$x^b = o(x^a)$                $x^3 = o(x^2)$, $x^\pi = o(x^3)$

EXERCISES

1. Let $f(x) = 100 \cdot 2^x$, $g(x) = 0.01 \cdot 3^x$. For which $x > 0$ is the quotient $f/g$ less than 0.01? Less than $10^{-6}$? Determine a formula in terms of $c$ for when $f(x)/g(x) < c$.

2. Show that if $0 < a < b$ and if $A$ and $B$ are positive constants (think of $A$ as large and $B$ as small), then for every $c > 0$, $A a^x < c B b^x$ for all $x$ greater than some number $x_c$ which depends on $c$.

3. The number of dishes that a family uses per day is proportional to a power of the number of family members. If a bachelor uses one dish per day, and a couple without children use 8 dishes per day, how many dishes per day will a family of four use? What is the power? (Can you see how to answer the first question without working out the power?) If the number of dishes increased exponentially, how many dishes would a family of four use?

4. What is the smallest integer $n$ so that $\left(1 + \frac{1}{n}\right)^4 < 1.1$? (Do this algebraically. Don’t just try values of $n$ or you’ll waste a lot of time.)

5. Find a constant $K$ for which the inequality $n^4 < K \cdot 1.1^n$ is true for all $n$ greater than the value you found in the previous problem. For what other values of $K$ would it also be true? (Larger $K$? Smaller $K$? Some of both? Neither? Explain.)

6. Use an argument similar to that above to show that $x^{100} = o(3^x)$.

7. Use an argument similar to that above to show that $x^{1000} = o(1.1^x)$.

8. Find an example of increasing functions $f$ and $g$ so that $f(x) = o(g(x))$ as $x \to \infty$, but $g^{-1}(x) \ne o(f^{-1}(x))$ as $x \to \infty$. Hint: consider, for instance, $2^x$ and $3^x$.

9. Find three different viewing windows centered at the origin for the polynomial $x^3 - 4x^2 + x - 2$ so that the linear term is dominant in the first graph, the quadratic term in the second graph and the cubic term in the third graph. (That is, the first graph should be approximately a line, the second approximately a parabola and the third like a cubic.)

1.10 Parametrized Curves

We can describe the motion of a particle on the real line by giving its position as a function of $t$: for instance, $x = (t-1)^2$ describes a particle which at time 0 is at $x = 1$, which moves leftward to the origin (arriving at $t = 1$) then reverses direction and moves off to the right at a steadily increasing speed. (Or, if you prefer to think of negative as well as positive times, the particle appears from the right, slowing down as it comes, stops at the origin and then moves back to the right.)

To describe the motion of a particle in the plane in a similar way we need two functions, one to describe the particle’s x-coordinate, and one to describe its y-coordinate. Such a pair, for instance

$$x = (t-1)^2, \qquad y = t,$$

is called a set of parametric equations, since $x$ and $y$ are given in terms of the parameter $t$. (This is a somewhat different use of the word “parameter” than in section 1.3 where it was used to mean a constant value that may have different values in different situations. Here $t$ is just the independent variable and not any kind of constant.)

In this case an easy way to see what path the particle follows is to eliminate $t$ from the pair of equations to get the relationship $x = (y-1)^2$. This is the equation of a parabola with vertex at $(0, 1)$ which is lying on its side. Since the y-coordinate is increasing throughout, the particle moves to the left along the lower arm of the parabola, passes through $(0, 1)$, and then goes off to the right along the upper arm. See the diagram below.

[Figure: the curve $x = (t-1)^2$, $y = t$, with the points for $t = -2$, $t = 1$ and $t = 4$ marked.]

Now consider the pair of parametric equations

$$x = (t+1)^2, \qquad y = -t.$$

If we eliminate $t$ again we get $x = (1-y)^2 = (y-1)^2$. This is the same parabola as the previous example. Is it the same motion? No, because this time $y$ is always decreasing as $t$ increases. This time the particle appears from the right along the upper arm of the parabola and moves back to the right along the lower arm. It passes through the same points as before but moving in the opposite direction.

For a third example, consider

$$x = \left(t^3 - 1\right)^2, \qquad y = t^3.$$

This is motion along the same parabola, in the same direction as the first example, but it is still not the same motion. For instance, although in both cases the particle is at $(1, 0)$ at $t = 0$ and at $(0, 1)$ at $t = 1$, in the first example it is at $(1, 2)$ at $t = 2$ and in the third example it is at $(49, 8)$ at $t = 2$. In general the higher power of $t$ will make the particle move away from the origin much more quickly as $t$ grows.

The moral of all this is that to study motion in the plane we must keep track of not only the points that the particle passes through, but also of the corresponding times. The formal way to do this is to define a curve in the plane to be a function.

Definition 1 A parametrized curve (or just curve for short) in the plane, $\mathbb{R}^2$, is a function $F$ from an interval $I$ of real numbers into $\mathbb{R}^2$.

We can think of $F$ as being made up of two real-valued component functions: $F(t) = (f(t), g(t))$. In the first example above we have $F(t) = \left((t-1)^2, t\right)$ made from $f(t) = (t-1)^2$ and $g(t) = t$. This gets us back to where we started:

Definition 2 The equations

x = f (t)

y = g (t)

are the parametric equations of the curve F (t) = (f (t) , g (t)) .

And to finish things off:

Definition 3 The track (or sometimes trace) of the curve $F$ is the set of points

{(x, y) : x = f (t) , y = g (t) for some t in I} ,

that is, the track of F is just the range of F.

So the three curves given by the displayed equations above all have the same track, but are three different curves. This is not quite the ordinary language use of the word “curve”, since that generally just refers to the shape (that is, the track in this terminology), but we have already seen that the distinction matters.
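The difference between a curve and its track is easy to see by sampling. This Python sketch evaluates the three example curves at a few integer times and checks that every sampled point satisfies $x = (y-1)^2$. The function names and the choice of sample times are assumptions made for illustration.

```python
def curve_1(t):
    return ((t - 1) ** 2, t)

def curve_2(t):
    return ((t + 1) ** 2, -t)

def curve_3(t):
    return ((t**3 - 1) ** 2, t**3)

def on_track(point):
    """True when the point satisfies x = (y - 1)**2 (exact for integer inputs)."""
    x, y = point
    return x == (y - 1) ** 2

times = [-2, -1, 0, 1, 2]   # integer sample times keep the arithmetic exact
for name, curve in [("first", curve_1), ("second", curve_2), ("third", curve_3)]:
    points = [curve(t) for t in times]
    print(name, points, all(on_track(pt) for pt in points))
```

All three lists lie on the same parabola, but the points reached at a given time differ from curve to curve, which is exactly why the three are counted as different curves.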

Example 1. Consider motion around the unit circle in the usual (counterclockwise) direction. The coordinates of a point on $x^2 + y^2 = 1$ can be given in the form $(\cos t, \sin t)$ where $t$ is the angle between the positive x-axis and the line from the origin to the point. Thus

$$x = \cos t, \qquad y = \sin t$$

is a set of parametric equations defining counterclockwise motion around the unit circle. Alternatively, we could speak of the curve $F(t) = (\cos t, \sin t)$.

Notice that we go once around the circle as $t$ increases from 0 to $2\pi$. Thus if we want to describe going once around the circle, starting from the point $(1, 0)$, we want the $t$ values $0 \le t \le 2\pi$.

This leads to one last addition to the terminology.

Definition 4 If a curve $F(t) = (x(t), y(t))$ is defined for $a \le t \le b$, then $a$ is the initial time and $b$ is the terminal time. The corresponding points, $(x(a), y(a))$ and $(x(b), y(b))$, are the initial and terminal points respectively. A curve is closed if the initial and terminal points coincide.


Thus in the example above, 0 is the initial time, $2\pi$ is the terminal time, and $(1, 0)$ is both initial point and terminal point. So this is a closed curve.

Example 2. $F(t) = (\cos t, \sin t)$, $0 \le t \le 3\pi$ would be a trip one and a half times around the unit circle. Not only is it a different curve from the previous one, it is not a closed curve (even though it has the same trace as a closed curve) because we don’t finish where we start.
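Definition 4 translates directly into a small check. In this Python sketch a curve is represented as a function of $t$ together with its initial and terminal times, and it counts as closed when the two endpoints agree up to a small floating-point tolerance. The names `unit_circle` and `is_closed` and the tolerance are assumptions made for illustration.

```python
import math

def unit_circle(t):
    return (math.cos(t), math.sin(t))

def is_closed(curve, a, b, tol=1e-12):
    """True when the initial point curve(a) and terminal point curve(b) coincide (within tol)."""
    xa, ya = curve(a)
    xb, yb = curve(b)
    return math.hypot(xb - xa, yb - ya) < tol

print(is_closed(unit_circle, 0, 2 * math.pi))  # Example 1: once around
print(is_closed(unit_circle, 0, 3 * math.pi))  # Example 2: one and a half times around
```

The same function on two different intervals gives two different answers, which underlines that the interval of $t$ values is part of what a curve is.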

Example 3. You are familiar with problems in which a ball (or perhaps some other object like a haggis) is thrown straight up in the air. If the haggis is thrown from a height $y_0$ with initial upward velocity $w_0$, then its height after $t$ seconds will be given by

$$y = y_0 + w_0 t - \frac{1}{2} g t^2$$

where $g \approx 9.80\ \mathrm{m/sec^2}$ is the acceleration due to gravity near the surface of the earth. We will remind ourselves a little later where this formula comes from. (Strictly speaking, this is the formula for the height of the haggis assuming that there is no retarding force like air resistance. So it would work quite well in a vacuum. When we study differential equations we will examine what happens if we allow for air resistance.)

But for now we will make this situation more realistic in a different way. Mostly objects are not thrown straight up, but at some angle with the vertical. Thus they travel horizontally as well as vertically and we should keep track of horizontal motion also. Thus we need a second equation. We assume that gravity is the only force acting on the haggis. Newton’s law of motion says that the acceleration (rate of change of velocity) of an object is proportional to the net force acting on it. Since no force acts horizontally on the haggis, its horizontal velocity is constant. Thus the horizontal component of the haggis’ position is given by a linear equation of the form

$$x = x_0 + u_0 t$$

where $x_0$ is the initial position and $u_0$ is the initial horizontal velocity. Finally, we can put these two equations together to get this set of parametric equations for a haggis (or any other object) moving from an initial position $(x_0, y_0)$ under the influence of gravity with initial horizontal velocity $u_0$ and vertical velocity $w_0$:

$$x = x_0 + u_0 t, \qquad y = y_0 + w_0 t - \frac{1}{2} g t^2.$$

What is the track of the haggis? If we solve the first equation for $t$ and plug into the second, we get

$$y = y_0 + \frac{w_0}{u_0}(x - x_0) - \frac{g}{2u_0^2}(x - x_0)^2.$$


This may look a little complicated, but it should be clear that if we were to multiply it all out (I’m not going to) then $y$ would end up being a quadratic function of $x$, one where the quadratic term has a negative coefficient. So the track is a parabola opening downwards. Note that this is not quite the same as the fact that you may be thinking of: the graph of $y$ versus $t$ is a parabola, even when the haggis is thrown straight up and so travels only up and down. Here we see that the physical track of the haggis in space (what you would see if you watched) is a parabola, at least as long as the horizontal velocity $u_0$ is non-zero.

Let’s look at a specific example. If the haggis is thrown from a height of 2 meters with a horizontal velocity of 10 meters per second and also a vertical velocity of 10 meters per second, how far will it go before it hits the (horizontal) ground, and how long will that take? We have $y_0 = 2$ and may as well take $x_0 = 0$. Also $u_0 = w_0 = 10$. We can find the time to hit the ground by setting $y = 0$ in the second equation and solving for $t$, that is, we want the positive root of

$$4.90t^2 - 10t - 2 = 0.$$

We find $t \approx 2.22$ seconds. Then the distance traveled horizontally will be $x \approx 10(2.224) = 22.2$ meters. Notice that the vertex of the parabola $y = 2 + 10t - 4.9t^2$ occurs for $t = 10/(2 \times 4.9) \approx 1.02$ seconds, so that the haggis is rising for a slightly shorter time than it is falling. Since it lands lower than it starts and so has farther to fall than it rose, this makes sense.
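The arithmetic in this example can be reproduced with a few lines of Python. The sketch below encodes the parametric equations with the values from the text ($x_0 = 0$, $y_0 = 2$, $u_0 = w_0 = 10$, $g = 9.80$) and finds the landing time from the quadratic formula. The function and variable names are assumptions made for illustration.

```python
import math

G = 9.80  # m/sec^2, the value of g used in the text

def position(t, x0=0.0, y0=2.0, u0=10.0, w0=10.0):
    """Parametric equations x = x0 + u0*t, y = y0 + w0*t - (1/2)*G*t**2."""
    return (x0 + u0 * t, y0 + w0 * t - 0.5 * G * t**2)

def landing_time(y0=2.0, w0=10.0):
    """Positive root of (G/2)*t**2 - w0*t - y0 = 0, from the quadratic formula."""
    a, b, c = G / 2, -w0, -y0
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

t_land = landing_time()
x_land, y_land = position(t_land)
t_peak = 10.0 / G  # time of the vertex of y(t), that is w0 / G

print(f"lands at t = {t_land:.3f} s, x = {x_land:.2f} m; peak at t = {t_peak:.2f} s")
```

Comparing `t_peak` with `t_land - t_peak` gives the rising and falling times discussed above.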

EXERCISES.

1. Write parametric equations, including initial and terminal times, for each of the following situations:

(a) Counterclockwise once around $x^2 + y^2 = 4$, starting at $(2, 0)$.

(b) Counterclockwise twice around $x^2 + y^2 = 9$, starting at $(3, 0)$.

(c) Clockwise once around $x^2 + y^2 = 16$, starting at $(0, 4)$, with initial time $t = 0$ and terminal time $t = 1$.

(d) Clockwise once around $x^2 + y^2 = 8$, starting at $(2, 2)$ when $t = 0$.

2. Write parametric equations, with initial and terminal times as specified:

(a) The straight line from $(0, 0)$ to $(2, 3)$ with initial time $t = 0$ and terminal time $t = 1$.

(b) The straight line from $(0, 0)$ to $(2, 3)$ with initial time $t = 0$ and terminal time $t = 3$.

(c) The straight line from $(1, 2)$ to $(3, 5)$ with initial time $t = 0$ and terminal time $t = 1$.

(d) The straight line from $(a, b)$ to $(c, d)$ with initial time $t = 0$ and terminal time $t = 1$.

(e) The straight line from $(a, b)$ to $(c, d)$ with initial time $t = 0$ and terminal time $t = 1$, but moving at a varying speed.

3. You start at $(0, -2)$, walk along the y-axis at unit speed to the origin, take one second to pivot 90° to the left to face down the negative x-axis and walk to $(-2, 0)$ at unit speed.

(a) Find a set of parametric equations for your motion for $0 \le t \le 5$.

(b) You are carrying an awkward statue for your Fairhaven class on shamanism which sticks out two units to your right at all times. (Thus the head is at $(2, -2)$ initially, and at $(-2, 2)$ at $t = 5$.) Find a set of parametric equations for the motion of the statue’s head for $0 \le t \le 5$.

4. Here is the track of a set $x = f(t)$, $y = g(t)$ of parametric equations for $0 \le t \le 4$. The track starts at $(2, 0)$ at $t = 0$ and proceeds counterclockwise around the diamond, passing the corners at $t = 1, 2, 3, 4$. Plot on separate graphs the individual functions $f$ and $g$. (You won’t have to work out formulas. See the instructions below.) What property or properties of the functions $f$ and $g$ correspond to the fact that this is a closed curve?

[Figure: a diamond-shaped track, drawn on axes with $-2 \le x \le 2$ and $-1 \le y \le 1$.]

5. For each of the following pairs of functions $f(t)$, $g(t)$ plot the piecewise linear track of the parametric equations $x = f(t)$, $y = g(t)$ for the indicated values of $t$. (See instructions below.) For each graph: add the $t$ value at each corner of the track and list the initial and terminal values of the curve and discuss, using complete sentences, symmetry of the track with respect to the x-axis or y-axis (if there is any) or other symmetry and relate this symmetry to properties (evenness or oddness) of the coordinate functions $f$ and $g$. For some extra credit verify your assertions symbolically using the formal definitions of what it means to be even or odd around a real number as explained in section 1.5 of the class notes.

Here are the pairs, indicated graphically. Don’t bother to create formulas. Plot the important points on the parametric equations and connect the dots.

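(a) [Figure: graphs of $x = f(t)$ and $y = g(t)$ against $t$, with $t$ marked from 1 to 4 and values between $-1$ and $1$.]

(b) [Figure: graphs of $x = f(t)$ and $y = g(t)$ against $t$ for $0 \le t \le 4$, with $f$ between 0 and 2 and $g$ between 0 and 1.]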

6. Colin the cockroach is crawling along a radial beam on a ferris wheel whose radius is 20 feet. He is moving toward the center of the wheel at a constant speed of 5 feet per minute. If we introduce coordinates so that the center of the wheel is at the origin, if his initial position (in feet) is $(20, 0)$ and if the wheel revolves at the constant rate of one revolution in four minutes, counterclockwise, find parametric equations for Colin’s position as a function of time. (Suggestion: First write equations for the motion of the point on the edge of the ferris wheel that Colin starts from. Draw a picture.) When, if ever, will Colin reach the center of the ferris wheel? Sketch Colin’s track.

7. Consider the track of the parametrized curve $x = \cos^2 t$, $y = \cos t \sin t$ for $0 \le t \le a$, where $a$ is the smallest positive value of $t$ for which the track returns to its initial position.

(a) What is $a$? Sketch the track, being careful to use equal scales for $x$ and $y$.

(b) From your track, guess an $xy$ formula for the track, and verify that the expressions for $x$ and $y$ do satisfy that formula.

8. Sketch the track of the parametrized curve $x = \cos 4t$, $y = \sin 3t$ for $0 \le t \le 2\pi$, indicating for which values of $t$ it crosses the coordinate axes. Also sketch the individual graphs of $x(t)$ and $y(t)$ against $t$ and indicate what properties of these graphs correspond to the fact that the parametrized curve is symmetric with respect to the x-axis.

9. Same question for $x = \sin 2t$, $y = \sin t$.

10. Sketch the track of $x = \sin 3t$, $y = \sin t$ for $0 \le t \le 2\pi$ and the graphs of $x(t)$ and $y(t)$ against $t$. What kind of symmetry do $x$ and $y$ have around $t = \pi/2$ and how does this explain what you see for the track of the parametrized curve?

11. Sketch the track of $x = \sin t - t \cos t$, $y = \cos t + t \sin t$ for $-10 \le t \le 10$, indicating the direction of motion, and explain what “the rest” of the track (for other values of $t$) looks like. How do you know? Explain why the track is symmetric with respect to the y-axis. Find the values of $t$ in $[-10, 10]$ for which the track intersects itself and the coordinates of the intersections. (You will have to do this numerically.)

12. The hyperbolic cosine and sine functions are defined by

$$\cosh x = \frac{e^x + e^{-x}}{2}; \qquad \sinh x = \frac{e^x - e^{-x}}{2}.$$

(a) Verify the identity $\cosh^2 x - \sinh^2 x = 1$.

(b) Sketch the track of the parametrized curve x = cosh t, y = sinh t for −∞ < t < ∞, indicating the direction of motion and the time at which the curve meets or crosses any coordinate axis.

(c) What is the equation (in terms of x and y) of the track? What kind of curve is this?

13. Elizabeth and Jeanie are standing 10 meters apart, each holding a yam. Elizabeth throws her yam toward Jeanie with an initial horizontal velocity of 10 meters/second and an initial vertical velocity of 5 meters/second. Assuming Elizabeth takes enough of a windup so that Jeanie can time her own throw to start at the same time as Elizabeth’s, what is the simplest choice Jeanie can make for the vertical and horizontal velocity of her yam in order to be sure that it will hit Elizabeth’s yam in midair? How long after being thrown will the yams collide? How much can you say about the location of the collision point without doing any calculations? What additional information would you need to say more about the collision point? Draw a diagram showing the motion of both yams. (Do this first in order to guess the smart thing for Jeanie to do.)

14. Use technology (your calculator or Scientific Notebook) to sketch the track of the parametric equations x = sin 8t, y = sin 7t for 0 ≤ t ≤ 2π. Edit the plot so that there are enough points to make a smooth curve and so that there are equal scales on the x and y-axes.


(a) Does the parametric graph pass through the point (1, 1)? Defend your answer both graphically (for instance by plotting the functions sin 7t and sin 8t on the same set of ty-axes to decide whether they are ever both equal to 1 for the same value of t) and symbolically by working with the properties of sin 7t and sin 8t to determine whether they can ever both be equal to 1 for the same value of t.

(b) Now sketch the track of x = cos 8t, y = cos 7t. Describe the differences between motion along this track and along (sin 8t, sin 7t). Relate this to the even and odd symmetries of the trig functions involved.


2. RATES OF CHANGE

2.1 Introduction

A one sentence, very incomplete, description of calculus might be that it deals with the following two kinds of problems:

1. Given complete information about the position of an object as a function of time, find its velocity (= rate at which its position is changing), and

2. Given complete information about the velocity of an object and knowledge of its position at one instant, find its position at all times.

These are indeed two very important problems that calculus can solve. However, the importance of calculus and its central position in mathematics depends on the fact that it can deal with the relationship between any function and its rate of change, no matter where the function comes from. Here are three examples leading to the differential calculus, two of which may not be as familiar as position and velocity.

Example 1. Velocity. A contestant at the Bellingham Highland Games throws a weight up in the air. (At highland games the weight is thrown for height, not distance.) Its height in meters t seconds after leaving the thrower’s hand is $H(t) = 1.5 + 14t - 9.8t^2$.

Question: What is the velocity of the weight half a second after leaving the thrower’s hand? (Part of this question is to specify exactly what it means to talk about the velocity at a single instant when that velocity is changing with time.)

Example 2. Magnification. Imagine a situation where a light shines through a slide (represented by the left hand line below), passes through a complicated lens and hits a screen (represented by the right hand line below). In particular, suppose that the arrangement projects the point at position x on the slide onto the point at position $x^2$ on the screen. Thus 2 is carried to 4, 3 is carried to 9, and in fact the interval [2, 3] of length one on the slide is carried to the interval [4, 9] on the screen which is 5 times as long as [2, 3]. However the interval [0, 1/3] is projected onto the interval [0, 1/9] which is only one third as long as [0, 1/3]. So the amount of magnification depends on position. For x near 0 the lens reduces, but for large x it magnifies, and to a greater and greater extent as x grows.

Question: What is the magnification of the lens at x = 2? (Part of this question is to decide what it means to talk about the magnification at a point.)


Figure 2-1 A Strange Lens ($x^2$)

Example 3. Density. Imagine a non-homogeneous piece of string, that is, the mass of a unit length of string varies from one place on the string to another. In particular, suppose that for each positive number x, the mass of the left hand x centimeters of string is $x^2$ grams. Thus the first centimeter on the left has mass one gram, but the mass of the second centimeter of string is 3 grams (4 grams for the first 2 centimeters minus 1 gram for the left hand half of it), the mass of the third centimeter is 5 grams (= 9 − 4), etc.

Question: What is the linear density (= mass per unit length) of the string two centimeters from the left end? (Part of this question is to decide what it means to talk about density at a point.)

Discussion of Velocity. We can find the average velocity from t = 0.5 to some later time, say t = 1, from the formula (average velocity) = $\frac{\text{distance}}{\text{time}}$. We get

$$\text{average velocity} = \frac{H(1) - H(0.5)}{0.5} = \frac{5.7 - 6.05}{0.5} = -0.7,$$

that is, the average velocity during this half second interval is 0.7 meters per second downwards. (The weight is lower after one second than after half a second.) A look at the graph of H reveals that although the weight is definitely still going up after half a second, it reaches its maximum height at about t = 0.7 and so is well on its way down at t = 1. So this is not a very satisfactory estimate of what is happening after half a second. We can do better by looking at a smaller time interval, say from t = 0.5 to t = 0.6:

$$\text{average velocity} = \frac{H(0.6) - H(0.5)}{0.1} \approx \frac{6.37 - 6.05}{0.1} = 3.2,$$

or from t = 0.4 to t = 0.5:

$$\text{average velocity} = \frac{H(0.5) - H(0.4)}{0.1} \approx \frac{6.05 - 5.53}{0.1} = 5.2.$$

These two estimates suggest that the velocity might be about 4 meters per second near t = 0.5.


We could repeat these calculations for a shorter time interval, say 0.01 second instead of 0.1 second. Instead, let’s try to deal with an arbitrary time interval h. We get

$$\text{average velocity} = \frac{H(0.5 + h) - H(0.5)}{(0.5 + h) - 0.5} = \frac{\left(1.5 + 14(0.5 + h) - 9.8(0.5 + h)^2\right) - 6.05}{h} = \frac{4.2h - 9.8h^2}{h} = 4.2 - 9.8h.$$

Note that this calculation agrees with the previous one for h = ±0.1. When h = −0.1 both numerator and denominator are in the “wrong order” (both are negative), but the two negative signs cancel, so there is no effect on the result.

What happens as we look at shorter and shorter time intervals? It is apparent that the second term in the expression for average velocity shrinks down to zero as h approaches 0 and so the average velocity gets closer and closer to 4.2 meters per second.

It therefore seems reasonable to define the instantaneous velocity of the weight at t = 0.5 to be 4.2 meters per second.
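The shrinking-interval computation above is easy to repeat numerically. The following Python sketch tabulates the average velocity of the weight over intervals with one end at t = 0.5; the function names `H` and `average_velocity` are my own labels for illustration, and the particular values of h are arbitrary choices.

```python
def H(t):
    """Height in meters of the weight t seconds after it leaves the thrower's hand."""
    return 1.5 + 14 * t - 9.8 * t**2


def average_velocity(t0, h):
    """Average velocity over the interval between t0 and t0 + h (h may be negative)."""
    return (H(t0 + h) - H(t0)) / h


if __name__ == "__main__":
    # Intervals of shrinking length on both sides of t = 0.5.
    for h in (0.5, 0.1, -0.1, 0.01, -0.01, 0.001, -0.001):
        print(f"h = {h:7.3f}   average velocity = {average_velocity(0.5, h):.4f}")
```

Each printed value should agree with the formula 4.2 − 9.8h up to rounding in the computer's arithmetic, and none of them is the instantaneous velocity itself.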

IMPORTANT REMARK. The average velocity over a given time interval is a quantity that I have calculated directly from information about the height of the weight. The instantaneous velocity, on the other hand, is an abstraction created from the entire collection of average velocities. Note that it is not the same as any of the average velocities.

Discussion of magnification. The lens projects an interval from the slide onto an interval on the screen. Define the average magnification of the slide interval to be the ratio of the length of the screen interval to the length of the corresponding slide interval. For instance, the average magnification of the interval [2, 3] is 5, because the ratio of lengths is $\frac{5}{1} = 5$. Similarly, the average magnification of the interval [2, 2.5] is

$$\frac{2.5^2 - 2^2}{2.5 - 2} = \frac{6.25 - 4}{0.5} = \frac{2.25}{0.5} = 4.5$$

and the average magnification of the interval [1.5, 2] is

$$\frac{2^2 - 1.5^2}{2 - 1.5} = \frac{1.75}{0.5} = 3.5.$$

Instead of calculating more examples of average magnification, let’s try to investigate the average magnification of intervals with one endpoint at x = 2, say an interval from x = 2 to x = 2 + h. We want intervals both with 2 at the left hand end and with 2 at the right hand end, which we can accomplish by allowing h to have both positive values (2 is at the left end) and negative values (2 is at the right end). We find for the average magnification of the interval from 2 to 2 + h:

$$\frac{(2 + h)^2 - 2^2}{(2 + h) - 2} = \frac{4 + 4h + h^2 - 4}{h} = \frac{4h + h^2}{h} = 4 + h.$$

Note that this agrees with the previous computations: the average magnification is 4.5 when h = 0.5 and 3.5 when h = −0.5. Note also that when h < 0 we have written both numerator and denominator in “the wrong order”, that is, both are negative, but this doesn’t affect the result since the two negative signs cancel out. In other words we don’t have to treat h > 0 and h < 0 as two separate cases because e.g. $\frac{1.5^2 - 2^2}{1.5 - 2} = \frac{2^2 - 1.5^2}{2 - 1.5} = 3.5$.

Now we are ready to decide what to mean by the magnification at x = 2. We notice that the average magnification of any short interval with x = 2 as one endpoint is near to 4, and depends on the length of the interval in such a way that the shorter the interval, the closer the average magnification is to 4. In other words, the average magnification of an interval with 2 as endpoint approaches 4 as the length of the interval approaches 0. In these circumstances we say that the magnification of the lens at x = 2 is 4.

IMPORTANT REMARK. Note that average magnification is a quantity that we can measure directly. Magnification at a point, on the other hand, is an abstraction created out of a collection of average magnifications. It is not at all clear how to measure it directly.
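The claim that h > 0 and h < 0 need not be treated as separate cases can be checked directly. In the Python sketch below, `magnification_by_lengths` puts the endpoints in order and divides screen length by slide length, while `magnification_signed` uses the signed quotient with h of either sign. Both function names are hypothetical, and the sketch assumes slide positions are positive, so that the lens preserves the order of the endpoints.

```python
def screen_position(x):
    """The lens of Example 2 sends slide position x to screen position x**2."""
    return x**2


def magnification_by_lengths(a, b):
    """Screen length divided by slide length, with the endpoints put in order first.

    Assumes a and b are positive and different.
    """
    lo, hi = min(a, b), max(a, b)
    return (screen_position(hi) - screen_position(lo)) / (hi - lo)


def magnification_signed(a, h):
    """The quotient ((a + h)**2 - a**2) / h, where h may be positive or negative."""
    return (screen_position(a + h) - screen_position(a)) / h


if __name__ == "__main__":
    for h in (0.5, -0.5, 0.1, -0.1, 0.01, -0.01):
        by_lengths = magnification_by_lengths(2, 2 + h)
        signed = magnification_signed(2, h)
        print(f"h = {h:6.2f}   by lengths = {by_lengths:.4f}   signed quotient = {signed:.4f}")
```

The two columns should agree, which is the numerical counterpart of the two negative signs cancelling.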

Discussion of linear density. The average linear density of a piece of string is simply the mass of a piece divided by its length. Since the function $x^2$ represents the mass in grams of the piece from 0 to x, we find the mass of a piece located away from x = 0 by subtraction. For instance, the mass of the interval [0, 3] is $3^2 = 9$ grams and the mass of [0, 2] is $2^2 = 4$ grams, so the mass of [2, 3] is $3^2 - 2^2 = 5$ grams. The average linear density of this piece is then 5/1 = 5 grams/cm. (For brevity, we’ll abbreviate ‘linear density’ to ‘density’ in what follows.)

Let’s study systematically the average density of short pieces of string with one end at x = 2. Suppose the string extends from x = 2 to x = 2 + h, where h may be either positive or negative. (If h > 0, the string covers the interval [2, 2 + h]; if h < 0, the string covers the interval [2 + h, 2].) For such a piece of length |h|, the average density is mass divided by length or

$$\frac{(2 + h)^2 - 2^2}{(2 + h) - 2} = \frac{4 + 4h + h^2 - 4}{h} = \frac{4h + h^2}{h} = 4 + h.$$

Again the average density is close to 4 for short pieces of string, and the shorter the string, the closer the average density is to 4.

In this situation we say that the linear density of the string at the point x = 2 is 4.


IMPORTANT REMARK: Note that average density is a concrete concept that we can measure directly. Density at a point is an abstraction that we have defined in terms of a whole collection of average densities, but it is far from clear how to measure it directly.

2.2 The Derivative of a Function at a Point

In each of the three examples of the preceding section, the mathematics is exactly the same. We look at quotients representing the average rate of change of a quantity (magnification, density, position) over a small change in the independent variable (time in Example 1, distance in the others), and notice that as we look at the average rates of change over smaller and smaller changes in the independent variable, they seem to approach some definite limiting value. This limiting value seems to represent the rate of change (or instantaneous rate of change for emphasis) at the given point. The interpretation is different in each case, but the mathematics is the same for all three cases.

This is a situation ripe for the mathematician’s characteristic activity of abstraction and generalization. That is, we will pull the mathematical operation out of context and define it separately. In this way we hope to get a general tool that can then be applied to a wide variety of different situations, including situations quite different from any of the three that we have used as motivation.

The definition, as you recall, goes like this.

Suppose that the function f is defined for all values of x near some point a. The derivative of f at a, denoted $f'(a)$ or $\frac{df}{dx}(a)$, is the number that the difference quotients

$$\frac{f(a + h) - f(a)}{h}$$

approach as h approaches 0, if there is any such number. Symbolically we write

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$

Note that the two expressions defining the derivative are really just two ways of writing the same thing. The connection is the substitution x = a + h or h = x − a. Each form of writing the difference quotient is the most convenient at times, so you should be comfortable with both forms. Note also that the definition requires looking at both positive and negative values of h (or, equivalently, at values of x both less than a and greater than a).
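As an illustration of the two forms, here is a short Python sketch that evaluates the difference quotient both ways and applies the same function to two of the motivating examples. The helper names (`quotient_h_form`, `quotient_x_form`, `height`, `square`) are assumptions made for this sketch, not notation used elsewhere in these notes.

```python
def quotient_h_form(f, a, h):
    """Difference quotient written with the increment h: (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h


def quotient_x_form(f, a, x):
    """The same quotient written with the second point x: (f(x) - f(a)) / (x - a)."""
    return (f(x) - f(a)) / (x - a)


def height(t):
    """Height function of Example 1 (velocity)."""
    return 1.5 + 14 * t - 9.8 * t**2


def square(x):
    """The function x**2 of Examples 2 and 3 (magnification and density)."""
    return x**2


if __name__ == "__main__":
    for h in (0.1, -0.1, 0.001, -0.001):
        print(
            f"h = {h:7.3f}"
            f"   velocity: {quotient_h_form(height, 0.5, h):.4f}"
            f"   x**2, h form: {quotient_h_form(square, 2.0, h):.4f}"
            f"   x**2, x form: {quotient_x_form(square, 2.0, 2.0 + h):.4f}"
        )
```

One function serves all three examples, which is the point of pulling the operation out of context. The two forms may differ in the last few digits because x − a is computed with rounding.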


2.2.1 The Graphical Interpretation—Secant Lines and Tangent Lines

As you know, there is another very important interpretation of difference quotients and the derivative. The difference quotient is the slope of the line joining two points on the graph of f, the points (a, f(a)) and (a + h, f(a + h)). Such a line is called a secant line.

[Figure: Secant Line. The secant line joins the points (a, f(a)) and (a + h, f(a + h)) on the graph of f.]

Diagrams suggest that as h → 0, the secant lines through the fixed point (a, f(a)) approach more and more closely a line through the point (a, f(a)) that is tangent to the graph of f at this point. Thus f′(a), the limit of the slopes of the secant lines, becomes the slope of the tangent line.

[Figure: Secant lines and tangent line.]

It is not always made clear that when we say this, we are really defining the tangent line to the graph of f to be the line through (a, f(a)) with slope f′(a), but this is what we are really doing. Pictures of familiar functions convince us that this is a reasonable thing to do, but the fact is that there is no other simple and general way to characterize which line is the tangent line. (It is common to think of the tangent line as the one that intersects the curve at the given point and no other, or at least no other point nearby. This is most often true, but it turns out that it is not always true, no matter how “nearby” is defined. An example will be given in the exercises at the end of this chapter. Thus it cannot characterize what it means to be a tangent line in the calculus sense and so cannot be the definition.)

2.2.2 Interpreting the Derivative

We have seen that for any function f, the value f′(a) of the derivative represents the instantaneous rate of change of f at a. For a function that represents some concrete situation, it can be very useful to consider the units involved in order to interpret the information provided by the derivative. To interpret the units it is usually best to be concrete by referring to the average rate of change rather than the instantaneous rate, that is, by saying that a unit change (or some other specific amount of change) in the input produces approximately (whatever amount) of increase or decrease in the output. It is best to avoid phrases containing the word “rate” since in this context they tend to serve as a phrase that you can say without thinking about what it means.

EXAMPLE 1. We know that the temperature F in degrees Fahrenheit is given in terms of the temperature C in degrees Celsius by

$$F = \frac{9}{5} C + 32.$$

Thus $\frac{dF}{dC} = \frac{9}{5}$. This means that each increase of 1 degree Celsius corresponds to an increase of $\frac{9}{5}$ degrees Fahrenheit.

EXAMPLE 2. The derivative of $f(x) = x^2$ at x = 2 is 4. What does this tell us about the graph of f?

The slope of the tangent line at x = 2 is 4. Thus, along the tangent line, any change in x causes a change of exactly four times as much in y. For instance, the point on the tangent line with x = 2.2 will have y = 4.8. The point with x = 1.8 will have y = 3.2. Since the tangent line is close to the graph of f for a short distance, the change in the y value along the graph of $x^2$ will be approximately 4 times the change in x, that is, we expect that f(1.8) ≈ 3.2 and f(2.2) ≈ 4.8. (In fact f(1.8) = 3.24 and f(2.2) = 4.84.) Thus we can estimate an actual change by using the instantaneous rate of change, though the result is not exact. Note that referring here to a unit change in x is not a good idea since the graph will have diverged from the tangent line by quite a bit when x changes by 1. So we might say instead that near x = 2, if x increases by 0.1, then $x^2$ increases by about 0.4.

[Figure: $x^2$ solid; tangent line dashed. Shown for x from 1.9 to 2.2, with the points (2, 4), (2.2, 4.84) and (2.2, 4.8) marked.]
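The comparison in Example 2 between the tangent line and the graph can be tabulated. The sketch below is a minimal Python illustration; the function names are my own, and the tangent line is written in point-slope form using the slope 4 stated in the example.

```python
def f(x):
    """The function of Example 2."""
    return x**2


def tangent_line(x, a=2.0, slope=4.0):
    """Point-slope form of the tangent line to x**2 at a = 2, where the slope is 4."""
    return f(a) + slope * (x - a)


# Compare the estimate from the tangent line with the actual value of x**2.
for x in (1.8, 1.9, 2.1, 2.2, 3.0):
    estimate = tangent_line(x)
    actual = f(x)
    print(f"x = {x:.1f}   tangent line: {estimate:.2f}   x**2: {actual:.2f}   error: {actual - estimate:.2f}")
```

The last row (x = 3.0) is included to show why a unit change in x is a poor choice here: the error grows from a few hundredths to a full unit.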

EXAMPLE 3. Suppose that the cost in dollars of building a house whose floor area is x square feet is given by the function C(x). What are the units for the function C′? What is the practical meaning of the statement C′(1300) = 45?

Think of the average rate of change of the cost for a one square foot increase in area at a given area a. That would be

$$\frac{C(a + 1) - C(a)}{1}.$$

The numerator is an amount in dollars and the denominator is an area in square feet. This says that the units of the rate of change (either average or instantaneous) are dollars per square foot.

In particular, for a = 1300 (a 1300 square foot house) the average rate of change for a small change in area (such as one square foot) would be approximately the instantaneous rate of change of 45, that is, the practical meaning of C′(1300) = 45 is that for a 1300 square foot house, each extra square foot of area would cost about an extra $45.

EXAMPLE 4. Suppose that H(t) represents the average height in inches of a boy t years old. What are the units for H′? What is the practical meaning of the statement H′(10) = 2?

An average rate of change, say for one year beginning at age 10, $\frac{H(11) - H(10)}{1}$, would be a number of inches divided by a number of years. Thus the units of H′ are inches per year. The statement H′(10) = 2 means that an average boy will grow about two inches between his tenth and eleventh birthdays.

Important Comment—Average and Instantaneous Rates of Change

In Example 1 the statement that each increase of 1 degree Celsius corresponds to an increase of $\frac{9}{5}$ degrees Fahrenheit is exact. The estimate in Example 2 that near x = 2 any change in x will cause a change of about four times as much in $x^2$ is approximate. Similarly, the statements in Example 3 that the extra cost of one square foot is about $45 and in Example 4 that the average ten year old boy will grow about 2 inches during the year are approximate, even if we assume that the derivative statements are exact. They are approximate because we are using an instantaneous rate to predict an actual change over a period of time. Graphically we are using the slope of the tangent line to predict changes along the actual graph of the curve and this will be exact only when the function is linear as in Example 1.

2.2.3 Numerical Approximation of the Derivative

You have already learned many rules for computing the derivatives of standard calculus functions symbolically. You may not have spent much time computing difference quotients, since this is rather tedious to do by hand. It is, however, much easier to do with a programmable calculator (or a computer), and at times can be easier than symbolic calculation by hand. For instance, given enough time and paper you could no doubt find the derivative of $f(x) = \sqrt[3]{\sin(x^x)}$ at x = 2 symbolically. However it would be a hassle. With a TI-89 it is no harder than just punching in this function as y1, entering (y1(2 + h) − y1(2))/h on the home screen and evaluating at the values of h to compute the following table:

h          (f(2 + h) − f(2))/h
.01        -1.6907
-.01       -1.8641
.001       -1.7682
-.001      -1.7855
.0001      -1.7760
-.0001     -1.7777

From this we may conclude that f′(2) ≈ −1.78. We could get a somewhat more accurate approximation by considering even smaller values of h, but there is a limit to this: the TI-89 for instance carries 14 decimal digits internally and reports up to 12. Sometimes the effect is to limit approximations to fewer, perhaps many fewer, digits (see exercise 11 at the end of this section).
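The same table can be produced on a computer instead of a TI-89. The Python sketch below is one way to do it, under two assumptions of mine: the cube root is taken to be the real cube root (sin 4 is negative, and Python's `**` operator does not return a real number for a fractional power of a negative float), and the names `cube_root`, `f` and `difference_quotient` are illustrative only.

```python
import math


def cube_root(u):
    """Real cube root of u, valid for negative u as well."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)


def f(x):
    """The cube root of sin(x**x)."""
    return cube_root(math.sin(x**x))


def difference_quotient(h, a=2.0):
    """(f(a + h) - f(a)) / h for the function above."""
    return (f(a + h) - f(a)) / h


# Same values of h as in the table, in the same order.
for h in (0.01, -0.01, 0.001, -0.001, 0.0001, -0.0001):
    print(f"{h:8.4f}   {difference_quotient(h):.4f}")
```

A computer's floating point arithmetic carries a different number of digits than the calculator does, so the last printed digit may differ from the table.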

[Figure: Secant lines and tangent line. Secant lines are shown for h < 0 and for h > 0, together with the tangent line.]

The importance here of using both positive and negative values of h is that, as we see in the diagram above, the difference quotients for different signs of h bracket the derivative. Since the derivative is between them, we can judge how accurate the difference quotients are as estimates by noticing the size of the gap between what we get for positive h and what we get for negative h. Notice that this gap shrinks, as it should, as |h| shrinks. If we need to know the derivative to some prescribed amount of accuracy, watching the gaps tells us when we can quit computing.

What determines whether the difference quotients for positive and negative h trap the derivative between them? A look at the diagram above suggests that when the curve is concave up, difference quotients for positive h will be greater than the derivative and difference quotients for negative h will be less than the derivative. On the other hand, the diagram below suggests that when the graph is concave down the situation is reversed: a difference quotient for positive h will be less than the derivative, and a difference quotient for negative h will be more than the derivative.

[Figure: Secant lines and tangent line for a graph that is concave down, with secant lines for h < 0 and for h > 0; axes x and y.]
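The idea of watching the gap to decide when to quit can be turned into a small procedure. The following Python sketch is one possible version, not a prescribed method: `one_sided_quotients` and `bracket_derivative` are hypothetical names, the starting value h = 0.1 and the halving strategy are arbitrary choices, and the bracket is trustworthy only if the graph keeps the same concavity between a − h and a + h.

```python
import math


def one_sided_quotients(f, a, h):
    """Return the difference quotients of f at a for -h and for +h (with h > 0)."""
    left = (f(a - h) - f(a)) / (-h)
    right = (f(a + h) - f(a)) / h
    return left, right


def bracket_derivative(f, a, tolerance, h=0.1, max_halvings=30):
    """Halve h until the two one-sided quotients differ by less than tolerance.

    Returns (low, high, h). The derivative lies between low and high only if
    the graph keeps the same concavity between a - h and a + h.
    """
    for _ in range(max_halvings):
        left, right = one_sided_quotients(f, a, h)
        if abs(right - left) < tolerance:
            return min(left, right), max(left, right), h
        h /= 2
    raise ValueError("gap did not fall below the tolerance")


if __name__ == "__main__":
    # Concave up: x**2 at 2 (derivative 4). The quotient for h > 0 is the larger one.
    print(one_sided_quotients(lambda x: x**2, 2.0, 0.1))
    # Concave down: sqrt(x) at 1 (derivative 0.5). The quotient for h > 0 is the smaller one.
    print(one_sided_quotients(math.sqrt, 1.0, 0.1))
    # Shrink h until the gap is below 0.0001.
    print(bracket_derivative(math.sqrt, 1.0, 1e-4))
```

The cap on the number of halvings is there because for very small h the rounding of the computer's arithmetic eventually dominates the gap.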

Of course the remarks above about the difficulties of computing some derivatives symbolically must be interpreted in the right way. Symbolic manipulators like the TI-89 have no difficulty with the example above: you can verify that it says

$$\frac{d}{dx}\left(\sqrt[3]{\sin(x^x)}\right) = \frac{\cos(x^x)\, x^x (\ln x + 1)}{3 \left(\sin(x^x)\right)^{2/3}}$$

and that the value of this function at x = 2 is −1.77686. But even in this age of symbolic manipulators there are occasions when looking at difference quotients is necessary. Several examples are explored in the last two sections of this chapter.
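For readers without a symbolic manipulator at hand, the closed form quoted above can at least be evaluated numerically. This Python sketch does only that; it does not derive the formula. The name `f_prime` is my own, and I am assuming that the power 2/3 means the square of the real cube root, so that the denominator is positive even where the sine is negative.

```python
import math


def f_prime(x):
    """Evaluate cos(x^x) x^x (ln x + 1) / (3 sin(x^x)^(2/3)) for x > 0."""
    u = x**x
    # sin(u)^(2/3) taken as the real cube root of sin(u)^2, which is positive
    # even when sin(u) < 0 (as at x = 2, where u = 4).
    sin_two_thirds = (math.sin(u) ** 2) ** (1.0 / 3.0)
    return math.cos(u) * u * (math.log(x) + 1) / (3 * sin_two_thirds)


print(f"{f_prime(2.0):.5f}")
```

The printed value can be compared with the table of difference quotients for the same function, where it should fall between the entries for h and −h.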

EXERCISES.

1. Let f(x) be the elevation in feet of the North Fork of the Nooksack River x miles from its source. What are the units of f′(x)? What is the sign of f′? What is the practical meaning of the quantity f′(20), that is, what does it say about actual change in height along the river? Explain your answers with a diagram.

2. Suppose that C(r) is the total cost in dollars of paying off a car loan borrowed at an annual interest rate of r%. What are the units of C′(r)? What is the practical meaning of the quantity C′(7)? What is the sign of C′?

3. Suppose that P(t) is the monthly payment in dollars on a fixed rate $100,000 mortgage which will take t years to pay off. What are the units of P′(t)? What is the sign of P′? What is the practical meaning of the quantity P′(30)?

4. Suppose that g(v) is a car’s gas consumption in miles per gallon at speed v miles per hour. What are the units of g′(v)? What is the practical meaning of the statement g′(65) = −0.3?

5. Let P(x) be the number of people in the US whose height is x inches or less. Recall that you drew a rough graph of this function in problem 4 of section 1.7.

(a) Thinking of the graph as a smooth curve instead of the graph of an integer-valued function (it would be impossible to tell the difference by eye with a normal-sized graph) what are the units of P′(x)?

(b) What is the practical meaning of P′(66)?

(c) Use your graph from the earlier problem and a difference quotient to make a very rough estimate of P′(66). (Just try to get the order of magnitude right. Would it be thousands? Tens of thousands? Millions? Billions?)

6. Let x be the distance in miles from Red Square along the great circle from the classroom through the North Pole, then the South Pole, and finally back to Red Square. Let T(x) be the temperature in degrees Fahrenheit x miles from the classroom, where we start out going north. The North Pole is approximately x = 2850, the South Pole is approximately x = 15280, and we are back to Red Square at about x = 24860. At the time of writing this problem (August 2008) the temperature in Red Square is 64°, the temperature at the North Pole is about 30° and the temperature at the South Pole is about −52° (it’s winter there!)

(a) The Canadian border is about 17 miles north of Red Square. Suppose that I tell you that T′(17) = −2. What is the practical meaning of this statement? Do you think the graph is approximately linear with this slope between Red Square and the border? Explain. Sketch a graph of T for 0 ≤ x ≤ 25 that is consistent both with T′(17) = −2 and with your best guess about the temperature at the border.

(b) Use the data at the beginning of the problem to sketch a (very) rough graph of T for 0 ≤ x ≤ 24860. Use appropriate scales on the axes!

(c) Use your sketch from (b) to answer these questions:

i. For x between 0 and 2850, do you expect T′ to be always (or almost always) positive, always (or almost always) negative, or some of both?

ii. What would you expect a typical value of T′ to be? Do you believe what I told you in part (a)? Why or why not?

7. Use a table of difference quotients to estimate the derivative of $2^x$ at x = 1 correct to three decimal places. Explain, with a diagram of secant lines and the tangent line, how you know that you are that close.

8. Same as #7 for the derivative of $2^x$ at x = −1.

9. Same as #7 for the derivative of $\sin(x^2)$ at $x = \sqrt{\pi}$. (For best results enter the point as $\sqrt{\pi}$, not as a decimal approximation.)

10. Evaluate the limit $\lim_{x \to 2} \frac{\sqrt{6 - x} - 2}{2 - x}$ by making the change of variable u = 2 − x to convert the expression to a difference quotient for a familiar function whose derivative you know. Do not use, for instance, L’Hospital’s Rule.

11. I tried to estimate $\lim_{x \to 0} \frac{\sqrt{x^2 + 4} - 2}{x^2}$ by plugging successively smaller values of x into my TI-89 and got the table below. What conclusion(s) should I draw?

x                       10^-1    10^-2     10^-3   10^-4   10^-5   10^-6   10^-7
(√(x^2 + 4) − 2)/x^2    .24984   .249998   .25     .25     .25     .2      0


2.3 Average and Instantaneous Rates of Change, Limits

It is essential to understand the difference between these two kinds of rate of change and the relationship between them. An average rate of change is a very straightforward quantity: the change of a quantity (e.g. distance from a reference point) divided by a change in the independent variable (e.g. time). Graphically an average rate of change is rise over run, the slope of a secant line. In practical situations an average rate of change (or at least its two constituent parts) can be directly measured.

[Figure: Secant line through (a, f(a)) and (a + h, f(a + h)), with the run h and the rise f(a + h) − f(a) marked.]

An instantaneous rate of change is much more mysterious. It is a limit of average rates of change, a number that the average rates of change approach as the denominator (hence also the numerator) shrinks down to zero. This is a much more complicated idea conceptually; it is really a mathematician’s idealization that cannot be directly measured. Its value lies in the power and convenience of the remarkable collection of rules for manipulating instantaneous rates of change (the differential calculus!), a collection that has no counterpart for average rates of change.

What exactly is a limit, and why do people make such a fuss about them?

Unfortunately, attempts to answer the first question tend to obscure the answer to the second question.

Very roughly, the limit, say, of the values f(x) of a function f as x approaches some real number a is the real number L (if there is one) that the values f(x) approach as x approaches a. To make this into a real definition, we would have to replace the vague word “approach” by something more precise in both places where it appears.

The idea of a limit is often illustrated by an example something like the following. What is the limit of $\frac{x^2 - 1}{x - 1}$ as x approaches 1? We cannot just substitute x = 1 into the expression since it then becomes $\frac{0}{0}$, but we can rewrite the expression to avoid this problem. We just factor $x^2 - 1 = (x - 1)(x + 1)$ and then cancel factors (which is legal for any x ≠ 1) to get

$$\frac{x^2 - 1}{x - 1} = \frac{(x - 1)(x + 1)}{x - 1} = x + 1.$$

Now it is clear that the expression approaches 2 as x approaches 1; in fact we can just plug x = 1 into the simplified expression x + 1 to get our 2. Some experimentation may convince you that as long as we stick to rational functions of x (or even sums of fractional powers of x), it is always possible to do something like this manipulation to produce a simpler expression into which we can just substitute the limiting value of x.

If that was all there was to limits, there would be no need for fuss. Unfortunately (or fortunately, if you like a challenge) “most” limits are less straightforward than this one.

Another example that you have probably seen before: to find the derivative of the sine function at x = 0 we must find the limit of $\frac{\sin(h) - \sin 0}{h} = \frac{\sin(h)}{h}$ as h approaches 0. Now there is no way to rewrite the expression that will allow us just to plug in the value h = 0. We can, however, try to guess the limit by drawing the graph of $\frac{\sin h}{h}$ for h near 0 (or equivalently, of $\frac{\sin x}{x}$ for x near 0). We get the picture on the left, or if we zoom way in, the picture on the right.

[Figure: two graphs of (sin x)/x against x. Left: x from −4 to 4. Right: zoomed in to x from −0.010 to 0.010, with y from 0.999985 to 1.000010.]
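A table of values gives the same kind of evidence as the graphs. The Python sketch below prints the quotient for h = 10^-1 down to 10^-8 on both sides of 0; the function name `sine_quotient` and the range of exponents are choices made for this illustration.

```python
import math


def sine_quotient(h):
    """The difference quotient (sin(h) - sin(0)) / h for the sine function at 0."""
    return math.sin(h) / h


for k in range(1, 9):
    h = 10.0 ** (-k)
    print(
        f"h = 1e-{k}"
        f"   sin(h)/h = {sine_quotient(h):.17f}"
        f"   sin(-h)/(-h) = {sine_quotient(-h):.17f}"
    )
```

Once h is small enough, the printed values will most likely be indistinguishable from 1 at the precision the computer carries. That is suggestive, but like the zoomed graph it cannot tell a limit of exactly 1 apart from a limit that differs from 1 by less than the computer can resolve.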

It is certainly tempting to guess from these graphs that the limit exists and is equal to 1. If we are properly sceptical, however, we have to concede that even the closeup view does not really establish that the limit is exactly 1. The smallest vertical distance that we can see by eye in the right hand graph is about $10^{-6}$. So if the limit exists, but equals, say, 1.00000000123456789, the right hand graph would not look any different. We may think that that is not very likely, but the point is that it is only our fuzzy and rather optimistic faith that nature (or mathematics) would not play nasty tricks on us that supports our belief that the limit is 1 and not some other number close to 1.

There are two points for discussion here. One is that mathematicians like exact answers when they can get them, and in any case like to distinguish between “exactly” and “nearly”. I have already said something about this in section 1.8.6. We will talk in class about the fact that the whole formal computational scheme of calculus that you spent so much time on in a previous course depends on quantities being equal rather than just close to one another. (If close was good enough, we wouldn’t need limits such as derivatives or integrals at all.) The other is that while our optimistic views of what we believe are very often correct, they are sometimes wrong. The history of precision in mathematical analysis (the branch of mathematics that deals in limits) is to a great extent the history of mathematicians mending their ways after having been burned: discovering that something that “could not be true”, is true. (Or that something that “must be true”, is not.) Analysts have learned the hard way that when they depend solely on their intuition or on physical analogies, they are at least occasionally wrong. This is embarrassing and not really necessary if we take care to avoid it. So precise definitions and careful proof methods have developed over the past two centuries as a mechanism to refine intuition and avoid even occasional mistakes.

In this course we will not go into a really rigorous treatment of limits. (See Math 226 for that.) But the technicalities of rigorous limits can be viewed, as so often in analysis, as just writing down the inequalities that an appropriate picture tells you to write down. So here we will concentrate on interpreting diagrams and on the connection between diagrams and equations or inequalities.

EXERCISES.

1. On a sketch like the left diagram below, indicate a length or slope as appropriate to represent each of

(a) $f(3)$ (b) $f(3) - f(1)$ (c) $\dfrac{f(3) - f(1)}{3 - 1}$

2. Referring to the left diagram below, indicate which of each pair of numbers is greater. Explain in each case.

(a) $f(2)$ or $f(3)$ (b) $f(3) - f(2)$ or $f(2) - f(1)$ (c) $\dfrac{f(2) - f(1)}{2 - 1}$ or $\dfrac{f(3) - f(1)}{3 - 1}$.

[Figure: three diagrams (left, middle, right), each showing a graph of f against axes x and y.]

3. On a sketch similar to the middle diagram above, pick a value a on the horizontal axis, a positive number h and mark lengths corresponding to each of the following:

(a) $f(a)$ (b) $f(a+h)$ (c) $f(a+h) - f(a)$ (d) $h$ (e) Indicate the slope of a line representing $\dfrac{f(a+h) - f(a)}{h}$.

4. On a sketch similar to the right diagram above, pick a value a on the horizontal axis, a positive number h and mark lengths corresponding to each of the following: (Note the phrase “corresponding to.” Also indicate which of the numbers below are negative.)

(a) $f(a)$ (b) $f(a+h)$ (c) $f(a+h) - f(a)$ (d) $h$ (e) Indicate the slope of a line representing $\dfrac{f(a+h) - f(a)}{h}$.

5. Repeat the previous problem using a negative value for h.

6. From The New York Times of May 10, 2007:

“In a reversal of recent trade trends, the trade deficit with China improved in March, growing by $17.2 billion, compared with an increase of $18.4 billion the month before.”

Here are statistics from the Census Bureau for US trade with China for the first six months of 2007 in billions of dollars:

Month   Exports   Imports   Balance
Jan     4.364     25.635    -21.271
Feb     4.631     23.065    -18.434
Mar     5.479     22.725    -17.246
Apr     4.849     24.223    -19.374
May     5.323     25.338    -20.015
June    5.900     27.061    -21.161

(a) Let T (n) be the monthly trade deficit with China in the n-th month of 2007 in billions of dollars (the absolute value of the last column of the table). State in terms of T what is equal to 17.2 and what is equal to 18.4.

(b) How is T related to the quantity C that the quote calls “the trade deficit?” Define carefully what is meant by C (n). Note that C should grow by 18.4 in February and 17.2 in March. (There may be some choice for C. Make a reasonable choice.)

(c) In terms of your definition of C from the previous part, does the author of the quote really mean to say that C “improved” in March? Express more accurately what improved in March.

(d) Graph both C and T using a numerical scale of 1 through 6 on the horizontal axis and a scale in billions (or tens of billions) on the vertical axis.


2.4 Local Linearity

It is not much of an exaggeration to say that the whole point of the differential calculus is that a function is “almost linear” near any point at which it is differentiable. We will see many applications of “local linearity” in these notes. Pictorially the graph of the function near a point a is almost the same as the graph of the tangent line to the graph at a. One can recognise this graphically by zooming in on a graph and noticing that the farther in you zoom, the more the graph looks like the graph of a straight line. In fact, on a calculator or a computer it will actually be the graph of a straight line after a while (or at least as close to a straight line as your device can manage, which is not all that close for many graphing calculators) because of limited screen resolution. In reality (as usual in mathematics this means in your imagination as opposed to something you physically see!) the graph will never be quite straight unless it was straight to begin with.

This is the graphical test for differentiability of a function. To illustrate it, here is a series of closeups for a function at a point at which it is not differentiable ($|x|$ at $x = 0$) and for an apparently similar function which is differentiable at the same point ($\sqrt{x^2 + 0.0001}$ at $x = 0$).

[Figure: three successively closer views near x = 0 of $|x|$, followed by three views of $\sqrt{x^2 + 0.0001}$, with horizontal ranges of about ±1, ±0.02 and ±0.001.]
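The zooming test can also be imitated numerically. The following is only a minimal sketch, not part of the original notes: the helper names `one_sided_slopes`, `corner` and `rounded` are made up for illustration. It compares the slopes seen just to the left and just to the right of 0 for the two functions in the closeups.

```python
import math

def one_sided_slopes(f, a, h):
    """Return (left, right) difference quotients of f at a for a step h > 0."""
    right = (f(a + h) - f(a)) / h
    left = (f(a - h) - f(a)) / (-h)
    return left, right

def corner(x):
    # |x|, which has a sharp corner at 0
    return abs(x)

def rounded(x):
    # sqrt(x^2 + 0.0001), the apparently similar function from the text
    return math.sqrt(x * x + 0.0001)

for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, one_sided_slopes(corner, 0.0, h), one_sided_slopes(rounded, 0.0, h))
```

For the corner the two slopes stay apart however small h becomes, while for the rounded function both slopes shrink toward a common value, which is the numerical counterpart of the graph straightening out under magnification.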

Before looking at the situation algebraically, we’ll look at some numerical data. What is the distinction between the tangent line to a curve at a point and any other line through the same point?

[Figure: Secant lines and tangent line. The sketch shows a secant line for h < 0, a secant line for h > 0, and the tangent line.]

Let’s look at a simple example, say the tangent line to $f(x) = x^3$ at $x = 1$. We know that the derivative of $x^3$ is $3x^2$ so that the derivative of f at $x = 1$ is 3. Thus the point-slope equation of the line tangent to $x^3$ at $x = 1$ is

$$y - 1 = 3(x - 1) \quad\text{or}\quad y = 3x - 2.$$

How does that approximate $x^3$ better than $y = 2x - 1$ or $y = 4x - 3$, both of which also pass through the point (1, 1)? Here is a table showing for a range of values of x the respective values of $x^3$, $2x - 1$, $3x - 2$, and $4x - 3$.

x       x^3        2x - 1   3x - 2   4x - 3
1       1          1        1        1
1.5     3.375      2        2.5      3
1.1     1.331      1.2      1.3      1.4
1.01    1.030301   1.02     1.03     1.04
1.001   1.003003   1.002    1.003    1.004
0.99    0.970299   0.98     0.97     0.96
0.999   0.997003   0.998    0.997    0.996

Here are graphs of $x^3$ and the linear functions near $x = 1$.

[Figure: graphs of $x^3$ (thick) and the lines through (1, 1), including $2x - 1$ and $4x - 3$, for x between 0.96 and 1.04.]


The graph of $y = 3x - 2$ is slightly below the graph of $x^3$ except at $x = 1$, and is clearly somewhat closer to $x^3$ than either of the other two lines, but we will need to work a little harder to make the qualitative difference clear.

Now here is a table of the differences between the value on each line and the value on the curve $y = x^3$ at the same values of x. One of the differences for $x^3 - (2x - 1)$ is illustrated in the diagram just below.

x        x^3 - (2x - 1)   x^3 - (3x - 2)   x^3 - (4x - 3)
1        0                0                0
1.5      1.375            0.875            0.375
1.1      0.131            0.031            -0.069
1.01     0.010301         0.000301         -0.009699
1.001    0.001003         0.000003         -0.000997
1.0001   0.000100         0.00000003       -0.000100
0.99     -0.009701        0.000299         0.010299
0.999    -0.000997        0.000003         0.001003
0.9999   -0.000100        0.00000003       0.000100

[Figure: The difference $x^3 - (2x - 1)$]
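A table of differences like this one is easy to regenerate and extend. The sketch below is an illustration rather than the author's method; the helper names `line` and `difference_row` are invented here, and every line through (1, 1) is written as y = 1 + m(x − 1) for a slope m.

```python
def cube(x):
    return x ** 3

def line(m, x):
    """Value at x of the line through (1, 1) with slope m."""
    return 1 + m * (x - 1)

def difference_row(x, slopes=(2, 3, 4)):
    """Curve minus line, for each slope, at a single x."""
    return [cube(x) - line(m, x) for m in slopes]

for x in (1.5, 1.1, 1.01, 1.001, 1.0001, 0.99, 0.999, 0.9999):
    d2, d3, d4 = difference_row(x)
    print(f"{x:<7} {d2: .8f} {d3: .8f} {d4: .8f}")
```

One thing worth watching in the output is the sign of the middle column: the curve stays on one side of the tangent line near x = 1, whereas it crosses each of the other two lines at the point (1, 1).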

If we follow down any of the three right hand columns in the table above, we see that the numbers in each column get smaller as the difference between x and 1 decreases and that in each column they appear to be approaching 0. This just says that each line is getting closer to $y = x^3$ (measured vertically) as x gets closer to 1. What is different about the column for $3x - 2$? Well, the vertical differences are smaller than for either $2x - 1$ or $4x - 3$, but there is more to see than this. For either $2x - 1$ or $4x - 3$ (or indeed any line through (1, 1) except the tangent line) the difference between the value on the line near $x = 1$ and the value of $x^3$ near $x = 1$ is approximately proportional to $x - 1$. To look at the difference for just those two columns, compare the third and fourth columns of the next table to the second column:

                  Difference                        Difference/(x - 1)
x       x - 1    x^3 - (2x - 1)   x^3 - (4x - 3)   [x^3 - (2x - 1)]/(x - 1)   [x^3 - (4x - 3)]/(x - 1)
1       0        0                0                -                          -
1.5     0.5      1.375            0.375            2.75                       0.75
1.1     0.1      0.131            -0.069           1.31                       -0.69
1.01    0.01     0.010301         -0.009699        1.0301                     -0.9699
1.001   0.001    0.001003         -0.000997        1.003                      -0.997
0.99    -0.01    -0.009701        0.010299         0.9701                     -1.0299
0.999   -0.001   -0.000997        0.001003         0.997                      -1.003

We see that for x very close to 1, $x^3 - (2x - 1) \approx (x - 1)$ and $x^3 - (4x - 3) \approx -(x - 1)$. To put it another way, if we consider the ratio of $x^3 - (4x - 3)$ to $x - 1$ as x approaches 1, this ratio is approximately −1 and appears to approach −1 as x approaches 1. The last two columns of the table above show these ratios. We could say that y-values on each of the lines $y = 2x - 1$ and $y = 4x - 3$ approach the values of $x^3$ linearly as x approaches 1. We saw in Exercise 9 of Section 1.3 that this is how the vertical distance between two intersecting lines behaves as we approach the point of intersection. Equivalently, the ratio between the quantities dy and dx in the diagram below is constant as dx varies.

[Figure: The ratio of dy to dx is constant for two lines]

Furthermore we saw that for two lines the constant of proportionality (equivalently, the ratio) is just the difference between the slopes of the two lines. For our present difference between $x^3$ and a line through (1, 1), the slope of the tangent line is $f'(1) = 3$ and the ratio in the chart above is nearly equal to the difference between the tangent line slope and the slope of the given line ($3 - 2 = 1$ for $y = 2x - 1$ and $3 - 4 = -1$ for $y = 4x - 3$). We might predict that the difference between $y = x^3$ and, say, $y = 6x - 5$ is about $(3 - 6)(x - 1) = -3(x - 1)$ for x near 1 and we would be right. (Try it.)
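One way to “try it” is to let a few lines of code compute the ratios. This is a minimal sketch; the function name `ratio_to_line` is an assumption made for this illustration, and the line of slope m through (1, 1) is again written as y = 1 + m(x − 1), so that m = 6 gives y = 6x − 5.

```python
def ratio_to_line(m, x):
    """(x^3 - line value) / (x - 1) for the line of slope m through (1, 1)."""
    line_value = 1 + m * (x - 1)
    return (x ** 3 - line_value) / (x - 1)

# The line y = 6x - 5 has slope 6, so the prediction is a ratio near 3 - 6 = -3.
for x in (1.1, 1.01, 1.001, 0.999, 0.99, 0.9):
    print(x, ratio_to_line(6, x))
```

Calling `ratio_to_line(2, x)` or `ratio_to_line(4, x)` reproduces the ratio columns for the two lines already tabulated, so the same function serves as a check on the earlier tables.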


Now consider a similar table of differences and ratios for $y = 3x - 2$:

x       x - 1    x^3 - (3x - 2)   [x^3 - (3x - 2)]/(x - 1)
1       0        0                -
1.5     0.5      0.875            1.75
1.1     0.1      0.031            0.31
1.01    0.01     0.000301         0.0301
1.001   0.001    0.000003         0.003001
0.99    -0.01    0.000299         -0.0299
0.999   -0.001   0.000003         -0.002999

This time, instead of approaching a non-zero limit, the ratios approach 0. In fact, the ratios are approximately equal to $3(x - 1)$, that is, multiplying

$$\frac{x^3 - (3x - 2)}{x - 1} \approx 3(x - 1)$$

through by $x - 1$,

$$x^3 - (2x - 1) \approx (x - 1), \qquad x^3 - (4x - 3) \approx -(x - 1), \quad\text{but}\quad x^3 - (3x - 2) \approx 3(x - 1)^2.$$

Since $(x - 1)^2$ is much smaller than $x - 1$ for x near 1 (the square of a small number is much smaller than the number itself), we see that the vertical distance between $3x - 2$ and $x^3$ decreases much faster than the vertical distance between any other line through (1, 1) and $x^3$. In fact the vertical distance between any other line through (1, 1) and $x^3$ for x near 1 is essentially just the vertical distance between the line and the tangent line $3x - 2$. This is why the distances between other lines through (1, 1) and $x^3$ decrease linearly as x approaches 1: it is essentially just the distance between two lines through (1, 1).

To sum up, the qualitative difference between the line $3x - 2$ and all other lines through (1, 1) is that while all other lines approach the graph of $x^3$ linearly, $3x - 2$ approaches the graph faster than linearly (more precisely like $(x - 1)^2$). Incidentally, this observation is the basis for the study of Taylor series, which can be thought of as the study of how to approximate a differentiable function as closely as possible near a “base point” using a polynomial of degree n.
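For this particular example the numerical pattern can be confirmed exactly by factoring. The following check is an added illustration, not part of the original argument; the symbol m, for the slope of an arbitrary line through (1, 1), is introduced here only for this purpose.

```latex
\[
x^3 - \bigl(1 + m(x-1)\bigr)
  = (x-1)(x^2 + x + 1) - m(x-1)
  = (x-1)\,(x^2 + x + 1 - m).
\]
% Near x = 1 the second factor is close to 3 - m, the difference of the slopes.
% For the tangent line, m = 3, the second factor also vanishes at x = 1:
\[
x^2 + x - 2 = (x-1)(x+2)
\quad\Longrightarrow\quad
x^3 - (3x - 2) = (x-1)^2 (x+2) \approx 3\,(x-1)^2 .
\]
```

So for every slope other than 3 the difference carries exactly one factor of x − 1, while for the tangent line it carries two, which is the distinction the tables suggest.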

Now let’s look at the same situation algebraically. The tangent line to f at a (that means through the point $(a, f(a))$) has the equation in point-slope form

$$y - f(a) = f'(a)(x - a) \quad\text{or}\quad y = f(a) + f'(a)(x - a).$$

For our example above with $a = 1$, $f(a) = 1$, $f'(a) = 3$ it becomes $y = 1 + 3(x - 1) = 3x - 2$.


We know the difference quotient $\dfrac{f(x) - f(a)}{x - a}$ approaches $f'(a)$ as x approaches a, or equivalently

$$\frac{f(x) - f(a)}{x - a} - f'(a) = (\text{a quantity that approaches } 0) \quad\text{as } x \to a$$

or, making a common denominator on the left and rearranging a little,

$$\frac{f(x) - \{f(a) + f'(a)(x - a)\}}{x - a} = (\text{a quantity that approaches } 0) \quad\text{as } x \to a$$

or

$$f(x) - \{f(a) + f'(a)(x - a)\} = (\text{a quantity that approaches } 0)\,(x - a) \quad\text{as } x \to a.$$

This is the general version of the last column of the last table above. In that example the quantity that approaches 0 as $x \to a$ is about $3(x - a)$. Most often this quantity is nearly proportional to $x - a$ as it is here, that is, the vertical distance from the graph of f to the tangent line, $f(x) - \{f(a) + f'(a)(x - a)\}$, is nearly proportional to $(x - a)^2$. At some points (points of inflection for instance) the difference may behave like a higher power of $x - a$, and if I work at it I can make the difference $f(x) - \{f(a) + f'(a)(x - a)\}$ behave like $(x - a)^k$ where $1 < k < 2$. (One example is in the exercises.) It is this form of the definition of the derivative that is generally used in careful proofs of the properties of differentiable functions that depend on local linearity, such as the chain rule.

Here is the picture, where I have labelled the vertical difference between the curve and the tangent line as the error term.

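[Figure: Local linearity. The sketch marks a and x on the horizontal axis and f(a) and f(x) on the vertical axis, and splits f(x) − f(a) into the tangent-line rise f'(a)(x − a) and the error term.]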

To sum up, the tangent line through $(a, f(a))$ is the unique line with the property that the vertical distance between it and the graph of f decreases faster than linearly as $x \to a$. This is the true characterization of the tangent line. The more familiar “touches the graph only at the given point” is just not true sometimes as we will see shortly, even if we restrict our attention to a small neighborhood of the point. Most often the vertical distance is like a multiple of $(x - a)^2$ but this is not always the case. Thus we can regard f near a as the sum of a linear function (whose graph is the tangent line) and an error function that is essentially smaller than linear as $x \to a$ as illustrated just above. We say that the tangent line is the best linear approximation to f near a. Since, as we have seen, this local linearity comes directly from the definition of the derivative of f at a, local linearity is the characteristic property of differentiable functions.
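The error term can be examined for any function whose derivative at the base point is known. The sketch below is an illustration only: the choice of $e^x$ at a = 0 (where the derivative is 1) is an assumption made for this example and does not come from the notes, and `tangent_error` is a made-up helper name.

```python
import math

def tangent_error(f, fprime_a, a, x):
    """Vertical distance from the graph of f to its tangent line at a."""
    return f(x) - (f(a) + fprime_a * (x - a))

a = 0.0
for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    err = tangent_error(math.exp, 1.0, a, a + h)
    # err / h should shrink toward 0; err / h**2 should settle near a constant.
    print(h, err, err / h, err / h ** 2)
```

The third printed column is the “quantity that approaches 0” of the algebraic discussion, and the fourth shows that in this example the error is nearly proportional to the square of x − a.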

2.4.1 What is the Point of Local Linearity?

You may have run into local linearity before, perhaps in the form of differentials, as a device for, say, estimating $\sqrt{4.03}$ without using your calculator. While it is true that at some time in the future you may be stranded on a desert island with your calculator ruined by salt water and your only hope of escaping by raft totally dependent upon your ability to calculate $\sqrt{4.03}$ (perhaps a genie has appeared from a bottle and offered to supply you with a raft provided you can do this), it doesn’t seem very likely. It is natural to wonder what is the point of learning a method for doing by hand something that can be done much better by your calculator.

Fortunately for us calculus teachers, there is more to local linearity than being able to impress a genie with your ability to estimate $\sqrt{4.03}$ by hand. Here are three other uses.

(1) It is a recurring theme of this course that success in doing mathematics depends to a large degree on your ability to judge whether what you are doing makes sense. One aspect of this is the ability to judge whether the number produced by a calculation is correct. This depends largely on your ability to make intelligent predictions about the likely outcome of the calculation. In this context local linearity is just the next stage of predicting how a function will change after the basic qualitative fact that a positive derivative corresponds to increasing values. (If $f(100) = 203$ and $f'(100) = 27$, which of these values seems more likely for $f(110)$: 205, 255, 455, 2050? It’s hard to give an informed opinion without using local linearity or the equivalent. Of course if $f'$ changes a lot from 27 between 100 and 110, that is if the graph moves a long way from the tangent line, then your first opinion might be wrong, but at least you would be thinking about the right issues.)

(2) A refinement on #1: It is common to use linear approximation in science and engineering to get a feel for the way in which one quantity depends on another. For instance, the value of the gravitational “constant” g actually varies from place to place depending on several factors, including latitude and altitude. Let’s estimate the altitude correction for g. (Remember that g represents the acceleration due to gravity of an object dropped near the earth’s surface, assuming that there are no complicating factors like air resistance.)


We know that gravitational acceleration has the form $a = k/r^2$ where r is distance from the center of the earth. At the earth’s surface ($r \approx 6.37 \times 10^6$ meters), $g \approx 9.8$ m/sec$^2$, so $k \approx 9.8 \times (6.37 \times 10^6)^2$. Now

$$\frac{da}{dr} = -\frac{2k}{r^3} \approx -\frac{2(9.8) \times (6.37 \times 10^6)^2}{(6.37 \times 10^6)^3} = \frac{-19.6}{6.37 \times 10^6} \approx -3 \times 10^{-6}$$

at the earth’s surface. Thus we have the linear approximation

$$g \approx 9.8 - 3 \times 10^{-6}\, h$$

at an altitude of h meters, that is, a distance of $6.37 \times 10^6 + h$ meters from the earth’s center. Geophysicists need to adjust for this phenomenon, and the “free air correction for altitude” of $-3 \times 10^{-6}$ m/sec$^2$ per meter can be found in tables of physical constants. It is so small, however, that it doesn’t affect ordinary life much: at the summit of Mt. Rainier (about 4300 meters) g has decreased by about 0.013 m/sec$^2$, or about 0.13 percent. The weight of a 150 pound person would be about 3 ounces less than at sea level.

(3) As mentioned above, one common pattern for justifying theorems involving rates of change of functions near a given point is that the result would be obvious if the functions involved were all linear, and it remains true even for non-linear functions because they are “nearly linear” in very small regions. We will go through this kind of argument later in discussing the chain rule.
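The altitude correction in use (2) can be checked against the exact inverse-square formula. This is a rough sketch under the same assumptions as the text (surface radius $6.37 \times 10^6$ meters, surface value 9.8 m/sec$^2$); the names `g_exact` and `g_linear` are invented for this illustration, and the slope is left unrounded rather than rounded to $-3 \times 10^{-6}$.

```python
R_EARTH = 6.37e6     # meters, the radius used in the text
G_SURFACE = 9.8      # m/sec^2 at the earth's surface
K = G_SURFACE * R_EARTH ** 2   # the constant k in a = k / r^2

def g_exact(h):
    """Inverse-square value of g at altitude h meters."""
    return K / (R_EARTH + h) ** 2

def g_linear(h):
    """Tangent-line approximation to g at altitude h meters."""
    slope = -2 * K / R_EARTH ** 3   # da/dr at the surface, about -3e-6
    return G_SURFACE + slope * h

h = 4300.0   # roughly the summit of Mt. Rainier
print("exact:      ", g_exact(h))
print("linear:     ", g_linear(h))
print("decrease:   ", G_SURFACE - g_exact(h))
print("lin. error: ", g_exact(h) - g_linear(h))
```

At this altitude the two values agree to several decimal places, which is why the single tabulated correction is adequate for practical work; the gap only becomes noticeable when h is a sizeable fraction of the earth's radius.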

EXERCISES.

1. Use local linearity for $f(x) = \sqrt{x}$ near $a = 4$ to estimate $\sqrt{4.03}$ without using your calculator. Bonus points if you can find a genie who will give you something for the answer. (Some years ago there was a Math Fellow named Jeanie, but she graduated and flew away.)

2. If $f(100) = 203$ and $f'(100) = 27$, which of these numbers is most likely to be equal to $f(110)$: 205, 255, 455, 2050? Explain. Is there some other number that you would choose as an estimate in preference to any of these?

3. The application of local linearity in part (a) of this problem is often useful to estimate the magnitude of a small deviation from a fixed square root.

(a) Show that for x near 0, $\sqrt{1 + x} \approx 1 + \tfrac{1}{2}x$.

(b) Use (a) to show that for x near 0,

$$\sqrt{x^2 + 4} - 2 = 2\left(\sqrt{1 + x^2/4} - 1\right) \approx x^2/4.$$

(Compare with problem 11 of section 2.2.)


4. Assume that the graph of the cost C in dollars of building a house as a function of its floor area (in square feet) is concave down. If $C(1300) = 50{,}000$ and $C'(1300) = 45$,

(a) Estimate $C(1450)$. Do you expect your estimate to be too large or too small? Explain, using a diagram.

(b) Estimate $C(1170)$. Do you expect your estimate to be too large or too small? Explain, using a diagram.

5. Suppose that $H(t)$ represents the height in inches of an average boy at age t. Assume that $H(13) = 63$, that $H'(13) = 2$ and that H has a point of inflection at $t = 13$ reflecting the fact that the average boy starts a growth spurt at that age.

(a) Sketch a graph of H for ages near 13.

(b) Estimate $H(15)$. Do you expect your answer to be too large or too small? Explain.

(c) Estimate $H(12)$. Do you expect your answer to be too large or too small? Explain.

6. You put a haggis, initially at room temperature, in a 200°C oven.

(a) Sketch a possible graph of the temperature, T, of the haggis as a function of time. What do you expect for the concavity of the graph?

(b) Suppose that at $t = 30$, $T = 130$°C and is increasing at the instantaneous rate of 3° per minute. Estimate the temperature at $t = 40$. Based upon the concavity, do you expect this estimate to be too high or too low?

(c) Suppose you also know that at $t = 50$, the temperature of the haggis is 170°C. What estimate can you get for the temperature at $t = 40$ from the two temperatures? Do you expect this estimate to be too high or too low?

7. The number of hours, H, of daylight in Bellingham as a function of the day of the year is approximated by

$$H = 12 + 4 \sin[0.017(t - 80)]$$

where t is the number of days since January 1. Here is one month’s worth of the graph of H where the horizontal scale refers to the day of the month, not the number of days after January 1st as in the formula.


[Figure: graph of H against day of month, with the horizontal axis running from 0 to 30 and the vertical axis marked from 11.4 to 12.4.]

(a) Why does the graph look like a straight line? Would the graph still look straight over an entire year? If not, what might it look like?

(b) What is the approximate slope during this month? What does this represent in practical terms?

(c) What month does this graph show? Explain how you know.

8. Investigate local linearity for $\sin x$ near $x = \pi/3$ by making a table with four columns similar to the last two tables in the section. The first two columns should show $x$ and $x - \pi/3$, the third column should show the difference between $\sin x$ and the tangent line at $(\pi/3, \sqrt{3}/2)$ and the fourth column should show the difference between $\sin x$ and the line with slope 1 through this point. Report what multiple of what power of $x - \pi/3$ each difference is approximately equal to. Sketch a diagram that shows the differences you’re computing. To make this process relatively simple and accurate using a TI-89, enter the difference between $\sin x$ and the line you’re working on as a function on the entry line on the home screen and then evaluate at points of the form $\pi/3 + h$ for various small values of h (.1, .01, .001, ...). Use symbolic values for constants (e.g. $\pi$, $\sqrt{3}$) as much as possible, not decimal approximations. You evaluate a function at a sequence of points on the TI-89 by following the formula for the function by the vertical bar (left hand column of the TI-89) and then $x = \pi/3 + h$ where you just fill in each value you want for h in turn. If you start with h = .1 you can get additional values by just inserting extra 0’s one (or several) at a time.

9. Repeat the process of the previous problem for $\sin x$ near $x = 0$. For your table use the tangent line at 0 and the line through the origin of slope 1/2. (The differences between $\sin x$ and the tangent line decrease like a different power of h than in the previous problem. Why is that?)


2.5 Two Fine Points

2.5.1 Is Every Function Differentiable?

The definition says that $f'(a)$ is the limit of the difference quotients $\dfrac{f(a+h) - f(a)}{h}$ as h approaches 0 if this limit exists. Is there any point to this proviso? Can the limit fail to exist?

If you think of the pictorial interpretation, it is pretty easy to answer this question. We know that $f'(a)$ represents the slope of the line tangent to the graph of f at the point $(a, f(a))$. So one natural way to look for examples where the limit doesn’t exist is to think of examples of points on graphs where the tangent line doesn’t exist.

Perhaps the most obvious point of this kind is at the origin on the graph of $|x|$, where there is a sharp corner, in fact a right angle.

[Figure: graph of $|x|$ for x between −4 and 4.]

If we try the definition out on $f(x) = |x|$ at $a = 0$, we see that there are two cases. If $h > 0$, then $f(a+h) = |h| = h$, while if $h < 0$, then $f(a+h) = |h| = -h$. Thus

$$\frac{f(0+h) - f(0)}{h} = \begin{cases} 1 & \text{if } h > 0,\\ -1 & \text{if } h < 0. \end{cases}$$

It is also obvious from the diagram that any secant line from $x = 0$ to a point on the right is just the line $y = x$, while any secant line from $x = 0$ to a point on the left is just the line $y = -x$.

Thus there is no single value that the difference quotients approach as $|h|$ gets small. Instead there are two values, one for positive h and one for negative h. Thus the absolute value function has no derivative at $a = 0$. (Of course it is differentiable at any other point: its derivative is 1 at positive a and −1 at negative a.)

In this situation one often talks about one-sided derivatives, or the derivative from the right and the derivative from the left, but we will probably not need this idea.


A more subtle problem comes up with the function $x^{1/3}$ at $a = 0$. Here the graph is perfectly smooth, but the limit fails to exist for a different reason. Graphically the problem is that the tangent line is vertical. This corresponds, via the usual “reflect across $y = x$” principle, to the fact that $x^{1/3}$ is the inverse function to $x^3$, a function with a horizontal tangent at $a = 0$.

[Figure: graph of $x^{1/3}$ ($x^3$ dashed)]

An even worse situation can develop at a point where the graph has a jump. Consider for example a simplified postage function defined by

$$f(x) = \begin{cases} 0 & \text{if } x = 0,\\ 1 & \text{if } 0 < x < 1,\\ 2 & \text{if } 1 \le x. \end{cases}$$

If we consider difference quotients for $a = 1$, we see that there are again two cases.

$$\frac{f(1+h) - f(1)}{h} = \begin{cases} 0 & \text{if } 0 < h < 1,\\ \dfrac{1}{|h|} & \text{if } -1 < h < 0. \end{cases} \tag{2.1}$$

[Figure: graph of the postage function near x = 1, with secant lines for h = −0.2 and h = −0.5.]

As we see from the diagram, secant lines that go to a point on the left of $a = 1$ have positive slopes that get larger and larger as the point $1 + h$ approaches 1 from the left. So here even the one-sided limit from the left does not exist. This kind of thing will happen at any point where the graph has a jump. So a function has a chance of being differentiable at a point only if its graph is continuous there. (See Section 2.6 for a brief discussion of continuity.)
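The blow-up of the slopes at a jump is easy to see numerically. Here is a minimal sketch, with `postage` and `difference_quotient` as names chosen for this illustration, that evaluates the difference quotients of the simplified postage function at a = 1 from both sides.

```python
def postage(x):
    """Simplified postage function from the text (defined for x >= 0)."""
    if x == 0:
        return 0
    if x < 1:
        return 1
    return 2

def difference_quotient(f, a, h):
    return (f(a + h) - f(a)) / h

print("from the left:")
for h in (-0.5, -0.2, -0.01, -0.001):
    print(h, difference_quotient(postage, 1, h))

print("from the right:")
for h in (0.5, 0.2, 0.01):
    print(h, difference_quotient(postage, 1, h))
```

From the right every quotient is 0, while from the left each quotient equals 1/|h| and so grows without bound as h shrinks, in agreement with equation (2.1).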

2.5.2 Why Does a Have to Be an Endpoint?

To define the derivative of a function at a point a in its domain we consider the average rates of change of the function over intervals of the form $[a, a+h]$ for many values of h. Whether a is the left endpoint of this interval or the right endpoint depends on whether h is positive or negative, but either way a is certainly an endpoint. A diagram, like the one below, suggests pretty strongly that we would get a better approximation to $f'(a)$ for a given value of $|h|$ by looking at the interval from $a - h$ to $a + h$, that is, by using the so-called symmetric difference $\dfrac{f(a+h) - f(a-h)}{2h}$. Why don’t we use that instead?

[Figure: graph of f with secant lines; dashes: difference quotients, dots: symmetric difference.]

The fact is that numerical analysts do use the symmetric difference routinely to approximate derivatives. The reason we do not use symmetric differences in the definition has to do with the question of whether a function is differentiable at a point as discussed in the previous subsection. Consider for example, the absolute value function again. All symmetric differences about 0 have the value 0, as is obvious from the diagram:

[Figure: graph of $|x|$ with horizontal secant lines for h = 0.5 and h = 1.5.]

Thus using symmetric differences would lead to the conclusion that $|x|$ is differentiable at $a = 0$ with derivative 0. More generally, whenever the derivatives from the right and left exist but are different, using symmetric differences would suggest the average of these two numbers as the derivative. While such a notion of average derivative might be useful at times, it is clearly a very different notion, since for this “derivative” we cannot use the fundamental property of local linearity.

The upshot of the preceding discussion is that, if you already know that the derivative of a function exists at a point a, then using symmetric differences is an efficient way to estimate the value of the derivative. But symmetric differences cannot tell you whether the derivative exists or not, so using them uncritically will lead to some “false positives.”
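Both halves of this conclusion can be seen in a few lines of code. The sketch below is illustrative only: the function names are made up, and the smooth test case (sin at a = 1, whose derivative is cos 1) is an assumption chosen for this example rather than something taken from the notes.

```python
import math

def forward_difference(f, a, h):
    """Ordinary difference quotient over [a, a + h]."""
    return (f(a + h) - f(a)) / h

def symmetric_difference(f, a, h):
    """Symmetric difference over [a - h, a + h]."""
    return (f(a + h) - f(a - h)) / (2 * h)

h = 0.01

# Where the derivative exists: compare the two errors against cos(1).
exact = math.cos(1.0)
print("forward error:  ", forward_difference(math.sin, 1.0, h) - exact)
print("symmetric error:", symmetric_difference(math.sin, 1.0, h) - exact)

# The false positive: |x| at 0 has no derivative, yet the symmetric
# difference reports a perfectly definite value.
print("symmetric, |x| at 0:", symmetric_difference(abs, 0.0, h))
print("one-sided, |x| at 0:", forward_difference(abs, 0.0, -h),
      forward_difference(abs, 0.0, h))
```

The symmetric estimate is much closer to the true derivative in the smooth case, but for the absolute value function it hides the disagreement between the two one-sided quotients, which only the ordinary difference quotients reveal.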

EXERCISES.

1. Is $(x + |x|)^2$ differentiable at 0? Look at this several ways: graph the function and zoom in on the graph near 0 to see if there appears to be a tangent line at $x = 0$. (If so, what is its slope?) Look at some difference quotients, remembering to consider both positive and negative h. After you have made up your mind, try using your calculator to compute the derivative symbolically. What does the answer mean for $x > 0$ and $x < 0$? Does the answer make sense at $x = 0$?

2. Let $f(x) = \begin{cases} 3x^2, & x \ge 0,\\ -17x^3, & x < 0. \end{cases}$ Sketch the graph of f. Is f differentiable at 0? How can you answer this question without using the rules of differentiation?

3. Is $|x|^{3/2}$ differentiable at 0? Repeat the steps of #1.

4. Where is $\arcsin(\sin x)$ differentiable? Do this problem graphically. Then use your calculator to compute the derivative symbolically. What does the expression that it gives you mean?


5. (a) Plot the numerical values in the table.

x      0    1    2    3    4    5    6    7
f(x)   18   13   10   9    9    11   15   21

(b) Where do you think $f'$ is positive? Negative?

(c) Make the best estimate you can for $f'(2)$, $f'(5)$, $f'(4.5)$, $f''(4.5)$.

6. A few weeks ago I drove to Sea-Tac on I-5. My speed was rather variable because of roadwork and some heavy traffic. The table is a record of the distance $f(t)$ I had traveled from Bellingham after t hours.

t      0    .5   1    1.5   2    2.5   3    3.5
f(t)   0    32   60   62    70   85    92   100

Estimate $f'(0.5)$, $f'(1.25)$, $f'(3)$.

2.6 Continuity

Continuing with the example of $f(x) = |x|$ at $x = 0$, now that we know that this function is not differentiable there, is there anything nice we can say about it? Looking at the graph we see that although the graph has a sharp corner at $x = 0$, at least it does not have a jump there. We could draw the graph on a piece of paper without lifting our pencil from the paper. This would not be true of the function

$$u(x) = \begin{cases} 1, & x \ge 1,\\ -1, & x < 1, \end{cases}$$

whose graph looks like this.

[Figure: graph of u, equal to −1 for x < 1 and to 1 for x ≥ 1.]

Here we must pick up our pencil as we pass $x = 1$ and move it to a new location.


To formulate the difference a little more carefully, the absolute value function $f(x) = |x|$ has the property that as x approaches 0 (or any other real number) the value of $|x|$ also approaches a definite value; 0 in the case of $x = 0$. Using the limit notation, we can say

$$\lim_{x \to 0} |x| = 0.$$

On the other hand, it is not true that the function u whose graph is above has this property at $x = 1$; there is just not a single value that $u(x)$ approaches as $x \to 1$. (You might be tempted to say that there are two values instead of one; it just depends from which side you approach $x = 1$. The fact remains that there is no single real number that we could regard as the limiting value of $u(x)$ as x approaches 1.)

We say that if the real number a and all numbers near a, at least on one side, are in the domain of a function f and if $\lim_{x \to a} f(x)$ exists and is equal to $f(a)$, then f is continuous at $x = a$. Thus $|x|$ is continuous at $x = 0$ although it is not differentiable there. But the function

$$u(x) = \begin{cases} 1, & x \ge 1,\\ -1, & x < 1 \end{cases}$$

is not continuous at $x = 1$ because 1 is in the domain of u but $\lim_{x \to 1} u(x)$ does not exist. We say u has a discontinuity at $x = 1$. Similarly the “postage function” (2.1) of the previous section has discontinuities at $x = 0$ and $x = 1$. Note that with this definition a function cannot have a discontinuity at a point not in its domain. For instance, if I modified the function u slightly so that the two cases apply when $x > 1$ and $x < 1$ then u would no longer have a discontinuity at $x = 1$ because 1 is no longer in the domain of u. This situation is discussed further in the next subsection.

It is easy to see that if a function f is differentiable at a point a, then it must also be continuous there. (If f is nearly linear near $x = a$ then all of the numbers $f(x)$ are very near to $f(a)$ when x is near a; to be more precise, $f(x) - f(a) \approx f'(a)(x - a)$ and the right side of this “almost equation” certainly approaches zero as $x - a$ does.) Thus nearly all of the functions we deal with in this course will be continuous at nearly all of the points in their domain.
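The “almost equation” argument can be written out as a short limit computation. This is a sketch that takes for granted the usual rule that the limit of a product is the product of the limits, which these notes do not prove.

```latex
\[
f(x) - f(a) \;=\; \frac{f(x) - f(a)}{x - a}\,(x - a)
\qquad (x \neq a),
\]
% The first factor approaches f'(a) because f is differentiable at a;
% the second factor approaches 0.
\[
\lim_{x \to a}\bigl(f(x) - f(a)\bigr) \;=\; f'(a)\cdot 0 \;=\; 0,
\qquad\text{so}\qquad
\lim_{x \to a} f(x) = f(a).
\]
```

The last equality is exactly the condition for continuity at a stated above.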

2.6.1 Plugging Holes in Graphs

One way in which these ideas come up from time to time is in examining functions with holes in their graphs. A trivial example is $f(x) = \dfrac{x^2 - 1}{x - 1}$. It is clear that $x = 1$ is not in the domain of this function. In the terminology introduced at the end of Section 1.2, f has a singularity at $x = 1$. However, unlike the situation in the previous subsection, we can make a function that is not only continuous at $x = 1$ but even differentiable there by noting that $\dfrac{x^2 - 1}{x - 1} = x + 1$ for all $x \neq 1$ so that if we “fill in the missing point” we will get the function $F(x) = x + 1$ that is continuous on the entire real line. See the left diagram below.

[Figure: left, the graph of $\frac{x^2 - 1}{x - 1}$ for $-1 \le x \le 3$; right, the graph of $\frac{\cos x - 1}{x}$ for $-3 \le x \le 3$.]

Of course this example suffers from the disability that it looks very artificial; we should just have defined $f(x) = x + 1$ in the first place. But sometimes the same situation arises when it is not so clear how to simplify. Consider for example the function $g(x) = \dfrac{\cos x - 1}{x}$ whose graph is on the right above. (We will see soon why the behavior of this function near $x = 0$ is a very natural thing to consider. Can you guess?) Again it is clear that $x = 0$ is not in the domain of $g$. Again it appears from the graph that by filling in a single missing point, $(0, 0)$ this time, we would get a smooth curve going through this point. But this time there is no way to simplify the expression to demonstrate that symbolically. (And don’t you dare even think of cancelling the $x$’s.)

Nevertheless, although $g(x) = \dfrac{\cos x - 1}{x}$ is not even defined at $x = 0$, the extended function

$$G(x) = \begin{cases} \dfrac{\cos x - 1}{x}, & x \ne 0, \\ 0, & x = 0 \end{cases} \tag{2.2}$$

appears to be both continuous and differentiable at $x = 0$. (What is your estimate for $G'(0)$? Note that the apparent slope is misleading, since the scales on the two axes are not the same. This is a case where it is not so clear how to determine $G'(0)$ exactly without using some big tools from the computational apparatus of calculus, though we can certainly estimate it easily with difference quotients.)

On the other hand, the graph of $u(x) = \begin{cases} 1, & x \ge 1 \\ -1, & x < 1 \end{cases}$ cannot be made continuous, let alone differentiable, by a clever choice of value at $x = 1$. We say that $f$ and $g$ have removable singularities: they can be removed just by adding another point to the graph. But the discontinuity in the graph of $u$ cannot be removed. It is important to be able to distinguish between these two situations.
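Here is one way the difference quotient estimate for $G$ might be set up. This is a sketch under my own assumptions: the function names `g`, `G` and `difference_quotient` and the step sizes are illustrative, and the numbers it prints are evidence for a value of $G'(0)$, not a proof.

```python
import math


def g(x):
    """(cos x - 1)/x, which is undefined at x = 0."""
    return (math.cos(x) - 1.0) / x


def G(x):
    """The extension (2.2): g(x) away from 0, and 0 at x = 0."""
    return g(x) if x != 0 else 0.0


def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


for h in (0.1, 0.01, 0.001):
    # g(-h) and g(h) shrink toward 0, which supports the choice G(0) = 0;
    # the last two columns are secant slopes of G from the right and the left.
    print(h, g(-h), g(h),
          difference_quotient(G, 0.0, h), difference_quotient(G, 0.0, -h))
```

If the right-hand and left-hand secant slopes settle on the same number as $h$ shrinks, that number is a reasonable estimate for $G'(0)$.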

EXERCISES.


1. For the function $h$ defined in the previous discussion by filling in the hole in the graph of $\frac{\cos x - 1}{x}$, find at least an estimate for $h'(0)$. You can get a rough estimate from the graph, though watch out for the vertical scale. You can get a better estimate from difference quotients. Of course you can get an exact answer using the limit command on the TI-89, but there’s not much sport in that.

2. $\arctan\left(\dfrac{1}{|x|}\right)$ is not defined at $x = 0$. Can you extend the definition by choosing a suitable value at 0 so that the extended function becomes continuous at 0? If so what is the value? Is the extended function differentiable at 0? If so, estimate the derivative at 0. Explain.

3. Same questions for $\arctan\left(\dfrac{1}{x^2}\right)$.

4. Same questions for $\dfrac{1}{x} + \dfrac{1}{\ln(1 - x)}$.

5. Same questions for $\cot(x + 3x^2) - 1/x$.

6. Same questions for $\dfrac{1}{\sin x - x^2} - \dfrac{1}{x}$.

7. Same questions for $x \sin\left(\dfrac{1}{x}\right)$.

8. Same questions for $x^2 \sin\left(\dfrac{1}{x}\right)$. Sketch the graph of the function near 0 (you will have to use your imagination) and the tangent line at 0. How many times does the tangent line intersect the curve?

9. (a) What is the domain of $f(x) = \arctan\left(\sqrt{\dfrac{1 - \cos x}{1 + \cos x}}\right)$? Can you plug the holes in the graph of $f(x)$ so that it becomes continuous for all real numbers $x$? What property of arctangent does this depend on?

(b) What is the domain of its derivative? Can you plug holes so that it becomes differentiable for all real numbers? Discuss parts (a) and (b) in terms of the formula and also the graph of $f$.

10. Let $f_a(x) = \arctan\left(\dfrac{\sqrt{x} + \sqrt{a}}{1 - \sqrt{ax}}\right)$, where $a$ is a positive constant.

(a) What is the domain of this function? Graph $f_a$ for several values of $a$ and describe the essential features of the graph and how they change with $a$.

(b) Find $\dfrac{d}{dx} f_a(x)$. What is the domain of this function?


(c) Can you plug the holes in the graph of $f_a(x) = \arctan\left(\dfrac{\sqrt{x} + \sqrt{a}}{1 - \sqrt{ax}}\right)$ and/or its derivative so that one or both becomes continuous for all real numbers $x$? Weird, isn’t it?

2.7 Mean Value Theorem

There is another useful and very intuitive connection between secant lines and tangent lines. The symmetric difference diagram in section 2.5.2, looked at from another point of view, suggests that for a fixed secant line joining $(a, f(a))$ and $(b, f(b))$, the slopes of tangent lines at points between $a$ and $b$ are likely to be closer to the slope of the secant line than the slopes of the tangent lines at $x = a$ or $x = b$ are. In fact, the following seems reasonable:

Theorem 5 If $f$ is continuous on $[a, b]$ and differentiable for each $x$, $a < x < b$, then there is at least one point $c$ between $a$ and $b$ so that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

So, the slope of the tangent line at the intermediate point $c$ is equal to the slope of the secant line. If $f(t)$ represents distance traveled at time $t$, then the right side of the equation is the average velocity between $t = a$ and $t = b$ and the theorem says that there must be an instant during the time interval when the instantaneous velocity has that value. So if you averaged 60 mph between Bellingham and Seattle, your instantaneous velocity was exactly 60 mph at least once.

How could we prove this theorem? It turns out to be partly easy and partly quite difficult. Here is the easy and intuitive part. How could you go about finding $c$ geometrically? One way would be to move a copy of the secant line up or down, keeping it parallel to the secant line until it just barely touches the curve. At this point it would have to be tangent to the curve. See the diagram below where there are two such points.

[Figure: a curve over the interval from $a$ to $b$, with its secant line and two parallel tangent lines.]

We can make this visual argument more symbolic. Notice that if we measure distance from the secant line joining $(a, f(a))$ to $(b, f(b))$ in the direction perpendicular to the secant line, then the point of tangency is the point on the curve farthest from the secant line. Thus the way to “find” $c$ is to maximize distance from the secant line in the proper direction.

In the special case where $f(a) = f(b)$, the secant line is horizontal and the perpendicular direction is vertical. Thus if $f(c)$ is the greatest value of $f$ on $[a, b]$ (or the least, whichever is different from the common value $f(a) = f(b)$) then $c$ should work.

[Figure: When $f(a) = f(b)$.]

In this case we can justify the conclusion that a value $c$ exists where $f'(c) = 0$ a little more formally in terms of difference quotients like this. If the maximum (say) of $f$ between $a$ and $b$ occurs at $c$, then for $x > c$, $\dfrac{f(x) - f(c)}{x - c} \le 0$ since the denominator is positive and the numerator is not. On the other hand, if $x < c$, then $\dfrac{f(x) - f(c)}{x - c} \ge 0$, since the denominator is negative and the numerator is not positive. Thus if $f'(c)$ exists, its only possible value is 0, since that is the only number that can be approached both by numbers that are never negative and by numbers that are never positive.

The general Mean Value Theorem follows immediately from this: we just have to flatten out the original picture. If now $f(a) \ne f(b)$, define a new function $g$ by

$$g(x) = f(x) - \frac{x - a}{b - a}\,(f(b) - f(a)).$$

We see $g(a) = f(a)$ and also $g(b) = f(a)$. Thus there is $c$ between $a$ and $b$ so that

$$0 = g'(c) = f'(c) - \frac{1}{b - a}\,(f(b) - f(a)) \quad\text{or}\quad f'(c) = \frac{f(b) - f(a)}{b - a}.$$
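The “flattening” argument can be imitated numerically. The sketch below is only an illustration under my own assumptions: the sample function $x^3$ on $[0, 2]$, the helper names `flatten` and `extreme_point`, and the crude grid search are all choices made for this example, and a grid search is of course no substitute for the existence argument discussed next.

```python
def flatten(f, a, b):
    """Return g(x) = f(x) - (x - a)/(b - a) * (f(b) - f(a)), so g(a) = g(b) = f(a)."""
    slope = (f(b) - f(a)) / (b - a)
    return lambda x: f(x) - (x - a) * slope


def extreme_point(g, a, b, n=100_000):
    """Grid search for the interior point where g is farthest from g(a)."""
    xs = [a + (b - a) * k / n for k in range(1, n)]
    return max(xs, key=lambda x: abs(g(x) - g(a)))


def f(x):
    return x ** 3  # sample function, chosen only for illustration


a, b = 0.0, 2.0
c = extreme_point(flatten(f, a, b), a, b)
secant_slope = (f(b) - f(a)) / (b - a)
h = 1e-6
tangent_slope = (f(c + h) - f(c - h)) / (2 * h)  # symmetric difference quotient
print(c, secant_slope, tangent_slope)
```

The point `c` found this way is where the flattened function is farthest from its common end value, and the estimated tangent slope there should agree closely with the secant slope.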


2.7.1 The Catch

The catch in the above argument is this: we found $c$ as the point where the maximum (or minimum) value of $f$ occurs between $a$ and $b$. But how do we know that there is a maximum or minimum value? This is a variation on the difficulty about the existence of $\sqrt{2}$ discussed in section 1.4.2. You may be tempted to reply that the existence of a maximum is obvious. Certainly looking at the graphs your calculator draws makes it seem so. But if you think about it for a moment you will see that this apparently obvious property actually depends on the nature of the real number system. And what are the real numbers? The numbers that can be represented as decimals in your calculator are all rational numbers (fractions) with denominators that are powers of 10. (For instance, $2.317 = \dfrac{2317}{1000}$; $0.56789 = \dfrac{56789}{100{,}000}$ and so forth.)

But the maximum property is false in the rational number system!

A simple example is the function $f(x) = \dfrac{1}{x^2 - 2}$ which is defined for every rational number between 0 and 3, but which has neither a maximum nor a minimum value in that interval. (By choosing a rational $x$ sufficiently close to the irrational $\sqrt{2}$ you can make $|f(x)|$ greater than any preassigned bound.) It also has both positive and negative values without ever equaling zero, so the Intermediate Value property is also false for the rational numbers, but that is not so important here.
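This behavior is easy to watch with exact rational arithmetic. The following is a sketch, not part of the notes: it uses Python's `fractions.Fraction` and one particular sequence of rational approximations to $\sqrt{2}$ that I have chosen for illustration (the step from $p/q$ to $(p + 2q)/(p + q)$).

```python
from fractions import Fraction


def f(x):
    """1/(x^2 - 2); defined for every rational x, since x^2 = 2 has no rational solution."""
    return 1 / (x * x - 2)


# Rational approximations to sqrt(2): starting from 1/1, the step
# p/q -> (p + 2q)/(p + q) produces 3/2, 7/5, 17/12, 41/29, ...
x = Fraction(1, 1)
values = []
for _ in range(8):
    values.append((x, f(x)))
    x = Fraction(x.numerator + 2 * x.denominator, x.numerator + x.denominator)

for x, fx in values:
    print(x, fx)
```

The printed values of $f$ alternate in sign and grow in size at every step, so among the rationals between 0 and 3 there is neither a largest nor a smallest value, and zero is never reached in between.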

The problem here is that the rational number system is full of “holes” and things can happen in the holes that are not happening at any rational number. From this perspective, the point of the real number system is that it fills up all the holes between the rationals. On the other hand, most real numbers that are not rational (irrational numbers!) are honestly named in that they seem to have no reason for existing except as filler. The real number system, like a geometer’s line, is an object that exists only in our imagination, and is not to be found in the physical world. But then it is a little dangerous to presume very much about its properties without a careful definition to work with. However it is no simple matter to give a careful definition of the reals and I will not attempt it here. See Math 312 for a justification of the maximum property, based on at least some additional information about the reals.

EXERCISES.

1. For the differentiable function $f$ with this graph on $[-3, 3]$ determine approximately all points $c$ for which the conclusion of the Mean Value Theorem holds for the interval $[-2, 2]$. The values of $f$ at $\pm 2$ are indicated on the graph. Does it matter that the horizontal and vertical scales are not the same? Explain.


[Figure: graph of $f$ for $-3 \le x \le 3$, with the vertical axis running from $-6$ to $6$.]

2. Explain how it follows from the MVT that $|\sin x - \sin y| \le |x - y|$ for all real numbers $x$ and $y$.

3. Let $f(x) = 1 - x^{2/3}$ for $-1 \le x \le 1$. Then $f(-1) = f(1) = 0$ but there is no $x$ in $[-1, 1]$ where $f'(x) = 0$. Why does this not contradict the MVT?

4. Suppose that $a$ and $b$ are two consecutive zeros of $f'$. Show that $f$ has at most one zero between $a$ and $b$. (What happens if $f$ has two zeros between $a$ and $b$?)

5. Let $f$ be differentiable on $[a, b]$.

(a) Show that if $f' > 0$ on $[a, b]$, then $f$ is increasing on $[a, b]$.

(b) Is it possible for $f$ to be increasing on $[a, b]$ even if $f'(c) = 0$ for some $c$ in $[a, b]$ as long as $f'(x) > 0$ otherwise? Either show that this is not possible or give an example where it happens.

6. Suppose that $c$ is a fixed number between $a$ and $b$. For a differentiable function $f$ on $[a, b]$ must there be $u$ and $v$ in $(a, b)$ such that

$$f'(c) = \frac{f(u) - f(v)}{u - v}?$$

Either show that this is so, or find an example where there is no such $u$ and $v$.


3. COMPUTING DERIVATIVES

3.1 Introduction

As you know, the apparatus we use for calculating the derivatives of “standard calculus functions” has two parts. One is a fairly short list of derivatives of specific functions. The other is a list of different ways to relate the derivative of a function to the derivative of several of its parts (the sum rule, the product rule, the chain rule and so forth). A summary of these two lists appears for reference purposes as the last section of this short chapter. We are not going to go through the construction of all of this apparatus, since you are already quite familiar with it. It does seem worthwhile, however, to go over just a couple of less obvious points, concentrating, as usual, on where things come from when that is not straightforward and how to interpret them.

3.2 Derivatives of Trigonometric Functions

How can we calculate the derivative of $\sin x$ symbolically? Just for comparison, note that computing the derivative of power functions is pretty easy. To find the derivative of $x^2$ at $x = a$, we must write down the difference quotient, simplify it and take the limit as $h \to 0$:

$$\frac{(a + h)^2 - a^2}{h} = \frac{2ah + h^2}{h} = 2a + h \to 2a \text{ as } h \to 0.$$

All the steps are simple, and passing to the limit is particularly simple. With higher powers (or fractional powers) the algebra gets more complicated, but when you finally get that done, the value of the limit is still obvious. If this was all there was to limits you might well wonder what all the fuss is about.

Now let’s try $\sin x$. We can still write down the difference quotient, and if we remember the addition rule for sine, we can simplify to some extent:

$$\frac{\sin(a + h) - \sin a}{h} = \frac{\sin a \cos(h) + \sin(h) \cos a - \sin a}{h} = \frac{\cos(h) - 1}{h}\,\sin a + \frac{\sin(h)}{h}\,\cos a.$$

To finish calculating the derivative, we must find these two limits:

$$\lim_{h \to 0} \frac{\sin(h)}{h} \quad\text{and}\quad \lim_{h \to 0} \frac{\cos(h) - 1}{h}.$$


Note that the first one of these is actually the difference quotient for the derivative of $\sin x$ at $x = 0$ (since $\sin 0 = 0$) and the second one is the difference quotient for the derivative of $\cos x$ at $x = 0$ (since $\cos 0 = 1$). Thus in order to compute the derivative of $\sin x$ exactly at any point, we just need to be able to compute the derivatives of $\sin x$ and $\cos x$ at the single point $x = 0$.

Neither limit is obvious in the way that the limit encountered in differentiating $x^2$ is obvious, and neither one can be done by any process of pure symbolic manipulation. In this course, we will settle for a picture (a graph of these two functions of $h$ near $h = 0$), but in later courses like Math 226 you may look a little more carefully at how to justify what we see. The two graphs look like this:

[Figure: left, the graph of $\sin(h)/h$; right, the graph of $(\cos(h) - 1)/h$; both for $-4 \le h \le 4$.]

From the graphs it seems likely that

$$\lim_{h \to 0} \frac{\sin(h)}{h} = 1 \quad\text{and}\quad \lim_{h \to 0} \frac{\cos(h) - 1}{h} = 0.$$

Thus the derivative of $\sin x$ at $x = a$ is $0 \cdot \sin a + 1 \cdot \cos a = \cos a$. No surprises here! (Since each of the limits is a difference quotient, another thing we could have done is to estimate them numerically.)

The computation of the derivative of $\cos x$ is pretty similar. You just have to remember (or look up) the addition formula for cosine. The same specific limits appear, but now they have already been evaluated.

IMPORTANT COMMENT. There is something hidden in the calculation above that is worth mentioning. For the graphs above, we used radian measure for sine and cosine. What difference, if any, would it have made to use degree measure? One way to answer that is to try it on your calculator. Set it to degree measure, and then graph $\sin(h)/h$ near $h = 0$. Do you get the same thing? What would the differentiation formula for $\sin x$ look like if we used degree measure for sine and cosine? Think of the way that the graph of $\sin x$ changes if you change from radians to degrees. (The horizontal scale changes. The period is now 360 instead of $2\pi$.) Does this make the result seem reasonable? You can check your conclusion by putting your TI-89 in degree mode and then calculating $\dfrac{d}{dx}\sin(x)$ symbolically. (Don’t forget to change back to radian mode afterwards or you will get messed up whenever you use your calculator for trig in a math class.)
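The numerical estimate mentioned above can be carried out in a few lines. This is a sketch with names of my own choosing (`sine_quotient`, `cosine_quotient` and the `degrees` flag are not from the notes); Python's `math.sin` and `math.cos` work in radians, so the degree option simply converts the angle first.

```python
import math


def sine_quotient(h, degrees=False):
    """sin(h)/h, with h read in radians by default or in degrees on request."""
    angle = math.radians(h) if degrees else h
    return math.sin(angle) / h


def cosine_quotient(h, degrees=False):
    """(cos(h) - 1)/h, with the same choice of angle measure."""
    angle = math.radians(h) if degrees else h
    return (math.cos(angle) - 1.0) / h


for h in (0.1, 0.01, 0.001, -0.001):
    print(h, sine_quotient(h), cosine_quotient(h))
```

Calling `sine_quotient(h, degrees=True)` for the same small values of `h` is one way to try the experiment in the Important Comment without a calculator.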


EXERCISES

1. Write out the computation for the derivative of $\cos x$ at $x = a$. When you need to evaluate the limits displayed above, you can just use the values we found from the graphs.

2. Do all the stuff mentioned in the Important Comment just above. What is the differentiation formula for $\sin x$ now? What is a symbolic form for the constant that arises, and where does it come from?

3.3 Derivatives of Exponential Functions

Now we are going to look at the derivative of $b^x$ where $b > 0$. As with the trig functions, we can simplify the difference quotient to some extent, but then we run into the problem of evaluating a limit whose value is far from obvious. Starting with the difference quotient at $x = a$,

$$\frac{b^{a+h} - b^a}{h} = \frac{b^a\left(b^h - 1\right)}{h} = \frac{b^h - 1}{h}\,b^a.$$

The result is that the derivative of $b^x$ at $x = a$ is just a multiple of the function we started with. But the value of the multiplicative constant, $\lim_{h \to 0} \dfrac{b^h - 1}{h}$, is far from clear. We can see that it is just the derivative of $b^x$ at $x = 0$ (since $b^0 = 1$) but this does not give us a value for it. It looks like it depends on the value of the base $b$, but we can’t be absolutely sure of this.

So far we can sum up what we know by writing

$$\frac{d}{dx} b^x = C_b\, b^x$$

where $C_b = \lim_{h \to 0} \dfrac{b^h - 1}{h}$ is just the value of the derivative of $b^x$ at $x = 0$. We can evaluate this limit graphically or numerically, as before, for any specific base $b$. Using difference quotients with $b = 2$ or 3 and $h = 0.0001$, we find

$$C_2 \approx \frac{2^{.0001} - 1}{.0001} \approx .693 \quad\text{and}\quad C_3 \approx \frac{3^{.0001} - 1}{.0001} \approx 1.099.$$

Thus $\dfrac{d}{dx} 2^x \approx .693 \cdot 2^x$ and $\dfrac{d}{dx} 3^x \approx 1.099 \cdot 3^x$. (I have used $\approx$ to remind you that the two sides are only nearly equal, since the decimal approximations to the constants are correct only to three decimal places.)

Look at the graphs of these two functions near $x = 0$, remembering that the constant $C_b$ represents the slope of the tangent line to the graph as it crosses the $y$-axis. If we think that $b^x$ should increase more rapidly as $b$ increases, it is natural to guess that somewhere between $b = 2$ and $b = 3$ there is a “mystery base” $m$ for which the slope of the tangent line as its graph crosses the $y$-axis is exactly equal to 1. Here is a graph with $2^x$, $3^x$ and, as a dashed curve to remind us of its so far ethereal status, $m^x$.

[Figure: graphs of $2^x$, $3^x$ and (dashed) $m^x$ for $-0.5 \le x \le 0.5$.]
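The difference quotient estimates of $C_2$ and $C_3$ can be reproduced directly, and the same computation illustrates the claim that the derivative of $b^x$ is a fixed multiple of $b^x$. This is a sketch: the function name `exp_difference_quotient` and the sample points `a` are my own choices for illustration.

```python
def exp_difference_quotient(b, a, h=1e-4):
    """Difference quotient (b**(a + h) - b**a)/h for the function b**x at x = a."""
    return (b ** (a + h) - b ** a) / h


for b in (2.0, 3.0):
    C_b = exp_difference_quotient(b, 0.0)  # estimate of the slope at x = 0
    print(f"b = {b}: C_b is roughly {C_b:.3f}")
    for a in (-1.0, 0.5, 2.0):
        ratio = exp_difference_quotient(b, a) / b ** a
        # The ratio should not depend on a: it is the same quotient (b**h - 1)/h.
        print(f"   a = {a}: quotient / b**a = {ratio:.3f}")
```

For each base the indented rows repeat the estimate of $C_b$, whatever point `a` is used, which is the numerical face of the factoring step in the difference quotient.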

How can we find a value for this “mystery base?” We can find it to any desired degree of accuracy by just calculating constants $C_b$, looking at the result, and then trying another value for $b$ chosen to bring us closer to the desired $C_b = 1$. For instance, we might generate the following table, where I have begun with the two values already chosen.

$$\begin{array}{ll}
b & C_b \\ \hline
2 & .693 \\
3 & 1.099 \\
2.75 & 1.012 \\
2.72 & 1.0006 \\
2.7185 & 1.00008 \\
2.7183 & 1.000007 \\
2.71828 & 0.9999993
\end{array}$$

Here I chose values successively for $b$ by looking at the preceding values and pretending that $C_b$ is a linear function of $b$. (A specific example of the use of local linearity!) For instance, having tried $b = 2$ and $b = 3$ I saw that 1 is about one fourth of the way from $C_3$ to $C_2$, so my next guess was one fourth of the way from 3 to 2.

You can see that we have closed down pretty well on the “mystery base.” And of course we already know its name: it is the number $e$. So this is why $e^x$ is, from the point of view of calculus, a more “natural” exponential function than $2^x$ or $3^x$. The differentiation formula could not be simpler: $\dfrac{d}{dx} e^x = 1 \cdot e^x = e^x$. We will take this property as the definition of the number $e$: $e$ is the unique number such that the graph of the exponential function with base $e$ crosses the $y$-axis with slope 1.
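The guessing procedure behind the table can be automated. The sketch below is my own version, not the one used to build the table: it estimates $C_b$ with a symmetric difference quotient and $h = 10^{-6}$ (an assumption, chosen for a little more accuracy than $h = 0.0001$), and the helper name `next_guess` is hypothetical.

```python
def C(b, h=1e-6):
    """Estimate of the slope of b**x at x = 0 by a symmetric difference quotient."""
    return (b ** h - b ** (-h)) / (2 * h)


def next_guess(b0, b1):
    """Pretend C is linear through (b0, C(b0)) and (b1, C(b1)); solve for C = 1."""
    c0, c1 = C(b0), C(b1)
    return b1 + (1.0 - c1) * (b1 - b0) / (c1 - c0)


b_prev, b_curr = 2.0, 3.0
for _ in range(5):
    b_prev, b_curr = b_curr, next_guess(b_prev, b_curr)
    print(b_curr, C(b_curr))
```

The first new guess lands about one fourth of the way from 3 to 2, as described above, and the later guesses close in on the mystery base quickly. I stop after a handful of steps because once two successive guesses nearly coincide the division in `next_guess` becomes unreliable.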


You may think that this construction leaves the exact value of $e$ a little mysterious (all we have done in the table above is to find an approximation to $e$ correct to 5 decimal places) but a little thought makes it clear that any decimal that we can actually calculate is at best an approximation to $e$ correct to some finite number of decimal places. In this way $e$ is like $\pi$ or $\sqrt{2}$. (Actually the issue here is whether $e$ is an irrational number like the other two. It turns out that it is, but this is pretty hard to prove.) Thus, as with $\pi$ or $\sqrt{2}$, the precise definition of $e$ is given not by the decimal, but by the property that it satisfies.

It will turn out soon that we can identify the constant $C_b$ in terms of a familiar function of $b$, but we need another tool first.

EXERCISES (not about the stuff above)

Refer to Example 2 in section 2.1 of the notes and the mapping diagrams in section 1.2.4 in connection with these exercises.

1. What function would produce a constant magnification factor of 3? 4? Any constant magnification $k$?

2. Suppose light passes through two lenses successively. One magnifies 4 times, and the other 3 times. What is the combined effect of passing through both lenses?

3. Express the situation in #2 in terms of the constant magnification functions from #1; that is, find three functions expressing the three different magnifications of #2. How can you “construct” the function expressing the combined magnification from the two functions giving the two individual magnifications?

4. In the situation of #2, suppose the first lens still magnifies 4 times from all positions, but suppose now that the second lens is as in Example 1: it sends light at $x$ to $x^2$. What is the magnification at $x = 1$ of light passing through the first lens and then the second lens? Can you find a formula for the magnification at $a$ of light passing through both lenses?

5. In the situation of #4, reverse the position of the lenses, so that the “$x^2$-lens” now comes before the “4 times lens.” Does this make any difference to the combined magnification?

6. What rule of differentiation are these problems about?

3.4 The Product and Chain Rules

The rule for the derivative of the sum of two functions is just what you would guess it to be. Guessing the product and chain rules is a bit harder, so it seems worthwhile to explain “why” they are as they are. It is not the purpose of this course to offer a modern proof of these rules (see Math 226 for that) but I will try to make them seem reasonable.

3.4.1 The Product Rule

Apparently when Gottfried Leibniz first thought about the derivative of the product of two functions in 1675 he made the obvious assumption: the derivative of the product is the product of the derivatives. Ten days later he had the correct formula, due at least in part to experimenting with the derivatives of polynomials. Isaac Newton had known the form of the product rule in 1665 but did not publish it (or anything else concerning his development of the calculus) until after Leibniz had done so. Newton offered a tricky “proof” in support of the product rule which is perhaps worth remembering today only because it was one of the calculus arguments that Bishop George Berkeley chose to attack in his book The Analyst (1734) where he criticized calculus for being far less rational than theology and its practitioners for blindly following rules they didn’t understand. Berkeley’s criticism was somewhat justified since it does appear that most of the calculus users of the seventeenth century were to some extent Vulgar Mechanicks in the sense of the quote from Isaac Newton at the beginning of these notes. It is my personal opinion that Newton himself had an accurate conceptual understanding of the limit concept but not the modern technical terminology for manipulating inequalities that ultimately allowed mathematicians at the beginning of the nineteenth century to finally justify what they had been doing for a century and a half.

For a first intuitive look at the product rule think of trying to find the rate of growth of the total personal income $I$ of the US, thought of as the product of the population $P$ and the per capita income $C$, both of which are functions of time. In a short time $\Delta t$ during which both the population and the per capita income increase, one source of increased income is that each extra individual supplies one person’s income, that is, we have the increase $\Delta P \times C$. There is also the increase in per capita income times the number $P$ of people with that extra income, $P \times \Delta C$. Thus the total change in income is approximately

$$\Delta I \approx \Delta P \times C + P \times \Delta C.$$

Now dividing by the elapsed time $\Delta t$ gives the average rate of change of income as

$$\frac{\Delta I}{\Delta t} \approx \frac{\Delta P}{\Delta t} \times C + P \times \frac{\Delta C}{\Delta t}$$

and the limit as $\Delta t \to 0$ is the familiar form of the product rule.

You should probably be a little uncomfortable about my having in effect said that we can consider the effects of the change in population and the change in per capita income separately, holding each in turn constant while looking at the change in the other. This exact issue will resurface in multivariable calculus where we will study how a quantity (total income here) that depends on several varying quantities ($P$ and $C$ here) changes in response to changes in those quantities. This is the subject of “partial derivatives” which is fundamental to the study of the differential calculus of several variables, but we need not go into it here. (You’ll have to wait for Math 224.)

Instead, here is a pictorial interpretation of the product rule which suggests that the calculation above is correct. If we think now of $P$ and $C$ as lengths that vary with time, then $I$ becomes the area of the rectangle with sides $P$ and $C$. In time $\Delta t$, $P$ changes by an amount $\Delta P$ and $C$ changes by an amount $\Delta C$, giving us this diagram where I have assumed that both $P$ and $C$ are increasing.

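[Figure: a rectangle with sides $P$ and $C$, enlarged by $\Delta P$ on the right and by $\Delta C$ on the top.]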

We see that the difference between the changed area $(P + \Delta P)(C + \Delta C)$ and the original area $PC$ consists of three parts, $\Delta P \times C$ on the right, $P \times \Delta C$ on the top and $\Delta P \times \Delta C$ in the top right corner. The last of these is the part of the difference that we ignored in the example of income growth just above. A picture is certainly not a proof, but the picture makes it very plausible that the part $\Delta P \times \Delta C$ is relatively insignificant in the sense that as $\Delta P$ and $\Delta C$ shrink, then their product not only shrinks but in fact occupies a smaller and smaller percentage of the total change. For instance if $\Delta P$ and $\Delta C$ are 10% of $P$ and $C$ respectively, then $\Delta P \times C$ and $P \times \Delta C$ would each be about 10% of the new area while $\Delta P \times \Delta C$ would only be about 1% of the new area, that is, about a twentieth of the change. If $\Delta P$ and $\Delta C$ are 1% of $P$ and $C$ respectively, then $\Delta P \times C$ and $P \times \Delta C$ would each be about 1% of the new area while $\Delta P \times \Delta C$ would only be about 0.01% of the new area, that is, about one two hundredth of the total change. (You should do the calculations to see that you agree with these assertions.) That is why we can ignore this little piece in the corner as we pass to the limit and consider only the two larger parts of the change.

There is a variation on the product rule that looks much more linear than the rule itself. Recall that the logarithmic derivative of a function $f$ is the quotient $\dfrac{f'}{f}$. You can think of it as representing the “per capita” rate of change of $f$, that is, the rate of change per unit of $f$. If, for instance, $f(a) = 50$ and $f'(a) = 2$ then we expect changing from $a$ to $a + 1$ in the input to result in a change of about 2 in the output, that is, about a 4% change in the value of $f$. Correspondingly, $\dfrac{f'(a)}{f(a)} = \dfrac{2}{50} = .04$. Now the “logarithmic form” of the product rule is (using the notation of the previous example)

$$\frac{(PC)'}{PC} = \frac{P'C + PC'}{PC} = \frac{P'}{P} + \frac{C'}{C}$$

so that for instance we would expect a 1% increase in $P$ and a 2% increase in $C$ to result in a $1 + 2 = 3\%$ increase in $I = PC$. Once again we are saying in effect that we can look at the individual rates of change one at a time and add the effects.
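The rectangle bookkeeping and the percentage version of the product rule can both be checked with a few lines of arithmetic. This is a sketch only: the side lengths 300 and 40 are hypothetical numbers picked for illustration, and `change_breakdown` is my own helper name.

```python
def change_breakdown(P, C, dP, dC):
    """Split the change in the area P*C into its three rectangular pieces."""
    side = dP * C      # strip on the right
    top = P * dC       # strip on the top
    corner = dP * dC   # small rectangle in the top right corner
    total = (P + dP) * (C + dC) - P * C
    return side, top, corner, total


P, C = 300.0, 40.0  # hypothetical lengths, chosen only for illustration
for percent in (10.0, 1.0, 0.1):
    dP, dC = P * percent / 100, C * percent / 100
    side, top, corner, total = change_breakdown(P, C, dP, dC)
    print(f"{percent}% changes: corner is {corner / total:.5f} of the total change")

# Logarithmic form: 1% growth in P together with 2% growth in C.
relative_change = (1.01 * 1.02) - 1.0
print(f"relative change in P*C: {relative_change:.4f}")
```

The corner's share of the total change shrinks roughly in proportion to the percentage change itself, and the combined growth from 1% and 2% comes out very close to 3%, with the small excess again being the corner term.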

3.4.2 The Chain Rule

The chain rule relates the derivative of a composite function to the derivatives of its parts. For functions changing at a constant rate, it is just common sense. If light passes through two lenses, where the first lens magnifies by a factor of 3 (that is, it triples the length of an interval, or moves any two points 3 times as far away from each other as they were initially) and the second lens magnifies by a factor of 5, then the total effect of passing through both lenses would be to magnify by a factor of $3 \times 5 = 15$. Thus the combined effect is the product of the individual effects.

Symbolically, a function changing at a constant rate is a linear function: $F(x) = ax + b$. If we consider the composition $H$ of $F$ with another linear function $G(x) = cx + d$ we have

$$H(x) = F(G(x)) = aG(x) + b = a(cx + d) + b = acx + ad + b.$$

Thus the slope of $H$ is $ac$, the product of the slopes of $F$ and of $G$.

For the composition of functions with variable rates of change, one might hope that the derivative of the composition would still be the product of the derivatives of the individual functions, but as the mapping diagram below indicates, one would expect to have to evaluate the individual derivatives at the right points. For instance, if $H(x) = \sin(x^2)$, what is $H'\left(\sqrt{\pi/2}\right)$? Here $G(x) = x^2$, $F(x) = \sin x$. If $x = \sqrt{\pi/2}$, then $x^2 = \pi/2$, so $H\left(\sqrt{\pi/2}\right) = \sin(\pi/2) = 1$. If $H'\left(\sqrt{\pi/2}\right)$ is a product of the derivative of $x^2$ somewhere and the derivative of $\sin x$ somewhere, then the diagram below suggests that we should evaluate the derivative of $x^2$ at $\sqrt{\pi/2}$ (the point where we evaluate $x^2$ to compute $H\left(\sqrt{\pi/2}\right)$) and the derivative of $\sin x$ at $\pi/2$ (the point where we evaluate $\sin x$ to compute $H\left(\sqrt{\pi/2}\right)$).

[Figure 3-1: Mapping diagram for $\sin(x^2)$.]

This expectation turns out to be right. The chain rule in functional notation for the composition $H(x) = F(G(x))$ is that for a point $a$ in the domain of $H$ (or in the domain of the inner function $G$)

$$H'(a) = F'(G(a))\,G'(a).$$
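A quick numerical comparison makes the point about evaluating each derivative “at the right point.” The sketch below uses the example $H(x) = \sin(x^2)$ from the text; the helper `slope`, the step size and the extra sample point $a = 0.5$ are my own assumptions for illustration.

```python
import math


def G(x):
    return x ** 2


def F(x):
    return math.sin(x)


def H(x):
    return F(G(x))


def slope(f, a, h=1e-6):
    """Symmetric difference quotient estimate of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)


for a in (0.5, math.sqrt(math.pi / 2)):
    direct = slope(H, a)
    product = slope(F, G(a)) * slope(G, a)  # F' at G(a), times G' at a
    wrong = slope(F, a) * slope(G, a)       # both derivatives taken at a
    print(a, direct, product, wrong)
```

In each row the first two numbers should agree closely, while the third, which takes the derivative of the outer function at the wrong point, generally does not.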

A careful justification of the chain rule relies on local linearity: that in a small interval (all that is required to compute the derivative at a single point) every differentiable function is nearly linear. Verifying that the errors (the deviations from linearity) die out as $x \to a$ takes more careful bookkeeping than is appropriate here, but it is reasonably easy to see the outline of how it would go.

We must look at the difference $H(x) - H(a)$ and how it relates to $x - a$. We will use local linearity in the form that for $x$ near $a$, and any function $f$ differentiable at $a$,

$$\frac{f(x) - f(a)}{x - a} = f'(a) + (\text{a quantity that} \to 0 \text{ as } x \to a)$$

or

$$f(x) - f(a) = f'(a)(x - a) + (\text{a quantity that} \to 0 \text{ as } x \to a)(x - a).$$

Thus for the inner function $G$ local linearity says that

$$G(x) - G(a) = G'(a)(x - a) + (\text{a quantity that} \to 0 \text{ as } x \to a)(x - a).$$

Now $H(x) - H(a) = F(G(x)) - F(G(a))$ and using local linearity for $F$ near the point $G(a)$ in its domain (that is, as we shift from evaluating $F$ at $G(a)$ to evaluating $F$ at $G(x)$),

$$F(G(x)) - F(G(a)) = F'(G(a))\,\{G(x) - G(a)\} + \{\text{a quantity that} \to 0 \text{ as } G(x) \to G(a)\}\,(G(x) - G(a)).$$


Substituting for $G(x) - G(a)$ from the local linearity statement for $G$,

$$\begin{aligned}
F(G(x)) - F(G(a)) ={}& F'(G(a))\,G'(a)(x - a) \\
&+ F'(G(a))\,\{\text{a quantity that} \to 0 \text{ as } x \to a\}\,(x - a) \\
&+ \{\text{a quantity that} \to 0 \text{ as } G(x) \to G(a)\}\,(G(x) - G(a)).
\end{aligned}$$

Dividing through by $x - a$ to get the difference quotient that we must examine to compute the derivative of $H$:

$$\frac{H(x) - H(a)}{x - a} = \frac{F(G(x)) - F(G(a))}{x - a} = \frac{F'(G(a))\,G'(a)(x - a)}{x - a} + (\text{other terms}).$$

The first term is just what we want. The first of the other terms is (after cancelling factors of $x - a$) just $F'(G(a)) \times (\text{a quantity that} \to 0 \text{ as } x \to a)$ and so certainly approaches 0 as $x \to a$. The second term is more complicated but has the form

$$\frac{G(x) - G(a)}{x - a}\,(\text{a quantity that} \to 0 \text{ as } G(x) \to G(a))$$

where the quotient approaches the limit $G'(a)$ and the other expression approaches 0 as $x \to a$ because $G$ is a continuous function. Thus the term approaches $G'(a) \times 0 = 0$ also. Checking that all of this really works out requires some facility with the methods of rigorous analysis, but there are no surprises.

EXERCISES.

1. Use the product rule and the chain rule to derive the quotient rule. (If you can’t quite remember the quotient rule, so much the better. Use the other rules to figure it out, recalling that $\dfrac{1}{f(x)} = (f(x))^{-1}$.)

2. The frequency of vibration of a guitar string is given by $f = \dfrac{1}{2L}\sqrt{\dfrac{T}{\rho}}$ where $L$ is the length of the string, $\rho$ is its linear density, and $T$ is the tension in the string.

(a) Use logarithmic differentiation to estimate the percentage change (and direction of change) in the frequency resulting from a 1% increase in the tension in the string.

(b) You have a strange instrument in which you can adjust the length of the strings as well as their tension. If you decrease the length of a string by 1%, how would you expect to adjust the string’s tension to produce the same note as before?


3. A book has 400 words per page.

(a) If there are 9 pages per section, then how many words per section?

(b) You can read 200 words per minute. How many pages per minute? How many minutes does it take you to read one section?

(c) Show how to express the units of the answer in (b), minutes/section, in terms of the units of the given numbers (words/page, pages/section, words/minute) and verify that combining the numbers in that way does produce the answer in (b).

3.5 First Applications of the Chain Rule

We will look at some standard applications of the chain rule in the next chapter. For the moment, here are two applications of the chain rule to building the apparatus of symbolic differentiation.

3.5.1 Derivative of $b^x$

We know that for each $b > 0$ there is a constant $C_b$, presumably depending on $b$, so that

\[ \frac{d}{dx} b^x = C_b\, b^x. \]

We chose the number $e$ precisely to be the base for which this constant is equal to 1. We can now use this simple derivative formula and the chain rule to identify $C_b$ for all $b > 0$. The key is to translate all exponential functions into functions with base $e$. We write

\[ b^x = e^{\ln(b^x)} = e^{x \ln b}. \]

Differentiating,

\[ \frac{d}{dx} b^x = \frac{d}{dx} e^{x \ln b} = e^{x \ln b}\,\frac{d}{dx}(x \ln b) = e^{x \ln b} \ln b = \ln b \cdot b^x. \]

That is, $C_b = \ln b$. (In particular, $C_e = \ln e = 1$, which is what we should get!)
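As a quick numerical sanity check of $C_b = \ln b$, here is a minimal sketch. Since $\frac{d}{dx} b^x = C_b\, b^x$, the constant $C_b$ is the slope of $b^x$ at $x = 0$. The function name, the sample bases, and the use of a symmetric difference quotient (chosen here only because it is more accurate than a one-sided one) are assumptions of this illustration.

```python
import math

def exp_derivative_constant(b, h=1e-6):
    """Estimate C_b, the slope of b**x at x = 0, by a symmetric difference quotient."""
    return (b ** h - b ** (-h)) / (2.0 * h)

if __name__ == "__main__":
    for b in (0.5, 2.0, math.e, 10.0):
        estimate = exp_derivative_constant(b)
        print(f"b = {b:.4f}: estimate = {estimate:.6f}, ln b = {math.log(b):.6f}")
```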

3.5.2 Differentiating Inverse Functions

We begin by noting that, as usual, for functions changing at a constant rate, the result is clear. For instance, suppose that one piece of bubble gum costs 5 cents. (I am remembering fondly the days of my youth.) Then the function giving the cost in cents of $n$ pieces of bubble gum is $C(n) = 5n$. Clearly each increase of 1 in pieces of bubble gum corresponds to a cost increase of 5 cents, that is, $C' = 5$. (Writing $C'$ does imply that I am thinking of $n$ as a real number that is not necessarily an integer, so that difference quotients for small $h$ make sense.) The inverse function, $C^{-1}$, would be a “buying power function.” $C^{-1}(c)$ would be the number of pieces of bubble gum you could buy for $c$ cents. Since it takes 5 extra cents to get one extra piece of gum, if you have 1 additional cent, you will be able to buy only 1/5 of an additional piece of bubble gum. (And then only if you know an unusually cooperative merchant.) Symbolically, $(C^{-1})' = 1/5$. The point is that the rates of change of the bubble gum cost function and its inverse, the buying power function, are reciprocals. We will see that this is generally true, but that as soon as the rates of change are no longer constant, you have to be very careful: the derivatives of a function and of its inverse function are reciprocals not when you evaluate them at the same point, but when you evaluate them at corresponding points in their respective domains.

We begin by deriving a formula for the derivative of $\ln x$. Note first of all that if we just look at the difference quotient, the result is not that clear:

\[ \frac{\ln(x + h) - \ln x}{h} = \frac{\ln\frac{x + h}{x}}{h} = \frac{\ln\left(1 + \frac{h}{x}\right)}{h}. \]

We need to find the limit of this quantity as $h \to 0$. The value of the limit is not very clear by inspection, and furthermore the value appears to depend on $x$, so that there is more to do here than there was with the trigonometric functions, where it was only necessary to compute derivatives at $x = 0$ in order for algebra to suffice to provide derivatives at all other points.

Fortunately we can avoid the problem of evaluating this limit by proceeding in a completely different way. $\ln x$ is the inverse function to $e^x$. Thus $e^{\ln x} = x$. Let’s differentiate both sides of this equation, using the chain rule on the left, and the fact that $\frac{d}{dx} e^x = e^x$. We get (reversing the sides of the equation),

\[ 1 = \frac{d}{dx} x = \frac{d}{dx}\left(e^{\ln x}\right) = e^{\ln x}\,\frac{d}{dx}\ln x = x\,\frac{d}{dx}\ln x. \]

Now if we just divide both sides of the equation by $x$, we get the familiar formula

\[ \frac{d}{dx}\ln x = \frac{1}{x}. \]

For a second example of this procedure, we’ll find the derivative of an inverse trig function. Recall that $\sin x$ is increasing for $-\pi/2 \le x \le \pi/2$ with range $-1 \le y \le 1$, so that we can define an inverse function, generally denoted by $\arcsin x$, with domain $-1 \le x \le 1$ and range $-\pi/2 \le y \le \pi/2$.

[Figure: graphs of $\sin x$ and $\arcsin x$]

Thus $x = \sin(\arcsin x)$ for $-1 \le x \le 1$. As before,

\[ 1 = \frac{d}{dx}\sin(\arcsin x) = \cos(\arcsin x)\,\frac{d}{dx}\arcsin x \]

so that

\[ \frac{d}{dx}\arcsin x = \frac{1}{\cos(\arcsin x)}. \]

This is a little less satisfactory than the previous example, since the denominator here is a rather inscrutable function. We can, however, make it look a good deal more familiar by using some triangle trigonometry. Set $\theta = \arcsin x$ so that $\sin\theta = x$. We can visualize a right triangle with hypotenuse 1 where the side opposite the angle $\theta$ has length $x$. By Pythagoras, the third side has length $\sqrt{1 - x^2}$. Thus $\cos\theta = \cos(\arcsin x) = \sqrt{1 - x^2}$.

[Figure: right triangle with hypotenuse 1, angle $\theta$, opposite side $x$, and adjacent side $\sqrt{1 - x^2}$]

Plugging this into the result obtained from the chain rule,

\[ \frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1 - x^2}}. \]
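The formula can be compared with a direct numerical slope. This is a minimal sketch, not part of the derivation: the function names, the sample points, and the symmetric difference quotient are assumptions made for the illustration, and the check only makes sense for $-1 < x < 1$, where the denominator is nonzero.

```python
import math

def arcsin_slope(x, h=1e-6):
    """Symmetric difference quotient for arcsin at x (requires -1 < x - h and x + h < 1)."""
    return (math.asin(x + h) - math.asin(x - h)) / (2.0 * h)

def arcsin_derivative(x):
    """The formula derived above: 1 / sqrt(1 - x^2), for -1 < x < 1."""
    return 1.0 / math.sqrt(1.0 - x * x)

if __name__ == "__main__":
    for x in (-0.9, -0.5, 0.0, 0.3, 0.8):
        print(f"x = {x:+.1f}: slope = {arcsin_slope(x):.6f}, "
              f"formula = {arcsin_derivative(x):.6f}")
```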


Now we’ll try the general case. The first part of the preceding two examples can be carried out for any invertible function $f$ and its inverse $f^{-1}$. (Don’t confuse this with the use of the exponent $-1$ to mean reciprocal. For instance the reciprocal of the sine function is the cosecant function, $\csc x$, whose domain consists of all real numbers except multiples of $\pi$ (where $\sin x = 0$) and whose range consists of all numbers greater than or equal to 1 in absolute value. This is nothing at all like $\arcsin x$. It is exactly to avoid this confusion that the “arc” terminology for inverse trig functions is used. Unfortunately for most other inverse functions one must just cope with the ambiguity of the $-1$ notation and draw the correct meaning from context.)

Differentiating $x = f(f^{-1}(x))$, we get

\[ 1 = \frac{d}{dx} f\left(f^{-1}(x)\right) = f'\left(f^{-1}(x)\right)\left(f^{-1}\right)'(x) \]

or

\[ \left(f^{-1}\right)'(x) = \frac{1}{f'\left(f^{-1}(x)\right)}. \]

Here we see explicitly what was implicit in the two previous examples: the derivative of $f^{-1}$ is the reciprocal of the derivative of $f$ provided you evaluate the two derivatives not at the same point, but at corresponding points. The slope of the tangent line to the graph of $f^{-1}$ at the point $(x, f^{-1}(x))$ is the reciprocal of the slope of the tangent line to the graph of $f$ at the point $(f^{-1}(x), x)$ obtained by reversing the coordinates.

[Figure: tangents to $e^x$ at $(1, e)$ and to $\ln x$ at $(e, 1)$]

The tangent line to $y = e^x$ at the point $(1, e)$ has slope $e$. The tangent line to $y = \ln x$ at $(e, 1)$ has slope $1/e$.
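The role of corresponding points can be seen numerically. The sketch below is an illustration under assumptions (the helper names and the symmetric difference quotient are not from the notes): it applies the rule $(f^{-1})'(x) = 1/f'(f^{-1}(x))$ with $f = \exp$ and $f^{-1} = \ln$, and also shows what goes wrong if $f'$ is evaluated at the same point $x$ instead.

```python
import math

def slope(func, x, h=1e-6):
    """Symmetric difference quotient approximating the tangent slope of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def inverse_slope_from_rule(f_prime, f_inverse, x):
    """(f^{-1})'(x) = 1 / f'(f^{-1}(x)): evaluate f' at the corresponding point."""
    return 1.0 / f_prime(f_inverse(x))

if __name__ == "__main__":
    x = math.e
    print("slope of exp at 1        :", slope(math.exp, 1.0))
    print("slope of ln at e         :", slope(math.log, x))
    print("rule, corresponding point:", inverse_slope_from_rule(math.exp, math.log, x))
    # Evaluating f' at the same point x = e gives a different, wrong value.
    print("1 / exp'(e), same point  :", 1.0 / math.exp(x))
```

The first three values agree with $e$, $1/e$, and $1/e$ respectively; the last one does not equal $1/e$, which is the point of the warning about corresponding points.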

EXERCISES

1. Use the chain rule, the fact that $\frac{d}{dx}\sin x = \cos x$, and the identity $\cos x = \sin(x + \pi/2)$ to show that $\frac{d}{dx}\cos x = -\sin x$. Explain where the chain rule was used.


2. Finish the direct calculation of $\frac{d}{dx}\ln x$ by showing that if $u = h/x$, then

\[ \frac{\ln(x + h) - \ln x}{h} = \frac{1}{x}\,\frac{\ln(1 + u)}{u}. \]

Interpret the quotient on the right side of this equation in terms of the graph of $\ln x$ (slope of what secant line?). Evaluate the limit numerically to find $\frac{d}{dx}\ln x$.

3. The hyperbolic cosine function, $\cosh x$, is defined by $\cosh x = \frac{e^x + e^{-x}}{2}$. Verify that the derivative of $\cosh$ is the hyperbolic sine function $\sinh x = \frac{e^x - e^{-x}}{2}$. What is the largest interval containing $x = 1$ on which $\cosh x$ is invertible? Find the derivative of the inverse of the restriction of $\cosh x$ to this interval at $x = \frac{5}{4}$. (Suggestion: don’t look for a formula for the inverse. Verify and use that $\cosh(\ln 2) = 5/4$.) Illustrate with a diagram.

4. $\cosh x$ is also invertible on an interval containing $x = -1$, and it is also true that $\cosh(-\ln 2) = 5/4$. What is the derivative at $5/4$ of this inverse of a restriction of $\cosh x$? Illustrate with a diagram.

5. Now derive a formula for the derivative of the inverse of the restriction of $\cosh x$ to the interval containing $x = 1$ from the previous problem. You will need that $\frac{d}{dx}\cosh x = \sinh x$ and that $\cosh^2 x - \sinh^2 x = 1$. What is the derivative of the inverse of the restriction of $\cosh x$ to the interval containing $-1$? Illustrate the answer with a diagram.

6. Explain why the function $f(x) = x^3 + 2x - 1$ has an inverse function $g$. Find the domain and range of $g$. How do you know? Show that $(-1, 0)$ is on the graph of $g$ and find $g'(-1)$. Illustrate your reasoning with a diagram.

7. This problem is still about the function $f(x) = x^3 + 2x - 1$ and its inverse $g$.

(a) Find (at least approximately) all points $x$ such that $f(x) = g(x)$. (Suggestion: Remember from problem 5 in section 1.8.1 where $f$ and $g$ can intersect.)

8. Explain why $\tan 2x$ has an inverse function $g$ for $\frac{\pi}{4} < x < \frac{3\pi}{4}$. What is the domain and range of this inverse? Graph $\tan 2x$ restricted to $\frac{\pi}{4} < x < \frac{3\pi}{4}$ and its inverse $g$ on the same set of axes. (Why don’t I want to call the inverse function $\arctan(x)$?) Show that $\left(0, \frac{\pi}{2}\right)$ is on the graph of $g$ and determine $g'(0)$. Don’t try to work out a general formula for $g$ or $g'$. Just reason from the diagram.

9. Derive a formula for the derivative of $\arctan x$. What are the domain and range of $\arctan x$ and its derivative?

10. Derive a formula for the derivative of $\arccos x$. What are the domain and range of $\arccos x$ and its derivative?

11. Derive a formula for the derivative of $\operatorname{arcsec} x$, the inverse of $\sec x$. What are the domain and range of $\operatorname{arcsec} x$ and its derivative?

12. Find the largest possible interval containing $x = 3$ on which $|\sin x|$ has an inverse $g$. What is the domain and range of $g$? Sketch a graph of $g$ and compute $g'(1/2)$.


3.6 Summary of the Rules of Differentiation

Derivatives of some common functions

1. $\frac{d}{dx} x^a = a x^{a-1}$, $a$ any real number

2. $\frac{d}{dx} \ln x = \frac{1}{x}$

3. $\frac{d}{dx} a^x = (\ln a)\, a^x$, $a$ any positive number

4. $\frac{d}{dx} \sin x = \cos x$; $\frac{d}{dx} \cos x = -\sin x$; $\frac{d}{dx} \tan x = \sec^2 x$

General rules of differentiation

5. Sums: $(f + g)'(x) = f'(x) + g'(x)$ whenever $f$ and $g$ are differentiable at $x$

6. Multiples: $(af)'(x) = a f'(x)$ whenever $f$ is differentiable at $x$ and $a$ is any real number

7. Products: $(fg)'(x) = f'(x)\,g(x) + f(x)\,g'(x)$ whenever $f$ and $g$ are differentiable at $x$

8. Quotients: $\left(\frac{f}{g}\right)'(x) = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{(g(x))^2}$ whenever $f$ and $g$ are differentiable at $x$ and $g(x) \ne 0$

9. Chain rule: $(f \circ g)'(x) = f'(g(x))\,g'(x)$ whenever $f'(g(x))$ and $g'(x)$ exist

Common uses of the chain rule

10. $\frac{d}{dx} u^a = a u^{a-1}\,\frac{du}{dx}$ if $u$ is a differentiable function and $a$ any real number

11. $\frac{d}{dx} e^u = e^u\,\frac{du}{dx}$ if $u$ is a differentiable function


3.7 Velocity Along a Parametrized Curve

3.7.1 Measuring Change of Position

We want to discuss the velocity of an object moving along a parametrized curve $x = f(t)$, $y = g(t)$ in the plane. Since velocity is the rate of change of position, we must first discuss displacement (change of position) in the plane.

If an object moves from the point (2, 1) to the point (3, 5), then its x-coordinate has increased by 1 and its y-coordinate has increased by 4. Similarly, if an object moves from (−2, 0) to (−1, 4) or from (3, −2) to (4, 2) then again its x-coordinate has increased by 1 and its y-coordinate has increased by 4. In each case there has been the same displacement, that is, the same change of position.

We can indicate the result of this displacement symbolically as the result of a kind of addition. To go from (2, 1) to (3, 5) we write

(2, 1) + [1, 4] = (3, 5).

Note that you can regard this addition as just adding first components and also adding second components.

The reason for the square brackets in [1, 4] is that this does not refer to the point (1, 4) in the plane, but to something different: a movement by one unit to the right and four units up. It is not associated with any particular place in the plane and we could equally well write

(−2, 0) + [1, 4] = (−1, 4) or (3,−2) + [1, 4] = (4, 2) .

Normally we prefer to add like to like, rather than adding different kinds of objects to one another. So let’s change our point of view a little to remember that the coordinates of a point in the plane really indicate displacement from the origin: the point (1, 2) is by definition one unit to the right and two units up from the origin. So for discussing position of objects in the plane we’ll think solely in terms of displacement from some base position. Which position is normally clear from context.

Thus we can now rewrite all the sums above using square brackets throughout, e.g.

[2, 1] + [1, 4] = [3, 5].

To repeat: our interpretation of this equation is that if we start two units to the right and one unit up from the origin (or any other base point in the plane) and move a further one unit to the right and four units up, then we finish three units to the right and five units up from the base point.

Note that the order of adding the terms does not matter. (In fancy language, this new addition is commutative.) We also have

[1, 4] + [2, 1] = [3, 5].


There is a nice pictorial version of this whole process. We can indicate the displacement from (2, 1) to (3, 5) pictorially by drawing an arrow with its tail on (2, 1) and its head on (3, 5). The displacement from (−2, 0) to (−1, 4) or from (3, −2) to (4, 2) would be indicated by arrows of the same length and direction, but located at a different point in the plane. From the point of view of measuring displacement, it is the length and direction that matter, not the position. So we will view all three arrows as the same, and denote any one of them by [1, 4].

[Figure: 3 pictures of [1, 4]]

We can give an arrow version of the addition of displacements. For the arrow version of [2, 1] + [1, 4] = [3, 5], represent [2, 1] by an arrow with tail at (0, 0), [1, 4] by an arrow with tail at the head of [2, 1], that is, at the point (2, 1), and [3, 5] by an arrow with tail at (0, 0) as in the diagram below.

[Figure: arrows for [2, 1], [1, 4], and [3, 5], illustrating [2, 1] + [1, 4] = [3, 5]]

The interpretation of the diagram is that if we start at (0, 0) and move first two units to the right and one unit up, and then move from there an additional one unit to the right and four units up, the effect is the same as a single move of three units to the right and five units up.


Operationally, we can think of the addition of arrows as taking place via the parallelogram law: if we place the arrows we’re adding along two sides of a parallelogram, the sum will be the diagonal of the parallelogram.

[Figure: parallelogram illustrating [2, 1] + [1, 4] = [3, 5]]

Here I have filled in the other two sides of the parallelogram with dashed lines. Of course they can be thought of as showing the addition of the two arrows in the opposite order.

Looking at the pictures, we are reminded that the arrows have a definite length. It is just given by the distance formula (or the Pythagorean Theorem). For instance, the length of [2, 1] is $\sqrt{2^2 + 1^2} = \sqrt{5}$. Since this is a measure of size, we will use a notation reminiscent of absolute value for numbers and write $\|[2, 1]\| = \sqrt{5}$. This also suggests another operation on displacements. Moving 4 units to the right and 2 units up would produce an arrow pointing in the same direction but twice as long, $\|[4, 2]\| = 2\sqrt{5}$ (check if you doubt it). So it seems reasonable to regard [4, 2] as being twice [2, 1], that is, [4, 2] = 2 [2, 1]. Here the arithmetic also happens component-wise, but we really are multiplying two different things together, the real number 2 and the displacement [2, 1], to get the displacement [4, 2].

It is time to make all of this more formal in the following definitions.

Definition 6 A vector is an ordered pair $[a, b]$ of real numbers. Vectors can be added

\[ [a_1, b_1] + [a_2, b_2] = [a_1 + a_2, b_1 + b_2] \]

and multiplied by real numbers

\[ c\,[a, b] = [ca, cb]. \]

The length (or norm) of a vector $[a, b]$ is the number

\[ \|[a, b]\| = \sqrt{a^2 + b^2}. \]


So formally vectors are algebraic objects. But in working with vectors it is essential to keep the picture of vectors as directed arrows in mind. The definition does not mention displacement, because vectors turn out to be useful for many purposes besides measuring displacement. In this course we will be interested in vectors related to motion almost exclusively, but you will see vectors used to represent other quantities in other courses.
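Definition 6 translates almost word for word into code. The sketch below is one possible rendering, assuming vectors are stored as two-element Python lists; the function names `add`, `scale`, and `norm` are choices made for this illustration.

```python
import math

def add(v, w):
    """Component-wise sum [a1 + a2, b1 + b2] of two vectors."""
    return [v[0] + w[0], v[1] + w[1]]

def scale(c, v):
    """Multiple c[a, b] = [ca, cb] of a vector by a real number c."""
    return [c * v[0], c * v[1]]

def norm(v):
    """Length sqrt(a^2 + b^2) of the vector [a, b]."""
    return math.hypot(v[0], v[1])

if __name__ == "__main__":
    print(add([2, 1], [1, 4]))                  # the sum worked out in the text
    print(add([1, 4], [2, 1]))                  # same sum, opposite order
    print(scale(2, [2, 1]))                     # twice [2, 1]
    print(norm([2, 1]), norm(scale(2, [2, 1])))
```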

3.7.2 Velocity in the Plane

Now that we know what displacement is, we are ready to talk about the rate of change of displacement, that is, about velocity. So imagine a particle moving around the unit circle according to the parametric equations

x = cos t

y = sin t.

What is the velocity of this particle, say at time $t = 0$ when it is at the point (1, 0)? In studying velocity for motion in a straight line in Section 2.1 we began with average velocity (just the ratio of a change of position, a displacement, to the elapsed time for that displacement) and then defined the instantaneous velocity at a given time $a$ as the limit of average velocities as the elapsed time shrinks down to zero. (Strictly, that’s what instantaneous velocity is whenever the limit exists.) Now we will do the same again, but it will be a calculation with vectors.

At time 0 the particle is at (1, 0). At time $\pi/2$ it is at (0, 1). The displacement is the vector $[-1, 1]$ and the average velocity of the particle from $t = 0$ to $t = \pi/2$ is

\[ \text{average velocity} = \frac{\text{displacement}}{\text{time}} = \frac{[-1, 1]}{\pi/2} = \frac{2}{\pi}[-1, 1] = \left[-\frac{2}{\pi}, \frac{2}{\pi}\right]. \]

[Figure: the unit circle with the displacement vector [−1, 1]]

In particular the average velocity is a vector. What is still a number is the average speed of the particle. This is the length of the average velocity vector, that is

\[ \text{average speed} = \left\|\left[-\frac{2}{\pi}, \frac{2}{\pi}\right]\right\| = \frac{2\sqrt{2}}{\pi} \approx .90. \]

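Just as with motion along the x-axis we now make the time interval shorter. For instance, if we consider $t = 0$ and $t = \pi/4$ we have a displacement of $\left[\frac{\sqrt{2}}{2} - 1, \frac{\sqrt{2}}{2}\right]$ so that

\[ \text{average velocity} = \frac{\text{displacement}}{\text{time}} = \frac{\left[\frac{\sqrt{2}}{2} - 1, \frac{\sqrt{2}}{2}\right]}{\pi/4} = \frac{4}{\pi}\left[\frac{\sqrt{2}}{2} - 1, \frac{\sqrt{2}}{2}\right] \approx [-.373, .900] \]

and

\[ \text{average speed} = \|[-.373, .900]\| \approx .974. \]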

The general calculation, from $t = 0$ to $t = h$, would be

\[ \text{average velocity} = \frac{\text{displacement}}{\text{time}} = \frac{[\cos h - 1, \sin h]}{h} = \left[\frac{\cos h - 1}{h}, \frac{\sin h}{h}\right]. \]

Now we have to decide what happens as $h \to 0$. But we can recognize what happens to each of the components of this vector (or I hope we can), since each component is just a difference quotient for a familiar derivative. The first is the difference quotient for the derivative of $\cos t$ at $t = 0$ and the second is the difference quotient for the derivative of $\sin t$ at $t = 0$. So we know

\[ \frac{\cos h - 1}{h} \to -\sin 0 = 0; \qquad \frac{\sin h}{h} \to \cos 0 = 1 \]

and hence

\[ \left[\frac{\cos h - 1}{h}, \frac{\sin h}{h}\right] \to [0, 1] \]

as $h \to 0$. Thus it seems natural to decide that the instantaneous velocity of the particle at $t = 0$ is the vector $[0, 1]$. The instantaneous speed is the length of this vector, that is, the speed at $t = 0$ is 1.

If we repeat the calculation at $t = a$, that is, consider the displacement from $t = a$ to $t = a + h$ and divide by the elapsed time $h$, we get

\[ \text{average velocity} = \frac{\text{displacement}}{\text{time}} = \frac{[\cos(a + h) - \cos a, \sin(a + h) - \sin a]}{h} = \left[\frac{\cos(a + h) - \cos a}{h}, \frac{\sin(a + h) - \sin a}{h}\right] \]

and again we can recognize the components of average velocity as difference quotients for computing the derivative of cosine and sine at $t = a$, that is,

\[ \frac{\cos(a + h) - \cos a}{h} \to -\sin a; \qquad \frac{\sin(a + h) - \sin a}{h} \to \cos a \]

and we conclude that the instantaneous velocity of the particle at $t = a$ is the vector $[-\sin a, \cos a]$. Note that the instantaneous speed is $\|[-\sin a, \cos a]\| = 1$ for all $a$. Thus the speed is constant (as one would intuitively think for such a regular motion), but the direction of the velocity keeps changing.
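The average velocities computed above, and their limit, can be reproduced with a few lines of code. This is a minimal sketch under assumptions: the function names and the list-of-two-numbers representation of a vector are choices made for the illustration.

```python
import math

def position(t):
    """Position on the unit circle at time t: [cos t, sin t]."""
    return [math.cos(t), math.sin(t)]

def average_velocity(a, h):
    """Displacement from time a to time a + h, divided by the elapsed time h."""
    p0, p1 = position(a), position(a + h)
    return [(p1[0] - p0[0]) / h, (p1[1] - p0[1]) / h]

def speed(v):
    """Length of a velocity vector."""
    return math.hypot(v[0], v[1])

if __name__ == "__main__":
    for h in (math.pi / 2, math.pi / 4, 0.1, 0.001):
        v = average_velocity(0.0, h)
        print(f"h = {h:.4f}: average velocity = [{v[0]:+.4f}, {v[1]:+.4f}], "
              f"average speed = {speed(v):.4f}")
```

For $h = \pi/2$ and $h = \pi/4$ the output should match the values worked out by hand above; for smaller $h$ the average velocity should approach $[0, 1]$.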

[Figure: Velocity vectors]

Note that each of the velocity vectors is tangent to the track of the particle. This is what we expect: over a very short time the curve is approximately a straight line, the tangent line. Thus the change of position is approximately in the direction of the tangent line. In the limit, as the elapsed time shrinks down to zero, the direction of the instantaneous rate of change should be exactly the tangent direction.

There is one more very important general observation we can make from thinking about generalizing the example to an arbitrary parametrized curve, $x = f(t)$, $y = g(t)$. If we compute average velocity from $t = a$ to $t = a + h$, the displacement vector is $[f(a + h) - f(a), g(a + h) - g(a)]$ and we will again get a vector of difference quotients:

\[ \text{average velocity} = \frac{\text{displacement}}{\text{time}} = \frac{[f(a + h) - f(a), g(a + h) - g(a)]}{h} = \left[\frac{f(a + h) - f(a)}{h}, \frac{g(a + h) - g(a)}{h}\right] \]

so that taking the limits as $h \to 0$ will again give us a vector of derivatives. This will be the instantaneous velocity along the curve at $t = a$:

\[ \text{instantaneous velocity at } t = a \text{ is } [f'(a), g'(a)]. \]


The instantaneous speed is just the length of the velocity vector:

\[ \text{instantaneous speed at } t = a \text{ is } \|[f'(a), g'(a)]\| = \sqrt{(f'(a))^2 + (g'(a))^2}. \]

Example 1. For the motion along a parabola when we first considered parametric equations, we had

\[ x = (t - 1)^2, \qquad y = t. \]

The velocity vector associated with this motion is $[2(t - 1), 1]$ with speed $\sqrt{4(t - 1)^2 + 1}$. Notice that as $t$ increases from some negative value, the x-component of the velocity is negative until $t = 1$, reflecting the fact that the x-component of position is decreasing (the particle is moving to the left along the lower arm of the parabola), and thereafter the x-component of velocity is positive, reflecting the fact that the particle is now moving back to the right along the upper arm of the parabola. Clearly the speed is very large when $|t|$ is large (approximately $2|t|$ in fact) and is smallest at $t = 1$ when the particle is at the vertex of the parabola.

Example 2. For the thrown yam from the same section, it was stated at the time that initially the vertical velocity and horizontal velocity were both equal to 10. We can now say that the initial velocity vector is $[10, 10]$, so that the actual initial speed of the yam is $\|[10, 10]\| = 10\sqrt{2} \approx 14.14$ meters/second. Since the position of the yam at time $t$ is $x = 10t$, $y = 2 + 10t - 4.90t^2$, the velocity of the yam at time $t$ is $[10, 10 - 9.80t]$. The yam will be at its highest point when this vector points horizontally (just between when it still points upward and when it starts pointing downward), that is, at $t = 10/9.80 \approx 1.02$ seconds. (The same value as before, of course.)
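The numbers in Example 2 can be recomputed directly from the position formulas. The sketch below takes the position $x = 10t$, $y = 2 + 10t - 4.90t^2$ from the text; the function names are assumptions of this illustration.

```python
import math

def yam_position(t):
    """Position of the yam at time t (meters), as given in Example 2."""
    return [10.0 * t, 2.0 + 10.0 * t - 4.90 * t * t]

def yam_velocity(t):
    """Component-wise derivative of the position: [10, 10 - 9.80 t]."""
    return [10.0, 10.0 - 9.80 * t]

def speed(v):
    """Length of a velocity vector."""
    return math.hypot(v[0], v[1])

def time_of_highest_point():
    """Solve 10 - 9.80 t = 0, the time at which the velocity is horizontal."""
    return 10.0 / 9.80

if __name__ == "__main__":
    print("initial speed:", speed(yam_velocity(0.0)))
    t_top = time_of_highest_point()
    print("time of highest point:", t_top)
    print("velocity there:", yam_velocity(t_top))
    print("height there:", yam_position(t_top)[1])
```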

3.7.3 Acceleration in the Plane

Acceleration is the rate of change of velocity, so the process of computing acceleration from velocity is the same as the process of computing velocity from position. Formally, to define the instantaneous acceleration at time $a$ along a parametrized curve $[f(t), g(t)]$ we would start by computing average acceleration: the change in velocity from time $a$ to time $a + h$ divided by the time interval $h$. Thus we have

\[ \frac{[f'(a + h), g'(a + h)] - [f'(a), g'(a)]}{h} = \left[\frac{f'(a + h) - f'(a)}{h}, \frac{g'(a + h) - g'(a)}{h}\right]. \]

Now we take the limit as $h \to 0$. This will turn out as before: taking the limit component-wise will give us the vector whose components are the derivatives of the components of the velocity vector, that is, the instantaneous acceleration at time $t$ along the curve $[f(t), g(t)]$ is the vector of second derivatives $[f''(t), g''(t)]$.

For uniform circular motion at unit speed, $[\cos t, \sin t]$ with velocity vector $[-\sin t, \cos t]$, we get acceleration $[-\cos t, -\sin t]$. This is a vector pointing directly toward the origin. In particular, the acceleration is perpendicular to the velocity. See the diagram below, where the acceleration vectors are the thick ones.

[Figure: Velocity and acceleration vectors, uniform motion]

If we consider non-uniform circular motion, say the parametrized curve $[\cos(t^2), \sin(t^2)]$, then the velocity vectors $[-2t\sin(t^2), 2t\cos(t^2)]$ are still tangent to the unit circle, though of increasing length as $t$ increases. The acceleration vectors, $[-2\sin(t^2) - 4t^2\cos(t^2),\ 2\cos(t^2) - 4t^2\sin(t^2)]$, also increase in length. They are no longer perpendicular to the corresponding velocity vectors, though they seem to be becoming closer to perpendicular as time increases.
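The claim that the acceleration becomes closer to perpendicular to the velocity can be checked by computing the angle between the two vectors at a few times. This is only a sketch: the velocity and acceleration formulas are the ones stated above, but measuring the angle with the dot product is a tool these notes have not introduced at this point and is assumed here, as are the function names.

```python
import math

def velocity(t):
    """Velocity of the motion [cos(t^2), sin(t^2)]."""
    return [-2 * t * math.sin(t * t), 2 * t * math.cos(t * t)]

def acceleration(t):
    """Acceleration: component-wise derivative of the velocity."""
    s, c = math.sin(t * t), math.cos(t * t)
    return [-2 * s - 4 * t * t * c, 2 * c - 4 * t * t * s]

def angle_between(v, w):
    """Angle in degrees between two nonzero vectors, via the dot product (an assumed tool)."""
    dot = v[0] * w[0] + v[1] * w[1]
    cos_angle = dot / (math.hypot(v[0], v[1]) * math.hypot(w[0], w[1]))
    # Clamp to [-1, 1] to guard against rounding slightly outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        angle = angle_between(velocity(t), acceleration(t))
        print(f"t = {t}: angle between velocity and acceleration = {angle:.2f} degrees")
```

The printed angles should increase toward, but stay below, 90 degrees.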

[Figure: Velocity and acceleration vectors, non-uniform motion]

It is natural to ask whether the magnitude of the acceleration vector is the rate of change of speed (the magnitude of the velocity vector). The answer is that it is not quite that simple. We can interpret the acceleration vector as consisting of two components: one component parallel to the velocity vector, and one component perpendicular to the velocity vector. It seems reasonable that the component parallel to the velocity vector should act to change the speed, and in particular that the magnitude of this component should give the rate of change of the speed, and that the component perpendicular to the velocity vector should act to change the direction of motion. This is all true, but requires more analysis of the components of a vector. That is the subject of the next section.

For the uniform motion example above there is no change in speed, only a continual change in direction. That is why the acceleration vectors are perpendicular to the velocity vectors in this case. In the example of non-uniform motion, the fact that the motion is speeding up but the curve remains the same means that the rate of change of direction is increasing rapidly. Think of driving a car around a fixed circular track. The faster you go, the harder it is to turn fast enough to keep on the track. It would take more force to do so, and according to Newton’s second law, the force required is proportional to the acceleration required. This is why the component of the acceleration vector perpendicular to the velocity vector is getting longer as time increases.
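For the non-uniform example $[\cos(t^2), \sin(t^2)]$ these statements can be tested numerically ahead of the analysis in the next section. The sketch below is an illustration under assumptions: it computes the parallel and perpendicular components of the acceleration with dot- and cross-product formulas that the notes have not yet developed, and the function names are hypothetical.

```python
import math

def velocity(t):
    """Velocity of the motion [cos(t^2), sin(t^2)]."""
    return [-2 * t * math.sin(t * t), 2 * t * math.cos(t * t)]

def acceleration(t):
    """Acceleration: component-wise derivative of the velocity."""
    s, c = math.sin(t * t), math.cos(t * t)
    return [-2 * s - 4 * t * t * c, 2 * c - 4 * t * t * s]

def speed(t):
    """Length of the velocity vector at time t."""
    v = velocity(t)
    return math.hypot(v[0], v[1])

def parallel_component(t):
    """Signed length of the part of the acceleration parallel to the velocity (assumed formula)."""
    v, a = velocity(t), acceleration(t)
    return (a[0] * v[0] + a[1] * v[1]) / speed(t)

def perpendicular_component(t):
    """Length of the part of the acceleration perpendicular to the velocity (assumed formula)."""
    v, a = velocity(t), acceleration(t)
    return abs(a[0] * v[1] - a[1] * v[0]) / speed(t)

def rate_of_change_of_speed(t, h=1e-6):
    """Symmetric difference quotient for the speed."""
    return (speed(t + h) - speed(t - h)) / (2.0 * h)

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(f"t = {t}: parallel = {parallel_component(t):.4f}, "
              f"d(speed)/dt = {rate_of_change_of_speed(t):.4f}, "
              f"perpendicular = {perpendicular_component(t):.4f}")
```

Under these assumptions the parallel component should agree with the rate of change of the speed, while the perpendicular component should grow as $t$ increases.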

EXERCISES

1. Find the displacement vector

(a) from A to B in the diagram

(b) from A to C in the diagram

(c) from B to A in the diagram

(d) from C to B in the diagram

[Figure: points A, B, and C plotted on x- and y-axes running from 0 to 5]

2. Find all vectors $v$ having $\|v\| = 7$ and first component $-3$. Illustrate with a diagram.

3. Write each of the pictured vectors a, b, and c in component form.

[Figure: vectors a, b, and c drawn on x- and y-axes]

4. For these vectors, $a = [2, 1]$, $b = [-1, 6]$, $c = [4, -7]$, perform each of the indicated operations and illustrate the operation with a diagram:

(a) $a + b$ and $a - b$

(b) $2a - c$

(c) $\|a\|$ and $\|2a\|$ and $\|-2a\|$

(d) $\|2b + c\|$

5.

(a) Write $v$ in component form if $\|v\| = 8$ and the direction of $v$ is as indicated in the diagram, where the scales on the two axes are the same:

[Figure: the direction of v, drawn on x- and y-axes with equal scales]

(b) Find a unit vector pointing in the same direction as $v$.

6.

(a) Show that the track of the parametric equations $x = 3\cos t$, $y = 4\sin t$ is the ellipse $\frac{x^2}{9} + \frac{y^2}{16} = 1$.

(b) Sketch the track, indicating the direction of travel corresponding to the given equations. What range of values of $t$ corresponds to one complete circuit of the ellipse?

(c) When and where is the speed of this motion the greatest? The least? (At what time(s) and also at what point(s)?)

(d) Where on the ellipse is the slope of the tangent line equal to $-1$? (Use the velocity vector. Can you think of another way to do this?)

7. Consider the ellipse centered at the origin and crossing the x-axis at $x = \pm 4$ and the y-axis at $y = \pm 3$.

(a) Write a set of parametric equations for which the motion should trace out the ellipse starting from (0, 3) at $t = 0$, and moving counterclockwise to complete one complete circuit at $t = 2$.

(b) What is the maximum speed of a particle moving according to the parametric equations in (a)? At what point or points on the ellipse does the maximum occur?

(c) Compute the velocity and acceleration vectors when $t = 0$, and when $t = 1/3$. You should find that the first pair is perpendicular, but the second pair is not. In terms of the discussion above, how can you relate how the speed is changing at $t = 0$ to the direction of the acceleration vector there? How can you relate how the speed is changing at $t = \pi/2$ to the direction of the acceleration vector there?

8. Consider a particle moving according to the parametric equations $x = \cos^2 t$, $y = \cos t \sin t$.

(a) Find the velocity and speed of the particle at time $t$. Show that the speed is constant.

(b) Sketch the track of the motion. Find an equation in $x$ and $y$ for the track.

(c) Find the acceleration. Show that if the tail of the acceleration vector at time $t$ is placed at the point $(x(t), y(t))$, then the acceleration vector points directly toward a fixed point at all times. What is this point?

(d) Rewrite the original parametric equations to make all the answers to the previous parts much more obvious. (This is a trig identity question.)

9. A waterskier gets a ride behind a jet boat, starting from the point (0, 1). Her position (in meters) $t$ seconds later is $(te^{-t}, e^t)$.

(a) What is her initial velocity? Initial speed (in meters per second)?

(b) After a short time she becomes alarmed about the boat’s acceleration and lets go of the tow rope. If she then coasts directly north (that is, parallel to the positive y-axis), when did she let go? What was her speed then?

(c) If she had held on, what would her speed have been after 5 seconds?

10. Recall the parametric equations describing the motion of Colin the cockroach from section 1.10.

(a) Since Colin’s speed along the beam is a constant 5 feet per minute, do you expect that his acceleration is zero?

(b) At what speed is Colin’s distance from the origin changing? What is Colin’s speed through the air? Does either (or both) change over time?

(c) How do the two speeds from part (b) compare near $t = 0$? How do they compare when Colin is near the center?

(d) Compute Colin’s acceleration. Explain how what you get is consistent with your answers about Colin’s speed.

11. A point moves in the plane according to the parametric equations $x = t^{1/3}$, $y = t$ for $-1 \le t \le 1$.

(a) Eliminate $t$ to find the equation of the track of the point in the form $y = f(x)$.

(b) Find the speed of the point at time $t$. When (and where on the track) is it a minimum? When (and where) is it a maximum?

(c) What is the acceleration vector here? Discuss how it relates to the behavior of the point.

12. The position of a particle at time t is given by $x = a\cos^3 t$, $y = a\sin^3 t$, where a > 0, for 0 ≤ t ≤ 2π.

(a) Sketch the track of the particle, indicating the direction of motion and the particle’s location when t is a multiple of π/2. This track is called an astroid, or a hypocycloid of four cusps.

(b) Relate the track to the graphs of $\cos^3 t$ and $\sin^3 t$.

(c) Explain why the track of the particle has the equation $x^{2/3} + y^{2/3} = a^{2/3}$.

(d) Find the velocity and speed of the particle. What is the maximum speed and where does it occur? What is the minimum speed and where does it occur?

13. Consider the parametric equations $x = 3t^2$, $y = t^3 - 3t$.

(a) Sketch the track of a point moving according to these equations, indicating the direction of motion.

(b) There is a loop in the track. At what values of t does the particle start and finish the loop?

(c) What is the minimum speed, and where does it occur? What is the maximum speed and where does it occur?

14. The track of the parametric equations x = t − sin t, y = 1 − cos t is called a cycloid.

(a) Sketch the graph of the cycloid for 0 ≤ t ≤ 4π.

(b) For which value(s) of t is the speed of motion along the cycloid the greatest? The least?

(c) For which value(s) of t is the velocity vector parallel to the x-axis? To the y-axis?

(d) Sketch the velocity and acceleration vectors at times t = π/2, t = π, t = 3π/2, and t = 2π.


(e) A bicycle wheel of unit radius sits on the x-axis with its center initially at (0, 1). A piece of chewing gum is stuck to the tire, initially at (0, 0). The tire moves to the right along the x-axis. Show that the track of the chewing gum is given by x = t − sin t, y = 1 − cos t. What does t represent? (Not time!)

(f) Suppose that the radius of the bicycle wheel is two units. What would the parametric equations for the track of the chewing gum be now? What is the smallest positive value of t for which the chewing gum touches the x-axis?

15. Two ants, Anita and Annabel, are sitting on the top of the minute hand of the clock in our classroom a few seconds before noon. As the second hand passes 12 at noon, it carries Annabel off with it.

(a) Write sets of parametric equations for the motion of Anita (on the minute hand) and Annabel (on the second hand) using time in minutes as the parameter and the fact that both hands are exactly 10 inches long.

(b) Write a displacement vector showing Annabel’s position as viewed from Anita’s position. Thinking of this as a set of parametric equations, sketch the track for the time it takes Annabel to make one complete revolution and catch up to Anita’s new position. How long does this take?

(c) When is Annabel farthest from Anita? What is the angle between the hands at that time? (Think!)

(d) (This is extra; the computations are a little messy, though of course you could do it graphically.) When is the distance between Annabel and Anita changing the most rapidly? What is the angle between the hands then?

16. If an object moves through the air subject only to the force of gravity, its position as a function of time is given by parametric equations of the form

$$x = A + Bt, \qquad y = C + Dt + Et^2.$$

(a) What is the initial position (position at t = 0) of the object in terms of A, B, ..., E?

(b) What is the initial velocity of the object? The initial speed of the object?

(c) Near the surface of the earth, if no other force besides gravity is considered, an object will have a constant acceleration of g = 9.8 m/sec$^2$ directed straight down (in the negative y direction). What does this tell you about one or more of the constants A, ..., E?


(d) An object is projected from the point $(x_0, y_0)$ with velocity $[v_x, v_y]$. Rewrite the parametric equations in terms of these constants. At what time will the object hit the (level) ground?

(e) The object starts from ground level with initial speed s m/sec and at an angle θ with the horizontal. What angle θ will maximize the horizontal distance the object travels before hitting the ground?

(f) Suppose, in the situation of the previous part, the initial speed is doubled. What will be the effect on the horizontal distance traveled? Explain why the result you get is reasonable.

3.8 Components of Vectors—The Dot Product

Given any vector v in $\mathbb{R}^2$, say $v = [v_1, v_2]$, it is easy to write it as a sum of vectors parallel to the coordinate axes; we just use the individual components:

$$v = v_1 [1, 0] + v_2 [0, 1] = [v_1, 0] + [0, v_2].$$

Often, however, we wish to write v as a sum of vectors parallel to some other set of mutually orthogonal lines. In our current situation we want to learn more about motion along a parametrized curve by writing the acceleration vector as a sum of a vector parallel to the velocity vector and a vector perpendicular to the velocity vector, and then determining the effect of each component of acceleration.

As a first example, let’s write v = [1, 3] as the sum of a vector parallel to [1, 1], that is, a vector along the line y = x, and a vector parallel to [−1, 1], that is, a vector along the line y = −x.

[Figure: axes x and y, showing the vectors labeled v, w, and p.]


Geometrically, it is easy to see what to do. We need to project v onto each of the lines. That is, we drop a perpendicular from the head of v (the point (1, 3)) to each of the lines. From the diagram it appears that the perpendicular dropped to y = x intersects y = x at the point (2, 2), and that the perpendicular dropped to y = −x intersects y = −x at the point (−1, 1). And in fact, we can write

$$[1, 3] = [2, 2] + [-1, 1].$$

To adopt some terminology, the orthogonal projection of a vector v onto a vector w (or equivalently, onto the line through the origin defined by w) is the vector parallel to w whose length is found by drawing a line perpendicular to w from the head of v until it intersects the line determined by w. Thus, in the example just above, the orthogonal projection of v = [1, 3] onto $w_1 = [1, 1]$ is [2, 2], and the orthogonal projection of v onto $w_2 = [-1, 1]$ is [−1, 1]. Usually it is not so easy to see what the projections are, so we need a way to do the calculations routinely. It turns out that the key is a new operation on vectors called the dot product. The dot product of two vectors will be a number.

As the first stage, here is the definition of the dot product of a vector v with a unit vector u. (Recall that a unit vector is just a vector of length one.)

Definition: The dot product of a vector v and a unit vector u is the real number v · u equal to the signed length of the orthogonal projection vector p of v onto u.

We can use trigonometry to compute one version of what the length of the projection will be. Looking at the right triangle in the diagram above, the length of the projection p = [2, 2] of v = [1, 3] onto a unit vector along y = x is $\|v\| \cos\theta = \sqrt{10}\,\cos\theta$, where θ is the angle between the line determined by the vector [1, 3] and the line y = x. The most natural vector to pick along y = x is [1, 1], whose length is $\sqrt{2}$. A unit vector u pointing in the same direction along y = x is

$$u = \frac{[1, 1]}{\|[1, 1]\|} = \frac{[1, 1]}{\sqrt{2}} = \left[ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right].$$

Since the length of p = [2, 2] is $\sqrt{2^2 + 2^2} = 2\sqrt{2}$, we have that

$$[1, 3] \cdot \left[ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right] = 2\sqrt{2}$$

and then that

$$p = [2, 2] = 2\sqrt{2} \left[ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right],$$

that is, in more general terms,

$$p = (v \cdot u)\, u. \tag{3.1}$$

We have written p as the product of a unit vector u pointing along the same line as p times the length (v · u) of p.


In this case we worked backwards from our knowledge of the projection to see what the dot product must be. Usually the projection is not so obvious. So what we would like is some easy way to calculate the dot product, or equivalently the cosine of the angle between the vectors, as a method for finding the projection.

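There is another unit vector along the line y = x, the vector $\left[ -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right]$, pointing in the opposite direction along the line. In this case we have

$$p = [2, 2] = -2\sqrt{2} \left[ -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right],$$

that is, now

$$[1, 3] \cdot \left[ -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right] = -2\sqrt{2}.$$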

This is the meaning of the phrase signed length in the definition above. The length is taken to be positive if the angle between the vectors is less than a right angle, and negative if it is more than a right angle. With this understanding the trig version

$$v \cdot u = \|v\| \cos\theta \tag{3.2}$$

is valid in all cases, since cos θ > 0 if θ is less than a right angle and cos θ < 0 if θ is more than a right angle. To sum up, the projection vector p depends only on v and on the line through the origin determined by the unit vector u. However, the sign of the dot product depends on which of the two possible unit vectors on that line we choose as u.

To illustrate the possibilities in a very simple situation, here are three pictures where u is the unit vector along the positive x-axis, u = [1, 0], and v is successively [2, 1], [0, 1], and [−2, 1].

[Figure (3.3): three panels with axes x and y, titled [2, 1] · [1, 0] = 2, [0, 1] · [1, 0] = 0, and [−2, 1] · [1, 0] = −2.]

The middle diagram illustrates the very important fact

v and u are orthogonal (perpendicular) if and only if v · u =0.


This follows from equation (3.2) since cos θ = 0 exactly when θ is a right angle.

To make the dot product useful, it must be defined between any two vectors. I will extend the definition using linearity, that is, so that if w and u point in the same direction but w is twice as long as u, then v · w is twice v · u. More formally, for any nonzero vector w, we can get a unit vector u pointing in the same direction by dividing w by its length: $u = \dfrac{w}{\|w\|}$. Now

Definition (Part 2): The dot product of a vector v and a nonzero vector w is $v \cdot w = \|w\| \left( v \cdot \dfrac{w}{\|w\|} \right)$. The dot product of v with the zero vector 0 is 0 (the number).

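Returning to the original example above, for w = [1, 1], with $\|w\| = \sqrt{2}$, the unit vector in the same direction is $u = \dfrac{[1, 1]}{\sqrt{2}} = \left[ \dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}} \right]$. We have for v = [1, 3]

$$[1, 3] \cdot [1, 1] = \|w\| \left( v \cdot \frac{w}{\|w\|} \right) = \sqrt{2} \left( [1, 3] \cdot \left[ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right] \right) = \sqrt{2} \left( 2\sqrt{2} \right) = 4.$$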

The trig function version of the definition follows from the version (3.2) for unit vectors and linearity:

$$v \cdot w = \|w\| \left( v \cdot \frac{w}{\|w\|} \right) = \|w\| \left( \|v\| \cos\theta \right) = \|v\| \, \|w\| \cos\theta$$

where we may think of θ as the angle between v and w since w and the unit vector made from it point in the same direction.

The definition may look complicated, but we will now find a very easy way to compute the dot product; that is a large part of why it is so useful. To start with, taking the dot product with the unit vector along the x or y axis is very easy. As the diagrams (3.3) above show, the dot product of $v = [v_1, v_2]$ with [1, 0] is just the first component of v: $[v_1, v_2] \cdot [1, 0] = v_1$. Similarly, the dot product of $[v_1, v_2]$ with [0, 1] is just the second component of v: $[v_1, v_2] \cdot [0, 1] = v_2$. Here are some more diagrams to illustrate this.


[Figure: two panels with axes x and y. First panel: [1, 2] · [1, 0] = 1, [2, −1] · [1, 0] = 2. Second panel: [1, 2] · [0, 1] = 2, [2, −1] · [0, 1] = −1.]

In the diagram just above we see that [1, 2] · [1, 0] = 1, [2, −1] · [1, 0] = 2, [1, 2] · [0, 1] = 2, and [2, −1] · [0, 1] = −1, where the minus sign in the last dot product comes from the fact that the angle between [2, −1] and [0, 1] is more than a right angle.

Next, consider the dot product with an arbitrary nonzero vector [a, 0] along the x-axis. If a > 0, then [a, 0] = a [1, 0] and, by linearity,

$$[v_1, v_2] \cdot [a, 0] = a \left( [v_1, v_2] \cdot [1, 0] \right) = a v_1.$$

If a < 0, then a slightly different calculation produces the same result. Now

$$[a, 0] = \|[a, 0]\| \, \frac{[a, 0]}{\|[a, 0]\|} = -a \, [-1, 0]$$

and

$$[v_1, v_2] \cdot [a, 0] = -a \left( [v_1, v_2] \cdot [-1, 0] \right) = -a \, (-v_1) = a v_1.$$

Similarly, taking the dot product of $[v_1, v_2]$ with an arbitrary nonzero vector along the y-axis,

$$[v_1, v_2] \cdot [0, b] = b v_2.$$

Now it is easy to compute the dot product of $v = [v_1, v_2]$ with an arbitrary vector $w = [w_1, w_2]$. We just write w in terms of its horizontal and vertical components, $w = [w_1, 0] + [0, w_2]$, and expand:

$$\begin{aligned} [v_1, v_2] \cdot [w_1, w_2] &= [v_1, v_2] \cdot \left( [w_1, 0] + [0, w_2] \right) \\ &= [v_1, v_2] \cdot [w_1, 0] + [v_1, v_2] \cdot [0, w_2] \\ &= v_1 w_1 + v_2 w_2. \end{aligned}$$

Getting here has been a little longwinded, but now that we have this formula for the dot product, it is very easy to use. For instance, to return to the example [1, 3] · [1, 1] from above,

[1, 3] · [1, 1] = 1 · 1 + 3 · 1 = 4.
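As a quick check of the two versions of the dot product, here is a minimal Python sketch. The function names `dot`, `norm`, and `dot_trig` are my own choices for illustration, and the angle is found from the direction angles of the two vectors rather than from the algebraic formula.

```python
import math

def dot(v, w):
    """Algebraic dot product of plane vectors: v1*w1 + v2*w2."""
    return v[0] * w[0] + v[1] * w[1]

def norm(v):
    """Length of a plane vector."""
    return math.hypot(v[0], v[1])

def dot_trig(v, w):
    """Geometric version ||v|| ||w|| cos(theta).

    theta is the difference of the direction angles of v and w,
    each measured from the positive x-axis.
    """
    theta = math.atan2(v[1], v[0]) - math.atan2(w[1], w[0])
    return norm(v) * norm(w) * math.cos(theta)

print(dot([1, 3], [1, 1]))       # algebraic formula
print(dot_trig([1, 3], [1, 1]))  # trig formula; agrees up to rounding
```

The two printed values should agree up to floating-point rounding, which is the content of the formula $\|v\| \, \|w\| \cos\theta = v_1 w_1 + v_2 w_2$.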


Finally, here is how to compute the projection p of v on an arbitrary nonzero vector w (or on an arbitrary line through the origin) without first constructing a unit vector. If $u = \dfrac{w}{\|w\|}$, then

$$p = (v \cdot u)\, u = \left( v \cdot \frac{w}{\|w\|} \right) \frac{w}{\|w\|} = \frac{v \cdot w}{\|w\|^2} \, w. \tag{3.4}$$

This may look complicated, but it is easy to use. For instance, the projection of [1, 3] on [1, 1], or equivalently on the line y = x, is

$$\frac{[1, 3] \cdot [1, 1]}{\|[1, 1]\|^2} \, [1, 1] = \frac{4}{2} \, [1, 1] = [2, 2].$$

And you don’t really have to remember the formula for p in terms of v and w, since you can remember the easy form using a unit vector and then just repeat the little calculation in (3.4).
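The projection formula is short enough to try out directly. The following Python sketch is only an illustration; the helper names `dot` and `project` are assumptions of mine, not notation from these notes.

```python
def dot(v, w):
    """Algebraic dot product of plane vectors."""
    return v[0] * w[0] + v[1] * w[1]

def project(v, w):
    """Orthogonal projection of v onto a nonzero vector w: (v.w / ||w||^2) w."""
    ww = dot(w, w)                      # ||w||^2
    if ww == 0:
        raise ValueError("cannot project onto the zero vector")
    c = dot(v, w) / ww
    return [c * w[0], c * w[1]]

v = [1, 3]
p1 = project(v, [1, 1])      # projection on the line y = x
p2 = project(v, [-1, 1])     # projection on the line y = -x
print(p1, p2)                # [2.0, 2.0] [-1.0, 1.0]
```

Projecting onto [−1, −1] instead of [1, 1] gives the same vector [2, 2], which matches the earlier observation that the projection depends only on the line, while the sign of the dot product depends on the choice of direction.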

Remark: We can express the Pythagorean Theorem in this vector context using the dot product. It takes the form that if v and w are any vectors with v ⊥ w, then $\|v + w\|^2 = \|v\|^2 + \|w\|^2$. This is clear pictorially, since $\|v + w\|$, $\|v\|$, and $\|w\|$ are the lengths of the hypotenuse and the two legs of a right triangle.

[Figure: diagram for the remark above; axes x and y.]

It can also be derived from our work with dot products, since

$$\begin{aligned} \|v + w\|^2 &= (v + w) \cdot (v + w) \\ &= v \cdot v + v \cdot w + w \cdot v + w \cdot w \\ &= \|v\|^2 + 0 + 0 + \|w\|^2. \end{aligned}$$

Here I have used the fact that for any vector v,

$$v \cdot v = v_1 v_1 + v_2 v_2 = v_1^2 + v_2^2 = \|v\|^2.$$


Summary of Formulas:

(a) If u is a unit vector, v · u is the signed length of the projection of v onto the line through the origin defined by u and

$$v \cdot u = \|v\| \cos\theta = v_1 u_1 + v_2 u_2.$$

(b) If w is any nonzero vector, it is harder to give a simple geometric interpretation of v · w, but

$$v \cdot w = \|v\| \, \|w\| \cos\theta = v_1 w_1 + v_2 w_2.$$

(c) If v = w, we get

$$v \cdot v = v_1^2 + v_2^2 = \|v\|^2$$

so the dot product contains information about lengths of vectors as well as angles between vectors.

(d) The projection p of v on an arbitrary nonzero vector w is

$$p = \frac{v \cdot w}{\|w\|^2} \, w.$$

In the special case where w is a unit vector u, this simplifies to

$$p = (v \cdot u)\, u,$$

but in practice it is easier to compute using the more general formula than to convert a vector w into a unit vector u and then use the simpler formula.

Final Remark. For any positive integer n, we can define $\mathbb{R}^n$ to be the set of ordered n-tuples $(v_1, v_2, \ldots, v_n)$. For any two vectors in $\mathbb{R}^n$, $u = [u_1, u_2, \ldots, u_n]$ and $v = [v_1, v_2, \ldots, v_n]$, it makes sense to write

$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

This is normally taken as the definition of the dot product for larger values of n. Note that one could then turn around and define lengths by

$$\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$$

and then define angles (and in particular the notion of orthogonality) using the dot product: the angle θ between u and v is defined by

$$\cos\theta = \frac{u \cdot v}{\|u\| \, \|v\|}.$$


Thus, from the points of view of computation and generality, it is the algebraic form of vectors and the dot product that is fundamental; the importance of the geometric form is that it guides our intuition about what to do with vectors.
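To illustrate the algebraic point of view, here is a small Python sketch of the dot product, length, and angle for vectors with any number of components. The names `dot`, `length`, and `angle`, and the sample vectors in four dimensions, are my own and are meant only as an example.

```python
import math

def dot(u, v):
    """u1*v1 + u2*v2 + ... + un*vn for vectors of equal size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

def length(u):
    """Length defined through the dot product."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle in radians between nonzero vectors u and v."""
    c = dot(u, v) / (length(u) * length(v))
    # clamp to [-1, 1] to guard against rounding just outside the range
    return math.acos(max(-1.0, min(1.0, c)))

u = [1, 0, 0, 0]
v = [1, 1, 1, 1]
print(dot(u, v), length(v), angle(u, v))   # angle is pi/3 up to rounding
```

Nothing in the code refers to a picture; orthogonality is simply the condition that `dot(u, v)` is zero.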

EXERCISES.

1. Compute v · w for the indicated vectors v and w in two ways: (a) using the algebraic formula, and (b) using the geometric/trig formula. For (b) explain carefully how you are finding θ (without using the result of part (a)!) and illustrate with a diagram.

(a) v = [1, 1] and w = [0, 2]

(b) v = [1, 1] and w = [0, −2]

(c) v = [−1, −1] and w = [0, −2]

(d) $v = [1, \sqrt{3}]$ and w = [2, 0]

(e) $v = [\sqrt{3}, -1]$ and w = [1, 1]

(f) v = [1, 2] and w = [0, 0]

(g) v = [1, 2] and w = [2, 1] (Note $\theta = \theta_1 - \theta_2$ where $\theta_1$ and $\theta_2$ are angles measured from the positive x-axis.)

2. Let v = [1, 2] .

(a) Find all unit vectors that are orthogonal to v.

(b) Make a diagram of the plane showing v, the unit vectors orthogonal to v, the unit vectors u such that v · u > 0, and the unit vectors u such that v · u < 0.

3. Show, using Euclidean geometry, that if $u_1$ and $u_2$ are unit vectors, then the projection of $u_1$ on $u_2$ and the projection of $u_2$ on $u_1$ have the same length. Consider both the case when the angle between the vectors is less than a right angle, and the case where it is greater than a right angle.

4. (a) Find the projection of [4, 1] on [1, 2].

(b) Find a vector orthogonal to [1, 2] and the projection of [4, 1] on that vector.

(c) Verify that the two projections are orthogonal and that their sum is [4, 1]. Illustrate with a diagram.

(d) What is the angle between [4, 1] and [1, 2]? (In both radians and degrees.)

5. Repeat #1 for [4, 1] and [−3, 1] .


6. (a) Find the projection of [2, −3] on [1, 3].

(b) Write [2, −3] as the sum of this projection and a vector orthogonal to it. Illustrate with a diagram.

(c) Find the projection of [2, −3] on [2, 6] and on [−2, −6]. What is the relationship between all of these projections?

7. (a) For the parametric equations $x = t^2$, $y = t$ compute the velocity vector v and the acceleration vector a.

(b) Now determine the speed and the rate of change of the speed (just differentiate).

(c) Finally, determine the projection of a on v and the magnitude of that projection. Discuss the relationship to part (b).

8. For the parametric ellipse given by

x = 4 cos t

y = 3 sin t

(a) Compute the velocity v and acceleration a.

(b) Where is the speed increasing? Decreasing? (Suggestion: Use guile instead of calculus. Show that if A > B, then $A\sin^2 t + B\cos^2 t = B + (A - B)\sin^2 t$ and decide where this function is increasing and decreasing, e.g. by graphing it.)

(c) By computing the dot product of these vectors, determine for which values of t the angle between v and a is less than π/2, equal to π/2, and greater than π/2. Illustrate with a diagram. Discuss the relationship between the answers to parts (b) and (c).

(d) Find the projection of a on v when t = π/4. Write a as the sum of this projection and a vector orthogonal to v. Illustrate with a diagram.

9. Let r (t) be the distance from the origin at time t along the parametrized curve x = f (t), y = g (t), so that $r(t) = \sqrt{f^2(t) + g^2(t)}$.

(a) Show by differentiating and relating $\dfrac{dr}{dt}$ to a dot product that $\dfrac{dr}{dt} = \|v\| \cos\theta$, where v (t) is the velocity vector along the curve at time t, and θ is the angle between v and the line joining the origin to the point (f (t), g (t)).

(b) Make a sketch showing the parametrized curve through the point (f (t), g (t)), the velocity vector v and the angle θ. Identify a length that equals $\|v\| \cos\theta$ and explain why the diagram shows that the change ∆r in r from time t to time t + ∆t is approximately $(\|v\| \cos\theta)\,\Delta t$.

10. For another method of finding the algebraic formula for u · v, consider the triangle with sides u and v, with the tails of these vectors placed at the origin, and calculate the length of the third side in two different ways. On the one hand, the Law of Cosines tells us that the square of its length is

$$\|u\|^2 + \|v\|^2 - 2 \, \|u\| \, \|v\| \cos\theta.$$

[Figure: triangle with sides labeled u, v, and u − v, and the angle θ.]

On the other hand, since the points at the ends of the side have coordinates $(u_1, u_2)$ and $(v_1, v_2)$, the distance formula tells us that the square of the length is

$$(u_1 - v_1)^2 + (u_2 - v_2)^2.$$

Equate the two forms of the square of the distance and do some algebra to end up with

$$\|u\| \, \|v\| \cos\theta = u_1 v_1 + u_2 v_2.$$

3.9 Curvature

Before we study the components of acceleration, it is helpful to discuss the curvature of a plane curve. Roughly, the curvature at a point should be a number that expresses how quickly the curve is changing direction at the point. Thus the curvature of a straight line should be zero. The curvature of a circle should be the same at all points, but the curvature of an ellipse or a parabola should not. Furthermore curvature, as something related to the shape of the track of a curve, should depend only on the track, and not on the way that a particle is moving along it.

With this in mind, we define the curvature, κ (the Greek letter kappa), of a plane curve F (t) = (x (t), y (t)) to be $\left| \dfrac{d\varphi}{d\ell} \right|$ where φ is the angle between the forward tangent vector and the x-axis and $\ell$ is arclength along the curve. In words, the curvature at a point is the magnitude of the rate at which the angle of the tangent vector is changing per unit change in distance along the curve.


We need the absolute value signs to get the same number for both possible directions of increasing distance along the curve. (If φ is increasing for one direction of motion, φ will be decreasing in the opposite direction. For instance φ increases if you move counterclockwise around a circle, but decreases if you move clockwise.) An explicit expression for $\ell$ as a function of t is not very clear at this point, but $\dfrac{d\ell}{dt}$ should be just the rate of motion along the curve, that is, the speed $s = \sqrt{x'^2 + y'^2}$.

[Figure: a curve with axes x and y, showing the tangent vector and the angle φ.]

Examples: The curvature of a line is 0. The curvature of a circle of radius r is

$$\frac{\text{total change of angle}}{\text{total distance around circle}} = \frac{2\pi}{2\pi r} = \frac{1}{r}$$

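so the curvature is the same at all points and decreases as the radius increases.

To get a formula for the curvature based on parametric equations F (t) = (x (t), y (t)), note first that since the velocity vector is tangent to the curve, the components $x'$ and $y'$ of the velocity satisfy $\tan\varphi = \dfrac{y'}{x'}$ and then $\cos\varphi = \dfrac{x'}{s}$ where $s = \sqrt{x'^2 + y'^2}$. Also we have, using the chain rule,

$$\frac{d\varphi}{dt} = \frac{d\ell}{dt} \frac{d\varphi}{d\ell} = s \, \frac{d\varphi}{d\ell}$$

so that the curvature is $\left| \dfrac{d\varphi}{d\ell} \right| = \dfrac{1}{s} \left| \dfrac{d\varphi}{dt} \right|$. Now

$$\frac{d}{dt} \tan\varphi = \frac{d\varphi}{dt} \sec^2\varphi = \frac{d\varphi}{dt} \, \frac{s^2}{x'^2} \tag{3.5}$$

since $\sec\varphi = \dfrac{1}{\cos\varphi} = \dfrac{s}{x'}$ and also

$$\frac{d}{dt} \tan\varphi = \frac{d}{dt} \left( \frac{y'}{x'} \right) = \frac{y'' x' - y' x''}{x'^2}. \tag{3.6}$$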


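so, substituting the expression for $\dfrac{d}{dt} \tan\varphi$ from (3.6) into (3.5) and solving for $\dfrac{d\varphi}{dt}$,

$$\frac{d\varphi}{dt} = \frac{x'^2}{s^2} \, \frac{d}{dt} \tan\varphi = \frac{x'^2}{s^2} \cdot \frac{y'' x' - y' x''}{x'^2} = \frac{y'' x' - y' x''}{s^2}.$$

Thus we get this expression for the curvature in terms of the parameterization (x (t), y (t)):

$$\kappa = \left| \frac{d\varphi}{d\ell} \right| = \frac{1}{s} \left| \frac{d\varphi}{dt} \right| = \frac{|y'' x' - y' x''|}{s^3}.$$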

Example: Parametrizing $y = cx^2$ as $x = t$, $y = ct^2$, the curvature at (0, 0) is

$$\left. \frac{2|c|}{(1 + 4c^2 t^2)^{3/2}} \right|_{t=0} = 2|c|.$$

More generally, the curvature of this parabola at $(a, ca^2)$ is $\dfrac{2|c|}{(1 + 4c^2 a^2)^{3/2}}$.

Remark: If the track of the curve is the graph of a function y = f (x), then parametrizing the curve as x = t, y = f (t) shows that the curvature is 0 at all points where $f'' = 0$ and nowhere else. (See exercise 6 below.) In particular, the curvature is zero at any point of inflection of the curve.
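The curvature formula $|y'' x' - y' x''| / s^3$ is easy to evaluate numerically once the derivatives are known. Here is a minimal Python sketch under that assumption; the function names are mine, and the derivatives of the circle and the parabola are entered by hand rather than computed by the program.

```python
import math

def curvature(xp, yp, xpp, ypp):
    """kappa = |y'' x' - y' x''| / s^3, where s is the speed."""
    s = math.hypot(xp, yp)
    if s == 0:
        raise ValueError("curvature is undefined where the speed is zero")
    return abs(ypp * xp - yp * xpp) / s**3

def circle_curvature(R, t):
    """Curvature of (R cos t, R sin t) at parameter value t."""
    return curvature(-R * math.sin(t), R * math.cos(t),
                     -R * math.cos(t), -R * math.sin(t))

def parabola_curvature(c, a):
    """Curvature of x = t, y = c t^2 at t = a."""
    return curvature(1.0, 2 * c * a, 0.0, 2 * c)

print(circle_curvature(2.0, 0.7))    # close to 1/2
print(parabola_curvature(3.0, 0.0))  # 6.0
```

The circle value does not depend on t, in line with the example of the circle above, while the parabola value shrinks as a moves away from the vertex.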

EXERCISES

Find the curvature of the plane curve at the indicated point.

1. The circle of radius R with the usual parameterization (R cos t, R sin t) at any point.

2. The unit circle with the parameterization $(\cos t^2, \sin t^2)$ at any point.

3. For the parametrized curve F (t) = (2 cos t, sin t)

(a) Sketch the track and find an equation for it in x and y.

(b) Find the curvature. Where is it a maximum? A minimum? Does this make sense on your sketch?

4. Repeat the previous problem for the curve F (t) = (2 sin t, cos t). Do you get the same answer for the curvature? Explain.

5. The parabola $y = x^2 + ax$ at the origin. (Convert to parametric equations in the easiest way.)

6. Parametrize the graph y = f (x) with x = t, y = f (t) to get an expression for the curvature at the point (a, f (a)) in terms of the derivatives of f at a.


7. Use the result of the preceding exercise to explain the remark just preceding these exercises.

8. For the parametric curve F (t) = (sin t− t cos t, cos t+ t sin t)

(a) Sketch the curve in the plane for −2π ≤ t ≤ 2π.

(b) Compute the velocity and acceleration, and show that the speed at time t is |t|. What happens on the curve when the speed is 0?

(c) Show that the curvature is $\dfrac{1}{|t|}$. (There is some manipulation to do there!) Why (in terms of the picture) is there a difficulty with the curvature at t = 0?

3.10 Components of Acceleration

Now we are ready to analyze the acceleration vector of a curve F (t) = (x (t), y (t)). We have that the velocity vector is

$$F'(t) = [x', y'] = s \left[ \frac{x'}{s}, \frac{y'}{s} \right] = sT$$

where T (t) is the unit tangent vector: $T = \left[ \dfrac{x'}{s}, \dfrac{y'}{s} \right]$. To check that T is indeed a unit vector, note

$$\|T\|^2 = \left( \frac{x'}{s} \right)^2 + \left( \frac{y'}{s} \right)^2 = \frac{x'^2 + y'^2}{s^2} = \frac{s^2}{s^2} = 1.$$

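Thus for acceleration we get

$$F''(t) = \frac{d}{dt} F'(t) = \frac{d}{dt} (sT) = s'T + sT'.$$

In order for this formula to be useful, we must study $T'$. We have

$$\frac{ds}{dt} = \frac{d}{dt} \left( x'^2 + y'^2 \right)^{1/2} = \frac{x' x'' + y' y''}{s}.$$

Also,

$$\frac{d}{dt} \left( \frac{x'}{s} \right) = \frac{x'' s - x' s'}{s^2} = \frac{x'' (x'^2 + y'^2) - x' (x' x'' + y' y'')}{s^3} = \frac{y' (x'' y' - y'' x')}{s^3}$$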


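and similarly,

$$\frac{d}{dt} \left( \frac{y'}{s} \right) = \frac{x' (y'' x' - x'' y')}{s^3}.$$

It follows that

$$T'(t) = \left[ \frac{d}{dt} \left( \frac{x'}{s} \right), \frac{d}{dt} \left( \frac{y'}{s} \right) \right] = \frac{y'' x' - y' x''}{s^3} \, [-y', x'] = \frac{d\varphi}{d\ell} \, [-y', x'].$$

Using the dot product, T and $T'$ are orthogonal since

$$T \cdot T' = \frac{1}{s} \, \frac{d\varphi}{d\ell} \, [x', y'] \cdot [-y', x'] = 0.$$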

Writing $\kappa = \left| \dfrac{d\varphi}{d\ell} \right|$ for the curvature, we get

$$F''(t) = s'T + sT' = s'T + s^2 \kappa N$$

where N is the unit vector perpendicular to T given by $N = \left[ -\dfrac{y'}{s}, \dfrac{x'}{s} \right]$ if $\dfrac{d\varphi}{d\ell} > 0$ and $N = \left[ \dfrac{y'}{s}, -\dfrac{x'}{s} \right]$ if $\dfrac{d\varphi}{d\ell} < 0$. Since $s^2 \kappa > 0$, we see that N points in the direction that the curve is turning toward.

Thus we have an expression for the acceleration that shows that the magnitude of the component of acceleration tangent to the motion is the rate of change of the speed, and the magnitude of the component of acceleration perpendicular to the motion is the product of the square of the speed and the curvature.

Example: F (t) = (cos (kt), sin (kt)). Here we have motion on the unit circle, where κ = 1. The time to go once around the circle is $\dfrac{2\pi}{k}$ so the speed is constant at k. The acceleration vector is $k^2 [-\cos(kt), -\sin(kt)]$, which does have magnitude $k^2$ and points directly toward the origin, that is, perpendicular to the track. Thus here $F''(t) = s^2 \kappa N$.

More generally, if F (t) = (R cos (kt), R sin (kt)), that is, the circle now has radius R, then the speed is kR and the curvature is $\dfrac{1}{R}$. The acceleration vector is $k^2 R \, [-\cos(kt), -\sin(kt)]$ with magnitude $k^2 R = (kR)^2 \dfrac{1}{R} = s^2 \kappa$.

Example: $F(t) = (\cos t^2, \sin t^2)$. Here the curvature is again κ = 1. Then $F'(t) = [-2t \sin t^2, 2t \cos t^2]$ so the speed is $s = \sqrt{4t^2 \sin^2 t^2 + 4t^2 \cos^2 t^2} = 2t$ (for t ≥ 0) and the acceleration is

$$\begin{aligned} F''(t) &= [-2 \sin t^2, 2 \cos t^2] + [-4t^2 \cos t^2, -4t^2 \sin t^2] \\ &= 2 \, [-\sin t^2, \cos t^2] + (2t)^2 \, [-\cos t^2, -\sin t^2]. \end{aligned}$$

The first term is the tangential component of acceleration with $T = [-\sin t^2, \cos t^2]$, and the second term is the orthogonal component, with $N = [-\cos t^2, -\sin t^2]$.


Thus for small positive values of t, when the motion is very slow, most of the acceleration is involved with speeding up. As time goes on, more and more of the acceleration is concerned with changing direction.
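The split of acceleration into tangential and perpendicular parts can also be checked numerically with the projection formula from section 3.8. The Python sketch below is one way to do this for the example $F(t) = (\cos t^2, \sin t^2)$; the function names `decompose` and `example` are assumptions of mine, and the velocity and acceleration are entered by hand.

```python
import math

def decompose(v, a):
    """Split a into its projection on v and the remainder orthogonal to v."""
    vv = v[0] * v[0] + v[1] * v[1]                # ||v||^2, assumed nonzero
    c = (a[0] * v[0] + a[1] * v[1]) / vv          # (a . v) / ||v||^2
    tangential = [c * v[0], c * v[1]]
    normal = [a[0] - tangential[0], a[1] - tangential[1]]
    return tangential, normal

def example(t):
    """Magnitudes of the two components for F(t) = (cos t^2, sin t^2), t != 0."""
    u = t * t
    v = [-2 * t * math.sin(u), 2 * t * math.cos(u)]
    a = [-2 * math.sin(u) - 4 * u * math.cos(u),
         2 * math.cos(u) - 4 * u * math.sin(u)]
    tang, norm = decompose(v, a)
    return math.hypot(*tang), math.hypot(*norm)

for t in (0.1, 1.0, 3.0):
    print(t, example(t))   # magnitudes close to 2 and 4 t^2
```

For positive t the tangential magnitude stays near $s' = 2$ while the perpendicular magnitude grows like $s^2 \kappa = 4t^2$, which is the behavior described in the paragraph above.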

EXERCISES

1. For the parametrized curve F (t) = (2 cos t, sin t) from the previous exercise set, express the acceleration in terms of its tangential and normal components. Is either ever equal to zero?

2. For the equations $x = \cos^3 t$, $y = \sin^3 t$ (recall #7 in section 3.7) find the velocity, acceleration, and speed (remember the absolute value signs!) (you have already done these), compute the curvature κ using the formula from the previous section and some trig simplifications to get a relatively simple answer, find T and N (you will have to consider the different quadrants separately because of the absolute value signs in s), and decompose the acceleration into its tangential and perpendicular components.

3. For the parametrized curve F (t) = (sin t − t cos t, cos t + t sin t) (refer to exercise 11 in section 1.10) find all values of t, −10 ≤ t ≤ 10, where the tangential component of acceleration is zero. (Just find where v · a = 0.)

4. Use the version of the Pythagorean Theorem mentioned at the end of section 3.8 to show that the curvature at the point F (t) can be found from the magnitude of the acceleration vector and the speed s along the curve by

$$\kappa = \frac{\sqrt{\|F''(t)\|^2 - s'^2}}{s^2}.$$


4. APPLICATIONS OF DIFFERENTIATION

4.1 Introduction

The apparatus of computing derivatives symbolically has applications in an enormous range of directions. This is precisely why it makes sense to abstract the idea of the derivative as the limit of difference quotients from the contexts in which the idea of instantaneous rate of change first arose. In this short chapter we will examine briefly three kinds of applications: to graphing, to selecting the proper function from a family of functions in order to have some specific property or properties, and to determining maximum and minimum values.

4.2 Simple Applications of the Chain Rule

4.2.1 Implicit Differentiation

How can we find the slope of the tangent line at a point $(x_0, y_0)$ on the circle $x^2 + y^2 = 5$? Well, we could solve this equation for y, though the exact formula would depend on the sign of $y_0$. (To find the slope at (1, 2) we would write $y = \sqrt{5 - x^2}$ while at (1, −2) we would need to use $y = -\sqrt{5 - x^2}$.) It is easier just to imagine this process, that is, to imagine that y is some function of x and to proceed using the chain rule without bothering to write down explicitly which function of x it is. Then $y^2$ becomes a composition of two functions (y followed by the squaring function) and must be differentiated using the chain rule. We get, differentiating both sides of $x^2 + y^2 = 5$,

$$2x + 2y \frac{dy}{dx} = \frac{d}{dx} 5 = 0.$$

Now we can simply solve this equation for the unknown dy/dx to get

$$\frac{dy}{dx} = -\frac{x}{y}.$$

Note that this single formula works both at (1, 2) (where the derivative is −1/2) and at (1, −2) (where the derivative is 1/2).
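One way to gain confidence in the implicit formula is to compare it with a difference quotient computed from the explicit branches. The following Python sketch does that; the names `slope_implicit`, `slope_numeric`, `upper`, and `lower`, and the step size h, are my own choices for illustration.

```python
import math

def slope_implicit(x, y):
    """dy/dx = -x/y on the circle x^2 + y^2 = 5 (requires y != 0)."""
    return -x / y

def slope_numeric(f, x, h=1e-6):
    """Central difference quotient for an explicit branch y = f(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def upper(x):
    """Upper half of the circle."""
    return math.sqrt(5 - x * x)

def lower(x):
    """Lower half of the circle."""
    return -math.sqrt(5 - x * x)

print(slope_implicit(1, 2), slope_numeric(upper, 1))    # both near -1/2
print(slope_implicit(1, -2), slope_numeric(lower, 1))   # both near 1/2
```

The implicit formula needs no choice of branch; the point (x, y) itself carries that information.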


This technique works even when it is impossible to solve the implicit equation explicitly for y in terms of functions whose names we know. To find the tangent line at (1, π/2) to the curve y + x sin y = 1 + π/2, we differentiate implicitly, using the product rule on the second term to get

$$\frac{dy}{dx} + \sin y + x \cos y \, \frac{dy}{dx} = 0$$

or

$$\frac{dy}{dx} = \frac{-\sin y}{1 + x \cos y}.$$

Evaluating at (1, π/2), we find $\dfrac{dy}{dx} = -1$ there, so the tangent line is y − π/2 = −(x − 1) or

$$y = -x + 1 + \pi/2.$$

[Figure: graph of y + x sin y = 1 + π/2 with axes x and y, with the point (1, π/2) marked.]
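Even though this equation cannot be solved for y by formula, it can be solved numerically near the point (1, π/2), which gives an independent check on the slope. The sketch below uses Newton's method for this; that choice, the function names, and the step size h are assumptions of mine and not part of the method described in these notes.

```python
import math

C = 1 + math.pi / 2

def solve_y(x, y0=math.pi / 2, steps=50):
    """Newton's method for y + x sin y = C, started near y0."""
    y = y0
    for _ in range(steps):
        g = y + x * math.sin(y) - C       # we want g = 0
        dg = 1 + x * math.cos(y)          # derivative of g with respect to y
        y -= g / dg
    return y

def slope_formula(x, y):
    """dy/dx from implicit differentiation."""
    return -math.sin(y) / (1 + x * math.cos(y))

h = 1e-5
numeric = (solve_y(1 + h) - solve_y(1 - h)) / (2 * h)
print(numeric, slope_formula(1, math.pi / 2))   # both close to -1
```

The difference quotient of the numerically solved y values should agree with the slope −1 found from the implicit formula.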

4.2.2 Related Rates

It is often necessary to derive and use some relationship between the rates of change of several different functions. The fundamental principle is that ordinarily this is done in two steps:

(1) Derive a general relationship between the functions themselves. Generally this first requires writing down carefully what the functions themselves are, often by starting with the rates of change and working backwards to the functions.

(2) Establish a general relationship among the rates of change by differentiation. If appropriate, evaluate at specific times, positions or whatever.

The phrase “related rates” is often used in calculus texts only in contexts where the relationship between the original functions involves composition of functions and the differentiation process therefore involves the chain rule. This is artificially restrictive, since the same principles apply equally well with other relationships between functions.

Example. I am walking parallel to a straight wall at a speed of 1 meter/sec. I am 10 meters from the wall and there is a light at head height 5 meters farther from the wall. How fast does the shadow of my head move along the wall?

First we need to identify the functions. Since we know one speed (derivative of position) and want to find another, it is natural to take as fundamental quantities the position of my head and of my head’s shadow. We must choose some position as a zero point. Since it is rates of change we care about, what choice we make really shouldn’t matter: changing positions by a constant amount will not affect their derivatives. However, the most natural choice of zero positions would be on the line through the light perpendicular to the wall, since that leads to this diagram with two similar right triangles.

[Figure: View from above, showing the light, the wall, my starting point, my current position, and the distances f(t) and g(t).]

If f(t) is the distance I have moved from my zero position at time t, assuming that I am at the zero position at time 0, and g(t) is the distance my shadow has moved from its zero position, then f(t) = t and equating the ratios of vertical to horizontal sides of the triangles gives

$$\frac{t}{5} = \frac{g(t)}{15}, \quad \text{or} \quad g(t) = 3t.$$

Now it is obvious that the speed of my head’s shadow is $g'(t) = 3$ meters/sec.
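The two steps (relate the functions, then differentiate) can be mirrored numerically. This is a minimal Python sketch rather than anything from the original notes; the function names `shadow_position` and `central_difference` are hypothetical, and the distances 5 and 15 meters are the ones used in the example.

```python
def shadow_position(t, light_to_me=5.0, light_to_wall=15.0, walking_speed=1.0):
    """Position g(t) of the shadow, from the similar triangles t/5 = g(t)/15."""
    f_t = walking_speed * t  # my own position, f(t) = t
    return f_t * light_to_wall / light_to_me


def central_difference(func, t, h=1e-6):
    """Symmetric difference quotient, an approximation to func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


# Step (2): differentiate the relationship; the result should be close to 3 meters/sec.
shadow_speed = central_difference(shadow_position, 2.0)
print(shadow_speed)
```

Because g is linear in t, the estimate is the same at every time t, which matches the constant speed found above.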

EXERCISES

1. You are blowing up a spherical balloon so that its radius is increasing at a constant rate of 2 cm/sec. At what rate are you blowing into the balloon when the radius is 10 cm? Be sure to include the correct units in your answer.


2. The elevator on the Space Needle goes from ground level to a height of 500 feet. You are watching the elevator from a window 50 feet above the ground in a building 200 feet away. The elevator begins to descend from the top at a constant speed of 30 feet/sec.

(a) Find an expression for h(t), the height of the elevator t seconds after it begins to descend, and an equation connecting h(t) and θ(t), the angle your line of sight to the elevator makes with the horizontal at time t.

(b) If the apparent speed of the elevator is proportional to $d\theta/dt$, where is the elevator when it appears to you to be moving most rapidly?

3. The light in the example above about the speed of my shadow is a flashlight held by a bashful admirer, who moves directly away from the wall at 1 meter/sec from her initial position 15 meters from the wall, but who keeps the flashlight pointed at me while she retreats. Now how fast does my head’s shadow move? Does the speed increase or decrease with time? Does it eventually approach a limit if neither of us ever gets tired?

4. The light in the example is a flashlight held by the Road Runner. He stands still initially until I have moved 5 meters and so my shadow has moved 15 meters. Then he starts running directly away from the wall so that as he points his flashlight at me, my shadow doesn’t move though I keep walking at 1 meter/sec. Assume that he can move as fast as may be required, but is not smart enough to think of a better direction to move to keep my shadow stationary. Find the Road Runner’s distance from the wall as a function of time. Is there a limit to how long he can keep my shadow stationary even if there is no limit to how fast he can run? If so, what? Explain in terms of the diagram why the limiting time is what it is. And what happens to his speed as that limiting time is approached?

5. Everybody is tired, so now the light is a lighthouse a mile offshore from a straight shoreline consisting of a high cliff. The light rotates at a constant rate, making two revolutions per minute. How fast (in miles/minute) is the beam of light moving along the cliff when the beam is shining on a point two miles from the closest point on the cliff to the lighthouse? At what position is the beam moving most rapidly along the cliff? Least rapidly along the cliff?

6. Consider the implicit equation $x + y(x^2 + y^2 - 1) = 0$.

(a) Determine all points where the graph of this equation touches or crosses a coordinate axis.


(b) Graph the equation. It should be compatible with your answer to (a)! Your TI-89 should do this in implicit mode. If all else fails, google Wolfram Alpha and enter plot x + y(x^2 +y^2 -1) = 0.

(c) Determine all points where the graph has a horizontal or vertical tangent line. Each of these points occurs where the graph intersects another curve (simpler, fortunately). Include these other curves on your diagram also.

4.3 Graphing and Calculus

As already discussed in Chapter 1, the existence of graphing technology does not lessen the need for a good qualitative understanding of the properties of graphs and a thorough knowledge of the graphs of the standard families of functions. And as you know, calculus can be very useful in exploring the graph of a function.

The first derivative of a function f is the rate of change of f. Thus f is increasing when f' is positive, and decreasing when f' is negative. A real number a is a critical point of f if either f'(a) = 0 or f'(a) does not exist (for instance a = 0 for f(x) = |x|). A function which is differentiable except possibly at scattered points can change from increasing to decreasing or vice versa only at a critical point. This is why we look for local maxima and minima by finding the critical points of a function. But notice that f does not necessarily change direction at a critical point; $x^3$ is increasing on the entire real line despite the fact that its derivative vanishes at x = 0.

The second derivative, f'', of f is the derivative of the first derivative. Thus it is the rate of change of f' and is related to the behavior of f' in exactly the same way that the behavior of f' is related to that of f: f' is increasing when f'' is positive (we say the graph of f is concave up) and f' is decreasing when f'' is negative (we say the graph of f is concave down).

More carefully, we call a function concave up on an interval I if for every pair of points a < b in I, the line segment joining (a, f(a)) to (b, f(b)) lies above the graph of f at any point between a and b. Sometimes a function with this property is said to be strictly concave up, but I will not use this terminology here. The issue is whether a line segment is considered to be concave up or not. In this class we will not consider a curve containing a line segment to be concave up. Recall that if f'' is positive on an interval I, then f' (the slope of the graph) is strictly increasing on that interval and so the graph of f is concave up. Similarly, we say the graph of f is concave down if the graph of −f is concave up.

It follows from the previous paragraph that the graph of a line is neither concave up nor concave down. Sometimes the definition is more lenient and lines are considered both concave up and concave down, but not in this course.


Because the convention varies, whenever the term occurs in a new context you should try to discover which version of the definition is being used.

A point where the concavity of the graph of f changes from concave up to concave down or the reverse is a point of inflection of f. A point of inflection must be a point a where either f''(a) = 0 or f''(a) does not exist. (Why?) But note that one can have f''(a) = 0 without the concavity changing (for instance the graph of $x^4$ is concave up everywhere, but its second derivative $12x^2$ is 0 at x = 0). Since f' changes from increasing to decreasing or from decreasing to increasing at an inflection point, a point of inflection is a point where the first derivative f' has a local maximum (f' changing from increasing to decreasing, that is, f'' changing from positive to negative) or a local minimum (f' changing from decreasing to increasing, that is, f'' changing from negative to positive).

All of the material in the previous paragraphs is pretty obvious when you look at a diagram. This is how you should remember it, pictorially; don’t try to remember it as a list of rules written out in words without a picture.

[Figure: diagram labeling where f' > 0, where f' < 0, and the critical points, and where f'' > 0, where f'' < 0, and the points of inflection.]
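The distinction between f''(a) = 0 and an actual change of concavity can also be seen numerically. The sketch below is an illustration only, not part of the original notes; the helper names `second_difference` and `concavity_changes_at` are hypothetical, and the second derivative is approximated by a central second difference rather than computed symbolically.

```python
def second_difference(func, x, h=1e-3):
    """Central second difference, an approximation to func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2


def concavity_changes_at(func, a, delta=0.1):
    """True if the approximate second derivative has opposite signs on either side of a."""
    left = second_difference(func, a - delta)
    right = second_difference(func, a + delta)
    return left * right < 0


# x**3 has second derivative 6x, which changes sign at 0: a point of inflection.
cubic_has_inflection = concavity_changes_at(lambda x: x**3, 0.0)
# x**4 has second derivative 12x**2, which is 0 at 0 but never negative: no inflection.
quartic_has_inflection = concavity_changes_at(lambda x: x**4, 0.0)
print(cubic_has_inflection, quartic_has_inflection)
```

Checking the sign on both sides of a, rather than the value at a, is what separates the two cases.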

EXERCISES

1. For each of the following graphs, sketch the graphs of f' and f''.


[Figure: four graphs, labeled (a), (b), (c), and (d).]

2. The graph in (a) is the derivative of f.

(a) On a copy of the graph of the derivative f', indicate the location of all inflection points of the function f itself. Make a rough sketch of f consistent with this information. What arbitrary choice do you have to make?

[Figure: Graph of f']

(b) On a copy of this graph of the second derivative f'', indicate the location of all inflection points of the function f itself.


[Figure: Graph of f'']

3. Let $f(x) = \sqrt[3]{2x - 2}$. Discuss the graph of this function at x = 1. Is f defined there? Continuous? Differentiable? Is there a local min or max at x = 1? A point of inflection? Explain.

4. Let $f(x) = x^3(x - 3)$. WITHOUT using your graphing calculator, answer the following questions.

(a) Where is f = 0? At which of these points does f change sign?

(b) What happens to the graph of f as x → ∞? As x → −∞?

(c) Sketch a rough graph of f near x = 0, keeping in mind that x − 3 ≈ −3 near x = 0.

(d) Use the information from parts (a), (b), and (c) to sketch a rough graph of f that shows all of the information you have collected.

5. Suppose that the differentiable function f has three zeros on [a, b]. Show that there is at least one c in [a, b] such that f''(c) = 0. What does this say about the shape of the graph?

6. A function f has f(0) = 0 and derivative $f'(x) = \sin(x^2) + \frac{x}{\ln(x + 1)} - 1$. Graph f' and its derivative, f'', for 0 ≤ x ≤ 3. (You can enter f'' efficiently in the TI-89 as d(y1(x), x) if you have entered f' as y1.)

(a) Use these graphs to state where f is increasing and where decreasing, and where f is concave up and where concave down.

(b) Using this information, sketch a graph of f for 0 ≤ x ≤ 3.

7. Assume that the polynomial p has exactly two local maxima and one local minimum.

(a) Sketch a possible graph of p.

(b) What is the largest number of zeros p could have?


(c) What is the least number of zeros p could have?

(d) What is the least number of inflection points p could have?

(e) Is the degree of p even or odd? How can you tell?

(f) What is the smallest degree p could have?

(g) Find a possible formula for p.

8. Sketch the graph of a differentiable function f for 0 ≤ x ≤ 10 if f has all the following properties.

(a) |f''(x)| ≤ 0.5 for all x.

(b) f' takes on the values −2 and 2 somewhere in this interval.

(c) f'' is not constant.

9. For each part, sketch a possible graph of f using the given information about f' and f''. Assume that f is defined, continuous throughout the interval, and twice differentiable except as specified.

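(a) [Figure: chart of f' and f'' along the interval; labels as extracted, which may not match their left-to-right positions. f': f' = 0.5, f' = 0.5, f' > 0. f'': f'' = 0, f'' > 0, f'' = 0.]

(b) [Figure: chart of f' and f'' along the interval; labels as extracted. f': f' = 0, f' > 0, f' > 0. f'': f'' = 0, f'' = 0, f'' < 0, f'' < 0, f'' > 0.]

(c) [Figure: chart of f' and f'' along the interval; labels as extracted. f': f' undefined, f' = 0, f' > 0, f' > 0, f' < 0. f'': f'' > 0, f'' > 0, f'' undefined.]

(d) [Figure: chart of f' and f'' along the interval; labels as extracted. f': f' = 0, f' > 0, f' > 0, f' > 0, f' < 0, f' = 0. f'': f'' = 0, f'' = 0, f'' > 0, f'' < 0, f'' < 0.]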

10. The graph shows the derivative f' of a function f with f(0) = 0.

[Figure: graph of y = f'(x), with x-axis ticks from −1 to 5 and y-axis ticks from −10 to 20.]


(a) Estimate f(−1).

(b) Where does f have its maximum value for −1 ≤ x ≤ 4?

(c) Where does f have its minimum value for −1 ≤ x ≤ 4?

(d) Locate all inflection points on the graph of f in this interval.

(e) Sketch a possible graph of f for −1 ≤ x ≤ 4.

4.4 Optimization

Often we want to know where a function takes on its maximum or minimum value. The simplest way to identify such a point is often just to graph the function (provided, of course, that we have a handy graphing calculator or computer graphing program). From looking at such graphs we know that if the maximum (or minimum) of a function over an interval occurs somewhere other than at an endpoint of the interval, then it must be at a critical point of the function, that is, a point where either the function is not differentiable, or is differentiable and has derivative zero. Thus calculus can also generally help us to identify maxima and minima, provided we keep all the possibilities in mind. But simply drawing the graph can be a better option when it is available.

Many standard max-min problems fall into the “not really a calculus problem” category. For instance, consider the hoariest optimization problem of all.

Example 1. A farmer has 100 meters of fence and wants to enclose a rectangular field of maximum area. What should the dimensions be?

If two adjacent sides of the field have lengths x and y, then the perimeter of the field is 2x + 2y = 100 and the area is xy. Solving the first equation for y in terms of x gives y = 50 − x, and plugging into the area expression gives A = x(50 − x). So we must maximize this function. Of course we could set the derivative equal to zero, but it is easier to note that the graph of x(50 − x) is an upside-down parabola with zeros at x = 0 and x = 50, which therefore takes its largest value (at its vertex) at x = 25. So then also y = 25; that is, the largest field is square with sides 25 meters and area 625 square meters.


[Figure: graph of the area x(50 − x) for 0 ≤ x ≤ 50.]
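Reading the maximum off a graph amounts to comparing many function values. A rough Python sketch of that idea for Example 1 follows; it is not from the original notes, the name `area` and the half-meter grid are illustrative assumptions, and a grid search only finds the best value among the candidates tried.

```python
def area(x, fence_length=100.0):
    """Area of a rectangle with one side x and perimeter fence_length."""
    y = fence_length / 2 - x
    return x * y


# Candidate widths 0, 0.5, 1, ..., 50 meters, mimicking points plotted on a graph.
candidates = [0.5 * i for i in range(101)]
best_x = max(candidates, key=area)
best_area = area(best_x)
print(best_x, best_area)
```

The best candidate is the vertex of the parabola, x = 25, which is the square field found above.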

The moral of this problem is that, as you already know, most of the thought in an optimization problem goes into determining the function which is to be maximized or minimized. Then a graph is often all you need to finish the problem. There can be, from a mathematician’s point of view, some small loss in just drawing a graph in that sometimes we end up with a numerical value instead of a symbolic one. But in a practical situation the numerical value is generally what you want anyway. In this course I will try to tell you whenever I really want an exact (i.e. symbolic) answer. Otherwise a sufficiently accurate decimal is fine. To illustrate the difference, here is a slightly more complex example.

Example 2. Find the dimensions of the cylindrical tin can with volume 1000 cm$^3$ which is “most efficient” in the sense of requiring the least material to make.

Here we assume, perhaps optimistically, that there is no waste in manufacturing, so the problem is to minimize the surface area of a cylinder (including ends) of volume 1000 cm$^3$. If the cylinder has radius r and height h, then its surface area (including ends) is

$$S = 2\pi r h + 2\pi r^2$$

where the first term is the area of the side of the can and the second term is the area of the two ends. We also know $V = \pi r^2 h = 1000$. We can use this equation to eliminate one of the variables in the expression for S. If we substitute $h = 1000/(\pi r^2)$ into the expression for S we get

$$S = \frac{2000}{r} + 2\pi r^2.$$

Now the simplest thing to do is to look at the graph. We see something like the graph below


[Figure: graph of $2000/r + 2\pi r^2$ for 0 < r ≤ 20.]

and using the Minimum command on the TI-89 we see that the minimum value is about 553.581 cm$^2$ when r ≈ 5.419 cm. Plugging into the equation for h, h ≈ 10.839 cm.

Of course we could also do this problem symbolically by setting

$$\frac{dS}{dr} = -\frac{2000}{r^2} + 4\pi r = 0$$

and solving to get $r = (500/\pi)^{1/3} \approx 5.419$ and

$$h = \frac{1000}{\pi r^2} = \frac{1000}{\pi}\left(\frac{\pi}{500}\right)^{2/3} = 2\left(\frac{500}{\pi}\right)^{1/3} \approx 10.839.$$

But in this situation there is no real need to do this.
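For readers without a TI-89, the numerical and symbolic answers can be compared in a few lines of Python. This is only a sketch under stated assumptions: the helper names are hypothetical, and ternary search is a stand-in for the calculator's Minimum command (whose internal method is not described in these notes). It relies on the surface area having a single minimum on the search interval, as the graph suggests.

```python
import math


def surface_area(r, volume=1000.0):
    """Surface area of a closed cylinder with the given volume and radius r."""
    h = volume / (math.pi * r**2)
    return 2 * math.pi * r * h + 2 * math.pi * r**2


def ternary_search_min(func, lo, hi, iterations=200):
    """Shrink [lo, hi] around the minimizer of a function with a single minimum there."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


r_numeric = ternary_search_min(surface_area, 1.0, 20.0)
r_symbolic = (500 / math.pi) ** (1 / 3)      # from dS/dr = 0
h_numeric = 1000 / (math.pi * r_numeric**2)
print(r_numeric, r_symbolic, h_numeric, surface_area(r_numeric))
```

The two radii should agree to several decimal places, and the height should come out close to twice the radius, as in the symbolic solution.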

In addition, drawing at least a rough graph is, as always, a good defence against falling into error.

Example 3. Find the dimensions of the right triangle with hypotenuse 5 and minimum perimeter.

If we call one leg of the triangle x, then the other is $\sqrt{25 - x^2}$. The perimeter is

$$P = 5 + x + \sqrt{25 - x^2}.$$

[Figure: Right triangle with hypotenuse 5 and legs $x$ and $\sqrt{25 - x^2}$.]

If we just proceed symbolically,

$$\frac{dP}{dx} = 1 + \left(\frac{1}{2}\right)\left(25 - x^2\right)^{-1/2}(-2x) = 1 - \frac{x}{\sqrt{25 - x^2}}.$$


Then $dP/dx = 0$ when $\sqrt{25 - x^2} = x$, or $x^2 = 25/2$, or $x = 5/\sqrt{2}$. This is the only critical point of P and corresponds to an isosceles right triangle (that is what $x = \sqrt{25 - x^2}$ says), which seems like a reasonable answer. Unfortunately, however, the graph of $5 + x + \sqrt{25 - x^2}$

[Figure: graph of $5 + x + \sqrt{25 - x^2}$ for 0 ≤ x ≤ 5.]

reveals that this is the maximum perimeter, and that the minimum value of P for 0 ≤ x ≤ 5 occurs for x = 0 and x = 5. The minimum perimeter occurs for the degenerate triangle when one side has shrunk to zero and the other two sides are both of length 5. (Or, if you don’t like degenerate triangles, then there is no minimum.) So here again just drawing the graph in the first place would have been a better strategy than differentiating, as long as it was easy to do.
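The same conclusion follows from comparing the perimeter at the endpoints with its value at the critical point, which is the "keep all the possibilities in mind" check in miniature. The Python sketch below is illustrative and not from the original notes; the names `perimeter`, `candidates`, and `values` are assumptions of the sketch.

```python
import math


def perimeter(x, hypotenuse=5.0):
    """Perimeter of a right triangle with one leg x and the given hypotenuse."""
    return hypotenuse + x + math.sqrt(hypotenuse**2 - x**2)


critical_x = 5 / math.sqrt(2)                 # where dP/dx = 0
candidates = [0.0, critical_x, 5.0]           # both endpoints and the only critical point
values = {x: perimeter(x) for x in candidates}

x_of_max = max(values, key=values.get)
smallest_value = min(values.values())
print(values, x_of_max, smallest_value)
```

The critical point gives the largest of the three values, and the smallest value, 10, occurs at both endpoints, that is, for the degenerate triangles.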

This does not mean that finding an extremum by means of calculus is never the most efficient thing to do; it does mean that you should think about what the best method is likely to be before plunging into calculations. One common situation where symbolic methods are best occurs when you want to know how the solution to a problem depends on one or more parameters.

Example 4

Let a > 0. What is the lowest point on the graph of $f(x) = x^2 + a/x$ for x > 0? How does the location of this point change as a varies?

$f'(x) = 2x - a/x^2 = 0$ if $x^3 = a/2$, or $x = (a/2)^{1/3}$. This point does represent a minimum on the curve since, for instance, $f'(x) < 0$ for $x < (a/2)^{1/3}$ and $f'(x) > 0$ for $x > (a/2)^{1/3}$. The y-coordinate of the lowest point is $(a/2)^{2/3} + a/(a/2)^{1/3} = 3(a/2)^{2/3}$. Thus the lowest point will move up and to the right as a increases. The diagram shows graphs of f for a = 2 and a = 8.


[Figure: graphs of $x^2 + a/x$ for a = 2 and a = 8.]
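The payoff of a symbolic answer is that it can be evaluated for any parameter value at once. A short Python sketch, with hypothetical function names `lowest_point` and `f`, tabulates the lowest point from the formulas of Example 4 and compares it with a direct evaluation of the function; it is an illustration, not part of the original notes.

```python
def lowest_point(a):
    """Lowest point of x**2 + a/x for x > 0, using the formulas derived in Example 4."""
    x_min = (a / 2) ** (1 / 3)
    y_min = 3 * (a / 2) ** (2 / 3)
    return x_min, y_min


def f(x, a):
    """The function x**2 + a/x itself, used to cross-check the formula for y_min."""
    return x**2 + a / x


for a in (2, 8):
    x_min, y_min = lowest_point(a)
    # The last two printed numbers should agree: formula value and direct evaluation.
    print(a, x_min, y_min, f(x_min, a))
```

For a = 2 the lowest point is (1, 3); for a = 8 both coordinates are larger, consistent with the point moving up and to the right as a increases.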

EXERCISES

Don’t feel bound to use calculus if it is not really necessary. Think about what the most efficient way to solve each problem may be. However if you solve a problem graphically or numerically, report all parts of the answer correct to at least three decimal places.

1. A 216 m$^2$ rectangular pea patch is to be enclosed by a fence and divided into two equal parts by another fence parallel to one of the sides. What dimensions for the outer rectangle will require the smallest total length of fence? How much fence will be needed?

2. In the preceding problem, suppose that the exterior fence must be fancy fence to keep out deer that costs $2 per foot, but the internal division can be done with fence costing only $1 per foot. Now what dimensions will make the cost of the fence as low as possible?

3. The sinc function, $\operatorname{sinc} x = \frac{\sin x}{x}$, is important in Fourier analysis and in applications such as information theory. Show that the local extrema of the sinc function occur where it intersects cos x. Illustrate with a diagram of both functions.

4. Determine all values of c such that the area A(c) between the graph of |1 − |x|| + c and the x-axis is a minimum, and find the minimum value of A(c).

5. Determine the maximum value of $\frac{1}{1 + |x|} + \frac{1}{1 + |x - 2|}$ and where it occurs. (Think about the graph. Is this function differentiable?)


6. Some problems on sums of absolute values. Illustrate each with an appropriate diagram.

(a) What value of x minimizes |x − 1| + |x − 2| + |x − 4|? What is this minimum value? (Think about the graph and its slopes. Is this function differentiable?)

(b) If $a_1 < a_2 < a_3$, what value(s) of x minimizes $|x - a_1| + |x - a_2| + |x - a_3|$? What is this minimum value?

(c) If $a_1 < a_2 < a_3 < a_4$, what value(s) of x minimizes $|x - a_1| + |x - a_2| + |x - a_3| + |x - a_4|$? What is this minimum value?

7. Find the highest and lowest points on the graph of $x^2 + xy + y^2 = 12$. Sketch at least a rough graph of this curve.

8. The power that a windmill can extract from the air has been modeled as $P(x) = 2kx^2(V - x)$ where V is the wind velocity (assumed constant), x is the average of the wind speeds in front of and behind the windmill, and k is a constant of proportionality.

(a) Using this formula, what value for x should the windmill designer aim for in order to maximize P?

(b) If the total power in the windstream is $P = \frac{k}{2}V^3$, show that the most the windmill designer can hope to achieve is to extract $\frac{16}{27}$ of the total power.

9. You have been asked to design a conical oil can with lid to hold 100 cc. What dimensions will use the least material? Be sure to include the bottom as well as the cone. What dimensions would use the most material? (For reference, the formulas for volume and surface area of a cone of radius r and height h are $V = \frac{1}{3}\pi r^2 h$, $S = \pi r\sqrt{r^2 + h^2}$. We will see where these formulas come from in Chapter 6.)

10. Determine the maximum angle between the line from the origin to the point P on the graph of $f(x) = xe^{-x}$ and the line tangent to this graph at P. Express the angle both in radians and in degrees. Illustrate with a diagram.

11. Find the points on the ellipse $x^2/9 + y^2 = 1$

(a) closest to the point (2, 0).

(b) closest to the focus $(\sqrt{8}, 0)$.

12. Find the point on the parabola $y = (x - 1)^2$ that is closest to the origin.


13. Find the point or points on the graph of $f(x) = 2 - 2\sin x$ closest to the origin and the minimum distance. Sketch the graph and locate the closest point or points. (Suggestion: You cannot do this problem using only symbolic methods.)

14. Consider the parabola $y = 1 - x^2$.

(a) Find the point or points on the graph of $y = 1 - x^2$ closest to the origin.

(b) Show that the line from the origin to each of these points is perpendicular to the tangent line to the parabola at that point.

(c) Give an intuitive explanation for (b). (Suggestion: Draw a picture. If the line from the origin crosses the parabola at a point where it is not perpendicular to the tangent line, in which direction can you move along the curve to get closer to the origin? How does this answer the question?)

15. The normal line through a point P on the graph of $y = x^2$ in the first quadrant also intersects the graph of $y = x^2$ at a point Q in the second quadrant. Sketch the graph of the function that assigns the y-coordinate of Q to the x-coordinate of each P. Determine P such that the y-coordinate of Q is as small as possible.

16. The line tangent to the graph of $y = 4 - x^2$ at a point P in the first quadrant determines a triangle bounded by the tangent line and the coordinate axes. Determine the point or points for which the area of this triangle is a maximum.

17. A right triangle has legs on the x and y-axes. Its hypotenuse passes through the point (2, 1). Which such triangle has the smallest area? Which such triangle has the largest area? What are these areas?

18. In the situation of the previous problem which triangle has the smallest perimeter and what is that perimeter?

19. A hexagon is inscribed in the unit circle with vertices at (0, ±1) and at (±x, ±y). For what point (x, y) is the area of the hexagon as large as possible? Show that this is a regular hexagon.

20. Find the triangle of minimum area bounded by the positive coordinate axes and a line tangent to the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ in the first quadrant.


21. Let $f(x) = xe^{-x}$. Determine the point P in the first quadrant on the graph of f with the property that the angle between the line from the origin to P and the tangent line to f at P is as large as possible. What is this angle? (Both degrees and radians.)

22. Show that $|\sin x - \cos x| \leq \sqrt{2}$ for all x.

23. Show that $\frac{e^{x+y}}{xy} \geq e^2$ for all positive numbers x and y.

24. Show that if a and b are positive, then a(1 − b) and b(1 − a) cannot both exceed 1/4.

25. The graph below shows the fuel consumption, f(v), in gallons per hour of a small airplane as a function of its airspeed v, measured in miles per hour.

(a) Let g(v) be the fuel consumption of the same airplane measured in gallons per mile traveled at speed v instead of gallons per hour. Determine the relationship between f(v) and g(v).

(b) At which point on the graph is f(v) minimized?

(c) At which point on the graph is g(v) minimized? Explain your reasoning.

(d) If the pilot wants to travel at minimum total fuel cost, which should she try to minimize, f(v) or g(v)?

[Figure: graph of f, showing fuel consumption f(v) against airspeed v for 0 ≤ v ≤ 500.]

26. Researchers have modeled the total energy expended by a salmon swimming upstream as $E = cv^3 t$ where v is the velocity of the salmon in still water in miles per hour, t is the number of hours the salmon spends swimming upstream, and c is a constant of proportionality. Suppose a particular salmon needs to travel 200 miles upstream in a river whose current is flowing at 4 mph. At what velocity should the salmon swim in order to minimize total energy expended?

(a) Repeat the previous problem if the current is flowing at s mph to get an answer in terms of s. Does the optimal velocity depend on the length of the river?

(b) Show that if instead $E = cv^{\alpha} t$, α > 0, then the optimal velocity is a multiple of s, but the constant depends on α. Verify that for α = 3 you get the same result as in (a). The model for these two problems was actually derived backwards. It was observed that salmon typically travel at the multiple of current velocity s derived in part (a). It is assumed that they instinctively know the most efficient velocity. Then the exponent α was chosen equal to 3 to give the correct constant.

27. You are standing on the edge of a straight canal (with no current) of width 100 meters. You want to get to a point on the opposite edge of the canal 200 meters along the canal from your present position. You can paddle on the canal at 3 m/s and run along the edge at 4 m/s. Determine the route (paddle first, then run) that will get you to your objective in the minimum time.

28. In the situation of the previous problem, you can paddle on the canal at a speed $v_1$ meters/sec and run along the edge at speed $v_2$. Determine (in terms of $v_1$ and $v_2$) the route that will get you to your objective in the minimum time. There is more than one case, so be careful. Some issues to consider: (i) if $v_1 > v_2$ then you should just paddle directly to your objective. Is there any other relationship between $v_1$ and $v_2$ for which you should do this? (ii) is it ever most efficient to paddle directly across the canal to the point directly opposite you and then run along the canal for the entire 200 meters?

29. Light travels in such a way as to require the minimum possible time to get from one point to another. A ray of light from point A travels to point B by reflecting off a plane mirror as in the left diagram below. Show that the two angles $\theta_1$ and $\theta_2$ between the two parts of the path of the ray of light and a line perpendicular to the mirror are equal. (Suggestion: Since the speed of light is constant, this is equivalent to minimizing the distance traveled.)


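[Figure: two diagrams. Left: a ray of light reflecting off a plane mirror, with angles $\theta_1$ and $\theta_2$. Right: a ray of light crossing a planar interface, with angles $\theta_1$ and $\theta_2$.]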

30. Suppose now that the points A and B are in different materials (say air and water) in which light travels with respective speeds $v_1$ and $v_2$ and that there is a planar interface between the two materials. Show that the relationship between the angles $\theta_1$ and $\theta_2$ in the right diagram above is

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}.$$

This is known as Snell’s Law.

31. It takes Moe a certain length of time to travel to and from the mall. While there is no end to the things he would like to buy there, his buying rate decreases with time for a variety of reasons (he has to go to farther corners to find stuff, he gets tired and has to refresh himself with a latte...). A graph of the value of the things he has bought as a function of time at the mall looks like this. Here the interval on the horizontal axis labeled TT represents his travel time to and from the mall and the time to the right of the vertical axis is his time at the mall.

[Figure: graph of dollars spent against time; the interval labeled TT lies to the left of the vertical axis.]

Moe is a compulsive efficiency seeker. In this case he defines efficiency as the ratio

$$\frac{\text{dollars spent}}{\text{total time spent}}$$

where the total time includes travel time as well as time at the mall.


(a) Pick a time spent at the mall, say t = 4 hours. Draw a line on a copy of this graph whose slope is the efficiency of spending 4 hours at the mall.

(b) Using the graph, estimate what amount of time spent at the mall will maximize Moe’s efficiency.

(c) If Moe can find a way to get to and from the mall more quickly, would the time he spends at the mall for maximum efficiency increase or decrease? Explain.

4.5 Families of Functions

It happens fairly often that we want to select a particular member of a family of functions that has some prescribed property or properties. Calculus is often, though not always, a useful tool in the selection process.

We have already considered such a problem at the beginning of the course when trying to find a specific formula for a function that increased or decreased to some limiting value as x → ∞. In that case the general family was of the form

$$f(x) = c + b \cdot a^{kx}$$

where the constants a, b, c, k are the parameters defining the family, that is, they are constants for any particular member of the family but are different for different members. (We need a > 0 but the other parameters can be any real number.) In this case we recall that there is some redundancy built in since different choices for a, b, and k can give the same function. How do the different constants affect the shape of the graph? The constant c is the easiest to characterize: if a > 1, then $c = \lim_{x \to -\infty} f(x)$ if k > 0 and $c = \lim_{x \to \infty} f(x)$ if k < 0. (If 0 < a < 1 these are reversed. What if a = 1?) The sign of b determines whether the graph approaches the horizontal asymptote y = c from above (b > 0) or below (b < 0). The magnitudes of b, a, and k determine how rapidly the graph approaches the asymptote; to determine them we need to know two points that the graph passes through.

Another family would be

$$f(x) = axe^{-bx}$$

where we will assume a > 0, b > 0. These functions have graphs with this general shape for x > 0.


[Figure: graph of a typical member of the family for 0 ≤ x ≤ 6.]

It is clear from the formula that $\lim_{x \to \infty} f(x) = 0$ for all members of the family (because the exponential $e^{bx}$ will overwhelm the power function x) and the diagram suggests that the graph will have a single global maximum. How could we determine a and b so that this maximum occurs at x = 3 with a value of, say, π?

Differentiating with the product rule, $f'(x) = ae^{-bx} - abxe^{-bx} = a(1 - bx)e^{-bx}$. From this we see that there is exactly one critical point at x = 1/b, and this is the location of the maximum since f' changes from positive to negative there. Thus to make the maximum occur at x = 3 we must choose b = 1/3. Then we have $f(3) = 3ae^{-1}$, so to make f(3) = π we must choose $a = \pi e/3$. The function we want is $\frac{\pi e}{3}xe^{-x/3} = \frac{\pi}{3}xe^{1 - x/3}$.
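Selecting a member of the family $axe^{-bx}$ with a prescribed maximum can be checked directly by evaluating the function. The Python sketch below is illustrative only; the name `family_member` and the variables `x_peak` and `peak_value` are hypothetical. It uses the fact derived above that the critical point is at x = 1/b, and then solves the condition f(3) = π for a.

```python
import math


def family_member(x, a, b):
    """The family member f(x) = a * x * exp(-b * x)."""
    return a * x * math.exp(-b * x)


x_peak = 3.0
peak_value = math.pi

b = 1 / x_peak                                      # the critical point is at x = 1/b
a = peak_value / (x_peak * math.exp(-b * x_peak))   # solve a * x_peak * exp(-b * x_peak) = peak_value

print(a, b, family_member(x_peak, a, b))
```

Evaluating the function slightly to the left and right of x = 3 gives values below π, as expected at a maximum.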

EXERCISES

1. Make a reasonably careful sketch of members of the family of parabolas $p(x) = (2cx - x^2)/c^3$ for several positive values of c. Find the equation of the curve containing the vertices of all these parabolas. Add this curve to your sketch.

2. Consider the quadratic polynomial $p(x) = a(x + b)^2$, where a and b are parameters.

(a) For fixed b, what is the effect of varying a? Consider both positive and negative a. Illustrate with a sketch of several graphs on the same axes.

(b) For fixed a, what is the effect of varying b? Consider both positive and negative b. Illustrate with a sketch of several graphs on the same axes.

3. Consider the polynomial $p(x) = x^4 + ax^2 + b$, where a and b are parameters.

(a) For what values of a and b does p have a single critical point? What is it? Is it a local maximum or a local minimum?

(b) For what values of a and b does p have three critical points? What are they? Which are local maxima and which are local minima?


(c) Sketch representative graphs of p in the two cases. Describe the effects of varying a with b fixed, and of varying b with a fixed.

4. Consider the cubic polynomial $p(x) = x^3 + ax$ where a is a parameter.

(a) Show that p is always increasing if a > 0. What about a = 0?

(b) Show that p has two critical points for a < 0, a local max and a local minimum. Find the location of each and the respective values of p in terms of a.

(c) As a varies, the local maximum and the local minimum move along a curve. Find a formula for the curve. (Suggestion: write down the coordinates of a point on the curve and express the y-coordinate in terms of the x-coordinate to eliminate a.)

(d) Sketch the curve from (c) and several members of the family $x^3 - ax$ to show how they fit together.

5. Consider the cubic polynomial $f(x) = x^3 + 3ax^2 + a^3$ where a is a positive parameter.

(a) Sketch the graph of f for $a = \frac{1}{2}$, a = 1, and a = 2.

(b) For which points (x, y) in the plane is there a positive value of a for which f(x) = y? In other words, what set of points in the plane is covered by the graphs of all the cubics for all a > 0?

(c) Is there any point in the plane which lies on two or more of these cubics? Either give an example or show that no example can exist.

6. Consider the function

$$f(x) = \begin{cases} ax^2 + bx, & x \leq 2, \\ 3x + 1, & x > 2. \end{cases}$$

(a) Find constants a and b so that the function is differentiable at x = 2. Sketch a graph!

(b) Suppose it is only required that f be continuous at x = 2. Under what conditions on a and b will this be true?

7. Let p be the cubic polynomial $p(x) = x^3 + ax^2 + bx + c$.

(a) Under what condition on the parameters a, b, and c is p(x) increasing on the entire real line, that is, p(x) < p(y) whenever x < y?

(b) Under what condition on the parameters a, b, and c does p always have exactly one point of inflection?


8. The number $N$ of students who have heard a rumor $t$ days after it starts to spread can be modeled by the function $N = a\left(1 - e^{-kt}\right)$ where $a$ and $k$ are parameters.

(a) Suppose that 10,000 students will eventually hear the rumor and that 10% of these hear it on the first day. Find $a$ and $k$.

(b) Describe the effect of varying $a$ with fixed $k$, both in terms of its effect on the graph and in terms of the situation being modeled.

(c) Describe the effect of varying $k$ with fixed $a$, both in terms of its effect on the graph and in terms of the situation being modeled.

9. Part of the track of a roller coaster starts at the maximum point of the parabola $y = 100 - \frac{1}{4}x^2$ and later runs along the x-axis for $x \ge 40$. (See the diagram below.) To make the ride smooth, a section of track following the graph of the function $y = a(x - 40)^2$ is inserted as a transition from the first parabola to the x-axis. Determine the value for $a$ which makes the ride as smooth as possible, that is, continuous and differentiable at all points. At what point do the parabolas intersect? What is special about the member of the family $a(x - 40)^2$ which makes the transition smooth?

[Figure: the roller coaster track, with x from 0 to 60 and y from 0 to 100]

10. $T$ and $N$ are the tangent and normal lines to the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ at the point $P$ on the part of the ellipse in the first quadrant. $T$ intersects the x-axis at $x_T$ and the y-axis at $y_T$; $N$ intersects the axes at $x_N$ and $y_N$. As $P$ varies through the first quadrant, what are the ranges of these four numbers?

11. Generalize the outcome of the ellipse problem from the previous section (problem #11) by determining the point on $x^2/9 + y^2 = 1$ closest to the point $(c, 0)$ for $0 \le c \le 3$. In particular determine the set of $c$’s for which the closest point is $(3, 0)$.

12. Let $f(x) = c\sqrt{1 + x^2} - x$ where $c$ is a positive constant.


(a) For which positive $c$ does the graph of $f$ have a local extremum at some positive number $x_0$? For such values of $c$, compute $f(x_0)$.

(b) The graph of $f$ can have several different shapes depending on $c$. Sketch a representative member of each such shape, and identify the values of $c$ associated with each shape. (Collectively the values of $c$ should cover all positive numbers.)

(c) Let $g$ be the function that assigns to each positive number $c$ for which $f(x) = c\sqrt{1 + x^2} - x$ has a local extremum at some positive $x_0$ the value of that extremum (the quantity $f(x_0)$ from (a)). What are the domain and range of $g$? Is $g$ a bounded function? Does it have a maximum and/or minimum value?

13. We will look at a limit of a sequence of differentiable functions that may be a bit surprising.

(a) For each $x > 0$ determine the point $P$ on the unit circle closest to the point $(x, x^2)$ in terms of $x$.

(b) $P$ has the form $(\cos\theta, \sin\theta)$ for some $\theta$ depending on $x$. Determine $\theta = T_2(x)$ explicitly.

(c) Generalizing, for each $n = 3, 4, \ldots$ determine the point $P_n$ on the unit circle closest to $(x, x^n)$ and the corresponding angle $\theta = T_n(x)$ in terms of $x$.

(d) Show that for any fixed $x > 0$, the sequence $\{T_n(x)\}_{n=2}^{\infty}$ is either increasing or decreasing and for each $x > 0$ converges to a number $L(x)$. (It is sufficient to argue from the graphs of the $T_n$’s.)

(e) Sketch the graph of $L$ for $x > 0$ and a representative set of $T_n$’s. How is $L$ essentially different from the $T_n$’s?

(f) Interpret the result in terms of the motion of the sequence $\{(x, x^n)\}_{n=2}^{\infty}$ for fixed $x$ as $n \to \infty$.

14. In 1900 Max Planck announced that the intensity of radiation of a black body depends on the wavelength $\lambda$ according to the formula

$$r(\lambda) = \frac{a}{\lambda^5\left(e^{b/\lambda} - 1\right)}.$$

Experimental data shows that at a temperature of 3000° Kelvin the intensity has a maximum of 3.13 MW/m²μm at a wavelength of 0.96 μm. Determine $a$ and $b$ for this temperature. (Suggestion: You may want to think about how to get the TI-89 to do as much of the symbolic manipulation as possible. Otherwise it can get ugly.)


4.6 Newton’s Method

How do you find the roots of an equation that cannot be solved symbolically? For instance, what is the smallest positive root of $x = \cos x$? A very simple method for such problems is the bisection method. We can see by looking at the graphs of $x$ and $\cos x$ that the smallest root of $x = \cos x$ lies between $x = 0$ and $x = \pi/2$. As the graphs below show, the difference $x - \cos x$ is negative at $x = 0$ and positive at $x = \pi/2$, so it seems that the curves must cross somewhere between these values. (It is harder than you might think to give a careful proof that the curves actually intersect in a point. It depends on a careful definition of the real numbers. The potential difficulty is that the equation might have no solutions at all, that the two graphs might somehow hop over each other. For “calculator numbers,” the limited set of rational numbers that can be represented exactly in the TI-89, say, this is actually true. There is no exact calculator number solution to $x = \cos x$. That is why numerical solvers do not look for equality but only for a difference smaller than some specified tolerance.)

[Figure: two panels, “x and cos x” (left) and “x − cos x” (right)]

To approximate the point of intersection by the bisection method we start with a pair of x values such that $x - \cos x$ is negative at one and positive at the other (0 and $\pi/2$ in this case), and examine the sign of $x - \cos x$ at the midpoint of the interval between them, here $\pi/4$. We find $\frac{\pi}{4} - \cos\frac{\pi}{4} = .078291 > 0$. We discard $\pi/2$ and now have a new interval, $0 \le x \le \pi/4$, where $x - \cos x$ has opposite signs at the two ends. We keep repeating the process, at each step confining the root to an interval half the length of that in the preceding step until we have determined the root sufficiently accurately. In this case, repetition would lead to the following table.

x           x − cos x
π/4          .078
π/8         -.531
3π/16       -.242
7π/32       -.086
15π/64      -.004
31π/128      .037
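The table can be reproduced with a few lines of code. What follows is only a minimal sketch in Python, not the TI-89 program the text has in mind; the function name `bisect`, its arguments, and the helper `g` are illustrative assumptions.

```python
import math

def bisect(f, lo, hi, steps):
    """Halve the bracketing interval [lo, hi] `steps` times.

    Assumes f(lo) and f(hi) have opposite signs. Returns the final
    interval and the list of (midpoint, f(midpoint)) pairs visited.
    """
    if f(lo) * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    history = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        fmid = f(mid)
        history.append((mid, fmid))
        if fmid == 0:
            # landed exactly on a zero (rare in floating point)
            lo = hi = mid
            break
        # keep the half of the interval on which f still changes sign
        if f(lo) * fmid < 0:
            hi = mid
        else:
            lo = mid
    return lo, hi, history

def g(x):
    return x - math.cos(x)

lo, hi, history = bisect(g, 0.0, math.pi / 2, 6)
for mid, value in history:
    print(f"{mid:.6f}  {value:+.6f}")
print(f"root lies between {lo:.3f} and {hi:.3f}")
```

Each pass through the loop costs one new function evaluation and halves the interval, which is why roughly three or four more steps are needed for every additional decimal place.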


After six steps we have found that the zero is between $\frac{15\pi}{64} \approx .736$ and $\frac{31\pi}{128} \approx .761$. It will take several more steps to locate it to even two decimal places.

The strengths of the bisection method are that it is simple to write a program that will do each step very quickly, and that it is inexorable: the two ends of the interval will inevitably close down on a zero. The weakness, as revealed by the little table above, is that convergence is relatively slow, so that many steps are required for even moderate accuracy. For this reason serious root finders tend to use some variation of a somewhat more sophisticated scheme called Newton’s Method.

The idea of Newton’s Method is actually quite simple. We make a series of guesses of the root of an expression $f(x)$. If we have an initial guess $x_0$, then to find our next guess at the root, $x_1$, we follow the tangent line to the graph of $f$ at the point $(x_0, f(x_0))$ until it intersects the x-axis. This point will be $x_1$. Then we repeat, as in the diagram below. Note that if $f$ is linear, then its graph is the same as the tangent line and $x_1$ will be the root exactly. If $f$ is nearly linear, then $x_1$ ought to be very close to the root. And we know that differentiable functions are nearly linear for short distances, so if we start close to the root, we should approach it very quickly.

[Figure: Newton’s method, with tangent lines leading from x0 to x1 to x2]

How do we set this up? Well, the tangent line to the graph of $f$ at $(x_0, f(x_0))$ is $y - f(x_0) = f'(x_0)(x - x_0)$. This line intersects the x-axis at $(x_1, 0)$. Plugging these values in, $0 - f(x_0) = f'(x_0)(x_1 - x_0)$, or

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$

In general, if we have $x_n$, then the next estimate is

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$

This is more complicated at each step than the bisection method, but it usually converges much faster if you can start reasonably near a root. We can implement this on the TI-89 for our example above, $f(x) = x - \cos x$, with initial guess $x_0 = \pi/4$ by entering x − (x − cos x)/(1 + sin x) | x = π/4 on the home screen and pressing (green diamond) + Enter to get $x_1 = .739536$ and


then replacing $\pi/4$ by .739536 to get $x_2 = .739085$. Another iteration still yields .739085 and plugging this value into $f$ gives $f(.739085) \approx -2.23 \times 10^{-7}$ so it is probably right to six decimal places. (Keeping more places would make $x_3$ even closer of course.)

Convergence is thus very much faster than with the bisection method. On the other hand, if the initial guess is too far from the root, then following the tangent line may take you so far away that you never find the root, or at best find a different root. Consider, for instance, this

EXAMPLE. Find the root of $f(x) = x - 10\cos x$ nearest to 3. If we plug in 3 as an initial guess, the method converges to −1.74633. If we try 3.5 as an initial guess, the method converges to 7.06889. As the graph shows, neither is the closest root to $x = 3$.

[Figure: graph of x − 10 cos x for x from −10 to 10]
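The iteration and the example can be tried out in code. This is a minimal sketch in Python under stated assumptions: the function `newton`, its stopping tolerance `tol`, and the iteration cap `max_iter` are illustrative choices, not part of the text, and the derivative is supplied by hand rather than computed symbolically as on the TI-89.

```python
import math

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) starting from x0.

    Stops when two successive guesses differ by less than tol
    (an assumed stopping rule, chosen for illustration).
    """
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0:
            raise ZeroDivisionError("horizontal tangent; Newton step undefined")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

def f(x):
    return x - math.cos(x)

def f_prime(x):
    return 1 + math.sin(x)

def h(x):
    return x - 10 * math.cos(x)

def h_prime(x):
    return 1 + 10 * math.sin(x)

# f(x) = x - cos x, starting from pi/4 as in the text
root = newton(f, f_prime, math.pi / 4)

# h(x) = x - 10 cos x: two nearby starting guesses lead to two different roots
from_3 = newton(h, h_prime, 3.0)
from_3_5 = newton(h, h_prime, 3.5)

print(round(root, 6), round(from_3, 5), round(from_3_5, 5))
```

Printing the intermediate guesses inside the loop is an easy way to watch the first tangent line from $x_0 = 3$ throw the iteration far to the left.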

The problem, as the graph also reveals, is that the slope of the curve is relatively small near $x = 3$. Thus the slope of the first tangent line is also small and so the tangent line intersects the x-axis far away. Then the method homes in on a root in the new region near $x_1$. The relationship between the initial guess and the root found can be very complicated. I may ask you to play with this.

Thus in practice commercial root finders generally do something like use the bisection method first to get close to a root and then switch to Newton’s method to converge rapidly. If using Newton’s method “by hand,” looking at a graph should allow you to make a good enough initial guess to avoid this problem.

EXERCISES.

1. Use Newton’s method to find $\sqrt[3]{50}$ correct to at least 4 decimal places. (First of course you will have to construct a simple equation of which this is a root.)

2. (a) Use Newton’s method to find a root of $f(x) = 3\sin x - 2x$ with an initial guess of 0.904 and also with an initial guess of 0.906.


(b) Explain your results in (a), using a diagram.

3. Back in medieval times, before electronic calculators, I was taught the following method to approximate $\sqrt{3}$ (or any square root) by hand. Start with a guess, say $x_0 = 2$. The next guess, $x_1$, will be the average of $x_0$ and the quotient $3/x_0$. In this case we get

$$x_1 = \frac{1}{2}\left(2 + \frac{3}{2}\right) = 1.75.$$

Now repeat as many times as desired.

(a) Explain why for any $x > 0$, exactly one of $x$ and $3/x$ is greater than $\sqrt{3}$ and one is less than $\sqrt{3}$. (Unless both are exactly $\sqrt{3}$, that is.) Thus the two numbers used for any estimate also provide an estimate of the possible error.

(b) Write out explicitly the formula for $x_{n+1}$ in terms of $x_n$ and show that this is the same formula that Newton’s method gives in this case.

4. Apply Newton’s method to $f(x) = x^{1/3}$ with an initial guess of 1. What happens? Discuss in terms of the explicit iteration formula and a diagram. Is there any nonzero initial guess that would allow Newton’s method to find the zero of this function?

5. Apply Newton’s method to $f(x) = e^{-x}$ with an initial guess of 0. Discuss what happens in terms of the explicit iteration formula and a diagram.


5. INTEGRATION

5.1 Introduction

Just as with differentiation, we begin this study of integration by discussing several problems.

Example 1. Distance. I leave Seattle during the rush hour in my aged van to drive back to Bellingham. Enough of my van still works to get me home, but the odometer (total mileage indicator) is long gone. The speedometer is still ok, though, and comparing its readings with my watch, I observe that I am speeding up at a uniform rate, so that my speed after $t$ hours is $20t$ miles per hour.
Question: How far have I gone after 2 hours?

Example 2. Mass. A nonhomogeneous bar has linear density (mass per unit length) $20x$ grams/cm a distance $x$ centimeters from the left hand end of the bar.
Question: What is the mass of the left hand two centimeters of the bar?

Example 3. Area. The line $y = 20x$, the x-axis, and the vertical line $x = 2$ bound a triangular region.
Question: What is the area of this region?

Discussion of Example 1. If my speed had been constant, it would be very easy to answer the question. We would just need to use the familiar formula

distance = speed × time.

Since my speed is not constant, I will have to do something more complicated than this to find my exact distance, but I can use this formula to estimate my distance.

A very simple-minded estimate is that my speed has just reached 40 miles per hour, so I have certainly gone less than 40 × 2 = 80 miles. This is unlikely to be very close, since for most of the two hours I was traveling much more slowly than 40 mph.

I can compensate for that to some extent by dividing the two hours into two one hour periods, estimating distance separately for each, and adding the estimates to get an estimate for the total distance. Since I was going 20 mph at the end of the first hour and 40 mph at the end of the second, this upper estimate becomes that I traveled less than

20 × 1 + 40 × 1 = 60 miles.


Similarly, I could get a lower estimate by using the fact that at the beginning of the first hour I was going 0 mph, at the beginning of the second hour I was going 20 mph, and in each case this is the lowest speed during that hour. Thus during the two hours I traveled more than

0 × 1 + 20 × 1 = 20 miles.

I now have my distance trapped between a lower estimate of 20 miles and an upper one of 60 miles. This is still not good enough to tell me whether I am in Everett or Mount Vernon, but I can decrease the gap by dividing the two hours into more parts.

Repeating with four half hour intervals, I get an upper estimate of

$$10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} + 40 \times \tfrac{1}{2} = 50 \text{ miles}$$

and a lower estimate of

$$0 \times \tfrac{1}{2} + 10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} = 30 \text{ miles}.$$

These are still pretty far apart, but the gap is half of what it was with the previous estimate, so presumably if I consider, say, eight 15 minute intervals (or 120 one minute intervals), I can do better (or much better). I will return to this in the next subsection, but first I’ll look at the other two problems.
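The estimates for finer subdivisions are tedious by hand but easy to automate. Here is a minimal sketch in Python, assuming only the speed rule $v = 20t$ from Example 1; the names `speed` and `distance_bounds` are made up for illustration.

```python
def speed(t):
    """Speedometer reading in mph after t hours (the rule from Example 1)."""
    return 20 * t

def distance_bounds(n, total_time=2.0):
    """Lower and upper distance estimates using n equal time intervals."""
    dt = total_time / n
    # the speed is increasing, so the start of each interval gives its lowest speed
    lower = sum(speed(k * dt) * dt for k in range(n))
    # and the end of each interval gives its highest speed
    upper = sum(speed((k + 1) * dt) * dt for k in range(n))
    return lower, upper

for n in (2, 4, 8, 120):
    lower, upper = distance_bounds(n)
    print(f"n = {n:3d}: between {lower:.2f} and {upper:.2f} miles")
```

The cases n = 2 and n = 4 should reproduce the pairs 20, 60 and 30, 50 computed above, which is a useful check before trusting the larger values of n.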

Discussion of Example 2. What is the mass of the left two centimeters of the bar? If the bar had constant linear density, this would be easy to answer using the simple formula

mass = density × length.

Here the density isn’t constant, but we can still use this formula to estimate the mass. For instance, if we divide this part of the bar into two one centimeter pieces, then the left piece has density 0 at its left end and 20 g/cm at its right end and the right piece has density 20 g/cm at its left end and 40 g/cm at its right end.

It seems reasonable to estimate the mass of the left part as less than 20 g/cm × 1 cm = 20 g and the right part as less than 40 g/cm × 1 cm = 40 g so that the total mass of the left two centimeters is less than

20 × 1 + 40 × 1 = 60 grams.

Similarly, using the lowest value of the density for each part, the total mass should be more than

0 × 1 + 20 × 1 = 20 grams.

If we divide the two centimeter piece into four equal pieces of length 1/2 cm and proceed in the same way, we get an upper estimate of

$$10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} + 40 \times \tfrac{1}{2} = 50 \text{ grams}$$


where each term is an upper bound for the mass of one of the half centimeter pieces, and a corresponding lower estimate of

$$0 \times \tfrac{1}{2} + 10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} = 30 \text{ grams}.$$

Presumably if we further subdivided the two centimeter piece into a larger number of smaller parts and made upper and lower estimates of the total mass, the estimates would get closer together: the more subintervals, the closer we might expect them to be.

Notice that the sums look just like the sums in Example 1, although here they represent a mass while there they represented a distance. Instead of trying to improve the estimates for this example, I will move on to Example 3.

Discussion of Example 3. What is the area of the region bounded by the line $y = 20x$, the two coordinate axes, and the vertical line $x = 2$? The diagram shows that it is a triangle, so it would be easy to figure out the area, but I am going to forget the formula for the area of a triangle for a moment and proceed in a different way, assuming that the only area formula I can remember is the formula for the area of a rectangle.

How can I find the area of my region by filling it up with rectangles? Well, I can’t; not exactly. But I can estimate the area by covering it with rectangles like this:

[Figure: two panels, “Estimating area with 2 rectangles” (left) and “Estimating area with 4 rectangles” (right), with the rectangles covering the triangle]

From the left diagram I see that the area of the triangle is less than the sum of the areas of two rectangles, each with base 1. This sum is

20 × 1 + 40 × 1 = 60.

We get a better estimate by using four rectangles, each with base 1/2, as in the right diagram. The sum of their areas is

$$10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} + 40 \times \tfrac{1}{2} = 50,$$

so that the triangle’s area must be less than this. We have seen this expression before!


Similarly, if we compare the area of the triangle to the sum of areas of a collection of rectangles that fit entirely within the triangle, we will get a lower estimate for the area. The corresponding two diagrams look like this:

[Figure: two panels, “Estimating area with 2 rectangles” (left) and “Estimating area with 4 rectangles” (right), with the rectangles inside the triangle]

In the two diagrams above, the left hand rectangle has height zero. Thus there really are two and four rectangles respectively.

The sums of the areas of the rectangles are respectively

0 × 1 + 20 × 1 = 20

and

$$0 \times \tfrac{1}{2} + 10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} = 30.$$

Comparing the two estimates with four rectangles, we see that the triangle’s area must lie between 30 and 50. If we were willing to repeat this calculation with a larger number of thinner rectangles, we would be able to make the lower estimate and the upper estimate closer together.

Instead of doing that, however, I’ll return to Example 1 and do a general calculation with $n$ time intervals, each of length $\frac{2}{n}$ hours, in order to try to get a feel for how the estimates depend on the number of time intervals.

5.1.1 A General Formula for the Sums in Example 1

I’ll take advantage of the simplicity of my speed function to compute a general upper estimate for $n$ time intervals, each of length $\frac{2}{n}$ hours. At the end of the first interval, that is, $\frac{2}{n}$ hours after I started, I was going $20 \times \frac{2}{n} = \frac{40}{n}$ mph, at the end of the second interval, $\frac{4}{n}$ hours after I started, I was going $20 \times \frac{4}{n} = \frac{80}{n}$ mph and so forth. I can summarize the “and so forth” part in the first three columns of the following table where the second column gives the total time $t$ that has elapsed after that many time intervals, and the third column uses the formula $v = 20t$ to compute the speed at the end of that time interval. Multiplying each of these speeds by the length of one interval, $\frac{2}{n}$


hours, gives the upper estimates in the fourth column of the distance traveled in each individual time interval. A similar calculation, using instead the speed at the beginning of each interval, gives the lower estimates of distance traveled in each individual time interval in the fifth column.

interval   total time   speed at end       upper est.                lower est.
1          2/n          20 × 2/n = 40/n    40/n × 2/n                0 × 2/n
2          4/n          80/n               80/n × 2/n                40/n × 2/n
3          6/n          120/n              120/n × 2/n               80/n × 2/n
4          8/n          160/n              160/n × 2/n               120/n × 2/n
...        ...          ...                ...                       ...
k          2k/n         40k/n              40k/n × 2/n               40(k−1)/n × 2/n
...        ...          ...                ...                       ...
n          2n/n = 2     40n/n = 40         40n/n × 2/n = 40 × 2/n    40(n−1)/n × 2/n

Finally, to compute the upper estimate and the lower estimate of the total distance traveled over two hours that come from breaking the two hour period into $n$ equal subintervals of length $\frac{2}{n}$ hours, we must add the individual estimates. Thus

$$\begin{aligned}
\text{Upper estimate} &= \frac{80}{n^2} + \frac{160}{n^2} + \frac{240}{n^2} + \cdots + \frac{80k}{n^2} + \cdots + \frac{80n}{n^2} \\
&= \frac{80}{n^2}\,(1 + 2 + \cdots + k + \cdots + n).
\end{aligned}$$

To get a formula in closed form for this estimate, we must know a formula for the sum of the first $n$ integers. It turns out that this sum is $\frac{n(n+1)}{2}$. (Check this for the first few integers. Note for instance that $1 + 2 = 3 = \frac{2 \times 3}{2}$. Why is this quantity always a whole number despite the 2 in the denominator, and not sometimes a fraction?) Substituting, we find that our upper estimate is

$$\text{Upper estimate} = \frac{80}{n^2} \times \frac{n(n+1)}{2} = \frac{40(n+1)}{n} = 40 + \frac{40}{n} \text{ miles}.$$

A similar calculation for the lower estimate of distance (see the problems) leads to

$$\text{Lower estimate} = 40 - \frac{40}{n} \text{ miles}.$$
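The two closed forms can be checked against the raw sums for particular values of $n$. The following is a hedged sketch in Python using exact rational arithmetic; the function names `upper_estimate` and `lower_estimate` are illustrative, and a check for a handful of values of $n$ is of course not a proof.

```python
from fractions import Fraction

def upper_estimate(n):
    """Sum of (speed at the end of the k-th interval) x (interval length), exactly."""
    dt = Fraction(2, n)
    return sum(20 * (k * dt) * dt for k in range(1, n + 1))

def lower_estimate(n):
    """Same sum, using the speed at the start of each interval."""
    dt = Fraction(2, n)
    return sum(20 * ((k - 1) * dt) * dt for k in range(1, n + 1))

for n in (1, 2, 4, 10, 1000):
    # the sum of the first n integers equals n(n + 1)/2
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
    # closed forms quoted in the text
    assert upper_estimate(n) == 40 + Fraction(40, n)
    assert lower_estimate(n) == 40 - Fraction(40, n)
    print(n, upper_estimate(n), lower_estimate(n))
```

Working with `Fraction` rather than floating point means the comparisons are exact, so a failed assertion would point to a genuine algebra slip rather than rounding.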

We see that as the number $n$ of time intervals we use for our estimate grows, both the upper estimate and the lower estimate approach 40 miles. This suggests pretty strongly that after 2 hours I will have gone exactly 40 miles. This is good news since it means that I have just reached the rest stop before Arlington, and can stop for coffee.

Comment on Examples 2 and 3. We can reuse this calculation to finish off both Examples 2 and 3. For Example 2 we would find that if we subdivided the bar into $n$ pieces, each of length $2/n$ centimeters with density $40k/n$ g/cm


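at the right hand end of the $k$-th piece, then we get an upper estimate for the total mass of the left two centimeters of the bar of

$$\begin{aligned}
\frac{80}{n^2} + \frac{160}{n^2} + \frac{240}{n^2} + \cdots + \frac{80k}{n^2} + \cdots + \frac{80n}{n^2}
&= \frac{80}{n^2}\,(1 + 2 + \cdots + k + \cdots + n) \\
&= \frac{80}{n^2} \times \frac{n(n+1)}{2} = 40 + \frac{40}{n} \text{ grams}.
\end{aligned}$$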

Similarly, using the density of $40(k-1)/n$ at the left end of the $k$-th piece leads to a lower estimate of $40 - \frac{40}{n}$ grams for the mass. Since both of these estimates close in on 40 grams as we divide our two centimeter hunk into more and more pieces, it seems very reasonable that the exact mass of the left two centimeters of the bar is 40 grams.

For Example 3 we find that enclosing our triangle within $n$ rectangles each of width $2/n$ and with the height of the $k$-th rectangle being $40k/n$, leads to an upper bound for the area of the triangle of

$$\begin{aligned}
\frac{80}{n^2} + \frac{160}{n^2} + \frac{240}{n^2} + \cdots + \frac{80k}{n^2} + \cdots + \frac{80n}{n^2}
&= \frac{80}{n^2}\,(1 + 2 + \cdots + k + \cdots + n) \\
&= \frac{80}{n^2} \times \frac{n(n+1)}{2} = 40 + \frac{40}{n}.
\end{aligned}$$

Similarly, enclosing $n$ rectangles, each of width $2/n$, within our triangle leads to a lower bound for the area of $40 - \frac{40}{n}$. Since both of these estimates close in on an area of 40 as we divide our two unit width into more and more pieces, it seems very reasonable that the exact area of the triangle is 40. In this case, of course, we can check our work by remembering the formula for the area of a triangle and calculating that the area is indeed $\frac{1}{2}(\text{base}) \times (\text{height}) = \frac{1}{2} \cdot 2 \cdot 40 = 40$.

Why work so hard?

Why work so hard to find the area of a triangle, when we already know an easy way to do it? If the sum method were good for nothing but the areas of regions we could already compute using high school geometry, this would be a reasonable objection. But we will see that the sum method will allow us to find the area of any region defined by functions given by reasonable formulas, something that is far beyond the reach of geometry. The advantage of trying it first on an example where we already know the answer is that seeing that the method does definitely give the right answer then can give us the confidence to use it in situations where we cannot easily check the result.


EXERCISES

1. A car on a test track comes to a stop 5 seconds after the brakes are first applied. While the brakes are on, a technician records the following velocities:

Time since brakes applied (sec)   0    1    2    3    4    5
Velocity (ft/sec)                 88   60   40   24   10   0

(a) Give upper and lower estimates for the distance that the car traveled from the time the brakes were applied until the car stopped.
(b) On a sketch of velocity against time (just draw a smooth curve through the data points), show the upper and lower estimates as the sum of areas of rectangles.

2. Tom decides to run a marathon. His friend Myrl rides behind him and records his pace every 15 minutes. Tom starts out strongly, but after an hour and a half he is so exhausted that he has to stop. Here is the data Myrl collected:

Time spent running (min)   0    15   30   45   60   75   90
Velocity (mph)             10   9    8    8    7    5    0

(a) Assuming that Tom’s speed is always decreasing or constant, give upper and lower estimates for the distance Tom ran in the first half hour.
(b) Give upper and lower estimates for the distance Tom ran in total.
(c) On a sketch of velocity against time, show the upper and lower estimates as the sum of areas of rectangles.

3. You jump out of an airplane. Before your parachute opens, you fall faster and faster, but your acceleration decreases because of air resistance. The table gives your acceleration in meters/sec² $t$ seconds after jumping.

Time since jumped (sec)    0      1      2      3      4      5
Acceleration (m/sec²)      9.81   8.03   6.53   5.38   4.41   3.61

(a) Give upper and lower estimates of your velocity at $t = 5$.
(b) On a sketch of acceleration against time, show the upper and lower estimates as the sum of areas of rectangles.
(c) Get a new estimate by taking the average of your upper and lower estimates. What does the concavity of the graph tell you about whether this estimate is an overestimate or underestimate?

4. Your energetic but not very mathematical neighbors want to dig a garden pond for their back yard, but they are a little concerned about how much water it will take to fill the pond once they dig it. They want it to be an attractive (= irregular) shape, so no simple formula that they know will be helpful. They tell you the following about their plans: the pond will be one meter deep at its deepest point. A chart of the area in square meters of a horizontal section through the pond at various depths looks like this:

Depth (meters)               0    .2    .4    .6    .8    1
Area of section (meters²)    4    3.5   3     2.5   2     1


Estimate how many liters of water it will take to fill the pond to ground level. (Remember that one cubic meter of water is 1000 liters.)

5. Write out the sum for the lower estimate of distance traveled in Example 1, and use the formula $1 + 2 + 3 + \cdots + m = \frac{m(m+1)}{2}$ correctly to simplify it. You will have to be a bit careful in applying the formula, since $m$ and $n$ are not the same.

6. Derive the formula $1 + 2 + 3 + \cdots + m = \frac{m(m+1)}{2}$. One way to do this is geometrically: think of lining up $m$ squares, each $1 \times 1$, in a horizontal row. Right above them place $m - 1$ squares, then $m - 2$ squares, and so forth going up until at last there is only one square. If you arrange the rows so that all the left hand edges form a vertical line, then the collection of squares almost forms an isosceles right triangle. Find the area as the area of a genuine triangle, corrected by the areas of some small triangles.

5.1.2 Review: Sigma Notation for Sums

In case you’ve forgotten, there is a very convenient compact notation for sums whose terms can be described by a rule. Here’s an example:

$$\sum_{k=1}^{4} k^2 = 1^2 + 2^2 + 3^2 + 4^2 = 30.$$

The idea is that you plug successive integer values of the index starting with the lower limit (the value under the sigma sign, 1 in this example) and finishing with the upper limit (the value above the sigma sign, 4 in this example) into the expression to the right of the sigma, and then sum all the terms obtained in that way.

You can compute such sums conveniently on the TI-89 by using the Σ key (2nd + 4). For the sum above you would enter Σ(k^2, k, 1, 4). Here the first entry is the expression to be summed, the second is the name of the index, and the third and fourth are the starting and ending values of the index. Pressing Enter produces the sum in the form $\sum_{k=1}^{4} (k^2)$ on the left and the sum of 30 on the right.

Part of the power of the sigma notation is that the amount of space it uses is independent of the number of terms; the example above takes the same amount of space as if there were 4000 terms:

$$\sum_{k=1}^{4000} k^2 = 1^2 + 2^2 + 3^2 + \cdots + 3999^2 + 4000^2 \quad (= 21{,}341{,}334{,}000).$$


(This sum took 21 seconds on my elderly TI-89.) Another part of the power is that you can easily indicate a sum where each term is rather complicated as long as the terms change in a regular way:

$$\begin{aligned}
\sum_{k=1}^{4} \frac{(-1)^k}{2^k\,k!} &= \frac{(-1)^1}{2^1\,1!} + \frac{(-1)^2}{2^2\,2!} + \frac{(-1)^3}{2^3\,3!} + \frac{(-1)^4}{2^4\,4!} \\
&= -\frac{1}{2} + \frac{1}{8} - \frac{1}{48} + \frac{1}{384} = -\frac{151}{384}.
\end{aligned}$$

(Remember that the exclamation point is notation for the factorial function: $k!$ is the product of all positive integers starting with 1 and going up through $k$. Thus $3! = 1 \cdot 2 \cdot 3 = 6$, $4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24$ and so forth. You can use your TI-89 to see that factorials grow surprisingly fast, even much faster than exponentials. There is a hidden factorial key: press Green Diamond-÷.)

Irrelevant Remark: The TI-89 will do rational arithmetic: if you enter that sum of fractions from the displayed equation above, the calculator will report the answer as a fraction. (That is, it will do this unless you accidentally have the calculator in Approximate mode. Probably Auto is the best mode for most purposes.)
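For readers without a TI-89 at hand, the same three sums can be evaluated in a few lines. This is a small sketch in Python, offered as an alternative to the calculator rather than anything the text prescribes; the variable names are illustrative. The `Fraction` type plays the role of the calculator’s rational arithmetic.

```python
from fractions import Fraction
from math import factorial

# sum of k^2 for k = 1, ..., 4
small = sum(k**2 for k in range(1, 5))

# the same rule with 4000 terms
large = sum(k**2 for k in range(1, 4001))

# sum of (-1)^k / (2^k k!) for k = 1, ..., 4, kept as an exact fraction
alternating = sum(Fraction((-1) ** k, 2**k * factorial(k)) for k in range(1, 5))

print(small, large, alternating)
```

Note that `range(1, 5)` stops before 5, so the upper limit of the sigma notation corresponds to the second argument minus one.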

The sigma notation can also be used in less explicit situations. If we just want to express the rule: Add the values of the function $f$ at the multiples of $\pi$ from $2\pi$ up to $7\pi$ we would write

$$\sum_{k=2}^{7} f(k\pi)$$

to mean

$$f(2\pi) + f(3\pi) + f(4\pi) + f(5\pi) + f(6\pi) + f(7\pi).$$

Just as before you substitute each of the indicated values of the index $k$ into the general expression and sum all the terms thus obtained.

Finally, note that as is always the case with functional notation, it doesn’t matter what symbol you use for the index:

$$\sum_{k=1}^{4} k^2 = \sum_{j=1}^{4} j^2 = \sum_{n=1}^{4} n^2 = \sum_{\square=1}^{4} \square^2 = 1^2 + 2^2 + 3^2 + 4^2.$$

5.2 The Definite Integral

For each of the three problems above, an estimate of the solution involves forming a sum of products. Progressing to better and better estimates corresponds to considering sums with more and more terms, and it seems natural


that if these sums approach some limiting value, then that value should represent the exact answer to our original problem. This is a situation ripe for the mathematician’s usual activity of abstracting and generalizing. We will describe a single procedure, depending only on a function $f$ defined on some interval of real numbers, $a \le x \le b$, and regard each of the examples above (and others to be described below) as examples of this procedure.

Let $f$ be a function defined and bounded on an interval $a \le x \le b$ of real numbers. (That $f$ is bounded means that all the values $f(x)$, for $a \le x \le b$, lie between a fixed upper bound $M$ and a fixed lower bound $m$. For instance, the function $f(x) = 1/x$ is bounded on $1 \le x \le 3$ with upper bound $M = 1$ and lower bound $m = 1/3$ (see the left diagram) but $1/x$ is not bounded on the interval $-1 \le x \le 1$ since it is not possible to find either an upper bound or a lower bound for the values of $f$ in this interval (right diagram).)

[Figure: two panels, “1/x is bounded on 1 ≤ x ≤ 3” (left, with the upper bound and lower bound marked) and “1/x is not bounded for −1 ≤ x ≤ 1” (right, with no upper bound and no lower bound)]

A partition of [a, b] into n subintervals is a division of [a, b] into n parts of equal length (b − a)/n. We denote the division points of the subintervals by $x_0 = a, x_1, x_2, \ldots, x_n = b$. Here

$$x_1 = a + \frac{b-a}{n}, \quad x_2 = a + 2 \times \frac{b-a}{n}, \quad \ldots, \quad x_n = a + n \times \frac{b-a}{n} = b.$$

In the example of the previous section where [a, b] = [0, 2], the divisions for four subintervals would be $a = x_0 = 0$, $x_1 = 0.5$, $x_2 = 1$, $x_3 = 1.5$, $x_4 = b = 2$ as in this diagram.

[Figure: the interval from 0 to 2 with the division points $x_0, x_1, x_2, x_3, x_4$ marked.]

The left hand sum for f with n subintervals, denoted L (f, n), is the sum

$$L(f, n) = \sum_{j=1}^{n} f(x_{j-1})\,\Delta x$$

where $\Delta x = \frac{b-a}{n}$ is the length of each subinterval. Pictorially, this sum represents the sum of the areas of n rectangles, each of whose top left corner touches the graph of f. For the function f (x) = 20x of the previous section, with n = 4, $\Delta x = \frac{1}{2}$, the sum L (f, 4) would be the sum

$$L(20x, 4) = \sum_{j=1}^{4} 20\,x_{j-1}\,\Delta x = 0 \times \tfrac{1}{2} + 10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} = 30$$

of the previous section.

[Figure 5-1: Left hand sum with n = 4]

The right hand sum for f with n subintervals, denoted R (f, n), is the sum

$$R(f, n) = \sum_{j=1}^{n} f(x_j)\,\Delta x$$

where $\Delta x = \frac{b-a}{n}$ is the length of each subinterval. Pictorially, this sum represents the sum of the areas of n rectangles, each of whose top right corner touches the graph of f. For the function f (x) = 20x of the previous section, with n = 4, $\Delta x = \frac{1}{2}$, the sum R (f, 4) would be the sum

$$R(20x, 4) = \sum_{j=1}^{4} 20\,x_j\,\Delta x = 10 \times \tfrac{1}{2} + 20 \times \tfrac{1}{2} + 30 \times \tfrac{1}{2} + 40 \times \tfrac{1}{2} = 50$$

of the previous section, with diagram

[Figure: Right hand sum with n = 4]
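The two definitions can be sketched as small functions. This is only an illustration under stated assumptions: the names left_sum, right_sum, and velocity are not from the notes, and the partition is the equal-width one defined above.

```python
def left_sum(f, a, b, n):
    """L(f, n): heights taken at the left endpoints x_0, ..., x_{n-1}."""
    dx = (b - a) / n
    return sum(f(a + (j - 1) * dx) for j in range(1, n + 1)) * dx

def right_sum(f, a, b, n):
    """R(f, n): heights taken at the right endpoints x_1, ..., x_n."""
    dx = (b - a) / n
    return sum(f(a + j * dx) for j in range(1, n + 1)) * dx

def velocity(x):
    # The function f(x) = 20x from the example in the text.
    return 20 * x

print(left_sum(velocity, 0, 2, 4))   # 30.0
print(right_sum(velocity, 0, 2, 4))  # 50.0
```

The only difference between the two functions is which endpoint of each subinterval supplies the height of the rectangle.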

REMARK: Note that since f (x) = 20x is an increasing function, the left hand sum for any number of intervals will be an underestimate of the area under the graph and the right hand sum will be an overestimate of the area. This will be true for any increasing function. For a decreasing function on the other hand, any left hand sum will be an overestimate and any right hand sum will be an underestimate. In that case the pictures would look like this for a function defined on 2 ≤ x ≤ 4:

[Figure: two panels for a decreasing function on 2 ≤ x ≤ 4. Left panel: “Left hand sum with n = 4”. Right panel: “Right hand sum with n = 4”.]

5.2.1 Definition of the Definite Integral

As we have seen, for an increasing function a left hand sum is also a lower sum, that is, a sum

$$\sum_{k=1}^{n} f(z_k)\,\Delta x$$

with the property that the value of f in each term is less than or equal to all values of f in the same subinterval, as in the first diagram above. (I used $z_k$ in the sum to indicate some point in the k-th subinterval $x_{k-1} \le x \le x_k$ without specifying which one.) Pictorially, the top of each rectangle lies below the graph of f, so that it is clear that the sum (which represents the sum of the area of the rectangles) is less than the area under the graph of f.

Similarly, for an increasing function a right hand sum is also an upper sum, that is, a sum $\sum_{k=1}^{n} f(z_k)\,\Delta x$ with the property that the value of f in each term is greater than or equal to all values of f in the same subinterval, as in the second diagram above.

For decreasing functions this situation is reversed; a left hand sum is an upper sum and a right hand sum is a lower sum. For functions which change direction, upper and lower sums may be neither left nor right hand sums. Here is a simple example on the interval [0, 4].

[Figure: two panels on the interval [0, 4]. Left panel: “Upper sum with n = 4”. Right panel: “Lower sum with n = 4”.]

DEFINITION: Let f be a bounded function defined on an interval a ≤ x ≤ b. We say that f is Riemann integrable on [a, b] if there is a single real number I which both upper sums and lower sums approach as the number n of subintervals approaches infinity. We denote this number I by

$$\int_a^b f(x)\,dx$$

and call it the definite integral (or just integral for short) of f over [a, b].

There are two questions that arise immediately from this definition:

1. Which bounded functions are Riemann integrable?
2. For those functions which are Riemann integrable, what is the best way to actually calculate the number $\int_a^b f(x)\,dx$?

5.2.2 Which Functions Are Riemann Integrable?

This is a hard question to answer completely (maybe you’ll see a full answer in a senior or graduate level class someday) but an easy question to answer for most practical purposes. All bounded functions that arise “in real life” are Riemann integrable, though there are examples of bounded functions that are not Riemann integrable. In particular, continuity is not necessary for integrability. The simplest example of this is a step function: a function that is constant on each of a finite number of intervals, so that the area under its graph is simply a finite union of rectangles.

The most famous example of a bounded non-integrable function is this function defined on [0, 1]:

$$f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases}$$

The “graph” of this function looks something like this:

[Figure: the “graph” of f on 0 ≤ x ≤ 1, drawn as a dotted line at y = 1 and a line at y = 0.]

where the dotted line is meant to indicate that there are lots of values of 1, but gaps between them.

The problem here is that since the graph keeps jumping back and forth between the lines y = 0 and y = 1, any time you divide [0, 1] up into n subintervals, there will be values of both 0 and 1 in each subinterval. This means that any upper sum will have to take the value 1 in each subinterval and so the total area will be 1, no matter what n is. (Both left hand sums and right hand sums will always be 1, in fact. Do you see why?) On the other hand, any lower sum will have to take the value 0 in each subinterval and so the total area of any lower sum will always be 0. Since all upper sums are 1 and all lower sums are 0, there is no single number that both kinds of sums are closing in on. Thus the function is not Riemann integrable.

This example dates from 1829 and accompanied the first serious proof that some class of functions is Riemann integrable. Up until very shortly before this date most mathematicians would not have regarded this example as a function; the dominant view in the eighteenth century was that only objects defined by simple formulas could be considered functions. Mathematicians were forced to abandon this view when it was realized that surprisingly simple operations on “formula functions” would produce functions not given by simple formulas.

It turns out, however, as stated above, that all bounded functions that come up in calculus courses are integrable, so it seems better to move on to the second question than to spend a lot of time on this one. (For a simple proof that “calculus functions” are Riemann integrable see some sections of Math 226. A more difficult proof that all continuous functions are Riemann integrable is normally included in Math 421 or 422.)

5.3 Calculating Integrals With Sums

Let’s go back to thinking about increasing functions again. As stated above, for an increasing function, a left hand sum is always a lower sum and a right hand sum is always an upper sum. Here is a picture for the function f (x) = sin x on 0 ≤ x ≤ π/2 and n = 6.

[Figure: Left and right sums for sin x]

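We can write these two sums symbolically as

$$L(\sin x, 6) = \sum_{k=1}^{6} \sin\left(\frac{(k-1)\pi}{12}\right) \cdot \frac{\pi}{12}; \qquad R(\sin x, 6) = \sum_{k=1}^{6} \sin\left(\frac{k\pi}{12}\right) \cdot \frac{\pi}{12}.$$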

Unlike the example at the beginning of the chapter, however, we cannot easily evaluate these sums by hand. And although I could easily write an expression in sigma notation for the left and right sums with n subintervals, there is no simple formula depending on n for the sums. (That is, we cannot actually evaluate these sums in general as I did above for the very special case of f (x) = 20x.)

Does that mean that these sums are not useful objects? By no means. They cannot be evaluated by hand, but they are easy for a computer or programmable calculator to evaluate. Here is a table of left and right sums for sin x on [0, π/2] for various values of n.

   n     R(sin x, n)   L(sin x, n)   R − L
  10     1.07648       0.91940       0.15708
 100     1.00783       0.99212       0.01571
1000     1.000785      0.999214      0.001571        (5.1)

As the number of intervals increases, the right hand sums decrease and the left hand sums increase. Since $\int_0^{\pi/2} \sin x\,dx$ (the area under the graph of sin x between x = 0 and x = π/2) is trapped between these two sums, we see from the last row that (rounding to 3 decimal places) it lies between .999 and 1.001. In a later section we will see how to get much closer with fewer subintervals by using other sums that approximate the area better.
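A table like (5.1) takes only a few lines to generate. The sketch below is one way to do it in Python rather than on a programmable calculator; the function names and the list called table are assumptions introduced for illustration.

```python
import math

def left_sum(f, a, b, n):
    """Left hand sum with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + k * dx) for k in range(n)) * dx

def right_sum(f, a, b, n):
    """Right hand sum with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + k * dx) for k in range(1, n + 1)) * dx

# One row per value of n: (n, R, L, R - L), as in table (5.1).
table = []
for n in (10, 100, 1000):
    R = right_sum(math.sin, 0.0, math.pi / 2, n)
    L = left_sum(math.sin, 0.0, math.pi / 2, n)
    table.append((n, R, L, R - L))
    print(f"{n:5d}  {R:.6f}  {L:.6f}  {R - L:.6f}")
```

Printing more rows (say n = 10000) is a quick way to watch both columns close in on the same number.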

The last column of the table is suggestive. Although there is no simple formula for either R (sin x, n) or L (sin x, n) in terms of n, it appears that the dependence of their difference, R − L, on n may be very simple. The fact that multiplying n by 10 has the effect of dividing R − L by 10 suggests that possibly R − L is proportional to 1/n. Let’s try to confirm that from the diagram above.

If we look at the diagram, R − L can be represented by the sum of the areas of the “difference” rectangles; the rectangles with base π/12 and with height the difference between the value of sin x at the right hand end of the interval and the value of sin x at the left hand end of the interval. These are the smaller rectangles along the graph of sin x in the diagram on the left below.

[Figure: Difference between right and left sums R − L. Left diagram: the difference rectangles along the graph of sin x. Right diagram: the same rectangles stacked into a single column.]

The most important thing to notice about the diagram on the left is that the “difference” rectangles do not intersect vertically: the top of each is even with the bottom of the next one to the right. This is just a consequence of the fact that on this interval, sin x is increasing. The result of this is that if we think of these rectangles as being like toy blocks, we can push them horizontally (without moving them up or down at all) to stack them as in the diagram on the right. Since all six rectangles have the same base of π/12, they fit together neatly into a rectangle of base π/12 and height 1. (The total height of this rectangle is just the difference between the value at the top of the highest difference rectangle (1 in this case) and the value at the bottom of the lowest difference rectangle (0 in this case), that is, it is the difference sin π/2 − sin 0 = 1 − 0 = 1.)

Thus we see from the picture that we can compute the difference between the right and left hand sums even without computing the sums themselves. We get

$$R(\sin x, 6) - L(\sin x, 6) = 1 \times \frac{\pi}{12} = \frac{\pi}{12}.$$

We could draw a very similar picture for any number n of subrectangles. With [0, π/2] divided up into n equal parts, each would have a width of $\frac{\pi/2}{n} = \frac{\pi}{2n}$. We could again slide the n difference rectangles horizontally to stack on top of one another into a rectangle of width π/2n and total height sin π/2 − sin 0 = 1. Since this rectangle would have area π/2n, we have shown that for any number n of subrectangles,

$$R(\sin x, n) - L(\sin x, n) = \frac{\pi}{2n}. \tag{5.2}$$

You can check that this agrees with the numbers in the table above; for instance for n = 1000,

$$\frac{\pi}{2000} \approx .0015708.$$
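The stacking picture can also be written as a one-line computation. The following restates the argument algebraically, as a check rather than a new result, using the division points $x_k = k\pi/(2n)$ and $\Delta x = \pi/(2n)$ (notation assumed from the partition defined earlier):

$$R - L = \sum_{k=1}^{n} \sin(x_k)\,\Delta x - \sum_{k=1}^{n} \sin(x_{k-1})\,\Delta x = \Delta x \sum_{k=1}^{n} \bigl(\sin x_k - \sin x_{k-1}\bigr) = \Delta x\,\bigl(\sin x_n - \sin x_0\bigr) = \frac{\pi}{2n}\,(1 - 0).$$

Every intermediate value $\sin x_k$ appears once with a plus sign and once with a minus sign, so only the two end values survive. That cancellation is the algebraic counterpart of the blocks stacking with no gaps and no overlaps.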

We can draw several conclusions from this simple argument.

1. Any increasing (or decreasing) function on an interval [a, b] is Riemann integrable on that interval. This follows from the analogue of equation (5.2) for a general monotonic function (equation (5.3) below), since that estimate shows that we can make the right hand sum and left hand sum as close together as we like by dividing [a, b] into sufficiently many subintervals. Thus only one number can possibly lie between all the right hand sums and all the left hand sums. A little experimentation should convince you that this is true even if the function has one or more jumps on [a, b]; for instance the postage function is integrable. (This is anyway clear from thinking about area.)

2. More concretely, if we just want to estimate $\int_a^b f(x)\,dx$ to within some specified tolerance, then equation (5.2) tells us how many subintervals we need to do the job. For instance, to estimate $\int_0^{\pi/2} \sin x\,dx$ to within .001, we would need that R − L ≤ .002, that is, that

$$\frac{\pi}{2n} \le .002 \quad \text{or} \quad n \ge 250\pi \approx 785.4.$$

Thus 786 intervals would be just enough. (A number of intervals needs to be a whole number!) The reason for .002 is that our estimate will be the midpoint, $\frac{L+R}{2}$, of the interval from L to R and that number is not more than half the length, .001 in this case, from any number in this interval. Thus it cannot be more than .001 from $\int_0^{\pi/2} \sin x\,dx$. In this particular case, with n = 786 we find L ≈ .9990, R ≈ 1.0010 so the average, 1.000, is within .001 of the value of the integral.

The general principle of which this calculation is an example is that if f is increasing on [a, b] then

$$R(f, n) - L(f, n) = (\text{width of one interval}) \times (f(b) - f(a)) = \frac{b-a}{n}\,(f(b) - f(a)). \tag{5.3}$$

If f is decreasing, the same is true, except that it is L − R rather than R − L that is given by this expression.
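Equation (5.3) can be checked numerically, and it also gives a direct recipe for the number of subintervals needed for a given tolerance. The sketch below assumes a monotonic f; the helper names gap and subintervals_needed are hypothetical, introduced only for this illustration.

```python
import math

def left_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + k * dx) for k in range(n)) * dx

def right_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + k * dx) for k in range(1, n + 1)) * dx

def gap(f, a, b, n):
    """Right-hand side of (5.3): (b - a)/n * (f(b) - f(a))."""
    return (b - a) / n * (f(b) - f(a))

def subintervals_needed(f, a, b, tol):
    """Smallest n with |R - L| <= 2*tol, so that (L + R)/2 is within tol.

    Assumes f is monotonic on [a, b]; otherwise (5.3) does not apply.
    """
    return math.ceil(abs(f(b) - f(a)) * (b - a) / (2 * tol))

a, b = 0.0, math.pi / 2
n_needed = subintervals_needed(math.sin, a, b, 0.001)
L = left_sum(math.sin, a, b, n_needed)
R = right_sum(math.sin, a, b, n_needed)
estimate = (L + R) / 2

print(n_needed)                                   # 786
print(R - L, gap(math.sin, a, b, n_needed))       # the two agree up to rounding
print(estimate)
```

Taking the absolute value of f(b) − f(a) lets the same function handle decreasing integrands, where it is L − R that equals the expression in (5.3).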

What about functions that are not monotonic (increasing or decreasing) throughout an interval, but whose graph changes direction one or more times? If the number of direction changes is finite, you can imagine dividing the interval [a, b] into a finite number of intervals in each of which f is monotonic and then dealing with them one at a time. Thus any such function is Riemann integrable. Since essentially all “calculus functions” are of this form on any bounded interval a ≤ x ≤ b, all “normal calculus functions” can be integrated on such an interval. However, if the number of direction changes is infinite (this is possible, even for continuous functions) a proof of Riemann integrability must be quite different and depend on more sophisticated properties of functions and sets of points.

EXERCISES

1. For each of the following integrals, make a table of left- and right-hand sums with 10, 50, and 250 subintervals. Use the results to estimate the value of the integral and the possible error in your estimate. (Which sums give overestimates and which give underestimates?)

(a) $\int_0^1 e^{t^2}\,dt$   (b) $\int_1^2 x^x\,dx$   (c) $\int_0^1 \frac{1}{t^4+1}\,dt$

2. For the braking test problem (#1 in section 5.1), at how many equally spaced times would you have to measure the velocity in order to be able to estimate the distance traveled by the car while braking to within one foot?

3. For the parachute problem (#3 in section 5.1), at how many equally spaced times would you have to measure your acceleration in order to be able to determine your velocity at t = 5 seconds to within 0.1 m/sec?

4. For each of the following integrals, find the left and right hand sums for 100 subintervals, and also for enough subintervals for the left and right hand sums to be within 0.05 of one another.

(a) $\int_{-1}^{0} \arctan x\,dx$   (b) $\int_{-1}^{\sqrt{3}} \arctan x\,dx$   (c) $\int_{\pi/4}^{3\pi/4} \sin 2x\,dx$

For each of the following problems, use left and right sums to estimate the integral to within .01. Explain how many subintervals you need to be certain that your estimate is correct to that level of accuracy.

5. $\int_1^4 \frac{1}{\sqrt{1+t^2}}\,dt$   6. $\int_1^{1.5} \sin t\,dt$   7. $\int_0^{\pi/4} \cos\theta\,d\theta$

8. $\int_1^5 (\ln x)^2\,dx$   9. $\int_1^3 2z\,dz$   10. $\int_0^{\pi/4} \tan u\,du$

5.4 Numerical Integration

5.4.1 Midpoint and Trapezoidal Rules

We have approximated integrals by using left and right hand sums. These have the advantage of simplicity, and the further advantage that if the integrand is monotonic (either increasing or decreasing) then the pair of estimates from left and right sums bracket the value of the integral so that we can easily see how close our estimates are to the limit.

The disadvantage of left and right hand sums is that they are not very efficient. It is obvious from a diagram that using the largest or smallest value for a function in the interval as a height estimate (when the average value is by definition the value that would give exactly the right answer) is not the most effective course. Two natural methods of improving the sum estimates by getting closer to the average value of the function are

• average the left and right sums; this amounts to estimating the area under a piece of a curve by the area of a trapezoid. Not surprisingly, this method of estimating integrals is called the trapezoidal rule.

• use the value of the function at the midpoint of the interval, instead of the value at one end or the other. This is the midpoint rule.

Both of these rules tend to give better estimates than left or right hand sums. And in some cases it is possible to tell from the graph of the integrand whether they are overestimates or underestimates. As should be clear from the diagram

[Figure: two panels. Left panel: “Trapezoidal > Midpoint”. Right panel: “Midpoint > Trapezoidal”.]

this no longer depends on whether f is increasing or decreasing (or on the sign of f), but on the concavity of f. If f is concave up, then the trapezoidal rule overestimates $\int_a^b f(t)\,dt$ and the midpoint rule underestimates $\int_a^b f(t)\,dt$. If f is concave down, this is reversed: the trapezoidal rule underestimates and the midpoint rule overestimates.
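Both rules are easy to program, and a quick experiment shows the concavity behaviour just described. This is a minimal sketch; the function names and the two test integrands (exp on [0, 1], which is concave up, and sin on [0, π/2], which is concave down) are choices made here for illustration.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal rule: the average of the left and right hand sums."""
    dx = (b - a) / n
    inner = sum(f(a + k * dx) for k in range(1, n))
    return dx * (f(a) / 2 + inner + f(b) / 2)

def midpoint(f, a, b, n):
    """Midpoint rule: heights taken at the middle of each subinterval."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

# Concave up: exp on [0, 1], exact integral e - 1.
exact_up = math.e - 1
T_up = trapezoid(math.exp, 0.0, 1.0, 4)
M_up = midpoint(math.exp, 0.0, 1.0, 4)

# Concave down: sin on [0, pi/2], exact integral 1.
exact_down = 1.0
T_down = trapezoid(math.sin, 0.0, math.pi / 2, 4)
M_down = midpoint(math.sin, 0.0, math.pi / 2, 4)

print(M_up, exact_up, T_up)        # midpoint < exact < trapezoidal
print(T_down, exact_down, M_down)  # trapezoidal < exact < midpoint
```

In both cases the exact value is sandwiched between the two estimates, which is what makes averaging them (in the next subsection) attractive.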

5.4.2 Simpson’s Rule

The fact that the trapezoidal rule and the midpoint rule tend to err in opposite directions suggests a further possible improvement. A suitable average of the estimates from these rules should cause their individual errors to more or less cancel out and give an estimate better than either one individually.

But what is “a suitable average?” This depends on the relative sizes of the errors to be expected from the two rules. We can judge that by looking at a simple function where we can calculate exactly. Both the trapezoidal rule and the midpoint rule will give the exact value of the integral of a linear function, so we’ll look at the next most complicated polynomial, a quadratic polynomial. I will let you do this.

EXERCISE. Calculate the trapezoidal and midpoint estimates for $f(x) = x^2$ on the intervals [0, 1] and [−1, 2] with n = 1 (do this by hand with a picture) and n = 20 (do this with your calculator) and make a table showing the errors obtained from the two rules.

The result for quadratic functions is that the errors from the rules are in opposite directions and the error for the trapezoidal rule estimate T has exactly twice the magnitude of the error from the midpoint rule estimate M. (This seems reasonable from the diagrams above.) This means that for a quadratic function, (2M + T)/3 would give the exact value of the integral.

EXERCISE. Verify the last statement for the integral of $x^2$ over [0, 1] and [−1, 2] using the data from the previous exercise.
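If you want to check your answers to these two exercises, exact rational arithmetic avoids any doubt about rounding. The sketch below uses Python's standard fractions module; the function names and the list rows are assumptions made for this illustration, not part of the notes.

```python
from fractions import Fraction

def trapezoid(f, a, b, n):
    dx = (b - a) / n
    return dx * (f(a) / 2 + sum(f(a + k * dx) for k in range(1, n)) + f(b) / 2)

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (2 * k + 1) * dx / 2) for k in range(n))

def square(x):
    return x * x

# (a, b, exact value of the integral of x^2 over [a, b])
cases = [(Fraction(0), Fraction(1), Fraction(1, 3)),
         (Fraction(-1), Fraction(2), Fraction(3))]

rows = []
for a, b, exact in cases:
    for n in (1, 20):
        T = trapezoid(square, a, b, n)
        M = midpoint(square, a, b, n)
        # Record: interval, n, T error, M error, error of (2M + T)/3.
        rows.append((a, b, n, T - exact, M - exact, (2 * M + T) / 3 - exact))
        print(a, b, n, T - exact, M - exact, (2 * M + T) / 3 - exact)
```

Because every quantity is a Fraction, the last column comes out as exactly 0, not merely something very small.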

Let’s check this result for an integrand that is not a quadratic polynomial. We’ll look at errors for

$$\int_0^{\pi/2} t \sin t\,dt = \Bigl[-t\cos t + \sin t\Bigr]_0^{\pi/2} = 1.$$

 n     T            T error     M            M error
10     1.00206      .00206      .99897       −.00103
20     1.000515     .000515     .999743      −.000257
40     1.000128     .000128     .9999357     −.0000643
80     1.0000321    .0000321    .9999839     −.0000161

Here, too, the errors have opposite sign and the trapezoidal rule error is very nearly twice the magnitude of the midpoint rule error.

The preceding paragraphs are intended to suggest that (2M + T)/3 might give a better estimate for a given number of subintervals than either T or M alone. (For $\int_0^{\pi/2} t \sin t\,dt$ with n = 80, this quantity would be .999999999846, which is a lot closer to 1 than either M or T by itself.) This amounts to saying that over a short interval (one of the subintervals when there are many of them) any function is similar to a quadratic function. This estimate for the integral is called Simpson’s rule, that is, the Simpson’s rule estimate for an integral using n subintervals is S = (2M + T)/3 where M and T are the midpoint rule and trapezoidal rule estimates using n subintervals.

You may have seen another method for describing Simpson’s rule. The usual alternative amounts to approximating the integral of f on each subinterval by the integral of a quadratic function. We may explore several alternative descriptions of Simpson’s rule in bonus problems.
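Simpson's rule in the form S = (2M + T)/3 needs nothing beyond the two rules it combines. Here is a minimal sketch for the example integrand t sin t; the function names are assumptions made for illustration.

```python
import math

def trapezoid(f, a, b, n):
    dx = (b - a) / n
    return dx * (f(a) / 2 + sum(f(a + k * dx) for k in range(1, n)) + f(b) / 2)

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

def simpson(f, a, b, n):
    """Simpson's rule as defined in the text: S = (2M + T)/3."""
    return (2 * midpoint(f, a, b, n) + trapezoid(f, a, b, n)) / 3

def g(t):
    return t * math.sin(t)

# The exact value of the integral of t*sin(t) over [0, pi/2] is 1.
for n in (10, 20, 40, 80):
    T = trapezoid(g, 0.0, math.pi / 2, n)
    M = midpoint(g, 0.0, math.pi / 2, n)
    S = simpson(g, 0.0, math.pi / 2, n)
    print(n, T - 1, M - 1, S - 1)
```

Note that with this definition, n subintervals means 2n + 1 function evaluations, since the midpoints are used as well as the endpoints.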

5.4.3 Errors of the Rules

It seems very plausible that the trapezoidal and midpoint rules should give better estimates than left or right hand sums, and that Simpson’s rule should give better estimates than the trapezoidal and midpoint rules. Can we make this a little more precise? It turns out that a full theoretical investigation of the relative accuracy would take us longer than would really be justified outside of a numerical analysis course, but we can understand what is going on pretty well just by computing some examples. That is what we will do here. The conclusion will be that the principal way in which the trapezoidal rule and the midpoint rule improve on left and right hand sums, and also the principal way in which Simpson’s rule improves on the trapezoidal and midpoint rules, is in the rate at which the estimate improves as the number n of subintervals increases. As is so common in calculus, the interest is not so much in single numbers, but in the trend when a sequence of numbers is calculated.

To continue the previous example, we estimate $\int_0^{\pi/2} t \sin t\,dt$ using each rule and a sequence of numbers n of subintervals. In this case we know that the exact answer is 1, so we can calculate errors. (This is just to study the errors. Of course the point of numerical integration is precisely that we can still do it when we cannot find an antiderivative and use the Fundamental Theorem of Calculus.) Since we have already calculated errors for the trapezoidal and midpoint rules, we will add here errors for the left and right hand rules and for Simpson’s rule.

 n     L        L error    R         R error    S                 |S error|
10     .8787    −.1213     1.1254    .1254      .999999365        6.35×10⁻⁷
20     .9388    −.0612     1.0622    .0622      .9999999603       3.97×10⁻⁸
40     .9693    −.0307     1.0310    .0310      .99999999752      2.48×10⁻⁹
80     .9846    −.0154     1.0155    .0155      .999999999846     1.54×10⁻¹⁰

If we compare trends in errors as n increases, we see that, roughly

• if n doubles, the left and right hand sum errors are cut in half,

• if n doubles, the trapezoidal and midpoint rule errors are divided by 4,

• if n doubles, the Simpson’s rule error is divided by 16.

This suggests the following rules of thumb for the way in which the error associated with any one of these rules depends on the number of subintervals n.

• The error in the left and right hand rules behaves like a multiple of 1/n.

• The error in the trapezoidal and midpoint rules behaves like a multiple of $1/n^2$.

• The error in Simpson’s rule behaves like a multiple of $1/n^4$.
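These trends can be seen directly by computing the ratio of successive errors each time n doubles. The sketch below does this for the same example, $\int_0^{\pi/2} t \sin t\,dt = 1$; the function names and the dictionary called ratios are assumptions introduced for illustration.

```python
import math

def left_sum(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + k * dx) for k in range(n))

def trapezoid(f, a, b, n):
    dx = (b - a) / n
    return dx * (f(a) / 2 + sum(f(a + k * dx) for k in range(1, n)) + f(b) / 2)

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

def simpson(f, a, b, n):
    return (2 * midpoint(f, a, b, n) + trapezoid(f, a, b, n)) / 3

def g(t):
    return t * math.sin(t)

rules = {"left": left_sum, "trapezoid": trapezoid,
         "midpoint": midpoint, "simpson": simpson}
ns = (10, 20, 40, 80)

ratios = {}
for name, rule in rules.items():
    errors = [abs(rule(g, 0.0, math.pi / 2, n) - 1.0) for n in ns]
    # Ratio of successive errors: how much the error shrinks when n doubles.
    ratios[name] = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
    print(name, [round(r, 2) for r in ratios[name]])
```

Ratios near 2, 4, and 16 correspond to errors behaving like multiples of 1/n, $1/n^2$, and $1/n^4$ respectively.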

In terms of decimal places of accuracy, this means that if you already have an estimate with a certain number of significant figures (say 2), then taking 10 times as many intervals will give one extra significant figure for the left and right hand rules (error is about one tenth), two extra significant figures for the trapezoidal and midpoint rules (error is about one hundredth) and four extra significant figures for Simpson’s rule (error is about one ten thousandth). Or, to put it the other way around, to get a specified number of significant figures, we expect to need far fewer subintervals with the trapezoidal and midpoint rules than with left or right sums, and far fewer subintervals with Simpson’s rule than with the trapezoidal or midpoint rules.
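The arithmetic behind this statement can be made explicit. As a rough sketch, assume (following the rules of thumb above, which are empirical observations and not proved here) that the error behaves like $E(n) \approx C/n^p$ for some constant C, with p = 1, 2, or 4 depending on the rule. Then

$$\frac{E(10n)}{E(n)} \approx \frac{C/(10n)^p}{C/n^p} = 10^{-p},$$

so multiplying n by 10 gains about p significant figures. Turning this around, gaining d extra significant figures requires multiplying n by about

$$10^{d/p}.$$

For example, four more significant figures cost a factor of about $10^4$ in n for left or right hand sums, $10^2$ for the trapezoidal or midpoint rule, and only 10 for Simpson’s rule.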

EXERCISES.

1. Estimate $\int_0^1 e^{\sin(t^2/4)}\,dt$ using the trapezoidal rule, the midpoint rule, and Simpson’s rule for n = 10 and n = 100. Determine the concavity of $e^{\sin(t^2/4)}$ for 0 ≤ t ≤ 1 and explain what this tells you about how your estimates compare to the value of the integral. Using the estimates for n = 100, how many decimal places of the value are you absolutely certain of? (And what is that value?) How many places are you pretty sure of?

2. Use Simpson’s rule to estimate values for $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$ for x = .1, .2, .5, 1. Compare with the results from exercise 1(a) of section 5.3.

3. I estimated the integral $\int_1^4 f(u)\,du$ using left hand sums, right hand sums, the trapezoidal rule, and the midpoint rule, with the same number of subintervals for each method. I wrote down the results as 0.531, 0.542, 0.543, 0.554 but forgot to note which was which. If the graph of f is like this, match methods with results.

[Figure: graph of f for 0 ≤ x ≤ 5.]

4. (a) Explain why the area of the trapezoid in the diagram on the left below is $h\,\frac{y_1 + y_2}{2}$.

[Figure: left, a trapezoid of width h with vertical sides $y_1$ and $y_2$; right, the graph of f above an interval of length h on the x-axis.]

(b) On a copy of the graph on the right, sketch areas representing each of these quantities:

$$T = h \cdot f(0); \quad D = h \cdot f(h); \quad I = h \cdot f(h/2)$$

$$R = h \cdot \frac{f(0) + f(h)}{2}; \quad L = \frac{h}{2}\left(f\left(\frac{h}{4}\right) + f\left(\frac{3h}{4}\right)\right)$$

$$E = \frac{h}{2} \cdot \frac{f(0) + f\left(\frac{h}{2}\right)}{2} + \frac{h}{2} \cdot \frac{f\left(\frac{h}{2}\right) + f(h)}{2}$$

(c) If B is the area under the graph between x = 0 and x = h, arrange the quantities B, D, E, I, L, R, T in increasing order.

(d) Which is a better approximation to B, T or D?

(e) Which is a better approximation to B, I or R?

5. (a) Suppose a computer takes 2 seconds to approximate a certain definite integral accurate to 4 decimal places using left hand sums. How long will it take the same computer to estimate the same integral to 8 decimal places using left hand sums? 12 decimal places? 20 decimal places? (Give answers in years if more than one year.)

(b) Repeat if the initial estimate and the subsequent estimates use the trapezoidal rule instead.

(c) Repeat if the initial estimate and the subsequent estimates use Simpson’s rule instead.

6. Let f be twice differentiable on [a, b] with f′ > 0 and f″ > 0 throughout [a, b]. For any c between a and b, let T (c) be the trapezoidal rule approximation to $\int_a^b f(x)\,dx$ formed by using the two subintervals [a, c] and [c, b].

(a) Show that $\left|\int_a^b f(x)\,dx - T(c)\right|$ is minimized when c satisfies $f'(c) = \frac{f(b) - f(a)}{b - a}$.

(b) Does this remain true if we drop the requirement that f′ > 0 and require only f″ > 0 throughout [a, b]?

(c) This fact can be justified with essentially no computation by drawing a diagram that makes it obvious. Draw the diagram and explain.

5.4.4 More on Errors of the Rules—A Partial Proof

In this subsection I will give a partial justification of the statement that the errors associated with the Trapezoidal and Midpoint Rules decrease like $1/n^2$, the reciprocal of the square of the number n of subintervals. The proof of the general case is complicated enough to be left to a course on numerical integration, but the special case where the integrand f has constant concavity can be dealt with by comparing the areas of various regions, using a method that is essentially an elaboration of the argument in section 5.3 that an increasing or decreasing function is Riemann integrable. Here is the statement that I will prove.

Theorem. If f is differentiable on [a, b] and has constant concavity, then the trapezoidal rule estimate T (n) and the midpoint rule estimate M (n) for $\int_a^b f(x)\,dx$ with [a, b] divided into n equal subintervals satisfy

$$\left|\int_a^b f(x)\,dx - M(n)\right| < \left|\int_a^b f(x)\,dx - T(n)\right| < \frac{|f'(b) - f'(a)|\,(b-a)^2}{2} \cdot \frac{1}{n^2}.$$
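Before reading the proof it may help to see the inequality hold in a concrete case. The sketch below checks it for f (x) = sin x on [0, π/2], which is differentiable and concave down with exact integral 1; the choice of example and the helper name error_bound are assumptions made for illustration.

```python
import math

def trapezoid(f, a, b, n):
    dx = (b - a) / n
    return dx * (f(a) / 2 + sum(f(a + k * dx) for k in range(1, n)) + f(b) / 2)

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

def error_bound(df, a, b, n):
    """Right-hand side of the theorem: |f'(b) - f'(a)| (b - a)^2 / (2 n^2)."""
    return abs(df(b) - df(a)) * (b - a) ** 2 / (2 * n ** 2)

a, b, exact = 0.0, math.pi / 2, 1.0

checks = []
for n in (1, 2, 10, 100):
    err_M = abs(exact - midpoint(math.sin, a, b, n))
    err_T = abs(exact - trapezoid(math.sin, a, b, n))
    bound = error_bound(math.cos, a, b, n)   # cos is the derivative of sin
    checks.append((n, err_M, err_T, bound))
    print(n, err_M, err_T, bound)
```

In each row the three numbers increase from left to right, as the theorem asserts. A numerical check of this kind illustrates the statement but is of course no substitute for the proof.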

Proof. I will divide the argument into several parts. For the first parts we will compare areas above what is essentially a single subinterval [c, d] of the interval [a, b]. For the final part of the proof we will combine the results from individual subintervals in an efficient way.

First we compare the areas of four regions on the interval [c, d] as shown in this diagram, where I will do the case where the graph of f (the thick curve in the diagram below) is concave down. The four regions M, T, F, and U all have base at height f (c) and extend horizontally from c to d.

[Figure: the regions M, T, F, and U above the interval from c to d, with dashed lines also labeled M and U.]

Here F is the area of the region bounded above by the thick curve, M is the area of the rectangle whose height is $f\left(\frac{c+d}{2}\right)$, T is the area of the lower triangle with base at height f (c) and top right hand vertex at height f (d), and U is the area of the upper triangle with base at height f (c) and whose hypotenuse has slope f′(c). It is clear, using the fact that f is concave down so that f′ is a decreasing function, that T < F < U, but the relative size of M is not quite so obvious.

The area of M is equal to the area of any trapezoid with top passing through $\left(\frac{c+d}{2}, f\left(\frac{c+d}{2}\right)\right)$ since rotating the top of a trapezoid through the point midway between its sides does not change its area. Choosing the top with slope $f'\left(\frac{c+d}{2}\right)$ (this is the dashed line, also labeled M) makes it clear that F < M because the curve is concave down and is tangent to the top of the trapezoid with dashed top. Also M < U since the area of U is equal to that of the rectangle whose height is the value at $\frac{c+d}{2}$ of the line forming the top of trapezoid U (indicated by the horizontal dashed line, also labeled U) and the fact that f is concave down makes this value greater than $f\left(\frac{c+d}{2}\right)$.

Thus we have

$$T < F < M < U.$$

Moreover, we can show that M − F < F − T, that is, the area M of the midpoint rectangle is closer to the area F under the graph of the curve than is the area of the triangle T. To see this it will be helpful to draw some more diagrams with yet another version of M. In the diagram on the left below, the two shaded triangles enclose the region below the tangent line to the graph of f at $\frac{c+d}{2}$ and above the graph of f, that is, the sum of their areas, denoted $E_M$, is greater than M − F. Now if we rotate the line through $\left(\frac{c+d}{2}, f\left(\frac{c+d}{2}\right)\right)$ to be parallel to the line from (c, f (c)) to (d, f (d)) (the bottom of the unshaded triangle) as in the diagram on the right, then the sum of the areas of the shaded triangles will not change. In the diagrams, what has been added on the left has been subtracted from the right. Notice that the top now does not lie entirely above the graph of f, but this will not matter. Note next that the shaded areas form exactly half of the area P of the parallelogram with vertical sides at x = c and x = d, that is, the sum of the areas of the two shaded triangles equals the area of the unshaded triangle underneath them. (For instance, divide the parallelogram in two with a vertical line at $x = \frac{c+d}{2}$ and note that then each half consists of two congruent triangles, one shaded and one unshaded.) Thus we have

$$M - F < E_M = P/2 < F - T,$$

where the last inequality follows from the fact that the unshaded triangle lies entirely inside the region below the graph of f and above the line from (c, f (c)) to (d, f (d)).

[Figure: two diagrams on the interval from c to d, each showing the shaded region $E_M$.]

Now we complete the argument. Each of the quantities in the statement of the theorem can be regarded as the sum of n terms corresponding to the n separate subintervals. If I write $T_j$ for the part of the Trapezoidal Rule approximation coming from the j-th subinterval $[x_{j-1}, x_j]$, then $T(n) = \sum_{j=1}^{n} T_j$ and similarly $M(n) = \sum_{j=1}^{n} M_j$ and

$$\int_a^b f(x)\,dx = \sum_{j=1}^{n} \int_{x_{j-1}}^{x_j} f(x)\,dx = \sum_{j=1}^{n} F_j.$$

The consistent concavity and the argument above imply that the relative sizes of the four quantities considered above are consistent from one subinterval to the next, so that (still assuming f is concave down), $T_j < F_j < M_j$ and $M_j - F_j < F_j - T_j$ for each j. Thus, adding,

$$0 < M(n) - \int_a^b f(x)\,dx = \sum_{j=1}^{n} \left(M_j - \int_{x_{j-1}}^{x_j} f(x)\,dx\right) < \sum_{j=1}^{n} \left(\int_{x_{j-1}}^{x_j} f(x)\,dx - T_j\right) = \int_a^b f(x)\,dx - T(n).$$

To complete the proof we must now find an upper estimate $U = \sum_{j=1}^{n} U_j$ for $\int_a^b f(x)\,dx$ so that

$$\int_a^b f(x)\,dx - T(n) = \sum_{j=1}^{n} \left( \int_{x_{j-1}}^{x_j} f(x)\,dx - T_j \right) < \sum_{j=1}^{n} (U_j - T_j) = U - T(n) < \frac{|f'(b) - f'(a)|\,(b-a)^2}{2} \cdot \frac{1}{n^2}.$$

$U$ will be the sum of areas $U_j$ of trapezoids constructed on each of the $n$ subintervals. Thus it is again enough to consider a single subinterval $[t_{j-1}, t_j]$ (written $[x_{j-1}, x_j]$ above) of length $\frac{b-a}{n}$. The part $F_j - T_j$ of $\int_a^b f(x)\,dx - T(n)$ contained between $t_{j-1}$ and $t_j$ is equal to the area of the region bounded above by the graph of $f$ in this subinterval and bounded below by the line segment through $(t_{j-1}, f(t_{j-1}))$ and $(t_j, f(t_j))$. This is the region indicated by horizontal line segments in the diagram below.

The derivative function $f'$ is decreasing on $[a, b]$, so the graph of $f$ in this interval lies entirely below the line through $(t_{j-1}, f(t_{j-1}))$ with slope $f'(t_{j-1})$ (the highest line in the diagram below) and entirely above the line through $(t_{j-1}, f(t_{j-1}))$ and $(t_j, f(t_j))$. This line has slope $\frac{f(t_j) - f(t_{j-1})}{t_j - t_{j-1}}$, where

$$f'(t_j) < \frac{f(t_j) - f(t_{j-1})}{t_j - t_{j-1}} < f'(t_{j-1}).$$

The area of the part of $\int_a^b f(x)\,dx - T(n)$ contained between $t_{j-1}$ and $t_j$ is less than the area of the triangle indicated by vertical shading below, whose top and bottom are the lines through $(t_{j-1}, f(t_{j-1}))$ with slopes $f'(t_{j-1})$ and $\frac{f(t_j) - f(t_{j-1})}{t_j - t_{j-1}}$ mentioned above and whose right hand side is along the vertical line through $(t_j, f(t_j))$. (This is also $U - T$ in the first diagram in this section.)

[Figure: a single subinterval from $t_{j-1}$ to $t_j$, illustrating $F_j - T_j < U_j - T_j$.]

We will estimate the total area of the $n$ triangles of this form by sliding them across and down, as in the diagram below, so that all have a common left hand vertex.

[Figure: the error triangles over $[a, b]$ slid to a common left hand vertex; $\int_a^b f(x)\,dx - T(n)$ is less than the area of the thick triangle.]

Now the top of the $j$-th triangle has slope $f'(t_{j-1})$ and the bottom of the previous triangle has slope equal to the slope of the curve at some point in the previous interval. Using concavity one last time, $f'$ is a decreasing function, so the various triangles do not overlap. Thus the sum of their areas is less than the area of a single triangle containing all of them, the thick triangle in the diagram. This triangle has width $\frac{b-a}{n}$ and height (measured vertically) $(f'(a) - s)\frac{b-a}{n}$, where $s$ is the slope of the bottom of the right hand triangle. This slope is greater than $f'(b)$, so finally the area of the solid triangle containing all the error triangles is less than

$$\frac{1}{2}\,(f'(a) - f'(b)) \left( \frac{b-a}{n} \right)^2.$$

We know this is greater than $\int_a^b f(x)\,dx - T(n)$, so the proof is complete.
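The chain of inequalities just proved can be checked numerically. The sketch below is only an illustration under assumed choices: the test function $f(x) = \sin x$ on $[0, \pi]$ (concave down there, with integral 2) and the helper names `trapezoid` and `midpoint` are not from the text.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal Rule T(n) with n equal subintervals."""
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + j * h) for j in range(1, n)) + f(b) / 2)

def midpoint(f, a, b, n):
    """Midpoint Rule M(n) with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

# Assumed example: sin x is concave down on [0, pi] and its integral there is 2.
a, b = 0.0, math.pi
exact = 2.0

rows = []
for n in (1, 2, 4, 8, 16):
    mid_err = midpoint(math.sin, a, b, n) - exact      # M(n) - integral
    trap_err = exact - trapezoid(math.sin, a, b, n)    # integral - T(n)
    # Bound from the theorem: |f'(b) - f'(a)| (b - a)^2 / (2 n^2), with f' = cos.
    bound = abs(math.cos(b) - math.cos(a)) * (b - a) ** 2 / (2 * n ** 2)
    rows.append((n, mid_err, trap_err, bound))
    print(n, mid_err, trap_err, bound)
```

For each $n$ the three printed numbers should appear in increasing order and all be positive, in line with $0 < M(n) - \int_a^b f < \int_a^b f - T(n) <$ bound.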

EXERCISES.

The purpose of these exercises is to adapt the proof in this section to the case where the graph of $f$ is concave up.

1. Draw a diagram analogous to the first diagram in the section showing the graph of $f$ and the trapezoidal and midpoint approximations with a single subinterval, and give the inequalities that now hold between the quantities $F$, $T$, and $M$.

2. Adding to the diagram from the previous problem, what now takes the place of the triangle labeled $U$ above and how is it related to $F$, $T$, and $M$?

3. The theorem stated at the beginning of the section is restated near the end without absolute value signs in the case where the graph of $f$ is concave down. How should the theorem be stated without absolute value signs when the graph of $f$ is concave up?

4. Modify the argument that shows that the midpoint approximation is closer to the value of the integral than the trapezoidal approximation to apply when the graph of $f$ is concave up.

5. Complete the proof of the theorem when the graph of $f$ is concave up.

5.5 Definite Integrals and Areas

So far all the diagrams have shown functions with positive values. There we know that $\int_a^b f(x)\,dx$ represents the area between the graph of $f$ and the $x$-axis between $x = a$ and $x = b$. Furthermore we know that we can write down sums that give either overestimates or underestimates for the integral and thus for the area. What if $f$ is negative, or changes sign in the interval?

Example 1. Consider $f(x) = 20x - 60$ on the interval $0 \le x \le 2$. This is the same function considered above except for the $-60$, that is, the graph has been translated down by 60 units. Now what do the left hand sum and right hand sum represent? They are still given by the same expressions

$$L(f, n) = \sum_{j=1}^{n} f(x_{j-1})\,\Delta x; \qquad R(f, n) = \sum_{j=1}^{n} f(x_j)\,\Delta x,$$

but it is not clear that they have the same interpretation. Here is a diagram showing the left and right sums for $n = 4$.

[Figure: graph of $20x - 60$ for $0 \le x \le 2$, vertical axis from $-60$ to 0, showing the rectangles for $L(20x - 60, 4)$ and $R(20x - 60, 4)$.]

Here we have that

$$R(20x - 60, 4) = (-50) \cdot \tfrac{1}{2} + (-40) \cdot \tfrac{1}{2} + (-30) \cdot \tfrac{1}{2} + (-20) \cdot \tfrac{1}{2} = -70$$

and

$$L(20x - 60, 4) = (-60) \cdot \tfrac{1}{2} + (-50) \cdot \tfrac{1}{2} + (-40) \cdot \tfrac{1}{2} + (-30) \cdot \tfrac{1}{2} = -90.$$

It is still true that $R(20x - 60, 4) > L(20x - 60, 4)$, but that now just means that the right sum is less negative than the left sum. It is no longer true that $R(20x - 60, 4)$ is the sum of the areas of four rectangles. (That would be a positive number.) Instead $R(20x - 60, 4)$ is the negative of the sum of the areas. Similarly, $L(20x - 60, 4)$ is the negative of the sum of the areas of the four larger rectangles.

It is also still true that if we look at larger values of $n$, that is, sums corresponding to a larger number of skinnier rectangles, then the total space occupied by the rectangles gets closer to just exactly covering the region between the graph of $f$ and the $x$-axis. A general argument like the one above using the formula for the sum of the first $n$ positive integers would lead to the formulas

$$R(20x - 60, n) = -80 + \frac{40}{n}; \qquad L(20x - 60, n) = -80 - \frac{40}{n}$$

so that as $n \to \infty$, both sets of sums approach $-80$. Thus

$$\int_0^2 (20x - 60)\,dx = -80.$$

This is just the negative of the area of the trapezoid bounded by the graph of the function, the $x$-axis and the two vertical lines $x = 0$ and $x = 2$.
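The closed formulas for the left and right sums can be compared with directly computed sums. This is a minimal sketch; the helper names `left_sum` and `right_sum` are illustrative and not part of the text.

```python
def left_sum(f, a, b, n):
    """L(f, n): use the left endpoint of each of n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + j * dx) for j in range(n)) * dx

def right_sum(f, a, b, n):
    """R(f, n): use the right endpoint of each of n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + j * dx) for j in range(1, n + 1)) * dx

def f(x):
    return 20 * x - 60

for n in (4, 40, 400):
    # Columns: n, computed L, formula -80 - 40/n, computed R, formula -80 + 40/n
    print(n, left_sum(f, 0, 2, n), -80 - 40 / n, right_sum(f, 0, 2, n), -80 + 40 / n)
```

Both columns of sums move toward $-80$ from opposite sides as $n$ grows, the left sums from below and the right sums from above.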

Some things we can conclude from this discussion:

• If $f$ is negative for $a \le x \le b$, then $\int_a^b f(x)\,dx < 0$ and is the negative of the area bounded by the graph of $f$, the $x$-axis and the vertical lines $x = a$ and $x = b$.

• If $f$ is negative and increasing, then the right hand sums are still overestimates for the integral (still upper sums), although overestimate now means less negative, and the left hand sums are still underestimates for the integral (still lower sums), although underestimate now means more negative. (If $f$ is negative and decreasing, then the right hand sums are underestimates and the left hand sums are overestimates, just as when $f$ is positive and decreasing.)

• If $f$ is negative and increasing, then the difference between $R(f, n)$ and $L(f, n)$ can still be represented as the sum of the areas of the “difference rectangles” (the real areas, which are positive quantities) so that the formula from above

$$R(f, n) - L(f, n) = (\text{width of one interval}) \times (f(b) - f(a))$$

still holds without any change at all. For instance, for $f(x) = 20x - 60$,

$$R(20x - 60, n) - L(20x - 60, n) = \left(\frac{2}{n}\right) \times 40 = \frac{80}{n},$$

which equals 20 when $n = 4$. In particular, $f(b) - f(a) = -20 - (-60) = 40$ is positive even though both $f(b)$ and $f(a)$ are negative because we are subtracting the more negative number from the less negative one.

• If $f$ is negative and decreasing, then $L(f, n) > R(f, n)$ (though both are negative) and it is $L - R$ that is given by the expression above, with $f(a) - f(b)$ in place of $f(b) - f(a)$.

What if $f$ changes sign on the interval $[a, b]$? We just have to put the two cases together to conclude that in general $\int_a^b f(x)\,dx$ represents the “signed area” between the graph of $f$ and the $x$-axis, that is, $\int_a^b f(x)\,dx$ is equal to the difference between the area under the graph of $f$ above the $x$-axis and the area over the graph of $f$ below the $x$-axis. For instance, $\int_0^2 (20x - 10)\,dx = 20$ since the area of the large triangle above the axis is $\frac{1}{2} \times \frac{3}{2} \times 30 = \frac{90}{4}$, the area of the small triangle below the axis is $\frac{1}{2} \times \frac{1}{2} \times 10 = \frac{10}{4}$ and $\frac{90}{4} - \frac{10}{4} = 20$.

[Figure: graph of $y = 20x - 10$ for $0 \le x \le 2$; the triangle above the $x$-axis has area 22.5 and the triangle below it has area 2.5.]

5.5.1 What is area?

By looking at simple shapes where we can calculate the area in some way independent of calculus, we have decided that $\int_a^b f(x)\,dx$ represents the signed area of the region bounded by the graph of $f$, the $x$-axis and the vertical lines $x = a$ and $x = b$. But what if the region is one where we cannot find the area in some geometric way (say the area under the graph of $e^{x^2}$ between $x = 0$ and $x = 1$)? What do we mean by the area of such a region? How would we recognize a number as the area? It turns out that the answer is similar to the answer to the question of how to define the line tangent to a curve at a point. We turn around and use the calculus concept to define the geometric one, that is, we define the area of the region to be the one given by the integral. So our interpretation of the integral as area has now become the definition of what we mean by area. Our earlier checking should reassure us that this definition is consistent with geometric definitions whenever both can be applied.

5.6 Using Areas to Evaluate Integrals

In the example just above we evaluated an integral by finding the areas of two triangles. In general whenever we can use our knowledge of area to evaluate an integral, it will be the fastest and most certain method of evaluation. Here are some simple examples.

Example 1. Evaluate the integral of the constant function $f(x) = 17$ over the interval $-2 \le x \le 3$.

[Figure: graph of the constant function $y = 17$, horizontal axis from $-5$ to 5, vertical axis from 0 to 20; $\int_{-2}^{3} 17\,dx = 85$.]

Here the area is just the area of a rectangle with base 5 and height 17, so $\int_{-2}^{3} 17\,dx = 5 \times 17 = 85$.

Example 2. Evaluate the integral of the pictured function $f$ over the interval $1 \le x \le 5$.

[Figure: graph of $f$; horizontal axis from 1 to 5, vertical axis from $-1$ to 1.]

Here the area of the triangle above the $x$-axis between $x = 1$ and $x = 2$ is the same as the area of the triangle below the $x$-axis between $x = 2$ and $x = 3$, so the corresponding contributions to $\int_1^5 f(x)\,dx$ just cancel out. This leaves the rectangle of height 1 and width 2 between $x = 3$ and $x = 5$. Since it is below the axis, we have

$$\int_1^5 f(x)\,dx = -2.$$

Example 3. Evaluate $\int_0^2 \sqrt{4 - x^2}\,dx$. This is not so simple to evaluate unless you use a table of antiderivatives (or a trigonometric substitution), but if you draw a picture, you should recognize the curve $y = \sqrt{4 - x^2}$ as part of the circle $x^2 + y^2 = 4$.

[Figure: graph of $y = \sqrt{4 - x^2}$ for $0 \le x \le 2$.]

The area is a fourth of the area of the circle of radius 2, so

$$\int_0^2 \sqrt{4 - x^2}\,dx = \frac{1}{4}\,\pi \times 2^2 = \pi.$$
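As a rough sanity check on Example 3, a Riemann sum for the quarter circle can be compared with $\pi$. The sketch assumes a midpoint sum with 20,000 subintervals; the helper name `midpoint` and that choice of $n$ are illustrative, not from the text.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def quarter_circle(x):
    return math.sqrt(4 - x * x)   # upper half of x^2 + y^2 = 4

approx = midpoint(quarter_circle, 0.0, 2.0, 20_000)
print(approx, math.pi)
```

The sum needs thousands of function evaluations to agree with $\pi$ to several decimal places, while the area argument gives the exact value at once.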

EXERCISES.

Compute the integrals by considering areas.

1. $\int_1^6 f(x)\,dx$

[Figure: graph of $f$; horizontal axis from 1 to 6, vertical axis from 1 to 3.]

2. $\int_{-1}^{3} f(x)\,dx$

[Figure: graph of $f$; horizontal axis from $-1$ to 3, vertical axis from $-0.5$ to 1.]

3. Find the interval $[a, b]$ on which the integral $\int_a^b (2 + x - x^2)\,dx$ is a maximum by looking at a diagram rather than evaluating this integral with the Fundamental Theorem of Calculus.

4. Use the diagram below to show that

$$\int_0^x \sqrt{1 - t^2}\,dt = \frac{1}{2}\,x\sqrt{1 - x^2} + \frac{1}{2}\arcsin x$$

for $0 \le x \le 1$ by calculating areas rather than the integral. It will help to remember (or, better, to work out) the formula for the area of a wedge of a circle with angle opening $\theta$.

[Figure: diagram with both axes running from $-1$ to 1 and a point labeled $x$.]

5. Adjust the diagram in the previous problem to obtain a similar expression for $\int_0^x \sqrt{a^2 - t^2}\,dt$ for arbitrary $a > 0$.


5.7 Average Value of a Function

If $f$ is a Riemann integrable function on an interval $a \le x \le b$, the average value of $f$ on $[a, b]$ is the number

$$A = \frac{1}{b - a} \int_a^b f(x)\,dx.$$

Why is this a reasonable use of the word ‘average’? Consider this diagram:

[Figure: graph of $f$ with horizontal axis from 0 to 3 and vertical axis from 0 to 10; a horizontal dotted line marks the height $A$.]

We know that $\int_1^3 f(x)\,dx$ represents the area under the graph of $f$ between $x = 1$ and $x = 3$. Here $b - a = 3 - 1 = 2$ is the length of the base of this region. The quotient $A = \frac{1}{2}\int_1^3 f(x)\,dx$ satisfies the equation $2 \times A = \int_1^3 f(x)\,dx$. Here the right hand side represents an area and we can also think of the left hand side as an area: the area of a rectangle with base 2 and height $A$ (the horizontal dotted line in the diagram).

Thus the average value of a function gives the height to use to form a rectangle with the same base and the same area as the original region. To put it another way, if we altered $f$ on the interval $[1, 3]$ (or in general on $[a, b]$) to have the constant value $A$, then the area under the graph would remain the same.

In the diagram above, $f(x) = x^2 + 2$. We can calculate that $\int_1^3 (x^2 + 2)\,dx = 12\tfrac{2}{3}$ so that here $A = \frac{1}{2} \times 12\tfrac{2}{3} = 6\tfrac{1}{3}$.
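The arithmetic in this example can be reproduced exactly with rational numbers. The sketch below assumes the antiderivative $x^3/3 + 2x$ of $x^2 + 2$ (which anticipates the Fundamental Theorem of Calculus); the function name `antiderivative` is illustrative only.

```python
from fractions import Fraction

def antiderivative(x):
    """An antiderivative of x**2 + 2, namely x**3/3 + 2x, in exact arithmetic."""
    x = Fraction(x)
    return x ** 3 / 3 + 2 * x

a, b = 1, 3
integral = antiderivative(b) - antiderivative(a)   # area under the graph on [1, 3]
average = integral / (b - a)                       # height A of the equal-area rectangle

print(integral, average)
```

Here `integral` is the fraction $38/3 = 12\tfrac{2}{3}$ and `average` is $19/3 = 6\tfrac{1}{3}$, and `average * (b - a)` recovers the area, which is the rectangle interpretation described above.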

EXERCISES.

1. (a) Compute the average value of $f(t) = e^t$ on $[0, 10]$.
(b) At what value of $t$ is the value of $f$ equal to its average value over the interval?

2. (a) I drive the 90 miles to Seattle in 1.5 hours at a variable speed $v(t)$. Dividing distance by time, I say my average speed is 60 mph. Is this the same use of the word average as in this section? Explain why or why not.
(b) Must there have been at least one instant during my trip when my (instantaneous) speed was exactly 60 mph? Justify your answer graphically in terms of the idea of average value of a function.

3. Without computing any integral, explain why the average value of $f(x) = \sin x$ on $[0, \pi]$ is between $\frac{1}{2}$ and 1. (Suggestion: estimate $\int_0^{\pi} \sin x\,dx$ using an isosceles triangle.) Then compute the average value exactly.

4. The value, $V$, of a painting sold for \$350 in 1965 increases at 15% per year from then on.
(a) Write an expression for $V$ in terms of the number $t$ of years elapsed since 1965.
(b) Find the average value of the painting over the period 1965-2000.

5. A bar of metal is cooling from 1000°C to room temperature, 20°C. The temperature $T$ of the bar $t$ minutes after it starts cooling is given by

$$T = 20 + 980e^{-0.1t}.$$

(a) Find the temperature of the bar after 1 hour.
(b) Find the average temperature of the bar during the first hour.
(c) Compare your answer to (b) with the average of the two numbers $T(0)$ and $T(60)$. Explain, using the concavity of the graph of $T$, which average should be greater.

5.8 The Fundamental Theorem of Calculus

When we began to talk about differentiation, we started from the simple equation for objects moving at a constant speed

$$\text{speed} = \frac{\text{distance}}{\text{time}}$$

which we used to define the average speed over an interval of time. When we began to talk about integration, we used the same formula in the form

$$\text{distance} = \text{speed} \times \text{time}$$

to estimate the distance traveled over a short time interval (where we could pretend the speed was constant without too much error) and then added the individual estimates to get an overall estimate for the distance traveled.

In discussing other applications, we generalized this simple equation to the following, valid for any quantity that is changing at a constant rate.

$$\text{total change} = \text{rate of change} \times \text{time}. \tag{5.4}$$

The Fundamental Theorem of Calculus is the extension of this simple equation to the situation where some quantity is changing at a variable rate.

In Example 1 at the beginning of the chapter, the speed of the car, the rate of change of its position, was $f(t) = 20t$. We estimated the distance $D$ traveled during a two hour period by a sum,

$$\sum_{k=1}^{n} 20\,t_k\,\Delta t$$

and showed that these sums approached a limiting value, 40 miles, as the number $n$ of subintervals got larger. Thus we arrived at

$$40 = \int_0^2 20t\,dt,$$

or in words (regarding the distance traveled as the total change of position of the car)

$$\text{total change} = \int_0^2 (\text{rate of change})\,dt.$$

Similarly, in Example 2 we estimated the mass of a two centimeter length of bar with variable density $f(x) = 20x$ at a point $x$ centimeters from the left end of the bar by using the appropriate form of equation (5.4), namely

$$\text{total mass} = \text{density} \times \text{length}.$$

This is a special case of (5.4) because the total mass is really the total change of mass from the left end of the bar to the point two centimeters to the right of it (think of the mass function as increasing as we move the right end of the measured hunk of bar to the right) and linear density is really the rate of change of mass with respect to distance along the bar. Thus the result of Example 2 could also be phrased as

$$\text{total change} = \int_0^2 (\text{rate of change})\,dx.$$

We will discuss in the next section how Example 3 (computing an area) can also be fitted into this framework, but for the moment I will just pass to the word version of the Fundamental Theorem of Calculus:

Let $F$ be a bounded function defined on an interval $a \le x \le b$. Then

$$\text{the total change of } F \text{ between } a \text{ and } b = \int_a^b (\text{the rate of change of } F)\,dx.$$

Since we know that the rate of change of a function is given by its derivative, and the change of $F$ between $x = a$ and $x = b$ is just $F(b) - F(a)$, this takes the symbolic form

$$F(b) - F(a) = \int_a^b F'(x)\,dx.$$
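The symbolic form can be illustrated with the car from Example 1. In the sketch below the position function $10t^2$ is an assumed antiderivative of the speed $20t$ (taking the starting position to be 0), and the helper name `right_sum` is illustrative.

```python
def rate(t):
    return 20 * t            # speed of the car, the rate of change of position

def position(t):
    return 10 * t ** 2       # assumed antiderivative of rate, with position(0) = 0

def right_sum(f, a, b, n):
    """Sum of f(t_k) * dt over n equal subintervals, using right endpoints."""
    dt = (b - a) / n
    return sum(f(a + k * dt) for k in range(1, n + 1)) * dt

total_change = position(2) - position(0)     # F(b) - F(a)
for n in (10, 100, 1000):
    print(n, right_sum(rate, 0, 2, n), total_change)
```

The sums of (rate of change) × (time) settle toward the total change of position, 40, which is the content of the theorem for this example.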

You are probably used to using this formula from right to left: you are given a definite integral, that is, you are given the function $F'$, and you want to evaluate it by finding the function $F$, that is, you want to find an antiderivative for $F'$. You have no doubt practiced various techniques for discovering $F$ when you know $F'$; these are generally rules of differentiation run in reverse, though that is often disguised. This is a useful skill, and we will review some aspects of it later in this chapter, but the truth is that if in the future you are required by your profession to evaluate integrals, you are more likely to use some descendant of a TI-89 (some computer, anyway) than to do a calculation by hand, just as you now probably usually use a calculator to multiply two numbers even though you could do it by hand if you really had to.

The real value of the Fundamental Theorem of Calculus, then, is more likely to come from the interpretation of the word version of it, and the help that it gives in modeling with integrals. We will return to this in a later chapter when we study how to model with integrals, that is, how to answer the question: “What integral can I use to represent this quantity?”

EXERCISES.

1. The diagram shows the graph of $f$.

(a) If $F' = f$ and $F(0) = 0$, find $F(b)$ for $b = 1, 2, 3, 4, 5$.

(b) Write a piecewise formula for $F$.

(c) Repeat (a) and (b) if $F' = f$ and $F(0) = 3$.

[Figure: graph of $f$; horizontal axis from 1 to 6, vertical axis from $-1$ to 3.]

2. The graph of $y = f(x)$ passes through the origin and the point $(1, 1)$. What is $\int_0^1 f'(x)\,dx$?

3. Suppose $F(0) = -1$ and $F'(\theta) = \cos(\theta^2)$ for $\theta \ge 0$. Approximate $F(\theta)$ for $\theta = 0.5$ and $\theta = 1$.

4. With $F$ as in #3, use a graph of $F'$ to decide where $F$ is increasing and where decreasing and also where $F$ is concave up and concave down. Draw a very rough graph of $F$ for $0 \le \theta \le 2$. (Don’t worry about the vertical scale too much, but do use the points from #3.)


5. With $F$ as in #3, use the graph from #4 to locate the maximum and minimum values of $F$ for $0 \le \theta \le 2.5$. Use sums to estimate the maximum and minimum values.

6. Find a function $f$ so that $f(1) = -1$, $f(4) = 6$ and $f'(x) > 3$ for all $x$ or show that no such function can exist.

7. Using the graph below of the derivative, $f'$, of $f$,

(a) decide which is greater, $f(1)$ or $f(2)$,

(b) decide whether the following quantities can be represented in the graph as lengths, slopes or areas (and which ones) and use this to arrange them in increasing order: $\frac{f(4) - f(2)}{2}$, $f(4) - f(3)$, $f(3) - f(2)$.

[Figure: graph of $f'$; horizontal axis from 1 to 4, vertical axis from $-10$ to 0.]

8. Dave is pedaling along the Guide Meridian, a perfectly straight road, with velocity $v$, given as a function of time by the graph below. Suppose that he starts 5 miles from Wiser Lake (which the Guide crosses), that positive velocities take him away from the lake and negative velocities take him towards the lake. When is Dave farthest from Wiser Lake, and about how far is he then?

[Figure: Dave’s velocity $v$ as a function of $t$; horizontal axis from 0.2 to 1.0, vertical axis from $-10$ to 30.]


9. Mark the following quantities on a copy of this graph of $f$.

[Figure: graph of $f$ over an interval from $a$ to $b$.]

(a) A length representing $f(b) - f(a)$.

(b) A slope representing $\frac{f(b) - f(a)}{b - a}$.

(c) An area representing $F(b) - F(a)$, where $F' = f$.

(d) A length roughly approximating $\frac{F(b) - F(a)}{b - a}$, where $F' = f$.

10. Determine $f$ and $a$ so that

$$6 + \int_a^x \frac{f(t)}{t^2}\,dt = 2\sqrt{x}.$$

11. Let $f' = df/dt$ be periodic with period $P$.

(a) Show that if $\int_0^P f'(t)\,dt = 0$, then $f$ is periodic with period $P$.

(b) Show that if $\int_0^P f'(t)\,dt = c \neq 0$, then $f(x) - \frac{c}{P}x$ is periodic with period $P$, but $f$ is not periodic with period $P$.

REMARK: Thus $f'$ periodic means $f$ is the sum of a periodic function and a linear function.

5.9 The Area Function as Antiderivative

We know that for any integrable function $f$ on an interval $[a, b]$, $\int_a^b f(t)\,dt$ represents the (signed) area between the graph of $f$ and the $x$-axis and between the vertical lines $x = a$ and $x = b$. This area depends on $f$, on $a$ and on $b$. We can therefore define a function by regarding one of these three quantities as variable while holding the other two fixed. All three of these functions are of some interest, but for the moment it is $b$, the upper limit of the integral, that we allow to vary. In this way we associate with any integrable function $f$ and any fixed lower limit $a$ an area function $F_a(x)$ defined by

$$F_a(x) = \int_a^x f(t)\,dt.$$

Some simple properties of $F_a$ are evident from our experience with area. For instance, if $f(t) > 0$ within $[a, b]$, then $F_a$ is an increasing function, while if $f(t) < 0$, then $F_a$ is a decreasing function. Also, since the area of a region will shrink to zero as its width shrinks to zero if the height is kept the same, two values $F_a(x)$ and $F_a(z)$ with $x$ and $z$ close together will be very close together, that is, area functions are always continuous.

[Figure: two graphs with horizontal axis from 0 to 4 and vertical axis from 0 to 6, showing the areas $F_1(2)$ and $F_1(3)$.]

If $f$ is given by some simple formula, then we can generally use our knowledge of finding antiderivatives to find an explicit formula for $F_a$. For instance, if $f(x) = 3 + 2x$, and $a = 1$, then

$$F_1(x) = \int_1^x (3 + 2t)\,dt = \left(3t + t^2\right)\Big|_1^x = 3x + x^2 - (3 + 1) = x^2 + 3x - 4.$$

Note that if we change $a$ from 1 to 0 while leaving $f$ the same, we do change the area function. Pictorially, the area should change by the area under the graph of $f$ between $x = 0$ and $x = 1$.

[Figure: three graphs with horizontal axis from $-1$ to 4, showing the areas $F_0(3)$, $F_1(3)$, and $F_0(3) - F_1(3)$.]

Symbolically, we find

$$F_0(x) = \int_0^x (3 + 2t)\,dt = \left(3t + t^2\right)\Big|_0^x = x^2 + 3x.$$

The difference between $F_0$ and $F_1$ is exactly $\int_0^1 (3 + 2t)\,dt = \left(3t + t^2\right)\big|_0^1 = 4$.
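The constant difference between the two area functions can also be seen numerically. This is a sketch only: `area_function` is a hypothetical helper that approximates $F_a(x)$ by a midpoint sum, which happens to be exact (up to rounding) for a linear integrand such as $3 + 2t$.

```python
def area_function(f, a, x, n=10_000):
    """Midpoint-sum approximation of F_a(x), the integral of f from a to x."""
    h = (x - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def f(t):
    return 3 + 2 * t

for x in (0.5, 2.0, 3.0):
    F0 = area_function(f, 0.0, x)
    F1 = area_function(f, 1.0, x)
    # F0 should match x^2 + 3x, F1 should match x^2 + 3x - 4, and F0 - F1 should be 4.
    print(x, F0, F1, F0 - F1)
```

Whatever value of $x$ is used, the last column stays at 4, the area under the graph of $f$ between the two lower limits.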

An area function $F_a$ makes sense even for $x < a$ as long as $f$ is defined on the appropriate interval. For instance if $f$ is the constant function $f(x) = 3$ for all $x$, then the area function $F_0(x) = \int_0^x 3\,dt$ is defined for all $x$, negative as well as positive, though we need to think about the two cases separately.

For $x \ge 0$, remembering that the area defined by this integral is just a rectangle, we have $F_0(x) = 3(x - 0) = 3x$. For $x < 0$, we need to remember the interpretation $\int_0^x 3\,dt = -\int_x^0 3\,dt$, that is, $F_0(x)$ is the negative of an area:

$$F_0(x) = -\int_x^0 3\,dt = -3(0 - x) = 3x$$

but it turns out to be given by the same expression $3x$. See also this diagram:

[Figure: graph of the constant function $y = 3$ with horizontal axis from $-4$ to 4; $F_0(-4)$ is the negative of the indicated area.]

EXERCISES

Find the area function $F_a(x)$ over the interval on which $f$ is defined for the indicated value or values of $a$. Use whatever method seems appropriate. Sketch the graph of $F_a$.

1. $f(x) = 1 - x$, $0 \le x \le 2$, $a = 0$.

2. $f(x) = 1 - x$, $0 \le x \le 2$, $a = 1$.

3. $f(x) = \begin{cases} 1 & \text{if } -1 \le x \le 0, \\ 1 - x & \text{if } 0 < x \le 1, \end{cases}$ $a = -1$ and $a = 0$.

4. $f(x) = \begin{cases} 1 & \text{if } -1 \le x \le 0, \\ -1 & \text{if } 0 < x \le 1, \end{cases}$ $a = -1$ and $a = 0$.

5. Make a rough sketch of the graphs of the area functions $F_0$ and $F_1$ for $0 \le x \le 3$ where $f$ is given by this graph.

[Figure: graph of $f$; horizontal axis from 1 to 3, vertical axis from $-1$ to 3.]

Two properties of area functions are evident from these calculations. First, any two area functions for the same integrand $f$ but different lower limits $a$ will differ by a constant, and that constant will be precisely the area under the graph of $f$ between the two values of $a$. This means, in particular, that two such area functions will have the same derivative. ($2x + 3$ in the example above.)

Second, this derivative will be the integrand $f$, that is, each area function is an antiderivative for the integrand function. This can be justified roughly as follows. The derivative of the area function $F_a$ is the rate at which $F_a$ is changing as the upper limit $x$ increases. If we increase $x$ by a small amount $h$, the effect on the area is to add a small “nearly rectangle” whose base has width $h$ and whose height is approximately $f(x)$. (The difficulty is that if $f$ is not constant, then the top of the “nearly rectangle” is not flat, but if $h$ is small and $f$ is continuous, the height will not vary much from $f(x)$.)

[Figure: two diagrams. The first, with horizontal axis from 0 to 2 and vertical axis from 0 to 3, shows $F(x+h) - F(x)$ as an area; the second shows $\frac{F(x+h) - F(x)}{h}$ as a height over a base of width $h$.]

We see that $F_a$ changes by about $f(x)h$ when $x$ changes by $h$. Thus the average rate of change of $F_a$ near $x$ is approximately $f(x)$. (Or, more pictorially, the rate of change of the area of a rectangle as the base changes is the height of the rectangle. This is just (area) = (base)(height) rearranged to (area)/(base) = (height).)

We can make a more formal argument out of this by writing a difference quotient for $F_a$ and using properties of integrals, though it is best to keep the picture firmly in mind. Since $F_a(x) = \int_a^x f(t)\,dt$, we will have $F_a(x+h) = \int_a^{x+h} f(t)\,dt$ and so

$$F_a(x+h) - F_a(x) = \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt.$$

Thus

$$\frac{F_a(x+h) - F_a(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t)\,dt.$$

The right hand side of the equation is precisely the average value of $f$ over the interval $[x, x+h]$. If $f$ is a continuous function at $x$, then its average value over $[x, x+h]$ will approach $f(x)$ as $h \to 0$. Thus in this case,

$$\frac{F_a(x+h) - F_a(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t)\,dt \to f(x) \quad \text{as } h \to 0,$$

that is, $F_a'(x) = f(x)$.

Notice that the value for the limit (even its existence) depends on the average value of $f$ approaching $f(x)$ as $h$ shrinks down to 0. If $f$ has a jump at $x$ then this may not happen, and the area function will not be differentiable at $x$.
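The limit can be watched numerically for a continuous integrand. The sketch below makes assumptions that are not in the text: the integrand $f(t) = e^{-t^2}$, the point $x = 0.7$, and the helper `integral`, which uses a midpoint sum.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-sum approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def f(t):
    return math.exp(-t * t)

x = 0.7
for h in (0.1, 0.01, 0.001):
    # Average value of f over [x, x + h], i.e. (F_a(x + h) - F_a(x)) / h for any a.
    average = integral(f, x, x + h) / h
    print(h, average, f(x))
```

As $h$ shrinks, the average value over $[x, x+h]$ moves toward $f(x)$, which is the statement $F_a'(x) = f(x)$ at this point.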

Example. Let

$$f(x) = \begin{cases} 1 & \text{if } x \le 1, \\ -1 & \text{if } x > 1. \end{cases}$$

Consider the derivative of $F_1(x)$ at $x = 1$.

[Figure: graph of $F_1(x)$ with horizontal axis from 0 to 2 and vertical axis from $-1$ to 0.]

From the graph it is clear that for $0 \le x \le 1$, $F_1(x) = x - 1$. However for $x > 1$, $F_1(x) = 1 - x$. Thus the graph of $F_1$ has a corner at $x = 1$ and $F_1$ is not differentiable there, though it is differentiable at all other points. In terms of average values, $\frac{1}{h}\int_1^{1+h} f(t)\,dt = -1$ if $h > 0$ and $= 1$ if $h < 0$ (since both $h$ and the integral are negative). Thus there is no single limit for the average values as $h \to 0$ although there are separate limits from the right ($-1$) and the left ($1$).
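The two one-sided limits can be seen by evaluating difference quotients on each side of the corner. The sketch uses the closed form $F_1(x) = -|x - 1|$, which combines the two formulas $x - 1$ and $1 - x$ from the example; the function name `F1` is just a label for it.

```python
def F1(x):
    """Area function of the step function with lower limit 1: x - 1 for x <= 1, 1 - x for x > 1."""
    return -abs(x - 1)

for h in (0.1, 0.01, 0.001):
    from_right = (F1(1 + h) - F1(1)) / h        # difference quotient with h > 0
    from_left = (F1(1 - h) - F1(1)) / (-h)      # difference quotient with h < 0
    print(h, from_right, from_left)
```

The quotients from the right stay near $-1$ and those from the left stay near $1$ however small $h$ becomes, so no single derivative exists at $x = 1$.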

We conclude from all this that every integrable function $f$ has a family of antiderivatives, its family of area functions. Two members of this family differ from one another by a constant, and the value of that constant is the area under the graph of $f$ between the two lower limits. Of course you know that for many simple functions, you can identify a formula for an antiderivative. (This is just a formula for an area function, of course.) The point is that even when you can’t find a formula for the area function, it is still an antiderivative, and often you can use general properties of integrals to study the antiderivative even without a formula for it.

Example. Let $f(x) = \frac{\sin x}{x}$. There is no elementary formula for an antiderivative of this function. (Try to find one if you don’t believe it.)

[Figure: graph of $F_0(x)$ with horizontal axis from 0 to 14 and vertical axis from 0 to 1.5.]

From the diagram you can see that the area function antiderivative $F_0(x) = \int_0^x \frac{\sin t}{t}\,dt$ alternately increases and decreases on intervals of length $\pi$ (depending on the sign of $\frac{\sin t}{t}$, which is the sign of $\sin t$) by an amount that decreases as $t$ increases (because the $t$ in the denominator makes the magnitude of $\frac{\sin t}{t}$ decrease as $t$ increases). In a class that considers limits more carefully, you may study whether the values of this area function approach a limit as $x \to \infty$. (See also Exercise 4.)

Example. The area function $\operatorname{erf} x = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$ is very important in probability, where it is called the error function. Since there is no elementary formula for this function, the integral itself is used to calculate values of $\operatorname{erf} x$. (In the old days, there were extensive tables of values. Today this function is built in to some calculators and mathematical computer packages.)
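As one illustration of calculating values from the integral itself, the sketch below applies Simpson's rule to the defining integral and compares the result with the error function built in to Python's standard library (`math.erf`). The helper names `simpson` and `erf_by_simpson`, the sample points, and the choice of 20 subintervals are assumptions made for the example.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + j * h) for j in range(1, n, 2))
    even = sum(f(a + j * h) for j in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

def erf_by_simpson(x, n=20):
    """erf x computed from its definition as an area function."""
    return 2 / math.sqrt(math.pi) * simpson(lambda t: math.exp(-t * t), 0.0, x, n)

table = []
for x in (0.25, 0.75, 1.5):
    table.append((x, erf_by_simpson(x), math.erf(x)))
    print(x, erf_by_simpson(x), math.erf(x))
```

The two columns should agree to several decimal places, which is the sense in which the integral is itself a usable formula for the function.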

5.9.1 Composition Involving the Area Function

A favorite “tough” calculus problem is to ask for the derivative of a function of $x$ like

$$\int_0^{x^2} e^{\sin t}\,dt$$

or even

$$\int_{-1}^{\sin(x^2)} e^{-\tan(t^2)}\,dt.$$

This is going to be rather messy, but it is not really difficult. The point is just to recognize that each of the functions above is a composition of an area function with something else, so that we just need to write the composition out and plug it into the chain rule. In the first example above the area function is

$$F_1(x) = \int_0^x e^{\sin t}\,dt$$

with derivative $e^{\sin x}$ (the subscript here numbers the example; it is not the lower limit) and the given integral is $G_1(x) = F_1(x^2)$. Thus

$$\frac{d}{dx} \int_0^{x^2} e^{\sin t}\,dt = 2x\,F_1'(x^2) = 2x\,e^{\sin(x^2)}.$$

In the second example, the area function is

$$F_2(x) = \int_{-1}^{x} e^{-\tan(t^2)}\,dt$$

with derivative $e^{-\tan(x^2)}$ and the given integral is $G_2(x) = F_2(\sin(x^2))$. We know $\frac{d}{dx} \sin(x^2) = 2x\cos(x^2)$ so

$$G_2'(x) = 2x\cos(x^2)\,F_2'(\sin(x^2))$$

or

$$\frac{d}{dx} \int_{-1}^{\sin(x^2)} e^{-\tan(t^2)}\,dt = 2x\cos(x^2)\,F_2'(\sin(x^2)) = 2x\cos(x^2)\,e^{-\tan(\sin^2(x^2))}.$$
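The first chain rule result can be checked against a numerical derivative. This is a sketch under assumptions: `integral` is a midpoint-sum helper, the test point $x = 1.3$ and step $10^{-4}$ are arbitrary, and the derivative of $G_1$ is estimated with a central difference.

```python
import math

def integral(f, a, b, n=4000):
    """Midpoint-sum approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def G(x):
    """G_1(x): the integral from 0 to x**2 of e^(sin t) dt."""
    return integral(lambda t: math.exp(math.sin(t)), 0.0, x * x)

def G_prime_formula(x):
    """Chain rule result: 2x * e^(sin(x**2))."""
    return 2 * x * math.exp(math.sin(x * x))

x, h = 1.3, 1e-4
central_difference = (G(x + h) - G(x - h)) / (2 * h)
print(central_difference, G_prime_formula(x))
```

The numerical slope and the chain rule formula should agree to several decimal places; the same check could be adapted to the second example by changing the integrand and the limits.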

EXERCISES.

1. Use Simpson’s rule to estimate values for $\operatorname{erf} x$ for $x = .1, .2, .5, 1$. Sketch a rough graph of $\operatorname{erf} x$ for $x \ge 0$.

2. Do you think the graph of $\operatorname{erf} x$ has a horizontal asymptote? Why or why not?

3. The problem is to compute $\int_0^4 f''(x)\,dx$ where $f$ is the piecewise linear function with this graph:

[Figure: graph of $f$; horizontal axis marked at 2 and 4, vertical axis from $-2$ to 2.]

George says: Use the Fundamental Theorem of Calculus:

$$\int_0^4 f''(x)\,dx = f'(4) - f'(0) = 1 - (-3) = 4.$$

Martha says: Since $f''(x) = 0$ except at one point in $[0, 4]$, $\int_0^4 f''(x)\,dx = 0$.

Is either George or Martha right? If so, which one? Discuss the correctness of both opinions.

4. The Fresnel function

$$F(x) = \int_0^x \sin\left(\pi t^2/2\right)dt$$

first appeared in the work of the physicist Augustin Fresnel on the diffraction of light rays. Recently it has been applied to highway design. Sketch a rough graph of $F$ for $-3 \le x \le 3$. You will find it helpful to generate and use the graph of $f(t) = \sin(\pi t^2/2)$ and to compute the zeros of $\sin(\pi t^2/2)$ for $-3 \le t \le 3$. (There are 9 of them.) What does the fact that $f$ is an even function tell you about $F$?

5. Consider the function $f(x) = \int_0^x e^{\cos t}\,dt$.

(a) Graph $e^{\cos t}$ for $-2\pi \le t \le 2\pi$. What would the graph look like for greater or smaller values of $t$?

(b) Before actually graphing $f(x)$, use your graph from (a) to answer these questions: where is the graph of $\int_0^x e^{\cos t}\,dt$ increasing for $0 \le x \le 2\pi$? for $-2\pi \le x \le 0$? for other values of $x$? Where are the points of inflection of the graph? (This should be clear from the graph in (a). Don’t do any calculations.)

(c) Now use Simpson’s Rule to make a table of values of $f$ at $0, \frac{\pi}{4}, \frac{\pi}{2}, \frac{3\pi}{4}, \pi$, and sketch the graph of $f$ on $[0, \pi]$. Use the appropriate symmetry to sketch the graph for $[-\pi, 0]$.

6. Show that $\lim_{x \to \infty} \int_0^x \frac{\sin t}{t}\,dt$ exists. (Hint: How are the numbers $a_n = F_0(n\pi) = \int_0^{n\pi} \frac{\sin t}{t}\,dt$ arranged on the number line? What does this tell you about the location of other values of $F_0$ and of $\lim_{x \to \infty} \int_0^x \frac{\sin t}{t}\,dt$?)

7. Define

$$F(x) = \int_0^{\sqrt{|x|}} \sin\left(t^2\right)dt.$$

(a) What is the domain of this function? Is it continuous at 0? Differentiable at 0? In each case, if it is, give the value with justification.

(b) What symmetry, if any, does the graph exhibit?

(c) Where is $F$ increasing, and where decreasing? Sketch the graph.

8. Is the function $F$ of the previous problem bounded? Does $\lim_{x \to \infty} F(x)$ exist?

5.10 Finding Antiderivatives Pictorially with Slope Fields

You have some experience with finding antiderivatives symbolically, and with using various so-called methods of integration to do that. We will review some of this material in the next section, but first we will look at something more visual which has the advantage that it always works. You recall that symbolic antidifferentiation does not. Consider, for instance, the problem of finding an antiderivative for $e^{-x^2}$ or for $\sin(x^2/2)$, two functions that have occurred in exercises in this chapter. You can easily convince yourself that they do not appear to have simple symbolic antiderivatives. On the other hand we know that each of these functions does have a family of antiderivatives: its family of area functions. We have explored approximating them numerically. Here we consider another approach.

By definition, any antiderivative for $f(x) = e^{-x^2}$ has slope $f(0) = 1$ at $x = 0$. Similarly it has slope $f(.1) = e^{-.01} \approx .990$ at $x = .1$, $f(.2) = e^{-.04} \approx .961$ at $x = .2$, and so forth. So we can make a picture of the tangent lines to the antiderivative functions (or small pieces of them) just by evaluating $f$ at a range of points. From this picture of pieces of tangent lines, called a slope field, the general nature of the antiderivative functions themselves is generally clear. Here is the result for $e^{-x^2}$.

[Figure: Slope field for $e^{-x^2}$; $x$ from 0 to 1.5, $y$ from 0 to 1.]

It is easy to see from this diagram that antiderivatives for $e^{-x^2}$ are increasing and concave down. It is not possible to tell whether they have a horizontal asymptote or not, but this can be determined in a different way as we will see in the section on improper integrals.

Notice that all slopes corresponding to a single value of $x$ are the same. This corresponds to the fact that the family of antiderivative functions all differ from one another by a constant value; that is, graphically we get the graphs of all antiderivatives by translating vertically the graph of a single antiderivative.
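The construction just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `slope_field`, the grid spacing, and the segment half-width are hypothetical choices, not anything prescribed by the text. Each grid point gets a short segment whose slope is $f(x)$, independent of $y$.

```python
import math

def slope_field(f, xs, ys, half_width=0.04):
    """Return one short tangent segment per grid point (x, y).

    For an antiderivative of f, the slope at (x, y) is f(x), whatever y is,
    so every segment in a given column has the same slope.
    """
    segments = []
    for x in xs:
        m = f(x)  # slope depends on x only
        for y in ys:
            segments.append(((x - half_width, y - m * half_width),
                             (x + half_width, y + m * half_width)))
    return segments

def f(x):
    return math.exp(-x * x)

xs = [0.1 * i for i in range(16)]  # 0.0, 0.1, ..., 1.5
ys = [0.2 * j for j in range(6)]   # 0.0, 0.2, ..., 1.0
segments = slope_field(f, xs, ys)

for x in (0.0, 0.1, 0.2):
    print(f"slope at x = {x:.1f}: {f(x):.3f}")
```

The printed slopes reproduce the values 1, .990 and .961 quoted above; the list `segments` could be handed to any plotting tool to draw the picture.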

5.10.1 Drawing Slope Fields with the TI-89

To draw a slope field, you must first change the graphing mode to Differential Equations. Press the Mode button; the first entry must read Differential Equations. Change, if necessary. Remember to press Enter enough times (twice) to get the calculator to remember this change in graphing mode.

Go to the y = screen. Press F1 (tool pictures), then 9 (format) and look at the bottom item Fields. This should be set to SLPFLD.

In the y = screen, enter the function for which you want an antiderivative as y1 (or y2, y3, or whatever you want). Notice that you must use t rather than x as the independent variable.

Determine the graphing window by setting xmin, xmax, ymin, ymax as usual. You can set t0 to be the same as xmin and tmax to be the same as xmax. The number of columns of slopes drawn is determined by fldres at the bottom of the menu. Then hit Green Diamond and Graph to draw the slope field.

You can also superimpose one or more of the actual antiderivatives (determined numerically) on the slope field. To do this, enter one or more initial values under yi1 (if you have entered the function as y1) before you press Graph. To enter multiple initial values, enclose them in curly brackets: {0, 1, 2} will cause the calculator to graph the antiderivatives whose values at t0 are respectively 0, 1, and 2. Note that t0 does not actually have to be xmin. If it is somewhere between xmin and xmax, the calculator will first draw the antiderivative graphs forward from t0 and will then come back and extend them back to a border of the screen.

EXERCISES.

1. Draw a slope field for antiderivatives of $\frac{\sin t}{t}$ for $0 \le t \le 10$. Superimpose the graph of the antiderivative which is equal to 0 at $t = 0$.

2. Draw a slope field for antiderivatives of $\sin(\pi t^2/2)$ for $-3 \le t \le 3$. Superimpose the graph of the Fresnel function $F(x) = \int_0^x \sin(\pi t^2/2)\,dt$.

5.11 Finding Antiderivatives Symbolically

For the most part, the simplest way to find an antiderivative of a specific function defined symbolically is to punch some buttons on your TI-89. While it is not perfect (what is?), you will find that it can solve just about every antiderivative problem in just about every standard calculus text.

So why is it worth thinking about finding antiderivatives by hand? I can think of two reasons offhand. One has already been mentioned: the TI-89 (or any other symbolic integrator) will fall down occasionally, and you might want to see if you can do what it can’t. Of course the TI-89 will be able to give a reasonably accurate numerical approximation to the integral anyway and in practice this is likely to be good enough, so this is perhaps not the most persuasive of reasons.

A better reason is that the principal methods used in finding antiderivatives, substitution and integration by parts, are also useful for establishing general facts that are beyond the reach of symbolic manipulators. For example, consider the statement

If a function $f$ has a bounded antiderivative $F$, then $\int_0^x f(t^2)\,dt$ approaches a finite limit as $x \to \infty$.

Note that $\int_0^x f(t)\,dt = F(x) - F(0)$ need not approach a limit as $x \to \infty$, say if $f(t) = \cos t$. I will show how we can use integration by parts to verify this statement in the section on improper integrals.

For reference purposes, a short table of antiderivatives follows this section. The specific functions in the table are superfluous except for purposes of suggestion, since the TI-89 knows all of them, but it does not know the reduction rules in a general form.

5.11.1 Substitution

This is the most used and most useful general method of finding antiderivatives, and you have undoubtedly met it before. Methods for finding antiderivatives are generally just rules of differentiation used backwards in some way. Substitution is the backwards chain rule. Thus, since the chain rule tells you that the derivative of the composition of two functions is a product of derivatives, using substitution amounts to recognizing an integrand as a suitable product in order to recognize the antiderivative as a composition of functions. Formally, rewriting the chain rule

$$\frac{d}{dx} f(g(x)) = f'(g(x))\,g'(x)$$

as an antiderivative would give

$$\int f'(g(x))\,g'(x)\,dx = f(g(x)) + C, \qquad (5.5)$$

where the “+C” reminds us as usual that an antiderivative is determined only up to an additive constant. Thus the practical problem is to recognize which part of the integrand to take as $g'(x)$ and which part as $f'(g(x))$. Sometimes this is pretty obvious (after a little practice). For

$$\int 2x\cos\left(x^2\right)dx$$

we see at a glance that $g'(x) = 2x$, $g(x) = x^2$, $f(u) = \sin u$ so that $f'(g(x)) = \cos(x^2)$, and then an antiderivative is $f(g(x)) = \sin(x^2)$.

In general, the first step in the process of substitution is to make a guess at the inside function $g$. We look for a function whose derivative is either present or can be introduced somehow, and which seems to be present in an “inside position.” This is $x^2$ in the simple example above. Once we have done that, our choice for $f'$ is determined.

There is some suggestive notation that can facilitate this process. This is the “substitution” part. When we have guessed our $g$, we write $u = g(x)$, $du = g'(x)\,dx$ and substitute these into the original integral to get a new one in which the variable of integration is $u$ and the integrand is the derivative of the outside function in the composition. We could expand equation (5.5) as

$$\int f'(g(x))\,g'(x)\,dx = \int f'(u)\,du = f(u) + C = f(g(x)) + C.$$

In our example, $u = x^2$, $du = 2x\,dx$ so

$$\int 2x\cos\left(x^2\right)dx = \int \cos u\,du = \sin u + C.$$

Now we just substitute $g(x)$ back for $u$ ($x^2$ for $u$ in the example) to get $f(g(x))$ ($= \sin(x^2)$ in the example).

Whether the method works depends on whether we manage to isolate the derivative of a recognizable outside function.

5.11.2 Substitution with Definite Integrals

Equation (5.5) is a statement about antiderivatives, that is, about functions. But there is a similar statement about definite integrals, that is, about numbers. If still $u = g(x)$ and $a \le x \le b$, it looks like this:

$$\int_a^b f'(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f'(u)\,du \qquad (5.6)$$

The justification is immediate from the Fundamental Theorem of Calculus (used twice) and the fact that $f(g(x))$ is an antiderivative for $f'(g(x))\,g'(x)$, since

$$\int_a^b f'(g(x))\,g'(x)\,dx = f(g(x))\Big|_a^b = f(g(b)) - f(g(a)) = f(u)\Big|_{g(a)}^{g(b)} = \int_{g(a)}^{g(b)} f'(u)\,du.$$
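Equation (5.6) can be checked numerically. The sketch below is our own illustration, not part of the original argument: it takes $g(x) = x^2$ and $f'(u) = \cos u$ on the arbitrarily chosen interval $0 \le x \le 1.5$, and uses a hand-written composite Simpson's rule (the helper name `simpson` and the number of subintervals are assumptions).

```python
import math

def simpson(h, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

a, b = 0.0, 1.5

# Left side of (5.6): integrand f'(g(x)) g'(x) with g(x) = x^2, f'(u) = cos u.
left = simpson(lambda x: math.cos(x * x) * 2 * x, a, b)

# Right side of (5.6): f'(u) = cos u integrated from g(a) to g(b).
right = simpson(math.cos, a * a, b * b)

# Exact value from the antiderivative f(g(x)) = sin(x^2).
exact = math.sin(b * b) - math.sin(a * a)

print(left, right, exact)
```

The three printed numbers should agree to many decimal places, which is what (5.6) predicts.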

5.11.3 An Area Interpretation of Substitution

Since we normally think of definite integrals as representing areas, it seems natural to suppose that there would be an area interpretation of equation (5.6). There is indeed such an interpretation, though we will see that it is most natural to think of transforming the area in the direction opposite to that indicated by the substitution.

To start with, consider the simplest possible case. (Remember that the general method of calculus is to make something complicated out of many cases of something very simple.) Suppose that $f'$ is constant, say $f'(u) = 4$. Then $\int_0^6 4\,du$ is the area $6 \cdot 4 = 24$ of a region of height 4 extending from $u = 0$ to $u = 6$ along the horizontal axis. (See the left picture below.) Suppose we now compress this region into $0 \le u \le 3$ (center picture). Clearly the area becomes smaller; in fact it is half as large as before. How can we compensate for the decreased width to keep the area the same? The only thing we can do is to change the height. If we double the height (right picture), this will just compensate for halving the width and the area will still be 24.

[Figure, three panels: Rectangle; Squashed rectangle; Height compensation.]

We could indicate this with integral notation as

$$\int_0^6 4\,du = \int_0^{\frac{1}{2}\cdot 6} 4 \cdot 2\,dx \quad\text{or}\quad \int_0^{2\cdot 3} 4\,du = \int_0^3 4 \cdot 2\,dx,$$

where the more symmetric form on the right can be interpreted as saying that the effect of doubling the width is the same as the effect of doubling the height, and we have switched the variable of integration from $u$ to $x$ since we can regard this as the substitution $u = 2x$. Note that the substitution goes backwards, from right to left in the pictures above.

If we start over, but this time stretch the width to three times the original width, then the correct compensation is to divide the height by 3 (next series of pictures).

[Figure, three panels: Rectangle; Stretched rectangle; Height compensation.]

In integral notation,

$$\int_0^{\frac{1}{3}\cdot 18} 4\,du = \int_0^{18} 4 \cdot \tfrac{1}{3}\,dx.$$

Here the substitution is $u = \frac{1}{3}x$, and again this corresponds to going from right to left in the pictures.

If we try the same thing with a non-constant function, the effect is similar. Suppose we compress one arch of the graph of $\sin u$, $0 \le u \le \pi$, into half its width as in the first two pictures below.

[Figure, three panels: $\sin u$; $\sin 2x$ (that is, $\sin u$ with $u = 2x$); $2\sin 2x$. A thin rectangle is marked under each arch.]

It seems plausible that the area of the squashed sine arch is half that of the original arch. One way to justify this, at least roughly, is to think of the area of the original arch as being made up of many thin rectangles like the one illustrated. Each of these corresponds to a thinner rectangle of the same height in the squashed arch. More precisely, a rectangle whose base extends from $u$ to $u + \Delta u$ is squashed into a rectangle whose base extends from $x$ to $x + \Delta x$ where $x = u/2$ and $x + \Delta x = (u + \Delta u)/2$. Subtracting, the widths are related by $\Delta x = \Delta u/2$, that is, the squashed rectangle is half as wide as the original rectangle. Just as before, to maintain the area of each rectangle we must double the height as in the right hand picture. Since each rectangle on the left has the same area as the corresponding rectangle on the right, the two areas are the same. In integral notation this says

$$\int_0^{\pi} \sin u\,du = \int_0^{2\cdot\pi/2} \sin u\,du = \int_0^{\pi/2} 2\sin 2x\,dx.$$

We can think of the formal calculation $u = 2x$, $du = 2\,dx$ as telling us that the width of a skinny $u$-rectangle is twice as much as the width of the corresponding $x$-rectangle.

Now suppose that we change the horizontal scale in a less uniform way. For the substitution used earlier, $u = x^2$, one arch of $\sin u$, $0 \le u \le \pi$, becomes one arch of $\sin x^2$, $0 \le x \le \sqrt{\pi}$, as in the two diagrams on the left below.

[Figure, three panels: $\sin u$; $\sin x^2$; $2x\sin x^2$.]

It is apparent from the two diagrams on the left that the graph is squashed by different factors at different places; in particular, the right half of the arch is squashed much more than the left half. The far left of the arch is actually stretched, even to the point of changing concavity. Thus we will have to use a variable height compensation as we move from left to right. How do we do this? As in the simple examples above, the correct height adjustment factor is the reciprocal of the width adjustment factor.

If the base of the rectangle in the left picture extends from $u$ to $u + \Delta u$ and the base of the rectangle in the middle picture from $x$ to $x + \Delta x$, then $u = x^2$ and $u + \Delta u = (x + \Delta x)^2$. Now

$$\Delta u = (u + \Delta u) - u = (x + \Delta x)^2 - x^2 = 2x\,\Delta x + \Delta x^2 \approx 2x\,\Delta x \quad\text{or}\quad \frac{\Delta u}{\Delta x} \approx 2x,$$

that is, the $u$-rectangle is about $2x$ times as wide as the $x$-rectangle. Thus the compensation is to multiply the height of the $x$-rectangle by $2x$. (Notice that this is less than 1 for $0 \le x < 1/2$, which corresponds to $0 \le u < 1/4$, and greater than 1 for $x > 1/2$. Thus the graph is stretched for $u$ between 0 and 1/4 and squashed to the right of $u = 1/4$.) This produces the picture on the right, the graph of $2x\sin x^2$, and the equality of areas designated by integrals

$$\int_0^{\pi} \sin u\,du = \int_0^{\sqrt{\pi}} 2x\sin\left(x^2\right)dx.$$

Again, the formal calculation $u = x^2$, $du = 2x\,dx$ can be interpreted as saying that the thin $u$-rectangle is $2x$ times as wide as the corresponding thin $x$-rectangle.
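The equality of areas, and the width ratio of corresponding thin rectangles, can both be checked numerically. The following is a sketch under our own assumptions: the Simpson helper, the sample point $x = 1.2$, and the width $\Delta x = 10^{-4}$ are illustrative choices only.

```python
import math

def simpson(h, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

# Area under one arch of sin u, 0 <= u <= pi.
area_u = simpson(math.sin, 0.0, math.pi)

# Area under the height-compensated graph 2x sin(x^2), 0 <= x <= sqrt(pi).
area_x = simpson(lambda x: 2 * x * math.sin(x * x), 0.0, math.sqrt(math.pi))

# Width of a thin u-rectangle relative to the matching x-rectangle at x = 1.2.
x, dx = 1.2, 1e-4
du = (x + dx) ** 2 - x ** 2
ratio = du / dx  # should be close to 2x = 2.4

print(area_u, area_x, ratio)
```

Both areas come out very close to 2, and the width ratio is close to $2x$, as the rectangle argument predicts.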

The general argument is similar. If we have an integral $\int_A^B f(u)\,du$ and a substitution $u = g(x)$ where $A = g(a)$ and $B = g(b)$, then we must look at what happens to the width of a rectangle whose base extends from $u$ to $u + \Delta u$. We have

$$\Delta u = (u + \Delta u) - u = g(x + \Delta x) - g(x) \approx g'(x)\,\Delta x.$$

Thus we must adjust the height of the $x$-rectangle by a factor $g'(x)$. We think of this as the content of the formal calculation $u = g(x)$, $du = g'(x)\,dx$. This produces the familiar substitution formula

$$\int_{g(a)}^{g(b)} f(u)\,du = \int_A^B f(u)\,du = \int_a^b f(g(x))\,g'(x)\,dx.$$

5.11.4 Integration by Parts

Integration by parts may be regarded as a “backwards product rule.” We rewrite

$$\frac{d}{dx}\left(f(x)\,g(x)\right) = f(x)\,g'(x) + f'(x)\,g(x) \quad\text{or}\quad \int \left(f(x)\,g'(x) + f'(x)\,g(x)\right)dx = f(x)\,g(x) + C$$

as

$$\int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int f'(x)\,g(x)\,dx. \qquad (5.7)$$

Thus the strategy of integration by parts is to trade your problem for a different problem, one that is easier if you have made good choices. The practical task is to partition the integrand into two parts again, but this time one factor is something that will be simplified by differentiation (most often $x$ or a power of $x$) and the other factor has an antiderivative that is not too frightening.

For instance, to integrate $\int x\cos x\,dx$ we choose $f(x) = x$, $g'(x) = \cos x$ (so that $g(x) = \sin x$) to get

$$\int x\cos x\,dx = x\sin x - \int 1\cdot\sin x\,dx = x\sin x + \cos x + C.$$

There is also suggestive notation here. We usually abbreviate $f(x)$ as $u$ and $g(x)$ as $v$ and write equation (5.7) in the shorthand form

$$\int u\,dv = uv - \int v\,du. \qquad (5.8)$$

Notice that this is not a change of variable. The quantities $u$ and $v$ are not new variables like the $u$ in substitution. They are just shorthand names for the functions $f$ and $g$, and the point of equation (5.8) is just to remind you what to differentiate and what to antidifferentiate as you go from one side of the equation to the other.
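As one more illustration of the shorthand (5.8) (an example of our own choosing, not one worked in the text), take $\int \ln x\,dx$ for $x > 0$. Here the factor that simplifies under differentiation is $\ln x$ itself, and the other factor is just 1:

```latex
u = \ln x, \quad dv = dx
\quad\Longrightarrow\quad
du = \frac{1}{x}\,dx, \quad v = x,
\qquad
\int \ln x\,dx = uv - \int v\,du
             = x\ln x - \int x\cdot\frac{1}{x}\,dx
             = x\ln x - x + C .
```

Differentiating $x\ln x - x$ gives $\ln x + 1 - 1 = \ln x$, which confirms the result; it also agrees with entry 4 in the short table of antiderivatives.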

EXERCISES.

Use substitution with definite integrals or integration by parts to evaluate these integrals.

1. $\int_0^1 \frac{e^x}{1 + e^x}\,dx$

2. $\int_0^1 \frac{1}{1 + 2x + x^2}\,dx$

3. $\int_0^1 \frac{x}{1 + 5x^2}\,dx$

4. $\int_0^1 t e^{5t}\,dt$

5. $\int_0^{1/7} \arctan 7t\,dt$

6. $\int_1^e \frac{\ln x}{x^2}\,dx$

7. $\int_0^{\pi^2} \sin\left(\sqrt{t}\right)dt$

Use substitution to decide how to fill in the blanks in the right hand integrals so that the two sides can be transformed by substitution into the same integral. It is not necessary to evaluate the integrals.

8. $\int_1^e \frac{\ln x}{x}\,dx = \int_{\_}^{\_} x\,dx$

9. $\int_{-\ln 2}^0 \frac{e^x}{1 + e^{2x}}\,dx = \_\_ \int_{\_}^{\_} \frac{\cos x}{1 + \sin^2 x}\,dx$

10. $\int_0^3 \sqrt{1 + x}\,dx = \_\_ \int_{\_}^{\_} \frac{\sqrt{1 + \sqrt{x}}}{\sqrt{x}}\,dx$

11. George is trying again. He says, I don’t think substitution works for trig functions. We know $\int -2\sin(2x)\,dx = \cos(2x)$ and $\int -4\sin x\cos x\,dx = 2\cos^2 x$ since we can check by differentiation. Also $-2\sin(2x) = -2(2\sin x\cos x) = -4\sin x\cos x$ by the addition rule for sine. Thus $\cos 2x = 2\cos^2 x$. But these functions are not equal, since they have different values at $x = 0$. Where did George go wrong? Correct his mistake.

12. George has given up, but Martha is worried about integration by parts because of this example. We know $\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx$. If we integrate the right side by parts with $u = \frac{1}{\cos x}$, so $du = \frac{\sin x}{\cos^2 x}\,dx = \tan x\sec x\,dx$, and $dv = \sin x\,dx$, $v = -\cos x$, then

$$\int \tan x\,dx = uv - \int v\,du = -1 + \int \tan x\sec x\cos x\,dx = -1 + \int \tan x\,dx.$$

Is Martha right to be worried? Explain.

13. The integration command on my TI-89 does not know what to do with

$$\int (1 + \ln x)\sqrt{1 + (x\ln x)^2}\,dx.$$

Show that you can beat a machine, at least sometimes, by making a substitution to convert this into a form from which either the TI-89 or you (using the table following these exercises) can find an antiderivative. Convert the final antiderivative back into the original variable $x$.

14. Illustrate the subsection on the area interpretation of substitution using #1 above by showing that skinny rectangles of width $dx$ near $x = 1$ and also near $x = .01$ have the same area as rectangles of the corresponding width $du$ near the corresponding values of $u$.

15. Repeat #14 using #3 above.

5.11.5 A Short Table of Antiderivatives

I. Basic Functions

1. $\int x^a\,dx = \frac{1}{a+1}x^{a+1} + C$, $a$ any real number except $-1$

2. $\int x^{-1}\,dx = \ln x + C$

3. $\int a^x\,dx = \frac{1}{\ln a}a^x + C$, $a > 0$

4. $\int \ln x\,dx = x\ln x - x + C$, $x > 0$

5. $\int \sin x\,dx = -\cos x + C$

6. $\int \cos x\,dx = \sin x + C$

7. $\int \tan x\,dx = -\ln|\cos x| + C$

II. Products of $e^x$, $\sin x$, $\cos x$

8. $\int e^{ax}\sin(bx)\,dx = \frac{1}{a^2+b^2}e^{ax}\left[a\sin(bx) - b\cos(bx)\right] + C$

9. $\int e^{ax}\cos(bx)\,dx = \frac{1}{a^2+b^2}e^{ax}\left[a\cos(bx) + b\sin(bx)\right] + C$

10. $\int \sin(ax)\sin(bx)\,dx = \frac{1}{b^2-a^2}\left[a\cos(ax)\sin(bx) - b\sin(ax)\cos(bx)\right] + C$, $a \ne b$

11. $\int \cos(ax)\cos(bx)\,dx = \frac{1}{b^2-a^2}\left[b\cos(ax)\sin(bx) - a\sin(ax)\cos(bx)\right] + C$, $a \ne b$

12. $\int \sin(ax)\cos(bx)\,dx = \frac{1}{b^2-a^2}\left[b\sin(ax)\sin(bx) + a\cos(ax)\cos(bx)\right] + C$, $a \ne b$

III. Product of a Polynomial $p(x)$ with $\ln x$, $e^x$, $\cos x$, $\sin x$

13. $\int x^{\alpha}\ln x\,dx = \frac{1}{\alpha+1}x^{\alpha+1}\ln x - \frac{1}{(\alpha+1)^2}x^{\alpha+1} + C$, $\alpha$ any real number except $-1$

14. $\int p(x)e^{ax}\,dx = \frac{1}{a}p(x)e^{ax} - \frac{1}{a^2}p'(x)e^{ax} + \frac{1}{a^3}p''(x)e^{ax} - \dots$ (signs alternate: $+ - + - \dots$)

15. $\int p(x)\sin(ax)\,dx = -\frac{1}{a}p(x)\cos(ax) + \frac{1}{a^2}p'(x)\sin(ax) + \frac{1}{a^3}p''(x)\cos(ax) - \dots$ (signs alternate in pairs after first: $- + + - - + + \dots$)

16. $\int p(x)\cos(ax)\,dx = \frac{1}{a}p(x)\sin(ax) + \frac{1}{a^2}p'(x)\cos(ax) - \frac{1}{a^3}p''(x)\sin(ax) - \dots$ (signs alternate in pairs: $+ + - - + + \dots$)

IV. Integer Powers of $\sin x$ and $\cos x$

17. $\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1}x\cos x + \frac{n-1}{n}\int \sin^{n-2}x\,dx$, $n > 0$

18. $\int \cos^n x\,dx = \frac{1}{n}\cos^{n-1}x\sin x + \frac{n-1}{n}\int \cos^{n-2}x\,dx$, $n > 0$

19. $\int \frac{1}{\sin^m x}\,dx = \frac{-1}{m-1}\,\frac{\cos x}{\sin^{m-1}x} + \frac{m-2}{m-1}\int \frac{1}{\sin^{m-2}x}\,dx$, $m \ne 1$, $m > 0$

20. $\int \frac{1}{\cos^m x}\,dx = \frac{1}{m-1}\,\frac{\sin x}{\cos^{m-1}x} + \frac{m-2}{m-1}\int \frac{1}{\cos^{m-2}x}\,dx$, $m \ne 1$, $m > 0$

21. $\int \frac{1}{\sin x}\,dx = \frac{1}{2}\ln\left|\frac{(\cos x) - 1}{(\cos x) + 1}\right| + C$

22. $\int \frac{1}{\cos x}\,dx = \frac{1}{2}\ln\left|\frac{(\sin x) + 1}{(\sin x) - 1}\right| + C$

23. $\int \sin^m x\cos^n x\,dx$: If $m$ is odd, let $u = \cos x$; if $n$ is odd, let $u = \sin x$. If both $m$ and $n$ are even and nonnegative, convert to power of $\sin x$ or power of $\cos x$ and use reduction formula above. If one of $m$ and $n$ is negative, convert to sum of powers of whichever function is in the denominator. If both are negative, find a better table.

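V. Quadratic in the Denominator

24. $\int \frac{1}{x^2+a^2}\,dx = \frac{1}{a}\arctan\frac{x}{a} + C$, $a \ne 0$

25. $\int \frac{bx+c}{x^2+a^2}\,dx = \frac{b}{2}\ln\left|x^2+a^2\right| + \frac{c}{a}\arctan\frac{x}{a} + C$, $a \ne 0$

26. $\int \frac{1}{(x-a)(x-b)}\,dx = \frac{1}{a-b}\ln\left|\frac{x-a}{x-b}\right| + C$, $a \ne b$

27. $\int \frac{cx+d}{(x-a)(x-b)}\,dx = \frac{1}{a-b}\left[(ac+d)\ln|x-a| - (bc+d)\ln|x-b|\right] + C$, $a \ne b$

VI. Integrands with Radicals

28. $\int \frac{1}{\sqrt{a^2-x^2}}\,dx = \arcsin\frac{x}{a} + C$

29. $\int \frac{1}{\sqrt{x^2 \pm a^2}}\,dx = \ln\left|x + \sqrt{x^2 \pm a^2}\right| + C$

30. $\int \sqrt{a^2 \pm x^2}\,dx = \frac{1}{2}\left(x\sqrt{a^2 \pm x^2} + a^2\int \frac{1}{\sqrt{a^2 \pm x^2}}\,dx\right) + C$

31. $\int \sqrt{x^2-a^2}\,dx = \frac{1}{2}\left(x\sqrt{x^2-a^2} - a^2\ln\left|x + \sqrt{x^2-a^2}\right|\right) + C$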

5.12 Improper Integrals

5.12.1 The Basic Idea

In a remake of The Wizard of Oz, Dorothy is driving across Kansas on I-70 in her Uncle Henry’s beat up Ford Edsel with her dog Toto riding shotgun. Suppose that at time $t = 0$ her car runs out of fuel and starts to coast. At time $t$ its speed is $70e^{-t}$ mph. (Thus after one hour its speed is $70e^{-1} \approx 26$ mph. It rolls very well and the road is maybe a bit downhill.) What will happen eventually? On the one hand her car’s speed is clearly decreasing to 0; on the other hand it will never exactly stop, though after some time it will be moving very slowly indeed (after a day its speed is about $2 \times 10^{-9}$ mph). Does the fact that it will never quite stop mean that if we wait long enough it will go a very long distance? Will Dorothy ever be able to say, “Toto, I’ve a feeling we’re not in Kansas anymore”?

Fortunately it is easy to do a calculation. We know that from $t = 0$ to $t = T$ hours the total distance traveled by the car is

$$\int_0^T 70e^{-t}\,dt = -70e^{-t}\Big|_0^T = 70\left(1 - e^{-T}\right) \text{ miles.}$$

Thus no matter how long we wait, the car will never have gone quite as far as 70 miles. On the other hand, as time goes on, the distance traveled does approach 70 miles, that is,

$$\lim_{T\to\infty} \int_0^T 70e^{-t}\,dt = 70.$$

We use the shorthand $\int_0^{\infty} 70e^{-t}\,dt = 70$ to denote the existence and value of this limit. More formally, we have this definition.

Definition 7 Let $a$ be a fixed real number, and let $f$ be a real-valued function that is Riemann integrable on $[a, b]$ for each real number $b > a$. We define the improper integral $\int_a^{\infty} f(x)\,dx$ by

$$\int_a^{\infty} f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx$$

if this limit exists. In this case we say the improper integral converges. If the limit does not exist, the improper integral diverges. Similarly,

$$\int_{-\infty}^b f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx$$

if this limit exists. Finally

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a\to-\infty} \int_a^0 f(x)\,dx + \lim_{b\to\infty} \int_0^b f(x)\,dx$$

provided both individual limits exist. We use the converge-diverge terminology in these cases too.

Remark: Note that $\int_{-\infty}^{\infty} f(x)\,dx$ exists only if the two limits each exist individually. For instance $\int_{-\infty}^{\infty} x\,dx$ diverges even though $\int_{-a}^{a} x\,dx = 0$ for all positive $a$. In other words, we don’t allow cancellation between what’s going on for positive $x$’s and negative $x$’s.

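Examples. Consider $\int_1^{\infty} \frac{1}{x^p}\,dx$. We know that if $p \ne 1$,

$$\int_1^b \frac{1}{x^p}\,dx = \frac{x^{1-p}}{1-p}\Big|_1^b = \frac{1}{p-1}\left(1 - b^{1-p}\right).$$

If $p > 1$, $b^{1-p} \to 0$ as $b \to \infty$, so the integral converges to $\frac{1}{p-1}$. If $p < 1$, then $b^{1-p} \to \infty$ as $b \to \infty$. Finally,

$$\int_1^b \frac{1}{x}\,dx = \ln b \to \infty \text{ as } b \to \infty.$$

Thus

$$\int_1^{\infty} \frac{1}{x^p}\,dx \quad \begin{cases} \text{converges to } \dfrac{1}{p-1} & \text{if } p > 1, \\ \text{diverges} & \text{if } p \le 1. \end{cases}$$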
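The dichotomy can be seen in numbers by evaluating the closed form of $\int_1^b x^{-p}\,dx$ for growing $b$. The sketch below is our own illustration; the function name `partial_integral` and the sample values of $p$ and $b$ are arbitrary choices.

```python
import math

def partial_integral(p, b):
    """Exact value of the integral of 1/x^p from 1 to b, for b > 1."""
    if p == 1:
        return math.log(b)
    return (1 - b ** (1 - p)) / (p - 1)

for p in (2.0, 1.0, 0.5):
    values = [partial_integral(p, b) for b in (10, 1000, 10 ** 6)]
    print(f"p = {p}: {values}")
```

For $p = 2$ the values settle toward $\frac{1}{p-1} = 1$, while for $p = 1$ and $p = 0.5$ they keep growing as $b$ increases.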

Remark. We are accustomed to thinking of the integral of a non-negative function as representing the area under the graph of the function. Thus, in particular, for any $b > 1$, $\int_1^b \frac{1}{x^2}\,dx$ is the area under the graph of $\frac{1}{x^2}$ from $x = 1$ to $x = b$. As $b$ increases, the region grows to the right, and so of course its area increases. The calculation above shows that these areas approach 1 as $b \to \infty$. Thus it is natural to regard 1 as the area of the unbounded region between the $x$-axis and the graph of $\frac{1}{x^2}$ for all $x \ge 1$.

[Figure: the region under the graph of $\frac{1}{x^2}$ to the right of $x = 1$, annotated “goes on forever”; $x$ from 0 to 10, $y$ from 0 to 1.]

Notice that the region bounded by the $x$-axis and the graph of the function $\frac{1}{x^p}$ looks roughly the same for all positive values of $p$. But the improper integral $\int_a^{\infty} \frac{1}{x^p}\,dx$ converges for $p > 1$ and diverges for $p \le 1$. Thus an unbounded region of this shape may either have a finite area or fail to have a finite area. The only way to tell which is the case is to do a calculation like the one we have just done.

Example. To emphasize that limits of this sort do not always behave in the way our intuition would expect, consider the following situation, sometimes called Gabriel’s Horn. Recall that if a curve $y = f(x)$ is rotated around the $x$-axis, then the resulting solid of revolution has circular cross-sections in the direction perpendicular to the $x$-axis with radius $f(x)$ and area $\pi(f(x))^2$. For $f(x) = \frac{1}{x}$, then, the volume of the solid of revolution generated between $x = 1$ and $x = b$ by revolving this curve around the $x$-axis is

$$\int_1^b \pi\frac{1}{x^2}\,dx = \pi\left(1 - \frac{1}{b}\right).$$

It thus seems natural to regard $\pi\int_1^{\infty} \frac{1}{x^2}\,dx = \pi$ as the volume of the unbounded “horn” extending from $x = 1$ indefinitely to the right.

We will see shortly that the surface area of this horn is also given by an improper integral, but one that diverges. Thus Gabriel’s Horn has a finite volume, but infinite surface area.

EXERCISES. Decide whether the improper integrals converge or diverge. Evaluate the integral if it converges.

1. $\int_0^{\infty} \frac{e^{-x}}{1 + e^{-x}}\,dx$

2. $\int_{-\infty}^0 \frac{e^x}{1 + e^x}\,dx$

3. $\int_1^{\infty} \frac{1}{\sqrt{x}}\,dx$

4. $\int_{-\infty}^{\infty} \sin \pi x\,dx$

5. $\int_1^{\infty} \frac{1}{(2x-1)^{2/3}}\,dx$

6. $\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx$

7. $\int_{-\infty}^2 \frac{1}{(2x-5)^2}\,dx$

8. A villain has tied you up in the back of a car which she has set moving along Chuckanut Drive toward a sharp curve above a steep drop to the rocks at the edge of Bellingham Bay. To increase your anguish she tells you that the car will slow as it moves, but never stop: its speed $t$ seconds from the time it starts carrying you toward the cliff edge will be $200e^{-t/2}$ ft/sec. You estimate that you have started 500 feet from death. Will you go over the edge? If so, how long do you have for your life to pass before your eyes?

9. Does it make any difference if Mad Marion, overly excited by the thought of your suffering, pushes a little harder than she intended and the car’s speed is $200e^{-t/3}$ ft/sec instead? If so, what difference?

5.12.2 Comparison

So far we have considered improper integrals where we can find an antiderivative and compute the limit directly (if it exists). Often, however, we cannot find an antiderivative. In this situation we proceed in a way similar to what we have done earlier in the chapter: we try to replace a complicated situation by a simpler one that will give us the information we need.

Example. Investigate $\int_1^{\infty} \frac{1}{x^4 + \sin x + 1}\,dx$. Here we cannot find an antiderivative, so we cannot evaluate $\int_1^b \frac{1}{x^4 + \sin x + 1}\,dx$ explicitly. But we can compare it to $\int_1^b \frac{1}{x^4}\,dx = -\frac{1}{3}x^{-3}\Big|_1^b = \frac{1}{3}\left(1 - \frac{1}{b^3}\right)$. Since $\frac{1}{x^4 + \sin x + 1} \le \frac{1}{x^4}$,

$$\int_1^b \frac{1}{x^4 + \sin x + 1}\,dx \le \int_1^b \frac{1}{x^4}\,dx = \frac{1}{3}\left(1 - \frac{1}{b^3}\right).$$

(This is just the principle that the area under the higher graph is greater than the area under the lower graph.)

[Figure: graphs of $\frac{1}{x^4}$ and $\frac{1}{x^4 + \sin x + 1}$; $x$ from 0 to 3, $y$ from 0 to 1.]

We know that as $b \to \infty$, $\int_1^b \frac{1}{x^4}\,dx = \frac{1}{3}\left(1 - \frac{1}{b^3}\right) \to \frac{1}{3}$. Since the area function $\int_1^b \frac{1}{x^4 + \sin x + 1}\,dx$ is increasing as $b$ increases, but is always less than $\int_1^b \frac{1}{x^4}\,dx$ and so always less than $\frac{1}{3}$, it is reasonable that $\int_1^b \frac{1}{x^4 + \sin x + 1}\,dx$ must also approach a limit as $b \to \infty$ and that this limit must be less than $\frac{1}{3}$. Thus we can tell by comparison that $\int_1^{\infty} \frac{1}{x^4 + \sin x + 1}\,dx$ converges, though we do not know the exact value of the limit. We know only that it is less than $\frac{1}{3}$.
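A numerical experiment makes the comparison concrete. The sketch below is our own illustration: it approximates the partial integrals with a hand-written composite Simpson's rule (the helper names and the choice of 200 subintervals per unit length are assumptions) and compares each with the bound $\frac{1}{3}\left(1 - \frac{1}{b^3}\right)$.

```python
import math

def simpson(h, a, b, n):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

def integrand(x):
    return 1.0 / (x ** 4 + math.sin(x) + 1.0)

def comparison_bound(b):
    """Value of the integral of 1/x^4 from 1 to b."""
    return (1.0 - b ** -3) / 3.0

upper_limits = [2, 5, 20, 100]
partial = [simpson(integrand, 1.0, float(b), 200 * b) for b in upper_limits]

for b, value in zip(upper_limits, partial):
    print(f"b = {b:3d}: integral ~ {value:.6f}, bound = {comparison_bound(b):.6f}")
```

The approximate integrals increase with $b$ but stay below the bound, and hence below $\frac{1}{3}$, in line with the argument above.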

The comparison method of the previous example can be formalized as the following theorem.

Theorem 8 Let $f$ and $g$ be positive Riemann integrable functions defined for all $x \ge a$, for some $a$, such that $f(x) \ge g(x)$ for all $x \ge a$.

If $\int_a^{\infty} f(x)\,dx$ converges, then $\int_a^{\infty} g(x)\,dx$ converges.

If $\int_a^{\infty} g(x)\,dx$ diverges, then $\int_a^{\infty} f(x)\,dx$ diverges.

Remark. Intuitively, the theorem just says that if a larger area is finite, then a smaller one is finite, and conversely, that if a smaller area is infinite, then a larger one is infinite. A careful proof depends on a deep property of the real numbers, the Least Upper Bound Principle. See Math 226 for an introduction to the Least Upper Bound Principle and its use.

Effective use of this theorem requires having a reasonable library of improper integrals for which convergence or divergence is already known. The integrals $\int_a^{\infty} \frac{1}{x^p}\,dx$ are the most important members of such a library. Then the idea is, given an integral whose convergence or divergence is to be determined and for which an antiderivative cannot be found, to search the library for a suitable comparison. One example of this was given before the theorem. Here is another.

Example. (Gabriel’s Horn again) The surface area of the region obtained by revolving $y = f(x)$ around the $x$-axis from $x = 1$ to $x = b$ is $2\pi\int_1^b f(x)\sqrt{1 + (f'(x))^2}\,dx$. (Consult a calculus book for this formula.) In particular, for $f(x) = \frac{1}{x}$, the surface area from $x = 1$ to $x = b$ is

$$2\pi\int_1^b \frac{1}{x}\sqrt{1 + \left(-\frac{1}{x^2}\right)^2}\,dx = 2\pi\int_1^b \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx.$$

We can’t easily find an antiderivative here, but we can notice that $\sqrt{1 + \frac{1}{x^4}} > 1$ so that $\int_1^b \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx > \int_1^b \frac{1}{x}\,dx$. We already know that $\int_1^{\infty} \frac{1}{x}\,dx$ diverges, so the larger improper integral also diverges. Thus Gabriel’s Horn has infinite surface area, despite having finite volume. One might try to create the following paradox from this: since the surface area is infinite, we could never paint it with a finite amount of paint. But since the volume is finite, we could just dump a finite amount of paint into the horn and fill it up, thus painting it. What is wrong with this?
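The contrast between the horn's volume and its surface area shows up quickly in numbers. The following sketch is our own illustration: it uses the closed form $\pi(1 - 1/b)$ for the volume and a hand-written composite Simpson's rule for the surface area (helper names and step counts are assumptions), and compares the area with the lower bound $2\pi\ln b$ obtained from the comparison with $\frac{1}{x}$.

```python
import math

def simpson(h, a, b, n):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

def volume(b):
    """Volume of the horn between x = 1 and x = b."""
    return math.pi * (1.0 - 1.0 / b)

def surface_integrand(x):
    return 2.0 * math.pi * (1.0 / x) * math.sqrt(1.0 + x ** -4)

def surface_area(b):
    """Approximate surface area of the horn between x = 1 and x = b."""
    return simpson(surface_integrand, 1.0, float(b), 200 * b)

for b in (10, 100, 1000):
    lower = 2.0 * math.pi * math.log(b)
    print(f"b = {b:4d}: volume = {volume(b):.4f}, "
          f"area ~ {surface_area(b):.3f}, lower bound = {lower:.3f}")
```

The volume creeps up toward $\pi$, while the surface area stays above $2\pi\ln b$ and so grows without bound.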

Finally, one more example to show how we can use comparison to draw conclusions about a situation where we don’t have enough information to do an exact computation.

Example. Show that if f is any bounded function defined for $x \ge 1$, then $\int_1^\infty \frac{f(t)}{t^2}\,dt$ converges. We know that for some fixed constant M, $|f(t)| \le M$ for all $t \ge 1$, or equivalently that $-M \le f(t) \le M$. This means that for any b,

\[ -M\int_1^b \frac{1}{t^2}\,dt = -\int_1^b \frac{M}{t^2}\,dt \le \int_1^b \frac{f(t)}{t^2}\,dt \le \int_1^b \frac{M}{t^2}\,dt = M\int_1^b \frac{1}{t^2}\,dt. \]

Since $\int_1^b \frac{1}{t^2}\,dt \to 1$ as $b \to \infty$, this means that all the integrals $\int_1^b \frac{f(t)}{t^2}\,dt$ for $b \ge 1$ lie in the band $[-M, M]$ of width 2M. More generally, for any lower limit $a \ge 1$,

\[ -M\int_a^b \frac{1}{t^2}\,dt \le \int_a^b \frac{f(t)}{t^2}\,dt \le M\int_a^b \frac{1}{t^2}\,dt. \]

Since $\int_a^b \frac{1}{t^2}\,dt \to \frac{1}{a}$ as $b \to \infty$, this means that all integrals $\int_a^b \frac{f(t)}{t^2}\,dt$ for $b > a$ lie in the band $\left[-\frac{M}{a}, \frac{M}{a}\right]$ of width $\frac{2M}{a}$. Equivalently, if $b > a$, then

\[ \left|\int_1^b \frac{f(t)}{t^2}\,dt - \int_1^a \frac{f(t)}{t^2}\,dt\right| = \left|\int_a^b \frac{f(t)}{t^2}\,dt\right| \le \frac{2M}{a}. \]

If a is very large, then $\frac{2M}{a}$ is very small. Thus the integrals $\int_1^b \frac{f(t)}{t^2}\,dt$ are not changing much for large b. This is not quite enough to prove that $\int_1^b \frac{f(t)}{t^2}\,dt$ approaches a limit as $b \to \infty$, but in fact we could prove that from this inequality if we worked a little harder.
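To see the band shrinking in a concrete case, here is a small numerical sketch. It assumes the particular bounded function $f(t) = \sin t$ (so $M = 1$), which is our choice for illustration only; `midpoint_sum` and `partial_integral` are hypothetical helper names.

```python
import math

def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def partial_integral(f, a, b, n=200_000):
    """Approximate the integral from a to b of f(t) / t^2 dt."""
    return midpoint_sum(lambda t: f(t) / t ** 2, a, b, n)

M = 1.0  # assumed bound: |sin t| <= 1 for all t

# Columns: a, the piece of the integral from a to 50a, the bound 2M/a.
for a in (10.0, 100.0):
    piece = partial_integral(math.sin, a, 50 * a)
    print(a, piece, 2 * M / a)
```

In each row the middle number should be smaller in absolute value than the last one, and both shrink as a grows.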

Example Continued. Now we return to the statement from the section on symbolic antidifferentiation: if f has a bounded antiderivative F, then $\int_1^\infty f(t^2)\,dt$ converges.

We write $f(t^2) = \frac{1}{2t}\cdot 2t f(t^2)$ and integrate by parts, with $dv = 2t f(t^2)\,dt$ and $u = \frac{1}{2t}$, to get

\[ \int_1^b f(t^2)\,dt = \left.\frac{1}{2t}F(t^2)\right|_1^b + \frac{1}{2}\int_1^b \frac{F(t^2)}{t^2}\,dt. \]

Since F is bounded, $\left.\frac{1}{2t}F(t^2)\right|_1^b \to -\frac{1}{2}F(1)$ as $b \to \infty$. By the previous example, $\int_1^b \frac{F(t^2)}{t^2}\,dt$ converges to some limit L as $b \to \infty$. It follows that $\int_1^\infty f(t^2)\,dt$ converges, and in fact that

\[ \int_1^\infty f(t^2)\,dt = \frac{1}{2}L - \frac{1}{2}F(1) = \frac{1}{2}\int_1^\infty \frac{F(t^2)}{t^2}\,dt - \frac{1}{2}F(1). \]
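The integration by parts identity can be checked numerically for a finite upper limit. The sketch below assumes the specific pair $f = \cos$, $F = \sin$ (a bounded antiderivative), chosen only for illustration; `midpoint_sum`, `lhs` and `rhs` are hypothetical names.

```python
import math

def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def lhs(b, n=200_000):
    """Approximate the integral from 1 to b of cos(t^2) dt directly."""
    return midpoint_sum(lambda t: math.cos(t * t), 1.0, b, n)

def rhs(b, n=200_000):
    """Boundary term F(t^2)/(2t) from 1 to b, plus half the integral of F(t^2)/t^2."""
    boundary = math.sin(b * b) / (2 * b) - math.sin(1.0) / 2
    remaining = midpoint_sum(lambda t: math.sin(t * t) / (t * t), 1.0, b, n)
    return boundary + 0.5 * remaining

print(lhs(5.0), rhs(5.0))
```

The two printed numbers should agree to several decimal places, which is consistent with the factor $\frac{1}{2}$ in front of the remaining integral.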

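EXERCISES.

Use an appropriate comparison to decide whether each of these improper integrals converges or diverges. Do not try to use your calculator to compute an antiderivative, even if your calculator knows one.

1. $\int_0^\infty \frac{1}{x + e^x}\,dx$
2. $\int_0^\infty \frac{1}{x + e^{-x}}\,dx$
3. $\int_1^\infty \frac{\sqrt{x}}{x + 5}\,dx$
4. $\int_1^\infty \frac{1}{x^3 + 1}\,dx$
5. $\int_2^\infty \frac{1}{x^3 - 1}\,dx$
6. $\int_2^\infty \frac{e^{-x}}{\sqrt{x}}\,dx$
7. $\int_2^\infty \frac{1}{\sqrt{x^{3/2} - 3}}\,dx$
8. $\int_2^\infty \frac{x}{\sqrt{x^4 - 1}}\,dx$
9. $\int_2^\infty \frac{1}{e^x - 2x}\,dx$
10. $\int_1^\infty \frac{1}{x^a (x + 1)}\,dx$, $a > 0$
11. $\int_0^\infty e^{\sin x}\,dx$
12. $\int_2^\infty \frac{1}{2x - e^x}\,dx$
13. $\int_1^\infty \frac{1}{x^4 + \sin x + 5}\,dx$
14. $\int_0^\infty 3^{-x^2}\,dx$
15. $\int_2^\infty \frac{x}{\ln x}\,dx$
16. $\int_1^\infty \frac{\cos^2 x}{1 + x^2}\,dx$
17. $\int_1^\infty \frac{2 + e^{-x}}{x}\,dx$
18. $\int_1^\infty \frac{2 - e^{-x}}{x}\,dx$
19. $\int_1^\infty \frac{x}{x + e^{2x}}\,dx$
20. $\int_1^\infty \frac{x}{e^{2x} - x}\,dx$
21. $\int_1^\infty \frac{x}{\sqrt{1 + x^6}}\,dx$
22. $\int_1^\infty \frac{1}{x + x^2}\,dx$
23. $\int_1^\infty \frac{x^2 + 5}{x^3 + 2x + 2}\,dx$
24. $\int_1^\infty \frac{x}{x^2 + 2x + 2}\,dx$
25. $\int_1^\infty \frac{5x + 2}{x^3 + 8x^2 + 4}\,dx$

26. Sketch the region and find the area (if finite) or explain why the area is not finite.
(a) $x \le 1$, $0 \le y \le e^x$ (includes all negative values of x)
(b) $x \ge 0$, $0 \le y \le \frac{x}{x^2 + 9}$

27. What is the difficulty with the paint "solution" to the Gabriel’s Horn paradox mentioned in the example just above?

28. Find a value of a so that the value of the improper integral is less than $10^{-4}$.
(a) $\int_a^\infty e^{-x}\,dx$ (b) $\int_a^\infty e^{-x^2}\,dx$ (c) $\int_a^\infty \frac{1}{x^4 + 1}\,dx$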

5.12.3 Another Kind of Improper Integral

So far we have considered integrals that are improper because the interval of integration is unbounded. Now we consider briefly integrals like $\int_0^1 \frac{1}{x}\,dx$ that are improper because the integrand is unbounded. We can again think of this integral as the area (if it exists) of an unbounded region in the plane, but this time the region is unbounded vertically rather than horizontally. The idea is essentially the same as before. The improper integral is the limit of ordinary integrals if the limit exists. Here is the definition.

Definition 9 Let $a < b$ be fixed real numbers, and let f be a real-valued function that is Riemann integrable on $[c, b]$ for each real number c with $a < c < b$. We define the improper integral $\int_a^b f(x)\,dx$ by

\[ \int_a^b f(x)\,dx = \lim_{c \to a^+} \int_c^b f(x)\,dx \]

if this limit exists. In this case we say the improper integral converges. If the limit does not exist, the improper integral diverges. Similarly, if f is Riemann integrable on $[a, c]$ for each c with $a < c < b$,

\[ \int_a^b f(x)\,dx = \lim_{c \to b^-} \int_a^c f(x)\,dx \]

if this limit exists, and we say again that the integral converges.

Examples. Consider $\int_0^1 \frac{1}{x^p}\,dx$. We know that if $p \ne 1$,

\[ \int_c^1 \frac{1}{x^p}\,dx = \left.\frac{x^{1-p}}{1-p}\right|_c^1 = \frac{1}{1-p}\left(1 - c^{1-p}\right). \]

If $p < 1$, then $c^{1-p} \to 0$ as $c \to 0$, so the integral converges to $\frac{1}{1-p}$. If $p > 1$, then $c^{1-p} \to \infty$ as $c \to 0$, so the integral diverges. Finally,

\[ \int_c^1 \frac{1}{x}\,dx = \ln x\,\Big|_c^1 = -\ln c \to \infty \text{ as } c \to 0. \]

Thus the integral diverges in this case also. Summarizing,

\[ \int_0^1 \frac{1}{x^p}\,dx \quad \begin{cases} \text{converges to } \dfrac{1}{1-p} & \text{if } p < 1, \\ \text{diverges} & \text{if } p \ge 1. \end{cases} \]

Notice that the inequalities are exactly opposite to those for $\int_1^\infty \frac{1}{x^p}\,dx$, except that $p = 1$ is a divergent case both times.

Remark. We are accustomed to thinking of the integral of a non-negative function as representing the area under the graph of the function. Thus, in particular, for any $c > 0$, $\int_c^1 \frac{1}{\sqrt{x}}\,dx$ is the area under the graph of $\frac{1}{\sqrt{x}}$ from $x = c$ to $x = 1$. As c decreases, the region grows to the left and up, and so of course its area increases. The calculation above shows that these areas approach 2 as $c \to 0$. Thus it is natural to regard 2 as the area of the unbounded region between the x-axis and the graph of $\frac{1}{\sqrt{x}}$ for $0 < x \le 1$.
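A quick way to get a feel for the three cases is to tabulate the exact value of $\int_c^1 x^{-p}\,dx$ as c shrinks. This is a minimal sketch using the antiderivatives computed above; the function name `integral_c_to_1` is ours.

```python
import math

def integral_c_to_1(p, c):
    """Exact value of the integral from c to 1 of x^(-p) dx, for 0 < c <= 1."""
    if p == 1:
        return -math.log(c)
    return (1 - c ** (1 - p)) / (1 - p)

# For p = 0.5 the values settle near 2; for p = 1 and p = 2 they keep growing.
for p in (0.5, 1.0, 2.0):
    print(p, [integral_c_to_1(p, 10.0 ** -k) for k in (2, 4, 6, 8)])
```

Note how slowly the $p = 1$ row grows compared with the $p = 2$ row, even though both diverge.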

[Figure: graph of $y = 1/\sqrt{x}$ for $0 < x \le 1$, illustrating $\int_0^1 \frac{1}{\sqrt{x}}\,dx$]

Notice that the region bounded by the x-axis and the graph of the function $\frac{1}{x^p}$ looks roughly the same for all positive values of p. But the improper integral $\int_0^1 \frac{1}{x^p}\,dx$ converges for $p < 1$ and diverges for $p \ge 1$. (This is almost exactly the opposite of the outcome for integrals that are improper because the upper limit is $\infty$. Only for $p = 1$ is the outcome the same.) Thus an unbounded region of this shape may either have a finite area or fail to have a finite area. The only way to tell which is the case is to do a calculation like the one we have just done.

As with the first kind of improper integral, we evaluate the integral $\int_c^b f(x)\,dx$ directly with an antiderivative when we can, and then evaluate the limit explicitly. When we cannot find an antiderivative, we have to proceed by comparison, just as before. The same result holds, for the same reasons.

Theorem 10 Let f and g be positive Riemann integrable functions defined for $a < x \le b$, such that $f(x) \ge g(x)$ throughout this interval.

If $\int_a^b f(x)\,dx$ converges, then $\int_a^b g(x)\,dx$ converges.

If $\int_a^b g(x)\,dx$ diverges, then $\int_a^b f(x)\,dx$ diverges.

Remark. Again, the theorem just says that if a larger area is finite, then a smaller one is finite, and conversely, that if a smaller area is infinite, then a larger one is infinite.

Warning. Often the gist of determining whether an improper integral converges or diverges is deciding which part of the integrand really matters, that is, which part controls the growth rate of the integrand as you approach the critical endpoint. Be aware that the issue depends on whether the endpoint is finite or infinite. For the function $x^2 + x^{-2}$, it is the first term that grows fastest as $x \to \infty$ but the second term that grows as $x \to 0$. And for $x^2 + x$ it is the first term that dominates as $x \to \infty$ but the second term that dominates as $x \to 0$. For instance, for the integral

\[ \int_0^1 \frac{x}{x^3 + x + 2}\,dx \]

you have to ignore the $x^3$ in the denominator: the fraction does not behave like $\frac{x}{x^3} = \frac{1}{x^2}$ as $x \to 0$, but like $\frac{x}{2}$, so that it is not even an improper integral.

You also need to be careful in comparing functions involving fractions. I nearly always find it helps when comparing functions to eliminate fractions by cross-multiplying.

EXERCISES. State clearly why each integral is improper. Determine, without using your calculator except for examining the graph of the integrand, whether each converges or diverges and justify your answer. For those that converge, determine the value of the integral if possible.

1. $\int_0^3 \frac{1}{x^2}\,dx$
2. $\int_0^2 x^{-2/3}\,dx$
3. $\int_0^2 (2 - x)^{-2/3}\,dx$
4. $\int_0^3 (x - 1)^{-2/3}\,dx$
5. $\int_0^2 \frac{1}{x^2 + x^{-2}}\,dx$
6. $\int_2^4 \frac{1}{\sqrt{x^2 - 4}}\,dx$
7. $\int_0^1 \frac{1}{\sqrt{1 - x^3}}\,dx$
8. $\int_0^1 \frac{1}{\sqrt{x + \sin x}}\,dx$
9. $\int_0^e \frac{x \ln x}{x + 1}\,dx$
10. $\int_0^{33} (x - 1)^{-1/5}\,dx$
11. $\int_0^\pi \frac{\sin\theta}{\sqrt{\theta}}\,d\theta$
12. $\int_0^1 \frac{e^{-x}}{\sqrt{x}}\,dx$
13. $\int_0^1 \frac{\ln x}{\sqrt{x}}\,dx$
14. $\int_0^\infty \frac{1}{\sqrt{x}\,(x + 1)}\,dx$
15. $\int_0^1 \frac{1}{x + x^2}\,dx$
16. $\int_{-3}^3 \frac{x}{\sqrt{81 - x^4}}\,dx$
17. $\int_1^2 \frac{1}{\sqrt{x} - 1}\,dx$ (Suggestion: multiply and divide by $\sqrt{x} + 1$.)

18. Determine k so that

\[ \int_0^\infty \frac{\sin(x^2)}{x^2}\,dx = k \int_0^\infty \cos(x^2)\,dx. \]

First you should explain how you know that the left side is finite, that is, that the improper integral converges.

19. Sketch the region and find the area (if finite) or explain why the area is not finite.
(a) $0 \le x \le \frac{\pi}{2}$, $0 \le y \le \sec^2 x$
(b) $-2 \le x \le 0$, $0 \le y \le \frac{1}{\sqrt{x + 2}}$

20. Determine a value of a so that the indicated improper integral has a value less than $10^{-4}$.
(a) $\int_0^a x^{-2/3}\,dx$ (b) $\int_a^2 (2 - x)^{-2/3}\,dx$ (c) $\int_0^a \ln x\,dx$

6. MODELING WITH INTEGRALS

Deciding how to represent a quantity by an integral depends on two big ideas. The first is the Fundamental Theorem of Calculus as expressed in Chapter 5:

\[ \text{Total Change} = \int (\text{rate of change}) \]

or sometimes

\[ \text{Total Amount} = \int (\text{amount per unit “volume”}). \]

Here unit “volume” can mean unit length or unit area or unit volume depending on circumstances. We will see examples of all three possibilities in the examples and problems.

The second guideline to setting up integrals is the definition of the definite integral as a limit of sums. The idea is that if a quantity can be approximated by a sum, and if that sum is the Riemann sum associated with an integral, then the quantity is given exactly by the integral. Finding the approximating sum usually starts from a simple formula for the case when the rate of change (or the density or something of the kind) is constant. We break the situation up into a number of small pieces in a way appropriate to the situation, pretend the rate of change is constant on each piece, use the simple formula on each piece, and then add all the individual estimates.

We will study modeling in several different contexts: representing geometric quantities like area or volume, representing physical quantities like mass, force, or kinetic energy, computing total change from rate information, and an introduction to probability.

6.1 Finding Areas and Volumes

Example 1. Find the volume of a pyramid 3 meters high with a square base 4 meters on a side.

Solution. Think of the pyramid as being placed on its side so the vertex is at the origin and its axis of symmetry is the x-axis.

[Figure: profile of the pyramid, bounded by the lines $y = 2x/3$ and $y = -2x/3$ for $0 \le x \le 3$, with a typical slice of width $\Delta x$]

In profile the base of the pyramid extends from the point $(3, -2)$ to the point $(3, 2)$, and the sides are given by the lines $y = \pm\frac{2}{3}x$. Each cross-section perpendicular to the x-axis is a square. Think of approximating the volume by stacking thin square blocks of varying size with width $\Delta x$. A block with “top” at $x_i$ and “bottom” at $x_i + \Delta x$ has a square base $\frac{4}{3}x_i$ on a side. Such a block has volume $\left(\frac{4}{3}x_i\right)^2 \Delta x$. If we approximate the volume as the sum of n such blocks, each of width $\Delta x = \frac{3}{n}$, we get

\[ \sum_{i=0}^{n-1} \left(\frac{4}{3}x_i\right)^2 \Delta x = \sum_{i=0}^{n-1} \frac{16}{9}x_i^2\,\Delta x. \]

The volume is then

\[ \int_0^3 \frac{16}{9}x^2\,dx = \left.\frac{16}{27}x^3\right|_0^3 = 16 \text{ cubic meters}. \]
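The stack-of-blocks estimate is easy to compute directly. The sketch below evaluates the left-hand sum from the solution for a few values of n, with $x_i = i\,\Delta x$; the function name `pyramid_volume_estimate` is ours.

```python
def pyramid_volume_estimate(n):
    """Sum of n square blocks of side (4/3) * x_i and thickness dx = 3/n."""
    dx = 3.0 / n
    return sum((4.0 / 3.0 * (i * dx)) ** 2 * dx for i in range(n))

# The estimates should increase toward the exact volume, 16 cubic meters.
for n in (10, 100, 1000):
    print(n, pyramid_volume_estimate(n))
```

Because each block is sized by its smaller end, every estimate falls a little short of 16, and the shortfall shrinks as n grows.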

Remark. More generally, if we have any solid extending from $x = a$ to $x = b$ whose cross-section perpendicular to the x-axis has area $A(x)$, an argument similar to that of Example 1 shows that its volume is

\[ V = \int_a^b A(x)\,dx. \]

In particular, if each cross-section is a disk of radius $r(x)$, then $A(x) = \pi r^2(x)$ and the volume is given by the usual formula for a solid of revolution:

\[ V = \pi \int_a^b r^2(x)\,dx. \]

Remark. The same principle, one dimension down, applies to the problem of finding the area bounded by some collection of curves. Here the corresponding formula is that the area, A, is the integral of the cross-sectional length in some convenient direction. (The convenient direction for cross-sections is not always vertical!)

Example 2. To express the area bounded by $y = x^2$, $y = 1 - 2x^2$ and the line $y = 1$ (see diagram below) as a single integral, we use horizontal cross-sections and integrate with respect to y.

[Figure: the region bounded by $y = x^2$, $y = 1 - 2x^2$ and $y = 1$, with a typical horizontal section marked]

Thus to find the horizontal distance between the curves we must solve each equation for x (getting $x = \sqrt{y}$ and $x = \sqrt{\frac{1 - y}{2}}$) and take the difference. At a given height y, the horizontal distance between the curves is $\sqrt{y} - \sqrt{\frac{1 - y}{2}}$. The curves intersect when $3x^2 = 1$, that is, at $x = 1/\sqrt{3}$, $y = 1/3$, so we must integrate the horizontal sections between $y = 1/3$ and $y = 1$ to get an area of

\[ A = \int_{1/3}^1 \left(\sqrt{y} - \sqrt{\frac{1 - y}{2}}\right) dy = \frac{2}{3} - \frac{2}{9}\sqrt{3}. \]

If we insisted on finding this area with vertical cross-sections, we would need to consider separately the region to the left of $x = 1/\sqrt{3}$, where the lower curve is $y = 1 - 2x^2$, and the region to the right of $x = 1/\sqrt{3}$, where the lower curve is $y = x^2$.
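As a sanity check on the value $\frac{2}{3} - \frac{2}{9}\sqrt{3}$, we can add up thin horizontal sections numerically. This is a sketch only; `midpoint_sum` and `horizontal_width` are names we introduce here.

```python
import math

def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def horizontal_width(y):
    """Length of the horizontal section at height y, for 1/3 <= y < 1."""
    return math.sqrt(y) - math.sqrt((1 - y) / 2)

area = midpoint_sum(horizontal_width, 1.0 / 3.0, 1.0, 100_000)
exact = 2.0 / 3.0 - 2.0 * math.sqrt(3.0) / 9.0
print(area, exact)
```

Both numbers should be close to 0.2818.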

Remark. Sometimes instead of a flat cross-section it is more convenient to think of a solid as approximately a collection of concentric hollow cylinders. In the next example the height of the solid is constant on a collection of concentric circles. Thus we can approximate the volume as the sum of volumes of concentric thin cylinders. Also, instead of writing an exact volume for each thin cylinder (the difference in volume of two solid cylinders) we can proceed more crudely by “unrolling” each cylinder into a thin rectangular block whose larger dimensions are the circumference of the inner circle and the height of the cylinder.

Example 3. After the eruption of Mt. St. Helens, the depth of volcanic ash that settled x feet from the volcano was approximately $\frac{10^{16}}{(x + 10{,}000)^4}$ feet. Set up a definite integral for the total volume of volcanic ash that fell within a distance b of the volcano.

Solution. If the depth of ash were constant, the total volume of ash would simply be

volume = depth × area.

Since depth is not constant, we must break the total area up into smaller pieces on which it is approximately constant. The area within a distance b of the volcano is a circular region centered at the volcano with radius b. We are assuming that the depth of ash depends only on distance from the volcano and not on direction, so that depth is constant on each circle centered at the volcano. (Of course this assumption is not quite accurate, but it makes the model a lot simpler.) Thus the depth would be approximately constant on a thin ring centered at the volcano.

The area of a ring whose inner boundary is $x_i$ feet from the volcano and whose outer boundary is $x_i + \Delta x$ feet from the volcano is approximately $2\pi x_i \Delta x$ square feet. (Cut the ring and “straighten” it into a rectangle of length $2\pi x_i$ and width $\Delta x$.)

[Figure: a thin ring of inner radius $x_i$ and width $\Delta x$, cut and straightened into a rectangle of length $2\pi x_i$ and width $\Delta x$]

Thus the volume of ash covering this ring is approximately

\[ 2\pi x_i \frac{10^{16}}{(x_i + 10{,}000)^4}\,\Delta x \text{ cubic feet}. \]

If we divide the interval $0 \le x \le b$ into n equal subintervals of size $\Delta x = \frac{b}{n}$ corresponding to n concentric thin rings, and add the estimates of the volume of ash on each ring, we get

\[ \sum_{i=0}^{n-1} 2\pi x_i \frac{10^{16}}{(x_i + 10{,}000)^4}\,\Delta x \text{ cubic feet}. \]

This is a Riemann sum for the following integral, which then represents the total volume of ash within a distance b of Mt. St. Helens:

\[ \int_0^b 2\pi x \frac{10^{16}}{(x + 10{,}000)^4}\,dx = 2 \cdot 10^{16}\pi \int_0^b \frac{x}{(x + 10{,}000)^4}\,dx. \]

Question: How would you evaluate this integral?

What we have just done is essentially an integration by the so-called “method of shells.”
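The ring-by-ring estimate can be carried out numerically and compared with one possible exact evaluation. The sketch below assumes the substitution $u = x + c$ with $c = 10{,}000$, which gives the antiderivative $-\frac{1}{2u^2} + \frac{c}{3u^3}$ for $\frac{u - c}{u^4}$; this is only one route to the answer, and the function names are ours.

```python
import math

C = 10_000.0  # the constant in the depth formula, in feet

def depth(x):
    """Approximate ash depth in feet at distance x feet from the volcano."""
    return 1e16 / (x + C) ** 4

def ash_volume_rings(b, n=100_000):
    """Sum over n thin rings of (circumference) * (depth) * (ring width)."""
    dx = b / n
    return sum(2 * math.pi * (i * dx) * depth(i * dx) * dx for i in range(n))

def ash_volume_exact(b):
    """Closed form from the substitution u = x + C (an assumed route)."""
    u = b + C
    return 2e16 * math.pi * (1 / (6 * C ** 2) - 1 / (2 * u ** 2) + C / (3 * u ** 3))

b = 50_000.0
print(ash_volume_rings(b), ash_volume_exact(b))
```

Under these assumptions both values should be close to $9.7 \times 10^7$ cubic feet for $b = 50{,}000$.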

EXERCISES.

For each problem, write a Riemann sum that approximates the desired quantity, and illustrate with a diagram that shows the contribution of a typical term of the sum. Then complete the problem.

1. Determine the area of each region. Illustrate with a careful diagram showing a typical cross-section.

(a) The region bounded by $y = x^3$, $y = 1/x$, and $y = 3$.

(b) The region in the third and fourth quadrants bounded by $y = x^3$, the line tangent to this curve at $(1, 1)$ and the x-axis. Use both horizontal and vertical cross-sections.

(c) The region in the first quadrant bounded by the same curves as the previous part. Use cross-sections in the more convenient direction.

(d) The region bounded by $y = x$ and $x = y^2 - 2$.

2. Fix $a > 0$.

(a) Determine the area of the region bounded by $y = x^2$ and $y = a$.

(b) Determine the area of the largest rectangle whose top is along $y = a$ and which lies above $y = x^2$.

(c) How does the ratio of the area in (b) to the area in (a) behave as $a \to \infty$? As $a \to 0$?

3. Determine the volume of the solid generated by revolving the region bounded by the x and y axes and the curve $y = 9 - x^2$

(a) around the x-axis

(b) around the y-axis

(c) around the line $y = -2$.

4. Let R be the region in the plane bounded by the graphs of $y = \frac{1}{x}$, $y = \frac{1}{x + 1}$ and $x = 1$.

(a) Determine the area of R.

(b) Determine the volume of the region obtained by revolving R around the x-axis.

(c) Determine the volume of the region obtained by revolving R around the y-axis.

5. Determine the volume of each solid.

(a) The solid lies between planes perpendicular to the x-axis at $x = 0$ and $x = 4$ with cross-sections perpendicular to the x-axis being squares whose diagonals run from $y = -\sqrt{x}$ to $y = \sqrt{x}$. (A sort of mountain tent on its side.)

[Figure: top view of solid, with the edge of a cross-section marked, and a three-dimensional view]

(b) The solid lies between planes perpendicular to the x-axis at $x = -2$ and $x = 2$ with cross-sections perpendicular to the x-axis being disks whose diameters run from $y = x^2/4$ to $y = 2 - x^2/4$. (Shaped something like a football.)

[Figure: top view of solid, with the edge of a cross-section marked]

(c) The base of the solid is the disk $x^2 + y^2 \le 1$. Cross-sections by planes perpendicular to the x-axis are isosceles right triangles with one leg in the disk (from one side of the circle to the other).

[Figure: 6.1 #5c]

(d) The solid lies between planes perpendicular to the x-axis at $x = \pi/4$ to $x = 5\pi/4$ with cross-sections being disks whose diameters run from $y = 2\cos x$ to $y = 2\sin x$.

[Figure: 6.1 #5d]

(e) The solid lies between planes perpendicular to the x-axis at $x = 0$ to $x = 6$ with cross-sections being squares whose bases run from the x-axis to the curve $x^{1/2} + y^{1/2} = \sqrt{6}$.

[Figure: 6.1 #5e]

6. Determine the volume of a pyramid whose base is an equilateral triangle of side s and whose height is h.

7. Determine the volume of a solid whose base is a circle of radius 2 if vertical cross-sections parallel to the x-axis are equilateral triangles.

8. Pete, Piet, and Peter want to divide a 14 inch pizza into three parts of equal area with two parallel straight cuts of a knife. Where should they cut the pizza?

9. What fraction of the volume of a sphere is contained between parallel planes that trisect the diameter to which they are perpendicular?

10. What is the largest possible fraction of the volume of a sphere that can be occupied by a right circular cone inside the sphere? (Both vertex and base are on the surface of the sphere.)

11. The depth of rain at a distance r from the center of a storm is $g(r)$ feet. Set up a definite integral for the total volume of rain (in cubic feet) that falls between 1000 feet and 2000 feet from the center of the storm.

12. A hole is bored all the way through a sphere by a drill whose diameter is equal to the radius of the sphere, and whose point passes through the center of the sphere. Determine the volume that remains.

13. A wedding ring is a solid of revolution whose inside surface is a cylinder of radius R and height H. The thickness at its top and bottom edges is T and the outer edge of a section is the arc of a circle centered on the axis of symmetry as in the diagram below. Show that the volume of the ring is $\frac{\pi H}{6}\left(H^2 + 12RT + 6T^2\right)$.

[Figure: cross section of ring, showing R, T, H and the axis of symmetry]

14. A gas station’s fuel tank is a cylinder of diameter d feet and length ℓ feet which is buried on its side so that the circular ends of the tank are vertical.

(a) You lower a stick through a hole at the highest point of the tank and pull it back up to find that the depth of fuel in the tank is h feet. What is the volume of the fuel in the tank?

(b) Make a rough graph of the area of a vertical cross-section through the fuel as a function of the depth h of the cross-section. (Think about how the rate of change of the area function changes as h increases.)

(c) If the tank had the shape of a rectangular box of length ℓ, height d, and width w feet, and if you find h feet of the stick wet, then you can conclude that the tank contains h/d of its total possible fuel capacity. For the cylindrical tank, for which values of h does the tank contain h/d of its total capacity? For which values of h is h/d an overestimate of the fraction of the tank’s volume that the fuel occupies? For which values of h is h/d an underestimate?

15. A vertical circular disk of radius R is partially submerged in water and rotated about its center. To what depth should it be lowered to maximize the wetted area above water?

16. Find the area of the region inside a square of side 2 consisting of all points closer to the center of the square than to the edge. (Be very careful when you identify the points equidistant from the center and the edge.)

6.2 Finding Physical Quantities

Here is a fairly simple first example. (It is very similar to Example 2 of section 5.1. The other two examples of that section are also illustrations of the techniques we will be studying here.)

Example 4. Suppose a bar 1 meter long has linear density $50 + 3x$ grams per centimeter a distance x centimeters from the left hand end. Find the mass of the bar.

Solution. If the bar had constant linear density, say 50 grams per centimeter, then we could simply find the mass by multiplication: mass = (50 g/cm)(100 cm) = 5000 g = 5 kg.

We can approximate the mass of a bar with variable density by chopping the bar into small pieces, each of length $\Delta x$, computing the approximate mass of each piece by assuming the density is constant on it, and then adding the masses of all the pieces to get the total mass.

[Figure: one piece from the bar, extending from x to $x + \Delta x$, on a bar running from 0 to 100]

If the piece extends from $x_i$ to $x_i + \Delta x$ then the density throughout the piece is approximately $50 + 3x_i$ g/cm (the exact density at the left end of the piece) and the mass is approximately

\[ (50 + 3x_i \text{ g/cm})(\Delta x \text{ cm}) = (50 + 3x_i)\,\Delta x \text{ grams}. \]

This makes the total mass of the bar approximately

\[ \sum_{i=0}^{n-1} (50 + 3x_i)\,\Delta x \text{ grams}. \]

Now this is a Riemann sum (a left hand sum, in fact) for the integral $\int_0^{100} (50 + 3x)\,dx$. Thus

\[ \text{Total Mass} = \int_0^{100} (50 + 3x)\,dx = \left.\left(50x + \frac{3x^2}{2}\right)\right|_0^{100} = 5{,}000 + 15{,}000 = 20{,}000 \text{ g} = 20 \text{ kg}. \]
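To watch the sum approach 20,000 grams, we can compute it for increasing n. The sketch below evaluates the sum with the density taken at the left end of each piece and, for comparison, at the right end; the function names are ours.

```python
def density(x):
    """Linear density in g/cm at x centimeters from the left end."""
    return 50 + 3 * x

def left_sum(n):
    """Mass estimate using the density at the left end of each piece."""
    dx = 100.0 / n
    return sum(density(i * dx) * dx for i in range(n))

def right_sum(n):
    """Mass estimate using the density at the right end of each piece."""
    dx = 100.0 / n
    return sum(density((i + 1) * dx) * dx for i in range(n))

# Columns: n, left estimate, right estimate, their difference.
for n in (10, 100, 1000):
    print(n, left_sum(n), right_sum(n), right_sum(n) - left_sum(n))
```

The left estimates rise toward 20,000, the right estimates fall toward it, and the gap between them shrinks in proportion to 1/n.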

Remark. What if we used a different estimate for the density on the piece extending from $x_i$ to $x_i + \Delta x$? For instance, if we used the density at the right end of the piece, $50 + 3(x_i + \Delta x)$, we would estimate the total mass as

\[ \sum_{i=0}^{n-1} \left[50 + 3(x_i + \Delta x)\right]\Delta x. \]

The difference between this estimate and the previous one, using the fact that when there are n subintervals the length of each one is $\Delta x = \frac{100}{n}$, is

\[ \sum_{i=0}^{n-1} 3(\Delta x)^2 = n \cdot 3(\Delta x)^2 = 3n \cdot \left(\frac{100}{n}\right)^2 = \frac{30{,}000}{n}. \]

As $n \to \infty$, this quantity decreases to zero. The point is that using the different estimate does not affect the outcome. In particular, in any such process, you can just ignore any terms involving $(\Delta x)^2$ (or higher powers of $\Delta x$) because the total effect of such terms decreases as the number of intervals increases even though the number of terms increases.

Example 5. Find the gravitational force exerted by a homogeneous rod of length 2 meters and mass 4 kg on a point mass of 3 kg located 1 meter from one end of the rod and in line with it.

Solution. The gravitational force between point masses of size M and m located a distance r apart is $\frac{GMm}{r^2}$. In these units $G = 6.673 \times 10^{-11}$ newton-m²/kg². We approximate the total force exerted by the rod by dividing it into short sections, each of which is approximately a point mass.

[Figure: the point mass and the rod, with a short section of length $\Delta x$ at distance $x_i$ from the near end of the rod, and so at distance $1 + x_i$ from the point mass]

Let x denote distance from the end of the rod closest to the point mass. The mass of the part of the rod from $x_i$ to $x_i + \Delta x$ is (since the rod is homogeneous) just proportional to its length. Since the total length is 2 meters, the mass of a piece of length $\Delta x$ is $\frac{\Delta x}{2} \cdot 4 = 2\Delta x$ kg. The gravitational attraction exerted by this piece on the point mass is then $\frac{G \cdot 3 \cdot 2\Delta x}{(1 + x_i)^2}$. Summing over n intervals of length $\Delta x$ gives

\[ \sum_{i=0}^{n-1} \frac{6G}{(1 + x_i)^2}\,\Delta x. \]

The total force is then given by the integral

\[ \int_0^2 \frac{6G}{(1 + x)^2}\,dx = \left.-6G\,\frac{1}{1 + x}\right|_0^2 = 6G\left(1 - \frac{1}{3}\right) \approx 2.7 \times 10^{-10} \text{ newtons}. \]
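The same sum can be evaluated numerically as a check on the arithmetic. This sketch treats each short section as a point mass at its midpoint (a slight variation on the left-endpoint sum in the solution, chosen here only because it converges faster); the names `G` and `rod_force` are ours.

```python
G = 6.673e-11  # gravitational constant used in the text, in newton-m^2/kg^2

def rod_force(n=10_000):
    """Sum the attraction of n short rod sections on the 3 kg point mass."""
    dx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx          # midpoint of the section, measured from the near end
        piece_mass = 2.0 * dx       # the rod has 2 kg per meter
        total += G * 3.0 * piece_mass / (1.0 + x) ** 2
    return total

print(rod_force(), 4 * G)
```

Both values should be about $2.67 \times 10^{-10}$ newtons, in agreement with $6G\left(1 - \frac{1}{3}\right) = 4G$.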

Remark: Often a variable quantity varies over a region in a way which involves change in only one direction. As a guide to setting up the sum, look for sets (usually a line or curve) on which the variable quantity is constant. The direction perpendicular to these sets is the direction to integrate with respect to. Get a sum by “fattening” the curves of constancy a bit to get rectangles or rings (or whatever) that are small in the direction that the density (or whatever) is changing. Here is an example of this technique.

Example 6. The kinetic energy of an object of mass m kilograms moving at a velocity of v meters per second is defined to be $\frac{1}{2}mv^2$ joules. Find the kinetic energy of a thin rectangular metal plate, 6 by 10 centimeters with density 40 kilograms per square meter, if it is spinning around one of its longer edges 3 times per second.

Solution. The difficulty is that although all parts of the plate have the same angular speed, the linear speed of a point on the plate depends on its distance from the edge about which the plate is rotating. In fact, a point x meters from that edge is rotating through a circle of circumference $2\pi x$ meters 3 times a second, and so has linear speed $6\pi x$ m/sec.

[Figure: end view of the plate, showing the direction of rotation and the 6 cm width, and side view of the plate, showing the axis of rotation and a strip of width $\Delta x$]

Thus we estimate the kinetic energy of the plate by dividing it into thin strips of length 10 cm = 0.1 meter and width $\Delta x$ meters parallel to the edge about which the plate is spinning. The mass of such a strip is m = (density)(area) = $40 \cdot 0.1\,\Delta x$ kilograms, so the kinetic energy of such a strip lying between $x_i$ and $x_i + \Delta x$ meters from the axis of rotation is approximately

\[ \frac{1}{2}mv^2 = \frac{1}{2}(4\Delta x)(6\pi x_i)^2 = 72\pi^2 x_i^2\,\Delta x \text{ joules}. \]

If we divide the interval $0 \le x \le 0.06$ (the plate is 6 centimeters wide in the direction perpendicular to the axis of rotation) into n equal subintervals of width $\Delta x = \frac{0.06}{n}$ and add the estimates of kinetic energy from each strip, we get this estimate for the total kinetic energy:

\[ \sum_{i=0}^{n-1} 72\pi^2 x_i^2\,\Delta x \text{ joules}. \]

The total kinetic energy is then given by the integral

\[ \int_0^{0.06} 72\pi^2 x^2\,dx = 24\pi^2 \int_0^{0.06} 3x^2\,dx = 24\pi^2 \cdot (0.06)^3 = 0.005184\pi^2 \text{ joules} \approx 0.0512 \text{ joules}. \]

Remark. If we made the previous problem more elaborate by replacing the homogeneous plate by one whose density is $2000x$ kg/m² a distance x meters from the axis of rotation (i.e. the plate gets denser as you move away from the axis of rotation), the only change in the solution above would be that the mass of a thin strip a distance x from the axis of rotation would now be approximately

\[ (\text{density})(\text{area}) = 2000x \cdot 0.1\,\Delta x = 200x\,\Delta x \text{ kg}. \]

This makes the estimate for the kinetic energy of a single piece a distance $x_i$ meters from the axis $\frac{1}{2}(200x_i\,\Delta x)(6\pi x_i)^2 = 3600\pi^2 x_i^3\,\Delta x$. The corresponding Riemann sum suggests the integral

\[ \int_0^{0.06} 3600\pi^2 x^3\,dx = 900\pi^2 \cdot (0.06)^4 = 0.011664\pi^2 \text{ joules} \approx 0.1151 \text{ joules}. \]
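Both versions of the plate problem follow the same recipe, so a single routine can handle them by taking the density as a parameter. The sketch below is illustrative only: the function `kinetic_energy` and its parameter names are ours, and each strip is evaluated at its midpoint.

```python
import math

def kinetic_energy(density, n=10_000, width=0.06, length=0.1, rev_per_sec=3.0):
    """Add up (1/2) m v^2 over n thin strips parallel to the axis of rotation.

    density(x) is the mass per unit area in kg/m^2 at x meters from the axis.
    """
    dx = width / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx                       # distance of the strip from the axis
        mass = density(x) * length * dx          # (density)(area) of the strip
        speed = 2 * math.pi * x * rev_per_sec    # circumference times revolutions per second
        total += 0.5 * mass * speed ** 2
    return total

uniform = kinetic_energy(lambda x: 40.0)         # homogeneous plate of Example 6
variable = kinetic_energy(lambda x: 2000.0 * x)  # plate that gets denser away from the axis
print(uniform, variable)
```

The two printed values should be close to 0.0512 and 0.1151 joules respectively.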

Summary. In each of the examples, we are trying to find a total quantity in a situation where if all of the individual elements of the problem remained constant, we could just multiply. Since one or more elements is changing as a function of distance in one direction, we approximate the total quantity by breaking the problem into small pieces (measured in that direction, not necessarily small in other directions), on each of which we can pretend, more or less, that everything is constant, and then adding the contribution from each piece. We recognise this sum as a Riemann sum for a certain integral, and take this integral to be the solution to the original problem.

EXERCISES

For each problem, write a Riemann sum approximating the desired quantity, draw a diagram showing the contribution of a typical term in the sum, and then write down the integral suggested by the sum.

1. Find the mass of a circular region of radius 6 meters if the density (massper unit area here) in kilograms per square meter at a point in the regionis equal to the square of the distance from the center of the region.

2. Write down an integral representing the mass of the earth if we assumethat the earth is a sphere of radius 3960 miles, and that the density ofthe earth is a function ρ (r) that depends only on the distance r fromthe center of the earth.

3. The rectangular plate of Example 6 is rotated about the line down themiddle of the sheet parallel to the long edges at 3 revolutions per second.Set up an integral for the total kinetic energy of the plate now. Do youexpect it to be more or less than in Example 6? Evaluate the integral tofind out whether you’re right.

4. An audio CD has a radius of 12, cm a hole at its center of radius 0.75 cmand mass 17 grams, which we will assume is uniformly distributed (so itsmass per unit area is constant). It rotates at “constant linear velocity”of 1.3 m/sec. This means that the angular velocity (in revolutions persecond) varies between about 3.5 and 9 depending on whether the headis reading from near the center of the disk (spins fast) or from near theedge (spins more slowly). Taking an intermediate value of 5 revolutionsper second, set up an integral for the CD’s kinetic energy in joules.(Unitsare kg(meter/sec)2)

FINDING PHYSICAL QUANTITIES 285

Page 286: Notes on Calculus

5. (Poiseuille’s law of blood flow) The fluid flowing through a circular pipedoes not all move at the same speed. The speed of any part of the fluiddepends on its distance from the center of the pipe — the part in themiddle moves fastest; the part nearest the walls moves slowest. Assumethat the speed of the fluid at a distance r from the center is g (r) cm/sec.Estimate the flow of fluid (in cubic centimeters per second) through athin ring of inner radius r and outer radius r +∆r. Use this to set upan integral for the flow through a pipe of radius b centimeters.

Poisueille (about 1850), studying the flow of blood through arteries usedthe function g (r) = k (b2 − r2) where k is a constant. Show that in thiscase the flow of blood through an artery is proportional to the fourthpower of the radius of the artery.

6. Suppose that the rod in Example 5 has variable density, say the density adistance x meters from the end nearest the point mass is 2+x kg/meter.Now compute the gravitational attraction of the rod on the point mass.

7. Determine the gravitational force exerted by a homogeneous semicircularwire of total massM and radius r on a very small mass m located at thecenter of the semicircle (i.e. at the center of the circle of which this ishalf). Suggestion: Consider the force exerted by suitable pairs of shortpieces of wire. You will need to think in vector terms, but you will beable to do a calculation of the force in a single direction.

8. A homogeneous circular piece of wire of radius r and total mass Mis oriented so that it is perpendicular to the line from a point massm through the center of the circle. Determine the gravitational forceexerted by the circular wire on the point mass if the distance from thepoint mass to the center of the circle is h.

9. Determine the gravitational force exerted by a homogeneous bar of length2h and total mass M on a point mass m if the bar is perpendicular tothe line from the point mass that passes through the middle of the bar(as in the diagram) and the perpendicular distance from the mass to thecenter of the bar is r.

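[Diagram: a point mass at perpendicular distance r from the center of a bar of mass M; the bar extends a distance h on each side of its center.]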


6.3 Finding the Total Change

As mentioned at the beginning of the chapter, representing a quantity by an integral generally starts with a simple formula. The formula does not apply directly because one (or more) of the quantities assumed constant in the formula is not constant in the situation under consideration. So we estimate by dividing the situation up into small pieces on each of which the non-constant quantity (or quantities) is approximately constant. The total estimate is the sum of all of the estimates of pieces. Finally, recognize this sum as a Riemann sum for some integral. This is the integral we want.

Example 7. A painting company bases its charges to paint a house exterior on both the area painted and height above the ground: say, for a small section of wall of area \(A\) square feet that is \(h\) feet above ground level they charge \(h^2 A\) dollars. How much would they charge to paint an exterior wall that is 8 feet high and 30 feet long? How much if it is 12 feet high?

A horizontal strip of wall 30 feet long and between \(y\) and \(y + \Delta y\) feet above the ground would have area \(30\,\Delta y\) square feet and would cost about

\[ \text{height}^2 \times \text{area} = y^2 (30\,\Delta y) \text{ dollars.} \]

The total cost of painting the wall would then be a sum of the form

\[ \sum_{k=1}^{n} y_k^2 (30\,\Delta y) = 30 \sum_{k=1}^{n} y_k^2\,\Delta y \text{ dollars.} \]

This is a Riemann sum for

\[ 30 \int_0^8 y^2\,dy = 30 \cdot \frac{8^3}{3} = 5120 \text{ dollars.} \]

If the height is 12 feet, the cost is \(30 \int_0^{12} y^2\,dy = 17{,}280\) dollars. This company’s painters do not seem to like heights!
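The passage from strips to integral in Example 7 can be checked numerically. The following is only a sketch: the function name `painting_cost`, the choice of sampling each strip at its midpoint, and the number of strips are my own assumptions, not part of the example.

```python
def painting_cost(height_ft, length_ft=30.0, n=100_000):
    """Midpoint Riemann sum for the cost of painting a wall.

    A strip at height y with area length_ft * dy is charged
    y**2 * (length_ft * dy) dollars, as in Example 7.
    """
    dy = height_ft / n                      # thickness of each horizontal strip
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * dy                  # sample height in the middle of strip k
        total += y**2 * (length_ft * dy)    # cost of strip k
    return total


for h in (8, 12):
    print(h, "foot wall:", round(painting_cost(h), 2), "dollars")
```

With this many strips the sums should land very close to the integral values 5120 and 17,280 found above.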

Example 8. How much would you have to deposit now in order to have $1000 a year from now if you can get 4% annual interest compounded continuously? How much to get $1000 five years from now? If you deposit \(P\) dollars now at 4% interest compounded continuously, then after \(t\) years you will have \(Pe^{0.04t}\) dollars. Thus to have $1000 a year from now you would need to deposit \(P\) dollars where \(Pe^{0.04} = 1000\), or \(P = 1000e^{-0.04} \approx 960.79\) dollars. Similarly, to have $1000 five years from now, you would need to deposit \(1000e^{-0.2} \approx 818.73\) dollars. Economists express this by saying that the present value of $1000 a year from now is $960.79 and the present value of $1000 five years from now is $818.73; that is, the present value of a sum is the amount you would have to invest now in order to have the specified amount at the specified time.

Large lottery wins are often paid over time; say, a $1,000,000 prize might be paid in ten annual installments of $100,000 starting a year from now. The present value of this prize is the amount you might be offered as a lump sum settlement. Clearly it is just the sum of the present values of the ten individual payments. It is therefore

\[ 100{,}000 \sum_{k=1}^{10} e^{-0.04k} \approx \$807{,}830. \]
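The present value arithmetic above is easy to reproduce. This is a minimal sketch; the helper name `present_value` is hypothetical, and the 4% rate is the one assumed in Example 8.

```python
import math


def present_value(amount, rate, years):
    """Amount to deposit now, at continuously compounded interest,
    in order to have `amount` after `years` years."""
    return amount * math.exp(-rate * years)


one_year = present_value(1000, 0.04, 1)
five_years = present_value(1000, 0.04, 5)

# Ten annual lottery payments of $100,000, the first a year from now.
lottery = sum(present_value(100_000, 0.04, k) for k in range(1, 11))

print(round(one_year, 2), round(five_years, 2), round(lottery))
```

The three printed values should agree with the figures quoted in the text once they are rounded the same way.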

Now we are ready for the calculus question. If someone has available a continuous stream of income amounting to \(f(t)\) dollars per year at time \(t\) years from now, what is the present value of this income stream for the period from now until \(T\) years from now? We can approximate the answer with a sum by dividing the period of \(T\) years into a large number \(n\) of short time periods of length \(\Delta t = T/n\) years, that is, introducing a partition of \([0, T]\) by equally spaced points \(t_0 = 0, t_1 = \Delta t, t_2 = 2\Delta t, \ldots, t_n = T\). The income for the period \(t_k \le t < t_{k+1}\) is approximately \(f(t_k)\,\Delta t\) and the present value of this income, assuming continuously compounded interest with constant interest rate \(r\), is \(f(t_k)\,e^{-r t_k}\,\Delta t\). Thus the present value of the income stream is approximately

\[ \sum_{k=0}^{n-1} f(t_k)\,e^{-r t_k}\,\Delta t. \]

This sum becomes a better estimate as \(n\) increases. The present value of the income stream is therefore the integral for which this is a Riemann sum, that is, the present value is

\[ V = \int_0^T f(t)\,e^{-rt}\,dt. \]
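To see the Riemann sum for \(V\) at work, here is a small sketch. The function name `income_stream_pv` and the test case (a constant stream of $100,000 per year for ten years at 4%) are my own choices for illustration; the constant stream is the continuous analogue of the lottery prize above.

```python
import math


def income_stream_pv(f, r, T, n=100_000):
    """Left-endpoint Riemann sum for the present value of an income
    stream f(t) (dollars per year) over [0, T] at interest rate r."""
    dt = T / n
    total = 0.0
    for k in range(n):
        t_k = k * dt                              # left endpoint of period k
        total += f(t_k) * math.exp(-r * t_k) * dt
    return total


# Constant stream of $100,000 per year for 10 years at 4%.
approx = income_stream_pv(lambda t: 100_000.0, 0.04, 10)

# For a constant stream c the integral can be done by hand: c(1 - e^{-rT})/r.
exact = 100_000 * (1 - math.exp(-0.4)) / 0.04

print(round(approx), round(exact))
```

The two numbers should agree to within a few dollars, and both come out somewhat larger than the present value of the ten lump payments, since the continuous stream starts paying immediately.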

EXERCISES

For each problem, write a Riemann sum approximating the desired quantity, draw a diagram showing the contribution of a typical term in the sum, and then write down the integral suggested by the sum.

1. An animal population is increasing at a rate of \(200 + 50t\) individuals per year, where \(t\) is measured in years. By how much does the population increase from the beginning of the fourth year to the beginning of the tenth year?

2. A simple model of subsistence agriculture is to imagine a circular range of production of radius \(a\) centered at a base of production (a village, say). Let \(G(x)\) denote the density of food (in calories per square meter) produced a distance \(x\) meters from the center. Write a definite integral that expresses the total number of calories produced within the range of production. Suggestion: On what kind of region of small area is the density of food production approximately constant? What does this suggest about a good way to subdivide the circular range?

3. At time \(t\) hours, \(0 \le t \le 24\), a firm uses electricity at the rate of \(e(t)\) kilowatts.

(a) Suppose that the cost per kilowatt hour is \(c\) dollars. Estimate the cost of electricity consumed between times \(t\) and \(t + \Delta t\). Use this to set up a definite integral for the total cost of electricity for the 24 hour period.

(b) Now suppose that the cost per kilowatt hour is variable (there are cheaper rates for use in off-peak hours), so that it is now a function \(c(t)\). The rate schedule indicates that the cost per kilowatt hour at time \(t\) is \(c(t)\) dollars. Estimate the cost of electricity consumed between times \(t\) and \(t + \Delta t\). Use this to set up a definite integral for the total cost of electricity for the 24 hour period.

4. Suppose that \(\int_3^7 f(x)\,dx\) is the number of gallons of oil that has flowed out of the south end of the Alaska pipeline between hours 3 and 7 of a particular day, say, October 13, 2004.

(a) For the corresponding approximating sum, \(\sum_{k=1}^{n} f(x_k)\,\Delta x\), what does \(\Delta x\) represent? What does \(f(x_k)\,\Delta x\) represent? Be specific.

(b) What does \(f(x)\) represent? What are its units? If \(f(4) = 17{,}000\), what is the physical interpretation?

5. Suppose that \(\int_3^7 f(x)\,dx\) is the cost in dollars of building the Alaska pipeline between mile markers 3 and 7.

(a) For the corresponding approximating sum, \(\sum_{k=1}^{n} f(x_k)\,\Delta x\), what does \(\Delta x\) represent? What does \(f(x_k)\,\Delta x\) represent? Be specific.

(b) What does \(f(x)\) represent? What are its units? If \(f(4) = 17{,}000\), what is the physical interpretation?

6. Suppose that \(\int_3^7 f(x)\,dx\) is the number of floating point operations done by a computer between 11:03 and 11:07 this morning.

(a) For the corresponding approximating sum, \(\sum_{k=1}^{n} f(x_k)\,\Delta x\), what does \(\Delta x\) represent? What does \(f(x_k)\,\Delta x\) represent? Be specific.

(b) What does \(f(x)\) represent? What are its units? What does \(f(4) = 1.7 \times 10^8\) mean?

7. A machine earns \(4000 - 10t^2\) dollars per year when it is \(t\) years old.

(a) What is the useful life of the machine, assuming the owner never discards machines that are still profitable?

(b) How much does the machine earn during its lifetime? What percentage of the total does it earn during the first half of its working life?

8. The revenue of an internet business is compiled daily, but we may idealize the situation by assuming that we can compute the rate at which revenue is received continuously according to the equation

\[ r(t) = \frac{10^6}{e^{-2t}} \]

where \(t\) is time in years and \(r(t)\) is the rate in dollars per year. (This means that the revenue received between times \(t\) and \(t + \Delta t\) is approximately \(r(t)\,\Delta t\).) Set up and evaluate an integral for the company’s total revenue for one year starting from now (\(t = 0\)). Compute the total revenue for the next three years.

9. It is usually the case that when a new product is put into production the time required to build the \(x\)-th unit decreases as \(x\) increases because of improvements in the production process. Suppose that the time in hours to produce the \(x\)-th unit is \(T(x) = 80 - 0.05(x - 1)\). Predict the total time to build 1000 units. (Suggestion: The total time is a sum, which may be thought of as the sum of areas of rectangles. Approximate this sum by an integral.) Is the value of the integral greater than or less than the sum? Can you estimate approximately how much by looking at a diagram?

10. If a small plot of land of area \(A\) is a distance \(s\) from the irrigation pump, then the cost of irrigating the plot is \(As^3\). What is the cost of irrigating a circular field of radius \(R\) with the pump at its center?

11. The value of a small plot of land of area \(A\) square feet and distance \(s\) feet from the railroad tracks is \(2As\) dollars.

(a) What is the value of a large rectangular plot of land extending 200 feet along the tracks whose nearest side is 10 feet from the tracks and whose farthest side is 100 feet from the tracks?

(b) The plot is subdivided into two triangular plots with a straight line from one corner to the diagonally opposite corner. What is the value of each triangular plot?

12. Let \(f(x)\) be the age density of females in the United States. This means that if \(F(x)\) is the number of females of age \(x\) or less, then \(f(x) = F'(x)\). Thus the number of females between the ages of \(x\) and \(x + \Delta x\) is approximately \(f(x)\,\Delta x\). (It is exactly \(F(x + \Delta x) - F(x) = \int_x^{x+\Delta x} f(s)\,ds\). This amounts to estimating change along the curve by moving along the tangent line.) Assume that the rate of childbirth for a woman whose age is \(x\) at the beginning of a calendar year is \(m(x)\). (Thus, for instance, 1000 women of age \(x\) would produce \(1000\,m(x)\) children during the year.) Set up definite integrals for

(a) the number of women whose ages at the beginning of the calendar year were between \(a\) and \(b\), and

(b) the total number of children born to these women during the calendar year.

13. We will develop a formula for the length of a parametrized curve, \(x = f(t)\), \(y = g(t)\), \(a \le t \le b\).

(a) Write a sum for the total length of the broken line segment obtained in the following way: divide the interval \([a, b]\) into \(n\) parts of equal length \(\Delta t\). Set \(t_0 = a, t_1 = a + \Delta t, t_2 = a + 2\Delta t, \ldots, t_n = a + n\Delta t = b\). The broken line segment approximating the curve consists of the \(n\) line segments joining the points \((f(t_{i-1}), g(t_{i-1}))\) to \((f(t_i), g(t_i))\) for \(i = 1, 2, 3, \ldots, n\).

(b) Recognise the sum from (a) as a Riemann sum. The actual length of the curve is the integral suggested by this Riemann sum. It will help to remember local linearity: \(f(t_i) - f(t_{i-1}) \approx f'(t_i)(t_i - t_{i-1}) = f'(t_i)\,\Delta t\).

(c) Check your answer by using your formula to compute the length of a circle of radius \(r\).

14. Find the length of the loop in the track of a particle moving according to the equations \(x = 3t^2\), \(y = t^3 - 3t\).

15. Find the length of a hypocycloid of four cusps, the track of a particle moving according to the equations \(x = a\cos^3 t\), \(y = a\sin^3 t\).

16. Let \(x = f(t)\), \(y = g(t)\), \(a \le t \le b\), be parametric equations such that \(x\) is an increasing function of \(t\) and \(y \ge 0\) for \(a \le t \le b\).

(a) Write a sum (in terms of the functions \(f\) and \(g\)) that approximates the area of the region bounded on the left and right by \(f(a)\) and \(f(b)\), bounded above by the curve, and bounded below by the \(x\)-axis. For the width of the approximating rectangles you will need to approximate \(\Delta x\) as in #13(b).

(b) From (a), find an integral that expresses the area bounded as in part (a).

17. Use your formula from the previous problem to find the area under one arch of a cycloid. (The parametric equations are in problem 14 of section 3.7.)

18. Use your parametric area formula to find the area of the ellipse \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1\). Parametrize the ellipse to move along the top half of the ellipse from left to right. What difference would it make if the point \((f(t), g(t))\) moves from right to left?

19. Consider the nonuniform parametrization of the unit circle

\[ x = \cos\left((t + \sqrt{\pi})^2\right), \quad y = \sin\left((t + \sqrt{\pi})^2\right). \]

(a) Describe the motion: what is the position at time 0, in what direction is the motion, what is the speed at time \(t\)? How would a particle moving according to these equations behave?

(b) Find the distance \(L\) traveled along the circle from \(t = 0\) to \(t = T\).

(c) Solve your answer to the previous part for \(T\) in terms of \(L\) and plug back into the original parametrization. Explain why you now have the “usual” parametrization except for the position at time 0.

20. Consider the spiral described by the parametric equations \(x = t^2 \cos t\), \(y = t^2 \sin t\).

(a) Show that the straight line distance from the origin to the position at time \(t\) is \(t^2\).

(b) Find the speed \(s(t)\) along the spiral at time \(t\), and show that \(\frac{s(t)}{t^2}\) approaches 1 as \(t \to \infty\).

(c) Find the length \(\ell(T)\) of the spiral from \(t = 0\) to \(t = T\) and show that \(\frac{\ell(T)}{\frac{1}{3}T^3} \to 1\) as \(T \to \infty\).

(d) (A bit messy) Solve the equation giving \(\ell\) in terms of \(T\) for \(T\) and substitute back into the original parametric equations so that now arc length is the parameter. Show that now

\[ \left(\frac{dx}{d\ell}\right)^2 + \left(\frac{dy}{d\ell}\right)^2 = 1, \]

so that your new parametric equations do indeed give motion along the spiral at unit speed.


6.4 Distribution Functions and Probability

The current population of the United States is distributed among people whose ages vary from 0+ (born, but less than a year old) to somewhere over 100 (but probably less than 120). We could make a crude bar chart of the distribution of ages by dividing the interval \([0, 120]\) into a number of intervals of equal size and counting the number of people in each age interval. Equivalently, we could avoid large numbers by making such a chart of the fraction of the population that falls within each interval. One such chart is at the left below, where I have not bothered with ages above 100 because the scale is too crude to get a bar with positive height.

We could also add the fractions to get the fraction of the population whose age is less than or equal to 20 or 40 or... years. For instance, since .29 of the population is aged between 0 and 20 and .31 is between 20 and 40, .29 + .31 = .6 of the population is at most 40 years old. The corresponding chart of cumulative age fractions is on the right below.

[Figure: two bar charts over ages 0 to 100: age fractions (left) and cumulative age fractions (right).]

We can imagine constructing more refined charts using much finer time divisions, perhaps even day by day if we had the data and the patience. In that case the tops of the bars in the cumulative age chart would be very close to an increasing continuous function with value 0 at age 0 and value 1 at the age of the oldest American. For many statistical purposes it is convenient to model the situation by using such a continuous function. This is the reason for the following definition.

A cumulative distribution function (or just distribution function for short, or even cdf) is a non-negative, non-decreasing function \(F\) defined on the real line such that \(F(x) \to 1\) as \(x \to \infty\) and \(F(x) \to 0\) as \(x \to -\infty\). Thus, in general, \(y = 1\) and \(y = 0\) are horizontal asymptotes of \(F\) as \(x \to \infty\) and \(x \to -\infty\) respectively. It is allowed that \(F(x_+) = 1\) for some finite \(x_+\), but then also \(F(x) = 1\) for all \(x \ge x_+\). Similarly, it may happen that \(F(x_-) = 0\) for some finite \(x_-\), but if so, then \(F(x) = 0\) for all \(x \le x_-\). Here are some examples of distribution functions.


EXAMPLE 1. \(F(x) = \frac{1}{\pi}\left(\frac{\pi}{2} + \arctan x\right)\) has the graph

[Graph: y = F(x) for x from -14 to 14.]

EXAMPLE 2.

\[ F(x) = \begin{cases} 1, & x \ge 1, \\ x, & 0 \le x \le 1, \\ 0, & x \le 0 \end{cases} \]

has the graph

[Graph: y = F(x) for x from -1 to 2.]

EXAMPLE 3. \(F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt\) has the graph

[Graph: y = F(x) for x from -5 to 5.]

Note that while generally similar to Example 1, this distribution function approaches its limiting values much faster. (Note the similarity to the erf function! We will see this again below.)
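The difference in how fast Examples 1 and 3 approach their limits can be seen by tabulating the three distribution functions. This is only a sketch; the names `F_arctan`, `F_uniform` and `F_normal` are mine, and for Example 3 I use the standard identity \(F(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)\) rather than integrating numerically.

```python
import math


def F_arctan(x):
    """Example 1: (1/pi) * (pi/2 + arctan x)."""
    return (math.pi / 2 + math.atan(x)) / math.pi


def F_uniform(x):
    """Example 2: 0 for x <= 0, x on [0, 1], 1 for x >= 1."""
    return min(1.0, max(0.0, x))


def F_normal(x):
    """Example 3, written in terms of the erf function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in (-5, -1, 0, 0.5, 1, 5):
    print(x, F_arctan(x), F_uniform(x), F_normal(x))

# How far each function still is from its limiting value 1 at x = 5.
tail_arctan = 1 - F_arctan(5)
tail_normal = 1 - F_normal(5)
print(tail_arctan, tail_normal)
```

At \(x = 5\) the first function is still several percent short of 1, while the third is within about a millionth of it.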

Distribution functions do not need to be continuous as all these examples are, but we will stick to distribution functions that are not only continuous everywhere but also differentiable, at least at all but a few points. The idea is, as the introductory paragraph suggested, that for some quantity which takes various real values, \(F(x)\) measures the fraction of the quantity whose values lie below \(x\).

The slope of the distribution function in Example 2 is constantly equal to 1 where it is non-zero. Thus we are adding to the cumulative amount at a constant rate. Outside the interval \([0, 1]\) the distribution function does not change, reflecting the fact that no values of the quantity being accumulated occur outside \([0, 1]\). For the distribution functions in Examples 1 and 3, however, the rate of accumulation is greatest at \(x = 0\) and decreases as we move away from 0 in either direction, but is never quite zero. Thus values of the quantity far from 0 are rare, but do occur.

To be more precise, the rate of change of \(F\) is of course its derivative \(F'\). We can see that \(F'(x)\) measures the “density” at \(x\) of the quantity for which \(F\) is the cumulative distribution function as follows. The Fundamental Theorem of Calculus tells us that

\[ F(b) - F(a) = \int_a^b F'(t)\,dt. \]

If \(a\) and \(b\) are close together, say \(b = a + \Delta t\), then \(\int_a^{a+\Delta t} F'(t)\,dt \approx F'(a)\,\Delta t\) (left hand sum with one term) so that, in the context of the population of the US, if \(F(x)\) is the fraction of the population with age at most \(x\), then \(F(b) - F(a)\) is the fraction with ages between \(a\) and \(b\) (at most \(b\), but more than \(a\)) and if \(b = a + \Delta t\), then this is approximately equal to (age density at age \(a\)) × (length of time interval).

It is convenient to start from the other end with some more terminology. A probability density function (pdf) is a non-negative function \(f\) defined for all real numbers such that \(\int_{-\infty}^{\infty} f(t)\,dt = 1\). The cumulative distribution function \(F\) defined by \(f\) is

\[ F(x) = \int_{-\infty}^{x} f(t)\,dt. \]

Notice that this function \(F\) has all the properties enumerated above for a cumulative distribution function. Also \(F'(x) = f(x)\) by the FTC again, so \(f\) is the density function for \(F\) in the sense just discussed. The pdf’s from the examples above are

EXAMPLE 1. \(F'(x) = \frac{1}{\pi(1 + x^2)}\) with graph

[Graph: y = F'(x) for x from -5 to 5.]

EXAMPLE 2. \(F'(x) = 1\) for \(0 < x < 1\) and 0 outside \([0, 1]\), so the graph is a step function with discontinuities at \(x = 0\) and 1.

[Graph: y = F'(x) for x from -2 to 2.]

EXAMPLE 3. \(F'(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\) with graph

[Graph: y = F'(x) for x from -5 to 5.]
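The claim \(F' = f\) for Examples 1 and 3 can be spot-checked with difference quotients. A minimal sketch follows; the helper `central_diff`, the step size, and the sample points are my own choices, and Example 3 is again written using erf.

```python
import math


def F1(x):
    return (math.pi / 2 + math.atan(x)) / math.pi


def f1(x):
    return 1.0 / (math.pi * (1.0 + x * x))


def F3(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f3(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def central_diff(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, central_diff(F1, x), f1(x), central_diff(F3, x), f3(x))
```

In each row the difference quotient and the pdf value should agree to many decimal places.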

EXERCISES.

1. Suppose that \(F\) is the cdf for heights in meters of trees in a certain forest.

(a) What is the meaning, in terms of trees, of the statement \(F(7) = 0.6\)?

(b) Which is greater, \(F(6)\) or \(F(7)\)? Explain in terms of trees.

2. An experiment is done to determine the effect of two new fertilizers, A and B, on the growth of peas. The cumulative distribution functions of the heights in feet of the mature peas without treatment and treated with each of fertilizers A and B are graphed below. The unfertilized plants are the solid graph.

[Graph: three cumulative distribution functions for heights from 0 to 2 feet. Unfertilized: solid, A: dots, B: dashed.]

(a) About what height are most of the unfertilized plants? Explain.

(b) Explain in words the effect of fertilizer A and fertilizer B on the typical pea plant.

3. A large number of people take a standardized test. The graph below is the pdf for their scores. Does the graph imply that most people receive a score near 50? Explain why or why not.

[Graph: pdf of test scores; horizontal axis from 0 to 60, vertical axis from 0 to 0.20.]

4. Verify that for any \(c > 0\), the function

\[ f(t) = \begin{cases} ce^{-ct}, & t \ge 0, \\ 0, & t < 0 \end{cases} \]

is a pdf. Find a formula for the corresponding cumulative distribution function and graph it. (Include \(t < 0\) in your graph also.)

5. After monitoring many calls, the telephone company has decided that the duration of calls in minutes is well modelled by the pdf \(p(t) = 0.4e^{-0.4t}\).

(a) What percentage of calls last between 1 and 2 minutes?

(b) What percentage of calls last at most one minute?

(c) What percentage of calls last 10 minutes or more?

6. For a population of individuals with a particular disease, the death density function \(f(t) = cte^{-kt}\) is the pdf for the lifetime of someone contracting the disease, that is, the fraction of individuals who die between \(t\) and \(t + \Delta t\) years after contracting the disease is approximately \(f(t)\,\Delta t\). Here \(c\) and \(k\) are constants that depend on the disease.

(a) Find \(c\) in terms of \(k\).

(b) If 40% of those contracting the disease die within 5 years, find \(c\) and \(k\).

(c) Find the cumulative distribution function for the disease from (b).

7. Explain the connection between the functions \(f\) and \(F\) of problem 4 of the previous section and the concepts of cdf and pdf.

6.4.1 Averages and Spread

The study of probability density functions and their associated cumulative distribution functions is important because they are central to the use of statistics to organize and study empirical data. As part of the organization, there are ways to measure the “center” and the degree of spread of a distribution. The easiest “center” to define is the median. For a finite collection of numbers, the median is just the one in the middle. For a pdf \(f\) and associated cdf \(F\), we define the median to be the value of \(x\) such that \(F(x) = .5\). Since half of the distribution has values below \(x\) and the other half has values above \(x\), this is clearly the same usage.

The definition of the mean of a distribution is a little more sophisticated. For a finite collection of \(N\) numbers, \(\{x_1, x_2, \ldots, x_N\}\), the mean is

\[ m = \frac{1}{N} \sum_{j=1}^{N} x_j. \]

Note that \(mN = \sum_{j=1}^{N} x_j\), that is, if we changed all of the numbers in the given collection to \(m\), the sum of the numbers would remain the same. Consider the problem of defining a mean age for the US population. Of course the population is finite, so all we have to do is add up all the ages and divide by the current population \(N\) of about 300,000,000. But we could model it as follows. We introduced cumulative distribution functions by imagining the cdf \(F\) of the US population: \(F(x)\) is the fraction of the population whose age is less than \(x\). We know that then \(f(x) = F'(x)\) is the pdf for the population. If we divide the range of possible ages, say \([0, 120]\), into a large number \(n\) of very short times \(\Delta t = 120/n\), then the fraction of people with ages between \(t\) and \(t + \Delta t\) is approximately \(f(t)\,\Delta t\). The number of people with ages in this range is then about \(N f(t)\,\Delta t\). Since each person in this group has age very near \(t\), the sum of the ages of these people is approximately \(N t f(t)\,\Delta t\). Thus the sum of all ages of all Americans is about

\[ N \sum_{j=1}^{n} t_j f(t_j)\,\Delta t \]

and the average age is obtained by dividing by \(N\) to get \(\sum_{j=1}^{n} t_j f(t_j)\,\Delta t\). But this is a Riemann sum for the integral \(\int_{-\infty}^{\infty} t f(t)\,dt = \int_0^{120} t f(t)\,dt\) since we may assume \(f(t) = 0\) outside the interval \([0, 120]\). The integral is thus about as good a measure as we are likely to get for the average age of Americans.

Thus motivated, we define the mean of a pdf \(f\) to be the number

\[ \mu = \int_{-\infty}^{\infty} t f(t)\,dt. \]

Of course this is an improper integral, but in practice that is not a problem, since the integral converges. In fact, as with the example of average age above, the function \(f\) is often equal to zero outside some bounded interval, so the integral is not really improper after all.


EXAMPLES. For the step function pdf of Example 2 above, this integral takes the form

\[ \int_{-\infty}^{\infty} t f(t)\,dt = \int_0^1 t \cdot 1\,dt = \frac{1}{2}. \]

For the pdf \(f(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\) of Example 3 we find

\[ \mu = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} t e^{-t^2/2}\,dt + \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} t e^{-t^2/2}\,dt = 0 \]

where we need to separate the parts in order to see that each converges separately. (They do; in fact \(\int_0^b t e^{-t^2/2}\,dt = \left. -e^{-t^2/2} \right|_0^b \to 1\) as \(b \to \infty\) and it is clear from symmetry that then \(\int_{-\infty}^{0} t e^{-t^2/2}\,dt = -1\).)

The other quantity we need to discuss is a measure of spread. Presumably perfect data, say repetitions of an experiment, would all be identical. In practice, there will always be a spread around the mean, and it is useful to measure the average deviation from the mean in some way. The quantity usually chosen for this purpose is the standard deviation, defined for a pdf \(f\) by

\[ \sigma = \sqrt{\int_{-\infty}^{\infty} (t - \mu)^2 f(t)\,dt}. \]

(We won’t need this formula.)

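Example 1. The standard deviation of \(f(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\) with mean 0 is

\[ \sigma = \sqrt{\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t^2 e^{-t^2/2}\,dt} = 1 \]

since \(\int_0^{\infty} t^2 e^{-t^2/2}\,dt = \sqrt{\pi/2}\).

Example 2. The standard deviation of the step function pdf with mean 1/2 is

\[ \sigma = \sqrt{\int_0^1 \left(t - \frac{1}{2}\right)^2 dt} = \sqrt{\left. \frac{1}{3}\left(t - \frac{1}{2}\right)^3 \right|_0^1} = \frac{\sqrt{3}}{6} \approx .289. \]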
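The means and standard deviations just computed by hand can be compared with numerical integration. The sketch below is mine, not part of the notes: `integrate` is a plain midpoint rule, and for the bell-shaped pdf I assume that cutting the improper integrals off at \(\pm 10\) loses nothing visible, since the integrand is negligibly small beyond that.

```python
import math


def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h


def mean_and_sd(f, a, b):
    """Mean and standard deviation of a pdf f assumed to vanish outside [a, b]."""
    mu = integrate(lambda t: t * f(t), a, b)
    var = integrate(lambda t: (t - mu) ** 2 * f(t), a, b)
    return mu, math.sqrt(var)


def step_pdf(t):
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


def bell_pdf(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


mu_step, sd_step = mean_and_sd(step_pdf, 0.0, 1.0)
mu_bell, sd_bell = mean_and_sd(bell_pdf, -10.0, 10.0)   # truncated range (assumption)

print(mu_step, sd_step, math.sqrt(3) / 6)
print(mu_bell, sd_bell)
```

The first line should reproduce the mean 1/2 and standard deviation \(\sqrt{3}/6\), and the second the mean 0 and standard deviation 1.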

6.4.2 Probability

Now let’s talk about probability a little. You have probably seen the finite version before. If an experiment has several possible outcomes, say tossing a coin has two outcomes if we overlook the possibility that the coin will stand on edge or roll into a drain in the floor, then the probability of each outcome is, roughly speaking, the fraction of the time we would expect it to occur if we repeat the experiment many times. For instance, if the coin is fair, the probability of heads (and also the probability of tails) is \(\frac{1}{2}\). The probability that something will happen is 1, so the probabilities of all possible individual outcomes should sum to 1. For the experiment of choosing a card at random from a standard pack of 52 cards, the probability of choosing any single card, say the ace of clubs, is \(\frac{1}{52}\). We can get the probabilities of collective events just by adding: the probability of choosing a black card is \(\frac{26}{52} = \frac{1}{2}\) since there are 26 black cards.

What if we talk about a situation where there are infinitely many outcomes, say choosing one real number at random from the interval \([0, 1]\)? Now the probability of choosing any particular number, say \(\frac{1}{\pi}\), is 0, since the probability has to be the same for each real number (or the choice wouldn’t be random) and if it were positive, say \(p\), then the probability of choosing one of a collection of \(N\) numbers, where \(N > \frac{1}{p}\), would be \(Np > 1\), which is impossible. So we don’t talk about the probability of a single outcome, but of a range of outcomes. The probability, if we choose a number at random from \([0, 1]\), that it is at least \(\frac{2}{3}\) is \(\frac{1}{3}\), since the interval \(\left[\frac{2}{3}, 1\right]\) is one third of the total interval. For any interval \(I = [a, b]\) within \([0, 1]\), the probability that a number chosen at random from \([0, 1]\) will lie within \(I\) is just the length of \(I\), that is, \(b - a\). We can describe this using the machinery of pdf’s developed above by saying that the pdf for this experiment is the pdf of Example 2 above,

\[ f(t) = \begin{cases} 1, & 0 \le t \le 1, \\ 0, & \text{otherwise,} \end{cases} \]

and the probability that the chosen number lies within \(I\) is \(\int_a^b f(t)\,dt = b - a\).
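The “fraction of the time” interpretation can be tried out by simulation. This is a rough sketch under my own assumptions: the function name `estimate_probability`, the number of trials, and the fixed seed are arbitrary, and Python’s pseudo-random generator stands in for choosing a number at random.

```python
import random


def estimate_probability(a, b, trials=200_000, seed=0):
    """Fraction of pseudo-random numbers from [0, 1) that land in [a, b]."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if a <= rng.random() <= b)
    return hits / trials


print(estimate_probability(2 / 3, 1))      # compare with 1/3
print(estimate_probability(0.2, 0.5))      # compare with b - a = 0.3
```

The estimates should be close to the interval lengths, though not exactly equal to them, since they come from finitely many trials.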

In general, if \(F(x)\) is the cumulative distribution function of an experiment with numerical outcomes (now interpreted as the probability that the outcome is less than or equal to \(x\)), then \(f = F'\) is the pdf for the experiment and the probability that the outcome is between \(a\) and \(b\) is \(\int_a^b f(t)\,dt = F(b) - F(a)\).

One big reason that all of this is useful is that many real life experiments can be modelled well by using one of a small number of pdf’s that have been extensively studied. Imagine that I got each person in the class to flip a coin 100 times and record how many heads and how many tails there were. While each of you would expect to get 50 of each, most of you would get some other outcome. But a plot of all the outcomes would probably reveal a distribution clustered about 50 that looks something like this smooth bell-shaped curve.

[Graph: bell-shaped curve centered at 50; horizontal axis from 0 to 100, vertical axis from 0 to 0.04.]

This is the famous normal distribution with mean 50 and standard deviation 10. (I just inserted a standard deviation at random because I’m too lazy to look up the actual standard deviation for this experiment.) It is given by the formula

\[ f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(t-\mu)^2/2\sigma^2} \]

with \(\mu = 50\) and \(\sigma = 10\). The probability that your number of heads is between 40 and 60 is

\[ \frac{1}{10\sqrt{2\pi}} \int_{40}^{60} e^{-(t-50)^2/200}\,dt \approx 0.68, \]

that is, this should happen about 68% of the time. Experimental data is commonly organized by assuming that it is normally distributed with a mean and standard deviation chosen to fit the data.
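The value 0.68 can be checked without doing the integral by hand, using the erf function mentioned earlier. A minimal sketch follows; the name `normal_cdf` is mine, and it relies on the standard identity that the normal cdf with mean \(\mu\) and standard deviation \(\sigma\) is \(\frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right)\).

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Probability of between 40 and 60 heads under the model with mu = 50, sigma = 10.
p = normal_cdf(60, 50, 10) - normal_cdf(40, 50, 10)
print(round(p, 4))
```

This is \(F(b) - F(a)\) for the normal cdf, and it should come out near 0.68.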

EXERCISES.

1. The probability that a transistor will fail between \(a\) and \(b\) months after it goes into service is \(c\int_a^b e^{-ct}\,dt\) for some positive constant \(c\).

(a) What is the cumulative distribution function \(F\) for this situation?

(b) If the probability of failure is 10% for the first 6 months, what is \(c\)?

(c) For this \(c\), what is the probability of failure in the second 6 months?

(d) After how many months will the cumulative probability of failure have reached .5?

2. Let \(t\) be the number of years that a person lives after receiving treatment for a certain type of cancer. The pdf giving the distribution of such survival times is \(f(t) = Ce^{-Ct}\) for some \(C\).

(a) What is the practical meaning of the cumulative distribution function \(F(t) = \int_0^t f(u)\,du\)?

(b) The survival function, \(S(t)\), is the probability that a randomly selected patient survives for at least \(t\) years. Find \(S(t)\) in terms of \(f(t)\) or \(F(t)\).

(c) If a randomly selected patient has a 70% chance of surviving for at least two years, find \(C\), \(f\), and \(F\).

3. The distribution of IQ scores is often modeled by a normal distribution with a mean of 100 and a standard deviation of 15.

(a) Find the pdf for IQ scores.

(b) Estimate the fraction of the population with IQ between 115 and 120.

4. The speeds of cars on I-5 are randomly distributed (when there hasn’t just been a crash) with mean 68 mph and standard deviation 7 mph.

(a) What is the probability that a randomly selected car is going between 70 and 75 mph?

(b) What is the probability that a randomly selected car is going at least 80 mph?

(c) What fraction of cars is going less than 60 mph?

5. Show that for a normal distribution of mean 0, the fraction of the population within one standard deviation of 0 does not depend on the standard deviation. What is this fraction? (Suggestion: Write down the integral using \(\sigma\) and then make the change of variable \(u = t/\sigma\).)


7. DIFFERENTIAL EQUATIONS

7.1 Introduction

A differential equation is an equation involving one or more derivatives of a function \(y(t)\), usually the function itself, and often one or more known functions of the independent variable \(t\). The implication (as with a quadratic equation) is that the solution or solutions of the equation (a function in this case) is not known at the beginning. Some examples are

\[ y' = \tfrac{1}{2}y, \tag{7.1} \]

\[ y'' + 4\sin y = 0, \tag{7.2} \]

\[ y' = \tfrac{1}{2}y(1 - y), \tag{7.3} \]

\[ y'' + 4y = 3\sin t, \tag{7.4} \]

\[ y' = .1(200 - y), \tag{7.5} \]

\[ y' = (1 - y)(t + y), \tag{7.6} \]

\[ y' = 3\sin t. \tag{7.7} \]

Differential equations are important because they are “the language of the natural world,” that is, a very wide variety of natural phenomena can be modeled in a natural way using differential equations. This comes about because the relationship between a quantity and its rate of change (or its second derivative) often arises in a very direct way from the “natural laws” governing the behavior of the quantity. For example, the first equation above is the simplest equation for growth, and is essentially a statement of the assumption of a constant birth rate; the second equation describes the motion of a pendulum and follows immediately from Newton’s second law of motion \(F = ma\); the third models logistic or limited growth; the fourth describes a mass attached to a spring and also being acted on by a periodic external force; and the fifth equation just restates Newton’s Law of Cooling that the rate of change of the temperature of an object (such as a yam) is proportional to the temperature difference between the object and its surroundings. The derivation of some of these equations (and others) will be discussed in subsequent sections.

The fundamental problem of differential equations is to solve equations such as the ones above. This means finding an explicit form for each of the functions that satisfy the equation (and knowing that all such functions have been found) whenever this is possible. Sometimes it is impossible to find an explicit formula; in this situation the objective is to find out as much information about the properties of the solutions as possible, including qualitative properties that are not accessible via numerical approximations.

Notice in particular the last equation in the list, (7.7), where the right hand side depends only on \(t\). In this situation a solution to the differential equation is simply an antiderivative for the right hand side, in this case for \(3\sin t\). Thus all solutions to equation (7.7) are of the form \(y = -3\cos t + C\). The method we will use in a subsequent section to solve differential equations symbolically amounts to arranging the equation so that all we have to do is find antiderivatives. However we already know that some functions of \(t\), like \(\sin t / t\), do not have antiderivatives that can be written explicitly in terms of familiar functions, so it is clear from the beginning that we will not be able to solve all differential equations symbolically.

To begin discussing differential equations where the right hand side depends on \(y\), equation (7.1) is a familiar one. We know that \(y = e^{\frac{1}{2}t}\) is a solution because then \(y' = \frac{1}{2}e^{\frac{1}{2}t} = \frac{1}{2}y\). More generally, for any real constant \(k\), \(y = ke^{\frac{1}{2}t}\) is a solution since \(y' = k\left(\frac{1}{2}e^{\frac{1}{2}t}\right) = \frac{1}{2}\left(ke^{\frac{1}{2}t}\right) = \frac{1}{2}y\). Whether there are any other functions whose derivatives are half as large as themselves is not immediately clear. We will return to this question later.

We have just seen that equation (7.1) has a whole family of solutions: all functions of the form \(ke^{\frac{1}{2}t}\) where \(k\) can have any real value, including zero.

This family of solutions fits together remarkably well, just like the family ofall antiderivatives of a single function. For each point in the plane, exactlyone member of the family passes through it, and no two different members ofthe family ever intersect. Thus a graph of all members of the family wouldlook something like a well combed head of hair. Here is a diagram of a few ofthe solutions of y0 = 1

2y.

[Figure: Solutions of $y' = \tfrac{1}{2}y$ (horizontal axis $t$, vertical axis $y$)]

Note that we can select a specific solution to the equation by specifying its value at $t = 0$. The highest solution in the diagram above, for instance, is the solution to $y' = y/2$ with the initial condition $y(0) = 1.5$. A differential equation with an initial condition attached is called an initial value problem or IVP for short. Very often it is an IVP that we want to solve, that is, we are interested in finding a single solution of a differential equation with a specified initial condition rather than in finding all solutions of the differential equation alone.
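The claim that $y = ke^{\frac{1}{2}t}$ solves (7.1), and that the initial condition $y(0) = 1.5$ picks out $k = 1.5$, can be checked numerically. The following is only a rough sketch: the helper names `solution` and `derivative` are made up for this illustration, and the derivative is estimated by a central difference rather than computed exactly.

```python
import math

def solution(k, t):
    """Member of the family y = k * e^(t/2)."""
    return k * math.exp(t / 2)

def derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Initial condition y(0) = 1.5: since e^0 = 1, this forces k = 1.5.
k = 1.5
y = lambda t: solution(k, t)

for t in (-2.0, 0.0, 1.0, 2.0):
    lhs = derivative(y, t)   # numerical estimate of y'(t)
    rhs = 0.5 * y(t)         # right-hand side of (7.1)
    print(f"t = {t:5.1f}   y' = {lhs:.6f}   y/2 = {rhs:.6f}")
```

The two printed columns should agree to within the accuracy of the difference estimate; changing `k` selects a different member of the family and a different initial value.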

7.1.1 Classifying Differential Equations

The order of a differential equation is the number of the highest derivative that appears in the equation. The first, third, fifth, sixth and seventh equations in the list above are first order equations; the second and fourth are second order. Second order equations are common in the description of motion (where the second derivative is acceleration) and many other common physical applications. First order equations are also very important, however, and it is on them that we will concentrate here.

More specifically, we will consider only explicit first order equations, equations that can be written in the form $y' = $ (some function of $y$ and $t$). (An example of an implicit first order equation would be $y^2 + (y')^2 = 1$, though this implicit equation could be solved for $y'$ to give two explicit equations.) Thus in what follows we will write “first order equation” to mean “explicit first order equation.”

Differential equations such as those above are also called ordinary differential equations to distinguish them from partial differential equations. Some quantities depend on more than one independent variable. For instance the outside temperature depends not only on time, but also on position, that is, three additional spatial variables. It would thus make sense to discuss the rate of change of temperature with respect to any one of these four variables (e.g. the rate of change of temperature at a given place over time, the rate of change of temperature as a function of altitude above a fixed position on the earth at a fixed time, etc.); these are the partial derivatives of the temperature and an equation linking them would be a partial differential equation. We will consider only ordinary differential equations, that is, equations where there is a single independent variable.

EXERCISES.

1. Pick out which functions are solutions for which equations. (A function may solve more than one equation, or none. An equation may have more than one solution from the list.)

Equations:

(a) $\dfrac{dy}{dt} = -2y$

(b) $\dfrac{dy}{dt} = 2y$

(c) $\dfrac{d^2y}{dt^2} = -4y$

(d) $\dfrac{d^2y}{dt^2} = 4y$

Functions:

(I) $y = e^{2t}$

(II) $y = -2e^{2t}$

(III) $y = e^{-2t}$

(IV) $y = 2e^{-2t}$

(V) $y = \sin t$

(VI) $y = \cos t$

(VII) $y = 2\sin t$

2. For each part, first verify that the given function $y$ does satisfy the given differential equation. Then determine $C$ so that $y$ satisfies the IVP consisting of the differential equation together with the initial condition. If the equation and solution are not defined for all real numbers, state on what domain containing the initial point you have a solution of the IVP.

(a) $y' + y = 0$; $y = Ce^{-t}$; $y(0) = 3$.

(b) $y' = \pi y$; $y = Ce^{\pi t}$; $y(0) = -4$.

(c) $y' = y + 1$; $y(t) = Ce^{t} - 1$; $y(0) = 0$.

(d) $y' = t - y$; $y(t) = Ce^{-t} + t - 1$; $y(0) = 10$.

(e) $y' + 3t^2 y = 0$; $y(t) = Ce^{-t^3}$; $y(0) = -2$.

(f) $y' = y^2$; $y(t) = \dfrac{1}{C - t}$; $y(0) = 2$. Can you think of a solution to the IVP $y' = y^2$, $y(0) = 0$?

(g) $y' = \dfrac{3y}{t} + t^2$; $y(t) = t^3(C + \ln t)$; $y(1) = 6$. (Why not a value for $y(0)$?)

(h) $y' = -y\tan t + \cos t$; $y(t) = (t + C)\cos t$; $y(\pi) = -7$. (Why not a value for $y\left(\tfrac{\pi}{2}\right)$?)

7.2 Modeling with Differential Equations

For the situations that we will consider there is a fairly standard procedure for deriving an appropriate first order differential equation. We are looking for an expression for $\frac{dy}{dt}$, the instantaneous rate of change of $y$ with respect to $t$. The route to $\frac{dy}{dt}$ goes through $\frac{\Delta y}{\Delta t}$, the average rate of change of $y$ with respect to $t$. We know that $\frac{dy}{dt}$ is the limit of $\frac{\Delta y}{\Delta t}$ as the time interval $\Delta t$ shrinks down to zero.

Generally we start by estimating $\Delta y$, the amount that $y$ changes in a small time interval $\Delta t$. (Often it is easiest to begin by estimating $\Delta y$ for a unit time interval, and then deciding afterwards how the length $\Delta t$ of the time interval fits in.) It is then generally easy to convert this to an equation for $\frac{\Delta y}{\Delta t}$; just divide through by $\Delta t$. Finally, it is also generally easy to pass to the limit, that is, as usual we will not spend much time worrying about whether what looks reasonable is really correct.

Let’s illustrate this process first by considering the simplest model for population growth. The simplest model assumes a constant birth rate, say $b$ births per unit of population per year. There is also a constant death rate, $d$ deaths per unit of population per year. Thus in one year the change in population is about

$$\Delta y \approx by - dy = (b - d)\,y. \tag{7.8}$$

What about a shorter period of time? Well, if we assume that births and deaths occur with equal frequency all year (this is reasonable for some populations at least for a first approximation, but not for those with definite breeding seasons or which tend to die in a certain season) then the change in population in any time interval should also be proportional to the length of the time interval, that is, in half a year the change in population should be half of that in equation (7.8), in a tenth of a year the change should be a tenth of the change in a year and in $\Delta t$ years the change should be about $\Delta t$ times the annual change, that is,

$$\Delta y \approx (b - d)\,y\,\Delta t$$

or

$$\frac{\Delta y}{\Delta t} \approx (b - d)\,y.$$
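The estimate $\Delta y \approx (b - d)\,y\,\Delta t$ can be applied over and over to step a population forward in time, which gives a feel for what "passing to the limit" means. The sketch below is an illustration only: the function name `step_population` and the numbers (starting population 1000, $b = 0.03$, $d = 0.01$) are invented for the example. As $\Delta t$ shrinks, the stepped values appear to settle down toward the exponential $y_0 e^{(b-d)t}$, a function whose derivative is $(b - d)$ times itself in the same way that the derivative of $e^{\frac{1}{2}t}$ is half of itself.

```python
import math

def step_population(y0, b, d, years, dt):
    """Repeatedly apply  delta_y = (b - d) * y * dt  over the given number of years."""
    steps = round(years / dt)
    y = y0
    for _ in range(steps):
        y += (b - d) * y * dt
    return y

y0, b, d, years = 1000.0, 0.03, 0.01, 10.0   # illustrative values only
for dt in (1.0, 0.1, 0.01, 0.001):
    result = step_population(y0, b, d, years, dt)
    print(f"dt = {dt:<6} population after {years:.0f} years = {result:.3f}")

# Reference value: the exponential y0 * e^((b - d) * years)
print("exponential:", y0 * math.exp((b - d) * years))
```

With $\Delta t = 1$ this is just the year-by-year estimate (7.8); the smaller step sizes correspond to applying the same reasoning to shorter and shorter time intervals.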

Now it seems natural to assume that the differential equation describing natural growth is

$$\frac{dy}{dt} = ky$$

where $k = b - d$ represents the difference between the birth rate and the death rate. Note that we could rewrite this as $\frac{1}{y}\frac{dy}{dt} = k$, another way to say that the net birth rate is constant since the left hand side of this equation represents the rate of change of $y$ per unit of population (or the percentage rate of change, expressed as a decimal).

The next most sophisticated model of population would be one that acknowledges that as population increases, population pressure will begin to affect birth and death rates. Thus $\frac{1}{y}\frac{dy}{dt}$ is not really constant, but is some decreasing function of $y$. (This still ignores outside influences which may be functions of $t$, but is at any rate an improvement on the most naive model.) The simplest decreasing function of $y$ is a linear one, so we might try

$$\frac{1}{y}\frac{dy}{dt} = k - cy = k\,(1 - ay) \quad \text{where } a = \frac{c}{k}$$

or, multiplying through by $y$,

$$\frac{dy}{dt} = ky\,(1 - ay).$$

This is the logistic model of population growth first proposed in 1837 by the Belgian mathematician P. F. Verhulst. A few years later Verhulst applied his model to the US population using the census data from 1790 through 1840 to determine the constants. His prediction matched the actual US population remarkably well for the next century, as shown by this diagram, where the curve is Verhulst’s prediction and the diamonds are the actual census results.

[Figure: Logistic curve prediction of US population (horizontal axis: years 1800 to 1960; vertical axis: population, with ticks at 0, 50, 100, 150)]

Much later the logistic model was shown to model experimental data in biology fairly well (fruit fly populations by R. Pearl in 1930; paramecium and flour beetle populations by G. F. Gause in 1935). We will see that solutions of the logistic equation are similar to exponential functions for small populations, but behave in a very different way as the population grows.
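The contrast between the two growth models can be previewed numerically before any formulas for the solutions are available. The sketch below steps both $\frac{dy}{dt} = ky$ and $\frac{dy}{dt} = ky(1 - ay)$ forward from the same small starting population using the estimate $\Delta y \approx y'\,\Delta t$. The constants $k = 1$ and $a = 0.01$, the starting value 1, and the helper name `euler` are assumptions chosen purely for illustration; they are not Verhulst's values.

```python
def euler(f, y0, t_end, dt=0.001):
    """Approximate y(t_end) for y' = f(y), y(0) = y0, by small steps delta_y = f(y) * dt."""
    y = y0
    for _ in range(round(t_end / dt)):
        y += f(y) * dt
    return y

k, a = 1.0, 0.01                          # hypothetical constants; y' = 0 when y = 1/a = 100
logistic = lambda y: k * y * (1 - a * y)  # right-hand side of the logistic model
natural = lambda y: k * y                 # right-hand side of the natural growth model

for t in (1, 2, 4, 8, 12):
    print(f"t = {t:2d}   logistic = {euler(logistic, 1.0, t):10.3f}"
          f"   natural = {euler(natural, 1.0, t):12.3f}")
```

For small $t$ the two columns are close, since $1 - ay$ is nearly 1 while $y$ is small; later the natural growth values keep climbing while the logistic values level off below $1/a$.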

EXERCISES.

1. The population of a city is 1.5 million in 1980. The population increases at a continuous rate of 4% a year, and in addition 50,000 people per year move to the city from elsewhere, arriving at a constant rate. Write a differential equation for the city’s population. What initial value problem would have to be solved to find the population $t$ years after 1980? (Suggestion: What is the approximate change in population in 1 year, in $\Delta t$ years?)

2. A hospital patient is being fed glucose intravenously at a constant rate of $b$ grams per minute. Meanwhile glucose is being absorbed from the bloodstream at a rate proportional to the amount $G(t)$ of glucose in the blood at time $t$. Write a differential equation for $G$.

3. A student carrying measles returns after Thanksgiving vacation to a college campus of 10,000 students. Suppose that the rate that measles will spread through the student body is proportional to the product of the number $M(t)$ of students who are infected at time $t$ and the number of students who are not infected at time $t$. Write an initial value problem describing this situation.

4. A motorboat travelling at 25 mph shuts off its motor and starts to coast. While coasting, the rate of change of its velocity is proportional to the square of its velocity. Write an initial value problem describing this situation. What is the sign of your constant of proportionality?

5. A mothball whose radius was originally 1/4 inch is found to have a radius of 1/8 inch after one month. Assuming that the mothball evaporates at a rate proportional to its surface area, find its radius as a function of time. Will the mothball ever be completely gone? If so, when? Note that evaporation is a change of volume, so you will have to relate $\frac{dV}{dt}$ to $\frac{dr}{dt}$. (Start with the relationship between $V$ and $r$.)

7.3 Slope Fields, Equilibrium Solutions, Autonomous Equations

7.3.1 Slope Fields

For most of the equations we consider, we will be able to come up with an explicit family of formulas for the solutions. Sometimes, however, these formulas are not so easy to understand qualitatively. Fortunately there is a very immediate way to get a general idea about the solutions of any first order differential equation, whether it is possible to get specific formulas for the solutions or not.

The graph of a solution of a first order differential equation $y' = $ (some function of $y$ and $t$) is a curve in the $(t, y)$ plane. While the exact nature of such curves may not be immediately apparent from the differential equation itself, the tangent lines to the curve are immediately apparent, since this is exactly the information provided by the equation. For instance, the equation $y' = \tfrac{1}{2}y$ says that whenever $y = 1$ (say), then $y' = \tfrac{1}{2}$, that is, the tangent line to the solution to this equation that passes through the point $(0, 1)$ (or the point $(1, 1)$ or $(\pi, 1)$ or ...) has slope $\tfrac{1}{2}$.

We can therefore draw a diagram of these tangent lines immediately from the equations, without knowing what the solutions are. Of course we cannot include all tangent lines without filling up the plane into a sea of black ink, so we show a representative selection of short pieces of tangent lines. This is called a slope field or sometimes a direction field.

The value of slope fields is that generally it is possible to guess at least the rough nature of the graphs of solutions from the slope field; it does not show the solutions directly, but it does tend to reveal the “ghosts” of the solutions. Here, for instance, is the slope field for the equation $y' = \tfrac{1}{2}y$.

[Figure: Slope field for $y' = y/2$ (horizontal axis $t$, vertical axis $y$)]
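Since a slope field only needs the right hand side of the equation, the slopes can be tabulated without solving anything. Here is a small sketch of that idea for $y' = y/2$. The function names `slope_field` and `text_picture`, the grid of sample points, and the crude character picture are all choices made for this illustration, not a description of how the diagram above was produced.

```python
def slope_field(f, t_values, y_values):
    """Return {(t, y): slope} for y' = f(t, y) at each grid point."""
    return {(t, y): f(t, y) for t in t_values for y in y_values}

def text_picture(f, t_values, y_values):
    """Crude picture: '/' for positive slope, '\\' for negative, '-' for zero."""
    rows = []
    for y in sorted(y_values, reverse=True):      # top row = largest y
        symbols = []
        for t in t_values:
            s = f(t, y)
            symbols.append('/' if s > 0 else '\\' if s < 0 else '-')
        rows.append(f"{y:5.1f}  " + " ".join(symbols))
    return "\n".join(rows)

f = lambda t, y: y / 2                 # right-hand side of y' = y/2
ts = [-4, -2, 0, 2, 4]
ys = [-4, -2, 0, 2, 4]

field = slope_field(f, ts, ys)
print(field[(0, 2)])                   # slope of the short segment drawn at (0, 2)
print(text_picture(f, ts, ys))
```

Every entry in a given row of the picture is the same symbol, because the slope $y/2$ depends on the height $y$ but not on $t$.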

Notice that drawing curves in this diagram for which the arrowed segments become tangent lines leads inevitably to a diagram like that for the solutions of $y' = \tfrac{1}{2}y$ in the Introduction. You should use the instructions in the next section to try to reproduce this diagram, in slightly cruder form, on your TI-89.

As just mentioned, the importance of slope fields lies principally in the fact that we can draw a slope field even when we don’t already know the solutions. Here is a slope field for the logistic equation $y' = 3y(1 - y)$.

[Figure: Slope field for $y' = 3y(1 - y)$ (horizontal axis $t$, vertical axis $y$)]

7.3.2 Equilibrium Solutions

A key feature of this slope field is the row of horizontal arrows along the line $y = 1$. They arise from the fact that when $y = 1$, the differential equation makes $y' = 0$. This means that the constant function $y = 1$ is a solution of this differential equation. Such a constant solution is called an equilibrium solution of the equation. If we think of this equation as modeling a population, perhaps in units of millions of individuals, then the significance of $y = 1$ being a solution is that it means that a population of one million individuals will just maintain the same size indefinitely without either growing or shrinking. The existence of such a stable population size is a standard feature of the logistic model of population growth.

It is harder to see in the diagram, but there is a second equilibrium solution: $y = 0$, since then also $y' = 0$. Looking back at the slope field for $y' = y/2$ we see that $y = 0$ is also an equilibrium solution there; it is the only equilibrium solution.

Typically, solutions near to an equilibrium solution either approach it (a stable equilibrium) or move away from it (an unstable equilibrium). For the logistic equation it is easy to see from the direction of the arrows that $y = 1$ is a stable equilibrium and $y = 0$ is an unstable equilibrium. $y = 0$ is also an unstable equilibrium for $y' = y/2$.

It is usually pretty easy to identify equilibrium solutions algebraically from the differential equation. They are just the values for $y$ that make $y' = 0$, that is, they are the zeros of the function on the right hand side of the differential equation: 0 and 1, for instance, are the zeros of $3y(1 - y)$. In practice it can take a bit of insight to find equilibrium solutions directly from a drawn slope field, since you will see a line of horizontal arrows only if the equilibrium height happens to be one selected for drawing arrows. Here, for instance, is another version of the slope field for $y' = 3y(1 - y)$ where I have not carefully chosen the number of rows to display the equilibrium solutions.

[Figure: Slope field for $y' = 3y(1 - y)$, drawn with rows that miss the equilibrium heights (horizontal axis $t$, vertical axis $y$)]

The presence of equilibrium solutions is, however, hinted at by the lines of arrows that are nearly horizontal but with slopes of opposite sign above and below $y = 1$ and $y = 0$. That these are indeed equilibrium solutions can then be confirmed from the differential equation itself.
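The same hint, slopes of opposite sign in neighbouring rows, can be used to pin down an equilibrium height numerically. The sketch below is one possible way to do this, not a method from the text: it takes a made-up list of row heights that misses both $y = 0$ and $y = 1$, looks for adjacent rows where the slope $3y(1 - y)$ changes sign, and then repeatedly halves the interval (bisection) to locate the zero in between.

```python
def f(y):
    """Right-hand side of y' = 3y(1 - y)."""
    return 3 * y * (1 - y)

def bisect(g, lo, hi, tol=1e-10):
    """Find a zero of g between lo and hi, assuming g(lo) and g(hi) have opposite signs."""
    assert g(lo) * g(hi) < 0, "need a sign change"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid          # the zero lies in the lower half
        else:
            lo = mid          # the zero lies in the upper half
    return (lo + hi) / 2

# Hypothetical row heights for a slope field; none of them is an equilibrium height.
rows = [-0.4, -0.1, 0.2, 0.5, 0.8, 1.1, 1.4]
equilibria = [bisect(f, lo, hi) for lo, hi in zip(rows, rows[1:]) if f(lo) * f(hi) < 0]
print(equilibria)
```

The two values printed should be very close to 0 and 1, the zeros of $3y(1 - y)$ found algebraically above.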

7.3.3 Autonomous Equations

A first order differential equation whose right hand side depends only on $y$, that is, an equation of the form $y' = f(y)$, is said to be autonomous. Many differential equations arising in applications have this property: both the population models of the previous section, for instance. One can think of it as meaning that the conditions affecting changes in $y$ are affected only by the current state of $y$ and do not change with time.

For an autonomous equation, the slope field has the property that all points at a given height above the $t$-axis (a given value of $y$) have arrows of the same slope passing through them. Consider, for instance, the arrows lying on any horizontal line through the slope field for $y' = 3y(1 - y)$ just above. One plausible consequence of the nature of the slope field is that any horizontal translate of a solution of the differential equation is also a solution. We will see in the next section that the properties of the function $f$ tell us quite a lot about the behavior of the solutions of $y' = f(y)$.
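The remark about horizontal translates can be made plausible with a small experiment. The sketch below follows the arrows of a slope field in small steps (using $\Delta y \approx y'\,\Delta t$) starting from the same height at two different times, once for the autonomous equation $y' = 3y(1 - y)$ and once for the non-autonomous equation $y' = t + y$. The helper name `euler_path`, the starting height 0.2 and the step size are assumptions made for this illustration, and the paths are approximations rather than exact solutions.

```python
def euler_path(f, t0, y0, dt, steps):
    """Step along y' = f(t, y) from (t0, y0); returns the list of y values visited."""
    t, y, ys = t0, y0, [y0]
    for _ in range(steps):
        y += f(t, y) * dt
        t += dt
        ys.append(y)
    return ys

autonomous = lambda t, y: 3 * y * (1 - y)     # right-hand side ignores t
non_autonomous = lambda t, y: t + y           # right-hand side depends on t

for name, f in (("y' = 3y(1 - y)", autonomous), ("y' = t + y", non_autonomous)):
    start_at_0 = euler_path(f, 0.0, 0.2, 0.01, 200)
    start_at_1 = euler_path(f, 1.0, 0.2, 0.01, 200)   # same height, one time unit later
    gap = max(abs(u - v) for u, v in zip(start_at_0, start_at_1))
    print(f"{name}: largest difference between the two paths = {gap}")
```

For the autonomous equation the two paths visit exactly the same heights, so the later one is just the earlier one shifted one unit to the right; for $y' = t + y$ the paths differ.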

EXERCISES

1. The slope field for the equation $y' = t + y$ is on the left below.

(a) Copy it and draw in (by eye, to be compatible with the slope field) the solutions of the equation that pass through $(0, 0)$, $(-3, 1)$, and $(-1, 0)$.

(b) Guess a formula for the solution through $(-1, 0)$ and verify by plugging it into the equation that it is correct.

[Figures: two slope fields side by side, each with horizontal axis $t$ and vertical axis $y$ and tick marks from $-4$ to $4$; the left one belongs to exercise 1, the right one to exercise 2.]

2. The slope field for the equation $y' = \sin y \sin t$ is on the right above.

(a) Copy it and draw in (by eye, to be compatible with the slope field) the solutions of the equation that pass through $(-2, -2)$ and $(0, \pi)$.

(b) Guess a formula for the solution through $(0, \pi)$ and verify by plugging it into the equation that it is correct.


3. Draw a slope field for $y' = -y$ for $-3 \le t \le 3$, $-2 \le y \le 2$. Include slopes for $y = 0, \pm\tfrac{1}{2}, \pm 1, \pm\tfrac{3}{2}$. Then add the graphs of the solutions of this equation with $y(0) = 1$ and also $y(0) = -1$.

4. Draw a slope field for $y' = -t$ for $-3 \le t \le 3$, $-2 \le y \le 2$. Include slopes for $t = 0, \pm 1, \pm 2$. (Why have I specified values of $t$ rather than values of $y$ as in the previous problem?) Then add the graphs of the solutions of this equation with $y(0) = 1$ and also $y(0) = -1$.

5. Classify each of the equations in exercise 2 of section 7.1 as autonomous or non-autonomous.

6. Two chemicals, A and B, combine to form a product C. The rate of formation of C at any time is proportional to the product of the amount of A that remains and the amount of B that remains. Suppose that initially there are 40 grams of A and 50 grams of B and that one gram of A combines with one gram of B to form 2 grams of C.

(a) Write a differential equation for the number of grams $c(t)$ of chemical C at time $t$. (Suggestion: You can express the amount of A and B present at time $t$ in terms of the amount of C present.) Assume that no C is present at time $t = 0$.

(b) Make a slope field for your equation. What are the equilibrium values? What do you expect the amount of C to approach as $t \to \infty$? What parts of the slope field make sense in the context of this problem?

7. Suppose everything in the previous problem continues to hold except that 2 grams of A combine with 1 gram of B to form 3 grams of C. Now find the differential equation and slope field and predict how much C will be formed in the long term.

7.4 Using the TI-89 Plus to Graph Solutions and Slope Fields

7.4.1 Basic Setup

Press the Mode button; the first entry must read Differential Equations. Change, if necessary. Remember to press Enter enough times (twice) to get the calculator to remember this change in graphing mode.

Go to the y = screen. Press F1 (tool pictures), then 9 (format) and look at the bottom item Fields. This should be set to either SLPFLD (if you want a slope field as part of your graph) or FLDOFF (if you don’t). If it is set to DIRFLD you will get an error message (Dimension).

F6 on the y = screen controls the way in which the solution is graphed. The default is #4 (thick). If you would like a “normal” graph, change to #1 (line) or #6 (path) which is the same except that a moving circle is displayed at the head of the graph as it is produced.

7.4.2 Working With an Equation

In the y = screen, enter the equation you want as y1 (or y2, y3, or whatever you want). Note that you must use y1 (or whatever) as the variable name. Thus you would enter the equation $y' = t + y$ as y1' = t + y1.

You must also enter an initial condition in the position following the equation. The initial condition for y1 is entered as yi1. This initial condition will be taken as the value of the solution at the point t0 which you set on the Window screen.

Set the graphing window on the Window screen as usual. Note that the horizontal axis will be the $t$-axis and the vertical axis will be the $y$-axis.

7.4.3 Initial Conditions

There are several ways to specify initial conditions. You can specify them ahead of time on the y = screen. If you want to specify more than one initial condition for the same equation (and the same t0), enclose them in braces: {0, 1, 2} will cause the calculator to graph the 3 solutions with $y(t_0) = 0$, $y(t_0) = 1$, and $y(t_0) = 2$.

Once you have graphed at least one solution of an equation you can enter more initial conditions directly from the graph screen. Press F8 (IC) and either enter a new IC as a value of $t$ and a value of $y$ from the keypad, or move the cursor to the point $(t, y)$ you want to have as the new IC.

EXERCISES

1. (a)-(h) For the corresponding part of problem 2 in section 7.1, generate a slope field for the equation and graph the solution to the IVP.

2. For each of the following equations, verify that the given solution is indeed a solution, generate the slope field for the given window, add the graph of the given solution to it, and then use the slope field itself to make a rough sketch of the solutions with the given IC.

(a) $y' = t - y$, one solution is $y = -1 + t$; use $-3 \le t \le 3$, $-3 \le y \le 3$; add $y(0) = -1$, $y(0) = 0$, $y(0) = 1$.

(b) $y' = y(2 - y)$, one solution is $y = 2$; use $0 \le t \le 3$, $0 \le y \le 3$; add $y(0) = 1$, $y(0) = 3$.

(c) $y' = y(1 - y)(2 - y)$, use the region $0 \le t \le 3$, $0 \le y \le 3$. There are three equilibrium solutions. The calculator picture may be a little hard to interpret though “zooming in” may help; use the equation itself to decide where the sign of $y'$ is positive and where it is negative. Plot the solutions with $y(0) = 0.5$, $y(0) = 1.5$, $y(0) = 2.5$.

7.5 Existence & Uniqueness of Solutions and What It Tells Us

7.5.1 Existence and Uniqueness

In the initial discussion of equation (7.1) I noted that the various solutions to this equation “fit together like a well combed head of hair.” The pictures we have drawn since then for other equations seem to show the same kind of property for the solutions of other equations. It is time to discuss this a little more carefully.

It is a “fact of experience” that if we start some physical system going, then something will happen. Moreover, if we start the system more than once under identical conditions (say, throw a ball up in the air twice with the same initial direction and velocity) then it will behave in the same way each time.

These two “facts” are reflected in the following two desirable qualities for an initial value problem ( = differential equation + initial condition).

• (Existence) An initial value problem $y' = f(y, t)$, $y(t_0) = y_0$ has a solution for each choice of initial condition. (“Something will happen.”)

• (Uniqueness) An initial value problem $y' = f(y, t)$, $y(t_0) = y_0$ has only one solution for each choice of initial condition. (“Under the same conditions, the same thing happens every time.”)

Notice that this is a slightly expanded view of an “initial condition” in that the specified time is not necessarily $t = 0$. It is in fact useful from time to time to specify the value of a solution at some time $t_0$ other than 0.

In a more systematic introduction to differential equations, we would spend some time discussing which initial value problems actually possess these desirable qualities. While we will not do this here, I cannot resist giving you an example where uniqueness (the chancier property) fails. Consider the initial value problem $y' = y^{1/3}$, $y(0) = 0$. You can verify by differentiation that $y(t) = \left(\tfrac{2}{3}t\right)^{3/2}$ is a solution to the equation (since $y' = \left(\tfrac{2}{3}\right)^{3/2} \tfrac{3}{2}\, t^{1/2} = \left(\tfrac{2}{3}t\right)^{1/2} = y^{1/3}$), and clearly $y(0) = 0$. But also $y = 0$ (the constant solution) is a solution to this initial value problem. So we have at least two solutions with the same initial condition $y(0) = 0$. In fact it turns out that this IVP has infinitely many solutions. The diagram shows one more of these.

[Figure: Solutions of $y' = y^{1/3}$, $y(0) = 0$ (horizontal axis $t$, vertical axis $y$)]

What causes the trouble, roughly speaking, is the fact that the graph of the right hand side of the equation, $y^{1/3}$, has a vertical tangent line at $y = 0$. As long as the right hand side has a derivative as a function of $y$ and also as a function of $t$ the two properties of existence and uniqueness do hold, so we will feel free to use them from now on.
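The failure of uniqueness in this example can be checked numerically as well as by differentiation. The sketch below estimates the derivative of each of the two solutions named above with a central difference and compares it with $y^{1/3}$; the function names are invented for this illustration, and the check is only carried out at a few sample times $t > 0$.

```python
def rhs(y):
    """Right-hand side y^(1/3) (only evaluated here for y >= 0)."""
    return y ** (1 / 3)

def curved(t):
    """Candidate solution y = (2t/3)^(3/2), defined for t >= 0."""
    return (2 * t / 3) ** 1.5

def constant(t):
    """Candidate solution y = 0."""
    return 0.0

def derivative(g, t, h=1e-6):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

print("initial values:", curved(0.0), constant(0.0))
for t in (0.5, 1.0, 2.0, 3.0):
    for g in (curved, constant):
        print(f"{g.__name__:8s} t = {t}: y' = {derivative(g, t):.6f}, "
              f"y^(1/3) = {rhs(g(t)):.6f}")
```

Both functions start at $y(0) = 0$ and, at every sample time, the estimated derivative matches $y^{1/3}$, even though the two functions are plainly different for $t > 0$.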

7.5.2 What It Tells Us

The practical significance of uniqueness of solutions of first order differential equations is this: no two different solution curves to the same differential equation can intersect. For if, say, $y_1$ and $y_2$ are both solutions to the same differential equation $y' = f(t, y)$, their graphs intersect at $t = t_0$ (with common $y$ value $Y$) only if they solve the same IVP, $y' = f(t, y)$, $y(t_0) = Y$. But then uniqueness says they are not different solutions after all, but the same solution. To say it in a different way, if two solutions to a first order differential equation have the same value at one point, then they have the same value at all points.

Now look back at the direction field for logistic growth. We know that there are two equilibrium solutions represented by the horizontal lines $y = 0$ and $y = 1$. Uniqueness says that no other solution can cross either of these lines. Thus any solution with $0 < y(0) < 1$ is trapped forever between those lines; following the arrows of the direction field we see that there is a whole family of solutions, each of which is an increasing S-shaped curve moving away from $y = 0$ and approaching $y = 1$. These curves can never intersect one another, so must fit together as in the following diagram. Notice that we have been able to draw this diagram of solutions of the logistic equation even though we have no formula for the solutions.

[Figure: Some Solutions of $y' = 3y(1 - y)$ (horizontal axis $t$, vertical axis $y$)]

The different curves are actually horizontal translates of each other. These S-shaped curves are characteristic of the logistic equation and represent growth behavior often observed in nature.

Finally, here is the direction field for the more difficult equation (7.6) from the Introduction.

[Figure: Direction field for $y' = (1 - y)(t + y)$ (horizontal axis $t$, vertical axis $y$)]

Equations of this type have been used to model the proportion of people in a group who have performed some action (like drivers turning on their headlights at dusk) where performance depends partly on external factors (getting dark) and partly on imitating the actions of others. For $t \ge 0$ and $y \ge 0$ the equilibrium solution $y = 1$ is clearly stable. In fact it appears that any solution $y(t)$ with $y(0) \ge 0$ will approach $y = 1$ fairly quickly. We will not be able to find a formula for the non-equilibrium solutions, so a diagram like this is the best available general picture of the situation.


7.5.3 Phase Lines for Autonomous Equations

For an autonomous equation $y' = f(y)$ the equilibrium solutions occur at the zeros of the function $f$ and only there. For instance, for the equation $y' = 3y(1 - y)$ above, the zeros of the function $f(y) = 3y(1 - y)$ are $y = 0$ and $y = 1$. These are the two equilibrium solutions. Moreover, the quadratic function $f$ can change sign only as $y$ passes these zeros. In fact, $f$ is negative for $y < 0$ and $y > 1$ and positive for $0 < y < 1$. Thus, as we see in the graph of several solutions above, any time a solution has values between 0 and 1, it must be increasing. Furthermore, it must stay increasing forever, since Existence and Uniqueness decrees that it can never actually reach the value $y = 1$. Similarly, any time a solution finds itself with values greater than 1, not only must it be decreasing, but it must stay decreasing forever, since it can never leave the region where $y > 1$. In fact we can summarize the behavior of all solutions of $y' = 3y(1 - y)$ with the line on the left, called a phase line. For comparison, the other items are a slope field and a set of graphs of solutions.

[Figure, three panels for $y' = 3y(1 - y)$, left to right: Phase Line; Slope Field; Solution Graphs. In the last two panels the horizontal axis is $t$ and the vertical axis is $y$.]

We can draw a phase line for any autonomous differential equation. The points on the line corresponding to equilibrium solutions of the equation, that is, the values $y_0$ such that $f(y_0) = 0$, are called equilibrium points. Between any two adjacent equilibrium points either all solutions of the equation are increasing or all solutions of the equation are decreasing. In the example just above, for instance, all solutions are increasing in the interval $0 < y < 1$ between the equilibrium points, and all solutions are decreasing when $y > 1$ or $y < 0$. We can indicate that on the phase line with an arrowhead pointing in the appropriate direction.

An equilibrium point with the property that all nearby solutions approach it is called a sink; an equilibrium point with the property that all nearby solutions move away from it is called a source. For $y' = 3y(1 - y)$ we see that $y = 1$ is a sink and $y = 0$ is a source. It is possible for an equilibrium point to be neither a sink nor a source; such points are called nodes.
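Because the direction of the arrows on a phase line is decided by the sign of $f$, the labels sink, source and node can be read off by sampling $f$ just below and just above each equilibrium point. The following is a minimal sketch of that idea. The function name `classify` and the offset `eps` are assumptions for this illustration (`eps` must be smaller than the distance to the nearest other equilibrium point), and the example $f(y) = y^2$ is a made-up case that is not taken from the text.

```python
def classify(f, equilibrium, eps=1e-3):
    """Label an equilibrium point of y' = f(y) by the sign of f just below and just above it."""
    below, above = f(equilibrium - eps), f(equilibrium + eps)
    if below > 0 and above < 0:
        return "sink"      # solutions on both sides move toward the point
    if below < 0 and above > 0:
        return "source"    # solutions on both sides move away from it
    return "node"          # solutions approach on one side and leave on the other

logistic = lambda y: 3 * y * (1 - y)
print(classify(logistic, 0.0))   # source
print(classify(logistic, 1.0))   # sink

# A hypothetical example (not from the text) with a node: f(y) = y^2
print(classify(lambda y: y * y, 0.0))   # node
```

For $f(y) = y^2$ the slope is positive on both sides of $y = 0$, so solutions below 0 rise toward it while solutions above 0 rise away from it.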


7.5.4 Have I Found All Solutions?

In section 7.1 we saw that the family of functions $y = ke^{\frac{1}{2}t}$ solved the differential equation $y' = \tfrac{1}{2}y$, but left open the question of whether there are any other solutions to $y' = \tfrac{1}{2}y$. Using the uniqueness of solutions we can answer this question. Since for every point in the plane there is a function in the family $ke^{\frac{1}{2}t}$ that passes through the point (to find it for a given point $(t_0, y_0)$ plug in these values and solve for $k$), there cannot be any other solutions besides the ones we have already found. For if $f(t)$, say, solves $y' = \tfrac{1}{2}y$, and $f(0) = C$, then $f$ and $Ce^{\frac{1}{2}t}$ have the same value at $t = 0$. By uniqueness, they must have the same value for all $t$, that is, we must have $f(t) = Ce^{\frac{1}{2}t}$ for all $t$.

This same argument works in general in the following sense. If we have found a family of solutions of a first order differential equation such that one of the solutions passes through each point of the plane, then this must be a complete list of solutions, and there is no point in hunting for any more. On the other hand, if the solutions we have fail to go through one or more points in the plane, then we must be missing some solutions: the ones that pass through those points.

EXERCISES.

1. Make phase lines for the equations $y' = y(2-y)$ and $y' = y(1-y)(2-y)$ of exercise 2 of section 7.4. Identify the equilibrium points as sinks, sources or nodes. Explain how you can tell that the first equation has no positive solutions that get large as $t \to \infty$, but the second equation does.

2. Make a phase line for each of the autonomous equations in the list for exercise 2 of section 7.1. Identify the equilibrium points as sinks, sources, or nodes. Which of these equations has a solution that “approaches infinity” or “approaches negative infinity” as $t \to \infty$?

3. For each of these functions $f$ defined by the graph, construct a phase line. Explain what will happen as $t \to \infty$ to the solution of the equation $y' = f(y)$ such that $y(0) = 1$.

(a) [Graph of $f$: axes labeled $t$ and $y$; horizontal ticks from $-3$ to $3$, vertical ticks from $-2$ to $6$.]

(b) [Graph of $f$: axes labeled $t$ and $y$; horizontal ticks from $-2$ to $2$, vertical ticks from $-4$ to $4$.]

4. Make a phase portrait for $y' = \sin(t + y)$. Why is a phase line not appropriate? Check that each of the functions $y = \frac{3\pi}{2} + 2k\pi - t$ is a solution to this equation. How do other solutions behave as $t \to \infty$ and $t \to -\infty$?

7.6 Separable Equations; Symbolic Solutions

We have seen that we can learn a lot about the general nature of the solutions of any first order differential equation graphically by looking at the slope field. Nevertheless it is natural to want to find an explicit formula for solutions whenever this is possible. One common situation where this is often possible is when the equation is separable, which means that it can be written in the form

$$y' = f(y)\,g(t)$$

for some functions of $y$ and $t$ alone. This is certainly possible whenever the equation is autonomous, that is, the right side does not involve $t$ explicitly, as in the first, third, and fifth equations in the Introduction, for then the equation just has the form $y' = f(y)$ and $g$ is just $g(t) = 1$. On the other hand, the last equation in the Introduction, (7.6), is not separable.

As an example of how to proceed with a separable equation, let's reconsider $y' = \frac{1}{2}y$. Think of it as $\frac{dy}{dt} = \frac{1}{2}y$, and divide through by $y$ to get $\frac{1}{y}\frac{dy}{dt} = \frac{1}{2}$. Now take the antiderivative (with respect to $t$) on both sides. We get (remembering the chain rule),

$$\ln|y(t)| = \tfrac{1}{2}t + C.$$

Exponentiating, $|y(t)| = e^{C}e^{\frac{1}{2}t}$. Note that $e^{C}$ can be regarded as an arbitrary positive constant. Since the right hand side of this equation is never zero, any single solution $y(t)$ must be always positive or always negative. We can cover both cases by removing the absolute value signs and replacing $e^{C}$ by an arbitrary (not necessarily positive) constant $k$ to get $y = ke^{\frac{1}{2}t}$ as we already knew.

In practice we organize this computation like this: “multiply the equation $\frac{1}{y}\frac{dy}{dt} = \frac{1}{2}$ by $dt$” and take antiderivatives

$$\int \frac{1}{y}\,dy = \int \frac{1}{2}\,dt$$

or

$$\ln|y| = \tfrac{1}{2}t + C$$

or

$$y = ke^{\frac{1}{2}t}$$

as before. It appears that we are taking antiderivatives with respect to $y$ on the left and with respect to $t$ on the right. What has actually happened is that there has been a hidden substitution on the left of $u = y$, $du = y'\,dt$, but the new variable is called $y$, not $u$.

Example. Solve the “haggis equation” $y' = .1(200 - y)$ with the initial condition $y(0) = 20$. (The haggis, initially at room temperature, is put in a 200°C oven.)

Dividing through by $200 - y$ and separating,

$$\int \frac{dy}{200 - y} = \int .1\,dt$$

or

$$-\ln|200 - y| = .1t + C.$$

Inserting the initial condition to determine $C$, $-\ln|200 - 20| = 0 + C$ or $C = -\ln 180$. Thus

$$\ln(200 - y) = -.1t + \ln 180$$

or

$$200 - y = e^{-.1t + \ln 180} = 180e^{-.1t}$$

or

$$y = 200 - 180e^{-.1t} = 20 + 180\left(1 - e^{-.1t}\right).$$

This produces the haggis temperature graph that we have already seen in qualitative form:

[Figure: Temperature of a haggis in an oven. Axes: $t$ from 0 to 60 (horizontal), $y$ from 0 to 200 (vertical).]
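As a sanity check on the symbolic solution, one can compare the formula $y = 200 - 180e^{-.1t}$ with a crude numerical solution of $y' = .1(200 - y)$, $y(0) = 20$. The sketch below uses Euler's method; the helper names, the step count and the comparison time $t = 30$ are illustrative assumptions rather than anything prescribed in these notes.

```python
import math


def haggis_temperature(t):
    """Closed-form solution y = 200 - 180 e^{-0.1 t} from the example."""
    return 200 - 180 * math.exp(-0.1 * t)


def euler(f, y0, t_end, steps):
    """Approximate y(t_end) for y' = f(y), y(0) = y0, using Euler's method."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y += h * f(y)
    return y


def rhs(y):
    # Right-hand side of the haggis equation y' = .1(200 - y).
    return 0.1 * (200 - y)


approx = euler(rhs, 20.0, 30.0, 100_000)
exact = haggis_temperature(30.0)
print(approx, exact)
```

The two printed values should agree to several digits, and the agreement improves as `steps` grows.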

EXERCISES.

Solve each of the following initial value problems. Note that the independent variable is $x$ rather than $t$ in some of them.

1. $\dfrac{dP}{dt} = \dfrac{P}{5}$, $P(0) = 50$.

2. $\dfrac{dP}{dt} = P + 4$, $P(0) = 100$.

3. $\dfrac{dP}{dt} = P - a$, $P(0) = b$. (There are three cases. What are they?)

4. $\dfrac{dy}{dt} = 2y - 4$, $y(2) = 5$.

5. $\dfrac{dm}{dt} = .1m + 50$, $m(0) = 500$.

6. $\dfrac{dy}{dx} = 3\surd xy$, $y = 1$ when $x = 1$.

7. $\dfrac{dy}{dx} = ye^{x}$, $y(0) = 2e$.

8. $\dfrac{dy}{dx} = xy^{3}$, $y(0) = 1$. What is the domain of this solution? (Include negative $x$ as well as positive $x$.)

9. Consider the separable equation $\dfrac{dy}{dt} = \sqrt{1 - y^2}$, defined for all $y$ with $-1 \le y \le 1$, with the initial condition $y(0) = 0$.

(a) If we separate as $\dfrac{dy}{\sqrt{1 - y^2}} = dt$ and integrate, we get $\arcsin y = t + c$. Using the initial condition, $c = 0$ so that $\arcsin y = t$ or $y = \sin t$ as the solution we want. Or is it? In the original differential equation $\sqrt{1 - y^2} \ge 0$ for all $y$ with $|y| \le 1$, so that $y' \ge 0$ everywhere, which is certainly not the case for the sine function. What is wrong here? Is $\sin t$ actually a solution over some $t$-interval? If so, which one?

(b) There should be a solution of this equation through any point $(t_0, y_0)$ in the plane with $|y_0| < 1$. Sketch a representative collection of solutions on a single diagram as in the previous section. Find the solution with $y(\pi/6) = 0$ explicitly and include it in your diagram.

10. Newton's Law of Cooling states that if the haggis, now at 200°C in the oven, is removed from the oven into room temperature (20°C) and allowed to cool, its rate of cooling is proportional to the difference between the haggis' temperature and room temperature. Find an expression for the haggis' temperature if it has cooled to 100°C in half an hour. When will its temperature be 25°C?

11. Solve the IVP for the city population that you set up in #1 in section 7.2.

12. Solve the IVP for the number of students infected with measles that you set up in #3 in section 2.

13. Solve the general logistic equation $y' = ky(M - y)$. The antiderivative you will need is in the table in Chapter 5 (or just use your TI-89).

14. Solve the general logistic equation again by making the change of variable $v = y^{-1}$ so that $v' = -y'y^{-2}$ or $y' = -v'y^{2}$. If you multiply out the right hand side of the logistic equation, make the substitution for $y'$, and divide through by $y^2$ you should end up with a simpler separable equation for $v$. Solving this and then setting $y = v^{-1}$ should give the same answer as #13.

15. Torricelli's Law states that if fluid is draining out of a container (of any reasonable shape) through a hole in the bottom, then the rate of flow of fluid through the hole (in, say, cm$^3$/sec) is proportional to the square root of the depth of the fluid in the container above the hole. Suppose that a cylindrical container of radius 1 foot and height 2 feet is initially full of water. A hole is punched in the bottom, and it is then observed that after 5 minutes the cylinder is only half full. Find an expression for the depth of water in the container at time $t$. Is the tank ever completely empty? If so, when? (You will have to relate $\dfrac{dV}{dt}$ to $\dfrac{dy}{dt}$ where $y(t)$ is the depth of the water at time $t$. Use the chain rule.)

16. A hemispherical tank has top radius 1 meter and is initially full of water. Ten minutes after the plug at the bottom is removed, the depth of water remaining in the tank is 0.5 meter. When will the tank be empty? (Suggestions: Read the suggestion for the previous problem. Note that it is easy to find $\dfrac{dV}{dy}$ from the relationship $V = \int_0^y A(z)\,dz$ where $A(z)$ is the area of a cross-section of the hemisphere at height $z$ above the bottom. Don't try to solve the equation you get by solving the DE for $y$.)

17. (The time-keeper's problem) The Greeks kept time using a water clock, the clepsydra. Help them out, a bit belatedly, by finding a curve $y = f(x)$ which when rotated about the $y$-axis produces a shape with the property that if it is filled with water which drains through a hole in the bottom, the water level will fall at a constant rate. (Actually, of course, you will get a family of curves with one parameter.)

18. (The coffee drinker's problem) The temperature of a cup of McDonald's coffee, when served, may be taken to be 90°C. It then cools according to Newton's Law of Cooling toward room temperature (20°C). If you like cream in your coffee but aren't going to drink it right away, is it better to put the cream in immediately or to wait and put the cream in just before you drink it? That is, which way will the coffee-cream mixture be hotter when you start to drink it? Assume that the cream's temperature is 10°C and that the effect of adding the cream and stirring is to instantaneously lower the temperature of the mixture according to the amount of cream added. (That is, if the volume of cream added is equal to the volume of coffee (ugh!) the temperature of the mixture would be the average of the individual temperatures; if the volume of cream is 10% of the total volume, then the temperature of the mixture would be 10% of the way from the temperature of the coffee to the temperature of the cream.) Suggestions and further questions: Try this first with a specific mixture, say half coffee, half cream and a specific waiting time, say 10 minutes. Does the answer (mix immediately or wait until drinking) depend on the waiting time? On how much cream is added? On the cooling constant in Newton's Law? Would it make any difference if the cream is at room temperature?

19. (The Alaskan's problem) One morning it starts to snow, and it continues to snow all day at a constant rate. At noon a snow plow starts to clear the roads. The plow covered two miles of road by 2 p.m. and one additional mile by 4 p.m. Assuming that the plow removes snow at a constant rate (volume per hour), when did it start to snow?

7.7 More Modeling

7.7.1 Motion

Consider the motion of a falling body (not your calculus teacher). In the simplest model, the only force acting is gravity (which we may assume to be constant close to the earth's surface). According to Newton's second law of motion, the velocity $v$ of the body satisfies

$$ma = m\frac{dv}{dt} = (\text{the sum of all forces acting on the body}) = (\text{force of gravity}) = -mg.$$

Here we have taken upwards as the positive direction so that gravity acts in the negative direction. Dividing through by $m$, $v$ satisfies the DE $\frac{dv}{dt} = -g$. Thus $v = -gt + C$. If the body falls from rest (say from a passing airplane), $v(0) = 0$, then $v = -gt$, that is, the body's speed just increases in a linear fashion.


Now we see what difference it makes to add air resistance. This force depends on the velocity of the body. For bodies that are not moving too fast it is often taken to be proportional to velocity. Since it acts to slow the body, its direction is opposite to that of the velocity. Thus the constant of proportionality is negative; we write it as $-k$ with $k > 0$. Now

$$ma = m\frac{dv}{dt} = (\text{the sum of all forces acting on the body}) = (\text{force of gravity}) + (\text{air resistance}) = -mg - kv \qquad (7.9)$$

or, dividing through by $m$ again, we get the separable equation

$$\frac{dv}{dt} = -\frac{k}{m}v - g.$$

Suppose in particular that the drag coefficient $\frac{k}{m} = .25\ \text{sec}^{-1}$. The most convenient way to solve the equation is to write $\frac{dv}{dt} = -.25(v + 4g)$, separate to

$$\int \frac{dv}{v + 4g} = -\int .25\,dt$$

or

$$\ln|v + 4g| = -.25t + C.$$

If the body falls from rest (from a passing airplane for instance) then substituting $v(0) = 0$ gives $\ln 4g = C$. Since $v + 4g > 0$, at least for a while, exponentiating gives

$$v + 4g = e^{-.25t}e^{\ln 4g} = 4ge^{-.25t}$$

or $v = 4g\left(e^{-.25t} - 1\right)$. Note that as $t \to \infty$, $v = 4ge^{-.25t} - 4g \to -4g \approx -39.2$ m/sec.

Thus allowing for the effect of air resistance has the effect that the velocity of a falling body approaches a terminal velocity. It does not just increase forever. This is what is actually observed. Comparing the formulas for $v$ with air resistance and without air resistance graphically,

[Figure: Velocity of a falling body. Axes labeled $x$ (from 0.0 to 1.0) and $y$ (from 0 to 10).]

we see that the velocity starts out just the same as without air resistance, but that the body's acceleration (the slope of the velocity curve) then drops off toward zero.
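The comparison can also be made with a few numbers. This sketch tabulates $v = -gt$ against $v = 4g(e^{-.25t} - 1)$; the value $g = 9.8$ m/sec$^2$ is the one implied by $-4g \approx -39.2$ above, and the function names and sample times are illustrative assumptions.

```python
import math

G = 9.8  # m/sec^2, the value implied by -4g ≈ -39.2 m/sec in the text


def v_no_drag(t):
    """Velocity from rest with gravity only: v = -g t."""
    return -G * t


def v_with_drag(t):
    """Velocity from rest with k/m = .25: v = 4g (e^{-.25 t} - 1)."""
    return 4 * G * (math.exp(-0.25 * t) - 1)


for t in (0.1, 1, 5, 20, 60):
    print(f"t={t:5.1f}  no drag: {v_no_drag(t):9.2f}  with drag: {v_with_drag(t):8.2f}")
```

For small $t$ the two columns nearly coincide; for large $t$ the second column levels off near the terminal velocity while the first keeps growing in size.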

7.7.2 Mixing

Suppose we install a filtering plant to clean the water in a lake. The lake contains 1 million cubic meters of water, including 20,000 cubic meters of pollutants. The plant can remove the pollutants from 1000 cubic meters of water per hour. How will $P(t)$, the volume of pollutants in the lake at time $t$, behave?

We derive a differential equation for $P$ by considering how $P$ changes in a short period of time. In one hour, the plant cleans 1000 cubic meters of water. How much pollutant does it remove? This depends on the way in which the pollutant is distributed in the water. The easiest assumption is that the lake is perfectly mixed, that is, the concentration of pollutant is the same in any sample of water. In that case, the amount of pollutant in any sample is just proportional to the total amount:

$$\frac{\text{pollutant in sample}}{\text{total pollutant}} = \frac{\text{water in sample}}{\text{total water}}. \qquad (7.10)$$

Here this means $\dfrac{P_{\text{sample}}}{P} = \dfrac{1000}{1{,}000{,}000} = .001$ or

$$\Delta P \approx -.001P$$

where the minus sign comes from the fact that the amount of pollutant in the lake is decreasing, that is, the change in $P$ is negative. The approximately equal sign $\approx$ is needed because the total amount of pollution may change slightly during the hour, that is, $P$ is not really constant throughout the interval as we are taking it to be to make a simple “equation.”

Similarly, the amount of pollutant removed in any time $\Delta t$ satisfies the same equation (7.10), where now the amount of water processed in $\Delta t$ hours is $1000\Delta t$ m$^3$ so

$$\Delta P \approx -\frac{1000\Delta t}{10^6}P = -.001P\Delta t \quad\text{or}\quad \frac{\Delta P}{\Delta t} \approx -.001P.$$

Thus the differential equation for $P$ is $\dfrac{dP}{dt} = -.001P$. We know this equation has solutions of the form $P = Ce^{-.001t}$, and from $P(0) = 20{,}000$ we get that

$$P(t) = 20{,}000e^{-.001t}.$$

Thus the amount of pollutant will eventually approach zero, though it will never actually be zero. Note that we have assumed that the water in the lake is always perfectly mixed (this is the meaning of equation (7.10)), while in reality there might be some tendency for the water near the plant to be cleaner than water far from the plant.

Now suppose that a new plant is built by the lake and discharges 10 m$^3$ per hour of pollutant into the lake. How will $P(t)$ behave now? We will continue to assume for simplicity that the lake is perfectly mixed at all times.

If we repeat our study of $\Delta P$, we still have pollutant being removed as before but now we also have pollutant being added at a constant rate. In one hour, $\Delta P \approx -.001P + 10$, and in $\Delta t$ hours,

$$\Delta P \approx -.001P\Delta t + 10\Delta t$$

or now

$$\frac{dP}{dt} = -.001P + 10 = -.001(P - 10{,}000).$$

There is now a non-zero equilibrium solution, $P(t) = 10{,}000$. This is the level of pollutant where the amount added to the lake by the plant is just balanced by the amount removed by the filtering plant.

We can solve the equation by separating variables to get $\dfrac{dP}{P - 10{,}000} = -.001\,dt$ or, taking antiderivatives,

$$\ln|P - 10{,}000| = -.001t + C.$$

Plugging in $P(0) = 20{,}000$, we get $C = \ln 10{,}000$. Exponentiating,

$$|P - 10{,}000| = 10{,}000e^{-.001t}$$

and, using the fact that $P(t) > 10{,}000$,

$$P(t) = 10{,}000 + 10{,}000e^{-.001t} = 10{,}000\left(1 + e^{-.001t}\right).$$

Note that now $P(t) \to 10{,}000$ as $t \to \infty$, that is, the lake will never be clean; instead the amount of pollutant will approach the equilibrium level of 10,000 m$^3$ as $t \to \infty$.
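The step-by-step reasoning with $\Delta P$ can be imitated directly on a computer: repeatedly apply $\Delta P \approx (-.001P + 10)\Delta t$ and compare with the formula just found. The following is a rough sketch; the time step of 0.1 hour and the function names are arbitrary illustrative choices.

```python
import math


def pollutant_exact(t):
    """P(t) = 10,000 (1 + e^{-.001 t}) from the text, with P(0) = 20,000."""
    return 10_000 * (1 + math.exp(-0.001 * t))


def pollutant_stepwise(t_end, dt=0.1, p0=20_000.0):
    """Apply Delta P = (-.001 P + 10) Delta t repeatedly from t = 0 to t_end hours."""
    p = p0
    for _ in range(round(t_end / dt)):
        p += (-0.001 * p + 10) * dt
    return p


for hours in (0, 1000, 5000, 10000):
    print(hours, round(pollutant_stepwise(hours), 1), round(pollutant_exact(hours), 1))
```

Both columns should drift down toward the equilibrium level of 10,000 m$^3$, staying within a fraction of a cubic meter of each other.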

EXERCISES.

1. Solve the general equation (7.9). How does the terminal velocity depend on the drag coefficient $k/m$? Draw a (qualitative) direction field for $t \ge 0$ and $v \le 0$ and superimpose some solutions (estimated by eye).

2. Under some circumstances (falling with a parachute, for instance) it is better to assume that the retarding force of air resistance is proportional to $v^2$ rather than to $v$. Modifying equation (7.9) to reflect this leads to $\frac{dv}{dt} = -g + \frac{k}{m}v^2$, where the reason that there must now be a plus sign for the second term is that for a falling body, air resistance is a force in the positive direction (upwards) and so the second term must be positive. (This was true for equation (7.9) because $v < 0$ made $-\frac{k}{m}v > 0$.) Solve the new equation by separating variables when $v(0) = 0$. Is there still a terminal velocity? How does it depend on $k/m$ now?

3. A tank contains 100 gallons of a solution of dissolved salt and water which is kept perfectly mixed at all times by stirring. Suppose the mixture is allowed to flow out at 3 gallons per minute and is replaced by pure water flowing in at 3 gallons per minute. If there are 15 pounds of dissolved salt in the mixture initially, find the amount $S(t)$ of salt after $t$ minutes. What happens to $S(t)$ as $t \to \infty$?

4. Suppose that in the situation of the previous problem, the outflow remains the same, but pure water flows in at 4 gallons per minute. Now what is $S(t)$? (Suggestion: What is the total amount of mixture in the tank at time $t$? This is no longer constant. However an equation similar to (7.10) will continue to hold.)

5. When ice forms on Lake Padden, the surface freezes first. As heat travels up through the ice already formed and is lost into the air, the ice gets thicker. It is reasonable that the thicker the ice is, the more slowly the thickness will increase. Suppose that the rate of change of thickness of the ice is inversely proportional to the thickness. Find an expression for the thickness of the ice as a function of time.

6. Dead leaves accumulate on the ground in a deciduous forest at a constant rate of 3 grams per square centimeter per year. (This assumption is not very realistic. Why not? But it makes the problem easier.) Simultaneously, the leaves on the ground decompose (and vanish into the soil) at a constant rate of 75% decomposition per year. Write a differential equation for the mass of leaves per square centimeter. Use a slope field to show that the mass of leaves approaches an equilibrium level. What is it? Solve the equation and show that your formula leads to the same conclusion.

7. A cell contains a chemical at a concentration $c(t)$. The same chemical is outside the cell at a constant concentration $k$. (Constant because the much larger volume is not affected by small movements across the wall of the cell.) Fick's Law states that the rate of movement across the cell wall is proportional to the difference between $c(t)$ and $k$ in the direction from higher concentration to lower concentration. Set up and solve a differential equation for $c(t)$ with the initial concentration $c(0) = c_0$. Sketch the solution if $c_0 = 0$.

8. When a course ends, some students (not you, of course) start to forget what they have learned. One model assumes that the rate of forgetting is proportional to the difference between the fraction remembered and a positive constant $a$. Set up and solve a differential equation for $y = r(t)$, the fraction of the material learned in the course which is remembered $t$ weeks after the course ends. Note that $r(0) = 1$. Assume that $0 < a < 1$. (This is really an IVP.) What is the interpretation of $a$ in terms of this model: what does it represent?


9. A drug is administered intravenously at a constant rate $r$ mg/hr and is excreted at a rate proportional to the quantity in the body, with constant of proportionality $\alpha$.

(a) Set up and solve a differential equation for $Q(t)$, the quantity in milligrams of drug in the body at time $t$. Show that $Q$ approaches a limiting value $Q_\infty$.

(b) What effect does doubling $r$ have on $Q_\infty$? What effect does doubling $r$ have on the time to reach $\frac{1}{2}Q_\infty$ if initially there is no drug in the body?

(c) What effect does doubling $\alpha$ have on $Q_\infty$? On the time to reach $\frac{1}{2}Q_\infty$?

7.8 Systems of Differential Equations

Very often the evolution of a quantity over time depends on its interaction with one or more other quantities. When the interactions are described by differential equations, the result is a system of differential equations. In this section there is not room for a systematic exploration of this enormous topic, so we will look at some examples.

EXAMPLE 1. Romeo and Juliet are lovers, but their love has its ups and downs, as we will see. Their love for each other is selfless: it is affected only by the other's feelings and not by their own. But while Juliet responds to affection (the more Romeo loves her, the more she responds), Romeo is fickle. He is turned off by Juliet's love, but attracted by her dislike. We can describe all this by the following system of differential equations:

$$\begin{aligned} j' &= r \\ r' &= -j. \end{aligned} \qquad (7.11)$$

Here $j(t)$ is Juliet's love for Romeo and $r(t)$ is Romeo's love for Juliet. Both are signed quantities: a positive value represents a level of affection and a negative value a level of dislike. (Zero is indifference.)

A solution of this system would consist of a pair of functions, $j(t)$ and $r(t)$, which fit together to satisfy both equations simultaneously. In this case we can guess some solutions fairly easily. (Can you?) For assistance note that if we differentiate both sides of the top equation and then substitute from the bottom one we get $j'' = r' = -j$, that is, $j$ must be a function whose second derivative is its negative. Both $\sin t$ and $\cos t$ have this property. If $j(t) = \sin t$, then $r = j' = \cos t$ while if $j(t) = \cos t$, then $r = j' = -\sin t$. Thus we have two solution pairs, which I will write as vectors: $\begin{bmatrix} \sin t \\ \cos t \end{bmatrix}$ and $\begin{bmatrix} \cos t \\ -\sin t \end{bmatrix}$. Moreover, we can notice that any multiple of a solution pair is also a solution pair (or solution for short) and that sums of solutions are solutions.


Thus in fact any vector of the form

$$c_1\begin{bmatrix} \sin t \\ \cos t \end{bmatrix} + c_2\begin{bmatrix} \cos t \\ -\sin t \end{bmatrix} = \begin{bmatrix} c_1\sin t + c_2\cos t \\ c_1\cos t - c_2\sin t \end{bmatrix}$$

provides a solution to the system. (These are all the solutions, in fact, but that is not so evident at the moment.) Notice that each of these solutions is periodic with period $2\pi$, that is, their feelings cycle endlessly through a succession of stages, each of length $\pi/2$. For instance for the solution $j = \sin t$, $r = \cos t$ we see from the graphs

[Figure: graphs of the solution against $t$ for $0 \le t \le 6$, with values between $-1$ and $1$. Solid: Juliet; Dashed: Romeo]

that initially Romeo loves Juliet but she is indifferent. For $0 < t < \pi/2$ each loves the other, but this does not suit Romeo, who dislikes Juliet from $t = \pi/2$ to $t = 3\pi/2$. Romeo's dislike discourages Juliet, so her affection diminishes and after $t = \pi$ has become dislike. This revives Romeo's ardor, so by $t = 2\pi$ they are back where they started, and about to embark on another cycle.
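The cycle can also be followed numerically without guessing any formulas. The sketch below integrates $j' = r$, $r' = -j$ from $j(0) = 0$, $r(0) = 1$ with the classical fourth-order Runge-Kutta method; that method is a choice made here for illustration (these notes do not prescribe it), and the function names and step count are likewise assumptions.

```python
import math


def rk4_step(j, r, h):
    """One classical Runge-Kutta step for the system j' = r, r' = -j."""
    def f(j, r):
        return r, -j

    k1 = f(j, r)
    k2 = f(j + h / 2 * k1[0], r + h / 2 * k1[1])
    k3 = f(j + h / 2 * k2[0], r + h / 2 * k2[1])
    k4 = f(j + h * k3[0], r + h * k3[1])
    j_new = j + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    r_new = r + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return j_new, r_new


def integrate(j0, r0, t_end, steps=1000):
    """Advance from t = 0 to t_end in equal steps and return (j, r) at t_end."""
    h = t_end / steps
    j, r = j0, r0
    for _ in range(steps):
        j, r = rk4_step(j, r, h)
    return j, r


# Start with Juliet indifferent and Romeo in love, as for j = sin t, r = cos t.
for label, t_end in [("pi/2", math.pi / 2), ("pi", math.pi), ("2 pi", 2 * math.pi)]:
    print(label, integrate(0.0, 1.0, t_end))
```

Up to small numerical error the printed pairs should match $(\sin t, \cos t)$ at the same times, ending back near the starting state at $t = 2\pi$.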

EXAMPLE 2. Consider a situation where there is both a predator species and a prey species, say foxes and rabbits, or food fish and predatory fish. If we let $R(t)$ denote the prey population at time $t$ and $F(t)$ denote the predator population at time $t$, then their interaction might be described by a system of the form

$$\begin{aligned} R' &= aR - bRF \\ F' &= -cF + dRF \end{aligned}$$

where $a, b, c, d$ are all positive parameters. These are the Lotka-Volterra equations, used by the Italian mathematician Vito Volterra in the 1920's to try to explain why the proportion of food fish to predatory fish in the Mediterranean changed during World War I.

Here the equations have the following interpretations. The first equation says that in the absence of foxes the rabbits will increase exponentially with natural growth rate $a$, but that the foxes produce a decrease of population proportional to the product $RF$ of the two populations. One might regard this product as a measure of the frequency with which a fox and a rabbit meet, since very roughly one might expect the number of meetings to be proportional to each of the individual populations. The second equation says that in the absence of rabbits the foxes will decrease exponentially, but that the presence of rabbits helps out the foxes just as it is bad for the rabbits. A specific example would be

$$\begin{aligned} R' &= 6R - 3RF \\ F' &= -10F + 2RF. \end{aligned} \qquad (7.12)$$

A solution of this system would consist of a pair of functions, $R(t)$ and $F(t)$, which fit together to satisfy both equations simultaneously. In this case it is impossible to “solve” the system in the sense of arriving at formulas for $R(t)$ and $F(t)$, except for a few special solutions: the trivial solution $R = F = 0$, the exponential solutions $R = k_1e^{at}$, $F = 0$ and $R = 0$, $F = k_2e^{-ct}$ corresponding to the presence of only one species, and a second constant solution $R = c/d$, $F = a/b$. This lack of formulas is often the case for systems involving one or more nonlinear terms like the $RF$ term here. Nevertheless there is an Existence and Uniqueness Theorem at work here that assures us that for any pair of initial conditions $R(0) = R_0$ and $F(0) = F_0$ there is a unique solution pair. We will just have to find another way to study them.
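The constant solutions are easy to confirm by substitution: at such a point both right-hand sides must vanish. Here is a small sketch for the specific system (7.12), where $c/d = 5$ and $a/b = 2$; the function name `lotka_volterra` and the extra test point $(5, 3)$ are illustrative choices only.

```python
def lotka_volterra(R, F, a=6.0, b=3.0, c=10.0, d=2.0):
    """Return (R', F') for R' = aR - bRF, F' = -cF + dRF; the defaults give (7.12)."""
    return a * R - b * R * F, -c * F + d * R * F


# Constant solutions named in the text: (0, 0) and (c/d, a/b) = (5, 2) for (7.12).
# The third point is not an equilibrium and is included for contrast.
for point in [(0.0, 0.0), (5.0, 2.0), (5.0, 3.0)]:
    print(point, lotka_volterra(*point))
```

At the first two points both derivatives are zero, so a solution starting there never moves; at the third the rabbit population is falling.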

7.8.1 Direction Fields

We can learn a great deal about how solutions to the Lotka-Volterra system behave from looking at the direction field, the equivalent here of the slope fields we used for single equations. A graph of a solution of the system above would be a set of ordered triples $(t, R(t), F(t))$, that is, a specification of both a value for $R$ and a value for $F$ corresponding to each time $t$. Such a graph would look like a curve in three dimensional space. Visualizing and drawing such graphs is complicated, and a drawing of several solutions at once would look a lot like a tangle of spaghetti, so we generally do something simpler.

If we give up having a separate component for $t$, we could just draw a solution of the system as a parametrized curve $(R(t), F(t))$. The disadvantage of this is that we cannot read directly from the parametrized curve the time at which the solution passes a given point. But the gain from dealing with two dimensions instead of three is much more important than the loss.

Before discussing the direction field for (7.12), we'll look at the simpler example of (7.11). Thus we must find the parametrized curves $(j(t), r(t))$ in the $jr$-plane.

Recall that the velocity along the parametrized curve is given by the vector $[j'(t), r'(t)]$. And since the equations above do not explicitly include $t$ on the right hand side, the values for $j'$ and $r'$ depend only on the values of $j$ and $r$ and not on what $t$ is. (Such a system is autonomous.) For instance, if $j = 1$ and $r = 1$ (each is feeling affectionate toward the other) then from the system we see $j' = 1$, $r' = -1$, that is, Juliet will be encouraged, but Romeo will cool off. If we place velocity arrows at a selection of points in the plane we see

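[Figure: direction field for the Romeo and Juliet system in the $jr$-plane, with $j$ and $r$ each running from $-4$ to $4$.]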

Just as with the slope fields, this direction field indicates the ghosts of solution curves. Clearly solution curves revolve clockwise around the origin. We can actually be more precise than that by noting that the quantity $j^2 + r^2$ is constant along any solution curve since

$$\frac{d}{dt}\left(j^2 + r^2\right) = 2jj' + 2rr' = 2jr + 2r(-j) = 0.$$

But $j^2 + r^2 = c^2$ is just the equation of a circle centered at the origin in the $jr$-plane. Thus solution curves move along these circles.

It is important to realize that many different solutions move along the same curve; they just start from different positions. Referring back to the formulas for solutions of this system, the solution $j = \sin t$, $r = \cos t$ is at $(0, 1)$ at time zero and moves clockwise around the circle $j^2 + r^2 = 1$, returning to $(0, 1)$ at $t = 2\pi$. The solution $j = \cos t$, $r = -\sin t$ also moves clockwise around the same circle, but starting from the point $(1, 0)$ at $t = 0$. It stays a quarter circle ahead of the other solution at all times.

The meaning of the Existence and Uniqueness Theorem for two dimensional autonomous systems is that the plane is filled with a collection of curves which never intersect, each of which is the track of a family of solutions. For each point on a given track there is one solution which is at that point at $t = 0$. For the Romeo and Juliet system the curves are the circles centered at the origin. For each point $(j_0, r_0)$ with $j_0^2 + r_0^2 = c^2$ the corresponding solution can be found from writing $j_0 = c\sin\theta$, $r_0 = c\cos\theta$ where $\theta = \arctan(j_0/r_0)$. Then $j(t) = c\sin(t + \theta)$, $r(t) = c\cos(t + \theta)$. We can make this look more like the form of solutions found above by using the addition formulas for sine and cosine:

$$\begin{bmatrix} j(t) \\ r(t) \end{bmatrix} = \begin{bmatrix} c\sin(t + \theta) \\ c\cos(t + \theta) \end{bmatrix} = \begin{bmatrix} c\sin t\cos\theta + c\cos t\sin\theta \\ c\cos t\cos\theta - c\sin t\sin\theta \end{bmatrix} = c\cos\theta\begin{bmatrix} \sin t \\ \cos t \end{bmatrix} + c\sin\theta\begin{bmatrix} \cos t \\ -\sin t \end{bmatrix}.$$
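The recipe for the solution through a given starting point can be written out as a short sketch. One deliberate substitution: the code uses the two-argument arctangent `math.atan2(j0, r0)` in place of $\arctan(j_0/r_0)$, since the one-argument form only picks out the intended angle when $r_0 > 0$. The function name `solution_through` and the sample point are illustrative assumptions.

```python
import math


def solution_through(j0, r0):
    """Return functions (j, r) with j(0) = j0 and r(0) = r0 for j' = r, r' = -j."""
    c = math.hypot(j0, r0)       # radius of the circle j^2 + r^2 = c^2
    theta = math.atan2(j0, r0)   # angle with c sin(theta) = j0, c cos(theta) = r0

    def j(t):
        return c * math.sin(t + theta)

    def r(t):
        return c * math.cos(t + theta)

    return j, r


j, r = solution_through(-3.0, -4.0)
print(j(0.0), r(0.0))              # the starting point
print(j(1.0) ** 2 + r(1.0) ** 2)   # stays on the circle of radius 5
```

A point in the third quadrant was chosen on purpose, since that is a case where the quotient $j_0/r_0$ alone cannot tell the angle apart from the one for the opposite point.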


7.8.2 Direction Fields Using the TI-89

To draw a direction field, be sure you are in Differential Equations mode. Then enter the system in the y= screen. You must use y1 and y2 as variable names so that the Romeo-Juliet system would be entered as

y1' = y2

y2' = -y1.

You must also enter an initial value for both y1 and y2.

Before drawing the direction field you must go to F1 (tool pictures), then 9 (format) and look at the bottom item Fields. This should now be set to DIRFLD. If set to SLPFLD you will get an error message (Dimension). You should also check F7 Axes to see that it is set to Custom, X Axis y1, Y Axis y2. Edit the Window as for single equations. When you press Graph you will get first the direction field and then the solution with the initial conditions you specified. You can then add additional solutions using F8 as for single differential equations. One deficiency is that the direction field arrows come without heads; they are just segments. You will have to examine the system to decide where the arrowheads are.

One final comment about drawing and interpreting direction fields. It is often helpful to draw in (lightly) curves where just one of $x'$ or $y'$ is equal to zero. These are called the nullclines of the system. The nullclines divide the plane into regions within each of which all velocity arrows are pointing in generally the same direction. For instance, for the direction field drawn above for the Romeo-Juliet system, the nullclines are the coordinate axes. These divide the plane into four sectors, the four quadrants in this case. In the first quadrant, where $j' = r > 0$ and $r' = -j < 0$, all arrows point down and to the right. In the second quadrant all arrows point up and to the right, and so forth.

EXERCISES.

1. For these two predator-prey systems I have used the neutral names $x$ and $y$ for the two species. In each case identify which is predator, and which prey. Is the growth of the prey limited by any factor other than the number of predators? Do the predators have a source of food other than the prey?

(a) $\begin{aligned} x' &= -ax + bxy \\ y' &= cy - dxy \end{aligned}$

(b) $\begin{aligned} x' &= ax - ax^2/N - bxy \\ y' &= cy + dxy \end{aligned}$

2. Consider the following two predator-prey systems.

(a) $\begin{aligned} x' &= 5x(1 - x/10) - 10xy \\ y' &= y(-3 + x/10) \end{aligned}$

(b) $\begin{aligned} x' &= .5x - xy/40 \\ y' &= 10y(1 - y/10) + 20xy \end{aligned}$

In one of these systems the prey are very large animals and the predators are very small animals, such as elephants and mosquitos. Thus it takes many predators to eat one prey, but each prey is of great benefit to the predator population. The other system has large predators and small prey. Determine which is which and justify your answer.


3.(a) Find all equilibrium points for each of the systems in #2.(b) For each system describe the evolution of the prey population if there

are no predators.(c) For each system describe the evolution of the predator population if

there are no prey. Do the predators have an alternative source of food ineither case?

4. Make a direction field for the system of #2(a). Include the nullclines inyour diagram. Note that they are lines, but one of them has a nonzero slope.Use it to predict what will happen if both predators and prey are present att = 0. Does your conclusion depend on how many of each are present?

5. Make a direction field for the system of #1(b). Include the nullclines in your diagram. Use it to predict what will happen if both predators and prey are present at t = 0. Does your conclusion depend on how many of each are present?
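The region-by-region reading of a direction field described before the exercises can be tried out in a few lines of code. The Python below is only a sketch: it assumes the Romeo-Juliet system has the form x′ = −y, y′ = x (an assumption chosen because it reproduces the arrow directions stated above, not a formula quoted from this section), and the helper name `arrow_direction` is made up for the illustration.

```python
def arrow_direction(f, g, x, y):
    """Describe the direction of the velocity arrow (f, g) at the point (x, y)."""
    dx, dy = f(x, y), g(x, y)
    horizontal = "right" if dx > 0 else "left" if dx < 0 else "none"
    vertical = "up" if dy > 0 else "down" if dy < 0 else "none"
    return vertical, horizontal

# ASSUMED form of the Romeo-Juliet system: x' = -y, y' = x.
def f(x, y):
    return -y

def g(x, y):
    return x

# One sample point inside each region cut out by the nullclines (the axes).
samples = {"Q1": (1, 1), "Q2": (-1, 1), "Q3": (-1, -1), "Q4": (1, -1)}
directions = {name: arrow_direction(f, g, *pt) for name, pt in samples.items()}
for name, d in directions.items():
    print(name, d)
```

Because the signs of x′ and y′ can change only on a nullcline, one sample point per region is enough to know the general direction of every arrow in that region. The same helper can be pointed at any system x′ = f(x, y), y′ = g(x, y) once its nullclines have been drawn.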

7.8.3 Second Order Equations

Second order differential equations arise in many physical situations as particular cases of Newton’s second law F = ma. For instance, suppose that a mass of m kilograms sliding on a frictionless flat surface is attached to a spring as in the diagram below.

[Figure: a mass attached to a spring on a flat surface, with the rest position marked 0.]

Let x = 0 mark the horizontal position at which the mass would be at rest with the spring at its natural length. If the mass is moved to the left (compressing the spring), the spring will exert a force on it directed to the right.


On the other hand, if the mass is moved to the right (stretching the spring), the spring will exert a force directed to the left. It is customary to assume that the magnitude of this force is proportional to the distance between the mass’ position and its rest position, that is to the mass’ displacement x. The constant of proportionality is the spring constant. Since the force exerted by the spring is the only force acting on the mass, we have this differential equation for the displacement x(t):

\[
ma = m\,\frac{d^2x}{dt^2} = F = -kx \quad\text{or}\quad x'' = -\frac{k}{m}\,x .
\]

The negative sign is there to remind us that the force exerted by the spring always acts in the direction opposite to the current displacement of the mass. (Thus the spring constant is positive.)

We can make this into a system by introducing the velocity v of the mass as a second variable. The velocity is related to the displacement x by v = x′ and also (through the second order differential equation) by v′ = x″ = −(k/m)x.

Thus we have the system

\[
\begin{aligned}
x' &= v \\
v' &= -\frac{k}{m}\,x .
\end{aligned}
\]

In particular, if m = 1 and k = 4 we have

\[
\begin{aligned}
x' &= v \\
v' &= -4x .
\end{aligned}
\]

This system is not so different from Romeo and Juliet. In fact, it is just as easy to guess that one solution is x = sin 2t, v = 2 cos 2t and that another is x = cos 2t, v = −2 sin 2t. Just as before, sums of solutions and multiples of solutions are also solutions, so that a general solution of the system would be

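\[
c_1 \begin{bmatrix} \sin 2t \\ 2\cos 2t \end{bmatrix}
+ c_2 \begin{bmatrix} \cos 2t \\ -2\sin 2t \end{bmatrix}
= \begin{bmatrix} c_1 \sin 2t + c_2 \cos 2t \\ 2c_1 \cos 2t - 2c_2 \sin 2t \end{bmatrix} .
\]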
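A guessed solution is easy to check numerically. The following Python sketch compares central-difference derivatives of the general solution with the right-hand sides x′ = v and v′ = −4x for a handful of constants and times. The function names, the sample values, and the step size h are arbitrary choices made for this illustration.

```python
import math

def solution(c1, c2, t):
    """General solution (x(t), v(t)) of x' = v, v' = -4x."""
    x = c1 * math.sin(2 * t) + c2 * math.cos(2 * t)
    v = 2 * c1 * math.cos(2 * t) - 2 * c2 * math.sin(2 * t)
    return x, v

def residual(c1, c2, t, h=1e-5):
    """Largest mismatch between central-difference derivatives and the system."""
    x, v = solution(c1, c2, t)
    x_plus, v_plus = solution(c1, c2, t + h)
    x_minus, v_minus = solution(c1, c2, t - h)
    dx = (x_plus - x_minus) / (2 * h)   # approximates x'(t)
    dv = (v_plus - v_minus) / (2 * h)   # approximates v'(t)
    return max(abs(dx - v), abs(dv + 4 * x))

# Try several constants c1, c2 and several times t.
worst = max(residual(c1, c2, t)
            for c1 in (-1.5, 0.0, 2.0)
            for c2 in (-0.5, 1.0, 3.0)
            for t in (0.0, 0.7, 2.3))
print(worst)
```

The printed value should be tiny (limited only by the finite-difference approximation), whichever constants are tried. This is a sanity check, not a proof; the proof is to differentiate the formulas by hand.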

A direction field for this system looks like

[Figure: direction field for x′ = v, v′ = −4x, with x on the horizontal axis and v on the vertical axis, each running from −3 to 3.]
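To get a feel for the picture without plotting software, one can evaluate the velocity arrow at a few points. This is a small Python sketch; the function name `velocity` and the four sample points are my own choices for illustration.

```python
def velocity(x, v):
    """Right-hand side of the system x' = v, v' = -4x."""
    return v, -4 * x

# One point on each half of each axis.
points = [(1, 0), (0, 1), (-1, 0), (0, -1)]
arrows = {p: velocity(*p) for p in points}
for p, arrow in arrows.items():
    print(p, "->", arrow)
```

The arrow at (1, 0) points straight down and the arrow at (0, 1) points to the right, so the motion is clockwise. The arrows on the x-axis are four times as long as those on the v-axis, which is why the trajectories are stretched in the v direction.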


This is similar to the field for Romeo and Juliet, but the motion is around ellipses instead of circles. We can see this by looking for a function of x and v that is conserved on trajectories (that is, is constant there). This now leads to

\[
\frac{d}{dt}\left(4x^2 + v^2\right) = 8xx' + 2vv' = 8xv + 2v\,(-4x) = 0
\]

so that solutions now move on ellipses with equations x² + v²/4 = c². In this case the constant quantity can be taken to be proportional to the total energy of the mass (first term is potential energy, second term is kinetic energy) so the fact we have just used is the principle of conservation of energy for the frictionless system.
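The conservation claim can also be watched in a numerical solution. The sketch below integrates x′ = v, v′ = −4x with the classical fourth-order Runge-Kutta method (a standard numerical scheme, not something developed in these notes) and records how far 4x² + v² drifts from its starting value. The step size, the number of steps, and the starting point are arbitrary illustrative choices.

```python
def rk4_step(x, v, dt):
    """One classical Runge-Kutta step for x' = v, v' = -4x."""
    def rhs(x, v):
        return v, -4.0 * x
    k1x, k1v = rhs(x, v)
    k2x, k2v = rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = rhs(x + dt * k3x, v + dt * k3v)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x_new, v_new

x, v = 1.0, 0.0            # start at rest, displaced one unit
dt, steps = 0.001, 5000    # integrate up to t = 5
q0 = 4 * x**2 + v**2       # the conserved quantity at t = 0
max_drift = 0.0
for _ in range(steps):
    x, v = rk4_step(x, v, dt)
    max_drift = max(max_drift, abs(4 * x**2 + v**2 - q0))
print(q0, max_drift)
```

Any drift that shows up comes from the numerical method, not from the differential equation; for the exact solution the quantity does not change at all.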

EXERCISES.

1. Write down the second order equation and the associated system for a mass-spring system with m = 2, k = 6. Try to guess solutions. (Still involve sin at, cos at for a suitable a.) Draw a direction field and draw a solution curve or two on it.

2. The spring of the example above is not quite frictionless after all. Suppose that there is a frictional force opposing the motion of the mass whose magnitude is proportional to the velocity of the mass. Call the constant of proportionality b.
(a) What do you now expect to happen if the mass is pulled some distance from its rest position and released?
(b) Show that the second order equation governing the spring is now mx″ + bx′ + kx = 0. (It may be helpful to look back at the discussion of a falling body with air resistance.)
(c) Convert this into a system for the two variables x and v = x′. Draw a direction field for this system in the particular case m = 1, b = 1, k = 4. What do you think the solution curves now look like? Your answer should be consistent with your answer to part (a).

3. It can be shown that if a mass m is attached to the end of a massless rod of length ℓ free to pivot around its other end to make a pendulum, and if we ignore any frictional forces, then the pendulum moves according to the second order equation ℓθ″ + g sin θ = 0, where g is the usual gravitational constant and θ(t) is the angle that the rod makes with the vertical at time t. In physics classes it is customary to deal with this nonlinear equation by saying that for small values of θ, sin θ is nearly the same as θ and to replace the equation by the spring-like equation ℓθ″ + gθ = 0. But this is clearly not exactly correct, and leads to some incorrect predictions (such as the prediction that the period of the pendulum’s oscillation is independent of its maximum deflection from the vertical, which is close but not quite true) so it is a little dangerous to conclude from this approximation that the pendulum’s motion is really periodic just because the motion from the approximating equation is periodic.


(a) Convert the nonlinear pendulum equation into a system for the variables x = θ, y = θ′. Find all equilibrium solutions of the system. (There are infinitely many.) Interpret the equilibrium solutions in terms of the position of the pendulum. (They break into two groups.)

(b) Make a direction field for the system for the special case ℓ = g. Try to draw in some representative solutions of the system. (There are at least three kinds. Only one of these corresponds to periodic motion.) Interpret each kind of solution in terms of a motion of the pendulum. Notice that the direction field is much different near one class of equilibrium points than it is near the other class. (We say that one kind of equilibrium point is stable and the other is unstable.)

7.8.4 An Epidemic Model

In a previous section there was a problem about an outbreak of measles. The model predicted that everyone would eventually get the disease. Here we will look at a somewhat more sophisticated model that makes somewhat more realistic predictions. We will see that one prediction is the existence of a threshold number of cases: with fewer than the threshold number, the disease will just die out, but with more than the threshold number, the infected population will increase to an extent determined by how far over the threshold the system starts.

The model is designed for a fixed population and a disease which people get only once; afterwards they are recovered and immune, or possibly dead. We split the population into three groups: the infected (I(t) is the infected population at time t), the susceptible (those who do not have the disease but may get it; S(t) is the susceptible population at time t), and the removed (those who have had the disease and are no longer infectious for whatever reason; R(t) is the removed population at time t). They are related by the following system

\[
\begin{aligned}
S' &= -aSI \\
I' &= aSI - bI \\
R' &= bI .
\end{aligned}
\]

Here the first equation means that the rate of new infectives (equivalent to the rate of decrease of the susceptible population) is proportional to the number of meetings of susceptibles and infectives. The constant a may be thought of as related to how infectious the disease is. The second equation means that the number of infectives grows at the rate susceptibles get the disease, while the rate at which infectives become removed (non-infectious) is proportional to the number of infectives. (This is more convenient than totally realistic; it amounts to assuming that passing from infective to removed occurs randomly, like radioactive decay.)

Note that also S(t) + I(t) + R(t) = P (a constant), the total population. Thus there are really only two variables here. If we can determine what happens to S and I, then what happens to R is determined.

The first two equations alone involve only S and I so we can deal with them as a system in their own right.

\[
\begin{aligned}
S' &= -aSI \\
I' &= aSI - bI = I\,(aS - b)
\end{aligned}
\]
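The remark that S + I + R stays constant, which is what allows R to be dropped, can be checked by adding the three equations of the full system. A short verification in the notation of the text:

```latex
\[
\frac{d}{dt}\,(S + I + R) = S' + I' + R' = -aSI + (aSI - bI) + bI = 0 .
\]
```

So S(t) + I(t) + R(t) keeps its initial value P for all t, and R can always be recovered as R(t) = P − S(t) − I(t).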

We see that for an equilibrium solution we must have I = 0 (otherwise S would have to be both 0 and b/a), that is, the only steady state is when nobody is infected. If I = 0 then we do have an equilibrium solution regardless of the value of S. Thus the entire positive S-axis in the first quadrant of the SI-plane consists of equilibrium points. Moreover S = b/a is a nullcline where I′ = 0. This gives us a direction field like this where I have set b/a = 2 in order to make my software produce a picture:

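[Figure: direction field in the SI-plane, with S from 0 to 4 on the horizontal axis and I from 0 to 3 on the vertical axis, and the nullcline S = b/a marked.]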
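The signs of S′ and I′ on either side of the nullcline can be read off by evaluating the right-hand sides at a few points. In this Python sketch the particular values a = 1 and b = 2 are assumptions made only so that b/a = 2 as in the picture; the function name `rates` and the sample points are likewise mine.

```python
a, b = 1.0, 2.0  # ASSUMED illustrative values with b/a = 2

def rates(S, I):
    """Right-hand side of S' = -a S I, I' = a S I - b I."""
    return -a * S * I, I * (a * S - b)

left = rates(1.0, 1.0)    # a point with S < b/a
on = rates(2.0, 1.0)      # a point on the nullcline S = b/a
right = rates(3.0, 1.0)   # a point with S > b/a
print(left, on, right)
```

At all three points S′ is negative, so the arrows point to the left. The second component is negative to the left of the nullcline, zero on it, and positive to the right of it.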

We see first that all solutions move from right to left (that is, the number S of susceptibles must decrease). What varies according to the initial condition is what happens to I(t). There are two possibilities depending on whether the initial position (S(0), I(0)) is to the left or the right of the nullcline S = b/a. If S(0) < b/a, then I is always decreasing. Thus the disease will run its course without involving too many people.

On the other hand, if S(0) > b/a, then I(t) starts by increasing. How far it increases depends on S(0) − b/a. The farther to the right of b/a the solution curve starts, the higher it will go. This is an epidemic. Thus the value b/a can be regarded as a threshold value: an epidemic occurs only if there are more susceptibles than this.

One interesting feature is that if we try to imagine a solution curve following the direction field, it appears to hit the S-axis somewhere between S = 0 and S = b/a. That means that the disease has died out (no more infectives) without all susceptibles being used up; that is, not everybody gets the disease, unlike the simple measles model of a previous section. Moreover, for a given initial level of infectives, that is, fixed I0, if we are beyond the threshold value then the larger the initial pool of susceptibles (the farther to the right the solution curve starts), the farther to the left it will hit the S-axis, that is, the more the initial pool of susceptibles exceeds the threshold value, the fewer people will escape getting the disease. Adding more susceptibles increases the number who will get the disease by more than the number added. The exercises will explore this phenomenon.

Suppose we can immunize the population against the disease. This would mean reducing the pool of susceptibles, but not changing the equations or the threshold value. The discussion above suggests that to avoid an epidemic it is not necessary to immunize everyone, only to immunize enough so that the remaining susceptible population falls below the threshold.
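The threshold behaviour described above can be seen in a rough numerical experiment. The Python sketch below integrates the two-variable system with the classical Runge-Kutta method (a standard numerical scheme, not one developed in these notes) starting once below and once above the threshold. The values a = 1, b = 2, I0 = 0.1, the step size, and the number of steps are all assumptions chosen for illustration, and `simulate` is a hypothetical helper name.

```python
def simulate(S0, I0, a=1.0, b=2.0, dt=0.01, steps=6000):
    """Integrate S' = -a S I, I' = a S I - b I; return (peak I, final S)."""
    def rhs(S, I):
        return -a * S * I, a * S * I - b * I
    S, I, peak = S0, I0, I0
    for _ in range(steps):
        k1 = rhs(S, I)
        k2 = rhs(S + 0.5 * dt * k1[0], I + 0.5 * dt * k1[1])
        k3 = rhs(S + 0.5 * dt * k2[0], I + 0.5 * dt * k2[1])
        k4 = rhs(S + dt * k3[0], I + dt * k3[1])
        S += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        I += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        peak = max(peak, I)   # track the largest infected level seen so far
    return peak, S

peak_low, final_low = simulate(S0=1.0, I0=0.1)    # starts below b/a = 2
peak_high, final_high = simulate(S0=4.0, I0=0.1)  # starts above b/a = 2
print(peak_low, final_low)
print(peak_high, final_high)
```

In the run that starts below the threshold the infected level never rises above its initial value. In the run that starts above it, the infected level climbs well above I0 before dying out, and the number of susceptibles left at the end is smaller than in the first run even though that run started with four times as many.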

EXERCISES.

1. Suppose we have an immunization that is only partially effective. This could be interpreted as decreasing the infectiousness of the disease, that is, decreasing the parameter a while leaving the initial susceptible pool S0 and the initial infected pool I0 at the same level as before. What difference would that make to the direction field? What difference would it make to the number of people who contract the disease? Can an epidemic be prevented by a partially effective immunization?

2. We can find a family of formulas for the solution curves which give I as a function of S. By the chain rule,

\[
\frac{dI}{dS} = \frac{dI/dt}{dS/dt} = \frac{aSI - bI}{-aSI} = -1 + \frac{b}{aS} .
\]

(a) Integrate with respect to S on both sides to get a formula for I as a function of S. Do this as definite integrals from S0 to S1 to find the corresponding change I1 − I0.
(b) Substitute I1 = 0 into your equation to get an equation for S∞, the number of susceptibles left when the disease has run its course. It will depend on a, b, I0, and S0 and unfortunately cannot be solved for S∞ in terms of these other quantities.

3. Set b/a = 2, I0 = 1 and solve the equation from 2(b) numerically for various values of S0, say S0 = 1, 2, 3, 4. Make a table showing S∞ and S0 − S∞. What is the meaning of S0 − S∞? How does your table show at what level of susceptibles the number of additional people getting the disease when the susceptible pool is increased is greater than the number of susceptibles added?


Index

antiderivative, 239
area function, 242
asymptote, 44
average velocity, 90

Bisection method, 197

ceiling function, 14
concave up or down, 177
constant differences, 18
continuity, 119
cosh(x), 86
critical point, 177
cumulative distribution function, 293
Curve, 80
    average speed along, 148
    average velocity along, 147
    closed, 81
    initial, terminal times, 81
    instantaneous speed along, 150
    instantaneous velocity along, 149
    track, 81

definite integral, 213
density
    linear, 92
derivative from the right (left), 115
derivative of a function, 93
difference quotient, 93
Differential equation, 303
    autonomous, 311, 331
    equilibrium solution, 310
        sink, source, node, 318
        stable, unstable, 311
    initial condition, 305
    initial value problem, 305
    logistic equation, 307
    order, 305
    separable, 320
    slope field, 309
    system, 329
        direction field, 331
        Lotka-Volterra, 330
        Romeo and Juliet, 329
distribution function, 293
domain of function, 8

exponential function, 22
    constant ratios, 22

factorial, 209
floor function, 14
function, 8
    area, 242
    average value, 236
    bounded, 210
    composition, 45
    continuous, 119
    decreasing, 58
    derivative, 93
    dominates, 68
    even, odd, 33
    function as rule, 8
    horizontal line test, 57
    increasing, 58
    inverse, 55
    monotonic, 58
    one-to-one, 55
    periodic, 33
    Riemann integrable, 213
    vertical line test, 57

Gabriel’s horn, 263
graph of function, 8, 10

hyperbolic sine and cosine, 86

Improper integral, 262, 268
instantaneous velocity, 91
inverse function, 55


left hand sum, 210
local linearity, 105
logarithm, 60
    common or natural, 61
logarithmic derivative, 133
lower sum, 212

mapping, 10
mapping diagram, 10
mean of a pdf, 298
median, 298
midpoint rule, 219
monotonic function, 58

natural logarithm, 61
Newton’s method, 197
normal distribution, 301

one-sided derivative, 115

parameter, 21, 192
Parametric equations, 79
partition of an interval, 210
periodic function, 33
    amplitude, 34
point of inflection, 178
power function, 27
present value, 287
probability, 300
probability density function, 295

right hand sum, 211

secant line, 94
sigma notation, 208
sign function, 14
Simpson’s rule, 220
sinh(x), 86
Slope field, 250, 309
slope of line, 16
Snell’s Law, 191
standard deviation, 299
step function, 214
substitution, 252
sum
    left hand, 210
    lower, 212
    right hand, 211
    upper, 213

tangent line, 94
Terminal velocity, 325
trapezoidal rule, 219

upper sum, 213

Vector, 146
    length, 146
