1Chapters 1-4

8/6/2019 1Chapters 1-4


MATH283 Module notes

Field Theory and Partial Differential Equations

These notes contain all of the material which will be presented on the data projector in lectures: that is, motivation, theory, and a few examples. Thus you should not need to take detailed handwritten notes while the data projector is being used.

Most examples will be presented on the board, and are not contained in these notes. Each time there is some board material, there's a little dagger in the margin of these notes, like this: †24. You should ensure that these daggers are properly cross-referenced with the relevant parts of your written notes: perhaps the simplest way to do this is to number sections of your written notes according to the number by each dagger in these notes.

Most section headings in the notes are followed in brackets by a reference to the relevant section(s) of the module textbook, where you will usually find more information and examples.


Figure 1.1: A typical contour map (image from Wikipedia)

field is constant, and the field increases or decreases most rapidly if you move

perpendicular to these contour surfaces.

The density of electric charge $\rho = \rho(x,y,z)$ is a scalar field (the quantity of charge per unit volume).

The electrostatic potential $\phi = \phi(x,y,z)$ is a scalar field (it gives the energy of a charged test particle at each point in space).

Vector fields

Wind velocity is a vector which depends on position in space (as long as we stay in the atmosphere): that is, a vector field. It has both a magnitude (or size) and a direction at each point of space. That is, it is a vector function $\mathbf{v} = \mathbf{v}(x,y,z)$ of the coordinates $(x,y,z)$. Since a vector is described by three components, we can describe this vector field as$^2$

\[ \mathbf{v} = \mathbf{v}(x,y,z) = (v_1(x,y,z),\, v_2(x,y,z),\, v_3(x,y,z)). \]

Continuing the two-dimensional example of height above sea level, you can imagine drawing an arrow (a vector) at each point in Merseyside, which points directly uphill (the direction in which the land is steepest, or has the greatest gradient) and whose length is given by the steepness (gradient) of the land in that direction. This gives us a physical vector at each point, that is, a vector field. This vector field $\mathbf{v}(x,y)$ is called the gradient of the scalar field $h(x,y)$. It points in the direction in which $h(x,y)$ increases most rapidly, and is perpendicular to the contour lines $h(x,y) = \text{constant}$.

$^2$ Don't be tempted to say that $v_1(x,y,z)$, $v_2(x,y,z)$, and $v_3(x,y,z)$ are scalar fields though. They're not, even though they are numbers which depend on position in space. The point is that they change if you change your coordinate system, and physical quantities don't do that.

The electric current density $\mathbf{J} = \mathbf{J}(x,y,z) = (J_1(x,y,z),\, J_2(x,y,z),\, J_3(x,y,z))$ is a vector field (it describes the motion of charge).

The electric field $\mathbf{E} = \mathbf{E}(x,y,z)$ is a vector field (it describes the force exerted on charged test particles).

The magnetic field $\mathbf{B} = \mathbf{B}(x,y,z)$ is a vector field (it describes the force exerted on moving charged test particles).

1.2 Partial differential equations: Maxwell's equations

The laws of physics are generally described by differential equations. When the fundamental physical quantities involved are scalar or vector fields, which depend on three spatial variables $x$, $y$, and $z$, as well as time $t$, these differential equations will be partial differential equations.

An excellent example is provided by Maxwell's equations, which describe how the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ are related to their sources (charge density $\rho$ and current density $\mathbf{J}$) and to each other. Here they are:

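\[
\begin{aligned}
\frac{\partial E_1}{\partial x} + \frac{\partial E_2}{\partial y} + \frac{\partial E_3}{\partial z} &= \frac{\rho}{\epsilon_0}, \\
\frac{\partial E_3}{\partial y} - \frac{\partial E_2}{\partial z} &= -\frac{\partial B_1}{\partial t}, \\
\frac{\partial E_1}{\partial z} - \frac{\partial E_3}{\partial x} &= -\frac{\partial B_2}{\partial t}, \\
\frac{\partial E_2}{\partial x} - \frac{\partial E_1}{\partial y} &= -\frac{\partial B_3}{\partial t}, \\
\frac{\partial B_1}{\partial x} + \frac{\partial B_2}{\partial y} + \frac{\partial B_3}{\partial z} &= 0, \\
\frac{\partial B_3}{\partial y} - \frac{\partial B_2}{\partial z} &= \mu_0 J_1 + \frac{1}{c^2}\frac{\partial E_1}{\partial t}, \\
\frac{\partial B_1}{\partial z} - \frac{\partial B_3}{\partial x} &= \mu_0 J_2 + \frac{1}{c^2}\frac{\partial E_2}{\partial t}, \\
\frac{\partial B_2}{\partial x} - \frac{\partial B_1}{\partial y} &= \mu_0 J_3 + \frac{1}{c^2}\frac{\partial E_3}{\partial t}.
\end{aligned}
\]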


Here

$\mathbf{E} = \mathbf{E}(x,y,z,t) = (E_1(x,y,z,t),\, E_2(x,y,z,t),\, E_3(x,y,z,t))$ is the electric field;

$\mathbf{B} = \mathbf{B}(x,y,z,t) = (B_1(x,y,z,t),\, B_2(x,y,z,t),\, B_3(x,y,z,t))$ is the magnetic field;

$\rho = \rho(x,y,z,t)$ is the electric charge density;

$\mathbf{J} = \mathbf{J}(x,y,z,t) = (J_1(x,y,z,t),\, J_2(x,y,z,t),\, J_3(x,y,z,t))$ is the electric current density;

$\epsilon_0$ is the electric constant (a universal constant);

$c$ is the speed of light (also a universal constant); and

$\mu_0 = \dfrac{1}{\epsilon_0 c^2}$, the magnetic constant (or vacuum permeability).

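You can see that these equations are complicated, and indeed they are quite impossible to solve except in the simplest cases. However, one thing we can do is to write them in a simpler and more revealing form: in notation that we will introduce in Chapter 4, the eight equations above can be rewritten

\[ \nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}, \tag{1.1} \]

\[ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{1.2} \]

\[ \nabla \cdot \mathbf{B} = 0, \tag{1.3} \]

\[ \nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t}. \tag{1.4} \]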

Of course, writing the equations in a simpler form doesn't make them simpler equations: it just means we've packed more into our notation. However, written like this the equations are certainly more memorable, and, once you understand the notation, they provide a more immediate insight into the physics too. Roughly speaking, the four equations say:

(1.1) The flux of an electric field through a closed surface is equal to the total charge enclosed by the surface divided by $\epsilon_0$;

(1.2) Electric fields are generated by magnetic fields which change in time;

(1.3) The flux of a magnetic field through a closed surface is zero (there are no magnetic monopoles); and

(1.4) Magnetic fields are generated by electric fields which change in time, and by electric currents.


The statements about flux, and their equivalence to (1.1) and (1.3), are a consequence of the famous divergence theorem of Gauss, which we will study in Chapter 5. Similarly, Stokes's theorem will provide macroscopic versions of equations (1.2) and (1.4).

We will comment on Maxwell's equations at various points throughout the module, and in particular on some special cases, in which the equations simplify:

The static case, in which $\partial \mathbf{E}/\partial t = \mathbf{0}$ and $\partial \mathbf{B}/\partial t = \mathbf{0}$;

The free space case, in which $\rho = 0$ and $\mathbf{J} = \mathbf{0}$ (there are no charges in a vacuum).

However, we won't make any serious attempt to produce solutions of Maxwell's equations. In Chapter 6, we shall study methods of solution of some simpler equations: the wave equation and Laplace's equation (which are in fact related to Maxwell's equations in special cases); and the heat equation, which describes how temperature $T(x,y,z,t)$ evolves in a body.

1.3 Coordinate systems

When we describe a scalar field $f$ as $f(x,y,z)$, or when we resolve a vector (field) $\mathbf{v}$ into its components $\mathbf{v} = (v_1, v_2, v_3)$, we are relying on a choice of coordinate system: a choice of a preferred point in space called the origin, a choice of three preferred directions for our coordinate axes, and a choice of scale/units for measuring displacement in each of the three coordinate directions.

It is a basic principle that the laws of physics do not depend on the choice of coordinate system. It would be a very strange universe if people observed different laws of nature depending on how they set up their coordinates.

The coordinate systems alluded to above are all Cartesian systems, in which we specify a point by its displacement from the origin in three orthogonal directions. We shall also briefly consider, in Section 5.3, two examples of curvilinear coordinate systems, namely cylindrical polar coordinates and spherical polar coordinates. These can be seen as three-dimensional versions of the familiar polar coordinates $(R, \theta)$ in two dimensions. They are convenient for studying problems with cylindrical symmetry (e.g. the magnetic field due to a current along a straight wire) and spherical symmetry (e.g. the electric field due to a point charge) respectively.


Chapter 2

Revision of vectors

This chapter is a brief summary of those aspects of the elementary theory of vectors

which will be used in the remainder of the module.

2.1 Vectors and vector arithmetic (9.1)

A vector is a quantity which has both magnitude (size) and direction, as opposed to a

scalar which is just a number. Common examples of vectors include:

Velocity: the velocity of an aeroplane has both a magnitude (the speed of the plane) and a direction (the direction of travel).

Force: if a force is applied to a body, it has both a magnitude (the strength of the force) and a direction (the direction in which it is applied).

Electric field: an electric field has both a magnitude (the size of the force which it exerts on a charged particle) and a direction (the direction of that force).

Displacement: given two points, the displacement from the first to the second has both a magnitude (the distance between the points) and a direction (the direction from the first to the second).

In three-dimensional space, we regard a vector as a directed line segment, whose length gives the magnitude of the vector, and which points in the direction of the vector. If we have a coordinate system (i.e. a choice of $x$, $y$, and $z$ axes), then we can describe the vector by its components: three numbers which describe its extent in the $x$, $y$, and $z$ directions. (Alternatively, these three numbers are the coordinates of the endpoint of the vector when its initial point is placed at the origin.) Thus, for example, we write

\[ \mathbf{v} = (5, -3, 4) \]

(or $\underline{v} = (5, -3, 4)$ or $\vec{v} = (5, -3, 4)$ when writing on paper or a blackboard).


An alternative notation is to write $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$ and $\mathbf{k} = (0, 0, 1)$ for the three coordinate vectors, and then write

\[ \mathbf{v} = 5\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}. \]

The notation $(5, -3, 4)$ is clearly easier both to read and to write, and we shall use it almost all of the time.

There is, however, one advantage to the notation $5\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$, which is that it makes it clear that we're talking about a vector. When we write $(5, -3, 4)$, it is ambiguous whether we are referring to the point with coordinates $(5, -3, 4)$, or the vector with components $(5, -3, 4)$. These are, of course, completely different things: a point has (is) a position, but no size or direction, whereas a vector has size and direction but no position.$^1$ (The concepts start to get confused when we talk about the position vector of a point $P$, which is the displacement vector from the origin to $P$. However, since the position vector of $P$ depends on our choice of origin, which is arbitrary, it has no physical meaning, which is not to say that it isn't convenient when doing calculations in a particular coordinate system.)

Recipe 2.1 (Magnitude of a vector)

By Pythagoras's theorem, the magnitude $|\mathbf{v}|$ of a vector $\mathbf{v} = (v_1, v_2, v_3)$ is given by

\[ |\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + v_3^2}. \]

Vectors can be added, subtracted, and multiplied by scalars:

Recipe 2.2 (Vector addition, subtraction, and multiplication by a scalar)

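Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors, and $c$ be a scalar (a number). Then

\[
\begin{aligned}
\mathbf{v} + \mathbf{w} &= (v_1 + w_1,\, v_2 + w_2,\, v_3 + w_3), \\
\mathbf{v} - \mathbf{w} &= (v_1 - w_1,\, v_2 - w_2,\, v_3 - w_3), \text{ and} \\
c\mathbf{v} &= (cv_1,\, cv_2,\, cv_3).
\end{aligned}
\]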
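Recipes 2.1 and 2.2 translate directly into a few lines of code. The following is a minimal Python sketch; the function names and the sample vectors are illustrative choices made here, not part of the module material.

```python
import math

def magnitude(v):
    """Length of a vector, by Pythagoras's theorem (Recipe 2.1)."""
    return math.sqrt(sum(c * c for c in v))

def add(v, w):
    """Componentwise sum v + w."""
    return tuple(a + b for a, b in zip(v, w))

def subtract(v, w):
    """Componentwise difference v - w."""
    return tuple(a - b for a, b in zip(v, w))

def scale(c, v):
    """Scalar multiple c v."""
    return tuple(c * a for a in v)

# Illustrative values only (not the vectors used in the board examples).
v = (2.0, -1.0, 2.0)
w = (1.0, 0.0, -3.0)

print(magnitude(v))              # 3.0
print(add(v, w))                 # (3.0, -1.0, -1.0)
print(subtract(v, scale(2, w)))  # (0.0, -1.0, 8.0)
```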

Geometrically,

$\mathbf{v} + \mathbf{w}$ is the vector obtained by concatenating the two vectors head to tail (see Figure 2.1). (Physically, it expresses the combination of the two vectors. For example, if a car is subject to a force $\mathbf{F}_1$ due to the thrust of its engine, and $\mathbf{F}_2$ due to friction, then the combined force on the car is $\mathbf{F}_1 + \mathbf{F}_2$.)

$\mathbf{v} - \mathbf{w}$ is the vector from the end point of $\mathbf{w}$ to the end point of $\mathbf{v}$, when $\mathbf{v}$ and $\mathbf{w}$ are placed with their initial points coinciding (see Figure 2.1).

$c\mathbf{v}$ is a vector whose magnitude is $|c|$ times the magnitude of $\mathbf{v}$, and which points in the same direction if $c > 0$, and the opposite direction if $c < 0$. For example, $\tfrac{1}{2}\mathbf{v}$ has half the magnitude of $\mathbf{v}$ and the same direction, while $-2\mathbf{v}$ has twice the magnitude and the opposite direction.

Figure 2.1: Vector addition and subtraction

$^1$ This is true for vectors in flat three-dimensional space, the way we've defined them. When you start to talk about vectors in curved spaces (or space-times), it is important to keep track of the position of a vector. For example, the surface of the earth is a curved two-dimensional space. If we have a vector in Liverpool and another in Suzhou, it doesn't make much sense to ask whether they are pointing in the same direction, or to add them, because of the curvature of the earth.

Example 2.3 (Vector arithmetic)

Let $\mathbf{v} = (5, 3, 4)$ and $\mathbf{w} = (0, 1, 1)$. Calculate $|\mathbf{v}|$; $|\mathbf{w}|$; $\mathbf{v} + \mathbf{w}$; and $\mathbf{v} - 2\mathbf{w}$. †1

2.2 The scalar product (dot product) (9.2)

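The scalar product or dot product is a way of combining two vectors $\mathbf{v}$ and $\mathbf{w}$ to get a scalar $\mathbf{v} \cdot \mathbf{w}$. Here are some of the most important facts about the scalar product:

1. It is defined by $\mathbf{v} \cdot \mathbf{w} = |\mathbf{v}|\,|\mathbf{w}| \cos\theta$, where $\theta$ is the angle between the vectors $\mathbf{v}$ and $\mathbf{w}$. (See Figure 2.2.)

2. It can be calculated by

\[ \mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3, \]

where $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$.

3. $\mathbf{v} \cdot \mathbf{w} = 0$ (for non-zero vectors $\mathbf{v}$ and $\mathbf{w}$) if and only if $\mathbf{v}$ and $\mathbf{w}$ are orthogonal ($\cos\theta = 0$). It has greatest magnitude when $\mathbf{v}$ and $\mathbf{w}$ are parallel ($\cos\theta = \pm 1$).

4. The scalar product satisfies all the expected rules of normal arithmetic, e.g.

\[ \mathbf{v} \cdot \mathbf{w} = \mathbf{w} \cdot \mathbf{v}, \qquad (c\mathbf{v}) \cdot \mathbf{w} = c(\mathbf{v} \cdot \mathbf{w}), \qquad \text{and} \qquad \mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}. \]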


This means you can calculate using the scalar product without having to think too

carefully (contrast this with the vector product in Section 2.4 below).
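The component formula and the arithmetic rules for the scalar product can be spot-checked numerically. This is only a sketch with arbitrarily chosen integer vectors (so that the comparisons are exact); a check on one set of vectors is of course not a proof.

```python
def dot(v, w):
    """Scalar product via the component formula v1*w1 + v2*w2 + v3*w3."""
    return sum(a * b for a, b in zip(v, w))

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    return tuple(c * a for a in v)

# Arbitrary integer vectors, chosen only for illustration.
u, v, w = (1, 2, 3), (4, -5, 6), (-2, 0, 7)
c = 3

print(dot(v, w) == dot(w, v))                      # commutative
print(dot(scale(c, v), w) == c * dot(v, w))        # scalars factor out
print(dot(u, add(v, w)) == dot(u, v) + dot(u, w))  # distributive
print(dot((1, 0, 0), (0, 1, 0)))                   # orthogonal vectors give 0
```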

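Figure 2.2: The scalar product: $\mathbf{v} \cdot \mathbf{w} = |\mathbf{v}|\,|\mathbf{w}| \cos\theta$

Interpretation of the scalar product

If $\mathbf{w}$ is a unit vector (that is, if $|\mathbf{w}| = 1$), then $\mathbf{v} \cdot \mathbf{w}$ is the component of $\mathbf{v}$ in the direction of $\mathbf{w}$. See Figure 2.3: since $|\mathbf{w}| = 1$, it follows that the indicated length is $|\mathbf{v}| \cos\theta = |\mathbf{v}|\,|\mathbf{w}| \cos\theta = \mathbf{v} \cdot \mathbf{w}$.

(A special case of this is that the components of $\mathbf{v} = (v_1, v_2, v_3)$ along the coordinate axes are given by $\mathbf{v} \cdot \mathbf{i} = v_1$, $\mathbf{v} \cdot \mathbf{j} = v_2$, and $\mathbf{v} \cdot \mathbf{k} = v_3$.)

Figure 2.3: When $|\mathbf{w}| = 1$, $\mathbf{v} \cdot \mathbf{w}$ is the component of $\mathbf{v}$ in the direction of $\mathbf{w}$

Combining the two formulae $\mathbf{v} \cdot \mathbf{w} = |\mathbf{v}|\,|\mathbf{w}| \cos\theta$ and $\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3$ gives a useful method for calculating the angle between two vectors.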

Recipe 2.4 (Angle between two vectors)

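Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be two non-zero vectors. Then the angle $\theta$ between them can be calculated by

\[ \theta = \cos^{-1}\left( \frac{\mathbf{v} \cdot \mathbf{w}}{|\mathbf{v}|\,|\mathbf{w}|} \right). \]

Here $\mathbf{v} \cdot \mathbf{w}$ should be calculated using $\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3$, and $|\mathbf{v}|$ and $|\mathbf{w}|$ using Recipe 2.1.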
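The recipe for the angle can be sketched in Python as follows. The function name `angle_between` and the test vectors are assumptions made for this illustration; the clamping step is a numerical safeguard added here, since rounding can push the ratio very slightly outside the interval from -1 to 1.

```python
import math

def angle_between(v, w):
    """Angle (in radians) between two non-zero vectors, from the scalar product."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    if norm_v == 0 or norm_w == 0:
        raise ValueError("the angle is undefined for a zero vector")
    # Guard against rounding errors before calling acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_v * norm_w)))
    return math.acos(cos_theta)

# Illustrative vectors: the angle here is 45 degrees, up to rounding.
print(math.degrees(angle_between((1, 0, 0), (1, 1, 0))))
```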

Example 2.5 (Angle between two vectors)

Let $\mathbf{v} = (5, 3, 4)$ and $\mathbf{w} = (0, 1, 1)$. Calculate the angle between $\mathbf{v}$ and $\mathbf{w}$. †2


Important comment

A common mistake is for students to make a sort of vector product out of the scalar product, by writing $\mathbf{v} \cdot \mathbf{w} = (v_1 w_1, v_2 w_2, v_3 w_3)$.

Not only is this wrong (and the name scalar product should serve as a reminder), but it makes no physical sense. To see why this is so, imagine that we defined a new star product of vectors by saying that

\[ \mathbf{v} \star \mathbf{w} = (v_1 w_1,\, v_2 w_2,\, v_3 w_3). \]

Now Janet and John both try to calculate the star product of the two physical vectors shown in the left hand side of Figure 2.4 (which lie in the $x$-$y$ plane).

John uses coordinate axes as shown in the middle of the figure (and the $z$ axis coming up out of the paper). He notes that $\mathbf{v} = (1, 0, 0)$ and $\mathbf{w} = (0, 1, 0)$, so $\mathbf{v} \star \mathbf{w} = (0, 0, 0)$, the zero vector.

Janet, who tends to look at things diagonally, uses the coordinate axes shown on the right of the figure (and, again, the $z$ axis coming up out of the paper). So she works out that $\mathbf{v} = (1/\sqrt{2}, -1/\sqrt{2}, 0)$ and $\mathbf{w} = (1/\sqrt{2}, 1/\sqrt{2}, 0)$, so that $\mathbf{v} \star \mathbf{w} = (1/2, -1/2, 0)$.

That is, whether or not $\mathbf{v} \star \mathbf{w}$ is the zero vector depends on what coordinate system we use. But the laws of physics really shouldn't care whether we use John's system or Janet's system. So the star product can't have any physical meaning.

Figure 2.4: The star product of two vectors makes no physical sense

Notice that the scalar product does not depend on the coordinate system which we use, since it is defined as $\mathbf{v} \cdot \mathbf{w} = |\mathbf{v}|\,|\mathbf{w}| \cos\theta$. The magnitudes of the vectors and the angle between them don't depend on our choice of coordinates.
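The coordinate dependence of the star product, and the coordinate independence of the scalar product, can be seen numerically. The sketch below assumes that the second coordinate system is obtained by rotating the axes anticlockwise by 45 degrees about the $z$ axis (the orientation is an assumption, since it is fixed only by the figure); the helper names are hypothetical.

```python
import math

def star(v, w):
    """The (unphysical) componentwise 'star product'."""
    return tuple(a * b for a, b in zip(v, w))

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def rotate_axes(v, angle):
    """Components of v relative to axes rotated anticlockwise by `angle` about z."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * y, -s * x + c * y, z)

# Components in the first coordinate system.
v, w = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
# Components of the same two physical vectors in the rotated system.
v2, w2 = rotate_axes(v, math.pi / 4), rotate_axes(w, math.pi / 4)

print(star(v, w))               # the zero vector
print(star(v2, w2))             # roughly (0.5, -0.5, 0.0): not the zero vector
print(dot(v, w), dot(v2, w2))   # the scalar product is zero in both systems
```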

2.3 The equation of a plane (9.2)

In this section, we find the equation of a plane through a given point $P$ with a given normal vector $\mathbf{n} = (a, b, c)$. (In other words, $\mathbf{n}$ is perpendicular or orthogonal to the plane.)

Let $P$ have coordinates $(x_0, y_0, z_0)$, and let $R$ be a general point on the plane with coordinates $(x, y, z)$ (see Figure 2.5). Since we have chosen an origin $O$ (otherwise how could we talk about the equation of the plane?), we can talk about the position vector $\mathbf{p}$ of the point $P$ with components $\mathbf{p} = (x_0, y_0, z_0)$, and the position vector $\mathbf{r}$ of the point $R$ with components $\mathbf{r} = (x, y, z)$.

Figure 2.5: A plane through $P$ with normal vector $\mathbf{n}$

The crucial point is that, for all points $R$ in the plane, the displacement vector $\overrightarrow{PR} = \mathbf{r} - \mathbf{p}$ is perpendicular to $\mathbf{n}$. That is, $\mathbf{n} \cdot \overrightarrow{PR} = 0$, or $\mathbf{n} \cdot (\mathbf{r} - \mathbf{p}) = 0$, or

\[ \mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{p}. \]

Now the left hand side is $\mathbf{n} \cdot \mathbf{r} = (a, b, c) \cdot (x, y, z) = ax + by + cz$, while the right hand side $\mathbf{n} \cdot \mathbf{p}$ is just a constant (since $\mathbf{n} = (a, b, c)$ and $\mathbf{p} = (x_0, y_0, z_0)$ are given).

Summary

The equation of a plane is of the form $ax + by + cz = d$, where $a$, $b$, $c$, and $d$ are constants. Moreover, the vector $(a, b, c)$ is normal to the plane.

Recipe 2.6 (Equation of a plane through a given point with given normal vector)

To find the equation of the plane through the point P with normal vector n = (a,b,c):

1. The equation of the plane is ax + by + cz = d for some constant d.

2. To find the constant d, substitute the coordinates of the point P in the left hand

side.
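The two steps of this recipe amount to computing $d = \mathbf{n} \cdot \mathbf{p}$. A minimal Python sketch, with a hypothetical function name and an illustrative point and normal vector (not those of the board example):

```python
def plane_through_point(p, n):
    """Return (a, b, c, d) such that ax + by + cz = d is the plane
    through the point p with normal vector n = (a, b, c)."""
    a, b, c = n
    if (a, b, c) == (0, 0, 0):
        raise ValueError("the normal vector must be non-zero")
    # Step 2 of the recipe: substitute the coordinates of P, i.e. d = n . p
    d = a * p[0] + b * p[1] + c * p[2]
    return (a, b, c, d)

# Illustrative values: the plane 2x - y + 4z = 12 through (1, 2, 3).
print(plane_through_point((1, 2, 3), (2, -1, 4)))  # (2, -1, 4, 12)
```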

Example 2.7 (Equation of a plane)

Find the equation of the plane through the point $(2, 0, 1)$ with normal vector $(1, 2, 1)$. †3

Recipe 2.8 shows how to find the equation of a plane through three given points.


2.4 The vector product (cross product) (9.3)

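The vector product or cross product is a way of combining two vectors $\mathbf{v}$ and $\mathbf{w}$ to get a vector $\mathbf{v} \times \mathbf{w}$. Here are some of the most important facts about the vector product:

1. It is defined by $\mathbf{v} \times \mathbf{w} = |\mathbf{v}|\,|\mathbf{w}| \sin\theta \, \mathbf{n}$, where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$, and $\mathbf{n}$ is a unit vector normal to both $\mathbf{v}$ and $\mathbf{w}$.

If $\mathbf{v}$ and $\mathbf{w}$ point in the same or opposite directions, then there is no unique direction normal to both of them. This doesn't matter, since in this case $\theta = 0$ or $\theta = \pi$, so $\sin\theta = 0$ and $\mathbf{v} \times \mathbf{w} = \mathbf{0}$.

Otherwise there is a choice of two vectors normal to both $\mathbf{v}$ and $\mathbf{w}$. The convention is to pick $\mathbf{n}$ by the right-hand rule. Imagine turning a right-handed corkscrew through the angle $\theta < \pi$ from $\mathbf{v}$ to $\mathbf{w}$. Then $\mathbf{n}$ points in the direction that the point of the corkscrew moves (see Figure 2.6). Of course this means that $\mathbf{w} \times \mathbf{v} = -\mathbf{v} \times \mathbf{w}$.

Figure 2.6: The cross product

2. It can be calculated by

\[ \mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2,\; v_3 w_1 - v_1 w_3,\; v_1 w_2 - v_2 w_1), \]

where $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$. More memorably, we can write this as

\[ \mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]

3. $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ (for non-zero vectors $\mathbf{v}$ and $\mathbf{w}$) if and only if $\mathbf{v}$ and $\mathbf{w}$ are parallel ($\sin\theta = 0$). It has greatest magnitude when $\mathbf{v}$ and $\mathbf{w}$ are orthogonal ($\sin\theta = 1$).

4. The vector product satisfies some of the expected rules of normal arithmetic, e.g.

\[ (c\mathbf{v}) \times \mathbf{w} = c(\mathbf{v} \times \mathbf{w}) \qquad \text{and} \qquad \mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}. \]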


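However, there are some pitfalls! As already noted, the cross product is not commutative, but rather $\mathbf{w} \times \mathbf{v} = -\mathbf{v} \times \mathbf{w}$.

Also it is not associative: that is, $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) \neq (\mathbf{u} \times \mathbf{v}) \times \mathbf{w}$ in general, and so it is meaningless to omit the brackets and just write $\mathbf{u} \times \mathbf{v} \times \mathbf{w}$. In fact, we have the following identities:

\[
\begin{aligned}
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= \mathbf{v}(\mathbf{u} \cdot \mathbf{w}) - \mathbf{w}(\mathbf{u} \cdot \mathbf{v}), \text{ and} \\
(\mathbf{u} \times \mathbf{v}) \times \mathbf{w} &= \mathbf{v}(\mathbf{u} \cdot \mathbf{w}) - \mathbf{u}(\mathbf{v} \cdot \mathbf{w}).
\end{aligned}
\]

(To remember these identities: the two terms on the right hand side outside the brackets are the two terms in brackets on the left hand side. The term with the $+$ sign corresponds to the middle vector on the left hand side ($\mathbf{v}$ in each case), while the term with the $-$ sign corresponds to the other bracketed vector on the left hand side.)

Verifying these identities is just a (longish) calculation.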
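Short of doing the full calculation, the identities can at least be spot-checked on particular vectors. The sketch below uses arbitrarily chosen integer vectors, so all comparisons are exact; passing on one example illustrates the identities but does not prove them.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Vector product from the component formula."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def scale(c, v):
    return tuple(c * a for a in v)

def subtract(v, w):
    return tuple(a - b for a, b in zip(v, w))

# Arbitrary integer vectors, chosen only for illustration.
u, v, w = (1, 2, 3), (4, -5, 6), (-2, 0, 7)

lhs1 = cross(u, cross(v, w))
rhs1 = subtract(scale(dot(u, w), v), scale(dot(u, v), w))
lhs2 = cross(cross(u, v), w)
rhs2 = subtract(scale(dot(u, w), v), scale(dot(v, w), u))

print(lhs1 == rhs1, lhs2 == rhs2)             # both identities hold here
print(cross(w, v) == scale(-1, cross(v, w)))  # anti-commutativity
print(lhs1 == lhs2)                           # False: not associative
```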

Interpretation of the vector product

The magnitude $|\mathbf{v} \times \mathbf{w}| = |\mathbf{v}|\,|\mathbf{w}| \sin\theta$ of $\mathbf{v} \times \mathbf{w}$ is the area of the parallelogram defined by $\mathbf{v}$ and $\mathbf{w}$ (see Figure 2.7).

Figure 2.7: The vector product as the area of a parallelogram

Often, the very fact that $\mathbf{v} \times \mathbf{w}$ is normal to both $\mathbf{v}$ and $\mathbf{w}$ is a useful tool, as in the following recipe.

Recipe 2.8 (Equation of a plane through three given points)

To find the equation of the plane through the three points P, Q, and R:

1. Calculate the displacement vectors $\mathbf{d}_1 = \overrightarrow{PQ}$ and $\mathbf{d}_2 = \overrightarrow{PR}$.

2. Calculate the vector product $\mathbf{n} = \mathbf{d}_1 \times \mathbf{d}_2$.

3. If $\mathbf{n} = \mathbf{0}$ then the three points lie on a common line, and so don't define a plane.

4. Otherwise, $\mathbf{n}$ is normal to the plane. Proceed as in Recipe 2.6.
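The four steps of this recipe can be sketched in Python as follows. The function name and the three sample points are illustrative assumptions, chosen so that the arithmetic is easy to follow by hand.

```python
def subtract(v, w):
    return tuple(a - b for a, b in zip(v, w))

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_through_points(P, Q, R):
    """Return (a, b, c, d) with ax + by + cz = d through the points P, Q, R."""
    d1 = subtract(Q, P)          # step 1: displacement from P to Q
    d2 = subtract(R, P)          #         displacement from P to R
    n = cross(d1, d2)            # step 2: a normal vector
    if n == (0, 0, 0):           # step 3: collinear points define no plane
        raise ValueError("the three points lie on a common line")
    return n + (dot(n, P),)      # step 4: d = n . p, as in Recipe 2.6

# Illustrative points: the result is the plane 6x + 3y + 2z = 6.
print(plane_through_points((1, 0, 0), (0, 2, 0), (0, 0, 3)))  # (6, 3, 2, 6)
```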

Example 2.9 (Equation of a plane through three given points)

Find the equation of the plane through the three points $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$. †4


Chapter 3

Partial Differentiation

3.1 Partial derivatives (A3.2)

Recall that for a standard function $f(x)$ of one variable, the derivative $\frac{df}{dx}$ tells us the rate of change of $f(x)$ with respect to $x$: how much $f(x)$ will change if we make a small change in $x$. Informally, if we make an infinitesimal change $dx$ to $x$, then the corresponding change $df$ of $f$ is given by

\[ df = \frac{df}{dx}\, dx. \]

Practically, if we make a very small change $\delta x$ to $x$, then the corresponding change $\delta f$ of $f$ is given approximately by

\[ \delta f \approx \frac{df}{dx}\, \delta x. \]

For example, if $f(x) = x^2$, then $\frac{df}{dx} = 2x$. Let $x_0 = 1$, so that $\frac{df}{dx}(x_0) = 2$. If we change $x_0$ by $\delta x = 0.001$, then the change in $f$ is

\[ \delta f = f(1.001) - f(1) = 0.002001 \approx 2\,\delta x. \]
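The quality of this approximation can be explored numerically. The sketch below tabulates the exact change and the linear estimate for $f(x) = x^2$ at $x_0 = 1$ for a few step sizes (the step sizes are illustrative choices); for this particular function the discrepancy is exactly $(\delta x)^2$, so it shrinks much faster than $\delta x$ itself.

```python
def f(x):
    return x ** 2

def dfdx(x):
    return 2 * x

x0 = 1.0
for dx in (0.1, 0.01, 0.001):
    exact = f(x0 + dx) - f(x0)   # the true change in f
    approx = dfdx(x0) * dx       # the linear estimate (df/dx) * dx
    print(dx, exact, approx, exact - approx)
```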

Now consider a function $f(x,y,z)$ of three variables. The partial derivative $\frac{\partial f}{\partial x}$ is obtained by differentiating $f(x,y,z)$ with respect to $x$, while holding $y$ and $z$ constant. It tells us how much $f$ will change if we make a small change in $x$ while keeping $y$ and $z$ constant. Similarly we can define the partial derivatives $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$.

Thus the three partial derivatives tell us how fast $f$ will change if we make a small change in $(x,y,z)$ parallel to one of the coordinate axes: that is, to $(x + \delta x, y, z)$, $(x, y + \delta y, z)$, or $(x, y, z + \delta z)$. In Chapter 4 we will see how to determine the rate of change of $f$ when we make a small change to $(x,y,z)$ in an arbitrary direction.
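The idea of changing one coordinate while holding the others fixed can be mimicked numerically with a central difference. In the sketch below, the function `partial`, the step size `h`, and the sample function are all hypothetical choices made for illustration; the estimates should agree closely with the hand-computed partial derivatives $2xy$, $x^2$, and $\cos z$.

```python
import math

def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to coordinate number i (0 for x, 1 for y, 2 for z)."""
    up, down = list(point), list(point)
    up[i] += h      # change only the i-th coordinate...
    down[i] -= h    # ...holding the other two constant
    return (f(*up) - f(*down)) / (2 * h)

def f(x, y, z):
    # Hypothetical example function: f = x^2 y + sin z
    return x * x * y + math.sin(z)

p = (1.0, 2.0, 0.5)
print([partial(f, p, i) for i in range(3)])  # close to [4, 1, cos(0.5)]
```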

Example 3.1 (Partial differentiation)

Compute $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ for

\[ f(x,y,z) = x e^{x+y+z} + 2y. \]

†5


3.2 Second derivatives (A3.2)

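The partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$ are themselves functions of $x$, $y$, and $z$, which can be differentiated again with respect to any of the three variables. This gives rise to second derivatives or second order partial derivatives

\[
\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \quad
\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right), \quad
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right), \quad
\frac{\partial^2 f}{\partial z^2} = \frac{\partial}{\partial z}\left(\frac{\partial f}{\partial z}\right),
\]

and so on.

In fact, the order in which we carry out the two differentiations doesn't matter (provided that $f$ is sufficiently regular$^1$), so that (for example) $\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}$. This is Schwarz's theorem:

For sufficiently regular functions $f(x,y,z)$, the order of differentiation in second (and higher) partial derivatives doesn't matter.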
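As a quick illustration, take the hypothetical function $f(x,y,z) = x^2 y^3$ (chosen for this sketch, not taken from the module's examples). Differentiating in either order gives the same result:

\[
\frac{\partial f}{\partial y} = 3x^2 y^2
\;\Longrightarrow\;
\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}\left(3x^2 y^2\right) = 6xy^2,
\qquad
\frac{\partial f}{\partial x} = 2xy^3
\;\Longrightarrow\;
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(2xy^3\right) = 6xy^2.
\]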

Example 3.2 (Second order partial derivatives)

Let

\[ f(x,y,z) = x^2 y z \sin(x + 2y). \]

Calculate $\frac{\partial^2 f}{\partial x \partial y}$ and $\frac{\partial^2 f}{\partial y \partial x}$, and verify that they are equal. †6

3.3 Chain rules (9.6)

Recall the chain rule for functions of one variable: suppose that we have a function $f = f(u)$, where $u$ itself is a function $u = u(x)$ of a variable $x$. Then

\[ \frac{df}{dx} = \frac{df}{du}\,\frac{du}{dx}. \]

This is what you'd expect. If $u$ changes 3 times as fast as $x$ (i.e. $\frac{du}{dx} = 3$) and $f$ changes twice as fast as $u$ (i.e. $\frac{df}{du} = 2$), then $f$ changes 6 times as fast as $x$ (i.e. $\frac{df}{dx} = 6$).

$^1$ Here and in several other places throughout the module, we will require functions, curves, surfaces etc. to be sufficiently regular. This means that there are some mathematical conditions on the functions, curves, or surfaces which need to hold in order for what we are saying to be true. However, these conditions will always hold for all normal examples: it is generally rather hard to construct examples for which they do not hold. In the current case, for example, Schwarz's theorem states that the order in which we calculate the second order partial derivatives doesn't matter provided that all the second order partial derivatives exist and are continuous. In future we won't always be precise about what these regularity conditions are: they can be found in the module textbook if you're interested.

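There are similar rules for functions of more than one variable. Before looking at the general case, let's consider a couple of examples.

1. A common and useful case in this module is where $f(x,y,z)$ only depends on the distance of the point $(x,y,z)$ from the origin: that is, $f = f(r)$, where $r = \sqrt{x^2 + y^2 + z^2}$. In this case, the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ can be calculated using:

\[ \frac{\partial f}{\partial x} = \frac{df}{dr}\,\frac{\partial r}{\partial x}, \qquad \frac{\partial f}{\partial y} = \frac{df}{dr}\,\frac{\partial r}{\partial y}, \qquad \text{and} \qquad \frac{\partial f}{\partial z} = \frac{df}{dr}\,\frac{\partial r}{\partial z}. \]

(We write $\frac{df}{dr}$ rather than $\frac{\partial f}{\partial r}$ since $f$ depends only on $r$.)

Now

\[
\begin{aligned}
\frac{\partial r}{\partial x} &= \frac{\partial}{\partial x}\left(x^2 + y^2 + z^2\right)^{1/2} \\
&= 2x \cdot \frac{1}{2}\left(x^2 + y^2 + z^2\right)^{-1/2} \quad \text{(usual chain rule)} \\
&= \frac{x}{(x^2 + y^2 + z^2)^{1/2}} = \frac{x}{r}.
\end{aligned}
\]

Similarly, $\frac{\partial r}{\partial y} = \frac{y}{r}$ and $\frac{\partial r}{\partial z} = \frac{z}{r}$.

If $f = f(r)$, then $\frac{\partial f}{\partial x} = f'(r)\frac{x}{r}$, $\frac{\partial f}{\partial y} = f'(r)\frac{y}{r}$, and $\frac{\partial f}{\partial z} = f'(r)\frac{z}{r}$.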
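The boxed rule can be sketched as a small Python function which takes $f'(r)$ and a point, and returns the three partial derivatives. The function name `radial_partials` and the test function $f(r) = r^2$ are assumptions made here for illustration: since $r^2 = x^2 + y^2 + z^2$, the partial derivatives should come out as $(2x, 2y, 2z)$.

```python
import math

def radial_partials(fprime, x, y, z):
    """Partial derivatives of f(r), r = sqrt(x^2 + y^2 + z^2), using
    df/dx = f'(r) x / r and similarly for y and z."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("x/r, y/r and z/r are undefined at the origin")
    s = fprime(r) / r
    return (s * x, s * y, s * z)

# Hypothetical example: f(r) = r^2, so f'(r) = 2r.
print(radial_partials(lambda r: 2 * r, 1.0, 2.0, 2.0))  # (2.0, 4.0, 4.0)
```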

Example 3.3 (Derivatives of 1/r using chain rule)

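Let $f(r) = 1/r$ (for $r \neq 0$), where $r = \sqrt{x^2 + y^2 + z^2}$ is the distance from the origin. Calculate $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$. †7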

2. Let $f = f(x,y,z)$, and suppose that $x = x(t)$, $y = y(t)$, and $z = z(t)$ are themselves functions of time $t$. Then the derivative $\frac{df}{dt}$ can be calculated using:

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}. \]

(Think of this as a particle moving in space, which is at position $(x(t), y(t), z(t))$ at time $t$. We want to calculate the rate of change of $f$ as measured by an observer sitting on the particle. The formula says that, as time evolves, the change in $f$ is given by the change due to motion in the $x$ direction, plus the change due to motion in the $y$ direction, plus the change due to motion in the $z$ direction.)
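This chain rule can be checked numerically by comparing it with a direct finite-difference derivative along the path. The function $f = xy + z$ and the path $(t, t^2, t^3)$ below are hypothetical choices for this sketch; along this path $f = 2t^3$, so both estimates should be close to $6t^2$.

```python
def f(x, y, z):
    # Hypothetical example function: f = x y + z
    return x * y + z

def partials_f(x, y, z):
    # Hand-computed partial derivatives of f: (df/dx, df/dy, df/dz)
    return (y, x, 1.0)

def path(t):
    # Hypothetical particle path (x(t), y(t), z(t))
    return (t, t ** 2, t ** 3)

def velocity(t):
    # (dx/dt, dy/dt, dz/dt) for the path above
    return (1.0, 2 * t, 3 * t ** 2)

def dfdt_chain(t):
    """df/dt from the chain rule: sum of (partial derivative) * (velocity component)."""
    return sum(p * v for p, v in zip(partials_f(*path(t)), velocity(t)))

def dfdt_numeric(t, h=1e-6):
    """df/dt estimated directly by a central difference along the path."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

t = 2.0
print(dfdt_chain(t), dfdt_numeric(t))  # both close to 6 * t**2 = 24
```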

Example 3.4 (Chain rule)

Let $f(x,y,z) = x^2 y z$, and suppose that $x = x(t) = \cos t$, $y = y(t) = \sin t$, and $z = z(t) = t$. Calculate $\frac{df}{dt}$. †8

3.3.1 Dependency diagrams

There are many other cases which we can consider. Rather than memorizing lots of

different chain rules applicable in different situations, the appropriate chain rule can

be derived from a dependency diagram.

The dependency diagram is drawn by putting the function $f$ at the top, having downwards pointing lines from $f$ to each variable which it depends on, then downwards lines from each such variable to each variable which it depends on, and so on. Then to calculate the partial derivative of $f$ with respect to some variable, we add contributions from each separate downward path from $f$ to that variable. It's easier to see from some examples...

1. Let $f = f(r)$, where $r = r(x,y,z) = \sqrt{x^2 + y^2 + z^2}$. The dependency diagram is shown in Figure 3.1. To calculate $\frac{\partial f}{\partial x}$, consider the paths in the diagram from $f$ to $x$: there is only one, indicated with the dotted line. This goes from $f$ to $r$ to $x$, and tells us that

\[ \frac{\partial f}{\partial x} = \frac{df}{dr}\,\frac{\partial r}{\partial x}. \]

Figure 3.1: The dependency diagram for $f = f(r)$, $r = r(x,y,z)$ ($f$ at the top, joined to $r$, which is joined to each of $x$, $y$, and $z$)

2. Let $f = f(x,y,z)$, where $x = x(t)$, $y = y(t)$, and $z = z(t)$. The dependency diagram is shown in Figure 3.2. To calculate $\frac{df}{dt}$, consider the paths in the diagram from $f$ to $t$. There are three, indicated with the dotted lines. The first, from $f$ to $x$ to $t$, gives a contribution $\frac{\partial f}{\partial x}\frac{dx}{dt}$; the second, from $f$ to $y$ to $t$, gives a contribution $\frac{\partial f}{\partial y}\frac{dy}{dt}$; and the third, from $f$ to $z$ to $t$, gives a contribution $\frac{\partial f}{\partial z}\frac{dz}{dt}$. Adding the three contributions gives

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}. \]

Figure 3.2: The dependency diagram for $f = f(x,y,z)$, $x = x(t)$, $y = y(t)$, $z = z(t)$ ($f$ at the top, joined to each of $x$, $y$, and $z$, each of which is joined to $t$)

Example 3.5 (Chain rule)

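Let $f = f(r, \theta)$, where $r = r(x,y,z)$ and $\theta = \theta(x,y)$. Draw the dependency diagram and give chain rules to compute $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$.

Suppose $f(r, \theta) = \frac{\cos\theta}{r}$, and that $r = \sqrt{x^2 + y^2 + z^2}$ and $\theta = \tan^{-1}(y/x)$. Calculate $\frac{\partial f}{\partial x}$. †9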


Chapter 4

Vector Differential Calculus

4.1 The gradient of a scalar field (9.7)

4.1.1 Directional derivatives and the gradient

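Let $f = f(x,y,z)$ be a scalar field, that is, a scalar function of position in space. We've seen that the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ give the rate of change of $f$ as we move parallel to one of the coordinate axes. How can we calculate the rate of change of $f$ as we move in an arbitrary direction?

Let $\mathbf{a} = (a_1, a_2, a_3)$ be a unit vector in the direction in which we want to calculate the rate of change, and $P$ be the point, with coordinates $(x_0, y_0, z_0)$, at which we want to calculate the rate of change. The question we are asking, therefore, is: if we start at the point $P$ and set out in the direction of the vector $\mathbf{a}$, how fast does $f$ change as we move? This rate of change is called the directional derivative $D_{\mathbf{a}} f$ of $f$ in the direction $\mathbf{a}$ (at the point $P$). (So, for example, if $\mathbf{a} = \mathbf{i} = (1, 0, 0)$, then $D_{\mathbf{i}} f = \frac{\partial f}{\partial x}$.)

To calculate the directional derivative, note that if we start at $P$ (i.e. at $(x_0, y_0, z_0)$), and move at unit speed in the direction $\mathbf{a} = (a_1, a_2, a_3)$, then our coordinates at time $t$ are given by

\[ (x(t), y(t), z(t)) = (x_0, y_0, z_0) + t(a_1, a_2, a_3) = (x_0 + ta_1,\, y_0 + ta_2,\, z_0 + ta_3). \]

Then

\[
\begin{aligned}
D_{\mathbf{a}} f &= \frac{d}{dt} f(x(t), y(t), z(t)) \\
&= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} \quad \text{(chain rule)} \\
&= a_1 \frac{\partial f}{\partial x} + a_2 \frac{\partial f}{\partial y} + a_3 \frac{\partial f}{\partial z} \\
&= \mathbf{a} \cdot \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right).
\end{aligned}
\]

The vector field $\left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right)$ is called the gradient of (the scalar field) $f$, and is written $\nabla f$, i.e.

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right). \]

(You'll sometimes, though not in this module, see the gradient of $f$ written as grad $f$.)

Thus we have shown

The directional derivative of a scalar field $f$ in the direction of a unit vector $\mathbf{a}$ is given by

\[ D_{\mathbf{a}} f = \mathbf{a} \cdot \nabla f. \]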
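In practice the direction is often given by a vector which is not of unit length, so it must be normalised before the formula is applied. A minimal Python sketch, assuming a hypothetical field $f = xy + z^2$ whose gradient $(y, x, 2z)$ has been computed by hand:

```python
import math

def grad_f(x, y, z):
    # Gradient of the hypothetical field f = x y + z^2, computed by hand.
    return (y, x, 2 * z)

def directional_derivative(grad, point, direction):
    """D_a f = a . grad f, where a is the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("the direction must be a non-zero vector")
    a = [c / norm for c in direction]   # normalise to a unit vector
    return sum(ai * gi for ai, gi in zip(a, grad(*point)))

# At the point (1, 2, 3) the gradient is (2, 1, 6); the direction (1, 2, 2)
# has length 3, so the result is (2 + 2 + 12) / 3 = 16/3, up to rounding.
print(directional_derivative(grad_f, (1.0, 2.0, 3.0), (1.0, 2.0, 2.0)))
```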

Example 4.1 (Gradient and directional derivative)

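Let $f(x,y,z) = x^2 + y^2 - z^2$. Calculate $\nabla f$, and determine the directional derivative of $f$ in the direction of the vector $(1, 1, 0)$. †10

Example 4.2 (Gradient of $\mathbf{a} \cdot \mathbf{r}$)

Let $\mathbf{a}$ be a constant vector. Show that $\nabla(\mathbf{a} \cdot \mathbf{r}) = \mathbf{a}$, where $\mathbf{r} = (x,y,z)$. †11

In the notation $\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right)$, it is helpful to think of

\[ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) \]

multiplying $f$. Described like this, $\nabla$ is not a vector in the sense we have defined them (it doesn't have a magnitude or a direction), but in an important way which is beyond the scope of this module it behaves like a vector$^1$. It is called a vector differential operator. In sections 4.2 and 4.3 we shall consider the divergence

\[ \nabla \cdot \mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z} \]

and the curl

\[ \nabla \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix} \]

of a vector field $\mathbf{v} = (v_1, v_2, v_3)$.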

4.1.2 Properties of the gradient

1. The gradient $\nabla f$ of a scalar field $f$ is defined by

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right). \]

It is a vector field, $\nabla f = \nabla f(x,y,z)$.

$^1$ It transforms in the same way as a vector under coordinate changes, at least in flat space.

2. Let $d\mathbf{r}$ be an infinitesimal displacement vector. Then the infinitesimal change $df$ in $f$ due to this infinitesimal displacement is given by

\[ df = d\mathbf{r} \cdot \nabla f. \]

To be more precise (although our talk of infinitesimal changes is informal), let $P$ be a point with position vector $\mathbf{r} = (x,y,z)$. Then

\[ df = f(\mathbf{r} + d\mathbf{r}) - f(\mathbf{r}) = d\mathbf{r} \cdot \nabla f(x,y,z). \]

This is just a restatement of what we already know about the directional derivative. A unit vector in the direction of $d\mathbf{r}$ is given by $\frac{d\mathbf{r}}{|d\mathbf{r}|}$, so the rate of change of $f$ in the direction of $d\mathbf{r}$ is

\[ D_{d\mathbf{r}} f = \frac{d\mathbf{r}}{|d\mathbf{r}|} \cdot \nabla f. \]

Thus when we move an infinitesimal distance $|d\mathbf{r}|$ in the direction of $d\mathbf{r}$, the infinitesimal change in $f$ is given by

\[ df = |d\mathbf{r}|\, D_{d\mathbf{r}} f = d\mathbf{r} \cdot \nabla f. \]

3. The interpretation of $\nabla f$ to keep in mind is:

$\nabla f$ points in the direction in which $f$ is increasing most rapidly. Its magnitude is the rate of change in that direction.

For let $\mathbf{a}$ be a unit vector describing a direction. The rate of change of $f$ in the direction of $\mathbf{a}$ is given by

\[ D_{\mathbf{a}} f = \mathbf{a} \cdot \nabla f = |\mathbf{a}|\,|\nabla f| \cos\theta = |\nabla f| \cos\theta, \]

where $\theta$ is the angle between $\mathbf{a}$ and $\nabla f$. This is greatest when $\cos\theta = 1$, i.e. $\theta = 0$, so $\mathbf{a}$ is in the direction of $\nabla f$. In this case, $D_{\mathbf{a}} f = |\nabla f|$.

The rate of change of $f$ is most negative when $\cos\theta = -1$, i.e. when $\theta = \pi$, or $\mathbf{a}$ is in the opposite direction to $\nabla f$. This is the direction in which $f$ is decreasing most rapidly.

The rate of change is zero when $\cos\theta = 0$, i.e. when $\theta = \pi/2$, or $\mathbf{a}$ is normal to $\nabla f$.

The rate of change of $f$ in the direction $\mathbf{a}$ is zero if and only if $\mathbf{a}$ is normal to $\nabla f$. This observation will be expanded on in Section 4.1.3 below.

Example 4.3 (Gradient and its properties)

Let $f(x,y,z) = x^2 + y^2$. Calculate $\nabla f$, and check the properties outlined above. †12

For a two-dimensional example, let h(x, y) be the height of the ground above a point

in Liverpool with coordinates (x, y) in some coordinate system. A map would show


the curves $h(x, y) = \text{constant}$ as contour lines, and at any given point $\nabla h$ is normal to the contour line through that point: it points in the direction in which $h$ increases

most rapidly, i.e. directly uphill.

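4. $\nabla f = \mathbf{0}$ (everywhere) if and only if $f$ is constant.

For $\nabla f = \mathbf{0}$ says exactly that $\frac{\partial f}{\partial x} = 0$, $\frac{\partial f}{\partial y} = 0$, and $\frac{\partial f}{\partial z} = 0$, i.e. $f$ does not depend on any of the coordinates $x$, $y$, and $z$.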

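5. The gradient satisfies sum, product, and quotient rules just like the familiar derivative. That is, if $f$ and $g$ are two scalar fields then

\[ \nabla(f \pm g) = \nabla f \pm \nabla g, \qquad \nabla(fg) = f\,\nabla g + g\,\nabla f, \qquad \text{and} \qquad \nabla\!\left(\frac{f}{g}\right) = \frac{g\,\nabla f - f\,\nabla g}{g^2}. \]

You can see that these hold by using the ordinary sum, product, and quotient rules on each component of the gradient. For example (to take the most complicated case), the first component of

\[ \nabla\!\left(\frac{f}{g}\right) = \left(\frac{\partial}{\partial x}\frac{f}{g},\ \frac{\partial}{\partial y}\frac{f}{g},\ \frac{\partial}{\partial z}\frac{f}{g}\right) \]

is (by the usual quotient rule)

\[ \frac{\partial}{\partial x}\left(\frac{f}{g}\right) = \frac{g\,\frac{\partial f}{\partial x} - f\,\frac{\partial g}{\partial x}}{g^2}, \]

which is the first component of

\[ \frac{g\,\nabla f - f\,\nabla g}{g^2}, \]

and similarly for the other two components.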

Example 4.4 (Gradient using rules of differentiation)

Calculate the gradient of the function

\[ f(x,y,z) = \frac{(x - 1)(y - 1)}{x^2 y^2 z^2 + 1}. \]

†13


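6. Suppose that $f$ depends only on the distance $r = \sqrt{x^2 + y^2 + z^2}$ from the origin: that is, $f = f(r)$. Then $\nabla f(r) = f'(r)\,\mathbf{r}/r$, where $\mathbf{r} = (x,y,z)$ is the position vector of the point $(x,y,z)$.

If $f = f(r)$ then

\[ \nabla f = f'(r)\,\frac{\mathbf{r}}{r} = f'(r)\,\hat{\mathbf{r}}. \]

This is what we would expect: first, $f$ is constant on the spheres $r = \text{constant}$, so $\nabla f$ should be normal to these spheres, i.e. point in the direction of $\mathbf{r}$ (directly away from or towards the origin). Secondly, the rate of change of $f$ as we move along a straight line through the origin is $f'(r)$, so that $|\nabla f| = |f'(r)|$ as expected.

As calculated in Section 3.3, $\frac{\partial f}{\partial x} = f'(r)\frac{x}{r}$, $\frac{\partial f}{\partial y} = f'(r)\frac{y}{r}$, and $\frac{\partial f}{\partial z} = f'(r)\frac{z}{r}$. Thus

\[ \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = \frac{f'(r)}{r}(x,y,z) = \frac{f'(r)}{r}\,\mathbf{r}. \]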

Example 4.5 (Gradient of $f(r)$)

Compute $\nabla\!\left(\frac{1}{r^2}\right)$. †14
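The properties above can be checked numerically. The following is a minimal sketch, not part of the module material: it approximates the gradient by central differences (the helper names `grad`, `dot` and `norm`, the step size `h`, and the sample points are all illustrative choices), then checks property 3 for the field of Example 4.3 and property 6 for $f(r) = 1/r^2$, for which $f'(r) = -2/r^3$.

```python
import math

def grad(f, p, h=1e-5):
    """Approximate the gradient of f at the point p by central differences."""
    g = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        g.append((f(*plus) - f(*minus)) / (2 * h))
    return tuple(g)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Property 3 for f = x^2 + y^2: by hand, grad f = (2x, 2y, 0).
f = lambda x, y, z: x**2 + y**2
p = (1.0, 2.0, 0.5)
g = grad(f, p)                                   # close to (2, 4, 0)
uphill = tuple(c / norm(g) for c in g)           # unit vector along grad f
rate_uphill = dot(uphill, g)                     # rate of change = |grad f|
tangent = (-g[1] / norm(g), g[0] / norm(g), 0.0) # unit vector normal to grad f
rate_tangent = dot(tangent, g)                   # rate of change = 0

# Property 6 for f(r) = 1/r^2: grad f = f'(r) r_vec / r = -2 r_vec / r^4.
radial = lambda x, y, z: 1.0 / (x * x + y * y + z * z)
q = (1.0, -2.0, 2.0)                             # a point with r = 3
r = norm(q)
g_radial = grad(radial, q)
g_formula = tuple(-2.0 * c / r**4 for c in q)

print(g, rate_uphill, rate_tangent)
print(g_radial, g_formula)
```

The two printed gradients on the last line should agree to within the accuracy of the finite-difference approximation.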

4.1.3 Tangent planes

Let $f = f(x,y,z)$. The equation $f(x,y,z) = C$, where $C$ is a constant, defines a surface in three-dimensional space. For example, if $f(x,y,z) = x^2 + y^2 + z^2$, then $f(x,y,z) = 1$ defines a sphere of radius 1 centred on the origin. Similarly, $x^2 + 2y^2 + 3z^2 = 4$ defines an ellipsoid, and $x^2 + 2y^2 - 3z^2 = 4$ defines a hyperboloid.

If $P$ is a point on the surface (i.e. $f(P) = C$), then we can consider the tangent plane to the surface at $P$. Since $f$ is constant on the surface, the rate of change of $f$ at $P$ as we move in any direction in the tangent plane is zero. From property 3 in Section 4.1.2 this means that $\nabla f$ is normal to the tangent plane. The equation of the tangent plane can therefore be calculated using Recipe 2.6.

Recipe 4.6 (Equation of a tangent plane)

To find the equation of the tangent plane to the surface f(x,y ,z) = C at a point P of

the surface:

1. Calculate $\nabla f$ and evaluate it at the point $P$: $\nabla f(P) = \mathbf{n}$.

2. Then the tangent plane passes through P and has normal vector n. Its equation

can therefore be calculated using Recipe 2.6 (page 12).

Example 4.7 (Tangent plane)

Calculate the equation of the tangent plane to the ellipsoid $x^2 + 2y^2 + 3z^2 = 4$ at the point $(1, 0, 1)$. (The ellipsoid and tangent plane are shown in Figure 4.1.) †15
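Recipe 4.6 can be sketched in a few lines of code. This is only an illustration: Recipe 2.6 is not reproduced in this section, so the sketch assumes it is the standard point-normal form $\mathbf{n}\cdot\mathbf{r} = \mathbf{n}\cdot\mathbf{p}$, and the names `grad_f` and `tangent_plane` are hypothetical. The gradient of $f = x^2 + 2y^2 + 3z^2$ is worked out by hand and hard-coded.

```python
def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x^2 + 2y^2 + 3z^2, computed by hand.
    return (2 * x, 4 * y, 6 * z)

def tangent_plane(grad, P):
    """Return (n, d) such that the tangent plane at P is n . r = d.

    Assumes the point-normal form of a plane: n . (r - P) = 0.
    """
    n = grad(*P)                              # step 1: normal vector n = grad f(P)
    d = sum(ni * pi for ni, pi in zip(n, P))  # step 2: d = n . P
    return n, d

n, d = tangent_plane(grad_f, (1, 0, 1))
print(n, d)
```

For the point $(1, 0, 1)$ this gives $\mathbf{n} = (2, 0, 6)$ and $d = 8$, i.e. the plane $2x + 6z = 8$, or equivalently $x + 3z = 4$.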


Figure 4.1: The ellipsoid $x^2 + 2y^2 + 3z^2 = 4$ and a tangent plane at $(1, 0, 1)$

4.2 The divergence of a vector field (9.8)

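Let $\mathbf{v} = \mathbf{v}(x,y,z) = (v_1(x,y,z), v_2(x,y,z), v_3(x,y,z))$ be a vector field. The divergence $\nabla\cdot\mathbf{v}$ of $\mathbf{v}$ is the scalar field defined by

\[ \nabla\cdot\mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z} \]

(you can look at it as the scalar product of the vector differential operator $\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$ and $\mathbf{v}$). You'll sometimes (though not in this module) see the divergence of $\mathbf{v}$ written as $\operatorname{div}\mathbf{v}$.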

Example 4.8 (Divergence)

Calculate the divergence of the vector field $\mathbf{v} = (x^2 + y^2,\ 2xyz,\ x^2 + z^2)$. †16

Interpretation of the divergence

Roughly speaking, the divergence $\nabla\cdot\mathbf{v}$ of $\mathbf{v}$ at a point $P$ measures (as the name suggests) how much the vector field is diverging from the point $P$. Let's try to be a bit more precise, though a real understanding of the divergence will have to wait until we study the divergence theorem in Section 5.5.

Consider a small ball of radius a centred at P. The flux of v through the boundary

of the ball measures how much the vector field is flowing out of the ball: for instance,


if we think of v as describing how a fluid is moving, then the flux tells us the rate at

which the fluid is leaving the ball (or entering it, if the flux is negative).

The divergence measures the flux of $\mathbf{v}$ out of a very small ball (thus, as above, this is intuitively the amount that the vector field is diverging away from $P$): if $a$ is very small, then

\[ \nabla\cdot\mathbf{v}(P) \approx \frac{\text{Flux}}{V}, \]

where $V = \frac{4}{3}\pi a^3$ is the volume of the ball. (To be more precise, the limit of the right hand side of the above formula as the radius $a$ tends to zero is $\nabla\cdot\mathbf{v}$.)

For example, consider the two-dimensional vector field $\mathbf{v} = (x, y)$ depicted in Figure 4.2 (we're working in two dimensions so that we can draw the pictures). The divergence is

\[ \nabla\cdot\mathbf{v} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} = 1 + 1 = 2, \]

reflecting the fact that the vector field is diverging everywhere. If we change the sign of the vector field to $\mathbf{v} = (-x, -y)$ (so the directions of the arrows in Figure 4.2 reverse), then the divergence becomes $-2$: there is a net inflow of the vector field into any small ball.

On the other hand, if we take the field $\mathbf{v} = (y, -x)$ depicted in Figure 4.3, we get

\[ \nabla\cdot\mathbf{v} = \frac{\partial y}{\partial x} + \frac{\partial(-x)}{\partial y} = 0 + 0 = 0, \]

reflecting the fact that the vector field is not diverging (if we draw a small ball anywhere, the net flux out of the ball is zero: as much of the field is flowing in as is flowing out).
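The "flux out of a small ball" description can be tried out numerically in the two-dimensional setting of the examples above. The sketch below is illustrative only: in two dimensions the ball becomes a disc and the volume becomes the area $\pi a^2$, the function name `flux_per_area` is hypothetical, and the boundary integral is approximated by a midpoint sum over `n` points on the circle.

```python
import math

def flux_per_area(v, centre, a, n=2000):
    """Outward flux of the 2D field v through a circle of radius a,
    divided by the enclosed area (a 2D analogue of Flux / V)."""
    cx, cy = centre
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * (k + 0.5) / n
        nx, ny = math.cos(t), math.sin(t)       # outward unit normal
        vx, vy = v(cx + a * nx, cy + a * ny)    # field on the boundary
        total += (vx * nx + vy * ny) * (2 * math.pi * a / n)
    return total / (math.pi * a * a)

outward = lambda x, y: (x, y)      # the field of Figure 4.2
inward = lambda x, y: (-x, -y)     # the same field with its sign changed
rotating = lambda x, y: (y, -x)    # the field of Figure 4.3

centre = (0.3, -0.7)
print(flux_per_area(outward, centre, 0.01))   # close to the divergence 2
print(flux_per_area(inward, centre, 0.01))    # close to -2
print(flux_per_area(rotating, centre, 0.01))  # close to 0
```

Because these fields are linear, the ratio matches the divergence for any radius; for a general field the agreement would only hold in the limit of small $a$.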

Figure 4.2: The (irrotational) vector field $\mathbf{v} = (x, y)$

A vector field with zero divergence is called incompressible. The motivation comes

again from the fluid analogy: if a fluid is incompressible, then the total amount of fluid

in any given ball has to remain constant, so the divergence must be zero everywhere.


Figure 4.3: The (incompressible) vector field $\mathbf{v} = (y, -x)$

(For example, Maxwell's equation (1.3) tells us that the magnetic field is incompressible: $\nabla\cdot\mathbf{B} = 0$.)

A vector field $\mathbf{v}$ is defined to be incompressible if

\[ \nabla\cdot\mathbf{v} = 0. \]

The term solenoidal is also used.

Rules of differentiation for the divergence

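The divergence satisfies rules which look (mostly) similar to the normal rules of differentiation. Suppose that $\mathbf{u}$ and $\mathbf{v}$ are vector fields, and $f$ is a scalar field. Then

\[ \nabla\cdot(\mathbf{u} \pm \mathbf{v}) = \nabla\cdot\mathbf{u} \pm \nabla\cdot\mathbf{v}, \qquad \nabla\cdot(f\mathbf{v}) = \mathbf{v}\cdot\nabla f + f\,\nabla\cdot\mathbf{v}, \]

\[ \nabla\cdot\!\left(\frac{\mathbf{v}}{f}\right) = \frac{f\,\nabla\cdot\mathbf{v} - \mathbf{v}\cdot\nabla f}{f^2}, \qquad \text{and} \]

\[ \nabla\cdot(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot(\nabla\times\mathbf{u}) - \mathbf{u}\cdot(\nabla\times\mathbf{v}). \]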

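We'll show the formula for $\nabla\cdot(f\mathbf{v})$ as an example (the formula for $\nabla\cdot(\mathbf{u}\times\mathbf{v})$ won't be used in this module and is non-examinable, not to mention that it doesn't make sense until after we've studied the curl, $\nabla\times\mathbf{v}$, in Section 4.3).

\[
\begin{aligned}
\nabla\cdot(f\mathbf{v}) &= \frac{\partial}{\partial x}(f v_1) + \frac{\partial}{\partial y}(f v_2) + \frac{\partial}{\partial z}(f v_3) \\
&= \left(\frac{\partial f}{\partial x} v_1 + f\frac{\partial v_1}{\partial x}\right) + \left(\frac{\partial f}{\partial y} v_2 + f\frac{\partial v_2}{\partial y}\right) + \left(\frac{\partial f}{\partial z} v_3 + f\frac{\partial v_3}{\partial z}\right) \\
&= \left(v_1\frac{\partial f}{\partial x} + v_2\frac{\partial f}{\partial y} + v_3\frac{\partial f}{\partial z}\right) + \left(f\frac{\partial v_1}{\partial x} + f\frac{\partial v_2}{\partial y} + f\frac{\partial v_3}{\partial z}\right) \\
&= (v_1, v_2, v_3)\cdot\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) + f\left(\frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z}\right) \\
&= \mathbf{v}\cdot\nabla f + f\,\nabla\cdot\mathbf{v}.
\end{aligned}
\]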

Example 4.9 (Divergence using rules of differentiation)

Calculate the divergence of the vector field $\mathbf{u} = e^{xyz}(y, z, x)$. †17
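One possible route through Example 4.9, offered as a sketch rather than as the worked solution from the lecture, is to write $\mathbf{u} = f\mathbf{v}$ with $f = e^{xyz}$ and $\mathbf{v} = (y, z, x)$ (this splitting is a choice made here for illustration) and apply the product rule $\nabla\cdot(f\mathbf{v}) = \mathbf{v}\cdot\nabla f + f\,\nabla\cdot\mathbf{v}$:

```latex
\begin{aligned}
\nabla f &= e^{xyz}\,(yz,\ xz,\ xy), \\
\nabla\cdot\mathbf{v} &= \frac{\partial y}{\partial x} + \frac{\partial z}{\partial y} + \frac{\partial x}{\partial z} = 0, \\
\nabla\cdot\mathbf{u} &= \mathbf{v}\cdot\nabla f + f\,\nabla\cdot\mathbf{v}
  = e^{xyz}\,(y\cdot yz + z\cdot xz + x\cdot xy) + 0 \\
  &= e^{xyz}\,(x^2 y + y^2 z + z^2 x).
\end{aligned}
```

Differentiating the three components of $\mathbf{u}$ directly gives the same result, which is a useful check.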

4.3 The curl of a vector field (9.9)

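Let $\mathbf{v} = \mathbf{v}(x,y,z) = (v_1(x,y,z), v_2(x,y,z), v_3(x,y,z))$ be a vector field. The curl $\nabla\times\mathbf{v}$ of $\mathbf{v}$ is the vector field defined by

\[ \nabla\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix} = \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z},\ \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x},\ \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right) \]

(you can look at it as the vector product of the vector differential operator $\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$ and $\mathbf{v}$). You'll sometimes (though not in this module) see the curl of $\mathbf{v}$ written as $\operatorname{curl}\mathbf{v}$.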

Example 4.10 (Curl)

Calculate the curl of the vector field $\mathbf{v} = (x^2 + y^2,\ 2xyz,\ x^2 + z^2)$. †18

Interpretation of the curl

Roughly speaking, the curl $\nabla\times\mathbf{v}$ of $\mathbf{v}$ at a point $P$ measures (as the name suggests) how much and in what direction the vector field is curling around the point $P$. Let's try to be a bit more precise, though a real understanding of the curl will have to wait until we study Stokes's theorem in Section 5.6.

Consider a small disk of radius $a$ with unit normal vector $\mathbf{n}$ centred at $P$. The circulation of $\mathbf{v}$ around the boundary of the disk measures how much the vector field is circulating around the boundary (in the direction that you would turn a corkscrew to move the point in the direction of $\mathbf{n}$): see Figure 4.4.


Figure 4.4: Circulation around a small disk (radius $a$, centre $P$) with unit normal $\mathbf{n}$

The component of the curl in the direction $\mathbf{n}$ measures the circulation of $\mathbf{v}$ around a very small disk: if $a$ is very small, then

\[ \mathbf{n}\cdot(\nabla\times\mathbf{v}(P)) \approx \frac{\text{Circulation}}{A}, \]

where $A = \pi a^2$ is the area of the disk. (To be more precise, the limit of the right hand side of the above formula as the radius $a$ tends to zero is $\mathbf{n}\cdot(\nabla\times\mathbf{v})$.)

For example, consider the vector field $\mathbf{v} = (y, -x, 0)$ depicted in Figure 4.3 (the figure shows $\mathbf{v}$ in each plane $z = \text{constant}$). The curl is

\[ \nabla\times\mathbf{v} = \left(\frac{\partial 0}{\partial y} - \frac{\partial(-x)}{\partial z},\ \frac{\partial y}{\partial z} - \frac{\partial 0}{\partial x},\ \frac{\partial(-x)}{\partial x} - \frac{\partial y}{\partial y}\right) = (0, 0, -2), \]

reflecting the fact that the vector field is rotating clockwise around the $z$-axis (if we turn a corkscrew in the direction of the vector field, then the point moves in the negative $z$ direction). If we change the sign of the vector field to $\mathbf{v} = (-y, x, 0)$ (so the directions of the arrows in Figure 4.3 reverse), then the curl becomes $(0, 0, 2)$: turning a corkscrew in the direction of the vector field moves the point in the positive $z$ direction.

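On the other hand, consider the vector field $\mathbf{v} = (x, y, 0)$ depicted in Figure 4.2. The curl is

\[ \nabla\times\mathbf{v} = \left(\frac{\partial 0}{\partial y} - \frac{\partial y}{\partial z},\ \frac{\partial x}{\partial z} - \frac{\partial 0}{\partial x},\ \frac{\partial y}{\partial x} - \frac{\partial x}{\partial y}\right) = (0, 0, 0), \]

reflecting the fact that this vector field is not curling around.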
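The two curl calculations above can be reproduced numerically from the component formula for $\nabla\times\mathbf{v}$. This is a minimal sketch under stated assumptions: the partial derivatives are approximated by central differences, and the names `curl`, `rotating` and `radial`, the step `h` and the sample point are illustrative.

```python
def curl(v, p, h=1e-5):
    """Approximate the curl of the vector field v at the point p."""
    def d(i, j):
        # Partial derivative of component i of v with respect to coordinate j.
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (v(*plus)[i] - v(*minus)[i]) / (2 * h)
    # (dv3/dy - dv2/dz, dv1/dz - dv3/dx, dv2/dx - dv1/dy)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

rotating = lambda x, y, z: (y, -x, 0.0)   # the field of Figure 4.3
radial = lambda x, y, z: (x, y, 0.0)      # the field of Figure 4.2

point = (0.4, -1.2, 2.0)
curl_rotating = curl(rotating, point)     # close to (0, 0, -2)
curl_radial = curl(radial, point)         # close to (0, 0, 0)
print(curl_rotating, curl_radial)
```

The choice of sample point does not matter here, since both fields have constant curl.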

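A vector field with zero curl is called irrotational, since intuitively it has no rotation in it. For example, Maxwell's equation (1.2), $\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$, tells us that if the magnetic field $\mathbf{B}$ is steady (i.e. does not change with time, so $\frac{\partial\mathbf{B}}{\partial t} = \mathbf{0}$), then the electric field is irrotational: $\nabla\times\mathbf{E} = \mathbf{0}$.

A vector field $\mathbf{v}$ is defined to be irrotational if

\[ \nabla\times\mathbf{v} = \mathbf{0}. \]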

Rules of differentiation for the curl

The curl satisfies rules which look (mostly) similar to the normal rules of differentiation.

Suppose that u and v are vector fields, and f is a scalar field. Then


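\[ \nabla\times(\mathbf{u} \pm \mathbf{v}) = \nabla\times\mathbf{u} \pm \nabla\times\mathbf{v}, \qquad \nabla\times(f\mathbf{v}) = f\,\nabla\times\mathbf{v} + \nabla f\times\mathbf{v}, \qquad \nabla\times\!\left(\frac{\mathbf{v}}{f}\right) = \frac{f\,\nabla\times\mathbf{v} - (\nabla f)\times\mathbf{v}}{f^2}, \qquad \text{and} \]

\[ \nabla\times(\mathbf{u}\times\mathbf{v}) = (\mathbf{v}\cdot\nabla)\mathbf{u} - (\mathbf{u}\cdot\nabla)\mathbf{v} + \mathbf{u}(\nabla\cdot\mathbf{v}) - \mathbf{v}(\nabla\cdot\mathbf{u}). \]

Consider as an example the formula for $\nabla\times(f\mathbf{v})$ (the formula for $\nabla\times(\mathbf{u}\times\mathbf{v})$ won't be used in this module and is non-examinable). Now

\[ \nabla\times(f\mathbf{v}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ f v_1 & f v_2 & f v_3 \end{vmatrix}, \]

so the first component of $\nabla\times(f\mathbf{v})$ is given by

\[
\begin{aligned}
\frac{\partial}{\partial y}(f v_3) - \frac{\partial}{\partial z}(f v_2) &= f\frac{\partial v_3}{\partial y} + \frac{\partial f}{\partial y} v_3 - f\frac{\partial v_2}{\partial z} - \frac{\partial f}{\partial z} v_2 \\
&= f\left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right) + \left(\frac{\partial f}{\partial y} v_3 - \frac{\partial f}{\partial z} v_2\right),
\end{aligned}
\]

which is the first component of $f\,\nabla\times\mathbf{v} + \nabla f\times\mathbf{v}$: similarly for the other two components.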

Example 4.11 (Curl using rules of differentiation)

Calculate the curl of the vector field $\mathbf{u} = e^{xyz}(y, z, x)$. †19

4.4 Second order derivatives (Some information in 9.8 and 9.9)

Recall that the gradient $\nabla f$ takes as input a scalar field and gives as output a vector field; the divergence $\nabla\cdot\mathbf{v}$ takes as input a vector field and gives as output a scalar field; and the curl $\nabla\times\mathbf{v}$ takes as input a vector field and gives as output a vector field (see Figure 4.5).

Figure 4.5: Input and output for gradient, divergence, and curl

There are therefore 5 possible second order derivatives: if we start with a scalar field $f$, we have the divergence of the gradient $\nabla\cdot(\nabla f)$ and the curl of the gradient $\nabla\times(\nabla f)$; while if we start with a vector field $\mathbf{v}$, we have the gradient of the divergence $\nabla(\nabla\cdot\mathbf{v})$, the divergence of the curl $\nabla\cdot(\nabla\times\mathbf{v})$, and the curl of the curl $\nabla\times(\nabla\times\mathbf{v})$.

In this section we shall look briefly at each of these five second order derivatives.

4.4.1 The curl of the gradient

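Let $f$ be any sufficiently regular² scalar field. Then

\[
\begin{aligned}
\nabla\times(\nabla f) &= \nabla\times\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \end{vmatrix} \\
&= \left(\frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y},\ \frac{\partial^2 f}{\partial z\,\partial x} - \frac{\partial^2 f}{\partial x\,\partial z},\ \frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x}\right) = \mathbf{0}
\end{aligned}
\]

by Schwarz's theorem (Section 3.2). That is, the curl of any gradient is zero, or any gradient is irrotational.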

A very important fact is that the converse of this statement is true:

A vector field $\mathbf{v}$ is irrotational if and only if there is a scalar field $\phi$ with $\nabla\phi = \mathbf{v}$. Such a scalar field $\phi$ is called a (scalar) potential for the vector field $\mathbf{v}$.

We shall indicate in Section 5.6 why this is true: for the moment, please take it on trust.

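For example, Maxwell's equation (1.2),

\[ \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \]

tells us that the electric field $\mathbf{E}$ is irrotational provided that the magnetic field $\mathbf{B}$ is steady (i.e. $\frac{\partial\mathbf{B}}{\partial t} = \mathbf{0}$). Thus $\mathbf{E} = \nabla f$ for some scalar field $f$. In fact, the familiar electrostatic potential $\phi$ (giving, for instance, the potential difference between two points) is defined³ to be $-f$: that is,

\[ \mathbf{E} = -\nabla\phi. \]

The potential for an irrotational vector field $\mathbf{v}$ is uniquely determined up to a constant. For suppose that $\phi$ and $\psi$ are both potentials for $\mathbf{v}$, so that $\nabla\phi = \nabla\psi = \mathbf{v}$. Then $\nabla(\phi - \psi) = \nabla\phi - \nabla\psi = \mathbf{v} - \mathbf{v} = \mathbf{0}$, so $\phi - \psi$ is a constant, i.e. $\phi$ and $\psi$ differ by a constant.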

² All second order derivatives exist and are continuous.

³ The reason for the $-$ sign is so that $\phi$ gives the potential energy of a unit test charge in a static electric field.


Example 4.12 (Scalar potential)

Let $\mathbf{v}$ be the vector field $(2 + yz,\ xz,\ xy + 2z)$. Show that $\mathbf{v}$ is irrotational, and find a scalar potential for $\mathbf{v}$. †20
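As a sketch of how a potential can be found in Example 4.12 (one standard approach, not necessarily the one used on the board; the auxiliary functions $g$ and $h$ and the constant $C$ are notation introduced here), integrate the three equations $\nabla\phi = \mathbf{v}$ one coordinate at a time:

```latex
\begin{aligned}
\frac{\partial\phi}{\partial x} &= 2 + yz
  &&\Longrightarrow\quad \phi = 2x + xyz + g(y, z), \\
\frac{\partial\phi}{\partial y} &= xz + \frac{\partial g}{\partial y} = xz
  &&\Longrightarrow\quad g = h(z), \\
\frac{\partial\phi}{\partial z} &= xy + h'(z) = xy + 2z
  &&\Longrightarrow\quad h = z^2 + C, \\
\phi(x, y, z) &= 2x + xyz + z^2 + C.
\end{aligned}
```

Differentiating this $\phi$ recovers $(2 + yz,\ xz,\ xy + 2z)$, and the arbitrary constant $C$ reflects the fact that a potential is only determined up to a constant.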

4.4.2 The divergence of the curl

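Let $\mathbf{v}$ be any sufficiently regular⁴ vector field. Then

\[ \nabla\times\mathbf{v} = \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z},\ \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x},\ \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right), \]

and so

\[
\begin{aligned}
\nabla\cdot(\nabla\times\mathbf{v}) &= \frac{\partial}{\partial x}\left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right) + \frac{\partial}{\partial y}\left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right) + \frac{\partial}{\partial z}\left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right) \\
&= \left(\frac{\partial^2 v_3}{\partial x\,\partial y} - \frac{\partial^2 v_2}{\partial x\,\partial z}\right) + \left(\frac{\partial^2 v_1}{\partial y\,\partial z} - \frac{\partial^2 v_3}{\partial y\,\partial x}\right) + \left(\frac{\partial^2 v_2}{\partial z\,\partial x} - \frac{\partial^2 v_1}{\partial z\,\partial y}\right) \\
&= \left(\frac{\partial^2 v_1}{\partial y\,\partial z} - \frac{\partial^2 v_1}{\partial z\,\partial y}\right) + \left(\frac{\partial^2 v_2}{\partial z\,\partial x} - \frac{\partial^2 v_2}{\partial x\,\partial z}\right) + \left(\frac{\partial^2 v_3}{\partial x\,\partial y} - \frac{\partial^2 v_3}{\partial y\,\partial x}\right) \\
&= 0
\end{aligned}
\]

by Schwarz's theorem (Section 3.2). That is, the divergence of any curl is zero, or any curl is incompressible.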

The converse of this statement is true (though we won't be using this in this module):

A vector field $\mathbf{v}$ is incompressible if and only if there is a vector field $\mathbf{A}$ with $\nabla\times\mathbf{A} = \mathbf{v}$. Such a vector field $\mathbf{A}$ is called a (vector) potential for the vector field $\mathbf{v}$.

For example, Maxwell's equations tell us that the magnetic field $\mathbf{B}$ is incompressible. Thus $\mathbf{B} = \nabla\times\mathbf{A}$ for some vector field $\mathbf{A}$ called the magnetic vector potential.

The vector potential for an incompressible vector field $\mathbf{v}$ is only determined up to addition of an irrotational vector field. For suppose that $\mathbf{A}$ and $\mathbf{A}'$ are both vector potentials for $\mathbf{v}$, so that $\nabla\times\mathbf{A} = \nabla\times\mathbf{A}' = \mathbf{v}$. Then $\nabla\times(\mathbf{A} - \mathbf{A}') = \nabla\times\mathbf{A} - \nabla\times\mathbf{A}' = \mathbf{0}$, so $\mathbf{A} - \mathbf{A}'$ is irrotational and hence $\mathbf{A} - \mathbf{A}' = \nabla\phi$ for some scalar field $\phi$. For this reason, additional conditions are normally imposed on the vector potential in order to remove at least some of the freedom of choice: such conditions are called a choice of gauge.

⁴ All second order derivatives of each component exist and are continuous.


4.4.3 The divergence of the gradient (Laplacian)

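Let $f$ be a scalar field. Then

\[ \nabla\cdot(\nabla f) = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\cdot\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \]

This quantity is called the Laplacian of $f$, written $\nabla^2 f$.

The Laplacian of a scalar field $f$ is defined by

\[ \nabla^2 f = \nabla\cdot(\nabla f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \]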

Example 4.13 (Laplacian)

Let $f(x,y,z) = x^2 y + y^2 z + z^2 x$. Calculate $\nabla^2 f$. †21
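A hand calculation of a Laplacian such as the one in Example 4.13 can be sanity-checked numerically. The sketch below is illustrative only: it approximates each second partial derivative by a central second difference (the function name `laplacian`, the step `h` and the sample point are assumptions made here), and compares the result with the hand-computed value $\nabla^2 f = 2y + 2z + 2x$.

```python
def laplacian(f, p, h=1e-3):
    """Approximate the Laplacian of f at p by central second differences."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        # (f(p + h e_i) - 2 f(p) + f(p - h e_i)) / h^2 approximates d^2 f / d x_i^2
        total += (f(*plus) - 2 * f(*p) + f(*minus)) / h**2
    return total

f = lambda x, y, z: x**2 * y + y**2 * z + z**2 * x
p = (1.0, 2.0, 3.0)
approx = laplacian(f, p)
by_hand = 2 * (p[0] + p[1] + p[2])   # 2x + 2y + 2z, differentiated by hand
print(approx, by_hand)
```

For a cubic polynomial like this one the second difference has no truncation error, so the two printed values differ only by floating-point rounding.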

For example, suppose that the magnetic field is steady, so that there is an electrostatic potential $\phi$ with

\[ \mathbf{E} = -\nabla\phi. \]

Then Maxwell's equation $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$ becomes

\[ \nabla^2\phi = -\frac{\rho}{\epsilon_0}. \]

If in addition the charge density $\rho$ is zero (for example, in free space), then this becomes

\[ \nabla^2\phi = 0. \]

The former equation, $\nabla^2\phi = f(x,y,z)$, is called Poisson's equation, and the special case $\nabla^2\phi = 0$ is called Laplace's equation. These are among the most important equations of mathematical physics, having relevance (for example) in hydrodynamics and gravitation, as well as electrostatics⁵. We shall study solutions of Laplace's equation in Chapter 6.

Example 4.14 (Laplacian of $1/r$)

Show that

\[ \nabla^2\!\left(\frac{1}{r}\right) = 0 \qquad (r \neq 0). \]

†22

⁵ Solutions of Laplace's equation are called harmonic functions. A basic fact is that the real and imaginary parts of complex differentiable functions satisfy the (two-dimensional) Laplace equation.


4.4.4 The curl of the curl

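The curl of the curl also arises from the study of Maxwell's equations. Recall that, since the magnetic field is incompressible ($\nabla\cdot\mathbf{B} = 0$), it can be written as the curl of a vector potential:

\[ \mathbf{B} = \nabla\times\mathbf{A}. \]

Then Maxwell's equation (1.4) becomes

\[ \nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{J} + \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t}. \]

There is an identity for the curl of the curl:

\[ \nabla\times(\nabla\times\mathbf{v}) = \nabla(\nabla\cdot\mathbf{v}) - \nabla^2\mathbf{v} \]

(where $\nabla^2\mathbf{v}$ means $(\nabla^2 v_1, \nabla^2 v_2, \nabla^2 v_3)$, the vector Laplacian).

The proof of this identity is a long calculation, which is a good test of your understanding of the material of this chapter if you can do it.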

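In the absence of charge and current (for example, in free space), Maxwell's equations become:

\[
\begin{aligned}
\nabla\cdot\mathbf{E} &= 0, && (4.1) \\
\nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t}, && (4.2) \\
\nabla\cdot\mathbf{B} &= 0, \quad\text{and} && (4.3) \\
\nabla\times\mathbf{B} &= \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t}. && (4.4)
\end{aligned}
\]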

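Taking the curl of (4.4) gives

\[ \nabla\times(\nabla\times\mathbf{B}) = \frac{1}{c^2}\frac{\partial(\nabla\times\mathbf{E})}{\partial t} = -\frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} \qquad \text{(using (4.2))}. \]

Now by the identity for the curl of the curl above, $\nabla\times(\nabla\times\mathbf{B}) = \nabla(\nabla\cdot\mathbf{B}) - \nabla^2\mathbf{B} = -\nabla^2\mathbf{B}$, since the magnetic field is incompressible ($\nabla\cdot\mathbf{B} = 0$) by (4.3). Hence

\[ \nabla^2\mathbf{B} = \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2}. \]

This is the wave equation (with propagation speed $c$): see Chapter 6. Similarly, taking the curl of (4.2) and substituting (4.4) gives

\[ \nabla^2\mathbf{E} = \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}. \]

Thus in free space, both the electric and magnetic fields satisfy the wave equation, with propagation speed $c$, the speed of light.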
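For completeness, the calculation for the electric field runs along the same lines as the one for the magnetic field. The steps below are a sketch that uses only the free-space equations (4.1), (4.2) and (4.4) together with the identity for the curl of the curl:

```latex
\begin{aligned}
\nabla\times(\nabla\times\mathbf{E})
  &= -\frac{\partial(\nabla\times\mathbf{B})}{\partial t}
  && \text{(curl of (4.2))} \\
  &= -\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}
  && \text{(using (4.4))}, \\
\nabla\times(\nabla\times\mathbf{E})
  &= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
   = -\nabla^2\mathbf{E}
  && \text{(using (4.1))}, \\
\text{so}\quad \nabla^2\mathbf{E}
  &= \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}.
\end{aligned}
```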


4.4.5 The gradient of the divergence

We saw $\nabla(\nabla\cdot\mathbf{v})$, the gradient of the divergence of the vector field $\mathbf{v}$, in Section 4.4.4 above. We will not have much further use for it in this module, although it does have relevance in physics and engineering (for example, in the propagation of elastic waves).
