
Oct 19, 2020

Source: dclarke/PHYS3200/documents/review.pdf

Part I: Review of Essential Material

These notes, covering much of Chapters 1 and 3 in Arfken, Weber, and Harris

(ed. 7) are given as a reading assignment accompanied by a problem set on the

first day of classes. This material, as well as that from Chapters 2 (matrices),

5 (vector spaces), and 6 (eigenvalue problems), should be familiar to students

from previous math and physics classes.

1.1 Series (AWH §1.1–1.4)

1.1.1 Convergence of series

A sequence of values, $a_1, a_2, a_3, \ldots$, forms a series when all terms are added: $a_1 + a_2 + a_3 + \ldots$

We write a finite series as $S_N = \sum_{i=1}^{N} a_i$, and an infinite series as $S_\infty = \sum_{i=1}^{\infty} a_i$.

The $n$th partial sum of an infinite series is $S_n = \sum_{i=1}^{n} a_i$.

If $\lim_{n\to\infty} S_n = S_\infty$ exists, the infinite series is convergent; otherwise, it is divergent.

Note that finite series (a finite number of finite terms) are always convergent.

Example 1.1. Divergent and convergent series.

$$\sum_{i=1}^{\infty} i = 1 + 2 + 3 + \ldots \to \infty \quad \text{is divergent.}$$

$$\sum_{i=1}^{\infty} \frac{1}{2^i} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots \to 1 \quad \text{is convergent.}$$


The partial sum of the geometric series is:
$$S_n = \sum_{i=0}^{n-1} ar^i = a + ar + ar^2 + \cdots + ar^{n-1} = a\,\frac{1 - r^n}{1 - r},$$
which can be confirmed easily by multiplying through by $(1 - r)$. Thus,
$$S_\infty = \lim_{n\to\infty} S_n = \begin{cases} \dfrac{a}{1 - r} & \text{for } |r| < 1 \text{ (convergent)}; \\[4pt] \text{divergent} & \text{for } |r| \ge 1. \end{cases}$$

In physics, series are usually infinite (with, however, some very important

exceptions), and we seek conditions for convergence.
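The closed form for $S_n$ and the limit $a/(1-r)$ can be checked numerically; a minimal Python sketch (the values of $a$, $r$, and $n$ are illustrative):

```python
# Numerical check of the geometric partial-sum formula S_n = a(1 - r^n)/(1 - r).
def geometric_partial_sum(a, r, n):
    """Sum a*r^i for i = 0 .. n-1 directly."""
    return sum(a * r**i for i in range(n))

a, r, n = 2.0, 0.5, 20
direct = geometric_partial_sum(a, r, n)
closed = a * (1 - r**n) / (1 - r)
print(direct, closed)      # the two agree to machine precision
print(a / (1 - r))         # S_infinity = a/(1 - r) for |r| < 1
```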

Comparison test

- general; may be applied to any series
- sensitivity depends on availability of suitable comparator series.

Consider two series: $A = \sum_{i=1}^{\infty} a_i$ and $U = \sum_{i=1}^{\infty} u_i$. If, term by term, $0 \le u_i \le a_i$ and if $A$ is convergent, then $U$ is convergent.

Consider two series: $B = \sum_{i=1}^{\infty} b_i$ and $V = \sum_{i=1}^{\infty} v_i$. If, term by term, $0 \le b_i \le v_i$ and if $B$ is divergent, then $V$ is divergent.

The convergence tests that follow are essentially comparison tests, some better disguised than others.

Cauchy (d’Alembert) ratio test

- general; may be applied to any series
- relatively insensitive; series often declared “indeterminate”.

Given the series $S_\infty = \sum_{i=1}^{\infty} a_i$, then:
$$\lim_{i\to\infty} \left|\frac{a_{i+1}}{a_i}\right| \begin{cases} < 1 & \Rightarrow S_\infty \text{ converges}; \\ > 1 & \Rightarrow S_\infty \text{ diverges}; \\ = 1 & \Rightarrow \text{indeterminate}. \end{cases}$$

Example 1.2. Test the geometric series for convergence.
$$S_\infty = \sum_{i=0}^{\infty} ar^i \;\Rightarrow\; \lim_{i\to\infty} \left|\frac{a_{i+1}}{a_i}\right| = \lim_{i\to\infty} \left|\frac{ar^{i+1}}{ar^i}\right| = |r|,$$
which, by the ratio test, converges for $|r| < 1$ and diverges for $|r| > 1$.

Example 1.3. Test the harmonic series for convergence.
$$S_\infty = \sum_{i=1}^{\infty} \frac{1}{i} \;\Rightarrow\; \lim_{i\to\infty} \frac{a_{i+1}}{a_i} = \lim_{i\to\infty} \frac{i}{i + 1} = 1,$$
which, by the ratio test, is indeterminate.

Gauss’ test

- not general; may be applied only to series of specific form
- very sensitive; no indeterminacy.

Consider the series $S_\infty = \sum_{i=1}^{\infty} a_i$ where $a_i > 0 \;\forall\; i$. If $a_i/a_{i+1}$ has the form:
$$\frac{a_i}{a_{i+1}} = 1 + \frac{h}{i} + \frac{B(i)}{i^2},$$
where $B(i)$ is finite as $i \to \infty$, then
$$h > 1 \;\Rightarrow\; \text{series converges}; \qquad h \le 1 \;\Rightarrow\; \text{series diverges}.$$


Alternate form of Gauss’ test: If $a_i/a_{i+1}$ has the form:
$$\frac{a_i}{a_{i+1}} = \frac{i^2 + p_1 i + p_0}{i^2 + q_1 i + q_0}, \qquad p_1, p_0, q_1, q_0 \text{ constants},$$
then
$$p_1 > q_1 + 1 \;\Rightarrow\; \text{series converges}; \qquad p_1 \le q_1 + 1 \;\Rightarrow\; \text{series diverges}.$$

Example 1.4. Test the harmonic series for convergence.
$$S_\infty = \sum_{i=1}^{\infty} \frac{1}{i} \;\Rightarrow\; \frac{a_i}{a_{i+1}} = \frac{i + 1}{i} = 1 + \frac{1}{i} + \frac{0}{i^2},$$
which has the desired form for the first version of Gauss’ test. Since $B(i) = 0$ is finite and $h = 1$, the series diverges. Alternately,
$$\frac{a_i}{a_{i+1}} = \frac{i + 1}{i} = \frac{i}{i}\,\frac{i + 1}{i} = \frac{i^2 + (1)i + 0}{i^2 + (0)i + 0},$$
and $p_1 = 1$, $q_1 = 0$. Since $p_1 \le q_1 + 1$, the series diverges.

In either case, the ratio ai/ai+1 had to be “beat into” the desired form which,

in some cases, may take a little imagination.

Cauchy (Maclaurin) integral test

- allows analysis of a wider selection of series
- no indeterminacy.

Consider the series $S_\infty = \sum_{i=n}^{\infty} a_i$ where $a_i = f(i)$, $i, n \in \mathbb{Z}$, and where $f(x)$ is a continuous, monotonically decreasing function for $x > x_0 \in \mathbb{R}$. Then,
$$\int_n^\infty f(x)\,dx \; \begin{cases} \text{is finite} & \Rightarrow \text{series converges}; \\ = \infty & \Rightarrow \text{series diverges}. \end{cases}$$


Example 1.5. Test the series $S_\infty = \sum_{i=2}^{\infty} \frac{1}{i \ln i}$ for convergence.

ratio test: $\displaystyle\lim_{i\to\infty} \frac{a_{i+1}}{a_i} = \lim_{i\to\infty} \frac{i \ln i}{(i + 1)\ln(i + 1)} = 1 \;\Rightarrow\;$ indeterminate.

Gauss’ test: $\dfrac{a_i}{a_{i+1}} = \dfrac{(i + 1)\ln(i + 1)}{i \ln i}$, which can’t be beat into desired form.

Cauchy integral test: $\displaystyle\int_2^\infty \frac{dx}{x \ln x} = \int_2^\infty \frac{d(\ln x)}{\ln x} = \ln(\ln x)\Big|_2^\infty = \infty \;\Rightarrow\;$ divergent.
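The integral test’s estimate can be seen numerically: the partial sums of this series track $\ln(\ln n)$, growing without bound but extremely slowly. A minimal Python sketch (the cutoffs chosen are illustrative):

```python
import math

# Partial sums of sum 1/(i ln i) compared with the integral-test estimate ln(ln n).
def partial_sum(n):
    return sum(1.0 / (i * math.log(i)) for i in range(2, n + 1))

for n in (10, 1000, 100000):
    print(n, partial_sum(n), math.log(math.log(n)))
# The partial sums keep growing, tracking ln(ln n): divergence, but very slow.
```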

1.1.2 Alternating series

An alternating series is one in which the sign changes with every term:
$$S_\infty = \sum_{i=0}^{\infty} u_i \equiv \sum_{i=0}^{\infty} (-1)^i a_i, \qquad a_i > 0.$$

Alternating series converge more rapidly than positive definite series.

Leibnitz criterion for convergence: If $a_i$ is monotonically decreasing with $i$ and $\lim_{i\to\infty} a_i = 0$, the alternating series is convergent.

Note that this criterion is insufficient to determine convergence of a positive-definite series.

Consider the convergent series $S = \sum_{i=0}^{\infty} u_i$.

- If $\sum_{i=0}^{\infty} |u_i|$ converges, $S$ is said to be absolutely convergent.
- If $\sum_{i=0}^{\infty} |u_i|$ diverges, $S$ is said to be conditionally convergent.


Example 1.6. Test for convergence the alternating series,
$$S = \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots$$

Since $S$ is an alternating series with $\frac{a_{i+1}}{a_i} < 1$ (i.e., $a_i$ monotonically decreasing) and $\lim_{i\to\infty} a_i = 0$, $S$ converges by the Leibnitz criterion.

However, the absolute series (the harmonic series) diverges (Ex. 1.3), and thus $S$ is conditionally convergent.

To illustrate how strange conditionally convergent series can be, let’s find the value to which $S$ converges.

Arranging the terms of $S$ as follows:
$$S = 1 - \left(\frac{1}{2} - \frac{1}{3}\right) - \left(\frac{1}{4} - \frac{1}{5}\right) - \ldots,$$
it is evident that $S < 1$.

However, we can also arrange the terms as such:
$$S = \underbrace{\left(1 + \tfrac{1}{3} + \tfrac{1}{5}\right)}_{S_1} - \underbrace{\tfrac{1}{2}}_{S_2} + \underbrace{\left(\tfrac{1}{7} + \tfrac{1}{9} + \tfrac{1}{11} + \tfrac{1}{13} + \tfrac{1}{15}\right)}_{S_3} - \underbrace{\tfrac{1}{4}}_{S_4} + \underbrace{\left(\tfrac{1}{17} + \cdots + \tfrac{1}{25}\right)}_{S_5} - \underbrace{\tfrac{1}{6}}_{S_6} + \ldots,$$
where the partial sums after each labelled group are:
$$\left.\begin{aligned} S_1 &= 1.5333; & S_3 &= 1.5218; & S_5 &= 1.5143; & \ldots \\ S_2 &= 1.0333; & S_4 &= 1.2718; & S_6 &= 1.3476; & \ldots \end{aligned}\right\} \;\Rightarrow\; S = 1.5 > 1!$$


Thus, the value to which a conditionally convergent series converges can depend on the order in which the terms are added!

On the other hand, for absolutely convergent series:

- the convergent value is independent of the order in which terms are added;
- the product of two such series is also absolutely convergent, with value equal to the product of the individual series’ values.
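The rearrangement above can be generated algorithmically: add reciprocals of odd integers until the running total first exceeds the target 1.5, then subtract the next reciprocal of an even integer, and repeat. A minimal Python sketch (the function name and truncation counts are illustrative):

```python
import math

# Greedy rearrangement of the alternating harmonic series toward target 1.5.
# Summed in its natural order the same series gives ln 2 instead.
def rearranged_partial_sums(target, n_groups):
    total, odd, even, sums = 0.0, 1, 2, []
    for _ in range(n_groups):
        while total < target:          # add positive terms 1/odd
            total += 1.0 / odd
            odd += 2
        sums.append(total)             # partial sum after a positive group
        total -= 1.0 / even            # subtract one negative term 1/even
        even += 2
        sums.append(total)             # partial sum after the negative term
    return sums

sums = rearranged_partial_sums(1.5, 3)
print([round(s, 4) for s in sums])     # reproduces S1 .. S6 from the text
natural = sum((-1) ** (i + 1) / i for i in range(1, 100001))
print(natural, math.log(2))            # natural order -> ln 2
```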

Improving convergence

Consider the alternating series representation of the natural log:
$$\ln(1 + x) = -\sum_{i=1}^{\infty} \frac{(-x)^i}{i}, \tag{1.1.1}$$
where, by the Leibnitz criterion, the series converges for $|x| < 1$.

What does it mean for the series not to converge for $x = 2$, say, on the RHS when the LHS, $\ln 3$, is a perfectly legitimate number?

First, we’ve already seen that alternating series can have strange behaviour. Examine the partial sums, $S_n = -\sum_{i=1}^{n} \frac{(-x)^i}{i}$, for $x = 2$ and increasing $n$:

 n      S_n
 1       2
 2       0
 3       2.667
 4      −1.333
 5       5.067
 6      −5.6
 7      12.686
 8     −19.314
 9      37.575

[Figure: $S_n$ vs. $n$, oscillating about $\ln 3 = 1.0986$ with ever-growing amplitude.]


$S_n$ oscillates about $\ln 3$ with ever-increasing amplitude and, in this way, the series does not converge.

The series in Eq. (1.1.1) converges for |x| < 1, and most rapidly for |x| → 0.

For |x| → 1, rate of convergence (e.g., number of terms required for a given

accuracy) can be improved by multiplying by a suitable polynomial in x.

Example 1.7. We can improve the convergence rate of $\ln(1 + x)$ by multiplying it by $(1 + ax)$. To wit,
$$(1 + ax)\ln(1 + x) = -(1 + ax)\sum_{i=1}^{\infty} \frac{(-x)^i}{i} = -\sum_{i=1}^{\infty} \frac{(-x)^i}{i} + a\sum_{i=1}^{\infty} \frac{(-x)^{i+1}}{i}$$
$$= x - \sum_{i=2}^{\infty} \frac{(-x)^i}{i} + a\sum_{i=2}^{\infty} \frac{(-x)^i}{i - 1} = x - \sum_{i=2}^{\infty} (-x)^i\left(\frac{1}{i} - \frac{a}{i - 1}\right) = x - \sum_{i=2}^{\infty} (-x)^i\,\frac{i(1 - a) - 1}{i(i - 1)}.$$

For $a = 1$, we get:
$$\ln(1 + x) = \frac{x}{1 + x} + \frac{1}{1 + x}\sum_{i=2}^{\infty} \frac{(-x)^i}{i(i - 1)},$$
and the series now converges as $i^{-2}$ rather than $i^{-1}$, as in Eq. (1.1.1).
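The improvement in convergence rate is easy to quantify: truncate both series at the same $N$ and compare errors against the exact value. A minimal Python sketch ($x$ and $N$ are illustrative):

```python
import math

# Compare truncation error of Eq. (1.1.1) with the improved (a = 1) series.
def ln_series(x, N):
    return -sum((-x) ** i / i for i in range(1, N + 1))

def ln_improved(x, N):
    s = sum((-x) ** i / (i * (i - 1)) for i in range(2, N + 1))
    return x / (1 + x) + s / (1 + x)

x, N = 0.9, 50
exact = math.log(1 + x)
print(exact - ln_series(x, N), exact - ln_improved(x, N))
# The improved series' error is markedly smaller at the same N.
```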

1.1.3 Double sums, and the algebra of series

$\sum_{i=1}^{\infty} a_i$ is just a sum; the algebra of $\sum$ signs is just the algebra of addition.

First, there is no requirement that the symbol $i$ be used for the index:
$$\sum_{i=1}^{\infty} a_i = \sum_{j=1}^{\infty} a_j = \sum_{\alpha=1}^{\infty} a_\alpha = \sum_{\clubsuit=1}^{\infty} a_\clubsuit.$$


The index is a dummy variable since it disappears once the sum is performed: $\sum_{b=1}^{3} a_b = a_1 + a_2 + a_3$, and $b$ no longer appears on the RHS.

Thus, during the algebra of sums, dummy indices may be changed as needed.

Second, where the index begins and ends can be altered so long as its use on the series term is altered accordingly:
$$\sum_{i=1}^{\infty} a_i = \sum_{i=0}^{\infty} a_{i+1} = \sum_{i=-1}^{-\infty} a_{-i} = \sum_{i=1}^{\infty} (a_{2i} + a_{2i-1}),$$
and so it goes. If in doubt, write out the first few terms of each sum to confirm the equality.

Note that the last equality changes the order in which terms are added, and the series must be absolutely convergent for this not to change the sum value.

Note also the last equality is effectively a “sum of sums”: a double sum. In fact, a double, triple, whatever number of sums is, in the end, simply a sum.

Consider two collections of numbers represented in a sequence and a table:
$$\text{sequence: } a_0, a_1, a_2, \ldots; \qquad \text{table: } \begin{matrix} b_{00} & b_{01} & b_{02} & \ldots \\ b_{10} & b_{11} & b_{12} & \ldots \\ b_{20} & b_{21} & b_{22} & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{matrix}$$

A one-to-one association can be made between the $a$s and $b$s by reading the table along its diagonals, where $i = 0, 1, 2, \ldots$ labels the rows, $j = 0, 1, 2, \ldots$ the columns, $m$ the diagonal, and $n$ the position along the diagonal:

$m = 0$: $\; a_0 = b_{00}$;
$m = 1$: $\; a_1 = b_{10}, \; a_2 = b_{01}$;
$m = 2$: $\; a_3 = b_{20}, \; a_4 = b_{11}, \; a_5 = b_{02}$;
$\;\;\vdots$


which means the two lists have the same number of elements. By adding the $b$s in the order they appear along the diagonals, we see by inspection that:
$$S = \sum_{i=0}^{\infty} a_i = \sum_{m=0}^{\infty}\left(\sum_{n=0}^{m} b_{m-n,n}\right) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} b_{ij}. \tag{1.1.2}$$

Thus, $S$ can be expressed as a single infinite sum, an infinite sum of finite sums, or an infinite sum of infinite sums. In all cases, each sum has the same number of terms. All that’s different is the order of addition which, for $S$ absolutely convergent, makes no difference.

To derive Eq. (1.1.2) without the aid of a diagram we perform an algebra of series; essentially a careful manipulation of indices.

Example 1.8. Show that:
$$S = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} b_{ij} = \sum_{m=0}^{\infty}\sum_{n=0}^{m} b_{m-n,n}. \tag{1.1.3}$$

Taking inspiration from the indices on $b$ in Eq. (1.1.3), we let:
$$\left.\begin{aligned} 0 \le i = m - n < \infty &\;\Rightarrow\; n \le m < \infty \\ 0 \le j = n < \infty &\;\Rightarrow\; 0 \le n < \infty \end{aligned}\right\} \;\Rightarrow\; \underbrace{0 \le n \le m, \quad 0 \le m < \infty}_{\text{required limits on } n \text{ and } m}$$

What we did is replace dummy indices, $i$ and $j$, with new dummy indices, $m - n$ and $n$, and then determine what the limits on $m$ and $n$ must be. Since $m$ is the upper limit on $n$, the sum on $m$ must appear before the sum on $n$. Bringing these ideas together, we get Eq. (1.1.3), as desired.
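Eq. (1.1.3) can be spot-checked numerically on a finite table. A minimal Python sketch, assuming an illustrative $b_{ij} = 2^{-i}3^{-j}$ truncated to a $K \times K$ block so that both orders of addition are finite:

```python
# Numerical check of Eq. (1.1.3): summing a finite table row-by-row versus
# along its diagonals gives the same total (same terms, different order).
K = 4
def b(i, j):
    return 1.0 / (2**i * 3**j) if i < K and j < K else 0.0

row_by_row   = sum(b(i, j) for i in range(K) for j in range(K))
by_diagonals = sum(b(m - n, n) for m in range(2 * K - 1) for n in range(m + 1))
print(row_by_row, by_diagonals)   # identical
```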

Example 1.9. Let $f(x) = \sum_{i=0}^{\infty} a_i x^i$ and $g(x) = \sum_{i=0}^{\infty} b_i x^i$ be two polynomials. Find $h(x) = f(x)g(x)$.

Start with:
$$h(x) = \sum_{i=0}^{\infty} a_i x^i \sum_{j=0}^{\infty} b_j x^j,$$
where the second index has been changed to $j$ to keep it distinct from the first. Then, by expanding the sums,
$$h(x) = (a_0 + a_1 x + a_2 x^2 + \ldots)(b_0 + b_1 x + b_2 x^2 + \ldots)$$
$$= a_0 b_0 + a_0 b_1 x + a_0 b_2 x^2 + \ldots + a_1 b_0 x + a_1 b_1 x^2 + \ldots + a_2 b_0 x^2 + \ldots$$
$$= a_0 b_0 + (a_0 b_1 + a_1 b_0)\,x + (a_0 b_2 + a_1 b_1 + a_2 b_0)\,x^2 + \ldots$$
$$\Rightarrow\; h(x) = \sum_{m=0}^{\infty} x^m \sum_{n=0}^{m} a_n b_{m-n} \quad \text{(check it to see!)} \tag{1.1.4}$$

Expanding out sums is never the elegant way to do these problems. Instead, try performing an algebra of series.

First, sum signs commute with anything not involving their index. Thus,
$$h(x) = \sum_{i=0}^{\infty} a_i x^i \sum_{j=0}^{\infty} b_j x^j = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} a_i b_j x^{i+j}.$$

Now, redefine the indices so a single index is the power on $x$. Thus, let
$$m = i + j; \quad n = j \;\Rightarrow\; i = m - n; \quad j = n.$$

Now, $0 \le i < \infty \Rightarrow 0 \le m - n < \infty \Rightarrow n \le m < \infty$. Further, $0 \le j < \infty \Rightarrow 0 \le n < \infty$ which, with the above, means: $0 \le n \le m$ and $0 \le m < \infty$.

Pulling this together, we get:
$$h(x) = \sum_{m=0}^{\infty}\sum_{n=0}^{m} a_{m-n} b_n x^m = \sum_{m=0}^{\infty} x^m \sum_{n=0}^{m} a_{m-n} b_n. \tag{1.1.5}$$


With indices redefined, the power on x no longer depends on the index of the

second sum (the fact that n ≤ m doesn’t count!), and xm can slip through

the second sum to rest against the first sum whose index is m. This amounts

to factoring out all like powers of x, as was done to derive Eq. (1.1.4).

The reversal of the subscripts on $a$ and $b$ in Eqs. (1.1.4) and (1.1.5) just means the terms in parentheses before Eq. (1.1.4) are added in reverse order.

(Expand it out to see for yourself!)
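The double-sum manipulation in Eq. (1.1.5) is exactly how polynomial multiplication is implemented in code. A minimal Python sketch (the function name and sample coefficients are illustrative):

```python
# Polynomial (Cauchy) product via Eq. (1.1.5): c_m = sum_{n=0}^m a_{m-n} b_n.
def poly_mult(a, b):
    """Coefficients of h = f*g, with a[i], b[i] the coefficients of x^i."""
    c = [0.0] * (len(a) + len(b) - 1)
    for m in range(len(c)):
        for n in range(m + 1):
            if m - n < len(a) and n < len(b):
                c[m] += a[m - n] * b[n]
    return c

# (1 + x)(1 + 2x + x^2) = 1 + 3x + 3x^2 + x^3
print(poly_mult([1, 1], [1, 2, 1]))   # [1.0, 3.0, 3.0, 1.0]
```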

1.1.4 Taylor and Maclaurin series

On integrating the $n$th derivative of $f(x)$, $f^{(n)}(x)$, over $[x_0, x]$, we get:
$$\int_{x_0}^{x} f^{(n)}(x_1)\,dx_1 = f^{(n-1)}(x) - f^{(n-1)}(x_0).$$

Integrating a second time, we get:
$$\int_{x_0}^{x}\int_{x_0}^{x_2} f^{(n)}(x_1)\,dx_1 dx_2 = \int_{x_0}^{x}\left(f^{(n-1)}(x_2) - f^{(n-1)}(x_0)\right)dx_2 = f^{(n-2)}(x) - f^{(n-2)}(x_0) - (x - x_0)f^{(n-1)}(x_0),$$

and a third time,
$$\int_{x_0}^{x}\int_{x_0}^{x_3}\int_{x_0}^{x_2} f^{(n)}(x_1)\,dx_1 dx_2 dx_3 = \int_{x_0}^{x}\left(f^{(n-2)}(x_3) - f^{(n-2)}(x_0) - (x_3 - x_0)f^{(n-1)}(x_0)\right)dx_3$$
$$= f^{(n-3)}(x) - f^{(n-3)}(x_0) - (x - x_0)f^{(n-2)}(x_0) - \frac{(x - x_0)^2}{2}f^{(n-1)}(x_0),$$

and so on, until the $n$th time:
$$R_n \equiv \int_{x_0}^{x}\int_{x_0}^{x_n}\cdots\int_{x_0}^{x_2} f^{(n)}(x_1)\,dx_1 dx_2 \cdots dx_n = f(x) - f(x_0) - (x - x_0)f'(x_0) - \cdots - \frac{(x - x_0)^{n-1}}{(n-1)!}f^{(n-1)}(x_0).$$


Solving for $f(x)$, we get Taylor’s expansion:
$$f(x) = f(x_0) + (x - x_0)f'(x_0) + \cdots + \frac{(x - x_0)^{n-1}}{(n-1)!}f^{(n-1)}(x_0) + R_n,$$
where, by the mean value theorem, the “remainder term”, $R_n$, is given by:
$$R_n = \frac{(x - x_0)^n}{n!}f^{(n)}(\xi),$$
for some $x_0 \le \xi \le x$. With $R_n$ retained, this expansion is exact.

For $\lim_{n\to\infty} R_n = 0$, we get the Taylor series:
$$f(x) = \sum_{i=0}^{\infty} \frac{(x - x_0)^i}{i!}f^{(i)}(x_0). \tag{1.1.6}$$

For $x_0 = 0$, we get the Maclaurin series:
$$f(x) = \sum_{i=0}^{\infty} \frac{x^i}{i!}f^{(i)}(0). \tag{1.1.7}$$

Example 1.10. Express $f(x) = e^x$ as a Maclaurin series.

Since $f^{(i)}(x) = e^x$, we have $f^{(i)}(0) = 1$, and Eq. (1.1.7) becomes:
$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}(1) = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \ldots$$

The ratio test can be used to show this series converges $\forall$ finite $x$.
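The rapid, factorial-driven convergence can be seen numerically. A minimal Python sketch (the choices of $x$ and $N$ are illustrative):

```python
import math

# Partial sums of the Maclaurin series for e^x versus the exact value.
def exp_series(x, N):
    return sum(x**i / math.factorial(i) for i in range(N + 1))

for N in (5, 10, 20):
    print(N, exp_series(2.0, N), math.exp(2.0))
# The factorial in the denominator beats x^i for any fixed x: rapid convergence.
```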

1.1.5 Binomial expansion

The binomial expansion (sometimes called a power or series expansion) is the Maclaurin series for the function $(1 + x)^y$, $y \in \mathbb{R}$:
$$(1 + x)^y = 1 + yx + \frac{y(y - 1)}{2}x^2 + \frac{y(y - 1)(y - 2)}{3!}x^3 + \cdots + R_n,$$
where the remainder term is given by:
$$R_n = \frac{x^n}{n!}(1 + \xi)^{y-n}\,y(y - 1)\ldots(y - n + 1); \qquad 0 \le \xi \le x.$$

For $y \in \mathbb{Z} > 0$, at the $n$th term where $n = y + 1$, $R_n = 0$ and the binomial expansion is exact with a finite number of terms. No surprise! Such an expansion is just $(1 + x)^y$ multiplied out:
$$(1 + x)^y = \sum_{i=0}^{y} \frac{y!}{i!(y - i)!}x^i \equiv \sum_{i=0}^{y} \binom{y}{i} x^i,$$
where $\binom{y}{i}$ is read “$y$ choose $i$”, the number of ways $i$ items can be chosen from $y \ge i$ objects. In this context, it is called the binomial coefficient.

For $n > y \in \mathbb{R}$ (when the binomial expansion becomes an alternating series), $(1 + \xi)^{y-n}$ is a maximum for $\xi = 0$ $\Rightarrow$
$$R_n \le \frac{x^n}{n!}\,y(y - 1)\ldots(y - n + 1) \to 0 \text{ as } n \to \infty \text{ if } 0 \le x < 1,$$
and the series converges by the Leibnitz criterion. If $x \ge 1$, the series diverges (in the oscillatory fashion illustrated above for $\ln(1 + x)$ at $x = 2$).

So what if $x > 1$? Is there no way to find a convergent binomial expansion? Sure there is! Choose $x_0 > (x - 1)/2$, and then:
$$(1 + x)^y = \big(1 + x_0 + (x - x_0)\big)^y = (1 + x_0)^y\left(1 + \frac{x - x_0}{1 + x_0}\right)^y = (1 + x_0)^y(1 + \xi)^y,$$
where $\xi \equiv \dfrac{x - x_0}{1 + x_0} < 1$ since $x_0 > (x - 1)/2$. By pulling out the factor $(1 + x_0)^y$, a binomial expansion is performed on $(1 + \xi)^y$, which converges by virtue of $\xi < 1$.

Example 1.11. Find a power series expansion for $f(x) = (x_0 + x)^a$, where $x_0 > x$.

For convergence, we need something of the form $(1 + \xi)^a$, where $\xi < 1$. Thus,
$$f(x) = (x_0 + x)^a = x_0^a\left(1 + \frac{x}{x_0}\right)^a = x_0^a\left(1 + a\xi + \frac{a(a - 1)}{2}\xi^2 + \frac{a(a - 1)(a - 2)}{3!}\xi^3 + \ldots\right),$$
where $\xi = x/x_0 < 1$.

Exercise: Repeat for $x_0 < x$.

Example 1.12. The total relativistic energy of a particle of rest mass $m_0$ is:
$$E = mc^2 = m_0\gamma c^2 = \frac{m_0 c^2}{\sqrt{1 - \beta^2}} = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}},$$
where $v < c$ is the speed of $m_0$ relative to an observer, and $c$ is the speed of light. Write down the first three terms of a power series in $v^2$ for $E$.

The binomial expansion is performed on $\gamma$:
$$\gamma = (1 - \beta^2)^{-1/2} = 1 + \left(-\tfrac{1}{2}\right)(-\beta^2) + \left(-\tfrac{1}{2}\right)\left(-\tfrac{3}{2}\right)\frac{(-\beta^2)^2}{2!} + \left(-\tfrac{1}{2}\right)\left(-\tfrac{3}{2}\right)\left(-\tfrac{5}{2}\right)\frac{(-\beta^2)^3}{3!} + \ldots$$
$$= 1 + \frac{\beta^2}{2} + \frac{3\beta^4}{8} + \frac{15\beta^6}{48} + \ldots$$
$$\Rightarrow\; E = m_0 c^2\left(1 + \frac{v^2}{2c^2} + \frac{3v^4}{8c^4} + \ldots\right) = m_0 c^2 + \frac{1}{2}m_0 v^2 + \frac{3m_0}{8c^2}v^4 + \ldots$$

Evidently, the relativistic energy of motion consists of the rest mass energy ($m_0 c^2$), the Newtonian kinetic energy, and higher-order correction terms.
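The quality of the truncated expansion can be checked against the exact $\gamma$. A minimal Python sketch (the $\beta$ values are illustrative):

```python
import math

# Compare exact relativistic gamma with its truncated binomial expansion.
def gamma_exact(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def gamma_series(beta):
    b2 = beta**2
    return 1.0 + b2 / 2 + 3 * b2**2 / 8 + 15 * b2**3 / 48

for beta in (0.01, 0.1, 0.5):
    print(beta, gamma_exact(beta), gamma_series(beta))
# For v << c the truncated series is excellent; it degrades as beta -> 1.
```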

1.2 Mathematical induction

Mathematical Induction is a technique designed to prove a general form for

an expression indexed, say, by an integer n. The basic procedures are to:


1. prove the assertion true for the simplest case (e.g., for n = 1);

2. assume the assertion to be true for n = m− 1 where m is arbitrary;

3. prove the assertion true for n = m.

In so doing, the assertion is proved.

Example 1.13. It was asserted without proof that the binomial expansion of $f(x) = (1 + x)^y$ for $y \in \mathbb{Z}$ is:
$$f(x) = (1 + x)^y = \sum_{n=0}^{\infty} \frac{x^n}{n!}f^{(n)}(0) = \sum_{n=0}^{\infty} \frac{y!}{n!(y - n)!}x^n.$$
Prove the last equality, which amounts to proving:
$$\underbrace{f^{(n)}(x)}_{\text{LHS}} = \underbrace{\frac{y!}{(y - n)!}(1 + x)^{y-n}}_{\text{RHS}} \;\Rightarrow\; f^{(n)}(0) = \frac{y!}{(y - n)!}, \;\forall\; n. \tag{1.2.1}$$

Following the steps of mathematical induction,

1. Prove Eq. (1.2.1) true for $n = 0$.
LHS: $f^{(0)}(x) = f(x) = (1 + x)^y$
RHS: $\dfrac{y!}{(y - 0)!}(1 + x)^{y-0} = (1 + x)^y$
$\Rightarrow$ LHS = RHS; true for $n = 0$.

2. Assume Eq. (1.2.1) true for $n = m - 1$. Thus,
$$f^{(m-1)}(x) = \frac{y!}{(y - m + 1)!}(1 + x)^{y-m+1}. \tag{1.2.2}$$

3. Prove Eq. (1.2.1) true for $n = m$. For this, differentiate Eq. (1.2.2) to get:
$$\frac{d}{dx}f^{(m-1)}(x) = f^{(m)}(x) = \frac{y!}{(y - m + 1)!}\frac{d}{dx}(1 + x)^{y-m+1} = \frac{y!(y - m + 1)}{(y - m + 1)!}(1 + x)^{y-m} = \frac{y!}{(y - m)!}(1 + x)^{y-m},$$
which agrees with Eq. (1.2.1) for $n = m$. Thus, the assertion is proved.

How does mathematical induction actually work?

By proving the assertion (Eq. 1.2.1) true at a starting point (in this case, $n = 0$), the assumption in step 2 is true at least for $m = 1$. Step 3 then proves the assertion for $n = 1$.

Now go back to step 1 where, by virtue of the just completed step 3, “update”

the starting point to n = 1. Step 2 is therefore valid for m = 2 and step 3

proves the assertion for n = 2, etc., etc.

Mathematical Induction is a “proof by recursion”. It lends itself well to prob-

lems where a general formula has been proposed based on patterns detected

in the first few terms, and a formalism is needed to confirm the “assertion”.

1.3 Multivariate calculus (AWH §3.5–3.8)

1.3.1 Partial derivatives

The partial derivatives of the multivariate function, $f(x, y)$, are defined as:
$$\frac{\partial f}{\partial x} \equiv \partial_x f \equiv \lim_{h\to 0} \frac{f(x + h, y) - f(x, y)}{h}; \qquad \frac{\partial f}{\partial y} \equiv \partial_y f \equiv \lim_{h\to 0} \frac{f(x, y + h) - f(x, y)}{h}.$$

[Figure: surface $z = f(x, y)$; the slope of the tangent along a curve of $y =$ constant is $\partial_x f$.]

$\partial_x f$ = rate of change of $f$ with respect to $x$ holding $y$ constant.

In univariate calculus, $f(x)$ differentiable at $x_0$ $\Rightarrow$ $f(x)$ continuous at $x_0$.


This is not the case for multivariate calculus.

Example 1.14. Consider the bivariate function,
$$f(x, y) = \begin{cases} \dfrac{2xy}{x^2 + y^2} & (x, y) \ne (0, 0); \\[4pt] 0 & (x, y) = (0, 0). \end{cases}$$

For $\partial_x f$ to exist at $(0, 0)$, we must have $f(x, y)$ continuous along $y = 0$:

- since $f(x, 0) = 0 \;\forall\; x$, $f(x, 0)$ is continuous, and $\partial_x f$ exists at $(0, 0)$;
- similarly, $\partial_y f$ exists at $(0, 0)$.

Yet, along the line $y = x$,
$$f(x, x) = \begin{cases} \dfrac{2x^2}{2x^2} = 1 & x \ne 0; \\[4pt] 0 & x = 0, \end{cases}$$
and clearly $f(x, y)$ is not continuous at $(0, 0)$ along $y = x$.

$\partial_x f$ and $\partial_y f$ are derivatives of $f(x, y)$ in specific directions. To determine the continuity of $f(x, y)$ at a point, derivatives must be checked in all directions. Thus, even if $\partial_x f$ and $\partial_y f$ exist at $(x_0, y_0)$, this does not necessarily mean $f(x, y)$ is continuous there.
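The path dependence in Example 1.14 is easy to see numerically. A minimal Python sketch:

```python
# Probe f(x, y) = 2xy/(x^2 + y^2) along different paths toward (0, 0).
def f(x, y):
    return 2 * x * y / (x**2 + y**2) if (x, y) != (0, 0) else 0.0

for t in (0.1, 0.01, 0.001):
    print(f(t, 0.0), f(t, t))   # along y = 0: always 0; along y = x: always 1
# The limit at (0,0) depends on the approach direction, so f is discontinuous
# there even though both partial derivatives exist at the origin.
```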

1.3.2 Mixed partials

For bivariate functions, there are four second-order partial derivatives:
$$\frac{\partial^2 f}{\partial x^2} \equiv \partial_{xx} f; \qquad \partial_y\frac{\partial f}{\partial x} = \frac{\partial^2 f}{\partial y\,\partial x} \equiv \partial_{yx} f; \qquad \frac{\partial^2 f}{\partial y^2} \equiv \partial_{yy} f; \qquad \partial_x\frac{\partial f}{\partial y} = \frac{\partial^2 f}{\partial x\,\partial y} \equiv \partial_{xy} f.$$


For n-variate functions, there are n2 such constructs.

The quantities ∂yxf and ∂xyf are known as mixed partials.

Theorem 1.1. (Stated without proof) If f , ∂xf , ∂yf , ∂yxf , and ∂xyf are all

continuous in a region R, then ∂yxf = ∂xyf in R.

1.3.3 The gradient

The gradient of a trivariate function, $\phi(x, y, z)$, is given by:
$$\nabla\phi \equiv (\partial_x\phi, \partial_y\phi, \partial_z\phi),$$
where $\nabla$ is called the nabla or del operator, and is formally defined as:
$$\nabla \equiv (\partial_x, \partial_y, \partial_z).$$

Don’t confuse $\nabla$ with a proper vector whose components can be evaluated. Only when $\nabla$ operates on a function, $\phi$, can the components be evaluated. $\nabla$ is an operator, not a function, not a number.

Consider the position vector, $\vec{r} = (x, y, z) \Rightarrow r = \sqrt{x^2 + y^2 + z^2}$; $d\vec{r} = \hat{\imath}\,dx + \hat{\jmath}\,dy + \hat{k}\,dz$. Further, consider a differential of the function $\phi(x, y, z)$:
$$d\phi = \frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz = \nabla\phi \cdot d\vec{r}, \tag{1.3.1}$$
using the chain rule. Thus, we occasionally write:
$$\nabla\phi = \frac{d\phi}{d\vec{r}},$$
though this is somewhat abusive of notation since vectors aren’t supposed to appear in denominators!


Example 1.15. Find $\nabla\phi$ for $\phi = \phi(r)$ [and not $\phi(\vec{r}\,)$].

From the chain rule, we have:
$$\nabla\phi(r) = \hat{\imath}\frac{\partial\phi}{\partial x} + \hat{\jmath}\frac{\partial\phi}{\partial y} + \hat{k}\frac{\partial\phi}{\partial z} = \hat{\imath}\frac{d\phi}{dr}\frac{\partial r}{\partial x} + \hat{\jmath}\frac{d\phi}{dr}\frac{\partial r}{\partial y} + \hat{k}\frac{d\phi}{dr}\frac{\partial r}{\partial z} = \frac{d\phi}{dr}\left(\hat{\imath}\frac{x}{r} + \hat{\jmath}\frac{y}{r} + \hat{k}\frac{z}{r}\right) = \frac{d\phi}{dr}\hat{r}. \tag{1.3.2}$$

Interpretation of the gradient

[Figure: contours $\phi_1$, $\phi_2$, $\phi_3$ of constant $\phi$, with $\nabla\phi$ perpendicular to them.]

From Eq. (1.3.1), $d\phi$ is a maximum when $d\vec{r} \parallel \nabla\phi$ $\Rightarrow$ as a vector, $\nabla\phi$ points in the direction of maximum (local) rate of change. Thus the direction of “steepest ascent” is along the gradient (and steepest descent along $-\nabla\phi$). Conversely, $d\phi$ is a minimum (0) when $d\vec{r} \perp \nabla\phi$. Thus, along lines $\perp$ to the gradient (“contours”), $d\phi = 0$ and $\phi$ = constant.

- Since $\vec{E} = -\nabla V$, electric field lines are $\perp$ to equipotential surfaces (surfaces of constant $V$).
- For a conservative force, $\vec{F} = -\nabla\phi$, where $\phi$ is the potential. Thus, $\vec{F}$ is $\perp$ to lines of constant $\phi$. Bookshelves built parallel to the ground mean shelves lie along equipotentials $\Rightarrow$ the component of gravitational force along shelves is zero $\Rightarrow$ books don’t slide off!

Ski hill analogy

[Figure: ski hill with topological contours and the fall line.]

Consider a ski hill where $f(x, y)$ = height over a point $(x, y)$ at sea level. Topological contours are the loci of points of constant $f(x, y)$—points at the same height above sea level.

A skier resting on the hill aligns her skis tangent to the local “topological contour” ($\perp \nabla f$) so that she will neither slide forward nor backward.

The “fall line” is the direction of steepest descent, $\parallel \nabla f$ and thus $\perp$ to the local topological contour. Thus, a resting skier always points her skis $\perp$ to the direction in which she is gazing, namely down the fall line; the fastest way down the hill!

Application: critical points for multivariate functions

For univariate calculus, a critical point of $g(x)$ is where $dg/dx = 0$. Further,
$$\frac{d^2 g}{dx^2} \begin{cases} > 0 & \Rightarrow \text{minimum}; \\ < 0 & \Rightarrow \text{maximum}; \\ = 0 & \Rightarrow \text{indeterminate (min., max., or inflection point)}. \end{cases}$$

For bivariate calculus, a critical point of $f(x, y)$ is where:
$$\partial_x f = 0 \text{ and } \partial_y f = 0 \;\Rightarrow\; \nabla f = 0.$$

The nature of the critical point is determined from the determinant:
$$D = -\begin{vmatrix} \partial_{xx} f & \partial_{xy} f \\ \partial_{yx} f & \partial_{yy} f \end{vmatrix} = -\partial_{xx} f\,\partial_{yy} f + \left(\partial_{xy} f\right)^2,$$
invoking Theorem 1.1 for sufficiently “well-behaved” functions. Then,
$$D \begin{cases} > 0 & \Rightarrow \text{“saddle point”}; \\ < 0 & \Rightarrow \text{minimum if } \partial_{xx} f \text{ (or } \partial_{yy} f) > 0 \text{ at the point}; \\ & \quad\, \text{maximum if } \partial_{xx} f \text{ (or } \partial_{yy} f) < 0 \text{ at the point}; \\ = 0 & \Rightarrow \text{indeterminate}. \end{cases}$$

[Figure: surfaces illustrating a minimum, a maximum, and a saddle point.]
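The classification rule can be wrapped in a short function. A minimal Python sketch (the function name and sample second derivatives are illustrative; each sample function has a critical point at the origin):

```python
# Classify a critical point using D = -(f_xx f_yy) + (f_xy)^2.
def classify(fxx, fyy, fxy):
    D = -fxx * fyy + fxy**2
    if D > 0:
        return "saddle point"
    if D < 0:
        return "minimum" if fxx > 0 else "maximum"
    return "indeterminate"

# f = x^2 + y^2: f_xx = f_yy = 2, f_xy = 0   -> minimum
# f = x^2 - y^2: f_xx = 2, f_yy = -2, f_xy = 0 -> saddle point
print(classify(2, 2, 0), classify(2, -2, 0), classify(-2, -2, 0))
```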


1.3.4 The divergence

The divergence of the vector function $\vec{A}(x, y, z)$ (in Cartesian coordinates) is:
$$\nabla \cdot \vec{A} = \left(\hat{\imath}\partial_x + \hat{\jmath}\partial_y + \hat{k}\partial_z\right) \cdot \left(\hat{\imath}A_x + \hat{\jmath}A_y + \hat{k}A_z\right) = \partial_x A_x + \partial_y A_y + \partial_z A_z,$$
since $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ form an orthonormal set.

Example 1.16. Find $\nabla \cdot \vec{A}$ for $\vec{A} = \vec{r}$.
$$\nabla \cdot \vec{r} = \partial_x x + \partial_y y + \partial_z z = 3.$$
Yes, it’s as simple as that!

Example 1.17. Find $\nabla \cdot \vec{A}$ for $\vec{A} = \vec{r}f(r)$.
$$\nabla \cdot \vec{A} = \partial_x\big(xf(r)\big) + \partial_y\big(yf(r)\big) + \partial_z\big(zf(r)\big) = f(r) + x\partial_x f(r) + f(r) + y\partial_y f(r) + f(r) + z\partial_z f(r).$$
Now, by the chain rule, $\dfrac{\partial f(r)}{\partial x} = \dfrac{df(r)}{dr}\dfrac{\partial r}{\partial x} = \dfrac{df(r)}{dr}\dfrac{x}{r}$ $\Rightarrow$
$$\nabla \cdot \vec{A} = 3f(r) + \frac{df(r)}{dr}\left(\frac{x^2}{r} + \frac{y^2}{r} + \frac{z^2}{r}\right) = 3f(r) + r\frac{df(r)}{dr}.$$
Specifically, if $f(r) = r^{n-1}$, then,
$$\nabla \cdot \vec{A} = \nabla \cdot (\vec{r}\,r^{n-1}) = \nabla \cdot (r^n\hat{r}) = 3r^{n-1} + r\frac{dr^{n-1}}{dr} = 3r^{n-1} + (n - 1)r^{n-1} = (n + 2)r^{n-1}.$$
For $n = -2$, this reduces to: $\nabla \cdot \left(\dfrac{\hat{r}}{r^2}\right) = 0$, an extremely important result, since Nature seems to like inverse square laws.
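The result $\nabla\cdot(\hat{r}/r^2) = 0$ (away from the origin) can be verified with centred finite differences. A minimal Python sketch (the sample point and step size are illustrative):

```python
# Finite-difference check that div(r_hat / r^2) = 0 away from the origin.
def A(x, y, z):
    """Component form of r_hat / r^2 = r_vec / r^3."""
    r3 = (x*x + y*y + z*z) ** 1.5
    return (x / r3, y / r3, z / r3)

def divergence(x, y, z, h=1e-5):
    dAx = (A(x + h, y, z)[0] - A(x - h, y, z)[0]) / (2 * h)
    dAy = (A(x, y + h, z)[1] - A(x, y - h, z)[1]) / (2 * h)
    dAz = (A(x, y, z + h)[2] - A(x, y, z - h)[2]) / (2 * h)
    return dAx + dAy + dAz

print(divergence(1.0, 2.0, -0.5))   # ~0 to truncation error
```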


Interpretation of the divergence

In fluid dynamics, $\nabla \cdot \vec{v}$ ($\vec{v}$ the velocity field) is a measure of the compression or expansion of the flow. In 1-D, $\nabla \cdot \vec{v} \to \partial_x v_x$, and,
$$\partial_x v_x \begin{cases} < 0 & \Rightarrow \text{fluid slowing down} \Rightarrow \text{compression (to conserve mass)}; \\ > 0 & \Rightarrow \text{fluid speeding up} \Rightarrow \text{expansion}. \end{cases}$$

Large negative values of $\partial_x v_x$ $\Rightarrow$ fluid shocks. If $\nabla \cdot \vec{v} = 0$ everywhere always, the fluid is incompressible.

In general, the divergence of a vector field may be thought of as the flux of a vector field leaving or entering a closed surface.

[Figure: two closed surfaces $S$. Top: all field lines of $\vec{V}$ originate at a point $P$ inside $S$. Bottom: field lines originate outside $S$ and pass through it.]

In the top case, all field lines of vector field $\vec{V}$ originate from point $P$ contained within closed surface $S$. All field lines leave $S$, none enter $\Rightarrow \nabla \cdot \vec{V} > 0$. Conversely, if all field lines enter $S$ and none leave, $\nabla \cdot \vec{V} < 0$.

In the bottom case, all field lines originate from outside $S$ and as many field lines of $\vec{V}$ enter $S$ as leave it $\Rightarrow \nabla \cdot \vec{V} = 0$.

Maxwell’s first relation for the electric field is: $\nabla \cdot \vec{E} = \rho/\epsilon_0$, where $\rho$ is the positive or negative charge density. This is an example of the top case.

Maxwell’s second relation for the magnetic field is: $\nabla \cdot \vec{B} = 0$ (no magnetic charges, or monopoles). This is an example of the bottom case.

A vector field with zero divergence is said to be solenoidal.


1.3.5 The curl

The curl of ~A (in Cartesian coordinates) is: ∇× ~A ≡

∣∣∣∣∣∣

ı k∂x ∂y ∂zAx Ay Az

∣∣∣∣∣∣

⇒ ∇× ~A = ı (∂yAz − ∂zAy) + (∂zAx − ∂xAz) + k (∂xAy − ∂yAx).

The curl contains all the “cross derivatives”, while the divergence contains

all the “compressive derivatives”.

“Product rule” for curls:

[∇× (f ~A)]x = ∂y(fAz) − ∂z(fAy) = f(∂yAz − ∂zAy) + Az ∂yf − Ay ∂zf

= f [∇× ~A]x + [(∇f)× ~A]x

⇒ ∇× (f ~A) = f ∇× ~A + (∇f)× ~A.

Example 1.18. Evaluate ∇× ~A for ~A = ~r f(r).

From the “product rule”,

∇× (~r f(r)) = f(r) ∇× ~r + ∇(f(r)) × ~r.   (1.3.3)

Now, ∇× ~r = | ı ȷ k ; ∂x ∂y ∂z ; x y z | = 0 since all cross derivatives are zero.

This plus Eq. (1.3.2) ⇒ ∇× (~r f(r)) = (df(r)/dr) r̂ × ~r = 0.

A vector field whose curl is zero is said to be irrotational.
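Example 1.18 can be spot-checked for a particular f. A minimal sketch (my own illustration, assuming sympy) with the choice f(r) = 1/r³:

```python
# Hypothetical spot-check of Example 1.18 with f(r) = 1/r**3:
# the field ~r f(r) should be irrotational.
from sympy import sqrt, simplify
from sympy.vector import CoordSys3D, curl

N = CoordSys3D('N')
r = sqrt(N.x**2 + N.y**2 + N.z**2)
rvec = N.x*N.i + N.y*N.j + N.z*N.k

c = curl(rvec / r**3)
components = [simplify(c.dot(e)) for e in (N.i, N.j, N.k)]
print(components)   # [0, 0, 0]
```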


Interpretation of the curl

[Figure: circulating flow in the x-y plane, and a shear layer in which vx changes sign across y.]

In fluids, ∇× ~v (vorticity) is a measure of flow circulation. As fluid circulates, vx = vx(x, y) and vy = vy(x, y) and, in general, ∇× ~v ≠ 0.

In a shear layer where, as a function of y, ~v goes from ∝ −ı to ∝ +ı, ∇× ~v = −∂yvx k ≠ 0.

1.3.6 Vector identities

Let ~A, ~B, ~C, ~D be four vector functions of the coordinates, and let f and g

be two scalar functions of the coordinates. Then,

~A · ( ~B × ~C) = ~B · ( ~C × ~A) = ~C · ( ~A× ~B);

~A× ( ~B × ~C) = ( ~A · ~C) ~B − ( ~A · ~B) ~C;

( ~A× ~B) · ( ~C × ~D) = ( ~A · ~C)( ~B · ~D)− ( ~A · ~D)( ~B · ~C);

∇(fg) = f∇g + g∇f ;

∇(f/g) = (g∇f − f∇g)/g²;

∇( ~A · ~B) = ( ~B · ∇) ~A+ ( ~A · ∇) ~B + ~B × (∇× ~A) + ~A× (∇× ~B);

∇ · (f ~A) = f∇ · ~A+ ~A · ∇f ;

∇ · ( ~A× ~B) = ~B · (∇× ~A)− ~A · (∇× ~B);

∇× (f ~A) = f∇× ~A+∇f × ~A;

∇× ( ~A× ~B) = ( ~B · ∇) ~A− ( ~A · ∇) ~B − ~B(∇ · ~A) + ~A(∇ · ~B);

∇× (∇f) = 0;

∇ · (∇× ~A) = 0.
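The last two identities, ∇× (∇f) = 0 and ∇ · (∇× ~A) = 0, hold for any sufficiently smooth fields and can be checked mechanically. A sketch (my own, assuming sympy) with arbitrary f and ~A:

```python
# Hypothetical check of the two "automatic zero" identities using
# fully arbitrary (undetermined) scalar and vector functions.
from sympy import Function, simplify
from sympy.vector import CoordSys3D, gradient, divergence, curl, Vector

N = CoordSys3D('N')
f = Function('f')(N.x, N.y, N.z)
A = (Function('A_x')(N.x, N.y, N.z)*N.i
     + Function('A_y')(N.x, N.y, N.z)*N.j
     + Function('A_z')(N.x, N.y, N.z)*N.k)

curl_grad = curl(gradient(f))            # ∇×(∇f)
div_curl = simplify(divergence(curl(A))) # ∇·(∇×~A)
print(curl_grad, div_curl)               # both are identically zero
```

Both vanish because mixed partial derivatives commute, which is exactly the content of the two identities.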


1.3.7 Theorems of vector calculus

Theorem 1.2. Gauss’ Theorem: Let V be a volume with surface Σ. Then,

∮_Σ ~A · d~σ = ∫_V ∇ · ~A dV   (1.3.4)

∮_Σ φ d~σ = ∫_V ∇φ dV   (1.3.5)

∮_Σ ~A × d~σ = −∫_V ∇× ~A dV   (1.3.6)

[Figure: volume V with closed surface Σ and outward surface element d~σ in a field ~A.]

Proof: Consider an arbitrarily small cube centred at (x, y, z) of volume δV =

δx δy δz in the presence of a vector field ~A.

[Figure: cube of volume δV = δx δy δz centred at (x, y, z), with face values Ax(x ± ½δx, y, z), Ay(x, y ± ½δy, z) and Az(x, y, z ± ½δz).]

The ı face at (x + ½δx, y, z) is δ~σx = δy δz ı (direction points outside cube). Thus,

~A · δ~σx |_(x+½δx) = Ax(x + ½δx, y, z) δy δz.

Similarly, the ı face at (x − ½δx, y, z) is δ~σx = δy δz (−ı) (direction points outside cube) and,

~A · δ~σx |_(x−½δx) = −Ax(x − ½δx, y, z) δy δz.

Thus, their sum is:

~A · δ~σx |_(x+½δx) + ~A · δ~σx |_(x−½δx) = (Ax(x + ½δx, y, z) − Ax(x − ½δx, y, z)) δy δz

= (δAx/δx) δx δy δz = (δAx/δx) δV.

By including the sums on the other four faces, the sum of ~A · δ~σ over the

entire (closed) surface of the cube is:

∑_cube ~A · δ~σ = (δAx/δx + δAy/δy + δAz/δz) δV = ∇ · ~A δV,   (1.3.7)


in the limit δx → dx, etc.

[Figure: two adjacent cubes, Σ1 (blue) at (x, y, z) and Σ2 (red) at (x − δx, y, z), sharing the face at (x − ½δx, y, z); outward face elements δ~σ1,L ∝ −ı and δ~σ2,R ∝ +ı.]

Now, introduce a second cube, Σ2 (red), at (x − δx, y, z) that shares the (x − ½δx, y, z) face with the first cube, Σ1 (blue).

Adding the contributions of the two cubes,

the RHS of Eq. (1.3.7) is the sum of the two values of ∇ · ~A δV .

Likewise, the LHS of Eq. (1.3.7) is the sum of all twelve faces, although the

contributions from the interior faces cancel. The right face of Σ2, δ~σ2,R ∝ +ı

while the left face of Σ1, δ~σ1,L ∝ −ı. Thus,

~A · δ~σ2,R = − ~A · δ~σ1,L,

and the LHS of Eq. (1.3.7) is just the sum from all exterior faces.

This is true no matter how many cubes we bring together; only the exterior

faces (the surface of the aggregate) contribute to the LHS of Eq. (1.3.7) while

all cubes contribute fully to the RHS.

[Figure: volume V with surface Σ, in a field ~A, subdivided into Riemann cubes.]

Now, redraw the original figure divided into “Riemann cubes” that approximate the form of the volume V. Writing Eq. (1.3.7) for this case, we get:

∑_(exterior faces) ~A · δ~σ = ∑_(all cubes) ∇ · ~A δV.   (1.3.8)

In the limit as δx → dx, the sum on the LHS of Eq. (1.3.8) becomes a surface integral and the sum on the RHS becomes a volume integral, leaving us with:

∮_Σ ~A · d~σ = ∫_V ∇ · ~A dV,

proving the first form of Gauss’ theorem (Eq. 1.3.4).


To get the second “flavour” of Gauss’ Theorem (Eq. 1.3.5), note that Eq. (1.3.4) is valid for all vectors including ~A = φn, where φ is a scalar function and n is an arbitrary, constant unit vector. In this case, Eq. (1.3.4) becomes:

∮_Σ φn · d~σ = ∫_V ∇ · (φn) dV

⇒ n · ∮_Σ φ d~σ = ∫_V (φ∇ · n + n · ∇φ) dV = n · ∫_V ∇φ dV,

since ∇ · n = 0 for constant n,

⇒ n · (∮_Σ φ d~σ − ∫_V ∇φ dV) = 0.

The only vector whose dot product with any unit vector is zero is the null

vector, and thus we get:

∮_Σ φ d~σ = ∫_V ∇φ dV,

proving the second version of the theorem (Eq. 1.3.5).

Finally, I leave proving the third flavour, Eq. (1.3.6), as an exercise. [Hint:

substitute a vector of the form ~A = n× ~B into Eq. (1.3.4).]

Example 1.19. (Problem AWH3.8.2) Show that (1/3)∮_Σ ~r · d~σ = V, where V is a volume with surface Σ.

From Eq. (1.3.4),

∮_Σ ~r · d~σ = ∫_V ∇ · ~r dV = ∫_V 3 dV = 3V ⇒ (1/3)∮_Σ ~r · d~σ = V.

Example 1.20. (Problem AWH3.8.4) For the electric field, ~E = −∇ϕ (ϕ = electrostatic potential) and ∇ · ~E = ρ/ǫ0 (ρ = charge density). Show that:

∫ ρϕ dV = ǫ0 ∫ E² dV,


where integration is taken over all space, and where ϕ ∼ r^a, a ≤ −1.

Consider the function ϕ~E. Then, from Eq. (1.3.4),

∫_V ∇ · (ϕ~E) dV = ∮_Σ ϕ~E · d~σ = −∮_Σ ϕ∇ϕ · d~σ,

where integration is over all space ⇒ surface integral at ∞.

Since ϕ ∼ r^−1 (or faster), ∇ϕ ∼ r^−2 and integrand ∼ r^−3. Integration surface ∼ r^2, and thus entire integral ∼ r^−1. Therefore, as r → ∞, the surface integral vanishes, and we have:

∫_V ∇ · (ϕ~E) dV = ∫_V ϕ∇ · ~E dV + ∫_V ~E · ∇ϕ dV = ∫_V ϕ(ρ/ǫ0) dV − ∫_V E² dV = 0

⇒ ∫ ϕρ dV = ǫ0 ∫ E² dV,

as desired.

Exercise: (Problem AWH3.8.3) If ~B = ∇× ~A, show that ∮_Σ ~B · d~σ = 0. (Like Ex. 1.19, this is a one-liner.)

Theorem 1.3. Stokes’ Theorem: Let S be an open surface with perimeter C. Then,

∮_C ~A · d~l = ∫_S ∇× ~A · d~σ   (1.3.9)

∮_C φ d~l = −∫_S ∇φ × d~σ   (1.3.10)

∮_C ~A × d~l = −∫_S (d~σ ×∇)× ~A   (1.3.11)

[Figure: open surface S bounded by contour C, with surface element d~σ and line element d~l, in a field ~A.]

Proof: Consider an arbitrarily small and flat square centred at (x, y, z) of

surface area δ~σ = δx δy k immersed in a vector field ~A.


[Figure: square of area δ~σ = δx δy k centred at (x, y, z), with edge values Ax(x, y ± ½δy, z) and Ay(x ± ½δx, y, z), traversed counter-clockwise via edges δ~l1 through δ~l4.]

Traversing the square in the counter-clockwise direction, its edges and their locations are given by:

δ~l1 = δx ı,   (x, y − ½δy, z);
δ~l2 = δy ȷ,   (x + ½δx, y, z);
δ~l3 = −δx ı,  (x, y + ½δy, z);
δ~l4 = −δy ȷ,  (x − ½δx, y, z).

⇒ ∑_(n=1..4) ~A · δ~ln = Ax(x, y − ½δy, z)δx + Ay(x + ½δx, y, z)δy − Ax(x, y + ½δy, z)δx − Ay(x − ½δx, y, z)δy

= [ (Ay(x + ½δx, y, z) − Ay(x − ½δx, y, z))/δx − (Ax(x, y + ½δy, z) − Ax(x, y − ½δy, z))/δy ] δx δy

⇒ ∑_(n=1..4) ~A · δ~ln = (∂xAy − ∂yAx) dσ = (∇× ~A)z dσ = ∇× ~A · d~σ,   (1.3.12)

in the limit when δx → dx, etc.

[Figure: two adjacent squares, σR (right) at (x, y, z) and σL (left) at (x − δx, y, z), sharing the edge at (x − ½δx, y, z); their shared-edge elements δ~l2,L and δ~l4,R point in opposite directions.]

Now, introduce a second square, σL (left) at (x − δx, y, z), that shares the (x − ½δx, y, z) edge with the first square, σR (right) at (x, y, z).

Adding the contributions of the two squares, the RHS of Eq. (1.3.12) is the

sum of the two values of (∇× ~A) · δ~σ.

Likewise, the LHS of Eq. (1.3.12) is the sum of all eight edges, although the contributions from the interior edges cancel. Edge 2 of σL (red), δ~l2,L ∝ +ȷ, while edge 4 of σR (green), δ~l4,R ∝ −ȷ. Thus,

~A · δ~l2,L = − ~A · δ~l4,R,

and the LHS of Eq. (1.3.12) is just the sum from all exterior edges.

This is true no matter how many squares we bring together, and whether the

squares are all actually coplanar squares, or non-coplanar “patches”; only the

exterior edges (the circumference of the aggregate) of the patches contribute

to the LHS of Eq. (1.3.12) while all patches contribute fully to the RHS.

[Figure: surface S with perimeter C, in a field ~A, subdivided into Riemann patches with elements d~σ and d~l.]

Now, redraw the original figure divided into “Riemann patches” whose aggregate approximates the form of the surface S. Then, Eq. (1.3.12) implies,

∑_(exterior edges) ~A · δ~l = ∑_(all patches) ∇× ~A · d~σ.   (1.3.13)

In the limit as δx → dx, the sum on the LHS of (1.3.13) becomes a contour

integral and the sum on the RHS becomes an open surface integral:

∮_C ~A · d~l = ∫_S ∇× ~A · d~σ,

proving the first form of Stokes’ theorem (Eq. 1.3.9).

Exercise: Prove the second and third versions of Stokes’ Theorem, Eq. (1.3.10)

and (1.3.11) from the first, Eq. (1.3.9) (similar to proofs of Eq. 1.3.5, 1.3.6).
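Stokes’ theorem, too, is easy to check numerically. A midpoint-rule sketch (my own illustration, assuming numpy; the field ~A = (−y³, x³, 0) is a hypothetical choice, not from the notes) on the unit square:

```python
# Numerical check of Stokes' theorem on the unit square, traversed
# counter-clockwise, for ~A = (−y³, x³, 0) with (∇×~A)z = 3x² + 3y².
import numpy as np

n = 2000
t = (np.arange(n) + 0.5) / n          # midpoints on [0, 1]
h = 1.0 / n

# 0*x / 0*y terms force broadcasting to an array along each edge
Ax = lambda x, y: -(y**3) + 0*x
Ay = lambda x, y: (x**3) + 0*y

line = (np.sum(Ax(t, 0.0)) * h        # bottom: d~l = +dx ı
        + np.sum(Ay(1.0, t)) * h      # right:  d~l = +dy ȷ
        - np.sum(Ax(t, 1.0)) * h      # top:    d~l = −dx ı
        - np.sum(Ay(0.0, t)) * h)     # left:   d~l = −dy ȷ

X, Y = np.meshgrid(t, t)
surf = np.sum(3*X**2 + 3*Y**2) * h**2

print(line, surf)   # both ≈ 2.0
```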

Example 1.21. (Problem AWH3.8.8) Evaluate ∮_C ~r × d~r, where C is a closed loop in the x-y plane bounding a surface area S.

From the third flavour of Stokes’ theorem, Eq. (1.3.11),

∮_C ~r × d~r = −∫_S (d~σ ×∇)× ~r.   (1.3.14)


Now, S is in the x-y plane ⇒ d~σ = dσ k and ~r = ı x + ȷ y. Thus,

d~σ ×∇ = dσ k × (ı ∂x + ȷ ∂y + k ∂z) = dσ(−ı ∂y + ȷ ∂x);

(d~σ ×∇)× ~r = dσ(−ı ∂y + ȷ ∂x)× (ı x + ȷ y) = −2 dσ k = −2 d~σ.

Thus, Eq. (1.3.14) becomes:

∮_C ~r × d~r = 2 ∫_S dσ k = 2S k.
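The result of Example 1.21 can be verified numerically for the unit circle, where S = π. A short sketch (my own illustration, assuming numpy):

```python
# Numerical check of ∮ ~r × d~r = 2S k for the unit circle:
# ~r(θ) = (cos θ, sin θ, 0), so S = π and the loop integral should
# come out to (0, 0, 2π).
import numpy as np

n = 1000
theta = 2*np.pi*(np.arange(n) + 0.5)/n
dtheta = 2*np.pi/n

r = np.stack([np.cos(theta), np.sin(theta), np.zeros(n)], axis=1)
dr = np.stack([-np.sin(theta), np.cos(theta), np.zeros(n)], axis=1) * dtheta

total = np.cross(r, dr).sum(axis=0)
print(total)   # ≈ [0, 0, 2π], i.e. 2S k with S = π
```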

Note that. . .

- besides dV, a volume element can be denoted d³~r (≠ d~r!), dτ, etc.
- besides C, perimeter of an open surface S can be denoted ∂S, δS, etc.
- besides Σ, surface of a volume V can be denoted ∂V, δV, etc.

What do Gauss’ and Stokes’ theorems actually do?

- In univariate calculus, ∫ (d/dx)f(x) dx = f(x) ⇒ d/dx and ∫ dx “annihilate”.
- Gauss’ and Stokes’ theorems are the multi-variate analogues to this.
- In Gauss’ theorem, a volume (triple) integral of a ∇ (differential operator) → a surface (double) integral ⇒ ∇ “annihilates” one integral.
- In Stokes’ theorem, a surface (double) integral of a ∇ (differential operator) → a line (single) integral ⇒ ∇ “annihilates” one integral.

Example 1.22. The integral form of Maxwell’s equations are:

∮_Σ ~D · d~σ = ∫_V ρ dV;   (Gauss’ law for ~D)   (1.3.15)

∮_Σ ~B · d~σ = 0;   (Gauss’ law for ~B)   (1.3.16)


∮_C ~E · d~l = −(d/dt) ∫_S ~B · d~σ;   (Faraday’s law)   (1.3.17)

∮_C ~H · d~l = (d/dt) ∫_S ~D · d~σ + ∫_S ~J · d~σ,   (Ampere-Maxwell law)   (1.3.18)

where ~E and ~H are the electric and magnetic fields, ~D = ǫ ~E is the electric

displacement, ~B = µ ~H is the magnetic induction, ρ and ~J are the charge

and current densities, V is a volume with surface area Σ, and S is an open

surface with perimeter C. Derive the vector form of Maxwell’s equations.

From the first flavour of Gauss’ theorem (Eq. 1.3.4), Eq. (1.3.15) can be written:

∮_Σ ~D · d~σ = ∫_V ∇ · ~D dV = ∫_V ρ dV ⇒ ∫_V (∇ · ~D − ρ) dV = 0.

This is true for any volume V and, in particular, for an infinitesimal volume δV over which the integrand is constant. Thus,

(∇ · ~D − ρ) ∫_δV dV = 0, with ∫_δV dV = δV ≠ 0 ⇒ ∇ · ~D = ρ.

Similarly, Eq. (1.3.16) can be written: ∇ · ~B = 0.

Next, from the first flavour of Stokes’ theorem (Eq. 1.3.9), write Eq. (1.3.18)

as:

∮_C ~H · d~l = ∫_S ∇× ~H · d~σ = (d/dt) ∫_S ~D · d~σ + ∫_S ~J · d~σ.

Since t and ~r are independent variables, the order of the time derivative and

area integral on the right hand side may be switched. In so doing, the time

derivative becomes a partial derivative to reflect the fact that the spatial

dependence is held constant while the time derivative is taken. Thus, we get:

∫_S ∇× ~H · d~σ = ∫_S (∂~D/∂t + ~J) · d~σ ⇒ ∫_S (∇× ~H − ∂~D/∂t − ~J) · d~σ = 0.


Again, this is true for any open surface S, including an infinitesimal surface δ~S, over which the integrand is constant. Thus,

(∇× ~H − ∂~D/∂t − ~J) · ∫_δS d~σ = 0, with ∫_δS d~σ = δ~S ≠ 0.

The only vector whose dot product with a non-zero vector, δ~S, is zero regardless of the angle between them is the null vector, and we must have:

∇× ~H − ∂~D/∂t − ~J = 0 ⇒ ∇× ~H = ∂~D/∂t + ~J.

Similarly, Eq. (1.3.17) can be written: ∇× ~E = −∂~B/∂t.

The four boxed equations are Maxwell’s equations in differential form.
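As a consistency check (my own illustration, assuming sympy; the plane wave is a hypothetical example, not from the notes), a wave ~E = E0 cos(kz − ωt) ı with ~B = (E0 k/ω) cos(kz − ωt) ȷ satisfies the Faraday equation identically:

```python
# Hypothetical plane-wave check of ∇×~E = −∂~B/∂t.
from sympy import symbols, cos
from sympy.vector import CoordSys3D, curl, Vector

N = CoordSys3D('N')
t, k, w, E0 = symbols('t k omega E_0', positive=True)

E = E0*cos(k*N.z - w*t)*N.i          # ~E along ı, propagating in +z
B = (E0*k/w)*cos(k*N.z - w*t)*N.j    # ~B along ȷ, in phase with ~E

lhs = curl(E)       # ∇×~E  (curl only differentiates the base scalars)
rhs = -B.diff(t)    # −∂~B/∂t

print(lhs - rhs)    # zero vector: Faraday's law holds identically
```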

The last theorem of vector calculus, Green’s theorem, is a special case of

Gauss’ theorem, whose proof I leave as an exercise (e.g., let ~A = u∇v, then

apply Gauss + vector identities):

Theorem 1.4. Green’s Theorem: Let u(~r) and v(~r) be two scalar functions of the coordinates. Then:

∮_Σ u∇v · d~σ = ∫_V (u∇²v + ∇u · ∇v) dV;

∮_Σ (u∇v − v∇u) · d~σ = ∫_V (u∇²v − v∇²u) dV.
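Following the hint above, the first identity is one line (my own filling-in): apply Eq. (1.3.4) to ~A = u∇v together with the product rule ∇ · (f ~A) = f∇ · ~A + ~A · ∇f:

```latex
\nabla\cdot(u\nabla v) = u\,\nabla^2 v + \nabla u\cdot\nabla v
\quad\Rightarrow\quad
\oint_\Sigma u\nabla v\cdot d\vec{\sigma}
  = \int_V \nabla\cdot(u\nabla v)\,dV
  = \int_V \left(u\,\nabla^2 v + \nabla u\cdot\nabla v\right) dV .
```

The second identity follows by writing the same relation with u and v interchanged and subtracting, which cancels the symmetric ∇u · ∇v term.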