Chapter I. Vectors and Tensors
Yuxi Zheng
Abstract.
We assume the students know these materials:
Basic vector algebra: addition, subtraction, and multiplication by a scalar, linear
dependence, linear independence, basis, expansion of a vector with respect to other
vectors, inner product.
We will cover the topics:
Advanced vector algebra: Projection of a vector onto an axis; vector product, product
of three vectors;
Brief Introduction
As human beings learned to know the natural world around them, they invented
words for description, and later introduced units of measurement to quantify their
description. They tried various ways, including imagination, observation, and setting
up laboratories, to find out the mechanisms of motion. Mathematics introduces mathematical
symbols (numbers, variables (length, area, volume), coordinates, functions, vectors,
tensors, rates of change, equations, inequalities, etc.) to model natural phenomena.
With enough symbols accumulated, a branch of math, called pure math, is devoted to
the study of these symbols. The study of these symbols (rules of operations) with an
aim toward applications to the natural world is called Applied Mathematics. A clear
distinction between pure and applied mathematics is hard to draw. However, the
application of mathematics is easily seen as the use of developed mathematical tools
in the sciences, engineering, and other fields.
Our goal will be to learn the basic tools of mathematics that have had, and will
continue to have, applications. These tools will be introduced most often with some
background on their origin. Applications often follow. Our emphasis is on the math:
principles and essential calculations.
1.1. Vectors
Review. Vectors are A = (1, 1, 1), B = (0, -1, 2).
We see that
[φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)] / α
= { [φ(x1 + αl1, x2 + αl2, x3 + αl3) − φ(x1, x2 + αl2, x3 + αl3)] / (αl1) } l1    (3)
converges to
[ ∂φ(x1, x2 + αl2, x3 + αl3)/∂x1 ] l1 |α=0 = [ ∂φ(x1, x2, x3)/∂x1 ] l1    (4)
as α → 0. Similarly, we can find the limits for the other two terms of the difference in
(2). In summary, we find that
dφ/dl = (∂φ/∂x1) l1 + (∂φ/∂x2) l2 + (∂φ/∂x3) l3,
which is (1).
Now we can derive the two properties of the gradient from (1). We see that the term
∇φ · l achieves its maximum, among all possible directions, when the angle between
∇φ and l is zero, i.e., when l = ∇φ/|∇φ|. This is why the gradient gives the direction
of most rapid increase. Furthermore, the least change is zero change, achieved in
directions that are perpendicular to ∇φ, i.e., when the angle between ∇φ and l is 90
degrees. We know intuitively that there is no change along a level surface. Thus the
gradient is a normal to the level surfaces.
1.5.3. Coordinate-independent representation of the gradient.
The representation is
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dSy,
where V is a domain that contains the point x and n is the unit exterior normal to
∂V. We also use V to denote the volume of the domain V, a slight abuse of notation.
Mathematical proof of the coordinate-independent representation.
Consider
A = Cφ,
where C is a constant vector. Let us apply Gauss' Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Notice that div A = C · ∇φ. We find
C · ∫∫∫_V ∇φ dV = C · ∫∫_{∂V} n φ dS.
Since C is arbitrary, we conclude that
∫∫∫_V ∇φ dV = ∫∫_{∂V} n φ dS.
Dividing the equation by the volume V and taking the limit V → 0, we have
∇φ(x) = lim_{V→0} (1/V) ∫∫_{∂V} n φ(y) dSy.
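This representation can be checked numerically. The sketch below is my own illustration, not part of the notes; it assumes NumPy is available, uses a made-up test function φ, and approximates the surface integral over a small cube centered at x before comparing with the analytic gradient.

```python
import numpy as np

def grad_from_surface_integral(phi, x, h=1e-3, n=40):
    """Approximate (1/V) * integral of n*phi(y) dS over the boundary of a
    small cube of side 2h centered at x (the representation of Sec. 1.5.3)."""
    x = np.asarray(x, dtype=float)
    V = (2 * h) ** 3
    grad = np.zeros(3)
    s = (np.arange(n) + 0.5) / n * 2 * h - h        # face-grid midpoints in (-h, h)
    U, W = np.meshgrid(s, s, indexing="ij")
    dS = (2 * h / n) ** 2                           # area of one grid cell
    for axis in range(3):
        for sign in (+1.0, -1.0):
            pts = np.zeros((n, n, 3))
            pts[..., axis] = sign * h               # the face x_axis = +/- h
            pts[..., (axis + 1) % 3] = U
            pts[..., (axis + 2) % 3] = W
            pts += x                                # shift the cube to be centered at x
            normal = np.zeros(3); normal[axis] = sign   # unit exterior normal of this face
            grad += normal * np.sum(phi(pts[..., 0], pts[..., 1], pts[..., 2])) * dS
    return grad / V

phi = lambda x1, x2, x3: x1**2 + x2 * x3            # a test scalar field (my choice)
x0 = np.array([1.0, 2.0, -1.0])
print(grad_from_surface_integral(phi, x0))          # approximately [2.0, -1.0, 2.0]
print("exact:", [2 * x0[0], x0[2], x0[1]])          # grad = (2*x1, x3, x2)
```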
1.6. Vector fields
1.6.1. Flux of a vector field.
Let S be a two-sided piecewise-smooth surface in a vector field A(r). Let n be
a unit normal to S. The flux of A through an element dS is
A · n dS.
The flux through S is
∫∫_S A · n dS.
See Figure 1.4.2 for an earlier definition of flux.
To figure out the real meaning of a flux, let us consider an example. Imagine a
highway, an observational gate, and you watching the passing cars. If no car is
moving (a complete stall), you see no car passing through your gate, and we say that
the flux is zero. Now suppose that all cars are moving at the same speed of 60 miles
per hour. Then in one minute the first car you saw at the beginning of the minute has
traveled one mile. If the density of cars on the highway is 1 car per mile, then you
have seen 1 car pass through the gate in one minute. If the density is 2 cars per mile,
then you have seen 2 cars pass. So both velocity and density play roles.
Now suppose that there are multiple lanes on the highway and the density is the number
of cars per mile per lane; then the number of cars that pass depends on the number of
gates you watch. That is, the cross length of the observational line plays a role. See
Figure 1.6.1.
(Figure 1.6.1.)
Figure 1.6.1. Flux associated with cars.
However, if the observational line is not perpendicular to the direction of motion,
then the actual length of the observational line is not what matters; what matters is
the projection of the observational line onto the line perpendicular to the lines of
motion. This projection is the same as the projection of the velocity vector field onto
the normal n of the surface S (observational line). So in three dimensions, let v be
the velocity of the fluid particles, let S be a surface, and let ρ be the density of the
fluid particles; then the quantity
∫∫_S ρ v · n dS
is the total mass of particles that pass through S in unit time. Without the
density factor, it is called the flux.
1.6.2. The divergence of a vector field.
Recall
div A = ∂A1/∂x1 + ∂A2/∂x2 + ∂A3/∂x3 = (new notation) ∇ · A.
Recall Gauss' Theorem:
∫∫∫_V div A dV = ∫∫_{∂V} A · n dS.
Dividing the equation by the volume V and taking the limit V → 0, so that V shrinks
to a point x, we find
div A(x) = lim_{V→0} (1/V) ∫∫_{∂V} A · n dS.
So we have found a coordinate-independent representation of the divergence. Using
the definition of flux, we see that the integral over ∂V is the total flux through the
surface ∂V. This total flux divided by the volume V is the flux per unit volume. In the
limit V → 0, the limiting value measures the flux production per unit volume at the
point x. This is the real meaning of the divergence. If it is positive, the point is called
a source. If it is negative, it is called a sink. If the divergence is zero in a domain,
then there is no source or sink there, and the field is called divergence free.
Example 1.6.2a. Let
A(r) = q r / r³,
where r denotes the norm of r. It is an exercise that
div A = 0
at every point except r = 0. We are interested in finding the flux through the
unit sphere S centered at the origin. We know that the unit exterior normal to the
unit sphere is given by
n = r / r.
So we have
∫∫_S A · n dS = ∫∫_{r=1} q (r/r³) · (r/r) dS = q ∫∫_{r=1} dS = 4πq.    (5)
If q > 0, it is a source (fountain). If q < 0, it is a sink. By Gauss’ Theorem, we can
see that the flux through any surface is 4πq if the surface encloses the origin. The flux
is zero if the surface does not enclose the origin. We also see that the flux is the same
no matter how small the surface is, as long as it contains the origin. This vector field
is smooth everywhere away from the origin, and the origin is a point source/sink.
See Figure 1.6.2.
(Figure 1.6.2.)
Figure 1.6.2. Flux and source/sink.
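The claim that the flux is 4πq through any closed surface enclosing the origin can be checked numerically. The following sketch is mine, not from the notes; it assumes NumPy and uses arbitrary values for q and for an off-center sphere that still encloses the origin.

```python
import numpy as np

# Numerical check: the flux of A(r) = q*r/|r|^3 through a sphere enclosing the
# origin equals 4*pi*q, regardless of the sphere's center and radius.
q, R = 2.5, 2.0
c = np.array([0.5, 0.3, -0.2])                 # sphere center; the origin lies inside
n = 400
phi = (np.arange(n) + 0.5) * np.pi / n         # polar angle in (0, pi)
theta = (np.arange(2 * n) + 0.5) * np.pi / n   # azimuth in (0, 2*pi)
PHI, THETA = np.meshgrid(phi, theta, indexing="ij")

nx = np.sin(PHI) * np.cos(THETA)
ny = np.sin(PHI) * np.sin(THETA)
nz = np.cos(PHI)
r = np.stack([c[0] + R * nx, c[1] + R * ny, c[2] + R * nz], axis=-1)  # surface points
normal = np.stack([nx, ny, nz], axis=-1)                              # exterior unit normal
dS = R**2 * np.sin(PHI) * (np.pi / n) ** 2                            # area element

A = q * r / np.linalg.norm(r, axis=-1, keepdims=True) ** 3
flux = np.sum(np.sum(A * normal, axis=-1) * dS)
print(flux, 4 * np.pi * q)                     # both approximately 31.4159
```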
1.6.3. The curl of a vector field.
Recall we have introduced the curl of a vector field in association with Stokes' Theorem:
curl A = ( ∂A3/∂x2 − ∂A2/∂x3, ∂A1/∂x3 − ∂A3/∂x1, ∂A2/∂x1 − ∂A1/∂x2 )
= det | i1      i2      i3     |
      | ∂/∂x1   ∂/∂x2   ∂/∂x3  |
      | A1      A2      A3     |.
We state without proof that the curl has a coordinate-independent representation:
curl A = lim_{V→0} (1/V) ∫∫_{∂V} n × A dS  (= ∇ × A).
We see the real meaning of the curl in the next example.
Example 1.6.3a. Consider a rigid body rotating about a fixed point O with
angular velocity ω. See Figure 1.6.3. The velocity of a point with position vector r
is given by
v = ω × r.
Figure 1.6.3. Curl is twice angular velocity.
Let us calculate the curl of v. We have
curl1 v = ∂x2 v3 − ∂x3 v2 = ∂x2(ω1x2 − ω2x1) − ∂x3(ω3x1 − ω1x3) = 2ω1    (6)
(ω is independent of r), and similarly
curl2 v = 2ω2,  curl3 v = 2ω3.    (7)
It follows that
curl v = 2ω.
That is, the curl of the velocity field of a rotating body equals twice the angular
velocity of the body.
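The computation curl v = 2ω can also be confirmed symbolically. A minimal sketch (mine, not from the notes), assuming SymPy is available:

```python
import sympy as sp

# Symbolic check that curl(omega x r) = 2*omega for constant omega.
x1, x2, x3, w1, w2, w3 = sp.symbols('x1 x2 x3 w1 w2 w3')
r = sp.Matrix([x1, x2, x3])
w = sp.Matrix([w1, w2, w3])           # constant angular velocity
v = w.cross(r)                        # velocity field of the rotating body

def curl(F, vars):
    return sp.Matrix([
        sp.diff(F[2], vars[1]) - sp.diff(F[1], vars[2]),
        sp.diff(F[0], vars[2]) - sp.diff(F[2], vars[0]),
        sp.diff(F[1], vars[0]) - sp.diff(F[0], vars[1]),
    ])

print(sp.simplify(curl(v, (x1, x2, x3))))   # Matrix([[2*w1], [2*w2], [2*w3]])
```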
End of Lecture 5.
1.7. Coordinate transformations.
We deal with coordinate transformations between rectangular coordinate sys-
tems, which play an important role in the definition of tensors.
Preliminary remark on tensors. Tensors are physical quantities that exist independently
of coordinate systems. Scalar quantities are called zeroth-order tensors
(e.g., temperature); vectors are called first-order tensors (e.g., velocity); second-order
tensors can all be represented by 3 × 3 matrices (e.g., the stress tensor), but not all
3 × 3 matrices are tensors. Tensors of order greater than 2 cannot be represented
by 3 × 3 matrices. An n-th-order tensor requires 3^n real numbers and is invariant
under changes of coordinate system. The requirement of invariance is natural,
since physical observables are invariant under changes of coordinate system.
Suppose we have two orthonormal bases,
i1, i2, i3 and i′1, i′2, i′3,
and two origins O and O′, forming two rectangular coordinate systems K and K′. Let
a point M in space have the representations
r = x1 i1 + x2 i2 + x3 i3,  r′ = x′1 i′1 + x′2 i′2 + x′3 i′3.    (1)
Note the equations
r = r′ + r′0,  where r′0 is the vector from O to O′,
r′ = r + r0,   where r0 is the vector from O′ to O,    (2)
and r0 = −r′0. See Figure 1.7.1.
(Figure 1.7.1.)
Figure 1.7.1. Coordinate transformation.
Now we use the summation convention: a repeated index is automatically summed:
xk ik = ∑_{k=1}^{3} xk ik.
Thus, the equations in (2) can be written
xk ik = x′k i′k + x′0k ik,
x′k i′k = xk ik + x0k i′k,    (3)
where x′0k are the coordinates of r′0 with respect to the old system K, etc. Take the
inner product with il or i′l in equations (3) and note the Kronecker delta
ik · il = δkl = { 0, k ≠ l;  1, k = l }.    (4)
We find
xl = x′k (i′k · il) + x′0l,
x′l = xk (ik · i′l) + x0l.    (5)
We introduce the new notation
i′k · il = 1 · 1 · cos(i′k, il) = αk′l.    (6)
Thus
ik · i′l = i′l · ik = αl′k.    (7)
Therefore
xl = αk′l x′k + x′0l,
x′l = αl′k xk + x0l.    (8)
The first equation of (8) is the transformation from K′ to K, while the second
equation of (8) is the inverse transformation, from K to K′. Note that the index summed
in the second equation is the second index of α, while the index summed in the first
equation of (8) is the first index.
Properties of αl′k.
We note that the Kronecker delta in the system K′ is
δ′kl = i′k · i′l = { 0, k ≠ l;  1, k = l }.    (9)
Note the expansions
i′k = αk′l il, and ik = αl′k i′l;
see Figure 1.7.2.
(Figure 1.7.2.)
Figure 1.7.2. Calculation of the coefficients. (a), (b).
The (αl′k) are often given as a 3 × 3 matrix
(αl′k) = | α1′1 α1′2 α1′3 |
         | α2′1 α2′2 α2′3 |
         | α3′1 α3′2 α3′3 |.
Note also
δkm = ik · im = αl′k i′l · αj′m i′j = αl′k αl′m,
δ′km = i′k · i′m = αk′l il · αm′j ij = αk′l αm′l.
Thus we have the properties
αl′k αl′m = δkm,
αk′l αm′l = δ′km.
These properties simply say that the columns of the matrix (αl′k) are orthonormal,
and so are the rows of the matrix. Thus the matrix (αl′k) is an orthogonal matrix.
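A small numerical illustration of these properties (my own, not from the notes; it assumes NumPy and uses an arbitrary rotation angle):

```python
import numpy as np

# Build the matrix of direction cosines alpha[i, k] = i'_i . i_k for a system K'
# obtained by rotating K about the x3-axis, and check the orthogonality properties.
t = 0.7                                             # rotation angle (arbitrary)
i_old = np.eye(3)                                   # i1, i2, i3 as rows
i_new = np.array([[ np.cos(t), np.sin(t), 0.0],     # i'_1
                  [-np.sin(t), np.cos(t), 0.0],     # i'_2
                  [ 0.0,       0.0,       1.0]])    # i'_3

alpha = i_new @ i_old.T                             # alpha[i, k] = i'_i . i_k
print(np.allclose(alpha @ alpha.T, np.eye(3)))      # rows orthonormal -> True
print(np.allclose(alpha.T @ alpha, np.eye(3)))      # columns orthonormal -> True

# Transform the coordinates of a vector (law x'_i = alpha_{i'k} x_k, same origin)
x = np.array([1.0, 2.0, 3.0])
x_new = alpha @ x
print(np.linalg.norm(x), np.linalg.norm(x_new))     # lengths agree: length is a scalar
```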
1.8. Zeroth-Order Tensors (Scalars).
A scalar is a single function (i.e., one component) which is invariant under
changes of the coordinate systems. We deal with rectangular coordinate systems
only. Thus our tensors are also called cartesian tensors.
Let ϕ be a function of points in a domain in space. Think of ϕ as a physical
or geometrical quantity. This function exists independently of any coordinate system
(e.g., temperature, density, or pressure). Suppose we have two rectangular coordinate
systems K and K′. In K we have the representation ϕ(x1, x2, x3) of the function, while
in K′ we have the representation ϕ′(x′1, x′2, x′3), where xi and x′i are the coordinates
of one and the same point in K and K′. If the function is a scalar, then
ϕ(x1, x2, x3) = ϕ′(x′1, x′2, x′3)
for all points in the domain.
Example 1.8.1a. We show that the distance between two points is a scalar.
Let A and B be two points. Let K and K′ be two rectangular coordinate systems.
In these systems both A and B have coordinates:
A has coordinates xi(A) in K, and x′i(A) in K′;
B has coordinates xi(B) in K, and x′i(B) in K′.
Let
∆xi = xi(B) − xi(A),  ∆x′i = x′i(B) − x′i(A).
Let the transformation from K to K′ be
x′i = αi′k xk + x0i.
Then
∆x′i = x′i(B) − x′i(A) = αi′k xk(B) + x0i − αi′k xk(A) − x0i = αi′k (xk(B) − xk(A)) = αi′k ∆xk.
Thus
∆x′i = αi′k ∆xk.    (1)
Recall the Pythagorean theorem for distance:
(∆s′)² = ∑_{i=1}^{3} (∆x′i)².
Then
(∆s′)² = αi′k ∆xk αi′l ∆xl = αi′k αi′l ∆xk ∆xl = δkl ∆xk ∆xl = ∑_{k=1}^{3} (∆xk)² = (∆s)².
Thus
∆s′ = ∆s.
1.9. First-Order (Cartesian) Tensors (Vectors)
A first-order tensor is given by three components, and satisfies a certain trans-
formation law.
Think of point B as displaced from point A. Then the ∆xi are the components of the
displacement. We have seen that the displacement satisfies the law (1).
Definition. A vector (a.k.a first-order tensor) A is a quantity uniquely specified
in any coordinate system by three real numbers (called the components of the vector)
which transform under changes of the coordinate system according to the law
A′i = αi′kAk (2)
where Ak, A′i are the components of the vector in the old and new coordinate systems
K and K ′ respectively, and αi′k is the cosine of the angle between the i-th axis of
K ′ and the k-th axis of K.
We remark that it is obvious that the zero vector (0, 0, 0) is represented the same
way in any coordinate system. Furthermore, this definition of vector is equivalent
to the definition of a vector as a directed line segment. Lastly, we can use formula
(2) to calculate the components of the representation of a vector in K ′ from the
components of the representation in K.
Example 1.9a. A moving particle P has position coordinates xi(t) in a coordinate
system K. The displacement
xi(t + ∆t) − xi(t)
satisfies the law
x′i(t + ∆t) − x′i(t) = αi′k (xk(t + ∆t) − xk(t))    (3)
by (1). Thus it determines a vector. We divide (3) by ∆t (a scalar) to find
[x′i(t + ∆t) − x′i(t)] / ∆t = αi′k [xk(t + ∆t) − xk(t)] / ∆t.
Taking the limit ∆t → 0 and using the definition of velocity
vi(t) = lim_{∆t→0} [xi(t + ∆t) − xi(t)] / ∆t,
we find that
v′i(t) = αi′k vk(t).
So the velocity is a vector. Similarly the acceleration
ak(t) = dvk(t)/dt
is a vector. Multiplying the acceleration by the scalar mass m, and using Newton's
second law, we find that the force
F = ma
is a vector.
1.10. Second-Order Tensors
Definition. A second-order tensor is a quantity uniquely specified by 9 real
numbers (called the components of the tensor) which transform under changes of
the coordinate system according to the law
A′ik = αi′lαk′mAlm (4)
where Alm, A′ik are the components of the tensor in the old and new coordinate
systems K and K ′ respectively, and αi′k is the cosine of the angle between the i-th
axis of K ′ and the k-th axis of K.
Remarks. 1. We can use the transformation law to determine the coordinates
of A from one system to another.
2. The zero tensor has zero coordinates in any coordinate system.
3. The components of a second-order tensor are often written as a matrix:
A = (Aik) = | A11 A12 A13 |
            | A21 A22 A23 |
            | A31 A32 A33 |.
It can be regarded as a representation of a tensor with respect to a coordinate
system.
4. Tensors of order higher than 2 cannot be represented by matrices.
Example 1.10a. Given two vectors A and B. There are nine products of a
component of A with a component of B:
AiBk (i, k = 1, 2, 3).
Suppose we transform to a new coordinate system K ′, in which A and B have
components A′i and B′k. Then
A′i = αi′lAl, B′k = αk′mBm
and hence
A′iB′k = αi′lαk′mAlBm.
This shows that AiBk is a second-order tensor. It is often denoted as A⊗B.
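A quick numerical check of this transformation law (my own sketch, not from the notes; NumPy assumed, with an arbitrarily chosen rotation and vectors):

```python
import numpy as np

# The nine products A_i B_k transform with two factors of alpha, i.e. as a
# second-order tensor: T'_ik = alpha_il alpha_km T_lm.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
Q, _ = np.linalg.qr(M)
alpha = Q if np.linalg.det(Q) > 0 else -Q     # a proper rotation as the change of basis

A = np.array([1.0, -2.0, 0.5])
B = np.array([0.3, 4.0, -1.0])
T = np.outer(A, B)                            # T_ik = A_i B_k in the old system K

A_new, B_new = alpha @ A, alpha @ B           # vector law A'_i = alpha_{i'l} A_l
T_new = np.outer(A_new, B_new)
print(np.allclose(T_new, alpha @ T @ alpha.T))   # True
```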
More examples are in the text book.
1.10.1. The Stress Tensor.
Consider an elastic medium, such as rubber. If we use a surface to separate the
medium into two parts, we encounter a force acting between them across the surface.
The total force divided by
the total surface area is called the stress (vector). The stress depends on the location
in the medium and the normal direction of the surface. It is possible that we can
factor out the direction part of the stress vector to form a quantity called the stress
tensor so that the stress vector depends bilinearly on the tensor and the direction
of the surface.
Take a rectangular coordinate system K. Take an arbitrary point M in the
elastic medium. Take a tetrahedron with M being one vertex, so that the three
faces passing through M are parallel to the coordinate surfaces, see Figure 1.10.1.
Figure 1.10.1. Tetrahedron at M with stress vectors.
Let n be the exterior normal to the slant surface, with area dσn. Let pn be the
stress (force/unit area) onto the tetrahedron through the slant surface. Let pi be the
stress onto the exterior of the tetrahedron through the surface that is perpendicular
to the i-th axis. Let a be the acceleration of the tetrahedron and f be the body
force per unit mass. By Newton's second law, we have
a dm = f dm + pn dσn − pi dσi.
Letting dm go to zero, and noting that the volume goes to zero faster than the
corresponding surface area, we find
pn dσn = pi dσi.
Note the area formula
dσi = dσn cos(n, ii) = nidσn.
We find that
pn = pini.
Projecting to ik we find
pnk = pikni.
Definition. The stress tensor is (pik). The normal stresses are pii. The tangential
(shearing) stresses are pij (i ≠ j).
Note that n is arbitrary since the tetrahedron is not necessarily regular. We
have
pn = pnkik = pikniik,
which determines the stress (vector) on all surfaces.
Note now that (pik) depends only on M , not n.
Real Meaning of Stress Tensor. Once n is specified, the stress tensor (pik)
and n give the stress
pn = pikniik ( force/unit area ).
Once the area is specified as dσn, the force is
pndσn.
A second-order (stress) tensor takes a vector (unit normal) to a (stress) vector.
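A short numerical sketch of this statement (mine, not from the notes; the stress values, normal direction, and area below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Given the stress tensor (p_ik) at a point, the traction on a surface element
# with unit normal n is p_n,k = p_ik n_i, and the force is p_n * dsigma_n.
p = np.array([[ 2.0, 0.5, 0.0],      # a symmetric stress tensor (force/area units)
              [ 0.5, 1.0, 0.3],
              [ 0.0, 0.3, 4.0]])
n = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)    # unit normal of the surface element

traction = p.T @ n                   # components p_n,k = p_ik n_i
normal_stress = traction @ n         # part of the traction along n
shear = traction - normal_stress * n # tangential (shearing) part

print(traction)                      # stress vector p_n on this surface
print(normal_stress, shear)
dsigma = 1e-4                        # a small area (hypothetical value)
print(traction * dsigma)             # force transmitted through the element
```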
It only remains to verify that pik is indeed a second-order tensor. For mathemat-
ical rigor as well as the whole point of the concept of tensor, we should verify that
the stress satisfies the law of coordinate transformation. However, it is a technical
point, which I choose to skip in class.
Additional readings. When we use the fact that the tetrahedron is rotation free
(see the reference book by Young), we can deduce that
pik = pki.
Thus the matrix (pik) is symmetric.
In hydrodynamics, it is customary to write
pik = −p δik + τik,
where the scalar p is called the hydrodynamic pressure and τik is the viscous stress
tensor. A Newtonian fluid is one for which the linear relation
τik = ηiklm vlm
holds, where
vlm = (1/2) ( ∂vl/∂xm + ∂vm/∂xl )
is the rate of deformation tensor. For an isotropic fluid, there holds
τik = 2µ vik + µ′ δik vll,
where µ and µ′ are called viscosity coefficients.
Verification of the tensor character of the stress. Since the definition of (pik)
involves no restriction on the normal n, we can take n to be the i-th basis vector of
the new coordinate system K′, so that
n = i′i
(K and K′ have orthonormal bases i1, i2, i3 and i′1, i′2, i′3, respectively). Then projecting
n onto the l-th axis of K gives
nl = n · il = i′i · il = αi′l,
where αi′l is the cosine of the angle between the i-th axis of K′ and the l-th axis of
K, and hence
pn ≡ p′i = pl nl = αi′l pl = αi′l plm im.
Finally, projecting p′i onto the k-th axis of K′, we obtain
p′i · i′k = αi′l (im · i′k) plm,
or
p′ik = αi′l αk′m plm.
By definition, we find that (pik) transforms like a second-order tensor.
1.10.2. The moment of inertia tensor. Consider a rigid-body system of n
particles with coordinates (x(j)1, x(j)2, x(j)3), j = 1, 2, ..., n, and masses mj in a
coordinate system K with origin O. The quantities
Iik = ∑_{j=1}^{n} mj ( δik x(j)l x(j)l − x(j)i x(j)k )
are called the moment of inertia tensor (about the origin O). It is a second-order
tensor. It is used in physics in
Iik ωk = Li,
where
L = ∑_{j=1}^{n} mj ( r(j) × v(j) )
is the angular momentum and ω is the angular velocity:
v(j) = ω × r(j).
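A numerical check of the relation Iik ωk = Li (my own sketch, not from the notes; masses, positions, and ω are arbitrary illustrative values, NumPy assumed):

```python
import numpy as np

# Build I_ik for a small system of particles and check that I @ omega equals the
# angular momentum L = sum_j m_j (r_j x v_j) with v_j = omega x r_j.
m = np.array([1.0, 2.0, 0.5])                      # masses
r = np.array([[1.0,  0.0, 0.0],
              [0.0,  1.0, 1.0],
              [2.0, -1.0, 0.5]])                   # positions r^(j)
omega = np.array([0.3, -0.2, 1.1])                 # angular velocity

I = sum(mj * (np.dot(rj, rj) * np.eye(3) - np.outer(rj, rj)) for mj, rj in zip(m, r))
v = np.cross(omega, r)                             # v^(j) = omega x r^(j), row-wise
L = sum(mj * np.cross(rj, vj) for mj, rj, vj in zip(m, r, v))

print(np.allclose(I @ omega, L))                   # True: L_i = I_ik omega_k
```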
1.10.3. The Deformation Tensor.
Let u(r) be the displacement of a point with position vector r. Then the quantities
uik = (1/2) ( ∂ui/∂xk + ∂uk/∂xi )
form a second-order tensor, called the deformation tensor.
1.10.4. The Rate of Deformation Tensor.
Let v(M) be the velocity at a point M of a moving fluid. Then the quantities
vik = (1/2) ( ∂vi/∂xk + ∂vk/∂xi )
form a second-order tensor, called the rate of deformation tensor.
1.11. High-Order Tensors.
By a tensor of order n is meant a quantity uniquely specified by 3^n real numbers
(the components of the tensor) which transform under changes of coordinate systems
according to the law
A′i1i2···in = αi′1k1 αi′2k2 ··· αi′nkn Ak1k2···kn,
where Ak1k2···kn, A′i1i2···in are the components of the tensor in the old and new
coordinate systems K and K′ respectively, and αi′k is the cosine of the angle between
the i-th axis of K′ and the k-th axis of K.
Example 1.11a. If A, B, and C are three vectors, then the 3³ = 27 quantities
Dijk = Ai Bj Ck
form a tensor of order 3. The proof is omitted, but see an exercise.
Example 1.11b. Suppose one second-order tensor Aik is a linear function of
another second-order tensor Bik, such that
Aik = λiklmBlm,
then λiklm form a fourth-order tensor. Proof is omitted.
1.12. Tensor Algebra.
1.12.1. Addition. We can add any two tensors of the same order, the sum is
a tensor of the same order, whose components are the sums of the corresponding
components of the two tensors. For example, tensor Aik and tensor Bik can be added
to give a tensor Cik:
Cik = Aik +Bik.
1.12.2. Multiplication. We can multiply any number of tensors of arbitrary
orders. The product of two tensors, for example, is a tensor whose order is the sum of
the orders of the two tensors, and whose components are products of a component of
one tensor with any component of the other tensor. The product of two second-order
tensors Aik with Blm, for example, is a fourth-order tensor Ciklm with components
Ciklm = AikBlm.
Our product of tensors is also called outer product.
1.12.3. Contraction of Tensors.
Summing a tensor of order n (n ≥ 2) over two of its indices is called contraction.
For example, summing over the first and second indices of a third-order tensor
Aiik = A11k +A22k +A33k
gives a vector. This is called contraction in the first and second indices. Contraction
in both indices of a second-order tensor Bij gives a scalar
Bii = B11 +B22 +B33.
Another example: the contraction Aiki gives another vector.
Contraction can be done many times.
Inner product. Multiplying two or more tensors and then contracting the
product with respect to indices belonging to different factors is often called an inner
product of the given tensors. For example, AikBk, AiBi, and λiklmBlm are all inner
products. But AiiBk is not an inner product.
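In numerical work these operations map directly onto index notation. A small sketch (mine, not from the notes; NumPy assumed, arrays filled with random values):

```python
import numpy as np

# Contractions and inner products written with the summation convention map
# directly onto numpy.einsum.
rng = np.random.default_rng(1)
A3 = rng.normal(size=(3, 3, 3))    # a third-order array A_ijk
B = rng.normal(size=(3, 3))        # a second-order array B_ij
A = rng.normal(size=3)             # vectors
C = rng.normal(size=3)

print(np.einsum('iik->k', A3))     # contraction A_iik : a vector
print(np.einsum('ii->', B))        # contraction B_ii  : a scalar (the trace)
print(np.einsum('ik,k->i', B, C))  # inner product B_ik C_k
print(np.einsum('i,i->', A, C))    # inner product A_i C_i (the dot product)
```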
1.13. Symmetry Properties of Tensors.
A tensor Sikl··· ( of order 2 or higher) is said to be symmetric in the first and
second indices (say) if
Sikl··· = Skil···.
It is antisymmetric in the first and second indices (say) if
Sikl··· = −Skil···.
Antisymmetric tensors are also called skewsymmetric or alternating tensors. The
Kronecker δik is a symmetric second-order tensor since
δik = ii · ik = ik · ii = δki.
The stress tensor pik is symmetric. But the tensor
Cik = AiBk −AkBi
is antisymmetric. It can be shown easily that an antisymmetric second-order tensor
has a matrix of the form
(Cik) = |  0     C12    C13 |
        | −C12    0     C23 |
        | −C13  −C23     0  |.
That is, Cik = 0 for i = k for an antisymmetric tensor.
We note that any second-order tensor Tik can be represented as the sum of a
symmetric tensor and an antisymmetric tensor:
Tik = Sik + Aik,
where
Sik = (1/2)(Tik + Tki),
Aik = (1/2)(Tik − Tki).
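A quick check of this decomposition (my own sketch, not from the notes; NumPy assumed, with an arbitrary matrix T):

```python
import numpy as np

# Decompose T into its symmetric part S and antisymmetric part A and verify
# the defining properties.
T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
S = 0.5 * (T + T.T)
A = 0.5 * (T - T.T)
print(np.allclose(S, S.T), np.allclose(A, -A.T), np.allclose(S + A, T))  # True True True
print(np.diag(A))   # the diagonal of an antisymmetric tensor vanishes
```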
1.14. Pseudotensors.
Given a coordinate system K with the basis vectors ii, (i = 1, 2, 3). Let us
consider the quantities:
εjkl = (ij × ik) · il.
We have been assuming that our coordinate system K is always right-handed, i.e.,
the thumb of the right hand points to the direction of i3 if we position our right
hand so that our four fingers can rotate from i1 to i2. In this case, we can calculate
to find that
εjkl = {  1, if j, k, l is a cyclic permutation of 1, 2, 3;
         −1, if j, k, l is a cyclic permutation of 2, 1, 3;
          0, otherwise.
More precisely, we have
ε123 = ε231 = ε312 = 1,
ε213 = ε132 = ε321 = −1,
and all others with repeated indices ε111 = ε112 = · · · are zero. We verify for example
that
ε123 = (i1 × i2) · i3 = i3 · i3 = 1,
and
ε213 = (i2 × i1) · i3 = −i3 · i3 = −1,
and
ε113 = (i1 × i1) · i3 = 0 · i3 = 0.
Under orthogonal coordinate transformations from this K to another right-
handed system K ′, we can show that εjkl transform like a third-order tensor.
But occasionally we need to use left-handed coordinate systems. In this case the
thumb of the left hand points to the direction of i3 if we position our left hand so
that our four fingers can rotate from i1 to i2. In a left handed coordinate system
the vector product of A × B is defined by the left hand rule; i.e., the direction of
A ×B has the direction so that the three vectors A,B, and A ×B follow the left
hand rule. For either handedness, the rule for the direction of the product A × B
is such that the three vectors A, B, and A × B have the same handedness as the
coordinate system. This way all the formulas for the vector product hold for both kinds
of coordinate systems. In particular, the formula
A × B = det | i1 i2 i3 |
            | a1 a2 a3 |
            | b1 b2 b3 |
is valid in both kinds of coordinate systems.
A coordinate system transformation may change the handedness. We have al-
lowed for these transformations in our definition of tensors of all orders.
However, there are tensor-like quantities that transform slightly differently from the
law for tensors. For example, let us calculate the changes in εjkl. Let K′ be a
coordinate system with the basis vectors i′1 = i1, i′2 = i2, i′3 = −i3 and the same
origin. By definition we have
ε′123 = (i′1 × i′2) · i′3.
Note that K′ is now left-handed. So the way to figure out the vector product i′1 × i′2
is to use the left-hand rule, and we find that
i′1 × i′2 = i′3.
Thus
ε′123 = 1.
Now let us calculate the term
α1′l α2′m α3′n εlmn,
which would be equal to ε′123 if εjkl were a third-order tensor. Note that the coordinate
transformation coefficients are
(αi′l) = | 1 0  0 |
         | 0 1  0 |
         | 0 0 −1 |.
Thus
α1′l α2′m α3′n εlmn = α1′1 α2′2 α3′3 ε123 = −1.
We can do all the calculations to verify that in fact
ε′ijk = −αi′l αj′m αk′n εlmn.
So εjkl is not a third-order tensor. This leads to the concept and definition of
pseudotensors.
Definition of pseudotensors. A pseudotensor of order n has 3^n components
Ak1k2···kn that transform under changes of coordinate system according to the law
A′i1i2···in = ∆ αi′1k1 αi′2k2 ··· αi′nkn Ak1k2···kn,
where Ak1k2···kn, A′i1i2···in are the components of the pseudotensor in the old and new
coordinate systems K and K′ respectively, αi′k is the cosine of the angle between the
i-th axis of K′ and the k-th axis of K, ∆ is 1 if K and K′ have the same handedness,
and ∆ is −1 if K and K′ have different handedness.
Note that a change of coordinate system is called a proper transformation if it
preserves the handedness. It is called an improper transformation if it reverses the
handedness. Pseudotensors are also called tensor densities.
We can verify that the permutation tensor εjkl is a third-order pseudotensor. It
is called the unit pseudotensor of order three. Since it appears in many physical and
geometrical situations, it also has the name Levi-Civita tensor density. It sometimes
is denoted as δjkl, a reminder that it is a generalization of the Kronecker δjk.
We note further that the permutation tensor εjkl is antisymmetric in any pair of
indices:
εjkl = −εkjl; εjkl = −εjlk; εjkl = −εlkj.
With two swaps, we have
εjkl = −εkjl = εklj, etc.
Because of this, it is often called the alternating (pseudo-)tensor of third order.
Lastly we note that the vector product A × B, which does not transform as an
ordinary vector, has a pseudotensor representation:
(A × B)i = εijk Aj Bk.
More generally, tensors and pseudotensors can be multiplied to yield pseudotensors,
and higher-order pseudotensors can be contracted to form pseudotensors. In the
current situation, the outer product of εjkl with the ordinary vectors A and B yields
a pseudotensor of order 5; when contracted twice, the result is a pseudotensor of order 1.
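A small numerical illustration of the representation (A × B)i = εijk Aj Bk (mine, not from the notes; NumPy assumed, with arbitrary vectors):

```python
import numpy as np

# Build the Levi-Civita symbol eps_ijk and recover the vector product through
# (A x B)_i = eps_ijk A_j B_k.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0     # cyclic permutations of 1, 2, 3
    eps[i, k, j] = -1.0    # the remaining (odd) permutations

A = np.array([1.0, 2.0, 3.0])
B = np.array([-1.0, 0.5, 2.0])
cross_from_eps = np.einsum('ijk,j,k->i', eps, A, B)
print(np.allclose(cross_from_eps, np.cross(A, B)))   # True
```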
An ordinary first-order tensor is called a polar vector. Polar vectors transform
under both types of changes of coordinate systems without the factor ∆. A first-order
pseudotensor is called an axial vector. It is called axial because it has something to
do with the axis of rotation associated with the product v = ω × r.
1.15. Curvilinear Coordinate Systems.
We need curvilinear coordinate systems in applications. The spherical coordinate
system is an example of curvilinear coordinate systems.
Let u1, u2, u3 denote new coordinates and suppose that they are related to the
cartesian coordinates x1, x2, x3 by the equations
ui = φi(x1, x2, x3) (i = 1, 2, 3). (1)
Assume that φi have continuous first-order derivatives in a domain D of the x-space
and there holds the condition
∂(u1, u2, u3)/∂(x1, x2, x3) =
| ∂φ1/∂x1  ∂φ1/∂x2  ∂φ1/∂x3 |
| ∂φ2/∂x1  ∂φ2/∂x2  ∂φ2/∂x3 |  ≠ 0    (2)
| ∂φ3/∂x1  ∂φ3/∂x2  ∂φ3/∂x3 |
in the domain. This determinant is called the Jacobian of the transformation from
x to u.
The nonvanishing condition (2) ensures that it is possible to determine (x1, x2, x3)
in terms of the coordinates (u1, u2, u3); i.e., there exist functions fi(u1, u2, u3) (i =
1, 2, 3) such that
xi = fi(u1, u2, u3),    (3)
where the fi are defined on the region determined from (1). Moreover, the fi have
continuous first-order derivatives for which
∂(x1, x2, x3)/∂(u1, u2, u3) ≠ 0
in D. The functions (f1, f2, f3) define the inverse transformation of (1). It is
important to note that the Jacobians satisfy the relation
[∂(u1, u2, u3)/∂(x1, x2, x3)] · [∂(x1, x2, x3)/∂(u1, u2, u3)] = 1.
Now let P be any point in D with coordinate (x1, x2, x3) and let the numbers
u1, u2, u3 be determined by (1). We call the ordered triple of numbers (u1, u2, u3) the
curvilinear coordinates of the point P . The equations in (1) are called the coordinate
transformation, and they are said to define a curvilinear coordinate system in D. It
follows that the Jacobian of a coordinate transformation is the reciprocal of the
Jacobian of its inverse.
Example 1.15a. Consider the transformation from the rectangular cartesian
coordinates (x, y) on a plane to the polar coordinates (r, θ) defined by
r = √(x² + y²),  θ = arccos( x/√(x² + y²) )  ( or = arcsin( y/√(x² + y²) ) ),
where arccos is chosen such that a unique θ in 0 ≤ θ < 2π exists so that cos θ =
x/√(x² + y²) and sin θ = y/√(x² + y²). The domain D is all points except the origin.
The Jacobian is
∂(r, θ)/∂(x, y) =
| x/√(x² + y²)    y/√(x² + y²) |
| −y/(x² + y²)    x/(x² + y²)  |
= 1/r.
In the above calculation we find the partial derivative ∂r/∂x as follows: from
r² = x² + y²
we find
2r rx = 2x.
Dividing by 2r we find rx = x/r. We can use cos θ = x/r to find the derivative θx,
etc. Thus the inverse exists except at the origin; the inverse is
x = r cos θ, y = r sin θ.
Note that the inverse is defined for all (r, θ). We can calculate
∂(x, y)/∂(r, θ) =
| cos θ   −r sin θ |
| sin θ    r cos θ |
= r.
It is clear that the product of the two Jacobians is 1.
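The two Jacobians can also be checked symbolically. In the sketch below (mine, not from the notes; SymPy assumed) θ is represented by atan2(y, x) rather than by the arccos/arcsin branches used above, which gives the same derivatives wherever both are defined:

```python
import sympy as sp

# Check that the Jacobians of Example 1.15a are 1/r and r and that their product is 1.
x, y, r, th = sp.symbols('x y r theta', positive=True)

J_u_x = sp.Matrix([[sp.sqrt(x**2 + y**2)], [sp.atan2(y, x)]]).jacobian([x, y])
J_x_u = sp.Matrix([[r * sp.cos(th)], [r * sp.sin(th)]]).jacobian([r, th])

det1 = sp.simplify(J_u_x.det())            # 1/sqrt(x^2 + y^2), i.e. 1/r
det2 = sp.simplify(J_x_u.det())            # r
print(det1, det2)
# product of the two Jacobians, expressed in (r, theta):
print(sp.simplify(det1.subs({x: r * sp.cos(th), y: r * sp.sin(th)}) * det2))  # 1
```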
1.15.1. Coordinate surfaces, coordinate curves, and local basis.
Let P0 = (x1⁰, x2⁰, x3⁰) be a point in D with coordinates (u1⁰, u2⁰, u3⁰) in the
curvilinear coordinate system. We call the surface
φi(x1, x2, x3) = ui⁰
the i-th coordinate surface passing through P0 (i = 1, 2, 3). The intersection of two
coordinate surfaces, say,
φ1(x1, x2, x3) = u1⁰,  φ2(x1, x2, x3) = u2⁰,
is called the u3-coordinate curve. See Figure 1.15.1.
(Figure 1.15.1.)
Figure 1.15.1. Coordinate surfaces, coordinate curves, and local basis.
We next derive a basis at the point P0. The position vector of an arbitrary point
P is
R(u1, u2, u3) = xi ii = fi(u1, u2, u3) ii.
If we set u2 = u2⁰, u3 = u3⁰, then the resulting vector function R(u1, u2⁰, u3⁰) represents
the u1-curve. On this curve u1 is the parameter. It follows that the derivative
∂R/∂u1
represents the tangent vector to this curve. Likewise,
∂R/∂u2,  ∂R/∂u3
represent the tangent vectors to the u2- and u3-curves respectively.
Since the determinant of a matrix is the same as the determinant of its transpose,
it follows from the definition of the Jacobian and that of the scalar triple product that
∂(x1, x2, x3)/∂(u1, u2, u3) = (∂R/∂u1) · ( (∂R/∂u2) × (∂R/∂u3) ).    (4)
(See homework problem 1.) Hence at each point where (4) is not zero, the three
tangent vectors
( ∂R/∂u1, ∂R/∂u2, ∂R/∂u3 )    (5)
are linearly independent and thus form a basis.
Every vector or vector field at each point can then be represented in terms of
this basis (5). Unlike the unit vectors (i1, i2, i3), however, this new basis varies from
point to point in space. For this reason, we call (5) a local basis.
1.15.2. Arclength and orthogonal curvilinear coordinate systems.
We assume that the three vectors
( ∂R/∂u1, ∂R/∂u2, ∂R/∂u3 )
form a right-handed basis; i.e., the vector product ∂R/∂u1 × ∂R/∂u2 has positive inner
product with ∂R/∂u3. In this case, the Jacobian ∂(x1, x2, x3)/∂(u1, u2, u3) is positive.
We derive the arclength formula in curvilinear coordinate systems. Consider the
position vector
R = xi ii = fi(u1, u2, u3) ii.
We have
(ds)² = dR · dR
      = ( ∑_{i=1}^{3} (∂R/∂ui) dui ) · ( ∑_{j=1}^{3} (∂R/∂uj) duj )
      = ( (∂R/∂ui) · (∂R/∂uj) ) dui duj
      = gij dui duj,    (1)
where we have introduced
gij = (∂R/∂ui) · (∂R/∂uj),
which is called the metric tensor.
From here one can pursue the study of general metric tensors, which are used
for example in general relativity. For us, we choose to be more specific. We say
that the curvilinear coordinate system is orthogonal if the three vectors
∂R/∂ui (i = 1, 2, 3) are mutually orthogonal. For orthogonal curvilinear coordinate
systems, the directions and magnitudes of the ∂R/∂ui (i = 1, 2, 3) can still vary. Let us
define
hi = |∂R/∂ui|  (i = 1, 2, 3).
Then we have
gij = { hi², i = j;  0, i ≠ j. }
Example 1.15b. The transformation relating the cylindrical coordinates (r, θ, z)
to the rectangular cartesian coordinates (x, y, z) is defined by the equations
r = √(x² + y²),
θ = arccos( x/√(x² + y²) )  ( or arcsin( y/√(x² + y²) ) ),
z = z.
It is defined for all (x, y, z) except on the z-axis (where x² + y² = 0).
We find
∂(r, θ, z)/∂(x, y, z) =
| ∂r/∂x  ∂r/∂y  0 |
| ∂θ/∂x  ∂θ/∂y  0 |
|   0      0    1 |
= 1/r.
The inverse is
x = r cos θ, y = r sin θ, z = z,
which is valid for all (r, θ, z).
Let us identify the coordinate surfaces and coordinate curves. Refer to Figure
1.15.2.
Figure 1.15.2. Cylindrical coordinate system.
Coordinate surfaces: The coordinate surface r = r0 is the surface of a cylinder
passing through a point P0, and extending to infinity in both the positive and
negative directions of z-axis. The coordinate surface θ = θ0 is a half plane starting
at the z-axis and extending to infinity. The coordinate surface z = z0 is the plane
passing through P0 and perpendicular to the z-axis.
Coordinate curves: The r-coordinate curve is a ray starting on the z-axis, passing
through the point P0, and parallel to the xy-plane. The θ-coordinate curve is a circle
passing through the point P0 and parallel to the xy-plane. The z-coordinate curve
is a straight line parallel to the z-axis.
We have the position vector
R(r, θ, z) = r cos θ i1 + r sin θ i2 + z i3.
Tangent vectors to the coordinate curves are
∂R/∂r = cos θ i1 + sin θ i2,
∂R/∂θ = −r sin θ i1 + r cos θ i2,
∂R/∂z = i3.
Let u1 = r, u2 = θ, u3 = z; then the three tangent vectors form a right-handed
orthogonal curvilinear coordinate system. We find that gij = 0 for i ≠ j, and
h1 = 1, h2 = r, h3 = 1.
Thus the distance formula is
(ds)² = (dr)² + (r dθ)² + (dz)².
Example 1.15c. The spherical coordinates u1 = r, u2 = φ, u3 = θ are defined by
r = √(x² + y² + z²),
φ = arccos( z/√(x² + y² + z²) ),
θ = arccos( x/√(x² + y²) )
(r ≥ 0, 0 ≤ φ ≤ π, 0 ≤ θ < 2π). Refer to Figure 1.15.3 for the variables.
Figure 1.15.3. Spherical coordinate system.
We can calculate the Jacobian
∂(r, φ, θ)/∂(x, y, z) =
| x/r                   y/r                   z/r                      |
| xz/(r²√(x² + y²))     yz/(r²√(x² + y²))     −(x² + y²)/(r²√(x² + y²)) |
| −y/(x² + y²)          x/(x² + y²)           0                        |
= 1/(r² sin φ).
The inverse is
x = r sin φ cos θ,
y = r sin φ sin θ,
z = r cos φ.
The Jacobian of the inverse is
∂(x, y, z)/∂(r, φ, θ) = r² sin φ.
The position vector is
R(r, φ, θ) = r sin φ cos θ i1 + r sin φ sin θ i2 + r cos φ i3.
The three tangent vectors are
∂R/∂r = sin φ cos θ i1 + sin φ sin θ i2 + cos φ i3,
∂R/∂φ = r cos φ cos θ i1 + r cos φ sin θ i2 − r sin φ i3,
∂R/∂θ = −r sin φ sin θ i1 + r sin φ cos θ i2.
They are mutually orthogonal. We have
g11 = h1² = 1,  g22 = h2² = r²,  g33 = h3² = r² sin² φ.
The distance formula is
(ds)² = (dr)² + (r dφ)² + (r sin φ dθ)².
For the volume element, we have
dV = dr · r dφ · r sin φ dθ = r² sin φ dr dφ dθ.
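These scale factors can be recovered symbolically from the position vector. A minimal sketch (mine, not from the notes; SymPy assumed):

```python
import sympy as sp

# Compute the metric tensor g_ij for spherical coordinates from R(r, phi, theta)
# and confirm that the coordinates are orthogonal with h_r = 1, h_phi = r,
# h_theta = r*sin(phi).
r, phi, theta = sp.symbols('r phi theta', positive=True)
R = sp.Matrix([r * sp.sin(phi) * sp.cos(theta),
               r * sp.sin(phi) * sp.sin(theta),
               r * sp.cos(phi)])

tangents = [R.diff(u) for u in (r, phi, theta)]
g = sp.Matrix(3, 3, lambda i, j: sp.simplify(tangents[i].dot(tangents[j])))
print(g)   # diag(1, r**2, r**2*sin(phi)**2): off-diagonal entries vanish
# the scale factors h_i are the square roots of the diagonal entries
# (h_theta = r*|sin(phi)| = r*sin(phi) for 0 < phi < pi)
```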
In the next lecture we calculate the grad, div, and curl in orthogonal curvilinear
coordinate systems.
1.16. Grad, div, and curl in orthogonal curvilinear coordinate systems.
In this section we derive the expressions of various vector concepts in an orthog-
onal curvilinear coordinate system.
Let (u1, u2, u3) be such a system:
ui = φi(x1, x2, x3)  (i = 1, 2, 3).
Let
xi = fi(u1, u2, u3)
be the inverse transformation. We introduce the normalized coordinate tangent
vectors
ui = (1/hi) ∂R/∂ui  (no summation; i = 1, 2, 3),
where hi = |∂R/∂ui|. Assume that (u1, u2, u3) is right-handed so that the Jacobian
is positive.
1.16.1. Gradient of a scalar field.
Let F (x1, x2, x3) be a scalar field in a rectangular system. We know that ∇F is
a vector, which can be represented as a linear combination of any basis. So let
∇F = F1u1 + F2u2 + F3u3.
We need to find (F1, F2, F3). We recall from Section 1.5.3 (of Lecture 5) the
coordinate-independent formula
∇F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n F(y) dSy,    (1)
where V is a domain that contains the point P0 and n is the unit exterior normal
to ∂V. By the way, we also have the formulas
∇ · F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n · F(y) dSy
for the divergence (∇·) of a vector field F, and
∇ × F(P0) = lim_{V→0} (1/V) ∫∫_{∂V} n × F(y) dSy
for the curl (∇×), which we will use for the representations of div and curl. The three
formulas certainly have a striking uniformity. Back to our gradient representation: we
take V to be an elementary "curvilinear parallelepiped" of volume
ds1 ds2 ds3 = h1h2h3 du1 du2 du3
with faces perpendicular to the coordinate curves; see Figure 1.16.1.
Figure 1.16.1. Curvilinear parallelepiped.
To calculate the surface integral in (1), we first note that there are six faces. For
the face that passes through P0 and is perpendicular to the u1-curve, we have the
approximate value
−u1 F(P0) h2 du2 h3 du3,
where the surface area element is ds2 ds3 = h2 du2 h3 du3. The integral on the face
parallel to the previous one is approximately
u1 F(P0 + du1 u1) h2 du2 h3 du3,
where P1 = P0 + du1 u1 denotes the point obtained from P0 by an increment du1 along
the u1-coordinate curve. Combining these two faces and noting that the volume of the
element is
V = ds1 ds2 ds3 = h1h2h3 du1 du2 du3,
the quotient becomes
[ (−F(P0) + F(P0 + du1 u1)) h2h3 du2 du3 / (h1h2h3 du1 du2 du3) ] u1 → (1/h1) (∂F(P0)/∂u1) u1
as V → 0. Similarly we can calculate the other four faces. In summary, we find
∇F = (1/h1) (∂F/∂u1) u1 + (1/h2) (∂F/∂u2) u2 + (1/h3) (∂F/∂u3) u3.
Theorem. The del operator has the formula
∇ = u1 (1/h1) ∂/∂u1 + u2 (1/h2) ∂/∂u2 + u3 (1/h3) ∂/∂u3.
Example 1.16a. Find the expression of ∇ in cylindrical coordinates.
Solution. Let u1 = r, u2 = θ, u3 = z. It is right-handed. We have h1 = 1, h2 = r,
h3 = 1. Also
u1 = cos θ i1 + sin θ i2,  u2 = −sin θ i1 + cos θ i2,  u3 = i3.
Thus
∇ = u1 ∂/∂r + u2 (1/r) ∂/∂θ + u3 ∂/∂z.
Example 1.16b. Find the gradient of f = xyz in cylindrical coordinates.
Solution. We have f = r²z sin θ cos θ. Thus
∇f = u1 2rz sin θ cos θ + u2 rz(cos² θ − sin² θ) + u3 r² sin θ cos θ.
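A numerical cross-check of this example (my own sketch, not from the notes; NumPy assumed, sample point chosen arbitrarily): the cylindrical components reassembled on the local basis should agree with the cartesian gradient (yz, xz, xy).

```python
import numpy as np

r, th, z = 1.3, 0.7, -0.4                      # a sample point
x, y = r * np.cos(th), r * np.sin(th)

# components with respect to u_r, u_theta, u_z from the worked example
F_r  = 2 * r * z * np.sin(th) * np.cos(th)
F_th = r * z * (np.cos(th)**2 - np.sin(th)**2)
F_z  = r**2 * np.sin(th) * np.cos(th)

u_r  = np.array([np.cos(th), np.sin(th), 0.0])
u_th = np.array([-np.sin(th), np.cos(th), 0.0])
u_z  = np.array([0.0, 0.0, 1.0])

grad_cyl = F_r * u_r + F_th * u_th + F_z * u_z
grad_cart = np.array([y * z, x * z, x * y])    # grad(xyz) in cartesian coordinates
print(np.allclose(grad_cyl, grad_cart))        # True
```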
1.16.2. Divergence. We let
F = F1u1 + F2u2 + F3u3.
Then we can find, similarly to the previous section, that
div F = (1/(h1h2h3)) [ ∂/∂u1 (F1 h2h3) + ∂/∂u2 (F2 h1h3) + ∂/∂u3 (F3 h1h2) ].
Example 1.16c. Derive the formula for the Laplacian ∆, defined as ∆ = div ∇.
Solution. Consider F = ∇f. We have
∆f = div ∇f = (1/(h1h2h3)) [ ∂/∂u1 ( (h2h3/h1) ∂f/∂u1 ) + ∂/∂u2 ( (h1h3/h2) ∂f/∂u2 ) + ∂/∂u3 ( (h1h2/h3) ∂f/∂u3 ) ].    (2)
1.16.3. The curl.
Similarly, for
F = F1u1 + F2u2 + F3u3,
we can derive the formula
curl F = (1/(h1h2h3)) det | h1u1    h2u2    h3u3   |
                          | ∂/∂u1   ∂/∂u2   ∂/∂u3  |
                          | F1h1    F2h2    F3h3   |.
Appendix: Useful expressions
I. In cylindrical coordinates
u1 = r, u2 = θ, u3 = z,
h1 = 1, h2 = r, h3 = 1,
there hold
grad f = (∂f/∂r) ur + (1/r)(∂f/∂θ) uθ + (∂f/∂z) uz,
div A = (1/r) ∂(r Ar)/∂r + (1/r) ∂Aθ/∂θ + ∂Az/∂z,
curl A = ( (1/r) ∂Az/∂θ − ∂Aθ/∂z ) ur + ( ∂Ar/∂z − ∂Az/∂r ) uθ + (1/r)( ∂(r Aθ)/∂r − ∂Ar/∂θ ) uz,
∆f = (1/r) ∂/∂r ( r ∂f/∂r ) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²,
where
ur = cos θ i1 + sin θ i2,  uθ = −sin θ i1 + cos θ i2,  uz = i3
is the local orthonormal basis, and A has components Ar, Aθ, Az with respect to
this basis.
II. In spherical coordinates. See text book by Borisenko, p174.
Chapter II. Complex Variables
Dates: September 24, 26, 28.
These three lectures will cover the following sections of the text book by Keener.
§6.1. Complex valued functions and branch cuts;
§6.2.1. Differentiation and analytic functions, Cauchy-Riemann conditions;
§6.2.2. Integration;
§6.2.3. Cauchy integral formula;
§6.2.4. Taylor series expansion.
2.1. Complex valued functions.
1. Complex numbers. We introduce the imaginary number i, whose square is
−1:
i2 = −1.
Complex numbers are of the form a + ib, where a and b are real numbers. Complex
numbers can be represented in the Argand diagram by the vector (a, b): (Figure to
be provided later). Addition and subtraction of two complex numbers are simple:
(a + ib) ± (c + id) = (a ± c) + i(b ± d).
We try the form, motivated by the guess work for the scalar equation x = c e^{λt},
(x1(t), x2(t), x3(t))^T = (a e^{λt}, b e^{λt}, c e^{λt})^T.    (3)
We see that
(x′1, x′2, x′3)^T = λ (a, b, c)^T e^{λt} = λ (x1, x2, x3)^T.
Inserting this back into (1), we have
λx1 − x2 − x3 = 0,
λx2 − x1 − x3 = 0,
λx3 − x1 − x2 = 0.
This is a homogeneous linear system of three algebraic equations. We write it in
matrix form:
| λ  −1 −1 | | x1 |
| −1  λ −1 | | x2 | = 0.
| −1 −1  λ | | x3 |
Removing the common factor e^{λt} from [x1, x2, x3]^T in the above equation, we find
| λ  −1 −1 | | a |
| −1  λ −1 | | b | = 0.    (4)
| −1 −1  λ | | c |
To have a nonzero solution [a, b, c]^T, we need the matrix to have zero determinant:
det | λ  −1 −1 |
    | −1  λ −1 | = 0.    (5)
    | −1 −1  λ |
This determinant can be evaluated to be
λ³ − 3λ − 2 = (λ + 1)²(λ − 2).    (6)
The factorization is made possible by inspection, since λ = −1 is a root. Equation
(5) then has three roots:
λ1 = 2, λ2 = −1, λ3 = −1.    (7)
Using the root λ1 = 2 in (4), we have the equation
|  2 −1 −1 | | a |
| −1  2 −1 | | b | = 0.    (8)
| −1 −1  2 | | c |
The solutions are
(a, b, c)^T = α (1, 1, 1)^T,  (α free).    (9)
Thus we find the first batch of solutions
(x1, x2, x3)^T = α (1, 1, 1)^T e^{2t}.    (10)
Using λ2 = −1 in (4), we find the equation
| −1 −1 −1 | | a |
| −1 −1 −1 | | b | = 0.    (11)
| −1 −1 −1 | | c |
The solutions to (11) are
(a, b, c)^T = β (1, 0, −1)^T + γ (0, 1, −1)^T,  (β, γ free).    (12)
We find another batch of solutions to (1):
(x1, x2, x3)^T = β (1, 0, −1)^T e^{−t} + γ (0, 1, −1)^T e^{−t}.    (13)
If λ3 were different from λ2, we could use it to find another batch. But so far we have
found plenty of solutions. We combine the solutions (10) and (13) linearly to end
up with the general solution formula for (1):
(x1, x2, x3)^T = α (1, 1, 1)^T e^{2t} + β (1, 0, −1)^T e^{−t} + γ (0, 1, −1)^T e^{−t}.    (14)
We can use the initial condition (2) to determine the three arbitrary constants α, β, γ
(which we skip).
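The eigenvalue calculation and the solution formula (14) can be cross-checked numerically. A short sketch (mine, not from the notes; NumPy assumed, with arbitrary constants α, β, γ):

```python
import numpy as np

# The matrix of the system x' = A x has eigenvalues 2, -1, -1, and the general
# solution (14) satisfies the differential equation.
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
vals, vecs = np.linalg.eig(A)
print(np.sort(vals.real))               # approximately [-1, -1, 2]

alpha, beta, gamma = 1.0, -2.0, 0.5     # arbitrary constants
def x(t):
    return (alpha * np.array([1, 1, 1]) * np.exp(2 * t)
            + beta  * np.array([1, 0, -1]) * np.exp(-t)
            + gamma * np.array([0, 1, -1]) * np.exp(-t))

t, h = 0.3, 1e-6
dxdt = (x(t + h) - x(t - h)) / (2 * h)  # centered-difference derivative
print(np.allclose(dxdt, A @ x(t), atol=1e-5))   # True
```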
In general, equation (1) can be written as
dx/dt = Ax    (15)
for an n × n matrix A with constant coefficients (aij). For our previous example,
A = | 0 1 1 |
    | 1 0 1 |
    | 1 1 0 |.
The guess is
x = a e^{λt},    (16)
where a is a vector. Then λ needs to satisfy
det(λI − A) = 0    (17)
and a satisfies
Aa = λa.    (18)
If the characteristic equation (17) has n roots λ1, λ2, ..., λn, and the eigenvalue problem
(18) has n corresponding linearly independent eigenvectors a1, a2, ..., an,
then the general solution of (15) is
x(t) = α1 a1 e^{λ1 t} + α2 a2 e^{λ2 t} + ... + αn an e^{λn t}.    (19)
If, for example, λ1 = λ2 and the corresponding linearly independent eigenvectors
are fewer than n, then by guess work (in addition to α1 a1 e^{λ1 t}) we try
x = α2 (a2 e^{λ1 t} + a1 t e^{λ1 t}).
This way a nonzero a2 can be found from the sequence of equations
A a1 = λ1 a1,
A a2 = λ1 a2 + a1.
And the general solution is
x = α1 a1 e^{λ1 t} + α2 (a2 + a1 t) e^{λ1 t} + α3 a3 e^{λ3 t} + ... + αn an e^{λn t}.
If λ1 is repeated more times, then higher powers of t can be used in the guess work.
If, however, all eigenvalues are distinct, a theorem says that the n corresponding
eigenvectors are linearly independent and (19) gives the general solution.
5.4. Stability of first-order linear systems
Motivation: A solution needs to be stable in order to be useful in practice. The
U.S. missile defense system is not yet stable.
Consider
dx/dt = Ax,  x(0) = C,    (1)
where A is an n × n matrix of constants, with n distinct eigenvalues. The solution
formula is
x(t) = α1 a1 e^{λ1 t} + α2 a2 e^{λ2 t} + ... + αn an e^{λn t}.
Theorem 1. If the real parts of all the eigenvalues of the coefficient matrix A are
(strictly) negative, then every solution of (1) goes to zero as t → +∞.
Theorem 2. If one or more eigenvalues of A have positive real parts, then some
solutions of (1) go to infinity as t → +∞.
Proofs: They follow from the solution formula if all eigenvalues are distinct.
Otherwise, solutions contain terms like t^m e^{λt}, which also go to zero if the real part
of λ is negative, or go to infinity if the real part of λ is positive.
Now let us consider a perturbation of (1):
dx/dt = Ax + R(t, x).    (2)
Suppose
‖R(t, x)‖ ≤ α‖x‖  on t ≥ 0, ‖x‖ < H,    (3)
for some constants α and H > 0. Then:
Theorem 3. If the real parts of all the eigenvalues of A are (strictly) negative,
and (3) holds for a suitably small α, then the zero solution of (2) is asymptotically
stable; i.e., all solutions of (2) with small initial data go to zero as t → +∞.
Theorem 4. If one or more eigenvalues of A have positive real parts, then the zero
solution is not stable, provided that (3) holds for a suitably small α.
What if one eigenvalue has zero real part and all others have negative real parts?
This is called the critical case, and is where bifurcation occurs. We will discuss these
issues in the next section. We provide some concrete stability examples below.
Examples. 1. Consider
dx1/dt = λx1,
dx2/dt = µx2.    (4)
Suppose that λ < 0, µ < 0. Then all solutions go to zero as t → +∞. Add a
perturbation R(t, x) satisfying (3) with α = ε, where ε < min(|λ|, |µ|); the zero
solution remains stable: all solutions x(t) → 0 as t → +∞.
2. For (4) again, but λ < 0 < µ. Zero is still a solution. But it is not stable,
since the initially nearby solution
(x1, x2) = (0, α e^{µt}),
where α is small, grows to infinity.
3. Consider now, for β ≠ 0, the system
dx1/dt = βx2,
dx2/dt = −βx1.
Differentiating the first equation and using the second equation, we find
d²x1/dt² + β²x1 = 0.
We can therefore find the solution formula
x1 = x1⁰ cos(βt) + x2⁰ sin(βt),
x2 = −x1⁰ sin(βt) + x2⁰ cos(βt).
Introduce ρ(t) = (x1² + x2²)^{1/2}; then ρ(t) = ((x1⁰)² + (x2⁰)²)^{1/2}. See Figure 5.1
for the phase portrait of the solutions. This solution is however unstable to perturbations
of the form
(0, αx2),
where α > 0, because then the equation has the matrix
|  0  β |
| −β  α |,
one of whose eigenvalues has positive real part.
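Theorems 1 and 2 suggest a simple numerical test for these examples. A sketch (mine, not from the notes; NumPy assumed, with illustrative parameter values):

```python
import numpy as np

# Classify the zero solution of x' = A x by the signs of the real parts of the
# eigenvalues, following Theorems 1 and 2.
def classify(A):
    re = np.linalg.eigvals(A).real
    if np.all(re < 0):
        return "asymptotically stable (all Re < 0)"
    if np.any(re > 0):
        return "unstable (some Re > 0)"
    return "critical case (an eigenvalue with zero real part)"

lam, mu, beta, alpha = -1.0, -0.5, 2.0, 0.3
print(classify(np.diag([lam, mu])))                        # Example 1: stable
print(classify(np.diag([lam, 0.5])))                       # Example 2: unstable
print(classify(np.array([[0.0, beta], [-beta, 0.0]])))     # Example 3: critical
print(classify(np.array([[0.0, beta], [-beta, alpha]])))   # perturbed Example 3: unstable
```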
Notes 1. The stability of a nonzero solution w(t) can be transformed to the stability
of the zero solution to the equation for v(t) ≡ u(t)− w(t).
2. A general nonlinear system
dx/dt = F(t, x)
may be approximated by (2), just as a curve can be approximated by its tangent
lines.
(Figure 5.1.)
Figure 5.1. Any solution of Example 3 traces a circle: the phase portraits (x1(t), x2(t)) for β < 0 and β > 0.
5.5. Hopf bifurcations and example.
Motivation: Bifurcation theory is used in many areas: the life sciences, ecological
systems, weather systems, fluids, chaos, and turbulence.
Consider
d²u/dt² + (u² − λ) du/dt + u = 0.    (5)
It has the solution u = 0. Let us consider the linearized equation
d²u/dt² − λ du/dt + u = 0.
When we try solutions of the form u = e^{µt}, we find
µ² − λµ + 1 = 0.
For λ < 0, both roots have negative real parts, so the zero solution is stable. For λ > 0,
both roots have positive real parts, so the zero solution is unstable. At λ = 0, the roots
are purely imaginary, µ = i, −i, and the linearized equation has periodic solutions
u = e^{it} = cos t + i sin t, or u = e^{−it} = cos t − i sin t. Both the real and imaginary
parts are real solutions: u(t) = cos t or sin t.
We write equation (5) in vector form by introducing u1 = u, u2 = u′:
u′1 = u2,
u′2 = (λ − u1²) u2 − u1.
Or
d/dt (u1, u2)^T = |  0  1 | (u1, u2)^T + ( 0, −u1² u2 )^T.
                  | −1  λ |
This nonlinear system has a nonzero periodic solution near λ = 0:
λ = ε²/4 + O(ε³),  u1(t) = ε cos(ωt) + O(ε³),  ω = 1 + O(ε³).
We will derive this expansion in perturbation theory next semester. For now we have
a bifurcation diagram, see Figure 5.2, and we state a general bifurcation theorem
called Hopf bifurcation.
(Figure 5.2.)
Figure 5.2. Hopf bifurcation diagram. (a) Bifurcation diagram in the (λ, ‖u‖max) plane: the branch of zero solutions and a branch of periodic solutions emanating from λ = 0; each point on the branch indicates a periodic solution. (b) A periodic solution u(t).
Theorem (Hopf Bifurcation). Suppose the n × n matrix A(λ) has eigenvalues µj =
µj(λ) (j = 1, 2, ..., n), and that for λ = λ0, µ1(λ0) = iβ, µ2(λ0) = −iβ, and
Re µj(λ0) ≠ 0 for all j > 2. Suppose further that Re(µ′1(λ0)) ≠ 0. Then the system
of differential equations
du/dt = A(λ)u + f(u),
with f(0) = 0 and f(u) a smooth function of u, has a branch (continuum) of periodic
solutions emanating from u = 0, λ = λ0.
(The direction of bifurcation is not determined by the Hopf Bifurcation Theorem,
but must be calculated by a local power series expansion (see Keener).)
We plan to do serious perturbation theory next semester, where we can understand
how a mathematician's perturbation and calculation helps in locating the