MONASH UNIVERSITY - SCHOOL OF MATHEMATICAL SCIENCES
ENG1091 Vectors
Lecture 1 · vector arithmetic revision · dot product · cross product Text Reference: §4.1 - 4.2
Vectors and Lines, a quick review.
Many quantities in nature are completely specified by one number (called the magnitude of the
quantity) and are usually referred to as scalar quantities. Some examples are temperature, time,
length, and mass.
However, certain quantities require both a magnitude and a direction to specify them. To
say that a boat sailed 10 kilometers (km) does not specify where it went. It is necessary to
give the direction too; perhaps it sailed 10 km northwest. We then describe the position of the
boat by giving its displacement relative to some point, a quantity that involves distance as
well as direction. Quantities that require both a magnitude and a direction to describe them are
called vectors. Other examples include velocity and force. Vector quantities will be denoted by
boldface type: u, v, w, and so on. In handwritten work vectors are denoted by $\underset{\sim}{v}$ or by $\vec{v}$. The vector that joins the two points A and B is denoted
$\overrightarrow{AB}$ or by AB (in boldface).
A vector v can be represented geometrically as a directed line segment or arrow. The magnitude
of a vector v will be denoted by ‖v‖ and is sometimes referred to as the length of v because it is represented by the length of the arrow.
Two vectors v and w are equal (written v = w) if they have the same length and the same direction. Thus, for example, the two vectors in the diagram are equal even though the initial and terminal points are different!
[Figure: two equal vectors v and w drawn from different initial points]
There is one vector that has no direction whatsoever: the zero vector 0.
Given a vector v, the vector that has the same length as v but the opposite direction is the negative of vector v, denoted −v.
[Figure: a vector v and its negative −v]
When we multiply a vector by a scalar we multiply the length of the vector by the absolute value of the scalar, without changing its direction (unless the scalar is negative, in which case the direction is reversed). Two vectors are parallel if one is a scalar multiple of the other.
That is, if a = λb then a is parallel to b.
[Figure: the vectors −½v, v and 2v]
ENG1091 Mathematics for Engineering page 1
If u and v are two vectors we define their sum u + v by adding the vectors ‘head to tail’, which
is to say we attach the tail of the second vector, v, to the head of the first, u. The sum u + v is
then the vector drawn from the tail of the first vector to the head of the last.
This method also allows us to add several vectors at once.
[Figure: vectors a, b and c added head to tail, with their sum a + b + c]
Should it happen that vectors add together forming a loop, so that the end point is the same as
the initial point, then the vector sum is 0. Thus, for example, if A, B, C are any three points in
space, $\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CA} = \mathbf{0}$.
We can also add two vectors u and v geometrically by drawing them from the same point and
completing a parallelogram with the two vectors as adjacent sides. The diagonal vector drawn
from the common tail to the common head point is then the vector u+ v.
From the parallelogram method of vector addition we see that u+ v = v + u.
The opposing diagonal, drawn towards v, is the vector v − u.
The unit coordinate vectors.
Vectors of length one unit are called unit vectors. The unit vectors parallel to the positive x and y axes in the plane are labelled i and j.
In three-dimensional space we add a further unit vector, k, parallel to the z axis.
Any vector r in space can be written as a combination of multiples of i, j and k. The coefficients of i, j and k are called its rectangular components.
r = (x, y, z) = xi + yj + zk
[Figure: the unit vectors i, j and k along the x, y and z axes]
The magnitudes of vectors given in component form.
Using Pythagoras’ theorem it is an easy matter to find the lengths of vectors.
In three dimensions, where we have v = ai + bj + ck, ‖v‖ = ‖ai + bj + ck‖ = $\sqrt{a^2 + b^2 + c^2}$.
Example: ‖i − j + k‖ = $\sqrt{(1)^2 + (-1)^2 + (1)^2} = \sqrt{3}$
In two dimensions the length of v = ai + bj is given by ‖v‖ = ‖ai + bj‖ = $\sqrt{a^2 + b^2}$.
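The length formula can be tried out numerically. The following is a minimal sketch; the function name `norm` and the 2D test vector 3i + 4j are illustrative choices, not part of the notes.

```python
import math

def norm(v):
    """Length of a vector given by its rectangular components."""
    return math.sqrt(sum(c * c for c in v))

length_3d = norm((1, -1, 1))  # the example ||i - j + k|| from the notes
length_2d = norm((3, 4))      # illustrative 2D vector 3i + 4j
```

The same function covers two and three dimensions because Pythagoras’ theorem just adds one squared component per axis.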
The Scalar or “Dot” Product
In the previous section we saw how vectors can be added/subtracted together, and we saw how
to multiply them by scalars. The question naturally arises: is it possible to multiply two vectors
together?
There are two types of vector multiplication that are generally useful: the scalar or dot product
and the vector or cross product. Now for a word of warning. Many of the rules we take for
granted in ordinary arithmetic don’t hold when it comes to vector multiplication. When we look
at the vector cross product later this lecture we will see that a × b ≠ b × a. We will also see that there is no such thing as vector division: vectors don’t have reciprocals! Of course we don’t just
multiply vectors for fun; we do it because it has useful applications.
First, consider the scalar product. One modern use of the scalar product is the projection of
a 3D image on a 2D screen and to do it in such a way as to convince the viewer that he/she is
looking at a 3D image.
Given two vectors a and b, we define their scalar or ‘dot’ product as
a · b = ‖a‖ ‖b‖ cos θ
where θ is the angle between the two vectors.
Note that a · b is a scalar quantity; it is not a vector.
Historically the reason that the scalar product was studied is that in physics the work done by
a force F in moving an object a displacement d is the dot product of force with displacement,
i.e. W = F · d.
From the definition we immediately get the following:
(i) a · a = ‖a‖² (because the angle between a vector a and itself is 0).
(ii) If a ⊥ b then a · b = 0 (because cos 90° = 0).
The dot products of the unit vectors i, j and k.
Given the definition above we see that i · j = j · k = k · i = 0
and i · i = j · j = k · k = 1
Properties of the Dot Product
(i) a · b = b · a the dot product is commutative
(ii) λa · b = a · λb =λ (a · b) , for any scalar λ
(iii) a · (b+ c) = a · b+ a · c the dot product is distributive
Notice that the expression a · (b · c) has absolutely no meaning because it is attempting to form a dot product of vector a with the scalar b · c.
The expression a (b · c) has a meaning, though it is better written as (b · c)a. The expression (b · c)a means to multiply vector a by the scalar b · c, resulting in a vector having the same or opposite direction as a and of length |b · c| ‖a‖.
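Notice how we can use the distributive law to simplify the dot product of two vectors given in
component form: Let a = a1i + a2j + a3k, and b = b1i + b2j + b3k. Expanding a · b term by term and using i · j = j · k = k · i = 0 and i · i = j · j = k · k = 1 gives a · b = a1b1 + a2b2 + a3b3.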
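As a rough check of these rules, the sketch below computes dot products from components, using the standard component form a · b = a1b1 + a2b2 + a3b3. The helper names `dot` and `angle_between` and the sample vectors a and b are assumptions made for illustration only.

```python
import math

def dot(a, b):
    """Dot product from rectangular components."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in radians, from a . b = ||a|| ||b|| cos(theta)."""
    return math.acos(dot(a, b) / math.sqrt(dot(a, a) * dot(b, b)))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
a, b = (1, 2, 3), (4, -5, 6)  # illustrative vectors, not from the notes

commutes = dot(a, b) == dot(b, a)   # property (i) of the dot product
right_angle = angle_between(i, j)   # i and j are perpendicular
```

Note that `dot(a, a)` is ‖a‖², so the denominator in `angle_between` is ‖a‖ ‖b‖.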
Now $\overrightarrow{AP} = \langle x, y, z\rangle - \langle -1, 0, 4\rangle = \langle x + 1, y, z - 4\rangle$ and
$\overrightarrow{AP} \cdot \mathbf{n} = \langle x + 1, y, z - 4\rangle \cdot \langle -17, 3, -9\rangle = 0$,
that is −17x − 17 + 3y − 9z + 36 = 0,
giving the equation of the plane as 17x − 3y + 9z = 19.
We should check that all three points satisfy the plane’s equation:
A (−1, 0, 4) : 17x − 3y + 9z = −17 + 36 = 19 ✓
B (2, 5, 0) : 17x − 3y + 9z = 34 − 15 = 19 ✓
C (2, 2, −1) : 17x − 3y + 9z = 34 − 6 − 9 = 19 ✓
There are two observations that can be made. Firstly, the equation of a plane in three-dimensional
space is unique (up to multiplication by a non-zero scalar constant). Secondly, parallel planes can be written with the
same normal vector and hence will differ only by the constant D.
Example 3: Find the minimum distance between the parallel planes 2x + 3y − z = 6 and
2x+ 3y − z = 0.
Let P (x1, y1, z1) be any point in the plane 2x + 3y − z = 6 and
Q (x2, y2, z2) be any point in the plane 2x + 3y − z = 0. [Notice that the equations of the planes
are arranged so that they have identical coefficients. Rearrange the equations if necessary; this
is important for what comes next.]
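The distance between two parallel planes with normal n is then (diagram)
$$d = \left\|\overrightarrow{PQ}\right\| \cos\theta = \frac{\overrightarrow{PQ} \cdot \mathbf{n}}{\|\mathbf{n}\|} = \frac{\left(\overrightarrow{OQ} - \overrightarrow{OP}\right) \cdot \mathbf{n}}{\|\mathbf{n}\|} = \frac{\overrightarrow{OQ} \cdot \mathbf{n} - \overrightarrow{OP} \cdot \mathbf{n}}{\|\mathbf{n}\|}$$
however $\overrightarrow{OQ} \cdot \mathbf{n} = 2x_2 + 3y_2 - z_2 = 0$ and similarly $\overrightarrow{OP} \cdot \mathbf{n} = 2x_1 + 3y_1 - z_1 = 6$.
Thus (and taking absolute value since we seek a distance):
$$d = \left|\frac{0 - 6}{\sqrt{2^2 + 3^2 + (-1)^2}}\right| = \frac{6}{\sqrt{14}}$$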
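The result of Example 3 depends only on the shared normal and the two right-hand constants, which suggests a small helper. This is a sketch under the assumption that both planes are already written with identical coefficients; the name `plane_distance` is hypothetical.

```python
import math

def plane_distance(normal, d1, d2):
    """Distance between the parallel planes n . r = d1 and n . r = d2."""
    return abs(d2 - d1) / math.sqrt(sum(c * c for c in normal))

# Example 3: 2x + 3y - z = 6 and 2x + 3y - z = 0
distance = plane_distance((2, 3, -1), 6, 0)
```

If the two equations are given with different multiples of the normal, they would need to be rescaled first, exactly as the bracketed remark in the example warns.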
2. Lines and Planes
Combining the knowledge of lines, planes and basic vector operations allows for a wide range of
problems to be addressed in three-dimensional space. For example, we can find:
· the minimum distance from a point to a plane,
· the minimum distance from a point to a line,
· the angle between two intersecting planes,
· the minimum distance between two non-intersecting lines.
Example 4: Find the line defined by the intersection of the planes −x+y+z = 2 and x+2y = 4
and the angle of intersection.
Solution: A direction vector of the line of intersection is easily found: it is normal to both
−i + j + k and i + 2j and hence can be obtained using the cross product. Finding the equation
of the line of intersection is best done using Gauss elimination (next lecture).
A direction vector is
$$\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -1 & 1 & 1 \\ 1 & 2 & 0 \end{vmatrix} = -2\mathbf{i} + \mathbf{j} - 3\mathbf{k}.$$
(Of course any non-zero scalar multiple of this is also a direction vector.)
The angle between two planes is defined as the angle between their normals (diagram).
(−i + j + k) · (i + 2j) = −1 + 2 = 1
‖−i + j + k‖ = $\sqrt{(-1)^2 + 1^2 + 1^2} = \sqrt{3}$ and ‖i + 2j‖ = $\sqrt{5}$
The angle θ between the planes is then given by $\cos\theta = \frac{1}{\sqrt{3}\sqrt{5}}$, hence θ ≈ 75.04°.
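Both parts of Example 4 can be reproduced with a few lines of code. The sketch below assumes the usual component formula for the cross product; `cross` and `dot` are illustrative helper names.

```python
import math

def cross(a, b):
    """Cross product a x b from components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

n1 = (-1, 1, 1)  # normal of -x + y + z = 2
n2 = (1, 2, 0)   # normal of x + 2y = 4

direction = cross(n1, n2)  # direction of the line of intersection
cos_theta = dot(n1, n2) / (math.sqrt(dot(n1, n1)) * math.sqrt(dot(n2, n2)))
theta_deg = math.degrees(math.acos(cos_theta))
```

The direction vector is perpendicular to both normals, which is easy to confirm with `dot(direction, n1)` and `dot(direction, n2)`.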
3. Parametric representation of a plane
Recall that straight lines have parametric equations giving x, y, z as function of one parametric
variable (usually t). Planes have parametric equations where x, y, z are given as functions of two
parametric variables (usually u and v).
Suppose we know a point P0 (a, b, c) in the plane and two non-parallel direction vectors
w1 = pi + qj + rk, and w2 = li + mj + nk also in the plane (diagram).
[Figure: the plane through P0 spanned by w1 and w2, showing the origin O, the multiples uw1 and vw2, and the position vector r(x, y, z)]
Let r = xi + yj + zk denote the position vector of a general point P (x, y, z) in the plane, so
that $\mathbf{r} = \overrightarrow{OP_0} + u\mathbf{w}_1 + v\mathbf{w}_2$ where u, v are any scalars (parameters).
This gives r (x, y, z) = 〈a, b, c〉+ u 〈p, q, r〉+ v 〈l,m, n〉 and hence
x (u, v) = a+ pu+ lv,
y (u, v) = b+ qu+mv,
z (u, v) = c+ ru+ nv.
These three equations are the parametric equations of a plane. The fact that two parameters (u
and v) are needed to describe it indicates that a plane is a two-dimensional surface.
In more advanced mathematics (i.e. 2nd level maths), it will be imperative to represent surfaces
parametrically.
Example 5: Find a parametric representation of the plane going through the points (−1, 0, 4) , (2, 5, 0)
and (2, 2,−1) .
Solution: label the points P (−1, 0, 4) , Q (2, 5, 0) and R (2, 2,−1) .
Now a choice for w1 is $\overrightarrow{PQ} = \langle 2, 5, 0\rangle - \langle -1, 0, 4\rangle = \langle 3, 5, -4\rangle$
and a choice for w2 is $\overrightarrow{PR} = \langle 2, 2, -1\rangle - \langle -1, 0, 4\rangle = \langle 3, 2, -5\rangle$.
Check that these are non-parallel ✓. (Otherwise the three points are collinear and the question cannot be answered properly: there will be an infinite number of planes.)
In vector form the parametric equations are
r = (−1, 0, 4) + u (3, 5,−4) + v (3, 2,−5)
= (−1 + 3u+ 3v, 5u+ 2v, 4− 4u− 5v)
Hence
x (u, v) = −1 + 3u+ 3v,
y (u, v) = 5u+ 2v,
z (u, v) = 4− 4u− 5v.
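One way to sanity-check the parametric form of Example 5 is to confirm that it reproduces the three given points and that every generated point satisfies the Cartesian equation 17x − 3y + 9z = 19 found earlier for the same three points. The function names below are illustrative.

```python
def plane_point(u, v):
    """Point on the plane of Example 5 for parameters u and v."""
    x = -1 + 3 * u + 3 * v
    y = 5 * u + 2 * v
    z = 4 - 4 * u - 5 * v
    return (x, y, z)

def on_plane(point):
    """Check the Cartesian equation 17x - 3y + 9z = 19."""
    x, y, z = point
    return 17 * x - 3 * y + 9 * z == 19

# (u, v) = (0, 0), (1, 0), (0, 1) should give P, Q and R respectively
corners = [plane_point(0, 0), plane_point(1, 0), plane_point(0, 1)]
```

Integer parameters are used so that the comparison in `on_plane` is exact.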
ENG1091 Systems of Linear Equations
Lecture 4 · echelon form · Gauss elimination Text Reference: §5.5
Our object in this lecture is to solve a system of equations like
2x+ y + z + w = 4
4x+ y + 3z + 2w = 7
−2x+ z − w = 9.
Such a system is called linear because the left-hand side of each equation is a linear
function of the unknown variables x, y, z and w. Simple linear systems of 2 or 3 variables
are commonly encountered in secondary school and it is instructive to view an example before
discussing a more general procedure.
Suppose we wish to solve a system like
x+ 2y = 3 (1)
2x− 3y = −8 (2)
One way to proceed is to multiply equation 1 by 2 and subtract this from equation 2:
x+ 2y = 3 (1)
−7y = −14 (2a)
The reason why this is effective is that one of the variables is eliminated. Equation (2a) is now
easily solved giving y = 2, and substituting this into equation 1 we find x = −1. Geometrically,
the equations x + 2y = 3 and 2x − 3y = −8 represent two straight lines in the x-y plane which intersect at the point (−1, 2).
The important point is that both of the systems
x + 2y = 3
2x − 3y = −8
and
x + 2y = 3
−7y = −14
have identical solutions. Think about the operations we could perform on the two original equations.
We could
• interchange the two equations
• multiply either equation by any number we choose except zero, and
• add a multiple of one equation to the other.
Now performing any of these operations without thinking is not guaranteed to be effective but
at least we are assured that the resulting system of equations has an identical set of solutions.
Notice that the names of the variables are irrelevant: solving
x + 2y = 3
2x − 3y = −8
is exactly the same as solving the system
u + 2v = 3
2u − 3v = −8;
only the coefficients are important.
1. The first step in solving a linear system is to write the system in augmented matrix form.
This is a way of writing the system using only the coefficients.
For example we write the system
2x+ y + z + w − 4 = 0
4x+ y + 3z + 2w = 7
−2x+ z − w = 9
as
$$\left[\begin{array}{cccc|c} 2 & 1 & 1 & 1 & 4 \\ 4 & 1 & 3 & 2 & 7 \\ -2 & 0 & 1 & -1 & 9 \end{array}\right].$$
Notice each equation is written as a single row and that coefficients belonging to the same variable
are written directly underneath each other. (Equation 3, which appears to have no y, has in fact
a y-coefficient of zero.) Each constant term must be placed on the right hand side of the ‘equals’ sign (the ‘−4’ becomes +4 on the right hand side of equation 1) and the vertical partition is
used to separate the left hand side from the right hand side. (Think of it as replacing all of the
‘equals’ signs.)
Example: Write the system
r + s + 2t = 0
2r − 3t = 1
6s − 5t = 0
in augmented matrix form.
Solution:
$$\left[\begin{array}{ccc|c} 1 & 1 & 2 & 0 \\ 2 & 0 & -3 & 1 \\ 0 & 6 & -5 & 0 \end{array}\right].$$
2. Gaussian elimination
Gaussian elimination is a systematic method of solving linear equations by first reducing the
corresponding system into an equivalent system, called row echelon form, where the unknowns
can be calculated by back substitution.
Example: Given the system
r + s + 2t = 0
s − 3t = 1
− 5t = 5
find solutions to each of the variables using back substitution.
Solution: t = 5/(−5) = −1, then s = 1 + 3t = −2, and finally r = −2t − s = 2 + 2 = 4.
The system of equations in the last example has the augmented matrix
$$\left[\begin{array}{ccc|c} 1 & 1 & 2 & 0 \\ 0 & 1 & -3 & 1 \\ 0 & 0 & -5 & 5 \end{array}\right]$$
which is one that is already in row echelon form. We saw how easy it is to find solutions of
systems in this form.
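Back substitution is mechanical enough to write down as a short routine. The sketch below assumes a square system whose echelon form has a non-zero pivot in every row (so the solution is unique); the function name `back_substitute` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def back_substitute(aug):
    """Solve an upper-triangular augmented matrix with non-zero diagonal."""
    n = len(aug)
    x = [Fraction(0)] * n
    # Work from the last equation upwards, as in the worked example.
    for i in range(n - 1, -1, -1):
        known = sum(Fraction(aug[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(aug[i][n]) - known) / Fraction(aug[i][i])
    return x

r, s, t = back_substitute([[1, 1, 2, 0],
                           [0, 1, -3, 1],
                           [0, 0, -5, 5]])
```

Each pass uses only values already found below it, which is why the echelon form makes the system easy to solve.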
Definition: A matrix is in row echelon form when
• the leading (non-zero) coefficient of each row (called the pivot entry) has zeros below
it,
• the pivot entries of following rows are located in columns further to the right, and
• any rows which have no pivot (and therefore consist entirely of zeros) come last.
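Example: Given the following partitioned matrices, choose those which are in row echelon form:
A. $\left[\begin{array}{ccc|c} 1 & 1 & 2 & 0 \\ 1 & 1 & 13 & 1 \\ 0 & 0 & 1 & 5 \end{array}\right]$ no
B. $\left[\begin{array}{cccc|c} 1 & 0 & 2 & 0 & 2 \\ 0 & 1 & -3 & 0 & 1 \\ 0 & 0 & 0 & 1 & 10 \end{array}\right]$ yes
C. $\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 5 \end{array}\right]$ yes
D. $\left[\begin{array}{ccc|c} 2 & 1 & 2 & 0 \\ 0 & 3 & -3 & 6 \\ 0 & 0 & 2 & 5 \end{array}\right]$ yes
E. $\left[\begin{array}{ccc|c} 1 & 1 & 2 & 0 \\ 0 & 3 & 13 & 1 \\ 0 & 0 & 0 & 5 \end{array}\right]$ yes
F. $\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{array}\right]$ no
G. $\left[\begin{array}{cccccc|c} 1 & 2 & 0 & 1 & -3 & 1 & 0 \\ 0 & 0 & 0 & 1 & 2 & -3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 5 \end{array}\right]$ yes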
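The definition of row echelon form can be turned into a simple check. In this sketch each matrix is a list of rows with the right-hand column included as the last entry; the function names are illustrative. It relies on the observation that if each pivot lies strictly to the right of the one above, the entries below every pivot are automatically zero.

```python
def leading_index(row):
    """Column index of the first non-zero entry, or None for a zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_row_echelon(rows):
    last_pivot = -1
    seen_zero_row = False
    for row in rows:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True   # zero rows must all come last
            continue
        if seen_zero_row or j <= last_pivot:
            return False           # pivot not further right, or after a zero row
        last_pivot = j
    return True

matrix_a = [[1, 1, 2, 0], [1, 1, 13, 1], [0, 0, 1, 5]]
matrix_e = [[1, 1, 2, 0], [0, 3, 13, 1], [0, 0, 0, 5]]
matrix_f = [[1, 0, 0, 0, 0], [0, 1, 1, 0, 0], [1, 0, 0, 1, 0]]
```

Matrices A, E and F from the example are included so the check can be compared with the stated answers.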
To obtain the equivalent row echelon form of a system we apply a sequence of the three elementary
row operations on the augmented matrix. As discussed above these row operations do not change
the solution set of the corresponding system of linear equations.
The three elementary row operations are:
• Interchanging two rows
• Multiplying a row by a non-zero scalar
• Adding to one row a multiple of another
2. Row echelon forms
To reduce a matrix to row echelon form systematically we follow these steps:
1. Locate the left-most column that doesn’t consist entirely of zeros.
2. Ensure that the top entry of this column is a non-zero entry. If necessary, interchange top
row with another row to achieve this.
3. Multiply this top row by the appropriate constant so that the first non-zero entry of this
row is 1. This entry is the pivot for that column. (It is not absolutely necessary that the
value of each pivot be 1 but this is certainly the most convenient value to have. As an
alternative to multiplying each row by a constant we can add/subtract multiples of other
rows to obtain a 1.)
4. Add a suitable multiple of this first row to each row below, so that all entries below this
pivot are 0.
5. Consider the submatrix obtained by removing the top row, and apply to this matrix steps
1 to 4.
Repeat steps 1-5 until the next submatrix under consideration has no rows left.
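Example: Reduce the following matrix to row echelon form:
$$\begin{bmatrix} 0 & 0 & -2 & 0 & 12 \\ 3 & 6 & -15 & 9 & 42 \\ 2 & 4 & -5 & 6 & -1 \end{bmatrix}$$
Solution:
(1/3)R2 → R2, then R1 ↔ R2:
$$\begin{bmatrix} 1 & 2 & -5 & 3 & 14 \\ 0 & 0 & -2 & 0 & 12 \\ 2 & 4 & -5 & 6 & -1 \end{bmatrix}$$
R3 − 2R1 → R3:
$$\begin{bmatrix} 1 & 2 & -5 & 3 & 14 \\ 0 & 0 & -2 & 0 & 12 \\ 0 & 0 & 5 & 0 & -29 \end{bmatrix}$$
(−1/2)R2 → R2:
$$\begin{bmatrix} 1 & 2 & -5 & 3 & 14 \\ 0 & 0 & 1 & 0 & -6 \\ 0 & 0 & 5 & 0 & -29 \end{bmatrix}$$
R3 − 5R2 → R3:
$$\begin{bmatrix} 1 & 2 & -5 & 3 & 14 \\ 0 & 0 & 1 & 0 & -6 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$ ← row echelon form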
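Exercise: Find a row echelon form of the matrix
$$\begin{bmatrix} 1 & 0 & -1 & 0 \\ 2 & 1 & 0 & 8 \\ 0 & 1 & -2 & 0 \\ 1 & -1 & -2 & -6 \end{bmatrix}$$
Solution:
R2 − 2R1 → R2, R4 − R1 → R4:
$$\begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 8 \\ 0 & 1 & -2 & 0 \\ 0 & -1 & -1 & -6 \end{bmatrix}$$
R3 − R2 → R3, R4 + R2 → R4:
$$\begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & -4 & -8 \\ 0 & 0 & 1 & 2 \end{bmatrix}$$
(−1/4)R3 → R3:
$$\begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 1 & 2 \end{bmatrix}$$
R4 − R3 → R4:
$$\begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$ ← row echelon form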
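The reduction steps can be sketched in code as follows. This is one possible implementation, not the only valid one: it always swaps in the first row with a non-zero entry and always scales the pivot to 1, so it produces one particular row echelon form. The function name `row_echelon` is an assumption, and exact fractions are used so the arithmetic matches the hand calculation.

```python
from fractions import Fraction

def row_echelon(matrix):
    """Reduce a matrix (list of rows) to a row echelon form with unit pivots."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    top = 0  # index of the first row of the current submatrix
    for col in range(n_cols):
        if top == n_rows:
            break
        # Steps 1-2: find a row at or below `top` with a non-zero entry here.
        pivot_row = next((r for r in range(top, n_rows) if rows[r][col] != 0), None)
        if pivot_row is None:
            continue
        rows[top], rows[pivot_row] = rows[pivot_row], rows[top]
        # Step 3: scale the pivot to 1.
        pivot = rows[top][col]
        rows[top] = [x / pivot for x in rows[top]]
        # Step 4: clear the entries below the pivot.
        for r in range(top + 1, n_rows):
            factor = rows[r][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[top])]
        top += 1  # Step 5: move on to the submatrix below
    return rows

example = row_echelon([[0, 0, -2, 0, 12],
                       [3, 6, -15, 9, 42],
                       [2, 4, -5, 6, -1]])
exercise = row_echelon([[1, 0, -1, 0],
                        [2, 1, 0, 8],
                        [0, 1, -2, 0],
                        [1, -1, -2, -6]])
```

For these two inputs the routine arrives at the same echelon forms as the worked example and the exercise, although a different choice of row operations could give a different (equally valid) echelon form.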
3. Solving a system using Gaussian elimination: To solve the system
x + 3y + 2z = 1
2x + 7y + 3z = 2
−3x − 10y − 6z = −5
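1. we write the augmented matrix:
$$\left[\begin{array}{ccc|c} 1 & 3 & 2 & 1 \\ 2 & 7 & 3 & 2 \\ -3 & -10 & -6 & -5 \end{array}\right]$$
2. by performing appropriate row operations we find an equivalent row echelon form:
R2 − 2R1 → R2, R3 + 3R1 → R3:
$$\left[\begin{array}{ccc|c} 1 & 3 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & -1 & 0 & -2 \end{array}\right]$$
R3 + R2 → R3:
$$\left[\begin{array}{ccc|c} 1 & 3 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & -1 & -2 \end{array}\right]$$
(−1)R3 → R3:
$$\left[\begin{array}{ccc|c} 1 & 3 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 2 \end{array}\right]$$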
3. Use back substitution to find the values of the unknowns, in this case:
z = 2, y = z = 2 and x = 1− 2z − 3y = 1− 4− 6 = −9
So the three planes intersect in a single point: x = −9, y = 2, z = 2.
Note: The pivot in a column does not need to be equal to 1; any non-zero number would do.
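Putting elimination and back substitution together gives a compact solver. The sketch below handles only square systems with exactly one solution and raises an error otherwise; the name `solve_unique` is hypothetical. In line with the note above, it leaves the pivots unscaled and divides by them during back substitution.

```python
from fractions import Fraction

def solve_unique(aug):
    """Gaussian elimination plus back substitution for an n x n system
    with exactly one solution (raises ValueError otherwise)."""
    rows = [[Fraction(x) for x in row] for row in aug]
    n = len(rows)
    for col in range(n):
        pivot_row = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("no unique solution")
        rows[col], rows[pivot_row] = rows[pivot_row], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x

solution = solve_unique([[1, 3, 2, 1],
                         [2, 7, 3, 2],
                         [-3, -10, -6, -5]])
```

Systems with no solution or infinitely many solutions need the separate analysis of the echelon form discussed in the next lecture.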
Exercise:
Solve the systems:
(a) −2a − 2b + 3c = 1
2a − 2b + c = 1
a + b − c = −3
ANS: Solution is: a = −5/2, b = −11/2, c = −5
(b) r + s + 2t = 0
2r + 4s − 3t = 1
3r + 6s − 5t = 0
ANS: Solution is: r = −17, s = 11, t = 3
Example: Find a vector equation for the line which forms the solution set of
x + y − z = 3
2x + y + 2z = 1
(You will recall an example similar to this at the end of lecture 3.)
Writing the augmented matrix of this system and taking the system to row echelon form:
$$\left[\begin{array}{ccc|c} 1 & 1 & -1 & 3 \\ 2 & 1 & 2 & 1 \end{array}\right] \xrightarrow{R_2 - 2R_1 \to R_2} \left[\begin{array}{ccc|c} 1 & 1 & -1 & 3 \\ 0 & -1 & 4 & -5 \end{array}\right] \xrightarrow{(-1)R_2 \to R_2} \left[\begin{array}{ccc|c} 1 & 1 & -1 & 3 \\ 0 & 1 & -4 & 5 \end{array}\right]$$
Here is a system of equations with an infinite solution set.
Notice that the pivot entries correspond to variables x and y.
The non-pivot variable, z, is said to be free and is set equal to a parameter t.
Let z = t
y − 4z = 5 hence y = 5 + 4z = 5 + 4t
x + y − z = 3 hence x = 3 + z − y = −2 − 3t
(parametric form)
The solution can be written in vector form as
(x, y, z) = (−2 − 3t, 5 + 4t, t) = (−2, 5, 0) + t (−3, 4, 1)
or in algebraic form: $\frac{x + 2}{-3} = \frac{y - 5}{4} = \frac{z - 0}{1}$.
This shows that the solution is a straight line passing through the point (−2, 5, 0) and with
direction vector −3i + 4j + k.
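A quick way to gain confidence in a parametric solution is to substitute it back into the original equations for several values of the parameter. The sketch below does this for the line just found; the function names are illustrative.

```python
def line_point(t):
    """Point on the solution line (-2, 5, 0) + t(-3, 4, 1)."""
    return (-2 - 3 * t, 5 + 4 * t, t)

def satisfies_system(point):
    """Check x + y - z = 3 and 2x + y + 2z = 1."""
    x, y, z = point
    return x + y - z == 3 and 2 * x + y + 2 * z == 1

all_on_line = all(satisfies_system(line_point(t)) for t in range(-5, 6))
```

A check over finitely many values of t is not a proof, but since both sides are linear in t, agreement at two distinct values already forces agreement for all t.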
Example: (from the previous lecture) Find an equation of the line of intersection of the
planes −x+ y + z = 2 and x+ 2y = 4.
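Augmented matrix:
$$\left[\begin{array}{ccc|c} -1 & 1 & 1 & 2 \\ 1 & 2 & 0 & 4 \end{array}\right] \xrightarrow{R_2 + R_1 \to R_2} \left[\begin{array}{ccc|c} -1 & 1 & 1 & 2 \\ 0 & 3 & 1 & 6 \end{array}\right]$$ (now in echelon form)
z is free, $y = 2 - \frac{1}{3}z$, $x = -2 + z + y = \frac{2}{3}z$.
Set z = 3t, so that y = 2 − t and x = 2t, and hence (x, y, z) = (0, 2, 0) + t (2, −1, 3). (Compare with the
direction vector found in that example.)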
ENG1091 Consistency of Linear Equations
Lecture 5 · no solution case · infinite solution case Text Reference: §5.6
The equation systems given in the last lecture were rather special in the sense that they all had
solutions.
An example of this is the equation system
x + 2y = 3
2x − 3y = −8,
which consists of two straight lines intersecting in the point (−1, 2).
But of course straight lines do not always intersect. The equation system
x + 2y = 3
2x + 4y = 1
represents two parallel straight lines and has no solution.
The question is how to detect this systematically:
How do we use Gauss elimination to recognise when a system of
equations has no solution?
Notice what happens when we employ Gauss elimination to solve a system of equations like
x + 2y = 3
2x + 4y = 1
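Augmented matrix: $\left[\begin{array}{cc|c} 1 & 2 & 3 \\ 2 & 4 & 1 \end{array}\right]$
Converting to row echelon form (one step only, R2 − 2R1 → R2): $\left[\begin{array}{cc|c} 1 & 2 & 3 \\ 0 & 0 & -5 \end{array}\right]$.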
Notice that in the last row all entries left of the partition are zero, and that there is a non zero
number to the right of the partition. Since it is impossible for 0x + 0y = −5 we know that the
system has no solution.
Definition: A linear system of equations without solution is called inconsistent.
Now of course the previous example didn’t need Gauss elimination to demonstrate its inconsis-
tency. However, a system of 3 equations in 3 unknowns (represented by three planes in space) is
rather more complex. A 3 × 3 system of equations will be inconsistent if any of the following holds:
• the three planes are parallel (and not all the same plane),
• two distinct planes are parallel and are intersected by the third,
• no two of the planes are parallel but each pair of planes intersects in a line parallel to the others.
Geometrically the situation for higher dimensions (>3 unknowns) is more complex still, but
algebraically it is very easy to sort out provided we apply Gauss elimination.
The great advantage of Gauss elimination is that it takes the guess work out of equation manip-
ulation by being systematic. We can tell whether equations are inconsistent or not by using the
following very simple test:
When the augmented matrix corresponding to a system of inconsistent equations
is converted into a row echelon form, there will be at least one row where
all entries left of the partition are zero and there is a non-zero entry to the right
of the partition.
To put it another way, the row echelon form of an inconsistent linear system will have a row of
type $\left[\begin{array}{ccccc|c} 0 & 0 & 0 & \cdots & 0 & * \end{array}\right]$ where ∗ is some non-zero number.
Moreover, the test is completely diagnostic: if no such row exists then the equation system must
have solutions.
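The test is simple enough to express directly in code. This sketch assumes the matrix passed in is already in row echelon form, with the right-hand column as the last entry of each row; the function name `is_inconsistent` is hypothetical.

```python
def is_inconsistent(echelon_rows):
    """True if some row reads [0 0 ... 0 | c] with c non-zero."""
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0
               for row in echelon_rows)

parallel_lines = [[1, 2, 3],
                  [0, 0, -5]]    # echelon form of x + 2y = 3, 2x + 4y = 1
crossing_lines = [[1, 2, 3],
                  [0, -7, -14]]  # echelon form of x + 2y = 3, 2x - 3y = -8
```

A row consisting entirely of zeros, including the right-hand entry, does not trigger the test; it is simply redundant.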
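Example: The following partitioned matrices are row echelon forms corresponding to various
systems of linear equations. Which linear systems are inconsistent?
A. $\left[\begin{array}{ccc|c} 1 & 1 & 2 & 0 \\ 0 & 1 & 13 & 1 \\ 0 & 0 & 0 & 5 \end{array}\right]$
B. $\left[\begin{array}{cccc|c} 1 & 0 & 2 & 0 & 2 \\ 0 & 1 & -3 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$
C. $\left[\begin{array}{ccc|c} 1 & 1 & -3 & 0 \\ 0 & 2 & 1 & 4 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right]$
D. $\left[\begin{array}{ccc|c} 2 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$
E. $\left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 2 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right]$
F. $\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]$
G. $\left[\begin{array}{cccccc|c} 1 & 2 & 0 & 1 & -3 & 1 & 0 \\ 0 & 0 & 0 & 1 & 2 & -3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 5 \end{array}\right]$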
Example: Show that the following system of equations is inconsistent by forming its augmented
matrix and then using row operations convert it to a matrix in row echelon form:
x + 2z = 1
y − z = 0
x + y + z = 2
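Solution
Augmented matrix:
$$\left[\begin{array}{ccc|c} 1 & 0 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 1 & 1 & 1 & 2 \end{array}\right]$$ (not in echelon form)
R3 − R1 → R3:
$$\left[\begin{array}{ccc|c} 1 & 0 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & -1 & 1 \end{array}\right]$$
R3 − R2 → R3:
$$\left[\begin{array}{ccc|c} 1 & 0 & 2 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]$$
The last row, [0 0 0 | 1], indicates inconsistency.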
What is the geometric interpretation of this inconsistent system?
Answer: Since none of the three planes are parallel (why?) we conclude that each pair of planes
intersects in a line parallel to the others.
[Examine the normal vectors (1, 0, 2), (0, 1, −1), (1, 1, 1). Since no two of these are parallel, no two of the
planes are parallel.]
A system of linear equations that does not have solutions is said to be inconsistent, so obviously
a consistent system is one that does have solutions.
Now we encounter a remarkable fact: either a consistent linear system has a unique solution
(exactly one solution for each of the unknowns) or else it possesses infinitely many! To put it
another way, if a linear system of equations is known to have two different solutions (say) then
that system must have infinitely many solutions.
2. Systems with infinitely many solutions:
The augmented matrix of the system
x − 3y + z = 1
2x − 6y + 3z = 4
−x + 3y = 1
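reduces to the following equivalent row-echelon form:
$$\left[\begin{array}{ccc|c} 1 & -3 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{array}\right]$$
Working: Augmented matrix:
$$\left[\begin{array}{ccc|c} 1 & -3 & 1 & 1 \\ 2 & -6 & 3 & 4 \\ -1 & 3 & 0 & 1 \end{array}\right]$$
R2 − 2R1 → R2, R3 + R1 → R3:
$$\left[\begin{array}{ccc|c} 1 & -3 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 1 & 2 \end{array}\right]$$
R3 − R2 → R3:
$$\left[\begin{array}{ccc|c} 1 & -3 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{array}\right]$$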
The echelon form matrix gives us all the information concerning the original system. First of all
we notice there is no row of the type $\left[\begin{array}{ccccc|c} 0 & 0 & 0 & \cdots & 0 & * \end{array}\right]$ where ∗ is non-zero, so we know
that the system has solutions.
The third row is entirely zero and in effect is totally redundant. We ignore rows that consist
entirely of zeros.
From 2nd row we have z = 2.
Solving the first row for x we have x = 1− z + 3y = −1 + 3y (since z = 2).
So z = 2 and x = −1 + 3y where the choice for y is completely arbitrary. There are infinitely
many solutions, one for each value of y.
It is customary to assign a parameter to the free variable y. We can then write the solution set
as y = t, x = −1 + 3t, z = 2, where t is arbitrary.
What is the graphical interpretation of this consistent system?
Answer: The three planes x − 3y + z = 1, 2x − 6y + 3z = 4, and −x + 3y = 1 intersect
in a straight line in 3D space. This line has a vector equation (x, y, z) = (−1 + 3t, t, 2) =
(−1, 0, 2)+t (3, 1, 0) , and therefore passes through the point (−1, 0, 2) and points in the direction
of the vector 3i+ j+ 0k.
Example: Solve the 3× 4 system of linear equations:
2x+ y + z + w = 4
4x+ y + 3z + 2w = 7
−2x+ 2y + z − w = 9
Solution:
We write the system in augmented matrix form and use elementary row operations to convert
the system to an equivalent one in echelon form. (Gauss elimination.)
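Augmented matrix:
$$[A \mid b] = \left[\begin{array}{cccc|c} 2 & 1 & 1 & 1 & 4 \\ 4 & 1 & 3 & 2 & 7 \\ -2 & 2 & 1 & -1 & 9 \end{array}\right]$$
R2 − 2R1 → R2:
$$\left[\begin{array}{cccc|c} 2 & 1 & 1 & 1 & 4 \\ 0 & -1 & 1 & 0 & -1 \\ -2 & 2 & 1 & -1 & 9 \end{array}\right]$$
R3 + R1 → R3:
$$\left[\begin{array}{cccc|c} 2 & 1 & 1 & 1 & 4 \\ 0 & -1 & 1 & 0 & -1 \\ 0 & 3 & 2 & 0 & 13 \end{array}\right]$$
R3 + 3R2 → R3:
$$\left[\begin{array}{cccc|c} 2 & 1 & 1 & 1 & 4 \\ 0 & -1 & 1 & 0 & -1 \\ 0 & 0 & 5 & 0 & 10 \end{array}\right]$$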
This time the pivot variables are x, y and z (since the pivot entries occur in columns 1,2, and 3,
corresponding to the variables x, y, z).
The free variable is w.
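w = free = t (say)
from row 3: 5z = 10 ∴ z = 2
from row 2: −y + z = −1 ∴ y = z + 1 = 3
from row 1: 2x + y + z + w = 4 ∴ $x = 2 - \frac{1}{2}z - \frac{1}{2}y - \frac{1}{2}w = -\frac{1}{2} - \frac{1}{2}t$
Writing the solutions in vector form:
$\langle x, y, z, w\rangle = \left\langle -\frac{1}{2} - \frac{1}{2}t, 3, 2, t\right\rangle = \left\langle -\frac{1}{2}, 3, 2, 0\right\rangle + t\left\langle -\frac{1}{2}, 0, 0, 1\right\rangle$.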
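As with the earlier line example, the one-parameter family of solutions can be substituted back into the three original equations. The sketch below uses exact fractions because x involves halves; the function names are illustrative.

```python
from fractions import Fraction

def solution(t):
    """The solution <-1/2 - t/2, 3, 2, t> for a given parameter value t."""
    t = Fraction(t)
    return (Fraction(-1, 2) - t / 2, Fraction(3), Fraction(2), t)

def satisfies_system(point):
    x, y, z, w = point
    return (2 * x + y + z + w == 4 and
            4 * x + y + 3 * z + 2 * w == 7 and
            -2 * x + 2 * y + z - w == 9)

all_valid = all(satisfies_system(solution(t)) for t in range(-4, 5))
```

Every value of t gives a valid solution, which is what it means for w to be a free variable.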
Exercise:
The row echelon form of a system with unknowns r, s, t, and u, is
$$\left[\begin{array}{cccc|c} 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$
Describe the solutions of the system.
ANS: infinite number of solutions with s and u free: t = 1 − u, r = 1 − s − u
∴ (r, s, t, u) = (1 − s − u, s, 1 − u, u) where s, u are arbitrary.
Exercises: Solve the following systems of linear equations:
(a)
x − y − 2z = 3
x + 2y − z = 0
2x − y + z = 5
x − y − z = 3
ANS: unique solution x = 2, y = −1, z = 0
(b)
x + y + z = 2
x − y + z = 1
2x + 2z = 4
ANS: no solution
(c)
−a + b + c + 2d + e = 0
a − c + d − e = 1
2b + c − d − 2e = −1
ANS: infinite solution set where d and e are free.
Solving for a, b, c we get a = −2 + 6d+ 3e, b = 1− 3d, c = −3 + 7d+ 2e
A matrix is a rectangular arrangement of numbers or variables, which can be either real or
complex, enclosed in square brackets. It is usual to denote matrices using capital letters. For
example:
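$$A = \begin{bmatrix} -2 & \frac{3}{4} & 5 & 0 \\ 7 & \pi & \sqrt{2} & -1 \\ -0.18 & 7 & 20 & -78 \end{bmatrix}$$
is a matrix, but
$$B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & \end{bmatrix}$$
is not a matrix.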
A matrix has rows, running left to right, and columns running from top to bottom.
The matrix A has three rows and four columns and consists of 12 entries.
• A matrix with m rows and n columns is called an m × n matrix; the matrix A in the
example is a 3 × 4 matrix.
• The position of each entry is determined by the row and column numbers. We use
subindices to indicate this, for example,
a24 is the entry in row 2, column 4. In matrix A, a24 = −1.
a13 is the entry in row 1, column 3. In matrix A, a13 = 5.
We use the notation A = [aij] to indicate that A is a matrix (hence the square brackets) whose
entries are generically indicated as aij. The notation A = [aij]m×n means that A is an m × n matrix.
Some special matrices
1. A 1 × n matrix is a row matrix or row vector, e.g. $\begin{bmatrix} 1 & 2 & 4 & 3 \end{bmatrix}$ is a 1 × 4 row vector.
2. An m × 1 matrix is a column matrix or column vector; e.g. $\mathbf{x} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$
is a 3 × 1 column vector. Column matrices are usually identified with ordinary vectors and for this reason it is common to use lower case boldface letters to denote them.
3. A matrix with the same number of rows and columns is called a square matrix; e.g. $\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$ is a 2 × 2 matrix.
4. A zero or null matrix contains all zero entries. That is 0 = [0ij] where 0ij = 0 for all i and j.
Operations with matrices
1. Addition and subtraction
Addition and subtraction are possible only between matrices of the same order. These
are performed by adding or subtracting the corresponding entries respectively.
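Example:
$$\begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix} + \begin{bmatrix} 7 & 12 \\ 6 & 1 \\ -3 & 5 \end{bmatrix} = \begin{bmatrix} 8 & 11 \\ 9 & 6 \\ -7 & 13 \end{bmatrix}$$
The addition of matrices is commutative, i.e. A + B = B + A.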
2. Multiplication by scalars
Given a matrix A and a number k, the product of A with the scalar k is obtained
by multiplying each entry of A by k.
For example let k = 3 and $A = \begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix}$, then
$$3A = 3\begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix} = \begin{bmatrix} 3 & -3 \\ 9 & 15 \\ -12 & 24 \end{bmatrix}$$
Note that subtraction can be expressed in terms of a scalar multiplication (k = −1) and an
addition: A − B = A + (−1)B
For any matrix A, A − A = 0.
3. Multiplication
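Two matrices $A$ and $B$ can be multiplied together only when the number of columns in $A$ equals the number of rows in $B$. To find the $ij$ entry in the product $AB$ we multiply the entries along the $i$th row of $A$ pairwise with the entries down the $j$th column of $B$ and then add. For example, let
\[ A = \begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -1 & 3 \\ 2 & 4 & -2 \end{bmatrix}, \quad C = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]
(a)
\[ AB = \begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix} \begin{bmatrix} 1 & -1 & 3 \\ 2 & 4 & -2 \end{bmatrix} = \begin{bmatrix} 1(1) + (-1)(2) & 1(-1) + (-1)(4) & 1(3) + (-1)(-2) \\ 3(1) + 5(2) & 3(-1) + 5(4) & 3(3) + 5(-2) \\ -4(1) + 8(2) & -4(-1) + 8(4) & -4(3) + 8(-2) \end{bmatrix} = \begin{bmatrix} -1 & -5 & 5 \\ 13 & 17 & -1 \\ 12 & 36 & -28 \end{bmatrix} \]
(b) $AC$ is not defined, since $A$ has 2 columns but $C$ has 3 rows.
(c)
\[ BA = \begin{bmatrix} 1 & -1 & 3 \\ 2 & 4 & -2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 3 & 5 \\ -4 & 8 \end{bmatrix} = \begin{bmatrix} 1(1) + (-1)(3) + 3(-4) & 1(-1) + (-1)(5) + 3(8) \\ 2(1) + 4(3) + (-2)(-4) & 2(-1) + 4(5) + (-2)(8) \end{bmatrix} = \begin{bmatrix} -14 & 18 \\ 22 & 2 \end{bmatrix} \]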
This example demonstrates something very important: matrix multiplication is not usually commutative, i.e. $AB \neq BA$ in general.
In fact $AB$ and $BA$ need not be of the same order, and even if one product $AB$ is defined, the other product $BA$ need not be.
In general, if $A = [a_{ij}]_{m \times p}$ and $B = [b_{ij}]_{p \times n}$ then $AB$ is defined and $AB = C = [c_{ij}]_{m \times n}$ where
\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj}. \]
To illustrate, the entry $c_{ij}$ of the $m \times n$ product comes from row $i$ of the $m \times p$ matrix and column $j$ of the $p \times n$ matrix:
\[ \begin{bmatrix} \cdots & \cdots & \cdots \\ \cdots & c_{ij} & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}_{m \times n} = \begin{bmatrix} \cdots & \cdots & \cdots & \cdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix}_{m \times p} \begin{bmatrix} \cdots & b_{1j} & \cdots \\ \cdots & b_{2j} & \cdots \\ & \vdots & \\ \cdots & b_{pj} & \cdots \end{bmatrix}_{p \times n} \]
So
\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj} \]
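The entry formula above can be sketched directly in code. This is a minimal illustration, assuming matrices are stored as Python lists of rows; the function name `matmul` is a hypothetical helper, not something defined in these notes.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows (illustrative helper)."""
    m, p = len(A), len(A[0])
    if len(B) != p:
        # Size restriction: columns of A must equal rows of B.
        raise ValueError("AB is not defined")
    n = len(B[0])
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

# The matrices A and B from the worked example in this section.
A = [[1, -1], [3, 5], [-4, 8]]
B = [[1, -1, 3], [2, 4, -2]]
AB = matmul(A, B)  # a 3 x 3 matrix
BA = matmul(B, A)  # a 2 x 2 matrix
print(AB)
print(BA)
```

Running this with the matrices of the worked example gives products of different sizes, which is one way to see that $AB$ and $BA$ generally differ.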
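4. Examples:
A $2 \times 2$ matrix times a $2 \times 2$ matrix gives a $2 \times 2$ matrix:
\[ \begin{bmatrix} 2 & 3 \\ -1 & 5 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ -2 & 3 \end{bmatrix} = \begin{bmatrix} 2(-1) + 3(-2) & 2(2) + 3(3) \\ -1(-1) + 5(-2) & -1(2) + 5(3) \end{bmatrix} = \begin{bmatrix} -8 & 13 \\ -9 & 13 \end{bmatrix} \]
A $2 \times 3$ matrix times a $3 \times 3$ matrix gives a $2 \times 3$ matrix:
\[ \begin{bmatrix} 1 & -2 & 3 \\ -4 & 5 & 6 \end{bmatrix} \begin{bmatrix} -1 & 1 & 3 \\ 0 & 2 & -1 \\ 3 & 5 & 4 \end{bmatrix} = \begin{bmatrix} -1 + 0 + 9 & 1 - 4 + 15 & 3 + 2 + 12 \\ 4 + 0 + 18 & -4 + 10 + 30 & -12 - 5 + 24 \end{bmatrix} = \begin{bmatrix} 8 & 12 & 17 \\ 22 & 36 & 7 \end{bmatrix} \]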
5. Important:
• We stress again that to be able to perform the matrix product $AB$ there is a size restriction: the number of columns in $A$ (the matrix on the left) must equal the number of rows in $B$ (the matrix on the right). We then say that $AB$ is defined.
• If $A$ is an $m \times p$ matrix and $B$ is a $p \times n$ matrix, then $AB$ is an $m \times n$ matrix.
6. Properties of matrix multiplication
If $A$, $B$, and $C$ are matrices of appropriate sizes, and $k$ is a scalar, then:
• $A(B + C) = AB + AC$
• $(B + C)A = BA + CA$
• $(AB)C = A(BC)$
• $k(AB) = (kA)B = A(kB)$
• $AB \neq BA$ in general.
Exercises
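1. Find the following product of matrices:
\[ \begin{bmatrix} 2 & 1 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -19 & 23 \end{bmatrix} \]
2. The product in the reverse order, although possible, leads to a different matrix:
\[ \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ -3 & 5 \end{bmatrix} = \begin{bmatrix} 9 & -2 \\ -16 & 18 \end{bmatrix} \]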
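3. Given
\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 7 \\ 8 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix}, \quad D = \begin{bmatrix} 9 & 8 & 7 & 6 \\ 5 & 4 & 3 & 2 \\ 1 & 0 & 9 & 8 \end{bmatrix} \]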
determine which of the following are defined and give their sizes (orders).
(a) AB not defined
(b) AC 2× 3
(c) CD 2× 4
(d) AD not defined
(e) DC not defined
(f) CB 2× 1
(g) BC not defined
(h) (AC)D 2× 4
(i) A(CD) 2× 4
MONASH UNIVERSITY —SCHOOL OF MATHEMATICAL SCIENCES
The transpose of a matrix is obtained by interchanging its rows and columns. That is, the entries
of the ith row become the entries of the ith column.
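So, if $A$ is an $m \times n$ matrix, then its transpose, denoted $A^T$, is an $n \times m$ matrix.
Example:
Let $A$ be the $3 \times 2$ matrix $\begin{bmatrix} 1 & 3 \\ 2 & 4 \\ 5 & 8 \end{bmatrix}$.
Then $A^T$ is the $2 \times 3$ matrix:
\[ A^T = \begin{bmatrix} 1 & 3 \\ 2 & 4 \\ 5 & 8 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & 5 \\ 3 & 4 & 8 \end{bmatrix}. \]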
Transpose of a product
If $A$ and $B$ are such that $AB$ is defined then $(AB)^T = B^T A^T$.
In words: the transpose of a matrix product is equal to the product of individual transposes
taken in the reverse order.
First we illustrate this with an example.
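Let
\[ A = \begin{bmatrix} 1 & -2 \\ 3 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & -1 & 4 \\ 0 & 1 & 3 \end{bmatrix}. \]
We have
\[ AB = \begin{bmatrix} 1 & -2 \\ 3 & 0 \end{bmatrix} \begin{bmatrix} 2 & -1 & 4 \\ 0 & 1 & 3 \end{bmatrix} = \begin{bmatrix} 2 & -3 & -2 \\ 6 & -3 & 12 \end{bmatrix} \]
so that
\[ (AB)^T = \begin{bmatrix} 2 & -3 & -2 \\ 6 & -3 & 12 \end{bmatrix}^T = \begin{bmatrix} 2 & 6 \\ -3 & -3 \\ -2 & 12 \end{bmatrix}. \]
On the other hand
\[ B^T A^T = \begin{bmatrix} 2 & 0 \\ -1 & 1 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ -2 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ -3 & -3 \\ -2 & 12 \end{bmatrix}. \]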
Now we give a proof to show why $(AB)^T = B^T A^T$ is always true.
Let $A = [a_{ij}]_{m \times p}$ and $B = [b_{ij}]_{p \times n}$.
Then $AB$ is an $m \times n$ matrix and for any $i = 1, \ldots, m$ and $j = 1, \ldots, n$ we have
\[ (AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj}. \]
Now the $(i, j)$ entry of $(AB)^T$ is the $(j, i)$ entry of $AB$; this is found by swapping $i$'s and $j$'s in the formula for $(AB)_{ij}$:
\[ \left((AB)^T\right)_{ij} = \sum_{k=1}^{p} a_{jk}b_{ki} = a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jp}b_{pi} = b_{1i}a_{j1} + b_{2i}a_{j2} + \cdots + b_{pi}a_{jp} = \sum_{k=1}^{p} b_{ki}a_{jk} \]
= the sum of products found by multiplying, term by term, the $i$th row of $B^T$ with the $j$th column of $A^T$, and this is the $(i, j)$ entry of $B^T A^T$.
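The identity $(AB)^T = B^T A^T$ can also be checked numerically. The sketch below assumes matrices are stored as lists of rows; `matmul` and `transpose` are hypothetical helper names introduced only for this illustration, and the matrices are the ones from the worked example above.

```python
def matmul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*M)]

A = [[1, -2], [3, 0]]
B = [[2, -1, 4], [0, 1, 3]]

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T
print(lhs)
print(rhs)
```

A single numerical agreement is of course not a proof; it only illustrates the statement that the proof establishes in general.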
Special matrices
Some types of matrices that are particularly important are given below. This list is not exhaus-
tive.
• A symmetric matrix is one which is equal to its transpose, e.g. $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}$
• Diagonal matrices are square matrices where any non-zero entries occur on the main diagonal, e.g. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
• Identity matrices are square matrices where the main diagonal entries are all 1's and all other entries are 0. For example
\[ I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
If the size of the identity matrix can be understood from the context, or is irrelevant, the symbol $I$ is used.
If $A$ is a square matrix and $I$ is the identity matrix of the same size as $A$, then $AI = IA = A$.
Identity matrices play a role analogous to the number 1 in ordinary arithmetic.
The inverse of a matrix
Definition: The inverse of a square $n \times n$ matrix $A$ is an $n \times n$ matrix $B$ (if one exists) such that $AB = BA = I$, where $I$ is the $n \times n$ identity matrix.
Note: If such a $B$ exists it is unique and we write it as $A^{-1}$.
Warning: $A^{-1}$ does not mean $\frac{1}{A}$.
If $A$ has an inverse, then we say that the matrix $A$ is invertible or non-singular.
We can calculate $A^{-1}$ by forming the augmented matrix $[A \mid I]$ consisting of $A$ and the identity matrix $I$. We then apply a systematic sequence of row operations to the whole augmented matrix until, if possible, $A$ becomes $I$. Because exactly the same sequence of row operations has been applied to $I$, in the process $I$ will have become $A^{-1}$.
Schematically:
\[ [A \mid I] \xrightarrow{\text{row operations}} [I \mid A^{-1}] \]
Recall that elementary row operations are:
1. Interchanging two rows.
2. Multiplying a row by a non-zero scalar.
3. Adding to one row a multiple of another.
STAGE 1: Forward elimination process
1. Let $C = [A \mid I]$.
2. If at any stage $C$ has a column consisting entirely of zeros, we stop immediately: $A$ has no inverse.
3. Ensure that the top left entry of $C$ is a non-zero entry, which we will label as $a$. (If necessary, interchange the top row with another row to achieve this.)
4. Multiply this row by $\frac{1}{a}$ so that the first non-zero entry of this row is 1. This entry is the pivot for that column. (Alternatively this can sometimes be effected by a row interchange.)
5. Add suitable multiples of this first row to the rows below it so that all entries in the column below the pivot become 0.
If at any stage the submatrix of $C$ left of the partition has a row consisting entirely of zeros, we stop immediately: the matrix $A$ has no inverse.
6. Consider the submatrix of $C$ found by removing its 1st row and 1st column, and regard this as a new matrix $C$. Repeat steps 2-6 until the next submatrix under consideration has no rows left.
7. Provided the algorithm has not been exited at step 2 or step 5, the full matrix is now in echelon form. The pivots are all 1 and are located on the main diagonal of the matrix left of the partition.
STAGE 2: Backward elimination process
1. Notice that all pivots are 1 and are located on the main diagonal of the matrix left of the partition. Locate the row containing the right-most pivot (which must be in the bottom row).
2. Add suitable multiples of this row to the rows above so that all entries in the column above the pivot become 0.
3. Locate the next pivot by moving up the diagonal and repeat steps 2 and 3.
4. This procedure is repeated until the top left pivot is reached, at which point the full matrix is $[I \mid A^{-1}]$.
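The two stages above can be sketched in code. This is a minimal illustration rather than a definitive implementation: it assumes the matrix is a list of rows with integer or rational entries, it uses exact `Fraction` arithmetic to avoid rounding, and the function name `inverse` is a hypothetical helper. It tests for a missing pivot column by column, which plays the role of the "column of zeros" exit in Stage 1.

```python
from fractions import Fraction

def inverse(A):
    """Return the inverse of square matrix A via [A | I] -> [I | A^-1], or None."""
    n = len(A)
    # Form the augmented matrix C = [A | I] with exact rational entries.
    C = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    # Stage 1: forward elimination.
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot_row = next((r for r in range(col, n) if C[r][col] != 0), None)
        if pivot_row is None:
            return None  # the current submatrix has a zero column: no inverse
        C[col], C[pivot_row] = C[pivot_row], C[col]  # row interchange
        a = C[col][col]
        C[col] = [x / a for x in C[col]]             # scale so the pivot is 1
        for r in range(col + 1, n):                  # clear entries below the pivot
            factor = C[r][col]
            C[r] = [x - factor * y for x, y in zip(C[r], C[col])]
    # Stage 2: backward elimination, starting from the right-most pivot.
    for col in range(n - 1, -1, -1):
        for r in range(col):                         # clear entries above the pivot
            factor = C[r][col]
            C[r] = [x - factor * y for x, y in zip(C[r], C[col])]
    return [row[n:] for row in C]                    # right-hand block is A^-1

print(inverse([[2, 7, 1], [1, 4, -1], [1, 3, 0]]))
print(inverse([[1, 1, -1], [1, 1, 0], [1, 1, 1]]))
```

The sketch may choose different row interchanges from a hand calculation, but since the inverse is unique the final answer is the same.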
Examples:
Find inverses of the following (if they exist).
1. A =
0 1 1
109
0√
2 −4
0 3 π
, A has a column of zeros and hence no inverse.
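2. $A = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$
Step 1. We form
\[ C = [A \mid I] = \left[\begin{array}{rrr|rrr} 1 & 1 & -1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{array}\right] \]
Steps 3-4. We note that $C$ has a pivot in the top left entry and that this pivot is 1.
Step 5. Subtract row 1 from row 2:
\[ \left[\begin{array}{rrr|rrr} 1 & 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{array}\right] \]
Subtract row 1 from row 3:
\[ \left[\begin{array}{rrr|rrr} 1 & 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 2 & -1 & 0 & 1 \end{array}\right] \]
Step 5 is now complete.
Step 6. We apply the algorithm again to the submatrix of $C$ found by deleting its 1st row and 1st column:
\[ \left[\begin{array}{rr|rrr} 0 & 1 & -1 & 1 & 0 \\ 0 & 2 & -1 & 0 & 1 \end{array}\right] \]
but since this new matrix has a column of zeros, we conclude that the matrix $\begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$ has no inverse. (Exiting the algorithm at step 2.)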
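3. Find the inverse of $A = \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix}$.
Solution:
\[ [A \mid I] = \left[\begin{array}{rrr|rrr} 2 & 7 & 1 & 1 & 0 & 0 \\ 1 & 4 & -1 & 0 & 1 & 0 \\ 1 & 3 & 0 & 0 & 0 & 1 \end{array}\right] \]
$R_1 \leftrightarrow R_2$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 2 & 7 & 1 & 1 & 0 & 0 \\ 1 & 3 & 0 & 0 & 0 & 1 \end{array}\right] \]
$R_2 - 2R_1 \to R_2$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & -1 & 3 & 1 & -2 & 0 \\ 1 & 3 & 0 & 0 & 0 & 1 \end{array}\right] \]
$R_3 - R_1 \to R_3$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & -1 & 3 & 1 & -2 & 0 \\ 0 & -1 & 1 & 0 & -1 & 1 \end{array}\right] \]
$(-1)R_2 \to R_2$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & 1 & -3 & -1 & 2 & 0 \\ 0 & -1 & 1 & 0 & -1 & 1 \end{array}\right] \]
$R_2 + R_3 \to R_3$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & 1 & -3 & -1 & 2 & 0 \\ 0 & 0 & -2 & -1 & 1 & 1 \end{array}\right] \]
$\left(-\frac{1}{2}\right)R_3 \to R_3$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & 1 & -3 & -1 & 2 & 0 \\ 0 & 0 & 1 & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{array}\right] \]
$R_2 + 3R_3 \to R_2$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & 1 & 0 & \frac{1}{2} & \frac{1}{2} & -\frac{3}{2} \\ 0 & 0 & 1 & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{array}\right] \]
$R_1 + R_3 \to R_1$:
\[ \left[\begin{array}{rrr|rrr} 1 & 4 & 0 & \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 & 0 & \frac{1}{2} & \frac{1}{2} & -\frac{3}{2} \\ 0 & 0 & 1 & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{array}\right] \]
$R_1 - 4R_2 \to R_1$:
\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -\frac{3}{2} & -\frac{3}{2} & \frac{11}{2} \\ 0 & 1 & 0 & \frac{1}{2} & \frac{1}{2} & -\frac{3}{2} \\ 0 & 0 & 1 & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{array}\right] = [I \mid A^{-1}]. \]
Hence
\[ A^{-1} = \begin{bmatrix} -\frac{3}{2} & -\frac{3}{2} & \frac{11}{2} \\ \frac{1}{2} & \frac{1}{2} & -\frac{3}{2} \\ \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{bmatrix} \]
Check:
\[ \frac{1}{2} \begin{bmatrix} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -6 - 3 + 11 & -21 - 12 + 33 & -3 + 3 + 0 \\ 2 + 1 - 3 & 7 + 4 - 9 & 1 - 1 + 0 \\ 2 - 1 - 1 & 7 - 4 - 3 & 1 + 1 + 0 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]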
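Strictly speaking we should also check that
\[ \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix} \begin{bmatrix} -\frac{3}{2} & -\frac{3}{2} & \frac{11}{2} \\ \frac{1}{2} & \frac{1}{2} & -\frac{3}{2} \\ \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]
however it is a known fact for square matrices that a left inverse is also a right inverse (and vice versa), so a one-sided check is sufficient.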
Inverses of 2× 2 matrices
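Example: find the inverse of the matrix $\begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix}$.
Solution:
\[ [A \mid I] = \left[\begin{array}{rr|rr} 2 & 4 & 1 & 0 \\ 1 & 3 & 0 & 1 \end{array}\right] \]
$\frac{1}{2}R_1 \to R_1$:
\[ \left[\begin{array}{rr|rr} 1 & 2 & \frac{1}{2} & 0 \\ 1 & 3 & 0 & 1 \end{array}\right] \]
$R_2 - R_1 \to R_2$:
\[ \left[\begin{array}{rr|rr} 1 & 2 & \frac{1}{2} & 0 \\ 0 & 1 & -\frac{1}{2} & 1 \end{array}\right] \]
$R_1 - 2R_2 \to R_1$:
\[ \left[\begin{array}{rr|rr} 1 & 0 & \frac{3}{2} & -2 \\ 0 & 1 & -\frac{1}{2} & 1 \end{array}\right]. \]
Hence
\[ \begin{bmatrix} 2 & 4 \\ 1 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{3}{2} & -2 \\ -\frac{1}{2} & 1 \end{bmatrix}. \]
However there is a simple formula for the inverse of $2 \times 2$ matrices.
The inverse of a $2 \times 2$ matrix:
Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, and suppose $ad - bc \neq 0$. Then $A$ is invertible and
\[ A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
The number $ad - bc$ is called the determinant of $A$ and is denoted by $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$ or $\det(A)$.
The determinant of any square matrix $A$ is also defined (see next lecture) and this number determines whether or not $A$ is invertible:
A square matrix $A$ is invertible if and only if its determinant is non-zero.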
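The $2 \times 2$ formula translates almost directly into code. The sketch below assumes integer entries so that `Fraction` can give exact results; the function name `inverse_2x2` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] by the determinant formula, or None."""
    det = a * d - b * c
    if det == 0:
        return None  # determinant zero: the matrix is not invertible
    f = Fraction(1, det)
    return [[d * f, -b * f], [-c * f, a * f]]

print(inverse_2x2(2, 4, 1, 3))  # the matrix from the example above
print(inverse_2x2(1, 2, 2, 4))  # determinant is 0
```

For the matrix of the example this reproduces the result obtained by row operations, with far less work.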
Using matrix methods to solve linear systems of equations
Consider the $3 \times 3$ linear system:
\[ \begin{aligned} 2x_1 + 7x_2 + x_3 &= 1 \\ x_1 + 4x_2 - x_3 &= 4 \\ x_1 + 3x_2 &= 5 \end{aligned} \]
which can also be written in matrix form
\[ \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \\ 5 \end{bmatrix}. \]
Any $n \times n$ linear system can be written in the form $A\mathbf{x} = \mathbf{b}$, where $\mathbf{x}$ and $\mathbf{b}$ are column vectors (matrices).
If $A$ is invertible we can multiply on the left by $A^{-1}$ and so obtain the unknown matrix $\mathbf{x}$:
\[ \begin{aligned} A\mathbf{x} &= \mathbf{b} \\ A^{-1}A\mathbf{x} &= A^{-1}\mathbf{b} \\ I\mathbf{x} &= A^{-1}\mathbf{b} \\ \text{giving } \mathbf{x} &= A^{-1}\mathbf{b} \end{aligned} \]
This method is somewhat more restrictive than Gaussian elimination. It only works for $n \times n$ systems: it produces the unique solution when $\det(A) \neq 0$, but it is incapable of distinguishing between the no-solution and infinitely-many-solutions cases which occur when $\det(A) = 0$.
The main advantage of the matrix inverse method occurs when working with several systems of equations that share the same set of coefficients.
Example:
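Solve:
(a)
\[ \begin{aligned} 2x_1 + 7x_2 + x_3 &= 1 \\ x_1 + 4x_2 - x_3 &= 4 \\ x_1 + 3x_2 &= 5 \end{aligned} \]
and (b)
\[ \begin{aligned} 2x_1 + 7x_2 + x_3 &= -2 \\ x_1 + 4x_2 - x_3 &= 4 \\ x_1 + 3x_2 &= 6 \end{aligned} \]
In (a) we have
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 4 \\ 5 \end{bmatrix}, \]
and in (b)
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix}^{-1} \begin{bmatrix} -2 \\ 4 \\ 6 \end{bmatrix}. \]
Now
\[ \begin{bmatrix} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{bmatrix}^{-1} = \frac{1}{2} \begin{bmatrix} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{bmatrix} \quad \text{(shown above)}, \]
giving the solution to (a):
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 20 \\ -5 \\ -4 \end{bmatrix} \]
and to (b):
\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} -2 \\ 4 \\ 6 \end{bmatrix} = \begin{bmatrix} 30 \\ -8 \\ -6 \end{bmatrix} \]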
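Exercise: Solve the following system of equations using matrix inversion followed by matrix multiplication:
\[ \begin{aligned} 2x + 3y &= 7 \\ 4x + y &= 3 \end{aligned} \]
In matrix form:
\[ \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ 3 \end{bmatrix} \]
If $\begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}^{-1}$ exists we may write
\[ \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 7 \\ 3 \end{bmatrix} \]
Now $\begin{vmatrix} 2 & 3 \\ 4 & 1 \end{vmatrix} = 2 - 12 = -10 \neq 0$, so $\begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}^{-1}$ exists.
The formula for the inverse of a $2 \times 2$ matrix gives
\[ \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}^{-1} = \frac{1}{2 - 12} \begin{bmatrix} 1 & -3 \\ -4 & 2 \end{bmatrix} = \begin{bmatrix} -\frac{1}{10} & \frac{3}{10} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix} \]
So
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -\frac{1}{10} & \frac{3}{10} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix} \begin{bmatrix} 7 \\ 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{5} \\ \frac{11}{5} \end{bmatrix}, \]
giving $x = 1/5$ and $y = 11/5$.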
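The advantage of reusing one inverse for several right-hand sides, as in the two $3 \times 3$ systems (a) and (b) earlier in this lecture, can be sketched as follows. The inverse is the one computed in the notes; the helper name `apply` is hypothetical, and exact `Fraction` arithmetic is an assumption made to keep the results free of rounding.

```python
from fractions import Fraction

half = Fraction(1, 2)
# A^-1 = (1/2) [[-3, -3, 11], [1, 1, -3], [1, -1, -1]], as found earlier.
A_inv = [[half * x for x in row]
         for row in [[-3, -3, 11], [1, 1, -3], [1, -1, -1]]]
A = [[2, 7, 1], [1, 4, -1], [1, 3, 0]]

def apply(M, v):
    """Multiply matrix M (list of rows) by column vector v (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# One inverse, two right-hand sides.
x_a = apply(A_inv, [1, 4, 5])
x_b = apply(A_inv, [-2, 4, 6])
print(x_a, x_b)
# Substituting back into A x should reproduce each right-hand side.
print(apply(A, x_a), apply(A, x_b))
```

The inverse is computed once and each new right-hand side costs only one matrix-vector product.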
MONASH UNIVERSITY —SCHOOL OF MATHEMATICAL SCIENCES
and say “the limit of $f(x)$, as $x$ approaches $a$, equals $L$” if we can make the values of $f(x)$ arbitrarily close to $L$ (as close to $L$ as we like) by taking $x$ to be sufficiently close to $a$ (on either side of $a$) but not equal to $a$.
Notice the phrase “but $x \neq a$” in the definition. This means that in finding the limit of $f(x)$ as $x$ approaches $a$, we are not interested in the value of the function at $x = a$. In fact $f(x)$ may not even be defined when $x = a$. The only thing that matters is how $f(x)$ is defined near $a$.
Illustrative example: Use a calculator and guess the value of $\lim_{x \to 0} \frac{\sin x}{x}$.
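The calculator experiment can also be sketched in a few lines of code. This is only a numerical illustration (it suggests a value for the limit but does not prove it); the function name `ratio` is a hypothetical helper.

```python
import math

def ratio(x):
    """sin(x)/x for x != 0, with x in radians."""
    return math.sin(x) / x

# Approach 0 from both sides; x = 0 itself is never used.
for x in (0.5, 0.1, 0.01, 0.001, -0.001, -0.01):
    print(f"x = {x:7.3f}   sin(x)/x = {ratio(x):.8f}")
```

The printed values settle towards a single number as $x$ gets closer to 0 from either side, which is the behaviour the definition of a limit describes.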
The limit laws.
The limit laws are listed below. Essentially they allow ‘common sense’ manipulation of limit expressions, following normal algebraic operations, e.g. the limit of a sum is the same as the sum of the limits. It is important to note that these laws can only be applied when each of the functions being combined has a limit.
Suppose that $c$ is a constant and the limits $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist. Then
1. $\lim_{x \to a} [f(x) + g(x)] = \lim_{x \to a} f(x) + \lim_{x \to a} g(x)$
2. $\lim_{x \to a} [f(x) - g(x)] = \lim_{x \to a} f(x) - \lim_{x \to a} g(x)$
3. $\lim_{x \to a} [c f(x)] = c \lim_{x \to a} f(x)$
4. $\lim_{x \to a} [f(x) \times g(x)] = \lim_{x \to a} f(x) \times \lim_{x \to a} g(x)$
5. $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$ if $\lim_{x \to a} g(x) \neq 0$.
6. To evaluate limits we will make frequent use of the continuous function rule:
Suppose $\lim_{x \to a} g(x) = b$ and $f$ is continuous at $b$.
Then $\lim_{x \to a} f(g(x)) = f\left(\lim_{x \to a} g(x)\right) = f(b)$.
To make effective use of rule 6 we will take it as known that the elementary functions (polynomial, exponential, logarithmic, trigonometric and hyperbolic functions) are continuous on their respective domains.
Examples: (examples 1-4 are evaluated using the limit laws above)
1. Evaluate $\lim_{x \to 1} \frac{x^2(6x + 3)(2x - 7)}{(x^3 + 4)(x + 17)}$.
\[ \lim_{x \to 1} \frac{x^2(6x + 3)(2x - 7)}{(x^3 + 4)(x + 17)} = \frac{\lim_{x \to 1} x^2(6x + 3)(2x - 7)}{\lim_{x \to 1} (x^3 + 4)(x + 17)} \quad \text{by rule 5, since } \lim_{x \to 1} (x^3 + 4)(x + 17) \neq 0 \]
\[ = \frac{1(9)(-5)}{(5)(18)} = -\frac{1}{2}. \]
In example 1 we could have found the limit by merely substituting in the value $x = 1$. If we could always evaluate limits by doing this, the concept of a limit would be superfluous. However, the notion of a limit of a function $f(x)$ as $x \to a$ is most useful when $f(x)$ is undefined at $x = a$.
2. Find $\lim_{x \to 1} \frac{\frac{1}{x} - x}{\frac{1}{x} - 1}$.
Multiplying numerator and denominator by $x$:
\[ \lim_{x \to 1} \frac{\frac{1}{x} - x}{\frac{1}{x} - 1} = \lim_{x \to 1} \frac{1 - x^2}{1 - x} = \lim_{x \to 1} \frac{(1 - x)(1 + x)}{1 - x} = \lim_{x \to 1} (1 + x) = 2 \]
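3. Find $\lim_{x \to 0} \frac{\frac{1}{x + 4} - \frac{1}{4}}{x}$.
\[ \lim_{x \to 0} \frac{\frac{1}{x + 4} - \frac{1}{4}}{x} = \lim_{x \to 0} \frac{4 - (x + 4)}{4x(x + 4)} = \lim_{x \to 0} \frac{-x}{4x(x + 4)} = \lim_{x \to 0} \frac{-1}{4(x + 4)} = -1 \div \lim_{x \to 0} 4(x + 4) = -\frac{1}{16} \]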
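4. Find $\lim_{t \to 0} \frac{\sqrt{2 - t} - \sqrt{2}}{t}$.
\[ \lim_{t \to 0} \frac{\sqrt{2 - t} - \sqrt{2}}{t} = \lim_{t \to 0} \frac{\sqrt{2 - t} - \sqrt{2}}{t} \times \frac{\sqrt{2 - t} + \sqrt{2}}{\sqrt{2 - t} + \sqrt{2}} = \lim_{t \to 0} \frac{2 - t - 2}{t\left(\sqrt{2 - t} + \sqrt{2}\right)} = \lim_{t \to 0} \frac{-t}{t\left(\sqrt{2 - t} + \sqrt{2}\right)} \]
\[ = \lim_{t \to 0} \frac{-1}{\sqrt{2 - t} + \sqrt{2}} = -1 \div \lim_{t \to 0} \left(\sqrt{2 - t} + \sqrt{2}\right) = -\frac{1}{2\sqrt{2}} \]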
The last four examples demonstrate the use of algebra in evaluating limits. However, in evaluating most limits the use of algebra alone will not be sufficient. The next technique we introduce is much more powerful than algebraic methods.
Indeterminate forms and L’Hopital’s rule
Applying the limit techniques (particularly direct substitution) discussed earlier can often lead to ‘meaningless’ expressions of the type $\frac{0}{0}$ or $\frac{\infty}{\infty}$. These are called indeterminate forms, since they have not correctly determined the true limit value.
However, if we ‘zoom in’ around $x = a$ for two functions $f$ and $g$ such that $f(a) = g(a) = 0$, we can see that $\frac{f(x)}{g(x)} \approx \frac{f'(x)}{g'(x)}$.
(Diagram: graphs of $f$ and $g$ near $x = a$, where both are zero.)
This forms the basis of a rule known as L’Hopital’s Rule: Suppose $f$ and $g$ are differentiable, with $f(a) = g(a) = 0$. If $f'$ and $g'$ are continuous (but $g'(x) \neq 0$), then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \]
This rule can be applied for two-sided and one-sided limits, approaching a fixed value $a$ or $\pm\infty$, which give the indeterminate form $\frac{0}{0}$ or $\frac{\infty}{\infty}$. To reduce expressions to a meaningful term, it may be necessary to apply L’Hopital’s Rule two or more times.
Examples
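$\lim_{x \to 0} \frac{\sin 2x}{x}$ is of the form ‘$\frac{0}{0}$’ so that L’Hopital’s rule may be applied:
\[ \lim_{x \to 0} \frac{\sin 2x}{x} = \lim_{x \to 0} \frac{2\cos(2x)}{1} = \frac{2}{1} = 2 \]
$\lim_{x \to \infty} \frac{\ln x}{x}$ is of the form ‘$\frac{\infty}{\infty}$’ so that L’Hopital’s rule may be applied:
\[ \lim_{x \to \infty} \frac{\ln x}{x} = \lim_{x \to \infty} \frac{\left(\frac{1}{x}\right)}{1} = \lim_{x \to \infty} \frac{1}{x} = 0 \]
$\lim_{x \to 0} \frac{x - \sin x}{x^3}$ is of the form ‘$\frac{0}{0}$’ so that L’Hopital’s rule may be applied:
\[ \lim_{x \to 0} \frac{x - \sin x}{x^3} = \lim_{x \to 0} \frac{1 - \cos x}{3x^2} \quad \text{(of the form ‘}\tfrac{0}{0}\text{’, so L’Hopital’s rule may be applied again)} \]
\[ = \lim_{x \to 0} \frac{\sin x}{6x} \quad \text{(again of the form ‘}\tfrac{0}{0}\text{’)} \]
\[ = \lim_{x \to 0} \frac{\cos x}{6} = \frac{1}{6} \]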
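A quick numerical check of the last example can be reassuring after several applications of the rule. The sketch below simply evaluates the original quotient near 0; the function name `f` is a hypothetical helper, and very small values of $x$ are avoided because floating-point cancellation in $x - \sin x$ would spoil the result.

```python
import math

def f(x):
    """The quotient (x - sin x) / x^3 for x != 0."""
    return (x - math.sin(x)) / x**3

for x in (0.5, 0.1, 0.01):
    print(f"x = {x:5.2f}   f(x) = {f(x):.8f}")
print("1/6 =", 1 / 6)
```

The values approach the limit found by L’Hopital’s rule as $x$ shrinks.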
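There are other types of indeterminate forms, involving combinations of 0 and $\infty$, dealt with as follows:
Indeterminate Product $0 \cdot \infty$
If $\lim_{x \to a} f(x)g(x)$ has the form $0 \cdot \infty$, re-arrange $f(x)g(x)$ to $\frac{f(x)}{1/g(x)}$, then apply L’Hopital’s Rule.
$\lim_{x \to 0^+} x \ln x$ is of the form ‘$0 \cdot \infty$’ so that some rearrangement is necessary:
\[ \lim_{x \to 0^+} x \ln x = \lim_{x \to 0^+} \frac{\ln x}{(1/x)} \quad \text{(now of the form ‘}\tfrac{\infty}{\infty}\text{’, so L’Hopital’s rule may be applied)} \]
\[ = \lim_{x \to 0^+} \frac{x^{-1}}{-x^{-2}} = \lim_{x \to 0^+} \frac{-x^2}{x} = \lim_{x \to 0^+} (-x) = 0 \]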
Indeterminate Difference ∞−∞
If $\lim_{x \to a} [f(x) - g(x)]$ has the form $\infty - \infty$, convert the expression to a single fraction, using common denominators, factorisation, or rationalisation, to produce a $\frac{0}{0}$ or $\frac{\infty}{\infty}$ form. Then apply L’Hopital’s Rule.
Examples
$\lim_{x \to 0} \left[\frac{1}{x} - \frac{1}{\sin x}\right]$ is of the form ‘$\infty - \infty$’ so that some rearrangement is necessary
Example: What is wrong with the following calculation?
\[ \int_{-1}^{1} \frac{1}{x^2}\,dx = \left[-x^{-1}\right]_{-1}^{1} = -1 - 1 = -2 \]
ANS: The function $f(x) = \frac{1}{x^2}$ is always positive. The definite integral of a positive function can never be negative. (Definite integrals give the ‘signed’ area between a curve and the $x$-axis. For a curve which is always positive this signed area must also be positive.)
We have applied the Fundamental Theorem of Calculus in circumstances where we were not entitled to do so.
The Fundamental Theorem of Calculus, which enables us to evaluate a definite integral by taking an antiderivative of the integrand, requires that the integrand be continuous over a finite domain of integration $[a, b]$.
The function $\frac{1}{x^2}$ is not continuous on the domain $[-1, 1]$. (In fact it is not even defined at every point of $[-1, 1]$: it is undefined at $x = 0$.)
If we break the integral up we obtain
\[ \int_{-1}^{1} \frac{1}{x^2}\,dx = \int_{0}^{1} \frac{1}{x^2}\,dx + \int_{-1}^{0} \frac{1}{x^2}\,dx \]
However this introduces a new problem. The integrands in both these integrals are not Riemann integrable because they are not bounded. (The function $\frac{1}{x^2}$ is unbounded near $x = 0$.)
We can extend the theory of Riemann integration by introducing ‘improper integrals’.
There are two types of improper integrals:
• an expression like $\int_{1}^{\infty} e^x\,dx$ is improper because the domain of integration, in this case $[1, \infty)$, is not bounded;
• expressions like $\int_{0}^{1} \frac{1}{x^2}\,dx$ where the range of the integrand is unbounded on the interval of integration. (In this case the function $\frac{1}{x^2}$ is unbounded on $(0, 1]$.)
When the domain of integration is not finite we have a Type 1 improper integral.
When the integrand is unbounded at a particular point, but continuous elsewhere, we have a Type 2 improper integral.
Type 1: Infinite intervals
For these integrals, we are attempting to find the area of an ‘infinite space’. To do this, we
evaluate the definite integral over a finite interval, and investigate the limit of the integral as the
interval is extended.
Example: $\int_{1}^{\infty} \frac{1}{x^2}\,dx$ [geometrically this is the area under the curve $y = \frac{1}{x^2}$ to the right of $x = 1$].
\[ \int_{1}^{\infty} \frac{1}{x^2}\,dx = \lim_{t \to \infty} \int_{1}^{t} \frac{1}{x^2}\,dx = \lim_{t \to \infty} \left[-x^{-1}\right]_{1}^{t} = \lim_{t \to \infty} \left(-\frac{1}{t}\right) + 1 = 1. \]
We say that $\int_{1}^{\infty} \frac{1}{x^2}\,dx$ is convergent.
We use the following definitions to evaluate these integrals:
To define $\int_{a}^{\infty} f(x)\,dx$ we require two things: (i) that $\int_{a}^{t} f(x)\,dx$ exists for every number $t \geq a$, and (ii) that $\lim_{t \to \infty} \int_{a}^{t} f(x)\,dx$ exists and is finite.
We then say that $\int_{a}^{\infty} f(x)\,dx$ converges.
Provided these two conditions are satisfied we define
\[ \int_{a}^{\infty} f(x)\,dx = \lim_{t \to \infty} \int_{a}^{t} f(x)\,dx. \]
A similar statement can be made regarding the definition of $\int_{-\infty}^{a} f(x)\,dx$.
The integral $\int_{-\infty}^{\infty} f(x)\,dx$ is also considered a Type 1 integral; we define
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{\infty} f(x)\,dx \]
provided the two improper integrals on the right are convergent independently.
Note: In each of these cases, if the limit exists, we say that the improper integral is convergent and the limit is the value of the improper integral. If the limit fails to exist, the improper integral is divergent.
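Example
Determine if $\int_{0}^{\infty} e^{-2x}\,dx$ is convergent or divergent.
\[ \int_{0}^{\infty} e^{-2x}\,dx = \lim_{t \to \infty} \int_{0}^{t} e^{-2x}\,dx = \lim_{t \to \infty} \left[-\frac{1}{2}e^{-2x}\right]_{0}^{t} = \lim_{t \to \infty} \left(-\frac{1}{2}e^{-2t}\right) + \frac{1}{2} = \frac{1}{2}. \]
(The integral is convergent.)
Example
Determine if $\int_{1}^{\infty} \frac{1}{x}\,dx$ is convergent or divergent.
\[ \int_{1}^{\infty} \frac{1}{x}\,dx = \lim_{t \to \infty} \int_{1}^{t} \frac{1}{x}\,dx = \lim_{t \to \infty} \left[\log_e x\right]_{1}^{t} = \lim_{t \to \infty} (\log_e t) - 0 = \infty \]
since $\log_e x$ is an unbounded function as $x \to \infty$.
This means the integral $\int_{1}^{\infty} \frac{1}{x}\,dx$ diverges.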
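Example
For what values of $p$ is $\int_{1}^{\infty} \frac{1}{x^p}\,dx$ convergent?
The case where $p = 1$ was considered in the previous example, so we know that $\int_{1}^{\infty} \frac{1}{x^p}\,dx$ diverges when $p = 1$.
Now consider what happens if $p \neq 1$:
\[ \int_{1}^{\infty} \frac{1}{x^p}\,dx = \lim_{t \to \infty} \int_{1}^{t} \frac{1}{x^p}\,dx = \lim_{t \to \infty} \left[\frac{1}{1 - p}x^{-p + 1}\right]_{1}^{t} = \lim_{t \to \infty} \left(\frac{1}{1 - p}t^{-p + 1}\right) - \frac{1}{1 - p} \quad \text{(provided } p \neq 1\text{)} \]
Now $\frac{1}{1 - p}t^{-p + 1} \to \infty$ as $t \to \infty$ if $p < 1$, while $\frac{1}{1 - p}t^{-p + 1} \to 0$ as $t \to \infty$ if $p > 1$.
We conclude $\int_{1}^{\infty} \frac{1}{x^p}\,dx$ converges if $p > 1$ and diverges if $p \leq 1$.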
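The behaviour of the partial integrals $\int_1^t x^{-p}\,dx$ as $t$ grows can be tabulated with a short sketch. It uses the closed-form antiderivatives from the examples above; the function name `partial_integral` is a hypothetical helper.

```python
import math

def partial_integral(p, t):
    """Value of the integral of 1/x**p from 1 to t, using the antiderivative."""
    if p == 1:
        return math.log(t)
    return (t ** (1 - p) - 1) / (1 - p)

# Watch how each partial integral behaves as the upper limit t increases.
for p in (0.5, 1, 2):
    values = [round(partial_integral(p, t), 4) for t in (10, 1e3, 1e6)]
    print(f"p = {p}: {values}")
```

For $p = 2$ the values level off, while for $p = 1$ and $p = 0.5$ they keep growing, in line with the conclusion above. A table like this only suggests the behaviour; the limit argument is what establishes it.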
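Example
Evaluate $\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx$ or else explain why the integral diverges.
If $\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx$ is to converge we require the (independent) convergence of both $\int_{0}^{\infty} \frac{1}{1 + x^2}\,dx$ and $\int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx$.
Now
\[ \int_{0}^{\infty} \frac{1}{1 + x^2}\,dx = \lim_{t \to \infty} \int_{0}^{t} \frac{1}{1 + x^2}\,dx = \lim_{t \to \infty} \left[\tan^{-1} x\right]_{0}^{t} = \lim_{t \to \infty} \left(\tan^{-1} t\right) - \tan^{-1}(0) = \frac{\pi}{2} - 0 = \frac{\pi}{2}, \]
while
\[ \int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx = \lim_{t \to -\infty} \int_{t}^{0} \frac{1}{1 + x^2}\,dx = \lim_{t \to -\infty} \left[\tan^{-1} x\right]_{t}^{0} = 0 - \lim_{t \to -\infty} \left(\tan^{-1} t\right) = 0 - \left(-\frac{\pi}{2}\right) = \frac{\pi}{2}. \]
So both $\int_{0}^{\infty} \frac{1}{1 + x^2}\,dx$ and $\int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx$ converge, and we have
\[ \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx = \int_{0}^{\infty} \frac{1}{1 + x^2}\,dx + \int_{-\infty}^{0} \frac{1}{1 + x^2}\,dx = \pi. \]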
Type 2 - integrand unbounded at a single point
Suppose $f$ is a function continuous on $[a, b)$ but is not bounded at $x = b$, that is, $\lim_{x \to b^-} f(x) = \infty$ or $-\infty$.
Provided $\lim_{t \to b^-} \int_{a}^{t} f(x)\,dx$ exists, we define
\[ \int_{a}^{b} f(x)\,dx = \lim_{t \to b^-} \int_{a}^{t} f(x)\,dx \]
The analogous definition can be made when $f$ is not bounded at $a$:
Suppose $f$ is a function continuous on $(a, b]$ but is not bounded at $x = a$, that is, $\lim_{x \to a^+} f(x) = \infty$ or $-\infty$.
Provided $\lim_{t \to a^+} \int_{t}^{b} f(x)\,dx$ exists, we define
\[ \int_{a}^{b} f(x)\,dx = \lim_{t \to a^+} \int_{t}^{b} f(x)\,dx \]
Now we see why we have the apparent contradiction in the example $\int_{-1}^{1} \frac{1}{x^2}\,dx$.
The integral $\int_{-1}^{1} \frac{1}{x^2}\,dx$ is undefined because neither $\int_{0}^{1} \frac{1}{x^2}\,dx$ nor $\int_{-1}^{0} \frac{1}{x^2}\,dx$ exists.
(The failure of just one of these limits to exist results in the integral being undefined.)
\[ \int_{0}^{1} \frac{1}{x^2}\,dx = \lim_{t \to 0^+} \int_{t}^{1} \frac{1}{x^2}\,dx = \lim_{t \to 0^+} \left[-x^{-1}\right]_{t}^{1} = \lim_{t \to 0^+} \left(-1 + \frac{1}{t}\right) = \infty. \]
(The integral is divergent.)
Similarly $\int_{-1}^{0} \frac{1}{x^2}\,dx$ diverges.
Example
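Is the area under the curve $y = \frac{1}{\sqrt{x}}$ from $x = 0$ to $x = 1$ finite? If so, what is it?
Solution: The area, if it exists, is given by $\int_{0}^{1} \frac{1}{\sqrt{x}}\,dx$. This integral is improper since the integrand is unbounded at $x = 0$.
Now
\[ \int_{0}^{1} \frac{1}{\sqrt{x}}\,dx = \lim_{t \to 0^+} \int_{t}^{1} \frac{1}{\sqrt{x}}\,dx = \lim_{t \to 0^+} \left[2x^{1/2}\right]_{t}^{1} = \lim_{t \to 0^+} \left(2 - 2\sqrt{t}\right) = 2. \]
The area under the curve is finite and is equal to 2 sq. units.
Examples: Evaluate each of the following when they exist and explain the situation otherwise:
Find $\int_{0}^{1} \frac{1}{\sqrt{1 - x^2}}\,dx$.
\[ \int_{0}^{1} \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \int_{0}^{t} \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \left[\sin^{-1} x\right]_{0}^{t} = \lim_{t \to 1^-} \left(\sin^{-1} t - 0\right) = \sin^{-1}(1) = \pi/2 \]
Find $\int_{0}^{e} \ln x\,dx$.
\[ \int_{0}^{e} \ln x\,dx = \lim_{t \to 0^+} \int_{t}^{e} \ln x\,dx = \lim_{t \to 0^+} \left[x \ln x - x\right]_{t}^{e} \quad \text{(see lecture 14: } \textstyle\int \ln x\,dx = x \ln x - x\text{)} \]
\[ = e \ln e - e - \lim_{t \to 0^+} (t \ln t - t) = e - e - 0 \quad \text{since } \lim_{t \to 0^+} (t \ln t) = 0, \]
\[ = 0 \]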
The Comparison Test for Improper Integrals allows us to discuss the convergence of an
improper integral without evaluating it directly, by comparing it to a known or easier integral.
If $f$ and $g$ are continuous functions, where $f(x) \geq g(x) \geq 0$ for $x \geq a$, then
1. $\int_{a}^{\infty} g(x)\,dx$ is convergent if $\int_{a}^{\infty} f(x)\,dx$ is convergent.
2. $\int_{a}^{\infty} f(x)\,dx$ is divergent if $\int_{a}^{\infty} g(x)\,dx$ is divergent.
Example
Show that $\int_{1}^{\infty} e^{-x^2}\,dx$ is convergent. (This integral cannot be evaluated by elementary means since the antiderivative of $e^{-x^2}$ is not an elementary function.)
Solution: We compare the integrand $e^{-x^2}$ with $e^{-x}$.
Since $x^2 \geq x$ for all $x \geq 1$ we have $\frac{1}{e^{x^2}} \leq \frac{1}{e^x}$, i.e. $e^{-x^2} \leq e^{-x}$ for all $x \geq 1$ (in fact $e^{-x^2}$ approaches 0 at a very much faster rate than does $e^{-x}$).
So, using the comparison test, $\int_{1}^{\infty} e^{-x^2}\,dx$ converges if we can show $\int_{1}^{\infty} e^{-x}\,dx$ converges.
\[ \int_{1}^{\infty} e^{-x}\,dx = \lim_{t \to \infty} \int_{1}^{t} e^{-x}\,dx = \lim_{t \to \infty} \left[-e^{-x}\right]_{1}^{t} = \lim_{t \to \infty} \left(-e^{-t}\right) + e^{-1}. \]
Now $\lim_{t \to \infty} \left(-e^{-t}\right)$ exists, in fact it is zero, and hence $\int_{1}^{\infty} e^{-x}\,dx$ converges to the value $e^{-1}$.
Thus $\int_{1}^{\infty} e^{-x^2}\,dx$ also converges. Its value (whatever it might be) is a number less than $e^{-1}$.
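Although the comparison test does not give the value of the integral, a numerical estimate can confirm the bound. The sketch below uses a simple midpoint rule on $[1, 6]$; the cut-off at 6 is an assumption made for illustration (the neglected tail is smaller than $e^{-6}$ by the comparison itself), and the helper name `midpoint` is hypothetical.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule estimate of the integral of f over [a, b] with n strips."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

approx = midpoint(lambda x: math.exp(-x * x), 1.0, 6.0)
bound = math.exp(-1)  # value of the comparison integral of e^-x from 1 to infinity

print("estimate of the integral of exp(-x^2):", approx)
print("upper bound e^-1:", bound)
```

The estimate sits well below $e^{-1}$, consistent with the remark that $e^{-x^2}$ decays much faster than $e^{-x}$.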
MONASH UNIVERSITY —SCHOOL OF MATHEMATICAL SCIENCES
Example 5 (Example 3 again but this time via the shell method): Find the volume of the solid formed when the region bounded by $y = x$ and $y = x^2$ is rotated through $2\pi$ radians about the $y$-axis.
\[ \Delta V = 2\pi x\left(x - x^2\right)\Delta x \]
\[ V = 2\pi \int_{0}^{1} x\left(x - x^2\right)dx = 2\pi \left[\frac{1}{3}x^3 - \frac{1}{4}x^4\right]_{0}^{1} = 2\pi \cdot \frac{1}{12} = \frac{\pi}{6} \quad \text{(same as that obtained previously)} \]
The answers obtained by either method are identical, but the shell method avoids the use of squaring.
Example 6 Find the volume of the solid generated when the region bounded by $y = \frac{1}{x}$, $y = 0$, $x = 1$ and $x = 10$ is rotated about the $y$-axis, using cylindrical shells.
\[ \Delta V = 2\pi x\left(\frac{1}{x} - 0\right)\Delta x = 2\pi\,\Delta x \]
\[ V = 2\pi \int_{1}^{10} 1\,dx = 2\pi \cdot 9 = 18\pi \]
The next example shows that the shell method can also be used to find volumes of revolution about the $x$-axis.
Example 7 The region bounded by $y = \sqrt{x}$, the $x$-axis, and the line $x = 4$ is revolved about the $x$-axis to generate a solid. Find its volume using shells.
\[ \Delta V = 2\pi y(4 - x)\Delta y = 2\pi y\left(4 - y^2\right)\Delta y \]
\[ V = 2\pi \int_{0}^{2} y\left(4 - y^2\right)dy \quad \text{(note use of } y \text{ values as terminals)} \]
\[ = 2\pi \left[2y^2 - \frac{1}{4}y^4\right]_{0}^{2} = 8\pi \]
Here the shell method is more complicated than the washer method: $V = \pi \int_{0}^{4} \left(\sqrt{x}\right)^2 dx = 8\pi$.
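The agreement between the two methods in Example 7 can be checked numerically. The sketch below approximates both integrals with a midpoint rule; the helper name `midpoint` is hypothetical and the number of strips is an arbitrary choice.

```python
import math

def midpoint(f, a, b, n=10000):
    """Midpoint-rule estimate of the integral of f over [a, b] with n strips."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Shell method: integrate 2*pi*y*(4 - y^2) with respect to y from 0 to 2.
shell = 2 * math.pi * midpoint(lambda y: y * (4 - y**2), 0.0, 2.0)
# Slicing perpendicular to the x-axis: integrate pi*(sqrt(x))^2 = pi*x from 0 to 4.
disk = math.pi * midpoint(lambda x: x, 0.0, 4.0)

print(shell, disk, 8 * math.pi)
```

Both estimates agree with $8\pi$ to within the accuracy of the rule.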
MONASH UNIVERSITY —SCHOOL OF MATHEMATICAL SCIENCES
ENG1091 Sequences and Series
Lecture 18 · sequences · limits of sequences
Text Reference: §7.1, 7.2, 7.5
1. Definition: An infinite sequence is a special kind of function whose domain is a set of integers extending from some starting integer (usually 1) and then continuing indefinitely. The sequence $a_1, a_2, a_3, a_4, \ldots$ is the ordered list of function values of a function $a$ where $a(n) = a_n$ at each positive integer $n$. We usually specify a sequence by giving its general term, the formula for $a_n$.
2. Examples:
(a) $a_n = \left(-\frac{1}{2}\right)^n$, giving $-\frac{1}{2}, \frac{1}{4}, -\frac{1}{8}, \frac{1}{16}, \ldots$
(b) $a_n = \frac{n - 1}{n}$, giving $0, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots$
(c) $a_n = (-1)^{n - 1}$, giving $1, -1, 1, -1, \ldots$
(d) $a_n = \frac{n^2}{2^n}$, giving $\frac{1}{2}, 1, \frac{9}{8}, 1, \frac{25}{32}, \frac{9}{16}, \ldots$
(e) $a_n = \frac{\cos\frac{n\pi}{2}}{n}$, giving $0, -\frac{1}{2}, 0, \frac{1}{4}, 0, -\frac{1}{6}, \ldots$
(f) $a_n = \left(1 + \frac{1}{n}\right)^n$, giving $2, \left(\frac{3}{2}\right)^2, \left(\frac{4}{3}\right)^3, \left(\frac{5}{4}\right)^4, \ldots$
3. Definition: An infinite sequence has a limit $L$ if the terms of the sequence tend to that limit. This is all very well but it doesn’t say very much. A real (or complex) number $L$ is the limit of a sequence $a_n$ if for any number $\varepsilon > 0$ there is a number $N$ such that all terms $a_n$ of the sequence with $n > N$ are within $\varepsilon$ of $L$. Consult the picture on page 439 of your text for a visual illustration of this definition. When an infinite sequence $a_n$ has a limit $L$ we write
\[ \lim_{n \to \infty} a_n = L. \]
We are not going to use this definition in any formal sense because we are going to establish convergence or divergence of sequences using the limit theorems which follow. However it is important to bear in mind that the proofs of these theorems depend ultimately on this definition.
Not all sequences have limits and those that do are said to be convergent to their limit. If a sequence has no limit we say it diverges.
Many people have a false idea of a limit as a number which the terms of the sequence ‘get closer to’ somehow. Notice example (e) above, which has the limit 0. Notice also that it is not true to say that successive terms are getting closer to zero; in fact each non-zero term is farther away from zero than its predecessor, which of course is exactly zero.
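4. Examples:
(a) $a_n = \left(-\frac{1}{2}\right)^n$, giving $-\frac{1}{2}, \frac{1}{4}, -\frac{1}{8}, \frac{1}{16}, \ldots$, converges to 0.
(b) $a_n = \frac{n - 1}{n}$, giving $0, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots$, converges to 1.
(c) $a_n = (-1)^{n - 1}$, giving $1, -1, 1, -1, \ldots$, diverges since it oscillates indefinitely between $-1$ and 1.
(d) $a_n = \frac{n^2}{2^n}$, giving $\frac{1}{2}, 1, \frac{9}{8}, 1, \frac{25}{32}, \frac{9}{16}, \ldots$, converges to 0.
(e) $a_n = \frac{\cos\frac{n\pi}{2}}{n}$, giving $0, -\frac{1}{2}, 0, \frac{1}{4}, 0, -\frac{1}{6}, \ldots$, converges to 0.
(f) $a_n = \left(1 + \frac{1}{n}\right)^n$, giving $2, \left(\frac{3}{2}\right)^2, \left(\frac{4}{3}\right)^3, \left(\frac{5}{4}\right)^4, \ldots$, converges to $e$.
(g) $a_n = n$, giving $1, 2, 3, 4, \ldots$, diverges since $a_n \to \infty$; we also say that $a_n$ is unbounded.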
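Tabulating terms for increasing $n$ gives a feel for these limits. The sketch below evaluates examples (d), (e) and (f) at a few values of $n$; the dictionary `sequences` and its labels are hypothetical names introduced for illustration, and a table of values suggests a limit without proving it.

```python
import math

# General terms of examples (d), (e) and (f), as functions of n.
sequences = {
    "(d) n^2/2^n": lambda n: n**2 / 2**n,
    "(e) cos(n*pi/2)/n": lambda n: math.cos(n * math.pi / 2) / n,
    "(f) (1+1/n)^n": lambda n: (1 + 1 / n) ** n,
}

for name, a in sequences.items():
    terms = [round(a(n), 6) for n in (1, 10, 100, 1000)]
    print(f"{name:20s} {terms}")
print("e =", math.e)
```

Note that (d) first rises before decaying, and (e) alternates between zero and non-zero terms, so neither approaches its limit monotonically.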
5. Demonstrating divergence. Showing that a particular sequence diverges can in many
ways be more problematic.
(a) If we can show that the sequence is unbounded the sequence diverges. A sequence
an is unbounded if for all numbers M > 0 we may find an n such that |an| > M.
However, please remember that many bounded sequences are also divergent.
(b) If a sequence appears to have two or more different ‘limits’the sequence diverges. It
may happen, for example, that the sequence of odd terms of a converges to a limit
which is different to the limit of the sequence of even terms. This behaviour is apparent
in the example (c) above.
(c) Many divergent sequences behave like the divergent sequence an = sin (n) . The range
of this sequence is dense in the set [−1, 1] which means we can pick any number in
[−1, 1] and specify any positive distance we like, then there exists an n such that
sin (n) is as close as we please to our chosen number.
6. Sequence theorems
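Suppose that $c$ and $p$ are constants and (unless stated otherwise) the limits $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n$ exist. Then
(a) $\lim_{n \to \infty} [a_n + b_n] = \lim_{n \to \infty} a_n + \lim_{n \to \infty} b_n$
(b) $\lim_{n \to \infty} [a_n - b_n] = \lim_{n \to \infty} a_n - \lim_{n \to \infty} b_n$
(c) $\lim_{n \to \infty} [c a_n] = c \lim_{n \to \infty} a_n$
(d) $\lim_{n \to \infty} [a_n \times b_n] = \lim_{n \to \infty} a_n \times \lim_{n \to \infty} b_n$
(e) (i) if $\lim_{n \to \infty} b_n \neq 0$ then $\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{\lim_{n \to \infty} a_n}{\lim_{n \to \infty} b_n}$,
(ii) if $a_n$ is a bounded sequence and $|b_n| \to \infty$ then $\lim_{n \to \infty} \frac{a_n}{b_n} = 0$. (It is not necessary that $\lim_{n \to \infty} a_n$ exists.)
(f) $\lim_{n \to \infty} [a_n^{\,p}] = \left[\lim_{n \to \infty} a_n\right]^p$
Part (f) is really a special case of the Continuous function theorem, which says that if $f$ is a continuous function then $\lim_{n \to \infty} [f(a_n)] = f\left(\lim_{n \to \infty} a_n\right)$.
(g) $\lim_{n \to \infty} c = c$
(h) $\lim_{n \to \infty} c^n = 0$ if $|c| < 1$ and $\lim_{n \to \infty} c^n = 1$ if $c = 1$; the sequence $c^n$ is divergent otherwise.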
7. The following examples illustrate how the various properties listed above can be used to
establish convergence of sequences and find their limits.
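(a) $a_n = n$ diverges since $a_n$ is unbounded.
(b) $a_n = \frac{1}{n}$ converges to 0. Rather obvious but a special case of rule (e)(ii).
(c) $a_n = \frac{n^2 - 3n + 1}{2n^2 + 1}$
Write
\[ a_n = \frac{n^2 - 3n + 1}{2n^2 + 1} = \frac{n^2\left(1 - \frac{3}{n} + \frac{1}{n^2}\right)}{n^2\left(2 + \frac{1}{n^2}\right)} = \frac{1 - \frac{3}{n} + \frac{1}{n^2}}{2 + \frac{1}{n^2}} \]
So
\[ \lim_{n \to \infty} a_n = \frac{\lim_{n \to \infty}\left(1 - \frac{3}{n} + \frac{1}{n^2}\right)}{\lim_{n \to \infty}\left(2 + \frac{1}{n^2}\right)} \quad \text{(apply rule (e))} \]
\[ = \frac{1 - 0 + 0}{2 + 0} \quad \text{(apply rules (a), (e)(ii))} \]
\[ = \frac{1}{2} \]
(d) $a_n = \frac{2n^2 + 3n + 1}{n^3 + 1}$
Write
\[ a_n = \frac{2n^2 + 3n + 1}{n^3 + 1} = \frac{n^2\left(2 + \frac{3}{n} + \frac{1}{n^2}\right)}{n^3\left(1 + \frac{1}{n^3}\right)} = \frac{1}{n} \cdot \frac{2 + \frac{3}{n} + \frac{1}{n^2}}{1 + \frac{1}{n^3}} \]
So
\[ \lim_{n \to \infty} a_n = \lim_{n \to \infty}\left(\frac{1}{n}\right) \times \frac{\lim_{n \to \infty}\left(2 + \frac{3}{n} + \frac{1}{n^2}\right)}{\lim_{n \to \infty}\left(1 + \frac{1}{n^3}\right)} \quad \text{(apply rules (d), (e))} \]
\[ = 0 \times 2 = 0 \]
(e) $a_n = \sqrt{n + 1} - \sqrt{n}$
\[ a_n = \frac{\sqrt{n + 1} - \sqrt{n}}{1} \times \frac{\sqrt{n + 1} + \sqrt{n}}{\sqrt{n + 1} + \sqrt{n}} \quad \text{(a trick that often works with a difference of square roots)} \]
\[ = \frac{n + 1 - n}{\sqrt{n + 1} + \sqrt{n}} = \frac{1}{\sqrt{n + 1} + \sqrt{n}} \]
So
\[ \lim_{n \to \infty} a_n = \lim_{n \to \infty} \frac{1}{\sqrt{n + 1} + \sqrt{n}} = 0 \quad \text{(since the sequences } \sqrt{n + 1}, \sqrt{n} \text{ are unbounded)} \]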
Exercises: Find the limits of the following sequences if they exist, or if they are divergent explain why.
1. $a_n = \sqrt{n^2 + 2n} - n$. ANS: convergent: $\lim_{n\to\infty} a_n = 1$.
2. $a_n = \dfrac{n^2 - 4}{n + 5}$. ANS: divergent: $a_n = \dfrac{n^2 - 4}{n + 5}$ is not bounded.
3. $a_n = \ln(n+1) - \ln(2n-1)$. ANS: convergent: $\lim_{n\to\infty} a_n = \ln\frac{1}{2} = -\ln 2$.
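An important sequence
Show $\lim_{n\to\infty}\left(1 + \dfrac{x}{n}\right)^n = e^x$.
We use L'Hôpital's rule, but first we need to change the problem from a sequence limit to the limit of a function of a continuous variable.
Consider instead $\lim_{x\to\infty}\left(1 + \dfrac{a}{x}\right)^x = L(a)$.
Then
$$\ln L(a) = \ln\left(\lim_{x\to\infty}\left(1 + \frac{a}{x}\right)^x\right) = \lim_{x\to\infty} \ln\left(1 + \frac{a}{x}\right)^x = \lim_{x\to\infty} x \ln\left(1 + \frac{a}{x}\right) \quad (\text{form } \infty \cdot 0)$$
$$= \lim_{x\to\infty} \frac{\ln\left(1 + \frac{a}{x}\right)}{1/x} \quad \left(\text{form } \tfrac{0}{0}\right)$$
$$= \lim_{x\to\infty} \frac{\left(-\frac{a}{x^2}\right)\Big/\left(1 + \frac{a}{x}\right)}{-\frac{1}{x^2}} \quad \text{(applying L'Hôpital's rule)}$$
$$= \lim_{x\to\infty} \frac{a}{1 + \frac{a}{x}} = a$$
so $L(a) = e^a$, hence $\lim_{x\to\infty}\left(1 + \dfrac{a}{x}\right)^x = e^a$.
We conclude that the sequence limit also exists and $\lim_{n\to\infty}\left(1 + \dfrac{x}{n}\right)^n = e^x$.
Note that the existence of the function limit implies the existence of the corresponding sequence limit, but not vice versa.
For example $\lim_{n\to\infty} \sin(2\pi n) = 0$ but $\lim_{x\to\infty} \sin(2\pi x)$ does not exist.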
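As a rough numerical illustration of the result just shown, the sketch below evaluates $(1 + x/n)^n$ for growing n and compares it with $e^x$. The function name `compound` is an assumption made for this example only.

```python
import math

def compound(x, n):
    """Return (1 + x/n)^n, the n-th term of the sequence whose limit is e^x."""
    return (1.0 + x / n) ** n

# The terms approach e^x slowly: the error shrinks roughly like 1/n.
for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, compound(1.0, n), compound(-2.0, n))

print("targets:", math.e, math.exp(-2.0))
```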
ENG1091 Sequences and Series
Lecture 19&20 · series · geometric series · convergence
Text Reference: §7.6
Many students find this lecture very difficult, and the material covered here is quite sparse. An excellent account of this material can be found in the first chapter of Mathematical Methods in the Physical Sciences by Mary L Boas. (Available in the Hargrave library.)
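1. An infinite series is a formal sum of infinitely many terms; for example $a_1 + a_2 + a_3 + a_4 + \dots$ is a series formed by adding the terms of the sequence $a_n$. This series is also denoted $\sum_{n=1}^{\infty} a_n$:
$$\sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + a_4 + \dots$$
Examples:
1. $\sum_{n=1}^{\infty} \left(-\dfrac{1}{2}\right)^n = -\dfrac{1}{2} + \dfrac{1}{4} - \dfrac{1}{8} + \dfrac{1}{16} - \dots$
2. $\sum_{n=1}^{\infty} \dfrac{n-1}{n} = 0 + \dfrac{1}{2} + \dfrac{2}{3} + \dfrac{3}{4} + \dots$
3. $\sum_{n=1}^{\infty} (-1)^{n-1} = 1 - 1 + 1 - 1 + \dots$
4. $\sum_{n=1}^{\infty} \dfrac{n^2}{2^n} = \dfrac{1}{2} + 1 + \dfrac{9}{8} + 1 + \dfrac{25}{32} + \dfrac{9}{16} + \dots$
5. $\sum_{n=1}^{\infty} n = 1 + 2 + 3 + 4 + \dots$
6. $\sum_{n=2}^{\infty} \dfrac{1}{\ln n} = \dfrac{1}{\ln 2} + \dfrac{1}{\ln 3} + \dfrac{1}{\ln 4} + \dots$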
To every series $\sum_{n=1}^{\infty} a_n$ there is an associated sequence called the sequence of partial sums $s_n$, whose nth term is the sum of the first n terms of the series:
$s_1 = a_1$
$s_2 = a_1 + a_2$
$s_3 = a_1 + a_2 + a_3$
$s_4 = a_1 + a_2 + a_3 + a_4$
...
$s_n = \sum_{k=1}^{n} a_k$
...
Definition: We say that the series $\sum_{n=1}^{\infty} a_n$ converges to the sum $s$ if the sequence of partial sums $s_n$, where $s_n = \sum_{k=1}^{n} a_k$, converges to $s$. If this is the case we write $\sum_{n=1}^{\infty} a_n = s$.
If the sequence of partial sums is a divergent sequence then the series $\sum_{n=1}^{\infty} a_n$ is said to diverge.
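The definition can be made concrete by computing partial sums directly. The following is a minimal sketch, assuming a hypothetical helper `partial_sums` that is not part of the notes; it is applied to Example 1, whose partial sums settle near $-1/3$, and to Example 3, whose partial sums oscillate between 1 and 0 and so do not converge.

```python
from itertools import accumulate, count, islice

def partial_sums(term, n_terms, start=1):
    """Return the list of the first n_terms partial sums of sum_{n=start} term(n)."""
    terms = (term(n) for n in islice(count(start), n_terms))
    return list(accumulate(terms))

# Example 1: terms (-1/2)^n. The partial sums approach -1/3.
s = partial_sums(lambda n: (-0.5) ** n, 40)
print(s[:4], s[-1])

# Example 3: terms (-1)^(n-1). The partial sums alternate 1, 0, 1, 0, ...
t = partial_sums(lambda n: (-1) ** (n - 1), 6)
print(t)
```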
Recall what it means for a sequence $s_n$ to converge to $L$: given any $\varepsilon > 0$ there exists $N$ such that $|s_n - L| < \varepsilon$ for all $n > N$. In particular the distance between any two consecutive terms $s_n$ and $s_{n+1}$ must be less than $2\varepsilon$ whenever $n > N$. To see this:
$$|s_{n+1} - s_n| = |s_{n+1} - L + L - s_n|$$
$$\le |s_{n+1} - L| + |L - s_n| \quad \text{(by the triangle inequality)}$$
$$< \varepsilon + \varepsilon \quad \text{whenever } n > N$$
But $|s_{n+1} - s_n| = |a_{n+1}|$, so the sequence $a_n$ converges to zero. Thus we have the following necessary condition for convergence.
Theorem: The infinite series $\sum_{n=1}^{\infty} a_n$ converges only if the parent sequence $a_n$ converges to zero.
Example: Discuss the convergence or divergence of the series $\sum_{n=1}^{\infty} \dfrac{n-1}{n+1}$.
We have $\lim_{n\to\infty} a_n = \lim_{n\to\infty} \dfrac{n-1}{n+1} = 1$. Since this is not zero, the series $\sum_{n=1}^{\infty} \dfrac{n-1}{n+1}$ diverges.
Important note: The test $\lim_{n\to\infty} a_n = 0$ is a condition necessary for convergence; it is not sufficient.
Later on we show that the series $\sum_{n=1}^{\infty} \dfrac{1}{n}$ is a divergent series despite the fact that $\lim_{n\to\infty} \dfrac{1}{n} = 0$.
2. Geometric series
A series of the form
$$a + ar + ar^2 + \dots$$
where $a \neq 0$ is called a geometric series. The number $a$ is its first term and the number $r$ is called the common ratio, since it is the value of the ratio of any term to its predecessor.
Repeating decimals are infinite geometric series, e.g.
$$0.\overline{12} = 0.12121212\ldots = \frac{12}{100} + \frac{12}{10\,000} + \frac{12}{1\,000\,000} + \dots; \qquad r = \frac{1}{100}$$
Finding an explicit formula for $s_n$ for a geometric series is easy:
$$s_n = a + ar + ar^2 + \dots + ar^{n-1}, \quad (1)$$
and
$$r s_n = ar + ar^2 + ar^3 + \dots + ar^n. \quad (2)$$
Subtracting eq. (2) from eq. (1):
$$(1 - r)\,s_n = a - ar^n,$$
hence, for $r \neq 1$,
$$s_n = \frac{a(1 - r^n)}{1 - r}$$
• For $|r| < 1$ we have $\lim_{n\to\infty} r^n = 0$ and so the geometric series converges to
$$\sum_{n=1}^{\infty} ar^{n-1} = \frac{a}{1 - r}.$$
• For $|r| > 1$ the sequence $ar^{n-1}$ is unbounded and so the geometric series diverges.
• For $r = 1$ and $a \neq 0$ we have the divergent constant series $a + a + a + \dots$, and for $r = -1$ we have the series $a - a + a - a + \dots$, whose partial sums alternate between $a$ and 0, and which hence also diverges.
Exercise: Use the formula $\sum_{n=1}^{\infty} ar^{n-1} = \dfrac{a}{1 - r}$ to find the fraction equivalent of the repeating decimal $0.\overline{12}$.
$0.\overline{12} = 0.121212\ldots$
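One way to check an answer to this exercise is with exact rational arithmetic. The sketch below is an illustration only; the helper names `geometric_partial_sum` and `geometric_sum` are assumptions introduced here. It uses $a = 12/100$ and $r = 1/100$, as in the repeating-decimal example above.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, n):
    """s_n = a(1 - r^n) / (1 - r), valid for r != 1."""
    return a * (1 - r ** n) / (1 - r)

def geometric_sum(a, r):
    """Sum of the infinite geometric series a + ar + ar^2 + ..., valid for |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the geometric series diverges for |r| >= 1")
    return a / (1 - r)

a, r = Fraction(12, 100), Fraction(1, 100)
print(geometric_partial_sum(a, r, 3))  # exact value of 0.12 + 0.0012 + 0.000012
print(geometric_sum(a, r))             # exact value of the repeating decimal
```

Using `Fraction` rather than floats keeps every step exact, so the printed result is a fraction in lowest terms.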
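Exercises: Discuss the convergence or divergence of each of the following series:
1. Use partial fractions to show $\dfrac{1}{n(n+1)} = \dfrac{1}{n} - \dfrac{1}{n+1}$. Use this to find a formula for the nth partial sum $s_n$. Hence show that $\sum_{n=1}^{\infty} \dfrac{1}{n(n+1)}$ converges by finding its limit.
The nth partial sum is
$$s_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} = \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \dots + \left(\frac{1}{n} - \frac{1}{n+1}\right) = 1 - \frac{1}{n+1}$$
Hence $\sum_{n=1}^{\infty} \dfrac{1}{n(n+1)} = \lim_{n\to\infty}\left(1 - \dfrac{1}{n+1}\right) = 1$.
2. $\sum_{n=1}^{\infty} \dfrac{n-1}{\sqrt{n^2+1}}$.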
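The telescoping formula in exercise 1 above can be confirmed with exact arithmetic. This is only a sketch; `telescoping_partial_sum` is a name made up for the illustration.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Exact n-th partial sum of sum_{k>=1} 1 / (k(k+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

# Compare the directly computed sum with the closed form 1 - 1/(n+1).
for n in (1, 2, 5, 50):
    s_n = telescoping_partial_sum(n)
    print(n, s_n, s_n == 1 - Fraction(1, n + 1))
```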
Tests for Series Convergence
The convergence or divergence of the geometric series was determined by finding a formula for the sequence of partial sums $s_n$. This is not always possible for more general series, hence the need to establish some tests which are sufficient to determine convergence or divergence.
For now we deal exclusively with positive series, that is, series of the type $\sum_{n=1}^{\infty} a_n$ where $a_n \ge 0$ for all n.
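1. Integral Test.
Example: Determine the convergence or divergence of the series $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$.
Notice that all of the terms of the series are positive.
The essential idea of the integral test is that the series $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ and the improper integral $\int_1^{\infty} \dfrac{1}{x^2}\,dx$ either both converge, or both diverge (to $\infty$).
Now a quick calculation shows $\int_1^{\infty} \dfrac{1}{x^2}\,dx$ converges:
$$\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{b\to\infty}\left[-\frac{1}{x}\right]_1^b = 1.$$
Notice that $\sum_{n=2}^{\infty} \dfrac{1}{n^2} < \int_1^{\infty} \dfrac{1}{x^2}\,dx$ (diagram) so that
$$\sum_{n=1}^{\infty} \frac{1}{n^2} < 1 + \int_1^{\infty} \frac{1}{x^2}\,dx.$$
Since $a_n = \dfrac{1}{n^2}$ is always positive, the sequence of partial sums is increasing (since $s_{n+1} - s_n = a_{n+1} > 0$).
Moreover the partial sums are bounded above by $1 + \int_1^{\infty} \dfrac{1}{x^2}\,dx$.
An increasing sequence $s_n$ that is bounded above converges, hence the series $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ converges.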
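Example: Determine the convergence or divergence of the series $\sum_{n=1}^{\infty} \dfrac{1}{n}$.
Notice once again that all of the terms of the series are positive. This time the corresponding improper integral is $\int_1^{\infty} \dfrac{1}{x}\,dx$, which diverges (to $\infty$).
Calculation:
$$\int_1^{b} \frac{1}{x}\,dx = \ln b \to \infty \quad \text{as } b \to \infty.$$
Notice that $\sum_{n=1}^{N} \dfrac{1}{n} > \int_1^{N+1} \dfrac{1}{x}\,dx$ for every N (diagram).
Since the integral on the right is unbounded as $N \to \infty$, the partial sums of $\sum_{n=1}^{\infty} \dfrac{1}{n}$ are also unbounded, and therefore the series diverges.
Note: the divergent series $\sum_{n=1}^{\infty} \dfrac{1}{n}$ is called the harmonic series. It is rather special because it is an example of a series that diverges and yet whose parent sequence, $a_n = \dfrac{1}{n}$, converges to zero.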
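The contrast between the two examples shows up clearly in a numerical experiment. In the sketch below (the helper `partial_sum` is an assumption for illustration), the partial sums of $\sum 1/n^2$ stay below the bound $1 + \int_1^{\infty} x^{-2}\,dx = 2$, while the partial sums of the harmonic series stay above $\ln(N+1)$ and keep growing.

```python
import math

def partial_sum(term, n_terms):
    """Sum of term(n) for n = 1, ..., n_terms."""
    return sum(term(n) for n in range(1, n_terms + 1))

for N in (10, 1000, 100000):
    s_squares = partial_sum(lambda n: 1.0 / n**2, N)   # bounded above by 2
    s_harmonic = partial_sum(lambda n: 1.0 / n, N)     # exceeds ln(N + 1)
    print(N, s_squares, s_harmonic, math.log(N + 1))
```

The harmonic partial sums grow only logarithmically, which is why divergence is not obvious from a short table of values.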
Example (p-series): The series of the form $\sum_{n=1}^{\infty} \dfrac{1}{n^p}$ are known collectively as p-series. By comparing with the corresponding integral $\int_1^{\infty} \dfrac{1}{x^p}\,dx$, a quick calculation shows:
$$\sum_{n=1}^{\infty} \frac{1}{n^p} \text{ diverges for } p \le 1 \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^p} \text{ converges for } p > 1.$$
2. The comparison test
The integral test works by comparing an infinite series with the corresponding improper integral. Why not compare two series? This, then, is the comparison test.
Example: We have $\sum_{n=1}^{\infty} \dfrac{1}{n^2+1} \le \sum_{n=1}^{\infty} \dfrac{1}{n^2}$ because $\dfrac{1}{n^2+1} \le \dfrac{1}{n^2}$ for all n. We know $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ converges, and since it dominates $\sum_{n=1}^{\infty} \dfrac{1}{n^2+1}$ this series must also converge. (Once again the fact that $\sum_{n=1}^{\infty} \dfrac{1}{n^2+1}$ and $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ are both series of positive terms is crucial here.)
The precise statement of the comparison test is as follows:
Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both be series of positive terms, and suppose that the convergence or divergence of $\sum_{n=1}^{\infty} b_n$ is known.
Showing convergence: If $\sum_{n=1}^{\infty} b_n$ converges and $a_n \le b_n$ for all n, then $\sum_{n=1}^{\infty} a_n$ converges.
Showing divergence: If $\sum_{n=1}^{\infty} b_n$ diverges and $a_n \ge b_n$ for all n, then $\sum_{n=1}^{\infty} a_n$ diverges.
Warning: When using the comparison test it is important to get the inequalities the correct way about and to avoid using too coarse a comparison.
For example, it is true that $\dfrac{1}{n^2+1} \le \dfrac{1}{n}$ for all n and that $\sum_{n=1}^{\infty} \dfrac{1}{n}$ diverges. What can we say about the behaviour of $\sum_{n=1}^{\infty} \dfrac{1}{n^2+1}$ on the basis of this comparison? Absolutely nothing!
Exercises: Discuss the convergence or divergence of each of the following series:
1. $\sum_{n=2}^{\infty} \dfrac{1}{n \ln n}$. [Compare with the integral $\int_2^{\infty} \dfrac{1}{x \ln x}\,dx$.]
2. $\sum_{n=1}^{\infty} \dfrac{e^n \cos^2 n}{\pi^n}$. [Compare with the geometric series $\sum_{n=1}^{\infty} \dfrac{e^n}{\pi^n}$.]
3. $\sum_{n=1}^{\infty} \dfrac{\sqrt{n} - 1}{n^2 + 1}$. [Compare with the p-series $\sum_{n=1}^{\infty} \dfrac{1}{n^{3/2}}$.]
4. $\sum_{n=1}^{\infty} \dfrac{n - 1}{2^n (n + 1)}$. [Compare with the geometric series $\sum_{n=1}^{\infty} \dfrac{1}{2^n}$.]
The Ratio Test
Recall that the infinite geometric series $\sum_{n=1}^{\infty} ar^{n-1} = a + ar + ar^2 + \dots$ converges for $|r| < 1$ and diverges for $|r| > 1$, where the common ratio $r$ is the ratio of two consecutive terms of the geometric sequence, i.e. $r = \dfrac{a_{n+1}}{a_n}$.
The ratio test for convergence of a series is a generalisation of this to other types of series.
Ratio Test: Suppose we have a series $\sum_{n=1}^{\infty} a_n$ where $a_n > 0$ for all n, and for which $\lim_{n\to\infty} \dfrac{a_{n+1}}{a_n}$ either exists or is infinite.
Let $\rho = \lim_{n\to\infty} \dfrac{a_{n+1}}{a_n}$.
• If $\rho < 1$ then $\sum_{n=1}^{\infty} a_n$ converges. (As a consequence we get $\lim_{n\to\infty} a_n = 0$.)
• If $\rho > 1$ then $\lim_{n\to\infty} a_n = \infty$ and $\sum_{n=1}^{\infty} a_n$ diverges.
• If $\rho = 1$, then the ratio test fails, as the series may converge or diverge to $\infty$.
Notice that this test could also be used to test for convergence of a geometric series, since in this case $\lim_{n\to\infty} \dfrac{a_{n+1}}{a_n} = \dfrac{a_{n+1}}{a_n} = r$, a constant.
Examples
1. $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$
($\rho = 1$ and therefore the ratio test fails, but we know this series converges by earlier tests)
2. $\sum_{n=1}^{\infty} \dfrac{2^n}{n!}$
($\rho = 0$, series converges by ratio test)
3. $\sum_{n=1}^{\infty} \dfrac{n^{100}}{2^n}$
($\rho = \frac{1}{2}$, series converges by ratio test)
4. $\sum_{n=1}^{\infty} \dfrac{n!}{n^n}$
($\rho = \frac{1}{e}$, series converges by ratio test)
5. Use the ratio test to show the series $\sum_{n=1}^{\infty} n e^{-n}$ converges.
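The values of $\rho$ quoted in the examples above can be estimated numerically by evaluating $a_{n+1}/a_n$ at large n. The sketch below is an illustration under one design assumption: each term is stored as its logarithm (using `math.lgamma` for $\ln n!$) so that quantities such as $n!$ and $n^n$ do not overflow. The names `LOG_TERMS` and `ratio` are hypothetical.

```python
import math

# log(a_n) for each example series; lgamma(n + 1) equals ln(n!).
LOG_TERMS = {
    "1/n^2": lambda n: -2.0 * math.log(n),
    "2^n/n!": lambda n: n * math.log(2.0) - math.lgamma(n + 1),
    "n^100/2^n": lambda n: 100.0 * math.log(n) - n * math.log(2.0),
    "n!/n^n": lambda n: math.lgamma(n + 1) - n * math.log(n),
    "n e^-n": lambda n: math.log(n) - n,
}

def ratio(log_term, n):
    """Return a_{n+1} / a_n computed from the logarithms of the terms."""
    return math.exp(log_term(n + 1) - log_term(n))

for name, log_term in LOG_TERMS.items():
    print(name, [ratio(log_term, n) for n in (10, 1000, 100000)])
```

Note that for $n^{100}/2^n$ the ratio is well above 1 for small n and only approaches $\frac{1}{2}$ slowly, which is a reminder that $\rho$ is a limit and not the ratio of any particular pair of terms.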
Absolute and Conditional convergence
All of the series in the previous section were series of positive terms. We can now remove this restriction and allow arbitrary terms $a_n$. We can obtain a series of positive terms from an arbitrary series by replacing all the terms with their absolute values.
Definition: The series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if the series $\sum_{n=1}^{\infty} |a_n|$ converges.
Absolute convergence Theorem: If a series converges absolutely then the series converges.
Thus the tests for series of positive terms can be used to establish the convergence of any series, by showing that it converges absolutely.
Example: Show the series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n^2}$ converges absolutely.
However, absolute convergence (the absolute convergence test, if we call it that) is a sufficient condition for convergence, but it is not a necessary condition. Many series fail to be absolutely convergent and yet are convergent just the same. We call such series conditionally convergent.
Example: The series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n}$ does not converge absolutely, because if we replace all the terms by their absolute values we get the divergent harmonic series $\sum_{n=1}^{\infty} \dfrac{1}{n}$.
However the alternating harmonic series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n}$ converges (conditionally), as we will show.
We cannot use any of the tests previously discussed to show that the series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n}$ converges, as these tests apply only to series of positive terms. Generally speaking, demonstrating convergence where the convergence is not absolute is quite difficult. We will discuss but one of many tests that do the job; this test is very easily applied but is quite restrictive, as it can only be used on special types of series.
The Alternating series test. Suppose we have a series of the form $\sum_{n=1}^{\infty} (-1)^n a_n$ where the sequence $a_n$ satisfies:
(i) $a_n \ge 0$ for all n,
(ii) $\lim_{n\to\infty} a_n = 0$, and
(iii) $a_{n+1} \le a_n$ for all n.
Then the series $\sum_{n=1}^{\infty} (-1)^n a_n$ converges.
Example: The series $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n}$.
(i) The series is of the required form with $a_n = \dfrac{1}{n}$. Clearly $a_n > 0$ for all n.
(ii) $\lim_{n\to\infty} \dfrac{1}{n} = 0$,
(iii) $a_n - a_{n+1} = \dfrac{1}{n} - \dfrac{1}{n+1} = \dfrac{1}{n(n+1)} > 0$ for all n, and hence $a_{n+1} \le a_n$.
The three parts of the alternating series test are satisfied and we deduce that $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{n}$ converges.
Example: The series $\sum_{n=2}^{\infty} \dfrac{\cos n\pi}{\log_e n}$.
(i) Since $\cos n\pi = (-1)^n$ the series is of the required form with $a_n = \dfrac{1}{\log_e n}$. Since $\log_e n > 0$ for all $n \ge 2$, we have $a_n > 0$.
(ii) Also, $\lim_{n\to\infty} \dfrac{1}{\log_e n} = 0$,
(iii) To show $a_n - a_{n+1} = \dfrac{1}{\log_e n} - \dfrac{1}{\log_e (n+1)} > 0$ for all $n \ge 2$ is a little more awkward than for the previous example. One way of doing this is to show the function $1/\log_e(x)$ is decreasing for all $x \ge 2$. This is easy using calculus:
The function $1/\log_e(x)$ has derivative:
$$\frac{d}{dx}\left(\log_e(x)\right)^{-1} = -1\left(\log_e(x)\right)^{-2} \times \frac{d}{dx}\log_e(x) = -\frac{1}{x\left(\log_e(x)\right)^2}$$
This is clearly negative, and hence $1/\log_e(x)$ is a decreasing function. Thus $\dfrac{1}{\log_e n} - \dfrac{1}{\log_e (n+1)} > 0$ for all $n \ge 2$.
All three parts of the alternating series test are satisfied and we deduce that $\sum_{n=2}^{\infty} \dfrac{\cos n\pi}{\log_e n}$ converges.
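The convergence of the alternating harmonic series can be observed numerically. The sketch below uses two standard facts that are not derived in these notes, so treat them as assumptions of the illustration: the sum of $\sum_{n=1}^{\infty} (-1)^n/n$ is $-\ln 2$, and for a series satisfying the alternating series test the error after N terms is at most $a_{N+1}$. The helper name `alternating_partial_sum` is hypothetical.

```python
import math

def alternating_partial_sum(a, n_terms, start=1):
    """Partial sum of sum_{n>=start} (-1)^n a(n) over n_terms terms."""
    return sum((-1) ** n * a(n) for n in range(start, start + n_terms))

N = 100_000
s_N = alternating_partial_sum(lambda n: 1.0 / n, N)

# The partial sum, the known limit -ln 2, and the error bound a_{N+1}.
print(s_N, -math.log(2.0), 1.0 / (N + 1))
```

The slow decay of the error bound reflects the fact that the convergence is only conditional.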
The alternating series test is quite restrictive, as it cannot be used to show the conditional convergence of series whose terms do not strictly alternate in sign.
For example, the series $\sum_{n=1}^{\infty} \dfrac{\sin n}{n}$ is also conditionally convergent, but its terms do not strictly alternate in sign. A more general test for conditional convergence (one which works for $\sum_{n=1}^{\infty} \dfrac{\sin n}{n}$) is Dirichlet's test, but it will not be examined in this course.
Throughout our discussions on differentiation and integration we have examined functions with only one independent variable. Yet we can think of any number of examples in engineering in which a quantity is defined by two or more independent variables. The volume of a cylinder is a function of the height of the cylinder and the radius of its base:
$$V = \pi r^2 h$$
The density of ocean water is a function of its temperature and salinity:
$$\rho = \rho(T, \sigma)$$
For the moment let us focus on functions with two independent variables, x and y. For further convenience, we can assume that x and y are our familiar Cartesian coordinates. Given an arbitrary function of our two independent variables, $z = f(x, y)$, it is possible to view the variable z as the height above the x-y plane. This function of two variables is thus a three-dimensional surface above the x-y plane, which, unfortunately, is very difficult to graph on a piece of paper. In graphing $f(x, y)$, it is common to draw lines of constant height z (i.e. contours). Such diagrams are completely analogous to contour maps used in bushwalking and mountaineering.
It is worth the time to graph a few simple functions to help with future lectures.
Consider the contour maps/surface plots for the functions below:
$$z_1 = \sqrt{16 - x^2 - y^2}, \qquad z_2 = 16 - x^2 - y^2$$
z3 = 2x− y3
and
z4 = cos (x) cos (y) (not examinable)
It is worth noting that the function f (x, y) is often called a scalar field in vector calculus. Also,
we can readily extend this material to three dimensions and beyond; only it isn’t simple to draw
such functions on paper.
2. Partial differentiation: The aim of this section is to extend some of the principles of basic calculus to functions with multiple independent variables. We begin with differentiation.
Thinking back to one independent variable, if f is a function of a single variable, x say, then we define the derivative of f with respect to x as
$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
Now if f is a function of two independent variables, x and y, then we can define the derivative of f with respect to each of these variables as follows:
$$\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0}\left(\frac{\Delta f}{\Delta x}\right)_{y=\text{const}} = \lim_{\Delta x \to 0}\left(\frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}\right)_{y=\text{const}} \quad (1)$$
In this operation we treat y as a constant. It is basically ignored. Note the special notation used for the partial derivative. Note that $\dfrac{\partial f}{\partial x}$ and $\dfrac{df}{dx}$ have different meanings in multivariable calculus, so we need to be careful. The partial derivative with respect to y is similarly defined as
$$\frac{\partial f}{\partial y} = \lim_{\Delta y \to 0}\left(\frac{\Delta f}{\Delta y}\right)_{x=\text{const}} = \lim_{\Delta y \to 0}\left(\frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}\right)_{x=\text{const}} \quad (2)$$
where x is held constant throughout.
The basic concepts of differentiation (e.g. the product rule, quotient rule, associative and distributive properties) extend across to higher dimensions as expected.
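Returning to our visualisation of $z = f(x, y)$ as representing a height or a 3-D surface, the partial derivative $\dfrac{\partial z}{\partial x}$ represents the rate of change in height in the x direction, or the slope of the surface in the x direction.
Example: Find both partial derivatives of
$$f(x, y) = \sin(xy) + x^2 + x/y$$
$$\frac{\partial f}{\partial x} = \cos(xy) \cdot y + 2x + 1/y = y\cos(xy) + 2x + 1/y$$
$$\frac{\partial f}{\partial y} = \cos(xy) \cdot x - xy^{-2} = x\cos(xy) - xy^{-2}$$
Example: Given
$$f(x, y) = \sin(xy) + x^2 + x/y,$$
find both $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ at the point $(\pi, 1)$.
$$\left.\frac{\partial f}{\partial x}\right|_{(\pi,1)} = \cos(\pi) + 2\pi + 1 = 2\pi$$
$$\left.\frac{\partial f}{\partial y}\right|_{(\pi,1)} = \pi\cos(\pi) - \pi = -2\pi$$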
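The limit definitions (1) and (2) suggest a simple numerical check of these values: hold one variable fixed and form a difference quotient in the other. The sketch below uses a symmetric (central) difference, which is a choice made for accuracy and not the one-sided quotient of the definition; the helper names `partial_x` and `partial_y` and the step size `h` are assumptions of the illustration.

```python
import math

def f(x, y):
    """The example function f(x, y) = sin(xy) + x^2 + x/y."""
    return math.sin(x * y) + x**2 + x / y

def partial_x(func, x, y, h=1e-6):
    """Approximate df/dx at (x, y) with y held constant."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Approximate df/dy at (x, y) with x held constant."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

# Compare with the exact values 2*pi and -2*pi found above.
print(partial_x(f, math.pi, 1.0), 2 * math.pi)
print(partial_y(f, math.pi, 1.0), -2 * math.pi)
```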
As the text notes, partial differentiation can readily be extended to instances of more than two independent variables.
Example (from text): Given
$$f(x, y, z) = xyz^2 + 3xy - z$$
find $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$ and $\dfrac{\partial f}{\partial z}$.
$$\frac{\partial f}{\partial x} = yz^2 + 3y$$
$$\frac{\partial f}{\partial y} = xz^2 + 3x$$
$$\frac{\partial f}{\partial z} = 2xyz - 1$$
Suppose we want to evaluate the partial derivative at a specified point. That is, we want to quantify the slope given a choice of x and y. Just as in one dimension, we must take the derivative first before plugging in the variable. Note that since y is held constant in calculating $\dfrac{\partial f}{\partial x}$, it doesn't really matter when we substitute in the given value of y.
3. The gradient and directional derivatives
Staying in Cartesian coordinates, it is natural to extend the partial derivatives to include a direction. That is, we can turn them into a vector. Assuming that $\dfrac{\partial f}{\partial x}$ points in the direction of x and $\dfrac{\partial f}{\partial y}$ points in the direction of y, we can define the gradient of the field $f(x, y)$ as
$$\nabla f(x, y) = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} \quad (3)$$
where $\mathbf{i}$ and $\mathbf{j}$ are the unit vectors in the direction of x and y, respectively. The gradient of the field f is often abbreviated as 'grad f' and given the notation $\nabla f$.
Example: Given the scalar field
$$f(x, y) = \sqrt{16 - x^2 - y^2},$$
calculate $\nabla f$. Sketch these vectors on the contour map of $f(x, y)$.
Solution:
$$\nabla f(x, y) = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = \frac{1}{2}\left(16 - x^2 - y^2\right)^{-1/2} \cdot (-2x)\,\mathbf{i} + \frac{1}{2}\left(16 - x^2 - y^2\right)^{-1/2} \cdot (-2y)\,\mathbf{j}$$
$$= \frac{-1}{\sqrt{16 - x^2 - y^2}}\left(x\mathbf{i} + y\mathbf{j}\right)$$
Note that the gradient vector is always perpendicular to a level curve at a given point and points in the direction of increasing function value.
The previous example revealed a noteworthy point about the gradient. At all points the vectors of the gradient are at right angles to the contour lines. In this two-dimensional, Cartesian coordinate picture, the gradient points us in the direction of greatest change of our scalar field $f(x, y)$. Going back to our analogy of $f(x, y)$ representing the contours of height on a map, the gradient of $f(x, y)$ gives us a vector that tells us the direction of the maximum slope and its magnitude.
Example: Given the scalar field $f(x, y) = xy$, draw the contour field, calculate $\nabla f$ and sketch the gradient vectors over the contour lines.
$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = y\mathbf{i} + x\mathbf{j}$$
[Figure: contour plot of f(x, y) = xy with gradient vectors; axes x and y, each with ticks from −5 to 5.]
Please note that the gradient can readily be extended to higher dimensions.
Example: Given the scalar field
$$f(x, y, z) = z + (x^2 + y^2)$$
calculate $\nabla f$. Sketch a level surface $f(x, y, z) = k$ for some suitable value of k and plot $\nabla f$ at a point on this surface. (The graphic illustrates the case k = 1, i.e. the surface $z + (x^2 + y^2) = 1$.)
[Figure: the level surface z + (x² + y²) = 1; axes x, y and z.]
Example: Given the scalar field $f(x, y, z) = xyz^2 + 3xy - z$, calculate $\nabla f$.
$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} = \left(yz^2 + 3y\right)\mathbf{i} + \left(xz^2 + 3x\right)\mathbf{j} + (2xyz - 1)\,\mathbf{k}$$
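Directional derivative
We've seen that $\nabla f$ is a vector that tells us the direction and magnitude of the rate of change of the scalar field $f(x, y)$. We can also use $\nabla f$ to find the rate of change of the scalar field $f(x, y)$ in some arbitrary direction. This is known as the directional derivative. Specifically, if we are given a scalar field $f(x, y)$ and a specified orientation to follow, say
$$\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j},$$
the unit vector having the same direction as $\mathbf{v}$ is $\hat{\mathbf{v}} = \dfrac{\mathbf{v}}{\|\mathbf{v}\|}$ where $\|\mathbf{v}\| = \sqrt{v_x^2 + v_y^2}$;
then the directional derivative $D_{\mathbf{v}} f$ is defined as
$$D_{\mathbf{v}} f = \nabla f \cdot \left(\frac{\mathbf{v}}{\|\mathbf{v}\|}\right) \quad (4)$$
Example: Given the scalar field $f(x, y) = xy$, find the directional derivative in the direction of
$$\mathbf{v} = 3\mathbf{i} + 4\mathbf{j}$$
at the points $(1, 1)$, $(1, -1)$ and $(-4, 3)$.
$\mathbf{v} = 3\mathbf{i} + 4\mathbf{j}$ so that $\|\mathbf{v}\| = \sqrt{(3)^2 + (4)^2} = 5$ and hence $\hat{\mathbf{v}} = \frac{3}{5}\mathbf{i} + \frac{4}{5}\mathbf{j}$.
$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = y\mathbf{i} + x\mathbf{j}$$
Hence $D_{\mathbf{v}} f(x, y) = \nabla f \cdot \hat{\mathbf{v}} = \frac{3}{5}y + \frac{4}{5}x$.
$D_{\mathbf{v}} f(1, 1) = \frac{7}{5}$, $D_{\mathbf{v}} f(1, -1) = \frac{1}{5}$, $D_{\mathbf{v}} f(-4, 3) = -\frac{7}{5}$.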
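Definition (4) translates directly into a few lines of code: normalise the direction vector, then take the dot product with the gradient. The sketch below repeats the example for $f(x, y) = xy$; the function names `directional_derivative` and `grad_f` are assumptions made for the illustration.

```python
import math

def directional_derivative(grad, v):
    """Return grad . (v / ||v||) for a gradient vector and a direction v."""
    norm = math.hypot(*v)               # ||v|| for a 2-D vector
    unit = [component / norm for component in v]
    return sum(g * u for g, u in zip(grad, unit))

def grad_f(x, y):
    """Gradient of f(x, y) = xy, namely (df/dx, df/dy) = (y, x)."""
    return (y, x)

v = (3.0, 4.0)
for point in [(1, 1), (1, -1), (-4, 3)]:
    print(point, directional_derivative(grad_f(*point), v))
```

Because the direction is normalised inside the function, passing `(6.0, 8.0)` in place of `(3.0, 4.0)` gives the same results.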
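The definition of the directional derivative presented here differs, in notation, from that presented in the text. One would find that the definitions are identical in practice since:
$$\frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{v_x\mathbf{i} + v_y\mathbf{j}}{\sqrt{v_x^2 + v_y^2}} = \frac{v_x}{\sqrt{v_x^2 + v_y^2}}\,\mathbf{i} + \frac{v_y}{\sqrt{v_x^2 + v_y^2}}\,\mathbf{j} = \cos(\alpha)\,\mathbf{i} + \sin(\alpha)\,\mathbf{j} \quad (5)$$
where $\alpha$ is the angle that the vector $\mathbf{v}$ makes with the x axis. Using the dot product, eq. (4) becomes:
$$\nabla f \cdot \left(\frac{\mathbf{v}}{\|\mathbf{v}\|}\right) = \left(\frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j}\right) \cdot \left(\cos(\alpha)\,\mathbf{i} + \sin(\alpha)\,\mathbf{j}\right) = \frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\sin\alpha \quad (6)$$
Equation (6) is the definition of directional derivative (of functions of two variables) given in the text.
The vector definition presented in these notes is, in general, far more widely used in mathematics and engineering, as it can readily be extended to other coordinate systems and higher dimensions.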
4. The chain rule
In one dimension the chain rule was employed when $f = f(x)$ and $x = x(t)$. In such a case,
$$\frac{df}{dt} = \frac{df}{dx} \times \frac{dx}{dt}.$$
When moving to multiple dimensions, the basic concept is extended.
Suppose that we have $z = f(x, y)$ with $x = x(s, t)$ and $y = y(s, t)$. Here we have f as a function of two variables, and each of these variables, in turn, is a function of two variables. In this case we may find an expression for the change in f with respect to s and t:
$$\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}$$
and
$$\frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}$$
As the text notes, a good example of this is when undertaking a coordinate transformation. If a function is defined in Cartesian coordinates, and we wish to change over to polar coordinates $(r, \theta)$, then we need to recall the relations
$$x = r\cos\theta \quad \text{and} \quad y = r\sin\theta.$$
In calculating the partial derivatives, one can either completely change coordinate systems first and then compute the partial derivatives, or apply the chain rule.
Example: Given that the function $z = \sin(xy)$ is defined in a Cartesian coordinate system, find the partial derivatives $\dfrac{\partial z}{\partial r}$ and $\dfrac{\partial z}{\partial \theta}$.
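$$\frac{\partial z}{\partial x} = y\cos(xy), \qquad \frac{\partial z}{\partial y} = x\cos(xy)$$
From $x = r\cos\theta$ and $y = r\sin\theta$ we have:
$$\frac{\partial x}{\partial r} = \cos\theta, \qquad \frac{\partial x}{\partial \theta} = -r\sin\theta$$
$$\frac{\partial y}{\partial r} = \sin\theta, \qquad \frac{\partial y}{\partial \theta} = r\cos\theta$$
Now
$$\frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r}$$
$$= y\cos(xy)\cos\theta + x\cos(xy)\sin\theta$$
$$= 2r\cos\theta\sin\theta\cos(xy)$$
$$= r\cos(xy)\sin(2\theta)$$
$$\frac{\partial z}{\partial \theta} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial \theta}$$
$$= y\cos(xy) \cdot (-r\sin\theta) + x\cos(xy) \cdot r\cos\theta$$
$$= \cos(xy)\left(r^2\cos^2\theta - r^2\sin^2\theta\right) = r^2\cos(xy)\cos(2\theta)$$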
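The two results can be cross-checked against the other route mentioned above: change to polar coordinates first, then differentiate. The sketch below does the second step numerically with a central difference. The sample point $(r, \theta) = (1.3, 0.7)$, the step size and all function names are arbitrary choices made for this illustration.

```python
import math

def z_polar(r, theta):
    """z = sin(xy) rewritten in polar coordinates."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return math.sin(x * y)

def dz_dr(r, theta):
    """Chain-rule result: r cos(xy) sin(2 theta)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return r * math.cos(x * y) * math.sin(2 * theta)

def dz_dtheta(r, theta):
    """Chain-rule result: r^2 cos(xy) cos(2 theta)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return r**2 * math.cos(x * y) * math.cos(2 * theta)

def central_diff(g, t, h=1e-6):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

r0, th0 = 1.3, 0.7
print(dz_dr(r0, th0), central_diff(lambda r: z_polar(r, th0), r0))
print(dz_dtheta(r0, th0), central_diff(lambda th: z_polar(r0, th), th0))
```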
Suppose now we have $z = f(x, y)$ and that x and y are functions of a single variable t. Here we might think of x and y being our Cartesian coordinates again, but these values are functions of the time t. (Thus $x(t)$ and $y(t)$ define a path of some particle as it moves in the x-y plane.)
We can then define a derivative of z with respect to t as follows:
$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$
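Example: Given $z(x, y) = x^2 y - y\ln x - 2x$ with the further relations $x(t) = t^2$ and $y(t) = \cos(t)$, find $\dfrac{dz}{dt}$ and evaluate it at the time $t = \pi$.
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$
$$\frac{\partial}{\partial x}\left(x^2 y - y\ln x - 2x\right) = 2xy - \frac{y}{x} - 2$$
$$\frac{\partial}{\partial y}\left(x^2 y - y\ln x - 2x\right) = x^2 - \ln x$$
$$\frac{dx}{dt} = 2t = 2\pi \text{ when } t = \pi$$
$$\frac{dy}{dt} = -\sin t = 0 \text{ when } t = \pi$$
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = \left(2xy - \frac{y}{x} - 2\right)2\pi + 0$$
$$= \left(-2\pi^2 + \frac{1}{\pi^2} - 2\right)2\pi \quad \text{(substituting } x = \pi^2 \text{ and } y = -1 \text{ when } t = \pi\text{)}$$
$$= -4\pi^3 + \frac{2}{\pi} - 4\pi$$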
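This answer can be checked in two independent ways: by coding the chain-rule expression, and by substituting $x(t)$ and $y(t)$ into z and differentiating the resulting function of t numerically. The sketch below does both; the function names and the step size are assumptions of the illustration.

```python
import math

def z_of_t(t):
    """z(x(t), y(t)) with x = t^2 and y = cos t substituted directly."""
    x, y = t**2, math.cos(t)
    return x**2 * y - y * math.log(x) - 2 * x

def dz_dt_chain(t):
    """dz/dt from the chain rule: z_x * dx/dt + z_y * dy/dt."""
    x, y = t**2, math.cos(t)
    dz_dx = 2 * x * y - y / x - 2
    dz_dy = x**2 - math.log(x)
    return dz_dx * (2 * t) + dz_dy * (-math.sin(t))

t0, h = math.pi, 1e-6
numeric = (z_of_t(t0 + h) - z_of_t(t0 - h)) / (2 * h)   # central difference
closed_form = -4 * math.pi**3 + 2 / math.pi - 4 * math.pi

print(dz_dt_chain(t0), numeric, closed_form)
```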
We are then back to optimising a function of two independent variables and we could approach the problem as was done in the previous section.
Please note however that this can be very tedious. We can actually manipulate this problem to present it in a manner that is usually easier to solve. Consider the constraint (26.1). This, in general, represents a surface in 3-D space. We will define small motions along this surface as $d\mathbf{s} = (dx, dy, dz)$. Without any loss of generality, we can consider this to be a vector in the 3-D Cartesian space. Since $g(x, y, z)$ is constrained to be zero, we know that motion along this surface won't change the value of $g(x, y, z)$:
$$dg = \nabla g \cdot d\mathbf{s} = \frac{\partial g}{\partial x}dx + \frac{\partial g}{\partial y}dy + \frac{\partial g}{\partial z}dz = 0$$
Now assume that we are at the conditional stationary point, that is, the point that both satisfies the constraint and optimises $f(x, y, z)$ under this constraint. Then small motions along the surface will also require
$$df = \nabla f \cdot d\mathbf{s} = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz = 0$$
Using our basic understanding of the vector dot product, we know that both $\nabla f$ and $\nabla g$ are perpendicular to every such $d\mathbf{s}$. Both are therefore normal to the surface at this point, so one must be a scalar multiple of the other:
$$\nabla f - \lambda\nabla g = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) - \lambda\left(\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z}\right) = (0, 0, 0) \quad (26.2)$$
Here $\lambda$ is basically another unknown variable. At this point in time, some students might be asking what the advantage in all of this is. We have moved from our initial optimisation problem with three unknowns (x, y and z) to a system with four equations [(26.1) and the three of (26.2)] and four unknowns (x, y, z and $\lambda$). Experience tells us that this new approach is often easier to solve than the original problem. Please note that the variable $\lambda$ is called the Lagrange multiplier and the function
$$\phi(x, y, z) = f(x, y, z) - \lambda g(x, y, z)$$
is called the auxiliary function.
Example: Find the extrema of the function
$$f(x, y, z) = x^2 + y^2 + z^2$$
subject to the constraint
$$g(x, y, z) = x^2 + 2y^2 - z^2 - 1 = 0$$
Solution: The three Lagrange multiplier equations can be written:
$$\nabla f = (2x, 2y, 2z) = \lambda\nabla g = \lambda(2x, 4y, -2z)$$
The first equation $2x = \lambda 2x$ gives $\lambda = 1$ or $x = 0$.
If $\lambda = 1$ (x is arbitrary) then the second component gives $2y = 4y$, hence $y = 0$; and the third component $2z = -2z$ gives $z = 0$.
Solving the constraint equation $x^2 + 2y^2 - z^2 - 1 = 0$ with $y = z = 0$ gives $x = \pm 1$.
Using the second equation $2y = \lambda 4y$ we have $\lambda = 1/2$ (or $y = 0$).
If $\lambda = 1/2$, then y can be arbitrary and equations 1 and 3 give $x = z = 0$. The constraint equation $x^2 + 2y^2 - z^2 - 1 = 0$ with $x = z = 0$ gives $y = \pm 1/\sqrt{2}$.
Using the third equation $2z = \lambda(-2z)$ we have $\lambda = -1$ (or $z = 0$).
If $\lambda = -1$, then z can be arbitrary and equations 1 and 2 give $x = y = 0$. The constraint equation $x^2 + 2y^2 - z^2 - 1 = 0$ becomes $-z^2 = 1$, which has no solution.
There are thus the 4 constrained extreme points: $(\pm 1, 0, 0)$ with $f(x, y, z) = 1$, and $(0, \pm 1/\sqrt{2}, 0)$ with $f(x, y, z) = 1/2$.
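Each candidate point can be verified by checking the two conditions it must satisfy: the constraint $g = 0$ and the parallel-gradient condition $\nabla f = \lambda\nabla g$. The sketch below is an illustration only; the helper `is_stationary` and the tolerance are assumptions, and the multipliers are the values of $\lambda$ found in the solution above.

```python
import math

def f(x, y, z):
    return x * x + y * y + z * z

def g(x, y, z):
    return x * x + 2 * y * y - z * z - 1

def grad_f(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def grad_g(x, y, z):
    return (2 * x, 4 * y, -2 * z)

def is_stationary(point, lam, tol=1e-12):
    """True if point lies on g = 0 and grad f = lam * grad g there."""
    on_constraint = abs(g(*point)) < tol
    parallel = all(abs(a - lam * b) < tol
                   for a, b in zip(grad_f(*point), grad_g(*point)))
    return on_constraint and parallel

root_half = 1 / math.sqrt(2)
candidates = [((1.0, 0.0, 0.0), 1.0), ((-1.0, 0.0, 0.0), 1.0),
              ((0.0, root_half, 0.0), 0.5), ((0.0, -root_half, 0.0), 0.5)]

for point, lam in candidates:
    print(point, lam, is_stationary(point, lam), f(*point))
```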
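Example: Find the extrema of the function $f(x, y, z) = xyz$ subject to the constraint $g(x, y, z) = x^2 + y^2 + z^2 = 1$.
Solution: The three Lagrange multiplier equations can be written:
$$\nabla f = (yz, xz, xy) = \lambda\nabla g = \lambda(2x, 2y, 2z)$$
Assuming x, y and z are all nonzero,
$$\lambda = \frac{yz}{2x} = \frac{xz}{2y} = \frac{xy}{2z}$$
so that
$$y^2 = x^2; \quad z^2 = y^2; \quad x^2 = z^2$$
The constraint $x^2 + y^2 + z^2 = 1$ then gives $3x^2 = 1 \Rightarrow x = \pm\dfrac{1}{\sqrt{3}}$, and likewise $y = \pm\dfrac{1}{\sqrt{3}}$; $z = \pm\dfrac{1}{\sqrt{3}}$,
so there are eight points: $\left(\pm\dfrac{1}{\sqrt{3}}, \pm\dfrac{1}{\sqrt{3}}, \pm\dfrac{1}{\sqrt{3}}\right)$. At these points $f = xyz = \pm\dfrac{1}{3\sqrt{3}}$.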
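Example: Use the method of Lagrange multipliers to find the maximum possible volume of a cone inscribed in a sphere of radius a.
Solution: Let the cone have height h and radius r.
The function to be maximised is $V = \frac{1}{3}\pi r^2 h$.
The fact that the cone is inscribed in the sphere leads to the constraint:
$$a^2 = r^2 + (h - a)^2 = g(r, h).$$
This time there are two Lagrange multiplier equations:
$$\nabla V = \left(\frac{2}{3}\pi r h, \frac{1}{3}\pi r^2\right) = \lambda\nabla g = \lambda\left(2r, 2(h - a)\right)$$
so
$$\lambda = \frac{\frac{2}{3}\pi r h}{2r} = \frac{\frac{1}{3}\pi r^2}{2(h - a)}$$
hence $\dfrac{2h}{r} = \dfrac{r}{h - a}$, i.e. $r^2 = 2h^2 - 2ah$. Substituting into the constraint gives $2h^2 - 2ah + h^2 - 2ah + a^2 = a^2$, so
$$3h^2 - 4ah = 0 \text{ and hence } h(3h - 4a) = 0 \Rightarrow h = \frac{4a}{3} \text{ (or } h = 0\text{)}$$
From $r^2 + (h - a)^2 = a^2$ we get
$$r^2 = a^2 - (h - a)^2 = a^2 - \left(\frac{a}{3}\right)^2 = \frac{8}{9}a^2$$
$$r = \frac{2\sqrt{2}}{3}a$$
$$V_{\max} = \frac{1}{3}\pi r^2 h = \frac{32\pi}{81}a^3$$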
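As a sanity check on this result, the constraint can be used to eliminate r, leaving the volume as a function of h alone, $V(h) = \frac{1}{3}\pi\left(a^2 - (h - a)^2\right)h$ for $0 \le h \le 2a$, which can then be maximised by a brute-force search. This is a different technique from the Lagrange multiplier method used above and is offered only as a cross-check; the function names and the grid size are assumptions of the sketch.

```python
import math

def cone_volume(h, a=1.0):
    """Volume of the inscribed cone of height h, with r^2 = a^2 - (h - a)^2."""
    r_squared = a * a - (h - a) ** 2
    return math.pi * r_squared * h / 3.0

def best_height(a=1.0, steps=200_000):
    """Grid search over 0 <= h <= 2a for the height that maximises the volume."""
    heights = (2 * a * k / steps for k in range(steps + 1))
    return max(heights, key=lambda h: cone_volume(h, a))

h_star = best_height()
# Compare with h = 4a/3 and V = 32*pi*a^3/81 for a = 1.
print(h_star, 4 / 3, cone_volume(h_star), 32 * math.pi / 81)
```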