Lecture Three: Linear Algebra*

Hongjun Li†

October 2014

Contents

1 Linear System
2 Matrix Algebra
  2.1 Matrix Operations
  2.2 Invertible Matrix
  2.3 Partitioned Matrix
3 Determinants
4 Vector Spaces
  4.1 General Vector Space
  4.2 Matrix Spaces
  4.3 Norm and Inner Product
5 Eigenvalues and Eigenvectors
  5.1 Definitions and Examples
  5.2 Properties of Eigenvalues
  5.3 Symmetric Matrices
    5.3.1 Properties of Symmetric Matrices
    5.3.2 Quadratic Forms
6 Homework

* This is the lecture note for the class of Mathematical Foundations for Economists. Comments are welcome.
† Email: [email protected]. Phone: 83952487. Address: Chengming Build. 331, Capital University of Economics and Business, Beijing, China.
1 Linear System
The analysis of many economic models reduces to the study of systems of equations. Furthermore, some of the most frequently studied economic models are linear models. Even when the relationships among the variables are described by a system of nonlinear equations, one can take derivatives of the equations to obtain an approximating linear system.
An equation is linear if it has the form
a1x1 + a2x2 + · · ·+ anxn = b,
where the letters a1, a2, · · · , an, and b, which stand for fixed numbers, are called parameters. The letters x1, x2, · · · , xn stand for variables.

A system of linear equations (or a linear system) is a collection of one or more linear equations involving the same variables, say x1, · · · , xn. It looks like

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
· · ·
ak1x1 + ak2x2 + · · · + aknxn = bk
(1.1)

A solution of the system is a list (s1, s2, · · · , sn) of numbers that makes each equation a true statement when the values s1, s2, · · · , sn are substituted for x1, x2, · · · , xn, respectively. The set of all possible solutions is called the solution set of the linear system. Two linear systems are called equivalent if they have the same solution set. A system of linear equations is said to be consistent if it has either one solution or infinitely many solutions; a system is inconsistent if it has no solution.
The matrix notation of the linear system (1.1) is

[ a11 a12 · · · a1n ] [ x1 ]   [ b1 ]
[ a21 a22 · · · a2n ] [ x2 ] = [ b2 ]
[ · · ·             ] [ .. ]   [ .. ]
[ ak1 ak2 · · · akn ] [ xn ]   [ bk ]
(1.2)

or

AX = B,

where

A =
[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[ · · ·             ]
[ ak1 ak2 · · · akn ]

is the coefficient matrix and X = (x1, x2, · · · , xn)′ is the vector of unknown variables. The matrix

[ a11 a12 · · · a1n b1 ]
[ a21 a22 · · · a2n b2 ]
[ · · ·                ]
[ ak1 ak2 · · · akn bk ]

is called the augmented matrix of the system.
To solve the linear system, we may apply elementary row operations, which include:
• (Scaling) Multiply all entries in a row by a nonzero constant.
• (Replacement) Replace one row by the sum of itself and a multiple of another row.
• (Interchange) Interchange two rows.
An elementary row operation is a special type of function (rule) e which associates with each k × n matrix A a k × n matrix e(A). One can describe e precisely in the three cases as follows:

• (Scaling) e(A)ij = Aij if i ≠ r, e(A)rj = cArj;

• (Replacement) e(A)ij = Aij if i ≠ r, e(A)rj = Arj + cAsj;

• (Interchange) e(A)ij = Aij if i is different from both r and s, e(A)rj = Asj and e(A)sj = Arj.
Theorem 1.1 To each elementary row operation e there corresponds an elementary row operation e1, of the same type as e, such that e1(e(A)) = A for each A.
Two matrices are called row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other.
Definition 1.1 (Echelon Form) A rectangular matrix is in echelon form if it has the following three properties:
1. All nonzero rows are above any rows of all zeros;
2. Each leading entry (the leftmost nonzero entry) of a row is in a column to the right of the leading entry of the row above it;
3. All entries in a column below a leading entry are zeros.
Definition 1.2 (Reduced Echelon Form) If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form:
1. The leading entry in each nonzero row is 1;
2. Each leading 1 is the only nonzero entry in its column.
Theorem 1.2 (Uniqueness of the Reduced Echelon Form) Each matrix is row equivalent to one and only one reduced echelon matrix.

Theorem 1.3 (Existence Theorem) A linear system is consistent if and only if an echelon form of the augmented matrix has no row of the form
[0, · · · , 0, b] with b nonzero.
Example 1.1 If

A =
[ 3 −1 2 ]
[ 2  1 1 ]
[ 1 −3 0 ]

find all solutions of AX = 0.
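As a quick numerical check (not part of the original notes, using numpy), the coefficient matrix of Example 1.1 has nonzero determinant, so the homogeneous system AX = 0 admits only the trivial solution X = 0:

```python
import numpy as np

# Example 1.1: a nonzero determinant means A is invertible, so the
# homogeneous system AX = 0 has only the trivial solution X = 0.
A = np.array([[3.0, -1.0, 2.0],
              [2.0,  1.0, 1.0],
              [1.0, -3.0, 0.0]])

d = np.linalg.det(A)
print(d)  # about -6, nonzero: only the trivial solution
```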
2 Matrix Algebra
A matrix is simply a rectangular array of numbers. If A is a k × n matrix, it has k rows and n columns. The number in row i and column j is called the (i, j)-entry, denoted by aij as the following equation shows.
A = [aij] =
[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[ · · ·             ]
[ ak1 ak2 · · · akn ]
(2.1)
Each column of A is a list of k real numbers, which identifies a vector in Rk. Denote these columns by a1, a2, · · · , an; then the matrix A can be written as

A = [a1, a2, · · · , an].
The diagonal entries in a k × n matrix A = [aij] are a11, a22, a33, · · · , and they form the main diagonal of A. If k = n, then A is a square matrix. Several particular types of square matrices occur frequently in economics and finance.
• A symmetric matrix is one in which aij = aji for all i and j.
• A diagonal matrix is a square matrix whose only nonzero elements appear on the maindiagonal.
• A scalar matrix is a diagonal matrix with the same value in all diagonal elements.
• An identity matrix is a scalar matrix with ones on the diagonal, always denoted by I.
• A triangular matrix is one that has only zeros either above or below the main diagonal.
• An idempotent matrix A is a square matrix for which AA = A.
2.1 Matrix Operations
Definition 2.1 (Equality) Matrices A and B are equal if and only if they have the same dimensions and each element of A equals the corresponding element of B. That is,
A = B if and only if aij = bij for all i and j. (2.2)
Definition 2.2 (Transpose) The transpose of a matrix A, denoted A′, is obtained by creating the matrix whose jth row is the jth column of the original matrix. Thus
B = A′ ⇔ bji = aij for all i and j. (2.3)
For any matrix, (A′)′ = A. If A is symmetric, A = A′.
Definition 2.3 (Matrix Addition) Suppose two matrices A and B are of the same dimensions. We say C is the sum of A and B if C = A + B = [aij + bij].
It is easy to check that for matrices A, B, and C of the same dimensions,
A+B = B +A
(A+B) + C = A+ (B + C)
(A+B)′ = A′ +B′
Definition 2.4 (Scalar Multiplication) Scalar multiplication of a matrix is the operation ofmultiplying every element of the matrix by a given scalar. For scalar c and matrix A, cA = [caij ].
Before we introduce matrix multiplication, we first explain one vector multiplication, the inner product. The inner product of two vectors a and b is a scalar which equals a′b = a1b1 + a2b2 + · · · + anbn. It is easy to check that a′b = b′a.
Definition 2.5 (Matrix Multiplication) For a k × n matrix A and an n × m matrix B, the product matrix C = AB is a k × m matrix whose (i, j)-th entry is the inner product of row i of A and column j of B: cij = a′ibj.
If all the multiplications are conformable, then
(AB)C = A(BC)
A(B + C) = AB +AC
(AB)′ = B′A′
(ABC)′ = C ′B′A′
Example 2.1

1. [ 5 −1 2; 0 7 2 ] = [ 1 0; −3 1 ][ 5 −1 2; 15 4 8 ];

2. [ 0 6 1; 9 12 −8; 12 62 −3; 3 8 −2 ] = [ 1 0; −2 3; 5 4; 0 1 ][ 0 6 1; 3 8 −2 ];

3. [ 0 1 0; 0 0 0; 0 0 0 ][ 1 0 3; 2 3 4; −3 1 6 ] = [ 2 3 4; 0 0 0; 0 0 0 ];

4. [ 1 0 3; 2 3 4; −3 1 6 ][ 0 1 0; 0 0 0; 0 0 0 ] = [ 0 1 0; 0 2 0; 0 −3 0 ].

(Rows of a matrix are separated by semicolons.)
Definition 2.6 (Elementary Matrix) An n × n matrix is said to be an elementary matrix if it can be obtained from the n × n identity matrix by means of a single elementary row operation.
Example 2.2 A 2 × 2 elementary matrix is necessarily one of the following:

[ 0 1; 1 0 ], [ 1 c; 0 1 ], [ 1 0; c 1 ], [ c 0; 0 1 ] (c ≠ 0), [ 1 0; 0 c ] (c ≠ 0).
Theorem 2.1 Let e be an elementary row operation and let E be the n × n elementary matrix E = e(I). Then, for every n × k matrix A,
e(A) = EA.
Corollary 2.2 Let A and B be matrices of dimension k × n. Then B is row-equivalent to A if and only if B = PA, where P is a product of k × k elementary matrices.
2.2 Invertible Matrix
Definition 2.7 (Invertible Matrix) Let A be an n × n matrix. An n × n matrix B such that BA = I is called a left inverse of A; an n × n matrix B such that AB = I is called a right inverse of A. If AB = BA = I, then B is called a two-sided inverse of A and A is said to be invertible.

A matrix that is not invertible is sometimes called a singular matrix, and an invertible matrix is called a nonsingular matrix.
Theorem 2.3 Let A = [ a b; c d ]. If ad − bc ≠ 0, then A is invertible and

A−1 = (1/(ad − bc)) [ d −b; −c a ].

If ad − bc = 0, A is singular.
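The closed-form 2 × 2 inverse of Theorem 2.3 can be sketched in a few lines of Python (the helper name inv2 is our own, not from the notes):

```python
import numpy as np

# Theorem 2.3: closed-form inverse of a 2x2 matrix, compared against
# numpy's general-purpose inverse.
def inv2(A):
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1.0, 2.0], [1.0, 3.0]])
print(inv2(A))            # both give [[3, -2], [-1, 1]]
print(np.linalg.inv(A))
```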
Lemma 2.4 If A has a left inverse B and a right inverse C, then B = C.
Proof. B = BI = B(AC) = (BA)C = IC = C.
Theorem 2.5 Let A and B be n× n matrices.
1. If A is invertible, so is A−1 and (A−1)−1 = A.
2. If both A and B are invertible, so is AB, and (AB)−1 = B−1A−1.
The proof is left as exercise.
Corollary 2.6 A product of invertible matrices is invertible.
Theorem 2.7 An elementary matrix is invertible.
Proof. Let E be an elementary matrix corresponding to the elementary row operation e. If e1 is the inverse operation of e and E1 = e1(I), then

EE1 = e(E1) = e(e1(I)) = I

and

E1E = e1(E) = e1(e(I)) = I,

so that E is invertible and E−1 = E1.
Example 2.3

1. [ 0 1; 1 0 ]−1 = [ 0 1; 1 0 ].

2. [ 1 c; 0 1 ]−1 = [ 1 −c; 0 1 ].

3. [ 1 0; c 1 ]−1 = [ 1 0; −c 1 ].

4. When c ≠ 0, [ c 0; 0 1 ]−1 = [ c−1 0; 0 1 ] and [ 1 0; 0 c ]−1 = [ 1 0; 0 c−1 ].
Theorem 2.8 If A is an n× n matrix, the following are equivalent.
• A is invertible.
• A is row-equivalent to the n× n identity matrix.
• A is a product of elementary matrices.
Proof. Let R be a row-reduced echelon matrix which is row-equivalent to A. By Theorem 2.1 and its corollary,

R = Em · · · E2E1A,

where E1, . . . , Em are elementary matrices. Each Ej is invertible, and so

A = (E1)−1 · · · (Em)−1R.

Thus, A is invertible if and only if R is invertible. Since R is a row-reduced echelon matrix, R is invertible if and only if R = I. We have shown that A is invertible if and only if R = I, and if R = I then A = (E1)−1 · · · (Em)−1. Therefore, the above statements are equivalent.
Corollary 2.9 If A is an invertible n × n matrix and if a sequence of elementary row operations reduces A to the identity, then that same sequence of operations, when applied to I, yields A−1.
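Corollary 2.9 suggests a procedure: row-reduce the augmented matrix [A | I]; when the left block becomes I, the right block is A−1. A minimal Gauss-Jordan sketch (function name ours; partial pivoting is added for numerical stability):

```python
import numpy as np

# Gauss-Jordan: apply to [A | I] the row operations that reduce A to I;
# the right block then holds A^{-1} (Corollary 2.9).
def gauss_jordan_inverse(A):
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        # Interchange: bring the largest available pivot into place.
        p = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[p, col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[[col, p]] = M[[p, col]]
        # Scaling: make the pivot equal to 1.
        M[col] /= M[col, col]
        # Replacement: zero out the rest of the column.
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, n:]

A = np.array([[1.0, 2.0], [1.0, 3.0]])
print(gauss_jordan_inverse(A))  # [[3, -2], [-1, 1]]
```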
Corollary 2.10 Let A and B be k × n matrices. Then B is row-equivalent to A if and only if B = PA, where P is an invertible k × k matrix.
Theorem 2.11 If A is an n× n matrix, the following are equivalent.
• A is invertible.
• The homogeneous system AX = 0 has only the trivial solution X = 0.
• The system of equations AX = B has a solution X for each n× 1 matrix B.
Corollary 2.12 A square matrix with either a left or right inverse is invertible.
Proof. We prove only the left inverse case; the right inverse case is left as an exercise. Let A be an n × n matrix and BA = I. Then AX = 0 has only the trivial solution, because X = IX = (BA)X = B(AX) = B0 = 0. Therefore A is invertible.
Corollary 2.13 Let A = A1A2 · · · Am, where A1, · · · , Am are n × n matrices. Then A is invertible if and only if each Aj is invertible.
Example 2.4 If A = [ 1 2; 1 3 ], then A−1 = [ 3 −2; −1 1 ].
2.3 Partitioned Matrix
In formulating the elements of a matrix, it is sometimes useful to group some of the elements in submatrices. For example,

A =
[ 1 4 | 5 ]
[ 2 9 | 3 ]
[ ----+-- ]
[ 8 9 | 6 ]
= [ A11 A12; A21 A22 ].
A is a partitioned matrix. A common special case is the block diagonal matrix:

A = [ A11 0; 0 A22 ],

where A11 and A22 are square matrices. For conformably partitioned matrices A and B,

A + B = [ A11 + B11  A12 + B12; A21 + B21  A22 + B22 ],

and

AB = [ A11 A12; A21 A22 ][ B11 B12; B21 B22 ]
   = [ A11B11 + A12B21  A11B12 + A12B22; A21B11 + A22B21  A21B12 + A22B22 ].
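The blockwise product formula can be checked numerically on a random conformable partition (an illustrative example of our own):

```python
import numpy as np

# Blockwise multiplication agrees with ordinary matrix multiplication
# for conformably partitioned matrices.
rng = np.random.default_rng(0)
A11, A12 = rng.normal(size=(2, 2)), rng.normal(size=(2, 3))
A21, A22 = rng.normal(size=(1, 2)), rng.normal(size=(1, 3))
B11, B12 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
B21, B22 = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))

A = np.block([[A11, A12], [A21, A22]])
B = np.block([[B11, B12], [B21, B22]])
AB_blocks = np.block([
    [A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22],
])
print(np.allclose(A @ B, AB_blocks))  # True
```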
Theorem 2.14 (Column-Row Expansion of AB) If A is k × n and B is n × m, then

AB = [ col1(A) col2(A) · · · coln(A) ][ row1(B); row2(B); · · · ; rown(B) ]
   = col1(A)row1(B) + · · · + coln(A)rown(B).

Proof. For each row index i and column index j, the (i, j)-entry in colp(A)rowp(B) is the product of aip from colp(A) and bpj from rowp(B). Hence the (i, j)-entry in the sum is
ai1b1j + ai2b2j + · · ·+ ainbnj.
This sum is also the (i, j)-entry in AB, by the row-column rule.

The inverse of a block diagonal matrix is

[ A11 0; 0 A22 ]−1 = [ A11−1 0; 0 A22−1 ].
For a general 2 × 2 partitioned matrix, one form of the partitioned inverse is

[ A11 A12; A21 A22 ]−1 = [ A11−1(I + A12F2A21A11−1)  −A11−1A12F2; −F2A21A11−1  F2 ],

where F2 = (A22 − A21A11−1A12)−1.
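This partitioned-inverse formula can likewise be verified numerically (a random example of our own; A11 and the Schur complement are assumed invertible, as the formula requires):

```python
import numpy as np

# Check the partitioned inverse with F2 = (A22 - A21 A11^{-1} A12)^{-1}.
rng = np.random.default_rng(1)
A11, A12 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
A21, A22 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
A = np.block([[A11, A12], [A21, A22]])

A11i = np.linalg.inv(A11)
F2 = np.linalg.inv(A22 - A21 @ A11i @ A12)
Ainv = np.block([
    [A11i @ (np.eye(2) + A12 @ F2 @ A21 @ A11i), -A11i @ A12 @ F2],
    [-F2 @ A21 @ A11i,                            F2],
])
print(np.allclose(Ainv, np.linalg.inv(A)))  # True
```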
3 Determinants
The determinant of a square matrix is a function of the elements of the matrix. For a 2 × 2 matrix A = [ a b; c d ], the determinant of A is denoted as |A| or detA with

|A| = ad − bc.
Given the definition of the determinant for a 2 × 2 matrix, we can now define the determinant of a general square matrix recursively.

Definition 3.1 (Determinant) For n ≥ 2, the determinant of an n × n matrix A = [aij] is the sum of n terms of the form ±a1j detA1j, with plus and minus signs alternating, where the entries a11, a12, · · · , a1n are from the first row of A. In symbols,

detA = a11 detA11 − a12 detA12 + · · · + (−1)^(1+n) a1n detA1n,

where Aij is the submatrix formed by deleting the ith row and jth column of A, and detAij is the determinant of Aij.

detAij is also called the (i, j)-th minor of A, while (−1)^(i+j) detAij is called the (i, j)-th cofactor of A, denoted as Cij.
Example 3.1 Compute the determinant of A = [ 1 5 0; 2 4 −1; 0 −2 0 ].
Theorem 3.1 The determinant of a triangular or diagonal matrix is simply the product of its diagonal entries.
The proof is left as exercise.
Theorem 3.2 The determinant of an n × n matrix A can be computed by a cofactor expansion across any row or down any column. The expansion across the ith row using the cofactors is

detA = ai1Ci1 + ai2Ci2 + · · · + ainCin.

The cofactor expansion down the jth column is

detA = a1jC1j + a2jC2j + · · · + anjCnj.
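The cofactor expansion of Definition 3.1 translates directly into a recursive function. The sketch below (function name ours; exponentially slow, for illustration only) is checked against numpy on the matrix of Example 3.1:

```python
import numpy as np

# Recursive determinant by cofactor expansion across the first row:
# det A = sum_j (-1)^(1+j) a_{1j} det A_{1j}.
def det_cofactor(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Delete row 0 and column j to form the submatrix A_{1,j+1}.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[1.0, 5.0, 0.0],
              [2.0, 4.0, -1.0],
              [0.0, -2.0, 0.0]])
print(det_cofactor(A), np.linalg.det(A))  # both -2 (up to rounding)
```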
Corollary 3.3 For an n × n matrix A, |A| = |A′|.
Example 3.2 Compute

| 5 −7 2  2 |
| 0  3 0 −4 |
| −5 −8 0 3 |
| 0  5 0 −6 |

Theorem 3.4 (Row Operations) Let A be a square matrix.
• (Scaling) If one row of A is multiplied by c to produce B, then |B| = c|A|.
• (Replacement) If a multiple of one row of A is added to another row to produce a matrix B, then |B| = |A|.
• (Interchange) If two rows of A are interchanged to produce B, then |B| = −|A|.
Theorem 3.5 A square matrix A is invertible if and only if |A| ≠ 0.
Theorem 3.6 (Multiplicative Property) If A and B are n× n matrices, then |AB| = |A||B|.
Proof. If A is singular, then AB is singular as well. In this case, |AB| = |A||B| = 0. If A is invertible, then there exist elementary matrices E1, · · · , Ep such that A = Ep · · · E2E1. Theorem 3.4 shows that |EB| = |E||B| whenever E is an elementary matrix, so applying this repeatedly gives |AB| = |Ep · · · E1B| = |Ep| · · · |E1||B| = |Ep · · · E1||B| = |A||B|.
Theorem 3.7 (Cramer's Rule) Let A be an invertible n × n matrix. For any b ∈ Rn, the unique solution x of Ax = b has entries given by

xi = |Ai(b)| / |A|,   i = 1, 2, · · · , n,

where Ai(b) is the matrix obtained from A by replacing column i by the vector b.
Proof. Denote the columns of A by a1, · · · , an and the columns of the n × n identity matrix I by e1, · · · , en. If Ax = b, the definition of matrix multiplication shows that

AIi(x) = A[e1 · · · x · · · en] = [Ae1 · · · Ax · · · Aen] = [a1 · · · b · · · an] = Ai(b).

By the multiplicative property of determinants,

|A||Ii(x)| = |Ai(b)|.

The second determinant on the left is simply xi. Hence |A|xi = |Ai(b)|. As A is invertible, |A| ≠ 0 and xi = |Ai(b)|/|A|.
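Cramer's rule is easy to implement directly (a sketch with our own names and example system; in practice np.linalg.solve is preferred, since computing n + 1 determinants is expensive):

```python
import numpy as np

# Cramer's rule: x_i = |A_i(b)| / |A|, where A_i(b) is A with column i
# replaced by b.
def cramer_solve(A, b):
    d = np.linalg.det(A)
    n = A.shape[0]
    x = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = b
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[3.0, -1.0], [1.0, 2.0]])
b = np.array([5.0, 4.0])
print(cramer_solve(A, b))     # [2, 1]
print(np.linalg.solve(A, b))  # same solution
```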
Cramer's rule leads easily to a general formula for the inverse of an n × n matrix A. The jth column of A−1 is a vector x that satisfies

Ax = ej,

where ej is the jth column of the identity matrix, and the ith entry of x is the (i, j)-entry of A−1. By Cramer's rule,

(i, j)-entry of A−1 = xi = |Ai(ej)| / |A|.
Notice that

|Ai(ej)| = (−1)^(i+j) |Aji| = Cji,

where Cji is a cofactor of A. Thus

A−1 = (1/|A|) ×
[ C11 C21 · · · Cn1 ]
[ C12 C22 · · · Cn2 ]
[ · · ·             ]
[ C1n C2n · · · Cnn ]
(3.1)

The matrix of cofactors on the right side of equation (3.1), with the indices transposed so that Cji sits in position (i, j), is called the adjugate of A, denoted by adjA.
Theorem 3.8 Let A be an invertible n× n matrix. Then
A−1 = (1/|A|) adjA.
Example 3.3 Find the inverse of the matrix A = [ 2 1 3; 1 −1 1; 1 4 −2 ].
4 Vector Spaces
4.1 General Vector Space
Definition 4.1 (Vector Space) A vector space is a nonempty set V of objects, called vectors, on which are defined two operations, called addition and scalar multiplication, subject to the axioms listed below. For all vectors u, v, w ∈ V and for all scalars c and d:

1. The sum of u and v, denoted by u + v, is in V .
2. u + v = v + u.
3. (u + v) + w = u + (v + w).
4. There is a zero vector 0 ∈ V such that u + 0 = u.
5. For each u ∈ V , there is a vector −u ∈ V such that u + (−u) = 0.
6. The scalar multiple of u by c, denoted by cu, is in V .
7. c(u + v) = cu + cv.
8. (c+ d)u = cu + du.
9. c(du) = (cd)u.
10. 1u = u.
Example 4.1 R2 is a vector space.
Definition 4.2 (Subspace) A subspace of a vector space V is a subset H of V that has three properties:
• The zero vector of V is in H.
• H is closed under vector addition. That is, for each u and v in H, the sum u + v is in H.
• H is closed under multiplication by scalars.
Example 4.2 The set consisting of only the zero vector in a vector space V is a subspace of V , called the zero subspace and written as {0}.
Example 4.3 The vector space R2 is not a subspace of R3. The set

H = { (s, t, 0)′ : s and t are real }

is a subspace of R3.
If A is a k × n matrix with columns a1, · · · , an, and if x ∈ Rn, then the product of A and x, denoted by Ax, is the linear combination of the columns of A using the corresponding entries in x as weights; that is,

Ax = [ a1 a2 · · · an ] (x1, · · · , xn)′ = x1a1 + x2a2 + · · · + xnan.
Definition 4.3 (Spanning Vectors) The set of all linear combinations of a set of vectors is the vector space that is spanned by those vectors.

If a set of vectors {v1, · · · , vp} in Rn spans a vector space H, then each vector in H is a linear combination of v1, · · · , vp, and for each group of real numbers {c1, c2, · · · , cp} there exists a vector h ∈ H such that h = c1v1 + c2v2 + · · · + cpvp.
Definition 4.4 (Linear Independence) An indexed set of vectors {v1, · · · , vp} in Rn is said to be linearly independent if the vector equation

x1v1 + x2v2 + · · · + xpvp = 0

has only the trivial solution. The set {v1, · · · , vp} is said to be linearly dependent if there exist weights c1, · · · , cp, not all zero, such that

c1v1 + c2v2 + · · · + cpvp = 0.
Example 4.4 Let v1 = (1, 2, 3)′, v2 = (4, 5, 6)′, and v3 = (2, 1, 0)′. Determine if the set {v1, v2, v3} is linearly independent.
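Example 4.4 can also be settled numerically (not part of the original notes): the set is linearly independent exactly when the matrix with columns v1, v2, v3 has rank 3, i.e. when Vx = 0 has only the trivial solution.

```python
import numpy as np

# Stack the vectors of Example 4.4 as columns and check the rank.
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])
v3 = np.array([2.0, 1.0, 0.0])
V = np.column_stack([v1, v2, v3])
print(np.linalg.matrix_rank(V))  # 2, so the set is linearly dependent
```

Indeed v3 = −2v1 + v2, so a nontrivial dependence relation exists.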
Theorem 4.1 An indexed set {v1, · · · , vp} of two or more vectors, with v1 ≠ 0, is linearly dependent if and only if some vj (with j > 1) is a linear combination of the preceding vectors v1, · · · , vj−1.
Definition 4.5 (Basis) A basis for a subspace H of Rn is a linearly independent set in H that spans H.
Example 4.5 Let e1 = (1, 0, 0)′, e2 = (0, 1, 0)′, and e3 = (0, 0, 1)′. Then {e1, e2, e3} is a basis for R3.
Theorem 4.2 If a vector space V has a basis B = {b1, · · · , bn}, then any set in V containing more than n vectors must be linearly dependent.
Theorem 4.3 If a vector space V has a basis of n vectors, then every basis of V must consist ofexactly n vectors.
The main reason for selecting a basis for a subspace H, instead of merely a spanning set, is that each vector in H can be written in only one way as a linear combination of the basis vectors. To check this, suppose B = {b1, · · · , bp} is a basis for H, and suppose x ∈ H can be generated in two ways, say,

x = c1b1 + · · · + cpbp and x = d1b1 + · · · + dpbp.

Subtracting gives

0 = x − x = (c1 − d1)b1 + · · · + (cp − dp)bp.

Since B is linearly independent, c1 − d1 = · · · = cp − dp = 0. That is, cj = dj for 1 ≤ j ≤ p, which shows that the two representations are actually the same.
Definition 4.6 (Coordinate) Suppose the set B = {b1, · · · , bp} is a basis for a subspace H. For each x ∈ H, the coordinates of x relative to the basis B are the weights c1, · · · , cp such that x = c1b1 + c2b2 + · · · + cpbp, and the vector in Rp

[x]B = (c1, · · · , cp)′

is called the coordinate vector of x (relative to B) or the B-coordinate vector of x.
Example 4.6 Let e1 = (2, 0, 0)′, e2 = (0, 2, 0)′, and e3 = (0, 0, 2)′. Then B = {e1, e2, e3} is a basis for R3. Suppose x = (2, 4, 6)′; then the coordinate vector of x relative to B is [x]B = (1, 2, 3)′.
Theorem 4.4 (Basis Theorem) Let H be a p-dimensional subspace of Rn. Any linearly independent set of exactly p elements in H is automatically a basis for H. Also, any set of p elements of H that spans H is automatically a basis for H.
Definition 4.7 (Dimension) The dimension of a nonzero subspace H, denoted by dimH, is the number of vectors in any basis for H. The dimension of the zero subspace {0} is defined to be zero.
Example 4.7 The space Rn has dimension n.
Example 4.8 Let e1 = (2, 1, 0)′, e2 = (1, 2, 0)′, e3 = (1, 1, 0)′, and H = Span{e1, e2, e3}. Then the dimension of H is 2.
4.2 Matrix Spaces
Definition 4.8 (Column Space) The column space of a matrix A is the set ColA of all linear combinations of the columns of A.
Example 4.9 Let A = [ 1 −3 −4; −4 6 −2; −3 7 6 ] and b = (3, 3, −4)′. Determine whether b is in the column space of A.
Theorem 4.5 The column space of a k × n matrix A is a subspace of Rk.
Definition 4.9 (Rank) The rank of a matrix A, denoted by rankA, is the dimension of the column space of A.
Example 4.10 Determine the rank of the matrix

A =
[ 2  5 −3 −4  8 ]
[ 4  7 −4 −3  9 ]
[ 6  9 −5  2  4 ]
[ 0 −9  6  5 −6 ]
Definition 4.10 (Null Space) The null space of a matrix A is the set NulA of all solutions of the homogeneous equation Ax = 0.
Theorem 4.6 The null space of a k × n matrix A is a subspace of Rn.
Proof. Certainly NulA is a subset of Rn because A has n columns. We need to show that NulA satisfies the three properties of a subspace.

(1) As A0 = 0, 0 ∈ NulA.

(2) For any u, v ∈ NulA, we have Au = 0 and Av = 0. Thus

A(u + v) = Au + Av = 0,

which implies u + v ∈ NulA.

(3) For any scalar c and any u ∈ NulA,

A(cu) = cA(u) = c0 = 0,

so cu ∈ NulA. Together, NulA is shown to be a subspace of Rn.
Theorem 4.7 If a matrix A has n columns, then rankA+ dim NulA = n.
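The rank-nullity relation of Theorem 4.7 can be checked on the matrix of Example 4.10 (a numerical illustration, not part of the original notes):

```python
import numpy as np

# rank A + dim Nul A must equal the number of columns, n = 5.
A = np.array([[2.0, 5.0, -3.0, -4.0, 8.0],
              [4.0, 7.0, -4.0, -3.0, 9.0],
              [6.0, 9.0, -5.0, 2.0, 4.0],
              [0.0, -9.0, 6.0, 5.0, -6.0]])
rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank
print(rank, nullity, rank + nullity)  # rank + nullity = 5
```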
Definition 4.11 (Row Space) For a k × n matrix A, the set of all linear combinations of its row vectors is called the row space of A and is denoted by RowA.
Since the rows of A are identified with the columns of A′, ColA′ = RowA. RowA is a subspace of Rn.
Theorem 4.8 If two matrices A and B are row equivalent, then their row spaces are the same. If B is in echelon form, the nonzero rows of B form a basis for the row space of A as well as for that of B.
Sketch of Proof. Row operations are steps of linear combinations.
Theorem 4.9 Let A be an n × n matrix. Then the following statements are equivalent to the statement that A is invertible.
• The columns of A form a basis of Rn
• ColA = Rn
• dim ColA = n
• rankA = n
• NulA = {0}
• dim NulA = 0
4.3 Norm and Inner Product
Given vectors in Rn, we may want to measure their lengths and the angles between them. Hence, norm and inner product are introduced.
Definition 4.12 (Euclidean Norm) For x = (x1, x2, · · · , xn) ∈ Rn, the Euclidean norm of x, ‖x‖, is defined by

‖x‖ = √(x1² + · · · + xn²).

The distance between two vectors x = (x1, x2, · · · , xn) and y = (y1, y2, · · · , yn) in Rn is given by

‖x − y‖ = √((x1 − y1)² + · · · + (xn − yn)²).
Theorem 4.10 For any c ∈ R and any x ∈ Rn,

‖cx‖ = |c| ‖x‖.
The proof is left as exercise.
Theorem 4.11 For any two vectors x = (x1, x2, · · · , xn) and y = (y1, y2, · · · , yn) in Rn,
‖x + y‖ ≤ ‖x‖+ ‖y‖.
The proof is left as exercise.
Definition 4.13 (Euclidean Inner Product) Let x = (x1, x2, · · · , xn) and y = (y1, y2, · · · , yn) be two vectors in Rn. Then the Euclidean inner product of x and y, written as x · y, is the number

x · y = x1y1 + x2y2 + · · · + xnyn.

It is easy to check that the Euclidean norm is induced by the inner product through

‖x‖ = √(x · x).
Example 4.11 If x = (4,−1, 2) and y = (6, 3,−4) then
x · y = 4 · 6 + (−1) · 3 + 2 · (−4) = 13.
Theorem 4.12 Let x,y, z be arbitrary vectors in Rn and let c be an arbitrary scalar. Then,
• x · y = y · x;
• x · (y + z) = x · y + x · z;
• x · (cy) = (cx) · y = cx · y;
• x · x ≥ 0;
• x · x = 0 implies x = 0;
• (x + y) · (x + y) = x · x + 2x · y + y · y.
Definition 4.14 (Orthogonality) Two vectors x,y ∈ Rn are orthogonal (to each other) if x ·y =0.
Theorem 4.13 (Pythagorean Theorem) Two vectors x, y ∈ Rn are orthogonal if and only if ‖x + y‖² = ‖x‖² + ‖y‖².
Theorem 4.14 Let x and y be two vectors in Rn. Let θ be the angle between them. Then,
x · y = ‖x‖‖y‖ cos θ.
Example 4.12 Let θ be the angle between x = (1, 0, 0) and y = (1, 1, 1). Find cos θ.

cos θ = x · y / (‖x‖‖y‖) = (1, 0, 0) · (1, 1, 1) / (‖(1, 0, 0)‖ ‖(1, 1, 1)‖) = 1/√3.
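The same computation in numpy (an illustrative check, not part of the original notes):

```python
import numpy as np

# cos(theta) = x . y / (||x|| ||y||), Example 4.12.
x = np.array([1.0, 0.0, 0.0])
y = np.array([1.0, 1.0, 1.0])
cos_theta = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta)  # 1/sqrt(3), about 0.5774
```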
5 Eigenvalues and Eigenvectors
The eigenvalues of a given n × n matrix are the n numbers which summarize the essential properties of that matrix. Since they characterize the matrix under study, they are often called the "characteristic values" of the matrix; the standard name "eigenvalue" comes from the German eigen, meaning "own" or "characteristic."
5.1 Definitions and Examples
Definition 5.1 (Eigenvalue) For an n × n matrix A, a scalar λ is called an eigenvalue of A if there is a nontrivial solution x of Ax = λx; such an x is called an eigenvector corresponding to λ.
By the definition of eigenvalue, (A − λI)x = 0 has a nontrivial solution. Then the determinant of A − λI, |A − λI|, must be zero. This provides one way to find the eigenvalues of matrix A: solve

|A − λI| = 0,

which is called the characteristic equation of A; the polynomial |A − λI| itself is called the characteristic polynomial of A.
Example 5.1 Let A = [ 2 0; 0 3 ] and x = (1, 0)′. Then Ax = 2x = (2, 0)′. Thus, 2 is an eigenvalue of A.
Theorem 5.1 The diagonal entries of a diagonal matrix D are eigenvalues of D.
The proof is left as exercise.
Theorem 5.2 A square matrix A is singular if and only if 0 is an eigenvalue of A.
Proof. (If) If 0 is an eigenvalue of A, |A − 0I| = |A| = 0. Thus A is singular.

(Only if) If A is singular, |A − 0I| = |A| = 0. Thus 0 is an eigenvalue of A.
Example 5.2 Is 5 an eigenvalue of A = [ 6 −3 1; 3 0 5; 2 2 6 ]?
The fact that the square matrix A − λI is singular when λ is an eigenvalue of A means that the system of equations (A − λI)x = 0 has at least one solution other than x = 0. Given an eigenvalue λ of A, we can solve (A − λI)x = 0 to find the corresponding eigenvectors.

Theorem 5.3 Let A be an n × n matrix and let λ be a scalar. Then, the following statements are equivalent:
• A− λI is a singular matrix.
• |A− λI| = 0.
• (A− λI)x = 0 for some nonzero vector x.
• Ax = λx for some nonzero vector x.
Example 5.3 Find the eigenvalues and eigenvectors of A = [ −1 3; 2 0 ] and B = [ 1 0 2; 0 5 0; 3 0 2 ].
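The matrices of Example 5.3 can be handed to np.linalg.eig, which returns the eigenvalues and, as the columns of its second output, corresponding eigenvectors (a numerical check, not part of the original notes):

```python
import numpy as np

# Eigenvalues/eigenvectors of the matrices in Example 5.3.
A = np.array([[-1.0, 3.0], [2.0, 0.0]])
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 5.0, 0.0],
              [3.0, 0.0, 2.0]])
wA, VA = np.linalg.eig(A)
wB, VB = np.linalg.eig(B)
print(np.sort(wA.real))  # eigenvalues of A: -3 and 2
print(np.sort(wB.real))  # eigenvalues of B: -1, 4 and 5
```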
5.2 Properties of Eigenvalues
Let λ1, · · · , λn be eigenvalues of the n × n matrix A. Let v1, · · · , vn be the corresponding eigenvectors. Form the matrix P whose jth column is eigenvector vj. Writing diag(λ1, · · · , λn) for the diagonal matrix with diagonal entries λ1, · · · , λn, we have

AP = A[v1, · · · , vn] = [Av1, · · · , Avn] = [λ1v1, · · · , λnvn] = [v1, · · · , vn] diag(λ1, · · · , λn) = P diag(λ1, · · · , λn).

If P is invertible, we multiply both sides by P−1 to obtain

P−1AP = diag(λ1, · · · , λn).
Theorem 5.4 Let λ1, · · · , λn be eigenvalues of the n × n matrix A. Let v1, · · · , vn be the corresponding eigenvectors. Form the matrix P whose jth column is eigenvector vj. If P is invertible, then P−1AP is the diagonal matrix with diagonal entries λ1, · · · , λn:

P−1AP = diag(λ1, · · · , λn).

Conversely, if P−1AP is a diagonal matrix D, then the columns of P must be eigenvectors of A and the diagonal entries of D must be eigenvalues of A.
Theorem 5.5 If v1, · · · , vr are eigenvectors that correspond to distinct eigenvalues λ1, · · · , λr of an n × n matrix A, then the set {v1, · · · , vr} is linearly independent.

Proof. Suppose {v1, · · · , vr} is linearly dependent. Since v1 is nonzero, one of the vectors in the set is a linear combination of the preceding vectors. Let p be the least index such that vp+1 is a linear combination of the preceding vectors. Then there exist scalars c1, · · · , cp such that

c1v1 + · · · + cpvp = vp+1.

Multiplying both sides of the equation by A, we obtain

c1Av1 + · · · + cpAvp = Avp+1,
c1λ1v1 + · · · + cpλpvp = λp+1vp+1.

Multiplying the first displayed equation by λp+1 and subtracting gives

c1(λ1 − λp+1)v1 + · · · + cp(λp − λp+1)vp = 0.

Since {v1, · · · , vp} is linearly independent, ci(λi − λp+1) = 0 for all i = 1, · · · , p. As none of the factors λi − λp+1 are zero, ci = 0 for i = 1, · · · , p. This implies vp+1 = 0, which is impossible. Hence {v1, · · · , vr} must be linearly independent.
Theorem 5.6 An n× n matrix A with n distinct eigenvalues is diagonalizable.
The proof is left as exercise.
Definition 5.2 (Trace) The trace of a square matrix is the sum of its diagonal entries. That is, for an n × n matrix A, its trace is
traceA = a11 + a22 + · · ·+ ann.
Theorem 5.7 Let A be a k × n matrix and B be an n × k matrix. Then
trace(AB) = trace(BA).
Theorem 5.8 Let A be an n× n matrix with eigenvalues λ1, · · · , λn. Then
• λ1 + λ2 + · · ·+ λn = traceA, and
• λ1λ2 · · ·λn = detA.
Proof. Since λ1, · · · , λn are the n roots of the nth order polynomial pA(λ) = |A − λI|, we can write the characteristic polynomial of A as

pA(λ) = β(λ1 − λ)(λ2 − λ) · · · (λn − λ),

where β is some constant. If we multiply this out, we find that the coefficient of λ^n is (−1)^n β, the coefficient of λ^(n−1) is (−1)^(n−1) β(λ1 + λ2 + · · · + λn), and the constant term is βλ1λ2 · · · λn. On the other hand, expanding |A − λI| directly gives

pA(λ) = (−1)^n λ^n + (−1)^(n−1) (traceA) λ^(n−1) + s.o.(λ^(n−1)),

where s.o.(λ^(n−1)) denotes terms of order strictly less than n − 1 in λ, and the constant term of this expansion is pA(0) = detA. Comparing the coefficients of λ^n and λ^(n−1) and the constant terms in the two approaches, we get β = 1 and

λ1 + λ2 + · · · + λn = traceA,   λ1λ2 · · · λn = detA.
Example 5.4 Find the determinant of matrix A =
[ 4 1 1 1 ]
[ 1 4 1 1 ]
[ 1 1 4 1 ]
[ 1 1 1 4 ]

It is easy to check that 3 is an eigenvalue of A. Meanwhile, one can find that 3 is an eigenvalue of A of multiplicity at least 3. Using the fact that the sum of the eigenvalues of A is 16, the fourth eigenvalue of A should be 16 − 3 · 3 = 7. Thus |A| = 7 · 3³ = 189.
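Theorem 5.8 can be confirmed numerically on the matrix of Example 5.4 (an illustrative check, not part of the original notes):

```python
import numpy as np

# Example 5.4: eigenvalues sum to trace(A) and multiply to det(A).
A = np.ones((4, 4)) + 3 * np.eye(4)  # the matrix with 4 on the diagonal, 1 elsewhere
w = np.linalg.eigvalsh(A)            # symmetric, so eigvalsh applies
print(np.sum(w))    # 16, equal to trace(A)
print(np.prod(w))   # 189, equal to det(A)
```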
5.3 Symmetric Matrices
5.3.1 Properties of Symmetric Matrices
Economic and financial analysis draws from two key mathematical areas: optimization theory and statistics. Most of the square matrices that arise in optimization and in economics are symmetric matrices. Symmetric matrices have some nice properties, which are summarized as follows:
Theorem 5.9 Let A be an n × n symmetric matrix. Then

• all n roots of the characteristic equation |A − λI| = 0 are real numbers;

• eigenvectors corresponding to distinct eigenvalues are orthogonal; and

• even if A has multiple eigenvalues, there is a nonsingular matrix P whose columns w1, · · · , wn are eigenvectors of A such that

– w1, · · · , wn are mutually orthogonal to each other,

– P−1 = P′, and

– P−1AP = P′AP = diag(λ1, λ2, · · · , λn), the diagonal matrix with diagonal entries λ1, λ2, · · · , λn.
Definition 5.3 (Orthogonal Matrix) A matrix P which satisfies the condition P−1 = P′, or equivalently, P′P = I, is called an orthogonal matrix.
Example 5.5 Orthogonally diagonalize the symmetric matrix A = [ 3 1 −1; 1 3 −1; −1 −1 5 ].
Solution. The eigenvalues are λ1 = 2, λ2 = 3, λ3 = 6. Corresponding eigenvectors are

v1 = (−1, 1, 0)′,  v2 = (1, 1, 1)′,  v3 = (−1, −1, 2)′.

These three vectors are perpendicular to each other. Normalizing them gives the orthonormal vectors

v1 = (−1/√2, 1/√2, 0)′,  v2 = (1/√3, 1/√3, 1/√3)′,  v3 = (−1/√6, −1/√6, 2/√6)′.

This produces the orthogonal matrix

Q =
[ −1/√2  1/√3  −1/√6 ]
[  1/√2  1/√3  −1/√6 ]
[    0   1/√3   2/√6 ]

Then

Q′AQ =
[ 2 0 0 ]
[ 0 3 0 ]
[ 0 0 6 ]

with Q′Q = I.
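For symmetric matrices, np.linalg.eigh returns the eigenvalues in ascending order together with an orthonormal set of eigenvectors, reproducing Example 5.5 up to the sign of each column (a numerical check, not part of the original notes):

```python
import numpy as np

# Example 5.5: eigh gives an orthogonal Q with Q' A Q diagonal.
A = np.array([[3.0, 1.0, -1.0],
              [1.0, 3.0, -1.0],
              [-1.0, -1.0, 5.0]])
w, Q = np.linalg.eigh(A)
print(w)                                     # approximately [2, 3, 6]
print(np.allclose(Q.T @ Q, np.eye(3)))       # True: Q is orthogonal
print(np.allclose(Q.T @ A @ Q, np.diag(w)))  # True: Q'AQ is diagonal
```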
5.3.2 Quadratic Forms
Definition 5.4 (Quadratic Form) A quadratic form on Rn is a real-valued function of the form

Q(x1, · · · , xn) = ∑_{i≤j} aij xi xj,

in which each term is a monomial of degree two.
Example 5.6 For x ∈ R3, Q(x) = 5x1² + 3x2² + 2x3² − x1x2 + 8x2x3 is a quadratic form.
For every quadratic form Q(x) there is a symmetric matrix A such that Q(x) = x′Ax; and every symmetric matrix A yields a quadratic form Q(x) = x′Ax.
Example 5.7 For x ∈ R3, let Q(x) = 5x1^2 + 3x2^2 + 2x3^2 − x1x2 + 8x2x3. Write this quadratic form as x′Ax.

Solution. The coefficients of x1^2, x2^2, x3^2 go on the diagonal of A. To make A symmetric, the coefficient of xixj for i ≠ j must be split evenly between the (i, j)- and (j, i)-entries of A. Thus

A =
[  5   −1/2  0 ]
[ −1/2   3   4 ]
[  0     4   2 ].
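The split-the-coefficient rule can be verified numerically. This plain-Python sketch (our own, not from the notes) checks that x′Ax reproduces Q(x) for the symmetric A of Example 5.7 at a few test points:

```python
# Symmetric matrix for Q(x) = 5x1^2 + 3x2^2 + 2x3^2 - x1x2 + 8x2x3:
# diagonal carries the squared-term coefficients; each cross coefficient
# is split evenly between the (i, j)- and (j, i)-entries.
A = [[ 5.0, -0.5, 0.0],
     [-0.5,  3.0, 4.0],
     [ 0.0,  4.0, 2.0]]

def quad_form(A, x):
    """Compute x'Ax for a matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def Q(x):
    x1, x2, x3 = x
    return 5*x1**2 + 3*x2**2 + 2*x3**2 - x1*x2 + 8*x2*x3

for x in [(1, 0, 0), (1, 2, 3), (-2, 1, 4)]:
    assert abs(quad_form(A, x) - Q(x)) < 1e-12
print("x'Ax matches Q(x)")  # -> x'Ax matches Q(x)
```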
Definition 5.5 (Definiteness) Let A be an n× n symmetric matrix. Then A is:

• positive definite if x′Ax > 0 for all x ≠ 0 in Rn,

• positive semidefinite if x′Ax ≥ 0 for all x ≠ 0 in Rn,

• negative definite if x′Ax < 0 for all x ≠ 0 in Rn,

• negative semidefinite if x′Ax ≤ 0 for all x ≠ 0 in Rn,

• indefinite if x′Ax > 0 for some x ∈ Rn and x′Ax < 0 for some other x ∈ Rn.

Accordingly, we define the definiteness of the quadratic form Q(x) = x′Ax to be positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite.
Example 5.8 Let

A =
[  5  −2 ]
[ −2   5 ].

Then x′Ax = 5x1^2 − 4x1x2 + 5x2^2 = 3(x1^2 + x2^2) + 2(x1 − x2)^2 > 0 for all x ≠ 0, so A is positive definite.
Theorem 5.10 Let A be a symmetric matrix. Then,
• A is positive definite if and only if all the eigenvalues of A are > 0;
• A is negative definite if and only if all the eigenvalues of A are < 0;
• A is positive semidefinite if and only if all the eigenvalues of A are ≥ 0;
• A is negative semidefinite if and only if all the eigenvalues of A are ≤ 0;
• A is indefinite if and only if there are two eigenvalues of A, λ1 and λ2, such that λ1λ2 < 0.
The proof is left as an exercise.
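For a symmetric 2 × 2 matrix the eigenvalue criterion of Theorem 5.10 is easy to apply by hand, since the eigenvalues come from the quadratic formula. A plain-Python sketch (our own illustration, not from the notes):

```python
from math import sqrt

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]]: roots of
    lam^2 - (a + d)*lam + (a*d - b*b) = 0, which are real by Theorem 5.9."""
    tr, det = a + d, a*d - b*b
    root = sqrt(tr*tr - 4*det)  # discriminant is >= 0 for symmetric matrices
    return (tr - root) / 2, (tr + root) / 2

def classify(a, b, d):
    """Apply the eigenvalue-sign criterion of Theorem 5.10."""
    lo, hi = eigenvalues_2x2(a, b, d)
    if lo > 0: return "positive definite"
    if hi < 0: return "negative definite"
    if lo >= 0: return "positive semidefinite"
    if hi <= 0: return "negative semidefinite"
    return "indefinite"

print(classify(5, -2, 5))  # Example 5.8: eigenvalues 3 and 7 -> positive definite
print(classify(1, 2, 1))   # eigenvalues -1 and 3 -> indefinite
```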
Theorem 5.11 Let A be a symmetric matrix. Then, the following are equivalent:
1. A is positive definite.
2. There exists a nonsingular matrix B so that A = B′B.
3. There exists a nonsingular matrix Q such that Q′AQ = I.
Proof. Since A is an n× n symmetric matrix, by Theorem 5.9 we can write

P′AP = diag(λ1, · · · , λn),

or

A = P diag(λ1, · · · , λn) P′,

where λ1, · · · , λn are the eigenvalues of A, and P = (v1, · · · ,vn) is an orthogonal matrix whose columns are eigenvectors of A.
1 ⇒ 2: If A is positive definite, then λ1, · · · , λn are all positive. Let

B = diag(√λ1, · · · , √λn) P′.

B is nonsingular because each √λi > 0 and P is nonsingular. Then

B′B = P diag(√λ1, · · · , √λn) diag(√λ1, · · · , √λn) P′ = P diag(λ1, · · · , λn) P′ = A.
2 ⇒ 3: If A = B′B for a nonsingular matrix B, then

(B′)−1AB−1 = I.

Define Q = B−1; then Q′ = (B−1)′ = (B′)−1 and

Q′AQ = I.
3 ⇒ 1: If Q′AQ = I for some nonsingular matrix Q, then for any nonzero vector x ∈ Rn,

x′Ax = x′(Q′)−1(Q′AQ)Q−1x = x′(Q′)−1Q−1x = (Q−1x)′(Q−1x) = y′y > 0,

since y = Q−1x ≠ 0 because x is nonzero and Q−1 is nonsingular. Therefore A is positive definite. We have proved the theorem by showing 1 ⇒ 2 ⇒ 3 ⇒ 1.
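The construction B = diag(√λ1, · · · , √λn)P′ from the 1 ⇒ 2 step can be made concrete for the matrix of Example 5.8, whose eigenvalues are 3 and 7 with orthonormal eigenvectors (1, 1)′/√2 and (1, −1)′/√2. This plain-Python sketch (our own illustration) builds B and verifies that B′B recovers A:

```python
from math import sqrt

# Orthogonal eigenvector matrix P and sqrt of the eigenvalue matrix for
# A = [[5, -2], [-2, 5]] (eigenvalues 3 and 7).
r2 = sqrt(2)
P = [[1/r2,  1/r2],
     [1/r2, -1/r2]]
sqrtD = [[sqrt(3), 0],
         [0, sqrt(7)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(r) for r in zip(*X)]

B = matmul(sqrtD, transpose(P))  # B = sqrt(D) P', nonsingular
BtB = matmul(transpose(B), B)    # should recover A
print([[round(v, 10) for v in row] for row in BtB])  # -> [[5.0, -2.0], [-2.0, 5.0]]
```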
There are other results for determining the definiteness of a symmetric matrix. We list them here without proof.
Definition 5.6 (Principal Minor) Let A be an n×n matrix. A k×k submatrix of A formed by deleting n− k columns, say columns i1, i2, · · · , in−k, and the same n− k rows, rows i1, i2, · · · , in−k, from A is called a kth order principal submatrix of A. The determinant of a kth order principal submatrix is called a kth order principal minor of A.
Example 5.9 For a general 3 × 3 matrix

A =
[ a11 a12 a13 ]
[ a21 a22 a23 ]
[ a31 a32 a33 ]

there is one third order principal minor: |A|. There are three second order principal minors:

| a11 a12 |    | a11 a13 |        | a22 a23 |
| a21 a22 |,   | a31 a33 |,  and  | a32 a33 |.

There are three first order principal minors: |a11|, |a22|, and |a33|.
Definition 5.7 (Leading Principal Minor) Let A be an n× n matrix. The kth order principal submatrix of A obtained by deleting the last n− k rows and the last n− k columns from A is called the kth order leading principal submatrix of A, denoted Ak. Its determinant is called the kth order leading principal minor of A, denoted |Ak|.
Theorem 5.12 Let A be an n× n symmetric matrix. Then,
• A is positive definite if and only if all its n leading principal minors are strictly positive.
• A is negative definite if and only if its n leading principal minors alternate in sign as follows: |A1| < 0, |A2| > 0, |A3| < 0, etc. That is, the kth order leading principal minor has the same sign as (−1)^k.

• If some kth order leading principal minor of A is nonzero but does not fit either of the above two sign patterns, then A is indefinite.
Theorem 5.13 Let A be an n× n symmetric matrix. Then, A is positive semidefinite if and only if every principal minor of A is ≥ 0; A is negative semidefinite if and only if every principal minor of odd order is ≤ 0 and every principal minor of even order is ≥ 0.
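The leading-principal-minor test of Theorem 5.12 is mechanical enough to sketch in code. The following plain-Python illustration (our own, not from the notes) computes the leading principal minors by cofactor expansion and applies the sign tests:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_minors(A):
    """|A1|, |A2|, ..., |An|: determinants of the top-left k x k blocks."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def classify(A):
    """Apply the sign patterns of Theorem 5.12 to a symmetric matrix A."""
    m = leading_minors(A)
    if all(v > 0 for v in m):
        return "positive definite"
    if all((v < 0) if k % 2 == 1 else (v > 0) for k, v in enumerate(m, start=1)):
        return "negative definite"
    return "inconclusive (see Theorem 5.13 for semidefiniteness)"

print(classify([[5, -2], [-2, 5]]))  # minors 5, 21 -> positive definite
print(classify([[-3, 1], [1, -3]]))  # minors -3, 8 -> negative definite
```

Cofactor expansion is O(n!) and used here only because it is short; any standard LU-based determinant would serve for larger matrices.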
6 Homework
Question 1. Given A and b as follows, solve the system Ax = b and write the solution as a vector.
(1) A =
[  1  3 −4 ]
[  1  5  2 ]
[ −3 −7  6 ], b = (−2, 4, 12)′;

(2) A =
[  1  2 −1 ]
[ −3 −4  2 ]
[  5  2  3 ], b = (1, 2, −3)′;

(3) A =
[ 2 5 −1 ]
[ 0 1 −1 ]
[ 1 2  0 ], b = (4, −1, 4)′.
Question 2. Given A and B as follows, compute C = AB and D = BA.

(1) A =
[  1 2 ]
[ −2 1 ], B =
[  3 5 ]
[ −1 4 ];

(2) A =
[ 2  0 −1 ]
[ 4 −5  2 ], B =
[  7  1 ]
[ −5 −4 ]
[  1 −3 ];

(3) A =
[ 1 2 3 ]
[ 2 4 5 ]
[ 3 5 6 ], B =
[ 5 0 0 ]
[ 0 3 0 ]
[ 0 0 2 ];

(4) A =
[  2 −5  0 ]
[ −1  3 −4 ]
[  6 −8 −7 ]
[ −3  0  9 ], B =
[ 4 −6  3 1 ]
[ 7  1 −1 2 ]
[ 4  3 −2 1 ].
Question 3. If A is invertible, show that (A′)−1 = (A−1)′.
Question 4. Find the inverses of

A =
[ 1 0 0 ]
[ 1 1 0 ]
[ 1 1 1 ], B =
[ 1 0 0 0 ]
[ 1 1 0 0 ]
[ 1 1 1 0 ]
[ 1 1 1 1 ], C =
[ −1 −7 −3 ]
[  2 15  6 ]
[  1  3  2 ], and D =
[  1  0 −2 ]
[ −3  1  4 ]
[  2 −3  4 ].
Question 5. For each subspace defined below, find a basis for the subspace.

(1) { (s − 2t, s + t, 3t)′ : s, t ∈ R };

(2) { (2a, −4b, −2a)′ : a, b ∈ R };

(3) { (2c, a − b, b − 3c, a + 2b)′ : a, b, c ∈ R };

(4) { (a, b, c) : a − 3b + c = 0, b − 2c = 0, 2b − c = 0 };

(5) { (a, b, c, d) : a − 3b + c = 0 };

(6) Nul A and Col A, given A =
[ 1 −6 9  0 −2 ]
[ 0  1 2 −4  5 ]
[ 0  0 0  5  1 ]
[ 0  0 0  0  0 ].
Question 6. For each of the following matrices A, find a nonsingular matrix P and a diagonal matrix D so that P−1AP = D: