8. Matrix Algebra

9/1/20

We start by defining matrices.

Matrix. An m × n matrix is a rectangular array A of mn elements arranged in m rows and n columns.

For our purposes, the elements will be real or complex numbers or functions taking real or complex values, although more generality is allowed.1

A generic element of A can be written aij for i = 1, . . . , m and j = 1, . . . , n. The element aij is in row i, column j. We can write the matrix A as

A = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{pmatrix}

We may sometimes write A = [aij]. This is not to be confused with notations such as (a + b)ij, used for the ij element of (A + B).

Matrices A and B are equal if they are both m × n and have identical entries, aij = bij for every i = 1, . . . , m and j = 1, . . . , n.

1 For example, there is no problem requiring the elements to be in some arbitrary field F, or to be functions with values in F.


8.1 Matrix Addition

Matrix Addition. We can add two m × n matrices together by adding the corresponding elements. Thus (a + b)ij = aij + bij.

Matrices of different sizes cannot be added together. We will say that matrices are conformable for addition if they have the same number of rows and columns.

It follows that

A + B = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
+ \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mn}
\end{pmatrix}
= \begin{pmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{pmatrix}

Addition is both associative and commutative for matrices because it is associative and commutative for numbers.

(A + B) + C = A + (B + C) addition associates

A + B = B + A addition commutes
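
For instance, a quick numerical check of these rules (a minimal NumPy sketch, added here for illustration; it is not part of the original notes):

```python
import numpy as np

# Two conformable (2 x 3) matrices.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.array([[10.0, 11.0, 13.0],
              [7.0, 5.0, 1.0]])
C = np.ones((2, 3))

# Addition is elementwise, commutative, and associative.
assert np.allclose(A + B, B + A)
assert np.allclose((A + B) + C, A + (B + C))
```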


8.2 Multiplying a Matrix by a Scalar

We can also multiply a matrix by a number α.

Scalar Multiplication. The matrix αA is defined by (αa)ij = αaij.

This means

\alpha A = \begin{pmatrix}
\alpha a_{11} & \alpha a_{12} & \cdots & \alpha a_{1n} \\
\alpha a_{21} & \alpha a_{22} & \cdots & \alpha a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha a_{m1} & \alpha a_{m2} & \cdots & \alpha a_{mn}
\end{pmatrix}

Scalar multiplication is defined only on the left, not the right. Scalar multiplication obeys two additive distributive laws. One distributes scalar multiplication over matrix addition, the other distributes scalar addition when multiplying scalars by a matrix.

α(A + B) = αA + αB scalar distributive law I

(α + β)A = αA + βA scalar distributive law II

Zero Matrix. The m × n zero matrix, 0, is defined by aij = 0 for every i = 1, . . . , m and j = 1, . . . , n. Every element is zero.

Not surprisingly, adding the zero matrix to any matrix has no effect. It is easy to show that there can be only one additive identity. The zero matrix can also be obtained by multiplying any matrix by zero: 0A = 0.

0 + A = A + 0 = A additive identity

0A = 0 scalars and additive identity

Additive Inverse. Each matrix A has a unique additive inverse −A, which can be obtained by multiplying A by (−1).

We know −A is an additive inverse because

A + (−A) = (1)A + (−1)A
         = (1 − 1)A
         = 0A
         = 0.

We have

(−1)A = −A scalars and additive inverse

A + (−A) = (−A) + A = 0 additive inverse


8.3 Matrix Addition and Scalar Multiplication

This gives us the following properties involving matrix addition and scalar multiplication. These and other properties listed later only apply to conformable matrices.

(A + B) + C = A + (B + C) addition associates

A + B = B + A addition commutes

0A = 0 scalars and additive identity

(−1)A = −A scalars and additive inverse

0 + A = A + 0 = A additive identity

A + (−A) = (−A) + A = 0 additive inverse

α(A + B) = αA + αB scalar distributive law I

(α + β)A = αA + βA scalar distributive law II

There are no real surprises when it comes to matrix addition and multiplication of a matrix by a number (scalar multiplication). However, we have yet to consider matrix multiplication.


8.4 Complex Matrices

There's no reason to restrict ourselves to real numbers when defining matrices. We could use any number field. The only other one that we will use is the complex numbers C.

Recall that the complex numbers can be written z = a + bi where a, b ∈ R and the imaginary unit i is defined as the square root of −1, i = \sqrt{-1}. One key fact about the complex numbers is that every solution to any complex polynomial equation in one variable is a complex number. In fact, the Fundamental Theorem of Algebra states that every complex polynomial p(z) of degree n in z can be factored as

p(z) = \alpha(z - \lambda_1) \cdots (z - \lambda_n)

where each λi ∈ C.2

This is not true of real numbers. The equation x² + 1 = 0 has no real solutions. However, there are two complex solutions: i and −i. Accordingly, x² + 1 = (x − i)(x + i).

2 As I write this in August 2020, Wikipedia points out that the theorem was named when algebra was primarily about solving polynomial equations. They comment that “Additionally, it is not fundamental for modern algebra; its name was given at a time when algebra was synonymous with theory of equations.” This is true in the sense that the Fundamental Theorem of Algebra is not as fundamental to algebra as it used to be. Nonetheless, huge chunks of modern algebra are still focused on the theory of equations, it’s just that the theory has reached an incredible level of abstraction. One only has to look at Wiles’s proof of Fermat’s Last Theorem to see this.


8.5 Transpose of a Matrix

A matrix can be transformed by transposing it.

Transpose of a Matrix. Given an m × n matrix A, its transpose AT is the n × m matrix defined by (AT)ij = aji.

In other words, we interchange rows and columns to transpose the matrix. Basically, we are flipping it along its main diagonal, consisting of the elements aii.

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^{T}
= \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}.

It is easy to verify that the transpose is compatible with both addition and scalar multiplication.

(A + B)T = AT + BT

(αA)T = αAT

If a matrix is square, it may be its own transpose, A = AT, meaning aij = aji for all i and j. Such matrices are called symmetric. A related concept is skew-symmetry. A matrix A is skew-symmetric if AT = −A.

The main diagonal of any skew-symmetric matrix is zero since aii = −aii.

Any square matrix can be decomposed into a sum of a symmetric matrix and a skew-symmetric matrix.

Theorem 8.5.1. If A is a square matrix, B = (A + AT)/2 is symmetric, C = (A − AT)/2 is skew-symmetric, and A = B + C.

Proof. Taking the transposes of B and C shows they are symmetric and skew-symmetric, respectively. Simple addition shows B + C = A. ∎
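
A quick numerical illustration of Theorem 8.5.1 (a hedged NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

B = (A + A.T) / 2   # symmetric part
C = (A - A.T) / 2   # skew-symmetric part

assert np.allclose(B, B.T)      # B is symmetric
assert np.allclose(C, -C.T)     # C is skew-symmetric
assert np.allclose(B + C, A)    # they sum back to A
```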


8.6 Hermitian Conjugate of a Matrix

A related concept that only affects complex matrices is the Hermitian conjugate. The complex conjugate of z = a + bi, where a, b ∈ R, is z̄ = a − bi. One nice property of the conjugate is that zz̄ = z̄z = a² + b² = |z|².

Hermitian Conjugate. The Hermitian conjugate A∗ of a matrix A is the complex conjugate of AT. Thus a∗ij = āji.

It is easy to see how the Hermitian conjugate is compatible with addition and, with a twist, scalar multiplication.

(A + B)∗ = A∗ + B∗

(αA)∗ = ᾱA∗

A matrix is Hermitian if A∗ = A and skew-Hermitian or anti-Hermitian if A∗ = −A. In particular, Hermitian matrices obey aii = āii, implying that the main diagonal is real; anti-Hermitian matrices obey aii = −āii, yielding a purely imaginary diagonal.

Hermitian matrices are the complex matrix analog of real numbers and skew-Hermitian matrices are the analog of purely imaginary numbers, even though neither need be purely real or imaginary, except on the main diagonal. For example, the matrix

A = \begin{pmatrix} i & -1 \\ 1 & -i \end{pmatrix}

is skew-Hermitian as

A^{*} = \begin{pmatrix} -i & 1 \\ -1 & i \end{pmatrix} = -A.

Any square matrix can be decomposed into a sum of a Hermitian matrix and an anti-Hermitian matrix.

Theorem 8.6.1. If A is a square matrix, B = (A + A∗)/2 is Hermitian, C = (A − A∗)/2 is anti-Hermitian, and A = B + C.

Proof. Taking the Hermitian conjugates of B and C shows they are Hermitian and anti-Hermitian, respectively. Simple addition shows B + C = A. ∎


8.7 Matrix Multiplication

If the sizes are right, matrices can be multiplied. The size condition is that the number of columns in the first matrix must be equal to the number of rows in the second. We can multiply an m × n matrix by an n × k matrix to obtain an m × k matrix. Multiplying them in the opposite order is only possible if k = m.

Multiplication gives us a second type of conformability. Two matrices A and B are conformable for multiplication if the number of columns in A and the number of rows in B are the same.

Matrix Multiplication. When matrices are conformable, the matrix product A × B (also written AB) is defined as follows.

(a× b)ij = ai1b1j + ai2b2j + · · · + ainbnj

or using the summation notation

(a \times b)_{ij} = \sum_{h=1}^{n} a_{ih} b_{hj}

for all i = 1, . . . , m and j = 1, . . . , k.

The matrix product easily relates to the transpose. When we take the transpose of a matrix product, we get the product of the transposes in reverse order: (AB)T = BTAT. The same thing happens with Hermitian conjugates: (AB)∗ = B∗A∗.
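
A small NumPy check of the definition and of the reversal rule for transposes (a sketch added for illustration; it is not from the original notes):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 3.0]])          # 2 x 2
B = np.array([[10.0, 11.0, 13.0],
              [7.0, 5.0, 1.0]])     # 2 x 3

AB = A @ B                           # (A x B)_ij = sum_h a_ih * b_hj
assert AB.shape == (2, 3)
assert np.allclose(AB.T, B.T @ A.T)  # (AB)^T = B^T A^T
```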


8.8 Matrix Multiplication: Basic Properties

So what can we say about matrix multiplication? For conformable matrices, the following identities hold:

A× (B×C) = (A×B) ×C multiplication associates

α(A× B) = (αA) × B = A× (αB) scalar associative law

(A + B) ×C = A×C + B×C matrix distributive law I

A× (B + C) = A× B + A×C matrix distributive law II

You’ll notice the absence of a commutative law for matrix multiplication. Matrix multiplication usually does not commute. Suppose A is m × n and B is n × k. Then A × B exists, but B × A will only exist if k = m. In the latter case, A × B is m × m and B × A is n × n. These two products can only be the same if n = m. In other words, we can only think about matrix multiplication commuting when both matrices are square and of the same size. The following examples illustrate this.


8.9 Matrix Multiplication: Examples

We consider some examples of multiplying matrices of different sizes and shapes together.

\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}
\times
\begin{pmatrix} 10 & 11 & 13 \\ 7 & 5 & 1 \end{pmatrix}
= \begin{pmatrix} 24 & 21 & 15 \\ 31 & 26 & 16 \end{pmatrix}.

These two matrices cannot be multiplied in the opposite order because you can’t multiply a 2 × 3 matrix by a 2 × 2 matrix. In general, we cannot assume that matrix multiplication commutes.

What if the multiplication makes sense? Consider the 1 × 3 matrix

A = ( 1 2 3 ) .

Then

A \times A^{T} = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}
\times
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
= (1 + 4 + 9) = (14).

If we take the product in the opposite order, we get something entirely different.

A^{T} \times A = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
\times
\begin{pmatrix} 1 & 2 & 3 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix}.

Here both products are defined, but the matrices still fail to commute. The size differences between the products make it impossible for the matrices to commute.


8.10 Matrix Multiplication: Square Examples

When matrices are square (same number of rows and columns) and the same size, it makes sense to multiply them in either order. Now both products are square and have the same size. That still doesn’t guarantee that they commute! Thus

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\times
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 7 & 8 & 9 \\ 4 & 5 & 6 \end{pmatrix}    (8.10.1)

but

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
\times
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
= \begin{pmatrix} 1 & 3 & 2 \\ 4 & 6 & 5 \\ 7 & 9 & 8 \end{pmatrix}    (8.10.2)

The products are not the same. These two matrices do not commute under multiplication (of course, addition is still commutative).

Interestingly enough, in equation (8.10.1), pre-multiplication by the 0, 1 matrix switches the second and third rows of the 1, 2, 3, . . . matrix. The pre-multiplication has carried out an elementary row operation.

Even more interestingly, in equation (8.10.2), post-multiplication by the same matrix switches the second and third columns of the 1, 2, 3, . . . matrix.
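
Equations (8.10.1) and (8.10.2) are easy to reproduce numerically (a NumPy sketch added for illustration, not part of the original notes):

```python
import numpy as np

P = np.array([[1, 0, 0],
              [0, 0, 1],
              [0, 1, 0]])            # swaps rows/columns 2 and 3
M = np.arange(1, 10).reshape(3, 3)   # the 1, 2, 3, ... matrix

print(P @ M)   # rows 2 and 3 of M are switched, as in (8.10.1)
print(M @ P)   # columns 2 and 3 of M are switched, as in (8.10.2)
assert not np.array_equal(P @ M, M @ P)   # they do not commute
```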


8.11 Linear Systems and Matrices

We can use matrix multiplication to write any linear system as a matrix product. Recall our original linear system.

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
a31x1 + a32x2 + · · · + a3nxn = b3
        ...
am1x1 + am2x2 + · · · + amnxn = bm    (6.5.2)

We let x denote the n × 1 column vector of variables and b the column vector of constant terms. Thus

x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.

Let A be the m × n coefficient matrix with ij element aij. Then the system (6.5.2) can be written

Ax = b.

To make the formula work, we have to write the vectors of variables and constant terms as column vectors, not row vectors.

As you may guess, matrix algebra will be useful in solving these systems.
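
In computational practice one rarely forms A−1 explicitly; a solver is used instead. A minimal NumPy sketch (not from the original notes) of writing and solving such a system:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])      # coefficient matrix
b = np.array([5.0, 10.0])       # constant terms, as a column vector

x = np.linalg.solve(A, b)       # solves A x = b
assert np.allclose(A @ x, b)
```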


8.12 The Identity Matrix

Kronecker delta. The Kronecker delta, δij is defined by

\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}

The Kronecker delta will be useful for defining matrices, beginning with the identity matrix.

Identity Matrix. We define the n × n identity matrix In by (In)ij = δij. In other words,

I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}.

The identity matrix has the property that for any m× n matrix A,

Im ×A = A and A× In = A multiplicative identities.

In other words, Im is a (left) multiplicative identity, and In is a (right) multiplicative identity.

Theorem 8.12.1. Suppose A is an m× n matrix. Then Im ×A = A = A× In.

Proof. We consider the first equality. Call the product B. Then

b_{ij} = \sum_{k=1}^{m} \delta_{ik} a_{kj} = a_{ij}

as all the terms with i ≠ k are zero due to the Kronecker delta. So we have B = A. A similar argument establishes the other equality. ∎

If A is a square matrix, n = m and I = Im commutes with A. In fact, any scalar multiple of I commutes with A. There may be other matrices that commute with A.


8.13 Inverse Matrices

Invertible Matrix. If A is a square matrix, it has an inverse if there is a matrix A−1 with A−1A = AA−1 = I. A matrix with an inverse is called invertible.

Invertible matrices are non-singular.

Theorem 8.13.1. Suppose an n × n matrix A is invertible. Then rankA = n and x = A−1b is the unique solution to the system Ax = b. Moreover, A is non-singular.

Proof. If A is invertible, then x = A−1b solves the system for every b by the rules of matrix algebra. By Corollary 7.22.1, rankA is the number of rows, n. The number of columns is also n, so A is non-singular by Corollary 7.23.2. ∎

Non-singular Matrices. It is not hard to show that any non-singular matrix is invertible. We do this later, immediately prior to the statement of Theorem 8.23.1.

We can also relate the transpose and inverse of a matrix. The inverse of the transpose is the transpose of the inverse and the inverse of the conjugate is the conjugate of the inverse.

Theorem 8.13.2. Suppose A is invertible. Then (A−1)T = (AT)−1 and (A−1)∗ = (A∗)−1.

Proof. We know AA−1 = I = A−1A. Take the transpose to obtain (A−1)TAT = I = AT(A−1)T, showing that the transpose of A−1 is the inverse of AT. The same argument applies to the Hermitian conjugate. ∎
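
A quick numerical check of Theorem 8.13.2 (a NumPy sketch, added for illustration only):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))          # A A^{-1} = I
assert np.allclose(np.linalg.inv(A.T), A_inv.T)   # (A^T)^{-1} = (A^{-1})^T
```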


8.14 Left and Right Inverses

If A is an m × n matrix with m ≠ n, it may still be possible to find either a left inverse or a right inverse. A left inverse is an n × m matrix B with BA = In and a right inverse is an n × m matrix C with AC = Im. Notice that if A has both left and right inverses, it must be square and the inverses must be identical as shown in Theorem 8.16.1.

For example,

A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}

has right inverse

B = \begin{pmatrix} 2 & 1 \\ 1 & -2 \\ -1 & 1 \end{pmatrix}

because

A \times B = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}
\times \begin{pmatrix} 2 & 1 \\ 1 & -2 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

However,

B \times A = \begin{pmatrix} 2 & 1 \\ 1 & -2 \\ -1 & 1 \end{pmatrix}
\times \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}
= \begin{pmatrix} 3 & 6 & 10 \\ -1 & -2 & -5 \\ 0 & 0 & 1 \end{pmatrix}.

In fact, A has no left inverse. An example of a matrix with a left inverse but not a right inverse is AT, which has left inverse BT.
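
These one-sided inverses can be checked numerically. A hedged NumPy sketch (the matrices are the ones above; the code itself is not part of the original notes):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 4.0]])       # 2 x 3
B = np.array([[2.0, 1.0],
              [1.0, -2.0],
              [-1.0, 1.0]])           # 3 x 2

assert np.allclose(A @ B, np.eye(2))      # B is a right inverse of A
assert not np.allclose(B @ A, np.eye(3))  # but not a left inverse
assert np.allclose(B.T @ A.T, np.eye(2))  # B^T is a left inverse of A^T
```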


8.15 One-sided Inverses and Linear Systems

The one-sided inverses are connected to the properties of the linear system Ax = b.

Theorem 8.15.1.

1. If A has a left inverse, then there is at most one solution to Ax = b and rankA is equal to the number of columns.

2. If A has a right inverse, then there is a solution to Ax = b and rankA is equal to the number of rows.

Proof. (1) Suppose A has a left inverse B. Suppose the system has two solutions, x and x′. Then

Ax = b and Ax′ = b.

Apply the left inverse to both equations, yielding

BAx = x = Bb and BAx′ = x′ = Bb.

Combining them, we see that x = x′. There is at most one solution. By Corollary 7.23.2, rankA is the number of columns of A.

(2) Suppose A has a right inverse C. Then A(Cb) = (AC)b = b, so x = Cb is a solution to the system. Since this system always has a solution, rankA is the number of rows of A by Corollary 7.22.1. ∎


8.16 Two-Sided Inverses

If a matrix has both left and right inverses, they must be the same and the matrix must be invertible.

Theorem 8.16.1. If an m × n matrix A has both a left inverse B and a right inverse C, then B = C and m = n. Furthermore, B = C is the inverse of A.

Proof. Suppose B is a left inverse and C a right inverse. Then BA = I. It follows that

(BA)C = IC

B(AC) = C

BI = C

B = C,

showing that the two inverses must be identical.

Theorem 8.15.1 tells us that rankA = #cols = #rows. This means that A is non-singular.

Finally, since A is m × m and BA = Im = AB, it follows that B = C = A−1, the inverse of A. ∎


8.17 Matrix Inverses and Scalar Products

If two n× n matrices are invertible, their product is also invertible.

Theorem 8.17.1. Suppose A and B are invertible n × n matrices. Then AB is also invertible with (AB)−1 = B−1A−1.

Proof. The proof is simple. Both

(B−1A−1) × (AB) = ((B−1A−1)A)B
 = (B−1(A−1A))B
 = (B−1I)B
 = B−1B
 = I

and

(AB) × (B−1A−1) = ((AB)B−1)A−1
 = (A(BB−1))A−1
 = AA−1
 = I,

establishing the result. ∎

If an n × n matrix is invertible and α ≠ 0, αA is also invertible.

Theorem 8.17.2. Suppose A is an invertible n × n matrix and α ≠ 0. Then αA is also invertible with (αA)−1 = α−1A−1.

Proof.

(αA) × (α−1A−1) = A × A−1 = I = A−1 × A = (α−1A−1) × (αA),

establishing the result. ∎


8.18 Inverses of Diagonal Matrices

A square matrix is called a diagonal matrix if the only non-zero elements are those on the main diagonal, elements of the form aii. We will denote a diagonal matrix with (λ1, . . . , λn) on the diagonal by diag(λ1, . . . , λn). Thus

\operatorname{diag}(\lambda_1, \ldots, \lambda_n) = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{pmatrix}.

It is easily verified that diagonal matrices with no zeros on the diagonal can be inverted. In fact

\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{pmatrix}^{-1}
= \begin{pmatrix}
1/\lambda_1 & 0 & \cdots & 0 \\
0 & 1/\lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1/\lambda_n
\end{pmatrix}

whenever λ1 · · · λn ≠ 0.

It is also the case that any two diagonal matrices of the same size commute. In fact, their product is also a diagonal matrix with the product of the diagonal elements on the diagonal.

\begin{pmatrix}
\lambda_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \lambda_n
\end{pmatrix}
\begin{pmatrix}
\mu_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \mu_n
\end{pmatrix}
= \begin{pmatrix}
\lambda_1\mu_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \lambda_n\mu_n
\end{pmatrix}
= \begin{pmatrix}
\mu_1\lambda_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \mu_n\lambda_n
\end{pmatrix}
= \begin{pmatrix}
\mu_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \mu_n
\end{pmatrix}
\begin{pmatrix}
\lambda_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \lambda_n
\end{pmatrix}

Or more concisely,

diag(λ1, . . . , λn) × diag(µ1, . . . , µn) = diag(λ1µ1, . . . , λnµn)

= diag(µ1λ1, . . . , µnλn)

= diag(µ1, . . . , µn) × diag(λ1, . . . , λn)


8.19 Elementary Row Matrices I

There are two classes of elementary matrices—elementary row matrices and elementary column matrices. Pre-multiplying a matrix A by an elementary row matrix carries out the corresponding elementary row operation. Post-multiplying by an elementary column matrix carries out the corresponding elementary column operation.

We saw this earlier. Recall equation (8.10.1).

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\times
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 7 & 8 & 9 \\ 4 & 5 & 6 \end{pmatrix}    (8.10.1)

Pre-multiplying by the matrix

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}

switched the second and third rows—an elementary row operation. In equation (8.10.2), post-multiplying by the same matrix switched the second and third columns, an elementary column operation.

More generally, suppose we form the matrix Eij by taking the identity matrix, and switching the ith and jth rows (this is the same as switching the ith and jth columns). Then pre-multiplying any matrix A by this matrix will switch A's ith and jth rows.

The matrix Eij is the m×m matrix with elements

e_{hk} = \begin{cases}
0 & \text{when } hk = ii \text{ or } hk = jj \\
1 & \text{when } hk = ij \text{ or } hk = ji \\
\delta_{hk} & \text{otherwise.}
\end{cases}

We can now calculate the product for any m × n matrix A. Let ckℓ denote the elements of Eij × A.

c_{k\ell} = \sum_{h=1}^{m} e_{kh} a_{h\ell} = \begin{cases}
a_{k\ell} & \text{when } k \neq i, j \\
a_{j\ell} & \text{when } k = i \\
a_{i\ell} & \text{when } k = j.
\end{cases}

In other words, the ith and jth rows of A have been switched.


8.20 Elementary Row Matrices II

The other two types of elementary row operations have their own elementary matrices. All are formed by applying the desired row operation to the identity matrix.

To multiply row i by r ≠ 0, we define the matrix Ei(r) by

e_{hk} = \begin{cases}
r\delta_{ik} & \text{when } h = i \\
\delta_{hk} & \text{when } h \neq i.
\end{cases}

Notice that only the ith row (column) is changed, and it is multiplied by r. To add r times row i to row j, we define the matrix Eij(r) by

e_{hk} = \begin{cases}
\delta_{hk} & \text{when } h \neq j \\
1 & \text{when } h = j,\ k = j \\
r & \text{when } h = j,\ k = i \\
0 & \text{when } h = j,\ k \neq i, j.
\end{cases}

The only change from the identity occurs in row j, where there is an r in column i instead of a 0.

The elementary row matrices are all invertible, and the inverses are also elementary row matrices. For r ≠ 0, we have

E_{ij}^{-1} = E_{ij},
E_i(r)^{-1} = E_i(1/r), and
E_{ij}(r)^{-1} = E_{ij}(-r).

Two examples:

E_2(3) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
E_{32}(r) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & r \\ 0 & 0 & 1 \end{pmatrix}.


8.21 Some Matrix Square Roots

The matrices Eij have a particularly interesting property. If we square them, we get the identity matrix.

E_{ij}^{2} = I.

We can think of the Eij as a square root of the identity matrix. They are not the only non-trivial square roots. The matrix

\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

is also a square root of the identity. Matrices are quite different from real numbers in this respect as 1 has only two square roots.

We also don’t need imaginary numbers to find square roots of −I. One example is skew-symmetric.

\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}^{2}
= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\times
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
= -I.


8.22 Elementary Column Matrices

Two types of the elementary matrices, Eij and Ei(r), are symmetric. The other type, Eij(r), is not symmetric.

There are elementary column operations corresponding to the elementary row operations. There are three of them: interchanging two columns, multiplying a column by a non-zero scalar, and adding a non-zero multiple of one column to another.

The matrices that carry out the first two operations when post-multiplied are Eij and Ei(r). However, the third elementary column operation requires a different matrix, Eij(r)T.

Where does the transpose come in? The symmetry of the other two types of matrices means that for those elementary column operations, we use the same matrices as the elementary row operations.

In fact, if we apply an elementary row operation to A, it means that we have applied an elementary column operation to AT. If E is the elementary row matrix that does this, the transformed matrix is E × A, and when we put it in column form by transposing we obtain (E × A)T = AT × ET. Any elementary column operation can be obtained by post-multiplying by the transpose of the corresponding elementary row matrix.


8.23 Row Operations and Inversion

Suppose that A is a non-singular matrix. Such a matrix can be row-reduced to the identity matrix (Lemma 7.25.1). That means that there are elementary matrices E1, . . . , Ek with (Ek · · · E2E1)A = I. Then the inverse of A can be expressed as a product of elementary matrices, Ek · · · E1. Since each Ei is invertible, so is their product, which is A−1.

Combined with Theorem 8.13.1, we have proven that non-singularity and invertibility are the same.

Theorem 8.23.1. An n× n matrix A is non-singular if and only if it is invertible.

This also gives us a method for finding the inverse. Consider the matrix

(A | I).

We row-reduce this by pre-multiplying by A−1 = Ek · · · E1. What we get is

(I | A−1).

In other words, by row-reducing

(A | I),

we obtain the inverse of A in the right-hand portion of the row-reduced matrix. It follows that any invertible matrix can be written as the product of elementary matrices.

Theorem 8.23.2. Let A be an n × n invertible matrix. Then there are elementary matrices F1, . . . , Fk with A = F1F2 · · · Fk.

Proof. Using the notation above, we have A−1 = Ek · · · E1. Take the inverse. Since the inverse of any of the elementary matrices is also an elementary matrix, we may set Fi = Ei−1. ∎
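
The row-reduction recipe for A−1 is easy to implement. A minimal Gauss–Jordan sketch with partial pivoting (added for illustration; it is not the author's code):

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row-reduce (A | I) to (I | A^{-1}) using elementary row operations."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])                       # the augmented matrix (A | I)
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # choose a non-zero pivot
        M[[col, pivot]] = M[[pivot, col]]               # row interchange
        M[col] /= M[col, col]                           # scale the pivot row to 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]          # eliminate the rest of the column
    return M[:, n:]                                     # right-hand block is A^{-1}

A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(inverse_by_row_reduction(A) @ A, np.eye(2))
```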


8.24 Input-Output Systems

Earlier, we examined input-output systems. Suppose we have an input-output model without labor and that the input coefficient matrix is n × n. Let c be the desired consumption vector. Given outputs x, the required input is Ax. For this to work, we must have c + Ax = x. In other words, x must solve c = (I − A)x, or x = (I − A)−1c. The solution x must be non-negative for this to be feasible. When does that happen?

To make things commensurate, we will measure inputs and outputs by their dollar values, and write the input coefficient matrix so that it shows the dollar cost of inputs for one dollar's worth of output. We will expect that the dollar value of output exceeds the dollar value of input (firms are making profits).

That means that aij is the cost of i used in the production of one dollar's worth of j. Then

\sum_{i=1}^{n} a_{ij} = cost to produce $1 worth of j

and the positive profit condition is

\sum_{i=1}^{n} a_{ij} < 1 for every j.

We have the following result.

Theorem 8.24.1. If each aij ≥ 0 and for every j, \sum_{i=1}^{n} a_{ij} < 1, then (I − A)−1 exists and each entry is non-negative.

Proof. We will not do this in class. The proof is in section 8.5 of Simon and Blume. ∎

There is a corollary, which provides an answer to the question of when x is non-negative.

Corollary 8.24.2. Under the conditions of Theorem 8.24.1, for all non-negative c, x = (I − A)−1c is non-negative.

Proof. Since each element of (I − A)−1 is non-negative, the matrix product shows that each xi is the sum of non-negative numbers. ∎

The corollary tells us that any non-negative consumption vector is feasible in this input-output model under the positive profit condition: \sum_i a_{ij} < 1 for every j.
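
A tiny numerical illustration of the Leontief calculation x = (I − A)−1c (a NumPy sketch with a made-up two-sector coefficient matrix; the numbers are hypothetical and not from the notes):

```python
import numpy as np

# Hypothetical input coefficient matrix: each column sums to less than 1.
A = np.array([[0.2, 0.3],
              [0.4, 0.1]])
c = np.array([10.0, 20.0])              # desired consumption

x = np.linalg.solve(np.eye(2) - A, c)   # gross outputs x = (I - A)^{-1} c
assert np.all(x >= 0)                   # feasible, as Corollary 8.24.2 predicts
assert np.allclose(c + A @ x, x)        # c + Ax = x
```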


8.25 Summary of Matrix Algebra

For conformable matrices:

(A + B) + C = A + (B + C) addition associates

A + B = B + A addition commutes

0A = 0 scalars and additive identity

(−1)A = −A scalars and additive inverse

0 + A = A + 0 = A additive identity

A + (−A) = (−A) + A = 0 additive inverse

A× (B×C) = (A×B) ×C multiplication associates

α(A + B) = αA + αB scalar distributive law I

(α + β)A = αA + βA scalar distributive law II

α(A×B) = (αA) × B = A× (αB) scalar associative law

(A + B) ×C = A×C + B×C matrix distributive law I

A× (B + C) = A× B + A×C matrix distributive law II

Im ×A = A and A× In = A multiplicative identities, A is m× n

A×A−1 = A−1 ×A = I multiplicative inverse

(αA)−1 = α−1A−1 inverse of scalar multiple

(AB)−1 = B−1A−1 inverse of matrix product

(A + B)T = AT + BT transpose of sum

(αA)T = αAT transpose of scalar multiple

(AB)T = BTAT transpose of matrix product

(A + B)∗ = A∗ + B∗ conjugate of sum

(αA)∗ = ᾱA∗ conjugate of scalar multiple

(AB)∗ = B∗A∗ conjugate of matrix product


9. Determinants

Determinants are defined only for square matrices.1 Let A be an n × n matrix. We will inductively define the determinant of A, detA. If n = 1, A = (a11) and detA = a11. If we have defined determinants up to size n − 1, we define the determinant for an n × n matrix A by

detA = a11C11 + a12C12 + · · · + a1nC1n = \sum_{j=1}^{n} a_{1j} C_{1j}

where the ij-cofactor of aij is

C_{ij} = (-1)^{i+j} M_{ij} = (-1)^{i+j} \det A_{ij}.

Here Mij = detAij is referred to as the ij-minor and the matrix Aij is the (n−1) × (n−1) submatrix of A formed by removing row i and column j.

Another notation for the determinant is to replace the matrix parentheses or brackets by vertical bars:

\det A = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}.

We state one result without proof.

Determinant Fact. The determinant can be calculated by expanding by cofactors along any row or any column. But you must use the same row or column for the entire calculation.

1 This chapter draws on Chapters 9 and 26
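
The inductive cofactor definition translates directly into a (very slow, O(n!)) recursive routine. A NumPy sketch added for illustration only:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # remove row 1 and column j+1
        total += (-1) ** j * A[0, j] * det_cofactor(minor)      # a_{1j} C_{1j}
    return total

A = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.isclose(det_cofactor(A), -2.0)
assert np.isclose(det_cofactor(A), np.linalg.det(A))
```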


9.1 Determinants of Diagonal Matrices

9/3/20

It’s easy to calculate the determinant of a diagonal matrix directly from the definition.

Theorem 9.1.1. Let Dn = diag(λ1, λ2, . . . , λn) be an n × n diagonal matrix. Its determinant is detDn = λ1λ2 · · · λn.

Proof. We prove this by induction on the size of the matrix. It is true for n = 1, as

detD1 = det(a11) = a11 = λ1.

Now suppose it is true for n. Then we expand along the top row:

detDn+1 = λ1C11 + 0C12 + · · · + 0C1,n+1
 = λ1C11
 = (−1)^{1+1} λ1 detA11
 = λ1 det diag(λ2, . . . , λn+1)
 = λ1λ2 · · · λn+1

where the last line follows from the induction hypothesis. This shows that the result is true for (n + 1) if it is true for n. Since we already showed it was true for n = 1, it follows that it is true for every n = 1, 2, . . . . ∎


9.2 Determinants Do Not Add

Although it may happen in some special cases, determinants generally do not add. That is, the usual case is that detA + detB ≠ det(A + B).

For example,

\begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix}
+ \begin{vmatrix} 0 & 0 \\ 0 & 1 \end{vmatrix}
\neq \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}.

The left-hand side is 0 + 0 while the right-hand side is 1. There are some cases where they do add. Here’s one where both sides are zero.

\begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix}
+ \begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix}
= \begin{vmatrix} 2 & 0 \\ 0 & 0 \end{vmatrix}.

Examples where they do add and both sides are zero are much easier to create than those where both sides are not zero.

Since we have a formula for the determinant of diagonal matrices, we can investigate that case a little more closely.

When matrices are diagonal, A = diag(a1, . . . , an) and B = diag(b1, . . . , bn), the condition for additivity of the determinant is

\prod_{i=1}^{n} a_i + \prod_{i=1}^{n} b_i = \prod_{i=1}^{n} (a_i + b_i).

Even in the 2 × 2 case, this requires a1b2 + a2b1 = 0. For larger matrices, the conditions for additivity of the determinant become more stringent.


9.3 Triangular Matrices

A matrix A is an upper triangular matrix if aij = 0 whenever i > j. An upper triangular matrix looks like this.

A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1,n-1} & a_{1n} \\
0 & a_{22} & \cdots & a_{2,n-1} & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & a_{n-1,n-1} & a_{n-1,n} \\
0 & 0 & \cdots & 0 & a_{nn}
\end{pmatrix}.

All the elements below the main diagonal are zero in an upper triangular matrix. A lower triangular matrix is the opposite, everything above the main diagonal is zero, so A is a lower triangular matrix if aij = 0 whenever i < j. A lower triangular matrix looks like this.

A = \begin{pmatrix}
a_{11} & 0 & \cdots & 0 & 0 \\
a_{21} & a_{22} & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n-1,1} & a_{n-1,2} & \cdots & a_{n-1,n-1} & 0 \\
a_{n1} & a_{n2} & \cdots & a_{n,n-1} & a_{nn}
\end{pmatrix}.

Theorem 9.3.1. If an n × n matrix A is either an upper or lower triangular matrix, then detA = a11a22 · · · ann.

Proof. For a lower triangular matrix, we repeat the proof of Theorem 9.1.1. The upper triangular case is a similar induction, except we expand along the first column rather than the first row. ∎


9.4 2× 2 Determinants

You might be thinking this is easy after computing determinants for diagonal and triangular matrices. That's because we're starting with the easy ones.

The determinant of a 2 × 2 matrix is still easy, but includes something besides the diagonal terms. Suppose A is a 2 × 2 matrix. Then

detA = a11C11 + a12C12.

Now A11 = (a22) and A12 = (a21). Using the formula for size one determinants, we find detA11 = a22 and detA12 = a21. The cofactors are then C11 = a22 and C12 = −a21. It follows that

\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.

One way to remember this is the following: We multiply the numbers on the main diagonal (NW to SE) and subtract the product of the numbers on the anti-diagonal (SW to NE).

For example,

\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 1(4) − 2(3) = −2,

and

\begin{vmatrix} 15 & 3 \\ 7 & 12 \end{vmatrix} = 15(12) − 3(7) = 180 − 21 = 159.

Determinants can also be zero.

\begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = 4 − 4 = 0.

When we get to the 3 × 3 case, we’ll start to see the general pattern. But first, we establish some general results with the current definition.


9.5 Interchanging Rows

If we interchange any two rows of a matrix, it flips the sign of the determinant. This can only happen if n ≥ 2. This result is important because it tells us how one of the elementary row (column) operations affects the determinant.

Theorem 9.5.1. Let A be an n × n matrix with n ≥ 2. Form B from A by interchanging any two rows or columns of A. Then detB = −detA.

Proof. We prove this by induction, starting with n = 2. When n = 2, we use the formula for determinants of size two.

detB = \begin{vmatrix} a_{21} & a_{22} \\ a_{11} & a_{12} \end{vmatrix} = a_{21}a_{12} − a_{11}a_{22} = −detA.

When n ≥ 3, we have a little more room. We will interchange rows h and k of A. We use induction on the size of the matrix. Suppose the result is true for matrices of size n with n ≥ 2.

For the induction step, we consider matrices of size n + 1 ≥ 3 and expand the determinant along row i ≠ h, k. Then

detB = ai1C′i1 + ai2C′i2 + · · · + ai,n+1C′i,n+1
 = \sum_{j=1}^{n+1} a_{ij} C′_{ij}
 = −\sum_{j=1}^{n+1} a_{ij} C_{ij}
 = −detA.

Here C′ij are the cofactors in B. The induction hypothesis is used to get from the second to the third line, as C′ij = (−1)^{i+j} detBij = −(−1)^{i+j} detAij = −Cij. This is because the two rows h and k are in each submatrix Aij. They are interchanged in each Aij to get each of the Bij, reversing the sign by the induction hypothesis. Then we put the determinant back together in the last line to finish the induction step. It follows that the result is true for n = 2, 3, . . . .

The column case is the same, but expanded along an uninvolved column. ∎

Alternating. A function f(x1, . . . , xn) is alternating if whenever we interchange two of the xi, f is multiplied by (−1), flipping the sign.

Theorem 9.5.1 tells us that determinants are alternating, both with respect to row interchange and column interchange.


9.6 Determinants with Repetition

When a row or column is repeated, the determinant is zero. We already did the main part of the work for this in Theorem 9.5.1.

Theorem 9.6.1. Suppose A is an n × n matrix with n ≥ 2. If either a row or column is repeated, then detA = 0.

Proof. Let i and j be the repeated rows. If we interchange rows i and j, we still have matrix A. But by Theorem 9.5.1, detA = −detA. Then 2 detA = 0, so detA = 0. ∎


9.7 Determinants of Size 3

Now that we have the determinants of size two under control, we can proceed to size three.

detA = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}.

We now use the formula for the size two determinants to find

detA = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 − a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}

 = a11(a22a33 − a23a32) − a12(a21a33 − a31a23) + a13(a21a32 − a31a22)

 = a11a22a33 − a11a23a32 − a12a21a33 + a12a23a31 + a13a21a32 − a13a22a31

A way to remember 3 × 3 determinants is to repeat the first two columns, and attach a plus sign to the first three diagonals, and a minus sign to the first three anti-diagonals.

  +    +    +
a11  a12  a13  a11  a12
a21  a22  a23  a21  a22
a31  a32  a33  a31  a32
  −    −    −

The three full diagonals running down to the right get plus signs; the three running up to the right (the anti-diagonals) get minus signs.


9.8 Another View of Determinants

One way to think about the 3 × 3 determinant

detA = a11a22a33 − a11a23a32

− a12a21a33 + a12a23a31

+ a13a21a32 − a13a22a31

is to notice that each of the six terms is composed of elements from every row and every column. The first product, a11a22a33, uses row 1 and column 1, then row 2 and column 2, and finally row 3 and column 3. The second product, a11a23a32, again uses row 1 and column 1, then row 2 and column 3, and finally row 3 and column 2. Each row is used once, each column is used once.

As for the signs, the plus sign is applied when the column numbers are in the same order as the row numbers, such as ’123’ and ’123’. The minus appears when there is a reversal, such as ’123’ and ’132’. The same thing happens in the next pair, where ’123’ is matched with ’213’, which gets a negative sign, while ’123’ matched with ’231’ gets a plus sign. In that case there are two switches of adjacent elements, to ’213’ and then to ’231’, with the two minus signs canceling. Finally, in the third pair ’123’ goes with ’312’, which takes two switches, to ’132’ and then ’312’, resulting in a plus sign. The last term takes ’123’ to ’321’ (one more switch) and so gets a minus sign.

What’s happening here is we are taking all possible paths from the top to the bottom of the matrix (or left side to the right side) where each product takes elements from each row and column. We assign the sign based on whether we have an even number of interchanges in the indices (positive), or an odd number (negative).

This also works on the 2 × 2 determinant. Then the rows are ’12’. The positive sign is applied when the columns go in the same order, ’12’. The negative is applied when the columns are in the opposite order, ’21’. The determinant is then a11a22 − a12a21.

We generalize this to every size n by writing it in terms of permutations.


9.9 Determinants via Permutation

A second commonly used definition of determinants uses permutations of the indices.

Permutation. We say σ : {1, . . . , n} → {1, . . . , n} is a permutation if σ takes each value in {1, . . . , n} exactly once.

In other words, ’σ(1) . . . σ(n)’ is a rearrangement of ’12 . . . n’. The sign of a permutation, sgn σ, is +1 when an even number of interchanges of adjacent elements of ’12 . . . n’ yields ’σ(1) . . . σ(n)’. The sign is −1 when an odd number of interchanges is involved. Let Pn denote the set of permutations of ’12 . . . n’. There are n! permutations of ’12 . . . n’.

We can now write the determinant as

\det A = \sum_{\sigma \in P_n} (\operatorname{sgn}\sigma) \prod_{i=1}^{n} a_{i\sigma(i)}.

This is what we just described on the previous page. Each element in \prod_{i=1}^{n} a_{i\sigma(i)} is from a different row (i = 1, . . . , n) and a different column (σ(1), . . . , σ(n)). The sign of each product is determined by the number of interchanges in the permutation σ.

When n = 3 there are 6 permutations to consider: ’123’, ’132’, ’312’, ’321’, ’231’, and ’213’. Since each is created by a single interchange from the previous permutation, the signs alternate. For a 3 × 3 matrix A the formula yields

detA = a11a22a33 − a11a23a32 + a13a21a32 − a13a22a31 + a12a23a31 − a12a21a33,

which is the previously calculated value.
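
The permutation formula can be coded directly (it is O(n · n!), so only practical for small matrices). A sketch using itertools, added for illustration only:

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_permutation(A):
    """det A = sum over sigma of sgn(sigma) * prod_i a_{i, sigma(i)}."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return sum(sign(p) * np.prod([A[i, p[i]] for i in range(n)])
               for p in permutations(range(n)))

A = np.arange(1.0, 10.0).reshape(3, 3)     # singular, so det = 0
assert np.isclose(det_permutation(A), 0.0)
assert np.isclose(det_permutation(A), np.linalg.det(A))
```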


9.10 The Language of Functions

Before continuing with determinants, it will be helpful to upgrade our terminology.2

We need to be able to discuss functions. Suppose X and Y are sets. A function from X to Y is a rule that assigns an element of Y to every element of X. We write f : X → Y to indicate f is a function from X to Y. Here X is referred to as the domain of f, X = dom f, and Y is the target space of f.

Given x ∈ X, f(x) denotes the element of Y that f assigns to x. We sometimes write x ↦ f(x) to indicate that f assigns f(x) to x. We can also use functions on sets. If A ⊂ X, the image of A under f is f(A) = {f(x) : x ∈ A}. Of course f(A) ⊂ Y. The image of the domain X is referred to as the range, ran f = f(X).

Onto, Surjective. We say that f is onto or surjective if the range of f is the entire target space (i.e., ran f = Y).

Example 9.10.1: Two Functions. For example, suppose f : R → R is defined by f(x) = x² + 2. If A = [0, 2], f(A) = [2, 6]. The range of f is the interval [2, +∞). As this is smaller than the target space, f is not onto.

The function g : R → R defined by g(x) = x³ has ran g = R because if y ∈ R, y = g(y^{1/3}) and y^{1/3} ∈ dom g.

One-to-One = Injective. A function f is one-to-one or injective if f(x) = f(x′) implies that x = x′.

Example 9.10.2: Matrix Functions. When A is an m × n matrix, the function f : Rn → Rm defined by f(x) = Ax is onto Rm if and only if Ax = b has a solution for every b in Rm. It follows that f(x) = Ax is onto Rm if and only if rankA = m by Corollary 7.22.1.

Now f is one-to-one if and only if Ax = Ax′ implies x = x′. That is, if and only if A(x − x′) = 0 implies x − x′ = 0. The function f will be one-to-one if and only if Ax = 0 has only one solution, x = 0. Corollary 7.21.2 tells us that f is one-to-one if and only if n = rankA.

Bijective. If f is both one-to-one and onto (injective and surjective), we call it bijective.

Theorem 9.10.3. If f : X → Y is bijective, for each y ∈ Y, there is a unique x(y) ∈ X with f(x(y)) = y.

Proof. Since f is onto, there is an x(y) ∈ X that is mapped to y, that is, with f(x(y)) = y. Since f is one-to-one, that x(y) is unique. ∎

We call the function y ↦ x(y) the inverse of f and denote it by f−1. Thus f−1 : Y → X and f(f−1(y)) = y. Also, f−1(f(x)) = x, since x is the unique element of X that f maps to f(x).

2 See also section 13.1 of Simon and Blume.


9.11 Linear and Multilinear Functions

Linear Transformation. Let f : Rn → Rm. The function f is a linear function or linear transformation if for every α ∈ R and every x, y ∈ Rn,

1. f(x + y) = f(x) + f(y) and
2. f(αx) = αf(x).

Setting x = y = 0, condition (1) implies f(0) = 0 for any linear function. The two criteria for linearity can be combined as

f(αx + y) = αf(x) + f(y)    (9.11.1)

for all scalars α and vectors x and y. It's pretty obvious that the two linearity conditions imply equation (9.11.1).

To see that equation (9.11.1) implies both conditions, set α = 1, which implies f(x + y) = f(x) + f(y). This implies f(0) = 0 as above. Finally, setting y = 0 yields the second condition f(αx) = αf(x).

The transformation TA. Given an m × n matrix A, define the function TA : Rn → Rm by TA(x) = Ax.

Theorem 9.11.1. Let A be an m × n matrix. The function TA : Rn → Rm defined by TA(x) = Ax is a linear function.

Proof. That this is a linear function follows from the rules of matrix algebra. Let α ∈ R and x, y ∈ Rn. Then

TA(αx + y) = A(αx + y)

= A(αx) + Ay = α(Ax) + Ay

= αTA(x) + TA(y)

showing that TA is linear. ∎

We will see in Theorem 10.5.1 that every linear transformation T : Rn → Rm can be written T = TA for some m × n matrix A.

We can write Rnk = Rn × Rn × · · · × Rn, where Rn is repeated k times. Now write elements of Rnk in the form (x1, . . . , xk) with each xi ∈ Rn.

Multilinearity. A function Rnk → R is k-linear if it is separately linear in each of the k coordinates. The term multilinear is used in the generic case, and bilinear is used when f is 2-linear.

A variety of multilinear objects are generically referred to as tensors. A k-multilinear function is a k-tensor.


9.12 Determinants are Multilinear

Our real reason for being so interested in multilinear functions at this moment is that the determinant is multilinear. So what is the determinant a function of? We can treat the determinant as a function of either the rows or the columns of A. For the row case, let ai be the ith row of A. Then we can write

A = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix},

which lets us think of the determinant as a function of the rows, fn(a1, . . . , an) = detA. Similarly, let aj be the jth column of A, to make it a function of the columns.

Theorem 9.12.1. Let A be an n × n matrix, and fn(a1, . . . , an) = detA, where the ai are the rows (columns) of A. Then fn is n-linear in (a1, . . . , an) for n ≥ 1.

Proof. Replace row i by ai + αa′i for any scalar α and vector a′i ∈ Rn. We now expand the determinant fn along the row of interest, row i.

fn(a1, . . . , ai + αa′i, . . . , an) = (ai1 + αa′i1)Ci1 + · · · + (ai,n + αa′in)Ci,n
 = (ai1Ci1 + · · · + ainCi,n) + α(a′i1Ci1 + · · · + a′inCi,n)
 = fn(a1, . . . , ai, . . . , an) + αfn(a1, . . . , a′i, . . . , an),

showing that fn is linear separately in each ai, and so is n-linear.

The proof in terms of columns is basically the same, but expands along the column of interest. ∎

The multilinearity of the determinant means that we know how determinants behave under the third elementary row operation, adding a non-zero multiple of one row to another. In particular, det Eij(r) = 1.


9.13 Bilinear Forms

We can use a matrix A = [aij] to define a bilinear function from R2n to R. Such functions are called bilinear forms or quadratic forms.3 Set

f(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i y_j = x^T A y.

Then f is bilinear. We'll show that it is linear in the second coordinate using matrix notation.

f(x, αy + z) = xTA(αy + z)

= xTA(αy) + xTAz

= α(xTAy) + xTAz

= αf(x,y) + f(x, z)

The case of the first coordinate is similar. This can also be shown by explicitly using the coordinates of the vectors and elements of the matrix. To do so, we introduce the shorthand that

\sum_{ij=1}^{n} means \sum_{i=1}^{n} \sum_{j=1}^{n}, and \sum_{ijk=1}^{n} means \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n},

and similarly for larger sets of indices. We write

f(x, αy + z) = \sum_{ij=1}^{n} a_{ij} x_i (αy_j + z_j)
 = \sum_{ij=1}^{n} a_{ij} x_i (αy_j) + \sum_{ij=1}^{n} a_{ij} x_i z_j
 = α \sum_{ij=1}^{n} a_{ij} x_i y_j + \sum_{ij=1}^{n} a_{ij} x_i z_j
 = αf(x, y) + f(x, z)

3 See section 13.3 of Simon and Blume for the basic definition. We will study them more in Chapter 16.
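
A one-line numerical check that x^T A y matches the double sum (a NumPy sketch, added for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
x, y = rng.normal(size=3), rng.normal(size=3)

double_sum = sum(A[i, j] * x[i] * y[j] for i in range(3) for j in range(3))
assert np.isclose(double_sum, x @ A @ y)    # f(x, y) = x^T A y
```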


9.14 Tensors

Just as we can define 2-linear functions, a 4-linear function A can be defined by

A(w, x, y, z) = \sum_{hijk=1}^{n} a_{hijk} w_h x_i y_j z_k.

The 4-dimensional array [a_{hijk}] is an example of a tensor, more specifically, a 4-tensor. We can define a k-tensor by the k-linear mapping

A(x_1, . . . , x_k) = \sum_{j_1 \cdots j_k = 1}^{n} a_{j_1 \cdots j_k} x_{1 j_1} \cdots x_{k j_k},

where each x_i = (x_{i1}, . . . , x_{in}) ∈ Rn, giving us a tensor on Rnk described by the k-dimensional array [a_{j_1 \cdots j_k}]. More generally, anything of the form

A(x_1, . . . , x_k) = \sum_{j_1 \cdots j_k = 1}^{n} a_{j_1 \cdots j_k} x_{1 j_1} \cdots x_{k j_k}

is k-linear, giving us a tensor A = [a_{j_1 \cdots j_k}].

This type of method, involving summation over coordinates, is similar to Ricci and Levi-Civita's absolute differential calculus, developed for use in differential geometry and made famous by Albert Einstein. However, the notation is simplified here by focusing on tensors that are functions solely of ordinary (contravariant) vectors.

Modern approaches to tensors emphasize coordinate-free methods. Although this can make many things easier, it also makes understanding more difficult due to substantial abstraction.


9.15 Determinants: Yet Another Definition

The third definition of determinant consists of the three conditions stated in the following theorem.

The Determinant Theorem. Let Dn be a function from the set of n × n matrices to R. Given a matrix A, we regard Dn as a function defined on the rows of A, a1, . . . , an. Such a function is the determinant if and only if

1. Dn is an alternating function of the rows.
2. Dn is n-linear in the rows.
3. Dn(In) = 1.

Proof. Only if case: We have proven this in pieces already. There are three relevant theorems using the cofactor definition. Theorem 9.5.1 showed that the determinant, as defined by cofactor expansion, is an alternating function. Theorem 9.12.1 showed that it is also multilinear, and Theorem 9.1.1 implies that det I = 1.

It's not terribly hard to verify these properties also hold for the permutation definition.

If case: This will follow over the next two pages.

Fact. The Determinant Theorem remains true if we require Dn to be alternating and multilinear in terms of columns rather than rows.


9.16 The Determinant Theorem, II

If (i). To prove the if portion of the determinant theorem, we will examine the effects of the elementary row operations on any alternating multilinear function fn of the rows of a matrix of size n.

Because fn is alternating, interchanging rows flips the sign of fn. Because of multilinearity, multiplying a row by a scalar multiplies fn by that same scalar.

The third row operation is adding a non-zero multiple of one row to another. Suppose we replace ai by ai + raj. By multilinearity,

fn(a1, . . . , ai + raj, . . . , an) = fn(a1, . . . , an) + rfn(a1, . . . , aj, . . . , an).

The latter term has aj in both row i and row j. Since fn is alternating, interchanging those rows flips the sign. But it also leaves fn(a1, . . . , aj, . . . , an) unchanged. The only way that can happen is if it is zero. Thus

fn(a1, . . . , ai + raj, . . . , an) = fn(a1, . . . , an).

This means that the third elementary row operation leaves fn unchanged.


9.17 The Determinant Theorem, III

If (ii). We can now provide a recipe for finding fn by row reduction. Let R be a reduced row-echelon form of an n × n matrix A. Let m be the number of row interchanges in the reduction and r1, . . . , rk be the scalar multiples of rows in the reduction. Then

f_n(A) = (-1)^m \left( \prod_{h=1}^{k} r_h \right) f_n(R).

There are two possibilities for R. If A is non-singular, then R = In by Corollary 7.24.1 and Lemma 7.25.1. It follows that

f_n(A) = (-1)^m \left( \prod_{h=1}^{k} r_h \right) f_n(I_n),

in which case fn(In) uniquely determines fn.

Otherwise, rankA < n and R will have a zero row. If we multiply that row by zero,

R doesn’t change, but fn has to take the value zero. This concludes the proof of the if portion of the Determinant Theorem, and so the proof of the whole theorem. ∎

Incidentally, this also shows that the determinant as defined by cofactor expansion is the same as in the Determinant Theorem. Since it is easy to show that the permutation definition is also alternating, multilinear, and takes the value 1 on identity matrices, the permutation method also defines the same determinant.


9.18 Determinants, Transposes, and Inverses

Further consideration of the Determinant Theorem shows that it yields the same determinant if we use columns instead of rows. If we apply the column version to AT, that is the same as the row version applied to A, so detA = detAT.

The determinant of a transposed matrix is the same as the determinant of the original matrix. We state this as a theorem.

Theorem 9.18.1. Let A be an n× n matrix. Then detA = detAT .

Proof. See above. ∎

Although we will not prove it, the determinant of the product of two (or more) matrices is the product of the determinants.

Theorem 9.18.2. Let A and B be n× n matrices. Then det(AB) = (detA)(detB).

Proof. See the proof of Theorem 26.4 in Simon and Blume. ∎

We can now show the following theorem on the determinant of inverses.

Theorem 9.18.3. Let A be an n × n matrix. Then A is invertible if and only if detA ≠ 0. In that case, detA−1 = 1/detA.

Proof. Suppose A is invertible. Then AA−1 = I. It follows that

(detA)(detA−1) = det(AA−1)

= det I

= 1.

This means that detA−1 = 1/detA and that detA ≠ 0. In the course of proving the Determinant Theorem, we had shown that detA is non-zero if and only if rankA = n. It follows that A is invertible if and only if detA ≠ 0. ∎


9.19 The Adjoint Matrix

Adjoint Matrix. Let A be an n × n matrix. Define the adjoint of A, adjA, by (adjA)ij = Cji, where Cji is the ji-cofactor of A.

Notice the transposition in the definition of the adjoint. The ji-cofactor is used for the ij entry.

Theorem 9.19.1. Let A be an invertible n × n matrix. Then

A^{-1} = \frac{1}{\det A} \operatorname{adj} A.

Proof. It is enough to show that A× adjA = (detA)In. Now

A \times \operatorname{adj} A = \begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix}
C_{11} & \cdots & C_{n1} \\
\vdots & \ddots & \vdots \\
C_{1n} & \cdots & C_{nn}
\end{pmatrix}
= (\det A)\, I_n    (9.19.2)

We look at the ij element of the product, which is

ai1Cj1 + ai2Cj2 + · · · + ainCjn. (9.19.3)

If i = j, equation (9.19.3) becomes

ai1Ci1 + ai2Ci2 + · · · + ainCin = detA.

The point is that it is the expansion of the determinant of A along row i.

What if i ≠ j? In that case, we have expanded along row j as far as the cofactors are concerned, but have used row i for the entries. Since i ≠ j, we are computing the determinant of a matrix where row i occurs both in row i and row j. The result of course is 0. Equation (9.19.2) follows, and so does the theorem. ∎
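
Theorem 9.19.1 gives a (computationally expensive) formula for the inverse. A small NumPy sketch of the adjoint construction, added here for illustration only:

```python
import numpy as np

def adjoint(A):
    """adj A, with (adj A)_{ij} equal to the ji-cofactor of A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)   # cofactor C_ij
    return C.T                                                 # transpose gives adj A

A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(adjoint(A) / np.linalg.det(A), np.linalg.inv(A))
```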


9.20 Cramer’s Rule

A closely related result is Cramer's Rule, which we state without proof. To prove it, recall that x = A−1b and use the adjoint formula for the inverse.

Cramer's Rule. Let A be an invertible n × n matrix. Then the equation Ax = b has a unique solution when x and b are n × 1 vectors. The solution is

x_i = \frac{\det B_i}{\det A}

where Bi is the matrix obtained from A by replacing the ith column of A by b.
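
Cramer's Rule is also easy to check numerically (a NumPy sketch, added for illustration; not the author's code):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via x_i = det(B_i) / det(A), replacing column i of A by b."""
    A = np.asarray(A, dtype=float)
    detA = np.linalg.det(A)
    x = np.empty(A.shape[0])
    for i in range(A.shape[0]):
        B_i = A.copy()
        B_i[:, i] = b                      # replace the ith column by b
        x[i] = np.linalg.det(B_i) / detA
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
assert np.allclose(cramer_solve(A, b), np.linalg.solve(A, b))
```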

September 28, 2020

Copyright © 2020 by John H. Boyd III: Department of Economics, Florida International University, Miami, FL 33199