Solving Polynomial Equations Using Linear Algebra

JOHNS HOPKINS APL TECHNICAL DIGEST, VOLUME 28, NUMBER 4 (2010)354

INTRODUCTION

Multivariate polynomials show up in many applications. Polynomials are attractive because they are well understood and they have significant simplicity and structure in that they are vector spaces and rings. Additionally, degree-two polynomials (conic sections that are also known as quadrics) show up in many engineering applications, including multilateration.

This article begins with a discussion on finding roots of univariate polynomials via eigenvalues/eigenvectors of companion matrices. Next, we briefly introduce the concept of a ring to enable us to discuss ring representations of polynomials, which are a generalization of companion matrices. We then discuss how multivariate polynomial systems can be solved with ring representations. Following that, we outline an algorithm attributable to Emiris (Ref. 1) that is used to find ring representations. Finally, we give an example of the methods applied to a trilateration quadric-intersection problem.


Michael Peretzian Williams

Quadric intersection is a common class of nonlinear systems of equations. Quadrics, which are the class of all degree-two polynomials in three or more variables, appear in many engineering problems, such as multilateration. Typically, numerical methods are used to solve such problems. Unfortunately, these methods require an initial guess and, although rates of convergence are well understood, convergence is not necessarily certain. The method discussed in this article transforms the problem of simultaneously solving a system of polynomials into a linear algebra problem that, unlike other root-finding methods, does not require an initial guess. Additionally, iterative methods only give one solution at a time. The method outlined here gives all solutions (including complex ones).


COMPANION MATRICES: FINDING ROOTS OF UNIVARIATE POLYNOMIALS AS AN EIGENVALUE/EIGENVECTOR PROBLEM

Recall that by definition an n × n matrix M has eigenvectors $v_i$ and eigenvalues $\lambda_i$ if $M v_i = \lambda_i v_i$.

Such a matrix can have at most n distinct eigenvalues and at most n linearly independent eigenvectors. Solving for $\lambda_i$ is equivalent to solving det(M – xI) = 0. This calculation yields a polynomial in x called the “characteristic polynomial,” the roots of which are the eigenvalues $\lambda_i$. Once the $\lambda_i$ values are obtained, the eigenvectors $v_i$ can be calculated by solving $(M - \lambda_i I)v_i = 0$. Hence, it is not surprising that eigenvalues/eigenvectors provide a method for solving polynomials.

Next, we consider the univariate polynomial

$$p(x) = \sum_{i=0}^{n} C_i x^i, \quad \text{where } C_n = 1.$$

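The matrix

$$M = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -C_0 & -C_1 & \cdots & -C_{n-1} \end{bmatrix}$$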

is known as the “companion matrix” for p. Recall that the Vandermonde matrix is defined as

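$$V(x_1, x_2, \ldots, x_n) = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{bmatrix}.$$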

As it turns out, the roots of the characteristic polynomial of M [i.e., det(M – xI) = 0] are precisely the roots of the polynomial p, and the eigenvectors are Vandermonde vectors of the roots.

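Indeed, if $\lambda_j$ is a root of p [i.e., $p(\lambda_j) = 0$] and v is a Vandermonde vector such that $v = [1 \;\; \lambda_j \;\; \lambda_j^2 \;\; \cdots \;\; \lambda_j^{n-1}]^T$, then

$$Mv = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -C_0 & -C_1 & \cdots & -C_{n-1} \end{bmatrix} \begin{bmatrix} 1 \\ \lambda_j \\ \vdots \\ \lambda_j^{n-1} \end{bmatrix} = \begin{bmatrix} \lambda_j \\ \lambda_j^2 \\ \vdots \\ -\sum_{i=0}^{n-1} C_i \lambda_j^i \end{bmatrix} = \begin{bmatrix} \lambda_j \\ \lambda_j^2 \\ \vdots \\ \lambda_j^n - p(\lambda_j) \end{bmatrix} = \begin{bmatrix} \lambda_j \\ \lambda_j^2 \\ \vdots \\ \lambda_j^n \end{bmatrix} = \lambda_j v.$$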

Hence, the eigenvalues of M are the roots of p. Likewise, the right eigenvectors of M are the columns of the Vandermonde matrix of the roots of p.

The most important property of companion matrices in this article can be stated as follows:

Given a polynomial p, the companion matrix defines a matrix M such that the characteristic polynomial of M is p [i.e., det(M – xI) = ±p(x)].

To give a better understanding of the above statement, consider the following concrete example.

Let

$$\begin{aligned} p(x) &= (x - 2)(x - 3)(x - 5) \\ &= x^3 - (2 + 3 + 5)x^2 + (2 \cdot 3 + 2 \cdot 5 + 3 \cdot 5)x - 2 \cdot 3 \cdot 5 \\ &= x^3 - 10x^2 + 31x - 30. \end{aligned}$$

We can verify (for this case) that the eigenvalue/eigenvector statements made above hold [i.e., det(M – xI) = ±p(x)]. The companion matrix for p(x) is

$$M = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 30 & -31 & 10 \end{bmatrix}.$$

Since the roots of p are (2, 3, 5) by construction, the eigenvectors of M should be given by the Vandermonde matrix

$$V(2, 3, 5) = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 9 & 25 \end{bmatrix}.$$

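We observe that

$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 30 & -31 & 10 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 30 - 31 \cdot 2 + 10 \cdot 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 8 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix},$$

$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 30 & -31 & 10 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} = \begin{bmatrix} 3 \\ 9 \\ 30 - 31 \cdot 3 + 10 \cdot 9 \end{bmatrix} = \begin{bmatrix} 3 \\ 9 \\ 27 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix},$$

$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 30 & -31 & 10 \end{bmatrix} \begin{bmatrix} 1 \\ 5 \\ 25 \end{bmatrix} = \begin{bmatrix} 5 \\ 25 \\ 30 - 31 \cdot 5 + 10 \cdot 25 \end{bmatrix} = \begin{bmatrix} 5 \\ 25 \\ 125 \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 5 \\ 25 \end{bmatrix}.$$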

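Hence, the eigenvalues of M are given by $\lambda = 2, 3, 5$, the roots of p(x) as claimed. Also, we note that p is the characteristic polynomial for M [i.e., det(M – xI) = ±p(x)]. Indeed,

$$\begin{aligned} \det(M - \lambda I) &= \det \begin{bmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ 30 & -31 & 10 - \lambda \end{bmatrix} \\ &= -\lambda \det \begin{bmatrix} -\lambda & 1 \\ -31 & 10 - \lambda \end{bmatrix} - 1 \cdot \det \begin{bmatrix} 0 & 1 \\ 30 & 10 - \lambda \end{bmatrix} + 0 \cdot \det \begin{bmatrix} 0 & -\lambda \\ 30 & -31 \end{bmatrix} \\ &= -\lambda\bigl(-\lambda(10 - \lambda) + 31\bigr) + 30 \\ &= -\lambda^3 + 10\lambda^2 - 31\lambda + 30 = -p(\lambda). \end{aligned}$$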

Hence, det(M – xI) = ±p(x) as claimed.

In this section, we have outlined univariate examples of the companion matrix and Vandermonde matrix. Indeed, we have demonstrated that finding the roots of a univariate polynomial is an eigenvalue/eigenvector problem. In a later section, we discuss the u-resultant algorithm, which is used to construct a generalization of the companion matrix. These generalizations, known herein as “ring representations,” can be used to solve polynomial systems, including quadric intersection.
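The univariate construction can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `companion` is illustrative and not from the article. It builds the companion matrix for p(x) = x³ – 10x² + 31x – 30 and recovers the roots and the Vandermonde eigenvectors.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of the monic polynomial sum_i coeffs[i] * x**i.

    coeffs lists C_0, ..., C_{n-1}; the leading coefficient C_n = 1 is implied.
    """
    n = len(coeffs)
    M = np.zeros((n, n))
    M[np.arange(n - 1), np.arange(1, n)] = 1.0   # ones on the superdiagonal
    M[-1, :] = -np.asarray(coeffs, dtype=float)  # last row: -C_0, ..., -C_{n-1}
    return M

# p(x) = x^3 - 10x^2 + 31x - 30 = (x - 2)(x - 3)(x - 5)
M = companion([-30.0, 31.0, -10.0])
eigenvalues, eigenvectors = np.linalg.eig(M)

order = np.argsort(eigenvalues.real)
roots = eigenvalues.real[order]
# Scale each eigenvector so its first entry is 1; the columns then have
# the Vandermonde form [1, root, root**2].
V = (eigenvectors / eigenvectors[0])[:, order].real
```

Normalizing by the first entry is needed because a numerical eigensolver returns eigenvectors only up to scale.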

RINGS AND RING REPRESENTATIONS

Previously, we discussed transforming univariate polynomial root finding into an eigenvalue/eigenvector problem. For the remainder of this article, we will generalize the method above to simultaneously solve systems of multivariate polynomial equations. More precisely, we want to find coordinates $(x_1, x_2, \ldots, x_n)$ such that for polynomials $f_1, f_2, \ldots, f_n$

$$\begin{aligned} f_1(x_1, x_2, \ldots, x_n) &= 0 \\ f_2(x_1, x_2, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, x_2, \ldots, x_n) &= 0. \end{aligned}$$

As with many multivariate generalizations, the univariate case does not require understanding of the deeper structure inherent to the problem. The last section is no exception. A brief discussion regarding the algebraic concept of a ring is necessary to advance the generalization.

Rings are an algebraic structure having both a multiplication and addition but not necessarily a division. The most common rings are integers, univariate polynomials, and multivariate polynomials. For any a, b, c in a ring R, the following properties hold.

Addition:
A. Closure, a + b is in R
B. Associativity, (a + b) + c = a + (b + c)
C. Identity, a + 0 = 0 + a = a
D. Inverse, a + (–a) = (–a) + a = 0
E. Commutativity, a + b = b + a

Multiplication:
A. Closure, ab is in R
B. Associativity, (ab)c = a(bc)
C. Identity, a1 = 1a = a (optional)
D. Inverse, $a a^{-1} = a^{-1} a = 1$ (optional)
E. Commutativity, ab = ba (optional)

Algebra, the field of mathematics in which rings are studied, commonly looks at an operation called adjoining, or including new elements. If R is a ring, then adjoining is accomplished by including some new element j and creating the ring R[j] in the following fashion: $R[j] = \{a_0 + a_1 j + a_2 j^2 + a_3 j^3 + \cdots\}$ for all $a_0, a_1, a_2, a_3, \ldots$ in R. A common instance of this operation is adjoining the imaginary unit i to the real numbers to obtain the complex numbers.

Because matrices are well understood and easily implemented in code, ring representations (a special family of matrices particular to a ring) are a common way of handling algebraic applications. In the same way that logarithms map multiplication to addition (log(ab) = log(a) + log(b)) and Fourier transforms map convolution to multiplication ($\mathcal{F}[f * g] = \mathcal{F}[f]\,\mathcal{F}[g]$), ring representations map ring multiplication and ring addition to matrix multiplication and addition.

There are many ways to view a matrix. It can be, and typically is, viewed as a transformation whose eigenvalues could be adjoined to a ring if they were not present. Alternatively, because p(M) = 0 (where p is the characteristic polynomial of M), the matrix can itself be viewed as an element that can be adjoined because matrices are roots of their characteristic polynomial. Let R be any ring. The mapping to adjoin a matrix M to the ring R obtaining the ring R[M] is as follows: $R[M] = \{u_0 I + u_1 M + u_2 M^2 + \cdots + u_n M^n\}$ for all $u_0, u_1, u_2, \ldots, u_n \in R$.

Each element $u_0 I + u_1 M + u_2 M^2 + \cdots + u_n M^n$ in R[M] is a ring representation of the element $u_0 + u_1 \lambda + u_2 \lambda^2 + \cdots + u_n \lambda^n$ in $R[\lambda]$.

For a concrete example, consider the complex numbers, which form a ring as they satisfy all of the properties above. To cast the complex numbers as ring representations over the real numbers, consider the characteristic polynomial of the complex numbers, i.e., $p = x^2 + 1 = x^2 + 0x + 1 = 0$. In order to form a ring representation, we are interested in adjoining a matrix whose eigenvalues are the roots of p. The companion matrix of this polynomial, as discussed above, will provide just such a matrix. The companion matrix is

$$J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$

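Indeed,

$$\det(J - \lambda I) = \det\left( \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} - \lambda I \right) = \det \begin{bmatrix} -\lambda & 1 \\ -1 & -\lambda \end{bmatrix} = \lambda^2 + 1 = 0, \quad \lambda = \pm i.$$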

We can now define a ring representation of the complex numbers as $\mathrm{rep}(u_0 + u_1 i) = u_0 I + u_1 J$ for all $u_0, u_1 \in R$. More concretely, we have

$$\mathrm{rep}(u_0 + u_1 i) = \begin{bmatrix} u_0 & u_1 \\ -u_1 & u_0 \end{bmatrix}.$$

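This construction preserves addition. Indeed,

$$\begin{aligned} \mathrm{rep}(u_0 + u_1 i) + \mathrm{rep}(v_0 + v_1 i) &= \begin{bmatrix} u_0 & u_1 \\ -u_1 & u_0 \end{bmatrix} + \begin{bmatrix} v_0 & v_1 \\ -v_1 & v_0 \end{bmatrix} \\ &= \begin{bmatrix} u_0 + v_0 & u_1 + v_1 \\ -(u_1 + v_1) & u_0 + v_0 \end{bmatrix} \\ &= \mathrm{rep}\bigl((u_0 + v_0) + (u_1 + v_1) i\bigr). \end{aligned}$$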

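Also, the construction preserves multiplication as

$$\begin{aligned} \mathrm{rep}(u_0 + u_1 i)\,\mathrm{rep}(v_0 + v_1 i) &= \begin{bmatrix} u_0 & u_1 \\ -u_1 & u_0 \end{bmatrix} \begin{bmatrix} v_0 & v_1 \\ -v_1 & v_0 \end{bmatrix} \\ &= \begin{bmatrix} u_0 v_0 - u_1 v_1 & u_0 v_1 + u_1 v_0 \\ -(u_0 v_1 + u_1 v_0) & u_0 v_0 - u_1 v_1 \end{bmatrix} \\ &= \mathrm{rep}\bigl((u_0 v_0 - u_1 v_1) + (u_0 v_1 + u_1 v_0) i\bigr) \\ &= \mathrm{rep}\bigl((u_0 + u_1 i)(v_0 + v_1 i)\bigr). \end{aligned}$$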

Remarkably, this process can be generalized to include the roots of several multivariate polynomials.
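The two homomorphism properties for the complex numbers can be spot-checked numerically. The sketch below assumes NumPy; the function name `rep` mirrors the notation in the text, and the sample values `a` and `b` are arbitrary choices for illustration.

```python
import numpy as np

I2 = np.eye(2)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # companion matrix of x^2 + 1

def rep(z):
    """Ring representation u0*I + u1*J of the complex number z = u0 + u1*i."""
    return z.real * I2 + z.imag * J

a, b = 2.0 + 3.0j, -1.0 + 0.5j
sum_ok = np.allclose(rep(a) + rep(b), rep(a + b))       # addition is preserved
product_ok = np.allclose(rep(a) @ rep(b), rep(a * b))   # multiplication is preserved
j_squared = J @ J                                       # mirrors i^2 = -1
```

A check on two sample values is of course not a proof; the algebraic identities in the text are what establish the properties in general.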

MOTIVATION FOR USING RING REPRESENTATIONS

As stated in the Companion Matrices section, one interpretation of the companion matrix is as an answer to the inverse eigenvalue problem. Namely, given a polynomial p, find a matrix M such that p is the characteristic polynomial of M. We will show that ring representations are a multivariate generalization of companion matrices. In this section, we will motivate discussion of Emiris’ algorithm for finding ring representations with an example.

Consider the following system of equations:

$$\begin{aligned} f_1 &= x^2 + y^2 - 10 = 0 \\ f_2 &= x^2 + xy + 2y^2 - 16 = 0. \end{aligned}$$

Let R now be the real numbers. These above equations are elements in the polynomial ring R[x, y]. If we could find ring representations X and Y for x and y, respectively, in a ring where $f_1 = f_2 = 0$, the following would be true by the properties of ring representations:

$$\begin{aligned} f_1 &= X^2 + Y^2 - 10I = 0 \\ f_2 &= X^2 + XY + 2Y^2 - 16I = 0. \end{aligned}$$



Furthermore, because the ring defined by the equations is commutative, X and Y would necessarily commute. As a result, if X and Y are nondefective (for example, if their eigenvalues are distinct), then they are simultaneously diagonalizable (they can be diagonalized with the same basis or, equivalently, they share all of their eigenvectors). Consequently, the diagonals of the diagonalized representations must be the coordinates of the roots. If V is the matrix of eigenvectors for X and Y, then the statements above can be expressed symbolically as follows:

$$\begin{aligned} V^{-1}(X^2 + Y^2 - 10I)V &= (V^{-1}XV)^2 + (V^{-1}YV)^2 - 10(V^{-1}IV) \\ &= \begin{bmatrix} x_1 & 0 & 0 & 0 \\ 0 & x_2 & 0 & 0 \\ 0 & 0 & x_3 & 0 \\ 0 & 0 & 0 & x_4 \end{bmatrix}^2 + \begin{bmatrix} y_1 & 0 & 0 & 0 \\ 0 & y_2 & 0 & 0 \\ 0 & 0 & y_3 & 0 \\ 0 & 0 & 0 & y_4 \end{bmatrix}^2 - \begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 10 & 0 \\ 0 & 0 & 0 & 10 \end{bmatrix} \\ &= \begin{bmatrix} x_1^2 + y_1^2 - 10 & 0 & 0 & 0 \\ 0 & x_2^2 + y_2^2 - 10 & 0 & 0 \\ 0 & 0 & x_3^2 + y_3^2 - 10 & 0 \\ 0 & 0 & 0 & x_4^2 + y_4^2 - 10 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \end{aligned}$$

Similarly,

$$V^{-1}(X^2 + XY + 2Y^2 - 16I)V = \operatorname{diag}\bigl(x_1^2 + x_1 y_1 + 2y_1^2 - 16,\; x_2^2 + x_2 y_2 + 2y_2^2 - 16,\; x_3^2 + x_3 y_3 + 2y_3^2 - 16,\; x_4^2 + x_4 y_4 + 2y_4^2 - 16\bigr) = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

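Indeed, it turns out for this example that ring representations for x and y are as follows:

$$X = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 4 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 3 & 2 & 0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 6 & 0 & 0 & -1 \\ 0 & 3 & -2 & 0 \end{bmatrix}.$$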

It is easy to verify

$$\begin{aligned} f_1 = X^2 + Y^2 - 10I &= \begin{bmatrix} 0 & 1 & 0 & 0 \\ 4 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 3 & 2 & 0 \end{bmatrix}^2 + \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 6 & 0 & 0 & -1 \\ 0 & 3 & -2 & 0 \end{bmatrix}^2 - \begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 10 & 0 \\ 0 & 0 & 0 & 10 \end{bmatrix} \\ &= \begin{bmatrix} 4 & 0 & 0 & 1 \\ 0 & 7 & 2 & 0 \\ 0 & 3 & 2 & 0 \\ 12 & 0 & 0 & 5 \end{bmatrix} + \begin{bmatrix} 6 & 0 & 0 & -1 \\ 0 & 3 & -2 & 0 \\ 0 & -3 & 8 & 0 \\ -12 & 0 & 0 & 5 \end{bmatrix} - \begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 10 & 0 \\ 0 & 0 & 0 & 10 \end{bmatrix} \\ &= \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \end{aligned}$$

Similar calculations can be done for $f_2$. The eigenvalues for X and Y that should yield the coordinates of the roots of the polynomial system are $\lambda_X = (\pm\sqrt{8}, \pm 1)$ and $\lambda_Y = (\pm\sqrt{2}, \mp 3)$. Indeed,

$$\begin{aligned} (\pm\sqrt{8})^2 + (\pm\sqrt{2})^2 - 10 &= 0 \\ (\pm\sqrt{8})^2 + (\pm\sqrt{8})(\pm\sqrt{2}) + 2(\pm\sqrt{2})^2 - 16 &= 0 \\ (\pm 1)^2 + (\mp 3)^2 - 10 &= 0 \\ (\pm 1)^2 + (\pm 1)(\mp 3) + 2(\mp 3)^2 - 16 &= 0. \end{aligned}$$
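The claims about X and Y can be checked directly. The sketch below assumes NumPy and the basis ordering [1, x, y, xy] for the rows and columns of X and Y (an assumption inferred from the matrices, not stated in this section). Because X and Y individually can be diagonalized in more than one way, the sketch diagonalizes the combination 8I + 28X + 80Y, whose eigenvalues are distinct for this system, and reads each root from a shared eigenvector.

```python
import numpy as np

X = np.array([[0, 1, 0, 0], [4, 0, 0, 1], [0, 0, 0, 1], [0, 3, 2, 0]], dtype=float)
Y = np.array([[0, 0, 1, 0], [0, 0, 0, 1], [6, 0, 0, -1], [0, 3, -2, 0]], dtype=float)
I4 = np.eye(4)

commute = np.allclose(X @ Y, Y @ X)
f1_residual = X @ X + Y @ Y - 10 * I4
f2_residual = X @ X + X @ Y + 2 * Y @ Y - 16 * I4

# Eigenvectors of a generic combination are shared by X and Y; each is a
# multiple of [1, x, y, xy] evaluated at one root of the system.
F = 8 * I4 + 28 * X + 80 * Y
_, V = np.linalg.eig(F)
V = (V / V[0]).real                      # scale so the entry for "1" equals 1
roots = sorted((round(x, 6), round(y, 6)) for x, y in zip(V[1], V[2]))
```

The four recovered pairs are the roots (±√8, ±√2) and (±1, ∓3) listed above.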

EMIRIS’ ALGORITHM

We have established that, if we have a way of determining the ring representations for X and Y, we can solve the system of equations. Emiris’ algorithm, a method to accomplish just that, is discussed notionally here. For a full proof and description see Ref. 1 or 2.

Given polynomials $f_1, f_2, \ldots, f_n$ of degrees $d_1, d_2, \ldots, d_n$, respectively, the algorithm is as follows:

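1. Introduce an extra polynomial $f_0 = u_0 + u_1 x_1 + u_2 x_2 + \cdots + u_n x_n$, where $u_0, u_1, \ldots, u_n$ are arbitrarily chosen constants with at least one being non-zero

2. Consider $B = \{ x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \mid i_0 + i_1 + \cdots + i_n = D,\; 0 \le i_0, i_1, \ldots, i_n \le D \}$, where $D = d_1 + d_2 + \cdots + d_n - n + 1$ (that is, B contains every monomial of total degree at most D)

3. Consider

$$\begin{aligned} S_n &= \{ b \in B \mid x_n^{d_n} \text{ divides } b \} \\ S_{n-1} &= \{ b \in B \mid x_{n-1}^{d_{n-1}} \text{ divides } b \text{ but } x_n^{d_n} \text{ does not} \} \\ S_{n-2} &= \{ b \in B \mid x_{n-2}^{d_{n-2}} \text{ divides } b \text{ but } x_n^{d_n}, x_{n-1}^{d_{n-1}} \text{ do not} \} \\ &\;\;\vdots \\ S_1 &= \{ b \in B \mid x_1^{d_1} \text{ divides } b \text{ but } x_n^{d_n}, x_{n-1}^{d_{n-1}}, \ldots, x_2^{d_2} \text{ do not} \} \\ S_0 &= \Bigl\{ b \in B \;\Big|\; b \notin \bigcup_{i=1}^{n} S_i \Bigr\} \end{aligned}$$

4. Find M, a linear transformation on the monomials in B, such that, with $B(\vec{x}) = B(x_1, x_2, \ldots, x_n)$ written as the column vector of monomials ordered by $S_0, S_1, \ldots, S_n$,

$$M \cdot B(\vec{x}) = M \begin{bmatrix} S_0(\vec{x}) \\ S_1(\vec{x}) \\ \vdots \\ S_n(\vec{x}) \end{bmatrix} = \begin{bmatrix} S_0(\vec{x})\, f_0(\vec{x}) \\ \bigl(S_1(\vec{x}) / x_1^{d_1}\bigr) f_1(\vec{x}) \\ \vdots \\ \bigl(S_n(\vec{x}) / x_n^{d_n}\bigr) f_n(\vec{x}) \end{bmatrix}$$

5. Consider a root $\vec{p} = (p_1, p_2, \ldots, p_n)$ of $f_1, f_2, \ldots, f_n$, for which

$$M \cdot B(\vec{p}) = M \begin{bmatrix} S_0(\vec{p}) \\ S_1(\vec{p}) \\ \vdots \\ S_n(\vec{p}) \end{bmatrix} = \begin{bmatrix} S_0(\vec{p})\, f_0(\vec{p}) \\ \bigl(S_1(\vec{p}) / p_1^{d_1}\bigr) f_1(\vec{p}) \\ \vdots \\ \bigl(S_n(\vec{p}) / p_n^{d_n}\bigr) f_n(\vec{p}) \end{bmatrix} = \begin{bmatrix} S_0(\vec{p})\, f_0(\vec{p}) \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

6. Break M into blocks as follows, where the first block row and column correspond to $S_0$ and the second to $S_1, \ldots, S_n$:

$$M \cdot B(\vec{p}) = \begin{bmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{bmatrix} \begin{bmatrix} S_0(\vec{p}) \\ S_1(\vec{p}) \\ \vdots \\ S_n(\vec{p}) \end{bmatrix} = \begin{bmatrix} S_0(\vec{p})\, f_0(\vec{p}) \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

7. Block triangularize with the following transformation:

$$\begin{bmatrix} I & -M_{01} M_{11}^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{bmatrix} \begin{bmatrix} S_0(\vec{p}) \\ S_1(\vec{p}) \\ \vdots \\ S_n(\vec{p}) \end{bmatrix} = \begin{bmatrix} \tilde{M} & 0 \\ M_{10} & M_{11} \end{bmatrix} \begin{bmatrix} S_0(\vec{p}) \\ S_1(\vec{p}) \\ \vdots \\ S_n(\vec{p}) \end{bmatrix} = \begin{bmatrix} I & -M_{01} M_{11}^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} f_0(\vec{p})\, S_0(\vec{p}) \\ 0 \end{bmatrix} = \begin{bmatrix} f_0(\vec{p})\, S_0(\vec{p}) \\ 0 \end{bmatrix}$$

8. As a result of the above, $\tilde{M} = \tilde{M}(u_1, u_2, \ldots, u_n) = M_{00} - M_{01} M_{11}^{-1} M_{10}$

9. Solve the following eigenvector problem: $\tilde{M} S_0(\vec{p}) = f_0(\vec{p})\, S_0(\vec{p}) = (u_0 + u_1 p_1 + \cdots + u_n p_n)\, S_0(p_1, p_2, \ldots, p_n)$

10. Since $S_0(p_1, p_2, \ldots, p_n) = [a, a p_1, a p_2, \ldots, a p_n, a p_1 p_2, a p_1 p_3, \ldots]^T$ (the scale factor a appears because the eigenvector will be derived numerically), extracting the root coordinates can be done by dividing the eigenvector by a and collecting the 2nd through (n + 1)th entries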

The roots of the polynomial system are the coordinates (p1, p2, . . . , pn).
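The steps above can be sketched for the two-variable system $f_1 = x^2 + y^2 - 10$, $f_2 = x^2 + xy + 2y^2 - 16$ from the previous section. This is an illustrative sketch under stated assumptions, not the implementation of Ref. 1: it is hard-coded for n = 2, it assumes the block $M_{11}$ is invertible (true for this system), it picks the monomial ordering 1, x, y, x², xy, ..., and the dictionary representation of polynomials and the name `reduced_matrix` are choices made here for illustration.

```python
import itertools
import numpy as np

# A polynomial is a dict mapping exponents (i, j) to the coefficient of x^i y^j.
f1 = {(2, 0): 1.0, (0, 2): 1.0, (0, 0): -10.0}               # x^2 + y^2 - 10
f2 = {(2, 0): 1.0, (1, 1): 1.0, (0, 2): 2.0, (0, 0): -16.0}  # x^2 + xy + 2y^2 - 16
d1, d2 = 2, 2
D = d1 + d2 - 2 + 1  # D = d_1 + d_2 - n + 1 with n = 2

def reduced_matrix(u0, u1, u2):
    """Steps 1-8 for f0 = u0 + u1*x + u2*y; returns (S0, M_tilde)."""
    f0 = {(0, 0): u0, (1, 0): u1, (0, 1): u2}
    # Step 2: monomials of total degree <= D, ordered 1, x, y, x^2, xy, ...
    B = sorted((e for e in itertools.product(range(D + 1), repeat=2) if sum(e) <= D),
               key=lambda e: (sum(e), -e[0]))
    # Step 3: partition B by divisibility by y^d2, then by x^d1.
    S2 = [e for e in B if e[1] >= d2]
    S1 = [e for e in B if e[0] >= d1 and e not in S2]
    S0 = [e for e in B if e not in S1 and e not in S2]
    order = S0 + S1 + S2
    column = {e: k for k, e in enumerate(order)}
    # Step 4: the row for monomial b holds the coefficients of
    # b*f0 (b in S0), (b/x^d1)*f1 (b in S1), or (b/y^d2)*f2 (b in S2).
    M = np.zeros((len(order), len(order)))
    for row, b in enumerate(order):
        if b in S0:
            shift, f = b, f0
        elif b in S1:
            shift, f = (b[0] - d1, b[1]), f1
        else:
            shift, f = (b[0], b[1] - d2), f2
        for (i, j), c in f.items():
            M[row, column[(i + shift[0], j + shift[1])]] += c
    # Steps 6-8: block partition and Schur complement M00 - M01 M11^{-1} M10.
    k = len(S0)
    M00, M01, M10, M11 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    return S0, M00 - M01 @ np.linalg.solve(M11, M10)

# Steps 9-10: eigenvectors of M_tilde are multiples of S0(p) = [1, p1, p2, p1*p2].
S0, M_tilde = reduced_matrix(8.0, 28.0, 80.0)
_, V = np.linalg.eig(M_tilde)
V = (V / V[0]).real                      # divide each eigenvector by its first entry
roots = sorted(zip(V[1].round(6), V[2].round(6)))
```

Calling `reduced_matrix(0.0, 1.0, 0.0)` (that is, $f_0 = x$) returns the ring representation X from the previous section, which is one way to see that $\tilde{M}$ is the ring representation of $f_0$.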

APPLICATION

One application for Emiris’ algorithm is trilateration. Trilateration is achieved by measuring the time difference of arrival between pairs of sensors. The so-called “TDOA” (time difference of arrival) equation is $\|x - S_1\| - \|x - S_2\| = \pm\,\mathrm{TDOA} \cdot C$, where $S_1$ and $S_2$ are sensor positions, x is an emitter position, TDOA is the time difference of arrival, and C is the emission propagation rate. It is well known that this can be represented as a polynomial (indeed, it is hyperbolic). Thus, the polynomial that describes the level surface above is derived as follows. Writing each sensor position in terms of the midpoint and half-difference of the pair gives

$$\bigl\|x - [(S_1 + S_2)/2 + (S_1 - S_2)/2]\bigr\| - \bigl\|x - [(S_1 + S_2)/2 - (S_1 - S_2)/2]\bigr\| = \pm\,\mathrm{TDOA} \cdot C.$$

Let

$$u = (S_1 + S_2)/2, \quad v = (S_1 - S_2)/2, \quad y = x - u, \quad 2\Delta = \mathrm{TDOA} \cdot C.$$

The equation above becomes $\|y - v\| - \|y + v\| = \pm 2\Delta$ or, absorbing the overall sign into the ±,

$$\begin{aligned} \|y + v\| - \|y - v\| &= \pm 2\Delta \\ \|y + v\| &= \pm 2\Delta + \|y - v\| \\ \|y + v\|^2 &= 4\Delta^2 \pm 4\Delta \|y - v\| + \|y - v\|^2 \\ y^T y + 2 y^T v + v^T v &= 4\Delta^2 \pm 4\Delta \|y - v\| + y^T y - 2 y^T v + v^T v \\ 4\, y \cdot v &= 4\Delta^2 \pm 4\Delta \|y - v\| \\ y \cdot v - \Delta^2 &= \pm \Delta \|y - v\| \\ (y^T v)^2 - 2\Delta^2\, y^T v + \Delta^4 &= \Delta^2 \|y - v\|^2 \\ (y^T v)^2 - 2\Delta^2\, y^T v + \Delta^4 &= \Delta^2 y^T y - 2\Delta^2\, y^T v + \Delta^2 v^T v \\ (y^T v)^2 + \Delta^4 &= \Delta^2 y^T y + \Delta^2 v^T v \\ y^T v v^T y + \Delta^4 &= \Delta^2 y^T y + \Delta^2 v^T v \\ y^T (v v^T - \Delta^2 I)\, y - \Delta^2 v^T v + \Delta^4 &= 0. \end{aligned}$$

As an example, let’s consider the problem of locating a gunshot within a city. Suppose there are three acoustic sensors (S1, S2, and S3) at (9, 39), (65, 10), and (64, 71), respectively, in a local Cartesian coordinate system with meters as the distance unit. Suppose further that a gunshot is heard by S1 at 19:19:57.0875, by S2 at 19:19:57.1797, and by S3 at 19:19:57.1719, and the speed of sound (C) is 341 m/s. Also suppose the unknown emitter location is at (27, 42) and that it emitted at 19:19:57.0340 (see Fig. 1).

The time difference of arrival between S1 and S2 is 0.0922 s and between S1 and S3 it is 0.0844 s. Hence, we wish to solve the system

$$\begin{aligned} \sqrt{(x - 9)^2 + (y - 39)^2} - \sqrt{(x - 65)^2 + (y - 10)^2} &= \pm 0.0922 \cdot 341 \\ \sqrt{(x - 9)^2 + (y - 39)^2} - \sqrt{(x - 64)^2 + (y - 71)^2} &= \pm 0.0844 \cdot 341. \end{aligned}$$

Equivalently, we solve

$$\begin{aligned} -207485.6417 - 19846.0765x + 31843.3751y + 537.0281x^2 - 812xy - 36.7219y^2 &= 0 \\ 2480777.5687 - 88508.5223x - 37529.9994y + 549.4318x^2 + 880xy + 49.1818y^2 &= 0. \end{aligned}$$
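The quadric $y^T (v v^T - \Delta^2 I)\, y - \Delta^2 v^T v + \Delta^4 = 0$ with $y = x - u$ can be expanded into coefficients of x numerically. The sketch below assumes NumPy; the function name `tdoa_quadric` and the (A, b, c) packaging are illustrative choices, and the range difference is computed from the stated sensor and emitter coordinates rather than from the rounded arrival times (an assumption made here so that the residual check is exact up to floating point).

```python
import numpy as np

def tdoa_quadric(s1, s2, range_difference):
    """Coefficients (A, b, c) of x^T A x + b^T x + c = 0 for one sensor pair.

    range_difference is TDOA * C, i.e., 2 * Delta in the derivation.
    """
    s1, s2 = np.asarray(s1, dtype=float), np.asarray(s2, dtype=float)
    u = (s1 + s2) / 2.0                      # midpoint of the two sensors
    v = (s1 - s2) / 2.0                      # half of the baseline vector
    delta2 = (range_difference / 2.0) ** 2   # Delta squared
    A = np.outer(v, v) - delta2 * np.eye(len(u))
    # Expand (x - u)^T A (x - u) - Delta^2 v^T v + Delta^4 in powers of x.
    b = -2.0 * A @ u
    c = u @ A @ u - delta2 * (v @ v) + delta2 ** 2
    return A, b, c

s1, s2 = np.array([9.0, 39.0]), np.array([65.0, 10.0])
emitter = np.array([27.0, 42.0])
# Assumption: exact geometric range difference instead of rounded timestamps.
rd = np.linalg.norm(emitter - s1) - np.linalg.norm(emitter - s2)
A, b, c = tdoa_quadric(s1, s2, rd)
residual = emitter @ A @ emitter + b @ emitter + c   # should vanish at the emitter
```

For the S1/S2 pair, the quadratic coefficients `A[0, 0]`, `2 * A[0, 1]`, and `A[1, 1]` come out close to 537.03, –812, and –36.72, in line with the first polynomial of the system.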

To apply Emiris’ algorithm, we pick $f_0 = u_0 + u_1 x + u_2 y = 8 + 28x + 80y$ and calculate the ring representation $F_0 = u_0 + u_1 X + u_2 Y = 8 + 28X + 80Y$:

$$F_0 = \begin{bmatrix} 8 & 28 & 80 & 0 \\ -48618.3988 & 2548.0324 & -112.9542 & 84.5799 \\ -2483454.438 & 62895.7242 & 64660.2744 & -1549.6064 \\ -73249499.44 & 1969279.3923 & 1768850.5654 & -43682.6736 \end{bmatrix}.$$



[Figure 1. Emitter and sensor locations. Axes: x and y in the local Cartesian coordinate system. Shown: Sensor 1 at (9, 39), Sensor 2 at (65, 10), Sensor 3 at (64, 71), and the unknown emitter P at (27, 42), each labeled with its arrival or emission time.]

[Figure 2. Emitter and sensor locations and eigenvector solutions. Axes: x and y in the local Cartesian coordinate system. Shown: Sensor 1 at (9, 39), Sensor 2 at (65, 10), Sensor 3 at (64, 71), the emitter found at P = (27, 42), and the candidate solutions given by the eigenvectors.]

The Author

Michael P. Williams

Michael Peretzian Williams received his B.S. in Mathematics in 1998 from the University of South Carolina. He earned his M.S. in Mathematics in 2001, again from the University of South Carolina, focusing on Number Theory. He received his Ph.D. in Mathematics in 2004 from North Carolina State University, working in Lie Algebras. From 2005 to 2007, Dr. Williams worked as a researcher for Northrop Grumman, developing algorithms for Dempster–Shafer fusion. He has been with APL’s Global Engagement Department since 2007, and his current work is in upstream data fusion. His e-mail address is [email protected].

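The eigenvectors of $F_0$, with rows corresponding to the monomials 1, x, y, and xy, are

$$V = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 25.9777 & 70.1086 & 27 & 44.1845 \\ 254.0739 & 39.2354 & 42 & -100.0836 \\ 6600.2544 & 2750.7347 & 1134 & -4422.1484 \end{bmatrix}.$$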

The second and third rows give the x and y coordinates of the solutions, one solution per column, and we have recovered the emitter location at (27, 42), as seen in Fig. 2.
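The eigenvector claim for the emitter can be checked against the printed matrix. The sketch below assumes NumPy, assumes the basis ordering [1, x, y, xy], and uses the entries of $F_0$ as transcribed from the article (rounded as printed, so the residual is small but not exactly zero).

```python
import numpy as np

# Entries transcribed from the printed F0 (basis order 1, x, y, xy).
F0 = np.array([
    [8.0, 28.0, 80.0, 0.0],
    [-48618.3988, 2548.0324, -112.9542, 84.5799],
    [-2483454.438, 62895.7242, 64660.2744, -1549.6064],
    [-73249499.44, 1969279.3923, 1768850.5654, -43682.6736],
])

x, y = 27.0, 42.0
v = np.array([1.0, x, y, x * y])          # monomials 1, x, y, xy at the emitter
eigenvalue = 8.0 + 28.0 * x + 80.0 * y    # f0 evaluated at (27, 42)
residual = F0 @ v - eigenvalue * v        # zero for an exact eigenpair
max_residual = np.max(np.abs(residual))
```

The remaining nonzero residual is attributable to the four-decimal rounding of the printed entries.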

CONCLUSION

We have discussed some eigenvector methods for finding the roots of multivariate polynomials. Unlike iterative, numerical methods typically applied to this problem, the methods outlined in this article possess the numerical stability of numerical linear algebra, do not require a good initial guess of the solution, and give all solutions simultaneously. Furthermore, when an iterative method is started from a sufficiently poor initial guess, the methods outlined herein may arrive at the solutions more quickly than the iterative method converges.

REFERENCES

1. Emiris, I. Z., “On the Complexity of Sparse Elimination,” J. Complexity 12, 134–166 (1996).
2. Cox, D., Little, J., and O’Shea, D., Using Algebraic Geometry, Graduate Texts in Mathematics Series, Vol. 185, Springer, New York, 2nd Ed., pp. 122–128 (2005).
