Factorization of Polynomials and Real Analytic Functions

Radoslaw (Radek) Stefanski
Department of Mathematics & Computer Science, University of Richmond
April 27, 2004

I would like to express my deep thanks to Prof. William T. Ross for advice and guidance on this project.
Abstract

In this project, we address the question: when can a polynomial p(x, y) of two variables be factored as p(x, y) = f(x)g(y), where f and g are polynomials of one variable? We answer this question using linear algebra and create a Mathematica program which carries out this factorization. For example,
\[
3 + 3x - 5x^3 + y + xy - \tfrac{5}{3}x^3y + y^2 + xy^2 - \tfrac{5}{3}x^3y^2 = \left(1 + x - \tfrac{5}{3}x^3\right)(3 + y + y^2).
\]
We then generalize this concept and ask: when can p(x, y) be written as
\[
p(x, y) = f_1(x)g_1(y) + f_2(x)g_2(y) + \cdots + f_r(x)g_r(y),
\]
where the f_j, g_j are polynomials? This can certainly be done (for large enough r). What is the minimum such r? Again, we have a Mathematica program which carries out this computation. For example,
\[
1 + 2x + x^2 + 2x^3 + 2y + 2x^2y + 7xy^2 + 7x^3y^2 = (1 + x^2)(1 + 2y) + (x + x^3)(2 + 7y^2).
\]
We generalize this further to a larger number of variables (with an appropriate Mathematica program to carry out this computation). We then apply this and consider the domains of convergence of certain types of real analytic functions, and try to relate the domain of convergence with the rank of the polynomial.
Contents

1. Introduction
2. Factorization of polynomials in two dimensions
3. An Introduction to Real Analytic Functions
4. Factorization of Multivariate Real Analytic Functions
5. Factorization of Complex Valued Real Analytic Functions
6. Factorization of polynomials in three dimensions
7. Algorithm and examples in two and n dimensions
8. Alternative algorithm and examples in three dimensions
1. Introduction
If R[x] is the set of all polynomials with real coefficients, then the standard factorization of a polynomial p ∈ R[x] is derived from a theorem in abstract algebra¹. This theorem states that p ∈ R[x] can be written as
\[
p(x) = v_1(x)^{s_1} \cdots v_n(x)^{s_n},
\]
where s_i ∈ N and v_1(x), …, v_n(x) are irreducible (i.e., they cannot be non-trivially factored further as products of elements of R[x]).
Moreover, if R[x, y] is the set of all polynomials in x and y with real coefficients, a similar theorem from abstract algebra states that every polynomial p ∈ R[x, y] can be written as
\[
p(x, y) = v_1(x, y)^{s_1} \cdots v_n(x, y)^{s_n},
\]
where s_i ∈ N and v_1(x, y), …, v_n(x, y) are irreducible in the same sense as above.
We are interested in a different type of factorization of elements in R[x, y]. If p ∈ R[x, y], we wish to write
\[
p(x, y) = f(x)g(y),
\]
where f ∈ R[x] and g ∈ R[y], but f and g are not necessarily irreducible in R[x] and R[y] respectively (we are not concerned about further factorization of f and g).
Although unconventional, this form of factorization has been seen before. The separable (but not necessarily polynomial) solutions to certain differential or partial differential equations, such as the Laplace equation
\[
\Delta u = 0
\]
or the wave equation
\[
u_{xx} - u_{yy} = 0,
\]
are exactly of this form: u(x, y) = f(x)g(y).
Another factorization problem appears in Fourier analysis in the following form: suppose that u is a bounded function on the unit circle with a complex Fourier series
\[
u \sim \sum_{n=-\infty}^{\infty} \hat{u}(n) e^{\imath n\theta},
\]
where
\[
\hat{u}(n) = \frac{1}{2\pi} \int_0^{2\pi} u(e^{\imath\theta}) e^{-\imath n\theta}\, d\theta
\]
are the Fourier coefficients of u. When can we write u = fg, where f and g are bounded on the circle, and
\[
f \sim \sum_{n=0}^{\infty} \hat{f}(n) e^{\imath n\theta}
\quad \text{and} \quad
g \sim \sum_{n=0}^{\infty} \hat{g}(-n) e^{-\imath n\theta}
\]

¹Gallian, J., Contemporary Abstract Algebra, 1998, Houghton Mifflin College Division, Boston.
are bounded functions? A theorem of Bourgain² states that this is possible if and only if
\[
\int_0^{\pi} \log |f(e^{\imath\theta})|\, d\theta > -\infty.
\]
The above polynomial factorization can be generalized to n dimensions. If p ∈ R[x_1, …, x_n], we wish to write
\[
p(x_1, \ldots, x_n) = f_1(x_1) \cdots f_n(x_n),
\]
where f_i ∈ R[x_i], but the f_i's need not be irreducible in R[x_i]. We formulate an algorithm to factor such polynomials and create a Mathematica program to carry out the computations.
We generalize the above factorization method by assuming p ∈ R^∞[x, y], by which we mean the set of formal power series
\[
p(x, y) = \sum_{n,m=0}^{\infty} a_{n,m} x^n y^m.
\]
After reviewing some basic notions about the domains of convergence of formal power series, we will focus on the following problem: for a given p ∈ R^∞[x, y], when can we write
\[
p(x, y) = p_1(x) q_1(y),
\]
where p_1 ∈ R^∞[x] and q_1 ∈ R^∞[y]? We also generalize to factorizations of the form
\[
p(x, y) = \sum_{j=1}^{r} p_j(x) q_j(y),
\]
where p_j ∈ R^∞[x] and q_j ∈ R^∞[y]. Such functions are called functions of finite rank. We then go on to determine the domain of convergence of such functions and explore the relationship between the rank and the domain of convergence.
²Bourgain, J., A Problem of Douglas and Rudin, Pacific Journal of Mathematics, 121 (1986), pp. 47–50.
2. Factorization of polynomials in two dimensions
We begin our examination of factorizing polynomials with some
basic definitions.
Definition 2.1. The degree of a polynomial p(x, y) = a_{0,0} + ⋯ + a_{m,n} x^m y^n is the highest power of x or y with non-zero coefficient.
Example 2.2.
(1) deg(2 + x + x²) = 2
(2) deg(2 + x²y + xy) = 2
(3) deg(x⁶y² + y⁴x⁵) = 6
The notation in Definition 2.1 is not quite standard, since others define the degree to be the highest number m + n. So, for example, others define the degree of p(x, y) = xy as 2, while we define it to be one. This non-standard notation will make things easier for us later on.
Definition 2.3. We say a polynomial p(x, y) is of rank one if and only if p(x, y) = p_1(x)p_2(y), where p_1, p_2 are polynomials.
Definition 2.4. For a polynomial
\[
p(x, y) = \sum_{0 \le i,j \le n} a_{i,j} x^i y^j
\]
of degree n, we define the coefficient matrix C(p) to be the (n+1) × (n+1) matrix whose (i, j)th entry is a_{i,j}. Hence,
\[
C(p) := \begin{pmatrix} a_{0,0} & \cdots & a_{0,n} \\ \vdots & & \vdots \\ a_{n,0} & \cdots & a_{n,n} \end{pmatrix}.
\]
Recall from basic linear algebra that the set of linear combinations of the column vectors of an n × m matrix A is called the column space of A. Similarly, the set of all linear combinations of the row vectors is called the row space of A. The rank of a matrix is the dimension of the column or row space, which, by a well-known fact from linear algebra, are the same³.
Theorem 2.5. A polynomial p(x, y) is of rank one if and only if C(p) is of rank one.
Proof. Suppose p(x, y) = p_1(x)p_2(y). Then, since p is of degree n, the functions p_1 and p_2 must each be of degree at most n. Hence, p_1 = \(\sum_{i=0}^{n} b_i x^i\) and p_2 = \(\sum_{j=0}^{n} c_j y^j\) for some constants b_i and c_j, where i, j ∈ {0, 1, …, n}. Hence, if this factorization is possible, we write:
³Strang, G., Introduction to Linear Algebra, 1998, Wellesley-Cambridge Press, Boston.
\[
p(x, y) = \sum_{0 \le i,j \le n} a_{i,j} x^i y^j
\overset{.}{=} p_1(x) p_2(y)
= \left(\sum_{i=0}^{n} b_i x^i\right)\left(\sum_{j=0}^{n} c_j y^j\right)
= \sum_{0 \le i,j \le n} b_i c_j x^i y^j
\]
For \(\overset{.}{=}\) to be satisfied, and hence for the factorization to be possible, it is clear that we must set the coefficients of x and y in p(x, y) equal to the coefficients of p_1(x)p_2(y). Hence,
\[
a_{0,0} = b_0 c_0, \quad a_{1,0} = b_1 c_0, \quad \ldots, \quad a_{n,n} = b_n c_n.
\]
Or, simplifying this notation with matrices, we can write:
\[
\begin{pmatrix} a_{0,0} & \cdots & a_{0,n} \\ \vdots & & \vdots \\ a_{n,0} & \cdots & a_{n,n} \end{pmatrix}
= \begin{pmatrix} b_0 c_0 & \cdots & b_0 c_n \\ \vdots & & \vdots \\ b_n c_0 & \cdots & b_n c_n \end{pmatrix}
\]
Notice, however, that each subsequent column of
\[
\begin{pmatrix} b_0 c_0 & \cdots & b_0 c_n \\ \vdots & & \vdots \\ b_n c_0 & \cdots & b_n c_n \end{pmatrix}
\]
is a multiple of the first column; that is, the matrix is of rank one. The above argument can be reversed, and so we can conclude that p is of rank one exactly when the corresponding coefficient matrix, C(p), is of rank one. □
Let us now try a simple example to demonstrate this theorem.
Example 2.6. Factor the polynomial
\[
p(x, y) = 8 + 12y + 16y^2 - 4x - 6xy - 8xy^2 + 6x^2 + 9x^2y + 12x^2y^2
\]
as p(x, y) = p_1(x)p_2(y), if possible.
First, let us write down the coefficient matrix of the above polynomial:
\[
C = \begin{pmatrix} 8 & 12 & 16 \\ -4 & -6 & -8 \\ 6 & 9 & 12 \end{pmatrix}
\]
(Notice that deg(p) = 2, and so C(p) is a 3 × 3 matrix.) Next, we column reduce
the matrix C to
\[
C_{cr} = \begin{pmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & 0 \\ \tfrac{3}{4} & 0 & 0 \end{pmatrix}
\]
But this means that C is a rank one matrix. So, according to the previous theorem, we can factor the polynomial p(x, y).
Notice that, since the matrix C is of rank one, each column in that matrix can be written as a product of a constant and the basis column vector, in the following fashion:
\[
\begin{pmatrix} 8 & 12 & 16 \\ -4 & -6 & -8 \\ 6 & 9 & 12 \end{pmatrix}
= \begin{pmatrix} 8(1) & 12(1) & 16(1) \\ 8(-\tfrac{1}{2}) & 12(-\tfrac{1}{2}) & 16(-\tfrac{1}{2}) \\ 8(\tfrac{3}{4}) & 12(\tfrac{3}{4}) & 16(\tfrac{3}{4}) \end{pmatrix}
\]
But we have already seen this in the previous theorem:
\[
\begin{pmatrix} a_{0,0} & a_{0,1} & a_{0,2} \\ a_{1,0} & a_{1,1} & a_{1,2} \\ a_{2,0} & a_{2,1} & a_{2,2} \end{pmatrix}
= \begin{pmatrix} b_0c_0 & b_0c_1 & b_0c_2 \\ b_1c_0 & b_1c_1 & b_1c_2 \\ b_2c_0 & b_2c_1 & b_2c_2 \end{pmatrix}
\]
We can thus use this to solve for the b_i's and c_j's. This means that:
\[
b_0 = 1, \quad c_0 = 8; \qquad b_1 = -\tfrac{1}{2}, \quad c_1 = 12; \qquad b_2 = \tfrac{3}{4}, \quad c_2 = 16.
\]
Finally, using these values, we can come up with the factored expression
\[
p(x, y) = 8 + 12y + 16y^2 - 4x - 6xy - 8xy^2 + 6x^2 + 9x^2y + 12x^2y^2
= (b_0 + b_1 x + b_2 x^2)(c_0 + c_1 y + c_2 y^2)
= \left(1 - \tfrac{1}{2}x + \tfrac{3}{4}x^2\right)(8 + 12y + 16y^2)
\]
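The computation in Example 2.6 is easy to automate. The project's program is written in Mathematica; purely as an illustrative sketch (not the author's code), the following Python/NumPy function tests whether C(p) has rank one and, if so, reads off the coefficient vectors of p_1 and p_2 exactly as in the proof of Theorem 2.5.

```python
import numpy as np

def factor_rank_one(C, tol=1e-10):
    """Given a coefficient matrix C (C[i, j] = a_{i,j}), try to write
    p(x, y) = p1(x) p2(y).  Returns vectors (b, c) with C[i, j] = b[i]*c[j],
    or None if C is not of rank one."""
    C = np.asarray(C, dtype=float)
    if np.linalg.matrix_rank(C, tol=tol) != 1:
        return None
    # Pick the "heaviest" column as the basis column b; every other column
    # is then a multiple of it -- the structure b_i c_j from Theorem 2.5.
    j0 = np.argmax(np.abs(C).sum(axis=0))
    b = C[:, j0]
    i0 = np.argmax(np.abs(b))
    b = b / b[i0]              # normalize so b[i0] = 1
    c = C[i0, :]               # then c[j] = C[i0, j]
    return b, c

# Coefficient matrix of p(x,y) = 8 + 12y + 16y^2 - 4x - 6xy - 8xy^2
#                                + 6x^2 + 9x^2 y + 12x^2 y^2
C = [[8, 12, 16], [-4, -6, -8], [6, 9, 12]]
b, c = factor_rank_one(C)
print(b)   # coefficients of p1(x): [1, -0.5, 0.75]
print(c)   # coefficients of p2(y): [8, 12, 16]
```

For the matrix of Example 2.6 this reproduces p_1(x) = 1 − x/2 + 3x²/4 and p_2(y) = 8 + 12y + 16y².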
The following is an alternate characterization of rank one polynomials involving partial differential equations.
Theorem 2.7. Suppose p(x, y) has no zeros. Then p is rank one if and only if
\[
\partial_y\left(\frac{\partial_x p}{p}\right) = 0.
\]
Proof. (⇒): Suppose
\[
p(x, y) = p_1(x) p_2(y).
\]
Then
\[
\frac{\partial_x p}{p} = \frac{p_1'(x) p_2(y)}{p_1(x) p_2(y)} = \frac{p_1'(x)}{p_1(x)}.
\]
Hence, substituting p_1′(x)/p_1(x) for ∂_x p/p,
\[
\partial_y\left(\frac{\partial_x p}{p}\right) = \partial_y\left(\frac{p_1'(x)}{p_1(x)}\right) = 0.
\]
(⇐): Suppose
\[
\partial_y\left(\frac{\partial_x p}{p}\right) = 0.
\]
Integrating this equation with respect to y, we obtain
\[
\frac{\partial_x p}{p} = q(x)
\]
for some function q. Next, we integrate with respect to x:
\[
\int \frac{\partial_x p}{p}\, dx = \int q(x)\, dx,
\]
and so
\[
\ln p(x, y) = G(x) + H(y),
\]
where G′(x) = q(x). Exponentiating both sides of this expression, we get
\[
p(x, y) = e^{G(x) + H(y)} = e^{G(x)} e^{H(y)} = p_1(x) p_2(y),
\]
where p_1(x) = e^{G(x)} and p_2(y) = e^{H(y)}. Note that since p(x, y) is a polynomial, p_1 and p_2 are also polynomials. But this means that p(x, y) is of rank one. □
Example 2.8. Consider the polynomial p(x, y) = −1 − 3x − x² + y + 3xy + x²y + 2y² + 6xy² + 2x²y². Note that
\[
\frac{\partial_x p}{p} = \frac{-3 - 2x + 3y + 2xy + 6y^2 + 4xy^2}{-1 - 3x - x^2 + y + 3xy + x^2y + 2y^2 + 6xy^2 + 2x^2y^2}.
\]
After some calculation,
\[
\frac{\partial_x p}{p}
= \frac{(3 + 2x)(1 + y)(-1 + 2y)}{(1 + 3x + x^2)(1 + y)(-1 + 2y)}
= \frac{3 + 2x}{1 + 3x + x^2}.
\]
So now we see that ∂_x p/p is a function of x only. Hence,
\[
\partial_y\left(\frac{\partial_x p}{p}\right) = 0.
\]
According to our theorem, p(x, y) is of rank one.
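The criterion of Theorem 2.7 is easy to check with a computer algebra system. The following SymPy sketch (an illustration only, not part of the project's Mathematica code) applies it to Example 2.8 and to a rank two polynomial:

```python
import sympy as sp

x, y = sp.symbols('x y')

def is_rank_one_pde(p):
    """Theorem 2.7 criterion: a zero-free p(x, y) is rank one iff
    d/dy (p_x / p) vanishes identically."""
    expr = sp.diff(sp.diff(p, x) / p, y)
    return sp.cancel(expr) == 0   # cancel reduces the rational function

# Example 2.8: p = (1 + 3x + x^2)(-1 + y + 2y^2), a rank one polynomial
p = (-1 - 3*x - x**2 + y + 3*x*y + x**2*y
     + 2*y**2 + 6*x*y**2 + 2*x**2*y**2)
print(is_rank_one_pde(p))                         # True
print(is_rank_one_pde(1 + y**2 + x*y + x*y**2))   # rank two -> False
```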
Is it possible to say something about polynomials of rank not equal to one? Is it possible to factor such polynomials, and if so, in what way? In order to answer these questions, we define the general rank of a polynomial.
Definition 2.9. We say that a polynomial p is of rank r if and only if there exists an integer r > 0 such that
\[
p(x, y) = \sum_{i=1}^{r} f_i(x) g_i(y),
\]
where f_i, g_i are polynomials, and r is the smallest integer to satisfy this condition.
To see why we require r to be the smallest integer to satisfy the above condition, consider the following example.
Example 2.10. Let p(x, y) = xy − xy. Notice that if we do not require r to be minimal, the rank of p is 2. In fact, if we try to loosen the definition, the rank of p can be any positive integer, just by adding (xy − xy) = 0 to p. The correct way to calculate the rank of p is to notice that p(x, y) = 0 and hence that the rank of p is zero.
Notice also that this definition of rank extends and complements the earlier definition of a rank one polynomial.
Theorem 2.11. A polynomial p(x, y) is of rank r if and only if the corresponding coefficient matrix, C(p), is of rank r.
Proof. If r = 1, then p(x, y) = p_1(x)q_1(y). This, however, is simply the rank one case, which we have proved above. Therefore, we assume that r ≥ 2.
(⇐): We assume that C, the coefficient matrix of p(x, y), is of rank r. By the definition of the rank of a matrix, C must have r basis vectors; that is, each column of the matrix C, taken as a vector, is a linear combination of r given column vectors. Let us assume that these r column vectors are
\[
\{u_1, u_2, \ldots, u_r\}.
\]
From the definition of the rank of a matrix, we know that C can be written as
\[
C = \begin{pmatrix} c_{11}u_1 + \cdots + c_{r1}u_r \mid \cdots \mid c_{1n}u_1 + \cdots + c_{rn}u_r \end{pmatrix},
\]
where each entry represents a column of C, n indicates the number of columns in the coefficient matrix, and the c_{ij}'s are real numbers.
Notice now that C can be separated in the following way:
\[
C = \begin{pmatrix} c_{11}u_1 + \cdots + c_{r1}u_r \mid \cdots \mid c_{1n}u_1 + \cdots + c_{rn}u_r \end{pmatrix}
= \begin{pmatrix} c_{11}u_1 \mid \cdots \mid c_{1n}u_1 \end{pmatrix} + \cdots + \begin{pmatrix} c_{r1}u_r \mid \cdots \mid c_{rn}u_r \end{pmatrix}
= \sum_{i=1}^{r} \begin{pmatrix} c_{i1}u_i \mid \cdots \mid c_{in}u_i \end{pmatrix}.
\]
Each \(\begin{pmatrix} c_{i1}u_i \mid \cdots \mid c_{in}u_i \end{pmatrix}\) is a rank one matrix, since each column is a multiple of the first column. Hence, by Theorem 2.5, each such rank one matrix corresponds to a polynomial that can be factored into a product of one-variable polynomials, p_i(x)q_i(y). But since C is a sum of r such matrices, we conclude that
\[
p(x, y) = \sum_{i=1}^{r} p_i(x) q_i(y).
\]
Since r is minimal, p is of rank r.
(⇒): The proof in this direction is almost identical to the one presented above. We begin by assuming
\[
p(x, y) = \sum_{i=1}^{r} p_i(x) q_i(y).
\]
Since p is a sum of r rank one polynomials, each of those polynomials can be represented by a rank one matrix. This means that the coefficient matrix C can be written as a sum of r rank one matrices. We write
\[
C = \sum_{i=1}^{r} \begin{pmatrix} c_{i1}u_i \mid \cdots \mid c_{in}u_i \end{pmatrix}
= \begin{pmatrix} c_{11}u_1 \mid \cdots \mid c_{1n}u_1 \end{pmatrix} + \cdots + \begin{pmatrix} c_{r1}u_r \mid \cdots \mid c_{rn}u_r \end{pmatrix}
= \begin{pmatrix} c_{11}u_1 + \cdots + c_{r1}u_r \mid \cdots \mid c_{1n}u_1 + \cdots + c_{rn}u_r \end{pmatrix},
\]
where the u_i's are simply basis vectors and the c_{ij}'s are constants. Each column of C is hence a linear combination of the r basis vectors
\[
\{u_1, u_2, \ldots, u_r\},
\]
and so C is a rank r matrix. □
Let us now consider an example of the above theorem.
Example 2.12. Let p(x, y) = 1 + y² + xy + xy². Let's use the above theorem to
(1) find its coefficient matrix C,
(2) find the basis vectors of C,
(3) find the rank of p(x, y),
(4) find p(x, y)'s factorization.

So,
(1)
\[
C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}
\]
(2) Notice that C can be written as
\[
C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.
\]
It is now easy to see that the (column) basis vector of the first matrix is \((1, 0, 0)^T\) and that of the second matrix is \((0, 1, 0)^T\). Hence, the basis for the column space of C is
\[
\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}.
\]
(3) Since the basis contains two vectors, we know that the rank of C (and hence of p) is 2.
(4) To solve this, either use Theorem 2.5 or, due to the simplicity of the example, simply regroup p(x, y). So,
\[
p(x, y) = (1 + y^2) + x(y + y^2).
\]
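The decomposition in Example 2.12 can be carried out mechanically: repeatedly split off a rank one "cross" through a nonzero pivot, which lowers the rank of the coefficient matrix by one at each step. The following NumPy sketch is an illustration only (it is not the project's Mathematica program, and the greedy pivot choice is one of several possible strategies); for the C above it recovers exactly the regrouping 1·(1 + y²) + x·(y + y²).

```python
import numpy as np

def rank_one_terms(C, tol=1e-10):
    """Split a coefficient matrix C into a minimal list of rank one outer
    products u v^T, so that p(x, y) = sum_i p_i(x) q_i(y), where u holds
    the x-coefficients and v the y-coefficients of each term."""
    R = np.asarray(C, dtype=float).copy()
    terms = []
    while np.max(np.abs(R)) > tol:
        # pivot at the largest remaining entry
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        u = R[:, j] / R[i, j]     # column through the pivot, normalized
        v = R[i, :].copy()        # row through the pivot
        terms.append((u, v))
        R = R - np.outer(u, v)    # subtracting drops the rank by one
    return terms

# Example 2.12: p(x, y) = 1 + y^2 + x y + x y^2
C = np.array([[1.0, 0, 1], [0, 1, 1], [0, 0, 0]])
terms = rank_one_terms(C)
print(len(terms))   # 2, so rank(p) = 2
```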
We leave this section with an open question. Recall that a non-zero polynomial p(x, y) is of rank one if and only if
\[
\partial_y\left(\frac{\partial_x p}{p}\right) = 0.
\]
Is there a partial differential equation whose satisfaction is necessary and sufficient for a polynomial to be of rank r?
3. An Introduction to Real Analytic Functions
We now wish to generalize the above results for polynomials p(x, y) to real analytic functions
\[
f(x, y) = \sum_{m,n=0}^{\infty} a_{m,n} x^m y^n.
\]
Before doing this, however, we need to address the issue of convergence of these expressions. We begin with some standard results on single-variable real analytic functions⁴.
The formal expression
\[
\sum_{j=0}^{\infty} a_j (x - \alpha)^j,
\]
with either real or complex constants a_j, is called a power series on the real line R. We usually take the coefficients a_j to be real, and there is no loss of generality in doing so. Before much can be done with this series, it is necessary to determine the nature of the set on which the power series converges.
Proposition 3.1. Assume that the power series
\[
\sum_{j=0}^{\infty} a_j(x - \alpha)^j
\]
converges at the value x = c. Let R = |c − α|. Then the series converges uniformly and absolutely on compact subsets of I = {x : |x − α| < R}.
Proof. We may take the compact subset of I to be K = [α − s, α + s] for some number 0 < s < R. For x ∈ K it then holds that
\[
\sum_{j=0}^{\infty} \left| a_j (x - \alpha)^j \right|
= \sum_{j=0}^{\infty} \left| a_j (c - \alpha)^j \right| \cdot \left| \frac{x - \alpha}{c - \alpha} \right|^j.
\]
In the sum on the right, the first factor in absolute values is bounded by some constant C (by the convergence hypothesis). The quotient in absolute values is no bigger than L = s/R < 1. The series on the right is thus dominated by
\[
\sum_{j=0}^{\infty} C \cdot L^j.
\]
This geometric series converges. So, by the Weierstrass M-test⁵, the original series converges absolutely and uniformly on K. □
The above theorem says that the domain of convergence of a power series must be an interval. This interval may be bounded, as for the power series ∑xⁿ, or unbounded, as for the power series ∑xⁿ/n!.
Definition 3.2. The set on which
\[
\sum_{j=0}^{\infty} a_j(x - \alpha)^j
\]
converges is an interval centered about α. This interval is termed the interval of convergence. The series converges absolutely and uniformly on compact subsets of the interval of convergence. The radius of convergence is defined to be half the interval's length.

⁴Krantz, S. G. and Parks, H. R., A Primer of Real Analytic Functions, Second edition, Birkhäuser Advanced Texts: Basel Textbooks, Boston.
⁵Gaughan, E., Introduction to Analysis, 1998, Brooks-Cole, New York.
We remind the reader of the following useful theorem of Hadamard⁶.
Theorem 3.3. If R is the radius of convergence of
\[
\sum_{n=0}^{\infty} a_n(x - \alpha)^n,
\]
then
\[
R = \frac{1}{\limsup_{n \to \infty} |a_n|^{1/n}}.
\]
Whether convergence holds at the endpoints of the interval of convergence needs to be determined on a case-by-case basis.
Definition 3.4. A function f : U ⊆ R → R is said to be real analytic at α if the function f may be expressed as a power series on some interval of positive radius centered at α:
\[
f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j.
\]
We say the function is real analytic on V ⊆ U if it is real analytic at each α ∈ V.
Next, we present without proof the basic properties of real
analytic functions.
Proposition 3.5. Let
\[
f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j
\quad \text{and} \quad
g(x) = \sum_{j=0}^{\infty} b_j(x - \alpha)^j
\]
be two power series defining the functions f(x) and g(x) on the open intervals of convergence C₁ and C₂ respectively. Then, on their common domain C = C₁ ∩ C₂, it holds that:
(1) \(f(x) \pm g(x) = \sum_{j=0}^{\infty}(a_j \pm b_j)(x - \alpha)^j\);
(2) \(f(x) \cdot g(x) = \sum_{m=0}^{\infty} \sum_{j+k=m}(a_j \cdot b_k)(x - \alpha)^m\);
(3) if g ≠ 0 on C, there exists h(x) on C such that \(h(x) = \frac{f(x)}{g(x)} = \sum_{j=0}^{\infty} d_j(x - \alpha)^j\), for some constants d_j.
Proposition 3.6. Let
\[
\sum_{j=0}^{\infty} a_j(x - \alpha)^j
\]
be a power series with open interval of convergence C, and let f(x) be the function defined by the series on the interval C. Then the function f is continuous and has continuous, real analytic derivatives of all orders at α.

⁶Saff, E. B. and Snider, A. D., Fundamentals of Complex Analysis with Applications to Engineering, Science, and Mathematics, 1993, Prentice-Hall, New Jersey.
Using this proposition, it is easy to show that a real analytic function has a unique power series representation:
Corollary 3.7. If the function f is written as a convergent power series on a given interval of positive radius centered at α,
\[
f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j,
\]
then the coefficients of the power series can be obtained from the derivatives of the function by
\[
a_n = \frac{f^{(n)}(\alpha)}{n!}.
\]
Proof. To obtain this result, simply differentiate both sides of the above equation n times and evaluate at α. Differentiation is possible by the previous proposition. □
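Corollary 3.7 is easy to verify symbolically. In the SymPy sketch below (an illustration; the function e^x cos x is an arbitrary choice of real analytic function), the coefficients computed from derivatives agree with SymPy's own Taylor expansion about α = 0:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.exp(x) * sp.cos(x)

# Corollary 3.7: a_n = f^(n)(0) / n!
coeffs = [sp.diff(f, x, n).subs(x, 0) / sp.factorial(n) for n in range(6)]
print(coeffs)   # [1, 1, 0, -1/3, -1/6, -1/30]

# Cross-check against sympy's own Taylor expansion of f about 0
taylor = sp.expand(sp.series(f, x, 0, 6).removeO())
assert all(taylor.coeff(x, n) == coeffs[n] for n in range(6))
```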
We pause for a moment to point out that although real analytic functions are infinitely differentiable, the converse is not true. For example, the function
\[
f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0; \\ 0, & x = 0 \end{cases}
\]
is infinitely differentiable. It is somewhat technical to show that f^{(h)}(0) exists for all h, but nevertheless it can be done. Moreover,
\[
f^{(h)}(0) = 0 \quad \forall h \ge 0,
\]
and so the power series of f about x = 0 is just zero. Thus, f does not equal its power series about x = 0, and so f is not real analytic, although it is infinitely differentiable. The real analytic functions are indeed a very special class of functions.
The term real analytic comes from the fact that if the power series of
\[
f(x) = \sum a_n(x - \alpha)^n
\]
converges on (α − R, α + R), then the series
\[
f(z) = \sum a_n(z - \alpha)^n,
\]
where z = x + ıy is a complex variable, converges and is an analytic function on the ball {z ∈ C : |z − α| < R}. Conversely, if f is any analytic function with a power series which converges on {z ∈ C : |z − α| < R} with α ∈ R, then f(x) is a real analytic function with power series converging on (α − R, α + R). We shall say more about this in later sections.
We now talk about real analytic functions of several variables. In order to generalize the power series to higher dimensions, we introduce multi-index notation.
Definition 3.8. A multi-index µ is an m-tuple (µ₁, µ₂, …, µ_m) of non-negative integers. We write
\[
\Lambda(m) = \mathbb{N} \times \cdots \times \mathbb{N},
\]
or alternatively, \(\Lambda(m) = \mathbb{N}^m\).
Definition 3.9. For
\[
\mu = (\mu_1, \mu_2, \ldots, \mu_m) \in \Lambda(m)
\quad \text{and} \quad
x = (x_1, x_2, \ldots, x_m) \in \mathbb{R}^m,
\]
we define the following operations:
\[
\mu! = \mu_1!\, \mu_2! \cdots \mu_m!,
\]
\[
|\mu| = \mu_1 + \mu_2 + \cdots + \mu_m,
\]
\[
x^\mu = x_1^{\mu_1} x_2^{\mu_2} \cdots x_m^{\mu_m},
\]
\[
|x^\mu| = |x_1|^{\mu_1} |x_2|^{\mu_2} \cdots |x_m|^{\mu_m},
\]
\[
\frac{\partial^\mu}{\partial x^\mu}
= \frac{\partial^{\mu_1}}{\partial x_1^{\mu_1}} \frac{\partial^{\mu_2}}{\partial x_2^{\mu_2}} \cdots \frac{\partial^{\mu_m}}{\partial x_m^{\mu_m}}.
\]
And for µ = (µ₁, µ₂, …, µ_m) ∈ Λ(m) and ν = (ν₁, ν₂, …, ν_m) ∈ Λ(m), we write µ ≤ ν if µ_j ≤ ν_j for j = 1, 2, …, m.
Definition 3.10. The formal expression
\[
\sum_{\mu \in \Lambda(m)} a_\mu (x - \alpha)^\mu,
\]
with α ∈ Rᵐ and a_µ ∈ R for each µ, is called a power series in m variables.
Definition 3.11. The power series
\[
\sum_{\mu \in \Lambda(m)} a_\mu(x - \alpha)^\mu
\]
is said to converge at x if there is a one-to-one and onto function φ : Z⁺ → Λ(m) such that the series
\[
\sum_{j=0}^{\infty} a_{\varphi(j)}(x - \alpha)^{\varphi(j)}
\]
converges.
Definition 3.12. For a fixed power series \(\sum_\mu a_\mu(x - \alpha)^\mu\), we set
\[
C = \bigcup_{r > 0} \Big\{ x \in \mathbb{R}^m : \sum_\mu |a_\mu(y - \alpha)^\mu| < \infty \text{ for all } |y - x| < r \Big\}.
\]
This set is called the domain of convergence.
Definition 3.13. We say that a function f : U ⊂ Rᵐ → R is called real analytic if, for each α ∈ U, the function f may be represented by a convergent power series in some neighborhood of α.
In a similar fashion to Proposition 3.5, it is relatively simple to prove the following:
Proposition 3.14. Let U, V ⊂ Rᵐ be open. If f : U → R and g : V → R are real analytic, then f ± g and f · g are real analytic on U ∩ V, and f/g is real analytic on U ∩ V ∩ {x : g(x) ≠ 0}.
Lemma 3.15. Suppose that g : (−a, a) → R is a real analytic function and g(y) = 0 on an open interval I ⊆ (−a, a). Then g ≡ 0.
Proof. If g is real analytic on (−a, a), then we can write \(g(x) = \sum_{n=0}^{\infty} a_n x^n\) on (−a, a). Hence, we can say that \(g(z) := \sum_{n=0}^{\infty} a_n z^n\) is analytic on {|z| < a}. The hypothesis states that the zeroes of g have an accumulation point in I, and since the zeroes of an analytic function which is not identically zero cannot have an accumulation point, g must be identically zero. □
Real analytic functions of one variable have domains of convergence equal to intervals. For functions of several variables, the geometry of the domain of convergence is more complicated.
Definition 3.16. We say a set G in a linear space is convex if for any two points x, y ∈ G, each point z = λx + (1 − λ)y, for 0 < λ < 1, also belongs to G.
Example 3.17. A subset of R² in the shape of a pentagon is convex, whereas a subset of R³ in the shape of a donut is not.
Definition 3.18. For a set G ⊂ Rᵐ, we define log‖G‖ as
\[
\log\|G\| = \{(\log|g_1|, \ldots, \log|g_m|) : g = (g_1, \ldots, g_m) \in G\}.
\]
The set G is said to be logarithmically convex if log‖G‖ is a convex subset of Rᵐ.
Before proving the next theorem, let us establish some facts.
Remark 3.19. For a fixed power series \(\sum_\mu a_\mu(x - \alpha)^\mu\), we denote by B the set of points x ∈ Rᵐ where \(\sum_\mu |a_\mu||x - \alpha|^\mu\) is bounded. It is clear that if the power series converges at a point x, then x ∈ B. Furthermore, using a result called Abel's Lemma⁷, it is possible to show that Int(B) = C, where C is the domain of convergence of the power series. This information allows us to say something about the shape of the domain of convergence of a power series.
Theorem 3.20. For a power series \(\sum_\mu a_\mu x^\mu\), the domain of convergence C is logarithmically convex.

⁷Krantz, S. G. and Parks, H. R., A Primer of Real Analytic Functions, Second edition, Birkhäuser Advanced Texts: Basel Textbooks, Boston.
Proof. Fix two points y = (y₁, …, y_m) ∈ C and z = (z₁, …, z_m) ∈ C, and let 0 ≤ λ ≤ 1. Now, by the above remark, y ∈ C means that y ∈ Int(B). Hence, by the definition of an open set, for some ε > 0, (|y₁| + ε, …, |y_m| + ε) ∈ B. But, by the above remark, this means that there exists some constant L such that, for every µ,
\[
|a_\mu| \prod_{j=1}^{m}(|y_j| + \epsilon)^{\mu_j} \le L.
\]
Simplifying and rewriting, this becomes
\[
|a_\mu| \le \frac{L}{\prod_{j=1}^{m}(|y_j| + \epsilon)^{\mu_j}}.
\]
By the same process, replacing ε by a smaller positive number and L by a larger number if necessary (and without changing notation), we also have
\[
|a_\mu| \le \frac{L}{\prod_{j=1}^{m}(|z_j| + \epsilon)^{\mu_j}}.
\]
Notice that, because we fixed y, z and λ, we can choose ε′ > 0 such that the following two inequalities hold for j = 1, …, m:
\[
(|y_j| + \epsilon)^{\lambda} \ge |y_j|^{\lambda} + \epsilon'
\]
and
\[
(|z_j| + \epsilon)^{1-\lambda} \ge |z_j|^{1-\lambda} + \epsilon'.
\]
Then we can choose σ > 0 so that, for j = 1, …, m,
\[
(|y_j|^{\lambda} + \epsilon')(|z_j|^{1-\lambda} + \epsilon') \ge |y_j|^{\lambda}|z_j|^{1-\lambda} + \sigma
\]
holds. Putting these facts together, we conclude that
\[
|a_\mu| = |a_\mu|^{\lambda} |a_\mu|^{1-\lambda}
\le \frac{L}{\prod_{j=1}^{m}\big(|y_j|^{\lambda}|z_j|^{1-\lambda} + \sigma\big)^{\mu_j}}.
\]
Thus, \((|y_1|^{\lambda}|z_1|^{1-\lambda}, \ldots, |y_m|^{\lambda}|z_m|^{1-\lambda}) \in \mathrm{Int}(B) = C\), or equivalently
\[
\lambda(\log|y_1|, \ldots, \log|y_m|) + (1 - \lambda)(\log|z_1|, \ldots, \log|z_m|) \in \log\|C\|.
\]
That is, the domain of convergence, C, is logarithmically convex. □
Example 3.21. Show that a square in R², S = {(x, y) : a < x < b, c < y < d}, with 0 < a < b and 0 < c < d, is logarithmically convex.
We want to show that
\[
\log\|S\| = \{(u, v) = (\log|x|, \log|y|) : (x, y) \in S\}
\]
is convex. Knowing that a < x < b and c < y < d, we can write
\[
\log a < \log|x| < \log b
\quad \text{and} \quad
\log c < \log|y| < \log d.
\]
But this is just
\[
\log a < u < \log b
\quad \text{and} \quad
\log c < v < \log d.
\]
These restrictions define a square bounded by log a, log b, log c and log d in the u-v plane. But this means that log‖S‖ is convex, and hence S is logarithmically convex.
Example 3.22. Show that the domain of convergence of the power series \(\sum_{n=0}^{\infty}(xy)^n = \frac{1}{1-xy}\), defined on |xy| < 1, is logarithmically convex.
Notice that any point within this domain satisfies the inequality |xy| < 1. Hence the domain of convergence of the above power series is
\[
S = \{(x, y) : |xy| < 1\}.
\]
We want to show that
\[
\log\|S\| = \{(u, v) = (\log|x|, \log|y|) : (x, y) \in S\}
\]
is convex. Knowing that |xy| < 1, we write
\[
\log|xy| < \log 1,
\]
which is simply
\[
\log|x| + \log|y| < 0.
\]
Using the u-v notation, we rewrite this as
\[
v < -u.
\]
This, however, clearly implies that log‖S‖ is convex (since it is simply the region below the line v = −u in the u-v plane). Hence, S is logarithmically convex. As a side note, it is important to remember that even though S is logarithmically convex, it is by no means convex.
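The conclusion of Example 3.22 can be spot-checked numerically: if log‖S‖ is convex, then for two points of S with positive coordinates, the midpoint of their (log x, log y) images pulls back to the point (√(x₁x₂), √(y₁y₂)), which must again lie in S. An illustrative Python check (the sampling range is arbitrary):

```python
import math
import random

random.seed(0)

def in_S(x, y):
    """Domain of convergence of sum (xy)^n."""
    return abs(x * y) < 1

# Midpoint test: for P = (x1, y1) and Q = (x2, y2) in S, the midpoint of
# their log-images corresponds to (sqrt(x1*x2), sqrt(y1*y2)).
checked = 0
for _ in range(1000):
    x1, y1, x2, y2 = (random.uniform(0.01, 5.0) for _ in range(4))
    if in_S(x1, y1) and in_S(x2, y2):
        assert in_S(math.sqrt(x1 * x2), math.sqrt(y1 * y2))
        checked += 1
print(checked, "midpoint checks passed")
```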
With this new knowledge in hand, we now consider the factorization of multivariate real analytic functions.
4. Factorization of Multivariate Real Analytic Functions
In this section, we return to examining rank one functions, with the added constraint of real analyticity. Consider first a definition of rank one real analytic functions.
Definition 4.1. A real analytic function with domain of convergence C ⊆ R²,
\[
f(x_1, x_2) = \sum_\mu a_\mu x^\mu,
\]
where x = (x₁, x₂) ∈ R² and µ ∈ Λ(2), is said to be rank one if and only if
\[
f(x_1, x_2) = g_1(x_1) g_2(x_2).
\]
A clear question arises from this definition. Since f is real analytic on its domain, is it necessarily true that g₁ and g₂ are real analytic on their respective domains as well? As it turns out, this conjecture is true.
Theorem 4.2. Suppose that
\[
f(x_1, x_2) = \sum_\mu a_\mu x^\mu,
\]
where x = (x₁, x₂) ∈ R² and µ ∈ Λ(2), has domain of convergence C and
\[
f(x_1, x_2) = g_1(x_1) g_2(x_2).
\]
Then C = (−a₁, a₁) × (−a₂, a₂), and g₁(x₁), g₂(x₂) are real analytic on (−a₁, a₁) and (−a₂, a₂) respectively.
Proof. Expanding f, notice that
\[
f(x_1, x_2) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} x_2^{\mu_2}.
\]
Now, choose a constant c such that \(\vec{P} = (x_1, c) \in C\), c ≠ 0 and g_i(c) ≠ 0 for i ∈ {1, 2}. Notice that this last fact is guaranteed by Lemma 3.15.
Next, substitute \(\vec{P}\) into f:
\[
f(x_1, c) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} c^{\mu_2} = g_1(x_1) g_2(c).
\]
Finally, remembering that g₂(c) ≠ 0, divide both sides by g₂(c) to obtain
\[
g_1(x_1) = \sum_{|\mu|=0}^{\infty} d_\mu x_1^{\mu_1},
\quad \text{where } d_\mu = \frac{a_\mu c^{\mu_2}}{g_2(c)}.
\]
Proceed similarly for g₂. Hence, g₁ and g₂ are real analytic and can be represented by power series.
Since g₁ and g₂ can be represented by power series in x₁ and x₂, the functions converge on intervals (−a₁, a₁) and (−a₂, a₂) respectively. Hence, f = g₁(x₁)g₂(x₂) converges on (−a₁, a₁) × (−a₂, a₂). □
Having shown this theorem for rank one real analytic functions, it is interesting to see whether it will also hold for rank two, three, or even rank r. Before considering these cases, let us first give a precise definition of rank r and derive a proposition (in two parts) that will be helpful in our proof.
Definition 4.3. A real analytic function with domain of convergence C ⊆ R²,
\[
f(x_1, x_2) = \sum_\mu a_\mu x^\mu,
\]
where x = (x₁, x₂) ∈ R² and µ ∈ Λ(2), is said to be rank r if and only if
\[
f(x_1, x_2) = \sum_{i=1}^{r} f_i(x_1) g_i(x_2),
\]
with r the smallest such integer.
Proposition 4.4. Suppose that
\[
f(x_1, x_2) = \sum_\mu a_\mu x^\mu,
\]
where x = (x₁, x₂) ∈ R², has domain of convergence C and is rank two, i.e.,
\[
f(x_1, x_2) = f_1(x_1) g_1(x_2) + f_2(x_1) g_2(x_2).
\]
Then the following hold:
(1) g₁(x₂) ≠ cg₂(x₂) for every constant c (that is, g₁ is not a multiple of g₂);
(2) there exist constants c, d such that the matrix
\[
\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}
\]
is invertible.
Proof. (1) Proceed by contradiction. Assume that g₁(x₂) = cg₂(x₂) for all x₂ and some constant c. Then
\[
f(x_1, x_2) = f_1(x_1)\, c\, g_2(x_2) + f_2(x_1) g_2(x_2) = g_2(x_2)\big(c f_1(x_1) + f_2(x_1)\big).
\]
Therefore f(x₁, x₂) is rank one. But we already know that f(x₁, x₂) is rank two; the contradiction is reached. Therefore, g₁ is not a multiple of g₂.
(2) We will prove the second part using the above fact, that g₁ is not a multiple of g₂. Proceed by contradiction: suppose g₁(c)g₂(d) − g₁(d)g₂(c) = 0 for all c and d. Choose d such that g₂(d) ≠ 0.⁸ Therefore, we rewrite the first statement as
\[
g_1(c) g_2(d) = g_1(d) g_2(c).
\]
Now, since g₂(d) ≠ 0, divide both sides by g₂(d) to obtain
\[
g_1(c) = \frac{g_1(d)}{g_2(d)}\, g_2(c)
\]
for all constants c. But notice that in the above statement, g₁(d)/g₂(d) is just some constant, k. Therefore,
\[
g_1(c) = k g_2(c)
\]
for all c. This is a contradiction, since g₁ is not a multiple of g₂. Therefore,
\[
g_1(c) g_2(d) - g_1(d) g_2(c) \ne 0
\]
for some c and d. Also,
\[
g_1(c) g_2(d) - g_1(d) g_2(c) = \det \begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}.
\]
Therefore,
\[
\det \begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix} \ne 0,
\]
and hence
\[
\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}
\]
is invertible. □
With this proposition in hand, we will now consider the rank two version of Theorem 4.2.
Theorem 4.5. Suppose f is a real analytic rank two function with domain of convergence C ⊆ R², i.e.,
\[
f(x_1, x_2) = f_1(x_1) g_1(x_2) + f_2(x_1) g_2(x_2).
\]
Then C = L₁ ∩ L₂, where L_i = (−a_i, a_i) × (−b_i, b_i) and where f_i(x) and g_i(y) are defined on (−a_i, a_i) and (−b_i, b_i) respectively.

⁸To convince yourself of this fact, consider the following argument. If g₂(d) = 0 for all d, then f(x₁, x₂) = f₁(x₁)g₁(x₂) and f is rank one. But we know f is rank two, and so g₂(d) ≠ 0 for some d.
Proof. (Notice that we are effectively being asked to show that f_i and g_i are real analytic on their corresponding domains.)
Expanding f, we obtain:
\[
f(x_1, x_2) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} x_2^{\mu_2}.
\]
First, we show that f₁ and f₂ are real analytic. Then, by the same process, it can be shown that g₁ and g₂ are real analytic.
Choose non-zero constants c and d such that \(\vec{P}_1 = (x_1, c) \in C\), \(\vec{P}_2 = (x_1, d) \in C\) and such that the constants c and d satisfy Lemma 3.15.
Next, substitute \(\vec{P}_1\) into f and call the resulting function h₁(x₁),
\[
f(x_1, c) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} c^{\mu_2} = h_1(x_1),
\]
and \(\vec{P}_2\) into f and call that function h₂(x₁),
\[
f(x_1, d) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} d^{\mu_2} = h_2(x_1).
\]
Notice that h₁(x₁) and h₂(x₁) are real analytic. The following system of equations for h₁(x₁) and h₂(x₁) holds:
\[
h_1(x_1) = f(x_1, c) = f_1(x_1) g_1(c) + f_2(x_1) g_2(c),
\]
\[
h_2(x_1) = f(x_1, d) = f_1(x_1) g_1(d) + f_2(x_1) g_2(d).
\]
So, rewriting this in matrix notation, we obtain
\[
\begin{pmatrix} h_1(x_1) \\ h_2(x_1) \end{pmatrix}
= \begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}
\begin{pmatrix} f_1(x_1) \\ f_2(x_1) \end{pmatrix}.
\]
But, by the second part of the previous proposition, we know that
\[
\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}
\]
is invertible.
So, letting

( A  B )   ( g1(c)  g2(c) )^(-1)
( C  D ) = ( g1(d)  g2(d) )      ,

where A, B, C and D are constants, we see that

( f1(x1) )   ( A  B ) ( h1(x1) )   ( A h1(x1) + B h2(x1) )
( f2(x1) ) = ( C  D ) ( h2(x1) ) = ( C h1(x1) + D h2(x1) ).

Hence, we can read off the values of f1 and f2: f1(x1) = A h1(x1) + B h2(x1) and f2(x1) = C h1(x1) + D h2(x1). Since f1 and f2 are linear combinations of real analytic functions (recall that h1 and h2 are real analytic), f1 and f2 must be real analytic themselves.
Similarly, g1 and g2 must be real analytic.
Since each fi and gi can be represented by an analytic power series in its variable, the fi's converge on (−ai, ai) and the gi's converge on (−bi, bi). So the product figi converges on Li = (−ai, ai) × (−bi, bi), and hence f1(x1)g1(x2) + f2(x1)g2(x2) converges on L1 ∩ L2. Therefore, C = L1 ∩ L2. □
In fact, it is possible to prove the above theorem for real analytic functions of rank r in the same manner as above (except for more tedious labelling).

Theorem 4.6. Suppose f is a real analytic, rank r function with domain of convergence C ⊆ R^2,

f(x1, x2) = f1(x1)g1(x2) + . . . + fr(x1)gr(x2).

Then C = L1 ∩ . . . ∩ Lr, where Li = (−ai, ai) × (−bi, bi) and fi and gi are defined on (−ai, ai) and (−bi, bi), respectively.

Remark 4.7. Notice that the above theorem tells us that if f is real analytic and of finite rank, then the domain of convergence of f is a box.

Remark 4.8. Notice also that the above theorem says that if a rank r function is real analytic, each component fi and gi will also be real analytic.
5. Factorization of Complex Valued Real Analytic Functions
A very natural question to ask after seeing the theorems in the previous section is whether their converses hold. In order to answer this question we must consider extending our factorization to complex valued functions that are analytic. We begin with some introductory definitions and theorems for complex functions of one variable.
Theorem 5.1. If f : G ⊂ C → C is C^1, then f is analytic on G if

∂̄f = 0,

where ∂̄f = (1/2)(∂x + i∂y)f.

Theorem 5.2. The function f is analytic on G ⊂ C if and only if, at each point a ∈ G,

f(z) = ∑_{n=0}^{∞} b_n (z − a)^n,

for z in an open neighborhood of a and for some coefficients b_n.
In other words, the above theorem says that the function f is analytic if and only if f can be written locally as a convergent power series.

Theorem 5.3. For an analytic function, the coefficients b_n can be computed as

b_n = f^(n)(a) / n!.

Notice that an important relation exists between analytic and real analytic functions. In fact, real analytic functions are nothing more than analytic functions restricted to the real number line.
Let us now consider analytic functions of higher dimensions.

Definition 5.4. We define ∂̄_{zj}, for zj = xj + i yj, as

∂̄_{zj} = (1/2)(∂_{xj} + i ∂_{yj}),

for all j ∈ N+.

Theorem 5.5. The C^1 function f : G ⊂ C^m → C is analytic on G if ∂̄_{zj} f = 0 for all j.
Theorem 5.6. We say that a complex function f : G ⊂ C^m → C is analytic if and only if, for all a = (a1, · · · , am) ∈ G, z ∈ C^m and µ ∈ Λ(m),

f(z1, · · · , zm) = ∑_µ b_µ (z − a)^µ.

Theorem 5.7. A real analytic function f̂ : C ⊆ R^n → R is an analytic function f : C′ ⊆ C^n → C with all the variables (z1, . . . , zn) restricted to the reals.
It is exactly this fact that justifies introducing complex variables. We will now consider an example that will help us shed some light on why the converse of Theorem 4.2 (and, by generalization, Theorem 4.6) does not hold.
Example 5.8. Find the domain of convergence of f(x) = e^{1/(1+x^2)}, where f(x) = ∑_{n=0}^{∞} a_n x^n.

Using the usual methods of determining the radius of convergence, this problem would pose a challenge. However, if we simply notice that f(x) = e^{1/(1+x^2)} is the restriction to the real numbers of f(z) = e^{1/(1+z^2)}, the problem becomes astonishingly easy. We recall that f(z) = e^{1/(1+z^2)} is undefined at z = i and z = −i. As such, the domain of convergence for f(z) is |z| < 1. Remembering that z = x + iy and letting y = 0, we automatically obtain the domain of convergence for f(x) to be |x| < 1.
Having this result, we turn back to Theorem 4.2. We know from Theorem 4.2 that a real analytic finite rank function f(x1, x2) = ∑_µ a_µ x^µ converges on a box in R^2. The question thus naturally arises whether the converse is true. That is, is a real analytic function whose domain of convergence is a box necessarily of finite rank? Unfortunately, as we shall show, this is not the case.
Example 5.9. Let f(x, y) = exp( (1/(1+x^2)) (1/(1+y^2)) ).

First, let us make sure that this is a real analytic function. To do this we need to show that the function f(x, y) = exp( (1/(1+x^2)) (1/(1+y^2)) ) is simply a restriction of the complex function f(z, w) = exp( (1/(1+z^2)) (1/(1+w^2)) ), where z = x + it and w = y + is, to the real plane. But, by the same reasoning as in the above example, f(z, w) is analytic on D × D = {(z, w) ∈ C × C : |z| < 1, |w| < 1}. Hence, restricting f(z, w) to the real plane by setting t = 0 and s = 0, we see that f(x, y) is real analytic on B = {(x, y) ∈ R × R : |x| < 1, |y| < 1}. Notice also that B is simply a box in two dimensions.

Finally, we need to show that f is not of finite rank. This procedure is done by demonstration, rather than by strict mathematical proof, although a proof certainly seems to be achievable. We use Mathematica to calculate the power series of f to order 10, calling it f̂ and noting that f̂ has degree 10. Next, we place the coefficients of the resulting polynomial in an 11 × 11 coefficient matrix. We see that this matrix is made up of alternating zero and non-zero columns. In line with Theorem 2.11, we row reduce this matrix in order to find the rank. The results are encouraging. We see that the rank of this matrix is as high as it could be, judging from the number of non-zero columns. This matrix (and hence f̂) is of rank 6. We repeat this operation with power series of order 15, 20, 25 and 30. Each time, we see that the rank of the resultant coefficient matrix (and hence of f̂) is

rank(f̂) = ⌊deg(f̂)/2⌋ + 1,

where deg(f̂) is the degree of the given series expansion. It seems that this equation will remain true for higher orders of f̂. If the equation is indeed true, the following holds:

lim_{deg(f̂)→∞} rank(f̂) = ∞.

However, note that as deg(f̂) → ∞, the approximation f̂ ≈ f becomes exact. That is, as deg(f̂) → ∞, f̂ = f, and so the rank of f becomes infinite. Hence f(x, y) is a function that converges on a box and is not of finite rank, disproving the converse of Theorem 4.2.
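The degree-10 computation in this example can be reproduced without Mathematica. The sketch below is our own reconstruction (all names are ours, and we assume the truncation keeps all powers up to 10 in each variable): it builds the truncated series of f exactly over the rationals. Since f = e · exp(u − 1) with u = (1/(1+x^2))(1/(1+y^2)), and the scalar factor e multiplies every coefficient and so cannot change the rank, it suffices to work with exp(u − 1), whose coefficients are rational.

```python
from fractions import Fraction
from math import factorial

N = 10  # truncation degree in each variable, as in the paper's first experiment

def mul(P, Q):
    """Multiply two (N+1) x (N+1) coefficient matrices as polynomials in x and y,
    discarding every power above N in either variable."""
    R = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if P[i][j]:
                for k in range(N + 1 - i):
                    for l in range(N + 1 - j):
                        R[i + k][j + l] += P[i][j] * Q[k][l]
    return R

# u(x, y) = (1 - x^2 + x^4 - ...)(1 - y^2 + y^4 - ...) = 1/(1+x^2) * 1/(1+y^2)
u = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
for i in range(0, N + 1, 2):
    for j in range(0, N + 1, 2):
        u[i][j] = Fraction((-1) ** ((i + j) // 2))

# Work with v = u - 1, which has no constant term, so exp(v) is a finite sum
# in the truncated ring; v^m vanishes in the truncation for m > N.
v = [row[:] for row in u]
v[0][0] -= 1
fhat = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
fhat[0][0] = Fraction(1)
term = [[Fraction(1 if i == j == 0 else 0) for j in range(N + 1)] for i in range(N + 1)]
for m in range(1, N + 1):
    term = mul(term, v)
    for i in range(N + 1):
        for j in range(N + 1):
            fhat[i][j] += term[i][j] / factorial(m)

def rank(M):
    """Exact rank over the rationals by Gaussian elimination."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            if M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank(fhat))  # the paper reports rank 6 at degree 10
```

The alternating zero columns the text describes appear here too: every coefficient with an odd power of x or y vanishes, since f is even in each variable.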
6. Factorization of polynomials in three dimensions
The following section examines the factorization of polynomials in three unknowns. Before any new theorems can be proven, consider the following definitions.

Definition 6.1. We say a polynomial p(x1, x2, x3) is of rank one iff p(x1, x2, x3) = p1(x1)p2(x2)p3(x3), where p1, p2, p3 are polynomials.
This is analogous to the two dimensional definition.
Definition 6.2. An Mn×n×n grid is an n × n × n three dimensional matrix, where an entry is written as Ai,j,k, for 1 ≤ i, j, k ≤ n.

Definition 6.3. The ith face, or slice, of M, the grid defined by Ai,j,k for 1 ≤ i, j, k ≤ n, is the matrix Fj,k defined by Fj,k := Ai,j,k.

Definition 6.4. The jth layer of M, the grid defined by Ai,j,k for 1 ≤ i, j, k ≤ n, is the matrix Li,k defined by Li,k := Ai,j,k.
Let us attempt to visualize this concept in the following
example.
Example 6.5. Consider a 2 × 2 × 2 grid M defined by the following:

A1,1,1 = 1, A1,1,2 = 2, A1,2,1 = 2, A1,2,2 = 4, A2,1,1 = 3, A2,1,2 = 6, A2,2,1 = 6, A2,2,2 = 12.

The first face (or slice) of this grid is the matrix

( 1  2 )
( 2  4 )

and the second face is

( 3  6  )
( 6  12 ).

Furthermore, notice that the first layer of this grid is the matrix

( 1  2 )
( 3  6 )

and the second layer is

( 2  4  )
( 6  12 ).

Drawing this grid can also be helpful.
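For readers who prefer code to pictures, the grid of Example 6.5 can be sliced directly. A minimal Python sketch (our own representation, with 0-based indices):

```python
# The 2 x 2 x 2 grid of Example 6.5, stored as a nested list A[i][j][k]
# (0-based, so A[0][0][0] corresponds to A_{1,1,1} in the text).
A = [[[1, 2], [2, 4]],
     [[3, 6], [6, 12]]]

def face(A, i):
    """The ith face F with F[j][k] = A[i][j][k]."""
    return A[i]

def layer(A, j):
    """The jth layer L with L[i][k] = A[i][j][k]."""
    return [A[i][j] for i in range(len(A))]

print(face(A, 0), face(A, 1))    # [[1, 2], [2, 4]] [[3, 6], [6, 12]]
print(layer(A, 0), layer(A, 1))  # [[1, 2], [3, 6]] [[2, 4], [6, 12]]
```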
Notice that it is possible to represent a three dimensional polynomial by a grid in a similar manner to the way the coefficients of the two-dimensional polynomial were written into a matrix.
Definition 6.6. The degree of a polynomial p(x1, x2, x3) = a0,0,0 + . . . + ak,l,m x1^k x2^l x3^m is the highest power of x1, x2 or x3 with a non-zero coefficient.
Definition 6.7. For a polynomial

p(x1, x2, x3) = ∑_{0≤i,j,k≤n} ai,j,k x1^i x2^j x3^k

of degree n, we define the coefficient grid C(p) to be the (n+1) × (n+1) × (n+1) grid whose (i, j, k)th entry is ai,j,k. The ith face of this grid is

           ( ai,0,0  · · ·  ai,0,n )
Ci(p) :=   (  ...     ...     ...  )
           ( ai,n,0  · · ·  ai,n,n ).
Definition 6.8. A 3 × 3 × 3 coefficient grid M is said to be rank one iff the corresponding polynomial is rank one.
In two dimensions it was easy to show that a rank one coefficient matrix meant that the corresponding polynomial was rank one. In three dimensions, this is slightly more challenging. After all, we knew how to compute the rank of a matrix. How is it possible to find the rank of a grid? The following theorem answers these questions.
Theorem 6.9. For a grid G ∈ M3×3×3, the following are equivalent:
(1) G is rank one.
(2) The polynomial p(x1, x2, x3) that corresponds to the grid G is of rank one.
(3) If F1, F2 and F3 are the faces of the grid, then Rank[(F1 | F2 | F3)] = 1 and Rank[(F1 | F2 | F3)^T] = 1.
(4) Every face and every layer of G is rank one.
As a direct consequence of the definition, the first two statements are equivalent. It is also very easy to show that the last two statements are equivalent. Hence, all that remains to be proven is the equivalence between the second and the third statements.
Proof. (⇒): We assume that the polynomial p(x1, x2, x3) is rank one. Hence,

p(x1, x2, x3) = ∑_{0≤i,j,k≤n} di,j,k x1^i x2^j x3^k
             = p1(x1)p2(x2)p3(x3)
             = (∑_{i=0}^{n} ai x1^i)(∑_{j=0}^{n} bj x2^j)(∑_{k=0}^{n} ck x3^k)
             = ∑_{0≤i,j,k≤n} ai bj ck x1^i x2^j x3^k.

For notational clarity, we assume that n = 2. Now, setting coefficients equal and writing down the ith face of the 3 × 3 × 3 coefficient grid, we obtain

     ( di,0,0  di,0,1  di,0,2 )   ( aib0c0  aib0c1  aib0c2 )
Fi = ( di,1,0  di,1,1  di,1,2 ) = ( aib1c0  aib1c1  aib1c2 )
     ( di,2,0  di,2,1  di,2,2 )   ( aib2c0  aib2c1  aib2c2 )

for 0 ≤ i ≤ 2. Clearly each Fi is of rank one. Notice now that

( F1 )   ( a0b0c0  a0b0c1  a0b0c2 )
( F2 ) = ( a0b1c0  a0b1c1  a0b1c2 )
( F3 )   ( a0b2c0  a0b2c1  a0b2c2 )
         ( a1b0c0  a1b0c1  a1b0c2 )
         ( a1b1c0  a1b1c1  a1b1c2 )
         ( a1b2c0  a1b2c1  a1b2c2 )
         ( a2b0c0  a2b0c1  a2b0c2 )
         ( a2b1c0  a2b1c1  a2b1c2 )
         ( a2b2c0  a2b2c1  a2b2c2 ).
So, it is clear that Rank[(F1 | F2 | F3)^T] = 1, since every row is a multiple of (c0, c1, c2). Also notice that

(F1 | F2 | F3) =
( a0b0c0  a0b0c1  a0b0c2  a1b0c0  a1b0c1  a1b0c2  a2b0c0  a2b0c1  a2b0c2 )
( a0b1c0  a0b1c1  a0b1c2  a1b1c0  a1b1c1  a1b1c2  a2b1c0  a2b1c1  a2b1c2 )
( a0b2c0  a0b2c1  a0b2c2  a1b2c0  a1b2c1  a1b2c2  a2b2c0  a2b2c1  a2b2c2 ).

It is clear that Rank[(F1 | F2 | F3)] = 1.
(⇐): Assume that Rank[(F1 | F2 | F3)] = 1 and Rank[(F1 | F2 | F3)^T] = 1.

Since Rank[(F1 | F2 | F3)^T] = 1, there exist constants ai (for 0 ≤ i ≤ 8), l and m such that the following equality holds:

( F1 )   ( a0  la0  ma0 )
( F2 ) = ( a1  la1  ma1 )
( F3 )   ( a2  la2  ma2 )
         ( a3  la3  ma3 )
         ( a4  la4  ma4 )
         ( a5  la5  ma5 )
         ( a6  la6  ma6 )
         ( a7  la7  ma7 )
         ( a8  la8  ma8 ).

But this means that

                 ( a0  la0  ma0  a3  la3  ma3  a6  la6  ma6 )
(F1 | F2 | F3) = ( a1  la1  ma1  a4  la4  ma4  a7  la7  ma7 )
                 ( a2  la2  ma2  a5  la5  ma5  a8  la8  ma8 ).

But we also know that Rank[(F1 | F2 | F3)] = 1. Hence, it must be true that for some constants n and h the following equalities hold:

a3 = na0,  a4 = na1,  a5 = na2,

and

a6 = ha0,  a7 = ha1,  a8 = ha2.
So, rewrite the faces F1, F2 and F3 of the coefficient grid taking these equalities into account:

     ( a0  la0  ma0 )
F1 = ( a1  la1  ma1 )
     ( a2  la2  ma2 )

     ( na0  lna0  mna0 )
F2 = ( na1  lna1  mna1 )
     ( na2  lna2  mna2 )

     ( ha0  lha0  mha0 )
F3 = ( ha1  lha1  mha1 )
     ( ha2  lha2  mha2 ).

Now, rewriting the grid into its polynomial representation, we get:

p(x1, x2, x3) =
a0 + na0 x1 + ha0 x1^2 + a1 x2 + na1 x1x2 + ha1 x1^2 x2 + a2 x2^2 + na2 x1 x2^2 + ha2 x1^2 x2^2
+ la0 x3 + lna0 x1x3 + lha0 x1^2 x3 + la1 x2x3 + lna1 x1x2x3 + lha1 x1^2 x2x3 + la2 x2^2 x3 + lna2 x1 x2^2 x3 + lha2 x1^2 x2^2 x3
+ ma0 x3^2 + mna0 x1 x3^2 + mha0 x1^2 x3^2 + ma1 x2 x3^2 + mna1 x1x2 x3^2 + mha1 x1^2 x2 x3^2 + ma2 x2^2 x3^2 + mna2 x1 x2^2 x3^2 + mha2 x1^2 x2^2 x3^2

= (1 + n x1 + h x1^2)(1 + (a1/a0) x2 + (a2/a0) x2^2)(a0 + la0 x3 + ma0 x3^2).

Therefore, the polynomial is of rank one. □
Let us now consider polynomials of dimension three that are not of rank one. We define a polynomial of higher rank in three dimensions in a similar manner to the two dimensional case.

Definition 6.10. We say that a polynomial p is of rank r if and only if

p(x1, x2, x3) = ∑_{i=1}^{r} fi(x1)gi(x2)hi(x3)

and r is the smallest integer to satisfy this condition.

Notice also that this definition of rank extends and complements the earlier definitions of rank one and rank r polynomials.

We had to leave the following as an open question: when n = 2, a polynomial is of rank r if and only if C(p) is of rank r; how can this be generalized to grids?
Finally, it is quite possible and not at all conceptually difficult to extend the above theory to higher dimensions. Notice, however, that although the above discussion forms a very neat and convenient theory of grids, it is not at all efficient from the computational point of view. Programs that attempt to factorize polynomials in the above manner are large, unwieldy and slow. As such, we must develop a quicker and slicker method for factorizing multi-dimensional polynomials.
7. Algorithm and examples in two and n dimensions
As we already mentioned, the theory of grids described in the previous section is difficult and unwieldy in its implementation as a computer algorithm. In this section we will describe an alternative algorithm. We shall first describe a computational method for polynomials in two dimensions and later we will generalize this to n dimensions, by a surprisingly simple extension.
The two dimensional algorithm is based completely on Theorem 2.11 and Example 2.12. Let p(x1, x2) be a two dimensional, rank one polynomial of degree d, represented by the coefficient matrix C(p). Since p(x1, x2) is rank one, we know that C(p) is rank one and so we can factor it as

p(x1, x2) = p1(x1)p2(x2).

We factorize by the following algorithm:
(1) Create the coefficient matrix C(p).
(2) Column reduce this matrix and save the leading (non-zero) column as the vector β. (These will be the coefficients of p1(x1).)
(3) Save the top row of C(p) in a vector γ. (These will be the coefficients of p2(x2).)
Example 2.12 follows this algorithm exactly.
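The three steps above can be sketched in a few lines of Python (our own code; the paper's implementation is in Mathematica). We assume, as the algorithm does, that C(p) is rank one and that its top row is non-zero, so that scaling the leading column by its top entry makes beta and gamma reproduce C(p) exactly.

```python
from fractions import Fraction

def factor_rank_one(C):
    """Split a rank-one coefficient matrix C into vectors (beta, gamma) with
    C[i][j] == beta[i] * gamma[j]; beta holds the coefficients of p1(x1) and
    gamma those of p2(x2). Assumes the top row of C is non-zero."""
    C = [[Fraction(x) for x in row] for row in C]
    j0 = next(j for j in range(len(C[0])) if C[0][j] != 0)  # leading non-zero column
    beta = [row[j0] / C[0][j0] for row in C]   # column-reduced leading column
    gamma = C[0]                               # top row of C(p)
    return beta, gamma

# p(x1, x2) = 1 + 2*x2 + 3*x1 + 6*x1*x2, whose coefficient matrix is rank one:
beta, gamma = factor_rank_one([[1, 2], [3, 6]])
print(beta == [1, 3] and gamma == [1, 2])  # p1 = 1 + 3*x1, p2 = 1 + 2*x2
```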
In three and more dimensions, we simply use the above algorithm recursively, holding the other variables constant. Let us first consider an example before looking at the algorithm itself.
Example 7.1. Let p be the following rank one polynomial:

p(x1, x2, x3) = 1 + 3 x1 + 2 x2 + 6 x1 x2 + 2 x3 + 6 x1 x3 + 4 x2 x3 + 12 x1 x2 x3.
We wish to factorize this polynomial as,
p(x1, x2, x3) = p1(x1)p2(x2)p3(x3).
First, we want to express this polynomial in matrix form. However, we do not wish to use the complicated grid system. As such, let us simply assume for the moment that this polynomial is a function of two variables, x1 and x2, and that the third variable, x3, is simply a constant. With this in mind, we write down the coefficient matrix to be:

( 1 + 2 x3   2 + 4 x3  )
( 3 + 6 x3   6 + 12 x3 ).

So, the (1, 1) entry represents the 'constant' term, the (1, 2) entry represents the x2 term, the (2, 1) entry represents the x1 term and finally the (2, 2) entry represents the x1x2 term. Let us now use the above algorithm. We column reduce the above matrix and obtain

( 1  0 )
( 3  0 ).

Hence, our beta vector is β = {1, 3}.
Next, we read off the top row of the initial coefficient matrix to obtain the gamma vector,

γ = {1 + 2 x3, 2 + 4 x3}.

This information tells us that

p1(x1) = 1 + 3 x1

from the beta vector, and

q(x2, x3) = 1 + 2 x3 + (2 + 4 x3) x2

from the gamma vector. Notice that we have managed to reduce the problem, and that we can now write p(x1, x2, x3) as

p(x1, x2, x3) = p1(x1) q(x2, x3) = (1 + 3 x1)(1 + 2 x2 + 2 x3 + 4 x2 x3).

Now, to find the full factorization of p, we only have to factorize q, a function of two variables. But we already know how to do this. We once again use the above algorithm to obtain

q(x2, x3) = (1 + 2 x2)(1 + 2 x3).

Putting these two facts together gives us the final factorization of p:

p(x1, x2, x3) = (1 + 3 x1)(1 + 2 x2)(1 + 2 x3).
Hence, in general, the algorithm for the n ≥ 3 dimensional factorization of a polynomial p(x1, . . . , xn) is as follows:
(1) Create the coefficient matrix C(p) in terms of x1 and x2, holding the terms x3, . . . , xn constant.
(2) Column reduce this matrix and save the leading (non-zero) column as the vector β. (These will be the coefficients of p1(x1).)
(3) Save the top row of C(p) in a vector γ. (These will be the coefficients of the new polynomial q1(x2, . . . , xn).)
(4) Repeat steps 1 through 3, using qi as the new polynomial each time.
The above directions describe exactly the algorithm used in the construction of the Mathematica program FactNDim, which was created as part of this paper.
Recall that we mentioned creating an algorithm using the theory developed in the previous section. Although it is a far more complicated and involved algorithm, we include it for completeness' sake.
8. Alternative algorithm and examples in three dimensions
The following section examines the Mathematica algorithm that strictly follows from the grid theory proposed in one of the previous sections. This is in no way an efficient or optimal algorithm. It is presented to contrast with the previous algorithm and to demonstrate the way the theory could be used. For an efficient implementation refer to the previous section.
Let p(x1, x2, x3) be a three dimensional, rank one polynomial of degree d, represented by the coefficient grid G = {A1, · · · , Ad}. Since p(x1, x2, x3) is of rank one, we can factor it as p(x1, x2, x3) = p1(x1)p2(x2)p3(x3). The following algorithm describes the method used to find the coefficients of each of the polynomials pi(xi).
(1) Create an array of length two; call this the "answer matrix array" or AMA.
(2) In the first entry of AMA, create an array of length d, called AMA1.
(3) Enter the first non-zero element of each Ai into AMA1,i. If an entire face Ai is zero, enter zero into AMA1,i.
(4) Find the index, j, of the first non-zero entry of the AMA1 array.
(5) Enter the matrix Aj into the second entry of AMA, called AMA2.
(6) Create an array called answer, of length three. The ith entry of that array will contain another array that will store the coefficients of xi.
(7) Column reduce AMA1 and enter the leading column into answer1. These are the coefficients for x1.
(8) Column reduce AMA2 and enter the leading column into answer2. These are the coefficients for x2.
(9) Enter the first non-zero row of AMA2 into answer3. These are the coefficients for x3.
(10) The answer array contains the desired coefficients.
We will see later that this algorithm generalizes to n dimensions. An AMA array will still be found. We will then column and row reduce each of the entries of AMA (except the last one; we will only column reduce that one), and enter the leading columns and rows of the reduced matrices into an answer array. There are, however, some technical details that need to be explained about the creation of the AMA matrix in n dimensions, so we will hold off with the explanation till later.
The simplest way to understand the workings of the Mathematica methods and the above algorithm is to examine some examples. As such, a simple example of the algorithm that is used to solve the factorization problem is given. The example is then used as a tool for explaining individual methods.
Example 8.1. Notice that for the polynomial p(x1, x2, x3),

p(x1, x2, x3) = 1 + 3 x1 + 2 x2 + 6 x1 x2 + 2 x3 + 6 x1 x3 + 4 x2 x3 + 12 x1 x2 x3
             = p1(x1)p2(x2)p3(x3)
             = (1 + 3 x1)(1 + 2 x2)(1 + 2 x3).

First recognize that this polynomial can be represented by a 2 × 2 × 2 grid G = {A1, A2}, with faces

     ( 1  2 )            ( 3  6  )
A1 = ( 2  4 )  and  A2 = ( 6  12 ).
Notice that this coefficient grid is of rank one9 and hence the polynomial it represents can be factored.

Now, create a matrix (in this case an array) that contains the first non-zero entries of the two faces in the grid, A1 and A2. This array is simply

( 1 )
( 3 ).

Next, in accordance with the algorithm, find the location of the first non-zero entry in this resultant array. In this case, the first non-zero entry is 1 and it is in the first slot of the array. So, according to the algorithm, single out the first face of the grid,

( 1  2 )
( 2  4 ).

We have now found the two matrices that will determine the coefficients of the factored polynomials. Create an "answer matrix array" (AMA) that will contain both these matrices. Hence,

        ( 1 )   ( 1  2 )
AMA = { ( 3 ) , ( 2  4 ) }.
We will now column and row reduce each entry in the AMA matrix. Column reduce the first entry of AMA. Notice that we get the same array,

( 1 )
( 3 ).

This is the coefficient matrix of p1(x1). Hence, we can say that p1(x1) = 1 + 3 x1. Now, row reduce the first entry of AMA. Notice that we get the array

( 1 )
( 0 ).

But this simply represents the polynomial p = 1, so we ignore its effect on the factorization. Continue similarly with the second matrix.

Column reduce the second entry of AMA. Notice that we get the matrix

( 1  0 )
( 2  0 ).

This is the coefficient matrix of p2(x2). Hence, we can say that p2(x2) = 1 + 2 x2.

Notice that, according to the algorithm, because we have reached the last entry of AMA, we do not row reduce the matrix, but take the first row directly as the coefficients of p3(x3). As such, p3(x3) = 1 + 2 x3.
To represent our answers neatly, we create an answer array. The ith entry of that array will contain another array that will store the coefficients of xi. So,

           ( 1 )   ( 1 )   ( 1 )
answer = { ( 3 ) , ( 2 ) , ( 2 ) }.
We have successfully factored the polynomial p(x1, x2, x3). □
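The walkthrough above can be condensed into a short Python sketch (our own encoding: the grid is a list of faces, each face a matrix F[j][k] = A[i][j][k]; all names are ours). As in Example 8.1, we gloss over normalization by assuming the grid's leading entry is 1.

```python
from fractions import Fraction

def first_nonzero(seq):
    return next(x for x in seq if x != 0)

def leading_column(M):
    """Leading non-zero column of M after column reduction; for a rank-one
    matrix this is the first non-zero column scaled by its first non-zero entry."""
    j0 = next(j for j in range(len(M[0])) if any(row[j] for row in M))
    top = next(row[j0] for row in M if row[j0] != 0)
    return [Fraction(row[j0], top) for row in M]

def ama_factor(faces):
    # steps (2)-(3): first non-zero element of each face (0 for an all-zero face)
    ama1 = [first_nonzero(sum(F, [])) if any(any(r) for r in F) else 0 for F in faces]
    j = next(i for i, x in enumerate(ama1) if x != 0)   # step (4)
    ama2 = faces[j]                                     # step (5)
    answer1 = leading_column([[x] for x in ama1])       # step (7): coefficients of x1
    answer2 = leading_column(ama2)                      # step (8): coefficients of x2
    answer3 = next(row for row in ama2 if any(row))     # step (9): coefficients of x3
    return answer1, answer2, answer3

faces = [[[1, 2], [2, 4]], [[3, 6], [6, 12]]]  # the grid of Example 8.1
a1, a2, a3 = ama_factor(faces)
print(a1 == [1, 3] and a2 == [1, 2] and a3 == [1, 2])
```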
9Each slice and each layer of the grid is rank one.