Factorization of Polynomials and Real Analytic Functions

Radoslaw (Radek) Stefanski¹

Department of Mathematics & Computer Science, University of Richmond

April 27, 2004

¹ I would like to express my deep thanks to Prof. William T. Ross for advice and guidance on this project.

Abstract

In this project, we address the question: When can a polynomial p(x, y) of two variables be factored as p(x, y) = f(x)g(y), where f and g are polynomials of one variable? We answer this question using linear algebra, and create a Mathematica program which carries out this factorization. For example,

$$3 + 3x - 5x^3 + y + xy - \tfrac{5}{3}x^3y + y^2 + xy^2 - \tfrac{5}{3}x^3y^2 = \big(1 + x - \tfrac{5}{3}x^3\big)\big(3 + y + y^2\big).$$

We then generalize this concept and ask: When can p(x, y) be written as

$$p(x, y) = f_1(x)g_1(y) + f_2(x)g_2(y) + \cdots + f_r(x)g_r(y),$$

where the $f_j$, $g_j$ are polynomials? This can certainly be done (for large enough r). What is the minimum such r? Again, we have a Mathematica program which carries out this computation. For example,

$$1 + 2x + x^2 + 2x^3 + 2y + 2x^2y + 7xy^2 + 7x^3y^2 = (1 + x^2)(1 + 2y) + (x + x^3)(2 + 7y^2).$$

We generalize this further to a larger number of variables (with an appropriate Mathematica program to carry out this computation). We then apply this and consider the domains of convergence of certain types of real analytic functions, and try to relate the domain of convergence with the rank of the polynomial.

Contents

1. Introduction
2. Factorization of polynomials in two dimensions
3. An Introduction to Real Analytic Functions
4. Factorization of Multivariate Real Analytic Functions
5. Factorization of Complex Valued Real Analytic Functions
6. Factorization of polynomials in three dimensions
7. Algorithm and examples in two and n dimensions
8. Alternative algorithm and examples in three dimensions

    1. Introduction

If R[x] is the set of all polynomials with real coefficients, then the standard factorization of a polynomial p ∈ R[x] is derived from a theorem in abstract algebra¹. This theorem states that p ∈ R[x] can be written as

$$p(x) = v_1(x)^{s_1} \cdots v_n(x)^{s_n},$$

where $s_i \in \mathbb{N}$ and $v_1(x), \dots, v_n(x)$ are irreducible (i.e., they cannot be non-trivially factored further as products of elements of R[x]).

Moreover, if R[x, y] is the set of all polynomials in x and y with real coefficients, a similar theorem from abstract algebra states that every polynomial p ∈ R[x, y] can be written as

$$p(x, y) = v_1(x, y)^{s_1} \cdots v_n(x, y)^{s_n},$$

where $s_i \in \mathbb{N}$ and $v_1(x, y), \dots, v_n(x, y)$ are irreducible in the same sense as above.

We are interested in a different type of factorization of elements in R[x, y]. If p ∈ R[x, y], we wish to write

$$p(x, y) = f(x)g(y),$$

where f ∈ R[x] and g ∈ R[y], but f and g are not necessarily irreducible in R[x] and R[y] respectively (we are not concerned about further factorization of f and g).

Although unconventional, this form of factorization has been seen before. The separable (but not necessarily polynomial) solutions to certain differential or partial differential equations, such as the Laplace equation

$$\Delta u = 0$$

or the wave equation

$$u_{xx} - u_{yy} = 0,$$

are exactly of this form: u(x, y) = f(x)g(y).

Another factorization problem appears in Fourier analysis in the following form. Suppose that u is a bounded function on the unit circle with a complex Fourier series

$$u \sim \sum_{n=-\infty}^{\infty} \hat{u}(n)e^{ın\theta},$$

where $\hat{u}(n) = \frac{1}{2\pi}\int_0^{2\pi} u(e^{ı\theta})e^{-ın\theta}\,d\theta$ are the Fourier coefficients of u. When can we write u = fg, where f and g are bounded on the circle, and

$$f \sim \sum_{n=0}^{\infty} \hat{f}(n)e^{ın\theta} \quad \text{and} \quad g \sim \sum_{n=0}^{\infty} \hat{g}(-n)e^{-ın\theta}$$

are bounded functions? A theorem of Bourgain² states that this is possible if and only if

$$\int_0^{\pi} \log|f(e^{ı\theta})|\,d\theta > -\infty.$$

The above polynomial factorization can be generalized to n dimensions. If $p \in \mathbb{R}[x_1, \dots, x_n]$, we wish to write

$$p(x_1, \dots, x_n) = f_1(x_1) \cdots f_n(x_n),$$

where $f_i \in \mathbb{R}[x_i]$ but the $f_i$'s need not be irreducible in $\mathbb{R}[x_i]$. We formulate an algorithm to factor such polynomials and create a Mathematica program to carry out the computations.

¹ Gallian, J., Contemporary Abstract Algebra, 1998, Houghton Mifflin College Division, Boston.

We generalize the above factorization method by assuming $p \in \mathbb{R}^{\infty}[x, y]$, by which we mean the set of formal power series

$$p(x, y) = \sum_{n,m=0}^{\infty} a_{n,m}x^n y^m.$$

After reviewing some basic notions about the domains of convergence of these formal power series, we will focus on the following problem. For a given $p \in \mathbb{R}^{\infty}[x, y]$, when can we write

$$p(x, y) = p_1(x)q_1(y),$$

where $p_1 \in \mathbb{R}^{\infty}[x]$ and $q_1 \in \mathbb{R}^{\infty}[y]$? We also generalize to factorizations of the form

$$p(x, y) = \sum_{j=1}^{r} p_j(x)q_j(y),$$

where $p_j \in \mathbb{R}^{\infty}[x]$ and $q_j \in \mathbb{R}^{\infty}[y]$. Such functions are called functions of finite rank. We then go on to determine the domain of convergence of such functions and explore the relationship between the rank and the domain of convergence.

² Bourgain, J., A Problem of Douglas and Rudin, Pacific Journal of Mathematics, (1986), 121, pp. 47–50.


    2. Factorization of polynomials in two dimensions

    We begin our examination of factorizing polynomials with some basic definitions.

Definition 2.1. The degree of a polynomial $p(x, y) = a_{0,0} + \cdots + a_{m,n}x^m y^n$ is the highest power of x or y with non-zero coefficient.

Example 2.2.

(1) $\deg(2 + x + x^2) = 2$
(2) $\deg(2 + x^2y + xy) = 2$
(3) $\deg(x^6y^2 + y^4x^5) = 6$

The notation in Definition 2.1 is not quite standard, since others define the degree to be the highest number m + n. So, for example, others define the degree of p(x, y) = xy as 2, while we define it to be one. This non-standard notation will make things easier for us later on.

Definition 2.3. We say a polynomial p(x, y) is of rank one if and only if $p(x, y) = p_1(x)p_2(y)$, where $p_1, p_2$ are polynomials.

Definition 2.4. For a polynomial

$$p(x, y) = \sum_{0 \le i,j \le n} a_{i,j}x^i y^j$$

of degree n we define the coefficient matrix C(p) to be the (n+1) × (n+1) matrix whose (i, j)th entry is $a_{i,j}$. Hence,

$$C(p) := \begin{pmatrix} a_{0,0} & \cdots & a_{0,n} \\ \vdots & \ddots & \vdots \\ a_{n,0} & \cdots & a_{n,n} \end{pmatrix}.$$

Recall from basic linear algebra that the set of linear combinations of the column vectors of an n × m matrix A is called the column space of A. Similarly, the set of all linear combinations of the row vectors is called the row space of A. The rank of a matrix is the dimension of the column or row space, which, by a well known fact from linear algebra, are the same³.

Theorem 2.5. A polynomial p(x, y) is of rank one if and only if C(p) is of rank one.

Proof. Suppose $p(x, y) = p_1(x)p_2(y)$. Then, since p is of degree n, the polynomials $p_1$ and $p_2$ must each be of degree at most n. Hence $p_1 = \sum_{i=0}^{n} b_i x^i$ and $p_2 = \sum_{j=0}^{n} c_j y^j$ for some constants $b_i$ and $c_j$, where $i, j \in \{0, 1, \dots, n\}$. Hence, if this factorization is possible, we write:

$$p(x, y) = \sum_{0 \le i,j \le n} a_{i,j}x^i y^j \doteq p_1(x)p_2(y) = \Big(\sum_{i=0}^{n} b_i x^i\Big)\Big(\sum_{j=0}^{n} c_j y^j\Big) = \sum_{0 \le i,j \le n} b_i c_j x^i y^j.$$

For $\doteq$ to be satisfied, and hence for the factorization to be possible, it is clear that we must set the coefficients of p(x, y) equal to the coefficients of $p_1(x)p_2(y)$. Hence,

$$a_{0,0} = b_0 c_0,\quad a_{1,0} = b_1 c_0,\quad \dots,\quad a_{n,n} = b_n c_n.$$

Or, simplifying this notation with matrices, we can write:

$$\begin{pmatrix} a_{0,0} & \cdots & a_{0,n} \\ \vdots & \ddots & \vdots \\ a_{n,0} & \cdots & a_{n,n} \end{pmatrix} = \begin{pmatrix} b_0 c_0 & \cdots & b_0 c_n \\ \vdots & \ddots & \vdots \\ b_n c_0 & \cdots & b_n c_n \end{pmatrix}$$

Notice, however, that each subsequent column of the matrix on the right is a multiple of the first column; that is, the matrix is of rank one. The above argument can be reversed, and so we can conclude that p is of rank one exactly when the corresponding coefficient matrix, C(p), is of rank one. □

³ Strang, G., Introduction to Linear Algebra, 1998, Wellesley-Cambridge Press, Boston.

    Let us now try a simple example to demonstrate this theorem.

Example 2.6. Factor the polynomial

$$p(x, y) = 8 + 12y + 16y^2 - 4x - 6xy - 8xy^2 + 6x^2 + 9x^2y + 12x^2y^2$$

as $p(x, y) = p_1(x)p_2(y)$, if possible.

First, let us write down the coefficient matrix of the above polynomial:

$$C = \begin{pmatrix} 8 & 12 & 16 \\ -4 & -6 & -8 \\ 6 & 9 & 12 \end{pmatrix}$$

(Notice that deg(p) = 2, and so C(p) is a 3 × 3 matrix.) Next, we column reduce the matrix C to

$$C_{cr} = \begin{pmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & 0 \\ \tfrac{3}{4} & 0 & 0 \end{pmatrix}.$$

But this means that C is a rank one matrix. So, according to the previous theorem, we can factor the polynomial p(x, y).

Notice that since the matrix C is of rank one, each column in that matrix can be written as the product of a constant and the basis column vector, in the following fashion:

$$\begin{pmatrix} 8 & 12 & 16 \\ -4 & -6 & -8 \\ 6 & 9 & 12 \end{pmatrix} = \begin{pmatrix} 8(1) & 12(1) & 16(1) \\ 8(-\tfrac{1}{2}) & 12(-\tfrac{1}{2}) & 16(-\tfrac{1}{2}) \\ 8(\tfrac{3}{4}) & 12(\tfrac{3}{4}) & 16(\tfrac{3}{4}) \end{pmatrix}$$

But we have already seen this in the previous theorem:

$$\begin{pmatrix} a_{0,0} & a_{0,1} & a_{0,2} \\ a_{1,0} & a_{1,1} & a_{1,2} \\ a_{2,0} & a_{2,1} & a_{2,2} \end{pmatrix} = \begin{pmatrix} b_0 c_0 & b_0 c_1 & b_0 c_2 \\ b_1 c_0 & b_1 c_1 & b_1 c_2 \\ b_2 c_0 & b_2 c_1 & b_2 c_2 \end{pmatrix}$$

We can thus use this to solve for the $b_i$'s and $c_j$'s. This means that:

$$b_0 = 1,\quad b_1 = -\tfrac{1}{2},\quad b_2 = \tfrac{3}{4},\qquad c_0 = 8,\quad c_1 = 12,\quad c_2 = 16.$$

Finally, using these values we can come up with the factored expression:

$$p(x, y) = 8 + 12y + 16y^2 - 4x - 6xy - 8xy^2 + 6x^2 + 9x^2y + 12x^2y^2$$
$$= (b_0 + b_1 x + b_2 x^2)(c_0 + c_1 y + c_2 y^2) = \big(1 - \tfrac{1}{2}x + \tfrac{3}{4}x^2\big)\big(8 + 12y + 16y^2\big).$$

The following is an alternate characterization of rank one polynomials involving partial differential equations.

Theorem 2.7. Suppose p(x, y) has no zeros. Then p is rank one if and only if

$$\partial_y\!\left(\frac{\partial_x p}{p}\right) = 0.$$

Proof. (⇒): Suppose $p(x, y) = p_1(x)p_2(y)$. Then

$$\frac{\partial_x p}{p} = \frac{p_1'(x)p_2(y)}{p_1(x)p_2(y)} = \frac{p_1'(x)}{p_1(x)}.$$

Hence, substituting $p_1'(x)/p_1(x)$ for $\partial_x p / p$,

$$\partial_y\!\left(\frac{\partial_x p}{p}\right) = \partial_y\!\left(\frac{p_1'(x)}{p_1(x)}\right) = 0.$$

(⇐): Suppose

$$\partial_y\!\left(\frac{\partial_x p}{p}\right) = 0.$$

Integrating this equation with respect to y, we obtain

$$\frac{\partial_x p}{p} = q(x)$$

for some function q. Next, we integrate this expression with respect to x:

$$\int \frac{\partial_x p}{p}\,dx = \int q(x)\,dx,$$

and so

$$\ln p(x, y) = G(x) + H(y),$$

where $G'(x) = q(x)$. Exponentiating both sides of this expression we get

$$p(x, y) = e^{G(x) + H(y)} = e^{G(x)}e^{H(y)}.$$

But this is just $p(x, y) = p_1(x)p_2(y)$, where $p_1(x) = e^{G(x)}$ and $p_2(y) = e^{H(y)}$. Note that since p(x, y) is a polynomial, $p_1$ and $p_2$ are also polynomials. But this means that p(x, y) is of rank one. □

Example 2.8. Consider the polynomial $p(x, y) = -1 - 3x - x^2 + y + 3xy + x^2y + 2y^2 + 6xy^2 + 2x^2y^2$. Note that

$$\frac{\partial_x p}{p} = \frac{-3 - 2x + 3y + 2xy + 6y^2 + 4xy^2}{-1 - 3x - x^2 + y + 3xy + x^2y + 2y^2 + 6xy^2 + 2x^2y^2}.$$

After some calculation,

$$\frac{\partial_x p}{p} = \frac{(3 + 2x)(1 + y)(-1 + 2y)}{(1 + 3x + x^2)(1 + y)(-1 + 2y)} = \frac{3 + 2x}{1 + 3x + x^2}.$$

So now we see that $\partial_x p / p$ is a function of x alone. Hence

$$\partial_y\!\left(\frac{\partial_x p}{p}\right) = 0.$$

According to our theorem, p(x, y) is of rank one.
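Theorem 2.7 also lends itself to a one-line symbolic check. A sketch in Mathematica, assuming p has no zeros on the region of interest so the quotient is well defined (rankOneQ is an illustrative name):

```mathematica
(* Sketch of the PDE test of Theorem 2.7: p is rank one exactly when
   the y-derivative of (d/dx p)/p vanishes identically. *)
rankOneQ[p_, x_, y_] := Simplify[D[D[p, x]/p, y]] === 0

rankOneQ[-1 - 3 x - x^2 + y + 3 x y + x^2 y + 2 y^2 + 6 x y^2 + 2 x^2 y^2, x, y]
(* True, in agreement with Example 2.8 *)
```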

Is it possible to say something about polynomials whose rank is not equal to one? Is it possible to factor such polynomials and, if so, in what way? In order to answer these questions, we define the general rank of a polynomial.

Definition 2.9. We say that a polynomial p is of rank r if and only if there exists an integer r > 0 such that

$$p(x, y) = \sum_{i=1}^{r} f_i(x)g_i(y),$$

where the $f_i$, $g_i$ are polynomials, and r is the smallest integer to satisfy this condition.

To see why we require r to be the smallest integer satisfying the above condition, consider the following example.

Example 2.10. Let p(x, y) = xy − xy. Notice that if we do not require r to be minimal, the rank of p is 2. In fact, if we loosen the definition, the rank of p can be any positive integer, just by repeatedly adding (xy − xy) = 0 to p. The correct way to calculate the rank of p is to notice that p(x, y) = 0 and hence that the rank of p is zero.

Notice also that this definition of rank extends and complements the earlier definition of a rank one polynomial.

Theorem 2.11. A polynomial p(x, y) is of rank r if and only if the corresponding coefficient matrix, C(p), is of rank r.

Proof. If r = 1, then $p(x, y) = p_1(x)q_1(y)$. This, however, is simply the rank one case, which we have proved above. Therefore, we assume that r ≥ 2.

(⇐): We assume that C, the coefficient matrix of p(x, y), is of rank r. By the definition of the rank of a matrix, C must have r basis vectors; that is, each column of the matrix C, taken as a vector, is a linear combination of some fixed r column vectors. Let us assume that these r column vectors are

$$\{u_1, u_2, \dots, u_r\}.$$

From the definition of the rank of a matrix, we know that C can be written as

$$C = \big(\,c_{11}u_1 + \cdots + c_{r1}u_r \;\big|\; \cdots \;\big|\; c_{1n}u_1 + \cdots + c_{rn}u_r\,\big),$$

where each entry represents a column of C, n indicates the number of columns of the coefficient matrix, and the $c_{ij}$'s are real numbers.

Notice now that C can be separated in the following way:

$$C = \big(\,c_{11}u_1 + \cdots + c_{r1}u_r \;\big|\; \cdots \;\big|\; c_{1n}u_1 + \cdots + c_{rn}u_r\,\big) = \big(\,c_{11}u_1 \;\big|\; \cdots \;\big|\; c_{1n}u_1\,\big) + \cdots + \big(\,c_{r1}u_r \;\big|\; \cdots \;\big|\; c_{rn}u_r\,\big) = \sum_{i=1}^{r} \big(\,c_{i1}u_i \;\big|\; \cdots \;\big|\; c_{in}u_i\,\big).$$

Each $\big(\,c_{i1}u_i \;\big|\; \cdots \;\big|\; c_{in}u_i\,\big)$ is a rank one matrix, since each column is a multiple of the first column. Hence, we use each rank one matrix and Theorem 2.5 to conclude that it corresponds to a polynomial that can be factored into a product of one-variable polynomials, $p_i(x)q_i(y)$. But since C is a sum of r such matrices, we conclude that

$$p(x, y) = \sum_{i=1}^{r} p_i(x)q_i(y).$$

Since r is minimal, p is of rank r.

(⇒): The proof in this direction is almost identical to the one presented above. We begin by assuming

$$p(x, y) = \sum_{i=1}^{r} p_i(x)q_i(y).$$

Since p is a sum of r rank one polynomials, each of those polynomials can be represented by a rank one matrix. This means that the coefficient matrix C can be written as a sum of r rank one matrices. We write

$$C = \sum_{i=1}^{r} \big(\,c_{i1}u_i \;\big|\; \cdots \;\big|\; c_{in}u_i\,\big) = \big(\,c_{11}u_1 \;\big|\; \cdots \;\big|\; c_{1n}u_1\,\big) + \cdots + \big(\,c_{r1}u_r \;\big|\; \cdots \;\big|\; c_{rn}u_r\,\big) = \big(\,c_{11}u_1 + \cdots + c_{r1}u_r \;\big|\; \cdots \;\big|\; c_{1n}u_1 + \cdots + c_{rn}u_r\,\big),$$

where the $u_i$'s are basis vectors and the $c_{ij}$'s are constants. Each column of C is hence a linear combination of the r vectors $\{u_1, u_2, \dots, u_r\}$, so C has rank at most r. Since r is minimal (otherwise the first part of the proof would produce a decomposition of p with fewer terms), C is a rank r matrix. □

    Let us now consider an example of the above theorem.

Example 2.12. Let $p(x, y) = 1 + y^2 + xy + xy^2$. Let us use the above theorem to

(1) find its coefficient matrix C,
(2) find the basis vectors of C,
(3) find the rank of p(x, y),
(4) find the factorization of p(x, y).

So,

(1)
$$C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$

(2) Notice that C can be written as

$$C = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$

It is now easy to see that the (column) basis vector of the first matrix is $(1, 0, 0)^T$ and that of the second matrix is $(0, 1, 0)^T$. Hence, the basis for the column space of C is

$$\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}.$$

(3) Since the basis contains two vectors, we know that the rank of C (and hence of p) is 2.

(4) To solve this, either use Theorem 2.5, or, because the example is so simple, regroup p(x, y) directly. So,

$$p(x, y) = (1 + y^2) + x(y + y^2).$$
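Theorem 2.11 reduces computing the rank of a polynomial to computing the rank of a matrix, which Mathematica does directly. A small sketch (polyRank is an illustrative name):

```mathematica
(* Sketch: the rank of p(x,y), in the sense of Definition 2.9, is the rank of C(p). *)
polyRank[p_, x_, y_] := MatrixRank[CoefficientList[p, {x, y}]]

polyRank[1 + y^2 + x y + x y^2, x, y]   (* 2, as found in Example 2.12 *)
```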

We leave this section with an open question. Recall that a polynomial p(x, y) with no zeros is of rank one if and only if

$$\partial_y\!\left(\frac{\partial_x p}{p}\right) = 0.$$

Is there a partial differential equation that, if satisfied, is necessary and sufficient for a polynomial to be of rank r?


3. An Introduction to Real Analytic Functions

We now wish to generalize the above results for polynomials p(x, y) to real analytic functions

$$f(x, y) = \sum_{m,n=0}^{\infty} a_{m,n}x^m y^n.$$

Before doing this, however, we need to address the issue of convergence of these expressions. We begin with some standard results on single variable real analytic functions⁴.

The formal expression

$$\sum_{j=0}^{\infty} a_j(x - \alpha)^j,$$

with either real or complex constants $a_j$, is called a power series on the real line $\mathbb{R}$. We usually take the coefficients $a_j$ to be real, and there is no loss of generality in doing so. Before much can be done with this function, it is necessary to determine the nature of the set on which the power series converges.

Proposition 3.1. Assume that the power series

$$\sum_{j=0}^{\infty} a_j(x - \alpha)^j$$

converges at the value x = c. Let R = |c − α|. Then the series converges uniformly and absolutely on compact subsets of I = {x : |x − α| < R}.

Proof. We may take the compact subset of I to be K = [α − s, α + s] for some number 0 < s < R. For x ∈ K it then holds that

$$\sum_{j=0}^{\infty} \left|a_j(x - \alpha)^j\right| = \sum_{j=0}^{\infty} \left|a_j(c - \alpha)^j\right| \cdot \left|\frac{x - \alpha}{c - \alpha}\right|^j.$$

In the sum on the right, the first expression in absolute values is bounded by some constant C (by the convergence hypothesis). The quotient in absolute values is not bigger than L = s/R < 1. The series on the right is thus dominated by

$$\sum_{j=0}^{\infty} C \cdot L^j.$$

This geometric series converges. So, by the Weierstrass M-test⁵, the original series converges absolutely and uniformly on K. □

The above theorem says that the domain of convergence of a power series must be an interval. This interval may be bounded, as for the power series $\sum x^n$, or unbounded, as for the power series $\sum x^n/n!$.

Definition 3.2. The set on which

$$\sum_{j=0}^{\infty} a_j(x - \alpha)^j$$

converges is an interval centered about α. This interval is termed the interval of convergence. The series will converge absolutely and uniformly on compact subsets of the interval of convergence. The radius of convergence of the interval is defined to be half its length.

⁴ Krantz, S. G. and Parks, H. R., A Primer of Real Analytic Functions, Second edition, Birkhäuser Advanced Texts: Basel Textbooks, Boston.
⁵ Gaughan, E., Introduction to Analysis, 1998, Brooks-Cole, New York.

We remind the reader of the following useful theorem of Hadamard⁶.

Theorem 3.3. If R is the radius of convergence of

$$\sum_{n=0}^{\infty} a_n(x - \alpha)^n,$$

then

$$R = \frac{1}{\limsup_{n\to\infty} |a_n|^{1/n}}.$$

Whether convergence holds at the end points of the interval of convergence needs to be determined on a case by case basis.
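Hadamard's formula can also be illustrated numerically by estimating the limit superior from a tail of the coefficient sequence. This is only a crude sketch, under the assumption that the tail from roughly nmax/2 to nmax already behaves like the limit; radiusEstimate is an illustrative name.

```mathematica
(* Crude numerical sketch of Theorem 3.3: approximate limsup |a_n|^(1/n) by the
   maximum over a tail of the sequence, and invert. *)
radiusEstimate[a_, nmax_] :=
  1/Max[Table[N[Abs[a[n]]^(1/n)], {n, Ceiling[nmax/2], nmax}]]

radiusEstimate[Function[n, 1], 200]      (* close to 1, for the series Sum x^n        *)
radiusEstimate[Function[n, 1/n!], 200]   (* large, and grows with nmax, for Sum x^n/n! *)
```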

Definition 3.4. A function f, where f : U ⊆ R → R, is said to be real analytic at α if the function f may be expressed as a power series on some interval of positive radius centered at α:

$$f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j.$$

We say the function is real analytic on V ⊆ U if it is real analytic at each α ∈ V.

    Next, we present without proof the basic properties of real analytic functions.

Proposition 3.5. Let

$$f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j \quad \text{and} \quad g(x) = \sum_{j=0}^{\infty} b_j(x - \alpha)^j$$

be two power series defining the functions f(x) and g(x) on the open intervals of convergence $C_1$ and $C_2$ respectively. Then, on their common domain $C = C_1 \cap C_2$, it holds that:

(1) $f(x) \pm g(x) = \sum_{j=0}^{\infty}(a_j \pm b_j)(x - \alpha)^j$

(2) $f(x) \cdot g(x) = \sum_{m=0}^{\infty} \sum_{j+k=m}(a_j \cdot b_k)(x - \alpha)^m$

(3) If $g \neq 0$ on C, there exists an h(x) on C such that $h(x) = \frac{f(x)}{g(x)} = \sum_{j=0}^{\infty} d_j(x - \alpha)^j$, for some constants $d_j$.

Proposition 3.6. Let

$$\sum_{j=0}^{\infty} a_j(x - \alpha)^j$$

be a power series with open interval of convergence C. Let f(x) be the function defined by the series on the interval C. Then the function f is continuous and has continuous, real analytic derivatives of all orders at α.

⁶ Saff, E. B. and Snider, A. D., Fundamentals of Complex Analysis with Applications to Engineering, Science, and Mathematics, 1993, Prentice-Hall, New Jersey.

Using this proposition, it is easy to show that a real analytic function has a unique power series representation:

Corollary 3.7. If the function f is written as a convergent power series on a given interval of positive radius centered at α,

$$f(x) = \sum_{j=0}^{\infty} a_j(x - \alpha)^j,$$

then the coefficients of the power series can be obtained from the derivatives of the function by

$$a_n = \frac{f^{(n)}(\alpha)}{n!}.$$

Proof. To obtain this result, simply differentiate both sides of the above equation n times and evaluate at α. Differentiation is possible by the previous proposition. □

We pause for a moment to point out that although real analytic functions are infinitely differentiable, the converse is not true. For example, the function

$$f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0; \\ 0, & x = 0. \end{cases}$$

is infinitely differentiable. It is somewhat technical to show that $f^{(h)}(0)$ exists for all h, but nevertheless it can be done. Moreover,

$$f^{(h)}(0) = 0 \quad \text{for all } h \ge 0,$$

and so the power series of f about x = 0 is just zero. Thus, f does not equal its power series about x = 0, and so f is not real analytic, although it is infinitely differentiable. The real analytic functions are indeed a very special class of functions.
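The vanishing of the Taylor coefficients at the origin can at least be observed symbolically. A brief sketch: the limits of the first few derivatives, which, combined with the standard inductive argument, give $f^{(h)}(0) = 0$.

```mathematica
(* Sketch: the first few derivatives of Exp[-1/x^2] tend to 0 at the origin. *)
Table[Limit[D[Exp[-1/x^2], {x, k}], x -> 0], {k, 0, 4}]
(* {0, 0, 0, 0, 0} *)
```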

The term real analytic comes from the fact that if the power series of

$$f(x) = \sum a_n(x - \alpha)^n$$

converges on $(\alpha - R, \alpha + R)$, then the series

$$f(z) = \sum a_n(z - \alpha)^n,$$

where $z = x + ıy$ is a complex variable, converges and is an analytic function on the ball $\{z \in \mathbb{C} : |z - \alpha| < R\}$. Conversely, if f is any function with a power series which converges on $\{z \in \mathbb{C} : |z - \alpha| < R\}$, where $\alpha \in \mathbb{R}$, then f(x) is a real analytic function with power series converging on $(\alpha - R, \alpha + R)$. We shall say more about this in later sections.

We now talk about real analytic functions of several variables. In order to generalize the power series to higher dimensions, we introduce multi-index notation.

Definition 3.8. A multi-index µ is an m-tuple $(\mu_1, \mu_2, \dots, \mu_m)$ of non-negative integers. We write

$$\Lambda(m) = \mathbb{N} \times \cdots \times \mathbb{N},$$

or alternatively, $\Lambda(m) = \mathbb{N}^m$.

Definition 3.9. For

$$\mu = (\mu_1, \mu_2, \dots, \mu_m) \in \Lambda(m) \quad \text{and} \quad x = (x_1, x_2, \dots, x_m) \in \mathbb{R}^m,$$

we define the following operations:

$$\mu! = \mu_1!\,\mu_2! \cdots \mu_m!,$$
$$|\mu| = \mu_1 + \mu_2 + \cdots + \mu_m,$$
$$x^\mu = x_1^{\mu_1} x_2^{\mu_2} \cdots x_m^{\mu_m},$$
$$|x^\mu| = |x_1|^{\mu_1} |x_2|^{\mu_2} \cdots |x_m|^{\mu_m},$$
$$\frac{\partial^\mu}{\partial x^\mu} = \frac{\partial^{\mu_1}}{\partial x_1^{\mu_1}}\,\frac{\partial^{\mu_2}}{\partial x_2^{\mu_2}} \cdots \frac{\partial^{\mu_m}}{\partial x_m^{\mu_m}}.$$

And for

$$\mu = (\mu_1, \mu_2, \dots, \mu_m) \in \Lambda(m) \quad \text{and} \quad \nu = (\nu_1, \nu_2, \dots, \nu_m) \in \Lambda(m),$$

we write $\mu \le \nu$ if $\mu_j \le \nu_j$ for $j = 1, 2, \dots, m$.

Definition 3.10. The formal expression

$$\sum_{\mu \in \Lambda(m)} a_\mu(x - \alpha)^\mu,$$

with $\alpha \in \mathbb{R}^m$ and $a_\mu \in \mathbb{R}$ for each µ, is called a power series in m variables.

Definition 3.11. The power series

$$\sum_{\mu \in \Lambda(m)} a_\mu(x - \alpha)^\mu$$

is said to converge at x if there is a function $\phi : \mathbb{Z}^+ \to \Lambda(m)$ which is one-to-one and onto such that the series

$$\sum_{j=0}^{\infty} a_{\phi(j)}(x - \alpha)^{\phi(j)}$$

converges.

Definition 3.12. For a fixed power series $\sum_\mu a_\mu(x - \alpha)^\mu$, we set

$$C = \bigcup_{r>0} \Big\{x \in \mathbb{R}^m : \sum_{\mu} |a_\mu(y - \alpha)^\mu| < \infty \text{ for all } |y - x| < r\Big\}.$$

This set is called the domain of convergence.

Definition 3.13. We say that a function $f : U \subset \mathbb{R}^m \to \mathbb{R}$ is real analytic if for each α ∈ U, the function f may be represented by a convergent power series in some neighborhood of α.

In a similar fashion to Proposition 3.5, it is relatively simple to prove the following:

Proposition 3.14. Let $U, V \subset \mathbb{R}^m$ be open. If $f : U \to \mathbb{R}$ and $g : V \to \mathbb{R}$ are real analytic, then f ± g and f · g are real analytic on U ∩ V, and f/g is real analytic on U ∩ V ∩ {x : g(x) ≠ 0}.

Lemma 3.15. Suppose that $g : (-a, a) \to \mathbb{R}$ is a real analytic function and g(y) = 0 on an open interval I ⊆ (−a, a). Then g ≡ 0.

Proof. If g is real analytic on (−a, a), then we can write $g(x) = \sum_{n=0}^{\infty} a_n x^n$ on (−a, a). Hence, we can say that $g(z) := \sum_{n=0}^{\infty} a_n z^n$ is analytic on {|z| < a}. The hypothesis states that the zeros of g have an accumulation point in I, and since the zeros of a non-trivial analytic function cannot have an accumulation point in its domain, g must be identically zero. □

Real analytic functions of one variable have domains of convergence equal to intervals. For functions of several variables, the geometry of the domain of convergence is more complicated.

Definition 3.16. We say a set G in a linear space is called convex if for any two points x, y ∈ G, each point z = λx + (1 − λ)y, for 0 < λ < 1, also belongs to G.

Example 3.17. A subset of $\mathbb{R}^2$ in the shape of a pentagon is convex, whereas a subset of $\mathbb{R}^3$ in the shape of a donut is not.

Definition 3.18. For a set $G \subset \mathbb{R}^m$, we define $\log\|G\|$ as

$$\log\|G\| = \{(\log|g_1|, \dots, \log|g_m|) : g = (g_1, \dots, g_m) \in G\}.$$

The set G is said to be logarithmically convex if $\log\|G\|$ is a convex subset of $\mathbb{R}^m$.

    Before proving the next theorem, let us establish some facts.

Remark 3.19. For a fixed power series $\sum_\mu a_\mu(x - \alpha)^\mu$, we denote by B the set of points $x \in \mathbb{R}^m$ where $\sum_\mu |a_\mu||x - \alpha|^\mu$ is bounded. It is clear that if the power series converges at a point x, then x ∈ B. Furthermore, using a result called Abel's Lemma⁷, it is possible to show that Int(B) = C, where C is the domain of convergence of the power series. This information allows us to say something about the shape of the domain of convergence of a power series.

Theorem 3.20. For a power series $\sum_\mu a_\mu x^\mu$, the domain of convergence C is logarithmically convex.

⁷ Krantz, S. G. and Parks, H. R., A Primer of Real Analytic Functions, Second edition, Birkhäuser Advanced Texts: Basel Textbooks, Boston.

Proof. Fix two points $y = (y_1, \dots, y_m) \in C$ and $z = (z_1, \dots, z_m) \in C$, and let $0 \le \lambda \le 1$. Now, by the above remark, $y \in C$ means that $y \in \mathrm{Int}(B)$. Hence, by the definition of an open set, for some $\epsilon > 0$, $(|y_1| + \epsilon, \dots, |y_m| + \epsilon) \in B$. But, by the above remark, this means that there exists some constant L such that

$$|a_\mu|\prod_{j=1}^{m}(|y_j| + \epsilon)^{\mu_j} \le L.$$

Simplifying and rewriting, this becomes

$$|a_\mu| \le \frac{L}{\prod_{j=1}^{m}(|y_j| + \epsilon)^{\mu_j}}.$$

By the same process, replacing ε by a smaller positive number and L by a larger number if necessary, and without changing notation, we also have

$$|a_\mu| \le \frac{L}{\prod_{j=1}^{m}(|z_j| + \epsilon)^{\mu_j}}.$$

Notice that because we fixed y, z and λ, we can choose $\epsilon' > 0$ such that the following two expressions hold for $j = 1, \dots, m$:

$$(|y_j| + \epsilon)^{\lambda} \ge |y_j|^{\lambda} + \epsilon' \quad \text{and} \quad (|z_j| + \epsilon)^{1-\lambda} \ge |z_j|^{1-\lambda} + \epsilon'.$$

Then we can choose $\sigma > 0$ so that for $j = 1, \dots, m$,

$$(|y_j|^{\lambda} + \epsilon')(|z_j|^{1-\lambda} + \epsilon') \ge |y_j|^{\lambda}|z_j|^{1-\lambda} + \sigma$$

holds. Putting these facts together, we conclude that

$$|a_\mu| = |a_\mu|^{\lambda}|a_\mu|^{1-\lambda} \le \frac{L}{\prod_{j=1}^{m}(|y_j|^{\lambda}|z_j|^{1-\lambda} + \sigma)^{\mu_j}}.$$

Thus $(|y_1|^{\lambda}|z_1|^{1-\lambda}, \dots, |y_m|^{\lambda}|z_m|^{1-\lambda}) \in \mathrm{Int}(B) = C$, or equivalently

$$\lambda(\log|y_1|, \dots, \log|y_m|) + (1 - \lambda)(\log|z_1|, \dots, \log|z_m|) \in \log\|C\|.$$

That is, the domain of convergence C is logarithmically convex. □

Example 3.21. Show that a square in $\mathbb{R}^2$, $S = \{(x, y) : a < x < b,\; c < y < d\}$, is logarithmically convex, for some $0 < a < b$ and $0 < c < d$.

We want to show that

$$\log\|S\| = \{(u, v) = (\log|x|, \log|y|) : (x, y) \in S\}$$

is convex. Knowing that a < x < b and c < y < d, we can write

$$\log|a| < \log|x| < \log|b| \quad \text{and} \quad \log|c| < \log|y| < \log|d|.$$

But this is just

$$\log|a| < u < \log|b| \quad \text{and} \quad \log|c| < v < \log|d|.$$

These restrictions define a rectangle bounded by log|a|, log|b|, log|c| and log|d| in the u-v plane. But this means that $\log\|S\|$ is convex and hence S is logarithmically convex.

Example 3.22. Show that the domain of convergence of the power series $\sum_{n=0}^{\infty}(xy)^n = \frac{1}{1-xy}$, defined on |xy| < 1, is logarithmically convex.

Notice that any point within this domain satisfies the inequality |xy| < 1. Hence the domain of convergence for the above power series is

$$S = \{(x, y) : |xy| < 1\}.$$

We want to show that

$$\log\|S\| = \{(u, v) = (\log|x|, \log|y|) : (x, y) \in S\}$$

is convex. Knowing that |xy| < 1, we write

$$\log|xy| < \log 1,$$

which is simply

$$\log|x| + \log|y| < 0.$$

Using the u-v notation we rewrite this as

$$v < -u.$$

This clearly implies that $\log\|S\|$ is convex (since the region is simply the area under the line v = −u in the u-v plane). Hence, S is logarithmically convex. As a side note, it is important to remember that even though this region is logarithmically convex, it is by no means convex.

With this new knowledge in tow, we now consider the factorization of multivariate real analytic functions.


    4. Factorization of Multivariate Real Analytic Functions

In this section, we return to examining rank one functions, with the added constraint of real analyticity.

Consider first a definition of rank one real analytic functions.

Definition 4.1. A real analytic function with domain of convergence $C \subseteq \mathbb{R}^2$,

$$f(x_1, x_2) = \sum_{\mu} a_\mu x^\mu,$$

where $x = (x_1, x_2) \in \mathbb{R}^2$ and $\mu \in \Lambda(2)$, is said to be rank one if and only if

$$f(x_1, x_2) = g_1(x_1)g_2(x_2).$$

A clear question arises from this definition. Since f is real analytic on its domain, is it necessarily true that $g_1$ and $g_2$ are real analytic on their respective domains as well? As it turns out, this conjecture is true.

Theorem 4.2. Suppose that

$$f(x_1, x_2) = \sum_{\mu} a_\mu x^\mu,$$

where $x = (x_1, x_2) \in \mathbb{R}^2$ and $\mu \in \Lambda(2)$, has domain of convergence C and

$$f(x_1, x_2) = g_1(x_1)g_2(x_2).$$

Then $C = (-a_1, a_1) \times (-a_2, a_2)$ for some $a_1, a_2 > 0$, and $g_1(x_1)$, $g_2(x_2)$ are real analytic on $(-a_1, a_1)$ and $(-a_2, a_2)$ respectively.

Proof. Expanding f, notice that

$$f(x_1, x_2) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} x_2^{\mu_2}.$$

Now choose a constant c such that $\vec{P} = (x_1, c) \in C$, $c \neq 0$ and $g_i(c) \neq 0$ for $i \in \{1, 2\}$. Notice that this last fact is guaranteed by Lemma 3.15.

Next, substitute $\vec{P}$ into f:

$$f(x_1, c) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} c^{\mu_2} = g_1(x_1)g_2(c).$$

Finally, remembering that $g_2(c) \neq 0$, divide both sides of

$$\sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} c^{\mu_2} = g_1(x_1)g_2(c)$$

by $g_2(c)$ to obtain

$$g_1(x_1) = \sum_{|\mu|=0}^{\infty} d_\mu x_1^{\mu_1},$$

where $d_\mu = a_\mu c^{\mu_2} / g_2(c)$.

Proceed similarly for $g_2$. Hence, $g_1$ and $g_2$ are real analytic and can be represented by power series.

Since $g_1$ and $g_2$ can be represented by convergent power series in $x_1$ and $x_2$, the functions converge on intervals $(-a_1, a_1)$ and $(-a_2, a_2)$ respectively. Hence $f = g_1(x_1)g_2(x_2)$ converges on $(-a_1, a_1) \times (-a_2, a_2)$. □

Having shown this theorem for rank one real analytic functions, it is interesting to see whether it also holds for rank two, three, or even rank r.

Before considering these cases, let us first give a precise definition of rank r and derive a proposition (in two parts) that will be helpful in our proof.

Definition 4.3. A real analytic function with domain of convergence $C \subseteq \mathbb{R}^2$,

$$f(x_1, x_2) = \sum_{\mu} a_\mu x^\mu,$$

where $x = (x_1, x_2) \in \mathbb{R}^2$ and $\mu \in \Lambda(2)$, is said to be rank r if and only if

$$f(x_1, x_2) = \sum_{i=1}^{r} f_i(x_1)g_i(x_2).$$

Proposition 4.4. Suppose that

$$f(x_1, x_2) = \sum_{\mu} a_\mu x^\mu,$$

where $x = (x_1, x_2) \in \mathbb{R}^2$, has domain of convergence C and is rank two, i.e.

$$f(x_1, x_2) = f_1(x_1)g_1(x_2) + f_2(x_1)g_2(x_2).$$

Then the following hold:

(1) $g_1(x_2) \neq c\,g_2(x_2)$ for all $x_2$ and every constant c;
(2) there exist non-zero constants c and d such that the matrix

$$\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}$$

is invertible.

Proof. (1) Proceed by contradiction. Assume that $g_1(x_2) = c\,g_2(x_2)$ for all $x_2$ and some constant c. Then

$$f(x_1, x_2) = f_1(x_1)\,c\,g_2(x_2) + f_2(x_1)g_2(x_2) = g_2(x_2)\big(c f_1(x_1) + f_2(x_1)\big).$$

Therefore $f(x_1, x_2)$ is rank one. But we already know that $f(x_1, x_2)$ is rank two, and a contradiction is reached. Therefore $g_1$ is not a multiple of $g_2$.

(2) We will prove the second part using the above fact that $g_1$ is not a multiple of $g_2$. Proceed by contradiction: suppose $g_1(c)g_2(d) - g_1(d)g_2(c) = 0$ for all c and d. Choose d such that $g_2(d) \neq 0$.⁸ Therefore, we rewrite the first statement as

$$g_1(c)g_2(d) = g_1(d)g_2(c).$$

Now, since $g_2(d) \neq 0$, divide both sides by $g_2(d)$ to obtain

$$g_1(c) = \frac{g_1(d)}{g_2(d)}\,g_2(c)$$

for all constants c. But notice that in the above statement $\frac{g_1(d)}{g_2(d)}$ is just some constant k. Therefore

$$g_1(c) = k\,g_2(c)$$

for all c. This is a contradiction, since $g_1$ is not a multiple of $g_2$. Therefore

$$g_1(c)g_2(d) - g_1(d)g_2(c) \neq 0$$

for some c and d. Also,

$$g_1(c)g_2(d) - g_1(d)g_2(c) = \det\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}.$$

Therefore

$$\det\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix} \neq 0,$$

and hence

$$\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}$$

is invertible. □

With this proposition in tow, we will now consider the rank two version of Theorem 4.2.

Theorem 4.5. Suppose f is a real analytic, rank two function with domain of convergence $C \subseteq \mathbb{R}^2$, i.e.,

$$f(x_1, x_2) = f_1(x_1)g_1(x_2) + f_2(x_1)g_2(x_2).$$

Then $C = L_1 \cap L_2$, where $L_i = (-a_i, a_i) \times (-b_i, b_i)$ and where $f_i(x)$ and $g_i(y)$ are defined on $(-a_i, a_i)$ and $(-b_i, b_i)$ respectively.

⁸ To convince yourself of this fact, consider the following argument. If $g_2(d) = 0$ for all d, then $f(x_1, x_2) = f_1(x_1)g_1(x_2)$ and f is rank one. But we know f is rank two, and so $g_2(d) \neq 0$ for some d.

Proof. (Notice that we are effectively being asked to show that the $f_i$ and $g_i$ are real analytic on their corresponding domains.)

Expanding f, we obtain

$$f(x_1, x_2) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} x_2^{\mu_2}.$$

First, we show that $f_1$ and $f_2$ are real analytic. Then, by the same process, it can be shown that $g_1$ and $g_2$ are real analytic.

Choose non-zero constants c and d such that $\vec{P_1} = (x_1, c) \in C$, $\vec{P_2} = (x_1, d) \in C$, and such that the constants c and d satisfy Lemma 3.15.

Next, substitute $\vec{P_1}$ into f and call the resultant function $h_1(x_1)$,

$$f(x_1, c) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} c^{\mu_2} = h_1(x_1),$$

and $\vec{P_2}$ into f and call that function $h_2(x_1)$,

$$f(x_1, d) = \sum_{|\mu|=0}^{\infty} a_\mu x_1^{\mu_1} d^{\mu_2} = h_2(x_1).$$

Notice that $h_1(x_1)$ and $h_2(x_1)$ are real analytic. The following system of equations for $h_1(x_1)$ and $h_2(x_1)$ holds:

$$h_1(x_1) = f(x_1, c) = f_1(x_1)g_1(c) + f_2(x_1)g_2(c),$$
$$h_2(x_1) = f(x_1, d) = f_1(x_1)g_1(d) + f_2(x_1)g_2(d).$$

So, rewriting this in matrix notation, we obtain

$$\begin{pmatrix} h_1(x_1) \\ h_2(x_1) \end{pmatrix} = \begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix} \begin{pmatrix} f_1(x_1) \\ f_2(x_1) \end{pmatrix}.$$

But, by the second part of the previous proposition, we know that

$$\begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}$$

is invertible. So, letting

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} g_1(c) & g_2(c) \\ g_1(d) & g_2(d) \end{pmatrix}^{-1},$$

where A, B, C and D are constants, we see that

$$\begin{pmatrix} f_1(x_1) \\ f_2(x_1) \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} h_1(x_1) \\ h_2(x_1) \end{pmatrix} = \begin{pmatrix} A h_1(x_1) + B h_2(x_1) \\ C h_1(x_1) + D h_2(x_1) \end{pmatrix}.$$

Hence we can read off the values of $f_1$ and $f_2$: $f_1(x_1) = A h_1(x_1) + B h_2(x_1)$ and $f_2(x_1) = C h_1(x_1) + D h_2(x_1)$. Since $f_1$ and $f_2$ are linear combinations of real analytic functions (recall that $h_1$ and $h_2$ are real analytic), we know that $f_1$ and $f_2$ must be real analytic themselves.

Similarly, $g_1$ and $g_2$ must be real analytic.

Since each $f_i$ and $g_i$ can be represented by a convergent power series, the $f_i$'s converge on intervals $(-a_i, a_i)$ and the $g_i$'s converge on intervals $(-b_i, b_i)$. So the product $f_i g_i$ converges on $L_i = (-a_i, a_i) \times (-b_i, b_i)$, and so $f_1(x_1)g_1(x_2) + f_2(x_1)g_2(x_2)$ converges on $L_1 \cap L_2$. Hence $C = L_1 \cap L_2$. □
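The key step of the proof, recovering $f_1$ and $f_2$ from the two slices $f(x_1, c)$ and $f(x_1, d)$, is easy to carry out concretely. A sketch using the rank two polynomial from the abstract; the choices c = 1, d = 2 and the assumed knowledge of $g_1$, $g_2$ are for illustration only:

```mathematica
(* Sketch: two slices f(x, c), f(x, d) determine f1, f2 once the 2x2 matrix
   of g-values (invertible by Proposition 4.4) is inverted. *)
f[x_, y_] := (1 + x^2) (1 + 2 y) + (x + x^3) (2 + 7 y^2);
g1[y_] := 1 + 2 y;  g2[y_] := 2 + 7 y^2;   (* taken as known for this illustration *)
{c, d} = {1, 2};
m = {{g1[c], g2[c]}, {g1[d], g2[d]}};
Simplify[Inverse[m] . {f[x, c], f[x, d]}]
(* {1 + x^2, x + x^3}, i.e. {f1(x), f2(x)} *)
```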

In fact, it is possible to prove the above theorem for real analytic functions of rank r in the same manner as above (except for more tedious labelling).

Theorem 4.6. Suppose f is a real analytic, rank r function with domain of convergence $C \subseteq \mathbb{R}^2$,

$$f(x_1, x_2) = f_1(x_1)g_1(x_2) + \cdots + f_r(x_1)g_r(x_2).$$

Then $C = L_1 \cap \cdots \cap L_r$, where $L_i = (-a_i, a_i) \times (-b_i, b_i)$ and $f_i(x)$ and $g_i(y)$ are defined on $(-a_i, a_i)$ and $(-b_i, b_i)$ respectively.

Remark 4.7. Notice that the above theorem tells us that if f is real analytic and of finite rank, then the domain of convergence of f is a box.

Remark 4.8. Notice also that the above theorem says that if a rank r function is real analytic, then each component $f_i$ and $g_i$ will also be real analytic.


    5. Factorization of Complex Valued Real Analytic Functions

A very natural question to ask after seeing the theorems in the previous section is whether their converses hold. In order to answer this question we must consider extending our factorization to complex valued functions that are analytic. We begin with some introductory definitions and theorems for complex functions of one variable.

Theorem 5.1. If $f : G \subset \mathbb{C} \to \mathbb{C}$ is $C^1$, then f is analytic on G if

$$\bar{\partial}f = 0,$$

where $\bar{\partial}f = \tfrac{1}{2}(\partial_x + ı\,\partial_y)f$.

Theorem 5.2. The function f is analytic on $G \subset \mathbb{C}$ if and only if, at each point $a \in G$,

$$f(z) = \sum_{n=0}^{\infty} b_n(z - a)^n$$

for z in an open neighborhood of a, for some coefficients $b_n$.

In other words, the above theorem says that the function f is analytic if and only if f can be written locally as a convergent power series.

Theorem 5.3. For an analytic function, the coefficients $b_n$ can be computed as

$$b_n = \frac{f^{(n)}(a)}{n!}.$$

Notice that an important relation exists between analytic and real analytic functions. In fact, real analytic functions are nothing more than analytic functions restricted to the real number line.

Let us now consider analytic functions in higher dimensions.

Definition 5.4. We define $\bar{\partial}_{z_j}$, for $z_j = x_j + ı y_j$, as

$$\bar{\partial}_{z_j} = \tfrac{1}{2}\big(\partial_{x_j} + ı\,\partial_{y_j}\big),$$

for all $j \in \mathbb{N}^+$.

Theorem 5.5. The $C^1$ function $f : G \subset \mathbb{C}^m \to \mathbb{C}$ is analytic on G if

$$\bar{\partial}_{z_j} f = 0$$

for all j.

Theorem 5.6. A complex function $f : G \subset \mathbb{C}^m \to \mathbb{C}$ is analytic if and only if, for each $a = (a_1, \dots, a_m) \in G$ and all z in an open neighborhood of a,

$$f(z_1, \dots, z_m) = \sum_{\mu \in \Lambda(m)} b_\mu(z - a)^\mu.$$

Theorem 5.7. A real analytic function $\hat{f} : C \subseteq \mathbb{R}^n \to \mathbb{R}$ is an analytic function $f : C' \subseteq \mathbb{C}^n \to \mathbb{C}$ with all the variables $(z_1, \dots, z_n)$ restricted to the reals.

It is exactly this fact that justifies introducing complex variables. We will now consider an example that will help shed some light on why the converse of Theorem 4.2 (and, by generalization, Theorem 4.6) does not hold.

Example 5.8. Find the domain of convergence of $f(x) = e^{\frac{1}{1+x^2}}$, where $f(x) = \sum_{n=0}^{\infty} a_n x^n$.

Using the usual methods of determining radii of convergence, this problem would pose a challenge. However, if we simply notice that $f(x) = e^{\frac{1}{1+x^2}}$ is the restriction to the real numbers of $f(z) = e^{\frac{1}{1+z^2}}$, the problem becomes astonishingly easy. We recall that $f(z) = e^{\frac{1}{1+z^2}}$ is undefined at $z = ı$ and $z = -ı$. As such, the domain of convergence for f(z) is |z| < 1. Remembering that $z = x + ıy$ and letting y = 0, we automatically obtain the domain of convergence for f(x), namely |x| < 1.

Having this result, we turn back to Theorem 4.2. We know from Theorem 4.2 that a real analytic finite rank function $f(x_1, x_2) = \sum_\mu a_\mu x^\mu$ converges on a box in $\mathbb{R}^2$. The question thus naturally arises whether the opposite is true. That is, it remains to be seen whether a real analytic function whose domain of convergence is a box is necessarily of finite rank. Unfortunately, as we shall show, this is not the case.

Example 5.9. Let $f(x, y) = \exp\!\big(\tfrac{1}{1+x^2}\,\tfrac{1}{1+y^2}\big)$.

First, let us make sure that this is a real analytic function. To do this we need to show that f(x, y) is simply a restriction of the complex function $f(z, w) = \exp\!\big(\tfrac{1}{1+z^2}\,\tfrac{1}{1+w^2}\big)$, where $z = x + ıt$ and $w = y + ıs$, to the real plane. By the same reasoning as in the above example, f(z, w) is analytic on $D \times D = \{(z, w) \in \mathbb{C} \times \mathbb{C} : |z| < 1, |w| < 1\}$. Hence, restricting f(z, w) to the real plane by setting t = 0 and s = 0, we see that f(x, y) is real analytic on $B = \{(x, y) \in \mathbb{R} \times \mathbb{R} : |x| < 1, |y| < 1\}$. Notice also that B is simply a box in two dimensions.

Finally, we need to show that f is not of finite rank. This is done by demonstration rather than by strict mathematical proof, although a proof certainly seems to be achievable. We use Mathematica to calculate the power series of f to order 10, calling it $\hat{f}$ and noting that $\hat{f}$ has degree 10. Next, we place the coefficients of the resulting polynomial in an 11 × 11 coefficient matrix. We see that this matrix is made up of alternating zero and non-zero columns. In line with Theorem 2.11, we row reduce this matrix in order to find the rank. The results are encouraging: the rank of this matrix is as high as it could be, judging from the number of non-zero columns. This matrix (and hence $\hat{f}$) is of rank 6. We repeat this operation with power series of degrees 15, 20, 25 and 30. Each time, we see that the rank of the resulting coefficient matrix (and hence of $\hat{f}$) is

$$\operatorname{rank}(\hat{f}) = \left\lfloor \deg(\hat{f})/2 \right\rfloor + 1,$$

where $\deg(\hat{f})$ is the degree of the given series expansion. It seems that this equation will remain true for higher orders of $\hat{f}$. If the equation is indeed true, the following holds:

$$\lim_{\deg(\hat{f}) \to \infty} \operatorname{rank}(\hat{f}) = \infty.$$

However, note that as $\deg(\hat{f}) \to \infty$, the approximation $\hat{f} \approx f$ becomes exact. That is, as $\deg(\hat{f}) \to \infty$, $\hat{f} \to f$, and so the rank of f must be infinite. Hence f(x, y) is a function that converges on a box and is not of finite rank, disproving the converse of Theorem 4.2.
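The computation described in this example can be reproduced along the following lines; this is only a sketch, with the truncation order as a parameter (truncRank is an illustrative name).

```mathematica
(* Sketch: truncate f to order n in each variable and take the rank of the
   coefficient matrix of the resulting polynomial, as in Theorem 2.11. *)
truncRank[n_] := Module[{fhat},
  fhat = Normal[Series[Exp[1/((1 + x^2) (1 + y^2))], {x, 0, n}, {y, 0, n}]];
  MatrixRank[CoefficientList[fhat, {x, y}]]]

truncRank[10]   (* 6, matching the rank reported above for the degree 10 truncation *)
```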


    6. Factorization of polynomials in three dimensions

The following section examines the factorization of polynomials in three unknowns.

Before any new theorems can be proven, consider the following definitions.

Definition 6.1. We say a polynomial $p(x_1, x_2, x_3)$ is of rank one if and only if $p(x_1, x_2, x_3) = p_1(x_1)p_2(x_2)p_3(x_3)$, where $p_1, p_2, p_3$ are polynomials.

    This is analogous to the two dimensional definition.

Definition 6.2. An $M_{n \times n \times n}$ grid is an n × n × n three dimensional matrix, where an entry is written as $A_{i,j,k}$, for $1 \le i, j, k \le n$.

Definition 6.3. The i-th face (or slice) of M, the grid defined by $A_{i,j,k}$ for $1 \le i, j, k \le n$, is the matrix F defined by $F_{j,k} := A_{i,j,k}$.

Definition 6.4. The j-th layer of M, the grid defined by $A_{i,j,k}$ for $1 \le i, j, k \le n$, is the matrix L defined by $L_{i,k} := A_{i,j,k}$.

    Let us attempt to visualize this concept in the following example.

Example 6.5. Consider a 2 × 2 × 2 grid M defined by the following:

$$A_{1,1,1} = 1,\; A_{1,1,2} = 2,\; A_{1,2,1} = 2,\; A_{1,2,2} = 4,\; A_{2,1,1} = 3,\; A_{2,1,2} = 6,\; A_{2,2,1} = 6,\; A_{2,2,2} = 12.$$

The first face (or slice) of this grid is the matrix $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ and the second face is $\begin{pmatrix} 3 & 6 \\ 6 & 12 \end{pmatrix}$. Furthermore, notice that the first layer of this grid is the matrix $\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}$ and the second layer is $\begin{pmatrix} 2 & 4 \\ 6 & 12 \end{pmatrix}$. Drawing this grid can also be helpful.
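In a computer algebra system, the grid is naturally a nested list. A small Mathematica sketch of Example 6.5, with a face and a layer extracted by fixing the first and second index respectively:

```mathematica
(* Sketch: the 2x2x2 grid of Example 6.5 as a nested list a[[i, j, k]]. *)
a = Table[0, {2}, {2}, {2}];
a[[1, 1, 1]] = 1;  a[[1, 1, 2]] = 2;  a[[1, 2, 1]] = 2;  a[[1, 2, 2]] = 4;
a[[2, 1, 1]] = 3;  a[[2, 1, 2]] = 6;  a[[2, 2, 1]] = 6;  a[[2, 2, 2]] = 12;
face[i_]  := a[[i, All, All]];   (* i-th face:  F[j, k] = A[i, j, k] *)
layer[j_] := a[[All, j, All]];   (* j-th layer: L[i, k] = A[i, j, k] *)
{face[1], layer[1]}
(* {{{1, 2}, {2, 4}}, {{1, 2}, {3, 6}}} *)
```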

Notice that it is possible to represent a three dimensional polynomial by a grid, in a similar manner to the way the coefficients of a two-dimensional polynomial were written into a matrix.

Definition 6.6. The degree of a polynomial $p(x_1, x_2, x_3) = a_{0,0,0} + \cdots + a_{k,l,m}x_1^k x_2^l x_3^m$ is the highest power of $x_1$, $x_2$ or $x_3$ with non-zero coefficient.

Definition 6.7. For a polynomial

$$p(x_1, x_2, x_3) = \sum_{0 \le i,j,k \le n} a_{i,j,k}\,x_1^i x_2^j x_3^k$$

of degree n, we define the coefficient grid C(p) to be the (n+1) × (n+1) × (n+1) grid whose (i, j, k)-th entry is $a_{i,j,k}$. The i-th face of this grid is

$$C_i(p) := \begin{pmatrix} a_{i,0,0} & \cdots & a_{i,0,n} \\ \vdots & \ddots & \vdots \\ a_{i,n,0} & \cdots & a_{i,n,n} \end{pmatrix}.$$

Definition 6.8. A 3 × 3 × 3 coefficient grid M is said to be rank one if and only if the corresponding polynomial is rank one.

    Definition 6.8. A 3 × 3 × 3 coefficient grid, M , is said to be rank one iff thecorresponding polynomial is rank one.

In two dimensions it was easy to show that a rank one coefficient matrix meant that the corresponding polynomial was rank one. In three dimensions, this is slightly more challenging. After all, we knew how to compute the rank of a matrix. How is it possible to find the rank of a grid? The following theorem answers these questions.

Theorem 6.9. For a grid $G \in M_{3\times 3\times 3}$, the following are equivalent:

(1) G is rank one.
(2) The polynomial $p(x_1, x_2, x_3)$ that corresponds to the grid G is of rank one.
(3) If $F_1$, $F_2$ and $F_3$ are the faces of the grid, then $\operatorname{Rank}\big[\big(F_1 \,|\, F_2 \,|\, F_3\big)\big] = 1$ and $\operatorname{Rank}\!\left[\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix}\right] = 1$.
(4) Every face and every layer of G is rank one.

As a direct consequence of the definition, the first two statements are equivalent. It is also very easy to show that the last two statements are equivalent. Hence, all that remains to be proven is the equivalence between the second and third statements.

Proof. (⇒): We assume that the polynomial $p(x_1, x_2, x_3)$ is rank one. Hence

$$p(x_1, x_2, x_3) = \sum_{0 \le i,j,k \le n} d_{i,j,k}\,x_1^i x_2^j x_3^k = p_1(x_1)p_2(x_2)p_3(x_3) = \Big(\sum_{i=0}^{n} a_i x_1^i\Big)\Big(\sum_{j=0}^{n} b_j x_2^j\Big)\Big(\sum_{k=0}^{n} c_k x_3^k\Big) = \sum_{0 \le i,j,k \le n} a_i b_j c_k\,x_1^i x_2^j x_3^k.$$

For notational clarity, we assume that n = 2. Now, setting coefficients equal and writing down the i-th face of the 3 × 3 × 3 coefficient grid, we obtain

$$F_i = \begin{pmatrix} d_{i,0,0} & d_{i,0,1} & d_{i,0,2} \\ d_{i,1,0} & d_{i,1,1} & d_{i,1,2} \\ d_{i,2,0} & d_{i,2,1} & d_{i,2,2} \end{pmatrix} = \begin{pmatrix} a_i b_0 c_0 & a_i b_0 c_1 & a_i b_0 c_2 \\ a_i b_1 c_0 & a_i b_1 c_1 & a_i b_1 c_2 \\ a_i b_2 c_0 & a_i b_2 c_1 & a_i b_2 c_2 \end{pmatrix}$$

for $0 \le i \le 2$. Clearly each $F_i$ is of rank one. Notice now that

$$\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix} = \begin{pmatrix}
a_0 b_0 c_0 & a_0 b_0 c_1 & a_0 b_0 c_2 \\
a_0 b_1 c_0 & a_0 b_1 c_1 & a_0 b_1 c_2 \\
a_0 b_2 c_0 & a_0 b_2 c_1 & a_0 b_2 c_2 \\
a_1 b_0 c_0 & a_1 b_0 c_1 & a_1 b_0 c_2 \\
a_1 b_1 c_0 & a_1 b_1 c_1 & a_1 b_1 c_2 \\
a_1 b_2 c_0 & a_1 b_2 c_1 & a_1 b_2 c_2 \\
a_2 b_0 c_0 & a_2 b_0 c_1 & a_2 b_0 c_2 \\
a_2 b_1 c_0 & a_2 b_1 c_1 & a_2 b_1 c_2 \\
a_2 b_2 c_0 & a_2 b_2 c_1 & a_2 b_2 c_2
\end{pmatrix},$$

so it is clear that $\operatorname{Rank}\!\left[\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix}\right] = 1$. Also notice that

$$\big(F_1 \,|\, F_2 \,|\, F_3\big) = \begin{pmatrix}
a_0 b_0 c_0 & a_0 b_0 c_1 & a_0 b_0 c_2 & a_1 b_0 c_0 & a_1 b_0 c_1 & a_1 b_0 c_2 & a_2 b_0 c_0 & a_2 b_0 c_1 & a_2 b_0 c_2 \\
a_0 b_1 c_0 & a_0 b_1 c_1 & a_0 b_1 c_2 & a_1 b_1 c_0 & a_1 b_1 c_1 & a_1 b_1 c_2 & a_2 b_1 c_0 & a_2 b_1 c_1 & a_2 b_1 c_2 \\
a_0 b_2 c_0 & a_0 b_2 c_1 & a_0 b_2 c_2 & a_1 b_2 c_0 & a_1 b_2 c_1 & a_1 b_2 c_2 & a_2 b_2 c_0 & a_2 b_2 c_1 & a_2 b_2 c_2
\end{pmatrix}.$$

It is clear that $\operatorname{Rank}\big[\big(F_1 \,|\, F_2 \,|\, F_3\big)\big] = 1$.

(⇐): Assume that $\operatorname{Rank}\big[\big(F_1 \,|\, F_2 \,|\, F_3\big)\big] = 1$ and $\operatorname{Rank}\!\left[\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix}\right] = 1$.

Since the stacked matrix has rank one, there exist constants $a_i$, l, m for $0 \le i \le 8$ such that the following equality holds:

$$\begin{pmatrix} F_1 \\ F_2 \\ F_3 \end{pmatrix} = \begin{pmatrix}
a_0 & l a_0 & m a_0 \\
a_1 & l a_1 & m a_1 \\
a_2 & l a_2 & m a_2 \\
a_3 & l a_3 & m a_3 \\
a_4 & l a_4 & m a_4 \\
a_5 & l a_5 & m a_5 \\
a_6 & l a_6 & m a_6 \\
a_7 & l a_7 & m a_7 \\
a_8 & l a_8 & m a_8
\end{pmatrix}.$$

But this means that

$$\big(F_1 \,|\, F_2 \,|\, F_3\big) = \begin{pmatrix}
a_0 & l a_0 & m a_0 & a_3 & l a_3 & m a_3 & a_6 & l a_6 & m a_6 \\
a_1 & l a_1 & m a_1 & a_4 & l a_4 & m a_4 & a_7 & l a_7 & m a_7 \\
a_2 & l a_2 & m a_2 & a_5 & l a_5 & m a_5 & a_8 & l a_8 & m a_8
\end{pmatrix}.$$

But we also know that $\operatorname{Rank}\big[\big(F_1 \,|\, F_2 \,|\, F_3\big)\big] = 1$. Hence it must be true that for some constants n and h, the following equalities hold:

$$a_3 = n a_0,\quad a_4 = n a_1,\quad a_5 = n a_2,$$

and

$$a_6 = h a_0,\quad a_7 = h a_1,\quad a_8 = h a_2.$$

So we rewrite the faces $F_1$, $F_2$ and $F_3$ of the coefficient grid taking these equalities into account:

$$F_1 = \begin{pmatrix} a_0 & l a_0 & m a_0 \\ a_1 & l a_1 & m a_1 \\ a_2 & l a_2 & m a_2 \end{pmatrix}, \quad
F_2 = \begin{pmatrix} n a_0 & l n a_0 & m n a_0 \\ n a_1 & l n a_1 & m n a_1 \\ n a_2 & l n a_2 & m n a_2 \end{pmatrix}, \quad
F_3 = \begin{pmatrix} h a_0 & l h a_0 & m h a_0 \\ h a_1 & l h a_1 & m h a_1 \\ h a_2 & l h a_2 & m h a_2 \end{pmatrix}.$$

Now, rewriting the grid in its polynomial representation, we get

$$p(x_1, x_2, x_3) = a_0 + n a_0 x_1 + h a_0 x_1^2 + a_1 x_2 + n a_1 x_1 x_2 + h a_1 x_1^2 x_2 + a_2 x_2^2 + n a_2 x_1 x_2^2 + h a_2 x_1^2 x_2^2$$
$$+ l a_0 x_3 + l n a_0 x_1 x_3 + l h a_0 x_1^2 x_3 + l a_1 x_2 x_3 + l n a_1 x_1 x_2 x_3 + l h a_1 x_1^2 x_2 x_3 + l a_2 x_2^2 x_3 + l n a_2 x_1 x_2^2 x_3 + l h a_2 x_1^2 x_2^2 x_3$$
$$+ m a_0 x_3^2 + m n a_0 x_1 x_3^2 + m h a_0 x_1^2 x_3^2 + m a_1 x_2 x_3^2 + m n a_1 x_1 x_2 x_3^2 + m h a_1 x_1^2 x_2 x_3^2 + m a_2 x_2^2 x_3^2 + m n a_2 x_1 x_2^2 x_3^2 + m h a_2 x_1^2 x_2^2 x_3^2$$
$$= \big(1 + n x_1 + h x_1^2\big)\Big(1 + \tfrac{a_1}{a_0} x_2 + \tfrac{a_2}{a_0} x_2^2\Big)\big(a_0 + l a_0 x_3 + m a_0 x_3^2\big).$$

Therefore, the polynomial is of rank one. □
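Criterion (3) of Theorem 6.9 is straightforward to test mechanically: concatenate the faces side by side, stack them, and check that both matrices have rank one. A sketch, assuming the grid is supplied as its list of faces (gridRankOneQ is an illustrative name):

```mathematica
(* Sketch of criterion (3): both flattenings of the grid used in the proof must have rank one. *)
gridRankOneQ[faces_List] :=
  MatrixRank[ArrayFlatten[{faces}]] == 1 &&         (* (F1 | F2 | F3), faces side by side *)
  MatrixRank[ArrayFlatten[List /@ faces]] == 1      (* F1, F2, F3 stacked on top of each other *)

gridRankOneQ[{{{1, 2}, {2, 4}}, {{3, 6}, {6, 12}}}]  (* True, for the grid of Example 6.5 *)
```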

Let us now consider polynomials in three variables that are not of rank one. We define a polynomial of higher rank in three dimensions in a similar manner to the two dimensional case.

Definition 6.10. We say that a polynomial p is of rank r if and only if there exists an integer r > 0 such that

$$p(x_1, x_2, x_3) = \sum_{i=1}^{r} f_i(x_1)g_i(x_2)h_i(x_3),$$

and r is the smallest integer to satisfy this condition.

Notice also that this definition of rank extends and complements the earlier definitions of rank one and rank r polynomials.

We had to leave the following as an open question: when n = 2, a polynomial is of rank r if and only if C(p) is of rank r. How can this be generalized to grids?

Finally, it is quite possible, and not at all conceptually difficult, to extend the above theory to higher dimensions. Notice, however, that although the above discussion forms a very neat and convenient theory of grids, it is not at all efficient from the computational point of view. Programs that attempt to factor polynomials in the above manner are large, unwieldy and slow. As such, we must develop a quicker and slicker method for factoring multi-dimensional polynomials.


    7. Algorithm and examples in two and n dimensions

As we already mentioned, the theory of grids described in the previous section is difficult and unwieldy in its implementation as a computer algorithm. In this section we describe an alternative algorithm. We shall first describe the computational method for polynomials in two dimensions, and later we will generalize this to n dimensions by a surprisingly simple extension.

The two dimensional algorithm is based completely on Theorem 2.11 and Example 2.12. Let $p(x_1, x_2)$ be a two dimensional, rank one polynomial of degree d, represented by the coefficient matrix C(p). Since $p(x_1, x_2)$ is rank one, we know that C(p) is rank one, and so we can factor it as

$$p(x_1, x_2) = p_1(x_1)p_2(x_2).$$

We factor by the following algorithm:

(1) Create the coefficient matrix C(p).
(2) Column reduce this matrix and save the leading (non-zero) column as the vector $\vec{\beta}$. (These will be the coefficients of $p_1(x_1)$.)
(3) Save the top row of C(p) in a vector $\vec{\gamma}$. (These will be the coefficients of $p_2(x_2)$.)

Example 2.6 follows this algorithm exactly.

In three and more dimensions, we simply use the above algorithm recursively, holding the other variables constant. Let us first consider an example before looking at the algorithm itself.

Example 7.1. Let p be the following rank one polynomial:

$$p(x_1, x_2, x_3) = 1 + 3x_1 + 2x_2 + 6x_1x_2 + 2x_3 + 6x_1x_3 + 4x_2x_3 + 12x_1x_2x_3.$$

We wish to factor this polynomial as

$$p(x_1, x_2, x_3) = p_1(x_1)p_2(x_2)p_3(x_3).$$

First, we want to express this polynomial in matrix form. However, we do not wish to use the complicated grid system. As such, let us simply assume for the moment that this polynomial is a function of two variables, $x_1$ and $x_2$, and that the third variable $x_3$ is simply a constant. With this in mind, we write down the coefficient matrix

$$\begin{pmatrix} 1 + 2x_3 & 2 + 4x_3 \\ 3 + 6x_3 & 6 + 12x_3 \end{pmatrix}.$$

So the (1, 1) entry represents the 'constant' term, the (1, 2) entry represents the $x_2$ term, the (2, 1) entry represents the $x_1$ term, and finally the (2, 2) entry represents the $x_1x_2$ term. Let us now use the above algorithm. We column reduce the above matrix and obtain

$$\begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix}.$$

Hence, our beta vector is

$$\vec{\beta} = \{1, 3\}.$$

Next, we read off the top row of the initial coefficient matrix to obtain the gamma vector,

$$\vec{\gamma} = \{1 + 2x_3,\; 2 + 4x_3\}.$$

This information tells us that

$$p_1(x_1) = 1 + 3x_1$$

from the beta vector, and

$$q(x_2, x_3) = 1 + 2x_3 + (2 + 4x_3)x_2$$

from the gamma vector. Notice that we have managed to reduce the problem, and that we can now write $p(x_1, x_2, x_3)$ as

$$p(x_1, x_2, x_3) = p_1(x_1)\,q(x_2, x_3) = (1 + 3x_1)(1 + 2x_2 + 2x_3 + 4x_2x_3).$$

Now, to find the full factorization of p, we only have to factor q, a function of two variables. But we already know how to do this. We once again use the above algorithm to obtain

$$q(x_2, x_3) = (1 + 2x_2)(1 + 2x_3).$$

Putting these two facts together gives us the final factorization of p:

$$p(x_1, x_2, x_3) = (1 + 3x_1)(1 + 2x_2)(1 + 2x_3).$$

Hence, in general, the algorithm for the n ≥ 3 dimensional factorization of a polynomial $p(x_1, \dots, x_n)$ is as follows:

(1) Create the coefficient matrix C(p) in terms of $x_1$ and $x_2$, holding the variables $x_3, \dots, x_n$ constant.
(2) Column reduce this matrix and save the leading (non-zero) column as the vector $\vec{\beta}$. (These will be the coefficients of $p_1(x_1)$.)
(3) Save the top row of C(p) in a vector $\vec{\gamma}$. (These will be the coefficients of the new polynomial $q_1(x_2, \dots, x_n)$.)
(4) Repeat steps 1 through 3, using $q_i$ as the new polynomial each time.

The above directions describe exactly the algorithm used in the construction of the Mathematica program FactNDim that was created as part of this paper.
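For reference, the following is a minimal Mathematica sketch in the spirit of the algorithm just described (it is not the FactNDim program itself). It assumes the input really is rank one and that each partial factor has a non-zero constant term, so that the first coefficient can play the role of the leading column and top row.

```mathematica
(* Sketch: peel off one univariate factor at a time and recurse on the remainder. *)
factorSeparable[p_, {x_}] := {Expand[p]}
factorSeparable[p_, {x_, rest__}] := Module[{c, beta},
  c = CoefficientList[p, x];                 (* coefficients of x^0, x^1, ...: polynomials in the rest *)
  beta = Cancel /@ (c/c[[1]]);               (* constants: normalized coefficients of the factor in x *)
  Prepend[factorSeparable[c[[1]], {rest}],   (* recurse on the "top row" q(x2, ..., xn) *)
    beta . x^Range[0, Length[beta] - 1]]]

factorSeparable[1 + 3 x1 + 2 x2 + 6 x1 x2 + 2 x3 + 6 x1 x3 + 4 x2 x3 + 12 x1 x2 x3, {x1, x2, x3}]
(* {1 + 3 x1, 1 + 2 x2, 1 + 2 x3} *)
```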

Recall that we mentioned creating an algorithm using the theory developed in the previous section. Although this is a far more complicated and involved algorithm, we include it for completeness' sake.


8. Alternative algorithm and examples in three dimensions

The following section examines the Mathematica algorithm that follows strictly from the grid theory proposed in one of the previous sections. This is in no way an efficient or optimal algorithm. It is presented to contrast with the previous algorithm and to demonstrate the way the theory could be used. For an efficient implementation, refer to the previous section.

Let $p(x_1, x_2, x_3)$ be a three dimensional, rank one polynomial of degree d, represented by the coefficient grid $G = \{A_1, \dots, A_d\}$. Since $p(x_1, x_2, x_3)$ is of rank one, we can factor it as $p(x_1, x_2, x_3) = p_1(x_1)p_2(x_2)p_3(x_3)$. The following algorithm describes the method used to find the coefficients of each of the polynomials $p_i(x_i)$.

(1) Create an array of length two; call this the "answer matrix array" or AMA.
(2) In the first entry of AMA, create an array of length d, called AMA1.
(3) Enter the first non-zero element of each $A_i$ into AMA1,i. If an entire face $A_i$ is zero, enter zero into AMA1,i.
(4) Find the index j of the first non-zero entry of the AMA1 array.
(5) Enter the matrix $A_j$ into the second entry of AMA, called AMA2.
(6) Create an array called answer, of length three. The i-th entry of that array will contain another array that will store the coefficients of $x_i$.
(7) Column reduce AMA1 and enter the leading column into answer1. These are the coefficients for $x_1$.
(8) Column reduce AMA2 and enter the leading column into answer2. These are the coefficients for $x_2$.
(9) Enter the first non-zero row of AMA2 into answer3. These are the coefficients for $x_3$.
(10) The answer array contains the desired coefficients.

We will see later that this algorithm generalizes to n dimensions. An AMA array will still be found. We will then column and row reduce each of the entries of AMA (except the last one, which we only column reduce), and enter the leading columns and rows of the reduced matrices into an answer array. There are, however, some technical details that need to be explained about the creation of the AMA matrix in n dimensions, so we will hold off on that explanation until later.

The simplest way to understand the workings of the Mathematica methods and the above algorithm is to examine some examples. As such, a simple example of the algorithm that is used to solve the factorization problem is given. The example is then used as a tool for explaining the individual methods.

Example 8.1. Notice that for the polynomial $p(x_1, x_2, x_3)$,

$$p(x_1, x_2, x_3) = 1 + 3x_1 + 2x_2 + 6x_1x_2 + 2x_3 + 6x_1x_3 + 4x_2x_3 + 12x_1x_2x_3 = p_1(x_1)p_2(x_2)p_3(x_3) = (1 + 3x_1)(1 + 2x_2)(1 + 2x_3).$$

First recognize that this polynomial can be represented by a 2 × 2 × 2 grid $G = \{A_1, A_2\}$, with faces

$$A_1 = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 3 & 6 \\ 6 & 12 \end{pmatrix}.$$

Notice that this coefficient grid is of rank one⁹ and hence the polynomial it represents can be factored.

Now, create a matrix (in this case an array) that contains the first non-zero entries of the two faces of the grid, $A_1$ and $A_2$. This array is simply $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$. Next, in accordance with the algorithm, find the location of the first non-zero entry in this resulting array. In this case, the first non-zero entry is 1 and it is in the first slot of the $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$ array. So, according to the algorithm, single out the first face of the grid, $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$. We have now found the two matrices that will determine the coefficients of the factored polynomials.

Create an "answer matrix array" (AMA) that will contain both these matrices. Hence,

$$AMA = \left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix},\; \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \right\}.$$

We will now column and row reduce each entry in the AMA matrix. Column reduce the first entry of AMA. Notice that we get the same array, $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$. This is the coefficient matrix of $p_1(x_1)$. Hence, we can say that $p_1(x_1) = 1 + 3x_1$. Now, row reduce the first entry of AMA. Notice that we get the array $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. But this simply represents the polynomial p = 1, so we ignore its effect on the factorization.

Continue similarly with the second matrix. Column reduce the second entry of AMA. Notice that we get the matrix $\begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix}$. This is the coefficient matrix of $p_2(x_2)$. Hence, we can say that $p_2(x_2) = 1 + 2x_2$.

Notice that, according to the algorithm, because we have reached the last entry of AMA, we do not row reduce the matrix, but take the first row directly as the coefficients of $p_3(x_3)$. As such, $p_3(x_3) = 1 + 2x_3$.

To represent our answers neatly, we create an answer array. The i-th entry of that array will contain another array that will store the coefficients of $x_i$. So,

$$answer = \left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix},\; \begin{pmatrix} 1 \\ 2 \end{pmatrix},\; \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\}.$$

We have successfully factored the polynomial $p(x_1, x_2, x_3)$.

⁹ Each slice and each layer of the grid is rank one.