Source: quartic03c2063.netsolhost.com/.../uploads/2015/05/Quartic.pdf · 2015. 5. 24.

CHAPTER 4: THE QUARTIC

A polynomial of degree 4 is called a quartic. In its most general form it may be written as

\[ f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0, \]

where the $a_i$, $i = 0, 1, 2, 3, 4$, are signed numbers with $a_4 \neq 0$. The domain of $f(x)$ is all signed numbers, as there are no restrictions on $x$. Because a quartic is a polynomial of even degree, its range will be, unlike the cubic's, a half-line.

After graphing many quartics we find empirically that if the leading coefficient is positive the quartic will grow to $+\infty$ as $x$'s distance from zero approaches $+\infty$, although there may be at most two finite intervals where its graph is descending. If the leading coefficient is negative the quartic will generally descend towards $-\infty$ as $|x|$ tends towards $+\infty$, although again there may be up to two finite intervals where its graph is actually rising as the distance from the origin increases. All quartics have at least one local extremum and at most three but never seem to have only two. If the leading coefficient is positive, the ordinate of at least one of the extrema will also serve as the absolute minimum value in the range of the quartic, whereas if the leading coefficient is negative, the ordinate of at least one of the extrema will serve as the absolute maximum value in the quartic's range. Below are some graphs of quartics.

[Figures 1 to 4: graphs of sample quartics.]

Following is a definition for absolute or universal extrema:

DEFINITION: $M(x_M, y_M)$ is a maximum point of polynomial $f(x)$ if for all values of $x$, $f(x) \le y_M$. $m(x_m, y_m)$ is a minimum point of polynomial $f(x)$ if for all values of $x$, $f(x) \ge y_m$.

Once again the horizontal translation function $g(x) = f_{\rightarrow +h}(x) = f(x-h)$ may be calculated:

\[
\begin{aligned}
f(x-h) &= a_4(x-h)^4 + a_3(x-h)^3 + a_2(x-h)^2 + a_1(x-h) + a_0 \\
&= a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x \\
&\quad + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0).
\end{aligned}
\]

So we may write

HORIZONTAL TRANSLATION to the right FUNCTION: Given $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, with $a_4 \neq 0$, as the general quartic, $g(x) = f_{\rightarrow +h}(x) = f(x-h)$ is $h$ units to the right of $f(x)$ and would have as its expression

\[
\begin{aligned}
g(x) &= a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x \\
&\quad + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0).
\end{aligned}
\]

Note that $a_4$, the leading coefficient, remained unchanged and that the constant term is $f(-h)$, as with the cubic. Viewed as a polynomial in $h$, its roots are again the negatives of the roots of the given quartic, and hence they are just as treacherous to find as the roots of the original.
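The translation formula can be checked numerically. The following is a minimal sketch, assuming coefficients are stored lowest degree first; the function names `shift_right` and `quartic_shift_formula` are hypothetical and chosen only for this illustration. The first function expands $f(x-h)$ term by term with the Binomial Theorem, the second simply transcribes the closed-form coefficients stated above, and the two should agree.

```python
from math import comb

def shift_right(coeffs, h):
    """Coefficients of g(x) = f(x - h), lowest degree first.

    coeffs[j] is a_j, the coefficient of x**j in f.
    """
    n = len(coeffs) - 1
    shifted = [0] * (n + 1)
    for j, a_j in enumerate(coeffs):
        # (x - h)**j = sum over k of C(j, k) * x**(j - k) * (-h)**k
        for k in range(j + 1):
            shifted[j - k] += a_j * comb(j, k) * (-h) ** k
    return shifted

def quartic_shift_formula(a0, a1, a2, a3, a4, h):
    """Closed-form coefficients of f(x - h) for a quartic, lowest degree first."""
    return [
        a4 * h**4 - a3 * h**3 + a2 * h**2 - a1 * h + a0,   # constant term, f(-h)
        -4 * a4 * h**3 + 3 * a3 * h**2 - 2 * a2 * h + a1,  # x term
        6 * a4 * h**2 - 3 * a3 * h + a2,                   # x^2 term
        -4 * a4 * h + a3,                                  # x^3 term
        a4,                                                # x^4 term, unchanged
    ]

# Arbitrary sample quartic 3x^4 + 7x^3 + 2x^2 - x + 5, slid 3 units to the right.
print(shift_right([5, -1, 2, 7, 3], 3))
print(quartic_shift_formula(5, -1, 2, 7, 3, 3))
```

Working with integer coefficients keeps the comparison exact, so no rounding tolerance is needed.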

We now succumb to an urge to generalize a little. Our methods so far have consciously eschewed the use of the derivative function developed at the outset of any presentation of the Differential and Integral Calculus. We are able to find the local extrema of polynomials without the use of the derivative. A translation which eliminates the $x$-term creates a new function whose $y$-intercept is a candidate for being a local extremum. The coefficient of the $x$-term of the translated function is a polynomial in "$h$" of degree "$n-1$," giving rise to at most "$n-1$" extrema. We then look at the coefficient of the $x^2$-term (as with the cubic) of the translated function and if this coefficient is nonzero the $y$-intercept of this new function is indeed a local extremum. It is a local maximum if the coefficient of the $x^2$-term is negative and it is a local minimum if the coefficient of the $x^2$-term is positive. If the coefficient of the $x^2$-term is zero we would remain unsure in the case where $f(x)$ is a polynomial of degree greater than three.

Had we used derivatives to find the extrema we would have started by taking the (first) derivative of $f(x)$. The derivative or first derivative of $f(x)$ is sometimes written as $f'(x)$ or as $f^{(1)}(x)$. With this notation we can write $f(x)$ itself as $f^{(0)}(x)$. The first derivative, $f^{(1)}(x)$, may be interpreted geometrically as providing the value of the slope of the line tangent to $f(x)$ at the point $(x, f(x))$. Clearly if this slope is not zero at, say, $x_1$, $f(x)$ is still rising (or falling) at $x = x_1$, so for $x_1$ to be an extremum it would seem to be necessary for $f^{(1)}(x_1)$ to be zero. However, the derivative equalling zero is not sufficient, since $f(x)$ may have stopped rising (or falling) only momentarily at that point and then simply resume its rise (or fall) after that point. A point $(x_1, f(x_1))$ where $f^{(1)}(x_1) = 0$ is sometimes called a stationary point; so all extrema of polynomials are stationary points but not all stationary points are extrema.

Our main interest has been and remains to find the roots of the given polynomials but this writer could not resist the desire to show the connection between the horizontal translation function and the derivative functions. By the way, the derivative of the derivative function, called the second derivative, may be written $f''(x)$ or $f^{(2)}(x)$. We hope to prove the following theorem describing the connection between the derivatives of a polynomial, the coefficients of the polynomial's horizontal translation function and the polynomial's local extrema:


MINIMAX THEOREM: If $g(x) = f_{\rightarrow +h}(x) = f(x-h)$ is the polynomial $h$ units to the right of $f(x)$ and if $f^{(m)}(x)$ is the $m$th derivative of $f(x)$, $0 \le m \le n$, where $f(x) = a_nx^n + a_{n-1}x^{n-1} + \ldots + a_0 = \sum_{i=0}^{n} a_ix^i$, $a_n \neq 0$, $n \ge 1$, then if $i! = 1 \cdot 2 \cdot \ldots \cdot i$ where $0!$ is defined as 1, we would find

(i)
\[ g(x) = f(-h) + f'(-h)x + \frac{1}{2}f''(-h)x^2 + \frac{f^{(3)}(-h)}{3!}x^3 + \ldots + \frac{f^{(n)}(-h)}{n!}x^n = \sum_{i=0}^{n} \frac{1}{i!}f^{(i)}(-h)x^i, \text{ and} \]

(ii) the point $(x_{ex}, f(x_{ex}))$ is a local extremum if and only if $f'(x_{ex}) = 0$ and the smallest Natural number $i_{ex}$, $2 \le i_{ex} \le n$, for which $f^{(i_{ex})}(x_{ex}) \neq 0$ is an even number. The local extremum would be a local maximum when $f^{(i_{ex})}(x_{ex}) < 0$ and a local minimum when $f^{(i_{ex})}(x_{ex}) > 0$.

Assuming we could prove this theorem, let us illustrate how we would proceed to find the $x$-coordinates of the local extrema and how we would determine whether they are local maxima or minima. We'd start by taking the derivative of the given function $f(x)$. We would then find the roots of the first derivative, i.e. the roots of $f^{(1)}(x)$. Suppose $x_1$ is one of these roots (there are at most $n-1$ roots since $f^{(1)}(x)$ is of degree $n-1$). We would then calculate, in ascending order, the derivatives $f^{(2)}(x), f^{(3)}(x), \ldots, f^{(n)}(x)$. Suppose $j$, $2 \le j \le n$, is the smallest number such that $f^{(j)}(x_1) \neq 0$. If $j$ is odd, $x_1$ is not an extremum. If $j$ is even: $x_1$ is a local maximum when $f^{(j)}(x_1) < 0$; $x_1$ is a local minimum when $f^{(j)}(x_1) > 0$.

EXAMPLE: Find the extrema or extremum of $f(x) = x^4 - 8x^3 + 24x^2 - 32x + 19$.

Solution: $f'(x) = f^{(1)}(x) = 4x^3 - 24x^2 + 48x - 32$. To find the roots we note: $f^{(1)}(x) = 4 \cdot (x^3 - 6x^2 + 12x - 8)$. By the rational roots test we find $x_1 = 2$ to be a root, which results in the factorization:

\[ f^{(1)}(x) = 4 \cdot (x^3 - 6x^2 + 12x - 8) = 4 \cdot (x-2) \cdot (x^2 - 4x + 4) = 4 \cdot (x-2) \cdot (x-2)^2 = 4 \cdot (x-2)^3 \]

which means $x_1 = 2$ is the first derivative's only root. Of course, since $f(x)$ is a quartic (a polynomial of even degree) with leading coefficient positive we know it must have at least one minimum, and since $x_1 = 2$ is the only candidate it must be the one. But let's proceed with the method outlined anyway. By design, $f^{(1)}(2) = 0$,

\[
\begin{aligned}
f^{(2)}(x) &= 12x^2 - 48x + 48 \\
\text{so } f^{(2)}(2) &= 12(2)^2 - 48(2) + 48 \\
&= 48 - 48(2) + 48 \\
&= 0, \text{ so we continue:} \\
f^{(3)}(x) &= 24x - 48 \\
\text{so } f^{(3)}(2) &= 24(2) - 48 \\
&= 48 - 48 \\
&= 0, \text{ so we continue:} \\
f^{(4)}(x) &= 24 > 0
\end{aligned}
\]

for all $x$, including 2, so $x_1 = 2$ is a minimum as 4 is even.

Since $f(2) = 3$, $(2, 3)$ is a local minimum of $f(x)$. It is also, in this case, the minimum value in the range of $f(x)$.
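The procedure just carried out by hand can be sketched in code. This is only an illustrative sketch under stated assumptions: polynomials are stored as coefficient lists, lowest degree first, with integer entries so that the tests against zero are exact, and the helper names `derivative`, `evaluate` and `classify` are hypothetical.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [a_0, a_1, ..., a_n]."""
    return [j * a for j, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def classify(coeffs, x1):
    """Classify x1 using the first higher derivative that does not vanish there."""
    d = derivative(coeffs)                 # f'
    if evaluate(d, x1) != 0:
        return "not stationary"
    order = 1
    while len(d) > 1:                      # stop once d is the constant f^(n)
        d = derivative(d)
        order += 1
        value = evaluate(d, x1)
        if value != 0:
            if order % 2 == 1:             # odd order: no extremum
                return "not an extremum"
            return "local minimum" if value > 0 else "local maximum"
    return "not an extremum"

f = [19, -32, 24, -8, 1]                   # x^4 - 8x^3 + 24x^2 - 32x + 19
print(classify(f, 2), evaluate(f, 2))      # local minimum 3
```

For the example above the loop passes over $f^{(2)}(2) = 0$ and $f^{(3)}(2) = 0$ and stops at $f^{(4)}(2) = 24 > 0$, matching the hand calculation.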


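EXAMPLE: Use the result in Part (i) of the theorem to expand $(a+b)^5$.

Solution: Let $f(x) = x^5$. Let $g(x)$ be $f(x)$ slid to the right by $h = -b$, i.e. slid to the left $b$ units. We will first find the expression for $g(x)$ and then calculate $g(a)$; for $g(a) = f(a - (-b)) = f(a+b) = (a+b)^5$.

By Part (i) of the theorem, where $-h = b$, $g(x) = \sum_{i=0}^{5} \frac{1}{i!}f^{(i)}(b)x^i$. We start by calculating the six values of $f^{(i)}(b)$ as $i$ runs through 0, 1, 2, 3, 4, 5.

\[
\begin{array}{c|c|c|c|c}
i & f^{(i)}(x) & f^{(i)}(b) & \frac{1}{i!}f^{(i)}(b)x^i & \frac{1}{i!}f^{(i)}(b)x^i \text{ simplified} \\ \hline
0 & x^5 & b^5 & \frac{1}{0!}b^5x^0 & b^5 \\
1 & 5x^4 & 5b^4 & \frac{5}{1!}b^4x^1 & 5b^4x \\
2 & 4 \cdot 5x^3 & 4 \cdot 5b^3 & \frac{4 \cdot 5}{2 \cdot 1}b^3x^2 & 10b^3x^2 \\
3 & 3 \cdot 4 \cdot 5x^2 & 3 \cdot 4 \cdot 5b^2 & \frac{3 \cdot 4 \cdot 5}{3 \cdot 2 \cdot 1}b^2x^3 & 10b^2x^3 \\
4 & 2 \cdot 3 \cdot 4 \cdot 5x & 2 \cdot 3 \cdot 4 \cdot 5b & \frac{2 \cdot 3 \cdot 4 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1}bx^4 & 5bx^4 \\
5 & 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 & 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 & \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}x^5 & x^5
\end{array}
\]

So adding up the terms in column 5 we find that $g(x) = b^5 + 5b^4x + 10b^3x^2 + 10b^2x^3 + 5bx^4 + x^5$. And hence $g(a) = b^5 + 5b^4a + 10b^3a^2 + 10b^2a^3 + 5ba^4 + a^5$. So we have that

\[ (a+b)^5 = b^5 + 5b^4a + 10b^3a^2 + 10b^2a^3 + 5ba^4 + a^5 \]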

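We can confirm our result using the Binomial Theorem:

\[
\begin{aligned}
(a+b)^5 &= \binom{5}{0}a^5 + \binom{5}{1}a^4b + \binom{5}{2}a^3b^2 + \binom{5}{3}a^2b^3 + \binom{5}{4}ab^4 + \binom{5}{5}b^5 \\
&= a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5
\end{aligned}
\]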

which matches the above result although the order of the sum is reversed. After this example our confidence in Part (i) of the theorem is increased but we still have the proof ahead of us. In working towards this demonstration of the connection between the horizontal translation function of $f(x)$ and $f(x)$'s derivatives we'll begin with the following Lemma.
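The same check can be repeated for a numerical value of $b$. The sketch below is an illustration only, not a proof: it builds the coefficients of $g(x) = f(x-h)$ from Part (i) as $f^{(i)}(-h)/i!$ and compares them with the binomial coefficients. The function name `translated_coefficients` and the sample value $b = 3$ are assumptions made for this example.

```python
from fractions import Fraction
from math import comb, factorial

def derivative(coeffs):
    """Derivative of a polynomial given as [a_0, a_1, ..., a_n]."""
    return [j * a for j, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def translated_coefficients(coeffs, h):
    """Coefficients of g(x) = f(x - h): the x**i coefficient is f^(i)(-h) / i!."""
    result = []
    d = list(coeffs)                       # d holds f^(i), starting with f^(0) = f
    for i in range(len(coeffs)):
        result.append(Fraction(evaluate(d, -h), factorial(i)))
        d = derivative(d)
    return result

b = 3
f = [0, 0, 0, 0, 0, 1]                     # f(x) = x^5
g = translated_coefficients(f, -b)         # slide left by b, so g(x) = (x + b)^5
print([int(c) for c in g])                 # [243, 405, 270, 90, 15, 1]
print([comb(5, i) * b ** (5 - i) for i in range(6)])
```

Exact rational arithmetic via `Fraction` is used so that the division by $i!$ introduces no rounding.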

LEMMA: Suppose $s$ is a Natural number. Consider the polynomial of degree $s$: $f_s(x) = \sum_{j=0}^{s} a_jx^j = a_0 + a_1x + a_2x^2 + \ldots + a_sx^s$, $a_s \neq 0$. We will find

\[ f_s^{(i)}(x) = \sum_{l=0}^{s-i} \binom{s-l}{i}\, i!\, a_{s-l}\, x^{s-i-l}, \]

where $f_s^{(i)}(x)$ is the $i$th derivative of $f_s(x)$, with $i$ among $0, 1, 2, \ldots, s$.

Proof:

\[ f_s(x) = \sum_{j=0}^{s} a_jx^j = a_0 + a_1x + a_2x^2 + \ldots + a_sx^s, \quad a_s \neq 0 \]

$f_s(x)$ has $s+1$ terms. Each term is a monomial with a coefficient and a power of $x$. Let's associate with $f_s(x)$ the set $A_{f_s(x)} = \{(a_j, x^j) \mid j = 0, 1, \ldots, s\}$, where $(a, x^j) = (b, x^k)$ if and only if $a = b$ and $j = k$. The purpose of this type of set will be to compare sums to see if they are equal. We have to be a little careful because elements need not be repeated in sets but obviously alter a sum. Also in the above set any element of the form $(0, x^j)$ is to be omitted as it would not contribute to the sum. It's all quite trivial really; the most taxing aspect is keeping track of the notational maze.

What is the $i$th derivative of $a_jx^j$, where $0 \le i \le s$? If $i > j$ it is 0; if $i \le j$ it would be $j \cdot (j-1) \cdot (j-2) \cdot \ldots \cdot (j-i+1) \cdot a_j \cdot x^{j-i} = \frac{j!}{(j-i)!} \cdot a_j \cdot x^{j-i}$. So

\[ A_{f_s^{(i)}(x)} = \left\{ \left( \frac{j!}{(j-i)!} \cdot a_j,\; x^{j-i} \right) \,\middle|\, j = i, i+1, \ldots, s \right\} \]

It is clear by inspection that the $s-i+1$ elements of the above set, that is the ordered pairs, are all distinct as $j$ runs its course from $i$ to $s$. There will be no terms to discard because although $a_j$ might be zero for some particular polynomial there will be no value of $j$ for which $a_j$ will be definitely zero for the general polynomial $f_s(x)$.

According to the lemma we have that

\[ f_s^{(i)}(x) = \sum_{l=0}^{s-i} \binom{s-l}{i}\, i!\, a_{s-l}\, x^{s-i-l}, \]

so for this expression we would find the associated set to be

\[ A^{\text{Lemma}}_{f_s^{(i)}(x)} = \left\{ \left( \binom{s-l}{i}\, i!\, a_{s-l},\; x^{s-l-i} \right) \,\middle|\, l = 0, 1, 2, \ldots, s-i \right\} \]

where $i$ is an integer in $\{0, 1, 2, \ldots, s\}$.

Now, $\binom{s-l}{i} = \frac{(s-l)!}{i!\,(s-l-i)!}$, so $\binom{s-l}{i}\, i! = \frac{(s-l)!}{(s-l-i)!}$, leading to

\[ A^{\text{Lemma}}_{f_s^{(i)}(x)} = \left\{ \left( \frac{(s-l)!}{(s-l-i)!} \cdot a_{s-l},\; x^{s-l-i} \right) \,\middle|\, l = 0, 1, \ldots, s-i \right\} \]

Both $A^{\text{Lemma}}_{f_s^{(i)}(x)}$ and $A_{f_s^{(i)}(x)}$ have the same number of distinct elements, namely $s-i+1$, so we need only show these two finite sets are equal to establish the equality of the associated sums. To prove that two sets $A$ and $B$ are equal it is sufficient to show that $A \subseteq B$ and $B \subseteq A$. However, if the two sets are finite and have the same number of elements it is sufficient to show the truth of only one of the two containments. To prove that $A \subseteq B$ we must show that an arbitrary element in $A$ will always be also found in $B$. So we will now show that $A^{\text{Lemma}}_{f_s^{(i)}(x)} \subseteq A_{f_s^{(i)}(x)}$. So we now take an arbitrary element in $A^{\text{Lemma}}_{f_s^{(i)}(x)}$:

\[ \left( \frac{(s-l_0)!}{(s-l_0-i)!} \cdot a_{s-l_0},\; x^{s-l_0-i} \right) \]

and must show its presence in $A_{f_s^{(i)}(x)}$. Suppose $j_{l_0} = s - l_0$. What can we say about the range of $j_{l_0}$? Well, $j_{l_0}$ can be no smaller than $s - (s-i) = i$ and no bigger than $s - 0 = s$. In other words $j_{l_0}$ must be in $\{i, i+1, \ldots, s\}$, meaning that the element below, obtained by replacing $s - l_0$ with $j_{l_0}$ in the element above, will indeed place this element in $A_{f_s^{(i)}(x)}$:

\[ \left( \frac{j_{l_0}!}{(j_{l_0}-i)!} \cdot a_{j_{l_0}},\; x^{j_{l_0}-i} \right). \]

This establishes the equality of the sets and the corresponding sums and hence the Lemma.
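As a sanity check on the Lemma (not a substitute for the proof), the closed form can be compared with repeated term-by-term differentiation. This is a small sketch assuming coefficient lists stored lowest degree first; `lemma_derivative` and `repeated_derivative` are hypothetical names, and the sample polynomial is arbitrary.

```python
from math import comb, factorial

def lemma_derivative(coeffs, i):
    """i-th derivative of f_s from the Lemma, returned as [c_0, ..., c_(s-i)]."""
    s = len(coeffs) - 1
    out = [0] * (s - i + 1)
    for l in range(s - i + 1):
        # Term of the Lemma's sum: C(s-l, i) * i! * a_(s-l) * x**(s-i-l)
        out[s - i - l] = comb(s - l, i) * factorial(i) * coeffs[s - l]
    return out

def repeated_derivative(coeffs, i):
    """i-th derivative obtained by differentiating term by term i times."""
    d = list(coeffs)
    for _ in range(i):
        d = [j * a for j, a in enumerate(d)][1:]
    return d

f = [7, -2, 0, 5, 3, -1]                   # arbitrary degree-5 sample, a_2 = 0
for i in range(len(f)):
    print(i, lemma_derivative(f, i) == repeated_derivative(f, i))
```

The sample deliberately includes a zero coefficient to show that the formula still produces the matching (zero) entry.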


PROOF of MINIMAX THEOREM: Suppose $g(x)$ is the right translation function by $h$ of $f(x) = f_n(x) = \sum_{j=0}^{n} a_jx^j$. As found earlier we have

\[
\begin{aligned}
g(x) &= f_n(x-h) \\
&= \sum_{j=0}^{n} a_j(x-h)^j \\
&= \sum_{j=0}^{n} a_j \left( \sum_{k=0}^{j} \binom{j}{k} x^{j-k}(-h)^k \right)
\end{aligned}
\]

using the Binomial Theorem. We wish to show the above sum to be identical with

\[ g_{\text{Theorem}}(x) = \sum_{i=0}^{n} \frac{1}{i!}f^{(i)}(-h)x^i. \]

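From the Lemma we know that

\[ f^{(i)}(-h) = f_n^{(i)}(-h) = \sum_{l=0}^{n-i} \binom{n-l}{i}\, i!\, a_{n-l}(-h)^{n-i-l}. \]

Substituting the above into $g_{\text{Theorem}}(x)$ we find

\[
\begin{aligned}
g_{\text{Theorem}}(x) &= \sum_{i=0}^{n} \frac{1}{i!} \left( \sum_{l=0}^{n-i} \binom{n-l}{i}\, i!\, a_{n-l}(-h)^{n-i-l} \right) x^i \\
&= \sum_{i=0}^{n} \left( \sum_{l=0}^{n-i} \binom{n-l}{i} a_{n-l}(-h)^{n-i-l} \right) x^i
\end{aligned}
\]

so the coefficient of $x^i$ in $g_{\text{Theorem}}(x)$, $i = 0, 1, \ldots, n$, is

\[ \sum_{l=0}^{n-i} \binom{n-l}{i} a_{n-l}(-h)^{n-i-l}; \]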

we would like to extract the coefficient of $x^i$ in

\[ f(x-h) = \sum_{j=0}^{n} a_j \left( \sum_{k=0}^{j} \binom{j}{k} x^{j-k}(-h)^k \right) \]

and show that it, too, has this same coefficient for these $i \in \{0, 1, \ldots, n\}$.

Clearly the exponent $j-k$ takes on exactly the values $0, 1, \ldots, n$. Suppose $j - k = j^* \in \{0, 1, \ldots, n\}$. Which values of $(k, j)$, $0 \le k \le j \le n$, satisfy $j - k = j^*$? First we observe $j \ge j^*$. The table below indicates the $n - j^* + 1$ distinct pairs $(j, k)$ which satisfy $j - k = j^*$.

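\[
\begin{array}{c|c|c|c}
j & k & j-k & \stackrel{?}{=} j^* \\ \hline
j^* & 0 & j^* & \checkmark \\
j^*+1 & 1 & j^* & \checkmark \\
j^*+2 & 2 & j^* & \checkmark \\
\ldots & \ldots & \ldots & \\
j^*+c & c & j^* & \checkmark \quad \text{provided } 0 \le c \le n-j^* \\
\ldots & \ldots & \ldots & \\
j^*+(n-j^*) = n & n-j^* & j^* & \checkmark
\end{array}
\]

The coefficient of $x^i$ in $g_{\text{Theorem}}(x)$ has $n-i+1$ terms and $i$, like $j^*$, takes on precisely the values in $\{0, 1, \ldots, n\}$. So both coefficients have the same number of addends, namely "$n-i+1$". So it would be sufficient to show the individual addends are equal.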

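The coefficient of $x^{j^*}$ is

\[
\begin{aligned}
& a_{j^*}\binom{j^*}{0}(-h)^0 + a_{j^*+1}\binom{j^*+1}{1}(-h)^1 + \ldots + a_{j^*+(n-j^*)}\binom{j^*+(n-j^*)}{n-j^*}(-h)^{n-j^*} \\
&= \sum_{j=j^*}^{n} a_j \binom{j}{j-j^*}(-h)^{j-j^*}
\end{aligned}
\]

So, to correlate notation, if $j^* = i$, the coefficient of $x^i$ would be

\[
\begin{aligned}
& \sum_{j=i}^{n} a_j \binom{j}{j-i}(-h)^{j-i} \\
&= a_i\binom{i}{0}(-h)^0 + a_{i+1}\binom{i+1}{1}(-h)^1 + \ldots + a_n\binom{n}{n-i}(-h)^{n-i},
\end{aligned}
\]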

which, adding up the terms in reverse, would lead to the expression

\[
\begin{aligned}
& \sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{n-i-l}(-h)^{n-i-l} \\
&= \sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{(n-l)-i}(-h)^{n-i-l} \\
&= \sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{i}(-h)^{n-i-l},
\end{aligned}
\]

since $\binom{m}{k} = \binom{m}{m-k}$; $k \le m$.

This final sum has the same form as the coefficient of $x^i$ in the expression for $g_{\text{Theorem}}(x)$, as was to be shown, proving part (i) of the Theorem.

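Suppose $x_{ex}$ satisfies $f'(x_{ex}) = 0$ and that $i_{ex}$ is the smallest Natural number, $2 \le i_{ex} \le n$, for which $f^{(i_{ex})}(x_{ex}) \neq 0$, and that $i_{ex}$ is even. A smallest $i_{ex}$, $2 \le i_{ex} \le n$, such that $f^{(i_{ex})}(x_{ex}) \neq 0$ will always exist since we always have $f^{(n)}(x_{ex}) \neq 0$. We then construct $g(x)$ by sliding $f(x)$ to the "left" by $x_{ex}$, meaning we use $h = -x_{ex}$. By the result found in Part (i) of the Minimax Theorem we would have

\[ g(x) = f(x_{ex}) + f'(x_{ex})x + \frac{1}{2}f''(x_{ex})x^2 + \frac{f^{(3)}(x_{ex})}{3!}x^3 + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^n \]

and substituting our assumptions about $x_{ex}$ and $i_{ex}$ leads to

\[
\begin{aligned}
g(x) &= f(x_{ex}) + \langle \text{terms of value zero} \rangle + \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}x^{i_{ex}} + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^n \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \frac{f^{(i_{ex}+1)}(x_{ex})}{(i_{ex}+1)!}x + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-i_{ex}} \right) \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j \right),
\end{aligned}
\]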

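where the $\sum$ term applies when $i_{ex} \neq n$. It is dropped in the case $i_{ex} = n$ itself. So we let $C(x)$, the coefficient of $x^{i_{ex}}$, be defined as

\[
C(x) =
\begin{cases}
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \displaystyle\sum_{j=1}^{n-i_{ex}} \dfrac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j, & \text{if } i_{ex} < n; \\[2ex]
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}, & \text{if } i_{ex} = n.
\end{cases}
\]

Provided $i_{ex} < n$ we define $Q(x)$ as

\[ Q(x) = \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j. \]

So we have

\[ g(x) = f(x_{ex}) + x^{i_{ex}} \cdot C(x). \]

If $n$ were even, our assumed even $i_{ex}$ might actually equal $n$. Addressing this case first we would have

\[ C(x) = \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} = \frac{f^{(n)}(x_{ex})}{n!}, \]

which would be a constant we'll call $C_n$. So, in this special case

\[ g(x) = f(x_{ex}) + x^n \cdot C_n. \]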

By inspection it can be seen that the point $(0, g(0)) = (0, f(x_{ex}))$ is an absolute extremum of $g(x)$: since $n$ is even the term $x^n \cdot C_n$ will always have the sign of the constant $C_n$ regardless of the sign of $x$. $(0, f(x_{ex}))$ will therefore be a maximum point when the constant $C_n$ is negative and a minimum point when the constant $C_n$ is positive.

We now consider $i_{ex} < n$ and look at $Q(x)$. $Q(x)$ is a polynomial of degree $n - i_{ex} \ge 1$. Since $Q(0) = 0$ and $Q(x)$ is a polynomial, $Q(x)$ is continuous everywhere including at $x = 0$. We are given that $f^{(i_{ex})}(x_{ex}) \neq 0$ so we may let $\delta = \left| \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} \right|$, which is guaranteed positive. By the continuity of $Q(x)$ we can find an $\varepsilon > 0$ so that $|Q(x)| < \delta$ whenever $|x| < \varepsilon$. Since $i_{ex}$ is even we have $x^{i_{ex}} > 0$ whenever $x \neq 0$ regardless of $x$'s sign. As long as we restrict ourselves to this $\varepsilon$-neighborhood of $x = 0$ we would have the sign of the coefficient $C(x)$ always be the same as the sign of the constant $\frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}$. We'll call this constant $C_{i_{ex}}$, allowing us to write

\[ g(x) = f(x_{ex}) + x^{i_{ex}}(C_{i_{ex}} + Q(x)). \]

By an argument similar to the one used above for the case $i_{ex} = n$ we see that $(0, f(x_{ex}))$ is a local extremum. In the above argument $x$ was free to range over all numbers but here we are restricted to the $\varepsilon$-neighborhood which will keep $|Q(x)| < \delta = |C_{i_{ex}}|$, meaning we are observing possibly only a "local" extremum. So if $C_{i_{ex}}$ is negative, $(0, f(x_{ex}))$ is a local maximum of $g(x)$ and if $C_{i_{ex}}$ is positive, $(0, f(x_{ex}))$ is a local minimum of $g(x)$. By the definition of $C_{i_{ex}}$ it is immediately apparent that it has the same sign as $f^{(i_{ex})}(x_{ex})$. So we see that $(x_{ex}, f(x_{ex}))$ must have been a local extremum of $f(x)$ when the two conditions in Part (ii) of the Minimax Theorem are assumed met.

Now for the converse: assuming $(x_{ex}, f(x_{ex}))$ is a local extremum we must show the two conditions in Part (ii) to be necessarily true, namely that $f'(x_{ex}) = 0$ and that the smallest Natural number $i_{ex}$, $2 \le i_{ex} \le n$, with $f^{(i_{ex})}(x_{ex}) \neq 0$ is even. Suppose for a moment that the constant $f'(x_{ex})$, which we shall call $b$, were not zero. We would then have

\[
\begin{aligned}
g(x) &= f(x_{ex}) + \frac{f'(x_{ex})}{1!}x + \frac{f''(x_{ex})}{2!}x^2 + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^n \\
&= f(x_{ex}) + x\left( f'(x_{ex}) + \frac{f''(x_{ex})}{2!}x + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} \right) \\
&= f(x_{ex}) + x\left( b + \frac{f''(x_{ex})}{2!}x + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} \right)
\end{aligned}
\]

This time we let

\[ Q(x) = \frac{f''(x_{ex})}{2!}x + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} = \sum_{j=2}^{n} \frac{f^{(j)}(x_{ex})}{j!}x^{j-1}. \]

Again we have $Q(0) = 0$ and $Q(x)$ is a polynomial and hence continuous at $x = 0$. We let $\delta = |b| > 0$. We can then find an $\varepsilon > 0$ so that whenever $x \in (-\varepsilon, \varepsilon)$, $|Q(x)| < \delta$, making the coefficient of $x$ retain the sign of $b$ on both sides of $x = 0$. If $b$ were positive, abscissas to the left of the origin would result in values for $g(x)$ below $(0, f(x_{ex}))$ and abscissas to the right of the origin would result in values above $(0, f(x_{ex}))$, no matter how small a positive number $\varepsilon$ were. If $b$ were negative, abscissas to the left of the origin would result in values for $g(x)$ above $(0, f(x_{ex}))$ and abscissas to the right of the origin would result in $g(x)$ values below $(0, f(x_{ex}))$. Either way it is impossible for $(0, f(x_{ex}))$ to be an extremum. Therefore we must have $b = 0$, i.e. $f'(x_{ex}) = 0$, establishing the first condition. Now for the second condition. Assume again for a moment that the smallest $i_{ex}$, $2 \le i_{ex} \le n$, making $f^{(i_{ex})}(x_{ex}) \neq 0$ were odd. We would then be able to write

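\[
\begin{aligned}
g(x) &= f(x_{ex}) + \langle \text{terms of value zero} \rangle + \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}x^{i_{ex}} + \ldots + \frac{f^{(n)}(x_{ex})}{n!}x^n \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j \right)
\end{aligned}
\]

where again the $\sum$ term applies provided $i_{ex} \neq n$ and is dropped in the case $i_{ex} = n$ itself. So once more we let $C(x)$, the coefficient of $x^{i_{ex}}$, be defined as

\[
C(x) =
\begin{cases}
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \displaystyle\sum_{j=1}^{n-i_{ex}} \dfrac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j, & \text{if } i_{ex} < n; \\[2ex]
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}, & \text{if } i_{ex} = n.
\end{cases}
\]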


Once again we can use the following short-cut notation for the constant

\[ C_{i_{ex}} = \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} \]

so that $C_n = \frac{f^{(n)}(x_{ex})}{n!}$. In the event that $n$ were odd we might again have that $i_{ex} = n$. Addressing this case first we could write

\[ g(x) = f(x_{ex}) + C_n \cdot x^n. \]

Since $x^n$ would be an odd power, $C_n \cdot x^n$ would have opposite signs for abscissas on opposite sides of $x = 0$ no matter how tiny an interval about $x = 0$ we consider, making it impossible for $(0, f(x_{ex}))$ to be an extremum of $g(x)$ and hence for $(x_{ex}, f(x_{ex}))$ to be an extremum of $f(x)$. In the case $i_{ex} < n$, we let, as before,

\[ Q(x) = \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j. \]

We have $Q(0) = 0$ and $Q(x)$, being a polynomial, is continuous at $x = 0$. We let $\delta = |C_{i_{ex}}| > 0$ and we can find an $\varepsilon > 0$ so that $|Q(x)| < \delta$ whenever $x \in (-\varepsilon, \varepsilon)$, meaning that within this interval $C(x) = C_{i_{ex}} + Q(x)$ retains the sign of $C_{i_{ex}}$. As $i_{ex}$ is being assumed odd, $x^{i_{ex}}$ is an odd power and hence within the $\varepsilon$-neighborhood $C(x) \cdot x^{i_{ex}}$ is of opposite sign on opposite sides of the origin. So we see that $(0, f(x_{ex}))$ cannot be a local extremum of $g(x)$, as no matter how tight an $\varepsilon$-neighborhood around $x = 0$ we choose the values for $g(x)$ will be on opposite sides of $f(x_{ex})$ when the abscissas are on opposite sides of $x = 0$, meaning that $(x_{ex}, f(x_{ex}))$ could not be a local extremum of $f(x)$, thus voiding the temporary assumption that $i_{ex}$ might be odd. So $i_{ex}$ must be even, which establishes the Minimax Theorem.

MiniMax Corollary For Linear (n=1) Polynomials: Given $f(x) = a_1x + a_0$, $a_1 \neq 0$, $f(x)$ has no extrema.

Proof: $f^{(1)}(x) = a_1 \neq 0$ for all values of $x$. Hence $f^{(1)}(x) = 0$ has no solution so $f(x)$ can have no extrema.

MiniMax Corollary For Quadratic (n=2) Polynomials: Given $f(x) = a_2x^2 + a_1x + a_0$, $a_2 \neq 0$. The point

\[ (x_{ex}, f(x_{ex})) = \left( \frac{-a_1}{2a_2},\; \frac{4a_0a_2 - a_1^2}{4a_2} \right) \]

is always an extremum and is a (universal) maximum when $a_2 < 0$ and a (universal) minimum when $a_2 > 0$.

Proof: We start by calculating $f^{(1)}(x) = f'(x) = 2a_2x + a_1$. Since $2a_2x + a_1$ is a polynomial of degree one it will have at most one root, and since its degree is odd it will have at least one root, and therefore will have exactly one root, which by inspection is $x_{ex} = \frac{-a_1}{2a_2}$. We find that $f^{(2)}(x) = 2a_2 \neq 0$ for all values of $x$, including therefore $x_{ex} = \frac{-a_1}{2a_2}$, making $x_{ex} = \frac{-a_1}{2a_2}$ an extremum since "2" is even. $x_{ex}$ will be the abscissa of a local maximum when $a_2$ is negative and the abscissa of a local minimum when $a_2$ is positive. The extremum is universal, for suppose we slide $f(x)$ to the right by $\frac{a_1}{2a_2}$ to place the extremum on the $y$-axis:

\[
\begin{aligned}
f\left(x - \frac{a_1}{2a_2}\right) &= a_2\left(x - \frac{a_1}{2a_2}\right)^2 + a_1\left(x - \frac{a_1}{2a_2}\right) + a_0 \\
&= a_2\left(x^2 - \frac{a_1}{a_2}x + \frac{a_1^2}{4a_2^2}\right) + a_1x - \frac{a_1^2}{2a_2} + a_0 \\
&= a_2x^2 + \frac{a_1^2}{4a_2} - \frac{a_1^2}{2a_2} + a_0 \\
&= a_2x^2 + \frac{4a_2a_0 - a_1^2}{4a_2}
\end{aligned}
\]

We can now see by inspection of the last expression above that $g(x) = f\left(x - \frac{a_1}{2a_2}\right)$ has $\left(0, \frac{4a_2a_0 - a_1^2}{4a_2}\right)$ as a universal extremum. If $a_2 < 0$ this point would be a universal maximum; if $a_2 > 0$ this point would be a universal minimum.


MiniMax Corollary For Cubic (n=3) Polynomials: Given $f(x) = a_3x^3 + a_2x^2 + a_1x + a_0$, $a_3 \neq 0$, $f(x)$ will have two extrema

\[ (x_{max}, f(x_{max})) = \left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3},\; \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 + 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \right) \]

and

\[ (x_{min}, f(x_{min})) = \left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3},\; \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 - 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \right) \]

if and only if $a_2^2 - 3a_1a_3 > 0$; $x_{max}$ will be the abscissa of the local maximum and $x_{min}$ will be the abscissa of the local minimum. If $a_2^2 - 3a_1a_3 \le 0$, $f(x)$ will have no extrema.

Proof: We start again with $f^{(1)}(x) = 3a_3x^2 + 2a_2x + a_1$. We wish to find the roots for $3a_3x^2 + 2a_2x + a_1 = 0$. Since $f^{(1)}(x)$ is a degree 2 polynomial it will have at most two roots. Since $f^{(1)}(x)$ is of even degree it may have no roots at all since its range is the half-line. Using the Quadratic Formula we find the two solutions

\[ x_{ex_1} = \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3}, \quad \text{and} \quad x_{ex_2} = \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \]

when $a_2^2 - 3a_1a_3 > 0$; the one solution

\[ x_1 = \frac{-a_2}{3a_3} \]

when $a_2^2 - 3a_1a_3 = 0$; and no solutions when $a_2^2 - 3a_1a_3 < 0$. By the MiniMax Theorem there can be no extrema when $f^{(1)}(x)$ cannot equal zero. Can $x_1 = \frac{-a_2}{3a_3}$ be the abscissa of an extremum? Well, $f^{(2)}(x) = 6a_3x + 2a_2$, so $f^{(2)}(x_1) = f^{(2)}\left(\frac{-a_2}{3a_3}\right) = 6a_3\left(\frac{-a_2}{3a_3}\right) + 2a_2 = -2a_2 + 2a_2 = 0$ and $f^{(3)}(x_1) = f^{(3)}\left(\frac{-a_2}{3a_3}\right) = 6a_3 \neq 0$, so the answer is "No" since 3 is odd. We are therefore left with $x_{ex_1}$ and $x_{ex_2}$ as the only candidates for being extrema, as $f^{(1)}(x_{ex_1}) = f^{(1)}(x_{ex_2}) = 0$. Calculating the second derivative for each of these candidates we find

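\[
\begin{aligned}
f^{(2)}(x_{ex_1}) &= 6a_3x_{ex_1} + 2a_2 \\
&= 6a_3\left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right) + 2a_2 \\
&= -2\sqrt{a_2^2 - 3a_1a_3} < 0 \text{ and hence } \neq 0
\end{aligned}
\]

so $x_{ex_1}$ is the abscissa of a local maximum, so we use the symbol $x_{max}$ for $x_{ex_1}$, and

\[
\begin{aligned}
f^{(2)}(x_{ex_2}) &= 6a_3x_{ex_2} + 2a_2 \\
&= 6a_3\left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right) + 2a_2 \\
&= 2\sqrt{a_2^2 - 3a_1a_3} > 0 \text{ and hence } \neq 0
\end{aligned}
\]

so $x_{ex_2}$ is the abscissa of a local minimum, so we use the symbol $x_{min}$ for $x_{ex_2}$. To calculate the ordinates of these extrema we must expand $f(x_{max}) = f\left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right)$ and $f(x_{min}) = f\left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right)$. This simple but lengthy calculation was performed in the chapter on the cubic. Adapting the notation used there for the coefficients of the given cubic to our present notation we find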


\[ f(x_{max}) = \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 + 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \]

and

\[ f(x_{min}) = \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 - 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \]

completing the proof of the corollary.

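MiniMax Corollary For Quartic (n=4) Polynomials: Given $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, $a_4 \neq 0$, $f(x)$ will have either one extremum or three extrema. Letting

\[ p = \frac{8a_2a_4 - 3a_3^2}{16a_4^2}, \quad q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_4^2a_1}{32a_4^3}, \quad \text{and} \quad D = \frac{q^2}{4} + \frac{p^3}{27} \]

we find

(i) if $D \ge 0$ then

\[ x_{ex} = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} - \frac{a_3}{4a_4} \]

is the abscissa of the one and only extremum of $f(x)$; $x_{ex}$ is an (absolute) maximum when $a_4 < 0$; $x_{ex}$ is an (absolute) minimum when $a_4 > 0$, and

(ii) if $D < 0$ then

\[
\begin{aligned}
x_{ex_1} &= -2\sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] \right] - \frac{a_3}{4a_4}, \\
x_{ex_2} &= 2\sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] - 60^\circ \right] - \frac{a_3}{4a_4} \quad \text{and} \\
x_{ex_3} &= 2\sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] + 60^\circ \right] - \frac{a_3}{4a_4}
\end{aligned}
\]

are the abscissas of the three extrema.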

If $a_4 < 0$ then $x_{ex_1}$ is the abscissa of a local maximum, $x_{ex_3}$ is the abscissa of a local minimum and $x_{ex_2}$ is the abscissa of a local maximum. The larger of $f(x_{ex_1})$ and $f(x_{ex_2})$ would serve as the absolute maximum value in the range of $f(x)$.

If $a_4 > 0$ then $x_{ex_1}$ is the abscissa of a local minimum, $x_{ex_3}$ is the abscissa of a local maximum and $x_{ex_2}$ is the abscissa of a local minimum. The smaller of $f(x_{ex_1})$ and $f(x_{ex_2})$ would serve as the absolute minimum value in the range of $f(x)$.

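For reference we carry out the following simple but lengthy calculations:

\[
\begin{aligned}
f(x_{ex}) &= \frac{768a_0a_4^3 + 512a_4^4p^2 + 192a_3^2a_4^2p - 512a_2a_4^3p - 9a_3^4 + 48a_2a_3^2a_4 - 192a_1a_3a_4^2}{768a_4^3} \\
&\quad + \left( \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2 - 8a_4^3q}{8a_4^2} \right) \left( \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} \right) \\
&\quad + \left( \frac{8a_2a_4 - 3a_3^2 - 8a_4^2p}{8a_4} \right) \left( \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2} + \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2} \right)
\end{aligned}
\]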

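\[
\begin{aligned}
f(x_{ex_1}) &= \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} + \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2} \sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) \right] \\
&\quad + \frac{3a_3^2p - 8a_2a_4p}{6a_4} \cos^2\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) \right] + \frac{16}{9}a_4p^2 \cos^4\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) \right]
\end{aligned}
\]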

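\[
\begin{aligned}
f(x_{ex_2}) &= \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} - \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2} \sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) - 60^\circ \right] \\
&\quad + \frac{3a_3^2p - 8a_2a_4p}{6a_4} \cos^2\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) - 60^\circ \right] + \frac{16}{9}a_4p^2 \cos^4\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) - 60^\circ \right]
\end{aligned}
\]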

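\[
\begin{aligned}
f(x_{ex_3}) &= \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} - \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2} \sqrt{-\frac{p}{3}} \cos\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) + 60^\circ \right] \\
&\quad + \frac{3a_3^2p - 8a_2a_4p}{6a_4} \cos^2\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) + 60^\circ \right] + \frac{16}{9}a_4p^2 \cos^4\left[ \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right) + 60^\circ \right]
\end{aligned}
\]

Proof: Although our intent is to provide a proof using the Minimax Theorem it is interesting to note that this corollary is a direct consequence of the connection between the derivative and the integral (also aptly called the antiderivative). The Fundamental Theorem of the Differential and Integral Calculus finds that

\[ \int_a^b g^{(1)}(t)\,dt = G(b) - G(a) \]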

where $G(t)$ is any function whose first derivative is $g^{(1)}(t)$. Clearly, by definition, $G(t) = g(t)$ is a function whose first derivative is $g^{(1)}(t)$, but so would be $G(t) = g(t) + K$ where $K$ is any constant whatsoever. In fact the set of all antiderivatives of $g^{(1)}(t)$ is $\{g(t) + K \mid K \text{ any number}\}$.

The corollary states that when the first derivative, the cubic $f^{(1)}(x)$ of the given quartic $f(x)$, has one root, this root is the abscissa of the one extremum; when this derivative has two roots (the cubic's flat case) the quartic still has just one extremum, whose abscissa is the root associated with the point where the graph of $f^{(1)}(x)$ actually "cuts through" the $x$-axis; and when $f^{(1)}(x)$ has three roots then the quartic $f(x)$ has three extrema. Extrema are frequently called "turning points." Below are three graphs illustrating these cases:

[Three graphs of $f^{(1)}(x)$, each with a starting point $x_0$ to the left of all roots and the roots labelled A, B, C:]

Case I: $f^{(1)}(x)$ has one root and hence one turning point for $f(x)$ at A.

Case II: $f^{(1)}(x)$ has two roots and still one turning point for $f(x)$ at A only.

Case III: $f^{(1)}(x)$ has three roots and hence three turning points for $f(x)$ at A, B, and C.

Suppose we let "$a$" in the integral be $x_0$, an arbitrary number on the $x$-axis less than any of $f^{(1)}(x)$'s roots. One then has

\[ \int_{x_0}^{x} f^{(1)}(x)\,dx = f(x) - f(x_0). \]

So,

\[ f(x) = \int_{x_0}^{x} f^{(1)}(x)\,dx + f(x_0). \]

$f(x_0)$ is a constant; $\int_{x_0}^{x} f^{(1)}(x)\,dx$ may be thought of as the signed "area" under the curve $f^{(1)}(x)$ as we move from $x_0$ to $x$. As we move from $x_0$ towards A in Case I the area is getting more and more negative, meaning that $f(x)$ is getting smaller and smaller, but immediately after we pass A we start to pick up positive area so $f(x)$ will start to get larger, meaning that A is indeed a turning point or extremum. In this particular diagram it would correspond with a minimum.

In Case II we again find that as we move from $x_0$ to A we are accumulating negative area, meaning that $f(x)$ is declining, but then as we pass A we start accumulating positive area, meaning that $f(x)$ starts to grow, making A a turning point. B is only a stationary point and not a turning point because before B (but after A, of course) we are accumulating positive area, meaning that $f(x)$ is growing, and after B we continue to accumulate positive area so $f(x)$ is still growing. $f(x)$'s growth ceases only momentarily at B itself but $f(x)$ never actually declines, so B is not an extremum.

In Case III as we move from $x_0$ to A we are accumulating negative area, meaning that $f(x)$ is declining, but just after point A positive area starts to accumulate, meaning that $f(x)$ starts to increase, making A a turning point or extremum. Area continues to be added until point B is reached, meaning that $f(x)$ rises non-stop in the interval between A and B, but after point B negative area starts to accumulate, causing $f(x)$ to decline and making B a turning point; $f(x)$ will continue to decline until point C is reached; after point C positive area starts to accumulate, meaning $f(x)$ starts to rise again, making point C also a turning point or extremum.

Now we start the formal proof using the Minimax Theorem. By this theorem we require all abscissas of extrema of $f(x)$ to be roots of $f^{(1)}(x)$. Now $f^{(1)}(x) = 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1$ is a cubic. We will use the previously found cubic formula to find these roots of $f^{(1)}(x)$.

When we apply the cubic formula we must first calculate "$p$" and "$q$" and "$D$." With the cubic formula "$a_i$", $i = 0, 1, 2, 3$, represents the coefficient of $x^i$ in the cubic to be solved. The cubic whose roots we seek is

\[ f^{(1)}(x) = 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1 \]


where the $a_j$'s, $j = 0, 1, 2, 3, 4$, are from the given quartic. So when applying the cubic formula we would have

\[
\begin{aligned}
\text{“}a_3\text{”} &= 4a_4 \\
\text{“}a_2\text{”} &= 3a_3 \\
\text{“}a_1\text{”} &= 2a_2 \\
\text{“}a_0\text{”} &= a_1
\end{aligned}
\]

The indexed $a$'s in quotes on the LHS (left hand side) of the above chain of equalities have no relation to the indexed $a$'s on the RHS (right hand side). So

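\[
\begin{aligned}
p &= \frac{3\,\text{“}a_3\text{”}\,\text{“}a_1\text{”} - \text{“}a_2\text{”}^2}{3\,\text{“}a_3\text{”}^2} \\
&= \frac{3(4a_4)(2a_2) - (3a_3)^2}{3(4a_4)^2} \\
&= \frac{8a_2a_4 - 3a_3^2}{16a_4^2}
\end{aligned}
\]

and

\[
\begin{aligned}
q &= \frac{2\,\text{“}a_2\text{”}^3 - 9\,\text{“}a_1\text{”}\,\text{“}a_2\text{”}\,\text{“}a_3\text{”} + 27\,\text{“}a_3\text{”}^2\,\text{“}a_0\text{”}}{27\,\text{“}a_3\text{”}^3} \\
&= \frac{2(3a_3)^3 - 9(2a_2)(3a_3)(4a_4) + 27(4a_4)^2a_1}{27(4a_4)^3} \\
&= \frac{3a_3^3 - 12a_2a_3a_4 + 24a_4^2a_1}{96a_4^3} \\
&= \frac{a_3^3 - 4a_2a_3a_4 + 8a_4^2a_1}{32a_4^3}
\end{aligned}
\]

and also, as defined in the cubic formula,

\[ D = \frac{q^2}{4} + \frac{p^3}{27}. \]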

We suppose first that $D > 0$. In this case $f^{(1)}(x)$ has exactly the one root

\[ x_{ex} = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} - \frac{a_3}{4a_4} \]

We must show that either $f^{(2)}(x_{ex}) \neq 0$ or that if $f^{(2)}(x_{ex})$ is zero then $f^{(3)}(x_{ex})$ is also zero ($f^{(4)}(x_{ex}) = 24a_4 \neq 0$ and has the same sign as $a_4$). If $f^{(2)}(x_{ex}) \neq 0$ we must further show that $f^{(2)}(x_{ex})$ has the same sign as $a_4$, which would make $x_{ex}$ the abscissa of a local maximum when $a_4 < 0$ and a local minimum when $a_4 > 0$. We start by calculating the various derivatives:

\[
\begin{aligned}
f^{(1)}(x) &= 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1 \\
f^{(2)}(x) &= 12a_4x^2 + 6a_3x + 2a_2 \\
f^{(3)}(x) &= 24a_4x + 6a_3 \\
f^{(4)}(x) &= 24a_4.
\end{aligned}
\]


We now evaluate the second derivative at x = xex, the one and only root of the first derivative:

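Writing $A = \sqrt[3]{-\frac{q}{2} + \sqrt{D}}$ and $B = \sqrt[3]{-\frac{q}{2} - \sqrt{D}}$ for brevity, so that $x_{ex} = A + B - \frac{a_3}{4a_4}$ and $AB = \sqrt[3]{\frac{q^2}{4} - D} = \sqrt[3]{-\frac{p^3}{27}} = -\frac{p}{3}$, we have

\[
\begin{aligned}
f^{(2)}(x_{ex}) &= 12a_4\left(A + B - \frac{a_3}{4a_4}\right)^2 + 6a_3\left(A + B - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(A^2 + B^2 + \frac{a_3^2}{16a_4^2} - \frac{2p}{3} - \frac{a_3}{2a_4}(A + B)\right) + 6a_3\left(A + B - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(A^2 + B^2\right) + \frac{3a_3^2}{4a_4} - 8a_4p - 6a_3(A + B) + 6a_3(A + B) - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\left(A^2 + B^2\right) + 2a_2 - \frac{3a_3^2}{4a_4} - 8a_4p \\
&= 12a_4\left(A^2 + B^2\right) + \frac{8a_2a_4 - 3a_3^2}{4a_4} - 8a_4p \\
&= 12a_4\left(A^2 + B^2\right) + 4a_4p - 8a_4p \\
&= 12a_4\left(A^2 + B^2\right) - 4a_4p \\
&= 12a_4\left(\left(A^2 + B^2\right) - \frac{p}{3}\right)
\end{aligned}
\]

where $A^2 = \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2}$ and $B^2 = \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2}$. So we have for this $D > 0$ case: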

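\[ f^{(2)}(x_{ex}) = 12a_4\left[\left(\sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2} + \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2}\right) - \frac{p}{3}\right], \]

\[
\begin{aligned}
f^{(3)}(x_{ex}) &= 24a_4\left(\sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} - \frac{a_3}{4a_4}\right) + 6a_3 \\
&= 24a_4\left(\sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}}\right) - 6a_3 + 6a_3 \\
&= 24a_4\left(\sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}}\right),
\end{aligned}
\]

and

\[ f^{(4)}(x_{ex}) = 24a_4. \]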
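As a numerical sanity check of the D > 0 case, the sketch below evaluates the root x_ex and compares the bracketed closed form for the second derivative with a direct evaluation of 12a4x² + 6a3x + 2a2. The helper names and the example quartic are assumptions chosen for illustration only.

```python
import math

def cbrt(v):
    """Real cube root, valid for negative v as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def single_extremum(a4, a3, a2, a1):
    """Return (x_ex, f2_closed, f2_direct) when D > 0."""
    p = (8*a2*a4 - 3*a3**2) / (16*a4**2)
    q = (a3**3 - 4*a2*a3*a4 + 8*a4**2*a1) / (32*a4**3)
    D = q*q/4 + p**3/27
    if D <= 0:
        raise ValueError("this sketch only covers D > 0")
    A, B = -q/2 + math.sqrt(D), -q/2 - math.sqrt(D)
    x_ex = cbrt(A) + cbrt(B) - a3/(4*a4)
    # Closed form: 12*a4*[(cbrt(A^2) + cbrt(B^2)) - p/3]
    f2_closed = 12*a4*(cbrt(A*A) + cbrt(B*B) - p/3)
    f2_direct = 12*a4*x_ex**2 + 6*a3*x_ex + 2*a2
    return x_ex, f2_closed, f2_direct

# Hypothetical example: f(x) = x^4 + 2x^3 + 3x^2 + 4x + 5,
# whose derivative 4x^3 + 6x^2 + 6x + 4 vanishes at x = -1.
x_ex, f2_closed, f2_direct = single_extremum(1, 2, 3, 4)
f1 = 4*x_ex**3 + 6*x_ex**2 + 6*x_ex + 4   # f'(x_ex), should be close to zero
```

For this example a4 > 0, so the second derivative at x_ex should come out positive, in line with the claim that it shares the sign of a4.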

We are looking at the case $D > 0$. If $p < 0$ then by inspection one sees that the bracketed factor in the most recent expression for $f^{(2)}(x_{ex})$ would be positive, resulting in $f^{(2)}(x_{ex})$ having the same sign as $a_4$, as is to be shown. Suppose $p = 0$; then for $D > 0$ we would require $q \neq 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27} = \frac{q^2}{4}$ and $\sqrt{D} = \frac{|q|}{2}$, which means exactly one of $-\frac{q}{2} + \sqrt{D}$ and $-\frac{q}{2} - \sqrt{D}$ will be nonzero and hence the bracketed expression would once again be positive. Finally, suppose $p > 0$. We must show that in this case

\[ \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2} + \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2} > \frac{p}{3} \]

as well.

On the LHS we have two positive addends. If $q > 0$ the second addend would be the larger and we show that it alone is larger than $\frac{p}{3}$, making the sum a fortiori larger than $\frac{p}{3}$ (remember $p > 0$ in the case presently being considered).

\[ \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2} > \frac{p}{3} \]

if and only if

\[ \left(-\frac{q}{2} - \sqrt{D}\right)^2 > \frac{p^3}{27} \]

if and only if

\[ \frac{q^2}{4} + \frac{q^2}{4} + \frac{p^3}{27} + q\sqrt{D} > \frac{p^3}{27} \]

which is clearly true since we have on the LHS $\frac{p^3}{27}$ itself plus positive addends. Note that if $q$ were zero this sum would equal $\frac{p^3}{27}$, which would make $\sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2} = \frac{p}{3}$ itself, but the other addend in the original expression would also be $\frac{p}{3}$, making the sum $\frac{2p}{3}$, which is greater than $\frac{p}{3}$.

Now if $q < 0$ the first addend of the original expression would be the larger and it alone, as we show, would be larger than $\frac{p}{3}$ (remember $p > 0$):

\[ \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2} > \frac{p}{3} \]

if and only if

\[ \left(-\frac{q}{2} + \sqrt{D}\right)^2 > \frac{p^3}{27} \]

if and only if

\[ \frac{q^2}{4} + \frac{q^2}{4} + \frac{p^3}{27} - q\sqrt{D} > \frac{p^3}{27} \]

if and only if

\[ \frac{p^3}{27} + \left[\frac{q^2}{2} - q\sqrt{D}\right] > \frac{p^3}{27} \]

which is true since $q < 0$ makes the bracketed expression positive, completing the case for $D > 0$.

Suppose now $D = 0$. By the cubic formula, if $p = q = 0$ we again have only the one root $x_{ex}$ whose expression was given above. So the expression for $f^{(2)}(x_{ex})$ given above would also apply. We find that when $p = q = 0$, $f^{(2)}(x_{ex}) = 0$ but $f^{(3)}(x_{ex})$ is also zero, leaving us with $f^{(4)}(x_{ex}) = 24a_4 \neq 0$, which has the same sign as $a_4$. This completes the proof for the case when $D = 0$ and $p = q = 0$, and hence the cases when $f^{(1)}(x)$ has exactly one root.


We continue with the case when $D = 0$. If $D = 0$ and $q = 0$ then $p$ would also have to be zero, which would be the (one root) case just covered. So we now look at $D = 0$ and $q \neq 0$; we must then have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. In this case ($D = 0$, $q \neq 0$, $p < 0$) the two roots of $f^{(1)}(x)$ are

\[ x'_{ex} = \sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4} \quad\text{and}\quad x_{ex} = -2\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}. \]

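Note that since $D = 0$ the second root above is the same as $x_{ex}$. First we must show that the first root, $x'_{ex}$, above is not an extremum.

We have

\[
\begin{aligned}
f^{(2)}(x'_{ex}) &= f^{(2)}\left(\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}\right) \\
&= 12a_4\left(\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}\right)^2 + 6a_3\left(\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(\sqrt[3]{\frac{q^2}{4}} - \frac{a_3}{2a_4}\sqrt[3]{\frac{q}{2}} + \frac{a_3^2}{16a_4^2}\right) + 6a_3\sqrt[3]{\frac{q}{2}} - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\sqrt[3]{\frac{q^2}{4}} - 6a_3\sqrt[3]{\frac{q}{2}} + \frac{3a_3^2}{4a_4} + 6a_3\sqrt[3]{\frac{q}{2}} - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\sqrt[3]{\frac{q^2}{4}} - \frac{3a_3^2}{4a_4} + 2a_2
\end{aligned}
\]

but $D = 0$ means that $\frac{q^2}{4} = -\frac{p^3}{27}$, so

\[
\begin{aligned}
f^{(2)}(x'_{ex}) &= 12a_4\left(-\frac{p}{3}\right) + \frac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= 12a_4\left(-\frac{p}{3}\right) + 4a_4p, \quad\text{since } p = \frac{8a_2a_4 - 3a_3^2}{16a_4^2} \\
&= -4a_4p + 4a_4p \\
&= 0
\end{aligned}
\]

but

\[
\begin{aligned}
f^{(3)}(x'_{ex}) &= 24a_4\left(\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}\right) + 6a_3 \\
&= 24a_4\sqrt[3]{\frac{q}{2}} - 6a_3 + 6a_3 \\
&= 24a_4\sqrt[3]{\frac{q}{2}} \neq 0, \quad\text{since } q \neq 0.
\end{aligned}
\]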

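Therefore $x'_{ex} = \sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}$ cannot be an extremum. For the second root, $x_{ex} = -2\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}$, we have

\[
\begin{aligned}
f^{(2)}(x_{ex}) &= f^{(2)}\left(-2\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}\right) \\
&= 12a_4\left[\left(\sqrt[3]{\frac{q^2}{4}} + \sqrt[3]{\frac{q^2}{4}}\right) - \frac{p}{3}\right] \\
&= 12a_4\left[2\sqrt[3]{\frac{q^2}{4}} - \frac{p}{3}\right].
\end{aligned}
\]

Since $D = 0$, we have $\frac{q^2}{4} = -\frac{p^3}{27}$, so

\[ f^{(2)}(x_{ex}) = 12a_4(-p). \]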

Since $p < 0$, $f^{(2)}(x_{ex})$ is nonzero and has the same sign as $a_4$, as we wished to show, thus completing all the cases stemming from $D \geq 0$.
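The D = 0, q ≠ 0 case can be illustrated with a quartic whose derivative has a double root. The sketch below is only an illustration under assumed names and an assumed example; it computes both roots from the formulas above and evaluates the second derivative at each.

```python
import math

def cbrt(v):
    """Real cube root, valid for negative v as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def double_root_case(a4, a3, a2, a1):
    """For D = 0 with q != 0, return (x_prime, x_ex, f''(x_prime), f''(x_ex))."""
    p = (8*a2*a4 - 3*a3**2) / (16*a4**2)
    q = (a3**3 - 4*a2*a3*a4 + 8*a4**2*a1) / (32*a4**3)
    D = q*q/4 + p**3/27
    if abs(D) > 1e-12 or q == 0:
        raise ValueError("this sketch only covers D = 0 with q != 0")
    shift = a3 / (4*a4)
    x_prime = cbrt(q/2) - shift        # double root of f', not an extremum
    x_ex = -2*cbrt(q/2) - shift        # simple root of f', the extremum
    f2 = lambda x: 12*a4*x*x + 6*a3*x + 2*a2
    return x_prime, x_ex, f2(x_prime), f2(x_ex)

# Hypothetical example: f(x) = x^4 - 6x^2 + 8x, so f'(x) = 4(x - 1)^2 (x + 2)
x_prime, x_ex, f2_prime, f2_ex = double_root_case(1, 0, -6, 8)
```

Here p = -3, so the formula 12a4(-p) predicts a second derivative of 36 at the extremum, while the second derivative at the double root should vanish.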

Now for the cases when $D < 0$. The corollary asserts in this case that for $x_{ex_1}$

(i) $x_{ex_1}$ is the abscissa of an extremum
(ii) $x_{ex_1}$ is the abscissa of a local maximum when $a_4 < 0$
(iii) $x_{ex_1}$ is the abscissa of a local minimum when $a_4 > 0$.

Since $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ we are given that $f^{(1)}(x_{ex_1}) = 0$. To prove that $x_{ex_1}$ is the abscissa of an extremum we must show that $f^{(2)}(x_{ex_1}) \neq 0$ or that if $f^{(2)}(x_{ex_1}) = 0$ then $f^{(3)}(x_{ex_1}) = 0$ as well (since we know that $f^{(4)}(x_{ex_1}) = 24a_4 \neq 0$). We start, therefore, by calculating $f^{(2)}(x_{ex_1})$ where $x_{ex_1}$ is given as

\[ x_{ex_1} = -2\sqrt{-\frac{p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right]\right] - \frac{a_3}{4a_4} \]

and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:

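Writing $\theta = \frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right]$ for brevity, so that $x_{ex_1} = -2\sqrt{-\frac{p}{3}}\cos\theta - \frac{a_3}{4a_4}$, we have

\[
\begin{aligned}
f^{(2)}(x_{ex_1}) &= 12a_4\left(-2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right)^2 + 6a_3\left(-2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\frac{4p}{3}\cos^2\theta + \frac{a_3^2}{16a_4^2} + \frac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\theta\right) - 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{3a_3^2}{4a_4} + 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta - \frac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\theta + 4a_4p \\
&= 4a_4\left[p\left(1 - 4\cos^2\left[\frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right]\right]\right)\right]
\end{aligned}
\]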

If $f^{(2)}(x_{ex_1}) < 0$ then $x_{ex_1}$ would be the abscissa of a local maximum by the Minimax Theorem. The present corollary asserts $x_{ex_1}$ would be the abscissa of a local maximum provided $a_4 < 0$.

Also if $f^{(2)}(x_{ex_1}) > 0$ then $x_{ex_1}$ would be the abscissa of a local minimum by the Minimax Theorem. The present corollary asserts $x_{ex_1}$ would be the abscissa of a local minimum provided $a_4 > 0$.

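The two statements above require us to show that $a_4$ and $f^{(2)}(x_{ex_1})$ have the same sign. For this to be true we require the expression $p\left(1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right)\right)$ to be positive. For $D$ to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are therefore reduced to showing that whenever $D < 0$

\[ 1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right) < 0 \]

or that

\[ \cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right) > \frac{1}{4} \]

or that

\[ \left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right)\right| > \frac{1}{2}. \]

Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive, $\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)$ must be an angle between $0^\circ$ and $60^\circ$, inclusive. For all $\theta \in [0^\circ, 60^\circ)$ we have $\cos\theta > \cos 60^\circ = \frac{1}{2}$, so the only angle in the range $[0^\circ, 60^\circ]$ for which $\left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right)\right| > \frac{1}{2}$ might not apply is $\theta = 60^\circ$, for which we would have “equality” instead of “greater than.” For $\theta$ to equal $60^\circ$ we would require $\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)$ to equal $180^\circ$. This would happen only when $\frac{3q\sqrt{-3p}}{2p^2} = -1$, or $2p^2 = -3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or $4p^3 = -27q^2$, or $4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is not possible since $D < 0$. So $\theta$ can never equal $60^\circ$, meaning that $\left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)\right)\right|$ will, as required, always exceed $\frac{1}{2}$.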

The corollary asserts in this case ($D < 0$) that for $x_{ex_2}$

(i) $x_{ex_2}$ is the abscissa of an extremum
(ii) $x_{ex_2}$ is the abscissa of a local maximum when $a_4 < 0$
(iii) $x_{ex_2}$ is the abscissa of a local minimum when $a_4 > 0$.

Since, as mentioned earlier, $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ we are given that $f^{(1)}(x_{ex_2}) = 0$. To prove that $x_{ex_2}$ is the abscissa of an extremum we must show that $f^{(2)}(x_{ex_2}) \neq 0$ or that if $f^{(2)}(x_{ex_2}) = 0$ then $f^{(3)}(x_{ex_2}) = 0$ as well (since we know that $f^{(4)}(x_{ex_2}) = 24a_4 \neq 0$). We start, therefore, as earlier by calculating $f^{(2)}(x_{ex_2})$ where $x_{ex_2}$ is given as

\[ x_{ex_2} = 2\sqrt{-\frac{p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \frac{a_3}{4a_4} \]

and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:

Writing $\theta = \frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ$ for brevity, so that $x_{ex_2} = 2\sqrt{-\frac{p}{3}}\cos\theta - \frac{a_3}{4a_4}$, we have

\[
\begin{aligned}
f^{(2)}(x_{ex_2}) &= 12a_4\left(2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right)^2 + 6a_3\left(2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\frac{4p}{3}\cos^2\theta + \frac{a_3^2}{16a_4^2} - \frac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\theta\right) + 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{3a_3^2}{4a_4} - 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta + 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta - \frac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\theta + 4a_4p \\
&= 4a_4\left(p\left(1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right)\right)
\end{aligned}
\]

If $f^{(2)}(x_{ex_2}) < 0$ then $x_{ex_2}$ would be the abscissa of a local maximum by the Minimax Theorem. The present corollary asserts $x_{ex_2}$ would be the abscissa of a local maximum provided $a_4 < 0$.

Also if $f^{(2)}(x_{ex_2}) > 0$ then $x_{ex_2}$ would be the abscissa of a local minimum by the Minimax Theorem. The present corollary asserts $x_{ex_2}$ would be the abscissa of a local minimum provided $a_4 > 0$.

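The two statements above again require us to show that $a_4$ and $f^{(2)}(x_{ex_2})$ have the same sign. For this to be true we require the expression $p\left(1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right)$ to be positive. For $D$ to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are therefore again reduced to showing that whenever $D < 0$

\[ 1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right) < 0 \]

or that

\[ \cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right) > \frac{1}{4} \]

or that

\[ \left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right| > \frac{1}{2}. \]

Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive, $\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ$ must be an angle between $-60^\circ$ and $0^\circ$, inclusive. For all $\theta \in (-60^\circ, 0^\circ]$ we have $\cos\theta > \cos(-60^\circ) = \frac{1}{2}$, so the only angle in the range $[-60^\circ, 0^\circ]$ for which $\left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right| > \frac{1}{2}$ might not apply is $\theta = -60^\circ$, for which we would have “equality” instead of “greater than.” For $\theta$ to equal $-60^\circ$ we would require $\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right)$ to equal $0^\circ$. This would happen only when $\frac{3q\sqrt{-3p}}{2p^2} = 1$, or $2p^2 = 3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or $4p^3 = -27q^2$, or $4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is as before not possible since $D < 0$. So $\theta$ can never equal $-60^\circ$, meaning that $\left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right|$ will, as required, always exceed $\frac{1}{2}$.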

Now for $x_{ex_3}$ ($D < 0$). The corollary asserts that

(i) $x_{ex_3}$ is the abscissa of an extremum
(ii) $x_{ex_3}$ is the abscissa of a local minimum when $a_4 < 0$
(iii) $x_{ex_3}$ is the abscissa of a local maximum when $a_4 > 0$.

Once again $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ so we are given that $f^{(1)}(x_{ex_3}) = 0$. To prove that $x_{ex_3}$ is the abscissa of an extremum we must show that $f^{(2)}(x_{ex_3}) \neq 0$ or that if $f^{(2)}(x_{ex_3}) = 0$ then $f^{(3)}(x_{ex_3}) = 0$ as well (since we know that $f^{(4)}(x_{ex_3}) = 24a_4 \neq 0$). We start, as earlier, by calculating $f^{(2)}(x_{ex_3})$ where $x_{ex_3}$ is given as

\[ x_{ex_3} = 2\sqrt{-\frac{p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \frac{a_3}{4a_4} \]

and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:

Writing $\theta = \frac{1}{3}\arccos\left[\frac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ$ for brevity, so that $x_{ex_3} = 2\sqrt{-\frac{p}{3}}\cos\theta - \frac{a_3}{4a_4}$, we have

\[
\begin{aligned}
f^{(2)}(x_{ex_3}) &= 12a_4\left(2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right)^2 + 6a_3\left(2\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\frac{4p}{3}\cos^2\theta + \frac{a_3^2}{16a_4^2} - \frac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\theta\right) + 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{3a_3^2}{4a_4} - 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta + 12a_3\sqrt{-\tfrac{p}{3}}\cos\theta - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta - \frac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\theta + \frac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\theta + 4a_4p \\
&= 4a_4\left(p\left(1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right)\right)
\end{aligned}
\]

If $f^{(2)}(x_{ex_3}) < 0$ then $x_{ex_3}$ would be the abscissa of a local maximum by the Minimax Theorem. The present corollary asserts $x_{ex_3}$ would be the abscissa of a local maximum provided $a_4 > 0$.

Also if $f^{(2)}(x_{ex_3}) > 0$ then $x_{ex_3}$ would be the abscissa of a local minimum by the Minimax Theorem. The present corollary asserts $x_{ex_3}$ would be the abscissa of a local minimum provided $a_4 < 0$.

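The two statements above this time require us to show that $a_4$ and $f^{(2)}(x_{ex_3})$ are of opposite sign. For this to be true we require the expression $p\left(1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right)$ to be negative. For $D$ to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are therefore now reduced to showing that whenever $D < 0$

\[ 1 - 4\cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right) > 0 \]

or that

\[ \cos^2\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right) < \frac{1}{4} \]

or that

\[ \left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right| < \frac{1}{2}. \]

Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive, $\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ$ must be an angle between $60^\circ$ and $120^\circ$, inclusive. For all $\theta \in (60^\circ, 120^\circ)$ we have $|\cos\theta| < \frac{1}{2}$. Now $\cos 60^\circ = \frac{1}{2}$ and $\cos 120^\circ = -\frac{1}{2}$, so $|\cos 60^\circ| = |\cos 120^\circ| = \frac{1}{2}$. The two angles in the range $[60^\circ, 120^\circ]$ for which $\left|\cos\left(\frac{1}{3}\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right| < \frac{1}{2}$ might not apply are $\theta_1 = 60^\circ$ and $\theta_2 = 120^\circ$. $\theta_1 = 60^\circ$ would require $\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) = 0^\circ$. This would happen only when $\frac{3q\sqrt{-3p}}{2p^2} = 1$, or $2p^2 = 3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or $4p^3 = -27q^2$, or $4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is as before not possible since $D < 0$, so $\theta_1$ can never equal $60^\circ$. $\theta_2 = 120^\circ$ would require $\arccos\left(\frac{3q\sqrt{-3p}}{2p^2}\right) = 180^\circ$. This would happen only when $\frac{3q\sqrt{-3p}}{2p^2} = -1$, which would lead to $D = 0$, which is also not possible in this case, and thus completes the proof of this corollary.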
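The three D < 0 cases can be explored numerically. The sketch below is an illustration under assumed names and an assumed example quartic; it evaluates the three trigonometric root expressions (with 60° written as π/3 radians) and the second derivative at each, so the signs can be compared with the corollary.

```python
import math

def three_extrema(a4, a3, a2, a1):
    """For D < 0, return [(x, f''(x))] for x_ex1, x_ex2, x_ex3, in that order."""
    p = (8*a2*a4 - 3*a3**2) / (16*a4**2)
    q = (a3**3 - 4*a2*a3*a4 + 8*a4**2*a1) / (32*a4**3)
    if q*q/4 + p**3/27 >= 0:
        raise ValueError("this sketch only covers D < 0")
    arg = 3*q*math.sqrt(-3*p) / (2*p*p)
    arg = max(-1.0, min(1.0, arg))      # guard against rounding just outside [-1, 1]
    third = math.acos(arg) / 3          # lies in [0, pi/3], i.e. 0 to 60 degrees
    m = math.sqrt(-p/3)
    shift = a3 / (4*a4)
    sixty = math.pi / 3
    xs = [-2*m*math.cos(third) - shift,
           2*m*math.cos(third - sixty) - shift,
           2*m*math.cos(third + sixty) - shift]
    return [(x, 12*a4*x*x + 6*a3*x + 2*a2) for x in xs]

# Hypothetical example: f'(x) = 4(x + 1)(x - 2)(x - 3),
# which corresponds to a4 = 1, a3 = -16/3, a2 = 2, a1 = 24.
result = three_extrema(1, -16/3, 2, 24)
```

With a4 > 0 in this example, the first two entries should show a positive second derivative (local minima) and the third a negative one (local maximum).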

QUARTIC'S BARE FORM LEMMA: Suppose $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$ with $a_4 \neq 0$. In sliding $f(x)$ to the right by $\frac{a_3}{4a_4}$ units we obtain a second quartic we shall call $g(x)$ which has the same shape as $f(x)$, and $f$'s roots are $\frac{a_3}{4a_4}$ units to the left of those of $g$. We find

\[ g(x) = a_4\,t(x) \quad\text{where}\quad t(x) = x^4 + px^2 + qx + r \]

with

\[ p = \frac{8a_2a_4 - 3a_3^2}{8a_4^2}, \qquad q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^3}, \qquad r = \frac{16a_2a_3^2a_4 - 3a_3^4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^4} \]

and if $r_{t_i}$ is a root of $t(x)$ then $r_{g_i} = r_{t_i}$ is also a root of $g(x)$ and

\[ r_{f_i} = r_{t_i} - \frac{a_3}{4a_4} \]

is a root of $f(x)$. Since $f(x)$ is a polynomial of degree 4, $i \in \{1, 2, 3, 4\}$. We will call $t(x)$ the bare form of $f(x)$.

Proof: In the HORIZONTAL TRANSLATION to the right FUNCTION we found that

\[
\begin{aligned}
g(x) = {} & a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 \\
& + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0),
\end{aligned}
\]

where $h$ is the number of units $f$ was slid to the right. Substituting $h = \frac{a_3}{4a_4}$ into the above expression, with the purpose of eliminating the $x^3$-term, we find after simplification

\[ g(x) = a_4x^4 + \frac{8a_2a_4 - 3a_3^2}{8a_4}x^2 + \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^2}x + \frac{-3a_3^4 + 16a_2a_3^2a_4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^3} \]

from which we can see that $g(x) = a_4 \times t(x)$ where $t(x)$, $p$, $q$, $r$ are as defined above. Since $a_4 \neq 0$ we can see that the roots of $t(x)$ are identical to the roots of $g(x)$.
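The Lemma's formulas for p, q and r lend themselves to an exact check. The following is a minimal sketch (function names and sample coefficients are assumptions, not from the text) that computes the bare-form coefficients with rational arithmetic and verifies f(x − h) = a4·t(x) at several points, which is enough to pin down a degree-4 identity.

```python
from fractions import Fraction as F

def bare_form(a4, a3, a2, a1, a0):
    """Return (p, q, r) with f(x - a3/(4*a4)) = a4 * (x^4 + p*x^2 + q*x + r)."""
    a4, a3, a2, a1, a0 = map(F, (a4, a3, a2, a1, a0))
    p = (8*a2*a4 - 3*a3**2) / (8*a4**2)
    q = (a3**3 - 4*a2*a3*a4 + 8*a1*a4**2) / (8*a4**3)
    r = (16*a2*a3**2*a4 - 3*a3**4 - 64*a1*a3*a4**2 + 256*a0*a4**3) / (256*a4**4)
    return p, q, r

def check_bare_form(a4, a3, a2, a1, a0, points=range(-3, 4)):
    """Exact check of g(x) = f(x - h) = a4 * t(x) at a few sample points."""
    p, q, r = bare_form(a4, a3, a2, a1, a0)
    h = F(a3) / (4*F(a4))
    f = lambda x: a4*x**4 + a3*x**3 + a2*x**2 + a1*x + a0
    t = lambda x: x**4 + p*x**2 + q*x + r
    return all(f(F(x) - h) == a4*t(F(x)) for x in points)

# Hypothetical sample: f(x) = 2x^4 + 8x^3 + 16x^2 + 18x + 12
p, q, r = bare_form(2, 8, 16, 18, 12)
ok = check_bare_form(2, 8, 16, 18, 12)
```

Seven sample points are used because two quartics that agree at five or more points are identical.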

QUARTIC FACTORIZABILITY LEMMA: Suppose $t(x)$ is defined as in the Quartic's Bare Form Lemma. We find that $t(x)$ may always be factored as the product of two quadratics $Q_1(x) \times Q_2(x)$ where

\[ Q_1(x) = x^2 + ux + s \quad\text{and}\quad Q_2(x) = x^2 - ux + t \]

in at least one way and in no more than three ways, where $Q_1(x) \times Q_2(x) = Q_2(x) \times Q_1(x)$ is viewed as the same factorization.

Proof: If we could effect this factorization we could at once determine the roots, if any, of $t(x)$ by simply finding the roots of the quadratics $Q_1(x)$ and $Q_2(x)$, and then by subtracting $\frac{a_3}{4a_4}$ from these roots we would have the roots of the original quartic $f(x)$.


(In developing the forms for $Q_1$ and $Q_2$ in this Lemma we initially assumed them to be of the form $x^2 + ux + s$ and $x^2 + vx + t$ but immediately found $v = -u$ since $t(x)$ has no $x^3$-term.)

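So we want

\[
\begin{aligned}
Q_1(x) \times Q_2(x) &\equiv t(x) \quad\text{or} \\
(x^2 + ux + s)(x^2 - ux + t) &\equiv x^4 + px^2 + qx + r \quad\text{or} \\
x^4 + (s + t - u^2)x^2 + (ut - us)x + st &\equiv x^4 + px^2 + qx + r
\end{aligned}
\]

which means we must be able to find numbers $u, s, t$ which solve the following system of three equations in these three unknowns:

\[
\begin{aligned}
s + t - u^2 &= p \\
ut - us &= q \\
st &= r
\end{aligned}
\]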
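The correspondence between the product of the two quadratics and the three equations can be checked by multiplying coefficient lists. This is a small illustrative sketch; the helper names and the sample values of u, s, t are assumptions.

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def product_of_factors(u, s, t):
    """Coefficients of (x^2 + u x + s)(x^2 - u x + t), highest degree first."""
    return poly_mul([1, u, s], [1, -u, t])

def system_lhs(u, s, t):
    """Left-hand sides of the three equations: (s + t - u^2, u t - u s, s t)."""
    return s + t - u*u, u*t - u*s, s*t

# Hypothetical values u = 1, s = 1, t = 2
coeffs = product_of_factors(1, 1, 2)
p, q, r = system_lhs(1, 1, 2)
```

The product always has a zero cubic coefficient, and its remaining coefficients are exactly the three left-hand sides, which is why solving the system is equivalent to factoring t(x).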

There are several ways of trying to solve this system. Eliminating $s$ and $u$, or $t$ and $u$, first seems to lead to excessive complexity, so we start here by first eliminating $s$ and $t$, ending up with a polynomial in “u” to solve. We'll start with the case when neither $r$ nor $q$ is zero, i.e. $rq \neq 0$. From the third equation we find

\[ s = \frac{r}{t}; \]

note that since $r \neq 0$ neither $s$ nor $t$ can be zero. Substituting this result into the second equation we find

\[ ut - \frac{ur}{t} = q \quad\text{or}\quad ut^2 - qt - ur = 0, \quad\text{with } u \neq 0 \text{ since } q \neq 0. \]

Solving this quadratic in $t$ for $t$ we have at most the two solutions:

\[ t_1 = \frac{q + \sqrt{q^2 + 4u^2r}}{2u} \quad\text{and}\quad t_2 = \frac{q - \sqrt{q^2 + 4u^2r}}{2u}, \]

where we will be showing the existence of at least one “u” making the radicand non-negative.

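Using $s = \frac{r}{t}$ we find

\[
\begin{aligned}
s_1 = \frac{r}{t_1} &= \frac{r}{\left(\frac{q + \sqrt{q^2 + 4u^2r}}{2u}\right)} = \frac{2ur}{q + \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur}{q + \sqrt{q^2 + 4u^2r}} \times \frac{q - \sqrt{q^2 + 4u^2r}}{q - \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur\left(q - \sqrt{q^2 + 4u^2r}\right)}{q^2 - (q^2 + 4u^2r)} \\
&= \frac{2ur\left(q - \sqrt{q^2 + 4u^2r}\right)}{-4u^2r} \\
&= \frac{-q + \sqrt{q^2 + 4u^2r}}{2u}
\end{aligned}
\]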

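and

\[
\begin{aligned}
s_2 = \frac{r}{t_2} &= \frac{r}{\left(\frac{q - \sqrt{q^2 + 4u^2r}}{2u}\right)} = \frac{2ur}{q - \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur}{q - \sqrt{q^2 + 4u^2r}} \times \frac{q + \sqrt{q^2 + 4u^2r}}{q + \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur\left(q + \sqrt{q^2 + 4u^2r}\right)}{q^2 - (q^2 + 4u^2r)} \\
&= \frac{2ur\left(q + \sqrt{q^2 + 4u^2r}\right)}{-4u^2r} \\
&= \frac{-q - \sqrt{q^2 + 4u^2r}}{2u}
\end{aligned}
\]

Summarizing:

\[ s_1 = \frac{-q + \sqrt{q^2 + 4u^2r}}{2u}, \qquad t_1 = \frac{q + \sqrt{q^2 + 4u^2r}}{2u} \]

and

\[ s_2 = \frac{-q - \sqrt{q^2 + 4u^2r}}{2u}, \qquad t_2 = \frac{q - \sqrt{q^2 + 4u^2r}}{2u}. \]

So

\[ s_1 + t_1 = \frac{\sqrt{q^2 + 4u^2r}}{u}, \qquad s_2 + t_2 = -\frac{\sqrt{q^2 + 4u^2r}}{u}. \]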

We now plug “$s_i + t_i$”, $i = 1, 2$, into the first equation in the system to obtain the polynomial in “u.” We index “u” with 1 or 2 to remind us whether it was derived from $s_1 + t_1$ or $s_2 + t_2$; so first for $i = 1$:

\[ \frac{\sqrt{q^2 + 4u_1^2r}}{u_1} - u_1^2 = p \]

\[ \frac{\sqrt{q^2 + 4u_1^2r}}{u_1} = p + u_1^2 \quad\text{and squaring both sides} \]

\[ \frac{q^2 + 4u_1^2r}{u_1^2} = p^2 + 2pu_1^2 + u_1^4 \]

Note that in squaring both sides we introduced an extraneous root. If $u_{1_1}$ solves the above equation then so does $-u_{1_1}$, but only one of $\{u_{1_1}, -u_{1_1}\}$ can solve the original equation. We continue by multiplying through by $u_1^2$:

\[ q^2 + 4u_1^2r = p^2u_1^2 + 2pu_1^4 + u_1^6 \]

\[ u_1^6 + 2pu_1^4 + (p^2 - 4r)u_1^2 - q^2 = 0 \]

which is a polynomial of degree six with at most six roots and hence at most three roots which satisfy the original equation and three extraneous roots.


Now for $i = 2$:

\[ -\frac{\sqrt{q^2 + 4u_2^2r}}{u_2} - u_2^2 = p \]

\[ -\frac{\sqrt{q^2 + 4u_2^2r}}{u_2} = p + u_2^2 \quad\text{and squaring both sides} \]

\[ \frac{q^2 + 4u_2^2r}{u_2^2} = p^2 + 2pu_2^2 + u_2^4 \]

Note that in squaring both sides we again introduced an extraneous root. If $u_{2_1}$ solves the above equation then so does $-u_{2_1}$, but only one of $\{u_{2_1}, -u_{2_1}\}$ can solve the original equation. We continue by multiplying through by $u_2^2$:

\[ q^2 + 4u_2^2r = p^2u_2^2 + 2pu_2^4 + u_2^6 \]

\[ u_2^6 + 2pu_2^4 + (p^2 - 4r)u_2^2 - q^2 = 0 \]

which is not only a polynomial of degree six but the same polynomial as found above, with at most six roots and hence at most three roots which satisfy the original equation and three extraneous roots. Note that if $u^*$ solves the original “$i = 1$” equation then $-u^*$ solves the original “$i = 2$” equation; so if $u_{1_1}$, $u_{1_2}$, and $u_{1_3}$ are the (at most) three solutions of the “$i = 1$” equation then $-u_{1_1}$, $-u_{1_2}$, and $-u_{1_3}$ are the three solutions of the “$i = 2$” equation. Now $s_1, t_1, u^*$ implies the factorization

\[ (x^2 + u^*x + s_1) \times (x^2 - u^*x + t_1) \]

which by the commutative property leads to the factorization

\[ (x^2 + (-u^*)x + t_1) \times (x^2 - (-u^*)x + s_1). \]

Of course we don't consider this to be a different factorization but it does account for another “$s, t, u$” solution, namely $t_1, s_1, -u^*$. So our solution set

\[ \{(s_1, t_1, u_{1_1}), (s_1, t_1, u_{1_2}), (s_1, t_1, u_{1_3}), (s_2, t_2, u_{2_1}), (s_2, t_2, u_{2_2}), (s_2, t_2, u_{2_3})\} \]

is now reduced to a maximum number of three triples leading to distinct factorizations:

\[ \{(s_1, t_1, u_{1_1}), (s_1, t_1, u_{1_2}), (s_1, t_1, u_{1_3})\}, \]

so we need only solve the “$i = 1$” case.

We must try to find the roots of

\[ u_1^6 + 2pu_1^4 + (p^2 - 4r)u_1^2 - q^2 = 0. \]

As can be seen directly this is a cubic in “$u_1^2$” so letting $U = u_1^2$ we now have

\[ U^3 + 2pU^2 + (p^2 - 4r)U - q^2 = 0 \]

to solve for $U$. This is sometimes called the cubic resolvent of the quartic.

As the above is a cubic we are guaranteed at least one solution, but this guarantee is not sufficient for our purposes. Since $U = u_1^2$ we must be able to find a $U$ which is not only non-negative, since $u_1^2$ cannot be negative, but also makes $q^2 + 4Ur$ non-negative. Actually $U$ must be positive, since the present case, $rq \neq 0$, means that $u_1$ is also nonzero. Moving $q^2$ to the RHS and dividing through by $U$ we now have

\[ U^2 + 2pU + (p^2 - 4r) = \frac{q^2}{U}. \]


The LHS is a quadratic, also known as a parabola, and the RHS is a hyperbola. We define the two functions:

\[ L(U) = U^2 + 2pU + (p^2 - 4r) \quad\text{and}\quad R(U) = \frac{q^2}{U}. \]

We may graph $L(U)$ and $R(U)$ in the Cartesian Plane with the vertical axis representing the values of $L(U)$ and $R(U)$ for each value of $U$ in the domain as represented by the horizontal axis. A solution, $U_1$, to this cubic would be represented by the abscissa of a point of intersection of $L(U)$ and $R(U)$. Since we require $U > 0$ we are only interested in Quadrant I intersections as depicted in the graph below:

[Figure: graphs of L(U) and R(U) against U, showing the minimum point (−p, −4r) of L(U) and a Quadrant I intersection at U = U1.]

From the chapter on the quadratic we know that $L(U)$ has a minimum whose coordinates are

\[ \mathrm{MIN}_{L(U)}(-p, -4r). \]

We find that if a solution $U_1 > 0$ were to exist it will always satisfy our requirement that

\[ q^2 + 4U_1r \geq 0. \]

For $r > 0$ this is immediate, so let's assume $r < 0$. When $r < 0$, $\mathrm{MIN}_{L(U)}(-p, -4r)$ will have the positive ordinate $-4r$ and hence will be above the horizontal $U$-axis. Since $\mathrm{MIN}_{L(U)}$ is the minimum point of $L(U)$, all points $(U, L(U))$ on $L(U)$ must satisfy $L(U) \geq -4r$. Any point $(U, R(U))$ on $R(U)$ which is also shared with $L(U)$ must therefore also satisfy $R(U) \geq -4r$. So which points $(U, R(U))$ on the hyperbola $R(U)$ satisfy $R(U) \geq -4r$? Well, $R^{-1}(-4r) = \frac{q^2}{-4r}$, i.e. $R\left(\frac{q^2}{-4r}\right) = -4r$. Since $R(U)$ is always decreasing in Quadrant I as $U$ increases, we see immediately that the abscissa, $U_1$, of any intersection point must satisfy

\[ U_1 \leq \frac{q^2}{-4r}, \quad\text{where we will let}\quad \frac{q^2}{-4r} = U_{MAX}, \]

or $q^2 + 4U_1r \geq 0$, which is the required condition. See the graph below.

[Figure: graphs of L(U) and R(U) with intersections at U1, U2, U3, all lying to the left of UMAX.]

So we are left with having to prove the existence of at least one Quadrant I solution. The story-line of the proof that a Quadrant I intersection of $L(U)$ and $R(U)$ must exist is as follows. Moving from left to right, when $L(U)$ just crosses the vertical axis at a y-intercept of $(0, p^2 - 4r)$ and initially enters either Quadrant I or Quadrant IV it will be below $R(U)$, meaning that the function $H(U) = L(U) - R(U)$ will have a negative value here. As $U$ gets larger and larger, $L(U)$ gets arbitrarily large and $R(U)$, which remains positive in Quadrant I, gets arbitrarily close to zero, meaning that for a sufficiently large $U$, $H(U) = L(U) - R(U)$ will be positive. So by the Intermediate Value Theorem there must be at least one positive $U_1$ between these two positive $U$'s making $H(U_1) = 0$, which would, of course, mean $L(U_1) = R(U_1)$, which shows the existence of a Quadrant I intersection of $R(U)$ and $L(U)$ as required.

Using more formal language for the proof we could proceed as follows. Since

\[ \lim_{U \to \infty} R(U) = 0 \]

for every $y > 0$ there exists a $U_{yR}$ so that $R(U) < y$ whenever $U > U_{yR}$. Since

\[ \lim_{U \to \infty} L(U) = \infty \]

for every $y > 0$ there exists a $U_{yL}$ so that $L(U) > y$ whenever $U > U_{yL}$. So pick any number $y^* > 0$ and let $U_{y^*max}$ be any number greater than the larger of $U_{y^*R}$ and $U_{y^*L}$. We would have that $H(U_{y^*max}) = L(U_{y^*max}) - R(U_{y^*max}) > 0$.

$L(0) = p^2 - 4r$. Suppose first $p^2 - 4r < 0$. $L(U)$ is continuous at $U = 0$ so for every $\delta > 0$ there will exist an $\epsilon > 0$ so that $|L(U) - L(0)| < \delta$ whenever $|U - 0| = |U| < \epsilon$. So letting $\delta = |p^2 - 4r|$ we can find an $\epsilon_\delta > 0$ so that $|L(U) - (p^2 - 4r)| < |p^2 - 4r|$ whenever $|U| < \epsilon_\delta$. This means that for any $U_\delta \in (0, \epsilon_\delta)$, $L(U_\delta) < 0$, which means that $H(U_\delta) = L(U_\delta) - R(U_\delta) < 0$ since $R(U_\delta) > 0$. So by the Intermediate Value Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that $H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this case the existence of a Quadrant I intersection of $R(U)$ and $L(U)$.

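Suppose now that $p^2 - 4r = 0$. Let $\gamma$ be any positive number and let $\delta = R(\gamma)$, which is clearly a positive number. Since $L(U)$ is continuous at $U = 0$ we can find an $\epsilon_\delta > 0$ so that $|L(U) - L(0)| = |L(U)| < \delta$ whenever $|U| < \epsilon_\delta$. So for any $U_\delta \in (0, \epsilon_\delta)$ that is also less than $\gamma$ (so that $R(U_\delta) > R(\gamma) = \delta > L(U_\delta)$, as $R$ is decreasing in Quadrant I) we will again have $H(U_\delta) < 0$, so again by the Intermediate Value Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that $H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this case the existence of a Quadrant I intersection of $R(U)$ and $L(U)$.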

Suppose finally that $p^2 - 4r > 0$. $R^{-1}(p^2 - 4r)$ is defined and equal to $\frac{q^2}{p^2 - 4r}$. Since we are making no assumptions about which side of the vertical axis $\mathrm{MIN}_{L(U)}$ lies on, we don't know whether $L(U)$ is rising or falling at this y-intercept, $(0, p^2 - 4r)$. We choose any $\gamma$, $0 < \gamma < \frac{q^2}{p^2 - 4r}$, and we let $\delta = R(\gamma) - (p^2 - 4r)$, which is clearly a positive number since $R(U)$ is monotonically decreasing in Quadrant I. Since $L(U)$ is continuous at $U = 0$ we can find an $\epsilon > 0$ so that $|L(U) - L(0)| = |L(U) - (p^2 - 4r)| < \delta$ whenever $|U| < \epsilon$. For any $U_\delta \in (0, \epsilon)$ that is also less than $\gamma$ (so that $L(U_\delta) < (p^2 - 4r) + \delta = R(\gamma) < R(U_\delta)$) we will have $H(U_\delta) < 0$, so again by the Intermediate Value Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that $H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this final case the existence of a Quadrant I intersection of $R(U)$ and $L(U)$ and thus concluding the Lemma for the case $rq \neq 0$.
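The Intermediate Value Theorem argument translates directly into a bisection search. The sketch below is one possible illustration, not the text's method: it bisects the resolvent cubic itself, which equals U·(L(U) − R(U)) and so has the same sign as H(U) for U > 0, and then recovers s and t from the first two equations of the system rather than from the radical expressions. All names and the example are assumptions.

```python
import math

def positive_resolvent_root(p, q, r, iterations=200):
    """Bisection for a root U > 0 of U^3 + 2pU^2 + (p^2 - 4r)U - q^2, assuming q != 0."""
    cubic = lambda U: U**3 + 2*p*U**2 + (p*p - 4*r)*U - q*q
    lo, hi = 0.0, 1.0            # cubic(0) = -q^2 < 0 when q != 0
    while cubic(hi) <= 0:        # the cubic tends to +infinity, so this loop ends
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if cubic(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi

def factor_bare_form(p, q, r):
    """Return (u, s, t) with x^4 + p x^2 + q x + r = (x^2 + u x + s)(x^2 - u x + t), q != 0."""
    u = math.sqrt(positive_resolvent_root(p, q, r))
    # From the system: s + t = p + u^2 and t - s = q/u.
    s = (p + u*u - q/u) / 2
    t = (p + u*u + q/u) / 2
    return u, s, t

# Hypothetical example: x^4 + 2x^2 + x + 2 = (x^2 + x + 1)(x^2 - x + 2)
u, s, t = factor_bare_form(2, 1, 2)
```

Bisection finds only one positive root; when the resolvent has three positive roots, the other factorizations would need a different starting bracket.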

We now address the case when $rq = 0$. We will have the following three sub-cases:

$\mathrm{I}_{rq=0}$: $r = 0$, $q \neq 0$

$\mathrm{II}_{rq=0}$: $r = 0$, $q = 0$

$\mathrm{III}_{rq=0}$: $r \neq 0$, $q = 0$

We are, as in the previous case, to establish the solvability of the system

\[
\begin{aligned}
s + t - u^2 &= p \\
ut - us &= q \\
st &= r
\end{aligned}
\]

We start by looking at Case $\mathrm{I}_{rq=0}$: $r = 0$, $q \neq 0$. Since $r = 0$, either $s = 0$, or $t = 0$, or both $s$ and $t$ are zero. If both $s$ and $t$ were zero we would require $u^2 = -p$, which would lead to a solution only when $p \leq 0$ (and, by the second equation, only if $q = 0$, which is excluded here). Suppose $s = 0$, $t \neq 0$. We would then have $ut = q$ and $t - u^2 = p$, which would lead to $\frac{q}{u} - u^2 = p$ or $q - u^3 = pu$ or $u^3 + pu - q = 0$ which, being a cubic, will always have from one to three solutions $\{u_1, u_2, u_3\}$. So in this case there will always be at least one $(s, t, u)$ solution and at most three. (Note that if $t = 0$ and $s \neq 0$ we would have $-\frac{q}{u} - u^2 = p$ or $u^3 + pu + q = 0$, whose roots would be the negatives of those of the first cubic, i.e. $\{-u_1, -u_2, -u_3\}$, and hence the same factorizations but multiplied in reverse order.)

We now look at Case $\mathrm{II}_{rq=0}$: $r = 0$, $q = 0$. Can a solution be found in this case as well? Suppose $s = 0$. Since $q = 0$ either $u$ or $t$ must be zero, so suppose first $t = 0$; the solvability in this case would again be dependent on the sign of $p$, so let's suppose $u = 0$. We then would have $t = p$, which is always possible, so $(s, t, u) = (0, p, 0)$ would always be a solution, establishing solvability in this case as well.

Finally we look at Case $\mathrm{III}_{rq=0}$: $r \neq 0$, $q = 0$. Since $r \neq 0$, neither $s$ nor $t$ can be zero. So $s = \frac{r}{t}$, leading to $ut - \frac{ur}{t} = 0$ or $u\left(t - \frac{r}{t}\right) = 0$, so either $u = 0$, which would require $s + t = p$ (or $\frac{r}{t} + t = p$), or $t - \frac{r}{t} = 0$. Can we always solve $t - \frac{r}{t} = 0$? This would lead to $t^2 - r = 0$ or $t^2 = r$, which is solvable only when $r > 0$. When can we solve $\frac{r}{t} + t = p$? I.e. when can we solve $t^2 - pt + r = 0$? This is solvable only when $p^2 - 4r \geq 0$, i.e. when $r \leq \frac{p^2}{4}$. Since $\frac{p^2}{4} \geq 0$, and since $r$ will always be either positive (leading to the solvability of $t - \frac{r}{t} = 0$) or negative and hence less than $\frac{p^2}{4}$ (leading to the solvability of $s + t = p$, or $\frac{r}{t} + t = p$), an $(s, t, u)$ solution will always exist in this final case as well, establishing the factorizability of the quartic $t(x)$ in every case and completing the proof of the Lemma.

The following is an immediate consequence of the Lemma.

COROLLARY: Every quartic $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, $a_4 \neq 0$, can be factored as the product of two quadratics.

Proof: Let $g(x)$ and $t(x)$ be as defined in the previous Lemma. We know from that Lemma that there exist $s^*, t^*, u^*$ so that

\[ t(x) = (x^2 + u^*x + s^*) \times (x^2 - u^*x + t^*). \]


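Therefore

\[
\begin{aligned}
g(x) &= a_4\,t(x), \quad a_4 \neq 0 \\
&= a_4(x^2 + u^*x + s^*)(x^2 - u^*x + t^*) \\
&= (a_4x^2 + a_4u^*x + a_4s^*)(x^2 - u^*x + t^*)
\end{aligned}
\]

$g(x)$ is the quartic defined by sliding $f(x)$ to the right by $\frac{a_3}{4a_4}$ units, so to get back to $f(x)$ we must slide $g(x)$ to the left by $\frac{a_3}{4a_4}$, i.e.

\[
\begin{aligned}
f(x) &= g\left(x + \frac{a_3}{4a_4}\right) \\
&= \left(a_4\left(x + \frac{a_3}{4a_4}\right)^2 + a_4u^*\left(x + \frac{a_3}{4a_4}\right) + a_4s^*\right)\left(\left(x + \frac{a_3}{4a_4}\right)^2 - u^*\left(x + \frac{a_3}{4a_4}\right) + t^*\right) \\
&= \left(a_4x^2 + \left(\frac{a_3}{2} + a_4u^*\right)x + \left(\frac{a_3^2}{16a_4} + \frac{a_3u^*}{4} + a_4s^*\right)\right)\left(x^2 + \left(\frac{a_3}{2a_4} - u^*\right)x + \left(\frac{a_3^2}{16a_4^2} - \frac{a_3u^*}{4a_4} + t^*\right)\right) \\
&= \left(a_4x^2 + \frac{a_3 + 2a_4u^*}{2}x + \frac{a_3^2 + 4a_3a_4u^* + 16a_4^2s^*}{16a_4}\right) \times \left(x^2 + \frac{a_3 - 2a_4u^*}{2a_4}x + \frac{a_3^2 - 4a_3a_4u^* + 16a_4^2t^*}{16a_4^2}\right) \\
&= P_1(x) \times P_2(x)
\end{aligned}
\]

where

\[ P_1(x) = a_4x^2 + \frac{a_3 + 2a_4u^*}{2}x + \frac{a_3^2 + 4a_3a_4u^* + 16a_4^2s^*}{16a_4} \]

is a quadratic since $a_4 \neq 0$, and

\[ P_2(x) = x^2 + \frac{a_3 - 2a_4u^*}{2a_4}x + \frac{a_3^2 - 4a_3a_4u^* + 16a_4^2t^*}{16a_4^2} \]

is also a quadratic since the coefficient of $x^2$, 1, is nonzero. This completes the proof of the Corollary.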
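The expressions for P1(x) and P2(x) can be checked exactly on a small example. The following sketch uses assumed function names and an assumed sample quartic; it builds both quadratics from a bare-form solution (u*, s*, t*) and multiplies them back together.

```python
from fractions import Fraction as F

def quadratic_factors(a4, a3, u, s, t):
    """Coefficients (highest degree first) of P1 and P2 from the bare-form solution (u, s, t)."""
    a4, a3, u, s, t = map(F, (a4, a3, u, s, t))
    P1 = [a4, (a3 + 2*a4*u) / 2, (a3**2 + 4*a3*a4*u + 16*a4**2*s) / (16*a4)]
    P2 = [F(1), (a3 - 2*a4*u) / (2*a4), (a3**2 - 4*a3*a4*u + 16*a4**2*t) / (16*a4**2)]
    return P1, P2

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Hypothetical example: f(x) = 2x^4 + 8x^3 + 16x^2 + 18x + 12, whose bare form
# x^4 + 2x^2 + x + 2 factors with (u, s, t) = (1, 1, 2).
P1, P2 = quadratic_factors(2, 8, 1, 1, 2)
product = poly_mul(P1, P2)
```

If the formulas are right, `product` reproduces the coefficients of the original quartic.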

We now begin the lengthy but straightforward task of actually finding expressions for the various $s, t, u$ solutions in the various regions of p-q-r-coefficient-land which factor $t(x)$, by referring to the proof of the Quartic Factorizability Lemma. We look first at the full-bodied cases stemming from $rq \neq 0$. We must start by finding the positive roots of the quartic's cubic resolvent

\[ U^3 + 2pU^2 + (p^2 - 4r)U - q^2. \]

In applying the Cubic Formula to the above cubic we once again have to be careful not to confuse the indexed a's and p's and q's used there as coefficient values of the cubic being solved with the indexed a's and p's and q's used in the proof of the Quartic Factorizability Lemma. To avoid confusion, symbols pertinent to the Cubic Formula are in quotes. To apply the Cubic Formula we first find values for “p”, “q”, and “D”:

\[
\begin{aligned}
\text{“}p\text{”} &= \frac{3\,\text{“}a_3\text{”}\,\text{“}a_1\text{”} - \text{“}a_2\text{”}^2}{3\,\text{“}a_3\text{”}^2} \\
&= \frac{3(1)(p^2 - 4r) - (2p)^2}{3(1)^2} \\
&= \frac{-p^2 - 12r}{3} \\
&= -\frac{12r + p^2}{3},
\end{aligned}
\]


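\[
\begin{aligned}
\text{“}q\text{”} &= \frac{2\,\text{“}a_2\text{”}^3 - 9\,\text{“}a_1\text{”}\,\text{“}a_2\text{”}\,\text{“}a_3\text{”} + 27\,\text{“}a_3\text{”}^2\,\text{“}a_0\text{”}}{27\,\text{“}a_3\text{”}^3} \\
&= \frac{2(2p)^3 - 9(p^2 - 4r)(2p)(1) + 27(1)^2(-q^2)}{27(1)^3} \\
&= \frac{16p^3 - 18p^3 + 72pr - 27q^2}{27} \\
&= \frac{72pr - 2p^3 - 27q^2}{27},
\end{aligned}
\]

and

\[
\begin{aligned}
\text{“}D\text{”} &= \frac{\text{“}q\text{”}^2}{4} + \frac{\text{“}p\text{”}^3}{27} = \frac{(72pr - 2p^3 - 27q^2)^2}{4 \cdot 27^2} - \frac{(p^2 + 12r)^3}{27^2} \\
&= \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108}.
\end{aligned}
\]

We now let $C_p$ = “p”, $C_q$ = “q”, and $C_D$ = “D”, so

\[ C_p = -\frac{p^2 + 12r}{3} \quad\text{and}\quad C_q = \frac{72pr - 2p^3 - 27q^2}{27} \]

and

\[ C_D = \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108}. \]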
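The closed form for C_D is the result of a fairly long expansion, so an exact spot check is worthwhile. The sketch below (names and sample values are assumptions) computes C_p, C_q and C_D = C_q²/4 + C_p³/27 with rational arithmetic and compares C_D with the expanded expression.

```python
from fractions import Fraction as F

def resolvent_invariants(p, q, r):
    """Cp, Cq and CD = Cq^2/4 + Cp^3/27 for the resolvent U^3 + 2pU^2 + (p^2 - 4r)U - q^2."""
    p, q, r = map(F, (p, q, r))
    Cp = -(p**2 + 12*r) / 3
    Cq = (72*p*r - 2*p**3 - 27*q**2) / 27
    CD = Cq**2 / 4 + Cp**3 / 27
    return Cp, Cq, CD

def cd_closed_form(p, q, r):
    """The expanded expression for CD given in the text."""
    p, q, r = map(F, (p, q, r))
    return (4*p**3*q**2 + 27*q**4 - 16*p**4*r - 144*p*q**2*r
            + 128*p**2*r**2 - 256*r**3) / 108

# Hypothetical sample values p = 2, q = 1, r = 2
Cp, Cq, CD = resolvent_invariants(2, 1, 2)
matches = (CD == cd_closed_form(2, 1, 2))
```

A single agreeing sample does not prove the identity, but trying a few unrelated triples makes an algebra slip very unlikely.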

Also

\[ \text{“}a_2\text{”} = 2p, \quad \text{“}a_3\text{”} = 1, \quad\text{so}\quad \frac{\text{“}a_2\text{”}}{3\,\text{“}a_3\text{”}} = \frac{2p}{3}. \]

When we calculate $C_p$, $C_q$, and $C_D$ and apply the Cubic Formula we will find either 1, 2, or 3 solutions. The Quartic Factorizability Lemma guarantees that at least one root will be positive and sometimes even all three will be positive. However, frequently we will find some of the three roots to be negative, as in the graph below where we show also the Quadrant III branch of the hyperbola $\frac{q^2}{U}$.

[Figure: graph of L(U), with minimum point (−p, −4r), and both branches of R(U); the intersections are at U1 and U2 (negative) and U3 (positive).]

In the example graphed above we see that $U_1$ and $U_2$ are negative and hence would not lead to a factorization of $t(x)$; only the positive solution $U_3$ would lead to a factorization of $t(x)$.

In trying to find expressions for the various factorizations of $t(x)$ we again start with the full-bodied case where $rq \neq 0$. If the cubic has one solution then by the Factorizability Lemma this solution must be positive.


If the cubic has two solutions then if p ≥ 0 only one of the two is positive as the minimum of L(U) wouldbe to the left of the vertical axis so that L(U) will be of positive slope everywhere in Quadrant I whereR(U) is of negative slope everywhere in Quadrant I so that as the we follow the curve L(U) from left toright after we arrive at the first intersection L(U) will continue rising from that point whereas R(U) willdescend meaning there can be no other intersection; if p < 0 L(U)’s minimum point will be on the right sideof the vertical axis. If L(U)’s y-intercept p2 − 4r is negative then only one of the roots is positive since onlythe right segment of L(U) can share a point with the Quadrant I branch of R(U) and it can share only theone point. If L(U)’s y-intercept p2 − 4r is positive then there can be no intersections with the Quadrant IIIbranch of R(U) and hence both solutions must be positive.

If the cubic has three solutions and $p \geq 0$, only one of the solutions can be positive, as explained above. If $p < 0$, $L(U)$'s minimum point will again be on the right side of the vertical axis. If $L(U)$'s y-intercept $p^2 - 4r$ is negative then only one of the three roots is positive, since only the right segment of $L(U)$ can share a point with the Quadrant I branch of $R(U)$ and it can share only the one point, so the other two roots must be negative. If $L(U)$'s y-intercept $p^2 - 4r$ is positive then there can be no intersections with the Quadrant III branch of $R(U)$ and hence all three solutions must be positive. Note that $p^2 - 4r = 0$ is not consistent with our cubic having more than one root. Summarizing the above discussion we have:

CUBIC HAS ONE SOLUTION
· ONE positive $U$ only

CUBIC HAS TWO SOLUTIONS
· ONE positive $U$ when $p \geq 0$
· TWO positive $U$'s when $p < 0$ AND $p^2 - 4r > 0$
· ONE positive $U$ when $p < 0$ AND $p^2 - 4r < 0$

CUBIC HAS THREE SOLUTIONS
· ONE positive $U$ when $p \geq 0$
· THREE positive $U$'s when $p < 0$ AND $p^2 - 4r > 0$
· ONE positive $U$ when $p < 0$ AND $p^2 - 4r < 0$
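The summary above can be encoded as a small lookup. What follows is a minimal sketch for the full-bodied case $rq \neq 0$ only; the function name `positive_u_count` and the argument `n_solutions` (the number of distinct real solutions of the resolvent cubic) are illustrative names, not taken from the text.

```python
def positive_u_count(n_solutions, p, r):
    """Number of positive U among the resolvent cubic's real solutions.

    Encodes the summary list for the full-bodied case (r*q != 0).
    n_solutions is the number of distinct real solutions (1, 2, or 3);
    this argument name is an illustrative assumption.
    """
    if n_solutions not in (1, 2, 3):
        raise ValueError("a cubic has 1, 2, or 3 distinct real solutions")
    if n_solutions == 1:
        return 1
    intercept = p * p - 4 * r  # y-intercept of L(U)
    if intercept == 0:
        # The text notes this is inconsistent with more than one root.
        raise ValueError("p^2 - 4r = 0 cannot occur with several solutions")
    if p < 0 and intercept > 0:
        return n_solutions  # every solution is positive
    return 1  # p >= 0, or p < 0 with a negative y-intercept
```

For instance, with $p = -15$ and $r = 24$ the y-intercept is $225 - 96 = 129 > 0$, so a cubic with three solutions has three positive $U$'s.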

The following Theorem provides expressions for the quadratic factors of any given quartic’s bare form andthe then obtainable roots of the given quartic.

THEOREM (“QUARTIC FORMULA”): Suppose $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, $a_4 \neq 0$, with bare form $t(x) = x^4 + px^2 + qx + r$, and that $U^3 + C_pU + C_q$ with $U = u^2$ is the bare form of the resolvent cubic of $t(x)$ found in the Quartic Factorizability Theorem, where $t(x)$ was the product of the two sought quadratic factors $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$. The values of these coefficients in terms of the coefficients of the given quartic are found to be

$$C_q = \frac{72pr - 2p^3 - 27q^2}{27}, \qquad C_p = -\frac{p^2 + 12r}{3},$$

and

$$C_D = \frac{C_q^2}{4} + \frac{C_p^3}{27} = \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108},$$

where

$$p = \frac{8a_2a_4 - 3a_3^2}{8a_4^2}, \qquad q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^3},$$

and

$$r = \frac{16a_2a_3^2a_4 - 3a_3^4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^4}.$$
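As a check on the expressions for $p$, $q$, $r$, $C_p$, $C_q$, and $C_D$, the following is a minimal sketch that evaluates them numerically. The function names `bare_form` and `resolvent_constants` are illustrative and not part of the text.

```python
def bare_form(a4, a3, a2, a1, a0):
    """Return (p, q, r) of the bare form t(x) = x^4 + p x^2 + q x + r."""
    if a4 == 0:
        raise ValueError("a4 must be nonzero for a quartic")
    p = (8 * a2 * a4 - 3 * a3**2) / (8 * a4**2)
    q = (a3**3 - 4 * a2 * a3 * a4 + 8 * a1 * a4**2) / (8 * a4**3)
    r = (16 * a2 * a3**2 * a4 - 3 * a3**4
         - 64 * a1 * a3 * a4**2 + 256 * a0 * a4**3) / (256 * a4**4)
    return p, q, r


def resolvent_constants(p, q, r):
    """Return (C_p, C_q, C_D) for the bare form of the resolvent cubic."""
    c_p = -(p**2 + 12 * r) / 3
    c_q = (72 * p * r - 2 * p**3 - 27 * q**2) / 27
    c_d = c_q**2 / 4 + c_p**3 / 27
    return c_p, c_q, c_d
```

For example, $f(x) = (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24$ gives $p = -2.5$, $q = 0$, $r = 0.5625$, which agrees with shifting the roots to $\pm 0.5$, $\pm 1.5$.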

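The roots $r_i$, $i \in \{1, 2, 3, 4\}$, of $f(x)$ are to be found in the appropriate cell in the nine-cell table below:

$$
\begin{array}{l|l|l|l}
 & u^2 - 4s < 0 & u^2 - 4s = 0 & u^2 - 4s > 0 \\ \hline
u^2 - 4t < 0 & \text{NO ROOTS} & r_1 = -\frac{u}{2} - \frac{a_3}{4a_4} & \begin{array}{l} r_1 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \\ r_2 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \end{array} \\ \hline
u^2 - 4t = 0 & r_1 = \frac{u}{2} - \frac{a_3}{4a_4} & \begin{array}{l} r_1 = \frac{u}{2} - \frac{a_3}{4a_4} \\ r_2 = -\frac{u}{2} - \frac{a_3}{4a_4} \end{array} & \begin{array}{l} r_1 = \frac{u}{2} - \frac{a_3}{4a_4} \\ r_2 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \\ r_3 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \end{array} \\ \hline
u^2 - 4t > 0 & \begin{array}{l} r_1 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \\ r_2 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \end{array} & \begin{array}{l} r_1 = -\frac{u}{2} - \frac{a_3}{4a_4} \\ r_2 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \\ r_3 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \end{array} & \begin{array}{l} r_1 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \\ r_2 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4} \\ r_3 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \\ r_4 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4} \end{array}
\end{array}
$$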

where $u$, $s$, and $t$ are to be selected from the applicable region of coefficient-land as described in the fourteen cases that follow: A.1 through A.5 and B.1 through B.9, where the “A.x Cases” pertain to the full-bodied cases with $rq \neq 0$ and the “B.x Cases” pertain to the $rq = 0$ flat cases. In cases with multiple $u$, $s$, $t$ triples, any triple $u_i$, $s_i$, $t_i$ within that case may be selected and must lead to the same roots.
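The nine-cell table amounts to solving the two quadratic factors separately and shifting each root by $-\frac{a_3}{4a_4}$. The following is a minimal sketch of that lookup; the function name `roots_from_factors` and the tolerance `tol` used to decide when a discriminant counts as zero are assumptions made for illustration, and the roots may come out in a different order than the table lists them.

```python
import math


def roots_from_factors(u, s, t, a3, a4, tol=1e-12):
    """Real roots of f(x) from Q1 = x^2 + u x + s and Q2 = x^2 - u x + t.

    tol is an assumed tolerance for treating a discriminant as zero.
    """
    shift = a3 / (4 * a4)   # undoes the translation to the bare form
    disc_s = u * u - 4 * s  # discriminant of Q1 (table columns)
    disc_t = u * u - 4 * t  # discriminant of Q2 (table rows)
    roots = []
    if abs(disc_s) <= tol:
        roots.append(-u / 2 - shift)
    elif disc_s > 0:
        root = math.sqrt(disc_s)
        roots.append((-u + root) / 2 - shift)
        roots.append((-u - root) / 2 - shift)
    if abs(disc_t) <= tol:
        roots.append(u / 2 - shift)
    elif disc_t > 0:
        root = math.sqrt(disc_t)
        roots.append((u + root) / 2 - shift)
        roots.append((u - root) / 2 - shift)
    return roots
```

For $f(x) = x^4 - 10x^3 + 35x^2 - 50x + 24$ one valid triple is $u = 2$, $s = t = 0.75$, and the sketch recovers the roots 1, 2, 3, 4.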

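Case A.1: If $rq \neq 0$ and if $C_D > 0$ or if $C_p = C_q = 0$, then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ where $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ with

$$u = \alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}},$$

where

$$\alpha = \begin{cases} 1, & \text{if } \sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } \sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} + \frac{p}{3} < 0, \end{cases}$$

$$s = \frac{-q + \sqrt{q^2 + 4r\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + 4r\sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}}},$$

and

$$t = \frac{q + \sqrt{q^2 + 4r\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + 4r\sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}}}.$$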
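The Case A.1 expressions can be evaluated numerically as follows. This is a minimal sketch: the names `cbrt` and `case_a1` are illustrative, the real cube root is taken with the sign of its argument, and the caller is assumed to have already confirmed that the quartic falls under Case A.1.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def case_a1(p, q, r):
    """Return (u, s, t) from the Case A.1 expressions.

    Assumes r*q != 0 and that Case A.1 applies; only C_D >= 0 is checked.
    """
    c_p = -(p**2 + 12 * r) / 3
    c_q = (72 * p * r - 2 * p**3 - 27 * q**2) / 27
    c_d = c_q**2 / 4 + c_p**3 / 27
    if c_d < 0:
        raise ValueError("Case A.1 requires C_D >= 0")
    root_d = math.sqrt(c_d)
    # Sum of the two cube roots that appears throughout the formulas.
    v = cbrt(-c_q / 2 + root_d) + cbrt(-c_q / 2 - root_d)
    big_u = v - 2 * p / 3                 # U = u^2
    alpha = 1 if v + p / 3 >= 0 else -1   # sign choice for u
    u = alpha * math.sqrt(big_u)
    inner = math.sqrt(q**2 + 4 * r * big_u)
    s = (-q + inner) / (2 * u)
    t = (q + inner) / (2 * u)
    return u, s, t
```

As a check, $t(x) = x^4 - 2x^2 - 8x - 3 = (x^2 + 2x + 3)(x^2 - 2x - 1)$ has $p = -2$, $q = -8$, $r = -3$ and $C_D > 0$, and the sketch returns approximately $u = 2$, $s = 3$, $t = -1$.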

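Case A.2: If $rq \neq 0$, $C_D = 0$, $C_p < 0$, $p < 0$, and $p^2 - 4r > 0$, then $t(x)$ has exactly two pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and $\{Q_{21}(x), Q_{22}(x)\}$ where

$$Q_{i1}(x) = x^2 + u_ix + s_i \quad \text{and} \quad Q_{i2}(x) = x^2 - u_ix + t_i, \quad i = 1, 2$$

with

$$u_1 = \alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \quad \text{where} \quad \alpha = \begin{cases} 1, & \text{if } \sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } \sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases}$$

$$s_1 = \frac{-q + \sqrt{q^2 + 4r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t_1 = \frac{q + \sqrt{q^2 + 4r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}},$$

and

$$u_2 = \alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \quad \text{where} \quad \alpha = \begin{cases} 1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases}$$

$$s_2 = \frac{-q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t_2 = \frac{q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}.$$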
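The two triples of Case A.2 can be computed in the same way. This is a minimal sketch with illustrative names (`cbrt`, `case_a2`); it assumes the caller has already established that the Case A.2 conditions hold, since testing $C_D = 0$ exactly is unreliable in floating-point arithmetic.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def case_a2(p, q, r):
    """Return [(u1, s1, t1), (u2, s2, t2)] from the Case A.2 expressions.

    Assumes r*q != 0, C_D = 0, C_p < 0, p < 0 and p^2 - 4r > 0.
    """
    c_q = (72 * p * r - 2 * p**3 - 27 * q**2) / 27
    k = cbrt(c_q / 2)
    triples = []
    # The two quantities under the outer radical, before subtracting 2p/3.
    for v in (k, -2 * k):
        big_u = v - 2 * p / 3                 # U_i = u_i^2
        alpha = 1 if v + p / 3 >= 0 else -1   # sign choice for u_i
        u = alpha * math.sqrt(big_u)
        inner = math.sqrt(q**2 + 4 * r * big_u)
        triples.append((u, (-q + inner) / (2 * u), (q + inner) / (2 * u)))
    return triples
```

As a check, $t(x) = (x-2)^2(x+1)(x+3) = x^4 - 9x^2 + 4x + 12$ has $C_q = -250$, $C_p = -75$, and $C_D = 0$. The sketch returns approximately $(-1, -2, -6)$ and $(4, 3, 4)$, that is, $(x^2 - x - 2)(x^2 + x - 6)$ and $(x^2 + 4x + 3)(x^2 - 4x + 4)$.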

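Case A.3: If $rq \neq 0$, $C_D = 0$, $C_p < 0$, and either $p \geq 0$ or “$p^2 - 4r < 0$ with $p < 0$”, then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ where $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ with

$$u = \alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \quad \text{where} \quad \alpha = \begin{cases} 1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases}$$

$$s = \frac{-q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t = \frac{q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}.$$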

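Case A.4: If $rq \neq 0$, $C_D < 0$, $p < 0$, and $p^2 - 4r > 0$, then $t(x)$ has exactly three pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$, $\{Q_{21}(x), Q_{22}(x)\}$, and $\{Q_{31}(x), Q_{32}(x)\}$ where

$$Q_{i1}(x) = x^2 + u_ix + s_i \quad \text{and} \quad Q_{i2}(x) = x^2 - u_ix + t_i, \quad i = 1, 2, 3$$

with

$$u_1 = \alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}},$$

where

$$\alpha = \begin{cases} 1, & \text{if } -2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] + \frac{p}{3} < 0, \end{cases}$$

$$s_1 = \frac{-q + \sqrt{q^2 - 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}}},$$

$$t_1 = \frac{q + \sqrt{q^2 - 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}}},$$

and

$$u_2 = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}},$$

where

$$\alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} < 0, \end{cases}$$

$$s_2 = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}},$$

$$t_2 = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}},$$

and

$$u_3 = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}},$$

where

$$\alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] + \frac{p}{3} < 0, \end{cases}$$

$$s_3 = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}}},$$

$$t_3 = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}}}.$$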
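The trigonometric expressions of Case A.4 share one angle, so a numerical evaluation can compute it once. This is a minimal sketch with the illustrative name `case_a4`; it works in radians (so $60^\circ$ becomes $\pi/3$), clamps the arccosine argument to $[-1, 1]$ as a guard against rounding (an added assumption), and assumes the caller has confirmed that the Case A.4 conditions hold.

```python
import math


def case_a4(p, q, r):
    """Return the three (u_i, s_i, t_i) triples from the Case A.4 expressions.

    Assumes r*q != 0, C_D < 0, p < 0 and p^2 - 4r > 0.
    """
    c_p = -(p**2 + 12 * r) / 3
    c_q = (72 * p * r - 2 * p**3 - 27 * q**2) / 27
    if c_p >= 0:
        raise ValueError("C_D < 0 requires C_p < 0")
    m = math.sqrt(-c_p / 3)
    arg = 3 * c_q * math.sqrt(-3 * c_p) / (2 * c_p**2)
    arg = max(-1.0, min(1.0, arg))   # guard against rounding error
    phi = math.acos(arg) / 3
    sixty = math.pi / 3              # 60 degrees in radians
    # The three quantities under the outer radical, before subtracting 2p/3.
    vs = (-2 * m * math.cos(phi),
          2 * m * math.cos(phi - sixty),
          2 * m * math.cos(phi + sixty))
    triples = []
    for v in vs:
        big_u = v - 2 * p / 3                 # U_i = u_i^2
        alpha = 1 if v + p / 3 >= 0 else -1   # sign choice for u_i
        u = alpha * math.sqrt(big_u)
        inner = math.sqrt(q**2 + 4 * r * big_u)
        triples.append((u, (-q + inner) / (2 * u), (q + inner) / (2 * u)))
    return triples
```

As a check, $t(x) = (x+4)(x+1)(x-2)(x-3) = x^4 - 15x^2 + 10x + 24$ yields approximately $(-1, -2, -12)$, $(5, 4, 6)$, and $(-2, -3, -8)$, which are the three ways of pairing its four roots into quadratic factors.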

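Case A.5: If $rq \neq 0$, $C_D < 0$, and either $p \geq 0$ or “$p^2 - 4r < 0$ with $p < 0$”, then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ with $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ with

$$u = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}},$$

where

$$\alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} < 0, \end{cases}$$

$$s = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}},$$

$$t = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}}.$$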

Case B.1: If $rq = 0$ with $r = q = 0$, $p \geq 0$, then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ with $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ where

$$u = 0, \quad s = 0, \quad \text{and} \quad t = p.$$

Case B.2: If $rq = 0$ with $r = q = 0$, $p < 0$, then $t(x)$ has exactly two pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and $\{Q_{21}(x), Q_{22}(x)\}$ with $Q_{i1}(x) = x^2 + u_ix + s_i$ and $Q_{i2}(x) = x^2 - u_ix + t_i$, $i = 1, 2$, where

$$u_1 = \sqrt{-p}, \; s_1 = 0, \; t_1 = 0 \quad \text{and} \quad u_2 = 0, \; s_2 = 0, \; t_2 = p.$$

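Case B.3: If $rq = 0$ with $r = 0$, $q \neq 0$ and $D = \frac{q^2}{4} + \frac{p^3}{27} > 0$, then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ with $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ where

$$u = \sqrt[3]{\frac{q}{2} + \sqrt{D}} + \sqrt[3]{\frac{q}{2} - \sqrt{D}}, \quad s = 0, \quad t = \frac{q}{\sqrt[3]{\frac{q}{2} + \sqrt{D}} + \sqrt[3]{\frac{q}{2} - \sqrt{D}}}.$$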
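Case B.3 is short enough to check numerically. The following is a minimal sketch with illustrative names (`cbrt`, `case_b3`), using the real cube root and assuming $r = 0$ and $q \neq 0$.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def case_b3(p, q):
    """Return (u, s, t) from the Case B.3 expressions (r = 0, q != 0, D > 0)."""
    d = q**2 / 4 + p**3 / 27
    if q == 0 or d <= 0:
        raise ValueError("Case B.3 requires q != 0 and D > 0")
    root_d = math.sqrt(d)
    u = cbrt(q / 2 + root_d) + cbrt(q / 2 - root_d)
    return u, 0.0, q / u
```

As a check, $t(x) = x^4 + x^2 - 2x = (x^2 - x)(x^2 + x + 2)$ has $p = 1$, $q = -2$, and the sketch returns approximately $u = -1$, $s = 0$, $t = 2$.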

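Case B.4: If $rq = 0$ with $r = 0$, $q \neq 0$, $p < 0$, and $D = \frac{q^2}{4} + \frac{p^3}{27} = 0$, then $t(x)$ has exactly two pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and $\{Q_{21}(x), Q_{22}(x)\}$ with $Q_{i1}(x) = x^2 + u_ix + s_i$ and $Q_{i2}(x) = x^2 - u_ix + t_i$, $i = 1, 2$, where

$$u_1 = -\sqrt[3]{\frac{q}{2}}, \; s_1 = 0, \; t_1 = -2\sqrt[3]{\frac{q^2}{4}} \quad \text{and} \quad u_2 = 2\sqrt[3]{\frac{q}{2}}, \; s_2 = 0, \; t_2 = \sqrt[3]{\frac{q^2}{4}}.$$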

Case B.5: If $rq = 0$ with $r = 0$, $q \neq 0$ and $D = \frac{q^2}{4} + \frac{p^3}{27} < 0$, then $t(x)$ has exactly three pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$, $\{Q_{21}(x), Q_{22}(x)\}$, a
