CHAPTER 4: THE QUARTIC
A polynomial of degree 4 is called a quartic. In its most general form it may be written as
\[ f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0, \]
where the $a_i$, $i = 0, 1, 2, 3, 4$, are signed numbers with $a_4 \neq 0$. The domain of $f(x)$ is all signed numbers, as there are no restrictions on $x$. Because a quartic is a polynomial of even degree, its range will be, unlike that of the cubic, a half-line.
After graphing many quartics we find empirically that if the leading coefficient is positive the quartic will grow to $+\infty$ as $x$’s distance from zero approaches $+\infty$, although there may be at most two finite intervals where its graph is descending. If the leading coefficient is negative the quartic will generally descend towards $-\infty$ as $|x|$ tends towards $+\infty$, although again there may be up to two finite intervals where its graph is actually rising as the distance from the origin increases. All quartics have at least one local extremum and at most three, but never seem to have only two. If the leading coefficient is positive, the ordinate of at least one of the extrema will also serve as the absolute minimum value in the range of the quartic, whereas if the leading coefficient is negative, the ordinate of at least one of the extrema will serve as the absolute maximum value in the quartic’s range. Below are some graphs of quartics.
[Figures 1 to 4: graphs of quartics]
Following is a definition for absolute or universal extrema:
DEFINITION: $M(x_M, y_M)$ is a maximum point of polynomial $f(x)$ if for all values of $x$, $f(x) \le y_M$. $m(x_m, y_m)$ is a minimum point of polynomial $f(x)$ if for all values of $x$, $f(x) \ge y_m$.
Once again the horizontal translation function $g(x) = f_{\to +h}(x) = f(x-h)$ may be calculated:
\[
\begin{aligned}
f(x-h) &= a_4(x-h)^4 + a_3(x-h)^3 + a_2(x-h)^2 + a_1(x-h) + a_0 \\
&= a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x \\
&\quad + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0).
\end{aligned}
\]
So we may write
HORIZONTAL TRANSLATION to the right FUNCTION: Given $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, with $a_4 \neq 0$, as the general quartic, $g(x) = f_{\to +h}(x) = f(x-h)$ is $h$ units to the right of $f(x)$ and would have as its expression
\[
g(x) = a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0).
\]
Note that $a_4$, the leading coefficient, remained unchanged and that the constant term is $f(-h)$, as with the cubic. Viewed as a polynomial in $h$, this constant term has as its roots the negatives of the roots of the given quartic, and hence they are just as treacherous to find as the roots of the original.
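The expansion above is easy to check numerically. The following is a minimal Python sketch, not part of the original development; the names `translate_quartic` and `horner` are our own. It builds the coefficients of $g(x) = f(x-h)$ from the formula and lets us compare $g(x)$ with $f(x-h)$ at sample points.

```python
def translate_quartic(a4, a3, a2, a1, a0, h):
    """Coefficients (highest degree first) of g(x) = f(x - h),
    taken term by term from the expansion in the text."""
    b4 = a4
    b3 = -4 * a4 * h + a3
    b2 = 6 * a4 * h**2 - 3 * a3 * h + a2
    b1 = -4 * a4 * h**3 + 3 * a3 * h**2 - 2 * a2 * h + a1
    b0 = a4 * h**4 - a3 * h**3 + a2 * h**2 - a1 * h + a0
    return (b4, b3, b2, b1, b0)


def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are listed highest degree first."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


# Sample quartic (an assumption chosen for illustration): f(x) = (x - 2)^4 + 3.
f = (1, -8, 24, -32, 19)
g = translate_quartic(*f, h=-2)  # h = -2 slides f two units to the left
print(g)
```

With integer inputs the arithmetic is exact, so the comparison between `horner(g, x)` and `horner(f, x + 2)` can be made with plain equality.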
We now succumb to an urge to generalize a little. Our methods so far have consciously eschewed the use of the derivative function developed at the outset of any presentation of the Differential and Integral Calculus. We are able to find the local extrema of polynomials without the use of the derivative. A translation which eliminates the $x$-term creates a new function whose $y$-intercept is a candidate for being a local extremum. The coefficient of the $x$-term of the translated function is a polynomial in $h$ of degree $n-1$, giving rise to at most $n-1$ extrema. We then look at the coefficient of the $x^2$-term (as with the cubic) of the translated function and if this coefficient is nonzero the $y$-intercept of this new function is indeed a local extremum. It is a local maximum if the coefficient of the $x^2$-term is negative and it is a local minimum if the coefficient of the $x^2$-term is positive. If the coefficient of the $x^2$-term is zero we would remain unsure in the case where $f(x)$ is a polynomial of degree greater than three.

Had we used derivatives to find the extrema we would have started by taking the (first) derivative of $f(x)$. The derivative or first derivative of $f(x)$ is sometimes written as $f'(x)$ or as $f^{(1)}(x)$. With this notation we can write $f(x)$ itself as $f^{(0)}(x)$. The first derivative, $f^{(1)}(x)$, may be interpreted geometrically as providing the value of the slope of the line tangent to $f(x)$ at the point $(x, f(x))$. Clearly if this slope is not zero at, say, $x_1$, then $f(x)$ is still rising (or falling) at $x = x_1$, so for $x_1$ to be an extremum it would seem to be necessary for $f^{(1)}(x_1)$ to be zero. However, the derivative equalling zero is not sufficient, since $f(x)$ may have stopped rising (or falling) only momentarily at that point and then simply resume its rise (or fall) after that point. A point $(x_1, f(x_1))$ where $f^{(1)}(x_1) = 0$ is sometimes called a stationary point; so all extrema of polynomials are stationary points but not all stationary points are extrema.

Our main interest has been and remains to find the roots of the given polynomials but this writer could not resist the desire to show the connection between the horizontal translation function and the derivative functions. By the way, the derivative of the derivative function, called the second derivative, may be written $f''(x)$ or $f^{(2)}(x)$. We hope to prove the following theorem describing the connection between the derivatives of a polynomial, the coefficients of the polynomial’s horizontal translation function and the polynomial’s local extrema:
MINIMAX THEOREM: If $g(x) = f_{\to +h}(x) = f(x-h)$ is the polynomial $h$ units to the right of $f(x)$ and if $f^{(m)}(x)$ is the $m$th derivative of $f(x)$, $0 \le m \le n$, where $f(x) = a_nx^n + a_{n-1}x^{n-1} + \dots + a_0 = \sum_{i=0}^{n} a_ix^i$, $a_n \neq 0$, $n \ge 1$, then if $i! = 1 \cdot 2 \cdots i$ where $0!$ is defined as 1, we would find
(i)
\[
g(x) = f(-h) + f'(-h)x + \frac{1}{2}f''(-h)x^2 + \frac{f^{(3)}(-h)}{3!}x^3 + \dots + \frac{f^{(n)}(-h)}{n!}x^n = \sum_{i=0}^{n} \frac{1}{i!}f^{(i)}(-h)\,x^i, \text{ and}
\]
(ii) the point $(x_{ex}, f(x_{ex}))$ is a local extremum if and only if $f'(x_{ex}) = 0$ and the smallest Natural number $i_{ex}$, $2 \le i_{ex} \le n$, for which $f^{(i_{ex})}(x_{ex}) \neq 0$ is an even number. The local extremum would be a local maximum when $f^{(i_{ex})}(x_{ex}) < 0$ and a local minimum when $f^{(i_{ex})}(x_{ex}) > 0$.
Assuming we could prove this theorem, let us illustrate how we would proceed to find the $x$-coordinates of the local extrema and how we would determine whether they are local maxima or minima. We’d start by taking the derivative of the given function $f(x)$. We would then find the roots of the first derivative, i.e., the roots of $f^{(1)}(x)$. Suppose $x_1$ is one of these roots (there are at most $n-1$ roots since $f^{(1)}(x)$ is of degree $n-1$). We would then calculate, in ascending order, the derivatives $f^{(2)}(x), f^{(3)}(x), \dots, f^{(n)}(x)$. Suppose $j$, $2 \le j \le n$, is the smallest number such that $f^{(j)}(x_1) \neq 0$. If $j$ is odd, $x_1$ is not an extremum. If $j$ is even: $x_1$ is a local maximum when $f^{(j)}(x_1) < 0$; $x_1$ is a local minimum when $f^{(j)}(x_1) > 0$.
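The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are our own, coefficients are listed lowest degree first, and exact arithmetic (integers or fractions) is assumed so that the tests against zero are meaningful.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [a0, a1, ..., an] (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial [a0, a1, ..., an] at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def classify(coeffs, x1):
    """Classify x1 by the first nonvanishing derivative of order j >= 2."""
    d = derivative(coeffs)
    if evaluate(d, x1) != 0:
        return "not stationary"
    j = 1
    while d:
        d = derivative(d)
        j += 1
        value = evaluate(d, x1)
        if value != 0:
            if j % 2 == 1:
                return "not an extremum"
            return "local maximum" if value < 0 else "local minimum"
    return "not an extremum"  # reached only for a constant polynomial


# f(x) = x^4 - 8x^3 + 24x^2 - 32x + 19, tested at x1 = 2
print(classify([19, -32, 24, -8, 1], 2))
```

For a polynomial of degree $n \ge 2$ the loop always terminates inside the `if`, since the $n$th derivative is the nonzero constant $n!\,a_n$.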
EXAMPLE: Find the extrema or extremum of $f(x) = x^4 - 8x^3 + 24x^2 - 32x + 19$.
Solution: $f'(x) = f^{(1)}(x) = 4x^3 - 24x^2 + 48x - 32$. To find the roots we note: $f^{(1)}(x) = 4 \cdot (x^3 - 6x^2 + 12x - 8)$. By the rational roots test we find $x_1 = 2$ to be a root, which results in the factorization:
\[
f^{(1)}(x) = 4 \cdot (x^3 - 6x^2 + 12x - 8) = 4 \cdot (x-2) \cdot (x^2 - 4x + 4) = 4 \cdot (x-2) \cdot (x-2)^2 = 4 \cdot (x-2)^3
\]
which means $x_1 = 2$ is the first derivative’s only root. Of course, since $f(x)$ is a quartic (a polynomial of even degree) with leading coefficient positive we know it must have at least one minimum and since $x_1 = 2$ is the only candidate it must be the one. But let’s proceed with the method outlined anyway. By design, $f^{(1)}(2) = 0$, and
\[
\begin{aligned}
f^{(2)}(x) &= 12x^2 - 48x + 48, &&\text{so } f^{(2)}(2) = 12(2)^2 - 48(2) + 48 = 48 - 96 + 48 = 0, \text{ so we continue:} \\
f^{(3)}(x) &= 24x - 48, &&\text{so } f^{(3)}(2) = 24(2) - 48 = 48 - 48 = 0, \text{ so we continue:} \\
f^{(4)}(x) &= 24 > 0 &&\text{for all } x, \text{ including 2, so } x_1 = 2 \text{ is a minimum as 4 is even.}
\end{aligned}
\]
Since $f(2) = 3$, $(2, 3)$ is a local minimum of $f(x)$. It is also, in this case, the minimum value in the range of $f(x)$.
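EXAMPLE: Use the result in Part (i) of the theorem to expand $(a+b)^5$.
Solution: Let $f(x) = x^5$. Let $g(x)$ be $f(x)$ slid to the right by $h = -b$, i.e., slid to the left $b$ units. We will first find the expression for $g(x)$ and then calculate $g(a)$; for $g(a) = f(a - (-b)) = f(a+b) = (a+b)^5$.
By Part (i) of the theorem, where $-h = b$, $g(x) = \sum_{i=0}^{5} \frac{1}{i!}f^{(i)}(b)\,x^i$. We start by calculating the six values of $f^{(i)}(b)$ as $i$ runs through 0, 1, 2, 3, 4, 5.
\[
\begin{array}{c|c|c|c|c}
i & f^{(i)}(x) & f^{(i)}(b) & \frac{1}{i!}f^{(i)}(b)\,x^i & \frac{1}{i!}f^{(i)}(b)\,x^i \text{ simplified} \\ \hline
0 & x^5 & b^5 & \frac{1}{0!}b^5x^0 & b^5 \\
1 & 5x^4 & 5b^4 & \frac{5}{1!}b^4x^1 & 5b^4x \\
2 & 4 \cdot 5x^3 & 4 \cdot 5b^3 & \frac{4 \cdot 5}{2 \cdot 1}b^3x^2 & 10b^3x^2 \\
3 & 3 \cdot 4 \cdot 5x^2 & 3 \cdot 4 \cdot 5b^2 & \frac{3 \cdot 4 \cdot 5}{3 \cdot 2 \cdot 1}b^2x^3 & 10b^2x^3 \\
4 & 2 \cdot 3 \cdot 4 \cdot 5x & 2 \cdot 3 \cdot 4 \cdot 5b & \frac{2 \cdot 3 \cdot 4 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1}bx^4 & 5bx^4 \\
5 & 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 & 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 & \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}x^5 & x^5
\end{array}
\]
So adding up the terms in column 5 we find that $g(x) = b^5 + 5b^4x + 10b^3x^2 + 10b^2x^3 + 5bx^4 + x^5$. And hence $g(a) = b^5 + 5b^4a + 10b^3a^2 + 10b^2a^3 + 5ba^4 + a^5$. So we have that
\[ (a+b)^5 = b^5 + 5b^4a + 10b^3a^2 + 10b^2a^3 + 5ba^4 + a^5. \]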
We can confirm our result using the Binomial Theorem:
\[
\begin{aligned}
(a+b)^5 &= \binom{5}{0}a^5 + \binom{5}{1}a^4b + \binom{5}{2}a^3b^2 + \binom{5}{3}a^2b^3 + \binom{5}{4}ab^4 + \binom{5}{5}b^5 \\
&= a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5
\end{aligned}
\]
which matches the above result although the order of the sum is reversed. After this example our confidence in Part (i) of the theorem is increased but we still have the proof ahead of us. In working towards this demonstration of the connection between the horizontal translation function of $f(x)$ and $f(x)$’s derivatives we’ll begin with the following Lemma.
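The table in the example can also be reproduced mechanically. The sketch below is an illustration only (the function name is ours, and an integer $b$ is assumed so that the divisions are exact); it computes the coefficients $\frac{1}{i!}f^{(i)}(b)$ for $f(x) = x^n$ directly from the rule for differentiating a power.

```python
from math import comb, factorial


def taylor_coefficients_of_power(n, b):
    """Coefficients c_i = f^(i)(b) / i! for f(x) = x**n, i = 0, ..., n.
    Assumes b is an integer so that integer division is exact."""
    coeffs = []
    for i in range(n + 1):
        # The i-th derivative of x**n is n!/(n-i)! * x**(n-i).
        deriv_at_b = factorial(n) // factorial(n - i) * b ** (n - i)
        coeffs.append(deriv_at_b // factorial(i))
    return coeffs


# With b = 1 the coefficients are the binomial coefficients themselves.
print(taylor_coefficients_of_power(5, 1))
```

Evaluating $\sum_i c_i a^i$ with these coefficients should then agree with $(a+b)^n$ for any integers $a$ and $b$.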
LEMMA: Suppose $s$ is a Natural number. Consider the polynomial of degree $s$: $f_s(x) = \sum_{j=0}^{s} a_jx^j = a_0 + a_1x + a_2x^2 + \dots + a_sx^s$, $a_s \neq 0$. We will find
\[
f_s^{(i)}(x) = \sum_{l=0}^{s-i} \binom{s-l}{i}\, i!\, a_{s-l}\, x^{s-i-l},
\]
where $f_s^{(i)}(x)$ is the $i$th derivative of $f_s(x)$, with $i$ among $0, 1, 2, \dots, s$.
Proof:
\[
f_s(x) = \sum_{j=0}^{s} a_jx^j = a_0 + a_1x + a_2x^2 + \dots + a_sx^s, \quad a_s \neq 0.
\]
$f_s(x)$ has $s+1$ terms. Each term is a monomial with a coefficient and a power of $x$. Let’s associate with $f_s(x)$ the set $A_{f_s(x)} = \{(a_j, x^j) \mid j = 0, 1, \dots, s\}$, where $(a, x^j) = (b, x^k)$ if and only if $a = b$ and $j = k$. The purpose of this type of set will be to compare sums to see if they are equal. We have to be a little careful because a repeated element counts only once in a set but obviously alters a sum. Also in the above set any element of the form $(0, x^j)$ is to be omitted as it would not contribute to the sum. It’s all quite trivial really; the most taxing aspect is keeping track of the notational maze.

What is the $i$th derivative of $a_jx^j$, where $0 \le i \le s$? If $i > j$ it is 0; if $i \le j$ it would be $j \cdot (j-1) \cdot (j-2) \cdots (j-i+1) \cdot a_j \cdot x^{j-i} = \frac{j!}{(j-i)!} \cdot a_j \cdot x^{j-i}$. So
\[
A_{f_s^{(i)}(x)} = \left\{ \left( \frac{j!}{(j-i)!} \cdot a_j,\; x^{j-i} \right) \,\middle|\, j = i, i+1, \dots, s \right\}.
\]
It is clear by inspection that the $s-i+1$ elements of the above set, that is the ordered pairs, are all distinct as $j$ runs its course from $i$ to $s$. There will be no terms to discard because although $a_j$ might be zero for some particular polynomial there will be no value of $j$ for which $a_j$ will be definitely zero for the general polynomial $f_s(x)$. According to the lemma we have that
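\[
f_s^{(i)}(x) = \sum_{l=0}^{s-i} \binom{s-l}{i}\, i!\, a_{s-l}\, x^{s-i-l},
\]
so for this expression we would find the associated set to be
\[
A^{\mathrm{Lemma}}_{f_s^{(i)}(x)} = \left\{ \left( \binom{s-l}{i}\, i!\, a_{s-l},\; x^{s-l-i} \right) \,\middle|\, l = 0, 1, 2, \dots, s-i \right\}
\]
where $i$ is an integer in $\{0, 1, 2, \dots, s\}$. Now,
\[
\binom{s-l}{i} = \frac{(s-l)!}{i!\,(s-l-i)!}, \quad\text{so}\quad \binom{s-l}{i}\, i! = \frac{(s-l)!}{(s-l-i)!},
\]
leading to
\[
A^{\mathrm{Lemma}}_{f_s^{(i)}(x)} = \left\{ \left( \frac{(s-l)!}{(s-l-i)!} \cdot a_{s-l},\; x^{s-l-i} \right) \,\middle|\, l = 0, 1, \dots, s-i \right\}.
\]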
Both $A^{\mathrm{Lemma}}_{f_s^{(i)}(x)}$ and $A_{f_s^{(i)}(x)}$ have the same number of distinct elements, namely $s-i+1$, so we need only show these two finite sets are equal to establish the equality of the associated sums. To prove that two sets $A$ and $B$ are equal it is sufficient to show that $A \subseteq B$ and $B \subseteq A$. However, if the two sets are finite and have the same number of elements it is sufficient to show the truth of only one of the two containments. To prove that $A \subseteq B$ we must show that an arbitrary element in $A$ will always be also found in $B$. So we will now show that $A^{\mathrm{Lemma}}_{f_s^{(i)}(x)} \subseteq A_{f_s^{(i)}(x)}$. So we now take an arbitrary element in $A^{\mathrm{Lemma}}_{f_s^{(i)}(x)}$:
\[
\left( \frac{(s-l_0)!}{(s-l_0-i)!} \cdot a_{s-l_0},\; x^{s-l_0-i} \right)
\]
and must show its presence in $A_{f_s^{(i)}(x)}$. Suppose $j_{l_0} = s - l_0$. What can we say about the range of $j_{l_0}$? Well, $j_{l_0}$ can be no smaller than $s - (s-i) = i$ and no bigger than $s - 0 = s$. In other words $j_{l_0}$ must be in $\{i, i+1, \dots, s\}$, meaning that the element below, obtained by replacing $s - l_0$ with $j_{l_0}$ in the element above, will indeed place this element in $A_{f_s^{(i)}(x)}$:
\[
\left( \frac{j_{l_0}!}{(j_{l_0}-i)!} \cdot a_{j_{l_0}},\; x^{j_{l_0}-i} \right).
\]
This establishes the equality of the sets and the corresponding sums and hence the Lemma.
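As a sanity check on the Lemma, the closed form can be compared with repeated term-by-term differentiation. The following Python sketch is an illustration only; the function names and the sample coefficients are our own assumptions, and coefficients are stored lowest degree first.

```python
from math import comb, factorial


def derivative(coeffs):
    """Derivative of the polynomial [a0, a1, ..., as] (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def repeated_derivative(coeffs, i):
    """i-th derivative obtained by differentiating i times."""
    for _ in range(i):
        coeffs = derivative(coeffs)
    return list(coeffs)


def lemma_derivative(coeffs, i):
    """i-th derivative built from the Lemma's closed form."""
    s = len(coeffs) - 1
    result = [0] * (s - i + 1)
    for l in range(s - i + 1):
        # term C(s-l, i) * i! * a_{s-l} * x^(s-i-l)
        result[s - i - l] = comb(s - l, i) * factorial(i) * coeffs[s - l]
    return result


sample = [7, -3, 0, 5, 2]  # 2x^4 + 5x^3 - 3x + 7
for i in range(len(sample)):
    print(i, lemma_derivative(sample, i), repeated_derivative(sample, i))
```

The two lists printed on each line should coincide for every $i$ from 0 to $s$.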
PROOF of MINIMAX THEOREM: Suppose $g(x)$ is the right translation function by $h$ of $f(x) = f_n(x) = \sum_{j=0}^{n} a_jx^j$. As found earlier we have
\[
\begin{aligned}
g(x) &= f_n(x-h) \\
&= \sum_{j=0}^{n} a_j(x-h)^j \\
&= \sum_{j=0}^{n} a_j \left( \sum_{k=0}^{j} \binom{j}{k} x^{j-k}(-h)^k \right)
\end{aligned}
\]
using the Binomial Theorem. We wish to show the above sum to be identical with
\[
g^{\mathrm{Theorem}}(x) = \sum_{i=0}^{n} \frac{1}{i!}f^{(i)}(-h)\,x^i.
\]
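From the Lemma we know that
\[
f^{(i)}(-h) = f_n^{(i)}(-h) = \sum_{l=0}^{n-i} \binom{n-l}{i}\, i!\, a_{n-l}\,(-h)^{n-i-l}.
\]
Substituting the above into $g^{\mathrm{Theorem}}(x)$ we find
\[
\begin{aligned}
g^{\mathrm{Theorem}}(x) &= \sum_{i=0}^{n} \frac{1}{i!} \left( \sum_{l=0}^{n-i} \binom{n-l}{i}\, i!\, a_{n-l}\,(-h)^{n-i-l} \right) x^i \\
&= \sum_{i=0}^{n} \left( \sum_{l=0}^{n-i} \binom{n-l}{i}\, a_{n-l}\,(-h)^{n-i-l} \right) x^i
\end{aligned}
\]
so the coefficient of $x^i$ in $g^{\mathrm{Theorem}}(x)$, $i = 0, 1, \dots, n$, is
\[
\sum_{l=0}^{n-i} \binom{n-l}{i}\, a_{n-l}\,(-h)^{n-i-l};
\]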
we would like to extract the coefficient of $x^i$ in
\[
f(x-h) = \sum_{j=0}^{n} a_j \left( \sum_{k=0}^{j} \binom{j}{k} x^{j-k}(-h)^k \right)
\]
and show that it, too, has this same coefficient for these $i \in \{0, 1, \dots, n\}$.
Clearly the exponent $j-k$ takes on exactly the values $0, 1, \dots, n$. Suppose $j - k = j^* \in \{0, 1, \dots, n\}$. Which values of $(k, j)$, $0 \le k \le j \le n$, satisfy $j - k = j^*$? First we observe $j \ge j^*$. The table below indicates the $n - j^* + 1$ distinct pairs $(j, k)$ which satisfy $j - k = j^*$.
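\[
\begin{array}{c|c|l}
j & k & j - k \overset{?}{=} j^* \\ \hline
j^* & 0 & j^* \ \checkmark \\
j^* + 1 & 1 & j^* \ \checkmark \\
j^* + 2 & 2 & j^* \ \checkmark \\
\vdots & \vdots & \vdots \\
j^* + c & c & j^* \ \checkmark \quad \text{provided } 0 \le c \le n - j^* \\
\vdots & \vdots & \vdots \\
j^* + (n - j^*) = n & n - j^* & j^* \ \checkmark
\end{array}
\]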
The coefficient of $x^i$ in $g^{\mathrm{Theorem}}(x)$ has $n - i + 1$ terms and $i$, like $j^*$, takes on precisely the values in $\{0, 1, \dots, n\}$. So both coefficients have the same number of addends, namely $n - i + 1$. So it would be sufficient to show the individual addends are equal.
The coefficient of $x^{j^*}$ is
\[
a_{j^*}\binom{j^*}{0}(-h)^0 + a_{j^*+1}\binom{j^*+1}{1}(-h)^1 + \dots + a_{j^*+(n-j^*)}\binom{j^*+(n-j^*)}{n-j^*}(-h)^{n-j^*} = \sum_{j=j^*}^{n} a_j \binom{j}{j-j^*}(-h)^{j-j^*}.
\]
So, to correlate notation, if $j^* = i$, the coefficient of $x^i$ would be
\[
\sum_{j=i}^{n} a_j \binom{j}{j-i}(-h)^{j-i} = a_i\binom{i}{0}(-h)^0 + a_{i+1}\binom{i+1}{1}(-h)^1 + \dots + a_n\binom{n}{n-i}(-h)^{n-i},
\]
which, adding up the terms in reverse, would lead to the expression
\[
\begin{aligned}
\sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{n-i-l}(-h)^{n-i-l} &= \sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{(n-l)-i}(-h)^{n-i-l} \\
&= \sum_{l=0}^{n-i} a_{n-l}\binom{n-l}{i}(-h)^{n-i-l},
\end{aligned}
\]
since $\binom{m}{k} = \binom{m}{m-k}$; $k \le m$.
This final sum has the same form as the coefficient of $x^i$ in the expression for $g^{\mathrm{Theorem}}(x)$, as was to be shown, proving Part (i) of the Theorem.
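Part (i) can also be checked on concrete polynomials by computing the coefficients of $f(x-h)$ in the two ways compared in the proof. The sketch below is illustrative only; the function names are our own, coefficients are stored lowest degree first, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import comb, factorial


def derivative(coeffs):
    """Derivative of the polynomial [a0, a1, ..., an] (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial [a0, a1, ..., an] at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def translate_by_binomial(coeffs, h):
    """Coefficients of f(x - h), expanding each a_j (x - h)^j by the Binomial Theorem."""
    out = [0] * len(coeffs)
    for j, a in enumerate(coeffs):
        for k in range(j + 1):
            out[j - k] += a * comb(j, k) * (-h) ** k
    return out


def translate_by_derivatives(coeffs, h):
    """Coefficients f^(i)(-h) / i!, as stated in Part (i)."""
    out = []
    d = list(coeffs)
    for i in range(len(coeffs)):
        out.append(Fraction(evaluate(d, -h), factorial(i)))
        d = derivative(d)
    return out


f = [19, -32, 24, -8, 1]  # x^4 - 8x^3 + 24x^2 - 32x + 19
print(translate_by_binomial(f, -2))
print(translate_by_derivatives(f, -2))
```

Both calls should return the coefficients of the same polynomial; here that polynomial is $x^4 + 3$, the example quartic slid two units to the left.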
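Suppose $x_{ex}$ satisfies $f'(x_{ex}) = 0$ and that $i_{ex}$ is the smallest Natural number, $2 \le i_{ex} \le n$, for which $f^{(i_{ex})}(x_{ex}) \neq 0$, and that $i_{ex}$ is even. A smallest $i_{ex}$, $2 \le i_{ex} \le n$, such that $f^{(i_{ex})}(x_{ex}) \neq 0$ will always exist since we always have $f^{(n)}(x_{ex}) = n!\,a_n \neq 0$. We then construct $g(x)$ by sliding $f(x)$ to the “left” by $x_{ex}$, meaning we use $h = -x_{ex}$. By the result found in Part (i) of the Minimax Theorem we would have
\[
g(x) = f(x_{ex}) + f'(x_{ex})x + \frac{1}{2}f''(x_{ex})x^2 + \frac{f^{(3)}(x_{ex})}{3!}x^3 + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^n
\]
and substituting our assumptions about $x_{ex}$ and $i_{ex}$ leads to
\[
\begin{aligned}
g(x) &= f(x_{ex}) + \langle \text{terms of value zero} \rangle + \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}x^{i_{ex}} + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^n \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \frac{f^{(i_{ex}+1)}(x_{ex})}{(i_{ex}+1)!}x + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-i_{ex}} \right) \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j \right),
\end{aligned}
\]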
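where the $\sum$ term applies when $i_{ex} \neq n$. It is dropped in the case $i_{ex} = n$ itself. So we let $C(x)$, the coefficient of $x^{i_{ex}}$, be defined as
\[
C(x) =
\begin{cases}
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \displaystyle\sum_{j=1}^{n-i_{ex}} \dfrac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j, & \text{if } i_{ex} < n; \\[2ex]
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}, & \text{if } i_{ex} = n.
\end{cases}
\]
Provided $i_{ex} < n$ we define $Q(x)$ as
\[
Q(x) = \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j.
\]
So we have
\[
g(x) = f(x_{ex}) + x^{i_{ex}} \cdot C(x).
\]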
If $n$ were even, our assumed even $i_{ex}$ might actually equal $n$. Addressing this case first we would have
\[
C(x) = \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} = \frac{f^{(n)}(x_{ex})}{n!},
\]
which would be a constant we’ll call $C_n$. So, in this special case
\[
g(x) = f(x_{ex}) + x^n \cdot C_n.
\]
By inspection it can be seen that the point $(0, g(0)) = (0, f(x_{ex}))$ is an absolute extremum of $g(x)$: since $n$ is even the term $x^n \cdot C_n$ will always have the sign of the constant $C_n$ regardless of the sign of $x$. $(0, f(x_{ex}))$ will therefore be a maximum point when the constant $C_n$ is negative and a minimum point when the constant $C_n$ is positive.
We now consider $i_{ex} < n$ and look at $Q(x)$. $Q(x)$ is a polynomial of degree $n - i_{ex} \ge 1$. Since $Q(0) = 0$ and $Q(x)$ is a polynomial, $Q(x)$ is continuous everywhere including at $x = 0$. We are given that $f^{(i_{ex})}(x_{ex}) \neq 0$ so we may let $\delta = \left| \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} \right|$, which is guaranteed positive. By the continuity of $Q(x)$ we can find an $\varepsilon > 0$ so that $|Q(x)| < \delta$ whenever $|x| < \varepsilon$. Since $i_{ex}$ is even we have $x^{i_{ex}} > 0$ whenever $x \neq 0$ regardless of $x$’s sign. As long as we restrict ourselves to this $\varepsilon$-neighborhood of $x = 0$ we would have the sign of the coefficient $C(x)$ always be the same as the sign of the constant $\frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}$. We’ll call this constant $C_{i_{ex}}$, allowing us to write
\[
g(x) = f(x_{ex}) + x^{i_{ex}}\left(C_{i_{ex}} + Q(x)\right).
\]
By an argument similar to the one used above for the case $i_{ex} = n$ we see that $(0, f(x_{ex}))$ is a local extremum. In the above argument $x$ was free to range over all numbers but here we are restricted to the $\varepsilon$-neighborhood which will keep $|Q(x)| < \delta = |C_{i_{ex}}|$, meaning we are observing possibly only a “local” extremum. So if $C_{i_{ex}}$ is negative, $(0, f(x_{ex}))$ is a local maximum of $g(x)$ and if $C_{i_{ex}}$ is positive, $(0, f(x_{ex}))$ is a local minimum of $g(x)$. By the definition of $C_{i_{ex}}$ it is immediately apparent that it has the same sign as $f^{(i_{ex})}(x_{ex})$. So we see that $(x_{ex}, f(x_{ex}))$ must have been a local extremum of $f(x)$ when the two conditions in Part (ii) of the Minimax Theorem are assumed met.
Now for the converse: Assuming $(x_{ex}, f(x_{ex}))$ is a local extremum we must show the two conditions in Part (ii) to be necessarily true, namely that $f'(x_{ex}) = 0$ and that the smallest Natural number $i_{ex}$, $2 \le i_{ex} \le n$, with $f^{(i_{ex})}(x_{ex}) \neq 0$ is even. Suppose for a moment that the constant $f'(x_{ex})$, which we shall call $b$, were not zero. We would then have
\[
\begin{aligned}
g(x) &= f(x_{ex}) + \frac{f'(x_{ex})}{1!}x + \frac{f''(x_{ex})}{2!}x^2 + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^n \\
&= f(x_{ex}) + x\left( f'(x_{ex}) + \frac{f''(x_{ex})}{2!}x + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} \right) \\
&= f(x_{ex}) + x\left( b + \frac{f''(x_{ex})}{2!}x + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} \right).
\end{aligned}
\]
This time we let
\[
Q(x) = \frac{f''(x_{ex})}{2!}x + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^{n-1} = \sum_{j=2}^{n} \frac{f^{(j)}(x_{ex})}{j!}x^{j-1}.
\]
Again we have $Q(0) = 0$ and $Q(x)$ is a polynomial and hence continuous at $x = 0$. We let $\delta = |b| > 0$. We can then find an $\varepsilon > 0$ so that whenever $x \in (-\varepsilon, \varepsilon)$, $|Q(x)| < \delta$, making the coefficient of $x$ retain the sign of $b$ on both sides of $x = 0$. If $b$ were positive, abscissas to the left of the origin would result in values for $g(x)$ below $f(x_{ex})$ and abscissas to the right of the origin would result in values above $f(x_{ex})$, no matter how small a positive number $\varepsilon$ were. If $b$ were negative, abscissas to the left of the origin would result in values for $g(x)$ above $f(x_{ex})$ and abscissas to the right of the origin would result in values below $f(x_{ex})$. In either case it is impossible for $(0, f(x_{ex}))$ to be an extremum. Therefore we must have $b = 0$, i.e., $f'(x_{ex}) = 0$, establishing the first condition. Now for the second condition. Assume again for a moment that the smallest $i_{ex}$, $2 \le i_{ex} \le n$, making $f^{(i_{ex})}(x_{ex}) \neq 0$ were odd. We would then be able to write
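\[
\begin{aligned}
g(x) &= f(x_{ex}) + \langle \text{terms of value zero} \rangle + \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}x^{i_{ex}} + \dots + \frac{f^{(n)}(x_{ex})}{n!}x^{n} \\
&= f(x_{ex}) + x^{i_{ex}} \left( \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j \right)
\end{aligned}
\]
where again the $\sum$ term applies provided $i_{ex} \neq n$ and is dropped in the case $i_{ex} = n$ itself. So once more we let $C(x)$, the coefficient of $x^{i_{ex}}$, be defined as
\[
C(x) =
\begin{cases}
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!} + \displaystyle\sum_{j=1}^{n-i_{ex}} \dfrac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j, & \text{if } i_{ex} < n; \\[2ex]
\dfrac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}, & \text{if } i_{ex} = n.
\end{cases}
\]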
Once again we can use the following short-cut notation for the constant
\[
C_{i_{ex}} = \frac{f^{(i_{ex})}(x_{ex})}{(i_{ex})!}
\]
so that $C_n = \frac{f^{(n)}(x_{ex})}{n!}$. In the event that $n$ were odd we might again have that $i_{ex} = n$. Addressing this case first we could write
\[
g(x) = f(x_{ex}) + C_n \cdot x^n.
\]
Since $x^n$ would be an odd power, $C_n \cdot x^n$ would have opposite signs for abscissas on opposite sides of $x = 0$ no matter how tiny an interval about $x = 0$ we consider, making it impossible for $(0, f(x_{ex}))$ to be an extremum of $g(x)$ and hence for $(x_{ex}, f(x_{ex}))$ to be an extremum of $f(x)$. In the case $i_{ex} < n$, we let, as before,
\[
Q(x) = \sum_{j=1}^{n-i_{ex}} \frac{f^{(i_{ex}+j)}(x_{ex})}{(i_{ex}+j)!}x^j.
\]
We have $Q(0) = 0$ and $Q(x)$, being a polynomial, is continuous at $x = 0$. We let $\delta = |C_{i_{ex}}| > 0$ and we can find an $\varepsilon > 0$ so that $|Q(x)| < \delta$ whenever $x \in (-\varepsilon, \varepsilon)$, meaning that within this interval $C(x) = C_{i_{ex}} + Q(x)$ retains the sign of $C_{i_{ex}}$. As $i_{ex}$ is being assumed odd, $x^{i_{ex}}$ is an odd power and hence within the $\varepsilon$-neighborhood $C(x) \cdot x^{i_{ex}}$ is of opposite sign on opposite sides of the origin. So we see that $(0, f(x_{ex}))$ cannot be a local extremum of $g(x)$, as no matter how tight an $\varepsilon$-neighborhood around $x = 0$ we choose, the values for $g(x)$ will be on opposite sides of $f(x_{ex})$ when the abscissas are on opposite sides of $x = 0$, meaning that $(x_{ex}, f(x_{ex}))$ could not be a local extremum of $f(x)$, thus voiding the temporary assumption that $i_{ex}$ might be odd. So $i_{ex}$ must be even, which establishes the Minimax Theorem.
MiniMax Corollary For Linear (n=1) Polynomials: Given $f(x) = a_1x + a_0$, $a_1 \neq 0$, $f(x)$ has no extrema.
Proof: $f^{(1)}(x) = a_1 \neq 0$ for all values of $x$. Hence $f^{(1)}(x) = 0$ has no solution so $f(x)$ can have no extrema.
MiniMax Corollary For Quadratic (n=2) Polynomials: Given $f(x) = a_2x^2 + a_1x + a_0$, $a_2 \neq 0$. The point
\[
(x_{ex}, f(x_{ex})) = \left( \frac{-a_1}{2a_2},\; \frac{4a_0a_2 - a_1^2}{4a_2} \right)
\]
is always an extremum and is a (universal) maximum when $a_2 < 0$ and a (universal) minimum when $a_2 > 0$.
Proof: We start by calculating $f^{(1)}(x) = f'(x) = 2a_2x + a_1$. Since $2a_2x + a_1$ is a polynomial of degree one it will have at most one root and since its degree is odd it will have at least one root and therefore will have exactly one root, which by inspection is $x_{ex} = \frac{-a_1}{2a_2}$. We find that $f^{(2)}(x) = 2a_2 \neq 0$ for all values of $x$, including therefore $x_{ex} = \frac{-a_1}{2a_2}$, making $x_{ex}$ the abscissa of an extremum since “2” is even. $x_{ex}$ will be the abscissa of a local maximum when $a_2$ is negative and the abscissa of a local minimum when $a_2$ is positive. The extremum is universal, for suppose we slide $f(x)$ to the right by $\frac{a_1}{2a_2}$ to place the extremum on the $y$-axis:
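\[
\begin{aligned}
f\left(x - \frac{a_1}{2a_2}\right) &= a_2\left(x - \frac{a_1}{2a_2}\right)^2 + a_1\left(x - \frac{a_1}{2a_2}\right) + a_0 \\
&= a_2\left(x^2 - \frac{a_1}{a_2}x + \frac{a_1^2}{4a_2^2}\right) + a_1x - \frac{a_1^2}{2a_2} + a_0 \\
&= a_2x^2 + \frac{a_1^2}{4a_2} - \frac{a_1^2}{2a_2} + a_0 \\
&= a_2x^2 + \frac{4a_2a_0 - a_1^2}{4a_2}.
\end{aligned}
\]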
We can now see by inspection of the last expression above that $g(x) = f\left(x - \frac{a_1}{2a_2}\right)$ has $\left(0, \frac{4a_2a_0 - a_1^2}{4a_2}\right)$ as a universal extremum. If $a_2 < 0$ this point would be a universal maximum; if $a_2 > 0$ this point would be a universal minimum.
MiniMax Corollary For Cubic (n=3) Polynomials: Given $f(x) = a_3x^3 + a_2x^2 + a_1x + a_0$, $a_3 \neq 0$, $f(x)$ will have two extrema
\[
(x_{max}, f(x_{max})) = \left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3},\; \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 + 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \right)
\]
and
\[
(x_{min}, f(x_{min})) = \left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3},\; \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 - 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2} \right)
\]
if and only if $a_2^2 - 3a_1a_3 > 0$; $x_{max}$ will be the abscissa of the local maximum and $x_{min}$ will be the abscissa of the local minimum. If $a_2^2 - 3a_1a_3 \le 0$, $f(x)$ will have no extrema.
Proof: We start again with $f^{(1)}(x) = 3a_3x^2 + 2a_2x + a_1$. We wish to find the roots for $3a_3x^2 + 2a_2x + a_1 = 0$. Since $f^{(1)}(x)$ is a degree 2 polynomial it will have at most two roots. Since $f^{(1)}(x)$ is of even degree it may have no roots at all, since its range is a half-line. Using the Quadratic Formula we find the two solutions
\[
x_{ex_1} = \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \quad\text{and}\quad x_{ex_2} = \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3}
\]
when $a_2^2 - 3a_1a_3 > 0$; the one solution
\[
x_1 = \frac{-a_2}{3a_3}
\]
when $a_2^2 - 3a_1a_3 = 0$; and no solutions when $a_2^2 - 3a_1a_3 < 0$. By the MiniMax Theorem there can be no extrema when $f^{(1)}(x)$ cannot equal zero. Can $x_1 = \frac{-a_2}{3a_3}$ be the abscissa of an extremum? Well, $f^{(2)}(x) = 6a_3x + 2a_2$, so $f^{(2)}(x_1) = f^{(2)}\left(\frac{-a_2}{3a_3}\right) = 6a_3\left(\frac{-a_2}{3a_3}\right) + 2a_2 = -2a_2 + 2a_2 = 0$ and $f^{(3)}(x_1) = f^{(3)}\left(\frac{-a_2}{3a_3}\right) = 6a_3 \neq 0$, so the answer is “No” since 3 is odd. We are therefore left with $x_{ex_1}$ and $x_{ex_2}$ as the only candidates for being extrema, as $f^{(1)}(x_{ex_1}) = f^{(1)}(x_{ex_2}) = 0$. Calculating the second derivative for each of these candidates we find
\[
\begin{aligned}
f^{(2)}(x_{ex_1}) &= 6a_3x_{ex_1} + 2a_2 \\
&= 6a_3\left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right) + 2a_2 \\
&= -2\sqrt{a_2^2 - 3a_1a_3} < 0 \text{ and hence } \neq 0
\end{aligned}
\]
so $x_{ex_1}$ is the abscissa of a local maximum, so we use the symbol $x_{max}$ for $x_{ex_1}$, and
\[
\begin{aligned}
f^{(2)}(x_{ex_2}) &= 6a_3x_{ex_2} + 2a_2 \\
&= 6a_3\left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right) + 2a_2 \\
&= 2\sqrt{a_2^2 - 3a_1a_3} > 0 \text{ and hence } \neq 0
\end{aligned}
\]
so $x_{ex_2}$ is the abscissa of a local minimum, so we use the symbol $x_{min}$ for $x_{ex_2}$. To calculate the ordinates of these extrema we must expand $f(x_{max}) = f\left( \frac{-a_2 - \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right)$ and $f(x_{min}) = f\left( \frac{-a_2 + \sqrt{a_2^2 - 3a_1a_3}}{3a_3} \right)$. This simple but lengthy calculation was performed in the chapter on the cubic. Adapting the notation used there for the coefficients of the given cubic to our present notation we find
\[
f(x_{max}) = \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 + 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2}
\]
and
\[
f(x_{min}) = \frac{2a_2^3 - 9a_1a_2a_3 + 27a_3^2a_0 - 2(a_2^2 - 3a_1a_3)\sqrt{a_2^2 - 3a_1a_3}}{27a_3^2},
\]
completing the proof of the corollary.
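The cubic corollary can be checked numerically by evaluating $f$ at the two abscissas and comparing with the closed-form ordinates. The sketch below is an illustration only; the function names are our own, floating-point arithmetic is used, and the constant term is taken to be $a_0$.

```python
import math


def cubic_extrema(a3, a2, a1, a0):
    """Return ((x_max, y_max), (x_min, y_min)) for a3 x^3 + a2 x^2 + a1 x + a0,
    or None when a2^2 - 3 a1 a3 <= 0 (no extrema)."""
    if a3 == 0:
        raise ValueError("a3 must be nonzero for a cubic")
    disc = a2**2 - 3 * a1 * a3
    if disc <= 0:
        return None
    root = math.sqrt(disc)
    x_max = (-a2 - root) / (3 * a3)
    x_min = (-a2 + root) / (3 * a3)

    def f(x):
        return ((a3 * x + a2) * x + a1) * x + a0

    return (x_max, f(x_max)), (x_min, f(x_min))


def closed_form_ordinates(a3, a2, a1, a0):
    """Ordinates (y_max, y_min) from the closed-form expressions; requires disc > 0."""
    disc = a2**2 - 3 * a1 * a3
    common = 2 * a2**3 - 9 * a1 * a2 * a3 + 27 * a3**2 * a0
    extra = 2 * disc * math.sqrt(disc)
    return (common + extra) / (27 * a3**2), (common - extra) / (27 * a3**2)


print(cubic_extrema(2, -3, -12, 5))
print(closed_form_ordinates(2, -3, -12, 5))
```

For the sample cubic $2x^3 - 3x^2 - 12x + 5$ the derivative factors as $6(x-2)(x+1)$, so the extrema can also be confirmed by hand.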
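MiniMax Corollary For Quartic (n=4) Polynomials: Given $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, $a_4 \neq 0$, $f(x)$ will have either one extremum or three extrema. Letting
\[
p = \frac{8a_2a_4 - 3a_3^2}{16a_4^2}, \quad q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_4^2a_1}{32a_4^3}, \quad\text{and}\quad D = \frac{q^2}{4} + \frac{p^3}{27}
\]
we find
(i) if $D \ge 0$ then
\[
x_{ex} = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} - \frac{a_3}{4a_4}
\]
is the abscissa of the one and only extremum of $f(x)$; $x_{ex}$ is an (absolute) maximum when $a_4 < 0$; $x_{ex}$ is an (absolute) minimum when $a_4 > 0$, and
(ii) if $D < 0$ then
\[
\begin{aligned}
x_{ex_1} &= -2\sqrt{-\frac{p}{3}}\, \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] \right] - \frac{a_3}{4a_4}, \\
x_{ex_2} &= 2\sqrt{-\frac{p}{3}}\, \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] - 60^\circ \right] - \frac{a_3}{4a_4} \quad\text{and} \\
x_{ex_3} &= 2\sqrt{-\frac{p}{3}}\, \cos\left[ \frac{1}{3}\arccos\left[ \frac{3q\sqrt{-3p}}{2p^2} \right] + 60^\circ \right] - \frac{a_3}{4a_4}
\end{aligned}
\]
are the abscissas of the three extrema.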
If $a_4 < 0$ then $x_{ex_1}$ is the abscissa of a local maximum, $x_{ex_3}$ is the abscissa of a local minimum and $x_{ex_2}$ is the abscissa of a local maximum. The larger of $f(x_{ex_1})$ and $f(x_{ex_2})$ would serve as the absolute maximum value in the range of $f(x)$.
If $a_4 > 0$ then $x_{ex_1}$ is the abscissa of a local minimum, $x_{ex_3}$ is the abscissa of a local maximum and $x_{ex_2}$ is the abscissa of a local minimum. The smaller of $f(x_{ex_1})$ and $f(x_{ex_2})$ would serve as the absolute minimum value in the range of $f(x)$.
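To see the corollary at work on concrete quartics, the abscissas can be computed as follows. This Python sketch is only an illustration under stated assumptions: the function names are our own, floating-point arithmetic is used, the argument of the arccosine is clamped to $[-1, 1]$ to guard against rounding, and $60^\circ$ is written as $\pi/3$ radians.

```python
import math


def real_cbrt(x):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def quartic_extrema_abscissas(a4, a3, a2, a1):
    """Abscissas of the extrema of a4 x^4 + a3 x^3 + a2 x^2 + a1 x + a0.
    Returns [x_ex] when D >= 0 and [x_ex1, x_ex2, x_ex3] when D < 0."""
    p = (8 * a2 * a4 - 3 * a3**2) / (16 * a4**2)
    q = (a3**3 - 4 * a2 * a3 * a4 + 8 * a4**2 * a1) / (32 * a4**3)
    D = q**2 / 4 + p**3 / 27
    shift = a3 / (4 * a4)
    if D >= 0:
        s = math.sqrt(D)
        return [real_cbrt(-q / 2 + s) + real_cbrt(-q / 2 - s) - shift]
    r = 2 * math.sqrt(-p / 3)
    arg = 3 * q * math.sqrt(-3 * p) / (2 * p**2)
    arg = max(-1.0, min(1.0, arg))  # guard against rounding just outside [-1, 1]
    phi = math.acos(arg) / 3
    sixty = math.pi / 3
    x1 = -r * math.cos(phi) - shift
    x2 = r * math.cos(phi - sixty) - shift
    x3 = r * math.cos(phi + sixty) - shift
    return [x1, x2, x3]


print(quartic_extrema_abscissas(1, -8, 24, -32))  # (x - 2)^4 + 3: one extremum
print(quartic_extrema_abscissas(1, 0, -2, 0))     # x^4 - 2x^2: three extrema
```

The constant term $a_0$ plays no role in locating the abscissas, which is why the sketch omits it.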
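For reference we carry out the following simple but lengthy calculations:
\[
\begin{aligned}
f(x_{ex}) ={}& \frac{768a_0a_4^3 + 512a_4^4p^2 + 192a_3^2a_4^2p - 512a_2a_4^3p - 9a_3^4 + 48a_2a_3^2a_4 - 192a_1a_3a_4^2}{768a_4^3} \\
&+ \left( \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2 - 8a_4^3q}{8a_4^2} \right) \left( \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} \right) \\
&+ \left( \frac{8a_2a_4 - 3a_3^2 - 8a_4^2p}{8a_4} \right) \left( \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2} + \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2} \right)
\end{aligned}
\]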
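Writing $\varphi = \frac{1}{3}\arccos\left( \frac{3q\sqrt{-3p}}{2p^2} \right)$ for brevity,
\[
\begin{aligned}
f(x_{ex_1}) ={}& \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} + \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2}\sqrt{-\frac{p}{3}}\,\cos\left[\varphi\right] \\
&+ \frac{3a_3^2p - 8a_2a_4p}{6a_4}\cos^2\left[\varphi\right] + \frac{16}{9}a_4p^2\cos^4\left[\varphi\right], \\[1ex]
f(x_{ex_2}) ={}& \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} - \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2}\sqrt{-\frac{p}{3}}\,\cos\left[\varphi - 60^\circ\right] \\
&+ \frac{3a_3^2p - 8a_2a_4p}{6a_4}\cos^2\left[\varphi - 60^\circ\right] + \frac{16}{9}a_4p^2\cos^4\left[\varphi - 60^\circ\right], \\[1ex]
f(x_{ex_3}) ={}& \frac{256a_0a_4^3 - 64a_1a_3a_4^2 + 16a_2a_3^2a_4 - 3a_3^4}{256a_4^3} - \frac{4a_2a_3a_4 - a_3^3 - 8a_1a_4^2}{4a_4^2}\sqrt{-\frac{p}{3}}\,\cos\left[\varphi + 60^\circ\right] \\
&+ \frac{3a_3^2p - 8a_2a_4p}{6a_4}\cos^2\left[\varphi + 60^\circ\right] + \frac{16}{9}a_4p^2\cos^4\left[\varphi + 60^\circ\right].
\end{aligned}
\]
Proof: Although our intent is to provide a proof using the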
Minimax Theorem it is interesting to note that this corollary is a direct consequence of the connection between the derivative and the integral (also aptly called the antiderivative). The Fundamental Theorem of the Differential and Integral Calculus finds that
\[
\int_a^b g^{(1)}(t)\,dt = G(b) - G(a)
\]
where $G(t)$ is any function whose first derivative is $g^{(1)}(t)$. Clearly, by definition, $G(t) = g(t)$ is a function whose first derivative is $g^{(1)}(t)$, but so would be $G(t) = g(t) + K$ where $K$ is any constant whatsoever. In fact the set of all antiderivatives of $g^{(1)}(t)$ is $\{g(t) + K \mid K \text{ any number}\}$.
The corollary states that when the first derivative, the cubic $f^{(1)}(x)$ of the given quartic $f(x)$, has one root, this root is the abscissa of the one extremum; when this derivative has two roots (the cubic’s flat case) the quartic still has just one extremum, whose abscissa is the root associated with the point where the graph of $f^{(1)}(x)$ actually “cuts through” the $x$-axis; and when $f^{(1)}(x)$ has three roots then the quartic $f(x)$ has three extrema. Extrema are frequently called “turning points.” Below are three graphs illustrating these cases:
[Three graphs of $f^{(1)}(x)$ against $x$, each with $x_0$ marked to the left of all roots:
Case I: $f^{(1)}(x)$ has one root, $A$, and hence one turning point for $f(x)$ at $A$.
Case II: $f^{(1)}(x)$ has two roots, $A$ and $B$, and still one turning point for $f(x)$, at $A$ only.
Case III: $f^{(1)}(x)$ has three roots and hence three turning points for $f(x)$ at $A$, $B$, and $C$.]
Suppose we let “$a$” in the integral be $x_0$, an arbitrary number on the $x$-axis less than any of $f^{(1)}(x)$’s roots. One then has
\[
\int_{x_0}^{x} f^{(1)}(t)\,dt = f(x) - f(x_0).
\]
So,
\[
f(x) = \int_{x_0}^{x} f^{(1)}(t)\,dt + f(x_0).
\]
$f(x_0)$ is a constant; $\int_{x_0}^{x} f^{(1)}(t)\,dt$ may be thought of as the signed “area” under the curve $f^{(1)}$ as we move from $x_0$ to $x$. As we move from $x_0$ towards $A$ in Case I the area is getting more and more negative, meaning that $f(x)$ is getting smaller and smaller, but immediately after we pass $A$ we start to pick up positive area so $f(x)$ will start to get larger, meaning that $A$ is indeed a turning point or extremum. In this particular diagram it would correspond with a minimum.
In Case II we again find that as we move from $x_0$ to $A$ we are accumulating negative area, meaning that $f(x)$ is declining, but then as we pass $A$ we start accumulating positive area, meaning that $f(x)$ starts to grow, making $A$ a turning point. $B$ is only a stationary point and not a turning point because before $B$ (but after $A$, of course) we are accumulating positive area, meaning that $f(x)$ is growing, and after $B$ we continue to accumulate positive area so $f(x)$ is still growing. $f(x)$’s growth ceases only momentarily at $B$ itself but $f(x)$ never actually declines, so $B$ is not an extremum.
In Case III as we move from $x_0$ to $A$ we are accumulating negative area, meaning that $f(x)$ is declining, but just after point $A$ positive area starts to accumulate, meaning that $f(x)$ starts to increase, making $A$ a turning point or extremum. Area continues to be added until point $B$ is reached, meaning that $f(x)$ rises non-stop in the interval between $A$ and $B$, but after point $B$ negative area starts to accumulate, causing $f(x)$ to decline and making $B$ a turning point; $f(x)$ will continue to decline until point $C$ is reached; after point $C$ positive area starts to accumulate, meaning $f(x)$ starts to rise again, making point $C$ also a turning point or extremum.
Now we start the formal proof using the Minimax Theorem. By this theorem we require all abscissas of extrema of $f(x)$ to be roots of $f^{(1)}(x)$. Now $f^{(1)}(x) = 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1$ is a cubic. We will use the previously found cubic formula to find these roots of $f^{(1)}(x)$.
When we apply the cubic formula we must first calculate “$p$” and “$q$” and “$D$.” With the cubic formula “$a_i$”, $i = 0, 1, 2, 3$, represents the coefficient of $x^i$ in the cubic to be solved. The cubic whose roots we seek is
\[
f^{(1)}(x) = 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1
\]
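where the $a_j$’s, $j = 0, 1, 2, 3, 4$, are from the given quartic. So when applying the cubic formula we would have
\[
\text{“}a_3\text{”} = 4a_4, \quad \text{“}a_2\text{”} = 3a_3, \quad \text{“}a_1\text{”} = 2a_2, \quad \text{“}a_0\text{”} = a_1.
\]
The indexed $a$’s in quotes on the LHS (left hand side) of the above chain of equalities have no relation to the indexed $a$’s on the RHS (right hand side). So
\[
\begin{aligned}
p &= \frac{3\,\text{“}a_3\text{”}\,\text{“}a_1\text{”} - \text{“}a_2\text{”}^2}{3\,\text{“}a_3\text{”}^2} \\
&= \frac{3(4a_4)(2a_2) - (3a_3)^2}{3(4a_4)^2} \\
&= \frac{8a_2a_4 - 3a_3^2}{16a_4^2}
\end{aligned}
\]
and
\[
\begin{aligned}
q &= \frac{2\,\text{“}a_2\text{”}^3 - 9\,\text{“}a_1\text{”}\,\text{“}a_2\text{”}\,\text{“}a_3\text{”} + 27\,\text{“}a_3\text{”}^2\,\text{“}a_0\text{”}}{27\,\text{“}a_3\text{”}^3} \\
&= \frac{2(3a_3)^3 - 9(2a_2)(3a_3)(4a_4) + 27(4a_4)^2a_1}{27(4a_4)^3} \\
&= \frac{3a_3^3 - 12a_2a_3a_4 + 24a_4^2a_1}{96a_4^3} \\
&= \frac{a_3^3 - 4a_2a_3a_4 + 8a_4^2a_1}{32a_4^3}
\end{aligned}
\]
and also, as defined in the cubic formula,
\[
D = \frac{q^2}{4} + \frac{p^3}{27}.
\]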
We suppose first that $D > 0$. In this case $f^{(1)}(x)$ has exactly the one root
\[
x_{ex} = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}} - \frac{a_3}{4a_4}.
\]
We must show that either $f^{(2)}(x_{ex}) \neq 0$ or that if $f^{(2)}(x_{ex})$ is zero then $f^{(3)}(x_{ex})$ is also zero ($f^{(4)}(x_{ex}) = 24a_4 \neq 0$ and has the same sign as $a_4$). If $f^{(2)}(x_{ex}) \neq 0$ we must further show that $f^{(2)}(x_{ex})$ has the same sign as $a_4$, which would make $x_{ex}$ the abscissa of a local maximum when $a_4 < 0$ and a local minimum when $a_4 > 0$. We start by calculating the various derivatives:
\[
\begin{aligned}
f^{(1)}(x) &= 4a_4x^3 + 3a_3x^2 + 2a_2x + a_1 \\
f^{(2)}(x) &= 12a_4x^2 + 6a_3x + 2a_2 \\
f^{(3)}(x) &= 24a_4x + 6a_3 \\
f^{(4)}(x) &= 24a_4.
\end{aligned}
\]
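We now evaluate the second derivative at $x = x_{ex}$, the one and only root of the first derivative. For brevity write $u = \sqrt[3]{-\frac{q}{2} + \sqrt{D}}$ and $v = \sqrt[3]{-\frac{q}{2} - \sqrt{D}}$, so that $x_{ex} = u + v - \frac{a_3}{4a_4}$, $u^2 = \sqrt[3]{\left(-\frac{q}{2} + \sqrt{D}\right)^2}$, $v^2 = \sqrt[3]{\left(-\frac{q}{2} - \sqrt{D}\right)^2}$ and $uv = \sqrt[3]{\frac{q^2}{4} - D} = -\frac{p}{3}$. Then
\[
\begin{aligned}
f^{(2)}(x_{ex}) &= 12a_4\left( u + v - \frac{a_3}{4a_4} \right)^2 + 6a_3\left( u + v - \frac{a_3}{4a_4} \right) + 2a_2 \\
&= 12a_4\left( u^2 + v^2 + \frac{a_3^2}{16a_4^2} - \frac{2p}{3} - \frac{a_3}{2a_4}(u + v) \right) + 6a_3\left( u + v - \frac{a_3}{4a_4} \right) + 2a_2 \\
&= 12a_4\left( u^2 + v^2 \right) + \frac{3a_3^2}{4a_4} - 8a_4p - 6a_3(u + v) + 6a_3(u + v) - \frac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\left( u^2 + v^2 \right) + 2a_2 - \frac{3a_3^2}{4a_4} - 8a_4p \\
&= 12a_4\left( u^2 + v^2 \right) + \frac{8a_2a_4 - 3a_3^2}{4a_4} - 8a_4p \\
&= 12a_4\left( u^2 + v^2 \right) + 4a_4p - 8a_4p \\
&= 12a_4\left( u^2 + v^2 \right) - 4a_4p \\
&= 12a_4\left( \left( u^2 + v^2 \right) - \frac{p}{3} \right).
\end{aligned}
\]
So we have for this $D > 0$ case: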
f (2)(xex) = 12a4
[(3
√(−q2
+√D
)2+ 3√(−q2−√D
)2)− p
3
],
f (3)(xex) = 24a4
(3
√−q2
+√D + 3
√−q2−√D − a3
4a4
)+ 6a3
= 24a4
(3
√−q2
+√D + 3
√−q2−√D
)− 6a3 + 6a3
= 24a4
(3
√−q2
+√D + 3
√−q2−√D
), and
f (4)(xex) = 24a4.
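As a numerical sanity check of the closed form for $f^{(2)}(x_{ex})$ in the $D > 0$ case, one can compare it with a direct evaluation of $12a_4x^2 + 6a_3x + 2a_2$ at the root. This is only a floating-point sketch under stated assumptions: the helper names `cbrt` and `second_derivative_at_single_extremum` are hypothetical, and agreement is expected only up to rounding error.

```python
import math

def cbrt(v):
    """Real cube root that also accepts negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def second_derivative_at_single_extremum(a4, a3, a2, a1):
    """For D > 0, return (x_ex, f''(x_ex) computed directly, f''(x_ex) via the closed form)."""
    p = (8 * a2 * a4 - 3 * a3 ** 2) / (16 * a4 ** 2)
    q = (a3 ** 3 - 4 * a2 * a3 * a4 + 8 * a4 ** 2 * a1) / (32 * a4 ** 3)
    D = q * q / 4 + p ** 3 / 27
    if D <= 0:
        raise ValueError("this sketch covers only the D > 0 case")
    A = cbrt(-q / 2 + math.sqrt(D))   # first cube root in the formula for x_ex
    B = cbrt(-q / 2 - math.sqrt(D))   # second cube root
    x_ex = A + B - a3 / (4 * a4)
    direct = 12 * a4 * x_ex ** 2 + 6 * a3 * x_ex + 2 * a2
    # A*A and B*B are the cube roots of the squared radicands.
    closed = 12 * a4 * ((A * A + B * B) - p / 3)
    return x_ex, direct, closed
```

In the two sample quartics used to exercise this sketch, $x^4 + x^2 + x$ and $-2x^4 + x^3 - 3x^2 + x$, the two evaluations agree and carry the sign of $a_4$, as the argument that follows establishes in general.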
We are looking at the case $D > 0$. If $p < 0$ then by inspection one sees that the bracketed factor in the
most recent expression for $f^{(2)}(x_{ex})$ would be positive, resulting in $f^{(2)}(x_{ex})$ having the same
sign as $a_4$ as is to be shown. Suppose $p = 0$; then for $D > 0$ we would require $q \neq 0$ since
$D = \frac{q^2}{4} + \frac{p^3}{27} = \frac{q^2}{4}$ and $\sqrt{D} = \frac{|q|}{2}$, which means exactly one of
$-\frac{q}{2} + \sqrt{D}$ and $-\frac{q}{2} - \sqrt{D}$ will be nonzero and hence the bracketed expression
would once again be positive. Finally, suppose $p > 0$. We must show that in this case
\[ \sqrt[3]{\left(-\tfrac{q}{2}+\sqrt{D}\right)^2} + \sqrt[3]{\left(-\tfrac{q}{2}-\sqrt{D}\right)^2} > \frac{p}{3} \]
as well.
On the LHS we have two positive addends. If $q > 0$ the second addend would be the larger and we show
that it alone is larger than $\frac{p}{3}$ (making the sum a fortiori larger than $\frac{p}{3}$); remember
$p > 0$ in the case being presently considered.
\[ \sqrt[3]{\left(-\tfrac{q}{2}-\sqrt{D}\right)^2} > \frac{p}{3} \]
if and only if
\[ \left(-\tfrac{q}{2}-\sqrt{D}\right)^2 > \frac{p^3}{27} \]
if and only if
\[ \frac{q^2}{4} + \frac{q^2}{4} + \frac{p^3}{27} + q\sqrt{D} > \frac{p^3}{27} \]
which is clearly true since we have on the LHS $\frac{p^3}{27}$ itself plus positive addends. Note that if $q$
were zero this sum would equal $\frac{p^3}{27}$, which would make
$\sqrt[3]{\left(-\tfrac{q}{2}-\sqrt{D}\right)^2} = \frac{p}{3}$ itself, but the other addend in the original
expression would also be $\frac{p}{3}$, making the sum $\frac{2p}{3}$, which is greater than $\frac{p}{3}$.
Now if $q < 0$ the first addend of the original expression would be the larger and it alone, as we show,
would be larger than $\frac{p}{3}$ (remember $p > 0$):
\[ \sqrt[3]{\left(-\tfrac{q}{2}+\sqrt{D}\right)^2} > \frac{p}{3} \]
if and only if
\[ \left(-\tfrac{q}{2}+\sqrt{D}\right)^2 > \frac{p^3}{27} \]
if and only if
\[ \frac{q^2}{4} + \frac{q^2}{4} + \frac{p^3}{27} - q\sqrt{D} > \frac{p^3}{27} \]
if and only if
\[ \frac{p^3}{27} + \left[\frac{q^2}{2} - q\sqrt{D}\right] > \frac{p^3}{27} \]
which is true since $q < 0$ makes the bracketed expression positive, completing the case for $D > 0$.
Suppose now, $D = 0$. By the cubic formula if $p = q = 0$ we again have only the one root $x_{ex}$ whose
expression was given above. So the expression for $f^{(2)}(x_{ex})$ given above would also apply. We find
that when $p = q = 0$, $f^{(2)}(x_{ex}) = 0$ but $f^{(3)}(x_{ex})$ is also zero, leaving us with
$f^{(4)}(x_{ex}) = 24a_4 \neq 0$ and having the same sign as $a_4$, completing the proof for the case when
$D = 0$ and $p = q = 0$ and hence the cases when $f^{(1)}(x)$ has exactly one root.
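We continue with the case when $D = 0$. If $D = 0$ and $q = 0$ then $p$ would also have to be zero, which
would be the (one root) case just covered. So we now look at $D = 0$ and $q \neq 0$; we must then have
$p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. In this case ($D = 0$, $q \neq 0$, $p < 0$) the two roots
of $f^{(1)}(x)$ are
\[ x'_{ex} = \sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4} \quad\text{and}\quad x_{ex} = -2\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}. \]
Note that since $D = 0$ the second root above is the same as $x_{ex}$. First we must show that the first
root, $x'_{ex}$, above is not an extremum.
We have
\[
\begin{aligned}
f^{(2)}(x'_{ex}) &= f^{(2)}\!\left(\sqrt[3]{\tfrac{q}{2}} - \tfrac{a_3}{4a_4}\right)
 = 12a_4\left(\sqrt[3]{\tfrac{q}{2}} - \tfrac{a_3}{4a_4}\right)^2 + 6a_3\left(\sqrt[3]{\tfrac{q}{2}} - \tfrac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(\sqrt[3]{\tfrac{q^2}{4}} - \tfrac{a_3}{2a_4}\sqrt[3]{\tfrac{q}{2}} + \tfrac{a_3^2}{16a_4^2}\right) + 6a_3\sqrt[3]{\tfrac{q}{2}} - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\sqrt[3]{\tfrac{q^2}{4}} - 6a_3\sqrt[3]{\tfrac{q}{2}} + \tfrac{3a_3^2}{4a_4} + 6a_3\sqrt[3]{\tfrac{q}{2}} - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= 12a_4\sqrt[3]{\tfrac{q^2}{4}} - \tfrac{3a_3^2}{4a_4} + 2a_2
\end{aligned}
\]
but $D = 0$ means that $\frac{q^2}{4} = \frac{-p^3}{27}$ so
\[
\begin{aligned}
f^{(2)}(x'_{ex}) &= 12a_4\left(-\tfrac{p}{3}\right) + \tfrac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= 12a_4\left(-\tfrac{p}{3}\right) + 4a_4p, \quad\text{since } p = \tfrac{8a_2a_4 - 3a_3^2}{16a_4^2} \\
&= -4a_4p + 4a_4p = 0
\end{aligned}
\]
but
\[
\begin{aligned}
f^{(3)}(x'_{ex}) &= 24a_4\left(\sqrt[3]{\tfrac{q}{2}} - \tfrac{a_3}{4a_4}\right) + 6a_3 \\
&= 24a_4\sqrt[3]{\tfrac{q}{2}} - 6a_3 + 6a_3 \\
&= 24a_4\sqrt[3]{\tfrac{q}{2}} \neq 0, \quad\text{since } q \neq 0.
\end{aligned}
\]
Therefore $x'_{ex} = \sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}$ cannot be an extremum. For the second root,
$x_{ex} = -2\sqrt[3]{\frac{q}{2}} - \frac{a_3}{4a_4}$, we have
\[
f^{(2)}(x_{ex}) = f^{(2)}\!\left(-2\sqrt[3]{\tfrac{q}{2}} - \tfrac{a_3}{4a_4}\right)
 = 12a_4\left[\left(\sqrt[3]{\tfrac{q^2}{4}} + \sqrt[3]{\tfrac{q^2}{4}}\right) - \frac{p}{3}\right]
 = 12a_4\left[2\sqrt[3]{\tfrac{q^2}{4}} - \frac{p}{3}\right].
\]
Since $D = 0$, we have $\frac{q^2}{4} = \frac{-p^3}{27}$, so
\[ f^{(2)}(x_{ex}) = 12a_4(-p). \]
Since $p < 0$, $f^{(2)}(x_{ex})$ is nonzero and has the same sign as $a_4$ as we wished to show, thus
completing all the cases stemming from $D \geq 0$.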
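Now for the cases when $D < 0$. The corollary asserts in this case that for $x_{ex_1}$
(i) $x_{ex_1}$ is the abscissa of an extremum
(ii) $x_{ex_1}$ is the abscissa of a local maximum when $a_4 < 0$
(iii) $x_{ex_1}$ is the abscissa of a local minimum when $a_4 > 0$.
Since $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ we are given that $f^{(1)}(x_{ex_1}) = 0$. To
prove that $x_{ex_1}$ is the abscissa of an extremum we must show that $f^{(2)}(x_{ex_1}) \neq 0$ or that if
$f^{(2)}(x_{ex_1}) = 0$ then $f^{(3)}(x_{ex_1}) = 0$ as well (since we know that $f^{(4)}(x_{ex_1}) = 24a_4 \neq 0$).
We start, therefore, by calculating $f^{(2)}(x_{ex_1})$ where $x_{ex_1}$ is given as
$-2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{a_3}{4a_4}$
and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:
\[
\begin{aligned}
f^{(2)}(x_{ex_1}) &= 12a_4\left(-2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{a_3}{4a_4}\right)^2 \\
&\qquad + 6a_3\left(-2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\tfrac{4p}{3}\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] + \tfrac{a_3^2}{16a_4^2}
 + \tfrac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right]\right) \\
&\qquad - 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] + \tfrac{3a_3^2}{4a_4}
 + 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] \\
&\qquad - 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] - \tfrac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] + \tfrac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right] + 4a_4p \\
&= 4a_4\left[p\left(1 - 4\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right]\right]\right)\right].
\end{aligned}
\]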
If $f^{(2)}(x_{ex_1}) < 0$ then $x_{ex_1}$ would be the abscissa of a local maximum by the Minimax Theorem.
The present corollary asserts $x_{ex_1}$ would be the abscissa of a local maximum provided $a_4 < 0$.
Also if $f^{(2)}(x_{ex_1}) > 0$ then $x_{ex_1}$ would be the abscissa of a local minimum by the Minimax Theorem.
The present corollary asserts $x_{ex_1}$ would be the abscissa of a local minimum provided $a_4 > 0$.
The two statements above require us to show that $a_4$ and $f^{(2)}(x_{ex_1})$ have the same sign. For this
to be true we require the expression
$p\left(1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right)\right)$ to be positive. For $D$
to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are therefore
reduced to showing that whenever $D < 0$
\[ 1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right) < 0 \]
or that
\[ \cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right) > \frac14 \]
or that
\[ \left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right)\right| > \frac12. \]
Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive,
$\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)$ must be an angle between $0^\circ$ and $60^\circ$,
inclusive. For all $\theta \in [0^\circ, 60^\circ)$ we have $\cos\theta > \cos 60^\circ = \frac12$, so the only
angle in the range $[0^\circ, 60^\circ]$ for which
$\left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right)\right| > \frac12$ might not apply is
$\theta = 60^\circ$, for which we would have “equality” instead of “greater than.” For $\theta$ to equal
$60^\circ$ we would require $\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)$ to equal $180^\circ$. This would
happen only when $\tfrac{3q\sqrt{-3p}}{2p^2} = -1$, or $2p^2 = -3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or
$4p^3 = -27q^2$, or $4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is not possible since
$D < 0$. So $\theta$ can never equal $60^\circ$, meaning that
$\left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)\right)\right|$ will, as required, always
exceed $\frac12$.
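The corollary asserts in this case ($D < 0$) that for $x_{ex_2}$
(i) $x_{ex_2}$ is the abscissa of an extremum
(ii) $x_{ex_2}$ is the abscissa of a local maximum when $a_4 < 0$
(iii) $x_{ex_2}$ is the abscissa of a local minimum when $a_4 > 0$.
Since as mentioned earlier $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ we are given that
$f^{(1)}(x_{ex_2}) = 0$. To prove that $x_{ex_2}$ is the abscissa of an extremum we must show that
$f^{(2)}(x_{ex_2}) \neq 0$ or that if $f^{(2)}(x_{ex_2}) = 0$ then $f^{(3)}(x_{ex_2}) = 0$ as well (since we know
that $f^{(4)}(x_{ex_2}) = 24a_4 \neq 0$). We start, therefore, as earlier by calculating $f^{(2)}(x_{ex_2})$ where
$x_{ex_2}$ is given as
$2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{a_3}{4a_4}$
and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:
\[
\begin{aligned}
f^{(2)}(x_{ex_2}) &= 12a_4\left(2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{a_3}{4a_4}\right)^2 \\
&\qquad + 6a_3\left(2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\tfrac{4p}{3}\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] + \tfrac{a_3^2}{16a_4^2}
 - \tfrac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right]\right) \\
&\qquad + 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] + \tfrac{3a_3^2}{4a_4}
 - 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] \\
&\qquad + 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] - \tfrac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] + \tfrac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] - 60^\circ\right] + 4a_4p \\
&= 4a_4\left(p\left(1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right)\right).
\end{aligned}
\]
If $f^{(2)}(x_{ex_2}) < 0$ then $x_{ex_2}$ would be the abscissa of a local maximum by the Minimax Theorem.
The present corollary asserts $x_{ex_2}$ would be the abscissa of a local maximum provided $a_4 < 0$.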
Also if $f^{(2)}(x_{ex_2}) > 0$ then $x_{ex_2}$ would be the abscissa of a local minimum by the Minimax Theorem.
The present corollary asserts $x_{ex_2}$ would be the abscissa of a local minimum provided $a_4 > 0$.
The two statements above again require us to show that $a_4$ and $f^{(2)}(x_{ex_2})$ have the same sign. For
this to be true we require the expression
$p\left(1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right)$ to be
positive. For $D$ to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are
therefore again reduced to showing that whenever $D < 0$
\[ 1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right) < 0 \]
or that
\[ \cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right) > \frac14 \]
or that
\[ \left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right| > \frac12. \]
Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive,
$\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ$ must be an angle between $-60^\circ$ and
$0^\circ$, inclusive. For all $\theta \in (-60^\circ, 0^\circ]$ we have $\cos\theta > \cos(-60^\circ) = \frac12$,
so the only angle in the range $[-60^\circ, 0^\circ]$ for which
$\left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right| > \frac12$ might not
apply is $\theta = -60^\circ$, for which we would have “equality” instead of “greater than.” For $\theta$ to
equal $-60^\circ$ we would require $\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)$ to equal $0^\circ$. This
would happen only when $\tfrac{3q\sqrt{-3p}}{2p^2} = 1$, or $2p^2 = 3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or
$4p^3 = -27q^2$, or $4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is as before not
possible since $D < 0$. So $\theta$ can never equal $-60^\circ$, meaning that
$\left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) - 60^\circ\right)\right|$ will, as required,
always exceed $\frac12$.
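Now for $x_{ex_3}$ ($D < 0$). The corollary asserts that
(i) $x_{ex_3}$ is the abscissa of an extremum
(ii) $x_{ex_3}$ is the abscissa of a local minimum when $a_4 < 0$
(iii) $x_{ex_3}$ is the abscissa of a local maximum when $a_4 > 0$.
Once again $x_{ex_1}, x_{ex_2}, x_{ex_3}$ are the roots of $f^{(1)}(x)$ so we are given that
$f^{(1)}(x_{ex_3}) = 0$. To prove that $x_{ex_3}$ is the abscissa of an extremum we must show that
$f^{(2)}(x_{ex_3}) \neq 0$ or that if $f^{(2)}(x_{ex_3}) = 0$ then $f^{(3)}(x_{ex_3}) = 0$ as well (since we know
that $f^{(4)}(x_{ex_3}) = 24a_4 \neq 0$). We start, as earlier, by calculating $f^{(2)}(x_{ex_3})$ where
$x_{ex_3}$ is given as
$2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{a_3}{4a_4}$
and $f^{(2)}(x) = 12a_4x^2 + 6a_3x + 2a_2$:
\[
\begin{aligned}
f^{(2)}(x_{ex_3}) &= 12a_4\left(2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{a_3}{4a_4}\right)^2 \\
&\qquad + 6a_3\left(2\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{a_3}{4a_4}\right) + 2a_2 \\
&= 12a_4\left(-\tfrac{4p}{3}\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] + \tfrac{a_3^2}{16a_4^2}
 - \tfrac{a_3}{a_4}\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right]\right) \\
&\qquad + 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] + \tfrac{3a_3^2}{4a_4}
 - 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] \\
&\qquad + 12a_3\sqrt{-\tfrac{p}{3}}\cos\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{3a_3^2}{2a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] - \tfrac{3a_3^2}{4a_4} + 2a_2 \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] + \tfrac{8a_2a_4 - 3a_3^2}{4a_4} \\
&= -16a_4p\cos^2\left[\tfrac13\arccos\left[\tfrac{3q\sqrt{-3p}}{2p^2}\right] + 60^\circ\right] + 4a_4p \\
&= 4a_4\left(p\left(1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right)\right).
\end{aligned}
\]
If $f^{(2)}(x_{ex_3}) < 0$ then $x_{ex_3}$ would be the abscissa of a local maximum by the Minimax Theorem.
The present corollary asserts $x_{ex_3}$ would be the abscissa of a local maximum provided $a_4 > 0$.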
Also if $f^{(2)}(x_{ex_3}) > 0$ then $x_{ex_3}$ would be the abscissa of a local minimum by the Minimax Theorem.
The present corollary asserts $x_{ex_3}$ would be the abscissa of a local minimum provided $a_4 < 0$.
The two statements above this time require us to show that $a_4$ and $f^{(2)}(x_{ex_3})$ are of opposite
sign. For this to be true we require the expression
$p\left(1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right)$ to be
negative. For $D$ to be less than zero we must have $p < 0$ since $D = \frac{q^2}{4} + \frac{p^3}{27}$. We are
therefore now reduced to showing that whenever $D < 0$
\[ 1 - 4\cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right) > 0 \]
or that
\[ \cos^2\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right) < \frac14 \]
or that
\[ \left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right| < \frac12. \]
Since the $\arccos(\cdot)$ function must return an angle between $0^\circ$ and $180^\circ$, inclusive,
$\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ$ must be an angle between $60^\circ$ and
$120^\circ$, inclusive. For all $\theta \in (60^\circ, 120^\circ)$ we have $|\cos\theta| < \frac12$. Now
$\cos 60^\circ = \frac12$ and $\cos 120^\circ = -\frac12$, so $|\cos 60^\circ| = |\cos 120^\circ| = \frac12$. The
two angles in the range $[60^\circ, 120^\circ]$ for which
$\left|\cos\left(\tfrac13\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) + 60^\circ\right)\right| < \frac12$ might not
apply are $\theta_1 = 60^\circ$ and $\theta_2 = 120^\circ$. $\theta_1 = 60^\circ$ would require
$\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right)$ to equal $0^\circ$. This would happen only when
$\tfrac{3q\sqrt{-3p}}{2p^2} = 1$, or $2p^2 = 3q\sqrt{-3p}$, or $4p^4 = 9q^2(-3p)$, or $4p^3 = -27q^2$, or
$4p^3 + 27q^2 = 0$, or $\frac{p^3}{27} + \frac{q^2}{4} = 0$, which is as before not possible since $D < 0$, so
$\theta_1$ can never equal $60^\circ$. $\theta_2 = 120^\circ$ would require
$\arccos\left(\tfrac{3q\sqrt{-3p}}{2p^2}\right) = 180^\circ$. This would happen only when
$\tfrac{3q\sqrt{-3p}}{2p^2} = -1$, which would lead to $D = 0$, which is also not possible in this case and
thus completes the proof of this corollary.
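The $D < 0$ case can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function name `three_extrema` is hypothetical, angles are in radians ($60^\circ = \pi/3$), and the arccos argument is clamped to $[-1, 1]$ only to guard against rounding.

```python
import math

def three_extrema(a4, a3, a2, a1):
    """For D < 0, return ((x_ex1, x_ex2, x_ex3), (f''(x_ex1), f''(x_ex2), f''(x_ex3)))."""
    p = (8 * a2 * a4 - 3 * a3 ** 2) / (16 * a4 ** 2)
    q = (a3 ** 3 - 4 * a2 * a3 * a4 + 8 * a4 ** 2 * a1) / (32 * a4 ** 3)
    D = q * q / 4 + p ** 3 / 27
    if D >= 0:
        raise ValueError("this sketch covers only the D < 0 case")
    arg = 3 * q * math.sqrt(-3 * p) / (2 * p * p)
    arg = max(-1.0, min(1.0, arg))        # rounding guard only
    phi = math.acos(arg) / 3              # one third of the arccos angle
    root = math.sqrt(-p / 3)
    shift = a3 / (4 * a4)
    xs = (
        -2 * root * math.cos(phi) - shift,
        2 * root * math.cos(phi - math.pi / 3) - shift,
        2 * root * math.cos(phi + math.pi / 3) - shift,
    )
    second = tuple(12 * a4 * x * x + 6 * a3 * x + 2 * a2 for x in xs)
    return xs, second
```

For $f(x) = x^4 - 2x^2$ this gives abscissas $-1$, $1$, $0$ with second derivatives $8$, $8$, $-4$: the first two share the sign of $a_4$ and the third has the opposite sign, matching the three parts of the corollary.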
QUARTIC’S BARE FORM LEMMA: Suppose $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$ with $a_4 \neq 0$. In
sliding $f(x)$ to the right by $\frac{a_3}{4a_4}$ units we obtain a second quartic we shall call $g(x)$ which
has the same shape as $f(x)$, and $f$'s roots are $\frac{a_3}{4a_4}$ units to the left of those of $g$. We find
\[ g(x) = a_4t(x) \quad\text{where}\quad t(x) = x^4 + px^2 + qx + r \]
with
\[
p = \frac{8a_2a_4 - 3a_3^2}{8a_4^2}, \qquad
q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^3}, \qquad
r = \frac{16a_2a_3^2a_4 - 3a_3^4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^4}
\]
and if $r_{t_i}$ is a root of $t(x)$ then $r_{g_i} = r_{t_i}$ is also a root of $g(x)$ and
\[ r_{f_i} = r_{t_i} - \frac{a_3}{4a_4} \]
is a root of $f(x)$. Since $f(x)$ is a polynomial of degree 4, $i \in \{1, 2, 3, 4\}$. We will call $t(x)$ the
bare form of $f(x)$.
Proof: In the HORIZONTAL TRANSLATION to the right FUNCTION we found that
\[
g(x) = a_4x^4 + (-4a_4h + a_3)x^3 + (6a_4h^2 - 3a_3h + a_2)x^2 + (-4a_4h^3 + 3a_3h^2 - 2a_2h + a_1)x + (a_4h^4 - a_3h^3 + a_2h^2 - a_1h + a_0),
\]
where $h$ is the number of units $f$ was slid to the right. Substituting $h = \frac{a_3}{4a_4}$ into the above
expression with the purpose of eliminating the $x^3$-term we find after simplification
\[
g(x) = a_4x^4 + \frac{8a_2a_4 - 3a_3^2}{8a_4}x^2 + \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^2}x + \frac{-3a_3^4 + 16a_2a_3^2a_4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^3}
\]
from which we can see that $g(x) = a_4 \times t(x)$ where $t(x)$, $p$, $q$, $r$ are as defined above. Since
$a_4 \neq 0$ we can see that the roots of $t(x)$ are identical to the roots of $g(x)$.
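The Bare Form Lemma lends itself to an exact check. The following is a minimal sketch with hypothetical helper names (`bare_form`, `check_bare_form`); it assumes rational coefficients and verifies $f(x - h) = a_4\,t(x)$ with $h = \frac{a_3}{4a_4}$ at seven sample points, which is more than enough to pin down a quartic identity.

```python
from fractions import Fraction

def bare_form(a4, a3, a2, a1, a0):
    """Return (p, q, r) of the bare form t(x) = x^4 + p x^2 + q x + r."""
    a4, a3, a2, a1, a0 = (Fraction(c) for c in (a4, a3, a2, a1, a0))
    if a4 == 0:
        raise ValueError("a4 must be nonzero for a quartic")
    p = (8 * a2 * a4 - 3 * a3 ** 2) / (8 * a4 ** 2)
    q = (a3 ** 3 - 4 * a2 * a3 * a4 + 8 * a1 * a4 ** 2) / (8 * a4 ** 3)
    r = (16 * a2 * a3 ** 2 * a4 - 3 * a3 ** 4
         - 64 * a1 * a3 * a4 ** 2 + 256 * a0 * a4 ** 3) / (256 * a4 ** 4)
    return p, q, r

def check_bare_form(coeffs, samples=range(-3, 4)):
    """Check g(x) = f(x - h) = a4 * t(x) exactly at the given sample points."""
    a4, a3, a2, a1, a0 = (Fraction(c) for c in coeffs)
    p, q, r = bare_form(a4, a3, a2, a1, a0)
    h = a3 / (4 * a4)  # size of the slide to the right

    def f(x):
        return a4 * x ** 4 + a3 * x ** 3 + a2 * x ** 2 + a1 * x + a0

    def t(x):
        return x ** 4 + p * x ** 2 + q * x + r

    return all(f(Fraction(x) - h) == a4 * t(Fraction(x)) for x in samples)
```

As a small example, $f(x) = (x + 1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1$ slides one unit to the right onto $x^4$, so `bare_form(1, 4, 6, 4, 1)` returns $p = q = r = 0$.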
QUARTIC FACTORIZABILITY LEMMA: Suppose $t(x)$ is defined as in the Quartic’s Bare Form Lemma. We find
that $t(x)$ may always be factored as the product of two quadratics $Q_1(x) \times Q_2(x)$ where
\[ Q_1(x) = x^2 + ux + s \quad\text{and}\quad Q_2(x) = x^2 - ux + t \]
in at least one way and in no more than three ways, where $Q_1(x) \times Q_2(x) = Q_2(x) \times Q_1(x)$ is
viewed as the same factorization.
Proof: If we could effect this factorization we could at once determine the roots, if any, of $t(x)$ by
simply finding the roots of the quadratics $Q_1(x)$ and $Q_2(x)$, and then by subtracting $\frac{a_3}{4a_4}$
from these roots we would have the roots of the original quartic $f(x)$.
(In developing the forms for $Q_1$ and $Q_2$ in this Lemma we initially assumed them to be of the form
$x^2 + ux + s$ and $x^2 + vx + t$ but immediately found $v = -u$ since $t(x)$ has no $x^3$-term.)
So we want
\[ Q_1(x) \times Q_2(x) \equiv t(x) \quad\text{or}\quad (x^2 + ux + s)(x^2 - ux + t) \equiv x^4 + px^2 + qx + r \]
or
\[ x^4 + (s + t - u^2)x^2 + (ut - us)x + st \equiv x^4 + px^2 + qx + r \]
which means we must be able to find numbers $u$, $s$, $t$ which solve the following system of three
equations in these three unknowns:
\[
\begin{aligned}
s + t - u^2 &= p \\
ut - us &= q \\
st &= r
\end{aligned}
\]
There are several ways of trying to solve this system. Eliminating $s$ and $u$ or $t$ and $u$ first seems to
lead to excessive complexity, so we start here by first eliminating $s$ and $t$, ending up with a
polynomial in “u” to solve. We'll start with the case when neither $r$ nor $q$ is zero, i.e. $rq \neq 0$.
From the third equation we find
\[ s = \frac{r}{t}; \]
note that since $r \neq 0$ neither $s$ nor $t$ can be zero. Substituting this result into the second equation
we find
\[ ut - \frac{ur}{t} = q \quad\text{or}\quad ut^2 - qt - ur = 0 \quad\text{with } u \neq 0 \text{ since } q \neq 0. \]
Solving this quadratic in $t$ for $t$ we have at most the two solutions:
\[ t_1 = \frac{q + \sqrt{q^2 + 4u^2r}}{2u} \quad\text{and}\quad t_2 = \frac{q - \sqrt{q^2 + 4u^2r}}{2u}, \]
where we will be showing the existence of at least one “u” making the radicand non-negative.
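Using $s = \frac{r}{t}$ we find
\[
\begin{aligned}
s_1 = \frac{r}{t_1} &= \frac{r}{\left(\frac{q + \sqrt{q^2 + 4u^2r}}{2u}\right)} = \frac{2ur}{q + \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur}{q + \sqrt{q^2 + 4u^2r}} \times \frac{q - \sqrt{q^2 + 4u^2r}}{q - \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur\left(q - \sqrt{q^2 + 4u^2r}\right)}{q^2 - (q^2 + 4u^2r)} = \frac{2ur\left(q - \sqrt{q^2 + 4u^2r}\right)}{-4u^2r} \\
&= \frac{-q + \sqrt{q^2 + 4u^2r}}{2u}
\end{aligned}
\]
and
\[
\begin{aligned}
s_2 = \frac{r}{t_2} &= \frac{r}{\left(\frac{q - \sqrt{q^2 + 4u^2r}}{2u}\right)} = \frac{2ur}{q - \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur}{q - \sqrt{q^2 + 4u^2r}} \times \frac{q + \sqrt{q^2 + 4u^2r}}{q + \sqrt{q^2 + 4u^2r}} \\
&= \frac{2ur\left(q + \sqrt{q^2 + 4u^2r}\right)}{q^2 - (q^2 + 4u^2r)} = \frac{2ur\left(q + \sqrt{q^2 + 4u^2r}\right)}{-4u^2r} \\
&= \frac{-q - \sqrt{q^2 + 4u^2r}}{2u}.
\end{aligned}
\]
Summarizing:
\[ s_1 = \frac{-q + \sqrt{q^2 + 4u^2r}}{2u}, \qquad t_1 = \frac{q + \sqrt{q^2 + 4u^2r}}{2u} \]
and
\[ s_2 = \frac{-q - \sqrt{q^2 + 4u^2r}}{2u}, \qquad t_2 = \frac{q - \sqrt{q^2 + 4u^2r}}{2u}. \]
So
\[ s_1 + t_1 = \frac{\sqrt{q^2 + 4u^2r}}{u}, \qquad s_2 + t_2 = -\frac{\sqrt{q^2 + 4u^2r}}{u}. \]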
We now plug “$s_i + t_i$”, $i = 1, 2$, into the first equation in the system to obtain the polynomial in
“u.” We index “u” with 1 or 2 to remind us whether it was derived from $s_1 + t_1$ or $s_2 + t_2$; so first
for $i = 1$:
\[ \frac{\sqrt{q^2 + 4u_1^2r}}{u_1} - u_1^2 = p \]
\[ \frac{\sqrt{q^2 + 4u_1^2r}}{u_1} = p + u_1^2 \quad\text{and squaring both sides} \]
\[ \frac{q^2 + 4u_1^2r}{u_1^2} = p^2 + 2pu_1^2 + u_1^4. \]
Note that in squaring both sides we introduced an extraneous root. If $u_{1_1}$ solves the above equation
then so does $-u_{1_1}$, but only one of $\{u_{1_1}, -u_{1_1}\}$ can solve the original equation. We continue by
multiplying through by $u_1^2$:
\[ q^2 + 4u_1^2r = p^2u_1^2 + 2pu_1^4 + u_1^6 \]
\[ u_1^6 + 2pu_1^4 + (p^2 - 4r)u_1^2 - q^2 = 0 \]
which is a polynomial of degree six with at most six roots and hence at most three roots which satisfy
the original equation and three extraneous roots.
Now for $i = 2$:
\[ -\frac{\sqrt{q^2 + 4u_2^2r}}{u_2} - u_2^2 = p \]
\[ -\frac{\sqrt{q^2 + 4u_2^2r}}{u_2} = p + u_2^2 \quad\text{and squaring both sides} \]
\[ \frac{q^2 + 4u_2^2r}{u_2^2} = p^2 + 2pu_2^2 + u_2^4. \]
Note that in squaring both sides we again introduced an extraneous root. If $u_{2_1}$ solves the above
equation then so does $-u_{2_1}$, but only one of $\{u_{2_1}, -u_{2_1}\}$ can solve the original equation. We
continue by multiplying through by $u_2^2$:
\[ q^2 + 4u_2^2r = p^2u_2^2 + 2pu_2^4 + u_2^6 \]
\[ u_2^6 + 2pu_2^4 + (p^2 - 4r)u_2^2 - q^2 = 0 \]
which is not only a polynomial of degree six but the same polynomial as found above, with at most six
roots and hence at most three roots which satisfy the original equation and three extraneous roots.
Note that if $u^*$ solves the original “$i = 1$” equation then $-u^*$ solves the original “$i = 2$” equation;
so if $u_{1_1}$, $u_{1_2}$, and $u_{1_3}$ are the (at most) three solutions of the “$i = 1$” equation then
$-u_{1_1}$, $-u_{1_2}$, and $-u_{1_3}$ are the three solutions of the “$i = 2$” equation. Now $s_1, t_1, u^*$
implies the factorization
\[ (x^2 + u^*x + s_1) \times (x^2 - u^*x + t_1) \]
which by the commutative property leads to the factorization
\[ (x^2 + (-u^*)x + t_1) \times (x^2 - (-u^*)x + s_1). \]
Of course we don't consider this to be a different factorization but it does account for another
“s, t, u” solution, namely $t_1, s_1, -u^*$. So our solution set
\[ \{(s_1, t_1, u_{1_1}), (s_1, t_1, u_{1_2}), (s_1, t_1, u_{1_3}), (s_2, t_2, u_{2_1}), (s_2, t_2, u_{2_2}), (s_2, t_2, u_{2_3})\} \]
is now reduced to a maximum number of three triples leading to distinct factorizations:
\[ \{(s_1, t_1, u_{1_1}), (s_1, t_1, u_{1_2}), (s_1, t_1, u_{1_3})\}, \]
so we need only solve the “$i = 1$” case.
We must try to find the roots of
\[ u_1^6 + 2pu_1^4 + (p^2 - 4r)u_1^2 - q^2 = 0. \]
As can be seen directly this is a cubic in “$u_1^2$” so letting $U = u_1^2$ we now have
\[ U^3 + 2pU^2 + (p^2 - 4r)U - q^2 = 0 \]
to solve for $U$. This is sometimes called the cubic resolvent of the quartic.
As the above is a cubic we are guaranteed at least one solution but this guarantee is not sufficient for
our purposes. Since $U = u_1^2$ we must be able to find a $U$ which is not only non-negative, since $u_1^2$
cannot be negative, but also makes $q^2 + 4Ur$ non-negative. Actually $U$ must be positive since in the
present case, $rq \neq 0$, means that $u_1$ is also nonzero. Moving $q^2$ to the RHS and dividing through by
$U$ we now have
\[ U^2 + 2pU + (p^2 - 4r) = \frac{q^2}{U}. \]
The LHS is a quadratic, whose graph is a parabola, and the RHS has a hyperbola as its graph. We define
the two functions:
\[ L(U) = U^2 + 2pU + (p^2 - 4r) \quad\text{and}\quad R(U) = \frac{q^2}{U}. \]
We may graph $L(U)$ and $R(U)$ in the Cartesian Plane with the vertical axis representing the values of
$L(U)$ and $R(U)$ for each value of $U$ in the domain as represented by the horizontal axis. A solution,
$U_1$, to this cubic would be represented by the abscissa of a point of intersection of $L(U)$ and $R(U)$.
Since we require $U > 0$ we are only interested in Quadrant I intersections as depicted in the graph
below:
[Figure: the parabola $L(U)$, with minimum point $(-p, -4r)$, and the hyperbola $R(U)$, plotted against $U$; their Quadrant I intersection has abscissa $U_1$.]
From the chapter on the quadratic we know that $L(U)$ has a minimum whose coordinates are
\[ \mathrm{MIN}_{L(U)}(-p, -4r). \]
We find that if a solution $U_1 > 0$ were to exist it will always satisfy our requirement that
\[ q^2 + 4U_1r \geq 0. \]
For $r > 0$ this is immediate, so let's assume $r < 0$. When $r < 0$, $\mathrm{MIN}_{L(U)}(-p, -4r)$ will have
the positive ordinate $-4r$ and hence will be above the horizontal $U$-axis. Since $\mathrm{MIN}_{L(U)}$ is the
minimum point of $L(U)$, all points $(U, L(U))$ on $L(U)$ must satisfy $L(U) \geq -4r$. Any point
$(U, R(U))$ on $R(U)$ which is also shared with $L(U)$ must therefore also satisfy $R(U) \geq -4r$. So which
points $(U, R(U))$ on the hyperbola $R(U)$ satisfy $R(U) \geq -4r$? Well, $R^{-1}(-4r) = \frac{q^2}{-4r}$, i.e.
$R\left(\frac{q^2}{-4r}\right) = -4r$. Since $R(U)$ is always decreasing in Quadrant I as $U$ increases we see
immediately that the abscissa, $U_1$, of any intersection point must satisfy
\[ U_1 \leq \frac{q^2}{-4r}, \quad\text{where we will let}\quad \frac{q^2}{-4r} = U_{MAX}, \]
or $q^2 + 4U_1r \geq 0$, which is the required condition. See the graph below.
[Figure: $L(U)$ and $R(U)$ in Quadrant I with intersection abscissas $U_1$, $U_2$, $U_3$, all to the left of $U_{MAX}$.]
So we are left with having to prove the existence of at least one Quadrant I solution. The story-line of
the proof that a Quadrant I intersection of $L(U)$ and $R(U)$ must exist is as follows. Moving from left to
right, when $L(U)$ just crosses the vertical axis at a y-intercept of $(0, p^2 - 4r)$ and initially enters
either Quadrant I or Quadrant IV it will be below $R(U)$, meaning that the function
$H(U) = L(U) - R(U)$ will have a negative value here. As $U$ gets larger and larger, $L(U)$ gets arbitrarily
large and $R(U)$, which remains positive in Quadrant I, gets arbitrarily close to zero, meaning that for a
sufficiently large $U$, $H(U) = L(U) - R(U)$ will be positive. So by the Intermediate Value Theorem there
must be at least one positive $U_1$ between these two positive $U$'s making $H(U_1) = 0$, which would, of
course, mean $L(U_1) = R(U_1)$, which shows the existence of a Quadrant I intersection of $R(U)$ and $L(U)$
as required.
Using more formal language for the proof we could proceed as follows. Since
\[ \lim_{U \to \infty} R(U) = 0 \]
for every $y > 0$, there exists a $U_{yR}$ so that $R(U) < y$ whenever $U > U_{yR}$. Since
\[ \lim_{U \to \infty} L(U) = \infty \]
for every $y > 0$, there exists a $U_{yL}$ so that $L(U) > y$ whenever $U > U_{yL}$. So pick any number
$y^* > 0$ and let $U_{y^*max}$ be any number greater than the larger of $U_{y^*R}$ and $U_{y^*L}$. We would have
that $H(U_{y^*max}) = L(U_{y^*max}) - R(U_{y^*max}) > 0$.
$L(0) = p^2 - 4r$. Suppose first $p^2 - 4r < 0$. $L(U)$ is continuous at $U = 0$ so for every $\delta > 0$
there will exist an $\epsilon > 0$ so that $|L(U) - L(0)| < \delta$ whenever $|U - 0| = |U| < \epsilon$. So
letting $\delta = |p^2 - 4r|$ we can find an $\epsilon_\delta > 0$ so that $|L(U) - (p^2 - 4r)| < |p^2 - 4r|$
whenever $|U| < \epsilon_\delta$. This means that for any $U_\delta \in (0, \epsilon_\delta)$, $L(U_\delta) < 0$,
which means that $H(U_\delta) = L(U_\delta) - R(U_\delta) < 0$ since $R(U_\delta) > 0$, so by the Intermediate
Value Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that
$H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this case the existence of a Quadrant I intersection of $R(U)$
and $L(U)$.
Suppose now that $p^2 - 4r = 0$. Let $\gamma$ be any positive number and let $\delta = R(\gamma)$, which is
clearly a positive number. Since $L(U)$ is continuous at $U = 0$ we can find an $\epsilon_\delta > 0$ so that
$|L(U) - L(0)| = |L(U)| < \delta$ whenever $|U| < \epsilon_\delta$. So for any $U_\delta$ in
$(0, \epsilon_\delta)$ that is also smaller than $\gamma$ we have $L(U_\delta) < \delta = R(\gamma) < R(U_\delta)$,
since $R(U)$ is decreasing in Quadrant I, and hence again $H(U_\delta) < 0$. So again by the Intermediate
Value Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that
$H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this case the existence of a Quadrant I intersection of $R(U)$
and $L(U)$.
Suppose finally that $p^2 - 4r > 0$. $R^{-1}(p^2 - 4r)$ is defined and equal to $\frac{q^2}{p^2 - 4r}$. Since we
are making no assumptions about which side of the vertical axis $\mathrm{MIN}_{L(U)}$ lies on, we don't know
whether $L(U)$ is rising or falling at this y-intercept, $(0, p^2 - 4r)$. We choose any $\gamma$,
$0 < \gamma < \frac{q^2}{p^2 - 4r}$, and we let $\delta = R(\gamma) - (p^2 - 4r)$, which is clearly a positive
number since $R(U)$ is monotonically decreasing in Quadrant I. Since $L(U)$ is continuous at $U = 0$ we can
find an $\epsilon > 0$ so that $|L(U) - L(0)| = |L(U) - (p^2 - 4r)| < \delta$ whenever $|U| < \epsilon$. For any
$U_\delta$ in $(0, \epsilon)$ that is also smaller than $\gamma$ we will have
$L(U_\delta) < R(\gamma) < R(U_\delta)$ and hence $H(U_\delta) < 0$, so again by the Intermediate Value
Theorem there must exist a $U_1 > 0$ between the positive numbers $U_\delta$ and $U_{y^*max}$ so that
$H(U_1) = 0$ or $L(U_1) = R(U_1)$, proving in this final case the existence of a Quadrant I intersection of
$R(U)$ and $L(U)$ and thus concluding the Lemma for the case $rq \neq 0$.
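The existence argument for $rq \neq 0$ translates directly into a bisection search. The sketch below makes several assumptions that are not in the text: the function names are hypothetical; the search is run on the cubic $U^3 + 2pU^2 + (p^2 - 4r)U - q^2$, which equals $U \cdot H(U)$ and so has the same sign as $H(U)$ for $U > 0$; and $s$, $t$ are recovered from $s + t = p + U$ and $t - s = q/u$, an equivalent rearrangement of the first two equations of the system rather than the radical expressions for $s_1$, $t_1$.

```python
import math

def resolvent(U, p, q, r):
    """Value of the cubic resolvent U^3 + 2pU^2 + (p^2 - 4r)U - q^2."""
    return U ** 3 + 2 * p * U ** 2 + (p * p - 4 * r) * U - q * q

def positive_resolvent_root(p, q, r):
    """Bisection for a positive root; the cubic is -q^2 < 0 at U = 0 and grows without bound."""
    if q == 0:
        raise ValueError("this sketch assumes q != 0")
    lo, hi = 0.0, 1.0
    while resolvent(hi, p, q, r) <= 0:   # search rightwards for a positive value
        hi *= 2.0
    for _ in range(200):                 # Intermediate Value Theorem, numerically
        mid = (lo + hi) / 2
        if resolvent(mid, p, q, r) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def factor_bare_quartic(p, q, r):
    """Return (u, s, t) with (x^2 + u x + s)(x^2 - u x + t) = x^4 + p x^2 + q x + r."""
    U = positive_resolvent_root(p, q, r)
    u = math.sqrt(U)
    s = (p + U - q / u) / 2
    t = (p + U + q / u) / 2
    return u, s, t
```

For $t(x) = x^4 + 2x^2 + x + 2$ the resolvent is $U^3 + 4U^2 - 4U - 1$, whose only positive root is $U = 1$, and the sketch recovers $(x^2 + x + 1)(x^2 - x + 2)$.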
We now address the case when $rq = 0$. We will have the following three sub-cases:
$I_{rq=0}$: $r = 0$, $q \neq 0$
$II_{rq=0}$: $r = 0$, $q = 0$
$III_{rq=0}$: $r \neq 0$, $q = 0$
We are as in the previous case to establish the solvability of the system
\[
\begin{aligned}
s + t - u^2 &= p \\
ut - us &= q \\
st &= r
\end{aligned}
\]
We start by looking at Case $I_{rq=0}$: $r = 0$, $q \neq 0$. Since $r = 0$, either $s = 0$, or $t = 0$, or both
$s$ and $t$ are zero. Both cannot be zero here, since the second equation would then force $q = 0$. Suppose
$s = 0$, $t \neq 0$. We would then have $ut = q$ and $t - u^2 = p$, which would lead to
$\frac{q}{u} - u^2 = p$ or $q - u^3 = pu$ or $u^3 + pu - q = 0$ which, being a cubic, will always have from
one to three solutions $\{u_1, u_2, u_3\}$. So in this case there will always be at least one $(s, t, u)$
solution and at most three. (Note that if $t = 0$ and $s \neq 0$ we would have $-\frac{q}{u} - u^2 = p$ or
$u^3 + pu + q = 0$, whose roots would be the negatives of those of the first cubic, i.e.
$\{-u_1, -u_2, -u_3\}$, and hence the same factorizations but multiplied in reverse order.)
We now look at Case $II_{rq=0}$: $r = 0$, $q = 0$. Can a solution be found in this case as well? Suppose
$s = 0$. Since $q = 0$ either $u$ or $t$ must be zero, so suppose first $t = 0$; the solvability in this case
would depend on the sign of $p$ (we would need $u^2 = -p$), so let's suppose instead $u = 0$. We then would
have $t = p$, which is always possible, so $(s, t, u) = (0, p, 0)$ would always be a solution, establishing
solvability in this case as well.
Finally we look at Case $III_{rq=0}$: $r \neq 0$, $q = 0$. Since $r \neq 0$, neither $s$ nor $t$ can be zero. So
$s = \frac{r}{t}$, leading to $ut - \frac{ur}{t} = 0$ or $u\left(t - \frac{r}{t}\right) = 0$, so either $u = 0$,
which would require $s + t = p$ (or $\frac{r}{t} + t = p$), or $t - \frac{r}{t} = 0$. Can we always solve
$t - \frac{r}{t} = 0$? This would lead to $t^2 - r = 0$ or $t^2 = r$, which is solvable only when $r > 0$. When
can we solve $\frac{r}{t} + t = p$? I.e. when can we solve $t^2 - pt + r = 0$? This is solvable only when
$p^2 - 4r \geq 0$, i.e. when $r \leq \frac{p^2}{4}$. If $r \leq \frac{p^2}{4}$, which includes every negative
$r$ since $\frac{p^2}{4} \geq 0$, then $s + t = p$ is solvable with $u = 0$. Otherwise $r > \frac{p^2}{4} \geq 0$,
so $t = s = \sqrt{r}$ solves $t - \frac{r}{t} = 0$, and the first equation then requires
$u^2 = 2\sqrt{r} - p$, which is positive because $|p| < 2\sqrt{r}$. So an $(s, t, u)$ solution will always
exist in this final case as well, establishing the factorizability of the quartic $t(x)$ in every case and
completing the proof of the Lemma.
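The three flat sub-cases can be collected into one small routine. This is a sketch under stated assumptions, not the text's own procedure: the names `expand`, `positive_root`, and `factor_flat_case` are hypothetical; Case I locates a real root of $u^3 + pu - q$ by bisection instead of the cubic formula; and Case III uses $u = 0$ when $p^2 - 4r \geq 0$ and $s = t = \sqrt{r}$, $u = \sqrt{2\sqrt{r} - p}$ otherwise.

```python
import math

def expand(s, t, u):
    """Coefficients (p, q, r) of (x^2 + u x + s)(x^2 - u x + t)."""
    return s + t - u * u, u * (t - s), s * t

def positive_root(g):
    """Bisection for a positive root of g, assuming g(0) < 0 and g(v) -> +infinity."""
    lo, hi = 0.0, 1.0
    while g(hi) <= 0:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def factor_flat_case(p, q, r):
    """Return one (s, t, u) solving the system when r*q == 0."""
    if r * q != 0:
        raise ValueError("this sketch covers only the rq = 0 cases")
    if r == 0 and q != 0:
        # Case I: s = 0, u solves u^3 + p u - q = 0, and t = q / u.
        sign = 1.0 if q > 0 else -1.0
        v = positive_root(lambda v: v ** 3 + p * v - abs(q))
        u = sign * v                     # the real root has the sign of q
        return 0.0, q / u, u
    if r == 0:
        # Case II: (s, t, u) = (0, p, 0).
        return 0.0, p, 0.0
    # Case III: q == 0 and r != 0.
    disc = p * p - 4 * r
    if disc >= 0:
        t = (p + math.sqrt(disc)) / 2    # u = 0 branch: t^2 - p t + r = 0
        return p - t, t, 0.0
    root_r = math.sqrt(r)                # disc < 0 forces r > 0
    return root_r, root_r, math.sqrt(2 * root_r - p)
```

For instance, $x^4 + 1$ (Case III with $p^2 - 4r < 0$) yields $s = t = 1$, $u = \sqrt{2}$, i.e. $(x^2 + \sqrt{2}x + 1)(x^2 - \sqrt{2}x + 1)$, while $x^4 - 5x^2 + 4$ yields $(x^2 - 4)(x^2 - 1)$ with $u = 0$.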
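The following is an immediate consequence of the Lemma.
COROLLARY: Every quartic $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$, $a_4 \neq 0$, can be factored as
the product of two quadratics.
Proof: Let $g(x)$ and $t(x)$ be as defined in the previous Lemma. We know from that Lemma that there
exist $s^*$, $t^*$, $u^*$ so that
\[ t(x) = (x^2 + u^*x + s^*) \times (x^2 - u^*x + t^*). \]
Therefore
\[
\begin{aligned}
g(x) &= a_4t(x), \quad a_4 \neq 0 \\
&= a_4(x^2 + u^*x + s^*)(x^2 - u^*x + t^*) \\
&= (a_4x^2 + a_4u^*x + a_4s^*)(x^2 - u^*x + t^*).
\end{aligned}
\]
$g(x)$ is the quartic defined by sliding $f(x)$ to the right by $\frac{a_3}{4a_4}$ units, so to get back to
$f(x)$ we must slide $g(x)$ to the left by $\frac{a_3}{4a_4}$, i.e.
\[
\begin{aligned}
f(x) = g\!\left(x + \tfrac{a_3}{4a_4}\right)
&= \left(a_4\left(x + \tfrac{a_3}{4a_4}\right)^2 + a_4u^*\left(x + \tfrac{a_3}{4a_4}\right) + a_4s^*\right)
   \left(\left(x + \tfrac{a_3}{4a_4}\right)^2 - u^*\left(x + \tfrac{a_3}{4a_4}\right) + t^*\right) \\
&= \left(a_4x^2 + \left(\tfrac{a_3}{2} + a_4u^*\right)x + \left(\tfrac{a_3^2}{16a_4} + \tfrac{a_3u^*}{4} + a_4s^*\right)\right)
   \left(x^2 + \left(\tfrac{a_3}{2a_4} - u^*\right)x + \left(\tfrac{a_3^2}{16a_4^2} - \tfrac{a_3u^*}{4a_4} + t^*\right)\right) \\
&= \left(a_4x^2 + \tfrac{a_3 + 2a_4u^*}{2}x + \tfrac{a_3^2 + 4a_3a_4u^* + 16a_4^2s^*}{16a_4}\right)
   \times \left(x^2 + \tfrac{a_3 - 2a_4u^*}{2a_4}x + \tfrac{a_3^2 - 4a_3a_4u^* + 16a_4^2t^*}{16a_4^2}\right) \\
&= P_1(x) \times P_2(x)
\end{aligned}
\]
where
\[ P_1(x) = a_4x^2 + \frac{a_3 + 2a_4u^*}{2}x + \frac{a_3^2 + 4a_3a_4u^* + 16a_4^2s^*}{16a_4} \]
is a quadratic since $a_4 \neq 0$, and
\[ P_2(x) = x^2 + \frac{a_3 - 2a_4u^*}{2a_4}x + \frac{a_3^2 - 4a_3a_4u^* + 16a_4^2t^*}{16a_4^2} \]
is also a quadratic since the coefficient of $x^2$, 1, is nonzero. This completes the proof of the
Corollary.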
We now begin the lengthy but straightforward task of actually finding expressions for the various
$s, t, u$ solutions in the various regions of p-q-r-coefficient-land which factor $t(x)$ by referring to the
proof of the Quartic Factorizability Lemma. We look first at the full-bodied cases stemming from
$rq \neq 0$. We must start by finding the positive roots of the quartic's cubic resolvent
\[ U^3 + 2pU^2 + (p^2 - 4r)U - q^2. \]
In applying the Cubic Formula to the above cubic we once again have to be careful not to confuse the
indexed a's and p's and q's used there as coefficient values of the cubic being solved with the indexed
a's and p's and q's used in the proof of the Quartic Factorizability Lemma. To avoid confusion, symbols
pertinent to the Cubic Formula are in quotes. To apply the Cubic Formula we first find values for “p”,
“q”, and “D”:
\[
\text{“}p\text{”} = \frac{3\,\text{“}a_3\text{”}\,\text{“}a_1\text{”} - \text{“}a_2\text{”}^2}{3\,\text{“}a_3\text{”}^2}
= \frac{3(1)(p^2 - 4r) - (2p)^2}{3(1)^2}
= \frac{-p^2 - 12r}{3}
= -\frac{12r + p^2}{3},
\]
\[
\begin{aligned}
\text{“}q\text{”} &= \frac{2\,\text{“}a_2\text{”}^3 - 9\,\text{“}a_1\text{”}\,\text{“}a_2\text{”}\,\text{“}a_3\text{”} + 27\,\text{“}a_3\text{”}^2\,\text{“}a_0\text{”}}{27\,\text{“}a_3\text{”}^3}
= \frac{2(2p)^3 - 9(p^2 - 4r)(2p)(1) + 27(1)^2(-q^2)}{27(1)^3} \\
&= \frac{16p^3 - 18p^3 + 72pr - 27q^2}{27}
= \frac{72pr - 2p^3 - 27q^2}{27},
\end{aligned}
\]
and
\[
\begin{aligned}
\text{“}D\text{”} &= \frac{\text{“}q\text{”}^2}{4} + \frac{\text{“}p\text{”}^3}{27}
= \frac{(72pr - 2p^3 - 27q^2)^2}{4 \cdot 27^2} - \frac{(p^2 + 12r)^3}{27^2} \\
&= \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108}.
\end{aligned}
\]
We now let $C_p$ = “p”, $C_q$ = “q”, and $C_D$ = “D”, so
\[ C_p = -\frac{p^2 + 12r}{3} \quad\text{and}\quad C_q = \frac{72pr - 2p^3 - 27q^2}{27} \]
and
\[ C_D = \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108}. \]
Also “$a_2$” $= 2p$, “$a_3$” $= 1$, so
\[ \frac{\text{“}a_2\text{”}}{3\,\text{“}a_3\text{”}} = \frac{2p}{3}. \]
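The expansion of “D” is easy to get wrong by hand, so an exact check may be useful. The sketch below is only an illustration: the function name `resolvent_bare_form` is hypothetical, and rational arithmetic is assumed so that $\frac{C_q^2}{4} + \frac{C_p^3}{27}$ can be compared with the expanded numerator over 108 for equality.

```python
from fractions import Fraction

def resolvent_bare_form(p, q, r):
    """Return (Cp, Cq, CD) for the resolvent U^3 + 2pU^2 + (p^2 - 4r)U - q^2."""
    p, q, r = Fraction(p), Fraction(q), Fraction(r)
    Cp = -(p ** 2 + 12 * r) / 3
    Cq = (72 * p * r - 2 * p ** 3 - 27 * q ** 2) / 27
    CD = Cq ** 2 / 4 + Cp ** 3 / 27
    expanded = (4 * p ** 3 * q ** 2 + 27 * q ** 4 - 16 * p ** 4 * r
                - 144 * p * q ** 2 * r + 128 * p ** 2 * r ** 2 - 256 * r ** 3) / 108
    # The expanded form should agree exactly with the definition.
    assert CD == expanded
    return Cp, Cq, CD
```

For $p = 2$, $q = 1$, $r = 2$ this gives $C_p = -\frac{28}{3}$, $C_q = \frac{245}{27}$, $C_D = -\frac{343}{36}$. The resolvent root $U = 1$ corresponds to $Y = U + \frac{2p}{3} = \frac{7}{3}$, which satisfies $Y^3 + C_pY + C_q = 0$.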
When we calculate $C_p$, $C_q$, and $C_D$ and apply the Cubic Formula we will find either 1, 2, or 3
solutions. The Quartic Factorizability Lemma guarantees that at least one root will be positive and
sometimes even all three will be positive. However, frequently we will find some of the three roots to
be negative, as in the graph below where we show also the Quadrant III branch of the hyperbola
$\frac{q^2}{U}$.
[Figure: $L(U)$, with minimum point $(-p, -4r)$, and both branches of $R(U)$; the intersection abscissas are $U_1$, $U_2$, $U_3$.]
In the example graphed above we see that $U_1$ and $U_2$ are negative and hence would not lead to a
factorization of $t(x)$; only the positive solution $U_3$ would lead to a factorization of $t(x)$.
In trying to find expressions for the various factoizations of
t(x) we again start with the full-bodied casewhere rq 6= 0. If the
cubic has one solution then by the Factorizability Lemma this
solution must be positive.
31
-
If the cubic has two solutions then if p ≥ 0 only one of the two
is positive as the minimum of L(U) wouldbe to the left of the
vertical axis so that L(U) will be of positive slope everywhere in
Quadrant I whereR(U) is of negative slope everywhere in Quadrant I
so that as the we follow the curve L(U) from left toright after we
arrive at the first intersection L(U) will continue rising from
that point whereas R(U) willdescend meaning there can be no other
intersection; if p < 0 L(U)’s minimum point will be on the right
sideof the vertical axis. If L(U)’s y-intercept p2 − 4r is negative
then only one of the roots is positive since onlythe right segment
of L(U) can share a point with the Quadrant I branch of R(U) and it
can share only theone point. If L(U)’s y-intercept p2 − 4r is
positive then there can be no intersections with the Quadrant
IIIbranch of R(U) and hence both solutions must be positive.
If the cubic has three solutions then if p ≥ 0 only one of the
solutions, as explained above, can be positive;if p < 0 L(U)’s
minimum point will be again on the right side of the vertical axis.
If L(U)’s y-interceptp2−4r is negative then only one of the three
roots is positive since only the right segment of L(U) can sharea
point with the Quadrant I branch of R(U) and it can share only the
one point so the other two roots mustbe negative. If L(U)’s
y-intercept p2 − 4r is positive then there can be no intersections
with the QuadrantIII branch of R(U) and hence all three solutions
must be positive. Note p2 − 4r = 0 is not consistent withour cubic
having more than one root. Summarizing the above discussion we
have:
CUBIC HAS ONE SOLUTION
· ONE positive U only
CUBIC HAS TWO SOLUTIONS
· ONE positive U when $p \geq 0$
· TWO positive U's when $p < 0$ AND $p^2 - 4r > 0$
· ONE positive U when $p < 0$ AND $p^2 - 4r < 0$
CUBIC HAS THREE SOLUTIONS
· ONE positive U when $p \geq 0$
· THREE positive U's when $p < 0$ AND $p^2 - 4r > 0$
· ONE positive U when $p < 0$ AND $p^2 - 4r < 0$
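The summary above can be read as a small decision procedure. The following Python sketch transcribes it directly; the function name `positive_u_count` and its argument names are illustrative choices, not notation from the text, and the number of real solutions of the cubic is assumed to be known already.

```python
def positive_u_count(num_solutions, p, r):
    """Number of positive solutions U of the resolvent cubic, per the summary.

    num_solutions: how many distinct real solutions the cubic has (1, 2 or 3).
    p, r: coefficients of the bare form t(x) = x^4 + p x^2 + q x + r.
    """
    if num_solutions not in (1, 2, 3):
        raise ValueError("a cubic has 1, 2 or 3 distinct real solutions")
    if num_solutions == 1:
        return 1                      # the single solution is positive
    if p >= 0:
        return 1                      # L(U) rises throughout Quadrant I
    intercept = p * p - 4 * r         # y-intercept of L(U)
    if intercept == 0:
        # The text notes this cannot happen when there is more than one root.
        raise ValueError("p^2 - 4r = 0 is inconsistent with several solutions")
    # Positive intercept: every solution is positive; negative: only one is.
    return num_solutions if intercept > 0 else 1
```

For instance, with p = -35 and r = -56 (so that $p^2 - 4r = 1449 > 0$) and three solutions, the function reports three positive values of U.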
The following Theorem provides expressions for the quadratic factors of any
given quartic's bare form and for the roots of the given quartic that can then
be obtained from them.
THEOREM (“QUARTIC FORMULA”): Suppose $f(x) = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$,
$a_4 \neq 0$, with bare form $t(x) = x^4 + px^2 + qx + r$, and that
$U^3 + C_pU + C_q$ with $U = u^2$ is the bare form of the resolvent cubic of
$t(x)$ found in the Quartic Factorizability Theorem, where $t(x)$ was the
product of the two sought quadratic factors $Q_1(x) = x^2 + ux + s$ and
$Q_2(x) = x^2 - ux + t$. The values of these coefficients in terms of the
coefficients of the given quartic are found to be
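\[ C_q = \frac{72pr - 2p^3 - 27q^2}{27}, \qquad C_p = -\frac{p^2 + 12r}{3}, \]
and
\[ C_D = \frac{C_q^2}{4} + \frac{C_p^3}{27} = \frac{4p^3q^2 + 27q^4 - 16p^4r - 144pq^2r + 128p^2r^2 - 256r^3}{108}, \]
where
\[ p = \frac{8a_2a_4 - 3a_3^2}{8a_4^2}, \qquad q = \frac{a_3^3 - 4a_2a_3a_4 + 8a_1a_4^2}{8a_4^3}, \]
and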
\[ r = \frac{16a_2a_3^2a_4 - 3a_3^4 - 64a_1a_3a_4^2 + 256a_0a_4^3}{256a_4^4}. \]
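As a check on the coefficient formulas just stated, here is a minimal Python sketch that evaluates them with exact rational arithmetic. The function names `bare_form_coefficients` and `resolvent_coefficients` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def bare_form_coefficients(a4, a3, a2, a1, a0):
    """Return (p, q, r) of the bare form t(x) = x^4 + p x^2 + q x + r."""
    a4, a3, a2, a1, a0 = (Fraction(c) for c in (a4, a3, a2, a1, a0))
    if a4 == 0:
        raise ValueError("a4 must be nonzero for a quartic")
    p = (8 * a2 * a4 - 3 * a3**2) / (8 * a4**2)
    q = (a3**3 - 4 * a2 * a3 * a4 + 8 * a1 * a4**2) / (8 * a4**3)
    r = (16 * a2 * a3**2 * a4 - 3 * a3**4 - 64 * a1 * a3 * a4**2
         + 256 * a0 * a4**3) / (256 * a4**4)
    return p, q, r


def resolvent_coefficients(p, q, r):
    """Return (Cp, Cq, CD) computed from the bare-form coefficients."""
    p, q, r = Fraction(p), Fraction(q), Fraction(r)
    Cp = -(p**2 + 12 * r) / 3
    Cq = (72 * p * r - 2 * p**3 - 27 * q**2) / 27
    CD = Cq**2 / 4 + Cp**3 / 27
    return Cp, Cq, CD
```

For $f(x) = (x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24$ the first function gives $p = -5/2$, $q = 0$, $r = 9/16$, which agrees with shifting $x$ by $5/2$ by hand.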
The roots, ri, i ∈ {1, 2, 3, 4} of f(x) are to be found in the
appropriate cell in the nine-cell table below:
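(Rows are indexed by the sign of $u^2 - 4t$, columns by the sign of $u^2 - 4s$.)

Row $u^2 - 4t < 0$:
· Column $u^2 - 4s < 0$: NO ROOTS
· Column $u^2 - 4s = 0$: $r_1 = -\frac{u}{2} - \frac{a_3}{4a_4}$
· Column $u^2 - 4s > 0$: $r_1 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$, $r_2 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$

Row $u^2 - 4t = 0$:
· Column $u^2 - 4s < 0$: $r_1 = \frac{u}{2} - \frac{a_3}{4a_4}$
· Column $u^2 - 4s = 0$: $r_1 = \frac{u}{2} - \frac{a_3}{4a_4}$, $r_2 = -\frac{u}{2} - \frac{a_3}{4a_4}$
· Column $u^2 - 4s > 0$: $r_1 = \frac{u}{2} - \frac{a_3}{4a_4}$, $r_2 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$, $r_3 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$

Row $u^2 - 4t > 0$:
· Column $u^2 - 4s < 0$: $r_1 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$, $r_2 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$
· Column $u^2 - 4s = 0$: $r_1 = -\frac{u}{2} - \frac{a_3}{4a_4}$, $r_2 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$, $r_3 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$
· Column $u^2 - 4s > 0$: $r_1 = \frac{-u + \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$, $r_2 = \frac{-u - \sqrt{u^2 - 4s}}{2} - \frac{a_3}{4a_4}$, $r_3 = \frac{u + \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$, $r_4 = \frac{u - \sqrt{u^2 - 4t}}{2} - \frac{a_3}{4a_4}$
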
where u, s, and t are to be selected from the applicable region of
coefficient-land as described in the fourteen cases that follow: A.1 through
A.5 and B.1 through B.9, where the “A.x Cases” pertain to the full-bodied cases
with $rq \neq 0$ and the “B.x Cases” pertain to the $rq = 0$ flat cases. In
cases with multiple u, s, t triples, any triple $u_i, s_i, t_i$ within that case
may be selected, and every such triple must lead to the same roots.
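The nine-cell table amounts to solving the two quadratic factors separately and shifting by $a_3/(4a_4)$. The Python sketch below follows that reading; `real_roots` is a hypothetical helper name, and the exact comparisons with zero are only reliable for inputs where the discriminants are computed exactly.

```python
import math


def real_roots(u, s, t, a3, a4):
    """Real roots of f listed by the nine-cell table for one (u, s, t) triple."""
    shift = a3 / (4 * a4)
    d1 = u * u - 4 * s          # discriminant of Q1(x) = x^2 + u x + s
    d2 = u * u - 4 * t          # discriminant of Q2(x) = x^2 - u x + t
    roots = []
    if d1 == 0:                 # Q1 contributes one repeated root
        roots.append(-u / 2 - shift)
    elif d1 > 0:                # Q1 contributes two roots
        roots.append((-u + math.sqrt(d1)) / 2 - shift)
        roots.append((-u - math.sqrt(d1)) / 2 - shift)
    if d2 == 0:                 # Q2 contributes one repeated root
        roots.append(u / 2 - shift)
    elif d2 > 0:                # Q2 contributes two roots
        roots.append((u + math.sqrt(d2)) / 2 - shift)
        roots.append((u - math.sqrt(d2)) / 2 - shift)
    return sorted(roots)
```

As a usage example, $x^4 - 10x^3 + 35x^2 - 50x + 24$ has bare form $x^4 - \frac{5}{2}x^2 + \frac{9}{16} = (x^2 + 2x + \frac{3}{4})(x^2 - 2x + \frac{3}{4})$, so calling `real_roots(2, 0.75, 0.75, -10, 1)` lands in the bottom-right cell and returns the four roots 1, 2, 3, 4.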
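Case A.1: If $rq \neq 0$ and if $C_D > 0$ or if $C_p = C_q = 0$, then $t(x)$ has
exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$, where
$Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ with
\[ u = \alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } \sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } \sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} + \frac{p}{3} < 0, \end{cases} \]
\[ s = \frac{-q + \sqrt{q^2 + 4r\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + 4r\sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}}}, \]
and
\[ t = \frac{q + \sqrt{q^2 + 4r\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + 4r\sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{-\frac{C_q}{2} + \sqrt{C_D}} + \sqrt[3]{-\frac{C_q}{2} - \sqrt{C_D}} - \frac{2p}{3}}}. \]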
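A floating-point sketch of Case A.1 may help in reading the nested radicals. It is an illustration only: `cbrt` and `case_a1_triple` are hypothetical names, the sum of the two cube roots is stored in a local variable `V`, and the small guard against a slightly negative radicand is an added numerical precaution, not part of the theorem.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def case_a1_triple(p, q, r):
    """Return (u, s, t) from the Case A.1 formulas (rq != 0, CD > 0)."""
    Cp = -(p * p + 12 * r) / 3
    Cq = (72 * p * r - 2 * p**3 - 27 * q * q) / 27
    CD = Cq * Cq / 4 + Cp**3 / 27
    if r * q == 0 or not (CD > 0 or (Cp == 0 and Cq == 0)):
        raise ValueError("coefficients do not fall under Case A.1")
    # V is the sum of the two cube roots appearing in every formula.
    V = cbrt(-Cq / 2 + math.sqrt(CD)) + cbrt(-Cq / 2 - math.sqrt(CD))
    alpha = 1 if V + p / 3 >= 0 else -1
    u = alpha * math.sqrt(V - 2 * p / 3)
    # Guard against rounding pushing the radicand just below zero.
    root = math.sqrt(max(0.0, q * q + 4 * r * V - 8 * p * r / 3))
    s = (-q + root) / (2 * u)
    t = (q + root) / (2 * u)
    return u, s, t
```

For $t(x) = x^4 - 3x^2 + 6x - 2$ one finds $C_p = 5$, $C_q = -18$ and $C_D > 0$, and the sketch returns values close to $u = 2$, $s = -1$, $t = 2$, matching $(x^2 + 2x - 1)(x^2 - 2x + 2)$.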
Case A.2: If $rq \neq 0$, $C_D = 0$, $C_p < 0$, $p < 0$, and $p^2 - 4r > 0$, then
$t(x)$ has exactly two pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and
$\{Q_{21}(x), Q_{22}(x)\}$, where
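\[ Q_{i1}(x) = x^2 + u_ix + s_i \quad\text{and}\quad Q_{i2}(x) = x^2 - u_ix + t_i, \qquad i = 1, 2, \]
with
\[ u_1 = \alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } \sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } \sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases} \]
\[ s_1 = \frac{-q + \sqrt{q^2 + 4r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t_1 = \frac{q + \sqrt{q^2 + 4r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \]
and
\[ u_2 = \alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases} \]
\[ s_2 = \frac{-q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t_2 = \frac{q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}. \]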
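The two triples of Case A.2 share one pattern: each is built from a single quantity, $\sqrt[3]{C_q/2}$ for the first triple and $-2\sqrt[3]{C_q/2}$ for the second. The Python sketch below exploits that; the names `cbrt`, `triple_from_v` and `case_a2_triples` are hypothetical, and the code assumes the caller has already verified the Case A.2 conditions, since testing $C_D = 0$ in floating point is unreliable.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def triple_from_v(V, p, q, r):
    """Build (u, s, t) from V, the cube-root quantity in the case formulas."""
    alpha = 1 if V + p / 3 >= 0 else -1
    u = alpha * math.sqrt(V - 2 * p / 3)
    # Guard against rounding pushing the radicand just below zero.
    root = math.sqrt(max(0.0, q * q + 4 * r * V - 8 * p * r / 3))
    return u, (-q + root) / (2 * u), (q + root) / (2 * u)


def case_a2_triples(p, q, r):
    """Return [(u1, s1, t1), (u2, s2, t2)], assuming Case A.2 applies."""
    Cq = (72 * p * r - 2 * p**3 - 27 * q * q) / 27
    c = cbrt(Cq / 2)
    return [triple_from_v(c, p, q, r), triple_from_v(-2 * c, p, q, r)]
```

For $t(x) = (x-1)^2(x-2)(x+4) = x^4 - 11x^2 + 18x - 8$ the sketch returns values close to $(-3, 2, -4)$ and $(-2, 1, -8)$, that is, the factorizations $(x^2 - 3x + 2)(x^2 + 3x - 4)$ and $(x^2 - 2x + 1)(x^2 + 2x - 8)$.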
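Case A.3: If $rq \neq 0$, $C_D = 0$, $C_p < 0$, and either $p \geq 0$ or
“$p^2 - 4r < 0$ with $p < 0$”, then $t(x)$ has exactly one pair of quadratic
factors $\{Q_1(x), Q_2(x)\}$, where $Q_1(x) = x^2 + ux + s$ and
$Q_2(x) = x^2 - ux + t$ with
\[ u = \alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt[3]{\frac{C_q}{2}} + \frac{p}{3} < 0, \end{cases} \]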
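\[ s = \frac{-q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}, \qquad t = \frac{q + \sqrt{q^2 - 8r\sqrt[3]{\frac{C_q}{2}} - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt[3]{\frac{C_q}{2}} - \frac{2p}{3}}}. \]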
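Case A.4: If $rq \neq 0$, $C_D < 0$, $p < 0$, and $p^2 - 4r > 0$, then $t(x)$ has
exactly three pairs of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$,
$\{Q_{21}(x), Q_{22}(x)\}$, and $\{Q_{31}(x), Q_{32}(x)\}$, where
\[ Q_{i1}(x) = x^2 + u_ix + s_i \quad\text{and}\quad Q_{i2}(x) = x^2 - u_ix + t_i, \qquad i = 1, 2, 3, \]
with
\[ u_1 = \alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } -2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } -2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] + \frac{p}{3} < 0, \end{cases} \]
\[ s_1 = \frac{-q + \sqrt{q^2 - 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}}}, \]
\[ t_1 = \frac{q + \sqrt{q^2 - 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{8pr}{3}}}{2\alpha\sqrt{-2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right]\right] - \frac{2p}{3}}}, \]
and
\[ u_2 = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} < 0, \end{cases} \]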
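\[ s_2 = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}}, \]
\[ t_2 = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}}, \]
and
\[ u_3 = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] + \frac{p}{3} < 0, \end{cases} \]
\[ s_3 = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}}}, \]
\[ t_3 = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] + 60^\circ\right] - \frac{2p}{3}}}. \]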
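The three triples of Case A.4 differ only in the trigonometric term, so they can be generated in a loop. The Python sketch below is a floating-point illustration under that observation; `case_a4_triples` is a hypothetical name, the angle is handled in radians (so $60^\circ$ becomes $\pi/3$), and the guard on the inner radicand is an added numerical precaution.

```python
import math


def case_a4_triples(p, q, r):
    """Return [(u1, s1, t1), (u2, s2, t2), (u3, s3, t3)] for Case A.4."""
    Cp = -(p * p + 12 * r) / 3
    Cq = (72 * p * r - 2 * p**3 - 27 * q * q) / 27
    CD = Cq * Cq / 4 + Cp**3 / 27
    if r * q == 0 or CD >= 0 or p >= 0 or p * p - 4 * r <= 0:
        raise ValueError("coefficients do not fall under Case A.4")
    phi = math.acos(3 * Cq * math.sqrt(-3 * Cp) / (2 * Cp * Cp))
    amp = 2 * math.sqrt(-Cp / 3)
    # The trigonometric quantities used for i = 1, 2, 3, in the text's order.
    values = [
        -amp * math.cos(phi / 3),
        amp * math.cos(phi / 3 - math.pi / 3),
        amp * math.cos(phi / 3 + math.pi / 3),
    ]
    triples = []
    for V in values:
        alpha = 1 if V + p / 3 >= 0 else -1
        u = alpha * math.sqrt(V - 2 * p / 3)
        # Guard against rounding pushing the radicand just below zero.
        root = math.sqrt(max(0.0, q * q + 4 * r * V - 8 * p * r / 3))
        triples.append((u, (-q + root) / (2 * u), (q + root) / (2 * u)))
    return triples
```

For $t(x) = (x-1)(x-2)(x-4)(x+7) = x^4 - 35x^2 + 90x - 56$ the sketch returns triples close to $(-3, 2, -28)$, $(6, -7, 8)$ and $(-5, 4, -14)$, which correspond to the three ways of pairing the four roots. Case A.5 uses the same expression as $u_2$ here, so the second entry of `values` is the one relevant there.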
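Case A.5: If $rq \neq 0$, $C_D < 0$, and either $p \geq 0$ or “$p^2 - 4r < 0$ with
$p < 0$”, then $t(x)$ has exactly one pair of quadratic factors
$\{Q_1(x), Q_2(x)\}$ with $Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$ with
\[ u = \alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}, \]
where
\[ \alpha = \begin{cases} 1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} \geq 0; \\ -1, & \text{if } 2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] + \frac{p}{3} < 0, \end{cases} \]
\[ s = \frac{-q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}}, \]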
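\[ t = \frac{q + \sqrt{q^2 + 8r\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{8pr}{3}}}{2\alpha\sqrt{2\sqrt{\frac{-C_p}{3}}\cos\left[\frac{1}{3}\arccos\left[\frac{3C_q\sqrt{-3C_p}}{2C_p^2}\right] - 60^\circ\right] - \frac{2p}{3}}}. \]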
Case B.1: If $rq = 0$ with $r = q = 0$, $p \geq 0$, then $t(x)$ has exactly one
pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ with $Q_1(x) = x^2 + ux + s$ and
$Q_2(x) = x^2 - ux + t$, where
\[ u = 0, \quad s = 0, \quad\text{and}\quad t = p. \]
Case B.2: If $rq = 0$ with $r = q = 0$, $p < 0$, then $t(x)$ has exactly two pairs
of quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and $\{Q_{21}(x), Q_{22}(x)\}$
with $Q_{i1}(x) = x^2 + u_ix + s_i$ and $Q_{i2}(x) = x^2 - u_ix + t_i$, $i = 1, 2$,
where
\[ u_1 = \sqrt{-p}, \ s_1 = 0, \ t_1 = 0 \quad\text{and}\quad u_2 = 0, \ s_2 = 0, \ t_2 = p. \]
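Case B.3: If $rq = 0$ with $r = 0$, $q \neq 0$ and $D = \frac{q^2}{4} + \frac{p^3}{27} > 0$,
then $t(x)$ has exactly one pair of quadratic factors $\{Q_1(x), Q_2(x)\}$ with
$Q_1(x) = x^2 + ux + s$ and $Q_2(x) = x^2 - ux + t$, where
\[ u = \sqrt[3]{\frac{q}{2} + \sqrt{D}} + \sqrt[3]{\frac{q}{2} - \sqrt{D}}, \quad s = 0, \quad t = \frac{q}{\sqrt[3]{\frac{q}{2} + \sqrt{D}} + \sqrt[3]{\frac{q}{2} - \sqrt{D}}}. \]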
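Case B.3 is short enough to check numerically. The following Python sketch evaluates its formulas in floating point; the names `cbrt` and `case_b3_triple` are hypothetical and serve only this illustration.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def case_b3_triple(p, q):
    """Return (u, s, t) for Case B.3: r = 0, q != 0 and D > 0."""
    D = q * q / 4 + p**3 / 27
    if q == 0 or D <= 0:
        raise ValueError("coefficients do not fall under Case B.3")
    u = cbrt(q / 2 + math.sqrt(D)) + cbrt(q / 2 - math.sqrt(D))
    # s = 0 because x itself is a factor of t(x) when r = 0.
    return u, 0.0, q / u
```

For $t(x) = x^4 + x^2 - 2x = x(x - 1)(x^2 + x + 2)$ the sketch returns values close to $u = -1$, $s = 0$, $t = 2$, that is, $Q_1(x) = x^2 - x$ and $Q_2(x) = x^2 + x + 2$.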
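Case B.4: If $rq = 0$ with $r = 0$, $q \neq 0$, $p < 0$, and
$D = \frac{q^2}{4} + \frac{p^3}{27} = 0$, then $t(x)$ has exactly two pairs of
quadratic factors $\{Q_{11}(x), Q_{12}(x)\}$ and $\{Q_{21}(x), Q_{22}(x)\}$ with
$Q_{i1}(x) = x^2 + u_ix + s_i$ and $Q_{i2}(x) = x^2 - u_ix + t_i$, $i = 1, 2$, where
\[ u_1 = -\sqrt[3]{\frac{q}{2}}, \ s_1 = 0, \ t_1 = -2\sqrt[3]{\frac{q^2}{4}} \quad\text{and}\quad u_2 = 2\sqrt[3]{\frac{q}{2}}, \ s_2 = 0, \ t_2 = \sqrt[3]{\frac{q^2}{4}}. \]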
Case B.5: If $rq = 0$ with $r = 0$, $q \neq 0$ and $D = \frac{q^2}{4} + \frac{p^3}{27} < 0$,
then $t(x)$ has exactly three pairs of quadratic factors
{Q11(x), Q12(x)}, {Q21(x), Q22(x)}, a