
Riemann’s Explicit Formula

Sean Li
Cornell University
[email protected]

May 11, 2012

This paper is a brief study of Bernhard Riemann's main result in analytic number theory: the article "Über die Anzahl der Primzahlen unter einer gegebenen Größe" (1859), in which he derives an explicit formula for the prime counting function. Much of our paper works to make Riemann's intuitive statements more rigorous. In fact, to prove some of his ideas, we need to use theorems that were not proved until decades after his lifetime.

1 Introduction

The theory begins with Euler's product formula, which states that for $s > 1$,
$$\sum_{n=1}^{\infty}\frac{1}{n^s} = \prod_p\frac{1}{1-\frac{1}{p^s}},$$
where $p$ ranges over all primes. The formula can be shown by expanding each term in the product as
$$\frac{1}{1-\frac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots$$
and multiplying out all of them. This results in an infinite sum of terms of the form
$$\frac{1}{\left(p_1^{k_1}p_2^{k_2}\cdots p_m^{k_m}\right)^s},$$
where $p_1,\ldots,p_m$ are distinct primes and $k_1,\ldots,k_m$ are positive integers. Then one may use the fundamental theorem of arithmetic, which states that every integer has a unique prime factorization, to see that each of these terms is a $1/n^s$. When summed, these terms equal the left-hand side.
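(The following numerical aside is not part of Riemann's argument.) A short Python sketch can illustrate the identity: it compares a truncated Dirichlet series for $\zeta(s)$ with a truncated Euler product for real $s > 1$. The helper names and cutoffs are arbitrary choices made for this illustration.

```python
# Illustration of Euler's product formula for real s > 1:
# a truncated sum over n and a truncated product over primes
# should both approach zeta(s) as the cutoffs grow.

def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def zeta_sum(s, n_terms=100000):
    """Truncated Dirichlet series: sum of n^(-s) for n <= n_terms."""
    return sum(n ** (-s) for n in range(1, n_terms + 1))

def zeta_product(s, prime_bound=1000):
    """Truncated Euler product over primes up to prime_bound."""
    result = 1.0
    for p in primes_up_to(prime_bound):
        result *= 1.0 / (1.0 - p ** (-s))
    return result

if __name__ == "__main__":
    for s in (2.0, 3.0, 4.0):
        print(s, zeta_sum(s), zeta_product(s))
```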

Riemann called this function $\zeta(s)$ and considered its behavior when $s$ is a complex variable. It is not hard to see that the series converges in the half-plane $\operatorname{Re}(s) > 1$. Let $s = \sigma + it$ where $\sigma$ and $t$ are real. Then for $\sigma > 1$,
$$n^s = n^\sigma n^{it} = n^\sigma e^{it\log n},$$
so that
$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s} = \sum_{n=1}^{\infty}\frac{1}{n^\sigma}\,e^{-it\log n}.$$
But $|e^{-it\log n}| = 1$, so the sum converges absolutely in the half-plane $\operatorname{Re}(s) > 1$. Moreover, the convergence is uniform on every half-plane $\operatorname{Re}(s) \geq 1+\delta$ with $\delta > 0$, so $\zeta$ is holomorphic in the half-plane $\operatorname{Re}(s) > 1$.

2 Properties of ζ(s)

To obtain a formula for $\zeta(s)$ that works when $s$ is outside the half-plane $\operatorname{Re}(s) > 1$, we shall extend $\zeta$ to a meromorphic function on $\mathbb{C}$, using the gamma and theta functions.

2.1 The Gamma Function

Our first object of study is the gamma function, defined as
$$\Gamma(s) = \int_0^\infty e^{-t}t^{s-1}\,dt$$
for $s > 0$. When $s$ is a positive integer, $\Gamma(s) = (s-1)!$. To see that the integral converges, one may break it up into
$$\Gamma(s) = \int_0^1 e^{-t}t^{s-1}\,dt + \int_1^\infty e^{-t}t^{s-1}\,dt$$
and observe that the second integral defines an entire function, while the first can be dealt with as follows. Expand $e^{-t}$ as a power series and integrate termwise, resulting in
$$\int_0^1 e^{-t}t^{s-1}\,dt = \sum_{n=0}^{\infty}\frac{(-1)^n t^{n+s}}{n!(n+s)}\Bigg|_0^1 = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!(n+s)},$$
which defines a meromorphic function on $\mathbb{C}$, having simple poles at the non-positive integers with residue $(-1)^n/n!$ at $s = -n$. This is easy to verify, as the rapid growth of $n!$ in the denominator makes the series converge uniformly away from the poles. Therefore, the relation
$$\Gamma(s) = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!(n+s)} + \int_1^\infty e^{-t}t^{s-1}\,dt$$
defines a meromorphic continuation of $\Gamma$ to all of $\mathbb{C}$.
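To make the continuation concrete, here is a small numerical check, purely illustrative and under the assumption that a crude midpoint rule and the arbitrary truncation limits chosen below are accurate enough: it evaluates the series plus the tail integral at a few positive points and compares the result with Python's math.gamma.

```python
import math

def gamma_via_continuation(s, n_terms=60, n_steps=20000):
    """Series over [0, 1] plus a numerically evaluated tail integral over [1, oo)."""
    series = sum((-1) ** n / (math.factorial(n) * (n + s)) for n in range(n_terms))
    # Tail integral int_1^oo e^(-t) t^(s-1) dt, truncated at t = 50
    # and evaluated with a simple midpoint rule.
    a, b = 1.0, 50.0
    h = (b - a) / n_steps
    tail = sum(math.exp(-(a + (k + 0.5) * h)) * (a + (k + 0.5) * h) ** (s - 1) * h
               for k in range(n_steps))
    return series + tail

if __name__ == "__main__":
    for s in (0.5, 1.0, 2.5, 4.0):
        print(s, gamma_via_continuation(s), math.gamma(s))
```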


Before going on, we first write out a property of the gamma function that shall be useful later:
$$\Gamma(s+1) = \lim_{N\to\infty}\frac{N!\,(N+1)^s}{(s+1)(s+2)\cdots(s+N)} = \prod_{n=1}^{\infty}\frac{n^{1-s}(n+1)^s}{s+n} = \prod_{n=1}^{\infty}\frac{\left(1+\frac{1}{n}\right)^s}{1+\frac{s}{n}}. \qquad (1)$$
The first expression is due to Euler, and the second and third are reformulations of it.
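Formula (1) is used again in section 4.4, so a quick sanity check may be welcome. The sketch below is illustrative only; the cutoff is arbitrary and the convergence of the product is slow.

```python
import math

def gamma_plus_one_product(s, n_factors=200000):
    """Truncation of Euler's product (1) for Gamma(s + 1)."""
    result = 1.0
    for n in range(1, n_factors + 1):
        result *= (1.0 + 1.0 / n) ** s / (1.0 + s / n)
    return result

if __name__ == "__main__":
    for s in (0.5, 1.0, 3.0):
        print(s, gamma_plus_one_product(s), math.gamma(s + 1))
```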

2.2 The Theta Function

For our purposes, define the theta function for real $t > 0$ as
$$\vartheta(t) = \sum_{n=-\infty}^{\infty}e^{-\pi n^2 t}.$$
This satisfies the functional equation
$$\vartheta(t) = t^{-1/2}\,\vartheta\!\left(\frac{1}{t}\right),$$
which can be shown by applying the Poisson summation formula to $\vartheta(t)$.

The growth of $\vartheta(t)$ for large $t$ is bounded like
$$|\vartheta(t)-1| \leq Ce^{-\pi t}.$$
This can be seen from the fact that
$$\sum_{n=1}^{\infty}e^{-\pi n^2 t} \leq \sum_{n=1}^{\infty}e^{-\pi n t} \leq Ce^{-\pi t}$$
for $t \geq 1$. The behavior of $\vartheta(t)$ near $t = 0$ is given by
$$\vartheta(t) \leq Ct^{-1/2},$$
which can be seen from the functional equation.
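Both the functional equation and the growth bounds are easy to probe numerically. The sketch below is only an illustration; the series is truncated at $|n| \leq 50$, which is more than enough for the values of $t$ used here.

```python
import math

def theta(t, n_max=50):
    """Truncated theta series: sum over |n| <= n_max of exp(-pi n^2 t)."""
    return sum(math.exp(-math.pi * n * n * t) for n in range(-n_max, n_max + 1))

if __name__ == "__main__":
    # theta(t) should agree with t^(-1/2) * theta(1/t).
    for t in (0.1, 0.5, 1.0, 2.0):
        print(t, theta(t), t ** (-0.5) * theta(1.0 / t))
```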

2.3 Analytic Continuation and Functional Equation

Now we are in a position to relate $\zeta$, $\Gamma$, and $\vartheta$ as follows. The proof is based on Stein and Shakarchi [S1]. Let $\operatorname{Re}(s) > 1$. If $n \geq 1$, then
$$\int_0^\infty e^{-\pi n^2 u}u^{(s/2)-1}\,du = \pi^{-s/2}\,\Gamma(s/2)\,n^{-s},$$
which can be seen immediately from the change of variable $u = t/(\pi n^2)$, making the integral
$$(\pi n^2)^{-s/2}\int_0^\infty e^{-t}t^{(s/2)-1}\,dt,$$
which equals $\pi^{-s/2}\Gamma(s/2)n^{-s}$. Now, because
$$\frac{\vartheta(u)-1}{2} = \sum_{n=1}^{\infty}e^{-\pi n^2 u},$$
and because of the previously shown bounds on the growth and decay of $\vartheta$, we may interchange the sum and the integral. Then
$$\frac{1}{2}\int_0^\infty u^{(s/2)-1}\,[\vartheta(u)-1]\,du = \sum_{n=1}^{\infty}\int_0^\infty u^{(s/2)-1}e^{-\pi n^2 u}\,du = \pi^{-s/2}\,\Gamma(s/2)\sum_{n=1}^{\infty}n^{-s} = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s).$$

Now define
$$\psi(u) = \frac{\vartheta(u)-1}{2}.$$
The functional equation $\vartheta(u) = u^{-1/2}\vartheta(1/u)$ implies
$$\psi(u) = u^{-1/2}\,\psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2}.$$

From the previously derived equation, we have, for $\operatorname{Re}(s) > 1$,
$$\pi^{-s/2}\Gamma(s/2)\zeta(s) = \int_0^\infty u^{(s/2)-1}\psi(u)\,du = \int_0^1 u^{(s/2)-1}\psi(u)\,du + \int_1^\infty u^{(s/2)-1}\psi(u)\,du$$
$$= \int_0^1 u^{(s/2)-1}\left[u^{-1/2}\psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2}\right]du + \int_1^\infty u^{(s/2)-1}\psi(u)\,du$$
$$= \frac{1}{s-1} - \frac{1}{s} + \int_1^\infty\left[u^{-(s/2)-1/2} + u^{(s/2)-1}\right]\psi(u)\,du.$$
Note that this defines a meromorphic function with simple poles at $0$ and $1$; this is because the exponential decay of $\psi$ in the integral means the integral defines an entire function. Also, observe that the value is unchanged if $s$ is replaced by $1-s$. Hence
$$\pi^{-s/2}\Gamma(s/2)\zeta(s) = \pi^{-(1-s)/2}\,\Gamma\!\left(\frac{1-s}{2}\right)\zeta(1-s),$$

which allows us to define values of zeta everywhere except at the pole $s = 1$.

We shall follow Riemann's notation and multiply $\pi^{-s/2}\Gamma(s/2)\zeta(s)$ by the factor $s(s-1)/2$ and define this as*
$$\xi(s) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s).$$
The advantage of using this definition of $\xi$ is that multiplying by $s$ and $s-1$ effectively cancels the simple poles of $\pi^{-s/2}\Gamma(s/2)\zeta(s)$, and hence $\xi(s)$ is an entire function and satisfies
$$\xi(s) = \xi(1-s).$$
We may rearrange to find
$$\zeta(s) = \frac{\pi^{s/2}\,\xi(s)}{(s-1)\,\Gamma(s/2+1)},$$
which shows that $\zeta$ has a simple pole at $1$ and zeros where $\Gamma(s/2+1)$ has poles, namely at $s/2 = -n$, $n\in\mathbb{N}$. Hence zeta has simple zeros at $-2$, $-4$, $-6$, and so on. These are called the trivial zeros. Note that all other zeros of $\zeta$ must also be zeros of $\xi$; these nontrivial zeros are denoted by $\rho$.

Furthermore, in the half-plane $\operatorname{Re}(s) > 1$ the zeta function is given by the product
$$\zeta(s) = \prod_p\frac{1}{1-\frac{1}{p^s}}.$$
Now $\sum_p\left[(1-1/p^s)^{-1} - 1\right]$ converges absolutely, so the product converges, and if zeta had a zero in this half-plane, then one of the factors $(1-1/p^s)^{-1}$ would have to vanish. This is impossible, so $\zeta$ has no zeros in the half-plane $\operatorname{Re}(s) > 1$. From the functional equation $\xi(s) = \xi(1-s)$, it follows that there are no zeros in the half-plane $\operatorname{Re}(s) < 0$ except the trivial zeros. Thus all of the nontrivial zeros must lie in the strip $0 \leq \operatorname{Re}(s) \leq 1$. This bound can be improved to exclude the lines $\operatorname{Re}(s) = 0$ and $\operatorname{Re}(s) = 1$, giving the statement that all nontrivial zeros of zeta lie in the region $0 < \operatorname{Re}(s) < 1$, which is called the critical strip.
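Though not needed for the argument, the functional equation can be checked numerically. The following sketch assumes the mpmath library is available and uses its zeta and gamma to evaluate $\xi(s) = \Gamma(s/2+1)(s-1)\pi^{-s/2}\zeta(s)$ at a few complex points and at their reflections $1-s$; it is an illustration only.

```python
from mpmath import mp, mpc, gamma, zeta, pi

mp.dps = 30  # working precision in digits

def xi(s):
    """xi(s) = Gamma(s/2 + 1) * (s - 1) * pi^(-s/2) * zeta(s)."""
    return gamma(s / 2 + 1) * (s - 1) * pi ** (-s / 2) * zeta(s)

if __name__ == "__main__":
    # xi(s) and xi(1 - s) should agree to working precision.
    for s in (mpc(0.3, 7.0), mpc(2.0, -1.0), mpc(-3.5, 0.25)):
        print(s, xi(s), xi(1 - s))
```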

2.4 Product Formula for ξ(s)

Riemann assumed it was possible to factor $\xi(s)$ in terms of its roots in the form
$$\xi(s) = f(s)\prod_\rho\left(1-\frac{s}{\rho}\right),$$
where $f(s)$ is a function that does not vanish. Given that this was possible, he showed that $f(s)$ must be a constant, and then that the constant must be $f(s) = \xi(0)$, which follows upon setting $s = 0$.

*The $\xi$ function is often defined instead as $\xi(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s)$, which was shown above to have simple poles at $0$ and $1$.

The factoring step is indeed valid, as shown by Hadamard in 1893, some 34 years after the publication of Riemann's paper. We will not repeat the proof of the Hadamard factorization here, as it is a fairly intricate result (a proof can be found starting on p. 147 of [S1]). The factorization theorem states, for this case, that $f(s) = e^{a+bs}$, because $\xi$ has order of growth $1$ (this can easily be checked from the equation defining $\xi$). Then, since $\xi(s+\tfrac12)$ is an even function of $s$ (this follows from $\xi(s) = \xi(1-s)$), $\operatorname{Re}\log\xi(s+\tfrac12)$ is an even function but must grow slower than $s^{1+\epsilon}$. A linear term cannot be even, so it must be constant. Hence we have the equation

$$\xi(s) = \xi(0)\prod_\rho\left(1-\frac{s}{\rho}\right).$$

But we also have by definition that
$$\xi(s) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s),$$
so we may combine these, take the logarithm, and rearrange to obtain
$$\log\zeta(s) = \log\xi(0) + \sum_\rho\log\left(1-\frac{s}{\rho}\right) - \log\Gamma\!\left(\frac{s}{2}+1\right) + \frac{s}{2}\log\pi - \log(s-1). \qquad (2)$$

3 Building the Formula

3.1 π(x) and J(x)

The end goal is to obtain a formula for $\pi(x)$, which counts the number of primes less than $x$. For our purposes, we shall use the formula
$$\pi(x) = \frac{1}{2}\left(\sum_{p<x}1 + \sum_{p\leq x}1\right).$$
This function starts at $0$ when $x = 0$ and jumps by $1$ at each prime. At each jump, the function assumes the halfway value. Since $\pi(x)$ assumes integer values almost everywhere, it is difficult to imagine why a formula based on analytic techniques should exist.

Riemann next defined the function $J(x)$. Like $\pi(x)$, this function starts at $0$ when $x = 0$ and jumps by $1$ at every prime, but it also jumps by $1/2$ at every prime square, $1/3$ at every prime cube, and so on. It may be defined as
$$J(x) = \frac{1}{2}\left(\sum_{p^n<x}\frac{1}{n} + \sum_{p^n\leq x}\frac{1}{n}\right),$$
where it assumes halfway values at the jumps. The reason this function is interesting is that it may be related to the zeta function as follows.

Consider the product formula for $\zeta(s)$, valid for $\operatorname{Re}(s) > 1$,
$$\zeta(s) = \prod_p\frac{1}{1-\frac{1}{p^s}}.$$
Taking the logarithm of both sides and using the Taylor series for the logarithm yields
$$\log\zeta(s) = \sum_p -\log\left(1-\frac{1}{p^s}\right) = \sum_p\left(\frac{1}{p^s} + \frac{1}{2p^{2s}} + \frac{1}{3p^{3s}} + \cdots\right) = \sum_p\sum_n\frac{p^{-ns}}{n}.$$

Observe that
$$p^{-ns} = s\int_{p^n}^\infty x^{-s-1}\,dx,$$
which follows from elementary calculus. We may substitute this into the formula for $\log\zeta(s)$ to obtain
$$\log\zeta(s) = s\sum_p\sum_n\frac{1}{n}\int_{p^n}^\infty x^{-s-1}\,dx.$$
Because this is absolutely convergent for $\operatorname{Re}(s) > 1$, we may interchange the order of summation and integration, resulting in†
$$\log\zeta(s) = s\int_0^\infty\left(\sum_{p^n<x}\frac{1}{n}\right)x^{-s-1}\,dx = s\int_0^\infty J(x)\,x^{-s-1}\,dx. \qquad (3)$$

This is the key relation between $\zeta(s)$ and $J(x)$; later in this section we shall use this formula again.

Now we need a relation between $J(x)$ and $\pi(x)$. This is given by
$$J(x) = \pi(x) + \frac{1}{2}\pi\!\left(x^{1/2}\right) + \frac{1}{3}\pi\!\left(x^{1/3}\right) + \cdots \qquad (4)$$
where the number of primes less than $x$ is counted with weight $1$, the number of prime squares less than $x$ is counted with weight $1/2$, and so on. Note that the sum is actually a finite sum, as $\pi(x) = 0$ for $x < 2$ (there are no primes less than 2). This shall be helpful, though not necessary, for inverting the relation.

†Note that since jumps occur on a set of measure zero, it does not matter in the sum whether we use $p^n < x$ or $p^n \leq x$.
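The definitions of $\pi(x)$ and $J(x)$ and relation (4) can be verified directly for small $x$. The sketch below is illustrative only; the helper names are ours, and the test points are chosen away from prime powers so that the halfway-value convention at jumps can be ignored.

```python
def prime_pi(x):
    """pi(x) by trial division; fine for the small x used here."""
    count = 0
    for m in range(2, int(x) + 1):
        if all(m % d for d in range(2, int(m ** 0.5) + 1)):
            count += 1
    return count

def J_direct(x):
    """J(x) = sum over prime powers p^n <= x of 1/n, counted directly."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):  # p is prime
            n, pn = 1, p
            while pn <= x:
                total += 1.0 / n
                n += 1
                pn *= p
    return total

def J_via_pi(x):
    """Relation (4): J(x) = sum_n (1/n) pi(x^(1/n)); only finitely many terms."""
    total, n = 0.0, 1
    while x ** (1.0 / n) >= 2:
        total += prime_pi(x ** (1.0 / n)) / n
        n += 1
    return total

if __name__ == "__main__":
    for x in (10.5, 100.5, 1000.5):
        print(x, prime_pi(x), J_direct(x), J_via_pi(x))
```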


The method of inversion will be Möbius inversion. Let $\mu(n)$ denote the Möbius function, defined for $n\in\mathbb{N}$ as
$$\mu(n) = \begin{cases}1, & \text{if } n = 1,\\ (-1)^k, & \text{if } n \text{ is the product of } k \text{ distinct primes},\\ 0, & \text{otherwise.}\end{cases}$$
Then Möbius inversion applied to equation (4) gives
$$\pi(x) = \sum_{n=1}^{\infty}\frac{\mu(n)}{n}\,J\!\left(x^{1/n}\right),$$
which is also a finite sum: when $x < 2$ we have $J(x) = 0$ (there are no primes or prime powers less than 2), so all the terms with $x^{1/n} < 2$ vanish, which means there are only $\lfloor\log x/\log 2\rfloor$ non-zero terms.
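A minimal numerical check of the inversion, under the same illustrative assumptions as the previous sketch (the names are ours, and the Möbius function is computed naively by trial division):

```python
def prime_pi(x):
    """pi(x) by trial division."""
    return sum(1 for m in range(2, int(x) + 1)
               if all(m % d for d in range(2, int(m ** 0.5) + 1)))

def mobius(n):
    """Moebius function mu(n) via trial division."""
    if n == 1:
        return 1
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # repeated prime factor
            result = -result
        d += 1
    return -result if n > 1 else result

def J(x):
    """J(x) computed through relation (4)."""
    total, n = 0.0, 1
    while x ** (1.0 / n) >= 2:
        total += prime_pi(x ** (1.0 / n)) / n
        n += 1
    return total

def pi_from_J(x):
    """Moebius inversion: pi(x) = sum_n mu(n)/n * J(x^(1/n))."""
    total, n = 0.0, 1
    while x ** (1.0 / n) >= 2:
        total += mobius(n) / n * J(x ** (1.0 / n))
        n += 1
    return total

if __name__ == "__main__":
    for x in (100.5, 1000.5):
        print(x, prime_pi(x), pi_from_J(x))
```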

At this point, note that since $J(x)$ counts primes and weighted prime powers below $x$, $J(x)$ grows no faster than $x$ (in fact, the prime number theorem implies $J(x)\sim x/\log x$). Then $J(x)x^{-s-1}$ grows no faster than $x^{-\operatorname{Re}(s)}$; combining this with the fact that $J(x) = 0$ for $x < 2$, we see that $J(x)x^{-s-1}$ is integrable on $(0,\infty)$ when $\operatorname{Re}(s) > 1$. So we may use the inverse Laplace transform on the equation
$$\frac{\log\zeta(s)}{s} = \int_0^\infty J(x)\,x^{-s-1}\,dx,$$
which is a rearrangement of equation (3), to find
$$J(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\log\zeta(s)\,\frac{x^s}{s}\,ds \qquad (5)$$
with $a > 1$.

3.2 The Product Formula and the Result

The next step begins a long line of hard work. We now attempt to substitute equation (2), reprinted below,
$$\log\zeta(s) = \log\xi(0) + \sum_\rho\log\left(1-\frac{s}{\rho}\right) - \log\Gamma\!\left(\frac{s}{2}+1\right) + \frac{s}{2}\log\pi - \log(s-1),$$
into (5). If this works, then we can integrate term-wise and obtain a formula for $J(x)$. Unfortunately, the direct substitution does not work because it leads to divergent integrals. We can, however, first integrate (5) by parts to obtain
$$J(x) = -\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log\zeta(s)}{s}\right]x^s\,ds \qquad (6)$$

and then carry out the processes of substitution and term-wise integration to obtain the desired formula. The integration by parts of (5) depends on the behavior of the term
$$\frac{1}{2\pi i}\cdot\frac{1}{\log x}\cdot\frac{\log\zeta(s)}{s}\,x^s$$
as $s\to a\pm i\infty$. To prove the validity of (6), it suffices to show that
$$\lim_{T\to\infty}\frac{\log\zeta(a\pm iT)}{a\pm iT}\,x^{a\pm iT} = 0.$$
This follows from the inequality
$$|\log\zeta(a\pm iT)| = \left|\sum_n\sum_p\frac{1}{n}\,p^{-n(a\pm iT)}\right| \leq \sum_n\sum_p\frac{1}{n}\,p^{-na} = \log\zeta(a) < \infty,$$
so that the numerator is bounded, the denominator goes to infinity, and the factor $x^{a\pm iT}$ has constant modulus $x^a$. Hence the term goes to zero, and the integration by parts is valid. The next section, in which we integrate term-wise, is the hard part.

4 The Terms of J(x)

After substitution, formula (6) at the end of the last section gives us an integral with five terms. The evaluations of some of these integrals are certainly not trivial. Much of the work in this section is due to Edwards [E1].

For the reader's convenience, the integral is
$$J(x) = -\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log\zeta(s)}{s}\right]x^s\,ds$$
and the terms are
$$\log\zeta(s) = \log\xi(0) + \sum_\rho\log\left(1-\frac{s}{\rho}\right) - \log\Gamma\!\left(\frac{s}{2}+1\right) + \frac{s}{2}\log\pi - \log(s-1),$$

derived in the previous sections.

4.1 The Main Term

We shall start with the $-\log(s-1)$ term. This becomes
$$\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log(s-1)}{s}\right]x^s\,ds.$$

To compute this integral, we first define a few auxiliary functions, the first of which is
$$F(\beta) = \frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left\{\frac{\log[(s/\beta)-1]}{s}\right\}x^s\,ds,$$
where our term in question is the special case $F(1)$. To extend $F$, we take $a > \operatorname{Re}\beta$ and define $\log[(s/\beta)-1]$ as $\log(s-\beta) - \log\beta$, following the principal branch of the logarithm. Moreover, the integral is absolutely convergent, because
$$\left|\frac{d}{ds}\,\frac{\log[(s/\beta)-1]}{s}\right| \leq \frac{|\log[(s/\beta)-1]|}{|s|^2} + \frac{1}{|s(s-\beta)|}$$

is integrable along the line of integration, while $x^s$ merely oscillates there. Now we use the derivative
$$\frac{d}{d\beta}\,\frac{\log[(s/\beta)-1]}{s} = \frac{1}{(\beta-s)\beta}$$
to obtain
$$F'(\beta) = \frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{1}{(\beta-s)\beta}\right]x^s\,ds = -\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{x^s}{(\beta-s)\beta}\,ds = \frac{1}{2\pi i\,\beta}\int_{a-i\infty}^{a+i\infty}\frac{x^s}{s-\beta}\,ds,$$
where the first step comes from differentiation under the integral sign, the second from integration by parts, and the third from trivial rearrangement.

This can be computed. Consider the function
$$\frac{1}{s-\beta} = \int_1^\infty x^{-s}x^{\beta-1}\,dx \qquad [\operatorname{Re}(s-\beta) > 0].$$
Substitute $x = e^\lambda$, $dx = e^\lambda d\lambda$, and write $s = a+i\mu$ to obtain
$$\frac{1}{a+i\mu-\beta} = \int_0^\infty e^{-i\lambda\mu}e^{\lambda(\beta-a)}\,d\lambda \qquad [a > \operatorname{Re}(\beta)],$$
which gives, from Fourier inversion,
$$\int_{-\infty}^{\infty}\frac{1}{a+i\mu-\beta}\,e^{i\mu x}\,d\mu = \begin{cases}2\pi e^{x(\beta-a)}, & \text{if } x > 0,\\ 0, & \text{if } x < 0.\end{cases}$$
It follows that
$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{1}{s-\beta}\,y^s\,ds = \begin{cases}y^\beta, & \text{if } y > 1,\\ 0, & \text{if } y < 1.\end{cases} \qquad (7)$$

Since we already have $x > 1$, it follows that $F'(\beta) = x^\beta/\beta$.

The next step is to evaluate a contour integral. Let $C^+$ be the contour from $0$ to $x$ that consists of the real line segment from $0$ to $1-\epsilon$, the semicircle in the upper half-plane $\operatorname{Im} t \geq 0$ from $1-\epsilon$ to $1+\epsilon$, and then the real line segment from $1+\epsilon$ to $x$. Define
$$G(\beta) = \int_{C^+}\frac{t^{\beta-1}}{\log t}\,dt$$

and note that
$$G'(\beta) = \int_{C^+}t^{\beta-1}\,dt = \frac{t^\beta}{\beta}\bigg|_0^x = F'(\beta).$$
Since $G(\beta)$ is defined and analytic for $\operatorname{Re}(\beta) > 0$, $G(\beta)$ and $F(\beta)$ must differ by a constant. The hope is that we can compute this constant and hence find $F(\beta)$ as $G(\beta)$ plus a constant.

We shall evaluate the constant by setting $\beta = \sigma + i\tau$, holding $\sigma$ fixed, letting $\tau\to\infty$, and evaluating $F(\beta)$ and $G(\beta)$. First, we evaluate the limit of $G(\beta)$. Making the change of variable $t = e^u$ puts $G(\beta)$ in the form
$$\int_{i\delta-\infty}^{i\delta+\log x}\frac{e^{\beta u}}{u}\,du + \int_{i\delta+\log x}^{\log x}\frac{e^{\beta u}}{u}\,du.$$
Note that the path of integration has been altered slightly, which is justified by Cauchy's integral theorem. The further changes of variable $u = i\delta+v$ in the first integral and $u = \log x + iw$ in the second put $G(\beta)$ in the form
$$e^{i\delta\sigma}e^{-\delta\tau}\int_{-\infty}^{\log x}\frac{e^{\sigma v}}{i\delta+v}\,e^{i\tau v}\,dv \;-\; ix^\beta\int_0^\delta\frac{e^{-\tau w}e^{i\sigma w}}{\log x+iw}\,dw,$$
and both of these approach $0$ as $\tau\to\infty$: in the first integral the factor $e^{-\delta\tau}\to 0$ is enough to force the value to $0$, and in the second $e^{-\tau w}\to 0$ except at $w = 0$. Therefore, the limit of $G(\beta)$ as $\tau\to\infty$ is $0$.

Evaluating the limit of $F(\beta)$ is a bit trickier. Define another auxiliary function
$$H(\beta) = \frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left\{\frac{\log[1-(s/\beta)]}{s}\right\}x^s\,ds,$$
where $a > \operatorname{Re}\beta$ and $\log[1-(s/\beta)]$ is defined for complex $\beta$ as $\log(s-\beta) - \log(-\beta)$. The goal is to compare this to $F(\beta)$ and thereby to $G(\beta)$. In the upper half-plane $\operatorname{Im}\beta > 0$, the difference is
$$H(\beta) - F(\beta) = \frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log\beta - \log(-\beta)}{s}\right]x^s\,ds = \frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{i\pi}{s}\right]x^s\,ds = -\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{i\pi}{s}\,x^s\,ds = -i\pi,$$

where the last step is derived from equation (7). Therefore, $F(\beta) = H(\beta)+i\pi$ in the upper half-plane, reducing the problem to finding the limit of $H(\beta)$ as $\tau\to\infty$. From the derivative
$$\frac{d}{ds}\,\frac{\log[1-(s/\beta)]}{s} = -\frac{\log[1-(s/\beta)]}{s^2} + \frac{1}{s(s-\beta)} = -\frac{\log[1-(s/\beta)]}{s^2} + \frac{1}{\beta(s-\beta)} - \frac{1}{\beta s},$$
we may put this into the integral defining $H(\beta)$. The first term gives
$$-\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{\log[1-(s/\beta)]}{s^2}\,x^s\,ds.$$
Since $1-(s/\beta)\to 1$, and hence $\log[1-(s/\beta)]\to 0$, as $|\beta|\to\infty$, the numerator is strongly bounded. The denominator is $s^2$, which grows like $|s|^2$, and $x^s$ merely oscillates along the line of integration. The $1/s^2$ decay means that we may use the Lebesgue dominated convergence theorem, so the limit of the integral is the integral of the limit, which is $0$ because of the factor $\log[1-(s/\beta)]$ in the numerator. Hence this integral tends to $0$. The second and third terms combine to give
$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\left[\frac{1}{\beta(s-\beta)} - \frac{1}{\beta s}\right]x^s\,ds = \frac{x^\beta}{\beta} - \frac{1}{\beta}$$
from equation (7). The numerators are bounded and $|\beta|\to\infty$, hence these terms go to $0$, and so $H(\beta)\to 0$. This implies $F(\beta)\to i\pi$, and thus $F(\beta) = G(\beta)+i\pi$ in the half-plane $\operatorname{Re}\beta > 0$. Finally, this allows us to write the main $J(x)$ term as

$$F(1) = \int_0^{1-\epsilon}\frac{dt}{\log t} + \int_{1-\epsilon}^{1+\epsilon}\frac{dt}{\log t} + \int_{1+\epsilon}^{x}\frac{dt}{\log t} + i\pi,$$
where the middle integral is taken over the upper semicircle of $C^+$. Taking the limit as $\epsilon\to 0$, the second integral approaches a small semicircle around the simple pole of $1/\log t$ at $t = 1$ (which has residue $1$), traversed with negative orientation, so by the residue theorem
$$\int_{1-\epsilon}^{1+\epsilon}\frac{dt}{\log t} \to -i\pi.$$
This implies that the $i\pi$ terms cancel and we are left with
$$F(1) = \lim_{\epsilon\to 0}\left(\int_0^{1-\epsilon}\frac{dt}{\log t} + \int_{1+\epsilon}^{x}\frac{dt}{\log t}\right) = \operatorname{Li}(x).$$
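The function $\operatorname{Li}(x)$ appearing here is the Cauchy principal value of $\int_0^x dt/\log t$. As a quick illustration (assuming mpmath is available; the cutoff $\epsilon$ is arbitrary and the agreement is only approximate), one can compute the symmetric limit directly and compare it with mpmath's li:

```python
from mpmath import mp, li, quad, log

mp.dps = 25

def Li_principal_value(x, eps=1e-4):
    """Approximate PV of int_0^x dt/log t by removing (1-eps, 1+eps)."""
    left = quad(lambda t: 1 / log(t), [0, 1 - eps])
    right = quad(lambda t: 1 / log(t), [1 + eps, x])
    return left + right

if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, Li_principal_value(x), li(x))
```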

4.2 The Oscillatory Term

Next, we shall look at the term
$$\sum_\rho\log\left(1-\frac{s}{\rho}\right),$$
which involves the nontrivial roots of the zeta function. In the integral form for $J(x)$, this becomes
$$-\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\sum_\rho\log\left(1-\frac{s}{\rho}\right)}{s}\right]x^s\,ds. \qquad (8)$$

At this point it is not clear what to do, since we do not know whether the integral and sum can be interchanged. Riemann did not know how to prove this, but he assumed it could be done. We will see in a later section that if we assume the interchange is valid, the final result is the correct one, despite the possible invalidity of the method.

Assuming we can interchange the integral and sum, this expression becomes
$$-\sum_\rho H(\rho),$$
with the same $H(\rho)$ as defined in the previous section. We showed that $H(\rho) = G(\rho)$ in the first quadrant ($\operatorname{Re}\rho > 0$, $\operatorname{Im}\rho > 0$), and if we take the integral defining $G(\rho)$ to pass through the lower half-plane, the same holds for $\rho$ in the fourth quadrant ($\operatorname{Re}\rho > 0$, $\operatorname{Im}\rho \leq 0$). That is, let $C^-$ be the contour that goes in a line segment from $0$ to $1-\epsilon$, in a semicircle in the lower half-plane ($\operatorname{Im} t < 0$) from $1-\epsilon$ to $1+\epsilon$, and then in a line segment from $1+\epsilon$ to $x$. Then, after pairing the terms $\rho$ and $1-\rho$, we find that the total sum is equal to
$$-\sum_{\operatorname{Im}\rho>0}\left(\int_{C^+}\frac{t^{\rho-1}}{\log t}\,dt + \int_{C^-}\frac{t^{-\rho}}{\log t}\,dt\right).$$

If $\beta$ is real and positive, then the change of variable $u = t^\beta$, $\log t = (\log u)/\beta$, $dt/t = du/(u\beta)$ gives
$$\int_{C^+}\frac{t^{\beta-1}}{\log t}\,dt = \int_0^{x^\beta}\frac{du}{\log u} = \operatorname{Li}(x^\beta) - i\pi,$$
where the path from $0$ to $x^\beta$ passes in the upper half-plane near $u = 1$. The integral on the left converges for all $\beta$ in the half-plane $\operatorname{Re}\beta > 0$ and thus gives an analytic continuation of $\operatorname{Li}(x^\beta)$ to this half-plane. On the other hand,
$$\int_{C^-}\frac{t^{\beta-1}}{\log t}\,dt = \operatorname{Li}(x^\beta) + i\pi$$
by a similar argument. Thus, taking $\beta = \rho$ in the first integral and $\beta = 1-\rho$ in the second, the formula for equation (8) becomes
$$-\sum_{\operatorname{Im}\rho>0}\left[\operatorname{Li}(x^\rho) + \operatorname{Li}(x^{1-\rho})\right].$$
We must be careful, as this sum converges only conditionally; we take the sum in order of increasing $|\operatorname{Im}(\rho)|$.
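To get a feel for this oscillatory term, one can evaluate truncations of it numerically. The sketch below is an illustration only: it assumes mpmath is available, identifies $\operatorname{Li}(x^\rho)$ with $\operatorname{Ei}(\rho\log x)$ (a standard way to evaluate the analytically continued $\operatorname{Li}$; any $\pm i\pi$ branch ambiguity is purely imaginary and drops out of the paired real part), and uses the fact that the zeros returned by zetazero lie on the critical line, so that $1-\rho = \bar\rho$ and each pair contributes $2\operatorname{Re}\operatorname{Ei}(\rho\log x)$.

```python
from mpmath import mp, mpf, zetazero, ei, log

mp.dps = 20

def zero_sum(x, n_zeros):
    """Partial sum over the lowest n_zeros pairs (rho, 1-rho) of Li(x^rho) + Li(x^(1-rho))."""
    x = mpf(x)
    total = mpf(0)
    for n in range(1, n_zeros + 1):
        rho = zetazero(n)   # n-th nontrivial zero with positive imaginary part
        total += 2 * ei(rho * log(x)).real
    return total

if __name__ == "__main__":
    # The partial sums oscillate; convergence is slow and only conditional.
    for n in (5, 10, 25, 50):
        print(n, zero_sum(20, n))
```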

4.3 The Constant Term

The next term is
$$\log\xi(0),$$
which becomes, in the integral,
$$-\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log\xi(0)}{s}\right]x^s\,ds.$$

Integrating by parts and using equation (7), we find that the above is equal to
$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{\log\xi(0)}{s}\,x^s\,ds = \log\xi(0).$$
Since $\xi(0) = \Gamma(1)\,\pi^{0}\,(0-1)\,\zeta(0) = -\zeta(0) = \tfrac{1}{2}$, this contribution is
$$\log\xi(0) = -\log 2.$$

4.4 The Integral Term

The last useful term is
$$\log\Gamma\!\left(\frac{s}{2}+1\right),$$
and the corresponding integral is
$$\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\log\Gamma\!\left(\frac{s}{2}+1\right)}{s}\right]x^s\,ds. \qquad (9)$$

Using formula (1), a property of the gamma function, we may rewrite
$$\log\Gamma\!\left(\frac{s}{2}+1\right) = \sum_{n=1}^{\infty}\left[-\log\left(1+\frac{s}{2n}\right) + \frac{s}{2}\log\left(1+\frac{1}{n}\right)\right].$$

Putting this formula into (9) and assuming that we can interchange the sum and integral, we may write (9) in the form
$$\sum_{n=1}^{\infty}\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left\{\frac{-\log[1+(s/2n)]}{s}\right\}x^s\,ds,$$
where only the first sum survives (the second sum vanishes because, after division by $s$, it is constant in $s$ and so has derivative $0$). But this is equal to
$$-\sum_{n=1}^{\infty}H(-2n),$$

where $H$ is defined as in section 4.1 in the evaluation of the main term. In that section we only evaluated $H$ for $\operatorname{Re}(\beta) > 0$. To analyze the behavior of $H$ in $\operatorname{Re}(\beta) < 0$, define
$$E(\beta) = -\int_x^\infty\frac{t^{\beta-1}}{\log t}\,dt$$
and note that
$$E'(\beta) = -\int_x^\infty t^{\beta-1}\,dt = \frac{x^\beta}{\beta} = F'(\beta) = H'(\beta),$$

so that $E(\beta)$ and $H(\beta)$ differ by a constant. Now both $E$ and $H$ approach zero as $\beta\to\infty$, and so the constant is zero, giving $E(\beta) = H(\beta)$. Thus our term becomes
$$-\sum_{n=1}^{\infty}H(-2n) = \sum_{n=1}^{\infty}\int_x^\infty\frac{t^{-2n-1}}{\log t}\,dt = \int_x^\infty\frac{1}{t\log t}\sum_{n=1}^{\infty}t^{-2n}\,dt = \int_x^\infty\frac{dt}{t(t^2-1)\log t},$$

assuming that termwise integration is valid.

To show that it is, we consider
$$\frac{d}{ds}\,\frac{\log\Gamma(s/2+1)}{s} = \sum_{n=1}^{\infty}\frac{d}{ds}\left\{\frac{-\log[1+(s/2n)]}{s}\right\}.$$
For large $n$, the Taylor series expansion $\log(1+x) = x - \tfrac12 x^2 + \tfrac13 x^3 - \cdots$ shows that the $n$th term is
$$\frac{d}{ds}\left\{\frac{-\log[1+(s/2n)]}{s}\right\} = \frac{1}{2}\cdot\frac{1}{4n^2} - \frac{2}{3}\cdot\frac{s}{8n^3} + \frac{3}{4}\cdot\frac{s^2}{16n^4} - \cdots,$$
and the series of these terms converges uniformly on compact sets, as the highest-order term in $n$ is $n^{-2}$. This justifies termwise differentiation. The termwise integration is likewise justified, as the terms decay like $1/n^2$ and the sum is hence uniformly convergent.

4.5 The Vanishing Term

The final term we look at is
$$\frac{s}{2}\log\pi,$$
which, as it turns out, vanishes completely in the formula for $J(x)$, because
$$-\frac{1}{2\pi i}\cdot\frac{1}{\log x}\int_{a-i\infty}^{a+i\infty}\frac{d}{ds}\!\left[\frac{\frac{s}{2}\log\pi}{s}\right]x^s\,ds = 0.$$
Once divided by $s$, the term becomes a constant, so its derivative is $0$, and thus the entire term is $0$.

4.6 Result

In the final analysis, we have
$$J(x) = \operatorname{Li}(x) - \sum_\rho\operatorname{Li}(x^\rho) - \log 2 + \int_x^\infty\frac{dt}{t(t^2-1)\log t}$$
for $x > 1$, with the sum in the second term only conditionally convergent (one must sum in order of increasing $|\operatorname{Im}(\rho)|$). Combining this formula with
$$\pi(x) = \sum_{n=1}^{\infty}\frac{\mu(n)}{n}\,J\!\left(x^{1/n}\right)$$
gives an analytic formula for $\pi(x)$. Remembering that this formula involves a finite sum, we can see easily that if the formula for $J(x)$ is valid, then the formula for $\pi(x)$ must also be valid.
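As an end-to-end illustration (not part of the paper's argument, and assuming mpmath is available), the following sketch assembles the right-hand side with the first hundred zeros and compares it with $J(x)$ counted directly from the prime powers. As in the previous sketch, $\operatorname{Li}(x^\rho)$ is evaluated as $\operatorname{Ei}(\rho\log x)$ and the zeros are paired with their reflections; with a truncated zero sum the agreement is only approximate and oscillatory.

```python
from mpmath import mp, mpf, zetazero, ei, li, log, quad, inf

mp.dps = 20

def J_exact(x):
    """J(x) = sum over prime powers p^n <= x of 1/n, by direct counting."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):  # p is prime
            n, pn = 1, p
            while pn <= x:
                total += 1.0 / n
                n += 1
                pn *= p
    return total

def J_explicit(x, n_zeros=100):
    """Riemann's formula: Li(x) - sum_rho Li(x^rho) - log 2 + tail integral."""
    x = mpf(x)
    zero_term = sum(2 * ei(zetazero(n) * log(x)).real for n in range(1, n_zeros + 1))
    tail = quad(lambda t: 1 / (t * (t * t - 1) * log(t)), [x, inf])
    return li(x) - zero_term - log(2) + tail

if __name__ == "__main__":
    x = 20.5   # chosen away from prime powers
    print("direct:  ", J_exact(x))
    print("explicit:", J_explicit(x))
```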

We have not yet shown the validity of termwise integration for the second term,
$$\sum_\rho\operatorname{Li}(x^\rho).$$
A proof dealing with this sum directly was not discovered until 1908, nearly half a century after Riemann's paper, by Landau [L1]. There were also methods of indirect proof involving formulas for functions similar to $J(x)$, one of which we shall examine in the next section.

5 The Von Mangoldt Formula

5.1 Deriving the Formula

Consider a counting function that counts primes and prime powers, weighted by the log of the prime; that is,
$$\psi(x) = \sum_{p^n<x}\log p,$$
where the function assumes the halfway value at each jump.

This function satisfies the corresponding formula (proved by von Mangoldt in 1894; see [E1])
$$\psi(x) = x - \sum_\rho\frac{x^\rho}{\rho} - \log(2\pi) + \sum_n\frac{x^{-2n}}{2n}$$
for $x > 1$. While we shall not fully prove it here, we can show that it is a very reasonable result. One can differentiate the formula for $J(x)$ to obtain

$$dJ = \left(\frac{1}{\log x} - \sum_\rho\frac{x^{\rho-1}}{\log x} - \frac{1}{x(x^2-1)\log x}\right)dx.$$
Now, since $J$ jumps by $1/n$ at prime powers, $dJ = 1/n$ at $x = p^n$. Similarly, $d\psi = \log p = (1/n)\log(p^n) = (1/n)\log x$ at $x = p^n$. Both are $0$ everywhere else. Hence these equations give
$$d\psi = (\log x)\,dJ = \left(1 - \sum_\rho x^{\rho-1} - \sum_n x^{-2n-1}\right)dx,$$

where the last term can be derived with a geometric series. This leads to the plausible guess that
$$\psi(x) = x - \sum_\rho\frac{x^\rho}{\rho} + \sum_n\frac{x^{-2n}}{2n} + C.$$
The hard part in showing that von Mangoldt's formula holds is showing that the oscillatory term, i.e. $\sum_\rho x^\rho/\rho$, converges.
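Von Mangoldt's formula lends itself to the same kind of numerical experiment. The sketch below is illustrative only: it assumes mpmath, truncates the zero sum at an arbitrary $N$, and again pairs each zero with its reflection, so that each pair contributes $2\operatorname{Re}(x^\rho/\rho)$.

```python
from mpmath import mp, mpf, zetazero, log

mp.dps = 20

def psi_exact(x):
    """Chebyshev psi(x): sum of log p over prime powers p^n <= x."""
    total = mpf(0)
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):  # p is prime
            pn = p
            while pn <= x:
                total += log(p)
                pn *= p
    return total

def psi_explicit(x, n_zeros=100):
    """x - sum_rho x^rho/rho - log(2 pi) + sum_n x^(-2n)/(2n), truncated."""
    x = mpf(x)
    zero_term = sum(2 * (x ** zetazero(n) / zetazero(n)).real
                    for n in range(1, n_zeros + 1))
    trivial_term = sum(x ** (-2 * n) / (2 * n) for n in range(1, 50))
    return x - zero_term - log(2 * mp.pi) + trivial_term

if __name__ == "__main__":
    x = 50.5   # chosen away from prime powers
    print("direct:  ", psi_exact(x))
    print("explicit:", psi_explicit(x))
```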

To derive such a formula for $\psi(x)$ in terms of $\zeta$, von Mangoldt used the same method as Riemann: he first found a formula for $\zeta(s)$ in terms of an integral of $\psi(x)$, and then inverted the transform. In his case, he found
$$-\frac{\zeta'(s)}{\zeta(s)} = s\int_0^\infty\psi(x)\,x^{-s-1}\,dx,$$
which comes from logarithmically differentiating the product formula for zeta, and then he applied the inverse transform to obtain
$$\psi(x) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\left[-\frac{\zeta'(s)}{\zeta(s)}\right]x^s\,\frac{ds}{s} \qquad (10)$$
for $a > 1$.

For the next step, we shall find a formula for $-\zeta'(s)/\zeta(s)$ and take the integral termwise. The reader will probably recognize this process as nearly identical to the process Riemann used to find $J(x)$.

Using the equation
$$\xi(0)\prod_\rho\left(1-\frac{s}{\rho}\right) = \Gamma(s/2+1)\,(s-1)\,\pi^{-s/2}\,\zeta(s)$$
developed at the end of section 2.4 and logarithmically differentiating, we find that
$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{s-1} - \sum_\rho\frac{1}{s-\rho} + \sum_n\left[-\frac{1}{s+2n} + \frac{1}{2}\log\left(1+\frac{1}{n}\right)\right] - \frac{1}{2}\log\pi.$$

Plugging in $s = 0$ gives
$$-\frac{\zeta'(0)}{\zeta(0)} = -1 + \sum_\rho\frac{1}{\rho} + \sum_n\left[-\frac{1}{2n} + \frac{1}{2}\log\left(1+\frac{1}{n}\right)\right] - \frac{1}{2}\log\pi,$$

which, when subtracted from the previous equation, gives
$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{s}{s-1} - \sum_\rho\frac{s}{\rho(s-\rho)} + \sum_n\frac{s}{2n(s+2n)} - \frac{\zeta'(0)}{\zeta(0)}. \qquad (11)$$

5.2 The $\sum_\rho\frac{x^\rho}{\rho}$ Term

When we plug equation (11) into the integral in equation (10), the terms actually converge, so we do not need the extra step of integrating by parts as we did for $J(x)$. This simplifies the calculation immensely. We shall skip over the calculation of the first, third, and fourth terms, as we already know from the calculation of $J(x)$ what they should be (except for the value of the constant) and why they converge. We shall concern ourselves with the second term, arising from the nontrivial zeros, namely

$$-\sum_\rho\frac{s}{\rho(s-\rho)},$$
with the corresponding integral expression (dropping the overall minus sign, which is restored in the final formula)
$$\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\left[\sum_\rho\frac{s}{\rho(s-\rho)}\right]x^s\,\frac{ds}{s}. \qquad (12)$$
The goal will be to show that this term converges and is equal to
$$\sum_\rho\frac{x^\rho}{\rho}. \qquad (13)$$

If we pair the roots $\rho$ and $1-\rho$ (such pairs exist because $\xi(s) = \xi(1-s)$), we find that the sum actually converges uniformly. This can be seen from
$$\left|\frac{1}{s-\rho} + \frac{1}{s-(1-\rho)}\right| = \left|\frac{1}{(s-\tfrac12)-(\rho-\tfrac12)} + \frac{1}{(s-\tfrac12)+(\rho-\tfrac12)}\right| = \left|\frac{2(s-\tfrac12)}{(s-\tfrac12)^2-(\rho-\tfrac12)^2}\right| \leq C\left|\frac{1}{(\rho-\tfrac12)^2}\right|$$
for large $\rho$, together with the fact that
$$\sum_\rho\frac{1}{\left|\rho-\tfrac12\right|^{1+\epsilon}} < \infty,$$
which is essentially due to $\xi(s)$ having order of growth $1$. The uniform convergence implies that this sum can be integrated termwise over finite intervals. Thus the term (12) is equal to

$$\lim_{h\to\infty}\sum_\rho\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^s\,ds}{\rho(s-\rho)} = \lim_{h\to\infty}\sum_\rho\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho}$$
and defines the correct term in the formula for $\psi(x)$. It is not hard to find that, for $x > 1$,
$$\lim_{h\to\infty}\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} = 1,$$

which follows immediately from the formula
$$\lim_{h\to\infty}\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{y^s\,ds}{s} = \begin{cases}0, & \text{if } 0 < y < 1,\\[2pt] \tfrac12, & \text{if } y = 1,\\[2pt] 1, & \text{if } y > 1.\end{cases}$$

This would imply that the term (12) converges to
$$\lim_{h\to\infty}\sum_\rho\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} = \sum_\rho\frac{x^\rho}{\rho},$$
if we are allowed to interchange the limit and the sum. If this is possible, then we will have shown that (13) converges.

To do this, we shall follow von Mangoldt's proof, which takes the limit "diagonally" using the function
$$\sum_{|\operatorname{Im}(\rho)|\leq h}\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho}. \qquad (14)$$

Before doing the proof, we need two bounds on the integral
$$\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{y^s\,ds}{s}.$$
The first bound is
$$\left|\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{y^s\,ds}{s} - 1\right| \leq \frac{y^a}{\pi h\log y} \qquad (15)$$
with $y > 1$ and $a > 0$, and the second is
$$\left|\frac{1}{2\pi i}\int_{a+ic}^{a+id}\frac{y^s\,ds}{s}\right| \leq K\,\frac{y^a}{(a+c)\log y} \qquad (16)$$
where $y > 1$, $a > 0$, and $d > c \geq 0$. The proofs of both can be found in [E1], and we shall not repeat them in this paper.

We also need a statement about the density of the roots $\rho$: namely, there exists $H$ such that for $T \geq H$, the number of roots in the region $T \leq \operatorname{Im}(\rho) \leq T+1$ is less than $2\log T$. It is clear, due to $\xi(s)$ having order of growth $1$, that this density must be less than $T^\epsilon$, but obtaining the bound $2\log T$ requires a bit more detail, and it in fact uses Stirling's approximation for the gamma function. We shall not give the proof here; it can also be found in [E1].

Now, on with the proof that (12) converges to (13). Consider, for a given $h$, the differences
$$\sum_\rho\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} \;-\; \sum_{|\operatorname{Im}(\rho)|\leq h}\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} \qquad (17)$$
and
$$\sum_{|\operatorname{Im}(\rho)|\leq h}\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} \;-\; \sum_{|\operatorname{Im}(\rho)|\leq h}\frac{x^\rho}{\rho}. \qquad (18)$$
The goal will be to show that both of these tend to $0$ as $h\to\infty$, which will prove that (12) is equal to (13); and since the former converges, so does the latter.

We shall consider first an estimate of (17). Write $\rho = \beta+i\gamma$. From (16), we see that the modulus of (17) is at most
$$\sum_{|\gamma|>h}\left|\frac{x^\rho}{\rho}\right|\left|\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho}\right| \leq 2\sum_{\gamma>h}\frac{x^\beta}{\gamma}\left|\frac{1}{2\pi i}\int_{a-\beta+i(\gamma-h)}^{a-\beta+i(\gamma+h)}\frac{x^t\,dt}{t}\right| \leq 2\sum_{\gamma>h}\frac{x^\beta}{\gamma}\cdot K\,\frac{x^{a-\beta}}{(a-\beta+\gamma-h)\log x} \leq \frac{2Kx^a}{\log x}\sum_{\gamma>h}\frac{1}{\gamma(\gamma-h+c)},$$
where $c = a-1 > 0$, so that $c \leq a-\beta$ for all roots $\rho$. Grouping the roots with $\gamma > h$ into the intervals $h < \gamma \leq h+1$, $h+1 < \gamma \leq h+2$, and so on, we see that for large $h$ the interval $h+j < \gamma \leq h+j+1$ contains at most $2\log(h+j)$ roots, and thus the modulus of (17) is at most a constant times
$$\sum_{j=0}^{\infty}\frac{\log(h+j)}{(h+j)(j+c)}.$$
This sum clearly converges, since the denominator grows like $j^2$. However, we need to show that as $h\to\infty$ the sum converges to $0$. Choosing $h$ large enough that $\log(h+j) < (h+j)^{1/2}$ for all $j\geq 0$, the sum is at most
$$\sum_{j=0}^{\infty}\frac{1}{(h+j)^{1/2}(j+c)},$$

which can be made arbitrarily small by choosing $h$ large. Hence (17) goes to $0$.

Now consider (18). The modulus of (18) is at most
$$2\sum_{0<\gamma\leq h}\frac{x^\beta}{\gamma}\left|\frac{1}{2\pi i}\int_{a-\beta-i(\gamma+h)}^{a-\beta+i(h-\gamma)}\frac{x^t\,dt}{t} - 1\right|.$$
Note the difference in the bounds of integration. The integral bounds (15) and (16) imply that this is at most
$$2\sum_{0<\gamma\leq h}\frac{x^\beta}{\gamma}\left|\frac{1}{2\pi i}\int_{a-\beta-i(h+\gamma)}^{a-\beta+i(h+\gamma)}\frac{x^t\,dt}{t} - 1\right| + 2\sum_{0<\gamma\leq h}\frac{x^\beta}{\gamma}\left|\frac{1}{2\pi i}\int_{a-\beta+i(h-\gamma)}^{a-\beta+i(h+\gamma)}\frac{x^t\,dt}{t}\right|$$
$$\leq 2\sum_{0<\gamma\leq h}\frac{x^\beta}{\gamma}\cdot\frac{x^{a-\beta}}{\pi(h+\gamma)\log x} + 2\sum_{0<\gamma\leq h}\frac{x^\beta}{\gamma}\cdot K\,\frac{x^{a-\beta}}{(a-\beta+h-\gamma)\log x}$$
$$\leq \frac{2x^a}{\pi\log x}\sum_{0<\gamma\leq h}\frac{1}{\gamma(h+\gamma)} + \frac{2Kx^a}{\log x}\sum_{0<\gamma\leq h}\frac{1}{\gamma(c+h-\gamma)},$$

where $c = a-1 > 0$ and $c \leq a-\beta$ as before. Now we just need to show that the two sums
$$\sum_{0<\gamma\leq h}\frac{1}{\gamma(h+\gamma)} \qquad\text{and}\qquad \sum_{0<\gamma\leq h}\frac{1}{\gamma(c+h-\gamma)}$$

both go to $0$. For the first sum, let $H$ be an integer large enough that each interval $H+j \leq \gamma \leq H+j+1$ contains at most $2\log(H+j)$ roots. Then
$$\sum_{0<\gamma\leq h}\frac{1}{\gamma(h+\gamma)} \leq \sum_{0<\gamma\leq H}\frac{1}{\gamma(h+\gamma)} + \sum_{0\leq j\leq h-H}\frac{2\log(H+j)}{(H+j)(h+H+j)},$$

where the first sum has a finite number of terms and thus goes to $0$ as $h\to\infty$. The second sum is at most
$$2\sum_{0\leq j\leq h-H}(\log h)\left[\frac{1}{h}\left(\frac{1}{H+j} - \frac{1}{h+H+j}\right)\right] \leq \frac{2\log h}{h}\sum_{0\leq j\leq h-H}\frac{1}{H+j} \leq \frac{2\log h}{h}\int_{H-1}^{h}\frac{dt}{t} \leq \frac{2(\log h)^2}{h},$$

which goes to $0$ as $h\to\infty$. A similar calculation shows that the sum
$$\sum_{0<\gamma\leq h}\frac{1}{\gamma(c+h-\gamma)}$$
goes to $0$ as $h\to\infty$.

With this, we have shown that (17) and (18) go to $0$, and hence
$$\lim_{h\to\infty}\left[\sum_\rho\frac{x^\rho}{\rho}\cdot\frac{1}{2\pi i}\int_{a-ih}^{a+ih}\frac{x^{s-\rho}\,ds}{s-\rho} - \sum_{|\operatorname{Im}(\rho)|\leq h}\frac{x^\rho}{\rho}\right] = 0,$$
and therefore we have shown the convergence of
$$\sum_\rho\frac{x^\rho}{\rho}.$$


6 The Prime Number Theorem and Concluding Remarks

After the argument in the previous section, von Mangoldt then uses a Stieltjes integral to transform the formula for $\psi(x)$ into the formula Riemann obtained for $J(x)$ (the integral is based on $d\psi = (\log x)\,dJ$). Note that there is no circular reasoning here, as von Mangoldt proved the formula for $\psi(x)$ without using $J(x)$ at all; the plausibility argument at the beginning of section 5.1 using $d\psi = (\log x)\,dJ$ is only a consistency check, not part of the proof. In the Stieltjes integral that von Mangoldt computed, there were two terms corresponding to the convergent sum $\sum_\rho x^\rho/\rho$: a first term that contained the sum over $\rho$ but did not contain the variable over which he was integrating, hence the validity of termwise integration, and a second term that contained $\rho^2$ in the denominator, so that the sum converged uniformly. These formulas can be found on p. 63 of [E1].

With these facts, we have an indirect proof that the second term in $J(x)$, i.e. the term $\sum_\rho\operatorname{Li}(x^\rho)$, converges. Then the formula
$$J(x) = \operatorname{Li}(x) - \sum_\rho\operatorname{Li}(x^\rho) - \log 2 + \int_x^\infty\frac{dt}{t(t^2-1)\log t},$$
where $x > 1$ and the second term is summed in order of increasing $|\operatorname{Im}(\rho)|$, is valid.

We now turn our attention, for the remainder of the paper, to the prime number theorem
$$\pi(x) \sim \frac{x}{\log x},$$
which can almost be seen in Riemann's formula, as $\pi(x)\sim J(x)$ and $\operatorname{Li}(x)\sim x/\log x$. Obviously the third and fourth terms do not grow, but to show the prime number theorem one must show that
$$\lim_{x\to\infty}\frac{1}{x/\log x}\sum_\rho\operatorname{Li}(x^\rho) = 0.$$

Perhaps it is much easier to see this with von Mangoldt's formula, after noting that $\psi(x)\sim\pi(x)\log x$. Then the prime number theorem amounts to showing that $\psi(x)\sim x$. From von Mangoldt's formula
$$\psi(x) = x - \sum_\rho\frac{x^\rho}{\rho} - \log(2\pi) + \sum_n\frac{x^{-2n}}{2n},$$
we see that the prime number theorem is equivalent to
$$\lim_{x\to\infty}\frac{-\sum_\rho\frac{x^\rho}{\rho} - \log(2\pi) + \sum_n\frac{x^{-2n}}{2n}}{x} = 0.$$

Since the last two terms do not grow with $x$, it suffices to show that
$$\lim_{x\to\infty}\sum_\rho\frac{x^{\rho-1}}{\rho} = 0,$$
which would follow from $x^{\rho-1}\to 0$ for all $\rho$. This requires the proof that there are no zeros on the line $\operatorname{Re}(s) = 1$, which is precisely what Hadamard and de la Vallée Poussin showed in their proofs of the prime number theorem.

References

[D1] Derbyshire, J., Prime Obsession, Joseph Henry Press, Washington, DC, 2003.

[E1] Edwards, H. M., Riemann's Zeta Function, Academic Press, New York, NY, 1974.

[L1] Landau, E., Nouvelle démonstration pour la formule de Riemann..., Ann. Sci. École Norm. Sup., 25, 399–442 (1908).

[S1] Stein, E. M. and Shakarchi, R., Complex Analysis, Princeton University Press, Princeton, NJ, 2003.

[S2] Stopple, J., A Primer on Analytic Number Theory, Cambridge University Press, Cambridge, UK, 2003.
