arXiv:1501.05438v2 [math.NT] 27 Jan 2015

The ternary Goldbach problem

Harald Andres Helfgott


Contents

Preface
Acknowledgements

1 Introduction
  1.1 History and new developments
  1.2 The circle method: Fourier analysis on $\mathbb{Z}$
  1.3 The major arcs $\mathfrak{M}$
    1.3.1 What do we really know about L-functions and their zeros?
    1.3.2 Estimates of $\hat{f}(\alpha)$ for α in the major arcs
  1.4 The minor arcs $\mathfrak{m}$
    1.4.1 Qualitative goals and main ideas
    1.4.2 Combinatorial identities
    1.4.3 Type I sums
    1.4.4 Type II, or bilinear, sums
  1.5 Integrals over the major and minor arcs
  1.6 Some remarks on computations

2 Notation and preliminaries
  2.1 General notation
  2.2 Dirichlet characters and L functions
  2.3 Fourier transforms and exponential sums
  2.4 Mellin transforms
  2.5 Bounds on sums of μ and Λ
  2.6 Interval arithmetic and the bisection method

I Minor arcs

3 Introduction
  3.1 Results
  3.2 Comparison to earlier work
  3.3 Basic setup
    3.3.1 Vaughan’s identity
    3.3.2 An alternative route

4 Type I sums
  4.1 Trigonometric sums
  4.2 Type I estimates
    4.2.1 Type I: variations

5 Type II sums
  5.1 The sum $S_1$: cancellation
    5.1.1 Reduction to a sum with μ
    5.1.2 Explicit bounds for a sum with μ
    5.1.3 Estimating the triple sum
  5.2 The sum $S_2$: the large sieve, primes and tails

6 Minor-arc totals
  6.1 The smoothing function
  6.2 Contributions of different types
    6.2.1 Type I terms: $S_{I,1}$
    6.2.2 Type I terms: $S_{I,2}$
    6.2.3 Type II terms
  6.3 Adjusting parameters. Calculations.
    6.3.1 First choice of parameters: $q \le y$
    6.3.2 Second choice of parameters
  6.4 Conclusion

II Major arcs

7 Major arcs: overview and results
  7.1 Results
  7.2 Main ideas

8 The Mellin transform of the twisted Gaussian
  8.1 How to choose a smoothing function?
  8.2 The twisted Gaussian: overview and setup
    8.2.1 Relation to the existing literature
    8.2.2 General approach
  8.3 The saddle point
    8.3.1 The coordinates of the saddle point
    8.3.2 The direction of steepest descent
  8.4 The integral over the contour
    8.4.1 A simple contour
    8.4.2 Another simple contour
  8.5 Conclusions

9 Explicit formulas
  9.1 A general explicit formula
  9.2 Sums and decay for the Gaussian
  9.3 The case of $\eta_*(t)$
  9.4 The case of $\eta_+(t)$
  9.5 A sum for $\eta_+(t)^2$
  9.6 A verification of zeros and its consequences

III The integral over the circle

10 The integral over the major arcs
  10.1 Decomposition of $S_\eta$ by characters
  10.2 The integral over the major arcs: the main term
  10.3 The $\ell_2$ norm over the major arcs
  10.4 The integral over the major arcs: conclusion

11 Optimizing and adapting smoothing functions
  11.1 The symmetric smoothing function η
    11.1.1 The product $\eta(t)\eta(\rho - t)$
  11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

12 The $\ell_2$ norm and the large sieve
  12.1 Variations on the large sieve for primes
  12.2 Bounding the quotient in the large sieve for primes

13 The integral over the minor arcs
  13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds
  13.2 The minor-arc total

14 Conclusion
  14.1 The $\ell_2$ norm over the major arcs: explicit version
  14.2 The total major-arc contribution
  14.3 The minor-arc total: explicit version
  14.4 Conclusion: proof of main theorem

IV Appendices

A Norms of smoothing functions
  A.1 The decay of a Mellin transform
  A.2 The difference $\eta_+ - \eta$ in $\ell_2$ norm
  A.3 Norms involving $\eta_+$
  A.4 Norms involving $\eta_+'$
  A.5 The $\ell_\infty$-norm of $\eta_+$

B Norms of Fourier transforms
  B.1 The Fourier transform of $\eta_2''$
  B.2 Bounds involving a logarithmic factor

C Sums involving Λ and φ
  C.1 Sums over primes
  C.2 Sums involving φ

D Checking small n by checking zeros of ζ(s)

Preface

ἐγγὺς δ’ ἦν τέλεος ὃ δὲ τὸ τρίτον ἧκε χ[αμᾶζε

σὺν τῶι δ’ ἐξέφυγεν θάνατον καὶ κῆ[ρα μέλαιναν

Hesiod (?), Ehoiai, fr. 76.21–2 Merkelbach and West

The ternary Goldbach conjecture (or three-prime conjecture) states that every odd number n greater than 5 can be written as the sum of three primes. The purpose of this book is to give the first full proof of this conjecture.

The proof builds on the great advances made in the early 20th century by Hardy and Littlewood (1922) and Vinogradov (1937). Progress since then has been more gradual. In some ways, it was necessary to clear the board and start work using only the main existing ideas towards the problem, together with techniques developed elsewhere.

Part of the aim has been to keep the exposition as accessible as possible, with an emphasis on qualitative improvements and new technical ideas that should be of use elsewhere. The main strategy was to give an analytic approach that is efficient, relatively clean and, as it must be for this problem, explicit; the focus does not lie in optimizing explicit constants or in performing calculations, necessary as these tasks are.

Organization. In the introduction, after a summary of the history of the problem, we will go over a detailed outline of the proof. The rest of the book is divided into three parts, structured so that they can be read independently: the first two parts do not refer to each other, and the third part uses only the main results (clearly marked) of the first two parts.

As is the case in most proofs involving the circle method, the problem is reduced to showing that a certain integral over the “circle” $\mathbb{R}/\mathbb{Z}$ is non-zero. The circle is divided into major arcs and minor arcs. In Part I (in some ways the technical heart of the proof) we will see how to give upper bounds on the integrand when α is in the minor arcs. Part II will provide rather precise estimates for the integrand when the variable α is in the major arcs. Lastly, Part III shows how to use these inputs as well as possible to estimate the integral.

Each part and each chapter starts with a general discussion of the strategy and the main ideas involved. Some of the more technical bounds and computations are relegated to the appendices.


Dependencies between the chapters

[Figure: diagram of the dependencies between chapters, with nodes 1 Introduction; 2 Notation and preliminaries; 3 Minor arcs: introduction; 4 Type I sums; 5 Type II sums; 6 Minor-arc totals; 7 Major arcs: overview; 8 Mellin transform of twisted Gaussian; 9 Explicit formulas; 10 The integral over the major arcs; 11 Smoothing functions and their use; 12 The $\ell_2$ norm and the large sieve; 13 The integral over the minor arcs; 14 Conclusion.]

Acknowledgements

The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary ranges, and also helped him with the usage of interval arithmetic. He is also deeply grateful to O. Ramaré, who, in reply to his requests, prepared and sent for publication several auxiliary results, and who otherwise provided much-needed feedback.

The author is also much indebted to A. Booker, B. Green, R. Heath-Brown, H. Kadiri, D. Platt, T. Tao and M. Watkins for many discussions on Goldbach’s problem and related issues. Several historical questions became clearer due to the help of J. Brandes, K. Gong, R. Heath-Brown, Z. Silagadze, R. Vaughan and T. Wooley. Additional references were graciously provided by R. Bryant, S. Huntsman and I. Rezvyakova. Thanks are also due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.

The introduction is largely based on the author’s article for the Proceedings of the 2014 ICM [Hel14b]. That article, in turn, is based in part on the informal note [Hel13b], which was published in Spanish translation ([Hel13a], translated by M. A. Morales and the author, and revised with the help of J. Cilleruelo and M. Helfgott) and in a French version ([Hel14a], translated by M. Bilu and revised by the author). The proof first appeared as a series of preprints: [Helb], [Hela], [Helc].

Travel and other expenses were funded in part by the Adams Prize and the Philip Leverhulme Prize. The author’s work on the problem started at the Université de Montréal (CRM) in 2006; he is grateful to both the Université de Montréal and the École Normale Supérieure for providing pleasant working environments. During the last stages of the work, travel was partly covered by ANR Project Caesar No. ANR-12-BS01-0011.

The present work would most likely not have been possible without free and publicly available software: SAGE, PARI, Maxima, gnuplot, VNODE-LP, PROFIL/BIAS, and, of course, LaTeX, Emacs, the gcc compiler and GNU/Linux in general. Some exploratory work was done in SAGE and Mathematica. Rigorous calculations used either D. Platt’s interval-arithmetic package (based in part on Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODE-LP.

The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the author’s desktop computers at home and work. However, D. Platt’s computations [Plab] used a significant amount of resources, kindly donated to D. Platt and the author by several institutions. This crucial help was provided by MesoPSL (affiliated with the Observatoire de Paris and Paris Sciences et Lettres), Université de Paris VI/VII (UPMC - DSI - Pôle Calcul), University of Warwick (thanks to Bill Hart), University of Bristol, France Grilles (French National Grid Infrastructure, DIRAC national instance), Université de Lyon 1 and Université de Bordeaux 1. Both D. Platt and the author would like to thank the donating organizations, their technical staff, and all those who helped to make these resources available to them.

Chapter 1

Introduction

The question we will discuss, or one similar to it, seems to have been first posed by Descartes, in a manuscript published only centuries after his death [Des08, p. 298]. Descartes states: “Sed & omnis numerus par fit ex uno vel duobus vel tribus primis.” (“But also every even number is made out of one, two or three prime numbers.”¹) This statement comes in the middle of a discussion of sums of polygonal numbers, such as the squares.

Statements on sums of primes and sums of values of polynomials (polygonal numbers, powers $n^k$, etc.) have since shown themselves to be much more than mere curiosities, and not just because they are often very difficult to prove. Whereas the study of sums of powers can rely on their algebraic structure, the study of sums of primes leads to the realization that, from several perspectives, the set of primes behaves much like the set of integers, or like a random set of integers. (It also leads to the realization that this is very hard to prove.)

If, instead of the primes, we had a random set of odd integers S whose density (an intuitive concept that can be made precise) equaled that of the primes, then we would expect to be able to write every odd number as a sum of three elements of S, and every even number as the sum of two elements of S. We would have to check by hand whether this is true for small odd and even numbers, but it is relatively easy to show that, after a long enough check, it would be very unlikely that there would be any exceptions left among the infinitely many cases left to check.

The question, then, is in what sense we need the primes to be like a random set of integers; in other words, we need to know what we can prove about the regularities of the distribution of the primes. This is one of the main questions of analytic number theory; progress on it has been very slow and difficult.

Fourier analysis expresses information on the distribution of a sequence in terms of frequencies. In the case of the primes, what may be called the main frequencies (those in the major arcs) correspond to the same kind of large-scale distribution that is encoded by L-functions, the family of functions to which the Riemann zeta function belongs. On some of the crucial questions on L-functions, the limits of our knowledge have barely budged in the last century. There is something relatively new now, namely, rigorous numerical data of non-negligible scope; still, such data is, by definition, finite, and, as a consequence, its range of applicability is very narrow. Thus, the real question in the major-arc regime is how to use well the limited information we do have on the large-scale distribution of the primes. As we will see, this requires delicate work on explicit asymptotic analysis and smoothing functions.

¹ Thanks are due to J. Brandes and R. Vaughan for a discussion on a possible ambiguity in the Latin wording. Descartes’ statement is mentioned (with a translation much like the one given here) in Dickson’s History [Dic66, Ch. XVIII].

Outside the main frequencies, that is, in what are called the minor arcs, estimates based on L-functions no longer apply, and what is remarkable is that one can say anything meaningful on the distribution of the primes. Vinogradov was the first to give unconditional, non-trivial bounds, showing that there are no great irregularities in the minor arcs; this is what makes them “minor”. Here the task is to give sharper bounds than Vinogradov. It is in this regime that we can genuinely say that we learn a little more about the distribution of the primes, based on what is essentially an elementary and highly optimized analytic-combinatorial analysis of exponential sums, i.e., Fourier coefficients given by series (supported on the primes, in our case).

The circle method reduces an additive problem (that is, a problem on sums, such as sums of primes, powers, etc.) to the estimation of an integral on the space of frequencies (the “circle” $\mathbb{R}/\mathbb{Z}$). In the case of the primes, as we have just discussed, we have precise estimates on the integrand on part of the circle (the major arcs) and upper bounds on the rest of the circle (the minor arcs). Putting them together efficiently to give an estimate on the integral is a delicate matter; we leave it for the last part, as it is really what is particular to our problem, as opposed to being of immediate general relevance to the study of the primes. As we shall see, estimating the integral well does involve using, and improving, general estimates on the variance of irregularities in the distribution of the primes, as given by the large sieve.

In fact, one of the main general lessons of the proof is that there is a very close relationship between the circle method and the large sieve; we will use the large sieve not just as a tool (which we shall, incidentally, sharpen in certain contexts) but as a source for ideas on how to apply the circle method more effectively.

This has been an attempt at a first look from above. Let us now undertake a more leisurely and detailed overview of the problem and its solution.

1.1 History and new developments

The history of the conjecture starts properly with Euler and his close friend Christian Goldbach, both of whom lived and worked in Russia at the time of their correspondence, about a century after Descartes’ isolated statement. Goldbach, a man of many interests, is usually classed as a serious amateur; he seems to have awakened Euler’s passion for number theory, which would lead to the beginning of the modern era of the subject [Wei84, Ch. 3, §IV]. In a letter dated June 7, 1742, Goldbach made a conjectural statement on prime numbers, and Euler rapidly reduced it to the following conjecture, which, he said, Goldbach had already posed to him: every positive integer can be written as the sum of at most three prime numbers.

We would now say “every integer greater than 1”, since we no longer consider 1 to be a prime number. Moreover, the conjecture is nowadays split into two:

• the weak, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes;

• the strong, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes.

As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number n, then express n − 3 as the sum of two primes).

The strong conjecture remains out of reach. A short while ago (the first complete version appeared on May 13, 2013) the author proved the weak Goldbach conjecture.

Theorem 1.1.1. Every odd integer greater than 5 can be written as the sum of three primes.
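For small n, the statement of the theorem can be checked directly. The following is a minimal brute-force sketch, not the verification method of [HP13]; the function names and the bound LIMIT are illustrative choices that do not come from the text.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def three_prime_sum(n, primes, prime_set):
    """Return primes p1 <= p2 <= p3 with p1 + p2 + p3 == n, or None."""
    for i, p1 in enumerate(primes):
        if 3 * p1 > n:
            break
        for p2 in primes[i:]:
            p3 = n - p1 - p2
            if p3 < p2:
                break
            if p3 in prime_set:
                return (p1, p2, p3)
    return None


LIMIT = 2001  # illustrative bound, chosen only to keep the run short
PRIMES = primes_up_to(LIMIT)
PRIME_SET = set(PRIMES)

# Odd numbers 7 <= n <= LIMIT with no representation found (expected: none).
exceptions = [n for n in range(7, LIMIT + 1, 2)
              if three_prime_sum(n, PRIMES, PRIME_SET) is None]
print("odd n in [7, %d] without a three-prime representation: %s"
      % (LIMIT, exceptions))
```

A search of this kind takes time growing at least linearly in the bound, which is why the size of the constant C discussed next matters so much.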

In 1937, I. M. Vinogradov proved [Vin37] that the conjecture is true for all odd numbers n larger than some constant C. (Hardy and Littlewood had proved the same statement under the assumption of the Generalized Riemann Hypothesis, which we shall have the chance to discuss later.)

It is clear that a computation can verify the conjecture only for $n \le c$, c a constant: computations have to be finite. What can make a result coming from analytic number theory be valid only for $n \ge C$?

An analytic proof, generally speaking, gives us more than just existence. In this kind of problem, it gives us more than the possibility of doing something (here, writing an integer n as the sum of three primes). It gives us a rigorous estimate for the number of ways in which this something is possible; that is, it shows us that this number of ways equals

\[ \text{main term} + \text{error term}, \tag{1.1} \]

where the main term is a precise quantity f(n), and the error term is something whose absolute value is at most another precise quantity g(n). If f(n) > g(n), then (1.1) is non-zero, i.e., we will have shown the existence of a way to write our number as the sum of three primes.

(Since what we truly care about is existence, we are free to weigh different ways of writing n as the sum of three primes however we wish; that is, we can decide that some primes “count” twice or thrice as much as others, and that some do not count at all.)

Typically, after much work, we succeed in obtaining (1.1) with f(n) and g(n) such that f(n) > g(n) asymptotically, that is, for n large enough. To give a highly simplified example: if, say, $f(n) = n^2$ and $g(n) = 100 n^{3/2}$, then f(n) > g(n) for n > C, where $C = 10^4$, and so the number of ways (1.1) is positive for n > C.
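The simplified example can be checked mechanically. The sketch below only illustrates the comparison f(n) > g(n); the helper name is hypothetical, and the comparison is done in exact integer arithmetic by squaring both sides.

```python
def main_term_exceeds_error(n):
    """True when f(n) = n^2 exceeds g(n) = 100 * n^(3/2), for an integer n >= 0.

    Both sides are non-negative, so n^2 > 100 n^(3/2) is equivalent to
    n^4 > 10^4 * n^3, which avoids floating-point square roots.
    """
    return n ** 4 > 10 ** 4 * n ** 3


C = 10 ** 4
# At n = C the two quantities coincide; just above C the main term wins.
print(main_term_exceeds_error(C), main_term_exceeds_error(C + 1))
```

In the toy model, every n up to C would have to be handled by a separate finite check.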

We want a moderate value of C, that is, a C small enough that all cases $n \le C$ can be checked computationally. To ensure this, we must make the error term bound g(n) as small as possible. This is our main task. A secondary (and sometimes neglected) possibility is to rig the weights so as to make the main term f(n) larger in comparison to g(n); this can generally be done only up to a certain point, but is nonetheless very helpful.

As we said, the first unconditional proof that odd numbers $n \ge C$ can be written as the sum of three primes is due to Vinogradov. Analytic bounds fall into several categories, or stages; quite often, successive versions of the same theorem will go through successive stages.

1. An ineffective result shows that a statement is true for some constant C, but gives no way to determine what the constant C might be. Vinogradov’s first proof of his theorem (in [Vin37]) is like this: it shows that there exists a constant C such that every odd number n > C is the sum of three primes, yet gives us no hope of finding out what the constant C might be.² Many proofs of Vinogradov’s result in textbooks are also of this type.

2. An effective, but not explicit, result shows that a statement is true for some unspecified constant C in a way that makes it clear that a constant C could in principle be determined following and reworking the proof with great care. Vinogradov’s later proof ([Vin47], translated in [Vin54]) is of this nature. As Chudakov [Chu47, §IV.2] pointed out, the improvement on [Vin37] given by Mardzhanishvili [Mar41] already had the effect of making the result effective.³

3. An explicit result gives a value of C. According to [Chu47, p. 201], the first explicit version of Vinogradov’s result was given by Borozdkin in his unpublished doctoral dissertation, written under the direction of Vinogradov (1939): $C = \exp(\exp(\exp(41.96)))$. Such a result is, by definition, also effective. Borozdkin later [Bor56] gave the value $C = e^{e^{16.038}}$, though he does not seem to have published the proof. The best (that is, smallest) value of C known before the present work was that of Liu and Wang [LW02]: $C = 2 \cdot 10^{1346}$.

4. What we may call an efficient proof gives a reasonable value for C: in our case, a value small enough that checking all cases up to C is feasible.

How far were we from an efficient proof? That is, what sort of computation could ever be feasible? The situation was paradoxical: the conjecture was known above an explicit C, but $C = 2 \cdot 10^{1346}$ is so large that it could not be said that the problem could be attacked by any foreseeable computational means within our physical universe. (A truly brute-force verification up to C takes at least C steps; a cleverer verification takes well over $\sqrt{C}$ steps. The number of picoseconds since the beginning of the universe is less than $10^{30}$, whereas the number of protons in the observable universe is currently estimated at $\sim 10^{80}$ [Shu92]; this limits the number of steps that can be taken in any currently imaginable computer, even if it were to do parallel processing on an astronomical scale.) Thus, the only way forward was a series of drastic improvements in the mathematical, rather than computational, side.
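The orders of magnitude in this parenthesis can be compared directly. The following back-of-the-envelope sketch works with base-10 logarithms; the budget of one step per proton per picosecond is a deliberately generous assumption made here for illustration, not a claim from the text.

```python
from math import log10

# Base-10 logarithms of the quantities quoted in the text.
LOG10_C_OLD = log10(2) + 1346   # C = 2 * 10^1346 (Liu and Wang)
LOG10_PICOSECONDS = 30          # upper bound on picoseconds since the beginning
LOG10_PROTONS = 80              # estimated protons in the observable universe

# Assumption (illustrative only): every proton performs one step per picosecond.
LOG10_STEP_BUDGET = LOG10_PICOSECONDS + LOG10_PROTONS


def log10_sqrt_steps(log10_c):
    """log10 of sqrt(C); the text says a clever check takes well over sqrt(C) steps."""
    return log10_c / 2


print("log10 sqrt(C), C = 2*10^1346:", round(log10_sqrt_steps(LOG10_C_OLD), 2))
print("log10 of the step budget:    ", LOG10_STEP_BUDGET)
print("log10 sqrt(C), C = 10^27:    ", log10_sqrt_steps(27))
```

Even under this assumption, the square root of the old constant exceeds the budget by hundreds of orders of magnitude, whereas the square root of a constant of size $10^{27}$ lies far below it.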

I gave a proof with $C = 10^{29}$ in May 2013. Since D. Platt and I had verified the conjecture for all odd numbers up to $n \le 8.8 \cdot 10^{30}$ by computer [HP13], this established the conjecture for all odd numbers n.

² Here, as is often the case in ineffective results in analytic number theory, the underlying issue is that of Siegel zeros, which are believed not to exist, but have not been shown not to; the strongest bounds on (i.e., against) such zeros are ineffective, and so are all of the many results using such estimates.

³ The proof in [Mar41] combined the bounds in [Vin37] with a more careful accounting of the effect of the single possible Siegel zero within range.

(In December 2013, I reduced C to $10^{27}$. The verification of the ternary Goldbach conjecture up to $n \le 10^{27}$ can be done on a home computer over a weekend, as of the time of writing (2014). It must be said that this uses the verification of the binary Goldbach conjecture for $n \le 4 \cdot 10^{18}$ [OeSHP14], which itself required computational resources far outside the home-computing range. Checking the conjecture up to $n \le 10^{27}$ was not even the main computational task that needed to be accomplished to establish the Main Theorem; that task was the finite verification of zeros of L-functions in [Plab], a general-purpose computation that should be useful elsewhere.)

What was the strategy of the proof? The basic framework is the one pioneered by Hardy and Littlewood for a variety of problems, namely, the circle method, which, as we shall see, is an application of Fourier analysis over $\mathbb{Z}$. (There are other, later routes to Vinogradov’s result; see [HB85], [FI98] and especially the recent work [Sha14], which avoids using anything about zeros of L-functions inside the critical strip.) Vinogradov’s proof, like much of the later work on the subject, was based on a detailed analysis of exponential sums, i.e., Fourier transforms over $\mathbb{Z}$. So is the proof that we will sketch.

At the same time, the distance between $2 \cdot 10^{1346}$ and $10^{27}$ is such that we cannot hope to get to $10^{27}$ (or any other reasonable constant) by fine-tuning previous work. Rather, we must work from scratch, using the basic outline in Vinogradov’s original proof and other, initially unrelated, developments in analysis and number theory (notably, the large sieve). Merely improving constants will not do; rather, we must do qualitatively better than previous work (by non-constant factors) if we are to have any chance to succeed. It is on these qualitative improvements that we will focus.

It is only fair to review some of the progress made between Vinogradov’s time and ours. Here we will focus on results; later, we will discuss some of the progress made in the techniques of proof. See [Dic66, Ch. XVIII] for the early history of the problem (before Hardy and Littlewood); see R. Vaughan’s ICM lecture notes on the ternary Goldbach problem [Vau80] for some further details on the history up to 1978.

In 1933, Schnirelmann proved [Sch33] that every integer n > 1 can be written as the sum of at most K primes for some unspecified constant K. (This pioneering work is now considered to be part of the early history of additive combinatorics.) In 1969, Klimov gave an explicit value for K (namely, $K = 6 \cdot 10^9$); he later improved the constant to K = 115 (with G. Z. Piltay and T. A. Sheptickaja) and K = 55. Later, there were results by Vaughan [Vau77a] (K = 27), Deshouillers [Des77] (K = 26) and Riesel-Vaughan [RV83] (K = 19).

Ramaré showed in 1995 that every even number n > 1 can be written as the sum of at most 6 primes [Ram95]. In 2012, Tao proved [Tao14] that every odd number n > 1 is the sum of at most 5 primes.

There have been other avenues of attack towards the strong conjecture. Using ideas close to those of Vinogradov’s, Chudakov [Chu37], [Chu38], Estermann [Est37] and van der Corput [van37] proved (independently from each other) that almost every even number (meaning: all elements of a subset of density 1 in the even numbers) can be written as the sum of two primes. In 1973, J.-R. Chen showed [Che73] that every even number n larger than a constant C can be written as the sum of a prime number and the product of at most two primes ($n = p_1 + p_2$ or $n = p_1 + p_2 p_3$). Incidentally, J.-R. Chen himself, together with T.-Z. Wang, was responsible for the best bounds on C (for ternary Goldbach) before Liu and Wang: $C = \exp(\exp(11.503)) < 4 \cdot 10^{43000}$ [CW89] and $C = \exp(\exp(9.715)) < 6 \cdot 10^{7193}$ [CW96].

Matters are different if one assumes the Generalized Riemann Hypothesis (GRH). A careful analysis [Eff99] of Hardy and Littlewood’s work [HL22] gives that every odd number $n \ge 1.24 \cdot 10^{50}$ is the sum of three primes if GRH is true.⁴ According to [Eff99], the same statement with $n \ge 10^{32}$ was proven in the unpublished doctoral dissertation of B. Lucke, a student of E. Landau’s, in 1926. Zinoviev [Zin97] improved this to $n \ge 10^{20}$. A computer check ([DEtRZ97]; see also [Sao98]) showed that the conjecture is true for $n < 10^{20}$, thus completing the proof of the ternary Goldbach conjecture under the assumption of GRH. What was open until now was, of course, the problem of giving an unconditional proof.

1.2 The circle method: Fourier analysis on $\mathbb{Z}$

It is common for a first course on Fourier analysis to focus on functions over the reals satisfying f(x) = f(x + 1), or, what is the same, functions $f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$. Such a function (unless it is fairly pathological) has a Fourier series converging to it; this is just the same as saying that f has a Fourier transform $\hat{f} : \mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(n) = \int_{\mathbb{R}/\mathbb{Z}} f(\alpha) e(-\alpha n)\, d\alpha$ and satisfying $f(\alpha) = \sum_{n \in \mathbb{Z}} \hat{f}(n) e(\alpha n)$ (Fourier inversion theorem), where $e(t) = e^{2\pi i t}$.

In number theory, we are especially interested in functions $f : \mathbb{Z} \to \mathbb{C}$. Then things are exactly the other way around: provided that f decays reasonably fast as $n \to \pm\infty$ (or becomes 0 for n large enough), f has a Fourier transform $\hat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ and satisfying $f(n) = \int_{\mathbb{R}/\mathbb{Z}} \hat{f}(\alpha) e(\alpha n)\, d\alpha$. (Highbrow talk: we already knew that $\mathbb{Z}$ is the Fourier dual of $\mathbb{R}/\mathbb{Z}$, and so, of course, $\mathbb{R}/\mathbb{Z}$ is the Fourier dual of $\mathbb{Z}$.) “Exponential sums” (or “trigonometrical sums”, as in the title of [Vin54]) are sums of the form $\sum_n f(n) e(-\alpha n)$; of course, the “circle” in “circle method” is just a name for $\mathbb{R}/\mathbb{Z}$. (To see an actual circle in the complex plane, look at the image of $\mathbb{R}/\mathbb{Z}$ under the map $\alpha \mapsto e(\alpha)$.)
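For a finitely supported f, the transform and its inversion formula can be checked numerically. The sketch below is an illustration only: the function names are not from the text, and the integral over $\mathbb{R}/\mathbb{Z}$ is replaced by an equally spaced Riemann sum, which is exact (up to rounding) for trigonometric polynomials once the number of sample points exceeds the largest |n − m| that occurs.

```python
import cmath


def e(t):
    """e(t) = exp(2*pi*i*t), as in the text."""
    return cmath.exp(2j * cmath.pi * t)


def fourier_transform(f, alpha):
    """f_hat(alpha) = sum_n f(n) e(-alpha n), for f given as a dict {n: f(n)}."""
    return sum(value * e(-alpha * n) for n, value in f.items())


def inverse_transform(f, n, samples=64):
    """Riemann sum for the integral over R/Z of f_hat(alpha) e(alpha n)."""
    total = sum(fourier_transform(f, k / samples) * e(k * n / samples)
                for k in range(samples))
    return total / samples


# A small test function supported on {-3, 0, 5}; the values are arbitrary.
f = {-3: 2.0, 0: 1.0, 5: -1.5}
for n in (-3, 0, 1, 5):
    print(n, inverse_transform(f, n))
```

The printed values should agree with f(n) on the support and vanish (up to rounding) elsewhere.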

The study of the Fourier transform $\hat{f}$ is relevant to additive problems in number theory, i.e., questions on the number of ways of writing n as a sum of k integers of a particular form. Why? One answer could be that $\hat{f}$ gives us information about the “randomness” of f; if f were the characteristic function of a random set, then $\hat{f}(\alpha)$ would be very small outside a sharp peak at α = 0.

⁴ In fact, Hardy, Littlewood and Effinger use an assumption somewhat weaker than GRH: they assume that Dirichlet L-functions have no zeros satisfying $\Re(s) \ge \theta$, where θ < 3/4 is arbitrary. (We will review Dirichlet L-functions in a minute.)

We can also give a more concrete and immediate answer. Recall that, in general, the Fourier transform of a convolution equals the product of the transforms; over $\mathbb{Z}$, this means that for the additive convolution

\[ (f * g)(n) = \sum_{\substack{m_1, m_2 \in \mathbb{Z} \\ m_1 + m_2 = n}} f(m_1) g(m_2), \]

the Fourier transform satisfies the simple rule

\[ \widehat{f * g}(\alpha) = \hat{f}(\alpha) \cdot \hat{g}(\alpha). \]
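The rule follows directly from the definitions. The short derivation below is a sketch that assumes f and g decay fast enough for the double sum to converge absolutely, so that the order of summation may be changed; it uses only $e(-\alpha(m_1 + m_2)) = e(-\alpha m_1)\, e(-\alpha m_2)$.

```latex
\begin{align*}
\widehat{f * g}(\alpha)
  &= \sum_{n \in \mathbb{Z}} (f * g)(n)\, e(-\alpha n) \\
  &= \sum_{n \in \mathbb{Z}} \sum_{m_1 + m_2 = n} f(m_1)\, g(m_2)\, e(-\alpha (m_1 + m_2)) \\
  &= \Big( \sum_{m_1 \in \mathbb{Z}} f(m_1)\, e(-\alpha m_1) \Big)
     \Big( \sum_{m_2 \in \mathbb{Z}} g(m_2)\, e(-\alpha m_2) \Big)
   = \hat{f}(\alpha) \cdot \hat{g}(\alpha).
\end{align*}
```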

We can see right away from this that $(f * g)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2$ for some $m_1$, $m_2$ such that $f(m_1)$ and $g(m_2)$ are non-zero. Similarly, $(f * g * h)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2 + m_3$ for some $m_1$, $m_2$, $m_3$ such that $f(m_1)$, $g(m_2)$ and $h(m_3)$ are all non-zero. This suggests that, to study the ternary Goldbach problem, we define $f_1, f_2, f_3 : \mathbb{Z} \to \mathbb{C}$ so that they take non-zero values only at the primes.

Hardy and Littlewood defined $f_1(n) = f_2(n) = f_3(n) = 0$ for n non-prime (and also for $n \le 0$) and $f_1(n) = f_2(n) = f_3(n) = (\log n) e^{-n/x}$ for n prime (where x is a parameter to be fixed later). Here the factor $e^{-n/x}$ is there to provide “fast decay”, so that everything converges; as we will see later, Hardy and Littlewood’s choice of $e^{-n/x}$ (rather than some other function of fast decay) comes across in hindsight as being very clever, though not quite best-possible. (Their “choice” was, to some extent, not a choice, but an artifact of their version of the circle method, which was framed in terms of power series, not in terms of exponential sums with arbitrary smoothing functions.) The term log n is there for technical reasons; in essence, it makes sense to put it there because a random integer around n has a chance of about 1/(log n) of being prime.

We can see that $(f_1 * f_2 * f_3)(n) \ne 0$ if and only if n can be written as the sum of three primes. Our task is then to show that $(f_1 * f_2 * f_3)(n)$ (i.e., $(f * f * f)(n)$) is non-zero for every n larger than a constant $C \sim 10^{27}$. Since the transform of a convolution equals a product of transforms,

\[ (f_1 * f_2 * f_3)(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f_1 * f_2 * f_3}(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.2} \]

Our task is thus to show that the integral $\int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha$ is non-zero.
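The weighted count $(f_1 * f_2 * f_3)(n)$ can be computed directly for small n. The sketch below uses the Hardy-Littlewood weights $(\log n) e^{-n/x}$ with the arbitrary illustrative choices x = 50 and LIMIT = 200 (neither value is from the text). Since the weights vanish outside the positive primes, only primes below n contribute to the value at n, so truncating the sequences at LIMIT does not change the values for n ≤ LIMIT.

```python
from math import exp, log


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def hl_weights(limit, x):
    """w[n] = (log n) * exp(-n/x) for n prime, 0 otherwise, for 0 <= n <= limit."""
    return [log(n) * exp(-n / x) if is_prime(n) else 0.0
            for n in range(limit + 1)]


def convolve(u, v, limit):
    """Additive convolution (u * v)(n) for 0 <= n <= limit."""
    out = [0.0] * (limit + 1)
    for i, a in enumerate(u):
        if a == 0.0:
            continue
        for j in range(limit - i + 1):
            out[i + j] += a * v[j]
    return out


LIMIT, X = 200, 50.0  # illustrative parameters
w = hl_weights(LIMIT, X)
triple = convolve(convolve(w, w, LIMIT), w, LIMIT)

# Odd n in [7, LIMIT] for which the weighted count vanishes (expected: none).
print([n for n in range(7, LIMIT + 1, 2) if triple[n] == 0.0])
```

Because all weights are non-negative, the value at n is positive exactly when n is a sum of three primes; a direct convolution of this kind is of course hopeless for n of size $10^{27}$, which is where the integral representation comes in.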

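As it happens, $\hat{f}(\alpha)$ is particularly large when α is close to a rational with small denominator. Moreover, for such α, it turns out we can actually give rather precise estimates for $\hat{f}(\alpha)$. Define $\mathfrak{M}$ (called the set of major arcs) to be a union of narrow arcs around the rationals with small denominator:

\[ \mathfrak{M} = \bigcup_{q \le r} \; \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left( \frac{a}{q} - \frac{1}{qQ},\; \frac{a}{q} + \frac{1}{qQ} \right), \]

where Q is a constant times x/r, and r will be set later. (This is a slight simplification: the major-arc set we will actually use in the course of the proof will be a little different, due to a distinction between odd and even q.) We can write

\[ \int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathfrak{M}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha + \int_{\mathfrak{m}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha, \tag{1.3} \]

where $\mathfrak{m}$ is the complement $(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}$ (called minor arcs).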
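The definition of the major arcs translates directly into a small computation. In the sketch below, r and Q are treated as free parameters with illustrative values (in the text, Q is a constant times x/r), the function names are hypothetical, and exact rational arithmetic is used so that membership tests involve no rounding.

```python
from fractions import Fraction
from math import gcd


def major_arcs(r, Q):
    """Centres a/q and half-widths 1/(qQ) of the arcs, for q <= r and (a, q) = 1."""
    return [(Fraction(a, q), Fraction(1, q * Q))
            for q in range(1, r + 1)
            for a in range(q)
            if gcd(a, q) == 1]


def circle_distance(alpha, beta):
    """Distance between alpha and beta viewed as points of R/Z."""
    d = (Fraction(alpha) - Fraction(beta)) % 1
    return min(d, 1 - d)


def in_major_arcs(alpha, r, Q):
    """True if alpha lies in one of the open arcs making up the major-arc set."""
    return any(circle_distance(alpha, centre) < half_width
               for centre, half_width in major_arcs(r, Q))


r, Q = 5, 100  # illustrative values only
arcs = major_arcs(r, Q)
# The arcs are pairwise disjoint when Q > 2r, so their lengths simply add up.
total_length = sum(2 * half_width for _, half_width in arcs)
print(len(arcs), total_length, float(total_length))
```

With these values the major arcs cover only a small fraction of the circle; every other point, such as α = 1/7, belongs to the minor arcs.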

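Now, we simply do not know how to give precise estimates for $\hat{f}(\alpha)$ when α is in $\mathfrak{m}$. However, as Vinogradov realized, one can give reasonable upper bounds on $|\hat{f}(\alpha)|$ for $\alpha \in \mathfrak{m}$. This suggests the following strategy: show that

\[ \int_{\mathfrak{m}} |\hat{f}_1(\alpha)|\, |\hat{f}_2(\alpha)|\, |\hat{f}_3(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat{f}_1(\alpha) \hat{f}_2(\alpha) \hat{f}_3(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.4} \]

By (1.2) and (1.3), this will imply immediately that $(f_1 * f_2 * f_3)(n) > 0$, and so we will be done.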

The name of circle method is given to the study of additive problems by means ofFourier analysis over Z and in particular to the use of a subdivision of the circle RZinto major and minor arcs to estimate the integral of a Fourier transform There wasa ldquocirclerdquo already in Hardy and Ramanujanrsquos work [HR00] but the subdivision intomajor and minor arcs is due to Hardy and Littlewood who also applied their methodto a wide variety of additive problems (Hence ldquothe Hardy-Littlewood methodrdquo as analternative name for the circle method) For instance before working on the ternaryGoldbach conjecture they studied the question of whether every n gt C can be writtenas the sum of kth powers (Waringrsquos problem) In fact they used a subdivision intomajor and minor arcs to study Waringrsquos problem and not for the ternary Goldbachproblem they had no minor-arc bounds for ternary Goldbach and their use of GRHhad the effect of making every α isin RZ yield to a major-arc treatment

Vinogradov worked with finite exponential sums, i.e., $f_i$ compactly supported. From today's perspective, it is clear that there are applications (such as ours) in which it can be more important for $f_i$ to be smooth than compactly supported; still, Vinogradov's simplifications were an incentive to further developments. In the case of the ternary Goldbach problem, his key contribution consisted in the fact that he could give bounds on $\hat f(\alpha)$ for $\alpha$ in the minor arcs without using GRH.

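An important note: in the case of the binary Goldbach conjecture, the method fails at (1.4), and not before; if our understanding of the actual value of $\hat f_i(\alpha)$ is at all correct, it is simply not true in general that

\[ \int_{\mathfrak{m}} |\hat f_1(\alpha)||\hat f_2(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat f_1(\alpha) \hat f_2(\alpha)\, e(\alpha n)\, d\alpha. \]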

Let us see why this is not surprising. Set $f_1 = f_2 = f_3 = f$ for simplicity, so that we have the integral of the square $(\hat f(\alpha))^2$ for the binary problem, and the integral of the cube $(\hat f(\alpha))^3$ for the ternary problem. Squaring, like cubing, amplifies the peaks of $\hat f(\alpha)$, which are at the rationals of small denominator and their immediate neighborhoods (the major arcs); however, cubing amplifies the peaks much more than squaring. This is why, even though the arcs making up $\mathfrak{M}$ are very narrow, $\int_{\mathfrak{M}} (\hat f(\alpha))^3 e(\alpha n)\, d\alpha$ is larger than $\int_{\mathfrak{m}} |\hat f(\alpha)|^3\, d\alpha$; that explains the name major arcs: they are not large, but they give the major part of the contribution. In contrast, squaring amplifies the peaks less, and this is why the absolute value of $\int_{\mathfrak{M}} \hat f(\alpha)^2 e(\alpha n)\, d\alpha$ is in general smaller than $\int_{\mathfrak{m}} |\hat f(\alpha)|^2\, d\alpha$. As nobody knows how to prove a precise estimate (and, in particular, lower bounds) on $\hat f(\alpha)$ for $\alpha \in \mathfrak{m}$, the binary Goldbach conjecture is still very much out of reach.

To prove the ternary Goldbach conjecture, it is enough to estimate both sides of (1.4) for carefully chosen $f_1$, $f_2$, $f_3$, and compare them. This is our task from now on.

1.3 The major arcs $\mathfrak{M}$

1.3.1 What do we really know about L-functions and their zeros?

Before we start, let us give a very brief review of basic analytic number theory (in the sense of, say, [Dav67]). A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (In other words, $\chi(n) = \chi(n+q)$ for all $n$, $\chi(ab) = \chi(a)\chi(b)$ for all $a$, $b$, and $\chi(n) = 0$ for $(n,q) \ne 1$.) A Dirichlet L-series is defined by

\[ L(s,\chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s} \]

for $\Re(s) > 1$, and by analytic continuation for $\Re(s) \le 1$. (The Riemann zeta function $\zeta(s)$ is the L-function for the trivial character, i.e., the character $\chi$ such that $\chi(n) = 1$ for all $n$.) Taking logarithms and then derivatives, we see that

\[ -\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^{\infty} \chi(n)\Lambda(n) n^{-s} \tag{1.5} \]

for $\Re(s) > 1$, where $\Lambda$ is the von Mangoldt function ($\Lambda(n) = \log p$ if $n$ is some prime power $p^\alpha$, $\alpha \ge 1$, and $\Lambda(n) = 0$ otherwise).
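As a sanity check on (1.5), one can compare truncations of both sides for a concrete character. The sketch below is only an illustration under assumptions of my own choosing: it uses the non-trivial character modulo 4, the point $s = 2$, and a truncation at $N$ terms, so the two sides agree only up to a truncation error of size roughly $1/N$. The function names are hypothetical.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n): n itself is prime

def chi4(n):
    """The non-trivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def log_derivative_sides(s, N):
    """Truncations at n <= N of -L'(s, chi)/L(s, chi) and of sum chi(n) Lambda(n) n^(-s)."""
    L = sum(chi4(n) * n ** (-s) for n in range(1, N + 1))
    L_prime = -sum(chi4(n) * math.log(n) * n ** (-s) for n in range(1, N + 1))
    lhs = -L_prime / L
    rhs = sum(chi4(n) * von_mangoldt(n) * n ** (-s) for n in range(1, N + 1))
    return lhs, rhs

if __name__ == "__main__":
    print(log_derivative_sides(2.0, 5000))
```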

Dirichlet introduced his characters and L-series so as to study primes in arithmetic progressions. In general, and after some work, (1.5) allows us to restate many sums over the primes (such as our Fourier transforms $\hat f(\alpha)$) as sums over the zeros of $L(s,\chi)$.

A non-trivial zero of $L(s,\chi)$ is a zero of $L(s,\chi)$ such that $0 < \Re(s) < 1$. (The other zeros are called trivial because we know where they are, namely, at negative integers and, in some cases, also on the line $\Re(s) = 0$. In order to eliminate all zeros on $\Re(s) = 0$ outside $s = 0$, it suffices to assume that $\chi$ is primitive; a primitive character modulo $q$ is one that is not induced by (i.e., not the restriction of) any character modulo $d|q$, $d < q$.)

The Generalized Riemann Hypothesis for Dirichlet L-functions is the statement that, for every Dirichlet character $\chi$, every non-trivial zero of $L(s,\chi)$ satisfies $\Re(s) = 1/2$. Of course, the Generalized Riemann Hypothesis (GRH), like the Riemann Hypothesis, which is the special case of $\chi$ trivial, remains unproven. Thus, if we want to prove unconditional statements, we need to make do with partial results towards GRH. Two kinds of such results have been proven:


• Zero-free regions. Ever since the late nineteenth century (Hadamard, de la Vallée-Poussin) we have known that there are hourglass-shaped regions (more precisely, of the shape $\frac{c}{\log t} \le \sigma \le 1 - \frac{c}{\log t}$, where $c$ is a constant and where we write $s = \sigma + it$) outside which non-trivial zeros cannot lie. Explicit values for $c$ are known [McC84b], [Kad05], [Kad]. There is also the Vinogradov-Korobov region [Kor58], [Vin58], which is broader asymptotically but narrower in most of the practical range (see [For02], however).

• Finite verifications of GRH. It is possible to (ask a computer to) prove small, finite fragments of GRH, in the sense of verifying that all non-trivial zeros of a given finite set of L-functions with imaginary part less than some constant $H$ lie on the critical line $\Re(s) = 1/2$. Such verifications go back to Riemann, who checked the first few zeros of $\zeta(s)$. Large-scale, rigorous computer-based verifications are now a possibility.

Most work in the literature follows the first alternative, though [Tao14] did use a finite verification of RH (i.e., GRH for the trivial character). Unfortunately, zero-free regions seem too narrow to be useful for the ternary Goldbach problem. Thus, we are left with the second alternative.

In coordination with the present work, Platt [Plab] verified that all zeros $s$ of L-functions for characters $\chi$ with modulus $q \le 300000$ satisfying $\Im(s) \le H_q$ lie on the line $\Re(s) = 1/2$, where

• $H_q = 10^8/q$ for $q$ odd, and

• $H_q = \max(10^8/q,\; 200 + 7.5 \cdot 10^7/q)$ for $q$ even.

This was a medium-large computation, taking a few hundreds of thousands of core-hours on a parallel computer. It used interval arithmetic for the sake of rigor; we will later discuss what this means.

The choice to use a finite verification of GRH, rather than zero-free regions, had consequences on the manner in which the major and minor arcs had to be chosen. As we shall see, such a verification can be used to give very precise bounds on the major arcs, but also forces us to define them so that they are narrow and their number is constant. To be precise: the major arcs were defined around rationals $a/q$ with $q \le r$, $r = 300000$; moreover, as will become clear, the fact that $H_q$ is finite will force their width to be bounded by $c_0 r/(qx)$, where $c_0$ is a constant (say $c_0 = 8$).

1.3.2 Estimates of $\hat f(\alpha)$ for $\alpha$ in the major arcs

Recall that we want to estimate sums of the type $\hat f(\alpha) = \sum f(n) e(-\alpha n)$, where $f(n)$ is something like $(\log n)\eta(n/x)$ for $n$ equal to a prime, and $0$ otherwise; here $\eta : \mathbb{R} \to \mathbb{C}$ is some function of fast decay, such as Hardy and Littlewood's choice,

\[ \eta(t) = \begin{cases} e^{-t} & \text{for } t \ge 0, \\ 0 & \text{for } t < 0. \end{cases} \]


Let us modify this just a little: we will actually estimate

\[ S_\eta(\alpha, x) = \sum \Lambda(n) e(\alpha n) \eta(n/x), \tag{1.6} \]

where $\Lambda$ is the von Mangoldt function (as in (1.5)). The use of $\alpha$ rather than $-\alpha$ is just a bow to tradition, as is the use of the letter $S$ (for "sum"); however, the use of $\Lambda(n)$ rather than just plain $\log p$ does actually simplify matters.

The function $\eta$ here is sometimes called a smoothing function or simply a smoothing. It will indeed be helpful for it to be smooth on $(0,\infty)$, but, in principle, it need not even be continuous. (Vinogradov's work implicitly uses, in effect, the "brutal truncation" $1_{[0,1]}(t)$, defined to be $1$ when $t \in [0,1]$ and $0$ otherwise; that would be fine for the minor arcs, but, as it will become clear, it is a bad idea as far as the major arcs are concerned.)

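Assume $\alpha$ is on a major arc, meaning that we can write $\alpha = a/q + \delta/x$ for some $a/q$ ($q$ small) and some $\delta$ (with $|\delta|$ small). We can write $S_\eta(\alpha, x)$ as a linear combination

\[ S_\eta(\alpha, x) = \sum_\chi c_\chi S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) + \text{tiny error term}, \tag{1.7} \]

where

\[ S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = \sum \Lambda(n)\chi(n) e(\delta n/x) \eta(n/x). \tag{1.8} \]

In (1.7), $\chi$ runs over primitive Dirichlet characters of moduli $d|q$, and $c_\chi$ is small ($|c_\chi| \le \sqrt{d}/\phi(q)$).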

Why are we expressing the sums $S_\eta(\alpha, x)$ in terms of the sums $S_{\eta,\chi}(\delta/x, x)$, which look more complicated? The argument has become $\delta/x$, whereas before it was $\alpha$. Here $\delta$ is relatively small: smaller than the constant $c_0 r$, in our setup. In other words, $e(\delta n/x)$ will go around the circle a bounded number of times as $n$ goes from $1$ up to a constant times $x$ (by which time $\eta(n/x)$ has become small, because $\eta$ is of fast decay). This makes the sums much easier to estimate.

To estimate the sums $S_{\eta,\chi}$, we will use L-functions, together with one of the most common tools of analytic number theory, the Mellin transform. This transform is essentially a Laplace transform with a change of variables, and a Laplace transform, in turn, is a Fourier transform taken on a vertical line in the complex plane. For $f$ of fast enough decay, the Mellin transform $F = Mf$ of $f$ is given by

\[ F(s) = \int_0^\infty f(t) t^s \frac{dt}{t}; \]

we can express $f$ in terms of $F$ by the Mellin inversion formula

\[ f(t) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} F(s) t^{-s}\, ds \]

for any $\sigma$ within an interval. We can thus express $e(\delta t)\eta(t)$ in terms of its Mellin transform $F_\delta$ and then use (1.5) to express $S_{\eta,\chi}$ in terms of $F_\delta$ and $L'(s,\chi)/L(s,\chi)$; shifting the integral in the Mellin inversion formula to the left, we obtain what is known in analytic number theory as an explicit formula:

\[ S_{\eta,\chi}(\delta/x, x) = [\hat\eta(-\delta) x] - \sum_\rho F_\delta(\rho) x^\rho + \text{tiny error term}. \]

Here the term between brackets appears only for $\chi$ trivial. In the sum, $\rho$ goes over all non-trivial zeros of $L(s,\chi)$, and $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$. (The tiny error term comes from a sum over the trivial zeros of $L(s,\chi)$.) We will obtain the estimate we desire if we manage to show that the sum over $\rho$ is small.

The point is this: if we verify GRH for $L(s,\chi)$ up to imaginary part $H$, i.e., if we check that all zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le H$ satisfy $\Re(\rho) = 1/2$, then we have $|x^\rho| = \sqrt{x}$ for those zeros. In other words, $x^\rho$ is very small (compared to $x$). However, for any $\rho$ whose imaginary part has absolute value greater than $H$, we know next to nothing about its real part, other than $0 \le \Re(\rho) \le 1$. (Zero-free regions are notoriously weak for $\Im(\rho)$ large; we will not use them.) Hence, our only chance is to make sure that $F_\delta(\rho)$ is very small when $|\Im(\rho)| \ge H$.

This has to be true for both $\delta$ very small (including the case $\delta = 0$) and for $\delta$ not so small ($|\delta|$ up to $c_0 r/q$, which can be large because $r$ is a large constant). How can we choose $\eta$ so that $F_\delta(\rho)$ is very small in both cases for $\tau = \Im(\rho)$ large?

The method of stationary phase is useful as an exploratory tool here. In brief, it suggests (and can sometimes prove) that the main contribution to the integral

\[ F_\delta(s) = \int_0^\infty e(\delta t) \eta(t) t^s \frac{dt}{t} \tag{1.9} \]

can be found where the phase of the integrand has derivative $0$. This happens when $t = -\tau/(2\pi\delta)$ (for $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$); the contribution is then a moderate factor times $\eta(-\tau/(2\pi\delta))$. In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small ($|\delta| \ge 8$, say), $F_\delta(\sigma + i\tau)$ behaves like $\eta(-\tau/(2\pi\delta))$; if $\delta$ is small ($|\delta| < 8$), then $F_\delta$ behaves like $F_0$, which is the Mellin transform $M\eta$ of $\eta$. Here is our goal, then: the decay of $\eta(t)$ as $|t| \to \infty$ should be as fast as possible, and the decay of the transform $M\eta(\sigma + i\tau)$ should also be as fast as possible.

This is a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function $\eta$ that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.

What does "extremely rapidly" mean here? It means (as Hardy himself proved) "faster than any exponential $e^{-Ct}$". Thus, Hardy and Littlewood's choice $\eta(t) = e^{-t}$ seems essentially optimal at first sight.

However, it is not optimal. We can choose $\eta$ so that $M\eta$ decreases exponentially (with a constant $C$ somewhat worse than for $\eta(t) = e^{-t}$), but $\eta$ decreases faster than exponentially. This is a particularly appealing possibility because it is $t/|\delta|$, and not so much $t$, that risks being fairly small. (To be explicit: say we check GRH for characters of modulus $q$ up to $H_q \sim 50 \cdot c_0 r/q \ge 50|\delta|$. Then we only know that $|\tau/(2\pi\delta)| \gtrsim 8$. So, for $\eta(t) = e^{-t}$, $\eta(-\tau/(2\pi\delta))$ may be as large as $e^{-8}$, which is not negligible. Indeed, since this term will be multiplied later by other terms, $e^{-8}$ is simply not small enough. On the other hand, we can assume that $H_q \ge 200$ (say), and so $M\eta(s) \sim e^{-(\pi/2)|\tau|}$ is completely negligible, and will remain negligible even if we replace $\pi/2$ by a somewhat smaller constant.)
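The exponential decay $e^{-(\pi/2)|\tau|}$ of the Mellin transform of $e^{-t}$ can be seen numerically. The sketch below is an illustration only: it evaluates the Mellin integral by the trapezoid rule after the substitution $t = e^u$ (the truncation limits `lo`, `hi` and the step `h` are ad hoc choices of mine), and compares the result with the classical identity $|\Gamma(1/2 + i\tau)|^2 = \pi/\cosh(\pi\tau)$. The function name `mellin` is hypothetical.

```python
import cmath
import math

def mellin(f, s, lo=-40.0, hi=6.0, h=0.01):
    """Approximate F(s) = int_0^inf f(t) t^s dt/t by the trapezoid rule in u = log t."""
    steps = int(round((hi - lo) / h))
    total = 0j
    for j in range(steps + 1):
        u = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        # With t = e^u we have dt/t = du and t^s = exp(s u).
        total += weight * f(math.exp(u)) * cmath.exp(s * u)
    return total * h

if __name__ == "__main__":
    for tau in (1.0, 2.0, 3.0, 4.0):
        F = mellin(lambda t: math.exp(-t), 0.5 + 1j * tau)
        # The rescaled value should stabilize, showing decay like exp(-(pi/2)|tau|).
        print(tau, abs(F), abs(F) * math.exp(math.pi * tau / 2))
```

The same routine can be applied to the Gaussian $e^{-t^2/2}$, whose Mellin transform at real $s$ is $2^{s/2-1}\Gamma(s/2)$.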

We shall take $\eta(t) = e^{-t^2/2}$ (that is, the Gaussian). This is not the only possible choice, but it is in some sense natural. It is easy to show that the Mellin transform $F_\delta$ for $\eta(t) = e^{-t^2/2}$ is a multiple of what is called a parabolic cylinder function $U(a,z)$ with imaginary values for $z$. There are plenty of estimates on parabolic cylinder functions in the literature, but mostly for $a$ and $z$ real, in part because that is one of the cases occurring most often in applications. There are some asymptotic expansions and estimates for $U(a,z)$, $a$, $z$ general, due to Olver [Olv58], [Olv59], [Olv61], [Olv65], but unfortunately they come without fully explicit error terms for $a$ and $z$ within our range of interest. (The same holds for [TV03].)

In the end, I derived bounds for $F_\delta$ using the saddle-point method. (The method of stationary phase, which we used to choose $\eta$, seems to lead to error terms that are too large.) The saddle-point method consists, in brief, in changing the contour of an integral to be bounded (in this case, (1.9)) so as to minimize the maximum of the integrand. (To use a metaphor in [dB81]: find the lowest mountain pass.)

Here we strive to get clean bounds, rather than the best possible constants. Consider the case $k = 0$ of Corollary 8.0.2; it states the following. For $s = \sigma + i\tau$ with $\sigma \in [0,1]$ and $|\tau| \ge \max(100, 4\pi^2|\delta|)$, we obtain that the Mellin transform $F_\delta$ of $\eta(t)e(\delta t)$, with $\eta(t) = e^{-t^2/2}$, satisfies

\[ |F_\delta(s+k)| + |F_\delta((1-s)+k)| \le \begin{cases} 3.001\, e^{-0.1065 \left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\ 3.286\, e^{-0.1598|\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2, \end{cases} \tag{1.10} \]

where $\ell = -2\pi\delta$.

Similar bounds hold for $\sigma$ in other ranges, thus giving us estimates on the Mellin transform $F_\delta$ for $\eta(t) = t^k e^{-t^2/2}$ and $\sigma$ in the critical range $[0,1]$. (We could do a little better if we knew the value of $\sigma$, but, in our applications, we do not, once we leave the range in which GRH has been checked. We will give a bound (Theorem 8.0.1) that does take $\sigma$ into account, and also reflects and takes advantage of the fact that there is a transitional region around $|\tau| \sim (3/2)(\pi\delta)^2$; in practice, however, we will use Cor. 8.0.2.)

A moment's thought shows that we can also use (1.10) to deal with the Mellin transform of $\eta(t)e(\delta t)$ for any function of the form $\eta(t) = e^{-t^2/2} g(t)$ (or, more generally, $\eta(t) = t^k e^{-t^2/2} g(t)$), where $g(t)$ is any band-limited function. By a band-limited function, we could mean a function whose Fourier transform is compactly supported; while that is a plausible choice, it turns out to be better to work with functions that are band-limited with respect to the Mellin transform, in the sense of being of the form

\[ g(t) = \int_{-R}^{R} h(r) t^{-ir}\, dr, \]

where $h : \mathbb{R} \to \mathbb{C}$ is supported on a compact interval $[-R,R]$, with $R$ not too large (say $R = 200$). What happens is that the Mellin transform of the product $e^{-t^2/2} g(t) e(\delta t)$ is a convolution of the Mellin transform $F_\delta(s)$ of $e^{-t^2/2} e(\delta t)$ (estimated in (1.10)) and that of $g(t)$ (supported in $[-R,R]$); the effect of the convolution is just to delay decay of $F_\delta(s)$ by, at most, a shift by $y \mapsto y - R$.

We wish to estimate $S_{\eta,\chi}(\delta/x, x)$ for several functions $\eta$. This motivates us to derive an explicit formula (§) general enough to work with all the weights $\eta(t)$ we will work with, while being also completely explicit, and free of any integrals that may be tedious to evaluate.

Once that is done, and once we consider the input provided by Platt's finite verification of GRH up to $H_q$, we obtain simple bounds for different weights.

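For $\eta(t) = e^{-t^2/2}$, $x \ge 10^8$, $\chi$ a primitive character of modulus $q \le r = 300000$, and any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$, we obtain

\[ S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = I_{q=1} \cdot \hat\eta(-\delta) x + E \cdot x, \tag{1.11} \]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

\[ |E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left( \frac{650400}{\sqrt{q}} + 112 \right). \tag{1.12} \]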

Here $\hat\eta$ stands for the Fourier transform from $\mathbb{R}$ to $\mathbb{R}$ normalized as follows: $\hat\eta(t) = \int_{-\infty}^{\infty} e(-xt)\eta(x)\, dx$. Thus, $\hat\eta(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ (self-duality of the Gaussian).
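The closed form $\hat\eta(-\delta) = \sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ for the Gaussian $\eta(t) = e^{-t^2/2}$, under the normalization just stated, can be confirmed by direct numerical integration. The following is a minimal sketch; the truncation window and step size are ad hoc choices, and the function names are hypothetical.

```python
import cmath
import math

def gaussian(x):
    """The weight eta(x) = exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)

def fourier_transform(eta, t, half_width=12.0, h=0.01):
    """Trapezoid approximation of int e(-x t) eta(x) dx over [-half_width, half_width]."""
    steps = int(round(2 * half_width / h))
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * eta(x) * cmath.exp(-2j * math.pi * x * t)
    return total * h

if __name__ == "__main__":
    delta = 0.3
    numeric = fourier_transform(gaussian, -delta)
    closed_form = math.sqrt(2 * math.pi) * math.exp(-2 * math.pi ** 2 * delta ** 2)
    print(numeric.real, closed_form)
```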

This is one of the main results of Part II; see §7.1. Similar bounds are also proven there for $\eta(t) = t^2 e^{-t^2/2}$, as well as for a weight of type $\eta(t) = t e^{-t^2/2} g(t)$, where $g(t)$ is a band-limited function, and also for a weight $\eta$ defined by a multiplicative convolution. The conditions on $q$ (namely, $q \le r = 300000$) and $\delta$ are what we expected from the outset.

Thus concludes our treatment of the major arcs. This is arguably the easiest part of the proof; it was actually what I left for the end, as I was fairly confident it would work out. Minor-arc estimates are more delicate; let us now examine them.

1.4 The minor arcs $\mathfrak{m}$

1.4.1 Qualitative goals and main ideas

What kind of bounds do we need? What is there in the literature?

We wish to obtain upper bounds on $|S_\eta(\alpha, x)|$ for some weight $\eta$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ not very close to a rational with small denominator. Every $\alpha$ is close to some rational $a/q$; what we are looking for is a bound on $|S_\eta(\alpha, x)|$ that decreases rapidly when $q$ increases.

Moreover, we want our bound to decrease rapidly when $\delta$ increases, where $\alpha = a/q + \delta/x$. In fact, the main terms in our bound will be decreasing functions of $\max(1, |\delta|/8) \cdot q$. (Let us write $\delta_0 = \max(2, |\delta|/4)$ from now on.) This will allow our bound to be good enough outside narrow major arcs, which will get narrower and narrower as $q$ increases; that is precisely the kind of major arcs we were presupposing in our major-arc bounds.


It would be possible to work with narrow major arcs that become narrower as $q$ increases simply by allowing $q$ to be very large (close to $x$), and assigning each angle to the fraction closest to it. This is, in fact, the common procedure. However, this makes matters more difficult, in that we would have to minimize at the same time the factors in front of terms $x/q$, $x/\sqrt{q}$, etc., and those in front of terms $q$, $\sqrt{qx}$, and so on. (These terms are being compared to the trivial bound $x$.) Instead, we choose to strive for a direct dependence on $\delta$ throughout; this will allow us to cap $q$ at a much lower level, thus making terms such as $q$ and $\sqrt{qx}$ negligible. (This choice has been taken elsewhere in applications of the circle method, but, strangely, seems absent from previous work on the ternary Goldbach conjecture.)

How good must our bounds be? Since the major-arc bounds are valid only for $q \le r = 300000$ and $|\delta| \le 4r/q$, we cannot afford even a single factor of $\log x$ (or any other function tending to $\infty$ as $x \to \infty$) in front of terms such as $x/\sqrt{q|\delta_0|}$: a factor like that would make the term larger than the trivial bound $x$ if $q|\delta_0|$ is equal to a constant ($r$, say) and $x$ is very large. Apparently, there was no such "log-free bound" with explicit constants in the literature, even though such bounds were considered to be in principle feasible, and even though previous work ([Che85], [Dab96], [DR01], [Tao14]) had gradually decreased the number of factors of $\log x$. (In limited ranges for $q$, there were log-free bounds without explicit constants; see [Dab96], [Ram10]. The estimate in [Vin54, Thm. 2a, 2b] was almost log-free, but not quite. There were also bounds [Kar93], [But11] that used L-functions, and thus were not really useful in a truly minor-arc regime.)

It also seemed clear that a main bound proportional to $(\log q)^2 x/\sqrt{q}$ (as in [Tao14]) was too large. At the same time, it was not really necessary to reach a bound of the best possible form that could be found through Vinogradov's basic approach, namely

\[ |S_\eta(\alpha, x)| \le C \frac{x\sqrt{q}}{\phi(q)}. \tag{1.13} \]

Such a bound had been proven by Ramaré [Ram10] for $q$ in a limited range and $C$ non-explicit; later, in [Ramc], which postdates the first version of [Helb], Ramaré broadened the range to $q \le x^{1/48}$ and gave an explicit value for $C$, namely, $C = 13000$. Such a bound is a notable achievement, but, unfortunately, it is not useful for our purposes. Rather, we will aim at a bound whose main term is bounded by a constant around $1$ times $x (\log \delta_0 q)/\sqrt{\delta_0 \phi(q)}$; this is slightly worse asymptotically than (1.13), but it is much better in the delicate range of $\delta_0 q \sim 300000$, and in fact for a much wider range as well.

We see that we have several tasks. One of them is the removal of logarithms: we cannot afford a single factor of $\log x$, and, in practice, we can afford at most one factor of $\log q$. Removing logarithms will be possible in part because of the use of previously existing efficient techniques (the large sieve for sequences with prime support) but also because we will be able to find cancellation at several places in sums coming from a combinatorial identity (namely, Vaughan's identity). The task of finding cancellation is particularly delicate because we cannot afford large constants or, for that matter, statements valid only for large $x$. (Bounding a sum such as $\sum_n \mu(n)$ efficiently, where $\mu$ is the Möbius function

\[ \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct}, \\ 0 & \text{if } p^2 | n \text{ for some prime } p, \end{cases} \]

is harder than estimating a sum such as $\sum_n \Lambda(n)$ equally efficiently, even though we are used to thinking of the two problems as equivalent.)

We have said that our bounds will improve as $|\delta|$ increases. This dependence on $\delta$ will be secured in different ways at different places. Sometimes $\delta$ will appear as an argument, as in $\hat\eta(-\delta)$; for $\eta$ piecewise continuous with $\eta' \in L_1$, we know that $|\hat\eta(t)| \to 0$ as $|t| \to \infty$. Sometimes we will obtain a dependence on $\delta$ by using several different rational approximations to the same $\alpha \in \mathbb{R}$. Lastly, we will obtain a good dependence on $\delta$ in bilinear sums by supplying a scattered input to a large sieve.

If there is a main moral to the argument, it lies in the close relation between the circle method and the large sieve. The circle method rests on the estimation of an integral involving a Fourier transform $\hat f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$; as we will later see, this leads naturally to estimating the $\ell_2$-norm of $\hat f$ on subsets (namely, unions of arcs) of the circle $\mathbb{R}/\mathbb{Z}$. The large sieve can be seen as an approximate discrete version of Plancherel's identity, which states that $|\hat f|_2 = |f|_2$.

Both in this section and in §1.5, we shall use the large sieve in part so as to use the fact that some of the functions we work with have prime support, i.e., are non-zero only on prime numbers. There are ways to use prime support to improve the output of the large sieve. In §1.5, these techniques will be refined and then translated to the context of the circle method, where $f$ has (essentially) prime support and $|\hat f|^2$ must be integrated over unions of arcs. (This allows us to remove a logarithm.) The main point is that the large sieve is not being used as a black box; rather, we can adapt ideas from (say) the large-sieve context and apply them to the circle method.

Lastly, there are the benefits of a continuous $\eta$. Hardy and Littlewood already used a continuous $\eta$; this was abandoned by Vinogradov, presumably for the sake of simplicity. The idea that smooth weights $\eta$ can be superior to sharp truncations is now commonplace. As we shall see, using a continuous $\eta$ is helpful in the minor-arcs regime, but not as crucial there as for the major arcs. We will not use a smooth $\eta$; we will prove our estimates for any continuous $\eta$ that is piecewise $C^1$, and then, towards the end, we will choose to use the same weight $\eta = \eta_2$ as in [Tao14], in part because it has compact support, and in part for the sake of comparison. The moral here is not quite the common dictum "always smooth", but rather that different kinds of smoothing can be appropriate for different tasks; in the end, we will show how to coordinate different smoothing functions $\eta$.

There are other ideas involved; for instance, some of Vinogradov's lemmas are improved. Let us now go into some of the details.

1.4.2 Combinatorial identities

Generally, since Vinogradov, a treatment of the minor arcs starts with a combinatorial identity expressing $\Lambda(n)$ (or the characteristic function of the primes) as a sum of two or more convolutions. (In this section, by a convolution $f * g$, we will mean the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$, i.e., the multiplicative convolution on the semigroup of positive integers.)

In some sense, the archetypical identity is

\[ \Lambda = \mu * \log, \]

but it will not usually do: the contribution of $\mu(d)\log(n/d)$ with $d$ close to $n$ is too difficult to estimate precisely. There are alternatives: for example, there is the identity

\[ \Lambda(n) \log n = \mu * \log^2 - \Lambda * \Lambda, \tag{1.14} \]

which underlies an estimate of Selberg's that, in turn, is the basis for the Erdős-Selberg proof of the prime number theorem; see, e.g., [MV07, §8.2]. More generally, one can decompose $\Lambda(n)(\log n)^k$ as $\mu * \log^{k+1}$ minus a linear combination of convolutions; this kind of decomposition, really just a direct consequence of the development of $(\zeta'(s)/\zeta(s))^{(k)}$, will be familiar to some from the exposition of Bombieri's work [Bom76] in [FI10, §3] (for instance). Another useful identity was that used by Daboussi [Dab96]; witness its application in [DR01], which gives explicit estimates on exponential sums over primes.

The proof of Vinogradov's three-prime result was simplified substantially [Vau77b] by the introduction of Vaughan's identity:

\[ \Lambda(n) = \mu_{\le U} * \log - \Lambda_{\le V} * \mu_{\le U} * 1 + 1 * \mu_{>U} * \Lambda_{>V} + \Lambda_{\le V}, \tag{1.15} \]

where we are using the notation

\[ f_{\le W}(n) = \begin{cases} f(n) & \text{if } n \le W, \\ 0 & \text{if } n > W, \end{cases} \qquad f_{>W}(n) = \begin{cases} 0 & \text{if } n \le W, \\ f(n) & \text{if } n > W. \end{cases} \]

Of the resulting sums ($\sum_n (\mu_{\le U} * \log)(n) e(\alpha n) \eta(n/x)$, etc.), the first three are said to be of type I, type I (again) and type II; the last sum, $\sum_{n \le V} \Lambda(n)$, is negligible.
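Vaughan's identity (1.15) holds exactly for every $n \ge 1$ and every choice of $U$, $V$, which makes it easy to test numerically. The sketch below is an illustration with arbitrary small parameters ($U = 5$, $V = 7$ in the usage example); the helper names are hypothetical, and the arithmetic functions are computed by naive trial division.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)

def dirichlet(f, g, N):
    """Dirichlet convolution (f * g)(n) for 1 <= n <= N; lists are indexed 0..N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] != 0:
            for m in range(1, N // d + 1):
                h[d * m] += f[d] * g[m]
    return h

def vaughan_rhs(N, U, V):
    """Right side of (1.15) for n <= N, together with Lambda itself."""
    idx = range(N + 1)
    mu = [0] + [mobius(n) for n in range(1, N + 1)]
    lam = [0.0] + [von_mangoldt(n) for n in range(1, N + 1)]
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    one = [0] + [1] * N
    mu_le = [mu[n] if n <= U else 0 for n in idx]
    mu_gt = [mu[n] if n > U else 0 for n in idx]
    lam_le = [lam[n] if n <= V else 0.0 for n in idx]
    lam_gt = [lam[n] if n > V else 0.0 for n in idx]
    t1 = dirichlet(mu_le, log, N)                          # type I
    t2 = dirichlet(lam_le, dirichlet(mu_le, one, N), N)    # type I (again)
    t3 = dirichlet(one, dirichlet(mu_gt, lam_gt, N), N)    # type II
    return [t1[n] - t2[n] + t3[n] + lam_le[n] for n in idx], lam

if __name__ == "__main__":
    rhs, lam = vaughan_rhs(300, 5, 7)
    print(max(abs(rhs[n] - lam[n]) for n in range(1, 301)))
```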

One of the advantages of Vaughan's identity is its flexibility: we can set $U$ and $V$ to whatever values we wish. Its main disadvantage is that it is not "log-free", in that it seems to impose the loss of two factors of $\log x$: if we sum each side of (1.15) from $1$ to $x$, we obtain $\sum_{n \le x} \Lambda(n) \sim x$ on the left side, whereas, if we bound the sum on the right side without the use of cancellation, we obtain a bound of $x(\log x)^2$. Of course, we will obtain some cancellation from the phase $e(\alpha n)$; still, even if this gives us a factor of, say, $1/\sqrt{q}$, we will get a bound of $x(\log x)^2/\sqrt{q}$, which is worse than the trivial bound $x$ for $q$ bounded and $x$ large. Since we want a bound that is useful for all $q$ larger than the constant $r$ and all $x$ larger than a constant, this will not do.

As was pointed out in [Tao14], it is possible to get a factor of $(\log q)^2$ instead of a factor of $(\log x)^2$ in the type II sums by setting $U$ and $V$ appropriately. Unfortunately, a factor of $(\log q)^2$ is still too large in practice, and there is also the issue of factors of $\log x$ in type I sums.

Vinogradov had already managed to get an essentially log-free result (by a rather difficult procedure) in [Vin54, Ch. IX]. The result in [Dab96] is log-free. Unfortunately, the explicit result in [DR01], the study of which encouraged me at the beginning of the project, is not. For a while, I worked with the case $k = 2$ of the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, which gives

\[ \Lambda \cdot \log^2 = \mu * \log^3 - 3 \cdot (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda. \tag{1.16} \]

This identity is essentially log-free: while a trivial bound on the sum of the right side for $n$ from $1$ to $N$ does seem to have two extra factors of $\log$, they are present only in the term $\mu * \log^3$, which is not the hardest one to estimate. Ramaré obtained a log-free bound in [Ram10] using an identity introduced by Diamond and Steinig in the course of their own work on elementary proofs of the prime number theorem [DS70]; that identity gives a decomposition for $\Lambda \cdot \log^k$ that can also be derived from the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, by a clever grouping of terms.
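The identities (1.14) and (1.16) are pointwise identities between arithmetic functions, so they can be verified term by term for small $n$. The following sketch does so by brute force; the helper names are hypothetical, and the only claim is that both identities hold up to floating-point rounding in the tested range.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)

def dirichlet(f, g, N):
    """Dirichlet convolution (f * g)(n) for 1 <= n <= N; lists are indexed 0..N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] != 0:
            for m in range(1, N // d + 1):
                h[d * m] += f[d] * g[m]
    return h

def identity_errors(N):
    """Largest deviation from (1.14) and from (1.16) over 1 <= n <= N."""
    mu = [0] + [mobius(n) for n in range(1, N + 1)]
    lam = [0.0] + [von_mangoldt(n) for n in range(1, N + 1)]
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    lam_log = [lam[n] * log[n] for n in range(N + 1)]
    mu_log2 = dirichlet(mu, [v ** 2 for v in log], N)
    mu_log3 = dirichlet(mu, [v ** 3 for v in log], N)
    lam_lam = dirichlet(lam, lam, N)
    lamlog_lam = dirichlet(lam_log, lam, N)
    lam_lam_lam = dirichlet(lam, lam_lam, N)
    err14 = max(abs(lam[n] * log[n] - (mu_log2[n] - lam_lam[n]))
                for n in range(1, N + 1))
    err16 = max(abs(lam[n] * log[n] ** 2
                    - (mu_log3[n] - 3 * lamlog_lam[n] - lam_lam_lam[n]))
                for n in range(1, N + 1))
    return err14, err16

if __name__ == "__main__":
    print(identity_errors(200))
```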

In the end, I decided to use Vaughan's identity, motivated in part by [Tao14], and in part by the lack of free parameters in (1.16); as can be seen in (1.15), Vaughan's identity has two parameters $U$, $V$ that we can set to whatever values we think best. The form of the identity allowed me to reuse much of my work up to that point, but it also posed a challenge: since Vaughan's identity is by no means log-free, one has to obtain cancellation in Vaughan's identity at every possible step, beyond the cancellation given by the phase $e(\alpha n)$. (The presence of a phase, in fact, makes the task of getting cancellation from the identity more complicated.) The removal of logarithms will be one of our main tasks in what follows. It is clear that the presence of the Möbius function $\mu$ should give, in principle, some cancellation; we will show how to use it to obtain as much cancellation as we need, with good constants, and not just asymptotically.

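1.4.3 Type I sums

There are two type I sums, namely,

\[ \sum_{m \le U} \mu(m) \sum_n (\log n)\, e(\alpha m n)\, \eta\left(\frac{mn}{x}\right) \tag{1.17} \]

and

\[ \sum_{v \le V} \Lambda(v) \sum_{u \le U} \mu(u) \sum_n e(\alpha v u n)\, \eta\left(\frac{vun}{x}\right). \tag{1.18} \]

In either case, $\alpha = a/q + \delta/x$, where $q$ is larger than a constant $r$ and $|\delta/x| \le 1/qQ_0$ for some $Q_0 > \max(q, \sqrt{x})$. For the purposes of this exposition, we will set it as our task to estimate the slightly simpler sum

\[ \sum_{m \le D} \mu(m) \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right), \tag{1.19} \]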

where $D$ can be $U$ or $UV$ or something else less than $x$.

Why can we consider this simpler sum without omitting anything essential? It is clear that (1.17) is of the same kind as (1.19). The inner double sum in (1.18) is just (1.19) with $\alpha v$ instead of $\alpha$; this enables us to estimate (1.18) by means of (1.19) for $q$ small, i.e., the more delicate case. If $q$ is not small, then the approximation $\alpha v \sim av/q$ may not be accurate enough. In that case, we collapse the two outer sums in (1.18) into a sum $\sum_n (\Lambda_{\le V} * \mu_{\le U})(n)$, and treat all of (1.18) much as we will treat (1.19); since $q$ is not small, we can afford to bound $(\Lambda_{\le V} * \mu_{\le U})(n)$ trivially (by $\log n$) in the less sensitive terms.

Let us first outline Vinogradov's procedure for bounding type I sums. Just by summing a geometric series, we get

\[ \left| \sum_{n \le N} e(\alpha n) \right| \le \min\left( N, \frac{c}{\|\alpha\|} \right), \tag{1.20} \]

where $c$ is a constant and $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. Vinogradov splits the outer sum in (1.19) into sums of length $q$. When $m$ runs on an interval of length $q$, the angle $am/q$ runs through all fractions of the form $b/q$; due to the error $\delta/x$, $\alpha m$ could be close to $0$ for two values of $m$, but otherwise $\|\alpha m\|$ takes values bounded below by $1/q$ (twice), $2/q$ (twice), $3/q$ (twice), etc. Thus

\[ \left| \sum_{y < m \le y+q} \mu(m) \sum_{n \le N} e(\alpha m n) \right| \le \sum_{y < m \le y+q} \left| \sum_{n \le N} e(\alpha m n) \right| \le \frac{2N}{m} + 2cq \log eq \tag{1.21} \]

for any $y \ge 0$.
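The geometric-series bound (1.20) can be checked directly. In the sketch below the constant is taken to be $c = 1/2$; that value is an assumption of this example (it follows from $|\sin \pi\alpha| \ge 2\|\alpha\|$), not a value stated in the text. The function names are hypothetical.

```python
import cmath
import math

def dist_to_nearest_int(alpha):
    """The quantity ||alpha||: distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))

def exp_sum(alpha, N):
    """The sum of e(alpha n) over 1 <= n <= N, computed term by term."""
    return sum(cmath.exp(2j * math.pi * alpha * n) for n in range(1, N + 1))

def geometric_bound(alpha, N, c=0.5):
    """min(N, c / ||alpha||), with the convention that the bound is N when ||alpha|| = 0."""
    d = dist_to_nearest_int(alpha)
    return N if d == 0 else min(N, c / d)

if __name__ == "__main__":
    N = 50
    for alpha in (0.001, 0.01, 0.1, 0.25, 0.5):
        print(alpha, abs(exp_sum(alpha, N)), geometric_bound(alpha, N))
```

For $\alpha$ very close to an integer the trivial bound $N$ is the smaller one; away from integers the bound $c/\|\alpha\|$ takes over, which is what makes splitting the sum over $m$ into blocks of length $q$ effective.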

There are several ways to improve this. One is simply to estimate the inner sum more precisely; this was already done in [DR01]. One can also define a smoothing function $\eta$, as in (1.19); it is easy to get

\[ \left| \sum_{n \le N} e(\alpha n)\, \eta\left(\frac{n}{x}\right) \right| \le \min\left( x|\eta|_1 + \frac{|\eta'|_1}{2},\; \frac{|\eta'|_1}{2|\sin(\pi\alpha)|},\; \frac{|\widehat{\eta''}|_\infty}{4x(\sin \pi\alpha)^2} \right). \]

Except for the third term, this is as in [Tao14]. We could also choose carefully which bound to use for each $m$; surprisingly, this gives an improvement (in fact, an important one) for $m$ large. However, even with these improvements, we still have a term proportional to $N/m$, as in (1.21), and this contributes about $(x \log x)/q$ to the sum (1.19), thus giving us an estimate that is not log-free.

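What we have to do, naturally, is to take out the terms with $q|m$ for $m$ small. (If $m$ is large, then those may not be the terms for which $m\alpha$ is close to $0$; we will later see what to do.) For $y + q \le Q/2$, $|\alpha - a/q| \le 1/qQ$, we get that

\[ \sum_{\substack{y < m \le y+q \\ q \nmid m}} \min\left( A,\; \frac{B}{|\sin \pi\alpha m|},\; \frac{C}{|\sin \pi\alpha m|^2} \right) \tag{1.22} \]

is at most

\[ \min\left( \frac{20}{3\pi^2} C q^2,\; 2A + \frac{4q}{\pi}\sqrt{AC},\; \frac{2Bq}{\pi} \max\left( 2, \log \frac{C e^3 q}{B\pi} \right) \right). \tag{1.23} \]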

This is satisfactory. We are left with all the terms $m \le M = \min(D, Q/2)$ with $q|m$, and also with all the terms $Q/2 < m \le D$. For $m \le M$ divisible by $q$, we can estimate (as opposed to just bound from above) the inner sum in (1.19) by the Poisson summation formula, and then sum over $m$, but without taking absolute values; writing $m = aq$, we get a main term

\[ \frac{x\mu(q)}{q} \cdot \hat\eta(-\delta) \cdot \sum_{\substack{a \le M/q \\ (a,q)=1}} \frac{\mu(a)}{a}, \tag{1.24} \]

where $(a, q)$ stands for the greatest common divisor of $a$ and $q$.

It is clear that we have to get cancellation over $\mu$ here. There is an elegant elementary argument [GR96] showing that the absolute value of the sum in (1.24) is at most $1$. We need to gain one more log, however. Ramaré [Ramb] helpfully furnished the following bound:

$$\left|\sum_{\substack{a\le x\\ (a,q)=1}} \frac{\mu(a)}{a}\right| \le \frac{4}{5}\, \frac{q}{\phi(q)}\, \frac{1}{\log x/q} \tag{1.25}$$

for $q \le x$. (Cf. [EM95], [EM96].) This is neither trivial nor elementary.⁵ We are, so to speak, allowed to use non-elementary means (that is, methods based on L-functions) because the only L-function we need to use here is the Riemann zeta function.

What shall we do for $m > Q/2$? We can always give a bound

$$\sum_{y<m\le y+q} \min\left(A, \frac{C}{|\sin \pi\alpha m|^2}\right) \le 3A + \frac{4q}{\pi}\sqrt{AC} \tag{1.26}$$

for $y$ arbitrary; since $AC$ will be of constant size, $(4q/\pi)\sqrt{AC}$ is pleasant enough, but the contribution of $3A \sim 3|\eta|_1 x/y$ is nasty (it adds a multiple of $(x \log x)/q$ to the total) and seems unavoidable: the values of $m$ for which $\alpha m$ is close to $0$ no longer correspond to the congruence class $m \equiv 0 \bmod q$, and thus cannot be taken out.

The solution is to switch approximations. (The idea of using different approximations to the same $\alpha$ is neither new nor recent in the general context of the circle method: see [Vau97, §2.8, Ex. 2]. What may be new is its use to clear a hurdle in type I sums.) What does this mean? If $\alpha$ were exactly, or almost exactly, $a/q$, then there would be no other very good approximations in a reasonable range. However, note that we can define $Q = \lfloor x/|\delta q| \rfloor$ for $\alpha = a/q + \delta/x$, and still have $|\alpha - a/q| \le 1/(qQ)$. If $\delta$ is very small, $Q$ will be larger than $2D$, and there will be no terms with $Q/2 < m \le D$ to worry about.

⁵ The current state of knowledge may seem surprising: after all, we expect nearly square-root cancellation. For instance, $|\sum_{n\le x} \mu(n)/n| \le \sqrt{2/x}$ holds for all real $0 < x \le 10^{12}$ (see also the stronger bound in [Dre93]). The classical zero-free region of the Riemann zeta function ought to give a factor of $\exp(-\sqrt{(\log x)/c})$, which looks much better than $1/\log x$. What happens is that (a) such a factor is not actually much better than $1/\log x$ for $x \sim 10^{30}$, say; (b) estimating sums involving the Möbius function by means of an explicit formula is harder than estimating sums involving $\Lambda(n)$: the residues of $1/\zeta(s)$ at the non-trivial zeros of $\zeta(s)$ come into play. As a result, getting non-trivial explicit results on sums of $\mu(n)$ is harder than one would naively expect from the quality of classical effective (but non-explicit) results. See [Rama] for a survey of explicit bounds.


What happens if $\delta$ is not very small? We know that, for any $Q'$, there is an approximation $a'/q'$ to $\alpha$ with $|\alpha - a'/q'| \le 1/(q'Q')$ and $q' \le Q'$. However, for $Q' > Q$, say $Q' = (1+\epsilon)Q$, we know that $a'/q'$ cannot equal $a/q$: by the definition of $Q$, the approximation $a/q$ is not good enough, i.e., $|\alpha - a/q| \le 1/(qQ')$ does not hold. Since $a/q \ne a'/q'$, we see that $|a/q - a'/q'| \ge 1/(qq')$, and this implies that $q' \ge (\epsilon/(1+\epsilon))Q$.
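The last implication is stated without proof. One way to see it is sketched below, under two assumptions that are not spelled out at this point of the text: that $Q' = (1+\epsilon)Q$ for some $\epsilon > 0$, and that $q \le Q$.

```latex
\frac{1}{qq'} \le \left|\frac{a}{q} - \frac{a'}{q'}\right|
  \le \left|\alpha - \frac{a}{q}\right| + \left|\alpha - \frac{a'}{q'}\right|
  \le \frac{1}{qQ} + \frac{1}{q'Q'} .
% Multiply through by q q' Q and use Q' = (1+\epsilon) Q, q \le Q:
Q \le q' + \frac{qQ}{Q'} = q' + \frac{q}{1+\epsilon} \le q' + \frac{Q}{1+\epsilon},
\qquad\text{so}\qquad
q' \ge Q - \frac{Q}{1+\epsilon} = \frac{\epsilon}{1+\epsilon}\, Q .
```

In words: two distinct fractions with denominators $q$, $q'$ cannot be closer than $1/(qq')$, so if both approximate $\alpha$ well, the new denominator $q'$ cannot be small.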

Thus, for $m > Q/2$, the solution is to apply (1.26) with $a'/q'$ instead of $a/q$. The contribution of $A$ fades into insignificance: for the first sum over a range $y < m \le y + q'$, $y \ge Q/2$, it contributes at most $x/(Q/2)$, and all the other contributions of $A$ sum up to at most a constant times $(x \log x)/q'$.

Proceeding in this way, we obtain a total bound for (1.19) whose main terms are proportional to

$$\frac{1}{\phi(q)}\, \frac{x}{\log x/q}\, \min\left(1, \frac{1}{\delta^2}\right), \qquad \frac{2}{\pi}\sqrt{|\widehat{\eta''}|_\infty} \cdot D \qquad\text{and}\qquad q \log \max\left(\frac{D}{q}, q\right), \tag{1.27}$$

with good, explicit constants. The first term (usually the largest one) is precisely what we needed: it is proportional to $(1/\phi(q))\, x/\log x$ for $q$ small, and decreases rapidly as $|\delta|$ increases.

1.4.4 Type II, or bilinear, sums

We must now bound

$$S = \sum_m (1 * \mu_{>U})(m) \sum_{n>V} \Lambda(n) e(\alpha mn) \eta(mn/x).$$

At this point it is convenient to assume that $\eta$ is the Mellin convolution of two functions. The multiplicative or Mellin convolution on $\mathbb{R}^+$ is defined by

$$(\eta_0 *_M \eta_1)(t) = \int_0^\infty \eta_0(r)\, \eta_1\left(\frac{t}{r}\right) \frac{dr}{r}.$$

Tao [Tao14] takes $\eta = \eta_2 = \eta_1 *_M \eta_1$, where $\eta_1$ is a brutal truncation, viz., the function taking the value $2$ on $[1/2, 1]$ and $0$ elsewhere. We take the same $\eta_2$, in part for comparison purposes, and in part because this will allow us to use off-the-shelf estimates on the large sieve. (Brutal truncations are rarely optimal in principle, but, as they are very common, results for them have been carefully optimized in the literature.) Clearly

$$S = \int_V^{x/U} \sum_m \Bigg(\sum_{\substack{d>U\\ d|m}} \mu(d)\Bigg)\, \eta_1\left(\frac{m}{x/W}\right) \cdot \sum_{n\ge V} \Lambda(n) e(\alpha mn)\, \eta_1\left(\frac{n}{W}\right) \frac{dW}{W}. \tag{1.28}$$
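As a sanity check on the smoothing function $\eta_2 = \eta_1 *_M \eta_1$, one can compare a direct numerical evaluation of the Mellin convolution with the closed form that follows from the definition, namely $4\log 4t$ on $[1/4, 1/2]$ and $4\log(1/t)$ on $[1/2, 1]$. The sketch below uses a plain midpoint rule; the function names and the number of steps are illustrative choices, not anything taken from the text.

```python
import math


def eta1(t):
    """Brutal truncation: 2 on [1/2, 1], 0 elsewhere."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2_closed(t):
    """Closed form of eta1 *_M eta1 obtained from the definition."""
    if 0.25 <= t <= 0.5:
        return 4.0 * math.log(4.0 * t)
    if 0.5 < t <= 1.0:
        return -4.0 * math.log(t)
    return 0.0


def eta2_numeric(t, steps=20000):
    """Midpoint rule for int eta1(r) * eta1(t/r) dr/r over r in [1/2, 1]."""
    h = 0.5 / steps
    total = 0.0
    for i in range(steps):
        r = 0.5 + (i + 0.5) * h
        total += eta1(r) * eta1(t / r) / r
    return total * h
```

The closed form makes it visible that $\eta_2$ is supported on $[1/4, 1]$, that is, away from $0$.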


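By Cauchy-Schwarz, the integrand is at most $\sqrt{S_1(U,W)\, S_2(V,W)}$, where

$$S_1(U,W) = \sum_{\frac{x}{2W} < m \le \frac{x}{W}} \Bigg|\sum_{\substack{d>U\\ d|m}} \mu(d)\Bigg|^2,$$

$$S_2(V,W) = \sum_{\frac{x}{2W} \le m \le \frac{x}{W}} \Bigg|\sum_{\max(V, \frac{W}{2}) \le n \le W} \Lambda(n) e(\alpha mn)\Bigg|^2. \tag{1.29}$$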

We must bound $S_1(U,W)$ by a constant times $x/W$. We are able to do this, and with a good constant. (A careless bound would have given a multiple of $(x/U)\log^3(x/U)$, which is much too large.) First, we reduce $S_1(U,W)$ to an expression involving an integral of

$$\sum_{r_1\le x} \sum_{\substack{r_2\le x\\ (r_1,r_2)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}. \tag{1.30}$$

We can bound (1.30) by the use of bounds on $\sum_{n\le t} \mu(n)/n$, combined with the estimation of infinite products by means of approximations to $\zeta(s)$ for $s \to 1^+$. After some additional manipulations, we obtain a bound for $S_1(U,W)$ whose main term is at most $(3/\pi^2)(x/W)$ for each $W$, and closer to $0.22482\, x/W$ on average over $W$.

(This is as good a point as any to say that we can use a trick in [Tao14] that allows us to work with odd values of integer variables throughout, instead of letting $m$ or $n$ range over all integers. Here, for instance, if $m$ and $n$ are restricted to be odd, we obtain a bound of $(2/\pi^2)(x/W)$ for individual $W$, and $0.15107\, x/W$ on average over $W$. This is so even though we are losing some cancellation in $\mu$ by the restriction.)

Let us now bound $S_2(V,W)$. This is traditionally done by Linnik's dispersion method. However, it should be clear that the thing to do nowadays is to use a large sieve, and, more specifically, a large sieve for primes; that kind of large sieve is nothing other than a tool for estimating expressions such as $S_2(V,W)$. (Incidentally, even though we are trying to save every factor of log we can, we choose not to use small sieves at all, either here or elsewhere.) In order to take advantage of prime support, we use Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]) combined with Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], following the general procedure in [MV73, (1.6)]. We obtain a bound of the form

$$\frac{\log W}{\log \frac{W}{2q}} \left(\frac{x}{4\phi(q)} + \frac{qW}{\phi(q)}\right) \frac{W}{2} \tag{1.31}$$

on $S_2(V,W)$, where, of course, we can also choose not to gain a factor of $\log W/2q$ if $q$ is close to or greater than $W$.

It remains to see how to gain a factor of $|\delta|$ in the major arcs, and more specifically in $S_2(V,W)$. To explain this, let us step back and take a look at what the large sieve is.


Given a civilized function $f: \mathbb{Z} \to \mathbb{C}$, Plancherel's identity tells us that

$$\int_{\mathbb{R}/\mathbb{Z}} \left|\widehat{f}(\alpha)\right|^2 d\alpha = \sum_n |f(n)|^2.$$

The large sieve can be seen as an approximate, or statistical, version of this: for a "sample" of points $\alpha_1, \alpha_2, \dots, \alpha_k$ satisfying $|\alpha_i - \alpha_j| \ge \beta$ for $i \ne j$, it tells us that

$$\sum_{1\le j\le k} \left|\widehat{f}(\alpha_j)\right|^2 \le (X + \beta^{-1}) \sum_n |f(n)|^2, \tag{1.32}$$

assuming that $f$ is supported on an interval of length $X$.

Now consider $\alpha_1 = \alpha$, $\alpha_2 = 2\alpha$, $\alpha_3 = 3\alpha$, $\dots$. If $\alpha = a/q$, then the angles $\alpha_1, \dots, \alpha_q$ are well-separated, i.e., they satisfy $|\alpha_i - \alpha_j| \ge 1/q$, and so we can apply (1.32) with $\beta = 1/q$. However, $\alpha_{q+1} = \alpha_1$. Thus, if we have an outer sum of length $L > q$ (in (1.29), we have an outer sum of length $L = x/2W$), we need to split it into $\lceil L/q \rceil$ blocks of length $q$, and so the total bound given by (1.32) is $\lceil L/q \rceil (X + q) \sum_n |f(n)|^2$. Indeed, this is what gives us (1.31), which is fine, but we want to do better for $|\delta|$ larger than a constant.

Suppose, then, that $\alpha = a/q + \delta/x$, where $|\delta| > 8$, say. Then the angles $\alpha_1$ and $\alpha_{q+1}$ are not identical: $|\alpha_1 - \alpha_{q+1}| \le q|\delta|/x$. We also see that $\alpha_{q+1}$ is at a distance at least $q|\delta|/x$ from $\alpha_2, \alpha_3, \dots, \alpha_q$, provided that $q|\delta|/x < 1/q$. We can go on with $\alpha_{q+2}, \alpha_{q+3}, \dots$, and stop only once there is overlap, i.e., only once we reach $\alpha_m$ such that $m|\delta|/x \ge 1/q$. We then give all the angles $\alpha_1, \dots, \alpha_m$ (which are separated by at least $q|\delta|/x$ from each other) to the large sieve at the same time. We do this $\lceil L/m \rceil \le \lceil L/(x/|\delta|q) \rceil$ times, and obtain a total bound of $\lceil L/(x/|\delta|q) \rceil (X + x/|\delta|q) \sum_n |f(n)|^2$, which, for $L = x/2W$, $X = W/2$, gives us about

$$\left(\frac{x}{4Q}\, \frac{W}{2} + \frac{x}{4}\right) \log W$$

provided that $L \ge x/|\delta|q$ and, as usual, $|\alpha - a/q| \le 1/(qQ)$. This is very small compared to the trivial bound $\lesssim xW/8$.
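The following toy computation illustrates how the large sieve inequality (1.32) is applied to the angles $\alpha_j = j\alpha$. It is a sketch under stated assumptions: the separation $\beta$ is measured numerically rather than derived, the test function $f$ and all parameter values are arbitrary small choices, and the function names are ours.

```python
import cmath
import math


def multiples(alpha, m):
    """The angles alpha_j = j * alpha for j = 1, ..., m (taken mod 1 later)."""
    return [j * alpha for j in range(1, m + 1)]


def min_separation(angles):
    """Smallest pairwise distance in R/Z: sort, compare neighbours, wrap around."""
    s = sorted(a % 1.0 for a in angles)
    gaps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    gaps.append(1.0 - s[-1] + s[0])
    return min(gaps)


def large_sieve_sides(f, angles):
    """Return (lhs, rhs) of (1.32) for f given by its values at n = 1, ..., X."""
    X = len(f)
    beta = min_separation(angles)
    lhs = sum(
        abs(sum(f[n - 1] * cmath.exp(2j * math.pi * al * n)
                for n in range(1, X + 1))) ** 2
        for al in angles
    )
    rhs = (X + 1.0 / beta) * sum(abs(v) ** 2 for v in f)
    return lhs, rhs
```

For instance, with $\alpha = 3/7 + 8/10^4$ one can feed in `multiples(alpha, m)` with $m$ close to $x/(|\delta| q)$ and observe that many more than $q$ angles can be handed to the large sieve at once, at the price of a smaller $\beta$.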

What happens if $L < x/|\delta q|$? Then there is never any overlap: we consider all angles $\alpha_i$, and give them all together to the large sieve. The total bound is $(W^2/4 + xW/2|\delta|q) \log W$. If $L = x/2W$ is smaller than, say, $x/3|\delta q|$, then we see clearly that there are non-intersecting swarms of angles $\alpha_i$ around the rationals $a/q$. We can thus save a factor of log (or rather $(\phi(q)/q) \log(W/|\delta q|)$) by applying Montgomery's inequality, which operates by strewing displacements of given angles (or here, swarms around angles) around the circle to the extent possible while keeping everything well-separated. In this way, we obtain a bound of the form

$$\frac{\log W}{\log \frac{W}{|\delta| q}} \left(\frac{x}{|\delta|\phi(q)} + \frac{q}{\phi(q)}\, \frac{W}{2}\right) \frac{W}{2}.$$

Compare this to (1.31); we have gained a factor of $|\delta|/4$, and so we use this estimate when $|\delta| > 4$. (We will actually use the criterion $|\delta| > 8$, but, since we will be working with approximations of the form $2\alpha = a/q + \delta/x$, the value of $\delta$ in our actual work is twice what it is in this introduction. This is a consequence of working with sums over the odd integers, as in [Tao14].)

We have succeeded in eliminating all factors of log we came across. The only factor of log that remains is $\log x/UV$, coming from the integral $\int_V^{x/U} dW/W$. Thus, we want $UV$ to be close to $x$, but we cannot let it be too close, since we also have a term proportional to $D = UV$ in (1.27), and we need to keep it substantially smaller than $x$. We set $U$ and $V$ so that $UV$ is $x/\sqrt{q \max(4, |\delta|)}$ or thereabouts.

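In the end, after some work, we obtain our main minor-arcs bound (Theorem 3.1.1). It states the following. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Recall that $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x)$ and $\eta_2 = \eta_1 *_M \eta_1 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/(qQ)$, where $Q = (3/4) x^{2/3}$. If $q \le x^{1/3}/6$, then

$$|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36\, x^{5/6}, \tag{1.33}$$

where

$$\delta_0 = \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004\, t}}\right) + 0.41415,$$

$$L_{x,t,q} = \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \tag{1.34}$$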
The factor $R_{x,t}$ is small in practice; for typical "difficult" values of $x$ and $\delta_0 x$, it is less than $1$. The crucial things to notice in (1.33) are that there is no factor of $\log x$, and that, in the main term, there is only one factor of $\log \delta_0 q$. The fact that $\delta_0$ helps us as it grows is precisely what enables us to take major arcs that get narrower and narrower as $q$ grows.

1.5 Integrals over the major and minor arcs

So far, we have sketched (§1.3) how to estimate $S_\eta(\alpha, x)$ for $\alpha$ in the major arcs and $\eta$ based on the Gaussian $e^{-t^2/2}$, and also (§1.4) how to bound $|S_\eta(\alpha, x)|$ for $\alpha$ in the minor arcs and $\eta = \eta_2$, where $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. We now must show how to use such information to estimate integrals such as the ones in (1.4).

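We will use two smoothing functions $\eta_+$, $\eta_*$; in the notation of (1.3), we set $f_1 = f_2 = \Lambda(n)\eta_+(n/x)$, $f_3 = \Lambda(n)\eta_*(n/x)$, and so we must give a lower bound for

$$\int_{\mathfrak{M}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n)\, d\alpha \tag{1.35}$$

and an upper bound for

$$\int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 S_{\eta_*}(\alpha, x) e(-\alpha n)\, d\alpha \tag{1.36}$$

so that we can verify (1.4).

The traditional approach to (1.36) is to bound

$$\begin{aligned}
\int_{\mathfrak{m}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n)\, d\alpha &\le \int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha \cdot \max_{\alpha\in\mathfrak{m}} |S_{\eta_*}(\alpha, x)| \\
&\le \sum_n \Lambda(n)^2 \eta_+^2\left(\frac{n}{x}\right) \cdot \max_{\alpha\in\mathfrak{m}} |S_{\eta_*}(\alpha, x)|.
\end{aligned} \tag{1.37}$$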

Since the sum over $n$ is of the order of $x \log x$, this is not log-free, and so cannot be good enough; we will later see how to do better. Still, this gets the main shape right: our bound on (1.36) will be proportional to $|\eta_+|_2^2 |\eta_*|_1$. Moreover, we see that $\eta_*$ has to be such that we know how to bound $|S_{\eta_*}(\alpha, x)|$ for $\alpha \in \mathfrak{m}$, while our choice of $\eta_+$ is more or less free, at least as far as the minor arcs are concerned.

What about the major arcs? In order to do anything on them, we will have to be able to estimate both $S_{\eta_+}(\alpha, x)$ and $S_{\eta_*}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. If that is the case, then, as we shall see, we will be able to obtain that the main term of (1.35) is an infinite product (independent of the smoothing functions), times $x^2$, times

$$\int_{-\infty}^{\infty} (\widehat{\eta_+}(-\alpha))^2\, \widehat{\eta_*}(-\alpha)\, e(-\alpha n/x)\, d\alpha = \int_0^\infty \int_0^\infty \eta_+(t_1)\eta_+(t_2)\, \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1\, dt_2. \tag{1.38}$$

In other words, we want to maximize (or nearly maximize) the expression on the right of (1.38) divided by $|\eta_+|_2^2 |\eta_*|_1$.

One way to do this is to let $\eta_*$ be concentrated on a small interval $[0, \epsilon)$. Then the right side of (1.38) is approximately

$$|\eta_*|_1 \cdot \int_0^\infty \eta_+(t)\, \eta_+\left(\frac{n}{x} - t\right) dt. \tag{1.39}$$

To maximize (1.39), we should make sure that $\eta_+(t) \sim \eta_+(n/x - t)$. We set $x \sim n/2$, and see that we should define $\eta_+$ so that it is supported on $[0, 2]$ and symmetric around $t = 1$, or nearly so; this will maximize the ratio of (1.39) to $|\eta_+|_2^2 |\eta_*|_1$.

We should do this while making sure that we will know how to estimate $S_{\eta_+}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. We know how to estimate $S_\eta(\alpha, x)$ very precisely for functions of the form $\eta(t) = g(t)e^{-t^2/2}$, $\eta(t) = g(t)te^{-t^2/2}$, etc., where $g(t)$ is band-limited. We will work with a function $\eta_+$ of that form, chosen so as to be very close (in $\ell_2$ norm) to a function $\eta_\circ$ that is in fact supported on $[0, 2]$ and symmetric around $t = 1$.

We choose

$$\eta_\circ(t) = \begin{cases} t^3 (2-t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}$$

This function is obviously symmetric ($\eta_\circ(t) = \eta_\circ(2 - t)$) and vanishes to high order at $t = 0$, besides being supported on $[0, 2]$.

We set $\eta_+(t) = h_R(t)\, t e^{-t^2/2}$, where $h_R(t)$ is an approximation to the function

$$h(t) = \begin{cases} t^2 (2-t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}$$
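The function $h$ is chosen so that $h(t) \cdot t e^{-t^2/2}$ equals the symmetric function supported on $[0, 2]$ defined just before (written `eta_circ` below), because $-(t-1)^2/2 = -t^2/2 + t - 1/2$. The short sketch below checks this factorization and the symmetry numerically; the function names are ours.

```python
import math


def eta_circ(t):
    """t^3 (2 - t)^3 exp(-(t - 1)^2 / 2) on [0, 2], zero elsewhere."""
    if 0 <= t <= 2:
        return t**3 * (2 - t) ** 3 * math.exp(-((t - 1) ** 2) / 2)
    return 0.0


def h(t):
    """t^2 (2 - t)^3 exp(t - 1/2) on [0, 2], zero elsewhere."""
    if 0 <= t <= 2:
        return t**2 * (2 - t) ** 3 * math.exp(t - 0.5)
    return 0.0


def factorization_gap(t):
    """|eta_circ(t) - h(t) * t * exp(-t^2/2)|, which should be ~0 up to rounding."""
    return abs(eta_circ(t) - h(t) * t * math.exp(-t * t / 2))
```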


We just let $h_R(t)$ be the inverse Mellin transform of the truncation of $Mh$ to an interval $[-iR, iR]$. (Explicitly,

$$h_R(t) = \int_0^\infty h(t y^{-1}) F_R(y) \frac{dy}{y},$$

where $F_R(y) = \sin(R \log y)/(\pi \log y)$, that is, $F_R$ is the Dirichlet kernel with a change of variables.)

Since the Mellin transform of $te^{-t^2/2}$ is regular at $s = 0$, the Mellin transform $M\eta_+$ will be holomorphic in a neighborhood of $\{s : 0 \le \Re(s) \le 1\}$, even though the truncation of $Mh$ to $[-iR, iR]$ is brutal. Set $R = 200$, say. By the fast decay of $Mh(it)$ and the fact that the Mellin transform $M$ is an isometry, $|(h_R(t) - h(t))/t|_2$ is very small, and hence so is $|\eta_+ - \eta_\circ|_2$, as we desired.

But what about the requirement that we be able to estimate $S_{\eta_*}(\alpha, x)$ for both $\alpha \in \mathfrak{m}$ and $\alpha \in \mathfrak{M}$?

Generally speaking, if we know how to estimate $S_{\eta_1}(\alpha, x)$ for some $\alpha \in \mathbb{R}/\mathbb{Z}$ and we also know how to estimate $S_{\eta_2}(\alpha, x)$ for all other $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_1$ and $\eta_2$ are two smoothing functions, then we know how to estimate $S_{\eta_3}(\alpha, x)$ for all $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_3 = \eta_1 *_M \eta_2$, or, more generally, $\eta_*(t) = (\eta_1 *_M \eta_2)(\kappa t)$, $\kappa > 0$ a constant. This is an easy exercise on exchanging the order of integration and summation:

$$\begin{aligned}
S_{\eta_*}(\alpha, x) &= \sum_n \Lambda(n) e(\alpha n) (\eta_1 *_M \eta_2)\left(\kappa \frac{n}{x}\right) \\
&= \int_0^\infty \sum_n \Lambda(n) e(\alpha n)\, \eta_1(\kappa r)\, \eta_2\left(\frac{n}{rx}\right) \frac{dr}{r} = \int_0^\infty \eta_1(\kappa r)\, S_{\eta_2}(rx) \frac{dr}{r},
\end{aligned} \tag{1.40}$$

and similarly with $\eta_1$ and $\eta_2$ switched. Of course, this trick is valid for all exponential sums: any function $f(n)$ would do in place of $\Lambda(n)$. The only caveat is that $\eta_1$ (and $\eta_2$) should be small very near $0$, since, for $r$ small, we may not be able to estimate $S_{\eta_2}(rx)$ (or $S_{\eta_1}(rx)$) with any precision. This is not a problem; one of our functions will be $t^2 e^{-t^2/2}$, which vanishes to second order at $0$, and the other one will be $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$, which has support bounded away from $0$. We will set $\kappa$ large (say $\kappa = 49$) so that the support of $\eta_*$ is indeed concentrated on a small interval $[0, \epsilon)$, as we wanted.

Now that we have chosen our smoothing weights $\eta_+$ and $\eta_*$, we have to estimate the major-arc integral (1.35) and the minor-arc integral (1.36). What follows can actually be done for general $\eta_+$ and $\eta_*$; we could have left our particular choice of $\eta_+$ and $\eta_*$ for the end.

Estimating the major-arc integral (1.35) may sound like an easy task, since we have rather precise estimates for $S_\eta(\alpha, x)$ ($\eta = \eta_+, \eta_*$) when $\alpha$ is on the major arcs; we could just replace $S_\eta(\alpha, x)$ in (1.35) by the approximation given by (1.7) and (1.11). It is, however, more efficient to express (1.35) as the sum of the contribution of the trivial character (a sum of integrals of $(\widehat{\eta}(-\delta)x)^3$, where $\widehat{\eta}(-\delta)x$ comes from (1.11)) plus a term of the form

$$\left(\text{maximum of } \sqrt{q} \cdot E(q) \text{ for } q \le r\right) \cdot \int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha,$$

where $E(q) = E$ is as in (1.12), plus two other terms of essentially the same form. As usual, the major arcs $\mathfrak{M}$ are the arcs around rationals $a/q$ with $q \le r$. We will soon discuss how to bound the integral of $|S_{\eta_+}(\alpha, x)|^2$ over arcs around rationals $a/q$ with $q \le s$, $s$ arbitrary. Here, however, it is best to estimate the integral over $\mathfrak{M}$ using the estimate on $S_{\eta_+}(\alpha, x)$ from (1.7) and (1.11); we obtain a great deal of cancellation, with the effect that, for $\chi$ non-trivial, the error term in (1.12) appears only when it gets squared, and thus becomes negligible.

The contribution of the trivial character has an easy approximation, thanks to the fast decay of $\widehat{\eta_\circ}$. We obtain that the major-arc integral (1.35) equals a main term $C_0 C_{\eta_\circ,\eta_*} x^2$, where

$$C_0 = \prod_{p|n} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p\nmid n} \left(1 + \frac{1}{(p-1)^3}\right),$$

$$C_{\eta_\circ,\eta_*} = \int_0^\infty \int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\, \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1\, dt_2,$$

plus several small error terms. We have already chosen $\eta_\circ$, $\eta_*$ and $x$ so as to (nearly) maximize $C_{\eta_\circ,\eta_*}$.
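To get a feeling for the size of the constant $C_0$, one can evaluate a truncated version of the Euler product. The sketch below is illustrative only: the function names and the cutoff `limit` are our choices, and prime factors of $n$ above the cutoff are ignored, so the value is an approximation rather than a rigorous bound.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def C0_truncated(n, limit=10**5):
    """Product of the local factors of C_0 over primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        if n % p == 0:
            prod *= 1.0 - 1.0 / (p - 1) ** 2
        else:
            prod *= 1.0 + 1.0 / (p - 1) ** 3
    return prod
```

For odd $n$ the factor at $p = 2$ equals $2$, and the remaining factors are close to $1$, so the truncated product stays bounded away from $0$; for even $n$ the factor at $p = 2$ vanishes.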

It is time to bound the minor-arc integral (1.36). As we said in §1.5, we must do better than the usual bound (1.37). Since our minor-arc bound (3.2) on $|S_\eta(\alpha, x)|$, $\alpha \sim a/q$, decreases as $q$ increases, it makes sense to use partial summation together with bounds on

$$\int_{\mathfrak{m}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha = \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha - \int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha,$$

where $\mathfrak{m}_s$ denotes the arcs around $a/q$, $r < q \le s$, and $\mathfrak{M}_s$ denotes the arcs around all $a/q$, $q \le s$. We already know how to estimate the integral on $\mathfrak{M}$. How do we bound the integral on $\mathfrak{M}_s$?

In order to do better than the trivial bound $\int_{\mathfrak{M}_s} \le \int_{\mathbb{R}/\mathbb{Z}}$, we will need to use the fact that the series (1.6) defining $S_{\eta_+}(\alpha, x)$ is essentially supported on prime numbers. Bounding the integral on $\mathfrak{M}_s$ is closely related to the problem of bounding

$$\sum_{q\le s} \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|\sum_{n\le x} a_n e(an/q)\right|^2 \tag{1.41}$$

efficiently for $s$ considerably smaller than $\sqrt{x}$ and $a_n$ supported on the primes $\sqrt{x} < p \le x$. This is a classical problem in the study of the large sieve. The usual bound on (1.41) (by, for instance, Montgomery's inequality) has a gain of a factor of

$$2e^\gamma (\log s)/(\log x/s^2)$$

relative to the bound of $(x + s^2) \sum_n |a_n|^2$ that one would get from the large sieve without using prime support. Heath-Brown proceeded similarly to bound

$$\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2e^\gamma \log s}{\log x/s^2} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha. \tag{1.42}$$

This already gives us the gain of $C(\log s)/\log x$ that we absolutely need, but the constant $C$ is suboptimal; the factor in the right side of (1.42) should really be $(\log s)/\log x$, i.e., $C$ should be $1$. We cannot reasonably hope to obtain a factor better than $2(\log s)/\log x$ in the minor arcs due to what is known as the parity problem in sieve theory. As it turns out, Ramaré [Ram09] had given general bounds on the large sieve that were clearly conducive to better bounds on (1.41), though they involved a ratio that was not easy to bound in general.

I used several careful estimations (including [Ram95, Lem. 3.4]) to reduce the problem of bounding this ratio to a finite number of cases, which I then checked by a rigorous computation. This approach gave a bound on (1.41) with a factor of size close to $2(\log s)/\log x$. (This solves the large-sieve problem for $s \le x^{0.3}$; it would still be worthwhile to give a computation-free proof for all $s \le x^{1/2-\epsilon}$, $\epsilon > 0$.) It was then easy to give an analogous bound for the integral over $\mathfrak{M}_s$, namely,

$$\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2 \log s}{\log x} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha,$$

where $\lesssim$ can easily be made precise by replacing $\log s$ by $\log s + 1.36$ and $\log x$ by $\log x + c$, where $c$ is a small constant. Without this improvement, the main theorem would still have been proved, but the required computation time would have been multiplied by a factor of considerably more than $e^{3\gamma} = 5.6499\dots$.

What remained then was just to compare the estimates on (1.35) and (1.36) and check that (1.36) is smaller for $n \ge 10^{27}$. This final step was just bookkeeping. As we already discussed, a check for $n < 10^{27}$ is easy. Thus ends the proof of the main theorem.

1.6 Some remarks on computations

There were two main computational tasks: verifying the ternary conjecture for all $n \le C$, and checking the Generalized Riemann Hypothesis for modulus $q \le r$ up to a certain height.

The first task was not very demanding. Platt and I verified in [HP13] that every odd integer $5 < n \le 8.8 \cdot 10^{30}$ can be written as the sum of three primes. (In the end, only a check for $5 < n \le 10^{27}$ was needed.) We proceeded as follows. In a major computational effort, Oliveira e Silva, Herzog and Pardi [OeSHP14] had already checked that the binary Goldbach conjecture is true up to $4 \cdot 10^{18}$, that is, every even number up to $4 \cdot 10^{18}$ is the sum of two primes. Given that, all we had to do was to construct a "prime ladder", that is, a list of primes from $3$ up to $8.8 \cdot 10^{30}$ such that the difference between any two consecutive primes in the list is at least $4$ and at most $4 \cdot 10^{18}$. (This is a known strategy: see [Sao98].) Then, for any odd integer $5 < n \le 8.8 \cdot 10^{30}$, there is a prime $p$ in the list such that $4 \le n - p \le 4 \cdot 10^{18} + 2$. (Choose the largest $p < n$ in the ladder, or, if $n$ minus that prime is $2$, choose the prime immediately under that.) By [OeSHP14] (and the fact that $4 \cdot 10^{18} + 2$ equals $p + q$, where $p = 2000000000000001301$ and $q = 1999999999999998701$ are both prime), we can write $n - p = p_1 + p_2$ for some primes $p_1$, $p_2$, and so $n = p + p_1 + p_2$.
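The prime-ladder strategy can be tried out at toy scale. The sketch below is not the implementation of [HP13]; it replaces the bound $4 \cdot 10^{18}$ by a small hypothetical bound `max_gap`, uses naive trial division, and all function names are ours.

```python
import bisect


def is_prime(n):
    """Naive trial division; adequate only for toy sizes."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def goldbach_pair(m):
    """Primes (p1, p2) with p1 + p2 = m for even m >= 4, or None if none exist."""
    for p1 in range(2, m // 2 + 1):
        if is_prime(p1) and is_prime(m - p1):
            return p1, m - p1
    return None


def build_ladder(top, max_gap, min_gap=4):
    """Primes 3 = p_0 < p_1 < ... with gaps in [min_gap, max_gap], reaching top."""
    ladder = [3]
    while top - ladder[-1] > max_gap:
        p = ladder[-1]
        # Greedy choice: the largest prime within the allowed gap.
        nxt = next((c for c in range(p + max_gap, p + min_gap - 1, -1)
                    if is_prime(c)), None)
        if nxt is None:
            raise ValueError("no prime within the allowed gap")
        ladder.append(nxt)
    return ladder


def three_primes(n, ladder, max_gap):
    """Write an odd n > 5 (within the ladder's range) as p + p1 + p2."""
    i = bisect.bisect_left(ladder, n) - 1  # largest ladder prime below n
    if n - ladder[i] == 2:
        i -= 1                             # step down one rung, as in the text
    p = ladder[i]
    if not 4 <= n - p <= max_gap + 2:
        raise ValueError("n is outside the range covered by the ladder")
    p1, p2 = goldbach_pair(n - p)
    return p, p1, p2
```

With `max_gap = 100`, the binary Goldbach statement only has to be known for even numbers up to $102$, and the ladder then covers every odd $n$ up to its top.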

Building a prime ladder involves only integer arithmetic, that is, computer manipulation of integers, rather than of real numbers. Integers are something that computers can handle rapidly and reliably. We look for primes for our ladder only among a special set of integers whose primality can be tested deterministically quite quickly (Proth numbers: $k \cdot 2^m + 1$, $k < 2^m$). Thus, we can build a prime ladder by a rigorous, deterministic algorithm that can be (and was) parallelized trivially.
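One standard deterministic test for Proth numbers rests on Proth's theorem: for $N = k \cdot 2^m + 1$ with $k$ odd and $k < 2^m$, if $a^{(N-1)/2} \equiv -1 \pmod N$ for some $a$, then $N$ is prime. The sketch below searches for a base $a$ with Jacobi symbol $-1$, for which the congruence holds if and only if $N$ is prime. Whether this exact variant was used in [HP13] is not stated here; the code and its names are an illustration only.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                      # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def is_proth_prime(k, m):
    """Primality of N = k * 2^m + 1 for odd k with 0 < k < 2^m."""
    if k % 2 == 0 or not 0 < k < 2**m:
        raise ValueError("need k odd with 0 < k < 2^m")
    N = k * 2**m + 1
    a = 2
    while True:
        j = jacobi(a, N)
        if j == 0:
            # a < N shares a factor with N, so N is composite.
            return False
        if j == -1:
            # Euler's criterion (N prime) and Proth's theorem (converse).
            return pow(a, (N - 1) // 2, N) == N - 1
        a += 1
```

The whole test uses only modular exponentiation on integers, which is why it is both fast and free of rounding issues.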

The second computation is more demanding. It consists in verifying that, for every L-function $L(s, \chi)$ with $\chi$ of conductor $q \le r = 300000$ (for $q$ even) or $q \le r/2$ (for $q$ odd), all zeros of $L(s, \chi)$ such that $|\Im(s)| \le H_q = 10^8/q$ (for $q$ odd) and $|\Im(s)| \le H_q = \max(10^8/q, 200 + 7.5 \cdot 10^7/q)$ (for $q$ even) lie on the critical line. As a matter of fact, Platt went up to conductor $q \le 200000$ (or twice that for $q$ even) [Plab]; he had already gone up to conductor $100000$ in his PhD thesis [Pla11]. The verification took, in total, about $400000$ core-hours (i.e., the total number of processor cores used times the number of hours they ran equals $400000$; nowadays, a top-of-the-line processor typically has eight cores). In the end, since I used only $q \le 150000$ (or twice that for $q$ even), the number of hours actually needed was closer to $160000$; since I could have made do with $q \le 120000$ (at the cost of increasing $C$ to $10^{29}$ or $10^{30}$), it is likely, in retrospect, that only about $80000$ core-hours were needed.

Checking zeros of L-functions computationally goes back to Riemann (who did it by hand for the special case of the Riemann zeta function). It is also one of the things that were tried on digital computers in their early days (by Turing [Tur53], for instance; see the exposition in [Boo06b]). One of the main issues to be careful about arises whenever one manipulates real numbers via a computer: generally speaking, a computer cannot store an irrational number; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, one cannot really say: "computer, give me the sine of that number" and expect a precise result. What one should do, if one really wants to prove something (as is the case here), is to say: "computer, I am giving you an interval $I = [a/2^k, b/2^k]$; give me an interval $I' = [c/2^\ell, d/2^\ell]$, preferably very short, such that $\sin(I) \subset I'$". This is called interval arithmetic; it is arguably the easiest way to do floating-point computations rigorously.
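The request "give me an interval $I'$ with $\sin(I) \subset I'$" can be made concrete in a few lines. The sketch below is a deliberately simple illustration, not the interval package used in the proof: it works with exact rationals, handles only intervals inside $[-1, 1]$ (where $\sin$ is increasing), and brackets $\sin$ by consecutive partial sums of its alternating Taylor series before rounding outward to dyadic endpoints. All names are ours.

```python
from fractions import Fraction
from math import floor, ceil


def sin_partial_sums(x, pairs=5):
    """For rational 0 <= x <= 1, return (lower, upper) with lower <= sin(x) <= upper."""
    term = Fraction(x)          # x^(2k+1) / (2k+1)!, starting at k = 0
    total = Fraction(0)
    for k in range(2 * pairs):
        total += term if k % 2 == 0 else -term
        term = term * x * x / ((2 * k + 2) * (2 * k + 3))
    # An even number of terms undershoots; one more (positive) term overshoots.
    return total, total + term


def sin_enclosure(x):
    """Rational enclosure of sin(x) for rational -1 <= x <= 1, using oddness."""
    if x >= 0:
        return sin_partial_sums(x)
    lo, hi = sin_partial_sums(-x)
    return -hi, -lo


def sin_interval(a, b, ell=30):
    """Dyadic interval [c/2^ell, d/2^ell] containing sin([a, b]), -1 <= a <= b <= 1."""
    if not -1 <= a <= b <= 1:
        raise ValueError("this sketch only handles intervals inside [-1, 1]")
    lower, _ = sin_enclosure(a)   # sin is increasing on [-1, 1]
    _, upper = sin_enclosure(b)
    scale = 2**ell
    return Fraction(floor(lower * scale), scale), Fraction(ceil(upper * scale), scale)
```

Every step is either exact rational arithmetic or rounding in a known direction, so the returned interval is guaranteed to contain the true values, which is the whole point of the method.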

Processors do not do this natively, and if interval arithmetic is implemented purely on software, computations can be slowed down by a factor of about 100. Fortunately, there are ways of running interval-arithmetic computations partly on hardware, partly on software.

Incidentally, there are some basic functions (such as $\sin$) that should always be done on software, not just if one wants to use interval arithmetic, but even if one just wants reasonably precise results: the implementation of transcendental functions in some of the most popular processors does not always round correctly, and errors can accumulate quickly. Fortunately, this problem is already well-known, and there is software that takes care of this. (Platt and I used the crlibm library [DLDDD+10].)


Lastly, there were several relatively minor computations strewn here and there in the proof. There is some numerical integration, done rigorously; once or twice, this was done using a standard package based on interval arithmetic [Ned06], but most of the time I wrote my own routines in C (using Platt's interval arithmetic package) for the sake of speed. Another kind of computation (employed much more in [Hela] than in the somewhat more polished version of the proof given here) was a rigorous version of a "proof by graph" ("the maximum of a function $f$ is clearly less than 4 because I can see it on the screen"). There is a standard way to do this (see, e.g., [Tuc11, §5.2]); essentially, the bisection method combines naturally with interval arithmetic, as we shall describe in §2.6. Yet another computation (and not a very small one) was that involved in verifying a large-sieve inequality in an intermediate range (as we discussed in §1.5).
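A rigorous "proof by graph" can be sketched as follows: bound the function from above on an interval, and bisect wherever the bound is not yet good enough. The example below is a toy version with exact rational arithmetic; the function $f(t) = t^3 (2-t)^3$ on $[0, 2]$ and the crude interval bound are our illustrative choices, not the routines used in the proof.

```python
from fractions import Fraction


def upper_bound_on(a, b):
    """Upper bound for f(t) = t^3 (2 - t)^3 on [a, b] inside [0, 2].

    On such an interval 0 <= t <= b and 0 <= 2 - t <= 2 - a, so f <= b^3 (2 - a)^3.
    """
    return b**3 * (2 - a) ** 3


def prove_max_below(bound, a=Fraction(0), b=Fraction(2),
                    min_width=Fraction(1, 2**20)):
    """True if bisection certifies f < bound on [a, b]; False if inconclusive."""
    stack = [(a, b)]
    while stack:
        lo, hi = stack.pop()
        if upper_bound_on(lo, hi) < bound:
            continue                 # this piece is certified
        if hi - lo < min_width:
            return False             # give up: the bound may be too tight
        mid = (lo + hi) / 2
        stack.extend([(lo, mid), (mid, hi)])
    return True
```

Since the true maximum of this $f$ is $1$ (at $t = 1$), the procedure certifies any bound slightly above $1$ after a modest number of bisections, and it is inconclusive for the bound $1$ itself, as it should be.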

It may be interesting to note that one of the inequalities used to estimate (1.30) was proven with the help of automatic quantifier elimination [HB11]. Proving this inequality was a very minor task, both computationally and mathematically; in all likelihood, it is feasible to give a human-generated proof. Still, it is nice to know from first-hand experience that computers can nowadays (pretend to) do something other than just perform numerical computations, and that this is already applicable in current mathematical practice.

Chapter 2

Notation and preliminaries

2.1 General notation

Given positive integers $m$, $n$, we say $m|n^\infty$ if every prime dividing $m$ also divides $n$. We say a positive integer $n$ is square-full if, for every prime $p$ dividing $n$, the square $p^2$ also divides $n$. (In particular, $1$ is square-full.) We say $n$ is square-free if $p^2 \nmid n$ for every prime $p$. For $p$ prime, $n$ a non-zero integer, we define $v_p(n)$ to be the largest non-negative integer $\alpha$ such that $p^\alpha | n$.

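When we write $\sum_n$, we mean $\sum_{n=1}^\infty$, unless the contrary is stated. As always, $\Lambda(n)$ denotes the von Mangoldt function:

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^\alpha \text{ for some prime } p \text{ and some integer } \alpha \ge 1, \\ 0 & \text{otherwise,} \end{cases}$$

and $\mu$ denotes the Möbius function:

$$\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct,} \\ 0 & \text{if } p^2 | n \text{ for some prime } p. \end{cases}$$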
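For readers who like to experiment, the two definitions translate directly into code. This is a naive sketch based on trial division (the function names are ours), meant only to make the definitions concrete for small $n$.

```python
import math


def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def von_mangoldt(n):
    """log p if n is a power of a prime p (exponent >= 1), and 0 otherwise."""
    f = factorize(n)
    if len(f) == 1:
        (p,) = f
        return math.log(p)
    return 0.0


def mobius(n):
    """(-1)^k for a product of k distinct primes, 0 if a square divides n."""
    f = factorize(n)
    if any(exp > 1 for exp in f.values()):
        return 0
    return (-1) ** len(f)
```

Two classical identities give a quick consistency check: $\sum_{d|n} \Lambda(d) = \log n$, and $\sum_{d|n} \mu(d) = 0$ for $n > 1$.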

We let $\tau(n)$ be the number of divisors of an integer $n$, $\omega(n)$ the number of prime divisors of $n$, and $\sigma(n)$ the sum of the divisors of $n$.

We write $(a, b)$ for the greatest common divisor of $a$ and $b$. If there is any risk of confusion with the pair $(a, b)$, we write $\gcd(a, b)$. Denote by $(a, b^\infty)$ the divisor $\prod_{p|b} p^{v_p(a)}$ of $a$. (Thus, $a/(a, b^\infty)$ is coprime to $b$, and is in fact the maximal divisor of $a$ with this property.)
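The divisor $(a, b^\infty)$ can be computed without factoring, by repeatedly stripping from $a$ whatever it has in common with $b$. A minimal sketch (the function name is ours), for positive integers $a$ and $b$:

```python
from math import gcd


def gcd_with_infinite_power(a, b):
    """(a, b^infinity): the product of p^{v_p(a)} over the primes p dividing b."""
    result = 1
    g = gcd(a, b)
    while g > 1:
        result *= g
        a //= g
        g = gcd(a, b)
    return result
```

For example, with $a = 360 = 2^3 \cdot 3^2 \cdot 5$ and $b = 6$, the loop collects $6 \cdot 6 \cdot 2 = 72$ and leaves the cofactor $5$, which is coprime to $b$.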

As is customary, we write $e(x)$ for $e^{2\pi i x}$. We denote the $L_r$ norm of a function $f$ by $|f|_r$. We write $O^*(R)$ to mean a quantity at most $R$ in absolute value. Given a set $S$, we write $1_S$ for its characteristic function:

$$1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{otherwise.} \end{cases}$$

Write $\log^+ x$ for $\max(\log x, 0)$.


2.2 Dirichlet characters and L functions

Let us go over some basic terms. A Dirichlet character $\chi: \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$ with the convention that $\chi(n) = 0$ when $(n, q) \ne 1$. (In other words: $\chi$ is completely multiplicative and periodic modulo $q$, and vanishes on integers not coprime to $q$.) Again by convention, there is a Dirichlet character of modulus $q = 1$, namely, the trivial character $\chi_T: \mathbb{Z} \to \mathbb{C}$ defined by $\chi_T(n) = 1$ for every $n \in \mathbb{Z}$.

If $\chi$ is a character modulo $q$ and $\chi'$ is a character modulo $q'|q$ such that $\chi(n) = \chi'(n)$ for all $n$ coprime to $q$, we say that $\chi'$ induces $\chi$. A character is primitive if it is not induced by any character of smaller modulus. Given a character $\chi$, we write $\chi^*$ for the (uniquely defined) primitive character inducing $\chi$. If a character $\chi \bmod q$ is induced by the trivial character $\chi_T$, we say that $\chi$ is principal and write $\chi_0$ for $\chi$ (provided the modulus $q$ is clear from the context). In other words, $\chi_0(n) = 1$ when $(n, q) = 1$ and $\chi_0(n) = 0$ when $(n, q) \ne 1$.

A Dirichlet L-function $L(s, \chi)$ ($\chi$ a Dirichlet character) is defined as the analytic continuation of $\sum_n \chi(n) n^{-s}$ to the entire complex plane; there is a pole at $s = 1$ if $\chi$ is principal.

A non-trivial zero of $L(s, \chi)$ is any $s \in \mathbb{C}$ such that $L(s, \chi) = 0$ and $0 < \Re(s) < 1$. (In particular, a zero at $s = 0$ is called "trivial", even though its contribution can be a little tricky to work out. The same would go for the other zeros with $\Re(s) = 0$ occurring for $\chi$ non-primitive, though we will avoid this issue by working mainly with $\chi$ primitive.) The zeros that occur at (some) negative integers are called trivial zeros.

The critical line is the line $\Re(s) = 1/2$ in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet L-functions reads: for every Dirichlet character $\chi$, all non-trivial zeros of $L(s, \chi)$ lie on the critical line. Verifiable finite versions of the generalized Riemann hypothesis generally read: for every Dirichlet character $\chi$ of modulus $q \le Q$, all non-trivial zeros of $L(s, \chi)$ with $|\Im(s)| \le f(q)$ lie on the critical line (where $f: \mathbb{Z} \to \mathbb{R}^+$ is some given function).

2.3 Fourier transforms and exponential sums

The Fourier transform on $\mathbb{R}$ is normalized here as follows:

$$\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\, dx.$$

The trivial bound is $|\widehat{f}|_\infty \le |f|_1$. If $f$ is compactly supported (or of fast enough decay as $t \to \pm\infty$) and piecewise continuous, $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$ by integration by parts. Iterating, we obtain that, if $f$ is of fast decay and differentiable $k$ times outside finitely many points, then

$$\widehat{f}(t) = O^*\left(\frac{|\widehat{f^{(k)}}|_\infty}{(2\pi t)^k}\right) = O^*\left(\frac{|f^{(k)}|_1}{(2\pi t)^k}\right). \tag{2.1}$$


Thus, for instance, if $f$ is compactly supported, continuous and piecewise $C^1$, then $\widehat{f}$ decays at least quadratically.

It could happen that $|f^{(k)}|_1 = \infty$, in which case (2.1) is trivial (but not false). In practice, we require $f^{(k)} \in L_1$. In a typical situation, $f$ is differentiable $k$ times except at $x_1, x_2, \dots, x_k$, where it is differentiable only $(k - 2)$ times; the contribution of $x_i$ (say) to $|f^{(k)}|_1$ is then $|\lim_{x\to x_i^+} f^{(k-1)}(x) - \lim_{x\to x_i^-} f^{(k-1)}(x)|$.

The following bound is standard (see, e.g., [Tao14, Lemma 3.1]): for $\alpha \in \mathbb{R}/\mathbb{Z}$ and $f: \mathbb{R} \to \mathbb{C}$ compactly supported and piecewise continuous,

$$\left|\sum_{n\in\mathbb{Z}} f(n) e(\alpha n)\right| \le \min\left(|f|_1 + \frac{1}{2}|f'|_1,\ \frac{\frac{1}{2}|f'|_1}{|\sin(\pi\alpha)|}\right). \tag{2.2}$$

(The first bound follows from $\sum_{n\in\mathbb{Z}} |f(n)| \le |f|_1 + (1/2)|f'|_1$, which, in turn, is a quick consequence of the fundamental theorem of calculus; the second bound is proven by summation by parts.) The alternative bound $(1/4)|f''|_1/|\sin(\pi\alpha)|^2$ given in [Tao14, Lemma 3.1] (for $f$ continuous and piecewise $C^1$) can usually be improved by the following estimate.

Lemma 2.3.1. Let $f : \mathbb{R} \to \mathbb{C}$ be compactly supported, continuous and piecewise $C^1$. Then
\[\left|\sum_{n\in\mathbb{Z}} f(n) e(\alpha n)\right| \le \frac{1}{4}\cdot\frac{|\widehat{f''}|_\infty}{(\sin \pi\alpha)^2} \tag{2.3}\]
for every $\alpha \in \mathbb{R}$.

As usual, the assumption of compact support could easily be relaxed to an assumption of fast decay.

Proof. By the Poisson summation formula,
\[\sum_{n=-\infty}^{\infty} f(n) e(\alpha n) = \sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha).\]
Since $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$,
\[\sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha) = \sum_{n=-\infty}^{\infty} \frac{\widehat{f'}(n - \alpha)}{2\pi i (n - \alpha)} = \sum_{n=-\infty}^{\infty} \frac{\widehat{f''}(n - \alpha)}{(2\pi i (n - \alpha))^2}.\]
By Euler's formula $\pi \cot s\pi = 1/s + \sum_{n=1}^{\infty} (1/(n+s) - 1/(n-s))$,
\[\sum_{n=-\infty}^{\infty} \frac{1}{(n+s)^2} = -(\pi \cot s\pi)' = \frac{\pi^2}{(\sin s\pi)^2}. \tag{2.4}\]
Hence
\[\left|\sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha)\right| \le |\widehat{f''}|_\infty \sum_{n=-\infty}^{\infty} \frac{1}{(2\pi(n - \alpha))^2} = |\widehat{f''}|_\infty \cdot \frac{1}{(2\pi)^2} \cdot \frac{\pi^2}{(\sin \alpha\pi)^2}.\]
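The identity (2.4), on which the proof rests, is easy to sanity-check numerically. The following is only an illustrative sketch (the function names and the truncation point `N` are choices made here, not part of the text): it compares a symmetric partial sum of $\sum_n 1/(n+s)^2$ with $\pi^2/(\sin s\pi)^2$. The neglected tail is of size roughly $2/N$.

```python
import math

def partial_sum(s, N=100_000):
    """Symmetric partial sum of 1/(n + s)^2 over -N <= n <= N (s not an integer)."""
    return sum(1.0 / (n + s) ** 2 for n in range(-N, N + 1))

def closed_form(s):
    """Right-hand side of (2.4): pi^2 / sin(pi s)^2."""
    return (math.pi / math.sin(math.pi * s)) ** 2

if __name__ == "__main__":
    for s in (0.1, 0.25, 0.5):
        print(s, partial_sum(s), closed_form(s))
```

The two columns should agree up to the truncation error of the partial sum.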


The trivial bound $|\widehat{f''}|_\infty \le |f''|_1$, applied to (2.3), recovers the bound in [Tao14, Lemma 3.1]. In order to do better, we will give a tighter bound for $|\widehat{f''}|_\infty$ in Appendix B when $f$ is equal to one of our main smoothing functions ($f = \eta_2$).

Integrals of multiples of $f''$ (in particular, $|f''|_1$ and $\widehat{f''}$) can still be made sense of when $f''$ is undefined at a finite number of points, provided $f$ is understood as a distribution (and $f'$ has finite total variation). This is the case, in particular, for $f = \eta_2$.

When we need to estimate $\sum_n f(n)$ precisely, we will use the Poisson summation formula:
\[\sum_n f(n) = \sum_n \widehat{f}(n).\]
We will not have to worry about convergence here, since we will apply the Poisson summation formula only to compactly supported functions $f$ whose Fourier transforms decay at least quadratically.

2.4 Mellin transforms

The Mellin transform of a function $\phi : (0, \infty) \to \mathbb{C}$ is
\[M\phi(s) = \int_0^\infty \phi(x) x^{s-1}\,dx. \tag{2.5}\]
If $\phi(x)x^{\sigma-1}$ is in $\ell_1$ with respect to $dx$ (i.e., $\int_0^\infty |\phi(x)| x^{\sigma-1}\,dx < \infty$), then the Mellin transform is defined on the line $\sigma + i\mathbb{R}$. Moreover, if $\phi(x)x^{\sigma-1}$ is in $\ell_1$ for $\sigma = \sigma_1$ and for $\sigma = \sigma_2$, where $\sigma_2 > \sigma_1$, then it is easy to see that it is also in $\ell_1$ for all $\sigma \in (\sigma_1, \sigma_2)$, and that, moreover, the Mellin transform is holomorphic on $\{s : \sigma_1 < \Re(s) < \sigma_2\}$. We then say that $\{s : \sigma_1 < \Re(s) < \sigma_2\}$ is a strip of holomorphy for the Mellin transform.

The Mellin transform becomes a Fourier transform (of $\phi(e^{-2\pi v})e^{-2\pi v\sigma}$) by means of the change of variables $x = e^{-2\pi v}$. We thus obtain, for example, that the Mellin transform is an isometry, in the sense that
\[\int_0^\infty |f(x)|^2 x^{2\sigma}\,\frac{dx}{x} = \frac{1}{2\pi}\int_{-\infty}^{\infty} |Mf(\sigma + it)|^2\,dt. \tag{2.6}\]

Recall that, in the case of the Fourier transform, for $|\widehat{f}|_2 = |f|_2$ to hold, it is enough that $f$ be in $\ell_1 \cap \ell_2$. This gives us that, for (2.6) to hold, it is enough that $f(x)x^{\sigma-1}$ be in $\ell_1$ and $f(x)x^{\sigma-1/2}$ be in $\ell_2$ (again, with respect to $dx$, in both cases).

We write $f *_M g$ for the multiplicative, or Mellin, convolution of $f$ and $g$:
\[(f *_M g)(x) = \int_0^\infty f(w)\, g\left(\frac{x}{w}\right) \frac{dw}{w}. \tag{2.7}\]
In general,
\[M(f *_M g) = Mf \cdot Mg \tag{2.8}\]


and
\[M(f \cdot g)(s) = \frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty} Mf(z)\, Mg(s - z)\,dz \qquad \text{[GR94, §17.32]}, \tag{2.9}\]
provided that $z$ and $s - z$ are within the strips on which $Mf$ and $Mg$ (respectively) are well-defined.

We also have several useful transformation rules, just as for the Fourier transform. For example,
\[\begin{aligned} M(f'(t))(s) &= -(s-1)\cdot Mf(s-1),\\ M(t f'(t))(s) &= -s \cdot Mf(s),\\ M((\log t) f(t))(s) &= (Mf)'(s) \end{aligned} \tag{2.10}\]
(as in, e.g., [BBO10, Table 1.11]).

Let
\[\eta_2 = (2 \cdot 1_{[1/2,1]}) *_M (2 \cdot 1_{[1/2,1]}).\]
Since (see, e.g., [BBO10, Table 1.13] or [GR94, §16.43])
\[(M I_{[a,b]})(s) = \frac{b^s - a^s}{s},\]
we see that

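\[M\eta_2(s) = \left(2\cdot\frac{1 - 2^{-s}}{s}\right)^2, \qquad M\eta_4(s) = \left(2\cdot\frac{1 - 2^{-s}}{s}\right)^4. \tag{2.11}\]
(The factors of 2 come from the normalization $2 \cdot 1_{[1/2,1]}$ in the definition of $\eta_2$, together with (2.8); in particular, $M\eta_2(1) = |\eta_2|_1 = 1$.)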
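As a sanity check on the normalization, the Mellin transform of $\eta_2$ can be computed numerically from the definition (2.5). The sketch below is illustrative only: it uses the closed form $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, which is what the multiplicative convolution of $2 \cdot 1_{[1/2,1]}$ with itself works out to, and a plain composite Simpson rule after the substitution $x = e^u$. The function names and the step count are assumptions made for the example. The result is compared with $(M(2\cdot 1_{[1/2,1]})(s))^2 = (2(1-2^{-s})/s)^2$, as given by (2.8) and the formula for $M I_{[a,b]}$.

```python
import cmath
import math

def eta2(t):
    """eta_2(t) = 4 max(log 2 - |log 2t|, 0), supported on [1/4, 1]."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

def mellin_eta2_numeric(s, steps=20_000):
    """Composite Simpson rule for int eta2(e^u) e^{us} du over [log(1/4), 0].

    steps is divisible by 4, so the kink of eta2 at u = log(1/2)
    falls on a panel boundary.
    """
    a, b = math.log(0.25), 0.0
    h = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        u = a + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * eta2(math.exp(u)) * cmath.exp(u * s)
    return total * h / 3

def mellin_eta2_closed(s):
    """Square of the Mellin transform of 2 * 1_[1/2,1]."""
    return (2 * (1 - 2 ** (-s)) / s) ** 2

if __name__ == "__main__":
    for s in (1, 2 + 3j, 0.5 - 1j):
        print(s, mellin_eta2_numeric(s), mellin_eta2_closed(s))
```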

Let $f_z = e^{-zt}$, where $\Re(z) > 0$. Then
\[\begin{aligned}(Mf_z)(s) &= \int_0^\infty e^{-zt} t^{s-1}\,dt = \frac{1}{z^s}\int_0^\infty e^{-zt} (zt)^{s-1} z\,dt\\ &= \frac{1}{z^s}\int_0^{z\infty} e^{-u} u^{s-1}\,du = \frac{1}{z^s}\int_0^\infty e^{-t} t^{s-1}\,dt = \frac{\Gamma(s)}{z^s},\end{aligned}\]
where the next-to-last step holds by contour integration, and the last step holds by the definition of the Gamma function $\Gamma(s)$.

2.5 Bounds on sums of $\mu$ and $\Lambda$

We will need some simple explicit bounds on sums involving the von Mangoldt function $\Lambda$ and the Möbius function $\mu$. In non-explicit work, such sums are usually bounded using the prime number theorem, or rather using the properties of the zeta function $\zeta(s)$ underlying the prime number theorem. Here, however, we need robust, fully explicit bounds valid over just about any range.

For the most part, we will just be quoting the literature, supplemented with some computations when needed. The proofs in the literature are sometimes based on properties of $\zeta(s)$, and sometimes on more elementary facts.


First, let us see some bounds involving $\Lambda$. The following bound can be easily derived from [RS62, (3.23)], supplemented by a quick calculation of the contribution of powers of primes $p < 32$:
\[\sum_{n \le x} \frac{\Lambda(n)}{n} \le \log x. \tag{2.12}\]
We can derive a bound in the other direction from [RS62, (3.21)] (for $x > 1000$, adding the contribution of all prime powers $\le 1000$) and a numerical verification for $x \le 1000$:
\[\sum_{n \le x} \frac{\Lambda(n)}{n} \ge \log x - \log\frac{3}{\sqrt{2}}. \tag{2.13}\]
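A small-range check of (2.12) and (2.13) takes only a few lines. The sketch below is purely illustrative (the helper names, the range `N` and the floating-point tolerance `eps` are choices made here, and ordinary floating point is used rather than interval arithmetic, so it proves nothing). Since the sum is a step function of $x$, the upper bound is tested at integers and the lower bound in the limit $x \to n^-$; the tolerance matters because (2.13) is attained with equality as $x \to 3^-$.

```python
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def check_212_213(N, eps=1e-9):
    """Check (2.12) and (2.13) for all real x in [1, N], up to rounding."""
    c = math.log(3 / math.sqrt(2))
    s = 0.0  # running value of sum_{m <= x} Lambda(m)/m
    for n in range(1, N + 1):
        # On [n-1, n) the sum is constant, so (2.13) is tightest as x -> n^-.
        if n >= 2 and s < math.log(n) - c - eps:
            return False
        s += mangoldt(n) / n
        # The sum jumps at x = n, so (2.12) is tightest exactly at integers.
        if s > math.log(n) + eps:
            return False
    return True

if __name__ == "__main__":
    print(check_212_213(5000))
```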

We also use the following older bounds:

1. By the second table in [RR96, p. 423], supplemented by a computation for $2 \cdot 10^6 \le y \le 4 \cdot 10^6$,
\[\sum_{n \le y} \Lambda(n) \le 1.0004\,y \tag{2.14}\]
for $y \ge 2 \cdot 10^6$.

2. \[\sum_{n \le y} \Lambda(n) < 1.03883\,y \tag{2.15}\]
for every $y > 0$ [RS62, Thm. 12].
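The constant in (2.15) is nearly sharp at small $y$, which is easy to see by tabulating $\psi(y)/y$, where $\psi(y) = \sum_{n \le y} \Lambda(n)$. The following sketch is only an illustration in floating point (function names and the range are assumptions of the example); it returns the largest value of $\psi(n)/n$ over integers $n \le N$ together with the $n$ at which it occurs.

```python
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def max_psi_ratio(N):
    """Largest value of psi(n)/n for 1 <= n <= N, and where it is attained."""
    psi, best, where = 0.0, 0.0, 1
    for n in range(1, N + 1):
        psi += mangoldt(n)
        if psi / n > best:
            best, where = psi / n, n
    return best, where

if __name__ == "__main__":
    print(max_psi_ratio(5000))
```

Since $\psi(y)/y$ decreases between consecutive integers, checking integers suffices for the range covered.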

For all $y > 663$,
\[\sum_{n \le y} \Lambda(n)\, n < 1.03884\,\frac{y^2}{2}, \tag{2.16}\]
where we use (2.15) and partial summation for $y > 200000$, and a computation for $663 < y \le 200000$. Using instead the second table in [RR96, p. 423], together with computations for small $y < 10^7$ and partial summation, we get that
\[\sum_{n \le y} \Lambda(n)\, n < 1.0008\,\frac{y^2}{2} \tag{2.17}\]
for $y > 1.6 \cdot 10^6$.

Similarly,
\[\sum_{n \le y} \frac{\Lambda(n)}{\sqrt{n}} < 2 \cdot 1.0004\,\sqrt{y} \tag{2.18}\]
for all $y \ge 1$.

It is also true that
\[\sum_{y/2 < p \le y} (\log p)^2 \le \frac{1}{2}\, y (\log y) \tag{2.19}\]


for $y \ge 117$: this holds for $y \ge 2 \cdot 758699$ by [RS75, Cor. 2] (applied to $x = y$, $x = y/2$ and $x = 2y/3$) and for $117 \le y < 2 \cdot 758699$ by direct computation.

Now let us see some estimates on sums involving $\mu$. The situation here is less satisfactory than for sums involving $\Lambda$. The main reason is that the complex-analytic approach to estimating $\sum_{n \le N} \mu(n)$ would involve $1/\zeta(s)$ rather than $\zeta'(s)/\zeta(s)$, and thus strong explicit bounds on the residues of $1/\zeta(s)$ would be needed. Thus, explicit estimates on sums involving $\mu$ are harder to obtain than estimates on sums involving $\Lambda$. This is so even though analytic number theorists are generally used (from the habit of non-explicit work) to seeing the estimation of one kind of sum or the other as essentially the same task.

Fortunately, in the case of sums of the type $\sum_{n \le x} \mu(n)/n$ for $x$ arbitrary (a type of sum that will be rather important for us), all we need is a saving of $(\log n)$ or $(\log n)^2$ on the trivial bound. This is provided by the following.

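1. (Granville-Ramaré [GR96], Lemma 10.2)
\[\left|\sum_{\substack{n \le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n}\right| \le 1 \tag{2.20}\]
for all $x$, $q \ge 1$.

2. (Ramaré [Ram13]; cf. El Marraki [EM95], [EM96])
\[\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \frac{0.03}{\log x} \tag{2.21}\]
for $x \ge 11815$.

3. (Ramaré [Ramb])
\[\sum_{\substack{n \le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n} = O^*\left(\frac{1}{\log x/q}\cdot\frac{4}{5}\,\frac{q}{\phi(q)}\right) \tag{2.22}\]
for all $x$ and all $q \le x$;
\[\sum_{\substack{n \le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n} \log\frac{x}{n} = O^*\left(1.00303\,\frac{q}{\phi(q)}\right) \tag{2.23}\]
for all $x$ and all $q$.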

Improvements on these bounds would lead to improvements on type I estimates, but not in what are the worst terms overall at this point.

A computation carried out by the author has proven the following inequality for all real $x \le 10^{12}$:
\[\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \sqrt{\frac{2}{x}}. \tag{2.24}\]


The computation was conducted rigorously by means of interval arithmetic. For the sake of verification, we record that
\[5.42625 \cdot 10^{-8} \le \sum_{n \le 10^{12}} \frac{\mu(n)}{n} \le 5.42898 \cdot 10^{-8}.\]
Computations also show that the stronger bound
\[\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \frac{1}{2\sqrt{x}}\]
holds for all $3 \le x \le 7727068587$, but not for $x = 7727068588 - \varepsilon$.

Earlier, numerical work carried out by Olivier Ramaré [Ram14] had shown that (2.24) holds for all $x \le 10^{10}$.
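For orientation, both inequalities can be tested on a very small range with a few lines of code. This is a floating-point sketch only, not the rigorous interval-arithmetic computation described above; the function names, the range and the tolerance are assumptions of the example. Because the sum is constant on $[n-1, n)$ while the bounds decrease, each interval is tested in the limit $x \to n^-$ (the bound (2.24) is attained with equality as $x \to 2^-$, hence the tolerance).

```python
import math

def mobius(n):
    """Moebius function mu(n), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def check_mu_bounds(N, eps=1e-12):
    """Check (2.24) on [1, N] and the bound 1/(2 sqrt x) on [3, N]."""
    m = 0.0  # running value of sum_{k <= x} mu(k)/k
    for n in range(1, N + 1):
        # m is the value of the sum on [n-1, n); test it as x -> n^-.
        if n >= 2 and abs(m) > math.sqrt(2 / n) + eps:
            return False
        if n >= 4 and abs(m) > 1 / (2 * math.sqrt(n)) + eps:
            return False
        m += mobius(n) / n
    # Value at the right endpoint x = N.
    return abs(m) <= min(math.sqrt(2 / N), 1 / (2 * math.sqrt(N))) + eps

if __name__ == "__main__":
    print(check_mu_bounds(5000))
```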

2.6 Interval arithmetic and the bisection method

Interval arithmetic has as its basic data type intervals of the form $I = [a/2^\ell, b/2^\ell]$, where $a, b, \ell \in \mathbb{Z}$ and $a \le b$. Say we have a real number $x$, and we want to know $\sin(x)$. In general, we cannot represent $x$ in a computer, in part because it may have no finite description. The best we can do is to construct an interval of the form $I = [a/2^\ell, b/2^\ell]$ in which $x$ is contained.

What we ask of a routine in an interval-arithmetic package is to construct an interval $I' = [a'/2^{\ell'}, b'/2^{\ell'}]$ in which $\sin(I)$ is contained. (In practice, this is done partly in software, by means of polynomial approximations to $\sin$ with precise error terms, and partly in hardware, by means of an efficient usage of rounding conventions.) This gives us, in effect, a value for $\sin(x)$ (namely, $(a' + b')/2^{\ell'+1}$) and a bound on the error term (namely, $(b' - a')/2^{\ell'+1}$).

There are several implementations of interval arithmetic available. We will almost always use D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas. (At one point, we will use the PROFIL/BIAS interval arithmetic package [Knu99], since it underlies the VNODE-LP [Ned06] package, which we use to bound an integral.)

The bisection method is a particularly simple method for finding maxima and minima of functions, as well as roots. It combines rather nicely with interval arithmetic, which makes the method rigorous. We follow an implementation based on [Tuc11, §5.2]. Let us go over the basic ideas.

Let us use the bisection method to find the minima (say) of a function $f$ on a compact interval $I_0$. (If the interval is non-compact, we generally apply the bisection method to a compact sub-interval and use other tools, e.g., power-series expansions, in the complement.) The method proceeds by splitting an interval into two repeatedly, discarding the halves where the minimum cannot be found. More precisely, if we implement it by interval arithmetic, it proceeds as follows. First, in an optional initial step, we subdivide (if necessary) the interval $I_0$ into smaller intervals $I_k$ to which the algorithm will actually be applied. For each $k$, interval arithmetic gives us a lower bound $r_k^-$ and an upper bound $r_k^+$ on $f(x)$, $x \in I_k$; here $r_k^-$ and $r_k^+$ are both of the form $a/2^\ell$, $a, \ell \in \mathbb{Z}$. Let $m_0$ be the minimum of $r_k^+$ over all $k$. We can discard all the intervals $I_k$ for which $r_k^- > m_0$. Then we apply the main procedure: starting with $i = 1$, split each surviving interval into two equal halves, recompute the lower and upper bound on each half, define $m_i$, as before, to be the minimum of all upper bounds, and discard again the intervals on which the lower bound is larger than $m_i$; increase $i$ by 1. We repeat the main procedure as often as needed. In the end, we obtain that the minimum is no smaller than the minimum of the lower bounds (call them $(r^{(i)})_k^-$) on all surviving intervals $I_k^{(i)}$. Of course, we also obtain that the minimum (or minima, if there is more than one) must lie in one of the surviving intervals.

It is easy to see how the same method can be applied (with a trivial modification) to find maxima, or (with very slight changes) to find the roots of a real-valued function on a compact interval.
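The procedure just described can be sketched in a few lines. What follows is a toy illustration under stated assumptions, not the implementation of [Pla11] or [Tuc11]: the `Interval` class, the test function $x^4 - 3x^2 + x$ and all names are made up for the example, and exact rational arithmetic (Python's `fractions.Fraction`) stands in for hardware rounding. Since we start from integer endpoints and only halve, every endpoint is of the form $a/2^\ell$.

```python
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with exact rational (here: dyadic) endpoints."""
    def __init__(self, lo, hi):
        self.lo, self.hi = Fraction(lo), Fraction(hi)
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

def f(x):
    """Interval extension of the toy function x^4 - 3x^2 + x."""
    return x * x * x * x - Interval(3, 3) * x * x + x

def bisect_min(func, lo, hi, rounds):
    """Return (lower bound, upper bound, surviving intervals) for min func on [lo, hi]."""
    boxes = [Interval(lo, hi)]
    for _ in range(rounds):
        halves = []
        for box in boxes:
            mid = (box.lo + box.hi) / 2
            halves += [Interval(box.lo, mid), Interval(mid, box.hi)]
        ranges = [func(h) for h in halves]
        m = min(r.hi for r in ranges)      # the true minimum is at most m
        boxes = [h for h, r in zip(halves, ranges) if r.lo <= m]
    ranges = [func(b) for b in boxes]
    return min(r.lo for r in ranges), min(r.hi for r in ranges), boxes

if __name__ == "__main__":
    lower, upper, boxes = bisect_min(f, -2, 2, 16)
    print(float(lower), float(upper), len(boxes))
```

The returned pair brackets the true minimum, and the surviving intervals cluster around the minimizer. The naive interval extension overestimates ranges, which is why several neighbouring intervals survive each round.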


Part I

Minor arcs


Chapter 3

Introduction

The circle method expresses the number of solutions to a given problem in terms of exponential sums. Let $\eta : \mathbb{R}^+ \to \mathbb{C}$ be a smooth function, $\Lambda$ the von Mangoldt function (defined as in (1.5)) and $e(t) = e^{2\pi i t}$. The estimation of exponential sums of the type
\[S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{3.1}\]

where $\alpha \in \mathbb{R}/\mathbb{Z}$, already lies at the basis of Hardy and Littlewood's approach to the ternary Goldbach problem by means of the circle method [HL22]. The division of the circle $\mathbb{R}/\mathbb{Z}$ into "major arcs" and "minor arcs" goes back to Hardy and Littlewood's development of the circle method for other problems. As they themselves noted, assuming GRH means that, for the ternary Goldbach problem, all of the circle can be, in effect, subdivided into major arcs; that is, under GRH, (3.1) can be estimated with major-arc techniques for $\alpha$ arbitrary. They needed to make such an assumption precisely because they did not yet know how to estimate $S_\eta(\alpha, x)$ on the minor arcs.

Minor-arc techniques for Goldbach's problem were first developed by Vinogradov [Vin37]. These techniques make it possible to work without GRH. The main obstacle to a full proof of the ternary Goldbach conjecture since then has been that, in spite of gradual improvements, minor-arc bounds have simply not been strong enough.

As in all work to date, our aim will be to give useful upper bounds on (3.1) for $\alpha$ in the minor arcs, rather than the precise estimates that are typical of the major-arc case. We will have to give upper bounds that are qualitatively stronger than those known before. (In Part III, we will also show how to use them more efficiently.)

Our main challenge will be to give a sufficiently good upper bound whenever $q$ is larger than a constant $r$. Here "sufficiently good" means "smaller than the trivial bound divided by a large constant, and getting even smaller quickly as $q$ grows". Our bound must also be good for $\alpha = a/q + \delta/x$, where $q < r$ but $\delta$ is large. (Such an $\alpha$ may be said to lie on the tail ($\delta$ large) of a major arc ($q$ small).)

Of course, all expressions must be explicit, and all constants in the leading terms of the bound must be small. Still, the main requirement is a qualitative one. For instance, we know in advance that a single factor of $\log x$ would be the end of us. That is, we know that, if there is a single term of the form, say, $(x \log x)/q$, and the trivial bound is about $x$, we are lost: $(x \log x)/q$ is greater than $x$ for $x$ large and $q$ constant.

The quality of the results here is due to several new ideas of general applicability. In particular, §5.1 introduces a way to obtain cancellation from Vaughan's identity. Vaughan's identity is a two-log gambit, in that it introduces two convolutions (each of them at a cost of log) and offers a great deal of flexibility in compensation. One of the ideas presented here is that at least one of two logs can be successfully recovered after having been given away in the first stage of the proof. This reduces the cost of the use of this basic identity in this and, presumably, many other problems.

There are several other improvements that make a qualitative difference; see the discussions at the beginning of §4 and §5. Considering smoothed sums (now a common idea) also helps. (Smooth sums here go back to Hardy-Littlewood [HL22], both in the general context of the circle method and in the context of Goldbach's ternary problem. In recent work on the problem, they reappear in [Tao14].)

3.1 Results

The main bound we are about to see is essentially proportional to $((\log q)/\sqrt{\phi(q)}) \cdot x$. The term $\delta_0$ serves to improve the bound when we are on the tail of an arc.

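Theorem 3.1.1. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Let $S_\eta(\alpha, x)$ be as in (3.1), with $\eta$ defined in (3.4). Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le x^{1/3}/6$, then
\[|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36x^{5/6}, \tag{3.2}\]
where
\[\begin{aligned}\delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415,\\ L_{x,t,q} &= \frac{q}{\phi(q)}\left(\frac{13}{4}\log t + 7.82\right) + 13.66\log t + 37.55.\end{aligned} \tag{3.3}\]
If $q > x^{1/3}/6$, then
\[|S_\eta(\alpha, x)| \le 0.276x^{5/6}(\log x)^{3/2} + 1234x^{2/3}\log x.\]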
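To get a feel for the size of the constants, the quantities in (3.2) and (3.3) can be evaluated directly. The sketch below simply transcribes the formulas; the function names are made up for the example, $\phi(q)$ is passed in by hand, and the sample values $q = 510510 = 2\cdot3\cdot5\cdot7\cdot11\cdot13\cdot17$, $\phi(q) = 92160$, $\delta = 0$, $x = 10^{27}$ are chosen here only for illustration (they are not meant to reproduce any table entry).

```python
import math

def R(x, t):
    """R_{x,t} as in (3.3)."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def L(t, q, phi_q):
    """L_{x,t,q} as in (3.3); the formula does not involve x explicitly."""
    return (q / phi_q) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55

def minor_arc_bound_over_x(x, q, phi_q, delta=0.0):
    """Right side of (3.2) divided by x; only meaningful for q <= x^(1/3)/6."""
    delta0 = max(2.0, abs(delta) / 4)
    t = delta0 * q
    return ((R(x, t) * math.log(t) + 0.5) / math.sqrt(delta0 * phi_q)
            + 2.5 / math.sqrt(t)
            + 2 / t * L(t, q, phi_q)
            + 3.36 * x ** (-1 / 6))

if __name__ == "__main__":
    print(R(1e25, 5e5))
    print(minor_arc_bound_over_x(1e27, 510510, 92160))
```

The first printed value is the factor $R_{x,\delta_0 q}$ for $x = 10^{25}$, $\delta_0 q = 5\cdot 10^5$.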

The factor $R_{x,t}$ is small in practice; for instance, for $x = 10^{25}$ and $\delta_0 q = 5 \cdot 10^5$ (typical "difficult" values), $R_{x,\delta_0 q}$ equals $0.59648\dots$.

The classical choice¹ for $\eta$ in (3.1) is $\eta(t) = 1$ for $t \le 1$, $\eta(t) = 0$ for $t > 1$, which, of course, is not smooth, or even continuous. We use
\[\eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0), \tag{3.4}\]
as in Tao [Tao14], in part for purposes of comparison. (This is the multiplicative convolution of the characteristic function of an interval with itself.) Nearly all work should be applicable to any other sufficiently smooth function $\eta$ of fast decay. It is important that $\widehat{\eta}$ decay at least quadratically.

¹ Or, more precisely, the choice made by Vinogradov and followed by most of the literature since him. Hardy and Littlewood [HL22] worked with $\eta(t) = e^{-t}$.

We are not forced to use the same smoothing function as in Part II, and we do not. As was explained in the introduction, the simple technique (1.40) allows us to work with one smoothing function on the major arcs and with another one on the minor arcs.

3.2 Comparison to earlier work

Table 3.1 compares the bounds for the ratio $|S_\eta(a/q, x)|/x$ given by this paper and by [Tao14, Thm. 1.3] for $x = 10^{27}$ and different values of $q$. We are comparing worst cases: $\phi(q)$ as small as possible ($q$ divisible by $2 \cdot 3 \cdot 5 \cdots$) in the result here, and $q$ divisible by 4 (implying $4\alpha \sim a/(q/4)$) in Tao's result. The main term in the result in this paper improves slowly with increasing $x$; the results in [Tao14] worsen slowly with increasing $x$. The qualitative gain with respect to the main term in [Tao14, (1.10)] is in the order of $\log(q)\sqrt{\phi(q)/q}$. Notice also that the bounds in [Tao14] are not log-free; in [Tao14, (1.10)], there is a term proportional to $x(\log x)^2/q$. This becomes larger than the trivial bound $x$ for $x$ very large.

The results in [DR01] are unfortunately worse than the trivial bound in the range covered by Table 3.1. Ramaré's results ([Ram10, Thm. 3], [Ramc, Thm. 6]) are not applicable within the range, since neither of the conditions $\log q \le (1/50)(\log x)^{1/3}$, $q \le x^{1/48}$ is satisfied. Ramaré's bound in [Ramc, Thm. 6] is
\[\left|\sum_{x < n \le 2x} \Lambda(n) e(an/q)\right| \le 13000\,\frac{\sqrt{q}}{\phi(q)}\,x \tag{3.5}\]
for $20 \le q \le x^{1/48}$. We should underline that, while both the constant 13000 and the condition $q \le x^{1/48}$ keep (3.5) from being immediately useful in the present context, (3.5) is asymptotically better than the results here as $q \to \infty$. (Indeed, qualitatively speaking, the form of (3.5) is the best one can expect from results derived by the family of methods stemming from Vinogradov's work.) There is also unpublished work by Ramaré (ca. 1993) with different constants for $q \ll (\log x/\log\log x)^4$.

3.3 Basic setup

In the minor-arc regime, the first step in estimating an exponential sum on the primes generally consists in the application of an identity expressing the von Mangoldt function $\Lambda(n)$ in terms of a sum of convolutions of other functions.

3.3.1 Vaughan's identity

We recall Vaughan's identity [Vau77b]:
\[\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}, \tag{3.6}\]

q_0            |S_η(a/q,x)|/x, HH    |S_η(a/q,x)|/x, Tao
10^5           0.04661               0.34475
1.5 · 10^5     0.03883               0.28836
2.5 · 10^5     0.03098               0.23194
5 · 10^5       0.02297               0.17416
7.5 · 10^5     0.01934               0.14775
10^6           0.01756               0.13159
10^7           0.00690               0.05251

Table 3.1: Worst-case upper bounds on $x^{-1}|S_\eta(a/2q, x)|$ for $q \ge q_0$, $|\delta| \le 8$, $x = 10^{27}$. The trivial bound is 1.

where 1 is the constant function 1 and where we write
\[f_{\le z}(n) = \begin{cases} f(n) & \text{if } n \le z,\\ 0 & \text{if } n > z,\end{cases} \qquad f_{>z}(n) = \begin{cases} 0 & \text{if } n \le z,\\ f(n) & \text{if } n > z.\end{cases}\]
Here $f * g$ denotes the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$. We can set the values of $U$ and $V$ however we wish.

Vaughan's identity is essentially a consequence of the Möbius inversion formula
\[(1 * \mu)(n) = \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{otherwise.}\end{cases} \tag{3.7}\]

Indeed, by (3.7),
\[\begin{aligned}\Lambda_{>V}(n) &= \sum_{dm|n} \mu(d)\Lambda_{>V}(m)\\ &= \sum_{dm|n} \mu_{\le U}(d)\Lambda_{>V}(m) + \sum_{dm|n} \mu_{>U}(d)\Lambda_{>V}(m).\end{aligned}\]
Applying to this the trivial equality $\Lambda_{>V} = \Lambda - \Lambda_{\le V}$, as well as the simple fact that $1 * \Lambda = \log$, we obtain that
\[\Lambda_{>V}(n) = \sum_{d|n} \mu_{\le U}(d)\log(n/d) - \sum_{dm|n} \mu_{\le U}(d)\Lambda_{\le V}(m) + \sum_{dm|n} \mu_{>U}(d)\Lambda_{>V}(m).\]
By $\Lambda = \Lambda_{>V} + \Lambda_{\le V}$, we conclude that Vaughan's identity (3.6) holds.

Applying Vaughan's identity, we easily get that, for any function $\eta : \mathbb{R} \to \mathbb{R}$, any completely multiplicative function $f : \mathbb{Z}^+ \to \mathbb{C}$ and any $x > 0$, $U, V \ge 0$,
\[\sum_n \Lambda(n) f(n) e(\alpha n) \eta(n/x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty}, \tag{3.8}\]


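where
\[\begin{aligned} S_{I,1} &= \sum_{m \le U} \mu(m) f(m) \sum_n (\log n)\, e(\alpha m n) f(n) \eta(mn/x),\\ S_{I,2} &= \sum_{d \le V} \Lambda(d) f(d) \sum_{m \le U} \mu(m) f(m) \sum_n e(\alpha d m n) f(n) \eta(dmn/x),\\ S_{II} &= \sum_{m > U} f(m) \Bigg(\sum_{\substack{d > U\\ d|m}} \mu(d)\Bigg) \sum_{n > V} \Lambda(n) e(\alpha m n) f(n) \eta(mn/x),\\ S_{0,\infty} &= \sum_{n \le V} \Lambda(n) e(\alpha n) f(n) \eta(n/x).\end{aligned} \tag{3.9}\]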
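Vaughan's identity (3.6), which underlies the decomposition (3.8), can be checked numerically term by term. The sketch below is only an illustration: the helper names and the sample values of $U$ and $V$ are assumptions of the example, and divisors are found by brute force. It evaluates the right side of (3.6), with the second term carrying a minus sign as in the derivation above, and compares it with $\Lambda(n)$.

```python
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def mobius(n):
    """Moebius function mu(n), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def vaughan_rhs(n, U, V):
    """Right side of Vaughan's identity (3.6) evaluated at n."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    total = mangoldt(n) if n <= V else 0.0        # Lambda_{<=V}(n)
    for d in divisors:
        mu_d = mobius(d)
        if mu_d == 0:
            continue
        if d <= U:
            total += mu_d * math.log(n / d)       # (mu_{<=U} * log)(n)
        for m in divisors:
            if (n // d) % m:                      # we need d*m | n
                continue
            lam_m = mangoldt(m)
            if d <= U and m <= V:
                total -= mu_d * lam_m             # -(mu_{<=U} * Lambda_{<=V} * 1)(n)
            elif d > U and m > V:
                total += mu_d * lam_m             # (mu_{>U} * Lambda_{>V} * 1)(n)
    return total

if __name__ == "__main__":
    worst = max(abs(vaughan_rhs(n, 5, 7) - mangoldt(n)) for n in range(1, 200))
    print(worst)
```

The printed discrepancy should be at the level of floating-point rounding for any choice of $U, V \ge 0$.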

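We will use the function
\[f(n) = \begin{cases} 1 & \text{if } \gcd(n, v) = 1,\\ 0 & \text{otherwise,}\end{cases} \tag{3.10}\]
where $v$ is a small, positive, square-free integer. (Our final choice will be $v = 2$.) Then
\[S_\eta(\alpha, x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty} + S_{0,v}, \tag{3.11}\]
where $S_\eta(\alpha, x)$ is as in (3.1) and
\[S_{0,v} = \sum_{n|v} \Lambda(n) e(\alpha n) \eta(n/x).\]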

The sums $S_{I,1}$, $S_{I,2}$ are called "of type I", the sum $S_{II}$ is called "of type II" (or bilinear). (The not-all-too colorful nomenclature goes back to Vinogradov.) The sum $S_{0,\infty}$ is in general negligible; for our later choice of $V$ and $\eta$, it will be in fact 0. The sum $S_{0,v}$ will be negligible as well.

As we already discussed in the introduction, Vaughan's identity is highly flexible (in that we can choose $U$ and $V$ at will) but somewhat inefficient in practice (in that a trivial estimate for the right side of (3.11) is actually larger than a trivial estimate for the left side of (3.11)). Some of our work will consist in regaining part of what is given up when we apply Vaughan's identity.

3.3.2 An alternative route

There is an alternative route, namely, to use a less sacrificial, though also more inflexible, identity. While this was not, in the end, the route that was followed, let us nevertheless discuss it in some detail, in part so that we can understand to what extent it was, in retrospect, viable, and in part so as to see how much of the work we will undertake is really more or less independent of the particular identity we choose.


Since $-\zeta'(s)/\zeta(s) = \sum_n \Lambda(n) n^{-s}$ and
\[\begin{aligned}\left(\frac{\zeta'(s)}{\zeta(s)}\right)^{(2)} &= \left(\frac{\zeta''(s)}{\zeta(s)} - \frac{(\zeta'(s))^2}{\zeta(s)^2}\right)'\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - \frac{3\zeta''(s)\zeta'(s)}{\zeta(s)^2} + 2\left(\frac{\zeta'(s)}{\zeta(s)}\right)^3\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - 3\left(\frac{\zeta'(s)}{\zeta(s)}\right)' \cdot \frac{\zeta'(s)}{\zeta(s)} - \left(\frac{\zeta'(s)}{\zeta(s)}\right)^3,\end{aligned} \tag{3.12}\]
we can see, comparing coefficients, that
\[\Lambda \cdot \log^2 = \mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda, \tag{3.13}\]
as was stated by Bombieri in [Bom76].

Here the term $\mu * \log^3$ is of the same kind as the term $\mu_{\le U} * \log$ we have to estimate if we use Vaughan's identity, though the fact that there is no truncation at $U$ means that one of the error terms will get larger; it will be proportional to $x$, in fact, if we sum from 1 to $x$. The trivial upper bound on the sum of $\Lambda \cdot \log^2$ from 1 to $x$ is $x(\log x)^2$; thus, an error term of size $x$ is barely acceptable.

In general, when we have a double or triple sum, we are not very good at getting better than trivial bounds in ranges in which all but one of the variables are very small. This is the source of the large error term that appears in the sum involving $\mu * \log^3$, because we are no longer truncating as for $\mu_{\le U} * \log$. It will also be the source of other large error terms, including one that would be too large, namely, the one coming from the term $(\Lambda \cdot \log) * \Lambda$ when the variable of $\Lambda \cdot \log$ is large and that of $\Lambda$ is small. (The trivial bound on that range is $x \log x$.)

We avoid this problem by substituting the identity $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$ inside (3.13):
\[\Lambda \cdot \log^2 = \mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda. \tag{3.14}\]
(We could also have got this directly from the next-to-last line in (3.12).) When the variable of $\Lambda$ in $(\mu * \log^2) * \Lambda$ is small, the variable of $\mu * \log^2$ is large, and we can estimate the resulting term using the same techniques as for $\mu * \log^3$.
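Both identities (3.13) and (3.14) are identities between arithmetic functions, so they can be verified value by value. The following sketch does so for small $n$; it is an illustration only (the helper names, the brute-force Dirichlet convolution and the range are assumptions of the example, and floating point is used throughout).

```python
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def mobius(n):
    """Moebius function mu(n), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def dirichlet(f, g, N):
    """Dirichlet convolution of two lists indexed 0..N (index 0 unused)."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d]:
            for m in range(1, N // d + 1):
                h[d * m] += f[d] * g[m]
    return h

def check_identities(N):
    """Largest deviation in (3.13) and in (3.14) over 1 <= n <= N."""
    lam = [0.0] + [mangoldt(n) for n in range(1, N + 1)]
    mu = [0.0] + [float(mobius(n)) for n in range(1, N + 1)]
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    log2 = [v ** 2 for v in log]
    log3 = [v ** 3 for v in log]
    lam_log = [a * b for a, b in zip(lam, log)]
    mu_log3 = dirichlet(mu, log3, N)
    lamlog_lam = dirichlet(lam_log, lam, N)
    mulog2_lam = dirichlet(dirichlet(mu, log2, N), lam, N)
    lam3 = dirichlet(dirichlet(lam, lam, N), lam, N)
    err_313 = err_314 = 0.0
    for n in range(1, N + 1):
        lhs = lam[n] * log2[n]
        err_313 = max(err_313, abs(lhs - (mu_log3[n] - 3 * lamlog_lam[n] - lam3[n])))
        err_314 = max(err_314, abs(lhs - (mu_log3[n] - 3 * mulog2_lam[n] + 2 * lam3[n])))
    return err_313, err_314

if __name__ == "__main__":
    print(check_identities(300))
```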

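It is easy to see that we can, in fact, mix (3.13) and (3.14):
\[\Lambda \cdot \log^2 = \mu * \log^3 - 3\left((\Lambda \cdot \log) * \Lambda_{>V} + (\mu * \log^2) * \Lambda_{\le V}\right) + \left(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\right) \tag{3.15}\]
for $V$ arbitrary. Note here that there is some cancellation in the last term: writing
\[F_{3,V}(n) = \left(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\right)(n), \tag{3.16}\]
we can check easily that, for $n = p_1 p_2 p_3$ square-free, the value of $F_{3,V}(n)$ is $(12 - 6k)\log p_1 \log p_2 \log p_3$, where $k$ is the number of primes $p_i > V$; that is, for $p_1 < p_2 < p_3$,
\[F_{3,V}(n) = \begin{cases} -6\log p_1 \log p_2 \log p_3 & \text{if all } p_i > V,\\ 0 & \text{if } p_1 \le V < p_2 < p_3,\\ 6\log p_1 \log p_2 \log p_3 & \text{if } p_1 < p_2 \le V < p_3,\\ 12\log p_1 \log p_2 \log p_3 & \text{if all } p_i \le V\end{cases}\]
(the last case cannot occur when $V^3 < n$). In contrast, for $n$ square-free, $-\Lambda * \Lambda * \Lambda(n)$ is $-6\log p_1 \log p_2 \log p_3$ if $n$ is of the form $p_1 p_2 p_3$, and 0 otherwise.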
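The values of $F_{3,V}$ on products of three distinct primes are easy to tabulate by brute force. In the sketch below (an illustration only; function names and the sample primes are choices made here), the weight $2$ or $-1$ is attached to the first factor of each ordered factorization $n = abc$, exactly as in (3.16). The output is normalized by $\log p_1 \log p_2 \log p_3$, so it depends only on how many of the three primes exceed $V$: an ordered triple contributes $2$ when its first prime is $\le V$ and $-1$ otherwise, and each prime comes first in two of the six orderings.

```python
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def F3V(n, V):
    """(-Lambda_{>V} * Lambda * Lambda + 2 Lambda_{<=V} * Lambda * Lambda)(n)."""
    total = 0.0
    for a in range(1, n + 1):
        if n % a:
            continue
        for b in range(1, n // a + 1):
            if (n // a) % b:
                continue
            c = n // (a * b)
            weight = 2 if a <= V else -1
            total += weight * mangoldt(a) * mangoldt(b) * mangoldt(c)
    return total

def normalized_F3V(p1, p2, p3, V):
    """F_{3,V}(p1 p2 p3) divided by log p1 log p2 log p3 (distinct primes)."""
    return F3V(p1 * p2 * p3, V) / (math.log(p1) * math.log(p2) * math.log(p3))

if __name__ == "__main__":
    for primes in ((7, 11, 13), (3, 7, 11), (2, 3, 7), (2, 3, 5)):
        print(primes, round(normalized_F3V(*primes, V=5), 9))
```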

We may find it useful to take aside two large terms that may need to be bounded trivially, namely, $\mu * \log^3_{\le u}$ and $(\Lambda \cdot \log)_{\le u} * \Lambda_{>V}$, where $u$ will be a small parameter. (We can let, for instance, $u = 3$.) We conclude that
\[\Lambda \cdot \log^2 = F_{I,1,u}(n) - 3F_{I,2,V,u}(n) - 3F_{II,V,u}(n) + F_{3,V}(n) + F_{0,V,u}(n), \tag{3.17}\]
where
\[\begin{aligned} F_{I,1,u} &= \mu * \log^3_{>u},\\ F_{I,2,V,u} &= (\mu * \log^2) * \Lambda_{\le V},\\ F_{II,V,u}(n) &= (\Lambda \cdot \log)_{>u} * \Lambda_{>V},\\ F_{0,V,u}(n) &= \mu * \log^3_{\le u} - 3(\Lambda \cdot \log)_{\le u} * \Lambda_{>V},\end{aligned}\]
and $F_{3,V}$ is as in (3.16).

In the bulk of the present work (in particular, in all steps that are part of the proof of Theorem 3.1.1 or the Main Theorem) we will use Vaughan's identity, rather than (3.17). This choice was made while the proof was still underway; it was due mainly to back-of-the-envelope estimates that showed that the error terms could be too large if (3.14) was used. Of course, this might have been the case with Vaughan's identity as well, but the fact that the parameters $U$, $V$ there have a large effect on the outcome meant that one could hope to improve on insufficient estimates in part by adjusting $U$ and $V$, without losing all previous work. (This is what was meant by the "flexibility" of Vaughan's identity.)

The question remains: can one prove ternary Goldbach using (3.17) rather than Vaughan's identity? This seems likely. If so, which proof would be more complicated? This is not clear.

There are large parts of the work that are essentially the same in both cases:

• estimates for sums involving $\mu_{\le U} * \log^k$ ("type I"),

• estimates for sums involving $\Lambda_{>u} * \Lambda_{>V}$ and the like ("type II").

Trilinear sums, i.e., sums involving $\Lambda * \Lambda * \Lambda$, can be estimated much like bilinear sums, i.e., sums involving $\Lambda * \Lambda$.

There are also challenges that appear only for Vaughan's identity and others that appear only for (3.17). An example of a challenge that is successfully faced in the main proof, but does not appear if (3.17) is used, consists in bounding sums of type
\[\sum_{U < m \le x/W} \Bigg(\sum_{\substack{d > U\\ d|m}} \mu(d)\Bigg)^2.\]
(In §5.1, we will be able to bound sums of this type by a constant times $x/W$.) Likewise, large tail terms that have to be estimated trivially seem unavoidable in (3.17). (The choice of a parameter $u > 1$ as above is meant to alleviate the problem.)


In the end, losing a factor of about $\log x/UV$ seems inevitable when one uses Vaughan's identity, but not when one uses (3.17). Another reason why a full treatment based on (3.17) would also be worthwhile is that it is a somewhat less familiar and arguably under-used identity, and deserves more exploration. With these comments, we close the discussion of (3.17); we will henceforth use Vaughan's identity.

Chapter 4

Type I sums

Here we must bound sums of the basic type
\[\sum_{m \le D} \mu(m) \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right)\]
and variations thereof. There are three main improvements in comparison to standard treatments:

1. The terms with $m$ divisible by $q$ get taken out and treated separately by analytic means. This all but eliminates what would otherwise be the main term.

2. The other terms get handled by improved estimates on trigonometric sums. For large $m$, the improvements have a substantial total effect: more than a constant factor is gained.

3. The "error" term $\delta/x = \alpha - a/q$ is used to our advantage. This happens both through the Poisson summation formula and through the use of two alternative approximations to the same number $\alpha$.

The fact that a continuous weight $\eta$ is used ("smoothing") is a difference with respect to the classical literature ([Vin37] and what followed) but not with respect to more recent work (including [Tao14]); using smooth or continuous weights is an idea that has become commonplace in analytic number theory, even though it is not consistently applied. The improvements due to smoothing in type I are both relatively minor and essentially independent of the improvements due to (1) and (3). The use of a continuous weight combines nicely with (2), but the ideas given here would give qualitative improvements in the treatment of trigonometric sums even in the absence of smoothing.

4.1 Trigonometric sums

The following lemmas on trigonometric sums improve on the best Vinogradov-type lemmas in the literature. (By this, we mean results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]. See, in particular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is to switch between different types of approximation within the sum, rather than just choosing between bounding all terms either trivially (by $A$) or non-trivially (by $C/|\sin(\pi\alpha n)|^2$). There will also¹ be improvements in our applications stemming from the fact that Lemma 4.1.1 and Lemma 4.1.2 take quadratic ($|\sin(\pi\alpha n)|^2$) rather than linear ($|\sin(\pi\alpha n)|$) inputs. (These improved inputs come from the use of smoothing elsewhere.)

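Lemma 4.1.1. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Then, for any $A, C \ge 0$,
\[\sum_{y < n \le y+q} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(2A + \frac{6q^2}{\pi^2}C,\ 3A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.1}\]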
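Before going through the proof, it may help to see the inequality (4.1) in action. The sketch below evaluates both sides in floating point over a small grid of parameters; it is an illustration, not a verification, and the function names, the grid and the tolerance are all assumptions made for the example. A term with $\sin(\pi\alpha n) = 0$ is counted as $A$.

```python
import math

def lhs_41(alpha, y, q, A, C):
    """Sum over y < n <= y + q of min(A, C / sin(pi alpha n)^2)."""
    total = 0.0
    for n in range(math.floor(y) + 1, math.floor(y + q) + 1):
        s2 = math.sin(math.pi * alpha * n) ** 2
        total += A if s2 == 0.0 else min(A, C / s2)
    return total

def rhs_41(q, A, C):
    """min(2A + 6 q^2 C / pi^2, 3A + (4q/pi) sqrt(AC))."""
    return min(2 * A + 6 * q * q * C / math.pi ** 2,
               3 * A + 4 * q / math.pi * math.sqrt(A * C))

def lemma_holds_on_grid(max_q=30, tol=1e-9):
    """Check (4.1) for all grid points; returns False at the first failure."""
    for q in range(1, max_q + 1):
        for a in range(1, q + 1):
            if math.gcd(a, q) != 1:
                continue
            for Q in (q, 3 * q):
                for beta in (-1.0, -0.4, 0.0, 0.7, 1.0):
                    alpha = a / q + beta / (q * Q)
                    for y in (0, 2.5, 11):
                        for A, C in ((1, 1), (10, 0.1), (1, 100), (5, 0.001)):
                            left = lhs_41(alpha, y, q, A, C)
                            if left > rhs_41(q, A, C) * (1 + tol) + tol:
                                return False
    return True

if __name__ == "__main__":
    print(lemma_holds_on_grid())
```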

Proof. We start by letting $m_0 = \lfloor y \rfloor + \lfloor (q+1)/2 \rfloor$, $j = n - m_0$, so that $j$ ranges in the interval $(-q/2, q/2]$. We write
\[\alpha n = \frac{aj + c}{q} + \delta_1(j) + \delta_2 \mod 1,\]
where $|\delta_1(j)|$ and $|\delta_2|$ are both $\le 1/2q$; we can assume $\delta_2 \ge 0$. The variable $r = aj + c \mod q$ occupies each residue class mod $q$ exactly once.

One option is to bound the terms corresponding to $r = 0, -1$ by $A$ each and all the other terms by $C/|\sin(\pi\alpha n)|^2$. (This can be seen as the simple case; it will take us about a page just because we should estimate all sums and all terms here with great care, as in [DR01], only more so.)

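The terms corresponding to $r = -k$ and $r = k - 1$ ($2 \le k \le q/2$) contribute at most $C$ times
\[\frac{1}{\sin^2\frac{\pi}{q}\left(k - \frac{1}{2} - q\delta_2\right)} + \frac{1}{\sin^2\frac{\pi}{q}\left(k - \frac{3}{2} + q\delta_2\right)} \le \frac{1}{\sin^2\frac{\pi}{q}\left(k - \frac{1}{2}\right)} + \frac{1}{\sin^2\frac{\pi}{q}\left(k - \frac{3}{2}\right)},\]
since $x \mapsto 1/(\sin x)^2$ is convex-up on $(0, \pi)$. Hence the terms with $r \ne 0, -1$ contribute at most $C$ times
\[\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\sum_{2 \le r \le q/2} \frac{1}{\left(\sin\frac{\pi}{q}(r - 1/2)\right)^2} \le \frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx,\]
where we use again the convexity of $x \mapsto 1/(\sin x)^2$. (We can assume $q > 2$, as otherwise we have no terms other than $r = 0, -1$.) Now
\[\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx = \frac{q}{\pi}\int_{\pi/q}^{\pi/2} \frac{1}{(\sin u)^2}\,du = \frac{q}{\pi}\cot\frac{\pi}{q}.\]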

¹ This is a change with respect to the first version of the preprint [Helb]. The version of Lemma 4.1.1 there has, however, the advantage of being immediately comparable to results in the literature.


Hence
\[\sum_{y < n \le y+q} \min\left(A, \frac{C}{(\sin \pi\alpha n)^2}\right) \le 2A + \frac{C}{\left(\sin\frac{\pi}{2q}\right)^2} + C \cdot \frac{2q}{\pi}\cot\frac{\pi}{q}.\]

Now by [AS64 (4368)] and [AS64 (4370)] for t isin (minusπ π)

t

sin t= 1 +

sumkge0

a2k+1t2k+2 = 1 +

t2

6+

t cot t = 1minussumkge0

b2k+1t2k+2 = 1minus t2

3minus t4

45minus

(42)

where a2k+1 ge 0 b2k+1 ge 0 Thus for t isin [0 t0] t0 lt π(t

sin t

)2

= 1 +t2

3+ c0(t)t4 le 1 +

t2

3+ c0(t0)t4 (43)

where

c0(t) =1

t4

((t

sin t

)2

minus(

1 +t2

3

))

which is an increasing function because a2k+1 ge 0 For t0 = π4 c0(t0) le 0074807Hence

t2

sin2 t+ t cot 2t le

(1 +

t2

3+ c0

(π4

)t4)

+

(1

2minus 2t2

3minus 8t4

45

)=

3

2minus t2

3+

(c0

(π4

)minus 8

45

)t4 le 3

2minus t2

3le 3

2

for t isin [0 π4]Therefore the left side of (41) is at most

2A+ C middot(

2q

π

)2

middot 3

2= 2A+

6

π2Cq2
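The two numerical facts used in this first approach (the value of $c_0(\pi/4)$ and the bound $t^2/\sin^2 t + t\cot 2t\le 3/2$ on $(0,\pi/4]$) are easy to spot-check. The sketch below is only an illustration on a finite grid, with hypothetical helper names; it does not replace the series argument.

```python
import math

def c0(t):
    """c_0(t) = ((t/sin t)^2 - (1 + t^2/3)) / t^4, as defined after (4.3)."""
    return ((t / math.sin(t)) ** 2 - (1 + t * t / 3)) / t ** 4

def f(t):
    """t^2/sin^2(t) + t*cot(2t), the quantity bounded by 3/2 on (0, pi/4]."""
    return (t / math.sin(t)) ** 2 + t * math.cos(2 * t) / math.sin(2 * t)

def first_approach_ok(q):
    """Check 1/sin^2(pi/2q) + (2q/pi)*cot(pi/q) <= (2q/pi)^2 * 3/2 for a given q >= 2."""
    lhs = 1 / math.sin(math.pi / (2 * q)) ** 2 \
        + (2 * q / math.pi) * math.cos(math.pi / q) / math.sin(math.pi / q)
    return lhs <= 1.5 * (2 * q / math.pi) ** 2

# Grid on (0, pi/4]; very small t is avoided to keep rounding errors negligible.
grid = [k * (math.pi / 4) / 1000 for k in range(1, 1001)]
worst = max(f(t) for t in grid)
```

Substituting $t = \pi/2q$ turns `f(t) <= 3/2` into the inequality tested by `first_approach_ok`, which is why $q\ge 2$ corresponds exactly to $t\le\pi/4$.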

The following is an alternative approach; it yields the other estimate in (4.1). We bound the terms corresponding to $r = 0$, $r = -1$, $r = 1$ by $A$ each. We let $r = \pm r'$ for $r'$ ranging from 2 to $q/2$. We obtain that the sum is at most

\[3A + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 - q\delta_2\right)\right)^2}\right) + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 + q\delta_2\right)\right)^2}\right). \tag{4.4}\]


We bound a term $\min(A, C/\sin((\pi/q)(r' - 1/2 \pm q\delta_2))^2)$ by $A$ if and only if $C/\sin((\pi/q)(r' - 1 \pm q\delta_2))^2 \ge A$. (In other words, we are choosing which of the two bounds $A$, $C/|\sin(\pi\alpha n)|^2$ to use on a case-by-case basis, i.e., for each $n$, instead of making a single choice for all $n$ in one go. This is hardly anything deep, but it does result in a marked improvement with respect to the literature, and would give an improvement even if we were given a bound $B/|\sin(\pi\alpha n)|$ instead of a bound $C/|\sin(\pi\alpha n)|^2$ as input.) The number of such terms is

\[\le \max\left(0, \left\lfloor \frac{q}{\pi}\arcsin\left(\sqrt{C/A}\right) \mp q\delta_2\right\rfloor\right),\]

and thus at most $(2q/\pi)\arcsin(\sqrt{C/A})$ in total. (Recall that $q\delta_2\le 1/2$.) Each other term gets bounded by the integral of $C/\sin^2(\pi\alpha/q)$ from $r' - 1 \pm q\delta_2$ ($\ge (q/\pi)\arcsin(\sqrt{C/A})$) to $r' \pm q\delta_2$, by convexity. Thus (4.4) is at most

\[\begin{aligned}
&3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + 2\int_{\frac{q}{\pi}\arcsin\sqrt{C/A}}^{q/2} \frac{C}{\sin^2\frac{\pi t}{q}}\,dt\\
&\le 3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi}C\sqrt{\frac{A}{C} - 1}.
\end{aligned}\]

We can easily show (taking derivatives) that $\arcsin x + x\sqrt{1-x^2}\le 2x$ for $0\le x\le 1$. Setting $x = \sqrt{C/A}$, we see that this implies that

\[3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi}C\sqrt{\frac{A}{C} - 1} \le 3A + \frac{4q}{\pi}\sqrt{AC}.\]

(If $C/A > 1$, then $3A + (4q/\pi)\sqrt{AC}$ is greater than $Aq$, which is an obvious upper bound for the left side of (4.1).)
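The last step of the second approach, from the arcsine expression to $3A + (4q/\pi)\sqrt{AC}$, amounts to $A\arcsin\sqrt{C/A} + C\sqrt{A/C - 1}\le 2\sqrt{AC}$ for $0 < C\le A$. A small numerical sketch of that reduction follows (the function name is hypothetical, and a grid check is of course not a proof).

```python
import math

def second_approach_gap(A, C):
    """2*sqrt(AC) - (A*arcsin(sqrt(C/A)) + C*sqrt(A/C - 1)); requires 0 < C <= A."""
    x = math.sqrt(C / A)
    lhs = A * math.asin(x) + C * math.sqrt(A / C - 1)
    return 2 * math.sqrt(A * C) - lhs

# Smallest gap over a grid of ratios C/A in (0, 1], for two sample values of A.
min_gap = min(second_approach_gap(A, A * k / 100)
              for A in (1.0, 37.5) for k in range(1, 101))
```

Multiplying the gap by $2q/\pi$ gives the slack in the final inequality; the gap is smallest as $C/A\to 0$.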

Now we will see that, if we take out terms with $n$ divisible by $q$ and $n$ is not too large, then we can give a bound that does not involve a constant term $A$ at all. (We are referring to the bound $(20/3\pi^2)Cq^2$ below; of course, $2A + (4q/\pi)\sqrt{AC}$ does have a constant term $2A$; it is just smaller than the constant term $3A$ in the corresponding bound in (4.1).)

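Lemma 4.1.2. Let $\alpha = a/q + \beta/qQ$, $(a,q)=1$, $|\beta|\le 1$, $q\le Q$. Let $y_2 > y_1\ge 0$. If $y_2 - y_1\le q$ and $y_2\le Q/2$, then, for any $A, C\ge 0$,

\[\sum_{\substack{y_1 < n\le y_2\\ q\nmid n}} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(\frac{20}{3\pi^2}Cq^2,\; 2A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.5}\]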
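As with Lemma 4.1.1, the inequality (4.5) can be tested by brute force. This is only a sketch under the lemma's hypotheses ($y_2 - y_1\le q$, $y_2\le Q/2$, terms with $q\mid n$ removed); the helper names are hypothetical and $y_1, y_2$ are taken to be integers.

```python
import math

def lhs_412(alpha, y1, y2, q, A, C):
    """Sum of min(A, C/sin^2(pi*alpha*n)) over y1 < n <= y2 with q not dividing n."""
    total = 0.0
    for n in range(y1 + 1, y2 + 1):
        if n % q == 0:
            continue  # terms with q | n are excluded in (4.5)
        s2 = math.sin(math.pi * alpha * n) ** 2
        total += A if s2 == 0.0 else min(A, C / s2)
    return total

def rhs_412(q, A, C):
    """Right side of (4.5); note the first bound has no constant term A."""
    return min(20 * C * q * q / (3 * math.pi ** 2),
               2 * A + 4 * q / math.pi * math.sqrt(A * C))
```

The first bound is the one of interest when $A$ is very large: it stays finite even as $A\to\infty$, which is possible only because the terms with $q\mid n$ have been removed.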

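Proof. Clearly, $\alpha n$ equals $an/q + (n/Q)\beta/q$; since $y_2\le Q/2$, this means that $|\alpha n - an/q|\le 1/2q$ for $n\le y_2$; moreover, again for $n\le y_2$, the sign of $\alpha n - an/q$ remains constant. Hence the left side of (4.5) is at most

\[\sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r - \frac12\right)\right)^2}\right) + \sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}r\right)^2}\right).\]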


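Proceeding as in the proof of Lemma 4.1.1, we obtain a bound of at most

\[C\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right)\]

for $q\ge 2$. (If $q = 1$, then the left side of (4.5) is trivially zero.) Now, by (4.2),

\[\begin{aligned}
\frac{t^2}{(\sin t)^2} + \frac{t}{2}\cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right)t^4\right) + \frac14\left(1 - \frac{4t^2}{3} - \frac{16t^4}{45}\right)\\
&\le \frac54 + \left(c_0\left(\frac{\pi}{4}\right) - \frac{4}{45}\right)t^4 \le \frac54
\end{aligned}\]

for $t\in[0,\pi/4]$, and

\[\begin{aligned}
\frac{t^2}{(\sin t)^2} + t\cot\frac{3t}{2} &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{2}\right)t^4\right) + \frac23\left(1 - \frac{3t^2}{4} - \frac{81t^4}{2^4\cdot 45}\right)\\
&\le \frac53 + \left(-\frac16 + \left(c_0\left(\frac{\pi}{2}\right) - \frac{27}{360}\right)\left(\frac{\pi}{2}\right)^2\right)t^2 \le \frac53
\end{aligned}\]

for $t\in[0,\pi/2]$. Hence,

\[\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right) \le \left(\frac{2q}{\pi}\right)^2\cdot\frac54 + \left(\frac{q}{\pi}\right)^2\cdot\frac53 \le \frac{20}{3\pi^2}q^2.\]

Alternatively, we can follow the second approach in the proof of Lemma 4.1.1, and obtain an upper bound of $2A + (4q/\pi)\sqrt{AC}$.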

The following bound will be useful when the constant $A$ in an application of Lemma 4.1.2 would be too large. (This tends to happen for $n$ small.)

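Lemma 4.1.3. Let $\alpha = a/q + \beta/qQ$, $(a,q)=1$, $|\beta|\le 1$, $q\le Q$. Let $y_2 > y_1\ge 0$. If $y_2 - y_1\le q$ and $y_2\le Q/2$, then, for any $B, C\ge 0$,

\[\sum_{\substack{y_1 < n\le y_2\\ q\nmid n}} \min\left(\frac{B}{|\sin(\pi\alpha n)|}, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le 2B\frac{q}{\pi}\max\left(2, \log\frac{Ce^3q}{B\pi}\right). \tag{4.6}\]

The upper bound $\le (2Bq/\pi)\log(2e^2q/\pi)$ is also valid.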

Proof. As in the proof of Lemma 4.1.2, we can bound the left side of (4.6) by

\[2\sum_{r=1}^{q/2} \min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right).\]


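Assume $B\sin(\pi/q)\le C\le B$. By the convexity of $1/\sin(t)$ and $1/\sin(t)^2$ for $t\in(0,\pi/2]$,

\[\begin{aligned}
\sum_{r=1}^{q/2} &\min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \int_1^{\frac{q}{\pi}\arcsin\frac{C}{B}} \frac{B}{\sin\frac{\pi}{q}t}\,dt + \int_{\frac{q}{\pi}\arcsin\frac{C}{B}}^{q/2} \frac{C}{\sin^2\frac{\pi}{q}t}\,dt\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\tan\left(\frac12\arcsin\frac{C}{B}\right) - \log\tan\frac{\pi}{2q}\right) + C\cot\arcsin\frac{C}{B}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right).
\end{aligned}\]

Now, for all $t\in(0,\pi/2)$,

\[\frac{2}{\sin t} + \frac1t\log\cot t < \frac1t\log\left(\frac{e^2}{t}\right);\]

we can verify this by comparing series. Thus

\[\frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}B\log\cot\frac{\pi}{2q} \le B\frac{q}{\pi}\log\frac{2e^2q}{\pi}\]

for $q\ge 2$. (If $q = 1$, the sum on the left of (4.6) is empty, and so the bound we are trying to prove is trivial.) We also have

\[t\log\left(t - \sqrt{t^2 - 1}\right) + \sqrt{t^2 - 1} < -t\log 2t + t \tag{4.7}\]

for $t\ge 1$ (as this is equivalent to $\log(2t^2(1 - \sqrt{1 - t^{-2}})) < 1 - \sqrt{1 - t^{-2}}$, which we check easily after changing variables to $\delta = 1 - \sqrt{1 - t^{-2}}$). Hence

\[\begin{aligned}
\frac{B}{\sin\frac{\pi}{2q}} &+ \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right)\\
&\le B\frac{q}{\pi}\log\frac{2e^2q}{\pi} + \frac{q}{\pi}\left(B - B\log\frac{2B}{C}\right) \le B\frac{q}{\pi}\log\frac{Ce^3q}{B\pi}
\end{aligned}\]

for $q\ge 2$.

Given any $C$, we can apply the above with $C = B$ instead, as, for any $t > 0$, $\min(B/t, C/t^2)\le B/t\le \min(B/t, B/t^2)$. (We refrain from applying (4.7) so as to avoid worsening a constant.) If $C < B\sin\pi/q$ (or even if $C < (\pi/q)B$), we relax the input to $C = B\sin\pi/q$ and go through the above.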
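The two auxiliary inequalities in this proof, the trigonometric one on $(0,\pi/2)$ and (4.7) for $t\ge 1$, are stated as "checked easily"; a quick numerical look at both is sketched below. The function names are hypothetical, and the check is on a finite grid only.

```python
import math

def trig_gap(t):
    """(1/t)*log(e^2/t) - (2/sin t + (1/t)*log(cot t)); claimed > 0 on (0, pi/2)."""
    cot = math.cos(t) / math.sin(t)
    return (2 - math.log(t)) / t - (2 / math.sin(t) + math.log(cot) / t)

def gap_47(t):
    """Right side minus left side of (4.7); claimed > 0 for t >= 1."""
    r = math.sqrt(t * t - 1)
    return (-t * math.log(2 * t) + t) - (t * math.log(t - r) + r)

trig_ok = all(trig_gap(k / 100) > 0 for k in range(1, 158))     # t = 0.01, ..., 1.57
ineq47_ok = all(gap_47(1 + k / 10) > 0 for k in range(0, 491))  # t = 1.0, ..., 50.0
```

Comparing series suggests that the first gap behaves like $7t^3/180$ as $t\to 0$, so it is small but positive near the origin; the second gap is about $1/(4t)$ for large $t$.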

4.2 Type I estimates

Let us give our first main type I estimate.² One of the main innovations is the manner in which the "main term" ($m$ divisible by $q$) is separated; we are able to keep error terms small thanks to the particular way in which we switch between two different approximations.

(These are not necessarily successive approximations in the sense of continued fractions; we do not want to assume that the approximation $a/q$ we are given arises from a continued fraction, and at any rate we need more control on the denominator $q'$ of the new approximation $a'/q'$ than continued fractions would furnish.)

The following lemma is a theme, so to speak, to which several variations will be given. Later, in practice, we will always use one of the variations, rather than the original lemma itself. This is so just because, even though (4.8) is the basic type of sum we treat in type I, the sums that we will have to estimate in practice will always present some minor additional complication. Proving the lemma we are about to give in full will give us a chance to see all the main ideas at work, leaving complications for later.

² The current version of Lemma 4.2.1 is an improvement over that included in the first version of the preprint [Helb].

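Lemma 4.2.1. Let $\alpha = a/q + \delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L^1$. Let $c_0\ge |\widehat{\eta''}|_\infty$.

Let $1\le D\le x$. Then, if $|\delta|\le 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$, the absolute value of

\[\sum_{m\le D}\mu(m)\sum_n e(\alpha mn)\eta\left(\frac{mn}{x}\right) \tag{4.8}\]

is at most

\[\frac{x}{q}\min\left(1, \frac{c_0}{(2\pi\delta)^2}\right)\left|\sum_{\substack{m\le M/q\\ (m,q)=1}} \frac{\mu(m)}{m}\right| + O^*\left(c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right) \tag{4.9}\]

plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D + 3c_1\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}q\log^+\frac{D}{q/2}\\
&+ \frac{|\eta'|_1}{\pi}q\cdot\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{c_2} + \frac{55c_0c_2}{12\pi^2}\right)q,
\end{aligned}\tag{4.10}\]

where $c_1 = 1 + |\eta'|_1/(2x/D)$ and $M\in[\min(Q_0/2, D), D]$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.8) is at most (4.9) plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\varpi_\epsilon + \frac12\log^+\frac{2D}{x/|\delta|q}\right)\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{x/|\delta|q}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{6\pi^2}q,
\end{aligned}\tag{4.11}\]

for $\epsilon\in(0,1]$ arbitrary, where $\varpi_\epsilon = \sqrt{3+2\epsilon} + ((1 + \sqrt{13/3})/4 - 1)/(2(1+\epsilon))$.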


In (4.9), $\min(1, c_0/(2\pi\delta)^2)$ always equals 1 when $|\delta|\le 1/2c_2$ (since $(3/5)(1 + \sqrt{13/3}) > 1$).

Proof. Let $Q = \lfloor x/|\delta q|\rfloor$. Then $\alpha = a/q + O^*(1/qQ)$ and $q\le Q$. (If $\delta = 0$, we let $Q = \infty$ and ignore the rest of the paragraph, since then we will never need $Q'$ or the alternative approximation $a'/q'$.) Let $Q' = \lceil(1+\epsilon)Q\rceil\ge Q + 1$. Then $\alpha$ is not $a/q + O^*(1/qQ')$, and so there must be a different approximation $a'/q'$, $(a',q') = 1$, $q'\le Q'$ such that $\alpha = a'/q' + O^*(1/q'Q')$ (since such an approximation always exists). Obviously, $|a/q - a'/q'|\ge 1/qq'$, yet, at the same time, $|a/q - a'/q'|\le 1/qQ + 1/q'Q'\le 1/qQ + 1/((1+\epsilon)q'Q)$. Hence $q'/Q + q/((1+\epsilon)Q)\ge 1$, and so $q'\ge Q - q/(1+\epsilon)\ge (\epsilon/(1+\epsilon))Q$. (Note also that $(\epsilon/(1+\epsilon))Q\ge (2|\delta q|/x)\cdot\lfloor x/\delta q\rfloor > 1$, and so $q'\ge 2$.)
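To make the switch between the two approximations concrete, here is a small exact-arithmetic sketch that, given $a/q$, $\delta$, $x$ and $\epsilon$, computes $Q$, $Q'$ and searches for an approximation $a'/q'$ with $q'\le Q'$ and $|\alpha - a'/q'|\le 1/q'Q'$. The function name and the linear search are illustrative choices only (the text merely uses the existence of $a'/q'$), and $\delta\ne 0$ is assumed.

```python
from fractions import Fraction
from math import ceil, floor, gcd

def second_approximation(a, q, delta, x, eps):
    """Return (Q, Q', a', q') for alpha = a/q + delta/x, with delta != 0.

    Q  = floor(x / |delta q|), Q' = ceil((1 + eps) Q); (a', q') is the first
    reduced fraction with q' <= Q' and |alpha - a'/q'| <= 1/(q' Q').
    """
    alpha = Fraction(a, q) + Fraction(delta) / x
    Q = floor(Fraction(x) / abs(Fraction(delta) * q))
    Qp = ceil((1 + Fraction(eps)) * Q)
    for q2 in range(1, Qp + 1):
        a2 = round(alpha * q2)  # nearest integer to alpha * q2
        if gcd(a2, q2) == 1 and abs(alpha - Fraction(a2, q2)) <= Fraction(1, q2 * Qp):
            return Q, Qp, a2, q2
    return None  # not reached: such an approximation always exists
```

With $a/q = 1/3$, $\delta = 50$, $x = 10^6$, $\epsilon = 1$, the fraction $a/q$ itself fails the test for $Q'$, and the denominator found is forced to be at least $Q - q/(1+\epsilon)$, as in the proof.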

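Lemma 4.1.2 will enable us to treat separately the contribution from terms with $m$ divisible by $q$ and $m$ not divisible by $q$, provided that $m\le Q/2$. Let $M = \min(Q/2, D)$. We start by considering all terms with $m\le M$ divisible by $q$. Then $e(\alpha mn)$ equals $e((\delta m/x)n)$. By Poisson summation,

\[\sum_n e(\alpha mn)\eta(mn/x) = \sum_n \hat f(n),\]

where $f(u) = e((\delta m/x)u)\eta((m/x)u)$. Now

\[\hat f(n) = \int e(-un)f(u)\,du = \frac{x}{m}\int e\left(\left(\delta - \frac{xn}{m}\right)u\right)\eta(u)\,du = \frac{x}{m}\hat\eta\left(\frac{x}{m}n - \delta\right).\]

By assumption, $m\le M\le Q/2\le x/2|\delta q|$, and so $|x/m|\ge 2|\delta q|\ge 2\delta$. Thus, by (2.1) (with $k = 2$),

\[\begin{aligned}
\sum_n \hat f(n) &= \frac{x}{m}\left(\hat\eta(-\delta) + \sum_{n\ne 0}\hat\eta\left(\frac{nx}{m} - \delta\right)\right)\\
&= \frac{x}{m}\left(\hat\eta(-\delta) + O^*\left(\sum_{n\ne 0}\frac{1}{\left(2\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\
&= \frac{x}{m}\hat\eta(-\delta) + \frac{m}{x}\frac{c_0}{(2\pi)^2}O^*\left(\max_{|r|\le\frac12}\sum_{n\ne 0}\frac{1}{(n - r)^2}\right).
\end{aligned}\tag{4.12}\]

Since $x\mapsto 1/x^2$ is convex on $\mathbb{R}^+$,

\[\max_{|r|\le\frac12}\sum_{n\ne 0}\frac{1}{(n - r)^2} = \sum_{n\ne 0}\frac{1}{\left(n - \frac12\right)^2} = \pi^2 - 4.\]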
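The value $\pi^2 - 4$ comes from $\sum_{n\in\mathbb{Z}} 1/(n - 1/2)^2 = \pi^2$ with the term $n = 0$ (equal to 4) removed. A short numerical illustration, using a truncated sum (the truncation point `N` is an arbitrary choice, and the tail is of size about $2/N$):

```python
import math

def shifted_sum(r, N=100000):
    """Partial sum of 1/(n - r)^2 over integers n with 0 < |n| <= N."""
    return sum(1.0 / (n - r) ** 2 + 1.0 / (-n - r) ** 2 for n in range(1, N + 1))

at_half = shifted_sum(0.5)   # close to pi^2 - 4
at_zero = shifted_sum(0.0)   # close to pi^2 / 3
```

The sum grows as $|r|$ moves from 0 to $1/2$, which is why the maximum over $|r|\le 1/2$ is attained at the endpoint.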


Therefore, the sum of all terms with $m\le M$ and $q|m$ is

\[\begin{aligned}
&\sum_{\substack{m\le M\\ q|m}}\mu(m)\frac{x}{m}\hat\eta(-\delta) + O^*\left(\sum_{\substack{m\le M\\ q|m}}\frac{m}{x}\frac{c_0}{(2\pi)^2}(\pi^2 - 4)\right)\\
&= \frac{x\mu(q)}{q}\cdot\hat\eta(-\delta)\cdot\sum_{\substack{m\le M/q\\ (m,q)=1}}\frac{\mu(m)}{m} + O^*\left(\mu(q)^2c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right).
\end{aligned}\]

We will bound $|\hat\eta(-\delta)|$ by (2.1).

As we have just seen, estimating the contribution of the terms with $m$ divisible by $q$ and not too large ($m\le M$) involves isolating a main term, estimating it carefully (with cancellation) and then bounding the remaining error terms.

We will now bound the contribution of all other $m$, that is, $m$ not divisible by $q$ and $m$ larger than $M$. Cancellation will now be used only within the inner sum; that is, we will bound each inner sum

\[T_m(\alpha) = \sum_n e(\alpha mn)\eta\left(\frac{mn}{x}\right),\]

and then we will carefully consider how to bound sums of $|T_m(\alpha)|$ over $m$ efficiently. By (2.2) and Lemma 2.3.1,

\[|T_m(\alpha)|\le \min\left(\frac{x}{m} + \frac12|\eta'|_1,\; \frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{4}\frac{1}{(\sin\pi m\alpha)^2}\right). \tag{4.13}\]

For any $y_2 > y_1 > 0$ with $y_2 - y_1\le q$ and $y_2\le Q/2$, (4.13) gives us that

\[\sum_{\substack{y_1 < m\le y_2\\ q\nmid m}}|T_m(\alpha)|\le \sum_{\substack{y_1 < m\le y_2\\ q\nmid m}}\min\left(A, \frac{C}{(\sin\pi m\alpha)^2}\right) \tag{4.14}\]

for $A = (x/y_1)(1 + |\eta'|_1/(2(x/y_1)))$ and $C = (c_0/4)(y_2/x)$. We must now estimate the sum

\[\sum_{\substack{m\le M\\ q\nmid m}}|T_m(\alpha)| + \sum_{\frac{Q}{2} < m\le D}|T_m(\alpha)|. \tag{4.15}\]

To bound the terms with $m\le M$, we can use Lemma 4.1.2. The question is then which one is smaller: the first or the second bound given by Lemma 4.1.2? A brief calculation gives that the second bound is smaller (and hence preferable) exactly when $\sqrt{C/A} > (3\pi/10q)(1 + \sqrt{13/3})$. Since $\sqrt{C/A}\sim(\sqrt{c_0}/2)m/x$, this means that it is sensible to prefer the second bound in Lemma 4.1.2 when $m > c_2x/q$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$.

It thus makes sense to ask: does $Q/2\le c_2x/q$ (so that $m\le M$ implies $m\le c_2x/q$)? This question divides our work into two basic cases.
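The "brief calculation" behind the threshold can be spelled out: writing $s = q\sqrt{C/A}$, the two bounds of Lemma 4.1.2 coincide when $(20/3\pi^2)s^2 = 2 + (4/\pi)s$, a quadratic in $s$. The sketch below (hypothetical function names) solves it and compares the root with $(3\pi/10)(1 + \sqrt{13/3})$, and then with $c_2$.

```python
import math

def crossover():
    """Positive root s of (20/(3 pi^2)) s^2 = 2 + (4/pi) s, where s = q*sqrt(C/A)."""
    # Multiply through by 3 pi^2: 20 s^2 - 12 pi s - 6 pi^2 = 0.
    disc = (12 * math.pi) ** 2 + 4 * 20 * 6 * math.pi ** 2
    return (12 * math.pi + math.sqrt(disc)) / 40

def c2(c0):
    """c_2 = (3 pi / (5 sqrt(c0))) (1 + sqrt(13/3)), as in Lemma 4.2.1."""
    return 3 * math.pi / (5 * math.sqrt(c0)) * (1 + math.sqrt(13 / 3))

claimed = 3 * math.pi / 10 * (1 + math.sqrt(13 / 3))
```

Since $\sqrt{C/A}$ is about $(\sqrt{c_0}/2)\,m/x$, the condition $q\sqrt{C/A} > s$ becomes $m > (2s/\sqrt{c_0})\,x/q$, and $2s/\sqrt{c_0}$ is exactly $c_2$.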


Case (a). $\delta$ large: $|\delta|\ge 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$. Then $Q/2\le c_2x/q$; this will induce us to bound the first sum in (4.15) by the first bound in Lemma 4.1.2.

Recall that $M = \min(Q/2, D)$, and so $M\le c_2x/q$. By (4.14) and Lemma 4.1.2,

\[\begin{aligned}
\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_m(\alpha)| &\le \sum_{j=0}^\infty\sum_{\substack{jq < m\le\min((j+1)q, M)\\ q\nmid m}}\min\left(\frac{x}{jq+1} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j+1)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0\le j\le\frac{M}{q}}(j+1) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\cdot\left(\frac12\frac{M^2}{q^2} + \frac32\frac{c_2x}{q^2} + 1\right)\\
&\le \frac{5c_0c_2}{6\pi^2}M + \frac{5c_0q}{3\pi^2}\left(\frac32c_2 + \frac{q^2}{x}\right) \le \frac{5c_0c_2}{6\pi^2}M + \frac{35c_0c_2}{6\pi^2}q,
\end{aligned}\tag{4.16}\]

where, to bound the smaller terms, we are using the inequality $Q/2\le c_2x/q$, and where we are also using the observation that, since $|\delta/x|\le 1/qQ_0$, the assumption $|\delta|\ge 1/2c_2$ implies that $q\le 2c_2x/Q_0$; moreover, since $q\le Q_0$, this gives us that $q^2\le 2c_2x$. In the main term, we are bounding $qM^2/x$ from above by $M\cdot qQ/2x\le M/2\delta\le c_2M$.

If $D\le(Q+1)/2$, then $M\ge\lfloor D\rfloor$, and so (4.16) is all we need: the second sum in (4.15) is empty. Assume from now on that $D > (Q+1)/2$. The first sum in (4.15) is then bounded by (4.16) (with $M = Q/2$). To bound the second sum in (4.15), we will use the approximation $a'/q'$ instead of $a/q$. The motivation is the following: if we used the approximation $a/q$ even for $m > Q/2$, the contribution of the terms with $q|m$ would be too large. When we use $a'/q'$, the contribution of the terms with $q'|m$ (or $m\equiv\pm1 \bmod q'$) is very small: only a fraction $1/q'$ (tiny, since $q'$ is large) of all terms are like that, and their individual contribution is always small, precisely because $m > Q/2$.

By (4.14) (without the restriction $q\nmid m$ on either side) and Lemma 4.1.1,

\[\begin{aligned}
\sum_{Q/2 < m\le D}|T_m(\alpha)| &\le \sum_{j=0}^\infty\sum_{jq' + \frac{Q}{2} < m\le\min((j+1)q' + Q/2, D)}|T_m(\alpha)|\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\frac{x}{jq' + (Q+1)/2}\frac{(j+1)q' + Q/2}{x}}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q'}{jq' + (Q+1)/2}\right)}\right),
\end{aligned}\]

where we recall that $c_1 = 1 + |\eta'|_1/(2x/D)$. Since $q'\ge(\epsilon/(1+\epsilon))Q$,

\[\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\frac{x}{jq' + \frac{Q+1}{2}} \le \frac{x}{Q/2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^D\frac1t\,dt \le \frac{2x}{Q} + \frac{(1+\epsilon)x}{\epsilon Q}\log^+\frac{D}{\frac{Q+1}{2}}. \tag{4.17}\]


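Recall now that $q'\le(1+\epsilon)Q + 1\le(1+\epsilon)(Q+1)$. Therefore,

\[\begin{aligned}
q'\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} &\le q'\sqrt{1 + \frac{(1+\epsilon)Q + 1}{(Q+1)/2}} + \int_{\frac{Q+1}{2}}^D\sqrt{1 + \frac{q'}{t}}\,dt\\
&\le q'\sqrt{3 + 2\epsilon} + \left(D - \frac{Q+1}{2}\right) + \frac{q'}{2}\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned}\tag{4.18}\]

We conclude that $\sum_{Q/2 < m\le D}|T_m(\alpha)|$ is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + \left((1+\epsilon)\sqrt{3+2\epsilon} - \frac12\right)(Q+1) + \frac{(1+\epsilon)Q + 1}{2}\log^+\frac{D}{\frac{Q+1}{2}}\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q}.
\end{aligned}\tag{4.19}\]

We sum this to (4.16) (with $M = Q/2$), and obtain that (4.15) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)(Q+1)\left(\varpi_\epsilon + \frac12\log^+\frac{D}{\frac{Q+1}{2}}\right)\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q} + \frac{35c_0c_2}{6\pi^2}q,
\end{aligned}\tag{4.20}\]

where we are bounding

\[\frac{5c_0c_2}{6\pi^2} = \frac{5c_0}{6\pi^2}\frac{3\pi}{5\sqrt{c_0}}\left(1 + \sqrt{\frac{13}{3}}\right) = \frac{\sqrt{c_0}}{2\pi}\left(1 + \sqrt{\frac{13}{3}}\right) \le \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac14\left(1 + \sqrt{\frac{13}{3}}\right) \tag{4.21}\]

and defining

\[\varpi_\epsilon = \sqrt{3 + 2\epsilon} + \left(\frac14\left(1 + \sqrt{\frac{13}{3}}\right) - 1\right)\frac{1}{2(1+\epsilon)}. \tag{4.22}\]

(Note that $\varpi_\epsilon < \sqrt3$ for $\epsilon < 0.1741$.) A quick check against (4.16) shows that (4.20) is valid also when $D\le Q/2$, even when $Q+1$ is replaced by $\min(Q+1, 2D)$. We bound $Q$ from above by $x/|\delta|q$ and $\log^+D/((Q+1)/2)$ by $\log^+2D/(x/|\delta|q + 1)$, and obtain the result.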
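The remark that $\varpi_\epsilon < \sqrt3$ for $\epsilon < 0.1741$ is a one-line computation from (4.22). The sketch below (the name `varpi` is just a label for $\varpi_\epsilon$) evaluates the definition and also the chain of equalities in (4.21) with $c_1 = 1$.

```python
import math

def varpi(eps):
    """varpi_eps = sqrt(3 + 2 eps) + ((1 + sqrt(13/3))/4 - 1) / (2 (1 + eps)), as in (4.22)."""
    return math.sqrt(3 + 2 * eps) + ((1 + math.sqrt(13 / 3)) / 4 - 1) / (2 * (1 + eps))

def lhs_421(c0):
    """5 c0 c2 / (6 pi^2) with c2 = (3 pi / (5 sqrt(c0))) (1 + sqrt(13/3))."""
    c2 = 3 * math.pi / (5 * math.sqrt(c0)) * (1 + math.sqrt(13 / 3))
    return 5 * c0 * c2 / (6 * math.pi ** 2)

def rhs_421(c0, c1=1.0):
    """(2 sqrt(c0 c1) / pi) * (1/4) * (1 + sqrt(13/3)); equals lhs_421 when c1 = 1."""
    return 2 * math.sqrt(c0 * c1) / math.pi * 0.25 * (1 + math.sqrt(13 / 3))
```

The correction term in $\varpi_\epsilon$ is negative because $(1 + \sqrt{13/3})/4 < 1$; this is what allows $\varpi_\epsilon$ to dip below $\sqrt3$ for small $\epsilon$.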

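Case (b). $|\delta|$ small: $|\delta|\le 1/2c_2$ or $D\le Q_0/2$. Then $\min(c_2x/q, D)\le Q/2$. We start by bounding the first $q/2$ terms in (4.15) by (4.13) and Lemma 4.1.3:

\[\begin{aligned}
\sum_{m\le q/2}|T_m(\alpha)| &\le \sum_{m\le q/2}\min\left(\frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|},\; \frac{c_0q/8x}{|\sin(\pi m\alpha)|^2}\right)\\
&\le \frac{|\eta'|_1}{\pi}q\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right).
\end{aligned}\tag{4.23}\]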


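If $q^2 < 2c_2x$, we estimate the terms with $q/2 < m\le c_2x/q$ by Lemma 4.1.2, which is applicable because $\min(c_2x/q, D) < Q/2$:

\[\begin{aligned}
\sum_{\substack{\frac{q}{2} < m\le D'\\ q\nmid m}}|T_m(\alpha)| &\le \sum_{j=1}^\infty\sum_{\substack{\left(j - \frac12\right)q < m\le\left(j + \frac12\right)q\\ m\le\min\left(\frac{c_2x}{q}, D\right)\\ q\nmid m}}\min\left(\frac{x}{\left(j - \frac12\right)q} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j + 1/2)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1\le j\le\frac{D'}{q} + \frac12}\left(j + \frac12\right) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\left(\frac{c_2x}{2q^2}\frac{D'}{q} + \frac32\left(\frac{c_2x}{q^2}\right) + \frac58\right)\\
&\le \frac{5c_0}{6\pi^2}\left(c_2D' + 3c_2q + \frac54\frac{q^3}{x}\right) \le \frac{5c_0c_2}{6\pi^2}\left(D' + \frac{11}{2}q\right),
\end{aligned}\tag{4.24}\]

where we write $D' = \min(c_2x/q, D)$. If $c_2x/q\ge D$, we stop here. Assume that $c_2x/q < D$. Let $R = \max(c_2x/q, q/2)$. The terms we have already estimated are precisely those with $m\le R$. We bound the terms $R < m\le D$ by the second bound in Lemma 4.1.1:

\[\begin{aligned}
\sum_{R < m\le D}|T_m(\alpha)| &\le \sum_{j=0}^\infty\sum_{\substack{m > jq + R\\ m\le\min((j+1)q + R, D)}}\min\left(\frac{c_1x}{jq + R},\; \frac{\frac{c_0}{4}\frac{(j+1)q + R}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\left(\frac{3c_1x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right).
\end{aligned}\tag{4.25}\]

(Note there is no need to use two successive approximations $a/q$, $a'/q'$ as in case (a). We are also including all terms with $m$ divisible by $q$, as we may, since $|T_m(\alpha)|$ is non-negative.) Now, much as before,

\[\sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\frac{x}{jq + R} \le \frac{x}{R} + \frac{x}{q}\int_R^D\frac1t\,dt \le \min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{x}{q}\log^+\frac{D}{c_2x/q}, \tag{4.26}\]

and

\[\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\sqrt{1 + \frac{q}{jq + R}} &\le \sqrt{1 + \frac{q}{R}} + \frac1q\int_R^D\sqrt{1 + \frac{q}{t}}\,dt\\
&\le \sqrt3 + \frac{D - R}{q} + \frac12\log^+\frac{D}{q/2}.
\end{aligned}\tag{4.27}\]

We sum with (4.23) and (4.24), and we obtain that (4.15) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt3q + D + \frac{q}{2}\log^+\frac{D}{q/2}\right) + \left(3c_1\log^+\frac{D}{c_2x/q}\right)\frac{x}{q}\\
&+ 3c_1\min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{55c_0c_2}{12\pi^2}q + \frac{|\eta'|_1}{\pi}q\cdot\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right),
\end{aligned}\tag{4.28}\]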


where we are using the fact that $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to make sure that the term $(5c_0c_2/6\pi^2)D'$ from (4.24) is more than compensated by the term $-2\sqrt{c_0c_1}R/\pi$ coming from $-R/q$ in (4.27) (by the definition of $D'$ and $R$, we have $R\ge D'$). We can also use $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to bound the term $(5c_0c_2/6\pi^2)D'$ from (4.24) by the term $2\sqrt{c_0c_1}D/\pi$ in (4.28), in case $c_2x/q\ge D$. (Again by definition, $D'\le D$.) Thus, (4.28) is valid both when $c_2x/q < D$ and when $c_2x/q\ge D$.

4.2.1 Type I variations

We will need a version of Lemma 4.2.1 with $m$ and $n$ restricted to the odd numbers. (We will barely be using the restriction of $m$, whereas the restriction on $n$ is both (a) slightly harder to deal with, (b) something that can be turned to our advantage.)

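Lemma 4.2.2. Let $\alpha\in\mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L^1$. Let $c_0\ge|\widehat{\eta''}|_\infty$.

Let $1\le D\le x$. Then, if $|\delta|\le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of

\[\sum_{\substack{m\le D\\ m \text{ odd}}}\mu(m)\sum_{n \text{ odd}}e(\alpha mn)\eta\left(\frac{mn}{x}\right) \tag{4.29}\]

is at most

\[\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\left|\sum_{\substack{m\le M/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right| + O^*\left(\frac{c_0q}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{D}{q} + 1\right)^2\right) \tag{4.30}\]

plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}q\log^+\frac{D}{q/2}\\
&+ \frac{2|\eta'|_1}{\pi}q\cdot\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{2c_2} + \frac{55c_0c_2}{6\pi^2}\right)q,
\end{aligned}\tag{4.31}\]

where $c_1 = 1 + |\eta'|_1/(x/D)$ and $M\in[\min(Q_0/2, D), D]$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.29) is at most (4.30) plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\sqrt{3+2\epsilon} + \frac12\log^+\frac{2D}{x/|\delta|q}\right)\right)\\
&+ \frac32c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{x/|\delta|q}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{3\pi^2}q,
\end{aligned}\tag{4.32}\]

for $\epsilon\in(0,1]$ arbitrary.

If $q$ is even, the sum (4.30) can be replaced by 0.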

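Proof. The proof is almost exactly that of Lemma 4.2.1; we go over the differences. The parameters $Q$, $Q'$, $a'$, $q'$ and $M$ are defined just as before (with $2\alpha$ wherever we had $\alpha$).

Let us first consider $m\le M$ odd and divisible by $q$. (Of course, this case arises only if $q$ is odd.) For $n = 2r + 1$,

\[\begin{aligned}
e(\alpha mn) &= e(\alpha m(2r+1)) = e(2\alpha rm)e(\alpha m)\\
&= e\left(\frac{\delta}{x}rm\right)e\left(\left(\frac{a}{2q} + \frac{\delta}{2x} + \frac{\kappa}{2}\right)m\right) = e\left(\frac{\delta(2r+1)}{2x}m\right)e\left(\frac{a + \kappa q}{2}\frac{m}{q}\right) = \kappa'e\left(\frac{\delta(2r+1)}{2x}m\right),
\end{aligned}\]

where $\kappa\in\{0,1\}$ and $\kappa' = e((a + \kappa q)/2)\in\{-1,1\}$ are independent of $m$ and $n$. Hence, by Poisson summation,

\[\begin{aligned}
\sum_{n \text{ odd}}e(\alpha mn)\eta(mn/x) &= \kappa'\sum_{n \text{ odd}}e((\delta m/2x)n)\eta(mn/x)\\
&= \frac{\kappa'}{2}\left(\sum_n\hat f(n) - \sum_n\hat f(n + 1/2)\right),
\end{aligned}\tag{4.33}\]

where $f(u) = e((\delta m/2x)u)\eta((m/x)u)$. Now

\[\hat f(t) = \frac{x}{m}\hat\eta\left(\frac{x}{m}t - \frac{\delta}{2}\right).\]

Just as before, $|x/m|\ge 2|\delta q|\ge 2\delta$. Thus

\[\begin{aligned}
\frac12\left|\sum_n\hat f(n) - \sum_n\hat f(n + 1/2)\right| &\le \frac{x}{m}\left(\frac12\left|\hat\eta\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne 0}\left|\hat\eta\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right)\\
&= \frac{x}{m}\left(\frac12\left|\hat\eta\left(-\frac{\delta}{2}\right)\right| + \frac12\cdot O^*\left(\sum_{n\ne 0}\frac{1}{\left(\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\
&= \frac{x}{2m}\left|\hat\eta\left(-\frac{\delta}{2}\right)\right| + \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4).
\end{aligned}\tag{4.34}\]

The contribution of the second term in the last line of (4.34) is

\[\begin{aligned}
\sum_{\substack{m\le M\\ m \text{ odd}\\ q|m}}\frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4) &= \frac{q}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4)\cdot\sum_{\substack{m\le M/q\\ m \text{ odd}}}m\\
&\le \frac{qc_0}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{M}{q} + 1\right)^2.
\end{aligned}\]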


Hence, the absolute value of the sum of all terms with $m\le M$ and $q|m$ is given by (4.30).

We define $T_{m,\circ}(\alpha)$ by

\[T_{m,\circ}(\alpha) = \sum_{n \text{ odd}}e(\alpha mn)\eta\left(\frac{mn}{x}\right). \tag{4.35}\]

Changing variables by $n = 2r + 1$, we see that

\[|T_{m,\circ}(\alpha)| = \left|\sum_r e(2\alpha\cdot mr)\eta(m(2r+1)/x)\right|.\]

Hence, instead of (4.13), we get that

\[|T_{m,\circ}(\alpha)|\le \min\left(\frac{x}{2m} + \frac12|\eta'|_1,\; \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right). \tag{4.36}\]

We obtain (4.14), but with $T_{m,\circ}$ instead of $T_m$, $A = (x/2y_1)(1 + |\eta'|_1/(x/y_1))$ and $C = (c_0/2)(y_2/x)$, and so $c_1 = 1 + |\eta'|_1/(x/D)$.

The rest of the proof of Lemma 4.2.1 carries over almost word-by-word. (For the sake of simplicity, we do not really try to take advantage of the odd support of $m$ here.) Since $C$ has doubled, it would seem to make sense to reset the value of $c_2$ to be $c_2 = (3\pi/5\sqrt{2c_0})(1 + \sqrt{13/3})$; this would cause complications related to the fact that $5c_0c_2/3\pi^2$ would become larger than $2\sqrt{c_0}/\pi$, and so we set $c_2$ to the slightly smaller value $c_2 = 6\pi/5\sqrt{c_0}$ instead. This implies

\[\frac{5c_0c_2}{3\pi^2} = \frac{2\sqrt{c_0}}{\pi}. \tag{4.37}\]

The bound from (4.16) gets multiplied by 2 (but the value of $c_2$ has changed), the second line in (4.19) gets halved, (4.21) gets replaced by (4.37), the second term in the maximum in the second line of (4.23) gets doubled, the bound from (4.24) gets doubled, and the bound from (4.26) gets halved.
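The choice of $c_2$ in Lemma 4.2.2 can be checked by direct arithmetic: the "natural" value, with $c_0$ replaced by $2c_0$, would make $5c_0c_2/3\pi^2$ exceed $2\sqrt{c_0}/\pi$, whereas $c_2 = 6\pi/5\sqrt{c_0}$ gives equality, as in (4.37). A small sketch, with hypothetical function names:

```python
import math

def ratio(c0, c2):
    """(5 c0 c2 / (3 pi^2)) divided by (2 sqrt(c0) / pi); equals 1 exactly when (4.37) holds."""
    return (5 * c0 * c2 / (3 * math.pi ** 2)) / (2 * math.sqrt(c0) / math.pi)

def c2_natural(c0):
    """The value (3 pi / (5 sqrt(2 c0))) (1 + sqrt(13/3)) that is considered and rejected."""
    return 3 * math.pi / (5 * math.sqrt(2 * c0)) * (1 + math.sqrt(13 / 3))

def c2_chosen(c0):
    """The value 6 pi / (5 sqrt(c0)) actually used in Lemma 4.2.2."""
    return 6 * math.pi / (5 * math.sqrt(c0))
```

The ratio for the rejected value is $(1 + \sqrt{13/3})/(2\sqrt2)\approx 1.09$, independently of $c_0$, so the adjustment to $c_2$ is only about 8%.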

We will also need a version of Lemma 4.2.1 (or rather Lemma 4.2.2; we will decide to work with the restriction that $n$ and $m$ be odd) with a factor of $(\log n)$ within the inner sum. This is the sum $S_{I,1}$ in (3.9).

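Lemma 4.2.3. Let $\alpha\in\mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge\max(16, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L^1$. Let $c_0\ge|\widehat{\eta''}|_\infty$. Assume that, for any $\rho\ge\rho_0$, $\rho_0$ a constant, the function $\eta_{(\rho)}(t) = \log(\rho t)\eta(t)$ satisfies

\[|\eta_{(\rho)}|_1\le\log(\rho)|\eta|_1,\quad |\eta'_{(\rho)}|_1\le\log(\rho)|\eta'|_1,\quad |\widehat{\eta''_{(\rho)}}|_\infty\le c_0\log(\rho). \tag{4.38}\]

Let $\sqrt3\le D\le\min(x/\rho_0, x/e)$. Then, if $|\delta|\le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of

\[\sum_{\substack{m\le D\\ m \text{ odd}}}\mu(m)\sum_{\substack{n\\ n \text{ odd}}}(\log n)e(\alpha mn)\eta\left(\frac{mn}{x}\right) \tag{4.39}\]

is at most

\[\begin{aligned}
&\frac{x}{q}\min\left(1, \frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m\le M/q\\ (m,q)=1}}\frac{\mu(m)}{m}\log\frac{x}{mq}\right| + \frac{x}{q}\left|\widehat{\log\cdot\eta}(-\delta)\right|\left|\sum_{\substack{m\le M/q\\ (m,q)=1}}\frac{\mu(m)}{m}\right|\\
&+ O^*\left(c_0\left(\frac12 - \frac{2}{\pi^2}\right)\left(\frac{D^2}{4qx}\log\frac{e^{1/2}x}{D} + \frac1e\right)\right)
\end{aligned}\tag{4.40}\]

plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q}\log\frac{q}{c_2}\\
&+ \left(\frac{2|\eta'|_1}{\pi}\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x + \frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt3 + \frac12\log^+\frac{D}{q/2}\right)\log\frac{q}{c_2}\right)q\\
&+ \frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2} + \frac{20c_0c_2^{3/2}}{3\pi^2}\sqrt{2x}\log\frac{2\sqrt{e}x}{c_2}
\end{aligned}\tag{4.41}\]

for $c_1 = 1 + |\eta'|_1/(x/D)$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.39) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D}\\
&+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q} + 1\right)\left(\sqrt{3+2\epsilon}\cdot\log^+2\sqrt{e}|\delta|q + \frac12\log^+\frac{2D}{x/|\delta|q}\log^+2|\delta|q\right)\\
&+ \left(\frac{3c_1}{4}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{2\epsilon}\log x\right) + \frac{40}{3}\sqrt2c_0c_2^{3/2}\right)\sqrt{x}\log x
\end{aligned}\tag{4.42}\]

for $\epsilon\in(0,1]$.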

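Proof. Define $Q$, $Q'$, $M$, $a'$ and $q'$ as in the proof of Lemma 4.2.1. The same method of proof works as for Lemma 4.2.1; we go over the differences. When applying Poisson summation or (2.2), use $\eta_{(x/m)}(t) = (\log xt/m)\eta(t)$ instead of $\eta(t)$. Then use the bounds in (4.38) with $\rho = x/m$; in particular,

\[|\widehat{\eta''_{(x/m)}}|_\infty\le c_0\log\frac{x}{m}.\]

For $f(u) = e((\delta m/2x)u)(\log u)\eta((m/x)u)$,

\[\hat f(t) = \frac{x}{m}\widehat{\eta_{(x/m)}}\left(\frac{x}{m}t - \frac{\delta}{2}\right)\]

and so

\[\begin{aligned}
\frac12\sum_n\left|\hat f(n/2)\right| &\le \frac{x}{m}\left(\frac12\left|\widehat{\eta_{(x/m)}}\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne 0}\left|\widehat{\eta_{(x/m)}}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right)\\
&= \frac12\frac{x}{m}\left(\widehat{\log\cdot\eta}\left(-\frac{\delta}{2}\right) + \log\left(\frac{x}{m}\right)\hat\eta\left(-\frac{\delta}{2}\right)\right) + \frac{m}{x}\left(\log\frac{x}{m}\right)\frac{c_0}{2\pi^2}(\pi^2 - 4).
\end{aligned}\]

The part of the main term involving $\log(x/m)$ becomes

\[\frac{x\hat\eta(-\delta)}{2}\sum_{\substack{m\le M\\ m \text{ odd}\\ q|m}}\frac{\mu(m)}{m}\log\left(\frac{x}{m}\right) = \frac{x\mu(q)}{q}\hat\eta(-\delta)\cdot\sum_{\substack{m\le M/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\log\left(\frac{x}{mq}\right)\]

for $q$ odd. (We can see that this, like the rest of the main term, vanishes for $m$ even.) In the term in front of $\pi^2 - 4$, we find the sum

\[\begin{aligned}
\sum_{\substack{m\le M\\ m \text{ odd}\\ q|m}}\frac{m}{x}\log\left(\frac{x}{m}\right) &\le \frac{M}{x}\log\frac{x}{M} + \frac{q}{2x}\int_0^{M/q}t\log\frac{x/q}{t}\,dt\\
&= \frac{M}{x}\log\frac{x}{M} + \frac{M^2}{4qx}\log\frac{e^{1/2}x}{M},
\end{aligned}\]

where we use the fact that $t\mapsto t\log(x/t)$ is increasing for $t\le x/e$. By the same fact (and by $M\le D$), $(M^2/q)\log(e^{1/2}x/M)\le(D^2/q)\log(e^{1/2}x/D)$. It is also easy to see that $(M/x)\log(x/M)\le 1/e$ (since $M\le D\le x$).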
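The closed form used here is $\int_0^a t\log(b/t)\,dt = (a^2/2)\log(e^{1/2}b/a)$, applied with $a = M/q$ and $b = x/q$. The following sketch compares it with a simple midpoint rule and also looks at the bound $u\log(1/u)\le 1/e$ on $(0,1]$; the function names and the sample values are illustrative assumptions.

```python
import math

def integral_t_log(a, b, steps=20000):
    """Midpoint-rule approximation to the integral of t*log(b/t) over 0 < t <= a."""
    h = a / steps
    return h * sum((k + 0.5) * h * math.log(b / ((k + 0.5) * h)) for k in range(steps))

def closed_form(a, b):
    """(a^2/2) * log(sqrt(e) * b / a)."""
    return 0.5 * a * a * math.log(math.sqrt(math.e) * b / a)

def u_log_inv(u):
    """u * log(1/u), which is at most 1/e on (0, 1], with equality at u = 1/e."""
    return u * math.log(1 / u)

# Sample values (assumptions): x = 10^6, q = 7, M = 5000.
x, q, M = 10 ** 6, 7, 5000
scaled = q / (2 * x) * closed_form(M / q, x / q)
target = M * M / (4 * q * x) * math.log(math.sqrt(math.e) * x / M)
```

Here `scaled` and `target` are two ways of writing the second term on the right of the last display, so they should agree up to rounding.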

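The basic estimate for the rest of the proof (replacing (4.13)) is

\[\begin{aligned}
T_{m,\circ}(\alpha) &= \sum_{n \text{ odd}}e(\alpha mn)(\log n)\eta\left(\frac{mn}{x}\right) = \sum_{n \text{ odd}}e(\alpha mn)\eta_{(x/m)}\left(\frac{mn}{x}\right)\\
&= O^*\left(\min\left(\frac{x}{2m}|\eta_{(x/m)}|_1 + \frac{|\eta'_{(x/m)}|_1}{2},\; \frac{\frac12|\eta'_{(x/m)}|_1}{|\sin(2\pi m\alpha)|},\; \frac{m}{x}\frac{\frac12|\widehat{\eta''_{(x/m)}}|_\infty}{(\sin 2\pi m\alpha)^2}\right)\right)\\
&= O^*\left(\log\frac{x}{m}\cdot\min\left(\frac{x}{2m} + \frac{|\eta'|_1}{2},\; \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right)\right).
\end{aligned}\]

We wish to bound

\[\sum_{\substack{m\le M\\ q\nmid m\\ m \text{ odd}}}|T_{m,\circ}(\alpha)| + \sum_{\frac{Q}{2} < m\le D}|T_{m,\circ}(\alpha)|. \tag{4.43}\]

Just as in the proofs of Lemmas 4.2.1 and 4.2.2, we give two bounds, one valid for $|\delta|$ large ($|\delta|\ge 1/2c_2$) and the other for $\delta$ small ($|\delta|\le 1/2c_2$). Again as in the proof of Lemma 4.2.2, we ignore the condition that $m$ is odd in (4.15).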


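Consider the case of $|\delta|$ large first. Instead of (4.16), we have

\[\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_{m,\circ}(\alpha)|\le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0\le j\le\frac{M}{q}}(j+1)\log\frac{x}{jq+1}. \tag{4.44}\]

Since

\[\begin{aligned}
\sum_{0\le j\le\frac{M}{q}}(j+1)\log\frac{x}{jq+1} &\le \log x + \frac{M}{q}\log\frac{x}{M} + \sum_{1\le j\le\frac{M}{q}}\log\frac{x}{jq} + \sum_{1\le j\le\frac{M}{q} - 1}j\log\frac{x}{jq}\\
&\le \log x + \frac{M}{q}\log\frac{x}{M} + \int_0^{\frac{M}{q}}\log\frac{x}{tq}\,dt + \int_1^{\frac{M}{q}}t\log\frac{x}{tq}\,dt\\
&\le \log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M},
\end{aligned}\]

this means that

\[\begin{aligned}
\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\left(\log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M}\right)\\
&\le \frac{5c_0c_2}{3\pi^2}M\log\frac{\sqrt{e}x}{M} + \frac{40}{3}\sqrt2c_0c_2^{3/2}\sqrt{x}\log x,
\end{aligned}\tag{4.45}\]

where we are using the bounds $M\le Q/2\le c_2x/q$ and $q^2\le 2c_2x$ (just as in (4.16)). Instead of (4.17), we have

\[\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(\log\frac{x}{jq' + \frac{Q+1}{2}}\right)\frac{x}{jq' + \frac{Q+1}{2}} &\le \frac{x}{Q/2}\log\frac{2x}{Q} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^D\log\frac{x}{t}\frac{dt}{t}\\
&\le \frac{2x}{Q}\log\frac{2x}{Q} + \frac{x}{q'}\log\frac{2x}{Q}\log^+\frac{2D}{Q};
\end{aligned}\]

recall that the coefficient in front of this sum will be halved by the condition that $n$ is odd. Instead of (4.18), we obtain

\[\begin{aligned}
q'&\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\sqrt{1 + \frac{q'}{jq' + (Q+1)/2}}\left(\log\frac{x}{jq' + \frac{Q+1}{2}}\right)\\
&\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1} + \int_{\frac{Q+1}{2}}^D\left(1 + \frac{q'}{2t}\right)\left(\log\frac{x}{t}\right)dt\\
&\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1} + D\log\frac{ex}{D} - \frac{Q+1}{2}\log\frac{2ex}{Q+1} + \frac{q'}{2}\log\frac{2x}{Q+1}\log\frac{2D}{Q+1}.
\end{aligned}\]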


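(The bound $\int_a^b\log(x/t)\,dt/t\le\log(x/a)\log(b/a)$ will be more practical than the exact expression for the integral.) Hence $\sum_{Q/2 < m\le D}|T_{m,\circ}(\alpha)|$ is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D}\\
&+ \frac{2\sqrt{c_0c_1}}{\pi}\left((1+\epsilon)\sqrt{3+2\epsilon} + \frac{(1+\epsilon)}{2}\log\frac{2D}{Q+1}\right)(Q+1)\log\frac{2x}{Q+1}\\
&- \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac{Q+1}{2}\log\frac{2ex}{Q+1} + \frac{3c_1}{2}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x}.
\end{aligned}\]

Summing this to (4.45) (with $M = Q/2$), and using (4.21) and (4.22) as before, we obtain that (4.43) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D}\\
&+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left(\sqrt{3+2\epsilon}\log^+\frac{2\sqrt{e}x}{Q+1} + \frac12\log^+\frac{2D}{Q+1}\log^+\frac{2x}{Q+1}\right)\\
&+ \frac{3c_1}{2}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x} + \frac{40}{3}\sqrt2c_0c_2^{3/2}\sqrt{x}\log x.
\end{aligned}\]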

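Now we go over the case of $|\delta|$ small (or $D\le Q_0/2$). Instead of (4.23), we have

\[\sum_{m\le q/2}|T_{m,\circ}(\alpha)|\le \frac{2|\eta'|_1}{\pi}q\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x. \tag{4.46}\]

Suppose $q^2 < 2c_2x$. (Otherwise, the sum we are about to estimate is empty.) Instead of (4.24), we have

\[\begin{aligned}
\sum_{\substack{\frac{q}{2} < m\le D'\\ q\nmid m}}|T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1\le j\le\frac{D'}{q} + \frac12}\left(j + \frac12\right)\log\frac{x}{\left(j - \frac12\right)q}\\
&\le \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q} + \frac1q\int_0^{D'}\log\frac{x}{t}\,dt + \frac{1}{q^2}\int_0^{D'}t\log\frac{x}{t}\,dt + \frac{D'}{q}\log\frac{x}{D'}\right)\\
&= \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q} + \left(\frac{2D'}{q} + \frac{(D')^2}{2q^2}\right)\log\frac{\sqrt{e}x}{D'}\right)\\
&\le \frac{5c_0c_2}{3\pi^2}\left(4\sqrt{2c_2x}\log\frac{2x}{q} + 4\sqrt{2c_2x}\log\frac{\sqrt{e}x}{D'} + D'\log\frac{\sqrt{e}x}{D'}\right)\\
&\le \frac{5c_0c_2}{3\pi^2}\left(D'\log\frac{\sqrt{e}x}{D'} + 4\sqrt{2c_2x}\log\frac{2\sqrt{e}x}{c_2}\right),
\end{aligned}\tag{4.47}\]

where $D' = \min(c_2x/q, D)$. (We are using the bounds $q^3/x\le(2c_2)^{3/2}\sqrt{x}$, $D'q^2/x\le c_2q < c_2^{3/2}\sqrt{2x}$ and $D'q/x\le c_2$.) Instead of (4.25), we have

\[\sum_{R < m\le D}|T_{m,\circ}(\alpha)|\le \sum_{j=0}^{\left\lfloor\frac{D - R}{q}\right\rfloor}\left(\frac{\frac{3c_1}{2}x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right)\log\frac{x}{jq + R},\]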


where $R = \max(c_2x/q, q/2)$. We can simply reuse (4.26), multiplying it by $\log x/R$; the only difference is that now we take care to bound $\min(q/c_2, 2x/q)$ by the geometric mean $\sqrt{(q/c_2)(2x/q)} = \sqrt{2x/c_2}$. We replace (4.27) by

\[\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\sqrt{1 + \frac{q}{jq + R}}\log\frac{x}{jq + R} &\le \sqrt{1 + \frac{q}{R}}\log\frac{x}{R} + \frac1q\int_R^D\sqrt{1 + \frac{q}{t}}\log\frac{x}{t}\,dt\\
&\le \sqrt3\log\frac{q}{c_2} + \left(\frac{D}{q}\log\frac{ex}{D} - \frac{R}{q}\log\frac{ex}{R}\right) + \frac12\log\frac{q}{c_2}\log^+\frac{D}{R}.
\end{aligned}\tag{4.48}\]

We sum with (4.46) and (4.47), and obtain (4.41) as an upper bound for (4.43). (Just as in the proof of Lemma 4.2.1, the term $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ is smaller than the term $(2\sqrt{c_1c_0}/\pi)R\log ex/R$ in (4.48), and thus gets absorbed by it when $D > R$. If $D\le R$, then, again as in Lemma 4.2.1, the sum $\sum_{R < m\le D}|T_{m,\circ}(\alpha)|$ is empty, and we bound $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ by the term $(2\sqrt{c_1c_0}/\pi)D\log ex/D$, which would not appear otherwise.)

Now comes the time to focus on our second type I sum, namely,

\[\sum_{\substack{v\le V\\ v \text{ odd}}}\Lambda(v)\sum_{\substack{u\le U\\ u \text{ odd}}}\mu(u)\sum_{\substack{n\\ n \text{ odd}}}e(\alpha vun)\eta(vun/x),\]

which corresponds to the term $S_{I,2}$ in (3.9). The innermost two sums, on their own, are a sum of type I we have already seen. Accordingly, for $q$ small, we will be able to bound them using Lemma 4.2.2. If $q$ is large, then that approach does not quite work, since then the approximation $av/q$ to $v\alpha$ is not always good enough. (As we shall later see, we need $q\le Q/v$ for the approximation to be sufficiently close for our purposes.)

Fortunately, when $q$ is large, we can also afford to lose a factor of $\log$, since the gains from $q$ will be large. Here is the estimate we will use for $q$ large.

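Lemma 4.2.4. Let $\alpha\in\mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge\max(2e, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L^1$. Let $c_0\ge|\widehat{\eta''}|_\infty$. Let $c_2 = 6\pi/5\sqrt{c_0}$. Assume that $x\ge e^2c_2/2$.

Let $U, V\ge 1$ satisfy $UV + (19/18)Q_0\le x/5.6$. Then, if $|\delta|\le 1/2c_2$, the absolute value of

\[\left|\sum_{\substack{v\le V\\ v \text{ odd}}}\Lambda(v)\sum_{\substack{u\le U\\ u \text{ odd}}}\mu(u)\sum_{\substack{n\\ n \text{ odd}}}e(\alpha vun)\eta(vun/x)\right| \tag{4.49}\]

is at most

\[\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\log Vq + O^*\left(\frac14 - \frac{1}{\pi^2}\right)\cdot c_0\left(\frac{D^2\log V}{2qx} + \frac{3c_4}{2}\frac{UV^2}{x} + \frac{(U+1)^2V}{2x}\log q\right) \tag{4.50}\]

plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}} + q\left(\sqrt3\log\frac{c_2x}{q} + \frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right)\\
&+ \frac{3c_1}{2}\frac{x}{q}\log D\log^+\frac{D}{c_2x/q} + \frac{2|\eta'|_1}{\pi}q\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}\\
&+ \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}\tag{4.51}\]

where $D = UV$ and $c_1 = 1 + |\eta'|_1/(2x/D)$ and $c_4 = 1.03884$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.49) is at most (4.50) plus

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e}\\
&+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q} + 1\right)\left((\sqrt{3+2\epsilon} - 1)\log\frac{\frac{x}{|\delta|q} + 1}{\sqrt2} + \frac12\log D\log^+\frac{e^2D}{x/|\delta|q}\right)\\
&+ \left(\frac{3c_1}{2}\left(\frac12 + \frac{3(1+\epsilon)}{16\epsilon}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x
\end{aligned}\tag{4.52}\]

for $\epsilon\in(0,1]$.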

Proof We proceed essentially as in Lemma 421 and Lemma 422 Let Q qprime and Qprime

be as in the proof of Lemma 422 that is with 2α where Lemma 421 uses αLet M = min(UVQ2) We first consider the terms with uv le M u and v odd

uv divisible by q If q is even there are no such terms Assume q is odd Then by(433) and (434) the absolute value of the contribution of these terms is at most

sumaleMa oddq|a

sumv|a

aUlevleV

Λ(v)micro(av)

(xη(minusδ2)

2a+O

(a

x

|ηprimeprime|infin2π2

middot (π2 minus 4)

)) (453)

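Now
\[
\sum_{\substack{a \le M \\ a \text{ odd} \\ q|a}} \sum_{\substack{v|a \\ a/U \le v \le V}} \frac{\Lambda(v)\mu(a/v)}{a}
= \sum_{\substack{v \le V \\ v \text{ odd} \\ (v,q)=1}} \frac{\Lambda(v)}{v} \sum_{\substack{u \le \min(U, M/V) \\ u \text{ odd} \\ q|u}} \frac{\mu(u)}{u}
+ \sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} \frac{\Lambda(p^\alpha)}{p^\alpha} \sum_{\substack{u \le \min(U, M/V) \\ u \text{ odd} \\ \frac{q}{(q,p^\alpha)} | u}} \frac{\mu(u)}{u},
\]
which equals
\[
\begin{aligned}
&\frac{\mu(q)}{q} \sum_{\substack{v \le V \\ v \text{ odd} \\ (v,q)=1}} \frac{\Lambda(v)}{v} \sum_{\substack{u \le \min(U/q, M/Vq) \\ (u,2q)=1}} \frac{\mu(u)}{u}
+ \sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} \frac{\mu\left(\frac{q}{(q,p^\alpha)}\right)}{q} \frac{\Lambda(p^\alpha)}{p^\alpha/(q,p^\alpha)} \sum_{\substack{u \le \min\left(\frac{U}{q/(q,p^\alpha)}, \frac{M/V}{q/(q,p^\alpha)}\right) \\ u \text{ odd} \\ \left(u, \frac{q}{(q,p^\alpha)}\right)=1}} \frac{\mu(u)}{u} \\
&= \frac{1}{q} \cdot O^*\left( \sum_{\substack{v \le V \\ (v,2q)=1}} \frac{\Lambda(v)}{v} + \sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} \frac{\log p}{p^\alpha/(q,p^\alpha)} \right),
\end{aligned}
\]
where we are using (2.20) to bound the sums on $u$ by 1. We notice that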

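\[
\begin{aligned}
\sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} \frac{\log p}{p^\alpha/(q,p^\alpha)}
&\le \sum_{\substack{p \text{ odd} \\ p|q}} (\log p)\left( v_p(q) + \sum_{\substack{\alpha > v_p(q) \\ p^\alpha \le V}} \frac{1}{p^{\alpha - v_p(q)}} \right) \\
&\le \log q + \sum_{\substack{p \text{ odd} \\ p|q}} \sum_{\substack{\beta > 0 \\ p^\beta \le V/p^{v_p(q)}}} \frac{\log p}{p^\beta}
\le \log q + \sum_{\substack{v \le V \\ v \text{ odd} \\ (v,q) \ne 1}} \frac{\Lambda(v)}{v},
\end{aligned}
\]
and so
\[
\sum_{\substack{a \le M \\ a \text{ odd} \\ q|a}} \sum_{\substack{v|a \\ a/U \le v \le V}} \frac{\Lambda(v)\mu(a/v)}{a}
= \frac{1}{q} \cdot O^*\left( \log q + \sum_{\substack{v \le V \\ (v,2)=1}} \frac{\Lambda(v)}{v} \right)
= \frac{1}{q} \cdot O^*(\log q + \log V)
\]
by (2.12). The absolute value of the sum of the terms with $\widehat{\eta}(-\delta/2)$ in (4.53) is thus at most
\[
\frac{x}{q}\frac{\widehat{\eta}(-\delta/2)}{2}(\log q + \log V) \le \frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\log Vq,
\]
where we are bounding $\widehat{\eta}(-\delta/2)$ by (2.1) (with $k = 2$).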


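The other terms in (4.53) contribute at most
\[
\frac{(\pi^2 - 4)|\widehat{\eta''}|_\infty}{2\pi^2}\frac{1}{x} \sum_{u \le U} \sum_{\substack{v \le V \\ uv \text{ odd} \\ uv \le M,\; q|uv \\ u \text{ sq-free}}} \Lambda(v)\,uv. \tag{4.54}
\]
For any $R$, $\sum_{u \le R,\, u \text{ odd},\, q|u} u \le R^2/4q + 3R/4$. Using the estimates (2.12), (2.15) and (2.16), we obtain that the double sum in (4.54) is at most
\[
\begin{aligned}
&\sum_{\substack{v \le V \\ (v,2q)=1}} \Lambda(v)\,v \sum_{\substack{u \le \min(U, M/v) \\ u \text{ odd} \\ q|u}} u
+ \sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} (\log p)\,p^\alpha \sum_{\substack{u \le U \\ u \text{ odd} \\ \frac{q}{(q,p^\alpha)} | u}} u \\
&\le \sum_{\substack{v \le V \\ (v,2q)=1}} \Lambda(v)\,v \cdot \left( \frac{(M/v)^2}{4q} + \frac{3M}{4v} \right)
+ \sum_{\substack{p^\alpha \le V \\ p \text{ odd} \\ p|q}} (\log p)\,p^\alpha \cdot \frac{(U+1)^2}{4} \\
&\le \frac{M^2 \log V}{4q} + \frac{3c_4}{4}MV + \frac{(U+1)^2}{4}V\log q,
\end{aligned} \tag{4.55}
\]
where $c_4 = 1.03884$.

From this point onwards, we use the easy bound
\[
\left| \sum_{\substack{v|a \\ a/U \le v \le V}} \Lambda(v)\mu(a/v) \right| \le \log a.
\]
What we must bound now is
\[
\sum_{\substack{m \le UV \\ m \text{ odd} \\ q \nmid m \text{ or } m > M}} (\log m) \sum_{n \text{ odd}} e(\alpha mn)\,\eta(mn/x). \tag{4.56}
\]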

The inner sum is the same as the sum $T_m(\alpha)$ in (4.35); we will be using the bound (4.36). Much as before, we will be able to ignore the condition that $m$ is odd.

Let $D = UV$. What remains to do is similar to what we did in the proof of Lemma 4.2.1 (or Lemma 4.2.2).

Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$. Instead of (4.16), we have
\[
\sum_{\substack{1 \le m \le M \\ q \nmid m}} (\log m)|T_m(\alpha)| \le \frac{40}{3\pi^2}\frac{c_0q^3}{4x} \sum_{0 \le j \le M/q} (j+1)\log((j+1)q),
\]
and, since $M \le \min(c_2x/q, D)$, $q \le \sqrt{2c_2x}$ (just as in the proof of Lemma 4.2.1) and
\[
\begin{aligned}
\sum_{0 \le j \le M/q} (j+1)\log((j+1)q)
&\le \frac{M}{q}\log M + \left(\frac{M}{q}+1\right)\log(M+1) + \frac{1}{q^2}\int_0^M t\log t\,dt \\
&\le \left(\frac{2M}{q}+1\right)\log x + \frac{M^2}{2q^2}\log\frac{M}{\sqrt{e}},
\end{aligned}
\]
we conclude that
\[
\sum_{\substack{1 \le m \le M \\ q \nmid m}} (\log m)|T_m(\alpha)| \le \frac{5c_0c_2}{3\pi^2}M\log\frac{M}{\sqrt{e}} + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\sqrt{x}\log x. \tag{4.57}
\]

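Instead of (4.17), we have
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \frac{x}{jq' + \frac{Q+1}{2}} \log\left(jq' + \frac{Q+1}{2}\right)
&\le \frac{x}{\frac{Q+1}{2}}\log\frac{Q+1}{2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D} \frac{\log t}{t}\,dt \\
&\le \frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left( (\log D)^2 - \left(\log\frac{Q}{2}\right)^2 \right).
\end{aligned}
\]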

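Instead of (4.18), we estimate
\[
\begin{aligned}
&q' \sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \left(\log\left(\frac{Q+1}{2} + jq'\right)\right) \sqrt{1 + \frac{q'}{jq' + \frac{Q+1}{2}}} \\
&\le q'\left( \log D + (\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2} \right) + \int_{\frac{Q+1}{2}}^{D} \log t\,dt + \int_{\frac{Q+1}{2}}^{D} \frac{q'\log t}{2t}\,dt \\
&\le q'\left( \log D + (\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2} \right) + \left( D\log\frac{D}{e} - \frac{Q+1}{2}\log\frac{Q+1}{2e} \right) + \frac{q'}{2}\log D\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned}
\]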

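We conclude that, when $D \ge Q/2$, the sum $\sum_{Q/2 < m \le D} (\log m)|T_m(\alpha)|$ is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left( D\log\frac{D}{e} + (Q+1)\left( (1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2} - \frac{1}{2}\log\frac{Q+1}{2e} \right)\right) \\
&+ \frac{\sqrt{c_0c_1}}{\pi}(Q+1)(1+\epsilon)\log D\log^+\frac{e^2D}{\frac{Q+1}{2}} \\
&+ \frac{3c_1}{2}\left( \frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left( (\log D)^2 - \left(\log\frac{Q}{2}\right)^2 \right)\right).
\end{aligned}
\]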


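We must now add this to (4.57). Since
\[
(1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\sqrt{2} - \frac{1}{2}\log 2e + \frac{1+\sqrt{13/3}}{2}\log 2\sqrt{e} > 0
\]
and $Q \ge 2\sqrt{x}$, we conclude that (4.56) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e} \\
&+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left( (\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{\sqrt{2}} + \frac{1}{2}\log D\log^+\frac{e^2D}{\frac{Q+1}{2}} \right) \\
&+ \left( \frac{3c_1}{2}\left(\frac{1}{2} + \frac{3(1+\epsilon)}{16\epsilon}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2} \right)\sqrt{x}\log x.
\end{aligned} \tag{4.58}
\]

Case (b). $\delta$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. The analogue of (4.23) is a bound of
\[
\le \frac{2|\eta'|_1}{\pi}\,q\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
\]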

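for the terms with $m \le q/2$. If $q^2 < 2c_2x$, then, much as in (4.24), we have
\[
\begin{aligned}
\sum_{\substack{q/2 < m \le D' \\ q \nmid m}} |T_m(\alpha)|(\log m)
&\le \frac{10}{\pi^2}\frac{c_0q^3}{3x} \sum_{1 \le j \le \frac{D'}{q} + \frac{1}{2}} \left(j + \frac{1}{2}\right)\log((j+1/2)q) \\
&\le \frac{10}{\pi^2}\frac{c_0q}{3x}\int_q^{D' + \frac{3}{2}q} t\log t\,dt.
\end{aligned} \tag{4.59}
\]
Since
\[
\begin{aligned}
\int_q^{D' + \frac{3}{2}q} t\log t\,dt
&= \frac{1}{2}\left(D' + \frac{3}{2}q\right)^2\log\frac{D' + \frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}} \\
&\le \left(\frac{1}{2}D'^2 + \frac{3}{2}D'q\right)\left(\log\frac{D'}{\sqrt{e}} + \frac{3}{2}\frac{q}{D'}\right) + \frac{9}{8}q^2\log\frac{D' + \frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}} \\
&\le \frac{1}{2}D'^2\log\frac{D'}{\sqrt{e}} + \frac{3}{2}D'q\log D' + \frac{9}{8}q^2\left( \frac{2}{9} + \frac{3}{2} + \log\left(D' + \frac{19}{18}q\right)\right),
\end{aligned}
\]
where $D' = \min(c_2x/q, D)$, and since the assumption $UV + (19/18)Q_0 \le x/5.6$ implies that $2/9 + 3/2 + \log(D' + (19/18)q) \le \log x$, we conclude that
\[
\begin{aligned}
\sum_{\substack{q/2 < m \le D' \\ q \nmid m}} |T_m(\alpha)|(\log m)
&\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{10c_0}{3\pi^2}\left( \frac{3}{4}(2c_2)^{3/2}\sqrt{x}\log x + \frac{9}{8}(2c_2)^{3/2}\sqrt{x}\log x \right) \\
&\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\end{aligned} \tag{4.60}
\]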
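The two pieces of arithmetic used just above can be checked mechanically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it only confirms that $2/9 + 3/2 \le \log 5.6$ (which is what turns $D' + (19/18)q \le x/5.6$ into a bound by $\log x$) and that $(10/3)(3/4 + 9/8) = 25/4$.

```python
from fractions import Fraction
import math


def log_margin() -> float:
    """Return log(5.6) - (2/9 + 3/2).

    If this is nonnegative, then D' + (19/18) q <= x / 5.6 implies
    2/9 + 3/2 + log(D' + (19/18) q) <= log x.
    """
    return math.log(5.6) - float(Fraction(2, 9) + Fraction(3, 2))


def combined_coefficient() -> Fraction:
    """Coefficient of (c0 / pi^2) (2 c2)^{3/2} sqrt(x) log x after merging terms."""
    return Fraction(10, 3) * (Fraction(3, 4) + Fraction(9, 8))


if __name__ == "__main__":
    print("margin:", log_margin())
    print("coefficient:", combined_coefficient())
```

The margin is positive but small, which suggests that the constant 5.6 was chosen to be close to $e^{2/9+3/2}$.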


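Let $R = \max(c_2x/q, q/2)$. We bound the terms $R < m \le D$ as in (4.25), with a factor of $\log(jq + R)$ inside the sum. The analogues of (4.26) and (4.27) are
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D-R) \right\rfloor} \frac{x}{jq+R}\log(jq+R)
&\le \frac{x}{R}\log R + \frac{x}{q}\int_R^D \frac{\log t}{t}\,dt \\
&\le \sqrt{\frac{2x}{c_2}}\log\sqrt{\frac{c_2x}{2}} + \frac{x}{q}\log D\log^+\frac{D}{R},
\end{aligned} \tag{4.61}
\]
where we use the assumption that $x \ge e^2c_2/2$, and
\[
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D-R) \right\rfloor} \log(jq+R)\sqrt{1 + \frac{q}{jq+R}}
\le \sqrt{3}\log R + \frac{1}{q}\left( D\log\frac{D}{e} - R\log\frac{R}{e} \right) + \frac{1}{2}\log D\log\frac{D}{R} \tag{4.62}
\]
(or 0 if $D < R$). We sum with (4.60) and the terms with $m \le q/2$, and obtain, for $D' = c_2x/q = R$,
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left( D\log\frac{D}{\sqrt{e}} + q\left( \sqrt{3}\log\frac{c_2x}{q} + \frac{\log D}{2}\log^+\frac{D}{q/2} \right)\right)
+ \frac{3c_1}{2}\frac{x}{q}\log D\log^+\frac{D}{c_2x/q} \\
&+ \frac{2|\eta'|_1}{\pi}\,q\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
+ \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2}
+ \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}
\]
which, it is easy to check, is also valid even if $D' = D$ (in which case (4.61) and (4.62) do not appear) or $R = q/2$ (in which case (4.60) does not appear).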

Chapter 5

Type II sums

We must now consider the sum
\[
S_{II} = \sum_{\substack{m > U \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right) \sum_{\substack{n > V \\ (n,v)=1}} \Lambda(n)\,e(\alpha mn)\,\eta(mn/x). \tag{5.1}
\]
Here the main improvements over classical treatments of type II sums are as follows:

1. obtaining cancellation in the term $\sum_{d > U,\, d|m} \mu(d)$, leading to a gain of a factor of $\log$;

2. using a large sieve for primes, getting rid of a further $\log$;

3. exploiting, via a non-conventional application of the principle of the large sieve (Lemma 5.2.1), the fact that $\alpha$ is in the tail of an interval (when that is the case).

It should be clear that these techniques are of general applicability. (It is also clear that (2) is not new, though, strangely enough, it seems not to have been applied to Goldbach's problem. Perhaps this oversight is due to the fact that proofs of Vinogradov's result given in textbooks often follow Linnik's dispersion method, rather than the large sieve. Our treatment of the large sieve for primes will follow the lines set by Montgomery and Montgomery-Vaughan [MV73, (1.6)]. The fact that the large sieve for primes can be combined with the new technique (3) is, of course, a novelty.)

While (1) is particularly useful for the treatment of a term that generally arises in applications of Vaughan's identity, all of the points above address issues that can arise in more general situations in number theory.


It is technically helpful to express $\eta$ as the (multiplicative) convolution of two functions of compact support, preferably the same function:
\[
\eta(x) = (\eta_1 *_M \eta_1)(x) = \int_0^\infty \eta_1(t)\,\eta_1(x/t)\,\frac{dt}{t}. \tag{5.2}
\]
For the smoothing function $\eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, equation (5.2) holds with $\eta_1 = 2 \cdot 1_{[1/2,1]}$, where $1_{[1/2,1]}$ is the characteristic function of the interval $[1/2, 1]$. We will work with $\eta = \eta_2$, yet most of our work will be valid for any $\eta$ of the form $\eta = \eta_1 * \eta_1$.
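The claim that $\eta_2$ is the multiplicative self-convolution of $\eta_1 = 2 \cdot 1_{[1/2,1]}$ is easy to test numerically. The sketch below is only an illustration (function names and the midpoint rule are choices made here, not taken from the text): it integrates (5.2) in the variable $u = \log t$, where $dt/t = du$, and compares the result with the closed form for $\eta_2$.

```python
import math


def eta1(t: float) -> float:
    """eta_1 = 2 * indicator of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2(t: float) -> float:
    """eta_2(t) = 4 max(log 2 - |log 2t|, 0), extended by 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def mult_convolution(x: float, steps: int = 20000) -> float:
    """Midpoint rule for int_0^inf eta1(t) eta1(x/t) dt/t.

    eta1(t) vanishes outside [1/2, 1], so it is enough to integrate
    over u = log t in [log(1/2), 0], where dt/t = du.
    """
    lo, hi = math.log(0.5), 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(lo + (k + 0.5) * h)
        total += eta1(t) * eta1(x / t) * h
    return total


if __name__ == "__main__":
    for x in (0.2, 0.3, 0.45, 0.5, 0.7, 0.9, 1.2):
        print(x, mult_convolution(x), eta2(x))
```

Since the integrand is a step function in $u$, the midpoint rule is off by at most a few step widths, which is why a tolerance of about $10^{-3}$ is appropriate when comparing the two columns.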

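By (5.2), the sum (5.1) equals
\[
\begin{aligned}
&4\int_0^\infty \sum_{\substack{m > U \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right) \sum_{\substack{n > V \\ (n,v)=1}} \Lambda(n)\,e(\alpha mn)\,\eta_1(t)\,\eta_1\left(\frac{mn/x}{t}\right)\frac{dt}{t} \\
&= 4\int_V^{x/U} \sum_{\substack{\max\left(\frac{x}{2W}, U\right) < m \le \frac{x}{W} \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right) \sum_{\substack{\max\left(V, \frac{W}{2}\right) < n \le W \\ (n,v)=1}} \Lambda(n)\,e(\alpha mn)\,\frac{dW}{W}
\end{aligned} \tag{5.3}
\]
by the substitution $t = (m/x)W$. (We can assume $V \le W \le x/U$ because otherwise one of the sums in (5.4) is empty.) As we can see, the sums within the integral are now unsmoothed. This will not be truly harmful, and to some extent it will be convenient, in that ready-to-use large-sieve estimates in the literature have been optimized more carefully for unsmoothed sums than for smooth sums. The fact that the sums start at $x/2W$ and $W/2$ rather than at 1 will also be slightly helpful.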

(This is presumably why the weight $\eta_2$ was introduced in [Tao14], which also uses the large sieve. As we will later see, the weight $\eta_2$, or anything like it, will simply not do on the major arcs, which are much more sensitive to the choice of weights. On the minor arcs, however, $\eta_2$ is convenient, and this is why we use it here. For type I sums (as should be clear from our work so far, which was stated for general weights) any function whose second derivative exists almost everywhere and lies in $\ell_1$ would do just as well. The option of having no smoothing whatsoever, as in Vinogradov's work or as in most textbook accounts, would not be quite as good for type I sums, and would lead to a routine but inconvenient splitting of sums into short intervals in place of (5.3).)

We now do what is generally the first thing in type II treatments: we use Cauchy-Schwarz. A minor note, however, that may help avoid confusion: the treatments familiar to some readers (e.g., the dispersion method, not followed here) start with the special case of Cauchy-Schwarz that is most common in number theory,
\[
\left| \sum_{n \le N} a_n \right|^2 \le N \sum_{n \le N} |a_n|^2,
\]
whereas here we apply the general rule
\[
\sum_m a_m b_m \le \sqrt{\sum_m |a_m|^2}\sqrt{\sum_m |b_m|^2}
\]
to the integrand in (5.3). At any rate, we will have reduced the estimation of a sum to the estimation of two simpler sums $\sum_m |a_m|^2$, $\sum_m |b_m|^2$, but each of these two simpler sums will be of a kind that will lead to a loss of a factor of $\log x$ (or $(\log x)^3$) if not estimated carefully. Since we cannot afford to lose a single factor of $\log x$, we will have to deploy and develop techniques to eliminate these factors of $\log x$. The procedure followed will be quite different for the two sums; a variety of techniques will be needed.

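We separate $n$ prime and $n$ non-prime in the integrand of (5.3), and, as we were saying, we apply Cauchy-Schwarz. We obtain that the expression within the integral in (5.3) is at most $\sqrt{S_1(U,W) \cdot S_2(U,V,W)} + \sqrt{S_1(U,W) \cdot S_3(W)}$, where
\[
\begin{aligned}
S_1(U,W) &= \sum_{\substack{\max\left(\frac{x}{2W}, U\right) < m \le \frac{x}{W} \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right)^2, \\
S_2(U,V,W) &= \sum_{\substack{\max\left(\frac{x}{2W}, U\right) < m \le \frac{x}{W} \\ (m,v)=1}} \left| \sum_{\substack{\max\left(V, \frac{W}{2}\right) < p \le W \\ (p,v)=1}} (\log p)\,e(\alpha mp) \right|^2
\end{aligned} \tag{5.4}
\]
and
\[
S_3(W) = \sum_{\substack{\frac{x}{2W} < m \le \frac{x}{W} \\ (m,v)=1}} \left| \sum_{\substack{n \le W \\ n \text{ non-prime}}} \Lambda(n) \right|^2
\le \sum_{\substack{\frac{x}{2W} < m \le \frac{x}{W} \\ (m,v)=1}} \left( 1.42620\,W^{1/2} \right)^2
\le 1.0171x + 2.0341W \tag{5.5}
\]
(by [RS62, Thm. 13]). We will assume $V \ge v$; thus the condition $(p,v) = 1$ will be fulfilled automatically and can be removed.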
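The constants in (5.5) follow from counting the values of $m$. As a small sketch (the helper name is hypothetical): there are at most $x/2W + 1$ integers $m$ in $(x/2W, x/W]$, and each contributes at most $(1.42620)^2 W$, so the total is at most $(c^2/2)\,x + c^2\,W$ with $c = 1.42620$.

```python
def s3_bound_coefficients(c: float = 1.42620):
    """Return (coefficient of x, coefficient of W) in the bound
    (x/(2W) + 1) * (c * sqrt(W))**2 = (c^2 / 2) x + c^2 W.
    """
    return c * c / 2.0, c * c


if __name__ == "__main__":
    coeff_x, coeff_w = s3_bound_coefficients()
    print(coeff_x, coeff_w)  # to be compared with 1.0171 and 2.0341
```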

The contribution of $S_3(W)$ will be negligible. We must bound $S_1(U,W)$ and $S_2(U,V,W)$ from above.


5.1 The sum $S_1$: cancellation

We shall bound
\[
S_1(U,W) = \sum_{\substack{\max(U, x/2W) < m \le x/W \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right)^2. \tag{5.6}
\]
There will be a surprising amount of cancellation: the expression within the sum will be bounded by a constant on average (a constant less than 1, and usually less than 1/2, in fact). In other words, the inner sum in (5.6) is exactly 0 most of the time.

Recall that we need explicit constants throughout, and that this essentially constrains us to elementary means. (We will at one point use Dirichlet series and $\zeta(s)$ for $s$ real and greater than 1.)
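To get a feeling for the cancellation in (5.6), one can tabulate the inner sum for small parameters. The sketch below is purely illustrative and takes $v = 1$: the function names, and the idea of averaging over $m \in (M/2, M]$, are assumptions made here, with $M$ playing the role of $x/W$. It is not the method of proof.

```python
def mobius_sieve(n: int) -> list:
    """Return a list mu with mu[k] = Moebius function of k for 0 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]          # one more prime factor
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0               # not square-free
    return mu


def tail_divisor_sums(M: int, U: int, mu: list) -> list:
    """inner[m] = sum of mu(d) over divisors d of m with d > U, for m <= M."""
    inner = [0] * (M + 1)
    for d in range(U + 1, M + 1):
        if mu[d]:
            for m in range(d, M + 1, d):
                inner[m] += mu[d]
    return inner


def average_square(M: int, U: int) -> float:
    """Average of (sum_{d|m, d>U} mu(d))^2 over M/2 < m <= M."""
    mu = mobius_sieve(M)
    inner = tail_divisor_sums(M, U, mu)
    ms = range(M // 2 + 1, M + 1)
    return sum(inner[m] ** 2 for m in ms) / len(ms)


if __name__ == "__main__":
    for M, U in ((2000, 30), (20000, 100), (200000, 300)):
        print(M, U, average_square(M, U))
```

For $m > 1$ the tail sum equals minus the head sum $\sum_{d|m,\, d \le U} \mu(d)$, because the full divisor sum of $\mu$ vanishes; this identity is a convenient consistency check on the tabulation.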

5.1.1 Reduction to a sum with $\mu$

It is tempting to start by applying Möbius inversion to change $d > U$ to $d \le U$ in (5.6), but this just makes matters worse. We could also try changing variables so that $m/d$ (which is smaller than $x/UW$) becomes the variable instead of $d$, but this leads to complications for $m$ non-square-free. Instead, we write
\[
\begin{aligned}
&\sum_{\substack{\max(U, x/2W) < m \le x/W \\ (m,v)=1}} \left( \sum_{\substack{d > U \\ d|m}} \mu(d) \right)^2
= \sum_{\substack{\frac{x}{2W} < m \le \frac{x}{W} \\ (m,v)=1}} \sum_{\substack{d_1, d_2 | m \\ d_1, d_2 > U}} \mu(d_1)\mu(d_2) \\
&= \sum_{r_1 < x/WU} \sum_{\substack{r_2 < x/WU \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \sum_{\substack{l \\ (l,r_1r_2)=1 \\ r_1l,\, r_2l > U \\ (l,v)=1}} \mu(r_1l)\mu(r_2l) \sum_{\substack{\frac{x}{2W} < m \le \frac{x}{W} \\ r_1r_2l | m \\ (m,v)=1}} 1,
\end{aligned} \tag{5.7}
\]
where $d_1 = r_1l$, $d_2 = r_2l$, $l = (d_1, d_2)$. (The inequality $r_1 < x/WU$ comes from $r_1r_2l | m$, $m \le x/W$, $r_2l > U$; $r_2 < x/WU$ is proven in the same way.) Now (5.7) equals
\[
\sum_{\substack{s < \frac{x}{WU} \\ (s,v)=1}} \sum_{r_1 < \frac{x}{WUs}} \sum_{\substack{r_2 < \frac{x}{WUs} \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \mu(r_1)\mu(r_2) \sum_{\substack{\max\left( \frac{U}{\min(r_1,r_2)}, \frac{x/W}{2r_1r_2s} \right) < l \le \frac{x/W}{r_1r_2s} \\ (l,r_1r_2)=1,\; (\mu(l))^2=1 \\ (l,v)=1}} 1, \tag{5.8}
\]
where we have set $s = m/(r_1r_2l)$. We begin by simplifying the innermost triple sum. This we do in the following lemma; it is not a trivial task, and carrying it out efficiently actually takes an idea.


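Lemma 5.1.1. Let $z, y > 0$. Then
\[
\sum_{r_1 < y} \sum_{\substack{r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \mu(r_1)\mu(r_2) \sum_{\substack{\max\left( \frac{z/y}{\min(r_1,r_2)}, \frac{z}{2r_1r_2} \right) < l \le \frac{z}{r_1r_2} \\ (l,r_1r_2)=1,\; (\mu(l))^2=1 \\ (l,v)=1}} 1 \tag{5.9}
\]
equals
\[
\begin{aligned}
&\frac{6z}{\pi^2}\frac{v}{\sigma(v)} \sum_{r_1 < y} \sum_{\substack{r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{y}, \frac{r_2}{y} \right)\right) \\
&+ O^*\left( 5.08\,\zeta\left(\frac{3}{2}\right)^2 y\sqrt{z} \cdot \prod_{p|v}\left(1 + \frac{1}{\sqrt{p}}\right)\left(1 - \frac{1}{p^{3/2}}\right)^2 \right).
\end{aligned} \tag{5.10}
\]
If $v = 2$, the error term in (5.10) can be replaced by
\[
O^*\left( 1.27\,\zeta\left(\frac{3}{2}\right)^2 y\sqrt{z} \cdot \left(1 + \frac{1}{\sqrt{2}}\right)\left(1 - \frac{1}{2^{3/2}}\right)^2 \right). \tag{5.11}
\]
(The lower limit on $l$ in (5.9) is a maximum, as in (5.8) and in agreement with the factor $1 - \max(1/2, r_1/y, r_2/y)$ in (5.10).)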

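Proof. By Möbius inversion, (5.9) equals
\[
\sum_{r_1 < y} \sum_{\substack{r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \mu(r_1)\mu(r_2) \sum_{\substack{l \le \frac{z}{r_1r_2} \\ l > \max\left( \frac{z/y}{\min(r_1,r_2)}, \frac{z}{2r_1r_2} \right) \\ (l,v)=1}} \sum_{\substack{d_1|r_1,\, d_2|r_2 \\ d_1d_2 | l}} \mu(d_1)\mu(d_2) \sum_{\substack{d_3|v \\ d_3|l}} \mu(d_3) \sum_{\substack{m^2 | l \\ (m,r_1r_2v)=1}} \mu(m). \tag{5.12}
\]
We can change the order of summation of $r_i$ and $d_i$ by defining $s_i = r_i/d_i$, and we can also use the obvious fact that the number of integers in an interval $(a,b]$ divisible by $d$ is $(b-a)/d + O^*(1)$. Thus (5.12) equals
\[
\begin{aligned}
&\sum_{\substack{d_1, d_2 < y \\ (d_1,d_2)=1 \\ (d_1d_2,v)=1}} \mu(d_1)\mu(d_2) \sum_{\substack{s_1 < y/d_1 \\ s_2 < y/d_2 \\ (d_1s_1,d_2s_2)=1 \\ (s_1s_2,v)=1}} \mu(d_1s_1)\mu(d_2s_2) \sum_{d_3|v} \mu(d_3) \\
&\qquad \cdot \sum_{\substack{m \le \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \\ (m,d_1s_1d_2s_2v)=1}} \frac{\mu(m)}{d_1d_2d_3m^2}\frac{z}{s_1d_1s_2d_2}\left( 1 - \max\left( \frac{1}{2}, \frac{s_1d_1}{y}, \frac{s_2d_2}{y} \right)\right)
\end{aligned} \tag{5.13}
\]
plus
\[
O^*\left( \sum_{\substack{d_1, d_2 < y \\ (d_1d_2,v)=1}} \sum_{\substack{s_1 < y/d_1 \\ s_2 < y/d_2 \\ (s_1s_2,v)=1}} \sum_{d_3|v} \sum_{\substack{m \le \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \\ m \text{ sq-free}}} 1 \right). \tag{5.14}
\]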

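If we complete the innermost sum in (5.13) by removing the condition
\[
m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)},
\]
we obtain (reintroducing the variables $r_i = d_is_i$)
\[
z \cdot \sum_{\substack{r_1, r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{r_1r_2}\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{y}, \frac{r_2}{y} \right)\right) \sum_{\substack{d_1|r_1 \\ d_2|r_2}} \sum_{d_3|v} \sum_{\substack{m \\ (m,r_1r_2v)=1}} \frac{\mu(d_1)\mu(d_2)\mu(m)\mu(d_3)}{d_1d_2d_3m^2}. \tag{5.15}
\]
Now (5.15) equals
\[
\begin{aligned}
&\sum_{\substack{r_1, r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)\,z}{r_1r_2}\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{y}, \frac{r_2}{y} \right)\right) \prod_{\substack{p | r_1r_2 \\ \text{or } p|v}} \left(1 - \frac{1}{p}\right) \prod_{\substack{p \nmid r_1r_2 \\ p \nmid v}} \left(1 - \frac{1}{p^2}\right) \\
&= \frac{6z}{\pi^2}\frac{v}{\sigma(v)} \sum_{\substack{r_1, r_2 < y \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{y}, \frac{r_2}{y} \right)\right),
\end{aligned}
\]
i.e., the main term in (5.10). It remains to estimate the terms used to complete the sum; their total is, by definition, given exactly by (5.13) with the inequality $m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$ changed to $m > \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$. This is a total of size at most
\[
\frac{1}{2} \sum_{\substack{d_1, d_2 < y \\ (d_1d_2,v)=1}} \sum_{\substack{s_1 < y/d_1 \\ s_2 < y/d_2 \\ (s_1s_2,v)=1}} \sum_{d_3|v} \sum_{\substack{m > \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \\ m \text{ sq-free}}} \frac{1}{d_1d_2d_3m^2}\frac{z}{s_1d_1s_2d_2}. \tag{5.16}
\]
Adding this to (5.14), we obtain, as our total error term,
\[
\sum_{\substack{d_1, d_2 < y \\ (d_1d_2,v)=1}} \sum_{\substack{s_1 < y/d_1 \\ s_2 < y/d_2 \\ (s_1s_2,v)=1}} \sum_{d_3|v} f\left( \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \right), \tag{5.17}
\]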


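where
\[
f(x) = \sum_{\substack{m \le x \\ m \text{ sq-free}}} 1 + \frac{1}{2} \sum_{\substack{m > x \\ m \text{ sq-free}}} \frac{x^2}{m^2}.
\]
It is easy to see that $f(x)/x$ has a local maximum exactly when $x$ is a square-free (positive) integer. We can hence check that
\[
f(x) \le \frac{1}{2}\left( 2 + 2\left( \frac{\zeta(2)}{\zeta(4)} - 1.25 \right)\right) x = 1.26981\ldots x
\]
for all $x \ge 0$ by checking all integers smaller than a constant, using $\{m : m \text{ sq-free}\} \subset \{m : 4 \nmid m\}$ and $1.5 \cdot (3/4) < 1.26981$ to bound $f$ from above for $x$ larger than a constant. Therefore, (5.17) is at most
\[
\begin{aligned}
&1.27 \sum_{\substack{d_1, d_2 < y \\ (d_1d_2,v)=1}} \sum_{\substack{s_1 < y/d_1 \\ s_2 < y/d_2 \\ (s_1s_2,v)=1}} \sum_{d_3|v} \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \\
&= 1.27\sqrt{z} \prod_{p|v}\left(1 + \frac{1}{\sqrt{p}}\right) \cdot \left( \sum_{\substack{d < y \\ (d,v)=1}} \sum_{\substack{s < y/d \\ (s,v)=1}} \frac{1}{d\sqrt{s}} \right)^2.
\end{aligned}
\]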
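The constant $1.26981\ldots$ in the bound on $f$ can be sanity-checked by evaluating $f(n)/n$ at small integers. The sketch below is an illustration only (the helper names and the cutoff 300 are choices made here): it uses $\sum_{m \text{ sq-free}} m^{-2} = \zeta(2)/\zeta(4) = 15/\pi^2$ to evaluate the tail exactly up to rounding.

```python
import math


def is_squarefree(m: int) -> bool:
    """True if no square d*d with d >= 2 divides m."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True


def f(x: float) -> float:
    """f(x) = #{m <= x square-free} + (1/2) sum_{m > x square-free} x^2 / m^2."""
    head = [m for m in range(1, int(math.floor(x)) + 1) if is_squarefree(m)]
    # Sum of 1/m^2 over all square-free m is zeta(2)/zeta(4) = 15/pi^2.
    tail = 15.0 / math.pi ** 2 - sum(1.0 / (m * m) for m in head)
    return len(head) + 0.5 * x * x * tail


if __name__ == "__main__":
    ratios = {n: f(n) / n for n in range(1, 301)}
    best = max(ratios, key=ratios.get)
    print(best, ratios[best])
```

In this range the maximum of $f(n)/n$ is attained at $n = 2$, where $f(2)/2 = 1 + (\zeta(2)/\zeta(4) - 1.25)$, matching the expression displayed above.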

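We can bound the double sum simply by
\[
\sum_{\substack{d < y \\ (d,v)=1}} \sum_{s < y/d} \frac{1}{\sqrt{s}\,d} \le 2\sum_{d < y} \frac{\sqrt{y/d}}{d} \le 2\sqrt{y} \cdot \zeta\left(\frac{3}{2}\right) \prod_{p|v}\left(1 - \frac{1}{p^{3/2}}\right).
\]
Alternatively, if $v = 2$, we bound
\[
\sum_{\substack{s < y/d \\ (s,v)=1}} \frac{1}{\sqrt{s}} = \sum_{\substack{s < y/d \\ s \text{ odd}}} \frac{1}{\sqrt{s}} \le 1 + \frac{1}{2}\int_1^{y/d} \frac{1}{\sqrt{s}}\,ds = \sqrt{y/d}
\]
and thus
\[
\sum_{\substack{d < y \\ (d,v)=1}} \sum_{\substack{s < y/d \\ (s,v)=1}} \frac{1}{\sqrt{s}\,d} \le \sum_{\substack{d < y \\ (d,2)=1}} \frac{\sqrt{y/d}}{d} \le \sqrt{y}\left(1 - \frac{1}{2^{3/2}}\right)\zeta\left(\frac{3}{2}\right).
\]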

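Applying Lemma 5.1.1 with $y = S/s$ and $z = x/Ws$, where $S = x/WU$, we obtain that (5.8) equals
\[
\begin{aligned}
&\frac{6x}{\pi^2W}\frac{v}{\sigma(v)} \sum_{\substack{s < S \\ (s,v)=1}} \frac{1}{s} \sum_{r_1 < S/s} \sum_{\substack{r_2 < S/s \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{S/s}, \frac{r_2}{S/s} \right)\right) \\
&+ O^*\left( 5.08\,\zeta\left(\frac{3}{2}\right)^3 S\sqrt{\frac{x}{W}} \prod_{p|v}\left(1 + \frac{1}{\sqrt{p}}\right)\left(1 - \frac{1}{p^{3/2}}\right)^3 \right),
\end{aligned} \tag{5.18}
\]
with 5.08 replaced by 1.27 if $v = 2$. (The constant 5.08 is the one obtained by summing the error term of (5.10) over $s$; it is also the constant used in (5.33).) The main term in (5.18) can be written as
\[
\frac{6x}{\pi^2W}\frac{v}{\sigma(v)} \sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s}\int_{1/2}^1 \sum_{r_1 \le \frac{uS}{s}} \sum_{\substack{r_2 \le \frac{uS}{s} \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\,du. \tag{5.19}
\]
As we can see, the use of an integral eliminates the unpleasant factor
\[
\left( 1 - \max\left( \frac{1}{2}, \frac{r_1}{S/s}, \frac{r_2}{S/s} \right)\right).
\]
From now on, we will focus on the cases $v = 1$ and $v = 2$ for simplicity. (Higher values of $v$ do not seem to be really profitable in the last analysis.)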

5.1.2 Explicit bounds for a sum with $\mu$

We must estimate the expression within parentheses in (5.19). It is not too hard to show that it tends to 0; the first part of the proof of Lemma 5.1.2 will reduce this to the fact that $\sum_n \mu(n)/n = 0$. Obtaining good bounds is a more delicate matter. For our purposes, we will need the expression to converge to 0 at least as fast as $1/(\log)^2$, with a good constant in front. For this task, the bound (2.21) on $\sum_{n \le x} \mu(n)/n$ is enough.

Lemma 5.1.2. Let
\[
g_v(x) = \sum_{r_1 \le x} \sum_{\substack{r_2 \le x \\ (r_1,r_2)=1 \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)},
\]
where $v = 1$ or $v = 2$. Then
\[
|g_1(x)| \le \begin{cases} 1/x & \text{if } 33 \le x \le 10^6, \\ \frac{1}{x}(111.536 + 55.768\log x) & \text{if } 10^6 \le x < 10^{10}, \\ \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt{x}} & \text{if } x \ge 10^{10}, \end{cases}
\]
\[
|g_2(x)| \le \begin{cases} 2.1/x & \text{if } 33 \le x \le 10^6, \\ \frac{1}{x}(1634.34 + 817.168\log x) & \text{if } 10^6 \le x < 10^{10}, \\ \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt{x}} & \text{if } x \ge 10^{10}. \end{cases}
\]


The proof involves what may be called a version of Rankin's trick, using Dirichlet series and the behavior of $\zeta(s)$ near $s = 1$.

Proof. We prove the statements for $x \le 10^6$ by a direct computation, using interval arithmetic. (In fact, in that range, one gets $2.0895071/x$ instead of $2.1/x$.) Assume from now on that $x > 10^6$.

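Clearly,
\[
\begin{aligned}
g(x) &= \sum_{r_1 \le x} \sum_{\substack{r_2 \le x \\ (r_1r_2,v)=1}} \left( \sum_{d | (r_1,r_2)} \mu(d) \right) \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \\
&= \sum_{\substack{d \le x \\ (d,v)=1}} \mu(d) \sum_{r_1 \le x} \sum_{\substack{r_2 \le x \\ d | (r_1,r_2) \\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \\
&= \sum_{\substack{d \le x \\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \sum_{\substack{u_1 \le x/d \\ (u_1,dv)=1}} \sum_{\substack{u_2 \le x/d \\ (u_2,dv)=1}} \frac{\mu(u_1)\mu(u_2)}{\sigma(u_1)\sigma(u_2)} \\
&= \sum_{\substack{d \le x \\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \left( \sum_{\substack{r \le x/d \\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)} \right)^2.
\end{aligned} \tag{5.20}
\]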
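Identity (5.20) is convenient for computation, since it replaces a double sum over coprime pairs by a single sum of squares. The following sketch is an illustration under assumptions made here (floating-point arithmetic rather than the interval arithmetic used in the text, small ranges of $x$, hypothetical function names): it evaluates $g_v(x)$ both directly from the definition and via (5.20), so that the two can be compared.

```python
from math import gcd


def arithmetic_tables(n: int):
    """Return (mu, sigma): Moebius function and sum-of-divisors, for 0..n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    sigma = [0] * (n + 1)
    seen = [False] * (n + 1)
    for d in range(1, n + 1):
        for k in range(d, n + 1, d):
            sigma[k] += d
    for p in range(2, n + 1):
        if not seen[p]:                      # p is prime
            for k in range(p, n + 1, p):
                seen[k] = True
                mu[k] = -mu[k]
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0
    return mu, sigma


def g_direct(x: int, v: int, mu, sigma) -> float:
    """g_v(x) straight from the definition (coprime pairs r1, r2 <= x)."""
    total = 0.0
    for r1 in range(1, x + 1):
        if mu[r1] == 0 or gcd(r1, v) != 1:
            continue
        for r2 in range(1, x + 1):
            if mu[r2] == 0 or gcd(r2, v) != 1 or gcd(r1, r2) != 1:
                continue
            total += mu[r1] * mu[r2] / (sigma[r1] * sigma[r2])
    return total


def g_via_520(x: int, v: int, mu, sigma) -> float:
    """g_v(x) computed through the last line of (5.20)."""
    total = 0.0
    for d in range(1, x + 1):
        if mu[d] == 0 or gcd(d, v) != 1:
            continue
        inner = sum(mu[r] / sigma[r] for r in range(1, x // d + 1)
                    if mu[r] != 0 and gcd(r, d * v) == 1)
        total += mu[d] / sigma[d] ** 2 * inner ** 2
    return total


if __name__ == "__main__":
    mu, sigma = arithmetic_tables(400)
    for x in (33, 100, 400):
        print(x, x * abs(g_via_520(x, 1, mu, sigma)), x * abs(g_via_520(x, 2, mu, sigma)))
```

The printed quantities $x\,|g_v(x)|$ are the ones that Lemma 5.1.2 bounds by 1 and 2.1 in the range $33 \le x \le 10^6$; a rigorous check would need interval arithmetic, as in the text.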

Moreover,
\[
\begin{aligned}
\sum_{\substack{r \le x/d \\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)}
&= \sum_{\substack{r \le x/d \\ (r,dv)=1}} \frac{\mu(r)}{r} \sum_{d' | r} \prod_{p | d'} \left( \frac{p}{p+1} - 1 \right) \\
&= \sum_{\substack{d' \le x/d \\ \mu(d')^2 = 1 \\ (d',dv)=1}} \left( \prod_{p | d'} \frac{-1}{p+1} \right) \sum_{\substack{r \le x/d \\ (r,dv)=1 \\ d' | r}} \frac{\mu(r)}{r} \\
&= \sum_{\substack{d' \le x/d \\ \mu(d')^2 = 1 \\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{r \le x/dd' \\ (r,dd'v)=1}} \frac{\mu(r)}{r}
\end{aligned}
\]
and
\[
\sum_{\substack{r \le x/dd' \\ (r,dd'v)=1}} \frac{\mu(r)}{r} = \sum_{\substack{d'' \le x/dd' \\ d'' | (dd'v)^\infty}} \frac{1}{d''} \sum_{r \le x/dd'd''} \frac{\mu(r)}{r}.
\]
Hence
\[
|g(x)| \le \sum_{\substack{d \le x \\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2} \left( \sum_{\substack{d' \le x/d \\ \mu(d')^2 = 1 \\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{d'' \le x/dd' \\ d'' | (dd'v)^\infty}} \frac{1}{d''} f(x/dd'd'') \right)^2, \tag{5.21}
\]

where $f(t) = \left| \sum_{r \le t} \mu(r)/r \right|$.

We intend to bound the function $f(t)$ by a linear combination of terms of the form $t^{-\delta}$, $\delta \in [0, 1/2)$. Thus it makes sense now to estimate $F_v(s_1, s_2, x)$, defined to be the quantity
\[
\begin{aligned}
\sum_{\substack{d \\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2}
&\left( \sum_{\substack{d_1' \\ (d_1',dv)=1}} \frac{\mu(d_1')^2}{d_1'\sigma(d_1')} \sum_{d_1'' | (dd_1'v)^\infty} \frac{1}{d_1''} \cdot (dd_1'd_1'')^{1-s_1} \right) \\
\cdot &\left( \sum_{\substack{d_2' \\ (d_2',dv)=1}} \frac{\mu(d_2')^2}{d_2'\sigma(d_2')} \sum_{d_2'' | (dd_2'v)^\infty} \frac{1}{d_2''} \cdot (dd_2'd_2'')^{1-s_2} \right)
\end{aligned}
\]
for $s_1, s_2 \in [1/2, 1]$. This is equal to
\[
\begin{aligned}
\sum_{\substack{d \\ (d,v)=1}} \frac{\mu(d)^2}{d^{s_1+s_2}} &\prod_{p | d} \frac{1}{(1+p^{-1})^2(1-p^{-s_1})(1-p^{-s_2})} \prod_{p | v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
\cdot &\left( \sum_{\substack{d' \\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_1+1}} \prod_{p' | d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_1})} \right) \\
\cdot &\left( \sum_{\substack{d' \\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_2+1}} \prod_{p' | d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_2})} \right),
\end{aligned}
\]
which in turn can easily be seen to equal
\[
\begin{aligned}
&\prod_{p \nmid v} \left( 1 + \frac{p^{-s_1}p^{-s_2}}{(1 - p^{-s_1} + p^{-1})(1 - p^{-s_2} + p^{-1})} \right) \prod_{p | v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
&\cdot \prod_{p \nmid v} \left( 1 + \frac{p^{-1}p^{-s_1}}{(1+p^{-1})(1-p^{-s_1})} \right) \cdot \prod_{p \nmid v} \left( 1 + \frac{p^{-1}p^{-s_2}}{(1+p^{-1})(1-p^{-s_2})} \right).
\end{aligned} \tag{5.22}
\]


Now, for any $0 < x \le y \le x^{1/2} < 1$,
\[
(1+x-y)(1-xy)(1-xy^2) - (1+x)(1-y)(1-x^3) = (x-y)(y^2-x)(xy-x-1)\,x \le 0,
\]
and so
\[
1 + \frac{xy}{(1+x)(1-y)} = \frac{(1+x-y)(1-xy)(1-xy^2)}{(1+x)(1-y)(1-xy)(1-xy^2)} \le \frac{(1-x^3)}{(1-xy)(1-xy^2)}. \tag{5.23}
\]
For any $x \le y_1, y_2 < 1$ with $y_1^2 \le x$, $y_2^2 \le x$,
\[
1 + \frac{y_1y_2}{(1-y_1+x)(1-y_2+x)} \le \frac{(1-x^3)^2(1-x^4)}{(1-y_1y_2)(1-y_1y_2^2)(1-y_1^2y_2)}. \tag{5.24}
\]
This can be checked as follows: multiplying by the denominators and changing variables to $x$, $s = y_1 + y_2$ and $r = y_1y_2$, we obtain an inequality where the left side, quadratic on $s$ with positive leading coefficient, must be less than or equal to the right side, which is linear on $s$. The left side minus the right side can be maximal for given $x$, $r$ only when $s$ is maximal or minimal. This happens when $y_1 = y_2$ or when either $y_i = \sqrt{x}$ or $y_i = x$ for at least one of $i = 1, 2$. In each of these cases, we have reduced (5.24) to an inequality in two variables that can be proven automatically¹ by a quantifier-elimination program; the author has used QEPCAD [HB11] to do this.
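The polynomial identity behind (5.23) and the inequality (5.24) lend themselves to a quick independent check. The sketch below is not a substitute for the quantifier-elimination proof mentioned above; it merely verifies the identity at rational points with exact arithmetic and spot-checks (5.24) on a rational grid (the grid size and function names are choices made here).

```python
from fractions import Fraction


def identity_gap(x, y):
    """Difference between the two sides of the identity preceding (5.23)."""
    lhs = (1 + x - y) * (1 - x * y) * (1 - x * y * y) - (1 + x) * (1 - y) * (1 - x ** 3)
    rhs = (x - y) * (y * y - x) * (x * y - x - 1) * x
    return lhs - rhs


def holds_524(x, y1, y2) -> bool:
    """Check inequality (5.24) at a single point, in exact arithmetic."""
    left = 1 + y1 * y2 / ((1 - y1 + x) * (1 - y2 + x))
    right = ((1 - x ** 3) ** 2 * (1 - x ** 4)
             / ((1 - y1 * y2) * (1 - y1 * y2 * y2) * (1 - y1 * y1 * y2)))
    return left <= right


def admissible_pairs(n: int = 20):
    """Pairs (x, y) on the grid {1/n, ..., (n-1)/n} with x <= y and y^2 <= x."""
    pts = [Fraction(k, n) for k in range(1, n)]
    return [(x, y) for x in pts for y in pts if x <= y and y * y <= x]


if __name__ == "__main__":
    pairs = admissible_pairs(20)
    bad = [(x, y1, y2) for (x, y1) in pairs for (x2, y2) in pairs
           if x2 == x and not holds_524(x, y1, y2)]
    print("violations of (5.24) on the grid:", len(bad))
```

Exact rationals are used because (5.24) holds with equality when $y_1 = y_2 = x$, so floating-point rounding could produce spurious violations there.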

Hence $F_v(s_1, s_2, x)$ is at most
\[
\begin{aligned}
&\prod_{p \nmid v} \frac{(1-p^{-3})^2(1-p^{-4})}{(1-p^{-s_1-s_2})(1-p^{-2s_1-s_2})(1-p^{-s_1-2s_2})} \cdot \prod_{p | v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
&\cdot \prod_{p \nmid v} \frac{1-p^{-3}}{(1+p^{-s_1-1})(1+p^{-2s_1-1})} \prod_{p \nmid v} \frac{1-p^{-3}}{(1+p^{-s_2-1})(1+p^{-2s_2-1})} \\
&= C_{v,s_1,s_2} \cdot \frac{\zeta(s_1+1)\zeta(s_2+1)\zeta(2s_1+1)\zeta(2s_2+1)}{\zeta(3)^4\zeta(4)\left( \zeta(s_1+s_2)\zeta(2s_1+s_2)\zeta(s_1+2s_2) \right)^{-1}},
\end{aligned} \tag{5.25}
\]
where $C_{v,s_1,s_2}$ equals 1 if $v = 1$ and
\[
\frac{(1-2^{-s_1-2s_2})(1+2^{-s_1-1})(1+2^{-2s_1-1})(1+2^{-s_2-1})(1+2^{-2s_2-1})}{(1-2^{-s_1-s_2})^{-1}(1-2^{-2s_1-s_2})^{-1}(1-2^{-s_1})(1-2^{-s_2})(1-2^{-3})^4(1-2^{-4})}
\]
if $v = 2$.

For $1 \le t \le x$, (2.21) and (2.24) imply
\[
f(t) \le \begin{cases} \sqrt{\frac{2}{t}} & \text{if } x \le 10^{10}, \\ \sqrt{\frac{2}{t}} + \frac{0.03}{\log x}\left(\frac{x}{t}\right)^{\frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}}} & \text{if } x > 10^{10}, \end{cases} \tag{5.26}
\]

¹ In practice, the case $y_i = \sqrt{x}$ leads to a polynomial of high degree, and quantifier elimination increases sharply in complexity as the degree increases; a stronger inequality of lower degree (with $(1 - 3x^3)$ instead of $(1-x^3)^2(1-x^4)$) was given to QEPCAD to prove in this case.

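where we are using the fact that $\log x$ is convex-down. Note that, again by convexity,
\[
\frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}} < (\log t)'|_{t = \log 10^{10}} = \frac{1}{\log 10^{10}} = 0.0434294\ldots
\]
Obviously, $\sqrt{2/t}$ in (5.26) can be replaced by $(2/t)^{1/2-\epsilon}$ for any $\epsilon \ge 0$.

By (5.21) and (5.26),
\[
|g_v(x)| \le \left(\frac{2}{x}\right)^{1-2\epsilon} F_v(1/2+\epsilon, 1/2+\epsilon, x)
\]
for $x \le 10^{10}$. We set $\epsilon = 1/\log x$ and obtain from (5.25) that
\[
\begin{aligned}
F_v(1/2+\epsilon, 1/2+\epsilon, x) &\le C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon} \frac{\zeta(1+2\epsilon)\zeta(3/2)^4\zeta(2)^2}{\zeta(3)^4\zeta(4)} \\
&\le 55.768 \cdot C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon} \cdot \left( 1 + \frac{\log x}{2} \right),
\end{aligned} \tag{5.27}
\]
where we use the easy bound $\zeta(s) < 1 + 1/(s-1)$ obtained by $\sum n^{-s} < 1 + \int_1^\infty t^{-s}\,dt$. (For sharper bounds, see [BR02].) Now
\[
C_{2,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon} \le \frac{(1-2^{-3/2-\epsilon})^2(1+2^{-3/2})^2(1+2^{-2})^2(1-2^{-1-2\epsilon})}{(1-2^{-1/2})^2(1-2^{-3})^4(1-2^{-4})} \le 14.652983,
\]
whereas $C_{1,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon} = 1$. (We are assuming $x \ge 10^6$, and so $\epsilon \le 1/(\log 10^6)$.) Hence
\[
|g_v(x)| \le \begin{cases} \frac{1}{x}(111.536 + 55.768\log x) & \text{if } v = 1, \\ \frac{1}{x}(1634.34 + 817.168\log x) & \text{if } v = 2 \end{cases}
\]
for $10^6 \le x < 10^{10}$.

For general $x$, we must use the second bound in (5.26). Define $c = 1/(\log 10^{10})$.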
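The numerical constants in (5.27) and in the bound on $C_{2,1/2+\epsilon,1/2+\epsilon}$ can be reproduced in a few lines. The sketch below is only a floating-point check (not interval arithmetic); the `zeta` helper is an assumption made here, namely a partial sum with an Euler-Maclaurin tail correction, valid for real $s > 1$.

```python
import math


def zeta(s: float, n_terms: int = 1000) -> float:
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail."""
    N = n_terms
    head = sum(k ** (-s) for k in range(1, N))
    # sum_{k >= N} k^{-s} ~ N^{1-s}/(s-1) + N^{-s}/2 + s N^{-s-1}/12
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** (-s) + s * N ** (-s - 1) / 12.0


def main_constant() -> float:
    """zeta(3/2)^4 zeta(2)^2 / (zeta(3)^4 zeta(4)), the factor rounded to 55.768."""
    return zeta(1.5) ** 4 * zeta(2.0) ** 2 / (zeta(3.0) ** 4 * zeta(4.0))


def c2_at(eps: float) -> float:
    """The displayed upper bound for C_{2, 1/2+eps, 1/2+eps}, as a function of eps."""
    num = ((1 - 2 ** (-1.5 - eps)) ** 2 * (1 + 2 ** -1.5) ** 2
           * (1 + 2 ** -2) ** 2 * (1 - 2 ** (-1 - 2 * eps)))
    den = (1 - 2 ** -0.5) ** 2 * (1 - 2 ** -3) ** 4 * (1 - 2 ** -4)
    return num / den


if __name__ == "__main__":
    print(main_constant())                 # compare with 55.768
    print(c2_at(1 / math.log(1e6)))        # compare with 14.652983
    print(2 * 55.768 * 14.652983, 55.768 * 14.652983)  # compare with 1634.34, 817.168
```

The expression in `c2_at` is increasing in `eps`, so its value at $\epsilon = 1/\log 10^6$ is the relevant maximum under the assumption $x \ge 10^6$.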

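We see that, if $x > 10^{10}$,
\[
\begin{aligned}
|g_v(x)| \le\; &\frac{0.03^2}{(\log x)^2} F_1(1-c, 1-c) \cdot C_{v,1-c,1-c} \\
&+ 2 \cdot \frac{\sqrt{2}}{\sqrt{x}}\frac{0.03}{\log x} F(1-c, 1/2) \cdot C_{v,1-c,1/2} \\
&+ \frac{1}{x}(111.536 + 55.768\log x) \cdot C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon}.
\end{aligned}
\]
For $v = 1$, this gives
\[
\begin{aligned}
|g_1(x)| &\le \frac{0.0044325}{(\log x)^2} + \frac{2.1626}{\sqrt{x}\log x} + \frac{1}{x}(111.536 + 55.768\log x) \\
&\le \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt{x}};
\end{aligned}
\]
for $v = 2$, we obtain
\[
\begin{aligned}
|g_2(x)| &\le \frac{0.038128}{(\log x)^2} + \frac{25.607}{\sqrt{x}\log x} + \frac{1}{x}(1634.34 + 817.168\log x) \\
&\le \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt{x}}.
\end{aligned}
\]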

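5.1.3 Estimating the triple sum

We will now be able to bound the triple sum in (5.19), viz.,
\[
\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s}\int_{1/2}^1 g_v(uS/s)\,du, \tag{5.28}
\]
where $g_v$ is as in Lemma 5.1.2.

As we will soon see, Lemma 5.1.2 implies that (5.28) is bounded by a constant (essentially because the integral $\int_0^{1/2} 1/(t(\log t)^2)\,dt$ converges). We must give as good a constant as we can, since it will affect the largest term in the final result.

Clearly, $g_v(R) = g_v(\lfloor R \rfloor)$. The contribution of each $g_v(m)$, $1 \le m \le S$, to (5.28) is exactly $g_v(m)$ times
\[
\begin{aligned}
&\sum_{\substack{\frac{S}{m+1} < s \le \frac{S}{m} \\ (s,v)=1}} \frac{1}{s}\int_{ms/S}^1 1\,du
+ \sum_{\substack{\frac{S}{2m} < s \le \frac{S}{m+1} \\ (s,v)=1}} \frac{1}{s}\int_{ms/S}^{(m+1)s/S} 1\,du
+ \sum_{\substack{\frac{S}{2(m+1)} < s \le \frac{S}{2m} \\ (s,v)=1}} \frac{1}{s}\int_{1/2}^{(m+1)s/S} du \\
&= \sum_{\substack{\frac{S}{m+1} < s \le \frac{S}{m} \\ (s,v)=1}} \left( \frac{1}{s} - \frac{m}{S} \right)
+ \sum_{\substack{\frac{S}{2m} < s \le \frac{S}{m+1} \\ (s,v)=1}} \frac{1}{S}
+ \sum_{\substack{\frac{S}{2(m+1)} < s \le \frac{S}{2m} \\ (s,v)=1}} \left( \frac{m+1}{S} - \frac{1}{2s} \right).
\end{aligned} \tag{5.29}
\]
Write $f(t) = 1/S$ for $S/2m < t \le S/(m+1)$, $f(t) = 0$ for $t > S/m$ or $t < S/2(m+1)$, $f(t) = 1/t - m/S$ for $S/(m+1) < t \le S/m$ and $f(t) = (m+1)/S - 1/2t$ for $S/2(m+1) < t \le S/2m$; then (5.29) equals $\sum_{n : (n,v)=1} f(n)$. By Euler-Maclaurin (second order),
\[
\begin{aligned}
\sum_n f(n) &= \int_{-\infty}^\infty f(x) - \frac{1}{2}B_2(\{x\})f''(x)\,dx = \int_{-\infty}^\infty f(x) + O^*\left( \frac{1}{12}|f''(x)| \right) dx \\
&= \int_{-\infty}^\infty f(x)\,dx + \frac{1}{6} \cdot O^*\left( \left| f'\left(\frac{S}{2m}\right) \right| + \left| f'\left(\frac{S}{m+1}\right) \right| \right) \\
&= \frac{1}{2}\log\left(1 + \frac{1}{m}\right) + \frac{1}{6} \cdot O^*\left( \left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2 \right).
\end{aligned} \tag{5.30}
\]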
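The main term in (5.30) comes from the exact value of $\int f$. As a sketch (the function names, the midpoint rule and the sample values of $S$ and $m$ are choices made here), one can confirm numerically that the piecewise function $f$ defined above integrates to $\frac{1}{2}\log(1 + 1/m)$, and that the two squared terms in the error combine into $(5m^2 + 2m + 1)/S^2$.

```python
import math


def f(t: float, S: float, m: int) -> float:
    """The piecewise weight f(t) attached to g_v(m), as defined before (5.30)."""
    if S / (m + 1) < t <= S / m:
        return 1.0 / t - m / S
    if S / (2 * m) < t <= S / (m + 1):
        return 1.0 / S
    if S / (2 * (m + 1)) < t <= S / (2 * m):
        return (m + 1) / S - 1.0 / (2 * t)
    return 0.0


def integral_of_f(S: float, m: int, steps: int = 50000) -> float:
    """Midpoint rule for the integral of f over its support [S/(2(m+1)), S/m]."""
    a, b = S / (2 * (m + 1)), S / m
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h, S, m) for k in range(steps))


def error_numerator(m: int) -> int:
    """(2m)^2 + (m+1)^2, the numerator of the O* term in (5.30) times S^2."""
    return (2 * m) ** 2 + (m + 1) ** 2


if __name__ == "__main__":
    for S, m in ((1000.0, 1), (1000.0, 3), (50000.0, 10)):
        print(S, m, integral_of_f(S, m), 0.5 * math.log(1 + 1 / m))
```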


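Similarly,
\[
\begin{aligned}
\sum_{n \text{ odd}} f(n) &= \int_{-\infty}^\infty f(2x+1) - \frac{1}{2}B_2(\{x\})\frac{d^2f(2x+1)}{dx^2}\,dx \\
&= \frac{1}{2}\int_{-\infty}^\infty f(x)\,dx - 2\int_{-\infty}^\infty \frac{1}{2}B_2\left(\left\{\frac{x-1}{2}\right\}\right)f''(x)\,dx \\
&= \frac{1}{2}\int_{-\infty}^\infty f(x)\,dx + \frac{1}{6}\int_{-\infty}^\infty O^*(|f''(x)|)\,dx \\
&= \frac{1}{4}\log\left(1 + \frac{1}{m}\right) + \frac{1}{3} \cdot O^*\left( \left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2 \right).
\end{aligned}
\]
We use these expressions for $m \le C_0$, where $C_0 \ge 33$ is a constant to be computed later; they will give us the main term. For $m > C_0$, we use the bounds on $|g(m)|$ that Lemma 5.1.2 gives us.

(Starting now and for the rest of the paper, we will focus on the cases $v = 1$, $v = 2$ when giving explicit computational estimates. All of our procedures would allow higher values of $v$ as well, but, as will become clear much later, the gains from higher values of $v$ are offset by losses and complications elsewhere.)

Let us estimate (5.28). Let
\[
\begin{aligned}
c_{v,0} &= \begin{cases} 1/6 & \text{if } v = 1, \\ 1/3 & \text{if } v = 2, \end{cases} &
c_{v,1} &= \begin{cases} 1 & \text{if } v = 1, \\ 2.5 & \text{if } v = 2, \end{cases} \\
c_{v,2} &= \begin{cases} 55.768\ldots & \text{if } v = 1, \\ 817.168\ldots & \text{if } v = 2, \end{cases} &
c_{v,3} &= \begin{cases} 111.536\ldots & \text{if } v = 1, \\ 1634.34\ldots & \text{if } v = 2, \end{cases} \\
c_{v,4} &= \begin{cases} 0.0044325\ldots & \text{if } v = 1, \\ 0.038128\ldots & \text{if } v = 2, \end{cases} &
c_{v,5} &= \begin{cases} 0.1079\ldots & \text{if } v = 1, \\ 0.2046\ldots & \text{if } v = 2. \end{cases}
\end{aligned}
\]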

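Then (5.28) equals
\[
\begin{aligned}
&\sum_{m \le C_0} g_v(m) \cdot \left( \frac{\phi(v)}{2v}\log\left(1 + \frac{1}{m}\right) + O^*\left( c_{v,0}\frac{5m^2 + 2m + 1}{S^2} \right)\right) \\
&+ \sum_{S/10^6 \le s < S/C_0} \frac{1}{s}\int_{1/2}^1 O^*\left( \frac{c_{v,1}}{uS/s} \right) du
+ \sum_{S/10^{10} \le s < S/10^6} \frac{1}{s}\int_{1/2}^1 O^*\left( \frac{c_{v,2}\log(uS/s) + c_{v,3}}{uS/s} \right) du \\
&+ \sum_{s < S/10^{10}} \frac{1}{s}\int_{1/2}^1 O^*\left( \frac{c_{v,4}}{(\log uS/s)^2} + \frac{c_{v,5}}{\sqrt{uS/s}} \right) du,
\end{aligned}
\]
which is
\[
\begin{aligned}
&\sum_{m \le C_0} g_v(m) \cdot \frac{\phi(v)}{2v}\log\left(1 + \frac{1}{m}\right) + \sum_{m \le C_0} |g(m)| \cdot O^*\left( c_{v,0}\frac{5m^2 + 2m + 1}{S^2} \right) \\
&+ O^*\left( c_{v,1}\frac{\log 2}{C_0} + \frac{\log 2}{10^6}\left( c_{v,3} + c_{v,2}(1 + \log 10^6) \right) + \frac{2 - \sqrt{2}}{10^{10/2}}c_{v,5} \right) \\
&+ O^*\left( \sum_{s < S/10^{10}} \frac{c_{v,4}/2}{s(\log S/2s)^2} \right)
\end{aligned}
\]
for $S \ge (C_0 + 1)$. Note that $\sum_{s < S/10^{10}} \frac{1}{s(\log S/2s)^2} \le \int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt$.

Now
\[
\frac{c_{v,4}}{2}\int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt = \frac{c_{v,4}/2}{\log(10^{10}/2)} = \begin{cases} 0.00009923\ldots & \text{if } v = 1, \\ 0.000853636\ldots & \text{if } v = 2 \end{cases}
\]
and
\[
\frac{\log 2}{10^6}\left( c_{v,3} + c_{v,2}(1 + \log 10^6) \right) + \frac{2 - \sqrt{2}}{10^5}c_{v,5} = \begin{cases} 0.0006506\ldots & \text{if } v = 1, \\ 0.009525\ldots & \text{if } v = 2. \end{cases}
\]
For $C_0 = 10000$,
\[
\begin{aligned}
\frac{\phi(v)}{v}\frac{1}{2}\sum_{m \le C_0} g_v(m) \cdot \log\left(1 + \frac{1}{m}\right) &= \begin{cases} 0.362482\ldots & \text{if } v = 1, \\ 0.360576\ldots & \text{if } v = 2, \end{cases} \\
c_{v,0}\sum_{m \le C_0} |g_v(m)|(5m^2 + 2m + 1) &\le \begin{cases} 6204066.5\ldots & \text{if } v = 1, \\ 15911340.1\ldots & \text{if } v = 2, \end{cases}
\end{aligned}
\]
and
\[
c_{v,1} \cdot (\log 2)/C_0 = \begin{cases} 0.00006931\ldots & \text{if } v = 1, \\ 0.00017328\ldots & \text{if } v = 2. \end{cases}
\]
Thus, for $S \ge 100000$,
\[
\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s}\int_{1/2}^1 g_v(uS/s)\,du \le \begin{cases} 0.36393 & \text{if } v = 1, \\ 0.37273 & \text{if } v = 2. \end{cases} \tag{5.31}
\]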
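The constants in (5.31) are sums of the five contributions listed above, evaluated at $S = 100000$. The following sketch simply redoes that addition. The two inputs `main` and `second_moment` are the computed values quoted in the text for $C_0 = 10000$ (they are not recomputed here), and the function name is hypothetical.

```python
import math

C0 = 10_000          # cutoff for the range where g_v(m) is used exactly
S_MIN = 100_000      # smallest S covered by (5.31)


def total_bound(main, second_moment, c1, c2, c3, c4, c5) -> float:
    """Add up the main term and the four error contributions at S = S_MIN."""
    term_em = second_moment / S_MIN ** 2                     # Euler-Maclaurin errors
    term_c1 = c1 * math.log(2) / C0                          # range C0 < m <= 10^6
    term_mid = (math.log(2) / 1e6 * (c3 + c2 * (1 + math.log(1e6)))
                + (2 - math.sqrt(2)) / 1e5 * c5)             # middle range and sqrt term
    term_tail = (c4 / 2) / math.log(1e10 / 2)                # 1/(log)^2 tail
    return main + term_em + term_c1 + term_mid + term_tail


if __name__ == "__main__":
    v1 = total_bound(0.362482, 6204066.5, 1.0, 55.768, 111.536, 0.0044325, 0.1079)
    v2 = total_bound(0.360576, 15911340.1, 2.5, 817.168, 1634.34, 0.038128, 0.2046)
    print(v1, v2)   # compare with 0.36393 and 0.37273
```

Since the Euler-Maclaurin contribution decreases like $1/S^2$, the value at $S = 100000$ bounds the total for all larger $S$.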

For $S < 100000$, we proceed as above, but using the exact expression (5.29) instead of (5.30). Note (5.29) is of the form $f_{s,m,1}(S) + f_{s,m,2}(S)/S$, where both $f_{s,m,1}(S)$ and $f_{s,m,2}(S)$ depend only on $\lfloor S \rfloor$ (and on $s$ and $m$). Summing over $m \le S$, we obtain a bound of the form
\[
\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s}\int_{1/2}^1 g_v(uS/s)\,du \le G_v(S)
\]
with
\[
G_v(S) = K_{v,1}(\lfloor S \rfloor) + K_{v,2}(\lfloor S \rfloor)/S,
\]
where $K_{v,1}(n)$ and $K_{v,2}(n)$ can be computed explicitly for each integer $n$. (For example, $G_v(S) = 1 - 1/S$ for $1 \le S < 2$ and $G_v(S) = 0$ for $S < 1$.)

It is easy to check numerically that this implies that (5.31) holds not just for $S \ge 100000$ but also for $40 \le S < 100000$ (if $v = 1$) or $16 \le S < 100000$ (if $v = 2$). Using the fact that $G_v(S)$ is non-negative, we can compare $\int_1^T G_v(S)\,dS/S$ with $\log(T + 1/N)$ for each $T \in [2, 40] \cap \frac{1}{N}\mathbb{Z}$ ($N$ a large integer) to show, again numerically, that
\[
\int_1^T G_v(S)\frac{dS}{S} \le \begin{cases} 0.3698\log T & \text{if } v = 1, \\ 0.37273\log T & \text{if } v = 2. \end{cases} \tag{5.32}
\]
(We use $N = 100000$ for $v = 1$; already $N = 10$ gives us the answer above for $v = 2$. Indeed, computations suggest the better bound 0.358 instead of 0.37273; we are committed to using 0.37273 because of (5.31).)

Multiplying by $6v/\pi^2\sigma(v)$, we conclude that
\[
S_1(U,W) = \frac{x}{W} \cdot H_1\left( \frac{x}{WU} \right) + O^*\left( 5.08\,\zeta(3/2)^3\frac{x^{3/2}}{W^{3/2}U} \right) \tag{5.33}
\]
if $v = 1$,
\[
S_1(U,W) = \frac{x}{W} \cdot H_2\left( \frac{x}{WU} \right) + O^*\left( 1.27\,\zeta(3/2)^3\frac{x^{3/2}}{W^{3/2}U} \right) \tag{5.34}
\]
if $v = 2$, where
\[
H_1(S) = \begin{cases} \frac{6}{\pi^2}G_1(S) & \text{if } 1 \le S < 40, \\ 0.22125 & \text{if } S \ge 40, \end{cases} \qquad
H_2(S) = \begin{cases} \frac{4}{\pi^2}G_2(S) & \text{if } 1 \le S < 16, \\ 0.15107 & \text{if } S \ge 16. \end{cases} \tag{5.35}
\]
Hence (by (5.32))
\[
\int_1^T H_v(S)\frac{dS}{S} \le \begin{cases} 0.22482\log T & \text{if } v = 1, \\ 0.15107\log T & \text{if } v = 2; \end{cases} \tag{5.36}
\]
moreover,
\[
H_1(S) \le \frac{3}{\pi^2}, \qquad H_2(S) \le \frac{2}{\pi^2} \tag{5.37}
\]
for all $S$.

Note. There is another way to obtain cancellation on $\mu$, applicable when $(x/W) > Uq$ (as is unfortunately never the case in our main application). For this alternative to be taken, one must either apply Cauchy-Schwarz on $n$ rather than $m$ (resulting in exponential sums over $m$) or lump together all $m$ near each other and in the same congruence class modulo $q$ before applying Cauchy-Schwarz on $m$ (one can indeed do this if $\delta$ is small). We could then write
\[
\sum_{\substack{m \sim W \\ m \equiv r \bmod q}} \sum_{\substack{d | m \\ d > U}} \mu(d) = -\sum_{\substack{m \sim W \\ m \equiv r \bmod q}} \sum_{\substack{d | m \\ d \le U}} \mu(d) = -\sum_{d \le U} \mu(d)(W/qd + O(1))
\]
and obtain cancellation on $d$. If $Uq \ge (x/W)$, however, the error term dominates.

5.2 The sum $S_2$: the large sieve, primes and tails

We must now bound
\[
S_2(U', W', W) = \sum_{\substack{U' < m \le \frac{x}{W} \\ (m,v)=1}} \left| \sum_{W' < p \le W} (\log p)\,e(\alpha mp) \right|^2 \tag{5.38}
\]
for $U' = \max(U, x/2W)$, $W' = \max(V, W/2)$. (The condition $(p,v) = 1$ will be fulfilled automatically by the assumption $V > v$.)

From a modern perspective, this is clearly a case for a large sieve. It is also clear that we ought to try to apply a large sieve for sequences of prime support. What is subtler here is how to do things well for very large $q$ (i.e., $x/q$ small). This is in some sense a dual problem to that of $q$ small, but it poses additional complications; for example, it is not obvious how to take advantage of prime support for very large $q$.

As in type I, we avoid this entire issue by forbidding $q$ large, and then taking advantage of the error term $\delta/x$ in the approximation $\alpha = \frac{a}{q} + \frac{\delta}{x}$. This is one of the main innovations here. Note this alternative method will allow us to take advantage of prime support.

A key situation to study is that of frequencies $\alpha_i$ clustering around given rationals $a/q$ while nevertheless keeping at a certain small distance from each other.

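Lemma 5.2.1. Let $q \ge 1$. Let $\alpha_1, \alpha_2, \dots, \alpha_k \in \mathbb{R}/\mathbb{Z}$ be of the form $\alpha_i = a_i/q + \upsilon_i$, $0 \le a_i < q$, where the elements $\upsilon_i \in \mathbb{R}$ all lie in an interval of length $\upsilon > 0$, and where $a_i = a_j$ implies $|\upsilon_i - \upsilon_j| > \nu > 0$. Assume $\nu + \upsilon \le 1/q$. Then, for any $W, W' \ge 1$, $W' \ge W/2$,
\[ \sum_{i=1}^k \left| \sum_{W' < p \le W} (\log p) e(\alpha_i p) \right|^2 \le \min\left(1, \frac{2q}{\phi(q)} \frac{1}{\log\left((q(\nu+\upsilon))^{-1}\right)}\right) \cdot \left(W - W' + \nu^{-1}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.39} \]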

Proof. For any distinct $i$, $j$, the angles $\alpha_i$, $\alpha_j$ are separated by at least $\nu$ (if $a_i = a_j$) or at least $1/q - |\upsilon_i - \upsilon_j| \ge 1/q - \upsilon \ge \nu$ (if $a_i \ne a_j$). Hence we can apply the large sieve (in the optimal $N + \delta^{-1} - 1$ form due to Selberg [Sel91] and Montgomery-Vaughan [MV74]) and obtain the bound in (5.39) with $1$ instead of $\min(1, \dots)$ immediately.

We can also apply Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]). This gives us that the left side of (5.39) is at most
\[ \left( \sum_{\substack{r \le R \\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)} \right)^{-1} \sum_{\substack{r \le R \\ (r,q)=1}} \sum_{\substack{a' \bmod r \\ (a',r)=1}} \sum_{i=1}^k \left| \sum_{W' < p \le W} (\log p) e((\alpha_i + a'/r) p) \right|^2. \tag{5.40} \]

If we add all possible fractions of the form $a'/r$, $r \le R$, $(r,q) = 1$, to the fractions $a_i/q$, we obtain fractions that are separated by at least $1/qR^2$. If $\nu + \upsilon \ge 1/qR^2$, then the resulting angles $\alpha_i + a'/r$ are still separated by at least $\nu$. Thus we can apply the large sieve to (5.40); setting $R = 1/\sqrt{(\nu+\upsilon)q}$, we see that we gain a factor of
\[ \sum_{\substack{r \le R \\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q} \sum_{r \le R} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q} \sum_{d \le R} \frac{1}{d} \ge \frac{\phi(q)}{2q} \log\left((q(\nu+\upsilon))^{-1}\right), \tag{5.41} \]
since $\sum_{d \le R} 1/d \ge \log(R)$ for all $R \ge 1$ (integer or not).

Let us first give a bound on sums of the type of $S_2(U,V,W)$ using prime support but not the error terms (or Lemma 5.2.1). This is something that can be done very well using tools available in the literature. (Not all of these tools seem to be known as widely as they should be.) Bounds (5.42) and (5.44) are completely standard large-sieve bounds. To obtain the gain of a factor of $\log$ in (5.43), we use a lemma of Montgomery's, for whose modern proof (containing an improvement by Huxley) we refer to the standard source [IK04, Lemma 7.15]. The purpose of Montgomery's lemma is precisely to gain a factor of $\log$ in applications of the large sieve to sequences supported on the primes. To use the lemma efficiently, we apply Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], rather than more common forms of the large sieve. (The idea, used in [MV73] to prove an improved version of the Brun-Titchmarsh inequality, is that Farey fractions (rationals with bounded denominator) are not equidistributed; this fact can be exploited if a large sieve with weights is used.)

Lemma 5.2.2. Let $W \ge 1$, $W' \ge W/2$. Let $\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then
\[ \sum_{A_0 < m \le A_1} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(q, \lceil Q/2 \rceil)} \right\rceil \cdot (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.42} \]
If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:
\[ \sum_{A_0 < m \le A_1} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left\lceil \frac{A_1 - A_0}{q} \right\rceil \cdot \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.43} \]
If $A_1 - A_0 \le \varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0,1]$, the following bound also holds:
\[ \sum_{A_0 < m \le A_1} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left(W - W' + \frac{q}{1 - \varrho\rho}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.44} \]
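The bounds (5.42) and (5.43) can be sanity-checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: the parameters $a = 3$, $q = 7$, $W = 100$, $W' = 50$, $(A_0, A_1] = (0, 21]$ are hypothetical, $\alpha$ is taken to be exactly $a/q$ (so that $Q$ may be assumed as large as needed and $\min(q, \lceil Q/2 \rceil) = q$), and the helper names are not from the text.

```python
import cmath
import math

def primes_in(lo, hi):
    """Primes p with lo < p <= hi, by trial division (fine for small hi)."""
    return [p for p in range(max(lo + 1, 2), hi + 1)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]

def phi(n):
    """Euler's totient by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

def lhs(alpha, A0, A1, primes):
    """Sum over A0 < m <= A1 of |sum_p (log p) e(alpha m p)|^2."""
    total = 0.0
    for m in range(A0 + 1, A1 + 1):
        s = sum(math.log(p) * cmath.exp(2j * math.pi * alpha * m * p)
                for p in primes)
        total += abs(s) ** 2
    return total

# Hypothetical illustrative parameters; alpha = a/q exactly, so Q can be huge.
a, q = 3, 7
W, W_prime = 100, 50
A0, A1 = 0, 21

ps = primes_in(W_prime, W)
norm2 = sum(math.log(p) ** 2 for p in ps)      # sum of (log p)^2

left = lhs(a / q, A0, A1, ps)
blocks = math.ceil((A1 - A0) / q)
bound_542 = blocks * (W - W_prime + 2 * q) * norm2
bound_543 = blocks * (q / phi(q)) * W / math.log(W / (2 * q)) * norm2
print(left, bound_542, bound_543)
```

In this instance both right-hand sides exceed the left-hand side by a comfortable factor; the gain of (5.43) over (5.42) only becomes substantial when $W/q$ is large.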

Proof. Let $k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil$. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $k$ consecutive integers $m_0 + 1, m_0 + 2, \dots$. For $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of at least
\[ |(a/q)(m - m')| - O^*(k/qQ) = 1/q - O^*(1/2q) \ge 1/2q. \]
By the large sieve,
\[ \sum_{a=1}^{q} \left| \sum_{W' < p \le W} (\log p) e(\alpha (m_0 + a) p) \right|^2 \le ((W - W') + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.45} \]
We obtain (5.42) by summing over all $\lceil (A_1 - A_0)/k \rceil$ blocks.

If $A_1 - A_0 \le |\varrho q|$ and $q \le \rho Q$, $\varrho, \rho \in [0,1]$, we obtain (5.44) simply by applying the large sieve without splitting the interval $A_0 < m \le A_1$.

Let us now prove (5.43). We will use Montgomery's inequality, followed by Montgomery and Vaughan's large sieve with weights. An angle $a/q + a'_1/r_1$ is separated from other angles $a'/q + a'_2/r_2$ ($r_1, r_2 \le R$, $(a_i, r_i) = 1$) by at least $1/qr_1R$, rather than just $1/qR^2$. We will choose $R$ so that $qR^2 < Q$; this implies $1/Q < 1/qR^2 \le 1/qr_1R$.

By a lemma of Montgomery's [IK04, Lemma 7.15], applied (for each $1 \le a \le q$) to $S(\alpha) = \sum_n a_n e(\alpha n)$ with $a_n = \log(n) e(\alpha(m_0 + a)n)$ if $n$ is prime and $a_n = 0$ otherwise,
\[ \frac{1}{\phi(r)} \left| \sum_{W' < p \le W} (\log p) e(\alpha(m_0 + a)p) \right|^2 \le \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left| \sum_{W' < p \le W} (\log p) e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right) p\right) \right|^2 \tag{5.46} \]

for each square-free $r \le W'$. We multiply both sides of (5.46) by
\[ \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \]
and sum over all $a = 0, 1, \dots, q-1$ and all square-free $r \le R$ coprime to $q$; we will later make sure that $R \le W'$. We obtain that
\[ \sum_{\substack{r \le R \\ (r,q)=1}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \frac{\mu(r)^2}{\phi(r)} \cdot \sum_{a=1}^{q} \left| \sum_{W' < p \le W} (\log p) e(\alpha(m_0 + a)p) \right|^2 \tag{5.47} \]
is at most
\[ \sum_{\substack{r \le R \\ (r,q)=1 \\ r \text{ sq-free}}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \sum_{a=1}^{q} \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left| \sum_{W' < p \le W} (\log p) e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right) p\right) \right|^2. \tag{5.48} \]
We now apply the large sieve with weights [MV73, (1.6)], recalling that each angle $\alpha(m_0 + a) + a'/r$ is separated from the others by at least $1/qrR - 1/Q$; we obtain that (5.48) is at most $\sum_{W' < p \le W} (\log p)^2$. It remains to estimate the sum in the first line of (5.47). (We are following here a procedure analogous to that used in [MV73] to prove the Brun-Titchmarsh theorem.)

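Assume first that $q \le W/13.5$. Set
\[ R = \left( \sigma \frac{W}{q} \right)^{1/2}, \tag{5.49} \]
where $\sigma = 1/2e^{2 \cdot 0.25068} = 0.30285\dots$. It is clear that $qR^2 < Q$, $q < W'$ and $R \ge 2$. Moreover, for $r \le R$,
\[ \frac{1}{Q} \le \frac{1}{3.5W} \le \frac{\sigma}{3.5} \frac{1}{\sigma W} = \frac{\sigma}{3.5} \frac{1}{qR^2} \le \frac{\sigma/3.5}{qrR}. \]
Hence
\[ \begin{aligned} \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} &\le \frac{W}{2} + \frac{3}{2} \frac{qrR}{1 - \sigma/3.5} = \frac{W}{2} + \frac{3r}{2\left(1 - \frac{\sigma}{3.5}\right)R} \cdot 2\sigma \frac{W}{2} \\ &= \frac{W}{2} \left( 1 + \frac{3\sigma}{1 - \sigma/3.5} \frac{r}{R} \right) < \frac{W}{2} \left( 1 + \frac{r}{R} \right) \end{aligned} \]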

and so
\[ \begin{aligned} \sum_{\substack{r \le R \\ (r,q)=1}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \frac{\mu(r)^2}{\phi(r)} &\ge \frac{2}{W} \sum_{\substack{r \le R \\ (r,q)=1}} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} \\ &\ge \frac{2}{W} \frac{\phi(q)}{q} \sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)}. \end{aligned} \]
For $R \ge 2$,
\[ \sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \log R + 0.25068; \]
this is true for $R \ge 100$ by [MV73, Lemma 8] and easily verifiable numerically for $2 \le R < 100$. (It suffices to verify this for $R$ integer with $r < R$ instead of $r \le R$, as that is the worst case.)
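The numerical verification mentioned above can be sketched as follows. This is one possible reading of the parenthetical remark, stated as an assumption: for real $R$ in $[N-1, N)$ the worst case is taken to be $R \to N^-$, so it is enough to test each integer $N$ from $3$ to $100$ with the sum restricted to $r < N$. The helper names are illustrative.

```python
import math

def phi(n):
    """Euler's totient by direct count (fine for n <= 100)."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

def is_squarefree(n):
    return all(n % (d * d) for d in range(2, math.isqrt(n) + 1))

def weighted_sum(R):
    """Sum over square-free r < R of (1 + r/R)^(-1) / phi(r), R an integer."""
    return sum(1.0 / ((1.0 + r / R) * phi(r))
               for r in range(1, R) if is_squarefree(r))

# Margin of the inequality  sum > log R + 0.25068  at each integer R.
margins = {R: weighted_sum(R) - (math.log(R) + 0.25068)
           for R in range(3, 101)}
for R in (3, 5, 10, 100):
    print(R, margins[R])
```

The margin is very thin for small $R$ (at $R = 5$ it is of the order of $10^{-6}$), which suggests that the constant $0.25068$ cannot be raised noticeably without changing the argument.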

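Now
\[ \log R = \frac{1}{2} \left( \log \frac{W}{2q} + \log 2\sigma \right) = \frac{1}{2} \log \frac{W}{2q} - 0.25068. \]
Hence
\[ \sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \frac{1}{2} \log \frac{W}{2q} \]
and the statement follows.

Now consider the case $q > W/13.5$. If $q$ is even, then, in this range, inequality (5.42) is always better than (5.43), and so we are done. Assume, then, that $W/13.5 < q \le W/2$ and $q$ is odd. We set $R = 2$; clearly $qR^2 < W \le Q$ and $q < W/2 \le W'$, and so this choice of $R$ is valid. It remains to check that
\[ \frac{1}{\frac{W}{2} + \frac{3}{2} \left( \frac{1}{2q} - \frac{1}{Q} \right)^{-1}} + \frac{1}{\frac{W}{2} + \frac{3}{2} \left( \frac{1}{4q} - \frac{1}{Q} \right)^{-1}} \ge \frac{1}{W} \log \frac{W}{2q}. \]
This follows because
\[ \frac{1}{\frac{1}{2} + \frac{3}{2} \left( \frac{t}{2} - \frac{1}{3.5} \right)^{-1}} + \frac{1}{\frac{1}{2} + \frac{3}{2} \left( \frac{t}{4} - \frac{1}{3.5} \right)^{-1}} \ge \log \frac{t}{2} \]
for all $2 \le t \le 13.5$.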
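The last inequality is elementary, and a quick grid evaluation gives some reassurance about it. The following is a minimal sketch, not a proof: the grid of 10001 points on $[2, 13.5]$ and the function name are choices made here for illustration.

```python
import math

def two_term_sum(t):
    """Left side of the final inequality, with 1/3.5 standing for W/Q."""
    inv = 1.0 / 3.5
    term1 = 1.0 / (0.5 + 1.5 / (t / 2.0 - inv))
    term2 = 1.0 / (0.5 + 1.5 / (t / 4.0 - inv))
    return term1 + term2

# Sample t on [2, 13.5] and record the smallest margin over log(t/2).
grid = [2.0 + 11.5 * i / 10000 for i in range(10001)]
min_margin = min(two_term_sum(t) - math.log(t / 2.0) for t in grid)
print(min_margin)
```

On this grid the left side stays above $\log(t/2)$ with room to spare, so the inequality is not delicate in the way the bound on the weighted sum over $r$ is.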

We need a version of Lemma 5.2.2 with $m$ restricted to the odd numbers, since we plan to set the parameter $v$ equal to 2.

Lemma 5.2.3. Let $W \ge 1$, $W' \ge W/2$. Let $2\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then
\[ \sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(2q, Q)} \right\rceil \cdot (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.50} \]
If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:
\[ \sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left\lceil \frac{A_1 - A_0}{2q} \right\rceil \cdot \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.51} \]
If $A_1 - A_0 \le 2\varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0,1]$, the following bound also holds:
\[ \sum_{A_0 < m \le A_1} \left| \sum_{W' < p \le W} (\log p) e(\alpha m p) \right|^2 \le \left(W - W' + \frac{q}{1 - \varrho\rho}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.52} \]

Proof. We follow the proof of Lemma 5.2.2, noting the differences. Let
\[ k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil, \]
just as before. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $2k$ consecutive integers; any such block contains at most $k$ odd numbers. For odd $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of
\[ |\alpha(m - m')| = \left| 2\alpha \frac{m - m'}{2} \right| = |(a/q)k| - O^*(k/qQ) \ge 1/2q. \]
We obtain (5.50) and (5.52) just as we obtained (5.42) and (5.44) before. To obtain (5.51), proceed again as before, noting that the angles we are working with can be labelled as $\alpha(m_0 + 2a)$, $0 \le a < q$.

The idea now (for large $\delta$) is that, if $\delta$ is not negligible, then, as $m$ increases and $\alpha m$ loops around the circle $\mathbb{R}/\mathbb{Z}$, $\alpha m$ roughly repeats itself every $q$ steps, but with a slight displacement. This displacement gives rise to a configuration to which Lemma 5.2.1 is applicable. The effect is that we can apply the large sieve once instead of many times, thus leading to a gain of a large factor (essentially, the number of times the large sieve would have been used). This is how we obtain the factor of $|\delta|$ in the denominator of the main term $x/|\delta|q$ in (5.56) and (5.57).
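The displacement can be made concrete with exact rational arithmetic. The sketch below assumes hypothetical values $a = 3$, $q = 7$, $x = 10^6$, $\delta = 25$ (compatible with $|\delta/x| \le 1/qQ$ for, say, $Q = 5000$) and follows odd $m$ in steps of $2q$, as in the labelling $\alpha(m_0 + 2a)$ of Lemma 5.2.3; none of these numbers come from the text.

```python
from fractions import Fraction

def centered_mod_one(y):
    """Representative of y mod 1 in (-1/2, 1/2], for a Fraction y."""
    r = y - (y.numerator // y.denominator)
    return r - 1 if r > Fraction(1, 2) else r

# Hypothetical illustrative values.
a, q = 3, 7
x = 10**6
delta = Fraction(25)
two_alpha = Fraction(a, q) + delta / x      # 2*alpha = a/q + delta/x
alpha = two_alpha / 2

m0 = 101                                    # an odd starting value of m
# After j further steps of 2q, alpha*m returns to the same point of R/Z
# up to a displacement of j * delta * q / x.
shifts = [centered_mod_one(alpha * (m0 + 2 * q * j) - alpha * m0)
          for j in range(1, 5)]
print(shifts)
```

Each full loop moves the angle by $\delta q / x$, which is exactly the spacing $\nu = |\delta q|/x$ fed into Lemma 5.2.1 in the proof of Proposition 5.2.4.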

Proposition 5.2.4. Let $x \ge W \ge 1$, $W' \ge W/2$, $U' \ge x/2W$. Let $Q \ge 3.5W$. Let $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ$, $q \le Q$. Let $S_2(U',W',W)$ be as in (5.38) with $v = 2$.

For $q \le \rho Q$, where $\rho \in [0,1]$,
\[ S_2(U',W',W) \le \left( \max(1, 2\rho) \left( \frac{x}{8q} + \frac{x}{2W} \right) + \frac{W}{2} + 2q \right) \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.53} \]
If $q < W/2$,
\[ S_2(U',W',W) \le \left( \frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \right) \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.54} \]
If $W > x/4q$, the following bound also holds:
\[ S_2(U',W',W) \le \left( \frac{W}{2} + \frac{q}{1 - x/4Wq} \right) \sum_{W' < p \le W} (\log p)^2. \tag{5.55} \]
If $\delta \ne 0$ and $x/4W + q \le x/|\delta|q$,
\[ S_2(U',W',W) \le \min\left( 1, \frac{2q/\phi(q)}{\log\left( \frac{x}{|\delta q|} \left( q + \frac{x}{4W} \right)^{-1} \right)} \right) \cdot \left( \frac{x}{|\delta q|} + \frac{W}{2} \right) \sum_{W' < p \le W} (\log p)^2. \tag{5.56} \]
Lastly, if $\delta \ne 0$ and $q \le \rho Q$, where $\rho \in [0,1)$,
\[ S_2(U',W',W) \le \left( \frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1-\rho)Q} + \frac{x}{4(1-\rho)W} \right) \sum_{W' < p \le W} (\log p)^2. \tag{5.57} \]

The trivial bound would be in the order of
\[ S_2(U',W',W) = (x/2\log x) \sum_{W' < p \le W} (\log p)^2. \]
In practice, (5.55) gets applied when $W \ge x/q$.

Proof. Let us first prove statements (5.54) and (5.53), which do not involve $\delta$. Assume first $q \le W/2$. Then, by (5.51) with $A_0 = U'$, $A_1 = x/W$,
\[ S_2(U',W',W) \le \left( \frac{x/W - U'}{2q} + 1 \right) \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \sum_{W' < p \le W} (\log p)^2. \]
Clearly $(x/W - U')W \le (x/2W) \cdot W = x/2$. Thus (5.54) holds.

Assume now that $q \le \rho Q$. Apply (5.50) with $A_0 = U'$, $A_1 = x/W$. Then
\[ S_2(U',W',W) \le \left( \frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1 \right) (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2. \]
Now
\[ \begin{aligned} \left( \frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1 \right) \cdot (W - W' + 2q) &\le \left( \frac{x}{W} - U' \right) \frac{W - W'}{q \min(2, \rho^{-1})} + \max(1, 2\rho) \left( \frac{x}{W} - U' \right) + W/2 + 2q \\ &\le \frac{x/4}{q \min(2, \rho^{-1})} + \max(1, 2\rho) \frac{x}{2W} + W/2 + 2q. \end{aligned} \]
This implies (5.53).

If $W > x/4q$, apply (5.44) with $\varrho = x/4Wq$, $\rho = 1$. This yields (5.55).

Assume now that $\delta \ne 0$ and $x/4W + q \le x/|\delta q|$. Let $Q' = x/|\delta q|$. For any $m_1$, $m_2$ with $x/2W < m_1, m_2 \le x/W$, we have $|m_1 - m_2| \le x/2W \le 2(Q' - q)$, and so
\[ \left| \frac{m_1 - m_2}{2} \cdot \delta/x + q\delta/x \right| \le Q'|\delta|/x = \frac{1}{q}. \tag{5.58} \]
The conditions of Lemma 5.2.1 are thus fulfilled with $\upsilon = (x/4W) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
\[ \min\left( 1, \frac{2q}{\phi(q)} \frac{1}{\log\left((q(\nu + \upsilon))^{-1}\right)} \right) \left( W - W' + \nu^{-1} \right) \sum_{W' < p \le W} (\log p)^2. \]
Here $W - W' + \nu^{-1} = W - W' + x/|q\delta| \le W/2 + x/|q\delta|$ and
\[ (q(\nu + \upsilon))^{-1} = \left( q \frac{|\delta|}{x} \right)^{-1} \left( q + \frac{x}{4W} \right)^{-1}. \]

Lastly, assume $\delta \ne 0$ and $q \le \rho Q$. We let $Q' = x/|\delta q| \ge Q$ again, and we split the range $U' < m \le x/W$ into intervals of length $2(Q' - q)$, so that (5.58) still holds within each interval. We apply Lemma 5.2.1 with $\upsilon = (Q' - q) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
\[ \left( 1 + \frac{x/W - U}{2(Q' - q)} \right) \left( W - W' + \nu^{-1} \right) \sum_{W' < p \le W} (\log p)^2. \]
Here $W - W' + \nu^{-1} \le W/2 + x/q|\delta|$ as before. Moreover,
\[ \begin{aligned} \left( \frac{W}{2} + \frac{x}{q|\delta|} \right) \left( 1 + \frac{x/W - U}{2(Q' - q)} \right) &\le \left( \frac{W}{2} + Q' \right) \left( 1 + \frac{x/2W}{2(1-\rho)Q'} \right) \\ &\le \frac{W}{2} + Q' + \frac{x}{8(1-\rho)Q'} + \frac{x}{4W(1-\rho)} \\ &\le \frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1-\rho)Q} + \frac{x}{4(1-\rho)W}. \end{aligned} \]
Hence (5.57) holds.

Chapter 6

Minor-arc totals

It is now time to make all of our estimates fully explicit, choose our parameters, put our type I and type II estimates together and give our final minor-arc estimates.

Let $x > 0$ be given. Starting in section 6.3.1, we will assume that $x \ge x_0 = 2.16 \cdot 10^{20}$. We will choose our main parameters $U$ and $V$ gradually, as the need arises; we assume from the start that $2 \cdot 10^6 \le V < x/4$ and $UV \le x$.

We are also given an angle $\alpha \in \mathbb{R}/\mathbb{Z}$. We choose an approximation $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $q \le Q$, $|\delta/x| \le 1/qQ$. The parameter $Q$ will be chosen later; we assume from the start that $Q \ge \max(16, 2\sqrt{x})$ and $Q \ge \max(2U, x/U)$.

(Actually, $U$ and $V$ will be chosen in different ways depending on the size of $q$. Actually, even $Q$ will depend on the size of $q$; this may seem circular, but what actually happens is the following: we will first set a value for $Q$ depending only on $x$, and, if the corresponding value of $q \le Q$ is larger than a certain parameter $y$ depending on $x$, then we reset $U$, $V$ and $Q$, and obtain a new value of $q$.)

Let $S_{I,1}$, $S_{I,2}$, $S_{II}$, $S_0$ be as in (3.9), with the smoothing function $\eta = \eta_2$ as in (3.4). (We bounded the type I sums $S_{I,1}$, $S_{I,2}$ for a general smoothing function $\eta$; it is only here that we are specifying $\eta$.)

The term $S_0$ is 0 because $V < x/4$ and $\eta_2$ is supported on $[-1/4, 1]$. We set $v = 2$.

6.1 The smoothing function

For the smoothing function $\eta_2$ in (3.4),
\[ |\eta_2|_1 = 1, \quad |\eta_2'|_1 = 8 \log 2, \quad |\eta_2''|_1 = 48, \tag{6.1} \]
as per [Tao14, (5.9)–(5.13)]. Similarly, for $\eta_{2,\rho}(t) = \log(\rho t)\eta_2(t)$, where $\rho \ge 4$,
\[ \begin{aligned} |\eta_{2,\rho}|_1 &< \log(\rho)|\eta_2|_1 = \log(\rho), \\ |\eta_{2,\rho}'|_1 &= 2\eta_{2,\rho}(1/2) = 2\log(\rho/2)\eta_2(1/2) < (8 \log 2)\log \rho, \\ |\eta_{2,\rho}''|_1 &= 4\log(\rho/4) + |2\log\rho - 4\log(\rho/4)| + |4\log 2 - 4\log\rho| + |\log\rho - 4\log 2| + |\log\rho| < 48\log\rho. \end{aligned} \tag{6.2} \]
In the first inequality, we are using the fact that $\log(\rho t)$ is always positive (and less than $\log(\rho)$) when $t$ is in the support of $\eta_2$.

Write $\log^+ x$ for $\max(\log x, 0)$.
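The three norms in (6.1) can be reproduced numerically. The sketch below assumes that (3.4) defines $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$; that formula is not restated in this section, so it should be treated as an assumption of the example. The norm $|\eta_2'|_1$ is computed as the total variation of $\eta_2$, and $|\eta_2''|_1$ as the total variation of $\eta_2'$, jumps included.

```python
import math

def eta2(t):
    """Assumed form of (3.4): 4 * max(log 2 - |log 2t|, 0)."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

def eta2_prime(t):
    """Derivative of eta2 away from the breakpoints 1/4, 1/2, 1."""
    if 0.25 < t < 0.5:
        return 4.0 / t
    if 0.5 < t < 1.0:
        return -4.0 / t
    return 0.0

def total_variation(f, lo, hi, n):
    """Sampled total variation of f on [lo, hi] using n equal steps."""
    h = (hi - lo) / n
    vals = [f(lo + i * h) for i in range(n + 1)]
    return sum(abs(vals[i + 1] - vals[i]) for i in range(n))

n = 200_000
h = 0.75 / n
# Midpoint rule on the support [1/4, 1].
l1_norm = sum(eta2(0.25 + (i + 0.5) * h) for i in range(n)) * h
tv_eta2 = total_variation(eta2, 0.2, 1.1, n)              # |eta2'|_1
tv_eta2_prime = total_variation(eta2_prime, 0.2, 1.1, n)  # |eta2''|_1
print(l1_norm, tv_eta2, tv_eta2_prime)
```

The value 48 splits as $16 + 8 + 16 + 4 + 4$: three jumps of $\eta_2'$ at $1/4$, $1/2$ and $1$, plus the variation of $\pm 4/t$ on the two open intervals.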

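6.2 Contributions of different types

6.2.1 Type I terms: $S_{I,1}$

The term $S_{I,1}$ can be handled directly by Lemma 4.2.3, with $\rho_0 = 4$ and $D = U$. (Condition (4.38) is valid thanks to (6.2).) Since $U \le Q/2$, the contribution of $S_{I,1}$ gets bounded by (4.40) and (4.41): the absolute value of $S_{I,1}$ is at most
\[ \begin{aligned} &\frac{x}{q} \min\left( 1, \frac{c_0/\delta^2}{(2\pi)^2} \right) \left| \sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m} \log\frac{x}{mq} \right| + \frac{x}{q} \left| \widehat{\log \cdot \eta}(-\delta) \right| \left| \sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m} \right| \\ &+ \frac{2\sqrt{c_0c_1}}{\pi} \left( U \log\frac{ex}{U} + \sqrt{3}\, q \log\frac{q}{c_2} + \frac{q}{2} \log\frac{q}{c_2} \log^+\frac{2U}{q} \right) + \frac{3c_1}{2} \frac{x}{q} \log\frac{q}{c_2} \log^+\frac{U}{c_2x/q} \\ &+ \frac{3c_1}{2} \sqrt{\frac{2x}{c_2}} \log\frac{2x}{c_2} + \left( \frac{c_0}{2} - \frac{2c_0}{\pi^2} \right) \left( \frac{U^2}{4qx} \log\frac{e^{1/2}x}{U} + \frac{1}{e} \right) \\ &+ \frac{2|\eta'|_1}{\pi} q \max\left( 1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x} \right) \log x + \frac{20c_0c_2^{3/2}}{3\pi^2} \sqrt{2x} \log\frac{2\sqrt{e}x}{c_2}, \end{aligned} \tag{6.3} \]
where $c_0 = 31.521$ (by Lemma B.2.3), $c_1 = 1.0000028 > 1 + (8\log 2)/V \ge 1 + (8\log 2)/(x/U)$, and $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\dots$. By (2.1) (with $k = 2$), (B.17) and Lemma B.2.4,
\[ \left| \widehat{\log \cdot \eta}(-\delta) \right| \le \min\left( 2 - \log 4, \frac{24 \log 2}{\pi^2\delta^2} \right). \]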

By (2.20), (2.22) and (2.23), the first line of (6.3) is at most
\[ \begin{aligned} &\frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \left( \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{U}{q^2}}, 1 \right) \log\frac{x}{U} + 1.00303 \frac{q}{\phi(q)} \right) \\ &+ \frac{x}{q} \min\left( 2 - \log 4, \frac{c_0''}{\delta^2} \right) \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{U}{q^2}}, 1 \right), \end{aligned} \]
where $c_0' = 0.798437 > c_0/(2\pi)^2$, $c_0'' = 1.685532$. Clearly $c_0''/c_0' > 1 > 2 - \log 4$.

Taking derivatives, we see that $t \mapsto (t/2)\log(t/c_2)\log^+ 2U/t$ takes its maximum (for $t \in [1, 2U]$) when $\log(t/c_2)\log^+ 2U/t = \log t/c_2 - \log^+ 2U/t$; since $t \mapsto \log t/c_2 - \log^+ 2U/t$ is increasing on $[1, 2U]$, we conclude that
\[ \frac{q}{2} \log\frac{q}{c_2} \log^+\frac{2U}{q} \le U \log\frac{2U}{c_2}. \]
Similarly, $t \mapsto t\log(x/t)\log^+(U/t)$ takes its maximum at a point $t \in [0, U]$ for which $\log(x/t)\log^+(U/t) = \log(x/t) + \log^+(U/t)$, and so
\[ \frac{x}{q} \log\frac{q}{c_2} \log^+\frac{U}{c_2x/q} \le \frac{U}{c_2} (\log x + \log U). \]

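We conclude that
\[ \begin{aligned} |S_{I,1}| \le\ &\frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \left( \min\left( \frac{4q/\phi(q)}{5\log^+\frac{U}{q^2}}, 1 \right) \left( \log\frac{x}{U} + c_{3,I} \right) + c_{4,I} \frac{q}{\phi(q)} \right) \\ &+ \left( c_{7,I} \log\frac{q}{c_2} + c_{8,I} \log x \max\left( 1, \log\frac{c_{11,I}q^2}{x} \right) \right) q + c_{10,I} \frac{U^2}{4qx} \log\frac{e^{1/2}x}{U} \\ &+ \left( c_{5,I} \log\frac{2U}{c_2} + c_{6,I} \log xU \right) U + c_{9,I} \sqrt{x} \log\frac{2\sqrt{e}x}{c_2} + \frac{c_{10,I}}{e}, \end{aligned} \tag{6.4} \]
where $c_2$ and $c_0'$ are as above, $c_{3,I} = 2.11104 > c_0''/c_0'$, $c_{4,I} = 1.00303$, $c_{5,I} = 3.57422 > 2\sqrt{c_0c_1}/\pi$, $c_{6,I} = 2.23389 > 3c_1/2c_2$, $c_{7,I} = 6.19072 > 2\sqrt{3c_0c_1}/\pi$, $c_{8,I} = 3.53017 > 2(8\log 2)/\pi$,
\[ c_{9,I} = 19.1568 > \frac{3\sqrt{2}c_1}{2\sqrt{c_2}} + \frac{20\sqrt{2}c_0c_2^{3/2}}{3\pi^2}, \]
$c_{10,I} = 9.37301 > c_0(1/2 - 2/\pi^2)$ and $c_{11,I} = 9.0857 > c_0e^3/(4\pi \cdot 8\log 2)$.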
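Since the decimal constants in (6.4) are all derived from $c_0$ and $c_1$, they are easy to recompute. The following sketch evaluates the defining expressions and compares them with the stated values to displayed precision only. The identification $c_0'' = 24\log 2/\pi^2$ is read off from the bound on $|\widehat{\log \cdot \eta}(-\delta)|$ and is an assumption of this example, as are the dictionary keys.

```python
import math

c0 = 31.521
c1 = 1.0000028
c2 = 6 * math.pi / (5 * math.sqrt(c0))
c0p = c0 / (2 * math.pi) ** 2                 # c_0'
c0pp = 24 * math.log(2) / math.pi ** 2        # c_0'' (assumed definition)

computed = {
    "c2": c2,
    "c0'": c0p,
    "c0''": c0pp,
    "c3I": c0pp / c0p,
    "c5I": 2 * math.sqrt(c0 * c1) / math.pi,
    "c6I": 3 * c1 / (2 * c2),
    "c7I": 2 * math.sqrt(3 * c0 * c1) / math.pi,
    "c8I": 2 * (8 * math.log(2)) / math.pi,
    "c9I": 3 * math.sqrt(2) * c1 / (2 * math.sqrt(c2))
           + 20 * math.sqrt(2) * c0 * c2 ** 1.5 / (3 * math.pi ** 2),
    "c10I": c0 * (0.5 - 2 / math.pi ** 2),
    "c11I": c0 * math.e ** 3 / (4 * math.pi * 8 * math.log(2)),
}
stated = {"c2": 0.67147, "c0'": 0.798437, "c0''": 1.685532, "c3I": 2.11104,
          "c5I": 3.57422, "c6I": 2.23389, "c7I": 6.19072, "c8I": 3.53017,
          "c9I": 19.1568, "c10I": 9.37301, "c11I": 9.0857}

for name, value in stated.items():
    print(f"{name:5s} computed {computed[name]:.6f}   stated {value}")
```

The comparison is deliberately made with a tolerance rather than as a strict inequality, because the stated values are given to five or six significant digits only.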

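6.2.2 Type I terms: $S_{I,2}$

The case $q \le Q/V$. If $q \le Q/V$, then, for $v \le V$,
\[ 2v\alpha = \frac{va}{q} + O^*\left( \frac{v}{Qq} \right) = \frac{va}{q} + O^*\left( \frac{1}{q^2} \right), \]
and so $va/q$ is a valid approximation to $2v\alpha$. (Here we are using $v$ to label an integer variable bounded above by $v \le V$; we no longer need $v$ to label the quantity in (3.10), since that has been set equal to the constant 2.) Moreover, for $Q_v = Q/v$, we see that $2v\alpha = (va/q) + O^*(1/qQ_v)$. If $\alpha = a/q + \delta/x$, then $v\alpha = va/q + \delta/(x/v)$. Now
\[ S_{I,2} = \sum_{\substack{v \le V \\ v \text{ odd}}} \Lambda(v) \sum_{\substack{m \le U \\ m \text{ odd}}} \mu(m) \sum_{\substack{n \\ n \text{ odd}}} e((v\alpha) \cdot mn)\, \eta(mn/(x/v)). \tag{6.5} \]
We can thus estimate $S_{I,2}$ by applying Lemma 4.2.2 to each inner double sum in (6.5). We obtain that, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$ and $c_0 = 31.521$, then $|S_{I,2}|$ is at most
\[ \sum_{v \le V} \Lambda(v) \left[ \frac{x/v}{2q_v} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \left| \sum_{\substack{m \le M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m} \right| + \frac{c_{10,I}\, q}{4x/v} \left( \frac{U}{q_v} + 1 \right)^2 \right] \tag{6.6} \]
plus
\[ \begin{aligned} &\sum_{v \le V} \Lambda(v) \left( \frac{2\sqrt{c_0c_+}}{\pi} U + \frac{3c_+}{2} \frac{x}{vq_v} \log^+\frac{U}{c_2x/(vq_v)} + \frac{\sqrt{c_0c_+}}{\pi} q_v \log^+\frac{U}{q_v/2} \right) \\ &+ \sum_{v \le V} \Lambda(v) \left( c_{8,I} \max\left( \log\frac{c_{11,I}q_v^2}{x/v}, 1 \right) q_v + \left( \frac{2\sqrt{3c_0c_+}}{\pi} + \frac{3c_+}{2c_2} + \frac{55c_0c_2}{6\pi^2} \right) q_v \right), \end{aligned} \tag{6.7} \]
where $q_v = q/(q,v)$, $M_v \in [\min(Q/2v, U), U]$ and $c_+ = 1 + (8\log 2)/(x/UV)$; if $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.6) plus
\[ \begin{aligned} &\sum_{v \le V} \Lambda(v) \left( \frac{\sqrt{c_0c_1}}{\pi/2} U + \frac{3c_1}{2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2U}{\frac{x/v}{|\delta|q_v}} \right) \frac{x}{Q} + \frac{35c_0c_2}{3\pi^2} q_v \right) \\ &+ \sum_{v \le V} \Lambda(v) \frac{\sqrt{c_0c_1}}{\pi/2} (1+\epsilon) \min\left( \left\lfloor \frac{x/v}{|\delta|q_v} \right\rfloor + 1, 2U \right) \left( \sqrt{3+2\epsilon} + \frac{1}{2} \log^+\frac{2U}{\left\lfloor \frac{x/v}{|\delta|q_v} \right\rfloor + 1} \right). \end{aligned} \tag{6.8} \]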

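Write $S_V = \sum_{v \le V} \Lambda(v)/(vq_v)$. By (2.12),
\[ \begin{aligned} S_V &\le \sum_{v \le V} \frac{\Lambda(v)}{vq} + \sum_{\substack{v \le V \\ (v,q) > 1}} \frac{\Lambda(v)}{v} \left( \frac{(q,v)}{q} - \frac{1}{q} \right) \\ &\le \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p) \left( v_p(q) + \sum_{\substack{\alpha \ge 1 \\ p^{\alpha + v_p(q)} \le V}} \frac{1}{p^\alpha} - \sum_{\substack{\alpha \ge 1 \\ p^\alpha \le V}} \frac{1}{p^\alpha} \right) \\ &\le \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p) v_p(q) = \frac{\log Vq}{q}. \end{aligned} \tag{6.9} \]
This helps us to estimate (6.6). We could also use this to estimate the second term in the first line of (6.7), but, for that purpose, it will actually be wiser to use the simpler bound
\[ \sum_{v \le V} \Lambda(v) \frac{x}{vq_v} \log^+\frac{U}{c_2x/(vq_v)} \le \sum_{v \le V} \Lambda(v) \frac{U}{c_2e} \le \frac{1.0004}{ec_2} UV \tag{6.10} \]
(by (2.14) and the fact that $t\log^+ A/t$ takes its maximum at $t = A/e$).

We bound the sum over $m$ in (6.6) by (2.20) and (2.22):
\[ \left| \sum_{\substack{m \le M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m} \right| \le \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{M_v}{2q^2}}, 1 \right). \]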

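To bound the terms involving $(U/q_v + 1)^2$, we use
\[ \begin{aligned} \sum_{v \le V} \Lambda(v)v &\le 0.5004V^2 \quad \text{(by (2.17))}, \\ \sum_{v \le V} \Lambda(v)v(v,q)^j &\le \sum_{v \le V} \Lambda(v)v + V \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v)(v,q)^j, \end{aligned} \]
\[ \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v)(v,q) \le \sum_{p | q} (\log p) \sum_{1 \le \alpha \le \log_p V} p^{v_p(q)} \le \sum_{p | q} (\log p) \frac{\log V}{\log p} p^{v_p(q)} \le (\log V) \sum_{p | q} p^{v_p(q)} \le q \log V \]
and
\[ \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v)(v,q)^2 \le \sum_{p | q} (\log p) \sum_{1 \le \alpha \le \log_p V} p^{v_p(q) + \alpha} \le \sum_{p | q} (\log p) \cdot 2p^{v_p(q)} \cdot p^{\log_p V} \le 2qV \log q. \]
Using (2.14) and (6.9) as well, we conclude that (6.6) is at most
\[ \begin{aligned} &\frac{x}{2q} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{\min(Q/2V, U)}{2q^2}}, 1 \right) \log Vq \\ &+ \frac{c_{10,I}}{4x} \left( 0.5004V^2q \left( \frac{U}{q} + 1 \right)^2 + 2UVq \log V + 2U^2V \log V \right). \end{aligned} \]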

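Assume $Q \le 2UV/e$. Using (2.14), (6.10), (2.18) and the inequality $vq \le Vq \le Q$ (which implies $q/2 \le U/e$), we see that (6.7) is at most
\[ 1.0004 \left( \left( \frac{2\sqrt{c_0c_+}}{\pi} + \frac{3c_+}{2ec_2} \right) UV + \frac{\sqrt{c_0c_+}}{\pi} Q \log\frac{U}{q/2} \right) + \left( c_{5,I_2} \max\left( \log\frac{c_{11,I}q^2V}{x}, 2 \right) + c_{6,I_2} \right) Q, \]
where $c_{5,I_2} = 3.53312 > 1.0004 \cdot c_{8,I}$ and
\[ c_{6,I_2} = 1.0004 \left( \frac{2\sqrt{3c_0c_+}}{\pi} + \frac{3c_+}{2c_2} + \frac{55c_0c_2}{6\pi^2} \right). \tag{6.11} \]
The expressions in (6.8) get estimated similarly. The first line of (6.8) is at most
\[ 1.0004 \left( \frac{2\sqrt{c_0c_+}}{\pi} UV + \frac{3c_+}{2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2UV|\delta|q}{x} \right) \frac{xV}{Q} + \frac{35c_0c_2}{3\pi^2} qV \right) \]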

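by (2.14). Since $q \le Q/V$, we can obviously bound $qV$ by $Q$. As for the second line of (6.8),
\[ \begin{aligned} \sum_{v \le V} \Lambda(v) \min\left( \left\lfloor \frac{x/v}{|\delta|q_v} \right\rfloor + 1, 2U \right) \cdot \frac{1}{2} \log^+\frac{2U}{\left\lfloor \frac{x/v}{|\delta|q_v} \right\rfloor + 1} &\le \sum_{v \le V} \Lambda(v) \max_{t > 0} t \log^+\frac{U}{t} \\ &\le \sum_{v \le V} \Lambda(v) \frac{U}{e} = \frac{1.0004}{e} UV, \end{aligned} \]
but
\[ \begin{aligned} \sum_{v \le V} \Lambda(v) \min\left( \left\lfloor \frac{x/v}{|\delta|q_v} \right\rfloor + 1, 2U \right) &\le \sum_{v \le \frac{x}{2U|\delta|q}} \Lambda(v) \cdot 2U + \sum_{\substack{\frac{x}{2U|\delta|q} < v \le V \\ (v,q)=1}} \Lambda(v) \frac{x/|\delta|}{vq} + \sum_{v \le V} \Lambda(v) + \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v) \frac{x/|\delta|}{v} \left( \frac{1}{q_v} - \frac{1}{q} \right) \\ &\le 1.03883 \frac{x}{|\delta|q} + \frac{x}{|\delta|q} \max\left( \log V - \log\frac{x}{2U|\delta|q} + \log\frac{3}{\sqrt{2}}, 0 \right) + 1.0004V + \frac{x}{|\delta|} \frac{1}{q} \sum_{p | q} (\log p) v_p(q) \\ &\le \frac{x}{|\delta|q} \left( 1.03883 + \log q + \log^+\frac{6UV|\delta|q}{\sqrt{2}x} \right) + 1.0004V \end{aligned} \]
by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (6.9).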

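Let us collect our bounds. If $|\delta| \le 1/2c_2$, then, assuming $Q \le 2UV/e$, we conclude that $|S_{I,2}|$ is at most
\[ \begin{aligned} &\frac{x}{2\phi(q)} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \min\left( \frac{4/5}{\log^+\frac{Q}{4Vq^2}}, 1 \right) \log Vq \\ &+ c_{8,I_2} \frac{x}{q} \left( \frac{UV}{x} \right)^2 \left( 1 + \frac{q}{U} \right)^2 + \frac{c_{10,I}}{2} \left( \frac{UV}{x} q \log V + \frac{U^2V}{x} \log V \right) \end{aligned} \tag{6.12} \]
plus
\[ (c_{4,I_2} + c_{9,I_2})UV + \left( c_{10,I_2} \log\frac{U}{q} + c_{5,I_2} \max\left( \log\frac{c_{11,I}q^2V}{x}, 2 \right) + c_{12,I_2} \right) \cdot Q, \tag{6.13} \]
where
\[ \begin{aligned} c_{4,I_2} &= 3.57565(1 + \epsilon_0) > 1.0004 \cdot 2\sqrt{c_0c_+}/\pi, \\ c_{5,I_2} &= 3.53312 > 1.0004 \cdot c_{8,I}, \\ c_{8,I_2} &= 1.17257 > \frac{c_{10,I}}{4} \cdot 0.5004, \\ c_{9,I_2} &= 0.82214(1 + 2\epsilon_0) > 3c_+ \cdot 1.0004/2ec_2, \\ c_{10,I_2} &= 1.78783\sqrt{1 + 2\epsilon_0} > 1.0004\sqrt{c_0c_+}/\pi, \\ c_{12,I_2} &= 29.3333 + 11.902\epsilon_0 \\ &> 1.0004 \left( \frac{3}{2c_2} c_+ + \frac{2\sqrt{3c_0}}{\pi} \sqrt{c_+} + \frac{55c_0c_2}{6\pi^2} \right) + 1.78783(1 + \epsilon_0) \log 2 \\ &= c_{6,I_2} + c_{10,I_2} \log 2 \end{aligned} \]
and $c_{10,I} = 9.37301$ as before. Here $\epsilon_0 = (4\log 2)/(x/UV)$, and $c_{6,I_2}$ is as in (6.11). If $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.12) plus
\[ \begin{aligned} &(c_{4,I_2} + (1+\epsilon)c_{13,I_2})UV + c_\epsilon \left( c_{14,I_2} \left( \log q + \log^+\frac{6UV|\delta|q}{\sqrt{2}x} \right) + c_{15,I_2} \right) \frac{x}{|\delta|q} \\ &+ c_{16,I_2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2UV|\delta|q}{x} \right) \frac{x}{Q/V} + c_{17,I_2}Q + c_\epsilon \cdot c_{4,I_2}V, \end{aligned} \tag{6.14} \]
where
\[ \begin{aligned} c_{13,I_2} &= 1.31541(1 + \epsilon_0) > \frac{2\sqrt{c_0c_+}}{\pi} \cdot \frac{1.0004}{e}, \\ c_{14,I_2} &= 3.57422\sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0c_+}}{\pi}, \\ c_{15,I_2} &= 3.71301\sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0c_+}}{\pi} \cdot 1.03883, \\ c_{16,I_2} &= 1.5006(1 + 2\epsilon_0) > 1.0004 \cdot 3c_+/2, \\ c_{17,I_2} &= 25.0295 > 1.0004 \cdot 35c_0c_2/3\pi^2, \end{aligned} \]
and $c_\epsilon = (1+\epsilon)\sqrt{3 + 2\epsilon}$. We recall that $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\dots$. We will choose $\epsilon \in (0,1)$ later; we also leave the task of bounding $\epsilon_0$ for later.

The case $q > Q/V$. We use Lemma 4.2.4 in this case.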
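The leading decimal parts of the constants $c_{\cdot,I_2}$ can be recomputed in the same way as those of section 6.2.1. The sketch below sets $\epsilon_0 = 0$ (so that $c_+ = 1$) and compares only the constant terms; this simplification, and the dictionary keys, are assumptions of the example rather than anything done in the text.

```python
import math

c0 = 31.521
c2 = 6 * math.pi / (5 * math.sqrt(c0))
c10I = 9.37301
root = math.sqrt(c0) / math.pi          # sqrt(c0 c_+)/pi with c_+ = 1

computed = {
    "c4I2": 1.0004 * 2 * root,
    "c8I2": c10I / 4 * 0.5004,
    "c9I2": 3 * 1.0004 / (2 * math.e * c2),
    "c10I2": 1.0004 * root,
    "c12I2": 1.0004 * (3 / (2 * c2) + 2 * math.sqrt(3) * root
                       + 55 * c0 * c2 / (6 * math.pi ** 2))
             + 1.78783 * math.log(2),
    "c13I2": 2 * root * 1.0004 / math.e,
    "c14I2": 2 * root,
    "c15I2": 2 * root * 1.03883,
    "c16I2": 1.0004 * 3 / 2,
    "c17I2": 1.0004 * 35 * c0 * c2 / (3 * math.pi ** 2),
}
stated = {"c4I2": 3.57565, "c8I2": 1.17257, "c9I2": 0.82214,
          "c10I2": 1.78783, "c12I2": 29.3333, "c13I2": 1.31541,
          "c14I2": 3.57422, "c15I2": 3.71301, "c16I2": 1.5006,
          "c17I2": 25.0295}

for name, value in stated.items():
    print(f"{name:6s} computed {computed[name]:.6f}   stated {value}")
```

The constant $c_{5,I_2} = 3.53312$ is left out of the comparison on purpose: it is stated only as an upper bound for $1.0004 \cdot c_{8,I}$ and is not close to that product.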

6.2.3 Type II terms

As we showed in (5.1)–(5.5), $S_{II}$ (given in (5.1)) is at most
\[ 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} + 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W}, \tag{6.15} \]
where $S_1$, $S_2$ and $S_3$ are as in (5.4) and (5.5). We bounded $S_1$ in (5.33) and (5.34), $S_2$ in Prop. 5.2.4 and $S_3$ in (5.5).

Let us try to give some structure to the bookkeeping we must now inevitably do. The second integral in (6.15) will be negligible (because $S_3$ is); let us focus on the first integral.

Thanks to our work in §5.1, the term $S_1(U,W)$ is bounded by a (small) constant times $x/W$. (This represents a gain of several factors of $\log$ with respect to the trivial bound.) We bounded $S_2(U,V,W)$ using the large sieve; we expected, and got, a bound that is better than trivial by a factor of size roughly $\sqrt{q}\log x$; the exact factor in the bound depends on the value of $W$. In particular, it is only in the central part of the range for $W$ that we will really be able to save a factor of $\sqrt{q}\log x$, as opposed to just $\sqrt{q}$. We will have to be slightly clever in order to get a good total bound in the end.

We first recall our estimate for $S_1$. In the whole range $[V, x/U]$ for $W$, we know from (5.33), (5.34) and (5.37) that $S_1(U,W)$ is at most
\[ \frac{2}{\pi^2} \frac{x}{W} + \kappa_0\zeta(3/2)^3 \frac{x}{W} \sqrt{\frac{x/WU}{U}}, \tag{6.16} \]
where
\[ \kappa_0 = 1.27. \]
(We recall we are working with $v = 2$.)

We have better estimates for the constant in front in some parts of the range; in what is usually the main part, (5.34) and (5.36) give us a constant of $0.15107$ instead of $2/\pi^2$. Note that $1.27\zeta(3/2)^3 = 22.6417\dots$. We should choose $U$, $V$ so that the first term in (6.16) dominates. For the while being, assume only
\[ U \ge 5 \cdot 10^5 \frac{x}{VU}; \tag{6.17} \]
then (6.16) gives
\[ S_1(U,W) \le \kappa_1 \frac{x}{W}, \tag{6.18} \]
where
\[ \kappa_1 = \frac{2}{\pi^2} + \frac{22.6418}{\sqrt{10^6/2}} \le 0.2347. \]
This will suffice for our cruder estimates.

The second integral in (6.15) is now easy to bound. By (5.5),
\[ S_3(W) \le 1.0171x + 2.0341W \le 1.0172x, \]
since $W \le x/U \le x/5 \cdot 10^5$. Hence
\[ 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W} \le 4 \int_V^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot 1.0172x}\, \frac{dW}{W} \le \kappa_9 \frac{x}{\sqrt{V}}, \]
where
\[ \kappa_9 = 8 \cdot \sqrt{1.0172 \cdot \kappa_1} \le 3.9086. \]

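Let us now examine $S_2$, which was bounded in Prop. 5.2.4. We set the parameters $W'$, $U'$ as follows, in accordance with (5.4):
\[ W' = \max(V, W/2), \qquad U' = \max(U, x/2W). \]
Since $W' \ge W/2$ and $W \ge V > 117$, we can always bound
\[ \sum_{W' < p \le W} (\log p)^2 \le \frac{1}{2} W (\log W) \tag{6.19} \]
by (2.19).

Bounding $S_2$ for $\delta$ arbitrary. We set
\[ W_0 = \min(\max(2\theta q, V), x/U), \]
where $\theta \ge e$ is a parameter that will be set later.

For $V \le W < W_0$, we use the bound (5.53):
\[ \begin{aligned} S_2(U',W',W) &\le \left( \max(1, 2\rho) \left( \frac{x}{8q} + \frac{x}{2W} \right) + \frac{W}{2} + 2q \right) \cdot \frac{1}{2} W (\log W) \\ &\le \max\left( \frac{1}{2}, \rho \right) \left( \frac{W}{8q} + \frac{1}{2} \right) x \log W + \frac{W^2 \log W}{4} + qW \log W, \end{aligned} \]
where $\rho = q/Q$.

If $W_0 > V$, the contribution of the terms with $V \le W < W_0$ to (6.15) is (by (6.18)) bounded by
\[ \begin{aligned} &4 \int_V^{W_0} \sqrt{\kappa_1 \frac{x}{W} \left( \frac{\rho_0}{4} \left( \frac{W}{4q} + 1 \right) x \log W + \frac{W^2 \log W}{4} + qW \log W \right)}\, \frac{dW}{W} \\ &\le \frac{\kappa_2}{2} \sqrt{\rho_0}\, x \int_V^{W_0} \frac{\sqrt{\log W}}{W^{3/2}}\, dW + \frac{\kappa_2}{2} \sqrt{x} \int_V^{W_0} \frac{\sqrt{\log W}}{W^{1/2}}\, dW + \kappa_2 \sqrt{\frac{\rho_0x^2}{16q} + qx} \int_V^{W_0} \frac{\sqrt{\log W}}{W}\, dW \\ &\le \left( \kappa_2 \sqrt{\rho_0} \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{xW_0} \right) \sqrt{\log W_0} + \frac{2\kappa_2}{3} \sqrt{\frac{\rho_0x^2}{16q} + qx} \left( (\log W_0)^{3/2} - (\log V)^{3/2} \right), \end{aligned} \tag{6.20} \]
where $\rho_0 = \max(1, 2\rho)$ and
\[ \kappa_2 = 4\sqrt{\kappa_1} \le 1.93768. \]
(We are using the easy bound $\sqrt{a + b + c} \le \sqrt{a} + \sqrt{b} + \sqrt{c}$.)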

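We now examine the terms with $W \ge W_0$. If $2\theta q > x/U$, then $W_0 = x/U$, the contribution of the case is nil, and the computations below can be ignored. Thus, we can assume that $2\theta q \le x/U$.

We use (5.54):
\[ S_2(U',W',W) \le \left( \frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \right) \cdot \frac{1}{2} W \log W. \]
By $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$, we can take out the $q/\phi(q) \cdot W/\log(W/2q)$ term and estimate its contribution on its own; it is at most
\[ \begin{aligned} 4 \int_{W_0}^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot \frac{q}{\phi(q)} \cdot \frac{1}{2} W^2 \frac{\log W}{\log W/2q}}\, \frac{dW}{W} &= \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{x \log W}{W \log W/2q}}\, dW \\ &\le \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{qx}{\phi(q)}} \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW. \end{aligned} \tag{6.21} \]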

Now
\[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \sqrt{\frac{\log 2q}{\log W/2q}}\, dW = \sqrt{2q \log 2q} \int_{\max(\theta, V/2q)}^{x/2Uq} \frac{1}{\sqrt{t \log t}}\, dt. \]
We bound this last integral somewhat crudely: for $T \ge e$,
\[ \int_e^T \frac{1}{\sqrt{t \log t}}\, dt \le 2.3 \sqrt{\frac{T}{\log T}}. \tag{6.22} \]
(This is shown as follows: since
\[ \frac{1}{\sqrt{T \log T}} < \left( 2.3 \sqrt{\frac{T}{\log T}} \right)' \]
if and only if $T > T_0$, where $T_0 = e^{(1 - 2/2.3)^{-1}} = 2135.94\dots$, it is enough to check (numerically) that (6.22) holds for $T = T_0$.) Since $\theta \ge e$, this gives us that
\[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW \le 2\sqrt{\frac{x}{U}} + 2.3 \sqrt{2q \log 2q} \cdot \sqrt{\frac{x/2Uq}{\log x/2Uq}}, \]
and so (6.21) is at most
\[ \sqrt{2}\kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}}. \]
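The numerical check of (6.22) at $T = T_0$ can be sketched as follows. The substitution $t = e^s$, which turns the integral into that of $e^{s/2}/\sqrt{s}$ over $[1, \log T]$, and the midpoint rule with a fixed number of steps are implementation choices made here, not part of the text.

```python
import math

C = 2.3
s0 = 1.0 / (1.0 - 2.0 / C)         # log T_0
T0 = math.exp(s0)

def integral_e_to_T(T, n=200_000):
    """Integral of 1/sqrt(t log t) over [e, T], via t = e^s and midpoints."""
    s_hi = math.log(T)
    h = (s_hi - 1.0) / n
    total = 0.0
    for i in range(n):
        s = 1.0 + (i + 0.5) * h
        total += math.exp(0.5 * s) / math.sqrt(s)
    return total * h

left = integral_e_to_T(T0)
right = C * math.sqrt(T0 / math.log(T0))
print(T0, left, right)
```

Since the difference between the two sides of (6.22) decreases up to $T_0$ and increases afterwards, a positive gap at $T_0$ is what the argument needs; the gap is below one percent of either side, which is consistent with the text calling the choice of $2.3$ only somewhat crude.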

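We are left with what will usually be the main term, viz.,
\[ 4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \left( \frac{x}{8\phi(q)} \frac{\log W}{\log W/2q} \right) W}\, \frac{dW}{W}, \tag{6.23} \]
which, by (5.34), is at most $x/\sqrt{\phi(q)}$ times the integral of
\[ \frac{1}{W} \sqrt{\left( 2H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log W/2q}} \]
for $W$ going from $W_0$ to $x/U$, where $H_2$ is as in (5.35) and
\[ \kappa_4 = 4\kappa_0\zeta(3/2)^3 \le 90.5671. \]
By the arithmetic/geometric mean inequality, the integrand is at most $1/W$ times
\[ \frac{\beta + \beta^{-1} \cdot 2H_2(x/WU)}{2} + \frac{\beta^{-1}}{2} \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} + \frac{\beta}{2} \frac{\log 2q}{\log W/2q} \tag{6.24} \]
for any $\beta > 0$. We will choose $\beta$ later.

The first summand in (6.24) gives what we can think of as the main or worst term in the whole paper; let us compute it first. The integral is
\[ \int_{W_0}^{x/U} \frac{\beta + \beta^{-1} \cdot 2H_2(x/WU)}{2} \frac{dW}{W} = \int_1^{x/UW_0} \frac{\beta + \beta^{-1} \cdot 2H_2(s)}{2} \frac{ds}{s} \le \left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \log\frac{x}{UW_0} \tag{6.25} \]
by (5.36), where
\[ \kappa_6 = 0.60428. \]
Thus the main term is simply
\[ \left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \frac{x}{\sqrt{\phi(q)}} \log\frac{x}{UW_0}. \tag{6.26} \]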

The integral of the second summand is at most
\[ \beta^{-1} \cdot \frac{\kappa_4}{4} \frac{\sqrt{x}}{U} \int_V^{x/U} \frac{dW}{W^{3/2}} \le \beta^{-1} \cdot \frac{\kappa_4}{2} \sqrt{\frac{x/UV}{U}}. \]
By (6.17), this is at most
\[ \frac{\beta^{-1}}{\sqrt{2}} \cdot 10^{-3} \cdot \kappa_4 \le \beta^{-1}\kappa_7/2, \]
where
\[ \kappa_7 = \frac{\sqrt{2}\kappa_4}{1000} \le 0.1281. \]
Thus the contribution of the second summand is at most
\[ \frac{\beta^{-1}\kappa_7}{2} \cdot \frac{x}{\sqrt{\phi(q)}}. \]
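The constants $\kappa_0, \kappa_1, \kappa_2, \kappa_4, \kappa_6, \kappa_7, \kappa_9$ introduced so far are linked by simple arithmetic, which can be replayed as below. This is only a sketch: the helper `zeta`, which evaluates $\zeta(3/2)$ by a partial sum with an Euler-Maclaurin tail correction, is an implementation choice of the example, and the comparison with the stated decimals is made to displayed precision.

```python
import math

def zeta(s, N=1000):
    """zeta(s) for real s > 1: partial sum plus Euler-Maclaurin tail terms."""
    partial = sum(n ** -s for n in range(1, N))
    return (partial + N ** (1 - s) / (s - 1) + 0.5 * N ** -s
            + s * N ** (-s - 1) / 12)

z3 = zeta(1.5) ** 3
kappa0 = 1.27
err_const = kappa0 * z3                        # text: 22.6417...
kappa1 = 2 / math.pi ** 2 + 22.6418 / math.sqrt(1e6 / 2)
kappa2 = 4 * math.sqrt(kappa1)
kappa4 = 4 * kappa0 * z3
kappa6 = 4 * 0.15107                           # four times the constant in (5.36)
kappa7 = math.sqrt(2) * kappa4 / 1000
kappa9 = 8 * math.sqrt(1.0172 * kappa1)

for name, value in [("1.27 zeta(3/2)^3", err_const), ("kappa1", kappa1),
                    ("kappa2", kappa2), ("kappa4", kappa4),
                    ("kappa6", kappa6), ("kappa7", kappa7),
                    ("kappa9", kappa9)]:
    print(f"{name:18s} {value:.6f}")
```

Replaying the arithmetic this way also fixes the position of the decimal point in each constant: for instance, $\kappa_1 \le 0.2347$ and $\kappa_2 \le 1.93768$ are only compatible with $1.27\zeta(3/2)^3$ being about $22.64$.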

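The integral of the third summand in (6.24) is
\[ \frac{\beta}{2} \int_{W_0}^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W}. \tag{6.27} \]
If $V < 2\theta q \le x/U$, this is
\[ \frac{\beta}{2} \int_{2\theta q}^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W} = \frac{\beta}{2} \log 2q \cdot \int_\theta^{x/2Uq} \frac{1}{\log t} \frac{dt}{t} = \frac{\beta}{2} \log 2q \cdot \left( \log\log\frac{x}{2Uq} - \log\log\theta \right). \]
If $2\theta q > x/U$, the integral is over an empty range and its contribution is hence 0.

If $2\theta q \le V$, (6.27) is
\[ \begin{aligned} \frac{\beta}{2} \int_V^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W} &= \frac{\beta \log 2q}{2} \int_{V/2q}^{x/2Uq} \frac{1}{\log t} \frac{dt}{t} \\ &= \frac{\beta \log 2q}{2} \cdot \left( \log\log\frac{x}{2Uq} - \log\log V/2q \right) \\ &= \frac{\beta \log 2q}{2} \cdot \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right). \end{aligned} \tag{6.28} \]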

(Let us stop for a moment and ask ourselves when this will be smaller than what we can see as the main term, namely, the term $(\beta/2)\log x/UW_0$ in (6.25). Clearly, $\log(1 + (\log x/UV)/(\log V/2q)) \le (\log x/UV)/(\log V/2q)$, and that is smaller than $(\log x/UV)/\log 2q$ when $V/2q > 2q$. Of course, it does not actually matter if (6.28) is smaller than the term from (6.25) or not, since we are looking for upper bounds here, not for asymptotics.)

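The total bound for (6.23) is thus
\[ \frac{x}{\sqrt{\phi(q)}} \cdot \left( \beta \cdot \left( \frac{1}{2} \log\frac{x}{UW_0} + \frac{\Phi}{2} \right) + \beta^{-1} \left( \frac{1}{4}\kappa_6 \log\frac{x}{UW_0} + \frac{\kappa_7}{2} \right) \right), \tag{6.29} \]
where
\[ \Phi = \begin{cases} \log 2q \left( \log\log\frac{x}{2Uq} - \log\log\theta \right) & \text{if } V/2\theta < q < x/(2\theta U), \\ \log 2q \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) & \text{if } q \le V/2\theta. \end{cases} \tag{6.30} \]
Choosing $\beta$ optimally, we obtain that (6.23) is at most
\[ \frac{x}{\sqrt{2\phi(q)}} \sqrt{\left( \log\frac{x}{UW_0} + \Phi \right) \left( \kappa_6 \log\frac{x}{UW_0} + 2\kappa_7 \right)}, \tag{6.31} \]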

where $\Phi$ is as in (6.30).

Bounding $S_2$ for $|\delta| \ge 8$. Let us see how much a non-zero $\delta$ can help us. It makes sense to apply (5.56) only when $|\delta| \ge 8$; otherwise (5.54) is almost certainly better. Now, by definition, $|\delta|/x \le 1/qQ$, and so $|\delta| \ge 8$ can happen only when $q \le x/8Q$.

With this in mind, let us apply (5.56), assuming $|\delta| > 8$. Note first that
\[ \frac{x}{|\delta q|} \left( q + \frac{x}{4W} \right)^{-1} \ge \frac{1/|\delta q|}{\frac{q}{x} + \frac{1}{4W}} \ge \frac{4/|\delta q|}{\frac{1}{2Q} + \frac{1}{W}} \ge \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{W}{2Q}} \ge \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}}. \]
This is at least $2\min(2Q, W)/|\delta q|$. Thus we are allowed to apply (5.56) when $|\delta q| \le 2\min(2Q, W)$. Since $Q \ge x/U$, we know that $\min(2Q, W) = W$ for all $W \le x/U$, and so it is enough to assume that $|\delta q| \le 2W$. We will soon be making a stronger assumption.

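Recalling also (6.19), we see that (5.56) gives us
\[ S_2(U',W',W) \le \min\left( 1, \frac{2q/\phi(q)}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \right) \left( \frac{x}{|\delta q|} + \frac{W}{2} \right) \cdot \frac{1}{2} W (\log W). \tag{6.32} \]
Similarly to before, we define $W_0 = \max(V, \theta|\delta q|)$, where $\theta \ge 3e^2/8$ will be set later. (Here $\theta \ge 3e^2/8$ is an assumption we do not yet need, but we will be using it soon to simplify matters slightly.) For $W \ge W_0$, we certainly have $|\delta q| \le 2W$. Hence the part of the first term of (6.15) coming from the range $W_0 \le W < x/U$ is
\[ \begin{aligned} &4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} \\ &\le 4 \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \frac{\log W}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \left( \frac{Wx}{|\delta q|} + \frac{W^2}{2} \right)}\, \frac{dW}{W}. \end{aligned} \tag{6.33} \]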

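By (5.34), the contribution of the term $Wx/|\delta q|$ to (6.33) is at most
\[ \frac{4x}{\sqrt{|\delta|\phi(q)}} \int_{W_0}^{x/U} \sqrt{\left( H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{4} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)}}\, \frac{dW}{W}. \]
Note that $1 + (x/U)/2Q \le 3/2$. Proceeding as in (6.23)–(6.31), we obtain that this is at most
\[ \frac{2x}{\sqrt{|\delta|\phi(q)}} \sqrt{\left( \log\frac{x}{UW_0} + \Phi \right) \left( \kappa_6 \log\frac{x}{UW_0} + 2\kappa_7 \right)}, \]
where
\[ \Phi = \begin{cases} \log\frac{(1+\epsilon_1)|\delta q|}{4} \log\left( 1 + \frac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}} \right) & \text{if } |\delta q| \le V/\theta, \\ \log\frac{3|\delta q|}{8} \left( \log\log\frac{8x}{3U|\delta q|} - \log\log\frac{8\theta}{3} \right) & \text{if } V/\theta < |\delta q| \le x/\theta U, \end{cases} \tag{6.34} \]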

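where $\epsilon_1 = x/2UQ$. This is what we think of as the main term.

By (6.18), the contribution of the term $W^2/2$ to (6.33) is at most
\[4\sqrt{\frac{q}{\phi(q)}}\int_{W_0}^{x/U}\sqrt{\frac{\kappa_1}{2}x}\,\frac{dW}{\sqrt{W}}\cdot\max_{W_0\le W\le\frac{x}{U}}\sqrt{\frac{\log W}{\log\frac{8W}{3|\delta q|}}}. \tag{6.35}\]
Since $t \mapsto (\log t)/(\log t/c)$ is decreasing for $t > c$, (6.35) is at most
\[4\sqrt{2\kappa_1}\sqrt{\frac{q}{\phi(q)}}\left(\frac{x}{\sqrt{U}}-\sqrt{xW_0}\right)\sqrt{\frac{\log W_0}{\log\frac{8W_0}{3|\delta q|}}}. \tag{6.36}\]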

If $W_0 > V$, we also have to consider the range $V \le W < W_0$. By Prop. 5.2.4 and (6.19), the part of (6.15) coming from this is
\[4\int_V^{\theta|\delta q|}\sqrt{S_1(U,W)\cdot(\log W)\left(\frac{Wx}{2|\delta q|}+\frac{W^2}{4}+\frac{Wx}{16(1-\rho)Q}+\frac{x}{8(1-\rho)}\right)}\,\frac{dW}{W}.\]

The contribution of $W^2/4$ is at most
\[4\int_V^{W_0}\sqrt{\kappa_1\frac{x}{W}\log W\cdot\frac{W^2}{4}}\,\frac{dW}{W} \le 4\sqrt{\kappa_1}\cdot\sqrt{xW_0}\cdot\sqrt{\log W_0};\]
the sum of this and (6.36) is at most
\[4\sqrt{\kappa_1}\left(\sqrt{\frac{2q}{\phi(q)}}\left(\frac{x}{\sqrt{U}}-\sqrt{xW_0}\right)\sqrt{\frac{\log W_0}{\log 8\theta/3}}+\sqrt{xW_0}\sqrt{\log W_0}\right) \le \kappa_2\cdot\sqrt{\frac{q}{\phi(q)}}\,\frac{x}{\sqrt{U}}\sqrt{\log W_0},\]
where we use the facts that $W_0 = \theta|\delta q|$ (by $W_0 > V$) and $\theta \ge 3e^2/8$ (so that $\log 8\theta/3 \ge 2$), and where we recall that $\kappa_2 = 4\sqrt{\kappa_1}$.

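The terms $Wx/2|\delta|q$ and $Wx/(16(1-\rho)Q)$ contribute at most
\[\begin{aligned}4\sqrt{\kappa_1}\int_V^{\theta|\delta q|}\sqrt{\frac{x}{W}\cdot(\log W)W\left(\frac{x}{2|\delta q|}+\frac{x}{16(1-\rho)Q}\right)}\,\frac{dW}{W} &\le \kappa_2 x\left(\frac{1}{\sqrt{2|\delta|q}}+\frac{1}{4\sqrt{(1-\rho)Q}}\right)\int_V^{\theta|\delta q|}\sqrt{\log W}\,\frac{dW}{W}\\ &= \frac{2\kappa_2}{3}x\left(\frac{1}{\sqrt{2|\delta|q}}+\frac{1}{4\sqrt{(1-\rho)Q}}\right)\left((\log\theta|\delta|q)^{3/2}-(\log V)^{3/2}\right).\end{aligned}\]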

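The term $x/8(1-\rho)$ contributes
\[\sqrt{2\kappa_1}\,x\int_V^{\theta|\delta q|}\sqrt{\frac{\log W}{W(1-\rho)}}\,\frac{dW}{W} \le \frac{\sqrt{2\kappa_1}\,x}{\sqrt{1-\rho}}\int_V^\infty\frac{\sqrt{\log W}}{W^{3/2}}\,dW \le \frac{\kappa_2 x}{\sqrt{2(1-\rho)V}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right),\]
where we use the estimate
\[\begin{aligned}\int_V^\infty\frac{\sqrt{\log W}}{W^{3/2}}\,dW &= \frac{1}{\sqrt{V}}\int_1^\infty\frac{\sqrt{\log u+\log V}}{u^{3/2}}\,du\\ &\le \frac{1}{\sqrt{V}}\int_1^\infty\frac{\sqrt{\log V}}{u^{3/2}}\,du+\frac{1}{\sqrt{V}}\int_1^\infty\frac{1}{2\sqrt{\log V}}\frac{\log u}{u^{3/2}}\,du\\ &= 2\frac{\sqrt{\log V}}{\sqrt{V}}+\frac{1}{2\sqrt{V\log V}}\cdot 4 \le \frac{2}{\sqrt{V}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right).\end{aligned}\]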
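The tail estimate above can be checked numerically. The following is only a rough sketch: the function names are hypothetical, the midpoint rule is not rigorous (it is not the interval arithmetic of §2.6), and the sample value $V = 2\cdot 10^6$ is simply the lower bound on $V$ assumed in this chapter.

```python
import math

def tail_integral(V, n=100_000):
    """Midpoint-rule value of the integral of sqrt(log W) / W^(3/2) over [V, inf).

    The substitution W = V / v^2 turns the integral into
    (2 / sqrt(V)) * int_0^1 sqrt(log V - 2 log v) dv.
    """
    log_V = math.log(V)
    h = 1.0 / n
    total = sum(math.sqrt(log_V - 2.0 * math.log((k + 0.5) * h)) for k in range(n))
    return 2.0 * h * total / math.sqrt(V)

def tail_bound(V):
    """Right-hand side of the estimate: (2/sqrt V)(sqrt(log V) + sqrt(1/log V))."""
    log_V = math.log(V)
    return 2.0 / math.sqrt(V) * (math.sqrt(log_V) + math.sqrt(1.0 / log_V))

if __name__ == "__main__":
    V = 2.0e6
    print(tail_integral(V), tail_bound(V))
```

The numerical value should fall between the trivial lower bound $2\sqrt{\log V}/\sqrt{V}$ and the stated upper bound.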

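It is time to collect all type II terms. Let us start with the case of general $\delta$. We will set $\theta \ge e$ later. If $q \le V/2\theta$, then $|S_{II}|$ is at most
\[\begin{aligned}&\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\left(\log\frac{x}{UV}+\log 2q\,\log\left(1+\frac{\log x/UV}{\log V/2q}\right)\right)\left(\kappa_6\log\frac{x}{UV}+2\kappa_7\right)}\\ &\quad+\sqrt{2}\kappa_2\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}+\kappa_9\frac{x}{\sqrt{V}}.\end{aligned} \tag{6.37}\]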

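If $V/2\theta < q \le x/2\theta U$, then $|S_{II}|$ is at most
\[\begin{aligned}&\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\left(\log\frac{x}{U\cdot 2\theta q}+\log 2q\,\log\frac{\log x/2Uq}{\log\theta}\right)\left(\kappa_6\log\frac{x}{U\cdot 2\theta q}+2\kappa_7\right)}\\ &\quad+\sqrt{2}\kappa_2\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}+\left(\kappa_2\sqrt{\log 2\theta q}+\kappa_9\right)\frac{x}{\sqrt{V}}\\ &\quad+\frac{\kappa_2}{6}\left((\log 2\theta q)^{3/2}-(\log V)^{3/2}\right)\frac{x}{\sqrt{q}}\\ &\quad+\kappa_2\left(\sqrt{2\theta\cdot\log 2\theta q}+\frac{2}{3}\left((\log 2\theta q)^{3/2}-(\log V)^{3/2}\right)\right)\sqrt{qx},\end{aligned} \tag{6.38}\]
where we use the fact that $Q \ge x/U$ (implying that $\rho_0 = \max(1, 2q/Q)$ equals 1 for $q \le x/2U$). Finally, if $q > x/2\theta U$,
\[\begin{aligned}|S_{II}| &\le \left(\kappa_2\sqrt{2\log x/U}+\kappa_9\right)\frac{x}{\sqrt{V}}+\kappa_2\sqrt{\log x/U}\,\frac{x}{\sqrt{U}}\\ &\quad+\frac{2\kappa_2}{3}\left((\log x/U)^{3/2}-(\log V)^{3/2}\right)\left(\frac{x}{2\sqrt{2q}}+\sqrt{qx}\right).\end{aligned} \tag{6.39}\]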

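Now let us examine the alternative bounds for $|\delta| \ge 8$. Here we assume $\theta \ge 3e^2/8$. If $|\delta q| \le V/\theta$, then $|S_{II}|$ is at most
\[\begin{aligned}&\frac{2x}{\sqrt{|\delta|\phi(q)}}\sqrt{\log\frac{x}{UV}+\log\frac{|\delta q|(1+\epsilon_1)}{4}\,\log\left(1+\frac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}}\right)}\cdot\sqrt{\kappa_6\log\frac{x}{UV}+2\kappa_7}\\ &\quad+\kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log 2V/|\delta q|}}\cdot\frac{x}{\sqrt{U}}+\kappa_9\frac{x}{\sqrt{V}},\end{aligned} \tag{6.40}\]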

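where $\epsilon_1 = x/2UQ$. If $V/\theta < |\delta|q \le x/\theta U$, then $|S_{II}|$ is at most
\[\begin{aligned}&\frac{2x}{\sqrt{|\delta|\phi(q)}}\sqrt{\left(\log\frac{x}{U\cdot\theta|\delta|q}+\log\frac{3|\delta q|}{8}\,\log\frac{\log\frac{8x}{3U|\delta q|}}{\log 8\theta/3}\right)\left(\kappa_6\log\frac{x}{U\cdot\theta|\delta q|}+2\kappa_7\right)}\\ &\quad+\frac{2\kappa_2}{3}\left(\frac{x}{\sqrt{2|\delta q|}}+\frac{x}{4\sqrt{Q-q}}\right)\left((\log\theta|\delta q|)^{3/2}-(\log V)^{3/2}\right)\\ &\quad+\left(\frac{\kappa_2}{\sqrt{2(1-\rho)}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right)+\kappa_9\right)\frac{x}{\sqrt{V}}+\kappa_2\sqrt{\frac{q}{\phi(q)}}\cdot\sqrt{\log\theta|\delta q|}\cdot\frac{x}{\sqrt{U}},\end{aligned} \tag{6.41}\]
where $\rho = q/Q$. Note that $|\delta| \le x/Qq$ implies $\rho \le x/Q^2$, and so $\rho$ will be very small and $Q - q$ will be very close to $Q$.

The case $|\delta q| > x/\theta U$ will not arise in practice, essentially because of $|\delta|q \le x/Q$.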

6.3 Adjusting parameters. Calculations

We must bound the exponential sum $\sum_n \Lambda(n)e(\alpha n)\eta(n/x)$. By (3.8), it is enough to sum the bounds we obtained in §6.2. We will now see how it will be best to set $U$, $V$ and other parameters.

Usually, the largest terms will be
\[C_0 UV, \tag{6.42}\]
where $C_0$ equals
\[\begin{cases}c_{4,I_2}+c_{9,I_2} = 4.39779+5.21993\epsilon_0 & \text{if } |\delta| \le 1/2c_2 \sim 0.74463,\\ c_{4,I_2}+(1+\epsilon)c_{13,I_2} = (4.89106+1.31541\epsilon)(1+\epsilon_0) & \text{if } |\delta| > 1/2c_2\end{cases} \tag{6.43}\]
(from (6.13) and (6.14), type I; we will specify $\epsilon$ and $\epsilon_0 = (4\log 2)/(x/UV)$ later) and
\[\frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{\log\frac{x}{UV}+(\log\delta_0(1+\epsilon_1)q)\log\left(1+\frac{\log x/UV}{\log\frac{V}{\delta_0(1+\epsilon_1)q}}\right)}\cdot\sqrt{\kappa_6\log\frac{x}{UV}+2\kappa_7} \tag{6.44}\]
(from (6.37) and (6.40), type II; here $\delta_0 = \max(2,|\delta|/4)$, while $\epsilon_1 = x/2UQ$ for $|\delta| > 8$ and $\epsilon_1 = 0$ for $|\delta| < 8$).

We set $UV = \kappa x/\sqrt{q\delta_0}$; we must choose $\kappa > 0$.

Let us first optimize (or, rather, almost optimize) $\kappa$ in the case $|\delta| \le 4$, so that $\delta_0 = 2$ and $\epsilon_1 = 0$. For the purpose of choosing $\kappa$, we replace $\sqrt{\phi(q)}$ by $\sqrt{q}/C_1$, where $C_1 = 2.3536 \sim \sqrt{510510/\phi(510510)}$, and also replace $V$ by $q^2/c$, $c$ a constant. We use the approximation
\[\log\left(1+\frac{\log x/UV}{\log V/|2q|}\right) = \log\left(1+\frac{\log(\sqrt{2q}/\kappa)}{\log(q/2c)}\right) = \log\left(\frac{3}{2}+\frac{\log 2\sqrt{c}/\kappa}{\log q/2c}\right) \sim \log\frac{3}{2}+\frac{2\log 2\sqrt{c}/\kappa}{3\log q/2c}.\]

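What we must minimize, then, is
\[\begin{aligned}&\frac{C_0\kappa}{\sqrt{2q}}+\frac{C_1}{\sqrt{2q}}\sqrt{\left(\log\frac{\sqrt{2q}}{\kappa}+\log 2q\left(\log\frac{3}{2}+\frac{2\log\frac{2\sqrt{c}}{\kappa}}{3\log\frac{q}{2c}}\right)\right)\left(\kappa_6\log\frac{\sqrt{2q}}{\kappa}+2\kappa_7\right)}\\ &\le \frac{C_0\kappa}{\sqrt{2q}}+\frac{C_1}{2\sqrt{q}}\frac{\sqrt{\kappa_6}}{\sqrt{\kappa_1'}}\sqrt{\kappa_1'\log q-\left(\frac{5}{3}+\frac{2}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa+\kappa_2'}\cdot\sqrt{\kappa_1'\log q-2\kappa_1'\log\kappa+\frac{4\kappa_1'\kappa_7}{\kappa_6}+\kappa_1'\log 2}\\ &\le \frac{C_0}{\sqrt{2q}}\left(\kappa+\kappa_4'\left(\kappa_1'\log q-\left(\left(\frac{5}{6}+\kappa_1'\right)+\frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa+\kappa_3'\right)\right),\end{aligned} \tag{6.45}\]
where
\[\begin{aligned}\kappa_1' &= \frac{1}{2}+\log\frac{3}{2},\qquad \kappa_2' = \log\sqrt{2}+\log 2\,\log\frac{3}{2}+\frac{\log 4c\,\log 2q}{3\log\frac{q}{2c}},\\ \kappa_3' &= \frac{1}{2}\left(\kappa_2'+\frac{4\kappa_1'\kappa_7}{\kappa_6}+\kappa_1'\log 2\right) = \frac{\log 4c}{6}+\frac{(\log 4c)^2}{6\log\frac{q}{2c}}+\kappa_5',\\ \kappa_4' &= \frac{C_1}{C_0}\sqrt{\frac{\kappa_6}{2\kappa_1'}} \sim \begin{cases}\frac{0.30915}{1+1.18694\epsilon_0} & \text{if } |\delta| \le 4,\\ \frac{0.27797}{(1+0.26894\epsilon)(1+\epsilon_0)} & \text{if } |\delta| > 4,\end{cases}\\ \kappa_5' &= \frac{1}{2}\left(\log\sqrt{2}+\log 2\,\log\frac{3}{2}+\frac{4\kappa_1'\kappa_7}{\kappa_6}+\kappa_1'\log 2\right) \sim 1.01152.\end{aligned}\]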

Taking derivatives, we see that the minimum is attained when
\[\kappa = \left(\frac{5}{6}+\kappa_1'+\frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\kappa_4' \sim \left(1.7388+\frac{\log 4c}{3\log\frac{q}{2c}}\right)\cdot\frac{0.30915}{1+1.19\epsilon_0}, \tag{6.46}\]
provided that $|\delta| \le 4$. (What we obtain for $|\delta| > 4$ is essentially the same, only with $\delta_0 q = \delta q/4$ instead of $2q$ and $0.27797/((1+0.27\epsilon)(1+\epsilon_0))$ in place of $0.30915$.) For $q = 5\cdot 10^5$, $c = 2.5$ and $|\delta| \le 4$ (typical values in the most delicate range), we get that $\kappa$ should be about $0.5582/(1+1.19\epsilon_0)$. Values of $q$, $c$ nearby give similar values for $\kappa$, whether for $|\delta| \le 4$ or for $|\delta| > 4$.

(Incidentally, at this point, we could already give a back-of-the-envelope estimate for the last line of (6.45), i.e., our main term. It suggests that choosing $w = 1$ instead of $w = 2$ would have given bounds worse by about 15 percent.)

We make the choices
\[\kappa = 1/2,\quad\text{and so}\quad UV = \frac{x}{2\sqrt{q\delta_0}},\]
for the sake of simplicity. (Unsurprisingly, (6.45) changes very slowly around its minimum.) Note, by the way, that this means that $\epsilon_0 = (2\log 2)/\sqrt{q\delta_0}$.
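The numerical constants entering the choice of $\kappa$ can be recomputed in a few lines. This is only a sketch under stated assumptions: it takes $\epsilon_0 = 0$ (so $C_0 = 4.39779$), and the values used for $\kappa_6$ and $\kappa_7$ are inferred from the expression $0.30214\log 4q\delta_0 + 0.2562$ appearing later in (6.60) (read as $\kappa_6/2 = 0.30214$, $2\kappa_7 = 0.2562$) rather than taken from their definitions. All function and variable names are hypothetical.

```python
import math

C0 = 4.39779          # c_{4,I_2} + c_{9,I_2} with eps_0 = 0 (from (6.43))
KAPPA6 = 2 * 0.30214  # assumption: inferred from (6.60)
KAPPA7 = 0.2562 / 2   # assumption: inferred from (6.60)

def euler_phi(n):
    """Euler's totient function by trial division."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

C1 = math.sqrt(510510 / euler_phi(510510))
KAPPA1P = 0.5 + math.log(1.5)                        # kappa'_1
KAPPA4P = (C1 / C0) * math.sqrt(KAPPA6 / (2 * KAPPA1P))  # kappa'_4, case |delta| <= 4
KAPPA5P = 0.5 * (math.log(math.sqrt(2)) + math.log(2) * math.log(1.5)
                 + 4 * KAPPA1P * KAPPA7 / KAPPA6 + KAPPA1P * math.log(2))

def optimal_kappa(q, c):
    """Minimizer of kappa - kappa'_4 * A * log(kappa), as in (6.46) with eps_0 = 0."""
    a = 5.0 / 6.0 + KAPPA1P + math.log(4 * c) / (3 * math.log(q / (2 * c)))
    return a * KAPPA4P

if __name__ == "__main__":
    print(C1, KAPPA4P, KAPPA5P, optimal_kappa(5e5, 2.5))
```

Under these assumptions the output should agree with the rounded values 2.3536, 0.30915, 1.01152 and 0.5582 quoted in the text, which is what motivates the simpler choice $\kappa = 1/2$.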

Now we must decide how to choose $U$, $V$ and $Q$, given our choice of $UV$. We will actually make two sets of choices.

First, we will use the $S_{I,2}$ estimates for $q \le Q/V$ to treat all $\alpha$ of the form $\alpha = a/q + O^*(1/qQ)$, $q \le y$. (Here $y$ is a parameter satisfying $y \le Q/V$.)

Then, the remaining $\alpha$ will get treated with the (coarser) $S_{I,2}$ estimate for $q > Q/V$, with $Q$ reset to a lower value (call it $Q'$). If $\alpha$ was not treated in the first go (so that it must be dealt with by the coarser estimate), then $\alpha = a'/q' + \delta'/x$, where either $q' > y$ or $\delta' q' > x/Q$. (Otherwise, $\alpha = a'/q' + O^*(1/q'Q)$ would be a valid estimate with $q' \le y$.) The value of $Q'$ is set to be smaller than $Q$ both because this is helpful (it diminishes error terms that would be large for large $q$) and because this is harmless (since we are no longer assuming that $q \le Q/V$).

6.3.1 First choice of parameters: $q \le y$

The largest items affected strongly by our choices at this point are
\[\begin{aligned}&c_{16,I_2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{x}{Q/V}+c_{17,I_2}Q && (\text{from } S_{I,2},\ |\delta| > 1/2c_2),\\ &\left(c_{10,I_2}\log\frac{U}{q}+2c_{5,I_2}+c_{12,I_2}\right)Q && (\text{from } S_{I,2},\ |\delta| \le 1/2c_2),\end{aligned} \tag{6.47}\]
and
\[\kappa_2\sqrt{\frac{2q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}+\kappa_9\frac{x}{\sqrt{V}}\qquad(\text{from } S_{II},\ \text{any } |\delta|), \tag{6.48}\]
with
\[\kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log 2V/|\delta q|}}\cdot\frac{x}{\sqrt{U}}\qquad(\text{from } S_{II})\]
as an alternative to (6.48) for $|\delta| \ge 8$. (In several of these expressions, we are applying some minor simplifications that our later choices will justify. Of course, even if these simplifications were not justified, we would not be getting incorrect results, only potentially suboptimal ones: we are trying to decide how to choose certain parameters.)

In addition, we have a relatively mild but important dependence on $V$ in the main term (6.44), even when we hold $UV$ constant (as we do, in so far as we have already chosen $UV$). We must also respect the condition $q \le Q/V$, the lower bound on $U$ given by (6.17) and the assumptions made at the beginning of the chapter (e.g. $Q \ge x/U$, $V \ge 2\cdot 10^6$). Recall that $UV = x/2\sqrt{q\delta_0}$.

We set
\[Q = \frac{x}{8y},\]
since we will then have not just $q \le y$ but also $q|\delta| \le x/Q = 8y$, and so $q\delta_0 \le 2y$. We want $q \le Q/V$ to be true whenever $q \le y$; this means that
\[q \le \frac{Q}{V} = \frac{QU}{UV} = \frac{QU}{x/2\sqrt{q\delta_0}} = \frac{U\sqrt{q\delta_0}}{4y}\]
must be true when $q \le y$, and so it is enough to set $U = 4y^2/\sqrt{q\delta_0}$. The following choices make sense: we will work with the parameters
\[\begin{aligned}y &= \frac{x^{1/3}}{6},\qquad Q = \frac{x}{8y} = \frac{3}{4}x^{2/3},\qquad x/UV = 2\sqrt{q\delta_0} \le 2\sqrt{2y},\\ U &= \frac{4y^2}{\sqrt{q\delta_0}} = \frac{x^{2/3}}{9\sqrt{q\delta_0}},\qquad V = \frac{x}{(x/UV)\cdot U} = \frac{x}{8y^2} = \frac{9x^{1/3}}{2},\end{aligned} \tag{6.49}\]
where, as before, $\delta_0 = \max(2,|\delta|/4)$. So, for instance, we obtain $\epsilon_1 \le x/2UQ = 6\sqrt{q\delta_0}/x^{1/3} \le 2\sqrt{3}/x^{1/6}$. Assuming
\[x \ge 2.16\cdot 10^{20}, \tag{6.50}\]
we obtain that $U/(x/UV) \ge (x^{2/3}/9\sqrt{q\delta_0})/(2\sqrt{q\delta_0}) = x^{2/3}/18q\delta_0 \ge x^{1/3}/6 \ge 10^6$, and so (6.17) holds. We also get that $\epsilon_1 \le 0.002$.

Since $V = x/8y^2 = (9/2)x^{1/3}$, (6.50) also implies that $V \ge 2\cdot 10^6$ (in fact, $V \ge 27\cdot 10^6$). It is easy to check that
\[V < x/4,\qquad UV \le x,\qquad Q \ge \max(16, 2\sqrt{x}),\qquad Q \ge \max(2U, x/U), \tag{6.51}\]
as stated at the beginning of the chapter. Let $\theta = (3/2)^3 = 27/8$. Then
\[\begin{aligned}\frac{V}{2\theta q} &= \frac{x/8y^2}{2\theta q} \ge \frac{x}{16\theta y^3} = \frac{x}{54y^3} = 4 > 1,\\ \frac{V}{\theta|\delta q|} &\ge \frac{x/8y^2}{8\theta y} \ge \frac{x}{64\theta y^3} = \frac{x}{216y^3} = 1.\end{aligned} \tag{6.52}\]
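The relations among the parameters in (6.49)–(6.52) are easy to confirm in floating point. The sketch below uses hypothetical names and evaluates everything at the extreme case $x = 2.16\cdot 10^{20}$, $q\delta_0 = 2y$; it is an illustration only, not a substitute for the inequalities proved above.

```python
import math

def first_choice(x, q_delta0):
    """Parameters of (6.49) for a given x and a given value of q * delta_0."""
    cbrt = x ** (1.0 / 3.0)
    y = cbrt / 6.0
    Q = x / (8.0 * y)
    U = 4.0 * y * y / math.sqrt(q_delta0)
    V = x / (8.0 * y * y)
    eps1 = x / (2.0 * U * Q)
    return {"y": y, "Q": Q, "U": U, "V": V, "eps1": eps1, "cbrt": cbrt}

if __name__ == "__main__":
    x = 2.16e20
    p = first_choice(x, 2.0 * (x ** (1.0 / 3.0)) / 6.0)   # q * delta_0 = 2y
    theta = 27.0 / 8.0
    print(p["Q"] / x ** (2.0 / 3.0))        # should be close to 3/4
    print(p["V"] / p["cbrt"])               # should be close to 9/2
    print(p["eps1"])                        # should be at most 0.002
    print(p["V"] / (2 * theta * p["y"]))    # should be close to 4
    print(p["V"] / (8 * theta * p["y"]))    # should be close to 1
```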

The first type I bound is

|SI1| lex

qmin

(1cprime0δ2

)min

45

qφ(q)

log+ x23 9

q52 δ

120

1

(log 9x13

radicqδ0 + c3I

)+c4Iq

φ(q)

+

(c7I log

y

c2+ c8I log x

)y +

c10Ix13

3422q32δ120

(log 9x13radiceqδ0)

+

(c5I log

2x23

9c2radicqδ0

+ c6I logx53

9radicqδ0

)x23

9radicqδ0

+ c9Iradicx log

2radicex

c2+c10I

e

(6.53)
where the constants are as in §6.2.1. For any $c, R \ge 1$, the function
\[x \mapsto (\log cx)/(\log x/R)\]
attains its maximum on $[R',\infty]$, $R' > R$, at $x = R'$. Hence, for $q\delta_0$ fixed,
\[\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}},1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0}+c_{3,I}\right) \tag{6.54}\]
attains its maximum for $x \in [(9e^{4/5}(\delta_0 q)^{5/2}/4)^{3/2},\infty)$ at
\[x = \left(9e^{4/5}(\delta_0 q)^{5/2}/4\right)^{3/2} = (27/8)e^{6/5}(q\delta_0)^{15/4}. \tag{6.55}\]
Now notice that, for smaller values of $x$, (6.54) increases as $x$ increases, since the term $\min(\ldots,1)$ equals the constant 1. Hence, (6.54) attains its maximum for $x \in (0,\infty)$ at (6.55), and so
\[\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}},1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0}+c_{3,I}\right)+c_{4,I} \le \log\frac{27}{2}e^{2/5}(\delta_0 q)^{7/4}+c_{3,I}+c_{4,I} \le \frac{7}{4}\log\delta_0 q+6.11676.\]
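The algebra behind (6.55) and the bound that follows it can be sanity-checked numerically. In this sketch (hypothetical names; $t$ stands for $\delta_0 q$), we verify that at the critical point the argument of $\log^+$ equals $e^{4/5}$, so that the minimum is exactly 1, and that $\log 9x^{1/3}\sqrt{t}$ equals $\log(27/2) + 2/5 + (7/4)\log t$ there.

```python
import math

def x_star(t):
    """Critical point (6.55): (9 e^{4/5} t^{5/2} / 4)^{3/2}, with t = delta_0 * q."""
    return (9.0 * math.exp(0.8) * t ** 2.5 / 4.0) ** 1.5

def log_plus_argument(x, t):
    """The quantity 4 x^{2/3} / (9 t^{5/2}) inside log^+ in (6.54)."""
    return 4.0 * x ** (2.0 / 3.0) / (9.0 * t ** 2.5)

def log_factor(x, t):
    """log(9 x^{1/3} sqrt(t)), the factor multiplying the minimum in (6.54)."""
    return math.log(9.0 * x ** (1.0 / 3.0) * math.sqrt(t))

if __name__ == "__main__":
    for t in (2.0, 100.0, 1.0e4):
        x = x_star(t)
        print(t, log_plus_argument(x, t), log_factor(x, t))
```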

Examining the other terms in (6.53) and using (6.50), we conclude that
\[|S_{I,1}| \le \frac{x}{q}\min\left(1,\frac{c_0'}{\delta^2}\right)\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q+6.11676\right)+\frac{x^{2/3}}{\sqrt{q\delta_0}}(0.67845\log x-1.20818)+0.37864x^{2/3}, \tag{6.56}\]
where we are using (6.50) (and, of course, the trivial bound $\delta_0 q \ge 2$) to simplify the smaller error terms. We recall that $c_0' = 0.798437 > c_0/(2\pi)^2$.

Let us now consider $S_{I,2}$. The terms that appear both for $|\delta|$ small and $|\delta|$ large are given in (6.12). The second line in (6.12) equals

c8I2

(x

4q2δ0+

2UV 2

x+qV 2

x

)+c10I

2

(q

2radicqδ0

+x23

18qδ0

)log

9x13

2

le c8I2(

x

4q2δ0+

9x13

2radic

2+

27

8

)+c10I

2

(y16

232+

x23

18qδ0

)(1

3log x+ log

9

2

)le 029315

x

q2δ0+ (008679 log x+ 039161)

x23

qδ0+ 000153

radicx

where we are using (6.50) to simplify. Now
\[\min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}},1\right)\log Vq = \min\left(\frac{4/5}{\log^+\frac{y}{4q^2}},1\right)\log\frac{9x^{1/3}q}{2} \tag{6.57}\]
can be bounded trivially by $\log(9x^{1/3}q/2) \le (2/3)\log x+\log 3/4$. We can also bound (6.57) as we bounded (6.54) before, namely, by fixing $q$ and finding the maximum for $x$ variable. In this way, we obtain that (6.57) is maximal for $y = 4e^{4/5}q^2$; since, by definition, $x^{1/3}/6 = y$, (6.57) then equals
\[\log\frac{9(6\cdot 4e^{4/5}q^2)q}{2} = 3\log q+\log 108+\frac{4}{5} \le 3\log q+5.48214.\]
We conclude that (6.12) is at most
\[\min\left(1,\frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q+2.74107\right)\frac{x}{\phi(q)}+0.29315\frac{x}{q^2\delta_0}+(0.0434\log x+0.1959)x^{2/3}. \tag{6.58}\]

If $|\delta| \le 1/2c_2$, we must consider (6.13). This is at most

If |δ| le 12c2 we must consider (613) This is at most

(c4I2 + c9I2)x

2radicqδ0

+ (c10I2 logx23

9q32radicδ0

+ 2c5I2 + c12I2) middot 3

4x23

le 21989xradicqδ0

+361818x

qδ0+ (177019 log x+ 292955)x23

where we recall that ε0 = (4 log 2)(xUV ) = (2 log 2)radicqδ0 which can be bounded

crudely byradic

2 log 2 (Thus c10I2 leradic

1 +radic

8 log 2middot178783 lt 354037 and c12I2 le293333 + 11902

radic2 log 2 le 410004)

If |δ| gt 12c2 we must consider (614) instead For ε = 007 that is at most

(c4I2 + (1 + ε)c13I2)x

2radicqδ0

(1 +

2 log 2radicqδ0

)+ (338845

(1 +

2 log 2radicqδ0

)log δq3 + 208823)

x

|δ|q

+

(688133

(1 +

4 log 2radicqδ0

)log |δ|q + 720828

)x23 + 604141x13

= 249157xradicqδ0

(1 +

2 log 2radicqδ0

)+ (338845 log δq3 + 326771)

x

|δ|q

+

(229378 log x+ 190791

log |δ|qradicqδ0

+ 130691

)x

23

le 249157xradicqδ0

+ (359676 log δ0 + 273032 log q + 912218)x

qδ0

+ (229378 log x+ 411228)x23

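where, besides the crude bound $\epsilon_0 \le \sqrt{2}\log 2$, we use the inequalities
\[\begin{aligned}\frac{\log|\delta|q}{\sqrt{q\delta_0}} &\le \frac{\log 4q\delta_0}{\sqrt{q\delta_0}} \le \frac{\log 8}{\sqrt{2}},\qquad \frac{\log q}{\sqrt{q\delta_0}} \le \frac{1}{\sqrt{2}}\frac{\log q}{\sqrt{q}} \le \frac{1}{\sqrt{2}}\frac{\log e^2}{e} = \frac{\sqrt{2}}{e},\\ \frac{1}{|\delta|} &\le \frac{4c_2}{\delta_0},\qquad \frac{\log|\delta|}{|\delta|} \le \frac{2}{e\log 2}\cdot\frac{\log\delta_0}{\delta_0}.\end{aligned}\]
(Obviously, $1/|\delta| \le 4c_2/\delta_0$ is based on the assumption $|\delta| > 1/2c_2$ and on the inequality $16c_2 \ge 1$. The bound on $(\log|\delta|)/|\delta|$ is based on the fact that $(\log t)/t$ reaches its maximum at $t = e$, and $(\log\delta_0)/\delta_0 = (\log 2)/2$ for $|\delta| \le 8$.)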

We sum (6.58) and whichever one of our bounds for (6.13) and (6.14) is greater (namely, the latter). We obtain that, for any $\delta$,
\[\begin{aligned}|S_{I,2}| &\le 2.49157\frac{x}{\sqrt{q\delta_0}}+\min\left(1,\frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q+2.74107\right)\frac{x}{\phi(q)}\\ &\quad+(3.59676\log\delta_0+27.3032\log q+91.515)\frac{x}{q\delta_0}+(22.9812\log x+411.424)x^{2/3},\end{aligned} \tag{6.59}\]
where we bound one of the lower-order terms in (6.58) by $x/q^2\delta_0 \le x/q\delta_0$.

For type II, we have to consider two cases: (a) $|\delta| < 8$, and (b) $|\delta| \ge 8$. Consider first $|\delta| < 8$. Then $\delta_0 = 2$. Recall that $\theta = 27/8$. We have $q \le V/2\theta$ and $|\delta q| \le V/\theta$ thanks to (6.52). We apply (6.37), and obtain that, for $|\delta| < 8$,
\[\begin{aligned}|S_{II}| &\le \frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\frac{1}{2}\log 4q\delta_0+\log 2q\,\log\left(1+\frac{\frac{1}{2}\log 4q\delta_0}{\log\frac{V}{2q}}\right)}\cdot\sqrt{0.30214\log 4q\delta_0+0.2562}\\ &\quad+8.22088\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}}}\right)(q\delta_0)^{1/4}x^{2/3}+1.84251x^{5/6}\\ &\le \frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{C_{x,2q}\log 2q+\frac{\log 8q}{2}}\cdot\sqrt{0.30214\log 2q+0.67506}\\ &\quad+16.406\sqrt{\frac{q}{\phi(q)}}x^{3/4}+1.84251x^{5/6},\end{aligned} \tag{6.60}\]
where we bound
\[\frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}} \le \frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} < \lim_{x\to\infty}\frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} = 2,\]
and where we define
\[C_{x,t} = \log\left(1+\frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right)\]
for $0 < t < 9x^{1/3}/2$. (We have 2.004 here instead of 2 because we want a constant $\ge 2(1+\epsilon_1)$ in later occurrences of $C_{x,t}$, for reasons that will soon become clear.)

For purposes of later comparison, we remark that $16.406 \le 1.57863x^{4/5-3/4}$ for $x \ge 2.16\cdot 10^{20}$.

Consider now case (b), namely, $|\delta| \ge 8$. Then $\delta_0 = |\delta|/4$. By (6.52), $|\delta q| \le V/\theta$.

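Hence, (6.40) gives us that
\[\begin{aligned}|S_{II}| &\le \frac{2x}{\sqrt{|\delta|\phi(q)}}\cdot\sqrt{\frac{1}{2}\log|\delta q|+\log\frac{|\delta q|(1+\epsilon_1)}{4}\,\log\left(1+\frac{\log|\delta|q}{2\log\frac{18x^{1/3}}{|\delta|(1+\epsilon_1)q}}\right)}\cdot\sqrt{0.30214\log|\delta|q+0.2562}\\ &\quad+8.22088\sqrt{\frac{q}{\phi(q)}}\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4}x^{2/3}+1.84251x^{5/6}\\ &\le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}\log\delta_0(1+\epsilon_1)q+\frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q+0.67506}\\ &\quad+1.79926\sqrt{\frac{q}{\phi(q)}}x^{4/5}+1.84251x^{5/6},\end{aligned} \tag{6.61}\]
since
\[8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4} \le 8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{27}{4}}}\cdot(x^{1/3}/3)^{1/4} \le 1.79926x^{4/5-2/3}\]
for $x \ge 2.16\cdot 10^{20}$. Clearly
\[\log\delta_0(1+\epsilon_1)q = \log\delta_0 q+\log(1+\epsilon_1) \le \log\delta_0 q+\epsilon_1.\]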

By Lemma C.2.2, $q/\phi(q) \le z(y) = z(x^{1/3}/6)$ (since $x \ge 18^3$). It is easy to check that $x \mapsto \sqrt{z(x^{1/3}/6)}\,x^{4/5-5/6}$ is decreasing for $x \ge 2.16\cdot 10^{20}$ (in fact, for $x \ge 18^3$). Using (6.50), we conclude that $1.79926\sqrt{q/\phi(q)}\,x^{4/5} \le 0.89657x^{5/6}$ and, by the way, $16.406\sqrt{q/\phi(q)}\,x^{3/4} \le 0.78663x^{5/6}$. This allows us to simplify the last lines of (6.60) and (6.61). We obtain that, for $\delta$ arbitrary,
\[|S_{II}| \le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q+\epsilon_1)+\frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q+0.67506}+2.73908x^{5/6}. \tag{6.62}\]

It is time to sum up $S_{I,1}$, $S_{I,2}$ and $S_{II}$. The main terms come from the first line of (6.62) and the first term of (6.59). Lesser-order terms can be dealt with roughly: we bound $\min(1, c_0'/\delta^2)$ and $\min(1, 4c_0'/\delta^2)$ from above by $2/\delta_0$ (using the fact that $c_0' = 0.798437 < 16$, which implies that $8/\delta > 4c_0'/\delta^2$ for $\delta > 8$; of course, for $\delta \le 8$, we have $\min(1, 4c_0'/\delta^2) \le 1 = 2/2 = 2/\delta_0$).

The terms inversely proportional to $q$, $\phi(q)$ or $q^2$ thus add up to at most
\[\begin{aligned}&\frac{2x}{\delta_0 q}\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q+6.11676\right)+\frac{2x}{\delta_0\phi(q)}\left(\frac{3}{2}\log q+2.74107\right)+(3.59676\log\delta_0+27.3032\log q+91.515)\frac{x}{q\delta_0}\\ &\le \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q+7.81811\right)+\frac{2x}{\delta_0 q}(13.6516\log\delta_0 q+37.5415),\end{aligned}\]
where, for instance, we bound $(3/2)\log q+2.74107$ by $(3/2)\log\delta_0 q+2.74107-(3/2)\log 2$.

As for the other terms: we use the assumption $x \ge 2.16\cdot 10^{20}$ to bound $x^{2/3}$ and $x^{2/3}\log x$ by a small constant times $x^{5/6}$. We bound $x^{2/3}/\sqrt{q\delta_0}$ by $x^{2/3}/\sqrt{2}$ (in (6.56)). We obtain
\[\frac{x^{2/3}}{\sqrt{2}}(0.67845\log x-1.20818)+0.37864x^{2/3}+(22.9812\log x+411.424)x^{2/3}+2.73908x^{5/6} \le 3.35531x^{5/6}.\]
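The last inequality is tight at the threshold $x = 2.16\cdot 10^{20}$, which can be seen by dividing through by $x^{5/6}$. The following sketch (hypothetical function name; plain floating point rather than interval arithmetic) evaluates the resulting ratio; since $(a\log x+b)x^{-1/6}$ is decreasing for large $x$, the value at the threshold is the relevant one.

```python
import math

def lower_order_ratio(x):
    """Left-hand side of the displayed inequality, divided by x^{5/6}."""
    lx = math.log(x)
    coeff = ((0.67845 * lx - 1.20818) / math.sqrt(2.0)   # from (6.56), with q*delta_0 >= 2
             + 0.37864                                    # from (6.56)
             + 22.9812 * lx + 411.424)                    # from (6.59)
    return coeff * x ** (-1.0 / 6.0) + 2.73908            # 2.73908 comes from (6.62)

if __name__ == "__main__":
    for x in (2.16e20, 1e22, 1e25, 1e30):
        print(x, lower_order_ratio(x))
```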

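The sums $S_{0,\infty}$ and $S_{0,w}$ in (3.11) are 0 (by (6.50) and the fact that $\eta_2(t) = 0$ for $t \le 1/4$). We conclude that, for $q \le y = x^{1/3}/6$, $x \ge 2.16\cdot 10^{20}$ and $\eta = \eta_2$ as in (3.4),
\[\begin{aligned}|S_\eta(x,\alpha)| &\le |S_{I,1}|+|S_{I,2}|+|S_{II}|\\ &\le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q+0.002)+\frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q+0.67506}\\ &\quad+\frac{2.49157x}{\sqrt{\delta_0 q}}+\frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q+7.81811\right)+\frac{2x}{\delta_0 q}(13.6516\log\delta_0 q+37.5415)\\ &\quad+3.35531x^{5/6},\end{aligned} \tag{6.63}\]
where
\[\delta_0 = \max(2,|\delta|/4),\qquad C_{x,t} = \log\left(1+\frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right). \tag{6.64}\]
Since $C_{x,t}$ is an increasing function as a function of $t$ (for $x$ fixed and $t \le 9x^{1/3}/2.004$) and $\delta_0 q \le 2y$, we see that $C_{x,t} \le C_{x,2y}$. It is clear that $x \mapsto C_{x,t}$ (fixed $t$) is a decreasing function of $x$. For $x = 2.16\cdot 10^{20}$, $C_{x,2y} = 1.39942\ldots$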

6.3.2 Second choice of parameters

If, with the original choice of parameters, we obtained $q > y = x^{1/3}/6$, we now reset our parameters ($Q$, $U$ and $V$). Recall that, while the value of $q$ may now change (due to the change in $Q$), we will be able to assume that either $q > y$ or $|\delta q| > x/(x/8y) = 8y$.

We want $U/(x/UV) \ge 5\cdot 10^5$ (this is (6.17)). We also want $UV$ small. With this in mind, we let
\[V = \frac{x^{1/3}}{3},\qquad U = 500\sqrt{6}\,x^{1/3},\qquad Q = \frac{x}{U} = \frac{x^{2/3}}{500\sqrt{6}}. \tag{6.65}\]
Then (6.17) holds (as an equality). Since we are assuming (6.50), we have $V \ge 2\cdot 10^6$. It is easy to check that (6.50) also implies that $U \le \sqrt{x}/2$ and $Q \ge 2\sqrt{x}$, and so the inequalities in (6.51) all hold.

Write $2\alpha = a/q+\delta/x$ for the new approximation; we must have either $q > y$ or $|\delta| > 8y/q$, since otherwise $a/q$ would already be a valid approximation under the first choice of parameters. Thus, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta|q > 8y$. Since now $V = 2y$, we have $q > V/2\theta$ in case (a) and $|\delta q| > V/\theta$ in case (b) for any $\theta \ge 1$. We set $\theta = 4$.

(Thanks to this choice of $\theta$, we have $|\delta q| \le x/Q \le x/\theta U$, as we commented at the end of §6.2.3; this will help us avoid some case-work later.)

By (64)

|SI1| lex

qmin

(1cprime0δ2

)(log x23 minus log 500

radic6 + c3I + c4I

q

φ(q)

)+

(c7I log

Q

c2+ c8I log x log c11I

Q2

x

)Q+ c10I

U2

4xlog

e12x23

500radic

6+c10I

e

+

(c5I log

1000radic

6x13

c2+ c6I log 500

radic6x43

)middot 500radic

6x13 + c9Iradicx log

2radicex

c2

le x

qmin

(1cprime0δ2

)(2

3log xminus 499944 + 100303

q

φ(q)

)+

289

1000x23(log x)2

where we are bounding

c7I logQ

c2+ c8I log x log c11I

Q2

x

=c8I(log x)2 minus(c8I(log 1500000minus log c11I)minus

2

3c7I

)log x+ c7I log

1

500radic

6c2

lec8I(log x)2 minus 38 log x

We are also using the assumption (650) repeatedly in order to show that the sum ofall lower-order terms is less than (38c8I log x)(500

radic6) Note that c8I(log x)2Q le

000289x23(log x)2We have qφ(q) le z(Q) (where z is as in (C19)) and since Q gt

radic6 middot 12 middot 109

for x ge 216 middot 1020

100303z(Q) le 100303

(eγ log logQ+

250637

log logradic

6 middot 12 middot 109

)le 02359 logQ+ 079 lt 01573 log x

(It is possible to give a much better estimation, but it is not worthwhile, since this will be a very minor term.) We have either $q > y$ or $q|\delta| > 8y$; if $q|\delta| > 8y$ but $q \le y$, then $|\delta| \ge 8$, and so $c_0'/\delta^2 q < 1/8|\delta|q < 1/64y < 1/y$. Hence

|SI1| lex

y

((2

3+ 01573

)log x

)+ 000289x23(log x)2

le 24719x23 log x+ 000289x23(log x)2

We bound |SI2| using Lemma 424 First we bound (450) this is at most

x

2qmin

(1

4cprime0δ2

)log

x13q

3

+ c0

(1

4minus 1

π2

) (UV )2 log x13

3

2x+

3c42

500radic

6

9+

(500radic

6x13 + 1)2x13 log x

23

6x

where c4 = 103884 We bound the second line of this using (650) As for the firstline we have either q ge y (and so the first line is at most (x2y)(log x13y3)) orq lt y and 4cprime0δ

2q lt 116y lt 1y (and so the same bound applies) Hence (450) isat most

3x23

(2

3log xminus log 18

)+ 002017x23 log x = 202017x23 log xminus3(log 18)x23

Now we bound (451) which comes up when |δ| le 12c2 where c2 = 6π5radicc0

c0 = 31521 (and so c2 = 06714769 ) Since 12c2 lt 8 it follows that q gt y (thealternative q le y q|δ| gt 8y is impossible since it implies |δ| gt 8) Then (451) is atmost

2radicc0c1π

(UV log

UVradice

+Q

(radic3 log

c2x

Q+

logUV

2log

UV

Q2

))+

3c12

x

ylogUV log

UV

c2xy+

16 log 2

πQ log

c0e3Q2

4π middot 8 log 2 middot xlog

Q

2

+3c1

2radic

2c2

radicx log

c2x

2+

25c04π2

(3c2)12radicx log x

(666)

where $c_1 = 1.000189 > 1+(8\log 2)/(2x/UV)$.

The first line of (6.66) is a linear combination of terms of the form $x^{2/3}\log Cx$, $C > 1$; using (6.50), we obtain that it is at most $1144.693x^{2/3}\log x$. (The main contribution comes from the first term.) Similarly, we can bound the first term in the second line by $33.0536x^{2/3}\log x$. Since $\log(c_0e^3Q^2/(4\pi\cdot 8\log 2\cdot x))\log Q/2$ is at most $\log x^{1/3}\log x^{2/3}$, the second term in the second line is at most $0.0006406x^{2/3}(\log x)^2$. The third line of (6.66) can be bounded easily by $0.0122x^{2/3}\log x$.

Hence, (6.66) is at most
\[1177.76x^{2/3}\log x+0.0006406x^{2/3}(\log x)^2.\]

If |δ| gt 12c2 then we know that |δq| gt min(y2c2 8y) = y2c2 Thus (452)(with ε = 001) is at most

2radicc0c1π

UV logUVradice

+202radicc0c1

π

(x

y2c2+ 1

)((radic

302minus 1) log

xy2c2

+ 1radic

2+

1

2logUV log

e2UVx

y2c2

)

+

(3c12

(1

2+

303

016log x

)+

20c03π2

(2c2)32

)radicx log x

Again by (6.50), and in much the same way as before, this simplifies to
\[\le (1144.66+15.107+68.523)x^{2/3}\log x+29.136x^{1/2}(\log x)^2 \le 1228.85x^{2/3}(\log x).\]
Hence, in total and for any $|\delta|$,
\[\begin{aligned}|S_{I,2}| &\le 2.02017x^{2/3}\log x+1228.85x^{2/3}(\log x)+0.0006406x^{2/3}(\log x)^2\\ &\le 1230.9x^{2/3}(\log x)+0.0006406x^{2/3}(\log x)^2.\end{aligned}\]

Now we must estimate $S_{II}$. As we said before, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta|q > 8y$. Recall that $\theta = 4$. In case (a), we have $q > x^{1/3}/6 = V/2 > V/2\theta$; thus, we can use (6.38), and obtain that, if $q \le x/8U$, $|S_{II}|$ is at most

xradicz(q)radic2q

radic(log

x

U middot 8q+ log 2q log

log x(2Uq)

log 4

)(κ6 log

x

U middot 8q+ 2κ7

)

+radic

2κ2

radicz( x

8U

)(1 + 115

radiclog x4U

log 4

)xradicU

+ (κ2

radiclog xU + κ9)

xradicV

+κ2

6

((log 8y)32 minus (log 2y)32

) xradicy

+ κ2

(radic8 log xU +

2

3((log xU)32 minus (log V )32)

)xradic8U

(6.67)
where $z$ is as in (C.19). (We are already simplifying the third line; the bound given is justified by a derivative test.) It is easy to check that $q \mapsto (\log 2q)(\log\log q)/q$ is decreasing for $q \ge y$ (indeed for $q \ge 9$), and so the first line of (6.67) is maximal for $q = y$.

We can thus bound (667) by x56 timesradic3z(et36)

(t

3minus log 8c+

(t

3minus log 3

)log

t3 minus log 2c

log 4

)(κ6

3tminus 4214

)+

radic2κ2radic6c

radicz(e2t3

48c

)1 + 115

radic23 tminus log 24c

log 4

+

(κ2

radic2t

3minus log 6c+ κ9

)radic

3

+κ2radic

6

((t

3+ log

8

6

) 32

minus(t

3+ log

2

6

) 32

)

+κ2radic48c

(radic8

(2t

3minus log 6c

)+

2

3

((2t

3minus log 6c

) 32

minus(t

3minus log 3

) 32

))(668)

where $t = \log x$ and $c = 500\sqrt{6}$. Asymptotically, the largest term in (6.67) comes from the last line (of order $t^{3/2}$), even if the first line is larger in practice (while being of order at most $t\log t$). Let us bound (6.68) by a multiple of $t^{3/2}$.

First of all, notice that

d

dt

z(et3

6

)log t

=

(eγ log

(t3 minus log 6

)+ 250637

log( t3minuslog 6)

)primelog t

minusz(et3

6

)t(log t)2

=eγ minus 250637

log2( t3minuslog 6)

(tminus 3 log 6) log tminuseγ + 250637

log2( t3minuslog 6)

t log tmiddot

log(t3 minus log 6

)log t

(669)

which for t ge 100 is

gteγ log 3minus 2middot250637 log t

log2( t3minuslog 6)

t(log t)2ge

195671minus 892482log t

t(log t)2gt 0

Similarly for t ge 2000

d

dt

z(e2t3

48c

)log t

gteγ log 3

2 minus250637 log t

log2( 2t3 minuslog 48c)

minus 250637

log( 2t3 minuslog 48c)

t(log t)2

ge072216minus 545234

log t

t(log t)2gt 0

Thus

z(et3

6

)le (log t) middot lim

srarrinfin

z(es3

6

)log s

= eγ log t for t ge 100

z(e2t3

48c

)le (log t) middot lim

srarrinfin

z(e2s3

48c

)log s

= eγ log t for t ge 2000

(670)


Also note that since (x32)prime = (32)radicx((

t

3+ log

8

6

) 32

minus(t

3+ log

2

6

) 32

)le 3

2

radict

3+ log

8

6middot log 4 le 120083

radict

for t ge 2000 We also have(2t

3minus log 6c

) 32

minus(t

3minus log 3

) 32

lt

(2t

3minus log 9

) 32

minus(t

3minus log 3

) 32

= (232 minus 1)

(t

3minus log 3

) 32

lt (232 minus 1)t32

332le 035189t32

Of course

t

3minus log 8c+

(t

3minus log 3

)log

t3 minus log 2c

log 4lt

(t

3+t

3log

t

3

)ltt

3log t

We conclude that for t ge 2000 (668) is at mostradic3 middot eγ log t middot t

3log t middot κ6

3t+

radic2κ2radic6c

radiceγ log t

(1 + 079749

radict)

+

(κ2

radic2

3t12 + κ9

)radic

3 +κ2radic

6middot 12009

radict+

κ2radic48c

(radic16t

3+

2

3middot 035189t32

)le (010181 + 000012 + 000145 + 0000048 + 000462)t32 le 010848t32

On the remaining interval $\log(2.16\cdot 10^{20}) \le t \le 2000$, we use interval arithmetic (as in §2.6, with 30 iterations) to bound the ratio of (6.68) to $t^{3/2}$. We obtain that (6.68) is at most
\[0.275964\,t^{3/2}.\]
Hence, for all $x \ge 2.16\cdot 10^{20}$,
\[|S_{II}| \le 0.275964x^{5/6}(\log x)^{3/2} \tag{6.71}\]

in the case y lt q le x8U If x8U lt q le Q we use (639) In this range x2

radic2q +

radicqx adopts its max-

imum at q = Q (because x2radic

2q for q = x8U is smaller thanradicqx for q = Q by

(665) and (650)) Hence (639) is at most x56 times(κ2

radic2

(2

3tminus log cprime

)+ κ9

)radic

3 + κ2

radic2

3tminus log cprime middot 1radic

cprime

+2κ2

3

((2

3tminus log cprime

) 32

minus(t

3minus log 3

) 32

)( radiccprime

2radic

2eminust6 +

1radiccprime

)


where t = log x (as before) and cprime = 500radic

6 This is at most

(2κ2 +radic

3κ9)radict+

κ2radiccprime

radic2

3

radict+

2κ2

3

232 minus 1

332t32

( radiccprime

2radic

2eminust6 +

1radiccprime

)le 010327

for t ge log(216 middot 1020

) and so

|SII | le 010327x56(log x)32

for x8U lt q le Q using the assumption x ge 216 middot 1020Finally let us treat case (b) that is |δ| gt 8 and |δ|q gt 8y we can also assume

q le y as otherwise we are in case (a) which has already been treated Since |δx| le1qQ we know that

|δq| le x

Q= U = 500

radic6x13 le x23

2000radic

6=

x

4U=

x

θU

again under assumption (650) We apply (641) and obtain that |SII | is at most

2xradicz(y)radic8y

radic(log

x

U middot 4 middot 8y+ log 3y log

log x3Uy

log 323

)(κ6 log

x

U middot 4 middot 8y+ 2κ7

)+

2κ2

3

(xradic16y

((log 32y)32 minus (log 2y)

32 ) +

x4radicQminus y

((log 4U)32 minus (log 2y)

32 )

)+

(κ2radic

2(1minus yQ)

(radiclog V +

radic1 log V

)+ κ9

)xradicV

+ κ2

radicz(y) middot

radiclog 4U middot xradic

U

(672)where we are using the facts that (log 3t8)t is increasing for t ge 8y gt 8e3 and that

d

dt

(log t)32 minus (log V )32

radict

=3(log t)12 minus ((log t)32 minus (log V )32)

2t32

= minuslog t

e3 middotradic

log tminus (log V )32

2t32lt 0

for t ge θ middot 8y = 16V thanks to(log

16V

e3

)2

log 16V gt (log V )3 +

(log 16minus 2 log

e3

16

)(log V )2

+

((log

16

e3

)2

minus 2 loge3

16log 16

)log V gt (log V )3


(valid for log V ge 1) Much as before we can rewrite (672) as x56 times

2radicz(et36)radic

86

radict

3minus log 32c+

(t

3minus log 2

)log

t3 minus log 3c

log 323

middot

radicκ6

(t

3minus log 32c

)+ 2κ7 +

2κ2

3

radic3

8

((t

3+ log

32

6

) 32

minus(t

3minus log 3

) 32

)

+2κ2

3

14radicet3

6c minus16

((t

3+ log 24c

)32

minus(t

3minus log 3

)32)

+κ2

radic3radic

2(1minus c

et3

)(radic

t3minus log 3 +1radic

t3minus log 3

)+ κ9

radic3

+ κ2

radicz(et36)

radict3 + log 24c

6c

(673)where t = log x and c = 500

radic6 For t ge 100 we use (670) to bound z(et36)

and we obtain that (673) is at most

2radiceγradic

86

radic1

3middot κ6

3middot (log t)t+

2κ2

3

radic3

8middot 1

2

(t

3+ log

32

6

)12

middot log 16

+2κ2

3

14radice1003

6c minus 16

middot 1

2

(t

3+ log 24c

)12

middot log 72c

+κ2

radic3radic

2(1minus c

e1003

)(radic

t3 +1radict3

)+ κ9

radic3 + κ2

radiceγ log t

radict3 + log 24c

6c

(6.74)
where we have bounded expressions of the form $a^{3/2}-b^{3/2}$ ($a > b$) by $(a^{1/2}/2)\cdot(a-b)$. The ratio of (6.74) to $t^{3/2}$ is clearly a decreasing function of $t$. For $t = 200$, this ratio is $0.23747\ldots$; hence, (6.74) (and thus (6.73)) is at most $0.23748t^{3/2}$ for $t \ge 200$.

On the range $\log(2.16\cdot 10^{20}) \le t \le 200$, the bisection method (with 25 iterations) gives that the ratio of (6.73) to $t^{3/2}$ is at most $0.23511\ldots$

We conclude that, when $|\delta| > 8$ and $|\delta|q > 8y$,
\[|S_{II}| \le 0.23511x^{5/6}(\log x)^{3/2}.\]
Thus (6.71) gives the worst case.

We now take totals, and obtain
\[\begin{aligned}|S_\eta(x,\alpha)| &\le |S_{I,1}|+|S_{I,2}|+|S_{II}|\\ &\le (2.4719+1230.9)x^{2/3}\log x+(0.00289+0.0006406)x^{2/3}(\log x)^2+0.275964x^{5/6}(\log x)^{3/2}\\ &\le 0.27598x^{5/6}(\log x)^{3/2}+1233.38x^{2/3}\log x,\end{aligned} \tag{6.75}\]
where we use (6.50) yet again.

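6.4 Conclusion

Proof of Theorem 3.1.1. We have shown that $|S_\eta(\alpha,x)|$ is at most (6.63) for $q \le x^{1/3}/6$ and at most (6.75) for $q > x^{1/3}/6$. It remains to simplify (6.63) slightly. By the geometric mean/arithmetic mean inequality,
\[\sqrt{C_{x,\delta_0 q}(\log\delta_0 q+0.002)+\frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q+0.67506} \tag{6.76}\]
is at most
\[\frac{1}{2\sqrt{\rho}}\left(C_{x,\delta_0 q}(\log\delta_0 q+0.002)+\frac{\log 4\delta_0 q}{2}\right)+\frac{\sqrt{\rho}}{2}(0.30214\log\delta_0 q+0.67506)\]
for any $\rho > 0$. We recall that
\[C_{x,t} = \log\left(1+\frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right).\]
Let
\[\rho = \frac{C_{x_1,2q_0}(\log 2q_0+0.002)+\frac{\log 8q_0}{2}}{0.30214\log 2q_0+0.67506} = 3.397962\ldots,\]
where $x_1 = 10^{25}$, $q_0 = 2\cdot 10^5$. (In other words, we are optimizing matters for $x = x_1$, $\delta_0 q = 2q_0$; the losses in nearby ranges will be very slight.) We obtain that (6.76) is at most
\[\begin{aligned}&\frac{C_{x,\delta_0 q}}{2\sqrt{\rho}}(\log\delta_0 q+0.002)+\left(\frac{1}{4\sqrt{\rho}}+\frac{\sqrt{\rho}\cdot 0.30214}{2}\right)\log\delta_0 q+\frac{1}{2}\left(\frac{\log 2}{\sqrt{\rho}}+\frac{\sqrt{\rho}}{2}\cdot 0.67506\right)\\ &\le 0.27125\,C_{x,t}(\log\delta_0 q+0.002)+0.4141\log\delta_0 q+0.49911.\end{aligned} \tag{6.77}\]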

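Now, for $x \ge x_0 = 2.16\cdot 10^{20}$,

\[ \frac{C_{x,t}}{\log t} \le \frac{C_{x_0,t}}{\log t} = \frac{1}{\log t} \log\left( 1 + \frac{\log 4t}{2 \log \frac{54\cdot 10^6}{2.004\, t}} \right) \le 0.08659 \]

for $8 \le t \le 10^6$ (by the bisection method, with 20 iterations), and

\[ \frac{C_{x,t}}{\log t} \le \frac{C_{(6t)^3,t}}{\log t} \le \frac{1}{\log t} \log\left( 1 + \frac{\log 4t}{2 \log \frac{9\cdot 6}{2.004}} \right) \le 0.08659 \]

if $10^6 < t \le x^{1/3}/6$. Hence,

\[ 0.27125 \cdot C_{x,\delta_0 q} \cdot 0.002 \le 0.000047 \log \delta_0 q. \]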
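The bound $0.08659$ can be explored numerically. The sketch below samples $C_{x_0,t}/\log t$ on a grid; it is not the rigorous bisection method with interval arithmetic that the text relies on, only a plain floating-point illustration. The function names and the grid size are assumptions.

```python
import math

X0 = 2.16e20  # x_0 from the text


def C(x, t):
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^{1/3} / (2.004 t))))."""
    return math.log(1 + math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t))))


def max_ratio_first_range(samples=2000):
    """Largest sampled value of C_{x0,t} / log t on a log-spaced grid of 8 <= t <= 10^6."""
    lo, hi = math.log(8), math.log(1e6)
    return max(
        C(X0, math.exp(lo + (hi - lo) * i / samples)) / (lo + (hi - lo) * i / samples)
        for i in range(samples + 1)
    )


def ratio_second_range(t):
    """C_{(6t)^3,t} / log t; here the inner logarithm is the constant log(9*6/2.004)."""
    return math.log(1 + math.log(4 * t) / (2 * math.log(9 * 6 / 2.004))) / math.log(t)
```

On the sampled grid the largest value occurs near $t = 10^6$ and stays just below $0.08659$, which is consistent with the statement above.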

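We conclude that, for $q \le x^{1/3}/6$,

\[
|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.49911}{\sqrt{\phi(q)\delta_0}} \cdot x + \frac{2.492\, x}{\sqrt{q \delta_0}} + \frac{2x}{\delta_0 \phi(q)} \left( \frac{13}{4} \log \delta_0 q + 7.82 \right) + \frac{2x}{\delta_0 q} \left( 13.66 \log \delta_0 q + 37.55 \right) + 3.36\, x^{5/6},
\]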

where

\[ R_{x,t} = 0.27125 \log\left( 1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004\, t}} \right) + 0.41415. \]

Part II

Major arcs

Chapter 7

Major arcs: overview and results

Our task, as in Part I, will be to estimate

\[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{7.1} \]

where $\eta: \mathbb{R}^+ \to \mathbb{C}$ is a smooth function, $\Lambda$ is the von Mangoldt function and $e(t) = e^{2\pi i t}$. Here, we will treat the case of $\alpha$ lying on the major arcs.

We will see how we can obtain good estimates by using smooth functions $\eta$ based on the Gaussian $e^{-t^2/2}$. This will involve proving new, fully explicit bounds for the Mellin transform of the twisted Gaussian, or, what is the same, bounds on parabolic cylinder functions in certain ranges. It will also require explicit formulae that are general and strong enough, even for moderate values of $x$.

Let $\alpha = a/q + \delta/x$. For us, saying that $\alpha$ lies on a major arc will be the same as saying that $q$ and $\delta$ are bounded; more precisely, $q$ will be bounded by a constant $r$ and $|\delta|$ will be bounded by a constant times $r/q$. As is customary on the major arcs, we will express our exponential sum (3.1) as a linear combination of twisted sums

\[ S_{\eta,\chi}(\delta/x, x) = \sum_{n=1}^{\infty} \Lambda(n) \chi(n) e(\delta n/x) \eta(n/x), \tag{7.2} \]

for $\chi: \mathbb{Z} \to \mathbb{C}$ a Dirichlet character mod $q$, i.e., a multiplicative character on $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (The advantage here is that the phase term is now $e(\delta n/x)$ rather than $e(\alpha n)$, and $e(\delta n/x)$ varies very slowly as $n$ grows.) Our task, then, is to estimate $S_{\eta,\chi}(\delta/x, x)$ for $\delta$ small.

Estimates on $S_{\eta,\chi}(\delta/x, x)$ rely on the properties of Dirichlet $L$-functions $L(s, \chi) = \sum_n \chi(n) n^{-s}$. What is crucial is the location of the zeroes of $L(s, \chi)$ in the critical strip $0 \le \Re(s) \le 1$ (a region in which $L(s, \chi)$ can be defined by analytic continuation). In contrast to most previous work, we will not use zero-free regions, which are too narrow for our purposes. Rather, we use a verification of the Generalized Riemann Hypothesis up to bounded height for all conductors $q \le 300000$ (due to D. Platt [Plab]).


A key feature of the present work is that it allows one to mimic a wide variety of smoothing functions by means of estimates on the Mellin transform of a single smoothing function: here, the Gaussian $e^{-t^2/2}$.

7.1 Results

Write $\eta_\heartsuit(t) = e^{-t^2/2}$. Let us first give a bound for exponential sums on the primes using $\eta_\heartsuit$ as the smooth weight. Without loss of generality, we may assume that our character $\chi$ mod $q$ is primitive, i.e., that it is not really a character to a smaller modulus $q' | q$.

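Theorem 7.1.1. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive Dirichlet character mod $q$, $1 \le q \le r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,

\[ \sum_{n=1}^{\infty} \Lambda(n) \chi(n) e\left( \frac{\delta}{x} n \right) e^{-\frac{(n/x)^2}{2}} = I_{q=1} \cdot \widehat{\eta_\heartsuit}(-\delta) \cdot x + E \cdot x, \]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

\[ |E| \le 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}} \left( \frac{650400}{\sqrt{q}} + 112 \right). \]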

We normalize the Fourier transform $\hat{f}$ as follows: $\hat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\, dx$. Of course, $\widehat{\eta_\heartsuit}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$.

As it turns out, smooth weights based on the Gaussian are often better in applications than the Gaussian $\eta_\heartsuit$ itself. Let us give a bound based on $\eta(t) = t^2 \eta_\heartsuit(t)$.

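Theorem 7.1.2. Let $\eta(t) = t^2 e^{-t^2/2}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,

\[ \sum_{n=1}^{\infty} \Lambda(n) \chi(n) e\left( \frac{\delta}{x} n \right) \eta(n/x) = I_{q=1} \cdot \hat{\eta}(-\delta) \cdot x + E \cdot x, \]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

\[ |E| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}} \left( \frac{281200}{\sqrt{q}} + 56 \right). \]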

The advantage of $\eta(t) = t^2 \eta_\heartsuit(t)$ over $\eta_\heartsuit$ is that it vanishes at the origin (to second order); as we shall see, this makes it easier to estimate exponential sums with the smoothing $\eta *_M g$, where $*_M$ is a Mellin convolution and $g$ is nearly arbitrary. Here is a good example that is used crucially in Part III.

Corollary 7.1.3. Let $\eta(t) = t^2 e^{-t^2/2} *_M \eta_2(t)$, where $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2 \cdot I_{[1/2,1]}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,

\[ \sum_{n=1}^{\infty} \Lambda(n) \chi(n) e\left( \frac{\delta}{x} n \right) \eta(n/x) = I_{q=1} \cdot \hat{\eta}(-\delta) \cdot x + E \cdot x, \]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

\[ |E| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}} \left( \frac{381500}{\sqrt{q}} + 76 \right). \]

Let us now look at a different kind of modification of the Gaussian smoothing. Say we would like a weight of a specific shape; for example, what we will need to do in Part III, we would like an approximation to the function

\[ \eta_\circ: t \mapsto \begin{cases} t^3 (2 - t)^3 e^{-(t-1)^2/2} & \text{for } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{7.3} \]

At the same time, what we have is an estimate for the Mellin transform of the Gaussian $e^{-t^2/2}$, centered at $t = 0$.

The route taken here is to work with an approximation $\eta_+$ to $\eta_\circ$. We let

\[ \eta_+(t) = h_H(t) \cdot t e^{-t^2/2}, \tag{7.4} \]

where $h_H$ is a band-limited approximation to

\[ h(t) = \begin{cases} t^2 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{7.5} \]

By band-limited we mean that the restriction of the Mellin transform of $h_H$ to the imaginary axis is of compact support. (We could, alternatively, let $h_H$ be a function whose Fourier transform is of compact support; this would be technically easier in some ways, but it would also lead to using GRH verifications less efficiently.)

To be precise: we define

\[ F_H(t) = \frac{\sin(H \log t)}{\pi \log t}, \qquad h_H(t) = (h *_M F_H)(t) = \int_0^\infty h(t y^{-1}) F_H(y) \frac{dy}{y}, \tag{7.6} \]

where $H$ is a positive constant. It is easy to check that $MF_H(i\tau) = 1$ for $-H < \tau < H$ and $MF_H(i\tau) = 0$ for $\tau > H$ or $\tau < -H$ (unsurprisingly, since $F_H$ is a Dirichlet kernel under a change of variables). Since, in general, the Mellin transform of a multiplicative convolution $f *_M g$ equals $Mf \cdot Mg$, we see that the Mellin transform of $h_H$, on the imaginary axis, equals the truncation of the Mellin transform of $h$ to $[-iH, iH]$. Thus, $h_H$ is a band-limited approximation to $h$, as we desired.

The distinction between the odd and the even case in the statement that follows simply reflects the two different points up to which computations were carried out in [Plab]; these computations were, in turn, to some extent tailored to the needs of the present work (as was the shape of $\eta_+$ itself).

Theorem 7.1.4. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Let $\chi$ be a primitive character mod $q$, where $1 \le q \le 150000$ if $q$ is odd, and $1 \le q \le 300000$ if $q$ is even.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 600000 \cdot \gcd(q, 2)/q$,

\[ \sum_{n=1}^{\infty} \Lambda(n) \chi(n) e\left( \frac{\delta}{x} n \right) \eta(n/x) = I_{q=1} \cdot \hat{\eta}(-\delta) \cdot x + E \cdot x, \]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

\[ |E| \le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \frac{1}{\sqrt{x}} \left( \frac{499900}{\sqrt{q}} + 52 \right). \]

If $q = 1$, we have the sharper bound

\[ |E| \le 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}. \]

This is a paradigmatic example, in that, following the proof given in §9.4, we can bound exponential sums with weights of the form $h_H(t) e^{-t^2/2}$, where $h_H$ is a band-limited approximation to just about any continuous function of our choosing.

Lastly, we will need an explicit estimate of the $\ell_2$ norm corresponding to the sum in Thm. 7.1.4, for the trivial character.

Proposition 7.1.5. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Then

\[
\begin{aligned}
\sum_{n=1}^{\infty} \Lambda(n) (\log n)\, \eta^2(n/x) &= x \cdot \int_0^\infty \eta_+^2(t) \log xt\; dt + E_1 \cdot x \log x \\
&= 0.640206\, x \log x - 0.021095\, x + E_2 \cdot x \log x,
\end{aligned}
\]

where

\[ |E_1| \le 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}}, \qquad |E_2| \le 2\cdot 10^{-6} + \frac{366.91}{\sqrt{x}}. \]

7.2 Main ideas

An explicit formula gives an expression

\[ S_{\eta,\chi}(\delta/x, x) = I_{q=1}\, \hat{\eta}(-\delta)\, x - \sum_\rho F_\delta(\rho) x^\rho + \text{small error}, \tag{7.7} \]

where $I_{q=1} = 1$ if $q = 1$ and $I_{q=1} = 0$ otherwise. Here $\rho$ runs over the complex numbers $\rho$ with $L(\rho, \chi) = 0$ and $0 < \Re(\rho) < 1$ ("non-trivial zeros"). The function $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$ (see §2.4).

The questions are then: where are the non-trivial zeros $\rho$ of $L(s, \chi)$? How fast does $F_\delta(\rho)$ decay as $\Im(\rho) \to \pm\infty$?

Write $\sigma = \Re(s)$, $\tau = \Im(s)$. The belief is, of course, that $\sigma = 1/2$ for every non-trivial zero (Generalized Riemann Hypothesis), but this is far from proven. Most work to date has used zero-free regions of the form $\sigma \le 1 - 1/(C \log q|\tau|)$, $C$ a constant. This is a classical zero-free region, going back, qualitatively, to de la Vallée-Poussin (1899). The best values of $C$ known are due to McCurley [McC84a] and Kadiri [Kad05].

These regions seem too narrow to yield a proof of the three-primes theorem. What we will use instead is a finite verification of GRH "up to $T_q$", i.e., a computation showing that, for every Dirichlet character of conductor $q \le r_0$ ($r_0$ a constant, as above), every non-trivial zero $\rho = \sigma + i\tau$ with $|\tau| \le T_q$ satisfies $\sigma = 1/2$. Such verifications go back to Riemann; modern computer-based methods are descended in part from a paper by Turing [Tur53]. (See the historical article [Boo06b].) In his thesis [Pla11], D. Platt gave a rigorous verification for $r_0 = 10^5$, $T_q = 10^8/q$. In coordination with the present work, he has extended this to

• all odd $q \le 3\cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \le 4\cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5\cdot 10^7/q)$.
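The two ranges above can be summarized as a small lookup. The following is only an illustrative sketch of the heights $T_q$ as stated in the text; the function name `platt_height` and the convention of returning `None` outside the verified range are assumptions made here, not part of [Plab].

```python
def platt_height(q):
    """Height T_q up to which GRH is stated to be verified for conductor q.

    Returns None when q lies outside the two ranges quoted in the text.
    """
    if q < 1:
        raise ValueError("the conductor q must be a positive integer")
    if q % 2 == 1:
        # odd q <= 3 * 10^5: T_q = 10^8 / q
        return 1e8 / q if q <= 3 * 10 ** 5 else None
    # even q <= 4 * 10^5: T_q = max(10^8 / q, 200 + 7.5 * 10^7 / q)
    return max(1e8 / q, 200 + 7.5e7 / q) if q <= 4 * 10 ** 5 else None
```

For example, `platt_height(400000)` gives 387.5, since the second term of the maximum dominates for large even $q$.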

This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).

What remains to discuss, then, is how to choose $\eta$ in such a way that $F_\delta(\rho)$ decreases fast enough as $|\tau|$ increases, so that (7.7) gives a good estimate. We cannot hope for $F_\delta(\rho)$ to start decreasing consistently before $|\tau|$ is at least as large as a constant times $|\delta|$. Since $\delta$ varies within $(-c r_0/q, c r_0/q)$, this explains why $T_q$ is taken inversely proportional to $q$ in the above. As we will work with $r_0 \ge 150000$, we also see that we have little margin for maneuver: we want $F_\delta(\rho)$ to be extremely small already for, say, $|\tau| \ge 80|\delta|$. We also have a Scylla-and-Charybdis situation, courtesy of the uncertainty principle: roughly speaking, $F_\delta(\rho)$ cannot decrease faster than exponentially on $|\tau|/|\delta|$ both for $|\delta| \le 1$ and for $\delta$ large.

The most delicate case is that of $\delta$ large, since then $|\tau|/|\delta|$ is small. It turns out we can manage to get decay that is much faster than exponential for $\delta$ large, while no slower than exponential for $\delta$ small. This we will achieve by working with smoothing functions based on the (one-sided) Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

The Mellin transform of the twisted Gaussian $e(\delta t) e^{-t^2/2}$ is a parabolic cylinder function $U(a, z)$ with $z$ purely imaginary. Since fully explicit estimates for $U(a, z)$, $z$ imaginary, have not been worked out in the literature, we will have to derive them ourselves.

Once we have fully explicit estimates for the Mellin transform of the twisted Gaussian, we are able to use essentially any smoothing function based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$. As we already saw, we can and will consider smoothing functions obtained by convolving the twisted Gaussian with another function and also functions obtained by multiplying the twisted Gaussian with another function. All we need to do is use an explicit formula of the right kind, that is, a formula that does not assume too much about the smoothing function or the region of holomorphy of its Mellin transform, but still gives very good error terms, with simple expressions.

All results here will be based on a single, general explicit formula (Lem. 9.1.1) valid for all our purposes. The contribution of the zeros in the critical strip can be handled in a unified way (Lemmas 9.1.3 and 9.1.4). All that has to be done for each smoothing function is to bound a simple integral (in (9.24)). We then apply a finite verification of GRH and are done.

Chapter 8

The Mellin transform of the twisted Gaussian

Our aim in this chapter is to give fully explicit, yet relatively simple, bounds for the Mellin transform $F_\delta(\rho)$ of $e(\delta t)\eta_\heartsuit(t)$, where $\eta_\heartsuit(t) = e^{-t^2/2}$ and $\delta$ is arbitrary. The rapid decay that results will establish that the Gaussian $\eta_\heartsuit$ is a very good choice for a smoothing, particularly when the smoothing has to be twisted by an additive character $e(\delta t)$.

The Gaussian smoothing has been used before in number theory; see, notably, Heath-Brown's well-known paper on the fourth power moment of the Riemann zeta function [HB79]. What is new here is that we will derive fully explicit bounds on the Mellin transform of the twisted Gaussian. This means that the Gaussian smoothing will be a real option in explicit work on exponential sums in number theory and elsewhere from now on.$^1$

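Theorem 8.0.1. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, $\sigma \ge 0$, $\tau \ne 0$. Let $\ell = -2\pi\delta$. Then, if $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ and $\delta \ne 0$,

\[ |F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot \begin{cases} c_{1,\sigma,\tau} / \tau^{\sigma/2} & \text{for } \rho \text{ arbitrary,} \\ c_{2,\sigma,\tau} / \ell^{\sigma} & \text{for } \rho \le 3/2, \end{cases} \tag{8.1} \]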

$^1$ There has also been work using the Gaussian after a logarithmic change of variables; see, in particular, [Leh66]. In that case, the Mellin transform is simply a Gaussian (as in, e.g., [MV07, Ex. XII.2.9]). However, for $\delta$ non-zero, the Mellin transform of a twist $e(\delta t) e^{-(\log t)^2/2}$ decays very slowly, and thus would not be useful for our purposes, or, in general, for most applications in which GRH is not assumed.


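where $\rho = 4\tau/\ell^2$,

\[
\begin{aligned}
E(\rho) &= \frac{1}{2} \left( \arccos \frac{1}{\upsilon(\rho)} - \frac{2(\upsilon(\rho) - 1)}{\rho} \right), \\
c_{1,\sigma,\tau} &= \frac{1}{2} \left( 1 + 2^{1/4} \left( \frac{2}{1 + \sin^2 \frac{\pi}{8}} \right)^{\sigma/2} + \frac{e^{-\left( \frac{\sqrt{2} - 1}{2} \right) \tau}}{\left( \tan \frac{\pi}{8} \right)^{\sigma}} \right), \\
c_{2,\sigma,\tau} &= \frac{1}{2} \left( 1 + \min\left( 2^{\sigma + \frac{1}{2}},\ \frac{\sqrt{\sec \frac{2\pi}{5}}}{\left( \sin \frac{\pi}{5} \right)^{\sigma}} \right) + \frac{e^{-\tau/6}}{\left( 1/\sqrt{3} \right)^{\sigma}} \right),
\end{aligned}
\tag{8.2}
\]

and

\[ \upsilon(\rho) = \sqrt{\frac{1 + \sqrt{\rho^2 + 1}}{2}}. \]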

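If $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$,

\[ |F_\delta(s)| \le |x_0|^{-\sigma} \cdot e^{-\frac{1}{2}\ell^2}\, |\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|} \cdot \left( \left( 1 + \frac{\pi}{2^{3/2}} \right) e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2} e^{-\pi|\tau|} \right), \tag{8.3} \]

where

\[ |x_0| \ge \begin{cases} 0.51729 \sqrt{\tau} & \text{for } \rho \text{ arbitrary,} \\ 0.84473\, \frac{|\tau|}{|\ell|} & \text{for } \rho \le 3/2. \end{cases} \tag{8.4} \]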

As we shall see, the choice of smoothing function $\eta(t) = e^{-t^2/2}$ can be easily motivated by the method of stationary phase, but the problem is actually solved by the saddle-point method. One of the challenges here is to keep all expressions explicit and practical.

(In particular, the more critical estimate, (8.1), is optimal up to a constant depending on $\sigma$; the constants we give will be good rather than optimal.)

The expressions in Thm. 8.0.1 can be easily simplified further, especially if one is ready to introduce some mild constraints and make some sacrifices in the main term.

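Corollary 8.0.2. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, where $\sigma \in [0, 1]$ and $|\tau| \ge 20$. Then, for $0 \le k \le 2$,

\[ |F_\delta(s + k)| + |F_\delta((1 - s) + k)| \le \begin{cases} \kappa_{k,0} \left( \frac{|\tau|}{|\ell|} \right)^{k} e^{-0.1065 \left( \frac{2|\tau|}{|\ell|} \right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\ \kappa_{k,1}\, |\tau|^{k/2}\, e^{-0.1598 |\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2, \end{cases} \]

where

\[ \kappa_{0,0} \le 3.001, \quad \kappa_{1,0} \le 4.903, \quad \kappa_{2,0} \le 7.96, \]
\[ \kappa_{0,1} \le 3.286, \quad \kappa_{1,1} \le 4.017, \quad \kappa_{2,1} \le 5.13. \]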
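To see what Corollary 8.0.2 gives in practice, one can evaluate its right-hand side directly. The sketch below is an illustration only: it plugs the upper bounds for $\kappa_{k,0}$, $\kappa_{k,1}$ into the two cases, with $\ell = -2\pi\delta$ as in Thm. 8.0.1. The function name is hypothetical, and treating $\delta = 0$ as belonging to the second case is an assumption made here for convenience (since then $4|\tau|/\ell^2$ is infinite).

```python
import math

# Upper bounds for kappa_{k,0} and kappa_{k,1} quoted in Corollary 8.0.2.
KAPPA = {
    (0, 0): 3.001, (1, 0): 4.903, (2, 0): 7.96,
    (0, 1): 3.286, (1, 1): 4.017, (2, 1): 5.13,
}


def corollary_bound(tau, delta, k):
    """Right-hand side of Corollary 8.0.2 for |tau| >= 20 and k in {0, 1, 2}."""
    if k not in (0, 1, 2):
        raise ValueError("k must be 0, 1 or 2")
    if abs(tau) < 20:
        raise ValueError("the corollary assumes |tau| >= 20")
    ell = -2 * math.pi * delta
    t = abs(tau)
    if ell != 0 and 4 * t / ell ** 2 < 1.5:
        r = t / abs(ell)
        return KAPPA[(k, 0)] * r ** k * math.exp(-0.1065 * (2 * r) ** 2)
    return KAPPA[(k, 1)] * t ** (k / 2) * math.exp(-0.1598 * t)
```

For instance, `corollary_bound(100, 10, 1)` falls in the first case (here $4|\tau|/\ell^2 \approx 0.1$), while `corollary_bound(100, 0, 0)` falls in the second.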

We are considering $F_\delta(s + k)$, and not just $F_\delta(s)$, because bounding $F_\delta(s + k)$ enables us to work with smoothing functions equal to or based on $t^k e^{-t^2/2}$. Clearly, we can easily derive bounds with $k$ arbitrary from Thm. 8.0.1. It is just that we will use $k = 0, 1, 2$ in practice. Corollary 8.0.2 is meant to be applied to cases where $\tau$ is larger than a constant (10, say) times $|\ell|$, and $\sigma$ cannot be bounded away from 1; if either condition fails to hold, it is better to apply Theorem 8.0.1 directly.

Let us end by a remark that may be relevant to applications outside number theory. By (8.9), Thm. 8.0.1 gives us bounds on the parabolic cylinder function $U(a, z)$ for $z$ purely imaginary. (Surprisingly, there seem to have been no fully explicit bounds for this case in the literature.) The bounds are useful when $|\Im(a)|$ is at least somewhat larger than $|\Im(z)|$ (i.e., when $|\tau|$ is large compared to $\ell$). While Thm. 8.0.1 is stated for $\sigma \ge 0$ (i.e., for $\Re(a) \ge -1/2$), extending the result to larger half-planes for $a$ is not hard.

8.1 How to choose a smoothing function

Let us motivate our choice of smoothing function $\eta$. The method of stationary phase ([Olv74, §4.11], [Won01, §II.3]) suggests that the main contribution to the integral

\[ F_\delta(s) = \int_0^\infty e(\delta t)\, \eta(t)\, t^s \frac{dt}{t} \tag{8.5} \]

should come when the phase has derivative 0. The phase part of (8.5) is

\[ e(\delta t)\, t^{\Im(s) i} = e^{(2\pi\delta t + \tau \log t) i} \]

(where we write $s = \sigma + i\tau$); clearly,

\[ (2\pi\delta t + \tau \log t)' = 2\pi\delta + \frac{\tau}{t} = 0 \]

when $t = -\tau/(2\pi\delta)$. This is meaningful when $t \ge 0$, i.e., $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$. The contribution of $t = -\tau/(2\pi\delta)$ to (8.5) is then

\[ \eta(t)\, e(\delta t)\, t^{s-1} = \eta\left( \frac{-\tau}{2\pi\delta} \right) e^{-i\tau} \left( \frac{-\tau}{2\pi\delta} \right)^{\sigma + i\tau - 1} \tag{8.6} \]

multiplied by a "width" approximately equal to a constant divided by

\[ \sqrt{|(2\pi\delta t + \tau \log t)''|} = \sqrt{|-\tau/t^2|} = \frac{2\pi|\delta|}{\sqrt{|\tau|}}. \]

The absolute value of (8.6) is

\[ \eta\left( -\frac{\tau}{2\pi\delta} \right) \cdot \left| \frac{-\tau}{2\pi\delta} \right|^{\sigma - 1}. \tag{8.7} \]

In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small, asking that $F_\delta(\sigma + i\tau)$ decay rapidly as $|\tau| \to \infty$ amounts to asking that $\eta(t)$ decay rapidly as $t \to \infty$. Thus, if we ask for $F_\delta(\sigma + i\tau)$ to decay rapidly as $|\tau| \to \infty$ for all moderate $\delta$, we are requesting that

1. $\eta(t)$ decay rapidly as $t \to \infty$,

2. the Mellin transform $F_0(\sigma + i\tau)$ decay rapidly as $\tau \to \pm\infty$.

Requirement (2) is there because we also need to consider $F_\delta(\sigma + i\tau)$ for $\delta$ very small, and, in particular, for $\delta = 0$.

There is clearly an uncertainty-principle issue here; one cannot do arbitrarily well in both aspects at the same time. Once we are conscious of this, the choice $\eta(t) = e^{-t}$ in Hardy-Littlewood actually looks fairly good: obviously, $\eta(t) = e^{-t}$ decays exponentially, and its Mellin transform $\Gamma(\sigma + i\tau)$ also decays exponentially as $\tau \to \pm\infty$. Moreover, for this choice of $\eta$, the Mellin transform $F_\delta(s)$ can be written explicitly: $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$.

It is not hard to work out an explicit formula$^2$ for $\eta(t) = e^{-t}$. However, it is not hard to see that, for $F_\delta(s)$ as above, $F_\delta(1/2 + it)$ decays like $e^{-t/2\pi|\delta|}$, just as we expected from (8.7). This is a little too slow for our purposes: we will often have to work with relatively large $\delta$, and we would like to have to check the zeroes of $L$-functions only up to relatively low heights $t$, say, up to $50|\delta|$. Then $e^{-t/2\pi|\delta|} > e^{-8} = 0.00033\ldots$, which is not very small. We will settle for a different choice of $\eta$: the Gaussian.

The decay of the Gaussian smoothing function $\eta(t) = e^{-t^2/2}$ is much faster than exponential. Its Mellin transform is $\Gamma(s/2)$, which decays exponentially as $\Im(s) \to \pm\infty$. Moreover, the Mellin transform $F_\delta(s)$ ($\delta \ne 0$), while not an elementary or very commonly occurring function, equals (after a change of variables) a relatively well-studied special function, namely, a parabolic cylinder function $U(a, z)$ (or, in Whittaker's [Whi03] notation, $D_{-a-1/2}(z)$).

For $\delta$ not too small, the main term will indeed work out to be proportional to $e^{-(\tau/2\pi\delta)^2/2}$, as the method of stationary phase indicated. This is, of course, much better than $e^{-\tau/2\pi|\delta|}$. The "cost" is that the Mellin transform $\Gamma(s/2)$ for $\delta = 0$ now decays like $e^{-(\pi/4)|\tau|}$ rather than $e^{-(\pi/2)|\tau|}$. This we can certainly afford.
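The gain can be made concrete at the height $|\tau| = 50|\delta|$ mentioned above. The following is a small numerical sketch comparing the two decay factors from the stationary-phase heuristic; the function names are hypothetical, and the factors are heuristic main terms, not the rigorous bounds of Thm. 8.0.1.

```python
import math


def exp_weight_decay(tau, delta):
    """e^{-|tau| / (2 pi |delta|)}: heuristic decay for eta(t) = e^{-t}."""
    return math.exp(-abs(tau) / (2 * math.pi * abs(delta)))


def gaussian_weight_decay(tau, delta):
    """e^{-(tau / (2 pi delta))^2 / 2}: heuristic decay for eta(t) = e^{-t^2/2}."""
    return math.exp(-0.5 * (tau / (2 * math.pi * delta)) ** 2)


if __name__ == "__main__":
    delta = 1000.0
    tau = 50 * delta
    print(exp_weight_decay(tau, delta))       # slightly above e^{-8}
    print(gaussian_weight_decay(tau, delta))  # many orders of magnitude smaller
```

Both factors depend only on the ratio $\tau/\delta$, so the same comparison holds for any $\delta \ne 0$ with $|\tau| = 50|\delta|$.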

8.2 The twisted Gaussian: overview and setup

8.2.1 Relation to the existing literature

We wish to approximate the Mellin transform

\[ F_\delta(s) = \int_0^\infty e^{-t^2/2}\, e(\delta t)\, t^s \frac{dt}{t}, \tag{8.8} \]

where $\delta \in \mathbb{R}$. The parabolic cylinder function $U: \mathbb{C}^2 \to \mathbb{C}$ is given by

\[ U(a, z) = \frac{e^{-z^2/4}}{\Gamma\left( \frac{1}{2} + a \right)} \int_0^\infty t^{a - \frac{1}{2}}\, e^{-\frac{1}{2} t^2 - zt}\, dt \]

$^2$ There may be a minor gap in the literature in this respect. The explicit formula given in [HL22, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms were fully worked out for $q = 1$ by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of $\mathrm{hyp}_{\kappa,q}(z)$ there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL22] seems to neglect the effect of the zeros $\rho$ with $\Re(\rho) = 0$, $\Im(\rho) \ne 0$ for $\chi$ non-primitive. (The author thanks R. C. Vaughan for this information and the references.)

for $\Re(a) > -1/2$; the function can be extended to all $a, z \in \mathbb{C}$ either by analytic continuation or by other integral representations ([AS64, §19.5], [Tem10, §12.5(i)]). Hence

\[ F_\delta(s) = e^{(\pi i \delta)^2}\, \Gamma(s)\, U\left( s - \frac{1}{2},\ -2\pi i \delta \right). \tag{8.9} \]

The second argument of $U$ is purely imaginary; it would be otherwise if a Gaussian of non-zero mean were chosen.

Let us briefly discuss the state of knowledge up to date on Mellin transforms of "twisted" Gaussian smoothings, that is, $e^{-t^2/2}$ multiplied by an additive character $e(\delta t)$. As we have just seen, these Mellin transforms are precisely the parabolic cylinder functions $U(a, z)$.

The function $U(a, z)$ has been well-studied for $a$ and $z$ real; see, e.g., [Tem10]. Less attention has been paid to the more general case of $a$ and $z$ complex. The most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61], [Olv65]; he gave asymptotic series for $U(a, z)$, $a, z \in \mathbb{C}$. These were asymptotic series in the sense of Poincaré, and thus not in general convergent; they would solve our problem if and only if they came with error term bounds. Unfortunately, it would seem that all fully explicit error terms in the literature are either for $a$ and $z$ real, or for $a$ and $z$ outside our range of interest (see both Olver's work and [TV03]). The bounds in [Olv61] involve non-explicit constants. Thus, we will have to find expressions with explicit error bounds ourselves. Our case is that of $a$ in the critical strip, $z$ purely imaginary.

8.2.2 General approach

We will use the saddle-point method (see, e.g., [dB81, §5], [Olv74, §4.7], [Won01, §II.4]) to obtain bounds with an optimal leading-order term and small error terms. (We used the stationary-phase method solely as an exploratory tool.)

What do we expect to obtain? Both the asymptotic expressions in [Olv59] and the bounds in [Olv61] make clear that, if the sign of $\tau = \Im(s)$ is different from that of $\delta$, there will be a change in behavior when $\tau$ gets to be of size about $(2\pi\delta)^2$. This is unsurprising, given our discussion using stationary phase: for $|\Im(a)|$ smaller than a constant times $|\Im(z)|^2$, the term proportional to $e^{-(\pi/4)|\tau|} = e^{-|\Im(a)|/2}$ should be dominant, whereas for $|\Im(a)|$ much larger than a constant times $|\Im(z)|^2$, the term proportional to $e^{-\frac{1}{2} \left( \frac{\tau}{2\pi\delta} \right)^2}$ should be dominant.

There is one important difference between the approach we will follow here and that in [Hela]. In [Hela], the integral (8.8) was estimated by a direct application of the saddle-point method. Here, following a suggestion of N. Temme, we will use the identity

\[ U(a, z) = \frac{e^{\frac{1}{4} z^2}}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{-zu + \frac{u^2}{2}}\, u^{-a - \frac{1}{2}}\, du \tag{8.10} \]

(see, e.g., [OLBC10, (12.5.6)]; $c > 0$ is arbitrary). Together, (8.9) and (8.10) give us that

\[ F_\delta(s) = \frac{e^{-2\pi^2\delta^2}\, \Gamma(s)}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{2\pi i \delta u + \frac{u^2}{2}}\, u^{-s}\, du. \tag{8.11} \]

Estimating the integral in (8.11) turns out to be a somewhat cleaner task than estimating (8.8). The overall procedure, however, is in essence the same in both cases.

We write

\[ \phi(u) = -\frac{u^2}{2} - (2\pi i \delta) u + i\tau \log u \tag{8.12} \]

for $u$ real or complex, so that the integral in (8.11) equals

\[ I(s) = \int_{c - i\infty}^{c + i\infty} e^{-\phi(u)}\, u^{-\sigma}\, du. \tag{8.13} \]

We wish to find a saddle point. A saddle point is a point $u$ at which $\phi'(u) = 0$. This means that

\[ -u - 2\pi i \delta + \frac{i\tau}{u} = 0, \quad \text{i.e.,} \quad u^2 - i\ell u - i\tau = 0, \tag{8.14} \]

where $\ell = -2\pi\delta$. The solutions to $\phi'(u) = 0$ are thus

\[ u_0 = \frac{i\ell \pm \sqrt{-\ell^2 + 4i\tau}}{2}. \tag{8.15} \]

The value of $\phi(u)$ at $u_0$ is

\[
\begin{aligned}
\phi(u_0) &= -\frac{i\ell u_0 + i\tau}{2} + i\ell u_0 + i\tau \log u_0 \\
&= \frac{i\ell}{2} u_0 + i\tau \log \frac{u_0}{\sqrt{e}}.
\end{aligned}
\tag{8.16}
\]

The second derivative at $u_0$ is

\[ \phi''(u_0) = -\frac{1}{u_0^2} \left( u_0^2 + i\tau \right) = -\frac{1}{u_0^2} \left( i\ell u_0 + 2i\tau \right). \tag{8.17} \]

Assign the names $u_{0,+}$, $u_{0,-}$ to the roots in (8.15) according to the sign in front of the square-root (where the square-root is defined so as to have its argument in the interval $(-\pi/2, \pi/2]$). We will actually have to pay attention just to $u_{0,+}$, since, unlike $u_{0,-}$, it lies on the right half of the plane, where our contour of integration also lies. We remark that

\[ u_{0,+} = \frac{i\ell + |\ell| \sqrt{-1 + \frac{4i\tau}{\ell^2}}}{2} = \frac{\ell}{2} \left( i \pm \sqrt{-1 + \frac{4\tau}{\ell^2} i} \right), \tag{8.18} \]

where the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. If $\ell = 0$, then $u_{0,+} = (1/\sqrt{2} + i/\sqrt{2}) \sqrt{\tau}$.

We can assume without loss of generality that $\tau \ge 0$. We will find it convenient to assume $\tau > 0$, since we can deal with $\tau = 0$ simply by letting $\tau \to 0^+$.

8.3 The saddle point

8.3.1 The coordinates of the saddle point

We should start by determining $u_{0,+}$ explicitly, both in rectangular and polar coordinates. For one thing, we will need to estimate the integrand in (8.13) for $u = u_{0,+}$. The absolute value of the integrand is then $\left| e^{-\phi(u_{0,+})} u_{0,+}^{-\sigma} \right| = |u_{0,+}|^{-\sigma} e^{-\Re\phi(u_{0,+})}$, and, by (8.16),

\[ \Re\phi(u_{0,+}) = -\frac{\ell}{2} \Im(u_{0,+}) - \arg(u_{0,+})\, \tau. \tag{8.19} \]

If $\ell = 0$, we already know that $\Re(u_{0,+}) = \Im(u_{0,+}) = \sqrt{\tau/2}$, $|u_{0,+}| = \sqrt{\tau}$ and $\arg u_{0,+} = \pi/4$. Assume from now on that $\ell \ne 0$.

We will use the expression for $u_{0,+}$ in (8.18). Solving a quadratic equation, we see that

\[ \sqrt{-1 + \frac{4\tau}{\ell^2} i} = \sqrt{\frac{j(\rho) - 1}{2}} + i \sqrt{\frac{j(\rho) + 1}{2}}, \tag{8.20} \]

where $j(\rho) = (1 + \rho^2)^{1/2}$ and $\rho = 4\tau/\ell^2$. Hence

\[ \Re(u_{0,+}) = \pm\frac{\ell}{2} \sqrt{\frac{j(\rho) - 1}{2}}, \qquad \Im(u_{0,+}) = \frac{\ell}{2} \left( 1 \pm \sqrt{\frac{j(\rho) + 1}{2}} \right). \tag{8.21} \]

Here and in what follows, the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. (Notice that $\Re(u_{0,+})$ and $\Im(u_{0,+})$ are always positive, except for $\tau = \ell = 0$, in which case $\Re(u_{0,+}) = \Im(u_{0,+}) = 0$.) By (8.21),

\[
\begin{aligned}
|u_{0,+}| &= \frac{|\ell|}{2} \cdot \left| \sqrt{\frac{-1 + j(\rho)}{2}} + \left( 1 \pm \sqrt{\frac{1 + j(\rho)}{2}} \right) i \right| \\
&= \frac{|\ell|}{2} \sqrt{\frac{-1 + j(\rho)}{2} + \frac{1 + j(\rho)}{2} + 1 \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} \\
&= \frac{|\ell|}{2} \sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} = \frac{|\ell|}{\sqrt{2}} \sqrt{\upsilon(\rho)^2 \pm \upsilon(\rho)},
\end{aligned}
\tag{8.22}
\]

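where $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$. We now compute the argument of $u_{0,+}$:

\[
\begin{aligned}
\arg(u_{0,+}) &= \arg\left( \ell \left( i \pm \sqrt{-1 + i\rho} \right) \right) = \arg\left( \sqrt{\frac{-1 + j(\rho)}{2}} + i \left( \pm 1 + \sqrt{\frac{1 + j(\rho)}{2}} \right) \right) \\
&= \arcsin \frac{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}{\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}}} = \arcsin \frac{\sqrt{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}}{\sqrt{2 \sqrt{\frac{1 + j(\rho)}{2}}}} \\
&= \arcsin \sqrt{\frac{1}{2} \left( 1 \pm \sqrt{\frac{2}{1 + j(\rho)}} \right)} = \frac{\pi}{2} - \frac{1}{2} \arccos\left( \pm\sqrt{\frac{2}{1 + j(\rho)}} \right)
\end{aligned}
\tag{8.23}
\]

(by $\cos(\pi - 2\theta) = -\cos 2\theta = 2\sin^2\theta - 1$). Thus

\[ \arg(u_{0,+}) = \begin{cases} \frac{\pi}{2} - \frac{1}{2} \arccos \frac{1}{\upsilon(\rho)} = \frac{1}{2} \arccos \frac{-1}{\upsilon(\rho)} & \text{if } \ell > 0, \\ \frac{1}{2} \arccos \frac{1}{\upsilon(\rho)} & \text{if } \ell < 0. \end{cases} \tag{8.24} \]

In particular, $\arg(u_{0,+})$ lies in $[0, \pi/2]$, and is close to $\pi/2$ only when $\ell > 0$ and $\rho \to 0^+$. Here and elsewhere, we follow the convention that arcsin and arctan have image in $[-\pi/2, \pi/2]$, whereas arccos has image in $[0, \pi]$.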
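The closed forms (8.21), (8.22) and (8.24) can be compared numerically with the root $u_{0,+}$ of (8.15). The sketch below is a plain floating-point check, assuming $\tau > 0$ and $\ell \ne 0$; it relies on the fact that Python's `cmath.sqrt` returns the principal square root, whose argument lies in $(-\pi/2, \pi/2]$, matching the convention adopted for (8.15). The function names are hypothetical.

```python
import cmath
import math


def saddle_point(tau, ell):
    """u_{0,+}: the root of u^2 - i*ell*u - i*tau = 0 with the + sign in (8.15)."""
    return (1j * ell + cmath.sqrt(-ell ** 2 + 4j * tau)) / 2


def closed_form(tau, ell):
    """(Re u, Im u, |u|, arg u) for u = u_{0,+}, from (8.21), (8.22) and (8.24)."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho ** 2)
    ups = math.sqrt((1 + j) / 2)
    sign = 1.0 if ell > 0 else -1.0          # the sign "+-" of the text
    re = sign * (ell / 2) * math.sqrt((j - 1) / 2)
    im = (ell / 2) * (1 + sign * ups)
    modulus = abs(ell) / math.sqrt(2) * math.sqrt(ups ** 2 + sign * ups)
    arg = 0.5 * math.acos(-sign / ups)
    return re, im, modulus, arg
```

For example, for $\tau = 3$, $\ell = 2$ both routes give $u_{0,+} \approx 1.0398 + 2.4426\,i$.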

8.3.2 The direction of steepest descent

As is customary in the saddle-point method, it is now time to determine the direction of steepest descent at the saddle-point $u_{0,+}$. Even if we decide to use a contour that goes through the saddle-point in a direction that is not quite optimal, it will be useful to know what the direction $w$ of steepest descent actually is. A contour that passes through the saddle-point making an angle between $-\pi/4 + \epsilon$ and $\pi/4 - \epsilon$ with $w$ may be acceptable, in that the contribution of the saddle point is then suboptimal by at most a bounded factor depending on $\epsilon$; an angle approaching $-\pi/4$ or $\pi/4$ leads to a contribution suboptimal by an unbounded factor.

Let $w \in \mathbb{C}$ be the unit vector pointing in the direction of steepest descent. Then, by definition, $w^2 \phi''(u_{0,+})$ is real and positive, where $\phi$ is as in (8.12). Thus $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$. (The direction of steepest descent is defined only modulo $\pi$.) By (8.17),

\[
\begin{aligned}
\arg(\phi''(u_{0,+})) &= -\pi + \arg(i\ell u_{0,+} + 2i\tau) - 2\arg(u_{0,+}) \mod 2\pi \\
&= -\frac{\pi}{2} + \arg(\ell u_{0,+} + 2\tau) - 2\arg(u_{0,+}) \mod 2\pi.
\end{aligned}
\]

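By (8.21),

\[
\begin{aligned}
\Re(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2} \left( \pm\sqrt{\frac{j(\rho) - 1}{2}} + \frac{4\tau}{\ell^2} \right) = \frac{\ell^2}{2} \left( \rho \pm \sqrt{\frac{j(\rho) - 1}{2}} \right), \\
\Im(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2} \left( 1 \pm \sqrt{\frac{j(\rho) + 1}{2}} \right).
\end{aligned}
\]

Therefore, $\arg(\ell u_{0,+} + 2\tau) = \arctan \varpi$, where

\[ \varpi = \frac{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}. \]

It is easy to check that $\mathrm{sgn}\, \varpi = \mathrm{sgn}\, \ell$. Hence,

\[ \arctan \varpi = \pm\frac{\pi}{2} - \arctan \frac{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}. \]

At the same time, writing $j = j(\rho)$ and $\upsilon = \upsilon(\rho)$,

\[
\begin{aligned}
\frac{\rho \pm \sqrt{\frac{j - 1}{2}}}{1 \pm \sqrt{\frac{j + 1}{2}}} &= \frac{\left( \rho \pm \sqrt{\frac{j - 1}{2}} \right) \left( 1 \mp \sqrt{\frac{j + 1}{2}} \right)}{1 - \frac{j + 1}{2}} = \frac{\rho \pm \sqrt{2(j - 1)} \mp \rho \sqrt{2(j + 1)}}{1 - j} \\
&= \frac{\rho \pm \sqrt{\frac{2}{j + 1}} \left( \sqrt{j^2 - 1} - \rho \cdot (j + 1) \right)}{1 - j} = \frac{\rho \pm \frac{1}{\upsilon} \left( \rho - \rho \cdot (j + 1) \right)}{1 - j} \\
&= \frac{\rho (1 \mp j/\upsilon)}{1 - j} = \frac{(-1 \pm j/\upsilon)(j + 1)}{\rho} = \frac{2\upsilon(-\upsilon \pm j)}{\rho}.
\end{aligned}
\tag{8.25}
\]

Hence, modulo $2\pi$,

\[ \arg(\phi''(u_{0,+})) = -\arctan \frac{2\upsilon(-\upsilon \pm j)}{\rho} - 2\arg(u_{0,+}) - \begin{cases} 0 & \text{if } \ell \ge 0, \\ \pi & \text{if } \ell < 0. \end{cases} \]

Therefore, the direction of steepest descent is

\[ \arg(w) = -\frac{\arg(\phi''(u_{0,+}))}{2} = \arg(u_{0,+}) + \frac{1}{2} \arctan \frac{2\upsilon(-\upsilon \pm j)}{\rho} + \begin{cases} 0 & \text{if } \ell \ge 0, \\ \pi/2 & \text{if } \ell < 0. \end{cases} \tag{8.26} \]

By (8.24) and $\arccos 1/\upsilon = \arctan \sqrt{\upsilon^2 - 1} = \arctan \sqrt{(j - 1)/2}$, we conclude that

\[ \arg(w) = \begin{cases} \frac{\pi}{2} + \frac{1}{2} \left( -\arctan \frac{2\upsilon(j + \upsilon)}{\rho} + \arctan \sqrt{\frac{j - 1}{2}} \right) & \text{if } \ell < 0, \\ \frac{\pi}{2} + \frac{1}{2} \left( \arctan \frac{2\upsilon(j - \upsilon)}{\rho} - \arctan \sqrt{\frac{j - 1}{2}} \right) & \text{if } \ell \ge 0. \end{cases} \tag{8.27} \]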
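The formula (8.27) can be tested against the defining property of $w$, namely that $w^2\phi''(u_{0,+})$ is real and positive. The following is a floating-point sketch, assuming $\tau > 0$ and $\ell \ne 0$; it uses $\phi''(u) = -1 - i\tau/u^2$, which follows from (8.12). The function names are hypothetical.

```python
import cmath
import math


def saddle_point(tau, ell):
    """u_{0,+} from (8.15), with the principal branch of the square root."""
    return (1j * ell + cmath.sqrt(-ell ** 2 + 4j * tau)) / 2


def phi_second(u, tau):
    """phi''(u) = -1 - i*tau/u^2 for phi as in (8.12)."""
    return -1 - 1j * tau / u ** 2


def steepest_descent_arg(tau, ell):
    """arg(w) as given by (8.27), for tau > 0 and ell != 0."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho ** 2)
    ups = math.sqrt((1 + j) / 2)
    base = math.atan(math.sqrt((j - 1) / 2))
    if ell < 0:
        return math.pi / 2 + 0.5 * (-math.atan(2 * ups * (j + ups) / rho) + base)
    return math.pi / 2 + 0.5 * (math.atan(2 * ups * (j - ups) / rho) - base)


def rotated_second_derivative(tau, ell):
    """w^2 * phi''(u_{0,+}); this should be a positive real number."""
    w = cmath.exp(1j * steepest_descent_arg(tau, ell))
    return w * w * phi_second(saddle_point(tau, ell), tau)
```

For $\tau = 3$, $\ell = 2$ this gives $\arg(w) \approx 1.6818$, that is, a direction about $0.111$ radians past the vertical.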

Figure 8.1: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell < 0$.

Figure 8.2: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell \ge 0$.

There is nothing wrong in using plots here to get an idea of the behavior of $\arg(w)$, since, at any rate, the direction of steepest descent will play only an advisory role in our choices. See Figures 8.1 and 8.2.

8.4 The integral over the contour

We must now choose the contour of integration. The optimal contour should be one on which the phase of the integrand in (8.13) is constant, i.e., $\Im(\phi(u))$ is constant. This is so because, throughout the contour, we want to keep descending from the saddle as rapidly as possible, and so we want to maximize the absolute value of the derivative of the real part of the exponent $-\phi(u)$. At any point $u$, if we are to maximize $|\Re(d\phi(u)/dt)|$, we want our contour to be such that $\Im(d\phi(u)/dt) = 0$. (We can also see this as follows: if $\Im(\phi(u))$ is constant, there is no cancellation in (8.13) for us to miss.)

Writing $u = x + iy$, we obtain from (8.12) that

\[ \Im(\phi(u)) = -xy + \ell x + \tau \log \sqrt{x^2 + y^2}. \tag{8.28} \]

We would thus be considering the curve $\Im(\phi(u)) = c$, where $c$ is a constant. Since we need the contour to pass through the saddle point $u_{0,+}$, we set $c = \Im(\phi(u_{0,+}))$. The only problem is that the curve $\Im(\phi(u)) = c$ given by (8.28) is rather uncomfortable to work with.

Instead, we shall use several rather simple contours, each appropriate for different values of $\ell$ and $\tau$.

8.4.1 A simple contour

Assume first that $\ell > 0$. We could just let our contour $L$ be the vertical line going through $u_{0,+}$. Since the direction of steepest descent is never far from vertical (see Figure 8.2), this would be a good choice. However, the vertical line has the defect of going too close to the origin when $\rho \to 0$.

Instead, we will let $L$ consist of three segments: (a) the straight vertical ray

\[ \{ (x_0, y) : y \ge y_0 \}, \]

where $x_0 = \Re u_{0,+} \ge 0$, $y_0 = \Im u_{0,+} > 0$; (b) the straight segment going downwards and to the right from $u_{0,+}$ to the $x$-axis, forming an angle of $\pi/2 - \beta$ (where $\beta > 0$ will be determined later) with the $x$-axis at a point $(x_1, 0)$; (c) the straight vertical ray $\{ (x_1, y) : y \le 0 \}$. Let us call these three segments $L_1$, $L_2$, $L_3$. Shifting the contour in (8.13), we obtain

\[ I = \int_L e^{-\phi(u)} u^{-\sigma}\, du, \]

and so $|I| \le I_1 + I_2 + I_3$, where

\[ I_j = \int_{L_j} \left| e^{-\phi(u)} u^{-\sigma} \right| |du|. \tag{8.29} \]

As we shall see, we have chosen the segments $L_j$ so that each of the three integrals $I_j$ will be easy to bound.

Let us start with $I_1$. Since $\sigma \ge 0$,

\[ I_1 \le |u_{0,+}|^{-\sigma} \int_{y_0}^\infty e^{-\Re\phi(x_0 + iy)}\, dy, \]

where, by (8.12),

\[ \Re\phi(x + iy) = \frac{y^2 - x^2}{2} - \ell y - \tau \arg(x + iy). \tag{8.30} \]

Let us expand the expression on the right of (8.30) for $x = x_0$ and $y$ around $y_0 = \Im u_{0,+} > 0$. The constant term is

\[
\begin{aligned}
\Re\phi(u_{0,+}) &= -\frac{\ell}{2} y_0 - \tau \arg(u_{0,+}) = -\frac{\ell^2}{4} (1 + \upsilon(\rho)) - \frac{\tau}{2} \arccos \frac{-1}{\upsilon(\rho)} \\
&= -\left( \frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2} \arccos \frac{-1}{\upsilon(\rho)} \right) \tau,
\end{aligned}
\tag{8.31}
\]

where we are using (8.19), (8.21) and (8.24).

The linear term vanishes because $u_{0,+}$ is a saddle-point (and thus a local extremum on $L$). It remains to estimate the quadratic term. Now, in (8.30), the term $\arg(x + iy)$ equals $\arctan(y/x)$, whose quadratic term we should now examine; instead, we are about to see that we can bound it trivially. In general, for $t_0, t \in \mathbb{R}$ and $f \in C^2$,

\[ f(t) = f(t_0) + f'(t_0) \cdot (t - t_0) + \int_{t_0}^{t} \int_{t_0}^{r} f''(s)\, ds\, dr. \tag{8.32} \]

Now, $\arctan''(s) = -2s/(s^2 + 1)^2$, and this is negative for $s > 0$ and obeys

\[ \arctan''(-s) = -\arctan''(s) \]

154 CHAPTER 8 THE MELLIN TRANSFORM OF THE TWISTED GAUSSIAN

for all s Hence for t0 ge 0 and t ge minust0

arctan t le arctan t0 + (arctanprime t0) middot (tminus t0) (833)
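The tangent-line bound (8.33) is easy to spot-check numerically. The following is only a sanity check on a finite grid, not a proof; the grid sizes and the helper names `tangent_gap` and `min_gap` are illustrative choices, not part of the original argument.

```python
import math

def tangent_gap(t0: float, t: float) -> float:
    """Right side of (8.33) minus left side; non-negative when the bound holds."""
    slope = 1.0 / (1.0 + t0 * t0)  # derivative of arctan at t0
    return math.atan(t0) + slope * (t - t0) - math.atan(t)

def min_gap(t0_values, steps: int = 400, span: float = 50.0) -> float:
    """Smallest gap found for t in [-t0, -t0 + span], sampled on a uniform grid."""
    worst = float("inf")
    for t0 in t0_values:
        for k in range(steps + 1):
            t = -t0 + span * k / steps
            worst = min(worst, tangent_gap(t0, t))
    return worst

worst_gap = min_gap([0.0, 0.1, 0.5, 1.0, 3.0, 10.0])
print(worst_gap)
```

The gap vanishes at $t = t_0$ and stays non-negative down to $t = -t_0$, which is exactly the range in which (8.33) is used below.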

Therefore, in (8.30), we can consider only the quadratic term coming from $(y^2 - x^2)/2$, namely $(y-y_0)^2/2$, and ignore the quadratic term coming from $\arg(x+iy)$. Thus

\[ \Re\phi(x_0+iy) \ge \frac{(y-y_0)^2}{2} + \Re\phi(u_{0,+}) \tag{8.34} \]

for $y \ge -y_0$, and in particular for $y \ge y_0$. Hence

\[ \int_{y_0}^{\infty} e^{-\Re\phi(x_0+iy)}\, dy \le e^{-\Re\phi(u_{0,+})} \int_{y_0}^{\infty} e^{-\frac{1}{2}(y-y_0)^2}\, dy = \sqrt{\pi/2}\cdot e^{-\Re\phi(u_{0,+})}. \tag{8.35} \]

Notice that, once we choose to use the approximation (8.33), the vertical direction is actually optimal. (In turn, the fact that the direction of steepest descent is close to vertical shows us that we are not losing much by using the approximation (8.33).)

As for $|u_{0,+}|^{-\sigma}$, we will estimate it by the easy bound

\[ |u_{0,+}| = \frac{\ell}{\sqrt{2}} \sqrt{\upsilon^2 + \upsilon} \ge \frac{\ell}{\sqrt{2}} \max\left(\sqrt{\frac{\rho}{2}}, \sqrt{2}\right) = \max(\sqrt{\tau}, \ell), \tag{8.36} \]

where we use (8.22).

Let us now bound $I_2$. As we already said, the linear term at $u_{0,+}$ vanishes. Let $u_\circ$ be the point at which $L_2$ meets the line normal to it through the origin. We must take care that the angle formed by the origin, $u_{0,+}$ and $u_\circ$ be no larger than the angle formed by the origin, $(x_1, 0)$ and $u_\circ$; this will ensure that we are in the range in which the approximation (8.33) is valid (namely, $t \ge -t_0$, where $t_0 = \tan\alpha_0$). The first angle is $\pi/2 + \beta - \arg u_{0,+}$, whereas the second angle is $\pi/2 - \beta$. Hence, it is enough to set $\beta \le (\arg u_{0,+})/2$. Then we obtain from (8.12) and (8.33) that

\[ \Re\phi(u) \ge \Re\phi(u_{0,+}) - \Re\frac{(u - u_{0,+})^2}{2}. \tag{8.37} \]

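If we let $s = |u - u_{0,+}|$, we see that

\[ \Re\frac{(u - u_{0,+})^2}{2} = \frac{s^2}{2} \cos\left(2\cdot\left(\frac{\pi}{2} - \beta\right)\right) = -\frac{s^2}{2}\cos 2\beta. \]

Hence

\[ I_2 \le |u_\circ|^{-\sigma} \int_{L_2} e^{-\Re\phi(u)} |du| < |u_\circ|^{-\sigma} \int_0^{\infty} e^{-\Re\phi(u_{0,+}) - \frac{s^2}{2}\cos 2\beta}\, ds = |u_\circ|^{-\sigma} e^{-\Re\phi(u_{0,+})} \sqrt{\frac{\pi}{2\cos 2\beta}}. \tag{8.38} \]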

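Since the angle at the origin between $u_{0,+}$ and $u_\circ$ is $\arg u_{0,+} - \beta$, we see that, by (8.21),

\[ |u_\circ| = \Re\left((x_0 + iy_0)(\cos\beta - i\sin\beta)\right) = \frac{\ell}{2}\left(\sqrt{\frac{j-1}{2}}\cos\beta + \left(1 + \sqrt{\frac{j+1}{2}}\right)\sin\beta\right). \tag{8.39} \]

The square of the expression within the outer parentheses is at least

\[ \frac{j-1}{2}\cos^2\beta + \left(1 + \frac{j+1}{2} + \sqrt{2(j+1)}\right)\sin^2\beta + \left(\sqrt{\frac{j^2-1}{4}} + \sqrt{\frac{j-1}{2}}\right)\sin 2\beta \ge \frac{j}{2} + \frac{7}{2}\sin^2\beta - \frac{1}{2}\cos^2\beta + \frac{j}{2}\sin^2\beta. \]

If $\beta \ge \pi/8$, then $\tan\beta > 1/\sqrt{7}$, and so, since $j > \rho$, we obtain

\[ |u_\circ| \ge \frac{\ell}{2}\sqrt{\frac{j}{2}(1 + \sin^2\beta)} > \frac{\ell\sqrt{\rho}}{2^{3/2}}\sqrt{1 + \sin^2\beta}. \]

We can also apply the trivial bound $j \ge 1$ directly to (8.39). Thus

\[ |u_\circ| \ge \max\left(\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\beta},\ \ell\sin\beta\right). \]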

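Let us choose $\beta$ as follows. We could always set $\beta = \pi/8$; since $\arg u_{0,+} \ge \pi/4$, we then have $\beta \le (\arg u_{0,+})/2$, as required. However, if $\rho \le 3/2$, then $\upsilon(\rho) \le 1.18381$, and so, by (8.24), $\arg u_{0,+} \ge 1.28842$. We can thus set either $\beta = \pi/6 = 0.523598\ldots$ or $\beta = \pi/5 = 0.628318\ldots$, say, either of which is smaller than $(\arg u_{0,+})/2$. Going back to (8.38), we conclude that

\[ I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \frac{\sqrt{\pi}}{2^{1/4}} \left| \sqrt{\frac{\tau}{2}} \sqrt{1 + \sin^2\frac{\pi}{8}} \right|^{-\sigma} \]

for $\rho$ arbitrary, and

\[ I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \min\left( \sqrt{\frac{\pi/2}{\cos(2\pi/5)}} \cdot \left|\ell \sin\frac{\pi}{5}\right|^{-\sigma},\ \sqrt{\pi}\left|\frac{\ell}{2}\right|^{-\sigma} \right) \]

when $\rho \le 3/2$.

It remains to estimate $I_3$. For $u = x_1$,

\[ -\Re\frac{(u - u_{0,+})^2}{2} = -\Re\frac{y_0^2(\tan\beta - i)^2}{2} = \frac{1}{2}\left(1 - \tan^2\beta\right) y_0^2 \ge \left(1 - \tan^2\beta\right)\cdot\frac{\ell^2}{8}\left(1 + \frac{j+1}{2}\right) \ge \frac{\ell^2}{8}\left(1 - \tan^2\beta\right)\cdot\frac{\rho}{2} \ge \frac{1}{4}\left(1 - \tan^2\beta\right)\tau, \]

where we are using (8.21). Thus (8.37) tells us that

\[ \Re\phi(x_1) \ge \Re\phi(u_{0,+}) + \frac{1 - \tan^2\beta}{4}\tau. \]

At the same time, by (8.30) and $\tau, \ell \ge 0$,

\[ \Re\phi(x_1 + iy) \ge \Re\phi(x_1) + \frac{y^2}{2} \]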

for $y \le 0$. Hence

\[ I_3 \le |x_1|^{-\sigma} \int_{L_3} e^{-\Re\phi(u)} |du| \le |x_1|^{-\sigma} e^{-\Re\phi(x_1)} \int_{-\infty}^{0} e^{-y^2/2}\, dy \le |x_1|^{-\sigma} \cdot \sqrt{\frac{\pi}{2}}\, e^{-\frac{1-\tan^2\beta}{4}\tau} e^{-\Re\phi(u_{0,+})}. \]

Here note that $x_1 \ge (\tan\beta)|u_{0,+}|$, and so, by (8.36),

\[ x_1 \ge \tan\beta \cdot \max(\sqrt{\tau}, \ell). \]

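We conclude that, for $\ell > 0$,

\[ |I| \le \left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\tau^{\sigma/2}}\, e^{-\Re\phi(u_{0,+})} \]

(since $(1 - \tan^2(\pi/8))/4 = (\sqrt{2}-1)/2$), and, when $\rho \le 3/2$,

\[ |I| \le \left(1 + \min\left(2^{\sigma + \frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\ell^{\sigma}}\, e^{-\Re\phi(u_{0,+})}. \]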

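We know $\Re\phi(u_{0,+})$ from (8.31). Write

\[ E(\rho) = \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} - \frac{\upsilon(\rho) - 1}{\rho}, \tag{8.40} \]

so that

\[ -\Re\phi(u_{0,+}) = \left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau = \left(\frac{\pi}{2} - E(\rho) + \frac{2}{\rho}\right)\tau. \]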

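To finish, we just need to apply (8.11). It makes sense to group together $\Gamma(s)e^{\frac{\pi}{2}\tau}$, since it is bounded on the critical line (by the classical formula $|\Gamma(1/2 + i\tau)| = \sqrt{\pi/\cosh\pi\tau}$, as in [MV07, Exer. C.1(b)]), and in general of slow growth on bounded strips. Using (8.11), and noting that $2\pi^2\delta^2 = \ell^2/2 = (2/\rho)\cdot\tau$, we obtain

\[ |F_\delta(s)| \le |\Gamma(s)| e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot \begin{cases} c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for $\rho$ arbitrary,} \\ c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for $\rho \le 3/2$,} \end{cases} \tag{8.41} \]

where

\[ \begin{aligned} c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right), \\ c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma+\frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right). \end{aligned} \tag{8.42} \]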


We have assumed throughout that $\ell \ge 0$ and $\tau \ge 0$. We can immediately obtain a bound valid for $\ell \le 0$, $\tau \le 0$ by reflection on the $x$-axis: we simply put absolute values around $\tau$ and $\ell$ in (8.41).

We see that we have obtained a bound in a neat, closed form without too much effort. Of course, this effortlessness is usually in part illusory; the contour we have used here is actually the product of some trial and error, in that some other contours give results that are comparable in quality but harder to simplify. We will have to choose a different contour when $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$.

8.4.2 Another simple contour

We now wish to give a bound for the case of $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$, i.e., $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$. We expect a much smaller upper bound than for $\operatorname{sgn}(\ell) = \operatorname{sgn}(\tau)$, given what we already know from the method of stationary phase. This also means that we will not need to be as careful in order to get a bound that is good enough for all practical purposes.

Our contour $L$ will consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge 0\}$; (b) the quarter-circle from $(x_0, 0)$ to $(0, -x_0)$ (that is, an arc where the argument runs from $0$ to $-\pi/2$); and (c) the straight vertical ray $\{(0, y) : y \le -x_0\}$. We call these segments $L_1$, $L_2$, $L_3$, and define the integrals $I_1$, $I_2$ and $I_3$ just as in (8.29).

Much as before, we have

\[ I_1 \le x_0^{-\sigma} \int_0^{\infty} e^{-\Re\phi(x_0+iy)}\, dy. \]

Since (8.33) is valid for $t \ge 0$, (8.34) holds, and so

\[ I_1 \le x_0^{-\sigma} e^{-\Re\phi(u_{0,+})} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(y-y_0)^2}\, dy = x_0^{-\sigma}\sqrt{2\pi}\cdot e^{-\Re\phi(u_{0,+})}. \]

By (8.12) and (8.30),

\[ I_2 \le x_0^{-\sigma} \int_{L_2} e^{-\Re\phi(u)}\, |du| = x_0^{1-\sigma} \int_0^{\pi/2} e^{-\left(-\frac{x_0^2\cos 2\alpha}{2} + \ell x_0\sin\alpha + \tau\alpha\right)}\, d\alpha. \tag{8.43} \]

Now, for $\alpha \ge 0$ and $\ell \le 0$,

\[ (\ell x_0\sin\alpha + \tau\alpha)' = \ell x_0\cos\alpha + \tau \ge \ell x_0 + \tau. \]

Since $j = \sqrt{1+\rho^2} \le 1 + \rho^2/2$, we have $\sqrt{(j-1)/2} \le \rho/2$, and so, by (8.21), $|\ell x_0| \le \ell^2\rho/4 = \tau$, and thus $\ell x_0 + \tau \ge 0$. In other words, the exponent in (8.43) equals $(x_0^2\cos 2\alpha)/2$ minus an increasing function, and so, since $\Re\phi(x_0) = -x_0^2/2$,

\[ I_2 \le x_0^{-\sigma}\cdot x_0 \int_0^{\pi/2} e^{\frac{x_0^2\cos 2\alpha}{2}}\, d\alpha = x_0^{-\sigma}\cdot\frac{\pi}{2} x_0\cdot I_0(x_0^2/2), \]

where $I_0(t) = \frac{1}{\pi}\int_0^{\pi} e^{t\cos\theta}\, d\theta$ is the modified Bessel function of the first kind (and order 0).

Since $\cos\theta = \sqrt{1 - \sin^2\theta} < 1 - (\sin^2\theta)/2 \le 1 - 2\theta^2/\pi^2$, we have³

\[ I_0(t) \le \frac{1}{\pi}\int_0^{\pi} e^{t\left(1 - \frac{2\theta^2}{\pi^2}\right)}\, d\theta < e^t\cdot\frac{1}{\pi}\int_0^{\infty} e^{-\frac{2t}{\pi^2}\theta^2}\, d\theta = e^t\,\frac{\pi/\sqrt{2t}}{\pi}\cdot\frac{\sqrt{\pi}}{2} = \frac{\sqrt{\pi}}{2^{3/2}}\frac{e^t}{\sqrt{t}} \]

for $t \ge 0$.

Using the fact that $\Re\phi(x_0) = -x_0^2/2$, we conclude that

\[ I_2 \le x_0^{-\sigma}\cdot\frac{\pi}{2}x_0\cdot\frac{\sqrt{\pi}}{2^{3/2}}\frac{e^{x_0^2/2}}{x_0/\sqrt{2}} = \frac{\pi^{3/2}}{4}\, x_0^{-\sigma} e^{-\Re\phi(x_0)}. \]
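The Bessel-function bound $I_0(t) \le \frac{\sqrt{\pi}}{2^{3/2}} e^t/\sqrt{t}$ used above can be spot-checked numerically. The sketch below evaluates $I_0$ through its standard power series $\sum_k (t/2)^{2k}/(k!)^2$ in ordinary floating point; it is a sanity check on a sample grid under that assumption, not a rigorous verification, and the function names are illustrative.

```python
import math

def bessel_i0(t: float, terms: int = 200) -> float:
    """I_0(t) via the power series sum_k (t/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (t / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def upper_bound(t: float) -> float:
    """The bound sqrt(pi) / 2^(3/2) * e^t / sqrt(t) derived in the text."""
    return math.sqrt(math.pi) / 2 ** 1.5 * math.exp(t) / math.sqrt(t)

samples = [0.05 * k for k in range(1, 801)]  # t in (0, 40]
ratios = [bessel_i0(t) / upper_bound(t) for t in samples]
print(max(ratios))
```

The ratio stays strictly below 1 on the whole grid, which leaves room for the sharper constant mentioned in the footnote.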

By (8.34), which is valid for all $\ell$, we know that $\Re\phi(x_0) \ge \Re\phi(u_{0,+})$.

Let us now estimate the integral on $L_3$. Again by (8.30), for $y < 0$,

\[ \Re\phi(iy) = \frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}. \]

Hence

\[ \left|\int_{L_3} e^{-\phi(u)} u^{-\sigma}\, du\right| \le x_0^{-\sigma} \int_{-\infty}^{-x_0} e^{-\left(\frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}\right)}\, dy = x_0^{-\sigma} e^{\frac{1}{2}\ell^2} e^{-\frac{\tau\pi}{2}} \int_{-\infty}^{-x_0} e^{-\frac{1}{2}(y-\ell)^2}\, dy \le x_0^{-\sigma} e^{-\frac{\tau\pi}{2}}\sqrt{\frac{\pi}{2}}, \]

since $y - \ell \le -\ell$ for $y \le -x_0$ and $\int_{-\infty}^{-\ell} e^{-t^2/2}\, dt \le \sqrt{\pi/2}\cdot e^{-\ell^2/2}$ (by [AS64, 7.1.13]).

Now that we have bounded the integrals over $L_1$, $L_2$ and $L_3$, it remains to bound $x_0$ from below, starting from (8.21). We will bound it differently for $\rho < 3/2$ and for $\rho \ge 3/2$. (The choice of $3/2$ is fairly arbitrary.)

Expanding $(\sqrt{1+t} - 1)^2 > 0$, we obtain that $2(1+t) - 2\sqrt{1+t} \ge t$ for all $t \ge -1$, and so

\[ \left(\frac{\sqrt{1+t} - 1}{t}\right)' = \frac{1}{t^2}\left(\frac{t}{2\sqrt{1+t}} - (\sqrt{1+t} - 1)\right) < 0, \]

i.e., $(\sqrt{1+t} - 1)/t$ decreases as $t$ increases. Hence, for $\rho \le \rho_0$, where $\rho_0 \ge 0$,

\[ j(\rho) = \sqrt{1+\rho^2} \ge 1 + \frac{\sqrt{1+\rho_0^2} - 1}{\rho_0^2}\rho^2, \tag{8.44} \]

which equals $1 + (2/9)(\sqrt{13} - 2)\rho^2$ for $\rho_0 = 3/2$. Thus, for $\rho \le 3/2$,

\[ x_0 \ge \frac{|\ell|}{2}\sqrt{\frac{\frac{2}{9}(\sqrt{13} - 2)\rho^2}{2}} = \frac{\sqrt{\sqrt{13} - 2}}{6}|\ell|\rho = \frac{2\sqrt{\sqrt{13} - 2}}{3}\frac{\tau}{|\ell|} \ge 0.84473\frac{|\tau|}{|\ell|}. \tag{8.45} \]

³ It is actually not hard to prove rigorously the better bound $I_0(t) \le 0.468823\, e^t/\sqrt{t}$. For $t \ge 8$, this can be done directly by the change of variables $\cos\theta = 1 - 2s^2$, $d\theta = 2\,ds/\sqrt{1 - s^2}$, followed by the usage of different upper bounds on the integrand $\exp(-2ts^2)/\sqrt{1 - s^2}$ for $0 \le s \le 1/2$ and $1/2 \le s \le 1$. (Thanks are due to G. Kuperberg for this argument.) For $t < 8$, use the Taylor expansion of $I_0(t)$ around $t = 0$ [AS64, (9.6.12)]; truncate it after 16 terms, and then bound the maximum of the truncated series by the bisection method, implemented via interval arithmetic (as described in §2.6).

On the other hand,

\[ \left(\frac{j(\rho) - 1}{\rho}\right)' = \frac{1}{\rho^2}\left(j'(\rho)\rho - (j(\rho) - 1)\right) = \frac{\rho^2 - (1+\rho^2) + \sqrt{1+\rho^2}}{\rho^2\sqrt{1+\rho^2}} \ge 0, \]

and so, for $\rho \ge 3/2$, $(j(\rho) - 1)/\rho$ is minimal at $\rho = 3/2$, where it takes the value $(\sqrt{13} - 2)/3$. Hence

\[ x_0 = \frac{|\ell|}{2}\sqrt{\frac{j(\rho) - 1}{2}} \ge \frac{|\ell|\sqrt{\rho}}{2}\cdot\frac{\sqrt{\sqrt{13} - 2}}{\sqrt{6}} = \frac{\sqrt{\sqrt{13} - 2}}{\sqrt{6}}\sqrt{\tau} \ge 0.51729\sqrt{\tau}. \tag{8.46} \]
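The two lower bounds (8.45) and (8.46) can be checked numerically. The sketch below assumes, as the formulas above indicate, that $x_0 = \frac{|\ell|}{2}\sqrt{(j(\rho)-1)/2}$ with $j(\rho) = \sqrt{1+\rho^2}$ and $\rho = 4\tau/\ell^2$, so that $\tau/|\ell| = \rho|\ell|/4$ and $\sqrt{\tau} = |\ell|\sqrt{\rho}/2$; everything is normalised by $|\ell|$. The names `C_SMALL`, `C_LARGE` and the grid are illustrative.

```python
import math

def j(rho: float) -> float:
    return math.sqrt(1.0 + rho * rho)

def x0_over_ell(rho: float) -> float:
    """x_0 / |ell| = (1/2) * sqrt((j(rho) - 1) / 2)."""
    return 0.5 * math.sqrt((j(rho) - 1.0) / 2.0)

C_SMALL = 2.0 * math.sqrt(math.sqrt(13.0) - 2.0) / 3.0           # constant in (8.45)
C_LARGE = math.sqrt(math.sqrt(13.0) - 2.0) / math.sqrt(6.0)      # constant in (8.46)

def check_lower_bounds(step: float = 0.01, rho_max: float = 50.0,
                       eps: float = 1e-12) -> bool:
    """Check (8.45) for rho <= 3/2 and (8.46) for rho > 3/2 on a grid."""
    rho = step
    while rho <= rho_max:
        lhs = x0_over_ell(rho)
        if rho <= 1.5:
            rhs = C_SMALL * rho / 4.0             # C_SMALL * tau / |ell|, divided by |ell|
        else:
            rhs = C_LARGE * math.sqrt(rho) / 2.0  # C_LARGE * sqrt(tau), divided by |ell|
        if lhs < rhs - eps:
            return False
        rho += step
    return True

print(C_SMALL, C_LARGE, check_lower_bounds())
```

Both bounds are attained with equality at $\rho = 3/2$, which is why a small tolerance is used in the comparison.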

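We now sum $I_1$, $I_2$ and $I_3$, and then use (8.11); we obtain that, when $\ell < 0$ and $\tau \ge 0$,

\[ |F_\delta(s)| \le \frac{e^{-2\pi^2\delta^2}|\Gamma(s)|}{\sqrt{2\pi}} \left|\int_L e^{-\phi(u)} u^{-\sigma}\, du\right| \le |x_0|^{-\sigma}\left(\left(1 + \frac{\pi}{2^{3/2}}\right) e^{-\Re\phi(u_{0,+})} + \frac{1}{2} e^{-\frac{\tau\pi}{2}}\right) e^{-\frac{1}{2}\ell^2}|\Gamma(s)|. \tag{8.47} \]

By (8.19), (8.21) and (8.24),

\[ -\Re(\phi(u_{0,+})) = \frac{\ell^2}{4}(1 - \upsilon(\rho)) + \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} < \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} \le \frac{\pi}{4}\tau. \]

We conclude that, when $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$ (i.e., $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$),

\[ |F_\delta(s)| \le |x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}|\Gamma(s)| e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1 + \frac{\pi}{2^{3/2}}\right) e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2} e^{-\pi|\tau|}\right), \]

where $x_0$ can be bounded as in (8.45) and (8.46). Here, as before, we reduce the case $\tau < 0$ to the case $\tau > 0$ by reflection. This concludes the proof of Theorem 8.0.1.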

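8.5 Conclusions

We have obtained bounds on $|F_\delta(s)|$ for $\operatorname{sgn}(\delta) \ne \operatorname{sgn}(\tau)$ (8.41) and for $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$ (8.47). Our task is now to simplify them.

First, let us look at the exponent $E(\rho)$ defined as in (8.2). Its plot can be seen in Figure 8.3. We claim that

\[ E(\rho) \ge \begin{cases} 0.1598 & \text{if $\rho \ge 1.5$,} \\ 0.1065\rho & \text{if $\rho < 1.5$.} \end{cases} \tag{8.48} \]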

Figure 8.3: The function $E(\rho)$

This is so for $\rho \ge 1.5$ because $E(\rho)$ is increasing on $\rho$, and $E(1.5) = 0.15982\ldots$. The case $\rho < 1.5$ is a little more delicate. We can easily see that $\arccos(1 - t^2/2) \ge t$ for $0 \le t \le 2$ (since the derivative of the left side is $1/\sqrt{1 - t^2/4}$, which is always $\ge 1$). We also have

\[ 1 + \frac{\rho^2}{2} - \frac{\rho^4}{8} \le j(\rho) \le 1 + \frac{\rho^2}{2} \]

for $0 \le \rho \le \sqrt{8}$, and so

\[ 1 + \frac{\rho^2}{8} - \frac{5\rho^4}{128} \le \upsilon(\rho) \le 1 + \frac{\rho^2}{8} \]

for $0 \le \rho \le \sqrt{32/5}$; this, in turn, gives us that $1/\upsilon(\rho) \le 1 - \rho^2/8 + 7\rho^4/128$ (again for $0 \le \rho \le \sqrt{32/5}$), and so $1/\upsilon(\rho) \le 1 - (1 - 7/64)\rho^2/8$ for $0 \le \rho \le 1/2$. We conclude that

\[ \arccos\frac{1}{\upsilon(\rho)} \ge \frac{1}{2}\sqrt{\frac{57}{64}}\,\rho; \]

therefore,

\[ E(\rho) \ge \frac{1}{4}\sqrt{\frac{57}{64}}\,\rho - \frac{\rho}{8} > 0.11093\rho > 0.1065\rho. \]

In the remaining range $1/2 \le \rho \le 3/2$, we prove that $E(\rho)/\rho > 0.106551$ using the bisection method (with 20 iterations) implemented by means of interval arithmetic. This concludes the proof of (8.48).
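The claim (8.48) can be explored in plain floating point. The sketch below is not the rigorous bisection with interval arithmetic that the proof relies on; it only samples $E(\rho)$ on a grid. It assumes $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$ with $j(\rho) = \sqrt{1+\rho^2}$, which is consistent with the numerical values quoted in the text (for example $\upsilon(3/2) \le 1.18381$).

```python
import math

def upsilon(rho: float) -> float:
    """Assumed definition: upsilon(rho) = sqrt((1 + sqrt(1 + rho^2)) / 2)."""
    return math.sqrt((1.0 + math.sqrt(1.0 + rho * rho)) / 2.0)

def E(rho: float) -> float:
    """The exponent E(rho) of (8.40)."""
    u = upsilon(rho)
    return 0.5 * math.acos(1.0 / u) - (u - 1.0) / rho

# Sample E(rho)/rho on (0, 1.5] and E(rho) on [1.5, 50].
small = [0.01 * k for k in range(1, 151)]
large = [1.5 + 0.05 * k for k in range(0, 971)]
min_ratio_small = min(E(r) / r for r in small)
min_value_large = min(E(r) for r in large)
print(E(1.5), min_ratio_small, min_value_large)
```

On this grid the ratio $E(\rho)/\rho$ is smallest at the right endpoint $\rho = 3/2$, in line with the constant $0.106551$ quoted above.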

Assume from this point onwards that $|\tau| \ge 20$. Let us show that the contribution of (8.3) is negligible relative to that of (8.1). Indeed,

\[ \left(\left(1 + \frac{\pi}{2^{3/2}}\right) e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2} e^{-\pi|\tau|}\right) \le \frac{7.8}{10^6}\, e^{-0.1598\tau}. \]

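It is useful to note that $e^{-\ell^2/2} = e^{-2\tau/\rho}$, and so, for $\sigma \le k+1$ and $\rho \le 3/2$,

\[ \frac{e^{-2\tau/\rho}}{(0.84473|\tau|/\ell)^{\sigma}} \le \frac{e^{-40/\rho}}{\left(\frac{0.84473}{4}\rho\right)^{\sigma}\ell^{\sigma}} \le \frac{1}{\ell^{\sigma}}\left(\frac{4}{0.84473\cdot 1.5}\right)^{\sigma}\frac{e^{-80/(3t)}}{t^{\sigma}} \le \frac{1}{\ell^{\sigma}}\cdot 3.15683^{k+1}\,\frac{e^{-80/(3t)}}{t^{k+1}}, \tag{8.49} \]

where $t = 2\rho/3 \le 1$. Since $e^{-c/t}/t^{k+1}$ attains its maximum at $t = c/(k+1)$,

\[ \frac{e^{-80/(3t)}}{t^{k+1}} \le e^{-(k+1)}\left(\frac{3(k+1)}{80}\right)^{k+1}, \]

and so, for $\rho \le 3/2$,

\[ |x_0|^{-\sigma} e^{-\frac{1}{2}\ell^2} \le \frac{1}{\ell^{\sigma}}\cdot\begin{cases} 0.04355 & \text{if $0 \le \sigma \le 1$,} \\ 0.00759 & \text{if $1 \le \sigma \le 2$,} \\ 0.00224 & \text{if $2 \le \sigma \le 3$,} \end{cases} \]

whereas $|x_0|^{-\sigma} e^{-\ell^2/2} \le |x_0|^{-\sigma} \le (0.51729\sqrt{\tau})^{-\sigma}$ for $\rho \ge 3/2$.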
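The three constants $0.04355$, $0.00759$ and $0.00224$ come from maximising $e^{-c/t}/t^{k+1}$ with $c = 80/3$ and multiplying by $(4/(0.84473\cdot 1.5))^{k+1}$. A short floating-point recomputation, with an illustrative helper name, looks as follows.

```python
import math

def tail_constant(k: int) -> float:
    """(4 / (0.84473 * 1.5))^(k+1) * e^(-(k+1)) * (3(k+1)/80)^(k+1), as in (8.49)."""
    base = 4.0 / (0.84473 * 1.5)
    n = k + 1
    return base ** n * math.exp(-n) * (3.0 * n / 80.0) ** n

constants = [tail_constant(k) for k in range(3)]
print(constants)
```

Each value agrees with the corresponding constant in the display above after rounding up.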

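We conclude that, for $|\tau| \ge 20$ and $\sigma \le 3$,

\[ |F_\delta(s)| \le |\Gamma(s)| e^{\frac{\pi}{2}\tau}\cdot e^{-0.1598\tau}\cdot\begin{cases} \dfrac{4}{10^7}\dfrac{1}{\ell^{\sigma}} & \text{if $\rho \le 3/2$,} \\[2mm] \dfrac{6}{10^5}\dfrac{1}{\tau^{\sigma/2}} & \text{if $\rho \ge 3/2$,} \end{cases} \tag{8.50} \]

provided that $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$ or $\delta = 0$. This will indeed be negligible compared to our bound for the case $\operatorname{sgn}(\delta) = -\operatorname{sgn}(\tau)$.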

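Let us now deal with the factor $|\Gamma(s)| e^{\frac{\pi}{2}\tau}$. By Stirling's formula with remainder term [GR94, (8.344)],

\[ \log\Gamma(s) = \frac{1}{2}\log(2\pi) + \left(s - \frac{1}{2}\right)\log s - s + \frac{1}{12s} + R_2(s), \]

where

\[ |R_2(s)| < \frac{1/30}{12|s|^3\cos^3\left(\frac{\arg s}{2}\right)} \le \frac{\sqrt{2}}{180|s|^3} \]

for $\Re(s) \ge 0$. The real part of $(s - 1/2)\log s - s$ is

\[ (\sigma - 1/2)\log|s| - \tau\arg(s) - \sigma = (\sigma - 1/2)\log|s| - \frac{\pi}{2}\tau + \tau\left(\arctan\frac{\sigma}{|\tau|} - \frac{\sigma}{|\tau|}\right) \]

for $s = \sigma + i\tau$, $\sigma \ge 0$. Since $\arctan(r) \le r$ for $r \ge 0$, we conclude that

\[ |\Gamma(s)| e^{\frac{\pi}{2}\tau} \le \sqrt{2\pi}\,|s|^{\sigma - \frac{1}{2}}\, e^{\frac{1}{12|s|} + \frac{\sqrt{2}}{180|s|^3}}. \tag{8.51} \]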

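Lastly, $|s|^{\sigma - 1/2} = |\tau|^{\sigma - 1/2}\,|1 + i\sigma/\tau|^{\sigma - 1/2}$. For $|\tau| \ge 20$,

\[ |1 + i\sigma/\tau|^{\sigma - 1/2} \le \begin{cases} 1.000625 & \text{if $0 \le \sigma \le 1$,} \\ 1.007491 & \text{if $1 \le \sigma \le 2$,} \\ 1.028204 & \text{if $2 \le \sigma \le 3$,} \end{cases} \]

and

\[ e^{\frac{1}{12|\tau|} + \frac{\sqrt{2}}{180|\tau|^3}} \le 1.004177. \]

Thus

\[ |\Gamma(s)| e^{\frac{\pi}{2}\tau} \le |\tau|^{\sigma - 1/2}\cdot\begin{cases} 2.51868 & \text{if $0 \le \sigma \le 1$,} \\ 2.53596 & \text{if $1 \le \sigma \le 2$,} \\ 2.5881 & \text{if $2 \le \sigma \le 3$.} \end{cases} \tag{8.52} \]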
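The constants in (8.52) are products of $\sqrt{2\pi}$, the Stirling remainder factor and the factor $|1 + i\sigma/\tau|^{\sigma - 1/2}$, each evaluated at its worst case $|\tau| = 20$ and $\sigma = k+1$. A minimal floating-point recomputation (helper names are illustrative) is:

```python
import math

TAU_MIN = 20.0

def stirling_factor(tau: float = TAU_MIN) -> float:
    """exp(1/(12|tau|) + sqrt(2)/(180|tau|^3)), using |s| >= |tau|."""
    return math.exp(1.0 / (12.0 * tau) + math.sqrt(2.0) / (180.0 * tau ** 3))

def shift_factor(sigma: float, tau: float = TAU_MIN) -> float:
    """|1 + i*sigma/tau|^(sigma - 1/2)."""
    return (1.0 + (sigma / tau) ** 2) ** ((sigma - 0.5) / 2.0)

def gamma_constant(k: int) -> float:
    """Worst case of sqrt(2*pi) * stirling_factor * shift_factor for k <= sigma <= k+1."""
    return math.sqrt(2.0 * math.pi) * stirling_factor() * shift_factor(k + 1.0)

gamma_constants = [gamma_constant(k) for k in range(3)]
print(gamma_constants)
```

For $\sigma < 1/2$ the exponent $\sigma - 1/2$ is negative and the shift factor is at most 1, so the right endpoint $\sigma = k+1$ is indeed the worst case on each interval.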

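Let us now estimate the constants $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ in (8.2). By $|\tau| \ge 20$,

\[ e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau} \le 0.015889, \qquad e^{-\tau/6} \le 0.035674. \tag{8.53} \]

Since $8\sin(\pi/8) = 3.061467\ldots > 1$, we obtain that

\[ c_{1,\sigma,\tau} \le \begin{cases} 1.30454 & \text{if $0 \le \sigma \le 1$,} \\ 1.58361 & \text{if $1 \le \sigma \le 2$,} \\ 1.98186 & \text{if $2 \le \sigma \le 3$,} \end{cases} \qquad c_{2,\sigma,\tau} \le \begin{cases} 1.94511 & \text{if $0 \le \sigma \le 1$,} \\ 3.15692 & \text{if $1 \le \sigma \le 2$,} \\ 5.02186 & \text{if $2 \le \sigma \le 3$.} \end{cases} \]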
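These bounds can be recomputed from the closed forms for $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$. The sketch below takes those closed forms to be $\frac{1}{2}\bigl(1 + 2^{1/4}(2/(1+\sin^2\frac{\pi}{8}))^{\sigma/2} + e^{-\frac{\sqrt2-1}{2}\tau}/(\tan\frac{\pi}{8})^{\sigma}\bigr)$ and $\frac{1}{2}\bigl(1 + \min(2^{\sigma+1/2}, \sqrt{\sec\frac{2\pi}{5}}/(\sin\frac{\pi}{5})^{\sigma}) + e^{-\tau/6}\,3^{\sigma/2}\bigr)$, which is how (8.42) reads once the layout is restored. Both expressions increase with $\sigma$ and decrease with $\tau$, so the worst case on each interval is $\sigma = k+1$, $\tau = 20$.

```python
import math

def c1(sigma: float, tau: float) -> float:
    s2 = math.sin(math.pi / 8.0) ** 2
    middle = 2 ** 0.25 * (2.0 / (1.0 + s2)) ** (sigma / 2.0)
    last = math.exp(-(math.sqrt(2.0) - 1.0) / 2.0 * tau) / math.tan(math.pi / 8.0) ** sigma
    return 0.5 * (1.0 + middle + last)

def c2(sigma: float, tau: float) -> float:
    via_pi_6 = 2.0 ** (sigma + 0.5)                       # beta = pi/6
    via_pi_5 = (math.sqrt(1.0 / math.cos(2.0 * math.pi / 5.0))
                / math.sin(math.pi / 5.0) ** sigma)       # beta = pi/5
    last = math.exp(-tau / 6.0) * math.sqrt(3.0) ** sigma
    return 0.5 * (1.0 + min(via_pi_6, via_pi_5) + last)

c1_values = [c1(k + 1.0, 20.0) for k in range(3)]
c2_values = [c2(k + 1.0, 20.0) for k in range(3)]
print(c1_values, c2_values)
```

Note that the minimum in $c_2$ switches from the $\beta = \pi/6$ term to the $\beta = \pi/5$ term between $\sigma = 1$ and $\sigma = 2$.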

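Lastly, note that, for $k \le \sigma \le k+1$, we have

\[ \frac{1}{\tau^{\sigma/2}}\cdot|\tau|^{\sigma - 1/2} = |\tau|^{(\sigma-1)/2} \le \tau^{k/2}, \]

whereas, for $\rho \le 3/2$ and $0 \le \gamma \le 1$,

\[ \frac{|\tau|^{\gamma - 1/2}}{|\ell|^{\gamma}} \le |\tau|^{\frac{\gamma}{2} - \frac{1}{2}}\left(\frac{\tau}{\ell^2}\right)^{\gamma/2} \le 20^{\frac{\gamma}{2} - \frac{1}{2}}\left(\frac{3/2}{4}\right)^{\gamma/2} \le \left(\frac{3}{8}\right)^{1/2}, \]

and so

\[ \frac{1}{\ell^{\sigma}}\cdot|\tau|^{\sigma - 1/2} = \left(\frac{|\tau|}{\ell}\right)^{k}\frac{|\tau|^{(\sigma-k) - 1/2}}{|\ell|^{\sigma-k}} \le \sqrt{\frac{3}{8}}\cdot\left(\frac{|\tau|}{\ell}\right)^{k}. \]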

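Multiplying, and remembering to add (8.50), we obtain that, for $k = 0, 1, 2$, $\sigma \in [0, 1]$ and $|\tau| \ge 20$,

\[ |F_\delta(s+k)| + |F_\delta((1-s)+k)| \le \begin{cases} \kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if $\rho < 3/2$,} \\[2mm] \kappa_{k,1}|\tau|^{k/2} e^{-0.1598|\tau|} & \text{if $\rho \ge 3/2$,} \end{cases} \]

where

\[ \begin{aligned} \kappa_{0,0} &\le (4\cdot 10^{-7} + 1.94511)\cdot 2.51868\cdot\sqrt{3/8} \le 3.001, \\ \kappa_{1,0} &\le (4\cdot 10^{-7} + 3.15692)\cdot 2.53596\cdot\sqrt{3/8} \le 4.903, \\ \kappa_{2,0} &\le (4\cdot 10^{-7} + 5.02186)\cdot 2.5881\cdot\sqrt{3/8} \le 7.96, \end{aligned} \]

and, similarly,

\[ \begin{aligned} \kappa_{0,1} &\le (6\cdot 10^{-5} + 1.30454)\cdot 2.51868 \le 3.286, \\ \kappa_{1,1} &\le (6\cdot 10^{-5} + 1.58361)\cdot 2.53596 \le 4.017, \\ \kappa_{2,1} &\le (6\cdot 10^{-5} + 1.98186)\cdot 2.5881 \le 5.13. \end{aligned} \]

This concludes the proof of Corollary 8.0.2.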
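The six constants $\kappa_{k,0}$, $\kappa_{k,1}$ are plain products of numbers obtained earlier, so the final arithmetic is easy to recheck. The sketch below simply multiplies the quoted values; the list names are illustrative.

```python
import math

C1_BOUNDS = [1.30454, 1.58361, 1.98186]     # bounds on c_{1,sigma,tau}
C2_BOUNDS = [1.94511, 3.15692, 5.02186]     # bounds on c_{2,sigma,tau}
GAMMA_BOUNDS = [2.51868, 2.53596, 2.5881]   # constants in (8.52)

kappa_small_rho = [(4e-7 + C2_BOUNDS[k]) * GAMMA_BOUNDS[k] * math.sqrt(3.0 / 8.0)
                   for k in range(3)]       # kappa_{k,0}, case rho < 3/2
kappa_large_rho = [(6e-5 + C1_BOUNDS[k]) * GAMMA_BOUNDS[k]
                   for k in range(3)]       # kappa_{k,1}, case rho >= 3/2

print(kappa_small_rho, kappa_large_rho)
```

The small summands $4\cdot 10^{-7}$ and $6\cdot 10^{-5}$ are the contributions of (8.50), which is why they barely affect the result.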

Chapter 9

Explicit formulas

An explicit formula is an expression restating a sum such as $S_{\eta,\chi}(\delta/x, x)$ as a sum of the Mellin transform $G_\delta(s)$ over the zeros of the $L$-function $L(s, \chi)$. More specifically, for us, $G_\delta(s)$ is the Mellin transform of $\eta(t)e(\delta t)$ for some smoothing function $\eta$ and some $\delta \in \mathbb{R}$. We want a formula whose error terms are good both for $\delta$ very close or equal to 0 and for $\delta$ farther away from 0. (Indeed, our choice(s) of $\eta$ will be made so that $F_\delta(s)$ decays rapidly in both cases.)

We will be able to base all of our work on a single, general explicit formula, namely, Lemma 9.1.1. This explicit formula has simple error terms given purely in terms of a few norms of the given smoothing function $\eta$. We also give a common framework for estimating the contribution of zeros on the critical strip (Lemmas 9.1.3 and 9.1.4).

The first example we work out is that of the Gaussian smoothing $\eta(t) = e^{-t^2/2}$. We actually do this in part for didactic purposes and in part because of its likely applicability elsewhere; for our applications, we will always use smoothing functions based on $te^{-t^2/2}$ and $t^2e^{-t^2/2}$, generally in combination with something else. Since $\eta(t) = e^{-t^2/2}$ does not vanish at $t = 0$, its Mellin transform has a pole at $s = 0$, something that requires some additional work (Lemma 9.1.2; see also the proof of Lemma 9.1.1).

Other than that, for each function $\eta(t)$, all that has to be done is to bound an integral (from Lemma 9.1.3) and bound a few norms. Still, both for $\eta_*$ and for $\eta_+$, we find a few interesting complications. Since $\eta_+$ is defined in terms of a truncation of a Mellin transform (or, alternatively, in terms of a multiplicative convolution with a Dirichlet kernel, as in (7.4) and (7.6)), bounding the norms of $\eta_+$ and $\eta_+'$ takes a little work. We leave this to Appendix A. The effect of the convolution is then just to delay the decay by a shift, in that a rapidly decaying function $f(\tau)$ will get replaced by $f(\tau - H)$, $H$ a constant.

The smoothing function $\eta_*$ is defined as a multiplicative convolution of $t^2e^{-t^2/2}$ with something else. Given that we have an explicit formula for $t^2e^{-t^2/2}$, we obtain an explicit formula for $\eta_*$ by what amounts to just exchanging the order of a sum and an integral. (We already went over this in the introduction, in (1.40).)


9.1 A general explicit formula

We will prove an explicit formula valid whenever the smoothing $\eta$ and its derivative $\eta'$ satisfy rather mild assumptions: they will be assumed to be $L_2$-integrable and to have strips of definition containing $\{s : 1/2 \le \Re(s) \le 3/2\}$, though any strip of the form $\{s : \epsilon \le \Re(s) \le 1 + \epsilon\}$ would do just as well.

(For explicit formulas with different sets of assumptions, see, e.g., [IK04, §5.5] and [MV07, Ch. 12].)

The main idea in deriving any explicit formula is to start with an expression giving a sum as an integral over a vertical line with an integrand involving a Mellin transform (here, $G_\delta(s)$) and an $L$-function (here, $L(s, \chi)$). We then shift the line of integration to the left. If stronger assumptions were made (as in Exercise 5 in [IK04, §5.5]), we could shift the integral all the way to $\Re(s) = -\infty$; the integral would then disappear, replaced entirely by a sum over zeros (or even, as in the same Exercise 5, by a particularly simple integral). Another possibility is to shift the line only to $\Re(s) = 1/2 + \epsilon$ for some $\epsilon > 0$, but this gives a weaker result, and at any rate the factor $L'(s, \chi)/L(s, \chi)$ can be large and messy to estimate within the critical strip $0 < \Re(s) < 1$.

Instead, we will shift the line to $\Re s = -1/2$. We can do this because the assumptions on $\eta$ and $\eta'$ are enough to continue $G_\delta(s)$ analytically up to there (with a possible pole at $s = 0$). The factor $L'(s, \chi)/L(s, \chi)$ is easy to estimate for $\Re s < 0$ and $s = 0$ (by the functional equation), and the part of the integral on $\Re s = -1/2$ coming from $G_\delta(s)$ can be estimated easily using the fact that the Mellin transform is an isometry.

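Lemma 9.1.1. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be in $C^1$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Assume that $\eta(t)$ and $\eta'(t)$ are in $\ell_2$ (with respect to the measure $dt$) and that $\eta(t)t^{\sigma-1}$ and $\eta'(t)t^{\sigma-1}$ are in $\ell_1$ (again with respect to $dt$) for all $\sigma$ in an open interval containing $[1/2, 3/2]$.

Then

\[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\, e\!\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)x - \sum_{\rho} G_\delta(\rho)x^{\rho} - R + O^*\left((\log q + 6.01)\cdot(|\eta'|_2 + 2\pi|\delta||\eta|_2)\right)x^{-1/2}, \tag{9.1} \]

where

\[ I_{q=1} = \begin{cases} 1 & \text{if $q = 1$,} \\ 0 & \text{if $q \ne 1$,} \end{cases} \qquad R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1, \chi)}{L(1, \chi)}\right) + O^*(c_0) \tag{9.2} \]

for $q > 1$, $R = \eta(0)\log 2\pi$ for $q = 1$, and

\[ c_0 = \frac{2}{3}\, O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + |\eta(t)\sqrt{t}|_1\right)\right). \tag{9.3} \]

The norms $|\eta|_2$, $|\eta'|_2$, $|\eta'(t)/\sqrt{t}|_1$, etc., are taken with respect to the usual measure $dt$. The sum $\sum_\rho$ is a sum over all non-trivial zeros $\rho$ of $L(s, \chi)$.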

Proof. Since (a) $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval containing $3/2$ and (b) $\eta(t)e(\delta t)$ has bounded variation (since $\eta, \eta' \in \ell_1$, implying that the derivative of $\eta(t)e(\delta t)$ is also in $\ell_1$), the Mellin inversion formula (as in, e.g., [IK04, 4.106]) holds:

\[ \eta(n/x)e(\delta n/x) = \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} G_\delta(s)x^s n^{-s}\, ds. \]

Since $G_\delta(s)$ is bounded for $\Re(s) = 3/2$ (by $\eta(t)t^{3/2-1} \in \ell_1$) and $\sum_n \Lambda(n)n^{-3/2}$ is bounded as well, we can change the order of summation and integration as follows:

\[ \begin{aligned} \sum_{n=1}^{\infty} \Lambda(n)\chi(n)e(\delta n/x)\eta(n/x) &= \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\cdot\frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} G_\delta(s)x^s n^{-s}\, ds \\ &= \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} \sum_{n=1}^{\infty} \Lambda(n)\chi(n)G_\delta(s)x^s n^{-s}\, ds \\ &= \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} -\frac{L'(s, \chi)}{L(s, \chi)}G_\delta(s)x^s\, ds. \end{aligned} \tag{9.4} \]

(This is the way the procedure always starts: see, for instance, [HL22, Lemma 1] or, to look at a recent standard reference, [MV07, p. 144]. We are being very scrupulous about integration because we are working with general $\eta$.)

The first question we should ask ourselves is: up to where can we extend $G_\delta(s)$? Since $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval $I$ containing $[1/2, 3/2]$, the transform $G_\delta(s)$ is defined for $\Re(s)$ in the same interval $I$. However, we also know that the transformation rule $M(tf'(t))(s) = -s\cdot Mf(s)$ (see (2.10); by integration by parts) is valid when $s$ is in the holomorphy strip for both $M(tf'(t))$ and $Mf$. In our case ($f(t) = \eta(t)e(\delta t)$), this happens when $\Re(s) \in (I - 1)\cap I$ (so that both sides of the equation in the rule are defined). Hence $s\cdot G_\delta(s)$ (which equals $s\cdot Mf(s)$) can be analytically continued to $\Re(s)$ in $(I - 1)\cup I$, which is an open interval containing $[-1/2, 3/2]$. This implies immediately that $G_\delta(s)$ can be analytically continued to the same region, with a possible pole at $s = 0$.

When does $G_\delta(s)$ have a pole at $s = 0$? This happens when $sG_\delta(s)$ is non-zero at $s = 0$, i.e., when $M(tf'(t))(0) \ne 0$ for $f(t) = \eta(t)e(\delta t)$. Now

\[ M(tf'(t))(0) = \int_0^{\infty} f'(t)\, dt = \lim_{t\to\infty} f(t) - f(0). \]

We already know that $f'(t) = (d/dt)(\eta(t)e(\delta t))$ is in $\ell_1$. Hence $\lim_{t\to\infty} f(t)$ exists, and must be 0 because $f$ is in $\ell_1$. Hence $-M(tf'(t))(0) = f(0) = \eta(0)$.

Let us look at the next term in the Laurent expansion of $G_\delta(s)$ at $s = 0$. It is

\[ \lim_{s\to 0}\frac{sG_\delta(s) - \eta(0)}{s} = \lim_{s\to 0}\frac{-M(tf'(t))(s) - f(0)}{s} = -\lim_{s\to 0}\frac{1}{s}\int_0^{\infty} f'(t)(t^s - 1)\, dt = -\int_0^{\infty} f'(t)\lim_{s\to 0}\frac{t^s - 1}{s}\, dt = -\int_0^{\infty} f'(t)\log t\, dt. \]

Here we were able to exchange the limit and the integral because $f'(t)t^{\sigma}$ is in $\ell_1$ for $\sigma$ in a neighborhood of 0; in turn, this is true because $f'(t) = (\eta'(t) + 2\pi i\delta\eta(t))e(\delta t)$ and $\eta'(t)t^{\sigma}$ and $\eta(t)t^{\sigma}$ are both in $\ell_1$ for $\sigma$ in a neighborhood of 0. In fact, we will use the easy bounds $|\eta(t)\log t|_1 \le (2/3)(|\eta(t)t^{-1/2}|_1 + |\eta(t)t^{1/2}|_1)$, $|\eta'(t)\log t|_1 \le (2/3)(|\eta'(t)t^{-1/2}|_1 + |\eta'(t)t^{1/2}|_1)$, resulting from the inequality

\[ |\log t| \le \frac{2}{3}\left(t^{-\frac{1}{2}} + t^{\frac{1}{2}}\right), \tag{9.5} \]

valid for all $t > 0$.

We conclude that the Laurent expansion of $G_\delta(s)$ at $s = 0$ is

\[ G_\delta(s) = \frac{\eta(0)}{s} + c_0 + c_1 s + \dots, \tag{9.6} \]

where

\[ c_0 = O^*(|f'(t)\log t|_1) = \frac{2}{3}\, O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + |\eta(t)\sqrt{t}|_1\right)\right). \]
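Inequality (9.5), in the form $|\log t| \le \frac{2}{3}(t^{-1/2} + t^{1/2})$, can be checked numerically. Writing $t = e^{2u}$ turns the ratio of the two sides into $\frac{3}{2}\cdot u/\cosh u$, whose maximum is what matters. The grid below is an illustrative choice, and the check is a sanity test rather than a proof.

```python
import math

def ratio(t: float) -> float:
    """|log t| divided by (2/3) * (t^(-1/2) + t^(1/2)); (9.5) says this is <= 1."""
    return abs(math.log(t)) / ((2.0 / 3.0) * (t ** -0.5 + t ** 0.5))

# Sample t = e^(2u) for u in [-20, 20]; by symmetry only |u| matters.
grid = [math.exp(2.0 * (-20.0 + 0.001 * k)) for k in range(40001)]
max_ratio = max(ratio(t) for t in grid)
print(max_ratio)
```

The maximum is attained near $u \approx 1.2$, i.e. $t \approx e^{2.4}$ (and at the reciprocal point), and it is only slightly below 1, so the constant $2/3$ is close to optimal.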

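We shift the line of integration in (9.4) to $\Re(s) = -1/2$. We obtain

\[ \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} -\frac{L'(s, \chi)}{L(s, \chi)}G_\delta(s)x^s\, ds = I_{q=1}G_\delta(1)x - \sum_{\rho} G_\delta(\rho)x^{\rho} - R - \frac{1}{2\pi i}\int_{-1/2 - i\infty}^{-1/2 + i\infty} \frac{L'(s, \chi)}{L(s, \chi)}G_\delta(s)x^s\, ds, \tag{9.7} \]

where

\[ R = \operatorname{Res}_{s=0}\frac{L'(s, \chi)}{L(s, \chi)}G_\delta(s). \]

Of course,

\[ G_\delta(1) = M(\eta(t)e(\delta t))(1) = \int_0^{\infty} \eta(t)e(\delta t)\, dt = \widehat{\eta}(-\delta). \]

Let us work out the Laurent expansion of $L'(s, \chi)/L(s, \chi)$ at $s = 0$. By the functional equation (as in, e.g., [IK04, Thm. 4.15]),

\[ \frac{L'(s, \chi)}{L(s, \chi)} = \log\frac{\pi}{q} - \frac{1}{2}\psi\left(\frac{s + \kappa}{2}\right) - \frac{1}{2}\psi\left(\frac{1 - s + \kappa}{2}\right) - \frac{L'(1-s, \chi)}{L(1-s, \chi)}, \tag{9.8} \]

where $\psi(s) = \Gamma'(s)/\Gamma(s)$ and

\[ \kappa = \begin{cases} 0 & \text{if $\chi(-1) = 1$,} \\ 1 & \text{if $\chi(-1) = -1$.} \end{cases} \]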

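By $\psi(1 - x) - \psi(x) = \pi\cot\pi x$ (immediate from $\Gamma(s)\Gamma(1-s) = \pi/\sin\pi s$) and $\psi(s) + \psi(s + 1/2) = 2(\psi(2s) - \log 2)$ (Legendre; [AS64, (6.3.8)]),

\[ -\frac{1}{2}\left(\psi\left(\frac{s + \kappa}{2}\right) + \psi\left(\frac{1 - s + \kappa}{2}\right)\right) = -\psi(1 - s) + \log 2 + \frac{\pi}{2}\cot\frac{\pi(s + \kappa)}{2}. \tag{9.9} \]

Hence, unless $q = 1$, the Laurent expansion of $L'(s, \chi)/L(s, \chi)$ at $s = 0$ is

\[ \frac{1 - \kappa}{s} + \left(\log\frac{2\pi}{q} - \psi(1) - \frac{L'(1, \chi)}{L(1, \chi)}\right) + a_1 s + a_2 s^2 + \dots \]

Here $\psi(1) = -\gamma$, where $\gamma$ is Euler's constant [AS64, (6.3.2)].

There is a special case for $q = 1$ due to the pole of $\zeta(s)$ at $s = 1$. We know that $\zeta'(0)/\zeta(0) = \log 2\pi$ (see, e.g., [MV07, p. 331]).

From this and (9.6), we conclude that, if $\eta(0) = 0$, then

\[ R = \begin{cases} c_0 & \text{if $q > 1$ and $\chi(-1) = 1$,} \\ 0 & \text{otherwise,} \end{cases} \]

where $c_0 = O^*(|\eta'(t)\log t|_1 + 2\pi|\delta||\eta(t)\log t|_1)$. If $\eta(0) \ne 0$, then

\[ R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1, \chi)}{L(1, \chi)}\right) + \begin{cases} c_0 & \text{if $\chi(-1) = 1$,} \\ 0 & \text{otherwise} \end{cases} \]

for $q > 1$, and

\[ R = \eta(0)\log 2\pi \]

for $q = 1$.

It is time to estimate the integral on the right side of (9.7). For that, we will need to estimate $L'(s, \chi)/L(s, \chi)$ for $\Re(s) = -1/2$ using (9.8) and (9.9).

If $\Re(z) = 3/2$, then $|t^2 + z^2| \ge 9/4$ for all real $t$. Hence, by [OLBC10, (5.9.15)] and [GR94, (3.411.1)],

\[ \begin{aligned} \psi(z) &= \log z - \frac{1}{2z} - 2\int_0^{\infty}\frac{t\, dt}{(t^2 + z^2)(e^{2\pi t} - 1)} \\ &= \log z - \frac{1}{2z} + 2\cdot O^*\left(\int_0^{\infty}\frac{t\, dt}{\frac{9}{4}(e^{2\pi t} - 1)}\right) \\ &= \log z - \frac{1}{2z} + \frac{8}{9}\, O^*\left(\int_0^{\infty}\frac{t\, dt}{e^{2\pi t} - 1}\right) \\ &= \log z - \frac{1}{2z} + \frac{8}{9}\cdot O^*\left(\frac{1}{(2\pi)^2}\Gamma(2)\zeta(2)\right) \\ &= \log z - \frac{1}{2z} + O^*\left(\frac{1}{27}\right) = \log z + O^*\left(\frac{10}{27}\right). \end{aligned} \tag{9.10} \]

Thus, in particular, $\psi(1 - s) = \log(3/2 - i\tau) + O^*(10/27)$, where we write $s = -1/2 + i\tau$. Now

\[ \left|\cot\frac{\pi(s + \kappa)}{2}\right| = \left|\frac{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} + e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} - e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}\right| = 1. \]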

Since $\Re(s) = -1/2$, a comparison of Dirichlet series gives

\[ \left|\frac{L'(1-s, \chi)}{L(1-s, \chi)}\right| \le \frac{|\zeta'(3/2)|}{|\zeta(3/2)|} \le 1.50524, \tag{9.11} \]

where $\zeta'(3/2)$ and $\zeta(3/2)$ can be evaluated by Euler-Maclaurin. Therefore, (9.8) and (9.9) give us that, for $s = -1/2 + i\tau$,

\[ \begin{aligned} \left|\frac{L'(s, \chi)}{L(s, \chi)}\right| &\le \left|\log\frac{q}{\pi}\right| + \log\left|\frac{3}{2} + i\tau\right| + \frac{10}{27} + \log 2 + \frac{\pi}{2} + 1.50524 \\ &\le \left|\log\frac{q}{\pi}\right| + \frac{1}{2}\log\left(\tau^2 + \frac{9}{4}\right) + 4.1396. \end{aligned} \tag{9.12} \]

Recall that we must bound the integral on the right side of (9.7). The absolute value of the integral is at most $x^{-1/2}$ times

\[ \frac{1}{2\pi}\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} \left|\frac{L'(s, \chi)}{L(s, \chi)}G_\delta(s)\right| |ds|. \tag{9.13} \]

By Cauchy-Schwarz, this is at most

\[ \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} \left|\frac{L'(s, \chi)}{L(s, \chi)}\cdot\frac{1}{s}\right|^2 |ds|}\cdot\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} |G_\delta(s)s|^2\, |ds|}. \]

By (9.12),

\[ \begin{aligned} \sqrt{\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} \left|\frac{L'(s, \chi)}{L(s, \chi)}\cdot\frac{1}{s}\right|^2 |ds|} &\le \sqrt{\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} \left|\frac{\log q}{s}\right|^2 |ds|} + \sqrt{\int_{-\infty}^{\infty}\frac{\left|\frac{1}{2}\log\left(\tau^2 + \frac{9}{4}\right) + 4.1396 + \log\pi\right|^2}{\frac{1}{4} + \tau^2}\, d\tau} \\ &\le \sqrt{2\pi}\log q + \sqrt{226.844}, \end{aligned} \]

where we compute the last integral numerically.¹
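The value $226.844$ can be reproduced approximately without interval arithmetic. The sketch below is only a plain floating-point approximation, not the rigorous VNODE-LP computation cited in the footnote. It uses the substitution $\tau = \frac{1}{2}\tan\theta$, under which $d\tau/(\frac{1}{4} + \tau^2) = 2\,d\theta$, and a midpoint rule on $(-\pi/2, \pi/2)$; the substitution, the step count and the function names are choices made here for illustration.

```python
import math

def g(tau: float) -> float:
    """The bound (9.12) with |log(q/pi)| replaced by its q-free part log(pi)."""
    return 0.5 * math.log(tau * tau + 2.25) + 4.1396 + math.log(math.pi)

def weighted_square_integral(n: int = 200000) -> float:
    """Midpoint rule for the integral of g(tau)^2 / (1/4 + tau^2) over the real line."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        tau = 0.5 * math.tan(theta)
        total += 2.0 * g(tau) ** 2      # factor 2 from d tau / (1/4 + tau^2) = 2 d theta
    return total * h

approx = weighted_square_integral()
print(approx, math.sqrt(approx / (2.0 * math.pi)))
```

The second printed number is the quantity that, after rounding up, becomes the constant $6.01$ in (9.1).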

Again, we use the fact that, by (2.10), $sG_\delta(s)$ is the Mellin transform of

\[ -t\frac{d(e(\delta t)\eta(t))}{dt} = -2\pi i\delta t e(\delta t)\eta(t) - te(\delta t)\eta'(t). \tag{9.14} \]

Hence, by Plancherel (as in (2.6)),

\[ \begin{aligned} \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty} |G_\delta(s)s|^2\, |ds|} &= \sqrt{\int_0^{\infty} |-2\pi i\delta te(\delta t)\eta(t) - te(\delta t)\eta'(t)|^2 t^{-2}\, dt} \\ &\le 2\pi|\delta|\sqrt{\int_0^{\infty} |\eta(t)|^2\, dt} + \sqrt{\int_0^{\infty} |\eta'(t)|^2\, dt}. \end{aligned} \tag{9.15} \]

Thus, (9.13) is at most

\[ \left(\log q + \sqrt{\frac{226.844}{2\pi}}\right)\cdot(|\eta'|_2 + 2\pi|\delta||\eta|_2). \]

¹ By a rigorous integration from $\tau = -100000$ to $\tau = 100000$ using VNODE-LP [Ned06], which runs on the PROFIL/BIAS interval arithmetic package [Knu99].

Lemma 9.1.1 leaves us with three tasks: bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with small imaginary part, bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with large imaginary part, and bounding $L'(1, \chi)/L(1, \chi)$. Let us start with the last task. In a narrow sense, it is optional: in the applications we actually need (Thm. 7.1.2, Cor. 7.1.3 and Thm. 7.1.4), we will have $\eta(0) = 0$, thus making the term $L'(1, \chi)/L(1, \chi)$ disappear. It is also very easy and can be dealt with quickly.

Since we will be using a finite GRH check in all later applications, we might as well use it here.

Lemma 9.1.2. Let $\chi$ be a primitive character mod $q$, $q > 1$. Assume that all non-trivial zeroes $\rho = \sigma + it$ of $L(s, \chi)$ with $|t| \le 5/8$ satisfy $\Re(\rho) = 1/2$. Then

\[ \left|\frac{L'(1, \chi)}{L(1, \chi)}\right| \le \frac{5}{2}\log M(q) + c, \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$ and

\[ c = 5\log\frac{2\sqrt{3}}{\zeta(9/4)/\zeta(9/8)} = 15.07016\ldots \]
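The constant $c$ in Lemma 9.1.2 only involves two values of the Riemann zeta function, so it can be recomputed with a few lines of code. The sketch below evaluates $\zeta(s)$ for real $s > 1$ by a truncated Euler-Maclaurin formula in ordinary floating point; the truncation point and the helper name `zeta_real` are choices made here, and the result is a numerical approximation rather than a rigorous enclosure.

```python
import math

def zeta_real(s: float, n: int = 1000) -> float:
    """zeta(s) for real s > 1: partial sum up to n-1 plus Euler-Maclaurin tail terms."""
    head = sum(k ** (-s) for k in range(1, n))
    tail = (n ** (1.0 - s) / (s - 1.0)      # integral of x^(-s) from n to infinity
            + 0.5 * n ** (-s)               # endpoint correction
            + s * n ** (-s - 1.0) / 12.0)   # first Bernoulli correction
    return head + tail

c_const = 5.0 * math.log(2.0 * math.sqrt(3.0) * zeta_real(9.0 / 8.0) / zeta_real(9.0 / 4.0))
print(zeta_real(9.0 / 8.0), zeta_real(9.0 / 4.0), c_const)
```

As a side check, `zeta_real(2.0)` should agree with $\pi^2/6$ to many digits, which gives some confidence in the tail terms.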

Proof. By a lemma of Landau's (see, e.g., [MV07, Lemma 6.3], where the constants are easily made explicit) based on the Borel-Carathéodory Lemma (as in [MV07, Lemma 6.2]), any function $f$ analytic and zero-free on a disc $C_{s_0,R} = \{s : |s - s_0| \le R\}$ of radius $R > 0$ around $s_0$ satisfies

\[ \frac{f'(s)}{f(s)} = O^*\left(\frac{2R\log(M/|f(s_0)|)}{(R - r)^2}\right) \tag{9.16} \]

for all $s$ with $|s - s_0| \le r$, where $0 < r < R$ and $M$ is the maximum of $|f(z)|$ on $C_{s_0,R}$. Assuming $L(s, \chi)$ has no non-trivial zeros off the critical line with $|\Im(s)| \le H$, where $H > 1/2$, we set $s_0 = 1/2 + H$, $r = H - 1/2$, and let $R \to H^-$. We obtain

\[ \frac{L'(1, \chi)}{L(1, \chi)} = O^*\left(8H\log\frac{\max_{s\in C_{s_0,H}}|L(s, \chi)|}{|L(s_0, \chi)|}\right). \tag{9.17} \]

Now

\[ |L(s_0, \chi)| \ge \prod_p (1 + p^{-s_0})^{-1} = \prod_p \frac{(1 - p^{-2s_0})^{-1}}{(1 - p^{-s_0})^{-1}} = \frac{\zeta(2s_0)}{\zeta(s_0)}. \]

Since $s_0 = 1/2 + H$, $C_{s_0,H}$ is contained in $\{s \in \mathbb{C} : \Re(s) > 1/2\}$ for any value of $H$. We choose (somewhat arbitrarily) $H = 5/8$.

By partial summation, for $s = \sigma + it$ with $1/2 \le \sigma < 1$ and any $N \in \mathbb{Z}^+$,

\[ \begin{aligned} L(s, \chi) &= \sum_{n\le N}\chi(n)n^{-s} - \left(\sum_{m\le N}\chi(m)\right)(N+1)^{-s} + \sum_{n\ge N+1}\left(\sum_{m\le n}\chi(m)\right)\left(n^{-s} - (n+1)^{-s}\right) \\ &= O^*\left(\frac{N^{1 - 1/2}}{1 - 1/2} + N^{1-\sigma} + M(q)N^{-\sigma}\right), \end{aligned} \tag{9.18} \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$. We set $N = M(q)/3$, and obtain

\[ |L(s, \chi)| \le 2M(q)N^{-1/2} = 2\sqrt{3}\sqrt{M(q)}. \tag{9.19} \]

We put this into (9.17) and are done.

Let $M(q)$ be as in the statement of Lem. 9.1.2. Since the sum of $\chi(n)$ ($\chi$ mod $q$, $q > 1$) over any interval of length $q$ is 0, it is easy to see that $M(q) \le q/2$. We also have the following explicit version of the Pólya-Vinogradov inequality:

\[ M(q) \le \begin{cases} \frac{2}{\pi^2}\sqrt{q}\log q + \frac{4}{\pi^2}\sqrt{q}\log\log q + \frac{3}{2}\sqrt{q} & \text{if $\chi(-1) = 1$,} \\ \frac{1}{2\pi}\sqrt{q}\log q + \frac{1}{\pi}\sqrt{q}\log\log q + \sqrt{q} & \text{if $\chi(-1) = -1$.} \end{cases} \tag{9.20} \]

Taken together with $M(q) \le q/2$, this implies that

\[ M(q) \le q^{4/5} \tag{9.21} \]

for all $q \ge 1$, and also that

\[ M(q) \le 2q^{3/5} \tag{9.22} \]

for all $q \ge 1$.

Notice, lastly, that

\[ \left|\log\frac{2\pi}{q} + \gamma\right| \le \log q + \log\frac{e^{\gamma}\cdot 2\pi}{3^2} \]

for all $q \ge 3$. (There are no primitive characters modulo 2, so we can omit $q = 2$.)

We conclude that, for $\chi$ primitive and non-trivial,

\[ \left|\log\frac{2\pi}{q} + \gamma - \frac{L'(1, \chi)}{L(1, \chi)}\right| \le \log\frac{e^{\gamma}\cdot 2\pi}{3^2} + \log q + \frac{5}{2}\log q^{4/5} + 15.07017 \le 3\log q + 15.289. \]

Obviously, 15.289 is more than $\log 2\pi$, the bound for $\chi$ trivial. Hence, the absolute value of the quantity $R$ in the statement of Lemma 9.1.1 is at most

\[ |\eta(0)|(3\log q + 15.289) + |c_0| \tag{9.23} \]

for all primitive $\chi$.

It now remains to bound the sum $\sum_\rho G_\delta(\rho)x^{\rho}$ in (9.1). Clearly

\[ \left|\sum_\rho G_\delta(\rho)x^{\rho}\right| \le \sum_\rho |G_\delta(\rho)|\cdot x^{\Re(\rho)}. \]

Recall that these are sums over the non-trivial zeros $\rho$ of $L(s, \chi)$.

We first prove a general lemma on sums of values of functions on the non-trivial zeros of $L(s, \chi)$. This is little more than partial summation, given a (classical) bound for the number of zeroes $N(T, \chi)$ of $L(s, \chi)$ with $|\Im(s)| \le T$. The error term becomes particularly simple if $f$ is real-valued and decreasing; the statement is then practically identical to that of [Leh66, Lemma 1] (for $\chi$ principal), except for the fact that the error term is improved here.

Lemma 9.1.3. Let $f:\mathbb{R}^+ \to \mathbb{C}$ be piecewise $C^1$. Assume $\lim_{t\to\infty} f(t)\, t\log t = 0$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$; let $\rho$ denote the non-trivial zeros of $L(s,\chi)$. Then, for any $y \ge 1$,
\[ \sum_{\substack{\rho \text{ non-trivial}\\ \Im(\rho) > y}} f(\Im(\rho)) = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT + \frac{1}{2}O^*\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.24} \]
where
\[ g_\chi(T) = 0.5\log qT + 17.7. \tag{9.25} \]
If $f$ is real-valued and decreasing on $[y,\infty)$, the second line of (9.24) equals
\[ O^*\left(\frac{1}{4}\int_y^\infty \frac{f(T)}{T}\,dT\right). \]

Proof. Write $N(T,\chi)$ for the number of non-trivial zeros of $L(s,\chi)$ satisfying $|\Im(s)| \le T$. Write $N^+(T,\chi)$ for the number of (necessarily non-trivial) zeros of $L(s,\chi)$ with $0 < \Im(s) \le T$. Then, for any $f:\mathbb{R}^+\to\mathbb{C}$ with $f$ piecewise differentiable and $\lim_{t\to\infty} f(t)N(T,\chi) = 0$,
\[ \sum_{\rho:\,\Im(\rho)>y} f(\Im(\rho)) = \int_y^\infty f(T)\,dN^+(T,\chi) = -\int_y^\infty f'(T)\left(N^+(T,\chi) - N^+(y,\chi)\right)dT = -\frac{1}{2}\int_y^\infty f'(T)\left(N(T,\chi) - N(y,\chi)\right)dT. \]
Now, by [Ros41, Thms. 17–19] and [McC84a, Thm. 2.1] (see also [Tru, Thm. 1]),
\[ N(T,\chi) = \frac{T}{\pi}\log\frac{qT}{2\pi e} + O^*\left(g_\chi(T)\right) \tag{9.26} \]
for $T \ge 1$, where $g_\chi(T)$ is as in (9.25). (This is a classical formula; the references serve to prove the explicit form (9.25) for the error term $g_\chi(T)$.)

Thus, for $y \ge 1$,
\[ \sum_{\rho:\,\Im(\rho)>y} f(\Im(\rho)) = -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT + \frac{1}{2}O^*\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right). \tag{9.27} \]
Here
\[ -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT. \tag{9.28} \]
If $f$ is real-valued and decreasing (and so, by $\lim_{t\to\infty} f(t) = 0$, non-negative),
\[ |f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT = f(y)g_\chi(y) - \int_y^\infty f'(T)g_\chi(T)\,dT = 0.5\int_y^\infty \frac{f(T)}{T}\,dT, \]
since $g_\chi'(T) \le 0.5/T$ for all $T \ge T_0$.
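The integration by parts behind (9.28) can be checked numerically for a concrete test function. The sketch below takes $f(T) = e^{-T}$ (an arbitrary choice made here, not one used in the text), truncates the integrals at a finite cutoff, and compares the two sides of (9.28) with a composite Simpson rule. The helper names `simpson` and `both_sides` are hypothetical.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def both_sides(q, y, cutoff=60.0):
    """Left and right sides of (9.28) for f(T) = exp(-T), truncated at y + cutoff."""
    f = lambda T: math.exp(-T)
    fprime = lambda T: -math.exp(-T)
    # main term of the zero-counting formula (9.26)
    main = lambda T: (T / math.pi) * math.log(q * T / (2 * math.pi * math.e))
    lhs = -0.5 * simpson(lambda T: fprime(T) * (main(T) - main(y)), y, y + cutoff)
    rhs = simpson(lambda T: f(T) * math.log(q * T / (2 * math.pi)),
                  y, y + cutoff) / (2 * math.pi)
    return lhs, rhs
```

The agreement reflects only that the derivative of $\frac{T}{\pi}\log\frac{qT}{2\pi e}$ is $\frac{1}{\pi}\log\frac{qT}{2\pi}$; the truncation is harmless here because $e^{-T}$ is negligible at the cutoff.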

Let us bound the part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to $\rho$ with bounded $|\Im(\rho)|$. The bound we will give is proportional to $\sqrt{T_0}\log qT_0$, whereas a very naive approach (based on the trivial bound $|G_\delta(\sigma+i\tau)| \le |G_0(\sigma)|$) would give a bound proportional to $T_0\log qT_0$.

We could obtain a bound proportional to $\sqrt{T_0}\log qT_0$ for $\eta(t) = t^k e^{-t^2/2}$ by using Theorem 8.0.1. Instead, we will give a bound of that same quality, valid for $\eta$ essentially arbitrary, simply by using the fact that the Mellin transform is an isometry (preceded by an application of Cauchy-Schwarz).

Lemma 9.1.4. Let $\eta:\mathbb{R}_0^+\to\mathbb{R}$ be such that both $\eta(t)$ and $(\log t)\eta(t)$ lie in $L_1\cap L_2$ and $\eta(t)/\sqrt{t}$ lies in $L_1$ (with respect to $dt$). Let $\delta\in\mathbb{R}$. Let $G_\delta(s)$ be the Mellin transform of $\eta(t)e(\delta t)$.

Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Let $T_0 \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line. Then
\[ \sum_{\substack{\rho \text{ non-trivial}\\ |\Im(\rho)|\le T_0}} |G_\delta(\rho)| \]
is at most
\[ \left(|\eta|_2 + |\eta\cdot\log|_2\right)\sqrt{T_0}\log qT_0 + \left(17.21\,|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})\,|\eta|_2\right)\sqrt{T_0} + \left|\eta(t)/\sqrt{t}\right|_1\cdot(1.32\log q + 34.5). \tag{9.29} \]


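Proof. For $s = 1/2 + i\tau$, we have the trivial bound
\[ |G_\delta(s)| \le \int_0^\infty |\eta(t)|\,t^{1/2}\,\frac{dt}{t} = \left|\eta(t)/\sqrt{t}\right|_1, \tag{9.30} \]
where $F_\delta$ is as in (9.47). We also have the trivial bound
\[ |G_\delta'(s)| = \left|\int_0^\infty (\log t)\eta(t)t^s\,\frac{dt}{t}\right| \le \int_0^\infty |(\log t)\eta(t)|\,t^\sigma\,\frac{dt}{t} = \left|(\log t)\eta(t)t^{\sigma-1}\right|_1 \tag{9.31} \]
for $s = \sigma + i\tau$.

Let us start by bounding the contribution of very low-lying zeros ($|\Im(\rho)|\le 1$). By (9.26) and (9.25),
\[ N(1,\chi) = \frac{1}{\pi}\log\frac{q}{2\pi e} + O^*\left(0.5\log q + 17.7\right) = O^*(0.819\log q + 16.8). \]
Therefore,
\[ \sum_{\substack{\rho \text{ non-trivial}\\ |\Im(\rho)|\le 1}} |G_\delta(\rho)| \le \left|\eta(t)t^{-1/2}\right|_1\cdot(0.819\log q + 16.8). \]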

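Let us now consider zeros $\rho$ with $|\Im(\rho)| > 1$. Apply Lemma 9.1.3 with $y = 1$ and
\[ f(t) = \begin{cases} |G_\delta(1/2 + it)| & \text{if } t\le T_0,\\ 0 & \text{if } t > T_0. \end{cases} \]
This gives us that
\[ \sum_{\rho:\,1<|\Im(\rho)|\le T_0} f(\Im(\rho)) = \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + O^*\left(|f(1)|g_\chi(1) + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.32} \]
where we are using the fact that $f(\sigma+i\tau) = f(\sigma-i\tau)$ (because $\eta$ is real-valued). By Cauchy-Schwarz,
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\frac{1}{\pi}\int_1^{T_0}|f(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT}. \]
Now
\[ \frac{1}{\pi}\int_1^{T_0}|f(T)|^2\,dT \le \frac{1}{2\pi}\int_{-\infty}^{\infty}\left|G_\delta\left(\frac{1}{2}+iT\right)\right|^2 dT \le \int_0^\infty |e(\delta t)\eta(t)|^2\,dt = |\eta|_2^2 \]
by Plancherel (as in (2.6)). We also have
\[ \int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT \le \frac{2\pi}{q}\int_0^{\frac{qT_0}{2\pi}}(\log t)^2\,dt \le \left(\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1\right)\cdot T_0. \]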


Hence
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1}\cdot|\eta|_2\sqrt{T_0}. \]
Again by Cauchy-Schwarz,
\[ \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}|g_\chi(T)|^2\,dT}. \]
Since $|f'(T)| = |G_\delta'(1/2+iT)|$ and $(M\eta)'(s)$ is the Mellin transform of $\log(t)\cdot e(\delta t)\eta(t)$ (by (2.10)),
\[ \frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2\,dT = |\eta(t)\log(t)|_2. \]
Much as before,
\[ \int_1^{T_0}|g_\chi(T)|^2\,dT \le \int_0^{T_0}(0.5\log qT + 17.7)^2\,dT = \left(0.25(\log qT_0)^2 + 17.2(\log qT_0) + 296.09\right)T_0. \]
Summing, we obtain
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \left(\left(\log\frac{qT_0}{2\pi e} + \frac{1}{2}\right)|\eta|_2 + \left(\frac{\log qT_0}{2} + 17.21\right)|\eta(t)(\log t)|_2\right)\sqrt{T_0}. \]
Finally, by (9.30) and (9.25),
\[ |f(1)|g_\chi(1) \le \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \]
By (9.32) and the assumption that all non-trivial zeros with $|\Im(\rho)|\le T_0$ lie on the line $\Re(s) = 1/2$, we conclude that
\[ \sum_{\substack{\rho \text{ non-trivial}\\ 1<|\Im(\rho)|\le T_0}} |G_\delta(\rho)| \le \left(|\eta|_2 + |\eta\cdot\log|_2\right)\sqrt{T_0}\log qT_0 + \left(17.21\,|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})\,|\eta|_2\right)\sqrt{T_0} + \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \]
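The two elementary integrals used in this proof have closed forms that are easy to confirm numerically. The sketch below checks $\int_0^X (\log t)^2\,dt = X\left((\log (X/e))^2 + 1\right)$ and $\int_0^{T_0}(0.5\log qT + 17.7)^2\,dT = (0.25(\log qT_0)^2 + 17.2\log qT_0 + 296.09)T_0$, and also that the square root of the latter coefficient is at most $0.5\log qT_0 + 17.21$ when $qT_0 \ge 1$. The substitution $t = e^u$ and all helper names are choices made here for illustration.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def log_sq_closed(X):
    """Closed form of the integral of (log t)^2 over (0, X]."""
    return X * ((math.log(X) - 1) ** 2 + 1)

def log_sq_numeric(X):
    # substitute t = exp(u), so the endpoint singularity at t = 0 disappears
    return simpson(lambda u: u * u * math.exp(u), -60.0, math.log(X))

def g_sq_closed(q, T0):
    """Closed form of the integral of g_chi(T)^2 over (0, T0], with g_chi as in (9.25)."""
    L = math.log(q * T0)
    return (0.25 * L * L + 17.2 * L + 296.09) * T0

def g_sq_numeric(q, T0):
    # same substitution T = exp(u)
    return simpson(lambda u: (0.5 * (math.log(q) + u) + 17.7) ** 2 * math.exp(u),
                   -60.0, math.log(T0))

def sqrt_bound_holds(q, T0):
    """Check sqrt(0.25 L^2 + 17.2 L + 296.09) <= 0.5 L + 17.21 with L = log(q*T0)."""
    L = math.log(q * T0)
    return math.sqrt(0.25 * L * L + 17.2 * L + 296.09) <= 0.5 * L + 17.21
```

The last check explains where the constant $17.21$ in (9.29) comes from: $(0.5L + 17.21)^2$ exceeds $0.25L^2 + 17.2L + 296.09$ by $0.01L + 0.0941$.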

All that remains is to bound the contribution to $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to all zeroes $\rho$ with $|\Im(\rho)| > T_0$. This we will do by another application of Lemma 9.1.3, combined with bounds on $G_\delta(\rho)$ for $\Im(\rho)$ large. This is the only part that will require us to take a look at the actual smoothing function $\eta$ we are working with; it is at this point, not before, that we actually have to look at each of our options for $\eta$ one by one.

9.2 Sums and decay for the Gaussian

It is now time to derive our bounds for the Gaussian smoothing. As we were saying, there is really only one thing left to do, namely, an estimate for the sum $\sum_\rho |F_\delta(\rho)|$ over all zeros $\rho$ with $|\Im(\rho)| > T_0$.

Lemma 9.2.1. Let $\eta_\heartsuit(t) = e^{-t^2/2}$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ satisfy $\Re(s) = 1/2$. Assume that $T_0 \ge 50$.

Write $F_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho:\,|\Im(\rho)|>T_0} |F_\delta(\rho)| \le \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right). \]

Here we have preferred to give a bound with a simple form. It is probably feasible to derive from Theorem 8.0.1 a bound essentially proportional to $e^{-E(\rho)T_0}$, where $\rho = T_0/(\pi\delta)^2$ and $E(\rho)$ is as in (8.2). (As we discussed in §8.5, $E(\rho)$ behaves as $e^{-(\pi/4)T_0}$ for $\rho$ large and as $e^{-0.125(T_0/(\pi\delta))^2}$ for $\rho$ small.)

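Proof. First of all,
\[ \sum_{\rho:\,|\Im(\rho)|>T_0} |F_\delta(\rho)| = \sum_{\rho:\,\Im(\rho)>T_0}\left(|F_\delta(\rho)| + |F_\delta(1-\rho)|\right), \]
by the functional equation (which implies that non-trivial zeros come in pairs $\rho$, $1-\rho$). Hence, by a somewhat brutish application of Cor. 8.0.2,
\[ \sum_{\rho:\,|\Im(\rho)|>T_0} |F_\delta(\rho)| \le \sum_{\rho:\,\Im(\rho)>T_0} f(\Im(\rho)), \tag{9.33} \]
where
\[ f(\tau) = 3.001\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2} + 3.286\,e^{-0.1598|\tau|}. \tag{9.34} \]
Obviously, $f(\tau)$ is a decreasing function of $\tau$ for $\tau\ge T_0$.

We now apply Lemma 9.1.3. We obtain that
\[ \sum_{\rho:\,\Im(\rho)>T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.35} \]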

We just need to estimate some integrals. For any $y\ge 1$, $c, c_1 > 0$,
\[ \int_y^\infty\left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \int_y^\infty\left(\log t - \frac{1}{ct}\right)e^{-ct}\,dt + \left(\frac{1}{c} + c_1\right)\int_y^\infty\frac{e^{-ct}}{t}\,dt = \frac{(\log y)e^{-cy}}{c} + \left(\frac{1}{c} + c_1\right)E_1(cy), \]
where $E_1(x) = \int_x^\infty e^{-t}\,dt/t$. Clearly, $E_1(x) \le \int_x^\infty e^{-t}\,dt/x = e^{-x}/x$. Hence
\[ \int_y^\infty\left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \left(\log y + \left(\frac{1}{c} + c_1\right)\frac{1}{y}\right)\frac{e^{-cy}}{c}. \]
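The last inequality is simple to test numerically. The following sketch compares a Simpson-rule approximation of $\int_y^\infty (\log t + c_1/t)e^{-ct}\,dt$ (truncated at a finite upper limit, which is an approximation introduced here) with the closed-form upper bound. The function names are hypothetical.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def lhs(y, c, c1, span=400.0):
    """Integral of (log t + c1/t) exp(-c t) over [y, y + span]."""
    return simpson(lambda t: (math.log(t) + c1 / t) * math.exp(-c * t), y, y + span)

def rhs(y, c, c1):
    """Upper bound (log y + (1/c + c1)/y) exp(-c y) / c."""
    return (math.log(y) + (1 / c + c1) / y) * math.exp(-c * y) / c
```

With the values used below in the text ($c = 0.1598$, $c_1 = \pi/2$, $y = 50$), the bound overshoots the integral by well under one percent, since the only loss is in replacing $E_1(cy)$ by $e^{-cy}/(cy)$.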

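We conclude that
\[ \int_{T_0}^\infty e^{-0.1598t}\left(\frac{1}{2\pi}\log\frac{qt}{2\pi} + \frac{1}{4t}\right)dt \le \frac{1}{2\pi}\int_{T_0}^\infty\left(\log t + \frac{\pi/2}{t}\right)e^{-ct}\,dt + \frac{\log\frac{q}{2\pi}}{2\pi}\int_{T_0}^\infty e^{-ct}\,dt = \frac{1}{2\pi c}\left(\log T_0 + \log\frac{q}{2\pi} + \left(\frac{1}{c} + \frac{\pi}{2}\right)\frac{1}{T_0}\right)e^{-cT_0} \tag{9.36} \]
with $c = 0.1598$. Since $T_0\ge 50$ and $q\ge 1$, this is at most
\[ 1.072\,\log\frac{qT_0}{2\pi}\,e^{-cT_0}. \tag{9.37} \]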

Now let us deal with the Gaussian term. (It appears only if $T_0 < (3/2)(\pi\delta)^2$, as otherwise $|\tau|\ge(3/2)(\pi\delta)^2$ holds whenever $|\tau|\ge T_0$.) For any $y\ge e$, $c\ge 0$,
\[ \int_y^\infty e^{-ct^2}\,dt = \frac{1}{\sqrt{c}}\int_{\sqrt{c}y}^\infty e^{-t^2}\,dt \le \frac{1}{cy}\int_{\sqrt{c}y}^\infty te^{-t^2}\,dt \le \frac{e^{-cy^2}}{2cy}, \tag{9.38} \]
\[ \int_y^\infty\frac{e^{-ct^2}}{t}\,dt = \int_{cy^2}^\infty\frac{e^{-t}}{2t}\,dt = \frac{E_1(cy^2)}{2} \le \frac{e^{-cy^2}}{2cy^2}, \tag{9.39} \]
\[ \int_y^\infty(\log t)e^{-ct^2}\,dt \le \int_y^\infty\left(\log t + \frac{\log t - 1}{2ct^2}\right)e^{-ct^2}\,dt = \frac{\log y}{2cy}\,e^{-cy^2}. \tag{9.40} \]

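Hence
\[ \int_{T_0}^\infty e^{-0.1065\left(\frac{T}{\pi\delta}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT = \int_{\frac{T_0}{\pi|\delta|}}^\infty e^{-0.1065t^2}\left(\frac{|\delta|}{2}\log\frac{q|\delta|t}{2} + \frac{1}{4t}\right)dt \]
\[ \le \left(\frac{\frac{|\delta|}{2}\log\frac{T_0}{\pi|\delta|}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{\frac{|\delta|}{2}\log\frac{q|\delta|}{2}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{1}{8c'\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} \tag{9.41} \]
with $c' = 0.1065$. Since $T_0\ge 50$ and $q\ge 1$,
\[ \frac{\pi}{4T_0} \le \frac{\pi}{200} \le 0.0152\cdot\frac{1}{2}\log\frac{qT_0}{2\pi}. \]
Thus, the last line of (9.41) is less than
\[ 1.0152\,\frac{\frac{|\delta|}{2}\log\frac{qT_0}{2\pi}}{2c'\frac{T_0}{\pi|\delta|}}\,e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} = 7.487\,\frac{\delta^2}{T_0}\cdot\log\frac{qT_0}{2\pi}\cdot e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}. \tag{9.42} \]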


Again by $T_0\ge 4\pi^2|\delta|$, we see that $1.0057\pi|\delta|/(4cT_0) \le 1.0057/(16c\pi) \le 0.18787$.

To obtain our final bound, we simply sum (9.37) and (9.42), after multiplying them by the constants $3.286$ and $3.001$ in (9.34). We conclude that the integral in (9.35) is at most
\[ \left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\log\frac{qT_0}{2\pi}. \]
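The numerical constants in this proof fit together by simple arithmetic, which the sketch below reproduces: the factor $1.072$ in (9.37) is the value of (9.36) relative to $\log\frac{qT_0}{2\pi}e^{-cT_0}$ at the extreme case $T_0 = 50$, $q = 1$; the factor $7.487$ in (9.42) is $1.0152\cdot\pi/(4c')$; and $3.53$, $22.5$ are these multiplied by the constants of (9.34). The variable and function names are illustrative only.

```python
import math

C_EXP = 0.1598     # decay rate c of the exponential term
C_GAUSS = 0.1065   # decay rate c' of the Gaussian term

def exp_factor(T0, q=1):
    """Right side of (9.36) divided by log(q*T0/(2*pi)) * exp(-c*T0)."""
    num = math.log(T0) + math.log(q / (2 * math.pi)) + (1 / C_EXP + math.pi / 2) / T0
    return num / (2 * math.pi * C_EXP * math.log(q * T0 / (2 * math.pi)))

# coefficient of (delta^2 / T0) * log(q*T0/(2*pi)) * exp(...) in (9.42)
gauss_factor = 1.0152 * math.pi / (4 * C_GAUSS)

# final constants, after multiplying by 3.286 and 3.001 from (9.34)
final_exp = 3.286 * 1.072
final_gauss = 3.001 * 7.487
```

The design here is deliberately minimal: each quantity is a one-line transcription of a displayed formula, so a mismatch would point at a specific step.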

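We need to record a few norms related to the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$ before we proceed. Recall we are working with the one-sided Gaussian, i.e., we set $\eta_\heartsuit(t) = 0$ for $t < 0$. Symbolic integration then gives
\[ \begin{aligned} |\eta_\heartsuit|_2^2 &= \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2},\\ |\eta_\heartsuit'|_2^2 &= \int_0^\infty\left(te^{-t^2/2}\right)^2 dt = \frac{\sqrt{\pi}}{4},\\ |\eta_\heartsuit\cdot\log|_2^2 &= \int_0^\infty e^{-t^2}(\log t)^2\,dt = \frac{\sqrt{\pi}}{16}\left(\pi^2 + 2\gamma^2 + 8\gamma\log 2 + 8(\log 2)^2\right) \le 1.94753, \end{aligned} \tag{9.43} \]
\[ \begin{aligned} |\eta_\heartsuit(t)/\sqrt{t}|_1 &= \int_0^\infty\frac{e^{-t^2/2}}{\sqrt{t}}\,dt = \frac{\Gamma(1/4)}{2^{3/4}} \le 2.15581,\\ |\eta_\heartsuit'(t)/\sqrt{t}|_1 = |\eta_\heartsuit(t)\sqrt{t}|_1 &= \int_0^\infty e^{-t^2/2}\sqrt{t}\,dt = \frac{\Gamma(3/4)}{2^{1/4}} \le 1.03045,\\ \left|\eta_\heartsuit'(t)t^{1/2}\right|_1 = \left|\eta_\heartsuit(t)t^{3/2}\right|_1 &= \int_0^\infty e^{-t^2/2}\,t^{3/2}\,dt = 1.07791. \end{aligned} \tag{9.44} \]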
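These norms, and the constants they produce when inserted into (9.29), can be recomputed with the standard library alone. The sketch below uses the identity $\int_0^\infty t^{s-1}e^{-t^2/2}\,dt = 2^{s/2-1}\Gamma(s/2)$ together with the closed form in (9.43); the names (`gaussian_moment`, `coeff_main`, and so on) are introduced here for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gaussian_moment(s):
    """Integral of t^(s-1) exp(-t^2/2) over (0, inf): 2^(s/2-1) * Gamma(s/2)."""
    return 2 ** (s / 2 - 1) * math.gamma(s / 2)

LOG2 = math.log(2)
norm2 = math.sqrt(math.sqrt(math.pi) / 2)                       # |eta|_2
norm2_log_sq = (math.sqrt(math.pi) / 16) * (
    math.pi**2 + 2 * EULER_GAMMA**2 + 8 * EULER_GAMMA * LOG2 + 8 * LOG2**2)
norm2_log = math.sqrt(norm2_log_sq)                              # |eta . log|_2
norm1_inv_sqrt = gaussian_moment(0.5)                            # |eta(t)/sqrt(t)|_1

# coefficients of (9.29) specialised to the Gaussian
coeff_main = norm2 + norm2_log
coeff_sqrt = 17.21 * norm2_log - math.log(2 * math.pi * math.sqrt(math.e)) * norm2
coeff_logq = 1.32 * norm1_inv_sqrt
coeff_const = 34.5 * norm1_inv_sqrt
```

The four coefficients come out just below $2.337$, $21.817$, $2.85$ and $74.38$, which are the constants appearing in the $x^{-1/2}$ term of the main result for the Gaussian.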

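We can now state what is really our main result for the Gaussian smoothing. (The version in §7.1 will, as we shall later see, follow from this, given numerical inputs.)

Proposition 9.2.2. Let $\eta(t) = e^{-t^2/2}$. Let $x\ge 1$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line. Assume that $T_0\ge 50$.

Then
\[ \sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta}{x}n\right)\eta\!\left(\frac{n}{x}\right) = \begin{cases}\widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases} \tag{9.45} \]
where
\[ \begin{aligned}\mathrm{err}_{\eta,\chi}(\delta,x) ={}& \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\\ &+ \left(2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38\right)x^{-1/2}\\ &+ (3\log q + 14|\delta| + 17)\,x^{-1} + (\log q + 6)\cdot(1 + 5|\delta|)\cdot x^{-3/2}.\end{aligned} \]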


Proof. Let $F_\delta(s)$ be the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. By Lemma 9.1.4 (with $G_\delta = F_\delta$) and Lemma 9.2.1,
\[ \left|\sum_{\rho \text{ non-trivial}} F_\delta(\rho)x^\rho\right| \]
is at most (9.29) (with $\eta = \eta_\heartsuit$) times $\sqrt{x}$, plus
\[ \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{|\delta|^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\cdot x. \]
By the norm computations in (9.43) and (9.44), we see that (9.29) is at most
\[ 2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38. \]
Let us now apply Lemma 9.1.1. We saw that the value of $R$ in Lemma 9.1.1 is bounded by (9.23). We know that $\eta_\heartsuit(0) = 1$. Again by (9.43) and (9.44), the quantity $c_0$ defined in (9.3) is at most $1.4056 + 13.3466|\delta|$. Hence
\[ |R| \le 3\log q + 13.347|\delta| + 16.695. \]
Lastly,
\[ |\eta_\heartsuit'|_2 + 2\pi|\delta||\eta_\heartsuit|_2 \le 0.942 + 4.183|\delta| \le 1 + 5|\delta|. \]
Clearly
\[ (6.01 - 6)\cdot(1 + 5|\delta|) + 13.347|\delta| + 16.695 < 14|\delta| + 17, \]
and so we are done.

9.3 The case of $\eta_*(t)$

We will now work with a weight based on the Gaussian:
\[ \eta(t) = \begin{cases} t^2e^{-t^2/2} & \text{if } t\ge 0,\\ 0 & \text{if } t < 0.\end{cases} \tag{9.46} \]
The fact that this vanishes at $t = 0$ actually makes it easier to work with at several levels.

Its Mellin transform is just a shift of that of the Gaussian. Write
\[ \begin{aligned} F_\delta(s) &= \left(M\left(e^{-t^2/2}e(\delta t)\right)\right)(s),\\ G_\delta(s) &= \left(M\left(\eta(t)e(\delta t)\right)\right)(s). \end{aligned} \tag{9.47} \]
Then, by the definition of the Mellin transform,
\[ G_\delta(s) = F_\delta(s+2). \]
We start by bounding the contribution of zeros with large imaginary part, just as before.


Lemma 9.3.1. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ satisfy $\Re(s) = 1/2$. Assume that $T_0\ge\max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| \le T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right). \]

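Proof. We start by writing
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| = \sum_{\rho:\,\Im(\rho)>T_0}\left(|F_\delta(\rho+2)| + |F_\delta((1-\rho)+2)|\right), \]
where we are using $G_\delta(\rho) = F_\delta(\rho+2)$ and the fact that non-trivial zeros come in pairs $\rho$, $1-\rho$.

By Cor. 8.0.2 with $k = 2$,
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| \le \sum_{\rho:\,\Im(\rho)>T_0} f(\Im(\rho)), \]
where
\[ f(\tau) = \begin{cases}\kappa_{2,1}|\tau|e^{-0.1598|\tau|} + \dfrac{\kappa_{2,0}}{4}\left(\dfrac{|\tau|}{\pi\delta}\right)^2 e^{-0.1065\left(\frac{|\tau|}{\pi\delta}\right)^2} & \text{if } |\tau| < \frac{3}{2}(\pi\delta)^2,\\[2mm] \kappa_{2,1}|\tau|e^{-0.1598|\tau|} & \text{if } |\tau|\ge\frac{3}{2}(\pi\delta)^2,\end{cases} \tag{9.48} \]
where $\kappa_{2,0} = 7.96$ and $\kappa_{2,1} = 5.13$. We are including the term $|\tau|e^{-0.1598|\tau|}$ in both cases in part because we cannot be bothered to take it out (just as we could not be bothered in the proof of Lem. 9.2.1) and in part to ensure that $f(\tau)$ is a decreasing function of $\tau$ for $\tau\ge T_0$.

We can now apply Lemma 9.1.3. We obtain, again,
\[ \sum_{\rho:\,\Im(\rho)>T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.49} \]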

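Just as before, we will need to estimate some integrals. For any $y\ge 1$, $c, c_1 > 0$ such that $\log y > 1/(cy)$,
\[ \begin{aligned}\int_y^\infty te^{-ct}\,dt &= \left(\frac{y}{c} + \frac{1}{c^2}\right)e^{-cy},\\ \int_y^\infty\left(t\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty\left(\left(t + \frac{a-1}{c}\right)\log t - \frac{1}{c} - \frac{a}{c^2t}\right)e^{-ct}\,dt = \left(\frac{y}{c} + \frac{a}{c^2}\right)e^{-cy}\log y,\end{aligned} \tag{9.50} \]
where
\[ a = \frac{\frac{\log y}{c} + \frac{1}{c} + \frac{c_1}{y}}{\frac{\log y}{c} - \frac{1}{c^2y}}. \]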

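Setting $c = 0.1598$, $c_1 = \pi/2$, $y = T_0\ge 50$, we obtain that
\[ \int_{T_0}^\infty\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)Te^{-0.1598T}\,dT \le \frac{1}{2\pi}\left(\log\frac{q}{2\pi}\cdot\left(\frac{T_0}{c} + \frac{1}{c^2}\right) + \left(\frac{T_0}{c} + \frac{a}{c^2}\right)\log T_0\right)e^{-0.1598T_0} \tag{9.51} \]
and
\[ a = \frac{\frac{\log T_0}{0.1598} + \frac{1}{0.1598} + \frac{\pi/2}{T_0}}{\frac{\log T_0}{0.1598} - \frac{1}{0.1598^2\,T_0}} \le 1.299. \]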
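Both the bound $a\le 1.299$ and the constant $6.11$ that is derived from (9.51) can be reproduced by direct evaluation at the extreme case $T_0 = 50$, $q = 1$. The sketch below does this; the function names are hypothetical, and the monotonicity in $q$ and $T_0$ is only spot-checked at a few values rather than proved.

```python
import math

C = 0.1598          # decay rate c
KAPPA_21 = 5.13     # constant multiplying |tau| exp(-0.1598 |tau|) in (9.48)

def a_param(y, c1=math.pi / 2):
    """The auxiliary quantity a defined after (9.50)."""
    return (math.log(y) / C + 1 / C + c1 / y) / (math.log(y) / C - 1 / (C * C * y))

def ratio(T0, q=1):
    """KAPPA_21 times the right side of (9.51), divided by
    T0 * log(q*T0/(2*pi)) * exp(-0.1598*T0)."""
    a = a_param(T0)
    inner = (math.log(q / (2 * math.pi)) * (T0 / C + 1 / C**2)
             + (T0 / C + a / C**2) * math.log(T0))
    return KAPPA_21 * inner / (2 * math.pi * T0 * math.log(q * T0 / (2 * math.pi)))
```

Evaluating `ratio(50.0)` gives a value just under $6.11$, and larger $T_0$ or $q$ give smaller values at the points tried.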

It is easy to see that the ratio of the expression within parentheses on the right side of (9.51) to $T_0\log(qT_0/2\pi)$ increases as $q$ decreases and, if we hold $q$ fixed, decreases as $T_0\ge 2\pi$ increases; thus, it is maximal for $q = 1$ and $T_0 = 50$. Multiplying (9.51) by $\kappa_{2,1} = 5.13$ and simplifying by the assumption $T_0\ge 50$, we obtain that
\[ \int_{T_0}^\infty 5.13\,Te^{-0.1598T}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \le 6.11\,T_0\log\frac{qT_0}{2\pi}\cdot e^{-0.1598T_0}. \tag{9.52} \]
Now let us examine the Gaussian term. First of all: when does it arise? If $T_0\ge(3/2)(\pi\delta)^2$, then $|\tau|\ge(3/2)(\pi\delta)^2$ holds whenever $|\tau|\ge T_0$, and so (9.48) does not give us a Gaussian term. Recall that $T_0\ge 10\pi|\delta|$, which means that $|\delta|\le 20/(3\pi)$ implies that $T_0\ge(3/2)(\pi\delta)^2$. We can thus assume from now on that $|\delta| > 20/(3\pi)$, since otherwise there is no Gaussian term to treat.

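For any $y\ge 1$, $c, c_1 > 0$,
\[ \int_y^\infty t^2e^{-ct^2}\,dt < \int_y^\infty\left(t^2 + \frac{1}{4c^2t^2}\right)e^{-ct^2}\,dt = \left(\frac{y}{2c} + \frac{1}{4c^2y}\right)\cdot e^{-cy^2}, \]
\[ \int_y^\infty(t^2\log t + c_1t)\cdot e^{-ct^2}\,dt \le \int_y^\infty\left(t^2\log t + \frac{at\log et}{2c} - \frac{\log et}{2c} - \frac{a}{4c^2t}\right)e^{-ct^2}\,dt = \frac{(2cy + a)\log y + a}{4c^2}\cdot e^{-cy^2}, \]
where
\[ a = \frac{c_1y + \frac{\log ey}{2c}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{c_1y + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{2c_1c}{\log ey} + \frac{\frac{c_1}{2cy\log ey} + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}. \]
(Note that $a$ decreases as $y\ge y_0$ increases, provided that $\log ey_0 > 1/(2cy_0^2)$.) Setting $c = 0.1065$, $c_1 = 1/(2|\delta|)\le 3/16$ and $y = T_0/(\pi|\delta|)\ge 4\pi$, we obtain
\[ \begin{aligned}\int_{\frac{T_0}{\pi|\delta|}}^\infty\left(\frac{1}{2\pi}\log\frac{q|\delta|t}{2} + \frac{1}{4\pi|\delta|t}\right)t^2e^{-0.1065t^2}\,dt \le{}& \left(\frac{1}{2\pi}\log\frac{q|\delta|}{2}\right)\cdot\left(\frac{T_0}{2\pi c|\delta|} + \frac{1}{4c^2\cdot 10}\right)\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\\ &+ \frac{1}{2\pi}\cdot\frac{\left(2c\frac{T_0}{\pi|\delta|} + a\right)\log\frac{T_0}{\pi|\delta|} + a}{4c^2}\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\end{aligned} \]
and
\[ a \le \frac{1}{10} + \frac{\left(2\cdot\frac{20}{3\pi}\right)^{-1}\cdot 10 + \frac{1}{4\cdot 0.1065^2\cdot 10^2}}{\frac{10\log 10e}{2\cdot 0.1065} - \frac{1}{4\cdot 0.1065^2\cdot 10}} \le 0.117. \]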

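Multiplying by $(\kappa_{2,0}/4)\pi|\delta|$, we get that
\[ \int_{T_0}^\infty\frac{\kappa_{2,0}}{4}\left(\frac{T}{\pi|\delta|}\right)^2 e^{-0.1065\left(\frac{T}{\pi|\delta|}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \tag{9.53} \]
is at most $e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}$ times
\[ \begin{aligned}&(1.487T_0 + 2.194|\delta|)\cdot\log\frac{q|\delta|}{2} + 1.487T_0\log\frac{T_0}{\pi|\delta|} + 2.566|\delta|\log\frac{eT_0}{\pi|\delta|}\\ &\qquad\le\left(1.487 + 2.566\cdot\frac{1 + \frac{1}{\log\frac{T_0}{\pi|\delta|}}}{T_0/|\delta|}\right)T_0\log\frac{qT_0}{2\pi} \le 1.578\cdot T_0\log\frac{qT_0}{2\pi},\end{aligned} \tag{9.54} \]
where we are using several times the assumption that $T_0\ge 4\pi^2|\delta|$ (and, on one occasion, the fact that $|\delta| > 20/(3\pi) > 2$).

We sum (9.52) and the estimate for (9.53) we have just got to reach our conclusion.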

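Again, we record some norms obtained by symbolic integration: for $\eta$ as in (9.46),
\[ \begin{aligned} &|\eta|_2^2 = \frac{3}{8}\sqrt{\pi},\qquad |\eta'|_2^2 = \frac{7}{16}\sqrt{\pi},\\ &|\eta\cdot\log|_2^2 = \frac{\sqrt{\pi}}{64}\left(8(3\gamma - 8)\log 2 + 3\pi^2 + 6\gamma^2 + 24(\log 2)^2 + 16 - 32\gamma\right) \le 0.16364,\\ &|\eta(t)/\sqrt{t}|_1 = \frac{2^{1/4}\Gamma(1/4)}{4} \le 1.07791,\qquad |\eta(t)\sqrt{t}|_1 = \frac{3}{4}2^{3/4}\Gamma(3/4) \le 1.54568,\\ &|\eta'(t)/\sqrt{t}|_1 = \int_0^{\sqrt{2}} t^{3/2}e^{-t^2/2}\,dt - \int_{\sqrt{2}}^\infty t^{3/2}e^{-t^2/2}\,dt \le 1.48469,\\ &|\eta'(t)\sqrt{t}|_1 \le 1.72169. \end{aligned} \tag{9.55} \]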
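As with the Gaussian, the closed-form norms in (9.55) and the constants they feed into (9.29) can be recomputed directly. The sketch below again relies on $\int_0^\infty t^{s-1}e^{-t^2/2}\,dt = 2^{s/2-1}\Gamma(s/2)$; it does not attempt the two norms of $\eta'$ in the last lines of (9.55), which are simply taken from the text. All names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
LOG2 = math.log(2)

def gaussian_moment(s):
    """Integral of t^(s-1) exp(-t^2/2) over (0, inf): 2^(s/2-1) * Gamma(s/2)."""
    return 2 ** (s / 2 - 1) * math.gamma(s / 2)

# eta(t) = t^2 exp(-t^2/2)
norm2 = math.sqrt(3 * math.sqrt(math.pi) / 8)            # |eta|_2
norm2_deriv = math.sqrt(7 * math.sqrt(math.pi) / 16)     # |eta'|_2
norm2_log_sq = (math.sqrt(math.pi) / 64) * (
    8 * (3 * EULER_GAMMA - 8) * LOG2 + 3 * math.pi**2 + 6 * EULER_GAMMA**2
    + 24 * LOG2**2 + 16 - 32 * EULER_GAMMA)
norm2_log = math.sqrt(norm2_log_sq)                       # |eta . log|_2
norm1_inv_sqrt = gaussian_moment(2.5)                     # |eta(t)/sqrt(t)|_1
norm1_sqrt = gaussian_moment(3.5)                         # |eta(t) sqrt(t)|_1

# coefficients of (9.29) specialised to this eta
coeff_main = norm2 + norm2_log
coeff_sqrt = 17.21 * norm2_log - math.log(2 * math.pi * math.sqrt(math.e)) * norm2
coeff_logq = 1.32 * norm1_inv_sqrt
coeff_const = 34.5 * norm1_inv_sqrt
```

The resulting coefficients stay below $1.22$, $5.056$, $1.423$ and $37.19$, and $|\eta'|_2 + 2\pi|\delta||\eta|_2$ stays below $0.881 + 5.123|\delta|$.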


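Proposition 9.3.2. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x\ge 1$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line. Assume that $T_0\ge\max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta}{x}n\right)\eta(n/x) = \begin{cases}\widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases} \tag{9.56} \]
where
\[ \begin{aligned}\mathrm{err}_{\eta,\chi}(\delta,x) ={}& T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\ &+ \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.19\right)\cdot x^{-1/2}\\ &+ (3 + 11|\delta|)\,x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2}.\end{aligned} \tag{9.57} \]

Proof. We proceed as in the proof of Prop. 9.2.2. The contribution of Lemma 9.3.1 is
\[ T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\cdot x, \]
whereas the contribution of Lemma 9.1.4 is at most
\[ \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.188\right)\sqrt{x}. \]
Let us now apply Lemma 9.1.1. Since $\eta(0) = 0$, we have
\[ R = O^*(c_0) = O^*(2.138 + 10.99|\delta|). \]
Lastly,
\[ |\eta'|_2 + 2\pi|\delta||\eta|_2 \le 0.881 + 5.123|\delta|. \]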

Now that we have Prop. 9.3.2, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of $\eta$ with something else. In general, for $\varphi_1, \varphi_2:[0,\infty)\to\mathbb{C}$, if we know how to bound sums of the form
\[ S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x), \tag{9.58} \]
we can bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$, simply by changing the order of summation and integration:
\[ S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n)\cdot(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right) = \int_0^\infty\sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\,\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\,\frac{dw}{w}. \tag{9.59} \]
This is particularly nice if $\varphi_2(t)$ vanishes in a neighbourhood of the origin, since then the argument $wx$ of $S_{f,\varphi_1}(wx)$ is always large.

We will use $\varphi_1(t) = t^2e^{-t^2/2}$, $\varphi_2(t) = \eta_1 *_M \eta_1$, where $\eta_1$ is 2 times the characteristic function of the interval $[1/2, 1]$. The motivation for the choice of $\varphi_1$ and $\varphi_2$ is clear: we have just got bounds based on $\varphi_1(t)$ in the major arcs, and we obtained minor-arc bounds for the weight $\varphi_2(t)$ in Part I.

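Corollary 9.3.3. Let $\eta(t) = t^2e^{-t^2/2}$, $\eta_1 = 2\cdot I_{[1/2,1]}$, $\eta_2 = \eta_1 *_M \eta_1$. Let $\eta_* = \eta_2 *_M \eta$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line. Assume that $T_0\ge\max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta}{x}n\right)\eta_*(n/x) = \begin{cases}\widehat{\eta_*}(-\delta)x + O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases} \tag{9.60} \]
where
\[ \begin{aligned}\mathrm{err}_{\eta_*,\chi}(\delta,x) ={}& T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 0.0102\cdot e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\ &+ \left(1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17\right)\cdot x^{-1/2}\\ &+ (6 + 22|\delta|)\,x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2}.\end{aligned} \tag{9.61} \]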

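Proof. The left side of (9.60) equals
\[ \int_0^\infty\sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta n}{x}\right)\eta\!\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w} = \int_{1/4}^1\sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta wn}{wx}\right)\eta\!\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w}, \]
since $\eta_2$ is supported on $[1/4, 1]$. By Prop. 9.3.2, the main term (if $q = 1$) contributes
\[ \begin{aligned}\int_{1/4}^1\widehat{\eta}(-\delta w)\,xw\cdot\eta_2(w)\,\frac{dw}{w} &= x\int_0^\infty\widehat{\eta}(-\delta w)\eta_2(w)\,dw\\ &= x\int_0^\infty\int_{-\infty}^\infty\eta(t)e(\delta wt)\,dt\cdot\eta_2(w)\,dw = x\int_0^\infty\int_{-\infty}^\infty\eta\!\left(\frac{r}{w}\right)e(\delta r)\,\frac{dr}{w}\,\eta_2(w)\,dw\\ &= x\int_{-\infty}^\infty\left(\int_0^\infty\eta\!\left(\frac{r}{w}\right)\eta_2(w)\,\frac{dw}{w}\right)e(\delta r)\,dr = \widehat{\eta_*}(-\delta)\cdot x.\end{aligned} \]
The error term is
\[ \int_{1/4}^1\mathrm{err}_{\eta,\chi}(\delta w, wx)\cdot wx\cdot\eta_2(w)\,\frac{dw}{w} = x\cdot\int_{1/4}^1\mathrm{err}_{\eta,\chi}(\delta w, wx)\,\eta_2(w)\,dw. \tag{9.62} \]
Using the fact that
\[ \eta_2(w) = \begin{cases}4\log 4w & \text{if } w\in[1/4, 1/2],\\ 4\log w^{-1} & \text{if } w\in[1/2, 1],\\ 0 & \text{otherwise,}\end{cases} \]
we can easily check that
\[ \int_0^\infty\eta_2(w)\,dw = 1,\qquad\int_0^\infty w^{-1/2}\eta_2(w)\,dw \le 1.37259, \]
\[ \int_0^\infty w^{-1}\eta_2(w)\,dw = 4(\log 2)^2 \le 1.92182,\qquad\int_0^\infty w^{-3/2}\eta_2(w)\,dw \le 2.74517, \]
and, by rigorous numerical integration from $1/4$ to $1/2$ and from $1/2$ to $1$ (using, e.g., VNODE-LP [Ned06]),
\[ \int_0^\infty e^{-0.1065\cdot 10^2\left(\frac{1}{w^2} - 1\right)}\eta_2(w)\,dw \le 0.006446. \]
We then see that (9.57) and (9.62) imply (9.61).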
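The moments of $\eta_2$ listed in this proof follow from the Mellin transform of $\eta_1 = 2\cdot I_{[1/2,1]}$, namely $M\eta_1(s) = 2(1 - 2^{-s})/s$, since the Mellin transform turns the multiplicative convolution $\eta_2 = \eta_1 *_M \eta_1$ into a square. The sketch below computes them that way and cross-checks against ordinary (non-rigorous) Simpson quadrature; it is not a substitute for the interval-arithmetic integration cited in the text, and the function names are hypothetical.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def eta2(w):
    """Piecewise closed form of eta_2 = eta_1 *_M eta_1."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1:
        return 4 * math.log(1 / w)
    return 0.0

def eta2_moment(a):
    """Integral of w^a * eta2(w) dw, i.e. (M eta_1)(a + 1) squared."""
    s = a + 1
    m = 2 * math.log(2) if s == 0 else 2 * (1 - 2 ** (-s)) / s
    return m * m

def integrate_against_eta2(weight):
    """Quadrature of weight(w) * eta2(w), split at the kink w = 1/2."""
    g = lambda w: weight(w) * eta2(w)
    return simpson(g, 0.25, 0.5) + simpson(g, 0.5, 1.0)

def gaussian_weighted():
    """Approximate value of the last integral in the proof."""
    return integrate_against_eta2(lambda w: math.exp(-0.1065 * 100 * (1 / w**2 - 1)))
```

The closed-form moments give $1$, $(4 - 2\sqrt{2})^2$, $4(\log 2)^2$ and $(4\sqrt{2} - 4)^2$ for the exponents $0$, $-1/2$, $-1$ and $-3/2$ respectively.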

9.4 The case of $\eta_+(t)$

We will work with
\[ \eta(t) = \eta_+(t) = h_H(t)\cdot t\,\eta_\heartsuit(t) = h_H(t)\cdot te^{-t^2/2}, \tag{9.63} \]
where $h_H$ is as in (7.6). We recall that $h_H$ is a band-limited approximation to the function $h$ defined in (7.5); to be more precise, $Mh_H(it)$ is the truncation of $Mh(it)$ to the interval $[-H, H]$.

We are actually defining $h$, $h_H$ and $\eta$ in a slightly different way from what was done in the first version of [Hela]. The difference is instructive. There, $\eta(t)$ was defined as $h_H(t)e^{-t^2/2}$, and $h_H$ was a band-limited approximation to a function $h$ defined as in (7.5), but with $t^3(2-t)^3$ instead of $t^2(2-t)^3$. The reason for our new definitions is that now the truncation of $Mh(it)$ will not break the holomorphy of $M\eta$, and so we will be able to use the general results we proved in §9.1.

In essence, $Mh$ will still be holomorphic because the Mellin transform of $t\eta_\heartsuit(t)$ is holomorphic in the domain we care about, unlike the Mellin transform of $\eta_\heartsuit(t)$, which does have a pole at $s = 0$.

As usual, we start by bounding the contribution of zeros with large imaginary part. The procedure is much as before: since $\eta_+(t) = \eta_H(t)\eta_\heartsuit(t)$, the Mellin transform $M\eta_+$ is a convolution of $M(te^{-t^2/2})$ and something of support in $[-H, H]i$, namely, $M\eta_H$ restricted to the imaginary axis. This means that the decay of $M\eta_+$ is (at worst) like the decay of $M(te^{-t^2/2})$, delayed by $H$.


Lemma 9.4.1. Let $\eta = \eta_+$ be as in (9.63) for some $H\ge 25$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ satisfy $\Re(s) = 1/2$, where $T_0\ge H + \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| \le \left(11.308\sqrt{T_0'}\,e^{-0.1598T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi}, \]
where $T_0' = T_0 - H$.

Proof. As usual,
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| = \sum_{\rho:\,\Im(\rho)>T_0}\left(|G_\delta(\rho)| + |G_\delta(1-\rho)|\right). \]
Let $F_\delta$ be as in (9.47). Then, since $\eta_+(t)e(\delta t) = h_H(t)te^{-t^2/2}e(\delta t)$, where $h_H$ is as in (7.6), we see by (2.9) that
\[ G_\delta(s) = \frac{1}{2\pi}\int_{-H}^{H}Mh(ir)F_\delta(s+1-ir)\,dr, \]
and so, since $|Mh(ir)| = |Mh(-ir)|$,
\[ |G_\delta(\rho)| + |G_\delta(1-\rho)| \le \frac{1}{2\pi}\int_{-H}^{H}|Mh(ir)|\left(|F_\delta(1+\rho-ir)| + |F_\delta(2-(\rho-ir))|\right)dr. \tag{9.64} \]
We apply Cor. 8.0.2 with $k = 1$ and $T_0 - H$ instead of $T_0$, and obtain that $|F_\delta(\rho)| + |F_\delta(1-\rho)|\le g(\tau)$, where
\[ g(\tau) = \kappa_{1,1}\sqrt{|\tau|}\,e^{-0.1598|\tau|} + \kappa_{1,0}\,\frac{|\tau|}{2\pi|\delta|}\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2}, \tag{9.65} \]
where $\kappa_{1,0} = 4.903$ and $\kappa_{1,1} = 4.017$. (As in the proof of Lemmas 9.2.1 and 9.3.1, we are putting in extra terms so as to simplify our integrals.)

From (9.64), we conclude that
\[ |G_\delta(\rho)| + |G_\delta(1-\rho)| \le f(\tau), \]
for $\rho = \sigma + i\tau$, $\tau > 0$, where
\[ f(\tau) = \frac{|Mh(ir)|_1}{2\pi}\cdot g(\tau - H) \]
is decreasing for $\tau\ge T_0$ (because $g(\tau)$ is decreasing for $\tau\ge T_0 - H$). By (A.17), $|Mh(ir)|_1\le 16.193918$.


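We apply Lemma 9.1.3, and get that
\[ \sum_{\rho:\,|\Im(\rho)|>T_0}|G_\delta(\rho)| \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT = \frac{|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty g(T-H)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.66} \]
Now we just need to estimate some integrals. For any $y\ge e^2$, $c > 0$ and $\kappa, \kappa_1\ge 0$,
\[ \begin{aligned}\int_y^\infty\sqrt{t}\,e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{1}{2c^2\sqrt{y}}\right)e^{-cy},\\ \int_y^\infty\left(\sqrt{t}\log(t+\kappa) + \frac{\kappa_1}{\sqrt{t}}\right)e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{a}{c^2\sqrt{y}}\right)\log(y+\kappa)\,e^{-cy},\end{aligned} \]
where
\[ a = \frac{1}{2} + \frac{1 + c\kappa_1}{\log(y+\kappa)}. \]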

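The contribution of the exponential term in (9.65) to (9.66) thus equals
\[ \begin{aligned}&\frac{\kappa_{1,1}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\sqrt{T-H}\cdot e^{-0.1598(T-H)}\,dT\\ &\qquad\le 10.3532\int_{T_0-H}^\infty\left(\frac{1}{2\pi}\log(T+H) + \frac{\log\frac{q}{2\pi}}{2\pi} + \frac{1}{4T}\right)\sqrt{T}\,e^{-0.1598T}\,dT\\ &\qquad\le \frac{10.3532}{2\pi}\left(\frac{\sqrt{T_0-H}}{0.1598} + \frac{a}{0.1598^2\sqrt{T_0-H}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0-H)},\end{aligned} \tag{9.67} \]
where $a = 1/2 + (1 + 0.1598\pi/2)/\log T_0$. Since $T_0 - H\ge 50$ and $T_0\ge 50 + 25 = 75$, this is at most
\[ 11.308\sqrt{T_0-H}\,\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0-H)}. \]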

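We now estimate a few more integrals so that we can handle the Gaussian term in (9.65). For any $y > 1$, $c > 0$, $\kappa, \kappa_1\ge 0$,
\[ \begin{aligned}\int_y^\infty te^{-ct^2}\,dt &= \frac{e^{-cy^2}}{2c},\\ \int_y^\infty(t\log(t+\kappa) + \kappa_1)\,e^{-ct^2}\,dt &\le \left(1 + \frac{\kappa_1 + \frac{1}{2cy}}{y\log(y+\kappa)}\right)\frac{\log(y+\kappa)\cdot e^{-cy^2}}{2c}.\end{aligned} \]
Proceeding just as before, we see that the contribution of the Gaussian term in (9.65) to (9.66) is at most
\[ \begin{aligned}&\frac{\kappa_{1,0}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\frac{T-H}{2\pi|\delta|}\cdot e^{-0.1065\left(\frac{T-H}{\pi\delta}\right)^2}\,dT\\ &\qquad\le 12.6368\cdot\frac{|\delta|}{4}\int_{\frac{T_0-H}{\pi|\delta|}}^\infty\left(\log\left(T + \frac{H}{\pi|\delta|}\right) + \log\frac{q|\delta|}{2} + \frac{\pi/2}{T}\right)Te^{-0.1065T^2}\,dT\\ &\qquad\le 12.6368\cdot\frac{|\delta|}{8\cdot 0.1065}\left(1 + \frac{\frac{\pi}{2} + \frac{\pi|\delta|}{2\cdot 0.1065\cdot(T_0-H)}}{\frac{T_0-H}{\pi|\delta|}\log\frac{T_0}{\pi|\delta|}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0-H}{\pi\delta}\right)^2}.\end{aligned} \tag{9.68} \]
Since $(T_0-H)/(\pi|\delta|)\ge 10$, this is at most
\[ 16.147|\delta|\,\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0-H}{\pi\delta}\right)^2}. \]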

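Proposition 9.4.2. Let $\eta = \eta_+$ be as in (9.63) for some $H\ge 25$. Let $x\ge 10^3$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line, where $T_0\ge H + \max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty\Lambda(n)\chi(n)\,e\!\left(\frac{\delta}{x}n\right)\eta_+(n/x) = \begin{cases}\widehat{\eta_+}(-\delta)x + O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases} \tag{9.69} \]
where
\[ \begin{aligned}\mathrm{err}_{\eta_+,\chi}(\delta,x) ={}& \left(11.308\sqrt{T_0'}\cdot e^{-0.1598T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi}\\ &+ \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{-1/2}\\ &+ (9 + 11|\delta|)\,x^{-1} + (\log q)(11 + 6|\delta|)\,x^{-3/2},\end{aligned} \tag{9.70} \]
where $T_0' = T_0 - H$.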

Proof. We can apply Lemma 9.1.1 and Lemma 9.1.4 because $\eta_+(t)$, $(\log t)\eta_+(t)$ and $\eta_+'(t)$ are in $\ell_2$ (by (A.25), (A.28) and (A.32)) and $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ are in $\ell_1$ for $\sigma$ in an open interval containing $[1/2, 3/2]$ (by (A.30) and (A.33)). (Because of (9.5), the fact that $\eta_+(t)t^{-1/2}$ and $\eta_+(t)t^{1/2}$ are in $\ell_1$ implies that $\eta_+(t)\log t$ is also in $\ell_1$, as is required by Lemma 9.1.4.)

We apply Lemmas 9.1.1, 9.1.4 and 9.4.1. We bound the norms involving $\eta_+$ using the estimates in §A.3 and §A.4. Since $\eta_+(0) = 0$ (by the definition (A.3) of $\eta_+$), the term $R$ in (9.2) is at most $c_0$, where $c_0$ is as in (9.3). We bound
\[ \begin{aligned} c_0 \le{}& \frac{2}{3}\left(2.922875\left(\sqrt{\Gamma(1/2)} + \sqrt{\Gamma(3/2)}\right) + 1.062319\left(\sqrt{\Gamma(5/2)} + \sqrt{\Gamma(7/2)}\right)\right)\\ &+ \frac{4\pi}{3}|\delta|\cdot 1.062319\left(\sqrt{\Gamma(3/2)} + \sqrt{\Gamma(5/2)}\right) \le 6.536232 + 9.319578|\delta|\end{aligned} \]
using (A.30) and (A.33). By (A.25), (A.32) and the assumption $H\ge 25$,
\[ |\eta_+|_2 \le 0.80365,\qquad |\eta_+'|_2 \le 10.845789. \]
Thus, the error terms in (9.1) total at most
\[ 6.536232 + 9.319578|\delta| + (\log q + 6.01)(10.845789 + 2\pi\cdot 0.80365|\delta|)\,x^{-1/2} \le 9 + 11|\delta| + (\log q)(11 + 6|\delta|)\,x^{-1/2}. \tag{9.71} \]
The part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1) corresponding to zeros $\rho$ with $|\Im(\rho)| > T_0$ gets estimated by Lem. 9.4.1. By Lemma 9.1.4, the part of the sum corresponding to zeros $\rho$ with $|\Im(\rho)|\le T_0$ is at most
\[ \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{1/2}, \]
where we estimate the norms $|\eta_+|_2$, $|\eta\cdot\log|_2$ and $|\eta(t)/\sqrt{t}|_1$ by (A.25), (A.28) and (A.30).
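The arithmetic behind the bound on $c_0$ and behind (9.71) can be replayed in a few lines. In the sketch below, the coefficient $4\pi/3$ of $|\delta|$ is an assumption: it is the value that reproduces the stated constant $9.319578$ and is consistent with the constants quoted for the Gaussian and for $t^2e^{-t^2/2}$ earlier in the chapter, but the definition (9.3) of $c_0$ itself lies outside this section. The names are illustrative.

```python
import math

def sqrt_gamma(x):
    """Square root of Gamma(x)."""
    return math.sqrt(math.gamma(x))

# constant part and |delta|-part of the bound on c_0
c0_const = (2 / 3) * (2.922875 * (sqrt_gamma(0.5) + sqrt_gamma(1.5))
                      + 1.062319 * (sqrt_gamma(2.5) + sqrt_gamma(3.5)))
# ASSUMPTION: the factor 4*pi/3 is inferred, not read off from (9.3)
c0_delta = (4 * math.pi / 3) * 1.062319 * (sqrt_gamma(1.5) + sqrt_gamma(2.5))

def lhs_971_parts(x):
    """Left side of (9.71) at q = 1, split into the constant part and
    the coefficient of |delta|."""
    scale = 6.01 / math.sqrt(x)
    const = 6.536232 + scale * 10.845789
    delta_coeff = 9.319578 + scale * 2 * math.pi * 0.80365
    return const, delta_coeff
```

At the smallest admissible $x = 10^3$ the two parts are about $8.6$ and $10.3$, which is how the rounded constants $9$ and $11$ in (9.70) arise; the coefficients of $\log q$ are handled by $10.845789\le 11$ and $2\pi\cdot 0.80365\le 6$.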

9.5 A sum for $\eta_+(t)^2$

Using a smoothing function sometimes leads to considering sums involving the square of the smoothing function. In particular, in Part III, we will need a result involving $\eta_+^2$, something that could be slightly challenging to prove, given the way in which $\eta_+$ is defined. Fortunately, we have bounds on $|\eta_+|_\infty$ and other $\ell_\infty$-norms (see Appendix A.5). Our task will also be made easier by the fact that we do not have a phase $e(\delta n/x)$ this time. All in all, this will be yet another demonstration of the generality of the framework developed in §9.1.

Proposition 9.5.1. Let $\eta = \eta_+$ be as in (9.63), $H\ge 25$. Let $x\ge 10^8$. Assume that all non-trivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$ with $|\Im(\rho)|\le T_0$ lie on the critical line, where $T_0\ge\max(2H + 25, 200)$.

Then
\[ \sum_{n=1}^\infty\Lambda(n)(\log n)\,\eta_+^2(n/x) = x\cdot\int_0^\infty\eta_+^2(t)\log xt\,dt + O^*\left(\mathrm{err}_{\ell_2,\eta_+}\right)\cdot x\log x, \tag{9.72} \]
where
\[ \begin{aligned}\mathrm{err}_{\ell_2,\eta_+} ={}& \left(\left(0.462\,\frac{(\log T_1)^2}{\log x} + 0.909\log T_1\right)T_1 + 1.71\left(1 + \frac{\log T_1}{\log x}\right)H\right)e^{-\frac{\pi}{4}T_1}\\ &+ \left(2.445\sqrt{T_0}\log T_0 + 50.04\right)\cdot x^{-1/2}\end{aligned} \tag{9.73} \]
and $T_1 = T_0 - 2H$.

The assumption $T_0\ge 200$ is stronger than what we strictly need, but, as it happens, we could make much stronger assumptions still. Proposition 9.5.1 relies on a verification of zeros of the Riemann zeta function; such verifications have gone up to values of $T_0$ much higher than 200.


Proof. We will need to consider two smoothing functions, namely, $\eta_{+,0}(t) = \eta_+(t)^2$ and $\eta_{+,1}(t) = \eta_+(t)^2\log t$. Clearly,
\[ \sum_{n=1}^\infty\Lambda(n)(\log n)\,\eta_+^2(n/x) = (\log x)\sum_{n=1}^\infty\Lambda(n)\eta_{+,0}(n/x) + \sum_{n=1}^\infty\Lambda(n)\eta_{+,1}(n/x). \]
Since $\eta_+(t) = h_H(t)te^{-t^2/2}$,
\[ \eta_{+,0}(t) = h_H^2(t)\,t^2e^{-t^2},\qquad\eta_{+,1}(t) = h_H^2(t)(\log t)\,t^2e^{-t^2}. \]
Let $\eta_{+,2} = (\log x)\eta_{+,0} + \eta_{+,1} = \eta_+^2(t)\log xt$.

We wish to apply Lemma 9.1.1. For this, we must first check that some norms are finite. Clearly,
\[ \begin{aligned}\eta_{+,2}(t) &= \eta_+^2(t)\log x + \eta_+^2(t)\log t,\\ \eta_{+,2}'(t) &= 2\eta_+(t)\eta_+'(t)\log x + 2\eta_+(t)\eta_+'(t)\log t + \eta_+^2(t)/t.\end{aligned} \tag{9.74} \]
Thus, we see that $\eta_{+,2}(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$ and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$ (see (A.25), (A.38), (A.40)):
\[ |\eta_{+,2}(t)|_2 \le \left|\eta_+^2(t)\right|_2\log x + \left|\eta_+^2(t)\log t\right|_2 \le |\eta_+|_\infty|\eta_+|_2\log x + |\eta_+(t)\log t|_\infty|\eta_+|_2. \tag{9.75} \]
Similarly, $\eta_{+,2}'(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$, $\eta_+'(t)$ is in $\ell_2$ (A.32), and $\eta_+(t)$, $\eta_+(t)\log t$ and $\eta_+(t)/t$ (see (A.41)) are all in $\ell_\infty$:
\[ \begin{aligned}\left|\eta_{+,2}'(t)\right|_2 &\le \left|2\eta_+(t)\eta_+'(t)\right|_2\log x + \left|2\eta_+(t)\eta_+'(t)\log t\right|_2 + \left|\eta_+^2(t)/t\right|_2\\ &\le 2|\eta_+|_\infty\left|\eta_+'\right|_2\log x + 2|\eta_+(t)\log t|_\infty\left|\eta_+'\right|_2 + |\eta_+(t)/t|_\infty|\eta_+|_2.\end{aligned} \tag{9.76} \]
In the same way, we see that $\eta_{+,2}(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(-1,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ (A.30), and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$) and $\eta_{+,2}'(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(0,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ (A.33), and $\eta_+(t)$, $\eta_+(t)\log t$, $\eta_+(t)/t$ are all in $\ell_\infty$).

We now apply Lemma 9.1.1 with $q = 1$, $\delta = 0$. Since $\eta_{+,2}(0) = 0$, the residue term $R$ equals $c_0$, which, by (9.74), is at most $2/3$ times
\[ 2\left(|\eta_+|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left(\left|\eta_+'(t)/\sqrt{t}\right|_1 + \left|\eta_+'(t)\sqrt{t}\right|_1\right) + |\eta_+(t)/t|_\infty\left(\left|\eta_+(t)/\sqrt{t}\right|_1 + \left|\eta_+(t)\sqrt{t}\right|_1\right). \]
Using the bounds (A.38), (A.40), (A.41) (with the assumption $H\ge 25$), (A.30) and (A.33), we get that this means that
\[ c_0 \le 18.57606\log x + 8.63264. \]


Since q = 1 and δ = 0 we get from (976) (and (A38) (A40) (A41) with theassumption H ge 25 and also (A25) and (A32)) that

(log q + 601)middot(∣∣ηprime+2∣∣2 + 2π|δ| |η+2|2

)xminus12

= 601∣∣ηprime+2∣∣2 xminus12 le (16256 log x+ 59325)xminus12

Using the assumption x ge 108 we obtain

c0 + (18526 log x+ 71799)xminus12 le 19064 log x (977)

We will now apply Lemma 9.1.4, as we may because of the finiteness of the norms we have already checked, together with

\[
\begin{aligned}
|\eta_{+,2}(t)\log t|_2 &\le |\eta_+^2(t)\log t|_2 \log x + |\eta_+^2(t)(\log t)^2|_2\\
&\le |\eta_+(t)\log t|_\infty \left(|\eta_+(t)|_2 \log x + |\eta_+(t)\log t|_2\right)\\
&\le 0.4976\cdot(0.80365\log x + 0.82999) \le 0.3999\log x + 0.41301
\end{aligned}
\tag{9.78}
\]

(by (A.40), (A.25) and (A.28); use the assumption $H \ge 25$). We also need the bounds

\[
|\eta_{+,2}(t)|_2 \le 1.14199\log x + 0.39989 \tag{9.79}
\]

(from (9.75), by the norm bounds (A.38), (A.40) and (A.25), all with $H \ge 25$) and

\[
|\eta_{+,2}(t)/\sqrt{t}|_1 \le \left(|\eta_+(t)|_\infty \log x + |\eta_+(t)\log t|_\infty\right) |\eta_+(t)/\sqrt{t}|_1 \le 1.4211\log x + 0.49763 \tag{9.80}
\]

(by (A.38), (A.40) (again with $H \ge 25$) and (A.30)).

Applying Lemma 9.1.4, we obtain that the sum $\sum_\rho |G_0(\rho)| x^{\rho}$ (where $G_0(\rho) = M\eta_{+,2}(\rho)$) over all non-trivial zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most $x^{1/2}$ times

\[
(1.54189\log x + 0.8129)\sqrt{T_0}\log T_0 + (4.21245\log x + 6.17301)\sqrt{T_0} + 49.1\log x + 17.2, \tag{9.81}
\]

where we are bounding norms by (9.79), (9.78) and (9.80). (We are using the fact that $T_0 \ge 2\pi\sqrt{e}$ to ensure that the quantity $\sqrt{T_0}\log T_0 - (\log 2\pi\sqrt{e})\sqrt{T_0}$ being multiplied by $|\eta_{+,2}|_2$ is positive; thus, an upper bound for $|\eta_{+,2}|_2$ suffices.) By the assumptions $x \ge 10^8$, $T_0 \ge 200$, (9.81) is at most

\[
(2.445\sqrt{T_0}\log T_0 + 50.034)\log x.
\]

In comparison, $19.064\,x^{-1/2}\log x \le 0.002\log x$, since $x \ge 10^8$.

It remains to bound the sum of $M\eta_{+,2}(\rho)$ over zeros with $|\Im(\rho)| > T_0$. This we will do, as usual, by Lemma 9.1.3. For that, we will need to bound $M\eta_{+,2}(\rho)$ for $\rho$ in the critical strip.
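The simplification of (9.81) under $x \ge 10^8$, $T_0 \ge 200$ can be spot-checked numerically. The sketch below is only a sanity check at sample points, not a proof; the function names are ours and do not come from the text.

```python
import math

def bound_9_81(x, T0):
    """The bound (9.81), before simplification."""
    L = math.log(x)
    s = math.sqrt(T0)
    return ((1.54189 * L + 0.8129) * s * math.log(T0)
            + (4.21245 * L + 6.17301) * s
            + 49.1 * L + 17.2)

def simplified_bound(x, T0):
    """The simplified bound (2.445 sqrt(T0) log T0 + 50.034) log x."""
    return (2.445 * math.sqrt(T0) * math.log(T0) + 50.034) * math.log(x)

# Sample points satisfying x >= 10^8 and T0 >= 200 (an arbitrary choice of grid).
samples = [(x, T0) for x in (1e8, 1e12, 1e27) for T0 in (200.0, 450.0, 1e4, 1e8)]
checks = [bound_9_81(x, T0) <= simplified_bound(x, T0) for x, T0 in samples]
print(all(checks))
```

The corner $x = 10^8$, $T_0 = 200$ is the tightest of these sample points, which is consistent with the two assumptions being stated as lower bounds.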


The Mellin transform of $e^{-t^2}$ is $\Gamma(s/2)/2$, and so the Mellin transform of $t^2 e^{-t^2}$ is $\Gamma(s/2+1)/2$. By (2.10), this implies that the Mellin transform of $(\log t)\,t^2 e^{-t^2}$ is $\Gamma'(s/2+1)/4$. Hence, by (2.9),

\[
M\eta_{+,2}(s) = \frac{1}{4\pi}\int_{-\infty}^{\infty} M(h_H^2)(ir)\cdot F_x(s - ir)\,dr, \tag{9.82}
\]

where

\[
F_x(s) = (\log x)\,\Gamma\!\left(\frac{s}{2}+1\right) + \frac{1}{2}\,\Gamma'\!\left(\frac{s}{2}+1\right). \tag{9.83}
\]
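The three Mellin transforms quoted above can be checked numerically at real values of $s$. The following is a minimal sketch using a midpoint rule and a finite-difference derivative of the Gamma function; the helper names, the truncation point and the step sizes are our own choices, not anything prescribed by the text.

```python
import math

def mellin(f, s, upper=12.0, n=60_000):
    """Midpoint-rule approximation of int_0^upper f(t) t^(s-1) dt.

    Truncating at `upper` is harmless here: every integrand used below
    decays like exp(-t^2).
    """
    h = upper / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += f(t) * t ** (s - 1)
    return h * total

def gamma_prime(z, eps=1e-5):
    """Central-difference approximation to Gamma'(z)."""
    return (math.gamma(z + eps) - math.gamma(z - eps)) / (2 * eps)

def mellin_errors(s):
    """Absolute errors of the three closed forms at a real point s > 0."""
    e1 = abs(mellin(lambda t: math.exp(-t * t), s) - math.gamma(s / 2) / 2)
    e2 = abs(mellin(lambda t: t * t * math.exp(-t * t), s)
             - math.gamma(s / 2 + 1) / 2)
    e3 = abs(mellin(lambda t: math.log(t) * t * t * math.exp(-t * t), s)
             - gamma_prime(s / 2 + 1) / 4)
    return e1, e2, e3

print(mellin_errors(2.0))
```

Only real $s$ is tested here; the identities extend to complex $s$ by analytic continuation, which this sketch does not attempt to verify.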

Moreover,

\[
M(h_H^2)(ir) = \frac{1}{2\pi}\int_{-\infty}^{\infty} Mh_H(iu)\,Mh_H(i(r-u))\,du, \tag{9.84}
\]

and so $M(h_H^2)(ir)$ is supported on $[-2H, 2H]$. We also see that $|Mh_H^2(ir)|_1 \le |Mh_H(ir)|_2^2/2\pi$. We know that $|Mh_H(ir)|_2^2/2\pi \le 41.73727$ by (A.17).

Hence

\[
\begin{aligned}
|M\eta_{+,2}(s)| &\le \frac{1}{4\pi}\int_{-\infty}^{\infty} |M(h_H^2)(ir)|\,dr \cdot \max_{|r|\le 2H} |F_x(s-ir)|\\
&\le \frac{41.73727}{4\pi}\cdot \max_{|r|\le 2H} |F_x(s-ir)| \le 3.32135\cdot \max_{|r|\le 2H} |F_x(s-ir)|.
\end{aligned}
\tag{9.85}
\]

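By (8.51) (Stirling with explicit constants),

\[
|\Gamma(s)| \le \sqrt{2\pi}\,|s|^{\sigma-\frac12}\, e^{\frac{1}{12|s|}+\frac{\sqrt{2}}{180|s|^3}}\, e^{-\pi|\Im(s)|/2} \tag{9.86}
\]

when $\Re(s) \ge 0$, and so

\[
\begin{aligned}
|\Gamma(s)| &\le \sqrt{2\pi}\left(\frac{\sqrt{12.5^2+1.5^2}}{12.5}\right) e^{\frac{1}{12\cdot 12.5}+\frac{\sqrt{2}}{180\cdot 12.5^3}}\cdot |\Im(s)|\,e^{-\pi|\Im(s)|/2}\\
&\le 2.542\,|\Im(s)|\,e^{-\pi|\Im(s)|/2}
\end{aligned}
\tag{9.87}
\]

for $s \in \mathbb{C}$ with $0 < \Re(s) \le 3/2$ and $|\Im(s)| \ge 25/2$. Moreover, by [OLBC10, 5.11.2] and the remarks at the beginning of [OLBC10, 5.11(ii)],

\[
\frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O^*\!\left(\frac{1}{12|s|^2}\cdot\frac{1}{\cos^3(\theta/2)}\right)
\]

for $|\arg(s)| < \theta$ ($\theta \in (-\pi, \pi)$). Again, for $s = \sigma + i\tau$ with $0 < \sigma \le 3/2$ and $|\tau| \ge 25/2$, this gives us

\[
\begin{aligned}
\frac{\Gamma'(s)}{\Gamma(s)} &= \log|\tau| + \log\frac{\sqrt{|\tau|^2+1.5^2}}{|\tau|} + O^*\!\left(\frac{1}{2|\tau|}\right) + O^*\!\left(\frac{1}{12|\tau|^2}\cdot\frac{1}{(1/\sqrt{2})^3}\right)\\
&= \log|\tau| + O^*\!\left(\frac{9}{8|\tau|^2}+\frac{1}{2|\tau|}\right) + \frac{O^*(0.236)}{|\tau|^2}\\
&= \log|\tau| + O^*\!\left(\frac{0.609}{|\tau|}\right).
\end{aligned}
\]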
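The numerical constants 2.542, 0.236 and 0.609 appearing above follow from elementary arithmetic at the extreme value $|\tau| = 12.5$. A short sketch that recomputes them (the variable names are ours):

```python
import math

tau_min = 12.5  # smallest |Im(s)| allowed in (9.87)

# Constant in (9.87): sqrt(2 pi) * (sqrt(12.5^2 + 1.5^2) / 12.5)
#                     * exp(1/(12*12.5) + sqrt(2)/(180*12.5^3)).
stirling_const = (math.sqrt(2 * math.pi)
                  * math.sqrt(tau_min ** 2 + 1.5 ** 2) / tau_min
                  * math.exp(1 / (12 * tau_min)
                             + math.sqrt(2) / (180 * tau_min ** 3)))

# Coefficient of 1/|tau|^2 coming from the term (1/12) / (1/sqrt 2)^3.
sec_term = (1 / 12) / (1 / math.sqrt(2)) ** 3

# All error terms written as (constant)/|tau|, evaluated at |tau| = 12.5:
# 9/(8|tau|^2) + 1/(2|tau|) + 0.236/|tau|^2.
digamma_const = 9 / (8 * tau_min) + 1 / 2 + 0.236 / tau_min

print(stirling_const, sec_term, digamma_const)
```

Each of the three printed values should sit just below the rounded constant used in the text.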


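Hence, for $0 \le \Re(s) \le 1$ (or, in fact, $-2 \le \Re(s) \le 1$) and $|\Im(s)| \ge 25$, writing $\tau = \Im(s)$,

\[
\begin{aligned}
|F_x(s)| &\le \left((\log x) + \frac12\log\left|\frac{\tau}{2}\right| + \frac12\,O^*\!\left(\frac{0.609}{|\tau/2|}\right)\right)\left|\Gamma\!\left(\frac{s}{2}+1\right)\right|\\
&\le 2.542\left((\log x) + \frac12\log|\tau| - 0.297\right)\frac{|\tau|}{2}\,e^{-\pi|\tau|/4}.
\end{aligned}
\tag{9.88}
\]

Thus, by (9.85), for $\rho = \sigma + i\tau$ with $|\tau| \ge T_0 \ge 2H + 25$ and $0 \le \sigma \le 1$,

\[
|M\eta_{+,2}(\rho)| \le f(|\tau|),
\]

where

\[
f(T) = 8.45\left(\log x + \frac12\log T\right)\left(\frac{T}{2} - H\right)\cdot e^{-\pi(T-2H)/4}. \tag{9.89}
\]

The functions $t \mapsto te^{-\pi t/2}$ and $t \mapsto (\log t)\,te^{-\pi t/2}$ are decreasing for $t \ge e$ (or, in fact, for $t \ge 1.762$); setting $t = T/2 - H$, we see that the right side of (9.89) is a decreasing function of $T$ for $T \ge T_0$, since $T_0/2 - H \ge 25/2 > e$.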
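The threshold 1.762 for monotonicity can be located numerically: the derivative of $(\log t)\,t\,e^{-\pi t/2}$ has the sign of $1 + \log t - (\pi/2)\,t\log t$, which is decreasing for $t \ge 1$. A minimal sketch with a hand-written bisection (our own helper, not a method from the text):

```python
import math

def d_log_term(t):
    """Derivative of (log t) * t * exp(-pi t / 2), divided by exp(-pi t / 2) > 0."""
    return 1 + math.log(t) - (math.pi / 2) * t * math.log(t)

def bisect(g, lo, hi, iters=80):
    """Plain bisection for a sign change of g on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (g(lo) > 0) == (g(mid) > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The derivative is positive at t = 1 and negative at t = 3.
turning_point = bisect(d_log_term, 1.0, 3.0)
print(turning_point)
```

The simpler function $t e^{-\pi t/2}$ has derivative $(1 - \pi t/2)e^{-\pi t/2}$, so it is decreasing already for $t \ge 2/\pi$.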

We can now apply Lemma 9.1.3, and get that

\[
\sum_{\rho:\,|\Im(\rho)| > T_0} |M\eta_{+,2}(\rho)| \le \int_{T_0}^{\infty} f(T)\left(\frac{1}{2\pi}\log\frac{T}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.90}
\]

Since $T \ge T_0 \ge 75 > 2$, we know that $((1/2\pi)\log(T/2\pi) + 1/4T) \le (1/2\pi)\log T$. Hence, the right side of (9.90) is at most

839

int infinT0

((log x)(log T ) +

(log T )2

2

)(T minus 2H)eminus

π(Tminus2H)4 dT

le 0668

int infinT1

((log x)

(log t+

2H

t

)+

((log t)2

2+ 2H

log t

t

))teminus

πt4 dt

(991)

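where $T_1 = T_0 - 2H$ and $t = T - 2H$; we are using the facts that $(\log t)'' < 0$ for $t > 0$ and $((\log t)^2)'' < 0$ for $t > e$. (Of course, $T_1 \ge 25 > e$.)

Of course, $\int_{T_1}^{\infty} e^{-(\pi/4)t}\,dt = (4/\pi)e^{-(\pi/4)T_1}$. We recall (9.36) and (9.50):

\[
\begin{aligned}
\int_{T_1}^{\infty} \log t\cdot e^{-\frac{\pi}{4}t}\,dt &\le \left(\log T_1 + \frac{4/\pi}{T_1}\right)\frac{e^{-\frac{\pi}{4}T_1}}{\pi/4},\\
\int_{T_1}^{\infty} (\log t)\,t\,e^{-\frac{\pi}{4}t}\,dt &\le \left(T_1 + \frac{4a}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}\log T_1}{\pi/4}
\end{aligned}
\]

for $T_1 \ge 1$ satisfying $\log T_1 > 4/(\pi T_1)$, where $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$. It is easy to check that $\log T_1 > 4/(\pi T_1)$ and $4a/\pi \le 1.6957$ for $T_1 \ge 25$; of course, we also have $(4/\pi)/25 \le 0.051$. Lastly,

\[
\int_{T_1}^{\infty} (\log t)^2\,t\,e^{-\frac{\pi}{4}t}\,dt \le \left(T_1 + \frac{4b}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}(\log T_1)^2}{\pi/4}
\]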


for $T_1 \ge e$, where $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$, and we check that $4b/\pi \le 2.1319$ for $T_1 \ge 25$. We conclude that the integral on the second line of (9.91) is at most

\[
\frac{4}{\pi}\left(\frac{(\log T_1)^2}{2}(T_1 + 2.132) + (\log x)(\log T_1)(T_1 + 1.696)\right)e^{-\frac{\pi}{4}T_1}
+ \frac{4}{\pi}\cdot 2H\,(\log T_1 + 0.051 + \log x)\,e^{-\frac{\pi}{4}T_1}.
\]

Multiplying this by 0.668 and simplifying further (using $T_1 \ge 25$), we conclude that $\sum_{\rho:\,|\Im(\rho)| > T_0} |M\eta_{+,2}(\rho)|$ is at most

\[
\left((0.462\log T_1 + 0.909\log x)(\log T_1)\,T_1 + 1.71(\log T_1 + \log x)\,H\right)e^{-\frac{\pi}{4}T_1}.
\]

9.6 A verification of zeros and its consequences

David Platt verified in his doctoral thesis [Pla11] that, for every primitive character $\chi$ of conductor $q \le 10^5$, all the non-trivial zeroes of $L(s,\chi)$ with imaginary part $\le 10^8/q$ lie on the critical line, i.e., have real part exactly 1/2. (We call this a GRH verification up to $10^8/q$.)

In work undertaken in coordination with the present work [Plab], Platt has extended these computations to

• all odd $q \le 3\cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \le 4\cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5\cdot 10^7/q)$.

The method used was rigorous; its implementation uses interval arithmetic.

Let us see what this verification gives us when used as an input to Prop. 9.2.2. We are interested in bounds on $|\mathrm{err}_{\eta,\chi^*}(\delta,x)|$ for $q \le r$ and $|\delta| \le 4r/q$. We set $r = 3\cdot 10^5$. (We will not be using the verification for $q$ even with $3\cdot 10^5 < q \le 4\cdot 10^5$, though we certainly could.)

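We let $T_0 = 10^8/q$. Thus

\[
T_0 \ge \frac{10^8}{3\cdot 10^5} = \frac{1000}{3},\qquad
\frac{T_0}{\pi|\delta|} \ge \frac{10^8/q}{\pi\cdot 4r/q} = \frac{1000}{12\pi}, \tag{9.92}
\]

and so, by $|\delta| \le 4r/q \le 1.2\cdot 10^6/q \le 1.2\cdot 10^6$,

\[
3.53\,e^{-0.1598\,T_0} \le 2.597\cdot 10^{-23},\qquad
\frac{22.5\,\delta^2}{T_0}\,e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}} \le |\delta|\cdot 7.715\cdot 10^{-34} \le 9.258\cdot 10^{-28}.
\]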
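The two exponentially small constants just obtained can be recomputed directly from the lower bounds in (9.92). The sketch below does so in floating point (not in interval arithmetic, so it only confirms the constants up to rounding); the variable names are ours.

```python
import math

T0_min = 1000 / 3                  # lower bound for T0 from (9.92)
ratio_min = 1000 / (12 * math.pi)  # lower bound for T0 / (pi |delta|) from (9.92)
delta_max = 1.2e6                  # upper bound for |delta|

# First term: decreasing in T0, so the worst case is T0 = 1000/3.
first_term = 3.53 * math.exp(-0.1598 * T0_min)

# Second term: delta^2 / T0 = |delta| * (|delta| / T0) <= |delta| / (pi * ratio),
# and the whole expression is decreasing in the ratio T0 / (pi |delta|).
per_delta = 22.5 / (math.pi * ratio_min) * math.exp(-0.1065 * ratio_min ** 2)
second_term = delta_max * per_delta

print(first_term, per_delta, second_term)
```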


Since $qT_0 \le 10^8$, this gives us that

\[
\log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598\,T_0} + \frac{22.5\,\delta^2}{T_0}\,e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}}\right)
\le 4.3054\cdot 10^{-22} + \frac{1.54\cdot 10^{-26}}{q} \le 4.306\cdot 10^{-22}.
\]

Again by $T_0 = 10^8/q$,

\[
2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38
\]

is at most

\[
\frac{648662}{\sqrt{q}} + 111,
\]

and

\[
3\log q + 14|\delta| + 17 \le 55 + \frac{1.7\cdot 10^7}{q},\qquad
(\log q + 6)\cdot(1 + 5|\delta|) \le 19 + \frac{1.2\cdot 10^8}{q}.
\]

Hence, assuming $x \ge 10^8$ to simplify, we see that Prop. 9.2.2 gives us that

\[
\begin{aligned}
\mathrm{err}_{\eta,\chi}(\delta,x) &\le 4.306\cdot 10^{-22} + \frac{\frac{648662}{\sqrt{q}} + 111}{\sqrt{x}} + \frac{55 + \frac{1.7\cdot 10^7}{q}}{x} + \frac{19 + \frac{1.2\cdot 10^8}{q}}{x^{3/2}}\\
&\le 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right)
\end{aligned}
\]

for $\eta(t) = e^{-t^2/2}$. This proves Theorem 7.1.1.

Let us now see what Platt's calculations give us when used as an input to Prop. 9.3.2 and Cor. 9.3.3. Again, we set $r = 3\cdot 10^5$, $\delta_0 = 8$, $|\delta| \le 4r/q$ and $T_0 = 10^8/q$, so (9.92) is still valid. We obtain

\[
\begin{aligned}
T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598\,T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)
&\le \log\frac{10^8}{2\pi}\left(6.11\cdot\frac{1000}{3}\,e^{-0.1598\cdot\frac{1000}{3}} + 10^8\cdot 1.578\,e^{-0.1065\left(\frac{1000}{12\pi}\right)^2}\right)\\
&\le 2.485\cdot 10^{-19},
\end{aligned}
\]

since $t\exp(-0.1598t)$ is decreasing on $t$ for $t \ge 1/0.1598$. We use the same bound when we have 0.0102 instead of 1.578 on the left side, as in (9.61). (The coefficient affects what is by far the smaller term, so we are wasting nothing.) Again by $T_0 = 10^8/q$ and $q \le r$,

\[
\begin{aligned}
1.22\sqrt{T_0}\log qT_0 + 5.053\sqrt{T_0} + 1.423\log q + 37.19 &\le \frac{279793}{\sqrt{q}} + 55.2,\\
1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17 &\le \frac{378854}{\sqrt{q}} + 75.9.
\end{aligned}
\]


For $x \ge 10^8$, we use $|\delta| \le 4r/q \le 1.2\cdot 10^6/q$ to bound

\[
\begin{aligned}
(3 + 11|\delta|)\,x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2} &\le \left(0.0004 + \frac{1322}{q}\right)x^{-1/2},\\
(6 + 22|\delta|)\,x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2} &\le \left(0.0007 + \frac{2644}{q}\right)x^{-1/2}.
\end{aligned}
\]

Summing, we obtain

\[
\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{281200}{\sqrt{q}} + 56\right)
\]

for $\eta(t) = t^2e^{-t^2/2}$, and

\[
\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right)
\]

for $\eta(t) = t^2e^{-t^2/2} *_M \eta_2(t)$. This proves Theorem 7.1.2 and Corollary 7.1.3.

Now let us work with the smoothing weight $\eta_+$. This time around, set $r = 150000$ if $q$ is odd, and $r = 300000$ if $q$ is even. As before, we assume

\[
q \le r,\qquad |\delta| \le 4r/q.
\]

We can see that Platt's verification [Plab], mentioned before, allows us to take

\[
T_0 = H + \frac{250r}{q},\qquad H = 200,
\]

since $T_q$ is always at least this ($T_q = 10^8/q \ge 200 + 7\cdot 10^7/q > 200 + 3.75\cdot 10^7/q$ for $q \le 150000$ odd, $T_q \ge 200 + 7.5\cdot 10^7/q$ for $q \le 300000$ even).
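The claim that $T_0 = H + 250r/q$ never exceeds the verified height $T_q$ can be confirmed by brute force over all admissible $q$. The sketch below encodes the heights exactly as quoted above from [Plab]; the function names are ours.

```python
def platt_height(q):
    """Verified height T_q for conductor q, as quoted in the text from [Plab]."""
    if q % 2 == 1:
        if q > 3 * 10**5:
            raise ValueError("odd q outside the verified range")
        return 1e8 / q
    if q > 4 * 10**5:
        raise ValueError("even q outside the verified range")
    return max(1e8 / q, 200 + 7.5e7 / q)

def chosen_height(q, H=200):
    """The height T0 = H + 250 r / q used for the weight eta_+."""
    r = 150000 if q % 2 == 1 else 300000
    if q > r:
        raise ValueError("q must be at most r")
    return H + 250 * r / q

admissible = [q for q in range(1, 300001) if q % 2 == 0 or q <= 150000]
within_range = all(chosen_height(q) <= platt_height(q) for q in admissible)
print(within_range)
```

For even $q$ the two heights can coincide, which is why the comparison is non-strict.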

Thus,

\[
T_0 - H = \frac{250r}{q} \ge \frac{250r}{r} = 250,\qquad
\frac{T_0 - H}{\pi\delta} \ge \frac{250r}{\pi\delta q} \ge \frac{250}{4\pi} = 19.89436\ldots
\]

and also

\[
T_0 \le 200 + 250\cdot 150000 \le 3.751\cdot 10^7,\qquad qT_0 \le rH + 250r \le 1.35\cdot 10^8.
\]

Hence, since $\sqrt{t}\,e^{-0.1598t}$ is decreasing on $t$ for $t \ge 1/(2\cdot 0.1598)$,

\[
\begin{aligned}
11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147\,|\delta|\,e^{-0.1065\frac{(T_0-H)^2}{(\pi\delta)^2}}
&\le 7.9854\cdot 10^{-16} + \frac{4r}{q}\cdot 7.9814\cdot 10^{-18}\\
&\le 7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}.
\end{aligned}
\]
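As a floating-point sanity check (again not a rigorous interval computation), the constants in the last display can be recomputed from $T_0 - H \ge 250$ and $(T_0 - H)/(\pi|\delta|) \ge 250/(4\pi)$. The variable names below are ours.

```python
import math

gap_min = 250.0                    # lower bound for T0 - H
ratio_min = 250 / (4 * math.pi)    # lower bound for (T0 - H) / (pi |delta|)
r_max = 300000                     # largest value of r (q even)

# sqrt(t) exp(-0.1598 t) is decreasing for t >= 1/(2 * 0.1598), so t = 250 is the worst case.
first_term = 11.308 * math.sqrt(gap_min) * math.exp(-0.1598 * gap_min)

# Coefficient multiplying |delta| in the second term.
per_delta = 16.147 * math.exp(-0.1065 * ratio_min ** 2)

# Using |delta| <= 4 r / q gives the coefficient of 1/q.
per_inverse_q = 4 * r_max * per_delta

print(first_term, per_delta, per_inverse_q)
```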


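Examining (9.70), we get

\[
\begin{aligned}
\mathrm{err}_{\eta_+,\chi}(\delta,x) &\le \log\frac{1.35\cdot 10^8}{2\pi}\cdot\left(7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}\right)\\
&\quad + \left(\left(1.634\log(1.35\cdot 10^8) + 12.43\right)\frac{\sqrt{1.35\cdot 10^8}}{\sqrt{q}} + 1.321\log 300000 + 34.51\right)\frac{1}{\sqrt{x}}\\
&\quad + \left(9 + 11\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-1} + (\log 300000)\left(11 + 6\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-3/2}\\
&\le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q}\\
&\quad + \left(\frac{499845}{\sqrt{q}} + 51.17 + \frac{13.2\cdot 10^6}{q\sqrt{x}} + \frac{9}{\sqrt{x}} + \frac{9.1\cdot 10^7}{qx} + \frac{139}{x}\right)\frac{1}{\sqrt{x}}.
\end{aligned}
\]

Making the assumption $x \ge 10^{12}$, we obtain

\[
\mathrm{err}_{\eta_+,\chi}(\delta,x) \le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \left(\frac{499900}{\sqrt{q}} + 52\right)\frac{1}{\sqrt{x}}.
\]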

This proves Theorem 7.1.4 for general $q$.

Let us optimize things a little more carefully for the trivial character $\chi_T$. Again, we will make the assumption $x \ge 10^{12}$. We will also assume, as we did before, that $|\delta| \le 4r/q$; this now gives us $|\delta| \le 600000$, since $q = 1$ and $r = 150000$ for $q$ odd. We will go up to a height $T_0 = H + 600000\pi\cdot t$, where $H = 200$ and $t \ge 10$. Then

\[
\frac{T_0 - H}{\pi\delta} = \frac{600000\pi t}{4\pi r} \ge t.
\]

Hence

\[
11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147\,|\delta|\,e^{-0.1065\frac{(T_0-H)^2}{(\pi\delta)^2}}
\le 10^{-1300000} + 9689000\,e^{-0.1065t^2}.
\]

Looking at (9.70), we get

\[
\begin{aligned}
\mathrm{err}_{\eta_+,\chi_T}(\delta,x) &\le \log\frac{T_0}{2\pi}\cdot\left(10^{-1300000} + 9689000\,e^{-0.1065t^2}\right)\\
&\quad + \left((1.634\log T_0 + 12.43)\sqrt{T_0} + 34.51\right)x^{-1/2} + 6600009\,x^{-1}.
\end{aligned}
\]

The value $t = 20$ seems good enough; we choose it because it is not far from optimal for $x \sim 10^{27}$. We get that $T_0 = 12000000\pi + 200$; since $T_0 < 10^8$, we are within the range of the computations in [Plab] (or, for that matter, [Wed03] or [Plaa]). We obtain

\[
\mathrm{err}_{\eta_+,\chi_T}(\delta,x) \le 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}.
\]
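The final constants for the trivial character can be recomputed from the choice $t = 20$, $H = 200$, $x \ge 10^{12}$. In the sketch below the term $10^{-1300000}$ is dropped, since it underflows to zero in floating point and is far below the precision of the result; the variable names are ours.

```python
import math

H, t = 200.0, 20.0
T0 = H + 600000 * math.pi * t      # = 12000000*pi + 200
x_min = 1e12                       # assumption x >= 10^12

# Contribution of the high zeros (the 10^{-1300000} term is negligible and omitted).
zero_term = math.log(T0 / (2 * math.pi)) * 9689000 * math.exp(-0.1065 * t ** 2)

# Coefficient of x^{-1/2}; the x^{-1} term is rewritten as (6600009 / sqrt(x)) * x^{-1/2}.
sqrt_coeff = ((1.634 * math.log(T0) + 12.43) * math.sqrt(T0) + 34.51
              + 6600009 / math.sqrt(x_min))

print(T0, zero_term, sqrt_coeff)
```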

Lastly, let us look at the sum estimated in (9.72). Here it will be enough to go up to just $T_0 = 2H + \max(50, H/4) = 450$, where, as before, $H = 200$. Of course, the verification of the zeros of the Riemann zeta function does go that far; as we already said, it goes until $10^8$ (or rather more: see [Wed03] and [Plaa]). We make, again, the assumption $x \ge 10^{12}$. We look at (9.73) and obtain that $\mathrm{err}_{\ell_2,\eta_+}$ is at most

\[
\begin{aligned}
&\left(\left(0.462\,\frac{(\log 50)^2}{\log 10^{12}} + 0.909\log 50\right)\cdot 50 + 1.71\left(1 + \frac{\log 50}{\log 10^{12}}\right)\cdot 200\right)e^{-\frac{\pi}{4}\cdot 50}\\
&\quad + (2.445\sqrt{450}\log 450 + 50.04)\cdot x^{-1/2}\\
&\le 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}}.
\end{aligned}
\tag{9.93}
\]

It remains only to estimate the integral in (9.72). First of all,

\[
\begin{aligned}
\int_0^\infty \eta_+^2(t)\log xt\,dt &= \int_0^\infty \eta_\circ^2(t)\log xt\,dt\\
&\quad + 2\int_0^\infty (\eta_+(t) - \eta_\circ(t))\,\eta_\circ(t)\log xt\,dt + \int_0^\infty (\eta_+(t) - \eta_\circ(t))^2\log xt\,dt.
\end{aligned}
\]

The main term will be given by

\[
\int_0^\infty \eta_\circ^2(t)\log xt\,dt = \left(0.64020599736635 + O\!\left(10^{-14}\right)\right)\log x - 0.021094778698867 + O\!\left(10^{-15}\right),
\]

where the integrals were computed rigorously using VNODE-LP [Ned06]. (The integral $\int_0^\infty \eta_\circ^2(t)\,dt$ can also be computed symbolically.) By Cauchy-Schwarz and the triangle inequality,

\[
\begin{aligned}
\int_0^\infty (\eta_+(t) - \eta_\circ(t))\,\eta_\circ(t)\log xt\,dt &\le |\eta_+ - \eta_\circ|_2\,|\eta_\circ(t)\log xt|_2\\
&\le |\eta_+ - \eta_\circ|_2\,(|\eta_\circ|_2\log x + |\eta_\circ\cdot\log|_2)\\
&\le \frac{274.86}{H^{7/2}}\,(0.80013\log x + 0.214)\\
&\le 1.944\cdot 10^{-6}\cdot\log x + 5.2\cdot 10^{-7},
\end{aligned}
\]

where we are using (A.23) and evaluate $|\eta_\circ\cdot\log|_2$ rigorously as above. By (A.23) and (A.24),

\[
\begin{aligned}
\int_0^\infty (\eta_+(t) - \eta_\circ(t))^2\log xt\,dt &\le \left(\frac{274.86}{H^{7/2}}\right)^2\log x + \frac{27428}{H^7}\\
&\le 5.903\cdot 10^{-12}\cdot\log x + 2.143\cdot 10^{-12}.
\end{aligned}
\]

We conclude that

\[
\int_0^\infty \eta_+^2(t)\log xt\,dt = (0.640206 + O^*(1.95\cdot 10^{-6}))\log x - 0.021095 + O^*(5.3\cdot 10^{-7}). \tag{9.94}
\]
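The numerical constants in (9.93) and in the two estimates preceding (9.94) are plain arithmetic in $H = 200$, $T_1 = 50$, $T_0 = 450$ and $x \ge 10^{12}$. A floating-point sketch recomputing them (variable names are ours; this checks the rounding, not the underlying norm bounds from the appendix):

```python
import math

H = 200.0

# Constants behind (9.93), with T1 = 50, T0 = 450 and x = 10^12.
T1, T0, log_x = 50.0, 450.0, math.log(1e12)
tail_9_93 = ((0.462 * math.log(T1) ** 2 / log_x + 0.909 * math.log(T1)) * T1
             + 1.71 * (1 + math.log(T1) / log_x) * H) * math.exp(-math.pi / 4 * T1)
sqrt_coeff_9_93 = 2.445 * math.sqrt(T0) * math.log(T0) + 50.04

# Constants in the Cauchy-Schwarz estimate and in the squared-difference estimate.
c = 274.86 / H ** 3.5
cross_log, cross_const = c * 0.80013, c * 0.214
square_log, square_const = c ** 2, 27428 / H ** 7

print(tail_9_93, sqrt_coeff_9_93)
print(cross_log, cross_const, square_log, square_const)
```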


We add to this the error term $5.123\cdot 10^{-15} + 366.91/\sqrt{x}$ from (9.93), and simplify using the assumption $x \ge 10^{12}$. We obtain

\[
\sum_{n=1}^{\infty} \Lambda(n)(\log n)\,\eta_+^2(n/x) = 0.640206\,x\log x - 0.021095\,x + O^*\!\left(2\cdot 10^{-6}\,x\log x + 366.91\sqrt{x}\log x\right), \tag{9.95}
\]

and so Prop. 9.5.1 gives us Proposition 7.1.5.

As we can see, the relatively large error term $2\cdot 10^{-6}$ comes from the fact that we have wanted to give the main term in (9.72) as an explicit constant, rather than as an integral. This is satisfactory; Prop. 7.1.5 is an auxiliary result that will be needed for one specific purpose in Part III, as opposed to Thms. 7.1.1–7.1.4, which, while crucial for Part III, are also of general applicability and interest.

Part III

The integral over the circle


Chapter 10

The integral over the major arcs

Let

\[
S_\eta(\alpha, x) = \sum_n \Lambda(n)\,e(\alpha n)\,\eta(n/x), \tag{10.1}
\]

where $\alpha \in \mathbb{R}/\mathbb{Z}$, $\Lambda$ is the von Mangoldt function and $\eta : \mathbb{R} \to \mathbb{C}$ is of fast enough decay for the sum to converge.

Our ultimate goal is to bound from below

\[
\sum_{n_1+n_2+n_3=N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_1(n_1/x)\,\eta_2(n_2/x)\,\eta_3(n_3/x), \tag{10.2}
\]

where $\eta_1, \eta_2, \eta_3 : \mathbb{R} \to \mathbb{C}$. Once we know that this is neither zero nor very close to zero, we will know that it is possible to write $N$ as the sum of three primes $n_1, n_2, n_3$ in at least one way; that is, we will have proven the ternary Goldbach conjecture.

As can be readily seen, (10.2) equals

\[
\int_{\mathbb{R}/\mathbb{Z}} S_{\eta_1}(\alpha, x)\,S_{\eta_2}(\alpha, x)\,S_{\eta_3}(\alpha, x)\,e(-N\alpha)\,d\alpha. \tag{10.3}
\]

In the circle method, the set $\mathbb{R}/\mathbb{Z}$ gets partitioned into the set of major arcs $\mathfrak{M}$ and the set of minor arcs $\mathfrak{m}$; the contribution of each of the two sets to the integral (10.3) is evaluated separately.

Our objective here is to treat the major arcs: we wish to estimate

\[
\int_{\mathfrak{M}} S_{\eta_1}(\alpha, x)\,S_{\eta_2}(\alpha, x)\,S_{\eta_3}(\alpha, x)\,e(-N\alpha)\,d\alpha \tag{10.4}
\]

for $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$, where

\[
\mathfrak{M}_{\delta_0,r} = \bigcup_{\substack{q\le r\\ q\ \mathrm{odd}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{2qx},\ \frac{a}{q} + \frac{\delta_0 r}{2qx}\right)
\ \cup \bigcup_{\substack{q\le 2r\\ q\ \mathrm{even}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{qx},\ \frac{a}{q} + \frac{\delta_0 r}{qx}\right) \tag{10.5}
\]

and $\delta_0 > 0$, $r \ge 1$ are given.

In other words, our major arcs will be few (that is, a constant number) and narrow. While [LW02] used relatively narrow major arcs as well, their number, as in all previous proofs of Vinogradov's result, was not bounded by a constant. (In his proof of the five-primes theorem, [Tao14] is able to take a single major arc around 0; this is not possible here.)
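To make the quantity (10.2) concrete, it can be evaluated directly for small $N$. The sketch below is a brute-force illustration only (it has nothing to do with how the sum is actually estimated); the function names and the choice of trivial weights $\eta_j \equiv 1$ in the example are ours.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(2 * p, limit + 1, p):
                is_composite[m] = True
            pk = p
            while pk <= limit:        # Lambda(p^k) = log p for every k >= 1
                lam[pk] = math.log(p)
                pk *= p
    return lam

def ternary_sum(N, x, eta1, eta2, eta3):
    """Brute-force value of the sum (10.2) over n1 + n2 + n3 = N, all n_i >= 1."""
    lam = von_mangoldt_table(N)
    total = 0.0
    for n1 in range(1, N - 1):
        if lam[n1] == 0.0:
            continue
        for n2 in range(1, N - n1):
            n3 = N - n1 - n2
            if lam[n2] == 0.0 or lam[n3] == 0.0:
                continue
            total += (lam[n1] * lam[n2] * lam[n3]
                      * eta1(n1 / x) * eta2(n2 / x) * eta3(n3 / x))
    return total

one = lambda t: 1.0
# For N = 7 the only contributions are the three orderings of 2 + 2 + 3.
print(ternary_sum(7, 7.0, one, one, one))
```

Note that the sum also counts prime powers such as 4 and 9; this is why a lower bound on (10.2) has to be large enough to dominate their contribution before it yields a representation by primes.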

What we are about to see is the general major-arc setup. This is naturally the place where the overlap with the existing literature is largest. Two important differences can nevertheless be singled out.

• The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need estimates for exponential sums on major arcs, and not just at the middle of each major arc. (If there is smoothing, we cannot use summation by parts to reduce the problem of estimating sums to a problem of counting primes in arithmetic progressions, or weighted by characters.)

• Since our L-function estimates for exponential sums will give bounds that are better than the trivial one by only a constant (even if it is a rather large constant), we need to be especially careful when estimating error terms, finding cancellation when possible.

10.1 Decomposition of $S_\eta$ by characters

What follows is largely classical; cf. [HL22] or, say, [Dav67, §26]. The only difference from the literature lies in the treatment of $n$ non-coprime to $q$, and the way in which we show that our exponential sum (10.8) is equal to a linear combination of twisted sums $S_{\eta,\chi^*}$ over primitive characters $\chi^*$. (Non-primitive characters would give us L-functions with some zeroes inconveniently placed on the line $\Re(s) = 0$.)

Write $\tau(\chi, b)$ for the Gauss sum

\[
\tau(\chi, b) = \sum_{a \bmod q} \chi(a)\,e(ab/q) \tag{10.6}
\]

associated to a $b \in \mathbb{Z}/q\mathbb{Z}$ and a Dirichlet character $\chi$ with modulus $q$. We let $\tau(\chi) = \tau(\chi, 1)$. If $(b, q) = 1$, then $\tau(\chi, b) = \chi(b^{-1})\tau(\chi)$.
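The relation $\tau(\chi, b) = \chi(b^{-1})\tau(\chi)$ is easy to test on a small example. The sketch below uses the quadratic character modulo 7 (the Legendre symbol), chosen by us purely for illustration; the helper names are ours, and `pow(b, -1, q)` requires Python 3.8 or later.

```python
import cmath
import math

def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def gauss_sum(chi, q, b=1):
    """tau(chi, b) = sum over a mod q of chi(a) e(ab/q), as in (10.6)."""
    return sum(chi(a) * e(a * b / q) for a in range(q))

q = 7
chi = lambda a: legendre(a, q)
tau = gauss_sum(chi, q)

# Largest deviation from tau(chi, b) = chi(b^{-1}) tau(chi) over all b coprime to q.
max_deviation = max(abs(gauss_sum(chi, q, b) - chi(pow(b, -1, q)) * tau)
                    for b in range(1, q))
print(abs(tau), max_deviation)
```

For this primitive character the printed absolute value agrees with $\sqrt{7}$, in line with the classical fact $|\tau(\chi^*)| = \sqrt{q^*}$ used later in the chapter.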

Recall that χlowast denotes the primitive character inducing a given Dirichlet characterχ Writing

sumχ mod q for a sum over all characters χ of (ZqZ)lowast) we see that for any

a0 isin ZqZ

1

φ(q)

sumχ mod q

τ(χ b)χlowast(a0) =1

φ(q)

sumχ mod q

suma mod q

(aq)=1

χ(a)e(abq)χlowast(a0)

=sum

a mod q

(aq)=1

e(abq)

φ(q)

sumχ mod q

χlowast(aminus1a0) =sum

a mod q

(aq)=1

e(abq)

φ(q)

sumχ mod qprime

χ(aminus1a0)

(107)


where qprime = q gcd(q ainfin0 ) Nowsumχ mod qprime χ(aminus1a0) = 0 unless a = a0 (in which

casesumχ mod qprime χ(aminus1a0) = φ(qprime)) Thus (107) equals

φ(qprime)

φ(q)

suma mod q

(aq)=1

aequiva0 mod qprime

e(abq) =φ(qprime)

φ(q)

sumk mod qqprime

(kqqprime)=1

e

((a0 + kqprime)b

q

)

=φ(qprime)

φ(q)e

(a0b

q

) sumk mod qqprime

(kqqprime)=1

e

(kb

qqprime

)=φ(qprime)

φ(q)e

(a0b

q

)micro(qqprime)

provided that (b q) = 1 (We are evaluating a Ramanujan sum in the last step) Hencefor α = aq + δx q le x (a q) = 1

1

φ(q)

sumχ

τ(χ a)sumn

χlowast(n)Λ(n)e(δnx)η(nx)

equals sumn

micro((q ninfin))

φ((q ninfin))Λ(n)e(αn)η(nx)

Since (a q) = 1 τ(χ a) = χ(a)τ(χ) The factor micro((q ninfin))φ((q ninfin)) equals 1when (n q) = 1 the absolute value of the factor is at most 1 for every n Clearlysum

n(nq)6=1

Λ(n)η(nx

)=sump|q

log psumαge1

η

(pα

x

)

Recalling the definition (101) of Sη(α x) we conclude that

Sη(α x) =1

φ(q)

sumχ mod q

χ(a)τ(χ)Sηχlowast

x x

)+Olowast

2sump|q

log psumαge1

η

(pα

x

)

(108)where

Sηχ(β x) =sumn

Λ(n)χ(n)e(βn)η(nx) (109)

Hence Sη1(α x)Sη2(α x)Sη3(α x)e(minusNα) equals

1

φ(q)3

sumχ1

sumχ2

sumχ3

τ(χ1)τ(χ2)τ(χ3)χ1(a)χ2(a)χ3(a)e(minusNaq)

middot Sη1χlowast1 (δx x)Sη2χlowast2 (δx x)Sη3χlowast3 (δx x)e(minusδNx)

(1010)

plus an error term of absolute value at most

2

3sumj=1

prodjprime 6=j

|Sηjprime (α x)|sump|q

log psumαge1

ηj

(pα

x

) (1011)


We will later see that the integral of (10.11) over $S^1$ is negligible: for our choices of $\eta_j$, it will, in fact, be of size $O(x(\log x)^A)$, $A$ a constant. The error term $O(x(\log x)^A)$ should be compared to the main term, which will be of size about a constant times $x^2$.

In (10.10), we have reduced our problems to estimating $S_{\eta,\chi}(\delta/x, x)$ for $\chi$ primitive; a more obvious way of reaching the same goal would have made (10.11) worse by a factor of about $\sqrt{q}$.

10.2 The integral over the major arcs: the main term

We are to estimate the integral (10.4), where the major arcs $\mathfrak{M}_{\delta_0,r}$ are defined as in (10.5). We will use $\eta_1 = \eta_2 = \eta_+$, $\eta_3(t) = \eta_*(\kappa t)$, where $\eta_+$ and $\eta_*$ will be set later.

We can write

\[
\begin{aligned}
S_{\eta,\chi}(\delta/x, x) = S_\eta(\delta/x, x) &= \int_0^\infty \eta(t/x)\,e(\delta t/x)\,dt + O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x\\
&= \widehat{\eta}(-\delta)\cdot x + O^*(\mathrm{err}_{\eta,\chi_T}(\delta, x))\cdot x
\end{aligned}
\tag{10.12}
\]

for $\chi = \chi_T$ the trivial character, and

\[
S_{\eta,\chi}(\delta/x, x) = O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x \tag{10.13}
\]

for $\chi$ primitive and non-trivial. The estimation of the error terms err will come later; let us focus on (a) obtaining the contribution of the main term, (b) using estimates on the error terms efficiently.

The main term: three principal characters. The main contribution will be given by the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$, where $\chi_0$ is the principal character mod $q$.

The sum $\tau(\chi_0, n)$ is a Ramanujan sum; as is well-known (see, e.g., [IK04, (3.2)]),

\[
\tau(\chi_0, n) = \sum_{d \mid (q,n)} \mu(q/d)\,d. \tag{10.14}
\]

This simplifies to $\mu(q/(q,n))\,\phi((q,n))$ for $q$ square-free. The special case $n = 1$ gives us that $\tau(\chi_0) = \mu(q)$.
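The evaluation (10.14) of the Ramanujan sum can be compared against the defining exponential sum for small moduli. A minimal sketch, with our own helper names and a naive Möbius function:

```python
import cmath
import math

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def ramanujan_direct(q, n):
    """tau(chi_0, n): sum of e(an/q) over a mod q with (a, q) = 1."""
    return sum(cmath.exp(2j * math.pi * a * n / q)
               for a in range(1, q + 1) if math.gcd(a, q) == 1)

def ramanujan_formula(q, n):
    """Right-hand side of (10.14): sum over d | (q, n) of mu(q/d) d."""
    g = math.gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

worst = max(abs(ramanujan_direct(q, n) - ramanujan_formula(q, n))
            for q in range(1, 31) for n in range(0, 31))
print(worst)
```

Setting $n = 1$ in `ramanujan_formula` returns `mobius(q)`, matching the special case $\tau(\chi_0) = \mu(q)$.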

Thus the term in (1010) with χ1 = χ2 = χ3 = χ0 equals

e(minusNaq)φ(q)3

micro(q)3Sη+χlowast0 (δx x)2Sηlowastχlowast0 (δx x)e(minusδNx) (1015)

where of course Sηχlowast0 (α x) = Sη(α x) (since χlowast0 is the trivial character) Summing(1015) for α = aq+δx and a going over all residues mod q coprime to q we obtain

micro(

q(qN)

)φ((qN))

φ(q)3micro(q)3Sη+χlowast0 (δx x)2Sηlowastχlowast0 (δx x)e(minusδNx)


The integral of (1015) over all of M = Mδ0r (see (105)) thus equals

sumqlerq odd

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0r2qx

minus δ0r2qx

S2η+χlowast0

(α x)Sηlowastχlowast0 (α x)e(minusαN)dα

+sumqle2rq even

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0rqx

minus δ0rqxS2η+χlowast0

(α x)Sηlowastχlowast0 (α x)e(minusαN)dα

(1016)The main term in (1016) is

x3 middotsumqlerq odd

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0r2qx

minus δ0r2qx

(η+(minusαx))2ηlowast(minusαx)e(minusαN)dα

+x3 middotsumqle2rq even

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0rqx

minus δ0rqx(η+(minusαx))2ηlowast(minusαx)e(minusαN)dα

(1017)We would like to complete both the sum and the integral Before we should say

that we will want to be able to use smoothing functions η+ whose Fourier transformsare not easy to deal with directly All we want to require is that there be a smoothingfunction η easier to deal with such that η be close to η+ in `2 norm

Assume then that

|η+ minus η|2 le ε0|η|

where η is thrice differentiable outside finitely many points and satisfies η(3) isin L1

Then (1017) equals

x3 middotsumqlerq odd

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0r2qx

minus δ0r2qx

(η(minusαx))2ηlowast(minusαx)e(minusαN)dα

+x3 middotsumqle2rq even

φ((qN))

φ(q)3micro(q)2micro((qN))

int δ0rqx

minus δ0rqx(η(minusαx))2ηlowast(minusαx)e(minusαN)dα

(1018)plus

Olowast

(x2 middot

sumq

micro(q)2

φ(q)2

int infinminusinfin|(η+(minusα))2 minus (η(minusα))2||ηlowast(minusα)|dα

) (1019)


Here (1019) is bounded by 282643x2 (by (C9)) times

|ηlowast(minusα)|infin middot

radicint infinminusinfin|η+(minusα)minus η(minusα)|2dα middot

int infinminusinfin|η+(minusα) + η(minusα)|2dα

le |ηlowast|1 middot |η+ minus η|2|η+ + η|2 = |ηlowast|1 middot |η+ minus η|2|η+ + η|2le |ηlowast|1 middot |η+ minus η|2(2|η|2 + |η+ minus η|2) = |ηlowast|1|η|22 middot (2 + ε0)ε0

Now (1018) equals

x3

int infinminusinfin

(η(minusαx))2ηlowast(minusαx)e(minusαN)sum

q(q2)lemin( δ0r

2|α|x r)micro(q)2=1

φ((qN))

φ(q)3micro((qN))dα

= x3

int infinminusinfin

(η(minusαx))2ηlowast(minusαx)e(minusαN)dα middot

sumqge1

φ((qN))

φ(q)3micro(q)2micro((qN))

minusx3

int infinminusinfin

(η(minusαx))2ηlowast(minusαx)e(minusαN)sum

q(q2)

gtmin( δ0r

2|α|x r)micro(q)2=1

φ((qN))

φ(q)3micro((qN))dα

(1020)The last line in (1020) is bounded1 by

x2|ηlowast|infinint infinminusinfin|η(minusα)|2

sumq

(q2)gtmin( δ0r2|α| r)

micro(q)2

φ(q)2dα (1021)

By (21) (with k = 3) (C16) and (C17) this is at most

x2|ηlowast|1int δ02

minusδ02|η(minusα)|2 431004

rdα

+ 2x2|ηlowast|1int infinδ02

(|η(3) |1

(2πα)3

)2862008|α|

δ0rdα

le |ηlowast|1

(431004|η|22 + 000113

|η(3) |21δ50

)x2

r

It is easy to see that

\[
\sum_{q\ge 1} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\,\mu((q,N)) = \prod_{p \mid N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p \nmid N}\left(1 + \frac{1}{(p-1)^3}\right).
\]
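The identity above is an Euler product: the summand is multiplicative on square-free $q$, so the sum restricted to square-free $q$ built from any finite set of primes equals the corresponding finite product exactly. The sketch below checks this finite version (the function names and the sample values of $N$ are ours; `math.prod` requires Python 3.8 or later).

```python
import math
from itertools import combinations

def singular_sum(N, primes):
    """Sum of phi((q,N)) mu(q)^2 mu((q,N)) / phi(q)^3 over square-free q built from `primes`."""
    total = 0.0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):       # q = product of the subset
            phi_q = math.prod(p - 1 for p in subset)
            common = [p for p in subset if N % p == 0]   # primes dividing (q, N)
            phi_gcd = math.prod(p - 1 for p in common)
            mu_gcd = (-1) ** len(common)
            total += phi_gcd * mu_gcd / phi_q ** 3
    return total

def singular_product(N, primes):
    """Product of the local factors over the same finite set of primes."""
    result = 1.0
    for p in primes:
        if N % p == 0:
            result *= 1 - 1 / (p - 1) ** 2
        else:
            result *= 1 + 1 / (p - 1) ** 3
    return result

small_primes = [2, 3, 5, 7, 11, 13]
print(singular_sum(15, small_primes), singular_product(15, small_primes))
```

For even $N$ the factor at $p = 2$ vanishes, so both sides are zero; this reflects the fact that the three-prime problem is set up for odd $N$.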

1This is obviously crude in that we are bounding φ((qN))φ(q) by 1 We are doing so in order toavoid a potentially harmful dependence on N


Expanding the integral implicit in the definition of f int infininfin

(η(minusαx))2ηlowast(minusαx)e(minusαN)dα =

1

x

int infin0

int infin0

η(t1)η(t2)ηlowast

(N

xminus (t1 + t2)

)dt1dt2

(1022)

(This is standard One rigorous way to obtain (1022) is to approximate the integralover α isin (minusinfininfin) by an integral with a smooth weight at different scales as the scalebecomes broader the Fourier transform of the weight approximates (as a distribution)the δ function Apply Plancherel)

Hence (1017) equals

x2 middotint infin

0

int infin0

η(t1)η(t2)ηlowast

(N

xminus (t1 + t2)

)dt1dt2

middotprodp|N

(1minus 1

(pminus 1)2

)middotprodp-N

(1 +

1

(pminus 1)3

)

(1023)

(the main term) plus

282643|η|22(2 + ε0) middot ε0 +431004|η|22 + 000113

|η(3) |21δ50

r

|ηlowast|1x2 (1024)

Here (1023) is just as in the classical case [IK04 (1910)] except for the fact thata factor of 12 has been replaced by a double integral Later in chapter 11 we will seehow to choose our smoothing functions (and x in terms ofN ) so as to make the doubleintegral as large as possible in comparison with the error terms This is an importantoptimization (We already had a first discussion of this in the introduction see (139)and what follows)

What remains to estimate is the contribution of all the terms of the form $\mathrm{err}_{\eta,\chi}(\delta, x)$ in (10.12) and (10.13). Let us first deal with another matter: bounding the $\ell_2$ norm of $|S_\eta(\alpha, x)|^2$ over the major arcs.

10.3 The $\ell_2$ norm over the major arcs

We can always bound the integral of $|S_\eta(\alpha, x)|^2$ on the whole circle by Plancherel. If we only want the integral on certain arcs, we use the bound in Prop. 12.1.2 (based on work by Ramaré). If these arcs are really the major arcs (that is, the arcs on which we have useful analytic estimates), then we can hope to get better bounds using L-functions. This will be useful both to estimate the error terms in this section and to make the use of Ramaré's bounds more efficient later.


By (108)

suma mod q

gcd(aq)=1

∣∣∣∣Sη (aq +δ

x χ

)∣∣∣∣2

=1

φ(q)2

sumχ

sumχprime

τ(χ)τ(χprime)

suma mod q

gcd(aq)=1

χ(a)χprime(a)

middot Sηχlowast(δx x)Sηχprimelowast(δx x)

+Olowast(

2(1 +radicq)(log x)2|η|infinmax

α|Sη(α x)|+

((1 +

radicq)(log x)2|η|infin

)2)=

1

φ(q)

sumχ

|τ(χ)|2|Sηχlowast(δx x)|2 +Kq1(2|Sη(0 x)|+Kq1)

where

Kq1 = (1 +radicq)(log x)2|η|infin

As is well-known (see eg [IK04 Lem 31])

τ(χ) = micro

(q

qlowast

)χlowast(q

qlowast

)τ(χlowast)

where qlowast is the modulus of χlowast (ie the conductor of χ) and

|τ(χlowast)| =radicqlowast

Using the expressions (1012) and (1013) we obtain

suma mod q

(aq)=1

∣∣∣∣Sη (aq +δ

x x

)∣∣∣∣2 =micro2(q)

φ(q)|η(minusδ)x+Olowast (errηχT (δ x) middot x)|2

+1

φ(q)

sumχ 6=χT

micro2

(q

qlowast

)qlowast middotOlowast

(| errηχ(δ x)|2x2

)+Kq1(2|Sη(0 x)|+Kq1)

=micro2(q)x2

φ(q)

(|η(minusδ)|2 +Olowast (|errηχT (δ x)(2|η|1 + errηχT (δ x))|)

)+Olowast

(maxχ6=χT

qlowast| errηχlowast(δ x)|2x2 +Kq2x

)

where Kq2 = Kq1(2|Sη(0 x)|x+Kq1x)


Thus the integral of |Sη(α x)|2 over M (see (105)) is

sumqlerq odd

suma mod q

(aq)=1

int aq+

δ0r2qx

aqminus

δ0r2qx

|Sη(α x)|2 dα+sumqle2rq even

suma mod q

(aq)=1

int aq+

δ0rqx

aqminus

δ0rqx

|Sη(α x)|2 dα

=sumqlerq odd

micro2(q)x2

φ(q)

int δ0r2qx

minus δ0r2qx

|η(minusαx)|2 dα+sumqle2rq even

micro2(q)x2

φ(q)

int δ0rqx

minus δ0rqx|η(minusαx)|2 dα

+Olowast

(sumq

micro2(q)x2

φ(q)middot gcd(q 2)δ0r

qx

(ET

ηδ0r2

(2|η|1 + ETηδ0r2

)))

+sumqlerq odd

δ0rx

qmiddotOlowast

maxχ mod q

χ 6=χT|δ|leδ0r2q

qlowast| errηχlowast(δ x)|2 +Kq2

x

+sumqle2rq even

2δ0rx

qmiddotOlowast

maxχ mod q

χ 6=χT|δ|leδ0rq

qlowast| errηχlowast(δ x)|2 +Kq2

x

(1025)where

ETηs = max|δ|les

| errηχT (δ x)|

and χT is the trivial character If all we want is an upper bound we can simply remarkthat

xsumqlerq odd

micro2(q)

φ(q)

int δ0r2qx

minus δ0r2qx

|η(minusαx)|2 dα+ xsumqle2rq even

micro2(q)

φ(q)

int δ0rqx

minus δ0rqx|η(minusαx)|2 dα

le

sumqlerq odd

micro2(q)

φ(q)+sumqle2rq even

micro2(q)

φ(q)

|η|22 = 2|η|22sumqlerq odd

micro2(q)

φ(q)

If we also need a lower bound we proceed as follows

Again we will work with an approximation η such that (a) |η minus η|2 is small (b)


η is thrice differentiable outside finitely many points (c) η(3) isin L1 Clearly

xsumqlerq odd

micro2(q)

φ(q)

int δ0r2qx

minus δ0r2qx

|η(minusαx)|2 dα

lesumqlerq odd

micro2(q)

φ(q)

(int δ0r2q

minus δ0r2q

|η(minusα)|2 dα+ 2〈|η| |η minus η|〉+ |η minus η|22

)

=sumqlerq odd

micro2(q)

φ(q)

int δ0r2q

minus δ0r2q

|η(minusα)|2 dα

+Olowast(

1

2log r + 085

)(2 |η|2 |η minus η|2 + |η minus η|22

)

where we are using (C11) and isometry Alsosumqle2rq even

micro2(q)

φ(q)

int δ0rqx

minus δ0rqx|η(minusαx)|2 dα =

sumqlerq odd

micro2(q)

φ(q)

int δ0r2qx

minus δ0r2qx

|η(minusαx)|2 dα

By (21) and Plancherelint δ0r2q

minus δ0r2q

|η(minusα)|2 dα =

int infinminusinfin|η(minusα)|2 dαminusOlowast

(2

int infinδ0r2q

|η(3) |21

(2πα)6dα

)

= |η|22 +Olowast

(|η(3) |21q5

5π6(δ0r)5

)

Hence

sumqlerq odd

micro2(q)

φ(q)

int δ0r2q

minus δ0r2q

|η(minusα)|2 dα = |η|22 middotsumqlerq odd

micro2(q)

φ(q)+Olowast

sumqlerq odd

micro2(q)

φ(q)

|η(3) |21q5

5π6(δ0r)5

Using (C18) we get thatsumqlerq odd

micro2(q)

φ(q)

|η(3) |21q5

5π6(δ0r)5le 1

r

sumqlerq odd

micro2(q)q

φ(q)middot |η

(3) |21

5π6δ50

le |η(3) |21

5π6δ50

middot(

064787 +log r

4r+

0425

r

)

Going back to (1025) we use (C7) to boundsumq

micro2(q)x2

φ(q)

gcd(q 2)δ0r

qxle 259147 middot δ0rx


We also note that sumqlerq odd

1

q+sumqle2rq even

2

q=sumqler

1

qminussumqle r2

1

2q+sumqler

1

q

le 2 log er minus logr

2le log 2e2r

We have proven the following result

Lemma 1031 Let η [0infin) rarr R be in L1 cap Linfin Let Sη(α x) be as in (101) andlet M = Mδ0r be as in (105) Let η [0infin) rarr R be thrice differentiable outsidefinitely many points Assume η(3)

isin L1Assume r ge 182 ThenintM

|Sη(α x)|2dα = Lrδ0x+Olowast(

519δ0xr

(ET

ηδ0r2middot(|η|1 +

ETηδ0r2

2

)))+Olowast

(δ0r(log 2e2r)

(x middot E2

ηrδ0 +Kr2

))

(1026)where

Eηrδ0 = maxχ mod q

qlermiddotgcd(q2)

|δ|legcd(q2)δ0r2q

radicqlowast| errηχlowast(δ x)| ETηs = max

|δ|les| errηχT (δ x)|

Kr2 = (1 +radic

2r)(log x)2|η|infin(2|Sη(0 x)|x+ (1 +radic

2r)(log x)2|η|infinx)(1027)

and Lrδ0 satisfies both

Lrδ0 le 2|η|22sumqlerq odd

micro2(q)

φ(q)(1028)

and

Lrδ0 = 2|η|22sumqlerq odd

micro2(q)

φ(q)+Olowast(log r + 17) middot

(2 |η|2 |η minus η|2 + |η minus η|22

)

+Olowast

(2|η(3) |21

5π6δ50

)middot(

064787 +log r

4r+

0425

r

)

(1029)Here as elsewhere χlowast denotes the primitive character inducing χ whereas qlowast denotesthe modulus of χlowast

The error term xrETηδ0r will be very small since it will be estimated using theRiemann zeta function the error term involving Kr2 will be completely negligibleThe term involving xr(r+1)E2

ηrδ0 we see that it constrains us to have | errηχ(xN)|

less than a constant times 1r if we do not want the main term in the bound (1026) tobe overwhelmed


10.4 The integral over the major arcs: conclusion

There are at least two ways we can evaluate (10.4). One is to substitute (10.10) into (10.4). The disadvantages here are that (a) this can give rise to pages-long formulae, (b) this gives error terms proportional to $xr\,|\mathrm{err}_{\eta,\chi}(x, N)|$, meaning that, to win, we would have to show that $|\mathrm{err}_{\eta,\chi}(x, N)|$ is much smaller than $1/r$. What we will do instead is to use our $\ell_2$ estimate (10.26) in order to bound the contribution of non-principal terms. This will give us a gain of almost $\sqrt{r}$ on the error terms; in other words, to win, it will be enough to show later that $|\mathrm{err}_{\eta,\chi}(x, N)|$ is much smaller than $1/\sqrt{r}$.

The contribution of the error terms in Sη3(α x) (that is all terms involving thequantities errηχ in expressions (1012) and (1013)) to (104) is

sumqlerq odd

1

φ(q)

sumχ3 mod q

τ(χ3)sum

a mod q

(aq)=1

χ3(a)e(minusNaq)

int δ0r2qx

minus δ0r2qx

Sη+(α+ aq x)2 errηlowastχlowast3 (αx x)e(minusNα)dα

+sumqle2rq even

1

φ(q)

sumχ3 mod q

τ(χ3)sum

a mod q

(aq)=1

χ3(a)e(minusNaq)

int δ0rqx

minus δ0rqxSη+(α+ aq x)2 errηlowastχlowast3 (αx x)e(minusNα)dα

(1030)

We should also remember the terms in (1011) we can integrate them over all of RZand obtain that they contribute at most

intRZ

2

3sumj=1

prodjprime 6=j

|Sηjprime (α x)| middotmaxqler

sump|q

log psumαge1

ηj

(pα

x

)dα

le 2

3sumj=1

prodjprime 6=j

|Sηjprime (α x)|2 middotmaxqler

sump|q

log psumαge1

ηj

(pα

x

)

= 2sumn

Λ2(n)η2+(nx) middot log r middotmax

pler

sumαge1

ηlowast

(pα

x

)

+ 4

radicsumn

Λ2(n)η2+(nx) middot

sumn

Λ2(n)η2lowast(nx) middot log r middotmax

pler

sumαge1

ηlowast

(pα

x

)

by Cauchy-Schwarz and Plancherel


The absolute value of (1030) is at most

sumqlerq odd

suma mod q

(aq)=1

int δ0r2qx

minus δ0r2qx

∣∣Sη+(α+ aq x)∣∣2 dα middot max

χ mod q

|δ|leδ0r2q

radicqlowast| errηlowastχlowast(δ x)|

+sumqle2rq even

suma mod q

(aq)=1

int δ0rqx

minus δ0rqx

∣∣Sη+(α+ aq x)∣∣2 dα middot max

χ mod q

|δ|leδ0rq

radicqlowast| errηlowastχlowast(δ x)|

leintMδ0r

∣∣Sη+(α)∣∣2 dα middot max

χ mod q

qlermiddotgcd(q2)

|δ|legcd(q2)δ0rq

radicqlowast| errηlowastχlowast(δ x)|

(10.31)

We can bound the integral of $|S_{\eta_+}(\alpha)|^2$ by (10.26).

What about the contribution of the error part of $S_{\eta_2}(\alpha, x)$? We can obviously proceed in the same way, except that, to avoid double-counting, $S_{\eta_3}(\alpha, x)$ needs to be replaced by

$$\frac{1}{\phi(q)}\tau(\chi_0)\,\hat{\eta}_3(-\delta)\cdot x = \frac{\mu(q)}{\phi(q)}\,\hat{\eta}_3(-\delta)\cdot x, \tag{10.32}$$

which is its main term (coming from (10.12)). Instead of having an $\ell_2$ norm as in (10.31), we have the square root of a product of two squares of $\ell_2$ norms (by Cauchy-Schwarz), namely,

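$$\int_{\mathfrak{M}} |S^*_{\eta_+}(\alpha)|^2\, d\alpha$$

and

$$\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\hat{\eta}_*(-\alpha x)\,x|^2\, d\alpha + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\hat{\eta}_*(-\alpha x)\,x|^2\, d\alpha \le x|\eta_*|_2^2 \cdot \sum_q \frac{\mu^2(q)}{\phi(q)^2}. \tag{10.33}$$

By (C.9), the sum over $q$ is at most $2.82643$.

As for the contribution of the error part of $S_{\eta_1}(\alpha, x)$, we bound it in the same way, using solely the $\ell_2$ norm in (10.33) (and replacing both $S_{\eta_2}(\alpha, x)$ and $S_{\eta_3}(\alpha, x)$ by expressions as in (10.32)).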
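The constant quoted from (C.9) can be sanity-checked numerically. The following is a minimal sketch, not the method of Appendix C: it uses the fact that $\mu^2(q)/\phi(q)^2$ is multiplicative and supported on squarefree $q$, so the sum over all $q$ equals the Euler product $\prod_p (1 + 1/(p-1)^2)$. The function names and the truncation point $10^5$ are choices made for this illustration only.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out p^2, p^2 + p, ... (same length on both sides).
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def partial_euler_product(bound):
    """Product over primes p <= bound of (1 + 1/(p-1)^2).

    Every factor exceeds 1, so the truncated product is a lower
    approximation to the full sum of mu^2(q)/phi(q)^2 over all q.
    """
    prod = 1.0
    for p in primes_up_to(bound):
        prod *= 1.0 + 1.0 / (p - 1) ** 2
    return prod


value = partial_euler_product(10**5)
print(value)
```

Since the truncated product only increases as more primes are included, the printed value should sit slightly below the bound $2.82643$ used in the text.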

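The total of the error terms is thus

$$\begin{aligned}
&x \cdot \max_{\substack{\chi \bmod q\\ q \le r\cdot\gcd(q,2)\\ |\delta| \le \gcd(q,2)\delta_0 r/q}} \sqrt{q^*}\cdot |\mathrm{err}_{\eta_*,\chi^*}(\delta, x)| \cdot A \\
&+ x \cdot \max_{\substack{\chi \bmod q\\ q \le r\cdot\gcd(q,2)\\ |\delta| \le \gcd(q,2)\delta_0 r/q}} \sqrt{q^*}\cdot |\mathrm{err}_{\eta_+,\chi^*}(\delta, x)| \left(\sqrt{A} + \sqrt{B_+}\right)\sqrt{B_*},
\end{aligned} \tag{10.34}$$

where $A = (1/x)\int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha$ (bounded as in (10.26)) and

$$B_* = 2.82643\,|\eta_*|_2^2, \qquad B_+ = 2.82643\,|\eta_+|_2^2. \tag{10.35}$$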


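In conclusion, we have proven:

Proposition 10.4.1. Let $x \ge 1$. Let $\eta_+, \eta_* : [0,\infty) \to \mathbb{R}$. Assume $\eta_+ \in C^2$, $\eta_+'' \in L_2$ and $\eta_+, \eta_* \in L^1 \cap L^2$. Let $\eta : [0,\infty) \to \mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta^{(3)} \in L_1$ and $|\eta_+ - \eta|_2 \le \varepsilon_0 |\eta|_2$, where $\varepsilon_0 \ge 0$.

Let $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n)\eta(n/x)$. Let $\mathrm{err}_{\eta,\chi}$, $\chi$ primitive, be given as in (10.12) and (10.13). Let $\delta_0 > 0$, $r \ge 1$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5). Then, for any $N \ge 0$,

$$\int_{\mathfrak{M}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\, e(-N\alpha)\, d\alpha$$

equals

$$\begin{aligned}
&C_0 C_{\eta,\eta_*} x^2 + \left(2.82643\,|\eta|_2^2 (2 + \varepsilon_0)\cdot \varepsilon_0 + \frac{4.31004\,|\eta|_2^2 + 0.0012\,\frac{|\eta^{(3)}|_1^2}{\delta_0^5}}{r}\right) |\eta_*|_1 x^2 \\
&+ O^*\left(E_{\eta_*,r,\delta_0} A_{\eta_+} + E_{\eta_+,r,\delta_0}\cdot 1.6812\left(\sqrt{A_{\eta_+}} + 1.6812\,|\eta_+|_2\right)|\eta_*|_2\right)\cdot x^2 \\
&+ O^*\left(2 Z_{\eta_+^2,2}(x)\, LS_{\eta_*}(x, r)\cdot x + 4\sqrt{Z_{\eta_+^2,2}(x)\, Z_{\eta_*^2,2}(x)}\; LS_{\eta_+}(x, r)\cdot x\right),
\end{aligned} \tag{10.36}$$

where

$$C_0 = \prod_{p | N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot \prod_{p \nmid N}\left(1 + \frac{1}{(p-1)^3}\right), \qquad C_{\eta,\eta_*} = \int_0^\infty\!\!\int_0^\infty \eta(t_1)\eta(t_2)\,\eta_*\!\left(\frac{N}{x} - (t_1 + t_2)\right) dt_1\, dt_2, \tag{10.37}$$

$$\begin{aligned}
E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q \le \gcd(q,2)\cdot r\\ |\delta| \le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta,\chi^*}(\delta, x)|, \qquad ET_{\eta,s} = \max_{|\delta| \le s/q} |\mathrm{err}_{\eta,\chi_T}(\delta, x)|, \\
A_\eta &= \frac{1}{x}\int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \qquad L_{\eta,r,\delta_0} \le 2|\eta|_2^2 \sum_{\substack{q \le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}, \\
K_{r,2} &= (1 + \sqrt{2r})(\log x)^2 |\eta|_\infty \left(2 Z_{\eta,1}(x)/x + (1 + \sqrt{2r})(\log x)^2 |\eta|_\infty / x\right), \\
Z_{\eta,k}(x) &= \frac{1}{x}\sum_n \Lambda^k(n)\eta(n/x), \qquad LS_\eta(x, r) = \log r \cdot \max_{p \le r} \sum_{\alpha \ge 1} \eta\!\left(\frac{p^\alpha}{x}\right),
\end{aligned} \tag{10.38}$$

and $\mathrm{err}_{\eta,\chi}$ is as in (10.12) and (10.13).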

Here is how to read these expressions. The error term in the first line of (10.36) will be small provided that $\varepsilon_0$ is small and $r$ is large. The third line of (10.36) will be negligible, as will be the term $2\delta_0 r (\log er) K_{r,2}$ in the definition of $A_\eta$. (Clearly, $Z_{\eta,k}(x) \ll_\eta (\log x)^{k-1}$ and $LS_\eta(x, q) \ll_\eta \tau(q)\log x$ for any $\eta$ of rapid decay.)

It remains to estimate the second line of (10.36). This includes estimating $A_\eta$, a task that was already accomplished in Lemma 10.3.1. We see that we will have to give very good bounds for $E_{\eta,r,\delta_0}$ when $\eta = \eta_+$ or $\eta = \eta_*$. We also see that we want to make $C_0 C_{\eta_+,\eta_*} x^2$ as large as possible; it will be competing not just with the error terms here, but, more importantly, with the bounds from the minor arcs, which will be proportional to $|\eta_+|_2^2 |\eta_*|_1$.

Chapter 11

Optimizing and adapting smoothing functions

One of our goals is to maximize the quantity $C_{\eta,\eta_*}$ in (10.37) relative to $|\eta|_2^2 |\eta_*|_1$. One way to do this is to ensure that (a) $\eta_*$ is concentrated on a very short$^1$ interval $[0, \epsilon)$, (b) $\eta$ is supported on the interval $[0, 2]$, and is symmetric around $t = 1$, meaning that $\eta(t) \sim \eta(2 - t)$. Then, for $x \sim N/2$, the integral

$$\int_0^\infty\!\!\int_0^\infty \eta(t_1)\eta(t_2)\,\eta_*\!\left(\frac{N}{x} - (t_1 + t_2)\right) dt_1\, dt_2$$

in (10.37) should be approximately equal to

$$|\eta_*|_1 \cdot \int_0^\infty \eta(t)\,\eta\!\left(\frac{N}{x} - t\right) dt = |\eta_*|_1 \cdot \int_0^\infty \eta(t)^2\, dt = |\eta_*|_1 \cdot |\eta|_2^2, \tag{11.1}$$

provided that $\eta_0(t) \ge 0$ for all $t$. It is easy to check (using Cauchy-Schwarz in the second step) that this is essentially optimal. (We will redo this rigorously in a little while.)

At the same time, the fact is that major-arc estimates are best for smoothing functions $\eta$ of a particular form, and we have minor-arc estimates from Part I for a different specific smoothing $\eta_2$. The issue, then, is how do we choose $\eta$ and $\eta_*$ as above so that

• $\eta_*$ is concentrated on $[0, \epsilon)$,

• $\eta$ is supported on $[0, 2]$ and symmetric around $t = 1$,

• we can give minor-arc and major-arc estimates for $\eta_*$,

• we can give major-arc estimates for a function $\eta_+$ close to $\eta$ in $\ell_2$ norm?

$^1$ This is an idea appearing in work by Bourgain in a related context [Bou99].

11.1 The symmetric smoothing function $\eta$

We will later work with a smoothing function $\eta_\heartsuit$ whose Mellin transform decreases very rapidly. Because of this rapid decay, we will be able to give strong results based on an explicit formula for $\eta_\heartsuit$. The issue is how to define $\eta$, given $\eta_\heartsuit$, so that $\eta$ is symmetric around $t = 1$ (i.e., $\eta(2 - x) \sim \eta(x)$) and is very small for $x > 2$.

We will later set $\eta_\heartsuit(t) = e^{-t^2/2}$. Let

$$h : t \mapsto \begin{cases} t^3 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{11.2}$$

We define $\eta : \mathbb{R} \to \mathbb{R}$ by

$$\eta(t) = h(t)\,\eta_\heartsuit(t) = \begin{cases} t^3 (2 - t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{11.3}$$

It is clear that $\eta$ is symmetric around $t = 1$ for $t \in [0, 2]$.
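The two claims packed into (11.2) and (11.3), namely that $h \cdot \eta_\heartsuit$ has the stated closed form and that the result is symmetric around $t = 1$, can be checked on a grid. This is only an illustrative sketch; the function names and the grid of 201 sample points are choices made here, not part of the text.

```python
import math


def eta_heart(t):
    """The Gaussian eta_heartsuit(t) = exp(-t^2 / 2)."""
    return math.exp(-t * t / 2.0)


def h(t):
    """The weight h from (11.2)."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t) ** 3 * math.exp(t - 0.5)
    return 0.0


def eta(t):
    """eta = h * eta_heartsuit, as in the first equality of (11.3)."""
    return h(t) * eta_heart(t)


def eta_closed(t):
    """Closed form on the right of (11.3)."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t) ** 3 * math.exp(-((t - 1.0) ** 2) / 2.0)
    return 0.0


samples = [k / 100.0 for k in range(0, 201)]
# Largest discrepancy between the product and the closed form.
max_form_gap = max(abs(eta(t) - eta_closed(t)) for t in samples)
# Largest failure of the symmetry eta(t) = eta(2 - t).
max_sym_gap = max(abs(eta_closed(t) - eta_closed(2.0 - t)) for t in samples)
print(max_form_gap, max_sym_gap)
```

Both gaps should be at the level of floating-point rounding, reflecting the identity $t - 1/2 - t^2/2 = -(t-1)^2/2$ in the exponent.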

11.1.1 The product $\eta(t)\eta(\rho - t)$

We now should go back and redo rigorously what we discussed informally around (11.1). More precisely, we wish to estimate

$$(\eta * \eta)(\rho) = \int_{-\infty}^\infty \eta(t)\eta(\rho - t)\, dt = \int_{-\infty}^\infty \eta(t)\eta(2 - \rho + t)\, dt \tag{11.4}$$

for $\rho \le 2$ close to $2$. In this, it will be useful that the Cauchy-Schwarz inequality degrades slowly, in the following sense.

Lemma 11.1.1. Let $V$ be a real vector space with an inner product $\langle \cdot, \cdot \rangle$. Then, for any $v, w \in V$ with $|w - v|_2 \le |v|_2/2$,

$$\langle v, w \rangle = |v|_2 |w|_2 + O^*(2.71\,|v - w|_2^2).$$

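Proof. By a truncated Taylor expansion,

$$\sqrt{1 + x} = 1 + \frac{x}{2} + \frac{x^2}{2}\max_{0 \le t \le 1} \frac{1}{4(1 - (tx)^2)^{3/2}} = 1 + \frac{x}{2} + O^*\!\left(\frac{x^2}{2^{3/2}}\right)$$

for $|x| \le 1/2$. Hence, for $\delta = |w - v|_2 / |v|_2$,

$$\begin{aligned}
\frac{|w|_2}{|v|_2} &= \sqrt{1 + \frac{2\langle w - v, v\rangle + |w - v|_2^2}{|v|_2^2}} = 1 + \frac{2\frac{\langle w - v, v\rangle}{|v|_2^2} + \delta^2}{2} + O^*\!\left(\frac{(2\delta + \delta^2)^2}{2^{3/2}}\right) \\
&= 1 + \delta + O^*\!\left(\left(\frac{1}{2} + \frac{(5/2)^2}{2^{3/2}}\right)\delta^2\right) = 1 + \frac{\langle w - v, v\rangle}{|v|_2^2} + O^*\!\left(2.71\,\frac{|w - v|_2^2}{|v|_2^2}\right).
\end{aligned}$$

Multiplying by $|v|_2^2$, we obtain that

$$|v|_2 |w|_2 = |v|_2^2 + \langle w - v, v\rangle + O^*\!\left(2.71\,|w - v|_2^2\right) = \langle v, w\rangle + O^*\!\left(2.71\,|w - v|_2^2\right).$$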
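As a numerical illustration of Lemma 11.1.1 (not a substitute for its proof), one can sample random pairs $v, w$ in $\mathbb{R}^5$ with $|w - v|_2 \le |v|_2/2$ and record the largest observed value of $\big||v|_2|w|_2 - \langle v, w\rangle\big| / |w - v|_2^2$. The dimension, the number of trials and the seed below are arbitrary choices for this sketch.

```python
import math
import random


def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    return math.sqrt(inner(v, v))


def worst_ratio(trials=2000, dim=5, seed=0):
    """Largest observed | |v||w| - <v,w> | / |w - v|^2 over random pairs
    satisfying the hypothesis |w - v| <= |v| / 2 of Lemma 11.1.1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Rescale d so that |d| is between 1% and 50% of |v|.
        scale = rng.uniform(0.01, 0.5) * norm(v) / norm(d)
        d = [scale * x for x in d]
        w = [a + b for a, b in zip(v, d)]
        gap = abs(norm(v) * norm(w) - inner(v, w))
        worst = max(worst, gap / inner(d, d))
    return worst


ratio = worst_ratio()
print(ratio)
```

The observed ratio should stay below the constant $2.71$ of the lemma; in such experiments it is typically well below it, which suggests the constant is comfortable rather than tight.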

Applying Lemma 11.1.1 to (11.4), we obtain that

$$\begin{aligned}
(\eta * \eta)(\rho) &= \int_{-\infty}^\infty \eta(t)\eta((2 - \rho) + t)\, dt \\
&= \sqrt{\int_{-\infty}^\infty |\eta(t)|^2\, dt}\,\sqrt{\int_{-\infty}^\infty |\eta((2 - \rho) + t)|^2\, dt} + O^*\!\left(2.71 \int_{-\infty}^\infty |\eta(t) - \eta((2 - \rho) + t)|^2\, dt\right) \\
&= |\eta|_2^2 + O^*\!\left(2.71 \int_{-\infty}^\infty \left(\int_0^{2 - \rho} |\eta'(r + t)|\, dr\right)^2 dt\right) \\
&= |\eta|_2^2 + O^*\!\left(2.71\,(2 - \rho)\int_0^{2 - \rho}\!\!\int_{-\infty}^\infty |\eta'(r + t)|^2\, dt\, dr\right) = |\eta|_2^2 + O^*\!\left(2.71\,(2 - \rho)^2 |\eta'|_2^2\right).
\end{aligned} \tag{11.5}$$
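The estimate (11.5) can be illustrated numerically for the specific $\eta$ of (11.3). The sketch below uses composite Simpson quadrature (an assumption of this example, chosen for simplicity) to compare $(\eta * \eta)(\rho)$ with $|\eta|_2^2$ and with the error bound $2.71\,(2-\rho)^2 |\eta'|_2^2$; the derivative is written out by hand from (11.3).

```python
import math


def eta(t):
    """The smoothing function of (11.3)."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t) ** 3 * math.exp(-((t - 1.0) ** 2) / 2.0)
    return 0.0


def eta_prime(t):
    """Derivative of eta on [0, 2] by the product rule; zero elsewhere."""
    if 0.0 <= t <= 2.0:
        poly = t**3 * (2.0 - t) ** 3
        dpoly = 3.0 * t**2 * (2.0 - t) ** 3 - 3.0 * t**3 * (2.0 - t) ** 2
        return (dpoly - (t - 1.0) * poly) * math.exp(-((t - 1.0) ** 2) / 2.0)
    return 0.0


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0


norm_sq = simpson(lambda t: eta(t) ** 2, 0.0, 2.0)         # |eta|_2^2
dnorm_sq = simpson(lambda t: eta_prime(t) ** 2, 0.0, 2.0)  # |eta'|_2^2


def gap_and_bound(rho):
    """Return (|(eta*eta)(rho) - |eta|_2^2|, 2.71 (2-rho)^2 |eta'|_2^2)."""
    conv = simpson(lambda t: eta(t) * eta(rho - t), 0.0, 2.0)
    return abs(conv - norm_sq), 2.71 * (2.0 - rho) ** 2 * dnorm_sq


for rho in (1.8, 1.9, 1.99, 2.0):
    print(rho, gap_and_bound(rho))
```

For each listed $\rho$ the first number should not exceed the second (up to quadrature error), and at $\rho = 2$ both vanish, in line with the symmetry of $\eta$ around $t = 1$.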

We will be working with $\eta_*$ supported on the non-negative reals; we recall that $\eta$ is supported on $[0, 2]$. Hence

$$\begin{aligned}
\int_0^\infty\!\!\int_0^\infty \eta(t_1)\eta(t_2)\,\eta_*\!\left(\frac{N}{x} - (t_1 + t_2)\right) dt_1\, dt_2
&= \int_0^{N/x} (\eta * \eta)(\rho)\,\eta_*\!\left(\frac{N}{x} - \rho\right) d\rho \\
&= \int_0^{N/x} \left(|\eta|_2^2 + O^*\!\left(2.71\,(2 - \rho)^2 |\eta'|_2^2\right)\right)\cdot \eta_*\!\left(\frac{N}{x} - \rho\right) d\rho \\
&= |\eta|_2^2 \int_0^{N/x} \eta_*(\rho)\, d\rho + 2.71\,|\eta'|_2^2 \cdot O^*\!\left(\int_0^{N/x} \left((2 - N/x) + \rho\right)^2 \eta_*(\rho)\, d\rho\right),
\end{aligned} \tag{11.6}$$

provided that $N/x \ge 2$. We see that it will be wise to set $N/x$ very slightly larger than $2$. As we said before, $\eta_*$ will be scaled so that it is concentrated on a small interval $[0, \epsilon)$.

11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

Here the challenge is to define a smoothing function $\eta_*$ that is good both for minor-arc estimates and for major-arc estimates. The two regimes tend to favor different kinds of smoothing function. For minor-arc estimates, we use, as [Tao14] did,

$$\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0) = \left((2 I_{[1/2,1]}) *_M (2 I_{[1/2,1]})\right)(t), \tag{11.7}$$

where $I_{[1/2,1]}(t)$ is $1$ if $t \in [1/2, 1]$ and $0$ otherwise. For major-arc estimates, we will use a function based on

$$\eta_\heartsuit = e^{-t^2/2}.$$

We will actually use here the function $t^2 e^{-t^2/2}$, whose Mellin transform is $M\eta_\heartsuit(s + 2)$ (by, e.g., [BBO10, Table 11.1]).

We will follow the simple expedient of convolving the two smoothing functions, one good for minor arcs, the other one for major arcs. In general, let $\varphi_1, \varphi_2 : [0, \infty) \to \mathbb{C}$. It is easy to use bounds on sums of the form

$$S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x) \tag{11.8}$$

to bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$:

$$S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n)(\varphi_1 *_M \varphi_2)\!\left(\frac{n}{x}\right) = \int_0^\infty \sum_n f(n)\varphi_1\!\left(\frac{n}{wx}\right)\varphi_2(w)\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\frac{dw}{w}. \tag{11.9}$$

The same holds, of course, if $\varphi_1$ and $\varphi_2$ are switched, since $\varphi_1 *_M \varphi_2 = \varphi_2 *_M \varphi_1$. The only objection is that the bounds on (11.8) that we input might not be valid, or non-trivial, when the argument $wx$ of $S_{f,\varphi_1}(wx)$ is very small. Because of this, it is important that the functions $\varphi_1, \varphi_2$ vanish at $0$, and desirable that their first derivatives do so as well.
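The multiplicative (Mellin) convolution $(f *_M g)(t) = \int_0^\infty f(w) g(t/w)\, dw/w$ underlying (11.7) and (11.9) can be evaluated numerically by substituting $w = e^u$, which turns $dw/w$ into $du$. The sketch below, with a midpoint rule and sample points chosen only for illustration, checks the closed form of $\eta_2$ in (11.7) against the convolution of $2 I_{[1/2,1]}$ with itself.

```python
import math


def eta1(t):
    """eta_1 = 2 * indicator function of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2_closed(t):
    """Closed form 4 * max(log 2 - |log 2t|, 0) from (11.7), for t > 0."""
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def mellin_convolution(f, g, t, lo=0.5, hi=1.0, n=20000):
    """Midpoint-rule value of the integral of f(w) g(t/w) dw/w over [lo, hi].

    The interval [lo, hi] should contain the support of f. After the
    substitution w = exp(u), the measure dw/w becomes du.
    """
    a, b = math.log(lo), math.log(hi)
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        w = math.exp(a + (k + 0.5) * step)
        total += f(w) * g(t / w)
    return total * step


points = [0.2, 0.3, 0.4, 0.5, 0.7, 0.9, 1.1]
errors = [abs(mellin_convolution(eta1, eta1, t) - eta2_closed(t)) for t in points]
print(max(errors))
```

Because the integrand is piecewise constant, the midpoint rule is exact except in the one or two cells containing a jump, so the error is of the order of the step size.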

Let us see how this works out in practice for $\varphi_1 = \eta_2$. Here $\eta_2 : [0, \infty) \to \mathbb{R}$ is given by

$$\eta_2 = \eta_1 *_M \eta_1 = 4\max(\log 2 - |\log 2t|, 0), \tag{11.10}$$

where $\eta_1 = 2 \cdot I_{[1/2,1]}$.

Let us restate the bounds from Theorem 3.1.1, the main result of Part I. We will use Lemma C.2.2 to bound terms of the form $q/\phi(q)$.

Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. Then, if $3 \le q \le x^{1/3}/6$, Theorem 3.1.1 gives us that

$$|S_{\eta_2}(\alpha, x)| \le g_x\!\left(\max\!\left(1, \frac{|\delta|}{8}\right)\cdot q\right) x, \tag{11.11}$$

where

$$g_x(r) = \frac{(R_{x,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36\,x^{-1/6}, \tag{11.12}$$

with

$$\begin{aligned}
R_{x,t} &= 0.27125 \log\!\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004\,t}}\right) + 0.41415, \\
L_t &= z(t/2)\left(\frac{13}{4}\log t + 7.82\right) + 13.66\log t + 37.55.
\end{aligned} \tag{11.13}$$

If $q > x^{1/3}/6$, then, again by Theorem 3.1.1,

$$|S_{\eta_2}(\alpha, x)| \le h(x)\,x, \tag{11.14}$$

where

$$h(x) = 0.276\,x^{-1/6}(\log x)^{3/2} + 1234\,x^{-1/3}\log x. \tag{11.15}$$

We will work with $x$ varying within a range, and so we must pay some attention to the dependence of (11.11) and (11.14) on $x$. Let us prove two auxiliary lemmas on this.

Lemma 11.2.1. Let $g_x(r)$ be as in (11.12) and $h(x)$ as in (11.15). Then

$$x \mapsto \begin{cases} h(x) & \text{if } x < (6r)^3, \\ g_x(r) & \text{if } x \ge (6r)^3 \end{cases}$$

is a decreasing function of $x$ for $r \ge 11$ fixed and $x \ge 21$.

Proof It is clear from the definitions that x 7rarr h(x) (for x ge 21) and x 7rarr gx(r) areboth decreasing Thus we simply have to show that h(xr) ge gxr (r) for xr = (6r)3Since xr ge (6 middot 11)3 gt e125

Rxr2r le 027125 log(0065 log xr + 1056) + 041415

le 027125 log((0065 + 00845) log xr) + 041415 le 027215 log log xr

Hence

Rxr2r log 2r + 05 le 027215 log log xr log x13r minus 027215 log 125 log 3 + 05

le 009072 log log xr log xr minus 0255

At the same time

z(r) = eγ log logx

13r

6+

250637

log log rle eγ log log xr minus eγ log 3 + 19521

le eγ log log xr

(1116)

for r ge 37 and we also get z(r) le eγ log log xr for r isin [11 37] by the bisectionmethod with 10 iterations Hence

(Rxr2r log 2r + 05)radicz(r) + 25

le (009072 log log xr log xr minus 0255)radiceγ log log xr + 25

le 01211 log xr(log log xr)32 + 2


and so

(Rxr2r log 2r + 05)radic

z(r) + 25radic2r

le (021 log xr(log log xr)32 + 347)xminus16

r

Now by (1116)

L2r le eγ log log xr middot(

13

4log(x13

r 3) + 782

)+ 1366 log(x13

r 3) + 3755

le eγ log log xr middot(

13

12xr + 425

)+ 456 log xr + 2255

It is clear that

425eγ log log xr + 456 log xr + 2255

x13r 6

lt 1234xminus13r log xr

for xr ge e we make the comparison for xr = e and take the derivative of the ratio ofthe left side by the right side

It remains to show that

021 log xr(log log xr)32 + 347 + 336 +

13

2eγxminus13

r log xr log log xr (1117)

is less than 0276(log xr)32 for xr large enough Since t 7rarr (log t)32t12 is de-

creasing for t gt e3 we see that

021 log xr(log log xr)32 + 683 + 13

2 eγxminus13r log xr log log xr

0276(log xr)32lt 1

for all xr ge e33 simply because it is true for x = e33 which is greater than ee3

We conclude that h(xr) ge gxr (r) = gxr (x

13r 6) for xr ge e33 We check that

h(xr) ge gxr (x13r 6) for log xr isin [log 663 33] as well by the bisection method

(applied with 30 iterations with log xr as the variable on the intervals [log 663 20][20 25] [25 30] and [30 33]) Since r ge 11 implies xr ge 663 we are done

Lemma 11.2.2. Let $R_{x,r}$ be as in (11.13). Then $t \mapsto R_{e^t,r}$ is convex-up for $t \ge 3\log 6r$.

Proof. Since $t \mapsto e^{-t/6}$ and $t \mapsto t$ are clearly convex-up, all we have to do is to show that $t \mapsto R_{e^t,r}$ is convex-up. In general, since

$$(\log f)'' = \left(\frac{f'}{f}\right)' = \frac{f'' f - (f')^2}{f^2},$$

a function of the form $\log f$ is convex-up exactly when $f'' f - (f')^2 \ge 0$. If $f(t) = 1 + a/(t - b)$, we have $f'' f - (f')^2 \ge 0$ whenever

$$(t + a - b)\cdot(2a) \ge a^2,$$

i.e., $a^2 + 2at \ge 2ab$, and that certainly happens when $t \ge b$. In our case, $b = 3\log(2.004\,r/9)$, and so $t \ge 3\log 6r$ implies $t \ge b$.

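Now we come to the point where we prove bounds on exponential sums of the form $S_{\eta_*}(\alpha, x)$ (that is, sums based on the smoothing $\eta_*$) based on our bounds (11.11) and (11.14) on the exponential sums $S_{\eta_2}(\alpha, x)$. This is straightforward, as promised.

Proposition 11.2.3. Let $x \ge K x_0$, $x_0 = 2.16 \cdot 10^{20}$, $K \ge 1$. Let $S_\eta(\alpha, x)$ be as in (10.1). Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi : [0, \infty) \to [0, \infty)$ is continuous and in $L^1$.

Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le (x/K)^{1/3}/6$, then

$$|S_{\eta_*}(\alpha, x)| \le g_{x,\varphi}\!\left(\max\!\left(1, \frac{|\delta|}{8}\right) q\right)\cdot |\varphi|_1 x, \tag{11.18}$$

where

$$\begin{aligned}
g_{x,\varphi}(r) &= \frac{(R_{x,K,\varphi,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36\,K^{1/6}x^{-1/6}, \\
R_{x,K,\varphi,t} &= R_{x,t} + (R_{x/K,t} - R_{x,t})\,\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K},
\end{aligned} \tag{11.19}$$

with $R_{x,t}$ and $L_t$ as in (11.13), and

$$C_{\varphi,2,K} = -\int_{1/K}^1 \varphi(w)\log w\, dw. \tag{11.20}$$

If $q > (x/K)^{1/3}/6$, then

$$|S_{\eta_*}(\alpha, x)| \le h_\varphi(x/K)\cdot |\varphi|_1 x,$$

where

$$h_\varphi(x) = h(x) + C_{\varphi,0,K}/|\varphi|_1, \qquad C_{\varphi,0,K} = 1.04488\int_0^{1/K} |\varphi(w)|\, dw, \tag{11.21}$$

and $h(x)$ is as in (11.15).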

Proof By (119)

Sηlowast(α x) =

int 1K

0

Sη2(αwx)ϕ(w)dw

w+

int infin1K

Sη2(αwx)ϕ(w)dw

w

We bound the first integral by the trivial estimate |Sη2(αwx)| le |Sη2(0 wx)| andCor C13 int 1K

0

|Sη2(0 wx)|ϕ(x)dw

wle 104488

int 1K

0

wxϕ(w)dw

w

= 104488x middotint 1K

0

ϕ(w)dw


Ifw ge 1K thenwx ge x0 and we can use (1111) or (1114) If q gt (xK)136then |Sη2(αwx)| le h(xK)wx by (1114) moreover |Sη2(α y)| le h(y)y forxK le y lt (6q)3 (by (1114)) and |Sη2(α y)| le gy1(r) for y ge (6q)3 (by (1111))Thus Lemma 1121 gives us thatint infin

1K

|Sη2(αwx)|ϕ(w)dw

wleint infin

1K

h(xK)wx middot ϕ(w)dw

w

= h(xK)x

int infin1K

ϕ(w)dw le h(xK)|ϕ|1 middot x

If q le (xK)136 we always use (1111) We can use the coarse boundint infin1K

336xminus16 middot wx middot ϕ(w)dw

wle 336K16|ϕ|1x56

Since Lr does not depend on xint infin1K

Lrrmiddot wx middot ϕ(w)

dw

wle Lr

r|ϕ|1x

By Lemma 1122 and q le (xK)136 y 7rarr Reyt is convex-up and decreasingfor y isin [log(xK)infin) Hence

Rwxt le

logwlog 1

K

RxKt +(

1minus logwlog 1

K

)Rxt if w lt 1

Rxt if w ge 1

Thereforeint infin1K

Rwxt middot wx middot ϕ(w)dw

w

leint 1

1K

(logw

log 1K

RxKt +

(1minus logw

log 1K

)Rxt

)xϕ(w)dw +

int infin1

Rxtϕ(w)xdw

le Rxtx middotint infin

1K

ϕ(w)dw + (RxKt minusRxt)x

logK

int 1

1K

ϕ(w) logwdw

le(Rxt|ϕ|1 + (RxKt minusRxt)

Cϕ2logK

)middot x

where

Cϕ2K = minusint 1

1K

ϕ(w) logw dw

We finish by proving a couple more lemmas

Lemma 11.2.4. Let $x > K > 1$. Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi : [0, \infty) \to [0, \infty)$ is continuous and in $L^1$. Let $g_{x,\varphi}$ be as in (11.19).

Then $g_{x,\varphi}(r)$ is a decreasing function of $r$ for $670 \le r \le (x/K)^{1/3}/6$.

Proof Taking derivatives we can easily see that

r 7rarr log log r

r r 7rarr log r

r r 7rarr log r log log r

r r 7rarr (log r)2 log log r

r(1122)

are decreasing for r ge 20 The same is true if log log r is replaced by z(r) sincez(r) log log r is a decreasing function for r ge e Since (Cϕ2K|ϕ|1) logK le 1(by (1120)) we see that it is enough to prove that r 7rarr Ry2r log 2r

radiclog log r

radic2r is

decreasing on r for y = x and y = xK (under the assumption that r ge 670)Looking at (1113) and at (1122) we see that it remains only to check that

r 7rarr log

(1 +

log 8r

2 log 9y13

4008r

)log 2r middot

radiclog log r

r(1123)

is decreasing on r for r ge 670 Taking logarithms and then derivatives we see that wehave to show that

1r `+

log 8rr

2`2(1 + log 8r

2`

)log(

1 + log 8r2`

) +1

r log 2r+

1

2r log r log log rlt

1

2r

where ` = log 9y13

4008r We multiply by 2r and see that this is equivalent to

1`

(2minus 1

1+ log 8r2`

)log(

1 + log 8r2`

) +2

log 2r+

1

log r log log rlt 1 (1124)

A derivative test is enough to show that s log(1 + s) is an increasing function of s fors gt 0 hence so is s middot (2minus 1(1 + s)) log(1 + s) Setting s = (log 8r)` we obtainthat the left side of (1124) is a decreasing function of ` for r ge 1 fixed

Since r le y136 ` ge log 544008 gt 26 Thus for (1124) to hold it is enoughto ensure that

126

(2minus 1

1+ log 8r52

)log(

1 + log 8r52

) +2

log 2r+

1

log r log log rlt 1 (1125)

A derivative test shows that (2 minus 1s) log(1 + s) is a decreasing function of s fors ge 123 since log(8 middot 75)52 gt 123 this implies that the left side of (1125) is adecreasing function of r for r ge 75

We check that the left side of (1125) is indeed less than 1 for r = 670 we concludethat it is less than 1 for all r ge 670

Lemma 11.2.5. Let $x \ge 10^{25}$. Let $\varphi : [0, \infty) \to [0, \infty)$ be continuous and in $L^1$. Let $g_{x,\varphi}(r)$ and $h(x)$ be as in (11.19) and (11.15), respectively. Then

$$g_{x,\varphi}\!\left(\frac{3}{8}x^{4/15}\right) \ge h(2x/\log x).$$

Proof We can bound gxφ(r) from below by

gmx(r) =(Rxr log 2r + 05)

radicz(r) + 25radic

2r

Let r = (38)x415 Using the assumption that x ge 1025 we see that

Rxr = 027125 log

1 +log(

3x415

2

)2 log

(9

2004middot 38middot x 1

3minus415)+ 041415 ge 063368

(1126)(It is easy to see that the left side of (1126) is increasing on x) Using x ge 1025 againwe get that

z(r) = eγ log log r +250637

log log rge 568721

Since log 2r = (415) log x+ log(34) we conclude that

gmx(r) ge 040298 log x+ 325765radic34 middot x215

Recall that

h(x) =0276(log x)32

x16+

1234 log x

x13

We can see that

x 7rarr (log x+ 33)x215

(log(2x log x))32(2x log x)16(1127)

is increasing for x ge 1025 (and indeed for x ge e27) by taking the logarithm of theright side of (1127) and then taking its derivative with respect to t = log x We cansee in the same way that (1x215)(log(2x log x)(2x log x)13) is increasing forx ge e22 Since

040298(log x+ 33)radic34 middot x215

ge 0276(log(2x log x))32

(2x log x)16

325765minus 33 middot 040298radic34 middot x215

ge 1234 log(2x log(x))

(2x log(x))13

for x = 1025 we are done

Chapter 12

The $\ell_2$ norm and the large sieve

Our aim here is to give a bound on the $\ell_2$ norm of an exponential sum over the minor arcs. While we care about an exponential sum in particular, we will prove a result valid for all exponential sums $S(\alpha, x) = \sum_n a_n e(\alpha n)$ with $a_n$ of prime support.

We start by adapting ideas from Ramaré's version of the large sieve for primes to estimate $\ell_2$ norms over parts of the circle (§12.1). We are left with the task of giving an explicit bound on the factor in Ramaré's work; this we do in §12.2. As a side effect, this finally gives a fully explicit large sieve for primes that is asymptotically optimal, meaning a sieve that does not have a spurious factor of $e^\gamma$ in front; this was an arguably important gap in the literature.

12.1 Variations on the large sieve for primes

We are trying to estimate an integral $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^3\, d\alpha$. Instead of bounding it trivially by $|S|_\infty |S|_2^2$, we can use the fact that large ("major") values of $S(\alpha)$ have to be multiplied only by $\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$, where $\mathfrak{M}$ is a union (small in measure) of major arcs. Now, can we give an upper bound for $\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$ better than $|S|_2^2 = \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha$?

The first version of [Helb] gave an estimate on that integral using a technique due to Heath-Brown, which in turn rests on an inequality of Montgomery's ([Mon71, (3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by Heath-Brown to the present author, who communicated it to Tao, who used it in his own notable work on sums of five primes (see [Tao14, Lem. 4.6] and adjoining comments). We will be able to do better than that estimate here.

The role played by Montgomery's inequality in Heath-Brown's method is played here by a result of Ramaré's ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]). The following proposition is based on Ramaré's result, or rather on one possible proof of it. Instead of using the result as stated in [Ram09], we will actually be using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply integrating Ramaré's inequality would give a non-trivial if slightly worse bound.

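Proposition 12.1.1. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0 Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let

$$\mathfrak{M} = \bigcup_{q \le Q_0}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 Q_0}{qx},\ \frac{a}{q} + \frac{\delta_0 Q_0}{qx}\right). \tag{12.1}$$

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

$$\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \left(\max_{q \le Q_0}\ \max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}\right) \sum_n |a_n|^2,$$

where

$$G_q(R) = \sum_{\substack{r \le R\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.2}$$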

Proof By (121)intM

|S(α)|2 dα =sumqleQ0

int δ0Q0qx

minus δ0Q0qx

suma mod q

(aq)=1

∣∣∣∣S (aq + α

)∣∣∣∣2 dα (123)

Thanks to the last equations of [Bom74 p 24] and [Bom74 p 25]

suma mod q

(aq)=1

∣∣∣∣S (aq)∣∣∣∣2 =

1

φ(q)

sumqlowast|q

(qlowastqqlowast)=1

micro2(qqlowast)=1

qlowast middotsumlowast

χ mod qlowast

∣∣∣∣∣sumn

anχ(n)

∣∣∣∣∣2

for every q leradicx where we use the assumption that n is prime and gt

radicx (and thus

coprime to q) when an 6= 0 HenceintM

|S(α)|2 dα =sumqleQ0

sumqlowast|q

(qlowastqqlowast)=1

micro2(qqlowast)=1

qlowastint δ0Q0

qx

minus δ0Q0qx

1

φ(q)

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

=sumqlowastleQ0

qlowast

φ(qlowast)

sumrleQ0qlowast

(rqlowast)=1

micro2(r)

φ(r)

int δ0Q0qlowastrx

minus δ0Q0qlowastrx

sumlowast

χ mod qlowast

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

=sumqlowastleQ0

qlowast

φ(qlowast)

int δ0Q0qlowastx

minus δ0Q0qlowastx

sumrleQ0

qlowast min(1δ0|α|x )

(rqlowast)=1

micro2(r)

φ(r)

sumlowast

χ mod qlowast

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2


Here |α| le δ0Q0qlowastx implies (Q0q)δ0|α|x ge 1 Thereforeint

M

|S(α)|2 dα le(

maxqlowastleQ0

maxsleQ0qlowast

Gqlowast(Q0sqlowast)

Gqlowast(Qsqlowast)

)middot Σ (124)

where

Σ =sumqlowastleQ0

qlowast

φ(qlowast)

int δ0Q0qlowastx

minus δ0Q0qlowastx

sumrle Q

qlowast min(1δ0|α|x )

(rqlowast)=1

micro2(r)

φ(r)

sumlowast

χ mod qlowast

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

lesumqleQ

q

φ(q)

sumrleQq(rq)=1

micro2(r)

φ(r)

int δ0Qqrx

minus δ0Qqrx

sumlowast

χ mod q

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

As stated in the proof of [Bom74 Thm 7A]

χ(r)χ(n)τ(χ)cr(n) =

qrsumb=1

(bqr)=1

χ(b)e2πin bqr

for χ primitive of modulus q Here cr(n) stands for the Ramanujan sum

cr(n) =sum

u mod r(ur)=1

e2πnur

For n coprime to r cr(n) = micro(r) Since χ is primitive |τ(χ)| =radicq Hence for

r leradicx coprime to q

q

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

=

∣∣∣∣∣∣∣∣qrsumb=1

(bqr)=1

χ(b)S

(b

qr+ α

)∣∣∣∣∣∣∣∣2

Thus

Σ =sumqleQ

sumrleQq(rq)=1

micro2(r)

φ(rq)

int δ0Qqrx

minus δ0Qqrx

sumlowast

χ mod q

∣∣∣∣∣∣∣∣qrsumb=1

(bqr)=1

χ(b)S

(b

qr+ α

)∣∣∣∣∣∣∣∣2

lesumqleQ

1

φ(q)

int δ0Qqx

minus δ0Qqx

sumχ mod q

∣∣∣∣∣∣∣∣qsumb=1

(bq)=1

χ(b)S

(b

q+ α

)∣∣∣∣∣∣∣∣2

=sumqleQ

int δ0Qqx

minus δ0Qqx

qsumb=1

(bq)=1

∣∣∣∣S ( bq + α

)∣∣∣∣2 dα

Let us now check that the intervals $(b/q - \delta_0 Q/qx,\ b/q + \delta_0 Q/qx)$ do not overlap. Since $Q = \sqrt{x/2\delta_0}$, we see that $\delta_0 Q/qx = 1/2qQ$. The difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \le Q$, $1/qq' \ge 1/2qQ + 1/2Qq'$, since the right side equals $(q + q')/(2Qqq')$ and $q + q' \le 2Q$. Hence the intervals around $b/q$ and $b'/q'$ do not overlap. We conclude that

$$\Sigma \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2,$$

and so, by (12.4), we are done.

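We will actually use Prop. 12.1.1 in the slightly modified form given by the following statement.

Proposition 12.1.2. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0 Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5).

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

$$\int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \le \left(\max_{\substack{q \le 2Q_0\\ q\ \mathrm{even}}}\ \max_{s \le 2Q_0/q} \frac{G_q(2Q_0/sq)}{G_q(2Q/sq)}\right) \sum_n |a_n|^2,$$

where

$$G_q(R) = \sum_{\substack{r \le R\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.5}$$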

Proof By (105)intM

|S(α)|2 dα =sumqleQ0

q odd

int δ0Q02qx

minus δ0Q02qx

suma mod q

(aq)=1

∣∣∣∣S (aq + α

)∣∣∣∣2 dα+sumqleQ0

q even

int δ0Q0qx

minus δ0Q0qx

suma mod q

(aq)=1

∣∣∣∣S (aq + α

)∣∣∣∣2 dαWe proceed as in the proof of Prop 1211 We still have (123) Hence

intM|S(α)|2 dα

equals

sumqlowastleQ0

qlowast odd

qlowast

φ(qlowast)

int δ0Q02qlowastx

minus δ0Q02qlowastx

sumrleQ0

qlowast min(1δ0

2|α|x )(r2qlowast)=1

micro2(r)

φ(r)

sumlowast

χ mod qlowast

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

+sum

qlowastle2Q0

qlowast even

qlowast

φ(qlowast)

int δ0Q0qlowastx

minus δ0Q0qlowastx

sumrle 2Q0

qlowast min(1δ0

2|α|x )(rqlowast)=1

micro2(r)

φ(r)

sumlowast

χ mod qlowast

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2


(The sum with q odd and r even is equal to the first sum hence the factor of 2 in front)Therefore int

M

|S(α)|2 dα le

maxqlowastleQ0

qlowast odd

maxsleQ0qlowast

G2qlowast(Q0sqlowast)

G2qlowast(Qsqlowast)

middot 2Σ1

+

maxqlowastle2Q0

qlowast even

maxsle2Q0qlowast

Gqlowast(2Q0sqlowast)

Gqlowast(2Qsqlowast)

middot Σ2

(126)

where

Σ1 =sumqleQq odd

q

φ(q)

sumrleQq

(r2q)=1

micro2(r)

φ(r)

int δ0Q2qrx

minus δ0Q2qrx

sumlowast

χ mod q

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

=sumqleQq odd

q

φ(q)

sumrle2Qq

(rq)=1

r even

micro2(r)

φ(r)

int δ0Qqrx

minus δ0Qqrx

sumlowast

χ mod q

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

Σ2 =sumqle2Qq even

q

φ(q)

sumrle2Qq

(rq)=1

micro2(r)

φ(r)

int δ0Qqrx

minus δ0Qqrx

sumlowast

χ mod q

∣∣∣∣∣sumn

ane(αn)χ(n)

∣∣∣∣∣2

The two expressions within parentheses in (126) are actually equalMuch as before using [Bom74 Thm 7A] we obtain that

Σ1 lesumqleQq odd

1

φ(q)

int δ0Q2qx

minus δ0Q2qx

qsumb=1

(bq)=1

∣∣∣∣S ( bq + α

)∣∣∣∣2 dαΣ1 + Σ2 le

sumqle2Qq even

1

φ(q)

int δ0Qqx

minus δ0Qqx

qsumb=1

(bq)=1

∣∣∣∣S ( bq + α

)∣∣∣∣2 dαLet us now check that the intervals of integration (bq minus δ0Q2qx bq + δ0Q2qx)(for q odd) (bq minus δ0Qqx bq + δ0Qqx) (for q even) do not overlap Recall thatδ0Qqx = 12qQ The absolute value of the difference between two distinct fractionsbq bprimeqprime is at least 1qqprime For q qprime le Q odd this is larger than 14qQ + 14Qqprimeand so the intervals do not overlap For q le Q odd and qprime le 2Q even (or vice versa)1qqprime ge 14qQ + 12Qqprime and so again the intervals do not overlap If q le Qand qprime le Q are both even then |bq minus bprimeqprime| is actually ge 2qqprime Clearly 2qqprime ge12qQ+ 12Qqprime and so again there is no overlap We conclude that

2Σ1 + Σ2 leintRZ|S(α)|2 =

sumn

|an|2

12.2 Bounding the quotient in the large sieve for primes

The estimate given by Proposition 12.1.1 involves the quotient

$$\max_{q \le Q_0}\ \max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}, \tag{12.7}$$

where $G_q$ is as in (12.2). The appearance of such a quotient (at least for $s = 1$) is typical of Ramaré's version of the large sieve for primes; see, e.g., [Ram09]. We will see how to bound such a quotient in a way that is essentially optimal, not just asymptotically, but also in the ranges that are most relevant to us. (This includes, for example, $Q_0 \sim 10^6$, $Q \sim 10^{15}$.)

As the present work shows, an approach based on Ramaré's work gives bounds that are, in some contexts, better than those of other large sieves for primes by a constant factor (approaching $e^\gamma = 1.78107\ldots$). Thus, giving a fully explicit and nearly optimal bound for (12.7) is a task of clear general relevance, besides being needed for our main goal.

We will obtain bounds for $G_q(Q_0/sq)/G_q(Q/sq)$ when $Q_0 \le 2 \cdot 10^{10}$, $Q \ge Q_0^2$. As we shall see, our bounds will be best when $s = q = 1$, or sometimes when $s = 1$ and $q = 2$ instead.

Write $G(R)$ for $G_1(R) = \sum_{r \le R} \mu^2(r)/\phi(r)$. We will need several estimates for $G_q(R)$ and $G(R)$. As stated in [Ram95, Lemma 3.4],

$$G(R) \le \log R + 1.4709 \tag{12.8}$$

for $R \ge 1$. By [MV73, Lem. 7],

$$G(R) \ge \log R + 1.07 \tag{12.9}$$

for $R \ge 6$. There is also the trivial bound

$$G(R) = \sum_{r \le R} \frac{\mu^2(r)}{\phi(r)} = \sum_{r \le R} \frac{\mu^2(r)}{r}\prod_{p | r}\left(1 - \frac{1}{p}\right)^{-1} = \sum_{r \le R} \frac{\mu^2(r)}{r}\prod_{p | r}\sum_{j \ge 0} \frac{1}{p^j} \ge \sum_{r \le R} \frac{1}{r} > \log R. \tag{12.10}$$
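The bounds (12.8), (12.9) and (12.10) are easy to test for small $R$. The following sketch tabulates $G(R)$ at integers with two elementary sieves and checks the three inequalities up to $R = 20000$; it is an illustration only, is unrelated to the interval-arithmetic computations used elsewhere in this chapter, and the cutoff is an arbitrary choice. Since $G$ is a step function, lower bounds valid for all real $R$ in $[n, n+1)$ are tested against $\log(n+1)$.

```python
import math


def squarefree_phi_table(n):
    """Return (squarefree flags, Euler phi values) for 0..n via sieves."""
    phi = list(range(n + 1))
    squarefree = [True] * (n + 1)
    for p in range(2, n + 1):
        if phi[p] == p:  # untouched so far, hence p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, n + 1, p * p):
                squarefree[m] = False
    return squarefree, phi


def G_values(n):
    """values[R] = sum over squarefree r <= R of 1/phi(r), for R = 0..n."""
    squarefree, phi = squarefree_phi_table(n)
    values = [0.0] * (n + 1)
    running = 0.0
    for r in range(1, n + 1):
        if squarefree[r]:
            running += 1.0 / phi[r]
        values[r] = running
    return values


N = 20000
G = G_values(N)
upper_ok = all(G[R] <= math.log(R) + 1.4709 for R in range(1, N + 1))
lower_ok = all(G[R] >= math.log(R + 1) + 1.07 for R in range(6, N))
trivial_ok = all(G[R] > math.log(R + 1) for R in range(1, N))
print(upper_ok, lower_ok, trivial_ok)
```

The upper bound (12.8) is nearly attained at $R = 7$, where $G(7) - \log 7 = 1.47075\ldots$, which is presumably where the constant $1.4709$ comes from.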

The following bound, also well-known and easy,

$$G(R) \le \frac{q}{\phi(q)} G_q(R) \le G(Rq), \tag{12.11}$$

can be obtained by multiplying $G_q(R) = \sum_{r \le R,\ (r,q)=1} \mu^2(r)/\phi(r)$ term-by-term by $q/\phi(q) = \prod_{p | q}(1 + 1/\phi(p))$.

We will also use Ramaré's estimate from [Ram95, Lem. 3.4]:

$$G_d(R) = \frac{\phi(d)}{d}\left(\log R + c_E + \sum_{p | d} \frac{\log p}{p}\right) + O^*\!\left(7.284\,R^{-1/3} f_1(d)\right) \tag{12.12}$$

for all $d \in \mathbb{Z}^+$ and all $R \ge 1$, where

$$f_1(d) = \prod_{p | d} (1 + p^{-2/3})\left(1 + \frac{p^{1/3} + p^{2/3}}{p(p - 1)}\right)^{-1} \tag{12.13}$$

and

$$c_E = \gamma + \sum_{p \ge 2} \frac{\log p}{p(p - 1)} = 1.3325822\ldots \tag{12.14}$$

by [RS62, (2.11)].

If $R \ge 182$, then

$$\log R + 1.312 \le G(R) \le \log R + 1.354, \tag{12.15}$$

where the upper bound is valid for $R \ge 120$. This is true by (12.12) for $R \ge 4 \cdot 10^7$; we check (12.15) for $120 \le R \le 4 \cdot 10^7$ by a numerical computation.$^1$ Similarly, for $R \ge 200$,

$$\frac{\log R + 1.661}{2} \le G_2(R) \le \frac{\log R + 1.698}{2} \tag{12.16}$$

by (12.12) for $R \ge 1.6 \cdot 10^8$, and by a numerical computation for $200 \le R \le 1.6 \cdot 10^8$.

Write $\rho = (\log Q_0)/(\log Q) \le 1$. We obtain immediately from (12.15) and (12.16) that

$$\frac{G(Q_0)}{G(Q)} \le \frac{\log Q_0 + 1.354}{\log Q + 1.312}, \qquad \frac{G_2(Q_0)}{G_2(Q)} \le \frac{\log Q_0 + 1.698}{\log Q + 1.661} \tag{12.17}$$

for $Q, Q_0 \ge 200$. What is hard is to approximate $G_q(Q_0)/G_q(Q)$ for $q$ large and $Q_0$ small.

Let us start by giving an easy bound, off from the truth by a factor of about $e^\gamma$. (Specialists will recognize this as a factor that appears often in first attempts at estimates based on either large or small sieves.) First, we need a simple explicit lemma.

Lemma 12.2.1. Let $m \ge 1$, $q \ge 1$. Then

\[ \prod_{p | q \ \text{or}\ p \le m} \frac{p}{p-1} \le e^\gamma \left( \log(m + \log q) + 0.65771 \right). \tag{12.18} \]

Proof. Let $P = \prod_{p \le m \ \text{or}\ p | q} p$. Then, by [RS75, (5.1)],

\[ P \le q \prod_{p \le m} p = q e^{\sum_{p \le m} \log p} \le q e^{(1+\epsilon_0) m}, \]

where $\epsilon_0 = 0.001102$. Now, by [RS62, (3.42)],

\[ \frac{n}{\phi(n)} \le e^\gamma \log\log n + \frac{2.50637}{\log\log n} \le e^\gamma \log\log x + \frac{2.50637}{\log\log x} \]

for all $x \ge n \ge 27$ (since, given $a, b > 0$, the function $t \mapsto at + b/t$ is increasing on $t$ for $t \ge \sqrt{b/a}$). Hence, if $q e^m \ge 27$,

\[ \frac{P}{\phi(P)} \le e^\gamma \log((1+\epsilon_0) m + \log q) + \frac{2.50637}{\log(m + \log q)} \le e^\gamma \left( \log(m + \log q) + \epsilon_0 + \frac{2.50637/e^\gamma}{\log(m + \log q)} \right). \]

Thus (12.18) holds when $m + \log q \ge 8.53$, since then $\epsilon_0 + (2.50637/e^\gamma)/\log(m + \log q) \le 0.65771$. We verify all choices of $m, q \ge 1$ with $m + \log q \le 8.53$ computationally; the worst case is that of $m = 1$, $q = 6$, which gives the value $0.65771$ in (12.18).

¹ Using D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas.
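The finite check mentioned in the proof can be imitated on a small grid. The sketch below is only an illustration under stated assumptions: it uses plain floating point (no interval arithmetic), it samples integer $m$ only, and the helper names (`lhs`, `rhs`, `slack_constant`) are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def prime_divisors(q):
    """Set of primes dividing q (trial division)."""
    ps, p = set(), 2
    while p * p <= q:
        while q % p == 0:
            ps.add(p)
            q //= p
        p += 1
    if q > 1:
        ps.add(q)
    return ps

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))

def lhs(m, q):
    """Product of p/(p-1) over primes p with p | q or p <= m."""
    primes = prime_divisors(q) | {p for p in range(2, int(m) + 1) if is_prime(p)}
    return math.prod(p / (p - 1) for p in primes)

def rhs(m, q):
    """Right side of (12.18)."""
    return math.exp(EULER_GAMMA) * (math.log(m + math.log(q)) + 0.65771)

def slack_constant(m, q):
    """Smallest constant that could replace 0.65771 for this pair (m, q)."""
    return lhs(m, q) / math.exp(EULER_GAMMA) - math.log(m + math.log(q))

if __name__ == "__main__":
    worst = max(((slack_constant(m, q), m, q)
                 for m in range(1, 31) for q in range(1, 301)))
    print(worst)
```

On this grid the largest value of `slack_constant` occurs at $m = 1$, $q = 6$, in agreement with the worst case named in the proof.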

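Here is the promised easy bound.

Lemma 12.2.2. Let $Q_0 \ge 1$, $Q \ge 182 Q_0$. Let $q \le Q_0$, $s \le Q_0/q$, $q$ an integer. Then

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{e^\gamma \log\left( \frac{Q_0}{sq} + \log q \right) + 1.172}{\log\frac{Q}{Q_0} + 1.312} \le \frac{e^\gamma \log Q_0 + 1.172}{\log\frac{Q}{Q_0} + 1.312}. \]

Proof. Let $P = \prod_{p \le Q_0/sq \ \text{or}\ p|q} p$. Then

\[ G_q(Q_0/sq)\, G_P(Q/Q_0) \le G_q(Q/sq) \]

and so

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{1}{G_P(Q/Q_0)}. \tag{12.19} \]

Now the lower bound in (12.11) gives us that, for $d = P$, $R = Q/Q_0$,

\[ G_P(Q/Q_0) \ge \frac{\phi(P)}{P} G(Q/Q_0). \]

By Lem. 12.2.1,

\[ \frac{P}{\phi(P)} \le e^\gamma \left( \log\left( \frac{Q_0}{sq} + \log q \right) + 0.658 \right). \]

Hence, using (12.15), we get that

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{P/\phi(P)}{G(Q/Q_0)} \le \frac{e^\gamma \log\left( \frac{Q_0}{sq} + \log q \right) + 1.172}{\log\frac{Q}{Q_0} + 1.312}, \tag{12.20} \]

since $Q/Q_0 \ge 184$. Since

\[ \left( \frac{Q_0}{sq} + \log q \right)' = -\frac{Q_0}{sq^2} + \frac{1}{q} = \frac{1}{q} \left( 1 - \frac{Q_0}{sq} \right) \le 0, \]

the rightmost expression of (12.20) is maximal for $q = 1$.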

Lemma 12.2.2 will play a crucial role in reducing to a finite computation the problem of bounding $G_q(Q_0/sq)/G_q(Q/sq)$. As we will now see, we can use Lemma 12.2.2 to obtain a bound that is useful when $sq$ is large compared to $Q_0$, which is precisely the case in which asymptotic estimates such as (12.12) are relatively weak.

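Lemma 12.2.3. Let $Q_0 \ge 1$, $Q \ge 200 Q_0$. Let $q \le Q_0$, $s \le Q_0/q$. Let $\rho = (\log Q_0)/\log Q \le 2/3$. Then, for any $\sigma \ge 1.312\rho$,

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + \sigma}{\log Q + 1.312} \tag{12.21} \]

holds provided that

\[ \frac{Q_0}{sq} \le c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}} - \log q, \]

where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.

Proof. By Lemma 12.2.2, we see that (12.21) will hold provided that

\[ e^\gamma \log\left( \frac{Q_0}{sq} + \log q \right) + 1.172 \le \frac{\log\frac{Q}{Q_0} + 1.312}{\log Q + 1.312} \cdot (\log Q_0 + \sigma). \tag{12.22} \]

The expression on the right of (12.22) equals

\[
\begin{aligned}
\log Q_0 + \sigma - \frac{(\log Q_0 + \sigma) \log Q_0}{\log Q + 1.312}
&= (1-\rho)(\log Q_0 + \sigma) + \frac{1.312\rho(\log Q_0 + \sigma)}{\log Q + 1.312} \\
&\ge (1-\rho)(\log Q_0 + \sigma) + 1.312\rho^2,
\end{aligned}
\]

and so (12.22) will hold provided that

\[ e^\gamma \log\left( \frac{Q_0}{sq} + \log q \right) + 1.172 \le (1-\rho)(\log Q_0) + (1-\rho)\sigma + 1.312\rho^2. \]

Taking derivatives, we see that

\[
(1-\rho)\sigma + 1.312\rho^2 - 1.172 \ge \left( 1 - \frac{\sigma}{2.624} \right)\sigma + 1.312 \left( \frac{\sigma}{2.624} \right)^2 - 1.172 = \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172.
\]

Hence it is enough that

\[ \frac{Q_0}{sq} + \log q \le e^{e^{-\gamma}\left( (1-\rho)\log Q_0 + \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172 \right)} = c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}}, \]

where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.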

We now pass to the main result of the section.

Proposition 12.2.4. Let $Q \ge 20000 Q_0$, $Q_0 \ge Q_{0,\min}$, where $Q_{0,\min} = 10^5$. Let $\rho = (\log Q_0)/\log Q$. Assume $\rho \le 0.6$. Then, for every $1 \le q \le Q_0$ and every $s \in [1, Q_0/q]$,

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + c_+}{\log Q + c_E}, \tag{12.23} \]

where $c_E$ is as in (12.14) and $c_+ = 1.36$.

An ideal result would have $c_E$ in place of $c_+$, but this is not actually possible: error terms do exist, even if they are in reality smaller than the bound given in (12.12); this means that a bound such as (12.23) with $c_E$ in place of $c_+$ would be false for $q = 1$, $s = 1$.

There is nothing special about the assumptions

\[ Q \ge 20000 Q_0, \qquad Q_0 \ge 10^5, \qquad (\log Q_0)/(\log Q) \le 0.6. \]

They can all be relaxed at the cost of an increase in $c_+$.

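Proof. Define $\mathrm{err}_{q,R}$ so that

\[ G_q(R) = \frac{\phi(q)}{q} \left( \log R + c_E + \sum_{p|q} \frac{\log p}{p} \right) + \mathrm{err}_{q,R}. \tag{12.24} \]

Then (12.23) will hold if

\[
\log\frac{Q_0}{sq} + c_E + \sum_{p|q} \frac{\log p}{p} + \frac{q}{\phi(q)} \mathrm{err}_{q,Q_0/sq}
\le \left( \log\frac{Q}{sq} + c_E + \sum_{p|q} \frac{\log p}{p} + \frac{q}{\phi(q)} \mathrm{err}_{q,Q/sq} \right) \frac{\log Q_0 + c_+}{\log Q + c_E}. \tag{12.25}
\]

This, in turn, happens if

\[
\left( \log sq - \sum_{p|q} \frac{\log p}{p} \right) \left( 1 - \frac{\log Q_0 + c_+}{\log Q + c_E} \right) + c_+ - c_E
\ge \frac{q}{\phi(q)} \left( \mathrm{err}_{q,Q_0/sq} - \frac{\log Q_0 + c_+}{\log Q + c_E} \mathrm{err}_{q,Q/sq} \right).
\]

Define

\[ \omega(\rho) = \frac{\log Q_{0,\min} + c_+}{\frac{1}{\rho} \log Q_{0,\min} + c_E} = \rho + \frac{c_+ - \rho c_E}{\frac{1}{\rho} \log Q_{0,\min} + c_E}. \]

Then $\rho \le (\log Q_0 + c_+)/(\log Q + c_E) \le \omega(\rho)$ (because $c_+ \ge \rho c_E$). We conclude that (12.25) (and hence (12.23)) holds provided that

\[
(1 - \omega(\rho)) \left( \log sq - \sum_{p|q} \frac{\log p}{p} \right) + c_\Delta
\ge \frac{q}{\phi(q)} \left( \mathrm{err}_{q,Q_0/sq} + \omega(\rho) \max\left( 0, -\mathrm{err}_{q,Q/sq} \right) \right), \tag{12.26}
\]

where $c_\Delta = c_+ - c_E$. Note that $1 - \omega(\rho) > 0$.

First, let us give some easy bounds on the error terms; these bounds will yield upper bounds for $s$. By (12.8) and (12.11),

\[ \mathrm{err}_{q,R} \le \frac{\phi(q)}{q} \left( \log q - \sum_{p|q} \frac{\log p}{p} + (1.4709 - c_E) \right) \]

for $R \ge 1$; by (12.15) and (12.11),

\[ \mathrm{err}_{q,R} \ge -\frac{\phi(q)}{q} \left( \sum_{p|q} \frac{\log p}{p} + (c_E - 1.312) \right) \]

for $R \ge 182$. Therefore, the right side of (12.26) is at most

\[ \log q - (1 - \omega(\rho)) \sum_{p|q} \frac{\log p}{p} + \left( (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) \right), \]

and so (12.26) holds provided that

\[ (1 - \omega(\rho)) \log sq \ge \log q + (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta. \tag{12.27} \]

We will thus be able to assume from now on that (12.27) does not hold, or, what is the same, that

\[ sq < (c_{\rho,2}\, q)^{\frac{1}{1-\omega(\rho)}} \tag{12.28} \]

holds, where $c_{\rho,2} = \exp((1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta)$.

What values of $R = Q_0/sq$ must we consider for $q$ given? First, by (12.28), we can assume $R > Q_{0,\min}/(c_{\rho,2}\, q)^{1/(1-\omega(\rho))}$. We can also assume

\[ R > c(c_+) \cdot \max(Rq, Q_{0,\min})^{(1-\rho) e^{-\gamma}} - \log q \tag{12.29} \]

for $c(c_+)$ as in Lemma 12.2.3, since all smaller $R$ are covered by that Lemma. Clearly, (12.29) implies that

\[ R^{1-\tau} > c(c_+) \cdot q^\tau - \frac{\log q}{R^\tau} > c(c_+) q^\tau - \log q, \]

where $\tau = (1-\rho) e^{-\gamma}$, and also that $R > c(c_+) Q_{0,\min}^{(1-\rho) e^{-\gamma}} - \log q$. Iterating, we obtain that we can assume that $R > \varpi(q)$, where

\[ \varpi(q) = \max\left( \varpi_0(q),\ c(c_+) Q_{0,\min}^\tau - \log q,\ \frac{Q_{0,\min}}{(c_{\rho,2}\, q)^{\frac{1}{1-\omega(\rho)}}} \right) \tag{12.30} \]

and

\[
\varpi_0(q) =
\begin{cases}
\left( c(c_+) q^\tau - \dfrac{\log q}{(c(c_+) q^\tau - \log q)^{\frac{\tau}{1-\tau}}} \right)^{\frac{1}{1-\tau}} & \text{if } c(c_+) q^\tau > \log q + 1, \\
0 & \text{otherwise.}
\end{cases}
\]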

Looking at (12.26), we see that it will be enough to show that, for all $R$ satisfying $R > \varpi(q)$, we have

\[ \mathrm{err}_{q,R} + \omega(\rho) \max\left( 0, -\mathrm{err}_{q,tR} \right) \le \frac{\phi(q)}{q} \kappa(q) \tag{12.31} \]

for all $t \ge 20000$, where

\[ \kappa(q) = (1 - \omega(\rho)) \left( \log q - \sum_{p|q} \frac{\log p}{p} \right) + c_\Delta. \]

Ramaré's bound (12.12) implies that

\[ |\mathrm{err}_{q,R}| \le 7.284 R^{-1/3} f_1(q), \tag{12.32} \]

with $f_1(q)$ as in (12.13), and so

\[ \mathrm{err}_{q,R} + \omega(\rho) \max\left( 0, -\mathrm{err}_{q,tR} \right) \le (1 + \beta_\rho) \cdot 7.284 R^{-1/3} f_1(q), \]

where $\beta_\rho = \omega(\rho)/20000^{1/3}$. This is enough when

\[ R \ge \lambda(q) = \left( \frac{q}{\phi(q)} \cdot \frac{7.284 (1 + \beta_\rho) f_1(q)}{\kappa(q)} \right)^3. \tag{12.33} \]

It remains to do two things. First, we have to compute how large $q$ has to be for $\varpi(q)$ to be guaranteed to be greater than $\lambda(q)$. (For such $q$, there is no checking to be done.) Then, we check the inequality (12.31) for all smaller $q$, letting $R$ range through the integers in $[\varpi(q), \lambda(q)]$. We bound $\mathrm{err}_{q,tR}$ using (12.32), but we compute $\mathrm{err}_{q,R}$ directly.
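To make the quantities concrete, here is a small sketch of how $\mathrm{err}_{q,R}$ from (12.24) can be computed directly and compared with Ramaré's bound (12.32). It is not the author's program (which was written in C with interval arithmetic): it uses plain floats, a truncated value of $c_E$, tiny ranges of $q$ and $R$, and hypothetical function names.

```python
import math

C_E = 1.3325822  # c_E from (12.14), truncated as quoted in the text

def prime_divisors(n):
    """Set of primes dividing n (trial division)."""
    ps, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            ps.add(p)
            n //= p
        p += 1
    if n > 1:
        ps.add(n)
    return ps

def phi(n):
    """Euler's totient."""
    result = n
    for p in prime_divisors(n):
        result -= result // p
    return result

def is_squarefree(n):
    return all(n % (p * p) != 0 for p in prime_divisors(n))

def G(q, R):
    """G_q(R): sum of mu(r)^2/phi(r) over r <= R coprime to q."""
    return sum(1.0 / phi(r) for r in range(1, int(R) + 1)
               if math.gcd(r, q) == 1 and is_squarefree(r))

def err(q, R):
    """err_{q,R} as defined by (12.24)."""
    main = math.log(R) + C_E + sum(math.log(p) / p for p in prime_divisors(q))
    return G(q, R) - phi(q) / q * main

def f1(q):
    """f_1(q) from (12.13)."""
    return math.prod((1 + p ** (-2 / 3)) /
                     (1 + (p ** (1 / 3) + p ** (2 / 3)) / (p * (p - 1)))
                     for p in prime_divisors(q))

def ramare_bound(q, R):
    """Right side of (12.32)."""
    return 7.284 * R ** (-1 / 3) * f1(q)

if __name__ == "__main__":
    for q in (1, 2, 6, 30):
        print(q, err(q, 200), ramare_bound(q, 200))
```

Since $G_q(R)$ only changes at integers while $\log R$ increases, the maximum of $\mathrm{err}_{q,R}$ over an interval is attained at integer $R$, which is why the sketch (like the computation described in the text) only evaluates integer arguments.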

How large must $q$ be for $\varpi(q) > \lambda(q)$ to hold? We claim that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$. Let us show this.

It is easy to see that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions of $p$ for $p \ge 3$; moreover, for both functions, the value at $p \ge 7$ is smaller than for $p = 2$. Hence, we have that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime,

\[ \kappa(q) \ge (1 - \omega(\rho)) \left( \log q - \sum_{p < p_0} \frac{\log p}{p} \right) + c_\Delta \tag{12.34} \]

and

\[ \lambda(q) \le \left( \prod_{p < p_0} \frac{p}{p-1} \cdot \frac{7.284 (1 + \beta_\rho) \prod_{p < p_0} f_1(p)}{(1 - \omega(\rho)) \left( \log q - \sum_{p < p_0} \frac{\log p}{p} \right) + c_\Delta} \right)^3. \tag{12.35} \]

If we also assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$, we obtain

\[ \kappa(q) \ge (1 - \omega(\rho)) \left( \log q - \sum_{p < p_0,\ p \ne 7} \frac{\log p}{p} \right) + c_\Delta \tag{12.36} \]

and

\[ \lambda(q) \le \left( \prod_{p < p_0,\ p \ne 7} \frac{p}{p-1} \cdot \frac{7.284 (1 + \beta_\rho) \prod_{p < p_0,\ p \ne 7} f_1(p)}{(1 - \omega(\rho)) \left( \log q - \sum_{p < p_0,\ p \ne 7} \frac{\log p}{p} \right) + c_\Delta} \right)^3 \tag{12.37} \]

for $q < \prod_{p \le p_0} p$. (We are taking out 7 because it is the "least helpful" prime to omit among all primes from 2 to 7, again by the fact that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions for $p \ge 3$.)

We know how to give upper bounds for the expression on the right of (12.35). The task is in essence simple: we can base our bounds on the classic explicit work in [RS62], except that we also have to optimize matters so that they are close to tight for $p_1 = 29$, $p_1 = 31$ and other low $p_1$.

By [RS62, (3.30)] and a numerical computation for $29 \le p_1 \le 43$,

\[ \prod_{p \le p_1} \frac{p}{p-1} < 1.90516 \log p_1 \]

for $p_1 \ge 29$. Since $\omega(\rho)$ is increasing on $\rho$ and we are assuming $\rho \le 0.6$, $Q_{0,\min} = 100000$,

\[ \omega(\rho) \le 0.627312, \qquad \beta_\rho \le 0.023111. \]
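The two numerical constants above, and a few that are derived from them in what follows, can be recomputed in a few lines. This is only a sanity check in ordinary floating point; the names `omega` and `beta` are chosen here for illustration, and $c_E$ is truncated to the digits quoted in (12.14).

```python
import math

C_E = 1.3325822      # c_E as quoted in (12.14)
C_PLUS = 1.36        # c_+ from Proposition 12.2.4
Q0_MIN = 1e5         # Q_{0,min}

def omega(rho):
    """omega(rho) = (log Q0min + c_+) / ((1/rho) log Q0min + c_E)."""
    return (math.log(Q0_MIN) + C_PLUS) / (math.log(Q0_MIN) / rho + C_E)

def beta(rho):
    """beta_rho = omega(rho) / 20000^(1/3)."""
    return omega(rho) / 20000 ** (1 / 3)

if __name__ == "__main__":
    rho = 0.6
    print("omega(0.6)        =", omega(rho))
    print("beta_0.6          =", beta(rho))
    print("1 - omega(0.6)    =", 1 - omega(rho))
    print("7.284 (1 + beta)  =", 7.284 * (1 + beta(rho)))
    print("c_Delta           =", C_PLUS - C_E)
```

The last three printed values correspond to the constants $0.37268$, $7.45235$ and $0.02741$ that enter the numerical form of the bound on $\lambda(q)$.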

For $x > a$, where $a > 1$ is any constant, we obviously have

\[ \sum_{a < p \le x} \log\left( 1 + p^{-2/3} \right) \le \frac{\sum_{a < p \le x} (\log p) p^{-2/3}}{\log a}; \]

by Abel summation (13.3) and the estimate [RS62, (3.32)] for $\theta(x) = \sum_{p \le x} \log p$,

\[
\begin{aligned}
\sum_{a < p \le x} (\log p) p^{-2/3}
&= (\theta(x) - \theta(a)) x^{-2/3} - \int_a^x (\theta(u) - \theta(a)) \left( -\frac{2}{3} u^{-5/3} \right) du \\
&\le (1.01624 x - \theta(a)) x^{-2/3} + \frac{2}{3} \int_a^x (1.01624 u - \theta(a)) u^{-5/3}\, du \\
&= (1.01624 x - \theta(a)) x^{-2/3} + 2 \cdot 1.01624 (x^{1/3} - a^{1/3}) + \theta(a)(x^{-2/3} - a^{-2/3}) \\
&= 3 \cdot 1.01624 \cdot x^{1/3} - (2.03248 a^{1/3} + \theta(a) a^{-2/3}).
\end{aligned}
\]

We conclude that $\sum_{10^4 < p \le x} \log(1 + p^{-2/3}) \le 0.33102 x^{1/3} - 7.06909$ for $x > 10^4$. Since $\sum_{p \le 10^4} \log(1 + p^{-2/3}) \le 10.09062$, this means that

\[ \sum_{p \le x} \log(1 + p^{-2/3}) \le \left( 0.33102 + \frac{10.09062 - 7.06909}{10^{4/3}} \right) x^{1/3} \le 0.47126 x^{1/3} \]

for $x > 10^4$; a direct computation for all $x$ prime between 29 and $10^4$ then confirms that

\[ \sum_{p \le x} \log(1 + p^{-2/3}) \le 0.74914 x^{1/3} \]

for all $x \ge 29$. Thus,

\[ \prod_{p \le x} f_1(p) \le \frac{e^{\sum_{p \le x} \log(1 + p^{-2/3})}}{\prod_{p \le 29} \left( 1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)} \right)} \le \frac{e^{0.74914 x^{1/3}}}{6.62365} \]

for $x \ge 29$. Finally, by [RS62, (3.24)], $\sum_{p \le p_1} (\log p)/p < \log p_1$.

We conclude that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime, and $p_1$ the prime immediately preceding $p_0$,

\[
\lambda(q) \le \left( \frac{1.90516 \log p_1 \cdot 7.45235 \cdot \frac{e^{0.74914 p_1^{1/3}}}{6.62365}}{0.37268 (\log q - \log p_1) + 0.02741} \right)^3
\le \frac{190.272 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.07354)^3}. \tag{12.38}
\]

It is clear from (12.30) that $\varpi(q)$ is increasing as soon as

\[ q \ge \max\left( Q_{0,\min},\ Q_{0,\min}^{1-\omega(\rho)}/c_{\rho,2} \right) \]

and $c(c_+) q^\tau > \log q + 1$, since then $\varpi_0(q)$ is increasing and $\varpi(q) = \varpi_0(q)$. Here it is useful to recall that $c_{\rho,2} \ge \exp(1.4709 - c_+)$, and to note that $c(c_+) q^\tau - (\log q + 1)$ is increasing for $q \ge 1/(\tau \cdot c(c_+))^{1/\tau}$; we see also that $1/(\tau \cdot c(c_+))^{1/\tau} \le 1/((1 - 0.6) e^{-\gamma} c(c_+))^{1/((1-0.6) e^{-\gamma})}$ for $\rho \le 0.6$. A quick computation for our value of $c_+$ makes us conclude that $q > 1.12 Q_{0,\min} = 112000$ is a sufficient condition for $\varpi(q)$ to be equal to $\varpi_0(q)$ and for $\varpi_0(q)$ to be increasing.

Since (12.38) is decreasing on $q$ for $p_1$ fixed, and $\varpi_0(q)$ is decreasing on $\rho$ and increasing on $q$, we set $\rho = 0.6$ and check that then

\[ \varpi_0\left( 2.2 \cdot 10^{10} \right) \ge 846.765, \]

whereas, by (12.38),

\[ \lambda(2.2 \cdot 10^{10}) \le 838.227 < 846.765; \]

this is enough to ensure that $\lambda(q) < \varpi_0(q)$ for $2.2 \cdot 10^{10} \le q < \prod_{p \le 31} p$.

Let us now give some rough bounds that will be enough to cover the case $q \ge \prod_{p \le 31} p$. First, as we already discussed, $\varpi(q) = \varpi_0(q)$ and, since $c(c_+) q^\tau > \log q + 1$,

\[ \varpi_0(q) \ge (c(c_+) q^\tau - \log q)^{\frac{1}{1-\tau}} \ge (0.911 q^{0.224} - \log q)^{1.289} \ge q^{0.2797} \tag{12.39} \]

by $q \ge \prod_{p \le 31} p$. We are in the range $\prod_{p \le p_1} p \le q \le \prod_{p \le p_0} p$, where $p_1 < p_0$ are two consecutive primes with $p_1 \ge 31$. By [RS62, (3.16)] and a computation for $31 \le q < 200$, we know that $\log q \ge \sum_{p \le p_1} \log p \ge 0.8009 p_1$. By (12.38) and (12.39), it follows that we just have to show that

\[ e^{0.224 t} > \frac{190.272 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3} \]

for $t \ge 31$. Now, $t \ge 31$ implies $0.8009 t - \log t + 0.07354 \ge 0.6924 t$, and so, taking logarithms, we see that we just have to verify

\[ 0.224 t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.3513 \tag{12.40} \]

for $t \ge 31$, and, since the left side is increasing and the right side is decreasing for $t \ge 31$, this is trivial to check.
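For illustration, (12.40) and the two monotonicity claims can be spot-checked on a grid of integer values of $t$. This is a floating-point sketch with hypothetical function names, not a proof; the actual argument only needs the value at $t = 31$ together with monotonicity.

```python
import math

def lhs(t):
    """Left side of (12.40)."""
    return 0.224 * t - 2.24742 * t ** (1 / 3)

def rhs(t):
    """Right side of (12.40)."""
    return 3 * math.log(math.log(t)) - 3 * math.log(t) + 6.3513

def holds_on_grid(t_min, t_max):
    """True if lhs(t) > rhs(t) for every integer t in [t_min, t_max]."""
    return all(lhs(t) > rhs(t) for t in range(t_min, t_max + 1))

if __name__ == "__main__":
    print(lhs(31), rhs(31))
    print(holds_on_grid(31, 5000))
```

At $t = 31$ both sides are negative, with the left side the larger of the two; after that the gap only widens.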

We conclude that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$.

It remains to see how we can relax this assumption if we assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$. We repeat the same analysis as before, using (12.36) and (12.37) instead of (12.34) and (12.35). For $p_1 \ge 29$,

\[
\prod_{p \le p_1,\ p \ne 7} \frac{p}{p-1} < 1.633 \log p_1, \qquad
\prod_{p \le p_1,\ p \ne 7} f_1(p) \le \frac{e^{0.74914 x^{1/3} - \log(1 + 7^{-2/3})}}{5.8478} \le \frac{e^{0.74914 x^{1/3}}}{7.44586},
\]

and $\sum_{p \le p_1,\ p \ne 7} (\log p)/p < \log p_1 - (\log 7)/7$. So, for $q < \prod_{p \le p_0,\ p \ne 7} p$, and $p_1 \ge 29$ the prime immediately preceding $p_0$,

\[
\lambda(q) \le \left( \frac{1.633 \log p_1 \cdot 7.45235 \cdot \frac{e^{0.74914 p_1^{1/3}}}{7.44586}}{0.37268 \left( \log q - \log p_1 + \frac{\log 7}{7} \right) + 0.02741} \right)^3
\le \frac{84.351 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.35152)^3}.
\]

Thus we obtain, just like before, that

\[ \varpi_0(3.3 \cdot 10^9) \ge 477.465, \qquad \lambda(3.3 \cdot 10^9) \le 475.513 < 477.465. \]

We also check that $\varpi_0(q_0) \ge 916.322$ is greater than $\lambda(q_0) \le 429.731$ for $q_0 = \prod_{p \le 31,\ p \ne 7} p$. The analysis for $q \ge \prod_{p \le 37,\ p \ne 7} p$ is also just like before: since $\log q \ge 0.8009 p_1 - \log 7$, we have to show that

\[ \frac{e^{0.224 t}}{7} > \frac{84.351 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3} \]

for $t \ge 37$, and that, in turn, follows from

\[ 0.224 t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.74849, \]

which we check for $t \ge 37$ just as we checked (12.40).

We conclude that $\varpi(q) > \lambda(q)$ if $q \ge 3.3 \cdot 10^9$ and $210 \nmid q$.

Computation. Now, for $q < 3.3 \cdot 10^9$ (and also for $3.3 \cdot 10^9 \le q < 2.2 \cdot 10^{10}$, $210 | q$), we need to check that the maximum $m_{q,R,1}$ of $\mathrm{err}_{q,R}$ over all $\varpi(q) \le R < \lambda(q)$ satisfies (12.31). Note that there is a term $\mathrm{err}_{q,tR}$ in (12.31); we bound it using (12.32).

Since $\log R$ is increasing on $R$ and $G_q(R)$ depends only on $\lfloor R \rfloor$, we can tell from (12.24) that, since we are taking the maximum of $\mathrm{err}_{q,R}$, it is enough to check integer values of $R$. We check all integers $R$ in $[\varpi(q), \lambda(q))$ for all $q < 3.3 \cdot 10^9$ (and all $3.3 \cdot 10^9 \le q < 2.2 \cdot 10^{10}$, $210 | q$) by an explicit computation.²

Finally, we have the trivial bound

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le 1, \tag{12.41} \]

which we shall use for $Q_0$ close to $Q$.

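Corollary 12.2.5. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 10^5$, $\delta_0 \ge 1$ be such that $(20000 Q_0)^2 \le x/2\delta_0$; set $Q = \sqrt{x/2\delta_0}$.

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Let $\mathfrak{M}$ be as in (12.1). Then, if $Q_0 \le Q^{0.6}$,

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2, \]

where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p \ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$.

Let $\mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Then, if $(2Q_0) \le (2Q)^{0.6}$,

\[ \int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \le \frac{\log 2Q_0 + c_+}{\log 2Q + c_E} \sum_n |a_n|^2. \tag{12.42} \]

Here, of course, $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2$ (Plancherel). If $Q_0 > Q^{0.6}$, we will use the trivial bound

\[ \int_{\mathfrak{M}_{\delta_0,r}} |S(\alpha)|^2\, d\alpha \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \tag{12.43} \]

Proof. Immediate from Prop. 12.1.1, Prop. 12.1.2 and Prop. 12.2.4.

Obviously, one can also give a statement derived from Prop. 12.1.1; the resulting bound is

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2, \]

where $\mathfrak{M}$ is as in (12.1).

We also record the large-sieve form of the result.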

² This is by far the heaviest computation in the present work, though it is still rather minor (about two weeks of computing on a single core of a fairly new (2010) desktop computer carrying out other tasks as well; this is next to nothing compared to the computations in [Plab], or even those in [HP13]). For the applications here, we could have assumed $\rho \le 8/15$, and that would have reduced computation time drastically; the lighter assumption $\rho \le 0.6$ was made with views to general applicability in the future. As elsewhere in this section, numerical computations were carried out by the author in C; all floating-point operations used D. Platt's interval arithmetic package.

Corollary 12.2.6. Let $N \ge 1$. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the integers $n \le N$. Let $Q_0 \ge 10^5$, $Q \ge 20000 Q_0$. Assume that $a_n = 0$ for every $n$ for which there is a $p \le Q$ dividing $n$.

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then, if $Q_0 \le Q^{0.6}$,

\[ \sum_{q \le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2 \le \frac{\log Q_0 + c_+}{\log Q + c_E} \cdot (N + Q^2) \sum_n |a_n|^2, \]

where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p \ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$.

Proof. Proceed as Ramaré does in the proof of [Ram09, Thm. 5.2], with $K_q = \{a \in \mathbb{Z}/q\mathbb{Z} : (a, q) = 1\}$ and $u_n = a_n$; in particular, apply [Ram09, Thm. 2.1]. The proof of [Ram09, Thm. 5.2] shows that

\[ \sum_{q \le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2 \le \max_{q \le Q_0} \frac{G_q(Q_0)}{G_q(Q)} \cdot \sum_{q \le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2. \]

Now, instead of using the easy inequality $G_q(Q_0)/G_q(Q) \le G_1(Q_0)/G_1(Q/Q_0)$, use Prop. 12.2.4.

It would seem desirable to prove a result such as Prop. 12.2.4 (or Cor. 12.2.5, or Cor. 12.2.6) without computations and with conditions that are as weak as possible. Since, as we said, we cannot make $c_+$ equal to $c_E$, and since $c_+$ does have to increase when the conditions are weakened (as is shown by computations; this is not an artifact of our method of proof), the right goal might be to show that the maximum of $G_q(Q_0/sq)/G_q(Q/sq)$ is reached when $s = q = 1$.

However, this is also untrue without conditions. For instance, for $Q_0 = 2$ and $Q$ large, the value of $G_q(Q_0/q)/G_q(Q/q)$ at $q = 2$ is larger than at $q = 1$: by (12.12),

\[
\frac{G_2\left( \frac{Q_0}{2} \right)}{G_2\left( \frac{Q}{2} \right)} \sim \frac{1}{\frac{1}{2} \left( \log\frac{Q}{2} + c_E + \frac{\log 2}{2} \right)} = \frac{2}{\log Q + c_E - \frac{\log 2}{2}} > \frac{2}{\log Q + c_E} \sim \frac{G(Q_0)}{G(Q)}.
\]

Thus, at the very least, a lower bound on $Q_0$ is needed as a condition. This also dims the hopes somewhat for a combinatorial proof of $G_q(Q_0/q) G(Q) \le G_q(Q/q) G(Q_0)$; at any rate, while such a proof would be welcome, it could not be extremely straightforward, since there are terms in $G_q(Q_0/q) G(Q)$ that do not appear in $G_q(Q/q) G(Q_0)$.


Chapter 13

The integral over the minor arcs

The time has come to bound the part of our triple-product integral (10.3) that comes from the minor arcs $\mathfrak{m} \subset \mathbb{R}/\mathbb{Z}$. We have an $\ell_\infty$ estimate (from Prop. 11.2.3, based on Theorem 3.1.1) and an $\ell_2$ estimate (from §12.2). Now we must put them together.

There are two ways in which we must be careful. A trivial bound of the form $\ell_3^3 = \int |S(\alpha)|^3\, d\alpha \le \ell_2^2 \cdot \ell_\infty$ would introduce a fatal factor of $\log x$ coming from $\ell_2$. We avoid this by using the fact that we have $\ell_2$ estimates over $\mathfrak{M}_{\delta_0,Q_0}$ for varying $Q_0$.

We must also remember to subtract the major-arc contribution from our estimate for $\mathfrak{M}_{\delta_0,Q_0}$; this is why we were careful to give a lower bound in Lem. 10.3.1, as opposed to just the upper bound (10.28).

13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds

Let us start with a simple lemma, essentially a way to obtain upper bounds by means of summation by parts.

Lemma 13.1.1. Let $f, g : \{a, a+1, \ldots, b\} \to \mathbb{R}_0^+$, where $a, b \in \mathbb{Z}^+$. Assume that, for all $x \in [a, b]$,

\[ \sum_{a \le n \le x} f(n) \le F(x), \tag{13.1} \]

where $F : [a, b] \to \mathbb{R}$ is continuous, piecewise differentiable and non-decreasing. Then

\[ \sum_{n=a}^b f(n) \cdot g(n) \le (\max_{n \ge a} g(n)) \cdot F(a) + \int_a^b (\max_{n \ge u} g(n)) \cdot F'(u)\, du. \]

Proof. Let $S(n) = \sum_{m=a}^n f(m)$. Then, by partial summation,

\[ \sum_{n=a}^b f(n) \cdot g(n) \le S(b) g(b) + \sum_{n=a}^{b-1} S(n) (g(n) - g(n+1)). \tag{13.2} \]

Let $h(x) = \max_{x \le n \le b} g(n)$. Then $h$ is non-increasing. Hence (13.1) and (13.2) imply that

\[
\begin{aligned}
\sum_{n=a}^b f(n) g(n) &\le \sum_{n=a}^b f(n) h(n) \\
&\le S(b) h(b) + \sum_{n=a}^{b-1} S(n) (h(n) - h(n+1)) \\
&\le F(b) h(b) + \sum_{n=a}^{b-1} F(n) (h(n) - h(n+1)).
\end{aligned}
\]

In general, for $\alpha_n \in \mathbb{C}$, $A(x) = \sum_{a \le n \le x} \alpha_n$ and $F$ continuous and piecewise differentiable on $[a, x]$,

\[ \sum_{a \le n \le x} \alpha_n F(n) = A(x) F(x) - \int_a^x A(u) F'(u)\, du. \qquad \text{(Abel summation)} \tag{13.3} \]

Applying this with $\alpha_n = h(n) - h(n+1)$ and $A(x) = \sum_{a \le n \le x} \alpha_n = h(a) - h(\lfloor x \rfloor + 1)$, we obtain

\[
\begin{aligned}
\sum_{n=a}^{b-1} F(n) (h(n) - h(n+1))
&= (h(a) - h(b)) F(b-1) - \int_a^{b-1} (h(a) - h(\lfloor u \rfloor + 1)) F'(u)\, du \\
&= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(\lfloor u \rfloor + 1) F'(u)\, du \\
&= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(u) F'(u)\, du \\
&= h(a) F(a) - h(b) F(b) + \int_a^b h(u) F'(u)\, du,
\end{aligned}
\]

since $h(\lfloor u \rfloor + 1) = h(u)$ for $u \notin \mathbb{Z}$. Hence

\[ \sum_{n=a}^b f(n) g(n) \le h(a) F(a) + \int_a^b h(u) F'(u)\, du. \]
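A toy instance may help to see what the lemma says. The sketch below makes several simplifying assumptions that are not in the text: it takes the linear majorant $F(x) = c \cdot (x - a + 1)$ with $c = \max f$, so that $F' = c$ and the integral of the step function $u \mapsto \max_{u \le n \le b} g(n)$ reduces to a finite sum; the function name `lemma_sides` and the sample $f$, $g$ are illustrative only.

```python
import math

def lemma_sides(f, g, a, b):
    """Return (lhs, rhs) of Lemma 13.1.1 for F(x) = c*(x - a + 1), c = max f.

    With this F, sum_{a<=n<=x} f(n) <= c*(floor(x) - a + 1) <= F(x), F(a) = c
    and F'(u) = c.  On each interval (k-1, k] the function
    h(u) = max_{u <= n <= b} g(n) equals h(k), so the integral of h*F'
    over [a, b] is c * (h(a+1) + ... + h(b)).
    """
    c = max(f(n) for n in range(a, b + 1))
    # Suffix maxima: h[k] = max(g(k), g(k+1), ..., g(b)).
    h = {}
    running = 0.0
    for k in range(b, a - 1, -1):
        running = max(running, g(k))
        h[k] = running
    lhs = sum(f(n) * g(n) for n in range(a, b + 1))
    rhs = h[a] * c + c * sum(h[k] for k in range(a + 1, b + 1))
    return lhs, rhs

if __name__ == "__main__":
    f = lambda n: float(n % 3)                       # non-negative weights
    g = lambda n: 1.0 / n + 0.3 * math.sin(n) ** 2   # non-negative, not monotone
    print(lemma_sides(f, g, 5, 200))
```

The point of the lemma is that $g$ need not be monotone: it is replaced by its non-increasing majorant $h$ before summation by parts is applied.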

We will now see our main application of Lemma 13.1.1. We have to bound an integral of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha$, where $\mathfrak{M}_{\delta_0,r}$ is a union of arcs defined as in (10.5). Our inputs are (a) a bound on integrals of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha$, (b) a bound on $|S_2(\alpha)|$ for $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}$. The input of type (a) is what we derived in §12.1 and §12.2; the input of type (b) is a minor-arcs bound, and as such was the main subject of Part I.

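Proposition 13.1.2. Let $S_1(\alpha) = \sum_n a_n e(\alpha n)$, $a_n \in \mathbb{C}$, $\{a_n\}$ in $L^1$. Let $S_2 : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ be continuous. Define $\mathfrak{M}_{\delta_0,r}$ as in (10.5).

Let $r_0$ be a positive integer not greater than $r_1$. Let $H : [r_0, r_1] \to \mathbb{R}^+$ be a continuous, piecewise differentiable, non-decreasing function such that

\[ \frac{1}{\sum |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1}} |S_1(\alpha)|^2\, d\alpha \le H(r) \tag{13.4} \]

for some $\delta_0 \le x/2r_1^2$ and all $r \in [r_0, r_1]$. Assume, moreover, that $H(r_1) = 1$. Let $g : [r_0, r_1] \to \mathbb{R}^+$ be a non-increasing function such that

\[ \max_{\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}} |S_2(\alpha)| \le g(r) \tag{13.5} \]

for all $r \in [r_0, r_1]$ and $\delta_0$ as above.

Then

\[
\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha
\le g(r_0) \cdot (H(r_0) - I_0) + \int_{r_0}^{r_1} g(r) H'(r)\, dr, \tag{13.6}
\]

where

\[ I_0 = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha. \tag{13.7} \]

The condition $\delta_0 \le x/2r_1^2$ is there just to ensure that the arcs in the definition of $\mathfrak{M}_{\delta_0,r}$ do not overlap for $r \le r_1$.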

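Proof. For $r_0 \le r < r_1$, let

\[ f(r) = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1} \setminus \mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha. \]

Let

\[ f(r_1) = \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_1}} |S_1(\alpha)|^2\, d\alpha. \]

Then, by (13.5),

\[ \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \le \sum_{r=r_0}^{r_1} f(r) g(r). \]

By (13.4),

\[
\begin{aligned}
\sum_{r_0 \le r \le x} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1} \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha \\
&= \left( \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1}} |S_1(\alpha)|^2\, d\alpha \right) - I_0 \le H(x) - I_0
\end{aligned} \tag{13.8}
\]

for $x \in [r_0, r_1)$. Moreover,

\[
\begin{aligned}
\sum_{r_0 \le r \le r_1} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 \\
&= \left( \frac{1}{\sum_n |a_n|^2} \int_{\mathbb{R}/\mathbb{Z}} |S_1(\alpha)|^2 \right) - I_0 = 1 - I_0 = H(r_1) - I_0.
\end{aligned}
\]

We let $F(x) = H(x) - I_0$ and apply Lemma 13.1.1 with $a = r_0$, $b = r_1$. We obtain that

\[
\begin{aligned}
\sum_{r=r_0}^{r_1} f(r) g(r) &\le (\max_{r \ge r_0} g(r)) F(r_0) + \int_{r_0}^{r_1} (\max_{r \ge u} g(r)) F'(u)\, du \\
&\le g(r_0) (H(r_0) - I_0) + \int_{r_0}^{r_1} g(u) H'(u)\, du.
\end{aligned}
\]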

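13.2 The minor-arc total

We now apply Prop. 13.1.2. Inevitably, the main statement involves some integrals that will have to be evaluated at the end of the section.

Theorem 13.2.1. Let $x \ge 10^{25} \cdot \kappa$, where $\kappa \ge 1$. Let

\[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x). \tag{13.9} \]

Let $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$, where $\eta_2$ is as in (11.10) and $\varphi : [0, \infty) \to [0, \infty)$ is continuous and in $\ell_1$. Let $\eta_+ : [0, \infty) \to [0, \infty)$ be a bounded, piecewise differentiable function with $\lim_{t \to \infty} \eta_+(t) = 0$. Let $\mathfrak{M}_{\delta_0,r}$ be as in (10.5) with $\delta_0 = 8$. Let $10^5 \le r_0 < r_1$, where $r_1 = (3/8)(x/\kappa)^{4/15}$. Let $g(r) = g_{x/\kappa,\varphi}(r)$, where

\[ g_{y,\varphi}(r) = \frac{(R_{y,K,\varphi,2r} \log 2r + 0.5) \sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36 K^{1/6} y^{-1/6}, \tag{13.10} \]

just as in (11.19), and $K = \log(x/\kappa)/2$. Here $R_{y,K,\varphi,t}$ is as in (11.19), and $L_t$ is as in (11.13).

Denote

\[ Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha. \]

Then

\[ Z_{r_0} \le \left( \sqrt{\frac{|\varphi|_1 x}{\kappa} (M + T)} + \sqrt{S_{\eta_*}(0, x) \cdot E} \right)^2, \]

where

\[
\begin{aligned}
S &= \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x), \\
T &= C_{\varphi,3}\left( \frac{1}{2} \log\frac{x}{\kappa} \right) \cdot (S - (\sqrt{J} - \sqrt{E})^2), \\
J &= \int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \\
E &= \left( (C_{\eta_+,0} + C_{\eta_+,2}) \log x + (2 C_{\eta_+,0} + C_{\eta_+,1}) \right) \cdot x^{1/2},
\end{aligned} \tag{13.11}
\]

\[
\begin{aligned}
C_{\eta_+,0} &= 0.7131 \int_0^\infty \frac{1}{\sqrt{t}} (\sup_{r \ge t} \eta_+(r))^2\, dt, \\
C_{\eta_+,1} &= 0.7131 \int_1^\infty \frac{\log t}{\sqrt{t}} (\sup_{r \ge t} \eta_+(r))^2\, dt, \\
C_{\eta_+,2} &= 0.51942 |\eta_+|_\infty^2, \\
C_{\varphi,3}(K) &= \frac{1.04488}{|\varphi|_1} \int_0^{1/K} |\varphi(w)|\, dw
\end{aligned} \tag{13.12}
\]

and

\[
\begin{aligned}
M = {} & g(r_0) \cdot \left( \frac{\log(r_0 + 1) + c_+}{\log\sqrt{x} + c_-} \cdot S - (\sqrt{J} - \sqrt{E})^2 \right) \\
& + \left( \frac{2}{\log x + 2c_-} \int_{r_0}^{r_1} \frac{g(r)}{r}\, dr + \left( \frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log\kappa}{\log x + 2c_-} \right) g(r_1) \right) \cdot S,
\end{aligned} \tag{13.13}
\]

where $c_+ = 2.0532$ and $c_- = 0.6394$.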

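Proof. Let $y = x/\kappa$. Let $Q = (3/4) y^{2/3}$, as in Thm. 3.1.1 (applied with $y$ instead of $x$). Let $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$, where $r_0 \le r \le y^{1/3}/6$ and $y$ is used instead of $x$ to define $\mathfrak{M}_{8,r}$ (see (10.5)). There exists an approximation $2\alpha = a/q + \delta/y$ with $q \le Q$, $|\delta|/y \le 1/qQ$. Thus, $\alpha = a'/q' + \delta/2y$, where either $a'/q' = a/2q$ or $a'/q' = (a + q)/2q$ holds. (In particular, if $q'$ is odd, then $q' = q$; if $q'$ is even, then $q' = 2q$.)

There are three cases:

1. $q \le r$. Then either (a) $q'$ is odd and $q' \le r$ or (b) $q'$ is even and $q' \le 2r$. Since $\alpha$ is not in $\mathfrak{M}_{8,r}$, then, by definition (10.5), $|\delta|/2y \ge \delta_0 r/2qy$, and so $|\delta| \ge \delta_0 r/q = 8r/q$. In particular, $|\delta| \ge 8$.

Thus, by Prop. 11.2.3,

\[ |S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le g_{y,\varphi}\left( \frac{|\delta|}{8} q \right) \cdot |\varphi|_1 y \le g_{y,\varphi}(r) \cdot |\varphi|_1 y, \tag{13.14} \]

where we use the fact that $g(r)$ is a non-increasing function (Lemma 11.2.4).

2. $r < q \le y^{1/3}/6$. Then, by Prop. 11.2.3 and Lemma 11.2.4,

\[ |S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le g_{y,\varphi}\left( \max\left( \frac{|\delta|}{8}, 1 \right) q \right) \cdot |\varphi|_1 y \le g_{y,\varphi}(r) \cdot |\varphi|_1 y. \tag{13.15} \]

3. $q > y^{1/3}/6$. Again by Prop. 11.2.3,

\[ |S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le \left( h\left( \frac{y}{K} \right) + C_{\varphi,3}(K) \right) |\varphi|_1 y, \tag{13.16} \]

where $h(x)$ is as in (11.15). (Of course, $C_{\varphi,3}(K)$, as in (13.12), is equal to $C_{\varphi,0,K}/|\varphi|_1$, where $C_{\varphi,0,K}$ is as in (11.21).) We set $K = (\log y)/2$. Since $y = x/\kappa \ge 10^{25}$, it follows that $y/K = 2y/\log y > 3.47 \cdot 10^{23} > 2.16 \cdot 10^{20}$.

Let

\[
r_1 = \frac{3}{8} y^{4/15}, \qquad
g(r) = \begin{cases} g_{y,\varphi}(r) & \text{if } r \le r_1, \\ g_{y,\varphi}(r_1) & \text{if } r > r_1. \end{cases}
\]

By Lemma 11.2.4, for $r \ge 670$, $g(r)$ is a non-increasing function and $g(r) \ge g_{y,\varphi}(r)$. Moreover, by Lemma 11.2.5, $g_{y,\varphi}(r_1) \ge h(2y/\log y)$, where $h$ is as in (11.15), and so $g(r) \ge h(2y/\log y)$ for all $r \ge r_0 \ge 670$. Thus, we have shown that

\[ |S_{\eta_*}(y, \alpha)| \le \left( g(r) + C_{\varphi,3}\left( \frac{\log y}{2} \right) \right) \cdot |\varphi|_1 y \tag{13.17} \]

for all $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$.

We first need to undertake the fairly dull task of getting non-prime or small $n$ out of the sum defining $S_{\eta_+}(\alpha, x)$. Write

\[
\begin{aligned}
S_{1,\eta_+}(\alpha, x) &= \sum_{p > \sqrt{x}} (\log p) e(\alpha p) \eta_+(p/x), \\
S_{2,\eta_+}(\alpha, x) &= \sum_{\substack{n \text{ non-prime} \\ n > \sqrt{x}}} \Lambda(n) e(\alpha n) \eta_+(n/x) + \sum_{n \le \sqrt{x}} \Lambda(n) e(\alpha n) \eta_+(n/x).
\end{aligned}
\]

By the triangle inequality (with weights $|S_{\eta_+}(\alpha, x)|$),

\[
\sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha}
\le \sum_{j=1}^2 \sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{j,\eta_+}(\alpha, x)|^2\, d\alpha}.
\]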

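Clearly,

\[
\begin{aligned}
\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha
&\le \max_{\alpha \in \mathbb{R}/\mathbb{Z}} |S_{\eta_*}(\alpha, x)| \cdot \int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \\
&\le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot \left( \sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 + \sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2 \right).
\end{aligned}
\]

Let $\overline{\eta_+}(z) = \sup_{t \ge z} \eta_+(t)$. Since $\eta_+(t)$ tends to 0 as $t \to \infty$, so does $\overline{\eta_+}$. By [RS62, Thm. 13], partial summation and integration by parts,

\[
\begin{aligned}
\sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 &\le \sum_{n \text{ non-prime}} \Lambda(n)^2 \overline{\eta_+}(n/x)^2 \\
&\le -\int_1^\infty \left( \sum_{\substack{n \le t \\ n \text{ non-prime}}} \Lambda(n)^2 \right) \left( \overline{\eta_+}^2(t/x) \right)' dt \\
&\le -\int_1^\infty (\log t) \cdot 1.4262 \sqrt{t} \left( \overline{\eta_+}^2(t/x) \right)' dt \\
&\le 0.7131 \int_1^\infty \frac{\log e^2 t}{\sqrt{t}} \cdot \overline{\eta_+}^2\left( \frac{t}{x} \right) dt \\
&= \left( 0.7131 \int_{1/x}^\infty \frac{2 + \log tx}{\sqrt{t}} \overline{\eta_+}^2(t)\, dt \right) \sqrt{x},
\end{aligned}
\]

while, by [RS62, Thm. 12],

\[
\sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2 \le \frac{1}{2} |\eta_+|_\infty^2 (\log x) \sum_{n \le \sqrt{x}} \Lambda(n) \le 0.51942 |\eta_+|_\infty^2 \cdot \sqrt{x} \log x.
\]

This shows that

\[ \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot E = S_{\eta_*}(0, x) \cdot E, \]

where $E$ is as in (13.11).

It remains to bound

\[ \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.18} \]

We wish to apply Prop. 13.1.2. Corollary 12.2.5 gives us an input of type (13.4); we have just derived a bound (13.17) that provides an input of type (13.5). More precisely, by (12.42), (13.4) holds with

\[
H(r) = \begin{cases} \dfrac{\log(r + 1) + c_+}{\log\sqrt{x} + c_-} & \text{if } r < r_1, \\ 1 & \text{if } r \ge r_1, \end{cases}
\]

where $c_+ = 2.0532 > \log 2 + 1.36$ and $c_- = 0.6394 < \log(1/\sqrt{2 \cdot 8}) + \log 2 + 1.3325822$. (We can apply Corollary 12.2.5 because $2(r_1 + 1) = (3/4) x^{4/15} + 2 \le (2\sqrt{x/16})^{0.6}$ for $x \ge 10^{25}$ (or even for $x \ge 100000$).) Since $r_1 = (3/8) y^{4/15}$ and $x \ge 10^{25} \cdot \kappa$,

\[
\begin{aligned}
\lim_{r \to r_1^+} H(r) - \lim_{r \to r_1^-} H(r) &= 1 - \frac{\log((3/8)(x/\kappa)^{4/15} + 1) + c_+}{\log\sqrt{x} + c_-} \\
&\le 1 - \left( \frac{4/15}{1/2} + \frac{\log\frac{3}{8} + c_+ - \frac{4}{15} \log\kappa - \frac{8}{15} c_-}{\log\sqrt{x} + c_-} \right) \\
&\le \frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log\kappa}{\log x + 2c_-}.
\end{aligned}
\]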
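The conditions on $c_+$ and $c_-$, and the hypothesis needed to apply Corollary 12.2.5, are simple enough to spot-check numerically. The following sketch assumes $\kappa = 1$ (so that $y = x$) and uses plain floating point; the function names `r1`, `corollary_applies` and `H` are illustrative and not taken from the text.

```python
import math

C_PLUS = 2.0532    # c_+ as in Theorem 13.2.1
C_MINUS = 0.6394   # c_- as in Theorem 13.2.1
C_E = 1.3325822    # c_E as quoted in (12.14)

def r1(x):
    """r_1 = (3/8) x^(4/15), taking kappa = 1."""
    return 0.375 * x ** (4 / 15)

def corollary_applies(x):
    """Check 2(r_1 + 1) <= (2 sqrt(x/16))^0.6, the hypothesis (2 Q_0) <= (2 Q)^0.6."""
    return 2 * (r1(x) + 1) <= (2 * math.sqrt(x / 16)) ** 0.6

def H(r, x):
    """The function H(r) used as input (13.4), for kappa = 1."""
    if r >= r1(x):
        return 1.0
    return (math.log(r + 1) + C_PLUS) / (math.log(math.sqrt(x)) + C_MINUS)

if __name__ == "__main__":
    print(C_PLUS - (math.log(2) + 1.36))
    print(math.log(1 / math.sqrt(2 * 8)) + math.log(2) + C_E - C_MINUS)
    for x in (1e5, 1e25, 1e30):
        print(x, corollary_applies(x), H(1e5, x) if r1(x) > 1e5 else None)
```

Both printed differences are small and positive, which reflects that $c_+$ and $c_-$ are just rounded versions of $\log 2 + 1.36$ and $\log(1/4) + \log 2 + c_E$.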

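We also have (13.5) with

\[ \left( g(r) + C_{\varphi,3}\left( \frac{\log y}{2} \right) \right) \cdot |\varphi|_1 y \tag{13.19} \]

instead of $g(r)$ (by (13.17)). Here (13.19) is a non-increasing function of $r$ because $g(r)$ is, as we already checked. Hence, Prop. 13.1.2 gives us that (13.18) is at most

\[
\begin{aligned}
& g(r_0) \cdot (H(r_0) - I_0) + (1 - I_0) \cdot C_{\varphi,3}\left( \frac{\log y}{2} \right) \\
& + \frac{1}{\log\sqrt{x} + c_-} \int_{r_0}^{r_1} \frac{g(r)}{r + 1}\, dr + \left( \frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log\kappa}{\log x + 2c_-} \right) g(r_1)
\end{aligned} \tag{13.20}
\]

times $|\varphi|_1 y \cdot \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)$, where

\[ I_0 = \frac{1}{\sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)} \int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.21} \]

By the triangle inequality,

\[
\begin{aligned}
\sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha} &= \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x) - S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
&\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
&\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}.
\end{aligned}
\]

As we already showed,

\[ \int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha = \sum_{\substack{n \text{ non-prime} \\ \text{or } n \le \sqrt{x}}} \Lambda(n)^2 \eta_+(n/x)^2 \le E. \]

Thus,

\[ I_0 \cdot S \ge (\sqrt{J} - \sqrt{E})^2, \]

and so we are done.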

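We now should estimate the integral $\int_{r_0}^{r_1} \frac{g(r)}{r}\, dr$ in (13.13). It is easy to see that

\[
\begin{aligned}
\int_{r_0}^\infty \frac{1}{r^{3/2}}\, dr &= \frac{2}{r_0^{1/2}}, &
\int_{r_0}^\infty \frac{\log r}{r^2}\, dr &= \frac{\log e r_0}{r_0}, &
\int_{r_0}^\infty \frac{1}{r^2}\, dr &= \frac{1}{r_0}, \\
\int_{r_0}^{r_1} \frac{1}{r}\, dr &= \log\frac{r_1}{r_0}, &
\int_{r_0}^\infty \frac{\log r}{r^{3/2}}\, dr &= \frac{2 \log e^2 r_0}{\sqrt{r_0}}, &
\int_{r_0}^\infty \frac{\log 2r}{r^{3/2}}\, dr &= \frac{2 \log 2e^2 r_0}{\sqrt{r_0}}, \\
\int_{r_0}^\infty \frac{(\log 2r)^2}{r^{3/2}}\, dr &= \frac{2 P_2(\log 2r_0)}{\sqrt{r_0}}, &
\int_{r_0}^\infty \frac{(\log 2r)^3}{r^{3/2}}\, dr &= \frac{2 P_3(\log 2r_0)}{r_0^{1/2}}, &&
\end{aligned} \tag{13.22}
\]

where

\[ P_2(t) = t^2 + 4t + 8, \qquad P_3(t) = t^3 + 6t^2 + 24t + 48. \tag{13.23} \]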
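The formulas in (13.22) involving $P_2$ and $P_3$ can be checked symbolically. Since $\frac{d}{dr}\left( -2 r^{-1/2} P(\log 2r) \right) = r^{-3/2} (P - 2P')(\log 2r)$, the identity $\int_{r_0}^\infty (\log 2r)^k r^{-3/2}\, dr = 2 P(\log 2r_0)/\sqrt{r_0}$ holds exactly when $P - 2P' = t^k$. The sketch below verifies this with integer coefficient lists (lowest degree first); the helper names are hypothetical.

```python
def derivative(coeffs):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def p_minus_twice_derivative(coeffs):
    """Coefficients of P - 2 P'."""
    d = derivative(coeffs) + [0]          # pad to the length of P
    return [c - 2 * dc for c, dc in zip(coeffs, d)]

def monomial(k):
    """Coefficients of t^k."""
    return [0] * k + [1]

P1 = [2, 1]             # t + 2, i.e. log(2 e^2 r0) = log 2r0 + 2
P2 = [8, 4, 1]          # t^2 + 4t + 8
P3 = [48, 24, 6, 1]     # t^3 + 6t^2 + 24t + 48

def p3_minus_t_p2():
    """Coefficients of P3(t) - t*P2(t), with the vanishing t^3 term dropped."""
    shifted = [0] + P2                     # t * P2(t)
    diff = [a - b for a, b in zip(P3, shifted)]
    while diff and diff[-1] == 0:
        diff.pop()
    return diff

if __name__ == "__main__":
    for k, P in ((1, P1), (2, P2), (3, P3)):
        print(k, p_minus_twice_derivative(P) == monomial(k))
    print(p3_minus_t_p2())
```

The last line computes the combination $P_3(t) - t P_2(t)$, a quadratic polynomial that appears when the integrals with $(\log 2r)^2$ and $(\log 2r)^3$ are combined.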

We also have

\[ \int_{r_0}^\infty \frac{dr}{r^2 \log r} = E_1(\log r_0), \tag{13.24} \]

where $E_1$ is the exponential integral

\[ E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt. \]

We must also estimate the integrals

\[ \int_{r_0}^{r_1} \frac{\sqrt{z(r)}}{r^{3/2}}\, dr, \qquad \int_{r_0}^{r_1} \frac{z(r)}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{z(r) \log r}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{z(r)}{r^{3/2}}\, dr. \tag{13.25} \]

Clearly, $z(r) - e^\gamma \log\log r = 2.50637/\log\log r$ is decreasing on $r$. Hence, for $r \ge 10^5$,

\[ z(r) \le e^\gamma \log\log r + c_\gamma, \]

where $c_\gamma = 1.025742$. Let $F(t) = e^\gamma \log t + c_\gamma$. Then $F''(t) = -e^\gamma/t^2 < 0$. Hence

\[ \frac{d^2 \sqrt{F(t)}}{dt^2} = \frac{F''(t)}{2\sqrt{F(t)}} - \frac{(F'(t))^2}{4 (F(t))^{3/2}} < 0 \]

for all $t > 0$. In other words, $\sqrt{F(t)}$ is convex-down, and so we can bound $\sqrt{F(t)}$ from above by $\sqrt{F(t_0)} + (\sqrt{F})'(t_0) \cdot (t - t_0)$, for any $t \ge t_0 > 0$. Hence, for $r \ge r_0 \ge 10^5$,

\[
\begin{aligned}
\sqrt{z(r)} \le \sqrt{F(\log r)} &\le \sqrt{F(\log r_0)} + \left. \frac{d\sqrt{F(t)}}{dt} \right|_{t = \log r_0} \cdot \log\frac{r}{r_0} \\
&= \sqrt{F(\log r_0)} + \frac{e^\gamma}{\sqrt{F(\log r_0)}} \cdot \frac{\log\frac{r}{r_0}}{2 \log r_0}.
\end{aligned}
\]

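Thus, by (13.22),

\[
\begin{aligned}
\int_{r_0}^\infty \frac{\sqrt{z(r)}}{r^{3/2}}\, dr &\le \sqrt{F(\log r_0)} \left( 2 - \frac{e^\gamma}{F(\log r_0)} \right) \frac{1}{\sqrt{r_0}} + \frac{e^\gamma}{\sqrt{F(\log r_0)} \log r_0} \cdot \frac{\log e^2 r_0}{\sqrt{r_0}} \\
&= \frac{2 \sqrt{F(\log r_0)}}{\sqrt{r_0}} \left( 1 + \frac{e^\gamma}{F(\log r_0) \log r_0} \right).
\end{aligned} \tag{13.26}
\]

The other integrals in (13.25) are easier. Just as in (13.26), we extend the range of integration to $[r_0, \infty]$. Using (13.22) and (13.24), we obtain

\[
\begin{aligned}
\int_{r_0}^\infty \frac{z(r)}{r^2}\, dr &\le \int_{r_0}^\infty \frac{F(\log r)}{r^2}\, dr = e^\gamma \left( \frac{\log\log r_0}{r_0} + E_1(\log r_0) \right) + \frac{c_\gamma}{r_0}, \\
\int_{r_0}^\infty \frac{z(r) \log r}{r^2}\, dr &\le e^\gamma \left( \frac{(1 + \log r_0) \log\log r_0 + 1}{r_0} + E_1(\log r_0) \right) + \frac{c_\gamma \log e r_0}{r_0}.
\end{aligned}
\]

By [OLBC10, (6.8.2)],

\[ \frac{1}{r (\log r + 1)} \le E_1(\log r) \le \frac{1}{r \log r}. \]

(The second inequality is obvious.) Hence

\[
\begin{aligned}
\int_{r_0}^\infty \frac{z(r)}{r^2}\, dr &\le \frac{e^\gamma (\log\log r_0 + 1/\log r_0) + c_\gamma}{r_0}, \\
\int_{r_0}^\infty \frac{z(r) \log r}{r^2}\, dr &\le \frac{e^\gamma \left( \log\log r_0 + \frac{1}{\log r_0} \right) + c_\gamma}{r_0} \cdot \log e r_0.
\end{aligned}
\]

Finally,

\[
\begin{aligned}
\int_{r_0}^\infty \frac{z(r)}{r^{3/2}} &\le e^\gamma \left( \frac{2 \log\log r_0}{\sqrt{r_0}} + 2 E_1\left( \frac{\log r_0}{2} \right) \right) + \frac{2 c_\gamma}{\sqrt{r_0}} \\
&\le \frac{2}{\sqrt{r_0}} \left( F(\log r_0) + \frac{2 e^\gamma}{\log r_0} \right).
\end{aligned} \tag{13.27}
\]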

It is time to estimate

\[ \int_{r_0}^{r_1} \frac{R_{z,2r} \log 2r \sqrt{z(r)}}{r^{3/2}}\, dr, \tag{13.28} \]

where $z = y$ or $z = y/((\log y)/2)$ (and $y = x/\kappa$, as before), and where $R_{z,t}$ is as defined in (11.13). By Cauchy-Schwarz, (13.28) is at most

\[ \sqrt{\int_{r_0}^{r_1} \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr} \cdot \sqrt{\int_{r_0}^{r_1} \frac{z(r)}{r^{3/2}}\, dr}. \tag{13.29} \]

We have already bounded the second integral. Let us look at the first one. We can write $R_{z,t} = 0.27125 R^\circ_{z,t} + 0.41415$, where

\[ R^\circ_{z,t} = \log\left( 1 + \frac{\log 4t}{2 \log\frac{9 z^{1/3}}{2.004 t}} \right). \tag{13.30} \]

Clearly,

\[ R^\circ_{z,e^t/4} = \log\left( 1 + \frac{t/2}{\log\frac{36 z^{1/3}}{2.004} - t} \right). \]

Now, for $f(t) = \log(c + at/(b - t))$ and $t \in [0, b)$,

\[
f'(t) = \frac{ab}{\left( c + \frac{at}{b-t} \right) (b - t)^2}, \qquad
f''(t) = \frac{-ab ((a - 2c)(b - 2t) - 2ct)}{\left( c + \frac{at}{b-t} \right)^2 (b - t)^4}.
\]

In our case, $a = 1/2$, $c = 1$ and $b = \log 36 z^{1/3} - \log(2.004) > 0$. Hence, for $t < b$,

\[ -ab ((a - 2c)(b - 2t) - 2ct) = \frac{b}{2} \left( 2t + \frac{3}{2} (b - 2t) \right) = \frac{b}{2} \left( \frac{3}{2} b - t \right) > 0, \]

and so $f''(t) > 0$. In other words, $t \to R^\circ_{z,e^t/4}$ is convex-up for $t < b$, i.e., for $e^t/4 < 9 z^{1/3}/2.004$. It is easy to check that, since we are assuming $y \ge 10^{25}$,

\[ 2r_1 = \frac{3}{16} y^{4/15} < \frac{9}{2.004} \left( \frac{2y}{\log y} \right)^{1/3} \le \frac{9 z^{1/3}}{2.004}. \]

We conclude that $r \to R^\circ_{z,2r}$ is convex-up on $\log 8r$ for $r \le r_1$, and hence so is $r \to R_{z,r}$, and so, in turn, is $r \to R_{z,r}^2$. Thus, for $r \in [r_0, r_1]$,

\[ R_{z,2r}^2 \le R_{z,2r_0}^2 \cdot \frac{\log r_1/r}{\log r_1/r_0} + R_{z,2r_1}^2 \cdot \frac{\log r/r_0}{\log r_1/r_0}. \tag{13.31} \]


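Therefore, by (13.22),

\[
\begin{aligned}
\int_{r_0}^{r_1} \frac{(R_{z,2r}\log 2r)^2}{r^{3/2}}\,dr
&\le \int_{r_0}^{r_1}\left(R^2_{z,2r_0}\frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1}\frac{\log r/r_0}{\log r_1/r_0}\right)(\log 2r)^2\,\frac{dr}{r^{3/2}}\\
&= \frac{2R^2_{z,2r_0}}{\log r_1/r_0}\left(\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\log 2r_1 - \frac{P_3(\log 2r_0)}{\sqrt{r_0}} + \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right)\\
&\quad + \frac{2R^2_{z,2r_1}}{\log r_1/r_0}\left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}} - \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\log 2r_0\right)\\
&= 2\left(R^2_{z,2r_0} - \frac{\log 2r_0}{\log r_1/r_0}\,(R^2_{z,2r_1} - R^2_{z,2r_0})\right)\cdot\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\\
&\quad + 2\,\frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0}\left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right)\\
&= 2R^2_{z,2r_0}\cdot\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\\
&\quad + 2\,\frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0}\left(\frac{P_2^{-}(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0)P_2(\log 2r_1)}{\sqrt{r_1}}\right),
\end{aligned}
\tag{13.32}
\]

where $P_2(t)$ and $P_3(t)$ are as in (13.23), and $P_2^{-}(t) = P_3(t) - tP_2(t) = 2t^2 + 16t + 48$.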

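Putting all terms together, we conclude that

\[
\int_{r_0}^{r_1} \frac{g(r)}{r}\,dr \le f_0(r_0,y) + f_1(r_0) + f_2(r_0,y), \tag{13.33}
\]

where

\[
\begin{aligned}
f_0(r_0,y) &= \left((1-c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}\right)\sqrt{\frac{2}{\sqrt{r_0}}\,I_{1,r_0}},\\
f_1(r_0) &= \frac{\sqrt{F(\log r_0)}}{\sqrt{2r_0}}\left(1 + \frac{e^{\gamma}}{F(\log r_0)\log r_0}\right) + \frac{5}{\sqrt{2r_0}}
+ \frac{1}{r_0}\left(\left(\frac{13}{4}\log e r_0 + 11.07\right)J_{r_0} + 13.66\log e r_0 + 37.55\right),\\
f_2(r_0,y) &= 3.36\,\frac{((\log y)/2)^{1/6}}{y^{1/6}}\,\log\frac{r_1}{r_0},
\end{aligned}
\tag{13.34}
\]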


where $F(t) = e^{\gamma}\log t + c_\gamma$, $c_\gamma = 1.025742$, $y = x/\kappa$ (as usual),

\[
\begin{aligned}
I_{0,r_0,r_1,z} &= R^2_{z,2r_0}\cdot\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)
+ \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0}\left(\frac{P_2^{-}(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0)P_2(\log 2r_1)}{\sqrt{r_1}}\right),\\
J_r &= F(\log r) + \frac{e^{\gamma}}{\log r},\qquad
I_{1,r} = F(\log r) + \frac{2e^{\gamma}}{\log r},\qquad
c_\varphi = \frac{C_{\varphi,2,\frac{\log y}{2}}/|\varphi|_1}{\log\frac{\log y}{2}},
\end{aligned}
\tag{13.35}
\]

and $C_{\varphi,2,K}$ is as in (11.20).

Let us recapitulate briefly. The term $f_2(r_0,y)$ in (13.34) comes from the term $3.36\,x^{-1/6}$ in (11.12). The term $f_1(r_0)$ includes all other terms in (11.12), except for $R_{x,2r}\log 2r\,\sqrt{\digamma(r)}/\sqrt{2r}$. The contribution of that last term is (13.28) divided by $\sqrt{2}$. That, in turn, is at most (13.29) divided by $\sqrt{2}$. The first integral in (13.29) was bounded in (13.32); the second integral was bounded in (13.27).


Chapter 14

Conclusion

We now need to gather all results, using the smoothing functions

\[
\eta_* = (\eta_2 *_M \varphi)(\kappa t),
\]

where $\varphi(t) = t^2 e^{-t^2/2}$, $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2\cdot I_{[-1/2,1/2]}$, and

\[
\eta_+ = h_{200}(t)\,t\,e^{-t^2/2},
\]

where

\[
h_H(t) = \int_0^{\infty} h(ty^{-1})F_H(y)\,\frac{dy}{y},\qquad
h(t) = \begin{cases} t^2(2-t)^3 e^{t-1/2} & \text{if } t \in [0,2],\\ 0 & \text{otherwise,}\end{cases}\qquad
F_H(y) = \frac{\sin(H\log y)}{\pi\log y}.
\]

We studied $\eta_*$ and $\eta_+$ in Part II. We saw $\eta_*$ in Thm. 13.2.1 (which actually works for general $\varphi: [0,\infty) \to [0,\infty)$, as its statement says). We will set $\kappa$ soon.

We fix a value for $r$, namely, $r = 150000$. Our results will have to be valid for any $x \ge x_+$, where $x_+$ is fixed. We set $x_+ = 4.9\cdot 10^{26}$, since we want a result valid for $N \ge 10^{27}$, and, as was discussed in (11.1), we will work with $x_+$ slightly smaller than $N/2$.

14.1 The $\ell_2$ norm over the major arcs: explicit version

We apply Lemma 10.3.1 with $\eta = \eta_+$ and $\eta_\circ$ as in (11.3). Let us first work out the error terms defined in (10.27). Recall that $\delta_0 = 8$. By Thm. 7.1.4,

\[
ET_{\eta_+,\delta_0 r/2} = \max_{|\delta| \le \delta_0 r/2} |\mathrm{err}_{\eta_+,\chi_T}(\delta,x)|
= 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x_+}} \le 1.1405\cdot 10^{-8}, \tag{14.1}
\]


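\[
\begin{aligned}
E_{\eta_+,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q \le r\cdot\gcd(q,2)\\ |\delta| \le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_+,\chi^*}(\delta,x)|\\
&\le 1.3482\cdot 10^{-14}\sqrt{300000} + \frac{1.617\cdot 10^{-10}}{\sqrt{2}} + \frac{1}{\sqrt{x_+}}\left(499900 + 52\sqrt{300000}\right) \le 2.3992\cdot 10^{-8},
\end{aligned}
\tag{14.2}
\]

where, in the latter case, we are using the fact that a stronger bound for $q = 1$ (namely, (14.1)) allows us to assume $q \ge 2$.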

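We also need to bound a few norms: by the estimates in §A.3 and §A.5 (applied with $H = 200$),

\[
\begin{aligned}
|\eta_+|_1 &\le 1.062319,\qquad |\eta_+|_2 \le 0.800129 + \frac{274.8569}{200^{7/2}} \le 0.800132,\\
|\eta_+|_\infty &\le 1 + 2.06440727\cdot\frac{1 + \frac{4}{\pi}\log H}{H} \le 1.079955.
\end{aligned}
\tag{14.3}
\]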

By (10.12) and (14.1),

\[
|S_{\eta_+}(0,x)| = \left|\widehat{\eta_+}(0)\cdot x + O^*\!\left(\mathrm{err}_{\eta_+,\chi_T}(0,x)\right)\cdot x\right|
\le (|\eta_+|_1 + ET_{\eta_+,\delta_0 r/2})\,x \le 1.063x.
\]

This is far from optimal, but it will do, since all we wish to do with this is to bound the tiny error term $K_{r,2}$ in (10.27):

\[
\begin{aligned}
K_{r,2} &= (1 + \sqrt{300000})(\log x)^2\cdot 1.079955\cdot\left(2\cdot 1.06232 + (1 + \sqrt{300000})(\log x)^2\,1.079955/x\right)\\
&\le 1259.06\,(\log x)^2 \le 9.71\cdot 10^{-21}x
\end{aligned}
\]

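for $x \ge x_+$. By (14.1), we also have

\[
5.19\,\delta_0 r\left(ET_{\eta_+,\delta_0 r/2}\cdot\left(|\eta_+|_1 + \frac{ET_{\eta_+,\delta_0 r/2}}{2}\right)\right) \le 0.075272
\]

and

\[
\delta_0 r\,(\log 2e^2 r)\left(E^2_{\eta_+,r,\delta_0} + K_{r,2}/x\right) \le 1.00393\cdot 10^{-8}.
\]

By (A.23) and (A.26),

\[
0.8001287 \le |\eta_\circ|_2 \le 0.8001288 \tag{14.4}
\]

and

\[
|\eta_+ - \eta_\circ|_2 \le \frac{274.856893}{H^{7/2}} \le 2.42942\cdot 10^{-6}. \tag{14.5}
\]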

We bound $|\eta_\circ^{(3)}|_1$ using the fact that (as we can tell by taking derivatives) $\eta_\circ^{(2)}(t)$ increases from $0$ at $t = 0$ to a maximum within $[0,1/2]$, and then decreases to $\eta_\circ^{(2)}(1) = -7$, only to increase to a maximum within $[3/2,2]$ (equal to the maximum attained within $[0,1/2]$) and then decrease to $0$ at $t = 2$:

\[
\begin{aligned}
|\eta_\circ^{(3)}|_1 &= 2\max_{t\in[0,1/2]}\eta_\circ^{(2)}(t) - 2\eta_\circ^{(2)}(1) + 2\max_{t\in[3/2,2]}\eta_\circ^{(2)}(t)\\
&= 4\max_{t\in[0,1/2]}\eta_\circ^{(2)}(t) + 14 \le 4\cdot 4.6255653 + 14 \le 32.5023,
\end{aligned}
\tag{14.6}
\]

where we compute the maximum by the bisection method with 30 iterations (using interval arithmetic, as always).
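As a non-rigorous sanity check of (14.6), one can evaluate the second derivative of $\eta_\circ(t) = t^3(2-t)^3e^{-(t-1)^2/2}$ in closed form and scan $[0,1/2]$ on a fine grid. The sketch below uses ordinary floating point, not interval arithmetic, and the helper names `eta_circ_second` and `max_on` are ours, introduced only for illustration.

```python
import math

def eta_circ_second(t: float) -> float:
    """Second derivative of eta_circ(t) = t^3 (2-t)^3 exp(-(t-1)^2/2) for t in [0, 2]."""
    u = t - 1.0                       # eta_circ(t) = (1 - u^2)^3 * exp(-u^2/2)
    q = 1.0 - u * u
    p = q ** 3                        # polynomial factor
    dp = -6.0 * u * q ** 2            # p'
    d2p = -6.0 * q ** 2 + 24.0 * u * u * q   # p''
    # (p e^{-u^2/2})'' = (p'' - 2u p' + (u^2 - 1) p) e^{-u^2/2}
    return (d2p - 2.0 * u * dp + (u * u - 1.0) * p) * math.exp(-u * u / 2.0)

def max_on(f, a: float, b: float, steps: int = 200000) -> float:
    """Plain grid search; a floating-point illustration, not a rigorous bound."""
    return max(f(a + (b - a) * i / steps) for i in range(steps + 1))

peak = max_on(eta_circ_second, 0.0, 0.5)
third_derivative_l1 = 4.0 * peak + 14.0   # uses eta_circ''(1) = -7, as in (14.6)
print(peak, third_derivative_l1)
```

The grid maximum can only underestimate the true maximum, so it is consistent with, but does not replace, the interval-arithmetic bound quoted above.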

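We evaluate explicitly

\[
\sum_{\substack{q \le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)} = 6.798779\dotsc,
\]

using, yet again, interval arithmetic. Looking at (10.29) and (10.28), we conclude that

\[
\begin{aligned}
L_{r,\delta_0} &\le 2\cdot 6.798779\cdot 0.800132^2 \le 8.70531,\\
L_{r,\delta_0} &\ge 2\cdot 6.798779\cdot 0.8001287^2 - \left((\log r + 1.7)\cdot(3.888\cdot 10^{-6} + 5.91\cdot 10^{-12})\right)\\
&\quad - \left(1.342\cdot 10^{-5}\right)\cdot\left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right) \ge 8.70517.
\end{aligned}
\]

Lemma 10.3.1 thus gives us that

\[
\begin{aligned}
\int_{\mathfrak{M}_{8,r_0}} \left|S_{\eta_+}(\alpha,x)\right|^2 d\alpha &= (8.70524 + O^*(0.00007))\,x + O^*(0.075273)\,x\\
&= (8.7052 + O^*(0.0754))\,x \le 8.7806x.
\end{aligned}
\tag{14.7}
\]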
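The sum over odd squarefree $q \le r$ can be reproduced, in floating point rather than interval arithmetic, with a smallest-prime-factor sieve. This is only a sketch; the function name `odd_squarefree_sum` is ours.

```python
def odd_squarefree_sum(r: int) -> float:
    """Sum of mu(q)^2 / phi(q) over odd q <= r (floating point, not rigorous)."""
    spf = list(range(r + 1))                 # smallest prime factor table
    for p in range(2, int(r ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, r + 1, p):
                if spf[m] == m:
                    spf[m] = p
    total = 0.0
    for q in range(1, r + 1, 2):
        n, phi, squarefree = q, 1, True
        while n > 1:
            p = spf[n]
            n //= p
            if n % p == 0:                   # p^2 divides q
                squarefree = False
                break
            phi *= p - 1
        if squarefree:
            total += 1.0 / phi
    return total

s = odd_squarefree_sum(150000)
upper_L = 2.0 * s * 0.800132 ** 2            # upper bound for L_{r,delta_0} as in the text
print(s, upper_L)
```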

14.2 The total major-arc contribution

First of all, we must bound from below

\[
C_0 = \prod_{p \mid N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p \nmid N}\left(1 + \frac{1}{(p-1)^3}\right). \tag{14.8}
\]

The only prime that we know does not divide $N$ is $2$. Thus, we use the bound

\[
C_0 \ge 2\prod_{p > 2}\left(1 - \frac{1}{(p-1)^2}\right) \ge 1.3203236. \tag{14.9}
\]
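To see where the constant in (14.9) comes from, one can multiply out the product over odd primes up to a cutoff $P$ and control the tail crudely by $\prod_{p>P}(1 - 1/(p-1)^2) \ge 1 - \sum_{m \ge P} 1/m^2 \ge 1 - 1/(P-1)$. The sketch below does this in floating point with $P = 10^6$ (our choice); the crude tail bound is too weak to recover all the digits of (14.9), so this illustrates the method rather than proving the constant.

```python
def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

P = 10 ** 6
partial = 2.0                                  # the factor 2 in (14.9)
for p in primes_up_to(P):
    if p > 2:
        partial *= 1.0 - 1.0 / (p - 1) ** 2    # partial products decrease with P

lower = partial * (1.0 - 1.0 / (P - 1))        # crude lower bound including the tail
print(partial, lower)
```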

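The other main constant is $C_{\eta_\circ,\eta_*}$, which we defined in (10.37) and already started to estimate in (11.6):

\[
C_{\eta_\circ,\eta_*} = |\eta_\circ|_2^2\int_0^{N/x}\eta_*(\rho)\,d\rho + 2.71\,|\eta_\circ'|_2^2\cdot O^*\!\left(\int_0^{N/x}\left((2 - N/x) + \rho\right)^2\eta_*(\rho)\,d\rho\right), \tag{14.10}
\]

provided that $N \ge 2x$. Recall that $\eta_* = (\eta_2 *_M \varphi)(\kappa t)$, where $\varphi(t) = t^2e^{-t^2/2}$. Therefore,

\[
\begin{aligned}
\int_0^{N/x}\eta_*(\rho)\,d\rho &= \int_0^{N/x}(\eta_2 *_M \varphi)(\kappa\rho)\,d\rho = \int_{1/4}^{1}\eta_2(w)\int_0^{N/x}\varphi\!\left(\frac{\kappa\rho}{w}\right)d\rho\,\frac{dw}{w}\\
&= \frac{|\eta_2|_1|\varphi|_1}{\kappa} - \frac{1}{\kappa}\int_{1/4}^{1}\eta_2(w)\int_{\kappa N/xw}^{\infty}\varphi(\rho)\,d\rho\,dw.
\end{aligned}
\]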

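By integration by parts and [AS64, (7.1.13)],

\[
\int_y^{\infty}\varphi(\rho)\,d\rho = ye^{-y^2/2} + \sqrt{2}\int_{y/\sqrt{2}}^{\infty}e^{-t^2}\,dt < \left(y + \frac{1}{y}\right)e^{-y^2/2}.
\]

Hence

\[
\int_{\kappa N/xw}^{\infty}\varphi(\rho)\,d\rho \le \int_{2\kappa}^{\infty}\varphi(\rho)\,d\rho < \left(2\kappa + \frac{1}{2\kappa}\right)e^{-2\kappa^2},
\]

and so, since $|\eta_2|_1 = 1$,

\[
\begin{aligned}
\int_0^{N/x}\eta_*(\rho)\,d\rho &\ge \frac{|\varphi|_1}{\kappa} - \int_{1/4}^{1}\eta_2(w)\,dw\cdot\left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2}\\
&\ge \frac{|\varphi|_1}{\kappa} - \left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2}.
\end{aligned}
\tag{14.11}
\]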
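The tail estimate used for (14.11) is easy to test numerically, since $\int_y^\infty t^2e^{-t^2/2}\,dt = ye^{-y^2/2} + \sqrt{\pi/2}\,\operatorname{erfc}(y/\sqrt{2})$. The following sketch (function names ours, floating point only) compares the exact tail with the bound $(y + 1/y)e^{-y^2/2}$ and shows how small the correction is for $y = 2\kappa$ with $\kappa = 49$, the value chosen later in this section.

```python
import math

def phi_tail(y: float) -> float:
    """Integral of t^2 exp(-t^2/2) over [y, infinity), via erfc."""
    return y * math.exp(-y * y / 2.0) + math.sqrt(math.pi / 2.0) * math.erfc(y / math.sqrt(2.0))

def phi_tail_bound(y: float) -> float:
    """Upper bound (y + 1/y) exp(-y^2/2) for the same tail."""
    return (y + 1.0 / y) * math.exp(-y * y / 2.0)

samples = [0.5, 1.0, 2.0, 5.0, 10.0]
checks = [(phi_tail(y), phi_tail_bound(y)) for y in samples]

# For y = 2*kappa = 98 both quantities underflow in double precision,
# so we look at the base-10 logarithm of the bound instead.
kappa = 49.0
log10_bound = (math.log(2 * kappa + 1 / (2 * kappa)) - 2 * kappa ** 2) / math.log(10.0)
print(checks, log10_bound)
```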

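Let us now focus on the second integral in (14.10). Write $N/x = 2 + c_1/\kappa$. Then the integral equals

\[
\begin{aligned}
\int_0^{2 + c_1/\kappa}(-c_1/\kappa + \rho)^2\eta_*(\rho)\,d\rho &\le \frac{1}{\kappa^3}\int_0^{\infty}(u - c_1)^2(\eta_2 *_M \varphi)(u)\,du\\
&= \frac{1}{\kappa^3}\int_{1/4}^{1}\eta_2(w)\int_0^{\infty}(vw - c_1)^2\varphi(v)\,dv\,dw\\
&= \frac{1}{\kappa^3}\int_{1/4}^{1}\eta_2(w)\left(3\sqrt{\frac{\pi}{2}}\,w^2 - 2\cdot 2c_1 w + c_1^2\sqrt{\frac{\pi}{2}}\right)dw\\
&= \frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{9}{4}c_1 + \sqrt{\frac{\pi}{2}}\,c_1^2\right).
\end{aligned}
\]

It is thus best to choose $c_1 = (9/4)/\sqrt{2\pi} = 0.89762\dotsc$.

We must now estimate $|\eta_\circ'|_2^2$. We could do this directly by rigorous numerical integration, but we might as well do it the hard way (which is actually rather easy). By the definition (11.3) of $\eta_\circ$,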

\[
|\eta_\circ'(x+1)|^2 = \left(x^{14} - 18x^{12} + 111x^{10} - 284x^{8} + 351x^{6} - 210x^{4} + 49x^{2}\right)e^{-x^2} \tag{14.12}
\]

for $x \in [-1,1]$, and $\eta_\circ'(x+1) = 0$ for $x \notin [-1,1]$. Now, for any even integer $k > 0$,

\[
\int_{-1}^{1}x^k e^{-x^2}\,dx = 2\int_0^1 x^k e^{-x^2}\,dx = \gamma\!\left(\frac{k+1}{2},1\right),
\]

where $\gamma(a,r) = \int_0^r e^{-t}t^{a-1}\,dt$ is the incomplete gamma function. (We substitute $t = x^2$ in the integral.) By [AS64, (6.5.16), (6.5.22)], $\gamma(a+1,1) = a\gamma(a,1) - 1/e$ for all $a > 0$, and $\gamma(1/2,1) = \sqrt{\pi}\operatorname{erf}(1)$, where

\[
\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt.
\]

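Thus, starting from (14.12), we see that

\[
\begin{aligned}
|\eta_\circ'|_2^2 &= \gamma\!\left(\tfrac{15}{2},1\right) - 18\cdot\gamma\!\left(\tfrac{13}{2},1\right) + 111\cdot\gamma\!\left(\tfrac{11}{2},1\right) - 284\cdot\gamma\!\left(\tfrac{9}{2},1\right)\\
&\quad + 351\cdot\gamma\!\left(\tfrac{7}{2},1\right) - 210\cdot\gamma\!\left(\tfrac{5}{2},1\right) + 49\cdot\gamma\!\left(\tfrac{3}{2},1\right)\\
&= \frac{9151}{128}\sqrt{\pi}\operatorname{erf}(1) - \frac{18101}{64e} = 2.7375292\dotsc.
\end{aligned}
\tag{14.13}
\]

We thus obtain

\[
2.71\,|\eta_\circ'|_2^2\cdot\int_0^{N/x}\left((2 - N/x) + \rho\right)^2\eta_*(\rho)\,d\rho
\le 7.4188\cdot\frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{(9/4)^2}{2\sqrt{2\pi}}\right) \le \frac{2.0002}{\kappa^3}.
\]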
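The closed form in (14.13) and the optimal choice of $c_1$ can be cross-checked numerically. The sketch below (floating point; all names are ours) runs the recurrence $\gamma(a+1,1) = a\gamma(a,1) - 1/e$ in exact rational arithmetic to recover the coefficients of $\sqrt{\pi}\operatorname{erf}(1)$ and $1/e$, compares the result with a Simpson-rule integration of (14.12), and then evaluates the bound $2.0002/\kappa^3$.

```python
import math
from fractions import Fraction

# Coefficients of |eta_circ'(x+1)|^2 e^{x^2} from (14.12): exponent -> coefficient.
COEFFS = {14: 1, 12: -18, 10: 111, 8: -284, 6: 351, 4: -210, 2: 49}

def integrand(x: float) -> float:
    return sum(c * x ** k for k, c in COEFFS.items()) * math.exp(-x * x)

def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

# gamma((k+1)/2, 1) = g_coef * gamma(1/2, 1) + e_coef / e, tracked exactly.
g_coef, e_coef = Fraction(1), Fraction(0)
total_g, total_e = Fraction(0), Fraction(0)
a = Fraction(1, 2)
for k in range(2, 16, 2):
    g_coef, e_coef = a * g_coef, a * e_coef - 1   # one step of the recurrence
    a += 1
    total_g += COEFFS[k] * g_coef
    total_e += COEFFS[k] * e_coef

closed = float(total_g) * math.sqrt(math.pi) * math.erf(1.0) + float(total_e) / math.e
numeric = simpson(integrand, -1.0, 1.0)

c1 = (9.0 / 4.0) / math.sqrt(2.0 * math.pi)
minimum = (49.0 / 48.0) * math.sqrt(math.pi / 2.0) - (9.0 / 4.0) ** 2 / (2.0 * math.sqrt(2.0 * math.pi))
bound = 2.71 * closed * minimum                   # coefficient of 1/kappa^3
print(total_g, total_e, closed, numeric, c1, bound)
```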

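We conclude that

\[
C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}|\varphi|_1|\eta_\circ|_2^2 - |\eta_\circ|_2^2\left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2} - \frac{2.0002}{\kappa^3}.
\]

Setting

\[
\kappa = 49
\]

and using (14.4), we obtain

\[
C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}\left(|\varphi|_1|\eta_\circ|_2^2 - 0.000834\right). \tag{14.14}
\]

Here it is useful to note that $|\varphi|_1 = \sqrt{\pi/2}$, and so, by (14.4), $|\varphi|_1|\eta_\circ|_2^2 = 0.80237\dotsc$.

We have finally chosen $x$ in terms of $N$:

\[
x = \frac{N}{2 + \frac{c_1}{\kappa}} = \frac{N}{2 + \frac{9/4}{\sqrt{2\pi}}\cdot\frac{1}{49}} = 0.495461\dotsc\cdot N. \tag{14.15}
\]

Thus, we see that, since we are assuming $N \ge 10^{27}$, we in fact have $x \ge 4.95461\dotsc\cdot 10^{26}$, and so, in particular,

\[
x \ge 4.9\cdot 10^{26},\qquad \frac{x}{\kappa} \ge 10^{25}. \tag{14.16}
\]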
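The arithmetic behind (14.14) to (14.16) is short enough to replay directly. The sketch below is a floating-point illustration only; the variable names are ours.

```python
import math

kappa = 49.0
c1 = (9.0 / 4.0) / math.sqrt(2.0 * math.pi)

ratio = 1.0 / (2.0 + c1 / kappa)          # x / N, as in (14.15)
x_min = ratio * 1e27                      # smallest x allowed by N >= 10^27

# Correction term in (14.14): 2.0002 / kappa^2, plus a term of size e^{-2 kappa^2},
# which underflows to 0.0 in double precision.
correction = 2.0002 / kappa ** 2 + kappa * 0.8001288 ** 2 * (2 + 1 / (2 * kappa ** 2)) * math.exp(-2 * kappa ** 2)
main_constant = math.sqrt(math.pi / 2.0) * 0.8001287 ** 2   # |phi|_1 |eta_circ|_2^2, lower value from (14.4)
print(ratio, x_min, correction, main_constant)
```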


Let us continue with our determination of the major-arcs total. We should compute the quantities in (10.38). We already have bounds for $E_{\eta_+,r,\delta_0}$, $A_{\eta_+}$ (see (14.7)), $L_{\eta,r,\delta_0}$ and $K_{r,2}$. By Corollary 7.1.3, we have

\[
\begin{aligned}
E_{\eta_*,r,8} &\le \max_{\substack{\chi \bmod q\\ q \le r\cdot\gcd(q,2)\\ |\delta| \le \gcd(q,2)\delta_0 r/2q}}\sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)|\\
&\le \frac{1}{\kappa}\left(2.485\cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}\left(381500 + 76\sqrt{300000}\right)\right) \le \frac{1.33805\cdot 10^{-8}}{\kappa},
\end{aligned}
\tag{14.17}
\]

where the factor of $\kappa$ comes from the scaling in $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$ (which in effect divides $x$ by $\kappa$). It remains only to bound the more harmless terms of type $Z_{\eta^2,2}$ and $LS_\eta$.

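Clearly, $Z_{\eta_+^2,2} \le (1/x)\sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x)$. Now, by Prop. 7.1.5,

\[
\begin{aligned}
\sum_{n=1}^{\infty}\Lambda(n)(\log n)\,\eta_+^2(n/x) &= \left(0.640206 + O^*\!\left(2\cdot 10^{-6} + \frac{366.91}{\sqrt{x}}\right)\right)x\log x - 0.021095x\\
&\le (0.640206 + O^*(3\cdot 10^{-6}))\,x\log x - 0.021095x.
\end{aligned}
\tag{14.18}
\]

Thus,

\[
Z_{\eta_+^2,2} \le 0.640209\log x. \tag{14.19}
\]

We will proceed a little more crudely for $Z_{\eta_*^2,2}$:

\[
\begin{aligned}
Z_{\eta_*^2,2} &= \frac{1}{x}\sum_n \Lambda^2(n)\,\eta_*^2(n/x) \le \frac{1}{x}\sum_n \Lambda(n)\eta_*(n/x)\cdot(\eta_*(n/x)\log n)\\
&\le \left(|\eta_*|_1 + |\mathrm{err}_{\eta_*,\chi_T}(0,x)|\right)\cdot\left(|\eta_*(t)\cdot\log^+(\kappa t)|_\infty + |\eta_*|_\infty\log(x/\kappa)\right),
\end{aligned}
\tag{14.20}
\]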

where $\log^+(t) = \max(0,\log t)$. It is easy to see that

\[
|\eta_*|_\infty = |\eta_2 *_M \varphi|_\infty \le \left|\frac{\eta_2(t)}{t}\right|_1|\varphi|_\infty \le 4(\log 2)^2\cdot\frac{2}{e} \le 1.414, \tag{14.21}
\]

and, since $\log^+$ is non-decreasing and $\eta_2$ is supported on a subset of $[0,1]$,

\[
\begin{aligned}
|\eta_*(t)\cdot\log^+(\kappa t)|_\infty &= |(\eta_2 *_M \varphi)\cdot\log^+|_\infty \le |\eta_2 *_M (\varphi\cdot\log^+)|_\infty\\
&\le \left|\frac{\eta_2(t)}{t}\right|_1\cdot|\varphi\cdot\log^+|_\infty \le 1.921813\cdot 0.381157 \le 0.732513,
\end{aligned}
\]

where we bound $|\varphi\cdot\log^+|_\infty$ by the bisection method with 25 iterations. We already know that

\[
|\eta_*|_1 = \frac{|\eta_2|_1|\varphi|_1}{\kappa} = \frac{|\varphi|_1}{\kappa} = \frac{\sqrt{\pi/2}}{\kappa}. \tag{14.22}
\]
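The two sup-norms entering (14.21) and the bound that follows it can be approximated by a grid search. This is a floating-point sketch with names of our own choosing; a grid maximum can only underestimate the true supremum, so it is a consistency check and not a substitute for the bisection with interval arithmetic.

```python
import math

def phi(t: float) -> float:
    return t * t * math.exp(-t * t / 2.0)

def phi_log_plus(t: float) -> float:
    """phi(t) * log^+(t), with log^+(t) = max(0, log t)."""
    return phi(t) * max(0.0, math.log(t)) if t > 0 else 0.0

sup_phi = phi(math.sqrt(2.0))                       # maximum of phi, attained at t = sqrt(2)

# phi * log^+ vanishes on [0, 1] and is negligible beyond t = 6, so we scan [1, 6].
sup_phi_log = max(phi_log_plus(1.0 + 5.0 * i / 500000) for i in range(500001))

eta2_over_t_l1 = 4.0 * math.log(2.0) ** 2           # |eta_2(t)/t|_1 = (2 log 2)^2
print(sup_phi, sup_phi_log, eta2_over_t_l1 * sup_phi, eta2_over_t_l1 * sup_phi_log)
```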


By Cor. 7.1.3,

\[
|\mathrm{err}_{\eta_*,\chi_T}(0,x)| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}(381500 + 76) \le 1.20665\cdot 10^{-7}.
\]

We conclude that

\[
Z_{\eta_*^2,2} \le \left(\sqrt{\pi/2}/49 + 1.20665\cdot 10^{-7}\right)\left(0.732513 + 1.414\log(x/49)\right) \le 0.0362\log x. \tag{14.23}
\]

We have bounds for $|\eta_*|_\infty$ and $|\eta_+|_\infty$. We can also bound

\[
|\eta_*\cdot t|_\infty = \frac{|(\eta_2 *_M \varphi)\cdot t|_\infty}{\kappa} \le \frac{|\eta_2|_1\cdot|\varphi\cdot t|_\infty}{\kappa} \le \frac{3^{3/2}e^{-3/2}}{\kappa}.
\]

We quote the estimate

\[
|\eta_+\cdot t|_\infty = 1.064735 + 3.25312\cdot(1 + (4/\pi)\log 200)/200 \le 1.19073 \tag{14.24}
\]

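from (A.42).

We can now bound $LS_\eta(x,r)$ for $\eta = \eta_*, \eta_+$:

\[
\begin{aligned}
LS_\eta(x,r) &= \log r\cdot\max_{p \le r}\sum_{\alpha \ge 1}\eta\!\left(\frac{p^\alpha}{x}\right)
\le (\log r)\cdot\max_{p \le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \sum_{\substack{\alpha \ge 1\\ p^\alpha \ge x}}\frac{|\eta\cdot t|_\infty}{p^\alpha/x}\right)\\
&\le (\log r)\cdot\max_{p \le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \frac{|\eta\cdot t|_\infty}{1 - 1/p}\right)
\le \frac{(\log r)(\log x)}{\log 2}|\eta|_\infty + 2(\log r)|\eta\cdot t|_\infty,
\end{aligned}
\]

and so

\[
\begin{aligned}
LS_{\eta_*} &\le \left(\frac{1.414}{\log 2}\log x + 2\cdot\frac{(3/e)^{3/2}}{49}\right)\log r \le 24.32\log x + 0.57,\\
LS_{\eta_+} &\le \left(\frac{1.07996}{\log 2}\log x + 2\cdot 1.19073\right)\log r \le 18.57\log x + 28.39,
\end{aligned}
\tag{14.25}
\]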

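where we are using the bound on $|\eta_+|_\infty$ in (14.3).

We can now start to put together all terms in (10.36). Let $\epsilon_0 = |\eta_+ - \eta_\circ|_2/|\eta_\circ|_2$. Then, by (14.5),

\[
\epsilon_0|\eta_\circ|_2 = |\eta_+ - \eta_\circ|_2 \le 2.42942\cdot 10^{-6}.
\]

Thus,

\[
2.82643\,|\eta_\circ|_2^2(2 + \epsilon_0)\cdot\epsilon_0 + \frac{4.31004\,|\eta_\circ|_2^2 + 0.0012\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}
\]

is at most

\[
2.82643\cdot 2.42942\cdot 10^{-6}\cdot(2\cdot 0.80013 + 2.42942\cdot 10^{-6}) + \frac{4.3101\cdot 0.80013^2 + 0.0012\cdot\frac{32.503^2}{8^5}}{150000} \le 2.9387\cdot 10^{-5}
\]

by (14.4), (14.6), and (14.22).

Since $\eta_* = (\eta_2 *_M \varphi)(\kappa x)$ and $\eta_2$ is supported on $[1/4,1]$,

\[
\begin{aligned}
|\eta_*|_2^2 = \frac{|\eta_2 *_M \varphi|_2^2}{\kappa} &= \frac{1}{\kappa}\int_0^{\infty}\left(\int_0^{\infty}\eta_2(t)\varphi\!\left(\frac{w}{t}\right)\frac{dt}{t}\right)^2 dw\\
&\le \frac{1}{\kappa}\int_0^{\infty}\left(1 - \frac{1}{4}\right)\int_0^{\infty}\eta_2^2(t)\varphi^2\!\left(\frac{w}{t}\right)\frac{dt}{t^2}\,dw\\
&= \frac{3}{4\kappa}\int_0^{\infty}\frac{\eta_2^2(t)}{t}\left(\int_0^{\infty}\varphi^2\!\left(\frac{w}{t}\right)\frac{dw}{t}\right)dt\\
&= \frac{3}{4\kappa}\,|\eta_2(t)/\sqrt{t}|_2^2\cdot|\varphi|_2^2 = \frac{3}{4\kappa}\cdot\frac{32}{3}(\log 2)^3\cdot\frac{3}{8}\sqrt{\pi} \le \frac{1.77082}{\kappa},
\end{aligned}
\]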

where we go from the first to the second line by Cauchy-Schwarz.

Recalling the bounds on $E_{\eta_*,r,\delta_0}$ and $E_{\eta_+,r,\delta_0}$ we obtained in (14.2) and (14.17), we conclude that the second line of (10.36) is at most $x^2$ times

\[
\frac{1.33805\cdot 10^{-8}}{\kappa}\cdot 8.7806 + 2.3922\cdot 10^{-8}\cdot 1.6812\cdot\left(\sqrt{8.7806} + 1.6812\cdot 0.80014\right)\sqrt{\frac{1.77082}{\kappa}} \le \frac{1.7316\cdot 10^{-6}}{\kappa},
\]

where we are using the bound $A_{\eta_+} \le 8.7806$ we obtained in (14.7). (We are also using the bounds on norms in (14.3) and the value $\kappa = 49$.)

By the bounds (14.19), (14.23) and (14.25), we see that the third line of (10.36) is at most

\[
2\cdot(0.640209\log x)\cdot(24.32\log x + 0.57)\cdot x + 4\sqrt{0.640209\log x\cdot 0.0362\log x}\,(18.57\log x + 28.39)\,x \le 43(\log x)^2x,
\]

where we use the assumption $x \ge x_+ = 4.9\cdot 10^{26}$ (though a much weaker assumption would suffice).

Using the assumption $x \ge x_+$ again, together with (14.22) and the bounds we have just proven, we conclude that, for $r = 150000$, the integral over the major arcs

\[
\int_{\mathfrak{M}_{8,r}}S_{\eta_+}(\alpha,x)^2S_{\eta_*}(\alpha,x)e(-N\alpha)\,d\alpha
\]

is

\[
\begin{aligned}
&C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\!\left(2.9387\cdot 10^{-5}\cdot\frac{\sqrt{\pi/2}}{\kappa}x^2 + \frac{1.7316\cdot 10^{-6}}{\kappa}x^2 + 43(\log x)^2x\right)\\
&\quad= C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\!\left(3.85628\cdot 10^{-5}\cdot\frac{x^2}{\kappa}\right)
= C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*(7.86996\cdot 10^{-7}x^2),
\end{aligned}
\tag{14.26}
\]

where $C_0$ and $C_{\eta_\circ,\eta_*}$ are as in (10.37). Notice that $C_0C_{\eta_\circ,\eta_*}x^2$ is the expected asymptotic for the integral over all of $\mathbb{R}/\mathbb{Z}$.

Moreover, by (14.9), (14.14) and (14.4), as well as $|\varphi|_1 = \sqrt{\pi/2}$,

\[
C_0\cdot C_{\eta_\circ,\eta_*} \ge 1.3203236\left(\frac{|\varphi|_1|\eta_\circ|_2^2}{\kappa} - \frac{0.000834}{\kappa}\right) \ge \frac{1.0594003}{\kappa} - \frac{0.001102}{\kappa} \ge \frac{1.058298}{49}.
\]

Hence

\[
\int_{\mathfrak{M}_{8,r}}S_{\eta_+}(\alpha,x)^2S_{\eta_*}(\alpha,x)e(-N\alpha)\,d\alpha \ge \frac{1.058259}{\kappa}x^2, \tag{14.27}
\]

where, as usual, $\kappa = 49$. This is our total major-arc bound.

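14.3 The minor-arc total: explicit version

We need to estimate the quantities $E$, $S$, $T$, $J$, $M$ in Theorem 13.2.1. Let us start by bounding the constants in (13.12). The constants $C_{\eta_+,j}$, $j = 0,1,2$, will appear only in the minor term $E$, and so crude bounds on them will do.

By (14.3) and (14.24),

\[
\sup_{r \ge t}\eta_+(r) \le \min\left(1.07996,\frac{1.19073}{t}\right)
\]

for all $t \ge 0$. Thus,

\[
\begin{aligned}
C_{\eta_+,0} &= 0.7131\int_0^{\infty}\frac{1}{\sqrt{t}}\left(\sup_{r \ge t}\eta_+(r)\right)^2dt\\
&\le 0.7131\left(\int_0^1\frac{1.07996^2}{\sqrt{t}}\,dt + \int_1^{\infty}\frac{1.19073^2}{t^{5/2}}\,dt\right) \le 2.33744.
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
C_{\eta_+,1} &= 0.7131\int_1^{\infty}\frac{\log t}{\sqrt{t}}\left(\sup_{r \ge t}\eta_+(r)\right)^2dt\\
&\le 0.7131\int_1^{\infty}\frac{1.19073^2\log t}{t^{5/2}}\,dt \le 0.44937.
\end{aligned}
\]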


Immediately from (14.3),

\[
C_{\eta_+,2} = 0.51942\,|\eta_+|_\infty^2 \le 0.60581.
\]

We get

\[
\begin{aligned}
E &\le \left((2.33744 + 0.60581)\log x + (2\cdot 2.33744 + 0.44937)\right)\cdot x^{1/2}\\
&\le (2.94325\log x + 5.12426)\cdot x^{1/2} \le 8.4029\cdot 10^{-12}\cdot x,
\end{aligned}
\tag{14.28}
\]

where $E$ is defined as in (13.11), and where we are using the assumption $x \ge x_+ = 4.9\cdot 10^{26}$. Using (14.17) and (14.22), we see that

\[
S_{\eta_*}(0,x) = (|\eta_*|_1 + O^*(ET_{\eta_*,0}))\,x = \left(\sqrt{\pi/2} + O^*(1.33805\cdot 10^{-8})\right)\frac{x}{\kappa}.
\]

Hence

\[
S_{\eta_*}(0,x)\cdot E \le 1.05315\cdot 10^{-11}\cdot\frac{x^2}{\kappa}. \tag{14.29}
\]
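The constants $C_{\eta_+,j}$ and the bound (14.28) on $E$ reduce to elementary integrals: $\int_0^1 t^{-1/2}\,dt = 2$, $\int_1^\infty t^{-5/2}\,dt = 2/3$ and $\int_1^\infty t^{-5/2}\log t\,dt = 4/9$. A short floating-point replay, with variable names of our own, reads as follows.

```python
import math

sup_const, sup_decay = 1.07996, 1.19073      # sup eta_+(r) <= min(1.07996, 1.19073 / t)

c0 = 0.7131 * (2.0 * sup_const ** 2 + (2.0 / 3.0) * sup_decay ** 2)
c1 = 0.7131 * (4.0 / 9.0) * sup_decay ** 2
c2 = 0.51942 * 1.079955 ** 2

x_plus = 4.9e26
# (2.94325 log x + 5.12426) / sqrt(x) is decreasing for large x, so x = x_plus is the worst case.
e_over_x = (2.94325 * math.log(x_plus) + 5.12426) / math.sqrt(x_plus)
print(c0, c1, c2, e_over_x)
```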

We can bound

\[
S \le \sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x) \le 0.640209\,x\log x - 0.021095x \tag{14.30}
\]

by (14.18). Let us now estimate $T$. Recall that $\varphi(t) = t^2e^{-t^2/2}$. Since

\[
\int_0^u\varphi(t)\,dt = \int_0^u t^2e^{-t^2/2}\,dt \le \int_0^u t^2\,dt = \frac{u^3}{3},
\]

we can bound

\[
C_{\varphi,3}\!\left(\frac{1}{2}\log\frac{x}{\kappa}\right) = \frac{1.04488}{\sqrt{\pi/2}}\int_0^{\frac{2}{\log x/\kappa}}t^2e^{-t^2/2}\,dt \le \frac{0.2779}{((\log x/\kappa)/2)^3}.
\]

By (14.7), we already know that $J = (8.7052 + O^*(0.0754))\,x$. Hence

\[
(\sqrt{J} - \sqrt{E})^2 = \left(\sqrt{(8.7052 + O^*(0.0754))\,x} - \sqrt{8.4029\cdot 10^{-12}\cdot x}\right)^2 \ge 8.6297x, \tag{14.31}
\]

and so

\[
\begin{aligned}
T &= C_{\varphi,3}\!\left(\frac{1}{2}\log\frac{x}{\kappa}\right)\cdot\left(S - (\sqrt{J} - \sqrt{E})^2\right)\\
&\le \frac{8\cdot 0.2779}{(\log x/\kappa)^3}\cdot(0.640209\,x\log x - 0.021095x - 8.6297x)\\
&\le \frac{0.17792\cdot 8x\log x}{(\log x/\kappa)^3} - \frac{2.40405\cdot 8x}{(\log x/\kappa)^3}\\
&\le \frac{1.42336\,x}{(\log x/\kappa)^2} - \frac{13.69293\,x}{(\log x/\kappa)^3}
\end{aligned}
\]

for $\kappa = 49$. Since $x/\kappa \ge 10^{25}$, this implies that

\[
T \le 3.5776\cdot 10^{-4}\cdot x. \tag{14.32}
\]

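It remains to estimate $M$. Let us first look at $g(r_0)$; here $g = g_{x/\kappa,\varphi}$, where $g_{y,\varphi}$ is defined as in (11.19) and $\varphi(t) = t^2e^{-t^2/2}$, as usual. Write $y = x/\kappa$. We must estimate the constant $C_{\varphi,2,K}$ defined in (11.21):

\[
\begin{aligned}
C_{\varphi,2,K} &= -\int_{1/K}^{1}\varphi(w)\log w\,dw \le -\int_0^1\varphi(w)\log w\,dw\\
&\le -\int_0^1 w^2e^{-w^2/2}\log w\,dw \le 0.093426,
\end{aligned}
\]

where again we use VNODE-LP for rigorous numerical integration. Since $|\varphi|_1 = \sqrt{\pi/2}$ and $K = (\log y)/2$, this implies that

\[
\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}}, \tag{14.33}
\]

and so

\[
R_{y,K,\varphi,t} = \frac{0.07455}{\log\frac{\log y}{2}}\,R_{y/K,t} + \left(1 - \frac{0.07455}{\log\frac{\log y}{2}}\right)R_{y,t}. \tag{14.34}
\]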
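The integral bounding $C_{\varphi,2,K}$ can be checked without numerical quadrature: expanding $e^{-w^2/2}$ in its power series and using $-\int_0^1 w^n\log w\,dw = 1/(n+1)^2$ gives the rapidly convergent alternating series $\sum_{k \ge 0}(-1)^k/(2^k k!\,(2k+3)^2)$. The sketch below (floating point; the function name is ours) sums it.

```python
import math

def neg_integral_phi_log(terms: int = 30) -> float:
    """-int_0^1 w^2 exp(-w^2/2) log(w) dw, by termwise integration of the exp series."""
    return sum((-1) ** k / (2 ** k * math.factorial(k) * (2 * k + 3) ** 2) for k in range(terms))

value = neg_integral_phi_log()
ratio = value / math.sqrt(math.pi / 2.0)      # C_{phi,2,K} / |phi|_1, cf. (14.33)
print(value, ratio)
```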

Let $t = 2r_0 = 300000$; we recall that $K = (\log y)/2$. Recall from (14.16) that $y = x/\kappa \ge 10^{25}$; thus, $y/K \ge 3.47435\cdot 10^{23}$ and $\log((\log y)/2) \ge 3.35976$. Going back to the definition of $R_{x,t}$ in (11.13), we see that

\[
R_{y,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(10^{25})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.58341, \tag{14.35}
\]

\[
R_{y/K,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(3.47435\cdot 10^{23})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.60295, \tag{14.36}
\]

and so

\[
R_{y,K,\varphi,2r_0} \le \frac{0.07455}{3.35976}\,0.60295 + \left(1 - \frac{0.07455}{3.35976}\right)0.58341 \le 0.58385.
\]

Using

\[
\digamma(r) = e^{\gamma}\log\log r + \frac{2.50637}{\log\log r} \le 5.42506,
\]

we see from (11.13) that

\[
L_{2r_0} = 5.42506\cdot\left(\frac{13}{4}\log 300000 + 7.82\right) + 13.66\log 300000 + 37.55 \le 474.608.
\]


Going back to (11.19), we sum up and obtain that

\[
g(r_0) = \frac{(0.58385\cdot\log 300000 + 0.5)\sqrt{5.42506} + 2.5}{\sqrt{2\cdot 150000}} + \frac{474.608}{150000} + 3.36\left(\frac{\log y}{2y}\right)^{1/6} \le 0.041568.
\]

Using again the bound $x \ge 4.9\cdot 10^{26}$, we obtain

\[
\begin{aligned}
\frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-}\cdot S - (\sqrt{J} - \sqrt{E})^2
&\le \frac{13.9716}{\frac{1}{2}\log x + 0.6394}\cdot(0.640209\,x\log x - 0.021095x) - 8.6297x\\
&\le 17.8895x - \frac{11.7332\,x}{\frac{1}{2}\log x + 0.6394} - 8.6297x\\
&\le (17.8895 - 8.6297)\,x \le 9.2598x,
\end{aligned}
\]

where $c^+ = 2.0532$ and $c^- = 0.6394$. Therefore,

\[
g(r_0)\cdot\left(\frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-}\cdot S - (\sqrt{J} - \sqrt{E})^2\right) \le 0.041568\cdot 9.2598x \le 0.38492x. \tag{14.37}
\]

This is one of the main terms.
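The numbers leading up to (14.37) can be replayed in a few lines. The sketch below is a floating-point illustration and not a rigorous verification; the function names `R` and `curly_f` are ours (the latter stands for the function $e^\gamma\log\log r + 2.50637/\log\log r$ of the text and is unrelated to the classical digamma function $\psi$).

```python
import math

EULER_GAMMA = 0.5772156649015329

def R(z: float, t: float) -> float:
    """R_{z,t} as in (13.30): 0.27125 log(1 + log 4t / (2 log(9 z^{1/3} / (2.004 t)))) + 0.41415."""
    inner = math.log(4.0 * t) / (2.0 * math.log(9.0 * z ** (1.0 / 3.0) / (2.004 * t)))
    return 0.27125 * math.log(1.0 + inner) + 0.41415

def curly_f(r: float) -> float:
    """e^gamma log log r + 2.50637 / log log r."""
    ll = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll

y, r0 = 1e25, 150000.0
R_y = R(y, 2 * r0)                                   # (14.35)
R_yK = R(3.47435e23, 2 * r0)                         # (14.36), with the lower bound for y/K
c = 0.07455 / 3.35976
R_mix = c * R_yK + (1.0 - c) * R_y

L = 5.42506 * (13.0 / 4.0 * math.log(2 * r0) + 7.82) + 13.66 * math.log(2 * r0) + 37.55
g0 = (((0.58385 * math.log(2 * r0) + 0.5) * math.sqrt(5.42506) + 2.5) / math.sqrt(2 * r0)
      + 474.608 / r0 + 3.36 * (math.log(y) / (2.0 * y)) ** (1.0 / 6.0))
print(R_y, R_yK, R_mix, curly_f(r0), L, g0, g0 * 9.2598)
```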

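Let $r_1 = (3/8)\,y^{4/15}$, where, as usual, $y = x/\kappa$ and $\kappa = 49$. Then

\[
\begin{aligned}
R_{y,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9y^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415\\
&= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{2\left(\frac{1}{3} - \frac{4}{15}\right)\log y + 2\log\frac{9}{2.004\cdot\frac{3}{4}}}\right) + 0.41415\\
&\le 0.27125\log\left(1 + \frac{\frac{4}{15}}{2\left(\frac{1}{3} - \frac{4}{15}\right)}\right) + 0.41415 \le 0.71215.
\end{aligned}
\tag{14.38}
\]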


Similarly, for $K = (\log y)/2$ (as usual),

\[
\begin{aligned}
R_{y/K,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9(y/K)^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415\\
&= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{9\cdot 2^{1/3}}{2.004\cdot\frac{3}{4}}}\right) + 0.41415\\
&= 0.27125\log\left(3 + \frac{\frac{4}{3}\log\log y - c}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{12\cdot 2^{1/3}}{2.004}}\right) + 0.41415,
\end{aligned}
\tag{14.39}
\]

where $c = 4\log(12\cdot 2^{1/3}/2.004) - \log 3$. Let

\[
f(t) = \frac{\frac{4}{3}\log t - c}{\frac{2}{15}t - \frac{2}{3}\log t + 2\log\frac{12\cdot 2^{1/3}}{2.004}}.
\]

The bisection method with 32 iterations shows that

\[
f(t) \le 0.019562618 \tag{14.40}
\]

for $180 \le t \le 30000$; since $f(t) < 0$ for $0 < t < 180$ (by $(4/3)\log t - c < 0$) and since, by $c > 20/3$, we have $f(t) < (5/2)(\log t)/t$ as soon as $t > (\log t)^2$ (and so, in particular, for $t > 30000$), we see that (14.40) is valid for all $t > 0$. Therefore,

\[
R_{y/K,2r_1} \le 0.71392, \tag{14.41}
\]

and so, by (14.34), we conclude that

\[
R_{y,K,\varphi,2r_1} \le \frac{0.07455}{3.35976}\cdot 0.71392 + \left(1 - \frac{0.07455}{3.35976}\right)\cdot 0.71215 \le 0.71219.
\]
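The maximum in (14.40) can be located numerically with a coarse scan over the integers in $[180, 30000]$ followed by a ternary search around the best grid point. This sketch works in floating point and assumes, as the scan suggests, that $f$ has a single local maximum in the bracket; it is a sanity check, not the rigorous bisection of the text. All names are ours.

```python
import math

A = math.log(12.0 * 2.0 ** (1.0 / 3.0) / 2.004)
c = 4.0 * A - math.log(3.0)

def f(t: float) -> float:
    """The function f(t) of (14.40), with t playing the role of log y."""
    return ((4.0 / 3.0) * math.log(t) - c) / ((2.0 / 15.0) * t - (2.0 / 3.0) * math.log(t) + 2.0 * A)

coarse = max(range(180, 30001), key=f)        # best integer grid point
lo, hi = coarse - 1.0, coarse + 1.0
for _ in range(100):                          # ternary search on the bracket
    m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
    if f(m1) < f(m2):
        lo = m1
    else:
        hi = m2
f_max = f(0.5 * (lo + hi))

R_bound = 0.27125 * math.log(3.0 + f_max) + 0.41415   # cf. (14.39) and (14.41)
print(coarse, f_max, R_bound)
```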

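Since $r_1 = (3/8)\,y^{4/15}$ and $\digamma(r)$ is increasing for $r \ge 27$, we know that

\[
\begin{aligned}
\digamma(r_1) \le \digamma(y^{4/15}) &= e^{\gamma}\log\log y^{4/15} + \frac{2.50637}{\log\log y^{4/15}}\\
&= e^{\gamma}\log\log y + \frac{2.50637}{\log\log y - \log\frac{15}{4}} - e^{\gamma}\log\frac{15}{4} \le e^{\gamma}\log\log y - 1.43644
\end{aligned}
\tag{14.42}
\]

for $y \ge 10^{25}$. Hence, (11.13) gives us that

\[
\begin{aligned}
L_{2r_1} &\le (e^{\gamma}\log\log y - 1.43644)\left(\frac{13}{4}\log\frac{3}{4}y^{4/15} + 7.82\right) + 13.66\log\frac{3}{4}y^{4/15} + 37.55\\
&\le \frac{13}{15}e^{\gamma}\log y\log\log y + 2.39776\log y + 12.2628\log\log y + 23.7304\\
&\le (2.13522\log y + 18.118)\log\log y.
\end{aligned}
\]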


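Moreover, again by (14.42),

\[
\sqrt{\digamma(r_1)} \le \sqrt{e^{\gamma}\log\log y} - \frac{1.43644}{2\sqrt{e^{\gamma}\log\log y}},
\]

and so, by $y \ge 10^{25}$,

\[
\begin{aligned}
\left(0.71219\log\frac{3}{4}y^{4/15} + 0.5\right)\sqrt{\digamma(r_1)}
&\le (0.18992\log y + 0.29512)\left(\sqrt{e^{\gamma}\log\log y} - \frac{1.43644}{2\sqrt{e^{\gamma}\log\log y}}\right)\\
&\le 0.19505\,\log y\,\sqrt{e^{\gamma}\log\log y} - \frac{0.19505\cdot 1.43644\log y}{2\sqrt{e^{\gamma}\log\log y}}\\
&\le 0.26031\log y\sqrt{\log\log y} - 3.00147.
\end{aligned}
\]

Therefore, by (11.19),

\[
\begin{aligned}
g_{y,\varphi}(r_1) &\le \frac{0.26031\log y\sqrt{\log\log y} + 2.5 - 3.00147}{\sqrt{\frac{3}{4}y^{4/15}}} + \frac{(2.13522\log y + 18.118)\log\log y}{\frac{3}{8}y^{4/15}} + \frac{3.36\,((\log y)/2)^{1/6}}{y^{1/6}}\\
&\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} - \frac{0.57904}{y^{2/15}} + \frac{48.3147\log\log y}{y^{4/15}} + \frac{2.994\,(\log y)^{1/6}}{y^{1/6}}\\
&\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} + \frac{1.30151\,(\log y)^{1/6}}{y^{1/6}}\\
&\le \frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}},
\end{aligned}
\]

where we use $y \ge 10^{25}$ and verify that the functions $t \mapsto (\log t)^{1/6}/t^{1/6 - 2/15}$, $t \mapsto \sqrt{\log\log t}/t^{4/15 - 2/15}$ and $t \mapsto (\log\log t)/t^{4/15 - 2/15}$ are decreasing for $t \ge y$ (just by taking derivatives).

Since $\kappa = 49$, one of the terms in (13.13) simplifies easily:

\[
\frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c^-} \le \frac{7}{15}.
\]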

By (14.30) and $y = x/\kappa = x/49$, we conclude that

\[
\begin{aligned}
\frac{7}{15}\,g(r_1)\,S &\le \frac{7}{15}\cdot\frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}}\cdot(0.640209\log x - 0.021095)\,x\\
&\le \frac{0.14427\log y\sqrt{\log\log y}}{y^{2/15}}\,(0.640209\log y + 2.4705)\,x \le 0.30517x,
\end{aligned}
\tag{14.43}
\]

where we are using the fact that $y \mapsto (\log y)^2\sqrt{\log\log y}/y^{2/15}$ is decreasing for $y \ge 10^{25}$ (because $y \mapsto (\log y)^{5/2}/y^{2/15}$ is decreasing for $y \ge e^{75/4}$, and $10^{25} > e^{75/4}$).

It remains only to bound

\[
\frac{2S}{\log x + 2c^-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr
\]

in the expression (13.13) for $M$. We will use the bound on the integral given in (13.33). The easiest term to bound there is $f_1(r_0)$, defined in (13.34), since it depends only on $r_0$: for $r_0 = 150000$,

\[
f_1(r_0) = 0.0169073\dotsc.
\]

It is also not hard to bound $f_2(r_0,y)$, also defined in (13.34):

\[
f_2(r_0,y) = 3.36\,\frac{((\log y)/2)^{1/6}}{x^{1/6}}\log\frac{\frac{3}{8}y^{4/15}}{r_0} \le 3.36\,\frac{(\log y)^{1/6}}{(2y)^{1/6}}\left(\frac{4}{15}\log y + 0.05699 - \log r_0\right),
\]

where we recall again that $x = \kappa y = 49y$. Thus, since $r_0 = 150000$ and $y \ge 10^{25}$,

\[
f_2(r_0,y) \le 0.001399.
\]

Let us now look at the terms $I_{1,r}$, $c_\varphi$ in (13.35). We already saw in (14.33) that

\[
c_\varphi = \frac{C_{\varphi,2,K}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}} \le 0.02219.
\]

Since $F(t) = e^{\gamma}\log t + c_\gamma$ with $c_\gamma = 1.025742$,

\[
I_{1,r_0} = F(\log r_0) + \frac{2e^{\gamma}}{\log r_0} = 5.73826\dotsc. \tag{14.44}
\]

It thus remains only to estimate $I_{0,r_0,r_1,z}$ for $z = y$ and $z = y/K$, where $K = (\log y)/2$.

We will first give estimates for $y$ large. Omitting negative terms from (13.35), we easily get the following general bound, crude but useful enough:

\[
I_{0,r_0,r_1,z} \le R^2_{z,2r_0}\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} + \frac{R^2_{z,2r_1} - 0.41415^2}{\log r_1/r_0}\cdot\frac{P_2^{-}(\log 2r_0)}{\sqrt{r_0}},
\]

where $P_2(t) = t^2 + 4t + 8$ and $P_2^{-}(t) = 2t^2 + 16t + 48$. By (14.38) and (14.41),

\[
R_{y,2r_1} \le 0.71215,\qquad R_{y/K,2r_1} \le 0.71392
\]

for $y \ge 10^{25}$. Assume now that $y \ge 10^{150}$. Then, since $r_0 = 150000$,

\[
R_{y,r_0} \le 0.27125\log\left(1 + \frac{\log 4r_0}{2\log\frac{9\cdot(10^{150})^{1/3}}{2.004\,r_0}}\right) + 0.41415 \le 0.43086,
\]


and, similarly, $R_{y/K,r_0} \le 0.43113$. Since

\[
0.43086^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10426,\qquad 0.43113^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10439,
\]

we obtain that

\[
\begin{aligned}
(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}
&\le 0.97781\cdot\sqrt{0.10426 + \frac{0.49214}{\frac{4}{15}\log y - \log 400000}}\\
&\quad + 0.02219\sqrt{0.10439 + \frac{0.49584}{\frac{4}{15}\log y - \log 400000}} \le 0.33239
\end{aligned}
\tag{14.45}
\]

for $y \ge 10^{150}$.

For $y$ between $10^{25}$ and $10^{150}$, we evaluate the left side of (14.45) directly, using the definition (13.35) of $I_{0,r_0,r_1,z}$ instead, as well as the bound

\[
c_\varphi \le \frac{0.07455}{\log\frac{\log y}{2}}
\]

from (14.33). (It is clear from the second and third lines of (13.32) that $I_{0,r_0,r_1,z}$ is decreasing on $z$ for $r_0$, $r_1$ fixed, and so the upper bound for $c_\varphi$ does give the worst case.) The bisection method (applied to the interval $[25,150]$ with 30 iterations, including 30 initial iterations) gives us that

\[
(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \le 0.4153461 \tag{14.46}
\]

for $10^{25} \le y \le 10^{150}$. By (14.45), (14.46) is also true for $y > 10^{150}$. Hence

\[
f_0(r_0,y) \le 0.4153461\cdot\sqrt{\frac{2}{\sqrt{r_0}}\,5.73827} \le 0.071498.
\]

By (13.33), we conclude that

\[
\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.071498 + 0.016908 + 0.001399 \le 0.089805.
\]

By (14.30),

\[
\frac{2S}{\log x + 2c^-} \le \frac{2\,(0.640209\,x\log x - 0.021095x)}{\log x + 2c^-} \le 2\cdot 0.640209x = 1.280418x,
\]

where we recall that $c^- = 0.6294 > 0$. Hence

\[
\frac{2S}{\log x + 2c^-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.114988x. \tag{14.47}
\]

Putting (14.37), (14.43) and (14.47) together, we conclude that the quantity $M$ defined in (13.13) is bounded by

\[
M \le 0.38492x + 0.30517x + 0.114988x \le 0.80508x. \tag{14.48}
\]

Gathering the terms from (14.29), (14.32) and (14.48), we see that Theorem 13.2.1 states that the minor-arc total

\[
Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}}|S_{\eta_*}(\alpha,x)|\,|S_{\eta_+}(\alpha,x)|^2\,d\alpha
\]

is bounded by

\[
\begin{aligned}
Z_{r_0} &\le \left(\sqrt{|\varphi|_1\frac{x}{\kappa}(M + T)} + \sqrt{S_{\eta_*}(0,x)\cdot E}\right)^2\\
&\le \left(\sqrt{|\varphi|_1(0.80508 + 3.5776\cdot 10^{-4})}\,\frac{x}{\sqrt{\kappa}} + \sqrt{1.0532\cdot 10^{-11}}\,\frac{x}{\sqrt{\kappa}}\right)^2 \le 1.00948\,\frac{x^2}{\kappa}
\end{aligned}
\tag{14.49}
\]

for $r_0 = 150000$, $x \ge 4.9\cdot 10^{26}$, where we use yet again the fact that $|\varphi|_1 = \sqrt{\pi/2}$. This is our total minor-arc bound.

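14.4 Conclusion: proof of main theorem

As we have known from the start,

\[
\sum_{n_1 + n_2 + n_3 = N}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) = \int_{\mathbb{R}/\mathbb{Z}}S_{\eta_+}(\alpha,x)^2S_{\eta_*}(\alpha,x)e(-N\alpha)\,d\alpha. \tag{14.50}
\]

We have just shown that, assuming $N \ge 10^{27}$, $N$ odd,

\[
\begin{aligned}
\int_{\mathbb{R}/\mathbb{Z}}S_{\eta_+}(\alpha,x)^2S_{\eta_*}(\alpha,x)e(-N\alpha)\,d\alpha
&= \int_{\mathfrak{M}_{8,r_0}}S_{\eta_+}(\alpha,x)^2S_{\eta_*}(\alpha,x)e(-N\alpha)\,d\alpha\\
&\quad + O^*\!\left(\int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}}|S_{\eta_+}(\alpha,x)|^2|S_{\eta_*}(\alpha,x)|\,d\alpha\right)\\
&\ge 1.058259\,\frac{x^2}{\kappa} + O^*\!\left(1.00948\,\frac{x^2}{\kappa}\right) \ge 0.04877\,\frac{x^2}{\kappa}
\end{aligned}
\]

for $r_0 = 150000$, where $x = N/(2 + 9/(196\sqrt{2\pi}))$, as in (14.15). (We are using (14.27) and (14.49).) Recall that $\kappa = 49$ and $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$, where $\varphi(t) = t^2e^{-t^2/2}$.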


It only remains to show that the contribution of terms with $n_1$, $n_2$ or $n_3$ non-prime to the sum in (14.50) is negligible. (Let us take out $n_1$, $n_2$, $n_3$ equal to $2$ as well, since some prefer to state the ternary Goldbach conjecture as follows: every odd number $\ge 9$ is the sum of three odd primes.) Clearly,

\[
\begin{aligned}
\sum_{\substack{n_1 + n_2 + n_3 = N\\ n_1, n_2\text{ or }n_3\text{ even or non-prime}}}&\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3)\\
&\le 3\,|\eta_+|_\infty^2|\eta_*|_\infty\cdot\sum_{\substack{n_1 + n_2 + n_3 = N\\ n_1\text{ even or non-prime}}}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\\
&\le 3\,|\eta_+|_\infty^2|\eta_*|_\infty\cdot(\log N)\sum_{\substack{n_1 \le N\text{ non-prime}\\ \text{or } n_1 = 2}}\Lambda(n_1)\sum_{n_2 \le N}\Lambda(n_2).
\end{aligned}
\tag{14.51}
\]

By (14.3) and (14.21), $|\eta_+|_\infty \le 1.079955$ and $|\eta_*|_\infty \le 1.414$. By [RS62, Thms. 12 and 13],

\[
\sum_{\substack{n_1 \le N\text{ non-prime}\\ \text{or } n_1 = 2}}\Lambda(n_1) < 1.4262\sqrt{N} + \log 2 < 1.4263\sqrt{N},
\]

\[
\sum_{\substack{n_1 \le N\text{ non-prime}\\ \text{or } n_1 = 2}}\Lambda(n_1)\sum_{n_2 \le N}\Lambda(n_2) \le 1.4263\sqrt{N}\cdot 1.03883N \le 1.48169\,N^{3/2}.
\]

Hence, the sum on the first line of (14.51) is at most

\[
7.3306\,N^{3/2}\log N.
\]

Thus, for $N \ge 10^{27}$ odd,

\[
\begin{aligned}
\sum_{\substack{n_1 + n_2 + n_3 = N\\ n_1, n_2, n_3\text{ odd primes}}}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3)
&\ge 0.04877\,\frac{x^2}{\kappa} - 7.3306\,N^{3/2}\log N\\
&\ge 0.00024433\,N^2 - 1.4412\cdot 10^{-11}\cdot N^2 \ge 0.0002443\,N^2
\end{aligned}
\]

by $\kappa = 49$ and (14.15). Since $0.0002443\,N^2 > 0$, this shows that every odd number $N \ge 10^{27}$ can be written as the sum of three odd primes.

Since the ternary Goldbach conjecture has already been checked for all $N \le 8.875\cdot 10^{30}$ [HP13], we conclude that every odd number $N > 7$ can be written as the sum of three odd primes, and every odd number $N > 5$ can be written as the sum of three primes. The main result is hereby proven: the ternary Goldbach conjecture is true.
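The final bookkeeping can be replayed in a few lines: the margin between the major-arc lower bound (14.27) and the minor-arc upper bound (14.49), the size of the non-prime contribution at $N = 10^{27}$, and, purely as an illustration of the statement (it has no bearing on the proof, and is unrelated to the verification in [HP13]), a brute-force search for representations of small odd $N$ as sums of three odd primes. All names in this floating-point sketch are ours.

```python
import math

kappa = 49.0
c1 = (9.0 / 4.0) / math.sqrt(2.0 * math.pi)
ratio = 1.0 / (2.0 + c1 / kappa)                       # x / N, as in (14.15)

margin = 1.058259 - 1.00948                            # major arcs minus minor arcs, per x^2 / kappa
N = 1e27
main_over_N2 = 0.04877 * ratio ** 2 / kappa
noise_over_N2 = 3 * 1.079955 ** 2 * 1.414 * 1.4263 * 1.03883 * math.log(N) / math.sqrt(N)

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def three_odd_primes(n: int):
    """Return odd primes (p, q, r), p <= q <= r, with p + q + r = n, or None."""
    odd_primes = [p for p in range(3, n, 2) if is_prime(p)]
    prime_set = set(odd_primes)
    for p in odd_primes:
        for q in odd_primes:
            if q < p:
                continue
            r = n - p - q
            if r < q:
                break
            if r in prime_set:
                return (p, q, r)
    return None

print(margin, main_over_N2, noise_over_N2, three_odd_primes(101))
```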

Part IV

Appendices


Appendix A

Norms of smoothing functions

Our aim here is to give bounds on the norms of some smoothing functions, and, in particular, on several norms of a smoothing function $\eta_+: [0,\infty) \to \mathbb{R}$ based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

As before, we write

\[
h: t \mapsto \begin{cases} t^2(2-t)^3e^{t-1/2} & \text{if } t \in [0,2],\\ 0 & \text{otherwise.}\end{cases} \tag{A.1}
\]

We recall that we will work with an approximation $\eta_+$ to the function $\eta_\circ: [0,\infty) \to \mathbb{R}$ defined by

\[
\eta_\circ(t) = h(t)\eta_\heartsuit(t) = \begin{cases} t^3(2-t)^3e^{-(t-1)^2/2} & \text{for } t \in [0,2],\\ 0 & \text{otherwise.}\end{cases} \tag{A.2}
\]

The approximation $\eta_+$ is defined by

\[
\eta_+(t) = h_H(t)\,t\,e^{-t^2/2}, \tag{A.3}
\]

where

\[
F_H(y) = \frac{\sin(H\log y)}{\pi\log y},\qquad
h_H(t) = (h *_M F_H)(t) = \int_0^{\infty}h(ty^{-1})F_H(y)\,\frac{dy}{y}, \tag{A.4}
\]

and $H$ is a positive constant to be set later. By (2.8), $Mh_H = Mh\cdot MF_H$. Now $F_H$ is just a Dirichlet kernel under a change of variables; using this, we get that, for $\tau$ real,

\[
MF_H(i\tau) = \begin{cases} 1 & \text{if } |\tau| < H,\\ 1/2 & \text{if } |\tau| = H,\\ 0 & \text{if } |\tau| > H.\end{cases} \tag{A.5}
\]


Thus,

\[
Mh_H(i\tau) = \begin{cases} Mh(i\tau) & \text{if } |\tau| < H,\\ \frac{1}{2}Mh(i\tau) & \text{if } |\tau| = H,\\ 0 & \text{if } |\tau| > H.\end{cases} \tag{A.6}
\]

As it turns out, $h$, $\eta_\circ$ and $Mh$ (and hence $Mh_H$) are relatively easy to work with, whereas we can already see that $h_H$ and $\eta_+$ have more complicated definitions. Part of our work will consist in expressing norms of $h_H$ and $\eta_+$ in terms of norms of $h$, $\eta_\circ$ and $Mh$.

A.1 The decay of a Mellin transform

Now, consider any $\phi: [0,\infty) \to \mathbb{C}$ that (a) has compact support (or fast decay), (b) satisfies $\phi^{(k)}(t)\,t^{k-1} = O(1)$ for $t \to 0^+$ and $0 \le k \le 3$, and (c) is $C^2$ everywhere and quadruply differentiable outside a finite set of points.

By definition,

\[
M\phi(s) = \int_0^{\infty}\phi(x)x^s\,\frac{dx}{x}.
\]

Thus, by integration by parts, for $\Re(s) > -1$ and $s \ne 0$,

\[
\begin{aligned}
M\phi(s) &= \int_0^{\infty}\phi(x)x^s\,\frac{dx}{x} = \lim_{t \to 0^+}\int_t^{\infty}\phi(x)x^s\,\frac{dx}{x} = -\lim_{t \to 0^+}\int_t^{\infty}\phi'(x)\frac{x^s}{s}\,dx\\
&= \lim_{t \to 0^+}\int_t^{\infty}\phi''(x)\frac{x^{s+1}}{s(s+1)}\,dx = \lim_{t \to 0^+}-\int_t^{\infty}\phi^{(3)}(x)\frac{x^{s+2}}{s(s+1)(s+2)}\,dx\\
&= \lim_{t \to 0^+}\int_t^{\infty}\phi^{(4)}(x)\frac{x^{s+3}}{s(s+1)(s+2)(s+3)}\,dx,
\end{aligned}
\tag{A.7}
\]

where $\phi^{(4)}(x)$ is understood in the sense of distributions at the finitely many points where it is not well-defined as a function.

Let $s = it$, $\phi = h$. Let $C_k = \lim_{t \to 0^+}\int_t^{\infty}|h^{(k)}(x)|\,x^{k-1}\,dx$ for $0 \le k \le 4$. Then (A.7) gives us that

\[
|Mh(it)| \le \min\left(C_0,\ \frac{C_1}{|t|},\ \frac{C_2}{|t||t+i|},\ \frac{C_3}{|t||t+i||t+2i|},\ \frac{C_4}{|t||t+i||t+2i||t+3i|}\right). \tag{A.8}
\]

We must estimate the constants $C_j$, $0 \le j \le 4$.

Clearly $h(t)t^{-1} = O(1)$ as $t \to 0^+$, $h^{(k)}(t) = O(1)$ as $t \to 0^+$ for all $k \ge 1$, $h(2) = h'(2) = h''(2) = 0$, and $h(x)$, $h'(x)$ and $h''(x)$ are all continuous. The function $h'''$ has a discontinuity at $t = 2$. As we said, we understand $h^{(4)}$ in the sense of distributions at $t = 2$; for example, $\lim_{\epsilon \to 0}\int_{2-\epsilon}^{2+\epsilon}h^{(4)}(t)\,dt = \lim_{\epsilon \to 0}(h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon))$.

Symbolic integration easily gives that

\[
C_0 = \int_0^2 t(2-t)^3e^{t-1/2}\,dt = 92e^{-1/2} - 12e^{3/2} = 2.02055184\dotsc \tag{A.9}
\]
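The closed form (A.9) can be compared with a direct numerical integration. The sketch below uses a composite Simpson rule in floating point; the helper names are ours.

```python
import math

def h(t: float) -> float:
    """h(t) = t^2 (2-t)^3 e^{t-1/2} on [0, 2], zero elsewhere, as in (A.1)."""
    return t * t * (2.0 - t) ** 3 * math.exp(t - 0.5) if 0.0 <= t <= 2.0 else 0.0

def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * step) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * step) for i in range(1, n // 2))
    return s * step / 3

# C_0 is the integral of |h(x)| / x, i.e. of x (2-x)^3 e^{x-1/2}, over [0, 2].
C0_numeric = simpson(lambda t: t * (2.0 - t) ** 3 * math.exp(t - 0.5), 0.0, 2.0)
C0_closed = 92.0 * math.exp(-0.5) - 12.0 * math.exp(1.5)
print(C0_numeric, C0_closed, h(1.0))
```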


We will have to compute Ck 1 le k le 4 with some care due to the absolute valueinvolved in the definition

The function (x2(2minus x)3exminus12)prime = ((x2(2minus x)3)prime + x2(2minus x)3)exminus12 has thesame zeros as H1(x) = (x2(2minus x)3)prime + x2(2minus x)3 namely minus4 0 1 and 2 The signof H1(x) (and hence of hprime(x)) is + within (0 1) and minus within (1 2) Hence

C1 =

int infin0

|hprime(x)|dx = |h(1)minus h(0)|+ |h(2)minus h(1)| = 2h(1) = 2radice (A10)

The situation with $(x^2(2-x)^3e^{x-1/2})''$ is similar: it has zeros at the roots of $H_2(x) = 0$, where $H_2(x) = H_1(x) + H_1'(x)$ (and, in general, $H_{k+1}(x) = H_k(x) + H_k'(x)$). This time, we will prefer to find the roots numerically. It is enough to find (candidates for) the roots using any available tool$^1$ and then check rigorously that the sign does change around the purported roots. In this way, we check that $H_2(x) = 0$ has two roots $\alpha_{2,1}$, $\alpha_{2,2}$ in the interval $(0,2)$, another root at $2$, and two more roots outside $[0,2]$; moreover,

\[
\alpha_{2,1} = 0.48756597185712\ldots,\qquad \alpha_{2,2} = 1.48777169309489\ldots, \tag{A.11}
\]

where we verify the roots using interval arithmetic. The sign of $H_2(x)$ (and hence of $h''(x)$) is first $+$, then $-$, then $+$. Write $\alpha_{2,0} = 0$, $\alpha_{2,3} = 2$. By integration by parts,

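\[
\begin{aligned}
C_2 &= \int_0^\infty |h''(x)|\,x\,dx = \int_0^{\alpha_{2,1}} h''(x)\,x\,dx - \int_{\alpha_{2,1}}^{\alpha_{2,2}} h''(x)\,x\,dx + \int_{\alpha_{2,2}}^{2} h''(x)\,x\,dx \\
&= \sum_{j=1}^{3} (-1)^{j+1}\left( h'(x)x\Big|_{\alpha_{2,j-1}}^{\alpha_{2,j}} - \int_{\alpha_{2,j-1}}^{\alpha_{2,j}} h'(x)\,dx \right) \\
&= 2\sum_{j=1}^{2} (-1)^{j+1}\left( h'(\alpha_{2,j})\alpha_{2,j} - h(\alpha_{2,j}) \right) = 10.79195821037\ldots
\end{aligned} \tag{A.12}
\]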
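The computation of $C_0$, $C_1$ and $C_2$ can be replayed in floating point as a sanity check. The sketch below assumes $h(x) = x^2(2-x)^3e^{x-1/2}$ on $[0,2]$ (as the integrand in (A.9) indicates), builds the polynomials $H_k$ by the recursion $H_{k+1} = H_k + H_k'$, and finds the roots of $H_2$ by plain bisection. It is not rigorous: the text uses interval arithmetic, whereas this uses ordinary doubles, and all function names are illustrative.

```python
import math

def poly_eval(c, x):
    """Evaluate sum(c[i] * x**i) by Horner's rule."""
    acc = 0.0
    for a in reversed(c):
        acc = acc * x + a
    return acc

def poly_deriv(c):
    return [i * c[i] for i in range(1, len(c))]

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

# H[0] = x^2 (2 - x)^3 = 8x^2 - 12x^3 + 6x^4 - x^5, and H[k+1] = H[k] + H[k]'.
H = [[0, 0, 8, -12, 6, -1]]
for _ in range(4):
    H.append(poly_add(H[-1], poly_deriv(H[-1])))

def h_deriv(k, x):
    """k-th derivative of h at a point x in (0, 2)."""
    return poly_eval(H[k], x) * math.exp(x - 0.5)

def bisect_root(f, lo, hi, iters=200):
    """Bisection on a bracket [lo, hi] where f changes sign."""
    flo = f(lo)
    assert flo * f(hi) < 0, "no sign change on the bracket"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fm = f(mid)
        if flo * fm <= 0:
            hi = mid
        else:
            lo, flo = mid, fm
    return 0.5 * (lo + hi)

C0 = 92 * math.exp(-0.5) - 12 * math.exp(1.5)   # closed form in (A.9)
C1 = 2 * math.sqrt(math.e)                       # (A.10)

# Brackets chosen by inspecting the sign of H_2 at a few points.
alpha21 = bisect_root(lambda x: poly_eval(H[2], x), 0.3, 0.7)
alpha22 = bisect_root(lambda x: poly_eval(H[2], x), 1.3, 1.7)

# Last line of (A.12).
C2 = 2 * sum((-1) ** (j + 1) * (h_deriv(1, a) * a - h_deriv(0, a))
             for j, a in enumerate((alpha21, alpha22), start=1))
```

The same pattern (bracket, bisect, plug the roots into the boundary terms) applies to $C_3$ and $C_4$, with `H[3]` and `H[4]` in place of `H[2]`.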

$^1$ Routine find_root in SAGE was used here.

To compute $C_3$, we proceed in the same way, finding two roots of $H_3(x) = 0$ (numerically) within the interval $(0,2)$, viz.,

\[
\alpha_{3,1} = 1.04294565694978\ldots,\qquad \alpha_{3,2} = 1.80999654602916\ldots
\]

The sign of $H_3(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Write $\alpha_{3,0} = 0$, $\alpha_{3,3} = 2$. Proceeding as before (with the only difference that the integration by parts is iterated once now), we obtain that

\[
\begin{aligned}
C_3 &= \int_0^\infty |h'''(x)|\,x^2\,dx = \sum_{j=1}^{3} (-1)^j \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h'''(x)\,x^2\,dx \\
&= \sum_{j=1}^{3} (-1)^j \left( h''(x)x^2\Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} - \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h''(x)\cdot 2x\,dx \right) \\
&= \sum_{j=1}^{3} (-1)^j \left( h''(x)x^2 - h'(x)\cdot 2x + 2h(x) \right)\Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} \\
&= 2\sum_{j=1}^{2} (-1)^j \left( h''(\alpha_{3,j})\alpha_{3,j}^2 - 2h'(\alpha_{3,j})\alpha_{3,j} + 2h(\alpha_{3,j}) \right)
\end{aligned} \tag{A.13}
\]

and so interval arithmetic gives us

\[
C_3 = 75.1295251672\ldots \tag{A.14}
\]

The treatment of the integral in $C_4$ is very similar, at least at first. There are two roots of $H_4(x) = 0$ in the interval $(0,2)$, namely,

\[
\alpha_{4,1} = 0.45839599852663\ldots,\qquad \alpha_{4,2} = 1.54626346975533\ldots
\]

The sign of $H_4(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Using integration by parts as before, we obtain

\[
\begin{aligned}
\int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx
&= -\int_{0^+}^{\alpha_{4,1}} h^{(4)}(x)x^3\,dx + \int_{\alpha_{4,1}}^{\alpha_{4,2}} h^{(4)}(x)x^3\,dx - \int_{\alpha_{4,2}}^{2^-} h^{(4)}(x)x^3\,dx \\
&= 2\sum_{j=1}^{2} (-1)^j \left( h^{(3)}(\alpha_{4,j})\alpha_{4,j}^3 - 3h^{(2)}(\alpha_{4,j})\alpha_{4,j}^2 + 6h'(\alpha_{4,j})\alpha_{4,j} - 6h(\alpha_{4,j}) \right) - \lim_{t\to 2^-} h^{(3)}(t)t^3 \\
&= 1152.69754862\ldots,
\end{aligned}
\]

since $\lim_{t\to 0^+} h^{(k)}(t)t^k = 0$ for $0\le k\le 3$, $\lim_{t\to 2^-} h^{(k)}(t) = 0$ for $0\le k\le 2$ and $\lim_{t\to 2^-} h^{(3)}(t) = -24e^{3/2}$. Now

\[
\int_{2^-}^{\infty} |h^{(4)}(x)x^3|\,dx = \lim_{\epsilon\to 0^+} |h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon)|\cdot 2^3 = 2^3\cdot 24e^{3/2}.
\]

Hence

\[
C_4 = \int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx + 24e^{3/2}\cdot 2^3 = 2013.18185012\ldots \tag{A.15}
\]

We finish by remarking that we can write down $Mh$ explicitly:

\[
Mh = -e^{-1/2}(-1)^{-s}\left(8\gamma(s+2,-2) + 12\gamma(s+3,-2) + 6\gamma(s+4,-2) + \gamma(s+5,-2)\right), \tag{A.16}
\]

where $\gamma(s,x)$ is the (lower) incomplete Gamma function

\[
\gamma(s,x) = \int_0^x e^{-t}t^{s-1}\,dt.
\]

We will, however, find it easier to deal with $Mh$ by means of the bound (A.8), in part because (A.16) amounts to an invitation to numerical instability.

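For instance, it is easy to use (A.8) to give a bound for the $\ell_1$-norm of $Mh(it)$. Since $C_4/C_3 > C_3/C_2 > C_2/C_1 > C_1/C_0$,

\[
\begin{aligned}
|Mh(it)|_1 &= 2\int_0^\infty |Mh(it)|\,dt \\
&\le 2\left( C_0\frac{C_1}{C_0} + C_1\int_{C_1/C_0}^{C_2/C_1}\frac{dt}{t} + C_2\int_{C_2/C_1}^{C_3/C_2}\frac{dt}{t^2} + C_3\int_{C_3/C_2}^{C_4/C_3}\frac{dt}{t^3} + C_4\int_{C_4/C_3}^{\infty}\frac{dt}{t^4} \right) \\
&= 2\left( C_1 + C_1\log\frac{C_2C_0}{C_1^2} + C_2\left(\frac{C_1}{C_2} - \frac{C_2}{C_3}\right) + \frac{C_3}{2}\left(\frac{C_2^2}{C_3^2} - \frac{C_3^2}{C_4^2}\right) + \frac{C_4}{3}\cdot\frac{C_3^3}{C_4^3} \right),
\end{aligned}
\]

and so

\[
|Mh(it)|_1 \le 16.1939176. \tag{A.17}
\]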
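The closed form leading to (A.17) is easy to re-evaluate. The following sketch simply plugs the constants from (A.9), (A.10), (A.12), (A.14) and (A.15) into the last displayed expression; the function name `l1_bound` is ours, and the arithmetic is ordinary floating point rather than interval arithmetic.

```python
import math

# Constants C_0..C_4 as given in (A.9), (A.10), (A.12), (A.14), (A.15).
C = [2.02055184, 2.0 * math.sqrt(math.e), 10.79195821037,
     75.1295251672, 2013.18185012]

def l1_bound(C):
    """Evaluate the closed-form upper bound on |Mh(it)|_1 preceding (A.17)."""
    C0, C1, C2, C3, C4 = C
    inner = (C1
             + C1 * math.log(C2 * C0 / C1 ** 2)
             + C2 * (C1 / C2 - C2 / C3)
             + (C3 / 2) * (C2 ** 2 / C3 ** 2 - C3 ** 2 / C4 ** 2)
             + (C4 / 3) * C3 ** 3 / C4 ** 3)
    return 2 * inner

# The breakpoints C_{k+1}/C_k must be increasing for the splitting to be valid.
ratios = [C[k + 1] / C[k] for k in range(4)]
bound = l1_bound(C)
```

The increasing ratios are what allow each term of the minimum in (A.8) to be used on its own interval $[C_k/C_{k-1}, C_{k+1}/C_k]$.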

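This bound is far from tight, but it will certainly be useful. Similarly, $|(t+i)Mh(it)|_1$ is at most two times

\[
\begin{aligned}
& C_0\int_0^{C_1/C_0} |t+i|\,dt + C_1\int_{C_1/C_0}^{C_2/C_1} \left|1 + \frac{i}{t}\right| dt + C_2\int_{C_2/C_1}^{C_3/C_2}\frac{dt}{t} + C_3\int_{C_3/C_2}^{C_4/C_3}\frac{dt}{t^2} + C_4\int_{C_4/C_3}^{\infty}\frac{dt}{t^3} \\
&\quad = \frac{C_0}{2}\left( \sqrt{\frac{C_1^4}{C_0^4} + \frac{C_1^2}{C_0^2}} + \sinh^{-1}\frac{C_1}{C_0} \right) + C_1\left( \sqrt{t^2+1} + \log\left(\frac{\sqrt{t^2+1}-1}{t}\right) \right)\Bigg|_{C_1/C_0}^{C_2/C_1} \\
&\qquad + C_2\log\frac{C_3C_1}{C_2^2} + C_3\left(\frac{C_2}{C_3} - \frac{C_3}{C_4}\right) + \frac{C_4}{2}\frac{C_3^2}{C_4^2},
\end{aligned}
\]

and so

\[
|(t+i)Mh(it)|_1 \le 27.8622803. \tag{A.18}
\]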

A.2 The difference $\eta_+ - \eta_\circ$ in $\ell_2$ norm

We wish to estimate the distance in $\ell_2$ norm between $\eta_\circ$ and its approximation $\eta_+$. This will be an easy affair, since, on the imaginary axis, the Mellin transform of $\eta_+$ is just a truncation of the Mellin transform of $\eta_\circ$.

By (A.2) and (A.3),

\[
\begin{aligned}
|\eta_+ - \eta_\circ|_2^2 &= \int_0^\infty \left| h_H(t)te^{-t^2/2} - h(t)te^{-t^2/2} \right|^2 dt \\
&\le \left( \max_{t\ge 0} e^{-t^2}t^3 \right)\cdot \int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t}.
\end{aligned} \tag{A.19}
\]

The maximum $\max_{t\ge 0} t^3e^{-t^2}$ is $(3/2)^{3/2}e^{-3/2}$. Since the Mellin transform is an isometry (i.e., (2.6) holds),

\[
\int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t} = \frac{1}{2\pi}\int_{-\infty}^{\infty} |Mh_H(it) - Mh(it)|^2\,dt = \frac{1}{\pi}\int_H^\infty |Mh(it)|^2\,dt. \tag{A.20}
\]

By (A.8),

\[
\int_H^\infty |Mh(it)|^2\,dt \le \int_H^\infty \frac{C_4^2}{t^8}\,dt \le \frac{C_4^2}{7H^7}. \tag{A.21}
\]

Hence

\[
\int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t} \le \frac{C_4^2}{7\pi H^7}. \tag{A.22}
\]

Using the bound (A.15) for $C_4$, we conclude that

\[
|\eta_+ - \eta_\circ|_2 \le \frac{C_4}{\sqrt{7\pi}}\left(\frac{3}{2e}\right)^{3/4}\cdot\frac{1}{H^{7/2}} \le \frac{274.856893}{H^{7/2}}. \tag{A.23}
\]
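As a quick arithmetic check of the constant in (A.23), one can combine the closed-form maximum of $t^3e^{-t^2}$ with the value of $C_4$ from (A.15). The sketch below does this in floating point and also compares the closed-form maximum against a coarse grid search; the variable names are illustrative only.

```python
import math

C4 = 2013.18185012                      # value from (A.15)

# Closed form: t^3 e^{-t^2} is maximal at t = sqrt(3/2).
max_weight = 1.5 ** 1.5 * math.exp(-1.5)

# Coarse grid search on [0, 5] as an independent check of the closed form.
grid_max = max((5.0 * i / 100_000) ** 3 * math.exp(-(5.0 * i / 100_000) ** 2)
               for i in range(100_001))

# |eta_+ - eta_o|_2 <= const / H^{7/2}, by (A.19) and (A.22).
const = C4 / math.sqrt(7 * math.pi) * math.sqrt(max_weight)
```

Here `math.sqrt(max_weight)` equals $(3/(2e))^{3/4}$, which is the factor appearing in (A.23).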

It will also be useful to bound

\[
\left| \int_0^\infty (\eta_+(t) - \eta_\circ(t))^2 \log t\,dt \right|.
\]

This is at most

\[
\left( \max_{t\ge 0} e^{-t^2}t^3|\log t| \right)\cdot \int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t}.
\]

Now

\[
\max_{t\ge 0} e^{-t^2}t^3|\log t| = \max\left( \max_{t\in[0,1]} e^{-t^2}t^3(-\log t),\ \max_{t\in[1,5]} e^{-t^2}t^3\log t \right) = 0.14882234545\ldots,
\]

where we find the maximum by the bisection method with 40 iterations (see §2.6). Hence, by (A.22),

\[
\int_0^\infty (\eta_+(t) - \eta_\circ(t))^2|\log t|\,dt \le 0.148822346\cdot\frac{C_4^2}{7\pi H^7} \le \frac{27427.502}{H^7} \le \left(\frac{165.61251}{H^{7/2}}\right)^2. \tag{A.24}
\]
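The maximum of $e^{-t^2}t^3|\log t|$ can be approximated without any special machinery. The sketch below uses a plain grid search on $[0,1]$ and $[1,5]$ rather than the bisection method with interval arithmetic used in the text, so it is only a non-rigorous cross-check; the helper names are ours.

```python
import math

def weight(t):
    """t^3 e^{-t^2} |log t|, extended by 0 at t = 0."""
    if t == 0.0:
        return 0.0
    return math.exp(-t * t) * t ** 3 * abs(math.log(t))

def grid_max(func, lo, hi, n):
    """Maximum of func over n + 1 equally spaced points of [lo, hi]."""
    return max(func(lo + (hi - lo) * i / n) for i in range(n + 1))

m_low = grid_max(weight, 0.0, 1.0, 100_000)    # branch with -log t
m_high = grid_max(weight, 1.0, 5.0, 400_000)   # branch with +log t
max_log_weight = max(m_low, m_high)
```

The maximum is attained on the second branch, near $t\approx 1.6$; beyond $t = 5$ the factor $e^{-t^2}$ makes the function negligible.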

A.3 Norms involving $\eta_+$

Let us now bound some $\ell_1$- and $\ell_2$-norms involving $\eta_+$. Relatively crude bounds will suffice in most cases.

First, by (A.23),

\[
\begin{aligned}
|\eta_+|_2 &\le |\eta_\circ|_2 + |\eta_+ - \eta_\circ|_2 \le 0.800129 + \frac{274.8569}{H^{7/2}}, \\
|\eta_+|_2 &\ge |\eta_\circ|_2 - |\eta_+ - \eta_\circ|_2 \ge 0.800128 - \frac{274.8569}{H^{7/2}},
\end{aligned} \tag{A.25}
\]

where we obtain

\[
|\eta_\circ|_2 = \sqrt{0.640205997\ldots} = 0.8001287\ldots \tag{A.26}
\]

by symbolic integration.

Let us now bound $|\eta_+\cdot\log|_2^2$. By isometry and (2.10),

\[
|\eta_+\cdot\log|_2^2 = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} |M(\eta_+\cdot\log)(s)|^2\,ds = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M\eta_+)'(s)|^2\,ds.
\]

Now, $(M\eta_+)'(1/2+it)$ equals $1/2\pi$ times the additive convolution of $Mh_H(it)$ and $(M\eta_\diamondsuit)'(1/2+it)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Hence, by Young's inequality,

\[
|(M\eta_+)'(1/2+it)|_2 \le \frac{1}{2\pi}|Mh_H(it)|_1\,|(M\eta_\diamondsuit)'(1/2+it)|_2.
\]

Again by isometry and (2.10),

\[
|(M\eta_\diamondsuit)'(1/2+it)|_2 = \sqrt{2\pi}\,|\eta_\diamondsuit\cdot\log|_2.
\]

Hence, by (A.17),

\[
|\eta_+\cdot\log|_2 \le \frac{1}{2\pi}|Mh_H(it)|_1\,|\eta_\diamondsuit\cdot\log|_2 \le 2.5773421\cdot|\eta_\diamondsuit\cdot\log|_2.
\]

Since, by symbolic integration,

\[
|\eta_\diamondsuit\cdot\log|_2 \le \sqrt{\frac{\sqrt{\pi}}{32}\left(8(\log 2)^2 + 2\gamma^2 + \pi^2 + 8(\gamma-2)\log 2 - 8\gamma\right)} \le 0.3220301, \tag{A.27}
\]

we get that

\[
|\eta_+\cdot\log|_2 \le 0.8299818. \tag{A.28}
\]

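Let us bound $|\eta_+(t)t^\sigma|_1$ for $\sigma\in(-2,\infty)$. By Cauchy-Schwarz and Plancherel,

\[
\begin{aligned}
|\eta_+(t)t^\sigma|_1 &= \left| h_H(t)t^{1+\sigma}e^{-t^2/2} \right|_1 \le \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 \left| h_H(t)/\sqrt{t} \right|_2 \\
&= \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 \sqrt{\int_0^\infty |h_H(t)|^2\,\frac{dt}{t}} = \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\sqrt{\frac{1}{2\pi}\int_{-H}^{H} |Mh(ir)|^2\,dr} \\
&\le \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |Mh(ir)|^2\,dr} = \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\left| h(t)/\sqrt{t} \right|_2.
\end{aligned} \tag{A.29}
\]

Since

\[
\left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 = \sqrt{\int_0^\infty e^{-t^2}t^{2\sigma+3}\,dt} = \sqrt{\frac{\Gamma(\sigma+2)}{2}},\qquad
\left| h(t)/\sqrt{t} \right|_2 = \sqrt{\frac{31989}{8e} - \frac{585e^3}{8}} \le 1.5023459,
\]

we conclude that

\[
|\eta_+(t)t^\sigma|_1 \le 1.062319\cdot\sqrt{\Gamma(\sigma+2)} \tag{A.30}
\]

for $\sigma > -2$.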

A.4 Norms involving $\eta_+'$

By one of the standard transformation rules (see (2.10)), the Mellin transform of $\eta_+'$ equals $-(s-1)\cdot M\eta_+(s-1)$. Since the Mellin transform is an isometry in the sense of (2.6),

\[
|\eta_+'|_2^2 = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} \left|M(\eta_+')(s)\right|^2 ds = \frac{1}{2\pi i}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s\cdot M\eta_+(s)|^2\,ds.
\]

Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Thus, by (2.9), the function $M\eta_+(-1/2+it)$ equals $1/2\pi$ times the (additive) convolution of $Mh_H(it)$ and $M\eta_\diamondsuit(-1/2+it)$. Therefore, for $s = -1/2+it$,

\[
\begin{aligned}
|s|\,|M\eta_+(s)| &= \frac{|s|}{2\pi}\left| \int_{-H}^{H} Mh(ir)\,M\eta_\diamondsuit(s-ir)\,dr \right| \\
&\le \frac{3}{2\pi}\int_{-H}^{H} |ir-1||Mh(ir)|\cdot|s-ir||M\eta_\diamondsuit(s-ir)|\,dr \\
&= \frac{3}{2\pi}(f*g)(t),
\end{aligned} \tag{A.31}
\]

where $f(t) = |it-1||Mh(it)|$ and $g(t) = |-1/2+it||M\eta_\diamondsuit(-1/2+it)|$. (Since $|(-1/2+i(t-r)) + (1+ir)| = |1/2+it| = |s|$, either $|-1/2+i(t-r)| \ge |s|/3$ or $|1+ir| \ge 2|s|/3$; hence $|s-ir||ir-1| = |-1/2+i(t-r)||1+ir| \ge |s|/3$.) By Young's inequality (in a special case that follows from Cauchy-Schwarz), $|f*g|_2 \le |f|_1|g|_2$. By (A.18),

\[
|f|_1 = |(r+i)Mh(ir)|_1 \le 27.8622803.
\]

Yet again by Plancherel,

\[
|g|_2^2 = \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s|^2|M\eta_\diamondsuit(s)|^2\,ds = \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M(\eta_\diamondsuit'))(s)|^2\,ds = 2\pi|\eta_\diamondsuit'|_2^2 = \frac{3\pi^{3/2}}{4}.
\]

Hence

\[
|\eta_+'|_2 \le \frac{1}{\sqrt{2\pi}}\cdot\frac{3}{2\pi}|f*g|_2 \le \frac{1}{\sqrt{2\pi}}\,\frac{3}{2\pi}\cdot 27.8622803\,\sqrt{\frac{3\pi^{3/2}}{4}} \le 10.845789. \tag{A.32}
\]

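Let us now bound $|\eta_+'(t)t^\sigma|_1$ for $\sigma\in(-1,\infty)$. First of all,

\[
\begin{aligned}
|\eta_+'(t)t^\sigma|_1 &= \left| \left(h_H(t)te^{-t^2/2}\right)' t^\sigma \right|_1 \le \left| \left( h_H'(t)te^{-t^2/2} + h_H(t)(1-t^2)e^{-t^2/2} \right)\cdot t^\sigma \right|_1 \\
&\le \left| h_H'(t)t^{\sigma+1}e^{-t^2/2} \right|_1 + |\eta_+(t)t^{\sigma-1}|_1 + |\eta_+(t)t^{\sigma+1}|_1.
\end{aligned}
\]

We can bound the last two terms by (A.30). Much as in (A.29), we note that

\[
\left| h_H'(t)t^{\sigma+1}e^{-t^2/2} \right|_1 \le \left| t^{\sigma+1/2}e^{-t^2/2} \right|_2 \left| h_H'(t)\sqrt{t} \right|_2,
\]

and then see that

\[
\begin{aligned}
\left| h_H'(t)\sqrt{t} \right|_2 &= \sqrt{\int_0^\infty |h_H'(t)|^2\,t\,dt} = \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |M(h_H')(1+ir)|^2\,dr} \\
&= \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |(-ir)Mh_H(ir)|^2\,dr} = \sqrt{\frac{1}{2\pi}\int_{-H}^{H} |(-ir)Mh(ir)|^2\,dr} \\
&= \sqrt{\frac{1}{2\pi}\int_{-H}^{H} |M(h')(1+ir)|^2\,dr} \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |M(h')(1+ir)|^2\,dr} = \left| h'(t)\sqrt{t} \right|_2,
\end{aligned}
\]

where we use the first rule in (2.10) twice. Since

\[
\left| t^{\sigma+1/2}e^{-t^2/2} \right|_2 = \sqrt{\frac{\Gamma(\sigma+1)}{2}},\qquad
\left| h'(t)\sqrt{t} \right|_2 = \sqrt{\frac{103983}{16e} - \frac{1899e^3}{16}} = 2.6312226,
\]

we conclude that

\[
\begin{aligned}
|\eta_+'(t)t^\sigma|_1 &\le 1.062319\cdot\left(\sqrt{\Gamma(\sigma+1)} + \sqrt{\Gamma(\sigma+3)}\right) + \sqrt{\frac{\Gamma(\sigma+1)}{2}}\cdot 2.6312226 \\
&\le 2.922875\sqrt{\Gamma(\sigma+1)} + 1.062319\sqrt{\Gamma(\sigma+3)}
\end{aligned} \tag{A.33}
\]

for $\sigma > -1$.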

A.5 The $\ell_\infty$-norm of $\eta_+$

Let us now get a bound for $|\eta_+|_\infty$. Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Clearly

\[
\begin{aligned}
|\eta_+|_\infty = |h_H(t)\eta_\diamondsuit(t)|_\infty &\le |\eta_\circ|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t)|_\infty \\
&\le |\eta_\circ|_\infty + \left| \frac{h(t) - h_H(t)}{t} \right|_\infty |\eta_\diamondsuit(t)t|_\infty.
\end{aligned} \tag{A.34}
\]

Taking derivatives, we easily see that

\[
|\eta_\circ|_\infty = \eta_\circ(1) = 1,\qquad |\eta_\diamondsuit(t)t|_\infty = 2/e.
\]

It remains to bound $|(h(t) - h_H(t))/t|_\infty$. By (7.6),

\[
h_H(t) = \int_{t/2}^{\infty} h(ty^{-1})\,\frac{\sin(H\log y)}{\pi\log y}\,\frac{dy}{y} = \int_{-H\log\frac{2}{t}}^{\infty} h\left(\frac{t}{e^{w/H}}\right)\frac{\sin w}{\pi w}\,dw. \tag{A.35}
\]

The sine integral

\[
\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt
\]

is defined for all $x$; it tends to $\pi/2$ as $x\to+\infty$ and to $-\pi/2$ as $x\to-\infty$ (see [AS64, (5.2.25)]). We apply integration by parts to the second integral in (A.35), and obtain

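\[
\begin{aligned}
h_H(t) - h(t) &= -\frac{1}{\pi}\int_{-H\log\frac{2}{t}}^{\infty} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\mathrm{Si}(w)\,dw - h(t) \\
&= -\frac{1}{\pi}\int_0^{\infty} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\left(\mathrm{Si}(w) - \frac{\pi}{2}\right)dw \\
&\quad - \frac{1}{\pi}\int_{-H\log\frac{2}{t}}^{0} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\left(\mathrm{Si}(w) + \frac{\pi}{2}\right)dw.
\end{aligned}
\]

Now

\[
\left| \frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right) \right| = \frac{te^{-w/H}}{H}\left| h'\left(\frac{t}{e^{w/H}}\right) \right| \le \frac{t|h'|_\infty}{He^{w/H}}.
\]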

Integration by parts easily yields the bounds $|\mathrm{Si}(x) - \pi/2| < 2/x$ for $x > 0$ and $|\mathrm{Si}(x) + \pi/2| < 2/|x|$ for $x < 0$; we also know that $0 \le \mathrm{Si}(x) \le x < \pi/2$ for $x\in[0,1]$ and $-\pi/2 < x \le \mathrm{Si}(x) \le 0$ for $x\in[-1,0]$. Hence

\[
|h_H(t) - h(t)| \le \frac{2t|h'|_\infty}{\pi H}\left( \int_0^1 \frac{\pi}{2}e^{-w/H}\,dw + \int_1^\infty \frac{2e^{-w/H}}{w}\,dw \right) = t|h'|_\infty\cdot\left( (1 - e^{-1/H}) + \frac{4}{\pi}\frac{E_1(1/H)}{H} \right),
\]

where $E_1$ is the exponential integral

\[
E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\,dt.
\]

By [AS64, (5.1.20)],

\[
0 < E_1(1/H) < \frac{\log(H+1)}{e^{1/H}},
\]

and, since $\log(H+1) = \log H + \log(1+1/H) < \log H + 1/H < (\log H)(1+1/H) < (\log H)e^{1/H}$ for $H\ge e$, we see that this gives us that $E_1(1/H) < \log H$ (again for $H\ge e$, as is the case). Hence

\[
\frac{|h_H(t) - h(t)|}{t} < |h'|_\infty\cdot\left( 1 - e^{-\frac{1}{H}} + \frac{4}{\pi}\frac{\log H}{H} \right) < |h'|_\infty\cdot\frac{1 + \frac{4}{\pi}\log H}{H}, \tag{A.36}
\]

and so, by (A.34),

\[
|\eta_+|_\infty \le 1 + \frac{2}{e}\left| \frac{h(t) - h_H(t)}{t} \right|_\infty < 1 + \frac{2}{e}|h'|_\infty\cdot\frac{1 + \frac{4}{\pi}\log H}{H}.
\]

By (A.11) and interval arithmetic, we determine that

\[
|h'|_\infty = |h'(\alpha_{2,2})| \le 2.805820379671, \tag{A.37}
\]

where $\alpha_{2,2}$ is a root of $h''(x) = 0$ as in (A.11). We have proven that

\[
|\eta_+|_\infty < 1 + \frac{2}{e}\cdot 2.80582038\cdot\frac{1 + \frac{4}{\pi}\log H}{H} < 1 + 2.06440727\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.38}
\]
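The constant $|h'|_\infty$ in (A.37) and the resulting bound (A.38) can be spot-checked numerically. The sketch below assumes $h(x) = x^2(2-x)^3e^{x-1/2}$ on $[0,2]$, so that $h'(x) = x(2-x)^2(4+x)(1-x)e^{x-1/2}$ (the factorisation of $H_1$ given above), and uses a grid search instead of interval arithmetic; `eta_plus_sup_bound` is a hypothetical helper name.

```python
import math

def h_prime(x):
    """h'(x) = H_1(x) e^{x - 1/2} on [0, 2], and 0 elsewhere."""
    if not 0.0 <= x <= 2.0:
        return 0.0
    return x * (2 - x) ** 2 * (4 + x) * (1 - x) * math.exp(x - 0.5)

n = 200_000
sup_h_prime = max(abs(h_prime(2.0 * i / n)) for i in range(n + 1))
coeff = 2.0 / math.e * sup_h_prime        # the factor 2.0644... in (A.38)

def eta_plus_sup_bound(H):
    """Right-hand side of (A.38); meant for H >= e."""
    return 1 + 2.06440727 * (1 + 4 / math.pi * math.log(H)) / H
```

For instance, `eta_plus_sup_bound(200)` gives an upper bound on $|\eta_+|_\infty$ for $H = 200$; the bound tends to $1$ as $H$ grows.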

We will need three other bounds of this kind, namely, for $\eta_+(t)\log t$, $\eta_+(t)/t$ and $\eta_+(t)t$. We start as in (A.34):

\[
\begin{aligned}
|\eta_+\log t|_\infty &\le |\eta_\circ\log t|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t)\log t|_\infty \\
&\le |\eta_\circ\log t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)t\log t|_\infty, \\
|\eta_+(t)/t|_\infty &\le |\eta_\circ(t)/t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)|_\infty, \\
|\eta_+(t)t|_\infty &\le |\eta_\circ(t)t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)t^2|_\infty.
\end{aligned} \tag{A.39}
\]

By the bisection method with 30 iterations, implemented with interval arithmetic,

\[
|\eta_\circ(t)\log t|_\infty \le 0.279491,\qquad |\eta_\diamondsuit(t)t\log t|_\infty \le 0.3811561.
\]

Hence, by (A.36) and (A.37),

\[
|\eta_+\log t|_\infty \le 0.279491 + 1.069456\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.40}
\]

By the bisection method with 32 iterations,

\[
|\eta_\circ(t)/t|_\infty \le 1.08754396.
\]

(We can also obtain this by solving $(\eta_\circ(t)/t)' = 0$ symbolically.) It is easy to show that $|\eta_\diamondsuit|_\infty = 1/\sqrt{e}$. Hence, again by (A.36) and (A.37),

\[
|\eta_+(t)/t|_\infty \le 1.08754396 + 1.70181609\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.41}
\]

By the bisection method with 32 iterations,

\[
|\eta_\circ(t)t|_\infty \le 1.06473476.
\]

Taking derivatives, we see that $|\eta_\diamondsuit(t)t^2|_\infty = 3^{3/2}e^{-3/2}$. Hence, yet again by (A.36) and (A.37),

\[
|\eta_+(t)t|_\infty \le 1.06473476 + 3.25312\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.42}
\]

Appendix B

Norms of Fourier transforms

B.1 The Fourier transform of $\eta_2''$

Our aim here is to give upper bounds on $|\widehat{\eta_2''}|_\infty$, where $\eta_2$ is as in (3.4). We will do considerably better than the trivial bound $|\widehat{\eta''}|_\infty \le |\eta''|_1$.

Lemma B.1.1. For every $t\in\mathbb{R}$,

\[
|4e(-t/4) - 4e(-t/2) + e(-t)| \le 7.87052. \tag{B.1}
\]

We will describe an extremely simple, but rigorous, procedure to find the maximum. Since $|g(t)|^2$ is $C^2$ (in fact smooth), there are several more efficient and equally rigorous algorithms; for starters, the bisection method with error bounded in terms of $|(|g|^2)''|_\infty$.

Proof. Let

\[
g(t) = 4e(-t/4) - 4e(-t/2) + e(-t). \tag{B.2}
\]

For $a\le t\le b$,

\[
g(t) = g(a) + \frac{t-a}{b-a}(g(b) - g(a)) + \frac{1}{8}(b-a)^2\cdot O^*\left(\max_{v\in[a,b]} |g''(v)|\right). \tag{B.3}
\]

(This formula, in all likelihood well-known, is easy to derive. First, we can assume without loss of generality that $a = 0$, $b = 1$ and $g(a) = g(b) = 0$. Dividing $g$ by $g(t)$, we see that we can also assume that $g(t)$ is real (and in fact 1). We can also assume that $g$ is real-valued, in that it will be enough to prove (B.3) for the real-valued function $\Re g$, as this will give us the bound $g(t) = \Re g(t) \le (1/8)\max_v |(\Re g)''(v)| \le \max_v |g''(v)|$ that we wish for. Lastly, we can assume (by symmetry) that $0\le t\le 1/2$, and that $g$ has a local maximum or minimum at $t$. Writing $M = \max_{u\in[0,1]} |g''(u)|$, we then have

\[
\begin{aligned}
g(t) &= \int_0^t g'(v)\,dv = \int_0^t\int_t^v g''(u)\,du\,dv = O^*\left( \int_0^t \left| \int_t^v M\,du \right| dv \right) \\
&= O^*\left( \int_0^t (v-t)M\,dv \right) = O^*\left( \frac{1}{2}t^2M \right) = O^*\left( \frac{1}{8}M \right),
\end{aligned}
\]

as desired.)

We obtain immediately from (B.3) that

\[
\max_{t\in[a,b]} |g(t)| \le \max(|g(a)|, |g(b)|) + \frac{1}{8}(b-a)^2\cdot\max_{v\in[a,b]} |g''(v)|. \tag{B.4}
\]

For any $v\in\mathbb{R}$,

\[
|g''(v)| \le \left(\frac{\pi}{2}\right)^2\cdot 4 + \pi^2\cdot 4 + (2\pi)^2 = 9\pi^2. \tag{B.5}
\]

Clearly $g(t)$ depends only on $t \bmod 4\pi$. Hence, by (B.4) and (B.5), to estimate

\[
\max_{t\in\mathbb{R}} |g(t)|
\]

with an error of at most $\epsilon$, it is enough to subdivide $[0, 4\pi]$ into intervals of length $\le \sqrt{8\epsilon/9\pi^2}$ each. We set $\epsilon = 10^{-6}$ and compute.
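The procedure in the proof of Lemma B.1.1 can be sketched as follows, assuming the usual convention $e(t) = e^{2\pi it}$ (not restated in this appendix) and replacing interval arithmetic by plain floating point. The step is chosen so that the interpolation error in (B.4), with the second-derivative bound (B.5), is at most $\epsilon$.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2 pi i t); the standard convention, assumed here."""
    return cmath.exp(2j * math.pi * t)

def g(t):
    return 4 * e(-t / 4) - 4 * e(-t / 2) + e(-t)

eps = 1e-6
# (1/8) * step^2 * 9 pi^2 <= eps, as required by (B.4) and (B.5).
step = math.sqrt(8 * eps / (9 * math.pi ** 2))
length = 4 * math.pi                      # range used in the text
n = math.ceil(length / step)

sample_max = max(abs(g(length * i / n)) for i in range(n + 1))
upper_bound = sample_max + eps            # non-rigorous analogue of (B.4)
```

Since the samples cover the interval with spacing at most `step`, `upper_bound` dominates the true maximum up to floating-point rounding, which is the gap that interval arithmetic closes in the actual proof.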

Lemma B.1.2. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

\[
|\widehat{\eta_2''}|_\infty \le 31.521. \tag{B.6}
\]

This should be compared with $|\eta_2''|_1 = 48$.

Proof. We can write

\[
\eta_2''(x) = 4(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + f(x), \tag{B.7}
\]

where $\delta_{x_0}$ is the point measure at $x_0$ of mass 1 (Dirac delta function) and

\[
f(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x \ge 1, \\ -4x^{-2} & \text{if } 1/4 \le x < 1/2, \\ 4x^{-2} & \text{if } 1/2 \le x < 1. \end{cases}
\]

Thus $\widehat{\eta_2''}(t) = 4g(t) + \widehat{f}(t)$, where $g$ is as in (B.2). It is easy to see that $|f'|_1 = 2\max_x f(x) - 2\min_x f(x) = 160$. Therefore,

\[
\left| \widehat{f}(t) \right| = \left| \widehat{f'}(t)/(2\pi it) \right| \le \frac{|f'|_1}{2\pi|t|} = \frac{80}{\pi|t|}. \tag{B.8}
\]

Since $31.521 - 4\cdot 7.87052 = 0.03892$, we conclude that (B.6) follows from Lemma B.1.1 and (B.8) for $|t| \ge 655 > 80/(\pi\cdot 0.03892)$.

It remains to check the range $t\in(-655, 655)$; since $4g(-t) + \widehat{f}(-t)$ is the complex conjugate of $4g(t) + \widehat{f}(t)$, it suffices to consider $t$ non-negative. We use (B.4) (with $4g + \widehat{f}$ instead of $g$) and obtain that, to estimate $\max_{t\in\mathbb{R}} |4g(t) + \widehat{f}(t)|$ with an error of at most $\epsilon$, it is enough to subdivide $[0, 655)$ into intervals of length $\le \sqrt{2\epsilon/|(4g+\widehat{f})''|_\infty}$ each, and check $|4g(t) + \widehat{f}(t)|$ at the endpoints. Now, for every $t\in\mathbb{R}$,

\[
\left| (\widehat{f})''(t) \right| = \left| (-2\pi i)^2\,\widehat{x^2f}(t) \right| = (2\pi)^2\cdot O^*\left(|x^2f|_1\right) = 12\pi^2.
\]

By this and (B.5), $|(4g + \widehat{f})''|_\infty \le 48\pi^2$. Thus, intervals of length $\delta_1$ give an error term of size at most $24\pi^2\delta_1^2$. We choose $\delta_1 = 0.001$ and obtain an error term less than $0.000237$ for this stage.

To evaluate $\widehat{f}(t)$ (and hence $4g(t) + \widehat{f}(t)$) at a point, we integrate using Simpson's rule on subdivisions of the intervals $[1/4, 1/2]$, $[1/2, 1]$ into $200\cdot\max(1, \lfloor\sqrt{|t|}\rfloor)$ sub-intervals each.$^1$ The largest value of $|4g(t) + \widehat{f}(t)|$ we find is $31.52065\ldots$, with an error term of at most $4.5\cdot 10^{-5}$.

B.2 Bounds involving a logarithmic factor

Our aim now is to give upper bounds on $|\widehat{\eta_{(y)}''}|_\infty$, where $\eta_{(y)}(t) = \log(yt)\eta_2(t)$ and $y\ge 4$.

Lemma B.2.1. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = \log(yt)\eta_2(t)$, where $y\ge 4$. Then

\[
|\eta_{(y)}'|_1 < (\log y)|\eta_2'|_1. \tag{B.9}
\]

Proof. Recall that $\mathrm{supp}(\eta_2) = (1/4, 1)$. For $t\in(1/4, 1/2)$,

\[
\eta_{(y)}'(t) = (4\log(yt)\log 4t)' = \frac{4\log 4t}{t} + \frac{4\log yt}{t} \ge \frac{8\log 4t}{t} > 0,
\]

whereas, for $t\in(1/2, 1)$,

\[
\eta_{(y)}'(t) = (-4\log(yt)\log t)' = -\frac{4\log yt}{t} - \frac{4\log t}{t} = -\frac{4\log yt^2}{t} < 0,
\]

where we are using the fact that $y\ge 4$. Hence $\eta_{(y)}(t)$ is increasing on $(1/4, 1/2)$ and decreasing on $(1/2, 1)$; it is also continuous at $t = 1/2$. Hence $|\eta_{(y)}'|_1 = 2|\eta_{(y)}(1/2)|$. We are done by

\[
2|\eta_{(y)}(1/2)| = 2\log\frac{y}{2}\cdot\eta_2(1/2) = \log\frac{y}{2}\cdot 8\log 2 < \log y\cdot 8\log 2 = (\log y)|\eta_2'|_1.
\]

Lemma B.2.2. Let $y\ge 4$. Let $g(t) = 4e(-t/4) - 4e(-t/2) + e(-t)$ and $k(t) = 2e(-t/4) - e(-t/2)$. Then, for every $t\in\mathbb{R}$,

\[
|g(t)\cdot\log y - k(t)\cdot 4\log 2| \le 7.87052\log y. \tag{B.10}
\]

Proof. By Lemma B.1.1, $|g(t)| \le 7.87052$. Since $y\ge 4$, $|k(t)|\cdot(4\log 2)/\log y \le 6$. For any complex numbers $z_1$, $z_2$ with $|z_1|, |z_2| \le \ell$, we can have $|z_1 - z_2| > \ell$ only if $|\arg(z_1/z_2)| > \pi/3$. It is easy to check that, for all $t\in[-2, 2]$,

\[
\left| \arg\left( \frac{g(t)\cdot\log y}{4\log 2\cdot k(t)} \right) \right| = \left| \arg\left( \frac{g(t)}{k(t)} \right) \right| < 0.7 < \frac{\pi}{3}.
\]

(It is possible to bound maxima rigorously as in (B.4).) Hence (B.10) holds.

$^1$ As usual, the code uses interval arithmetic (§2.6).

Lemma B.2.3. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = (\log yt)\eta_2(t)$, where $y\ge 4$. Then

\[
|\widehat{\eta_{(y)}''}|_\infty < 31.521\cdot\log y. \tag{B.11}
\]

Proof. Clearly

\[
\begin{aligned}
\eta_{(y)}''(x) &= \eta_2''(x)(\log y) + \left( (\log x)\eta_2''(x) + \frac{2}{x}\eta_2'(x) - \frac{1}{x^2}\eta_2(x) \right) \\
&= \eta_2''(x)(\log y) + 4(\log x)(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + h(x),
\end{aligned}
\]

where

\[
h(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x > 1, \\ \frac{4}{x^2}(2 - 2\log 2x) & \text{if } 1/4 \le x < 1/2, \\ \frac{4}{x^2}(-2 + 2\log x) & \text{if } 1/2 \le x < 1. \end{cases}
\]

(Here we are using the expression (B.7) for $\eta_2''(x)$.) Hence

\[
\widehat{\eta_{(y)}''}(t) = (4g(t) + \widehat{f}(t))(\log y) + (-16\log 2\cdot k(t) + \widehat{h}(t)), \tag{B.12}
\]

where $k(t) = 2e(-t/4) - e(-t/2)$. Just as in the proof of Lemma B.1.2,

\[
|\widehat{f}(t)| \le \frac{|f'|_1}{2\pi|t|} \le \frac{80}{\pi|t|},\qquad |\widehat{h}(t)| \le \frac{160(1+\log 2)}{\pi|t|}. \tag{B.13}
\]

Again as before, this implies that (B.11) holds for

\[
|t| \ge \frac{1}{\pi\cdot 0.03892}\left( 80 + \frac{160(1+\log 2)}{(\log 4)} \right) = 2252.51.
\]

Note also that it is enough to check (B.11) for $t\ge 0$, by symmetry. Our remaining task is to prove (B.11) for $0\le t\le 2252.21$.

Let $I = [0.3, 2252.21] \setminus [3.25, 3.65]$. For $t\in I$, we will have

\[
\arg\left( \frac{4g(t) + \widehat{f}(t)}{-16\log 2\cdot k(t) + \widehat{h}(t)} \right) \subset \left( -\frac{\pi}{3}, \frac{\pi}{3} \right). \tag{B.14}
\]

(This is actually true for $0\le t\le 0.3$ as well, but we will use a different strategy in that range in order to better control error terms.) Consequently, by Lemma B.1.2 and $\log y \ge \log 4$,

\[
\begin{aligned}
|\widehat{\eta_{(y)}''}(t)| &< \max\left( |4g(t) + \widehat{f}(t)|\cdot(\log y),\ |16\log 2\cdot k(t) - \widehat{h}(t)| \right) \\
&< \max(31.521(\log y),\ |48\log 2 + 25|) = 31.521\log y,
\end{aligned}
\]

where we bound $\widehat{h}(t)$ by (B.13) and by a numerical computation of the maximum of $|\widehat{h}(t)|$ for $0\le t\le 4$ as in the proof of Lemma B.1.2.

It remains to check (B.14). Here, as in the proof of Lemma B.2.2, the allowable error is relatively large (the expression on the left of (B.14) is actually contained in $(-1, 1)$ for $t\in I$). We decide to evaluate the argument in (B.14) at all $t\in 0.005\mathbb{Z}\cap I$, computing $\widehat{f}(t)$ and $\widehat{h}(t)$ by numerical integration (Simpson's rule) with a subdivision of $[-1/4, 1]$ into 5000 intervals. Proceeding as in the proof of Lemma B.1.1, we see that the sampling induces an error of at most

\[
\frac{1}{2}\,0.005^2\max_{v\in I}\left( 4|g''(v)| + |(\widehat{f})''(t)| \right) \le \frac{0.0001}{8}\,48\pi^2 < 0.00593 \tag{B.15}
\]

in the evaluation of $4g(t) + \widehat{f}(t)$, and an error of at most

\[
\frac{1}{2}\,0.005^2\max_{v\in I}\left( 16\log 2\cdot|k''(v)| + |(\widehat{h})''(t)| \right) \le \frac{0.0001}{8}\left( 16\log 2\cdot 6\pi^2 + 24\pi^2\cdot(2 - \log 2) \right) < 0.0121 \tag{B.16}
\]

in the evaluation of $-16\log 2\cdot k(t) + \widehat{h}(t)$.

Running the numerical evaluation just described for $t\in I$, the estimates for the left side of (B.14) at the sample points are at most $0.99134$ in absolute value; the absolute values of the estimates for $4g(t) + \widehat{f}(t)$ are all at least $2.7783$, and the absolute values of the estimates for $-16\log 2\cdot k(t) + \widehat{h}(t)$ are all at least $2.1166$. Numerical integration by Simpson's rule gives errors bounded by $0.17575$ percent. Hence the absolute value of the left side of (B.14) is at most

\[
0.99134 + \arcsin\left( \frac{0.00593}{2.7783} + 0.0017575 \right) + \arcsin\left( \frac{0.0121}{2.1166} + 0.0017575 \right) \le 1.00271 < \frac{\pi}{3}
\]

for $t\in I$.

Lastly, for $t\in[0, 0.3]\cup[3.25, 3.65]$, a numerical computation (samples at $0.001\mathbb{Z}$; interpolation as in Lemma B.1.2; integrals computed by Simpson's rule with a subdivision into 1000 intervals) gives

\[
\max_{t\in[0,0.3]\cup[3.25,3.65]} \left( |4g(t) + \widehat{f}(t)| + \frac{|-16\log 2\cdot k(t) + \widehat{h}(t)|}{\log 4} \right) < 29.08,
\]

and so $\max_{t\in[0,0.3]\cup[3.25,3.65]} |\widehat{\eta_{(y)}''}(t)| < 29.1\log y < 31.521\log y$.

An easy integral gives us that the function $\log\cdot\,\eta_2$ satisfies

\[
|\log\cdot\,\eta_2|_1 = 2 - \log 4. \tag{B.17}
\]

The following function will appear only in a lower-order term; thus, an $\ell_1$ estimate will do.

Lemma B.2.4. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

\[
|(\log\cdot\,\eta_2)''|_1 = 96\log 2. \tag{B.18}
\]

Proof. The function $(\log\cdot\,\eta_2)'(t)$ is $0$ for $t\notin[1/4, 1]$, is increasing and negative for $t\in(1/4, 1/2)$ and is decreasing and positive for $t\in(1/2, 1)$. Hence, counting the jumps at $1/4$ and $1/2$ as part of the total variation,

\[
|(\log\cdot\,\eta_2)''|_1 = 2\left( (\log\cdot\,\eta_2)'\left(\frac{1}{2}\right) - (\log\cdot\,\eta_2)'\left(\frac{1}{4}\right) \right) = 2(16\log 2 - (-32\log 2)) = 96\log 2,
\]

where the derivative at $1/2$ is taken from the right and the derivative at $1/4$ from the right as well.

Appendix C

Sums involving $\Lambda$ and $\phi$

C.1 Sums over primes

Here we treat some sums of the type $\sum_n \Lambda(n)\varphi(n)$, where $\varphi$ has compact support. Since the sums are over all integers (not just an arithmetic progression) and there is no phase $e(\alpha n)$ involved, the treatment is relatively straightforward.

The following is standard.

Lemma C.1.1 (Explicit formula). Let $\varphi : [1, \infty) \to \mathbb{C}$ be continuous and piecewise $C^1$ with $\varphi''\in\ell_1$; let it also be of compact support contained in $[1, \infty)$. Then

\[
\sum_n \Lambda(n)\varphi(n) = \int_1^\infty \left( 1 - \frac{1}{x(x^2-1)} \right)\varphi(x)\,dx - \sum_\rho (M\varphi)(\rho), \tag{C.1}
\]

where $\rho$ runs over the non-trivial zeros of $\zeta(s)$.

The non-trivial zeros of $\zeta(s)$ are, of course, those in the critical strip $0 < \Re(s) < 1$.

Remark. Lemma C.1.1 appears as exercise 5 in [IK04, §5.5]; the condition there that $\varphi$ be smooth can be relaxed, since already the weaker assumption that $\varphi''$ be in $L^1$ implies that the Mellin transform $(M\varphi)(\sigma+it)$ decays quadratically on $t$ as $t\to\infty$, thereby guaranteeing that the sum $\sum_\rho (M\varphi)(\rho)$ converges absolutely.

Lemma C.1.2. Let $x\ge 10$. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$ lie on the critical line. Then

\[
\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = x + O^*\left( 0.135x^{1/2} + \frac{9.7}{x^2} \right) + \frac{\log\frac{eT_0}{2\pi}}{T_0}\left( \frac{9/4}{2\pi} + \frac{6.03}{T_0} \right)x. \tag{C.2}
\]

In particular, with $T_0 = 3.061\cdot 10^{10}$ in the assumption, we have, for $x\ge 2000$,

\[
\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}),
\]

where $\epsilon = 2.73\cdot 10^{-10}$.

The assumption that all non-trivial zeros up to $T_0 = 3.061\cdot 10^{10}$ lie on the critical line was proven rigorously in [Plaa]; higher values of $T_0$ have been reached elsewhere ([Wed03], [GD04]).
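The value of $\epsilon$ quoted in Lemma C.1.2 is just the coefficient of $x$ in the last term of (C.2) evaluated at $T_0 = 3.061\cdot 10^{10}$. A minimal sketch of that arithmetic follows; the function name `epsilon` is ours, and the formula assumes the logarithm in (C.2) is $\log(eT_0/2\pi)$, as in the proof.

```python
import math

def epsilon(T0):
    """Coefficient of x in the last term of (C.2)."""
    return (math.log(math.e * T0 / (2 * math.pi)) / T0
            * ((9 / 4) / (2 * math.pi) + 6.03 / T0))

eps_platt = epsilon(3.061e10)
```

The second summand $6.03/T_0$ is negligible at this height; the size of $\epsilon$ is governed by $\log(eT_0/2\pi)/T_0$.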

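Proof. By Lemma C.1.1,

\[
\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = \int_1^\infty \eta_2\left(\frac{t}{x}\right)dt - \int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt - \sum_\rho (M\varphi)(\rho),
\]

where $\varphi(u) = \eta_2(u/x)$ and $\rho$ runs over all non-trivial zeros of $\zeta(s)$. Since $\eta_2$ is non-negative, $\int_1^\infty \eta_2(t/x)\,dt = x|\eta_2|_1 = x$, while

\[
\int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt = O^*\left( \int_{1/4}^{1} \frac{\eta_2(t)}{tx^2(t^2 - 1/100)}\,dt \right) = O^*\left( \frac{9.61114}{x^2} \right).
\]

By (2.11),

\[
\sum_\rho (M\varphi)(\rho) = \sum_\rho M\eta_2(\rho)\cdot x^\rho = \sum_\rho \left( \frac{1 - 2^{-\rho}}{\rho} \right)^2 x^\rho = S_1(x) - 2S_1(x/2) + S_1(x/4),
\]

where

\[
S_m(x) = \sum_\rho \frac{x^\rho}{\rho^{m+1}}. \tag{C.3}
\]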

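Setting aside the contribution of all $\rho$ with $|\Im(\rho)| \le T_0$ and all $\rho$ with $|\Im(\rho)| > T_0$ and $\Re(\rho) \le 1/2$, and using the symmetry provided by the functional equation, we obtain

\[
\begin{aligned}
|S_m(x)| &\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + x\cdot\sum_{\substack{\rho \\ |\Im(\rho)| > T_0 \\ |\Re(\rho)| > 1/2}} \frac{1}{|\rho|^{m+1}} \\
&\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + \frac{x}{2}\cdot\sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} \frac{1}{|\rho|^{m+1}}.
\end{aligned}
\]

We bound the first sum by [Ros41, Lemma 17] and the second sum by [RS03, Lemma 2]. We obtain

\[
|S_m(x)| \le \left( \frac{1}{2m\pi T_0^m} + \frac{2.68}{T_0^{m+1}} \right)x\log\frac{eT_0}{2\pi} + \kappa_m x^{1/2}, \tag{C.4}
\]

where $\kappa_1 = 0.0463$, $\kappa_2 = 0.00167$ and $\kappa_3 = 0.0000744$. Hence

\[
\left| \sum_\rho (M\eta)(\rho)\cdot x^\rho \right| \le \left( \frac{1}{2\pi T_0} + \frac{2.68}{T_0^2} \right)\frac{9x}{4}\log\frac{eT_0}{2\pi} + \left( \frac{3}{2} + \sqrt{2} \right)\kappa_1 x^{1/2}.
\]

For $T_0 = 3.061\cdot 10^{10}$ and $x\ge 2000$, we obtain

\[
\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}),
\]

where $\epsilon = 2.73\cdot 10^{-10}$.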

Corollary C.1.3. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$, $T_0 = 3.061\cdot 10^{10}$, lie on the critical line. Then, for all $x\ge 1$,

\[
\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) \le \min\left( (1+\epsilon)x + 0.2x^{1/2},\ 1.04488x \right), \tag{C.5}
\]

where $\epsilon = 2.73\cdot 10^{-10}$.

Proof. Immediate from Lemma C.1.2 for $x\ge 2000$. For $x < 2000$, we use computation as follows. Since $|\eta_2'|_\infty = 16$ and $\sum_{x/4\le n\le x} \Lambda(n) \le x$ for all $x\ge 0$, computing $\sum_{n\le x} \Lambda(n)\eta_2(n/x)$ only for $x\in(1/1000)\mathbb{Z}\cap[0, 2000]$ results in an inaccuracy of at most $(16\cdot 0.0005/0.9995)x \le 0.00801x$. This resolves the matter at all points outside $(205, 207)$ (for the first estimate) or outside $(9.5, 10.5)$ and $(13.5, 14.5)$ (for the second estimate). In those intervals, the prime powers $n$ involved do not change (since whether $x/4 < n \le x$ depends only on $n$ and $[x]$), and thus we can find the maximum of the sum in (C.5) just by taking derivatives.
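The kind of computation used for $x < 2000$ can be illustrated on a very small scale. The sketch below assumes $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, which is the shape consistent with (B.7) and the proof of Lemma B.2.1 (the definition (1.17) itself is not reproduced in this appendix), and evaluates the smoothed sum at a couple of points. It is a spot check in floating point, not the grid-plus-derivative argument of the proof.

```python
import math

def von_mangoldt_table(n):
    """lam[k] = Lambda(k) for 0 <= k <= n (log p at prime powers, else 0)."""
    lam = [0.0] * (n + 1)
    composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if not composite[p]:
            for m in range(p * p, n + 1, p):
                composite[m] = 1
            pk = p
            while pk <= n:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def eta2(t):
    """Assumed form of eta_2: supported on [1/4, 1], with total mass 1."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

def smoothed_sum(x, lam):
    """sum_n Lambda(n) eta_2(n / x); only x/4 <= n <= x contributes."""
    lo, hi = max(1, math.ceil(x / 4)), math.floor(x)
    return sum(lam[n] * eta2(n / x) for n in range(lo, hi + 1))

lam = von_mangoldt_table(1000)
s10 = smoothed_sum(10, lam)
s1000 = smoothed_sum(1000, lam)
```

At $x = 10$ the sum is already close to $1.04488x$, which is in line with the proof singling out the interval $(9.5, 10.5)$ for the second estimate.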

C.2 Sums involving $\phi$

We need estimates for several sums involving $\phi(q)$ in the denominator.

The easiest are convergent sums, such as $\sum_q \mu^2(q)/(\phi(q)q)$. We can express this as $\prod_p (1 + 1/(p(p-1)))$. This is a convergent product, and the main task is to bound a tail: for $r$ an integer,

\[
\log\prod_{p>r}\left( 1 + \frac{1}{p(p-1)} \right) \le \sum_{p>r} \frac{1}{p(p-1)} \le \sum_{n>r} \frac{1}{n(n-1)} = \frac{1}{r}. \tag{C.6}
\]

A quick computation$^1$ now suffices to give

\[
2.591461 \le \sum_q \frac{\gcd(q,2)\mu^2(q)}{\phi(q)q} < 2.591463 \tag{C.7}
\]

and so

\[
1.295730 \le \sum_{q \text{ odd}} \frac{\mu^2(q)}{\phi(q)q} < 1.295732, \tag{C.8}
\]

since the expression bounded in (C.8) is exactly half of that bounded in (C.7).

$^1$ Using D. Platt's integer arithmetic package.
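The "easy tail bound followed by a computation" behind (C.8) can be sketched in a few lines. The code below multiplies the Euler factors over odd primes up to a cutoff $R$ and uses (C.6) to bound the remaining tail by a factor $e^{1/R}$. It uses floating point, not the integer or interval arithmetic of the actual computation, and the cutoff $R = 2\cdot 10^6$ is our own choice.

```python
import math

def odd_prime_product(R):
    """Product of (1 + 1/(p(p-1))) over odd primes p <= R."""
    sieve = bytearray([1]) * (R + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(R) + 1):
        if sieve[p]:
            # Cross out multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, R + 1, p)))
    prod = 1.0
    for p in range(3, R + 1, 2):
        if sieve[p]:
            prod *= 1.0 + 1.0 / (p * (p - 1))
    return prod

R = 2_000_000
lower = odd_prime_product(R)          # partial product: a lower bound
upper = lower * math.exp(1.0 / R)     # tail bounded via (C.6)
```

The interval `[lower, upper]` then brackets $\sum_{q \text{ odd}} \mu^2(q)/(\phi(q)q)$; doubling both ends gives the analogue of (C.7).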

Again using (C.6), we get that

\[
2.826419 \le \sum_q \frac{\mu^2(q)}{\phi(q)^2} < 2.826421. \tag{C.9}
\]

In what follows, we will use values for convergent sums obtained in much the same way: an easy tail bound followed by a computation.

By [Ram95, Lemma 3.4],

\[
\begin{aligned}
\sum_{q\le r} \frac{\mu^2(q)}{\phi(q)} &= \log r + c_E + O^*(7.284r^{-1/3}), \\
\sum_{\substack{q\le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} &= \frac{1}{2}\left( \log r + c_E + \frac{\log 2}{2} \right) + O^*(4.899r^{-1/3}),
\end{aligned} \tag{C.10}
\]

where

\[
c_E = \gamma + \sum_p \frac{\log p}{p(p-1)} = 1.332582275 + O^*(10^{-9}/3)
\]

by [RS62, (2.11)]. As we already said in (12.15), this, supplemented by a computation for $r\le 4\cdot 10^7$, gives

\[
\log r + 1.312 \le \sum_{q\le r} \frac{\mu^2(q)}{\phi(q)} \le \log r + 1.354
\]

for $r\ge 182$. In the same way, we get that

\[
\frac{1}{2}\log r + 0.83 \le \sum_{\substack{q\le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \le \frac{1}{2}\log r + 0.85 \tag{C.11}
\]

for $r\ge 195$. (The numerical verification here goes up to $1.38\cdot 10^8$; for $r > 3.18\cdot 10^8$, use C.11.)

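Clearly

\[
\sum_{\substack{q\le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} = \sum_{\substack{q\le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}. \tag{C.12}
\]

We wish to obtain bounds for the sums

\[
\sum_{q\ge r} \frac{\mu^2(q)}{\phi(q)^2},\qquad \sum_{\substack{q\ge r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2},\qquad \sum_{\substack{q\ge r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2},
\]

where $N\in\mathbb{Z}^+$ and $r\ge 1$. To do this, it will be helpful to express some of the quantities within these sums as convolutions.$^2$ For $q$ squarefree and $j\ge 1$,

\[
\frac{\mu^2(q)q^{j-1}}{\phi(q)^j} = \sum_{ab=q} \frac{f_j(b)}{a}, \tag{C.13}
\]

$^2$ The author would like to thank O. Ramaré for teaching him this technique.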

where $f_j$ is the multiplicative function defined by

\[
f_j(p) = \frac{p^j - (p-1)^j}{(p-1)^jp},\qquad f_j(p^k) = 0 \text{ for } k\ge 2.
\]

We will also find the following estimate useful.

Lemma C.2.1. Let $j\ge 2$ be an integer and $A$ a positive real. Let $m\ge 1$ be an integer. Then

\[
\sum_{\substack{a\ge A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} \le \frac{\zeta(j)/\zeta(2j)}{A^{j-1}}\cdot\prod_{p|m}\left( 1 + \frac{1}{p^j} \right)^{-1}. \tag{C.14}
\]

It is useful to note that $\zeta(2)/\zeta(4) = 15/\pi^2 = 1.519817\ldots$ and $\zeta(3)/\zeta(6) = 1.181564\ldots$.

Proof. The right side of (C.14) decreases as $A$ increases, while the left side depends only on $\lceil A\rceil$. Hence, it is enough to prove (C.14) when $A$ is an integer.

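For $A = 1$, (C.14) is an equality. Let

\[
C = \frac{\zeta(j)}{\zeta(2j)}\cdot\prod_{p|m}\left( 1 + \frac{1}{p^j} \right)^{-1}.
\]

Let $A\ge 2$. Since

\[
\sum_{\substack{a\ge A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} = C - \sum_{\substack{a<A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j}
\]

and

\[
C = \sum_{\substack{a \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} < \sum_{\substack{a<A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \int_A^\infty \frac{1}{t^j}\,dt = \sum_{\substack{a<A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \frac{1}{(j-1)A^{j-1}},
\]

we obtain

\[
\begin{aligned}
\sum_{\substack{a\ge A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} &= \frac{1}{A^{j-1}}\cdot C + \frac{A^{j-1}-1}{A^{j-1}}\cdot C - \sum_{\substack{a<A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} \\
&< \frac{C}{A^{j-1}} + \frac{A^{j-1}-1}{A^{j-1}}\cdot\left( \frac{1}{A^j} + \frac{1}{(j-1)A^{j-1}} \right) - \frac{1}{A^{j-1}}\sum_{\substack{a<A \\ (a,m)=1}} \frac{\mu^2(a)}{a^j} \\
&\le \frac{C}{A^{j-1}} + \frac{1}{A^{j-1}}\left( \left( 1 - \frac{1}{A^{j-1}} \right)\left( \frac{1}{A} + \frac{1}{j-1} \right) - 1 \right).
\end{aligned}
\]

Since $(1 - 1/A)(1/A + 1) < 1$ and $1/A + 1/(j-1) \le 1$ for $j\ge 3$, we obtain that

\[
\left( 1 - \frac{1}{A^{j-1}} \right)\left( \frac{1}{A} + \frac{1}{j-1} \right) < 1
\]

for all integers $j\ge 2$, and so the statement follows.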
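Lemma C.2.1 lends itself to a direct numerical spot check for small parameters. The sketch below takes $j = 2$, where $\zeta(2)/\zeta(4) = 15/\pi^2$ exactly, computes the left side of (C.14) as $C$ minus a partial sum, and records the largest value of (left side) minus (right side) over a range of integer $A$. The function names are illustrative, and the check is in floating point.

```python
import math

def is_squarefree(a):
    d = 2
    while d * d <= a:
        if a % (d * d) == 0:
            return False
        d += 1
    return True

def euler_factor(m, j):
    """prod over primes p | m of (1 + p^{-j})^{-1}."""
    out, p = 1.0, 2
    while m > 1:
        if m % p == 0:
            out /= 1.0 + p ** (-j)
            while m % p == 0:
                m //= p
        p += 1
    return out

def worst_gap(j, m, A_max, zeta_ratio):
    """max over integers 1 <= A <= A_max of LHS(C.14) - RHS(C.14)."""
    C = zeta_ratio * euler_factor(m, j)
    partial, worst = 0.0, -math.inf
    for A in range(1, A_max + 1):
        left = C - partial                 # sum over a >= A, (a, m) = 1
        right = C / A ** (j - 1)
        worst = max(worst, left - right)
        if math.gcd(A, m) == 1 and is_squarefree(A):
            partial += A ** (-j)
    return worst

ratio2 = 15 / math.pi ** 2                 # zeta(2) / zeta(4)
gap_m1 = worst_gap(2, 1, 300, ratio2)
gap_m2 = worst_gap(2, 2, 300, ratio2)
```

The largest gap is $0$, attained at $A = 1$, which matches the remark in the proof that (C.14) is an equality there.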

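We now obtain easily the estimates we want: by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 1$),

\[
\begin{aligned}
\sum_{q\ge r} \frac{\mu^2(q)}{\phi(q)^2} &= \sum_{q\ge r}\sum_{ab=q} \frac{f_2(b)}{a}\frac{\mu^2(q)}{q} \le \sum_{b\ge 1} \frac{f_2(b)}{b}\sum_{a\ge r/b} \frac{\mu^2(a)}{a^2} \\
&\le \frac{\zeta(2)/\zeta(4)}{r}\sum_{b\ge 1} f_2(b) = \frac{15/\pi^2}{r}\prod_p\left( 1 + \frac{2p-1}{(p-1)^2p} \right) \le \frac{6.7345}{r}.
\end{aligned} \tag{C.15}
\]

Similarly, by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 2$),

\[
\begin{aligned}
\sum_{\substack{q\ge r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} &= \sum_{\substack{b\ge 1 \\ b \text{ odd}}} \frac{f_2(b)}{b}\sum_{\substack{a\ge r/b \\ a \text{ odd}}} \frac{\mu^2(a)}{a^2} \le \frac{\zeta(2)/\zeta(4)}{1 + 1/2^2}\,\frac{1}{r}\sum_{b \text{ odd}} f_2(b) \\
&= \frac{12}{\pi^2}\,\frac{1}{r}\prod_{p>2}\left( 1 + \frac{2p-1}{(p-1)^2p} \right) \le \frac{2.15502}{r},
\end{aligned} \tag{C.16}
\]

\[
\sum_{\substack{q\ge r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2} = \sum_{\substack{q\ge r/2 \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} \le \frac{4.31004}{r}. \tag{C.17}
\]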

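Lastly,

\[
\begin{aligned}
\sum_{\substack{q\le r \\ q \text{ odd}}} \frac{\mu^2(q)q}{\phi(q)} &= \sum_{\substack{q\le r \\ q \text{ odd}}} \mu^2(q)\sum_{d|q} \frac{1}{\phi(d)} = \sum_{\substack{d\le r \\ d \text{ odd}}} \frac{1}{\phi(d)}\sum_{\substack{q\le r \\ d|q \\ q \text{ odd}}} \mu^2(q) \le \sum_{\substack{d\le r \\ d \text{ odd}}} \frac{1}{2\phi(d)}\left( \frac{r}{d} + 1 \right) \\
&\le \frac{r}{2}\sum_{d \text{ odd}} \frac{1}{\phi(d)d} + \frac{1}{2}\sum_{\substack{d\le r \\ d \text{ odd}}} \frac{1}{\phi(d)} \le 0.64787r + \frac{\log r}{4} + 0.425,
\end{aligned} \tag{C.18}
\]

where we are using (C.8) and (C.11).

Since we are on the subject of $\phi(q)$, let us also prove a simple lemma that we use at various points in the text to bound $q/\phi(q)$.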

Lemma C.2.2. For any $q\ge 1$ and any $r\ge\max(3, q)$,

\[
\frac{q}{\phi(q)} < z(r),
\]

where

\[
z(r) = e^\gamma\log\log r + \frac{2.50637}{\log\log r}. \tag{C.19}
\]

Proof. Since $z(r)$ is increasing for $r\ge 27$, the statement follows immediately for $q\ge 27$ by [RS62, Thm. 15]:

\[
\frac{q}{\phi(q)} < z(q) \le z(r).
\]

For $q < 27$, it is clear that $q/\phi(q) \le 2\cdot 3/(1\cdot 2) = 3$. By the arithmetic/geometric mean inequality, $z(t) \ge 2\sqrt{e^\gamma\cdot 2.50637} > 3$ for all $t > e$, and so the lemma holds for $q < 27$.
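Lemma C.2.2 is easy to test for small $q$. The sketch below computes $\phi(q)$ for $q\le 10^4$ with a standard totient sieve and compares $q/\phi(q)$ with $z(\max(3, q))$, the weakest admissible case of the lemma for $q\ge 27$ by monotonicity; the range $10^4$ and all names are our own choices for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329

def totients(n):
    """phi[k] = Euler's phi(k) for 0 <= k <= n, via a sieve."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:                    # p untouched so far, hence prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p
    return phi

def z(r):
    """z(r) as in (C.19); requires r > e so that log log r > 0."""
    ll = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll

Q = 10_000
phi = totients(Q)
worst_ratio = max(q / phi[q] / z(max(3, q)) for q in range(1, Q + 1))
```

A value of `worst_ratio` below $1$ means that $q/\phi(q) < z(\max(3,q))$ for every $q$ in the range.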

304 APPENDIX C SUMS INVOLVING Λ AND φ

Appendix D

Checking small n by checking zeros of ζ(s)

In order to show that every odd number $n \le N$ is the sum of three primes, it is enough to show for some $M \le N$ that

1. every even integer $4 \le m \le M$ can be written as the sum of two primes,

2. the difference between any two consecutive primes $\le N$ is at most $M - 4$.

(If we want to show that every odd number $n \le N$ is the sum of three odd primes, we just replace $M - 4$ by $M - 6$ in (2).) The best known result of type (1) is that of Oliveira e Silva, Herzog and Pardi ([OeSHP14], $M = 4 \cdot 10^{18}$). As for (2), it was proven in [HP13] for $M = 4 \cdot 10^{18}$ and $N = 8.875694 \cdot 10^{30}$ by a direct computation (valid even if we replace $M - 4$ by $M - 6$ in the statement of (2)).
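To make the reduction concrete, here is a toy-scale Python sketch. It checks conditions (1) and (2) for small, illustrative values $N = 10^4$ and $M = 100$ (our choice, nothing like the ranges of [OeSHP14] or [HP13]) and then writes each odd $7 \le n \le N$ as $p + p' + p''$, where $p$ is the largest prime $\le n - 4$. All function names are hypothetical.

```python
import bisect
import math


def primes_up_to(limit):
    """Return all primes <= limit (sieve of Eratosthenes, limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(limit + 1) if sieve[p]]


def two_primes(m, primes, prime_set):
    """Return (p1, p2) with m = p1 + p2, or None if there is no such pair."""
    for p in primes:
        if p > m // 2:
            return None
        if m - p in prime_set:
            return (p, m - p)
    return None


def conditions_hold(N, M, primes, prime_set):
    """Check conditions (1) and (2) by direct computation."""
    cond1 = all(two_primes(m, primes, prime_set) for m in range(4, M + 1, 2))
    cond2 = all(b - a <= M - 4 for a, b in zip(primes, primes[1:]))
    return cond1 and cond2


def three_primes(n, M, primes, prime_set):
    """Write odd n >= 7 as p + p1 + p2, with p the largest prime <= n - 4."""
    p = primes[bisect.bisect_right(primes, n - 4) - 1]
    m = n - p  # even, since n and p are both odd
    if m > M:
        return None
    pair = two_primes(m, primes, prime_set)
    return (p,) + pair if pair else None


N, M = 10**4, 100  # toy values, chosen for illustration
PRIMES = primes_up_to(N)
PRIME_SET = set(PRIMES)

print(conditions_hold(N, M, PRIMES, PRIME_SET))
print(three_primes(9999, M, PRIMES, PRIME_SET))
```

Taking $p$ as close to $n - 4$ as possible keeps the even remainder $n - p$ small, which is exactly why a bound on prime gaps suffices in (2).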

Alternatively, one can establish results of type (2) by means of numerical verifications of the Riemann hypothesis up to a certain height. This is a classical approach, followed in [RS75] and [Sch76], and later in [RS03]; we will use the version of (1) kindly provided by Ramaré in [Ramd]. We carry out this approach in full here, not because it is preferable to [HP13] (it is still based on computations, and it is slightly more indirect than [HP13]) but simply to show that one can establish what we need by a different route.

A numerical verification of the Riemann hypothesis up to a certain height consists simply in checking that all (non-trivial) zeroes $z$ of the Riemann zeta function up to a height $H$ (meaning $\Im(z) \le H$) lie on the critical line $\Re(z) = 1/2$.

The height up to which the Riemann hypothesis has actually been fully verified is not a matter on which there is unanimity. The strongest claim in the literature is in [GD04], which states that the first $10^{13}$ zeroes of the Riemann zeta function lie on the critical line $\Re(z) = 1/2$. This corresponds to checking the Riemann hypothesis up to height $H = 2.44599 \cdot 10^{12}$. It is unclear whether this computation was or could be easily made rigorous; as pointed out in [SD10, p. 2398], it has not been replicated yet.

Before [GD04], the strongest results were those of the ZetaGrid distributed computing project led by S. Wedeniwski [Wed03]; the method followed in it was more traditional, and should allow rigorous verification involving interval arithmetic. Unfortunately, the results were never formally published. The statement that the ZetaGrid project verified the first $9 \cdot 10^{11}$ zeroes (corresponding to $H = 2.419 \cdot 10^{11}$) is often quoted (e.g., [Bom10, p. 29]); this is the point to which the project had got by the time of Gourdon and Demichel's announcement. Wedeniwski asserts in private communication that the project verified the first $10^{12}$ zeroes, and that the computation was double-checked (by the same method).

The strongest claim prior to ZetaGrid was that of van de Lune ($H = 3.293 \cdot 10^{9}$, first $10^{10}$ zeroes, unpublished). Recently, Platt [Plaa] checked the first $1.1 \cdot 10^{11}$ zeroes ($H = 3.061 \cdot 10^{10}$) rigorously, following a method essentially based on that in [Boo06a]. Note that [Plaa] uses interval arithmetic, which is highly desirable for floating-point computations.
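The correspondence between heights $H$ and numbers of zeroes quoted above can be checked roughly against the main term of the Riemann–von Mangoldt formula, $N(T) \approx \frac{T}{2\pi} \log \frac{T}{2\pi e} + \frac{7}{8}$. The Python sketch below evaluates this main term at the four heights; it ignores the fluctuating term $S(T)$ and so gives only an approximation. The function name and the dictionary labels are ours.

```python
import math


def approx_zero_count(T):
    """Main term of the Riemann-von Mangoldt formula for N(T).

    The oscillating term S(T) and the O(1/T) error are ignored here.
    """
    return T / (2 * math.pi) * math.log(T / (2 * math.pi * math.e)) + 7 / 8


# Heights H as quoted in the text; the keys are just labels.
HEIGHTS = {
    "GD04": 2.44599e12,
    "ZetaGrid": 2.419e11,
    "van de Lune": 3.293e9,
    "Plaa": 3.061e10,
}

for label, height in HEIGHTS.items():
    print(f"{label:12s} H = {height:.5e}  N(H) ~ {approx_zero_count(height):.4e}")
```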

Proposition D.0.3. Every odd integer $7 \le n \le n_0$ is the sum of three primes, where
\[
n_0 = \begin{cases}
5.90698 \cdot 10^{29} & \text{if [GD04] is used ($H = 2.44 \cdot 10^{12}$),} \\
6.15697 \cdot 10^{28} & \text{if ZetaGrid results are used ($H = 2.419 \cdot 10^{11}$),} \\
1.23163 \cdot 10^{27} & \text{if [Plaa] is used ($H = 3.061 \cdot 10^{10}$).}
\end{cases}
\]

Proof. For $n \le 4 \cdot 10^{18} + 3$, this is immediate from [OeSHP14]. Let $4 \cdot 10^{18} + 3 < n \le n_0$. We need to show that there is a prime $p$ in $[n - 4 - (n-4)/\Delta, n - 4]$, where $\Delta$ is large enough for $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$ to hold. We will then have that $4 \le n - p \le 4 + (n-4)/\Delta \le 4 \cdot 10^{18}$. Since $n - p$ is even, [OeSHP14] will then imply that $n - p$ is the sum of two primes $p'$, $p''$, and so
\[
n = p + p' + p''.
\]
Since $n - 4 > 10^{11}$, the interval $[n - 4 - (n-4)/\Delta, n - 4]$ with $\Delta = 28314000$ must contain a prime [RS03]. This gives the solution for $(n - 4) \le 1.1325 \cdot 10^{26}$, since then $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$. Note $1.1325 \cdot 10^{26} > e^{59}$.

From here onwards, we use the tables in [Ramd] to find acceptable values of $\Delta$. Since $n - 4 \ge e^{59}$, we can choose
\[
\Delta = \begin{cases}
52211882224 & \text{if [GD04] is used (case (a)),} \\
13861486834 & \text{if ZetaGrid is used (case (b)),} \\
307779681 & \text{if [Plaa] is used (case (c)).}
\end{cases}
\]
This gives us $(n - 4)/\Delta \le 4 \cdot 10^{18} - 4$ for $n - 4 < e^{r_0}$, where $r_0 = 67$ in case (a), $r_0 = 66$ in case (b) and $r_0 = 62$ in case (c).

If $n - 4 \ge e^{r_0}$, we can choose (again by [Ramd])
\[
\Delta = \begin{cases}
146869130682 & \text{in case (a),} \\
15392435100 & \text{in case (b),} \\
307908668 & \text{in case (c).}
\end{cases}
\]
This is enough for $n - 4 < e^{68}$ in case (a), and without further conditions for (b) or (c).

Finally, if $n - 4 \ge e^{68}$ and we are in case (a), [Ramd] assures us that the choice
\[
\Delta = 147674531294
\]
is valid; we verify as well that $(n_0 - 4)/\Delta \le 4 \cdot 10^{18} - 4$.
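The arithmetic in the proof is easy to re-check by machine. The Python sketch below takes, for each case, the successive values of $\Delta$ together with the upper end of the range of $n - 4$ on which each is used, and verifies that the quotient $(n - 4)/\Delta$ never exceeds $4 \cdot 10^{18} - 4$. The values of $\Delta$, $r_0$ and $n_0$ are copied from the text; the layout of the table `STAGES` and the function name are our own. That these values of $\Delta$ are admissible at all is taken on trust from [RS03] and [Ramd].

```python
import math

GOLDBACH_BOUND = 4 * 10**18  # even Goldbach checked up to here [OeSHP14]

# First step, from [RS03]: Delta = 28314000 for n - 4 <= 1.1325 * 10^26.
RS03_STAGE = (11325 * 10**22, 28314000)

# For each case: (upper end of the range of n - 4, Delta), in order.
# The last entry of each list ends at n0 - 4.
STAGES = {
    "a": [  # [GD04], n0 = 5.90698 * 10^29
        (math.exp(67), 52211882224),
        (math.exp(68), 146869130682),
        (590698 * 10**24 - 4, 147674531294),
    ],
    "b": [  # ZetaGrid, n0 = 6.15697 * 10^28
        (math.exp(66), 13861486834),
        (615697 * 10**23 - 4, 15392435100),
    ],
    "c": [  # [Plaa], n0 = 1.23163 * 10^27
        (math.exp(62), 307779681),
        (123163 * 10**22 - 4, 307908668),
    ],
}


def stage_ok(upper, delta):
    """True if (n - 4) / Delta <= 4 * 10^18 - 4 for all n - 4 <= upper."""
    return upper <= delta * (GOLDBACH_BOUND - 4)


print(stage_ok(*RS03_STAGE), math.exp(59) < RS03_STAGE[0])
for case, stages in STAGES.items():
    print(case, [stage_ok(upper, delta) for upper, delta in stages])
```

The comparison is written as `upper <= delta * (GOLDBACH_BOUND - 4)` rather than as a division so that the integer cases are checked exactly.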

In other words, the rigorous results in [Plaa] are enough to show the result for all odd $n \le 10^{27}$. Of course, [HP13] is also more than enough, and gives stronger results than Prop. D.0.3.

Bibliography

[AS64] M. Abramowitz and I. A. Stegun. Handbook of mathematical functions with formulas, graphs, and mathematical tables, volume 55 of National Bureau of Standards Applied Mathematics Series. For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 1964.

[BBO10] J. Bertrand, P. Bertrand, and J.-P. Ovarlez. Mellin transform. In A. D. Poularikas, editor, Transforms and applications handbook. CRC Press, Boca Raton, FL, 2010.

[Bom74] E. Bombieri. Le grand crible dans la théorie analytique des nombres. Société Mathématique de France, Paris, 1974. Avec une sommaire en anglais, Astérisque, No. 18.

[Bom10] E. Bombieri. The classical theory of zeta and L-functions. Milan J. Math., 78(1):11–59, 2010.

[Bom76] E. Bombieri. On twin almost primes. Acta Arith., 28(2):177–193, 1975/76.

[Boo06a] A. R. Booker. Artin's conjecture, Turing's method, and the Riemann hypothesis. Experiment. Math., 15(4):385–407, 2006.

[Boo06b] A. R. Booker. Turing and the Riemann hypothesis. Notices Amer. Math. Soc., 53(10):1208–1211, 2006.

[Bor56] K. G. Borodzkin. On the problem of I. M. Vinogradov's constant (in Russian). In Proc. Third All-Union Math. Conf., volume 1, page 3. Izdat. Akad. Nauk SSSR, Moscow, 1956.

[Bou99] J. Bourgain. On triples in arithmetic progression. Geom. Funct. Anal., 9(5):968–984, 1999.

[BR02] G. Bastien and M. Rogalski. Convexité, complète monotonie et inégalités sur les fonctions zêta et gamma, sur les fonctions des opérateurs de Baskakov et sur des fonctions arithmétiques. Canad. J. Math., 54(5):916–944, 2002.

[But11] Y. Buttkewitz. Exponential sums over primes and the prime twin problem. Acta Math. Hungar., 131(1-2):46–58, 2011.

[Che73] J. R. Chen. On the representation of a larger even integer as the sum of a prime and the product of at most two primes. Sci. Sinica, 16:157–176, 1973.

[Che85] J. R. Chen. On the estimation of some trigonometrical sums and their application. Sci. Sinica Ser. A, 28(5):449–458, 1985.

[Chu37] N. G. Chudakov. On the Goldbach problem. C. R. (Dokl.) Acad. Sci. URSS, n. Ser., 17:335–338, 1937.

[Chu38] N. G. Chudakov. On the density of the set of even numbers which are not representable as the sum of two odd primes. Izv. Akad. Nauk SSSR Ser. Mat. 2, pages 25–40, 1938.

[Chu47] N. G. Chudakov. Introduction to the theory of Dirichlet L-functions. OGIZ, Moscow-Leningrad, 1947. In Russian.

[CW89] J. R. Chen and T. Z. Wang. On the Goldbach problem. Acta Math. Sinica, 32(5):702–718, 1989.

[CW96] J. R. Chen and T. Z. Wang. The Goldbach problem for odd numbers. Acta Math. Sinica (Chin. Ser.), 39(2):169–174, 1996.

[Dab96] H. Daboussi. Effective estimates of exponential sums over primes. In Analytic number theory, Vol. 1 (Allerton Park, IL, 1995), volume 138 of Progr. Math., pages 231–244. Birkhäuser Boston, Boston, MA, 1996.

[Dav67] H. Davenport. Multiplicative number theory. Markham Publishing Co., Chicago, Ill., 1967. Lectures given at the University of Michigan, Winter Term.

[dB81] N. G. de Bruijn. Asymptotic methods in analysis. Dover Publications Inc., New York, third edition, 1981.

[Des08] R. Descartes. Œuvres de Descartes publiées par Charles Adam et Paul Tannery sous les auspices du Ministère de l'Instruction publique. Physico-mathematica. Compendium musicae. Regulae ad directionem ingenii. Recherche de la vérité. Supplément à la correspondance. X. Paris: Léopold Cerf. IV u. 691 S. 4, 1908.

[Des77] J.-M. Deshouillers. Sur la constante de Šnirel'man. In Séminaire Delange-Pisot-Poitou, 17e année (1975/76), Théorie des nombres: Fac. 2, Exp. No. G16, page 6. Secrétariat Math., Paris, 1977.

[DEtRZ97] J.-M. Deshouillers, G. Effinger, H. te Riele, and D. Zinoviev. A complete Vinogradov 3-primes theorem under the Riemann hypothesis. Electron. Res. Announc. Amer. Math. Soc., 3:99–104, 1997.

[Dic66] L. E. Dickson. History of the theory of numbers. Vol. I: Divisibility and primality. Chelsea Publishing Co., New York, 1966.

[DLDDD+10] C. Daramy-Loirat, F. De Dinechin, D. Defour, M. Gallet, N. Gast, and Ch. Lauter. Crlibm, March 2010. version 1.0beta4.

[DR01] H. Daboussi and J. Rivat. Explicit upper bounds for exponential sums over primes. Math. Comp., 70(233):431–447 (electronic), 2001.

[Dre93] F. Dress. Fonction sommatoire de la fonction de Möbius. I. Majorations expérimentales. Experiment. Math., 2(2):89–98, 1993.

[DS70] H. G. Diamond and J. Steinig. An elementary proof of the prime number theorem with a remainder term. Invent. Math., 11:199–258, 1970.

[Eff99] G. Effinger. Some numerical implications of the Hardy and Littlewood analysis of the 3-primes problem. Ramanujan J., 3(3):239–280, 1999.

[EM95] M. El Marraki. Fonction sommatoire de la fonction de Möbius. III. Majorations asymptotiques effectives fortes. J. Théor. Nombres Bordeaux, 7(2):407–433, 1995.

[EM96] M. El Marraki. Majorations de la fonction sommatoire de la fonction $\mu(n)/n$. Univ. Bordeaux 1, preprint (96-8), 1996.

[Est37] T. Estermann. On Goldbach's Problem: Proof that Almost all Even Positive Integers are Sums of Two Primes. Proc. London Math. Soc., S2-44(4):307–314, 1937.

[FI98] J. Friedlander and H. Iwaniec. Asymptotic sieve for primes. Ann. of Math. (2), 148(3):1041–1065, 1998.

[FI10] J. Friedlander and H. Iwaniec. Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2010.

[For02] K. Ford. Vinogradov's integral and bounds for the Riemann zeta function. Proc. London Math. Soc. (3), 85(3):565–633, 2002.

[GD04] X. Gourdon and P. Demichel. The first $10^{13}$ zeros of the Riemann zeta function, and zeros computation at very large height. http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf, 2004.

[GR94] I. S. Gradshteyn and I. M. Ryzhik. Table of integrals, series, and products. Academic Press Inc., Boston, MA, fifth edition, 1994. Translation edited and with a preface by Alan Jeffrey.

[GR96] A. Granville and O. Ramaré. Explicit bounds on exponential sums and the scarcity of squarefree binomial coefficients. Mathematika, 43(1):73–107, 1996.

[Har66] G. H. Hardy. Collected papers of G. H. Hardy (Including Joint papers with J. E. Littlewood and others). Vol. I. Edited by a committee appointed by the London Mathematical Society. Clarendon Press, Oxford, 1966.

[HB79] D. R. Heath-Brown. The fourth power moment of the Riemann zeta function. Proc. London Math. Soc. (3), 38(3):385–422, 1979.

[HB85] D. R. Heath-Brown. The ternary Goldbach problem. Rev. Mat. Iberoamericana, 1(1):45–59, 1985.

[HB11] H. Hong and Ch. W. Brown. QEPCAD B – Quantifier elimination by partial cylindrical algebraic decomposition, May 2011. version 1.62.

[Hela] H. A. Helfgott. Major arcs for Goldbach's problem. Preprint. Available at arXiv:1203.5712.

[Helb] H. A. Helfgott. Minor arcs for Goldbach's problem. Preprint. Available as arXiv:1205.5252.

[Helc] H. A. Helfgott. The Ternary Goldbach Conjecture is true. Preprint. Available as arXiv:1312.7748.

[Hel13a] H. Helfgott. La conjetura débil de Goldbach. Gac. R. Soc. Mat. Esp., 16(4), 2013.

[Hel13b] H. A. Helfgott. The ternary Goldbach conjecture, 2013. Available at http://valuevar.wordpress.com/2013/07/02/the-ternary-goldbach-conjecture/.

[Hel14a] H. A. Helfgott. La conjecture de Goldbach ternaire. Gaz. Math., (140):5–18, 2014. Translated by Margaret Bilu, revised by the author.

[Hel14b] H. A. Helfgott. The ternary Goldbach problem. To appear in Proceedings of the International Congress of Mathematicians (Seoul, Korea, 2014), 2014.

[HL22] G. H. Hardy and J. E. Littlewood. Some problems of 'Partitio numerorum'; III: On the expression of a number as a sum of primes. Acta Math., 44(1):1–70, 1922.

[HP13] H. A. Helfgott and David J. Platt. Numerical verification of the ternary Goldbach conjecture up to $8.875 \cdot 10^{30}$. Exp. Math., 22(4):406–409, 2013.

[HR00] G. H. Hardy and S. Ramanujan. Asymptotic formulæ in combinatory analysis [Proc. London Math. Soc. (2) 17 (1918), 75–115]. In Collected papers of Srinivasa Ramanujan, pages 276–309. AMS Chelsea Publ., Providence, RI, 2000.

[Hux72] M. N. Huxley. Irregularity in sifted sequences. J. Number Theory, 4:437–454, 1972.

[IK04] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.

[Kad] H. Kadiri. An explicit zero-free region for the Dirichlet L-functions. Preprint. Available as arXiv:0510570.

[Kad05] H. Kadiri. Une région explicite sans zéros pour la fonction ζ de Riemann. Acta Arith., 117(4):303–339, 2005.

[Kar93] A. A. Karatsuba. Basic analytic number theory. Springer-Verlag, Berlin, 1993. Translated from the second (1983) Russian edition and with a preface by Melvyn B. Nathanson.

[Knu99] O. Knüppel. PROFIL/BIAS, February 1999. version 2.

[Kor58] N. M. Korobov. Estimates of trigonometric sums and their applications. Uspehi Mat. Nauk, 13(4 (82)):185–192, 1958.

[Lam08] B. Lambov. Interval arithmetic using SSE-2. In Reliable Implementation of Real Number Algorithms: Theory and Practice. International Seminar Dagstuhl Castle, Germany, January 8-13, 2006, volume 5045 of Lecture Notes in Computer Science, pages 102–113. Springer, Berlin, 2008.

[Leh66] R. Sherman Lehman. On the difference $\pi(x) - \mathrm{li}(x)$. Acta Arith., 11:397–410, 1966.

[LW02] M.-Ch. Liu and T. Wang. On the Vinogradov bound in the three primes Goldbach conjecture. Acta Arith., 105(2):133–175, 2002.

[Mar41] K. K. Mardzhanishvili. On the proof of the Goldbach-Vinogradov theorem (in Russian). C. R. (Doklady) Acad. Sci. URSS (N.S.), 30(8):681–684, 1941.

[McC84a] K. S. McCurley. Explicit estimates for the error term in the prime number theorem for arithmetic progressions. Math. Comp., 42(165):265–285, 1984.

[McC84b] K. S. McCurley. Explicit zero-free regions for Dirichlet L-functions. J. Number Theory, 19(1):7–32, 1984.

[Mon68] H. L. Montgomery. A note on the large sieve. J. London Math. Soc., 43:93–98, 1968.

[Mon71] H. L. Montgomery. Topics in multiplicative number theory. Lecture Notes in Mathematics, Vol. 227. Springer-Verlag, Berlin, 1971.

[MV73] H. L. Montgomery and R. C. Vaughan. The large sieve. Mathematika, 20:119–134, 1973.

[MV74] H. L. Montgomery and R. C. Vaughan. Hilbert's inequality. J. London Math. Soc. (2), 8:73–82, 1974.

[MV07] H. L. Montgomery and R. C. Vaughan. Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2007.

[Ned06] N. S. Nedialkov. VNODE-LP: a validated solver for initial value problems in ordinary differential equations, July 2006. version 0.3.

[OeSHP14] T. Oliveira e Silva, S. Herzog, and S. Pardi. Empirical verification of the even Goldbach conjecture, and computation of prime gaps, up to $4 \cdot 10^{18}$. Math. Comp., 83:2033–2060, 2014.

[OLBC10] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and Ch. W. Clark, editors. NIST handbook of mathematical functions. U.S. Department of Commerce National Institute of Standards and Technology, Washington, DC, 2010. With 1 CD-ROM (Windows, Macintosh and UNIX).

[Olv58] F. W. J. Olver. Uniform asymptotic expansions of solutions of linear second-order differential equations for large values of a parameter. Philos. Trans. Roy. Soc. London. Ser. A, 250:479–517, 1958.

[Olv59] F. W. J. Olver. Uniform asymptotic expansions for Weber parabolic cylinder functions of large orders. J. Res. Nat. Bur. Standards Sect. B, 63B:131–169, 1959.

[Olv61] F. W. J. Olver. Two inequalities for parabolic cylinder functions. Proc. Cambridge Philos. Soc., 57:811–822, 1961.

[Olv65] F. W. J. Olver. On the asymptotic solution of second-order differential equations having an irregular singularity of rank one, with an application to Whittaker functions. J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2:225–243, 1965.

[Olv74] F. W. J. Olver. Asymptotics and special functions. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1974. Computer Science and Applied Mathematics.

[Plaa] D. Platt. Computing $\pi(x)$ analytically. To appear in Math. Comp. Available as arXiv:1203.5712.

[Plab] D. Platt. Numerical computations concerning GRH. Preprint. Available at arXiv:1305.3087.

[Pla11] D. Platt. Computing degree 1 L-functions rigorously. PhD thesis, Bristol University, 2011.

[Rama] O. Ramaré. État des lieux. Preprint. Available as http://math.univ-lille1.fr/~ramare/Maths/ExplicitJNTB.pdf.

[Ramb] O. Ramaré. Explicit estimates on several summatory functions involving the Moebius function. To appear in Math. Comp.

[Ramc] O. Ramaré. A sharp bilinear form decomposition for primes and Moebius function. Preprint. To appear in Acta Math. Sinica.

[Ramd] O. Ramaré. Short effective intervals containing primes. Preprint.

[Ram95] O. Ramaré. On Šnirel'man's constant. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 22(4):645–706, 1995.

[Ram09] O. Ramaré. Arithmetical aspects of the large sieve inequality, volume 1 of Harish-Chandra Research Institute Lecture Notes. Hindustan Book Agency, New Delhi, 2009. With the collaboration of D. S. Ramana.

[Ram10] O. Ramaré. On Bombieri's asymptotic sieve. J. Number Theory, 130(5):1155–1189, 2010.

[Ram13] O. Ramaré. From explicit estimates for primes to explicit estimates for the Möbius function. Acta Arith., 157(4):365–379, 2013.

[Ram14] O. Ramaré. Explicit estimates on the summatory functions of the Möbius function with coprimality restrictions. Acta Arith., 165(1):1–10, 2014.

[Ros41] B. Rosser. Explicit bounds for some functions of prime numbers. Amer. J. Math., 63:211–232, 1941.

[RR96] O. Ramaré and R. Rumely. Primes in arithmetic progressions. Math. Comp., 65(213):397–425, 1996.

[RS62] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6:64–94, 1962.

[RS75] J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. Math. Comp., 29:243–269, 1975. Collection of articles dedicated to Derrick Henry Lehmer on the occasion of his seventieth birthday.

[RS03] O. Ramaré and Y. Saouter. Short effective intervals containing primes. J. Number Theory, 98(1):10–33, 2003.

[RV83] H. Riesel and R. C. Vaughan. On sums of primes. Ark. Mat., 21(1):46–74, 1983.

[Sao98] Y. Saouter. Checking the odd Goldbach conjecture up to $10^{20}$. Math. Comp., 67(222):863–866, 1998.

[Sch33] L. Schnirelmann. Über additive Eigenschaften von Zahlen. Math. Ann., 107(1):649–690, 1933.

[Sch76] L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. II. Math. Comp., 30(134):337–360, 1976.

[SD10] Y. Saouter and P. Demichel. A sharp region where $\pi(x) - \mathrm{li}(x)$ is positive. Math. Comp., 79(272):2395–2405, 2010.

[Sel91] A. Selberg. Lectures on sieves. In Collected papers, vol. II, pages 66–247. Springer, Berlin, 1991.

[Sha14] X. Shao. A density version of the Vinogradov three primes theorem. Duke Math. J., 163(3):489–512, 2014.

[Shu92] F. H. Shu. The Cosmos. In Encyclopaedia Britannica, Macropaedia, volume 16, pages 762–795. Encyclopaedia Britannica, Inc., 15 edition, 1992.

[Tao14] T. Tao. Every odd number greater than 1 is the sum of at most five primes. Math. Comp., 83(286):997–1038, 2014.

[Tem10] N. M. Temme. Parabolic cylinder functions. In NIST Handbook of mathematical functions, pages 303–319. U.S. Dept. Commerce, Washington, DC, 2010.

[Tru] T. S. Trudgian. An improved upper bound for the error in the zero-counting formulae for Dirichlet L-functions and Dedekind zeta-functions. Preprint.

[Tuc11] W. Tucker. Validated numerics: A short introduction to rigorous computations. Princeton University Press, Princeton, NJ, 2011.

[Tur53] A. M. Turing. Some calculations of the Riemann zeta-function. Proc. London Math. Soc. (3), 3:99–117, 1953.

[TV03] N. M. Temme and R. Vidunas. Parabolic cylinder functions: examples of error bounds for asymptotic expansions. Anal. Appl. (Singap.), 1(3):265–288, 2003.

[van37] J. G. van der Corput. Sur l'hypothèse de Goldbach pour presque tous les nombres pairs. Acta Arith., 2:266–290, 1937.

[Vau77a] R. C. Vaughan. On the estimation of Schnirelman's constant. J. Reine Angew. Math., 290:93–108, 1977.

[Vau77b] R.-C. Vaughan. Sommes trigonométriques sur les nombres premiers. C. R. Acad. Sci. Paris Sér. A-B, 285(16):A981–A983, 1977.

[Vau80] R. C. Vaughan. Recent work in additive prime number theory. In Proceedings of the International Congress of Mathematicians (Helsinki, 1978), pages 389–394. Acad. Sci. Fennica, Helsinki, 1980.

[Vau97] R. C. Vaughan. The Hardy-Littlewood method, volume 125 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, second edition, 1997.

[Vin37] I. M. Vinogradov. A new method in analytic number theory (Russian). Tr. Mat. Inst. Steklova, 10:5–122, 1937.

[Vin47] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers (Russian). Tr. Mat. Inst. Steklova, 23:3–109, 1947.

[Vin54] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Interscience Publishers, London and New York, 1954. Translated, revised and annotated by K. F. Roth and Anne Davenport.

[Vin58] I. M. Vinogradov. A new estimate of the function $\zeta(1 + it)$. Izv. Akad. Nauk SSSR. Ser. Mat., 22:161–164, 1958.

[Vin04] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport. Reprint of the 1954 translation.

[Wed03] S. Wedeniwski. ZetaGrid - Computational verification of the Riemann hypothesis. Conference in Number Theory in honour of Professor H. C. Williams, Banff, Alberta, Canada, May 2003.

[Wei84] A. Weil. Number theory: An approach through history. From Hammurapi to Legendre. Birkhäuser Boston, Inc., Boston, MA, 1984.

[Whi03] E. T. Whittaker. On the functions associated with the parabolic cylinder in harmonic analysis. Proc. London Math. Soc., 35:417–427, 1903.

[Wig20] S. Wigert. Sur la théorie de la fonction ζ(s) de Riemann. Ark. Mat., 14:1–17, 1920.

[Won01] R. Wong. Asymptotic approximations of integrals, volume 34 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001. Corrected reprint of the 1989 original.

[Zin97] D. Zinoviev. On Vinogradov's constant in Goldbach's ternary problem. J. Number Theory, 65(2):334–358, 1997.

  • Preface
  • Acknowledgements
  • 1 Introduction
    • 11 History and new developments
    • 12 The circle method Fourier analysis on Z
    • 13 The major arcs M
      • 131 What do we really know about L-functions and their zeros
      • 132 Estimates of f0362f() for in the major arcs
        • 14 The minor arcs m
          • 141 Qualitative goals and main ideas
          • 142 Combinatorial identities
          • 143 Type I sums
          • 144 Type II or bilinear sums
            • 15 Integrals over the major and minor arcs
            • 16 Some remarks on computations
              • 2 Notation and preliminaries
                • 21 General notation
                • 22 Dirichlet characters and L functions
                • 23 Fourier transforms and exponential sums
                • 24 Mellin transforms
                • 25 Bounds on sums of and
                • 26 Interval arithmetic and the bisection method
                  • I Minor arcs
                    • 3 Introduction
                      • 31 Results
                      • 32 Comparison to earlier work
                      • 33 Basic setup
                        • 331 Vaughans identity
                        • 332 An alternative route
                            • 4 Type I sums
                              • 41 Trigonometric sums
                              • 42 Type I estimates
                                • 421 Type I variations
                                    • 5 Type II sums
                                      • 51 The sum S1 cancellation
                                        • 511 Reduction to a sum with
                                        • 512 Explicit bounds for a sum with
                                        • 513 Estimating the triple sum
                                          • 52 The sum S2 the large sieve primes and tails
                                            • 6 Minor-arc totals
                                              • 61 The smoothing function
                                              • 62 Contributions of different types
                                                • 621 Type I terms SI1
                                                • 622 Type I terms SI2
                                                • 623 Type II terms
                                                  • 63 Adjusting parameters Calculations
                                                    • 631 First choice of parameters qy
                                                    • 632 Second choice of parameters
                                                      • 64 Conclusion
                                                          • II Major arcs
                                                            • 7 Major arcs overview and results
                                                              • 71 Results
                                                              • 72 Main ideas
                                                                • 8 The Mellin transform of the twisted Gaussian
                                                                  • 81 How to choose a smoothing function
                                                                  • 82 The twisted Gaussian overview and setup
                                                                    • 821 Relation to the existing literature
                                                                    • 822 General approach
                                                                      • 83 The saddle point
                                                                        • 831 The coordinates of the saddle point
                                                                        • 832 The direction of steepest descent
                                                                          • 84 The integral over the contour
                                                                            • 841 A simple contour
                                                                            • 842 Another simple contour
                                                                              • 85 Conclusions
                                                                                • 9 Explicit formulas
                                                                                  • 91 A general explicit formula
                                                                                  • 92 Sums and decay for the Gaussian
                                                                                  • 93 The case of (t)
                                                                                  • 94 The case of +(t)
                                                                                  • 95 A sum for +(t)2
                                                                                  • 96 A verification of zeros and its consequences
                                                                                      • III The integral over the circle
                                                                                        • 10 The integral over the major arcs
                                                                                          • 101 Decomposition of S by characters
                                                                                          • 102 The integral over the major arcs the main term
                                                                                          • 103 The 2 norm over the major arcs
                                                                                          • 104 The integral over the major arcs conclusion
                                                                                            • 11 Optimizing and adapting smoothing functions
                                                                                              • 111 The symmetric smoothing function
                                                                                                • 1111 The product (t) (-t)
                                                                                                  • 112 The smoothing function adapting minor-arc bounds
                                                                                                    • 12 The 2 norm and the large sieve
                                                                                                      • 121 Variations on the large sieve for primes
                                                                                                      • 122 Bounding the quotient in the large sieve for primes
                                                                                                        • 13 The integral over the minor arcs
                                                                                                          • 131 Putting together 2 bounds over arcs and bounds
                                                                                                          • 132 The minor-arc total
                                                                                                            • 14 Conclusion

    Contents

    Preface
    Acknowledgements

    1 Introduction
        1.1 History and new developments
        1.2 The circle method: Fourier analysis on Z
        1.3 The major arcs M
            1.3.1 What do we really know about L-functions and their zeros?
            1.3.2 Estimates of f(α) for α in the major arcs
        1.4 The minor arcs m
            1.4.1 Qualitative goals and main ideas
            1.4.2 Combinatorial identities
            1.4.3 Type I sums
            1.4.4 Type II, or bilinear, sums
        1.5 Integrals over the major and minor arcs
        1.6 Some remarks on computations
    2 Notation and preliminaries
        2.1 General notation
        2.2 Dirichlet characters and L functions
        2.3 Fourier transforms and exponential sums
        2.4 Mellin transforms
        2.5 Bounds on sums of µ and Λ
        2.6 Interval arithmetic and the bisection method

    I Minor arcs
    3 Introduction
        3.1 Results
        3.2 Comparison to earlier work
        3.3 Basic setup
            3.3.1 Vaughan’s identity
            3.3.2 An alternative route
    4 Type I sums
        4.1 Trigonometric sums
        4.2 Type I estimates
            4.2.1 Type I variations
    5 Type II sums
        5.1 The sum S_1: cancellation
            5.1.1 Reduction to a sum with µ
            5.1.2 Explicit bounds for a sum with µ
            5.1.3 Estimating the triple sum
        5.2 The sum S_2: the large sieve, primes and tails
    6 Minor-arc totals
        6.1 The smoothing function
        6.2 Contributions of different types
            6.2.1 Type I terms: S_{I,1}
            6.2.2 Type I terms: S_{I,2}
            6.2.3 Type II terms
        6.3 Adjusting parameters. Calculations
            6.3.1 First choice of parameters: q ≤ y
            6.3.2 Second choice of parameters
        6.4 Conclusion

    II Major arcs
    7 Major arcs: overview and results
        7.1 Results
        7.2 Main ideas
    8 The Mellin transform of the twisted Gaussian
        8.1 How to choose a smoothing function?
        8.2 The twisted Gaussian: overview and setup
            8.2.1 Relation to the existing literature
            8.2.2 General approach
        8.3 The saddle point
            8.3.1 The coordinates of the saddle point
            8.3.2 The direction of steepest descent
        8.4 The integral over the contour
            8.4.1 A simple contour
            8.4.2 Another simple contour
        8.5 Conclusions
    9 Explicit formulas
        9.1 A general explicit formula
        9.2 Sums and decay for the Gaussian
        9.3 The case of η∗(t)
        9.4 The case of η+(t)
        9.5 A sum for η+(t)^2
        9.6 A verification of zeros and its consequences

    III The integral over the circle
    10 The integral over the major arcs
        10.1 Decomposition of S_η by characters
        10.2 The integral over the major arcs: the main term
        10.3 The ℓ2 norm over the major arcs
        10.4 The integral over the major arcs: conclusion
    11 Optimizing and adapting smoothing functions
        11.1 The symmetric smoothing function η
            11.1.1 The product η(t)η(ρ − t)
        11.2 The smoothing function η∗: adapting minor-arc bounds
    12 The ℓ2 norm and the large sieve
        12.1 Variations on the large sieve for primes
        12.2 Bounding the quotient in the large sieve for primes
    13 The integral over the minor arcs
        13.1 Putting together ℓ2 bounds over arcs and ℓ∞ bounds
        13.2 The minor-arc total
    14 Conclusion
        14.1 The ℓ2 norm over the major arcs: explicit version
        14.2 The total major-arc contribution
        14.3 The minor-arc total: explicit version
        14.4 Conclusion: proof of main theorem

    IV Appendices
    A Norms of smoothing functions
        A.1 The decay of a Mellin transform
        A.2 The difference η+ − η in ℓ2 norm
        A.3 Norms involving η+
        A.4 Norms involving η′+
        A.5 The ℓ∞-norm of η+
    B Norms of Fourier transforms
        B.1 The Fourier transform of η′′_2
        B.2 Bounds involving a logarithmic factor
    C Sums involving Λ and φ
        C.1 Sums over primes
        C.2 Sums involving φ
    D Checking small n by checking zeros of ζ(s)

    Preface

    ἐγγὺς δ’ ἦν τέλεος ὃ δὲ τὸ τρίτον ἧκε χ[αμᾶζε

    σὺν τῶι δ’ ἐξέφυγεν θάνατον καὶ κῆ[ρα μέλαιναν

    Hesiod (?), Ehoiai, fr. 76.21–2 Merkelbach and West

    The ternary Goldbach conjecture (or three-prime conjecture) states that every odd number n greater than 5 can be written as the sum of three primes. The purpose of this book is to give the first full proof of this conjecture.

    The proof builds on the great advances made in the early 20th century by Hardy and Littlewood (1922) and Vinogradov (1937). Progress since then has been more gradual. In some ways, it was necessary to clear the board and start work using only the main existing ideas towards the problem, together with techniques developed elsewhere.

    Part of the aim has been to keep the exposition as accessible as possible, with an emphasis on qualitative improvements and new technical ideas that should be of use elsewhere. The main strategy was to give an analytic approach that is efficient, relatively clean, and, as it must be for this problem, explicit; the focus does not lie in optimizing explicit constants, or in performing calculations, necessary as these tasks are.

    Organization. In the introduction, after a summary of the history of the problem, we will go over a detailed outline of the proof. The rest of the book is divided in three parts, structured so that they can be read independently: the first two parts do not refer to each other, and the third part uses only the main results (clearly marked) of the first two parts.

    As is the case in most proofs involving the circle method, the problem is reduced to showing that a certain integral over the “circle” $\mathbb{R}/\mathbb{Z}$ is non-zero. The circle is divided into major arcs and minor arcs. In Part I – in some ways the technical heart of the proof – we will see how to give upper bounds on the integrand when $\alpha$ is in the minor arcs. Part II will provide rather precise estimates for the integrand when the variable $\alpha$ is in the major arcs. Lastly, Part III shows how to use these inputs as well as possible to estimate the integral.

    Each part and each chapter starts with a general discussion of the strategy and the main ideas involved. Some of the more technical bounds and computations are relegated to the appendices.


    [Figure: Dependencies between the chapters. A diagram with one node per chapter: 1 Introduction; 2 Notation and preliminaries; 3 Minor arcs: introduction; 4 Type I sums; 5 Type II sums; 6 Minor-arc totals; 7 Major arcs: overview; 8 Mellin transform of twisted Gaussian; 9 Explicit formulas; 10 The integral over the major arcs; 11 Smoothing functions and their use; 12 The ℓ2 norm and the large sieve; 13 The integral over the minor arcs; 14 Conclusion.]

    Acknowledgements

    The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary ranges, and also helped him with the usage of interval arithmetic. He is also deeply grateful to O. Ramaré, who, in reply to his requests, prepared and sent for publication several auxiliary results, and who otherwise provided much-needed feedback.

    The author is also much indebted to A. Booker, B. Green, R. Heath-Brown, H. Kadiri, D. Platt, T. Tao and M. Watkins for many discussions on Goldbach’s problem and related issues. Several historical questions became clearer due to the help of J. Brandes, K. Gong, R. Heath-Brown, Z. Silagadze, R. Vaughan and T. Wooley. Additional references were graciously provided by R. Bryant, S. Huntsman and I. Rezvyakova. Thanks are also due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.

    The introduction is largely based on the author’s article for the Proceedings of the 2014 ICM [Hel14b]. That article, in turn, is based in part on the informal note [Hel13b], which was published in Spanish translation ([Hel13a], translated by M. A. Morales and the author, and revised with the help of J. Cilleruelo and M. Helfgott) and in a French version ([Hel14a], translated by M. Bilu and revised by the author). The proof first appeared as a series of preprints: [Helb], [Hela], [Helc].

    Travel and other expenses were funded in part by the Adams Prize and the Philip Leverhulme Prize. The author’s work on the problem started at the Université de Montréal (CRM) in 2006; he is grateful to both the Université de Montréal and the École Normale Supérieure for providing pleasant working environments. During the last stages of the work, travel was partly covered by ANR Project Caesar No. ANR-12-BS01-0011.

    The present work would most likely not have been possible without free and publicly available software: SAGE, PARI, Maxima, gnuplot, VNODE-LP, PROFIL/BIAS, and, of course, LaTeX, Emacs, the gcc compiler and GNU/Linux in general. Some exploratory work was done in SAGE and Mathematica. Rigorous calculations used either D. Platt’s interval-arithmetic package (based in part on Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODE-LP.

    The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the author’s desktop computers at home and work. However, D. Platt’s computations [Plab] used a significant amount of resources, kindly donated to D. Platt and the author by several institutions. This crucial help was provided by MesoPSL (affiliated with the Observatoire de Paris and Paris Sciences et Lettres), Université de Paris VI/VII (UPMC - DSI - Pôle Calcul), University of Warwick (thanks to Bill Hart), University of Bristol, France Grilles (French National Grid Infrastructure, DIRAC national instance), Université de Lyon 1 and Université de Bordeaux 1. Both D. Platt and the author would like to thank the donating organizations, their technical staff, and all those who helped to make these resources available to them.

    Chapter 1

    Introduction

    The question we will discuss, or one similar to it, seems to have been first posed by Descartes, in a manuscript published only centuries after his death [Des08, p. 298]. Descartes states: “Sed & omnis numerus par fit ex uno vel duobus vel tribus primis” (“But also every even number is made out of one, two or three prime numbers.”¹) This statement comes in the middle of a discussion of sums of polygonal numbers, such as the squares.

    Statements on sums of primes and sums of values of polynomials (polygonal numbers, powers $n^k$, etc.) have since shown themselves to be much more than mere curiosities – and not just because they are often very difficult to prove. Whereas the study of sums of powers can rely on their algebraic structure, the study of sums of primes leads to the realization that, from several perspectives, the set of primes behaves much like the set of integers, or like a random set of integers. (It also leads to the realization that this is very hard to prove.)

    If, instead of the primes, we had a random set of odd integers S whose density – an intuitive concept that can be made precise – equaled that of the primes, then we would expect to be able to write every odd number as a sum of three elements of S, and every even number as the sum of two elements of S. We would have to check by hand whether this is true for small odd and even numbers, but it is relatively easy to show that, after a long enough check, it would be very unlikely that there would be any exceptions left among the infinitely many cases left to check.
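    The heuristic above can be tried out numerically. The following is only an illustrative sketch, not part of the proof: it assumes a particular random model (each odd n is kept independently with probability 2/log n, capped at 1), and the function names `random_odd_set`, `odd_primes_up_to` and `exceptions` are made up for this example.

```python
import math
import random


def odd_primes_up_to(limit):
    """Return the set of odd primes <= limit (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return {n for n in range(3, limit + 1, 2) if sieve[n]}


def random_odd_set(limit, rng):
    """Random set of odd integers in [3, limit].

    Assumed model: n is kept with probability 2/log(n). The factor 2
    compensates for looking only at odd n, so that the density mimics
    that of the primes (about 1/log n among all integers).
    """
    return {n for n in range(3, limit + 1, 2) if rng.random() < 2 / math.log(n)}


def exceptions(S, limit, start=9):
    """Odd n in [start, limit] that are NOT a sum of three elements of S."""
    elems = sorted(S)
    pair_sums = {a + b for a in elems for b in elems if a + b <= limit}
    missing = []
    for n in range(start, limit + 1, 2):
        if not any((n - c) in pair_sums for c in elems if c < n):
            missing.append(n)
    return missing


if __name__ == "__main__":
    LIMIT = 2000
    S = random_odd_set(LIMIT, random.Random(2015))
    print("exceptions for a random set:", exceptions(S, LIMIT))
    print("exceptions for the odd primes:", exceptions(odd_primes_up_to(LIMIT), LIMIT))
```

    The check starts at 9 because three odd numbers, each at least 3, sum to at least 9. For the random set, any exceptions would be expected only among small n, in line with the remark that a finite check by hand is needed.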

    The question, then, is in what sense we need the primes to be like a random set of integers; in other words, we need to know what we can prove about the regularities of the distribution of the primes. This is one of the main questions of analytic number theory; progress on it has been very slow and difficult.

    Fourier analysis expresses information on the distribution of a sequence in terms of frequencies. In the case of the primes, what may be called the main frequencies – those in the major arcs – correspond to the same kind of large-scale distribution that is encoded by L-functions, the family of functions to which the Riemann zeta function belongs. On some of the crucial questions on L-functions, the limits of our knowledge have barely budged in the last century. There is something relatively new now, namely, rigorous numerical data of non-negligible scope; still, such data is, by definition, finite, and, as a consequence, its range of applicability is very narrow. Thus, the real question in the major-arc regime is how to use well the limited information we do have on the large-scale distribution of the primes. As we will see, this requires delicate work on explicit asymptotic analysis and smoothing functions.

    ¹ Thanks are due to J. Brandes and R. Vaughan for a discussion on a possible ambiguity in the Latin wording. Descartes’ statement is mentioned (with a translation much like the one given here) in Dickson’s History [Dic66, Ch. XVIII].

    Outside the main frequencies – that is, in what are called the minor arcs – estimates based on L-functions no longer apply, and what is remarkable is that one can say anything meaningful on the distribution of the primes. Vinogradov was the first to give unconditional, non-trivial bounds, showing that there are no great irregularities in the minor arcs; this is what makes them “minor”. Here the task is to give sharper bounds than Vinogradov. It is in this regime that we can genuinely say that we learn a little more about the distribution of the primes, based on what is essentially an elementary and highly optimized analytic-combinatorial analysis of exponential sums, i.e., Fourier coefficients given by series (supported on the primes, in our case).

    The circle method reduces an additive problem – that is, a problem on sums, such as sums of primes, powers, etc. – to the estimation of an integral on the space of frequencies (the “circle” $\mathbb{R}/\mathbb{Z}$). In the case of the primes, as we have just discussed, we have precise estimates on the integrand on part of the circle (the major arcs), and upper bounds on the rest of the circle (the minor arcs). Putting them together efficiently to give an estimate on the integral is a delicate matter; we leave it for the last part, as it is really what is particular to our problem, as opposed to being of immediate general relevance to the study of the primes. As we shall see, estimating the integral well does involve using – and improving – general estimates on the variance of irregularities in the distribution of the primes, as given by the large sieve.

    In fact, one of the main general lessons of the proof is that there is a very close relationship between the circle method and the large sieve; we will use the large sieve not just as a tool – which we shall incidentally sharpen in certain contexts – but as a source for ideas on how to apply the circle method more effectively.

    This has been an attempt at a first look from above. Let us now undertake a more leisurely and detailed overview of the problem and its solution.

    1.1 History and new developments

    The history of the conjecture starts properly with Euler and his close friend, Christian Goldbach, both of whom lived and worked in Russia at the time of their correspondence – about a century after Descartes’ isolated statement. Goldbach, a man of many interests, is usually classed as a serious amateur; he seems to have awakened Euler’s passion for number theory, which would lead to the beginning of the modern era of the subject [Wei84, Ch. 3, §IV]. In a letter dated June 7, 1742, Goldbach made a conjectural statement on prime numbers, and Euler rapidly reduced it to the following conjecture, which, he said, Goldbach had already posed to him: every positive integer can be written as the sum of at most three prime numbers.

    We would now say “every integer greater than 1”, since we no longer consider 1 to be a prime number. Moreover, the conjecture is nowadays split into two:

    • the weak, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes;

    • the strong, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes.

    As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number n, then express n − 3 as the sum of two primes).
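    The reduction in the parenthesis can be written out in a few lines. This is a minimal sketch for small n only; the helper names (`is_prime`, `binary_goldbach`, `ternary_from_binary`) are hypothetical, and the binary decomposition is found by plain search, so the code illustrates the implication and proves nothing about either conjecture.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def binary_goldbach(m):
    """Return primes (p, q) with p + q = m for an even m > 2, or None if none is found."""
    for p in range(2, m // 2 + 1):
        if is_prime(p) and is_prime(m - p):
            return p, m - p
    return None


def ternary_from_binary(n):
    """Write an odd n > 5 as 3 + p + q, using a binary decomposition of n - 3."""
    if n <= 5 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 5")
    pair = binary_goldbach(n - 3)  # n - 3 is even and greater than 2
    if pair is None:
        return None  # this would be a counterexample to the strong conjecture
    return (3,) + pair


if __name__ == "__main__":
    for n in (7, 27, 101):
        print(n, "=", " + ".join(map(str, ternary_from_binary(n))))
```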

    The strong conjecture remains out of reach. A short while ago – the first complete version appeared on May 13, 2013 – the author proved the weak Goldbach conjecture.

    Theorem 1.1.1. Every odd integer greater than 5 can be written as the sum of three primes.

    In 1937, I. M. Vinogradov proved [Vin37] that the conjecture is true for all odd numbers n larger than some constant C. (Hardy and Littlewood had proved the same statement under the assumption of the Generalized Riemann Hypothesis, which we shall have the chance to discuss later.)

    It is clear that a computation can verify the conjecture only for $n \le c$, c a constant: computations have to be finite. What can make a result coming from analytic number theory be valid only for $n \ge C$?

    An analytic proof, generally speaking, gives us more than just existence. In this kind of problem, it gives us more than the possibility of doing something (here, writing an integer n as the sum of three primes). It gives us a rigorous estimate for the number of ways in which this something is possible; that is, it shows us that this number of ways equals

    \[ \text{main term} + \text{error term}, \tag{1.1} \]

    where the main term is a precise quantity $f(n)$, and the error term is something whose absolute value is at most another precise quantity $g(n)$. If $f(n) > g(n)$, then (1.1) is non-zero, i.e., we will have shown the existence of a way to write our number as the sum of three primes.

    (Since what we truly care about is existence, we are free to weigh different ways of writing n as the sum of three primes however we wish – that is, we can decide that some primes “count” twice or thrice as much as others, and that some do not count at all.)

    Typically, after much work, we succeed in obtaining (1.1) with $f(n)$ and $g(n)$ such that $f(n) > g(n)$ asymptotically, that is, for n large enough. To give a highly simplified example: if, say, $f(n) = n^2$ and $g(n) = 100 n^{3/2}$, then $f(n) > g(n)$ for $n > C$, where $C = 10^4$, and so the number of ways (1.1) is positive for $n > C$.
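    As a quick check of the threshold in this simplified example, divide both sides by $n^{3/2}$ (for $n > 0$):

```latex
\[
  f(n) > g(n)
  \iff n^{2} > 100\, n^{3/2}
  \iff n^{1/2} > 100
  \iff n > 10^{4} .
\]
```

    At $n = 10^4$ itself the two sides are equal ($10^8 = 100 \cdot 10^6$), which is why the inequality is strict and the constant is exactly $C = 10^4$.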

    We want a moderate value of C, that is, a C small enough that all cases $n \le C$ can be checked computationally. To ensure this, we must make the error term bound $g(n)$ as small as possible. This is our main task. A secondary (and sometimes neglected) possibility is to rig the weights so as to make the main term $f(n)$ larger in comparison to $g(n)$; this can generally be done only up to a certain point, but is nonetheless very helpful.

    As we said, the first unconditional proof that odd numbers $n \ge C$ can be written as the sum of three primes is due to Vinogradov. Analytic bounds fall into several categories, or stages; quite often, successive versions of the same theorem will go through successive stages.

    1. An ineffective result shows that a statement is true for some constant C, but gives no way to determine what the constant C might be. Vinogradov’s first proof of his theorem (in [Vin37]) is like this: it shows that there exists a constant C such that every odd number $n > C$ is the sum of three primes, yet gives us no hope of finding out what the constant C might be.² Many proofs of Vinogradov’s result in textbooks are also of this type.

    2. An effective, but not explicit, result shows that a statement is true for some unspecified constant C in a way that makes it clear that a constant C could in principle be determined following and reworking the proof with great care. Vinogradov’s later proof ([Vin47], translated in [Vin54]) is of this nature. As Chudakov [Chu47, §IV.2] pointed out, the improvement on [Vin37] given by Mardzhanishvili [Mar41] already had the effect of making the result effective.³

    3. An explicit result gives a value of C. According to [Chu47, p. 201], the first explicit version of Vinogradov’s result was given by Borozdkin in his unpublished doctoral dissertation, written under the direction of Vinogradov (1939): $C = \exp(\exp(\exp(41.96)))$. Such a result is, by definition, also effective. Borozdkin later [Bor56] gave the value $C = e^{e^{16.038}}$, though he does not seem to have published the proof. The best – that is, smallest – value of C known before the present work was that of Liu and Wang [LW02]: $C = 2 \cdot 10^{1346}$.

    4. What we may call an efficient proof gives a reasonable value for C – in our case, a value small enough that checking all cases up to C is feasible.

    How far were we from an efficient proof? That is, what sort of computation could ever be feasible? The situation was paradoxical: the conjecture was known above an explicit C, but $C = 2 \cdot 10^{1346}$ is so large that it could not be said that the problem could be attacked by any foreseeable computational means within our physical universe. (A truly brute-force verification up to C takes at least C steps; a cleverer verification takes well over $\sqrt{C}$ steps. The number of picoseconds since the beginning of the universe is less than $10^{30}$, whereas the number of protons in the observable universe is currently estimated at $\sim 10^{80}$ [Shu92]; this limits the number of steps that can be taken in any currently imaginable computer, even if it were to do parallel processing on an astronomical scale.) Thus, the only way forward was a series of drastic improvements in the mathematical, rather than computational, side.
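    The orders of magnitude in the parenthesis can be compared with exact integer arithmetic. The sketch below rests on a deliberately generous assumption that is not in the text: every proton performs one step per picosecond for the whole age of the universe. The names `STEP_BUDGET`, `feasible` and `decimal_digits` are invented for the illustration.

```python
import math

C_LIU_WANG = 2 * 10**1346      # explicit constant of Liu and Wang [LW02]
PICOSECONDS_BOUND = 10**30     # bound on elapsed picoseconds quoted in the text
PROTONS_ESTIMATE = 10**80      # estimate for the number of protons quoted in the text

# Assumed budget: one step per proton per picosecond.
STEP_BUDGET = PICOSECONDS_BOUND * PROTONS_ESTIMATE


def decimal_digits(n):
    """Number of decimal digits of a positive integer."""
    return len(str(n))


def feasible(steps):
    """True if `steps` fits within the assumed budget."""
    return steps <= STEP_BUDGET


if __name__ == "__main__":
    clever = math.isqrt(C_LIU_WANG)  # lower bound for a "cleverer" verification
    print("digits of sqrt(C):     ", decimal_digits(clever))
    print("digits of step budget: ", decimal_digits(STEP_BUDGET))
    print("sqrt(C) feasible?      ", feasible(clever))
    print("10^27 feasible?        ", feasible(10**27))
```

    Even under this assumption, the square root of the older constant exceeds the budget by more than 560 orders of magnitude, whereas a bound of the size $10^{27}$ falls far inside it.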

    I gave a proof with $C = 10^{29}$ in May 2013. Since D. Platt and I had verified the conjecture for all odd numbers up to $n \le 8.8 \cdot 10^{30}$ by computer [HP13], this established the conjecture for all odd numbers n.

    (In December 2013, I reduced C to $10^{27}$. The verification of the ternary Goldbach conjecture up to $n \le 10^{27}$ can be done on a home computer over a weekend, as of the time of writing (2014). It must be said that this uses the verification of the binary Goldbach conjecture for $n \le 4 \cdot 10^{18}$ [OeSHP14], which itself required computational resources far outside the home-computing range. Checking the conjecture up to $n \le 10^{27}$ was not even the main computational task that needed to be accomplished to establish the Main Theorem – that task was the finite verification of zeros of L-functions in [Plab], a general-purpose computation that should be useful elsewhere.)

    ² Here, as is often the case in ineffective results in analytic number theory, the underlying issue is that of Siegel zeros, which are believed not to exist, but have not been shown not to; the strongest bounds on (i.e., against) such zeros are ineffective, and so are all of the many results using such estimates.

    ³ The proof in [Mar41] combined the bounds in [Vin37] with a more careful accounting of the effect of the single possible Siegel zero within range.

    What was the strategy of the proof? The basic framework is the one pioneered by Hardy and Littlewood for a variety of problems – namely, the circle method, which, as we shall see, is an application of Fourier analysis over $\mathbb{Z}$. (There are other, later routes to Vinogradov’s result; see [HB85], [FI98] and especially the recent work [Sha14], which avoids using anything about zeros of L-functions inside the critical strip.) Vinogradov’s proof, like much of the later work on the subject, was based on a detailed analysis of exponential sums, i.e., Fourier transforms over $\mathbb{Z}$. So is the proof that we will sketch.

    At the same time, the distance between $2 \cdot 10^{1346}$ and $10^{27}$ is such that we cannot hope to get to $10^{27}$ (or any other reasonable constant) by fine-tuning previous work. Rather, we must work from scratch, using the basic outline in Vinogradov’s original proof and other, initially unrelated, developments in analysis and number theory (notably, the large sieve). Merely improving constants will not do; rather, we must do qualitatively better than previous work (by non-constant factors) if we are to have any chance to succeed. It is on these qualitative improvements that we will focus.

    It is only fair to review some of the progress made between Vinogradov’s time and ours. Here we will focus on results; later, we will discuss some of the progress made in the techniques of proof. See [Dic66, Ch. XVIII] for the early history of the problem (before Hardy and Littlewood); see R. Vaughan’s ICM lecture notes on the ternary Goldbach problem [Vau80] for some further details on the history up to 1978.

    In 1933, Schnirelmann proved [Sch33] that every integer $n > 1$ can be written as the sum of at most K primes for some unspecified constant K. (This pioneering work is now considered to be part of the early history of additive combinatorics.) In 1969, Klimov gave an explicit value for K (namely, $K = 6 \cdot 10^9$); he later improved the constant to K = 115 (with G. Z. Piltay and T. A. Sheptickaja) and K = 55. Later, there were results by Vaughan [Vau77a] (K = 27), Deshouillers [Des77] (K = 26) and Riesel-Vaughan [RV83] (K = 19).

    Ramaré showed in 1995 that every even number $n > 1$ can be written as the sum of at most 6 primes [Ram95]. In 2012, Tao proved [Tao14] that every odd number $n > 1$ is the sum of at most 5 primes.

    There have been other avenues of attack towards the strong conjecture. Using ideas close to those of Vinogradov, Chudakov [Chu37], [Chu38], Estermann [Est37] and van der Corput [van37] proved (independently from each other) that almost every even number (meaning: all elements of a subset of density 1 in the even numbers) can be written as the sum of two primes. In 1973, J.-R. Chen showed [Che73] that every even number n larger than a constant C can be written as the sum of a prime number and the product of at most two primes ($n = p_1 + p_2$ or $n = p_1 + p_2 p_3$). Incidentally, J.-R. Chen himself, together with T.-Z. Wang, was responsible for the best bounds on C (for ternary Goldbach) before Liu and Wang: $C = \exp(\exp(11.503)) < 4 \cdot 10^{43000}$ [CW89] and $C = \exp(\exp(9.715)) < 6 \cdot 10^{7193}$ [CW96].

    Matters are different if one assumes the Generalized Riemann Hypothesis (GRH). A careful analysis [Eff99] of Hardy and Littlewood’s work [HL22] gives that every odd number $n \ge 1.24 \cdot 10^{50}$ is the sum of three primes if GRH is true.⁴ According to [Eff99], the same statement with $n \ge 10^{32}$ was proven in the unpublished doctoral dissertation of B. Lucke, a student of E. Landau’s, in 1926. Zinoviev [Zin97] improved this to $n \ge 10^{20}$. A computer check ([DEtRZ97]; see also [Sao98]) showed that the conjecture is true for $n < 10^{20}$, thus completing the proof of the ternary Goldbach conjecture under the assumption of GRH. What was open until now was, of course, the problem of giving an unconditional proof.

    1.2 The circle method: Fourier analysis on Z

    It is common for a first course on Fourier analysis to focus on functions over the reals satisfying $f(x) = f(x + 1)$, or, what is the same, functions $f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$. Such a function (unless it is fairly pathological) has a Fourier series converging to it; this is just the same as saying that f has a Fourier transform $\hat{f} : \mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(n) = \int_{\mathbb{R}/\mathbb{Z}} f(\alpha) e(-\alpha n)\, d\alpha$ and satisfying $f(\alpha) = \sum_{n \in \mathbb{Z}} \hat{f}(n) e(\alpha n)$ (Fourier inversion theorem), where $e(t) = e^{2\pi i t}$.

    In number theory, we are especially interested in functions $f : \mathbb{Z} \to \mathbb{C}$. Then things are exactly the other way around: provided that f decays reasonably fast as $n \to \pm\infty$ (or becomes 0 for n large enough), f has a Fourier transform $\hat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ and satisfying $f(n) = \int_{\mathbb{R}/\mathbb{Z}} \hat{f}(\alpha) e(\alpha n)\, d\alpha$. (Highbrow talk: we already knew that $\mathbb{Z}$ is the Fourier dual of $\mathbb{R}/\mathbb{Z}$, and so, of course, $\mathbb{R}/\mathbb{Z}$ is the Fourier dual of $\mathbb{Z}$.) “Exponential sums” (or “trigonometrical sums”, as in the title of [Vin54]) are sums of the form $\sum_n f(n) e(-\alpha n)$; of course, the “circle” in “circle method” is just a name for $\mathbb{R}/\mathbb{Z}$. (To see an actual circle in the complex plane, look at the image of $\mathbb{R}/\mathbb{Z}$ under the map $\alpha \mapsto e(\alpha)$.)
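    For a finitely supported f, both formulas can be checked numerically. The following is a small sketch under stated assumptions: f is stored as a dictionary from integers to values, and the integral over $\mathbb{R}/\mathbb{Z}$ is replaced by a Riemann sum over the points $\alpha = k/N$; the names `fourier_transform` and `invert` are illustrative only.

```python
import cmath


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def fourier_transform(f, alpha):
    """hat f(alpha) = sum_n f(n) e(-alpha n), for f given as a dict {n: f(n)}."""
    return sum(value * e(-alpha * n) for n, value in f.items())


def invert(f, n, num_points):
    """Approximate f(n) = integral over R/Z of hat f(alpha) e(alpha n) d alpha.

    The integral is replaced by the average over alpha = k/num_points.
    Since sum_k e(k m / N) vanishes unless N divides m, the average is exact
    (up to rounding) when n and the support of f lie in [0, num_points).
    """
    total = sum(fourier_transform(f, k / num_points) * e(k * n / num_points)
                for k in range(num_points))
    return total / num_points


if __name__ == "__main__":
    f = {2: 1.0, 3: 1.0, 5: 1.0, 7: 1.0}  # indicator of the primes below 10
    for n in range(10):
        print(n, round(invert(f, n, 16).real, 12))
```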

    The study of the Fourier transform $\hat{f}$ is relevant to additive problems in number theory, i.e., questions on the number of ways of writing n as a sum of k integers of a particular form. Why? One answer could be that $\hat{f}$ gives us information about the “randomness” of f; if f were the characteristic function of a random set, then $\hat{f}(\alpha)$ would be very small outside a sharp peak at $\alpha = 0$.

    We can also give a more concrete and immediate answer. Recall that, in general, the Fourier transform of a convolution equals the product of the transforms; over $\mathbb{Z}$, this means that for the additive convolution

    \[ (f * g)(n) = \sum_{\substack{m_1, m_2 \in \mathbb{Z} \\ m_1 + m_2 = n}} f(m_1) g(m_2), \]

    the Fourier transform satisfies the simple rule

    \[ \widehat{f * g}(\alpha) = \hat{f}(\alpha) \cdot \hat{g}(\alpha). \]

    We can see right away from this that $(f * g)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2$ for some $m_1$, $m_2$ such that $f(m_1)$ and $g(m_2)$ are non-zero. Similarly, $(f * g * h)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2 + m_3$ for some $m_1$, $m_2$, $m_3$ such that $f(m_1)$, $g(m_2)$ and $h(m_3)$ are all non-zero. This suggests that, to study the ternary Goldbach problem, we define $f_1, f_2, f_3 : \mathbb{Z} \to \mathbb{C}$ so that they take non-zero values only at the primes.

    ⁴ In fact, Hardy, Littlewood and Effinger use an assumption somewhat weaker than GRH: they assume that Dirichlet L-functions have no zeroes satisfying $\Re(s) \ge \theta$, where $\theta < 3/4$ is arbitrary. (We will review Dirichlet L-functions in a minute.)

    Hardy and Littlewood defined $f_1(n) = f_2(n) = f_3(n) = 0$ for n non-prime (and also for $n \le 0$), and $f_1(n) = f_2(n) = f_3(n) = (\log n) e^{-n/x}$ for n prime (where x is a parameter to be fixed later). Here the factor $e^{-n/x}$ is there to provide “fast decay”, so that everything converges; as we will see later, Hardy and Littlewood’s choice of $e^{-n/x}$ (rather than some other function of fast decay) comes across in hindsight as being very clever, though not quite best-possible. (Their “choice” was, to some extent, not a choice, but an artifact of their version of the circle method, which was framed in terms of power series, not in terms of exponential sums with arbitrary smoothing functions.) The term $\log n$ is there for technical reasons – in essence, it makes sense to put it there because a random integer around n has a chance of about $1/(\log n)$ of being prime.

    We can see that $(f_1 * f_2 * f_3)(n) \ne 0$ if and only if n can be written as the sum of three primes. Our task is then to show that $(f_1 * f_2 * f_3)(n)$ (i.e., $(f * f * f)(n)$) is non-zero for every n larger than a constant $C \sim 10^{27}$. Since the transform of a convolution equals a product of transforms,

    \[ (f_1 * f_2 * f_3)(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f_1 * f_2 * f_3}(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.2} \]

    Our task is thus to show that the integral $\int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha$ is non-zero.
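    Identity (1.2) can be tested on a small scale with the Hardy–Littlewood weights. The sketch below makes two assumptions that are not in the text: the weights are truncated at the primes up to a bound `limit` (which leaves $(f * f * f)(n)$ unchanged for $n \le$ `limit`), and the integral is replaced by a Riemann sum over $\alpha = k/N$ with N larger than three times the largest prime used. The parameter value x = 50 and all function names are chosen only for illustration.

```python
import cmath
import math


def primes_up_to(limit):
    """List the primes <= limit by trial division (fine for small limits)."""
    return [n for n in range(2, limit + 1)
            if all(n % d for d in range(2, math.isqrt(n) + 1))]


def hl_weights(limit, x):
    """f(n) = (log n) * exp(-n/x) at primes n <= limit; f vanishes elsewhere."""
    return {p: math.log(p) * math.exp(-p / x) for p in primes_up_to(limit)}


def triple_convolution(f, n):
    """(f*f*f)(n) straight from the definition of additive convolution."""
    return sum(f[a] * f[b] * f.get(n - a - b, 0.0) for a in f for b in f)


def triple_via_transform(f, n, num_points):
    """Right-hand side of (1.2), with the integral replaced by a Riemann sum
    over alpha = k/num_points. The sum is exact up to rounding when
    num_points > 3 * max(f) and 0 <= n < num_points."""
    total = 0j
    for k in range(num_points):
        alpha = k / num_points
        f_hat = sum(v * cmath.exp(-2j * cmath.pi * alpha * m) for m, v in f.items())
        total += f_hat ** 3 * cmath.exp(2j * cmath.pi * alpha * n)
    return total / num_points


if __name__ == "__main__":
    f = hl_weights(limit=60, x=50.0)
    for n in (5, 31, 32):
        direct = triple_convolution(f, n)
        via = triple_via_transform(f, n, num_points=200)
        print(n, direct, abs(via - direct))
```

    The value at n = 5 is zero, since 5 is not a sum of three primes, while the value at n = 31 is positive; in both cases the two computations agree up to rounding.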

    As it happens, $\hat{f}(\alpha)$ is particularly large when $\alpha$ is close to a rational with small denominator. Moreover, for such $\alpha$, it turns out we can actually give rather precise estimates for $\hat{f}(\alpha)$. Define $\mathfrak{M}$ (called the set of major arcs) to be a union of narrow arcs around the rationals with small denominator:

    \[ \mathfrak{M} = \bigcup_{q \le r}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left( \frac{a}{q} - \frac{1}{qQ},\ \frac{a}{q} + \frac{1}{qQ} \right), \]

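    where $Q$ is a constant times $x/r$, and $r$ will be set later. (This is a slight simplification: the major-arc set we will actually use in the course of the proof will be a little different, due to a distinction between odd and even $q$.) We can write

    \[
    \int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathfrak{M}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha + \int_{\mathfrak{m}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha)\, e(\alpha n)\, d\alpha, \tag{1.3}
    \]

    where $\mathfrak{m}$ is the complement $(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}$ (called minor arcs).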
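To make the definition of the major arcs concrete, the following sketch decides whether a given $\alpha$ lies in the union of arcs $(a/q - 1/(qQ),\, a/q + 1/(qQ))$ with $q \le r$ and $(a,q)=1$. The function name and the tiny values of $r$ and $Q$ in the usage lines are assumptions for illustration only; the brute-force search is quadratic in $r$ and is not meant for $r$ of realistic size.

```python
from fractions import Fraction
from math import gcd

def major_arc_center(alpha, r, Q):
    """Return a/q if alpha mod 1 lies in (a/q - 1/(qQ), a/q + 1/(qQ))
    for some q <= r with (a, q) = 1; return None if alpha is on the minor arcs."""
    alpha = Fraction(alpha) % 1
    for q in range(1, r + 1):
        for a in range(q + 1):  # a = q covers the arc around 1, i.e. around 0 mod 1
            if gcd(a, q) != 1:
                continue
            if abs(alpha - Fraction(a, q)) < Fraction(1, q * Q):
                return Fraction(a, q)
    return None

print(major_arc_center(Fraction(1, 3) + Fraction(1, 10000), r=5, Q=1000))  # 1/3
print(major_arc_center(Fraction(2, 7), r=5, Q=1000))                       # None
```

Exact rational arithmetic is used so that membership near the endpoints of an arc is not affected by rounding.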

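    Now, we simply do not know how to give precise estimates for $\hat{f}(\alpha)$ when $\alpha$ is in $\mathfrak{m}$. However, as Vinogradov realized, one can give reasonable upper bounds on $|\hat{f}(\alpha)|$ for $\alpha \in \mathfrak{m}$. This suggests the following strategy: show that

    \[
    \int_{\mathfrak{m}} |\hat{f}_1(\alpha)|\,|\hat{f}_2(\alpha)|\,|\hat{f}_3(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat{f}_1(\alpha) \hat{f}_2(\alpha) \hat{f}_3(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.4}
    \]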

    By (1.2) and (1.3), this will imply immediately that $(f_1 * f_2 * f_3)(n) > 0$, and so we will be done.

    The name of circle method is given to the study of additive problems by means of Fourier analysis over $\mathbb{Z}$, and, in particular, to the use of a subdivision of the circle $\mathbb{R}/\mathbb{Z}$ into major and minor arcs to estimate the integral of a Fourier transform. There was a "circle" already in Hardy and Ramanujan's work [HR00], but the subdivision into major and minor arcs is due to Hardy and Littlewood, who also applied their method to a wide variety of additive problems. (Hence "the Hardy-Littlewood method" as an alternative name for the circle method.) For instance, before working on the ternary Goldbach conjecture, they studied the question of whether every $n > C$ can be written as the sum of $k$th powers (Waring's problem). In fact, they used a subdivision into major and minor arcs to study Waring's problem, and not for the ternary Goldbach problem: they had no minor-arc bounds for ternary Goldbach, and their use of GRH had the effect of making every $\alpha \in \mathbb{R}/\mathbb{Z}$ yield to a major-arc treatment.

    Vinogradov worked with finite exponential sums, i.e., $f_i$ compactly supported. From today's perspective, it is clear that there are applications (such as ours) in which it can be more important for $f_i$ to be smooth than compactly supported; still, Vinogradov's simplifications were an incentive to further developments. In the case of the ternary Goldbach problem, his key contribution consisted in the fact that he could give bounds on $\hat{f}(\alpha)$ for $\alpha$ in the minor arcs without using GRH.

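    An important note: in the case of the binary Goldbach conjecture, the method fails at (1.4), and not before; if our understanding of the actual value of $\hat{f}_i(\alpha)$ is at all correct, it is simply not true in general that

    \[
    \int_{\mathfrak{m}} |\hat{f}_1(\alpha)|\,|\hat{f}_2(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat{f}_1(\alpha) \hat{f}_2(\alpha)\, e(\alpha n)\, d\alpha.
    \]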

    Let us see why this is not surprising. Set $f_1 = f_2 = f_3 = f$ for simplicity, so that we have the integral of the square $(\hat{f}(\alpha))^2$ for the binary problem, and the integral of the cube $(\hat{f}(\alpha))^3$ for the ternary problem. Squaring, like cubing, amplifies the peaks of $\hat{f}(\alpha)$, which are at the rationals of small denominator and their immediate neighborhoods (the major arcs); however, cubing amplifies the peaks much more than squaring. This is why, even though the arcs making up $\mathfrak{M}$ are very narrow, $\int_{\mathfrak{M}} (\hat{f}(\alpha))^3 e(\alpha n)\, d\alpha$ is larger than $\int_{\mathfrak{m}} |\hat{f}(\alpha)|^3\, d\alpha$; that explains the name major arcs: they are not large, but they give the major part of the contribution. In contrast, squaring amplifies the peaks less, and this is why the absolute value of $\int_{\mathfrak{M}} \hat{f}(\alpha)^2 e(\alpha n)\, d\alpha$ is in general smaller than $\int_{\mathfrak{m}} |\hat{f}(\alpha)|^2\, d\alpha$. As nobody knows how to prove a precise estimate (and, in particular, lower bounds) on $\hat{f}(\alpha)$ for $\alpha \in \mathfrak{m}$, the binary Goldbach conjecture is still very much out of reach.

    To prove the ternary Goldbach conjecture, it is enough to estimate both sides of (1.4) for carefully chosen $f_1$, $f_2$, $f_3$, and compare them. This is our task from now on.

    1.3 The major arcs $\mathfrak{M}$

    1.3.1 What do we really know about L-functions and their zeros?

    Before we start, let us give a very brief review of basic analytic number theory (in the sense of, say, [Dav67]). A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (In other words, $\chi(n) = \chi(n+q)$ for all $n$, $\chi(ab) = \chi(a)\chi(b)$ for all $a$, $b$ and $\chi(n) = 0$ for $(n,q) \ne 1$.) A Dirichlet L-series is defined by

    \[
    L(s,\chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}
    \]

    for $\Re(s) > 1$, and by analytic continuation for $\Re(s) \le 1$. (The Riemann zeta function $\zeta(s)$ is the L-function for the trivial character, i.e., the character $\chi$ such that $\chi(n) = 1$ for all $n$.) Taking logarithms and then derivatives, we see that

    \[
    -\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^{\infty} \chi(n) \Lambda(n) n^{-s} \tag{1.5}
    \]

    for $\Re(s) > 1$, where $\Lambda$ is the von Mangoldt function ($\Lambda(n) = \log p$ if $n$ is some prime power $p^\alpha$, $\alpha \ge 1$, and $\Lambda(n) = 0$ otherwise).

    Dirichlet introduced his characters and L-series so as to study primes in arithmetic progressions. In general, and after some work, (1.5) allows us to restate many sums over the primes (such as our Fourier transforms $\hat{f}(\alpha)$) as sums over the zeros of $L(s,\chi)$.

    A non-trivial zero of $L(s,\chi)$ is a zero of $L(s,\chi)$ such that $0 < \Re(s) < 1$. (The other zeros are called trivial because we know where they are, namely, at negative integers and, in some cases, also on the line $\Re(s) = 0$. In order to eliminate all zeros on $\Re(s) = 0$ outside $s = 0$, it suffices to assume that $\chi$ is primitive; a primitive character modulo $q$ is one that is not induced by (i.e., not the restriction of) any character modulo $d | q$, $d < q$.)

    The Generalized Riemann Hypothesis for Dirichlet L-functions is the statement that, for every Dirichlet character $\chi$, every non-trivial zero of $L(s,\chi)$ satisfies $\Re(s) = 1/2$. Of course, the Generalized Riemann Hypothesis (GRH), and the Riemann Hypothesis, which is the special case of $\chi$ trivial, remains unproven. Thus, if we want to prove unconditional statements, we need to make do with partial results towards GRH. Two kinds of such results have been proven:


    • Zero-free regions. Ever since the late nineteenth century (Hadamard, de la Vallée-Poussin) we have known that there are hourglass-shaped regions (more precisely, of the shape $\frac{c}{\log t} \le \sigma \le 1 - \frac{c}{\log t}$, where $c$ is a constant and where we write $s = \sigma + it$) outside which non-trivial zeros cannot lie. Explicit values for $c$ are known [McC84b], [Kad05], [Kad]. There is also the Vinogradov-Korobov region [Kor58], [Vin58], which is broader asymptotically but narrower in most of the practical range (see [For02], however).

    • Finite verifications of GRH. It is possible to (ask a computer to) prove small, finite fragments of GRH, in the sense of verifying that all non-trivial zeros of a given finite set of L-functions with imaginary part less than some constant $H$ lie on the critical line $\Re(s) = 1/2$. Such verifications go back to Riemann, who checked the first few zeros of $\zeta(s)$. Large-scale, rigorous computer-based verifications are now a possibility.

    Most work in the literature follows the first alternative, though [Tao14] did use a finite verification of RH (i.e., GRH for the trivial character). Unfortunately, zero-free regions seem too narrow to be useful for the ternary Goldbach problem. Thus, we are left with the second alternative.

    In coordination with the present work, Platt [Plab] verified that all zeros $s$ of L-functions for characters $\chi$ with modulus $q \le 300000$, satisfying $\Im(s) \le H_q$, lie on the line $\Re(s) = 1/2$, where

    • $H_q = 10^8/q$ for $q$ odd, and

    • $H_q = \max(10^8/q,\; 200 + 7.5 \cdot 10^7/q)$ for $q$ even.
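The two cases can be tabulated with a small helper. This is only a sketch of the formula as read here, $H_q = 10^8/q$ for odd $q$ and $H_q = \max(10^8/q,\, 200 + 7.5 \cdot 10^7/q)$ for even $q$; the function name `platt_height` is hypothetical.

```python
def platt_height(q):
    """Height H_q up to which GRH was checked for modulus q, per the formula quoted above."""
    if not 1 <= q <= 300000:
        raise ValueError("the verification quoted covers only 1 <= q <= 300000")
    if q % 2 == 1:
        return 1e8 / q
    return max(1e8 / q, 200 + 7.5e7 / q)

for q in (1, 2, 125000, 299999, 300000):
    print(q, platt_height(q))
```

For even $q$ the second expression takes over once $2.5 \cdot 10^7/q < 200$, that is, for $q > 125000$, so $H_q$ never drops below $450$ for even moduli in this range.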

    This was a medium-large computation, taking a few hundreds of thousands of core-hours on a parallel computer. It used interval arithmetic for the sake of rigor; we will later discuss what this means.

    The choice to use a finite verification of GRH, rather than zero-free regions, had consequences on the manner in which the major and minor arcs had to be chosen. As we shall see, such a verification can be used to give very precise bounds on the major arcs, but also forces us to define them so that they are narrow and their number is constant. To be precise: the major arcs were defined around rationals $a/q$ with $q \le r$, $r = 300000$; moreover, as will become clear, the fact that $H_q$ is finite will force their width to be bounded by $c_0 r/(qx)$, where $c_0$ is a constant (say $c_0 = 8$).

    1.3.2 Estimates of $\hat{f}(\alpha)$ for $\alpha$ in the major arcs

    Recall that we want to estimate sums of the type $\hat{f}(\alpha) = \sum f(n) e(-\alpha n)$, where $f(n)$ is something like $(\log n)\,\eta(n/x)$ for $n$ equal to a prime, and $0$ otherwise; here $\eta : \mathbb{R} \to \mathbb{C}$ is some function of fast decay, such as Hardy and Littlewood's choice,

    \[
    \eta(t) = \begin{cases} e^{-t} & \text{for } t \ge 0, \\ 0 & \text{for } t < 0. \end{cases}
    \]


    Let us modify this just a little: we will actually estimate

    \[
    S_\eta(\alpha, x) = \sum \Lambda(n)\, e(\alpha n)\, \eta(n/x), \tag{1.6}
    \]

    where $\Lambda$ is the von Mangoldt function (as in (1.5)). The use of $\alpha$ rather than $-\alpha$ is just a bow to tradition, as is the use of the letter $S$ (for "sum"); however, the use of $\Lambda(n)$ rather than just plain $\log p$ does actually simplify matters.

    The function $\eta$ here is sometimes called a smoothing function or simply a smoothing. It will indeed be helpful for it to be smooth on $(0, \infty)$, but, in principle, it need not even be continuous. (Vinogradov's work implicitly uses, in effect, the "brutal truncation" $1_{[0,1]}(t)$, defined to be $1$ when $t \in [0,1]$ and $0$ otherwise; that would be fine for the minor arcs, but, as it will become clear, it is a bad idea as far as the major arcs are concerned.)

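    Assume $\alpha$ is on a major arc, meaning that we can write $\alpha = a/q + \delta/x$ for some $a/q$ ($q$ small) and some $\delta$ (with $|\delta|$ small). We can write $S_\eta(\alpha, x)$ as a linear combination

    \[
    S_\eta(\alpha, x) = \sum_\chi c_\chi S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) + \text{tiny error term}, \tag{1.7}
    \]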

    where

    \[
    S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = \sum \Lambda(n) \chi(n)\, e(\delta n/x)\, \eta(n/x). \tag{1.8}
    \]

    In (1.7), $\chi$ runs over primitive Dirichlet characters of moduli $d | q$, and $c_\chi$ is small ($|c_\chi| \le \sqrt{d}/\phi(q)$).

    Why are we expressing the sums $S_\eta(\alpha, x)$ in terms of the sums $S_{\eta,\chi}(\delta/x, x)$, which look more complicated? The argument has become $\delta/x$, whereas before it was $\alpha$. Here $\delta$ is relatively small: smaller than the constant $c_0 r$, in our setup. In other words, $e(\delta n/x)$ will go around the circle a bounded number of times as $n$ goes from $1$ up to a constant times $x$ (by which time $\eta(n/x)$ has become small, because $\eta$ is of fast decay). This makes the sums much easier to estimate.

    To estimate the sums $S_{\eta,\chi}$, we will use L-functions, together with one of the most common tools of analytic number theory, the Mellin transform. This transform is essentially a Laplace transform with a change of variables, and a Laplace transform, in turn, is a Fourier transform taken on a vertical line in the complex plane. For $f$ of fast enough decay, the Mellin transform $F = Mf$ of $f$ is given by

    \[
    F(s) = \int_0^\infty f(t)\, t^s\, \frac{dt}{t};
    \]

    we can express $f$ in terms of $F$ by the Mellin inversion formula

    \[
    f(t) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} F(s)\, t^{-s}\, ds
    \]

    for any $\sigma$ within an interval. We can thus express $e(\delta t)\eta(t)$ in terms of its Mellin transform $F_\delta$ and then use (1.5) to express $S_{\eta,\chi}$ in terms of $F_\delta$ and $L'(s,\chi)/L(s,\chi)$; shifting the integral in the Mellin inversion formula to the left, we obtain what is known in analytic number theory as an explicit formula:

    \[
    S_{\eta,\chi}(\delta/x, x) = \left[\hat{\eta}(-\delta)\, x\right] - \sum_\rho F_\delta(\rho)\, x^\rho + \text{tiny error term}.
    \]

    Here the term between brackets appears only for $\chi$ trivial. In the sum, $\rho$ goes over all non-trivial zeros of $L(s,\chi)$, and $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$. (The tiny error term comes from a sum over the trivial zeros of $L(s,\chi)$.) We will obtain the estimate we desire if we manage to show that the sum over $\rho$ is small.

    The point is this: if we verify GRH for $L(s,\chi)$ up to imaginary part $H$, i.e., if we check that all zeroes $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le H$ satisfy $\Re(\rho) = 1/2$, we have $|x^\rho| = \sqrt{x}$ for every such $\rho$. In other words, $x^\rho$ is very small (compared to $x$). However, for any $\rho$ whose imaginary part has absolute value greater than $H$, we know next to nothing about its real part, other than $0 \le \Re(\rho) \le 1$. (Zero-free regions are notoriously weak for $\Im(\rho)$ large; we will not use them.) Hence, our only chance is to make sure that $F_\delta(\rho)$ is very small when $|\Im(\rho)| \ge H$.

    This has to be true for both $\delta$ very small (including the case $\delta = 0$) and for $\delta$ not so small ($|\delta|$ up to $c_0 r/q$, which can be large because $r$ is a large constant). How can we choose $\eta$ so that $F_\delta(\rho)$ is very small in both cases for $\tau = \Im(\rho)$ large?

    The method of stationary phase is useful as an exploratory tool here. In brief, it suggests (and can sometimes prove) that the main contribution to the integral

    \[
    F_\delta(s) = \int_0^\infty e(\delta t)\, \eta(t)\, t^s\, \frac{dt}{t} \tag{1.9}
    \]

    can be found where the phase of the integrand has derivative $0$. This happens when $t = -\tau/(2\pi\delta)$ (for $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$); the contribution is then a moderate factor times $\eta(-\tau/(2\pi\delta))$. In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small ($|\delta| \ge 8$, say), $F_\delta(\sigma + i\tau)$ behaves like $\eta(-\tau/(2\pi\delta))$; if $\delta$ is small ($|\delta| < 8$), then $F_\delta$ behaves like $F_0$, which is the Mellin transform $M\eta$ of $\eta$. Here is our goal, then: the decay of $\eta(t)$ as $|t| \to \infty$ should be as fast as possible, and the decay of the transform $M\eta(\sigma + i\tau)$ should also be as fast as possible.
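The location of the stationary point can be checked in one line. The computation below assumes, for simplicity, that $\eta$ is real and positive, so that it contributes nothing to the phase; it is a heuristic sketch, not part of the proof.

```latex
% Write s = \sigma + i\tau, so that t^{s-1} = t^{\sigma-1} e^{i\tau\log t}.
% The integrand of F_\delta is then \eta(t)\, t^{\sigma-1} e^{i\varphi(t)} with
\[
\varphi(t) = 2\pi\delta t + \tau \log t,
\qquad
\varphi'(t) = 2\pi\delta + \frac{\tau}{t}.
\]
% Setting \varphi'(t_0) = 0 gives
\[
t_0 = -\frac{\tau}{2\pi\delta},
\]
% which lies in the range of integration (0,\infty) exactly when
% \tau and \delta have opposite signs.
```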

    This is a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function $\eta$ that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.

    What does "extremely rapidly" mean here? It means (as Hardy himself proved) "faster than any exponential $e^{-Ct}$". Thus, Hardy and Littlewood's choice $\eta(t) = e^{-t}$ seems essentially optimal at first sight.

    However, it is not optimal. We can choose $\eta$ so that $M\eta$ decreases exponentially (with a constant $C$ somewhat worse than for $\eta(t) = e^{-t}$), but $\eta$ decreases faster than exponentially. This is a particularly appealing possibility because it is $t/|\delta|$, and not so much $t$, that risks being fairly small. (To be explicit: say we check GRH for characters of modulus $q$ up to $H_q \sim 50 \cdot c_0 r/q \ge 50|\delta|$. Then we only know that $|\tau/(2\pi\delta)| \gtrsim 8$. So, for $\eta(t) = e^{-t}$, $\eta(-\tau/(2\pi\delta))$ may be as large as $e^{-8}$, which is not negligible. Indeed, since this term will be multiplied later by other terms, $e^{-8}$ is simply not small enough. On the other hand, we can assume that $H_q \ge 200$ (say), and so $M\eta(s) \sim e^{-(\pi/2)|\tau|}$ is completely negligible, and will remain negligible even if we replace $\pi/2$ by a somewhat smaller constant.)
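The orders of magnitude in this remark are easy to compute. The sketch below takes the illustrative values from the text ($|\tau/(2\pi\delta)| \approx 8$ and $|\tau| = 200$) and, as an extra assumption, compares them with a weight that decays faster than exponentially, $e^{-t^2/2}$, evaluated at the same point.

```python
import math

# Illustrative values only: |tau / (2 pi delta)| is about 8, and |tau| = 200.
exp_weight = math.exp(-8)                       # eta(t) = e^{-t} at t = 8
gauss_weight = math.exp(-8 ** 2 / 2)            # eta(t) = e^{-t^2/2} at t = 8
mellin_decay = math.exp(-(math.pi / 2) * 200)   # e^{-(pi/2)|tau|} at |tau| = 200

print(f"e^-8        = {exp_weight:.3e}")
print(f"e^-32       = {gauss_weight:.3e}")
print(f"e^(-100 pi) = {mellin_decay:.3e}")
```

The first number is of size $10^{-4}$, which is why it cannot be ignored once multiplied by other factors, while the other two are far below any threshold that matters here.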

    We shall take $\eta(t) = e^{-t^2/2}$ (that is, the Gaussian). This is not the only possible choice, but it is in some sense natural. It is easy to show that the Mellin transform $F_\delta$ for $\eta(t) = e^{-t^2/2}$ is a multiple of what is called a parabolic cylinder function $U(a, z)$ with imaginary values for $z$. There are plenty of estimates on parabolic cylinder functions in the literature, but mostly for $a$ and $z$ real, in part because that is one of the cases occurring most often in applications. There are some asymptotic expansions and estimates for $U(a, z)$, $a$, $z$ general, due to Olver [Olv58], [Olv59], [Olv61], [Olv65], but unfortunately they come without fully explicit error terms for $a$ and $z$ within our range of interest. (The same holds for [TV03].)

    In the end, I derived bounds for $F_\delta$ using the saddle-point method. (The method of stationary phase, which we used to choose $\eta$, seems to lead to error terms that are too large.) The saddle-point method consists, in brief, in changing the contour of an integral to be bounded (in this case, (1.9)) so as to minimize the maximum of the integrand. (To use a metaphor in [dB81]: find the lowest mountain pass.)

    Here we strive to get clean bounds, rather than the best possible constants. Consider the case $k = 0$ of Corollary 8.0.2; it states the following. For $s = \sigma + i\tau$ with $\sigma \in [0, 1]$ and $|\tau| \ge \max(100, 4\pi^2|\delta|)$, we obtain that the Mellin transform $F_\delta$ of $\eta(t)e(\delta t)$ with $\eta(t) = e^{-t^2/2}$ satisfies

    \[
    |F_\delta(s+k)| + |F_\delta((1-s)+k)| \le
    \begin{cases}
    3.001\, e^{-0.1065 \left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\
    3.286\, e^{-0.1598|\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2.
    \end{cases}
    \tag{1.10}
    \]

    Similar bounds hold for $\sigma$ in other ranges, thus giving us estimates on the Mellin transform $F_\delta$ for $\eta(t) = t^k e^{-t^2/2}$ and $\sigma$ in the critical range $[0, 1]$. (We could do a little better if we knew the value of $\sigma$, but, in our applications, we do not, once we leave the range in which GRH has been checked. We will give a bound (Theorem 8.0.1) that does take $\sigma$ into account, and also reflects and takes advantage of the fact that there is a transitional region around $|\tau| \sim (3/2)(\pi\delta)^2$; in practice, however, we will use Cor. 8.0.2.)

    A moment's thought shows that we can also use (1.10) to deal with the Mellin transform of $\eta(t)e(\delta t)$ for any function of the form $\eta(t) = e^{-t^2/2} g(t)$ (or, more generally, $\eta(t) = t^k e^{-t^2/2} g(t)$), where $g(t)$ is any band-limited function. By a band-limited function, we could mean a function whose Fourier transform is compactly supported; while that is a plausible choice, it turns out to be better to work with functions that are band-limited with respect to the Mellin transform, in the sense of being of the form

    \[
    g(t) = \int_{-R}^{R} h(r)\, t^{-ir}\, dr,
    \]

    where $h : \mathbb{R} \to \mathbb{C}$ is supported on a compact interval $[-R, R]$, with $R$ not too large (say $R = 200$). What happens is that the Mellin transform of the product $e^{-t^2/2} g(t) e(\delta t)$ is a convolution of the Mellin transform $F_\delta(s)$ of $e^{-t^2/2} e(\delta t)$ (estimated in (1.10)) and that of $g(t)$ (supported in $[-R, R]$); the effect of the convolution is just to delay decay of $F_\delta(s)$ by, at most, a shift by $y \mapsto y - R$.

    We wish to estimate $S_{\eta,\chi}(\delta/x, x)$ for several functions $\eta$. This motivates us to derive an explicit formula (§) general enough to work with all the weights $\eta(t)$ we will work with, while being also completely explicit, and free of any integrals that may be tedious to evaluate.

    Once that is done, and once we consider the input provided by Platt's finite verification of GRH up to $H_q$, we obtain simple bounds for different weights.

    For $\eta(t) = e^{-t^2/2}$, $x \ge 10^8$, $\chi$ a primitive character of modulus $q \le r = 300000$, and any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$, we obtain

    \[
    S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = I_{q=1} \cdot \hat{\eta}(-\delta)\, x + E \cdot x, \tag{1.11}
    \]

    where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

    \[
    |E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left( \frac{650400}{\sqrt{q}} + 112 \right). \tag{1.12}
    \]

    Here $\hat{\eta}$ stands for the Fourier transform from $\mathbb{R}$ to $\mathbb{R}$ normalized as follows: $\hat{\eta}(t) = \int_{-\infty}^{\infty} e(-xt)\, \eta(x)\, dx$. Thus, $\hat{\eta}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ (self-duality of the Gaussian).
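The main term and the error bound for the Gaussian weight can be evaluated numerically. The sketch below reads the error bound as $4.306 \cdot 10^{-22} + x^{-1/2}\,(650400/\sqrt{q} + 112)$ and checks the closed form of $\hat{\eta}$ against a plain trapezoidal sum; the function names, the integration window and the sample values of $x$, $q$ and $\delta$ are assumptions made for illustration.

```python
import math

def gaussian_hat(delta):
    """Closed form of the Fourier transform of exp(-t^2/2) at -delta."""
    return math.sqrt(2 * math.pi) * math.exp(-2 * math.pi ** 2 * delta ** 2)

def gaussian_hat_numeric(delta, half_width=12.0, steps=24000):
    """Trapezoidal approximation of the integral of e(delta*t) * exp(-t^2/2) over R.
    The sine part cancels by symmetry, so only the cosine part is summed."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = -half_width + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.cos(2 * math.pi * delta * t) * math.exp(-t * t / 2)
    return total * h

def error_bound(x, q):
    """Upper bound on |E| as read above; stated in the text for x >= 1e8, q <= 300000."""
    return 4.306e-22 + (650400 / math.sqrt(q) + 112) / math.sqrt(x)

print(gaussian_hat(0.3), gaussian_hat_numeric(0.3))
print(error_bound(1e30, 1), error_bound(1e30, 300000))
```

For $x$ around $10^{30}$ the second summand still dominates the bound; the constant $4.306 \cdot 10^{-22}$ only becomes the larger term for much larger $x$.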

    This is one of the main results of Part II; see §7.1. Similar bounds are also proven there for $\eta(t) = t^2 e^{-t^2/2}$, as well as for a weight of type $\eta(t) = t e^{-t^2/2} g(t)$, where $g(t)$ is a band-limited function, and also for a weight $\eta$ defined by a multiplicative convolution. The conditions on $q$ (namely, $q \le r = 300000$) and $\delta$ are what we expected from the outset.

    Thus concludes our treatment of the major arcs. This is arguably the easiest part of the proof; it was actually what I left for the end, as I was fairly confident it would work out. Minor-arc estimates are more delicate; let us now examine them.

    1.4 The minor arcs $\mathfrak{m}$

    1.4.1 Qualitative goals and main ideas

    What kind of bounds do we need? What is there in the literature?

    We wish to obtain upper bounds on $|S_\eta(\alpha, x)|$ for some weight $\eta$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ not very close to a rational with small denominator. Every $\alpha$ is close to some rational $a/q$; what we are looking for is a bound on $|S_\eta(\alpha, x)|$ that decreases rapidly when $q$ increases.

    Moreover, we want our bound to decrease rapidly when $\delta$ increases, where $\alpha = a/q + \delta/x$. In fact, the main terms in our bound will be decreasing functions of $\max(1, |\delta|/8) \cdot q$. (Let us write $\delta_0 = \max(2, |\delta|/4)$ from now on.) This will allow our bound to be good enough outside narrow major arcs, which will get narrower and narrower as $q$ increases; that is precisely the kind of major arcs we were presupposing in our major-arc bounds.


    It would be possible to work with narrow major arcs that become narrower as $q$ increases simply by allowing $q$ to be very large (close to $x$), and assigning each angle to the fraction closest to it. This is, in fact, the common procedure. However, this makes matters more difficult, in that we would have to minimize at the same time the factors in front of terms $x/q$, $x/\sqrt{q}$, etc., and those in front of terms $q$, $\sqrt{qx}$, and so on. (These terms are being compared to the trivial bound $x$.) Instead, we choose to strive for a direct dependence on $\delta$ throughout; this will allow us to cap $q$ at a much lower level, thus making terms such as $q$ and $\sqrt{qx}$ negligible. (This choice has been taken elsewhere in applications of the circle method, but, strangely, seems absent from previous work on the ternary Goldbach conjecture.)

    How good must our bounds be? Since the major-arc bounds are valid only for $q \le r = 300000$ and $|\delta| \le 4r/q$, we cannot afford even a single factor of $\log x$ (or any other function tending to $\infty$ as $x \to \infty$) in front of terms such as $x/\sqrt{q|\delta_0|}$: a factor like that would make the term larger than the trivial bound $x$ if $q|\delta_0|$ is equal to a constant ($r$, say) and $x$ is very large. Apparently, there was no such "log-free bound" with explicit constants in the literature, even though such bounds were considered to be in principle feasible, and even though previous work ([Che85], [Dab96], [DR01], [Tao14]) had gradually decreased the number of factors of $\log x$. (In limited ranges for $q$, there were log-free bounds without explicit constants; see [Dab96], [Ram10]. The estimate in [Vin54, Thm. 2a, 2b] was almost log-free, but not quite. There were also bounds [Kar93], [But11] that used L-functions, and thus were not really useful in a truly minor-arc regime.)

    It also seemed clear that a main bound proportional to $(\log q)^2 x/\sqrt{q}$ (as in [Tao14]) was too large. At the same time, it was not really necessary to reach a bound of the best possible form that could be found through Vinogradov's basic approach, namely

    \[
    |S_\eta(\alpha, x)| \le C\, \frac{x\sqrt{q}}{\phi(q)}. \tag{1.13}
    \]

    Such a bound had been proven by Ramaré [Ram10] for $q$ in a limited range and $C$ non-explicit; later, in [Ramc] (which postdates the first version of [Helb]), Ramaré broadened the range to $q \le x^{1/48}$ and gave an explicit value for $C$, namely, $C = 13000$. Such a bound is a notable achievement, but, unfortunately, it is not useful for our purposes. Rather, we will aim at a bound whose main term is bounded by a constant around $1$ times $x (\log \delta_0 q)/\sqrt{\delta_0 \phi(q)}$; this is slightly worse asymptotically than (1.13), but it is much better in the delicate range of $\delta_0 q \sim 300000$, and in fact for a much wider range as well.

    We see that we have several tasks. One of them is the removal of logarithms: we cannot afford a single factor of $\log x$, and, in practice, we can afford at most one factor of $\log q$. Removing logarithms will be possible in part because of the use of previously existing efficient techniques (the large sieve for sequences with prime support) but also because we will be able to find cancellation at several places in sums coming from a combinatorial identity (namely, Vaughan's identity). The task of finding cancellation is particularly delicate because we cannot afford large constants or, for that matter, statements valid only for large $x$. (Bounding a sum such as $\sum_n \mu(n)$ efficiently, where $\mu$ is the Möbius function

    \[
    \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct}, \\ 0 & \text{if } p^2 | n \text{ for some prime } p, \end{cases}
    \]

    is harder than estimating a sum such as $\sum_n \Lambda(n)$ equally efficiently, even though we are used to thinking of the two problems as equivalent.)

    We have said that our bounds will improve as $|\delta|$ increases. This dependence on $\delta$ will be secured in different ways at different places. Sometimes $\delta$ will appear as an argument, as in $\hat{\eta}(-\delta)$; for $\eta$ piecewise continuous with $\eta' \in L^1$, we know that $|\hat{\eta}(t)| \to 0$ as $|t| \to \infty$. Sometimes we will obtain a dependence on $\delta$ by using several different rational approximations to the same $\alpha \in \mathbb{R}$. Lastly, we will obtain a good dependence on $\delta$ in bilinear sums by supplying a scattered input to a large sieve.

    If there is a main moral to the argument, it lies in the close relation between the circle method and the large sieve. The circle method rests on the estimation of an integral involving a Fourier transform $\hat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$; as we will later see, this leads naturally to estimating the $\ell_2$-norm of $\hat{f}$ on subsets (namely, unions of arcs) of the circle $\mathbb{R}/\mathbb{Z}$. The large sieve can be seen as an approximate, discrete version of Plancherel's identity, which states that $|\hat{f}|_2 = |f|_2$.

    Both in this section and in §1.5, we shall use the large sieve in part so as to use the fact that some of the functions we work with have prime support, i.e., are non-zero only on prime numbers. There are ways to use prime support to improve the output of the large sieve. In §1.5, these techniques will be refined and then translated to the context of the circle method, where $f$ has (essentially) prime support and $|\hat{f}|^2$ must be integrated over unions of arcs. (This allows us to remove a logarithm.) The main point is that the large sieve is not being used as a black box; rather, we can adapt ideas from (say) the large-sieve context and apply them to the circle method.

    Lastly, there are the benefits of a continuous $\eta$. Hardy and Littlewood already used a continuous $\eta$; this was abandoned by Vinogradov, presumably for the sake of simplicity. The idea that smooth weights $\eta$ can be superior to sharp truncations is now commonplace. As we shall see, using a continuous $\eta$ is helpful in the minor-arcs regime, but not as crucial there as for the major arcs. We will not use a smooth $\eta$; we will prove our estimates for any continuous $\eta$ that is piecewise $C^1$, and then, towards the end, we will choose to use the same weight $\eta = \eta_2$ as in [Tao14], in part because it has compact support, and in part for the sake of comparison. The moral here is not quite the common dictum "always smooth", but rather that different kinds of smoothing can be appropriate for different tasks; in the end, we will show how to coordinate different smoothing functions $\eta$.

    There are other ideas involved; for instance, some of Vinogradov's lemmas are improved. Let us now go into some of the details.

    1.4.2 Combinatorial identities

    Generally, since Vinogradov, a treatment of the minor arcs starts with a combinatorial identity expressing $\Lambda(n)$ (or the characteristic function of the primes) as a sum of two or more convolutions. (In this section, by a convolution $f * g$, we will mean the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$, i.e., the multiplicative convolution on the semigroup of positive integers.)

    In some sense, the archetypical identity is

    \[
    \Lambda = \mu * \log,
    \]

    but it will not usually do: the contribution of $\mu(d) \log(n/d)$ with $d$ close to $n$ is too difficult to estimate precisely. There are alternatives: for example, there is the identity

    \[
    \Lambda(n) \log n = (\mu * \log^2)(n) - (\Lambda * \Lambda)(n), \tag{1.14}
    \]

    which underlies an estimate of Selberg's that, in turn, is the basis for the Erdős-Selberg proof of the prime number theorem; see, e.g., [MV07, §8.2]. More generally, one can decompose $\Lambda(n)(\log n)^k$ as $\mu * \log^{k+1}$ minus a linear combination of convolutions; this kind of decomposition, really just a direct consequence of the development of $(\zeta'(s)/\zeta(s))^{(k)}$, will be familiar to some from the exposition of Bombieri's work [Bom76] in [FI10, §3] (for instance). Another useful identity was that used by Daboussi [Dab96]; witness its application in [DR01], which gives explicit estimates on exponential sums over primes.

    The proof of Vinogradov's three-prime result was simplified substantially [Vau77b] by the introduction of Vaughan's identity:

    \[
    \Lambda(n) = \mu_{\le U} * \log - \Lambda_{\le V} * \mu_{\le U} * 1 + 1 * \mu_{>U} * \Lambda_{>V} + \Lambda_{\le V}, \tag{1.15}
    \]

    where we are using the notation

    \[
    f_{\le W}(n) = \begin{cases} f(n) & \text{if } n \le W, \\ 0 & \text{if } n > W, \end{cases}
    \qquad
    f_{>W}(n) = \begin{cases} 0 & \text{if } n \le W, \\ f(n) & \text{if } n > W. \end{cases}
    \]

    Of the resulting sums ($\sum_n (\mu_{\le U} * \log)(n)\, e(\alpha n)\, \eta(n/x)$, etc.), the first three are said to be of type I, type I (again) and type II; the last sum, $\sum_{n \le V} \Lambda(n)$, is negligible.
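Vaughan's identity holds for every $n \ge 1$ and every choice of $U$ and $V$, which makes it easy to test numerically. The following sketch evaluates both sides up to a small bound $N$; the helper names and the sample parameters $N = 300$, $U = 5$, $V = 7$ are illustrative assumptions, and truncating every function at $N$ is harmless because a Dirichlet convolution at $n \le N$ only involves arguments up to $N$.

```python
import math

def mobius(n):
    """Mobius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)            # n itself is prime

def dirichlet(f, g, N):
    """Dirichlet convolution of two lists indexed 0..N (index 0 unused)."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] != 0:
            for m in range(1, N // d + 1):
                h[d * m] += f[d] * g[m]
    return h

def vaughan_sides(N, U, V):
    """Return (Lambda, right-hand side of Vaughan's identity) as lists up to N."""
    mu = [0.0] + [float(mobius(n)) for n in range(1, N + 1)]
    lam = [0.0] + [von_mangoldt(n) for n in range(1, N + 1)]
    one = [0.0] + [1.0] * N
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    mu_le = [v if n <= U else 0.0 for n, v in enumerate(mu)]
    mu_gt = [v if n > U else 0.0 for n, v in enumerate(mu)]
    lam_le = [v if n <= V else 0.0 for n, v in enumerate(lam)]
    lam_gt = [v if n > V else 0.0 for n, v in enumerate(lam)]
    first = dirichlet(mu_le, log, N)                          # type I
    second = dirichlet(dirichlet(lam_le, mu_le, N), one, N)   # type I (again)
    third = dirichlet(one, dirichlet(mu_gt, lam_gt, N), N)    # type II
    rhs = [first[n] - second[n] + third[n] + lam_le[n] for n in range(N + 1)]
    return lam, rhs

lam, rhs = vaughan_sides(300, 5, 7)
print(max(abs(a - b) for a, b in zip(lam, rhs)))  # rounding error only
```

The check works for any $U$, $V$ because the right-hand side regroups as $(\mu * 1) * \Lambda_{>V} + \Lambda_{\le V}$, and $\mu * 1$ is the identity for Dirichlet convolution.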

    One of the advantages of Vaughan's identity is its flexibility: we can set $U$ and $V$ to whatever values we wish. Its main disadvantage is that it is not "log-free", in that it seems to impose the loss of two factors of $\log x$: if we sum each side of (1.15) from $1$ to $x$, we obtain $\sum_{n \le x} \Lambda(n) \sim x$ on the left side, whereas, if we bound the sum on the right side without the use of cancellation, we obtain a bound of $x(\log x)^2$. Of course, we will obtain some cancellation from the phase $e(\alpha n)$; still, even if this gives us a factor of, say, $1/\sqrt{q}$, we will get a bound of $x(\log x)^2/\sqrt{q}$, which is worse than the trivial bound $x$ for $q$ bounded and $x$ large. Since we want a bound that is useful for all $q$ larger than the constant $r$ and all $x$ larger than a constant, this will not do.

    As was pointed out in [Tao14], it is possible to get a factor of $(\log q)^2$ instead of a factor of $(\log x)^2$ in the type II sums by setting $U$ and $V$ appropriately. Unfortunately, a factor of $(\log q)^2$ is still too large in practice, and there is also the issue of factors of $\log x$ in type I sums.

    Vinogradov had already managed to get an essentially log-free result (by a rather difficult procedure) in [Vin54, Ch. IX]. The result in [Dab96] is log-free. Unfortunately, the explicit result in [DR01], the study of which encouraged me at the beginning of the project, is not. For a while, I worked with the case $k = 2$ of the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, which gives

    \[
    \Lambda \cdot \log^2 = \mu * \log^3 - 3 \cdot (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda. \tag{1.16}
    \]

    This identity is essentially log-free: while a trivial bound on the sum of the right side for $n$ from $1$ to $N$ does seem to have two extra factors of $\log$, they are present only in the term $\mu * \log^3$, which is not the hardest one to estimate. Ramaré obtained a log-free bound in [Ram10] using an identity introduced by Diamond and Steinig in the course of their own work on elementary proofs of the prime number theorem [DS70]; that identity gives a decomposition for $\Lambda \cdot \log^k$ that can also be derived from the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, by a clever grouping of terms.
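For the reader who wishes to see where the case $k = 2$ comes from, here is a short derivation sketch. The abbreviation $D(s) = -\zeta'(s)/\zeta(s)$ is introduced only for this computation and is not notation from the text.

```latex
% D(s) = -\zeta'(s)/\zeta(s) = \sum_n \Lambda(n) n^{-s}; differentiating a Dirichlet
% series multiplies its coefficients by -\log n.
\[
D' = -\frac{\zeta''}{\zeta} + D^2,
\qquad
D'' = -\frac{\zeta'''}{\zeta} + \frac{\zeta''\zeta'}{\zeta^2} + 2DD'
    = -\frac{\zeta'''}{\zeta} + 3DD' - D^3,
\]
% using \zeta''/\zeta = D^2 - D' and \zeta'/\zeta = -D in the last step.
% Reading off coefficients (products of series correspond to Dirichlet convolutions):
\[
D'' \leftrightarrow \Lambda\cdot\log^2, \qquad
-\frac{\zeta'''}{\zeta} \leftrightarrow \mu * \log^3, \qquad
D' \leftrightarrow -\Lambda\cdot\log, \qquad
D^3 \leftrightarrow \Lambda * \Lambda * \Lambda,
\]
% so that
\[
\Lambda\cdot\log^2 = \mu * \log^3 - 3\,(\Lambda\cdot\log) * \Lambda - \Lambda * \Lambda * \Lambda .
\]
```

The same computation stopped one step earlier, at $D' = -\zeta''/\zeta + D^2$, gives $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$.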

    In the end, I decided to use Vaughan's identity, motivated in part by [Tao14], and in part by the lack of free parameters in (1.16): as can be seen in (1.15), Vaughan's identity has two parameters $U$, $V$ that we can set to whatever values we think best. The form of the identity allowed me to reuse much of my work up to that point, but it also posed a challenge: since Vaughan's identity is by no means log-free, one has to obtain cancellation in Vaughan's identity at every possible step, beyond the cancellation given by the phase $e(\alpha n)$. (The presence of a phase, in fact, makes the task of getting cancellation from the identity more complicated.) The removal of logarithms will be one of our main tasks in what follows. It is clear that the presence of the Möbius function $\mu$ should give, in principle, some cancellation; we will show how to use it to obtain as much cancellation as we need, with good constants, and not just asymptotically.

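    1.4.3 Type I sums

    There are two type I sums, namely,

    \[
    \sum_{m \le U} \mu(m) \sum_n (\log n)\, e(\alpha m n)\, \eta\left(\frac{mn}{x}\right) \tag{1.17}
    \]

    and

    \[
    \sum_{v \le V} \Lambda(v) \sum_{u \le U} \mu(u) \sum_n e(\alpha v u n)\, \eta\left(\frac{vun}{x}\right). \tag{1.18}
    \]

    In either case, $\alpha = a/q + \delta/x$, where $q$ is larger than a constant $r$ and $|\delta/x| \le 1/qQ_0$ for some $Q_0 > \max(q, \sqrt{x})$. For the purposes of this exposition, we will set it as our task to estimate the slightly simpler sum

    \[
    \sum_{m \le D} \mu(m) \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right), \tag{1.19}
    \]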

    where $D$ can be $U$ or $UV$ or something else less than $x$.

    Why can we consider this simpler sum without omitting anything essential? It is clear that (1.17) is of the same kind as (1.19). The inner double sum in (1.18) is just (1.19) with $\alpha v$ instead of $\alpha$; this enables us to estimate (1.18) by means of (1.19) for $q$ small, i.e., the more delicate case. If $q$ is not small, then the approximation $\alpha v \sim av/q$ may not be accurate enough. In that case, we collapse the two outer sums in (1.18) into a sum $\sum_n (\Lambda_{\le V} * \mu_{\le U})(n)$, and treat all of (1.18) much as we will treat (1.19); since $q$ is not small, we can afford to bound $(\Lambda_{\le V} * \mu_{\le U})(n)$ trivially (by $\log n$) in the less sensitive terms.

    Let us first outline Vinogradov's procedure for bounding type I sums. Just by summing a geometric series, we get

    \[ \left| \sum_{n \le N} e(\alpha n) \right| \le \min\left(N, \frac{c}{\|\alpha\|}\right), \tag{1.20} \]

    where $c$ is a constant and $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. Vinogradov splits the outer sum in (1.19) into sums of length $q$. When $m$ runs on an interval of length $q$, the angle $am/q$ runs through all fractions of the form $b/q$; due to the error $\delta/x$, $\alpha m$ could be close to $0$ for two values of $m$, but otherwise $\|\alpha m\|$ takes values bounded below by $1/q$ (twice), $2/q$ (twice), $3/q$ (twice), etc. Thus

    \[ \left| \sum_{y < m \le y+q} \mu(m) \sum_{n \le N} e(\alpha m n) \right| \le \sum_{y < m \le y+q} \left| \sum_{n \le N} e(\alpha m n) \right| \le \frac{2N}{m} + 2cq \log eq \tag{1.21} \]

    for any $y \ge 0$.
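
The geometric-series bound (1.20) is easy to test numerically. The sketch below is only an illustration: it assumes the value $c = 1/2$, which follows from the closed form $|\sin(\pi N \alpha)/\sin(\pi\alpha)|$ together with $|\sin \pi\alpha| \ge 2\|\alpha\|$ (the text leaves $c$ unspecified), and the function names are made up for this example.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t), as in the text."""
    return cmath.exp(2j * math.pi * t)

def dist_to_nearest_int(alpha):
    """||alpha||: distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))

def geometric_sum(alpha, N):
    """The sum of e(alpha*n) over 1 <= n <= N, computed term by term."""
    return sum(e(alpha * n) for n in range(1, N + 1))

def vinogradov_bound(alpha, N, c=0.5):
    """min(N, c/||alpha||); c = 1/2 is an assumption made for this sketch."""
    d = dist_to_nearest_int(alpha)
    return float(N) if d == 0 else min(float(N), c / d)

for alpha in (0.1, 0.37, 1 / 3, 0.001):
    for N in (10, 1000):
        print(alpha, N, abs(geometric_sum(alpha, N)), vinogradov_bound(alpha, N))
```

The point to notice is that the bound does not grow with $N$ once $\alpha$ is away from the integers; this is the source of all savings in type I sums.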

    There are several ways to improve this. One is simply to estimate the inner sum more precisely; this was already done in [DR01]. One can also define a smoothing function $\eta$, as in (1.19); it is easy to get

    \[ \left| \sum_{n \le N} e(\alpha n)\, \eta\left(\frac{n}{x}\right) \right| \le \min\left( x|\eta|_1 + \frac{|\eta'|_1}{2},\ \frac{|\eta'|_1}{2|\sin(\pi\alpha)|},\ \frac{|\widehat{\eta''}|_\infty}{4x(\sin \pi\alpha)^2} \right). \]

    Except for the third term, this is as in [Tao14]. We could also choose carefully which bound to use for each $m$; surprisingly, this gives an improvement, in fact an important one, for $m$ large. However, even with these improvements, we still have a term proportional to $N/m$, as in (1.21), and this contributes about $(x \log x)/q$ to the sum (1.19), thus giving us an estimate that is not log-free.

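    What we have to do, naturally, is to take out the terms with $q|m$ for $m$ small. (If $m$ is large, then those may not be the terms for which $m\alpha$ is close to $0$; we will later see what to do.) For $y + q \le Q/2$, $|\alpha - a/q| \le 1/qQ$, we get that

    \[ \sum_{\substack{y < m \le y+q \\ q \nmid m}} \min\left( A,\ \frac{B}{|\sin \pi\alpha m|},\ \frac{C}{|\sin \pi\alpha m|^2} \right) \tag{1.22} \]

    is at most

    \[ \min\left( \frac{20}{3\pi^2} C q^2,\ 2A + \frac{4q}{\pi}\sqrt{AC},\ \frac{2Bq}{\pi} \max\left(2, \log\frac{C e^3 q}{B\pi}\right) \right). \tag{1.23} \]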

    This is satisfactory. We are left with all the terms $m \le M = \min(D, Q/2)$ with $q|m$, and also with all the terms $Q/2 < m \le D$. For $m \le M$ divisible by $q$, we can estimate (as opposed to just bound from above) the inner sum in (1.19) by the Poisson summation formula, and then sum over $m$, but without taking absolute values; writing $m = aq$, we get a main term

    \[ \frac{x\mu(q)}{q} \cdot \widehat{\eta}(-\delta) \cdot \sum_{\substack{a \le M/q \\ (a,q)=1}} \frac{\mu(a)}{a}, \tag{1.24} \]

    where $(a, q)$ stands for the greatest common divisor of $a$ and $q$.

    It is clear that we have to get cancellation over $\mu$ here. There is an elegant elementary argument [GR96] showing that the absolute value of the sum in (1.24) is at most $1$. We need to gain one more log, however. Ramaré [Ramb] helpfully furnished the following bound:

    \[ \left| \sum_{\substack{a \le x \\ (a,q)=1}} \frac{\mu(a)}{a} \right| \le \frac{4}{5} \frac{q}{\phi(q)} \frac{1}{\log x/q} \tag{1.25} \]

    for $q \le x$. (Cf. [EM95], [EM96].) This is neither trivial nor elementary.^5 We are, so to speak, allowed to use non-elementary means (that is, methods based on $L$-functions) because the only $L$-function we need to use here is the Riemann zeta function.

    What shall we do for $m > Q/2$? We can always give a bound

    \[ \sum_{y < m \le y+q} \min\left( A,\ \frac{C}{|\sin \pi\alpha m|^2} \right) \le 3A + \frac{4q}{\pi}\sqrt{AC} \tag{1.26} \]

    for $y$ arbitrary; since $AC$ will be of constant size, $(4q/\pi)\sqrt{AC}$ is pleasant enough, but the contribution of $3A \sim 3|\eta|_1 x/y$ is nasty (it adds a multiple of $(x \log x)/q$ to the total) and seems unavoidable: the values of $m$ for which $\alpha m$ is close to $0$ no longer correspond to the congruence class $m \equiv 0 \bmod q$, and thus cannot be taken out.

    The solution is to switch approximations. (The idea of using different approximations to the same $\alpha$ is neither new nor recent in the general context of the circle method: see [Vau97, §2.8, Ex. 2]. What may be new is its use to clear a hurdle in type I sums.) What does this mean? If $\alpha$ were exactly, or almost exactly, $a/q$, then there would be no other very good approximations in a reasonable range. However, note that we can define $Q = \lfloor x/|\delta q| \rfloor$ for $\alpha = a/q + \delta/x$, and still have $|\alpha - a/q| \le 1/qQ$. If $\delta$ is very small, $Q$ will be larger than $2D$, and there will be no terms with $Q/2 < m \le D$ to worry about.

    ^5 The current state of knowledge may seem surprising: after all, we expect nearly square-root cancellation (for instance, $|\sum_{n \le x} \mu(n)/n| \le \sqrt{2/x}$ holds for all real $0 < x \le 10^{12}$; see also the stronger bound [Dre93]). The classical zero-free region of the Riemann zeta function ought to give a factor of $\exp(-\sqrt{(\log x)/c})$, which looks much better than $1/\log x$. What happens is that (a) such a factor is not actually much better than $1/\log x$ for $x \sim 10^{30}$, say; (b) estimating sums involving the Möbius function by means of an explicit formula is harder than estimating sums involving $\Lambda(n)$: the residues of $1/\zeta(s)$ at the non-trivial zeros of $\zeta(s)$ come into play. As a result, getting non-trivial explicit results on sums of $\mu(n)$ is harder than one would naively expect from the quality of classical effective (but non-explicit) results. See [Rama] for a survey of explicit bounds.


    What happens if $\delta$ is not very small? We know that, for any $Q'$, there is an approximation $a'/q'$ to $\alpha$ with $|\alpha - a'/q'| \le 1/q'Q'$ and $q' \le Q'$. However, for $Q' > Q$, we know that $a'/q'$ cannot equal $a/q$: by the definition of $Q$, the approximation $a/q$ is not good enough, i.e., $|\alpha - a/q| \le 1/qQ'$ does not hold. Since $a/q \ne a'/q'$, we see that $|a/q - a'/q'| \ge 1/qq'$, and this implies that $q' \ge (\epsilon/(1+\epsilon))Q$, where we write $Q' = (1+\epsilon)Q$.
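
The last implication can be checked in two lines. The following is a sketch under two assumptions that the text does not state explicitly at this point: that $Q' = (1+\epsilon)Q$ with $\epsilon > 0$, and that $q \le Q$.

```latex
\begin{align*}
\frac{1}{q q'} \le \left|\frac{a}{q} - \frac{a'}{q'}\right|
  &\le \left|\alpha - \frac{a}{q}\right| + \left|\alpha - \frac{a'}{q'}\right|
   \le \frac{1}{qQ} + \frac{1}{q'Q'}, \\
\text{so}\quad \frac{1}{q'}\left(\frac{1}{q} - \frac{1}{Q'}\right) &\le \frac{1}{qQ},
  \quad\text{i.e.,}\quad
  q' \ge Q\left(1 - \frac{q}{Q'}\right) \ge Q\left(1 - \frac{Q}{Q'}\right)
     = \frac{\epsilon}{1+\epsilon}\, Q .
\end{align*}
```

In other words, once $a/q$ stops being an admissible approximation, the new denominator $q'$ is forced to be of size comparable to $Q$.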

    Thus, for $m > Q/2$, the solution is to apply (1.26) with $a'/q'$ instead of $a/q$. The contribution of $A$ fades into insignificance: for the first sum over a range $y < m \le y + q'$, $y \ge Q/2$, it contributes at most $x/(Q/2)$, and all the other contributions of $A$ sum up to at most a constant times $(x \log x)/q'$.

    Proceeding in this way, we obtain a total bound for (1.19) whose main terms are proportional to

    \[ \frac{1}{\phi(q)} \frac{x}{\log \frac{x}{q}} \min\left(1, \frac{1}{\delta^2}\right), \qquad \frac{2}{\pi} \sqrt{|\widehat{\eta''}|_\infty} \cdot D \qquad \text{and} \qquad q \log \max\left(\frac{D}{q}, q\right), \tag{1.27} \]

    with good, explicit constants. The first term, usually the largest one, is precisely what we needed: it is proportional to $(1/\phi(q))\, x/\log x$ for $q$ small, and decreases rapidly as $|\delta|$ increases.

    1.4.4 Type II, or bilinear, sums

    We must now bound

    \[ S = \sum_m (1 \ast \mu_{>U})(m) \sum_{n > V} \Lambda(n)\, e(\alpha m n)\, \eta(mn/x). \]

    At this point it is convenient to assume that $\eta$ is the Mellin convolution of two functions. The multiplicative, or Mellin, convolution on $\mathbb{R}^+$ is defined by

    \[ (\eta_0 \ast_M \eta_1)(t) = \int_0^\infty \eta_0(r)\, \eta_1\left(\frac{t}{r}\right) \frac{dr}{r}. \]
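
To make the definition concrete, here is a small numerical sketch. It takes, as an illustrative choice, the truncation $\eta_1$ equal to $2$ on $[1/2, 1]$ and $0$ elsewhere, and compares a midpoint-rule evaluation of $\eta_1 \ast_M \eta_1$ with the closed form $4\log 4t$ on $[1/4, 1/2]$ and $-4\log t$ on $[1/2, 1]$, which one gets by integrating $dr/r$ over $[\max(1/2, t), \min(1, 2t)]$. The function names and the step count are assumptions made for this example only.

```python
import math

def eta1(t):
    """Brutal truncation: 2 on [1/2, 1], 0 elsewhere."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def mellin_conv(f, g, t, lo=0.5, hi=1.0, steps=20000):
    """Midpoint rule for the integral of f(r) g(t/r) dr/r over [lo, hi].

    [lo, hi] should contain the support of f; the default suits eta1.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        r = lo + (i + 0.5) * h
        total += f(r) * g(t / r) / r
    return total * h

def eta2_closed(t):
    """Closed form of eta1 *_M eta1, derived by hand for this sketch."""
    if 0.25 <= t <= 0.5:
        return 4.0 * math.log(4.0 * t)
    if 0.5 < t <= 1.0:
        return -4.0 * math.log(t)
    return 0.0

for t in (0.3, 0.45, 0.6, 0.9):
    print(t, mellin_conv(eta1, eta1, t), eta2_closed(t))
```

The resulting weight is supported on $[1/4, 1]$, continuous, and peaks at $t = 1/2$.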

    Tao [Tao14] takes $\eta = \eta_2 = \eta_1 \ast_M \eta_1$, where $\eta_1$ is a brutal truncation, viz., the function taking the value $2$ on $[1/2, 1]$ and $0$ elsewhere. We take the same $\eta_2$, in part for comparison purposes, and in part because this will allow us to use off-the-shelf estimates on the large sieve. (Brutal truncations are rarely optimal in principle, but, as they are very common, results for them have been carefully optimized in the literature.) Clearly

    \[ S = \int_V^{x/U} \sum_m \left( \sum_{\substack{d > U \\ d | m}} \mu(d) \right) \eta_1\left(\frac{m}{x/W}\right) \cdot \sum_{n \ge V} \Lambda(n)\, e(\alpha m n)\, \eta_1\left(\frac{n}{W}\right) \frac{dW}{W}. \tag{1.28} \]


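    By Cauchy-Schwarz, the integrand is at most $\sqrt{S_1(U,W)\, S_2(V,W)}$, where

    \[ S_1(U,W) = \sum_{\frac{x}{2W} < m \le \frac{x}{W}} \left| \sum_{\substack{d > U \\ d | m}} \mu(d) \right|^2, \qquad S_2(V,W) = \sum_{\frac{x}{2W} \le m \le \frac{x}{W}} \left| \sum_{\max(V, \frac{W}{2}) \le n \le W} \Lambda(n)\, e(\alpha m n) \right|^2. \tag{1.29} \]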

    We must bound $S_1(U,W)$ by a constant times $x/W$. We are able to do this, with a good constant. (A careless bound would have given a multiple of $(x/U) \log^3(x/U)$, which is much too large.) First, we reduce $S_1(U,W)$ to an expression involving an integral of

    \[ \sum_{r_1 \le x} \sum_{\substack{r_2 \le x \\ (r_1, r_2) = 1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}. \tag{1.30} \]

    We can bound (1.30) by the use of bounds on $\sum_{n \le t} \mu(n)/n$, combined with the estimation of infinite products by means of approximations to $\zeta(s)$ for $s \to 1^+$. After some additional manipulations, we obtain a bound for $S_1(U,W)$ whose main term is at most $(3/\pi^2)(x/W)$ for each $W$, and closer to $0.22482\, x/W$ on average over $W$.

    (This is as good a point as any to say that, throughout, we can use a trick in [Tao14] that allows us to work with odd values of integer variables, instead of letting $m$ or $n$ range over all integers. Here, for instance, if $m$ and $n$ are restricted to be odd, we obtain a bound of $(2/\pi^2)(x/W)$ for individual $W$, and $0.15107\, x/W$ on average over $W$. This is so even though we are losing some cancellation in $\mu$ by the restriction.)

    Let us now bound $S_2(V,W)$. This is traditionally done by Linnik's dispersion method. However, it should be clear that the thing to do nowadays is to use a large sieve, and, more specifically, a large sieve for primes; that kind of large sieve is nothing other than a tool for estimating expressions such as $S_2(V,W)$. (Incidentally, even though we are trying to save every factor of log we can, we choose not to use small sieves at all, either here or elsewhere.) In order to take advantage of prime support, we use Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]) combined with Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], following the general procedure in [MV73, (1.6)]. We obtain a bound of the form

    \[ \frac{\log W}{\log \frac{W}{2q}} \left( \frac{x}{4\phi(q)} + \frac{qW}{\phi(q)} \right) \frac{W}{2} \tag{1.31} \]

    on $S_2(V,W)$, where, of course, we can also choose not to gain a factor of $\log W/2q$ if $q$ is close to or greater than $W$.

    It remains to see how to gain a factor of $|\delta|$ in the major arcs, and more specifically in $S_2(V,W)$. To explain this, let us step back and take a look at what the large sieve is.

    Given a civilized function $f : \mathbb{Z} \to \mathbb{C}$, Plancherel's identity tells us that

    \[ \int_{\mathbb{R}/\mathbb{Z}} \left| \widehat{f}(\alpha) \right|^2 d\alpha = \sum_n |f(n)|^2. \]

    The large sieve can be seen as an approximate, or statistical, version of this: for a "sample" of points $\alpha_1, \alpha_2, \dots, \alpha_k$ satisfying $|\alpha_i - \alpha_j| \ge \beta$ for $i \ne j$, it tells us that

    \[ \sum_{1 \le j \le k} \left| \widehat{f}(\alpha_j) \right|^2 \le (X + \beta^{-1}) \sum_n |f(n)|^2, \tag{1.32} \]

    assuming that $f$ is supported on an interval of length $X$.

    Now consider $\alpha_1 = \alpha$, $\alpha_2 = 2\alpha$, $\alpha_3 = 3\alpha$, .... If $\alpha = a/q$, then the angles $\alpha_1, \dots, \alpha_q$ are well-separated, i.e., they satisfy $|\alpha_i - \alpha_j| \ge 1/q$, and so we can apply (1.32) with $\beta = 1/q$. However, $\alpha_{q+1} = \alpha_1$. Thus, if we have an outer sum of length $L > q$ (in (1.29), we have an outer sum of length $L = x/2W$), we need to split it into $\lceil L/q \rceil$ blocks of length $q$, and so the total bound given by (1.32) is $\lceil L/q \rceil (X + q) \sum_n |f(n)|^2$. Indeed, this is what gives us (1.31), which is fine, but we want to do better for $|\delta|$ larger than a constant.

    Suppose, then, that $\alpha = a/q + \delta/x$, where $|\delta| > 8$, say. Then the angles $\alpha_1$ and $\alpha_{q+1}$ are not identical: $|\alpha_1 - \alpha_{q+1}| \le q|\delta|/x$. We also see that $\alpha_{q+1}$ is at a distance at least $q|\delta|/x$ from $\alpha_2, \alpha_3, \dots, \alpha_q$, provided that $q|\delta|/x < 1/q$. We can go on with $\alpha_{q+2}, \alpha_{q+3}, \dots$, and stop only once there is overlap, i.e., only once we reach $\alpha_m$ such that $m|\delta|/x \ge 1/q$. We then give all the angles $\alpha_1, \dots, \alpha_m$, which are separated by at least $q|\delta|/x$ from each other, to the large sieve at the same time. We do this $\lceil L/m \rceil \le \lceil L/(x/|\delta|q) \rceil$ times, and obtain a total bound of $\lceil L/(x/|\delta|q) \rceil (X + x/|\delta|q) \sum_n |f(n)|^2$, which, for $L = x/2W$, $X = W/2$, gives us about

    \[ \left( \frac{x}{4Q} \frac{W}{2} + \frac{x}{4} \right) \log W \]

    provided that $L \ge x/|\delta|q$ and, as usual, $|\alpha - a/q| \le 1/qQ$. This is very small compared to the trivial bound of about $xW/8$.
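
As a numerical sanity check of the large sieve inequality (1.32) in the simplest situation described above ($\alpha = a/q$, angles $\alpha_j = j\alpha$ for $1 \le j \le q$, separation $\beta = 1/q$), one can take $f = \Lambda$ restricted to $1 \le n \le X$. The values $q = 7$, $a = 3$, $X = 200$ and all function names below are arbitrary choices made for this sketch.

```python
import cmath
import math

def e(t):
    return cmath.exp(2j * math.pi * t)

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:              # p is the least prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0

def large_sieve_sides(f, X, alphas, beta):
    """Return both sides of (1.32) for f supported on 1 <= n <= X."""
    def f_hat(alpha):
        return sum(f(n) * e(alpha * n) for n in range(1, X + 1))
    lhs = sum(abs(f_hat(alpha)) ** 2 for alpha in alphas)
    rhs = (X + 1.0 / beta) * sum(abs(f(n)) ** 2 for n in range(1, X + 1))
    return lhs, rhs

q, a, X = 7, 3, 200
alphas = [(j * a / q) % 1.0 for j in range(1, q + 1)]   # q angles, spaced 1/q apart
lhs, rhs = large_sieve_sides(von_mangoldt, X, alphas, beta=1.0 / q)
print(lhs, rhs)
```

Most of the left side comes from the angle $\alpha_q = 0$; the gap between the two sides is the room that the refinements for primes (Montgomery's inequality) exploit.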

    What happens if $L < x/|\delta q|$? Then there is never any overlap: we consider all angles $\alpha_i$, and give them all together to the large sieve. The total bound is $(W^2/4 + xW/2|\delta|q) \log W$. If $L = x/2W$ is smaller than, say, $x/3|\delta q|$, then we see clearly that there are non-intersecting swarms of angles $\alpha_i$ around the rationals $a/q$. We can thus save a factor of log (or rather $(\phi(q)/q) \log(W/|\delta q|)$) by applying Montgomery's inequality, which operates by strewing displacements of given angles (or, here, swarms around angles) around the circle to the extent possible while keeping everything well-separated. In this way, we obtain a bound of the form

    \[ \frac{\log W}{\log \frac{W}{|\delta| q}} \left( \frac{x}{|\delta|\phi(q)} + \frac{q}{\phi(q)} \frac{W}{2} \right) \frac{W}{2}. \]

    Compare this to (1.31); we have gained a factor of $|\delta|/4$, and so we use this estimate when $|\delta| > 4$. (We will actually use the criterion $|\delta| > 8$, but, since we will be working with approximations of the form $2\alpha = a/q + \delta/x$, the value of $\delta$ in our actual work is twice what it is in this introduction. This is a consequence of working with sums over the odd integers, as in [Tao14].)

    We have succeeded in eliminating all factors of log we came across. The only factor of log that remains is $\log x/UV$, coming from the integral $\int_V^{x/U} dW/W$. Thus, we want $UV$ to be close to $x$, but we cannot let it be too close, since we also have a term proportional to $D = UV$ in (1.27), and we need to keep it substantially smaller than $x$. We set $U$ and $V$ so that $UV$ is $x/\sqrt{q \max(4, |\delta|)}$ or thereabouts.

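    In the end, after some work, we obtain our main minor-arcs bound (Theorem 3.1.1). It states the following. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Recall that $S_\eta(\alpha, x) = \sum_n \Lambda(n)\, e(\alpha n)\, \eta(n/x)$ and $\eta_2 = \eta_1 \ast_M \eta_1 = 4 \cdot 1_{[1/2,1]} \ast_M 1_{[1/2,1]}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le x^{1/3}/6$, then

    \[ |S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36\, x^{5/6}, \tag{1.33} \]

    where

    \[ \begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left( 1 + \frac{\log 4t}{2 \log \frac{9x^{1/3}}{2.004t}} \right) + 0.41415, \\ L_{x,t,q} &= \frac{q}{\phi(q)} \left( \frac{13}{4} \log t + 7.82 \right) + 13.66 \log t + 37.55. \end{aligned} \tag{1.34} \]

    The factor $R_{x,t}$ is small in practice; for typical "difficult" values of $x$ and $\delta_0 x$, it is less than $1$. The crucial things to notice in (1.33) are that there is no factor of $\log x$, and that, in the main term, there is only one factor of $\log \delta_0 q$. The fact that $\delta_0$ helps us as it grows is precisely what enables us to take major arcs that get narrower and narrower as $q$ grows.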
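
To get a feeling for the size of the right side of (1.33), one can simply evaluate it. The sketch below transcribes (1.33) and (1.34) as stated above; the sample values ($x = 10^{27}$, $q = 100003$, $\delta = 0$ and $\delta = 64$) and the function names are illustrative choices, not values taken from the text.

```python
import math

def euler_phi(q):
    """Euler's totient function, by trial division."""
    result, n, p = q, q, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

def R(x, t):
    """R_{x,t} as in (1.34)."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def L(x, t, q):
    """L_{x,t,q} as in (1.34); x is kept only to mirror the notation."""
    return q / euler_phi(q) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55

def minor_arc_bound(x, q, delta):
    """Right side of (1.33), for x >= 2.16e20 and q <= x^(1/3)/6."""
    if x < 2.16e20 or q > x ** (1 / 3) / 6:
        raise ValueError("outside the range of validity stated for (1.33)")
    d0 = max(2.0, abs(delta) / 4)
    t = d0 * q
    return ((R(x, t) * math.log(t) + 0.5) / math.sqrt(d0 * euler_phi(q)) * x
            + 2.5 * x / math.sqrt(t)
            + 2 * x / t * L(x, t, q)
            + 3.36 * x ** (5 / 6))

x = 1e27
for delta in (0.0, 64.0):
    print(delta, minor_arc_bound(x, 100003, delta) / x)
```

Dividing by $x$ shows the saving over the trivial bound of size about $x$, and the second sample shows the bound improving as $|\delta|$ grows.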

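    1.5 Integrals over the major and minor arcs

    So far, we have sketched (§1.3) how to estimate $S_\eta(\alpha, x)$ for $\alpha$ in the major arcs and $\eta$ based on the Gaussian $e^{-t^2/2}$, and also (§1.4) how to bound $|S_\eta(\alpha, x)|$ for $\alpha$ in the minor arcs and $\eta = \eta_2$, where $\eta_2 = 4 \cdot 1_{[1/2,1]} \ast_M 1_{[1/2,1]}$. We now must show how to use such information to estimate integrals such as the ones in (1.4).

    We will use two smoothing functions $\eta_+$, $\eta_*$; in the notation of (1.3), we set $f_1 = f_2 = \Lambda(n)\eta_+(n/x)$, $f_3 = \Lambda(n)\eta_*(n/x)$, and so we must give a lower bound for

    \[ \int_{\mathfrak{M}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x)\, e(-\alpha n)\, d\alpha \tag{1.35} \]

    and an upper bound for

    \[ \int_{\mathfrak{m}} \left| S_{\eta_+}(\alpha, x) \right|^2 S_{\eta_*}(\alpha, x)\, e(-\alpha n)\, d\alpha \tag{1.36} \]

    so that we can verify (1.4).

    The traditional approach to (1.36) is to bound

    \[ \begin{aligned} \int_{\mathfrak{m}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x)\, e(-\alpha n)\, d\alpha &\le \int_{\mathfrak{m}} \left| S_{\eta_+}(\alpha, x) \right|^2 d\alpha \cdot \max_{\alpha \in \mathfrak{m}} \left| S_{\eta_*}(\alpha, x) \right| \\ &\le \sum_n \Lambda(n)^2 \eta_+^2\left(\frac{n}{x}\right) \cdot \max_{\alpha \in \mathfrak{m}} \left| S_{\eta_*}(\alpha, x) \right|. \end{aligned} \tag{1.37} \]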

    Since the sum over $n$ is of the order of $x \log x$, this is not log-free, and so cannot be good enough; we will later see how to do better. Still, this gets the main shape right: our bound on (1.36) will be proportional to $|\eta_+|_2^2 |\eta_*|_1$. Moreover, we see that $\eta_*$ has to be such that we know how to bound $|S_{\eta_*}(\alpha, x)|$ for $\alpha \in \mathfrak{m}$, while our choice of $\eta_+$ is more or less free, at least as far as the minor arcs are concerned.

    What about the major arcs? In order to do anything on them, we will have to be able to estimate both $S_{\eta_+}(\alpha, x)$ and $S_{\eta_*}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. If that is the case, then, as we shall see, we will be able to obtain that the main term of (1.35) is an infinite product (independent of the smoothing functions), times $x^2$, times

    \[ \int_{-\infty}^{\infty} (\widehat{\eta_+}(-\alpha))^2\, \widehat{\eta_*}(-\alpha)\, e(-\alpha n/x)\, d\alpha = \int_0^\infty \int_0^\infty \eta_+(t_1)\, \eta_+(t_2)\, \eta_*\left( \frac{n}{x} - (t_1 + t_2) \right) dt_1\, dt_2. \tag{1.38} \]

    In other words, we want to maximize (or nearly maximize) the expression on the right of (1.38) divided by $|\eta_+|_2^2 |\eta_*|_1$.

    One way to do this is to let $\eta_*$ be concentrated on a small interval $[0, \epsilon)$. Then the right side of (1.38) is approximately

    \[ |\eta_*|_1 \cdot \int_0^\infty \eta_+(t)\, \eta_+\left( \frac{n}{x} - t \right) dt. \tag{1.39} \]

    To maximize (1.39), we should make sure that $\eta_+(t) \sim \eta_+(n/x - t)$. We set $x \sim n/2$, and see that we should define $\eta_+$ so that it is supported on $[0, 2]$ and symmetric around $t = 1$, or nearly so; this will maximize the ratio of (1.39) to $|\eta_+|_2^2 |\eta_*|_1$.

    We should do this while making sure that we will know how to estimate $S_{\eta_+}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. We know how to estimate $S_\eta(\alpha, x)$ very precisely for functions of the form $\eta(t) = g(t)e^{-t^2/2}$, $\eta(t) = g(t)te^{-t^2/2}$, etc., where $g(t)$ is band-limited. We will work with a function $\eta_+$ of that form, chosen so as to be very close (in $\ell_2$ norm) to a function $\eta_\circ$ that is in fact supported on $[0, 2]$ and symmetric around $t = 1$.

    We choose

    \[ \eta_\circ(t) = \begin{cases} t^3 (2-t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases} \]

    This function is obviously symmetric ($\eta_\circ(t) = \eta_\circ(2 - t)$) and vanishes to high order at $t = 0$, besides being supported on $[0, 2]$.

    We set $\eta_+(t) = h_R(t)\, t e^{-t^2/2}$, where $h_R(t)$ is an approximation to the function

    \[ h(t) = \begin{cases} t^2 (2-t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases} \]
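
The two definitions above are tied together by the identity $h(t) \cdot t e^{-t^2/2} = t^3(2-t)^3 e^{-(t-1)^2/2}$ on $[0, 2]$, which holds because $t - 1/2 - t^2/2 = -(t-1)^2/2$. A quick numerical check, with function names chosen only for this sketch:

```python
import math

def eta_circ(t):
    """The symmetric weight supported on [0, 2], as defined above."""
    return t ** 3 * (2 - t) ** 3 * math.exp(-(t - 1) ** 2 / 2) if 0 <= t <= 2 else 0.0

def h(t):
    """The function that h_R is meant to approximate."""
    return t ** 2 * (2 - t) ** 3 * math.exp(t - 0.5) if 0 <= t <= 2 else 0.0

def gaussian_factor(t):
    """t * exp(-t^2/2), the factor multiplying h_R in eta_+."""
    return t * math.exp(-t * t / 2)

for t in (0.1, 0.5, 1.0, 1.7):
    print(t, eta_circ(t), h(t) * gaussian_factor(t), eta_circ(2 - t))
```

So, with the exact $h$ in place of $h_R$, the weight $\eta_+$ would coincide with the symmetric weight; the approximation $h_R$ is what makes the Mellin transform tractable.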


    We just let $h_R(t)$ be the inverse Mellin transform of the truncation of $Mh$ to an interval $[-iR, iR]$. (Explicitly,

    \[ h_R(t) = \int_0^\infty h(ty^{-1})\, F_R(y)\, \frac{dy}{y}, \]

    where $F_R(y) = \sin(R \log y)/(\pi \log y)$, that is, $F_R$ is the Dirichlet kernel with a change of variables.)

    Since the Mellin transform of $te^{-t^2/2}$ is regular at $s = 0$, the Mellin transform $M\eta_+$ will be holomorphic in a neighborhood of $\{s : 0 \le \Re(s) \le 1\}$, even though the truncation of $Mh$ to $[-iR, iR]$ is brutal. Set $R = 200$, say. By the fast decay of $Mh(it)$ and the fact that the Mellin transform $M$ is an isometry, $|(h_R(t) - h(t))/t|_2$ is very small, and hence so is $|\eta_+ - \eta_\circ|_2$, as we desired.

    But what about the requirement that we be able to estimate $S_{\eta_*}(\alpha, x)$ for both $\alpha \in \mathfrak{m}$ and $\alpha \in \mathfrak{M}$?

    Generally speaking, if we know how to estimate $S_{\eta_1}(\alpha, x)$ for some $\alpha \in \mathbb{R}/\mathbb{Z}$ and we also know how to estimate $S_{\eta_2}(\alpha, x)$ for all other $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_1$ and $\eta_2$ are two smoothing functions, then we know how to estimate $S_{\eta_3}(\alpha, x)$ for all $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_3 = \eta_1 \ast_M \eta_2$, or, more generally, $\eta_*(t) = (\eta_1 \ast_M \eta_2)(\kappa t)$, $\kappa > 0$ a constant. This is an easy exercise on exchanging the order of integration and summation:

    \[ \begin{aligned} S_{\eta_*}(\alpha, x) &= \sum_n \Lambda(n)\, e(\alpha n)\, (\eta_1 \ast_M \eta_2)\left( \kappa \frac{n}{x} \right) = \int_0^\infty \sum_n \Lambda(n)\, e(\alpha n)\, \eta_1(\kappa r)\, \eta_2\left( \frac{n}{rx} \right) \frac{dr}{r} \\ &= \int_0^\infty \eta_1(\kappa r)\, S_{\eta_2}(\alpha, rx)\, \frac{dr}{r}, \end{aligned} \tag{1.40} \]

    and similarly with $\eta_1$ and $\eta_2$ switched. Of course, this trick is valid for all exponential sums: any function $f(n)$ would do in place of $\Lambda(n)$. The only caveat is that $\eta_1$ (and $\eta_2$) should be small very near $0$, since, for $r$ small, we may not be able to estimate $S_{\eta_2}(\alpha, rx)$ (or $S_{\eta_1}(\alpha, rx)$) with any precision. This is not a problem; one of our functions will be $t^2 e^{-t^2/2}$, which vanishes to second order at $0$, and the other one will be $\eta_2 = 4 \cdot 1_{[1/2,1]} \ast_M 1_{[1/2,1]}$, which has support bounded away from $0$. We will set $\kappa$ large (say $\kappa = 49$) so that the support of $\eta_*$ is indeed concentrated on a small interval $[0, \epsilon)$, as we wanted.

    Now that we have chosen our smoothing weights $\eta_+$ and $\eta_*$, we have to estimate the major-arc integral (1.35) and the minor-arc integral (1.36). What follows can actually be done for general $\eta_+$ and $\eta_*$; we could have left our particular choice of $\eta_+$ and $\eta_*$ for the end.

    Estimating the major-arc integral (1.35) may sound like an easy task, since we have rather precise estimates for $S_\eta(\alpha, x)$ ($\eta = \eta_+, \eta_*$) when $\alpha$ is on the major arcs; we could just replace $S_\eta(\alpha, x)$ in (1.35) by the approximation given by (1.7) and (1.11). It is, however, more efficient to express (1.35) as the sum of the contribution of the trivial character (a sum of integrals of $(\widehat{\eta}(-\delta)x)^3$, where $\widehat{\eta}(-\delta)x$ comes from (1.11)), plus a term of the form

    \[ \left( \text{maximum of } \sqrt{q} \cdot E(q) \text{ for } q \le r \right) \cdot \int_{\mathfrak{M}} \left| S_{\eta_+}(\alpha, x) \right|^2 d\alpha, \]

    where $E(q) = E$ is as in (1.12), plus two other terms of essentially the same form. As usual, the major arcs $\mathfrak{M}$ are the arcs around rationals $a/q$ with $q \le r$. We will soon discuss how to bound the integral of $|S_{\eta_+}(\alpha, x)|^2$ over arcs around rationals $a/q$ with $q \le s$, $s$ arbitrary. Here, however, it is best to estimate the integral over $\mathfrak{M}$ using the estimate on $S_{\eta_+}(\alpha, x)$ from (1.7) and (1.11); we obtain a great deal of cancellation, with the effect that, for $\chi$ non-trivial, the error term in (1.12) appears only when it gets squared, and thus becomes negligible.

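    The contribution of the trivial character has an easy approximation, thanks to the fast decay of $\widehat{\eta_\circ}$. We obtain that the major-arc integral (1.35) equals a main term $C_0 C_{\eta_\circ, \eta_*} x^2$, where

    \[ \begin{aligned} C_0 &= \prod_{p | n} \left( 1 - \frac{1}{(p-1)^2} \right) \cdot \prod_{p \nmid n} \left( 1 + \frac{1}{(p-1)^3} \right), \\ C_{\eta_\circ, \eta_*} &= \int_0^\infty \int_0^\infty \eta_\circ(t_1)\, \eta_\circ(t_2)\, \eta_*\left( \frac{n}{x} - (t_1 + t_2) \right) dt_1\, dt_2, \end{aligned} \]

    plus several small error terms. We have already chosen $\eta_\circ$, $\eta_*$ and $x$ so as to (nearly) maximize $C_{\eta_\circ, \eta_*}$.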
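
The constant $C_0$ is easy to approximate by truncating the product. The sketch below is a rough illustration, not a rigorous computation: the cutoff `prime_limit` is an arbitrary choice, prime factors of $n$ above the cutoff are ignored, and the function names are invented here.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

def singular_series(n, prime_limit=10_000):
    """Truncation of C_0 to primes p <= prime_limit (floating point, not rigorous)."""
    c0 = 1.0
    for p in primes_up_to(prime_limit):
        if n % p == 0:
            c0 *= 1 - 1 / (p - 1) ** 2
        else:
            c0 *= 1 + 1 / (p - 1) ** 3
    return c0

print(singular_series(7), singular_series(15), singular_series(8))
```

For even $n$ the factor at $p = 2$ vanishes, as it should: a sum of three odd primes is odd. For odd $n$ the factor at $p = 2$ equals $2$, and the product stays bounded away from $0$.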

    It is time to bound the minor-arc integral (1.36). As we said in §1.5, we must do better than the usual bound (1.37). Since our minor-arc bound (3.2) on $|S_\eta(\alpha, x)|$, $\alpha \sim a/q$, decreases as $q$ increases, it makes sense to use partial summation together with bounds on

    \[ \int_{\mathfrak{m}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha = \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha - \int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \]

    where $\mathfrak{m}_s$ denotes the arcs around $a/q$, $r < q \le s$, and $\mathfrak{M}_s$ denotes the arcs around all $a/q$, $q \le s$. We already know how to estimate the integral on $\mathfrak{M}$. How do we bound the integral on $\mathfrak{M}_s$?

    In order to do better than the trivial bound $\int_{\mathfrak{M}_s} \le \int_{\mathbb{R}/\mathbb{Z}}$, we will need to use the fact that the series (1.6) defining $S_{\eta_+}(\alpha, x)$ is essentially supported on prime numbers. Bounding the integral on $\mathfrak{M}_s$ is closely related to the problem of bounding

    \[ \sum_{q \le s} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left| \sum_{n \le x} a_n\, e(an/q) \right|^2 \tag{1.41} \]

    efficiently for $s$ considerably smaller than $\sqrt{x}$ and $a_n$ supported on the primes $\sqrt{x} < p \le x$. This is a classical problem in the study of the large sieve. The usual bound on (1.41) (by, for instance, Montgomery's inequality) has a gain of a factor of

    \[ 2e^\gamma (\log s)/(\log x/s^2) \]

    relative to the bound of $(x + s^2) \sum_n |a_n|^2$ that one would get from the large sieve without using prime support. Heath-Brown proceeded similarly to bound

    \[ \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2e^\gamma \log s}{\log x/s^2} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha. \tag{1.42} \]

    This already gives us the gain of $C(\log s)/\log x$ that we absolutely need, but the constant $C$ is suboptimal; the factor in the right side of (1.42) should really be $(\log s)/\log x$, i.e., $C$ should be $1$. We cannot reasonably hope to obtain a factor better than $2(\log s)/\log x$ in the minor arcs due to what is known as the parity problem in sieve theory. As it turns out, Ramaré [Ram09] had given general bounds on the large sieve that were clearly conducive to better bounds on (1.41), though they involved a ratio that was not easy to bound in general.

    I used several careful estimations (including [Ram95, Lem. 3.4]) to reduce the problem of bounding this ratio to a finite number of cases, which I then checked by a rigorous computation. This approach gave a bound on (1.41) with a factor of size close to $2(\log s)/\log x$. (This solves the large-sieve problem for $s \le x^{0.3}$; it would still be worthwhile to give a computation-free proof for all $s \le x^{1/2-\epsilon}$, $\epsilon > 0$.) It was then easy to give an analogous bound for the integral over $\mathfrak{M}_s$, namely,

    \[ \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2 \log s}{\log x} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \]

    where $\lesssim$ can easily be made precise by replacing $\log s$ by $\log s + 1.36$ and $\log x$ by $\log x + c$, where $c$ is a small constant. Without this improvement, the main theorem would still have been proved, but the required computation time would have been multiplied by a factor of considerably more than $e^{3\gamma} = 5.6499\dots$.

    What remained then was just to compare the estimates on (1.35) and (1.36) and check that (1.36) is smaller for $n \ge 10^{27}$. This final step was just bookkeeping. As we already discussed, a check for $n < 10^{27}$ is easy. Thus ends the proof of the main theorem.

    1.6 Some remarks on computations

    There were two main computational tasks: verifying the ternary conjecture for all $n \le C$, and checking the Generalized Riemann Hypothesis for modulus $q \le r$ up to a certain height.

    The first task was not very demanding. Platt and I verified in [HP13] that every odd integer $5 < n \le 8.8 \cdot 10^{30}$ can be written as the sum of three primes. (In the end, only a check for $5 < n \le 10^{27}$ was needed.) We proceeded as follows. In a major computational effort, Oliveira e Silva, Herzog and Pardi [OeSHP14] had already checked that the binary Goldbach conjecture is true up to $4 \cdot 10^{18}$, that is, every even number up to $4 \cdot 10^{18}$ is the sum of two primes. Given that, all we had to do was to construct a "prime ladder", that is, a list of primes from $3$ up to $8.8 \cdot 10^{30}$ such that the difference between any two consecutive primes in the list is at least $4$ and at most $4 \cdot 10^{18}$. (This is a known strategy: see [Sao98].) Then, for any odd integer $5 < n \le 8.8 \cdot 10^{30}$, there is a prime $p$ in the list such that $4 \le n - p \le 4 \cdot 10^{18} + 2$. (Choose the largest $p < n$ in the ladder, or, if $n$ minus that prime is $2$, choose the prime immediately under that.) By [OeSHP14] (and the fact that $4 \cdot 10^{18} + 2$ equals $p + q$, where $p = 2000000000000001301$ and $q = 1999999999999998701$ are both prime), we can write $n - p = p_1 + p_2$ for some primes $p_1$, $p_2$, and so $n = p + p_1 + p_2$.
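
The ladder argument can be tried out at toy scale. In the sketch below, the bound $4 \cdot 10^{18}$ is replaced by a hypothetical `max_gap = 100`, the range $8.8 \cdot 10^{30}$ by `limit = 2000`, and the appeal to [OeSHP14] by a brute-force search for a Goldbach pair; none of this reflects how the actual verification was implemented.

```python
from bisect import bisect_left

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_pair(m):
    """Primes (p1, p2) with p1 + p2 = m, for even m >= 4 (brute force)."""
    for p1 in range(2, m // 2 + 1):
        if is_prime(p1) and is_prime(m - p1):
            return p1, m - p1
    raise ValueError("no Goldbach pair found")

def build_ladder(limit, max_gap):
    """Primes starting at 3 with consecutive gaps in [4, max_gap], reaching at least limit."""
    ladder = [3]
    while ladder[-1] < limit:
        p = ladder[-1]
        # Greedy step: the largest prime at distance between 4 and max_gap.
        ladder.append(next(c for c in range(p + max_gap, p + 3, -1) if is_prime(c)))
    return ladder

def three_primes(n, ladder):
    """Write an odd n, 5 < n <= ladder[-1], as p + p1 + p2."""
    i = bisect_left(ladder, n) - 1      # largest ladder prime below n
    if n - ladder[i] == 2:              # a difference of 2 is too small: step down once
        i -= 1
    p = ladder[i]
    p1, p2 = goldbach_pair(n - p)
    return p, p1, p2

ladder = build_ladder(2000, 100)
print(three_primes(1999, ladder))
```

The ladder has only a couple of dozen rungs here; the same economy is what makes the real computation cheap compared with the binary verification it relies on.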

    Building a prime ladder involves only integer arithmetic, that is, computer manipulation of integers, rather than of real numbers. Integers are something that computers can handle rapidly and reliably. We look for primes for our ladder only among a special set of integers whose primality can be tested deterministically quite quickly (Proth numbers: $k \cdot 2^m + 1$, $k < 2^m$). Thus, we can build a prime ladder by a rigorous, deterministic algorithm that can be (and was) parallelized trivially.
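
For Proth numbers, primality can be certified with a single modular exponentiation, by Proth's theorem: if $N = k \cdot 2^m + 1$ with $k$ odd, $k < 2^m$, and $a^{(N-1)/2} \equiv -1 \bmod N$ for some integer $a$, then $N$ is prime. The sketch below only searches a short, arbitrarily chosen list of bases for such a certificate; it is not the procedure used in [HP13], and a result of `None` means "no certificate found among these bases", not "composite".

```python
def proth_certificate(k, m, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23)):
    """Return a base a certifying that N = k*2^m + 1 is prime, or None."""
    if k % 2 == 0 or k >= 2 ** m:
        raise ValueError("need k odd and k < 2^m")
    N = k * 2 ** m + 1
    for a in bases:
        # Proth's theorem: a^((N-1)/2) = -1 mod N implies that N is prime.
        if pow(a, (N - 1) // 2, N) == N - 1:
            return a
    return None

print(proth_certificate(3, 2))   # N = 13
print(proth_certificate(3, 5))   # N = 97
print(proth_certificate(1, 3))   # N = 9, composite, so no certificate can exist
```

Everything here is exact integer arithmetic, which is the point made in the paragraph above.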

    The second computation is more demanding. It consists in verifying that, for every $L$-function $L(s, \chi)$ with $\chi$ of conductor $q \le r = 300000$ (for $q$ even) or $q \le r/2$ (for $q$ odd), all zeros of $L(s, \chi)$ such that $|\Im(s)| \le H_q = 10^8/q$ (for $q$ odd) and $|\Im(s)| \le H_q = \max(10^8/q, 200 + 7.5 \cdot 10^7/q)$ (for $q$ even) lie on the critical line. As a matter of fact, Platt went up to conductor $q \le 200000$ (or twice that for $q$ even) [Plab]; he had already gone up to conductor $100000$ in his PhD thesis [Pla11]. The verification took, in total, about $400000$ core-hours (i.e., the total number of processor cores used times the number of hours they ran equals $400000$; nowadays, a top-of-the-line processor typically has eight cores). In the end, since I used only $q \le 150000$ (or twice that for $q$ even), the number of hours actually needed was closer to $160000$; since I could have made do with $q \le 120000$ (at the cost of increasing $C$ to $10^{29}$ or $10^{30}$), it is likely, in retrospect, that only about $80000$ core-hours were needed.

    Checking zeros of $L$-functions computationally goes back to Riemann (who did it by hand for the special case of the Riemann zeta function). It is also one of the things that were tried on digital computers in their early days (by Turing [Tur53], for instance; see the exposition in [Boo06b]). One of the main issues to be careful about arises whenever one manipulates real numbers via a computer: generally speaking, a computer cannot store an irrational number; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, one cannot really say: "computer, give me the sine of that number" and expect a precise result. What one should do, if one really wants to prove something (as is the case here), is to say: "computer, I am giving you an interval $I = [a/2^k, b/2^k]$; give me an interval $I' = [c/2^\ell, d/2^\ell]$, preferably very short, such that $\sin(I) \subset I'$". This is called interval arithmetic; it is arguably the easiest way to do floating-point computations rigorously.
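
A toy version of interval arithmetic fits in a few lines. The sketch below is an illustration only, unrelated to the packages used in the proof: it handles addition and multiplication, not $\sin$, and it obtains outward rounding by stepping one floating-point number outwards with `math.nextafter` (available from Python 3.9), which is cruder than setting the hardware rounding mode.

```python
import math
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] whose endpoints are floats (dyadic rationals)."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # Round-to-nearest is off by at most half an ulp, so one step outwards is safe.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    def contains(self, q):
        """Exact membership test for a rational number q."""
        return Fraction(self.lo) <= q <= Fraction(self.hi)

def enclose(q):
    """A short interval with float endpoints that contains the rational q."""
    x = float(q)
    return Interval(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

a, b = enclose(Fraction(1, 10)), enclose(Fraction(2, 10))
s, p = a + b, a * b
print(s.lo, s.hi, s.contains(Fraction(3, 10)))
print(p.lo, p.hi, p.contains(Fraction(2, 100)))
```

Neither $1/10$ nor $3/10$ is representable as a float, yet the statement "$1/10 + 2/10$ lies in `s`" is a theorem about the exact rationals, checked here with exact rational comparisons.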

    Processors do not do this natively, and if interval arithmetic is implemented purely on software, computations can be slowed down by a factor of about 100. Fortunately, there are ways of running interval-arithmetic computations partly on hardware, partly on software.

    Incidentally, there are some basic functions (such as $\sin$) that should always be done on software, not just if one wants to use interval arithmetic, but even if one just wants reasonably precise results: the implementation of transcendental functions in some of the most popular processors does not always round correctly, and errors can accumulate quickly. Fortunately, this problem is already well-known, and there is software that takes care of this. (Platt and I used the crlibm library [DLDDD+10].)

    Lastly, there were several relatively minor computations strewn here and there in the proof. There is some numerical integration, done rigorously; once or twice, this was done using a standard package based on interval arithmetic [Ned06], but most of the time I wrote my own routines in C (using Platt's interval arithmetic package) for the sake of speed. Another kind of computation (employed much more in [Hela] than in the somewhat more polished version of the proof given here) was a rigorous version of a "proof by graph" ("the maximum of a function $f$ is clearly less than 4 because I can see it on the screen"). There is a standard way to do this (see, e.g., [Tuc11, §5.2]); essentially, the bisection method combines naturally with interval arithmetic, as we shall describe in §2.6. Yet another computation (and not a very small one) was that involved in verifying a large-sieve inequality in an intermediate range (as we discussed in §1.5).

    It may be interesting to note that one of the inequalities used to estimate (1.30) was proven with the help of automatic quantifier elimination [HB11]. Proving this inequality was a very minor task, both computationally and mathematically; in all likelihood, it is feasible to give a human-generated proof. Still, it is nice to know from first-hand experience that computers can nowadays (pretend to) do something other than just perform numerical computations, and that this is already applicable in current mathematical practice.

    Chapter 2

    Notation and preliminaries

    21 General notationGiven positive integers m n we say m|ninfin if every prime dividing m also divides nWe say a positive integer n is square-full if for every prime p dividing n the squarep2 also divides n (In particular 1 is square-full) We say n is square-free if p2 - nfor every prime p For p prime n a non-zero integer we define vp(n) to be the largestnon-negative integer α such that pα|n

    When we writesumn we mean

    suminfinn=1 unless the contrary is stated As always

    Λ(n) denotes the von Mangoldt function

    Λ(n) =

    log p if n = pα for some prime p and some integer α ge 10 otherwise

    and micro denotes the Mobius function

    micro(n) =

    (minus1)k if n = p1p2 pk all pi distinct0 if p2|n for some prime p

We let $\tau(n)$ be the number of divisors of an integer $n$, $\omega(n)$ the number of prime divisors of $n$, and $\sigma(n)$ the sum of the divisors of $n$.

We write $(a, b)$ for the greatest common divisor of $a$ and $b$. If there is any risk of confusion with the pair $(a, b)$, we write $\gcd(a, b)$. Denote by $(a, b^\infty)$ the divisor $\prod_{p \mid b} p^{v_p(a)}$ of $a$. (Thus, $a/(a, b^\infty)$ is coprime to $b$, and is in fact the maximal divisor of $a$ with this property.)

As is customary, we write $e(x)$ for $e^{2\pi i x}$. We denote the $L_r$ norm of a function $f$ by $|f|_r$. We write $O^*(R)$ to mean a quantity at most $R$ in absolute value. Given a set $S$, we write $1_S$ for its characteristic function:

$$1_S(x) = \begin{cases} 1 & \text{if } x \in S,\\ 0 & \text{otherwise.} \end{cases}$$

Write $\log^+ x$ for $\max(\log x, 0)$.
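The less common pieces of notation above can be made concrete with a small sketch. The function names below (`v_p`, `divides_power`, `gcd_with_power`, and so on) are hypothetical helpers introduced only for illustration; they use naive trial division and are not meant to reflect how any computation in the proof was carried out.

```python
from math import gcd

def prime_factors(n):
    """Return the set of primes dividing the positive integer n (trial division)."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def v_p(p, n):
    """Largest alpha >= 0 such that p**alpha divides the non-zero integer n."""
    n, alpha = abs(n), 0
    while n % p == 0:
        n //= p
        alpha += 1
    return alpha

def divides_power(m, n):
    """m | n^infinity: every prime dividing m also divides n."""
    return all(n % p == 0 for p in prime_factors(m))

def is_square_full(n):
    """For every prime p dividing n, p^2 also divides n (so 1 is square-full)."""
    return all(n % (p * p) == 0 for p in prime_factors(n))

def is_square_free(n):
    """No prime square divides n."""
    return all(n % (p * p) != 0 for p in prime_factors(n))

def gcd_with_power(a, b):
    """(a, b^infinity) = product over primes p | b of p**v_p(a)."""
    result = 1
    for p in prime_factors(b):
        result *= p ** v_p(p, a)
    return result
```

For instance, with $a = 360 = 2^3 \cdot 3^2 \cdot 5$ and $b = 6$, the divisor $(a, b^\infty)$ is $2^3 \cdot 3^2 = 72$, and the cofactor $360/72 = 5$ is coprime to 6, as the remark in parentheses above predicts.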


2.2 Dirichlet characters and L functions

Let us go over some basic terms. A Dirichlet character $\chi: \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$ with the convention that $\chi(n) = 0$ when $(n, q) \ne 1$. (In other words: $\chi$ is completely multiplicative and periodic modulo $q$, and vanishes on integers not coprime to $q$.) Again by convention, there is a Dirichlet character of modulus $q = 1$, namely, the trivial character $\chi_T: \mathbb{Z} \to \mathbb{C}$ defined by $\chi_T(n) = 1$ for every $n \in \mathbb{Z}$.

If $\chi$ is a character modulo $q$ and $\chi'$ is a character modulo $q' \mid q$ such that $\chi(n) = \chi'(n)$ for all $n$ coprime to $q$, we say that $\chi'$ induces $\chi$. A character is primitive if it is not induced by any character of smaller modulus. Given a character $\chi$, we write $\chi^*$ for the (uniquely defined) primitive character inducing $\chi$. If a character $\chi \bmod q$ is induced by the trivial character $\chi_T$, we say that $\chi$ is principal and write $\chi_0$ for $\chi$ (provided the modulus $q$ is clear from the context). In other words, $\chi_0(n) = 1$ when $(n, q) = 1$ and $\chi_0(n) = 0$ when $(n, q) \ne 1$.

A Dirichlet L-function $L(s, \chi)$ ($\chi$ a Dirichlet character) is defined as the analytic continuation of $\sum_n \chi(n) n^{-s}$ to the entire complex plane; there is a pole at $s = 1$ if $\chi$ is principal.

A non-trivial zero of $L(s, \chi)$ is any $s \in \mathbb{C}$ such that $L(s, \chi) = 0$ and $0 < \Re(s) < 1$. (In particular, a zero at $s = 0$ is called "trivial", even though its contribution can be a little tricky to work out. The same would go for the other zeros with $\Re(s) = 0$ occurring for $\chi$ non-primitive, though we will avoid this issue by working mainly with $\chi$ primitive.) The zeros that occur at (some) negative integers are called trivial zeros.

The critical line is the line $\Re(s) = 1/2$ in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet L-functions reads: for every Dirichlet character $\chi$, all non-trivial zeros of $L(s, \chi)$ lie on the critical line. Verifiable finite versions of the generalized Riemann hypothesis generally read: for every Dirichlet character $\chi$ of modulus $q \le Q$, all non-trivial zeros of $L(s, \chi)$ with $|\Im(s)| \le f(q)$ lie on the critical line (where $f: \mathbb{Z} \to \mathbb{R}^+$ is some given function).

2.3 Fourier transforms and exponential sums

The Fourier transform on $\mathbb{R}$ is normalized here as follows:

$$\hat f(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\,dx.$$

The trivial bound is $|\hat f|_\infty \le |f|_1$. If $f$ is compactly supported (or of fast enough decay as $t \to \pm\infty$) and piecewise continuous, $\hat f(t) = \widehat{f'}(t)/(2\pi i t)$ by integration by parts. Iterating, we obtain that, if $f$ is of fast decay and differentiable $k$ times outside finitely many points, then

$$\hat f(t) = O^*\left(\frac{|\widehat{f^{(k)}}|_\infty}{(2\pi t)^k}\right) = O^*\left(\frac{|f^{(k)}|_1}{(2\pi t)^k}\right). \tag{2.1}$$


Thus, for instance, if $f$ is compactly supported, continuous and piecewise $C^1$, then $\hat f$ decays at least quadratically.

It could happen that $|f^{(k)}|_1 = \infty$, in which case (2.1) is trivial (but not false). In practice, we require $f^{(k)} \in L_1$. In a typical situation, $f$ is differentiable $k$ times except at $x_1, x_2, \dots, x_k$, where it is differentiable only $(k-2)$ times; the contribution of $x_i$ (say) to $|f^{(k)}|_1$ is then $|\lim_{x \to x_i^+} f^{(k-1)}(x) - \lim_{x \to x_i^-} f^{(k-1)}(x)|$.

The following bound is standard (see, e.g., [Tao14, Lemma 3.1]): for $\alpha \in \mathbb{R}/\mathbb{Z}$ and $f: \mathbb{R} \to \mathbb{C}$ compactly supported and piecewise continuous,

$$\left|\sum_{n \in \mathbb{Z}} f(n) e(\alpha n)\right| \le \min\left(|f|_1 + \frac{1}{2}|f'|_1,\ \frac{\frac{1}{2}|f'|_1}{|\sin(\pi\alpha)|}\right). \tag{2.2}$$

(The first bound follows from $\sum_{n \in \mathbb{Z}} |f(n)| \le |f|_1 + (1/2)|f'|_1$, which, in turn, is a quick consequence of the fundamental theorem of calculus; the second bound is proven by summation by parts.) The alternative bound $(1/4)|f''|_1/|\sin(\pi\alpha)|^2$ given in [Tao14, Lemma 3.1] (for $f$ continuous and piecewise $C^1$) can usually be improved by the following estimate.

Lemma 2.3.1. Let $f: \mathbb{R} \to \mathbb{C}$ be compactly supported, continuous and piecewise $C^1$. Then

$$\left|\sum_{n \in \mathbb{Z}} f(n) e(\alpha n)\right| \le \frac{\frac{1}{4}|\widehat{f''}|_\infty}{(\sin \pi\alpha)^2} \tag{2.3}$$

for every $\alpha \in \mathbb{R}$.

As usual, the assumption of compact support could easily be relaxed to an assumption of fast decay.

Proof. By the Poisson summation formula,

$$\sum_{n=-\infty}^{\infty} f(n) e(\alpha n) = \sum_{n=-\infty}^{\infty} \hat f(n - \alpha).$$

Since $\hat f(t) = \widehat{f'}(t)/(2\pi i t)$,

$$\sum_{n=-\infty}^{\infty} \hat f(n - \alpha) = \sum_{n=-\infty}^{\infty} \frac{\widehat{f'}(n - \alpha)}{2\pi i (n - \alpha)} = \sum_{n=-\infty}^{\infty} \frac{\widehat{f''}(n - \alpha)}{(2\pi i (n - \alpha))^2}.$$

By Euler's formula $\pi \cot s\pi = 1/s + \sum_{n=1}^{\infty} \left(1/(n+s) - 1/(n-s)\right)$,

$$\sum_{n=-\infty}^{\infty} \frac{1}{(n+s)^2} = -(\pi \cot s\pi)' = \frac{\pi^2}{(\sin s\pi)^2}. \tag{2.4}$$

Hence

$$\left|\sum_{n=-\infty}^{\infty} \hat f(n - \alpha)\right| \le |\widehat{f''}|_\infty \sum_{n=-\infty}^{\infty} \frac{1}{(2\pi (n - \alpha))^2} = |\widehat{f''}|_\infty \cdot \frac{1}{(2\pi)^2} \cdot \frac{\pi^2}{(\sin \alpha\pi)^2}.$$
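As a sanity check on Lemma 2.3.1, one can take the triangle function $f(x) = \max(N - |x|, 0)$, an example chosen here for illustration and not one used in the text. Understood as a distribution, $f'' = \delta_{-N} - 2\delta_0 + \delta_N$, so $\widehat{f''}(t) = 2\cos(2\pi N t) - 2$ and $|\widehat{f''}|_\infty = 4$; the bound (2.3) then reads $1/(\sin \pi\alpha)^2$. The exponential sum is the Fejér kernel $(\sin \pi N \alpha)^2/(\sin \pi\alpha)^2$, so the bound holds with little room to spare. A minimal numerical sketch, with hypothetical function names:

```python
import cmath
import math

def e(x):
    """e(x) = exp(2 pi i x), as in Section 2.1."""
    return cmath.exp(2j * math.pi * x)

def triangle_sum(N, alpha):
    """Sum over n in Z of f(n) e(alpha n) for f(x) = max(N - |x|, 0)."""
    return sum((N - abs(n)) * e(alpha * n) for n in range(-N + 1, N))

def lemma_bound(alpha):
    """Right side of (2.3) for the triangle function, using |hat(f'')|_inf = 4."""
    return 0.25 * 4 / math.sin(math.pi * alpha) ** 2

def fejer_closed_form(N, alpha):
    """Classical closed form of the same sum (Fejer kernel)."""
    return (math.sin(math.pi * N * alpha) / math.sin(math.pi * alpha)) ** 2
```

Comparing `abs(triangle_sum(N, alpha))` with `lemma_bound(alpha)` for a few values of $N$ and $\alpha \notin \mathbb{Z}$ shows the inequality directly.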


The trivial bound $|\widehat{f''}|_\infty \le |f''|_1$, applied to (2.3), recovers the bound in [Tao14, Lemma 3.1]. In order to do better, we will give a tighter bound for $|\widehat{f''}|_\infty$ in Appendix B when $f$ is equal to one of our main smoothing functions ($f = \eta_2$).

Integrals of multiples of $f''$ (in particular, $|f''|_1$ and $\widehat{f''}$) can still be made sense of when $f''$ is undefined at a finite number of points, provided $f$ is understood as a distribution (and $f'$ has finite total variation). This is the case, in particular, for $f = \eta_2$.

When we need to estimate $\sum_n f(n)$ precisely, we will use the Poisson summation formula:

$$\sum_n f(n) = \sum_n \hat f(n).$$

We will not have to worry about convergence here, since we will apply the Poisson summation formula only to compactly supported functions $f$ whose Fourier transforms decay at least quadratically.

2.4 Mellin transforms

The Mellin transform of a function $\phi: (0, \infty) \to \mathbb{C}$ is

$$M\phi(s) = \int_0^\infty \phi(x) x^{s-1}\,dx. \tag{2.5}$$

If $\phi(x) x^{\sigma - 1}$ is in $\ell_1$ with respect to $dx$ (i.e., $\int_0^\infty |\phi(x)| x^{\sigma - 1}\,dx < \infty$), then the Mellin transform is defined on the line $\sigma + i\mathbb{R}$. Moreover, if $\phi(x) x^{\sigma - 1}$ is in $\ell_1$ for $\sigma = \sigma_1$ and for $\sigma = \sigma_2$, where $\sigma_2 > \sigma_1$, then it is easy to see that it is also in $\ell_1$ for all $\sigma \in (\sigma_1, \sigma_2)$, and that, moreover, the Mellin transform is holomorphic on $\{s : \sigma_1 < \Re(s) < \sigma_2\}$. We then say that $\{s : \sigma_1 < \Re(s) < \sigma_2\}$ is a strip of holomorphy for the Mellin transform.

The Mellin transform becomes a Fourier transform (of $\phi(e^{-2\pi v}) e^{-2\pi v \sigma}$) by means of the change of variables $x = e^{-2\pi v}$. We thus obtain, for example, that the Mellin transform is an isometry, in the sense that

$$\int_0^\infty |f(x)|^2 x^{2\sigma}\,\frac{dx}{x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mf(\sigma + it)|^2\,dt. \tag{2.6}$$

Recall that, in the case of the Fourier transform, for $|\hat f|_2 = |f|_2$ to hold, it is enough that $f$ be in $\ell_1 \cap \ell_2$. This gives us that, for (2.6) to hold, it is enough that $f(x) x^{\sigma - 1}$ be in $\ell_1$ and $f(x) x^{\sigma - 1/2}$ be in $\ell_2$ (again, with respect to $dx$, in both cases).

We write $f *_M g$ for the multiplicative, or Mellin, convolution of $f$ and $g$:

$$(f *_M g)(x) = \int_0^\infty f(w)\, g\left(\frac{x}{w}\right) \frac{dw}{w}. \tag{2.7}$$

In general,

$$M(f *_M g) = Mf \cdot Mg \tag{2.8}$$


and

$$M(f \cdot g)(s) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} Mf(z)\, Mg(s - z)\,dz \qquad \text{[GR94, §17.32]} \tag{2.9}$$

provided that $z$ and $s - z$ are within the strips on which $Mf$ and $Mg$ (respectively) are well-defined.

We also have several useful transformation rules, just as for the Fourier transform. For example,

$$\begin{aligned} M(f'(t))(s) &= -(s - 1) \cdot Mf(s - 1),\\ M(t f'(t))(s) &= -s \cdot Mf(s),\\ M((\log t) f(t))(s) &= (Mf)'(s) \end{aligned} \tag{2.10}$$

(as in, e.g., [BBO10, Table 11.1]).

Let

$$\eta_2 = (2 \cdot 1_{[1/2, 1]}) *_M (2 \cdot 1_{[1/2, 1]}).$$

Since (see, e.g., [BBO10, Table 11.3] or [GR94, §16.43])

$$(M 1_{[a,b]})(s) = \frac{b^s - a^s}{s},$$

we see that

$$M\eta_2(s) = \left(\frac{1 - 2^{-s}}{s}\right)^2, \qquad M\eta_4(s) = \left(\frac{1 - 2^{-s}}{s}\right)^4. \tag{2.11}$$

Let $f_z = e^{-zt}$, where $\Re(z) > 0$. Then

$$(Mf_z)(s) = \int_0^\infty e^{-zt} t^{s-1}\,dt = \frac{1}{z^s} \int_0^\infty e^{-zt} (zt)^{s-1}\,d(zt) = \frac{1}{z^s} \int_0^{z\infty} e^{-u} u^{s-1}\,du = \frac{1}{z^s} \int_0^\infty e^{-t} t^{s-1}\,dt = \frac{\Gamma(s)}{z^s},$$

where the next-to-last step holds by contour integration, and the last step holds by the definition of the Gamma function $\Gamma(s)$.
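The identity $(Mf_z)(s) = \Gamma(s)/z^s$ can be checked numerically, including for complex $z$ with $\Re(z) > 0$, which is where the contour-integration step matters. The sketch below is only an illustration: it substitutes $t = e^u$ (a change of variables in the same spirit as the one that turns the Mellin transform into a Fourier transform) and applies the trapezoidal rule. The function name, the step size and the truncation limits are assumptions suited to moderate real $s$, such as $s = 2.5$, not part of the text.

```python
import cmath
import math

def mellin_exp(z, s, h=0.01, u_min=-40.0, u_max=6.0):
    """Approximate the Mellin transform of t -> exp(-z t) at a real s > 0.

    With t = exp(u), the integral becomes the integral over the real line of
    exp(-z e^u + s u) du. The integrand is smooth and decays rapidly in both
    directions, so a plain trapezoidal rule on [u_min, u_max] is adequate
    here (the truncation assumes s is not too close to 0).
    """
    steps = int(round((u_max - u_min) / h))
    total = 0j
    for k in range(steps + 1):
        u = u_min + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-z * math.exp(u) + s * u)
    return total * h

def gamma_over_power(z, s):
    """Gamma(s) / z^s, with z^s taken on the principal branch (Re z > 0)."""
    return math.gamma(s) / complex(z) ** s
```

For example, `mellin_exp(1 + 2j, 2.5)` and `gamma_over_power(1 + 2j, 2.5)` should agree to many digits.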

2.5 Bounds on sums of μ and Λ

We will need some simple explicit bounds on sums involving the von Mangoldt function Λ and the Möbius function μ. In non-explicit work, such sums are usually bounded using the prime number theorem, or rather using the properties of the zeta function ζ(s) underlying the prime number theorem. Here, however, we need robust, fully explicit bounds valid over just about any range.

For the most part, we will just be quoting the literature, supplemented with some computations when needed. The proofs in the literature are sometimes based on properties of ζ(s), and sometimes on more elementary facts.


First, let us see some bounds involving Λ. The following bound can be easily derived from [RS62, (3.23)], supplemented by a quick calculation of the contribution of powers of primes $p < 32$:

$$\sum_{n \le x} \frac{\Lambda(n)}{n} \le \log x. \tag{2.12}$$

We can derive a bound in the other direction from [RS62, (3.21)] (for $x > 1000$, adding the contribution of all prime powers $\le 1000$) and a numerical verification for $x \le 1000$:

$$\sum_{n \le x} \frac{\Lambda(n)}{n} \ge \log x - \log \frac{3}{\sqrt{2}}. \tag{2.13}$$

We also use the following older bounds:

1. By the second table in [RR96, p. 423], supplemented by a computation for $2 \cdot 10^6 \le y \le 4 \cdot 10^6$,

$$\sum_{n \le y} \Lambda(n) \le 1.0004\,y \tag{2.14}$$

for $y \ge 2 \cdot 10^6$.

2. $$\sum_{n \le y} \Lambda(n) < 1.03883\,y \tag{2.15}$$

for every $y > 0$ [RS62, Thm. 12].

For all $y > 663$,

$$\sum_{n \le y} \Lambda(n)\, n < 1.03884\,\frac{y^2}{2}, \tag{2.16}$$

where we use (2.15) and partial summation for $y > 200000$, and a computation for $663 < y \le 200000$. Using instead the second table in [RR96, p. 423], together with computations for small $y < 10^7$ and partial summation, we get that

$$\sum_{n \le y} \Lambda(n)\, n < 1.0008\,\frac{y^2}{2} \tag{2.17}$$

for $y > 1.6 \cdot 10^6$. Similarly,

$$\sum_{n \le y} \frac{\Lambda(n)}{\sqrt{n}} < 2 \cdot 1.0004\,\sqrt{y} \tag{2.18}$$

for all $y \ge 1$. It is also true that

$$\sum_{y/2 < p \le y} (\log p)^2 \le \frac{1}{2}\, y (\log y) \tag{2.19}$$

for $y \ge 117$: this holds for $y \ge 2 \cdot 758699$ by [RS75, Cor. 2] (applied to $x = y$, $x = y/2$ and $x = 2y/3$) and for $117 \le y < 2 \cdot 758699$ by direct computation.
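Bounds such as (2.12), (2.13) and (2.15) are easy to spot-check on a small range. The sketch below is a plain floating-point check with a small tolerance, not a rigorous interval-arithmetic computation of the kind described in §2.6; the function names and the tolerance are illustrative assumptions. Since the sums are step functions of $x$, an upper bound is tested at integers $x = n$, and the lower bound (2.13) is tested as $x \to n^-$.

```python
import math

def von_mangoldt_table(N):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        for multiple in range(2 * p, N + 1, p):
            is_composite[multiple] = True
        q = p
        while q <= N:          # prime powers p, p^2, p^3, ...
            lam[q] = math.log(p)
            q *= p
    return lam

def check_lambda_bounds(N, tol=1e-12):
    """Check (2.12), (2.13), (2.15) at the extremal real points x up to N."""
    lam = von_mangoldt_table(N)
    c = math.log(3 / math.sqrt(2))
    weighted, psi = 0.0, 0.0   # sums of Lambda(k)/k and of Lambda(k), k <= x
    for n in range(1, N + 1):
        # As x -> n from below, the sum only runs over k <= n - 1.
        if weighted < math.log(n) - c - tol:      # (2.13)
            return False
        weighted += lam[n] / n
        psi += lam[n]
        if weighted > math.log(n) + tol:          # (2.12) at x = n
            return False
        if psi >= 1.03883 * n:                    # (2.15) at y = n
            return False
    return True
```

The constant $\log(3/\sqrt{2})$ in (2.13) is sharp: as $x \to 3^-$ the left side is $(\log 2)/2$, which equals $\log 3 - \log(3/\sqrt{2})$. This is why the check allows a small tolerance.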

Now let us see some estimates on sums involving μ. The situation here is less satisfactory than for sums involving Λ. The main reason is that the complex-analytic approach to estimating $\sum_{n \le N} \mu(n)$ would involve $1/\zeta(s)$ rather than $\zeta'(s)/\zeta(s)$, and thus strong explicit bounds on the residues of $1/\zeta(s)$ would be needed. Thus, explicit estimates on sums involving μ are harder to obtain than estimates on sums involving Λ. This is so even though analytic number theorists are generally used (from the habit of non-explicit work) to see the estimation of one kind of sum or the other as essentially the same task.

Fortunately, in the case of sums of the type $\sum_{n \le x} \mu(n)/n$ for $x$ arbitrary (a type of sum that will be rather important for us), all we need is a saving of $(\log n)$ or $(\log n)^2$ on the trivial bound. This is provided by the following.

1. (Granville-Ramaré [GR96], Lemma 10.2)

$$\left|\sum_{\substack{n \le x\\ \gcd(n,q) = 1}} \frac{\mu(n)}{n}\right| \le 1 \tag{2.20}$$

for all $x$, $q \ge 1$.

2. (Ramaré [Ram13]; cf. El Marraki [EM95], [EM96])

$$\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \frac{0.03}{\log x} \tag{2.21}$$

for $x \ge 11815$.

3. (Ramaré [Ramb])

$$\sum_{\substack{n \le x\\ \gcd(n,q) = 1}} \frac{\mu(n)}{n} = O^*\left(\frac{1}{\log x/q} \cdot \frac{4}{5}\, \frac{q}{\phi(q)}\right) \tag{2.22}$$

for all $x$ and all $q \le x$;

$$\sum_{\substack{n \le x\\ \gcd(n,q) = 1}} \frac{\mu(n)}{n} \log \frac{x}{n} = O^*\left(1.00303\, \frac{q}{\phi(q)}\right) \tag{2.23}$$

for all $x$ and all $q$.

Improvements on these bounds would lead to improvements on type I estimates, but not in what are the worst terms overall at this point.

A computation carried out by the author has proven the following inequality for all real $x \le 10^{12}$:

$$\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \sqrt{\frac{2}{x}}. \tag{2.24}$$


The computation was conducted rigorously by means of interval arithmetic. For the sake of verification, we record that

$$5.42625 \cdot 10^{-8} \le \sum_{n \le 10^{12}} \frac{\mu(n)}{n} \le 5.42898 \cdot 10^{-8}.$$

Computations also show that the stronger bound

$$\left|\sum_{n \le x} \frac{\mu(n)}{n}\right| \le \frac{1}{2\sqrt{x}}$$

holds for all $3 \le x \le 7727068587$, but not for $x = 7727068588 - \epsilon$.

Earlier, numerical work carried out by Olivier Ramaré [Ram14] had shown that (2.24) holds for all $x \le 10^{10}$.
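The inequalities (2.20) and (2.24), and the stronger bound $1/(2\sqrt{x})$, can be spot-checked for small $x$ with a few lines of code. This is a floating-point sketch over a tiny range, nowhere near the $10^{12}$ of the rigorous computation described above; the helper names and the tolerance are assumptions made for illustration.

```python
import math

def mobius_table(N):
    """Return a list mu with mu[n] = mu(n) for 1 <= n <= N (mu[0] is unused)."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        for multiple in range(p, N + 1, p):
            if multiple > p:
                is_composite[multiple] = True
            mu[multiple] = -mu[multiple]      # one more prime factor
        for multiple in range(p * p, N + 1, p * p):
            mu[multiple] = 0                  # divisible by a prime square
    return mu

def partial_sums(N, q=1):
    """sums[n] = sum over k <= n with gcd(k, q) = 1 of mu(k)/k, for n = 0..N."""
    mu = mobius_table(N)
    sums, total = [0.0], 0.0
    for k in range(1, N + 1):
        if math.gcd(k, q) == 1:
            total += mu[k] / k
        sums.append(total)
    return sums

def check_224(N, tol=1e-12):
    """Check (2.24) for real x < N: on [n-1, n) the sum is constant, and the
    bound sqrt(2/x) is smallest as x -> n from below."""
    m = partial_sums(N)
    return all(abs(m[n - 1]) <= math.sqrt(2 / n) + tol for n in range(1, N + 1))

def check_stronger(N, tol=1e-12):
    """Check |sum| <= 1/(2 sqrt(x)) for real 3 <= x < N in the same way."""
    m = partial_sums(N)
    return all(abs(m[n - 1]) <= 0.5 / math.sqrt(n) + tol for n in range(4, N + 1))
```

Note that (2.24) is attained with equality as $x \to 2^-$, where the sum is 1 and $\sqrt{2/x} \to 1$.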

2.6 Interval arithmetic and the bisection method

Interval arithmetic has as its basic data type intervals of the form $I = [a/2^\ell, b/2^\ell]$, where $a, b, \ell \in \mathbb{Z}$ and $a \le b$. Say we have a real number $x$, and we want to know $\sin(x)$. In general, we cannot represent $x$ in a computer, in part because it may have no finite description. The best we can do is to construct an interval of the form $I = [a/2^\ell, b/2^\ell]$ in which $x$ is contained.

What we ask of a routine in an interval-arithmetic package is to construct an interval $I' = [a'/2^{\ell'}, b'/2^{\ell'}]$ in which $\sin(I)$ is contained. (In practice, this is done partly in software, by means of polynomial approximations to sin with precise error terms, and partly in hardware, by means of an efficient usage of rounding conventions.) This gives us, in effect, a value for $\sin(x)$ (namely, $(a' + b')/2^{\ell' + 1}$) and a bound on the error term (namely, $(b' - a')/2^{\ell' + 1}$).

There are several implementations of interval arithmetic available. We will almost always use D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas. (At one point, we will use the PROFIL/BIAS interval arithmetic package [Knu99], since it underlies the VNODE-LP [Ned06] package, which we use to bound an integral.)

The bisection method is a particularly simple method for finding maxima and minima of functions, as well as roots. It combines rather nicely with interval arithmetic, which makes the method rigorous. We follow an implementation based on [Tuc11, §5.2]. Let us go over the basic ideas.

Let us use the bisection method to find the minima (say) of a function $f$ on a compact interval $I_0$. (If the interval is non-compact, we generally apply the bisection method to a compact sub-interval and use other tools, e.g., power-series expansions, in the complement.) The method proceeds by splitting an interval into two repeatedly, discarding the halves where the minimum cannot be found. More precisely, if we implement it by interval arithmetic, it proceeds as follows. First, in an optional initial step, we subdivide (if necessary) the interval $I_0$ into smaller intervals $I_k$ to which the algorithm will actually be applied. For each $k$, interval arithmetic gives us a lower bound $r_k^-$ and an upper bound $r_k^+$ on $f(x)$, $x \in I_k$; here $r_k^-$ and $r_k^+$ are both of the form $a/2^\ell$, $a, \ell \in \mathbb{Z}$. Let $m_0$ be the minimum of $r_k^+$ over all $k$. We can discard all the intervals $I_k$ for which $r_k^- > m_0$. Then we apply the main procedure: starting with $i = 1$, split each surviving interval into two equal halves, recompute the lower and upper bound on each half, define $m_i$, as before, to be the minimum of all upper bounds, and discard, again, the intervals on which the lower bound is larger than $m_i$; increase $i$ by 1. We repeat the main procedure as often as needed. In the end, we obtain that the minimum is no smaller than the minimum of the lower bounds (call them $(r^{(i)})_k^-$) on all surviving intervals $I_k^{(i)}$. Of course, we also obtain that the minimum (or minima, if there is more than one) must lie in one of the surviving intervals.

It is easy to see how the same method can be applied (with a trivial modification) to find maxima, or (with very slight changes) to find the roots of a real-valued function on a compact interval.
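The procedure just described can be sketched in a few lines. The toy `Interval` class below is a stand-in written for illustration only; it is not Platt's package or PROFIL/BIAS. It supports just `+`, `-` and `*`, and obtains outward rounding by moving each computed endpoint one floating-point step outwards with `math.nextafter` (Python 3.9 or later), on the assumption that the underlying floating-point operations are correctly rounded. Endpoints are doubles, hence of the form $a/2^\ell$. The optional initial subdivision step is omitted.

```python
import math

class Interval:
    """Closed interval [lo, hi] with outward rounding after each operation."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    @staticmethod
    def _outward(lo, hi):
        # Widen by one unit in the last place on each side, so that the true
        # result of the real-number operation is certainly enclosed.
        return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval._outward(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(products), max(products))

def bisect_minimum(f, a, b, iterations=16):
    """Return (lower, upper) with lower <= min of f on [a, b] <= upper.

    f must accept an Interval and return an Interval enclosing its range.
    """
    pieces = [(a, b)]
    for _ in range(iterations + 1):
        bounds = [(f(Interval(lo, hi)), lo, hi) for lo, hi in pieces]
        m = min(val.hi for val, _, _ in bounds)           # m_i in the text
        survivors = [(val, lo, hi) for val, lo, hi in bounds if val.lo <= m]
        lower = min(val.lo for val, _, _ in survivors)
        pieces = []
        for _, lo, hi in survivors:                        # split in halves
            mid = 0.5 * (lo + hi)
            pieces += [(lo, mid), (mid, hi)]
    return lower, m
```

For instance, `bisect_minimum(lambda x: x * x - x, 0.0, 2.0)` returns a pair of numbers enclosing the true minimum $-1/4$ of $x^2 - x$ on $[0, 2]$. The naive enclosure of `x * x - x` overestimates the range on wide intervals, but the overestimate shrinks with the width of the pieces, which is why repeated bisection tightens the bounds.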


    Part I

    Minor arcs


    Chapter 3

    Introduction

The circle method expresses the number of solutions to a given problem in terms of exponential sums. Let $\eta: \mathbb{R}^+ \to \mathbb{C}$ be a smooth function, Λ the von Mangoldt function (defined as in (1.5)) and $e(t) = e^{2\pi i t}$. The estimation of exponential sums of the type

$$S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{3.1}$$

where $\alpha \in \mathbb{R}/\mathbb{Z}$, already lies at the basis of Hardy and Littlewood's approach to the ternary Goldbach problem by means of the circle method [HL22]. The division of the circle $\mathbb{R}/\mathbb{Z}$ into "major arcs" and "minor arcs" goes back to Hardy and Littlewood's development of the circle method for other problems. As they themselves noted, assuming GRH means that, for the ternary Goldbach problem, all of the circle can be, in effect, subdivided into major arcs; that is, under GRH, (3.1) can be estimated with major-arc techniques for α arbitrary. They needed to make such an assumption precisely because they did not yet know how to estimate $S_\eta(\alpha, x)$ on the minor arcs.

Minor-arc techniques for Goldbach's problem were first developed by Vinogradov [Vin37]. These techniques make it possible to work without GRH. The main obstacle to a full proof of the ternary Goldbach conjecture since then has been that, in spite of gradual improvements, minor-arc bounds have simply not been strong enough.

As in all work to date, our aim will be to give useful upper bounds on (3.1) for α in the minor arcs, rather than the precise estimates that are typical of the major-arc case. We will have to give upper bounds that are qualitatively stronger than those known before. (In Part III, we will also show how to use them more efficiently.)

Our main challenge will be to give a good upper bound whenever $q$ is larger than a constant $r$. Here "sufficiently good" means "smaller than the trivial bound divided by a large constant, and getting even smaller quickly as $q$ grows". Our bound must also be good for $\alpha = a/q + \delta/x$, where $q < r$ but δ is large. (Such an α may be said to lie on the tail (δ large) of a major arc ($q$ small).)

Of course, all expressions must be explicit, and all constants in the leading terms of the bound must be small. Still, the main requirement is a qualitative one. For instance, we know in advance that a single factor of $\log x$ would be the end of us. That is, we know that, if there is a single term of the form, say, $(x \log x)/q$, and the trivial bound is about $x$, we are lost: $(x \log x)/q$ is greater than $x$ for $x$ large and $q$ constant.

The quality of the results here is due to several new ideas of general applicability. In particular, §5.1 introduces a way to obtain cancellation from Vaughan's identity. Vaughan's identity is a two-log gambit, in that it introduces two convolutions (each of them at a cost of log) and offers a great deal of flexibility in compensation. One of the ideas presented here is that at least one of two logs can be successfully recovered after having been given away in the first stage of the proof. This reduces the cost of the use of this basic identity in this and, presumably, many other problems.

There are several other improvements that make a qualitative difference; see the discussions at the beginning of §4 and §5. Considering smoothed sums (now a common idea) also helps. (Smooth sums here go back to Hardy-Littlewood [HL22], both in the general context of the circle method and in the context of Goldbach's ternary problem. In recent work on the problem, they reappear in [Tao14].)

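3.1 Results

The main bound we are about to see is essentially proportional to $((\log q)/\sqrt{\phi(q)}) \cdot x$. The term $\delta_0$ serves to improve the bound when we are on the tail of an arc.

Theorem 3.1.1. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Let $S_\eta(\alpha, x)$ be as in (3.1), with η defined in (3.4). Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4) x^{2/3}$. If $q \le x^{1/3}/6$, then

$$|S_\eta(\alpha, x)| \le \frac{R_{x, \delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x, \delta_0 q, q} + 3.36 x^{5/6}, \tag{3.2}$$

where

$$\begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004 t}}\right) + 0.41415,\\ L_{x,t,q} &= \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \end{aligned} \tag{3.3}$$

If $q > x^{1/3}/6$, then

$$|S_\eta(\alpha, x)| \le 0.276\, x^{5/6} (\log x)^{3/2} + 1234\, x^{2/3} \log x.$$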
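To get a feeling for the size of the factor $R_{x,t}$ in (3.3), it can simply be evaluated. The snippet below is a direct floating-point transcription of the formula as reconstructed above (the function name `R` is just a label for this sketch); for $x = 10^{25}$ and $t = \delta_0 q = 5 \cdot 10^5$ it gives a value of roughly 0.5965.

```python
import math

def R(x, t):
    """R_{x,t} from (3.3): 0.27125 log(1 + log(4t) / (2 log(9 x^(1/3) / (2.004 t)))) + 0.41415."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

value = R(1e25, 5e5)  # a typical "difficult" case: x = 10^25, delta_0 * q = 5 * 10^5
```

The dependence on $x$ is very mild: $x$ enters only through $\log x^{1/3}$ inside a further logarithm, so $R_{x,t}$ stays of moderate size across the whole range of interest.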

The factor $R_{x,t}$ is small in practice; for instance, for $x = 10^{25}$ and $\delta_0 q = 5 \cdot 10^5$ (typical "difficult" values), $R_{x, \delta_0 q}$ equals about 0.59648.

The classical choice (see footnote 1) for η in (3.1) is $\eta(t) = 1$ for $t \le 1$, $\eta(t) = 0$ for $t > 1$, which, of course, is not smooth, or even continuous. We use

$$\eta(t) = \eta_2(t) = 4 \max(\log 2 - |\log 2t|, 0), \tag{3.4}$$

as in Tao [Tao14], in part for purposes of comparison. (This is the multiplicative convolution of the characteristic function of an interval with itself.) Nearly all work should be applicable to any other sufficiently smooth function η of fast decay. It is important that $\hat\eta$ decay at least quadratically.

Footnote 1: Or, more precisely, the choice made by Vinogradov and followed by most of the literature since him. Hardy and Littlewood [HL22] worked with $\eta(t) = e^{-t}$.

We are not forced to use the same smoothing function as in Part II, and we do not. As was explained in the introduction, the simple technique (1.40) allows us to work with one smoothing function on the major arcs and with another one on the minor arcs.

3.2 Comparison to earlier work

Table 3.1 compares the bounds for the ratio $|S_\eta(a/q, x)|/x$ given by this paper and by [Tao14, Thm. 1.3] for $x = 10^{27}$ and different values of $q$. We are comparing worst cases: $\phi(q)$ as small as possible ($q$ divisible by $2 \cdot 3 \cdot 5 \cdots$) in the result here, and $q$ divisible by 4 (implying $4\alpha \sim a/(q/4)$) in Tao's result. The main term in the result in this paper improves slowly with increasing $x$; the results in [Tao14] worsen slowly with increasing $x$. The qualitative gain with respect to the main term in [Tao14, (1.10)] is in the order of $\log(q) \sqrt{\phi(q)/q}$. Notice also that the bounds in [Tao14] are not log-free; in [Tao14, (1.10)], there is a term proportional to $x (\log x)^2/q$. This becomes larger than the trivial bound $x$ for $x$ very large.

The results in [DR01] are unfortunately worse than the trivial bound in the range covered by Table 3.1. Ramaré's results ([Ram10, Thm. 3], [Ramc, Thm. 6]) are not applicable within the range, since neither of the conditions $\log q \le (1/50)(\log x)^{1/3}$, $q \le x^{1/48}$ is satisfied. Ramaré's bound in [Ramc, Thm. 6] is

$$\left|\sum_{x < n \le 2x} \Lambda(n) e(an/q)\right| \le 13000\, \frac{\sqrt{q}}{\phi(q)}\, x \tag{3.5}$$

for $20 \le q \le x^{1/48}$. We should underline that, while both the constant 13000 and the condition $q \le x^{1/48}$ keep (3.5) from being immediately useful in the present context, (3.5) is asymptotically better than the results here as $q \to \infty$. (Indeed, qualitatively speaking, the form of (3.5) is the best one can expect from results derived by the family of methods stemming from Vinogradov's work.) There is also unpublished work by Ramaré (ca. 1993) with different constants for $q \ll (\log x/\log\log x)^4$.

q0            |S_η(a/q, x)|/x, HH    |S_η(a/q, x)|/x, Tao
10^5          0.04661                0.34475
1.5 · 10^5    0.03883                0.28836
2.5 · 10^5    0.03098                0.23194
5 · 10^5      0.02297                0.17416
7.5 · 10^5    0.01934                0.14775
10^6          0.01756                0.13159
10^7          0.00690                0.05251

Table 3.1: Worst-case upper bounds on $x^{-1} |S_\eta(a/2q, x)|$ for $q \ge q_0$, $|\delta| \le 8$, $x = 10^{27}$. The trivial bound is 1.

3.3 Basic setup

In the minor-arc regime, the first step in estimating an exponential sum on the primes generally consists in the application of an identity expressing the von Mangoldt function $\Lambda(n)$ in terms of a sum of convolutions of other functions.

3.3.1 Vaughan's identity

We recall Vaughan's identity [Vau77b]:

$$\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{> U} * \Lambda_{> V} * 1 + \Lambda_{\le V}, \tag{3.6}$$

where 1 is the constant function 1 and where we write

$$f_{\le z}(n) = \begin{cases} f(n) & \text{if } n \le z,\\ 0 & \text{if } n > z, \end{cases} \qquad f_{> z}(n) = \begin{cases} 0 & \text{if } n \le z,\\ f(n) & \text{if } n > z. \end{cases}$$

Here $f * g$ denotes the Dirichlet convolution $(f * g)(n) = \sum_{d \mid n} f(d) g(n/d)$. We can set the values of $U$ and $V$ however we wish.

Vaughan's identity is essentially a consequence of the Möbius inversion formula

$$(1 * \mu)(n) = \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{otherwise.} \end{cases} \tag{3.7}$$

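Indeed, by (3.7),

$$\Lambda_{> V}(n) = \sum_{dm \mid n} \mu(d) \Lambda_{> V}(m) = \sum_{dm \mid n} \mu_{\le U}(d) \Lambda_{> V}(m) + \sum_{dm \mid n} \mu_{> U}(d) \Lambda_{> V}(m).$$

Applying to this the trivial equality $\Lambda_{> V} = \Lambda - \Lambda_{\le V}$, as well as the simple fact that $1 * \Lambda = \log$, we obtain that

$$\Lambda_{> V}(n) = \sum_{d \mid n} \mu_{\le U}(d) \log(n/d) - \sum_{dm \mid n} \mu_{\le U}(d) \Lambda_{\le V}(m) + \sum_{dm \mid n} \mu_{> U}(d) \Lambda_{> V}(m).$$

By $\Lambda = \Lambda_{> V} + \Lambda_{\le V}$, we conclude that Vaughan's identity (3.6) holds.

Applying Vaughan's identity, we easily get that, for any function $\eta: \mathbb{R} \to \mathbb{R}$, any completely multiplicative function $f: \mathbb{Z}^+ \to \mathbb{C}$ and any $x > 0$, $U, V \ge 0$,

$$\sum_n \Lambda(n) f(n) e(\alpha n) \eta(n/x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty}, \tag{3.8}$$

where

$$\begin{aligned} S_{I,1} &= \sum_{m \le U} \mu(m) f(m) \sum_n (\log n)\, e(\alpha m n) f(n) \eta(mn/x),\\ S_{I,2} &= \sum_{d \le V} \Lambda(d) f(d) \sum_{m \le U} \mu(m) f(m) \sum_n e(\alpha d m n) f(n) \eta(dmn/x),\\ S_{II} &= \sum_{m > U} f(m) \Biggl(\sum_{\substack{d > U\\ d \mid m}} \mu(d)\Biggr) \sum_{n > V} \Lambda(n) e(\alpha m n) f(n) \eta(mn/x),\\ S_{0,\infty} &= \sum_{n \le V} \Lambda(n) e(\alpha n) f(n) \eta(n/x). \end{aligned} \tag{3.9}$$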

We will use the function

$$f(n) = \begin{cases} 1 & \text{if } \gcd(n, v) = 1,\\ 0 & \text{otherwise,} \end{cases} \tag{3.10}$$

where $v$ is a small, positive, square-free integer. (Our final choice will be $v = 2$.) Then

$$S_\eta(\alpha, x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty} + S_{0,v}, \tag{3.11}$$

where $S_\eta(\alpha, x)$ is as in (3.1) and

$$S_{0,v} = \sum_{n \mid v} \Lambda(n) e(\alpha n) \eta(n/x).$$

The sums $S_{I,1}$, $S_{I,2}$ are called "of type I", the sum $S_{II}$ is called "of type II" (or bilinear). (The not-all-too colorful nomenclature goes back to Vinogradov.) The sum $S_{0,\infty}$ is in general negligible; for our later choice of $V$ and η, it will be in fact 0. The sum $S_{0,v}$ will be negligible as well.

As we already discussed in the introduction, Vaughan's identity is highly flexible (in that we can choose $U$ and $V$ at will) but somewhat inefficient in practice (in that a trivial estimate for the right side of (3.11) is actually larger than a trivial estimate for the left side of (3.11)). Some of our work will consist in regaining part of what is given up when we apply Vaughan's identity.
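Vaughan's identity (3.6) is an exact identity between arithmetic functions, so it can be checked numerically for all $n$ up to a small bound and any choice of $U$ and $V$. The sketch below does this in floating point; all function names are illustrative, and the code is a check of the identity itself, not of any estimate in the text. The second term enters with a minus sign, in agreement with the derivation from (3.7) and with (3.8).

```python
import math

def arithmetic_tables(N):
    """Return (Lambda, mu) as lists indexed by n = 0..N (index 0 is unused)."""
    lam = [0.0] * (N + 1)
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        for multiple in range(p, N + 1, p):
            if multiple > p:
                is_composite[multiple] = True
            mu[multiple] = -mu[multiple]
        for multiple in range(p * p, N + 1, p * p):
            mu[multiple] = 0
        q = p
        while q <= N:
            lam[q] = math.log(p)
            q *= p
    return lam, mu

def dirichlet(f, g, N):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) g(n/d), n <= N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] == 0:
            continue
        for k in range(1, N // d + 1):
            h[d * k] += f[d] * g[k]
    return h

def vaughan_defect(N, U, V):
    """Largest |Lambda(n) - (right side of (3.6))(n)| over 1 <= n <= N."""
    lam, mu = arithmetic_tables(N)
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    one = [0.0] + [1.0] * N
    mu_le = [mu[n] if n <= U else 0 for n in range(N + 1)]
    mu_gt = [mu[n] if n > U else 0 for n in range(N + 1)]
    lam_le = [lam[n] if n <= V else 0.0 for n in range(N + 1)]
    lam_gt = [lam[n] if n > V else 0.0 for n in range(N + 1)]
    t1 = dirichlet(mu_le, log, N)
    t2 = dirichlet(dirichlet(mu_le, lam_le, N), one, N)
    t3 = dirichlet(dirichlet(mu_gt, lam_gt, N), one, N)
    return max(abs(lam[n] - (t1[n] - t2[n] + t3[n] + lam_le[n]))
               for n in range(1, N + 1))
```

Calling, say, `vaughan_defect(500, 7, 11)` should return a number at the level of rounding error, whatever values of $U \ge 1$ and $V$ are chosen; this reflects the flexibility mentioned above.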

3.3.2 An alternative route

There is an alternative route, namely, to use a less sacrificial, though also more inflexible, identity. While this was not, in the end, the route that was followed, let us nevertheless discuss it in some detail, in part so that we can understand to what extent it was, in retrospect, viable, and in part so as to see how much of the work we will undertake is really more or less independent of the particular identity we choose.


Since $-\zeta'(s)/\zeta(s) = \sum_n \Lambda(n) n^{-s}$ and

$$\begin{aligned} \left(\frac{\zeta'(s)}{\zeta(s)}\right)^{(2)} &= \left(\frac{\zeta''(s)}{\zeta(s)} - \frac{(\zeta'(s))^2}{\zeta(s)^2}\right)'\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - \frac{3 \zeta''(s) \zeta'(s)}{\zeta(s)^2} + 2 \left(\frac{\zeta'(s)}{\zeta(s)}\right)^3\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - 3 \left(\frac{\zeta'(s)}{\zeta(s)}\right)' \cdot \frac{\zeta'(s)}{\zeta(s)} - \left(\frac{\zeta'(s)}{\zeta(s)}\right)^3, \end{aligned} \tag{3.12}$$

we can see, comparing coefficients, that

$$\Lambda \cdot \log^2 = \mu * \log^3 - 3 (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda, \tag{3.13}$$

as was stated by Bombieri in [Bom76].

Here the term $\mu * \log^3$ is of the same kind as the term $\mu_{\le U} * \log$ we have to estimate if we use Vaughan's identity, though the fact that there is no truncation at $U$ means that one of the error terms will get larger: it will be proportional to $x$, in fact, if we sum from 1 to $x$. The trivial upper bound on the sum of $\Lambda \cdot \log^2$ from 1 to $x$ is $x (\log x)^2$; thus, an error term of size $x$ is barely acceptable.

In general, when we have a double or triple sum, we are not very good at getting better than trivial bounds in ranges in which all but one of the variables are very small. This is the source of the large error term that appears in the sum involving $\mu * \log^3$, because we are no longer truncating as for $\mu_{\le U} * \log$. It will also be the source of other large error terms, including one that would be too large, namely, the one coming from the term $(\Lambda \cdot \log) * \Lambda$ when the variable of $\Lambda \cdot \log$ is large and that of Λ is small. (The trivial bound on that range is $x \log x$.)

We avoid this problem by substituting the identity $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$ inside (3.13):

$$\Lambda \cdot \log^2 = \mu * \log^3 - 3 (\mu * \log^2) * \Lambda + 2 \Lambda * \Lambda * \Lambda. \tag{3.14}$$

(We could also have got this directly from the next-to-last line in (3.12).) When the variable of Λ in $(\mu * \log^2) * \Lambda$ is small, the variable of $\mu * \log^2$ is large, and we can estimate the resulting term using the same techniques as for $\mu * \log^3$.

It is easy to see that we can in fact mix (3.13) and (3.14):

$$\begin{aligned} \Lambda \cdot \log^2 = \mu * \log^3 &- 3 \left((\Lambda \cdot \log) * \Lambda_{> V} + (\mu * \log^2) * \Lambda_{\le V}\right)\\ &+ \left(-\Lambda_{> V} * \Lambda * \Lambda + 2 \Lambda_{\le V} * \Lambda * \Lambda\right) \end{aligned} \tag{3.15}$$

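for $V$ arbitrary. Note here that there is some cancellation in the last term: writing

$$F_{3,V}(n) = \left(-\Lambda_{> V} * \Lambda * \Lambda + 2 \Lambda_{\le V} * \Lambda * \Lambda\right)(n), \tag{3.16}$$

we can check easily that, for $n = p_1 p_2 p_3$ square-free with $V^3 < n$, we have

$$F_{3,V}(n) = \begin{cases} -6 \log p_1 \log p_2 \log p_3 & \text{if all } p_i > V,\\ 0 & \text{if } p_1 \le V < p_2 < p_3,\\ 6 \log p_1 \log p_2 \log p_3 & \text{if } p_1 < p_2 \le V < p_3,\\ 12 \log p_1 \log p_2 \log p_3 & \text{if all } p_i \le V. \end{cases}$$

(If exactly $k$ of the three primes are $\le V$, then $\Lambda_{\le V} * \Lambda * \Lambda$ contributes $2k \log p_1 \log p_2 \log p_3$ and $\Lambda_{> V} * \Lambda * \Lambda$ contributes $2(3 - k) \log p_1 \log p_2 \log p_3$, so that the coefficient is $6k - 6$.) In contrast, for $n$ square-free, $-\Lambda * \Lambda * \Lambda(n)$ is $-6 \log p_1 \log p_2 \log p_3$ if $n$ is of the form $p_1 p_2 p_3$, and 0 otherwise.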
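Both the identity (3.14) and the values of $F_{3,V}$ on products of three distinct primes can be verified numerically on a small range. The following sketch (floating point, with illustrative function names) checks (3.14) for all $n \le N$ and evaluates $F_{3,V}(30)$ for several values of $V$, so that $k = 0, 1, 2, 3$ of the primes 2, 3, 5 are $\le V$. The choice $n = 30$ ignores the side condition $V^3 < n$, which plays no role in the computation of the coefficient.

```python
import math

def tables(N):
    """Return (Lambda, mu) as lists indexed by n = 0..N (index 0 is unused)."""
    lam = [0.0] * (N + 1)
    mu = [1] * (N + 1)
    mu[0] = 0
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if composite[p]:
            continue
        for m in range(p, N + 1, p):
            if m > p:
                composite[m] = True
            mu[m] = -mu[m]
        for m in range(p * p, N + 1, p * p):
            mu[m] = 0
        q = p
        while q <= N:
            lam[q] = math.log(p)
            q *= p
    return lam, mu

def conv(f, g, N):
    """Dirichlet convolution of f and g, truncated at N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] != 0:
            for k in range(1, N // d + 1):
                h[d * k] += f[d] * g[k]
    return h

def defect_314(N):
    """Largest deviation between the two sides of (3.14) over 1 <= n <= N."""
    lam, mu = tables(N)
    log2 = [0.0] + [math.log(n) ** 2 for n in range(1, N + 1)]
    log3 = [0.0] + [math.log(n) ** 3 for n in range(1, N + 1)]
    a = conv(mu, log3, N)
    b = conv(conv(mu, log2, N), lam, N)
    c = conv(conv(lam, lam, N), lam, N)
    return max(abs(lam[n] * log2[n] - (a[n] - 3 * b[n] + 2 * c[n]))
               for n in range(1, N + 1))

def F3(n, V):
    """F_{3,V}(n) as in (3.16)."""
    lam, _ = tables(n)
    lam_le = [lam[k] if k <= V else 0.0 for k in range(n + 1)]
    lam_gt = [lam[k] if k > V else 0.0 for k in range(n + 1)]
    minus = conv(conv(lam_gt, lam, n), lam, n)
    plus = conv(conv(lam_le, lam, n), lam, n)
    return -minus[n] + 2 * plus[n]
```

With $P = \log 2 \cdot \log 3 \cdot \log 5$, one should find that `F3(30, 1)`, `F3(30, 2)`, `F3(30, 3)` and `F3(30, 5)` are close to $-6P$, $0$, $6P$ and $12P$, respectively.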

We may find it useful to take aside two large terms that may need to be bounded trivially, namely, $\mu * \log^3_{\le u}$ and $(\Lambda \cdot \log)_{\le u} * \Lambda_{> V}$, where $u$ will be a small parameter. (We can let, for instance, $u = 3$.) We conclude that

$$\Lambda \cdot \log^2 = F_{I,1,u}(n) - 3 F_{I,2,V,u}(n) - 3 F_{II,V,u}(n) + F_{3,V}(n) + F_{0,V,u}(n), \tag{3.17}$$

where

$$\begin{aligned} F_{I,1,u} &= \mu * \log^3_{> u},\\ F_{I,2,V,u} &= (\mu * \log^2) * \Lambda_{\le V},\\ F_{II,V,u} &= (\Lambda \cdot \log)_{> u} * \Lambda_{> V},\\ F_{0,V,u} &= \mu * \log^3_{\le u} - 3 (\Lambda \cdot \log)_{\le u} * \Lambda_{> V}, \end{aligned}$$

and $F_{3,V}$ is as in (3.16).

In the bulk of the present work (in particular, in all steps that are part of the proof of Theorem 3.1.1 or the Main Theorem) we will use Vaughan's identity, rather than (3.17). This choice was made while the proof was still underway; it was due mainly to back-of-the-envelope estimates that showed that the error terms could be too large if (3.14) was used. Of course, this might have been the case with Vaughan's identity as well, but the fact that the parameters $U$, $V$ there have a large effect on the outcome meant that one could hope to improve on insufficient estimates in part by adjusting $U$ and $V$, without losing all previous work. (This is what was meant by the "flexibility" of Vaughan's identity.)

The question remains: can one prove ternary Goldbach using (3.17) rather than Vaughan's identity? This seems likely. If so, which proof would be more complicated? This is not clear.

    There are large parts of the work that are essentially the same in both cases:

    • estimates for sums involving $\mu_{\le U} * \log^k$ (“type I”),

    • estimates for sums involving $\Lambda_{>u} * \Lambda_{>V}$, and the like (“type II”).

    Trilinear sums, i.e., sums involving $\Lambda * \Lambda * \Lambda$, can be estimated much like bilinear sums, i.e., sums involving $\Lambda * \Lambda$.

    There are also challenges that appear only for Vaughan's identity and others that appear only for (3.17). An example of a challenge that is successfully faced in the main proof, but does not appear if (3.17) is used, consists in bounding sums of type

    \[\sum_{U < m \le x/W} \Bigg(\sum_{\substack{d > U\\ d|m}} \mu(d)\Bigg)^2.\]

    (In §5.1, we will be able to bound sums of this type by a constant times $x/W$.) Likewise, large tail terms that have to be estimated trivially seem unavoidable in (3.17). (The choice of a parameter $u > 1$ as above is meant to alleviate the problem.)


    In the end, losing a factor of about $\log x/UV$ seems inevitable when one uses Vaughan's identity, but not when one uses (3.17). Another reason why a full treatment based on (3.17) would also be worthwhile is that it is a somewhat less familiar and arguably under-used identity, and deserves more exploration. With these comments, we close the discussion of (3.17); we will henceforth use Vaughan's identity.

    Chapter 4

    Type I sums

    Here we must bound sums of the basic type

    \[\sum_{m\le D} \mu(m) \sum_n e(\alpha m n)\,\eta\!\left(\frac{mn}{x}\right)\]

    and variations thereof. There are three main improvements in comparison to standard treatments:

    1. The terms with $m$ divisible by $q$ get taken out and treated separately by analytic means. This all but eliminates what would otherwise be the main term.

    2. The other terms get handled by improved estimates on trigonometric sums. For large $m$, the improvements have a substantial total effect: more than a constant factor is gained.

    3. The “error” term $\delta/x = \alpha - a/q$ is used to our advantage. This happens both through the Poisson summation formula and through the use of two alternative approximations to the same number $\alpha$.

    The fact that a continuous weight $\eta$ is used (“smoothing”) is a difference with respect to the classical literature ([Vin37] and what followed), but not with respect to more recent work (including [Tao14]); using smooth or continuous weights is an idea that has become commonplace in analytic number theory, even though it is not consistently applied. The improvements due to smoothing in type I are both relatively minor and essentially independent of the improvements due to (1) and (3). The use of a continuous weight combines nicely with (2), but the ideas given here would give qualitative improvements in the treatment of trigonometric sums even in the absence of smoothing.

    4.1 Trigonometric sums

    The following lemmas on trigonometric sums improve on the best Vinogradov-type lemmas in the literature. (By this, we mean results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]. See, in particular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is to switch between different types of approximation within the sum, rather than just choosing between bounding all terms either trivially (by $A$) or non-trivially (by $C/|\sin(\pi\alpha n)|^2$). There will also¹ be improvements in our applications stemming from the fact that Lemma 4.1.1 and Lemma 4.1.2 take quadratic ($|\sin(\pi\alpha n)|^2$) rather than linear ($|\sin(\pi\alpha n)|$) inputs. (These improved inputs come from the use of smoothing elsewhere.)

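    Lemma 4.1.1. Let $\alpha = a/q + \beta/qQ$, $(a,q) = 1$, $|\beta| \le 1$, $q \le Q$. Then, for any $A, C \ge 0$,

    \[\sum_{y < n \le y+q} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(2A + \frac{6q^2}{\pi^2}C,\; 3A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.1}\]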

    Proof. We start by letting $m_0 = \lfloor y\rfloor + \lfloor (q+1)/2\rfloor$, $j = n - m_0$, so that $j$ ranges in the interval $(-q/2, q/2]$. We write

    \[\alpha n = \frac{aj + c}{q} + \delta_1(j) + \delta_2 \mod 1,\]

    where $|\delta_1(j)|$ and $|\delta_2|$ are both $\le 1/2q$; we can assume $\delta_2 \ge 0$. The variable $r = aj + c \mod q$ occupies each residue class mod $q$ exactly once.

    One option is to bound the terms corresponding to $r = 0, -1$ by $A$ each and all the other terms by $C/|\sin(\pi\alpha n)|^2$. (This can be seen as the simple case; it will take us about a page just because we should estimate all sums and all terms here with great care, as in [DR01], only more so.)

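    The terms corresponding to $r = -k$ and $r = k - 1$ ($2 \le k \le q/2$) contribute at most

    \[\frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac12 - q\delta_2\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac32 + q\delta_2\right)} \le \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac12\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac32\right)},\]

    since $x \mapsto 1/(\sin x)^2$ is convex-up on $(0,\infty)$. Hence the terms with $r \ne 0, 1$ contribute at most

    \[\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\sum_{2\le r\le q/2} \frac{1}{\left(\sin\frac{\pi}{q}\left(r - \frac12\right)\right)^2} \le \frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx,\]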

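    where we use again the convexity of $x \mapsto 1/(\sin x)^2$. (We can assume $q > 2$, as otherwise we have no terms other than $r = 0, 1$.) Now

    \[\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx = \frac{q}{\pi}\int_{\pi/q}^{\pi/2} \frac{1}{(\sin u)^2}\,du = \frac{q}{\pi}\cot\frac{\pi}{q}.\]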

    ¹ This is a change with respect to the first version of the preprint [Helb]. The version of Lemma 4.1.1 there has, however, the advantage of being immediately comparable to results in the literature.


    Hence

    \[\sum_{y < n \le y+q} \min\left(A, \frac{C}{(\sin\pi\alpha n)^2}\right) \le 2A + \frac{C}{\left(\sin\frac{\pi}{2q}\right)^2} + C\cdot\frac{2q}{\pi}\cot\frac{\pi}{q}.\]

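    Now, by [AS64, (4.3.68)] and [AS64, (4.3.70)], for $t \in (-\pi, \pi)$,

    \[\begin{aligned} \frac{t}{\sin t} &= 1 + \sum_{k\ge 0} a_{2k+1} t^{2k+2} = 1 + \frac{t^2}{6} + \dots,\\ t\cot t &= 1 - \sum_{k\ge 0} b_{2k+1} t^{2k+2} = 1 - \frac{t^2}{3} - \frac{t^4}{45} - \dots,\end{aligned} \tag{4.2}\]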

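    where $a_{2k+1} \ge 0$, $b_{2k+1} \ge 0$. Thus, for $t \in [0, t_0]$, $t_0 < \pi$,

    \[\left(\frac{t}{\sin t}\right)^2 = 1 + \frac{t^2}{3} + c_0(t)\,t^4 \le 1 + \frac{t^2}{3} + c_0(t_0)\,t^4, \tag{4.3}\]

    where

    \[c_0(t) = \frac{1}{t^4}\left(\left(\frac{t}{\sin t}\right)^2 - \left(1 + \frac{t^2}{3}\right)\right),\]

    which is an increasing function because $a_{2k+1} \ge 0$. For $t_0 = \pi/4$, $c_0(t_0) \le 0.074807$. Hence,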

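    \[\begin{aligned} \frac{t^2}{\sin^2 t} + t\cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\!\left(\frac{\pi}{4}\right)t^4\right) + \left(\frac12 - \frac{2t^2}{3} - \frac{8t^4}{45}\right)\\ &= \frac32 - \frac{t^2}{3} + \left(c_0\!\left(\frac{\pi}{4}\right) - \frac{8}{45}\right)t^4 \le \frac32 - \frac{t^2}{3} \le \frac32\end{aligned}\]

    for $t \in [0, \pi/4]$.

    Therefore, the left side of (4.1) is at most

    \[2A + C\cdot\left(\frac{2q}{\pi}\right)^2\cdot\frac32 = 2A + \frac{6}{\pi^2}Cq^2.\]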
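    The two numerical facts used in this step, the bound on $c_0(\pi/4)$ and the inequality $t^2/\sin^2 t + t\cot 2t \le 3/2$ on $(0, \pi/4]$, can be spot-checked in floating point. This is only an illustrative sketch (the function names and the grid are assumptions), not a substitute for the series argument above.

```python
import math

def c0(t):
    """c_0(t) = ((t / sin t)^2 - (1 + t^2/3)) / t^4, as defined after (4.3)."""
    return ((t / math.sin(t)) ** 2 - (1 + t * t / 3)) / t ** 4

def g(t):
    """t^2 / sin^2(t) + t * cot(2t), the quantity bounded by 3/2 on (0, pi/4]."""
    return (t / math.sin(t)) ** 2 + t / math.tan(2 * t)

def max_g_on_grid(steps=1000):
    """Maximum of g over an evenly spaced grid in (0, pi/4] (grid is arbitrary)."""
    t0 = math.pi / 4
    return max(g(k * t0 / steps) for k in range(1, steps + 1))
```

    In double precision, `c0(math.pi / 4)` comes out near 0.0738, consistent with the stated upper bound 0.074807.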

    The following is an alternative approach; it yields the other estimate in (4.1). We bound the terms corresponding to $r = 0$, $r = -1$, $r = 1$ by $A$ each. We let $r = \pm r'$ for $r'$ ranging from 2 to $q/2$. We obtain that the sum is at most

    \[3A + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 - q\delta_2\right)\right)^2}\right) + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 + q\delta_2\right)\right)^2}\right). \tag{4.4}\]


    We bound a term $\min(A, C/\sin((\pi/q)(r' - 1/2 \pm q\delta_2))^2)$ by $A$ if and only if $C/\sin((\pi/q)(r' - 1 \pm q\delta_2))^2 \ge A$. (In other words, we are choosing which of the two bounds $A$, $C/|\sin(\pi\alpha n)|^2$ to use on a case-by-case basis, i.e., for each $n$, instead of making a single choice for all $n$ in one go. This is hardly anything deep, but it does result in a marked improvement with respect to the literature, and would give an improvement even if we were given a bound $B/|\sin(\pi\alpha n)|$ instead of a bound $C/|\sin(\pi\alpha n)|^2$ as input.) The number of such terms is

    \[\le \max\left(0, \left\lfloor (q/\pi)\arcsin\left(\sqrt{C/A}\right) \mp q\delta_2\right\rfloor\right),\]

    and thus at most $(2q/\pi)\arcsin(\sqrt{C/A})$ in total. (Recall that $q\delta_2 \le 1/2$.) Each other term gets bounded by the integral of $C/\sin^2(\pi\alpha/q)$ from $r' - 1 \pm q\delta_2$ ($\ge (q/\pi)\arcsin(\sqrt{C/A})$) to $r' \pm q\delta_2$, by convexity. Thus (4.4) is at most

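    \[\begin{aligned} &3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + 2\int_{\frac{q}{\pi}\arcsin\sqrt{C/A}}^{q/2} \frac{C}{\sin^2\frac{\pi t}{q}}\,dt\\ &\qquad\le 3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi}C\sqrt{\frac{A}{C} - 1}.\end{aligned}\]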

    We can easily show (taking derivatives) that $\arcsin x + x\sqrt{1 - x^2} \le 2x$ for $0 \le x \le 1$. Setting $x = \sqrt{C/A}$, we see that this implies that

    \[3A + \frac{2q}{\pi}A\arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi}C\sqrt{\frac{A}{C} - 1} \le 3A + \frac{4q}{\pi}\sqrt{AC}.\]

    (If $C/A > 1$, then $3A + (4q/\pi)\sqrt{AC}$ is greater than $Aq$, which is an obvious upper bound for the left side of (4.1).)
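    Since (4.1) is fully explicit, it can be spot-checked by brute force for small parameters. The sketch below is illustrative only: the parameter grid is an arbitrary assumption, the function names are hypothetical, and a finite check is of course no substitute for the proof. It also checks the elementary inequality $\arcsin x + x\sqrt{1-x^2} \le 2x$ used in the last step.

```python
import math

def lhs_41(alpha, y, q, A, C):
    """Left side of (4.1): sum over the q integers n with y < n <= y + q."""
    total = 0.0
    for n in range(y + 1, y + q + 1):
        s = math.sin(math.pi * alpha * n) ** 2
        total += A if s == 0.0 else min(A, C / s)
    return total

def rhs_41(q, A, C):
    """Right side of (4.1): the smaller of the two bounds."""
    first = 2 * A + 6 * q * q * C / math.pi ** 2
    second = 3 * A + 4 * q / math.pi * math.sqrt(A * C)
    return min(first, second)

def worst_ratio_41(max_q=12):
    """Largest observed lhs/rhs over a small, arbitrary parameter grid."""
    worst = 0.0
    for q in range(1, max_q + 1):
        for a in range(q):
            if math.gcd(a, q) != 1:
                continue
            for Q in (q, 2 * q, 5 * q + 3):          # any Q >= q is allowed
                for beta in (-1.0, -0.5, 0.0, 0.3, 1.0):
                    alpha = a / q + beta / (q * Q)
                    for y in (0, 1, 7, 40):
                        for A, C in ((1.0, 0.01), (1.0, 0.5), (10.0, 1.0), (1.0, 3.0)):
                            ratio = lhs_41(alpha, y, q, A, C) / rhs_41(q, A, C)
                            worst = max(worst, ratio)
    return worst

def arcsin_gap(x):
    """2x - (arcsin x + x sqrt(1 - x^2)); the proof needs this to be >= 0 on [0, 1]."""
    return 2 * x - (math.asin(x) + x * math.sqrt(1 - x * x))
```

    A ratio returned by `worst_ratio_41()` that stays at or below 1 is what the lemma predicts on this grid.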

    Now we will see that, if we take out terms with $n$ divisible by $q$ and $n$ is not too large, then we can give a bound that does not involve a constant term $A$ at all. (We are referring to the bound $(20/3\pi^2)Cq^2$ below, of course; $2A + (4q/\pi)\sqrt{AC}$ does have a constant term $2A$; it is just smaller than the constant term $3A$ in the corresponding bound in (4.1).)

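    Lemma 4.1.2. Let $\alpha = a/q + \beta/qQ$, $(a,q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $A, C \ge 0$,

    \[\sum_{\substack{y_1 < n \le y_2\\ q\nmid n}} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(\frac{20}{3\pi^2}Cq^2,\; 2A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.5}\]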

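    Proof. Clearly, $\alpha n$ equals $an/q + (n/Q)\beta/q$; since $y_2 \le Q/2$, this means that $|\alpha n - an/q| \le 1/2q$ for $n \le y_2$; moreover, again for $n \le y_2$, the sign of $\alpha n - an/q$ remains constant. Hence the left side of (4.5) is at most

    \[\sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r - \frac12\right)\right)^2}\right) + \sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}r\right)^2}\right).\]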


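    Proceeding as in the proof of Lemma 4.1.1, we obtain a bound of at most

    \[C\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right)\]

    for $q \ge 2$. (If $q = 1$, then the left side of (4.5) is trivially zero.) Now, by (4.2),

    \[\begin{aligned} \frac{t^2}{(\sin t)^2} + \frac{t}{2}\cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\!\left(\frac{\pi}{4}\right)t^4\right) + \frac14\left(1 - \frac{4t^2}{3} - \frac{16t^4}{45}\right)\\ &\le \frac54 + \left(c_0\!\left(\frac{\pi}{4}\right) - \frac{4}{45}\right)t^4 \le \frac54\end{aligned}\]

    for $t \in [0, \pi/4]$, and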

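    \[\begin{aligned} \frac{t^2}{(\sin t)^2} + t\cot\frac{3t}{2} &\le \left(1 + \frac{t^2}{3} + c_0\!\left(\frac{\pi}{2}\right)t^4\right) + \frac23\left(1 - \frac{3t^2}{4} - \frac{81t^4}{2^4\cdot 45}\right)\\ &\le \frac53 + \left(-\frac16 + \left(c_0\!\left(\frac{\pi}{2}\right) - \frac{27}{360}\right)\left(\frac{\pi}{2}\right)^2\right)t^2 \le \frac53\end{aligned}\]

    for $t \in [0, \pi/2]$. Hence,

    \[\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right) \le \left(\frac{2q}{\pi}\right)^2\cdot\frac54 + \left(\frac{q}{\pi}\right)^2\cdot\frac53 \le \frac{20}{3\pi^2}q^2.\]

    Alternatively, we can follow the second approach in the proof of Lemma 4.1.1, and obtain an upper bound of $2A + (4q/\pi)\sqrt{AC}$.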
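    As with Lemma 4.1.1, the inequality (4.5) can be spot-checked by brute force. The following is a sketch under stated assumptions: the grid of $(q, a, Q, \beta, y_1, y_2, A, C)$ is arbitrary, chosen only to respect the hypotheses $y_2 - y_1 \le q$, $y_2 \le Q/2$, and the names are hypothetical.

```python
import math

def lhs_45(alpha, y1, y2, q, A, C):
    """Left side of (4.5): sum over y1 < n <= y2 with q not dividing n."""
    total = 0.0
    for n in range(y1 + 1, y2 + 1):
        if n % q == 0:
            continue
        s = math.sin(math.pi * alpha * n) ** 2
        total += min(A, C / s)   # s > 0 here, since ||alpha n|| >= 1/(2q)
    return total

def rhs_45(q, A, C):
    """Right side of (4.5)."""
    first = 20.0 / (3 * math.pi ** 2) * C * q * q
    second = 2 * A + 4 * q / math.pi * math.sqrt(A * C)
    return min(first, second)

def worst_ratio_45(max_q=15):
    """Largest observed lhs/rhs over a small, arbitrary parameter grid."""
    worst = 0.0
    for q in range(2, max_q + 1):
        windows = ((0, q, 2 * q), (0, q, 7 * q), (q, 2 * q, 4 * q), (3, q + 3, 2 * (q + 3)))
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            for y1, y2, Q in windows:            # each has y2 - y1 <= q, y2 <= Q/2
                for beta in (-1.0, -0.4, 0.4, 1.0):
                    alpha = a / q + beta / (q * Q)
                    for A, C in ((1.0, 0.01), (1.0, 0.5), (10.0, 1.0), (1.0, 3.0)):
                        ratio = lhs_45(alpha, y1, y2, q, A, C) / rhs_45(q, A, C)
                        worst = max(worst, ratio)
    return worst
```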

    The following bound will be useful when the constant $A$ in an application of Lemma 4.1.2 would be too large. (This tends to happen for $n$ small.)

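    Lemma 4.1.3. Let $\alpha = a/q + \beta/qQ$, $(a,q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $B, C \ge 0$,

    \[\sum_{\substack{y_1 < n \le y_2\\ q\nmid n}} \min\left(\frac{B}{|\sin(\pi\alpha n)|}, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le 2B\frac{q}{\pi}\max\left(2, \log\frac{Ce^3q}{B\pi}\right). \tag{4.6}\]

    The upper bound $\le (2Bq/\pi)\log(2e^2q/\pi)$ is also valid.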

    Proof. As in the proof of Lemma 4.1.2, we can bound the left side of (4.6) by

    \[2\sum_{r=1}^{q/2} \min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right).\]


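    Assume $B\sin(\pi/q) \le C \le B$. By the convexity of $1/\sin(t)$ and $1/\sin(t)^2$ for $t \in (0, \pi/2]$,

    \[\begin{aligned} \sum_{r=1}^{q/2} &\min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right)\\ &\le \frac{B}{\sin\frac{\pi}{2q}} + \int_1^{\frac{q}{\pi}\arcsin\frac{C}{B}} \frac{B}{\sin\frac{\pi}{q}t}\,dt + \int_{\frac{q}{\pi}\arcsin\frac{C}{B}}^{q/2} \frac{C}{\sin^2\frac{\pi}{q}t}\,dt\\ &\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\tan\left(\frac12\arcsin\frac{C}{B}\right) - \log\tan\frac{\pi}{2q}\right) + C\cot\arcsin\frac{C}{B}\right)\\ &\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right).\end{aligned}\]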

    Now, for all $t \in (0, \pi/2)$,

    \[\frac{2}{\sin t} + \frac{1}{t}\log\cot t < \frac{1}{t}\log\left(\frac{e^2}{t}\right);\]

    we can verify this by comparing series. Thus

    \[\frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}B\log\cot\frac{\pi}{2q} \le B\frac{q}{\pi}\log\frac{2e^2q}{\pi}\]

    for $q \ge 2$. (If $q = 1$, the sum on the left of (4.6) is empty, and so the bound we are trying to prove is trivial.) We also have

    \[t\log\left(t - \sqrt{t^2 - 1}\right) + \sqrt{t^2 - 1} < -t\log 2t + t \tag{4.7}\]

    for $t \ge 1$ (as this is equivalent to $\log(2t^2(1 - \sqrt{1 - t^{-2}})) < 1 - \sqrt{1 - t^{-2}}$, which we check easily after changing variables to $\delta = 1 - \sqrt{1 - t^{-2}}$). Hence

    \[\begin{aligned} &\frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right)\\ &\qquad\le B\frac{q}{\pi}\log\frac{2e^2q}{\pi} + \frac{q}{\pi}\left(B - B\log\frac{2B}{C}\right) \le B\frac{q}{\pi}\log\frac{Ce^3q}{B\pi}\end{aligned}\]

    for $q \ge 2$.

    Given any $C$, we can apply the above with $C = B$ instead, as, for any $t > 0$, $\min(B/t, C/t^2) \le B/t \le \min(B/t, B/t^2)$. (We refrain from applying (4.7) so as to avoid worsening a constant.) If $C < B\sin\pi/q$ (or even if $C < (\pi/q)B$), we relax the input to $C = B\sin\pi/q$ and go through the above.
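    The two auxiliary inequalities in this proof, the one “verified by comparing series” and (4.7), are easy to spot-check numerically. The sketch below evaluates the gap (right side minus left side) on a grid; the grids and function names are illustrative assumptions, and the ranges are kept moderate so that floating-point cancellation stays harmless.

```python
import math

def gap_cot(t):
    """(1/t) log(e^2 / t) - (2 / sin t + (1/t) log cot t); claimed > 0 on (0, pi/2)."""
    right = (2 - math.log(t)) / t
    left = 2 / math.sin(t) + math.log(1 / math.tan(t)) / t
    return right - left

def gap_47(t):
    """(-t log 2t + t) - (t log(t - sqrt(t^2 - 1)) + sqrt(t^2 - 1)); claimed > 0 for t >= 1."""
    r = math.sqrt(t * t - 1)
    return (t - t * math.log(2 * t)) - (t * math.log(t - r) + r)

def min_gap_cot():
    # grid t = 0.01, 0.02, ..., 1.57 (all below pi/2)
    return min(gap_cot(k / 100) for k in range(1, 158))

def min_gap_47():
    # grid t = 1.0, 1.1, ..., 100.0
    return min(gap_47(1 + k / 10) for k in range(0, 991))
```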

    4.2 Type I estimates

    Let us give our first main type I estimate.² One of the main innovations is the manner in which the “main term” ($m$ divisible by $q$) is separated; we are able to keep error terms small thanks to the particular way in which we switch between two different approximations.

    (These are not necessarily successive approximations in the sense of continued fractions; we do not want to assume that the approximation $a/q$ we are given arises from a continued fraction, and at any rate we need more control on the denominator $q'$ of the new approximation $a'/q'$ than continued fractions would furnish.)

    The following lemma is a theme, so to speak, to which several variations will be given. Later, in practice, we will always use one of the variations, rather than the original lemma itself. This is so just because, even though (4.8) is the basic type of sum we treat in type I, the sums that we will have to estimate in practice will always present some minor additional complication. Proving the lemma we are about to give in full will give us a chance to see all the main ideas at work, leaving complications for later.

    ² The current version of Lemma 4.2.1 is an improvement over that included in the first version of the preprint [Helb].

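    Lemma 4.2.1. Let $\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

    Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$, the absolute value of

    \[\sum_{m\le D} \mu(m) \sum_n e(\alpha m n)\,\eta\!\left(\frac{mn}{x}\right) \tag{4.8}\]

    is at most

    \[\frac{x}{q}\min\left(1, \frac{c_0}{(2\pi\delta)^2}\right)\Bigg|\sum_{\substack{m\le M/q\\ (m,q)=1}} \frac{\mu(m)}{m}\Bigg| + O^*\left(c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right) \tag{4.9}\]

    plus

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}D + 3c_1\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}q\log^+\frac{D}{q/2}\\ &+ \frac{|\eta'|_1}{\pi}q\cdot\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right) + \left(\frac{2\sqrt3\sqrt{c_0c_1}}{\pi} + \frac{3c_1}{c_2} + \frac{55c_0c_2}{12\pi^2}\right)q,\end{aligned} \tag{4.10}\]

    where $c_1 = 1 + |\eta'|_1/(2x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.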

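    In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.9) plus

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\varpi_\epsilon + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right)\\ &+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{6\pi^2}q,\end{aligned} \tag{4.11}\]

    for $\epsilon \in (0,1]$ arbitrary, where $\varpi_\epsilon = \sqrt{3 + 2\epsilon} + ((1 + \sqrt{13/3})/4 - 1)/(2(1+\epsilon))$.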


    In (4.9), $\min(1, c_0/(2\pi\delta)^2)$ always equals 1 when $|\delta| \le 1/2c_2$ (since $(3/5)(1 + \sqrt{13/3}) > 1$).

    Proof. Let $Q = \lfloor x/|\delta q|\rfloor$. Then $\alpha = a/q + O^*(1/qQ)$ and $q \le Q$. (If $\delta = 0$, we let $Q = \infty$ and ignore the rest of the paragraph, since then we will never need $Q'$ or the alternative approximation $a'/q'$.) Let $Q' = \lceil(1+\epsilon)Q\rceil \ge Q + 1$. Then $\alpha$ is not $a/q + O^*(1/qQ')$, and so there must be a different approximation $a'/q'$, $(a',q') = 1$, $q' \le Q'$, such that $\alpha = a'/q' + O^*(1/q'Q')$ (since such an approximation always exists). Obviously, $|a/q - a'/q'| \ge 1/qq'$, yet, at the same time, $|a/q - a'/q'| \le 1/qQ + 1/q'Q' \le 1/qQ + 1/((1+\epsilon)q'Q)$. Hence $q'/Q + q/((1+\epsilon)Q) \ge 1$, and so $q' \ge Q - q/(1+\epsilon) \ge (\epsilon/(1+\epsilon))Q$. (Note also that $(\epsilon/(1+\epsilon))Q \ge (2|\delta q|/x)\cdot\lfloor x/\delta q\rfloor > 1$, and so $q' \ge 2$.)
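    To make the switch from $a/q$ to $a'/q'$ concrete, here is a small brute-force sketch in exact rational arithmetic. The numerical values ($x = 1000$, $\delta = 3$, $a/q = 3/7$, $\epsilon = 1$) are purely illustrative assumptions, and the search routine is a hypothetical helper, not the method of the text (which only uses the existence of $a'/q'$).

```python
from fractions import Fraction
from math import gcd, floor, ceil

def approximations(alpha, Qp):
    """Reduced fractions a'/q' with q' <= Qp and |alpha - a'/q'| <= 1/(q' Qp),
    where a' is taken to be the integer nearest to alpha * q'."""
    found = []
    for qp in range(1, Qp + 1):
        ap = round(alpha * qp)
        if gcd(ap, qp) == 1 and abs(alpha - Fraction(ap, qp)) <= Fraction(1, qp * Qp):
            found.append((ap, qp))
    return found

def example():
    # Illustrative parameters only (not taken from the text).
    x, delta, a, q, eps = 1000, 3, 3, 7, 1
    alpha = Fraction(a, q) + Fraction(delta, x)
    Q = floor(Fraction(x, abs(delta) * q))   # Q = floor(x / |delta q|)
    Qp = ceil((1 + eps) * Q)                 # Q' = ceil((1 + eps) Q)
    return Q, Qp, approximations(alpha, Qp)
```

    In this example $a/q$ itself no longer qualifies at level $Q'$, and every fraction returned has a denominator $q' \ge (\epsilon/(1+\epsilon))Q$, as the argument above predicts.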

    Lemma 4.1.2 will enable us to treat separately the contribution from terms with $m$ divisible by $q$ and $m$ not divisible by $q$, provided that $m \le Q/2$. Let $M = \min(Q/2, D)$. We start by considering all terms with $m \le M$ divisible by $q$. Then $e(\alpha m n)$ equals $e((\delta m/x)n)$. By Poisson summation,

    \[\sum_n e(\alpha m n)\,\eta(mn/x) = \sum_n \widehat{f}(n),\]

    where $f(u) = e((\delta m/x)u)\,\eta((m/x)u)$. Now

    \[\widehat{f}(n) = \int e(-un)f(u)\,du = \frac{x}{m}\int e\left(\left(\delta - \frac{xn}{m}\right)u\right)\eta(u)\,du = \frac{x}{m}\,\widehat{\eta}\left(\frac{x}{m}n - \delta\right).\]

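    By assumption, $m \le M \le Q/2 \le x/2|\delta q|$, and so $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus, by (2.1) (with $k = 2$),

    \[\begin{aligned} \sum_n \widehat{f}(n) &= \frac{x}{m}\left(\widehat{\eta}(-\delta) + \sum_{n\ne 0}\widehat{\eta}\left(\frac{nx}{m} - \delta\right)\right)\\ &= \frac{x}{m}\left(\widehat{\eta}(-\delta) + O^*\left(\sum_{n\ne 0}\frac{1}{\left(2\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\ &= \frac{x}{m}\,\widehat{\eta}(-\delta) + \frac{m}{x}\frac{c_0}{(2\pi)^2}\,O^*\left(\max_{|r|\le\frac12}\sum_{n\ne 0}\frac{1}{(n-r)^2}\right).\end{aligned} \tag{4.12}\]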

    Since $x \mapsto 1/x^2$ is convex on $\mathbb{R}^+$,

    \[\max_{|r|\le\frac12}\sum_{n\ne 0}\frac{1}{(n-r)^2} = \sum_{n\ne 0}\frac{1}{\left(n - \frac12\right)^2} = \pi^2 - 4.\]
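    The value $\pi^2 - 4$ and the claim that the maximum over $|r| \le 1/2$ is attained at the endpoint can be illustrated with truncated sums. This is a minimal numerical sketch; the truncation level `N` and the grid in `r` are arbitrary assumptions.

```python
import math

def shifted_sum(r, N):
    """Truncated sum of 1/(n - r)^2 over nonzero integers n with |n| <= N."""
    return sum(1.0 / (n - r) ** 2 for n in range(-N, N + 1) if n != 0)

def endpoint_is_max(N=2000, steps=10):
    """Check on a grid that r = 1/2 maximizes the truncated sum over |r| <= 1/2."""
    top = shifted_sum(0.5, N)
    return all(shifted_sum(-0.5 + k / steps, N) <= top + 1e-12 for k in range(steps + 1))
```

    The truncated sum differs from the full one by a tail of size roughly $2/N$.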


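    Therefore, the sum of all terms with $m \le M$ and $q|m$ is

    \[\begin{aligned} &\sum_{\substack{m\le M\\ q|m}} \mu(m)\frac{x}{m}\,\widehat{\eta}(-\delta) + O^*\Bigg(\sum_{\substack{m\le M\\ q|m}} \frac{m}{x}\frac{c_0}{(2\pi)^2}(\pi^2 - 4)\Bigg)\\ &\quad= \frac{x\mu(q)}{q}\cdot\widehat{\eta}(-\delta)\cdot\sum_{\substack{m\le M/q\\ (m,q)=1}} \frac{\mu(m)}{m} + O^*\left(\mu(q)^2c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right).\end{aligned}\]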

    We will bound $|\widehat{\eta}(-\delta)|$ by (2.1).

    As we have just seen, estimating the contribution of the terms with $m$ divisible by $q$ and not too large ($m \le M$) involves isolating a main term, estimating it carefully (with cancellation) and then bounding the remaining error terms.

    We will now bound the contribution of all other $m$, that is, $m$ not divisible by $q$ and $m$ larger than $M$. Cancellation will now be used only within the inner sum; that is, we will bound each inner sum

    \[T_m(\alpha) = \sum_n e(\alpha m n)\,\eta\!\left(\frac{mn}{x}\right),\]

    and then we will carefully consider how to bound sums of $|T_m(\alpha)|$ over $m$ efficiently. By (2.2) and Lemma 2.3.1,

    \[|T_m(\alpha)| \le \min\left(\frac{x}{m} + \frac12|\eta'|_1,\; \frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{4}\frac{1}{(\sin\pi m\alpha)^2}\right). \tag{4.13}\]

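    For any $y_2 > y_1 > 0$ with $y_2 - y_1 \le q$ and $y_2 \le Q/2$, (4.13) gives us that

    \[\sum_{\substack{y_1 < m \le y_2\\ q\nmid m}} |T_m(\alpha)| \le \sum_{\substack{y_1 < m \le y_2\\ q\nmid m}} \min\left(A, \frac{C}{(\sin\pi m\alpha)^2}\right) \tag{4.14}\]

    for $A = (x/y_1)(1 + |\eta'|_1/(2(x/y_1)))$ and $C = (c_0/4)(y_2/x)$. We must now estimate the sum

    \[\sum_{\substack{m\le M\\ q\nmid m}} |T_m(\alpha)| + \sum_{\frac{Q}{2} < m \le D} |T_m(\alpha)|. \tag{4.15}\]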

    To bound the terms with $m \le M$, we can use Lemma 4.1.2. The question is then which one is smaller: the first or the second bound given by Lemma 4.1.2? A brief calculation gives that the second bound is smaller (and hence preferable) exactly when $\sqrt{C/A} > (3\pi/10q)(1 + \sqrt{13/3})$. Since $\sqrt{C/A} \sim (\sqrt{c_0}/2)m/x$, this means that it is sensible to prefer the second bound in Lemma 4.1.2 when $m > c_2x/q$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$.

    It thus makes sense to ask: does $Q/2 \le c_2x/q$ (so that $m \le M$ implies $m \le c_2x/q$)? This question divides our work into two basic cases.
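    The “brief calculation” amounts to solving a quadratic in $u = q\sqrt{C/A}/\pi$, namely $(20/3)u^2 = 2 + 4u$. A short sketch that checks the crossover point numerically follows; the function names and the sample values of $q$ and $A$ are illustrative assumptions.

```python
import math

def bound_first(q, C):
    """First bound in (4.5): (20 / (3 pi^2)) C q^2."""
    return 20.0 / (3 * math.pi ** 2) * C * q * q

def bound_second(q, A, C):
    """Second bound in (4.5): 2A + (4q / pi) sqrt(AC)."""
    return 2 * A + 4 * q / math.pi * math.sqrt(A * C)

def crossover(q):
    """Value of sqrt(C/A) at which the two bounds coincide."""
    return 3 * math.pi / (10 * q) * (1 + math.sqrt(13 / 3))

def compare(q, A, scale):
    """Return (first, second) for sqrt(C/A) = scale * crossover(q)."""
    s = scale * crossover(q)
    C = s * s * A
    return bound_first(q, C), bound_second(q, A, C)
```

    With `scale = 1` the two bounds agree up to rounding; for `scale > 1` the second bound is the smaller one, and for `scale < 1` the first is.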


    Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$. Then $Q/2 \le c_2x/q$; this will induce us to bound the first sum in (4.15) by the first bound in Lemma 4.1.2.

    Recall that $M = \min(Q/2, D)$, and so $M \le c_2x/q$. By (4.14) and Lemma 4.1.2,

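    \[\begin{aligned} \sum_{\substack{1\le m\le M\\ q\nmid m}} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{\substack{jq < m \le \min((j+1)q, M)\\ q\nmid m}} \min\left(\frac{x}{jq+1} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j+1)q}{x}}{(\sin\pi m\alpha)^2}\right)\\ &\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0\le j\le M/q}(j+1) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\cdot\left(\frac12\frac{M^2}{q^2} + \frac32\frac{c_2x}{q^2} + 1\right)\\ &\le \frac{5c_0c_2}{6\pi^2}M + \frac{5c_0q}{3\pi^2}\left(\frac32c_2 + \frac{q^2}{x}\right) \le \frac{5c_0c_2}{6\pi^2}M + \frac{35c_0c_2}{6\pi^2}q,\end{aligned} \tag{4.16}\]

    where, to bound the smaller terms, we are using the inequality $Q/2 \le c_2x/q$, and where we are also using the observation that, since $|\delta/x| \le 1/qQ_0$, the assumption $|\delta| \ge 1/2c_2$ implies that $q \le 2c_2x/Q_0$; moreover, since $q \le Q_0$, this gives us that $q^2 \le 2c_2x$. In the main term, we are bounding $qM^2/x$ from above by $M\cdot qQ/2x \le M/2\delta \le c_2M$.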

    If $D \le (Q+1)/2$, then $M \ge \lfloor D\rfloor$, and so (4.16) is all we need: the second sum in (4.15) is empty. Assume from now on that $D > (Q+1)/2$. The first sum in (4.15) is then bounded by (4.16) (with $M = Q/2$). To bound the second sum in (4.15), we will use the approximation $a'/q'$ instead of $a/q$. The motivation is the following: if we used the approximation $a/q$ even for $m > Q/2$, the contribution of the terms with $q|m$ would be too large. When we use $a'/q'$, the contribution of the terms with $q'|m$ (or $m \equiv \pm1 \mod q'$) is very small: only a fraction $1/q'$ (tiny, since $q'$ is large) of all terms are like that, and their individual contribution is always small, precisely because $m > Q/2$.

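    By (4.14) (without the restriction $q \nmid m$ on either side) and Lemma 4.1.1,

    \[\begin{aligned} \sum_{Q/2 < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{jq' + \frac{Q}{2} < m \le \min((j+1)q' + Q/2, D)} |T_m(\alpha)|\\ &\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\frac{x}{jq' + (Q+1)/2}\frac{(j+1)q' + Q/2}{x}}\right)\\ &\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q'}{jq' + (Q+1)/2}\right)}\right),\end{aligned}\]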

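    where we recall that $c_1 = 1 + |\eta'|_1/(2x/D)$. Since $q' \ge (\epsilon/(1+\epsilon))Q$,

    \[\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor} \frac{x}{jq' + \frac{Q+1}{2}} \le \frac{x}{Q/2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{1}{t}\,dt \le \frac{2x}{Q} + \frac{(1+\epsilon)x}{\epsilon Q}\log^+\frac{D}{\frac{Q+1}{2}}. \tag{4.17}\]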


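    Recall now that $q' \le (1+\epsilon)Q + 1 \le (1+\epsilon)(Q+1)$. Therefore,

    \[\begin{aligned} q'\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} &\le q'\sqrt{1 + \frac{(1+\epsilon)Q + 1}{(Q+1)/2}} + \int_{\frac{Q+1}{2}}^{D}\sqrt{1 + \frac{q'}{t}}\,dt\\ &\le q'\sqrt{3 + 2\epsilon} + \left(D - \frac{Q+1}{2}\right) + \frac{q'}{2}\log^+\frac{D}{\frac{Q+1}{2}}.\end{aligned} \tag{4.18}\]

    We conclude that $\sum_{Q/2 < m \le D}|T_m(\alpha)|$ is at most

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}\left(D + \left((1+\epsilon)\sqrt{3+2\epsilon} - \frac12\right)(Q+1) + \frac{(1+\epsilon)Q + 1}{2}\log^+\frac{D}{\frac{Q+1}{2}}\right)\\ &+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q}.\end{aligned} \tag{4.19}\]

    We sum this to (4.16) (with $M = Q/2$), and obtain that (4.15) is at most

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)(Q+1)\left(\varpi_\epsilon + \frac12\log^+\frac{D}{\frac{Q+1}{2}}\right)\right)\\ &+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q} + \frac{35c_0c_2}{6\pi^2}q,\end{aligned} \tag{4.20}\]

    where we are bounding

    \[\frac{5c_0c_2}{6\pi^2} = \frac{5c_0}{6\pi^2}\frac{3\pi}{5\sqrt{c_0}}\left(1 + \sqrt{\frac{13}{3}}\right) = \frac{\sqrt{c_0}}{2\pi}\left(1 + \sqrt{\frac{13}{3}}\right) \le \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac14\left(1 + \sqrt{\frac{13}{3}}\right) \tag{4.21}\]

    and defining

    \[\varpi_\epsilon = \sqrt{3 + 2\epsilon} + \left(\frac14\left(1 + \sqrt{\frac{13}{3}}\right) - 1\right)\frac{1}{2(1+\epsilon)}. \tag{4.22}\]

    (Note that $\varpi_\epsilon < \sqrt3$ for $\epsilon < 0.1741$.) A quick check against (4.16) shows that (4.20) is valid also when $D \le Q/2$, even when $Q + 1$ is replaced by $\min(Q+1, 2D)$. We bound $Q$ from above by $x/|\delta|q$ and $\log^+ D/((Q+1)/2)$ by $\log^+ 2D/(x/|\delta|q + 1)$, and obtain the result.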

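    Case (b). $|\delta|$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. Then $\min(c_2x/q, D) \le Q/2$. We start by bounding the first $q/2$ terms in (4.15) by (4.13) and Lemma 4.1.3:

    \[\begin{aligned} \sum_{m\le q/2} |T_m(\alpha)| &\le \sum_{m\le q/2} \min\left(\frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|}, \frac{c_0q/8x}{|\sin(\pi m\alpha)|^2}\right)\\ &\le \frac{|\eta'|_1}{\pi}q\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right).\end{aligned} \tag{4.23}\]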


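    If $q^2 < 2c_2x$, we estimate the terms with $q/2 < m \le c_2x/q$ by Lemma 4.1.2, which is applicable because $\min(c_2x/q, D) < Q/2$:

    \[\begin{aligned} \sum_{\substack{\frac{q}{2} < m \le D'\\ q\nmid m}} |T_m(\alpha)| &\le \sum_{j=1}^{\infty}\sum_{\substack{\left(j - \frac12\right)q < m \le \left(j + \frac12\right)q\\ m \le \min\left(\frac{c_2x}{q}, D\right)\\ q\nmid m}} \min\left(\frac{x}{\left(j - \frac12\right)q} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j+1/2)q}{x}}{(\sin\pi m\alpha)^2}\right)\\ &\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1\le j\le\frac{D'}{q} + \frac12}\left(j + \frac12\right) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\left(\frac{c_2x}{2q^2}\frac{D'}{q} + \frac32\left(\frac{c_2x}{q^2}\right) + \frac58\right)\\ &\le \frac{5c_0}{6\pi^2}\left(c_2D' + 3c_2q + \frac54\frac{q^3}{x}\right) \le \frac{5c_0c_2}{6\pi^2}\left(D' + \frac{11}{2}q\right),\end{aligned} \tag{4.24}\]

    where we write $D' = \min(c_2x/q, D)$. If $c_2x/q \ge D$, we stop here. Assume that $c_2x/q < D$. Let $R = \max(c_2x/q, q/2)$. The terms we have already estimated are precisely those with $m \le R$. We bound the terms $R < m \le D$ by the second bound in Lemma 4.1.1:

    \[\begin{aligned} \sum_{R < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{\substack{m > jq + R\\ m \le \min((j+1)q + R, D)}} \min\left(\frac{c_1x}{jq + R},\; \frac{\frac{c_0}{4}\frac{(j+1)q + R}{x}}{(\sin\pi m\alpha)^2}\right)\\ &\le \sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\left(\frac{3c_1x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right).\end{aligned} \tag{4.25}\]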

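    (Note there is no need to use two successive approximations $a/q$, $a'/q'$ as in case (a). We are also including all terms with $m$ divisible by $q$, as we may, since $|T_m(\alpha)|$ is non-negative.) Now, much as before,

    \[\sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor} \frac{x}{jq + R} \le \frac{x}{R} + \frac{x}{q}\int_R^D \frac1t\,dt \le \min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{x}{q}\log^+\frac{D}{c_2x/q}, \tag{4.26}\]

    and

    \[\begin{aligned} \sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\sqrt{1 + \frac{q}{jq + R}} &\le \sqrt{1 + \frac{q}{R}} + \frac1q\int_R^D\sqrt{1 + \frac{q}{t}}\,dt\\ &\le \sqrt3 + \frac{D - R}{q} + \frac12\log^+\frac{D}{q/2}.\end{aligned} \tag{4.27}\]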

    We sum with (4.23) and (4.24), and we obtain that (4.15) is at most

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt3q + D + \frac{q}{2}\log^+\frac{D}{q/2}\right) + \left(3c_1\log^+\frac{D}{c_2x/q}\right)\frac{x}{q}\\ &+ 3c_1\min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{55c_0c_2}{12\pi^2}q + \frac{|\eta'|_1}{\pi}q\cdot\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right),\end{aligned} \tag{4.28}\]

    where we are using the fact that $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to make sure that the term $(5c_0c_2/6\pi^2)D'$ from (4.24) is more than compensated by the term $-2\sqrt{c_0c_1}R/\pi$ coming from $-R/q$ in (4.27) (by the definition of $D'$ and $R$, we have $R \ge D'$). We can also use $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to bound the term $(5c_0c_2/6\pi^2)D'$ from (4.24) by the term $2\sqrt{c_0c_1}D/\pi$ in (4.28), in case $c_2x/q \ge D$. (Again by definition, $D' \le D$.) Thus, (4.28) is valid both when $c_2x/q < D$ and when $c_2x/q \ge D$.

    4.2.1 Type I: variations

    We will need a version of Lemma 4.2.1 with $m$ and $n$ restricted to the odd numbers. (We will barely be using the restriction of $m$, whereas the restriction on $n$ is both (a) slightly harder to deal with, (b) something that can be turned to our advantage.)

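    Lemma 4.2.2. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

    Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of

    \[\sum_{\substack{m\le D\\ m \text{ odd}}} \mu(m) \sum_{n \text{ odd}} e(\alpha m n)\,\eta\!\left(\frac{mn}{x}\right) \tag{4.29}\]

    is at most

    \[\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\Bigg|\sum_{\substack{m\le M/q\\ (m,2q)=1}} \frac{\mu(m)}{m}\Bigg| + O^*\left(\frac{c_0q}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{D}{q} + 1\right)^2\right) \tag{4.30}\]

    plus

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}D + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}q\log^+\frac{D}{q/2}\\ &+ \frac{2|\eta'|_1}{\pi}q\cdot\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right) + \left(\frac{2\sqrt3\sqrt{c_0c_1}}{\pi} + \frac{3c_1}{2c_2} + \frac{55c_0c_2}{6\pi^2}\right)q,\end{aligned} \tag{4.31}\]

    where $c_1 = 1 + |\eta'|_1/(x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

    In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.29) is at most (4.30) plus

    \[\begin{aligned} &\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\sqrt{3 + 2\epsilon} + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right)\\ &+ \frac32c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{3\pi^2}q,\end{aligned} \tag{4.32}\]

    for $\epsilon \in (0,1]$ arbitrary.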


    If $q$ is even, the sum (4.30) can be replaced by 0.

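    Proof. The proof is almost exactly that of Lemma 4.2.1; we go over the differences. The parameters $Q$, $Q'$, $a'$, $q'$ and $M$ are defined just as before (with $2\alpha$ wherever we had $\alpha$).

    Let us first consider $m \le M$ odd and divisible by $q$. (Of course, this case arises only if $q$ is odd.) For $n = 2r + 1$,

    \[\begin{aligned} e(\alpha m n) &= e(\alpha m(2r+1)) = e(2\alpha rm)\,e(\alpha m) = e\left(\frac{\delta}{x}rm\right)e\left(\left(\frac{a}{2q} + \frac{\delta}{2x} + \frac{\kappa}{2}\right)m\right)\\ &= e\left(\frac{\delta(2r+1)}{2x}m\right)e\left(\frac{a + \kappa q}{2}\frac{m}{q}\right) = \kappa'e\left(\frac{\delta(2r+1)}{2x}m\right),\end{aligned}\]

    where $\kappa \in \{0, 1\}$ and $\kappa' = e((a + \kappa q)/2) \in \{-1, 1\}$ are independent of $m$ and $n$. Hence, by Poisson summation,

    \[\begin{aligned} \sum_{n \text{ odd}} e(\alpha m n)\,\eta(mn/x) &= \kappa'\sum_{n \text{ odd}} e((\delta m/2x)n)\,\eta(mn/x)\\ &= \frac{\kappa'}{2}\left(\sum_n \widehat{f}(n) - \sum_n \widehat{f}(n + 1/2)\right),\end{aligned} \tag{4.33}\]

    where $f(u) = e((\delta m/2x)u)\,\eta((m/x)u)$. Now

    \[\widehat{f}(t) = \frac{x}{m}\,\widehat{\eta}\left(\frac{x}{m}t - \frac{\delta}{2}\right).\]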

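    Just as before, $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus

    \[\begin{aligned} \frac12\left|\sum_n \widehat{f}(n) - \sum_n \widehat{f}(n + 1/2)\right| &\le \frac{x}{m}\left(\frac12\left|\widehat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne 0}\left|\widehat{\eta}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right)\\ &= \frac{x}{m}\left(\frac12\left|\widehat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac12\cdot O^*\left(\sum_{n\ne 0}\frac{1}{\left(\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\ &= \frac{x}{2m}\left|\widehat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4).\end{aligned} \tag{4.34}\]

    The contribution of the second term in the last line of (4.34) is

    \[\sum_{\substack{m\le M\\ m \text{ odd}\\ q|m}} \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4) = \frac{q}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4)\cdot\sum_{\substack{m\le M/q\\ m \text{ odd}}} m = \frac{qc_0}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{M}{q} + 1\right)^2.\]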


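    Hence, the absolute value of the sum of all terms with $m \le M$ and $q|m$ is given by (4.30).

    We define $T_{m,\circ}(\alpha)$ by

    \[T_{m,\circ}(\alpha) = \sum_{n \text{ odd}} e(\alpha m n)\,\eta\!\left(\frac{mn}{x}\right). \tag{4.35}\]

    Changing variables by $n = 2r + 1$, we see that

    \[|T_{m,\circ}(\alpha)| = \left|\sum_r e(2\alpha\cdot mr)\,\eta(m(2r+1)/x)\right|.\]

    Hence, instead of (4.13), we get that

    \[|T_{m,\circ}(\alpha)| \le \min\left(\frac{x}{2m} + \frac12|\eta'|_1,\; \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right). \tag{4.36}\]

    We obtain (4.14), but with $T_{m,\circ}$ instead of $T_m$, $A = (x/2y_1)(1 + |\eta'|_1/(x/y_1))$ and $C = (c_0/2)(y_2/x)$, and so $c_1 = 1 + |\eta'|_1/(x/D)$.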

    The rest of the proof of Lemma 4.2.1 carries over almost word-by-word. (For the sake of simplicity, we do not really try to take advantage of the odd support of $m$ here.) Since $C$ has doubled, it would seem to make sense to reset the value of $c_2$ to be $c_2 = (3\pi/5\sqrt{2c_0})(1 + \sqrt{13/3})$; this would cause complications related to the fact that $5c_0c_2/3\pi^2$ would become larger than $2\sqrt{c_0}/\pi$, and so we set $c_2$ to the slightly smaller value $c_2 = 6\pi/5\sqrt{c_0}$ instead. This implies

    \[\frac{5c_0c_2}{3\pi^2} = \frac{2\sqrt{c_0}}{\pi}. \tag{4.37}\]
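    The trade-off behind this choice of $c_2$ can be checked in a few lines. The sketch below compares the “natural” value with the value actually chosen; the function names and sample values of $c_0$ are illustrative assumptions.

```python
import math

def c2_natural(c0):
    """c2 = (3 pi / (5 sqrt(2 c0))) (1 + sqrt(13/3)), the value suggested by doubling C."""
    return 3 * math.pi / (5 * math.sqrt(2 * c0)) * (1 + math.sqrt(13 / 3))

def c2_chosen(c0):
    """c2 = 6 pi / (5 sqrt(c0)), the slightly smaller value used in Lemma 4.2.2."""
    return 6 * math.pi / (5 * math.sqrt(c0))

def main_term_coefficient(c0, c2):
    """5 c0 c2 / (3 pi^2), to be compared with 2 sqrt(c0) / pi as in (4.37)."""
    return 5 * c0 * c2 / (3 * math.pi ** 2)
```

    With the chosen value, the coefficient equals $2\sqrt{c_0}/\pi$ exactly; with the natural value it would exceed it by a factor of about 1.09.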

    The bound from (4.16) gets multiplied by 2 (but the value of $c_2$ has changed), the second line in (4.19) gets halved, (4.21) gets replaced by (4.37), the second term in the maximum in the second line of (4.23) gets doubled, the bound from (4.24) gets doubled, and the bound from (4.26) gets halved.

    We will also need a version of Lemma 421 (or rather Lemma 422 we will decideto work with the restriction that n and m be odd) with a factor of (log n) within theinner sum This is the sum SI1 in (39)

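    Lemma 4.2.3. Let $\alpha\in\mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge \max(16, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L_1$. Let $c_0\ge |\widehat{\eta''}|_\infty$. Assume that, for any $\rho\ge\rho_0$, $\rho_0$ a constant, the function $\eta_{(\rho)}(t) = \log(\rho t)\eta(t)$ satisfies
    $$|\eta_{(\rho)}|_1 \le \log(\rho)|\eta|_1,\qquad |\eta'_{(\rho)}|_1 \le \log(\rho)|\eta'|_1,\qquad |\widehat{\eta''_{(\rho)}}|_\infty \le c_0\log(\rho). \tag{4.38}$$
    Let $\sqrt{3}\le D\le \min(x/\rho_0, x/e)$. Then, if $|\delta|\le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of
    $$\sum_{\substack{m\le D\\ m\ \text{odd}}} \mu(m) \sum_{\substack{n\\ n\ \text{odd}}} (\log n)\, e(\alpha mn)\,\eta\left(\frac{mn}{x}\right) \tag{4.39}$$
    is at most
    $$\frac{x}{q}\min\left(1,\frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m\le M/q\\ (m,q)=1}}\frac{\mu(m)}{m}\log\frac{x}{mq}\right| + \frac{x}{q}\,|\widehat{\log\cdot\eta}(-\delta)|\left|\sum_{\substack{m\le M/q\\ (m,q)=1}}\frac{\mu(m)}{m}\right| + O^*\left(c_0\left(\frac12-\frac{2}{\pi^2}\right)\left(\frac{D^2}{4qx}\log\frac{e^{1/2}x}{D}+\frac1e\right)\right) \tag{4.40}$$
    plus
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q}\,\log\frac{q}{c_2} + \left(\frac{2|\eta'|_1}{\pi}\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x + \frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt3+\frac12\log^+\frac{D}{q/2}\right)\log\frac{q}{c_2}\right)q + \frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2} + \frac{20c_0c_2^{3/2}}{3\pi^2}\sqrt{2x}\,\log\frac{2\sqrt{e}x}{c_2} \tag{4.41}$$
    for $c_1 = 1 + |\eta'|_1/(x/D)$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

    In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.39) is at most
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q}+1\right)\left(\sqrt{3+2\epsilon}\cdot\log^+ 2\sqrt{e}|\delta|q + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\,\log^+ 2|\delta|q\right) + \left(\frac{3c_1}{4}\left(\frac{2}{\sqrt5}+\frac{1+\epsilon}{2\epsilon}\log x\right)+\frac{40}{3}\sqrt2\,c_0c_2^{3/2}\right)\sqrt{x}\log x \tag{4.42}$$
    for $\epsilon\in(0,1]$.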

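    Proof. Define $Q$, $Q'$, $M$, $a'$ and $q'$ as in the proof of Lemma 4.2.1. The same method of proof works as for Lemma 4.2.1; we go over the differences. When applying Poisson summation or (2.2), use $\eta_{(x/m)}(t) = (\log xt/m)\eta(t)$ instead of $\eta(t)$. Then use the bounds in (4.38) with $\rho = x/m$; in particular,
    $$|\widehat{\eta''_{(x/m)}}|_\infty \le c_0\log\frac{x}{m}.$$
    For $f(u) = e((\delta m/2x)u)(\log u)\eta((m/x)u)$,
    $$\hat f(t) = \frac{x}{m}\,\widehat{\eta_{(x/m)}}\left(\frac{x}{m}t-\frac{\delta}{2}\right)$$
    and so
    $$\frac12\sum_n \left|\hat f(n/2)\right| \le \frac{x}{m}\left(\frac12\left|\widehat{\eta_{(x/m)}}\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne0}\left|\widehat{\eta_{(x/m)}}\left(\frac{x}{m}\frac{n}{2}-\frac{\delta}{2}\right)\right|\right)$$
    $$= \frac12\,\frac{x}{m}\left(\widehat{\log\cdot\eta}\left(-\frac{\delta}{2}\right)+\log\left(\frac{x}{m}\right)\hat\eta\left(-\frac{\delta}{2}\right)\right)+\frac{m}{x}\left(\log\frac{x}{m}\right)\frac{c_0}{2\pi^2}(\pi^2-4).$$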

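    The part of the main term involving $\log(x/m)$ becomes
    $$\frac{x\hat\eta(-\delta)}{2}\sum_{\substack{m\le M\\ m\ \text{odd}\\ q\mid m}}\frac{\mu(m)}{m}\log\left(\frac{x}{m}\right) = \frac{x\mu(q)}{q}\,\hat\eta(-\delta)\cdot\sum_{\substack{m\le M/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\log\left(\frac{x}{mq}\right)$$
    for $q$ odd. (We can see that this, like the rest of the main term, vanishes for $m$ even.) In the term in front of $\pi^2-4$, we find the sum
    $$\sum_{\substack{m\le M\\ m\ \text{odd}\\ q\mid m}}\frac{m}{x}\log\left(\frac{x}{m}\right) \le \frac{M}{x}\log\frac{x}{M} + \frac{q}{2x}\int_0^{M/q} t\log\frac{x/q}{t}\,dt = \frac{M}{x}\log\frac{x}{M} + \frac{M^2}{4qx}\log\frac{e^{1/2}x}{M},$$
    where we use the fact that $t\mapsto t\log(x/t)$ is increasing for $t\le x/e$. By the same fact (and by $M\le D$), $(M^2/q)\log(e^{1/2}x/M)\le (D^2/q)\log(e^{1/2}x/D)$. It is also easy to see that $(M/x)\log(x/M)\le 1/e$ (since $M\le D\le x$).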
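The calculus facts used in this step are easy to confirm numerically. The sketch below (function names are illustrative) checks that $t\mapsto t\log(x/t)$ is increasing on $(0, x/e]$, that $(t/x)\log(x/t)\le 1/e$ there, and that $\int_0^N t\log(a/t)\,dt = (N^2/2)\log(e^{1/2}a/N)$, which is the closed form behind the term $\frac{M^2}{4qx}\log\frac{e^{1/2}x}{M}$ (take $N = M/q$, $a = x/q$).

```python
import math

def g(t, x):
    """t * log(x / t), increasing for 0 < t <= x/e."""
    return t * math.log(x / t)

def integral_closed_form(N, a):
    """Closed form of the integral of t*log(a/t) over 0 <= t <= N."""
    return (N ** 2 / 2) * math.log(math.sqrt(math.e) * a / N)

def integral_midpoint(N, a, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    h = N / steps
    return sum(g((k + 0.5) * h, a) for k in range(steps)) * h
```

The midpoint rule is used only as an independent numerical check; it plays no role in the argument.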

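    The basic estimate for the rest of the proof (replacing (4.13)) is
    $$T_m(\alpha) = \sum_{n\ \text{odd}} e(\alpha mn)(\log n)\,\eta\left(\frac{mn}{x}\right) = \sum_{n\ \text{odd}} e(\alpha mn)\,\eta_{(x/m)}\left(\frac{mn}{x}\right)$$
    $$= O^*\left(\min\left(\frac{x}{2m}|\eta_{(x/m)}|_1+\frac{|\eta'_{(x/m)}|_1}{2},\ \frac{\frac12|\eta'_{(x/m)}|_1}{|\sin(2\pi m\alpha)|},\ \frac{m}{x}\,\frac{\frac12|\widehat{\eta''_{(x/m)}}|_\infty}{(\sin 2\pi m\alpha)^2}\right)\right)$$
    $$= O^*\left(\log\frac{x}{m}\cdot\min\left(\frac{x}{2m}+\frac{|\eta'|_1}{2},\ \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\ \frac{m}{x}\,\frac{c_0}{2}\,\frac{1}{(\sin 2\pi m\alpha)^2}\right)\right).$$
    We wish to bound
    $$\sum_{\substack{m\le M\\ q\nmid m\\ m\ \text{odd}}}|T_m(\alpha)| + \sum_{Q/2<m\le D}|T_m(\alpha)|. \tag{4.43}$$
    Just as in the proofs of Lemmas 4.2.1 and 4.2.2, we give two bounds, one valid for $|\delta|$ large ($|\delta|\ge 1/2c_2$) and the other for $\delta$ small ($|\delta|\le 1/2c_2$). Again as in the proof of Lemma 4.2.2, we ignore the condition that $m$ is odd in (4.15).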


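    Consider the case of $|\delta|$ large first. Instead of (4.16), we have
    $$\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_m(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0q^3}{4x}\sum_{0\le j\le M/q}(j+1)\log\frac{x}{jq+1}. \tag{4.44}$$
    Since
    $$\sum_{0\le j\le M/q}(j+1)\log\frac{x}{jq+1} \le \log x+\frac{M}{q}\log\frac{x}{M}+\sum_{1\le j\le M/q}\log\frac{x}{jq}+\sum_{1\le j\le M/q-1} j\log\frac{x}{jq}$$
    $$\le \log x+\frac{M}{q}\log\frac{x}{M}+\int_0^{M/q}\log\frac{x}{tq}\,dt+\int_1^{M/q} t\log\frac{x}{tq}\,dt$$
    $$\le \log x+\left(\frac{2M}{q}+\frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M},$$
    this means that
    $$\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_m(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0q^3}{4x}\left(\log x+\left(\frac{2M}{q}+\frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M}\right) \le \frac{5c_0c_2}{3\pi^2}M\log\frac{\sqrt{e}x}{M}+\frac{40}{3}\sqrt2\,c_0c_2^{3/2}\sqrt{x}\log x, \tag{4.45}$$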

    where we are using the bounds $M\le Q/2\le c_2x/q$ and $q^2\le 2c_2x$ (just as in (4.16)). Instead of (4.17), we have
    $$\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\left(\log\frac{x}{jq'+\frac{Q+1}{2}}\right)\frac{x}{jq'+\frac{Q+1}{2}} \le \frac{x}{Q/2}\log\frac{2x}{Q}+\frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\log\frac{x}{t}\,\frac{dt}{t} \le \frac{2x}{Q}\log\frac{2x}{Q}+\frac{x}{q'}\log\frac{2x}{Q}\,\log^+\frac{2D}{Q};$$
    recall that the coefficient in front of this sum will be halved by the condition that $n$ is odd. Instead of (4.18), we obtain
    $$q'\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\sqrt{1+\frac{q'}{jq'+(Q+1)/2}}\left(\log\frac{x}{jq'+\frac{Q+1}{2}}\right)$$
    $$\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1}+\int_{\frac{Q+1}{2}}^{D}\left(1+\frac{q'}{2t}\right)\left(\log\frac{x}{t}\right)dt$$
    $$\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1}+D\log\frac{ex}{D}-\frac{Q+1}{2}\log\frac{2ex}{Q+1}+\frac{q'}{2}\log\frac{2x}{Q+1}\,\log\frac{2D}{Q+1}.$$


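    (The bound $\int_a^b \log(x/t)\,dt/t\le \log(x/a)\log(b/a)$ will be more practical than the exact expression for the integral.) Hence $\sum_{Q/2<m\le D}|T_m(\alpha)|$ is at most
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}\left((1+\epsilon)\sqrt{3+2\epsilon}+\frac{(1+\epsilon)}{2}\log\frac{2D}{Q+1}\right)(Q+1)\log\frac{2x}{Q+1} - \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac{Q+1}{2}\log\frac{2ex}{Q+1} + \frac{3c_1}{2}\left(\frac{2}{\sqrt5}+\frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x}.$$
    Summing this to (4.45) (with $M = Q/2$), and using (4.21) and (4.22) as before, we obtain that (4.43) is at most
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left(\sqrt{3+2\epsilon}\,\log^+\frac{2\sqrt{e}x}{Q+1}+\frac12\log^+\frac{2D}{Q+1}\,\log^+\frac{2x}{Q+1}\right) + \frac{3c_1}{2}\left(\frac{2}{\sqrt5}+\frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x} + \frac{40}{3}\sqrt2\,c_0c_2^{3/2}\sqrt{x}\log x.$$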

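    Now we go over the case of $|\delta|$ small (or $D\le Q_0/2$). Instead of (4.23), we have
    $$\sum_{m\le q/2}|T_m(\alpha)| \le \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x. \tag{4.46}$$
    Suppose $q^2<2c_2x$. (Otherwise, the sum we are about to estimate is empty.) Instead of (4.24), we have
    $$\sum_{\substack{q/2<m\le D'\\ q\nmid m}}|T_m(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0q^3}{6x}\sum_{1\le j\le \frac{D'}{q}+\frac12}\left(j+\frac12\right)\log\frac{x}{\left(j-\frac12\right)q}$$
    $$\le \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q}+\frac1q\int_0^{D'}\log\frac{x}{t}\,dt+\frac{1}{q^2}\int_0^{D'}t\log\frac{x}{t}\,dt+\frac{D'}{q}\log\frac{x}{D'}\right)$$
    $$= \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q}+\left(\frac{2D'}{q}+\frac{(D')^2}{2q^2}\right)\log\frac{\sqrt{e}x}{D'}\right)$$
    $$\le \frac{5c_0c_2}{3\pi^2}\left(4\sqrt{2c_2x}\,\log\frac{2x}{q}+4\sqrt{2c_2x}\,\log\frac{\sqrt{e}x}{D'}+D'\log\frac{\sqrt{e}x}{D'}\right)$$
    $$\le \frac{5c_0c_2}{3\pi^2}\left(D'\log\frac{\sqrt{e}x}{D'}+4\sqrt{2c_2x}\,\log\frac{2\sqrt{e}x}{c_2}\right), \tag{4.47}$$
    where $D' = \min(c_2x/q, D)$. (We are using the bounds $q^3/x\le(2c_2)^{3/2}$, $D'q^2/x\le c_2q<c_2^{3/2}\sqrt{2x}$ and $D'q/x\le c_2$.) Instead of (4.25), we have
    $$\sum_{R<m\le D}|T_m(\alpha)| \le \sum_{j=0}^{\left\lfloor\frac{D-R}{q}\right\rfloor}\left(\frac{3c_1}{2}\,\frac{x}{jq+R}+\frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1+\frac{q}{jq+R}\right)}\right)\log\frac{x}{jq+R},$$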


    where $R = \max(c_2x/q, q/2)$. We can simply reuse (4.26), multiplying it by $\log x/R$; the only difference is that now we take care to bound $\min(q/c_2, 2x/q)$ by the geometric mean $\sqrt{(q/c_2)(2x/q)} = \sqrt{2x/c_2}$. We replace (4.27) by
    $$\sum_{j=0}^{\left\lfloor\frac1q(D-R)\right\rfloor}\sqrt{1+\frac{q}{jq+R}}\,\log\frac{x}{jq+R} \le \sqrt{1+\frac{q}{R}}\,\log\frac{x}{R}+\frac1q\int_R^D\sqrt{1+\frac{q}{t}}\,\log\frac{x}{t}\,dt$$
    $$\le \sqrt3\log\frac{q}{c_2}+\left(\frac{D}{q}\log\frac{ex}{D}-\frac{R}{q}\log\frac{ex}{R}\right)+\frac12\log\frac{q}{c_2}\,\log^+\frac{D}{R}. \tag{4.48}$$
    We sum with (4.46) and (4.47), and obtain (4.41) as an upper bound for (4.43). (Just as in the proof of Lemma 4.2.1, the term $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ is smaller than the term $(2\sqrt{c_1c_0}/\pi)R\log ex/R$ in (4.48), and thus gets absorbed by it when $D>R$. If $D\le R$, then, again as in Lemma 4.2.1, the sum $\sum_{R<m\le D}|T_m(\alpha)|$ is empty, and we bound $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ by the term $(2\sqrt{c_1c_0}/\pi)D\log ex/D$, which would not appear otherwise.)

    Now comes the time to focus on our second type I sum, namely,
    $$\sum_{\substack{v\le V\\ v\ \text{odd}}}\Lambda(v)\sum_{\substack{u\le U\\ u\ \text{odd}}}\mu(u)\sum_{\substack{n\\ n\ \text{odd}}} e(\alpha vun)\,\eta(vun/x),$$
    which corresponds to the term $S_{I,2}$ in (3.9). The innermost two sums, on their own, are a sum of type I we have already seen. Accordingly, for $q$ small, we will be able to bound them using Lemma 4.2.2. If $q$ is large, then that approach does not quite work, since then the approximation $av/q$ to $v\alpha$ is not always good enough. (As we shall later see, we need $q\le Q/v$ for the approximation to be sufficiently close for our purposes.)

    Fortunately, when $q$ is large, we can also afford to lose a factor of $\log$, since the gains from $q$ will be large. Here is the estimate we will use for $q$ large.

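    Lemma 4.2.4. Let $\alpha\in\mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q+\delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ_0$, $q\le Q_0$, $Q_0\ge\max(2e, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta''\in L_1$. Let $c_0\ge|\widehat{\eta''}|_\infty$. Let $c_2 = 6\pi/5\sqrt{c_0}$. Assume that $x\ge e^2c_2/2$.

    Let $U, V\ge 1$ satisfy $UV+(19/18)Q_0\le x/5.6$. Then, if $|\delta|\le 1/2c_2$, the absolute value of
    $$\left|\sum_{\substack{v\le V\\ v\ \text{odd}}}\Lambda(v)\sum_{\substack{u\le U\\ u\ \text{odd}}}\mu(u)\sum_{\substack{n\\ n\ \text{odd}}} e(\alpha vun)\,\eta(vun/x)\right| \tag{4.49}$$
    is at most
    $$\frac{x}{2q}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\log Vq + O^*\left(\frac14-\frac{1}{\pi^2}\right)\cdot c_0\left(\frac{D^2\log V}{2qx}+\frac{3c_4}{2}\,\frac{UV^2}{x}+\frac{(U+1)^2V}{2x}\log q\right) \tag{4.50}$$
    plus
    $$\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}}+q\left(\sqrt3\log\frac{c_2x}{q}+\frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right) + \frac{3c_1}{2}\,\frac{x}{q}\log D\,\log^+\frac{D}{c_2x/q} + \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2} + \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x, \tag{4.51}$$
    where $D = UV$ and $c_1 = 1+|\eta'|_1/(2x/D)$ and $c_4 = 1.03884$. The same bound holds if $|\delta|\ge 1/2c_2$ but $D\le Q_0/2$.

    In general, if $|\delta|\ge 1/2c_2$, the absolute value of (4.49) is at most (4.50) plus
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q}+1\right)\left((\sqrt{3+2\epsilon}-1)\log\frac{\frac{x}{|\delta|q}+1}{\sqrt2}+\frac12\log D\,\log^+\frac{e^2D}{\frac{x}{|\delta|q}}\right) + \left(\frac{3c_1}{2}\left(\frac12+\frac{3(1+\epsilon)}{16\epsilon}\log x\right)+\frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x \tag{4.52}$$
    for $\epsilon\in(0,1]$.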

    Proof. We proceed essentially as in Lemma 4.2.1 and Lemma 4.2.2. Let $Q$, $q'$ and $Q'$ be as in the proof of Lemma 4.2.2, that is, with $2\alpha$ where Lemma 4.2.1 uses $\alpha$.

    Let $M = \min(UV, Q/2)$. We first consider the terms with $uv\le M$, $u$ and $v$ odd, $uv$ divisible by $q$. If $q$ is even, there are no such terms. Assume $q$ is odd. Then, by (4.33) and (4.34), the absolute value of the contribution of these terms is at most
    $$\sum_{\substack{a\le M\\ a\ \text{odd}\\ q\mid a}}\left(\sum_{\substack{v\mid a\\ a/U\le v\le V}}\Lambda(v)\mu(a/v)\right)\left(\frac{x\hat\eta(-\delta/2)}{2a}+O\left(\frac{a}{x}\,\frac{|\widehat{\eta''}|_\infty}{2\pi^2}\cdot(\pi^2-4)\right)\right). \tag{4.53}$$

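    Now
    $$\sum_{\substack{a\le M\\ a\ \text{odd}\\ q\mid a}}\ \sum_{\substack{v\mid a\\ a/U\le v\le V}}\frac{\Lambda(v)\mu(a/v)}{a} = \sum_{\substack{v\le V\\ v\ \text{odd}\\ (v,q)=1}}\frac{\Lambda(v)}{v}\sum_{\substack{u\le\min(U,M/V)\\ u\ \text{odd}\\ q\mid u}}\frac{\mu(u)}{u} + \sum_{\substack{p^\alpha\le V\\ p\ \text{odd}\\ p\mid q}}\frac{\Lambda(p^\alpha)}{p^\alpha}\sum_{\substack{u\le\min(U,M/V)\\ u\ \text{odd}\\ \frac{q}{(q,p^\alpha)}\mid u}}\frac{\mu(u)}{u},$$
    which equals
    $$\frac{\mu(q)}{q}\sum_{\substack{v\le V\\ v\ \text{odd}\\ (v,q)=1}}\frac{\Lambda(v)}{v}\sum_{\substack{u\le\min(U/q,\,M/Vq)\\ (u,2q)=1}}\frac{\mu(u)}{u} + \sum_{\substack{p^\alpha\le V\\ p\ \text{odd}\\ p\mid q}}\frac{\mu\left(\frac{q}{(q,p^\alpha)}\right)}{q}\,\frac{\Lambda(p^\alpha)}{p^\alpha/(q,p^\alpha)}\sum_{\substack{u\le\min\left(\frac{U}{q/(q,p^\alpha)},\,\frac{M/V}{q/(q,p^\alpha)}\right)\\ u\ \text{odd}\\ \left(u,\frac{q}{(q,p^\alpha)}\right)=1}}\frac{\mu(u)}{u}$$
    $$= \frac1q\cdot O^*\left(\sum_{\substack{v\le V\\ (v,2q)=1}}\frac{\Lambda(v)}{v}+\sum_{\substack{p^\alpha\le V\\ p\ \text{odd}\\ p\mid q}}\frac{\log p}{p^\alpha/(q,p^\alpha)}\right),$$
    where we are using (2.20) to bound the sums on $u$ by 1. We notice that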

    sumpαleVp oddp|q

    log p

    pα(q pα)lesump oddp|q

    (log p)

    vp(q) +sum

    αgtvp(q)

    pαleV

    1

    pαminusvp(q)

    le log q +

    sump oddp|q

    (log p)sumβgt0

    pβle V

    pvp(q)

    log p

    pβle log q +

    sumvleVv odd

    (vq)=1

    Λ(v)

    v

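    and so
    $$\sum_{\substack{a\le M\\ a\ \text{odd}\\ q\mid a}}\ \sum_{\substack{v\mid a\\ a/U\le v\le V}}\frac{\Lambda(v)\mu(a/v)}{a} = \frac1q\cdot O^*\left(\log q+\sum_{\substack{v\le V\\ (v,2)=1}}\frac{\Lambda(v)}{v}\right) = \frac1q\cdot O^*(\log q+\log V)$$
    by (2.12). The absolute value of the sum of the terms with $\hat\eta(-\delta/2)$ in (4.53) is thus at most
    $$\frac{x}{q}\,\frac{\hat\eta(-\delta/2)}{2}(\log q+\log V) \le \frac{x}{2q}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\log Vq,$$
    where we are bounding $\hat\eta(-\delta/2)$ by (2.1) (with $k=2$).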


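    The other terms in (4.53) contribute at most
    $$(\pi^2-4)\,\frac{|\widehat{\eta''}|_\infty}{2\pi^2}\,\frac1x\sum_{u\le U}\ \sum_{\substack{v\le V\\ uv\ \text{odd}\\ uv\le M,\ q\mid uv\\ u\ \text{sq-free}}}\Lambda(v)uv. \tag{4.54}$$
    For any $R$, $\sum_{u\le R,\ u\ \text{odd},\ q\mid u} u\le R^2/4q+3R/4$. Using the estimates (2.12), (2.15) and (2.16), we obtain that the double sum in (4.54) is at most
    $$\sum_{\substack{v\le V\\ (v,2q)=1}}\Lambda(v)v\sum_{\substack{u\le\min(U,M/v)\\ u\ \text{odd}\\ q\mid u}}u+\sum_{\substack{p^\alpha\le V\\ p\ \text{odd}\\ p\mid q}}(\log p)p^\alpha\sum_{\substack{u\le U\\ u\ \text{odd}\\ \frac{q}{(q,p^\alpha)}\mid u}}u$$
    $$\le \sum_{\substack{v\le V\\ (v,2q)=1}}\Lambda(v)v\cdot\left(\frac{(M/v)^2}{4q}+\frac{3M}{4v}\right)+\sum_{\substack{p^\alpha\le V\\ p\ \text{odd}\\ p\mid q}}(\log p)p^\alpha\cdot\frac{(U+1)^2}{4}$$
    $$\le \frac{M^2\log V}{4q}+\frac{3c_4}{4}MV+\frac{(U+1)^2}{4}V\log q, \tag{4.55}$$
    where $c_4 = 1.03884$.

    From this point onwards, we use the easy bound
    $$\left|\sum_{\substack{v\mid a\\ a/U\le v\le V}}\Lambda(v)\mu(a/v)\right| \le \log a.$$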
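The elementary bound $\sum_{u\le R,\ u\ \text{odd},\ q\mid u} u\le R^2/4q+3R/4$ quoted before (4.55) can be tested by brute force for odd $q$ (the case considered here). The sketch below is an illustration only, with hypothetical function names: the sum is $q$ times a sum of consecutive odd numbers, hence $qK^2$ with $K\le (R/q+1)/2$, and $q/4\le R/4$ whenever the sum is non-empty.

```python
def odd_multiples_sum(R, q):
    """Sum of the odd u <= R that are divisible by q (q a positive integer)."""
    return sum(u for u in range(q, int(R) + 1, q) if u % 2 == 1)

def claimed_bound(R, q):
    """The bound R^2 / (4q) + 3R / 4 used before (4.55)."""
    return R * R / (4 * q) + 3 * R / 4
```

For instance, with $q = 3$ and $R = 15$ the odd multiples are $3, 9, 15$, with sum $27$, while the bound gives $75/4 + 45/4 = 30$.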

    What we must bound now is
    $$\sum_{\substack{m\le UV\\ m\ \text{odd}\\ q\nmid m\ \text{or}\ m>M}}(\log m)\sum_{n\ \text{odd}} e(\alpha mn)\,\eta(mn/x). \tag{4.56}$$
    The inner sum is the same as the sum $T_m(\alpha)$ in (4.35); we will be using the bound (4.36). Much as before, we will be able to ignore the condition that $m$ is odd.

    Let $D = UV$. What remains to do is similar to what we did in the proof of Lemma 4.2.1 (or Lemma 4.2.2).

    Case (a). $\delta$ large: $|\delta|\ge 1/2c_2$. Instead of (4.16), we have
    $$\sum_{\substack{1\le m\le M\\ q\nmid m}}(\log m)|T_m(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0q^3}{4x}\sum_{0\le j\le M/q}(j+1)\log((j+1)q),$$
    and, since $M\le\min(c_2x/q, D)$, $q\le\sqrt{2c_2x}$ (just as in the proof of Lemma 4.2.1) and
    $$\sum_{0\le j\le M/q}(j+1)\log((j+1)q) \le \frac{M}{q}\log M+\left(\frac{M}{q}+1\right)\log(M+1)+\frac{1}{q^2}\int_0^M t\log t\,dt \le \left(\frac{2M}{q}+1\right)\log x+\frac{M^2}{2q^2}\log\frac{M}{\sqrt{e}},$$
    we conclude that
    $$\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_m(\alpha)| \le \frac{5c_0c_2}{3\pi^2}M\log\frac{M}{\sqrt{e}}+\frac{20c_0}{3\pi^2}(2c_2)^{3/2}\sqrt{x}\log x. \tag{4.57}$$

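    Instead of (4.17), we have
    $$\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\frac{x}{jq'+\frac{Q+1}{2}}\log\left(jq'+\frac{Q+1}{2}\right) \le \frac{x}{\frac{Q+1}{2}}\log\frac{Q+1}{2}+\frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{\log t}{t}\,dt \le \frac{2x}{Q}\log\frac{Q}{2}+\frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2-\left(\log\frac{Q}{2}\right)^2\right).$$
    Instead of (4.18), we estimate
    $$q'\sum_{j=0}^{\left\lfloor\frac{D-\frac{Q+1}{2}}{q'}\right\rfloor}\left(\log\left(\frac{Q+1}{2}+jq'\right)\right)\sqrt{1+\frac{q'}{jq'+\frac{Q+1}{2}}}$$
    $$\le q'\left(\log D+(\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2}\right)+\int_{\frac{Q+1}{2}}^{D}\log t\,dt+\int_{\frac{Q+1}{2}}^{D}\frac{q'\log t}{2t}\,dt$$
    $$\le q'\left(\log D+\left(\sqrt{3+2\epsilon}-1\right)\log\frac{Q+1}{2}\right)+\left(D\log\frac{D}{e}-\frac{Q+1}{2}\log\frac{Q+1}{2e}\right)+\frac{q'}{2}\log D\,\log^+\frac{D}{\frac{Q+1}{2}}.$$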

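    We conclude that, when $D\ge Q/2$, the sum $\sum_{Q/2<m\le D}(\log m)|T_m(\alpha)|$ is at most
    $$\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{e}+(Q+1)\left((1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2}-\frac12\log\frac{Q+1}{2e}\right)\right) + \frac{\sqrt{c_0c_1}}{\pi}(Q+1)(1+\epsilon)\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}} + \frac{3c_1}{2}\left(\frac{2x}{Q}\log\frac{Q}{2}+\frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2-\left(\log\frac{Q}{2}\right)^2\right)\right).$$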


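    We must now add this to (4.57). Since
    $$(1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\sqrt2-\frac12\log 2e+\frac{1+\sqrt{13/3}}{2}\log 2\sqrt{e}>0$$
    and $Q\ge 2\sqrt{x}$, we conclude that (4.56) is at most
    $$\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left((\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{\sqrt2}+\frac12\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}}\right) + \left(\frac{3c_1}{2}\left(\frac12+\frac{3(1+\epsilon)}{16\epsilon}\log x\right)+\frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x. \tag{4.58}$$
    Case (b). $\delta$ small: $|\delta|\le 1/2c_2$ or $D\le Q_0/2$. The analogue of (4.23) is a bound of
    $$\le \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}$$
    for the terms with $m\le q/2$. If $q^2<2c_2x$, then, much as in (4.24), we have
    $$\sum_{\substack{q/2<m\le D'\\ q\nmid m}}|T_m(\alpha)|(\log m) \le \frac{10}{\pi^2}\,\frac{c_0q^3}{3x}\sum_{1\le j\le\frac{D'}{q}+\frac12}\left(j+\frac12\right)\log((j+1/2)q) \le \frac{10}{\pi^2}\,\frac{c_0q}{3x}\int_q^{D'+\frac32q}x\log x\,dx. \tag{4.59}$$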

    Since
    $$\int_q^{D'+\frac32q}x\log x\,dx = \frac12\left(D'+\frac32q\right)^2\log\frac{D'+\frac32q}{\sqrt{e}}-\frac12q^2\log\frac{q}{\sqrt{e}}$$
    $$= \left(\frac12D'^2+\frac32D'q\right)\left(\log\frac{D'}{\sqrt{e}}+\frac32\,\frac{q}{D'}\right)+\frac98q^2\log\frac{D'+\frac32q}{\sqrt{e}}-\frac12q^2\log\frac{q}{\sqrt{e}}$$
    $$= \frac12D'^2\log\frac{D'}{\sqrt{e}}+\frac32D'q\log D'+\frac98q^2\left(\frac29+\frac32+\log\left(D'+\frac{19}{18}q\right)\right),$$
    where $D' = \min(c_2x/q, D)$, and since the assumption $(UV+(19/18)Q_0)\le x/5.6$ implies that $(2/9+3/2+\log(D'+(19/18)q))\le\log x$, we conclude that
    $$\sum_{\substack{q/2<m\le D'\\ q\nmid m}}|T_m(\alpha)|(\log m) \le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}}+\frac{10c_0}{3\pi^2}\left(\frac34(2c_2)^{3/2}\sqrt{x}\log x+\frac98(2c_2)^{3/2}\sqrt{x}\log x\right) \le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}}+\frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x. \tag{4.60}$$
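The role of the constant 5.6 in the hypothesis $UV+(19/18)Q_0\le x/5.6$ can be made explicit with a short computation. Since $e^{2/9+3/2}\approx 5.597<5.6$, any $y\le x/5.6$ satisfies $2/9+3/2+\log y\le\log x$; the text applies this with $y = D'+(19/18)q$. The sketch below only illustrates that arithmetic (the names are hypothetical), together with the sum $\frac{10}{3}\left(\frac34+\frac98\right)=\frac{25}{4}$ behind the last coefficient in (4.60).

```python
import math

CONST = 2 / 9 + 3 / 2           # additive constant in the bracket preceding (4.60)
THRESHOLD = math.exp(CONST)     # y <= x / THRESHOLD  <=>  CONST + log(y) <= log(x)

def bracket(y):
    """The quantity 2/9 + 3/2 + log(y)."""
    return CONST + math.log(y)

# Coefficient of c_0 (2 c_2)^{3/2} sqrt(x) log(x) / pi^2 in the last line of (4.60).
combined_coefficient = (10 / 3) * (3 / 4 + 9 / 8)
```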


    Let $R = \max(c_2x/q, q/2)$. We bound the terms $R<m\le D$ as in (4.25), with a factor of $\log(jq+R)$ inside the sum. The analogues of (4.26) and (4.27) are
    $$\sum_{j=0}^{\left\lfloor\frac1q(D-R)\right\rfloor}\frac{x}{jq+R}\log(jq+R) \le \frac{x}{R}\log R+\frac{x}{q}\int_R^D\frac{\log t}{t}\,dt \le \sqrt{\frac{2x}{c_2}}\log\sqrt{\frac{c_2x}{2}}+\frac{x}{q}\log D\,\log^+\frac{D}{R}, \tag{4.61}$$
    where we use the assumption that $x\ge e^2c_2/2$, and
    $$\sum_{j=0}^{\left\lfloor\frac1q(D-R)\right\rfloor}\log(jq+R)\sqrt{1+\frac{q}{jq+R}} \le \sqrt3\log R+\frac1q\left(D\log\frac{D}{e}-R\log\frac{R}{e}\right)+\frac12\log D\,\log\frac{D}{R} \tag{4.62}$$
    (or 0 if $D<R$). We sum with (4.60) and the terms with $m\le q/2$, and obtain, for $D' = c_2x/q = R$,
    $$\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}}+q\left(\sqrt3\log\frac{c_2x}{q}+\frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right) + \frac{3c_1}{2}\,\frac{x}{q}\log D\,\log^+\frac{D}{c_2x/q} + \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2} + \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,$$
    which, it is easy to check, is also valid even if $D' = D$ (in which case (4.61) and (4.62) do not appear) or $R = q/2$ (in which case (4.60) does not appear).

    Chapter 5

    Type II sums

    We must now consider the sum
    $$S_{II} = \sum_{\substack{m>U\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)\sum_{\substack{n>V\\ (n,v)=1}}\Lambda(n)\,e(\alpha mn)\,\eta(mn/x). \tag{5.1}$$
    Here the main improvements over classical treatments of type II sums are as follows:

    1. obtaining cancellation in the term $\sum_{d>U,\ d\mid m}\mu(d)$, leading to a gain of a factor of $\log$;

    2. using a large sieve for primes, getting rid of a further $\log$;

    3. exploiting, via a non-conventional application of the principle of the large sieve (Lemma 5.2.1), the fact that $\alpha$ is in the tail of an interval (when that is the case).

    It should be clear that these techniques are of general applicability. (It is also clear that (2) is not new, though, strangely enough, it seems not to have been applied to Goldbach's problem. Perhaps this oversight is due to the fact that proofs of Vinogradov's result given in textbooks often follow Linnik's dispersion method, rather than the large sieve. Our treatment of the large sieve for primes will follow the lines set by Montgomery and Montgomery-Vaughan [MV73, (1.6)]. The fact that the large sieve for primes can be combined with the new technique (3) is, of course, a novelty.)

    While (1) is particularly useful for the treatment of a term that generally arises in applications of Vaughan's identity, all of the points above address issues that can arise in more general situations in number theory.


    It is technically helpful to express $\eta$ as the (multiplicative) convolution of two functions of compact support, preferably the same function:
    $$\eta(x) = \eta_1 *_M \eta_1 = \int_0^\infty \eta_1(t)\,\eta_1(x/t)\,\frac{dt}{t}. \tag{5.2}$$
    For the smoothing function $\eta(t) = \eta_2(t) = 4\max(\log 2-|\log 2t|, 0)$, equation (5.2) holds with $\eta_1 = 2\cdot 1_{[1/2,1]}$, where $1_{[1/2,1]}$ is the characteristic function of the interval $[1/2,1]$. We will work with $\eta = \eta_2$, yet most of our work will be valid for any $\eta$ of the form $\eta = \eta_1 * \eta_1$.
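The identity $\eta_2 = \eta_1 *_M \eta_1$ with $\eta_1 = 2\cdot 1_{[1/2,1]}$ is easy to confirm numerically. The following sketch is only an illustration (the function names and the midpoint quadrature are choices made here, not taken from the text): it approximates the multiplicative convolution integral in (5.2) and compares it with the closed form $4\max(\log 2-|\log 2x|,0)$.

```python
import math

def eta1(t):
    """eta_1 = 2 * (characteristic function of [1/2, 1])."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def eta2(t):
    """eta_2(t) = 4 * max(log 2 - |log 2t|, 0) for t > 0, and 0 otherwise."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

def mult_convolution(x, steps=20000):
    """Midpoint approximation of int eta1(t) * eta1(x/t) dt/t.

    eta1 vanishes outside [1/2, 1], so it suffices to integrate over that interval.
    """
    a, b = 0.5, 1.0
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += eta1(t) * eta1(x / t) / t
    return total * h
```

For $x\in[1/4,1/2]$ the integrand is supported on $t\in[1/2,2x]$, giving $4\log 4x$; for $x\in[1/2,1]$ it is supported on $t\in[x,1]$, giving $4\log(1/x)$. Both agree with $\eta_2(x)$.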

    By (5.2), the sum (5.1) equals
    $$4\int_0^\infty\sum_{\substack{m>U\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)\sum_{\substack{n>V\\ (n,v)=1}}\Lambda(n)\,e(\alpha mn)\,\eta_1(t)\,\eta_1\left(\frac{mn/x}{t}\right)\frac{dt}{t}$$
    $$= 4\int_V^{x/U}\sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)\sum_{\substack{\max\left(V,\frac{W}{2}\right)<n\le W\\ (n,v)=1}}\Lambda(n)\,e(\alpha mn)\,\frac{dW}{W} \tag{5.3}$$
    by the substitution $t = (m/x)W$. (We can assume $V\le W\le x/U$ because otherwise one of the sums in (5.4) is empty.) As we can see, the sums within the integral are now unsmoothed. This will not be truly harmful, and to some extent it will be convenient, in that ready-to-use large-sieve estimates in the literature have been optimized more carefully for unsmoothed sums than for smooth sums. The fact that the sums start at $x/2W$ and $W/2$ rather than at 1 will also be slightly helpful.

    (This is presumably why the weight $\eta_2$ was introduced in [Tao14], which also uses the large sieve. As we will later see, the weight $\eta_2$ (or anything like it) will simply not do on the major arcs, which are much more sensitive to the choice of weights. On the minor arcs, however, $\eta_2$ is convenient, and this is why we use it here. For type I sums, as should be clear from our work so far, which was stated for general weights, any function whose second derivative exists almost everywhere and lies in $\ell_1$ would do just as well. The option of having no smoothing whatsoever (as in Vinogradov's work, or as in most textbook accounts) would not be quite as good for type I sums, and would lead to a routine but inconvenient splitting of sums into short intervals in place of (5.3).)

    We now do what is generally the first thing in type II treatments: we use Cauchy-Schwarz. A minor note, however, that may help avoid confusion: the treatments familiar to some readers (e.g., the dispersion method, not followed here) start with the special case of Cauchy-Schwarz that is most common in number theory,
    $$\left|\sum_{n\le N}a_n\right|^2 \le N\sum_{n\le N}|a_n|^2,$$
    whereas here we apply the general rule
    $$\sum_m a_mb_m \le \sqrt{\sum_m|a_m|^2}\sqrt{\sum_m|b_m|^2}$$
    to the integrand in (5.3). At any rate, we will have reduced the estimation of a sum to the estimation of two simpler sums $\sum_m|a_m|^2$, $\sum_m|b_m|^2$, but each of these two simpler sums will be of a kind that will lead to a loss of a factor of $\log x$ (or $(\log x)^3$) if not estimated carefully. Since we cannot afford to lose a single factor of $\log x$, we will have to deploy and develop techniques to eliminate these factors of $\log x$. The procedure followed will be quite different for the two sums; a variety of techniques will be needed.

    We separate $n$ prime and $n$ non-prime in the integrand of (5.3), and, as we were saying, we apply Cauchy-Schwarz. We obtain that the expression within the integral in (5.3) is at most $\sqrt{S_1(U,W)\cdot S_2(U,V,W)}+\sqrt{S_1(U,W)\cdot S_3(W)}$, where
    $$S_1(U,W) = \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)^2,$$
    $$S_2(U,V,W) = \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left|\sum_{\substack{\max\left(V,\frac{W}{2}\right)<p\le W\\ (p,v)=1}}(\log p)\,e(\alpha mp)\right|^2 \tag{5.4}$$
    and
    $$S_3(W) = \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\left|\sum_{\substack{n\le W\\ n\ \text{non-prime}}}\Lambda(n)\right|^2 = \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\left(1.42620\,W^{1/2}\right)^2 \le 1.0171x+2.0341W \tag{5.5}$$
    (by [RS62, Thm. 13]). We will assume $V\le w$; thus the condition $(p,v)=1$ will be fulfilled automatically and can be removed.

    The contribution of $S_3(W)$ will be negligible. We must bound $S_1(U,W)$ and $S_2(U,V,W)$ from above.
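The constants in (5.5) can be reproduced directly. The sketch below is an illustration with hypothetical function names: it checks by brute force, for small $W$, the bound $\sum_{n\le W,\ n\ \text{non-prime}}\Lambda(n)\le 1.42620\sqrt{W}$ quoted from [RS62, Thm. 13], and then checks that multiplying $(1.42620\,W^{1/2})^2$ by the number of integers in $(x/2W, x/W]$, which is at most $x/2W+1$, stays below $1.0171x+2.0341W$.

```python
import math

C = 1.42620  # constant quoted in (5.5) from [RS62, Thm. 13]

def nonprime_lambda_sum(W):
    """Sum of Lambda(n) over non-prime n <= W, i.e. over prime powers p^k with k >= 2."""
    total = 0.0
    p = 2
    while p * p <= W:
        if all(p % r for r in range(2, int(p ** 0.5) + 1)):  # p is prime
            pk = p * p
            while pk <= W:
                total += math.log(p)
                pk *= p
        p += 1
    return total

def s3_upper(x, W):
    """(number of m in (x/2W, x/W], at most x/2W + 1) times (C * sqrt(W))^2."""
    return (x / (2 * W) + 1) * (C * math.sqrt(W)) ** 2

def stated_bound(x, W):
    """Right-hand side of (5.5)."""
    return 1.0171 * x + 2.0341 * W
```

The brute-force check is tight near $W = 361$, where the ratio of the sum to $\sqrt{W}$ comes within about $10^{-5}$ of the constant.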


    5.1 The sum $S_1$: cancellation

    We shall bound
    $$S_1(U,W) = \sum_{\substack{\max(U,x/2W)<m\le x/W\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)^2. \tag{5.6}$$
    There will be a surprising amount of cancellation: the expression within the sum will be bounded by a constant on average, a constant less than 1 and usually less than 1/2, in fact. In other words, the inner sum in (5.6) is exactly 0 most of the time.

    Recall that we need explicit constants throughout, and that this essentially constrains us to elementary means. (We will at one point use Dirichlet series and $\zeta(s)$ for $s$ real and greater than 1.)
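To get a feeling for the cancellation in the inner sum of (5.6), one can tabulate $\sum_{d>U,\ d\mid m}\mu(d)$ for small parameters. The sketch below is purely exploratory: the function names and the parameter ranges are choices made here, it ignores the coprimality condition $(m,v)=1$, and it proves nothing about the explicit constants obtained in the text.

```python
def mobius_sieve(n):
    """Return a list mu with mu[k] = Moebius function of k for 0 <= k <= n (mu[0] = 0)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]          # one more prime factor
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0               # divisible by a square
    return mu

def inner_sum(m, U, mu):
    """Sum of mu(d) over the divisors d of m with d > U (the inner sum in (5.6))."""
    return sum(mu[d] for d in range(int(U) + 1, m + 1) if m % d == 0)

def mean_square(X, U, mu):
    """Average of inner_sum(m, U)^2 over X/2 < m <= X."""
    values = [inner_sum(m, U, mu) ** 2 for m in range(X // 2 + 1, X + 1)]
    return sum(values) / len(values)
```

A call such as `mean_square(2000, 40, mobius_sieve(2000))` gives an empirical average that can be compared with the statement that the expression is bounded by a constant on average.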

    5.1.1 Reduction to a sum with $\mu$

    It is tempting to start by applying Möbius inversion to change $d>U$ to $d\le U$ in (5.6), but this just makes matters worse. We could also try changing variables so that $m/d$ (which is smaller than $x/UW$) becomes the variable instead of $d$, but this leads to complications for $m$ non-square-free. Instead, we write
    $$\sum_{\substack{\max(U,x/2W)<m\le x/W\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d\mid m}}\mu(d)\right)^2 = \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\ \sum_{d_1,d_2\mid m}\mu(d_1>U)\,\mu(d_2>U)$$
    $$= \sum_{r_1<x/WU}\ \sum_{\substack{r_2<x/WU\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\ \sum_{\substack{\ell\\ (\ell,r_1r_2)=1\\ r_1\ell,\,r_2\ell>U\\ (\ell,v)=1}}\mu(r_1\ell)\,\mu(r_2\ell)\sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ r_1r_2\ell\mid m\\ (m,v)=1}}1, \tag{5.7}$$
    where $d_1 = r_1\ell$, $d_2 = r_2\ell$, $\ell = (d_1,d_2)$. (The inequality $r_1<x/WU$ comes from $r_1r_2\ell\mid m$, $m\le x/W$, $r_2\ell>U$; $r_2<x/WU$ is proven in the same way.) Now (5.7) equals
    $$\sum_{\substack{s<\frac{x}{WU}\\ (s,v)=1}}\ \sum_{r_1<\frac{x}{WUs}}\ \sum_{\substack{r_2<\frac{x}{WUs}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\,\mu(r_2)\sum_{\substack{\max\left(\frac{U}{\min(r_1,r_2)},\frac{x/W}{2r_1r_2s}\right)<\ell\le\frac{x/W}{r_1r_2s}\\ (\ell,r_1r_2)=1,\ (\mu(\ell))^2=1\\ (\ell,v)=1}}1, \tag{5.8}$$
    where we have set $s = m/(r_1r_2\ell)$. We begin by simplifying the innermost triple sum. This we do in the following Lemma; it is not a trivial task, and carrying it out efficiently actually takes an idea.


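    Lemma 5.1.1. Let $z, y>0$. Then
    $$\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\,\mu(r_2)\sum_{\substack{\min\left(\frac{z/y}{\min(r_1,r_2)},\frac{z}{2r_1r_2}\right)<\ell\le\frac{z}{r_1r_2}\\ (\ell,r_1r_2)=1,\ (\mu(\ell))^2=1\\ (\ell,v)=1}}1 \tag{5.9}$$
    equals
    $$\frac{6z}{\pi^2}\,\frac{v}{\sigma(v)}\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\,\mu(r_2)}{\sigma(r_1)\,\sigma(r_2)}\left(1-\max\left(\frac12,\frac{r_1}{y},\frac{r_2}{y}\right)\right) + O^*\left(5.08\,\zeta\left(\frac32\right)^2y\sqrt{z}\cdot\prod_{p\mid v}\left(1+\frac{1}{\sqrt{p}}\right)\left(1-\frac{1}{p^{3/2}}\right)^2\right). \tag{5.10}$$
    If $v = 2$, the error term in (5.10) can be replaced by
    $$O^*\left(1.27\,\zeta\left(\frac32\right)^2y\sqrt{z}\cdot\left(1+\frac{1}{\sqrt2}\right)\left(1-\frac{1}{2^{3/2}}\right)^2\right). \tag{5.11}$$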

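    Proof. By Möbius inversion, (5.9) equals
    $$\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\,\mu(r_2)\sum_{\substack{\ell\le\frac{z}{r_1r_2}\\ \ell>\min\left(\frac{z/y}{\min(r_1,r_2)},\frac{z}{2r_1r_2}\right)\\ (\ell,v)=1}}\ \sum_{\substack{d_1\mid r_1,\ d_2\mid r_2\\ d_1d_2\mid\ell}}\mu(d_1)\,\mu(d_2)\sum_{\substack{d_3\mid v\\ d_3\mid\ell}}\mu(d_3)\sum_{\substack{m^2\mid\ell\\ (m,r_1r_2v)=1}}\mu(m). \tag{5.12}$$
    We can change the order of summation of $r_i$ and $d_i$ by defining $s_i = r_i/d_i$, and we can also use the obvious fact that the number of integers in an interval $(a,b]$ divisible by $d$ is $(b-a)/d+O^*(1)$. Thus (5.12) equals
    $$\sum_{\substack{d_1,d_2<y\\ (d_1,d_2)=1\\ (d_1d_2,v)=1}}\mu(d_1)\,\mu(d_2)\sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (d_1s_1,d_2s_2)=1\\ (s_1s_2,v)=1}}\mu(d_1s_1)\,\mu(d_2s_2)\sum_{d_3\mid v}\mu(d_3)\sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ (m,d_1s_1d_2s_2v)=1}}\frac{\mu(m)}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}\left(1-\max\left(\frac12,\frac{s_1d_1}{y},\frac{s_2d_2}{y}\right)\right) \tag{5.13}$$
    plus
    $$O^*\left(\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3\mid v}\ \sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\ \text{sq-free}}}1\right). \tag{5.14}$$
    If we complete the innermost sum in (5.13) by removing the condition
    $$m\le\sqrt{z/(d_1^2s_1d_2^2s_2d_3)},$$
    we obtain (reintroducing the variables $r_i = d_is_i$)
    $$z\cdot\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\,\mu(r_2)}{r_1r_2}\left(1-\max\left(\frac12,\frac{r_1}{y},\frac{r_2}{y}\right)\right)\sum_{\substack{d_1\mid r_1\\ d_2\mid r_2}}\ \sum_{d_3\mid v}\ \sum_{\substack{m\\ (m,r_1r_2v)=1}}\frac{\mu(d_1)\,\mu(d_2)\,\mu(m)\,\mu(d_3)}{d_1d_2d_3m^2}. \tag{5.15}$$
    Now (5.15) equals
    $$\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\,\mu(r_2)\,z}{r_1r_2}\left(1-\max\left(\frac12,\frac{r_1}{y},\frac{r_2}{y}\right)\right)\prod_{\substack{p\mid r_1r_2\\ \text{or}\ p\mid v}}\left(1-\frac1p\right)\prod_{\substack{p\nmid r_1r_2\\ p\nmid v}}\left(1-\frac{1}{p^2}\right)$$
    $$= \frac{6z}{\pi^2}\,\frac{v}{\sigma(v)}\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\,\mu(r_2)}{\sigma(r_1)\,\sigma(r_2)}\left(1-\max\left(\frac12,\frac{r_1}{y},\frac{r_2}{y}\right)\right),$$
    i.e., the main term in (5.10). It remains to estimate the terms used to complete the sum; their total is, by definition, given exactly by (5.13) with the inequality $m\le\sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$ changed to $m>\sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$. This is a total of size at most
    $$\frac12\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3\mid v}\ \sum_{\substack{m>\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\ \text{sq-free}}}\frac{1}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}. \tag{5.16}$$
    Adding this to (5.14), we obtain, as our total error term,
    $$\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3\mid v}f\left(\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\right), \tag{5.17}$$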


    where

    f(x) =summlexm sq-free

    1 +1

    2

    summgtxm sq-free

    x2

    m2

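    It is easy to see that $f(x)/x$ has a local maximum exactly when $x$ is a square-free (positive) integer. We can hence check that
    \[
    f(x) \le \frac12\left(2 + 2\left(\frac{\zeta(2)}{\zeta(4)} - 1.25\right)\right) x = 1.26981\ldots\, x
    \]
    for all $x \ge 0$ by checking all integers smaller than a constant and using $\{m : m\ \text{sq-free}\} \subset \{m : 4 \nmid m\}$ and $1.5 \cdot (3/4) < 1.26981$ to bound $f$ from above for $x$ larger than a constant. Therefore, (5.17) is at most
    \[
    \begin{aligned}
    &1.27 \sum_{\substack{d_1,d_2<y\\ (d_1 d_2,v)=1}}\;\sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1 s_2,v)=1}}\;\sum_{d_3|v} \sqrt{\frac{z}{d_1^2 s_1 d_2^2 s_2 d_3}}\\
    &\quad= 1.27\sqrt{z}\,\prod_{p|v}\left(1+\frac{1}{\sqrt p}\right)\cdot\left(\sum_{\substack{d<y\\ (d,v)=1}}\;\sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{d\sqrt{s}}\right)^2.
    \end{aligned}
    \]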
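The bound $f(x) \le 1.26981\ldots x$ can be spot-checked numerically. The following is a minimal sketch, not the author's computation: it evaluates $f(n)/n$ at square-free integers $n$ only (where the local maxima lie), uses the standard value $\sum_{m\ \text{sq-free}} m^{-2} = \zeta(2)/\zeta(4) = 15/\pi^2$, and works in floating point rather than interval arithmetic. The function names are illustrative.

```python
import math

def squarefree_flags(limit):
    """flags[m] is True exactly when m is square-free (1 <= m <= limit)."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
        d += 1
    return flags

# Sum of 1/m^2 over all square-free m: zeta(2)/zeta(4) = 15/pi^2.
SQUAREFREE_ZETA = 15 / math.pi ** 2

def f_over_x_at_integers(limit):
    """Return {n: f(n)/n} for square-free n <= limit, where
    f(x) = #{m <= x sq-free} + (1/2) * sum_{m > x sq-free} x^2/m^2."""
    flags = squarefree_flags(limit)
    count = 0      # number of square-free m <= n
    head = 0.0     # sum of 1/m^2 over square-free m <= n
    ratios = {}
    for n in range(1, limit + 1):
        if flags[n]:
            count += 1
            head += 1.0 / (n * n)
            tail = SQUAREFREE_ZETA - head
            ratios[n] = (count + 0.5 * n * n * tail) / n
    return ratios

if __name__ == "__main__":
    ratios = f_over_x_at_integers(2000)
    worst = max(ratios, key=ratios.get)
    print(worst, ratios[worst])
```

In this range the maximum of $f(n)/n$ is attained at $n = 2$, which is where the expression $\frac12(2 + 2(\zeta(2)/\zeta(4) - 1.25))$ comes from.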

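    We can bound the double sum simply by
    \[
    \sum_{\substack{d<y\\ (d,v)=1}}\;\sum_{s<y/d} \frac{1}{\sqrt{s}\,d} \le 2 \sum_{\substack{d<y\\ (d,v)=1}} \frac{\sqrt{y/d}}{d} \le 2\sqrt{y}\cdot \zeta\left(\frac32\right)\prod_{p|v}\left(1-\frac{1}{p^{3/2}}\right).
    \]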

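    Alternatively, if $v = 2$, we bound
    \[
    \sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{\sqrt s} = \sum_{\substack{s<y/d\\ s\ \text{odd}}} \frac{1}{\sqrt s} \le 1 + \frac12\int_1^{y/d} \frac{1}{\sqrt s}\,ds = \sqrt{y/d}
    \]
    and thus
    \[
    \sum_{\substack{d<y\\ (d,v)=1}}\;\sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{\sqrt{s}\,d} \le \sum_{\substack{d<y\\ (d,2)=1}} \frac{\sqrt{y/d}}{d} \le \sqrt{y}\left(1-\frac{1}{2^{3/2}}\right)\zeta\left(\frac32\right).
    \]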

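    Applying Lemma 5.1.1 with $y = S/s$ and $z = x/(Ws)$, where $S = x/(WU)$, we obtain that (5.8) equals
    \[
    \begin{aligned}
    &\frac{6x}{\pi^2 W}\,\frac{v}{\sigma(v)} \sum_{\substack{s<S\\ (s,v)=1}} \frac1s \sum_{r_1<S/s}\;\sum_{\substack{r_2<S/s\\ (r_1,r_2)=1\\ (r_1 r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \left(1-\max\left(\frac12,\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right)\\
    &\quad+ O^*\left(5.04\,\zeta\left(\frac32\right)^3 S\sqrt{\frac{x}{W}}\,\prod_{p|v}\left(1+\frac{1}{\sqrt p}\right)\left(1-\frac{1}{p^{3/2}}\right)^3\right),
    \end{aligned} \tag{5.18}
    \]
    with 5.04 replaced by 1.27 if $v = 2$. The main term in (5.18) can be written as
    \[
    \frac{6x}{\pi^2 W}\,\frac{v}{\sigma(v)} \sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 \sum_{r_1\le uS/s}\;\sum_{\substack{r_2\le uS/s\\ (r_1,r_2)=1\\ (r_1 r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\,du. \tag{5.19}
    \]
    As we can see, the use of an integral eliminates the unpleasant factor
    \[
    \left(1-\max\left(\frac12,\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right).
    \]
    From now on, we will focus on the cases $v = 1$ and $v = 2$ for simplicity. (Higher values of $v$ do not seem to be really profitable in the last analysis.)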

    5.1.2 Explicit bounds for a sum with $\mu$

    We must estimate the expression within parentheses in (5.19). It is not too hard to show that it tends to 0; the first part of the proof of Lemma 5.1.2 will reduce this to the fact that $\sum_n \mu(n)/n = 0$. Obtaining good bounds is a more delicate matter. For our purposes, we will need the expression to converge to 0 at least as fast as $1/(\log)^2$, with a good constant in front. For this task, the bound (2.21) on $\sum_{n\le x} \mu(n)/n$ is enough.

    Lemma 5.1.2. Let
    \[
    g_v(x) = \sum_{r_1\le x}\;\sum_{\substack{r_2\le x\\ (r_1,r_2)=1\\ (r_1 r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)},
    \]
    where $v = 1$ or $v = 2$. Then
    \[
    |g_1(x)| \le \begin{cases} 1/x & \text{if } 33 \le x \le 10^6,\\ \frac{1}{x}(111.536 + 55.768 \log x) & \text{if } 10^6 \le x < 10^{10},\\ \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt x} & \text{if } x \ge 10^{10}, \end{cases}
    \]
    \[
    |g_2(x)| \le \begin{cases} 2.1/x & \text{if } 33 \le x \le 10^6,\\ \frac{1}{x}(1634.34 + 817.168 \log x) & \text{if } 10^6 \le x < 10^{10},\\ \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt x} & \text{if } x \ge 10^{10}. \end{cases}
    \]


    The proof involves what may be called a version of Rankin's trick, using Dirichlet series and the behavior of $\zeta(s)$ near $s = 1$.

    Proof. We prove the statements for $x \le 10^6$ by a direct computation, using interval arithmetic. (In fact, in that range, one gets $2.0895071/x$ instead of $2.1/x$.) Assume from now on that $x > 10^6$.

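    Clearly,
    \[
    \begin{aligned}
    g(x) &= \sum_{r_1\le x}\;\sum_{\substack{r_2\le x\\ (r_1 r_2,v)=1}} \left(\sum_{d|(r_1,r_2)} \mu(d)\right) \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
    &= \sum_{\substack{d\le x\\ (d,v)=1}} \mu(d) \sum_{r_1\le x}\;\sum_{\substack{r_2\le x\\ d|(r_1,r_2)\\ (r_1 r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
    &= \sum_{\substack{d\le x\\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \sum_{\substack{u_1\le x/d\\ (u_1,dv)=1}}\;\sum_{\substack{u_2\le x/d\\ (u_2,dv)=1}} \frac{\mu(u_1)\mu(u_2)}{\sigma(u_1)\sigma(u_2)}\\
    &= \sum_{\substack{d\le x\\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \left(\sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)}\right)^2.
    \end{aligned} \tag{5.20}
    \]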
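Identity (5.20) is purely combinatorial, so it can be confirmed exactly for small $x$. The sketch below is an illustration under simple assumptions: it uses exact rational arithmetic, a naive $O(x^2)$ double sum for $g_v(x)$, and trial division for $\mu$ and $\sigma$; the function names are not from the text.

```python
from fractions import Fraction
from math import gcd

def mu_sigma(limit):
    """Moebius function and sum-of-divisors for 1..limit by trial division."""
    mu = [0] * (limit + 1)
    sigma = [0] * (limit + 1)
    for n in range(1, limit + 1):
        sigma[n] = sum(d for d in range(1, n + 1) if n % d == 0)
        m, value, p = n, 1, 2
        while p * p <= m:
            if m % p == 0:
                m //= p
                if m % p == 0:      # p^2 divides n
                    value = 0
                    break
                value = -value
            p += 1
        if value != 0 and m > 1:    # one remaining prime factor
            value = -value
        mu[n] = value
    return mu, sigma

def g_direct(x, v, mu, sigma):
    """g_v(x) straight from its definition (x a positive integer)."""
    total = Fraction(0)
    for r1 in range(1, x + 1):
        for r2 in range(1, x + 1):
            if gcd(r1, r2) == 1 and gcd(r1 * r2, v) == 1:
                total += Fraction(mu[r1] * mu[r2], sigma[r1] * sigma[r2])
    return total

def g_via_520(x, v, mu, sigma):
    """g_v(x) from the last line of (5.20)."""
    total = Fraction(0)
    for d in range(1, x + 1):
        if mu[d] == 0 or gcd(d, v) != 1:
            continue
        inner = sum((Fraction(mu[r], sigma[r])
                     for r in range(1, x // d + 1) if gcd(r, d * v) == 1),
                    Fraction(0))
        total += Fraction(mu[d], sigma[d] ** 2) * inner ** 2
    return total

if __name__ == "__main__":
    mu, sigma = mu_sigma(60)
    for v in (1, 2):
        print(v, g_direct(60, v, mu, sigma) == g_via_520(60, v, mu, sigma))
```

The same `g_direct` routine gives a slow but transparent way to look at $x\,|g_v(x)|$ for small $x$; the range $x \le 10^6$ in Lemma 5.1.2 needs a faster method and interval arithmetic, as stated in the proof.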

    Moreover,
    \[
    \begin{aligned}
    \sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)} &= \sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{r} \sum_{d'|r}\;\prod_{p|d'} \left(\frac{p}{p+1}-1\right)\\
    &= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \left(\prod_{p|d'} \frac{-1}{p+1}\right) \sum_{\substack{r\le x/d\\ (r,dv)=1\\ d'|r}} \frac{\mu(r)}{r}\\
    &= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}} \frac{\mu(r)}{r}
    \end{aligned}
    \]
    and
    \[
    \sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}} \frac{\mu(r)}{r} = \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}} \frac{1}{d''} \sum_{r\le x/dd'd''} \frac{\mu(r)}{r}.
    \]

    Hence
    \[
    |g(x)| \le \sum_{\substack{d\le x\\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2} \left(\sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}} \frac{1}{d''}\, f(x/dd'd'')\right)^2, \tag{5.21}
    \]
    where $f(t) = \left|\sum_{r\le t} \mu(r)/r\right|$.

    We intend to bound the function $f(t)$ by a linear combination of terms of the form $t^{-\delta}$, $\delta \in [0, 1/2)$. Thus it makes sense now to estimate $F_v(s_1, s_2, x)$, defined to be the quantity
    \[
    \begin{aligned}
    \sum_{\substack{d\\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2} &\left(\sum_{\substack{d_1'\\ (d_1',dv)=1}} \frac{\mu(d_1')^2}{d_1'\sigma(d_1')} \sum_{d_1''|(dd_1'v)^\infty} \frac{1}{d_1''}\cdot (dd_1'd_1'')^{1-s_1}\right)\\
    &\cdot\left(\sum_{\substack{d_2'\\ (d_2',dv)=1}} \frac{\mu(d_2')^2}{d_2'\sigma(d_2')} \sum_{d_2''|(dd_2'v)^\infty} \frac{1}{d_2''}\cdot (dd_2'd_2'')^{1-s_2}\right)
    \end{aligned}
    \]
    for $s_1, s_2 \in [1/2, 1]$. This is equal to
    \[
    \begin{aligned}
    \sum_{\substack{d\\ (d,v)=1}} \frac{\mu(d)^2}{d^{s_1+s_2}} &\prod_{p|d} \frac{1}{(1+p^{-1})^2 (1-p^{-s_1})(1-p^{-s_2})} \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
    &\cdot \left(\sum_{\substack{d'\\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_1+1}} \prod_{p'|d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_1})}\right)\\
    &\cdot \left(\sum_{\substack{d'\\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_2+1}} \prod_{p'|d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_2})}\right),
    \end{aligned}
    \]
    which in turn can easily be seen to equal
    \[
    \begin{aligned}
    &\prod_{p\nmid v}\left(1 + \frac{p^{-s_1}p^{-s_2}}{(1-p^{-s_1}+p^{-1})(1-p^{-s_2}+p^{-1})}\right) \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
    &\quad\cdot \prod_{p\nmid v}\left(1 + \frac{p^{-1}p^{-s_1}}{(1+p^{-1})(1-p^{-s_1})}\right) \cdot \prod_{p\nmid v}\left(1 + \frac{p^{-1}p^{-s_2}}{(1+p^{-1})(1-p^{-s_2})}\right).
    \end{aligned} \tag{5.22}
    \]

    Now, for any $0 < x \le y \le x^{1/2} < 1$,
    \[
    (1+x-y)(1-xy)(1-xy^2) - (1+x)(1-y)(1-x^3) = (x-y)(y^2-x)(xy-x-1)x \le 0,
    \]
    and so
    \[
    1 + \frac{xy}{(1+x)(1-y)} = \frac{(1+x-y)(1-xy)(1-xy^2)}{(1+x)(1-y)(1-xy)(1-xy^2)} \le \frac{1-x^3}{(1-xy)(1-xy^2)}. \tag{5.23}
    \]
    For any $x \le y_1, y_2 < 1$ with $y_1^2 \le x$, $y_2^2 \le x$,
    \[
    1 + \frac{y_1 y_2}{(1-y_1+x)(1-y_2+x)} \le \frac{(1-x^3)^2(1-x^4)}{(1-y_1y_2)(1-y_1y_2^2)(1-y_1^2y_2)}. \tag{5.24}
    \]
    This can be checked as follows: multiplying by the denominators and changing variables to $x$, $s = y_1 + y_2$ and $r = y_1 y_2$, we obtain an inequality where the left side, quadratic on $s$ with positive leading coefficient, must be less than or equal to the right side, which is linear on $s$. The left side minus the right side can be maximal for given $x$, $r$ only when $s$ is maximal or minimal. This happens when $y_1 = y_2$ or when either $y_i = \sqrt{x}$ or $y_i = x$ for at least one of $i = 1, 2$. In each of these cases, we have reduced (5.24) to an inequality in two variables that can be proven automatically¹ by a quantifier-elimination program; the author has used QEPCAD [HB11] to do this.
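As a sanity check on the shape of (5.23) and (5.24), and not as a substitute for the quantifier-elimination proof, one can test them on a grid of rational points with exact arithmetic. The sketch below assumes nothing beyond the two displayed statements; the helper names and the choice of grid are illustrative. Exact fractions matter here because both inequalities become equalities at some admissible points (for instance $y = x$, respectively $y_1 = y_2 = x$).

```python
from fractions import Fraction

def identity_523_gap(x, y):
    """Left side minus right side of the polynomial identity preceding (5.23)."""
    lhs = (1 + x - y) * (1 - x * y) * (1 - x * y**2) - (1 + x) * (1 - y) * (1 - x**3)
    rhs = (x - y) * (y**2 - x) * (x * y - x - 1) * x
    return lhs - rhs

def holds_523(x, y):
    """Inequality (5.23), meant for 0 < x <= y <= sqrt(x) < 1."""
    lhs = 1 + x * y / ((1 + x) * (1 - y))
    rhs = (1 - x**3) / ((1 - x * y) * (1 - x * y**2))
    return lhs <= rhs

def holds_524(x, y1, y2):
    """Inequality (5.24), meant for x <= y1, y2 < 1 with y1^2, y2^2 <= x."""
    lhs = 1 + y1 * y2 / ((1 - y1 + x) * (1 - y2 + x))
    rhs = ((1 - x**3) ** 2 * (1 - x**4)) / (
        (1 - y1 * y2) * (1 - y1 * y2**2) * (1 - y1**2 * y2))
    return lhs <= rhs

def grid(n):
    """Rational points k/n strictly between 0 and 1."""
    return [Fraction(k, n) for k in range(1, n)]

if __name__ == "__main__":
    pts = grid(20)
    admissible = [(x, y) for x in pts for y in pts if x <= y and y * y <= x]
    print(all(holds_523(x, y) for x, y in admissible))
    print(all(holds_524(x, y1, y2)
              for x, y1 in admissible for x2, y2 in admissible if x2 == x))
```

A grid check of this kind can only catch a mis-stated inequality; it proves nothing about the points in between.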

    Hence $F_v(s_1, s_2, x)$ is at most
    \[
    \begin{aligned}
    &\prod_{p\nmid v} \frac{(1-p^{-3})^2(1-p^{-4})}{(1-p^{-s_1-s_2})(1-p^{-2s_1-s_2})(1-p^{-s_1-2s_2})} \cdot \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
    &\quad\cdot \prod_{p\nmid v} \frac{1-p^{-3}}{(1+p^{-s_1-1})(1+p^{-2s_1-1})} \prod_{p\nmid v} \frac{1-p^{-3}}{(1+p^{-s_2-1})(1+p^{-2s_2-1})}\\
    &= C_{v,s_1,s_2} \cdot \frac{\zeta(s_1+1)\zeta(s_2+1)\zeta(2s_1+1)\zeta(2s_2+1)}{\zeta(3)^4\zeta(4)\left(\zeta(s_1+s_2)\zeta(2s_1+s_2)\zeta(s_1+2s_2)\right)^{-1}},
    \end{aligned} \tag{5.25}
    \]
    where $C_{v,s_1,s_2}$ equals 1 if $v = 1$ and
    \[
    \frac{(1-2^{-s_1-2s_2})(1+2^{-s_1-1})(1+2^{-2s_1-1})(1+2^{-s_2-1})(1+2^{-2s_2-1})}{(1-2^{-s_1-s_2})^{-1}(1-2^{-2s_1-s_2})^{-1}(1-2^{-s_1})(1-2^{-s_2})(1-2^{-3})^4(1-2^{-4})}
    \]
    if $v = 2$.

    For $1 \le t \le x$, (2.21) and (2.24) imply
    \[
    f(t) \le \begin{cases} \sqrt{\dfrac{2}{t}} & \text{if } x \le 10^{10},\\[2ex] \sqrt{\dfrac{2}{t}} + \dfrac{0.03}{\log x}\left(\dfrac{x}{t}\right)^{\frac{\log\log 10^{10}}{\log x - \log 10^{10}}} & \text{if } x > 10^{10}, \end{cases} \tag{5.26}
    \]

    ¹ In practice, the case $y_i = \sqrt{x}$ leads to a polynomial of high degree, and quantifier elimination increases sharply in complexity as the degree increases; a stronger inequality of lower degree (with $(1 - 3x^3)$ instead of $(1-x^3)^2(1-x^4)$) was given to QEPCAD to prove in this case.

    where we are using the fact that $\log x$ is convex-down. Note that, again by convexity,
    \[
    \frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}} < (\log t)'\big|_{t=\log 10^{10}} = \frac{1}{\log 10^{10}} = 0.0434294\ldots
    \]
    Obviously, $\sqrt{2/t}$ in (5.26) can be replaced by $(2/t)^{1/2-\epsilon}$ for any $\epsilon \ge 0$.

    By (5.21) and (5.26),
    \[
    |g_v(x)| \le \left(\frac{2}{x}\right)^{1-2\epsilon} F_v(1/2+\epsilon, 1/2+\epsilon, x)
    \]
    for $x \le 10^{10}$. We set $\epsilon = 1/\log x$ and obtain from (5.25) that
    \[
    \begin{aligned}
    F_v(1/2+\epsilon, 1/2+\epsilon, x) &\le C_{v,\frac12+\epsilon,\frac12+\epsilon}\, \frac{\zeta(1+2\epsilon)\zeta(3/2)^4\zeta(2)^2}{\zeta(3)^4\zeta(4)}\\
    &\le 55.768 \cdot C_{v,\frac12+\epsilon,\frac12+\epsilon} \cdot \left(1 + \frac{\log x}{2}\right),
    \end{aligned} \tag{5.27}
    \]
    where we use the easy bound $\zeta(s) < 1 + 1/(s-1)$ obtained by $\sum n^{-s} < 1 + \int_1^\infty t^{-s}\,dt$. (For sharper bounds, see [BR02].) Now
    \[
    C_{2,\frac12+\epsilon,\frac12+\epsilon} \le \frac{(1-2^{-3/2-\epsilon})^2(1+2^{-3/2})^2(1+2^{-2})^2(1-2^{-1-2\epsilon})}{(1-2^{-1/2})^2(1-2^{-3})^4(1-2^{-4})} \le 14.652983,
    \]
    whereas $C_{1,\frac12+\epsilon,\frac12+\epsilon} = 1$. (We are assuming $x \ge 10^6$, and so $\epsilon \le 1/(\log 10^6)$.) Hence
    \[
    |g_v(x)| \le \begin{cases} \frac{1}{x}(111.536 + 55.768 \log x) & \text{if } v = 1,\\ \frac{1}{x}(1634.34 + 817.168 \log x) & \text{if } v = 2 \end{cases}
    \]

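    for $10^6 \le x < 10^{10}$.

    For general $x$, we must use the second bound in (5.26). Define $c = 1/(\log 10^{10})$. We see that, if $x > 10^{10}$,
    \[
    \begin{aligned}
    |g_v(x)| \le{}& \frac{0.03^2}{(\log x)^2}\, F_1(1-c, 1-c) \cdot C_{v,1-c,1-c}\\
    &+ 2\cdot\frac{\sqrt2}{\sqrt x}\,\frac{0.03}{\log x}\, F_1(1-c, 1/2) \cdot C_{v,1-c,1/2}\\
    &+ \frac1x (111.536 + 55.768 \log x) \cdot C_{v,\frac12+\epsilon,\frac12+\epsilon}.
    \end{aligned}
    \]
    For $v = 1$, this gives
    \[
    \begin{aligned}
    |g_1(x)| &\le \frac{0.0044325}{(\log x)^2} + \frac{2.1626}{\sqrt{x}\log x} + \frac1x(111.536 + 55.768\log x)\\
    &\le \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt x};
    \end{aligned}
    \]
    for $v = 2$, we obtain
    \[
    \begin{aligned}
    |g_2(x)| &\le \frac{0.038128}{(\log x)^2} + \frac{25.607}{\sqrt{x}\log x} + \frac1x(1634.34 + 817.168\log x)\\
    &\le \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt x}.
    \end{aligned}
    \]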
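The numerical constants in the proof of Lemma 5.1.2 for $v = 1$ can be re-derived with a few lines of floating-point arithmetic. This is a rough sketch under stated assumptions, not the interval-arithmetic computation used in the text: `zeta` is a hypothetical helper (a partial sum plus an Euler-Maclaurin tail), and the decimal points in the constants are read as above (55.768, 14.652983, 2.1626, 0.1079).

```python
import math

def zeta(s, cutoff=1000):
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail."""
    head = sum(n ** -s for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1) + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1) / 12)
    return head + tail

def constant_527():
    """zeta(3/2)^4 zeta(2)^2 / (zeta(3)^4 zeta(4)), the constant in (5.27)."""
    return zeta(1.5) ** 4 * zeta(2) ** 2 / (zeta(3) ** 4 * zeta(4))

def c2_bound(eps):
    """The displayed upper bound for C_{2,1/2+eps,1/2+eps}."""
    num = ((1 - 2 ** (-1.5 - eps)) ** 2 * (1 + 2 ** -1.5) ** 2
           * (1 + 2 ** -2) ** 2 * (1 - 2 ** (-1 - 2 * eps)))
    den = (1 - 2 ** -0.5) ** 2 * (1 - 2 ** -3) ** 4 * (1 - 2 ** -4)
    return num / den

def g1_tail_times_sqrt_x(x):
    """sqrt(x) * (2.1626/(sqrt(x) log x) + (111.536 + 55.768 log x)/x)."""
    lx = math.log(x)
    return 2.1626 / lx + (111.536 + 55.768 * lx) / math.sqrt(x)

if __name__ == "__main__":
    print(constant_527())                   # compare with 55.768
    print(c2_bound(1 / math.log(1e6)))      # compare with 14.652983
    print(g1_tail_times_sqrt_x(1e10))       # compare with 0.1079
```

The last quantity is decreasing in $x$ on $x \ge 10^{10}$, so its value at $x = 10^{10}$ is the relevant one for the final inequality in the case $v = 1$.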

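    5.1.3 Estimating the triple sum

    We will now be able to bound the triple sum in (5.19), viz.,
    \[
    \sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 g_v(uS/s)\,du, \tag{5.28}
    \]
    where $g_v$ is as in Lemma 5.1.2.

    As we will soon see, Lemma 5.1.2 implies that (5.28) is bounded by a constant (essentially because the integral $\int_0^{1/2} 1/(t(\log t)^2)\,dt$ converges). We must give as good a constant as we can, since it will affect the largest term in the final result.

    Clearly, $g_v(R) = g_v(\lfloor R\rfloor)$. The contribution of each $g_v(m)$, $1 \le m \le S$, to (5.28) is exactly $g_v(m)$ times
    \[
    \begin{aligned}
    &\sum_{\substack{\frac{S}{m+1}<s\le \frac{S}{m}\\ (s,v)=1}} \frac1s \int_{ms/S}^1 1\,du + \sum_{\substack{\frac{S}{2m}<s\le \frac{S}{m+1}\\ (s,v)=1}} \frac1s \int_{ms/S}^{(m+1)s/S} 1\,du + \sum_{\substack{\frac{S}{2(m+1)}<s\le \frac{S}{2m}\\ (s,v)=1}} \frac1s \int_{1/2}^{(m+1)s/S} du\\
    &\quad= \sum_{\substack{\frac{S}{m+1}<s\le \frac{S}{m}\\ (s,v)=1}} \left(\frac1s - \frac mS\right) + \sum_{\substack{\frac{S}{2m}<s\le \frac{S}{m+1}\\ (s,v)=1}} \frac1S + \sum_{\substack{\frac{S}{2(m+1)}<s\le \frac{S}{2m}\\ (s,v)=1}} \left(\frac{m+1}{S} - \frac{1}{2s}\right).
    \end{aligned} \tag{5.29}
    \]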
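The decomposition (5.29) can be tested against a direct evaluation of (5.28). The sketch below is an illustration, not the author's program: since $g_v$ is a step function, the integral in (5.28) can be computed exactly with rational arithmetic, and the result compared with the weighted sum $\sum_m g_v(m)\cdot(\ldots)$ from (5.29). It assumes $S$ is given as a `Fraction`; all function names are hypothetical.

```python
from fractions import Fraction
from math import floor, gcd

def weights(limit, v):
    """w[r] = mu(r)/sigma(r) if r is square-free and coprime to v, else 0."""
    w = [Fraction(0)] * (limit + 1)
    for r in range(1, limit + 1):
        if gcd(r, v) != 1:
            continue
        sign, sigma, squarefree = 1, 1, True
        m, p = r, 2
        while m > 1:
            if m % p == 0:
                m //= p
                if m % p == 0:
                    squarefree = False
                    break
                sign, sigma = -sign, sigma * (p + 1)
            p += 1
        if squarefree:
            w[r] = Fraction(sign, sigma)
    return w

def g_table(limit, v):
    """g[m] = g_v(m) for 0 <= m <= limit, built up one m at a time."""
    w = weights(limit, v)
    g = [Fraction(0)] * (limit + 1)
    for m in range(1, limit + 1):
        partners = sum((w[r] for r in range(1, m) if gcd(r, m) == 1), Fraction(0))
        g[m] = g[m - 1] + 2 * w[m] * partners + (w[m] ** 2 if m == 1 else 0)
    return g

def triple_sum_direct(S, v, g):
    """(5.28), integrating the step function u -> g_v(floor(uS/s)) exactly."""
    total = Fraction(0)
    for s in range(1, floor(S) + 1):
        if gcd(s, v) != 1:
            continue
        lo, hi = S / (2 * s), S / s              # range of t = uS/s
        integral = Fraction(0)
        for m in range(floor(lo), floor(hi) + 1):
            length = min(hi, Fraction(m + 1)) - max(lo, Fraction(m))
            if length > 0:
                integral += g[m] * length
        total += integral / S                    # (1/s) * (s/S) * integral
    return total

def triple_sum_via_529(S, v, g):
    """(5.28) as the sum over m of g_v(m) times the right side of (5.29)."""
    total = Fraction(0)
    for m in range(1, floor(S) + 1):
        weight = Fraction(0)
        for s in range(1, floor(S / m) + 1):
            if gcd(s, v) != 1:
                continue
            if S / (m + 1) < s:                  # S/(m+1) < s <= S/m
                weight += Fraction(1, s) - m / S
            elif S / (2 * m) < s:                # S/2m < s <= S/(m+1)
                weight += 1 / S
            elif S / (2 * (m + 1)) < s:          # S/2(m+1) < s <= S/2m
                weight += (m + 1) / S - Fraction(1, 2 * s)
        total += g[m] * weight
    return total

if __name__ == "__main__":
    g = g_table(30, 1)
    S = Fraction(25, 2)
    print(triple_sum_direct(S, 1, g), triple_sum_via_529(S, 1, g))
```

For $1 \le S < 2$ only $s = 1$ and $m = 1$ contribute, and both routines return $1 - 1/S$.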

    Write $f(t) = 1/S$ for $S/2m < t \le S/(m+1)$, $f(t) = 0$ for $t > S/m$ or $t < S/2(m+1)$, $f(t) = 1/t - m/S$ for $S/(m+1) < t \le S/m$ and $f(t) = (m+1)/S - 1/2t$ for $S/2(m+1) < t \le S/2m$; then (5.29) equals $\sum_{n:(n,v)=1} f(n)$. By Euler-Maclaurin (second order),
    \[
    \begin{aligned}
    \sum_n f(n) &= \int_{-\infty}^\infty f(x) - \frac12 B_2(x) f''(x)\,dx = \int_{-\infty}^\infty f(x) + O^*\left(\frac{1}{12}|f''(x)|\right) dx\\
    &= \int_{-\infty}^\infty f(x)\,dx + \frac16\cdot O^*\left(\left|f'\left(\frac{S}{2m}\right)\right| + \left|f'\left(\frac{S}{m+1}\right)\right|\right)\\
    &= \frac12\log\left(1+\frac1m\right) + \frac16\cdot O^*\left(\left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2\right).
    \end{aligned} \tag{5.30}
    \]

    Similarly,
    \[
    \begin{aligned}
    \sum_{n\ \text{odd}} f(n) &= \int_{-\infty}^\infty f(2x+1) - \frac12 B_2(x)\,\frac{d^2 f(2x+1)}{dx^2}\,dx\\
    &= \frac12\int_{-\infty}^\infty f(x)\,dx - 2\int_{-\infty}^\infty \frac12 B_2\left(\frac{x-1}{2}\right) f''(x)\,dx\\
    &= \frac12\int_{-\infty}^\infty f(x)\,dx + \frac16\int_{-\infty}^\infty O^*\left(|f''(x)|\right) dx\\
    &= \frac14\log\left(1+\frac1m\right) + \frac13\cdot O^*\left(\left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2\right).
    \end{aligned}
    \]
    We use these expressions for $m \le C_0$, where $C_0 \ge 33$ is a constant to be computed later; they will give us the main term. For $m > C_0$, we use the bounds on $|g(m)|$ that Lemma 5.1.2 gives us.

    (Starting now and for the rest of the paper, we will focus on the cases $v = 1$, $v = 2$ when giving explicit computational estimates. All of our procedures would allow higher values of $v$ as well, but, as will become clear much later, the gains from higher values of $v$ are offset by losses and complications elsewhere.)

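    Let us estimate (5.28). Let
    \[
    \begin{array}{ll}
    c_{v,0} = \begin{cases} 1/6 & \text{if } v = 1,\\ 1/3 & \text{if } v = 2, \end{cases} & c_{v,1} = \begin{cases} 1 & \text{if } v = 1,\\ 2.5 & \text{if } v = 2, \end{cases}\\[3ex]
    c_{v,2} = \begin{cases} 55.768 & \text{if } v = 1,\\ 817.168 & \text{if } v = 2, \end{cases} & c_{v,3} = \begin{cases} 111.536 & \text{if } v = 1,\\ 1634.34 & \text{if } v = 2, \end{cases}\\[3ex]
    c_{v,4} = \begin{cases} 0.0044325 & \text{if } v = 1,\\ 0.038128 & \text{if } v = 2, \end{cases} & c_{v,5} = \begin{cases} 0.1079 & \text{if } v = 1,\\ 0.2046 & \text{if } v = 2. \end{cases}
    \end{array}
    \]
    Then (5.28) equals
    \[
    \begin{aligned}
    &\sum_{m\le C_0} g_v(m)\cdot\left(\frac{\phi(v)}{2v}\log\left(1+\frac1m\right) + O^*\left(c_{v,0}\,\frac{5m^2+2m+1}{S^2}\right)\right)\\
    &+ \sum_{S/10^6\le s<S/C_0} \frac1s \int_{1/2}^1 O^*\left(\frac{c_{v,1}}{uS/s}\right) du\\
    &+ \sum_{S/10^{10}\le s<S/10^6} \frac1s \int_{1/2}^1 O^*\left(\frac{c_{v,2}\log(uS/s) + c_{v,3}}{uS/s}\right) du\\
    &+ \sum_{s<S/10^{10}} \frac1s \int_{1/2}^1 O^*\left(\frac{c_{v,4}}{(\log uS/s)^2} + \frac{c_{v,5}}{\sqrt{uS/s}}\right) du,
    \end{aligned}
    \]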

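    which is
    \[
    \begin{aligned}
    &\sum_{m\le C_0} g_v(m)\cdot\frac{\phi(v)}{2v}\log\left(1+\frac1m\right) + \sum_{m\le C_0} |g(m)|\cdot O^*\left(c_{v,0}\,\frac{5m^2+2m+1}{S^2}\right)\\
    &+ O^*\left(c_{v,1}\,\frac{\log 2}{C_0} + \frac{\log 2}{10^6}\left(c_{v,3} + c_{v,2}(1+\log 10^6)\right) + \frac{2-\sqrt2}{10^{10/2}}\,c_{v,5}\right)\\
    &+ O^*\left(\sum_{s<S/10^{10}} \frac{c_{v,4}/2}{s(\log S/2s)^2}\right)
    \end{aligned}
    \]
    for $S \ge (C_0+1)$. Note that $\sum_{s<S/10^{10}} \frac{1}{s(\log S/2s)^2} = \int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt$. Now
    \[
    \frac{c_{v,4}}{2}\int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt = \frac{c_{v,4}/2}{\log(10^{10}/2)} = \begin{cases} 0.00009923 & \text{if } v = 1,\\ 0.000853636 & \text{if } v = 2 \end{cases}
    \]
    and
    \[
    \frac{\log 2}{10^6}\left(c_{v,3} + c_{v,2}(1+\log 10^6)\right) + \frac{2-\sqrt2}{10^5}\,c_{v,5} = \begin{cases} 0.0006506 & \text{if } v = 1,\\ 0.009525 & \text{if } v = 2. \end{cases}
    \]
    For $C_0 = 10000$,
    \[
    \frac{\phi(v)}{v}\,\frac12 \sum_{m\le C_0} g_v(m)\cdot\log\left(1+\frac1m\right) = \begin{cases} 0.362482 & \text{if } v = 1,\\ 0.360576 & \text{if } v = 2, \end{cases}
    \]
    \[
    c_{v,0}\sum_{m\le C_0} |g_v(m)|(5m^2+2m+1) \le \begin{cases} 6204066.5 & \text{if } v = 1,\\ 15911340.1 & \text{if } v = 2, \end{cases}
    \]
    and
    \[
    c_{v,1}\cdot(\log 2)/C_0 = \begin{cases} 0.00006931 & \text{if } v = 1,\\ 0.00017328 & \text{if } v = 2. \end{cases}
    \]
    Thus, for $S \ge 100000$,
    \[
    \sum_{\substack{s\le S\\ (s,v)=1}} \frac1s\int_{1/2}^1 g_v(uS/s)\,du \le \begin{cases} 0.36393 & \text{if } v = 1,\\ 0.37273 & \text{if } v = 2. \end{cases} \tag{5.31}
    \]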
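The arithmetic leading to (5.31) can be retraced as follows. This is only a bookkeeping sketch: the two sums over $m \le C_0$ are taken from the text as given numbers (they are not recomputed here), the decimal points are as reconstructed above, and the dictionary keys are illustrative names.

```python
import math

C0 = 10_000
S_MIN = 100_000

# Constants c_{v,1..5} and the two tabulated sums over m <= C0, per v.
DATA = {
    1: dict(c1=1.0, c2=55.768, c3=111.536, c4=0.0044325, c5=0.1079,
            main=0.362482, weighted=6204066.5),
    2: dict(c1=2.5, c2=817.168, c3=1634.34, c4=0.038128, c5=0.2046,
            main=0.360576, weighted=15911340.1),
}

def middle_terms(c):
    """Contribution of the ranges S/10^10 <= s < S/C0 (and the sqrt term)."""
    first = c["c1"] * math.log(2) / C0
    second = math.log(2) / 1e6 * (c["c3"] + c["c2"] * (1 + math.log(1e6)))
    third = (2 - math.sqrt(2)) / 1e5 * c["c5"]
    return first, second + third

def far_term(c):
    """(c_{v,4}/2) / log(10^10/2), from the integral of 1/(t (log t)^2)."""
    return c["c4"] / 2 / math.log(1e10 / 2)

def total_bound(v, S=S_MIN):
    """Sum of all pieces bounding (5.28) for the given S >= C0 + 1."""
    c = DATA[v]
    first, rest = middle_terms(c)
    return c["main"] + c["weighted"] / S ** 2 + first + rest + far_term(c)

if __name__ == "__main__":
    for v in (1, 2):
        print(v, total_bound(v))
```

Since the only $S$-dependent piece is `weighted / S**2`, the total is largest at $S = 100000$, which is why (5.31) is stated for $S \ge 100000$.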

    For $S < 100000$, we proceed as above, but using the exact expression (5.29) instead of (5.30). Note (5.29) is of the form $f_{s,m,1}(S) + f_{s,m,2}(S)/S$, where both $f_{s,m,1}(S)$ and $f_{s,m,2}(S)$ depend only on $\lfloor S\rfloor$ (and on $s$ and $m$). Summing over $m \le S$, we obtain a bound of the form
    \[
    \sum_{\substack{s\le S\\ (s,v)=1}} \frac1s\int_{1/2}^1 g_v(uS/s)\,du \le G_v(S)
    \]
    with
    \[
    G_v(S) = K_{v,1}(\lfloor S\rfloor) + K_{v,2}(\lfloor S\rfloor)/S,
    \]
    where $K_{v,1}(n)$ and $K_{v,2}(n)$ can be computed explicitly for each integer $n$. (For example, $G_v(S) = 1 - 1/S$ for $1 \le S < 2$ and $G_v(S) = 0$ for $S < 1$.)

    It is easy to check numerically that this implies that (5.31) holds not just for $S \ge 100000$ but also for $40 \le S < 100000$ (if $v = 1$) or $16 \le S < 100000$ (if $v = 2$). Using the fact that $G_v(S)$ is non-negative, we can compare $\int_1^T G_v(S)\,dS/S$ with $\log(T + 1/N)$ for each $T \in [2, 40] \cap \frac1N\mathbb{Z}$ ($N$ a large integer) to show, again numerically, that
    \[
    \int_1^T G_v(S)\,\frac{dS}{S} \le \begin{cases} 0.3698\log T & \text{if } v = 1,\\ 0.37273\log T & \text{if } v = 2. \end{cases} \tag{5.32}
    \]
    (We use $N = 100000$ for $v = 1$; already $N = 10$ gives us the answer above for $v = 2$. Indeed, computations suggest the better bound 0.358 instead of 0.37273; we are committed to using 0.37273 because of (5.31).)

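    Multiplying by $6v/\pi^2\sigma(v)$, we conclude that
    \[
    S_1(U,W) = \frac{x}{W}\cdot H_1\left(\frac{x}{WU}\right) + O^*\left(5.08\,\zeta(3/2)^3\,\frac{x^{3/2}}{W^{3/2}U}\right) \tag{5.33}
    \]
    if $v = 1$,
    \[
    S_1(U,W) = \frac{x}{W}\cdot H_2\left(\frac{x}{WU}\right) + O^*\left(1.27\,\zeta(3/2)^3\,\frac{x^{3/2}}{W^{3/2}U}\right) \tag{5.34}
    \]
    if $v = 2$, where
    \[
    H_1(S) = \begin{cases} \frac{6}{\pi^2}G_1(S) & \text{if } 1 \le S < 40,\\ 0.22125 & \text{if } S \ge 40, \end{cases} \qquad H_2(S) = \begin{cases} \frac{4}{\pi^2}G_2(S) & \text{if } 1 \le S < 16,\\ 0.15107 & \text{if } S \ge 16. \end{cases} \tag{5.35}
    \]
    Hence (by (5.32))
    \[
    \int_1^T H_v(S)\,\frac{dS}{S} \le \begin{cases} 0.22482\log T & \text{if } v = 1,\\ 0.15107\log T & \text{if } v = 2; \end{cases} \tag{5.36}
    \]
    moreover,
    \[
    H_1(S) \le \frac{3}{\pi^2}, \qquad H_2(S) \le \frac{2}{\pi^2} \tag{5.37}
    \]
    for all $S$.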

    Note. There is another way to obtain cancellation on $\mu$, applicable when $(x/W) > Uq$ (as is unfortunately never the case in our main application). For this alternative to be taken, one must either apply Cauchy-Schwarz on $n$ rather than $m$ (resulting in exponential sums over $m$) or lump together all $m$ near each other and in the same congruence class modulo $q$ before applying Cauchy-Schwarz on $m$ (one can indeed do this if $\delta$ is small). We could then write
    \[
    \sum_{\substack{m\sim W\\ m\equiv r \bmod q}}\;\sum_{\substack{d|m\\ d>U}} \mu(d) = -\sum_{\substack{m\sim W\\ m\equiv r \bmod q}}\;\sum_{\substack{d|m\\ d\le U}} \mu(d) = -\sum_{d\le U} \mu(d)\left(W/qd + O(1)\right)
    \]
    and obtain cancellation on $d$. If $Uq \ge (x/W)$, however, the error term dominates.

    5.2 The sum $S_2$: the large sieve, primes and tails

    We must now bound
    \[
    S_2(U',W',W) = \sum_{\substack{U'<m\le \frac{x}{W}\\ (m,v)=1}} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \tag{5.38}
    \]
    for $U' = \max(U, x/2W)$, $W' = \max(V, W/2)$. (The condition $(p, v) = 1$ will be fulfilled automatically by the assumption $V > v$.)

    From a modern perspective, this is clearly a case for a large sieve. It is also clear that we ought to try to apply a large sieve for sequences of prime support. What is subtler here is how to do things well for very large $q$ (i.e., $x/q$ small). This is in some sense a dual problem to that of $q$ small, but it poses additional complications; for example, it is not obvious how to take advantage of prime support for very large $q$.

    As in type I, we avoid this entire issue by forbidding $q$ large and then taking advantage of the error term $\delta/x$ in the approximation $\alpha = \frac{a}{q} + \frac{\delta}{x}$. This is one of the main innovations here. Note this alternative method will allow us to take advantage of prime support.

    A key situation to study is that of frequencies $\alpha_i$ clustering around given rationals $a/q$ while nevertheless keeping at a certain small distance from each other.

    Lemma 5.2.1. Let $q \ge 1$. Let $\alpha_1, \alpha_2, \ldots, \alpha_k \in \mathbb{R}/\mathbb{Z}$ be of the form $\alpha_i = a_i/q + \upsilon_i$, $0 \le a_i < q$, where the elements $\upsilon_i \in \mathbb{R}$ all lie in an interval of length $\upsilon > 0$, and where $a_i = a_j$ implies $|\upsilon_i - \upsilon_j| > \nu > 0$. Assume $\nu + \upsilon \le 1/q$. Then, for any $W, W' \ge 1$, $W' \ge W/2$,
    \[
    \sum_{i=1}^k \left|\sum_{W'<p\le W} (\log p)\,e(\alpha_i p)\right|^2 \le \min\left(1, \frac{2q}{\phi(q)}\,\frac{1}{\log\left((q(\nu+\upsilon))^{-1}\right)}\right) \cdot \left(W - W' + \nu^{-1}\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.39}
    \]

    Proof. For any distinct $i$, $j$, the angles $\alpha_i$, $\alpha_j$ are separated by at least $\nu$ (if $a_i = a_j$) or at least $1/q - |\upsilon_i - \upsilon_j| \ge 1/q - \upsilon \ge \nu$ (if $a_i \ne a_j$). Hence we can apply the large sieve (in the optimal $N + \delta^{-1} - 1$ form due to Selberg [Sel91] and Montgomery-Vaughan [MV74]) and obtain the bound in (5.39) with 1 instead of $\min(1, \ldots)$ immediately.
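To make the configuration in Lemma 5.2.1 concrete, here is a small floating-point experiment. It is a sketch with arbitrarily chosen parameters ($q = 7$, $\nu = 0.002$, $\upsilon = 0.01$, $W' = 200$, $W = 400$, all of them illustrative and not taken from the text): it builds the angles $a/q + \upsilon_i$, confirms their pairwise separation on $\mathbb{R}/\mathbb{Z}$, and compares the left side of (5.39) with the plain large-sieve bound $(W - W' + \nu^{-1})\sum (\log p)^2$.

```python
import cmath
import math

def primes_in(lo, hi):
    """Primes p with lo < p <= hi, by trial division."""
    return [p for p in range(max(lo + 1, 2), hi + 1)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]

def prime_sum(alpha, primes):
    """sum over p of (log p) e(alpha p), with e(t) = exp(2 pi i t)."""
    return sum(math.log(p) * cmath.exp(2j * math.pi * alpha * p) for p in primes)

def circle_distance(a, b):
    """Distance between a and b on R/Z."""
    t = (a - b) % 1.0
    return min(t, 1.0 - t)

def experiment(q=7, nu=0.002, upsilon=0.01, w_lo=200, w_hi=400):
    # Offsets in an interval of length upsilon, spaced by more than nu.
    offsets = [0.0, 0.0025, 0.005, 0.0075, 0.01]
    assert nu + upsilon <= 1 / q and max(offsets) - min(offsets) <= upsilon
    angles = [a / q + off for a in range(q) for off in offsets]
    separation = min(circle_distance(s, t)
                     for i, s in enumerate(angles) for t in angles[i + 1:])
    primes = primes_in(w_lo, w_hi)
    lhs = sum(abs(prime_sum(alpha, primes)) ** 2 for alpha in angles)
    rhs = (w_hi - w_lo + 1 / nu) * sum(math.log(p) ** 2 for p in primes)
    return separation, lhs, rhs

if __name__ == "__main__":
    separation, lhs, rhs = experiment()
    print(separation, lhs, rhs)
```

In an example this small the inequality holds with a lot of room; the point is only to show which quantities enter (5.39).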


    We can also apply Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]). This gives us that the left side of (5.39) is at most
    \[
    \left(\sum_{\substack{r\le R\\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)}\right)^{-1} \sum_{\substack{r\le R\\ (r,q)=1}}\;\sum_{\substack{a' \bmod r\\ (a',r)=1}}\;\sum_{i=1}^k \left|\sum_{W'<p\le W} (\log p)\,e\left((\alpha_i + a'/r)p\right)\right|^2. \tag{5.40}
    \]
    If we add all possible fractions of the form $a'/r$, $r \le R$, $(r, q) = 1$, to the fractions $a_i/q$, we obtain fractions that are separated by at least $1/qR^2$. If $\nu + \upsilon \le 1/qR^2$, then the resulting angles $\alpha_i + a'/r$ are still separated by at least $\nu$. Thus we can apply the large sieve to (5.40); setting $R = 1/\sqrt{(\nu+\upsilon)q}$, we see that we gain a factor of
    \[
    \sum_{\substack{r\le R\\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q}\sum_{r\le R} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q}\sum_{d\le R} \frac1d \ge \frac{\phi(q)}{2q}\log\left((q(\nu+\upsilon))^{-1}\right), \tag{5.41}
    \]
    since $\sum_{d\le R} 1/d \ge \log(R)$ for all $R \ge 1$ (integer or not).
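The chain of inequalities (5.41) is easy to tabulate for sample values. The sketch below is illustrative only (function names and the sample values of $q$ and $\nu + \upsilon$ are assumptions): it returns the four quantities in (5.41), in order, for $R = 1/\sqrt{(\nu+\upsilon)q}$.

```python
import math
from math import gcd

def euler_phi(n):
    """Euler's totient by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_squarefree(n):
    return all(n % (d * d) for d in range(2, math.isqrt(n) + 1))

def sieve_sum(R, q=1):
    """Sum of mu(r)^2/phi(r) over r <= R coprime to q (R need not be an integer)."""
    return sum(1 / euler_phi(r) for r in range(1, int(R) + 1)
               if is_squarefree(r) and gcd(r, q) == 1)

def harmonic(R):
    """Sum of 1/d over d <= R."""
    return sum(1 / d for d in range(1, int(R) + 1))

def chain_541(q, nu_plus_upsilon):
    """The four quantities in (5.41), left to right."""
    R = 1 / math.sqrt(nu_plus_upsilon * q)
    ratio = euler_phi(q) / q
    return (sieve_sum(R, q),
            ratio * sieve_sum(R),
            ratio * harmonic(R),
            ratio / 2 * math.log(1 / (q * nu_plus_upsilon)))

if __name__ == "__main__":
    for q in (1, 6, 30):
        print(q, chain_541(q, 1e-4))
```

Each tuple should be non-increasing from left to right; the last entry is $(\phi(q)/q)\log R$ written in terms of $q(\nu+\upsilon)$.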

    Let us first give a bound on sums of the type of $S_2(U, V, W)$ using prime support but not the error terms (or Lemma 5.2.1). This is something that can be done very well using tools available in the literature. (Not all of these tools seem to be known as widely as they should be.) Bounds (5.42) and (5.44) are completely standard large-sieve bounds. To obtain the gain of a factor of $\log$ in (5.43), we use a lemma of Montgomery's, for whose modern proof (containing an improvement by Huxley) we refer to the standard source [IK04, Lemma 7.15]. The purpose of Montgomery's lemma is precisely to gain a factor of $\log$ in applications of the large sieve to sequences supported on the primes. To use the lemma efficiently, we apply Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], rather than more common forms of the large sieve. (The idea, used in [MV73] to prove an improved version of the Brun-Titchmarsh inequality, is that Farey fractions (rationals with bounded denominator) are not equidistributed; this fact can be exploited if a large sieve with weights is used.)

    Lemma 5.2.2. Let $W \ge 1$, $W' \ge W/2$. Let $\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then
    \[
    \sum_{A_0<m\le A_1} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(q, \lceil Q/2\rceil)} \right\rceil \cdot (W - W' + 2q) \sum_{W'<p\le W} (\log p)^2. \tag{5.42}
    \]
    If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:
    \[
    \sum_{A_0<m\le A_1} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{q} \right\rceil \cdot \frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)} \cdot \sum_{W'<p\le W} (\log p)^2. \tag{5.43}
    \]
    If $A_1 - A_0 \le q$ and $q \le \rho Q$, $\rho \in [0, 1]$, the following bound also holds:
    \[
    \sum_{A_0<m\le A_1} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left(W - W' + \frac{q}{1-\rho}\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.44}
    \]

    Proof. Let $k = \min(q, \lceil Q/2\rceil) \ge \lceil q/2\rceil$. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k\rceil$ blocks of at most $k$ consecutive integers $m_0 + 1, m_0 + 2, \ldots$ For $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of at least
    \[
    |(a/q)(m - m')| - O^*(k/qQ) = 1/q - O^*(1/2q) \ge 1/2q.
    \]
    By the large sieve,
    \[
    \sum_{a=1}^q \left|\sum_{W'<p\le W} (\log p)\,e(\alpha(m_0 + a)p)\right|^2 \le \left((W - W') + 2q\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.45}
    \]
    We obtain (5.42) by summing over all $\lceil (A_1 - A_0)/k\rceil$ blocks.

    If $A_1 - A_0 \le |q|$ and $q \le \rho Q$, $\rho \in [0, 1]$, we obtain (5.44) simply by applying the large sieve without splitting the interval $A_0 < m \le A_1$.

    Let us now prove (5.43). We will use Montgomery's inequality, followed by Montgomery and Vaughan's large sieve with weights. An angle $a/q + a_1'/r_1$ is separated from other angles $a'/q + a_2'/r_2$ ($r_1, r_2 \le R$, $(a_i, r_i) = 1$) by at least $1/qr_1R$, rather than just $1/qR^2$. We will choose $R$ so that $qR^2 < Q$; this implies $1/Q < 1/qR^2 \le 1/qr_1R$.

    By a lemma of Montgomery's [IK04, Lemma 7.15], applied (for each $1 \le a \le q$) to $S(\alpha) = \sum_n a_n e(\alpha n)$ with $a_n = \log(n)\,e(\alpha(m_0 + a)n)$ if $n$ is prime and $a_n = 0$ otherwise,
    \[
    \frac{1}{\phi(r)} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha(m_0 + a)p)\right|^2 \le \sum_{\substack{a' \bmod r\\ (a',r)=1}} \left|\sum_{W'<p\le W} (\log p)\,e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right)p\right)\right|^2 \tag{5.46}
    \]

    for each square-free $r \le W'$. We multiply both sides of (5.46) by
    \[
    \left(\frac W2 + \frac32\left(\frac{1}{qrR} - \frac1Q\right)^{-1}\right)^{-1}
    \]
    and sum over all $a = 0, 1, \ldots, q-1$ and all square-free $r \le R$ coprime to $q$; we will later make sure that $R \le W'$. We obtain that
    \[
    \sum_{\substack{r\le R\\ (r,q)=1}} \left(\frac W2 + \frac32\left(\frac{1}{qrR} - \frac1Q\right)^{-1}\right)^{-1} \frac{\mu(r)^2}{\phi(r)} \cdot \sum_{a=1}^q \left|\sum_{W'<p\le W} (\log p)\,e(\alpha(m_0 + a)p)\right|^2 \tag{5.47}
    \]
    is at most
    \[
    \sum_{\substack{r\le R\\ (r,q)=1\\ r\ \text{sq-free}}} \left(\frac W2 + \frac32\left(\frac{1}{qrR} - \frac1Q\right)^{-1}\right)^{-1} \sum_{a=1}^q\;\sum_{\substack{a' \bmod r\\ (a',r)=1}} \left|\sum_{W'<p\le W} (\log p)\,e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right)p\right)\right|^2. \tag{5.48}
    \]
    We now apply the large sieve with weights [MV73, (1.6)], recalling that each angle $\alpha(m_0 + a) + a'/r$ is separated from the others by at least $1/qrR - 1/Q$; we obtain that (5.48) is at most $\sum_{W'<p\le W} (\log p)^2$. It remains to estimate the sum in the first line of (5.47). (We are following here a procedure analogous to that used in [MV73] to prove the Brun-Titchmarsh theorem.)

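    Assume first that $q \le W/13.5$. Set
    \[
    R = \left(\sigma\,\frac Wq\right)^{1/2}, \tag{5.49}
    \]
    where $\sigma = 1/(2e^{2\cdot 0.25068}) = 0.30285\ldots$ It is clear that $qR^2 < Q$, $q < W'$ and $R \ge 2$. Moreover, for $r \le R$,
    \[
    \frac1Q \le \frac{1}{3.5W} \le \frac{\sigma}{3.5}\,\frac{1}{\sigma W} = \frac{\sigma}{3.5}\,\frac{1}{qR^2} \le \frac{\sigma/3.5}{qrR}.
    \]
    Hence
    \[
    \begin{aligned}
    \frac W2 + \frac32\left(\frac{1}{qrR} - \frac1Q\right)^{-1} &\le \frac W2 + \frac32\,\frac{qrR}{1 - \sigma/3.5} = \frac W2 + \frac{3r}{2\left(1 - \frac{\sigma}{3.5}\right)R}\cdot 2\sigma\,\frac W2\\
    &= \frac W2\left(1 + \frac{3\sigma}{1 - \sigma/3.5}\,rR^{-1}\right) < \frac W2\left(1 + rR^{-1}\right)
    \end{aligned}
    \]
    and so
    \[
    \begin{aligned}
    \sum_{\substack{r\le R\\ (r,q)=1}} \left(\frac W2 + \frac32\left(\frac{1}{qrR} - \frac1Q\right)^{-1}\right)^{-1} \frac{\mu(r)^2}{\phi(r)} &\ge \frac2W \sum_{\substack{r\le R\\ (r,q)=1}} (1 + rR^{-1})^{-1}\,\frac{\mu(r)^2}{\phi(r)}\\
    &\ge \frac2W\,\frac{\phi(q)}{q} \sum_{r\le R} (1 + rR^{-1})^{-1}\,\frac{\mu(r)^2}{\phi(r)}.
    \end{aligned}
    \]
    For $R \ge 2$,
    \[
    \sum_{r\le R} (1 + rR^{-1})^{-1}\,\frac{\mu(r)^2}{\phi(r)} > \log R + 0.25068;
    \]
    this is true for $R \ge 100$ by [MV73, Lemma 8] and easily verifiable numerically for $2 \le R < 100$. (It suffices to verify this for $R$ integer with $r < R$ instead of $r \le R$, as that is the worst case.)

    Now
    \[
    \log R = \frac12\left(\log\frac{W}{2q} + \log 2\sigma\right) = \frac12\log\frac{W}{2q} - 0.25068.
    \]
    Hence
    \[
    \sum_{r\le R} (1 + rR^{-1})^{-1}\,\frac{\mu(r)^2}{\phi(r)} > \frac12\log\frac{W}{2q}
    \]
    and the statement follows.

    Now consider the case $q > W/13.5$. If $q$ is even, then, in this range, inequality (5.42) is always better than (5.43), and so we are done. Assume, then, that $W/13.5 < q \le W/2$ and $q$ is odd. We set $R = 2$; clearly $qR^2 < W \le Q$ and $q < W/2 \le W'$, and so this choice of $R$ is valid. It remains to check that
    \[
    \frac{1}{\frac W2 + \frac32\left(\frac{1}{2q} - \frac1Q\right)^{-1}} + \frac{1}{\frac W2 + \frac32\left(\frac{1}{4q} - \frac1Q\right)^{-1}} \ge \frac1W\log\frac{W}{2q}.
    \]
    This follows because
    \[
    \frac{1}{\frac12 + \frac32\left(\frac t2 - \frac{1}{3.5}\right)^{-1}} + \frac{1}{\frac12 + \frac32\left(\frac t4 - \frac{1}{3.5}\right)^{-1}} \ge \log\frac t2
    \]
    for all $2 \le t \le 13.5$.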
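The last one-variable inequality, and the value of $\sigma$ in (5.49), can be checked on a grid in a few lines. This is a floating-point sketch (a grid search, not a proof; the step size and names are arbitrary choices), with $t$ playing the role of $W/q$.

```python
import math

def sigma_constant(c=0.25068):
    """sigma = 1/(2 exp(2c)), as in (5.49)."""
    return 1 / (2 * math.exp(2 * c))

def two_term_sum(t):
    """Left side of the final inequality, with t standing for W/q."""
    first = 1 / (0.5 + 1.5 / (t / 2 - 1 / 3.5))
    second = 1 / (0.5 + 1.5 / (t / 4 - 1 / 3.5))
    return first + second

def smallest_margin(steps=11500):
    """min over a grid of 2 <= t <= 13.5 of (left side - log(t/2))."""
    best = float("inf")
    for i in range(steps + 1):
        t = 2 + (13.5 - 2) * i / steps
        best = min(best, two_term_sum(t) - math.log(t / 2))
    return best

if __name__ == "__main__":
    s = sigma_constant()
    print(s, 3 * s / (1 - s / 3.5))   # second value must be below 1
    print(smallest_margin())
```

The margin stays well away from zero over the whole interval, so the inequality is not delicate in this range.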

    We need a version of Lemma 5.2.2 with $m$ restricted to the odd numbers, since we plan to set the parameter $v$ equal to 2.

    Lemma 5.2.3. Let $W \ge 1$, $W' \ge W/2$. Let $2\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then
    \[
    \sum_{\substack{A_0<m\le A_1\\ m\ \text{odd}}} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(2q, Q)} \right\rceil \cdot (W - W' + 2q) \sum_{W'<p\le W} (\log p)^2. \tag{5.50}
    \]
    If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:
    \[
    \sum_{\substack{A_0<m\le A_1\\ m\ \text{odd}}} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{2q} \right\rceil \cdot \frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)} \cdot \sum_{W'<p\le W} (\log p)^2. \tag{5.51}
    \]
    If $A_1 - A_0 \le 2q$ and $q \le \rho Q$, $\rho \in [0, 1]$, the following bound also holds:
    \[
    \sum_{A_0<m\le A_1} \left|\sum_{W'<p\le W} (\log p)\,e(\alpha m p)\right|^2 \le \left(W - W' + \frac{q}{1-\rho}\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.52}
    \]

    Proof. We follow the proof of Lemma 5.2.2, noting the differences. Let
    \[
    k = \min(q, \lceil Q/2\rceil) \ge \lceil q/2\rceil,
    \]
    just as before. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k\rceil$ blocks of at most $2k$ consecutive integers; any such block contains at most $k$ odd numbers. For odd $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of
    \[
    |\alpha(m - m')| = \left|2\alpha\,\frac{m - m'}{2}\right| = |(a/q)k| - O^*(k/qQ) \ge 1/2q.
    \]
    We obtain (5.50) and (5.52) just as we obtained (5.42) and (5.44) before. To obtain (5.51), proceed again as before, noting that the angles we are working with can be labelled as $\alpha(m_0 + 2a)$, $0 \le a < q$.
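The separation claim for odd $m$ can be tested exactly on small examples. The sketch below assumes $(a, q) = 1$ (as in Proposition 5.2.4) and $q \ge 2$, writes $2\alpha = a/q + \beta$ with $|\beta| \le 1/qQ$, and measures distances on $\mathbb{R}/\mathbb{Z}$; the function names and sample parameters are illustrative.

```python
from fractions import Fraction
from math import gcd

def circle_norm(t):
    """Distance from the rational number t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)
    return min(frac, 1 - frac)

def min_separation_odd(a, q, Q, beta, m0):
    """Smallest distance on R/Z between alpha*m and alpha*m' over distinct odd
    m, m' among the 2k consecutive integers m0+1, ..., m0+2k, where
    2*alpha = a/q + beta and k = min(q, ceil(Q/2))."""
    assert gcd(a, q) == 1 and q >= 2 and abs(beta) <= Fraction(1, q * Q)
    k = min(q, -(-Q // 2))                      # ceil(Q/2) for integer Q
    two_alpha = Fraction(a, q) + beta
    odd = [m for m in range(m0 + 1, m0 + 2 * k + 1) if m % 2 == 1]
    # alpha*(m - m') = (2*alpha) * (m - m')/2, and (m - m')/2 is an integer
    return min(circle_norm(two_alpha * Fraction(m - mp, 2))
               for m in odd for mp in odd if m != mp)

if __name__ == "__main__":
    q, Q = 7, 20
    beta = Fraction(1, q * Q)
    print(min_separation_odd(3, q, Q, beta, 100), Fraction(1, 2 * q))
```

The first printed value should be at least the second, which is the separation $1/2q$ fed into the large sieve.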

    The idea now (for large $\delta$) is that, if $\delta$ is not negligible, then, as $m$ increases and $\alpha m$ loops around the circle $\mathbb{R}/\mathbb{Z}$, $\alpha m$ roughly repeats itself every $q$ steps, but with a slight displacement. This displacement gives rise to a configuration to which Lemma 5.2.1 is applicable. The effect is that we can apply the large sieve once instead of many times, thus leading to a gain of a large factor (essentially, the number of times the large sieve would have been used). This is how we obtain the factor of $|\delta|$ in the denominator of the main term $x/|\delta|q$ in (5.56) and (5.57).

    Proposition 5.2.4. Let $x \ge W \ge 1$, $W' \ge W/2$, $U' \ge x/2W$. Let $Q \ge 3.5W$. Let $2\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ$, $q \le Q$. Let $S_2(U', W', W)$ be as in (5.38) with $v = 2$.

    For $q \le \rho Q$, where $\rho \in [0, 1]$,
    \[
    S_2(U',W',W) \le \left(\max(1, 2\rho)\left(\frac{x}{8q} + \frac{x}{2W}\right) + \frac W2 + 2q\right) \cdot \sum_{W'<p\le W} (\log p)^2. \tag{5.53}
    \]
    If $q < W/2$,
    \[
    S_2(U',W',W) \le \left(\frac{x}{4\phi(q)}\,\frac{1}{\log(W/2q)} + \frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)}\right) \cdot \sum_{W'<p\le W} (\log p)^2. \tag{5.54}
    \]
    If $W > x/4q$, the following bound also holds:
    \[
    S_2(U',W',W) \le \left(\frac W2 + \frac{q}{1 - x/4Wq}\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.55}
    \]
    If $\delta \ne 0$ and $x/4W + q \le x/|\delta|q$,
    \[
    S_2(U',W',W) \le \min\left(1, \frac{2q/\phi(q)}{\log\left(\frac{x}{|\delta q|}\left(q + \frac{x}{4W}\right)^{-1}\right)}\right) \cdot \left(\frac{x}{|\delta q|} + \frac W2\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.56}
    \]
    Lastly, if $\delta \ne 0$ and $q \le \rho Q$, where $\rho \in [0, 1)$,
    \[
    S_2(U',W',W) \le \left(\frac{x}{|\delta q|} + \frac W2 + \frac{x}{8(1-\rho)Q} + \frac{x}{4(1-\rho)W}\right) \sum_{W'<p\le W} (\log p)^2. \tag{5.57}
    \]

    The trivial bound would be in the order of
    \[
    S_2(U',W',W) = (x/2\log x) \sum_{W'<p\le W} (\log p)^2.
    \]
    In practice, (5.55) gets applied when $W \ge x/q$.
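To see how the five bounds of Proposition 5.2.4 compare for given parameters, one can evaluate the factors multiplying $\sum (\log p)^2$. The helper below is a hypothetical illustration: it takes $\rho$ as an input rather than optimizing over it, checks only the side conditions attached to each individual bound, and leaves the standing hypotheses ($Q \ge 3.5W$, $|\delta/x| \le 1/qQ$, $q \le Q$) to the caller.

```python
import math
from math import gcd

def euler_phi(n):
    """Euler's totient by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def s2_bound_factors(x, W, q, Q, delta, rho):
    """Factors in (5.53)-(5.57), keyed by equation number, for those bounds
    whose individual side conditions hold."""
    phi_q = euler_phi(q)
    out = {}
    if 0 <= rho <= 1 and q <= rho * Q:
        out["5.53"] = max(1, 2 * rho) * (x / (8 * q) + x / (2 * W)) + W / 2 + 2 * q
    if q < W / 2:
        log_term = math.log(W / (2 * q))
        out["5.54"] = x / (4 * phi_q * log_term) + (q / phi_q) * W / log_term
    if W > x / (4 * q):
        out["5.55"] = W / 2 + q / (1 - x / (4 * W * q))
    if delta != 0:
        main = x / abs(delta * q)
        if x / (4 * W) + q <= main:
            ratio = main / (q + x / (4 * W))
            gain = (2 * q / phi_q) / math.log(ratio) if ratio > 1 else 1.0
            out["5.56"] = min(1.0, gain) * (main + W / 2)
        if 0 <= rho < 1 and q <= rho * Q:
            out["5.57"] = (main + W / 2 + x / (8 * (1 - rho) * Q)
                           + x / (4 * (1 - rho) * W))
    return out

if __name__ == "__main__":
    factors = s2_bound_factors(x=1e6, W=1000, q=10, Q=3500, delta=5, rho=0.01)
    for label, value in sorted(factors.items()):
        print(label, round(value, 1))
```

With these sample values (chosen only for illustration) the condition $W > x/4q$ fails, so (5.55) is absent, and (5.53) gives the smallest factor.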

Proof. Let us first prove statements (5.54) and (5.53), which do not involve $\delta$. Assume first $q \le W/2$. Then, by (5.51) with $A_0 = U'$, $A_1 = x/W$,
\[
S_2(U',W',W) \le \left(\frac{x/W-U'}{2q}+1\right)\frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)}\sum_{W'<p\le W}(\log p)^2.
\]
Clearly $(x/W - U')W \le (x/2W)\cdot W = x/2$. Thus (5.54) holds.


Assume now that $q \le \rho Q$. Apply (5.50) with $A_0 = U'$, $A_1 = x/W$. Then
\[
S_2(U',W',W) \le \left(\frac{x/W-U'}{q\cdot\min(2,\rho^{-1})}+1\right)(W-W'+2q)\sum_{W'<p\le W}(\log p)^2.
\]
Now
\[
\begin{aligned}
\left(\frac{x/W-U'}{q\cdot\min(2,\rho^{-1})}+1\right)\cdot(W-W'+2q)
&\le \left(\frac{x}{W}-U'\right)\frac{W-W'}{q\min(2,\rho^{-1})}+\max(1,2\rho)\left(\frac{x}{W}-U'\right)+\frac{W}{2}+2q\\
&\le \frac{x/4}{q\min(2,\rho^{-1})}+\max(1,2\rho)\frac{x}{2W}+\frac{W}{2}+2q.
\end{aligned}
\]
This implies (5.53).

If $W > x/4q$, apply (5.44) with $\varrho = x/4Wq$, $\rho = 1$. This yields (5.55).

Assume now that $\delta \ne 0$ and $x/4W + q \le x/|\delta q|$. Let $Q' = x/|\delta q|$. For any $m_1$, $m_2$ with $x/2W < m_1, m_2 \le x/W$, we have $|m_1 - m_2| \le x/2W \le 2(Q' - q)$, and so
\[
\left|\frac{m_1-m_2}{2}\cdot\frac{\delta}{x}+\frac{q\delta}{x}\right| \le \frac{Q'|\delta|}{x} = \frac{1}{q}. \tag{5.58}
\]
The conditions of Lemma 5.2.1 are thus fulfilled with $\upsilon = (x/4W)\cdot|\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
\[
\min\left(1,\ \frac{2q}{\phi(q)}\,\frac{1}{\log\left((q(\nu+\upsilon))^{-1}\right)}\right)\left(W-W'+\nu^{-1}\right)\sum_{W'<p\le W}(\log p)^2.
\]
Here $W - W' + \nu^{-1} = W - W' + x/|q\delta| \le W/2 + x/|q\delta|$ and
\[
(q(\nu+\upsilon))^{-1} = \left(\frac{q|\delta|}{x}\right)^{-1}\left(q+\frac{x}{4W}\right)^{-1}.
\]
Lastly, assume $\delta \ne 0$ and $q \le \rho Q$. We let $Q' = x/|\delta q| \ge Q$ again, and we split the range $U' < m \le x/W$ into intervals of length $2(Q' - q)$, so that (5.58) still holds within each interval. We apply Lemma 5.2.1 with $\upsilon = (Q' - q)\cdot|\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
\[
\left(1+\frac{x/W-U'}{2(Q'-q)}\right)\left(W-W'+\nu^{-1}\right)\sum_{W'<p\le W}(\log p)^2.
\]
Here $W - W' + \nu^{-1} \le W/2 + x/q|\delta|$ as before. Moreover,
\[
\begin{aligned}
\left(\frac{W}{2}+\frac{x}{q|\delta|}\right)\left(1+\frac{x/W-U'}{2(Q'-q)}\right)
&\le \left(\frac{W}{2}+Q'\right)\left(1+\frac{x/2W}{2(1-\rho)Q'}\right)\\
&\le \frac{W}{2}+Q'+\frac{x}{8(1-\rho)Q'}+\frac{x}{4W(1-\rho)}\\
&\le \frac{x}{|\delta q|}+\frac{W}{2}+\frac{x}{8(1-\rho)Q}+\frac{x}{4(1-\rho)W}.
\end{aligned}
\]
Hence (5.57) holds.

    Chapter 6

    Minor-arc totals

It is now time to make all of our estimates fully explicit, choose our parameters, put our type I and type II estimates together and give our final minor-arc estimates.

Let $x > 0$ be given. Starting in section 6.3.1, we will assume that $x \ge x_0 = 2.16\cdot 10^{20}$. We will choose our main parameters $U$ and $V$ gradually, as the need arises; we assume from the start that $2\cdot 10^6 \le V < x/4$ and $UV \le x$.

We are also given an angle $\alpha \in \mathbb{R}/\mathbb{Z}$. We choose an approximation $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $q \le Q$, $|\delta/x| \le 1/qQ$. The parameter $Q$ will be chosen later; we assume from the start that $Q \ge \max(16, 2\sqrt{x})$ and $Q \ge \max(2U, x/U)$.

(Actually, $U$ and $V$ will be chosen in different ways depending on the size of $q$. Actually, even $Q$ will depend on the size of $q$; this may seem circular, but what actually happens is the following: we will first set a value for $Q$ depending only on $x$, and, if the corresponding value of $q \le Q$ is larger than a certain parameter $y$ depending on $x$, then we reset $U$, $V$ and $Q$, and obtain a new value of $q$.)

Let $S_{I,1}$, $S_{I,2}$, $S_{II}$, $S_0$ be as in (3.9), with the smoothing function $\eta = \eta_2$ as in (3.4). (We bounded the type I sums $S_{I,1}$, $S_{I,2}$ for a general smoothing function $\eta$; it is only here that we are specifying $\eta$.)

The term $S_0$ is $0$ because $V < x/4$ and $\eta_2$ is supported on $[-1/4, 1]$. We set $v = 2$.

6.1 The smoothing function

For the smoothing function $\eta_2$ in (3.4),
\[
|\eta_2|_1 = 1,\qquad |\eta_2'|_1 = 8\log 2,\qquad |\eta_2''|_1 = 48, \tag{6.1}
\]
as per [Tao14, (5.9)–(5.13)]. Similarly, for $\eta_{2,\rho}(t) = \log(\rho t)\,\eta_2(t)$, where $\rho \ge 4$,
\[
\begin{aligned}
|\eta_{2,\rho}|_1 &< \log(\rho)\,|\eta_2|_1 = \log(\rho),\\
|\eta_{2,\rho}'|_1 &= 2\eta_{2,\rho}(1/2) = 2\log(\rho/2)\,\eta_2(1/2) < (8\log 2)\log\rho,\\
|\eta_{2,\rho}''|_1 &= 4\log(\rho/4)+|2\log\rho-4\log(\rho/4)|+|4\log 2-4\log\rho|+|\log\rho-4\log 2|+|\log\rho| < 48\log\rho.
\end{aligned} \tag{6.2}
\]
In the first inequality, we are using the fact that $\log(\rho t)$ is always positive (and less than $\log(\rho)$) when $t$ is in the support of $\eta_2$.

Write $\log^+ x$ for $\max(\log x, 0)$.
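The first two norms in (6.1) can be checked numerically. The sketch below assumes that $\eta_2$ has the closed form $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$; definition (3.4) is not reproduced in this section, so this form is an assumption. It uses the fact that, for a continuous piecewise-smooth function, $|\eta'|_1$ equals the total variation of $\eta$. The helper names are ours.

```python
import math


def eta2(t):
    """Assumed closed form of eta_2: 4*max(log 2 - |log 2t|, 0)."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def l1_norm(f, a, b, n=60000):
    """Midpoint-rule approximation to the integral of |f| over [a, b]."""
    h = (b - a) / n
    return h * sum(abs(f(a + (i + 0.5) * h)) for i in range(n))


def total_variation(f, a, b, n=60000):
    """Total variation of f on [a, b], sampled on a uniform grid."""
    h = (b - a) / n
    values = [f(a + i * h) for i in range(n + 1)]
    return sum(abs(values[i + 1] - values[i]) for i in range(n))


norm_eta2 = l1_norm(eta2, 0.25, 1.0)                 # compare with |eta_2|_1 = 1
norm_eta2_prime = total_variation(eta2, 0.25, 1.0)   # compare with 8 log 2
print(norm_eta2, norm_eta2_prime, 8 * math.log(2))
```

Under the same assumption, $\eta_2'$ is $4/t$ on $[1/4,1/2]$ and $-4/t$ on $[1/2,1]$, with jumps of size $16$, $16$ and $4$ at $t = 1/4, 1/2, 1$; adding $\int_{1/4}^{1} 4/t^2\,dt = 12$ recovers $|\eta_2''|_1 = 48$.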

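6.2 Contributions of different types

6.2.1 Type I terms: $S_{I,1}$

The term $S_{I,1}$ can be handled directly by Lemma 4.2.3, with $\rho_0 = 4$ and $D = U$. (Condition (4.38) is valid thanks to (6.2).) Since $U \le Q/2$, the contribution of $S_{I,1}$ gets bounded by (4.40) and (4.41): the absolute value of $S_{I,1}$ is at most
\[
\begin{aligned}
&\frac{x}{q}\min\left(1,\frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m\le U/q\\ (m,q)=1}}\frac{\mu(m)}{m}\log\frac{x}{mq}\right|
+\frac{x}{q}\,\bigl|\widehat{\log\cdot\eta}(-\delta)\bigr|\left|\sum_{\substack{m\le U/q\\ (m,q)=1}}\frac{\mu(m)}{m}\right|\\
&+\frac{2\sqrt{c_0c_1}}{\pi}\left(U\log\frac{ex}{U}+\sqrt{3}\,q\log\frac{q}{c_2}+\frac{q}{2}\log\frac{q}{c_2}\log^+\frac{2U}{q}\right)
+\frac{3c_1}{2}\,\frac{x}{q}\log\frac{q}{c_2}\log^+\frac{U}{c_2x/q}\\
&+\frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2}
+\left(\frac{c_0}{2}-\frac{2c_0}{\pi^2}\right)\left(\frac{U^2}{4qx}\log\frac{e^{1/2}x}{U}+\frac{1}{e}\right)\\
&+\frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x
+\frac{20c_0c_2^{3/2}}{3\pi^2}\sqrt{2x}\log\frac{2\sqrt{e}\,x}{c_2},
\end{aligned} \tag{6.3}
\]
where $c_0 = 31.521$ (by Lemma B.2.3), $c_1 = 1.0000028 > 1 + (8\log 2)/V \ge 1 + (8\log 2)/(x/U)$, and $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\ldots$. By (2.1) (with $k = 2$), (B.17) and Lemma B.2.4,
\[
\bigl|\widehat{\log\cdot\eta}(-\delta)\bigr| \le \min\left(2-\log 4,\ \frac{24\log 2}{\pi^2\delta^2}\right).
\]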

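By (2.20), (2.22) and (2.23), the first line of (6.3) is at most
\[
\begin{aligned}
&\frac{x}{q}\min\left(1,\frac{c_0'}{\delta^2}\right)\left(\min\left(\frac{4}{5}\,\frac{q/\phi(q)}{\log^+\frac{U}{q^2}},\,1\right)\log\frac{x}{U}+1.00303\,\frac{q}{\phi(q)}\right)\\
&+\frac{x}{q}\min\left(2-\log 4,\ \frac{c_0''}{\delta^2}\right)\min\left(\frac{4}{5}\,\frac{q/\phi(q)}{\log^+\frac{U}{q^2}},\,1\right),
\end{aligned}
\]
where $c_0' = 0.798437 > c_0/(2\pi)^2$ and $c_0'' = 1.685532$. Clearly $c_0''/c_0' > 1 > 2 - \log 4$.

Taking derivatives, we see that $t \mapsto (t/2)\log(t/c_2)\log^+(2U/t)$ takes its maximum (for $t \in [1, 2U]$) when $\log(t/c_2)\log^+(2U/t) = \log(t/c_2) - \log^+(2U/t)$; since $t \mapsto \log(t/c_2) - \log^+(2U/t)$ is increasing on $[1, 2U]$, we conclude that
\[
\frac{q}{2}\log\frac{q}{c_2}\log^+\frac{2U}{q} \le U\log\frac{2U}{c_2}.
\]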


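Similarly, $t \mapsto t\log(x/t)\log^+(U/t)$ takes its maximum at a point $t \in [0, U]$ for which $\log(x/t)\log^+(U/t) = \log(x/t) + \log^+(U/t)$, and so
\[
\frac{x}{q}\log\frac{q}{c_2}\log^+\frac{U}{c_2x/q} \le \frac{U}{c_2}(\log x+\log U).
\]
We conclude that
\[
\begin{aligned}
|S_{I,1}| \le{}& \frac{x}{q}\min\left(1,\frac{c_0'}{\delta^2}\right)\left(\min\left(\frac{4q/\phi(q)}{5\log^+\frac{U}{q^2}},\,1\right)\left(\log\frac{x}{U}+c_{3,I}\right)+c_{4,I}\frac{q}{\phi(q)}\right)\\
&+\left(c_{7,I}\log\frac{q}{c_2}+c_{8,I}\log x\,\max\left(1,\log\frac{c_{11,I}q^2}{x}\right)\right)q+c_{10,I}\frac{U^2}{4qx}\log\frac{e^{1/2}x}{U}\\
&+\left(c_{5,I}\log\frac{2U}{c_2}+c_{6,I}\log xU\right)U+c_{9,I}\sqrt{x}\log\frac{2\sqrt{e}\,x}{c_2}+\frac{c_{10,I}}{e},
\end{aligned} \tag{6.4}
\]
where $c_2$ and $c_0'$ are as above, and

- $c_{3,I} = 2.11104 > c_0''/c_0'$,
- $c_{4,I} = 1.00303$,
- $c_{5,I} = 3.57422 > 2\sqrt{c_0c_1}/\pi$,
- $c_{6,I} = 2.23389 > 3c_1/2c_2$,
- $c_{7,I} = 6.19072 > 2\sqrt{3c_0c_1}/\pi$,
- $c_{8,I} = 3.53017 > 2(8\log 2)/\pi$,
- $c_{9,I} = 19.1568 > \dfrac{3\sqrt{2}\,c_1}{2\sqrt{c_2}}+\dfrac{20\sqrt{2}\,c_0c_2^{3/2}}{3\pi^2}$,
- $c_{10,I} = 9.37301 > c_0(1/2 - 2/\pi^2)$,
- $c_{11,I} = 9.0857 > c_0e^3/(4\pi\cdot 8\log 2)$.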
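The numerical constants in (6.4) are easy to cross-check against the closed-form expressions they are compared with. The following sketch does so in floating point; the dictionary layout and the tolerance are ours, and the comparison is only meant to confirm agreement to roughly five decimal places, not to replace the interval-arithmetic computations used elsewhere in the text.

```python
import math

c0 = 31.521
c1 = 1.0000028
c2 = 6 * math.pi / (5 * math.sqrt(c0))   # stated as 0.67147...
c0_prime = 0.798437
c0_double_prime = 1.685532

# name -> (value stated in the text, closed-form expression it is compared with)
constants = {
    "c0'":   (c0_prime, c0 / (2 * math.pi) ** 2),
    "c3,I":  (2.11104, c0_double_prime / c0_prime),
    "c5,I":  (3.57422, 2 * math.sqrt(c0 * c1) / math.pi),
    "c6,I":  (2.23389, 3 * c1 / (2 * c2)),
    "c7,I":  (6.19072, 2 * math.sqrt(3 * c0 * c1) / math.pi),
    "c8,I":  (3.53017, 2 * (8 * math.log(2)) / math.pi),
    "c9,I":  (19.1568, 3 * math.sqrt(2) * c1 / (2 * math.sqrt(c2))
              + 20 * math.sqrt(2) * c0 * c2 ** 1.5 / (3 * math.pi ** 2)),
    "c10,I": (9.37301, c0 * (0.5 - 2 / math.pi ** 2)),
    "c11,I": (9.0857, c0 * math.e ** 3 / (4 * math.pi * 8 * math.log(2))),
}

# Allow for rounding in the last printed digit of each stated constant.
TOLERANCE = 1e-5
for name, (stated, computed) in constants.items():
    status = "ok" if stated + TOLERANCE >= computed else "CHECK"
    print(f"{name:6s} stated={stated:<10} computed={computed:.7f} {status}")
```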

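6.2.2 Type I terms: $S_{I,2}$

The case $q \le Q/V$. If $q \le Q/V$, then, for $v \le V$,
\[
2v\alpha = \frac{va}{q}+O^*\left(\frac{v}{Qq}\right) = \frac{va}{q}+O^*\left(\frac{1}{q^2}\right),
\]
and so $va/q$ is a valid approximation to $2v\alpha$. (Here we are using $v$ to label an integer variable bounded above by $v \le V$; we no longer need $v$ to label the quantity in (3.10), since that has been set equal to the constant $2$.) Moreover, for $Q_v = Q/v$, we see that $2v\alpha = (va/q) + O^*(1/qQ_v)$. If $\alpha = a/q + \delta/x$, then $v\alpha = va/q + \delta/(x/v)$. Now
\[
S_{I,2} = \sum_{\substack{v\le V\\ v\ \text{odd}}}\Lambda(v)\sum_{\substack{m\le U\\ m\ \text{odd}}}\mu(m)\sum_{\substack{n\\ n\ \text{odd}}}e((v\alpha)\cdot mn)\,\eta(mn/(x/v)). \tag{6.5}
\]
We can thus estimate $S_{I,2}$ by applying Lemma 4.2.2 to each inner double sum in (6.5). We obtain that, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$ and $c_0 = 31.521$, then $|S_{I,2}|$ is at most
\[
\sum_{v\le V}\Lambda(v)\left(\frac{x/v}{2q_v}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\left|\sum_{\substack{m\le M_v/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right|+\frac{c_{10,I}\,q}{4x/v}\left(\frac{U}{q_v}+1\right)^2\right) \tag{6.6}
\]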


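plus
\[
\begin{aligned}
&\sum_{v\le V}\Lambda(v)\left(\frac{2\sqrt{c_0c_+}}{\pi}U+\frac{3c_+}{2}\,\frac{x}{vq_v}\log^+\frac{U}{c_2x/(vq_v)}+\frac{\sqrt{c_0c_+}}{\pi}\,q_v\log^+\frac{U}{q_v/2}\right)\\
&+\sum_{v\le V}\Lambda(v)\left(c_{8,I}\max\left(\log\frac{c_{11,I}q_v^2}{x/v},\,1\right)q_v+\left(\frac{2\sqrt{3c_0c_+}}{\pi}+\frac{3c_+}{2c_2}+\frac{55c_0c_2}{6\pi^2}\right)q_v\right),
\end{aligned} \tag{6.7}
\]
where $q_v = q/(q,v)$, $M_v \in [\min(Q/2v, U), U]$ and $c_+ = 1 + (8\log 2)/(x/UV)$; if $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.6) plus
\[
\begin{aligned}
&\sum_{v\le V}\Lambda(v)\left(\frac{\sqrt{c_0c_1}}{\pi/2}\,U+\frac{3c_1}{2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2U}{\frac{x/v}{|\delta|q_v}}\right)\frac{x}{Q}+\frac{35c_0c_2}{3\pi^2}\,q_v\right)\\
&+\sum_{v\le V}\Lambda(v)\,\frac{\sqrt{c_0c_1}}{\pi/2}\,(1+\epsilon)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,\ 2U\right)\left(\sqrt{3+2\epsilon}+\frac{1}{2}\log^+\frac{2U}{\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1}\right).
\end{aligned} \tag{6.8}
\]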

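Write $S_V = \sum_{v\le V}\Lambda(v)/(vq_v)$. By (2.12),
\[
\begin{aligned}
S_V &\le \sum_{v\le V}\frac{\Lambda(v)}{vq}+\sum_{\substack{v\le V\\ (v,q)>1}}\frac{\Lambda(v)}{v}\left(\frac{(q,v)}{q}-\frac{1}{q}\right)\\
&\le \frac{\log V}{q}+\frac{1}{q}\sum_{p|q}(\log p)\left(v_p(q)+\sum_{\substack{\alpha\ge 1\\ p^{\alpha+v_p(q)}\le V}}\frac{1}{p^{\alpha}}-\sum_{\substack{\alpha\ge 1\\ p^{\alpha}\le V}}\frac{1}{p^{\alpha}}\right)\\
&\le \frac{\log V}{q}+\frac{1}{q}\sum_{p|q}(\log p)\,v_p(q) = \frac{\log Vq}{q}.
\end{aligned} \tag{6.9}
\]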
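As a sanity check on (6.9), the inequality $S_V \le (\log Vq)/q$ can be tested directly for small parameters. The sketch below builds a table of the von Mangoldt function with a simple sieve; the function names and the ranges of $V$ and $q$ are ours and purely illustrative.

```python
import math


def von_mangoldt_table(limit):
    """lam[n] = log p if n is a power of a prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for multiple in range(p * p, limit + 1, p):
                is_composite[multiple] = True
            power = p
            while power <= limit:
                lam[power] = math.log(p)
                power *= p
    return lam


def s_v(V, q, lam):
    """S_V = sum over v <= V of Lambda(v) / (v * q_v), with q_v = q / gcd(q, v)."""
    return sum(lam[v] / (v * (q // math.gcd(q, v))) for v in range(2, V + 1))


def bound_holds(V, q_max):
    """Check S_V <= log(V q) / q for every modulus 1 <= q <= q_max."""
    lam = von_mangoldt_table(V)
    return all(s_v(V, q, lam) <= math.log(V * q) / q
               for q in range(1, q_max + 1))


print(bound_holds(2000, 60))
```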

This helps us to estimate (6.6). We could also use this to estimate the second term in the first line of (6.7), but, for that purpose, it will actually be wiser to use the simpler bound
\[
\sum_{v\le V}\Lambda(v)\,\frac{x}{vq_v}\log^+\frac{U}{c_2x/(vq_v)} \le \sum_{v\le V}\Lambda(v)\frac{U}{c_2e} \le \frac{1.0004}{ec_2}\,UV \tag{6.10}
\]
(by (2.14) and the fact that $t\log^+(A/t)$ takes its maximum at $t = A/e$).

We bound the sum over $m$ in (6.6) by (2.20) and (2.22):
\[
\left|\sum_{\substack{m\le M_v/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right| \le \min\left(\frac{4}{5}\,\frac{q/\phi(q)}{\log^+\frac{M_v}{2q^2}},\ 1\right).
\]


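To bound the terms involving $(U/q_v + 1)^2$, we use
\[
\begin{aligned}
\sum_{v\le V}\Lambda(v)\,v &\le 0.5004\,V^2 \quad\text{(by (2.17))},\\
\sum_{v\le V}\Lambda(v)\,v\,(v,q)^j &\le \sum_{v\le V}\Lambda(v)\,v+V\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q)^j,\\
\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q) &\le \sum_{p|q}(\log p)\sum_{1\le\alpha\le\log_pV}p^{v_p(q)} \le \sum_{p|q}(\log p)\frac{\log V}{\log p}\,p^{v_p(q)}\\
&\le (\log V)\sum_{p|q}p^{v_p(q)} \le q\log V
\end{aligned}
\]
and
\[
\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q)^2 \le \sum_{p|q}(\log p)\sum_{1\le\alpha\le\log_pV}p^{v_p(q)+\alpha} \le \sum_{p|q}(\log p)\cdot 2p^{v_p(q)}\cdot p^{\log_pV} \le 2qV\log q.
\]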

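Using (2.14) and (6.9) as well, we conclude that (6.6) is at most
\[
\begin{aligned}
&\frac{x}{2q}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\min\left(\frac{4}{5}\,\frac{q/\phi(q)}{\log^+\frac{\min(Q/2V,\,U)}{2q^2}},\ 1\right)\log Vq\\
&+\frac{c_{10,I}}{4x}\left(0.5004\,V^2q\left(\frac{U}{q}+1\right)^2+2UVq\log V+2U^2V\log V\right).
\end{aligned}
\]
Assume $Q \le 2UV/e$. Using (2.14), (6.10), (2.18) and the inequality $vq \le Vq \le Q$ (which implies $q/2 \le U/e$), we see that (6.7) is at most
\[
1.0004\left(\left(\frac{2\sqrt{c_0c_+}}{\pi}+\frac{3c_+}{2ec_2}\right)UV+\frac{\sqrt{c_0c_+}}{\pi}\,Q\log\frac{U}{q/2}\right)+\left(c_{5,I_2}\max\left(\log\frac{c_{11,I}q^2V}{x},\,2\right)+c_{6,I_2}\right)Q,
\]
where $c_{5,I_2} = 3.53312 > 1.0004\cdot c_{8,I}$ and
\[
c_{6,I_2} = 1.0004\left(\frac{2\sqrt{3c_0c_+}}{\pi}+\frac{3c_+}{2c_2}+\frac{55c_0c_2}{6\pi^2}\right). \tag{6.11}
\]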

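The expressions in (6.8) get estimated similarly. The first line of (6.8) is at most
\[
1.0004\left(\frac{2\sqrt{c_0c_+}}{\pi}\,UV+\frac{3c_+}{2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{xV}{Q}+\frac{35c_0c_2}{3\pi^2}\,qV\right)
\]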


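by (2.14). Since $q \le Q/V$, we can obviously bound $qV$ by $Q$. As for the second line of (6.8):
\[
\sum_{v\le V}\Lambda(v)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,\ 2U\right)\cdot\frac{1}{2}\log^+\frac{2U}{\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1}
\le \sum_{v\le V}\Lambda(v)\max_{t>0}\,t\log^+\frac{U}{t} \le \sum_{v\le V}\Lambda(v)\frac{U}{e} = \frac{1.0004}{e}\,UV,
\]
but
\[
\begin{aligned}
\sum_{v\le V}\Lambda(v)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,\ 2U\right)
&\le \sum_{v\le\frac{x}{2U|\delta|q}}\Lambda(v)\cdot 2U+\sum_{\substack{\frac{x}{2U|\delta|q}<v\le V\\ (v,q)=1}}\Lambda(v)\frac{x/|\delta|}{vq}
+\sum_{v\le V}\Lambda(v)+\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)\frac{x/|\delta|}{v}\left(\frac{1}{q_v}-\frac{1}{q}\right)\\
&\le 1.03883\,\frac{x}{|\delta|q}+\frac{x}{|\delta|q}\max\left(\log V-\log\frac{x}{2U|\delta|q}+\log\frac{3}{\sqrt{2}},\ 0\right)+1.0004\,V+\frac{x}{|\delta|}\,\frac{1}{q}\sum_{p|q}(\log p)\,v_p(q)\\
&\le \frac{x}{|\delta|q}\left(1.03883+\log q+\log^+\frac{6UV|\delta|q}{\sqrt{2}\,x}\right)+1.0004\,V
\end{aligned}
\]
by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (6.9).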

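Let us collect our bounds. If $|\delta| \le 1/2c_2$, then, assuming $Q \le 2UV/e$, we conclude that $|S_{I,2}|$ is at most
\[
\begin{aligned}
&\frac{x}{2\phi(q)}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}},\ 1\right)\log Vq\\
&+c_{8,I_2}\,\frac{x}{q}\left(\frac{UV}{x}\right)^2\left(1+\frac{q}{U}\right)^2+\frac{c_{10,I}}{2}\left(\frac{UV}{x}\,q\log V+\frac{U^2V}{x}\log V\right)
\end{aligned} \tag{6.12}
\]
plus
\[
(c_{4,I_2}+c_{9,I_2})\,UV+\left(c_{10,I_2}\log\frac{U}{q}+c_{5,I_2}\max\left(\log\frac{c_{11,I}q^2V}{x},\,2\right)+c_{12,I_2}\right)\cdot Q, \tag{6.13}
\]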


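where
\[
\begin{aligned}
c_{4,I_2} &= 3.57565(1+\epsilon_0) > 1.0004\cdot 2\sqrt{c_0c_+}/\pi,\\
c_{5,I_2} &= 3.53312 > 1.0004\cdot c_{8,I},\\
c_{8,I_2} &= 1.17257 > \frac{c_{10,I}}{4}\cdot 0.5004,\\
c_{9,I_2} &= 0.82214(1+2\epsilon_0) > 3c_+\cdot 1.0004/2ec_2,\\
c_{10,I_2} &= 1.78783\sqrt{1+2\epsilon_0} > 1.0004\sqrt{c_0c_+}/\pi,\\
c_{12,I_2} &= 29.3333+11.902\,\epsilon_0\\
&> 1.0004\left(\frac{3}{2c_2}c_++\frac{2\sqrt{3c_0}}{\pi}\sqrt{c_+}+\frac{55c_0c_2}{6\pi^2}\right)+1.78783(1+\epsilon_0)\log 2\\
&= c_{6,I_2}+c_{10,I_2}\log 2
\end{aligned}
\]
and $c_{10,I} = 9.37301$ as before. Here $\epsilon_0 = (4\log 2)/(x/UV)$, and $c_{6,I_2}$ is as in (6.11). If $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.12) plus
\[
\begin{aligned}
&(c_{4,I_2}+(1+\epsilon)c_{13,I_2})\,UV+c_\epsilon\left(c_{14,I_2}\left(\log q+\log^+\frac{6UV|\delta|q}{\sqrt{2}\,x}\right)+c_{15,I_2}\right)\frac{x}{|\delta|q}\\
&+c_{16,I_2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{x}{Q/V}+c_{17,I_2}\,Q+c_\epsilon\cdot c_{4,I_2}\,V,
\end{aligned} \tag{6.14}
\]
where
\[
\begin{aligned}
c_{13,I_2} &= 1.31541(1+\epsilon_0) > \frac{2\sqrt{c_0c_+}}{\pi}\cdot\frac{1.0004}{e},\\
c_{14,I_2} &= 3.57422\sqrt{1+2\epsilon_0} > \frac{2\sqrt{c_0c_+}}{\pi},\\
c_{15,I_2} &= 3.71301\sqrt{1+2\epsilon_0} > \frac{2\sqrt{c_0c_+}}{\pi}\cdot 1.03883,\\
c_{16,I_2} &= 1.5006(1+2\epsilon_0) > 1.0004\cdot 3c_+/2,\\
c_{17,I_2} &= 25.0295 > 1.0004\cdot\frac{35c_0c_2}{3\pi^2},
\end{aligned}
\]
and $c_\epsilon = (1+\epsilon)\sqrt{3+2\epsilon}$. We recall that $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\ldots$. We will choose $\epsilon \in (0,1)$ later; we also leave the task of bounding $\epsilon_0$ for later.

The case $q > Q/V$. We use Lemma 4.2.4 in this case.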
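The constants $c_{k,I_2}$ appearing in (6.12)–(6.14) can be cross-checked in the same way as those of (6.4). The sketch below evaluates the right-hand sides at $\epsilon_0 = 0$ (so that $c_+ = 1$), which is a simplifying assumption of ours; the numbers $1.0004$, $0.5004$ and $1.03883$ are the ones quoted in the text from (2.13)–(2.17). The tolerance allows for rounding in the last printed digit.

```python
import math

c0 = 31.521
c2 = 6 * math.pi / (5 * math.sqrt(c0))
c8_I = 3.53017
c10_I = 9.37301
c_plus = 1.0            # value of c_+ at eps0 = 0 (assumption of this sketch)
root = math.sqrt(c0 * c_plus)

c6_I2 = 1.0004 * (2 * math.sqrt(3 * c0 * c_plus) / math.pi
                  + 3 * c_plus / (2 * c2)
                  + 55 * c0 * c2 / (6 * math.pi ** 2))

# name -> (stated value at eps0 = 0, expression it is compared with)
constants_I2 = {
    "c4,I2":  (3.57565, 1.0004 * 2 * root / math.pi),
    "c5,I2":  (3.53312, 1.0004 * c8_I),
    "c8,I2":  (1.17257, c10_I / 4 * 0.5004),
    "c9,I2":  (0.82214, 3 * c_plus * 1.0004 / (2 * math.e * c2)),
    "c10,I2": (1.78783, 1.0004 * root / math.pi),
    "c12,I2": (29.3333, c6_I2 + 1.78783 * math.log(2)),
    "c13,I2": (1.31541, 2 * root / math.pi * 1.0004 / math.e),
    "c14,I2": (3.57422, 2 * root / math.pi),
    "c15,I2": (3.71301, 2 * root / math.pi * 1.03883),
    "c16,I2": (1.5006, 1.0004 * 3 * c_plus / 2),
    "c17,I2": (25.0295, 1.0004 * 35 * c0 * c2 / (3 * math.pi ** 2)),
}

TOLERANCE = 1e-5
for name, (stated, computed) in constants_I2.items():
    status = "ok" if stated + TOLERANCE >= computed else "CHECK"
    print(f"{name:7s} stated={stated:<9} computed={computed:.6f} {status}")

# Leading constant C0 = c4,I2 + c9,I2 of the term C0*U*V in (6.42), at eps0 = 0.
print(3.57565 + 0.82214)
```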

6.2.3 Type II terms

As we showed in (5.1)–(5.5), $S_{II}$ (given in (5.1)) is at most
\[
4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_2(U,V,W)}\,\frac{dW}{W}+4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_3(W)}\,\frac{dW}{W}, \tag{6.15}
\]
where $S_1$, $S_2$ and $S_3$ are as in (5.4) and (5.5). We bounded $S_1$ in (5.33) and (5.34), $S_2$ in Prop. 5.2.4 and $S_3$ in (5.5).


Let us try to give some structure to the bookkeeping we must now inevitably do. The second integral in (6.15) will be negligible (because $S_3$ is); let us focus on the first integral.

Thanks to our work in §5.1, the term $S_1(U,W)$ is bounded by a (small) constant times $x/W$. (This represents a gain of several factors of $\log$ with respect to the trivial bound.) We bounded $S_2(U,V,W)$ using the large sieve; we expected, and got, a bound that is better than trivial by a factor of size roughly $\sqrt{q}\log x$; the exact factor in the bound depends on the value of $W$. In particular, it is only in the central part of the range for $W$ that we will really be able to save a factor of $\sqrt{q}\log x$, as opposed to just $\sqrt{q}$. We will have to be slightly clever in order to get a good total bound in the end.

We first recall our estimate for $S_1$. In the whole range $[V, x/U]$ for $W$, we know from (5.33), (5.34) and (5.37) that $S_1(U,W)$ is at most
\[
\frac{2}{\pi^2}\,\frac{x}{W}+\kappa_0\,\zeta(3/2)^3\,\frac{x}{W}\sqrt{\frac{x/WU}{U}}, \tag{6.16}
\]
where
\[
\kappa_0 = 1.27.
\]
(We recall we are working with $v = 2$.)

We have better estimates for the constant in front in some parts of the range; in what is usually the main part, (5.34) and (5.36) give us a constant of $0.15107$ instead of $2/\pi^2$. Note that $1.27\,\zeta(3/2)^3 = 22.6417\ldots$. We should choose $U$, $V$ so that the first term in (6.16) dominates. For the while being, assume only
\[
U \ge 5\cdot 10^5\,\frac{x}{VU}; \tag{6.17}
\]
then (6.16) gives
\[
S_1(U,W) \le \kappa_1\,\frac{x}{W}, \tag{6.18}
\]
where
\[
\kappa_1 = \frac{2}{\pi^2}+\frac{22.6418}{\sqrt{10^6/2}} \le 0.2347.
\]

This will suffice for our cruder estimates.

The second integral in (6.15) is now easy to bound. By (5.5),
\[
S_3(W) \le 1.0171\,x+2.0341\,W \le 1.0172\,x,
\]
since $W \le x/U \le x/5\cdot 10^5$. Hence
\[
4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_3(W)}\,\frac{dW}{W} \le 4\int_V^{x/U}\sqrt{\kappa_1\frac{x}{W}\cdot 1.0172\,x}\,\frac{dW}{W} \le \kappa_9\,\frac{x}{\sqrt{V}},
\]
where
\[
\kappa_9 = 8\cdot\sqrt{1.0172\cdot\kappa_1} \le 3.9086.
\]
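The constants $\kappa_1$ and $\kappa_9$, together with $\kappa_2 = 4\sqrt{\kappa_1}$, $\kappa_4 = 4\kappa_0\zeta(3/2)^3$ and $\kappa_7 = \sqrt{2}\kappa_4/1000$ used in the rest of this section, can be reproduced with a few lines of floating-point arithmetic. In the sketch below, $\zeta(3/2)$ is approximated by a partial sum plus the first Euler–Maclaurin correction terms; that method and the function name are our choices, not the text's. Note that the bounds on $\kappa_2$ and $\kappa_9$ are computed from the unrounded value of $\kappa_1$, not from $0.2347$.

```python
import math


def zeta_three_halves(cutoff=10000):
    """zeta(3/2) via a partial sum plus Euler-Maclaurin tail corrections."""
    s = 1.5
    partial = sum(n ** -s for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)
            + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1) / 12)
    return partial + tail


kappa0 = 1.27
zeta_cubed_term = kappa0 * zeta_three_halves() ** 3     # text: 22.6417...
kappa1 = 2 / math.pi ** 2 + 22.6418 / math.sqrt(10 ** 6 / 2)
kappa2 = 4 * math.sqrt(kappa1)                          # text: <= 1.93768
kappa4 = 4 * zeta_cubed_term                            # text: <= 90.5671
kappa7 = math.sqrt(2) * 90.5671 / 1000                  # text: <= 0.1281
kappa9 = 8 * math.sqrt(1.0172 * kappa1)                 # text: <= 3.9086

for name, value in [("1.27*zeta(3/2)^3", zeta_cubed_term), ("kappa1", kappa1),
                    ("kappa2", kappa2), ("kappa4", kappa4),
                    ("kappa7", kappa7), ("kappa9", kappa9)]:
    print(f"{name:18s} {value:.7f}")
```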

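Let us now examine $S_2$, which was bounded in Prop. 5.2.4. We set the parameters $W'$, $U'$ as follows, in accordance with (5.4):
\[
W' = \max(V, W/2),\qquad U' = \max(U, x/2W).
\]
Since $W' \ge W/2$ and $W \ge V > 117$, we can always bound
\[
\sum_{W'<p\le W}(\log p)^2 \le \frac{1}{2}W(\log W) \tag{6.19}
\]
by (2.19).

Bounding $S_2$ for $\delta$ arbitrary. We set
\[
W_0 = \min(\max(2\theta q, V),\ x/U),
\]
where $\theta \ge e$ is a parameter that will be set later.

For $V \le W < W_0$, we use the bound (5.53):
\[
\begin{aligned}
S_2(U',W',W) &\le \left(\max(1,2\rho)\left(\frac{x}{8q}+\frac{x}{2W}\right)+\frac{W}{2}+2q\right)\cdot\frac{1}{2}W(\log W)\\
&\le \max\left(\frac{1}{2},\rho\right)\left(\frac{W}{8q}+\frac{1}{2}\right)x\log W+\frac{W^2\log W}{4}+qW\log W,
\end{aligned}
\]
where $\rho = q/Q$.

If $W_0 > V$, the contribution of the terms with $V \le W < W_0$ to (6.15) is (by (6.18)) bounded by
\[
\begin{aligned}
&4\int_V^{W_0}\sqrt{\kappa_1\frac{x}{W}\left(\frac{\rho_0}{4}\left(\frac{W}{4q}+1\right)x\log W+\frac{W^2\log W}{4}+qW\log W\right)}\,\frac{dW}{W}\\
&\le \frac{\kappa_2}{2}\sqrt{\rho_0}\,x\int_V^{W_0}\frac{\sqrt{\log W}}{W^{3/2}}\,dW+\frac{\kappa_2}{2}\sqrt{x}\int_V^{W_0}\frac{\sqrt{\log W}}{W^{1/2}}\,dW
+\kappa_2\sqrt{\frac{\rho_0x^2}{16q}+qx}\int_V^{W_0}\frac{\sqrt{\log W}}{W}\,dW\\
&\le \left(\kappa_2\sqrt{\rho_0}\,\frac{x}{\sqrt{V}}+\kappa_2\sqrt{xW_0}\right)\sqrt{\log W_0}
+\frac{2\kappa_2}{3}\sqrt{\frac{\rho_0x^2}{16q}+qx}\,\left((\log W_0)^{3/2}-(\log V)^{3/2}\right),
\end{aligned} \tag{6.20}
\]
where $\rho_0 = \max(1, 2\rho)$ and
\[
\kappa_2 = 4\sqrt{\kappa_1} \le 1.93768.
\]
(We are using the easy bound $\sqrt{a+b+c} \le \sqrt{a}+\sqrt{b}+\sqrt{c}$.)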


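We now examine the terms with $W \ge W_0$. If $2\theta q > x/U$, then $W_0 = x/U$, the contribution of the case is nil, and the computations below can be ignored. Thus, we can assume that $2\theta q \le x/U$.

We use (5.54):
\[
S_2(U',W',W) \le \left(\frac{x}{4\phi(q)}\,\frac{1}{\log(W/2q)}+\frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)}\right)\cdot\frac{1}{2}W\log W.
\]
By $\sqrt{a+b} \le \sqrt{a}+\sqrt{b}$, we can take out the $\frac{q}{\phi(q)}\cdot\frac{W}{\log(W/2q)}$ term and estimate its contribution on its own; it is at most
\[
\begin{aligned}
4\int_{W_0}^{x/U}\sqrt{\kappa_1\frac{x}{W}\cdot\frac{q}{\phi(q)}\cdot\frac{1}{2}W^2\,\frac{\log W}{\log W/2q}}\,\frac{dW}{W}
&= \frac{\kappa_2}{\sqrt{2}}\sqrt{\frac{q}{\phi(q)}}\int_{W_0}^{x/U}\sqrt{\frac{x\log W}{W\log W/2q}}\,dW\\
&\le \frac{\kappa_2}{\sqrt{2}}\sqrt{\frac{qx}{\phi(q)}}\int_{W_0}^{x/U}\frac{1}{\sqrt{W}}\left(1+\sqrt{\frac{\log 2q}{\log W/2q}}\right)dW.
\end{aligned} \tag{6.21}
\]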

Now
\[
\int_{W_0}^{x/U}\frac{1}{\sqrt{W}}\sqrt{\frac{\log 2q}{\log W/2q}}\,dW = \sqrt{2q\log 2q}\int_{\max(\theta,\,V/2q)}^{x/2Uq}\frac{1}{\sqrt{t\log t}}\,dt.
\]
We bound this last integral somewhat crudely: for $T \ge e$,
\[
\int_e^T\frac{1}{\sqrt{t\log t}}\,dt \le 2.3\sqrt{\frac{T}{\log T}}. \tag{6.22}
\]
(This is shown as follows: since
\[
\frac{1}{\sqrt{T\log T}} < \left(2.3\sqrt{\frac{T}{\log T}}\right)'
\]
if and only if $T > T_0$, where $T_0 = e^{(1-2/2.3)^{-1}} = 2135.94\ldots$, it is enough to check (numerically) that (6.22) holds for $T = T_0$.) Since $\theta \ge e$, this gives us that
\[
\int_{W_0}^{x/U}\frac{1}{\sqrt{W}}\left(1+\sqrt{\frac{\log 2q}{\log W/2q}}\right)dW \le 2\sqrt{\frac{x}{U}}+2.3\sqrt{2q\log 2q}\cdot\sqrt{\frac{x/2Uq}{\log x/2Uq}},
\]
and so (6.21) is at most
\[
\sqrt{2}\,\kappa_2\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}.
\]
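The numerical check behind (6.22) can be sketched as follows. After the substitution $t = e^u$, the integral becomes $\int_1^{\log T} e^{u/2}u^{-1/2}\,du$, which has a smooth integrand and is evaluated here by composite Simpson's rule. The substitution, the quadrature rule and the sample values of $T$ are our choices; a rigorous verification would use interval arithmetic, as discussed in §2.6.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3


def integral_622(T):
    """Integral of 1/sqrt(t log t) over [e, T], computed after t = exp(u)."""
    return simpson(lambda u: math.exp(u / 2) / math.sqrt(u), 1.0, math.log(T))


def bound_622(T):
    """Right-hand side of (6.22)."""
    return 2.3 * math.sqrt(T / math.log(T))


T0 = math.exp(1 / (1 - 2 / 2.3))   # critical point of (bound - integral)
print(T0, integral_622(T0), bound_622(T0))
for T in (3.0, 10.0, 100.0, 1000.0, 1e4, 1e6):
    print(T, integral_622(T) <= bound_622(T))
```

Since the difference between the two sides of (6.22) decreases up to $T_0$ and increases afterwards, the gap printed at $T_0$ is the smallest one over all $T \ge e$.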

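We are left with what will usually be the main term, viz.,
\[
4\int_{W_0}^{x/U}\sqrt{S_1(U,W)\cdot\left(\frac{x}{8\phi(q)}\,\frac{\log W}{\log W/2q}\right)W}\,\frac{dW}{W}, \tag{6.23}
\]
which, by (5.34), is at most $x/\sqrt{\phi(q)}$ times the integral of
\[
\frac{1}{W}\sqrt{\left(2H_2\left(\frac{x}{WU}\right)+\frac{\kappa_4}{2}\sqrt{\frac{x/WU}{U}}\right)\frac{\log W}{\log W/2q}}
\]
for $W$ going from $W_0$ to $x/U$, where $H_2$ is as in (5.35) and
\[
\kappa_4 = 4\kappa_0\,\zeta(3/2)^3 \le 90.5671.
\]
By the arithmetic/geometric mean inequality, the integrand is at most $1/W$ times
\[
\frac{\beta+\beta^{-1}\cdot 2H_2(x/WU)}{2}+\frac{\beta^{-1}}{2}\,\frac{\kappa_4}{2}\sqrt{\frac{x/WU}{U}}+\frac{\beta}{2}\,\frac{\log 2q}{\log W/2q} \tag{6.24}
\]
for any $\beta > 0$. We will choose $\beta$ later.

The first summand in (6.24) gives what we can think of as the main or worst term in the whole paper; let us compute it first. The integral is
\[
\int_{W_0}^{x/U}\frac{\beta+\beta^{-1}\cdot 2H_2(x/WU)}{2}\,\frac{dW}{W} = \int_1^{x/UW_0}\frac{\beta+\beta^{-1}\cdot 2H_2(s)}{2}\,\frac{ds}{s} \le \left(\frac{\beta}{2}+\frac{\kappa_6}{4\beta}\right)\log\frac{x}{UW_0} \tag{6.25}
\]
by (5.36), where
\[
\kappa_6 = 0.60428.
\]
Thus the main term is simply
\[
\left(\frac{\beta}{2}+\frac{\kappa_6}{4\beta}\right)\frac{x}{\sqrt{\phi(q)}}\log\frac{x}{UW_0}. \tag{6.26}
\]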

The integral of the second summand is at most
\[
\beta^{-1}\cdot\frac{\kappa_4}{4}\,\frac{\sqrt{x}}{U}\int_V^{x/U}\frac{dW}{W^{3/2}} \le \beta^{-1}\cdot\frac{\kappa_4}{2}\sqrt{\frac{x/UV}{U}}.
\]
By (6.17), this is at most
\[
\frac{\beta^{-1}}{\sqrt{2}}\cdot 10^{-3}\cdot\kappa_4 \le \beta^{-1}\kappa_7/2,
\]
where
\[
\kappa_7 = \frac{\sqrt{2}\,\kappa_4}{1000} \le 0.1281.
\]
Thus the contribution of the second summand is at most
\[
\frac{\beta^{-1}\kappa_7}{2}\cdot\frac{x}{\sqrt{\phi(q)}}.
\]
The integral of the third summand in (6.24) is
\[
\frac{\beta}{2}\int_{W_0}^{x/U}\frac{\log 2q}{\log W/2q}\,\frac{dW}{W}. \tag{6.27}
\]

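If $V < 2\theta q \le x/U$, this is
\[
\frac{\beta}{2}\int_{2\theta q}^{x/U}\frac{\log 2q}{\log W/2q}\,\frac{dW}{W} = \frac{\beta}{2}\log 2q\cdot\int_{\theta}^{x/2Uq}\frac{1}{\log t}\,\frac{dt}{t}
= \frac{\beta}{2}\log 2q\cdot\left(\log\log\frac{x}{2Uq}-\log\log\theta\right).
\]
If $2\theta q > x/U$, the integral is over an empty range and its contribution is hence $0$.

If $2\theta q \le V$, (6.27) is
\[
\begin{aligned}
\frac{\beta}{2}\int_V^{x/U}\frac{\log 2q}{\log W/2q}\,\frac{dW}{W} &= \frac{\beta\log 2q}{2}\int_{V/2q}^{x/2Uq}\frac{1}{\log t}\,\frac{dt}{t}\\
&= \frac{\beta\log 2q}{2}\cdot\left(\log\log\frac{x}{2Uq}-\log\log V/2q\right)\\
&= \frac{\beta\log 2q}{2}\cdot\log\left(1+\frac{\log x/UV}{\log V/2q}\right).
\end{aligned} \tag{6.28}
\]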

(Let us stop for a moment and ask ourselves when this will be smaller than what we can see as the main term, namely, the term $(\beta/2)\log x/UW_0$ in (6.25). Clearly, $\log(1 + (\log x/UV)/(\log V/2q)) \le (\log x/UV)/(\log V/2q)$, and that is smaller than $(\log x/UV)/\log 2q$ when $V/2q > 2q$. Of course, it does not actually matter if (6.28) is smaller than the term from (6.25) or not, since we are looking for upper bounds here, not for asymptotics.)

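The total bound for (6.23) is thus
\[
\frac{x}{\sqrt{\phi(q)}}\cdot\left(\beta\cdot\left(\frac{1}{2}\log\frac{x}{UW_0}+\frac{\Phi}{2}\right)+\beta^{-1}\left(\frac{1}{4}\kappa_6\log\frac{x}{UW_0}+\frac{\kappa_7}{2}\right)\right), \tag{6.29}
\]
where
\[
\Phi = \begin{cases}
\log 2q\left(\log\log\dfrac{x}{2Uq}-\log\log\theta\right) & \text{if } V/2\theta < q < x/(2\theta U),\\[2ex]
\log 2q\,\log\left(1+\dfrac{\log x/UV}{\log V/2q}\right) & \text{if } q \le V/2\theta.
\end{cases} \tag{6.30}
\]
Choosing $\beta$ optimally, we obtain that (6.23) is at most
\[
\frac{x}{\sqrt{2\phi(q)}}\sqrt{\left(\log\frac{x}{UW_0}+\Phi\right)\left(\kappa_6\log\frac{x}{UW_0}+2\kappa_7\right)}, \tag{6.31}
\]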

where $\Phi$ is as in (6.30).

Bounding $S_2$ for $|\delta| \ge 8$. Let us see how much a non-zero $\delta$ can help us. It makes sense to apply (5.56) only when $|\delta| \ge 8$; otherwise (5.54) is almost certainly better. Now, by definition, $|\delta|/x \le 1/qQ$, and so $|\delta| \ge 8$ can happen only when $q \le x/8Q$.

With this in mind, let us apply (5.56), assuming $|\delta| > 8$. Note first that
\[
\frac{x}{|\delta q|}\left(q+\frac{x}{4W}\right)^{-1} \ge \frac{1/|\delta q|}{\frac{q}{x}+\frac{1}{4W}} \ge \frac{4/|\delta q|}{\frac{1}{2Q}+\frac{1}{W}} \ge \frac{4W}{|\delta|q}\cdot\frac{1}{1+\frac{W}{2Q}} \ge \frac{4W}{|\delta|q}\cdot\frac{1}{1+\frac{x/U}{2Q}}.
\]
This is at least $2\min(2Q, W)/|\delta q|$. Thus we are allowed to apply (5.56) when $|\delta q| \le 2\min(2Q, W)$. Since $Q \ge x/U$, we know that $\min(2Q, W) = W$ for all $W \le x/U$, and so it is enough to assume that $|\delta q| \le 2W$. We will soon be making a stronger assumption.

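Recalling also (6.19), we see that (5.56) gives us
\[
S_2(U',W',W) \le \min\left(1,\ \frac{2q/\phi(q)}{\log\left(\frac{4W}{|\delta|q}\cdot\frac{1}{1+\frac{x/U}{2Q}}\right)}\right)\left(\frac{x}{|\delta q|}+\frac{W}{2}\right)\cdot\frac{1}{2}W(\log W). \tag{6.32}
\]
Similarly to before, we define $W_0 = \max(V, \theta|\delta q|)$, where $\theta \ge 3e^2/8$ will be set later. (Here $\theta \ge 3e^2/8$ is an assumption we do not yet need, but we will be using it soon to simplify matters slightly.) For $W \ge W_0$, we certainly have $|\delta q| \le 2W$. Hence the part of the first term of (6.15) coming from the range $W_0 \le W < x/U$ is
\[
\begin{aligned}
&4\int_{W_0}^{x/U}\sqrt{S_1(U,W)\cdot S_2(U,V,W)}\,\frac{dW}{W}\\
&\le 4\sqrt{\frac{q}{\phi(q)}}\int_{W_0}^{x/U}\sqrt{S_1(U,W)\cdot\frac{\log W}{\log\left(\frac{4W}{|\delta|q}\cdot\frac{1}{1+\frac{x/U}{2Q}}\right)}\left(\frac{Wx}{|\delta q|}+\frac{W^2}{2}\right)}\,\frac{dW}{W}.
\end{aligned} \tag{6.33}
\]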


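By (5.34), the contribution of the term $Wx/|\delta q|$ to (6.33) is at most
\[
\frac{4x}{\sqrt{|\delta|\phi(q)}}\int_{W_0}^{x/U}\sqrt{\left(H_2\left(\frac{x}{WU}\right)+\frac{\kappa_4}{4}\sqrt{\frac{x/WU}{U}}\right)\frac{\log W}{\log\left(\frac{4W}{|\delta|q}\cdot\frac{1}{1+\frac{x/U}{2Q}}\right)}}\,\frac{dW}{W}.
\]
Note that $1 + (x/U)/2Q \le 3/2$. Proceeding as in (6.23)–(6.31), we obtain that this is at most
\[
\frac{2x}{\sqrt{|\delta|\phi(q)}}\sqrt{\left(\log\frac{x}{UW_0}+\Phi\right)\left(\kappa_6\log\frac{x}{UW_0}+2\kappa_7\right)},
\]
where
\[
\Phi = \begin{cases}
\log\dfrac{(1+\epsilon_1)|\delta q|}{4}\,\log\left(1+\dfrac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}}\right) & \text{if } |\delta q| \le V/\theta,\\[2ex]
\log\dfrac{3|\delta q|}{8}\left(\log\log\dfrac{8x}{3U|\delta q|}-\log\log\dfrac{8\theta}{3}\right) & \text{if } V/\theta < |\delta q| \le x/\theta U,
\end{cases} \tag{6.34}
\]
where $\epsilon_1 = x/2UQ$. This is what we think of as the main term.

By (6.18), the contribution of the term $W^2/2$ to (6.33) is at most
\[
4\sqrt{\frac{q}{\phi(q)}}\int_{W_0}^{x/U}\sqrt{\frac{\kappa_1}{2}x}\,\frac{dW}{\sqrt{W}}\cdot\max_{W_0\le W\le\frac{x}{U}}\sqrt{\frac{\log W}{\log\frac{8W}{3|\delta q|}}}. \tag{6.35}
\]
Since $t \mapsto (\log t)/(\log t/c)$ is decreasing for $t > c$, (6.35) is at most
\[
4\sqrt{2\kappa_1}\sqrt{\frac{q}{\phi(q)}}\left(\frac{x}{\sqrt{U}}-\sqrt{xW_0}\right)\sqrt{\frac{\log W_0}{\log\frac{8W_0}{3|\delta q|}}}. \tag{6.36}
\]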

If $W_0 > V$, we also have to consider the range $V \le W < W_0$. By Prop. 5.2.4 and (6.19), the part of (6.15) coming from this is
\[
4\int_V^{\theta|\delta q|}\sqrt{S_1(U,W)\cdot(\log W)\left(\frac{Wx}{2|\delta q|}+\frac{W^2}{4}+\frac{Wx}{16(1-\rho)Q}+\frac{x}{8(1-\rho)}\right)}\,\frac{dW}{W}.
\]
The contribution of $W^2/4$ is at most
\[
4\int_V^{W_0}\sqrt{\kappa_1\frac{x}{W}\log W\cdot\frac{W^2}{4}}\,\frac{dW}{W} \le 4\sqrt{\kappa_1}\cdot\sqrt{xW_0}\cdot\sqrt{\log W_0};
\]
the sum of this and (6.36) is at most
\[
4\sqrt{\kappa_1}\left(\sqrt{\frac{2q}{\phi(q)}}\left(\frac{x}{\sqrt{U}}-\sqrt{xW_0}\right)\sqrt{\frac{\log W_0}{\log\frac{8\theta}{3}}}+\sqrt{xW_0}\sqrt{\log W_0}\right)
\le \kappa_2\cdot\sqrt{\frac{q}{\phi(q)}}\,\frac{x}{\sqrt{U}}\sqrt{\log W_0},
\]
where we use the facts that $W_0 = \theta|\delta q|$ (by $W_0 > V$) and $\theta \ge 3e^2/8$ (so that $\log(8\theta/3) \ge 2$), and where we recall that $\kappa_2 = 4\sqrt{\kappa_1}$.

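The terms $Wx/2|\delta|q$ and $Wx/(16(1-\rho)Q)$ contribute at most
\[
\begin{aligned}
&4\sqrt{\kappa_1}\int_V^{\theta|\delta q|}\sqrt{\frac{x}{W}\cdot(\log W)\,W\left(\frac{x}{2|\delta q|}+\frac{x}{16(1-\rho)Q}\right)}\,\frac{dW}{W}\\
&= \kappa_2\,x\left(\frac{1}{\sqrt{2|\delta|q}}+\frac{1}{4\sqrt{(1-\rho)Q}}\right)\int_V^{\theta|\delta q|}\sqrt{\log W}\,\frac{dW}{W}\\
&= \frac{2\kappa_2}{3}\,x\left(\frac{1}{\sqrt{2|\delta|q}}+\frac{1}{4\sqrt{(1-\rho)Q}}\right)\left((\log\theta|\delta|q)^{3/2}-(\log V)^{3/2}\right).
\end{aligned}
\]
The term $x/8(1-\rho)$ contributes
\[
\sqrt{2\kappa_1}\,x\int_V^{\theta|\delta q|}\sqrt{\frac{\log W}{W(1-\rho)}}\,\frac{dW}{W} \le \frac{\sqrt{2\kappa_1}\,x}{\sqrt{1-\rho}}\int_V^{\infty}\frac{\sqrt{\log W}}{W^{3/2}}\,dW
\le \frac{\kappa_2\,x}{\sqrt{2(1-\rho)V}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right),
\]
where we use the estimate
\[
\begin{aligned}
\int_V^{\infty}\frac{\sqrt{\log W}}{W^{3/2}}\,dW &= \frac{1}{\sqrt{V}}\int_1^{\infty}\frac{\sqrt{\log u+\log V}}{u^{3/2}}\,du\\
&\le \frac{1}{\sqrt{V}}\int_1^{\infty}\frac{\sqrt{\log V}}{u^{3/2}}\,du+\frac{1}{\sqrt{V}}\int_1^{\infty}\frac{1}{2\sqrt{\log V}}\,\frac{\log u}{u^{3/2}}\,du\\
&= 2\,\frac{\sqrt{\log V}}{\sqrt{V}}+\frac{1}{2\sqrt{V\log V}}\cdot 4 \le \frac{2}{\sqrt{V}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right).
\end{aligned}
\]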
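The last estimate is easy to test numerically. Substituting $W = Ve^{s}$ turns the integral into $V^{-1/2}\int_0^\infty\sqrt{s+\log V}\,e^{-s/2}\,ds$; the sketch below evaluates this with Simpson's rule on a truncated range (the truncation at $s = 120$, the substitution and the sample value $V = 2\cdot 10^6$ are our choices) and compares it with the bound $(2/\sqrt{V})(\sqrt{\log V}+\sqrt{1/\log V})$.

```python
import math


def simpson(f, a, b, n=12000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3


def tail_integral(V, s_max=120.0):
    """Integral of sqrt(log W)/W^(3/2) over [V, infinity), via W = V*exp(s).

    The range of s is truncated at s_max; the neglected part is of size
    roughly exp(-s_max/2) and therefore negligible here.
    """
    log_v = math.log(V)
    integrand = lambda s: math.sqrt(s + log_v) * math.exp(-s / 2)
    return simpson(integrand, 0.0, s_max) / math.sqrt(V)


def tail_bound(V):
    """Upper bound (2/sqrt(V)) * (sqrt(log V) + sqrt(1/log V))."""
    log_v = math.log(V)
    return 2 / math.sqrt(V) * (math.sqrt(log_v) + math.sqrt(1 / log_v))


V = 2e6
print(tail_integral(V), tail_bound(V))
```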

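It is time to collect all type II terms. Let us start with the case of general $\delta$. We will set $\theta \ge e$ later. If $q \le V/2\theta$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\left(\log\frac{x}{UV}+\log 2q\,\log\left(1+\frac{\log x/UV}{\log V/2q}\right)\right)\left(\kappa_6\log\frac{x}{UV}+2\kappa_7\right)}\\
&+\sqrt{2}\,\kappa_2\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}+\kappa_9\,\frac{x}{\sqrt{V}}.
\end{aligned} \tag{6.37}
\]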


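If $V/2\theta < q \le x/2\theta U$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\left(\log\frac{x}{U\cdot 2\theta q}+\log 2q\,\log\frac{\log x/2Uq}{\log\theta}\right)\left(\kappa_6\log\frac{x}{U\cdot 2\theta q}+2\kappa_7\right)}\\
&+\sqrt{2}\,\kappa_2\sqrt{\frac{q}{\phi(q)}}\left(1+1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}}+\left(\kappa_2\sqrt{\log 2\theta q}+\kappa_9\right)\frac{x}{\sqrt{V}}\\
&+\frac{\kappa_2}{6}\left((\log 2\theta q)^{3/2}-(\log V)^{3/2}\right)\frac{x}{\sqrt{q}}\\
&+\kappa_2\left(\sqrt{2\theta\cdot\log 2\theta q}+\frac{2}{3}\left((\log 2\theta q)^{3/2}-(\log V)^{3/2}\right)\right)\sqrt{qx},
\end{aligned} \tag{6.38}
\]
where we use the fact that $Q \ge x/U$ (implying that $\rho_0 = \max(1, 2q/Q)$ equals $1$ for $q \le x/2U$). Finally, if $q > x/2\theta U$,
\[
\begin{aligned}
|S_{II}| \le{}& \left(\kappa_2\sqrt{2\log x/U}+\kappa_9\right)\frac{x}{\sqrt{V}}+\kappa_2\sqrt{\log x/U}\,\frac{x}{\sqrt{U}}\\
&+\frac{2\kappa_2}{3}\left((\log x/U)^{3/2}-(\log V)^{3/2}\right)\left(\frac{x}{2\sqrt{2q}}+\sqrt{qx}\right).
\end{aligned} \tag{6.39}
\]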

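Now let us examine the alternative bounds for $|\delta| \ge 8$. Here we assume $\theta \ge 3e^2/8$. If $|\delta q| \le V/\theta$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{2x}{\sqrt{|\delta|\phi(q)}}\sqrt{\log\frac{x}{UV}+\log\frac{|\delta q|(1+\epsilon_1)}{4}\,\log\left(1+\frac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}}\right)}\cdot\sqrt{\kappa_6\log\frac{x}{UV}+2\kappa_7}\\
&+\kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log 2V/|\delta q|}}\cdot\frac{x}{\sqrt{U}}+\kappa_9\,\frac{x}{\sqrt{V}},
\end{aligned} \tag{6.40}
\]
where $\epsilon_1 = x/2UQ$. If $V/\theta < |\delta|q \le x/\theta U$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{2x}{\sqrt{|\delta|\phi(q)}}\sqrt{\left(\log\frac{x}{U\cdot\theta|\delta|q}+\log\frac{3|\delta q|}{8}\,\log\frac{\log\frac{8x}{3U|\delta q|}}{\log 8\theta/3}\right)\left(\kappa_6\log\frac{x}{U\cdot\theta|\delta q|}+2\kappa_7\right)}\\
&+\frac{2\kappa_2}{3}\left(\frac{x}{\sqrt{2|\delta q|}}+\frac{x}{4\sqrt{Q-q}}\right)\left((\log\theta|\delta q|)^{3/2}-(\log V)^{3/2}\right)\\
&+\left(\frac{\kappa_2}{\sqrt{2(1-\rho)}}\left(\sqrt{\log V}+\sqrt{1/\log V}\right)+\kappa_9\right)\frac{x}{\sqrt{V}}+\kappa_2\sqrt{\frac{q}{\phi(q)}}\cdot\sqrt{\log\theta|\delta q|}\cdot\frac{x}{\sqrt{U}},
\end{aligned} \tag{6.41}
\]
where $\rho = q/Q$. Note that $|\delta| \le x/Qq$ implies $\rho \le x/Q^2$, and so $\rho$ will be very small and $Q - q$ will be very close to $Q$.

The case $|\delta q| > x/\theta U$ will not arise in practice, essentially because of $|\delta|q \le x/Q$.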

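6.3 Adjusting parameters. Calculations.

We must bound the exponential sum $\sum_n\Lambda(n)e(\alpha n)\eta(n/x)$. By (3.8), it is enough to sum the bounds we obtained in §6.2. We will now see how it will be best to set $U$, $V$ and other parameters.

Usually, the largest terms will be
\[
C_0\,UV, \tag{6.42}
\]
where $C_0$ equals
\[
\begin{cases}
c_{4,I_2}+c_{9,I_2} = 4.39779+5.21993\,\epsilon_0 & \text{if } |\delta| \le 1/2c_2 \sim 0.74463,\\
c_{4,I_2}+(1+\epsilon)c_{13,I_2} = (4.89106+1.31541\,\epsilon)(1+\epsilon_0) & \text{if } |\delta| > 1/2c_2
\end{cases} \tag{6.43}
\]
(from (6.13) and (6.14), type I; we will specify $\epsilon$ and $\epsilon_0 = (4\log 2)/(x/UV)$ later) and
\[
\frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{\log\frac{x}{UV}+(\log\delta_0(1+\epsilon_1)q)\,\log\left(1+\frac{\log x/UV}{\log\frac{V}{\delta_0(1+\epsilon_1)q}}\right)}\cdot\sqrt{\kappa_6\log\frac{x}{UV}+2\kappa_7} \tag{6.44}
\]
(from (6.37) and (6.40), type II; here $\delta_0 = \max(2, |\delta|/4)$, while $\epsilon_1 = x/2UQ$ for $|\delta| > 8$ and $\epsilon_1 = 0$ for $|\delta| < 8$).

We set $UV = \kappa x/\sqrt{q\delta_0}$; we must choose $\kappa > 0$.

Let us first optimize (or, rather, almost optimize) $\kappa$ in the case $|\delta| \le 4$, so that $\delta_0 = 2$ and $\epsilon_1 = 0$. For the purpose of choosing $\kappa$, we replace $\sqrt{\phi(q)}$ by $\sqrt{q}/C_1$, where $C_1 = 2.3536 \sim \sqrt{510510/\phi(510510)}$, and also replace $V$ by $q^2/c$, $c$ a constant. We use the approximation
\[
\begin{aligned}
\log\left(1+\frac{\log x/UV}{\log V/|2q|}\right) &= \log\left(1+\frac{\log(\sqrt{2q}/\kappa)}{\log(q/2c)}\right) = \log\left(\frac{3}{2}+\frac{\log 2\sqrt{c}/\kappa}{\log q/2c}\right)\\
&\sim \log\frac{3}{2}+\frac{2\log 2\sqrt{c}/\kappa}{3\log q/2c}.
\end{aligned}
\]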


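What we must minimize, then, is

$$\begin{aligned}
&\frac{C_0\kappa}{\sqrt{2q}} + \frac{C_1}{\sqrt{2q}}\sqrt{\left(\log\frac{\sqrt{2q}}{\kappa} + \log 2q\left(\log\frac{3}{2} + \frac{2\log\frac{2\sqrt{c}}{\kappa}}{3\log\frac{q}{2c}}\right)\right)\left(\kappa_6\log\frac{\sqrt{2q}}{\kappa} + 2\kappa_7\right)}\\
&\le \frac{C_0\kappa}{\sqrt{2q}} + \frac{C_1}{2\sqrt{q}}\frac{\sqrt{\kappa_6}}{\sqrt{\kappa'_1}}\sqrt{\kappa'_1\log q - \left(\frac{5}{3} + \frac{2}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa + \kappa'_2}\cdot\sqrt{\kappa'_1\log q - 2\kappa'_1\log\kappa + \frac{4\kappa'_1\kappa_7}{\kappa_6} + \kappa'_1\log 2}\\
&\le \frac{C_0}{\sqrt{2q}}\left(\kappa + \kappa'_4\left(\kappa'_1\log q - \left(\left(\frac{5}{6} + \kappa'_1\right) + \frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa + \kappa'_3\right)\right),
\end{aligned} \tag{6.45}$$

where

$$\kappa'_1 = \frac{1}{2} + \log\frac{3}{2}, \qquad \kappa'_2 = \log\sqrt{2} + \log 2\log\frac{3}{2} + \frac{\log 4c\,\log 2q}{3\log\frac{q}{2c}},$$

$$\kappa'_3 = \frac{1}{2}\left(\kappa'_2 + \frac{4\kappa'_1\kappa_7}{\kappa_6} + \kappa'_1\log 2\right) = \frac{\log 4c}{6} + \frac{(\log 4c)^2}{6\log\frac{q}{2c}} + \kappa'_5,$$

$$\kappa'_4 = \frac{C_1}{C_0}\sqrt{\frac{\kappa_6}{2\kappa'_1}} \sim \begin{cases} \dfrac{0.30915}{1 + 1.18694\epsilon_0} & \text{if } |\delta| \le 4,\\[2mm] \dfrac{0.27797}{(1 + 0.26894\epsilon)(1+\epsilon_0)} & \text{if } |\delta| > 4, \end{cases}$$

$$\kappa'_5 = \frac{1}{2}\left(\log\sqrt{2} + \log 2\log\frac{3}{2} + \frac{4\kappa'_1\kappa_7}{\kappa_6} + \kappa'_1\log 2\right) \sim 1.01152.$$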

Taking derivatives, we see that the minimum is attained when

$$\kappa = \left(\frac{5}{6} + \kappa'_1 + \frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\kappa'_4 \sim \left(1.7388 + \frac{\log 4c}{3\log\frac{q}{2c}}\right)\cdot\frac{0.30915}{1 + 1.19\epsilon_0} \tag{6.46}$$

provided that $|\delta| \le 4$. (What we obtain for $|\delta| > 4$ is essentially the same, only with $\delta_0 q = \delta q/4$ instead of $2q$ and $0.27797/((1 + 0.27\epsilon)(1 + \epsilon_0))$ in place of $0.30915$.) For $q = 5\cdot 10^5$, $c = 2.5$ and $|\delta| \le 4$ (typical values in the most delicate range), we get that $\kappa$ should be about $0.5582/(1 + 1.19\epsilon_0)$. Values of $q$, $c$ nearby give similar values for $\kappa$, whether for $|\delta| \le 4$ or for $|\delta| > 4$.
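The two numerical values quoted above can be rechecked with a few lines of Python. This is only a sanity check under stated assumptions, not part of the argument: decimal points are read as $C_1 = 2.3536$, $\kappa'_4 \approx 0.30915$, $q = 5\cdot 10^5$, $c = 2.5$, the factor $1 + 1.19\epsilon_0$ is ignored, and the helper names `euler_phi` and `kappa_opt` are hypothetical.

```python
import math

def euler_phi(n: int) -> int:
    """Euler's totient by trial division (adequate for small n)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def kappa_opt(q: float, c: float, kappa4: float = 0.30915) -> float:
    """Right-hand side of (6.46) with epsilon_0 = 0; kappa4 is quoted from the text."""
    kappa1 = 0.5 + math.log(1.5)
    return (5.0 / 6.0 + kappa1 + math.log(4 * c) / (3 * math.log(q / (2 * c)))) * kappa4

# C_1 is taken here as the square root of q/phi(q) at q = 510510 = 2*3*5*7*11*13*17.
C1 = math.sqrt(510510 / euler_phi(510510))
kappa = kappa_opt(5e5, 2.5)
print(f"C1 = {C1:.4f}, kappa = {kappa:.4f}")
```

Values of `q` and `c` near the ones above change `kappa` only slightly, which is consistent with the simple choice of $\kappa$ made below.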

(Incidentally, at this point, we could already give a back-of-the-envelope estimate for the last line of (6.45), i.e., our main term. It suggests that choosing $w = 1$ instead of $w = 2$ would have given bounds worse by about 15 percent.)

We make the choices

$$\kappa = 1/2, \quad\text{and so}\quad UV = \frac{x}{2\sqrt{q\delta_0}},$$

for the sake of simplicity. (Unsurprisingly, (6.45) changes very slowly around its minimum.) Note, by the way, that this means that $\epsilon_0 = (2\log 2)/\sqrt{q\delta_0}$.

Now we must decide how to choose $U$, $V$ and $Q$, given our choice of $UV$. We will actually make two sets of choices.


First, we will use the $S_{I,2}$ estimates for $q \le Q/V$ to treat all $\alpha$ of the form $\alpha = a/q + O^*(1/qQ)$, $q \le y$. (Here $y$ is a parameter satisfying $y \le Q/V$.)

Then the remaining $\alpha$ will get treated with the (coarser) $S_{I,2}$ estimate for $q > Q/V$, with $Q$ reset to a lower value (call it $Q'$). If $\alpha$ was not treated in the first go (so that it must be dealt with by the coarser estimate), then $\alpha = a'/q' + \delta'/x$, where either $q' > y$ or $\delta' q' > x/Q$. (Otherwise, $\alpha = a'/q' + O^*(1/q'Q)$ would be a valid estimate with $q' \le y$.) The value of $Q'$ is set to be smaller than $Q$ both because this is helpful (it diminishes error terms that would be large for large $q$) and because this is harmless (since we are no longer assuming that $q \le Q/V$).

6.3.1 First choice of parameters: $q \le y$

The largest items affected strongly by our choices at this point are

$$\begin{aligned}
&c_{16,I_2}\left(2 + \frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{x}{Q/V} + c_{17,I_2}Q && \text{(from } S_{I,2},\ |\delta| > 1/2c_2\text{)},\\
&\left(c_{10,I_2}\log\frac{U}{q} + 2c_{5,I_2} + c_{12,I_2}\right)Q && \text{(from } S_{I,2},\ |\delta| \le 1/2c_2\text{)},
\end{aligned} \tag{6.47}$$

and

$$\kappa_2\sqrt{\frac{2q}{\phi(q)}}\left(1 + 1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}} + \kappa_9\frac{x}{\sqrt{V}} \qquad \text{(from } S_{II}\text{, any } |\delta|\text{)}, \tag{6.48}$$

with

$$\kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log 2V/|\delta q|}}\cdot\frac{x}{\sqrt{U}} \qquad \text{(from } S_{II}\text{)}$$

as an alternative to (6.48) for $|\delta| \ge 8$. (In several of these expressions, we are applying some minor simplifications that our later choices will justify. Of course, even if these simplifications were not justified, we would not be getting incorrect results, only potentially suboptimal ones; we are trying to decide how to choose certain parameters.)

In addition, we have a relatively mild but important dependence on $V$ in the main term (6.44), even when we hold $UV$ constant (as we do, in so far as we have already chosen $UV$). We must also respect the condition $q \le Q/V$, the lower bound on $U/(x/UV)$ given by (6.17) and the assumptions made at the beginning of the chapter (e.g. $Q \ge x/U$, $V \ge 2\cdot 10^6$). Recall that $UV = x/2\sqrt{q\delta_0}$.

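We set

$$Q = \frac{x}{8y},$$

since we will then have not just $q \le y$ but also $q|\delta| \le x/Q = 8y$, and so $q\delta_0 \le 2y$. We want $q \le Q/V$ to be true whenever $q \le y$; this means that

$$q \le \frac{Q}{V} = \frac{QU}{UV} = \frac{QU}{x/2\sqrt{q\delta_0}} = \frac{U\sqrt{q\delta_0}}{4y}$$

must be true when $q \le y$, and so it is enough to set $U = 4y^2/\sqrt{q\delta_0}$. The following choices make sense: we will work with the parameters

$$\begin{aligned}
y &= \frac{x^{1/3}}{6}, \qquad Q = \frac{x}{8y} = \frac{3}{4}x^{2/3}, \qquad x/UV = 2\sqrt{q\delta_0} \le 2\sqrt{2y},\\
U &= \frac{4y^2}{\sqrt{q\delta_0}} = \frac{x^{2/3}}{9\sqrt{q\delta_0}}, \qquad V = \frac{x}{(x/UV)\cdot U} = \frac{x}{8y^2} = \frac{9x^{1/3}}{2},
\end{aligned} \tag{6.49}$$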

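where, as before, $\delta_0 = \max(2, |\delta|/4)$. So, for instance, we obtain $\epsilon_1 \le x/2UQ = 6\sqrt{q\delta_0}/x^{1/3} \le 2\sqrt{3}/x^{1/6}$. Assuming

$$x \ge 2.16\cdot 10^{20}, \tag{6.50}$$

we obtain that $U/(x/UV) \ge (x^{2/3}/9\sqrt{q\delta_0})/(2\sqrt{q\delta_0}) = x^{2/3}/18q\delta_0 \ge x^{1/3}/6 \ge 10^6$, and so (6.17) holds. We also get that $\epsilon_1 \le 0.002$.

Since $V = x/8y^2 = (9/2)x^{1/3}$, (6.50) also implies that $V \ge 2\cdot 10^6$ (in fact, $V \ge 27\cdot 10^6$). It is easy to check that

$$V < x/4, \qquad UV \le x, \qquad Q \ge \max(16, 2\sqrt{x}), \qquad Q \ge \max(2U, x/U), \tag{6.51}$$

as stated at the beginning of the chapter. Let $\theta = (3/2)^3 = 27/8$. Then

$$\begin{aligned}
\frac{V}{2\theta q} &= \frac{x/8y^2}{2\theta q} \ge \frac{x}{16\theta y^3} = \frac{x}{54y^3} = 4 > 1,\\
\frac{V}{\theta|\delta q|} &\ge \frac{x/8y^2}{8\theta y} \ge \frac{x}{64\theta y^3} = \frac{x}{216y^3} = 1.
\end{aligned} \tag{6.52}$$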
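The relations among $y$, $Q$, $U$, $V$ and $\theta$ in (6.49) through (6.52) are easy to confirm numerically at the threshold $x = 2.16\cdot 10^{20}$. The following is a floating-point sketch, not a rigorous verification; the function name `first_choice` and the dictionary keys are hypothetical, and the extreme case $q\delta_0 = 2y$ is plugged in by hand.

```python
import math

def first_choice(x: float, q_delta0: float) -> dict:
    """Parameters of (6.49) for a given x and a given value of q*delta_0."""
    y = x ** (1.0 / 3.0) / 6.0
    U = 4.0 * y * y / math.sqrt(q_delta0)
    V = x / (8.0 * y * y)
    Q = x / (8.0 * y)
    return {
        "y": y, "Q": Q, "U": U, "V": V,
        "x_over_UV": x / (U * V),      # should equal 2*sqrt(q*delta_0)
        "eps1": x / (2.0 * U * Q),     # upper bound on epsilon_1
    }

x0 = 2.16e20
y0 = x0 ** (1.0 / 3.0) / 6.0
p = first_choice(x0, 2.0 * y0)         # extreme case q*delta_0 = 2y
theta = 27.0 / 8.0

print("y =", p["y"], " V =", p["V"], " eps1 <=", p["eps1"])
print("U/(x/UV) =", p["U"] / p["x_over_UV"])
print("V/(2 theta y) =", p["V"] / (2 * theta * p["y"]))
```

At this value of $x$ the lower bound $U/(x/UV) \ge 10^6$ is attained, which is why the case $q\delta_0 = 2y$ is the one worth testing.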

    The first type I bound is

    |SI1| lex

    qmin

    (1cprime0δ2

    )min

    45

    qφ(q)

    log+ x23 9

    q52 δ

    120

    1

    (log 9x13

    radicqδ0 + c3I

    )+c4Iq

    φ(q)

    +

    (c7I log

    y

    c2+ c8I log x

    )y +

    c10Ix13

    3422q32δ120

    (log 9x13radiceqδ0)

    +

    (c5I log

    2x23

    9c2radicqδ0

    + c6I logx53

    9radicqδ0

    )x23

    9radicqδ0

    + c9Iradicx log

    2radicex

    c2+c10I

    e

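(6.53)

where the constants are as in §6.2.1. For any $c, R \ge 1$, the function

$$x \mapsto (\log cx)/(\log x/R)$$

attains its maximum on $[R', \infty]$, $R' > R$, at $x = R'$. Hence, for $q\delta_0$ fixed,

$$\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) \tag{6.54}$$

attains its maximum for $x \in [(9e^{4/5}(\delta_0 q)^{5/2}/4)^{3/2}, \infty)$ at

$$x = \left(9e^{4/5}(\delta_0 q)^{5/2}/4\right)^{3/2} = (27/8)e^{6/5}(q\delta_0)^{15/4}. \tag{6.55}$$

Now notice that, for smaller values of $x$, (6.54) increases as $x$ increases, since the term $\min(\dots, 1)$ equals the constant 1. Hence, (6.54) attains its maximum for $x \in (0, \infty)$ at (6.55), and so

$$\begin{aligned}
&\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) + c_{4,I}\\
&\le \log\frac{27}{2}e^{2/5}(\delta_0 q)^{7/4} + c_{3,I} + c_{4,I} \le \frac{7}{4}\log\delta_0 q + 6.11676.
\end{aligned}$$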

Examining the other terms in (6.53) and using (6.50), we conclude that

$$|S_{I,1}| \le \frac{x}{q}\min\left(1, \frac{c'_0}{\delta^2}\right)\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right) + \frac{x^{2/3}}{\sqrt{q\delta_0}}(0.67845\log x - 1.20818) + 0.37864x^{2/3}, \tag{6.56}$$

where we are using (6.50) (and, of course, the trivial bound $\delta_0 q \ge 2$) to simplify the smaller error terms. We recall that $c'_0 = 0.798437 > c_0/(2\pi)^2$.

    Let us now consider SI2 The terms that appear both for |δ| small and |δ| large aregiven in (612) The second line in (612) equals

    c8I2

    (x

    4q2δ0+

    2UV 2

    x+qV 2

    x

    )+c10I

    2

    (q

    2radicqδ0

    +x23

    18qδ0

    )log

    9x13

    2

    le c8I2(

    x

    4q2δ0+

    9x13

    2radic

    2+

    27

    8

    )+c10I

    2

    (y16

    232+

    x23

    18qδ0

    )(1

    3log x+ log

    9

    2

    )le 029315

    x

    q2δ0+ (008679 log x+ 039161)

    x23

    qδ0+ 000153

    radicx

where we are using (6.50) to simplify. Now

$$\min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}}, 1\right)\log Vq = \min\left(\frac{4/5}{\log^+\frac{y}{4q^2}}, 1\right)\log\frac{9x^{1/3}q}{2} \tag{6.57}$$

can be bounded trivially by $\log(9x^{1/3}q/2) \le (2/3)\log x + \log 3/4$. We can also bound (6.57) as we bounded (6.54) before, namely, by fixing $q$ and finding the maximum for $x$ variable. In this way, we obtain that (6.57) is maximal for $y = 4e^{4/5}q^2$; since, by definition, $x^{1/3}/6 = y$, (6.57) then equals

$$\log\frac{9(6\cdot 4e^{4/5}q^2)q}{2} = 3\log q + \log 108 + \frac{4}{5} \le 3\log q + 5.48214.$$

We conclude that (6.12) is at most

$$\min\left(1, \frac{4c'_0}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)} + 0.29315\frac{x}{q^2\delta_0} + (0.0434\log x + 0.1959)x^{2/3}. \tag{6.58}$$

    If |δ| le 12c2 we must consider (613) This is at most

    (c4I2 + c9I2)x

    2radicqδ0

    + (c10I2 logx23

    9q32radicδ0

    + 2c5I2 + c12I2) middot 3

    4x23

    le 21989xradicqδ0

    +361818x

    qδ0+ (177019 log x+ 292955)x23

    where we recall that ε0 = (4 log 2)(xUV ) = (2 log 2)radicqδ0 which can be bounded

    crudely byradic

    2 log 2 (Thus c10I2 leradic

    1 +radic

    8 log 2middot178783 lt 354037 and c12I2 le293333 + 11902

    radic2 log 2 le 410004)

    If |δ| gt 12c2 we must consider (614) instead For ε = 007 that is at most

    (c4I2 + (1 + ε)c13I2)x

    2radicqδ0

    (1 +

    2 log 2radicqδ0

    )+ (338845

    (1 +

    2 log 2radicqδ0

    )log δq3 + 208823)

    x

    |δ|q

    +

    (688133

    (1 +

    4 log 2radicqδ0

    )log |δ|q + 720828

    )x23 + 604141x13

    = 249157xradicqδ0

    (1 +

    2 log 2radicqδ0

    )+ (338845 log δq3 + 326771)

    x

    |δ|q

    +

    (229378 log x+ 190791

    log |δ|qradicqδ0

    + 130691

    )x

    23

    le 249157xradicqδ0

    + (359676 log δ0 + 273032 log q + 912218)x

    qδ0

    + (229378 log x+ 411228)x23

    where besides the crude bound ε0 leradic

    2 log 2 we use the inequalities

    log |δ|qradicqδ0

    le log 4qδ0radicqδ0

    le log 8radic2

    log qradicqδ0le 1radic

    2

    log qradicqle 1radic

    2

    log e2

    e=

    radic2

    e

    1

    |δ|le 4c2

    δ0

    log |δ||δ|

    le 2

    e log 2middot log δ0

    δ0

(Obviously, $1/|\delta| \le 4c_2/\delta_0$ is based on the assumption $|\delta| > 1/2c_2$ and on the inequality $16c_2 \ge 1$. The bound on $(\log|\delta|)/|\delta|$ is based on the fact that $(\log t)/t$ reaches its maximum at $t = e$, and $(\log\delta_0)/\delta_0 = (\log 2)/2$ for $|\delta| \le 8$.)


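We sum (6.58) and whichever one of our bounds for (6.13) and (6.14) is greater (namely, the latter). We obtain that, for any $\delta$,

$$\begin{aligned}
|S_{I,2}| \le{} & 2.49157\frac{x}{\sqrt{q\delta_0}} + \min\left(1, \frac{4c'_0}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)}\\
& + (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0} + (22.9812\log x + 411.424)x^{2/3},
\end{aligned} \tag{6.59}$$

where we bound one of the lower-order terms in (6.58) by $x/q^2\delta_0 \le x/q\delta_0$.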

    For type II we have to consider two cases (a) |δ| lt 8 and (b) |δ| ge 8 Considerfirst |δ| lt 8 Then δ0 = 2 Recall that θ = 278 We have q le V2θ and |δq| le Vθthanks to (652) We apply (637) and obtain that for |δ| lt 8

    |SII | lexradic

    2φ(q)middot

    radicradicradicradic1

    2log 4qδ0 + log 2q log

    (1 +

    12 log 4qδ0

    log V2q

    )middotradic

    030214 log 4qδ0 + 02562

    + 822088

    radicq

    φ(q)

    1 + 115

    radicradicradicradic log 2q

    log 9x13radicδ0

    2radicq

    (qδ0)14x23 + 184251x56

    le xradic2φ(q)

    middotradicCx2q log 2q +

    log 8q

    2middotradic

    030214 log 2q + 067506

    + 16406

    radicq

    φ(q)x34 + 184251x56

    (660)where we bound

    log 2q

    log 9x13radicδ0

    2radicq

    lelog x13

    3

    log 9x16radic

    2

    2radic

    16

    lt limxrarrinfin

    log x13

    3

    log 9x16radic

    2

    2radic

    16

    = 2

    and where we define

    Cxt = log

    (1 +

    log 4t

    2 log 9x13

    2004t

    )

    for 0 lt t lt 9x132 (We have 2004 here instead of 2 because we want a constantge 2(1 + ε1) in later occurences of Cxt for reasons that will soon become clear)

    For purposes of later comparison we remark that 16404 le 157863x45minus34 forx ge 216 middot 1020

    Consider now case (b) namely |δ| ge 8 Then δ0 = |δ|4 By (652) |δq| le Vθ


    Hence (640) gives us that

    |SII | le2xradic|δ|φ(q)

    middot

    radicradicradicradic1

    2log |δq|+ log

    |δq|(1 + ε1)

    4log

    (1 +

    log |δ|q2 log 18x13

    |δ|(1+ε1)q

    )middotradic

    030214 log |δ|q + 02562

    + 822088

    radicq

    φ(q)

    radicradicradicradic log 9x13

    2

    log 9x13

    |δq|

    middot (qδ0)14x23 + 184251x56

    le xradicδ0φ(q)

    radicCxδ0q log δ0(1 + ε1)q +

    log 4δ0q

    2

    radic030214 log δ0q + 067506

    + 179926

    radicq

    φ(q)x45 + 184251x56

    (661)since

    822088

    radicradicradicradic log 9x13

    2

    log 9x13

    |δq|

    middot (qδ0)14 le 822088

    radiclog 9x13

    2

    log 274

    middot (x133)14

    le 179926x45minus23

for $x \ge 2.16\cdot 10^{20}$. Clearly

$$\log\delta_0(1+\epsilon_1)q = \log\delta_0 q + \log(1+\epsilon_1) \le \log\delta_0 q + \epsilon_1.$$

By Lemma C.2.2, $q/\phi(q) \le z(y) = z(x^{1/3}/6)$ (since $x \ge 18^3$). It is easy to check that $x \mapsto \sqrt{z(x^{1/3}/6)}\,x^{4/5-5/6}$ is decreasing for $x \ge 2.16\cdot 10^{20}$ (in fact, for $x \ge 18^3$). Using (6.50), we conclude that $1.79926\sqrt{q/\phi(q)}\,x^{4/5} \le 0.89657x^{5/6}$ and, by the way, $16.406\sqrt{q/\phi(q)}\,x^{3/4} \le 0.78663x^{5/6}$. This allows us to simplify the last lines of (6.60) and (6.61). We obtain that, for $\delta$ arbitrary,

$$|S_{II}| \le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + \epsilon_1) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506} + 2.73908x^{5/6}. \tag{6.62}$$

It is time to sum up $S_{I,1}$, $S_{I,2}$ and $S_{II}$. The main terms come from the first line of (6.62) and the first term of (6.59). Lesser-order terms can be dealt with roughly: we bound $\min(1, c'_0/\delta^2)$ and $\min(1, 4c'_0/\delta^2)$ from above by $2/\delta_0$ (using the fact that $c'_0 = 0.798437 < 16$, which implies that $8/\delta > 4c'_0/\delta^2$ for $\delta > 8$; of course, for $\delta \le 8$, we have $\min(1, 4c'_0/\delta^2) \le 1 = 2/2 = 2/\delta_0$).

The terms inversely proportional to $q$, $\phi(q)$ or $q^2$ thus add up to at most

$$\begin{aligned}
&\frac{2x}{\delta_0 q}\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right) + \frac{2x}{\delta_0\phi(q)}\left(\frac{3}{2}\log q + 2.74107\right)\\
&\qquad + (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0}\\
&\le \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415),
\end{aligned}$$

where, for instance, we bound $(3/2)\log q + 2.74107$ by $(3/2)\log\delta_0 q + 2.74107 - (3/2)\log 2$.

As for the other terms, we use the assumption $x \ge 2.16\cdot 10^{20}$ to bound $x^{2/3}$ and $x^{2/3}\log x$ by a small constant times $x^{5/6}$. We bound $x^{2/3}/\sqrt{q\delta_0}$ by $x^{2/3}/\sqrt{2}$ (in (6.56)). We obtain

$$\frac{x^{2/3}}{\sqrt{2}}(0.67845\log x - 1.20818) + 0.37864x^{2/3} + (22.9812\log x + 411.424)x^{2/3} + 2.73908x^{5/6} \le 3.35531x^{5/6}.$$
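The constants in the last two displays can be rechecked at the threshold $x = 2.16\cdot 10^{20}$, where the lower-order terms are largest relative to $x^{5/6}$. This is a floating-point sketch only, and it rests on an assumption about where the decimal points fall in the constants (for example $22.9812$, $411.424$, $27.3032$, $91.515$), fixed by requiring the displays to agree with one another; the variable names are hypothetical.

```python
import math

x0 = 2.16e20
L = math.log(x0)

# Coefficient of x^(5/6) obtained by dividing every x^(2/3) term by x^(1/6).
coef = ((0.67845 * L - 1.20818) / math.sqrt(2)
        + 0.37864
        + 22.9812 * L + 411.424) / x0 ** (1.0 / 6.0) + 2.73908

# Constant next to (13/4) log(delta_0 q): 6.11676 + 2.74107 - (3/2) log 2.
c_phi = 6.11676 + 2.74107 - 1.5 * math.log(2)

# Constant next to 13.6516 log(delta_0 q): replace log q by log(delta_0 q) - log(delta_0)
# and use delta_0 >= 2, then halve because of the factor 2x/(delta_0 q).
c_q = (91.515 + (3.59676 - 27.3032) * math.log(2)) / 2

print(f"coef = {coef:.5f}, c_phi = {c_phi:.5f}, c_q = {c_q:.4f}")
```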

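The sums $S_{0,\infty}$ and $S_{0,w}$ in (3.11) are 0 (by (6.50) and the fact that $\eta_2(t) = 0$ for $t \le 1/4$). We conclude that, for $q \le y = x^{1/3}/6$, $x \ge 2.16\cdot 10^{20}$ and $\eta = \eta_2$ as in (3.4),

$$\begin{aligned}
|S_\eta(x, \alpha)| \le{} & |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
\le{} & \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506}\\
& + \frac{2.49157x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415)\\
& + 3.35531x^{5/6},
\end{aligned} \tag{6.63}$$

where

$$\delta_0 = \max(2, |\delta|/4), \qquad C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right). \tag{6.64}$$

Since $C_{x,t}$ is an increasing function as a function of $t$ (for $x$ fixed and $t \le 9x^{1/3}/2.004$) and $\delta_0 q \le 2y$, we see that $C_{x,t} \le C_{x,2y}$. It is clear that $x \mapsto C_{x,t}$ (fixed $t$) is a decreasing function of $x$. For $x = 2.16\cdot 10^{20}$, $C_{x,2y} = 1.39942\dots$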

6.3.2 Second choice of parameters

If, with the original choice of parameters, we obtained $q > y = x^{1/3}/6$, we now reset our parameters ($Q$, $U$ and $V$). Recall that, while the value of $q$ may now change (due to the change in $Q$), we will be able to assume that either $q > y$ or $|\delta q| > x/(x/8y) = 8y$.

We want $U/(x/UV) \ge 5\cdot 10^5$ (this is (6.17)). We also want $UV$ small. With this in mind, we let

$$V = \frac{x^{1/3}}{3}, \qquad U = 500\sqrt{6}\,x^{1/3}, \qquad Q = \frac{x}{U} = \frac{x^{2/3}}{500\sqrt{6}}. \tag{6.65}$$

Then (6.17) holds (as an equality). Since we are assuming (6.50), we have $V \ge 2\cdot 10^6$. It is easy to check that (6.50) also implies that $U \le \sqrt{x}/2$ and $Q \ge 2\sqrt{x}$, and so the inequalities in (6.51) all hold.

Write $2\alpha = a/q + \delta/x$ for the new approximation; we must have either $q > y$ or $|\delta| > 8y/q$, since otherwise $a/q$ would already be a valid approximation under the first choice of parameters. Thus, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta|q > 8y$. Since now $V = 2y$, we have $q > V/2\theta$ in case (a) and $|\delta q| > V/\theta$ in case (b) for any $\theta \ge 1$. We set $\theta = 4$.

(Thanks to this choice of $\theta$, we have $|\delta q| \le x/Q \le x/\theta U$, as we commented at the end of §6.2.3; this will help us avoid some case-work later.)
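As a quick numerical illustration of the second choice of parameters, the sketch below evaluates $U$, $V$ and $Q$ from (6.65) and checks the stated relations in floating point. The function name `second_choice` is hypothetical, and the sample value $x = 10^{21}$ is an arbitrary point above the threshold, chosen so that the inequalities are strict; at $x = 2.16\cdot 10^{20}$ itself several of them hold with equality, since $x^{1/6} = 1000\sqrt{6}$ there.

```python
import math

def second_choice(x: float):
    """Return (U, V, Q) as in (6.65)."""
    cbrt = x ** (1.0 / 3.0)
    V = cbrt / 3.0
    U = 500.0 * math.sqrt(6.0) * cbrt
    Q = x / U
    return U, V, Q

x0 = 2.16e20
U0, V0, Q0 = second_choice(x0)
print("U/(x/UV) at x0 =", U0 * U0 * V0 / x0)     # (6.17) as an equality: 5e5
print("V at x0        =", V0)                     # 2e6

x = 1e21
U, V, Q = second_choice(x)
theta = 4.0
print("U <= sqrt(x)/2     :", U <= math.sqrt(x) / 2)
print("Q >= 2 sqrt(x)     :", Q >= 2 * math.sqrt(x))
print("x/Q <= x/(theta U) :", x / Q <= x / (theta * U))
```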

    By (64)

    |SI1| lex

    qmin

    (1cprime0δ2

    )(log x23 minus log 500

    radic6 + c3I + c4I

    q

    φ(q)

    )+

    (c7I log

    Q

    c2+ c8I log x log c11I

    Q2

    x

    )Q+ c10I

    U2

    4xlog

    e12x23

    500radic

    6+c10I

    e

    +

    (c5I log

    1000radic

    6x13

    c2+ c6I log 500

    radic6x43

    )middot 500radic

    6x13 + c9Iradicx log

    2radicex

    c2

    le x

    qmin

    (1cprime0δ2

    )(2

    3log xminus 499944 + 100303

    q

    φ(q)

    )+

    289

    1000x23(log x)2

    where we are bounding

    c7I logQ

    c2+ c8I log x log c11I

    Q2

    x

    =c8I(log x)2 minus(c8I(log 1500000minus log c11I)minus

    2

    3c7I

    )log x+ c7I log

    1

    500radic

    6c2

    lec8I(log x)2 minus 38 log x

    We are also using the assumption (650) repeatedly in order to show that the sum ofall lower-order terms is less than (38c8I log x)(500

    radic6) Note that c8I(log x)2Q le

    000289x23(log x)2We have qφ(q) le z(Q) (where z is as in (C19)) and since Q gt

    radic6 middot 12 middot 109

    for x ge 216 middot 1020

    100303z(Q) le 100303

    (eγ log logQ+

    250637

    log logradic

    6 middot 12 middot 109

    )le 02359 logQ+ 079 lt 01573 log x


    (It is possible to give a much better estimation but it is not worthwhile since this willbe a very minor term) We have either q gt y or q|δ| gt 8y if q|δ| gt 8y but q le y then|δ| ge 8 and so cprime0δ

    2q lt 18|δ|q lt 164y lt 1y Hence

    |SI1| lex

    y

    ((2

    3+ 01573

    )log x

    )+ 000289x23(log x)2

    le 24719x23 log x+ 000289x23(log x)2

    We bound |SI2| using Lemma 424 First we bound (450) this is at most

    x

    2qmin

    (1

    4cprime0δ2

    )log

    x13q

    3

    + c0

    (1

    4minus 1

    π2

    ) (UV )2 log x13

    3

    2x+

    3c42

    500radic

    6

    9+

    (500radic

    6x13 + 1)2x13 log x

    23

    6x

    where c4 = 103884 We bound the second line of this using (650) As for the firstline we have either q ge y (and so the first line is at most (x2y)(log x13y3)) orq lt y and 4cprime0δ

    2q lt 116y lt 1y (and so the same bound applies) Hence (450) isat most

    3x23

    (2

    3log xminus log 18

    )+ 002017x23 log x = 202017x23 log xminus3(log 18)x23

    Now we bound (451) which comes up when |δ| le 12c2 where c2 = 6π5radicc0

    c0 = 31521 (and so c2 = 06714769 ) Since 12c2 lt 8 it follows that q gt y (thealternative q le y q|δ| gt 8y is impossible since it implies |δ| gt 8) Then (451) is atmost

    2radicc0c1π

    (UV log

    UVradice

    +Q

    (radic3 log

    c2x

    Q+

    logUV

    2log

    UV

    Q2

    ))+

    3c12

    x

    ylogUV log

    UV

    c2xy+

    16 log 2

    πQ log

    c0e3Q2

    4π middot 8 log 2 middot xlog

    Q

    2

    +3c1

    2radic

    2c2

    radicx log

    c2x

    2+

    25c04π2

    (3c2)12radicx log x

    (666)

where $c_1 = 1.000189 > 1 + (8\log 2)/(2x/UV)$.

The first line of (6.66) is a linear combination of terms of the form $x^{2/3}\log Cx$, $C > 1$; using (6.50), we obtain that it is at most $1144.693x^{2/3}\log x$. (The main contribution comes from the first term.) Similarly, we can bound the first term in the second line by $33.0536x^{2/3}\log x$. Since $\log(c_0e^3Q^2/(4\pi\cdot 8\log 2\cdot x))\log Q/2$ is at most $\log x^{1/3}\log x^{2/3}$, the second term in the second line is at most $0.0006406x^{2/3}(\log x)^2$. The third line of (6.66) can be bounded easily by $0.0122x^{2/3}\log x$.

Hence, (6.66) is at most

$$1177.76x^{2/3}\log x + 0.0006406x^{2/3}(\log x)^2.$$


    If |δ| gt 12c2 then we know that |δq| gt min(y2c2 8y) = y2c2 Thus (452)(with ε = 001) is at most

    2radicc0c1π

    UV logUVradice

    +202radicc0c1

    π

    (x

    y2c2+ 1

    )((radic

    302minus 1) log

    xy2c2

    + 1radic

    2+

    1

    2logUV log

    e2UVx

    y2c2

    )

    +

    (3c12

    (1

    2+

    303

    016log x

    )+

    20c03π2

    (2c2)32

    )radicx log x

    Again by (650) and in much the same way as before this simplifies to

    le (114466 + 15107 + 68523)x23 log x+ 29136x12(log x)2

    le 122885x23(log x)

Hence, in total and for any $|\delta|$,

$$\begin{aligned}
|S_{I,2}| &\le 2.02017x^{2/3}\log x + 1228.85x^{2/3}(\log x) + 0.0006406x^{2/3}(\log x)^2\\
&\le 1230.9x^{2/3}(\log x) + 0.0006406x^{2/3}(\log x)^2.
\end{aligned}$$

    Now we must estimate SII As we said before either (a) q gt y or both (b1)|δ| gt 8 and (b2) |δ|q gt 8y Recall that θ = 4 In case (a) we have q gt x136 =V2 gt V2θ thus we can use (638) and obtain that if q le x8U |SII | is at most

    xradicz(q)radic2q

    radic(log

    x

    U middot 8q+ log 2q log

    log x(2Uq)

    log 4

    )(κ6 log

    x

    U middot 8q+ 2κ7

    )

    +radic

    2κ2

    radicz( x

    8U

    )(1 + 115

    radiclog x4U

    log 4

    )xradicU

    + (κ2

    radiclog xU + κ9)

    xradicV

    +κ2

    6

    ((log 8y)32 minus (log 2y)32

    ) xradicy

    + κ2

    (radic8 log xU +

    2

    3((log xU)32 minus (log V )32)

    )xradic8U

    (667)where z is as in (C19) (We are already simplifying the third line the bound givenis justified by a derivative test) It is easy to check that q rarr (log 2q)(log log q)q isdecreasing for q ge y (indeed for q ge 9) and so the first line of (667) is maximal forq = y


    We can thus bound (667) by x56 timesradic3z(et36)

    (t

    3minus log 8c+

    (t

    3minus log 3

    )log

    t3 minus log 2c

    log 4

    )(κ6

    3tminus 4214

    )+

    radic2κ2radic6c

    radicz(e2t3

    48c

    )1 + 115

    radic23 tminus log 24c

    log 4

    +

    (κ2

    radic2t

    3minus log 6c+ κ9

    )radic

    3

    +κ2radic

    6

    ((t

    3+ log

    8

    6

    ) 32

    minus(t

    3+ log

    2

    6

    ) 32

    )

    +κ2radic48c

    (radic8

    (2t

    3minus log 6c

    )+

    2

    3

    ((2t

    3minus log 6c

    ) 32

    minus(t

    3minus log 3

    ) 32

    ))(668)

    where t = log x and c = 500radic

    6 Asymptotically the largest term in (667) comesfrom the last line (of order t32) even if the first line is larger in practice (while beingof order at most t log t) Let us bound (668) by a multiple of t32

    First of all notice that

    d

    dt

    z(et3

    6

    )log t

    =

    (eγ log

    (t3 minus log 6

    )+ 250637

    log( t3minuslog 6)

    )primelog t

    minusz(et3

    6

    )t(log t)2

    =eγ minus 250637

    log2( t3minuslog 6)

    (tminus 3 log 6) log tminuseγ + 250637

    log2( t3minuslog 6)

    t log tmiddot

    log(t3 minus log 6

    )log t

    (669)

    which for t ge 100 is

    gteγ log 3minus 2middot250637 log t

    log2( t3minuslog 6)

    t(log t)2ge

    195671minus 892482log t

    t(log t)2gt 0

    Similarly for t ge 2000

    d

    dt

    z(e2t3

    48c

    )log t

    gteγ log 3

    2 minus250637 log t

    log2( 2t3 minuslog 48c)

    minus 250637

    log( 2t3 minuslog 48c)

    t(log t)2

    ge072216minus 545234

    log t

    t(log t)2gt 0

    Thus

    z(et3

    6

    )le (log t) middot lim

    srarrinfin

    z(es3

    6

    )log s

    = eγ log t for t ge 100

    z(e2t3

    48c

    )le (log t) middot lim

    srarrinfin

    z(e2s3

    48c

    )log s

    = eγ log t for t ge 2000

    (670)


    Also note that since (x32)prime = (32)radicx((

    t

    3+ log

    8

    6

    ) 32

    minus(t

    3+ log

    2

    6

    ) 32

    )le 3

    2

    radict

    3+ log

    8

    6middot log 4 le 120083

    radict

    for t ge 2000 We also have(2t

    3minus log 6c

    ) 32

    minus(t

    3minus log 3

    ) 32

    lt

    (2t

    3minus log 9

    ) 32

    minus(t

    3minus log 3

    ) 32

    = (232 minus 1)

    (t

    3minus log 3

    ) 32

    lt (232 minus 1)t32

    332le 035189t32

    Of course

    t

    3minus log 8c+

    (t

    3minus log 3

    )log

    t3 minus log 2c

    log 4lt

    (t

    3+t

    3log

    t

    3

    )ltt

    3log t

    We conclude that for t ge 2000 (668) is at mostradic3 middot eγ log t middot t

    3log t middot κ6

    3t+

    radic2κ2radic6c

    radiceγ log t

    (1 + 079749

    radict)

    +

    (κ2

    radic2

    3t12 + κ9

    )radic

    3 +κ2radic

    6middot 12009

    radict+

    κ2radic48c

    (radic16t

    3+

    2

    3middot 035189t32

    )le (010181 + 000012 + 000145 + 0000048 + 000462)t32 le 010848t32

    On the remaining interval log(216 middot 1020) le t le log 2000 we use interval arith-metic (as in sect26 with 30 iterations) to bound the ratio of (668) to t32 We obtain thatit is at most

    0275964t32

    Hence for all x ge 216 middot 1020

    |SII | le 0275964x56(log x)32 (671)

    in the case y lt q le x8U If x8U lt q le Q we use (639) In this range x2

    radic2q +

    radicqx adopts its max-

    imum at q = Q (because x2radic

    2q for q = x8U is smaller thanradicqx for q = Q by

    (665) and (650)) Hence (639) is at most x56 times(κ2

    radic2

    (2

    3tminus log cprime

    )+ κ9

    )radic

    3 + κ2

    radic2

    3tminus log cprime middot 1radic

    cprime

    +2κ2

    3

    ((2

    3tminus log cprime

    ) 32

    minus(t

    3minus log 3

    ) 32

    )( radiccprime

    2radic

    2eminust6 +

    1radiccprime

    )


    where t = log x (as before) and cprime = 500radic

    6 This is at most

    (2κ2 +radic

    3κ9)radict+

    κ2radiccprime

    radic2

    3

    radict+

    2κ2

    3

    232 minus 1

    332t32

    ( radiccprime

    2radic

    2eminust6 +

    1radiccprime

    )le 010327

    for t ge log(216 middot 1020

    ) and so

    |SII | le 010327x56(log x)32

    for x8U lt q le Q using the assumption x ge 216 middot 1020Finally let us treat case (b) that is |δ| gt 8 and |δ|q gt 8y we can also assume

    q le y as otherwise we are in case (a) which has already been treated Since |δx| le1qQ we know that

    |δq| le x

    Q= U = 500

    radic6x13 le x23

    2000radic

    6=

    x

    4U=

    x

    θU

    again under assumption (650) We apply (641) and obtain that |SII | is at most

    2xradicz(y)radic8y

    radic(log

    x

    U middot 4 middot 8y+ log 3y log

    log x3Uy

    log 323

    )(κ6 log

    x

    U middot 4 middot 8y+ 2κ7

    )+

    2κ2

    3

    (xradic16y

    ((log 32y)32 minus (log 2y)

    32 ) +

    x4radicQminus y

    ((log 4U)32 minus (log 2y)

    32 )

    )+

    (κ2radic

    2(1minus yQ)

    (radiclog V +

    radic1 log V

    )+ κ9

    )xradicV

    + κ2

    radicz(y) middot

    radiclog 4U middot xradic

    U

    (672)where we are using the facts that (log 3t8)t is increasing for t ge 8y gt 8e3 and that

    d

    dt

    (log t)32 minus (log V )32

    radict

    =3(log t)12 minus ((log t)32 minus (log V )32)

    2t32

    = minuslog t

    e3 middotradic

    log tminus (log V )32

    2t32lt 0

    for t ge θ middot 8y = 16V thanks to(log

    16V

    e3

    )2

    log 16V gt (log V )3 +

    (log 16minus 2 log

    e3

    16

    )(log V )2

    +

    ((log

    16

    e3

    )2

    minus 2 loge3

    16log 16

    )log V gt (log V )3


    (valid for log V ge 1) Much as before we can rewrite (672) as x56 times

    2radicz(et36)radic

    86

    radict

    3minus log 32c+

    (t

    3minus log 2

    )log

    t3 minus log 3c

    log 323

    middot

    radicκ6

    (t

    3minus log 32c

    )+ 2κ7 +

    2κ2

    3

    radic3

    8

    ((t

    3+ log

    32

    6

    ) 32

    minus(t

    3minus log 3

    ) 32

    )

    +2κ2

    3

    14radicet3

    6c minus16

    ((t

    3+ log 24c

    )32

    minus(t

    3minus log 3

    )32)

    +κ2

    radic3radic

    2(1minus c

    et3

    )(radic

    t3minus log 3 +1radic

    t3minus log 3

    )+ κ9

    radic3

    + κ2

    radicz(et36)

    radict3 + log 24c

    6c

    (673)where t = log x and c = 500

    radic6 For t ge 100 we use (670) to bound z(et36)

    and we obtain that (673) is at most

    2radiceγradic

    86

    radic1

    3middot κ6

    3middot (log t)t+

    2κ2

    3

    radic3

    8middot 1

    2

    (t

    3+ log

    32

    6

    )12

    middot log 16

    +2κ2

    3

    14radice1003

    6c minus 16

    middot 1

    2

    (t

    3+ log 24c

    )12

    middot log 72c

    +κ2

    radic3radic

    2(1minus c

    e1003

    )(radic

    t3 +1radict3

    )+ κ9

    radic3 + κ2

    radiceγ log t

    radict3 + log 24c

    6c

    (674)where we have bounded expressions of the form a32minusb32 (a gt b) by (a122)middot(aminusb)The ratio of (674) to t32 is clearly a decreasing function of t For t = 200 this ratiois 023747 hence (674) (and thus (673)) is at most 023748t32 for t ge 200

    On the range log(216 middot 1020) le t le 200 the bisection method (with 25 iterations)gives that the ratio of (673) to t32 is at most 023511

    We conclude that when |δ| gt 8 and |δ|q gt 8y

    |SII | le 023511x56(log x)32

Thus (6.71) gives the worst case.

We now take totals, and obtain

$$\begin{aligned}
S_\eta(x, \alpha) &\le |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
&\le (2.4719 + 1230.9)x^{2/3}\log x + (0.00289 + 0.0006406)x^{2/3}(\log x)^2 + 0.275964x^{5/6}(\log x)^{3/2}\\
&\le 0.27598x^{5/6}(\log x)^{3/2} + 1233.38x^{2/3}\log x,
\end{aligned} \tag{6.75}$$

where we use (6.50) yet again.
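The last inequality in (6.75) absorbs the $(\log x)^2$ term into a slightly larger coefficient of $x^{5/6}(\log x)^{3/2}$. A short floating-point check at a few sample values of $x \ge 2.16\cdot 10^{20}$ is sketched below. It assumes the decimal placement $2.4719$, $1230.9$ and $1233.38$ for the coefficients of $x^{2/3}\log x$, and the function names `lhs` and `rhs` are hypothetical; it is a sanity check, not a proof for all $x$.

```python
import math

def lhs(x: float) -> float:
    """Middle line of (6.75)."""
    t = math.log(x)
    return ((2.4719 + 1230.9) * x ** (2 / 3) * t
            + (0.00289 + 0.0006406) * x ** (2 / 3) * t * t
            + 0.275964 * x ** (5 / 6) * t ** 1.5)

def rhs(x: float) -> float:
    """Last line of (6.75)."""
    t = math.log(x)
    return 0.27598 * x ** (5 / 6) * t ** 1.5 + 1233.38 * x ** (2 / 3) * t

samples = [2.16e20, 1e25, 1e30, 1e60]
for x in samples:
    print(f"x = {x:.3g}: lhs/rhs = {lhs(x) / rhs(x):.8f}")
```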

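6.4 Conclusion

Proof of Theorem 3.1.1. We have shown that $|S_\eta(\alpha, x)|$ is at most (6.63) for $q \le x^{1/3}/6$ and at most (6.75) for $q > x^{1/3}/6$. It remains to simplify (6.63) slightly. By the geometric mean/arithmetic mean inequality,

$$\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506} \tag{6.76}$$

is at most

$$\frac{1}{2\sqrt{\rho}}\left(C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}\right) + \frac{\sqrt{\rho}}{2}(0.30214\log\delta_0 q + 0.67506)$$

for any $\rho > 0$. We recall that

$$C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right).$$

Let

$$\rho = \frac{C_{x_1,2q_0}(\log 2q_0 + 0.002) + \frac{\log 8q_0}{2}}{0.30214\log 2q_0 + 0.67506} = 3.397962\dots,$$

where $x_1 = 10^{25}$, $q_0 = 2\cdot 10^5$. (In other words, we are optimizing matters for $x = x_1$, $\delta_0 q = 2q_0$; the losses in nearby ranges will be very slight.) We obtain that (6.76) is at most

$$\begin{aligned}
&\frac{C_{x,\delta_0 q}}{2\sqrt{\rho}}(\log\delta_0 q + 0.002) + \left(\frac{1}{4\sqrt{\rho}} + \frac{\sqrt{\rho}\cdot 0.30214}{2}\right)\log\delta_0 q + \frac{1}{2}\left(\frac{\log 2}{\sqrt{\rho}} + \frac{\sqrt{\rho}}{2}\cdot 0.67506\right)\\
&\le 0.27125\,C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + 0.4141\log\delta_0 q + 0.49911.
\end{aligned} \tag{6.77}$$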
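The value of $\rho$ and the first two coefficients in (6.77) can be reproduced directly from the definition of $C_{x,t}$. The following is a floating-point sketch under the assumption that the constants read $0.30214$, $0.67506$, $0.002$ and $2.004$; the function name `C` and the variable names are hypothetical.

```python
import math

def C(x: float, t: float) -> float:
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^(1/3) / (2.004 t))))."""
    return math.log(1 + math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t))))

x1, q0 = 1e25, 2e5
t0 = 2 * q0                       # delta_0 * q = 2 * q0

rho = ((C(x1, t0) * (math.log(t0) + 0.002) + math.log(4 * t0) / 2)
       / (0.30214 * math.log(t0) + 0.67506))

coef_C = 1 / (2 * math.sqrt(rho))                                  # multiplies C_{x,t}(...)
coef_log = 1 / (4 * math.sqrt(rho)) + math.sqrt(rho) * 0.30214 / 2  # multiplies log(delta_0 q)

print(f"rho = {rho:.6f}, coef_C = {coef_C:.5f}, coef_log = {coef_log:.5f}")
```

Choosing $\rho$ as the ratio of the two factors under the square roots is what makes the arithmetic-geometric mean inequality an equality at $x = x_1$, $\delta_0 q = 2q_0$.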

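    Now, for $x \ge x_0 = 2.16\cdot 10^{20}$,
    \[ \frac{C_{x,t}}{\log t} \le \frac{C_{x_0,t}}{\log t} = \frac{1}{\log t} \log\left(1 + \frac{\log 4t}{2 \log \frac{54\cdot 10^6}{2.004\, t}}\right) \le 0.08659 \]
    for $8 \le t \le 10^6$ (by the bisection method, with 20 iterations), and
    \[ \frac{C_{x,t}}{\log t} \le \frac{C_{(6t)^3,t}}{\log t} \le \frac{1}{\log t} \log\left(1 + \frac{\log 4t}{2 \log \frac{9\cdot 6}{2.004}}\right) \le 0.08659 \]
    if $10^6 < t \le x^{1/3}/6$. Hence
    \[ 0.27125 \cdot C_{x,\delta_0 q} \cdot 0.002 \le 0.000047 \log \delta_0 q. \]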
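The two bounds by 0.08659 are obtained in the text by a rigorous bisection method. As an informal sanity check only, one can sample both expressions on a fine logarithmic grid in floating point; the sketch below does so. The function names, the grid size and the upper sampling limit $10^{20}$ are arbitrary choices of ours, and a grid search proves nothing by itself.

```python
import math


def ratio_low(t):
    """C_{x_0,t} / log t with 9 x_0^{1/3} = 54e6, for 8 <= t <= 1e6."""
    inner = math.log(54e6 / (2.004 * t))
    return math.log(1 + math.log(4 * t) / (2 * inner)) / math.log(t)


def ratio_high(t):
    """Upper bound for C_{x,t} / log t used when 1e6 < t <= x^{1/3}/6."""
    inner = math.log(9 * 6 / 2.004)
    return math.log(1 + math.log(4 * t) / (2 * inner)) / math.log(t)


N = 20000  # number of grid steps (arbitrary)
lo, hi = math.log(8), math.log(1e6)
low_max = max(ratio_low(math.exp(lo + i * (hi - lo) / N)) for i in range(N + 1))
high_max = max(ratio_high(10 ** (6 + 14 * i / N)) for i in range(N + 1))

print(low_max, high_max)
```

On this grid both maxima are attained near $t = 10^6$, which is consistent with the two ranges being glued together at that point.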


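    We conclude that, for $q \le x^{1/3}/6$,
    \[ |S_\eta(\alpha,x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.49911}{\sqrt{\phi(q)\delta_0}} \cdot x + \frac{2.492\, x}{\sqrt{q\delta_0}} + \frac{2x}{\delta_0 \phi(q)} \left(\frac{13}{4} \log \delta_0 q + 7.82\right) + \frac{2x}{\delta_0 q} \left(13.66 \log \delta_0 q + 37.55\right) + 3.36\, x^{5/6}, \]
    where
    \[ R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004\, t}}\right) + 0.41415. \]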

    Part II

    Major arcs


    Chapter 7

    Major arcs: overview and results

    Our task, as in Part I, will be to estimate
    \[ S_\eta(\alpha,x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{7.1} \]
    where $\eta: \mathbb{R}^+ \to \mathbb{C}$ is a smooth function, $\Lambda$ is the von Mangoldt function and $e(t) = e^{2\pi i t}$. Here we will treat the case of $\alpha$ lying on the major arcs.

    We will see how we can obtain good estimates by using smooth functions $\eta$ based on the Gaussian $e^{-t^2/2}$. This will involve proving new, fully explicit bounds for the Mellin transform of the twisted Gaussian, or, what is the same, bounds on parabolic cylindrical functions in certain ranges. It will also require explicit formulae that are general and strong enough, even for moderate values of $x$.

    Let $\alpha = a/q + \delta/x$. For us, saying that $\alpha$ lies on a major arc will be the same as saying that $q$ and $\delta$ are bounded; more precisely, $q$ will be bounded by a constant $r$ and $|\delta|$ will be bounded by a constant times $r/q$. As is customary on the major arcs, we will express our exponential sum (3.1) as a linear combination of twisted sums
    \[ S_{\eta,\chi}(\delta/x, x) = \sum_{n=1}^\infty \Lambda(n) \chi(n) e(\delta n/x) \eta(n/x), \tag{7.2} \]
    for $\chi: \mathbb{Z} \to \mathbb{C}$ a Dirichlet character mod $q$, i.e., a multiplicative character on $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (The advantage here is that the phase term is now $e(\delta n/x)$ rather than $e(\alpha n)$, and $e(\delta n/x)$ varies very slowly as $n$ grows.) Our task, then, is to estimate $S_{\eta,\chi}(\delta/x, x)$ for $\delta$ small.

    Estimates on $S_{\eta,\chi}(\delta/x, x)$ rely on the properties of Dirichlet $L$-functions $L(s,\chi) = \sum_n \chi(n) n^{-s}$. What is crucial is the location of the zeroes of $L(s,\chi)$ in the critical strip $0 \le \Re(s) \le 1$ (a region in which $L(s,\chi)$ can be defined by analytic continuation). In contrast to most previous work, we will not use zero-free regions, which are too narrow for our purposes. Rather, we use a verification of the Generalized Riemann Hypothesis up to bounded height for all conductors $q \le 300000$ (due to D. Platt [Plab]).


    A key feature of the present work is that it allows one to mimic a wide variety of smoothing functions by means of estimates on the Mellin transform of a single smoothing function – here, the Gaussian $e^{-t^2/2}$.

    7.1 Results

    Write $\eta_\heartsuit(t) = e^{-t^2/2}$. Let us first give a bound for exponential sums on the primes using $\eta_\heartsuit$ as the smooth weight. Without loss of generality, we may assume that our character $\chi$ mod $q$ is primitive, i.e., that it is not really a character to a smaller modulus $q' | q$.

    Theorem 7.1.1. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive Dirichlet character mod $q$, $1 \le q \le r$, where $r = 300000$.

    Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
    \[ \sum_{n=1}^\infty \Lambda(n) \chi(n) e\left(\frac{\delta}{x} n\right) e^{-\frac{(n/x)^2}{2}} = I_{q=1} \cdot \widehat{\eta_\heartsuit}(-\delta) \cdot x + E \cdot x, \]
    where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
    \[ |E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left(\frac{650400}{\sqrt{q}} + 112\right). \]

    We normalize the Fourier transform $\widehat{f}$ as follows: $\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\, dx$. Of course, $\widehat{\eta_\heartsuit}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$.

    As it turns out, smooth weights based on the Gaussian are often better in applications than the Gaussian $\eta_\heartsuit$ itself. Let us give a bound based on $\eta(t) = t^2 \eta_\heartsuit(t)$.

    Theorem 7.1.2. Let $\eta(t) = t^2 e^{-t^2/2}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

    Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
    \[ \sum_{n=1}^\infty \Lambda(n) \chi(n) e\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1} \cdot \widehat{\eta}(-\delta) \cdot x + E \cdot x, \]
    where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
    \[ |E| \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}} \left(\frac{281200}{\sqrt{q}} + 56\right). \]

    The advantage of $\eta(t) = t^2 \eta_\heartsuit(t)$ over $\eta_\heartsuit$ is that it vanishes at the origin (to second order); as we shall see, this makes it easier to estimate exponential sums with the smoothing $\eta *_M g$, where $*_M$ is a Mellin convolution and $g$ is nearly arbitrary. Here is a good example that is used, crucially, in Part III.


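    Corollary 7.1.3. Let $\eta(t) = t^2 e^{-t^2/2} *_M \eta_2(t)$, where $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2 \cdot I_{[1/2,1]}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

    Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
    \[ \sum_{n=1}^\infty \Lambda(n) \chi(n) e\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1} \cdot \widehat{\eta}(-\delta) \cdot x + E \cdot x, \]
    where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
    \[ |E| \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}} \left(\frac{381500}{\sqrt{q}} + 76\right). \]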

    Let us now look at a different kind of modification of the Gaussian smoothing. Say we would like a weight of a specific shape; for example, what we will need to do in Part III, we would like an approximation to the function
    \[ \eta: t \mapsto \begin{cases} t^3 (2-t)^3 e^{-(t-1)^2/2} & \text{for $t \in [0,2]$,} \\ 0 & \text{otherwise.} \end{cases} \tag{7.3} \]
    At the same time, what we have is an estimate for the Mellin transform of the Gaussian $e^{-t^2/2}$, centered at $t = 0$.

    The route taken here is to work with an approximation $\eta_+$ to $\eta$. We let
    \[ \eta_+(t) = h_H(t) \cdot t e^{-t^2/2}, \tag{7.4} \]
    where $h_H$ is a band-limited approximation to
    \[ h(t) = \begin{cases} t^2 (2-t)^3 e^{t - 1/2} & \text{if $t \in [0,2]$,} \\ 0 & \text{otherwise.} \end{cases} \tag{7.5} \]

    By band-limited we mean that the restriction of the Mellin transform of $h_H$ to the imaginary axis is of compact support. (We could, alternatively, let $h_H$ be a function whose Fourier transform is of compact support; this would be technically easier in some ways, but it would also lead to using GRH verifications less efficiently.)

    To be precise: we define
    \[ F_H(y) = \frac{\sin(H \log y)}{\pi \log y}, \qquad h_H(t) = (h *_M F_H)(t) = \int_0^\infty h(t y^{-1}) F_H(y) \frac{dy}{y}, \tag{7.6} \]
    where $H$ is a positive constant. It is easy to check that $MF_H(i\tau) = 1$ for $-H < \tau < H$ and $MF_H(i\tau) = 0$ for $\tau > H$ or $\tau < -H$ (unsurprisingly, since $F_H$ is a Dirichlet kernel under a change of variables). Since, in general, the Mellin transform of a multiplicative convolution $f *_M g$ equals $Mf \cdot Mg$, we see that the Mellin transform of $h_H$, on the imaginary axis, equals the truncation of the Mellin transform of $h$ to $[-iH, iH]$. Thus, $h_H$ is a band-limited approximation to $h$, as we desired.

    The distinction between the odd and the even case in the statement that follows simply reflects the two different points up to which computations were carried out in [Plab]; these computations were in turn to some extent tailored to the needs of the present work (as was the shape of $\eta_+$ itself).

    Theorem 7.1.4. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Let $\chi$ be a primitive character mod $q$, where $1 \le q \le 150000$ if $q$ is odd, and $1 \le q \le 300000$ if $q$ is even.

    Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 600000 \cdot \gcd(q,2)/q$,
    \[ \sum_{n=1}^\infty \Lambda(n) \chi(n) e\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1} \cdot \widehat{\eta}(-\delta) \cdot x + E \cdot x, \]
    where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
    \[ |E| \le 1.3482 \cdot 10^{-14} + \frac{1.617 \cdot 10^{-10}}{q} + \frac{1}{\sqrt{x}} \left(\frac{499900}{\sqrt{q}} + 52\right). \]
    If $q = 1$, we have the sharper bound
    \[ |E| \le 4.772 \cdot 10^{-11} + \frac{251400}{\sqrt{x}}. \]

    This is a paradigmatic example, in that, following the proof given in §9.4, we can bound exponential sums with weights of the form $h_H(t) e^{-t^2/2}$, where $h_H$ is a band-limited approximation to just about any continuous function of our choosing.

    Lastly, we will need an explicit estimate of the $\ell_2$ norm corresponding to the sum in Thm. 7.1.4, for the trivial character.

    Proposition 7.1.5. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Then
    \[
    \begin{aligned}
    \sum_{n=1}^\infty \Lambda(n) (\log n)\, \eta^2(n/x) &= x \cdot \int_0^\infty \eta_+^2(t) \log xt \, dt + E_1 \cdot x \log x \\
    &= 0.640206\, x \log x - 0.021095\, x + E_2 \cdot x \log x,
    \end{aligned}
    \]
    where
    \[ |E_1| \le 5.123 \cdot 10^{-15} + \frac{366.91}{\sqrt{x}}, \qquad |E_2| \le 2 \cdot 10^{-6} + \frac{366.91}{\sqrt{x}}. \]

    7.2 Main ideas

    An explicit formula gives an expression
    \[ S_{\eta,\chi}(\delta/x, x) = I_{q=1} \widehat{\eta}(-\delta) x - \sum_\rho F_\delta(\rho) x^\rho + \text{small error}, \tag{7.7} \]
    where $I_{q=1} = 1$ if $q = 1$ and $I_{q=1} = 0$ otherwise. Here $\rho$ runs over the complex numbers $\rho$ with $L(\rho,\chi) = 0$ and $0 < \Re(\rho) < 1$ ("non-trivial zeros"). The function $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$ (see §2.4).

    The questions are then: where are the non-trivial zeros $\rho$ of $L(s,\chi)$? How fast does $F_\delta(\rho)$ decay as $\Im(\rho) \to \pm\infty$?

    Write $\sigma = \Re(s)$, $\tau = \Im(s)$. The belief is, of course, that $\sigma = 1/2$ for every non-trivial zero (Generalized Riemann Hypothesis), but this is far from proven. Most work to date has used zero-free regions of the form $\sigma \le 1 - 1/(C \log q|\tau|)$, $C$ a constant. This is a classical zero-free region, going back, qualitatively, to de la Vallée-Poussin (1899). The best values of $C$ known are due to McCurley [McC84a] and Kadiri [Kad05].

    These regions seem too narrow to yield a proof of the three-primes theorem. What we will use instead is a finite verification of GRH "up to $T_q$", i.e., a computation showing that, for every Dirichlet character of conductor $q \le r_0$ ($r_0$ a constant, as above), every non-trivial zero $\rho = \sigma + i\tau$ with $|\tau| \le T_q$ satisfies $\Re(\rho) = 1/2$. Such verifications go back to Riemann; modern computer-based methods are descended in part from a paper by Turing [Tur53]. (See the historical article [Boo06b].) In his thesis [Pla11], D. Platt gave a rigorous verification for $r_0 = 10^5$, $T_q = 10^8/q$. In coordination with the present work, he has extended this to

    • all odd $q \le 3 \cdot 10^5$, with $T_q = 10^8/q$,

    • all even $q \le 4 \cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5 \cdot 10^7/q)$.

    This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).
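To make the two ranges concrete, here is a small Python helper that returns the height $T_q$ just described for a given conductor $q$. The function name `verified_height` and the convention of returning `None` outside the stated ranges are our own choices for illustration; they are not taken from [Plab].

```python
def verified_height(q):
    """Height T_q up to which GRH is stated to be verified for conductor q.

    Returns None when q lies outside the ranges quoted in the text.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    if q % 2 == 1:
        # odd conductors: q <= 3 * 10^5, T_q = 10^8 / q
        return 1e8 / q if q <= 3 * 10**5 else None
    # even conductors: q <= 4 * 10^5, T_q = max(10^8 / q, 200 + 7.5 * 10^7 / q)
    return max(1e8 / q, 200 + 7.5e7 / q) if q <= 4 * 10**5 else None


print(verified_height(3), verified_height(400000))
```

For small even $q$ the first term of the maximum dominates; the second term only takes over once $q$ exceeds $1.25 \cdot 10^5$.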

    What remains to discuss, then, is how to choose $\eta$ in such a way that $F_\delta(\rho)$ decreases fast enough as $|\tau|$ increases, so that (7.7) gives a good estimate. We cannot hope for $F_\delta(\rho)$ to start decreasing consistently before $|\tau|$ is at least as large as a constant times $|\delta|$. Since $\delta$ varies within $(-c r_0/q, c r_0/q)$, this explains why $T_q$ is taken inversely proportional to $q$ in the above. As we will work with $r_0 \ge 150000$, we also see that we have little margin for maneuver: we want $F_\delta(\rho)$ to be extremely small already for, say, $|\tau| \ge 80|\delta|$. We also have a Scylla-and-Charybdis situation, courtesy of the uncertainty principle: roughly speaking, $F_\delta(\rho)$ cannot decrease faster than exponentially on $|\tau|/|\delta|$ both for $|\delta| \le 1$ and for $\delta$ large.

    The most delicate case is that of $\delta$ large, since then $|\tau|/|\delta|$ is small. It turns out we can manage to get decay that is much faster than exponential for $\delta$ large, while no slower than exponential for $\delta$ small. This we will achieve by working with smoothing functions based on the (one-sided) Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

    The Mellin transform of the twisted Gaussian $e(\delta t) e^{-t^2/2}$ is a parabolic cylinder function $U(a,z)$ with $z$ purely imaginary. Since fully explicit estimates for $U(a,z)$, $z$ imaginary, have not been worked out in the literature, we will have to derive them ourselves.

    Once we have fully explicit estimates for the Mellin transform of the twisted Gaussian, we are able to use essentially any smoothing function based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$. As we already saw, we can and will consider smoothing functions obtained by convolving the twisted Gaussian with another function and also functions obtained by multiplying the twisted Gaussian with another function. All we need to do is use an explicit formula of the right kind – that is, a formula that does not assume too much about the smoothing function or the region of holomorphy of its Mellin transform, but still gives very good error terms, with simple expressions.

    All results here will be based on a single, general explicit formula (Lem. 9.1.1) valid for all our purposes. The contribution of the zeros in the critical strip can be handled in a unified way (Lemmas 9.1.3 and 9.1.4). All that has to be done for each smoothing function is to bound a simple integral (in (9.24)). We then apply a finite verification of GRH and are done.

    Chapter 8

    The Mellin transform of the twisted Gaussian

    Our aim in this chapter is to give fully explicit, yet relatively simple, bounds for the Mellin transform $F_\delta(\rho)$ of $e(\delta t)\eta_\heartsuit(t)$, where $\eta_\heartsuit(t) = e^{-t^2/2}$ and $\delta$ is arbitrary. The rapid decay that results will establish that the Gaussian $\eta_\heartsuit$ is a very good choice for a smoothing, particularly when the smoothing has to be twisted by an additive character $e(\delta t)$.

    The Gaussian smoothing has been used before in number theory; see, notably, Heath-Brown's well-known paper on the fourth power moment of the Riemann zeta function [HB79]. What is new here is that we will derive fully explicit bounds on the Mellin transform of the twisted Gaussian. This means that the Gaussian smoothing will be a real option in explicit work on exponential sums in number theory and elsewhere from now on.¹

    ¹ There has also been work using the Gaussian after a logarithmic change of variables; see, in particular, [Leh66]. In that case, the Mellin transform is simply a Gaussian (as in, e.g., [MV07, Ex. XII.2.9]). However, for $\delta$ non-zero, the Mellin transform of a twist $e(\delta t) e^{-(\log t)^2/2}$ decays very slowly, and thus would not be useful for our purposes, or, in general, for most applications in which GRH is not assumed.

    Theorem 8.0.1. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, $\sigma \ge 0$, $\tau \ne 0$. Let $\ell = -2\pi\delta$. Then, if $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ and $\delta \ne 0$,
    \[ |F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot \begin{cases} c_{1,\sigma,\tau} / \tau^{\sigma/2} & \text{for $\rho$ arbitrary,} \\ c_{2,\sigma,\tau} / \ell^{\sigma} & \text{for $\rho \le 3/2$,} \end{cases} \tag{8.1} \]
    where $\rho = 4\tau/\ell^2$,
    \[ E(\rho) = \frac{1}{2} \left( \arccos\frac{1}{\upsilon(\rho)} - \frac{2(\upsilon(\rho) - 1)}{\rho} \right), \]

    c1στ =1

    2

    1 + 214

    (2

    1 + sin2 π8

    )σ2+eminus(radic

    2minus12

    )τ(

    tan π8

    c2στ =1

    2

    1 + min

    2σ+ 12

    radicsec 2π

    5(sin π

    5

    )σ+

    eminusτ6

    (1radic

    3)σ

    (82)

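    and
    \[ \upsilon(\rho) = \sqrt{\frac{1 + \sqrt{\rho^2 + 1}}{2}}. \]
    If $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$,
    \[ |F_\delta(s)| \le |x_0|^{-\sigma} \cdot e^{-\frac{1}{2}\ell^2} |\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|} \cdot \left( \left(1 + \frac{\pi}{2^{3/2}}\right) e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2} e^{-\pi|\tau|} \right), \tag{8.3} \]
    where
    \[ |x_0| \ge \begin{cases} 0.51729 \sqrt{\tau} & \text{for $\rho$ arbitrary,} \\ 0.84473\, \frac{|\tau|}{|\ell|} & \text{for $\rho \le 3/2$.} \end{cases} \tag{8.4} \]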

    As we shall see, the choice of smoothing function $\eta(t) = e^{-t^2/2}$ can be easily motivated by the method of stationary phase, but the problem is actually solved by the saddle-point method. One of the challenges here is to keep all expressions explicit and practical.

    (In particular, the more critical estimate, (8.1), is optimal up to a constant depending on $\sigma$; the constants we give will be good rather than optimal.)

    The expressions in Thm. 8.0.1 can be easily simplified further, especially if one is ready to introduce some mild constraints and make some sacrifices in the main term.

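    Corollary 8.0.2. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, where $\sigma \in [0,1]$ and $|\tau| \ge 20$. Then, for $0 \le k \le 2$,
    \[ |F_\delta(s+k)| + |F_\delta((1-s)+k)| \le \begin{cases} \kappa_{k,0} \left(\frac{|\tau|}{|\ell|}\right)^k e^{-0.1065 \left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if $4|\tau|/\ell^2 < 3/2$,} \\ \kappa_{k,1} |\tau|^{k/2} e^{-0.1598|\tau|} & \text{if $4|\tau|/\ell^2 \ge 3/2$,} \end{cases} \]
    where
    \[ \kappa_{0,0} \le 3.001, \quad \kappa_{1,0} \le 4.903, \quad \kappa_{2,0} \le 7.96, \]
    \[ \kappa_{0,1} \le 3.286, \quad \kappa_{1,1} \le 4.017, \quad \kappa_{2,1} \le 5.13. \]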

    We are considering $F_\delta(s+k)$, and not just $F_\delta(s)$, because bounding $F_\delta(s+k)$ enables us to work with smoothing functions equal to or based on $t^k e^{-t^2/2}$. Clearly, we can easily derive bounds with $k$ arbitrary from Thm. 8.0.1. It is just that we will use $k = 0, 1, 2$ in practice. Corollary 8.0.2 is meant to be applied to cases where $\tau$ is larger than a constant (10, say) times $|\ell|$, and $\sigma$ cannot be bounded away from 1; if either condition fails to hold, it is better to apply Theorem 8.0.1 directly.

    Let us end with a remark that may be relevant to applications outside number theory. By (8.9), Thm. 8.0.1 gives us bounds on the parabolic cylinder function $U(a,z)$ for $z$ purely imaginary. (Surprisingly, there seem to have been no fully explicit bounds for this case in the literature.) The bounds are useful when $|\Im(a)|$ is at least somewhat larger than $|\Im(z)|$ (i.e., when $|\tau|$ is large compared to $\ell$). While Thm. 8.0.1 is stated for $\sigma \ge 0$ (i.e., for $\Re(a) \ge -1/2$), extending the result to larger half-planes for $a$ is not hard.

    8.1 How to choose a smoothing function

    Let us motivate our choice of smoothing function $\eta$. The method of stationary phase ([Olv74, §4.11], [Won01, §II.3]) suggests that the main contribution to the integral
    \[ F_\delta(s) = \int_0^\infty e(\delta t) \eta(t) t^s \frac{dt}{t} \tag{8.5} \]
    should come when the phase has derivative 0. The phase part of (8.5) is
    \[ e(\delta t)\, t^{\Im(s) i} = e^{(2\pi\delta t + \tau \log t) i} \]
    (where we write $s = \sigma + i\tau$); clearly,
    \[ (2\pi\delta t + \tau \log t)' = 2\pi\delta + \frac{\tau}{t} = 0 \]
    when $t = -\tau/2\pi\delta$. This is meaningful when $t \ge 0$, i.e., $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$. The contribution of $t = -\tau/2\pi\delta$ to (8.5) is then
    \[ \eta(t) e(\delta t) t^{s-1} = \eta\left(\frac{-\tau}{2\pi\delta}\right) e^{-i\tau} \left(\frac{-\tau}{2\pi\delta}\right)^{\sigma + i\tau - 1} \tag{8.6} \]
    multiplied by a "width" approximately equal to a constant divided by
    \[ \sqrt{|(2\pi i\delta t + \tau \log t)''|} = \sqrt{|-\tau/t^2|} = \frac{2\pi|\delta|}{\sqrt{|\tau|}}. \]
    The absolute value of (8.6) is
    \[ \eta\left(-\frac{\tau}{2\pi\delta}\right) \cdot \left|\frac{-\tau}{2\pi\delta}\right|^{\sigma - 1}. \tag{8.7} \]

    In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small, asking that $F_\delta(\sigma + i\tau)$ decay rapidly as $|\tau| \to \infty$ amounts to asking that $\eta(t)$ decay rapidly as $t \to \infty$. Thus, if we ask for $F_\delta(\sigma + i\tau)$ to decay rapidly as $|\tau| \to \infty$ for all moderate $\delta$, we are requesting that

    1. $\eta(t)$ decay rapidly as $t \to \infty$,

    2. the Mellin transform $F_0(\sigma + i\tau)$ decay rapidly as $\tau \to \pm\infty$.

    Requirement (2) is there because we also need to consider $F_\delta(\sigma + it)$ for $\delta$ very small, and, in particular, for $\delta = 0$.

    There is clearly an uncertainty-principle issue here; one cannot do arbitrarily well in both aspects at the same time. Once we are conscious of this, the choice $\eta(t) = e^{-t}$ in Hardy-Littlewood actually looks fairly good: obviously, $\eta(t) = e^{-t}$ decays exponentially, and its Mellin transform $\Gamma(s + i\tau)$ also decays exponentially as $\tau \to \pm\infty$. Moreover, for this choice of $\eta$, the Mellin transform $F_\delta(s)$ can be written explicitly: $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$.
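The closed form for the Mellin transform of $e(\delta t)e^{-t}$ can be checked numerically for a real exponent. The Python sketch below approximates $\int_0^\infty e^{-t} e(\delta t) t^{\sigma - 1}\,dt$ by composite Simpson quadrature on a truncated range and compares it with $\Gamma(\sigma)/(1 - 2\pi i\delta)^\sigma$. The function name, the truncation point and the step count are illustrative assumptions of ours, and only real $\sigma \ge 1$ is handled.

```python
import cmath
import math


def mellin_twisted_exp(sigma, delta, upper=60.0, steps=200000):
    """Simpson approximation of int_0^upper e^{-t} e(delta t) t^{sigma-1} dt.

    Assumes real sigma >= 1 (so the integrand is bounded at 0) and even steps.
    """
    h = upper / steps

    def f(t):
        # e^{-t} * e(delta t) * t^{sigma - 1}, with e(y) = exp(2 pi i y)
        return cmath.exp((-1 + 2j * math.pi * delta) * t) * t ** (sigma - 1)

    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3


sigma, delta = 2.0, 0.1
numeric = mellin_twisted_exp(sigma, delta)
closed = math.gamma(sigma) / (1 - 2j * math.pi * delta) ** sigma
print(numeric, closed)
```

The tail beyond the truncation point is of size roughly $e^{-60}$, so the two printed values should agree to many digits.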

    It is not hard to work out an explicit formula² for $\eta(t) = e^{-t}$. However, it is not hard to see that, for $F_\delta(s)$ as above, $F_\delta(1/2 + it)$ decays like $e^{-t/2\pi|\delta|}$, just as we expected from (8.7). This is a little too slow for our purposes: we will often have to work with relatively large $\delta$, and we would like to have to check the zeroes of $L$ functions only up to relatively low heights $t$ – say, up to $50|\delta|$. Then $e^{-t/2\pi|\delta|} > e^{-8} = 0.00033\ldots$, which is not very small. We will settle for a different choice of $\eta$: the Gaussian.

    The decay of the Gaussian smoothing function $\eta(t) = e^{-t^2/2}$ is much faster than exponential. Its Mellin transform is $\Gamma(s/2)$, which decays exponentially as $\Im(s) \to \pm\infty$. Moreover, the Mellin transform $F_\delta(s)$ ($\delta \ne 0$), while not an elementary or very commonly occurring function, equals (after a change of variables) a relatively well-studied special function, namely, a parabolic cylinder function $U(a,z)$ (or, in Whittaker's [Whi03] notation, $D_{-a-1/2}(z)$).

    For $\delta$ not too small, the main term will indeed work out to be proportional to $e^{-(\tau/2\pi\delta)^2/2}$, as the method of stationary phase indicated. This is, of course, much better than $e^{-\tau/2\pi|\delta|}$. The "cost" is that the Mellin transform $\Gamma(s/2)$ for $\delta = 0$ now decays like $e^{-(\pi/4)|\tau|}$ rather than $e^{-(\pi/2)|\tau|}$. This we can certainly afford.
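To get a feeling for the difference between the two decay rates, the following Python lines evaluate both at the height $\tau = 50|\delta|$ mentioned above. This is plain arithmetic on the two model decay factors, not a bound on any Mellin transform; the variable names are ours.

```python
import math

ratio = 50.0  # tau / |delta|, the height-to-twist ratio used as an example in the text

# Exponential smoothing eta(t) = e^{-t}: decay like exp(-tau / (2 pi |delta|))
exp_decay = math.exp(-ratio / (2 * math.pi))

# Gaussian smoothing eta(t) = e^{-t^2/2}: decay like exp(-(tau / (2 pi delta))^2 / 2)
gauss_decay = math.exp(-((ratio / (2 * math.pi)) ** 2) / 2)

print(exp_decay, gauss_decay)
```

The first factor is only slightly larger than $e^{-8}$, while the second is smaller than $10^{-13}$.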

    ² There may be a minor gap in the literature in this respect. The explicit formula given in [HL22, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms were fully worked out for $q = 1$ by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of $\mathrm{hyp}_{\kappa,q}(z)$ there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL22] seems to neglect the effect of the zeros $\rho$ with $\Re(\rho) = 0$, $\Im(\rho) \ne 0$ for $\chi$ non-primitive. (The author thanks R. C. Vaughan for this information and the references.)

    8.2 The twisted Gaussian: overview and setup

    8.2.1 Relation to the existing literature

    We wish to approximate the Mellin transform
    \[ F_\delta(s) = \int_0^\infty e^{-t^2/2} e(\delta t) t^s \frac{dt}{t}, \tag{8.8} \]
    where $\delta \in \mathbb{R}$. The parabolic cylinder function $U: \mathbb{C}^2 \to \mathbb{C}$ is given by
    \[ U(a,z) = \frac{e^{-z^2/4}}{\Gamma\left(\frac{1}{2} + a\right)} \int_0^\infty t^{a - \frac{1}{2}} e^{-\frac{1}{2}t^2 - zt}\, dt \]
    for $\Re(a) > -1/2$; the function can be extended to all $a, z \in \mathbb{C}$ either by analytic continuation or by other integral representations ([AS64, §19.5], [Tem10, §12.5(i)]). Hence
    \[ F_\delta(s) = e^{(\pi i\delta)^2}\, \Gamma(s)\, U\left(s - \frac{1}{2}, -2\pi i\delta\right). \tag{8.9} \]
    The second argument of $U$ is purely imaginary; it would be otherwise if a Gaussian of non-zero mean were chosen.

    Let us briefly discuss the state of knowledge up to date on Mellin transforms of "twisted" Gaussian smoothings, that is, $e^{-t^2/2}$ multiplied by an additive character $e(\delta t)$. As we have just seen, these Mellin transforms are precisely the parabolic cylinder functions $U(a,z)$.

    The function $U(a,z)$ has been well-studied for $a$ and $z$ real; see, e.g., [Tem10]. Less attention has been paid to the more general case of $a$ and $z$ complex. The most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61], [Olv65]; he gave asymptotic series for $U(a,z)$, $a, z \in \mathbb{C}$. These were asymptotic series in the sense of Poincaré, and thus not in general convergent; they would solve our problem if and only if they came with error term bounds. Unfortunately, it would seem that all fully explicit error terms in the literature are either for $a$ and $z$ real, or for $a$ and $z$ outside our range of interest (see both Olver's work and [TV03]). The bounds in [Olv61] involve non-explicit constants. Thus, we will have to find expressions with explicit error bounds ourselves. Our case is that of $a$ in the critical strip, $z$ purely imaginary.

    8.2.2 General approach

    We will use the saddle-point method (see, e.g., [dB81, §5], [Olv74, §4.7], [Won01, §II.4]) to obtain bounds with an optimal leading-order term and small error terms. (We used the stationary-phase method solely as an exploratory tool.)

    What do we expect to obtain? Both the asymptotic expressions in [Olv59] and the bounds in [Olv61] make clear that, if the sign of $\tau = \Im(s)$ is different from that of $\delta$, there will be a change in behavior when $\tau$ gets to be of size about $(2\pi\delta)^2$. This is unsurprising, given our discussion using stationary phase: for $|\Im(a)|$ smaller than a constant times $|\Im(z)|^2$, the term proportional to $e^{-(\pi/4)|\tau|} = e^{-|\Im(a)|/2}$ should be dominant, whereas for $|\Im(a)|$ much larger than a constant times $|\Im(z)|^2$, the term proportional to $e^{-\frac{1}{2}\left(\frac{\tau}{2\pi\delta}\right)^2}$ should be dominant.

    There is one important difference between the approach we will follow here and that in [Hela]. In [Hela], the integral (8.8) was estimated by a direct application of the saddle-point method. Here, following a suggestion of N. Temme, we will use the identity
    \[ U(a,z) = \frac{e^{\frac{1}{4}z^2}}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{-zu + \frac{u^2}{2}} u^{-a - \frac{1}{2}}\, du \tag{8.10} \]
    (see, e.g., [OLBC10, (12.5.6)]; $c > 0$ is arbitrary). Together, (8.9) and (8.10) give us that
    \[ F_\delta(s) = \frac{e^{-2\pi^2\delta^2}\, \Gamma(s)}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{2\pi i\delta u + \frac{u^2}{2}} u^{-s}\, du. \tag{8.11} \]

    Estimating the integral in (8.11) turns out to be a somewhat cleaner task than estimating (8.8). The overall procedure, however, is in essence the same in both cases.

    We write
    \[ \phi(u) = -\frac{u^2}{2} - (2\pi i\delta) u + i\tau \log u \tag{8.12} \]
    for $u$ real or complex, so that the integral in (8.11) equals
    \[ I(s) = \int_{c - i\infty}^{c + i\infty} e^{-\phi(u)} u^{-\sigma}\, du. \tag{8.13} \]

    We wish to find a saddle point. A saddle point is a point $u$ at which $\phi'(u) = 0$. This means that
    \[ -u - 2\pi i\delta + \frac{i\tau}{u} = 0, \quad \text{i.e.,} \quad u^2 - i\ell u - i\tau = 0, \tag{8.14} \]
    where $\ell = -2\pi\delta$. The solutions to $\phi'(u) = 0$ are thus
    \[ u_0 = \frac{i\ell \pm \sqrt{-\ell^2 + 4i\tau}}{2}. \tag{8.15} \]
    The value of $\phi(u)$ at $u_0$ is
    \[ \phi(u_0) = -\frac{i\ell u_0 + i\tau}{2} + i\ell u_0 + i\tau \log u_0 = \frac{i\ell}{2} u_0 + i\tau \log\frac{u_0}{\sqrt{e}}. \tag{8.16} \]
    The second derivative at $u_0$ is
    \[ \phi''(u_0) = -\frac{1}{u_0^2}\left(u_0^2 + i\tau\right) = -\frac{1}{u_0^2}\left(i\ell u_0 + 2i\tau\right). \tag{8.17} \]

    Assign the names $u_{0,+}$, $u_{0,-}$ to the roots in (8.15) according to the sign in front of the square-root (where the square-root is defined so as to have its argument in the interval $(-\pi/2, \pi/2]$). We will actually have to pay attention just to $u_{0,+}$, since, unlike $u_{0,-}$, it lies on the right half of the plane, where our contour of integration also lies. We remark that
    \[ u_{0,+} = \frac{i\ell + |\ell| \sqrt{-1 + \frac{4i\tau}{\ell^2}}}{2} = \frac{\ell}{2} \left( i \pm \sqrt{-1 + \frac{4\tau}{\ell^2} i} \right), \tag{8.18} \]
    where the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. If $\ell = 0$, then $u_{0,+} = (1/\sqrt{2} + i/\sqrt{2})\sqrt{\tau}$.

    We can assume without loss of generality that $\tau \ge 0$. We will find it convenient to assume $\tau > 0$, since we can deal with $\tau = 0$ simply by letting $\tau \to 0^+$.

    8.3 The saddle point

    8.3.1 The coordinates of the saddle point

    We should start by determining $u_{0,+}$ explicitly, both in rectangular and polar coordinates. For one thing, we will need to estimate the integrand in (8.13) for $u = u_{0,+}$. The absolute value of the integrand is then $\left| e^{-\phi(u_{0,+})} u_{0,+}^{-\sigma} \right| = |u_{0,+}|^{-\sigma} e^{-\Re\phi(u_{0,+})}$, and, by (8.16),
    \[ \Re\phi(u_{0,+}) = -\frac{\ell}{2} \Im(u_{0,+}) - \arg(u_{0,+})\, \tau. \tag{8.19} \]

    If $\ell = 0$, we already know that $\Re(u_{0,+}) = \Im(u_{0,+}) = \sqrt{\tau/2}$, $|u_{0,+}| = \sqrt{\tau}$ and $\arg u_{0,+} = \pi/4$. Assume from now on that $\ell \ne 0$.

    We will use the expression for $u_{0,+}$ in (8.18). Solving a quadratic equation, we see that
    \[ \sqrt{-1 + \frac{4\tau}{\ell^2} i} = \sqrt{\frac{j(\rho) - 1}{2}} + i \sqrt{\frac{j(\rho) + 1}{2}}, \tag{8.20} \]
    where $j(\rho) = (1 + \rho^2)^{1/2}$ and $\rho = 4\tau/\ell^2$. Hence
    \[ \Re(u_{0,+}) = \pm\frac{\ell}{2} \sqrt{\frac{j(\rho) - 1}{2}}, \qquad \Im(u_{0,+}) = \frac{\ell}{2} \left( 1 \pm \sqrt{\frac{j(\rho) + 1}{2}} \right). \tag{8.21} \]
    Here and in what follows, the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. (Notice that $\Re(u_{0,+})$ and $\Im(u_{0,+})$ are always positive, except for $\tau = \ell = 0$, in which case $\Re(u_{0,+}) = \Im(u_{0,+}) = 0$.) By (8.21),
    \[
    \begin{aligned}
    |u_{0,+}| &= \frac{|\ell|}{2} \cdot \left| \sqrt{\frac{-1 + j(\rho)}{2}} + \left( 1 \pm \sqrt{\frac{1 + j(\rho)}{2}} \right) i \right| \\
    &= \frac{|\ell|}{2} \sqrt{\frac{-1 + j(\rho)}{2} + \frac{1 + j(\rho)}{2} + 1 \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} \\
    &= \frac{|\ell|}{2} \sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} = \frac{|\ell|}{\sqrt{2}} \sqrt{\upsilon(\rho)^2 \pm \upsilon(\rho)},
    \end{aligned}
    \tag{8.22}
    \]

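    where $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$. We now compute the argument of $u_{0,+}$:
    \[
    \begin{aligned}
    \arg(u_{0,+}) &= \arg\left( \ell \left( i \pm \sqrt{-1 + i\rho} \right) \right) = \arg\left( \sqrt{\frac{-1 + j(\rho)}{2}} + i \left( \pm 1 + \sqrt{\frac{1 + j(\rho)}{2}} \right) \right) \\
    &= \arcsin\frac{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}{\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}}} = \arcsin\sqrt{\frac{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}{2\sqrt{\frac{1 + j(\rho)}{2}}}} \\
    &= \arcsin\sqrt{\frac{1}{2} \left( 1 \pm \sqrt{\frac{2}{1 + j(\rho)}} \right)} = \frac{\pi}{2} - \frac{1}{2} \arccos\left( \pm\sqrt{\frac{2}{1 + j(\rho)}} \right)
    \end{aligned}
    \tag{8.23}
    \]
    (by $\cos(\pi - 2\theta) = -\cos 2\theta = 2\sin^2\theta - 1$). Thus
    \[ \arg(u_{0,+}) = \begin{cases} \frac{\pi}{2} - \frac{1}{2} \arccos\frac{1}{\upsilon(\rho)} = \frac{1}{2} \arccos\frac{-1}{\upsilon(\rho)} & \text{if $\ell > 0$,} \\ \frac{1}{2} \arccos\frac{1}{\upsilon(\rho)} & \text{if $\ell < 0$.} \end{cases} \tag{8.24} \]

    In particular, $\arg(u_{0,+})$ lies in $[0, \pi/2]$, and is close to $\pi/2$ only when $\ell > 0$ and $\rho \to 0^+$. Here and elsewhere, we follow the convention that $\arcsin$ and $\arctan$ have image in $[-\pi/2, \pi/2]$, whereas $\arccos$ has image in $[0, \pi]$.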
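The closed forms for the real part, imaginary part, modulus and argument of the saddle point can be cross-checked against a direct evaluation of the root of $u^2 - i\ell u - i\tau = 0$ taken with the principal square root. The Python sketch below does this in floating point; the function names are ours, and the code assumes $\ell \ne 0$ and $\tau > 0$.

```python
import cmath
import math


def saddle_point(ell, tau):
    """Root of u^2 - i*ell*u - i*tau = 0 with the + sign and principal square root."""
    return (1j * ell + cmath.sqrt(-ell**2 + 4j * tau)) / 2


def closed_form(ell, tau):
    """Real part, imaginary part, modulus and argument from the closed formulas.

    Assumes ell != 0 and tau > 0.
    """
    rho = 4 * tau / ell**2
    j = math.sqrt(1 + rho**2)
    ups = math.sqrt((1 + j) / 2)
    sign = 1 if ell > 0 else -1
    re = sign * (ell / 2) * math.sqrt((j - 1) / 2)
    im = (ell / 2) * (1 + sign * ups)
    modulus = abs(ell) / math.sqrt(2) * math.sqrt(ups**2 + sign * ups)
    theta = 0.5 * math.acos(1 / ups)
    arg = math.pi / 2 - theta if ell > 0 else theta
    return re, im, modulus, arg


for ell, tau in [(3.0, 5.0), (-2.0, 7.0), (0.5, 40.0)]:
    u0 = saddle_point(ell, tau)
    print(u0, closed_form(ell, tau))
```

For each sample pair, the direct root and the closed-form coordinates should coincide up to rounding, for both signs of $\ell$.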

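    8.3.2 The direction of steepest descent

    As is customary in the saddle-point method, it is now time to determine the direction of steepest descent at the saddle-point $u_{0,+}$. Even if we decide to use a contour that goes through the saddle-point in a direction that is not quite optimal, it will be useful to know what the direction $w$ of steepest descent actually is. A contour that passes through the saddle-point making an angle between $-\pi/4 + \epsilon$ and $\pi/4 - \epsilon$ with $w$ may be acceptable, in that the contribution of the saddle point is then suboptimal by at most a bounded factor depending on $\epsilon$; an angle approaching $-\pi/4$ or $\pi/4$ leads to a contribution suboptimal by an unbounded factor.

    Let $w \in \mathbb{C}$ be the unit vector pointing in the direction of steepest descent. Then, by definition, $w^2 \phi''(u_{0,+})$ is real and positive, where $\phi$ is as in (8.12). Thus $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$. (The direction of steepest descent is defined only modulo $\pi$.) By (8.17),
    \[
    \begin{aligned}
    \arg(\phi''(u_{0,+})) &= -\pi + \arg(i\ell u_{0,+} + 2i\tau) - 2\arg(u_{0,+}) \bmod 2\pi \\
    &= -\frac{\pi}{2} + \arg(\ell u_{0,+} + 2\tau) - 2\arg(u_{0,+}) \bmod 2\pi.
    \end{aligned}
    \]
    By (8.21),
    \[
    \begin{aligned}
    \Re(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2} \left( \pm\sqrt{\frac{j(\rho) - 1}{2}} + \frac{4\tau}{\ell^2} \right) = \frac{\ell^2}{2} \left( \rho \pm \sqrt{\frac{j(\rho) - 1}{2}} \right), \\
    \Im(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2} \left( 1 \pm \sqrt{\frac{j(\rho) + 1}{2}} \right).
    \end{aligned}
    \]
    Therefore, $\arg(\ell u_{0,+} + 2\tau) = \arctan\varpi$, where
    \[ \varpi = \frac{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}. \]
    It is easy to check that $\mathrm{sgn}\,\varpi = \mathrm{sgn}\,\ell$. Hence,
    \[ \arctan\varpi = \pm\frac{\pi}{2} - \arctan\frac{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}. \]
    At the same time,
    \[
    \begin{aligned}
    \frac{\rho \pm \sqrt{\frac{j - 1}{2}}}{1 \pm \sqrt{\frac{j + 1}{2}}} &= \frac{\left( \rho \pm \sqrt{\frac{j - 1}{2}} \right) \left( 1 \mp \sqrt{\frac{j + 1}{2}} \right)}{1 - \frac{j + 1}{2}} = \frac{\rho \pm \sqrt{2(j - 1)} \mp \rho\sqrt{2(j + 1)}}{1 - j} \\
    &= \frac{\rho \pm \sqrt{\frac{2}{j + 1}} \left( \sqrt{j^2 - 1} - \rho \cdot (j + 1) \right)}{1 - j} = \frac{\rho \pm \frac{1}{\upsilon} \left( \rho - \rho \cdot (j + 1) \right)}{1 - j} \\
    &= \frac{\rho (1 \mp j/\upsilon)}{1 - j} = \frac{(-1 \pm j/\upsilon)(j + 1)}{\rho} = \frac{2\upsilon(-\upsilon \pm j)}{\rho}.
    \end{aligned}
    \tag{8.25}
    \]
    Hence, modulo $2\pi$,
    \[ \arg(\phi''(u_{0,+})) = -\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} - 2\arg(u_{0,+}) - \begin{cases} 0 & \text{if $\ell \ge 0$,} \\ \pi & \text{if $\ell < 0$.} \end{cases} \]
    Therefore, the direction of steepest descent is
    \[ \arg(w) = -\frac{\arg(\phi''(u_{0,+}))}{2} = \arg(u_{0,+}) + \frac{1}{2} \arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} + \begin{cases} 0 & \text{if $\ell \ge 0$,} \\ \pi/2 & \text{if $\ell < 0$.} \end{cases} \tag{8.26} \]
    By (8.24) and $\arccos 1/\upsilon = \arctan\sqrt{\upsilon^2 - 1} = \arctan\sqrt{(j - 1)/2}$, we conclude that
    \[ \arg(w) = \begin{cases} \frac{\pi}{2} + \frac{1}{2} \left( -\arctan\frac{2\upsilon(j + \upsilon)}{\rho} + \arctan\sqrt{\frac{j - 1}{2}} \right) & \text{if $\ell < 0$,} \\ \frac{\pi}{2} + \frac{1}{2} \left( \arctan\frac{2\upsilon(j - \upsilon)}{\rho} - \arctan\sqrt{\frac{j - 1}{2}} \right) & \text{if $\ell \ge 0$.} \end{cases} \tag{8.27} \]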

    Figure 8.1: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell < 0$

    Figure 8.2: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell \ge 0$

    There is nothing wrong in using plots here to get an idea of the behavior of $\arg(w)$, since, at any rate, the direction of steepest descent will play only an advisory role in our choices. See Figures 8.1 and 8.2.
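In the same exploratory spirit, the direction of steepest descent can be evaluated numerically in two ways: directly from $\phi''(u) = -1 - i\tau/u^2$ at the saddle point, and from the closed formula for $\arg(w)$ in terms of $\rho$, $j$ and $\upsilon$. The Python sketch below compares the two modulo $\pi$. Function names and sample values are our own, and the code assumes $\ell \ne 0$, $\tau > 0$.

```python
import cmath
import math


def steepest_descent_numeric(ell, tau):
    """-arg(phi''(u0+))/2 mod pi, with phi''(u) = -1 - i*tau/u^2."""
    u0 = (1j * ell + cmath.sqrt(-ell**2 + 4j * tau)) / 2
    phi2 = -1 - 1j * tau / u0**2
    return (-cmath.phase(phi2) / 2) % math.pi


def steepest_descent_formula(ell, tau):
    """arg(w) from the closed formula; assumes ell != 0 and tau > 0."""
    rho = 4 * tau / ell**2
    j = math.sqrt(1 + rho**2)
    ups = math.sqrt((1 + j) / 2)
    base = math.atan(math.sqrt((j - 1) / 2))
    if ell < 0:
        return math.pi / 2 + 0.5 * (-math.atan(2 * ups * (j + ups) / rho) + base)
    return math.pi / 2 + 0.5 * (math.atan(2 * ups * (j - ups) / rho) - base)


def angles_agree(a, b, tol=1e-9):
    """Compare two directions that are only defined modulo pi."""
    d = (a - b) % math.pi
    return min(d, math.pi - d) < tol


for ell, tau in [(2.0, 1.0), (-2.0, 1.0), (0.5, 30.0), (-3.0, 10.0)]:
    print(ell, tau, steepest_descent_formula(ell, tau) - math.pi / 2)
```

The printed quantity is $\arg(w) - \pi/2$, the same quantity that the two figures plot against $\rho$.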

8.4 The integral over the contour

We must now choose the contour of integration. The optimal contour should be one on which the phase of the integrand in (8.13) is constant, i.e., $\Im(\phi(u))$ is constant. This is so because, throughout the contour, we want to keep descending from the saddle as rapidly as possible, and so we want to maximize the absolute value of the derivative of the real part of the exponent $-\phi(u)$. At any point $u$, if we are to maximize $|\Re(d\phi(u)/dt)|$, we want our contour to be such that $\Im(d\phi(u)/dt) = 0$. (We can also see this as follows: if $\Im(\phi(u))$ is constant, there is no cancellation in (8.13) for us to miss.)

Writing $u = x + iy$, we obtain from (8.12) that
\[
\Im(\phi(u)) = -xy + \ell x + \tau\log\sqrt{x^2 + y^2}. \tag{8.28}
\]
We would thus be considering the curve $\Im(\phi(u)) = c$, where $c$ is a constant. Since we need the contour to pass through the saddle point $u_{0,+}$, we set $c = \Im(\phi(u_{0,+}))$. The only problem is that the curve $\Im(\phi(u)) = c$ given by (8.28) is rather uncomfortable to work with.

Instead, we shall use several rather simple contours, each appropriate for different values of $\ell$ and $\tau$.

8.4.1 A simple contour

Assume first that $\ell > 0$. We could just let our contour $L$ be the vertical line going through $u_{0,+}$. Since the direction of steepest descent is never far from vertical (see (8.2)), this would be a good choice. However, the vertical line has the defect of going too close to the origin when $\rho \to 0$.

Instead, we will let $L$ consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge y_0\}$, where $x_0 = \Re u_{0,+} \ge 0$, $y_0 = \Im u_{0,+} > 0$; (b) the straight segment going downwards and to the right from $u_{0,+}$ to the $x$-axis, forming an angle of $\pi/2 - \beta$ (where $\beta > 0$ will be determined later) with the $x$-axis at a point $(x_1, 0)$; (c) the straight vertical ray $\{(x_1, y) : y \le 0\}$. Let us call these three segments $L_1$, $L_2$, $L_3$. Shifting the contour in (8.13), we obtain

\[
I = \int_L e^{-\phi(u)} u^{-\sigma}\,du,
\]
and so $|I| \le I_1 + I_2 + I_3$, where
\[
I_j = \int_{L_j} \left|e^{-\phi(u)} u^{-\sigma}\right| |du|. \tag{8.29}
\]
As we shall see, we have chosen the segments $L_j$ so that each of the three integrals $I_j$ will be easy to bound.

Let us start with $I_1$. Since $\sigma \ge 0$,
\[
I_1 \le |u_{0,+}|^{-\sigma}\int_{y_0}^{\infty} e^{-\Re\phi(x_0 + iy)}\,dy,
\]
where, by (8.12),
\[
\Re\phi(x + iy) = \frac{y^2 - x^2}{2} - \ell y - \tau\arg(x + iy). \tag{8.30}
\]

Let us expand the expression on the right of (8.30) for $x = x_0$ and $y$ around $y_0 = \Im u_{0,+} > 0$. The constant term is
\[
\Re\phi(u_{0,+}) = -\frac{\ell}{2}y_0 - \tau\arg(u_{0,+}) = -\frac{\ell^2}{4}(1 + \upsilon(\rho)) - \frac{\tau}{2}\arccos\frac{-1}{\upsilon(\rho)}
= -\left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau, \tag{8.31}
\]
where we are using (8.19), (8.21) and (8.24) (recall that $\ell^2/4 = \tau/\rho$).

The linear term vanishes because $u_{0,+}$ is a saddle-point (and thus a local extremum on $L$). It remains to estimate the quadratic term. Now, in (8.30), the term $\arg(x + iy)$ equals $\arctan(y/x)$, whose quadratic term we should now examine; instead, we are about to see that we can bound it trivially. In general, for $t_0, t \in \mathbb{R}$ and $f \in C^2$,
\[
f(t) = f(t_0) + f'(t_0)\cdot(t - t_0) + \int_{t_0}^{t}\int_{t_0}^{r} f''(s)\,ds\,dr. \tag{8.32}
\]
Now, $\arctan''(s) = -2s/(s^2 + 1)^2$, and this is negative for $s > 0$ and obeys $\arctan''(-s) = -\arctan''(s)$ for all $s$. Hence, for $t_0 \ge 0$ and $t \ge -t_0$,
\[
\arctan t \le \arctan t_0 + (\arctan' t_0)\cdot(t - t_0). \tag{8.33}
\]

Therefore, in (8.30), we can consider only the quadratic term coming from $(y^2 - x^2)/2$, namely $(y - y_0)^2/2$, and ignore the quadratic term coming from $\arg(x + iy)$. Thus
\[
\Re\phi(x_0 + iy) \ge \frac{(y - y_0)^2}{2} + \Re\phi(u_{0,+}) \tag{8.34}
\]
for $y \ge -y_0$, and in particular for $y \ge y_0$. Hence
\[
\int_{y_0}^{\infty} e^{-\Re\phi(x_0 + iy)}\,dy \le e^{-\Re\phi(u_{0,+})}\int_{y_0}^{\infty} e^{-\frac{1}{2}(y - y_0)^2}\,dy = \sqrt{\frac{\pi}{2}}\cdot e^{-\Re\phi(u_{0,+})}. \tag{8.35}
\]
Notice that, once we choose to use the approximation (8.33), the vertical direction is actually optimal. (In turn, the fact that the direction of steepest descent is close to vertical shows us that we are not losing much by using the approximation (8.33).)

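As for $|u_{0,+}|^{-\sigma}$, we will estimate it by the easy bound
\[
|u_{0,+}| = \frac{\ell}{\sqrt{2}}\sqrt{\upsilon^2 + \upsilon} \ge \frac{\ell}{\sqrt{2}}\max\left(\sqrt{\frac{\rho}{2}}, \sqrt{2}\right) = \max(\sqrt{\tau}, \ell), \tag{8.36}
\]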

where we use (8.22).

Let us now bound $I_2$. As we already said, the linear term at $u_{0,+}$ vanishes. Let $u^\circ$ be the point at which $L_2$ meets the line normal to it through the origin. We must take care that the angle formed by the origin, $u_{0,+}$ and $u^\circ$ be no larger than the angle formed by the origin, $(x_1, 0)$ and $u^\circ$; this will ensure that we are in the range in which the approximation (8.33) is valid (namely, $t \ge -t_0$, where $t_0 = \tan\alpha_0$). The first angle is $\pi/2 + \beta - \arg u_{0,+}$, whereas the second angle is $\pi/2 - \beta$. Hence, it is enough to set $\beta \le (\arg u_{0,+})/2$. Then we obtain from (8.12) and (8.33) that
\[
\Re\phi(u) \ge \Re\phi(u_{0,+}) - \Re\frac{(u - u_{0,+})^2}{2}. \tag{8.37}
\]
If we let $s = |u - u_{0,+}|$, we see that
\[
\Re\frac{(u - u_{0,+})^2}{2} = \frac{s^2}{2}\cos\left(2\cdot\left(\frac{\pi}{2} - \beta\right)\right) = -\frac{s^2}{2}\cos 2\beta.
\]
Hence
\[
I_2 \le |u^\circ|^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\,|du|
< |u^\circ|^{-\sigma}\int_0^{\infty} e^{-\Re\phi(u_{0,+}) - \frac{s^2}{2}\cos 2\beta}\,ds
= |u^\circ|^{-\sigma} e^{-\Re\phi(u_{0,+})}\sqrt{\frac{\pi}{2\cos 2\beta}}. \tag{8.38}
\]
Since $\arg u^\circ = \arg u_{0,+} - \beta$, we see that, by (8.21),
\[
|u^\circ| = \Re\left((x_0 + iy_0)(\cos\beta - i\sin\beta)\right)
= \frac{\ell}{2}\left(\sqrt{\frac{j-1}{2}}\cos\beta + \left(1 + \sqrt{\frac{j+1}{2}}\right)\sin\beta\right). \tag{8.39}
\]

The square of the expression within the outer parentheses is at least
\[
\frac{j-1}{2}\cos^2\beta + \left(1 + \frac{j+1}{2} + \sqrt{2(j+1)}\right)\sin^2\beta + \left(\sqrt{\frac{j^2-1}{4}} + \sqrt{\frac{j-1}{2}}\right)\sin 2\beta
\ge \frac{j}{2} + \frac{7}{2}\sin^2\beta - \frac{1}{2}\cos^2\beta + \frac{j}{2}\sin^2\beta.
\]
If $\beta \ge \pi/8$, then $\tan\beta > 1/\sqrt{7}$, and so, since $j > \rho$, we obtain
\[
|u^\circ| \ge \frac{\ell}{2}\sqrt{\frac{j}{2}(1 + \sin^2\beta)} > \frac{\ell\sqrt{\rho}}{2^{3/2}}\sqrt{1 + \sin^2\beta}.
\]
We can also apply the trivial bound $j \ge 1$ directly to (8.39). Thus
\[
|u^\circ| \ge \max\left(\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\beta},\ \ell\sin\beta\right).
\]

Let us choose $\beta$ as follows. We could always set $\beta = \pi/8$; since $\arg u_{0,+} \ge \pi/4$, we then have $\beta \le (\arg u_{0,+})/2$, as required. However, if $\rho \le 3/2$, then $\upsilon(\rho) \le 1.18381$, and so, by (8.24), $\arg u_{0,+} \ge 1.28842$. We can thus set either $\beta = \pi/6 = 0.523598\ldots$ or $\beta = \pi/5 = 0.628318\ldots$, say, either of which is smaller than $(\arg u_{0,+})/2$. Going back to (8.38), we conclude that
\[
I_2 \le e^{-\Re\phi(u_{0,+})}\cdot\frac{\sqrt{\pi}}{2^{1/4}}\left|\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\frac{\pi}{8}}\right|^{-\sigma}
\]
for $\rho$ arbitrary, and
\[
I_2 \le e^{-\Re\phi(u_{0,+})}\cdot\min\left(\sqrt{\frac{\pi/2}{\cos(2\pi/5)}}\cdot\left|\ell\sin\frac{\pi}{5}\right|^{-\sigma},\ \sqrt{\pi}\left|\frac{\ell}{2}\right|^{-\sigma}\right)
\]
when $\rho \le 3/2$.

It remains to estimate $I_3$. For $u = x_1$,
\[
-\Re\frac{(u - u_{0,+})^2}{2} = -\Re\frac{y_0^2(\tan\beta - i)^2}{2} = \frac{1}{2}\left(1 - \tan^2\beta\right)y_0^2
\ge \left(1 - \tan^2\beta\right)\cdot\frac{\ell^2}{8}\left(1 + \frac{j+1}{2}\right)
\ge \frac{\ell^2}{8}\left(1 - \tan^2\beta\right)\cdot\frac{\rho}{2}
\ge \frac{1}{4}\left(1 - \tan^2\beta\right)\tau,
\]
where we are using (8.21) and $\ell^2\rho/4 = \tau$. Thus (8.37) tells us that
\[
\Re\phi(x_1) \ge \Re\phi(u_{0,+}) + \frac{1 - \tan^2\beta}{4}\tau.
\]
At the same time, by (8.30) and $\tau, \ell \ge 0$,
\[
\Re\phi(x_1 + iy) \ge \Re\phi(x_1) + \frac{y^2}{2}
\]
for $y \le 0$. Hence
\[
I_3 \le |x_1|^{-\sigma}\int_{L_3} e^{-\Re\phi(u)}\,|du| \le |x_1|^{-\sigma} e^{-\Re\phi(x_1)}\int_{-\infty}^{0} e^{-y^2/2}\,dy
\le |x_1|^{-\sigma}\cdot\sqrt{\frac{\pi}{2}}\, e^{-\frac{1 - \tan^2\beta}{4}\tau}\, e^{-\Re\phi(u_{0,+})}.
\]
Here note that $x_1 \ge (\tan\beta)|u_{0,+}|$, and so, by (8.36),
\[
x_1 \ge \tan\beta\cdot\max(\sqrt{\tau}, \ell).
\]

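We conclude that, for $\ell > 0$,
\[
|I| \le \left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right)\cdot\frac{\sqrt{\pi/2}}{\tau^{\sigma/2}}\, e^{-\Re\phi(u_{0,+})}
\]
(since $(1 - \tan^2(\pi/8))/4 = (\sqrt{2} - 1)/2$), and, when $\rho \le 3/2$,
\[
|I| \le \left(1 + \min\left(2^{\sigma + \frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{\left(1/\sqrt{3}\right)^{\sigma}}\right)\cdot\frac{\sqrt{\pi/2}}{\ell^{\sigma}}\, e^{-\Re\phi(u_{0,+})}.
\]
We know $\Re\phi(u_{0,+})$ from (8.31). Write
\[
E(\rho) = \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} - \frac{\upsilon(\rho) - 1}{\rho}, \tag{8.40}
\]
so that
\[
-\Re\phi(u_{0,+}) = \left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau = \left(\frac{\pi}{2} - E(\rho) + \frac{2}{\rho}\right)\tau.
\]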

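To finish, we just need to apply (8.11). It makes sense to group together $\Gamma(s)e^{\frac{\pi}{2}\tau}$, since it is bounded on the critical line (by the classical formula $|\Gamma(1/2 + i\tau)| = \sqrt{\pi/\cosh\pi\tau}$, as in [MV07, Exer. C.1(b)]), and in general of slow growth on bounded strips. Using (8.11), and noting that $2\pi^2\delta^2 = \ell^2/2 = (2/\rho)\cdot\tau$, we obtain
\[
|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau}\, e^{-E(\rho)\tau}\cdot
\begin{cases}
c_{1,\sigma,\tau}\,\tau^{-\sigma/2} & \text{for } \rho \text{ arbitrary},\\
c_{2,\sigma,\tau}\,\ell^{-\sigma} & \text{for } \rho \le 3/2,
\end{cases}
\tag{8.41}
\]
where
\[
\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right),\\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma + \frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{\left(1/\sqrt{3}\right)^{\sigma}}\right).
\end{aligned}
\tag{8.42}
\]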

We have assumed throughout that $\ell \ge 0$ and $\tau \ge 0$. We can immediately obtain a bound valid for $\ell \le 0$, $\tau \le 0$ by reflection on the $x$-axis: we simply put absolute values around $\tau$ and $\ell$ in (8.41).

We see that we have obtained a bound in a neat, closed form without too much effort. Of course, this effortlessness is usually in part illusory: the contour we have used here is actually the product of some trial and error, in that some other contours give results that are comparable in quality but harder to simplify. We will have to choose a different contour when $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$.

8.4.2 Another simple contour

We now wish to give a bound for the case of $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$, i.e., $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$. We expect a much smaller upper bound than for $\mathrm{sgn}(\ell) = \mathrm{sgn}(\tau)$, given what we already know from the method of stationary phase. This also means that we will not need to be as careful in order to get a bound that is good enough for all practical purposes.

Our contour $L$ will consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge 0\}$, (b) the quarter-circle from $(x_0, 0)$ to $(0, -x_0)$ (that is, an arc where the argument runs from $0$ to $-\pi/2$) and (c) the straight vertical ray $\{(0, y) : y \le -x_0\}$. We call these segments $L_1$, $L_2$, $L_3$, and define the integrals $I_1$, $I_2$ and $I_3$ just as in (8.29).

Much as before, we have
\[
I_1 \le x_0^{-\sigma}\int_0^{\infty} e^{-\Re\phi(x_0 + iy)}\,dy.
\]
Since (8.33) is valid for $t \ge 0$, (8.34) holds, and so
\[
I_1 \le x_0^{-\sigma} e^{-\Re\phi(u_{0,+})}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y - y_0)^2}\,dy = x_0^{-\sigma}\sqrt{2\pi}\cdot e^{-\Re\phi(u_{0,+})}.
\]
By (8.12) and (8.30),
\[
I_2 \le x_0^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\,|du| = x_0^{1-\sigma}\int_0^{\pi/2} e^{-\left(-\frac{x_0^2\cos 2\alpha}{2} + \ell x_0\sin\alpha + \tau\alpha\right)}\,d\alpha. \tag{8.43}
\]

Now, for $\alpha \ge 0$ and $\ell \le 0$,
\[
(\ell x_0\sin\alpha + \tau\alpha)' = \ell x_0\cos\alpha + \tau \ge \ell x_0 + \tau.
\]
Since $j = \sqrt{1 + \rho^2} \le 1 + \rho^2/2$, we have $\sqrt{(j-1)/2} \le \rho/2$, and so, by (8.21), $|\ell x_0| \le \ell^2\rho/4 = \tau$, and thus $\ell x_0 + \tau \ge 0$. In other words, the exponent in (8.43) equals $(x_0^2\cos 2\alpha)/2$ minus an increasing function, and so, since $\Re\phi(x_0) = -x_0^2/2$,
\[
I_2 \le x_0^{-\sigma}\cdot x_0\int_0^{\pi/2} e^{\frac{x_0^2\cos 2\alpha}{2}}\,d\alpha = x_0^{-\sigma}\cdot\frac{\pi}{2}x_0\cdot I_0(x_0^2/2),
\]
where $I_0(t) = \frac{1}{\pi}\int_0^{\pi} e^{t\cos\theta}\,d\theta$ is the modified Bessel function of the first kind (and order 0).

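Since $\cos\theta = \sqrt{1 - \sin^2\theta} < 1 - (\sin^2\theta)/2 \le 1 - 2\theta^2/\pi^2$, we have³
\[
I_0(t) \le \frac{1}{\pi}\int_0^{\pi} e^{t\left(1 - \frac{2\theta^2}{\pi^2}\right)}\,d\theta < e^t\cdot\frac{1}{\pi}\int_0^{\infty} e^{-\frac{2t}{\pi^2}\theta^2}\,d\theta
= \frac{e^t}{\pi}\cdot\frac{\pi}{\sqrt{2t}}\cdot\frac{\sqrt{\pi}}{2} = \frac{\sqrt{\pi}}{2^{3/2}}\cdot\frac{e^t}{\sqrt{t}}
\]
for $t \ge 0$. Using the fact that $\Re\phi(x_0) = -x_0^2/2$, we conclude that
\[
I_2 \le x_0^{-\sigma}\cdot\frac{\pi}{2}x_0\cdot\frac{\sqrt{\pi}}{2^{3/2}}\cdot\frac{e^{x_0^2/2}}{x_0/\sqrt{2}} = \frac{\pi^{3/2}}{4}\, x_0^{-\sigma}\, e^{-\Re\phi(x_0)}.
\]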
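The bound $I_0(t) \le \frac{\sqrt{\pi}}{2^{3/2}} e^t/\sqrt{t}$ can be sanity-checked numerically. The following is a minimal sketch, not a rigorous verification: it assumes plain double-precision arithmetic and evaluates $I_0$ through its power series $\sum_k (t^2/4)^k/(k!)^2$; the helper names `bessel_i0`, `simple_bound` and `worst_ratio` are illustrative and do not come from the text.

```python
import math


def bessel_i0(t: float, terms: int = 200) -> float:
    """I_0(t) via the power series sum_{k>=0} (t^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    q = t * t / 4.0
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)  # ratio of consecutive series terms
    return total


def simple_bound(t: float) -> float:
    """The elementary bound sqrt(pi)/2^(3/2) * e^t / sqrt(t), for t > 0."""
    return math.sqrt(math.pi) / 2 ** 1.5 * math.exp(t) / math.sqrt(t)


def worst_ratio(ts) -> float:
    """Largest value of I_0(t) * sqrt(t) * e^(-t) over the sample points."""
    return max(bessel_i0(t) * math.sqrt(t) * math.exp(-t) for t in ts)


if __name__ == "__main__":
    grid = [0.05 * i for i in range(1, 601)]  # 0 < t <= 30
    print("sup on grid:", worst_ratio(grid))
    print("constant in the bound:", math.sqrt(math.pi) / 2 ** 1.5)
```

On such a grid the observed supremum of $I_0(t)\sqrt{t}\,e^{-t}$ stays well below $\sqrt{\pi}/2^{3/2} \approx 0.6267$, which is what the elementary argument predicts; a floating-point scan of this kind is of course only an indication, not a proof.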

By (8.34), which is valid for all $\ell$, we know that $\Re\phi(x_0) \ge \Re\phi(u_{0,+})$.

Let us now estimate the integral on $L_3$. Again by (8.30), for $y < 0$,
\[
\Re\phi(iy) = \frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}.
\]
Hence
\[
\left|\int_{L_3} e^{-\phi(u)}u^{-\sigma}\,du\right| \le x_0^{-\sigma}\int_{-\infty}^{-x_0} e^{-\left(\frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}\right)}\,dy
= x_0^{-\sigma}\, e^{\frac{1}{2}\ell^2}\, e^{-\frac{\tau\pi}{2}}\int_{-\infty}^{-x_0} e^{-\frac{1}{2}(y - \ell)^2}\,dy \le x_0^{-\sigma}\, e^{-\frac{\tau\pi}{2}}\sqrt{\frac{\pi}{2}},
\]
since $y - \ell \le -\ell$ for $y \le -x_0$ and $\int_{-\infty}^{-\ell} e^{-t^2/2}\,dt \le \sqrt{\pi/2}\cdot e^{-\ell^2/2}$ (by [AS64, 7.1.13]).

Now that we have bounded the integrals over $L_1$, $L_2$ and $L_3$, it remains to bound $x_0$ from below, starting from (8.21). We will bound it differently for $\rho < 3/2$ and for $\rho \ge 3/2$. (The choice of $3/2$ is fairly arbitrary.)

Expanding $(\sqrt{1+t} - 1)^2 > 0$, we obtain that $2(1+t) - 2\sqrt{1+t} \ge t$ for all $t \ge -1$, and so
\[
\left(\frac{\sqrt{1+t} - 1}{t}\right)' = \frac{1}{t^2}\left(\frac{t}{2\sqrt{1+t}} - (\sqrt{1+t} - 1)\right) < 0,
\]
i.e., $(\sqrt{1+t} - 1)/t$ decreases as $t$ increases. Hence, for $\rho \le \rho_0$, where $\rho_0 \ge 0$,
\[
j(\rho) = \sqrt{1 + \rho^2} \ge 1 + \frac{\sqrt{1 + \rho_0^2} - 1}{\rho_0^2}\,\rho^2, \tag{8.44}
\]
which equals $1 + (2/9)(\sqrt{13} - 2)\rho^2$ for $\rho_0 = 3/2$. Thus, for $\rho \le 3/2$,
\[
x_0 \ge \frac{|\ell|}{2}\sqrt{\frac{\frac{2}{9}(\sqrt{13} - 2)\rho^2}{2}} = \frac{\sqrt{\sqrt{13} - 2}}{6}\,|\ell|\rho
= \frac{2\sqrt{\sqrt{13} - 2}}{3}\cdot\frac{\tau}{|\ell|} \ge 0.84473\,\frac{|\tau|}{|\ell|}. \tag{8.45}
\]

³ It is actually not hard to prove rigorously the better bound $I_0(t) \le 0.468823\, e^t/\sqrt{t}$. For $t \ge 8$, this can be done directly by the change of variables $\cos\theta = 1 - 2s^2$, $d\theta = 2\,ds/\sqrt{1 - s^2}$, followed by the usage of different upper bounds on the integrand $\exp(-2ts^2)/\sqrt{1 - s^2}$ for $0 \le s \le 1/2$ and $1/2 \le s \le 1$. (Thanks are due to G. Kuperberg for this argument.) For $t < 8$, use the Taylor expansion of $I_0(t)$ around $t = 0$ [AS64, (9.6.12)], truncate it after 16 terms, and then bound the maximum of the truncated series by the bisection method, implemented via interval arithmetic (as described in §2.6).

On the other hand,
\[
\left(\frac{j(\rho) - 1}{\rho}\right)' = \frac{1}{\rho^2}\left(j'(\rho)\rho - (j(\rho) - 1)\right) = \frac{\rho^2 - (1 + \rho^2) + \sqrt{1 + \rho^2}}{\rho^2\sqrt{1 + \rho^2}} \ge 0,
\]
and so, for $\rho \ge 3/2$, $(j(\rho) - 1)/\rho$ is minimal at $\rho = 3/2$, where it takes the value $(\sqrt{13} - 2)/3$. Hence
\[
x_0 = \frac{|\ell|}{2}\sqrt{\frac{j(\rho) - 1}{2}} \ge \frac{|\ell|\sqrt{\rho}}{2}\cdot\frac{\sqrt{\sqrt{13} - 2}}{\sqrt{6}} = \frac{\sqrt{\sqrt{13} - 2}}{\sqrt{6}}\,\sqrt{\tau} \ge 0.51729\sqrt{\tau}. \tag{8.46}
\]
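As a quick numerical illustration of the two lower bounds on $x_0$, the sketch below evaluates the exact expression $x_0 = \frac{|\ell|}{2}\sqrt{(j(\rho)-1)/2}$ with $\rho = 4\tau/\ell^2$ and compares it with $0.84473\,\tau/|\ell|$ (for $\rho \le 3/2$) and $0.51729\sqrt{\tau}$ (for $\rho \ge 3/2$). The function names are hypothetical, and the comparison is done in ordinary floating point, so it is a plausibility check rather than a proof.

```python
import math

# Exact forms of the constants appearing in the two bounds.
C_SMALL_RHO = 2 * math.sqrt(math.sqrt(13) - 2) / 3          # multiplies tau/|ell|
C_LARGE_RHO = math.sqrt(math.sqrt(13) - 2) / math.sqrt(6)   # multiplies sqrt(tau)


def x0(ell: float, tau: float) -> float:
    """x_0 = (|ell|/2) * sqrt((j(rho) - 1)/2), where rho = 4*tau/ell^2."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho * rho)
    return abs(ell) / 2 * math.sqrt((j - 1) / 2)


def lower_bound(ell: float, tau: float) -> float:
    """The rounded lower bound from the text, chosen according to rho."""
    rho = 4 * tau / ell ** 2
    if rho <= 1.5:
        return 0.84473 * tau / abs(ell)
    return 0.51729 * math.sqrt(tau)


if __name__ == "__main__":
    print(C_SMALL_RHO, C_LARGE_RHO)
    for ell, tau in [(1.0, 0.1), (2.0, 1.0), (1.0, 20.0), (10.0, 30.0)]:
        print(ell, tau, x0(ell, tau), lower_bound(ell, tau))
```

Both bounds become equalities (up to rounding of the constants) at $\rho = 3/2$, which is why the same quantity $\sqrt{13} - 2$ appears in each.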

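We now sum $I_1$, $I_2$ and $I_3$, and then use (8.11); we obtain that, when $\ell < 0$ and $\tau \ge 0$,
\[
|F_\delta(s)| \le \frac{e^{-2\pi^2\delta^2}|\Gamma(s)|}{\sqrt{2\pi}}\left|\int_L e^{-\phi(u)}u^{-\sigma}\,du\right|
\le |x_0|^{-\sigma}\left(\left(1 + \frac{\pi}{2^{3/2}}\right)e^{-\Re\phi(u_{0,+})} + \frac{1}{2}e^{-\frac{\tau\pi}{2}}\right)e^{-\frac{1}{2}\ell^2}|\Gamma(s)|. \tag{8.47}
\]
By (8.19), (8.21) and (8.24),
\[
-\Re(\phi(u_{0,+})) = \frac{\ell^2}{4}(1 - \upsilon(\rho)) + \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} < \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} \le \frac{\pi}{4}\tau.
\]
We conclude that, when $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$ (i.e., $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$),
\[
|F_\delta(s)| \le |x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}\,|\Gamma(s)|\,e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1 + \frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right),
\]
where $x_0$ can be bounded as in (8.45) and (8.46). Here, as before, we are reducing the case $\tau < 0$ to the case $\tau > 0$ by reflection. This concludes the proof of Theorem 8.0.1.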

8.5 Conclusions

We have obtained bounds on $|F_\delta(s)|$ for $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ (8.41) and for $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ (8.47). Our task is now to simplify them.

First, let us look at the exponent $E(\rho)$ defined as in (8.2). Its plot can be seen in Figure 8.3. We claim that
\[
E(\rho) \ge
\begin{cases}
0.1598 & \text{if } \rho \ge 1.5,\\
0.1065\rho & \text{if } \rho < 1.5.
\end{cases}
\tag{8.48}
\]
This is so for $\rho \ge 1.5$ because $E(\rho)$ is increasing on $\rho$ and $E(1.5) = 0.15982\ldots$. The case $\rho < 1.5$ is a little more delicate. We can easily see that $\arccos(1 - t^2/2) \ge t$ for $0 \le t \le 2$ (since the derivative of the left side is $1/\sqrt{1 - t^2/4}$, which is always $\ge 1$). We also have
\[
1 + \frac{\rho^2}{2} - \frac{\rho^4}{8} \le j(\rho) \le 1 + \frac{\rho^2}{2}
\]
for $0 \le \rho \le \sqrt{8}$, and so
\[
1 + \frac{\rho^2}{8} - \frac{5\rho^4}{128} \le \upsilon(\rho) \le 1 + \frac{\rho^2}{8}
\]
for $0 \le \rho \le \sqrt{32/5}$; this, in turn, gives us that $1/\upsilon(\rho) \le 1 - \rho^2/8 + 7\rho^4/128$ (again for $0 \le \rho \le \sqrt{32/5}$), and so $1/\upsilon(\rho) \le 1 - (1 - 7/64)\rho^2/8$ for $0 \le \rho \le 1/2$. We conclude that
\[
\arccos\frac{1}{\upsilon(\rho)} \ge \frac{1}{2}\sqrt{\frac{57}{64}}\,\rho;
\]
therefore,
\[
E(\rho) \ge \frac{1}{4}\sqrt{\frac{57}{64}}\,\rho - \frac{\rho}{8} > 0.11093\rho > 0.1065\rho.
\]
In the remaining range $1/2 \le \rho \le 3/2$, we prove that $E(\rho)/\rho > 0.106551$ using the bisection method (with 20 iterations) implemented by means of interval arithmetic. This concludes the proof of (8.48).

Figure 8.3: The function $E(\rho)$
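The claim about $E(\rho)$ is easy to explore numerically. The sketch below is only a floating-point illustration, not the interval-arithmetic bisection used in the proof; it assumes the definitions $j(\rho) = \sqrt{1+\rho^2}$, $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$ and $E(\rho) = \frac12\arccos(1/\upsilon(\rho)) - (\upsilon(\rho)-1)/\rho$, and the function names are illustrative.

```python
import math


def upsilon(rho: float) -> float:
    """upsilon(rho) = sqrt((1 + j(rho)) / 2) with j(rho) = sqrt(1 + rho^2)."""
    return math.sqrt((1 + math.sqrt(1 + rho * rho)) / 2)


def E(rho: float) -> float:
    """E(rho) = (1/2) arccos(1/upsilon(rho)) - (upsilon(rho) - 1)/rho, rho > 0."""
    u = upsilon(rho)
    return 0.5 * math.acos(1 / u) - (u - 1) / rho


def min_ratio_below(rho_max: float = 1.5, steps: int = 150) -> float:
    """Smallest sampled value of E(rho)/rho for rho in (0, rho_max]."""
    return min(E(rho_max * i / steps) / (rho_max * i / steps)
               for i in range(1, steps + 1))


if __name__ == "__main__":
    print("E(1.5)          =", E(1.5))
    print("min E(rho)/rho  =", min_ratio_below())
```

On a uniform grid the ratio $E(\rho)/\rho$ is smallest at the right endpoint $\rho = 3/2$, which is consistent with the constant $0.1065$ in the case $\rho < 1.5$ being essentially $E(1.5)/1.5$.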

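Assume from this point onwards that $|\tau| \ge 20$. Let us show that the contribution of (8.3) is negligible relative to that of (8.1). Indeed,
\[
\left(\left(1 + \frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right) \le \frac{7.8}{10^6}\, e^{-0.1598\tau}.
\]
It is useful to note that $e^{-\ell^2/2} = e^{-2\tau/\rho}$, and so, for $\sigma \le k + 1$ and $\rho \le 3/2$,
\[
\frac{e^{-2\tau/\rho}}{(0.84473\,|\tau|/\ell)^{\sigma}} \le \frac{e^{-40/\rho}}{\left(\frac{0.84473}{4}\rho\right)^{\sigma}\ell^{\sigma}}
\le \frac{1}{\ell^{\sigma}}\left(\frac{4}{0.84473\cdot 1.5}\right)^{\sigma}\frac{e^{-80/(3t)}}{t^{\sigma}}
\le \frac{1}{\ell^{\sigma}}\cdot 3.15683^{k+1}\,\frac{e^{-80/(3t)}}{t^{k+1}}, \tag{8.49}
\]
where $t = 2\rho/3 \le 1$. Since $e^{-c/t}/t^{k+1}$ attains its maximum at $t = c/(k+1)$,
\[
\frac{e^{-80/(3t)}}{t^{k+1}} \le e^{-(k+1)}\left(\frac{3(k+1)}{80}\right)^{k+1},
\]
and so, for $\rho \le 3/2$,
\[
|x_0|^{-\sigma}e^{-\frac{1}{2}\ell^2} \le \frac{1}{\ell^{\sigma}}\cdot
\begin{cases}
0.04355 & \text{if } 0 \le \sigma \le 1,\\
0.00759 & \text{if } 1 \le \sigma \le 2,\\
0.00224 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\]
whereas $|x_0|^{-\sigma}e^{-\ell^2/2} \le |x_0|^{-\sigma} \le (0.51729\sqrt{\tau})^{-\sigma}$ for $\rho \ge 3/2$.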
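The three table entries follow from maximizing $e^{-c/t}/t^{k+1}$ at $t = c/(k+1)$ with $c = 80/3$. A minimal sketch of that arithmetic, assuming the constant $4/(0.84473\cdot 1.5)$ from the preceding chain of inequalities (the function name `table_entry` is illustrative only):

```python
import math


def table_entry(k: int) -> float:
    """(4/(0.84473*1.5))^(k+1) * e^(-(k+1)) * (3(k+1)/80)^(k+1) for k = 0, 1, 2."""
    a = 4 / (0.84473 * 1.5)  # about 3.15683
    return (a * 3 * (k + 1) / (80 * math.e)) ** (k + 1)


if __name__ == "__main__":
    for k in range(3):
        print(k, table_entry(k))
```

The printed values can be compared directly with the entries $0.04355$, $0.00759$ and $0.00224$ for $0 \le \sigma \le 1$, $1 \le \sigma \le 2$ and $2 \le \sigma \le 3$.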

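We conclude that, for $|\tau| \ge 20$ and $\sigma \le 3$,
\[
|F_\delta(s)| \le |\Gamma(s)|\,e^{\frac{\pi}{2}\tau}\cdot e^{-0.1598\tau}\cdot
\begin{cases}
\dfrac{4}{10^7}\cdot\dfrac{1}{\ell^{\sigma}} & \text{if } \rho \le 3/2,\\[2ex]
\dfrac{6}{10^5}\cdot\dfrac{1}{\tau^{\sigma/2}} & \text{if } \rho \ge 3/2,
\end{cases}
\tag{8.50}
\]
provided that $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$. This will indeed be negligible compared to our bound for the case $\mathrm{sgn}(\delta) = -\mathrm{sgn}(\tau)$.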

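Let us now deal with the factor $|\Gamma(s)|e^{\frac{\pi}{2}\tau}$. By Stirling's formula with remainder term [GR94, (8.344)],
\[
\log\Gamma(s) = \frac{1}{2}\log(2\pi) + \left(s - \frac{1}{2}\right)\log s - s + \frac{1}{12s} + R_2(s),
\]
where
\[
|R_2(s)| < \frac{1/30}{12|s|^3\cos^3\left(\frac{\arg s}{2}\right)} \le \frac{\sqrt{2}}{180|s|^3}
\]
for $\Re(s) \ge 0$. The real part of $(s - 1/2)\log s - s$ is
\[
(\sigma - 1/2)\log|s| - \tau\arg(s) - \sigma = (\sigma - 1/2)\log|s| - \frac{\pi}{2}\tau + \tau\left(\arctan\frac{\sigma}{|\tau|} - \frac{\sigma}{|\tau|}\right)
\]
for $s = \sigma + i\tau$, $\sigma \ge 0$. Since $\arctan(r) \le r$ for $r \ge 0$, we conclude that
\[
|\Gamma(s)|e^{\frac{\pi}{2}\tau} \le \sqrt{2\pi}\,|s|^{\sigma - \frac{1}{2}}\, e^{\frac{1}{12|s|} + \frac{\sqrt{2}}{180|s|^3}}. \tag{8.51}
\]
Lastly, $|s|^{\sigma - 1/2} = |\tau|^{\sigma - 1/2}\,|1 + i\sigma/\tau|^{\sigma - 1/2}$. For $|\tau| \ge 20$,
\[
|1 + i\sigma/\tau|^{\sigma - 1/2} \le
\begin{cases}
1.000625 & \text{if } 0 \le \sigma \le 1,\\
1.007491 & \text{if } 1 \le \sigma \le 2,\\
1.028204 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\]
and
\[
e^{\frac{1}{12|\tau|} + \frac{\sqrt{2}}{180|\tau|^3}} \le 1.004177.
\]
Thus
\[
|\Gamma(s)|e^{\frac{\pi}{2}\tau} \le |\tau|^{\sigma - 1/2}\cdot
\begin{cases}
2.51868 & \text{if } 0 \le \sigma \le 1,\\
2.53596 & \text{if } 1 \le \sigma \le 2,\\
2.5881 & \text{if } 2 \le \sigma \le 3.
\end{cases}
\tag{8.52}
\]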
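The constants in (8.52) are the product of $\sqrt{2\pi}$, the bound on $|1 + i\sigma/\tau|^{\sigma - 1/2}$ at the right endpoint of each $\sigma$-range, and the bound on the exponential factor at $|\tau| = 20$. A small sketch of that product, under the assumption that the worst case in each range is attained at the largest $\sigma$ and the smallest $|\tau|$ (the function name is illustrative):

```python
import math


def gamma_factor_constant(sigma_max: float, tau_min: float = 20.0) -> float:
    """sqrt(2*pi) * |1 + i*sigma/tau|^(sigma - 1/2) * exp(1/(12 tau) + sqrt(2)/(180 tau^3)),
    evaluated at sigma = sigma_max and |tau| = tau_min."""
    modulus = (1 + (sigma_max / tau_min) ** 2) ** ((sigma_max - 0.5) / 2)
    remainder = math.exp(1 / (12 * tau_min) + math.sqrt(2) / (180 * tau_min ** 3))
    return math.sqrt(2 * math.pi) * modulus * remainder


if __name__ == "__main__":
    for sigma_max in (1, 2, 3):
        print(sigma_max, gamma_factor_constant(sigma_max))
```

The three outputs can be compared with $2.51868$, $2.53596$ and $2.5881$.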

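Let us now estimate the constants $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ in (8.2). By $|\tau| \ge 20$,
\[
e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau} \le 0.015889, \qquad e^{-\frac{\tau}{6}} \le 0.035674. \tag{8.53}
\]
Since $8\sin(\pi/8) = 3.061467\ldots > 1$, we obtain that
\[
c_{1,\sigma,\tau} \le
\begin{cases}
1.30454 & \text{if } 0 \le \sigma \le 1,\\
1.58361 & \text{if } 1 \le \sigma \le 2,\\
1.98186 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\qquad
c_{2,\sigma,\tau} \le
\begin{cases}
1.94511 & \text{if } 0 \le \sigma \le 1,\\
3.15692 & \text{if } 1 \le \sigma \le 2,\\
5.02186 & \text{if } 2 \le \sigma \le 3.
\end{cases}
\]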
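The bounds on $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ can be reproduced by evaluating both expressions at $\tau = 20$ and at the right endpoint of each $\sigma$-range. The sketch below assumes the forms $c_1 = \frac12\bigl(1 + 2^{1/4}(2/(1+\sin^2\frac{\pi}{8}))^{\sigma/2} + e^{-\frac{\sqrt2-1}{2}\tau}/(\tan\frac{\pi}{8})^{\sigma}\bigr)$ and $c_2 = \frac12\bigl(1 + \min(2^{\sigma+1/2}, \sqrt{\sec\frac{2\pi}{5}}/(\sin\frac{\pi}{5})^{\sigma}) + e^{-\tau/6}\,3^{\sigma/2}\bigr)$; the function names are illustrative.

```python
import math


def c1(sigma: float, tau: float) -> float:
    """Constant for arbitrary rho (contour with beta = pi/8)."""
    s2 = math.sin(math.pi / 8) ** 2
    middle = 2 ** 0.25 * (2 / (1 + s2)) ** (sigma / 2)
    tail = math.exp(-(math.sqrt(2) - 1) / 2 * tau) / math.tan(math.pi / 8) ** sigma
    return 0.5 * (1 + middle + tail)


def c2(sigma: float, tau: float) -> float:
    """Constant for rho <= 3/2 (contours with beta = pi/5 or pi/6)."""
    middle = min(2 ** (sigma + 0.5),
                 math.sqrt(1 / math.cos(2 * math.pi / 5)) / math.sin(math.pi / 5) ** sigma)
    tail = math.exp(-tau / 6) * math.sqrt(3) ** sigma
    return 0.5 * (1 + middle + tail)


if __name__ == "__main__":
    for sigma in (1, 2, 3):
        print(sigma, c1(sigma, 20.0), c2(sigma, 20.0))
```

Both expressions are increasing in $\sigma$ and decreasing in $\tau$, so the values at $(\sigma, \tau) = (k+1, 20)$ bound the whole range $k \le \sigma \le k+1$, $\tau \ge 20$.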

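Lastly, note that, for $k \le \sigma \le k + 1$, we have
\[
\frac{1}{\tau^{\sigma/2}}\cdot|\tau|^{\sigma - 1/2} = |\tau|^{(\sigma - 1)/2} \le \tau^{k/2},
\]
whereas, for $\rho \le 3/2$ and $0 \le \gamma \le 1$,
\[
\frac{|\tau|^{\gamma - 1/2}}{|\ell|^{\gamma}} \le |\tau|^{\frac{\gamma}{2} - \frac{1}{2}}\left(\frac{\tau}{\ell^2}\right)^{\gamma/2} \le 20^{\frac{\gamma}{2} - \frac{1}{2}}\left(\frac{3/2}{4}\right)^{\gamma/2} \le \left(\frac{3}{8}\right)^{1/2},
\]
and so
\[
\frac{1}{\ell^{\sigma}}\cdot|\tau|^{\sigma - 1/2} = \left(\frac{|\tau|}{\ell}\right)^{k}\frac{|\tau|^{(\sigma - k) - 1/2}}{|\ell|^{\sigma - k}} \le \sqrt{\frac{3}{8}}\cdot\left(\frac{|\tau|}{\ell}\right)^{k}.
\]
Multiplying, and remembering to add (8.50), we obtain that, for $k = 0, 1, 2$, $\sigma \in [0, 1]$ and $|\tau| \ge 20$,
\[
|F_\delta(s + k)| + |F_\delta((1 - s) + k)| \le
\begin{cases}
\kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } \rho < 3/2,\\[2ex]
\kappa_{k,1}\,|\tau|^{k}\, e^{-0.1598|\tau|} & \text{if } \rho \ge 3/2,
\end{cases}
\]
where
\[
\begin{aligned}
\kappa_{0,0} &\le (4\cdot 10^{-7} + 1.94511)\cdot 2.51868\cdot\sqrt{3/8} \le 3.001,\\
\kappa_{1,0} &\le (4\cdot 10^{-7} + 3.15692)\cdot 2.53596\cdot\sqrt{3/8} \le 4.903,\\
\kappa_{2,0} &\le (4\cdot 10^{-7} + 5.02186)\cdot 2.5881\cdot\sqrt{3/8} \le 7.96,
\end{aligned}
\]
and, similarly,
\[
\begin{aligned}
\kappa_{0,1} &\le (6\cdot 10^{-5} + 1.30454)\cdot 2.51868 \le 3.286,\\
\kappa_{1,1} &\le (6\cdot 10^{-5} + 1.58361)\cdot 2.53596 \le 4.017,\\
\kappa_{2,1} &\le (6\cdot 10^{-5} + 1.98186)\cdot 2.5881 \le 5.13.
\end{aligned}
\]
This concludes the proof of Corollary 8.0.2.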
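The six constants $\kappa_{k,0}$, $\kappa_{k,1}$ are plain products of quantities bounded earlier, so the final arithmetic is simple to re-run. A minimal sketch, taking the rounded inputs from (8.50), (8.52) and the bounds on $c_{1,\sigma,\tau}$, $c_{2,\sigma,\tau}$ as given (the list and function names are illustrative):

```python
import math

GAMMA = [2.51868, 2.53596, 2.5881]   # constants from (8.52), k = 0, 1, 2
C1 = [1.30454, 1.58361, 1.98186]     # bounds on c_1 for k <= sigma <= k + 1
C2 = [1.94511, 3.15692, 5.02186]     # bounds on c_2 for k <= sigma <= k + 1


def kappa_small_rho(k: int) -> float:
    """kappa_{k,0}: case rho < 3/2, including the 4e-7 term from (8.50)."""
    return (4e-7 + C2[k]) * GAMMA[k] * math.sqrt(3 / 8)


def kappa_large_rho(k: int) -> float:
    """kappa_{k,1}: case rho >= 3/2, including the 6e-5 term from (8.50)."""
    return (6e-5 + C1[k]) * GAMMA[k]


if __name__ == "__main__":
    for k in range(3):
        print(k, kappa_small_rho(k), kappa_large_rho(k))
```

Each product lands just below the rounded value quoted in the text, which shows how little slack the final rounding introduces.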

Chapter 9

Explicit formulas

An explicit formula is an expression restating a sum such as $S_{\eta,\chi}(\delta/x, x)$ as a sum of the Mellin transform $G_\delta(s)$ over the zeros of the $L$-function $L(s, \chi)$. More specifically, for us, $G_\delta(s)$ is the Mellin transform of $\eta(t)e(\delta t)$ for some smoothing function $\eta$ and some $\delta \in \mathbb{R}$. We want a formula whose error terms are good both for $\delta$ very close or equal to $0$ and for $\delta$ farther away from $0$. (Indeed, our choice(s) of $\eta$ will be made so that $F_\delta(s)$ decays rapidly in both cases.)

We will be able to base all of our work on a single, general explicit formula, namely, Lemma 9.1.1. This explicit formula has simple error terms given purely in terms of a few norms of the given smoothing function $\eta$. We also give a common framework for estimating the contribution of zeros on the critical strip (Lemmas 9.1.3 and 9.1.4).

The first example we work out is that of the Gaussian smoothing $\eta(t) = e^{-t^2/2}$. We actually do this in part for didactic purposes and in part because of its likely applicability elsewhere; for our applications, we will always use smoothing functions based on $te^{-t^2/2}$ and $t^2e^{-t^2/2}$, generally in combination with something else. Since $\eta(t) = e^{-t^2/2}$ does not vanish at $t = 0$, its Mellin transform has a pole at $s = 0$, something that requires some additional work (Lemma 9.1.2; see also the proof of Lemma 9.1.1).

Other than that, for each function $\eta(t)$, all that has to be done is to bound an integral (from Lemma 9.1.3) and bound a few norms. Still, both for $\eta_*$ and for $\eta_+$, we find a few interesting complications. Since $\eta_+$ is defined in terms of a truncation of a Mellin transform (or, alternatively, in terms of a multiplicative convolution with a Dirichlet kernel, as in (7.4) and (7.6)), bounding the norms of $\eta_+$ and $\eta_+'$ takes a little work. We leave this to Appendix A. The effect of the convolution is then just to delay the decay by a shift, in that a rapidly decaying function $f(\tau)$ will get replaced by $f(\tau - H)$, $H$ a constant.

The smoothing function $\eta_*$ is defined as a multiplicative convolution of $t^2e^{-t^2/2}$ with something else. Given that we have an explicit formula for $t^2e^{-t^2/2}$, we obtain an explicit formula for $\eta_*$ by what amounts to just exchanging the order of a sum and an integral. (We already went over this in the introduction, in (1.40).)

9.1 A general explicit formula

We will prove an explicit formula valid whenever the smoothing $\eta$ and its derivative $\eta'$ satisfy rather mild assumptions: they will be assumed to be $L_2$-integrable and to have strips of definition containing $\{s : 1/2 \le \Re(s) \le 3/2\}$, though any strip of the form $\{s : \epsilon \le \Re(s) \le 1 + \epsilon\}$ would do just as well.

(For explicit formulas with different sets of assumptions, see, e.g., [IK04, §5.5] and [MV07, Ch. 12].)

The main idea in deriving any explicit formula is to start with an expression giving a sum as an integral over a vertical line with an integrand involving a Mellin transform (here, $G_\delta(s)$) and an $L$-function (here, $L(s, \chi)$). We then shift the line of integration to the left. If stronger assumptions were made (as in Exercise 5 in [IK04, §5.5]), we could shift the integral all the way to $\Re(s) = -\infty$; the integral would then disappear, replaced entirely by a sum over zeros (or even, as in the same Exercise 5, by a particularly simple integral). Another possibility is to shift the line only to $\Re(s) = 1/2 + \epsilon$ for some $\epsilon > 0$, but this gives a weaker result, and, at any rate, the factor $L'(s, \chi)/L(s, \chi)$ can be large and messy to estimate within the critical strip $0 < \Re(s) < 1$.

Instead, we will shift the line to $\Re s = -1/2$. We can do this because the assumptions on $\eta$ and $\eta'$ are enough to continue $G_\delta(s)$ analytically up to there (with a possible pole at $s = 0$). The factor $L'(s, \chi)/L(s, \chi)$ is easy to estimate for $\Re s < 0$ and $s = 0$ (by the functional equation), and the part of the integral on $\Re s = -1/2$ coming from $G_\delta(s)$ can be estimated easily using the fact that the Mellin transform is an isometry.

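Lemma 9.1.1. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be in $C^1$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Assume that $\eta(t)$ and $\eta'(t)$ are in $\ell_2$ (with respect to the measure $dt$) and that $\eta(t)t^{\sigma - 1}$ and $\eta'(t)t^{\sigma - 1}$ are in $\ell_1$ (again with respect to $dt$) for all $\sigma$ in an open interval containing $[1/2, 3/2]$.

Then
\[
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)\,e\!\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)x - \sum_{\rho} G_\delta(\rho)x^{\rho}
- R + O^*\left((\log q + 6.01)\cdot(|\eta'|_2 + 2\pi|\delta||\eta|_2)\right)x^{-1/2}, \tag{9.1}
\]
where
\[
I_{q=1} = \begin{cases} 1 & \text{if } q = 1,\\ 0 & \text{if } q \ne 1, \end{cases}
\qquad
R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1, \chi)}{L(1, \chi)}\right) + O^*(c_0) \tag{9.2}
\]
for $q > 1$, $R = \eta(0)\log 2\pi$ for $q = 1$, and
\[
c_0 = \frac{2}{3}\,O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + |\eta(t)\sqrt{t}|_1\right)\right). \tag{9.3}
\]
The norms $|\eta|_2$, $|\eta'|_2$, $|\eta'(t)/\sqrt{t}|_1$, etc., are taken with respect to the usual measure $dt$. The sum $\sum_\rho$ is a sum over all non-trivial zeros $\rho$ of $L(s, \chi)$.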

Proof. Since (a) $\eta(t)t^{\sigma - 1}$ is in $\ell_1$ for $\sigma$ in an open interval containing $3/2$ and (b) $\eta(t)e(\delta t)$ has bounded variation (since $\eta, \eta' \in \ell_1$, implying that the derivative of $\eta(t)e(\delta t)$ is also in $\ell_1$), the Mellin inversion formula (as in, e.g., [IK04, 4.106]) holds:
\[
\eta(n/x)\,e(\delta n/x) = \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} G_\delta(s)\,x^s n^{-s}\,ds.
\]
Since $G_\delta(s)$ is bounded for $\Re(s) = 3/2$ (by $\eta(t)t^{3/2 - 1} \in \ell_1$) and $\sum_n \Lambda(n)n^{-3/2}$ is bounded as well, we can change the order of summation and integration as follows:
\[
\begin{aligned}
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)\,e(\delta n/x)\,\eta(n/x) &= \sum_{n=1}^{\infty}\Lambda(n)\chi(n)\cdot\frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} G_\delta(s)\,x^s n^{-s}\,ds\\
&= \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty}\sum_{n=1}^{\infty}\Lambda(n)\chi(n)\,G_\delta(s)\,x^s n^{-s}\,ds\\
&= \frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} -\frac{L'(s, \chi)}{L(s, \chi)}\,G_\delta(s)\,x^s\,ds.
\end{aligned}
\tag{9.4}
\]
(This is the way the procedure always starts: see, for instance, [HL22, Lemma 1] or, to look at a recent standard reference, [MV07, p. 144]. We are being very scrupulous about integration because we are working with general $\eta$.)

The first question we should ask ourselves is: up to where can we extend $G_\delta(s)$? Since $\eta(t)t^{\sigma - 1}$ is in $\ell_1$ for $\sigma$ in an open interval $I$ containing $[1/2, 3/2]$, the transform $G_\delta(s)$ is defined for $\Re(s)$ in the same interval $I$. However, we also know that the transformation rule $M(tf'(t))(s) = -s\cdot Mf(s)$ (see (2.10); by integration by parts) is valid when $s$ is in the holomorphy strip for both $M(tf'(t))$ and $Mf$. In our case ($f(t) = \eta(t)e(\delta t)$), this happens when $\Re(s) \in (I - 1) \cap I$ (so that both sides of the equation in the rule are defined). Hence $s\cdot G_\delta(s)$ (which equals $s\cdot Mf(s)$) can be analytically continued to $\Re(s)$ in $(I - 1) \cup I$, which is an open interval containing $[-1/2, 3/2]$. This implies immediately that $G_\delta(s)$ can be analytically continued to the same region, with a possible pole at $s = 0$.

When does $G_\delta(s)$ have a pole at $s = 0$? This happens when $sG_\delta(s)$ is non-zero at $s = 0$, i.e., when $M(tf'(t))(0) \ne 0$ for $f(t) = \eta(t)e(\delta t)$. Now
\[
M(tf'(t))(0) = \int_0^{\infty} f'(t)\,dt = \lim_{t\to\infty} f(t) - f(0).
\]
We already know that $f'(t) = (d/dt)(\eta(t)e(\delta t))$ is in $\ell_1$. Hence $\lim_{t\to\infty} f(t)$ exists, and must be $0$ because $f$ is in $\ell_1$. Hence $-M(tf'(t))(0) = f(0) = \eta(0)$.

Let us look at the next term in the Laurent expansion of $G_\delta(s)$ at $s = 0$. It is
\[
\begin{aligned}
\lim_{s\to 0}\frac{sG_\delta(s) - \eta(0)}{s} &= \lim_{s\to 0}\frac{-M(tf'(t))(s) - f(0)}{s} = -\lim_{s\to 0}\frac{1}{s}\int_0^{\infty} f'(t)(t^s - 1)\,dt\\
&= -\int_0^{\infty} f'(t)\lim_{s\to 0}\frac{t^s - 1}{s}\,dt = -\int_0^{\infty} f'(t)\log t\,dt.
\end{aligned}
\]

Here we were able to exchange the limit and the integral because $f'(t)t^{\sigma}$ is in $\ell_1$ for $\sigma$ in a neighborhood of $0$; in turn, this is true because $f'(t) = (\eta'(t) + 2\pi i\delta\eta(t))e(\delta t)$ and $\eta'(t)t^{\sigma}$ and $\eta(t)t^{\sigma}$ are both in $\ell_1$ for $\sigma$ in a neighborhood of $0$. In fact, we will use the easy bounds $|\eta(t)\log t|_1 \le (2/3)(|\eta(t)t^{-1/2}|_1 + |\eta(t)t^{1/2}|_1)$, $|\eta'(t)\log t|_1 \le (2/3)(|\eta'(t)t^{-1/2}|_1 + |\eta'(t)t^{1/2}|_1)$, resulting from the inequality
\[
|\log t| \le \frac{2}{3}\left(t^{-\frac{1}{2}} + t^{\frac{1}{2}}\right), \tag{9.5}
\]
valid for all $t > 0$.

We conclude that the Laurent expansion of $G_\delta(s)$ at $s = 0$ is
\[
G_\delta(s) = \frac{\eta(0)}{s} + c_0 + c_1 s + \ldots, \tag{9.6}
\]
where
\[
c_0 = O^*(|f'(t)\log t|_1)
= \frac{2}{3}\,O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + |\eta(t)\sqrt{t}|_1\right)\right).
\]
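The elementary inequality $|\log t| \le \frac23(t^{-1/2} + t^{1/2})$ behind the bound on $c_0$ is nearly tight: writing $t = e^x$, it reads $|x| \le \frac43\cosh(x/2)$, and the gap is minimized where $\sinh(x/2) = 3/2$. The sketch below scans the gap on a logarithmic grid; it is a floating-point illustration only, and the function names are illustrative.

```python
import math


def gap(t: float) -> float:
    """(2/3) * (t^(-1/2) + t^(1/2)) - |log t|, which should be positive for t > 0."""
    return (2.0 / 3.0) * (t ** -0.5 + t ** 0.5) - abs(math.log(t))


def min_gap(x_max: float = 20.0, steps: int = 40_000) -> float:
    """Minimum of gap(e^x) over a uniform grid of x in [-x_max, x_max]."""
    return min(gap(math.exp(-x_max + 2 * x_max * i / steps))
               for i in range(steps + 1))


if __name__ == "__main__":
    print("minimum gap on grid:", min_gap())
    print("predicted minimum  :", (2 / 3) * math.sqrt(13) - 2 * math.asinh(1.5))
```

The minimum gap is about $0.014$, attained near $t = e^{\pm 2.39}$, so the constant $2/3$ cannot be lowered by much.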

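We shift the line of integration in (9.4) to $\Re(s) = -1/2$. We obtain
\[
\frac{1}{2\pi i}\int_{\frac{3}{2} - i\infty}^{\frac{3}{2} + i\infty} -\frac{L'(s, \chi)}{L(s, \chi)}\,G_\delta(s)\,x^s\,ds = I_{q=1}G_\delta(1)x - \sum_{\rho} G_\delta(\rho)x^{\rho} - R
- \frac{1}{2\pi i}\int_{-\frac{1}{2} - i\infty}^{-\frac{1}{2} + i\infty}\frac{L'(s, \chi)}{L(s, \chi)}\,G_\delta(s)\,x^s\,ds, \tag{9.7}
\]
where
\[
R = \mathrm{Res}_{s=0}\,\frac{L'(s, \chi)}{L(s, \chi)}\,G_\delta(s).
\]
Of course,
\[
G_\delta(1) = M(\eta(t)e(\delta t))(1) = \int_0^{\infty}\eta(t)e(\delta t)\,dt = \widehat{\eta}(-\delta).
\]
Let us work out the Laurent expansion of $L'(s, \chi)/L(s, \chi)$ at $s = 0$. By the functional equation (as in, e.g., [IK04, Thm. 4.15]),
\[
\frac{L'(s, \chi)}{L(s, \chi)} = \log\frac{\pi}{q} - \frac{1}{2}\psi\left(\frac{s + \kappa}{2}\right) - \frac{1}{2}\psi\left(\frac{1 - s + \kappa}{2}\right) - \frac{L'(1 - s, \chi)}{L(1 - s, \chi)}, \tag{9.8}
\]
where $\psi(s) = \Gamma'(s)/\Gamma(s)$ and
\[
\kappa = \begin{cases} 0 & \text{if } \chi(-1) = 1,\\ 1 & \text{if } \chi(-1) = -1. \end{cases}
\]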

By $\psi(1 - x) - \psi(x) = \pi\cot\pi x$ (immediate from $\Gamma(s)\Gamma(1 - s) = \pi/\sin\pi s$) and $\psi(s) + \psi(s + 1/2) = 2(\psi(2s) - \log 2)$ (Legendre; [AS64, (6.3.8)]),
\[
-\frac{1}{2}\left(\psi\left(\frac{s + \kappa}{2}\right) + \psi\left(\frac{1 - s + \kappa}{2}\right)\right) = -\psi(1 - s) + \log 2 + \frac{\pi}{2}\cot\frac{\pi(s + \kappa)}{2}. \tag{9.9}
\]
Hence, unless $q = 1$, the Laurent expansion of $L'(s, \chi)/L(s, \chi)$ at $s = 0$ is
\[
\frac{1 - \kappa}{s} + \left(\log\frac{2\pi}{q} - \psi(1) - \frac{L'(1, \chi)}{L(1, \chi)}\right) + a_1 s + a_2 s^2 + \ldots
\]
Here $\psi(1) = -\gamma$, where $\gamma$ is Euler's constant [AS64, (6.3.2)].

There is a special case for $q = 1$ due to the pole of $\zeta(s)$ at $s = 1$. We know that $\zeta'(0)/\zeta(0) = \log 2\pi$ (see, e.g., [MV07, p. 331]).

From this and (9.6), we conclude that, if $\eta(0) = 0$, then
\[
R = \begin{cases} c_0 & \text{if } q > 1 \text{ and } \chi(-1) = 1,\\ 0 & \text{otherwise}, \end{cases}
\]
where $c_0 = O^*(|\eta'(t)\log t|_1 + 2\pi|\delta||\eta(t)\log t|_1)$. If $\eta(0) \ne 0$, then
\[
R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1, \chi)}{L(1, \chi)}\right) + \begin{cases} c_0 & \text{if } \chi(-1) = 1,\\ 0 & \text{otherwise} \end{cases}
\]
for $q > 1$, and
\[
R = \eta(0)\log 2\pi
\]

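for $q = 1$.

It is time to estimate the integral on the right side of (9.7). For that, we will need to estimate $L'(s, \chi)/L(s, \chi)$ for $\Re(s) = -1/2$ using (9.8) and (9.9).

If $\Re(z) = 3/2$, then $|t^2 + z^2| \ge 9/4$ for all real $t$. Hence, by [OLBC10, (5.9.15)] and [GR94, (3.411.1)],
\[
\begin{aligned}
\psi(z) &= \log z - \frac{1}{2z} - 2\int_0^{\infty}\frac{t\,dt}{(t^2 + z^2)(e^{2\pi t} - 1)}\\
&= \log z - \frac{1}{2z} + 2\cdot O^*\left(\int_0^{\infty}\frac{t\,dt}{\frac{9}{4}(e^{2\pi t} - 1)}\right)
= \log z - \frac{1}{2z} + \frac{8}{9}\,O^*\left(\int_0^{\infty}\frac{t\,dt}{e^{2\pi t} - 1}\right)\\
&= \log z - \frac{1}{2z} + \frac{8}{9}\cdot O^*\left(\frac{1}{(2\pi)^2}\Gamma(2)\zeta(2)\right)
= \log z - \frac{1}{2z} + O^*\left(\frac{1}{27}\right) = \log z + O^*\left(\frac{10}{27}\right).
\end{aligned}
\tag{9.10}
\]
Thus, in particular, $\psi(1 - s) = \log(3/2 - i\tau) + O^*(10/27)$, where we write $s = -1/2 + i\tau$. Now
\[
\left|\cot\frac{\pi(s + \kappa)}{2}\right| = \left|\frac{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} + e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} - e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}\right| = 1.
\]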

    168 CHAPTER 9 EXPLICIT FORMULAS

    Since lt(s) = minus12 a comparison of Dirichlet series gives∣∣∣∣Lprime(1minus s χ)

    L(1minus s χ)

    ∣∣∣∣ le |ζ prime(32)||ζ(32)|

    le 150524 (911)

    where ζ prime(32) and ζ(32) can be evaluated by Euler-Maclaurin Therefore (98) and(99) give us that for s = minus12 + iτ ∣∣∣∣Lprime(s χ)

    L(s χ)

    ∣∣∣∣ le ∣∣∣logq

    π

    ∣∣∣+ log

    ∣∣∣∣32 + iτ

    ∣∣∣∣+10

    27+ log 2 +

    π

    2+ 150524

    le∣∣∣log

    q

    π

    ∣∣∣+1

    2log

    (τ2 +

    9

    4

    )+ 41396

    (912)
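As an illustration of the Euler-Maclaurin evaluation mentioned for (9.11), here is a minimal Python sketch. The function name `zeta_and_derivative`, the cut-off `N` and the number of correction terms are our own assumptions; this is a floating-point check, not a rigorous computation. It also adds up the constants entering (9.12).

```python
import math

def zeta_and_derivative(s, N=1000):
    """Euler-Maclaurin approximations to zeta(s) and zeta'(s) for real s > 1.

    Partial sum up to N-1, plus the tail integral, the boundary term N^{-s}/2
    and the first Bernoulli correction s*N^{-s-1}/12 (and their s-derivatives).
    """
    logN = math.log(N)
    z = sum(n ** -s for n in range(1, N))
    dz = -sum(math.log(n) * n ** -s for n in range(2, N))
    # tail integral N^{1-s}/(s-1)
    z += N ** (1 - s) / (s - 1)
    dz += -N ** (1 - s) * (logN / (s - 1) + 1 / (s - 1) ** 2)
    # boundary term N^{-s}/2
    z += 0.5 * N ** -s
    dz += -0.5 * logN * N ** -s
    # first Bernoulli correction
    z += s * N ** (-s - 1) / 12
    dz += (1 - s * logN) * N ** (-s - 1) / 12
    return z, dz

z, dz = zeta_and_derivative(1.5)
ratio = abs(dz) / abs(z)   # |zeta'(3/2)| / |zeta(3/2)|, to be compared with 1.50524
constant = 10 / 27 + math.log(2) + math.pi / 2 + 1.50524   # to be compared with 4.1396
```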

    Recall that we must bound the integral on the right side of (9.7). The absolute value of the integral is at most $x^{-1/2}$ times

    \[ \frac{1}{2\pi}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} \left|\frac{L'(s,\chi)}{L(s,\chi)}\,G_\delta(s)\right| |ds|. \tag{9.13} \]

    By Cauchy-Schwarz, this is at most

    \[ \sqrt{\frac{1}{2\pi}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} \left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2 |ds|} \cdot \sqrt{\frac{1}{2\pi}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |G_\delta(s)s|^2\,|ds|}. \]

    By (9.12),

    \[ \begin{aligned} \sqrt{\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} \left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2 |ds|} &\le \sqrt{\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} \left|\frac{\log q}{s}\right|^2 |ds|} + \sqrt{\int_{-\infty}^{\infty} \frac{\left|\frac12\log\left(\tau^2 + \frac94\right) + 4.1396 + \log\pi\right|^2}{\frac14 + \tau^2}\,d\tau} \\ &\le \sqrt{2\pi}\,\log q + \sqrt{226.844}, \end{aligned} \]

    where we compute the last integral numerically. (Footnote 1: by a rigorous integration from $\tau = -100000$ to $\tau = 100000$ using VNODE-LP [Ned06], which runs on the PROFIL/BIAS interval arithmetic package [Knu99].)

    Again, we use the fact that, by (2.10), $sG_\delta(s)$ is the Mellin transform of

    \[ -t\,\frac{d(e(\delta t)\eta(t))}{dt} = -2\pi i\delta t e(\delta t)\eta(t) - t e(\delta t)\eta'(t). \tag{9.14} \]

    Hence, by Plancherel (as in (2.6)),

    \[ \begin{aligned} \sqrt{\frac{1}{2\pi}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |G_\delta(s)s|^2\,|ds|} &= \sqrt{\int_0^\infty |-2\pi i\delta t e(\delta t)\eta(t) - t e(\delta t)\eta'(t)|^2\,t^{-2}\,dt} \\ &\le 2\pi|\delta|\sqrt{\int_0^\infty |\eta(t)|^2\,dt} + \sqrt{\int_0^\infty |\eta'(t)|^2\,dt}. \end{aligned} \tag{9.15} \]

    Thus, (9.13) is at most

    \[ \left(\log q + \sqrt{\frac{226.844}{2\pi}}\right)\cdot\left(|\eta'|_2 + 2\pi|\delta||\eta|_2\right). \]
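The numerical integral above can be approximated without interval arithmetic, as a plausibility check only. The sketch below is our own illustration: the function name, the substitution $\tau = \frac{1}{2}\tan\theta$ and the step count are assumptions, and nothing here is rigorous. It integrates over the whole real line, so the result should agree with the constant $226.844$ (obtained in the text on $|\tau| \le 100000$) to within roughly $0.01$, and its normalised square root should stay below $6.01$.

```python
import math

def weighted_log_integral(steps=400_000):
    """Midpoint rule for the integral over all real tau of
    (0.5*log(tau^2 + 9/4) + 4.1396 + log(pi))^2 / (1/4 + tau^2).

    With tau = tan(theta)/2 one has d(tau)/(1/4 + tau^2) = 2 d(theta),
    theta in (-pi/2, pi/2); the endpoint singularities are only logarithmic.
    """
    const = 4.1396 + math.log(math.pi)
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        tau = 0.5 * math.tan(theta)
        total += (0.5 * math.log(tau * tau + 2.25) + const) ** 2
    return 2 * total * h

value = weighted_log_integral()
normalised = math.sqrt(value / (2 * math.pi))   # the quantity added to log q
```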

    Lemma 9.1.1 leaves us with three tasks: bounding the sum of $G_\delta(\rho)x^\rho$ over all non-trivial zeroes $\rho$ with small imaginary part, bounding the sum of $G_\delta(\rho)x^\rho$ over all non-trivial zeroes $\rho$ with large imaginary part, and bounding $L'(1,\chi)/L(1,\chi)$. Let us start with the last task. In a narrow sense, it is optional: in the applications we actually need (Thm. 7.1.2, Cor. 7.1.3 and Thm. 7.1.4), we will have $\eta(0) = 0$, which makes the term $L'(1,\chi)/L(1,\chi)$ disappear. It is, however, also very easy, and can be dealt with quickly.

    Since we will be using a finite GRH check in all later applications, we might as well use it here.

    Lemma 9.1.2. Let $\chi$ be a primitive character mod $q$, $q > 1$. Assume that all non-trivial zeroes $\rho = \sigma + it$ of $L(s,\chi)$ with $|t| \le 5/8$ satisfy $\Re(\rho) = 1/2$. Then

    \[ \left|\frac{L'(1,\chi)}{L(1,\chi)}\right| \le \frac{5}{2}\log M(q) + c, \]

    where $M(q) = \max_n \left|\sum_{m \le n}\chi(m)\right|$ and

    \[ c = 5\log\frac{2\sqrt{3}}{\zeta(9/4)/\zeta(9/8)} = 15.07016\dots \]

    Proof. By a lemma of Landau's (see, e.g., [MV07, Lemma 6.3], where the constants are easily made explicit) based on the Borel-Carathéodory Lemma (as in [MV07, Lemma 6.2]), any function $f$ analytic and zero-free on a disc $C_{s_0,R} = \{s : |s - s_0| \le R\}$ of radius $R > 0$ around $s_0$ satisfies

    \[ \frac{f'(s)}{f(s)} = O^*\left(\frac{2R\log\left(M/|f(s_0)|\right)}{(R-r)^2}\right) \tag{9.16} \]

    for all $s$ with $|s - s_0| \le r$, where $0 < r < R$ and $M$ is the maximum of $|f(z)|$ on $C_{s_0,R}$. Assuming $L(s,\chi)$ has no non-trivial zeros off the critical line with $|\Im(s)| \le H$, where $H > 1/2$, we set $s_0 = 1/2 + H$, $r = H - 1/2$, and let $R \to H^-$. We obtain

    \[ \frac{L'(1,\chi)}{L(1,\chi)} = O^*\left(8H\log\frac{\max_{s \in C_{s_0,H}}|L(s,\chi)|}{|L(s_0,\chi)|}\right). \tag{9.17} \]

    Now

    \[ |L(s_0,\chi)| \ge \prod_p (1 + p^{-s_0})^{-1} = \prod_p \frac{(1 - p^{-2s_0})^{-1}}{(1 - p^{-s_0})^{-1}} = \frac{\zeta(2s_0)}{\zeta(s_0)}. \]

    Since $s_0 = 1/2 + H$, $C_{s_0,H}$ is contained in $\{s \in \mathbb{C} : \Re(s) > 1/2\}$ for any value of $H$. We choose (somewhat arbitrarily) $H = 5/8$.

    By partial summation, for $s = \sigma + it$ with $1/2 \le \sigma < 1$ and any $N \in \mathbb{Z}^+$,

    \[ \begin{aligned} L(s,\chi) &= \sum_{n \le N}\chi(n)n^{-s} - \left(\sum_{m \le N}\chi(m)\right)(N+1)^{-s} + \sum_{n \ge N+1}\left(\sum_{m \le n}\chi(m)\right)\left(n^{-s} - (n+1)^{-s}\right) \\ &= O^*\left(\frac{N^{1-1/2}}{1-1/2} + N^{1-\sigma} + M(q)N^{-\sigma}\right), \end{aligned} \tag{9.18} \]

    where $M(q) = \max_n\left|\sum_{m \le n}\chi(m)\right|$. We set $N = M(q)/3$, and obtain

    \[ |L(s,\chi)| \le 2M(q)N^{-1/2} = 2\sqrt{3}\sqrt{M(q)}. \tag{9.19} \]

    We put this into (9.17) and are done.
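The constant $c$ of Lemma 9.1.2 comes from inserting $H = 5/8$, $s_0 = 9/8$, the lower bound $\zeta(2s_0)/\zeta(s_0)$ and (9.19) into (9.17). A short floating-point sketch follows; the helper `zeta` and its cut-off are our own assumptions, and this is a sanity check rather than a rigorous evaluation.

```python
import math

def zeta(s, N=1000):
    """Euler-Maclaurin approximation to zeta(s) for real s > 1."""
    head = sum(n ** -s for n in range(1, N))
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12

H = 5 / 8
s0 = 0.5 + H                                  # centre of the disc, s0 = 9/8
lower_bound_L = zeta(2 * s0) / zeta(s0)       # |L(s0, chi)| >= zeta(2 s0)/zeta(s0)
# 8H * log(2*sqrt(3)*sqrt(M) / lower_bound_L) = (5/2) log M + c
c = 8 * H * math.log(2 * math.sqrt(3) / lower_bound_L)
```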

    Let $M(q)$ be as in the statement of Lem. 9.1.2. Since the sum of $\chi(n)$ ($\chi$ mod $q$, $q > 1$) over any interval of length $q$ is $0$, it is easy to see that $M(q) \le q/2$. We also have the following explicit version of the Pólya-Vinogradov inequality:

    \[ M(q) \le \begin{cases} \frac{2}{\pi^2}\sqrt{q}\log q + \frac{4}{\pi^2}\sqrt{q}\log\log q + \frac{3}{2}\sqrt{q} & \text{if $\chi(-1) = 1$,} \\ \frac{1}{2\pi}\sqrt{q}\log q + \frac{1}{\pi}\sqrt{q}\log\log q + \sqrt{q} & \text{if $\chi(-1) = -1$.} \end{cases} \tag{9.20} \]

    Taken together with $M(q) \le q/2$, this implies that

    \[ M(q) \le q^{4/5} \tag{9.21} \]

    for all $q \ge 1$, and also that

    \[ M(q) \le 2q^{3/5} \tag{9.22} \]

    for all $q \ge 1$.

    Notice, lastly, that

    \[ \left|\log\frac{2\pi}{q} + \gamma\right| \le \log q + \log\frac{e^\gamma\cdot 2\pi}{3^2} \]

    for all $q \ge 3$. (There are no primitive characters modulo $2$, so we can omit $q = 2$.) We conclude that, for $\chi$ primitive and non-trivial,

    \[ \left|\log\frac{2\pi}{q} + \gamma - \frac{L'(1,\chi)}{L(1,\chi)}\right| \le \log\frac{e^\gamma\cdot 2\pi}{3^2} + \log q + \frac{5}{2}\log q^{4/5} + 15.07017 \le 3\log q + 15.289. \]

    Obviously, $15.289$ is more than $\log 2\pi$, the bound for $\chi$ trivial. Hence, the absolute value of the quantity $R$ in the statement of Lemma 9.1.1 is at most

    \[ |\eta(0)|(3\log q + 15.289) + |c_0| \tag{9.23} \]

    for all primitive $\chi$.

    It now remains to bound the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1). Clearly,

    \[ \left|\sum_\rho G_\delta(\rho)x^\rho\right| \le \sum_\rho |G_\delta(\rho)|\cdot x^{\Re(\rho)}. \]
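As a numerical sanity check on (9.21), (9.22) and the constant $15.289$ in (9.23), the following sketch compares the minimum of $q/2$ and the right side of (9.20) with $q^{4/5}$ and $2q^{3/5}$ over a finite range. The function names and the range `Q_MAX` are our own illustrative choices; a finite scan is of course not a proof for all $q$.

```python
import math

def polya_vinogradov_bound(q, even=True):
    """Right-hand side of (9.20) for the given parity (needs q >= 3 so that log log q > 0)."""
    root, lg = math.sqrt(q), math.log(q)
    if even:
        return root * (2 / math.pi ** 2 * lg + 4 / math.pi ** 2 * math.log(lg) + 1.5)
    return root * (lg / (2 * math.pi) + math.log(lg) / math.pi + 1.0)

def combined_bound(q):
    """Bound on M(q) from M(q) <= q/2 and (9.20), taking the worse of the two parities."""
    pv = max(polya_vinogradov_bound(q, True), polya_vinogradov_bound(q, False))
    return min(q / 2, pv)

Q_MAX = 100_000  # illustrative range only
ok_921 = all(combined_bound(q) <= q ** 0.8 for q in range(3, Q_MAX))
ok_922 = all(combined_bound(q) <= 2 * q ** 0.6 for q in range(3, Q_MAX))

# Constant term in the bound leading to (9.23); note (5/2) * log q^{4/5} = 2 log q.
euler_gamma = 0.5772156649015329
const_923 = math.log(math.exp(euler_gamma) * 2 * math.pi / 9) + 15.07017
```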

    Recall that these are sums over the non-trivial zeros $\rho$ of $L(s,\chi)$.

    We first prove a general lemma on sums of values of functions on the non-trivial zeros of $L(s,\chi)$. This is little more than partial summation, given a (classical) bound for the number of zeroes $N(T,\chi)$ of $L(s,\chi)$ with $|\Im(s)| \le T$. The error term becomes particularly simple if $f$ is real-valued and decreasing; the statement is then practically identical to that of [Leh66, Lemma 1] (for $\chi$ principal), except for the fact that the error term is improved here.

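    Lemma 9.1.3. Let $f : \mathbb{R}^+ \to \mathbb{C}$ be piecewise $C^1$. Assume $\lim_{t\to\infty} f(t)\,t\log t = 0$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$; let $\rho$ denote the non-trivial zeros of $L(s,\chi)$. Then, for any $y \ge 1$,

    \[ \begin{aligned} \sum_{\substack{\rho \text{ non-trivial} \\ \Im(\rho) > y}} f(\Im(\rho)) &= \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT \\ &\quad + \frac{1}{2}\,O^*\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \end{aligned} \tag{9.24} \]

    where

    \[ g_\chi(T) = 0.5\log qT + 17.7. \tag{9.25} \]

    If $f$ is real-valued and decreasing on $[y,\infty)$, the second line of (9.24) equals

    \[ O^*\left(\frac{1}{4}\int_y^\infty \frac{f(T)}{T}\,dT\right). \]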

    Proof. Write $N(T,\chi)$ for the number of non-trivial zeros of $L(s,\chi)$ satisfying $|\Im(s)| \le T$. Write $N^+(T,\chi)$ for the number of (necessarily non-trivial) zeros of $L(s,\chi)$ with $0 < \Im(s) \le T$. Then, for any $f : \mathbb{R}^+ \to \mathbb{C}$ with $f$ piecewise differentiable and $\lim_{t\to\infty} f(t)N(t,\chi) = 0$,

    \[ \begin{aligned} \sum_{\rho : \Im(\rho) > y} f(\Im(\rho)) &= \int_y^\infty f(T)\,dN^+(T,\chi) = -\int_y^\infty f'(T)\left(N^+(T,\chi) - N^+(y,\chi)\right)dT \\ &= -\frac{1}{2}\int_y^\infty f'(T)\left(N(T,\chi) - N(y,\chi)\right)dT. \end{aligned} \]

    Now, by [Ros41, Thms. 17-19] and [McC84a, Thm. 2.1] (see also [Tru, Thm. 1]),

    \[ N(T,\chi) = \frac{T}{\pi}\log\frac{qT}{2\pi e} + O^*(g_\chi(T)) \tag{9.26} \]

    for $T \ge 1$, where $g_\chi(T)$ is as in (9.25). (This is a classical formula; the references serve to prove the explicit form (9.25) for the error term $g_\chi(T)$.)
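To see (9.26) in action in the simplest case $q = 1$, one can compare the main term with an actual count of zeros of $\zeta(s)$. The sketch below is an illustration only: it hard-codes the first ten positive ordinates of zeros of $\zeta$ (standard numerical values, rounded to six decimals), so it is valid only for $T \le 50$, and it takes the constant in $g_\chi$ to be $17.7$ as in (9.25).

```python
import math

# Imaginary parts of the first ten non-trivial zeros of zeta (positive ordinates, rounded).
ZETA_ORDINATES = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
                  37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def zero_count(T):
    """N(T, chi) for q = 1: zeros with |Im| <= T, counted on both sides of the real axis.

    Only valid for T <= 50, since the list above stops before the next zero (near 52.97).
    """
    return 2 * sum(1 for gamma in ZETA_ORDINATES if gamma <= T)

def main_term(T, q=1):
    """Main term (T/pi) * log(qT / (2 pi e)) of (9.26)."""
    return (T / math.pi) * math.log(q * T / (2 * math.pi * math.e))

def g_chi(T, q=1):
    """Error bound (9.25)."""
    return 0.5 * math.log(q * T) + 17.7

checks = [(T, zero_count(T), main_term(T), g_chi(T)) for T in (15, 30, 50)]
```

In each of the three sample cases the discrepancy between the count and the main term is far smaller than $g_\chi(T)$, which is consistent with (9.25) being a convenient rather than a sharp error term.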

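    Thus, for $y \ge 1$,

    \[ \begin{aligned} \sum_{\rho : \Im(\rho) > y} f(\Im(\rho)) &= -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT \\ &\quad + \frac{1}{2}\,O^*\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right). \end{aligned} \tag{9.27} \]

    Here

    \[ -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT. \tag{9.28} \]

    If $f$ is real-valued and decreasing (and so, by $\lim_{t\to\infty} f(t) = 0$, non-negative),

    \[ |f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT = f(y)g_\chi(y) - \int_y^\infty f'(T)g_\chi(T)\,dT = 0.5\int_y^\infty \frac{f(T)}{T}\,dT, \]

    since $g_\chi'(T) \le 0.5/T$ for all $T \ge T_0$.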

    Let us bound the part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to $\rho$ with bounded $|\Im(\rho)|$. The bound we will give is proportional to $\sqrt{T_0}\log qT_0$, whereas a very naive approach (based on the trivial bound $|G_\delta(\sigma + i\tau)| \le |G_0(\sigma)|$) would give a bound proportional to $T_0\log qT_0$.

    We could obtain a bound proportional to $\sqrt{T_0}\log qT_0$ for $\eta(t) = t^k e^{-t^2/2}$ by using Theorem 8.0.1. Instead, we will give a bound of that same quality, valid for $\eta$ essentially arbitrary, simply by using the fact that the Mellin transform is an isometry (preceded by an application of Cauchy-Schwarz).

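    Lemma 9.1.4. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be such that both $\eta(t)$ and $(\log t)\eta(t)$ lie in $L_1 \cap L_2$ and $\eta(t)/\sqrt{t}$ lies in $L_1$ (with respect to $dt$). Let $\delta \in \mathbb{R}$. Let $G_\delta(s)$ be the Mellin transform of $\eta(t)e(\delta t)$.

    Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Let $T_0 \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Then

    \[ \sum_{\substack{\rho \text{ non-trivial} \\ |\Im(\rho)| \le T_0}} |G_\delta(\rho)| \]

    is at most

    \[ \begin{aligned} &(|\eta|_2 + |\eta\cdot\log|_2)\sqrt{T_0}\log qT_0 + \left(17.21\,|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})|\eta|_2\right)\sqrt{T_0} \\ &\quad + \left|\eta(t)/\sqrt{t}\right|_1\cdot(1.32\log q + 34.5). \end{aligned} \tag{9.29} \]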

    Proof. For $s = 1/2 + i\tau$, we have the trivial bound

    \[ |G_\delta(s)| \le \int_0^\infty |\eta(t)|\,t^{1/2}\,\frac{dt}{t} = \left|\eta(t)/\sqrt{t}\right|_1, \tag{9.30} \]

    where $F_\delta$ is as in (9.47). We also have the trivial bound

    \[ |G_\delta'(s)| = \left|\int_0^\infty (\log t)\eta(t)e(\delta t)\,t^s\,\frac{dt}{t}\right| \le \int_0^\infty |(\log t)\eta(t)|\,t^\sigma\,\frac{dt}{t} = \left|(\log t)\eta(t)t^{\sigma-1}\right|_1 \tag{9.31} \]

    for $s = \sigma + i\tau$.

    Let us start by bounding the contribution of very low-lying zeros ($|\Im(\rho)| \le 1$). By (9.26) and (9.25),

    \[ N(1,\chi) = \frac{1}{\pi}\log\frac{q}{2\pi e} + O^*(0.5\log q + 17.7) = O^*(0.819\log q + 16.8). \]

    Therefore,

    \[ \sum_{\substack{\rho \text{ non-trivial} \\ |\Im(\rho)| \le 1}} |G_\delta(\rho)| \le \left|\eta(t)t^{-1/2}\right|_1\cdot(0.819\log q + 16.8). \]

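    Let us now consider zeros $\rho$ with $|\Im(\rho)| > 1$. Apply Lemma 9.1.3 with $y = 1$ and

    \[ f(t) = \begin{cases} |G_\delta(1/2 + it)| & \text{if $t \le T_0$,} \\ 0 & \text{if $t > T_0$.} \end{cases} \]

    This gives us that

    \[ \begin{aligned} \sum_{\rho : 1 < |\Im(\rho)| \le T_0} f(\Im(\rho)) &= \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \\ &\quad + O^*\left(|f(1)|g_\chi(1) + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \end{aligned} \tag{9.32} \]

    where we are using the fact that $f(\sigma + i\tau) = f(\sigma - i\tau)$ (because $\eta$ is real-valued). By Cauchy-Schwarz,

    \[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\frac{1}{\pi}\int_1^{T_0} |f(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT}. \]

    Now

    \[ \frac{1}{\pi}\int_1^{T_0} |f(T)|^2\,dT \le \frac{1}{2\pi}\int_{-\infty}^\infty \left|G_\delta\left(\frac{1}{2} + iT\right)\right|^2 dT \le \int_0^\infty |e(\delta t)\eta(t)|^2\,dt = |\eta|_2^2 \]

    by Plancherel (as in (2.6)). We also have

    \[ \int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT \le \frac{2\pi}{q}\int_0^{\frac{qT_0}{2\pi}} (\log t)^2\,dt \le \left(\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1\right)\cdot T_0. \]

    Hence

    \[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1}\cdot|\eta|_2\sqrt{T_0}. \]

    Again by Cauchy-Schwarz,

    \[ \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^\infty |f'(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0} |g_\chi(T)|^2\,dT}. \]

    Since $|f'(T)| = |G_\delta'(1/2 + iT)|$ and $(M\eta)'(s)$ is the Mellin transform of $\log(t)\cdot e(\delta t)\eta(t)$ (by (2.10)),

    \[ \frac{1}{2\pi}\int_{-\infty}^\infty |f'(T)|^2\,dT = |\eta(t)\log(t)|_2^2. \]

    Much as before,

    \[ \int_1^{T_0} |g_\chi(T)|^2\,dT \le \int_0^{T_0} (0.5\log qT + 17.7)^2\,dT = \left(0.25(\log qT_0)^2 + 17.2(\log qT_0) + 296.09\right)T_0. \]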
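The last identity, and the constants $0.819$ and $16.8$ used for the low-lying zeros, are simple arithmetic. The sketch below (function names are ours, for illustration) expands $\int_0^{T_0}(a\log qT + b)^2\,dT$ in terms of $L = \log qT_0$ and checks that its square root is at most $0.5L + 17.21$, which is the form in which it enters the next display.

```python
import math

def integral_g_squared(L, a=0.5, b=17.7):
    """(1/T0) * integral_0^{T0} (a*log(qT) + b)^2 dT, written in terms of L = log(q*T0).

    Uses int_0^{T0} (log qT)^2 dT = T0*(L^2 - 2L + 2) and int_0^{T0} log qT dT = T0*(L - 1).
    """
    return a * a * (L * L - 2 * L + 2) + 2 * a * b * (L - 1) + b * b

def quadratic_in_text(L):
    """The quadratic 0.25 L^2 + 17.2 L + 296.09 quoted in the text."""
    return 0.25 * L * L + 17.2 * L + 296.09

# Constants in N(1, chi) = O^*(0.819 log q + 16.8)
low_coeff = 1 / math.pi + 0.5
low_const = 17.7 - math.log(2 * math.pi * math.e) / math.pi
```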

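    Summing, we obtain

    \[ \begin{aligned} &\frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \\ &\quad \le \left(\left(\log\frac{qT_0}{2\pi e} + \frac{1}{2}\right)|\eta|_2 + \left(\frac{\log qT_0}{2} + 17.21\right)|\eta(t)(\log t)|_2\right)\sqrt{T_0}. \end{aligned} \]

    Finally, by (9.30) and (9.25),

    \[ |f(1)|g_\chi(1) \le \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \]

    By (9.32) and the assumption that all non-trivial zeros with $|\Im(\rho)| \le T_0$ lie on the line $\Re(s) = 1/2$, we conclude that

    \[ \begin{aligned} \sum_{\substack{\rho \text{ non-trivial} \\ 1 < |\Im(\rho)| \le T_0}} |G_\delta(\rho)| &\le (|\eta|_2 + |\eta\cdot\log|_2)\sqrt{T_0}\log qT_0 + \left(17.21\,|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})|\eta|_2\right)\sqrt{T_0} \\ &\quad + \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \end{aligned} \]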

    All that remains is to bound the contribution to $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to all zeroes $\rho$ with $|\Im(\rho)| > T_0$. We will do this by another application of Lemma 9.1.3, combined with bounds on $G_\delta(\rho)$ for $\Im(\rho)$ large. This is the only part that will require us to take a look at the actual smoothing function $\eta$ we are working with; it is at this point, not before, that we actually have to look at each of our options for $\eta$ one by one.

    9.2 Sums and decay for the Gaussian

    It is now time to derive our bounds for the Gaussian smoothing. As we were saying, there is really only one thing left to do, namely, an estimate for the sum $\sum_\rho |F_\delta(\rho)|$ over all zeros $\rho$ with $|\Im(\rho)| > T_0$.

    Lemma 9.2.1. Let $\eta_\heartsuit(t) = e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$. Assume that $T_0 \ge 50$.

    Write $F_\delta(s)$ for the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. Then

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| \le \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right). \]

    Here we have preferred to give a bound with a simple form. It is probably feasible to derive from Theorem 8.0.1 a bound essentially proportional to $e^{-E(\rho)T_0}$, where $\rho = T_0/(\pi\delta)^2$ and $E(\rho)$ is as in (8.2). (As we discussed in §8.5, $e^{-E(\rho)T_0}$ behaves as $e^{-(\pi/4)T_0}$ for $\rho$ large and as $e^{-0.125(T_0/(\pi\delta))^2}$ for $\rho$ small.)

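    Proof. First of all,

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| = \sum_{\rho : \Im(\rho) > T_0} \left(|F_\delta(\rho)| + |F_\delta(1-\rho)|\right), \]

    by the functional equation (which implies that non-trivial zeros come in pairs $\rho$, $1-\rho$). Hence, by a somewhat brutish application of Cor. 8.0.2,

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| \le \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)), \tag{9.33} \]

    where

    \[ f(\tau) = 3.001\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2} + 3.286\,e^{-0.1598|\tau|}. \tag{9.34} \]

    Obviously, $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

    We now apply Lemma 9.1.3. We obtain that

    \[ \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.35} \]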

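    We just need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$,

    \[ \begin{aligned} \int_y^\infty \left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty \left(\log t - \frac{1}{ct}\right)e^{-ct}\,dt + \left(\frac{1}{c} + c_1\right)\int_y^\infty \frac{e^{-ct}}{t}\,dt \\ &= \frac{(\log y)e^{-cy}}{c} + \left(\frac{1}{c} + c_1\right)E_1(cy), \end{aligned} \]

    where $E_1(x) = \int_x^\infty e^{-t}\,dt/t$. Clearly, $E_1(x) \le \int_x^\infty e^{-t}\,dt/x = e^{-x}/x$. Hence

    \[ \int_y^\infty \left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \left(\log y + \left(\frac{1}{c} + c_1\right)\frac{1}{y}\right)\frac{e^{-cy}}{c}. \]

    We conclude that

    \[ \begin{aligned} \int_{T_0}^\infty e^{-0.1598t}\left(\frac{1}{2\pi}\log\frac{qt}{2\pi} + \frac{1}{4t}\right)dt &\le \frac{1}{2\pi}\int_{T_0}^\infty \left(\log t + \frac{\pi/2}{t}\right)e^{-ct}\,dt + \frac{\log\frac{q}{2\pi}}{2\pi}\int_{T_0}^\infty e^{-ct}\,dt \\ &\le \frac{1}{2\pi c}\left(\log T_0 + \log\frac{q}{2\pi} + \left(\frac{1}{c} + \frac{\pi}{2}\right)\frac{1}{T_0}\right)e^{-cT_0} \end{aligned} \tag{9.36} \]

    with $c = 0.1598$. Since $T_0 \ge 50$ and $q \ge 1$, this is at most

    \[ 1.072\,\log\frac{qT_0}{2\pi}\,e^{-cT_0}. \tag{9.37} \]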

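    Now let us deal with the Gaussian term. (It appears only if $T_0 < (3/2)(\pi\delta)^2$, as otherwise $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$.) For any $y \ge e$, $c > 0$,

    \[ \int_y^\infty e^{-ct^2}\,dt = \frac{1}{\sqrt{c}}\int_{\sqrt{c}y}^\infty e^{-t^2}\,dt \le \frac{1}{cy}\int_{\sqrt{c}y}^\infty te^{-t^2}\,dt \le \frac{e^{-cy^2}}{2cy}, \tag{9.38} \]

    \[ \int_y^\infty \frac{e^{-ct^2}}{t}\,dt = \int_{cy^2}^\infty \frac{e^{-t}}{2t}\,dt = \frac{E_1(cy^2)}{2} \le \frac{e^{-cy^2}}{2cy^2}, \tag{9.39} \]

    \[ \int_y^\infty (\log t)e^{-ct^2}\,dt \le \int_y^\infty \left(\log t + \frac{\log t - 1}{2ct^2}\right)e^{-ct^2}\,dt = \frac{\log y}{2cy}\,e^{-cy^2}. \tag{9.40} \]

    Hence

    \[ \begin{aligned} &\int_{T_0}^\infty e^{-0.1065\left(\frac{T}{\pi\delta}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT = \int_{\frac{T_0}{\pi|\delta|}}^\infty e^{-0.1065t^2}\left(\frac{|\delta|}{2}\log\frac{q|\delta|t}{2} + \frac{1}{4t}\right)dt \\ &\quad \le \left(\frac{\frac{|\delta|}{2}\log\frac{T_0}{\pi|\delta|}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{\frac{|\delta|}{2}\log\frac{q|\delta|}{2}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{1}{8c'\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} \end{aligned} \tag{9.41} \]

    with $c' = 0.1065$. Since $T_0 \ge 50$ and $q \ge 1$,

    \[ \frac{2\pi}{8T_0} \le \frac{\pi}{200} \le 0.0152\cdot\frac{1}{2}\log\frac{qT_0}{2\pi}. \]

    Thus, the last line of (9.41) is less than

    \[ 1.0152\,\frac{\frac{|\delta|}{2}\log\frac{qT_0}{2\pi}}{2c'\frac{T_0}{\pi|\delta|}}\,e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} \le 7.487\,\frac{\delta^2}{T_0}\cdot\log\frac{qT_0}{2\pi}\cdot e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}. \tag{9.42} \]

    Again by $T_0 \ge 4\pi^2|\delta|$, we see that $1.0057\pi|\delta|/(4cT_0) \le 1.0057/(16c\pi) \le 0.18787$.

    To obtain our final bound, we simply sum (9.37) and (9.42), after multiplying them by the constants $3.286$ and $3.001$ in (9.34). We conclude that the integral in (9.35) is at most

    \[ \left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\log\frac{qT_0}{2\pi}. \]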
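The numerical constants in the proof of Lemma 9.2.1 can be re-derived in a few lines. The sketch below is illustrative (variable names are ours); it evaluates the worst case $q = 1$, $T_0 = 50$ for the passage from (9.36) to (9.37), the relative size of the $1/(4t)$ term used before (9.42), and the products with the constants of (9.34).

```python
import math

c, c_prime, T0_min = 0.1598, 0.1065, 50.0
log_term = math.log(T0_min / (2 * math.pi))   # log(q*T0/(2*pi)) at q = 1, T0 = 50

# (9.36) -> (9.37): (1/(2 pi c)) * (1 + (1/c + pi/2) / (T0 * log(q T0 / 2 pi)))
factor_937 = (1 + (1 / c + math.pi / 2) / (T0_min * log_term)) / (2 * math.pi * c)

# Before (9.42): the 1/(4t) term is at most 0.0152 times the main term
rel_extra = (math.pi / (4 * T0_min)) / (0.5 * log_term)

# (9.42): 1.0152 * (|delta|/2) / (2 c' T0/(pi |delta|)) = 1.0152 * pi/(4 c') * delta^2/T0
factor_942 = 1.0152 * math.pi / (4 * c_prime)

exp_coeff = 3.286 * 1.072     # coefficient of e^{-0.1598 T0} in the final bound
gauss_coeff = 3.001 * 7.487   # coefficient of (delta^2/T0) e^{-0.1065 (T0/(pi|delta|))^2}
```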

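    We need to record a few norms related to the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$ before we proceed. Recall we are working with the one-sided Gaussian, i.e., we set $\eta_\heartsuit(t) = 0$ for $t < 0$. Symbolic integration then gives

    \[ \begin{aligned} |\eta_\heartsuit|_2^2 &= \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}, \\ |\eta_\heartsuit'|_2^2 &= \int_0^\infty (te^{-t^2/2})^2\,dt = \frac{\sqrt{\pi}}{4}, \\ |\eta_\heartsuit\cdot\log|_2^2 &= \int_0^\infty e^{-t^2}(\log t)^2\,dt = \frac{\sqrt{\pi}}{16}\left(\pi^2 + 2\gamma^2 + 8\gamma\log 2 + 8(\log 2)^2\right) \le 1.94753, \end{aligned} \tag{9.43} \]

    \[ \begin{aligned} |\eta_\heartsuit(t)/\sqrt{t}|_1 &= \int_0^\infty \frac{e^{-t^2/2}}{\sqrt{t}}\,dt = \frac{\Gamma(1/4)}{2^{3/4}} \le 2.15581, \\ |\eta_\heartsuit'(t)/\sqrt{t}|_1 = |\eta_\heartsuit(t)\sqrt{t}|_1 &= \int_0^\infty e^{-t^2/2}\sqrt{t}\,dt = \frac{\Gamma(3/4)}{2^{1/4}} \le 1.03045, \\ \left|\eta_\heartsuit'(t)t^{1/2}\right|_1 = \left|\eta_\heartsuit(t)t^{3/2}\right|_1 &= \int_0^\infty e^{-t^2/2}\,t^{3/2}\,dt \le 1.07791. \end{aligned} \tag{9.44} \]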
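The closed forms in (9.43) and (9.44) can be evaluated directly with the standard library. This is a floating-point check under our own variable names, not a rigorous bound; the value of Euler's constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329
SQRT_PI = math.sqrt(math.pi)
LOG2 = math.log(2)

norm2_sq = SQRT_PI / 2                       # |eta|_2^2
norm2_deriv_sq = SQRT_PI / 4                 # |eta'|_2^2
norm2_log_sq = SQRT_PI / 16 * (math.pi ** 2 + 2 * EULER_GAMMA ** 2
                               + 8 * EULER_GAMMA * LOG2 + 8 * LOG2 ** 2)
l1_over_sqrt = math.gamma(0.25) / 2 ** 0.75      # |eta(t)/sqrt(t)|_1
l1_times_sqrt = math.gamma(0.75) / 2 ** 0.25     # |eta(t) sqrt(t)|_1 = |eta'(t)/sqrt(t)|_1
l1_three_halves = 2 ** 0.25 * math.gamma(1.25)   # |eta(t) t^{3/2}|_1 = |eta'(t) sqrt(t)|_1
```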

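    We can now state what is really our main result for the Gaussian smoothing. (The version in §7.1 will, as we shall later see, follow from this, given numerical inputs.)

    Proposition 9.2.2. Let $\eta(t) = e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge 50$.

    Then

    \[ \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta\left(\frac{n}{x}\right) = \begin{cases} \widehat{\eta}(-\delta)x + O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x & \text{if $q = 1$,} \\ O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x & \text{if $q > 1$,} \end{cases} \tag{9.45} \]

    where

    \[ \begin{aligned} \mathrm{err}_{\eta,\chi}(\delta,x) &= \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right) \\ &\quad + \left(2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38\right)x^{-1/2} \\ &\quad + (3\log q + 14|\delta| + 17)x^{-1} + (\log q + 6)\cdot(1 + 5|\delta|)\cdot x^{-3/2}. \end{aligned} \]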

    Proof. Let $F_\delta(s)$ be the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. By Lemma 9.1.4 (with $G_\delta = F_\delta$) and Lemma 9.2.1,

    \[ \left|\sum_{\rho \text{ non-trivial}} F_\delta(\rho)x^\rho\right| \]

    is at most (9.29) (with $\eta = \eta_\heartsuit$) times $\sqrt{x}$, plus

    \[ \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{|\delta|^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\cdot x. \]

    By the norm computations in (9.43) and (9.44), we see that (9.29) is at most

    \[ 2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38. \]

    Let us now apply Lemma 9.1.1. We saw that the value of $R$ in Lemma 9.1.1 is bounded by (9.23). We know that $\eta_\heartsuit(0) = 1$. Again by (9.43) and (9.44), the quantity $c_0$ defined in (9.3) is at most $1.4056 + 13.3466|\delta|$. Hence

    \[ |R| \le 3\log q + 13.347|\delta| + 16.695. \]

    Lastly,

    \[ |\eta_\heartsuit'|_2 + 2\pi|\delta||\eta_\heartsuit|_2 \le 0.942 + 4.183|\delta| \le 1 + 5|\delta|. \]

    Clearly,

    \[ (6.01 - 6)\cdot(1 + 5|\delta|) + 13.347|\delta| + 16.695 < 14|\delta| + 17, \]

    and so we are done.
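The coefficients in the $x^{-1/2}$ and $x^{-1}$ terms follow from (9.29), (9.23) and the norms in (9.43) and (9.44). The sketch below re-derives them; variable names are ours. It assumes that the quantity $c_0$ of (9.3), which is defined outside this section, has the form $\frac{2}{3}\left(|\eta'(t)/\sqrt{t}|_1 + |\eta'(t)\sqrt{t}|_1 + 2\pi|\delta|(|\eta(t)/\sqrt{t}|_1 + |\eta(t)\sqrt{t}|_1)\right)$; that assumption reproduces the constants $1.4056$ and $13.3466$ quoted in the proof.

```python
import math

norm2 = math.sqrt(math.sqrt(math.pi) / 2)   # |eta|_2, exact value from (9.43)
norm2_log = math.sqrt(1.94753)              # upper bound for |eta . log|_2 from (9.43)
l1_over_sqrt = 2.15581                      # upper bound for |eta(t)/sqrt(t)|_1 from (9.44)

# Coefficients of (9.29) for the Gaussian
coeff_sqrtT_log = norm2 + norm2_log
coeff_sqrtT = 17.21 * norm2_log - math.log(2 * math.pi * math.sqrt(math.e)) * norm2
coeff_logq = 1.32 * l1_over_sqrt
coeff_const = 34.5 * l1_over_sqrt

# c_0 under the assumed form described in the lead-in
c0_const = (2 / 3) * (1.03045 + 1.07791)
c0_delta = (2 / 3) * 2 * math.pi * (2.15581 + 1.03045)

# Constant and |delta| coefficients of the x^{-1} term
r_const = 15.289 + 1.4056 + 0.01
r_delta = 13.347 + 0.01 * 5
```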

    9.3 The case of $\eta_*(t)$

    We will now work with a weight based on the Gaussian:

    \[ \eta(t) = \begin{cases} t^2e^{-t^2/2} & \text{if $t \ge 0$,} \\ 0 & \text{if $t < 0$.} \end{cases} \tag{9.46} \]

    The fact that this vanishes at $t = 0$ actually makes it easier to work with at several levels.

    Its Mellin transform is just a shift of that of the Gaussian. Write

    \[ \begin{aligned} F_\delta(s) &= (M(e^{-t^2/2}e(\delta t)))(s), \\ G_\delta(s) &= (M(\eta(t)e(\delta t)))(s). \end{aligned} \tag{9.47} \]

    Then, by the definition of the Mellin transform,

    \[ G_\delta(s) = F_\delta(s + 2). \]

    We start by bounding the contribution of zeros with large imaginary part, just as before.

    Lemma 9.3.1. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

    Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right). \]

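    Proof. We start by writing

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| = \sum_{\rho : \Im(\rho) > T_0} \left(|F_\delta(\rho + 2)| + |F_\delta((1-\rho) + 2)|\right), \]

    where we are using $G_\delta(\rho) = F_\delta(\rho + 2)$ and the fact that non-trivial zeros come in pairs $\rho$, $1-\rho$.

    By Cor. 8.0.2 with $k = 2$,

    \[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)), \]

    where

    \[ f(\tau) = \begin{cases} \kappa_{2,1}|\tau|e^{-0.1598|\tau|} + \frac{\kappa_{2,0}}{4}\left(\frac{|\tau|}{\pi\delta}\right)^2e^{-0.1065\left(\frac{|\tau|}{\pi\delta}\right)^2} & \text{if $|\tau| < \frac{3}{2}(\pi\delta)^2$,} \\ \kappa_{2,1}|\tau|e^{-0.1598|\tau|} & \text{if $|\tau| \ge \frac{3}{2}(\pi\delta)^2$,} \end{cases} \tag{9.48} \]

    where $\kappa_{2,0} = 7.96$ and $\kappa_{2,1} = 5.13$. We are including the term $|\tau|e^{-0.1598|\tau|}$ in both cases in part because we cannot be bothered to take it out (just as we could not be bothered in the proof of Lem. 9.2.1) and in part to ensure that $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

    We can now apply Lemma 9.1.3. We obtain, again,

    \[ \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.49} \]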

    Just as before, we will need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$ such that $\log y > 1/(cy)$,

    \[ \begin{aligned} \int_y^\infty te^{-ct}\,dt &= \left(\frac{y}{c} + \frac{1}{c^2}\right)e^{-cy}, \\ \int_y^\infty \left(t\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty \left(\left(t + \frac{a-1}{c}\right)\log t - \frac{1}{c} - \frac{a}{c^2t}\right)e^{-ct}\,dt = \left(\frac{y}{c} + \frac{a}{c^2}\right)e^{-cy}\log y, \end{aligned} \tag{9.50} \]

    where

    \[ a = \frac{\frac{\log y}{c} + \frac{1}{c} + \frac{c_1}{y}}{\frac{\log y}{c} - \frac{1}{c^2y}}. \]

    Setting $c = 0.1598$, $c_1 = \pi/2$, $y = T_0 \ge 50$, we obtain that

    \[ \int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)Te^{-0.1598T}\,dT \le \frac{1}{2\pi}\left(\log\frac{q}{2\pi}\cdot\left(\frac{T_0}{c} + \frac{1}{c^2}\right) + \left(\frac{T_0}{c} + \frac{a}{c^2}\right)\log T_0\right)e^{-0.1598T_0} \tag{9.51} \]

    and

    \[ a = \frac{\frac{\log T_0}{0.1598} + \frac{1}{0.1598} + \frac{\pi/2}{T_0}}{\frac{\log T_0}{0.1598} - \frac{1}{0.1598^2\,T_0}} \le 1.299. \]

    It is easy to see that the ratio of the expression within parentheses on the right side of (9.51) to $T_0\log(qT_0/2\pi)$ increases as $q$ decreases and, if we hold $q$ fixed, decreases as $T_0 \ge 2\pi$ increases; thus, it is maximal for $q = 1$ and $T_0 = 50$. Multiplying (9.51) by $\kappa_{2,1} = 5.13$ and simplifying by the assumption $T_0 \ge 50$, we obtain that

    \[ \int_{T_0}^\infty 5.13\,Te^{-0.1598T}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \le 6.11\,T_0\log\frac{qT_0}{2\pi}\cdot e^{-0.1598T_0}. \tag{9.52} \]

    Now let us examine the Gaussian term. First of all: when does it arise? If $T_0 \ge (3/2)(\pi\delta)^2$, then $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$, and so (9.48) does not give us a Gaussian term. Recall that $T_0 \ge 10\pi|\delta|$, which means that $|\delta| \le 20/(3\pi)$ implies that $T_0 \ge (3/2)(\pi\delta)^2$. We can thus assume from now on that $|\delta| > 20/(3\pi)$, since otherwise there is no Gaussian term to treat.

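    For any $y \ge 1$, $c, c_1 > 0$,

    \[ \begin{aligned} \int_y^\infty t^2e^{-ct^2}\,dt &< \int_y^\infty \left(t^2 + \frac{1}{4c^2t^2}\right)e^{-ct^2}\,dt = \left(\frac{y}{2c} + \frac{1}{4c^2y}\right)\cdot e^{-cy^2}, \\ \int_y^\infty (t^2\log t + c_1t)\cdot e^{-ct^2}\,dt &\le \int_y^\infty \left(t^2\log t + \frac{at\log et}{2c} - \frac{\log et}{2c} - \frac{a}{4c^2t}\right)e^{-ct^2}\,dt \\ &= \frac{(2cy + a)\log y + a}{4c^2}\cdot e^{-cy^2}, \end{aligned} \]

    where

    \[ a = \frac{c_1y + \frac{\log ey}{2c}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{c_1y + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{2c_1c}{\log ey} + \frac{\frac{c_1}{2cy\log ey} + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}. \]

    (Note that $a$ decreases as $y \ge y_0$ increases, provided that $\log ey_0 > 1/(2cy_0^2)$.) Setting $c = 0.1065$, $c_1 = 1/(2|\delta|) \le 3/16$ and $y = T_0/(\pi|\delta|) \ge 4\pi$, we obtain

    \[ \begin{aligned} &\int_{\frac{T_0}{\pi|\delta|}}^\infty \left(\frac{1}{2\pi}\log\frac{q|\delta|t}{2} + \frac{1}{4\pi|\delta|t}\right)t^2e^{-0.1065t^2}\,dt \\ &\quad \le \left(\frac{1}{2\pi}\log\frac{q|\delta|}{2}\right)\cdot\left(\frac{T_0}{2\pi c|\delta|} + \frac{1}{4c^2\cdot 10}\right)\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2} + \frac{1}{2\pi}\cdot\frac{\left(2c\frac{T_0}{\pi|\delta|} + a\right)\log\frac{T_0}{\pi|\delta|} + a}{4c^2}\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2} \end{aligned} \]

    and

    \[ a \le \frac{1}{10} + \frac{\left(2\cdot\frac{20}{3\pi}\right)^{-1}\cdot 10 + \frac{1}{4\cdot 0.1065^2\cdot 10^2}}{\frac{10\log 10e}{2\cdot 0.1065} - \frac{1}{4\cdot 0.1065^2\cdot 10}} \le 0.117. \]

    Multiplying by $(\kappa_{2,0}/4)\pi|\delta|$, we get that

    \[ \int_{T_0}^\infty \frac{\kappa_{2,0}}{4}\left(\frac{T}{\pi|\delta|}\right)^2e^{-0.1065\left(\frac{T}{\pi|\delta|}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \tag{9.53} \]

    is at most $e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}$ times

    \[ \begin{aligned} &(1.487\,T_0 + 2.194|\delta|)\cdot\log\frac{q|\delta|}{2} + 1.487\,T_0\log\frac{T_0}{\pi|\delta|} + 2.566|\delta|\log\frac{eT_0}{\pi|\delta|} \\ &\quad \le \left(1.487 + 2.566\cdot\frac{1 + \frac{1}{\log(T_0/\pi|\delta|)}}{T_0/|\delta|}\right)T_0\log\frac{qT_0}{2\pi} \le 1.578\cdot T_0\log\frac{qT_0}{2\pi}, \end{aligned} \tag{9.54} \]

    where we are using several times the assumption that $T_0 \ge 4\pi^2|\delta|$ (and, on one occasion, the fact that $|\delta| > 20/(3\pi) > 2$).

    We sum (9.52) and the estimate for (9.53) we have just got to reach our conclusion.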
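The constants appearing in the proof of Lemma 9.3.1 can be recomputed as follows. The sketch is an illustration with our own variable names; it evaluates the worst cases named in the text ($q = 1$, $T_0 = 50$ for the exponential term, $y = 10$ and $c_1 = 3\pi/40$ for the Gaussian term, as in the displayed bound for $a$).

```python
import math

c, T0 = 0.1598, 50.0
kappa_21, kappa_20 = 5.13, 7.96

# a as defined after (9.50), with c_1 = pi/2 and y = T0 = 50
a_exp = (math.log(T0) / c + 1 / c + (math.pi / 2) / T0) / (math.log(T0) / c - 1 / (c * c * T0))

# Right side of (9.51) at q = 1, T0 = 50 (with a replaced by 1.299), divided by
# T0*log(q*T0/(2*pi)) and multiplied by kappa_{2,1}: compare with 6.11 in (9.52)
bracket = (math.log(1 / (2 * math.pi)) * (T0 / c + 1 / c ** 2)
           + (T0 / c + 1.299 / c ** 2) * math.log(T0))
coeff_952 = kappa_21 * bracket / (2 * math.pi) / (T0 * math.log(T0 / (2 * math.pi)))

# Gaussian part: bound for a with c = 0.1065, c_1 = 3*pi/40, y = 10
cg, y = 0.1065, 10.0
a_gauss = 1 / y + ((3 * math.pi / 40) * y + 1 / (4 * cg ** 2 * y ** 2)) / (
    y * math.log(math.e * y) / (2 * cg) - 1 / (4 * cg ** 2 * y))

# Coefficients in (9.54) after multiplying by (kappa_{2,0}/4) * pi * |delta|
coeff_T0 = kappa_20 / (16 * math.pi * cg)            # compare with 1.487
coeff_delta_1 = kappa_20 / (8 * 4 * cg ** 2 * 10)    # compare with 2.194
coeff_delta_2 = kappa_20 * 0.117 / (32 * cg ** 2)    # compare with 2.566
```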

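    Again, we record some norms obtained by symbolic integration for $\eta$ as in (9.46):

    \[ \begin{aligned} |\eta|_2^2 &= \frac{3}{8}\sqrt{\pi}, \qquad |\eta'|_2^2 = \frac{7}{16}\sqrt{\pi}, \\ |\eta\cdot\log|_2^2 &= \frac{\sqrt{\pi}}{64}\left(8(3\gamma - 8)\log 2 + 3\pi^2 + 6\gamma^2 + 24(\log 2)^2 + 16 - 32\gamma\right) \le 0.16364, \\ |\eta(t)/\sqrt{t}|_1 &= \frac{2^{1/4}\Gamma(1/4)}{4} \le 1.07791, \qquad |\eta(t)\sqrt{t}|_1 = \frac{3}{4}2^{3/4}\Gamma(3/4) \le 1.54568, \\ |\eta'(t)/\sqrt{t}|_1 &= \int_0^{\sqrt{2}} t^{3/2}e^{-t^2/2}\,dt - \int_{\sqrt{2}}^\infty t^{3/2}e^{-t^2/2}\,dt \le 1.48469, \\ |\eta'(t)\sqrt{t}|_1 &\le 1.72169. \end{aligned} \tag{9.55} \]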
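The values in (9.55) can be checked in floating point. In the sketch below (names are ours, for illustration), the closed forms are evaluated directly, while the two $L_1$ norms of $\eta'$ are computed from the definition, using $\eta'(t) = (2t - t^3)e^{-t^2/2}$ and a midpoint rule; this is a plausibility check, not a rigorous bound.

```python
import math

EULER_GAMMA = 0.5772156649015329
SQRT_PI, LOG2 = math.sqrt(math.pi), math.log(2)

norm2_sq = 3 * SQRT_PI / 8
norm2_deriv_sq = 7 * SQRT_PI / 16
norm2_log_sq = SQRT_PI / 64 * (8 * (3 * EULER_GAMMA - 8) * LOG2 + 3 * math.pi ** 2
                               + 6 * EULER_GAMMA ** 2 + 24 * LOG2 ** 2
                               + 16 - 32 * EULER_GAMMA)
l1_over_sqrt = 2 ** 0.25 * math.gamma(0.25) / 4       # |eta(t)/sqrt(t)|_1
l1_times_sqrt = 0.75 * 2 ** 0.75 * math.gamma(0.75)   # |eta(t) sqrt(t)|_1

def l1_deriv(power, upper=12.0, steps=200_000):
    """Midpoint rule for the integral of |2t - t^3| * e^{-t^2/2} * t^power over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += abs(2 * t - t ** 3) * math.exp(-t * t / 2) * t ** power
    return total * h

l1_deriv_over_sqrt = l1_deriv(-0.5)    # |eta'(t)/sqrt(t)|_1
l1_deriv_times_sqrt = l1_deriv(0.5)    # |eta'(t) sqrt(t)|_1
```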

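    Proposition 9.3.2. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

    Then

    \[ \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) = \begin{cases} \widehat{\eta}(-\delta)x + O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x & \text{if $q = 1$,} \\ O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x & \text{if $q > 1$,} \end{cases} \tag{9.56} \]

    where

    \[ \begin{aligned} \mathrm{err}_{\eta,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right) \\ &\quad + \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.19\right)\cdot x^{-1/2} \\ &\quad + (3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2}. \end{aligned} \tag{9.57} \]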

    Proof. We proceed as in the proof of Prop. 9.2.2. The contribution of Lemma 9.3.1 is

    \[ T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\cdot x, \]

    whereas the contribution of Lemma 9.1.4 is at most

    \[ \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.188\right)\sqrt{x}. \]

    Let us now apply Lemma 9.1.1. Since $\eta(0) = 0$, we have

    \[ R = O^*(c_0) = O^*(2.138 + 10.99|\delta|). \]

    Lastly,

    \[ |\eta'|_2 + 2\pi|\delta||\eta|_2 \le 0.881 + 5.123|\delta|. \]

    Now that we have Prop. 9.3.2, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of $\eta$ with something else. In general, for $\varphi_1, \varphi_2 : [0,\infty) \to \mathbb{C}$, if we know how to bound sums of the form

    \[ S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x), \tag{9.58} \]

    we can bound sums of the form $S_{f,\varphi_1 \ast_M \varphi_2}$, simply by changing the order of summation and integration:

    \[ \begin{aligned} S_{f,\varphi_1 \ast_M \varphi_2} &= \sum_n f(n)\cdot(\varphi_1 \ast_M \varphi_2)\left(\frac{n}{x}\right) \\ &= \int_0^\infty \sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\,\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\,\frac{dw}{w}. \end{aligned} \tag{9.59} \]

    This is particularly nice if $\varphi_2(t)$ vanishes in a neighbourhood of the origin, since then the argument $wx$ of $S_{f,\varphi_1}(wx)$ is always large.

    We will use $\varphi_1(t) = t^2e^{-t^2/2}$, $\varphi_2(t) = \eta_1 \ast_M \eta_1$, where $\eta_1$ is $2$ times the characteristic function of the interval $[1/2, 1]$. The motivation for the choice of $\varphi_1$ and $\varphi_2$ is clear: we have just got bounds based on $\varphi_1(t)$ in the major arcs, and we obtained minor-arc bounds for the weight $\varphi_2(t)$ in Part I.

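    Corollary 9.3.3. Let $\eta(t) = t^2e^{-t^2/2}$, $\eta_1 = 2\cdot I_{[1/2,1]}$, $\eta_2 = \eta_1 \ast_M \eta_1$. Let $\eta_* = \eta_2 \ast_M \eta$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

    Then

    \[ \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta_*(n/x) = \begin{cases} \widehat{\eta_*}(-\delta)x + O^*(\mathrm{err}_{\eta_*,\chi}(\delta,x))\cdot x & \text{if $q = 1$,} \\ O^*(\mathrm{err}_{\eta_*,\chi}(\delta,x))\cdot x & \text{if $q > 1$,} \end{cases} \tag{9.60} \]

    where

    \[ \begin{aligned} \mathrm{err}_{\eta_*,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 0.0102\cdot e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right) \\ &\quad + \left(1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17\right)\cdot x^{-1/2} \\ &\quad + (6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2}. \end{aligned} \tag{9.61} \]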

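    Proof. The left side of (9.60) equals

    \[ \int_0^\infty \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta n}{x}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w} = \int_{1/4}^1 \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta wn}{wx}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w}, \]

    since $\eta_2$ is supported on $[1/4, 1]$. By Prop. 9.3.2, the main term (if $q = 1$) contributes

    \[ \begin{aligned} \int_{1/4}^1 \widehat{\eta}(-\delta w)\,xw\cdot\eta_2(w)\,\frac{dw}{w} &= x\int_0^\infty \widehat{\eta}(-\delta w)\eta_2(w)\,dw \\ &= x\int_0^\infty \int_{-\infty}^\infty \eta(t)e(\delta wt)\,dt\cdot\eta_2(w)\,dw = x\int_0^\infty \int_{-\infty}^\infty \eta\left(\frac{r}{w}\right)e(\delta r)\,\frac{dr}{w}\,\eta_2(w)\,dw \\ &= x\int_{-\infty}^\infty \left(\int_0^\infty \eta\left(\frac{r}{w}\right)\eta_2(w)\,\frac{dw}{w}\right)e(\delta r)\,dr = \widehat{\eta_*}(-\delta)\cdot x. \end{aligned} \]

    The error term is

    \[ \int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\cdot wx\cdot\eta_2(w)\,\frac{dw}{w} = x\cdot\int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\eta_2(w)\,dw. \tag{9.62} \]

    Using the fact that

    \[ \eta_2(w) = \begin{cases} 4\log 4w & \text{if $w \in [1/4, 1/2]$,} \\ 4\log w^{-1} & \text{if $w \in [1/2, 1]$,} \\ 0 & \text{otherwise,} \end{cases} \]

    we can easily check that

    \[ \begin{aligned} \int_0^\infty \eta_2(w)\,dw &= 1, & \int_0^\infty w^{-1/2}\eta_2(w)\,dw &\le 1.37259, \\ \int_0^\infty w^{-1}\eta_2(w)\,dw &= 4(\log 2)^2 \le 1.92182, & \int_0^\infty w^{-3/2}\eta_2(w)\,dw &\le 2.74517, \end{aligned} \]

    and, by rigorous numerical integration from $1/4$ to $1/2$ and from $1/2$ to $1$ (using, e.g., VNODE-LP [Ned06]),

    \[ \int_0^\infty e^{-0.1065\cdot 10^2\left(\frac{1}{w^2} - 1\right)}\eta_2(w)\,dw \le 0.006446. \]

    We then see that (9.57) and (9.62) imply (9.61).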
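The moments of $\eta_2$ used in this proof have closed forms, since the Mellin transform turns the multiplicative convolution $\eta_2 = \eta_1 \ast_M \eta_1$ into a square. The sketch below is an illustration with our own function names: it evaluates those closed forms and approximates the last integral by a plain midpoint rule (not a rigorous integration, unlike the VNODE-LP computation cited above).

```python
import math

def mellin_eta1(s):
    """Integral of 2*w^{s-1} over [1/2, 1], i.e. the Mellin transform of eta_1 at real s."""
    if s == 0:
        return 2 * math.log(2)
    return 2 * (1 - 2 ** (-s)) / s

def moment_eta2(exponent):
    """Integral of w^exponent * eta_2(w) dw, using (M eta_2)(s) = (M eta_1)(s)^2 at s = exponent + 1."""
    return mellin_eta1(exponent + 1) ** 2

def eta2(w):
    """Piecewise formula for eta_2 quoted in the text."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1:
        return 4 * math.log(1 / w)
    return 0.0

def gaussian_weight_integral(steps=200_000):
    """Midpoint rule for the integral of exp(-0.1065*10^2*(1/w^2 - 1)) * eta_2(w) over [1/4, 1]."""
    h = 0.75 / steps
    total = 0.0
    for k in range(steps):
        w = 0.25 + (k + 0.5) * h
        total += math.exp(-10.65 * (1 / (w * w) - 1)) * eta2(w)
    return total * h

moments = {p: moment_eta2(p) for p in (0, -0.5, -1, -1.5)}
gauss_value = gaussian_weight_integral()
```

Multiplying the value of the last integral by the constant $1.578$ of (9.57) gives the coefficient $0.0102$ in (9.61).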

    9.4 The case of $\eta_+(t)$

    We will work with

    \[ \eta(t) = \eta_+(t) = h_H(t)\cdot t\eta_\heartsuit(t) = h_H(t)\cdot te^{-t^2/2}, \tag{9.63} \]

    where $h_H$ is as in (7.6). We recall that $h_H$ is a band-limited approximation to the function $h$ defined in (7.5); to be more precise, $Mh_H(it)$ is the truncation of $Mh(it)$ to the interval $[-H, H]$.

    We are actually defining $h$, $h_H$ and $\eta$ in a slightly different way from what was done in the first version of [Hela]. The difference is instructive. There, $\eta(t)$ was defined as $h_H(t)e^{-t^2/2}$, and $h_H$ was a band-limited approximation to a function $h$ defined as in (7.5), but with $t^3(2-t)^3$ instead of $t^2(2-t)^3$. The reason for our new definitions is that now the truncation of $Mh(it)$ will not break the holomorphy of $M\eta$, and so we will be able to use the general results we proved in §9.1.

    In essence, $Mh$ will still be holomorphic because the Mellin transform of $t\eta_\heartsuit(t)$ is holomorphic in the domain we care about, unlike the Mellin transform of $\eta_\heartsuit(t)$, which does have a pole at $s = 0$.

    As usual, we start by bounding the contribution of zeros with large imaginary part. The procedure is much as before: since $\eta_+(t) = \eta_H(t)\eta_\heartsuit(t)$, the Mellin transform $M\eta_+$ is a convolution of $M(te^{-t^2/2})$ and something of support in $[-H, H]i$, namely, $M\eta_H$ restricted to the imaginary axis. This means that the decay of $M\eta_+$ is (at worst) like the decay of $M(te^{-t^2/2})$, delayed by $H$.


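    Lemma 9.4.1. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s, \chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

    Write $G_\delta(s)$ for the Mellin transform of $\eta(t) e(\delta t)$. Then
    \[
    \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le \left( 11.308 \sqrt{T_0'}\, e^{-0.1598 T_0'} + 16.147 |\delta|\, e^{-0.1065 \left( \frac{T_0'}{\pi\delta} \right)^2} \right) \log \frac{q T_0}{2\pi},
    \]
    where $T_0' = T_0 - H$.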

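    Proof. As usual,
    \[
    \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| = \sum_{\substack{\rho \\ \Im(\rho) > T_0}} \left( |G_\delta(\rho)| + |G_\delta(1 - \rho)| \right).
    \]
    Let $F_\delta$ be as in (9.47). Then, since $\eta_+(t) e(\delta t) = h_H(t)\, t e^{-t^2/2} e(\delta t)$, where $h_H$ is as in (7.6), we see by (2.9) that
    \[
    G_\delta(s) = \frac{1}{2\pi} \int_{-H}^{H} Mh(ir)\, F_\delta(s + 1 - ir)\, dr,
    \]
    and so, since $|Mh(ir)| = |Mh(-ir)|$,
    \[
    |G_\delta(\rho)| + |G_\delta(1 - \rho)| \le \frac{1}{2\pi} \int_{-H}^{H} |Mh(ir)| \left( |F_\delta(1 + \rho - ir)| + |F_\delta(2 - (\rho - ir))| \right) dr. \tag{9.64}
    \]
    We apply Cor. 8.0.2 with $k = 1$ and $T_0 - H$ instead of $T_0$, and obtain that $|F_\delta(\rho)| + |F_\delta(1 - \rho)| \le g(\tau)$, where
    \[
    g(\tau) = \kappa_{1,1} \sqrt{|\tau|}\, e^{-0.1598 |\tau|} + \kappa_{1,0} \frac{|\tau|}{2\pi|\delta|}\, e^{-0.1065 \left( \frac{\tau}{\pi\delta} \right)^2}, \tag{9.65}
    \]
    where $\kappa_{1,0} = 4.903$ and $\kappa_{1,1} = 4.017$. (As in the proof of Lemmas 9.2.1 and 9.3.1, we are putting in extra terms so as to simplify our integrals.)

    From (9.64), we conclude that
    \[
    |G_\delta(\rho)| + |G_\delta(1 - \rho)| \le f(\tau)
    \]
    for $\rho = \sigma + i\tau$, $\tau > 0$, where
    \[
    f(\tau) = \frac{|Mh(ir)|_1}{2\pi} \cdot g(\tau - H)
    \]
    is decreasing for $\tau \ge T_0$ (because $g(\tau)$ is decreasing for $\tau \ge T_0 - H$). By (A.17), $|Mh(ir)|_1 \le 16.193918$.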


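    We apply Lemma 9.1.3, and get that
    \[
    \begin{aligned}
    \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |G_\delta(\rho)|
    &\le \int_{T_0}^\infty f(T) \left( \frac{1}{2\pi} \log \frac{qT}{2\pi} + \frac{1}{4T} \right) dT \\
    &= \frac{|Mh(ir)|_1}{2\pi} \int_{T_0}^\infty g(T - H) \left( \frac{1}{2\pi} \log \frac{qT}{2\pi} + \frac{1}{4T} \right) dT.
    \end{aligned} \tag{9.66}
    \]
    Now we just need to estimate some integrals. For any $y \ge e^2$, $c > 0$ and $\kappa, \kappa_1 \ge 0$,
    \[
    \int_y^\infty \sqrt{t}\, e^{-ct}\, dt \le \left( \frac{\sqrt{y}}{c} + \frac{1}{2 c^2 \sqrt{y}} \right) e^{-cy},
    \]
    \[
    \int_y^\infty \left( \sqrt{t} \log(t + \kappa) + \frac{\kappa_1}{\sqrt{t}} \right) e^{-ct}\, dt \le \left( \frac{\sqrt{y}}{c} + \frac{a}{c^2 \sqrt{y}} \right) \log(y + \kappa)\, e^{-cy},
    \]
    where
    \[
    a = \frac{1}{2} + \frac{1 + c \kappa_1}{\log(y + \kappa)}.
    \]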

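    The contribution of the exponential term in (9.65) to (9.66) thus equals
    \[
    \begin{aligned}
    &\frac{\kappa_{1,1} |Mh(ir)|_1}{2\pi} \int_{T_0}^\infty \left( \frac{1}{2\pi} \log \frac{qT}{2\pi} + \frac{1}{4T} \right) \sqrt{T - H} \cdot e^{-0.1598 (T - H)}\, dT \\
    &\quad \le 10.3532 \int_{T_0 - H}^\infty \left( \frac{1}{2\pi} \log(T + H) + \frac{\log \frac{q}{2\pi}}{2\pi} + \frac{1}{4T} \right) \sqrt{T}\, e^{-0.1598 T}\, dT \\
    &\quad \le \frac{10.3532}{2\pi} \left( \frac{\sqrt{T_0 - H}}{0.1598} + \frac{a}{0.1598^2 \sqrt{T_0 - H}} \right) \log \frac{q T_0}{2\pi} \cdot e^{-0.1598 (T_0 - H)},
    \end{aligned} \tag{9.67}
    \]
    where $a = 1/2 + (1 + 0.1598\pi/2)/\log T_0$. Since $T_0 - H \ge 50$ and $T_0 \ge 50 + 25 = 75$, this is at most
    \[
    11.308 \sqrt{T_0 - H}\, \log \frac{q T_0}{2\pi} \cdot e^{-0.1598 (T_0 - H)}.
    \]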

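    We now estimate a few more integrals so that we can handle the Gaussian term in (9.65). For any $y > 1$, $c > 0$, $\kappa, \kappa_1 \ge 0$,
    \[
    \int_y^\infty t e^{-ct^2}\, dt = \frac{e^{-cy^2}}{2c},
    \]
    \[
    \int_y^\infty \left( t \log(t + \kappa) + \kappa_1 \right) e^{-ct^2}\, dt \le \left( 1 + \frac{\kappa_1 + \frac{1}{2cy}}{y \log(y + \kappa)} \right) \frac{\log(y + \kappa) \cdot e^{-cy^2}}{2c}.
    \]
    Proceeding just as before, we see that the contribution of the Gaussian term in (9.65) to (9.66) is at most
    \[
    \begin{aligned}
    &\frac{\kappa_{1,0} |Mh(ir)|_1}{2\pi} \int_{T_0}^\infty \left( \frac{1}{2\pi} \log \frac{qT}{2\pi} + \frac{1}{4T} \right) \frac{T - H}{2\pi|\delta|} \cdot e^{-0.1065 \left( \frac{T - H}{\pi\delta} \right)^2}\, dT \\
    &\quad \le 12.6368 \cdot \frac{|\delta|}{4} \int_{\frac{T_0 - H}{\pi|\delta|}}^\infty \left( \log\left( T + \frac{H}{\pi|\delta|} \right) + \log \frac{q|\delta|}{2} + \frac{\pi/2}{T} \right) T e^{-0.1065 T^2}\, dT \\
    &\quad \le 12.6368 \cdot \frac{|\delta|}{8 \cdot 0.1065} \left( 1 + \frac{\frac{\pi}{2} + \frac{\pi|\delta|}{2 \cdot 0.1065 \cdot (T_0 - H)}}{\frac{T_0 - H}{\pi|\delta|} \log \frac{T_0}{\pi|\delta|}} \right) \log \frac{q T_0}{2\pi} \cdot e^{-0.1065 \left( \frac{T_0 - H}{\pi\delta} \right)^2}.
    \end{aligned} \tag{9.68}
    \]
    Since $(T_0 - H)/(\pi|\delta|) \ge 10$, this is at most
    \[
    16.147 |\delta| \log \frac{q T_0}{2\pi} \cdot e^{-0.1065 \left( \frac{T_0 - H}{\pi\delta} \right)^2}.
    \]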
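
    The numerical constants in the Gaussian part of this proof can be reproduced with a few lines of arithmetic. This is a floating-point sketch, not a rigorous verification; it assumes the readings $\kappa_{1,0} = 4.903$, $\kappa_{1,1} = 4.017$ and $|Mh(ir)|_1 \le 16.193918$ of the constants quoted above, and the variable names are illustrative.

```python
import math

KAPPA_10, KAPPA_11 = 4.903, 4.017   # constants quoted with (9.65)
MH_L1 = 16.193918                   # bound on |Mh(ir)|_1 quoted from (A.17)
C_GAUSS = 0.1065

# Prefactors kappa * |Mh(ir)|_1 / (2 pi) appearing in (9.67) and (9.68).
exp_prefactor = KAPPA_11 * MH_L1 / (2 * math.pi)
gauss_prefactor = KAPPA_10 * MH_L1 / (2 * math.pi)

def gauss_factor(y, log_arg):
    """Bracket in (9.68): y = (T0 - H)/(pi|delta|), log_arg = T0/(pi|delta|)."""
    return 1 + (math.pi / 2 + 1 / (2 * C_GAUSS * y)) / (y * math.log(log_arg))

# Worst case allowed by the hypotheses: y = 10, and T0/(pi|delta|) >= y.
gauss_constant = 12.6368 / (8 * C_GAUSS) * gauss_factor(10.0, 10.0)
print(exp_prefactor, gauss_prefactor, gauss_constant)
```

    The bracket is decreasing in both of its arguments, so evaluating it at the smallest admissible values gives the largest constant.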

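    Proposition 9.4.2. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \ge 10^3$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s, \chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

    Then
    \[
    \sum_{n=1}^\infty \Lambda(n) \chi(n)\, e\!\left( \frac{\delta}{x} n \right) \eta_+(n/x) =
    \begin{cases}
    \widehat{\eta_+}(-\delta)\, x + O^*\!\left( \mathrm{err}_{\eta_+,\chi}(\delta, x) \right) \cdot x & \text{if } q = 1, \\
    O^*\!\left( \mathrm{err}_{\eta_+,\chi}(\delta, x) \right) \cdot x & \text{if } q > 1,
    \end{cases} \tag{9.69}
    \]
    where
    \[
    \begin{aligned}
    \mathrm{err}_{\eta_+,\chi}(\delta, x) ={}& \left( 11.308 \sqrt{T_0'} \cdot e^{-0.1598 T_0'} + 16.147 |\delta|\, e^{-0.1065 \left( \frac{T_0'}{\pi\delta} \right)^2} \right) \log \frac{q T_0}{2\pi} \\
    &+ \left( 1.634 \sqrt{T_0} \log q T_0 + 12.43 \sqrt{T_0} + 1.321 \log q + 34.51 \right) x^{-1/2} \\
    &+ (9 + 11|\delta|)\, x^{-1} + (\log q)(11 + 6|\delta|)\, x^{-3/2},
    \end{aligned} \tag{9.70}
    \]
    where $T_0' = T_0 - H$.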

    Proof. We can apply Lemmas 9.1.1 and 9.1.4 because $\eta_+(t)$, $(\log t)\eta_+(t)$ and $\eta_+'(t)$ are in $\ell_2$ (by (A.25), (A.28) and (A.32)) and $\eta_+(t) t^{\sigma - 1}$ and $\eta_+'(t) t^{\sigma - 1}$ are in $\ell_1$ for $\sigma$ in an open interval containing $[1/2, 3/2]$ (by (A.30) and (A.33)). (Because of (9.5), the fact that $\eta_+(t) t^{-1/2}$ and $\eta_+(t) t^{1/2}$ are in $\ell_1$ implies that $\eta_+(t) \log t$ is also in $\ell_1$, as is required by Lemma 9.1.4.)

    We apply Lemmas 9.1.1, 9.1.4 and 9.4.1. We bound the norms involving $\eta_+$ using the estimates in §A.3 and §A.4. Since $\eta_+(0) = 0$ (by the definition (A.3) of $\eta_+$), the term $R$ in (9.2) is at most $c_0$, where $c_0$ is as in (9.3). We bound
    \[
    \begin{aligned}
    c_0 \le{}& \frac{2}{3} \left( 2.922875 \left( \sqrt{\Gamma(1/2)} + \sqrt{\Gamma(3/2)} \right) + 1.062319 \left( \sqrt{\Gamma(5/2)} + \sqrt{\Gamma(7/2)} \right) \right) \\
    &+ \frac{4\pi}{3} |\delta| \cdot 1.062319 \left( \sqrt{\Gamma(3/2)} + \sqrt{\Gamma(5/2)} \right) \le 6.536232 + 9.319578 |\delta|
    \end{aligned}
    \]
    using (A.30) and (A.33). By (A.25), (A.32) and the assumption $H \ge 25$,
    \[
    |\eta_+|_2 \le 0.80365, \qquad |\eta_+'|_2 \le 10.845789.
    \]
    Thus, the error terms in (9.1) total at most
    \[
    \begin{aligned}
    &6.536232 + 9.319578 |\delta| + (\log q + 6.01)(10.845789 + 2\pi \cdot 0.80365 |\delta|)\, x^{-1/2} \\
    &\quad \le 9 + 11|\delta| + (\log q)(11 + 6|\delta|)\, x^{-1/2}.
    \end{aligned} \tag{9.71}
    \]
    The part of the sum $\sum_\rho G_\delta(\rho) x^\rho$ in (9.1) corresponding to zeros $\rho$ with $|\Im(\rho)| > T_0$ gets estimated by Lem. 9.4.1. By Lemma 9.1.4, the part of the sum corresponding to zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most
    \[
    \left( 1.634 \sqrt{T_0} \log q T_0 + 12.43 \sqrt{T_0} + 1.321 \log q + 34.51 \right) x^{1/2},
    \]
    where we estimate the norms $|\eta_+|_2$, $|\eta \cdot \log|_2$ and $|\eta(t)/\sqrt{t}|_1$ by (A.25), (A.28) and (A.30).
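
    The bound on $c_0$ in this proof is plain arithmetic with values of the Gamma function, so it can be reproduced directly. The sketch below is illustrative and non-rigorous (ordinary floating point); it assumes that the coefficient of $|\delta|$ in the bound is $4\pi/3$, which is the value that reproduces the stated constant $9.319578$, and the function names are hypothetical.

```python
import math

def sqrt_gamma(z):
    """Square root of Gamma(z) for real z > 0."""
    return math.sqrt(math.gamma(z))

# Constant part and |delta|-coefficient of the bound on c_0.
c0_const = (2.0 / 3.0) * (
    2.922875 * (sqrt_gamma(0.5) + sqrt_gamma(1.5))
    + 1.062319 * (sqrt_gamma(2.5) + sqrt_gamma(3.5))
)
c0_delta = (4.0 * math.pi / 3.0) * 1.062319 * (sqrt_gamma(1.5) + sqrt_gamma(2.5))

def error_before(q, delta, x):
    """First expression in (9.71), with the quoted norm bounds plugged in."""
    d = abs(delta)
    return (6.536232 + 9.319578 * d
            + (math.log(q) + 6.01) * (10.845789 + 2 * math.pi * 0.80365 * d) * x ** -0.5)

def error_after(q, delta, x):
    """Simplified upper bound in (9.71)."""
    d = abs(delta)
    return 9 + 11 * d + math.log(q) * (11 + 6 * d) * x ** -0.5

print(c0_const, c0_delta)
```

    For instance, `error_before(1, 0.0, 1e3) <= error_after(1, 0.0, 1e3)` can be evaluated at the smallest admissible $x$, which is where the simplification in (9.71) is tightest.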

    9.5 A sum for $\eta_+(t)^2$

    Using a smoothing function sometimes leads to considering sums involving the square of the smoothing function. In particular, in Part III, we will need a result involving $\eta_+^2$, something that could be slightly challenging to prove, given the way in which $\eta_+$ is defined. Fortunately, we have bounds on $|\eta_+|_\infty$ and other $\ell_\infty$-norms (see Appendix A.5). Our task will also be made easier by the fact that we do not have a phase $e(\delta n/x)$ this time. All in all, this will be yet another demonstration of the generality of the framework developed in §9.1.

    Proposition 9.5.1. Let $\eta = \eta_+$ be as in (9.63), $H \ge 25$. Let $x \ge 10^8$. Assume that all non-trivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge \max(2H + 25, 200)$.

    Then
    \[
    \sum_{n=1}^\infty \Lambda(n)(\log n)\, \eta_+^2(n/x) = x \cdot \int_0^\infty \eta_+^2(t) \log xt\, dt + O^*(\mathrm{err}_{\ell_2,\eta_+}) \cdot x \log x, \tag{9.72}
    \]
    where
    \[
    \begin{aligned}
    \mathrm{err}_{\ell_2,\eta_+} ={}& \left( \left( 0.462 \frac{(\log T_1)^2}{\log x} + 0.909 \log T_1 \right) T_1 + 1.71 \left( 1 + \frac{\log T_1}{\log x} \right) H \right) e^{-\frac{\pi}{4} T_1} \\
    &+ \left( 2.445 \sqrt{T_0} \log T_0 + 50.04 \right) \cdot x^{-1/2}
    \end{aligned} \tag{9.73}
    \]
    and $T_1 = T_0 - 2H$.

    The assumption $T_0 \ge 200$ is stronger than what we strictly need, but, as it happens, we could make much stronger assumptions still. Proposition 9.5.1 relies on a verification of zeros of the Riemann zeta function; such verifications have gone up to values of $T_0$ much higher than 200.


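    Proof. We will need to consider two smoothing functions, namely, $\eta_{+,0}(t) = \eta_+(t)^2$ and $\eta_{+,1}(t) = \eta_+(t)^2 \log t$. Clearly,
    \[
    \sum_{n=1}^\infty \Lambda(n)(\log n)\, \eta_+^2(n/x) = (\log x) \sum_{n=1}^\infty \Lambda(n)\, \eta_{+,0}(n/x) + \sum_{n=1}^\infty \Lambda(n)\, \eta_{+,1}(n/x).
    \]
    Since $\eta_+(t) = h_H(t)\, t e^{-t^2/2}$,
    \[
    \eta_{+,0}(t) = h_H^2(t)\, t^2 e^{-t^2}, \qquad \eta_{+,1}(t) = h_H^2(t) (\log t)\, t^2 e^{-t^2}.
    \]
    Let $\eta_{+,2} = (\log x)\eta_{+,0} + \eta_{+,1} = \eta_+^2(t) \log xt$.

    We wish to apply Lemma 9.1.1. For this, we must first check that some norms are finite. Clearly,
    \[
    \begin{aligned}
    \eta_{+,2}(t) &= \eta_+^2(t) \log x + \eta_+^2(t) \log t, \\
    \eta_{+,2}'(t) &= 2\eta_+(t)\eta_+'(t) \log x + 2\eta_+(t)\eta_+'(t) \log t + \eta_+^2(t)/t.
    \end{aligned} \tag{9.74}
    \]
    Thus, we see that $\eta_{+,2}(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$ and $\eta_+(t)$, $\eta_+(t) \log t$ are both in $\ell_\infty$ (see (A.25), (A.38), (A.40)):
    \[
    \begin{aligned}
    |\eta_{+,2}(t)|_2 &\le \left| \eta_+^2(t) \right|_2 \log x + \left| \eta_+^2(t) \log t \right|_2 \\
    &\le |\eta_+|_\infty |\eta_+|_2 \log x + |\eta_+(t) \log t|_\infty |\eta_+|_2.
    \end{aligned} \tag{9.75}
    \]
    Similarly, $\eta_{+,2}'(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$, $\eta_+'(t)$ is in $\ell_2$ (A.32), and $\eta_+(t)$, $\eta_+(t) \log t$ and $\eta_+(t)/t$ (see (A.41)) are all in $\ell_\infty$:
    \[
    \begin{aligned}
    \left| \eta_{+,2}'(t) \right|_2 &\le \left| 2\eta_+(t)\eta_+'(t) \right|_2 \log x + \left| 2\eta_+(t)\eta_+'(t) \log t \right|_2 + \left| \eta_+^2(t)/t \right|_2 \\
    &\le 2 |\eta_+|_\infty \left| \eta_+' \right|_2 \log x + 2 |\eta_+(t) \log t|_\infty \left| \eta_+' \right|_2 + |\eta_+(t)/t|_\infty |\eta_+|_2.
    \end{aligned} \tag{9.76}
    \]
    In the same way, we see that $\eta_{+,2}(t) t^{\sigma - 1}$ is in $\ell_1$ for all $\sigma$ in $(-1, \infty)$ (because the same is true of $\eta_+(t) t^{\sigma - 1}$ (A.30), and $\eta_+(t)$, $\eta_+(t) \log t$ are both in $\ell_\infty$) and $\eta_{+,2}'(t) t^{\sigma - 1}$ is in $\ell_1$ for all $\sigma$ in $(0, \infty)$ (because the same is true of $\eta_+(t) t^{\sigma - 1}$ and $\eta_+'(t) t^{\sigma - 1}$ (A.33), and $\eta_+(t)$, $\eta_+(t) \log t$, $\eta_+(t)/t$ are all in $\ell_\infty$).

    We now apply Lemma 9.1.1 with $q = 1$, $\delta = 0$. Since $\eta_{+,2}(0) = 0$, the residue term $R$ equals $c_0$, which, by (9.74), is at most $2/3$ times
    \[
    2 \left( |\eta_+|_\infty \log x + |\eta_+(t) \log t|_\infty \right) \left( \left| \eta_+'(t)/\sqrt{t} \right|_1 + \left| \eta_+'(t) \sqrt{t} \right|_1 \right) + |\eta_+(t)/t|_\infty \left( \left| \eta_+(t)/\sqrt{t} \right|_1 + \left| \eta_+(t) \sqrt{t} \right|_1 \right).
    \]
    Using the bounds (A.38), (A.40), (A.41) (with the assumption $H \ge 25$), (A.30) and (A.33), we get that this means that
    \[
    c_0 \le 18.57606 \log x + 8.63264.
    \]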


    Since $q = 1$ and $\delta = 0$, we get from (9.76) (and (A.38), (A.40), (A.41), with the assumption $H \ge 25$, and also (A.25) and (A.32)) that
    \[
    (\log q + 6.01) \cdot \left( \left| \eta_{+,2}' \right|_2 + 2\pi|\delta|\, |\eta_{+,2}|_2 \right) x^{-1/2} = 6.01 \left| \eta_{+,2}' \right|_2 x^{-1/2} \le (162.56 \log x + 59.325)\, x^{-1/2}.
    \]
    Using the assumption $x \ge 10^8$, we obtain
    \[
    c_0 + (185.26 \log x + 71.799)\, x^{-1/2} \le 19.064 \log x. \tag{9.77}
    \]
    We will now apply Lemma 9.1.4, as we may because of the finiteness of the norms we have already checked, together with
    \[
    \begin{aligned}
    |\eta_{+,2}(t) \log t|_2 &\le \left| \eta_+^2(t) \log t \right|_2 \log x + \left| \eta_+^2(t) (\log t)^2 \right|_2 \\
    &\le |\eta_+(t) \log t|_\infty \left( |\eta_+(t)|_2 \log x + |\eta_+(t) \log t|_2 \right) \\
    &\le 0.4976 \cdot (0.80365 \log x + 0.82999) \le 0.3999 \log x + 0.41301
    \end{aligned} \tag{9.78}
    \]
    (by (A.40), (A.25) and (A.28); use the assumption $H \ge 25$). We also need the bounds
    \[
    |\eta_{+,2}(t)|_2 \le 1.14199 \log x + 0.39989 \tag{9.79}
    \]
    (from (9.75), by the norm bounds (A.38), (A.40) and (A.25), all with $H \ge 25$) and
    \[
    \left| \eta_{+,2}(t)/\sqrt{t} \right|_1 \le \left( |\eta_+(t)|_\infty \log x + |\eta_+(t) \log t|_\infty \right) \left| \eta_+(t)/\sqrt{t} \right|_1 \le 1.4211 \log x + 0.49763 \tag{9.80}
    \]
    (by (A.38), (A.40) (again with $H \ge 25$) and (A.30)).

    Applying Lemma 9.1.4, we obtain that the sum $\sum_\rho |G_0(\rho)| x^\rho$ (where $G_0(\rho) = M\eta_{+,2}(\rho)$) over all non-trivial zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most $x^{1/2}$ times
    \[
    (1.54189 \log x + 0.8129) \sqrt{T_0} \log T_0 + (4.21245 \log x + 6.17301) \sqrt{T_0} + 49.1 \log x + 17.2, \tag{9.81}
    \]
    where we are bounding norms by (9.79), (9.78) and (9.80). (We are using the fact that $T_0 \ge 2\pi\sqrt{e}$ to ensure that the quantity $\sqrt{T_0} \log T_0 - (\log 2\pi\sqrt{e}) \sqrt{T_0}$ being multiplied by $|\eta_{+,2}|_2$ is positive; thus, an upper bound for $|\eta_{+,2}|_2$ suffices.) By the assumptions $x \ge 10^8$, $T_0 \ge 200$, (9.81) is at most
    \[
    (2.445 \sqrt{T_0} \log T_0 + 50.034) \log x.
    \]
    In comparison, $19.064\, x^{-1/2} \log x \le 0.002 \log x$, since $x \ge 10^8$.

    It remains to bound the sum of $M\eta_{+,2}(\rho)$ over zeros with $|\Im(\rho)| > T_0$. This we will do, as usual, by Lemma 9.1.3. For that, we will need to bound $M\eta_{+,2}(\rho)$ for $\rho$ in the critical strip.


    The Mellin transform of $e^{-t^2}$ is $\Gamma(s/2)/2$, and so the Mellin transform of $t^2 e^{-t^2}$ is $\Gamma(s/2 + 1)/2$. By (2.10), this implies that the Mellin transform of $(\log t)\, t^2 e^{-t^2}$ is $\Gamma'(s/2 + 1)/4$. Hence, by (2.9),
    \[
    M\eta_{+,2}(s) = \frac{1}{4\pi} \int_{-\infty}^{\infty} M(h_H^2)(ir) \cdot F_x(s - ir)\, dr, \tag{9.82}
    \]
    where
    \[
    F_x(s) = (\log x)\, \Gamma\!\left( \frac{s}{2} + 1 \right) + \frac{1}{2} \Gamma'\!\left( \frac{s}{2} + 1 \right). \tag{9.83}
    \]
    Moreover,
    \[
    M(h_H^2)(ir) = \frac{1}{2\pi} \int_{-\infty}^{\infty} Mh_H(iu)\, Mh_H(i(r - u))\, du, \tag{9.84}
    \]
    and so $M(h_H^2)(ir)$ is supported on $[-2H, 2H]$. We also see that $|Mh_H^2(ir)|_1 \le |Mh_H(ir)|_1^2 / 2\pi$. We know that $|Mh_H(ir)|_1^2 / 2\pi \le 41.73727$ by (A.17).

    Hence
    \[
    \begin{aligned}
    |M\eta_{+,2}(s)| &\le \frac{1}{4\pi} \int_{-\infty}^{\infty} |M(h_H^2)(ir)|\, dr \cdot \max_{|r| \le 2H} |F_x(s - ir)| \\
    &\le \frac{41.73727}{4\pi} \cdot \max_{|r| \le 2H} |F_x(s - ir)| \le 3.32135 \cdot \max_{|r| \le 2H} |F_x(s - ir)|.
    \end{aligned} \tag{9.85}
    \]

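    By (8.51) (Stirling with explicit constants),
    \[
    |\Gamma(s)| \le \sqrt{2\pi}\, |s|^{\sigma - \frac{1}{2}}\, e^{\frac{1}{12|s|} + \frac{\sqrt{2}}{180 |s|^3}}\, e^{-\pi |\Im(s)|/2} \tag{9.86}
    \]
    when $\Re(s) \ge 0$, and so
    \[
    \begin{aligned}
    |\Gamma(s)| &\le \sqrt{2\pi} \left( \frac{\sqrt{12.5^2 + 1.5^2}}{12.5} \right) e^{\frac{1}{12 \cdot 12.5} + \frac{\sqrt{2}}{180 \cdot 12.5^3}} \cdot |\Im(s)|\, e^{-\pi |\Im(s)|/2} \\
    &\le 2.542\, |\Im(s)|\, e^{-\pi |\Im(s)|/2}
    \end{aligned} \tag{9.87}
    \]
    for $s \in \mathbb{C}$ with $0 < \Re(s) \le 3/2$ and $|\Im(s)| \ge 25/2$. Moreover, by [OLBC10, 5.11.2] and the remarks at the beginning of [OLBC10, 5.11(ii)],
    \[
    \frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O^*\!\left( \frac{1}{12 |s|^2} \cdot \frac{1}{\cos^3(\theta/2)} \right)
    \]
    for $|\arg(s)| < \theta$ ($\theta \in (-\pi, \pi)$). Again, for $s = \sigma + i\tau$ with $0 < \sigma \le 3/2$ and $|\tau| \ge 25/2$, this gives us
    \[
    \begin{aligned}
    \frac{\Gamma'(s)}{\Gamma(s)} &= \log |\tau| + \log \frac{\sqrt{|\tau|^2 + 1.5^2}}{|\tau|} + O^*\!\left( \frac{1}{2|\tau|} \right) + O^*\!\left( \frac{1}{12 |\tau|^2} \cdot \frac{1}{(1/\sqrt{2})^3} \right) \\
    &= \log |\tau| + O^*\!\left( \frac{9}{8 |\tau|^2} + \frac{1}{2|\tau|} \right) + \frac{O^*(0.236)}{|\tau|^2}
    = \log |\tau| + O^*\!\left( \frac{0.609}{|\tau|} \right).
    \end{aligned}
    \]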


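    Hence, for $0 \le \Re(s) \le 1$ (or, in fact, $-2 \le \Re(s) \le 1$) and $|\Im(s)| \ge 25$, writing $\tau = \Im(s)$,
    \[
    \begin{aligned}
    |F_x(s)| &\le \left( (\log x) + \frac{1}{2} \log \left| \frac{\tau}{2} \right| + \frac{1}{2}\, O^*\!\left( \frac{0.609}{|\tau/2|} \right) \right) \left| \Gamma\!\left( \frac{s}{2} + 1 \right) \right| \\
    &\le 2.542 \left( (\log x) + \frac{1}{2} \log |\tau| - 0.297 \right) \frac{|\tau|}{2}\, e^{-\pi |\tau|/4}.
    \end{aligned} \tag{9.88}
    \]
    Thus, by (9.85), for $\rho = \sigma + i\tau$ with $|\tau| \ge T_0 \ge 2H + 25$ and $0 \le \sigma \le 1$,
    \[
    |M\eta_{+,2}(\rho)| \le f(|\tau|),
    \]
    where
    \[
    f(T) = 8.45 \left( \log x + \frac{1}{2} \log T \right) \left( \frac{T}{2} - H \right) \cdot e^{-\frac{\pi (T - 2H)}{4}}. \tag{9.89}
    \]
    The functions $t \mapsto t e^{-\pi t/2}$ and $t \mapsto (\log t)\, t e^{-\pi t/2}$ are decreasing for $t \ge e$ (or, in fact, for $t \ge 1.762$); setting $t = T/2 - H$, we see that the right side of (9.89) is a decreasing function of $T$ for $T \ge T_0$, since $T_0/2 - H \ge 25/2 > e$.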

    We can now apply Lemma 9.1.3, and get that
    \[
    \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |M\eta_{+,2}(\rho)| \le \int_{T_0}^\infty f(T) \left( \frac{1}{2\pi} \log \frac{T}{2\pi} + \frac{1}{4T} \right) dT. \tag{9.90}
    \]
    Since $T \ge T_0 \ge 75 > 2$, we know that $((1/2\pi) \log(T/2\pi) + 1/4T) \le (1/2\pi) \log T$. Hence, the right side of (9.90) is at most
    \[
    \begin{aligned}
    &\frac{8.39}{4\pi} \int_{T_0}^\infty \left( (\log x)(\log T) + \frac{(\log T)^2}{2} \right) (T - 2H)\, e^{-\frac{\pi (T - 2H)}{4}}\, dT \\
    &\quad \le 0.668 \int_{T_1}^\infty \left( (\log x) \left( \log t + \frac{2H}{t} \right) + \left( \frac{(\log t)^2}{2} + 2H \frac{\log t}{t} \right) \right) t e^{-\frac{\pi t}{4}}\, dt,
    \end{aligned} \tag{9.91}
    \]
    where $T_1 = T_0 - 2H$ and $t = T - 2H$; we are using the facts that $(\log t)'' < 0$ for $t > 0$, and $((\log t)^2)'' < 0$ for $t > e$. (Of course, $T_1 \ge 25 > e$.)

    Of course, $\int_{T_1}^\infty e^{-(\pi/4) t}\, dt = (4/\pi) e^{-(\pi/4) T_1}$. We recall (9.36) and (9.50):
    \[
    \int_{T_1}^\infty \log t \cdot e^{-\frac{\pi}{4} t}\, dt \le \left( \log T_1 + \frac{4/\pi}{T_1} \right) \frac{e^{-\frac{\pi}{4} T_1}}{\pi/4},
    \]
    \[
    \int_{T_1}^\infty (\log t)\, t e^{-\frac{\pi}{4} t}\, dt \le \left( T_1 + \frac{4a}{\pi} \right) \frac{e^{-\frac{\pi}{4} T_1} \log T_1}{\pi/4}
    \]
    for $T_1 \ge 1$ satisfying $\log T_1 > 4/(\pi T_1)$, where $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$. It is easy to check that $\log T_1 > 4/(\pi T_1)$ and $4a/\pi \le 1.6957$ for $T_1 \ge 25$; of course, we also have $(4/\pi)/25 \le 0.051$. Lastly,
    \[
    \int_{T_1}^\infty (\log t)^2\, t e^{-\frac{\pi}{4} t}\, dt \le \left( T_1 + \frac{4b}{\pi} \right) \frac{e^{-\frac{\pi}{4} T_1} (\log T_1)^2}{\pi/4}
    \]
    for $T_1 \ge e$, where $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$, and we check that $4b/\pi \le 2.1319$ for $T_1 \ge 25$. We conclude that the integral on the second line of (9.91) is at most
    \[
    \frac{4}{\pi} \left( \frac{(\log T_1)^2}{2} (T_1 + 2.132) + (\log x)(\log T_1)(T_1 + 1.696) \right) e^{-\frac{\pi}{4} T_1} + \frac{4}{\pi} \cdot 2H (\log T_1 + 0.051 + \log x)\, e^{-\frac{\pi}{4} T_1}.
    \]
    Multiplying this by 0.668 and simplifying further (using $T_1 \ge 25$), we conclude that $\sum_{\rho : |\Im(\rho)| > T_0} |M\eta_{+,2}(\rho)|$ is at most
    \[
    \left( (0.462 \log T_1 + 0.909 \log x)(\log T_1)\, T_1 + 1.71 (\log T_1 + \log x) H \right) e^{-\frac{\pi}{4} T_1}.
    \]
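
    The constants at the end of this proof follow from evaluating the auxiliary quantities $a$ and $b$ at the worst case $T_1 = 25$. A short floating-point sketch (non-rigorous; it assumes the readings $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$ and $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$ of the formulas above, and the names are illustrative):

```python
import math

def a_const(T1):
    """a = 1 + (1 + 4/(pi T1)) / (log T1 - 4/(pi T1))."""
    u = 4 / (math.pi * T1)
    return 1 + (1 + u) / (math.log(T1) - u)

def b_const(T1):
    """b = 1 + (2 + 8/(pi T1)) / (log T1 - 8/(pi T1))."""
    u = 8 / (math.pi * T1)
    return 1 + (2 + u) / (math.log(T1) - u)

T1 = 25.0  # smallest value allowed by T0 >= 2H + 25
four_a_over_pi = 4 * a_const(T1) / math.pi   # text: <= 1.6957
four_b_over_pi = 4 * b_const(T1) / math.pi   # text: <= 2.1319

# Coefficients after multiplying by 0.668, using T1 >= 25 and x >= 10^8.
coef_log_sq = 0.668 * (4 / math.pi) / 2 * (1 + 2.132 / T1)    # text: 0.462
coef_log_x = 0.668 * (4 / math.pi) * (1 + 1.696 / T1)         # text: 0.909
coef_H = 0.668 * (4 / math.pi) * 2 * (1 + 0.051 / (math.log(T1) + math.log(1e8)))  # text: 1.71
print(four_a_over_pi, four_b_over_pi, coef_log_sq, coef_log_x, coef_H)
```

    Both $a$ and $b$ decrease as $T_1$ grows, which is why checking the endpoint $T_1 = 25$ is enough.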

    9.6 A verification of zeros and its consequences

    David Platt verified in his doctoral thesis [Pla11] that, for every primitive character $\chi$ of conductor $q \le 10^5$, all the non-trivial zeroes of $L(s, \chi)$ with imaginary part $\le 10^8/q$ lie on the critical line, i.e., have real part exactly $1/2$. (We call this a GRH verification up to $10^8/q$.)

    In work undertaken in coordination with the present work [Plab], Platt has extended these computations to

    • all odd $q \le 3 \cdot 10^5$, with $T_q = 10^8/q$,

    • all even $q \le 4 \cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5 \cdot 10^7/q)$.

    The method used was rigorous; its implementation uses interval arithmetic.

    Let us see what this verification gives us when used as an input to Prop. 9.2.2. We are interested in bounds on $|\mathrm{err}_{\eta,\chi^*}(\delta, x)|$ for $q \le r$ and $|\delta| \le 4r/q$. We set $r = 3 \cdot 10^5$. (We will not be using the verification for $q$ even with $3 \cdot 10^5 < q \le 4 \cdot 10^5$, though we certainly could.)

    We let $T_0 = 10^8/q$. Thus,
    \[
    T_0 \ge \frac{10^8}{3 \cdot 10^5} = \frac{1000}{3}, \qquad \frac{T_0}{\pi|\delta|} \ge \frac{10^8/q}{\pi \cdot 4r/q} = \frac{1000}{12\pi}, \tag{9.92}
    \]
    and so, by $|\delta| \le 4r/q \le 1.2 \cdot 10^6/q \le 1.2 \cdot 10^6$,
    \[
    3.53\, e^{-0.1598 T_0} \le 2.597 \cdot 10^{-23},
    \]
    \[
    22.5 \frac{\delta^2}{T_0}\, e^{-0.1065 \frac{T_0^2}{(\pi\delta)^2}} \le |\delta| \cdot 7.715 \cdot 10^{-34} \le 9.258 \cdot 10^{-28}.
    \]
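
    These two bounds come from inserting the extreme values allowed by (9.92). The following floating-point sketch reproduces them; it is a plausibility check rather than a rigorous computation, it assumes the readings $3.53$, $22.5$, $0.1598$ and $0.1065$ of the constants above, and the variable names are illustrative.

```python
import math

r = 3 * 10**5
T0_min = 10**8 / r                    # T0 = 10^8/q with q <= r
ratio_min = 1000 / (12 * math.pi)     # lower bound for T0/(pi|delta|) in (9.92)
delta_max = 4 * r                     # |delta| <= 4r/q <= 1.2 * 10^6

# Exponential term: decreasing in T0, so the smallest T0 is the worst case.
exp_term = 3.53 * math.exp(-0.1598 * T0_min)

# Gaussian term: 22.5 delta^2/T0 = 22.5 |delta| * (|delta|/T0), and
# |delta|/T0 <= (4r/q)/(10^8/q) = 12/1000.
gauss_per_delta = 22.5 * (12 / 1000) * math.exp(-0.1065 * ratio_min**2)
gauss_term = gauss_per_delta * delta_max
print(exp_term, gauss_per_delta, gauss_term)
```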


    Since $q T_0 \le 10^8$, this gives us that
    \[
    \log \frac{q T_0}{2\pi} \cdot \left( 3.53\, e^{-0.1598 T_0} + 22.5 \frac{\delta^2}{T_0}\, e^{-0.1065 \frac{T_0^2}{(\pi\delta)^2}} \right) \le 4.3054 \cdot 10^{-22} + \frac{1.54 \cdot 10^{-26}}{q} \le 4.306 \cdot 10^{-22}.
    \]
    Again by $T_0 = 10^8/q$,
    \[
    2.337 \sqrt{T_0} \log q T_0 + 21.817 \sqrt{T_0} + 2.85 \log q + 74.38
    \]
    is at most
    \[
    \frac{648662}{\sqrt{q}} + 111,
    \]
    and
    \[
    3 \log q + 14|\delta| + 17 \le 55 + \frac{1.7 \cdot 10^7}{q}, \qquad (\log q + 6) \cdot (1 + 5|\delta|) \le 19 + \frac{1.2 \cdot 10^8}{q}.
    \]
    Hence, assuming $x \ge 10^8$ to simplify, we see that Prop. 9.2.2 gives us that
    \[
    \begin{aligned}
    \mathrm{err}_{\eta,\chi}(\delta, x) &\le 4.306 \cdot 10^{-22} + \frac{\frac{648662}{\sqrt{q}} + 111}{\sqrt{x}} + \frac{55 + \frac{1.7 \cdot 10^7}{q}}{x} + \frac{19 + \frac{1.2 \cdot 10^8}{q}}{x^{3/2}} \\
    &\le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left( \frac{650400}{\sqrt{q}} + 112 \right)
    \end{aligned}
    \]
    for $\eta(t) = e^{-t^2/2}$. This proves Theorem 7.1.1.

    Let us now see what Platt's calculations give us when used as an input to Prop. 9.3.2 and Cor. 9.3.3. Again, we set $r = 3 \cdot 10^5$, $\delta_0 = 8$, $|\delta| \le 4r/q$ and $T_0 = 10^8/q$, so (9.92) is still valid. We obtain
    \[
    \begin{aligned}
    &T_0 \log \frac{q T_0}{2\pi} \cdot \left( 6.11\, e^{-0.1598 T_0} + 1.578\, e^{-0.1065 \cdot \frac{T_0^2}{(\pi\delta)^2}} \right) \\
    &\quad \le \log \frac{10^8}{2\pi} \left( 6.11 \cdot \frac{1000}{3}\, e^{-0.1598 \cdot \frac{1000}{3}} + 10^8 \cdot 1.578\, e^{-0.1065 \left( \frac{1000}{12\pi} \right)^2} \right) \le 2.485 \cdot 10^{-19},
    \end{aligned}
    \]
    since $t \exp(-0.1598 t)$ is decreasing on $t$ for $t \ge 1/0.1598$. We use the same bound when we have 0.0102 instead of 1.578 on the left side, as in (9.61). (The coefficient affects what is by far the smaller term, so we are wasting nothing.) Again by $T_0 = 10^8/q$ and $q \le r$,
    \[
    \begin{aligned}
    1.22 \sqrt{T_0} \log q T_0 + 5.053 \sqrt{T_0} + 1.423 \log q + 37.19 &\le \frac{279793}{\sqrt{q}} + 55.2, \\
    1.679 \sqrt{T_0} \log q T_0 + 6.957 \sqrt{T_0} + 1.958 \log q + 51.17 &\le \frac{378854}{\sqrt{q}} + 75.9.
    \end{aligned}
    \]


    For $x \ge 10^8$, we use $|\delta| \le 4r/q \le 1.2 \cdot 10^6/q$ to bound
    \[
    \begin{aligned}
    (3 + 11|\delta|)\, x^{-1} + (\log q + 6) \cdot (1 + 6|\delta|) \cdot x^{-3/2} &\le \left( 0.0004 + \frac{1322}{q} \right) x^{-1/2}, \\
    (6 + 22|\delta|)\, x^{-1} + (\log q + 6) \cdot (3 + 17|\delta|) \cdot x^{-3/2} &\le \left( 0.0007 + \frac{2644}{q} \right) x^{-1/2}.
    \end{aligned}
    \]
    Summing, we obtain
    \[
    \mathrm{err}_{\eta,\chi} \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}} \left( \frac{281200}{\sqrt{q}} + 56 \right)
    \]
    for $\eta(t) = t^2 e^{-t^2/2}$ and
    \[
    \mathrm{err}_{\eta,\chi} \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}} \left( \frac{381500}{\sqrt{q}} + 76 \right)
    \]
    for $\eta(t) = t^2 e^{-t^2/2} *_M \eta_2(t)$. This proves Theorem 7.1.2 and Corollary 7.1.3.

    Now let us work with the smoothing weight $\eta_+$. This time around, set $r = 150000$ if $q$ is odd, and $r = 300000$ if $q$ is even. As before, we assume
    \[
    q \le r, \qquad |\delta| \le 4r/q.
    \]
    We can see that Platt's verification [Plab], mentioned before, allows us to take
    \[
    T_0 = H + \frac{250 r}{q}, \qquad H = 200,
    \]
    since $T_q$ is always at least this ($T_q = 10^8/q \ge 200 + 7 \cdot 10^7/q > 200 + 3.75 \cdot 10^7/q$ for $q \le 150000$ odd, $T_q \ge 200 + 7.5 \cdot 10^7/q$ for $q \le 300000$ even).

    Thus,
    \[
    T_0 - H = \frac{250 r}{q} \ge \frac{250 r}{r} = 250, \qquad \frac{T_0 - H}{\pi\delta} \ge \frac{250 r}{\pi\delta q} \ge \frac{250}{4\pi} = 19.89436\dots
    \]
    and also
    \[
    T_0 \le 200 + 250 \cdot 150000 \le 3.751 \cdot 10^7, \qquad q T_0 \le rH + 250 r \le 1.35 \cdot 10^8.
    \]
    Hence, since $\sqrt{t}\, e^{-0.1598 t}$ is decreasing on $t$ for $t \ge 1/(2 \cdot 0.1598)$,
    \[
    \begin{aligned}
    11.308 \sqrt{T_0 - H}\, e^{-0.1598 (T_0 - H)} + 16.147 |\delta|\, e^{-0.1065 \frac{(T_0 - H)^2}{(\pi\delta)^2}}
    &\le 7.9854 \cdot 10^{-16} + \frac{4r}{q} \cdot 7.9814 \cdot 10^{-18} \\
    &\le 7.9854 \cdot 10^{-16} + \frac{9.5777 \cdot 10^{-12}}{q}.
    \end{aligned}
    \]
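
    The two numerical bounds just obtained can be reproduced by evaluating each term at the extreme values $T_0 - H = 250$ and $(T_0 - H)/(\pi|\delta|) = 250/(4\pi)$. This is an illustrative floating-point sketch under those assumptions, not a rigorous computation; the names are hypothetical.

```python
import math

y_min = 250.0                        # lower bound for T0 - H = 250 r / q
ratio_min = 250.0 / (4 * math.pi)    # lower bound for (T0 - H)/(pi|delta|)

# sqrt(t) e^{-0.1598 t} is decreasing for t >= 1/(2 * 0.1598), so t = 250 is the worst case.
exp_term = 11.308 * math.sqrt(y_min) * math.exp(-0.1598 * y_min)

# Gaussian term per unit of |delta|, then with |delta| <= 4r/q and r <= 300000.
gauss_per_delta = 16.147 * math.exp(-0.1065 * ratio_min**2)
gauss_times_q = 4 * 300000 * gauss_per_delta   # coefficient of 1/q
print(exp_term, gauss_per_delta, gauss_times_q)
```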


    Examining (9.70), we get
    \[
    \begin{aligned}
    \mathrm{err}_{\eta_+,\chi}(\delta, x) \le{}& \log \frac{1.35 \cdot 10^8}{2\pi} \cdot \left( 7.9854 \cdot 10^{-16} + \frac{9.5777 \cdot 10^{-12}}{q} \right) \\
    &+ \left( \left( 1.634 \log(1.35 \cdot 10^8) + 12.43 \right) \frac{\sqrt{1.35 \cdot 10^8}}{\sqrt{q}} + 1.321 \log 300000 + 34.51 \right) \frac{1}{\sqrt{x}} \\
    &+ \left( 9 + 11 \cdot \frac{1.2 \cdot 10^6}{q} \right) x^{-1} + (\log 300000) \left( 11 + 6 \cdot \frac{1.2 \cdot 10^6}{q} \right) x^{-3/2} \\
    \le{}& 1.3482 \cdot 10^{-14} + \frac{1.617 \cdot 10^{-10}}{q} \\
    &+ \left( \frac{499845}{\sqrt{q}} + 51.17 + \frac{13.2 \cdot 10^6}{q \sqrt{x}} + \frac{9}{\sqrt{x}} + \frac{9.1 \cdot 10^7}{q x} + \frac{139}{x} \right) \frac{1}{\sqrt{x}}.
    \end{aligned}
    \]
    Making the assumption $x \ge 10^{12}$, we obtain
    \[
    \mathrm{err}_{\eta_+,\chi}(\delta, x) \le 1.3482 \cdot 10^{-14} + \frac{1.617 \cdot 10^{-10}}{q} + \left( \frac{499900}{\sqrt{q}} + 52 \right) \frac{1}{\sqrt{x}}.
    \]
    This proves Theorem 7.1.4 for general $q$.

    Let us optimize things a little more carefully for the trivial character $\chi_T$. Again, we will make the assumption $x \ge 10^{12}$. We will also assume, as we did before, that $|\delta| \le 4r/q$; this now gives us $|\delta| \le 600000$, since $q = 1$ and $r = 150000$ for $q$ odd. We will go up to a height $T_0 = H + 600000\pi \cdot t$, where $H = 200$ and $t \ge 10$. Then
    \[
    \frac{T_0 - H}{\pi\delta} = \frac{600000 \pi t}{4\pi r} \ge t.
    \]
    Hence
    \[
    11.308 \sqrt{T_0 - H}\, e^{-0.1598 (T_0 - H)} + 16.147 |\delta|\, e^{-0.1065 \frac{(T_0 - H)^2}{(\pi\delta)^2}} \le 10^{-1300000} + 9689000\, e^{-0.1065 t^2}.
    \]
    Looking at (9.70), we get
    \[
    \begin{aligned}
    \mathrm{err}_{\eta_+,\chi_T}(\delta, x) \le{}& \log \frac{T_0}{2\pi} \cdot \left( 10^{-1300000} + 9689000\, e^{-0.1065 t^2} \right) \\
    &+ \left( (1.634 \log T_0 + 12.43) \sqrt{T_0} + 34.51 \right) x^{-1/2} + 6600009\, x^{-1}.
    \end{aligned}
    \]
    The value $t = 20$ seems good enough; we choose it because it is not far from optimal for $x \sim 10^{27}$. We get that $T_0 = 12000000\pi + 200$; since $T_0 < 10^8$, we are within the range of the computations in [Plab] (or, for that matter, [Wed03] or [Plaa]). We obtain
    \[
    \mathrm{err}_{\eta_+,\chi_T}(\delta, x) \le 4.772 \cdot 10^{-11} + \frac{251400}{\sqrt{x}}.
    \]
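
    The final bound for the trivial character, with the choice $t = 20$, can be reproduced as follows. This is a floating-point sketch under the stated choices $H = 200$, $t = 20$, $x \ge 10^{12}$; the function name `err_trivial` is illustrative, and the term $10^{-1300000}$ is dropped because it underflows any floating-point format.

```python
import math

H, t = 200.0, 20.0
T0 = H + 600000 * math.pi * t        # = 12000000 * pi + 200

# Contribution of zeros above T0 (the 10^(-1300000) term is negligible and omitted).
tail = math.log(T0 / (2 * math.pi)) * 9689000 * math.exp(-0.1065 * t**2)

# Coefficient of x^(-1/2) coming from zeros up to height T0 (q = 1, so log q = 0).
sqrt_coeff = (1.634 * math.log(T0) + 12.43) * math.sqrt(T0) + 34.51

def err_trivial(x):
    """Right side of the bound derived from (9.70) for the trivial character."""
    return tail + sqrt_coeff * x ** -0.5 + 6600009.0 / x

print(T0, tail, sqrt_coeff, err_trivial(1e12))
```

    For $x \ge 10^{12}$ the term $6600009/x$ is at most $6.6/\sqrt{x}$, which is why it can be absorbed into the coefficient $251400$.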

    Lastly, let us look at the sum estimated in (9.72). Here it will be enough to go up to just $T_0 = 2H + \max(50, H/4) = 450$, where, as before, $H = 200$. Of course, the verification of the zeros of the Riemann zeta function does go that far; as we already said, it goes until $10^8$ (or rather more: see [Wed03] and [Plaa]). We make, again, the assumption $x \ge 10^{12}$. We look at (9.73) and obtain that $\mathrm{err}_{\ell_2,\eta_+}$ is at most
    \[
    \begin{aligned}
    &\left( \left( 0.462 \frac{(\log 50)^2}{\log 10^{12}} + 0.909 \log 50 \right) \cdot 50 + 1.71 \left( 1 + \frac{\log 50}{\log 10^{12}} \right) \cdot 200 \right) e^{-\frac{\pi}{4} 50} \\
    &\quad + \left( 2.445 \sqrt{450} \log 450 + 50.04 \right) \cdot x^{-1/2} \le 5.123 \cdot 10^{-15} + \frac{366.91}{\sqrt{x}}.
    \end{aligned} \tag{9.93}
    \]
    It remains only to estimate the integral in (9.72). First of all,
    \[
    \begin{aligned}
    \int_0^\infty \eta_+^2(t) \log xt\, dt ={}& \int_0^\infty \eta_\circ^2(t) \log xt\, dt \\
    &+ 2 \int_0^\infty (\eta_+(t) - \eta_\circ(t))\, \eta_\circ(t) \log xt\, dt + \int_0^\infty (\eta_+(t) - \eta_\circ(t))^2 \log xt\, dt.
    \end{aligned}
    \]
    The main term will be given by
    \[
    \int_0^\infty \eta_\circ^2(t) \log xt\, dt = \left( 0.64020599736635 + O\!\left( 10^{-14} \right) \right) \log x - 0.021094778698867 + O\!\left( 10^{-15} \right),
    \]
    where the integrals were computed rigorously using VNODE-LP [Ned06]. (The integral $\int_0^\infty \eta_\circ^2(t)\, dt$ can also be computed symbolically.) By Cauchy-Schwarz and the triangle inequality,
    \[
    \begin{aligned}
    \int_0^\infty (\eta_+(t) - \eta_\circ(t))\, \eta_\circ(t) \log xt\, dt &\le |\eta_+ - \eta_\circ|_2\, |\eta_\circ(t) \log xt|_2 \\
    &\le |\eta_+ - \eta_\circ|_2 \left( |\eta_\circ|_2 \log x + |\eta_\circ \cdot \log|_2 \right) \\
    &\le \frac{274.86}{H^{7/2}} (0.80013 \log x + 0.214) \\
    &\le 1.944 \cdot 10^{-6} \cdot \log x + 5.2 \cdot 10^{-7},
    \end{aligned}
    \]
    where we are using (A.23) and evaluate $|\eta_\circ \cdot \log|_2$ rigorously as above. By (A.23) and (A.24),
    \[
    \begin{aligned}
    \int_0^\infty (\eta_+(t) - \eta_\circ(t))^2 \log xt\, dt &\le \left( \frac{274.86}{H^{7/2}} \right)^2 \log x + \frac{27428}{H^7} \\
    &\le 5.903 \cdot 10^{-12} \cdot \log x + 2.143 \cdot 10^{-12}.
    \end{aligned}
    \]
    We conclude that
    \[
    \int_0^\infty \eta_+^2(t) \log xt\, dt = (0.640206 + O^*(1.95 \cdot 10^{-6})) \log x - 0.021095 + O^*(5.3 \cdot 10^{-7}). \tag{9.94}
    \]
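
    The numbers entering (9.93) and the two error integrals are elementary to recompute. The sketch below does so in floating point under the stated choices $H = 200$, $T_0 = 450$, $x \ge 10^{12}$; it assumes the readings $274.86/H^{7/2}$ and $27428/H^7$ of the bounds quoted from (A.23) and (A.24), and all variable names are illustrative.

```python
import math

H, T0, x_min = 200.0, 450.0, 1e12
T1 = T0 - 2 * H                      # = 50
L1, Lx = math.log(T1), math.log(x_min)

# Two pieces of (9.93): the exponentially small tail and the x^(-1/2) coefficient.
tail_993 = ((0.462 * L1**2 / Lx + 0.909 * L1) * T1
            + 1.71 * (1 + L1 / Lx) * H) * math.exp(-math.pi / 4 * T1)
sqrt_coeff_993 = 2.445 * math.sqrt(T0) * math.log(T0) + 50.04

# Error integrals: |eta_+ - eta_o|_2 <= 274.86 / H^(7/2).
diff_norm = 274.86 / H**3.5
cross_log, cross_const = diff_norm * 0.80013, diff_norm * 0.214
square_log, square_const = diff_norm**2, 27428 / H**7
print(tail_993, sqrt_coeff_993, cross_log, cross_const, square_log, square_const)
```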


    We add to this the error term $5.123 \cdot 10^{-15} + 366.91/\sqrt{x}$ from (9.93), and simplify using the assumption $x \ge 10^{12}$. We obtain
    \[
    \sum_{n=1}^\infty \Lambda(n)(\log n)\, \eta_+^2(n/x) = 0.640206\, x \log x - 0.021095\, x + O^*\!\left( 2 \cdot 10^{-6}\, x \log x + 366.91 \sqrt{x} \log x \right), \tag{9.95}
    \]
    and so Prop. 9.5.1 gives us Proposition 7.1.5.

    As we can see, the relatively large error term $2 \cdot 10^{-6}$ comes from the fact that we have wanted to give the main term in (9.72) as an explicit constant, rather than as an integral. This is satisfactory; Prop. 7.1.5 is an auxiliary result that will be needed for one specific purpose in Part III, as opposed to Thms. 7.1.1–7.1.4, which, while crucial for Part III, are also of general applicability and interest.

    Part III

    The integral over the circle


    Chapter 10

    The integral over the major arcs

    Let
    \[
    S_\eta(\alpha, x) = \sum_n \Lambda(n)\, e(\alpha n)\, \eta(n/x), \tag{10.1}
    \]
    where $\alpha \in \mathbb{R}/\mathbb{Z}$, $\Lambda$ is the von Mangoldt function and $\eta : \mathbb{R} \to \mathbb{C}$ is of fast enough decay for the sum to converge.

    Our ultimate goal is to bound from below
    \[
    \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1) \Lambda(n_2) \Lambda(n_3)\, \eta_1(n_1/x)\, \eta_2(n_2/x)\, \eta_3(n_3/x), \tag{10.2}
    \]
    where $\eta_1, \eta_2, \eta_3 : \mathbb{R} \to \mathbb{C}$. Once we know that this is neither zero nor very close to zero, we will know that it is possible to write $N$ as the sum of three primes $n_1, n_2, n_3$ in at least one way; that is, we will have proven the ternary Goldbach conjecture.

    As can be readily seen, (10.2) equals
    \[
    \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_1}(\alpha, x)\, S_{\eta_2}(\alpha, x)\, S_{\eta_3}(\alpha, x)\, e(-N\alpha)\, d\alpha. \tag{10.3}
    \]
    In the circle method, the set $\mathbb{R}/\mathbb{Z}$ gets partitioned into the set of major arcs $\mathfrak{M}$ and the set of minor arcs $\mathfrak{m}$; the contribution of each of the two sets to the integral (10.3) is evaluated separately.
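
    The equality between the weighted sum over $n_1 + n_2 + n_3 = N$ and the integral over $\mathbb{R}/\mathbb{Z}$ is just orthogonality of the exponentials $e(\alpha n)$, and it can be checked numerically on a small example. The sketch below makes several simplifying assumptions that are not in the text: the three weights are taken equal, the sums are truncated at a small `n_max`, and the integral is replaced by an average over $M$ equally spaced points, which is exact for trigonometric polynomials of degree less than $M$. All function names are illustrative.

```python
import cmath
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:              # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)

def weights(x, n_max, eta):
    """w[n] = Lambda(n) eta(n/x) for 0 <= n <= n_max (truncated sum)."""
    return [von_mangoldt(n) * eta(n / x) for n in range(n_max + 1)]

def direct_sum(N, w):
    """Sum of w[n1] w[n2] w[n3] over n1 + n2 + n3 = N."""
    total = 0.0
    for n1 in range(1, len(w)):
        for n2 in range(1, len(w)):
            n3 = N - n1 - n2
            if 1 <= n3 < len(w):
                total += w[n1] * w[n2] * w[n3]
    return total

def circle_integral(N, w):
    """Integral of S(alpha)^3 e(-N alpha) over R/Z, via M > 3 n_max sample points."""
    M = 3 * (len(w) - 1) + 1
    total = 0j
    for k in range(M):
        alpha = k / M
        S = sum(w[n] * e(alpha * n) for n in range(1, len(w)))
        total += S**3 * e(-N * alpha)
    return (total / M).real

eta = lambda t: t * math.exp(-t * t / 2)   # illustrative smooth weight
w = weights(x=10.0, n_max=40, eta=eta)
N = 31
print(direct_sum(N, w), circle_integral(N, w))
```

    The two printed values should agree up to rounding; the real content of the circle method lies in estimating the integral on the major and minor arcs separately, which this toy computation does not attempt.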

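    Our objective here is to treat the major arcs: we wish to estimate
    \[
    \int_{\mathfrak{M}} S_{\eta_1}(\alpha, x)\, S_{\eta_2}(\alpha, x)\, S_{\eta_3}(\alpha, x)\, e(-N\alpha)\, d\alpha \tag{10.4}
    \]
    for $\mathfrak{M} = \mathfrak{M}_{\delta_0, r}$, where
    \[
    \mathfrak{M}_{\delta_0, r} = \bigcup_{\substack{q \le r \\ q\ \mathrm{odd}}}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left( \frac{a}{q} - \frac{\delta_0 r}{2qx},\ \frac{a}{q} + \frac{\delta_0 r}{2qx} \right) \cup \bigcup_{\substack{q \le 2r \\ q\ \mathrm{even}}}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left( \frac{a}{q} - \frac{\delta_0 r}{qx},\ \frac{a}{q} + \frac{\delta_0 r}{qx} \right) \tag{10.5}
    \]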


    and $\delta_0 > 0$, $r \ge 1$ are given.

    In other words, our major arcs will be few (that is, a constant number) and narrow. While [LW02] used relatively narrow major arcs as well, their number, as in all previous proofs of Vinogradov's result, was not bounded by a constant. (In his proof of the five-primes theorem, [Tao14] is able to take a single major arc around 0; this is not possible here.)

    What we are about to see is the general major-arc setup. This is naturally the place where the overlap with the existing literature is largest. Two important differences can nevertheless be singled out.

    • The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need estimates for exponential sums on major arcs, and not just at the middle of each major arc. (If there is smoothing, we cannot use summation by parts to reduce the problem of estimating sums to a problem of counting primes in arithmetic progressions, or weighted by characters.)

    • Since our $L$-function estimates for exponential sums will give bounds that are better than the trivial one by only a constant (even if it is a rather large constant), we need to be especially careful when estimating error terms, finding cancellation when possible.

    10.1 Decomposition of $S_\eta$ by characters

    What follows is largely classical; cf. [HL22] or, say, [Dav67, §26]. The only difference from the literature lies in the treatment of $n$ non-coprime to $q$, and the way in which we show that our exponential sum (10.8) is equal to a linear combination of twisted sums $S_{\eta,\chi^*}$ over primitive characters $\chi^*$. (Non-primitive characters would give us $L$-functions with some zeroes inconveniently placed on the line $\Re(s) = 0$.)

    Write $\tau(\chi, b)$ for the Gauss sum
    \[
    \tau(\chi, b) = \sum_{a \bmod q} \chi(a)\, e(ab/q) \tag{10.6}
    \]
    associated to a $b \in \mathbb{Z}/q\mathbb{Z}$ and a Dirichlet character $\chi$ with modulus $q$. We let $\tau(\chi) = \tau(\chi, 1)$. If $(b, q) = 1$, then $\tau(\chi, b) = \chi(b^{-1}) \tau(\chi)$.

    Recall that $\chi^*$ denotes the primitive character inducing a given Dirichlet character $\chi$. Writing $\sum_{\chi \bmod q}$ for a sum over all characters $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$, we see that, for any $a_0 \in \mathbb{Z}/q\mathbb{Z}$,
    \[
    \begin{aligned}
    \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\overline{\chi}, b)\, \chi^*(a_0)
    &= \frac{1}{\phi(q)} \sum_{\chi \bmod q}\ \sum_{\substack{a \bmod q \\ (a,q)=1}} \overline{\chi}(a)\, e(ab/q)\, \chi^*(a_0) \\
    &= \sum_{\substack{a \bmod q \\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)} \sum_{\chi \bmod q} \chi^*(a^{-1} a_0)
    = \sum_{\substack{a \bmod q \\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)} \sum_{\chi \bmod q'} \chi(a^{-1} a_0),
    \end{aligned} \tag{10.7}
    \]


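    where $q' = q/\gcd(q, a_0^\infty)$. Now, $\sum_{\chi \bmod q'} \chi(a^{-1} a_0) = 0$ unless $a \equiv a_0 \bmod q'$ (in which case $\sum_{\chi \bmod q'} \chi(a^{-1} a_0) = \phi(q')$). Thus, (10.7) equals
    \[
    \begin{aligned}
    \frac{\phi(q')}{\phi(q)} \sum_{\substack{a \bmod q \\ (a,q)=1 \\ a \equiv a_0 \bmod q'}} e(ab/q)
    &= \frac{\phi(q')}{\phi(q)} \sum_{\substack{k \bmod q/q' \\ (k, q/q')=1}} e\!\left( \frac{(a_0 + k q') b}{q} \right) \\
    &= \frac{\phi(q')}{\phi(q)}\, e\!\left( \frac{a_0 b}{q} \right) \sum_{\substack{k \bmod q/q' \\ (k, q/q')=1}} e\!\left( \frac{k b}{q/q'} \right)
    = \frac{\phi(q')}{\phi(q)}\, e\!\left( \frac{a_0 b}{q} \right) \mu(q/q'),
    \end{aligned}
    \]
    provided that $(b, q) = 1$. (We are evaluating a Ramanujan sum in the last step.) Hence, for $\alpha = a/q + \delta/x$, $q \le x$, $(a, q) = 1$,
    \[
    \frac{1}{\phi(q)} \sum_\chi \tau(\overline{\chi}, a) \sum_n \chi^*(n) \Lambda(n)\, e(\delta n/x)\, \eta(n/x)
    \]
    equals
    \[
    \sum_n \frac{\mu((q, n^\infty))}{\phi((q, n^\infty))} \Lambda(n)\, e(\alpha n)\, \eta(n/x).
    \]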
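
    The Ramanujan-sum evaluation used in the last step, namely that the sum of $e(kb/m)$ over $k \bmod m$ with $(k, m) = 1$ equals $\mu(m)$ when $(b, m) = 1$, is easy to test numerically. The following is a small illustrative check for moduli up to 60; the helper names are not from the text.

```python
import cmath
import math

def mobius(m):
    """Moebius function mu(m) by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:          # p^2 divided the original m
                return 0
            result = -result
        p += 1
    if m > 1:                       # one remaining prime factor
        result = -result
    return result

def ramanujan_sum(m, b):
    """Sum of e(k b / m) over residues k mod m coprime to m."""
    return sum(cmath.exp(2j * math.pi * k * b / m)
               for k in range(m) if math.gcd(k, m) == 1)

# Largest deviation from mu(m) over all m <= 60 and all b coprime to m.
max_dev = max(abs(ramanujan_sum(m, b) - mobius(m))
              for m in range(1, 61)
              for b in range(1, m + 1) if math.gcd(b, m) == 1)
print(max_dev)
```

    When $(b, m) = 1$, multiplication by $b$ permutes the residues coprime to $m$, so the sum does not depend on $b$; this is why the condition $(b, q) = 1$ is needed in the text.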

    Since (a q) = 1 τ(χ a) = χ(a)τ(χ) The factor micro((q ninfin))φ((q ninfin)) equals 1when (n q) = 1 the absolute value of the factor is at most 1 for every n Clearlysum

    n(nq)6=1

    Λ(n)η(nx

    )=sump|q

    log psumαge1

    η

    (pα

    x

    )

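    Recalling the definition (10.1) of $S_\eta(\alpha, x)$, we conclude that

    \[ S_\eta(\alpha, x) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(a) \tau(\overline{\chi}) S_{\eta,\chi^*}\left(\frac{\delta}{x}, x\right) + O^*\left( 2 \sum_{p | q} \log p \sum_{\alpha \ge 1} \eta\left(\frac{p^\alpha}{x}\right) \right), \tag{10.8} \]

    where

    \[ S_{\eta,\chi}(\beta, x) = \sum_n \Lambda(n) \chi(n) e(\beta n) \eta(n/x). \tag{10.9} \]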

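    Hence $S_{\eta_1}(\alpha, x) S_{\eta_2}(\alpha, x) S_{\eta_3}(\alpha, x) e(-N\alpha)$ equals

    \[
    \begin{aligned}
    \frac{1}{\phi(q)^3} \sum_{\chi_1} \sum_{\chi_2} \sum_{\chi_3}
    &\tau(\overline{\chi_1}) \tau(\overline{\chi_2}) \tau(\overline{\chi_3}) \chi_1(a) \chi_2(a) \chi_3(a) e(-Na/q) \\
    &\cdot S_{\eta_1,\chi_1^*}(\delta/x, x) S_{\eta_2,\chi_2^*}(\delta/x, x) S_{\eta_3,\chi_3^*}(\delta/x, x) e(-\delta N/x)
    \end{aligned}
    \tag{10.10}
    \]

    plus an error term of absolute value at most

    \[ 2 \sum_{j=1}^{3} \prod_{j' \ne j} |S_{\eta_{j'}}(\alpha, x)| \sum_{p | q} \log p \sum_{\alpha \ge 1} \eta_j\left(\frac{p^\alpha}{x}\right). \tag{10.11} \]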


    We will later see that the integral of (10.11) over $S^1$ is negligible: for our choices of $\eta_j$, it will, in fact, be of size $O(x (\log x)^A)$, $A$ a constant. The error term $O(x (\log x)^A)$ should be compared to the main term, which will be of size about a constant times $x^2$.

    In (10.10), we have reduced our problems to estimating $S_{\eta,\chi}(\delta/x, x)$ for $\chi$ primitive; a more obvious way of reaching the same goal would have made (10.11) worse by a factor of about $\sqrt{q}$.

    10.2 The integral over the major arcs: the main term

    We are to estimate the integral (10.4), where the major arcs $\mathfrak{M}_{\delta_0,r}$ are defined as in (10.5). We will use $\eta_1 = \eta_2 = \eta_+$, $\eta_3(t) = \eta_*(\kappa t)$, where $\eta_+$ and $\eta_*$ will be set later.

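    We can write

    \[
    \begin{aligned}
    S_{\eta,\chi}(\delta/x, x) = S_\eta(\delta/x, x)
    &= \int_0^\infty \eta(t/x) e(\delta t/x)\, dt + O^*(\mathrm{err}_{\eta,\chi}(\delta, x)) \cdot x \\
    &= \widehat{\eta}(-\delta) \cdot x + O^*(\mathrm{err}_{\eta,\chi_T}(\delta, x)) \cdot x
    \end{aligned}
    \tag{10.12}
    \]

    for $\chi = \chi_T$ the trivial character, and

    \[ S_{\eta,\chi}(\delta/x, x) = O^*(\mathrm{err}_{\eta,\chi}(\delta, x)) \cdot x \tag{10.13} \]

    for $\chi$ primitive and non-trivial. The estimation of the error terms $\mathrm{err}$ will come later; let us focus on (a) obtaining the contribution of the main term, (b) using estimates on the error terms efficiently.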

    The main term: three principal characters. The main contribution will be given by the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$, where $\chi_0$ is the principal character mod $q$.

    The sum $\tau(\chi_0, n)$ is a Ramanujan sum; as is well-known (see, e.g., [IK04, (3.2)]),

    \[ \tau(\chi_0, n) = \sum_{d | (q,n)} \mu(q/d)\, d. \tag{10.14} \]

    This simplifies to $\mu(q/(q,n)) \phi((q,n))$ for $q$ square-free. The special case $n = 1$ gives us that $\tau(\chi_0) = \mu(q)$.
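The divisor-sum formula for the Ramanujan sum, and its simplification for square-free q, can be confirmed by brute force on small moduli. This is a minimal sketch; the function names are illustrative, and the arithmetic functions are implemented naively rather than taken from any library.

```python
import cmath
from math import gcd

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def phi(n):
    """Euler's totient, counted directly."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def ramanujan_sum(q, n):
    """tau(chi_0, n): sum of e(an/q) over a mod q coprime to q."""
    total = sum(cmath.exp(2j * cmath.pi * a * n / q)
                for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(total.real)  # the sum is an integer

def divisor_formula(q, n):
    """Right-hand side: sum over d | gcd(q, n) of mu(q/d) * d."""
    g = gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

# Example: q = 30 (square-free), n = 12, so gcd = 6.
q, n = 30, 12
print(ramanujan_sum(q, n), divisor_formula(q, n),
      mobius(q // gcd(q, n)) * phi(gcd(q, n)))
```

For n = 1 the same functions return mu(q), which is the special case used for the main term.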

    Thus, the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$ equals

    \[ \frac{e(-Na/q)}{\phi(q)^3} \mu(q)^3 S_{\eta_+,\chi_0^*}(\delta/x, x)^2 S_{\eta_*,\chi_0^*}(\delta/x, x) e(-\delta N/x), \tag{10.15} \]

    where, of course, $S_{\eta,\chi_0^*}(\alpha, x) = S_\eta(\alpha, x)$ (since $\chi_0^*$ is the trivial character). Summing (10.15) for $\alpha = a/q + \delta/x$ and $a$ going over all residues mod $q$ coprime to $q$, we obtain

    \[ \frac{\mu\left(\frac{q}{(q,N)}\right) \phi((q,N))}{\phi(q)^3} \mu(q)^3 S_{\eta_+,\chi_0^*}(\delta/x, x)^2 S_{\eta_*,\chi_0^*}(\delta/x, x) e(-\delta N/x). \]


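    The integral of (10.15) over all of $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ (see (10.5)) thus equals

    \[
    \begin{aligned}
    &\sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} S_{\eta_+,\chi_0^*}^2(\alpha, x) S_{\eta_*,\chi_0^*}(\alpha, x) e(-\alpha N)\, d\alpha \\
    &+ \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} S_{\eta_+,\chi_0^*}^2(\alpha, x) S_{\eta_*,\chi_0^*}(\alpha, x) e(-\alpha N)\, d\alpha.
    \end{aligned}
    \tag{10.16}
    \]

    The main term in (10.16) is

    \[
    \begin{aligned}
    &x^3 \cdot \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} (\widehat{\eta_+}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha \\
    &+ x^3 \cdot \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} (\widehat{\eta_+}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha.
    \end{aligned}
    \tag{10.17}
    \]

    We would like to complete both the sum and the integral. Before doing so, we should say that we will want to be able to use smoothing functions $\eta_+$ whose Fourier transforms are not easy to deal with directly. All we want to require is that there be a smoothing function $\eta_\circ$, easier to deal with, such that $\eta_\circ$ be close to $\eta_+$ in $\ell_2$ norm.

    Assume, then, that

    \[ |\eta_+ - \eta_\circ|_2 \le \varepsilon_0 |\eta_\circ|_2, \]

    where $\eta_\circ$ is thrice differentiable outside finitely many points and satisfies $\eta_\circ^{(3)} \in L_1$.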

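    Then (10.17) equals

    \[
    \begin{aligned}
    &x^3 \cdot \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha \\
    &+ x^3 \cdot \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha
    \end{aligned}
    \tag{10.18}
    \]

    plus

    \[ O^*\left( x^2 \cdot \sum_q \frac{\mu(q)^2}{\phi(q)^2} \int_{-\infty}^{\infty} \left| (\widehat{\eta_+}(-\alpha))^2 - (\widehat{\eta_\circ}(-\alpha))^2 \right| |\widehat{\eta_*}(-\alpha)|\, d\alpha \right). \tag{10.19} \]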


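    Here (10.19) is bounded by $2.82643 x^2$ (by (C.9)) times

    \[
    \begin{aligned}
    &|\widehat{\eta_*}(-\alpha)|_\infty \cdot \sqrt{ \int_{-\infty}^{\infty} |\widehat{\eta_+}(-\alpha) - \widehat{\eta_\circ}(-\alpha)|^2\, d\alpha \cdot \int_{-\infty}^{\infty} |\widehat{\eta_+}(-\alpha) + \widehat{\eta_\circ}(-\alpha)|^2\, d\alpha } \\
    &\le |\eta_*|_1 \cdot |\widehat{\eta_+} - \widehat{\eta_\circ}|_2 |\widehat{\eta_+} + \widehat{\eta_\circ}|_2 = |\eta_*|_1 \cdot |\eta_+ - \eta_\circ|_2 |\eta_+ + \eta_\circ|_2 \\
    &\le |\eta_*|_1 \cdot |\eta_+ - \eta_\circ|_2 (2 |\eta_\circ|_2 + |\eta_+ - \eta_\circ|_2) \le |\eta_*|_1 |\eta_\circ|_2^2 \cdot (2 + \varepsilon_0) \varepsilon_0.
    \end{aligned}
    \]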

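    Now, (10.18) equals

    \[
    \begin{aligned}
    &x^3 \int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N) \sum_{\substack{\frac{q}{(q,2)} \le \min\left(\frac{\delta_0 r}{2|\alpha| x}, r\right) \\ \mu(q)^2 = 1}} \frac{\phi((q,N))}{\phi(q)^3} \mu((q,N))\, d\alpha \\
    &= x^3 \int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha \cdot \sum_{q \ge 1} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) \\
    &\quad - x^3 \int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N) \sum_{\substack{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha| x}, r\right) \\ \mu(q)^2 = 1}} \frac{\phi((q,N))}{\phi(q)^3} \mu((q,N))\, d\alpha.
    \end{aligned}
    \tag{10.20}
    \]

    The last line in (10.20) is bounded¹ by

    \[ x^2 |\widehat{\eta_*}|_\infty \int_{-\infty}^{\infty} |\widehat{\eta_\circ}(-\alpha)|^2 \sum_{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha|}, r\right)} \frac{\mu(q)^2}{\phi(q)^2}\, d\alpha. \tag{10.21} \]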

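    By (2.1) (with $k = 3$), (C.16) and (C.17), this is at most

    \[
    \begin{aligned}
    &x^2 |\eta_*|_1 \int_{-\delta_0/2}^{\delta_0/2} |\widehat{\eta_\circ}(-\alpha)|^2 \frac{4.31004}{r}\, d\alpha
    + 2 x^2 |\eta_*|_1 \int_{\delta_0/2}^{\infty} \left( \frac{|\eta_\circ^{(3)}|_1}{(2\pi\alpha)^3} \right)^2 \frac{8.62008 |\alpha|}{\delta_0 r}\, d\alpha \\
    &\le |\eta_*|_1 \left( 4.31004 |\eta_\circ|_2^2 + 0.00113 \frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5} \right) \frac{x^2}{r}.
    \end{aligned}
    \]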
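The numerical constant in front of the third-derivative term follows from an elementary tail integral. The sketch below simply redoes that arithmetic, under the reading that the integrand is (1/(2πα)³)² · 8.62008·α/(δ₀r) on [δ₀/2, ∞); the function name is illustrative.

```python
import math

def tail_constant():
    """Coefficient of |eta^(3)|_1^2 / (delta0^5 * r) in the tail term.

    2 * int_{delta0/2}^infty (2*pi*a)^(-6) * 8.62008 * a / (delta0 * r) da
      = 2 * 8.62008 / ((2*pi)^6 * delta0 * r) * int_{delta0/2}^infty a^(-5) da,
    and int_c^infty a^(-5) da = c^(-4) / 4, which is 4 / delta0^4 for c = delta0/2.
    """
    return 2 * 8.62008 * 4 / (2 * math.pi) ** 6

c = tail_constant()
print(c)  # should come out slightly below the rounded-up value 0.00113
```

The value is rounded up in the text, which is the safe direction for an upper bound.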

    It is easy to see that

    \[ \sum_{q \ge 1} \frac{\phi((q,N))}{\phi(q)^3} \mu(q)^2 \mu((q,N)) = \prod_{p | N} \left( 1 - \frac{1}{(p-1)^2} \right) \cdot \prod_{p \nmid N} \left( 1 + \frac{1}{(p-1)^3} \right). \]
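Since the summand is multiplicative in q and supported on square-free q, the identity above holds exactly for any finite set of primes: the sum over square-free q built from those primes equals the corresponding partial Euler product. A minimal sketch with exact rational arithmetic (helper names are illustrative):

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi(n):
    """Euler's totient, counted directly (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def summand(q, N):
    """phi((q,N)) * mu(q)^2 * mu((q,N)) / phi(q)^3."""
    g = gcd(q, N)
    return Fraction(phi(g) * mobius(q) ** 2 * mobius(g), phi(q) ** 3)

def truncated_sum(primes, N):
    """Sum over square-free q all of whose prime factors lie in `primes`."""
    return sum((summand(prod(subset), N)
                for k in range(len(primes) + 1)
                for subset in combinations(primes, k)), Fraction(0))

def truncated_product(primes, N):
    """Partial Euler product over the same primes."""
    result = Fraction(1)
    for p in primes:
        if N % p == 0:
            result *= 1 - Fraction(1, (p - 1) ** 2)
        else:
            result *= 1 + Fraction(1, (p - 1) ** 3)
    return result

primes = [2, 3, 5, 7, 11, 13]
print(truncated_sum(primes, 15) == truncated_product(primes, 15))
```

Note that for even N the factor at p = 2 is 1 - 1 = 0, so the product vanishes, as it should for a sum of three odd primes.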

    ¹ This is obviously crude, in that we are bounding $\phi((q,N))/\phi(q)$ by 1. We are doing so in order to avoid a potentially harmful dependence on $N$.


    Expanding the integral implicit in the definition of $\widehat{f}$,

    \[ \int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2 \widehat{\eta_*}(-\alpha x) e(-\alpha N)\, d\alpha = \frac{1}{x} \int_0^\infty \int_0^\infty \eta_\circ(t_1) \eta_\circ(t_2) \eta_*\left( \frac{N}{x} - (t_1 + t_2) \right) dt_1\, dt_2. \tag{10.22} \]

    (This is standard. One rigorous way to obtain (10.22) is to approximate the integral over $\alpha \in (-\infty, \infty)$ by an integral with a smooth weight, at different scales; as the scale becomes broader, the Fourier transform of the weight approximates (as a distribution) the $\delta$ function. Apply Plancherel.)

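    Hence, (10.17) equals

    \[
    x^2 \cdot \int_0^\infty \int_0^\infty \eta_\circ(t_1) \eta_\circ(t_2) \eta_*\left( \frac{N}{x} - (t_1 + t_2) \right) dt_1\, dt_2
    \cdot \prod_{p | N} \left( 1 - \frac{1}{(p-1)^2} \right) \cdot \prod_{p \nmid N} \left( 1 + \frac{1}{(p-1)^3} \right)
    \tag{10.23}
    \]

    (the main term) plus

    \[ \left( 2.82643 |\eta_\circ|_2^2 (2 + \varepsilon_0) \cdot \varepsilon_0 + \frac{4.31004 |\eta_\circ|_2^2 + 0.00113 \frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r} \right) |\eta_*|_1 x^2. \tag{10.24} \]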

    Here (10.23) is just as in the classical case [IK04, (19.10)], except for the fact that a factor of $1/2$ has been replaced by a double integral. Later, in chapter 11, we will see how to choose our smoothing functions (and $x$, in terms of $N$) so as to make the double integral as large as possible in comparison with the error terms. This is an important optimization. (We already had a first discussion of this in the introduction; see (1.39) and what follows.)

    What remains to estimate is the contribution of all the terms of the form $\mathrm{err}_{\eta,\chi}(\delta, x)$ in (10.12) and (10.13). Let us first deal with another matter: bounding the $\ell_2$ norm of $|S_\eta(\alpha, x)|^2$ over the major arcs.

    10.3 The $\ell_2$ norm over the major arcs

    We can always bound the integral of $|S_\eta(\alpha, x)|^2$ on the whole circle by Plancherel. If we only want the integral on certain arcs, we use the bound in Prop. 12.1.2 (based on work by Ramaré). If these arcs are really the major arcs (that is, the arcs on which we have useful analytic estimates) then we can hope to get better bounds using L-functions. This will be useful both to estimate the error terms in this section and to make the use of Ramaré's bounds more efficient later.


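    By (10.8),

    \[
    \begin{aligned}
    \sum_{\substack{a \bmod q \\ \gcd(a,q)=1}} \left| S_\eta\left( \frac{a}{q} + \frac{\delta}{x}, x \right) \right|^2
    &= \frac{1}{\phi(q)^2} \sum_{\chi} \sum_{\chi'} \tau(\overline{\chi}) \overline{\tau(\overline{\chi'})} \Biggl( \sum_{\substack{a \bmod q \\ \gcd(a,q)=1}} \chi(a) \overline{\chi'(a)} \Biggr) \cdot S_{\eta,\chi^*}(\delta/x, x) \overline{S_{\eta,\chi'^*}(\delta/x, x)} \\
    &\quad + O^*\left( 2 (1 + \sqrt{q}) (\log x)^2 |\eta|_\infty \max_\alpha |S_\eta(\alpha, x)| + \left( (1 + \sqrt{q}) (\log x)^2 |\eta|_\infty \right)^2 \right) \\
    &= \frac{1}{\phi(q)} \sum_{\chi} |\tau(\overline{\chi})|^2 |S_{\eta,\chi^*}(\delta/x, x)|^2 + O^*\left( K_{q,1} (2 |S_\eta(0, x)| + K_{q,1}) \right),
    \end{aligned}
    \]

    where

    \[ K_{q,1} = (1 + \sqrt{q}) (\log x)^2 |\eta|_\infty. \]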

    As is well-known (see, e.g., [IK04, Lem. 3.1]),

    \[ \tau(\chi) = \mu\left( \frac{q}{q^*} \right) \chi^*\left( \frac{q}{q^*} \right) \tau(\chi^*), \]

    where $q^*$ is the modulus of $\chi^*$ (i.e., the conductor of $\chi$), and

    \[ |\tau(\chi^*)| = \sqrt{q^*}. \]

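    Using the expressions (10.12) and (10.13), we obtain

    \[
    \begin{aligned}
    \sum_{\substack{a \bmod q \\ (a,q)=1}} \left| S_\eta\left( \frac{a}{q} + \frac{\delta}{x}, x \right) \right|^2
    &= \frac{\mu^2(q)}{\phi(q)} \left| \widehat{\eta}(-\delta) x + O^*\left( \mathrm{err}_{\eta,\chi_T}(\delta, x) \cdot x \right) \right|^2 \\
    &\quad + \frac{1}{\phi(q)} \sum_{\chi \ne \chi_T} \mu^2\left( \frac{q}{q^*} \right) q^* \cdot O^*\left( |\mathrm{err}_{\eta,\chi}(\delta, x)|^2 x^2 \right) + O^*\left( K_{q,1} (2 |S_\eta(0, x)| + K_{q,1}) \right) \\
    &= \frac{\mu^2(q) x^2}{\phi(q)} \left( |\widehat{\eta}(-\delta)|^2 + O^*\left( \left| \mathrm{err}_{\eta,\chi_T}(\delta, x) (2 |\eta|_1 + \mathrm{err}_{\eta,\chi_T}(\delta, x)) \right| \right) \right) \\
    &\quad + O^*\left( \max_{\chi \ne \chi_T} q^* |\mathrm{err}_{\eta,\chi^*}(\delta, x)|^2 x^2 + K_{q,2}\, x \right),
    \end{aligned}
    \]

    where $K_{q,2} = K_{q,1} (2 |S_\eta(0, x)|/x + K_{q,1}/x)$.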


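    Thus, the integral of $|S_\eta(\alpha, x)|^2$ over $\mathfrak{M}$ (see (10.5)) is

    \[
    \begin{aligned}
    &\sum_{\substack{q \le r \\ q \text{ odd}}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \int_{\frac{a}{q} - \frac{\delta_0 r}{2qx}}^{\frac{a}{q} + \frac{\delta_0 r}{2qx}} |S_\eta(\alpha, x)|^2\, d\alpha
    + \sum_{\substack{q \le 2r \\ q \text{ even}}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \int_{\frac{a}{q} - \frac{\delta_0 r}{qx}}^{\frac{a}{q} + \frac{\delta_0 r}{qx}} |S_\eta(\alpha, x)|^2\, d\alpha \\
    &= \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q) x^2}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha
    + \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q) x^2}{\phi(q)} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha \\
    &\quad + O^*\left( \sum_q \frac{\mu^2(q) x^2}{\phi(q)} \cdot \frac{\gcd(q,2) \delta_0 r}{qx} \left( ET_{\eta,\delta_0 r/2} \left( 2 |\eta|_1 + ET_{\eta,\delta_0 r/2} \right) \right) \right) \\
    &\quad + \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\delta_0 r x}{q} \cdot O^*\Biggl( \max_{\substack{\chi \bmod q \\ \chi \ne \chi_T \\ |\delta| \le \delta_0 r/2q}} q^* |\mathrm{err}_{\eta,\chi^*}(\delta, x)|^2 + \frac{K_{q,2}}{x} \Biggr) \\
    &\quad + \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{2 \delta_0 r x}{q} \cdot O^*\Biggl( \max_{\substack{\chi \bmod q \\ \chi \ne \chi_T \\ |\delta| \le \delta_0 r/q}} q^* |\mathrm{err}_{\eta,\chi^*}(\delta, x)|^2 + \frac{K_{q,2}}{x} \Biggr),
    \end{aligned}
    \tag{10.25}
    \]

    where

    \[ ET_{\eta,s} = \max_{|\delta| \le s} |\mathrm{err}_{\eta,\chi_T}(\delta, x)| \]

    and $\chi_T$ is the trivial character. If all we want is an upper bound, we can simply remark that

    \[
    \begin{aligned}
    &x \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha
    + x \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha \\
    &\le \Biggl( \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} + \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} \Biggr) |\widehat{\eta}|_2^2
    = 2 |\eta|_2^2 \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}.
    \end{aligned}
    \]

    If we also need a lower bound, we proceed as follows.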

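    Again, we will work with an approximation $\eta_\circ$ such that (a) $|\eta - \eta_\circ|_2$ is small, (b) $\eta_\circ$ is thrice differentiable outside finitely many points, (c) $\eta_\circ^{(3)} \in L_1$. Clearly,

    \[
    \begin{aligned}
    x \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha
    &\le \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \left( \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\, d\alpha + 2 \langle |\widehat{\eta_\circ}|, |\widehat{\eta} - \widehat{\eta_\circ}| \rangle + |\widehat{\eta} - \widehat{\eta_\circ}|_2^2 \right) \\
    &= \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\, d\alpha \\
    &\quad + O^*\left( \frac{1}{2} \log r + 0.85 \right) \left( 2 |\eta_\circ|_2 |\eta - \eta_\circ|_2 + |\eta_\circ - \eta|_2^2 \right),
    \end{aligned}
    \]

    where we are using (C.11) and isometry. Also,

    \[ \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha = \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\, d\alpha. \]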

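    By (2.1) and Plancherel,

    \[
    \begin{aligned}
    \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\, d\alpha
    &= \int_{-\infty}^{\infty} |\widehat{\eta_\circ}(-\alpha)|^2\, d\alpha - O^*\left( 2 \int_{\frac{\delta_0 r}{2q}}^{\infty} \frac{|\eta_\circ^{(3)}|_1^2}{(2\pi\alpha)^6}\, d\alpha \right) \\
    &= |\eta_\circ|_2^2 + O^*\left( \frac{|\eta_\circ^{(3)}|_1^2 q^5}{5 \pi^6 (\delta_0 r)^5} \right).
    \end{aligned}
    \]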

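    Hence

    \[ \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\, d\alpha = |\eta_\circ|_2^2 \cdot \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} + O^*\Biggl( \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \frac{|\eta_\circ^{(3)}|_1^2 q^5}{5 \pi^6 (\delta_0 r)^5} \Biggr). \]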

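    Using (C.18), we get that

    \[
    \begin{aligned}
    \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \frac{|\eta_\circ^{(3)}|_1^2 q^5}{5 \pi^6 (\delta_0 r)^5}
    &\le \frac{1}{r} \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q) q}{\phi(q)} \cdot \frac{|\eta_\circ^{(3)}|_1^2}{5 \pi^6 \delta_0^5} \\
    &\le \frac{|\eta_\circ^{(3)}|_1^2}{5 \pi^6 \delta_0^5} \cdot \left( 0.64787 + \frac{\log r}{4r} + \frac{0.425}{r} \right).
    \end{aligned}
    \]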

    Going back to (10.25), we use (C.7) to bound

    \[ \sum_q \frac{\mu^2(q) x^2}{\phi(q)} \frac{\gcd(q,2) \delta_0 r}{qx} \le 2.59147 \cdot \delta_0 r x. \]
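The constant 2.59147 is an upper bound for the sum of μ²(q)·gcd(q,2)/(q·φ(q)) over all q. That sum is multiplicative, so it has an Euler product whose factor at p = 2 is 1 + 2/(2·1) = 2 and whose factor at an odd prime p is 1 + 1/(p(p-1)). The sketch below evaluates a partial product; the sieve limit is an arbitrary choice, and the function names are illustrative.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def partial_constant(limit):
    """2 * prod over odd primes p <= limit of (1 + 1/(p(p-1))).

    Every factor exceeds 1, so this is a lower bound for the full sum.
    """
    value = 2.0
    for p in primes_up_to(limit):
        if p > 2:
            value *= 1 + 1 / (p * (p - 1))
    return value

print(partial_constant(10 ** 5))
```

Because the omitted factors are all greater than 1, the partial product approaches the full value from below and must stay under the stated bound.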


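    We also note that

    \[
    \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{1}{q} + \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{2}{q}
    = \sum_{q \le r} \frac{1}{q} - \sum_{q \le r/2} \frac{1}{2q} + \sum_{q \le r} \frac{1}{q}
    \le 2 \log er - \log \frac{r}{2} \le \log 2e^2 r.
    \]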

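    We have proven the following result.

    Lemma 10.3.1. Let $\eta: [0, \infty) \to \mathbb{R}$ be in $L_1 \cap L_\infty$. Let $S_\eta(\alpha, x)$ be as in (10.1) and let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5). Let $\eta_\circ: [0, \infty) \to \mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$.

    Assume $r \ge 182$. Then

    \[
    \int_{\mathfrak{M}} |S_\eta(\alpha, x)|^2\, d\alpha = L_{r,\delta_0} x
    + O^*\left( 5.19\, \delta_0 x r \left( ET_{\eta,\delta_0 r/2} \cdot \left( |\eta|_1 + \frac{ET_{\eta,\delta_0 r/2}}{2} \right) \right) \right)
    + O^*\left( \delta_0 r (\log 2e^2 r) \left( x \cdot E_{\eta,r,\delta_0}^2 + K_{r,2} \right) \right),
    \tag{10.26}
    \]

    where

    \[
    \begin{aligned}
    E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2) \delta_0 r/2q}} \sqrt{q^*}\, |\mathrm{err}_{\eta,\chi^*}(\delta, x)|, \qquad
    ET_{\eta,s} = \max_{|\delta| \le s} |\mathrm{err}_{\eta,\chi_T}(\delta, x)|, \\
    K_{r,2} &= (1 + \sqrt{2r}) (\log x)^2 |\eta|_\infty \left( 2 |S_\eta(0, x)|/x + (1 + \sqrt{2r}) (\log x)^2 |\eta|_\infty / x \right),
    \end{aligned}
    \tag{10.27}
    \]

    and $L_{r,\delta_0}$ satisfies both

    \[ L_{r,\delta_0} \le 2 |\eta|_2^2 \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \tag{10.28} \]

    and

    \[
    \begin{aligned}
    L_{r,\delta_0} &= 2 |\eta_\circ|_2^2 \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} + O^*(\log r + 1.7) \cdot \left( 2 |\eta_\circ|_2 |\eta - \eta_\circ|_2 + |\eta_\circ - \eta|_2^2 \right) \\
    &\quad + O^*\left( \frac{2 |\eta_\circ^{(3)}|_1^2}{5 \pi^6 \delta_0^5} \right) \cdot \left( 0.64787 + \frac{\log r}{4r} + \frac{0.425}{r} \right).
    \end{aligned}
    \tag{10.29}
    \]

    Here, as elsewhere, $\chi^*$ denotes the primitive character inducing $\chi$, whereas $q^*$ denotes the modulus of $\chi^*$.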

    The error term $x r\, ET_{\eta,\delta_0 r}$ will be very small, since it will be estimated using the Riemann zeta function; the error term involving $K_{r,2}$ will be completely negligible. As for the term involving $x r (r+1) E_{\eta,r,\delta_0}^2$, we see that it constrains us to have $|\mathrm{err}_{\eta,\chi}(x, N)|$ less than a constant times $1/r$ if we do not want the main term in the bound (10.26) to be overwhelmed.


    10.4 The integral over the major arcs: conclusion

    There are at least two ways we can evaluate (10.4). One is to substitute (10.10) into (10.4). The disadvantages here are that (a) this can give rise to pages-long formulae, (b) this gives error terms proportional to $x r |\mathrm{err}_{\eta,\chi}(x, N)|$, meaning that, to win, we would have to show that $|\mathrm{err}_{\eta,\chi}(x, N)|$ is much smaller than $1/r$. What we will do instead is to use our $\ell_2$ estimate (10.26) in order to bound the contribution of non-principal terms. This will give us a gain of almost $\sqrt{r}$ on the error terms; in other words, to win, it will be enough to show later that $|\mathrm{err}_{\eta,\chi}(x, N)|$ is much smaller than $1/\sqrt{r}$.

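    The contribution of the error terms in $S_{\eta_3}(\alpha, x)$ (that is, all terms involving the quantities $\mathrm{err}_{\eta,\chi}$ in expressions (10.12) and (10.13)) to (10.4) is

    \[
    \begin{aligned}
    &\sum_{\substack{q \le r \\ q \text{ odd}}} \frac{1}{\phi(q)} \sum_{\chi_3 \bmod q} \tau(\overline{\chi_3}) \sum_{\substack{a \bmod q \\ (a,q)=1}} \chi_3(a) e(-Na/q) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} S_{\eta_+}(\alpha + a/q, x)^2\, \mathrm{err}_{\eta_*,\chi_3^*}(\alpha x, x) e(-N\alpha)\, d\alpha \\
    &+ \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{1}{\phi(q)} \sum_{\chi_3 \bmod q} \tau(\overline{\chi_3}) \sum_{\substack{a \bmod q \\ (a,q)=1}} \chi_3(a) e(-Na/q) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} S_{\eta_+}(\alpha + a/q, x)^2\, \mathrm{err}_{\eta_*,\chi_3^*}(\alpha x, x) e(-N\alpha)\, d\alpha.
    \end{aligned}
    \tag{10.30}
    \]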

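    We should also remember the terms in (10.11); we can integrate them over all of $\mathbb{R}/\mathbb{Z}$, and obtain that they contribute at most

    \[
    \begin{aligned}
    &\int_{\mathbb{R}/\mathbb{Z}} 2 \sum_{j=1}^{3} \prod_{j' \ne j} |S_{\eta_{j'}}(\alpha, x)| \cdot \max_{q \le r} \sum_{p | q} \log p \sum_{\alpha \ge 1} \eta_j\left( \frac{p^\alpha}{x} \right) d\alpha \\
    &\le 2 \sum_{j=1}^{3} \prod_{j' \ne j} |S_{\eta_{j'}}(\alpha, x)|_2 \cdot \max_{q \le r} \sum_{p | q} \log p \sum_{\alpha \ge 1} \eta_j\left( \frac{p^\alpha}{x} \right) \\
    &\le 2 \sum_n \Lambda^2(n) \eta_+^2(n/x) \cdot \log r \cdot \max_{p \le r} \sum_{\alpha \ge 1} \eta_*\left( \frac{p^\alpha}{x} \right) \\
    &\quad + 4 \sqrt{ \sum_n \Lambda^2(n) \eta_+^2(n/x) \cdot \sum_n \Lambda^2(n) \eta_*^2(n/x) } \cdot \log r \cdot \max_{p \le r} \sum_{\alpha \ge 1} \eta_+\left( \frac{p^\alpha}{x} \right)
    \end{aligned}
    \]

    by Cauchy-Schwarz and Plancherel. (The terms $j = 1, 2$, for which $\eta_j = \eta_+$, give the last line; the term $j = 3$, for which $\eta_3 = \eta_*$, gives the line before it.)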


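    The absolute value of (10.30) is at most

    \[
    \begin{aligned}
    &\sum_{\substack{q \le r \\ q \text{ odd}}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} \left| S_{\eta_+}(\alpha + a/q, x) \right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q \\ |\delta| \le \delta_0 r/2q}} \sqrt{q^*}\, |\mathrm{err}_{\eta_*,\chi^*}(\delta, x)| \\
    &+ \sum_{\substack{q \le 2r \\ q \text{ even}}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} \left| S_{\eta_+}(\alpha + a/q, x) \right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q \\ |\delta| \le \delta_0 r/q}} \sqrt{q^*}\, |\mathrm{err}_{\eta_*,\chi^*}(\delta, x)| \\
    &\le \int_{\mathfrak{M}_{\delta_0,r}} \left| S_{\eta_+}(\alpha) \right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2) \delta_0 r/q}} \sqrt{q^*}\, |\mathrm{err}_{\eta_*,\chi^*}(\delta, x)|.
    \end{aligned}
    \tag{10.31}
    \]

    We can bound the integral of $|S_{\eta_+}(\alpha)|^2$ by (10.26).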

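    What about the contribution of the error part of $S_{\eta_2}(\alpha, x)$? We can obviously proceed in the same way, except that, to avoid double-counting, $S_{\eta_3}(\alpha, x)$ needs to be replaced by

    \[ \frac{1}{\phi(q)} \tau(\overline{\chi_0}) \widehat{\eta_3}(-\delta) \cdot x = \frac{\mu(q)}{\phi(q)} \widehat{\eta_3}(-\delta) \cdot x, \tag{10.32} \]

    which is its main term (coming from (10.12)). Instead of having an $\ell_2$ norm as in (10.31), we have the square-root of a product of two squares of $\ell_2$ norms (by Cauchy-Schwarz), namely, $\int_{\mathfrak{M}} |S^*_{\eta_+}(\alpha)|^2\, d\alpha$ and

    \[
    \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\widehat{\eta_*}(-\alpha x) x|^2\, d\alpha
    + \sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\widehat{\eta_*}(-\alpha x) x|^2\, d\alpha
    \le x |\widehat{\eta_*}|_2^2 \cdot \sum_q \frac{\mu^2(q)}{\phi(q)^2}.
    \tag{10.33}
    \]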

    By (C.9), the sum over $q$ is at most $2.82643$.

    As for the contribution of the error part of $S_{\eta_1}(\alpha, x)$, we bound it in the same way, using solely the $\ell_2$ norm in (10.33) (and replacing both $S_{\eta_2}(\alpha, x)$ and $S_{\eta_3}(\alpha, x)$ by expressions as in (10.32)).

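    The total of the error terms is thus

    \[
    \begin{aligned}
    &x \cdot \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2) \delta_0 r/q}} \sqrt{q^*} \cdot |\mathrm{err}_{\eta_*,\chi^*}(\delta, x)| \cdot A \\
    &+ x \cdot \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2) \delta_0 r/q}} \sqrt{q^*} \cdot |\mathrm{err}_{\eta_+,\chi^*}(\delta, x)| \left( \sqrt{A} + \sqrt{B_+} \right) \sqrt{B_*},
    \end{aligned}
    \tag{10.34}
    \]

    where $A = (1/x) \int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha$ (bounded as in (10.26)) and

    \[ B_* = 2.82643 |\eta_*|_2^2, \qquad B_+ = 2.82643 |\eta_+|_2^2. \tag{10.35} \]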


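    In conclusion, we have proven:

    Proposition 10.4.1. Let $x \ge 1$. Let $\eta_+, \eta_*: [0, \infty) \to \mathbb{R}$. Assume $\eta_+ \in C^2$, $\eta_+'' \in L_2$ and $\eta_+, \eta_* \in L_1 \cap L_2$. Let $\eta_\circ: [0, \infty) \to \mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$ and $|\eta_+ - \eta_\circ|_2 \le \varepsilon_0 |\eta_\circ|_2$, where $\varepsilon_0 \ge 0$.

    Let $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x)$. Let $\mathrm{err}_{\eta,\chi}$, $\chi$ primitive, be given as in (10.12) and (10.13). Let $\delta_0 > 0$, $r \ge 1$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5).

    Then, for any $N \ge 0$,

    \[ \int_{\mathfrak{M}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x) e(-N\alpha)\, d\alpha \]

    equals

    \[
    \begin{aligned}
    &C_0 C_{\eta_\circ,\eta_*} x^2 + \left( 2.82643 |\eta_\circ|_2^2 (2 + \varepsilon_0) \cdot \varepsilon_0 + \frac{4.31004 |\eta_\circ|_2^2 + 0.0012 \frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r} \right) |\eta_*|_1 x^2 \\
    &+ O^*\left( E_{\eta_*,r,\delta_0} A_{\eta_+} + E_{\eta_+,r,\delta_0} \cdot 1.6812 \left( \sqrt{A_{\eta_+}} + 1.6812 |\eta_+|_2 \right) |\eta_*|_2 \right) \cdot x^2 \\
    &+ O^*\left( 2 Z_{\eta_+^2,2}(x) LS_{\eta_*}(x, r) \cdot x + 4 \sqrt{Z_{\eta_+^2,2}(x) Z_{\eta_*^2,2}(x)}\, LS_{\eta_+}(x, r) \cdot x \right),
    \end{aligned}
    \tag{10.36}
    \]

    where

    \[
    \begin{aligned}
    C_0 &= \prod_{p | N} \left( 1 - \frac{1}{(p-1)^2} \right) \cdot \prod_{p \nmid N} \left( 1 + \frac{1}{(p-1)^3} \right), \\
    C_{\eta_\circ,\eta_*} &= \int_0^\infty \int_0^\infty \eta_\circ(t_1) \eta_\circ(t_2) \eta_*\left( \frac{N}{x} - (t_1 + t_2) \right) dt_1\, dt_2,
    \end{aligned}
    \tag{10.37}
    \]

    \[
    \begin{aligned}
    E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q \\ q \le \gcd(q,2) \cdot r \\ |\delta| \le \gcd(q,2) \delta_0 r/2q}} \sqrt{q^*} \cdot |\mathrm{err}_{\eta,\chi^*}(\delta, x)|, \qquad
    ET_{\eta,s} = \max_{|\delta| \le s/q} |\mathrm{err}_{\eta,\chi_T}(\delta, x)|, \\
    A_\eta &= \frac{1}{x} \int_{\mathfrak{M}} \left| S_{\eta_+}(\alpha, x) \right|^2 d\alpha, \qquad
    L_{\eta,r,\delta_0} \le 2 |\eta|_2^2 \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}, \\
    K_{r,2} &= (1 + \sqrt{2r}) (\log x)^2 |\eta|_\infty \left( 2 Z_{\eta,1}(x)/x + (1 + \sqrt{2r}) (\log x)^2 |\eta|_\infty / x \right), \\
    Z_{\eta,k}(x) &= \frac{1}{x} \sum_n \Lambda^k(n) \eta(n/x), \qquad
    LS_\eta(x, r) = \log r \cdot \max_{p \le r} \sum_{\alpha \ge 1} \eta\left( \frac{p^\alpha}{x} \right),
    \end{aligned}
    \tag{10.38}
    \]

    and $\mathrm{err}_{\eta,\chi}$ is as in (10.12) and (10.13).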

    Here is how to read these expressions. The error term in the first line of (10.36) will be small provided that $\varepsilon_0$ is small and $r$ is large. The third line of (10.36) will be negligible, as will be the term $2 \delta_0 r (\log er) K_{r,2}$ in the definition of $A_\eta$. (Clearly, $Z_{\eta,k}(x) \ll_\eta (\log x)^{k-1}$ and $LS_\eta(x, q) \ll_\eta \tau(q) \log x$ for any $\eta$ of rapid decay.)


    It remains to estimate the second line of (10.36). This includes estimating $A_\eta$, a task that was already accomplished in Lemma 10.3.1. We see that we will have to give very good bounds for $E_{\eta,r,\delta_0}$ when $\eta = \eta_+$ or $\eta = \eta_*$. We also see that we want to make $C_0 C_{\eta_+,\eta_*} x^2$ as large as possible; it will be competing not just with the error terms here, but, more importantly, with the bounds from the minor arcs, which will be proportional to $|\eta_+|_2^2 |\eta_*|_1$.


    Chapter 11

    Optimizing and adapting smoothing functions

    One of our goals is to maximize the quantity $C_{\eta_\circ,\eta_*}$ in (10.37) relative to $|\eta_\circ|_2^2 |\eta_*|_1$. One way to do this is to ensure that (a) $\eta_*$ is concentrated on a very short¹ interval $[0, \epsilon)$, (b) $\eta_\circ$ is supported on the interval $[0, 2]$, and is symmetric around $t = 1$, meaning that $\eta_\circ(t) \sim \eta_\circ(2 - t)$. Then, for $x \sim N/2$, the integral

    \[ \int_0^\infty \int_0^\infty \eta_\circ(t_1) \eta_\circ(t_2) \eta_*\left( \frac{N}{x} - (t_1 + t_2) \right) dt_1\, dt_2 \]

    in (10.37) should be approximately equal to

    \[ |\eta_*|_1 \cdot \int_0^\infty \eta_\circ(t) \eta_\circ\left( \frac{N}{x} - t \right) dt = |\eta_*|_1 \cdot \int_0^\infty \eta_\circ(t)^2\, dt = |\eta_*|_1 \cdot |\eta_\circ|_2^2, \tag{11.1} \]

    provided that $\eta_\circ(t) \ge 0$ for all $t$. It is easy to check (using Cauchy-Schwarz in the second step) that this is essentially optimal. (We will redo this rigorously in a little while.)

    At the same time, the fact is that major-arc estimates are best for smoothing functions $\eta$ of a particular form, and we have minor-arc estimates from Part I for a different specific smoothing $\eta_2$. The issue, then, is how do we choose $\eta_\circ$ and $\eta_*$ as above so that

    • $\eta_*$ is concentrated on $[0, \epsilon)$,

    • $\eta_\circ$ is supported on $[0, 2]$ and symmetric around $t = 1$,

    • we can give minor-arc and major-arc estimates for $\eta_*$,

    • we can give major-arc estimates for a function $\eta_+$ close to $\eta_\circ$ in $\ell_2$ norm?

    ¹ This is an idea appearing in work by Bourgain in a related context [Bou99].


    11.1 The symmetric smoothing function $\eta_\circ$

    We will later work with a smoothing function $\eta_\heartsuit$ whose Mellin transform decreases very rapidly. Because of this rapid decay, we will be able to give strong results based on an explicit formula for $\eta_\heartsuit$. The issue is how to define $\eta_\circ$, given $\eta_\heartsuit$, so that $\eta_\circ$ is symmetric around $t = 1$ (i.e., $\eta_\circ(2 - x) \sim \eta_\circ(x)$) and is very small for $x > 2$.

    We will later set $\eta_\heartsuit(t) = e^{-t^2/2}$. Let

    \[ h: t \mapsto \begin{cases} t^3 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{11.2} \]

    We define $\eta_\circ: \mathbb{R} \to \mathbb{R}$ by

    \[ \eta_\circ(t) = h(t) \eta_\heartsuit(t) = \begin{cases} t^3 (2 - t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{11.3} \]

    It is clear that $\eta_\circ$ is symmetric around $t = 1$ for $t \in [0, 2]$.
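The two expressions for the symmetric smoothing function, as a product of the cutoff h with the Gaussian and in closed form, agree because t - 1/2 - t²/2 = -(t-1)²/2. A minimal numerical sketch of this and of the symmetry around t = 1 follows; the Python names are illustrative.

```python
import math

def eta_heart(t):
    """The Gaussian e^{-t^2/2}."""
    return math.exp(-t * t / 2)

def h(t):
    """Cutoff factor t^3 (2-t)^3 e^{t-1/2} on [0, 2], zero elsewhere."""
    if 0 <= t <= 2:
        return t ** 3 * (2 - t) ** 3 * math.exp(t - 0.5)
    return 0.0

def eta_circ(t):
    """Symmetric smoothing function, defined as the product h * eta_heart."""
    return h(t) * eta_heart(t)

def eta_circ_closed(t):
    """Closed form t^3 (2-t)^3 e^{-(t-1)^2/2} on [0, 2], zero elsewhere."""
    if 0 <= t <= 2:
        return t ** 3 * (2 - t) ** 3 * math.exp(-((t - 1) ** 2) / 2)
    return 0.0

for t in (0.3, 1.0, 1.7, 2.5):
    print(t, eta_circ(t), eta_circ_closed(t), eta_circ(2 - t))
```

The triple zeros at t = 0 and t = 2 are what make the function and its first two derivatives vanish at the endpoints of the support.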

    11.1.1 The product $\eta_\circ(t) \eta_\circ(\rho - t)$

    We now should go back and redo rigorously what we discussed informally around (11.1). More precisely, we wish to estimate

    \[ (\eta_\circ * \eta_\circ)(\rho) = \int_{-\infty}^{\infty} \eta_\circ(t) \eta_\circ(\rho - t)\, dt = \int_{-\infty}^{\infty} \eta_\circ(t) \eta_\circ(2 - \rho + t)\, dt \tag{11.4} \]

    for $\rho \le 2$ close to 2. In this, it will be useful that the Cauchy-Schwarz inequality degrades slowly, in the following sense.

    Lemma 11.1.1. Let $V$ be a real vector space with an inner product $\langle \cdot, \cdot \rangle$. Then, for any $v, w \in V$ with $|w - v|_2 \le |v|_2/2$,

    \[ \langle v, w \rangle = |v|_2 |w|_2 + O^*(2.71 |v - w|_2^2). \]

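    Proof. By a truncated Taylor expansion,

    \[ \sqrt{1 + x} = 1 + \frac{x}{2} + \frac{x^2}{2} \max_{0 \le t \le 1} \frac{1}{4 (1 - (tx)^2)^{3/2}} = 1 + \frac{x}{2} + O^*\left( \frac{x^2}{2^{3/2}} \right) \]

    for $|x| \le 1/2$. Hence, for $\delta = |w - v|_2 / |v|_2$,

    \[
    \begin{aligned}
    \frac{|w|_2}{|v|_2} &= \sqrt{1 + \frac{2 \langle w - v, v \rangle + |w - v|_2^2}{|v|_2^2}}
    = 1 + \frac{2 \frac{\langle w - v, v \rangle}{|v|_2^2} + \delta^2}{2} + O^*\left( \frac{(2\delta + \delta^2)^2}{2^{3/2}} \right) \\
    &= 1 + \frac{\langle w - v, v \rangle}{|v|_2^2} + O^*\left( \left( \frac{1}{2} + \frac{(5/2)^2}{2^{3/2}} \right) \delta^2 \right)
    = 1 + \frac{\langle w - v, v \rangle}{|v|_2^2} + O^*\left( 2.71 \frac{|w - v|_2^2}{|v|_2^2} \right).
    \end{aligned}
    \]

    Multiplying by $|v|_2^2$, we obtain that

    \[ |v|_2 |w|_2 = |v|_2^2 + \langle w - v, v \rangle + O^*\left( 2.71 |w - v|_2^2 \right) = \langle v, w \rangle + O^*\left( 2.71 |w - v|_2^2 \right). \]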
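The lemma says that the Cauchy-Schwarz defect |v||w| - ⟨v, w⟩ is controlled by the squared distance between v and w. The following sketch tests this on random vectors in R^5 satisfying the hypothesis |w - v| ≤ |v|/2; the dimension, seed and sampling scheme are arbitrary choices made for illustration.

```python
import math
import random

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(inner(v, v))

def cauchy_schwarz_defect(v, w):
    """|v||w| - <v, w>, which is nonnegative by Cauchy-Schwarz."""
    return norm(v) * norm(w) - inner(v, w)

def random_pair(rng, dim=5):
    """A vector v and a perturbation w with |w - v| <= |v| / 2."""
    v = [rng.uniform(-1, 1) for _ in range(dim)]
    d = [rng.uniform(-1, 1) for _ in range(dim)]
    scale = rng.uniform(0, 0.5) * norm(v) / norm(d)
    w = [a + scale * b for a, b in zip(v, d)]
    return v, w

rng = random.Random(0)
v, w = random_pair(rng)
diff = [a - b for a, b in zip(v, w)]
print(cauchy_schwarz_defect(v, w), 2.71 * norm(diff) ** 2)
```

In experiments of this kind the defect stays well below the stated bound; the constant 2.71 is what the Taylor-expansion argument in the proof yields, not necessarily the best possible.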

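    Applying Lemma 11.1.1 to (11.4), we obtain that

    \[
    \begin{aligned}
    (\eta_\circ * \eta_\circ)(\rho) &= \int_{-\infty}^{\infty} \eta_\circ(t) \eta_\circ((2 - \rho) + t)\, dt \\
    &= \sqrt{\int_{-\infty}^{\infty} |\eta_\circ(t)|^2\, dt} \sqrt{\int_{-\infty}^{\infty} |\eta_\circ((2 - \rho) + t)|^2\, dt}
    + O^*\left( 2.71 \int_{-\infty}^{\infty} |\eta_\circ(t) - \eta_\circ((2 - \rho) + t)|^2\, dt \right) \\
    &= |\eta_\circ|_2^2 + O^*\left( 2.71 \int_{-\infty}^{\infty} \left( \int_0^{2 - \rho} |\eta_\circ'(r + t)|\, dr \right)^2 dt \right) \\
    &= |\eta_\circ|_2^2 + O^*\left( 2.71 (2 - \rho) \int_0^{2 - \rho} \int_{-\infty}^{\infty} |\eta_\circ'(r + t)|^2\, dt\, dr \right)
    = |\eta_\circ|_2^2 + O^*\left( 2.71 (2 - \rho)^2 |\eta_\circ'|_2^2 \right).
    \end{aligned}
    \tag{11.5}
    \]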

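    We will be working with $\eta_*$ supported on the non-negative reals; we recall that $\eta_\circ$ is supported on $[0, 2]$. Hence

    \[
    \begin{aligned}
    \int_0^\infty \int_0^\infty \eta_\circ(t_1) \eta_\circ(t_2) \eta_*\left( \frac{N}{x} - (t_1 + t_2) \right) dt_1\, dt_2
    &= \int_0^{N/x} (\eta_\circ * \eta_\circ)(\rho)\, \eta_*\left( \frac{N}{x} - \rho \right) d\rho \\
    &= \int_0^{N/x} \left( |\eta_\circ|_2^2 + O^*\left( 2.71 (2 - \rho)^2 |\eta_\circ'|_2^2 \right) \right) \cdot \eta_*\left( \frac{N}{x} - \rho \right) d\rho \\
    &= |\eta_\circ|_2^2 \int_0^{N/x} \eta_*(\rho)\, d\rho + 2.71 |\eta_\circ'|_2^2 \cdot O^*\left( \int_0^{N/x} ((2 - N/x) + \rho)^2 \eta_*(\rho)\, d\rho \right),
    \end{aligned}
    \tag{11.6}
    \]

    provided that $N/x \ge 2$. We see that it will be wise to set $N/x$ very slightly larger than 2. As we said before, $\eta_*$ will be scaled so that it is concentrated on a small interval $[0, \epsilon)$.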

    11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

    Here the challenge is to define a smoothing function $\eta_*$ that is good both for minor-arc estimates and for major-arc estimates. The two regimes tend to favor different kinds of smoothing function. For minor-arc estimates, we use, as [Tao14] did,

    \[ \eta_2(t) = 4 \max(\log 2 - |\log 2t|, 0) = ((2 I_{[1/2,1]}) *_M (2 I_{[1/2,1]}))(t), \tag{11.7} \]

    where $I_{[1/2,1]}(t)$ is 1 if $t \in [1/2, 1]$ and 0 otherwise. For major-arc estimates, we will use a function based on

    \[ \eta_\heartsuit = e^{-t^2/2}. \]

    (We will actually use here the function $t^2 e^{-t^2/2}$, whose Mellin transform is $M\eta_\heartsuit(s + 2)$ (by, e.g., [BBO10, Table 11.1]).)

    We will follow the simple expedient of convolving the two smoothing functions, one good for minor arcs, the other one for major arcs. In general, let $\varphi_1, \varphi_2: [0, \infty) \to \mathbb{C}$. It is easy to use bounds on sums of the form

    \[ S_{f,\varphi_1}(x) = \sum_n f(n) \varphi_1(n/x) \tag{11.8} \]

    to bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$:

    \[
    S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n) (\varphi_1 *_M \varphi_2)\left( \frac{n}{x} \right)
    = \int_0^\infty \sum_n f(n) \varphi_1\left( \frac{n}{wx} \right) \varphi_2(w) \frac{dw}{w}
    = \int_0^\infty S_{f,\varphi_1}(wx) \varphi_2(w) \frac{dw}{w}.
    \tag{11.9}
    \]

    The same holds, of course, if $\varphi_1$ and $\varphi_2$ are switched, since $\varphi_1 *_M \varphi_2 = \varphi_2 *_M \varphi_1$. The only objection is that the bounds on (11.8) that we input might not be valid, or non-trivial, when the argument $wx$ of $S_{f,\varphi_1}(wx)$ is very small. Because of this, it is important that the functions $\varphi_1$, $\varphi_2$ vanish at 0, and desirable that their first derivatives do so as well.
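The closed form of the minor-arc weight as a multiplicative (Mellin) convolution of two scaled indicator functions can be checked numerically. The sketch below assumes the convention (f *_M g)(t) = ∫ f(w) g(t/w) dw/w and integrates in the variable u = log w with a midpoint rule; the step count and function names are illustrative choices.

```python
import math

def eta1(t):
    """2 * indicator of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def eta2_closed(t):
    """4 * max(log 2 - |log 2t|, 0), extended by 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return 4 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

def mellin_convolution(f, g, t, u_min, u_max, steps=20000):
    """Midpoint-rule approximation of int f(w) g(t/w) dw/w over w = e^u,
    u in [u_min, u_max]; the caller supplies a range covering the support of f."""
    du = (u_max - u_min) / steps
    total = 0.0
    for k in range(steps):
        w = math.exp(u_min + (k + 0.5) * du)
        total += f(w) * g(t / w)
    return total * du

# eta1 is supported on [1/2, 1], i.e. u in [-log 2, 0].
for t in (0.3, 0.5, 0.8):
    approx = mellin_convolution(eta1, eta1, t, -math.log(2), 0.0)
    print(t, approx, eta2_closed(t))
```

Since the integrand is piecewise constant, the midpoint rule is accurate up to an error proportional to the step size, which is all this comparison needs.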

    Let us see how this works out in practice for $\varphi_1 = \eta_2$. Here $\eta_2: [0, \infty) \to \mathbb{R}$ is given by

    \[ \eta_2 = \eta_1 *_M \eta_1 = 4 \max(\log 2 - |\log 2t|, 0), \tag{11.10} \]

    where $\eta_1 = 2 \cdot I_{[1/2,1]}$.

    Let us restate the bounds from Theorem 3.1.1, the main result of Part I. We will use Lemma C.2.2 to bound terms of the form $q/\phi(q)$.

    Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4) x^{2/3}$. Then, if $3 \le q \le x^{1/3}/6$, Theorem 3.1.1 gives us that

    \[ |S_{\eta_2}(\alpha, x)| \le g_x\left( \max\left( 1, \frac{|\delta|}{8} \right) \cdot q \right) x, \tag{11.11} \]

    where

    \[ g_x(r) = \frac{(R_{x,2r} \log 2r + 0.5) \sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36 x^{-1/6}, \tag{11.12} \]

    with

    \[
    \begin{aligned}
    R_{x,t} &= 0.27125 \log\left( 1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004 t}} \right) + 0.41415, \\
    L_t &= z(t/2) \left( \frac{13}{4} \log t + 7.82 \right) + 13.66 \log t + 37.55.
    \end{aligned}
    \tag{11.13}
    \]

    If $q > x^{1/3}/6$, then, again by Theorem 3.1.1,

    \[ |S_{\eta_2}(\alpha, x)| \le h(x) x, \tag{11.14} \]

    where

    \[ h(x) = 0.276 x^{-1/6} (\log x)^{3/2} + 1234 x^{-1/3} \log x. \tag{11.15} \]

    We will work with $x$ varying within a range, and so we must pay some attention to the dependence of (11.11) and (11.14) on $x$. Let us prove two auxiliary lemmas on this.

    Lemma 11.2.1. Let $g_x(r)$ be as in (11.12) and $h(x)$ as in (11.15). Then

    \[ x \mapsto \begin{cases} h(x) & \text{if } x < (6r)^3, \\ g_x(r) & \text{if } x \ge (6r)^3 \end{cases} \]

    is a decreasing function of $x$ for $r \ge 11$ fixed and $x \ge 21$.
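The key inequality behind this lemma is that, at the switching point x_r = (6r)³, the bound h is at least as large as g. The sketch below transcribes the displayed formulas into floating-point Python and spot-checks that inequality for a few values of r. It assumes z(r) = e^γ log log r + 2.50637/log log r, as read off from the display in the proof, and it is only a sanity check at sample points, not a substitute for the interval-arithmetic verification described there.

```python
import math

EULER_GAMMA = 0.5772156649015329

def z(r):
    """e^gamma * log log r + 2.50637 / log log r (assumed form, see lead-in)."""
    ll = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll

def R(x, t):
    """R_{x,t} from the minor-arc bound."""
    ratio = 9 * x ** (1 / 3) / (2.004 * t)
    return 0.27125 * math.log(1 + math.log(4 * t) / (2 * math.log(ratio))) + 0.41415

def L(t):
    """L_t from the minor-arc bound."""
    return z(t / 2) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55

def g(x, r):
    """g_x(r): the bound used when q <= x^(1/3)/6."""
    main = ((R(x, 2 * r) * math.log(2 * r) + 0.5) * math.sqrt(z(r)) + 2.5) / math.sqrt(2 * r)
    return main + L(2 * r) / r + 3.36 * x ** (-1 / 6)

def h(x):
    """h(x): the bound used when q > x^(1/3)/6."""
    return 0.276 * x ** (-1 / 6) * math.log(x) ** 1.5 + 1234 * x ** (-1 / 3) * math.log(x)

for r in (11, 100, 10 ** 4, 10 ** 6):
    x_r = (6 * r) ** 3
    print(r, h(x_r), g(x_r, r))
```

Plain floating point is used here for readability; a rigorous check would replace it by interval arithmetic.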

    Proof. It is clear from the definitions that $x \mapsto h(x)$ (for $x \ge 21$) and $x \mapsto g_x(r)$ are both decreasing. Thus, we simply have to show that $h(x_r) \ge g_{x_r}(r)$ for $x_r = (6r)^3$. Since $x_r \ge (6 \cdot 11)^3 > e^{12.5}$,

    \[
    \begin{aligned}
    R_{x_r,2r} &\le 0.27125 \log(0.065 \log x_r + 1.056) + 0.41415 \\
    &\le 0.27125 \log((0.065 + 0.0845) \log x_r) + 0.41415 \le 0.27215 \log\log x_r.
    \end{aligned}
    \]

    Hence

    \[
    \begin{aligned}
    R_{x_r,2r} \log 2r + 0.5 &\le 0.27215 \log\log x_r \log x_r^{1/3} - 0.27215 \log 12.5 \log 3 + 0.5 \\
    &\le 0.09072 \log\log x_r \log x_r - 0.255.
    \end{aligned}
    \]

    At the same time,

    \[
    z(r) = e^\gamma \log\log \frac{x_r^{1/3}}{6} + \frac{2.50637}{\log\log r} \le e^\gamma \log\log x_r - e^\gamma \log 3 + 1.9521 \le e^\gamma \log\log x_r
    \tag{11.16}
    \]

    for $r \ge 37$, and we also get $z(r) \le e^\gamma \log\log x_r$ for $r \in [11, 37]$ by the bisection method with 10 iterations. Hence

    \[
    \begin{aligned}
    (R_{x_r,2r} \log 2r + 0.5) \sqrt{z(r)} + 2.5
    &\le (0.09072 \log\log x_r \log x_r - 0.255) \sqrt{e^\gamma \log\log x_r} + 2.5 \\
    &\le 0.1211 \log x_r (\log\log x_r)^{3/2} + 2,
    \end{aligned}
    \]

    and so

    \[ \frac{(R_{x_r,2r} \log 2r + 0.5) \sqrt{z(r)} + 2.5}{\sqrt{2r}} \le (0.21 \log x_r (\log\log x_r)^{3/2} + 3.47)\, x_r^{-1/6}. \]

    Now, by (11.16),

    \[
    \begin{aligned}
    L_{2r} &\le e^\gamma \log\log x_r \cdot \left( \frac{13}{4} \log(x_r^{1/3}/3) + 7.82 \right) + 13.66 \log(x_r^{1/3}/3) + 37.55 \\
    &\le e^\gamma \log\log x_r \cdot \left( \frac{13}{12} \log x_r + 4.25 \right) + 4.56 \log x_r + 22.55.
    \end{aligned}
    \]

    It is clear that

    \[ \frac{4.25 e^\gamma \log\log x_r + 4.56 \log x_r + 22.55}{x_r^{1/3}/6} < 1234 x_r^{-1/3} \log x_r \]

    for $x_r \ge e$: we make the comparison for $x_r = e$ and take the derivative of the ratio of the left side by the right side.

    It remains to show that

    \[ 0.21 \log x_r (\log\log x_r)^{3/2} + 3.47 + 3.36 + \frac{13}{2} e^\gamma x_r^{-1/3} \log x_r \log\log x_r \tag{11.17} \]

    is less than $0.276 (\log x_r)^{3/2}$ for $x_r$ large enough. Since $t \mapsto (\log t)^{3/2}/t^{1/2}$ is decreasing for $t > e^3$, we see that

    \[ \frac{0.21 \log x_r (\log\log x_r)^{3/2} + 6.83 + \frac{13}{2} e^\gamma x_r^{-1/3} \log x_r \log\log x_r}{0.276 (\log x_r)^{3/2}} < 1 \]

    for all $x_r \ge e^{33}$, simply because it is true for $x = e^{33}$, which is greater than $e^{e^3}$.

    We conclude that $h(x_r) \ge g_{x_r}(r) = g_{x_r}(x_r^{1/3}/6)$ for $x_r \ge e^{33}$. We check that $h(x_r) \ge g_{x_r}(x_r^{1/3}/6)$ for $\log x_r \in [\log 66^3, 33]$ as well by the bisection method (applied with 30 iterations, with $\log x_r$ as the variable, on the intervals $[\log 66^3, 20]$, $[20, 25]$, $[25, 30]$ and $[30, 33]$). Since $r \ge 11$ implies $x_r \ge 66^3$, we are done.

    Lemma 11.2.2. Let $R_{x,r}$ be as in (11.12). Then $t \mapsto R_{e^t,r}(r)$ is convex-up for $t \ge 3 \log 6r$.

    Proof. Since $t \mapsto e^{-t/6}$ and $t \mapsto t$ are clearly convex-up, all we have to do is to show that $t \mapsto R_{e^t,r}$ is convex-up. In general, since

    \[ (\log f)'' = \left( \frac{f'}{f} \right)' = \frac{f'' f - (f')^2}{f^2}, \]

    a function of the form $(\log f)$ is convex-up exactly when $f'' f - (f')^2 \ge 0$. If $f(t) = 1 + a/(t - b)$, we have $f'' f - (f')^2 \ge 0$ whenever

    \[ (t + a - b) \cdot (2a) \ge a^2, \]

    i.e., $a^2 + 2at \ge 2ab$, and that certainly happens when $t \ge b$. In our case, $b = 3 \log(2.004 r/9)$, and so $t \ge 3 \log 6r$ implies $t \ge b$.

    112 THE SMOOTHING FUNCTION ηlowast ADAPTING MINOR-ARC BOUNDS223

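    Now we come to the point where we prove bounds on exponential sums of the form $S_{\eta_*}(\alpha, x)$ (that is, sums based on the smoothing $\eta_*$) based on our bounds (11.11) and (11.14) on the exponential sums $S_{\eta_2}(\alpha, x)$. This is straightforward, as promised.

    Proposition 11.2.3. Let $x \ge Kx_0$, $x_0 = 2.16\cdot 10^{20}$, $K \ge 1$. Let $S_{\eta}(\alpha, x)$ be as in (10.1). Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi: [0,\infty) \to [0,\infty)$ is continuous and in $L^1$.

    Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le (x/K)^{1/3}/6$, then

    \[ |S_{\eta_*}(\alpha, x)| \le g_{x,\varphi}\left(\max\left(1, \frac{|\delta|}{8}\right)q\right)\cdot|\varphi|_1 x, \tag{11.18} \]

    where

    \[ g_{x,\varphi}(r) = \frac{(R_{x,K,\varphi,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36K^{1/6}x^{-1/6}, \qquad R_{x,K,\varphi,t} = R_{x,t} + (R_{x/K,t} - R_{x,t})\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K}, \tag{11.19} \]

    with $R_{x,t}$ and $L_t$ as in (11.13), and

    \[ C_{\varphi,2,K} = -\int_{1/K}^{1}\varphi(w)\log w\,dw. \tag{11.20} \]

    If $q > (x/K)^{1/3}/6$, then

    \[ |S_{\eta_*}(\alpha, x)| \le h_{\varphi}(x/K)\cdot|\varphi|_1 x, \]

    where

    \[ h_{\varphi}(x) = h(x) + C_{\varphi,0,K}/|\varphi|_1, \qquad C_{\varphi,0,K} = 1.04488\int_0^{1/K}|\varphi(w)|\,dw, \tag{11.21} \]

    and $h(x)$ is as in (11.15).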

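    Proof. By (11.9),

    \[ S_{\eta_*}(\alpha, x) = \int_0^{1/K}S_{\eta_2}(\alpha, wx)\varphi(w)\frac{dw}{w} + \int_{1/K}^{\infty}S_{\eta_2}(\alpha, wx)\varphi(w)\frac{dw}{w}. \]

    We bound the first integral by the trivial estimate $|S_{\eta_2}(\alpha, wx)| \le |S_{\eta_2}(0, wx)|$ and Cor. C.1.3:

    \[ \int_0^{1/K}|S_{\eta_2}(0, wx)|\varphi(w)\frac{dw}{w} \le 1.04488\int_0^{1/K}wx\,\varphi(w)\frac{dw}{w} = 1.04488\,x\cdot\int_0^{1/K}\varphi(w)\,dw. \]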


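    If $w \ge 1/K$, then $wx \ge x_0$, and we can use (11.11) or (11.14). If $q > (x/K)^{1/3}/6$, then $|S_{\eta_2}(\alpha, wx)| \le h(x/K)wx$ by (11.14); moreover, $|S_{\eta_2}(\alpha, y)| \le h(y)y$ for $x/K \le y < (6q)^3$ (by (11.14)) and $|S_{\eta_2}(\alpha, y)| \le g_{y,1}(r)$ for $y \ge (6q)^3$ (by (11.11)). Thus, Lemma 11.2.1 gives us that

    \[ \int_{1/K}^{\infty}|S_{\eta_2}(\alpha, wx)|\varphi(w)\frac{dw}{w} \le \int_{1/K}^{\infty}h(x/K)wx\cdot\varphi(w)\frac{dw}{w} = h(x/K)x\int_{1/K}^{\infty}\varphi(w)\,dw \le h(x/K)|\varphi|_1\cdot x. \]

    If $q \le (x/K)^{1/3}/6$, we always use (11.11). We can use the coarse bound

    \[ \int_{1/K}^{\infty}3.36x^{-1/6}\cdot wx\cdot\varphi(w)\frac{dw}{w} \le 3.36K^{1/6}|\varphi|_1x^{5/6}. \]

    Since $L_r$ does not depend on $x$,

    \[ \int_{1/K}^{\infty}\frac{L_r}{r}\cdot wx\cdot\varphi(w)\frac{dw}{w} \le \frac{L_r}{r}|\varphi|_1x. \]

    By Lemma 11.2.2 and $q \le (x/K)^{1/3}/6$, $y \mapsto R_{e^y,t}$ is convex-up and decreasing for $y \in [\log(x/K), \infty)$. Hence

    \[ R_{wx,t} \le \begin{cases} \dfrac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \dfrac{\log w}{\log\frac{1}{K}}\right)R_{x,t} & \text{if } w < 1,\\[1ex] R_{x,t} & \text{if } w \ge 1. \end{cases} \]

    Therefore

    \[ \begin{aligned} \int_{1/K}^{\infty}R_{wx,t}\cdot wx\cdot\varphi(w)\frac{dw}{w} &\le \int_{1/K}^{1}\left(\frac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \frac{\log w}{\log\frac{1}{K}}\right)R_{x,t}\right)x\varphi(w)\,dw + \int_1^{\infty}R_{x,t}\varphi(w)x\,dw\\ &\le R_{x,t}x\cdot\int_{1/K}^{\infty}\varphi(w)\,dw + (R_{x/K,t} - R_{x,t})\frac{x}{\log K}\int_{1/K}^{1}\varphi(w)\log w\,dw\\ &\le \left(R_{x,t}|\varphi|_1 + (R_{x/K,t} - R_{x,t})\frac{C_{\varphi,2}}{\log K}\right)\cdot x, \end{aligned} \]

    where

    \[ C_{\varphi,2,K} = -\int_{1/K}^{1}\varphi(w)\log w\,dw. \]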

    We finish by proving a couple more lemmas.

    Lemma 11.2.4. Let $x > K > 1$. Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi: [0,\infty) \to [0,\infty)$ is continuous and in $L^1$. Let $g_{x,\varphi}$ be as in (11.19).

    Then $g_{x,\varphi}(r)$ is a decreasing function of $r$ for $670 \le r \le (x/K)^{1/3}/6$.


    Proof. Taking derivatives, we can easily see that

    \[ r \mapsto \frac{\log\log r}{r}, \quad r \mapsto \frac{\log r}{r}, \quad r \mapsto \frac{\log r\,\log\log r}{r}, \quad r \mapsto \frac{(\log r)^2\log\log r}{r} \tag{11.22} \]

    are decreasing for $r \ge 20$. The same is true if $\log\log r$ is replaced by $z(r)$, since $z(r)/\log\log r$ is a decreasing function for $r \ge e$. Since $(C_{\varphi,2,K}/|\varphi|_1)/\log K \le 1$ (by (11.20)), we see that it is enough to prove that $r \mapsto R_{y,2r}\log 2r\,\sqrt{\log\log r}/\sqrt{2r}$ is decreasing on $r$ for $y = x$ and $y = x/K$ (under the assumption that $r \ge 670$). Looking at (11.13) and at (11.22), we see that it remains only to check that

    \[ r \mapsto \log\left(1 + \frac{\log 8r}{2\log\frac{9y^{1/3}}{4.008r}}\right)\log 2r\cdot\sqrt{\frac{\log\log r}{r}} \tag{11.23} \]

    is decreasing on $r$ for $r \ge 670$. Taking logarithms, and then derivatives, we see that we have to show that

    \[ \frac{\frac{1/r}{\ell} + \frac{(\log 8r)/r}{\ell^2}}{2\left(1 + \frac{\log 8r}{2\ell}\right)\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{1}{r\log 2r} + \frac{1}{2r\log r\log\log r} < \frac{1}{2r}, \]

    where $\ell = \log\frac{9y^{1/3}}{4.008r}$. We multiply by $2r$, and see that this is equivalent to

    \[ \frac{\frac{1}{\ell}\left(2 - \frac{1}{1 + \frac{\log 8r}{2\ell}}\right)}{\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1. \tag{11.24} \]

    A derivative test is enough to show that $s/\log(1+s)$ is an increasing function of $s$ for $s > 0$; hence, so is $s\cdot(2 - 1/(1+s))/\log(1+s)$. Setting $s = (\log 8r)/\ell$, we obtain that the left side of (11.24) is a decreasing function of $\ell$ for $r \ge 1$ fixed.

    Since $r \le y^{1/3}/6$, $\ell \ge \log(54/4.008) > 2.6$. Thus, for (11.24) to hold, it is enough to ensure that

    \[ \frac{\frac{1}{2.6}\left(2 - \frac{1}{1 + \frac{\log 8r}{5.2}}\right)}{\log\left(1 + \frac{\log 8r}{5.2}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1. \tag{11.25} \]

    A derivative test shows that $(2 - 1/s)/\log(1+s)$ is a decreasing function of $s$ for $s \ge 1.23$; since $\log(8\cdot 75)/5.2 > 1.23$, this implies that the left side of (11.25) is a decreasing function of $r$ for $r \ge 75$.

    We check that the left side of (11.25) is indeed less than 1 for $r = 670$; we conclude that it is less than 1 for all $r \ge 670$.
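The final numerical check in this proof is tight, so it is worth seeing it evaluated. The sketch below assumes that the left side of (11.25) reads (1/2.6)(2 - 1/(1+s))/log(1+s) + 2/log 2r + 1/(log r log log r) with s = (log 8r)/5.2; the function name `lhs_11_25` is ours, and plain floating point is used rather than the interval arithmetic a rigorous check would require.

```python
import math


def lhs_11_25(r: float) -> float:
    """Left side of (11.25), i.e. (11.24) with ell replaced by its lower bound 2.6."""
    s = math.log(8 * r) / 5.2
    first = (2 - 1 / (1 + s)) / (2.6 * math.log(1 + s))
    second = 2 / math.log(2 * r)
    third = 1 / (math.log(r) * math.log(math.log(r)))
    return first + second + third


if __name__ == "__main__":
    for r in (670, 700, 1000, 10**4):
        print(r, lhs_11_25(r))
```

The value at r = 670 falls only slightly below 1, which is consistent with 670 being the threshold chosen in the lemma.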

    Lemma 11.2.5. Let $x \ge 10^{25}$. Let $\varphi: [0,\infty) \to [0,\infty)$ be continuous and in $L^1$. Let $g_{x,\varphi}(r)$ and $h(x)$ be as in (11.19) and (11.15), respectively. Then

    \[ g_{x,\varphi}\left(\frac{3}{8}x^{4/15}\right) \ge h(2x/\log x). \]


    Proof. We can bound $g_{x,\varphi}(r)$ from below by

    \[ gm_x(r) = \frac{(R_{x,r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}}. \]

    Let $r = (3/8)x^{4/15}$. Using the assumption that $x \ge 10^{25}$, we see that

    \[ R_{x,r} = 0.27125\log\left(1 + \frac{\log\left(\frac{3x^{4/15}}{2}\right)}{2\log\left(\frac{9}{2.004\cdot\frac{3}{8}}\cdot x^{\frac{1}{3}-\frac{4}{15}}\right)}\right) + 0.41415 \ge 0.63368. \tag{11.26} \]

    (It is easy to see that the left side of (11.26) is increasing on $x$.) Using $x \ge 10^{25}$ again, we get that

    \[ z(r) = e^{\gamma}\log\log r + \frac{2.50637}{\log\log r} \ge 5.68721. \]

    Since $\log 2r = (4/15)\log x + \log(3/4)$, we conclude that

    \[ gm_x(r) \ge \frac{0.40298\log x + 3.25765}{\sqrt{3/4}\cdot x^{2/15}}. \]

    Recall that

    \[ h(x) = \frac{0.276(\log x)^{3/2}}{x^{1/6}} + \frac{1234\log x}{x^{1/3}}. \]

    We can see that

    \[ x \mapsto \frac{(\log x + 3.3)/x^{2/15}}{(\log(2x/\log x))^{3/2}/(2x/\log x)^{1/6}} \tag{11.27} \]

    is increasing for $x \ge 10^{25}$ (and indeed for $x \ge e^{27}$) by taking the logarithm of the right side of (11.27) and then taking its derivative with respect to $t = \log x$. We can see in the same way that $(1/x^{2/15})/(\log(2x/\log x)/(2x/\log x)^{1/3})$ is increasing for $x \ge e^{22}$. Since

    \[ \frac{0.40298(\log x + 3.3)}{\sqrt{3/4}\cdot x^{2/15}} \ge \frac{0.276(\log(2x/\log x))^{3/2}}{(2x/\log x)^{1/6}}, \qquad \frac{3.25765 - 3.3\cdot 0.40298}{\sqrt{3/4}\cdot x^{2/15}} \ge \frac{1234\log(2x/\log x)}{(2x/\log x)^{1/3}} \]

    for $x = 10^{25}$, we are done.
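The two inequalities at x = 10^25 that close the proof of Lemma 11.2.5 can be checked directly. The sketch below assumes the reading h(x) = 0.276 (log x)^{3/2}/x^{1/6} + 1234 (log x)/x^{1/3} and the lower bound (0.40298 log x + 3.25765)/(sqrt(3/4) x^{2/15}) for gm_x(r); the helper names are ours, and ordinary floating point stands in for a rigorous computation.

```python
import math


def h(x: float) -> float:
    """h(x) as recalled in the proof (constants read from the text)."""
    return 0.276 * math.log(x) ** 1.5 / x ** (1 / 6) + 1234 * math.log(x) / x ** (1 / 3)


def gm_lower(x: float) -> float:
    """Lower bound for gm_x(r) at r = (3/8) x^(4/15)."""
    return (0.40298 * math.log(x) + 3.25765) / (math.sqrt(0.75) * x ** (2 / 15))


def endpoint_checks(x: float = 1e25):
    """Return the truth values of the two inequalities used at x = 10^25."""
    y = 2 * x / math.log(x)
    denom = math.sqrt(0.75) * x ** (2 / 15)
    first = 0.40298 * (math.log(x) + 3.3) / denom >= 0.276 * math.log(y) ** 1.5 / y ** (1 / 6)
    second = (3.25765 - 3.3 * 0.40298) / denom >= 1234 * math.log(y) / y ** (1 / 3)
    return first, second


if __name__ == "__main__":
    print(endpoint_checks(), gm_lower(1e25), h(2e25 / math.log(1e25)))
```

Adding the two inequalities gives gm_lower(x) >= h(2x/log x) at the endpoint; the first of them holds with very little room to spare.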

    Chapter 12

    The $\ell_2$ norm and the large sieve

    Our aim here is to give a bound on the $\ell_2$ norm of an exponential sum over the minor arcs. While we care about an exponential sum in particular, we will prove a result valid for all exponential sums $S(\alpha, x) = \sum_n a_n e(\alpha n)$ with $a_n$ of prime support.

    We start by adapting ideas from Ramaré's version of the large sieve for primes to estimate $\ell_2$ norms over parts of the circle (§12.1). We are left with the task of giving an explicit bound on the factor in Ramaré's work; this we do in §12.2. As a side effect, this finally gives a fully explicit large sieve for primes that is asymptotically optimal, meaning a sieve that does not have a spurious factor of $e^{\gamma}$ in front; this was an arguably important gap in the literature.

    12.1 Variations on the large sieve for primes

    We are trying to estimate an integral $\int_{\mathbb{R}/\mathbb{Z}}|S(\alpha)|^3\,d\alpha$. Instead of bounding it trivially by $|S|_{\infty}|S|_2^2$, we can use the fact that large ("major") values of $S(\alpha)$ have to be multiplied only by $\int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha$, where $\mathfrak{M}$ is a union (small in measure) of major arcs. Now, can we give an upper bound for $\int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha$ better than $|S|_2^2 = \int_{\mathbb{R}/\mathbb{Z}}|S(\alpha)|^2\,d\alpha$?

    The first version of [Helb] gave an estimate on that integral using a technique due to Heath-Brown, which in turn rests on an inequality of Montgomery's ([Mon71, (3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by Heath-Brown to the present author, who communicated it to Tao, who used it in his own notable work on sums of five primes (see [Tao14, Lem. 4.6] and adjoining comments). We will be able to do better than that estimate here.

    The role played by Montgomery's inequality in Heath-Brown's method is played here by a result of Ramaré's ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]). The following proposition is based on Ramaré's result, or rather on one possible proof of it. Instead of using the result as stated in [Ram09], we will actually be using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply integrating Ramaré's inequality would give a non-trivial if slightly worse bound.


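    Proposition 12.1.1. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let

    \[ \mathfrak{M} = \bigcup_{q\le Q_0}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}}\left(\frac{a}{q} - \frac{\delta_0r}{qx}, \frac{a}{q} + \frac{\delta_0r}{qx}\right). \tag{12.1} \]

    Let $S(\alpha) = \sum_n a_ne(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

    \[ \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha \le \left(\max_{q\le Q_0}\max_{s\le Q_0/q}\frac{G_q(Q_0/sq)}{G_q(Q/sq)}\right)\sum_n|a_n|^2, \]

    where

    \[ G_q(R) = \sum_{\substack{r\le R\\ (r,q)=1}}\frac{\mu^2(r)}{\phi(r)}. \tag{12.2} \]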

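    Proof. By (12.1),

    \[ \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha = \sum_{q\le Q_0}\int_{-\delta_0Q_0/(qx)}^{\delta_0Q_0/(qx)}\sum_{\substack{a \bmod q\\ (a,q)=1}}\left|S\left(\frac{a}{q} + \alpha\right)\right|^2d\alpha. \tag{12.3} \]

    Thanks to the last equations of [Bom74, p. 24] and [Bom74, p. 25],

    \[ \sum_{\substack{a \bmod q\\ (a,q)=1}}\left|S\left(\frac{a}{q}\right)\right|^2 = \frac{1}{\phi(q)}\sum_{\substack{q^*|q\\ (q^*,q/q^*)=1\\ \mu^2(q/q^*)=1}}q^*\cdot\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_n\chi(n)\Bigr|^2 \]

    for every $q \le \sqrt{x}$, where we use the assumption that $n$ is prime and $> \sqrt{x}$ (and thus coprime to $q$) when $a_n \ne 0$. Hence

    \[ \begin{aligned} \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha &= \sum_{q\le Q_0}\sum_{\substack{q^*|q\\ (q^*,q/q^*)=1\\ \mu^2(q/q^*)=1}}q^*\int_{-\delta_0Q_0/(qx)}^{\delta_0Q_0/(qx)}\frac{1}{\phi(q)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha\\ &= \sum_{q^*\le Q_0}\frac{q^*}{\phi(q^*)}\sum_{\substack{r\le Q_0/q^*\\ (r,q^*)=1}}\frac{\mu^2(r)}{\phi(r)}\int_{-\delta_0Q_0/(q^*rx)}^{\delta_0Q_0/(q^*rx)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha\\ &= \sum_{q^*\le Q_0}\frac{q^*}{\phi(q^*)}\int_{-\delta_0Q_0/(q^*x)}^{\delta_0Q_0/(q^*x)}\sum_{\substack{r\le\frac{Q_0}{q^*}\min\left(1,\frac{\delta_0}{|\alpha|x}\right)\\ (r,q^*)=1}}\frac{\mu^2(r)}{\phi(r)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha. \end{aligned} \]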


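    Here $|\alpha| \le \delta_0Q_0/(q^*x)$ implies $(Q_0/q)\delta_0/(|\alpha|x) \ge 1$. Therefore,

    \[ \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha \le \left(\max_{q^*\le Q_0}\max_{s\le Q_0/q^*}\frac{G_{q^*}(Q_0/sq^*)}{G_{q^*}(Q/sq^*)}\right)\cdot\Sigma, \tag{12.4} \]

    where

    \[ \begin{aligned} \Sigma &= \sum_{q^*\le Q_0}\frac{q^*}{\phi(q^*)}\int_{-\delta_0Q_0/(q^*x)}^{\delta_0Q_0/(q^*x)}\sum_{\substack{r\le\frac{Q}{q^*}\min\left(1,\frac{\delta_0}{|\alpha|x}\right)\\ (r,q^*)=1}}\frac{\mu^2(r)}{\phi(r)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha\\ &\le \sum_{q\le Q}\frac{q}{\phi(q)}\sum_{\substack{r\le Q/q\\ (r,q)=1}}\frac{\mu^2(r)}{\phi(r)}\int_{-\delta_0Q/(qrx)}^{\delta_0Q/(qrx)}\sum^{*}_{\chi \bmod q}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha. \end{aligned} \]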

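    As stated in the proof of [Bom74, Thm. 7A],

    \[ \chi(r)\chi(n)\tau(\chi)c_r(n) = \sum_{\substack{b=1\\ (b,qr)=1}}^{qr}\chi(b)e^{2\pi in\frac{b}{qr}} \]

    for $\chi$ primitive of modulus $q$. Here $c_r(n)$ stands for the Ramanujan sum

    \[ c_r(n) = \sum_{\substack{u \bmod r\\ (u,r)=1}}e^{2\pi inu/r}. \]

    For $n$ coprime to $r$, $c_r(n) = \mu(r)$. Since $\chi$ is primitive, $|\tau(\chi)| = \sqrt{q}$. Hence, for $r \le \sqrt{x}$ coprime to $q$,

    \[ q\,\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2 = \Biggl|\sum_{\substack{b=1\\ (b,qr)=1}}^{qr}\chi(b)S\left(\frac{b}{qr} + \alpha\right)\Biggr|^2. \]

    Thus,

    \[ \begin{aligned} \Sigma &= \sum_{q\le Q}\sum_{\substack{r\le Q/q\\ (r,q)=1}}\frac{\mu^2(r)}{\phi(rq)}\int_{-\delta_0Q/(qrx)}^{\delta_0Q/(qrx)}\sum^{*}_{\chi \bmod q}\Biggl|\sum_{\substack{b=1\\ (b,qr)=1}}^{qr}\chi(b)S\left(\frac{b}{qr} + \alpha\right)\Biggr|^2d\alpha\\ &\le \sum_{q\le Q}\frac{1}{\phi(q)}\int_{-\delta_0Q/(qx)}^{\delta_0Q/(qx)}\sum_{\chi \bmod q}\Biggl|\sum_{\substack{b=1\\ (b,q)=1}}^{q}\chi(b)S\left(\frac{b}{q} + \alpha\right)\Biggr|^2d\alpha\\ &= \sum_{q\le Q}\int_{-\delta_0Q/(qx)}^{\delta_0Q/(qx)}\sum_{\substack{b=1\\ (b,q)=1}}^{q}\left|S\left(\frac{b}{q} + \alpha\right)\right|^2d\alpha. \end{aligned} \]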


    Let us now check that the intervals $(b/q - \delta_0Q/(qx), b/q + \delta_0Q/(qx))$ do not overlap. Since $Q = \sqrt{x/2\delta_0}$, we see that $\delta_0Q/(qx) = 1/(2qQ)$. The difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/(qq')$. For $q, q' \le Q$, $1/(qq') \ge 1/(2qQ) + 1/(2Qq')$. Hence the intervals around $b/q$ and $b'/q'$ do not overlap. We conclude that

    \[ \Sigma \le \int_{\mathbb{R}/\mathbb{Z}}|S(\alpha)|^2\,d\alpha = \sum_n|a_n|^2, \]

    and so, by (12.4), we are done.
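The proof above uses the fact that the Ramanujan sum c_r(n) equals μ(r) whenever n is coprime to r. The following sketch checks this numerically for small r and n by summing the exponentials directly; the helper names `mobius` and `ramanujan_sum` are ours, and this is an illustration rather than part of the argument.

```python
import cmath
from math import gcd


def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def ramanujan_sum(r: int, n: int) -> complex:
    """c_r(n): sum of e^(2*pi*i*n*u/r) over u mod r with gcd(u, r) = 1."""
    return sum(
        cmath.exp(2j * cmath.pi * n * u / r)
        for u in range(1, r + 1)
        if gcd(u, r) == 1
    )


if __name__ == "__main__":
    for r, n in [(6, 5), (12, 7), (30, 7)]:
        print(r, n, ramanujan_sum(r, n), mobius(r))
```

Up to rounding error, the computed sums are real and agree with μ(r) in every coprime case tried.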

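    We will actually use Prop. 12.1.1 in the slightly modified form given by the following statement.

    Proposition 12.1.2. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Let $S(\alpha) = \sum_n a_ne(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

    \[ \int_{\mathfrak{M}_{\delta_0,Q_0}}|S(\alpha)|^2\,d\alpha \le \left(\max_{\substack{q\le 2Q_0\\ q \text{ even}}}\max_{s\le 2Q_0/q}\frac{G_q(2Q_0/sq)}{G_q(2Q/sq)}\right)\sum_n|a_n|^2, \]

    where

    \[ G_q(R) = \sum_{\substack{r\le R\\ (r,q)=1}}\frac{\mu^2(r)}{\phi(r)}. \tag{12.5} \]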

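    Proof. By (10.5),

    \[ \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha = \sum_{\substack{q\le Q_0\\ q \text{ odd}}}\int_{-\delta_0Q_0/(2qx)}^{\delta_0Q_0/(2qx)}\sum_{\substack{a \bmod q\\ (a,q)=1}}\left|S\left(\frac{a}{q} + \alpha\right)\right|^2d\alpha + \sum_{\substack{q\le Q_0\\ q \text{ even}}}\int_{-\delta_0Q_0/(qx)}^{\delta_0Q_0/(qx)}\sum_{\substack{a \bmod q\\ (a,q)=1}}\left|S\left(\frac{a}{q} + \alpha\right)\right|^2d\alpha. \]

    We proceed as in the proof of Prop. 12.1.1. We still have (12.3). Hence $\int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha$ equals

    \[ \begin{aligned} &\sum_{\substack{q^*\le Q_0\\ q^* \text{ odd}}}\frac{q^*}{\phi(q^*)}\int_{-\delta_0Q_0/(2q^*x)}^{\delta_0Q_0/(2q^*x)}\sum_{\substack{r\le\frac{Q_0}{q^*}\min\left(1,\frac{\delta_0}{2|\alpha|x}\right)\\ (r,2q^*)=1}}\frac{\mu^2(r)}{\phi(r)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha\\ &\quad+ \sum_{\substack{q^*\le 2Q_0\\ q^* \text{ even}}}\frac{q^*}{\phi(q^*)}\int_{-\delta_0Q_0/(q^*x)}^{\delta_0Q_0/(q^*x)}\sum_{\substack{r\le\frac{2Q_0}{q^*}\min\left(1,\frac{\delta_0}{2|\alpha|x}\right)\\ (r,q^*)=1}}\frac{\mu^2(r)}{\phi(r)}\sum^{*}_{\chi \bmod q^*}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha. \end{aligned} \]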


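    (The sum with $q$ odd and $r$ even is equal to the first sum; hence the factor of 2 in front.) Therefore,

    \[ \int_{\mathfrak{M}}|S(\alpha)|^2\,d\alpha \le \left(\max_{\substack{q^*\le Q_0\\ q^* \text{ odd}}}\max_{s\le Q_0/q^*}\frac{G_{2q^*}(Q_0/sq^*)}{G_{2q^*}(Q/sq^*)}\right)\cdot 2\Sigma_1 + \left(\max_{\substack{q^*\le 2Q_0\\ q^* \text{ even}}}\max_{s\le 2Q_0/q^*}\frac{G_{q^*}(2Q_0/sq^*)}{G_{q^*}(2Q/sq^*)}\right)\cdot\Sigma_2, \tag{12.6} \]

    where

    \[ \begin{aligned} \Sigma_1 &= \sum_{\substack{q\le Q\\ q \text{ odd}}}\frac{q}{\phi(q)}\sum_{\substack{r\le Q/q\\ (r,2q)=1}}\frac{\mu^2(r)}{\phi(r)}\int_{-\delta_0Q/(2qrx)}^{\delta_0Q/(2qrx)}\sum^{*}_{\chi \bmod q}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha\\ &= \sum_{\substack{q\le Q\\ q \text{ odd}}}\frac{q}{\phi(q)}\sum_{\substack{r\le 2Q/q\\ (r,q)=1\\ r \text{ even}}}\frac{\mu^2(r)}{\phi(r)}\int_{-\delta_0Q/(qrx)}^{\delta_0Q/(qrx)}\sum^{*}_{\chi \bmod q}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha,\\ \Sigma_2 &= \sum_{\substack{q\le 2Q\\ q \text{ even}}}\frac{q}{\phi(q)}\sum_{\substack{r\le 2Q/q\\ (r,q)=1}}\frac{\mu^2(r)}{\phi(r)}\int_{-\delta_0Q/(qrx)}^{\delta_0Q/(qrx)}\sum^{*}_{\chi \bmod q}\Bigl|\sum_n a_ne(\alpha n)\chi(n)\Bigr|^2d\alpha. \end{aligned} \]

    The two expressions within parentheses in (12.6) are actually equal. Much as before, using [Bom74, Thm. 7A], we obtain that

    \[ \begin{aligned} \Sigma_1 &\le \sum_{\substack{q\le Q\\ q \text{ odd}}}\frac{1}{\phi(q)}\int_{-\delta_0Q/(2qx)}^{\delta_0Q/(2qx)}\sum_{\substack{b=1\\ (b,q)=1}}^{q}\left|S\left(\frac{b}{q} + \alpha\right)\right|^2d\alpha,\\ \Sigma_1 + \Sigma_2 &\le \sum_{\substack{q\le 2Q\\ q \text{ even}}}\frac{1}{\phi(q)}\int_{-\delta_0Q/(qx)}^{\delta_0Q/(qx)}\sum_{\substack{b=1\\ (b,q)=1}}^{q}\left|S\left(\frac{b}{q} + \alpha\right)\right|^2d\alpha. \end{aligned} \]

    Let us now check that the intervals of integration $(b/q - \delta_0Q/(2qx), b/q + \delta_0Q/(2qx))$ (for $q$ odd), $(b/q - \delta_0Q/(qx), b/q + \delta_0Q/(qx))$ (for $q$ even) do not overlap. Recall that $\delta_0Q/(qx) = 1/(2qQ)$. The absolute value of the difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/(qq')$. For $q, q' \le Q$ odd, this is larger than $1/(4qQ) + 1/(4Qq')$, and so the intervals do not overlap. For $q \le Q$ odd and $q' \le 2Q$ even (or vice versa), $1/(qq') \ge 1/(4qQ) + 1/(2Qq')$, and so, again, the intervals do not overlap. If $q \le Q$ and $q' \le Q$ are both even, then $|b/q - b'/q'|$ is actually $\ge 2/(qq')$. Clearly, $2/(qq') \ge 1/(2qQ) + 1/(2Qq')$, and so again there is no overlap. We conclude that

    \[ 2\Sigma_1 + \Sigma_2 \le \int_{\mathbb{R}/\mathbb{Z}}|S(\alpha)|^2\,d\alpha = \sum_n|a_n|^2. \]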
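The non-overlap argument at the end of this proof can be checked exhaustively for small parameters. The sketch below takes Q to be a small integer (an assumption made only for the experiment; in the text Q = sqrt(x/2δ_0) is real), places an arc of radius 1/(4qQ) around each reduced b/q with q ≤ Q odd and an arc of radius 1/(2qQ) around each reduced b/q with q ≤ 2Q even, and verifies with exact rational arithmetic that no two arcs overlap on the circle. The function names are ours.

```python
from fractions import Fraction
from math import gcd


def arcs(Q: int):
    """Centres and radii of the arcs: odd q <= Q, even q <= 2Q, b/q reduced, taken mod 1."""
    out = []
    for q in range(1, 2 * Q + 1):
        if q % 2 == 1 and q > Q:
            continue  # odd denominators only go up to Q
        radius = Fraction(1, (4 if q % 2 == 1 else 2) * q * Q)
        for b in range(q):
            if gcd(b, q) == 1:
                out.append((Fraction(b, q), radius))
    return out


def arcs_disjoint(Q: int) -> bool:
    """True if no two open arcs overlap on R/Z (touching endpoints are allowed)."""
    pts = arcs(Q)
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            (x, rx), (y, ry) = pts[i], pts[j]
            d = abs(x - y)
            d = min(d, 1 - d)  # distance on the circle
            if d < rx + ry:
                return False
    return True


if __name__ == "__main__":
    print([arcs_disjoint(Q) for Q in range(1, 9)])
```

Exact fractions are used so that borderline cases, where two arcs share an endpoint, are decided correctly.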


    12.2 Bounding the quotient in the large sieve for primes

    The estimate given by Proposition 12.1.1 involves the quotient

    \[ \max_{q\le Q_0}\max_{s\le Q_0/q}\frac{G_q(Q_0/sq)}{G_q(Q/sq)}, \tag{12.7} \]

    where $G_q$ is as in (12.2). The appearance of such a quotient (at least for $s = 1$) is typical of Ramaré's version of the large sieve for primes; see, e.g., [Ram09]. We will see how to bound such a quotient in a way that is essentially optimal, not just asymptotically, but also in the ranges that are most relevant to us. (This includes, for example, $Q_0 \sim 10^6$, $Q \sim 10^{15}$.)

    As the present work shows, an approach based on Ramaré's work gives bounds that are, in some contexts, better than those of other large sieves for primes by a constant factor (approaching $e^{\gamma} = 1.78107\ldots$). Thus, giving a fully explicit and nearly optimal bound for (12.7) is a task of clear general relevance, besides being needed for our main goal.

    We will obtain bounds for $G_q(Q_0/sq)/G_q(Q/sq)$ when $Q_0 \le 2\cdot 10^{10}$, $Q \ge Q_0^2$. As we shall see, our bounds will be best when $s = q = 1$, or sometimes when $s = 1$ and $q = 2$ instead.

    Write $G(R)$ for $G_1(R) = \sum_{r\le R}\mu^2(r)/\phi(r)$. We will need several estimates for $G_q(R)$ and $G(R)$. As stated in [Ram95, Lemma 3.4],

    \[ G(R) \le \log R + 1.4709 \tag{12.8} \]

    for $R \ge 1$. By [MV73, Lem. 7],

    \[ G(R) \ge \log R + 1.07 \tag{12.9} \]

    for $R \ge 6$. There is also the trivial bound

    \[ G(R) = \sum_{r\le R}\frac{\mu^2(r)}{\phi(r)} = \sum_{r\le R}\frac{\mu^2(r)}{r}\prod_{p|r}\left(1 - \frac{1}{p}\right)^{-1} = \sum_{r\le R}\frac{\mu^2(r)}{r}\prod_{p|r}\sum_{j\ge 0}\frac{1}{p^j} \ge \sum_{r\le R}\frac{1}{r} > \log R. \tag{12.10} \]

    The following bound, also well-known and easy,

    \[ G(R) \le \frac{q}{\phi(q)}G_q(R) \le G(Rq), \tag{12.11} \]

    can be obtained by multiplying $G_q(R) = \sum_{r\le R:(r,q)=1}\mu^2(r)/\phi(r)$ term-by-term by $q/\phi(q) = \prod_{p|q}(1 + 1/\phi(p))$.
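The bounds (12.8) to (12.11) are easy to test for small R. The sketch below computes G_q(R) by brute force, using trial division for φ and μ, with R restricted to integers; the function names are ours, and the code only illustrates the inequalities.

```python
import math
from math import gcd


def phi_and_mu(n: int):
    """Return (phi(n), mu(n)) by trial division."""
    phi, mu, m, p = n, 1, n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            phi -= phi // p
            if m % p == 0:
                mu = 0  # p^2 divides n
                while m % p == 0:
                    m //= p
            else:
                mu = -mu
        p += 1
    if m > 1:
        phi -= phi // m
        mu = -mu
    return phi, mu


def G(R: int, q: int = 1) -> float:
    """G_q(R): sum of mu(r)^2 / phi(r) over r <= R coprime to q."""
    total = 0.0
    for r in range(1, R + 1):
        if gcd(r, q) == 1:
            phi, mu = phi_and_mu(r)
            if mu != 0:
                total += 1.0 / phi
    return total


if __name__ == "__main__":
    for R in (6, 7, 100, 1000):
        print(R, G(R), math.log(R) + 1.07, math.log(R) + 1.4709)
```

For instance, G(6) = 1 + 1 + 1/2 + 1/4 + 1/2 = 3.25, which lies just below log 6 + 1.4709; the upper bound (12.8) is nearly attained for such small R.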

    We will also use Ramaré's estimate from [Ram95, Lem. 3.4]:

    \[ G_d(R) = \frac{\phi(d)}{d}\left(\log R + c_E + \sum_{p|d}\frac{\log p}{p}\right) + O^*\left(7.284R^{-1/3}f_1(d)\right) \tag{12.12} \]


    for all $d \in \mathbb{Z}^+$ and all $R \ge 1$, where

    \[ f_1(d) = \prod_{p|d}(1 + p^{-2/3})\left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)^{-1} \tag{12.13} \]

    and

    \[ c_E = \gamma + \sum_{p\ge 2}\frac{\log p}{p(p-1)} = 1.3325822\ldots \tag{12.14} \]

    by [RS62, (2.11)].

    If $R \ge 182$, then

    \[ \log R + 1.312 \le G(R) \le \log R + 1.354, \tag{12.15} \]

    where the upper bound is valid for $R \ge 120$. This is true by (12.12) for $R \ge 4\cdot 10^7$; we check (12.15) for $120 \le R \le 4\cdot 10^7$ by a numerical computation.¹ Similarly, for $R \ge 200$,

    \[ \frac{\log R + 1.661}{2} \le G_2(R) \le \frac{\log R + 1.698}{2} \tag{12.16} \]

    by (12.12) for $R \ge 1.6\cdot 10^8$, and by a numerical computation for $200 \le R \le 1.6\cdot 10^8$.

    Write $\rho = (\log Q_0)/(\log Q) \le 1$. We obtain immediately from (12.15) and (12.16) that

    \[ \frac{G(Q_0)}{G(Q)} \le \frac{\log Q_0 + 1.354}{\log Q + 1.312}, \qquad \frac{G_2(Q_0)}{G_2(Q)} \le \frac{\log Q_0 + 1.698}{\log Q + 1.661} \tag{12.17} \]

    for $Q, Q_0 \ge 200$. What is hard is to approximate $G_q(Q_0)/G_q(Q)$ for $q$ large and $Q_0$ small.

    Let us start by giving an easy bound, off from the truth by a factor of about $e^{\gamma}$. (Specialists will recognize this as a factor that appears often in first attempts at estimates based on either large or small sieves.) First, we need a simple explicit lemma.
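The constant c_E in (12.14) can be approximated by truncating the sum over primes. The sketch below sums log p/(p(p-1)) over primes up to a cutoff; the cutoff, the hard-coded value of Euler's constant and the function names are our choices. The neglected tail is of size roughly 1/cutoff, so the partial sums approach the constant from below.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_upto(n: int):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]


def c_E_partial(limit: int) -> float:
    """gamma + sum over primes p <= limit of log p / (p (p - 1))."""
    return EULER_GAMMA + sum(math.log(p) / (p * (p - 1)) for p in primes_upto(limit))


if __name__ == "__main__":
    for limit in (10**3, 10**5, 10**6):
        print(limit, c_E_partial(limit))
```

With a cutoff of 10^6 the partial sum agrees with the value 1.3325822 quoted in (12.14) to about five decimal places.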

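    Lemma 12.2.1. Let $m \ge 1$, $q \ge 1$. Then

    \[ \prod_{p|q \text{ or } p\le m}\frac{p}{p-1} \le e^{\gamma}(\log(m + \log q) + 0.65771). \tag{12.18} \]

    Proof. Let $P = \prod_{p\le m \text{ or } p|q}p$. Then, by [RS75, (5.1)],

    \[ P \le q\prod_{p\le m}p = qe^{\sum_{p\le m}\log p} \le qe^{(1+\epsilon_0)m}, \]

    where $\epsilon_0 = 0.001102$. Now, by [RS62, (3.42)],

    \[ \frac{n}{\phi(n)} \le e^{\gamma}\log\log n + \frac{2.50637}{\log\log n} \le e^{\gamma}\log\log x + \frac{2.50637}{\log\log x} \]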

    ¹ Using D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas.


    for all $x \ge n \ge 27$ (since, given $a, b > 0$, the function $t \mapsto at + b/t$ is increasing on $t$ for $t \ge \sqrt{b/a}$). Hence, if $qe^m \ge 27$,

    \[ \frac{P}{\phi(P)} \le e^{\gamma}\log((1+\epsilon_0)m + \log q) + \frac{2.50637}{\log(m + \log q)} \le e^{\gamma}\left(\log(m + \log q) + \epsilon_0 + \frac{2.50637/e^{\gamma}}{\log(m + \log q)}\right). \]

    Thus (12.18) holds when $m + \log q \ge 8.53$, since then $\epsilon_0 + (2.50637/e^{\gamma})/\log(m + \log q) \le 0.65771$. We verify all choices of $m, q \ge 1$ with $m + \log q \le 8.53$ computationally; the worst case is that of $m = 1$, $q = 6$, which gives the value 0.65771 in (12.18).
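The computational verification mentioned at the end of this proof can be imitated on a small scale. The sketch below evaluates both sides of (12.18) for integer m and q in a modest range (the range and the function names are our choices) and looks at the case m = 1, q = 6, where the left side is exactly 2 · (3/2) = 3.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))


def lhs_12_18(m: int, q: int) -> float:
    """Product of p/(p-1) over primes p with p | q or p <= m."""
    primes = {p for p in range(2, max(m, q) + 1) if is_prime(p) and (p <= m or q % p == 0)}
    prod = 1.0
    for p in primes:
        prod *= p / (p - 1)
    return prod


def rhs_12_18(m: int, q: int) -> float:
    """e^gamma (log(m + log q) + 0.65771)."""
    return math.exp(EULER_GAMMA) * (math.log(m + math.log(q)) + 0.65771)


if __name__ == "__main__":
    print(lhs_12_18(1, 6), rhs_12_18(1, 6))
```

At m = 1, q = 6 the two sides differ by only a few millionths, which matches the remark that this case determines the constant 0.65771.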

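    Here is the promised easy bound.

    Lemma 12.2.2. Let $Q_0 \ge 1$, $Q \ge 182Q_0$. Let $q \le Q_0$, $s \le Q_0/q$, $q$ an integer. Then

    \[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{e^{\gamma}\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312} \le \frac{e^{\gamma}\log Q_0 + 1.172}{\log\frac{Q}{Q_0} + 1.312}. \]

    Proof. Let $P = \prod_{p\le Q_0/sq \text{ or } p|q}p$. Then

    \[ G_q(Q_0/sq)\,G_P(Q/Q_0) \le G_q(Q/sq), \]

    and so

    \[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{1}{G_P(Q/Q_0)}. \tag{12.19} \]

    Now the lower bound in (12.11) gives us that, for $d = P$, $R = Q/Q_0$,

    \[ G_P(Q/Q_0) \ge \frac{\phi(P)}{P}G(Q/Q_0). \]

    By Lem. 12.2.1,

    \[ \frac{P}{\phi(P)} \le e^{\gamma}\left(\log\left(\frac{Q_0}{sq} + \log q\right) + 0.658\right). \]

    Hence, using (12.15), we get that

    \[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{P/\phi(P)}{G(Q/Q_0)} \le \frac{e^{\gamma}\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312}, \tag{12.20} \]

    since $Q/Q_0 \ge 184$. Since

    \[ \left(\frac{Q_0}{sq} + \log q\right)' = -\frac{Q_0}{sq^2} + \frac{1}{q} = \frac{1}{q}\left(1 - \frac{Q_0}{sq}\right) \le 0, \]

    the rightmost expression of (12.20) is maximal for $q = 1$.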


    Lemma 12.2.2 will play a crucial role in reducing to a finite computation the problem of bounding $G_q(Q_0/sq)/G_q(Q/sq)$. As we will now see, we can use Lemma 12.2.2 to obtain a bound that is useful when $sq$ is large compared to $Q_0$, precisely the case in which asymptotic estimates such as (12.12) are relatively weak.

    Lemma 12.2.3. Let $Q_0 \ge 1$, $Q \ge 200Q_0$. Let $q \le Q_0$, $s \le Q_0/q$. Let $\rho = (\log Q_0)/\log Q \le 2/3$. Then, for any $\sigma \ge 1.312\rho$,

    \[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + \sigma}{\log Q + 1.312} \tag{12.21} \]

    holds provided that

    \[ \frac{Q_0}{sq} \le c(\sigma)\cdot Q_0^{(1-\rho)e^{-\gamma}} - \log q, \]

    where $c(\sigma) = \exp(\exp(-\gamma)\cdot(\sigma - \sigma^2/5.248 - 1.172))$.

    Proof. By Lemma 12.2.2, we see that (12.21) will hold provided that

    \[ e^{\gamma}\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le \frac{\log\frac{Q}{Q_0} + 1.312}{\log Q + 1.312}\cdot(\log Q_0 + \sigma). \tag{12.22} \]

    The expression on the right of (12.22) equals

    \[ \log Q_0 + \sigma - \frac{(\log Q_0 + \sigma)\log Q_0}{\log Q + 1.312} = (1-\rho)(\log Q_0 + \sigma) + \frac{1.312\rho(\log Q_0 + \sigma)}{\log Q + 1.312} \ge (1-\rho)(\log Q_0 + \sigma) + 1.312\rho^2, \]

    and so (12.22) will hold provided that

    \[ e^{\gamma}\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le (1-\rho)(\log Q_0) + (1-\rho)\sigma + 1.312\rho^2. \]

    Taking derivatives, we see that

    \[ (1-\rho)\sigma + 1.312\rho^2 - 1.172 \ge \left(1 - \frac{\sigma}{2.624}\right)\sigma + 1.312\left(\frac{\sigma}{2.624}\right)^2 - 1.172 = \sigma - \frac{\sigma^2}{4\cdot 1.312} - 1.172. \]

    Hence it is enough that

    \[ \frac{Q_0}{sq} + \log q \le e^{e^{-\gamma}\left((1-\rho)\log Q_0 + \sigma - \frac{\sigma^2}{4\cdot 1.312} - 1.172\right)} = c(\sigma)\cdot Q_0^{(1-\rho)e^{-\gamma}}, \]

    where $c(\sigma) = \exp(\exp(-\gamma)\cdot(\sigma - \sigma^2/5.248 - 1.172))$.
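To get a feeling for the size of the quantities in Lemma 12.2.3, the sketch below evaluates c(σ), the threshold on Q_0/(sq), and the two bounds being compared. The sample values Q_0 = 10^6, Q = 10^15, q = 1000, s = 20 and σ = 1.36 are illustrative choices of ours, and the function names are not from the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def c(sigma: float) -> float:
    """c(sigma) = exp(e^(-gamma) (sigma - sigma^2/5.248 - 1.172))."""
    return math.exp(math.exp(-EULER_GAMMA) * (sigma - sigma**2 / 5.248 - 1.172))


def condition_holds(Q0: float, Q: float, s: float, q: int, sigma: float) -> bool:
    """Hypothesis of Lemma 12.2.3: Q0/(sq) <= c(sigma) Q0^((1-rho) e^(-gamma)) - log q."""
    rho = math.log(Q0) / math.log(Q)
    threshold = c(sigma) * Q0 ** ((1 - rho) * math.exp(-EULER_GAMMA)) - math.log(q)
    return Q0 / (s * q) <= threshold


def easy_bound(Q0: float, Q: float, s: float, q: int) -> float:
    """First upper bound of Lemma 12.2.2."""
    num = math.exp(EULER_GAMMA) * math.log(Q0 / (s * q) + math.log(q)) + 1.172
    return num / (math.log(Q / Q0) + 1.312)


def target_bound(Q0: float, Q: float, sigma: float) -> float:
    """Right side of (12.21)."""
    return (math.log(Q0) + sigma) / (math.log(Q) + 1.312)


if __name__ == "__main__":
    Q0, Q, s, q, sigma = 1e6, 1e15, 20, 1000, 1.36
    print(c(sigma), condition_holds(Q0, Q, s, q, sigma))
    print(easy_bound(Q0, Q, s, q), target_bound(Q0, Q, sigma))
```

In this example the hypothesis holds, and the easy bound of Lemma 12.2.2 is indeed smaller than the right side of (12.21), as the lemma predicts.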

    We now pass to the main result of the section


    Proposition 12.2.4. Let $Q \ge 20000Q_0$, $Q_0 \ge Q_{0,\min}$, where $Q_{0,\min} = 10^5$. Let $\rho = (\log Q_0)/\log Q$. Assume $\rho \le 0.6$. Then, for every $1 \le q \le Q_0$ and every $s \in [1, Q_0/q]$,

    \[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + c_+}{\log Q + c_E}, \tag{12.23} \]

    where $c_E$ is as in (12.14) and $c_+ = 1.36$.

    An ideal result would have $c_E$ instead of $c_+$, but this is not actually possible: error terms do exist, even if they are in reality smaller than the bound given in (12.12); this means that a bound such as (12.23) with $c_E$ instead of $c_+$ would be false for $q = 1$, $s = 1$.

    There is nothing special about the assumptions

    \[ Q \ge 20000Q_0, \qquad Q_0 \ge 10^5, \qquad (\log Q_0)/(\log Q) \le 0.6. \]

    They can all be relaxed at the cost of an increase in $c_+$.

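    Proof. Define $\mathrm{err}_{q,R}$ so that

    \[ G_q(R) = \frac{\phi(q)}{q}\left(\log R + c_E + \sum_{p|q}\frac{\log p}{p}\right) + \mathrm{err}_{q,R}. \tag{12.24} \]

    Then (12.23) will hold if

    \[ \log\frac{Q_0}{sq} + c_E + \sum_{p|q}\frac{\log p}{p} + \frac{q}{\phi(q)}\mathrm{err}_{q,\frac{Q_0}{sq}} \le \left(\log\frac{Q}{sq} + c_E + \sum_{p|q}\frac{\log p}{p} + \frac{q}{\phi(q)}\mathrm{err}_{q,\frac{Q}{sq}}\right)\frac{\log Q_0 + c_+}{\log Q + c_E}. \tag{12.25} \]

    This, in turn, happens if

    \[ \left(\log sq - \sum_{p|q}\frac{\log p}{p}\right)\left(1 - \frac{\log Q_0 + c_+}{\log Q + c_E}\right) + c_+ - c_E \ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q,\frac{Q_0}{sq}} - \frac{\log Q_0 + c_+}{\log Q + c_E}\mathrm{err}_{q,\frac{Q}{sq}}\right). \]

    Define

    \[ \omega(\rho) = \frac{\log Q_{0,\min} + c_+}{\frac{1}{\rho}\log Q_{0,\min} + c_E} = \rho + \frac{c_+ - \rho c_E}{\frac{1}{\rho}\log Q_{0,\min} + c_E}. \]

    Then $\rho \le (\log Q_0 + c_+)/(\log Q + c_E) \le \omega(\rho)$ (because $c_+ \ge \rho c_E$). We conclude that (12.25) (and hence (12.23)) holds provided that

    \[ (1 - \omega(\rho))\left(\log sq - \sum_{p|q}\frac{\log p}{p}\right) + c_{\Delta} \ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q,\frac{Q_0}{sq}} + \omega(\rho)\max\left(0, -\mathrm{err}_{q,\frac{Q}{sq}}\right)\right), \tag{12.26} \]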


    where $c_{\Delta} = c_+ - c_E$. Note that $1 - \omega(\rho) > 0$.

    First, let us give some easy bounds on the error terms; these bounds will yield upper bounds for $s$. By (12.8) and (12.11),

    \[ \mathrm{err}_{q,R} \le \frac{\phi(q)}{q}\left(\log q - \sum_{p|q}\frac{\log p}{p} + (1.4709 - c_E)\right) \]

    for $R \ge 1$; by (12.15) and (12.11),

    \[ \mathrm{err}_{q,R} \ge -\frac{\phi(q)}{q}\left(\sum_{p|q}\frac{\log p}{p} + (c_E - 1.312)\right) \]

    for $R \ge 182$. Therefore, the right side of (12.26) is at most

    \[ \log q - (1 - \omega(\rho))\sum_{p|q}\frac{\log p}{p} + ((1.4709 - c_E) + \omega(\rho)(c_E - 1.312)), \]

    and so (12.26) holds provided that

    \[ (1 - \omega(\rho))\log sq \ge \log q + (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_{\Delta}. \tag{12.27} \]

    We will thus be able to assume from now on that (12.27) does not hold, or, what is the same, that

    \[ sq < (c_{\rho,2}\,q)^{\frac{1}{1-\omega(\rho)}} \tag{12.28} \]

    holds, where $c_{\rho,2} = \exp((1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_{\Delta})$.

    What values of $R = Q_0/sq$ must we consider for $q$ given? First, by (12.28), we can assume $R > Q_{0,\min}/(c_{\rho,2}\,q)^{1/(1-\omega(\rho))}$. We can also assume

    \[ R > c(c_+)\cdot\max(Rq, Q_{0,\min})^{(1-\rho)e^{-\gamma}} - \log q \tag{12.29} \]

    for $c(c_+)$ as in Lemma 12.2.3, since all smaller $R$ are covered by that Lemma. Clearly, (12.29) implies that

    \[ R^{1-\tau} > c(c_+)\cdot q^{\tau} - \frac{\log q}{R^{\tau}} > c(c_+)q^{\tau} - \log q, \]

    where $\tau = (1-\rho)e^{-\gamma}$, and also that $R > c(c_+)Q_{0,\min}^{(1-\rho)e^{-\gamma}} - \log q$. Iterating, we obtain that we can assume that $R > \varpi(q)$, where

    \[ \varpi(q) = \max\left(\varpi_0(q),\ c(c_+)Q_{0,\min}^{\tau} - \log q,\ \frac{Q_{0,\min}}{(c_{\rho,2}\,q)^{\frac{1}{1-\omega(\rho)}}}\right) \tag{12.30} \]

    and

    \[ \varpi_0(q) = \begin{cases} \left(c(c_+)q^{\tau} - \dfrac{\log q}{(c(c_+)q^{\tau} - \log q)^{\frac{\tau}{1-\tau}}}\right)^{\frac{1}{1-\tau}} & \text{if } c(c_+)q^{\tau} > \log q + 1,\\[1ex] 0 & \text{otherwise.} \end{cases} \]


    Looking at (12.26), we see that it will be enough to show that, for all $R$ satisfying $R > \varpi(q)$, we have

    \[ \mathrm{err}_{q,R} + \omega(\rho)\max(0, -\mathrm{err}_{q,tR}) \le \frac{\phi(q)}{q}\kappa(q) \tag{12.31} \]

    for all $t \ge 20000$, where

    \[ \kappa(q) = (1 - \omega(\rho))\left(\log q - \sum_{p|q}\frac{\log p}{p}\right) + c_{\Delta}. \]

    Ramaré's bound (12.12) implies that

    \[ |\mathrm{err}_{q,R}| \le 7.284R^{-1/3}f_1(q), \tag{12.32} \]

    with $f_1(q)$ as in (12.13), and so

    \[ \mathrm{err}_{q,R} + \omega(\rho)\max(0, -\mathrm{err}_{q,tR}) \le (1 + \beta_{\rho})\cdot 7.284R^{-1/3}f_1(q), \]

    where $\beta_{\rho} = \omega(\rho)/20000^{1/3}$. This is enough when

    \[ R \ge \lambda(q) = \left(\frac{q}{\phi(q)}\cdot\frac{7.284(1 + \beta_{\rho})f_1(q)}{\kappa(q)}\right)^3. \tag{12.33} \]

    It remains to do two things. First, we have to compute how large $q$ has to be for $\varpi(q)$ to be guaranteed to be greater than $\lambda(q)$. (For such $q$, there is no checking to be done.) Then, we check the inequality (12.31) for all smaller $q$, letting $R$ range through the integers in $[\varpi(q), \lambda(q)]$. We bound $\mathrm{err}_{q,tR}$ using (12.32), but we compute $\mathrm{err}_{q,R}$ directly.

    How large must $q$ be for $\varpi(q) > \lambda(q)$ to hold? We claim that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2\cdot 10^{10}$. Let us show this.

    It is easy to see that $(p/(p-1))\cdot f_1(p)$ and $p \mapsto (\log p)/p$ are decreasing functions of $p$ for $p \ge 3$; moreover, for both functions, the value at $p \ge 7$ is smaller than for $p = 2$. Hence, we have that, for $q < \prod_{p\le p_0}p$, $p_0$ a prime,

    \[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{p<p_0}\frac{\log p}{p}\right) + c_{\Delta} \tag{12.34} \]

    and

    \[ \lambda(q) \le \left(\prod_{p<p_0}\frac{p}{p-1}\cdot\frac{7.284(1 + \beta_{\rho})\prod_{p<p_0}f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p<p_0}\frac{\log p}{p}\right) + c_{\Delta}}\right)^3. \tag{12.35} \]

    If we also assume that $2\cdot 3\cdot 5\cdot 7 \nmid q$, we obtain

    \[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{\substack{p<p_0\\ p\ne 7}}\frac{\log p}{p}\right) + c_{\Delta} \tag{12.36} \]


    and

    \[ \lambda(q) \le \left(\prod_{\substack{p<p_0\\ p\ne 7}}\frac{p}{p-1}\cdot\frac{7.284(1 + \beta_{\rho})\prod_{p<p_0,\,p\ne 7}f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p<p_0,\,p\ne 7}\frac{\log p}{p}\right) + c_{\Delta}}\right)^3 \tag{12.37} \]

    for $q < \prod_{p\le p_0}p$. (We are taking out 7 because it is the "least helpful" prime to omit among all primes from 2 to 7, again by the fact that $(p/(p-1))\cdot f_1(p)$ and $p \mapsto (\log p)/p$ are decreasing functions for $p \ge 3$.)

    We know how to give upper bounds for the expression on the right of (12.35). The task is, in essence, simple: we can base our bounds on the classic explicit work in [RS62], except that we also have to optimize matters, so that they are close to tight for $p_1 = 29$, $p_1 = 31$ and other low $p_1$.

    By [RS62, (3.30)] and a numerical computation for $29 \le p_1 \le 43$,

    \[ \prod_{p\le p_1}\frac{p}{p-1} < 1.90516\log p_1 \]

    for $p_1 \ge 29$. Since $\omega(\rho)$ is increasing on $\rho$ and we are assuming $\rho \le 0.6$, $Q_{0,\min} = 100000$,

    \[ \omega(\rho) \le 0.627312, \qquad \beta_{\rho} \le 0.023111. \]
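The two numerical bounds just stated follow from the definitions of ω(ρ) and β_ρ given earlier in the proof, evaluated at ρ = 0.6. The sketch below assumes ω(ρ) = (log Q_{0,min} + c_+)/((1/ρ) log Q_{0,min} + c_E) and β_ρ = ω(ρ)/20000^{1/3}, with c_E truncated to the seven decimals quoted in (12.14); the names are ours.

```python
import math

C_E = 1.3325822      # c_E as quoted in (12.14), truncated
C_PLUS = 1.36        # c_+ from Proposition 12.2.4
Q0_MIN = 1e5         # Q_{0,min}


def omega(rho: float) -> float:
    """omega(rho) = (log Q0min + c_+) / ((1/rho) log Q0min + c_E)."""
    return (math.log(Q0_MIN) + C_PLUS) / (math.log(Q0_MIN) / rho + C_E)


def beta(rho: float) -> float:
    """beta_rho = omega(rho) / 20000^(1/3)."""
    return omega(rho) / 20000 ** (1 / 3)


if __name__ == "__main__":
    print(omega(0.6), beta(0.6), 1 - omega(0.6), C_PLUS - C_E)
```

The same quantities reappear later: 1 - ω(0.6) is approximately 0.37268 and c_+ - c_E is approximately 0.02741, the two constants visible in the denominator of the bound for λ(q).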

    For $x > a$, where $a > 1$ is any constant, we obviously have

    \[ \sum_{a<p\le x}\log\left(1 + p^{-2/3}\right) \le \sum_{a<p\le x}\frac{(\log p)p^{-2/3}}{\log a}; \]

    by Abel summation (133) and the estimate [RS62, (3.32)] for $\theta(x) = \sum_{p\le x}\log p$,

    \[ \begin{aligned} \sum_{a<p\le x}(\log p)p^{-2/3} &= (\theta(x) - \theta(a))x^{-2/3} - \int_a^x(\theta(u) - \theta(a))\left(-\frac{2}{3}u^{-5/3}\right)du\\ &\le (1.01624x - \theta(a))x^{-2/3} + \frac{2}{3}\int_a^x(1.01624u - \theta(a))u^{-5/3}\,du\\ &= (1.01624x - \theta(a))x^{-2/3} + 2\cdot 1.01624(x^{1/3} - a^{1/3}) + \theta(a)(x^{-2/3} - a^{-2/3})\\ &= 3\cdot 1.01624\cdot x^{1/3} - (2.03248a^{1/3} + \theta(a)a^{-2/3}). \end{aligned} \]

    We conclude that $\sum_{10^4<p\le x}\log(1 + p^{-2/3}) \le 0.33102x^{1/3} - 7.06909$ for $x > 10^4$. Since $\sum_{p\le 10^4}\log(1 + p^{-2/3}) \le 10.09062$, this means that

    \[ \sum_{p\le x}\log(1 + p^{-2/3}) \le \left(0.33102 + \frac{10.09062 - 7.06909}{10^{4/3}}\right)x^{1/3} \le 0.47126x^{1/3} \]

    for $x > 10^4$; a direct computation for all $x$ prime between 29 and $10^4$ then confirms that

    \[ \sum_{p\le x}\log(1 + p^{-2/3}) \le 0.74914x^{1/3} \]


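    for all $x \ge 29$. Thus,

    \[ \prod_{p\le x}f_1(p) \le \frac{e^{\sum_{p\le x}\log(1 + p^{-2/3})}}{\prod_{p\le 29}\left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)} \le \frac{e^{0.74914x^{1/3}}}{6.62365} \]

    for $x \ge 29$. Finally, by [RS62, (3.24)], $\sum_{p\le p_1}(\log p)/p < \log p_1$.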
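The direct computation for primes between 29 and 10^4 mentioned above is small enough to reproduce. The sketch below accumulates the sum of log(1 + p^{-2/3}) over primes and records the largest value of the sum divided by x^{1/3} for prime x in [29, 10^4]; the function names are ours, and floating point is used instead of interval arithmetic.

```python
import math


def primes_upto(n: int):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]


def worst_ratio(lo: int = 29, hi: int = 10**4):
    """Return (max over primes x in [lo, hi] of S(x)/x^(1/3), S(hi)),
    where S(x) is the sum of log(1 + p^(-2/3)) over primes p <= x."""
    total, worst = 0.0, 0.0
    for p in primes_upto(hi):
        total += math.log1p(p ** (-2.0 / 3.0))
        if p >= lo:
            worst = max(worst, total / p ** (1.0 / 3.0))
    return worst, total


if __name__ == "__main__":
    print(worst_ratio())
```

The maximum of the ratio is attained at a small prime near the start of the range, which is why the constant 0.74914 is considerably larger than the asymptotic constant 0.47126.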

    We conclude that, for $q < \prod_{p\le p_0}p$, $p_0$ a prime, and $p_1$ the prime immediately preceding $p_0$,

    \[ \lambda(q) \le \left(\frac{1.90516\log p_1\cdot 7.45235\cdot\left(\frac{e^{0.74914p_1^{1/3}}}{6.62365}\right)}{0.37268(\log q - \log p_1) + 0.02741}\right)^3 \le \frac{190.272(\log p_1)^3e^{2.24742p_1^{1/3}}}{(\log q - \log p_1 + 0.07354)^3}. \tag{12.38} \]

    It is clear from (1230) that $(q) is increasing as soon as

    q ge max(Q0min Q1minusω(ρ)0min cρ2)

    and c(c+)qτ gt log q+ 1 since then $0(q) is increasing and $(q) = $0(q) Here it isuseful to recall that cρ2 ge exp(14709 minus c+) and to note that c(c+)qτ minus (log q + 1)is increasing for q ge 1(τ middot c(c+))1τ we see also that 1(τ middot c(c+))1τ le 1((1 minus06)eminusγc(c+))1((1minus06)eminusγ) for ρ le 06 A quick computation for our value of c+makes us conclude that q gt 112Q0min = 112000 is a sufficient condition for $(q) tobe equal to $0(q) and for $0(q) to be increasing

    Since (12.38) is decreasing on $q$ for $p_1$ fixed, and $\varpi_0(q)$ is decreasing on $\rho$ and increasing on $q$, we set $\rho = 0.6$ and check that then
    \[\varpi_0\left(2.2 \cdot 10^{10}\right) \ge 846.765,\]
    whereas, by (12.38),
    \[\lambda(2.2 \cdot 10^{10}) \le 838.227 < 846.765;\]
    this is enough to ensure that $\lambda(q) < \varpi_0(q)$ for $2.2 \cdot 10^{10} \le q < \prod_{p \le 31} p$.
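As a sanity check, both forms of (12.38) can be evaluated at $q = 2.2 \cdot 10^{10}$. Here $p_1 = 29$, since $q$ lies between $\prod_{p \le 29} p \approx 6.47 \cdot 10^9$ and $\prod_{p \le 31} p \approx 2.0 \cdot 10^{11}$. This is a floating-point sketch with the rounded constants printed above, so the last digit may differ from the interval-arithmetic value; the function names are illustrative.

```python
import math

def lambda_bound_first_form(q, p1):
    """First expression in (12.38), with the constants as printed."""
    num = (1.90516 * math.log(p1) * 7.45235
           * math.exp(0.74914 * p1 ** (1 / 3)) / 6.62365)
    den = 0.37268 * (math.log(q) - math.log(p1)) + 0.02741
    return (num / den) ** 3

def lambda_bound_simplified(q, p1):
    """Second, simplified expression in (12.38)."""
    num = 190.272 * math.log(p1) ** 3 * math.exp(2.24742 * p1 ** (1 / 3))
    return num / (math.log(q) - math.log(p1) + 0.07354) ** 3

q, p1 = 2.2e10, 29
first = lambda_bound_first_form(q, p1)
second = lambda_bound_simplified(q, p1)
```

Both values come out at about 838.2, comfortably below the lower bound 846.765 for $\varpi_0(2.2 \cdot 10^{10})$.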

    Let us now give some rough bounds that will be enough to cover the case $q \ge \prod_{p \le 31} p$. First, as we already discussed, $\varpi(q) = \varpi_0(q)$ and, since $c(c_+) q^\tau > \log q + 1$,
    \[\varpi_0(q) \ge \left(c(c_+) q^\tau - \log q\right)^{\frac{1}{1-\tau}} \ge \left(0.911 q^{0.224} - \log q\right)^{1.289} \ge q^{0.2797} \tag{12.39}\]
    by $q \ge \prod_{p \le 31} p$. We are in the range $\prod_{p \le p_1} p \le q \le \prod_{p \le p_0} p$, where $p_1 < p_0$ are two consecutive primes with $p_1 \ge 31$. By [RS62, (3.16)] and a computation for $31 \le p_1 < 200$, we know that $\log q \ge \sum_{p \le p_1} \log p \ge 0.8009 p_1$. By (12.38) and (12.39), it follows that we just have to show that
    \[e^{0.224 t} > \frac{190.272 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3}\]

    122 BOUNDING THE QUOTIENT IN THE LARGE SIEVE FOR PRIMES 241

    for $t \ge 31$. Now, $t \ge 31$ implies $0.8009 t - \log t + 0.07354 \ge 0.6924 t$, and so, taking logarithms, we see that we just have to verify
    \[0.224 t - 2.24742 t^{1/3} > 3 \log \log t - 3 \log t + 6.3513 \tag{12.40}\]
    for $t \ge 31$, and, since the left side is increasing and the right side is decreasing for $t \ge 31$, this is trivial to check.
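The check of (12.40) amounts to one evaluation at $t = 31$ plus two sign conditions on derivatives. A minimal floating-point sketch (the function names are illustrative):

```python
import math

def lhs(t):
    """Left side of (12.40)."""
    return 0.224 * t - 2.24742 * t ** (1 / 3)

def rhs(t):
    """Right side of (12.40)."""
    return 3 * math.log(math.log(t)) - 3 * math.log(t) + 6.3513

def lhs_derivative(t):
    # d/dt of 0.224 t - 2.24742 t^(1/3); increasing in t.
    return 0.224 - (2.24742 / 3) * t ** (-2 / 3)

def rhs_derivative(t):
    # d/dt of 3 log log t - 3 log t; negative for t > e.
    return 3 / (t * math.log(t)) - 3 / t

holds_at_31 = lhs(31) > rhs(31)
```

Since `lhs_derivative` is itself increasing and positive at $t = 31$, while `rhs_derivative` is negative for $t > e$, the inequality at $t = 31$ propagates to all $t \ge 31$.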

    We conclude that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$.

    It remains to see how we can relax this assumption if we assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$. We repeat the same analysis as before, using (12.36) and (12.37) instead of (12.34) and (12.35). For $p_1 \ge 29$,
    \[\prod_{p \le p_1,\, p \ne 7} \frac{p}{p-1} < 1.633 \log p_1, \qquad \prod_{p \le p_1,\, p \ne 7} f_1(p) \le \frac{e^{0.74914 x^{1/3} - \log(1 + 7^{-2/3})}}{5.8478} \le \frac{e^{0.74914 x^{1/3}}}{7.44586}\]
    and $\sum_{p \le p_1,\, p \ne 7} (\log p)/p < \log p_1 - (\log 7)/7$. So, for $q < \prod_{p \le p_0,\, p \ne 7} p$, and $p_1 \ge 29$ the prime immediately preceding $p_0$,
    \[\lambda(q) \le \left(\frac{1.633 \log p_1 \cdot 7.45235 \cdot \frac{e^{0.74914 p_1^{1/3}}}{7.44586}}{0.37268 \left(\log q - \log p_1 + \frac{\log 7}{7}\right) + 0.02741}\right)^3 \le \frac{84.351 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.35152)^3}.\]

    Thus we obtain, just like before, that
    \[\varpi_0(3.3 \cdot 10^9) \ge 477.465, \qquad \lambda(3.3 \cdot 10^9) \le 475.513 < 477.465.\]
    We also check that $\varpi_0(q_0) \ge 916.322$ is greater than $\lambda(q_0) \le 429.731$ for $q_0 = \prod_{p \le 31,\, p \ne 7} p$. The analysis for $q \ge \prod_{p \le 37,\, p \ne 7} p$ is also just like before: since $\log q \ge 0.8009 p_1 - \log 7$, we have to show that
    \[\frac{e^{0.224 t}}{7} > \frac{84.351 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3}\]
    for $t \ge 37$, and that, in turn, follows from
    \[0.224 t - 2.24742 t^{1/3} > 3 \log \log t - 3 \log t + 6.74849,\]
    which we check for $t \ge 37$ just as we checked (12.40).

    We conclude that $\varpi(q) > \lambda(q)$ if $q \ge 3.3 \cdot 10^9$ and $210 \nmid q$.

    Computation. Now, for $q < 3.3 \cdot 10^9$ (and also for $3.3 \cdot 10^9 \le q < 2.2 \cdot 10^{10}$, $210 | q$), we need to check that the maximum $m_{q,R,1}$ of $\mathrm{err}_{q,R}$ over all $\varpi(q) \le R < \lambda(q)$ satisfies (12.31). Note that there is a term $\mathrm{err}_{q,tR}$ in (12.31); we bound it using (12.32).

    Since $\log R$ is increasing on $R$ and $G_q(R)$ depends only on $\lfloor R \rfloor$, we can tell from (12.24) that, since we are taking the maximum of $\mathrm{err}_{q,R}$, it is enough to check integer values of $R$. We check all integers $R$ in $[\varpi(q), \lambda(q))$ for all $q < 3.3 \cdot 10^9$ (and all $3.3 \cdot 10^9 \le q < 2.2 \cdot 10^{10}$, $210 | q$) by an explicit computation.²

    Finally, we have the trivial bound
    \[\frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le 1, \tag{12.41}\]
    which we shall use for $Q_0$ close to $Q$.

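    Corollary 12.2.5. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 10^5$, $\delta_0 \ge 1$ be such that $(20000 Q_0)^2 \le x/2\delta_0$; set $Q = \sqrt{x/2\delta_0}$.

    Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Let $\mathfrak{M}$ be as in (12.1). Then, if $Q_0 \le Q^{0.6}$,
    \[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
    where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p \ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$.

    Let $\mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Then, if $(2Q_0) \le (2Q)^{0.6}$,
    \[\int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \le \frac{\log 2Q_0 + c_+}{\log 2Q + c_E} \sum_n |a_n|^2. \tag{12.42}\]

    Here, of course, $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2$ (Plancherel). If $Q_0 > Q^{0.6}$, we will use the trivial bound
    \[\int_{\mathfrak{M}_{\delta_0,r}} |S(\alpha)|^2\, d\alpha \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \tag{12.43}\]

    Proof. Immediate from Prop. 12.1.1, Prop. 12.1.2 and Prop. 12.2.4.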

    Obviously, one can also give a statement derived from Prop. 12.1.1; the resulting bound is
    \[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
    where $\mathfrak{M}$ is as in (12.1).

    We also record the large-sieve form of the result.

    ² This is by far the heaviest computation in the present work, though it is still rather minor (about two weeks of computing on a single core of a fairly new (2010) desktop computer carrying out other tasks as well; this is next to nothing compared to the computations in [Plab], or even those in [HP13]). For the applications here, we could have assumed $\rho \le 8/15$, and that would have reduced computation time drastically; the lighter assumption $\rho \le 0.6$ was made with views to general applicability in the future. As elsewhere in this section, numerical computations were carried out by the author in C; all floating-point operations used D. Platt's interval arithmetic package.

    Corollary 12.2.6. Let $N \ge 1$. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the integers $n \le N$. Let $Q_0 \ge 10^5$, $Q \ge 20000 Q_0$. Assume that $a_n = 0$ for every $n$ for which there is a $p \le Q$ dividing $n$.

    Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then, if $Q_0 \le Q^{0.6}$,
    \[\sum_{q \le Q_0} \sum_{\substack{a \bmod q \\ (a,q) = 1}} |S(a/q)|^2 \le \frac{\log Q_0 + c_+}{\log Q + c_E} \cdot (N + Q^2) \sum_n |a_n|^2,\]
    where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p \ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$.

    Proof. Proceed as Ramaré does in the proof of [Ram09, Thm. 5.2], with $K_q = \{a \in \mathbb{Z}/q\mathbb{Z} : (a, q) = 1\}$ and $u_n = a_n$; in particular, apply [Ram09, Thm. 2.1]. The proof of [Ram09, Thm. 5.2] shows that
    \[\sum_{q \le Q_0} \sum_{\substack{a \bmod q \\ (a,q) = 1}} |S(a/q)|^2 \le \max_{q \le Q_0} \frac{G_q(Q_0)}{G_q(Q)} \cdot \sum_{q \le Q} \sum_{\substack{a \bmod q \\ (a,q) = 1}} |S(a/q)|^2.\]
    Now, instead of using the easy inequality $G_q(Q_0)/G_q(Q) \le G_1(Q_0)/G_1(Q/Q_0)$, use Prop. 12.2.4.

    It would seem desirable to prove a result such as Prop. 12.2.4 (or Cor. 12.2.5, or Cor. 12.2.6) without computations and with conditions that are as weak as possible. Since, as we said, we cannot make $c_+$ equal to $c_E$, and since $c_+$ does have to increase when the conditions are weakened (as is shown by computations; this is not an artifact of our method of proof), the right goal might be to show that the maximum of $G_q(Q_0/sq)/G_q(Q/sq)$ is reached when $s = q = 1$.

    However, this is also untrue without conditions. For instance, for $Q_0 = 2$ and $Q$ large, the value of $G_q(Q_0/q)/G_q(Q/q)$ at $q = 2$ is larger than at $q = 1$: by (12.12),
    \[\frac{G_2(Q_0/2)}{G_2(Q/2)} \sim \frac{1}{\frac{1}{2}\left(\log \frac{Q}{2} + c_E + \frac{\log 2}{2}\right)} = \frac{2}{\log Q + c_E - \frac{\log 2}{2}} > \frac{2}{\log Q + c_E} \sim \frac{G(Q_0)}{G(Q)}.\]

    Thus, at the very least, a lower bound on $Q_0$ is needed as a condition. This also dims the hopes somewhat for a combinatorial proof of $G_q(Q_0/q) G(Q) \le G_q(Q/q) G(Q_0)$; at any rate, while such a proof would be welcome, it could not be extremely straightforward, since there are terms in $G_q(Q_0/q) G(Q)$ that do not appear in $G_q(Q/q) G(Q_0)$.


    Chapter 13

    The integral over the minor arcs

    The time has come to bound the part of our triple-product integral (10.3) that comes from the minor arcs $\mathfrak{m} \subset \mathbb{R}/\mathbb{Z}$. We have an $\ell_\infty$ estimate (from Prop. 11.2.3, based on Theorem 3.1.1) and an $\ell_2$ estimate (from §12.2). Now we must put them together.

    There are two ways in which we must be careful. A trivial bound of the form $\ell_3^3 = \int |S(\alpha)|^3\, d\alpha \le \ell_2^2 \cdot \ell_\infty$ would introduce a fatal factor of $\log x$ coming from $\ell_2$. We avoid this by using the fact that we have $\ell_2$ estimates over $\mathfrak{M}_{\delta_0,Q_0}$ for varying $Q_0$.

    We must also remember to subtract the major-arc contribution from our estimate for $\mathfrak{M}_{\delta_0,Q_0}$; this is why we were careful to give a lower bound in Lem. 10.3.1, as opposed to just the upper bound (10.28).

    13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds

    Let us start with a simple lemma, essentially a way to obtain upper bounds by means of summation by parts.

    Lemma 13.1.1. Let $f, g : \{a, a+1, \ldots, b\} \to \mathbb{R}_0^+$, where $a, b \in \mathbb{Z}^+$. Assume that, for all $x \in [a, b]$,
    \[\sum_{a \le n \le x} f(n) \le F(x), \tag{13.1}\]
    where $F : [a, b] \to \mathbb{R}$ is continuous, piecewise differentiable and non-decreasing. Then
    \[\sum_{n=a}^b f(n) \cdot g(n) \le \left(\max_{n \ge a} g(n)\right) \cdot F(a) + \int_a^b \left(\max_{n \ge u} g(n)\right) \cdot F'(u)\, du.\]

    Proof. Let $S(n) = \sum_{m=a}^n f(m)$. Then, by partial summation,
    \[\sum_{n=a}^b f(n) \cdot g(n) \le S(b) g(b) + \sum_{n=a}^{b-1} S(n) (g(n) - g(n+1)). \tag{13.2}\]


    Let $h(x) = \max_{x \le n \le b} g(n)$. Then $h$ is non-increasing. Hence (13.1) and (13.2) imply that
    \[\begin{aligned}
    \sum_{n=a}^b f(n) g(n) &\le \sum_{n=a}^b f(n) h(n) \\
    &\le S(b) h(b) + \sum_{n=a}^{b-1} S(n) (h(n) - h(n+1)) \\
    &\le F(b) h(b) + \sum_{n=a}^{b-1} F(n) (h(n) - h(n+1)).
    \end{aligned}\]
    In general, for $\alpha_n \in \mathbb{C}$, $A(x) = \sum_{a \le n \le x} \alpha_n$ and $F$ continuous and piecewise differentiable on $[a, x]$,
    \[\sum_{a \le n \le x} \alpha_n F(n) = A(x) F(x) - \int_a^x A(u) F'(u)\, du. \qquad \text{(Abel summation)} \tag{13.3}\]
    Applying this with $\alpha_n = h(n) - h(n+1)$ and $A(x) = \sum_{a \le n \le x} \alpha_n = h(a) - h(\lfloor x \rfloor + 1)$, we obtain
    \[\begin{aligned}
    \sum_{n=a}^{b-1} F(n) (h(n) - h(n+1)) &= (h(a) - h(b)) F(b-1) - \int_a^{b-1} (h(a) - h(\lfloor u \rfloor + 1)) F'(u)\, du \\
    &= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(\lfloor u \rfloor + 1) F'(u)\, du \\
    &= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(u) F'(u)\, du \\
    &= h(a) F(a) - h(b) F(b) + \int_a^b h(u) F'(u)\, du,
    \end{aligned}\]
    since $h(\lfloor u \rfloor + 1) = h(u)$ for $u \notin \mathbb{Z}$. Hence
    \[\sum_{n=a}^b f(n) g(n) \le h(a) F(a) + \int_a^b h(u) F'(u)\, du.\]
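Lemma 13.1.1 can be tried out on a toy example. The sketch below is only an illustration under assumed data: $f$ is the indicator of the odd integers, $g(n) = n^{-1/2}$, and $F(x) = (x - a)/2 + 1$; none of these choices come from the text. It uses the fact that, for $u \in (n, n+1]$, the tail maximum $\max_{m \ge u} g(m)$ equals $h(n+1)$, so the integral reduces to a finite sum when $F$ is continuous.

```python
import math

def tail_max(g, a, b):
    """h(n) = max_{n <= m <= b} g(m) for n = a..b, returned as a dict."""
    h, running = {}, -math.inf
    for n in range(b, a - 1, -1):
        running = max(running, g(n))
        h[n] = running
    return h

def lemma_bound(g, F, a, b):
    """Right side of Lemma 13.1.1 for continuous F.

    On (n, n+1] the tail maximum is the constant h(n+1), so the
    integral of h(u) F'(u) over [a, b] is the sum of
    h(n+1) * (F(n+1) - F(n)) over n = a..b-1.
    """
    h = tail_max(g, a, b)
    integral = sum(h[n + 1] * (F(n + 1) - F(n)) for n in range(a, b))
    return h[a] * F(a) + integral

# Hypothetical test data (not from the text).
a, b = 10, 200
f = lambda n: n % 2                 # indicator of odd integers
g = lambda n: 1 / math.sqrt(n)      # non-negative weights
F = lambda x: (x - a) / 2 + 1       # dominates the partial sums of f

left = sum(f(n) * g(n) for n in range(a, b + 1))
right = lemma_bound(g, F, a, b)
```

The point of the lemma is visible here: only an upper bound $F$ on the partial sums of $f$ is used, never $f$ itself.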

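    We will now see our main application of Lemma 13.1.1. We have to bound an integral of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha$, where $\mathfrak{M}_{\delta_0,r}$ is a union of arcs defined as in (10.5). Our inputs are (a) a bound on integrals of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha$, (b) a bound on $|S_2(\alpha)|$ for $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}$. The input of type (a) is what we derived in §12.1 and §12.2; the input of type (b) is a minor-arcs bound, and as such was the main subject of Part I.

    Proposition 13.1.2. Let $S_1(\alpha) = \sum_n a_n e(\alpha n)$, $a_n \in \mathbb{C}$, $\{a_n\}$ in $L^1$. Let $S_2 : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ be continuous. Define $\mathfrak{M}_{\delta_0,r}$ as in (10.5).

    Let $r_0$ be a positive integer not greater than $r_1$. Let $H : [r_0, r_1] \to \mathbb{R}^+$ be a continuous, piecewise differentiable, non-decreasing function such that
    \[\frac{1}{\sum |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1}} |S_1(\alpha)|^2\, d\alpha \le H(r) \tag{13.4}\]
    for some $\delta_0 \le x/2r_1^2$ and all $r \in [r_0, r_1]$. Assume, moreover, that $H(r_1) = 1$. Let $g : [r_0, r_1] \to \mathbb{R}^+$ be a non-increasing function such that
    \[\max_{\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}} |S_2(\alpha)| \le g(r) \tag{13.5}\]
    for all $r \in [r_0, r_1]$ and $\delta_0$ as above.

    Then
    \[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \le g(r_0) \cdot (H(r_0) - I_0) + \int_{r_0}^{r_1} g(r) H'(r)\, dr, \tag{13.6}\]
    where
    \[I_0 = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha. \tag{13.7}\]

    The condition $\delta_0 \le x/2r_1^2$ is there just to ensure that the arcs in the definition of $\mathfrak{M}_{\delta_0,r}$ do not overlap for $r \le r_1$.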

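    Proof. For $r_0 \le r < r_1$, let
    \[f(r) = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1} \setminus \mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha.\]
    Let
    \[f(r_1) = \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_1}} |S_1(\alpha)|^2\, d\alpha.\]
    Then, by (13.5),
    \[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \le \sum_{r=r_0}^{r_1} f(r) g(r).\]
    By (13.4),
    \[\begin{aligned}
    \sum_{r_0 \le r \le x} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1} \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha \\
    &= \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1}} |S_1(\alpha)|^2\, d\alpha\right) - I_0 \le H(x) - I_0
    \end{aligned} \tag{13.8}\]
    for $x \in [r_0, r_1)$. Moreover,
    \[\begin{aligned}
    \sum_{r_0 \le r \le r_1} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 \\
    &= \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathbb{R}/\mathbb{Z}} |S_1(\alpha)|^2\right) - I_0 = 1 - I_0 = H(r_1) - I_0.
    \end{aligned}\]
    We let $F(x) = H(x) - I_0$ and apply Lemma 13.1.1 with $a = r_0$, $b = r_1$. We obtain that
    \[\begin{aligned}
    \sum_{r=r_0}^{r_1} f(r) g(r) &\le \left(\max_{r \ge r_0} g(r)\right) F(r_0) + \int_{r_0}^{r_1} \left(\max_{r \ge u} g(r)\right) F'(u)\, du \\
    &\le g(r_0) (H(r_0) - I_0) + \int_{r_0}^{r_1} g(u) H'(u)\, du.
    \end{aligned}\]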

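    13.2 The minor-arc total

    We now apply Prop. 13.1.2. Inevitably, the main statement involves some integrals that will have to be evaluated at the end of the section.

    Theorem 13.2.1. Let $x \ge 10^{25} \cdot \kappa$, where $\kappa \ge 1$. Let
    \[S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x). \tag{13.9}\]
    Let $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$, where $\eta_2$ is as in (11.10) and $\varphi : [0, \infty) \to [0, \infty)$ is continuous and in $\ell_1$. Let $\eta_+ : [0, \infty) \to [0, \infty)$ be a bounded, piecewise differentiable function with $\lim_{t \to \infty} \eta_+(t) = 0$. Let $\mathfrak{M}_{\delta_0,r}$ be as in (10.5) with $\delta_0 = 8$. Let $10^5 \le r_0 < r_1$, where $r_1 = (3/8)(x/\kappa)^{4/15}$. Let $g(r) = g_{x/\kappa,\varphi}(r)$, where
    \[g_{y,\varphi}(r) = \frac{(R_{y,K,\varphi,2r} \log 2r + 0.5) \sqrt{\digamma(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36 K^{1/6} y^{-1/6}, \tag{13.10}\]
    just as in (11.19), and $K = \log(x/\kappa)/2$. Here $R_{y,K,\varphi,t}$ is as in (11.19), and $L_t$ is as in (11.13).

    Denote
    \[Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha.\]
    Then
    \[Z_{r_0} \le \left(\sqrt{\frac{|\varphi|_1 x}{\kappa} (M + T)} + \sqrt{S_{\eta_*}(0, x) \cdot E}\right)^2,\]
    where
    \[\begin{aligned}
    S &= \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x), \\
    T &= C_{\varphi,3}\left(\frac{1}{2} \log \frac{x}{\kappa}\right) \cdot \left(S - (\sqrt{J} - \sqrt{E})^2\right), \\
    J &= \int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \\
    E &= \left((C_{\eta_+,0} + C_{\eta_+,2}) \log x + (2 C_{\eta_+,0} + C_{\eta_+,1})\right) \cdot x^{1/2},
    \end{aligned} \tag{13.11}\]
    \[\begin{aligned}
    C_{\eta_+,0} &= 0.7131 \int_0^\infty \frac{1}{\sqrt{t}} \left(\sup_{r \ge t} \eta_+(r)\right)^2 dt, \\
    C_{\eta_+,1} &= 0.7131 \int_1^\infty \frac{\log t}{\sqrt{t}} \left(\sup_{r \ge t} \eta_+(r)\right)^2 dt, \\
    C_{\eta_+,2} &= 0.51942 |\eta_+|_\infty^2, \\
    C_{\varphi,3}(K) &= \frac{1.04488}{|\varphi|_1} \int_0^{1/K} |\varphi(w)|\, dw
    \end{aligned} \tag{13.12}\]
    and
    \[\begin{aligned}
    M = {} & g(r_0) \cdot \left(\frac{\log(r_0 + 1) + c_+}{\log \sqrt{x} + c_-} \cdot S - (\sqrt{J} - \sqrt{E})^2\right) \\
    & + \left(\frac{2}{\log x + 2c_-} \int_{r_0}^{r_1} \frac{g(r)}{r}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log \kappa}{\log x + 2c_-}\right) g(r_1)\right) \cdot S,
    \end{aligned} \tag{13.13}\]
    where $c_+ = 2.0532$ and $c_- = 0.6394$.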

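    Proof. Let $y = x/\kappa$. Let $Q = (3/4) y^{2/3}$, as in Thm. 3.1.1 (applied with $y$ instead of $x$). Let $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$, where $r_0 \le r \le y^{1/3}/6$ and $y$ is used instead of $x$ to define $\mathfrak{M}_{8,r}$ (see (10.5)). There exists an approximation $2\alpha = a/q + \delta/y$ with $q \le Q$, $|\delta|/y \le 1/qQ$. Thus, $\alpha = a'/q' + \delta/2y$, where either $a'/q' = a/2q$ or $a'/q' = (a + q)/2q$ holds. (In particular, if $q'$ is odd, then $q' = q$; if $q'$ is even, then $q' = 2q$.)

    There are three cases:

    1. $q \le r$. Then either (a) $q'$ is odd and $q' \le r$ or (b) $q'$ is even and $q' \le 2r$. Since $\alpha$ is not in $\mathfrak{M}_{8,r}$, then, by definition (10.5), $|\delta|/2y \ge \delta_0 r/2qy$, and so $|\delta| \ge \delta_0 r/q = 8r/q$. In particular, $|\delta| \ge 8$.

    Thus, by Prop. 11.2.3,
    \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le g_{y,\varphi}\left(\frac{|\delta|}{8} q\right) \cdot |\varphi|_1 y \le g_{y,\varphi}(r) \cdot |\varphi|_1 y, \tag{13.14}\]
    where we use the fact that $g(r)$ is a non-increasing function (Lemma 11.2.4).

    2. $r < q \le y^{1/3}/6$. Then, by Prop. 11.2.3 and Lemma 11.2.4,
    \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le g_{y,\varphi}\left(\max\left(\frac{|\delta|}{8}, 1\right) q\right) \cdot |\varphi|_1 y \le g_{y,\varphi}(r) \cdot |\varphi|_1 y. \tag{13.15}\]

    3. $q > y^{1/3}/6$. Again by Prop. 11.2.3,
    \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \varphi}(\alpha, y)| \le \left(h\left(\frac{y}{K}\right) + C_{\varphi,3}(K)\right) |\varphi|_1 y, \tag{13.16}\]
    where $h(x)$ is as in (11.15). (Of course, $C_{\varphi,3}(K)$, as in (13.12), is equal to $C_{\varphi,0,K}/|\varphi|_1$, where $C_{\varphi,0,K}$ is as in (11.21).) We set $K = (\log y)/2$. Since $y = x/\kappa \ge 10^{25}$, it follows that $y/K = 2y/\log y > 3.47 \cdot 10^{23} > 2.16 \cdot 10^{20}$.

    Let
    \[r_1 = \frac{3}{8} y^{4/15}, \qquad g(r) = \begin{cases} g_{y,\varphi}(r) & \text{if } r \le r_1, \\ g_{y,\varphi}(r_1) & \text{if } r > r_1. \end{cases}\]
    By Lemma 11.2.4, for $r \ge 670$, $g(r)$ is a non-increasing function and $g(r) \ge g_{y,\varphi}(r)$. Moreover, by Lemma 11.2.5, $g_{y,\varphi}(r_1) \ge h(2y/\log y)$, where $h$ is as in (11.15), and so $g(r) \ge h(2y/\log y)$ for all $r \ge r_0 \ge 670$. Thus, we have shown that
    \[|S_{\eta_*}(y, \alpha)| \le \left(g(r) + C_{\varphi,3}\left(\frac{\log y}{2}\right)\right) \cdot |\varphi|_1 y \tag{13.17}\]
    for all $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$.

    We first need to undertake the fairly dull task of getting non-prime or small $n$ out of the sum defining $S_{\eta_+}(\alpha, x)$. Write
    \[\begin{aligned}
    S_{1,\eta_+}(\alpha, x) &= \sum_{p > \sqrt{x}} (\log p) e(\alpha p) \eta_+(p/x), \\
    S_{2,\eta_+}(\alpha, x) &= \sum_{\substack{n \text{ non-prime} \\ n > \sqrt{x}}} \Lambda(n) e(\alpha n) \eta_+(n/x) + \sum_{n \le \sqrt{x}} \Lambda(n) e(\alpha n) \eta_+(n/x).
    \end{aligned}\]
    By the triangle inequality (with weights $|S_{\eta_+}(\alpha, x)|$),
    \[\sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha} \le \sum_{j=1}^2 \sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{j,\eta_+}(\alpha, x)|^2\, d\alpha}.\]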

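    Clearly,
    \[\begin{aligned}
    \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} & |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \\
    &\le \max_{\alpha \in \mathbb{R}/\mathbb{Z}} |S_{\eta_*}(\alpha, x)| \cdot \int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \\
    &\le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot \left(\sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 + \sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2\right).
    \end{aligned}\]
    Let $\overline{\eta_+}(z) = \sup_{t \ge z} \eta_+(t)$. Since $\eta_+(t)$ tends to $0$ as $t \to \infty$, so does $\overline{\eta_+}$. By [RS62, Thm. 13], partial summation and integration by parts,
    \[\begin{aligned}
    \sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 &\le \sum_{n \text{ non-prime}} \Lambda(n)^2 \overline{\eta_+}(n/x)^2 \\
    &\le -\int_1^\infty \left(\sum_{\substack{n \le t \\ n \text{ non-prime}}} \Lambda(n)^2\right) \left(\overline{\eta_+}^2(t/x)\right)' dt \\
    &\le -\int_1^\infty (\log t) \cdot 1.4262 \sqrt{t} \left(\overline{\eta_+}^2(t/x)\right)' dt \\
    &\le 0.7131 \int_1^\infty \frac{\log e^2 t}{\sqrt{t}} \cdot \overline{\eta_+}^2\left(\frac{t}{x}\right) dt \\
    &= \left(0.7131 \int_{1/x}^\infty \frac{2 + \log tx}{\sqrt{t}} \overline{\eta_+}^2(t)\, dt\right) \sqrt{x},
    \end{aligned}\]
    while, by [RS62, Thm. 12],
    \[\sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2 \le \frac{1}{2} |\eta_+|_\infty^2 (\log x) \sum_{n \le \sqrt{x}} \Lambda(n) \le 0.51942 |\eta_+|_\infty^2 \cdot \sqrt{x} \log x.\]
    This shows that
    \[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot E = S_{\eta_*}(0, x) \cdot E,\]
    where $E$ is as in (13.11).

    It remains to bound
    \[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.18}\]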

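    We wish to apply Prop. 13.1.2. Corollary 12.2.5 gives us an input of type (13.4); we have just derived a bound (13.17) that provides an input of type (13.5). More precisely, by (12.42), (13.4) holds with
    \[H(r) = \begin{cases} \frac{\log(r+1) + c_+}{\log \sqrt{x} + c_-} & \text{if } r < r_1, \\ 1 & \text{if } r \ge r_1, \end{cases}\]
    where $c_+ = 2.0532 > \log 2 + 1.36$ and $c_- = 0.6394 < \log(1/\sqrt{2 \cdot 8}) + \log 2 + 1.3325822$. (We can apply Corollary 12.2.5 because $2(r_1 + 1) = (3/4) x^{4/15} + 2 \le (2\sqrt{x/16})^{0.6}$ for $x \ge 10^{25}$ (or even for $x \ge 100000$).) Since $r_1 = (3/8) y^{4/15}$ and $x \ge 10^{25} \cdot \kappa$,
    \[\begin{aligned}
    \lim_{r \to r_1^+} H(r) - \lim_{r \to r_1^-} H(r) &= 1 - \frac{\log((3/8)(x/\kappa)^{4/15} + 1) + c_+}{\log \sqrt{x} + c_-} \\
    &\le 1 - \left(\frac{4/15}{1/2} + \frac{\log \frac{3}{8} + c_+ - \frac{4}{15} \log \kappa - \frac{8}{15} c_-}{\log \sqrt{x} + c_-}\right) \\
    &\le \frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log \kappa}{\log x + 2c_-}.
    \end{aligned}\]
    We also have (13.5) with
    \[\left(g(r) + C_{\varphi,3}\left(\frac{\log y}{2}\right)\right) \cdot |\varphi|_1 y \tag{13.19}\]
    instead of $g(r)$ (by (13.17)). Here (13.19) is a non-increasing function of $r$ because $g(r)$ is, as we already checked. Hence, Prop. 13.1.2 gives us that (13.18) is at most
    \[\begin{aligned}
    & g(r_0) \cdot (H(r_0) - I_0) + (1 - I_0) \cdot C_{\varphi,3}\left(\frac{\log y}{2}\right) \\
    & \quad + \frac{1}{\log \sqrt{x} + c_-} \int_{r_0}^{r_1} \frac{g(r)}{r + 1}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log \kappa}{\log x + 2c_-}\right) g(r_1)
    \end{aligned} \tag{13.20}\]
    times $|\varphi|_1 y \cdot \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)$, where
    \[I_0 = \frac{1}{\sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)} \int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.21}\]
    By the triangle inequality,
    \[\begin{aligned}
    \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha} &= \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x) - S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
    &\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
    &\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}.
    \end{aligned}\]
    As we already showed,
    \[\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha = \sum_{\substack{n \text{ non-prime} \\ \text{or } n \le \sqrt{x}}} \Lambda(n)^2 \eta_+(n/x)^2 \le E.\]
    Thus,
    \[I_0 \cdot S \ge (\sqrt{J} - \sqrt{E})^2,\]
    and so we are done.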

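    We now should estimate the integral $\int_{r_0}^{r_1} \frac{g(r)}{r}\, dr$ in (13.13). It is easy to see that
    \[\begin{aligned}
    \int_{r_0}^\infty \frac{1}{r^{3/2}}\, dr &= \frac{2}{r_0^{1/2}}, & \int_{r_0}^\infty \frac{\log r}{r^2}\, dr &= \frac{\log e r_0}{r_0}, & \int_{r_0}^\infty \frac{1}{r^2}\, dr &= \frac{1}{r_0}, \\
    \int_{r_0}^{r_1} \frac{1}{r}\, dr &= \log \frac{r_1}{r_0}, & \int_{r_0}^\infty \frac{\log r}{r^{3/2}}\, dr &= \frac{2 \log e^2 r_0}{\sqrt{r_0}}, & \int_{r_0}^\infty \frac{\log 2r}{r^{3/2}}\, dr &= \frac{2 \log 2e^2 r_0}{\sqrt{r_0}}, \\
    \int_{r_0}^\infty \frac{(\log 2r)^2}{r^{3/2}}\, dr &= \frac{2 P_2(\log 2r_0)}{\sqrt{r_0}}, & \int_{r_0}^\infty \frac{(\log 2r)^3}{r^{3/2}}\, dr &= \frac{2 P_3(\log 2r_0)}{r_0^{1/2}}, &&
    \end{aligned} \tag{13.22}\]
    where
    \[P_2(t) = t^2 + 4t + 8, \qquad P_3(t) = t^3 + 6t^2 + 24t + 48. \tag{13.23}\]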
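The closed forms involving $P_2$ and $P_3$ can be compared against direct numerical integration. The sketch below is an illustration only (midpoint rule, floating point): substituting $r = r_0 e^s$ turns $\int_{r_0}^\infty (\log 2r)^k r^{-3/2}\, dr$ into $r_0^{-1/2} \int_0^\infty (\log 2r_0 + s)^k e^{-s/2}\, ds$, and the tail beyond the assumed cutoff $s = 60$ is negligible.

```python
import math

def P2(t):
    return t * t + 4 * t + 8

def P3(t):
    return t ** 3 + 6 * t * t + 24 * t + 48

def tail_integral(k, r0, steps=200_000, cutoff=60.0):
    """Midpoint rule for int_{r0}^infty (log 2r)^k r^(-3/2) dr.

    After r = r0 * e^s the integrand becomes
    (log(2 r0) + s)^k * exp(-s/2) / sqrt(r0) on s in [0, cutoff].
    """
    width = cutoff / steps
    base = math.log(2 * r0)
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * width
        total += (base + s) ** k * math.exp(-s / 2)
    return total * width / math.sqrt(r0)

r0 = 1.0e5
closed_2 = 2 * P2(math.log(2 * r0)) / math.sqrt(r0)
closed_3 = 2 * P3(math.log(2 * r0)) / math.sqrt(r0)
numeric_2 = tail_integral(2, r0)
numeric_3 = tail_integral(3, r0)
```

Equivalently, one can check by differentiation that $-2 r^{-1/2} P_k(\log 2r)$ is an antiderivative of $(\log 2r)^k r^{-3/2}$, since $P_k(t) - 2P_k'(t) = t^k$ for $k = 2, 3$.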

    We also have
    \[\int_{r_0}^\infty \frac{dr}{r^2 \log r} = E_1(\log r_0), \tag{13.24}\]
    where $E_1$ is the exponential integral
    \[E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt.\]

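    We must also estimate the integrals
    \[\int_{r_0}^{r_1} \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r) \log r}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr. \tag{13.25}\]
    Clearly, $\digamma(r) - e^\gamma \log \log r = 2.50637/\log \log r$ is decreasing on $r$. Hence, for $r \ge 10^5$,
    \[\digamma(r) \le e^\gamma \log \log r + c_\gamma,\]
    where $c_\gamma = 1.025742$. Let $F(t) = e^\gamma \log t + c_\gamma$. Then $F''(t) = -e^\gamma/t^2 < 0$. Hence
    \[\frac{d^2 \sqrt{F(t)}}{dt^2} = \frac{F''(t)}{2\sqrt{F(t)}} - \frac{(F'(t))^2}{4 (F(t))^{3/2}} < 0\]
    for all $t > 0$. In other words, $\sqrt{F(t)}$ is convex-down, and so we can bound $\sqrt{F(t)}$ from above by $\sqrt{F(t_0)} + (\sqrt{F})'(t_0) \cdot (t - t_0)$, for any $t \ge t_0 > 0$. Hence, for $r \ge r_0 \ge 10^5$,
    \[\begin{aligned}
    \sqrt{\digamma(r)} \le \sqrt{F(\log r)} &\le \sqrt{F(\log r_0)} + \left.\frac{d\sqrt{F(t)}}{dt}\right|_{t = \log r_0} \cdot \log \frac{r}{r_0} \\
    &= \sqrt{F(\log r_0)} + \frac{e^\gamma}{\sqrt{F(\log r_0)}} \cdot \frac{\log \frac{r}{r_0}}{2 \log r_0}.
    \end{aligned}\]
    Thus, by (13.22),
    \[\begin{aligned}
    \int_{r_0}^\infty \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr &\le \sqrt{F(\log r_0)} \left(2 - \frac{e^\gamma}{F(\log r_0)}\right) \frac{1}{\sqrt{r_0}} + \frac{e^\gamma}{\sqrt{F(\log r_0)} \log r_0} \cdot \frac{\log e^2 r_0}{\sqrt{r_0}} \\
    &= \frac{2\sqrt{F(\log r_0)}}{\sqrt{r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right).
    \end{aligned} \tag{13.26}\]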

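    The other integrals in (13.25) are easier. Just as in (13.26), we extend the range of integration to $[r_0, \infty]$. Using (13.22) and (13.24), we obtain
    \[\begin{aligned}
    \int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\le \int_{r_0}^\infty \frac{F(\log r)}{r^2}\, dr = e^\gamma \left(\frac{\log \log r_0}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma}{r_0}, \\
    \int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\le e^\gamma \left(\frac{(1 + \log r_0) \log \log r_0 + 1}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma \log e r_0}{r_0}.
    \end{aligned}\]
    By [OLBC10, (6.8.2)],
    \[\frac{1}{r (\log r + 1)} \le E_1(\log r) \le \frac{1}{r \log r}.\]
    (The second inequality is obvious.) Hence
    \[\begin{aligned}
    \int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\le \frac{e^\gamma (\log \log r_0 + 1/\log r_0) + c_\gamma}{r_0}, \\
    \int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\le \frac{e^\gamma \left(\log \log r_0 + \frac{1}{\log r_0}\right) + c_\gamma}{r_0} \cdot \log e r_0.
    \end{aligned}\]
    Finally,
    \[\begin{aligned}
    \int_{r_0}^\infty \frac{\digamma(r)}{r^{3/2}}\, dr &\le e^\gamma \left(\frac{2 \log \log r_0}{\sqrt{r_0}} + 2 E_1\left(\frac{\log r_0}{2}\right)\right) + \frac{2 c_\gamma}{\sqrt{r_0}} \\
    &\le \frac{2}{\sqrt{r_0}} \left(F(\log r_0) + \frac{2 e^\gamma}{\log r_0}\right).
    \end{aligned} \tag{13.27}\]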

    It is time to estimate
    \[\int_{r_0}^{r_1} \frac{R_{z,2r} \log 2r \sqrt{\digamma(r)}}{r^{3/2}}\, dr, \tag{13.28}\]
    where $z = y$ or $z = y/((\log y)/2)$ (and $y = x/\kappa$, as before), and where $R_{z,t}$ is as defined in (11.13). By Cauchy-Schwarz, (13.28) is at most
    \[\sqrt{\int_{r_0}^{r_1} \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr} \cdot \sqrt{\int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr}. \tag{13.29}\]
    We have already bounded the second integral. Let us look at the first one. We can write $R_{z,t} = 0.27125 R^{\circ}_{z,t} + 0.41415$, where
    \[R^{\circ}_{z,t} = \log\left(1 + \frac{\log 4t}{2 \log \frac{9 z^{1/3}}{2.004 t}}\right). \tag{13.30}\]
    Clearly,
    \[R^{\circ}_{z,e^t/4} = \log\left(1 + \frac{t/2}{\log \frac{36 z^{1/3}}{2.004} - t}\right).\]
    Now, for $f(t) = \log(c + at/(b - t))$ and $t \in [0, b)$,
    \[f'(t) = \frac{ab}{\left(c + \frac{at}{b-t}\right) (b-t)^2}, \qquad f''(t) = \frac{-ab((a - 2c)(b - 2t) - 2ct)}{\left(c + \frac{at}{b-t}\right)^2 (b-t)^4}.\]
    In our case, $a = 1/2$, $c = 1$ and $b = \log 36 z^{1/3} - \log(2.004) > 0$. Hence, for $t < b$,
    \[-ab((a - 2c)(b - 2t) - 2ct) = \frac{b}{2}\left(2t + \frac{3}{2}(b - 2t)\right) = \frac{b}{2}\left(\frac{3}{2} b - t\right) > 0,\]
    and so $f''(t) > 0$. In other words, $t \to R^{\circ}_{z,e^t/4}$ is convex-up for $t < b$, i.e., for $e^t/4 < 9 z^{1/3}/2.004$. It is easy to check that, since we are assuming $y \ge 10^{25}$,
    \[2 r_1 = \frac{3}{4} y^{4/15} < \frac{9}{2.004} \left(\frac{2y}{\log y}\right)^{1/3} \le \frac{9 z^{1/3}}{2.004}.\]
    We conclude that $r \to R^{\circ}_{z,2r}$ is convex-up on $\log 8r$ for $r \le r_1$, and hence so is $r \to R_{z,r}$, and so, in turn, is $r \to R^2_{z,r}$. Thus, for $r \in [r_0, r_1]$,
    \[R^2_{z,2r} \le R^2_{z,2r_0} \cdot \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \cdot \frac{\log r/r_0}{\log r_1/r_0}. \tag{13.31}\]

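    Therefore, by (13.22),
    \[\begin{aligned}
    \int_{r_0}^{r_1} & \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr \\
    \le {} & \int_{r_0}^{r_1} \left(R^2_{z,2r_0} \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \frac{\log r/r_0}{\log r_1/r_0}\right) (\log 2r)^2 \frac{dr}{r^{3/2}} \\
    = {} & \frac{2 R^2_{z,2r_0}}{\log r_1/r_0} \left(\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_1 - \frac{P_3(\log 2r_0)}{\sqrt{r_0}} + \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right) \\
    & + \frac{2 R^2_{z,2r_1}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}} - \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_0\right) \\
    = {} & 2 \left(R^2_{z,2r_0} - \frac{\log 2r_0}{\log r_1/r_0} \left(R^2_{z,2r_1} - R^2_{z,2r_0}\right)\right) \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \\
    & + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right) \\
    = {} & 2 R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \\
    & + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right),
    \end{aligned} \tag{13.32}\]
    where $P_2(t)$ and $P_3(t)$ are as in (13.23), and $P_2^-(t) = P_3(t) - t P_2(t) = 2t^2 + 16t + 48$.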

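    Putting all terms together, we conclude that
    \[\int_{r_0}^{r_1} \frac{g(r)}{r}\, dr \le f_0(r_0, y) + f_1(r_0) + f_2(r_0, y), \tag{13.33}\]
    where
    \[\begin{aligned}
    f_0(r_0, y) &= \left((1 - c_\varphi) \sqrt{I_{0,r_0,r_1,y}} + c_\varphi \sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}\right) \sqrt{\frac{2}{\sqrt{r_0}} I_{1,r_0}}, \\
    f_1(r_0) &= \frac{\sqrt{F(\log r_0)}}{\sqrt{2 r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right) + \frac{5}{\sqrt{2 r_0}} \\
    & \quad + \frac{1}{r_0} \left(\left(\frac{13}{4} \log e r_0 + 11.07\right) J_{r_0} + 13.66 \log e r_0 + 37.55\right), \\
    f_2(r_0, y) &= 3.36\, \frac{((\log y)/2)^{1/6}}{y^{1/6}} \log \frac{r_1}{r_0},
    \end{aligned} \tag{13.34}\]
    where $F(t) = e^\gamma \log t + c_\gamma$, $c_\gamma = 1.025742$, $y = x/\kappa$ (as usual),
    \[\begin{aligned}
    I_{0,r_0,r_1,z} = {} & R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \\
    & + \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right), \\
    J_r = {} & F(\log r) + \frac{e^\gamma}{\log r}, \qquad I_{1,r} = F(\log r) + \frac{2 e^\gamma}{\log r}, \qquad c_\varphi = \frac{C_{\varphi,2,\frac{\log y}{2}}/|\varphi|_1}{\log \frac{\log y}{2}}
    \end{aligned} \tag{13.35}\]
    and $C_{\varphi,2,K}$ is as in (11.20).

    Let us recapitulate briefly. The term $f_2(r_0, y)$ in (13.34) comes from the term $3.36 x^{-1/6}$ in (11.12). The term $f_1(r_0)$ includes all other terms in (11.12), except for $R_{x,2r} \log 2r \sqrt{\digamma(r)}/(\sqrt{2r})$. The contribution of that last term is (13.28) divided by $\sqrt{2}$. That, in turn, is at most (13.29) divided by $\sqrt{2}$. The first integral in (13.29) was bounded in (13.32); the second integral was bounded in (13.27).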


    Chapter 14

    Conclusion

    We now need to gather all results, using the smoothing functions
    \[\eta_* = (\eta_2 *_M \varphi)(\kappa t),\]
    where $\varphi(t) = t^2 e^{-t^2/2}$, $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2 \cdot I_{[1/2,1]}$, and
    \[\eta_+ = h_{200}(t) t e^{-t^2/2},\]
    where
    \[h_H(t) = \int_0^\infty h(t y^{-1}) F_H(y) \frac{dy}{y}, \qquad h(t) = \begin{cases} t^2 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise,} \end{cases} \qquad F_H(y) = \frac{\sin(H \log y)}{\pi \log y}.\]
    We studied $\eta_*$ and $\eta_+$ in Part II. We saw $\eta_*$ in Thm. 13.2.1 (which actually works for general $\varphi : [0, \infty) \to [0, \infty)$, as its statement says). We will set $\kappa$ soon.

    We fix a value for $r$, namely, $r = 150000$. Our results will have to be valid for any $x \ge x_+$, where $x_+$ is fixed. We set $x_+ = 4.9 \cdot 10^{26}$, since we want a result valid for $N \ge 10^{27}$, and, as was discussed in (11.1), we will work with $x_+$ slightly smaller than $N/2$.

    14.1 The $\ell_2$ norm over the major arcs: explicit version

    We apply Lemma 10.3.1 with $\eta = \eta_+$ and $\eta_\circ$ as in (11.3). Let us first work out the error terms defined in (10.27). Recall that $\delta_0 = 8$. By Thm. 7.1.4,
    \[\begin{aligned}
    ET_{\eta_+,\delta_0 r/2} &= \max_{|\delta| \le \delta_0 r/2} |\mathrm{err}_{\eta_+,\chi_T}(\delta, x)| \\
    &= 4.772 \cdot 10^{-11} + \frac{251400}{\sqrt{x_+}} \le 1.1405 \cdot 10^{-8},
    \end{aligned} \tag{14.1}\]
    \[\begin{aligned}
    E_{\eta_+,r,\delta_0} &= \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2) \delta_0 r/2q}} \sqrt{q^*} |\mathrm{err}_{\eta_+,\chi^*}(\delta, x)| \\
    &\le 1.3482 \cdot 10^{-14} \sqrt{300000} + \frac{1.617 \cdot 10^{-10}}{\sqrt{2}} + \frac{1}{\sqrt{x_+}} \left(499900 + 52\sqrt{300000}\right) \\
    &\le 2.3992 \cdot 10^{-8},
    \end{aligned} \tag{14.2}\]
    where, in the latter case, we are using the fact that a stronger bound for $q = 1$ (namely, (14.1)) allows us to assume $q \ge 2$.
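The arithmetic behind (14.1) and (14.2) is quick to reproduce. This is a floating-point sketch, with $x_+ = 4.9 \cdot 10^{26}$ and $300000$ read as $2r$ for $r = 150000$; the variable names are illustrative.

```python
import math

x_plus = 4.9e26      # lower bound on x fixed in the text
two_r = 300000       # 2r with r = 150000

# (14.1): error bound for the trivial character.
et_bound = 4.772e-11 + 251400 / math.sqrt(x_plus)

# (14.2): error bound for the remaining characters, using q >= 2.
e_bound = (1.3482e-14 * math.sqrt(two_r)
           + 1.617e-10 / math.sqrt(2)
           + (499900 + 52 * math.sqrt(two_r)) / math.sqrt(x_plus))
```

Both quantities are dominated by the term with $1/\sqrt{x_+}$, which is why the final bounds depend so visibly on the choice of $x_+$.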

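    We also need to bound a few norms: by the estimates in §A.3 and §A.5 (applied with $H = 200$),
    \[\begin{aligned}
    |\eta_+|_1 &\le 1.062319, \qquad |\eta_+|_2 \le 0.800129 + \frac{274.8569}{200^{7/2}} \le 0.800132, \\
    |\eta_+|_\infty &\le 1 + 2.06440727 \cdot \frac{1 + \frac{4}{\pi} \log H}{H} \le 1.079955.
    \end{aligned} \tag{14.3}\]
    By (10.12) and (14.1),
    \[|S_{\eta_+}(0, x)| = \left|\widehat{\eta_+}(0) \cdot x + O^*\left(\mathrm{err}_{\eta_+,\chi_T}(0, x)\right) \cdot x\right| \le (|\eta_+|_1 + ET_{\eta_+,\delta_0 r/2}) x \le 1.063 x.\]
    This is far from optimal, but it will do, since all we wish to do with this is to bound the tiny error term $K_{r,2}$ in (10.27):
    \[\begin{aligned}
    K_{r,2} &= (1 + \sqrt{300000}) (\log x)^2 \cdot 1.079955 \cdot \left(2 \cdot 1.06232 + (1 + \sqrt{300000}) (\log x)^2\, 1.079955/x\right) \\
    &\le 1259.06 (\log x)^2 \le 9.71 \cdot 10^{-21} x
    \end{aligned}\]
    for $x \ge x_+$. By (14.1), we also have
    \[5.19\, \delta_0 r \left(ET_{\eta_+,\delta_0 r/2} \cdot \left(|\eta_+|_1 + \frac{ET_{\eta_+,\delta_0 r/2}}{2}\right)\right) \le 0.075272\]
    and
    \[\delta_0 r (\log 2e^2 r) \left(E^2_{\eta_+,r,\delta_0} + K_{r,2}/x\right) \le 1.00393 \cdot 10^{-8}.\]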

    By (A.23) and (A.26),
    \[0.8001287 \le |\eta_\circ|_2 \le 0.8001288 \tag{14.4}\]
    and
    \[|\eta_+ - \eta_\circ|_2 \le \frac{274.856893}{H^{7/2}} \le 2.42942 \cdot 10^{-6}. \tag{14.5}\]
    We bound $|\eta_\circ^{(3)}|_1$ using the fact that (as we can tell by taking derivatives) $\eta_\circ^{(2)}(t)$ increases from $0$ at $t = 0$ to a maximum within $[0, 1/2]$, and then decreases to $\eta_\circ^{(2)}(1) = -7$, only to increase to a maximum within $[3/2, 2]$ (equal to the maximum attained within $[0, 1/2]$) and then decrease to $0$ at $t = 2$:
    \[\begin{aligned}
    |\eta_\circ^{(3)}|_1 &= 2 \max_{t \in [0,1/2]} \eta_\circ^{(2)}(t) - 2 \eta_\circ^{(2)}(1) + 2 \max_{t \in [3/2,2]} \eta_\circ^{(2)}(t) \\
    &= 4 \max_{t \in [0,1/2]} \eta_\circ^{(2)}(t) + 14 \le 4 \cdot 4.6255653 + 14 \le 32.5023,
    \end{aligned} \tag{14.6}\]
    where we compute the maximum by the bisection method with 30 iterations (using interval arithmetic, as always).

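    We evaluate explicitly
    \[\sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} = 6.798779\ldots,\]
    using, yet again, interval arithmetic.

    Looking at (10.29) and (10.28), we conclude that
    \[\begin{aligned}
    L_{r,\delta_0} &\le 2 \cdot 6.798779 \cdot 0.800132^2 \le 8.70531, \\
    L_{r,\delta_0} &\ge 2 \cdot 6.798779 \cdot 0.8001287^2 - \left((\log r + 1.7) \cdot (3.888 \cdot 10^{-6} + 5.91 \cdot 10^{-12})\right) \\
    & \quad - \left(1.342 \cdot 10^{-5}\right) \cdot \left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right) \ge 8.70517.
    \end{aligned}\]
    Lemma 10.3.1 thus gives us that
    \[\begin{aligned}
    \int_{\mathfrak{M}_{8,r_0}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha &= (8.70524 + O^*(0.00007)) x + O^*(0.075273) x \\
    &= (8.7052 + O^*(0.0754)) x \le 8.7806 x.
    \end{aligned} \tag{14.7}\]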
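The two bounds on $L_{r,\delta_0}$ are pure arithmetic once the odd sum $6.798779\ldots$ is taken as given. A floating-point sketch of that arithmetic, with $r = 150000$ and illustrative variable names:

```python
import math

r = 150000
odd_sum = 6.798779   # sum over odd q <= r of mu(q)^2 / phi(q), as quoted above

upper = 2 * odd_sum * 0.800132 ** 2
lower = (2 * odd_sum * 0.8001287 ** 2
         - (math.log(r) + 1.7) * (3.888e-6 + 5.91e-12)
         - 1.342e-5 * (0.64787 + math.log(r) / (4 * r) + 0.425 / r))

# Midpoint and half-width of [lower, upper], as used in (14.7).
midpoint = (upper + lower) / 2
half_width = (upper - lower) / 2
```

The interval $[8.70517, 8.70531]$ is what becomes $8.70524 + O^*(0.00007)$ in (14.7).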

    14.2 The total major-arc contribution

    First of all, we must bound from below
    \[C_0 = \prod_{p | N} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p \nmid N} \left(1 + \frac{1}{(p-1)^3}\right). \tag{14.8}\]
    The only prime that we know does not divide $N$ is $2$. Thus, we use the bound
    \[C_0 \ge 2 \prod_{p > 2} \left(1 - \frac{1}{(p-1)^2}\right) \ge 1.3203236. \tag{14.9}\]

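    The other main constant is $C_{\eta_\circ,\eta_*}$, which we defined in (10.37) and already started to estimate in (11.6):

    \[C_{\eta_\circ,\eta_*} = |\eta_\circ|_2^2 \int_0^{N/x} \eta_*(\rho)\,d\rho + 2.71 |\eta_\circ'|_2^2 \cdot O^*\left(\int_0^{N/x} ((2 - N/x) + \rho)^2 \eta_*(\rho)\,d\rho\right) \tag{14.10}\]

    provided that $N \ge 2x$. Recall that $\eta_* = (\eta_2 \ast_M \varphi)(\kappa t)$, where $\varphi(t) = t^2 e^{-t^2/2}$. Therefore,

    \[\begin{aligned}
    \int_0^{N/x} \eta_*(\rho)\,d\rho &= \int_0^{N/x} (\eta_2 \ast_M \varphi)(\kappa\rho)\,d\rho = \int_{1/4}^1 \eta_2(w) \int_0^{N/x} \varphi\left(\frac{\kappa\rho}{w}\right) d\rho\,\frac{dw}{w} \\
    &= \frac{|\eta_2|_1 |\varphi|_1}{\kappa} - \frac{1}{\kappa} \int_{1/4}^1 \eta_2(w) \int_{\kappa N/xw}^\infty \varphi(\rho)\,d\rho\,dw.
    \end{aligned}\]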

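    By integration by parts and [AS64, (7.1.13)],

    \[\int_y^\infty \varphi(\rho)\,d\rho = y e^{-y^2/2} + \sqrt{2} \int_{y/\sqrt{2}}^\infty e^{-t^2}\,dt < \left(y + \frac{1}{y}\right) e^{-y^2/2}.\]

    Hence

    \[\int_{\kappa N/xw}^\infty \varphi(\rho)\,d\rho \le \int_{2\kappa}^\infty \varphi(\rho)\,d\rho < \left(2\kappa + \frac{1}{2\kappa}\right) e^{-2\kappa^2}\]

    and so, since $|\eta_2|_1 = 1$,

    \[\begin{aligned}
    \int_0^{N/x} \eta_*(\rho)\,d\rho &\ge \frac{|\varphi|_1}{\kappa} - \int_{1/4}^1 \eta_2(w)\,dw \cdot \left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2} \\
    &\ge \frac{|\varphi|_1}{\kappa} - \left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2}.
    \end{aligned} \tag{14.11}\]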
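As a sanity check on the tail estimate for $\varphi(t) = t^2 e^{-t^2/2}$, one can compare the exact tail (written with the complementary error function) against the bound $(y + 1/y)e^{-y^2/2}$. This is only a floating-point illustration; the helper names `phi_tail` and `phi_tail_bound` are ours.

```python
from math import erfc, exp, pi, sqrt

def phi_tail(y):
    """Exact value of the integral of t^2 exp(-t^2/2) over [y, infinity)."""
    # sqrt(2) * int_{y/sqrt(2)}^inf exp(-t^2) dt = sqrt(pi/2) * erfc(y / sqrt(2))
    return y * exp(-y * y / 2) + sqrt(pi / 2) * erfc(y / sqrt(2))

def phi_tail_bound(y):
    """The upper bound (y + 1/y) exp(-y^2/2), valid for y > 0."""
    return (y + 1 / y) * exp(-y * y / 2)

for y in (0.5, 1.0, 2.0, 5.0):
    print(y, phi_tail(y), phi_tail_bound(y))
```

At $y = 0$ the function `phi_tail` returns $|\varphi|_1 = \sqrt{\pi/2}$, which is the value used repeatedly in what follows.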

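    Let us now focus on the second integral in (14.10). Write $N/x = 2 + c_1/\kappa$. Then the integral equals

    \[\begin{aligned}
    \int_0^{2 + c_1/\kappa} (-c_1/\kappa + \rho)^2 \eta_*(\rho)\,d\rho &\le \frac{1}{\kappa^3} \int_0^\infty (u - c_1)^2 (\eta_2 \ast_M \varphi)(u)\,du \\
    &= \frac{1}{\kappa^3} \int_{1/4}^1 \eta_2(w) \int_0^\infty (vw - c_1)^2 \varphi(v)\,dv\,dw \\
    &= \frac{1}{\kappa^3} \int_{1/4}^1 \eta_2(w) \left(3\sqrt{\frac{\pi}{2}}\,w^2 - 2 \cdot 2 c_1 w + c_1^2 \sqrt{\frac{\pi}{2}}\right) dw \\
    &= \frac{1}{\kappa^3} \left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{9}{4} c_1 + \sqrt{\frac{\pi}{2}}\,c_1^2\right).
    \end{aligned}\]

    It is thus best to choose $c_1 = (9/4)/\sqrt{2\pi} = 0.89762\dotsc$.

    We must now estimate $|\eta_\circ'|_2^2$. We could do this directly by rigorous numerical integration, but we might as well do it the hard way (which is actually rather easy). By the definition (11.3) of $\eta_\circ$,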

    \[|\eta_\circ'(x+1)|^2 = \left(x^{14} - 18x^{12} + 111x^{10} - 284x^8 + 351x^6 - 210x^4 + 49x^2\right) e^{-x^2} \tag{14.12}\]

    for $x \in [-1, 1]$, and $\eta_\circ'(x+1) = 0$ for $x \notin [-1, 1]$. Now, for any even integer $k > 0$,

    \[\int_{-1}^1 x^k e^{-x^2}\,dx = 2\int_0^1 x^k e^{-x^2}\,dx = \gamma\left(\frac{k+1}{2}, 1\right),\]

    where $\gamma(a, r) = \int_0^r e^{-t} t^{a-1}\,dt$ is the incomplete gamma function. (We substitute $t = x^2$ in the integral.) By [AS64, (6.5.16), (6.5.22)], $\gamma(a+1, 1) = a\gamma(a, 1) - 1/e$ for all $a > 0$, and $\gamma(1/2, 1) = \sqrt{\pi}\,\mathrm{erf}(1)$, where

    \[\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt.\]

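    Thus, starting from (14.12), we see that

    \[\begin{aligned}
    |\eta_\circ'|_2^2 &= \gamma\left(\tfrac{15}{2}, 1\right) - 18 \cdot \gamma\left(\tfrac{13}{2}, 1\right) + 111 \cdot \gamma\left(\tfrac{11}{2}, 1\right) - 284 \cdot \gamma\left(\tfrac{9}{2}, 1\right) \\
    &\quad + 351 \cdot \gamma\left(\tfrac{7}{2}, 1\right) - 210 \cdot \gamma\left(\tfrac{5}{2}, 1\right) + 49 \cdot \gamma\left(\tfrac{3}{2}, 1\right) \\
    &= \frac{9151}{128}\sqrt{\pi}\,\mathrm{erf}(1) - \frac{18101}{64 e} = 2.7375292\dotsc
    \end{aligned} \tag{14.13}\]

    We thus obtain

    \[\begin{aligned}
    2.71 |\eta_\circ'|_2^2 \cdot \int_0^{N/x} ((2 - N/x) + \rho)^2 \eta_*(\rho)\,d\rho &\le 7.4188 \cdot \frac{1}{\kappa^3} \left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{(9/4)^2}{2\sqrt{2\pi}}\right) \\
    &\le \frac{2.0002}{\kappa^3}.
    \end{aligned}\]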
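The evaluation of $|\eta_\circ'|_2^2$ in (14.13) can be re-traced numerically. The sketch below applies the recurrence $\gamma(a+1,1) = a\gamma(a,1) - 1/e$ starting from $\gamma(1/2,1) = \sqrt{\pi}\,\mathrm{erf}(1)$ and compares the result with the closed form; it works in double precision rather than with rigorous error control, and the names `lower_gamma_half_integers` and `COEFFS` are ours.

```python
from math import erf, exp, pi, sqrt

def lower_gamma_half_integers(n_max):
    """Return {a: gamma(a, 1)} for a = 1/2, 3/2, ..., n_max + 1/2."""
    values = {0.5: sqrt(pi) * erf(1.0)}
    a = 0.5
    for _ in range(n_max):
        values[a + 1] = a * values[a] - exp(-1.0)   # gamma(a+1,1) = a*gamma(a,1) - 1/e
        a += 1
    return values

# Coefficients of x^14, x^12, ..., x^2 in (14.12), keyed by the exponent k.
COEFFS = {14: 1, 12: -18, 10: 111, 8: -284, 6: 351, 4: -210, 2: 49}

g = lower_gamma_half_integers(7)
norm_sq = sum(c * g[(k + 1) / 2] for k, c in COEFFS.items())
closed_form = 9151 / 128 * sqrt(pi) * erf(1.0) - 18101 / (64 * exp(1.0))
print(norm_sq, closed_form)
```

Both printed numbers should agree with each other to many digits and with the value in (14.13) to the digits shown there.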

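    We conclude that

    \[C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa} |\varphi|_1 |\eta_\circ|_2^2 - |\eta_\circ|_2^2 \left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2} - \frac{2.0002}{\kappa^3}.\]

    Setting

    \[\kappa = 49\]

    and using (14.4), we obtain

    \[C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa} \left(|\varphi|_1 |\eta_\circ|_2^2 - 0.000834\right). \tag{14.14}\]

    Here it is useful to note that $|\varphi|_1 = \sqrt{\pi/2}$, and so, by (14.4), $|\varphi|_1 |\eta_\circ|_2^2 = 0.80237\dotsc$.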

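    We have finally chosen $x$ in terms of $N$:

    \[x = \frac{N}{2 + \frac{c_1}{\kappa}} = \frac{N}{2 + \frac{9/4}{\sqrt{2\pi}} \cdot \frac{1}{49}} = 0.495461\dotsc \cdot N. \tag{14.15}\]

    Thus, we see that, since we are assuming $N \ge 10^{27}$, we in fact have $x \ge 4.95461\dotsc \cdot 10^{26}$, and so, in particular,

    \[x \ge 4.9 \cdot 10^{26}, \qquad \frac{x}{\kappa} \ge 10^{25}. \tag{14.16}\]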
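The choice of $c_1$ and the resulting ratio $x/N$ in (14.15) amount to minimizing a quadratic and then evaluating one quotient. A short floating-point sketch (the names `second_integral_factor` and `ratio` are ours) makes this explicit:

```python
from math import pi, sqrt

KAPPA = 49

def second_integral_factor(c1):
    """The bracket (49/48)sqrt(pi/2) - (9/4)c1 + sqrt(pi/2)c1^2 bounding the second integral in (14.10)."""
    return 49 / 48 * sqrt(pi / 2) - 9 / 4 * c1 + sqrt(pi / 2) * c1 ** 2

c1 = (9 / 4) / sqrt(2 * pi)        # vertex of the quadratic in c1
ratio = 1 / (2 + c1 / KAPPA)       # x / N, as in (14.15)
print(c1, second_integral_factor(c1), ratio)
```

With $N \ge 10^{27}$, multiplying `ratio` by $10^{27}$ gives the lower bound on $x$ stated in (14.16).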

    Let us continue with our determination of the major-arcs total. We should compute the quantities in (10.38). We already have bounds for $E_{\eta_+,r,\delta_0}$, $A_{\eta_+}$ (see (14.7)), $L_{\eta,r,\delta_0}$ and $K_{r,2}$. By Corollary 7.1.3, we have

    \[\begin{aligned}
    E_{\eta_*,r,8} &\le \max_{\substack{\chi \bmod q \\ q \le r \cdot \gcd(q,2) \\ |\delta| \le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta, x)| \\
    &\le \frac{1}{\kappa} \left(2.485 \cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}} \left(381500 + 76\sqrt{300000}\right)\right) \le \frac{1.33805 \cdot 10^{-8}}{\kappa},
    \end{aligned} \tag{14.17}\]

    where the factor of $\kappa$ comes from the scaling in $\eta_*(t) = (\eta_2 \ast_M \varphi)(\kappa t)$ (which in effect divides $x$ by $\kappa$). It remains only to bound the more harmless terms of type $Z_{\eta^2,2}$ and $LS_\eta$.

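    Clearly, $Z_{\eta_+^2,2} \le (1/x) \sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x)$. Now, by Prop. 7.1.5,

    \[\begin{aligned}
    \sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) &= \left(0.640206 + O^*\left(2 \cdot 10^{-6} + \frac{366.91}{\sqrt{x}}\right)\right) x \log x - 0.021095x \\
    &\le (0.640206 + O^*(3 \cdot 10^{-6}))\,x \log x - 0.021095x.
    \end{aligned} \tag{14.18}\]

    Thus,

    \[Z_{\eta_+^2,2} \le 0.640209 \log x. \tag{14.19}\]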

    We will proceed a little more crudely for $Z_{\eta_*^2,2}$:

    \[\begin{aligned}
    Z_{\eta_*^2,2} &= \frac{1}{x} \sum_n \Lambda^2(n)\,\eta_*^2(n/x) \le \frac{1}{x} \sum_n \Lambda(n)\,\eta_*(n/x) \cdot (\eta_*(n/x) \log n) \\
    &\le (|\eta_*|_1 + |\mathrm{err}_{\eta_*,\chi_T}(0, x)|) \cdot (|\eta_*(t) \cdot \log^+(\kappa t)|_\infty + |\eta_*|_\infty \log(x/\kappa)),
    \end{aligned} \tag{14.20}\]

    where $\log^+(t) = \max(0, \log t)$. It is easy to see that

    \[|\eta_*|_\infty = |\eta_2 \ast_M \varphi|_\infty \le \left|\frac{\eta_2(t)}{t}\right|_1 |\varphi|_\infty \le 4(\log 2)^2 \cdot \frac{2}{e} \le 1.414, \tag{14.21}\]

    and, since $\log^+$ is non-decreasing and $\eta_2$ is supported on a subset of $[0, 1]$,

    \[\begin{aligned}
    |\eta_*(t) \cdot \log^+(\kappa t)|_\infty &= |(\eta_2 \ast_M \varphi) \cdot \log^+|_\infty \le |\eta_2 \ast_M (\varphi \cdot \log^+)|_\infty \\
    &\le \left|\frac{\eta_2(t)}{t}\right|_1 \cdot |\varphi \cdot \log^+|_\infty \le 1.921813 \cdot 0.381157 \le 0.732513,
    \end{aligned}\]

    where we bound $|\varphi \cdot \log^+|_\infty$ by the bisection method with 25 iterations. We already know that

    \[|\eta_*|_1 = \frac{|\eta_2|_1 |\varphi|_1}{\kappa} = \frac{|\varphi|_1}{\kappa} = \frac{\sqrt{\pi/2}}{\kappa}. \tag{14.22}\]
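The maximum of $\varphi(t)\log^+ t$ can be located, for illustration, by bisecting on the sign of its logarithmic derivative, which is strictly decreasing for $t > 1$. This is an ordinary floating-point bisection, not the interval-arithmetic procedure used for the rigorous bound; the function names are ours.

```python
from math import exp, log

def phi_log_plus(t):
    """phi(t) * log+(t) with phi(t) = t^2 exp(-t^2/2)."""
    return t * t * exp(-t * t / 2) * max(0.0, log(t))

def log_derivative(t):
    """d/dt of log(phi(t) log t) for t > 1; each term decreases in t."""
    return 2 / t - t + 1 / (t * log(t))

lo, hi = 1.5, 2.5                  # log_derivative(lo) > 0 > log_derivative(hi)
for _ in range(60):
    mid = (lo + hi) / 2
    if log_derivative(mid) > 0:
        lo = mid
    else:
        hi = mid
t_max = (lo + hi) / 2
print(t_max, phi_log_plus(t_max))
```

The maximum is attained a little below $t = 1.9$, and its value is close to the constant $0.381157$ used above.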

    By Cor. 7.1.3,

    \[|\mathrm{err}_{\eta_*,\chi_T}(0, x)| \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}(381500 + 76) \le 1.20665 \cdot 10^{-7}.\]

    We conclude that

    \[Z_{\eta_*^2,2} \le \left(\frac{\sqrt{\pi/2}}{49} + 1.20665 \cdot 10^{-7}\right)(0.732513 + 1.414 \log(x/49)) \le 0.0362 \log x. \tag{14.23}\]

    We have bounds for $|\eta_*|_\infty$ and $|\eta_+|_\infty$. We can also bound

    \[|\eta_* \cdot t|_\infty = \frac{|(\eta_2 \ast_M \varphi) \cdot t|_\infty}{\kappa} \le \frac{|\eta_2|_1 \cdot |\varphi \cdot t|_\infty}{\kappa} \le \frac{3^{3/2} e^{-3/2}}{\kappa}.\]

    We quote the estimate

    \[|\eta_+ \cdot t|_\infty = 1.064735 + 3.25312 \cdot \frac{1 + (4/\pi) \log 200}{200} \le 1.19073 \tag{14.24}\]

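    from (A.42).

    We can now bound $LS_\eta(x, r)$ for $\eta = \eta_*, \eta_+$:

    \[\begin{aligned}
    LS_\eta(x, r) &= \log r \cdot \max_{p \le r} \sum_{\alpha \ge 1} \eta\left(\frac{p^\alpha}{x}\right) \\
    &\le (\log r) \cdot \max_{p \le r} \left(\frac{\log x}{\log p} |\eta|_\infty + \sum_{\substack{\alpha \ge 1 \\ p^\alpha \ge x}} \frac{|\eta \cdot t|_\infty}{p^\alpha/x}\right) \\
    &\le (\log r) \cdot \max_{p \le r} \left(\frac{\log x}{\log p} |\eta|_\infty + \frac{|\eta \cdot t|_\infty}{1 - 1/p}\right) \le \frac{(\log r)(\log x)}{\log 2} |\eta|_\infty + 2(\log r)|\eta \cdot t|_\infty,
    \end{aligned}\]

    and so

    \[\begin{aligned}
    LS_{\eta_*} &\le \left(\frac{1.414}{\log 2} \log x + 2 \cdot \frac{(3/e)^{3/2}}{49}\right) \log r \le 24.32 \log x + 0.57, \\
    LS_{\eta_+} &\le \left(\frac{1.07996}{\log 2} \log x + 2 \cdot 1.19073\right) \log r \le 18.57 \log x + 28.39,
    \end{aligned} \tag{14.25}\]

    where we are using the bound on $|\eta_+|_\infty$ in (14.3).

    We can now start to put together all terms in (10.36). Let $\epsilon_0 = |\eta_+ - \eta_\circ|_2 / |\eta_\circ|_2$.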

    Then, by (14.5),

    \[\epsilon_0 |\eta_\circ|_2 = |\eta_+ - \eta_\circ|_2 \le 2.42942 \cdot 10^{-6}.\]

    Thus,

    \[2.82643 |\eta_\circ|_2^2 (2 + \epsilon_0) \cdot \epsilon_0 + \frac{4.31004 |\eta_\circ|_2^2 + 0.0012 \frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\]

    is at most

    \[\begin{aligned}
    &2.82643 \cdot 2.42942 \cdot 10^{-6} \cdot (2 \cdot 0.80013 + 2.42942 \cdot 10^{-6}) \\
    &\quad + \frac{4.3101 \cdot 0.80013^2 + 0.0012 \cdot \frac{32.503^2}{8^5}}{150000} \le 2.9387 \cdot 10^{-5}
    \end{aligned}\]

    by (14.4), (14.6), and (14.22).

    Since $\eta_* = (\eta_2 \ast_M \varphi)(\kappa t)$ and $\eta_2$ is supported on $[1/4, 1]$,

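    \[\begin{aligned}
    |\eta_*|_2^2 &= \frac{|\eta_2 \ast_M \varphi|_2^2}{\kappa} = \frac{1}{\kappa} \int_0^\infty \left(\int_0^\infty \eta_2(t)\,\varphi\left(\frac{w}{t}\right) \frac{dt}{t}\right)^2 dw \\
    &\le \frac{1}{\kappa} \int_0^\infty \left(1 - \frac{1}{4}\right) \int_0^\infty \eta_2^2(t)\,\varphi^2\left(\frac{w}{t}\right) \frac{dt}{t^2}\,dw \\
    &= \frac{3}{4\kappa} \int_0^\infty \frac{\eta_2^2(t)}{t} \left(\int_0^\infty \varphi^2\left(\frac{w}{t}\right) \frac{dw}{t}\right) dt \\
    &= \frac{3}{4\kappa} |\eta_2(t)/\sqrt{t}|_2^2 \cdot |\varphi|_2^2 = \frac{3}{4\kappa} \cdot \frac{32}{3}(\log 2)^3 \cdot \frac{3}{8}\sqrt{\pi} \le \frac{1.77082}{\kappa},
    \end{aligned}\]

    where we go from the first to the second line by Cauchy-Schwarz.

    Recalling the bounds on $E_{\eta_*,r,\delta_0}$ and $E_{\eta_+,r,\delta_0}$ we obtained in (14.2) and (14.17),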

    we conclude that the second line of (10.36) is at most $x^2$ times

    \[\begin{aligned}
    &\frac{1.33805 \cdot 10^{-8}}{\kappa} \cdot 8.7806 + 2.3922 \cdot 10^{-8} \cdot 1.6812 \\
    &\quad \cdot \left(\sqrt{8.7806} + 1.6812 \cdot 0.80014\right) \sqrt{\frac{1.77082}{\kappa}} \le \frac{1.7316 \cdot 10^{-6}}{\kappa},
    \end{aligned}\]

    where we are using the bound $A_{\eta_+} \le 8.7806$ we obtained in (14.7). (We are also using the bounds on norms in (14.3) and the value $\kappa = 49$.)

    By the bounds (14.19), (14.23) and (14.25), we see that the third line of (10.36) is at most

    \[\begin{aligned}
    &2 \cdot (0.640209 \log x) \cdot (24.32 \log x + 0.57) \cdot x \\
    &\quad + 4\sqrt{0.640209 \log x \cdot 0.0362 \log x}\,(18.57 \log x + 28.39)\,x \le 43 (\log x)^2 x,
    \end{aligned}\]

    where we use the assumption $x \ge x_+ = 4.9 \cdot 10^{26}$ (though a much weaker assumption would suffice).

    Using the assumption $x \ge x_+$ again, together with (14.22) and the bounds we have just proven, we conclude that, for $r = 150000$, the integral over the major arcs

    \[\int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\,e(-N\alpha)\,d\alpha\]

    is

    \[\begin{aligned}
    &C_0 \cdot C_{\eta_\circ,\eta_*} x^2 + O^*\left(2.9387 \cdot 10^{-5} \cdot \frac{\sqrt{\pi/2}}{\kappa} x^2 + \frac{1.7316 \cdot 10^{-6}}{\kappa} x^2 + 43 (\log x)^2 x\right) \\
    &= C_0 \cdot C_{\eta_\circ,\eta_*} x^2 + O^*\left(\frac{3.85628 \cdot 10^{-5} \cdot x^2}{\kappa}\right) = C_0 \cdot C_{\eta_\circ,\eta_*} x^2 + O^*(7.86996 \cdot 10^{-7} x^2),
    \end{aligned} \tag{14.26}\]

    where $C_0$ and $C_{\eta_\circ,\eta_*}$ are as in (10.37). Notice that $C_0 C_{\eta_\circ,\eta_*} x^2$ is the expected asymptotic for the integral over all of $\mathbb{R}/\mathbb{Z}$.

    Moreover, by (14.9), (14.14) and (14.4), as well as $|\varphi|_1 = \sqrt{\pi/2}$,

    \[C_0 \cdot C_{\eta_\circ,\eta_*} \ge 1.3203236 \left(\frac{|\varphi|_1 |\eta_\circ|_2^2}{\kappa} - \frac{0.000834}{\kappa}\right) \ge \frac{1.0594003}{\kappa} - \frac{0.001102}{\kappa} \ge \frac{1.058298}{49}.\]

    Hence

    \[\int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\,e(-N\alpha)\,d\alpha \ge \frac{1.058259}{\kappa} x^2, \tag{14.27}\]

    where, as usual, $\kappa = 49$. This is our total major-arc bound.
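The bookkeeping behind (14.26) and (14.27) is elementary and can be re-traced in a few lines. The sketch below uses ordinary floating point and takes every constant from the text; only the variable names are ours.

```python
from math import log, pi, sqrt

KAPPA = 49
X_PLUS = 4.9e26
phi_l1 = sqrt(pi / 2)

# Error terms inside O*(...) in (14.26), each expressed as a multiple of x^2 / kappa.
first = 2.9387e-5 * phi_l1
second = 1.7316e-6
third = 43 * log(X_PLUS) ** 2 * KAPPA / X_PLUS   # 43 (log x)^2 x, evaluated at x = x_+
error = first + second + third

# Lower bound for kappa * C_0 * C_{eta_circ, eta_*}, from (14.9), (14.14) and (14.4).
main = 1.3203236 * (phi_l1 * 0.8001287 ** 2 - 0.000834)
print(error, main, main - error)
```

The third error term is smaller than the other two by many orders of magnitude, which is why a much weaker assumption on $x$ would suffice at that step.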

    14.3 The minor-arc total: explicit version

    We need to estimate the quantities $E$, $S$, $T$, $J$, $M$ in Theorem 13.2.1. Let us start by bounding the constants in (13.12). The constants $C_{\eta_+,j}$, $j = 0, 1, 2$, will appear only in the minor term $E$, and so crude bounds on them will do.

    By (14.3) and (14.24),

    \[\sup_{r \ge t} \eta_+(r) \le \min\left(1.07996, \frac{1.19073}{t}\right)\]

    for all $t \ge 0$. Thus,

    \[\begin{aligned}
    C_{\eta_+,0} &= 0.7131 \int_0^\infty \frac{1}{\sqrt{t}} \left(\sup_{r \ge t} \eta_+(r)\right)^2 dt \\
    &\le 0.7131 \left(\int_0^1 \frac{1.07996^2}{\sqrt{t}}\,dt + \int_1^\infty \frac{1.19073^2}{t^{5/2}}\,dt\right) \le 2.33744.
    \end{aligned}\]

    Similarly,

    \[\begin{aligned}
    C_{\eta_+,1} &= 0.7131 \int_1^\infty \frac{\log t}{\sqrt{t}} \left(\sup_{r \ge t} \eta_+(r)\right)^2 dt \\
    &\le 0.7131 \int_1^\infty \frac{1.19073^2 \log t}{t^{5/2}}\,dt \le 0.44937.
    \end{aligned}\]

    Immediately from (14.3),

    \[C_{\eta_+,2} = 0.51942\,|\eta_+|_\infty^2 \le 0.60581.\]

    We get

    \[\begin{aligned}
    E &\le ((2.33744 + 0.60581) \log x + (2 \cdot 2.33744 + 0.44937)) \cdot x^{1/2} \\
    &\le (2.94325 \log x + 5.12426) \cdot x^{1/2} \le 8.4029 \cdot 10^{-12} \cdot x,
    \end{aligned} \tag{14.28}\]

    where $E$ is defined as in (13.11), and where we are using the assumption $x \ge x_+ = 4.9 \cdot 10^{26}$. Using (14.17) and (14.22), we see that

    \[S_{\eta_*}(0, x) = (|\eta_*|_1 + O^*(ET_{\eta_*,0}))\,x = \left(\sqrt{\pi/2} + O^*(1.33805 \cdot 10^{-8})\right) \frac{x}{\kappa}.\]

    Hence

    \[S_{\eta_*}(0, x) \cdot E \le 1.05315 \cdot 10^{-11} \cdot \frac{x^2}{\kappa}. \tag{14.29}\]

    We can bound

    \[S \le \sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x) \le 0.640209\,x \log x - 0.021095x \tag{14.30}\]

    by (14.18). Let us now estimate $T$. Recall that $\varphi(t) = t^2 e^{-t^2/2}$. Since

    \[\int_0^u \varphi(t)\,dt = \int_0^u t^2 e^{-t^2/2}\,dt \le \int_0^u t^2\,dt = \frac{u^3}{3},\]

    we can bound

    \[C_{\varphi,3}\left(\frac{1}{2} \log \frac{x}{\kappa}\right) = \frac{1.04488}{\sqrt{\pi/2}} \int_0^{\frac{2}{\log x/\kappa}} t^2 e^{-t^2/2}\,dt \le \frac{0.2779}{((\log x/\kappa)/2)^3}.\]

    By (14.7), we already know that $J = (8.7052 + O^*(0.0754))x$. Hence

    \[\begin{aligned}
    (\sqrt{J} - \sqrt{E})^2 &= \left(\sqrt{(8.7052 + O^*(0.0754))x} - \sqrt{8.4029 \cdot 10^{-12} \cdot x}\right)^2 \\
    &\ge 8.6297x,
    \end{aligned} \tag{14.31}\]

    and so

    \[\begin{aligned}
    T &= C_{\varphi,3}\left(\frac{1}{2} \log \frac{x}{\kappa}\right) \cdot (S - (\sqrt{J} - \sqrt{E})^2) \\
    &\le \frac{8 \cdot 0.2779}{(\log x/\kappa)^3} \cdot (0.640209\,x \log x - 0.021095x - 8.6297x) \\
    &\le \frac{0.17792 \cdot 8x \log x}{(\log x/\kappa)^3} - \frac{2.40405 \cdot 8x}{(\log x/\kappa)^3} \\
    &\le \frac{1.42336x}{(\log x/\kappa)^2} - \frac{13.69293x}{(\log x/\kappa)^3}
    \end{aligned}\]

    for $\kappa = 49$. Since $x/\kappa \ge 10^{25}$, this implies that

    \[T \le 3.5776 \cdot 10^{-4} \cdot x. \tag{14.32}\]
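The constants entering $E$ and $T$ can be checked with a few lines of arithmetic. In the sketch below, the integrals defining $C_{\eta_+,0}$ and $C_{\eta_+,1}$ are replaced by their closed forms ($\int_0^1 t^{-1/2}dt = 2$, $\int_1^\infty t^{-5/2}dt = 2/3$, $\int_1^\infty t^{-5/2}\log t\,dt = 4/9$); everything is in floating point, and the function names are ours.

```python
from math import log, sqrt

X_PLUS = 4.9e26
KAPPA = 49

# Upper bounds for C_{eta_+,0}, C_{eta_+,1}, C_{eta_+,2}.
c0 = 0.7131 * (2 * 1.07996 ** 2 + (2 / 3) * 1.19073 ** 2)
c1 = 0.7131 * (4 / 9) * 1.19073 ** 2
c2 = 0.51942 * 1.07996 ** 2

def e_over_x(x):
    """Upper bound for E / x, following the first line of (14.28)."""
    return ((c0 + c2) * log(x) + (2 * c0 + c1)) / sqrt(x)

def t_over_x(x):
    """Upper bound for T / x, following the last display before (14.32)."""
    u = log(x / KAPPA)
    return 1.42336 / u ** 2 - 13.69293 / u ** 3

print(c0, c1, c2, e_over_x(X_PLUS), t_over_x(X_PLUS))
```

Both `e_over_x` and `t_over_x` are decreasing in $x$ in the range of interest, so evaluating them at $x = x_+$ gives the worst case.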

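    It remains to estimate $M$. Let us first look at $g(r_0)$; here $g = g_{x/\kappa,\varphi}$, where $g_{y,\varphi}$ is defined as in (11.19) and $\varphi(t) = t^2 e^{-t^2/2}$, as usual. Write $y = x/\kappa$. We must estimate the constant $C_{\varphi,2,K}$ defined in (11.21):

    \[\begin{aligned}
    C_{\varphi,2,K} &= -\int_{1/K}^1 \varphi(w) \log w\,dw \le -\int_0^1 \varphi(w) \log w\,dw \\
    &\le -\int_0^1 w^2 e^{-w^2/2} \log w\,dw \le 0.093426,
    \end{aligned}\]

    where again we use VNODE-LP for rigorous numerical integration. Since $|\varphi|_1 = \sqrt{\pi/2}$ and $K = (\log y)/2$, this implies that

    \[\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log \frac{\log y}{2}} \tag{14.33}\]

    and so

    \[R_{y,K,\varphi,t} = \frac{0.07455}{\log \frac{\log y}{2}} R_{y/K,t} + \left(1 - \frac{0.07455}{\log \frac{\log y}{2}}\right) R_{y,t}. \tag{14.34}\]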
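The integral $-\int_0^1 w^2 e^{-w^2/2}\log w\,dw$ has a rapidly convergent series, obtained by expanding the exponential and using $-\int_0^1 w^m \log w\,dw = 1/(m+1)^2$. The following sketch sums that series in floating point; it is an alternative to, not a reproduction of, the VNODE-LP computation mentioned in the text, and the function name is ours.

```python
from math import factorial, pi, sqrt

def c_phi_2_limit(terms=30):
    """-int_0^1 w^2 exp(-w^2/2) log(w) dw, summed term by term.

    exp(-w^2/2) = sum_k (-1)^k w^(2k) / (2^k k!), and
    -int_0^1 w^(2k+2) log(w) dw = 1 / (2k+3)^2.
    """
    return sum((-1) ** k / (2 ** k * factorial(k) * (2 * k + 3) ** 2)
               for k in range(terms))

value = c_phi_2_limit()
print(value, value / sqrt(pi / 2))
```

Dividing by $|\varphi|_1 = \sqrt{\pi/2}$ gives the numerator $0.07455$ appearing in (14.33), up to rounding.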

    Let $t = 2r_0 = 300000$; we recall that $K = (\log y)/2$. Recall from (14.16) that $y = x/\kappa \ge 10^{25}$; thus, $y/K \ge 3.47435 \cdot 10^{23}$ and $\log((\log y)/2) \ge 3.35976$. Going back to the definition of $R_{x,t}$ in (11.13), we see that

    \[R_{y,2r_0} \le 0.27125 \log\left(1 + \frac{\log(8 \cdot 150000)}{2 \log \frac{9 \cdot (10^{25})^{1/3}}{2.004 \cdot 2 \cdot 150000}}\right) + 0.41415 \le 0.58341, \tag{14.35}\]

    \[R_{y/K,2r_0} \le 0.27125 \log\left(1 + \frac{\log(8 \cdot 150000)}{2 \log \frac{9 \cdot (3.47435 \cdot 10^{23})^{1/3}}{2.004 \cdot 2 \cdot 150000}}\right) + 0.41415 \le 0.60295, \tag{14.36}\]

    and so

    \[R_{y,K,\varphi,2r_0} \le \frac{0.07455}{3.35976} \cdot 0.60295 + \left(1 - \frac{0.07455}{3.35976}\right) 0.58341 \le 0.58385.\]

    Using

    \[z(r) = e^\gamma \log\log r + \frac{2.50637}{\log\log r} \le 5.42506,\]

    we see from (11.13) that

    \[L_{2r_0} = 5.42506 \cdot \left(\frac{13}{4} \log 300000 + 7.82\right) + 13.66 \log 300000 + 37.55 \le 474.608.\]

    Going back to (11.19), we sum up and obtain that

    \[\begin{aligned}
    g(r_0) &= \frac{(0.58385 \cdot \log 300000 + 0.5)\sqrt{5.42506} + 2.5}{\sqrt{2 \cdot 150000}} + \frac{474.608}{150000} + 3.36 \left(\frac{\log y}{2y}\right)^{1/6} \\
    &\le 0.041568.
    \end{aligned}\]

    Using again the bound $x \ge 4.9 \cdot 10^{26}$, we obtain

    \[\begin{aligned}
    &\frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-} \cdot S - (\sqrt{J} - \sqrt{E})^2 \\
    &\le \frac{13.9716}{\frac{1}{2} \log x + 0.6394} \cdot (0.640209\,x \log x - 0.021095x) - 8.6297x \\
    &\le 17.8895x - \frac{11.7332x}{\frac{1}{2} \log x + 0.6394} - 8.6297x \\
    &\le (17.8895 - 8.6297)x \le 9.2598x,
    \end{aligned}\]

    where $c^+ = 2.0532$ and $c^- = 0.6394$. Therefore,

    \[\begin{aligned}
    g(r_0) \cdot \left(\frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-} \cdot S - (\sqrt{J} - \sqrt{E})^2\right) &\le 0.041568 \cdot 9.2598x \\
    &\le 0.38492x.
    \end{aligned} \tag{14.37}\]

    This is one of the main terms.

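    Let $r_1 = (3/8)y^{4/15}$, where, as usual, $y = x/\kappa$ and $\kappa = 49$. Then

    \[\begin{aligned}
    R_{y,2r_1} &= 0.27125 \log\left(1 + \frac{\log\left(8 \cdot \frac{3}{8} y^{4/15}\right)}{2 \log \frac{9 y^{1/3}}{2.004 \cdot \frac{3}{4} y^{4/15}}}\right) + 0.41415 \\
    &= 0.27125 \log\left(1 + \frac{\frac{4}{15} \log y + \log 3}{2\left(\frac{1}{3} - \frac{4}{15}\right) \log y + 2 \log \frac{9}{2.004 \cdot \frac{3}{4}}}\right) + 0.41415 \\
    &\le 0.27125 \log\left(1 + \frac{\frac{4}{15}}{2\left(\frac{1}{3} - \frac{4}{15}\right)}\right) + 0.41415 \le 0.71215.
    \end{aligned} \tag{14.38}\]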

    Similarly, for $K = (\log y)/2$ (as usual),

    \[\begin{aligned}
    R_{y/K,2r_1} &= 0.27125 \log\left(1 + \frac{\log\left(8 \cdot \frac{3}{8} y^{4/15}\right)}{2 \log \frac{9 (y/K)^{1/3}}{2.004 \cdot \frac{3}{4} y^{4/15}}}\right) + 0.41415 \\
    &= 0.27125 \log\left(1 + \frac{\frac{4}{15} \log y + \log 3}{\frac{2}{15} \log y - \frac{2}{3} \log\log y + 2 \log \frac{9 \cdot 2^{1/3}}{2.004 \cdot \frac{3}{4}}}\right) + 0.41415 \\
    &= 0.27125 \log\left(3 + \frac{\frac{4}{3} \log\log y - c}{\frac{2}{15} \log y - \frac{2}{3} \log\log y + 2 \log \frac{12 \cdot 2^{1/3}}{2.004}}\right) + 0.41415,
    \end{aligned} \tag{14.39}\]

    where $c = 4 \log(12 \cdot 2^{1/3}/2.004) - \log 3$. Let

    \[f(t) = \frac{\frac{4}{3} \log t - c}{\frac{2}{15} t - \frac{2}{3} \log t + 2 \log \frac{12 \cdot 2^{1/3}}{2.004}}.\]

    The bisection method with 32 iterations shows that

    \[f(t) \le 0.019562618 \tag{14.40}\]

    for $180 \le t \le 30000$; since $f(t) < 0$ for $0 < t < 180$ (by $(4/3) \log t - c < 0$) and since, by $c > 20/3$, we have $f(t) < (5/2)(\log t)/t$ as soon as $t > (\log t)^2$ (and so, in particular, for $t > 30000$), we see that (14.40) is valid for all $t > 0$. Therefore,

    \[R_{y/K,2r_1} \le 0.71392, \tag{14.41}\]
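The size of the maximum of $f$ on $[180, 30000]$ can be illustrated by sampling on a fine grid. This is a plain floating-point scan, assumed here only as a plausibility check; it does not replace the rigorous bisection with interval arithmetic behind (14.40). The names `A`, `C`, `best_t` and `best_val` are ours.

```python
from math import log

A = 2 * log(12 * 2 ** (1 / 3) / 2.004)   # constant term of the denominator of f
C = 2 * A - log(3)                        # c = 4 log(12 * 2^(1/3) / 2.004) - log 3

def f(t):
    """The function f(t) defined before (14.40)."""
    return ((4 / 3) * log(t) - C) / ((2 / 15) * t - (2 / 3) * log(t) + A)

# Sample [180, 30000] with step 0.1 and keep the largest value seen.
best_t, best_val = max(((t, f(t)) for t in (180 + 0.1 * i for i in range(298201))),
                       key=lambda pair: pair[1])
print(best_t, best_val)
```

The sampled maximum occurs for $t$ slightly above $500$ and lies just below the bound in (14.40); substituting it into (14.39) gives a value consistent with (14.41).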

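    and so, by (14.34), we conclude that

    \[R_{y,K,\varphi,2r_1} \le \frac{0.07455}{3.35976} \cdot 0.71392 + \left(1 - \frac{0.07455}{3.35976}\right) \cdot 0.71215 \le 0.71219.\]

    Since $r_1 = (3/8)y^{4/15}$ and $z(r)$ is increasing for $r \ge 27$, we know that

    \[\begin{aligned}
    z(r_1) \le z(y^{4/15}) &= e^\gamma \log\log y^{4/15} + \frac{2.50637}{\log\log y^{4/15}} \\
    &= e^\gamma \log\log y + \frac{2.50637}{\log\log y - \log\frac{15}{4}} - e^\gamma \log\frac{15}{4} \le e^\gamma \log\log y - 1.43644
    \end{aligned} \tag{14.42}\]

    for $y \ge 10^{25}$. Hence, (11.13) gives us that

    \[\begin{aligned}
    L_{2r_1} &\le (e^\gamma \log\log y - 1.43644) \left(\frac{13}{4} \log \frac{3}{4} y^{4/15} + 7.82\right) + 13.66 \log \frac{3}{4} y^{4/15} + 37.55 \\
    &\le \frac{13}{15} e^\gamma \log y \log\log y + 2.39776 \log y + 12.2628 \log\log y + 23.7304 \\
    &\le (2.13522 \log y + 18.118) \log\log y.
    \end{aligned}\]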

    Moreover, again by (14.42),

    \[\sqrt{z(r_1)} \le \sqrt{e^\gamma \log\log y} - \frac{1.43644}{2\sqrt{e^\gamma \log\log y}},\]

    and so, by $y \ge 10^{25}$,

    \[\begin{aligned}
    &\left(0.71219 \log \frac{3}{4} y^{4/15} + 0.5\right) \sqrt{z(r_1)} \\
    &\le (0.18992 \log y + 0.29512) \left(\sqrt{e^\gamma \log\log y} - \frac{1.43644}{2\sqrt{e^\gamma \log\log y}}\right) \\
    &\le 0.19505 \log y \sqrt{e^\gamma \log\log y} - \frac{0.19505 \cdot 1.43644 \log y}{2\sqrt{e^\gamma \log\log y}} \\
    &\le 0.26031 \log y \sqrt{\log\log y} - 3.00147.
    \end{aligned}\]

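    Therefore, by (11.19),

    \[\begin{aligned}
    g_{y,\varphi}(r_1) &\le \frac{0.26031 \log y \sqrt{\log\log y} + 2.5 - 3.00147}{\sqrt{\frac{3}{4} y^{4/15}}} + \frac{(2.13522 \log y + 18.118) \log\log y}{\frac{3}{8} y^{4/15}} + \frac{3.36 ((\log y)/2)^{1/6}}{y^{1/6}} \\
    &\le \frac{0.30059 \log y \sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392 \log y \log\log y}{y^{4/15}} - \frac{0.57904}{y^{2/15}} + \frac{48.3147 \log\log y}{y^{4/15}} + \frac{2.994 (\log y)^{1/6}}{y^{1/6}} \\
    &\le \frac{0.30059 \log y \sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392 \log y \log\log y}{y^{4/15}} + \frac{1.30151 (\log y)^{1/6}}{y^{1/6}} \\
    &\le \frac{0.30915 \log y \sqrt{\log\log y}}{y^{2/15}},
    \end{aligned}\]

    where we use $y \ge 10^{25}$ and verify that the functions $t \mapsto (\log t)^{1/6}/t^{1/6 - 2/15}$, $t \mapsto \sqrt{\log\log t}/t^{4/15 - 2/15}$ and $t \mapsto (\log\log t)/t^{4/15 - 2/15}$ are decreasing for $t \ge y$ (just by taking derivatives).

    Since $\kappa = 49$, one of the terms in (13.13) simplifies easily:

    \[\frac{7}{15} + \frac{-2.14938 + \frac{8}{15} \log \kappa}{\log x + 2c^-} \le \frac{7}{15}.\]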

    By (14.30) and $y = x/\kappa = x/49$, we conclude that

    \[\begin{aligned}
    \frac{7}{15} g(r_1) S &\le \frac{7}{15} \cdot \frac{0.30915 \log y \sqrt{\log\log y}}{y^{2/15}} \cdot (0.640209 \log x - 0.021095)\,x \\
    &\le \frac{0.14427 \log y \sqrt{\log\log y}}{y^{2/15}} (0.640209 \log y + 2.4705)\,x \le 0.30517x,
    \end{aligned} \tag{14.43}\]

    where we are using the fact that $y \mapsto (\log y)^2 \sqrt{\log\log y}/y^{2/15}$ is decreasing for $y \ge 10^{25}$ (because $y \mapsto (\log y)^{5/2}/y^{2/15}$ is decreasing for $y \ge e^{75/4}$, and $10^{25} > e^{75/4}$).

    It remains only to bound

    \[\frac{2S}{\log x + 2c^-} \int_{r_0}^{r_1} \frac{g(r)}{r}\,dr\]

    in the expression (13.13) for $M$. We will use the bound on the integral given in (13.33). The easiest term to bound there is $f_1(r_0)$, defined in (13.34), since it depends only on $r_0$: for $r_0 = 150000$,

    \[f_1(r_0) = 0.0169073\dotsc\]

    It is also not hard to bound $f_2(r_0, x)$, also defined in (13.34):

    \[\begin{aligned}
    f_2(r_0, y) &= 3.36 \frac{((\log y)/2)^{1/6}}{x^{1/6}} \log \frac{\frac{3}{8} y^{4/15}}{r_0} \\
    &\le 3.36 \frac{(\log y)^{1/6}}{(2y)^{1/6}} \left(\frac{4}{15} \log y + 0.05699 - \log r_0\right),
    \end{aligned}\]

    where we recall again that $x = \kappa y = 49y$. Thus, since $r_0 = 150000$ and $y \ge 10^{25}$,

    \[f_2(r_0, y) \le 0.001399.\]

    Let us now look at the terms $I_{1,r}$, $c_\varphi$ in (13.35). We already saw in (14.33) that

    \[c_\varphi = \frac{C_{\varphi,2}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log \frac{\log y}{2}} \le 0.02219.\]

    Since $F(t) = e^\gamma \log t + c_\gamma$ with $c_\gamma = 1.025742$,

    \[I_{1,r_0} = F(\log r_0) + \frac{2e^\gamma}{\log r_0} = 5.73826\dotsc \tag{14.44}\]

    It thus remains only to estimate $I_{0,r_0,r_1,z}$ for $z = y$ and $z = y/K$, where $K = (\log y)/2$.

    We will first give estimates for $y$ large. Omitting negative terms from (13.35), we easily get the following general bound, crude but useful enough:

    \[I_{0,r_0,r_1,z} \le R_{z,2r_0}^2 \cdot \frac{P_2(\log 2r_0)}{\sqrt{r_0}} + \frac{R_{z,2r_1}^2 - 0.41415^2}{\log \frac{r_1}{r_0}} \cdot \frac{P_2^-(\log 2r_0)}{\sqrt{r_0}},\]

    where $P_2(t) = t^2 + 4t + 8$ and $P_2^-(t) = 2t^2 + 16t + 48$. By (14.38) and (14.41),

    \[R_{y,2r_1} \le 0.71215, \qquad R_{y/K,2r_1} \le 0.71392\]

    for $y \ge 10^{25}$. Assume now that $y \ge 10^{150}$. Then, since $r_0 = 150000$,

    \[R_{y,r_0} \le 0.27125 \log\left(1 + \frac{\log 4r_0}{2 \log \frac{9 \cdot (10^{150})^{1/3}}{2.004 r_0}}\right) + 0.41415 \le 0.43086,\]

    and, similarly, $R_{y/K,r_0} \le 0.43113$. Since

    \[0.43086^2 \cdot \frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10426, \qquad 0.43113^2 \cdot \frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10439,\]

    we obtain that

    \[\begin{aligned}
    &(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi \sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \\
    &\le 0.97781 \cdot \sqrt{0.10426 + \frac{0.49214}{\frac{4}{15} \log y - \log 400000}} + 0.02219 \sqrt{0.10439 + \frac{0.49584}{\frac{4}{15} \log y - \log 400000}} \\
    &\le 0.33239
    \end{aligned} \tag{14.45}\]

    for $y \ge 10^{150}$.

    For $y$ between $10^{25}$ and $10^{150}$, we evaluate the left side of (14.45) directly, using the definition (13.35) of $I_{0,r_0,r_1,z}$ instead, as well as the bound

    \[c_\varphi \le \frac{0.07455}{\log \frac{\log y}{2}}\]

    from (14.33). (It is clear from the second and third lines of (13.32) that $I_{0,r_0,r_1,z}$ is decreasing on $z$ for $r_0$, $r_1$ fixed, and so the upper bound for $c_\varphi$ does give the worst case.) The bisection method (applied to the interval $[25, 150]$ with 30 iterations, including 30 initial iterations) gives us that

    \[(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi \sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \le 0.4153461 \tag{14.46}\]

    for $10^{25} \le y \le 10^{150}$. By (14.45), (14.46) is also true for $y > 10^{150}$. Hence

    \[f_0(r_0, y) \le 0.4153461 \cdot \sqrt{\frac{2}{\sqrt{r_0}} \cdot 5.73827} \le 0.071498.\]

    By (13.33), we conclude that

    \[\int_{r_0}^{r_1} \frac{g(r)}{r}\,dr \le 0.071498 + 0.016908 + 0.001399 \le 0.089805.\]

    By (14.30),

    \[\frac{2S}{\log x + 2c^-} \le \frac{2(0.640209\,x \log x - 0.021095x)}{\log x + 2c^-} \le 2 \cdot 0.640209x = 1.280418x,\]

    where we recall that $c^- = 0.6394 > 0$. Hence

    \[\frac{2S}{\log x + 2c^-} \int_{r_0}^{r_1} \frac{g(r)}{r}\,dr \le 0.114988x. \tag{14.47}\]

    Putting (14.37), (14.43) and (14.47) together, we conclude that the quantity $M$ defined in (13.13) is bounded by

    \[M \le 0.38492x + 0.30517x + 0.114988x \le 0.80508x. \tag{14.48}\]

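    Gathering the terms from (14.29), (14.32) and (14.48), we see that Theorem 13.2.1 states that the minor-arc total

    \[Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\,|S_{\eta_+}(\alpha, x)|^2\,d\alpha\]

    is bounded by

    \[\begin{aligned}
    Z_{r_0} &\le \left(\sqrt{\frac{|\varphi|_1 x}{\kappa}(M + T)} + \sqrt{S_{\eta_*}(0, x) \cdot E}\right)^2 \\
    &\le \left(\sqrt{|\varphi|_1 (0.80508 + 3.5776 \cdot 10^{-4})}\,\frac{x}{\sqrt{\kappa}} + \sqrt{1.0532 \cdot 10^{-11}}\,\frac{x}{\sqrt{\kappa}}\right)^2 \\
    &\le 1.00948 \frac{x^2}{\kappa}
    \end{aligned} \tag{14.49}\]

    for $r_0 = 150000$, $x \ge 4.9 \cdot 10^{26}$, where we use yet again the fact that $|\varphi|_1 = \sqrt{\pi/2}$. This is our total minor-arc bound.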

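    14.4 Conclusion: proof of main theorem

    As we have known from the start,

    \[\begin{aligned}
    &\sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) \\
    &= \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\,e(-N\alpha)\,d\alpha.
    \end{aligned} \tag{14.50}\]

    We have just shown that, assuming $N \ge 10^{27}$, $N$ odd,

    \[\begin{aligned}
    &\int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\,e(-N\alpha)\,d\alpha \\
    &= \int_{\mathfrak{M}_{8,r_0}} S_{\eta_+}(\alpha, x)^2 S_{\eta_*}(\alpha, x)\,e(-N\alpha)\,d\alpha + O^*\left(\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2 |S_{\eta_*}(\alpha, x)|\,d\alpha\right) \\
    &\ge 1.058259 \frac{x^2}{\kappa} + O^*\left(1.00948 \frac{x^2}{\kappa}\right) \ge 0.04877 \frac{x^2}{\kappa}
    \end{aligned}\]

    for $r_0 = 150000$, where $x = N/(2 + 9/(196\sqrt{2\pi}))$, as in (14.15). (We are using (14.27) and (14.49).) Recall that $\kappa = 49$ and $\eta_*(t) = (\eta_2 \ast_M \varphi)(\kappa t)$, where $\varphi(t) = t^2 e^{-t^2/2}$.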

    It only remains to show that the contribution of terms with $n_1$, $n_2$ or $n_3$ non-prime to the sum in (14.50) is negligible. (Let us take out $n_1$, $n_2$, $n_3$ equal to $2$ as well, since some prefer to state the ternary Goldbach conjecture as follows: every odd number $\ge 9$ is the sum of three odd primes.) Clearly

    \[\begin{aligned}
    &\sum_{\substack{n_1 + n_2 + n_3 = N \\ n_1, n_2 \text{ or } n_3 \text{ even or non-prime}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) \\
    &\le 3 |\eta_+|_\infty^2 |\eta_*|_\infty \sum_{\substack{n_1 + n_2 + n_3 = N \\ n_1 \text{ even or non-prime}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3) \\
    &\le 3 |\eta_+|_\infty^2 |\eta_*|_\infty \cdot (\log N) \sum_{\substack{n_1 \le N \text{ non-prime} \\ \text{or } n_1 = 2}} \Lambda(n_1) \sum_{n_2 \le N} \Lambda(n_2).
    \end{aligned} \tag{14.51}\]

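    By (14.3) and (14.21), $|\eta_+|_\infty \le 1.079955$ and $|\eta_*|_\infty \le 1.414$. By [RS62, Thms. 12 and 13],

    \[\begin{aligned}
    \sum_{\substack{n_1 \le N \text{ non-prime} \\ \text{or } n_1 = 2}} \Lambda(n_1) &< 1.4262\sqrt{N} + \log 2 < 1.4263\sqrt{N}, \\
    \sum_{\substack{n_1 \le N \text{ non-prime} \\ \text{or } n_1 = 2}} \Lambda(n_1) \sum_{n_2 \le N} \Lambda(n_2) &= 1.4263\sqrt{N} \cdot 1.03883N \le 1.48169 N^{3/2}.
    \end{aligned}\]

    Hence, the sum on the first line of (14.51) is at most

    \[7.3306 N^{3/2} \log N.\]

    Thus, for $N \ge 10^{27}$ odd,

    \[\begin{aligned}
    &\sum_{\substack{n_1 + n_2 + n_3 = N \\ n_1, n_2, n_3 \text{ odd primes}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) \\
    &\ge 0.04877 \frac{x^2}{\kappa} - 7.3306 N^{3/2} \log N \\
    &\ge 0.00024433 N^2 - 1.4412 \cdot 10^{-11} \cdot N^2 \ge 0.0002443 N^2
    \end{aligned}\]

    by $\kappa = 49$ and (14.15). Since $0.0002443 N^2 > 0$, this shows that every odd number $N \ge 10^{27}$ can be written as the sum of three odd primes.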
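The last chain of inequalities is pure arithmetic on constants already established, and can be re-traced as follows. The sketch uses floating point and the worst case $N = 10^{27}$; all constants are taken from the text, and the variable names are ours.

```python
from math import log, pi, sqrt

KAPPA = 49
N = 1e27

ratio = 1 / (2 + 9 / (196 * sqrt(2 * pi)))     # x / N, as in (14.15)
margin = 1.058259 - 1.00948                     # major-arc bound minus minor-arc bound
main = margin * ratio ** 2 / KAPPA              # coefficient of N^2 in the lower bound
non_prime = 3 * 1.079955 ** 2 * 1.414 * 1.4263 * 1.03883   # coefficient of N^(3/2) log N
correction = non_prime * log(N) / sqrt(N)       # the same term as a multiple of N^2
print(margin, main, non_prime, correction, main - correction)
```

Since $(\log N)/\sqrt{N}$ is decreasing for $N \ge 10^{27}$, the correction term only gets smaller relative to $N^2$ as $N$ grows.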

    Since the ternary Goldbach conjecture has already been checked for all $N \le 8.875 \cdot 10^{30}$ [HP13], we conclude that every odd number $N > 7$ can be written as the sum of three odd primes, and every odd number $N > 5$ can be written as the sum of three primes. The main result is hereby proven: the ternary Goldbach conjecture is true.

    Part IV

    Appendices


    Appendix A

    Norms of smoothing functions

    Our aim here is to give bounds on the norms of some smoothing functions and, in particular, on several norms of a smoothing function $\eta_+ : [0, \infty) \to \mathbb{R}$ based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

    As before, we write

    \[h : t \mapsto \begin{cases} t^2 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{A.1}\]

    We recall that we will work with an approximation $\eta_+$ to the function $\eta_\circ : [0, \infty) \to \mathbb{R}$ defined by

    \[\eta_\circ(t) = h(t)\,\eta_\heartsuit(t) = \begin{cases} t^3 (2 - t)^3 e^{-(t-1)^2/2} & \text{for } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{A.2}\]

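    The approximation $\eta_+$ is defined by

    \[\eta_+(t) = h_H(t)\,t e^{-t^2/2}, \tag{A.3}\]

    where

    \[\begin{aligned}
    F_H(t) &= \frac{\sin(H \log t)}{\pi \log t}, \\
    h_H(t) &= (h \ast_M F_H)(t) = \int_0^\infty h(t y^{-1}) F_H(y) \frac{dy}{y},
    \end{aligned} \tag{A.4}\]

    and $H$ is a positive constant to be set later. By (2.8), $Mh_H = Mh \cdot MF_H$. Now $F_H$ is just a Dirichlet kernel under a change of variables; using this, we get that, for $\tau$ real,

    \[MF_H(i\tau) = \begin{cases} 1 & \text{if } |\tau| < H, \\ 1/2 & \text{if } |\tau| = H, \\ 0 & \text{if } |\tau| > H. \end{cases} \tag{A.5}\]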

    Thus,

    \[Mh_H(i\tau) = \begin{cases} Mh(i\tau) & \text{if } |\tau| < H, \\ \frac{1}{2} Mh(i\tau) & \text{if } |\tau| = H, \\ 0 & \text{if } |\tau| > H. \end{cases} \tag{A.6}\]

    As it turns out, $h$, $\eta_\circ$ and $Mh$ (and hence $Mh_H$) are relatively easy to work with, whereas we can already see that $h_H$ and $\eta_+$ have more complicated definitions. Part of our work will consist in expressing norms of $h_H$ and $\eta_+$ in terms of norms of $h$, $\eta_\circ$ and $Mh$.

    A.1 The decay of a Mellin transform

    Now, consider any $\phi : [0, \infty) \to \mathbb{C}$ that (a) has compact support (or fast decay), (b) satisfies $\phi^{(k)}(t) t^{k-1} = O(1)$ for $t \to 0^+$ and $0 \le k \le 3$, and (c) is $C^2$ everywhere and quadruply differentiable outside a finite set of points.

    By definition,

    \[M\phi(s) = \int_0^\infty \phi(x) x^s \frac{dx}{x}.\]

    Thus, by integration by parts, for $\Re(s) > -1$ and $s \ne 0$,

    \[\begin{aligned}
    M\phi(s) &= \int_0^\infty \phi(x) x^s \frac{dx}{x} = \lim_{t \to 0^+} \int_t^\infty \phi(x) x^s \frac{dx}{x} = -\lim_{t \to 0^+} \int_t^\infty \phi'(x) \frac{x^s}{s}\,dx \\
    &= \lim_{t \to 0^+} \int_t^\infty \phi''(x) \frac{x^{s+1}}{s(s+1)}\,dx = \lim_{t \to 0^+} -\int_t^\infty \phi^{(3)}(x) \frac{x^{s+2}}{s(s+1)(s+2)}\,dx \\
    &= \lim_{t \to 0^+} \int_t^\infty \phi^{(4)}(x) \frac{x^{s+3}}{s(s+1)(s+2)(s+3)}\,dx,
    \end{aligned} \tag{A.7}\]

    where $\phi^{(4)}(x)$ is understood in the sense of distributions at the finitely many points where it is not well-defined as a function.

    Let $s = it$, $\phi = h$. Let $C_k = \lim_{t \to 0^+} \int_t^\infty |h^{(k)}(x)| x^{k-1}\,dx$ for $0 \le k \le 4$. Then (A.7) gives us that

    \[|Mh(it)| \le \min\left(C_0, \frac{C_1}{|t|}, \frac{C_2}{|t||t+i|}, \frac{C_3}{|t||t+i||t+2i|}, \frac{C_4}{|t||t+i||t+2i||t+3i|}\right). \tag{A.8}\]

    We must estimate the constants $C_j$, $0 \le j \le 4$.

    Clearly $h(t) t^{-1} = O(1)$ as $t \to 0^+$, $h^{(k)}(t) = O(1)$ as $t \to 0^+$ for all $k \ge 1$, $h(2) = h'(2) = h''(2) = 0$, and $h(x)$, $h'(x)$ and $h''(x)$ are all continuous. The function $h'''$ has a discontinuity at $t = 2$. As we said, we understand $h^{(4)}$ in the sense of distributions at $t = 2$; for example, $\lim_{\epsilon \to 0} \int_{2-\epsilon}^{2+\epsilon} h^{(4)}(t)\,dt = \lim_{\epsilon \to 0} (h^{(3)}(2 + \epsilon) - h^{(3)}(2 - \epsilon))$.

    Symbolic integration easily gives that

    \[C_0 = \int_0^2 t (2 - t)^3 e^{t - 1/2}\,dt = 92 e^{-1/2} - 12 e^{3/2} = 2.02055184\dotsc \tag{A.9}\]

    We will have to compute $C_k$, $1 \le k \le 4$, with some care, due to the absolute value involved in the definition.

    The function $(x^2 (2 - x)^3 e^{x - 1/2})' = ((x^2 (2 - x)^3)' + x^2 (2 - x)^3) e^{x - 1/2}$ has the same zeros as $H_1(x) = (x^2 (2 - x)^3)' + x^2 (2 - x)^3$, namely, $-4$, $0$, $1$ and $2$. The sign of $H_1(x)$ (and hence of $h'(x)$) is $+$ within $(0, 1)$ and $-$ within $(1, 2)$. Hence

    \[C_1 = \int_0^\infty |h'(x)|\,dx = |h(1) - h(0)| + |h(2) - h(1)| = 2h(1) = 2\sqrt{e}. \tag{A.10}\]

    The situation with $(x^2 (2 - x)^3 e^{x - 1/2})''$ is similar: it has zeros at the roots of $H_2(x) = 0$, where $H_2(x) = H_1(x) + H_1'(x)$ (and, in general, $H_{k+1}(x) = H_k(x) + H_k'(x)$). This time, we will prefer to find the roots numerically. It is enough to find (candidates for) the roots using any available tool¹ and then check rigorously that the sign does change around the purported roots. In this way, we check that $H_2(x) = 0$ has two roots $\alpha_{2,1}$, $\alpha_{2,2}$ in the interval $(0, 2)$, another root at $2$, and two more roots outside $[0, 2]$; moreover,

    \[\alpha_{2,1} = 0.48756597185712\dotsc, \qquad \alpha_{2,2} = 1.48777169309489\dotsc, \tag{A.11}\]

    where we verify the root using interval arithmetic. The sign of $H_2(x)$ (and hence of $h''(x)$) is first $+$, then $-$, then $+$. Write $\alpha_{2,0} = 0$, $\alpha_{2,3} = 2$. By integration by parts,

    \[\begin{aligned}
    C_2 &= \int_0^\infty |h''(x)|\,x\,dx = \int_0^{\alpha_{2,1}} h''(x)\,x\,dx - \int_{\alpha_{2,1}}^{\alpha_{2,2}} h''(x)\,x\,dx + \int_{\alpha_{2,2}}^2 h''(x)\,x\,dx \\
    &= \sum_{j=1}^3 (-1)^{j+1} \left(h'(x)\,x \Big|_{\alpha_{2,j-1}}^{\alpha_{2,j}} - \int_{\alpha_{2,j-1}}^{\alpha_{2,j}} h'(x)\,dx\right) \\
    &= 2 \sum_{j=1}^2 (-1)^{j+1} \left(h'(\alpha_{2,j})\,\alpha_{2,j} - h(\alpha_{2,j})\right) = 10.79195821037\dotsc
    \end{aligned} \tag{A.12}\]

    ¹ Routine find_root in SAGE was used here.

    To compute $C_3$, we proceed in the same way, finding two roots of $H_3(x) = 0$ (numerically) within the interval $(0, 2)$, viz.,

    \[\alpha_{3,1} = 1.04294565694978\dotsc, \qquad \alpha_{3,2} = 1.80999654602916\dotsc\]

    The sign of $H_3(x)$ on the interval $[0, 2]$ is first $-$, then $+$, then $-$. Write $\alpha_{3,0} = 0$, $\alpha_{3,3} = 2$. Proceeding as before (with the only difference that the integration by parts is iterated once now), we obtain that

    \[\begin{aligned}
    C_3 &= \int_0^\infty |h'''(x)|\,x^2\,dx = \sum_{j=1}^3 (-1)^j \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h'''(x)\,x^2\,dx \\
    &= \sum_{j=1}^3 (-1)^j \left(h''(x)\,x^2 \Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} - \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h''(x) \cdot 2x\,dx\right) \\
    &= \sum_{j=1}^3 (-1)^j \left(h''(x)\,x^2 - h'(x) \cdot 2x + 2h(x)\right) \Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} \\
    &= 2 \sum_{j=1}^2 (-1)^j \left(h''(\alpha_{3,j})\,\alpha_{3,j}^2 - 2h'(\alpha_{3,j})\,\alpha_{3,j} + 2h(\alpha_{3,j})\right)
    \end{aligned} \tag{A.13}\]

    and so interval arithmetic gives us

    \[C_3 = 75.1295251672\dotsc \tag{A.14}\]
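The constants $C_1$, $C_2$, $C_3$ can be cross-checked without locating any roots, by integrating $|h^{(k)}(x)|x^{k-1}$ numerically over $[0, 2]$, using $h^{(k)}(x) = H_k(x)e^{x-1/2}$ and the recursion $H_{k+1} = H_k + H_k'$. The sketch below uses composite Simpson's rule in floating point, so it is only an approximate check and not the interval-arithmetic computation of the text; all function names are ours.

```python
from math import exp, sqrt

def poly_eval(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients (Horner's rule)."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def next_h(coeffs):
    """Coefficients of H_{k+1} = H_k + H_k' from those of H_k."""
    deriv = [k * c for k, c in enumerate(coeffs)][1:] + [0]
    return [a + b for a, b in zip(coeffs, deriv)]

def c_k(k, steps=20000):
    """Simpson approximation of int_0^2 |h^(k)(x)| x^(k-1) dx, for k >= 1."""
    coeffs = [0, 0, 8, -12, 6, -1]            # H_0(x) = x^2 (2 - x)^3
    for _ in range(k):
        coeffs = next_h(coeffs)

    def integrand(x):
        return abs(poly_eval(coeffs, x)) * exp(x - 0.5) * x ** (k - 1)

    step = 2 / steps                          # steps must be even
    total = integrand(0.0) + integrand(2.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3

for k in (1, 2, 3):
    print(k, c_k(k))
```

Because the integrand has kinks at the sign changes of $H_k$, Simpson's rule loses some of its usual accuracy there; the agreement with (A.10), (A.12) and (A.14) should still be good to several decimal places.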

    The treatment of the integral in $C_4$ is very similar, at least at first. There are two roots of $H_4(x) = 0$ in the interval $(0, 2)$, namely,

    \[\alpha_{4,1} = 0.45839599852663\dotsc, \qquad \alpha_{4,2} = 1.54626346975533\dotsc\]

    The sign of $H_4(x)$ on the interval $[0, 2]$ is first $-$, then $+$, then $-$. Using integration by parts as before, we obtain

    \[\begin{aligned}
    &\int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx \\
    &= -\int_{0^+}^{\alpha_{4,1}} h^{(4)}(x)\,x^3\,dx + \int_{\alpha_{4,1}}^{\alpha_{4,2}} h^{(4)}(x)\,x^3\,dx - \int_{\alpha_{4,2}}^{2^-} h^{(4)}(x)\,x^3\,dx \\
    &= 2 \sum_{j=1}^2 (-1)^j \left(h^{(3)}(\alpha_{4,j})\,\alpha_{4,j}^3 - 3h^{(2)}(\alpha_{4,j})\,\alpha_{4,j}^2 + 6h'(\alpha_{4,j})\,\alpha_{4,j} - 6h(\alpha_{4,j})\right) \\
    &\quad - \lim_{t \to 2^-} h^{(3)}(t)\,t^3 = 1152.69754862\dotsc,
    \end{aligned}\]

    since $\lim_{t \to 0^+} h^{(k)}(t)\,t^k = 0$ for $0 \le k \le 3$, $\lim_{t \to 2^-} h^{(k)}(t) = 0$ for $0 \le k \le 2$ and $\lim_{t \to 2^-} h^{(3)}(t) = -24e^{3/2}$. Now

    \[\int_{2^-}^\infty |h^{(4)}(x)\,x^3|\,dx = \lim_{\epsilon \to 0^+} |h^{(3)}(2 + \epsilon) - h^{(3)}(2 - \epsilon)| \cdot 2^3 = 2^3 \cdot 24e^{3/2}.\]

    Hence

    \[C_4 = \int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx + 24e^{3/2} \cdot 2^3 = 2013.18185012\dotsc \tag{A.15}\]


    We finish by remarking that we can write down $Mh$ explicitly:

    \[Mh = -e^{-1/2}(-1)^{-s}\left(8\gamma(s+2,-2) + 12\gamma(s+3,-2) + 6\gamma(s+4,-2) + \gamma(s+5,-2)\right), \tag{A.16}\]

    where $\gamma(s, x)$ is the (lower) incomplete Gamma function

    \[\gamma(s, x) = \int_0^x e^{-t} t^{s-1}\, dt.\]

    We will, however, find it easier to deal with $Mh$ by means of the bound (A.8), in part because (A.16) amounts to an invitation to numerical instability.

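    For instance, it is easy to use (A.8) to give a bound for the $\ell_1$-norm of $Mh(it)$. Since $C_4/C_3 > C_3/C_2 > C_2/C_1 > C_1/C_0$,

    \[\begin{aligned}
    |Mh(it)|_1 &= 2\int_0^\infty |Mh(it)|\, dt \\
    &\le 2\left( C_0\frac{C_1}{C_0} + C_1 \int_{C_1/C_0}^{C_2/C_1} \frac{dt}{t} + C_2 \int_{C_2/C_1}^{C_3/C_2} \frac{dt}{t^2} + C_3 \int_{C_3/C_2}^{C_4/C_3} \frac{dt}{t^3} + C_4 \int_{C_4/C_3}^{\infty} \frac{dt}{t^4} \right) \\
    &= 2\left( C_1 + C_1 \log\frac{C_2 C_0}{C_1^2} + C_2\left(\frac{C_1}{C_2} - \frac{C_2}{C_3}\right) + \frac{C_3}{2}\left(\frac{C_2^2}{C_3^2} - \frac{C_3^2}{C_4^2}\right) + \frac{C_4}{3}\cdot\frac{C_3^3}{C_4^3} \right),
    \end{aligned}\]

    and so

    \[|Mh(it)|_1 \le 16.1939176. \tag{A.17}\]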
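As a sanity check on (A.17), the closed-form expression can be evaluated directly. This is a sketch under two assumptions not stated in this passage: that $C_k = \int_0^\infty |h^{(k)}(x)| x^{k-1} dx$ and that $h(x) = x^2(2-x)^3e^{x-1/2}$ on $[0,2]$, which give $C_0 = (92 - 12e^2)e^{-1/2}$ and $C_1 = 2h(1) = 2e^{1/2}$ in closed form. The values of $C_2$, $C_3$, $C_4$ are the ones quoted in (A.12), (A.14) and (A.15).

```python
import math

# C0 and C1 follow from the assumed h; C2..C4 are the quoted values.
C0 = (92 - 12 * math.e**2) * math.exp(-0.5)   # int_0^2 h(x)/x dx
C1 = 2 * math.exp(0.5)                         # int |h'(x)| dx = 2 h(1)
C2 = 10.79195821037
C3 = 75.1295251672
C4 = 2013.18185012

# The piecewise bound needs the ratios C_{k+1}/C_k to be increasing.
ratios = [C1 / C0, C2 / C1, C3 / C2, C4 / C3]
ratios_increasing = all(ratios[k] < ratios[k + 1] for k in range(3))

bound_A17 = 2 * (
    C1
    + C1 * math.log(C2 * C0 / C1**2)
    + C2 * (C1 / C2 - C2 / C3)
    + (C3 / 2) * (C2**2 / C3**2 - C3**2 / C4**2)
    + (C4 / 3) * (C3**3 / C4**3)
)
print(ratios_increasing, bound_A17)
```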

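    This bound is far from tight, but it will certainly be useful. Similarly, $|(t+i)Mh(it)|_1$ is at most two times

    \[\begin{aligned}
    & C_0 \int_0^{C_1/C_0} |t+i|\, dt + C_1 \int_{C_1/C_0}^{C_2/C_1} \left|1 + \frac{i}{t}\right| dt + C_2 \int_{C_2/C_1}^{C_3/C_2} \frac{dt}{t} + C_3 \int_{C_3/C_2}^{C_4/C_3} \frac{dt}{t^2} + C_4 \int_{C_4/C_3}^{\infty} \frac{dt}{t^3} \\
    &= \frac{C_0}{2}\left( \sqrt{\frac{C_1^4}{C_0^4} + \frac{C_1^2}{C_0^2}} + \sinh^{-1}\frac{C_1}{C_0} \right) + C_1 \left( \sqrt{t^2+1} + \log\left(\frac{\sqrt{t^2+1} - 1}{t}\right) \right)\Bigg|_{C_1/C_0}^{C_2/C_1} \\
    &\quad + C_2 \log\frac{C_3 C_1}{C_2^2} + C_3\left(\frac{C_2}{C_3} - \frac{C_3}{C_4}\right) + \frac{C_4}{2}\cdot\frac{C_3^2}{C_4^2},
    \end{aligned}\]

    and so

    \[|(t+i)Mh(it)|_1 \le 27.8622803. \tag{A.18}\]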

    A.2 The difference $\eta_+ - \eta$ in $\ell_2$ norm

    We wish to estimate the distance in $\ell_2$ norm between $\eta$ and its approximation $\eta_+$. This will be an easy affair, since, on the imaginary axis, the Mellin transform of $\eta_+$ is just a truncation of the Mellin transform of $\eta$.


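    By (A.2) and (A.3),

    \[\begin{aligned}
    |\eta_+ - \eta|_2^2 &= \int_0^\infty \left| h_H(t)\, t e^{-t^2/2} - h(t)\, t e^{-t^2/2} \right|^2 dt \\
    &\le \left( \max_{t\ge 0} e^{-t^2} t^3 \right) \cdot \int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t}.
    \end{aligned} \tag{A.19}\]

    The maximum $\max_{t\ge 0} t^3 e^{-t^2}$ is $(3/2)^{3/2} e^{-3/2}$. Since the Mellin transform is an isometry (i.e., (2.6) holds),

    \[\int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mh_H(it) - Mh(it)|^2\, dt = \frac{1}{\pi} \int_H^\infty |Mh(it)|^2\, dt. \tag{A.20}\]

    By (A.8),

    \[\int_H^\infty |Mh(it)|^2\, dt \le \int_H^\infty \frac{C_4^2}{t^8}\, dt \le \frac{C_4^2}{7H^7}. \tag{A.21}\]

    Hence

    \[\int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t} \le \frac{C_4^2}{7\pi H^7}. \tag{A.22}\]

    Using the bound (A.15) for $C_4$, we conclude that

    \[|\eta_+ - \eta|_2 \le \frac{C_4}{\sqrt{7\pi}} \left(\frac{3}{2e}\right)^{3/4} \cdot \frac{1}{H^{7/2}} \le \frac{274.856893}{H^{7/2}}. \tag{A.23}\]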
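The constant in (A.23) is simple arithmetic on the quoted value of $C_4$, and can be checked with a few lines. The function name `eta_diff_bound` and the sample value $H = 200$ below are illustrative choices, not taken from this passage.

```python
import math

C4 = 2013.18185012                      # value quoted in (A.15)

# Constant in (A.23): C4 / sqrt(7 pi) * (3 / (2e))^(3/4)
const_A23 = C4 / math.sqrt(7 * math.pi) * (3 / (2 * math.e)) ** 0.75

def eta_diff_bound(H):
    """Right-hand side of (A.23) for a given truncation height H."""
    return const_A23 / H ** 3.5

print(const_A23)
print(eta_diff_bound(200.0))            # sample H, chosen only for illustration
```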

    It will also be useful to bound

    \[\left| \int_0^\infty (\eta_+(t) - \eta(t))^2 \log t\, dt \right|.\]

    This is at most

    \[\left( \max_{t\ge 0} e^{-t^2} t^3 |\log t| \right) \cdot \int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t}.\]

    Now

    \[\max_{t\ge 0} e^{-t^2} t^3 |\log t| = \max\left( \max_{t\in[0,1]} e^{-t^2} t^3 (-\log t),\ \max_{t\in[1,5]} e^{-t^2} t^3 \log t \right) = 0.14882234545,\]

    where we find the maximum by the bisection method with 40 iterations (see §2.6). Hence, by (A.22),

    \[\int_0^\infty (\eta_+(t) - \eta(t))^2 |\log t|\, dt \le 0.148822346 \cdot \frac{C_4^2}{7\pi H^7} \le \frac{27427.502}{H^7} \le \left( \frac{165.61251}{H^{7/2}} \right)^2. \tag{A.24}\]
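The maximum used in (A.24) can be approximated on a fine grid. This is only a floating-point sketch: the text uses bisection with interval arithmetic, whereas the grid search below (with an arbitrarily chosen step) gives no rigorous error bound.

```python
import math

def f(t):
    # e^{-t^2} t^3 |log t|, the function maximized before (A.24)
    return math.exp(-t * t) * t**3 * abs(math.log(t))

step = 1e-4
grid_max = max(f(k * step) for k in range(1, 50001))   # samples in (0, 5]

C4 = 2013.18185012                                      # value quoted in (A.15)
const_A24 = grid_max * C4**2 / (7 * math.pi)
print(grid_max, const_A24, math.sqrt(const_A24))
```

Beyond $t = 5$ the factor $e^{-t^2}$ makes the function negligible, which is why the search interval matches the one in the text.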


    A.3 Norms involving $\eta_+$

    Let us now bound some $\ell_1$- and $\ell_2$-norms involving $\eta_+$. Relatively crude bounds will suffice in most cases.

    First, by (A.23),

    \[\begin{aligned}
    |\eta_+|_2 &\le |\eta|_2 + |\eta_+ - \eta|_2 \le 0.800129 + \frac{274.8569}{H^{7/2}}, \\
    |\eta_+|_2 &\ge |\eta|_2 - |\eta_+ - \eta|_2 \ge 0.800128 - \frac{274.8569}{H^{7/2}},
    \end{aligned} \tag{A.25}\]

    where we obtain

    \[|\eta|_2 = \sqrt{0.640205997} = 0.8001287 \tag{A.26}\]
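The value in (A.26) comes from symbolic integration; a numerical cross-check is easy to sketch. It assumes $\eta(t) = h(t)\, t e^{-t^2/2}$ with $h(t) = t^2(2-t)^3 e^{t-1/2}$ on $[0,2]$, which is consistent with $\eta(1) = 1$ as stated later in this appendix but is not spelled out in this passage. The helper `simpson` is a plain composite Simpson rule.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    width = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3

def eta_squared(t):
    # eta(t) = t^3 (2-t)^3 exp(t - 1/2 - t^2/2) under the assumption above
    return (t**3 * (2 - t)**3 * math.exp(t - 0.5 - t * t / 2)) ** 2

norm_sq = simpson(eta_squared, 0.0, 2.0)
norm = math.sqrt(norm_sq)
print(norm_sq, norm)
```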

    by symbolic integration.

    Let us now bound $|\eta_+ \cdot \log|_2^2$. By isometry and (2.10),

    \[|\eta_+ \cdot \log|_2^2 = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} |M(\eta_+ \cdot \log)(s)|^2\, ds = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M\eta_+)'(s)|^2\, ds.\]

    Now, $(M\eta_+)'(1/2 + it)$ equals $1/2\pi$ times the additive convolution of $Mh_H(it)$ and $(M\eta_\diamondsuit)'(1/2 + it)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Hence, by Young's inequality,

    \[|(M\eta_+)'(1/2 + it)|_2 \le \frac{1}{2\pi} |Mh_H(it)|_1\, |(M\eta_\diamondsuit)'(1/2 + it)|_2.\]

    Again by isometry and (2.10),

    \[|(M\eta_\diamondsuit)'(1/2 + it)|_2 = \sqrt{2\pi}\, |\eta_\diamondsuit \cdot \log|_2.\]

    Hence, by (A.17),

    \[|\eta_+ \cdot \log|_2 \le \frac{1}{2\pi} |Mh_H(it)|_1\, |\eta_\diamondsuit \cdot \log|_2 \le 2.5773421 \cdot |\eta_\diamondsuit \cdot \log|_2.\]

    Since, by symbolic integration,

    \[|\eta_\diamondsuit \cdot \log|_2 \le \sqrt{ \frac{\sqrt{\pi}}{32} \left( 8(\log 2)^2 + 2\gamma^2 + \pi^2 + 8(\gamma - 2)\log 2 - 8\gamma \right) } \le 0.3220301, \tag{A.27}\]

    we get that

    \[|\eta_+ \cdot \log|_2 \le 0.8299818. \tag{A.28}\]
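The closed form in (A.27) can be compared against a direct numerical integral of $(\eta_\diamondsuit(t)\log t)^2 = t^2 e^{-t^2}\log^2 t$. The sketch below uses a plain Simpson rule on $[0, 12]$ (the cutoff and step count are arbitrary choices) and a hard-coded value of Euler's constant $\gamma$.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2 = math.log(2)

# Square of the closed-form expression in (A.27)
closed_sq = math.sqrt(math.pi) / 32 * (
    8 * LOG2**2 + 2 * EULER_GAMMA**2 + math.pi**2
    + 8 * (EULER_GAMMA - 2) * LOG2 - 8 * EULER_GAMMA
)

def integrand(t):
    # (t e^{-t^2/2} log t)^2, extended by continuity at t = 0
    return 0.0 if t == 0.0 else (t * math.log(t)) ** 2 * math.exp(-t * t)

def simpson(func, a, b, n):
    width = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3

numeric_sq = simpson(integrand, 0.0, 12.0, 24000)
print(math.sqrt(closed_sq), math.sqrt(numeric_sq))
```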

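    Let us bound $|\eta_+(t)t^\sigma|_1$ for $\sigma \in (-2, \infty)$. By Cauchy-Schwarz and Plancherel,

    \[\begin{aligned}
    |\eta_+(t)t^\sigma|_1 &= \left| h_H(t)\, t^{1+\sigma} e^{-t^2/2} \right|_1 \le \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \left| h_H(t)/\sqrt{t} \right|_2 \\
    &= \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \sqrt{ \int_0^\infty |h_H(t)|^2\, \frac{dt}{t} } \\
    &= \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |Mh(ir)|^2\, dr } \\
    &\le \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mh(ir)|^2\, dr } = \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \left| h(t)/\sqrt{t} \right|_2.
    \end{aligned} \tag{A.29}\]

    Since

    \[\left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 = \sqrt{ \int_0^\infty e^{-t^2} t^{2\sigma+3}\, dt } = \sqrt{ \frac{\Gamma(\sigma+2)}{2} }, \qquad \left| h(t)/\sqrt{t} \right|_2 = \sqrt{ \frac{31989}{8e} - \frac{585 e^3}{8} } \le 1.5023459,\]

    we conclude that

    \[|\eta_+(t)t^\sigma|_1 \le 1.062319 \cdot \sqrt{\Gamma(\sigma+2)} \tag{A.30}\]

    for $\sigma > -2$.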

    A.4 Norms involving $\eta_+'$

    By one of the standard transformation rules (see (2.10)), the Mellin transform of $\eta_+'$ equals $-(s-1) \cdot M\eta_+(s-1)$. Since the Mellin transform is an isometry in the sense of (2.6),

    \[|\eta_+'|_2^2 = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \left| M(\eta_+')(s) \right|^2 ds = \frac{1}{2\pi i} \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s \cdot M\eta_+(s)|^2\, ds.\]

    Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Thus, by (2.9), the function $M\eta_+(-1/2 + it)$ equals $1/2\pi$ times the (additive) convolution of $Mh_H(it)$ and $M\eta_\diamondsuit(-1/2 + it)$. Therefore, for $s = -1/2 + it$,

    \[\begin{aligned}
    |s|\, |M\eta_+(s)| &= \frac{|s|}{2\pi} \left| \int_{-H}^{H} Mh(ir)\, M\eta_\diamondsuit(s - ir)\, dr \right| \\
    &\le \frac{3}{2\pi} \int_{-H}^{H} |ir - 1|\, |Mh(ir)| \cdot |s - ir|\, |M\eta_\diamondsuit(s - ir)|\, dr \\
    &= \frac{3}{2\pi} (f * g)(t),
    \end{aligned} \tag{A.31}\]

    where $f(t) = |it - 1|\,|Mh(it)|$ and $g(t) = |-1/2 + it|\,|M\eta_\diamondsuit(-1/2 + it)|$. (Since $|(-1/2 + i(t-r)) + (1 + ir)| = |1/2 + it| = |s|$, either $|-1/2 + i(t-r)| \ge |s|/3$ or $|1 + ir| \ge 2|s|/3$; hence $|s - ir|\,|ir - 1| = |-1/2 + i(t-r)|\,|1 + ir| \ge |s|/3$.) By Young's inequality (in a special case that follows from Cauchy-Schwarz), $|f * g|_2 \le |f|_1 |g|_2$. By (A.18),

    \[|f|_1 = |(r + i)Mh(ir)|_1 \le 27.8622803.\]

    Yet again by Plancherel,

    \[\begin{aligned}
    |g|_2^2 &= \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s|^2 |M\eta_\diamondsuit(s)|^2\, ds \\
    &= \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M(\eta_\diamondsuit'))(s)|^2\, ds = 2\pi |\eta_\diamondsuit'|_2^2 = \frac{3\pi^{3/2}}{4}.
    \end{aligned}\]

    Hence

    \[|\eta_+'|_2 \le \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{2\pi} |f * g|_2 \le \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{2\pi} \cdot 27.8622803 \sqrt{ \frac{3\pi^{3/2}}{4} } \le 10.845789. \tag{A.32}\]

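    Let us now bound $|\eta_+'(t)t^\sigma|_1$ for $\sigma \in (-1, \infty)$. First of all,

    \[\begin{aligned}
    |\eta_+'(t)t^\sigma|_1 &= \left| \left( h_H(t)\, t e^{-t^2/2} \right)' t^\sigma \right|_1 \\
    &\le \left| \left( h_H'(t)\, t e^{-t^2/2} + h_H(t)(1 - t^2) e^{-t^2/2} \right) \cdot t^\sigma \right|_1 \\
    &\le \left| h_H'(t)\, t^{\sigma+1} e^{-t^2/2} \right|_1 + |\eta_+(t)t^{\sigma-1}|_1 + |\eta_+(t)t^{\sigma+1}|_1.
    \end{aligned}\]

    We can bound the last two terms by (A.30). Much as in (A.29), we note that

    \[\left| h_H'(t)\, t^{\sigma+1} e^{-t^2/2} \right|_1 \le \left| t^{\sigma+1/2} e^{-t^2/2} \right|_2 \left| h_H'(t)\sqrt{t} \right|_2,\]

    and then see that

    \[\begin{aligned}
    \left| h_H'(t)\sqrt{t} \right|_2 &= \sqrt{ \int_0^\infty |h_H'(t)|^2\, t\, dt } = \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |M(h_H')(1 + ir)|^2\, dr } \\
    &= \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |(-ir) Mh_H(ir)|^2\, dr } = \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |(-ir) Mh(ir)|^2\, dr } \\
    &= \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |M(h')(1 + ir)|^2\, dr } \le \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |M(h')(1 + ir)|^2\, dr } = \left| h'(t)\sqrt{t} \right|_2,
    \end{aligned}\]

    where we use the first rule in (2.10) twice. Since

    \[\left| t^{\sigma+1/2} e^{-t^2/2} \right|_2 = \sqrt{ \frac{\Gamma(\sigma+1)}{2} }, \qquad \left| h'(t)\sqrt{t} \right|_2 = \sqrt{ \frac{103983}{16e} - \frac{1899 e^3}{16} } = 2.6312226,\]

    we conclude that

    \[\begin{aligned}
    |\eta_+'(t)t^\sigma|_1 &\le 1.062319 \cdot \left( \sqrt{\Gamma(\sigma+1)} + \sqrt{\Gamma(\sigma+3)} \right) + \sqrt{ \frac{\Gamma(\sigma+1)}{2} } \cdot 2.6312226 \\
    &\le 2.922875 \sqrt{\Gamma(\sigma+1)} + 1.062319 \sqrt{\Gamma(\sigma+3)}
    \end{aligned} \tag{A.33}\]

    for $\sigma > -1$.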


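    A.5 The $\ell_\infty$-norm of $\eta_+$

    Let us now get a bound for $|\eta_+|_\infty$. Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Clearly

    \[\begin{aligned}
    |\eta_+|_\infty = |h_H(t)\eta_\diamondsuit(t)|_\infty &\le |\eta|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t)|_\infty \\
    &\le |\eta|_\infty + \left| \frac{h(t) - h_H(t)}{t} \right|_\infty |\eta_\diamondsuit(t)\, t|_\infty.
    \end{aligned} \tag{A.34}\]

    Taking derivatives, we easily see that

    \[|\eta|_\infty = \eta(1) = 1, \qquad |\eta_\diamondsuit(t)\, t|_\infty = 2/e.\]

    It remains to bound $|(h(t) - h_H(t))/t|_\infty$. By (7.6),

    \[h_H(t) = \int_{t/2}^{\infty} h(t y^{-1})\, \frac{\sin(H \log y)}{\pi \log y}\, \frac{dy}{y} = \int_{-H \log\frac{2}{t}}^{\infty} h\left( \frac{t}{e^{w/H}} \right) \frac{\sin w}{\pi w}\, dw. \tag{A.35}\]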

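    The sine integral

    \[\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt\]

    is defined for all $x$; it tends to $\pi/2$ as $x \to +\infty$ and to $-\pi/2$ as $x \to -\infty$ (see [AS64, (5.2.25)]). We apply integration by parts to the second integral in (A.35), and obtain

    \[\begin{aligned}
    h_H(t) - h(t) &= -\frac{1}{\pi} \int_{-H \log\frac{2}{t}}^{\infty} \left( \frac{d}{dw} h\left( \frac{t}{e^{w/H}} \right) \right) \mathrm{Si}(w)\, dw - h(t) \\
    &= -\frac{1}{\pi} \int_0^{\infty} \left( \frac{d}{dw} h\left( \frac{t}{e^{w/H}} \right) \right) \left( \mathrm{Si}(w) - \frac{\pi}{2} \right) dw \\
    &\quad - \frac{1}{\pi} \int_{-H \log\frac{2}{t}}^{0} \left( \frac{d}{dw} h\left( \frac{t}{e^{w/H}} \right) \right) \left( \mathrm{Si}(w) + \frac{\pi}{2} \right) dw.
    \end{aligned}\]

    Now

    \[\left| \frac{d}{dw} h\left( \frac{t}{e^{w/H}} \right) \right| = \frac{t e^{-w/H}}{H} \left| h'\left( \frac{t}{e^{w/H}} \right) \right| \le \frac{t |h'|_\infty}{H e^{w/H}}.\]

    Integration by parts easily yields the bounds $|\mathrm{Si}(x) - \pi/2| < 2/x$ for $x > 0$ and $|\mathrm{Si}(x) + \pi/2| < 2/|x|$ for $x < 0$; we also know that $0 \le \mathrm{Si}(x) \le x < \pi/2$ for $x \in [0, 1]$ and $-\pi/2 < x \le \mathrm{Si}(x) \le 0$ for $x \in [-1, 0]$. Hence

    \[|h_H(t) - h(t)| \le \frac{2t|h'|_\infty}{\pi H} \left( \int_0^1 \frac{\pi}{2} e^{-w/H}\, dw + \int_1^\infty \frac{2 e^{-w/H}}{w}\, dw \right) = t |h'|_\infty \cdot \left( (1 - e^{-1/H}) + \frac{4}{\pi} \frac{E_1(1/H)}{H} \right),\]

    where $E_1$ is the exponential integral

    \[E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt.\]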


    By [AS64, (5.1.20)],

    \[0 < E_1(1/H) < \frac{\log(H + 1)}{e^{1/H}},\]

    and, since $\log(H+1) = \log H + \log(1 + 1/H) < \log H + 1/H < (\log H)(1 + 1/H) < (\log H) e^{1/H}$ for $H \ge e$, we see that this gives us that $E_1(1/H) < \log H$ (again for $H \ge e$, as is the case). Hence

    \[\frac{|h_H(t) - h(t)|}{t} < |h'|_\infty \cdot \left( 1 - e^{-1/H} + \frac{4}{\pi} \frac{\log H}{H} \right) < |h'|_\infty \cdot \frac{1 + \frac{4}{\pi} \log H}{H}, \tag{A.36}\]

    and so, by (A.34),

    \[|\eta_+|_\infty \le 1 + \frac{2}{e} \left| \frac{h(t) - h_H(t)}{t} \right|_\infty < 1 + \frac{2}{e} |h'|_\infty \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.\]

    By (A.11) and interval arithmetic, we determine that

    \[|h'|_\infty = |h'(\alpha_{2,2})| \le 2.805820379671, \tag{A.37}\]

    where $\alpha_{2,2}$ is a root of $h''(x) = 0$ as in (A.11). We have proven

    \[|\eta_+|_\infty < 1 + \frac{2}{e} \cdot 2.80582038 \cdot \frac{1 + \frac{4}{\pi} \log H}{H} < 1 + 2.06440727 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}. \tag{A.38}\]
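The constants in (A.37) and (A.38) can be approximated by sampling $|h'|$ on a fine grid. The sketch assumes $h(x) = x^2(2-x)^3e^{x-1/2}$ on $[0,2]$, for which $h'(x) = -x(x-1)(x-2)^2(x+4)e^{x-1/2}$; the grid size and the sample value $H = 200$ are arbitrary, and a grid maximum is a lower estimate of the supremum rather than a rigorous upper bound.

```python
import math

def h_prime(x):
    # Derivative of the assumed h(x) = x^2 (2-x)^3 e^{x-1/2}, in factored form
    return -x * (x - 1) * (x - 2) ** 2 * (x + 4) * math.exp(x - 0.5)

n = 200000
sup_h_prime = max(abs(h_prime(2 * i / n)) for i in range(n + 1))
const_A38 = 2 / math.e * sup_h_prime

def eta_plus_sup_bound(H):
    """Right-hand side of (A.38) with the numerically estimated constant."""
    return 1 + const_A38 * (1 + (4 / math.pi) * math.log(H)) / H

print(sup_h_prime, const_A38, eta_plus_sup_bound(200.0))
```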

    We will need three other bounds of this kind, namely, for $\eta_+(t)\log t$, $\eta_+(t)/t$ and $\eta_+(t)t$. We start as in (A.34):

    \[\begin{aligned}
    |\eta_+ \log t|_\infty &\le |\eta \log t|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t) \log t|_\infty \\
    &\le |\eta \log t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)\, t \log t|_\infty, \\
    |\eta_+(t)/t|_\infty &\le |\eta(t)/t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)|_\infty, \\
    |\eta_+(t)t|_\infty &\le |\eta(t)t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)t^2|_\infty.
    \end{aligned} \tag{A.39}\]

    By the bisection method with 30 iterations, implemented with interval arithmetic,

    \[|\eta(t) \log t|_\infty \le 0.279491, \qquad |\eta_\diamondsuit(t)\, t \log t|_\infty \le 0.3811561.\]

    Hence, by (A.36) and (A.37),

    \[|\eta_+ \log t|_\infty \le 0.279491 + 1.069456 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}. \tag{A.40}\]

    By the bisection method with 32 iterations,

    \[|\eta(t)/t|_\infty \le 1.08754396.\]

    (We can also obtain this by solving $(\eta(t)/t)' = 0$ symbolically.) It is easy to show that $|\eta_\diamondsuit|_\infty = 1/\sqrt{e}$. Hence, again by (A.36) and (A.37),

    \[|\eta_+(t)/t|_\infty \le 1.08754396 + 1.70181609 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}. \tag{A.41}\]


    By the bisection method with 32 iterations,

    \[|\eta(t)t|_\infty \le 1.06473476.\]

    Taking derivatives, we see that $|\eta_\diamondsuit(t)t^2|_\infty = 3^{3/2} e^{-3/2}$. Hence, yet again by (A.36) and (A.37),

    \[|\eta_+(t)t|_\infty \le 1.06473476 + 3.25312 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}. \tag{A.42}\]

    Appendix B

    Norms of Fourier transforms

    B.1 The Fourier transform of $\eta_2''$

    Our aim here is to give upper bounds on $|\widehat{\eta_2''}|_\infty$, where $\eta_2$ is as in (3.4). We will do considerably better than the trivial bound $|\widehat{\eta''}|_\infty \le |\eta''|_1$.

    Lemma B.1.1. For every $t \in \mathbb{R}$,

    \[|4e(-t/4) - 4e(-t/2) + e(-t)| \le 7.87052. \tag{B.1}\]

    We will describe an extremely simple, but rigorous, procedure to find the maximum. Since $|g(t)|^2$ is $C^2$ (in fact smooth), there are several more efficient and equally rigorous algorithms; for starters, the bisection method with error bounded in terms of $|(|g|^2)''|_\infty$.

    Proof. Let

    \[g(t) = 4e(-t/4) - 4e(-t/2) + e(-t). \tag{B.2}\]

    For $a \le t \le b$,

    \[g(t) = g(a) + \frac{t - a}{b - a}(g(b) - g(a)) + \frac{1}{8}(b - a)^2 \cdot O^*\left( \max_{v\in[a,b]} |g''(v)| \right). \tag{B.3}\]

    (This formula, in all likelihood well-known, is easy to derive. First, we can assume without loss of generality that $a = 0$, $b = 1$ and $g(a) = g(b) = 0$. Dividing $g$ by $g(t)$, we see that we can also assume that $g(t)$ is real (and in fact $1$). We can also assume that $g$ is real-valued, in that it will be enough to prove (B.3) for the real-valued function $\Re g$, as this will give us the bound $g(t) = \Re g(t) \le (1/8)\max_v |(\Re g)''(v)| \le (1/8)\max_v |g''(v)|$ that we wish for. Lastly, we can assume (by symmetry) that $0 \le t \le 1/2$, and that $g$ has a local maximum or minimum at $t$. Writing $M = \max_{u\in[0,1]} |g''(u)|$, we then have

    \[\begin{aligned}
    g(t) &= \int_0^t g'(v)\, dv = \int_0^t \int_t^v g''(u)\, du\, dv = O^*\left( \int_0^t \left| \int_t^v M\, du \right| dv \right) \\
    &= O^*\left( \int_0^t (t - v) M\, dv \right) = O^*\left( \frac{1}{2} t^2 M \right) = O^*\left( \frac{1}{8} M \right),
    \end{aligned}\]

    as desired.)

    We obtain immediately from (B.3) that

    \[\max_{t\in[a,b]} |g(t)| \le \max(|g(a)|, |g(b)|) + \frac{1}{8}(b - a)^2 \cdot \max_{v\in[a,b]} |g''(v)|. \tag{B.4}\]

    For any $v \in \mathbb{R}$,

    \[|g''(v)| \le \left(\frac{\pi}{2}\right)^2 \cdot 4 + \pi^2 \cdot 4 + (2\pi)^2 = 9\pi^2. \tag{B.5}\]

    Clearly, $g(t)$ depends only on $t \bmod 4\pi$. Hence, by (B.4) and (B.5), to estimate

    \[\max_{t\in\mathbb{R}} |g(t)|\]

    with an error of at most $\epsilon$, it is enough to subdivide $[0, 4\pi]$ into intervals of length $\le \sqrt{8\epsilon/9\pi^2}$ each. We set $\epsilon = 10^{-6}$ and compute.
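The sampling procedure in this proof can be sketched as follows. The sketch assumes the usual convention $e(x) = e^{2\pi i x}$ (consistent with (B.5)), under which $g$ has period $4$, so it samples $[0, 4]$; the interval $[0, 4\pi]$ of the text contains a full period. Floating-point arithmetic replaces interval arithmetic, so the output is indicative only.

```python
import cmath
import math

def e(x):
    # Assumed convention: e(x) = exp(2 pi i x)
    return cmath.exp(2j * math.pi * x)

def g(t):
    return 4 * e(-t / 4) - 4 * e(-t / 2) + e(-t)

eps = 1e-6
max_step = math.sqrt(8 * eps / (9 * math.pi**2))   # from (B.4) and (B.5)
n = math.ceil(4 / max_step)                        # number of subintervals of [0, 4]

sample_max = max(abs(g(4 * i / n)) for i in range(n + 1))
upper_bound = sample_max + eps                      # bound on max |g| by (B.4)
print(n, sample_max, upper_bound)
```

By (B.4), the maximum of $|g|$ on each subinterval exceeds the larger endpoint value by at most $\epsilon$, which is what the last line uses.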

    Lemma B.1.2. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

    \[|\widehat{\eta_2''}|_\infty \le 31.521. \tag{B.6}\]

    This should be compared with $|\eta_2''|_1 = 48$.

    Proof. We can write

    \[\eta_2''(x) = 4(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + f(x), \tag{B.7}\]

    where $\delta_{x_0}$ is the point measure at $x_0$ of mass $1$ (Dirac delta function) and

    \[f(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x \ge 1, \\ -4x^{-2} & \text{if } 1/4 \le x < 1/2, \\ 4x^{-2} & \text{if } 1/2 \le x < 1. \end{cases}\]

    Thus, $\widehat{\eta_2''}(t) = 4g(t) + \hat{f}(t)$, where $g$ is as in (B.2). It is easy to see that $|f'|_1 = 2\max_x f(x) - 2\min_x f(x) = 160$. Therefore,

    \[\left| \hat{f}(t) \right| = \left| \widehat{f'}(t)/(2\pi i t) \right| \le \frac{|f'|_1}{2\pi|t|} = \frac{80}{\pi|t|}. \tag{B.8}\]

    Since $31.521 - 4 \cdot 7.87052 = 0.03892$, we conclude that (B.6) follows from Lemma B.1.1 and (B.8) for $|t| \ge 655 > 80/(\pi \cdot 0.03892)$.

    It remains to check the range $t \in (-655, 655)$; since $4g(-t) + \hat{f}(-t)$ is the complex conjugate of $4g(t) + \hat{f}(t)$, it suffices to consider $t$ non-negative. We use (B.4) (with $4g + \hat{f}$ instead of $g$) and obtain that, to estimate $\max_{t\in\mathbb{R}} |4g(t) + \hat{f}(t)|$ with an error of at most $\epsilon$, it is enough to subdivide $[0, 655)$ into intervals of length $\le \sqrt{2\epsilon/|(4g + \hat{f})''|_\infty}$ each, and check $|4g(t) + \hat{f}(t)|$ at the endpoints. Now, for every $t \in \mathbb{R}$,

    \[\left| (\hat{f})''(t) \right| = \left| (-2\pi i)^2\, \widehat{x^2 f}(t) \right| = (2\pi)^2 \cdot O^*\left( |x^2 f|_1 \right) = 12\pi^2.\]

    By this and (B.5), $|(4g + \hat{f})''|_\infty \le 48\pi^2$. Thus, intervals of length $\delta_1$ give an error term of size at most $24\pi^2\delta_1^2$. We choose $\delta_1 = 0.001$ and obtain an error term less than $0.000237$ for this stage.

    To evaluate $\hat{f}(t)$ (and hence $4g(t) + \hat{f}(t)$) at a point, we integrate using Simpson's rule on subdivisions of the intervals $[1/4, 1/2]$, $[1/2, 1]$ into $200 \cdot \max(1, \lfloor\sqrt{|t|}\rfloor)$ subintervals each (see footnote 1). The largest value of $|4g(t) + \hat{f}(t)|$ we find is $31.52065$, with an error term of at most $4.5 \cdot 10^{-5}$.

    B.2 Bounds involving a logarithmic factor

    Our aim now is to give upper bounds on $|\widehat{\eta_{(y)}''}|_\infty$, where $\eta_{(y)}(t) = \log(yt)\eta_2(t)$ and $y \ge 4$.

    Lemma B.2.1. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = \log(yt)\eta_2(t)$, where $y \ge 4$. Then

    \[|\eta_{(y)}'|_1 < (\log y)|\eta_2'|_1. \tag{B.9}\]

    Proof. Recall that $\mathrm{supp}(\eta_2) = (1/4, 1)$. For $t \in (1/4, 1/2)$,

    \[\eta_{(y)}'(t) = (4\log(yt)\log 4t)' = \frac{4\log 4t}{t} + \frac{4\log yt}{t} \ge \frac{8\log 4t}{t} > 0,\]

    whereas, for $t \in (1/2, 1)$,

    \[\eta_{(y)}'(t) = (-4\log(yt)\log t)' = -\frac{4\log yt}{t} - \frac{4\log t}{t} = -\frac{4\log yt^2}{t} < 0,\]

    where we are using the fact that $y \ge 4$. Hence $\eta_{(y)}(t)$ is increasing on $(1/4, 1/2)$ and decreasing on $(1/2, 1)$; it is also continuous at $t = 1/2$. Hence $|\eta_{(y)}'|_1 = 2|\eta_{(y)}(1/2)|$. We are done by

    \[2|\eta_{(y)}(1/2)| = 2\log\frac{y}{2} \cdot \eta_2(1/2) = \log\frac{y}{2} \cdot 8\log 2 < \log y \cdot 8\log 2 = (\log y)|\eta_2'|_1.\]
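The identity $|\eta_{(y)}'|_1 = 8\log 2 \cdot \log(y/2)$ behind Lemma B.2.1 can be checked numerically as a total variation. The sketch uses the piecewise expressions for $\eta_{(y)}$ that appear in the proof; the grid size and the sample values of $y$ are arbitrary.

```python
import math

def eta_y(t, y):
    # eta_(y)(t) = log(y t) * eta_2(t), using the piecewise form from the proof
    if 0.25 < t < 0.5:
        return 4 * math.log(y * t) * math.log(4 * t)
    if 0.5 <= t < 1:
        return -4 * math.log(y * t) * math.log(t)
    return 0.0

def total_variation(y, n=3000):
    # Sum of |increments| on a grid of [1/4, 1] that contains t = 1/2
    pts = [0.25 + 0.75 * i / n for i in range(n + 1)]
    vals = [eta_y(t, y) for t in pts]
    return sum(abs(vals[i + 1] - vals[i]) for i in range(n))

results = {y: total_variation(y) for y in (4.0, 10.0, 1e6)}
print(results)
```

Because $\eta_{(y)}$ is monotone on each side of $t = 1/2$ for $y \ge 4$, the sum telescopes to $2\eta_{(y)}(1/2)$ up to rounding.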

    Lemma B.2.2. Let $y \ge 4$. Let $g(t) = 4e(-t/4) - 4e(-t/2) + e(-t)$ and $k(t) = 2e(-t/4) - e(-t/2)$. Then, for every $t \in \mathbb{R}$,

    \[|g(t) \cdot \log y - k(t) \cdot 4\log 2| \le 7.87052 \log y. \tag{B.10}\]

    Proof. By Lemma B.1.1, $|g(t)| \le 7.87052$. Since $y \ge 4$, $|k(t)| \cdot (4\log 2)/\log y \le 6$. For any complex numbers $z_1$, $z_2$ with $|z_1|, |z_2| \le \ell$, we can have $|z_1 - z_2| > \ell$ only if $|\arg(z_1/z_2)| > \pi/3$. It is easy to check that, for all $t \in [-2, 2]$,

    \[\left| \arg\left( \frac{g(t) \cdot \log y}{4\log 2 \cdot k(t)} \right) \right| = \left| \arg\left( \frac{g(t)}{k(t)} \right) \right| < 0.7 < \frac{\pi}{3}.\]

    (It is possible to bound maxima rigorously as in (B.4).) Hence (B.10) holds.

    Footnote 1: As usual, the code uses interval arithmetic (§2.6).


    Lemma B.2.3. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = (\log yt)\eta_2(t)$, where $y \ge 4$. Then

    \[|\widehat{\eta_{(y)}''}|_\infty < 31.521 \cdot \log y. \tag{B.11}\]

    Proof. Clearly

    \[\begin{aligned}
    \eta_{(y)}''(x) &= \eta_2''(x)(\log y) + \left( (\log x)\eta_2''(x) + \frac{2}{x}\eta_2'(x) - \frac{1}{x^2}\eta_2(x) \right) \\
    &= \eta_2''(x)(\log y) + 4(\log x)(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + h(x),
    \end{aligned}\]

    where

    \[h(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x > 1, \\ \frac{4}{x^2}(2 - 2\log 2x) & \text{if } 1/4 \le x < 1/2, \\ \frac{4}{x^2}(-2 + 2\log x) & \text{if } 1/2 \le x < 1. \end{cases}\]

    (Here we are using the expression (B.7) for $\eta_2''(x)$.) Hence

    \[\widehat{\eta_{(y)}''}(t) = (4g(t) + \hat{f}(t))(\log y) + (-16\log 2 \cdot k(t) + \hat{h}(t)), \tag{B.12}\]

    where $k(t) = 2e(-t/4) - e(-t/2)$. Just as in the proof of Lemma B.1.2,

    \[|\hat{f}(t)| \le \frac{|f'|_1}{2\pi|t|} \le \frac{80}{\pi|t|}, \qquad |\hat{h}(t)| \le \frac{160(1 + \log 2)}{\pi|t|}. \tag{B.13}\]

    Again as before, this implies that (B.11) holds for

    \[|t| \ge \frac{1}{\pi \cdot 0.03892} \left( 80 + \frac{160(1 + \log 2)}{\log 4} \right) = 2252.51.\]

    Note also that it is enough to check (B.11) for $t \ge 0$, by symmetry. Our remaining task is to prove (B.11) for $0 \le t \le 2252.21$.

    Let $I = [0.3, 2252.21] \setminus [3.25, 3.65]$. For $t \in I$, we will have

    \[\arg\left( \frac{4g(t) + \hat{f}(t)}{-16\log 2 \cdot k(t) + \hat{h}(t)} \right) \in \left( -\frac{\pi}{3}, \frac{\pi}{3} \right). \tag{B.14}\]

    (This is actually true for $0 \le t \le 0.3$ as well, but we will use a different strategy in that range in order to better control error terms.) Consequently, by Lemma B.1.2 and $\log y \ge \log 4$,

    \[\begin{aligned}
    |\widehat{\eta_{(y)}''}(t)| &< \max\left( |4g(t) + \hat{f}(t)| \cdot (\log y),\ |16\log 2 \cdot k(t) - \hat{h}(t)| \right) \\
    &< \max\left( 31.521(\log y),\ |48\log 2 + 25| \right) = 31.521\log y,
    \end{aligned}\]

    where we bound $\hat{h}(t)$ by (B.13) and by a numerical computation of the maximum of $|\hat{h}(t)|$ for $0 \le t \le 4$ as in the proof of Lemma B.1.2.

    It remains to check (B.14). Here, as in the proof of Lemma B.2.2, the allowable error is relatively large (the expression on the left of (B.14) is actually contained in $(-1, 1)$ for $t \in I$). We decide to evaluate the argument in (B.14) at all $t \in 0.005\mathbb{Z} \cap I$, computing $\hat{f}(t)$ and $\hat{h}(t)$ by numerical integration (Simpson's rule) with a subdivision of $[-1/4, 1]$ into 5000 intervals. Proceeding as in the proof of Lemma B.1.1, we see that the sampling induces an error of at most

    \[\frac{1}{2}\, 0.005^2 \max_{v\in I} \left( 4|g''(v)| + |(\hat{f})''(t)| \right) \le \frac{0.0001}{8}\, 48\pi^2 < 0.00593 \tag{B.15}\]

    in the evaluation of $4g(t) + \hat{f}(t)$, and an error of at most

    \[\frac{1}{2}\, 0.005^2 \max_{v\in I} \left( 16\log 2 \cdot |k''(v)| + |(\hat{h})''(t)| \right) \le \frac{0.0001}{8} \left( 16\log 2 \cdot 6\pi^2 + 24\pi^2 \cdot (2 - \log 2) \right) < 0.0121 \tag{B.16}\]

    in the evaluation of $-16\log 2 \cdot k(t) + \hat{h}(t)$.

    Running the numerical evaluation just described for $t \in I$, the estimates for the left side of (B.14) at the sample points are at most $0.99134$ in absolute value; the absolute values of the estimates for $4g(t) + \hat{f}(t)$ are all at least $2.7783$, and the absolute values of the estimates for $-16\log 2 \cdot k(t) + \hat{h}(t)$ are all at least $2.1166$. Numerical integration by Simpson's rule gives errors bounded by $0.17575$ percent. Hence the absolute value of the left side of (B.14) is at most

    \[0.99134 + \arcsin\left( \frac{0.00593}{2.7783} + 0.0017575 \right) + \arcsin\left( \frac{0.0121}{2.1166} + 0.0017575 \right) \le 1.00271 < \frac{\pi}{3}\]

    for $t \in I$.

    Lastly, for $t \in [0, 0.3] \cup [3.25, 3.65]$, a numerical computation (samples at $0.001\mathbb{Z}$; interpolation as in Lemma B.1.2; integrals computed by Simpson's rule with a subdivision into 1000 intervals) gives

    \[\max_{t\in[0,0.3]\cup[3.25,3.65]} \left( |4g(t) + \hat{f}(t)| + \frac{|-16\log 2 \cdot k(t) + \hat{h}(t)|}{\log 4} \right) < 29.08,\]

    and so $\max_{t\in[0,0.3]\cup[3.25,3.65]} |\widehat{\eta_{(y)}''}(t)| < 29.1\log y < 31.521\log y$.

    An easy integral gives us that the function $\log \cdot \eta_2$ satisfies

    \[|\log \cdot \eta_2|_1 = 2 - \log 4. \tag{B.17}\]

    The following function will appear only in a lower-order term; thus, an $\ell_1$ estimate will do.

    Lemma B.2.4. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

    \[|(\log \cdot \eta_2)''|_1 = 96\log 2. \tag{B.18}\]


    Proof. The function $(\log \cdot \eta_2)'(t)$ is $0$ for $t \notin [1/4, 1]$, is increasing and negative for $t \in (1/4, 1/2)$ and is decreasing and positive for $t \in (1/2, 1)$. Hence its total variation is

    \[|(\log \cdot \eta_2)''|_1 = 2\left( (\log \cdot \eta_2)'\left( \tfrac{1}{2}^+ \right) - (\log \cdot \eta_2)'\left( \tfrac{1}{4}^+ \right) \right) = 2(16\log 2 - (-32\log 2)) = 96\log 2.\]

    Appendix C

    Sums involving Λ and φ

    C.1 Sums over primes

    Here we treat some sums of the type $\sum_n \Lambda(n)\varphi(n)$, where $\varphi$ has compact support. Since the sums are over all integers (not just an arithmetic progression) and there is no phase $e(\alpha n)$ involved, the treatment is relatively straightforward.

    The following is standard.

    Lemma C.1.1 (Explicit formula). Let $\varphi : [1, \infty) \to \mathbb{C}$ be continuous and piecewise $C^1$ with $\varphi'' \in \ell_1$; let it also be of compact support contained in $[1, \infty)$. Then

    \[\sum_n \Lambda(n)\varphi(n) = \int_1^\infty \left( 1 - \frac{1}{x(x^2 - 1)} \right) \varphi(x)\, dx - \sum_\rho (M\varphi)(\rho), \tag{C.1}\]

    where $\rho$ runs over the non-trivial zeros of $\zeta(s)$.

    The non-trivial zeros of $\zeta(s)$ are, of course, those in the critical strip $0 < \Re(s) < 1$.

    Remark. Lemma C.1.1 appears as exercise 5 in [IK04, §5.5]; the condition there that $\varphi$ be smooth can be relaxed, since already the weaker assumption that $\varphi''$ be in $L^1$ implies that the Mellin transform $(M\varphi)(\sigma + it)$ decays quadratically on $t$ as $t \to \infty$, thereby guaranteeing that the sum $\sum_\rho (M\varphi)(\rho)$ converges absolutely.

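    Lemma C.1.2. Let $x \ge 10$. Let $\eta_2$ be as in (117). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$ lie on the critical line. Then

    \[\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = x + O^*\left( 0.135x^{1/2} + \frac{9.7}{x^2} \right) + \frac{\log\frac{eT_0}{2\pi}}{T_0} \left( \frac{9/4}{2\pi} + \frac{6.03}{T_0} \right) x. \tag{C.2}\]

    In particular, with $T_0 = 3.061 \cdot 10^{10}$ in the assumption, we have, for $x \ge 2000$,

    \[\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}),\]

    where $\epsilon = 2.73 \cdot 10^{-10}$.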


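    The assumption that all non-trivial zeros up to $T_0 = 3.061 \cdot 10^{10}$ lie on the critical line was proven rigorously in [Plaa]; higher values of $T_0$ have been reached elsewhere ([Wed03], [GD04]).

    Proof. By Lemma C.1.1,

    \[\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = \int_1^\infty \eta_2\left(\frac{t}{x}\right) dt - \int_1^\infty \frac{\eta_2(t/x)}{t(t^2 - 1)}\, dt - \sum_\rho (M\varphi)(\rho),\]

    where $\varphi(u) = \eta_2(u/x)$ and $\rho$ runs over all non-trivial zeros of $\zeta(s)$. Since $\eta_2$ is non-negative, $\int_1^\infty \eta_2(t/x)\, dt = x|\eta_2|_1 = x$, while

    \[\int_1^\infty \frac{\eta_2(t/x)}{t(t^2 - 1)}\, dt = O^*\left( \int_{1/4}^{1} \frac{\eta_2(t)}{t x^2 (t^2 - 1/100)}\, dt \right) = O^*\left( \frac{9.61114}{x^2} \right).\]

    By (2.11),

    \[\sum_\rho (M\varphi)(\rho) = \sum_\rho M\eta_2(\rho) \cdot x^\rho = \sum_\rho \left( \frac{1 - 2^{-\rho}}{\rho} \right)^2 x^\rho = S_1(x) - 2S_1(x/2) + S_1(x/4),\]

    where

    \[S_m(x) = \sum_\rho \frac{x^\rho}{\rho^{m+1}}. \tag{C.3}\]

    Setting aside the contribution of all $\rho$ with $|\Im(\rho)| \le T_0$ and all $\rho$ with $|\Im(\rho)| > T_0$ and $\Re(\rho) \le 1/2$, and using the symmetry provided by the functional equation, we obtain

    \[\begin{aligned}
    |S_m(x)| &\le x^{1/2} \cdot \sum_\rho \frac{1}{|\rho|^{m+1}} + x \cdot \sum_{\substack{\rho \\ |\Im(\rho)| > T_0 \\ |\Re(\rho)| > 1/2}} \frac{1}{|\rho|^{m+1}} \\
    &\le x^{1/2} \cdot \sum_\rho \frac{1}{|\rho|^{m+1}} + \frac{x}{2} \cdot \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} \frac{1}{|\rho|^{m+1}}.
    \end{aligned}\]

    We bound the first sum by [Ros41, Lemma 17] and the second sum by [RS03, Lemma 2]. We obtain

    \[|S_m(x)| \le \left( \frac{1}{2m\pi T_0^m} + \frac{2.68}{T_0^{m+1}} \right) x \log\frac{eT_0}{2\pi} + \kappa_m x^{1/2}, \tag{C.4}\]

    where $\kappa_1 = 0.0463$, $\kappa_2 = 0.00167$ and $\kappa_3 = 0.0000744$. Hence

    \[\left| \sum_\rho (M\eta_2)(\rho) \cdot x^\rho \right| \le \left( \frac{1}{2\pi T_0} + \frac{2.68}{T_0^2} \right) \frac{9x}{4} \log\frac{eT_0}{2\pi} + \left( \frac{3}{2} + \sqrt{2} \right) \kappa_1 x^{1/2}.\]

    For $T_0 = 3.061 \cdot 10^{10}$ and $x \ge 2000$, we obtain

    \[\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}),\]

    where $\epsilon = 2.73 \cdot 10^{-10}$.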
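The numerical constants at the end of this proof follow from plain arithmetic, which the following sketch reproduces. The variable names are illustrative; the inputs ($T_0$, $\kappa_1$, the constant $9.61114$) are the ones quoted in the proof.

```python
import math

T0 = 3.061e10
KAPPA_1 = 0.0463

log_term = math.log(math.e * T0 / (2 * math.pi))

# Coefficient of x coming from the zeros with |Im(rho)| > T0
eps = (1 / (2 * math.pi * T0) + 2.68 / T0**2) * (9 / 4) * log_term

# Coefficient of x^{1/2} coming from S_1(x) - 2 S_1(x/2) + S_1(x/4)
sqrt_coeff = (1.5 + math.sqrt(2)) * KAPPA_1

# At x = 2000 the term 9.61114/x^2 fits in the slack below 0.135 x^{1/2};
# the slack grows with x while 9.61114/x^2 shrinks.
x = 2000.0
fits_at_2000 = sqrt_coeff * math.sqrt(x) + 9.61114 / x**2 <= 0.135 * math.sqrt(x)

print(eps, sqrt_coeff, fits_at_2000)
```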

    Corollary C.1.3. Let $\eta_2$ be as in (117). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$, $T_0 = 3.061 \cdot 10^{10}$, lie on the critical line. Then, for all $x \ge 1$,

    \[\sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) \le \min\left( (1 + \epsilon)x + 0.2x^{1/2},\ 1.04488x \right), \tag{C.5}\]

    where $\epsilon = 2.73 \cdot 10^{-10}$.

    Proof. Immediate from Lemma C.1.2 for $x \ge 2000$. For $x < 2000$, we use computation as follows. Since $|\eta_2'|_\infty = 16$ and $\sum_{x/4 \le n \le x} \Lambda(n) \le x$ for all $x \ge 0$, computing $\sum_{n \le x} \Lambda(n)\eta_2(n/x)$ only for $x \in (1/1000)\mathbb{Z} \cap [0, 2000]$ results in an inaccuracy of at most $(16 \cdot 0.0005/0.9995)x \le 0.00801x$. This resolves the matter at all points outside $(205, 207)$ (for the first estimate) or outside $(9.5, 10.5)$ and $(13.5, 14.5)$ (for the second estimate). In those intervals, the prime powers $n$ involved do not change (since whether $x/4 < n \le x$ depends only on $n$ and $[x]$), and thus we can find the maximum of the sum in (C.5) just by taking derivatives.

    C.2 Sums involving $\phi$

    We need estimates for several sums involving $\phi(q)$ in the denominator.

    The easiest are convergent sums, such as $\sum_q \mu^2(q)/(\phi(q)q)$. We can express this as $\prod_p (1 + 1/(p(p-1)))$. This is a convergent product, and the main task is to bound a tail: for $r$ an integer,

    \[\log \prod_{p > r} \left( 1 + \frac{1}{p(p-1)} \right) \le \sum_{p > r} \frac{1}{p(p-1)} \le \sum_{n > r} \frac{1}{n(n-1)} = \frac{1}{r}. \tag{C.6}\]

    A quick computation (see footnote 1) now suffices to give

    \[2.591461 \le \sum_q \frac{\gcd(q, 2)\mu^2(q)}{\phi(q)q} < 2.591463 \tag{C.7}\]

    and so

    \[1.295730 \le \sum_{q \text{ odd}} \frac{\mu^2(q)}{\phi(q)q} < 1.295732, \tag{C.8}\]

    since the expression bounded in (C.8) is exactly half of that bounded in (C.7).

    Footnote 1: Using D. Platt's integer arithmetic package.
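The "tail bound followed by a computation" pattern of (C.6) to (C.8) can be sketched in a few lines. The sketch uses ordinary floating point and a simple sieve instead of the arithmetic package mentioned in the footnote, so it illustrates the method rather than reproducing a rigorous enclosure; the cutoff $r = 10^6$ is an arbitrary choice.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

r = 10**6
# Sum over odd squarefree q of 1/(phi(q) q) equals the product over odd primes.
log_partial = sum(math.log1p(1 / (p * (p - 1))) for p in primes_up_to(r) if p > 2)

lower_odd = math.exp(log_partial)           # truncated product: a lower bound
upper_odd = math.exp(log_partial + 1 / r)   # tail bounded via (C.6)
print(lower_odd, upper_odd)                 # compare with (C.8)
print(2 * lower_odd, 2 * upper_odd)         # compare with (C.7)
```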


    Again using (C.6), we get that

    \[2.826419 \le \sum_q \frac{\mu^2(q)}{\phi(q)^2} < 2.826421. \tag{C.9}\]

    In what follows, we will use values for convergent sums obtained in much the same way: an easy tail bound followed by a computation.

    By [Ram95, Lemma 3.4],

    \[\begin{aligned}
    \sum_{q \le r} \frac{\mu^2(q)}{\phi(q)} &= \log r + c_E + O^*(7.284 r^{-1/3}), \\
    \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} &= \frac{1}{2}\left( \log r + c_E + \frac{\log 2}{2} \right) + O^*(4.899 r^{-1/3}),
    \end{aligned} \tag{C.10}\]

    where

    \[c_E = \gamma + \sum_p \frac{\log p}{p(p-1)} = 1.332582275 + O^*(10^{-9}/3)\]

    by [RS62, (2.11)]. As we already said in (1215), this, supplemented by a computation for $r \le 4 \cdot 10^7$, gives

    \[\log r + 1.312 \le \sum_{q \le r} \frac{\mu^2(q)}{\phi(q)} \le \log r + 1.354\]

    for $r \ge 182$. In the same way, we get that

    \[\frac{1}{2}\log r + 0.83 \le \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \le \frac{1}{2}\log r + 0.85 \tag{C.11}\]

    for $r \ge 195$. (The numerical verification here goes up to $1.38 \cdot 10^8$; for $r > 3.18 \cdot 10^8$, use (C.10).)

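    Clearly

    \[\sum_{\substack{q \le 2r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} = \sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}. \tag{C.12}\]

    We wish to obtain bounds for the sums

    \[\sum_{q \ge r} \frac{\mu^2(q)}{\phi(q)^2}, \qquad \sum_{\substack{q \ge r \\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2}, \qquad \sum_{\substack{q \ge r \\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2},\]

    where $N \in \mathbb{Z}^+$ and $r \ge 1$. To do this, it will be helpful to express some of the quantities within these sums as convolutions (see footnote 2). For $q$ squarefree and $j \ge 1$,

    \[\frac{\mu^2(q) q^{j-1}}{\phi(q)^j} = \sum_{ab = q} \frac{f_j(b)}{a}, \tag{C.13}\]

    Footnote 2: The author would like to thank O. Ramaré for teaching him this technique.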


    where $f_j$ is the multiplicative function defined by

    \[f_j(p) = \frac{p^j - (p-1)^j}{(p-1)^j p}, \qquad f_j(p^k) = 0 \text{ for } k \ge 2.\]
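The convolution identity (C.13) can be verified exactly on small squarefree $q$ with rational arithmetic. In the sketch below a squarefree number is represented by the tuple of its prime factors; the helper names `f_j`, `lhs` and `rhs` are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def f_j(j, primes):
    """f_j(b) for squarefree b with the given prime factors (multiplicative)."""
    return prod(
        (Fraction(p**j - (p - 1) ** j, (p - 1) ** j * p) for p in primes),
        start=Fraction(1),
    )

def lhs(j, primes):
    # mu^2(q) q^{j-1} / phi(q)^j for squarefree q = product of primes
    q = prod(primes)
    phi = prod(p - 1 for p in primes)
    return Fraction(q ** (j - 1), phi**j)

def rhs(j, primes):
    # Sum over factorizations q = a b of f_j(b) / a
    total = Fraction(0)
    for k in range(len(primes) + 1):
        for b_primes in combinations(primes, k):
            a = prod(p for p in primes if p not in b_primes)
            total += f_j(j, b_primes) / a
    return total

base = (2, 3, 5, 7, 11)
identity_holds = all(
    lhs(j, qs) == rhs(j, qs)
    for j in (1, 2, 3)
    for k in range(len(base) + 1)
    for qs in combinations(base, k)
)
print(identity_holds)
```

The identity reduces to $1/p + f_j(p) = p^{j-1}/(p-1)^j$ at each prime, which is how $f_j(p)$ is defined.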

    We will also find the following estimate useful.

    Lemma C.2.1. Let $j \ge 2$ be an integer and $A$ a positive real. Let $m \ge 1$ be an integer. Then

    \[\sum_{\substack{a \ge A \\ (a,m) = 1}} \frac{\mu^2(a)}{a^j} \le \frac{\zeta(j)/\zeta(2j)}{A^{j-1}} \cdot \prod_{p | m} \left( 1 + \frac{1}{p^j} \right)^{-1}. \tag{C.14}\]

    It is useful to note that $\zeta(2)/\zeta(4) = 15/\pi^2 = 1.519817$ and $\zeta(3)/\zeta(6) = 1.181564$.

    Proof. The right side of (C.14) decreases as $A$ increases, while the left side depends only on $\lceil A\rceil$. Hence, it is enough to prove (C.14) when $A$ is an integer.

    For $A = 1$, (C.14) is an equality. Let
    $$C = \frac{\zeta(j)}{\zeta(2j)}\cdot \prod_{p|m}\left(1+\frac{1}{p^j}\right)^{-1}.$$
    Let $A\ge 2$. Since
    $$\sum_{\substack{a\ge A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} = C - \sum_{\substack{a<A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}$$
    and
    $$\begin{aligned}
    C = \sum_{\substack{a\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} &< \sum_{\substack{a<A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \int_A^\infty \frac{1}{t^j}\,dt\\
    &= \sum_{\substack{a<A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \frac{1}{(j-1)A^{j-1}},
    \end{aligned}$$
    we obtain
    $$\begin{aligned}
    \sum_{\substack{a\ge A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} &= \frac{1}{A^{j-1}}\cdot C + \frac{A^{j-1}-1}{A^{j-1}}\cdot C - \sum_{\substack{a<A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}\\
    &< \frac{C}{A^{j-1}} + \frac{A^{j-1}-1}{A^{j-1}}\cdot\left(\frac{1}{A^j}+\frac{1}{(j-1)A^{j-1}}\right) - \frac{1}{A^{j-1}} \sum_{\substack{a<A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}\\
    &\le \frac{C}{A^{j-1}} + \frac{1}{A^{j-1}}\left(\left(1-\frac{1}{A^{j-1}}\right)\left(\frac{1}{A}+\frac{1}{j-1}\right) - 1\right).
    \end{aligned}$$
    Since $(1 - 1/A)(1/A + 1) < 1$ and $1/A + 1/(j-1) \le 1$ for $j\ge 3$, we obtain that
    $$\left(1-\frac{1}{A^{j-1}}\right)\left(\frac{1}{A}+\frac{1}{j-1}\right) < 1$$
    for all integers $j\ge 2$, and so the statement follows.
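A numerical sanity check of Lemma C.2.1 for $j = 2$ can be sketched as follows. This is not part of the proof: the function names and the truncation point `limit` are assumptions made for illustration, and floating-point arithmetic is used, so the check is only indicative. The tail beyond `limit` is bounded by $\int_{\text{limit}}^\infty t^{-2}\,dt = 1/\text{limit}$.

```python
import math

def squarefree_flags(limit):
    """flags[a] is True exactly when a is squarefree (1 <= a <= limit)."""
    flags = [True] * (limit + 1)
    d = 2
    while d * d <= limit:
        for k in range(d * d, limit + 1, d * d):
            flags[k] = False
        d += 1
    return flags

def tail_upper(A, m, limit=200_000):
    """Upper estimate for the sum of mu^2(a)/a^2 over a >= A with gcd(a, m) = 1."""
    flags = squarefree_flags(limit)
    start = max(1, math.ceil(A))
    partial = sum(1.0 / (a * a) for a in range(start, limit + 1)
                  if flags[a] and math.gcd(a, m) == 1)
    # Terms with a > limit contribute less than the integral of t^-2 from limit.
    return partial + 1.0 / limit

def lemma_bound(A, m):
    """Right side of (C.14) for j = 2, using zeta(2)/zeta(4) = 15/pi^2."""
    primes = [p for p in range(2, m + 1)
              if m % p == 0 and all(p % d for d in range(2, p))]
    factor = math.prod(1.0 / (1.0 + 1.0 / p**2) for p in primes)
    return (15.0 / math.pi**2) / A * factor
```

Comparing `tail_upper(A, m)` with `lemma_bound(A, m)` for a few values of $A \ge 2$ shows the inequality with room to spare, while `tail_upper(1, 1)` is close to $15/\pi^2$, in line with the remark that (C.14) is an equality for $A = 1$.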

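    We now obtain easily the estimates we want: by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 1$),
    $$\begin{aligned}
    \sum_{q\ge r} \frac{\mu^2(q)}{\phi(q)^2} &= \sum_{q\ge r} \sum_{ab=q} \frac{f_2(b)}{a}\,\frac{\mu^2(q)}{q} \le \sum_{b\ge 1} \frac{f_2(b)}{b} \sum_{a\ge r/b} \frac{\mu^2(a)}{a^2}\\
    &\le \frac{\zeta(2)/\zeta(4)}{r} \sum_{b\ge 1} f_2(b) = \frac{15/\pi^2}{r} \prod_p \left(1 + \frac{2p-1}{(p-1)^2 p}\right) \le \frac{6.7345}{r}.
    \end{aligned} \tag{C.15}$$
    Similarly, by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 2$),
    $$\begin{aligned}
    \sum_{\substack{q\ge r\\ q\ \text{odd}}} \frac{\mu^2(q)}{\phi(q)^2} &\le \sum_{\substack{b\ge 1\\ b\ \text{odd}}} \frac{f_2(b)}{b} \sum_{\substack{a\ge r/b\\ a\ \text{odd}}} \frac{\mu^2(a)}{a^2} \le \frac{\zeta(2)/\zeta(4)}{1 + 1/2^2}\,\frac{1}{r} \sum_{b\ \text{odd}} f_2(b)\\
    &= \frac{12}{\pi^2}\,\frac{1}{r} \prod_{p>2} \left(1 + \frac{2p-1}{(p-1)^2 p}\right) \le \frac{2.15502}{r},
    \end{aligned} \tag{C.16}$$
    $$\sum_{\substack{q\ge r\\ q\ \text{even}}} \frac{\mu^2(q)}{\phi(q)^2} = \sum_{\substack{q\ge r/2\\ q\ \text{odd}}} \frac{\mu^2(q)}{\phi(q)^2} \le \frac{4.31004}{r}. \tag{C.17}$$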
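The numerical constants in (C.15)–(C.17) come from Euler products that converge quickly. The following sketch approximates them by truncating the product at an arbitrary cutoff; the names `primes_up_to` and `partial_constants` and the cutoff $10^5$ are illustrative choices. Since every factor exceeds 1, a truncated product lies slightly below the full one.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def partial_constants(limit=10**5):
    """Truncated versions of the constants in (C.15), (C.16) and (C.17)."""
    full = 15.0 / math.pi**2      # zeta(2)/zeta(4)
    odd = 12.0 / math.pi**2       # zeta(2)/zeta(4) divided by (1 + 1/2^2)
    for p in primes_up_to(limit):
        factor = 1.0 + (2 * p - 1) / ((p - 1) ** 2 * p)
        full *= factor
        if p > 2:
            odd *= factor
    return full, odd, 2 * odd     # (C.17) applies (C.16) with r/2 in place of r
```

The three returned values can be compared with 6.7345, 2.15502 and 4.31004; the third is simply twice the second, since the even case reduces to the odd case at $r/2$.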

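    Lastly,
    $$\begin{aligned}
    \sum_{\substack{q\le r\\ q\ \text{odd}}} \frac{\mu^2(q)\, q}{\phi(q)} &= \sum_{\substack{q\le r\\ q\ \text{odd}}} \mu^2(q) \sum_{d|q} \frac{1}{\phi(d)} = \sum_{\substack{d\le r\\ d\ \text{odd}}} \frac{1}{\phi(d)} \sum_{\substack{q\le r\\ d|q\\ q\ \text{odd}}} \mu^2(q) \le \sum_{\substack{d\le r\\ d\ \text{odd}}} \frac{1}{2\phi(d)} \left(\frac{r}{d} + 1\right)\\
    &\le \frac{r}{2} \sum_{d\ \text{odd}} \frac{1}{\phi(d)\, d} + \frac{1}{2} \sum_{\substack{d\le r\\ d\ \text{odd}}} \frac{1}{\phi(d)} \le 0.64787\, r + \frac{\log r}{4} + 0.425,
    \end{aligned} \tag{C.18}$$
    where we are using (C.8) and (C.11).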

    Since we are on the subject of $\phi(q)$, let us also prove a simple lemma that we use at various points in the text to bound $q/\phi(q)$.

    Lemma C.2.2. For any $q\ge 1$ and any $r\ge \max(3, q)$,
    $$\frac{q}{\phi(q)} < z(r),$$
    where
    $$z(r) = e^{\gamma} \log\log r + \frac{2.50637}{\log\log r}. \tag{C.19}$$

    Proof. Since $z(r)$ is increasing for $r\ge 27$, the statement follows immediately for $q\ge 27$ by [RS62, Thm. 15]:
    $$\frac{q}{\phi(q)} < z(q) \le z(r).$$
    For $q < 27$, it is clear that $q/\phi(q) \le 2\cdot 3/(1\cdot 2) = 3$. By the arithmetic/geometric mean inequality, $z(t) \ge 2\sqrt{e^{\gamma}\cdot 2.50637} > 3$ for all $t > e$, and so the lemma holds for $q < 27$.
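Lemma C.2.2 is easy to test numerically on a finite range. The sketch below is a floating-point spot check under stated assumptions: the names `z`, `totients` and `check_lemma` are illustrative, Euler's constant is hard-coded, and only the tightest choice $r = \max(3, q)$ is tested.

```python
import math

EULER_GAMMA = 0.5772156649015329

def z(r):
    """The function z(r) from (C.19)."""
    u = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * u + 2.50637 / u

def totients(limit):
    """phi[n] for 0 <= n <= limit, computed with a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:                      # p is prime: no smaller prime touched it
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi

def check_lemma(limit):
    """True if q/phi(q) < z(max(3, q)) for every 1 <= q <= limit."""
    phi = totients(limit)
    return all(q / phi[q] < z(max(3, q)) for q in range(1, limit + 1))
```

A call such as `check_lemma(20000)` exercises both cases of the proof: small $q$, where the bound 3 suffices, and $q \ge 27$, where the estimate of [RS62] is needed.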

    Appendix D

    Checking small n by checking zeros of ζ(s)

    In order to show that every odd number $n \le N$ is the sum of three primes, it is enough to show, for some $M \le N$, that

    1. every even integer $4 \le m \le M$ can be written as the sum of two primes,

    2. the difference between any two consecutive primes $\le N$ is at most $M - 4$.

    (If we want to show that every odd number $n \le N$ is the sum of three odd primes, we just replace $M - 4$ by $M - 6$ in (2).) The best known result of type (1) is that of Oliveira e Silva, Herzog, and Pardi ([OeSHP14], $M = 4\cdot 10^{18}$). As for (2), it was proven in [HP13] for $M = 4\cdot 10^{18}$ and $N = 8.875694\cdot 10^{30}$ by a direct computation (valid even if we replace $M - 4$ by $M - 6$ in the statement of (2)).
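The reduction above can be illustrated on a toy scale. In the sketch below, all names are hypothetical and $N$ is tiny compared with the values in the text: $M$ is taken to be the largest gap between consecutive primes up to $N$ plus 4, so that condition (2) holds, and each odd $n$ is then written as $p + p' + p''$ using only two-prime decompositions of even numbers up to $M$.

```python
import bisect

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns the sorted list of primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for k in range(p * p, limit + 1, p):
                is_prime[k] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def check_reduction(N):
    """Map each odd n with 7 <= n <= N to a triple of primes summing to n."""
    primes = primes_up_to(N)
    prime_set = set(primes)
    max_gap = max(q - p for p, q in zip(primes, primes[1:]))
    M = max_gap + 4                          # smallest M satisfying condition (2)
    two_primes = {}                          # condition (1), checked for 4 <= m <= M
    for m in range(4, M + 1, 2):
        two_primes[m] = next((p, m - p) for p in primes if (m - p) in prime_set)
    decomposition = {}
    for n in range(7, N + 1, 2):
        p = primes[bisect.bisect_right(primes, n - 4) - 1]   # largest prime <= n - 4
        decomposition[n] = (p,) + two_primes[n - p]          # n - p is even and <= M
    return decomposition
```

With `N = 10_000` the table `two_primes` has only a handful of entries, yet it suffices for every odd $n$ in range. The same economy is what makes the combination of [OeSHP14] with a bound on prime gaps effective.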

    Alternatively, one can establish results of type (2) by means of numerical verifications of the Riemann hypothesis up to a certain height. This is a classical approach, followed in [RS75] and [Sch76], and later in [RS03]; we will use the version of (1) kindly provided by Ramaré in [Ramd]. We carry out this approach in full here, not because it is preferable to [HP13] (it is still based on computations, and it is slightly more indirect than [HP13]), but simply to show that one can establish what we need by a different route.

    A numerical verification of the Riemann hypothesis up to a certain height consists simply in checking that all (non-trivial) zeroes $z$ of the Riemann zeta function up to a height $H$ (meaning: $\Im(z) \le H$) lie on the critical line $\Re(z) = 1/2$.

    The height up to which the Riemann hypothesis has actually been fully verified is not a matter on which there is unanimity. The strongest claim in the literature is in [GD04], which states that the first $10^{13}$ zeroes of the Riemann zeta function lie on the critical line $\Re(z) = 1/2$. This corresponds to checking the Riemann hypothesis up to height $H = 2.44599\cdot 10^{12}$. It is unclear whether this computation was or could be easily made rigorous; as pointed out in [SD10, p. 2398], it has not been replicated yet.

    Before [GD04], the strongest results were those of the ZetaGrid distributed computing project led by S. Wedeniwski [Wed03]; the method followed in it was more traditional, and should allow rigorous verification involving interval arithmetic. Unfortunately, the results were never formally published. The statement that the ZetaGrid project verified the first $9\cdot 10^{11}$ zeroes (corresponding to $H = 2.419\cdot 10^{11}$) is often quoted (e.g., [Bom10, p. 29]); this is the point to which the project had got by the time of Gourdon and Demichel's announcement. Wedeniwski asserts in private communication that the project verified the first $10^{12}$ zeroes, and that the computation was double-checked (by the same method).

    The strongest claim prior to ZetaGrid was that of van de Lune ($H = 3.293\cdot 10^{9}$, first $10^{10}$ zeroes; unpublished). Recently, Platt [Plaa] checked the first $1.1\cdot 10^{11}$ zeroes ($H = 3.061\cdot 10^{10}$) rigorously, following a method essentially based on that in [Boo06a]. Note that [Plaa] uses interval arithmetic, which is highly desirable for floating-point computations.
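The correspondence between a number of zeroes and a height $H$ can be reproduced approximately from the main term of the Riemann–von Mangoldt formula, $N(H) \approx \frac{H}{2\pi}\log\frac{H}{2\pi} - \frac{H}{2\pi} + \frac{7}{8}$. The sketch below simply evaluates that main term at the heights quoted above; the dictionary `CLAIMS` restates the rounded figures from the text, and the names are illustrative.

```python
import math

def zero_count_main_term(H):
    """Main term of the Riemann-von Mangoldt formula for the number of zeroes up to height H."""
    x = H / (2 * math.pi)
    return x * math.log(x) - x + 7.0 / 8.0

# (height H, number of zeroes) as quoted in the text, in rounded form
CLAIMS = {
    "[GD04]": (2.44599e12, 1e13),
    "ZetaGrid": (2.419e11, 9e11),
    "[Plaa]": (3.061e10, 1.1e11),
    "van de Lune": (3.293e9, 1e10),
}

def relative_deviation(name):
    """Relative difference between the main term at H and the quoted zero count."""
    H, count = CLAIMS[name]
    return zero_count_main_term(H) / count - 1.0
```

The deviations are small in every case, with the largest for [Plaa], where the text quotes the zero count to only two significant digits.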

    Proposition D.0.3. Every odd integer $5 \le n \le n_0$ is the sum of three primes, where
    $$n_0 = \begin{cases}
    5.90698\cdot 10^{29} & \text{if [GD04] is used } (H = 2.44\cdot 10^{12}),\\
    6.15697\cdot 10^{28} & \text{if ZetaGrid results are used } (H = 2.419\cdot 10^{11}),\\
    1.23163\cdot 10^{27} & \text{if [Plaa] is used } (H = 3.061\cdot 10^{10}).
    \end{cases}$$

    Proof. For $n \le 4\cdot 10^{18} + 3$, this is immediate from [OeSHP14]. Let $4\cdot 10^{18} + 3 < n \le n_0$. We need to show that there is a prime $p$ in $[n - 4 - (n-4)/\Delta,\ n - 4]$, where $\Delta$ is large enough for $(n-4)/\Delta \le 4\cdot 10^{18} - 4$ to hold. We will then have that $4 \le n - p \le 4 + (n-4)/\Delta \le 4\cdot 10^{18}$. Since $n - p$ is even, [OeSHP14] will then imply that $n - p$ is the sum of two primes $p'$, $p''$, and so
    $$n = p + p' + p''.$$

    Since $n - 4 > 10^{11}$, the interval $[n - 4 - (n-4)/\Delta,\ n - 4]$ with $\Delta = 28314000$ must contain a prime [RS03]. This gives the solution for $n - 4 \le 1.1325\cdot 10^{26}$, since then $(n-4)/\Delta \le 4\cdot 10^{18} - 4$. Note $1.1325\cdot 10^{26} > e^{59}$.

    From here onwards, we use the tables in [Ramd] to find acceptable values of $\Delta$. Since $n - 4 \ge e^{59}$, we can choose
    $$\Delta = \begin{cases}
    52211882224 & \text{if [GD04] is used (case (a)),}\\
    13861486834 & \text{if ZetaGrid is used (case (b)),}\\
    307779681 & \text{if [Plaa] is used (case (c)).}
    \end{cases}$$
    This gives us $(n-4)/\Delta \le 4\cdot 10^{18} - 4$ for $n - 4 < e^{r_0}$, where $r_0 = 67$ in case (a), $r_0 = 66$ in case (b) and $r_0 = 62$ in case (c).

    If $n - 4 \ge e^{r_0}$, we can choose (again by [Ramd])
    $$\Delta = \begin{cases}
    146869130682 & \text{in case (a),}\\
    15392435100 & \text{in case (b),}\\
    307908668 & \text{in case (c).}
    \end{cases}$$
    This is enough for $n - 4 < e^{68}$ in case (a), and without further conditions for (b) or (c).

    Finally, if $n - 4 \ge e^{68}$ and we are in case (a), [Ramd] assures us that the choice
    $$\Delta = 147674531294$$
    is valid; we verify as well that $(n_0 - 4)/\Delta \le 4\cdot 10^{18} - 4$.
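The arithmetic in this proof is simple enough to recheck with exact integers. The sketch below only verifies the inequalities of the form $(n-4)/\Delta \le 4\cdot 10^{18} - 4$; it does not verify that the intervals contain primes, which is what [RS03] and [Ramd] supply. The table layout and function names are illustrative, and the values of $\Delta$, $r_0$ and $n_0$ are those stated above.

```python
import math

M = 4 * 10**18

def reach(delta):
    """Largest integer x = n - 4 with x / delta <= M - 4."""
    return delta * (M - 4)

RS03_DELTA = 28314000

# For each case: (exponent e such that the stage must cover n - 4 < exp(e), delta);
# exponent None marks the last stage, which must cover n - 4 <= n0 - 4.
STAGES = {
    "a": [(67, 52211882224), (68, 146869130682), (None, 147674531294)],
    "b": [(66, 13861486834), (None, 15392435100)],
    "c": [(62, 307779681), (None, 307908668)],
}
N0 = {"a": 590698 * 10**24, "b": 615697 * 10**23, "c": 123163 * 10**22}

def stage_ok(upper_exponent, delta, n0):
    """Check that delta is large enough for every n - 4 in the range of its stage."""
    limit = n0 - 4 if upper_exponent is None else math.exp(upper_exponent)
    return reach(delta) >= limit

def proposition_arithmetic_ok():
    """Check the [RS03] step and every stage of cases (a), (b), (c)."""
    ok = reach(RS03_DELTA) >= 11325 * 10**22 > math.exp(59)
    for case, stages in STAGES.items():
        ok = ok and all(stage_ok(e, d, N0[case]) for e, d in stages)
    return ok
```

One can also observe that each $n_0$ is essentially the final $\Delta$ times $4\cdot 10^{18}$, truncated to six significant digits. This makes clear that the size of $n_0$ is governed entirely by the quality of the short-interval result.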

    In other words, the rigorous results in [Plaa] are enough to show the result for all odd $n \le 10^{27}$. Of course, [HP13] is also more than enough, and gives stronger results than Prop. D.0.3.

    Bibliography

    [AS64] M. Abramowitz and I. A. Stegun. Handbook of mathematical functions with formulas, graphs, and mathematical tables, volume 55 of National Bureau of Standards Applied Mathematics Series. For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 1964.

    [BBO10] J. Bertrand, P. Bertrand, and J.-P. Ovarlez. Mellin transform. In A. D. Poularikas, editor, Transforms and applications handbook. CRC Press, Boca Raton, FL, 2010.

    [Bom74] E. Bombieri. Le grand crible dans la théorie analytique des nombres. Société Mathématique de France, Paris, 1974. Avec une sommaire en anglais, Astérisque, No. 18.

    [Bom10] E. Bombieri. The classical theory of zeta and L-functions. Milan J. Math., 78(1):11–59, 2010.

    [Bom76] E. Bombieri. On twin almost primes. Acta Arith., 28(2):177–193, 1975/76.

    [Boo06a] A. R. Booker. Artin's conjecture, Turing's method, and the Riemann hypothesis. Experiment. Math., 15(4):385–407, 2006.

    [Boo06b] A. R. Booker. Turing and the Riemann hypothesis. Notices Amer. Math. Soc., 53(10):1208–1211, 2006.

    [Bor56] K. G. Borodzkin. On the problem of I. M. Vinogradov's constant (in Russian). In Proc. Third All-Union Math. Conf., volume 1, page 3. Izdat. Akad. Nauk SSSR, Moscow, 1956.

    [Bou99] J. Bourgain. On triples in arithmetic progression. Geom. Funct. Anal., 9(5):968–984, 1999.

    [BR02] G. Bastien and M. Rogalski. Convexité, complète monotonie et inégalités sur les fonctions zêta et gamma, sur les fonctions des opérateurs de Baskakov et sur des fonctions arithmétiques. Canad. J. Math., 54(5):916–944, 2002.

    [But11] Y. Buttkewitz. Exponential sums over primes and the prime twin problem. Acta Math. Hungar., 131(1-2):46–58, 2011.

    [Che73] J. R. Chen. On the representation of a larger even integer as the sum of a prime and the product of at most two primes. Sci. Sinica, 16:157–176, 1973.

    [Che85] J. R. Chen. On the estimation of some trigonometrical sums and their application. Sci. Sinica Ser. A, 28(5):449–458, 1985.

    [Chu37] N. G. Chudakov. On the Goldbach problem. C. R. (Dokl.) Acad. Sci. URSS, n. Ser., 17:335–338, 1937.

    [Chu38] N. G. Chudakov. On the density of the set of even numbers which are not representable as the sum of two odd primes. Izv. Akad. Nauk SSSR Ser. Mat. 2, pages 25–40, 1938.

    [Chu47] N. G. Chudakov. Introduction to the theory of Dirichlet L-functions. OGIZ, Moscow-Leningrad, 1947. In Russian.

    [CW89] J. R. Chen and T. Z. Wang. On the Goldbach problem. Acta Math. Sinica, 32(5):702–718, 1989.

    [CW96] J. R. Chen and T. Z. Wang. The Goldbach problem for odd numbers. Acta Math. Sinica (Chin. Ser.), 39(2):169–174, 1996.

    [Dab96] H. Daboussi. Effective estimates of exponential sums over primes. In Analytic number theory, Vol. 1 (Allerton Park, IL, 1995), volume 138 of Progr. Math., pages 231–244. Birkhäuser Boston, Boston, MA, 1996.

    [Dav67] H. Davenport. Multiplicative number theory. Markham Publishing Co., Chicago, Ill., 1967. Lectures given at the University of Michigan, Winter Term.

    [dB81] N. G. de Bruijn. Asymptotic methods in analysis. Dover Publications Inc., New York, third edition, 1981.

    [Des08] R. Descartes. Œuvres de Descartes publiées par Charles Adam et Paul Tannery sous les auspices du Ministère de l'Instruction publique. Physico-mathematica. Compendium musicae. Regulae ad directionem ingenii. Recherche de la vérité. Supplément à la correspondance. X. Paris: Léopold Cerf. IV u. 691 S. 4°, 1908.

    [Des77] J.-M. Deshouillers. Sur la constante de Šnirel'man. In Séminaire Delange-Pisot-Poitou, 17e année (1975/76), Théorie des nombres: Fac. 2, Exp. No. G16, page 6. Secrétariat Math., Paris, 1977.

    [DEtRZ97] J.-M. Deshouillers, G. Effinger, H. te Riele, and D. Zinoviev. A complete Vinogradov 3-primes theorem under the Riemann hypothesis. Electron. Res. Announc. Amer. Math. Soc., 3:99–104, 1997.

    [Dic66] L. E. Dickson. History of the theory of numbers. Vol. I: Divisibility and primality. Chelsea Publishing Co., New York, 1966.

    [DLDDD+10] C. Daramy-Loirat, F. De Dinechin, D. Defour, M. Gallet, N. Gast, and Ch. Lauter. Crlibm, March 2010. version 1.0beta4.

    [DR01] H. Daboussi and J. Rivat. Explicit upper bounds for exponential sums over primes. Math. Comp., 70(233):431–447 (electronic), 2001.

    [Dre93] F. Dress. Fonction sommatoire de la fonction de Möbius. I. Majorations expérimentales. Experiment. Math., 2(2):89–98, 1993.

    [DS70] H. G. Diamond and J. Steinig. An elementary proof of the prime number theorem with a remainder term. Invent. Math., 11:199–258, 1970.

    [Eff99] G. Effinger. Some numerical implications of the Hardy and Littlewood analysis of the 3-primes problem. Ramanujan J., 3(3):239–280, 1999.

    [EM95] M. El Marraki. Fonction sommatoire de la fonction de Möbius. III. Majorations asymptotiques effectives fortes. J. Théor. Nombres Bordeaux, 7(2):407–433, 1995.

    [EM96] M. El Marraki. Majorations de la fonction sommatoire de la fonction $\mu(n)/n$. Univ. Bordeaux 1, preprint (96-8), 1996.

    [Est37] T. Estermann. On Goldbach's Problem: Proof that Almost all Even Positive Integers are Sums of Two Primes. Proc. London Math. Soc., S2-44(4):307–314, 1937.

    [FI98] J. Friedlander and H. Iwaniec. Asymptotic sieve for primes. Ann. of Math. (2), 148(3):1041–1065, 1998.

    [FI10] J. Friedlander and H. Iwaniec. Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2010.

    [For02] K. Ford. Vinogradov's integral and bounds for the Riemann zeta function. Proc. London Math. Soc. (3), 85(3):565–633, 2002.

    [GD04] X. Gourdon and P. Demichel. The first $10^{13}$ zeros of the Riemann zeta function, and zeros computation at very large height. http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf, 2004.

    [GR94] I. S. Gradshteyn and I. M. Ryzhik. Table of integrals, series, and products. Academic Press Inc., Boston, MA, fifth edition, 1994. Translation edited and with a preface by Alan Jeffrey.

    [GR96] A. Granville and O. Ramaré. Explicit bounds on exponential sums and the scarcity of squarefree binomial coefficients. Mathematika, 43(1):73–107, 1996.

    [Har66] G. H. Hardy. Collected papers of G. H. Hardy (Including Joint papers with J. E. Littlewood and others). Vol. I. Edited by a committee appointed by the London Mathematical Society. Clarendon Press, Oxford, 1966.

    [HB79] D. R. Heath-Brown. The fourth power moment of the Riemann zeta function. Proc. London Math. Soc. (3), 38(3):385–422, 1979.

    [HB85] D. R. Heath-Brown. The ternary Goldbach problem. Rev. Mat. Iberoamericana, 1(1):45–59, 1985.

    [HB11] H. Hong and Ch. W. Brown. QEPCAD B – Quantifier elimination by partial cylindrical algebraic decomposition, May 2011. version 1.62.

    [Hela] H. A. Helfgott. Major arcs for Goldbach's problem. Preprint. Available at arXiv:1203.5712.

    [Helb] H. A. Helfgott. Minor arcs for Goldbach's problem. Preprint. Available as arXiv:1205.5252.

    [Helc] H. A. Helfgott. The Ternary Goldbach Conjecture is true. Preprint. Available as arXiv:1312.7748.

    [Hel13a] H. Helfgott. La conjetura débil de Goldbach. Gac. R. Soc. Mat. Esp., 16(4), 2013.

    [Hel13b] H. A. Helfgott. The ternary Goldbach conjecture, 2013. Available at http://valuevar.wordpress.com/2013/07/02/the-ternary-goldbach-conjecture/.

    [Hel14a] H. A. Helfgott. La conjecture de Goldbach ternaire. Gaz. Math., (140):5–18, 2014. Translated by Margaret Bilu, revised by the author.

    [Hel14b] H. A. Helfgott. The ternary Goldbach problem. To appear in Proceedings of the International Congress of Mathematicians (Seoul, Korea, 2014), 2014.

    [HL22] G. H. Hardy and J. E. Littlewood. Some problems of 'Partitio numerorum'; III: On the expression of a number as a sum of primes. Acta Math., 44(1):1–70, 1922.

    [HP13] H. A. Helfgott and David J. Platt. Numerical verification of the ternary Goldbach conjecture up to $8.875\cdot 10^{30}$. Exp. Math., 22(4):406–409, 2013.

    [HR00] G. H. Hardy and S. Ramanujan. Asymptotic formulæ in combinatory analysis [Proc. London Math. Soc. (2) 17 (1918), 75–115]. In Collected papers of Srinivasa Ramanujan, pages 276–309. AMS Chelsea Publ., Providence, RI, 2000.

    [Hux72] M. N. Huxley. Irregularity in sifted sequences. J. Number Theory, 4:437–454, 1972.

    [IK04] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.

    [Kad] H. Kadiri. An explicit zero-free region for the Dirichlet L-functions. Preprint. Available as arXiv:0510570.

    [Kad05] H. Kadiri. Une région explicite sans zéros pour la fonction ζ de Riemann. Acta Arith., 117(4):303–339, 2005.

    [Kar93] A. A. Karatsuba. Basic analytic number theory. Springer-Verlag, Berlin, 1993. Translated from the second (1983) Russian edition and with a preface by Melvyn B. Nathanson.

    [Knu99] O. Knüppel. PROFIL/BIAS, February 1999. version 2.

    [Kor58] N. M. Korobov. Estimates of trigonometric sums and their applications. Uspehi Mat. Nauk, 13(4 (82)):185–192, 1958.

    [Lam08] B. Lambov. Interval arithmetic using SSE-2. In Reliable Implementation of Real Number Algorithms: Theory and Practice. International Seminar Dagstuhl Castle, Germany, January 8-13, 2006, volume 5045 of Lecture Notes in Computer Science, pages 102–113. Springer, Berlin, 2008.

    [Leh66] R. Sherman Lehman. On the difference $\pi(x) - \mathrm{li}(x)$. Acta Arith., 11:397–410, 1966.

    [LW02] M.-Ch. Liu and T. Wang. On the Vinogradov bound in the three primes Goldbach conjecture. Acta Arith., 105(2):133–175, 2002.

    [Mar41] K. K. Mardzhanishvili. On the proof of the Goldbach-Vinogradov theorem (in Russian). C. R. (Doklady) Acad. Sci. URSS (N.S.), 30(8):681–684, 1941.

    [McC84a] K. S. McCurley. Explicit estimates for the error term in the prime number theorem for arithmetic progressions. Math. Comp., 42(165):265–285, 1984.

    [McC84b] K. S. McCurley. Explicit zero-free regions for Dirichlet L-functions. J. Number Theory, 19(1):7–32, 1984.

    [Mon68] H. L. Montgomery. A note on the large sieve. J. London Math. Soc., 43:93–98, 1968.

    [Mon71] H. L. Montgomery. Topics in multiplicative number theory. Lecture Notes in Mathematics, Vol. 227. Springer-Verlag, Berlin, 1971.

    [MV73] H. L. Montgomery and R. C. Vaughan. The large sieve. Mathematika, 20:119–134, 1973.

    [MV74] H. L. Montgomery and R. C. Vaughan. Hilbert's inequality. J. London Math. Soc. (2), 8:73–82, 1974.

    [MV07] H. L. Montgomery and R. C. Vaughan. Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2007.

    [Ned06] N. S. Nedialkov. VNODE-LP: a validated solver for initial value problems in ordinary differential equations, July 2006. version 0.3.

    [OeSHP14] T. Oliveira e Silva, S. Herzog, and S. Pardi. Empirical verification of the even Goldbach conjecture, and computation of prime gaps, up to $4\cdot 10^{18}$. Math. Comp., 83:2033–2060, 2014.

    [OLBC10] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and Ch. W. Clark, editors. NIST handbook of mathematical functions. U.S. Department of Commerce National Institute of Standards and Technology, Washington, DC, 2010. With 1 CD-ROM (Windows, Macintosh and UNIX).

    [Olv58] F. W. J. Olver. Uniform asymptotic expansions of solutions of linear second-order differential equations for large values of a parameter. Philos. Trans. Roy. Soc. London. Ser. A, 250:479–517, 1958.

    [Olv59] F. W. J. Olver. Uniform asymptotic expansions for Weber parabolic cylinder functions of large orders. J. Res. Nat. Bur. Standards Sect. B, 63B:131–169, 1959.

    [Olv61] F. W. J. Olver. Two inequalities for parabolic cylinder functions. Proc. Cambridge Philos. Soc., 57:811–822, 1961.

    [Olv65] F. W. J. Olver. On the asymptotic solution of second-order differential equations having an irregular singularity of rank one, with an application to Whittaker functions. J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2:225–243, 1965.

    [Olv74] F. W. J. Olver. Asymptotics and special functions. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1974. Computer Science and Applied Mathematics.

    [Plaa] D. Platt. Computing $\pi(x)$ analytically. To appear in Math. Comp. Available as arXiv:1203.5712.

    [Plab] D. Platt. Numerical computations concerning GRH. Preprint. Available at arXiv:1305.3087.

    [Pla11] D. Platt. Computing degree 1 L-functions rigorously. PhD thesis, Bristol University, 2011.

    [Rama] O. Ramaré. Etat des lieux. Preprint. Available as http://math.univ-lille1.fr/~ramare/Maths/ExplicitJNTB.pdf.

    [Ramb] O. Ramaré. Explicit estimates on several summatory functions involving the Moebius function. To appear in Math. Comp.

    [Ramc] O. Ramaré. A sharp bilinear form decomposition for primes and Moebius function. Preprint. To appear in Acta Math. Sinica.

    [Ramd] O. Ramaré. Short effective intervals containing primes. Preprint.

    [Ram95] O. Ramaré. On Šnirel'man's constant. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 22(4):645–706, 1995.

    [Ram09] O. Ramaré. Arithmetical aspects of the large sieve inequality, volume 1 of Harish-Chandra Research Institute Lecture Notes. Hindustan Book Agency, New Delhi, 2009. With the collaboration of D. S. Ramana.

    [Ram10] O. Ramaré. On Bombieri's asymptotic sieve. J. Number Theory, 130(5):1155–1189, 2010.

    [Ram13] O. Ramaré. From explicit estimates for primes to explicit estimates for the Möbius function. Acta Arith., 157(4):365–379, 2013.

    [Ram14] O. Ramaré. Explicit estimates on the summatory functions of the Möbius function with coprimality restrictions. Acta Arith., 165(1):1–10, 2014.

    [Ros41] B. Rosser. Explicit bounds for some functions of prime numbers. Amer. J. Math., 63:211–232, 1941.

    [RR96] O. Ramaré and R. Rumely. Primes in arithmetic progressions. Math. Comp., 65(213):397–425, 1996.

    [RS62] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6:64–94, 1962.

    [RS75] J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. Math. Comp., 29:243–269, 1975. Collection of articles dedicated to Derrick Henry Lehmer on the occasion of his seventieth birthday.

    [RS03] O. Ramaré and Y. Saouter. Short effective intervals containing primes. J. Number Theory, 98(1):10–33, 2003.

    [RV83] H. Riesel and R. C. Vaughan. On sums of primes. Ark. Mat., 21(1):46–74, 1983.

    [Sao98] Y. Saouter. Checking the odd Goldbach conjecture up to $10^{20}$. Math. Comp., 67(222):863–866, 1998.

    [Sch33] L. Schnirelmann. Über additive Eigenschaften von Zahlen. Math. Ann., 107(1):649–690, 1933.

    [Sch76] L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. II. Math. Comp., 30(134):337–360, 1976.

    [SD10] Y. Saouter and P. Demichel. A sharp region where $\pi(x) - \mathrm{li}(x)$ is positive. Math. Comp., 79(272):2395–2405, 2010.

    [Sel91] A. Selberg. Lectures on sieves. In Collected papers, vol. II, pages 66–247. Springer Berlin, 1991.

    [Sha14] X. Shao. A density version of the Vinogradov three primes theorem. Duke Math. J., 163(3):489–512, 2014.

    [Shu92] F. H. Shu. The Cosmos. In Encyclopaedia Britannica, Macropaedia, volume 16, pages 762–795. Encyclopaedia Britannica, Inc., 15 edition, 1992.

    [Tao14] T. Tao. Every odd number greater than 1 is the sum of at most five primes. Math. Comp., 83(286):997–1038, 2014.

    [Tem10] N. M. Temme. Parabolic cylinder functions. In NIST Handbook of mathematical functions, pages 303–319. U.S. Dept. Commerce, Washington, DC, 2010.

    [Tru] T. S. Trudgian. An improved upper bound for the error in the zero-counting formulae for Dirichlet L-functions and Dedekind zeta-functions. Preprint.

    [Tuc11] W. Tucker. Validated numerics: A short introduction to rigorous computations. Princeton University Press, Princeton, NJ, 2011.

    [Tur53] A. M. Turing. Some calculations of the Riemann zeta-function. Proc. London Math. Soc. (3), 3:99–117, 1953.

    [TV03] N. M. Temme and R. Vidunas. Parabolic cylinder functions: examples of error bounds for asymptotic expansions. Anal. Appl. (Singap.), 1(3):265–288, 2003.

    [van37] J. G. van der Corput. Sur l'hypothèse de Goldbach pour presque tous les nombres pairs. Acta Arith., 2:266–290, 1937.

    [Vau77a] R. C. Vaughan. On the estimation of Schnirelman's constant. J. Reine Angew. Math., 290:93–108, 1977.

    [Vau77b] R.-C. Vaughan. Sommes trigonométriques sur les nombres premiers. C. R. Acad. Sci. Paris Sér. A-B, 285(16):A981–A983, 1977.

    [Vau80] R. C. Vaughan. Recent work in additive prime number theory. In Proceedings of the International Congress of Mathematicians (Helsinki, 1978), pages 389–394. Acad. Sci. Fennica, Helsinki, 1980.

    [Vau97] R. C. Vaughan. The Hardy-Littlewood method, volume 125 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, second edition, 1997.

    [Vin37] I. M. Vinogradov. A new method in analytic number theory (Russian). Tr. Mat. Inst. Steklova, 10:5–122, 1937.

    [Vin47] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers (Russian). Tr. Mat. Inst. Steklova, 23:3–109, 1947.

    [Vin54] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Interscience Publishers, London and New York, 1954. Translated, revised and annotated by K. F. Roth and Anne Davenport.

    [Vin58] I. M. Vinogradov. A new estimate of the function $\zeta(1 + it)$. Izv. Akad. Nauk SSSR. Ser. Mat., 22:161–164, 1958.

    [Vin04] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport. Reprint of the 1954 translation.

    [Wed03] S. Wedeniwski. ZetaGrid - Computational verification of the Riemann hypothesis. Conference in Number Theory in honour of Professor H. C. Williams, Banff, Alberta, Canada, May 2003.

    [Wei84] A. Weil. Number theory: An approach through history. From Hammurapi to Legendre. Birkhäuser Boston, Inc., Boston, MA, 1984.

    [Whi03] E. T. Whittaker. On the functions associated with the parabolic cylinder in harmonic analysis. Proc. London Math. Soc., 35:417–427, 1903.

    [Wig20] S. Wigert. Sur la théorie de la fonction $\zeta(s)$ de Riemann. Ark. Mat., 14:1–17, 1920.

    [Won01] R. Wong. Asymptotic approximations of integrals, volume 34 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001. Corrected reprint of the 1989 original.

    [Zin97] D. Zinoviev. On Vinogradov's constant in Goldbach's ternary problem. J. Number Theory, 65(2):334–358, 1997.

                                                                                                                        • A1 The decay of a Mellin transform
                                                                                                                        • A2 The difference +- in 2 norm
                                                                                                                        • A3 Norms involving +
                                                                                                                        • A4 Norms involving +
                                                                                                                        • A5 The -norm of +
                                                                                                                          • B Norms of Fourier transforms
                                                                                                                            • B1 The Fourier transform of 2
                                                                                                                            • B2 Bounds involving a logarithmic factor
                                                                                                                              • C Sums involving and
                                                                                                                                • C1 Sums over primes
                                                                                                                                • C2 Sums involving
                                                                                                                                  • D Checking small n by checking zeros of (s)

      Contents

      Preface
      Acknowledgements

      1 Introduction
        1.1 History and new developments
        1.2 The circle method: Fourier analysis on $\mathbb{Z}$
        1.3 The major arcs $\mathfrak{M}$
          1.3.1 What do we really know about L-functions and their zeros?
          1.3.2 Estimates of $f(\alpha)$ for $\alpha$ in the major arcs
        1.4 The minor arcs $\mathfrak{m}$
          1.4.1 Qualitative goals and main ideas
          1.4.2 Combinatorial identities
          1.4.3 Type I sums
          1.4.4 Type II or bilinear sums
        1.5 Integrals over the major and minor arcs
        1.6 Some remarks on computations

      2 Notation and preliminaries
        2.1 General notation
        2.2 Dirichlet characters and L functions
        2.3 Fourier transforms and exponential sums
        2.4 Mellin transforms
        2.5 Bounds on sums of $\mu$ and $\Lambda$
        2.6 Interval arithmetic and the bisection method

      I Minor arcs

      3 Introduction
        3.1 Results
        3.2 Comparison to earlier work
        3.3 Basic setup
          3.3.1 Vaughan’s identity
          3.3.2 An alternative route

      4 Type I sums
        4.1 Trigonometric sums
        4.2 Type I estimates
          4.2.1 Type I variations

      5 Type II sums
        5.1 The sum $S_1$: cancellation
          5.1.1 Reduction to a sum with $\mu$
          5.1.2 Explicit bounds for a sum with $\mu$
          5.1.3 Estimating the triple sum
        5.2 The sum $S_2$: the large sieve, primes and tails

      6 Minor-arc totals
        6.1 The smoothing function
        6.2 Contributions of different types
          6.2.1 Type I terms: $S_{I,1}$
          6.2.2 Type I terms: $S_{I,2}$
          6.2.3 Type II terms
        6.3 Adjusting parameters. Calculations
          6.3.1 First choice of parameters: $q \le y$
          6.3.2 Second choice of parameters
        6.4 Conclusion

      II Major arcs

      7 Major arcs: overview and results
        7.1 Results
        7.2 Main ideas

      8 The Mellin transform of the twisted Gaussian
        8.1 How to choose a smoothing function?
        8.2 The twisted Gaussian: overview and setup
          8.2.1 Relation to the existing literature
          8.2.2 General approach
        8.3 The saddle point
          8.3.1 The coordinates of the saddle point
          8.3.2 The direction of steepest descent
        8.4 The integral over the contour
          8.4.1 A simple contour
          8.4.2 Another simple contour
        8.5 Conclusions

      9 Explicit formulas
        9.1 A general explicit formula
        9.2 Sums and decay for the Gaussian
        9.3 The case of $\eta_*(t)$
        9.4 The case of $\eta_+(t)$
        9.5 A sum for $\eta_+(t)^2$
        9.6 A verification of zeros and its consequences

      III The integral over the circle

      10 The integral over the major arcs
        10.1 Decomposition of $S_\eta$ by characters
        10.2 The integral over the major arcs: the main term
        10.3 The $\ell_2$ norm over the major arcs
        10.4 The integral over the major arcs: conclusion

      11 Optimizing and adapting smoothing functions
        11.1 The symmetric smoothing function $\eta$
          11.1.1 The product $\eta(t)\eta(\rho - t)$
        11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

      12 The $\ell_2$ norm and the large sieve
        12.1 Variations on the large sieve for primes
        12.2 Bounding the quotient in the large sieve for primes

      13 The integral over the minor arcs
        13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds
        13.2 The minor-arc total

      14 Conclusion
        14.1 The $\ell_2$ norm over the major arcs: explicit version
        14.2 The total major-arc contribution
        14.3 The minor-arc total: explicit version
        14.4 Conclusion: proof of main theorem

      IV Appendices

      A Norms of smoothing functions
        A.1 The decay of a Mellin transform
        A.2 The difference $\eta_+ - \eta$ in $\ell_2$ norm
        A.3 Norms involving $\eta_+$
        A.4 Norms involving $\eta_+'$
        A.5 The $\ell_\infty$-norm of $\eta_+$

      B Norms of Fourier transforms
        B.1 The Fourier transform of $\eta_2''$
        B.2 Bounds involving a logarithmic factor

      C Sums involving $\Lambda$ and $\phi$
        C.1 Sums over primes
        C.2 Sums involving $\phi$

      D Checking small $n$ by checking zeros of $\zeta(s)$

      Preface

      ἐγγὺς δ’ ἦν τέλεος ὃ δὲ τὸ τρίτον ἧκε χ[αμᾶζε

      σὺν τῶι δ’ ἐξέφυγεν θάνατον καὶ κῆ[ρα μέλαιναν

      Hesiod (?), Ehoiai, fr. 76.21–2 Merkelbach and West

      The ternary Goldbach conjecture (or three-prime conjecture) states that every odd number $n$ greater than 5 can be written as the sum of three primes. The purpose of this book is to give the first full proof of this conjecture.

      The proof builds on the great advances made in the early 20th century by Hardy and Littlewood (1922) and Vinogradov (1937). Progress since then has been more gradual. In some ways, it was necessary to clear the board and start work using only the main existing ideas towards the problem, together with techniques developed elsewhere.

      Part of the aim has been to keep the exposition as accessible as possible, with an emphasis on qualitative improvements and new technical ideas that should be of use elsewhere. The main strategy was to give an analytic approach that is efficient, relatively clean, and, as it must be for this problem, explicit; the focus does not lie in optimizing explicit constants, or in performing calculations, necessary as these tasks are.

      Organization. In the introduction, after a summary of the history of the problem, we will go over a detailed outline of the proof. The rest of the book is divided into three parts, structured so that they can be read independently: the first two parts do not refer to each other, and the third part uses only the main results (clearly marked) of the first two parts.

      As is the case in most proofs involving the circle method, the problem is reduced to showing that a certain integral over the “circle” $\mathbb{R}/\mathbb{Z}$ is non-zero. The circle is divided into major arcs and minor arcs. In Part I – in some ways the technical heart of the proof – we will see how to give upper bounds on the integrand when $\alpha$ is in the minor arcs. Part II will provide rather precise estimates for the integrand when the variable $\alpha$ is in the major arcs. Lastly, Part III shows how to use these inputs as well as possible to estimate the integral.

      Each part and each chapter starts with a general discussion of the strategy and the main ideas involved. Some of the more technical bounds and computations are relegated to the appendices.


      Dependencies between the chapters

      [Figure: chapter dependency diagram. Nodes: 1 Introduction; 2 Notation and preliminaries; 3 Minor arcs: introduction; 4 Type I sums; 5 Type II sums; 6 Minor-arc totals; 7 Major arcs: overview; 8 Mellin transform of twisted Gaussian; 9 Explicit formulas; 10 The integral over the major arcs; 11 Smoothing functions and their use; 12 The $\ell_2$ norm and the large sieve; 13 The integral over the minor arcs; 14 Conclusion.]

      Acknowledgements

      The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary ranges, and also helped him with the usage of interval arithmetic. He is also deeply grateful to O. Ramaré, who, in reply to his requests, prepared and sent for publication several auxiliary results, and who otherwise provided much-needed feedback.

      The author is also much indebted to A. Booker, B. Green, R. Heath-Brown, H. Kadiri, D. Platt, T. Tao and M. Watkins for many discussions on Goldbach’s problem and related issues. Several historical questions became clearer due to the help of J. Brandes, K. Gong, R. Heath-Brown, Z. Silagadze, R. Vaughan and T. Wooley. Additional references were graciously provided by R. Bryant, S. Huntsman and I. Rezvyakova. Thanks are also due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.

      The introduction is largely based on the author’s article for the Proceedings of the 2014 ICM [Hel14b]. That article, in turn, is based in part on the informal note [Hel13b], which was published in Spanish translation ([Hel13a], translated by M. A. Morales and the author, and revised with the help of J. Cilleruelo and M. Helfgott) and in a French version ([Hel14a], translated by M. Bilu and revised by the author). The proof first appeared as a series of preprints: [Helb], [Hela], [Helc].

      Travel and other expenses were funded in part by the Adams Prize and the Philip Leverhulme Prize. The author’s work on the problem started at the Université de Montréal (CRM) in 2006; he is grateful to both the Université de Montréal and the École Normale Supérieure for providing pleasant working environments. During the last stages of the work, travel was partly covered by ANR Project Caesar No. ANR-12-BS01-0011.

      The present work would most likely not have been possible without free and publicly available software: SAGE, PARI, Maxima, gnuplot, VNODE-LP, PROFIL/BIAS, and, of course, LaTeX, Emacs, the gcc compiler and GNU/Linux in general. Some exploratory work was done in SAGE and Mathematica. Rigorous calculations used either D. Platt’s interval-arithmetic package (based in part on Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODE-LP.

      The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the author’s desktop computers at home and work. However, D. Platt’s computations [Plab] used a significant amount of resources, kindly donated to D. Platt and the author by several institutions. This crucial help was provided by MesoPSL (affiliated with the Observatoire de Paris and Paris Sciences et Lettres), Université de Paris VI/VII (UPMC - DSI - Pôle Calcul), University of Warwick (thanks to Bill Hart), University of Bristol, France Grilles (French National Grid Infrastructure, DIRAC national instance), Université de Lyon 1 and Université de Bordeaux 1. Both D. Platt and the author would like to thank the donating organizations, their technical staff, and all those who helped to make these resources available to them.

      Chapter 1

      Introduction

      The question we will discuss, or one similar to it, seems to have been first posed by Descartes, in a manuscript published only centuries after his death [Des08, p. 298]. Descartes states: “Sed & omnis numerus par fit ex uno vel duobus vel tribus primis” (“But also every even number is made out of one, two or three prime numbers.”¹) This statement comes in the middle of a discussion of sums of polygonal numbers, such as the squares.

      Statements on sums of primes and sums of values of polynomials (polygonal numbers, powers $n^k$, etc.) have since shown themselves to be much more than mere curiosities – and not just because they are often very difficult to prove. Whereas the study of sums of powers can rely on their algebraic structure, the study of sums of primes leads to the realization that, from several perspectives, the set of primes behaves much like the set of integers, or like a random set of integers. (It also leads to the realization that this is very hard to prove.)

      If, instead of the primes, we had a random set of odd integers $S$ whose density – an intuitive concept that can be made precise – equaled that of the primes, then we would expect to be able to write every odd number as a sum of three elements of $S$, and every even number as the sum of two elements of $S$. We would have to check by hand whether this is true for small odd and even numbers, but it is relatively easy to show that, after a long enough check, it would be very unlikely that there would be any exceptions left among the infinitely many cases left to check.

      The question, then, is in what sense we need the primes to be like a random set of integers; in other words, we need to know what we can prove about the regularities of the distribution of the primes. This is one of the main questions of analytic number theory; progress on it has been very slow and difficult.

      Fourier analysis expresses information on the distribution of a sequence in terms of frequencies. In the case of the primes, what may be called the main frequencies – those in the major arcs – correspond to the same kind of large-scale distribution that is encoded by L-functions, the family of functions to which the Riemann zeta function belongs. On some of the crucial questions on L-functions, the limits of our knowledge have barely budged in the last century. There is something relatively new now, namely, rigorous numerical data of non-negligible scope; still, such data is, by definition, finite, and, as a consequence, its range of applicability is very narrow. Thus, the real question in the major-arc regime is how to use well the limited information we do have on the large-scale distribution of the primes. As we will see, this requires delicate work on explicit asymptotic analysis and smoothing functions.

      ¹ Thanks are due to J. Brandes and R. Vaughan for a discussion on a possible ambiguity in the Latin wording. Descartes’ statement is mentioned (with a translation much like the one given here) in Dickson’s History [Dic66, Ch. XVIII].

      Outside the main frequencies – that is, in what are called the minor arcs – estimates based on L-functions no longer apply, and what is remarkable is that one can say anything meaningful on the distribution of the primes. Vinogradov was the first to give unconditional, non-trivial bounds, showing that there are no great irregularities in the minor arcs; this is what makes them “minor”. Here the task is to give sharper bounds than Vinogradov. It is in this regime that we can genuinely say that we learn a little more about the distribution of the primes, based on what is essentially an elementary and highly optimized analytic-combinatorial analysis of exponential sums, i.e., Fourier coefficients given by series (supported on the primes, in our case).

      The circle method reduces an additive problem – that is, a problem on sums, such as sums of primes, powers, etc. – to the estimation of an integral on the space of frequencies (the “circle” $\mathbb{R}/\mathbb{Z}$). In the case of the primes, as we have just discussed, we have precise estimates on the integrand on part of the circle (the major arcs), and upper bounds on the rest of the circle (the minor arcs). Putting them together efficiently to give an estimate on the integral is a delicate matter; we leave it for the last part, as it is really what is particular to our problem, as opposed to being of immediate general relevance to the study of the primes. As we shall see, estimating the integral well does involve using – and improving – general estimates on the variance of irregularities in the distribution of the primes, as given by the large sieve.

      In fact, one of the main general lessons of the proof is that there is a very close relationship between the circle method and the large sieve; we will use the large sieve not just as a tool – which we shall, incidentally, sharpen in certain contexts – but as a source for ideas on how to apply the circle method more effectively.

      This has been an attempt at a first look from above. Let us now undertake a more leisurely and detailed overview of the problem and its solution.

      1.1 History and new developments

      The history of the conjecture starts properly with Euler and his close friend, Christian Goldbach, both of whom lived and worked in Russia at the time of their correspondence – about a century after Descartes’ isolated statement. Goldbach, a man of many interests, is usually classed as a serious amateur; he seems to have awakened Euler’s passion for number theory, which would lead to the beginning of the modern era of the subject [Wei84, Ch. 3, §IV]. In a letter dated June 7, 1742, Goldbach made a conjectural statement on prime numbers, and Euler rapidly reduced it to the following conjecture, which, he said, Goldbach had already posed to him: every positive integer can be written as the sum of at most three prime numbers.

      We would now say “every integer greater than 1”, since we no longer consider 1 to be a prime number. Moreover, the conjecture is nowadays split into two:

      • the weak, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes;

      • the strong, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes.

      As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number $n$, then express $n - 3$ as the sum of two primes).
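
      The reduction from the binary to the ternary statement can be illustrated numerically. The following Python sketch is only an illustration on small inputs, not part of the proof; the helper names `is_prime`, `two_primes` and `three_primes_via_binary` are hypothetical and introduced here for the example.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def two_primes(m: int):
    """Return primes (p, q) with p + q == m, or None if the search finds none."""
    for p in range(2, m // 2 + 1):
        if is_prime(p) and is_prime(m - p):
            return (p, m - p)
    return None


def three_primes_via_binary(n: int):
    """For odd n > 5, write n = 3 + p + q from a two-prime sum for n - 3."""
    if n <= 5 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 5")
    pair = two_primes(n - 3)  # n - 3 is even and greater than 2
    return None if pair is None else (3, *pair)


if __name__ == "__main__":
    for n in (7, 9, 27, 101):
        print(n, three_primes_via_binary(n))
```

      The search succeeds only as far as the binary statement holds for $n - 3$; the sketch checks instances and proves nothing in general.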

      The strong conjecture remains out of reach. A short while ago – the first complete version appeared on May 13, 2013 – the author proved the weak Goldbach conjecture.

      Theorem 1.1.1. Every odd integer greater than 5 can be written as the sum of three primes.

      In 1937, I. M. Vinogradov proved [Vin37] that the conjecture is true for all odd numbers $n$ larger than some constant $C$. (Hardy and Littlewood had proved the same statement under the assumption of the Generalized Riemann Hypothesis, which we shall have the chance to discuss later.)

      It is clear that a computation can verify the conjecture only for $n \le c$, $c$ a constant: computations have to be finite. What can make a result coming from analytic number theory be valid only for $n \ge C$?

      An analytic proof, generally speaking, gives us more than just existence. In this kind of problem, it gives us more than the possibility of doing something (here, writing an integer $n$ as the sum of three primes). It gives us a rigorous estimate for the number of ways in which this something is possible; that is, it shows us that this number of ways equals

      \[ \text{main term} + \text{error term}, \tag{1.1} \]

      where the main term is a precise quantity $f(n)$, and the error term is something whose absolute value is at most another precise quantity $g(n)$. If $f(n) > g(n)$, then (1.1) is non-zero, i.e., we will have shown the existence of a way to write our number as the sum of three primes.

      (Since what we truly care about is existence, we are free to weigh different ways of writing $n$ as the sum of three primes however we wish – that is, we can decide that some primes “count” twice or thrice as much as others, and that some do not count at all.)

      Typically, after much work, we succeed in obtaining (1.1) with $f(n)$ and $g(n)$ such that $f(n) > g(n)$ asymptotically, that is, for $n$ large enough. To give a highly simplified example: if, say, $f(n) = n^2$ and $g(n) = 100 n^{3/2}$, then $f(n) > g(n)$ for $n > C$, where $C = 10^4$, and so the number of ways (1.1) is positive for $n > C$.
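
      As a quick check of the simplified example (a worked inequality added for illustration, not part of the proof), the threshold $C = 10^4$ follows on dividing both sides by $n^{3/2}$, which is positive for $n > 0$:

```latex
\[
  f(n) > g(n)
  \iff n^{2} > 100\, n^{3/2}
  \iff n^{1/2} > 100
  \iff n > 10^{4} .
\]
```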

      We want a moderate value of $C$, that is, a $C$ small enough that all cases $n \le C$ can be checked computationally. To ensure this, we must make the error term bound $g(n)$ as small as possible. This is our main task. A secondary (and sometimes neglected) possibility is to rig the weights so as to make the main term $f(n)$ larger in comparison to $g(n)$; this can generally be done only up to a certain point, but is nonetheless very helpful.

      As we said, the first unconditional proof that odd numbers $n \ge C$ can be written as the sum of three primes is due to Vinogradov. Analytic bounds fall into several categories, or stages; quite often, successive versions of the same theorem will go through successive stages.

      1. An ineffective result shows that a statement is true for some constant $C$, but gives no way to determine what the constant $C$ might be. Vinogradov’s first proof of his theorem (in [Vin37]) is like this: it shows that there exists a constant $C$ such that every odd number $n > C$ is the sum of three primes, yet gives us no hope of finding out what the constant $C$ might be.² Many proofs of Vinogradov’s result in textbooks are also of this type.

      2. An effective, but not explicit, result shows that a statement is true for some unspecified constant $C$ in a way that makes it clear that a constant $C$ could in principle be determined following and reworking the proof with great care. Vinogradov’s later proof ([Vin47], translated in [Vin54]) is of this nature. As Chudakov [Chu47, §IV.2] pointed out, the improvement on [Vin37] given by Mardzhanishvili [Mar41] already had the effect of making the result effective.³

      3. An explicit result gives a value of $C$. According to [Chu47, p. 201], the first explicit version of Vinogradov’s result was given by Borozdkin in his unpublished doctoral dissertation, written under the direction of Vinogradov (1939): $C = \exp(\exp(\exp(41.96)))$. Such a result is, by definition, also effective. Borozdkin later [Bor56] gave the value $C = e^{e^{16.038}}$, though he does not seem to have published the proof. The best – that is, smallest – value of $C$ known before the present work was that of Liu and Wang [LW02]: $C = 2 \cdot 10^{1346}$.

      4. What we may call an efficient proof gives a reasonable value for $C$ – in our case, a value small enough that checking all cases up to $C$ is feasible.

      How far were we from an efficient proof? That is, what sort of computation could ever be feasible? The situation was paradoxical: the conjecture was known above an explicit $C$, but $C = 2 \cdot 10^{1346}$ is so large that it could not be said that the problem could be attacked by any foreseeable computational means within our physical universe. (A truly brute-force verification up to $C$ takes at least $C$ steps; a cleverer verification takes well over $\sqrt{C}$ steps. The number of picoseconds since the beginning of the universe is less than $10^{30}$, whereas the number of protons in the observable universe is currently estimated at $\sim 10^{80}$ [Shu92]; this limits the number of steps that can be taken in any currently imaginable computer, even if it were to do parallel processing on an astronomical scale.) Thus, the only way forward was a series of drastic improvements in the mathematical, rather than computational, side.
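
      The orders of magnitude in this argument can be compared directly. The following Python sketch is a back-of-the-envelope illustration only; it assumes, generously, that every proton performs one step per picosecond, and the variable names are introduced here for the example.

```python
from math import log10

# Figures quoted in the text, treated as rough orders of magnitude.
LOG10_C = log10(2) + 1346   # C = 2 * 10^1346 (Liu and Wang)
LOG10_PICOSECONDS = 30      # fewer than 10^30 picoseconds so far
LOG10_PROTONS = 80          # about 10^80 protons

# Generous budget (an assumption): one step per proton per picosecond.
log10_budget = LOG10_PICOSECONDS + LOG10_PROTONS

# Even a verification needing only sqrt(C) steps is far out of reach.
log10_sqrt_C = LOG10_C / 2

print(f"log10(steps available) <= {log10_budget}")
print(f"log10(sqrt(C))         ~= {log10_sqrt_C:.2f}")
print(f"shortfall: about {log10_sqrt_C - log10_budget:.0f} orders of magnitude")
```

      By contrast, a bound of the form $C = 10^{27}$ lies well inside this crude budget, although the step count alone says nothing about how the check is actually organized.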

      I gave a proof with $C = 10^{29}$ in May 2013. Since D. Platt and I had verified the conjecture for all odd numbers up to $n \le 8.8 \cdot 10^{30}$ by computer [HP13], this established the conjecture for all odd numbers $n$.

      ² Here, as is often the case in ineffective results in analytic number theory, the underlying issue is that of Siegel zeros, which are believed not to exist, but have not been shown not to; the strongest bounds on (i.e., against) such zeros are ineffective, and so are all of the many results using such estimates.

      ³ The proof in [Mar41] combined the bounds in [Vin37] with a more careful accounting of the effect of the single possible Siegel zero within range.

      (In December 2013, I reduced $C$ to $10^{27}$. The verification of the ternary Goldbach conjecture up to $n \le 10^{27}$ can be done on a home computer over a weekend, as of the time of writing (2014). It must be said that this uses the verification of the binary Goldbach conjecture for $n \le 4 \cdot 10^{18}$ [OeSHP14], which itself required computational resources far outside the home-computing range. Checking the conjecture up to $n \le 10^{27}$ was not even the main computational task that needed to be accomplished to establish the Main Theorem – that task was the finite verification of zeros of L-functions in [Plab], a general-purpose computation that should be useful elsewhere.)

      What was the strategy of the proof? The basic framework is the one pioneered by Hardy and Littlewood for a variety of problems – namely, the circle method, which, as we shall see, is an application of Fourier analysis over $\mathbb{Z}$. (There are other, later routes to Vinogradov’s result; see [HB85], [FI98] and especially the recent work [Sha14], which avoids using anything about zeros of L-functions inside the critical strip.) Vinogradov’s proof, like much of the later work on the subject, was based on a detailed analysis of exponential sums, i.e., Fourier transforms over $\mathbb{Z}$. So is the proof that we will sketch.

      At the same time, the distance between $2 \cdot 10^{1346}$ and $10^{27}$ is such that we cannot hope to get to $10^{27}$ (or any other reasonable constant) by fine-tuning previous work. Rather, we must work from scratch, using the basic outline in Vinogradov’s original proof and other, initially unrelated, developments in analysis and number theory (notably, the large sieve). Merely improving constants will not do; rather, we must do qualitatively better than previous work (by non-constant factors) if we are to have any chance to succeed. It is on these qualitative improvements that we will focus.

      It is only fair to review some of the progress made between Vinogradov’s time and ours. Here we will focus on results; later, we will discuss some of the progress made in the techniques of proof. See [Dic66, Ch. XVIII] for the early history of the problem (before Hardy and Littlewood); see R. Vaughan’s ICM lecture notes on the ternary Goldbach problem [Vau80] for some further details on the history up to 1978.

      In 1933, Schnirelmann proved [Sch33] that every integer $n > 1$ can be written as the sum of at most $K$ primes for some unspecified constant $K$. (This pioneering work is now considered to be part of the early history of additive combinatorics.) In 1969, Klimov gave an explicit value for $K$ (namely, $K = 6 \cdot 10^9$); he later improved the constant to $K = 115$ (with G. Z. Piltay and T. A. Sheptickaja) and $K = 55$. Later, there were results by Vaughan [Vau77a] ($K = 27$), Deshouillers [Des77] ($K = 26$) and Riesel-Vaughan [RV83] ($K = 19$).

      Ramaré showed in 1995 that every even number $n > 1$ can be written as the sum of at most 6 primes [Ram95]. In 2012, Tao proved [Tao14] that every odd number $n > 1$ is the sum of at most 5 primes.

      There have been other avenues of attack towards the strong conjecture. Using ideas close to those of Vinogradov’s, Chudakov [Chu37], [Chu38], Estermann [Est37] and van der Corput [van37] proved (independently from each other) that almost every even number (meaning: all elements of a subset of density 1 in the even numbers) can be written as the sum of two primes. In 1973, J.-R. Chen showed [Che73] that every even number $n$ larger than a constant $C$ can be written as the sum of a prime number and the product of at most two primes ($n = p_1 + p_2$ or $n = p_1 + p_2 p_3$). Incidentally, J.-R. Chen himself, together with T.-Z. Wang, was responsible for the best bounds on $C$ (for ternary Goldbach) before Liu and Wang: $C = \exp(\exp(11.503)) < 4 \cdot 10^{43000}$ [CW89] and $C = \exp(\exp(9.715)) < 6 \cdot 10^{7193}$ [CW96].

      Matters are different if one assumes the Generalized Riemann Hypothesis (GRH). A careful analysis [Eff99] of Hardy and Littlewood’s work [HL22] gives that every odd number $n \ge 1.24 \cdot 10^{50}$ is the sum of three primes if GRH is true.⁴ According to [Eff99], the same statement with $n \ge 10^{32}$ was proven in the unpublished doctoral dissertation of B. Lucke, a student of E. Landau’s, in 1926. Zinoviev [Zin97] improved this to $n \ge 10^{20}$. A computer check ([DEtRZ97]; see also [Sao98]) showed that the conjecture is true for $n < 10^{20}$, thus completing the proof of the ternary Goldbach conjecture under the assumption of GRH. What was open until now was, of course, the problem of giving an unconditional proof.

      1.2 The circle method: Fourier analysis on $\mathbb{Z}$

It is common for a first course on Fourier analysis to focus on functions over the reals satisfying $f(x) = f(x+1)$, or, what is the same, functions $f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$. Such a function (unless it is fairly pathological) has a Fourier series converging to it; this is just the same as saying that $f$ has a Fourier transform $\hat{f} : \mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(n) = \int_{\mathbb{R}/\mathbb{Z}} f(\alpha) e(-\alpha n)\, d\alpha$ and satisfying $f(\alpha) = \sum_{n \in \mathbb{Z}} \hat{f}(n) e(\alpha n)$ (Fourier inversion theorem), where $e(t) = e^{2\pi i t}$.

In number theory, we are especially interested in functions $f : \mathbb{Z} \to \mathbb{C}$. Then things are exactly the other way around: provided that $f$ decays reasonably fast as $n \to \pm\infty$ (or becomes 0 for $n$ large enough), $f$ has a Fourier transform $\hat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ defined by $\hat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ and satisfying $f(n) = \int_{\mathbb{R}/\mathbb{Z}} \hat{f}(\alpha) e(\alpha n)\, d\alpha$. (Highbrow talk: we already knew that $\mathbb{Z}$ is the Fourier dual of $\mathbb{R}/\mathbb{Z}$, and so, of course, $\mathbb{R}/\mathbb{Z}$ is the Fourier dual of $\mathbb{Z}$.) "Exponential sums" (or "trigonometrical sums", as in the title of [Vin54]) are sums of the form $\sum_n f(n) e(-\alpha n)$; of course, the "circle" in "circle method" is just a name for $\mathbb{R}/\mathbb{Z}$. (To see an actual circle in the complex plane, look at the image of $\mathbb{R}/\mathbb{Z}$ under the map $\alpha \mapsto e(\alpha)$.)
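The transform pair for functions on $\mathbb{Z}$ can be tried out numerically. The following is a minimal sketch, assuming a finitely supported $f$ stored as a dictionary; the function names and the sample values are illustrative and not taken from the text. The integral over $\mathbb{R}/\mathbb{Z}$ is replaced by a Riemann sum over equally spaced points, which reproduces $f(n)$ up to rounding once the number of points exceeds the spread of the indices involved.

```python
import cmath
import math


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t), as in the text."""
    return cmath.exp(2j * math.pi * t)


def fourier_transform(f: dict[int, float], alpha: float) -> complex:
    """f_hat(alpha) = sum_n f(n) e(-alpha n) for a finitely supported f."""
    return sum(value * e(-alpha * n) for n, value in f.items())


def invert(f: dict[int, float], n: int, grid: int) -> complex:
    """Riemann sum for the integral of f_hat(alpha) e(alpha n) over R/Z.

    With `grid` equally spaced points alpha = j/grid, the sum picks out
    exactly those m in the support of f with m = n modulo grid.
    """
    total = sum(fourier_transform(f, j / grid) * e(j * n / grid) for j in range(grid))
    return total / grid


# Hypothetical sample function on Z (values chosen arbitrarily).
f = {-2: 1.0, 0: 3.0, 1: -0.5, 5: 2.0}
recovered = {n: invert(f, n, grid=16) for n in range(-3, 7)}
```

Here `grid=16` is enough because all differences between the indices queried and the support of `f` are smaller than 16 in absolute value.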

The study of the Fourier transform $\hat{f}$ is relevant to additive problems in number theory, i.e., questions on the number of ways of writing $n$ as a sum of $k$ integers of a particular form. Why? One answer could be that $\hat{f}$ gives us information about the "randomness" of $f$; if $f$ were the characteristic function of a random set, then $\hat{f}(\alpha)$ would be very small outside a sharp peak at $\alpha = 0$.

We can also give a more concrete and immediate answer. Recall that, in general, the Fourier transform of a convolution equals the product of the transforms; over $\mathbb{Z}$,

⁴ In fact, Hardy, Littlewood and Effinger use an assumption somewhat weaker than GRH: they assume that Dirichlet L-functions have no zeroes satisfying $\Re(s) \ge \theta$, where $\theta < 3/4$ is arbitrary. (We will review Dirichlet L-functions in a minute.)


this means that for the additive convolution

$$(f * g)(n) = \sum_{\substack{m_1, m_2 \in \mathbb{Z} \\ m_1 + m_2 = n}} f(m_1) g(m_2),$$

the Fourier transform satisfies the simple rule

$$\widehat{f * g}(\alpha) = \hat{f}(\alpha) \cdot \hat{g}(\alpha).$$
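The product rule for the transform of an additive convolution is easy to check on small examples. The sketch below assumes finitely supported functions stored as dictionaries; the helper names and sample values are illustrative only.

```python
import cmath
import math


def transform(f: dict[int, float], alpha: float) -> complex:
    """f_hat(alpha) = sum_n f(n) e(-alpha n), with e(t) = exp(2*pi*i*t)."""
    return sum(v * cmath.exp(-2j * math.pi * alpha * n) for n, v in f.items())


def convolve(f: dict[int, float], g: dict[int, float]) -> dict[int, float]:
    """Additive convolution: (f*g)(n) = sum over m1 + m2 = n of f(m1) g(m2)."""
    out: dict[int, float] = {}
    for m1, a in f.items():
        for m2, b in g.items():
            out[m1 + m2] = out.get(m1 + m2, 0.0) + a * b
    return out


# Arbitrary sample data and an arbitrary point on the circle.
f = {1: 2.0, 3: -1.0, 4: 0.5}
g = {0: 1.0, 2: 3.0}
alpha = 0.3137

lhs = transform(convolve(f, g), alpha)          # transform of the convolution
rhs = transform(f, alpha) * transform(g, alpha)  # product of the transforms
```

The two values agree up to floating-point rounding for any choice of `alpha`.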

We can see right away from this that $(f * g)(n)$ can be non-zero only if $n$ can be written as $n = m_1 + m_2$ for some $m_1$, $m_2$ such that $f(m_1)$ and $g(m_2)$ are non-zero. Similarly, $(f * g * h)(n)$ can be non-zero only if $n$ can be written as $n = m_1 + m_2 + m_3$ for some $m_1$, $m_2$, $m_3$ such that $f(m_1)$, $g(m_2)$ and $h(m_3)$ are all non-zero. This suggests that, to study the ternary Goldbach problem, we define $f_1, f_2, f_3 : \mathbb{Z} \to \mathbb{C}$ so that they take non-zero values only at the primes.

Hardy and Littlewood defined $f_1(n) = f_2(n) = f_3(n) = 0$ for $n$ non-prime (and also for $n \le 0$), and $f_1(n) = f_2(n) = f_3(n) = (\log n) e^{-n/x}$ for $n$ prime (where $x$ is a parameter to be fixed later). Here the factor $e^{-n/x}$ is there to provide "fast decay", so that everything converges; as we will see later, Hardy and Littlewood's choice of $e^{-n/x}$ (rather than some other function of fast decay) comes across in hindsight as being very clever, though not quite best-possible. (Their "choice" was, to some extent, not a choice, but an artifact of their version of the circle method, which was framed in terms of power series, not in terms of exponential sums with arbitrary smoothing functions.) The term $\log n$ is there for technical reasons; in essence, it makes sense to put it there because a random integer around $n$ has a chance of about $1/(\log n)$ of being prime.

We can see that $(f_1 * f_2 * f_3)(n) \ne 0$ if and only if $n$ can be written as the sum of three primes. Our task is then to show that $(f_1 * f_2 * f_3)(n)$ (i.e., $(f * f * f)(n)$) is non-zero for every odd $n$ larger than a constant $C \sim 10^{27}$. Since the transform of a convolution equals a product of transforms,

$$(f_1 * f_2 * f_3)(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f_1 * f_2 * f_3}(\alpha) e(\alpha n)\, d\alpha = \int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha) e(\alpha n)\, d\alpha. \tag{1.2}$$

Our task is thus to show that the integral $\int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha) e(\alpha n)\, d\alpha$ is non-zero.
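For small $n$, the triple convolution can simply be computed by brute force. The following is a minimal sketch using the Hardy-Littlewood weights $(\log p) e^{-p/x}$ described above; the function names and the value of $x$ used in the checks are illustrative assumptions, and the brute-force double loop is of course unrelated to how the proof proceeds.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def triple_convolution(n: int, x: float) -> float:
    """(f*f*f)(n) for f(p) = (log p) exp(-p/x) at primes p and f = 0 elsewhere."""
    ps = primes_up_to(n)
    weight = {p: math.log(p) * math.exp(-p / x) for p in ps}
    total = 0.0
    for p1 in ps:
        for p2 in ps:
            p3 = n - p1 - p2
            if p3 in weight:
                total += weight[p1] * weight[p2] * weight[p3]
    return total
```

Since all weights are positive, the value is non-zero exactly when $n$ is a sum of three primes; for instance it vanishes at $n = 5$ and is positive at $n = 7 = 2 + 2 + 3$.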

As it happens, $\hat{f}(\alpha)$ is particularly large when $\alpha$ is close to a rational with small denominator. Moreover, for such $\alpha$, it turns out we can actually give rather precise estimates for $\hat{f}(\alpha)$. Define $\mathfrak{M}$ (called the set of major arcs) to be a union of narrow arcs around the rationals with small denominator:

$$\mathfrak{M} = \bigcup_{q \le r} \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left(\frac{a}{q} - \frac{1}{qQ},\ \frac{a}{q} + \frac{1}{qQ}\right),$$

where $Q$ is a constant times $x/r$, and $r$ will be set later. (This is a slight simplification: the major-arc set we will actually use in the course of the proof will be a little different,


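due to a distinction between odd and even $q$.) We can write

$$\int_{\mathbb{R}/\mathbb{Z}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha) e(\alpha n)\, d\alpha = \int_{\mathfrak{M}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha) e(\alpha n)\, d\alpha + \int_{\mathfrak{m}} (\hat{f}_1 \hat{f}_2 \hat{f}_3)(\alpha) e(\alpha n)\, d\alpha, \tag{1.3}$$

where $\mathfrak{m}$ is the complement $(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}$ (called minor arcs).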
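The simplified definition of the major arcs translates directly into a membership test. This is a minimal sketch in exact rational arithmetic; the function name and the small values of $r$ and $Q$ in the checks are hypothetical, and the odd/even distinction used in the actual proof is ignored.

```python
from fractions import Fraction
from math import gcd
from typing import Optional


def major_arc_center(alpha: Fraction, r: int, Q: Fraction) -> Optional[Fraction]:
    """Return a/q with q <= r, (a, q) = 1 and |alpha - a/q| < 1/(qQ) on R/Z.

    Returns None when alpha lies on the minor arcs. Distances are measured
    on the circle, so points just below 1 are close to 0/1.
    """
    alpha = alpha % 1
    for q in range(1, r + 1):
        for a in range(q):
            if gcd(a, q) != 1:
                continue
            diff = abs(alpha - Fraction(a, q))
            distance = min(diff, 1 - diff)
            if distance < Fraction(1, q) / Q:
                return Fraction(a, q)
    return None
```

With, say, `r = 5` and `Q = 100`, the point $1/3 + 1/1000$ lands on the arc around $1/3$, while $3/7$ is on the minor arcs.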

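Now, we simply do not know how to give precise estimates for $\hat{f}(\alpha)$ when $\alpha$ is in $\mathfrak{m}$. However, as Vinogradov realized, one can give reasonable upper bounds on $|\hat{f}(\alpha)|$ for $\alpha \in \mathfrak{m}$. This suggests the following strategy: show that

$$\int_{\mathfrak{m}} |\hat{f}_1(\alpha)| |\hat{f}_2(\alpha)| |\hat{f}_3(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat{f}_1(\alpha) \hat{f}_2(\alpha) \hat{f}_3(\alpha) e(\alpha n)\, d\alpha. \tag{1.4}$$

By (1.2) and (1.3), this will imply immediately that $(f_1 * f_2 * f_3)(n) > 0$, and so we will be done.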
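The bookkeeping behind (1.2) and (1.3) can be illustrated at toy scale. The sketch below is only a sanity check under stated assumptions: it uses the sharply truncated weight $f(p) = \log p$ for primes $p \le X$ (not the smoothed weights of the text), tiny hypothetical parameters `X`, `R`, `Q`, and a uniform grid on the circle fine enough that the Riemann sum reproduces $(f * f * f)(n)$ exactly. It checks that the major-arc and minor-arc parts add up to the direct count; it says nothing about the relative sizes of the two parts, which at this scale are not representative.

```python
import cmath
import math
from math import gcd

X = 60          # primes up to X carry weight log p (illustrative size only)
GRID = 256      # more than 3*X points, so the Riemann sum below is exact
R, Q = 4, 10.0  # hypothetical small parameters defining the arcs

PRIMES = [p for p in range(2, X + 1)
          if all(p % d for d in range(2, math.isqrt(p) + 1))]


def e(t: float) -> complex:
    return cmath.exp(2j * math.pi * t)


def f_hat(alpha: float) -> complex:
    """f_hat(alpha) = sum over p <= X of (log p) e(-alpha p)."""
    return sum(math.log(p) * e(-alpha * p) for p in PRIMES)


def in_major_arcs(alpha: float) -> bool:
    """Membership in the simplified major arcs, for alpha in [0, 1)."""
    for q in range(1, R + 1):
        for a in range(q + 1):  # a = q handles points just below 1 when q = 1
            if gcd(a, q) == 1 and abs(alpha - a / q) < 1 / (q * Q):
                return True
    return False


def split_integral(n: int) -> tuple[complex, complex]:
    """Major-arc and minor-arc parts of the integral of f_hat^3 e(alpha n)."""
    major = minor = 0j
    for j in range(GRID):
        alpha = j / GRID
        term = f_hat(alpha) ** 3 * e(alpha * n) / GRID
        if in_major_arcs(alpha):
            major += term
        else:
            minor += term
    return major, minor


def direct_count(n: int) -> float:
    """(f*f*f)(n) computed straight from the definition of convolution."""
    w = {p: math.log(p) for p in PRIMES}
    return sum(w[p1] * w[p2] * w[n - p1 - p2]
               for p1 in PRIMES for p2 in PRIMES if n - p1 - p2 in w)


major, minor = split_integral(41)
```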

The name of circle method is given to the study of additive problems by means of Fourier analysis over $\mathbb{Z}$, and, in particular, to the use of a subdivision of the circle $\mathbb{R}/\mathbb{Z}$ into major and minor arcs to estimate the integral of a Fourier transform. There was a "circle" already in Hardy and Ramanujan's work [HR00], but the subdivision into major and minor arcs is due to Hardy and Littlewood, who also applied their method to a wide variety of additive problems. (Hence "the Hardy-Littlewood method" as an alternative name for the circle method.) For instance, before working on the ternary Goldbach conjecture, they studied the question of whether every $n > C$ can be written as the sum of a bounded number of $k$th powers (Waring's problem). In fact, they used a subdivision into major and minor arcs to study Waring's problem, and not for the ternary Goldbach problem: they had no minor-arc bounds for ternary Goldbach, and their use of GRH had the effect of making every $\alpha \in \mathbb{R}/\mathbb{Z}$ yield to a major-arc treatment.

Vinogradov worked with finite exponential sums, i.e., $f_i$ compactly supported. From today's perspective, it is clear that there are applications (such as ours) in which it can be more important for $f_i$ to be smooth than compactly supported; still, Vinogradov's simplifications were an incentive to further developments. In the case of the ternary Goldbach problem, his key contribution consisted in the fact that he could give bounds on $\hat{f}(\alpha)$ for $\alpha$ in the minor arcs without using GRH.

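An important note: in the case of the binary Goldbach conjecture, the method fails at (1.4), and not before; if our understanding of the actual value of $\hat{f}_i(\alpha)$ is at all correct, it is simply not true in general that

$$\int_{\mathfrak{m}} |\hat{f}_1(\alpha)| |\hat{f}_2(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \hat{f}_1(\alpha) \hat{f}_2(\alpha) e(\alpha n)\, d\alpha.$$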

Let us see why this is not surprising. Set $f_1 = f_2 = f_3 = f$ for simplicity, so that we have the integral of the square $(\hat{f}(\alpha))^2$ for the binary problem, and the integral of the cube $(\hat{f}(\alpha))^3$ for the ternary problem. Squaring, like cubing, amplifies the peaks of $\hat{f}(\alpha)$, which are at the rationals of small denominator and their immediate neighborhoods (the major arcs); however, cubing amplifies the peaks much more than squaring. This is why, even though the arcs making up $\mathfrak{M}$ are very narrow, $\int_{\mathfrak{M}} (\hat{f}(\alpha))^3 e(\alpha n)\, d\alpha$ is larger than $\int_{\mathfrak{m}} |\hat{f}(\alpha)|^3\, d\alpha$; that explains the name major arcs: they are not large, but they give the major part of the contribution. In contrast, squaring amplifies the peaks less, and this is why the absolute value of $\int_{\mathfrak{M}} \hat{f}(\alpha)^2 e(\alpha n)\, d\alpha$ is in general smaller than $\int_{\mathfrak{m}} |\hat{f}(\alpha)|^2\, d\alpha$. As nobody knows how to prove a precise estimate (and, in particular, lower bounds) on $\hat{f}(\alpha)$ for $\alpha \in \mathfrak{m}$, the binary Goldbach conjecture is still very much out of reach.

To prove the ternary Goldbach conjecture, it is enough to estimate both sides of (1.4) for carefully chosen $f_1$, $f_2$, $f_3$, and compare them. This is our task from now on.

1.3 The major arcs $\mathfrak{M}$

1.3.1 What do we really know about L-functions and their zeros?

Before we start, let us give a very brief review of basic analytic number theory (in the sense of, say, [Dav67]). A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (In other words, $\chi(n) = \chi(n+q)$ for all $n$, $\chi(ab) = \chi(a)\chi(b)$ for all $a$, $b$, and $\chi(n) = 0$ for $(n,q) \ne 1$.) A Dirichlet L-series is defined by

$$L(s,\chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}$$

for $\Re(s) > 1$, and by analytic continuation for $\Re(s) \le 1$. (The Riemann zeta function $\zeta(s)$ is the L-function for the trivial character, i.e., the character $\chi$ such that $\chi(n) = 1$ for all $n$.) Taking logarithms and then derivatives, we see that

$$-\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^{\infty} \chi(n) \Lambda(n) n^{-s} \tag{1.5}$$

for $\Re(s) > 1$, where $\Lambda$ is the von Mangoldt function ($\Lambda(n) = \log p$ if $n$ is some prime power $p^{\alpha}$, $\alpha \ge 1$, and $\Lambda(n) = 0$ otherwise).
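Identity (1.5) can be checked numerically well inside the half-plane of absolute convergence. The sketch below is an illustration under stated assumptions: it uses the non-trivial character modulo 4, the real point $s = 3$, and truncation of all series at 5000 terms, none of which comes from the text. It relies on the standard termwise derivative $L'(s,\chi) = -\sum_n \chi(n) (\log n) n^{-s}$.

```python
import math


def chi4(n: int) -> int:
    """The non-trivial Dirichlet character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^a with a >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            while n % d == 0:
                n //= d
            return math.log(d) if n == 1 else 0.0
    return math.log(n)  # n itself is prime


S = 3.0          # a real point with Re(s) > 1 (illustrative choice)
N_TERMS = 5000   # truncation point for all three series

L_value = sum(chi4(n) * n ** -S for n in range(1, N_TERMS + 1))
L_prime = -sum(chi4(n) * math.log(n) * n ** -S for n in range(1, N_TERMS + 1))

lhs = -L_prime / L_value
rhs = sum(chi4(n) * von_mangoldt(n) * n ** -S for n in range(1, N_TERMS + 1))
```

At $s = 3$ the truncated tails are of size roughly $(\log N)/N^2$, so the two sides agree to several decimal places.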

Dirichlet introduced his characters and L-series so as to study primes in arithmetic progressions. In general, and after some work, (1.5) allows us to restate many sums over the primes (such as our Fourier transforms $\hat{f}(\alpha)$) as sums over the zeros of $L(s,\chi)$.

A non-trivial zero of $L(s,\chi)$ is a zero of $L(s,\chi)$ such that $0 < \Re(s) < 1$. (The other zeros are called trivial because we know where they are, namely, at negative integers and, in some cases, also on the line $\Re(s) = 0$. In order to eliminate all zeros on $\Re(s) = 0$ outside $s = 0$, it suffices to assume that $\chi$ is primitive; a primitive character modulo $q$ is one that is not induced by (i.e., not the restriction of) any character modulo $d \mid q$, $d < q$.)

The Generalized Riemann Hypothesis for Dirichlet L-functions is the statement that, for every Dirichlet character $\chi$, every non-trivial zero of $L(s,\chi)$ satisfies $\Re(s) = 1/2$. Of course, the Generalized Riemann Hypothesis (GRH), like the Riemann Hypothesis, which is the special case of $\chi$ trivial, remains unproven. Thus, if we want to prove unconditional statements, we need to make do with partial results towards GRH. Two kinds of such results have been proven:


• Zero-free regions. Ever since the late nineteenth century (Hadamard, de la Vallée-Poussin) we have known that there are hourglass-shaped regions (more precisely, of the shape $\frac{c}{\log t} \le \sigma \le 1 - \frac{c}{\log t}$, where $c$ is a constant and where we write $s = \sigma + it$) outside which non-trivial zeros cannot lie. Explicit values for $c$ are known [McC84b], [Kad05], [Kad]. There is also the Vinogradov-Korobov region [Kor58], [Vin58], which is broader asymptotically but narrower in most of the practical range (see [For02], however).

• Finite verifications of GRH. It is possible to (ask a computer to) prove small, finite fragments of GRH, in the sense of verifying that all non-trivial zeros of a given finite set of L-functions with imaginary part less than some constant $H$ lie on the critical line $\Re(s) = 1/2$. Such verifications go back to Riemann, who checked the first few zeros of $\zeta(s)$. Large-scale, rigorous computer-based verifications are now a possibility.

Most work in the literature follows the first alternative, though [Tao14] did use a finite verification of RH (i.e., GRH for the trivial character). Unfortunately, zero-free regions seem too narrow to be useful for the ternary Goldbach problem. Thus, we are left with the second alternative.

In coordination with the present work, Platt [Plab] verified that all zeros $s$ of L-functions for characters $\chi$ with modulus $q \le 300000$ satisfying $|\Im(s)| \le H_q$ lie on the line $\Re(s) = 1/2$, where

• $H_q = 10^8/q$ for $q$ odd, and

• $H_q = \max(10^8/q,\ 200 + 7.5 \cdot 10^7/q)$ for $q$ even.
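The two cases for $H_q$ can be written as a single small function. This is only a transcription of the formulas above; the function name is an illustrative choice.

```python
def platt_height(q: int) -> float:
    """Height H_q up to which GRH was verified for characters of modulus q."""
    if not 1 <= q <= 300_000:
        raise ValueError("the verification described covers 1 <= q <= 300000 only")
    if q % 2 == 1:
        return 1e8 / q
    return max(1e8 / q, 200 + 7.5e7 / q)
```

For even $q$, the second term in the maximum takes over once $q$ exceeds 125000, which keeps $H_q$ above 200 throughout the range.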

This was a medium-large computation, taking a few hundreds of thousands of core-hours on a parallel computer. It used interval arithmetic for the sake of rigor; we will later discuss what this means.

The choice to use a finite verification of GRH, rather than zero-free regions, had consequences on the manner in which the major and minor arcs had to be chosen. As we shall see, such a verification can be used to give very precise bounds on the major arcs, but also forces us to define them so that they are narrow and their number is constant. To be precise: the major arcs were defined around rationals $a/q$ with $q \le r$, $r = 300000$; moreover, as will become clear, the fact that $H_q$ is finite will force their width to be bounded by $c_0 r/(qx)$, where $c_0$ is a constant (say $c_0 = 8$).

1.3.2 Estimates of $\hat{f}(\alpha)$ for $\alpha$ in the major arcs

Recall that we want to estimate sums of the type $\hat{f}(\alpha) = \sum f(n) e(-\alpha n)$, where $f(n)$ is something like $(\log n)\eta(n/x)$ for $n$ equal to a prime, and 0 otherwise; here $\eta : \mathbb{R} \to \mathbb{C}$ is some function of fast decay, such as Hardy and Littlewood's choice,

$$\eta(t) = \begin{cases} e^{-t} & \text{for } t \ge 0, \\ 0 & \text{for } t < 0. \end{cases}$$


Let us modify this just a little: we will actually estimate

$$S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{1.6}$$

where $\Lambda$ is the von Mangoldt function (as in (1.5)). The use of $\alpha$ rather than $-\alpha$ is just a bow to tradition, as is the use of the letter $S$ (for "sum"); however, the use of $\Lambda(n)$ rather than just plain $\log p$ does actually simplify matters.
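For small $x$, the sum (1.6) can be evaluated directly. The sketch below is illustrative only: it uses a Gaussian weight $e^{-t^2/2}$ as the default $\eta$, truncates the sum where that weight is negligible, and takes a tiny value of $x$; the function names and the `cutoff` parameter are assumptions made for the example.

```python
import cmath
import math
from typing import Callable


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^a with a >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            while n % d == 0:
                n //= d
            return math.log(d) if n == 1 else 0.0
    return math.log(n)


def smoothed_sum(alpha: float, x: float,
                 eta: Callable[[float], float] = lambda t: math.exp(-t * t / 2),
                 cutoff: float = 12.0) -> complex:
    """S_eta(alpha, x) = sum_n Lambda(n) e(alpha n) eta(n/x), for n <= cutoff*x.

    The truncation is harmless for the default Gaussian weight, which is
    about exp(-72) at t = 12; other weights may need a different cutoff.
    """
    total = 0j
    for n in range(2, int(cutoff * x) + 1):
        lam = von_mangoldt(n)
        if lam:
            total += lam * cmath.exp(2j * math.pi * alpha * n) * eta(n / x)
    return total
```

By the triangle inequality, $|S_\eta(\alpha, x)| \le S_\eta(0, x)$ for a non-negative weight, which gives a quick sanity check; comparing the size at a few values of `alpha` also gives a feel for where the sum is large.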

The function $\eta$ here is sometimes called a smoothing function or simply a smoothing. It will indeed be helpful for it to be smooth on $(0, \infty)$, but, in principle, it need not even be continuous. (Vinogradov's work implicitly uses, in effect, the "brutal truncation" $1_{[0,1]}(t)$, defined to be 1 when $t \in [0,1]$ and 0 otherwise; that would be fine for the minor arcs, but, as it will become clear, it is a bad idea as far as the major arcs are concerned.)

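Assume $\alpha$ is on a major arc, meaning that we can write $\alpha = a/q + \delta/x$ for some $a/q$ ($q$ small) and some $\delta$ (with $|\delta|$ small). We can write $S_\eta(\alpha, x)$ as a linear combination

$$S_\eta(\alpha, x) = \sum_\chi c_\chi S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) + \text{tiny error term}, \tag{1.7}$$

where

$$S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = \sum_n \Lambda(n) \chi(n) e(\delta n/x) \eta(n/x). \tag{1.8}$$

In (1.7), $\chi$ runs over primitive Dirichlet characters of moduli $d \mid q$, and $c_\chi$ is small ($|c_\chi| \le \sqrt{d}/\varphi(q)$).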

Why are we expressing the sums $S_\eta(\alpha, x)$ in terms of the sums $S_{\eta,\chi}(\delta/x, x)$, which look more complicated? The argument has become $\delta/x$, whereas before it was $\alpha$. Here $\delta$ is relatively small: smaller than the constant $c_0 r$, in our setup. In other words, $e(\delta n/x)$ will go around the circle a bounded number of times as $n$ goes from 1 up to a constant times $x$ (by which time $\eta(n/x)$ has become small, because $\eta$ is of fast decay). This makes the sums much easier to estimate.

To estimate the sums $S_{\eta,\chi}$, we will use L-functions, together with one of the most common tools of analytic number theory, the Mellin transform. This transform is essentially a Laplace transform with a change of variables, and a Laplace transform, in turn, is a Fourier transform taken on a vertical line in the complex plane. For $f$ of fast enough decay, the Mellin transform $F = Mf$ of $f$ is given by

$$F(s) = \int_0^\infty f(t) t^s \frac{dt}{t};$$

we can express $f$ in terms of $F$ by the Mellin inversion formula

$$f(t) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} F(s) t^{-s}\, ds$$

for any $\sigma$ within an interval. We can thus express $e(\delta t)\eta(t)$ in terms of its Mellin transform $F_\delta$ and then use (1.5) to express $S_{\eta,\chi}$ in terms of $F_\delta$ and $L'(s,\chi)/L(s,\chi)$; shifting the integral in the Mellin inversion formula to the left, we obtain what is known in analytic number theory as an explicit formula:

$$S_{\eta,\chi}(\delta/x, x) = [\hat{\eta}(-\delta) x] - \sum_\rho F_\delta(\rho) x^\rho + \text{tiny error term}.$$

Here the term between brackets appears only for $\chi$ trivial. In the sum, $\rho$ goes over all non-trivial zeros of $L(s,\chi)$, and $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$. (The tiny error term comes from a sum over the trivial zeros of $L(s,\chi)$.) We will obtain the estimate we desire if we manage to show that the sum over $\rho$ is small.
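The Mellin transform defined above can be approximated numerically for real $s$. The sketch below is an illustration under stated assumptions: it substitutes $t = e^u$ and applies the trapezoid rule in $u$, with integration limits and step count chosen by hand for rapidly decaying weights. It is compared against two standard closed forms, $M(e^{-t})(s) = \Gamma(s)$ and $M(e^{-t^2/2})(s) = 2^{s/2 - 1}\Gamma(s/2)$; complex $s$ and the twist by $e(\delta t)$ are not handled.

```python
import math
from typing import Callable


def mellin(f: Callable[[float], float], s: float,
           lo: float = -40.0, hi: float = 4.0, steps: int = 20000) -> float:
    """Approximate F(s) = int_0^inf f(t) t^s dt/t for real s >= 1.

    After t = exp(u), the integral becomes int f(exp(u)) exp(s*u) du over
    the real line; the default limits suit weights that are negligible for
    t beyond exp(4), roughly 55.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(math.exp(u)) * math.exp(s * u)
    return total * h


def exponential(t: float) -> float:
    return math.exp(-t)


def gaussian(t: float) -> float:
    return math.exp(-t * t / 2)
```

For example, `mellin(gaussian, 3.0)` should be close to $\sqrt{2}\,\Gamma(3/2) = \sqrt{\pi/2}$.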

The point is this: if we verify GRH for $L(s,\chi)$ up to imaginary part $H$, i.e., if we check that all zeroes $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le H$ satisfy $\Re(\rho) = 1/2$, we have $|x^\rho| = \sqrt{x}$. In other words, $x^\rho$ is very small (compared to $x$). However, for any $\rho$ whose imaginary part has absolute value greater than $H$, we know next to nothing about its real part, other than $0 \le \Re(\rho) \le 1$. (Zero-free regions are notoriously weak for $\Im(\rho)$ large; we will not use them.) Hence, our only chance is to make sure that $F_\delta(\rho)$ is very small when $|\Im(\rho)| \ge H$.

This has to be true for both $\delta$ very small (including the case $\delta = 0$) and for $\delta$ not so small ($|\delta|$ up to $c_0 r/q$, which can be large because $r$ is a large constant). How can we choose $\eta$ so that $F_\delta(\rho)$ is very small in both cases for $\tau = \Im(\rho)$ large?

The method of stationary phase is useful as an exploratory tool here. In brief, it suggests (and can sometimes prove) that the main contribution to the integral

$$F_\delta(s) = \int_0^\infty e(\delta t) \eta(t) t^s \frac{dt}{t} \tag{1.9}$$

can be found where the phase of the integrand has derivative 0. This happens when $t = -\tau/2\pi\delta$ (for $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$); the contribution is then a moderate factor times $\eta(-\tau/2\pi\delta)$. In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small ($|\delta| \ge 8$, say), $F_\delta(\sigma + i\tau)$ behaves like $\eta(-\tau/2\pi\delta)$; if $\delta$ is small ($|\delta| < 8$), then $F_\delta$ behaves like $F_0$, which is the Mellin transform $M\eta$ of $\eta$. Here is our goal, then: the decay of $\eta(t)$ as $|t| \to \infty$ should be as fast as possible, and the decay of the transform $M\eta(\sigma + i\tau)$ should also be as fast as possible.

This is a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function $\eta$ that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.

What does "extremely rapidly" mean here? It means (as Hardy himself proved) "faster than any exponential $e^{-Ct}$". Thus, Hardy and Littlewood's choice $\eta(t) = e^{-t}$ seems essentially optimal at first sight.

However, it is not optimal. We can choose $\eta$ so that $M\eta$ decreases exponentially (with a constant $C$ somewhat worse than for $\eta(t) = e^{-t}$), but $\eta$ decreases faster than exponentially. This is a particularly appealing possibility because it is $t/|\delta|$, and not so much $t$, that risks being fairly small. (To be explicit: say we check GRH for characters of modulus $q$ up to $H_q \sim 50 \cdot c_0 r/q \ge 50|\delta|$. Then we only know that $|\tau/2\pi\delta| \gtrsim 8$. So, for $\eta(t) = e^{-t}$, $\eta(-\tau/2\pi\delta)$ may be as large as $e^{-8}$, which is not negligible. Indeed, since this term will be multiplied later by other terms, $e^{-8}$ is simply not small enough. On the other hand, we can assume that $H_q \ge 200$ (say), and so $M\eta(s) \sim e^{-(\pi/2)|\tau|}$ is completely negligible, and will remain negligible even if we replace $\pi/2$ by a somewhat smaller constant.)
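The arithmetic behind the "$e^{-8}$ is not small enough" remark is quick to reproduce. The sketch below compares the exponential weight with a weight of faster-than-exponential decay at the point $|\tau/2\pi\delta| = 50/2\pi$; the Gaussian $e^{-t^2/2}$ is used here simply as one example of such a weight.

```python
import math

# |tau / (2 pi delta)| when |tau| = 50 |delta|: just under 8.
ratio = 50 / (2 * math.pi)

exponential_value = math.exp(-ratio)          # eta(t) = exp(-t): a few times 1e-4
gaussian_value = math.exp(-ratio ** 2 / 2)    # eta(t) = exp(-t^2/2): below 1e-13

# Mellin-side decay exp(-(pi/2)|tau|) at |tau| = 200, for comparison.
mellin_side = math.exp(-(math.pi / 2) * 200)
```

The gap of roughly ten orders of magnitude between the two weights at the same point is what makes the faster-decaying choice attractive.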

We shall take $\eta(t) = e^{-t^2/2}$ (that is, the Gaussian). This is not the only possible choice, but it is in some sense natural. It is easy to show that the Mellin transform $F_\delta$ for $\eta(t) = e^{-t^2/2}$ is a multiple of what is called a parabolic cylinder function $U(a, z)$ with imaginary values for $z$. There are plenty of estimates on parabolic cylinder functions in the literature, but mostly for $a$ and $z$ real, in part because that is one of the cases occurring most often in applications. There are some asymptotic expansions and estimates for $U(a, z)$, $a$, $z$ general, due to Olver [Olv58], [Olv59], [Olv61], [Olv65], but unfortunately they come without fully explicit error terms for $a$ and $z$ within our range of interest. (The same holds for [TV03].)

In the end, I derived bounds for $F_\delta$ using the saddle-point method. (The method of stationary phase, which we used to choose $\eta$, seems to lead to error terms that are too large.) The saddle-point method consists, in brief, in changing the contour of an integral to be bounded (in this case, (1.9)) so as to minimize the maximum of the integrand. (To use a metaphor in [dB81]: find the lowest mountain pass.)

Here we strive to get clean bounds, rather than the best possible constants. Consider the case $k = 0$ of Corollary 8.0.2: with $k = 0$, it states the following. For $s = \sigma + i\tau$ with $\sigma \in [0, 1]$ and $|\tau| \ge \max(100, 4\pi^2|\delta|)$, we obtain that the Mellin transform $F_\delta$ of $\eta(t)e(\delta t)$ with $\eta(t) = e^{-t^2/2}$ satisfies

$$|F_\delta(s + k)| + |F_\delta((1 - s) + k)| \le \begin{cases} 3.001\, e^{-0.1065 \left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\ 3.286\, e^{-0.1598|\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2. \end{cases} \tag{1.10}$$

Similar bounds hold for $\sigma$ in other ranges, thus giving us estimates on the Mellin transform $F_\delta$ for $\eta(t) = t^k e^{-t^2/2}$ and $\sigma$ in the critical range $[0, 1]$. (We could do a little better if we knew the value of $\sigma$, but, in our applications, we do not, once we leave the range in which GRH has been checked. We will give a bound (Theorem 8.0.1) that does take $\sigma$ into account, and also reflects and takes advantage of the fact that there is a transitional region around $|\tau| \sim (3/2)(\pi\delta)^2$; in practice, however, we will use Cor. 8.0.2.)

A moment's thought shows that we can also use (1.10) to deal with the Mellin transform of $\eta(t)e(\delta t)$ for any function of the form $\eta(t) = e^{-t^2/2} g(t)$ (or, more generally, $\eta(t) = t^k e^{-t^2/2} g(t)$), where $g(t)$ is any band-limited function. By a band-limited function, we could mean a function whose Fourier transform is compactly supported; while that is a plausible choice, it turns out to be better to work with functions that are band-limited with respect to the Mellin transform, in the sense of being of the form

$$g(t) = \int_{-R}^{R} h(r) t^{-ir}\, dr,$$

where $h : \mathbb{R} \to \mathbb{C}$ is supported on a compact interval $[-R, R]$, with $R$ not too large (say $R = 200$). What happens is that the Mellin transform of the product $e^{-t^2/2} g(t) e(\delta t)$ is a convolution of the Mellin transform $F_\delta(s)$ of $e^{-t^2/2} e(\delta t)$ (estimated in (1.10)) and that of $g(t)$ (supported in $[-R, R]$); the effect of the convolution is just to delay decay of $F_\delta(s)$ by, at most, a shift by $y \mapsto y - R$.

We wish to estimate $S_{\eta,\chi}(\delta/x, x)$ for several functions $\eta$. This motivates us to derive an explicit formula (§) general enough to work with all the weights $\eta(t)$ we will work with, while being also completely explicit, and free of any integrals that may be tedious to evaluate.

Once that is done, and once we consider the input provided by Platt's finite verification of GRH up to $H_q$, we obtain simple bounds for different weights.

For $\eta(t) = e^{-t^2/2}$, $x \ge 10^8$, $\chi$ a primitive character of modulus $q \le r = 300000$, and any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$, we obtain

$$S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = I_{q=1} \cdot \hat{\eta}(-\delta) x + E \cdot x, \tag{1.11}$$

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and

$$|E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left(\frac{650400}{\sqrt{q}} + 112\right). \tag{1.12}$$

Here $\hat{\eta}$ stands for the Fourier transform from $\mathbb{R}$ to $\mathbb{R}$ normalized as follows: $\hat{\eta}(t) = \int_{-\infty}^{\infty} e(-xt)\eta(x)\, dx$. Thus, $\hat{\eta}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ (self-duality of the Gaussian).
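To get a feel for the sizes in (1.11) and (1.12), the right-hand sides can be evaluated directly. The sketch below merely transcribes the two formulas; the function names are illustrative, and the range checks mirror the hypotheses $x \ge 10^8$ and $q \le 300000$ stated above.

```python
import math


def major_arc_error_bound(x: float, q: int) -> float:
    """Right-hand side of (1.12): a bound on |E| for the Gaussian weight."""
    if x < 1e8 or not 1 <= q <= 300_000:
        raise ValueError("(1.12) is stated for x >= 10^8 and 1 <= q <= 300000")
    return 4.306e-22 + (650400 / math.sqrt(q) + 112) / math.sqrt(x)


def gaussian_hat(delta: float) -> float:
    """eta_hat(-delta) = sqrt(2 pi) exp(-2 pi^2 delta^2) for eta(t) = exp(-t^2/2)."""
    return math.sqrt(2 * math.pi) * math.exp(-2 * math.pi ** 2 * delta ** 2)
```

For instance, at $x = 10^{30}$ and $q = 1$ the bound is about $6.5 \cdot 10^{-10}$, dominated by the $650400/\sqrt{qx}$ term, while the main term $\hat{\eta}(-\delta)$ is of size $\sqrt{2\pi}$ at $\delta = 0$.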

This is one of the main results of Part II; see §7.1. Similar bounds are also proven there for $\eta(t) = t^2 e^{-t^2/2}$, as well as for a weight of type $\eta(t) = t e^{-t^2/2} g(t)$, where $g(t)$ is a band-limited function, and also for a weight $\eta$ defined by a multiplicative convolution. The conditions on $q$ (namely, $q \le r = 300000$) and $\delta$ are what we expected from the outset.

Thus concludes our treatment of the major arcs. This is arguably the easiest part of the proof; it was actually what I left for the end, as I was fairly confident it would work out. Minor-arc estimates are more delicate; let us now examine them.

1.4 The minor arcs $\mathfrak{m}$

1.4.1 Qualitative goals and main ideas

What kind of bounds do we need? What is there in the literature?

We wish to obtain upper bounds on $|S_\eta(\alpha, x)|$ for some weight $\eta$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ not very close to a rational with small denominator. Every $\alpha$ is close to some rational $a/q$; what we are looking for is a bound on $|S_\eta(\alpha, x)|$ that decreases rapidly when $q$ increases.

Moreover, we want our bound to decrease rapidly when $\delta$ increases, where $\alpha = a/q + \delta/x$. In fact, the main terms in our bound will be decreasing functions of $\max(1, |\delta|/8) \cdot q$. (Let us write $\delta_0 = \max(2, |\delta|/4)$ from now on.) This will allow our bound to be good enough outside narrow major arcs, which will get narrower and narrower as $q$ increases; that is precisely the kind of major arcs we were presupposing in our major-arc bounds.


It would be possible to work with narrow major arcs that become narrower as $q$ increases simply by allowing $q$ to be very large (close to $x$), and assigning each angle to the fraction closest to it. This is, in fact, the common procedure. However, this makes matters more difficult, in that we would have to minimize at the same time the factors in front of terms $x/q$, $x/\sqrt{q}$, etc., and those in front of terms $q$, $\sqrt{qx}$, and so on. (These terms are being compared to the trivial bound $x$.) Instead, we choose to strive for a direct dependence on $\delta$ throughout; this will allow us to cap $q$ at a much lower level, thus making terms such as $q$ and $\sqrt{qx}$ negligible. (This choice has been taken elsewhere in applications of the circle method, but, strangely, seems absent from previous work on the ternary Goldbach conjecture.)

How good must our bounds be? Since the major-arc bounds are valid only for $q \le r = 300000$ and $|\delta| \le 4r/q$, we cannot afford even a single factor of $\log x$ (or any other function tending to $\infty$ as $x \to \infty$) in front of terms such as $x/\sqrt{q|\delta_0|}$: a factor like that would make the term larger than the trivial bound $x$ if $q|\delta_0|$ is equal to a constant ($r$, say) and $x$ is very large. Apparently, there was no such "log-free bound" with explicit constants in the literature, even though such bounds were considered to be in principle feasible, and even though previous work ([Che85], [Dab96], [DR01], [Tao14]) had gradually decreased the number of factors of $\log x$. (In limited ranges for $q$, there were log-free bounds without explicit constants; see [Dab96], [Ram10]. The estimate in [Vin54, Thm. 2a, 2b] was almost log-free, but not quite. There were also bounds [Kar93], [But11] that used L-functions, and thus were not really useful in a truly minor-arc regime.)

It also seemed clear that a main bound proportional to $(\log q)^2 x/\sqrt{q}$ (as in [Tao14]) was too large. At the same time, it was not really necessary to reach a bound of the best possible form that could be found through Vinogradov's basic approach, namely

$$|S_\eta(\alpha, x)| \le C \frac{x\sqrt{q}}{\varphi(q)}. \tag{1.13}$$

Such a bound had been proven by Ramaré [Ram10] for $q$ in a limited range and $C$ non-explicit; later, in [Ramc], which postdates the first version of [Helb], Ramaré broadened the range to $q \le x^{1/48}$ and gave an explicit value for $C$, namely, $C = 13000$. Such a bound is a notable achievement, but, unfortunately, it is not useful for our purposes. Rather, we will aim at a bound whose main term is bounded by a constant around 1 times $x(\log \delta_0 q)/\sqrt{\delta_0 \varphi(q)}$; this is slightly worse asymptotically than (1.13), but it is much better in the delicate range of $\delta_0 q \sim 300000$, and in fact for a much wider range as well.

We see that we have several tasks. One of them is the removal of logarithms: we cannot afford a single factor of $\log x$, and, in practice, we can afford at most one factor of $\log q$. Removing logarithms will be possible in part because of the use of previously existing efficient techniques (the large sieve for sequences with prime support) but also because we will be able to find cancellation at several places in sums coming from a combinatorial identity (namely, Vaughan's identity). The task of finding cancellation is particularly delicate because we cannot afford large constants or, for that matter, statements valid only for large $x$. (Bounding a sum such as $\sum_n \mu(n)$ efficiently, where $\mu$ is the Möbius function

$$\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct}, \\ 0 & \text{if } p^2 \mid n \text{ for some prime } p, \end{cases}$$

is harder than estimating a sum such as $\sum_n \Lambda(n)$ equally efficiently, even though we are used to thinking of the two problems as equivalent.)

We have said that our bounds will improve as $|\delta|$ increases. This dependence on $\delta$ will be secured in different ways at different places. Sometimes $\delta$ will appear as an argument, as in $\hat{\eta}(-\delta)$; for $\eta$ piecewise continuous with $\eta' \in L^1$, we know that $|\hat{\eta}(t)| \to 0$ as $|t| \to \infty$. Sometimes we will obtain a dependence on $\delta$ by using several different rational approximations to the same $\alpha \in \mathbb{R}$. Lastly, we will obtain a good dependence on $\delta$ in bilinear sums by supplying a scattered input to a large sieve.

If there is a main moral to the argument, it lies in the close relation between the circle method and the large sieve. The circle method rests on the estimation of an integral involving a Fourier transform $\hat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$; as we will later see, this leads naturally to estimating the $\ell_2$-norm of $\hat{f}$ on subsets (namely, unions of arcs) of the circle $\mathbb{R}/\mathbb{Z}$. The large sieve can be seen as an approximate discrete version of Plancherel's identity, which states that $|\hat{f}|_2 = |f|_2$.

Both in this section and in §1.5, we shall use the large sieve in part so as to use the fact that some of the functions we work with have prime support, i.e., are non-zero only on prime numbers. There are ways to use prime support to improve the output of the large sieve. In §1.5, these techniques will be refined and then translated to the context of the circle method, where $f$ has (essentially) prime support and $|\hat{f}|^2$ must be integrated over unions of arcs. (This allows us to remove a logarithm.) The main point is that the large sieve is not being used as a black box; rather, we can adapt ideas from (say) the large-sieve context and apply them to the circle method.

Lastly, there are the benefits of a continuous $\eta$. Hardy and Littlewood already used a continuous $\eta$; this was abandoned by Vinogradov, presumably for the sake of simplicity. The idea that smooth weights $\eta$ can be superior to sharp truncations is now commonplace. As we shall see, using a continuous $\eta$ is helpful in the minor-arcs regime, but not as crucial there as for the major arcs. We will not use a smooth $\eta$; we will prove our estimates for any continuous $\eta$ that is piecewise $C^1$, and then, towards the end, we will choose to use the same weight $\eta = \eta_2$ as in [Tao14], in part because it has compact support, and in part for the sake of comparison. The moral here is not quite the common dictum "always smooth", but rather that different kinds of smoothing can be appropriate for different tasks; in the end, we will show how to coordinate different smoothing functions $\eta$.

There are other ideas involved; for instance, some of Vinogradov's lemmas are improved. Let us now go into some of the details.

1.4.2 Combinatorial identities

Generally, since Vinogradov, a treatment of the minor arcs starts with a combinatorial identity expressing $\Lambda(n)$ (or the characteristic function of the primes) as a sum of two or more convolutions. (In this section, by a convolution $f * g$, we will mean the Dirichlet convolution $(f * g)(n) = \sum_{d \mid n} f(d) g(n/d)$, i.e., the multiplicative convolution on the semigroup of positive integers.)

In some sense, the archetypical identity is

$$\Lambda = \mu * \log,$$

but it will not usually do: the contribution of $\mu(d)\log(n/d)$ with $d$ close to $n$ is too difficult to estimate precisely. There are alternatives: for example, there is the identity

$$\Lambda(n) \log n = \mu * \log^2 - \Lambda * \Lambda, \tag{1.14}$$

which underlies an estimate of Selberg's that, in turn, is the basis for the Erdős-Selberg proof of the prime number theorem; see, e.g., [MV07, §8.2]. More generally, one can decompose $\Lambda(n)(\log n)^k$ as $\mu * \log^{k+1}$ minus a linear combination of convolutions; this kind of decomposition, really just a direct consequence of the development of $(\zeta'(s)/\zeta(s))^{(k)}$, will be familiar to some from the exposition of Bombieri's work [Bom76] in [FI10, §3] (for instance). Another useful identity was that used by Daboussi [Dab96]; witness its application in [DR01], which gives explicit estimates on exponential sums over primes.
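Both the archetypical identity $\Lambda = \mu * \log$ and identity (1.14) can be verified numerically for small $n$. The sketch below uses naive implementations of $\mu$, $\Lambda$ and the Dirichlet convolution; the helper names and the range of $n$ checked are illustrative choices.

```python
import math
from typing import Callable

Arithmetic = Callable[[int], float]


def mobius(n: int) -> int:
    """mu(n): (-1)^k for a product of k distinct primes, 0 if a square divides n."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^a with a >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            while n % d == 0:
                n //= d
            return math.log(d) if n == 1 else 0.0
    return math.log(n)


def dirichlet(f: Arithmetic, g: Arithmetic, n: int) -> float:
    """(f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def log_squared(n: int) -> float:
    return math.log(n) ** 2


def identity_gaps(n: int) -> tuple[float, float]:
    """Differences between the two sides of Lambda = mu*log and of (1.14)."""
    first = von_mangoldt(n) - dirichlet(mobius, math.log, n)
    second = (von_mangoldt(n) * math.log(n)
              - (dirichlet(mobius, log_squared, n)
                 - dirichlet(von_mangoldt, von_mangoldt, n)))
    return first, second
```

Both gaps vanish up to rounding for every $n$ tested.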

The proof of Vinogradov's three-prime result was simplified substantially [Vau77b] by the introduction of Vaughan's identity:
\[\Lambda(n) = \mu_{\le U} * \log - \Lambda_{\le V} * \mu_{\le U} * 1 + 1 * \mu_{>U} * \Lambda_{>V} + \Lambda_{\le V}, \tag{1.15}\]
where we are using the notation
\[f_{\le W}(n) = \begin{cases} f(n) & \text{if } n \le W, \\ 0 & \text{if } n > W, \end{cases} \qquad f_{>W}(n) = \begin{cases} 0 & \text{if } n \le W, \\ f(n) & \text{if } n > W. \end{cases}\]
Of the resulting sums ($\sum_n (\mu_{\le U} * \log)(n) e(\alpha n) \eta(n/x)$, etc.), the first three are said to be of type I, type I (again) and type II; the last sum, $\sum_{n \le V} \Lambda(n)$, is negligible.
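As a sanity check on the shape of (1.15), the identity can be verified numerically for small n. The following Python sketch is only an illustration and not part of the argument; the helper names (`mobius_table`, `von_mangoldt_table`, `dirichlet`, `vaughan_max_error`) and the sample values of N, U, V are assumptions made here.

```python
import math

def mobius_table(N):
    """mu[n] for 0 <= n <= N (mu[0] is a placeholder equal to 0)."""
    mu = [1] * (N + 1)
    mu[0] = 0
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not composite[p]:
            for k in range(p, N + 1, p):
                composite[k] = k > p
                mu[k] = -mu[k]
            for k in range(p * p, N + 1, p * p):
                mu[k] = 0
    return mu

def von_mangoldt_table(N):
    """lam[n] = log p if n is a power of a prime p, and 0 otherwise."""
    lam = [0.0] * (N + 1)
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not composite[p]:
            for k in range(2 * p, N + 1, p):
                composite[k] = True
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def dirichlet(f, g):
    """Dirichlet convolution (f * g)(n) = sum_{d | n} f(d) g(n/d), for n <= N."""
    N = len(f) - 1
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d]:
            for m in range(1, N // d + 1):
                h[d * m] += f[d] * g[m]
    return h

def vaughan_max_error(N, U, V):
    """Largest deviation between the two sides of (1.15) over 1 <= n <= N."""
    mu, lam = mobius_table(N), von_mangoldt_table(N)
    one = [0] + [1] * N
    log = [0.0] + [math.log(n) for n in range(1, N + 1)]
    mu_le = [mu[n] if n <= U else 0 for n in range(N + 1)]
    mu_gt = [mu[n] if n > U else 0 for n in range(N + 1)]
    lam_le = [lam[n] if n <= V else 0.0 for n in range(N + 1)]
    lam_gt = [lam[n] if n > V else 0.0 for n in range(N + 1)]
    t1 = dirichlet(mu_le, log)                      # type I
    t2 = dirichlet(dirichlet(lam_le, mu_le), one)   # type I (again)
    t3 = dirichlet(dirichlet(one, mu_gt), lam_gt)   # type II
    return max(abs(lam[n] - (t1[n] - t2[n] + t3[n] + lam_le[n]))
               for n in range(1, N + 1))

print(vaughan_max_error(300, 7, 11))  # only floating-point rounding error remains
```

The check works for any choice of U and V, which reflects the flexibility of the identity discussed in the text.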

One of the advantages of Vaughan's identity is its flexibility: we can set U and V to whatever values we wish. Its main disadvantage is that it is not "log-free", in that it seems to impose the loss of two factors of log x: if we sum each side of (1.15) from 1 to x, we obtain $\sum_{n \le x} \Lambda(n) \sim x$ on the left side, whereas, if we bound the sum on the right side without the use of cancellation, we obtain a bound of $x (\log x)^2$. Of course, we will obtain some cancellation from the phase e(αn); still, even if this gives us a factor of, say, $1/\sqrt{q}$, we will get a bound of $x (\log x)^2/\sqrt{q}$, which is worse than the trivial bound x for q bounded and x large. Since we want a bound that is useful for all q larger than the constant r and all x larger than a constant, this will not do.

As was pointed out in [Tao14], it is possible to get a factor of $(\log q)^2$ instead of a factor of $(\log x)^2$ in the type II sums by setting U and V appropriately. Unfortunately, a factor of $(\log q)^2$ is still too large in practice, and there is also the issue of factors of log x in type I sums.

Vinogradov had already managed to get an essentially log-free result (by a rather difficult procedure) in [Vin54, Ch. IX]. The result in [Dab96] is log-free. Unfortunately, the explicit result in [DR01], the study of which encouraged me at the beginning of the project, is not. For a while, I worked with the case k = 2 of the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, which gives
\[\Lambda \cdot \log^2 = \mu * \log^3 - 3 \cdot (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda. \tag{1.16}\]
This identity is essentially log-free: while a trivial bound on the sum of the right side for n from 1 to N does seem to have two extra factors of log, they are present only in the term $\mu * \log^3$, which is not the hardest one to estimate. Ramaré obtained a log-free bound in [Ram10] using an identity introduced by Diamond and Steinig in the course of their own work on elementary proofs of the prime number theorem [DS70]; that identity gives a decomposition for $\Lambda \cdot \log^k$ that can also be derived from the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, by a clever grouping of terms.

In the end, I decided to use Vaughan's identity, motivated in part by [Tao14], and in part by the lack of free parameters in (1.16); as can be seen in (1.15), Vaughan's identity has two parameters U, V that we can set to whatever values we think best. The form of the identity allowed me to reuse much of my work up to that point, but it also posed a challenge: since Vaughan's identity is by no means log-free, one has to obtain cancellation in Vaughan's identity at every possible step, beyond the cancellation given by the phase e(αn). (The presence of a phase, in fact, makes the task of getting cancellation from the identity more complicated.) The removal of logarithms will be one of our main tasks in what follows. It is clear that the presence of the Möbius function μ should give, in principle, some cancellation; we will show how to use it to obtain as much cancellation as we need, with good constants, and not just asymptotically.

1.4.3 Type I sums

There are two type I sums, namely,
\[\sum_{m \le U} \mu(m) \sum_n (\log n) e(\alpha m n) \eta\left(\frac{mn}{x}\right) \tag{1.17}\]
and
\[\sum_{v \le V} \Lambda(v) \sum_{u \le U} \mu(u) \sum_n e(\alpha v u n) \eta\left(\frac{vun}{x}\right). \tag{1.18}\]
In either case, $\alpha = a/q + \delta/x$, where q is larger than a constant r and $|\delta/x| \le 1/qQ_0$ for some $Q_0 > \max(q, \sqrt{x})$. For the purposes of this exposition, we will set it as our task to estimate the slightly simpler sum
\[\sum_{m \le D} \mu(m) \sum_n e(\alpha m n) \eta\left(\frac{mn}{x}\right), \tag{1.19}\]
where D can be U or UV or something else less than x.

Why can we consider this simpler sum without omitting anything essential? It is clear that (1.17) is of the same kind as (1.19). The inner double sum in (1.18) is just (1.19) with αv instead of α; this enables us to estimate (1.18) by means of (1.19) for q small, i.e., the more delicate case. If q is not small, then the approximation $\alpha v \sim av/q$ may not be accurate enough. In that case, we collapse the two outer sums in (1.18) into a sum $\sum_n (\Lambda_{\le V} * \mu_{\le U})(n)$, and treat all of (1.18) much as we will treat (1.19); since q is not small, we can afford to bound $(\Lambda_{\le V} * \mu_{\le U})(n)$ trivially (by log n) in the less sensitive terms.

Let us first outline Vinogradov's procedure for bounding type I sums. Just by summing a geometric series, we get
\[\left|\sum_{n \le N} e(\alpha n)\right| \le \min\left(N, \frac{c}{\|\alpha\|}\right), \tag{1.20}\]
where c is a constant and $\|\alpha\|$ is the distance from α to the nearest integer. Vinogradov splits the outer sum in (1.19) into sums of length q. When m runs on an interval of length q, the angle am/q runs through all fractions of the form b/q; due to the error δ/x, αm could be close to 0 for two values of m, but otherwise $\|\alpha m\|$ takes values bounded below by 1/q (twice), 2/q (twice), 3/q (twice), etc. Thus
\[\left|\sum_{y < m \le y+q} \mu(m) \sum_{n \le N} e(\alpha m n)\right| \le \sum_{y < m \le y+q} \left|\sum_{n \le N} e(\alpha m n)\right| \le \frac{2N}{m} + 2cq \log eq \tag{1.21}\]
for any y ≥ 0.
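The geometric-series bound (1.20) is easy to test numerically. In the sketch below, the value c = 1/2 is an assumption made for the illustration: it follows from $|\sum_{n \le N} e(\alpha n)| \le 1/|\sin \pi\alpha|$ together with $|\sin \pi\alpha| \ge 2\|\alpha\|$. The function names are hypothetical.

```python
import cmath
import math

def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * math.pi * t)

def dist_to_nearest_integer(alpha):
    """||alpha||, the distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))

def exp_sum(alpha, N):
    """sum_{n <= N} e(alpha n), computed term by term."""
    return sum(e(alpha * n) for n in range(1, N + 1))

def geometric_bound(alpha, N, c=0.5):
    """min(N, c / ||alpha||), the right side of (1.20) with c = 1/2 (assumed)."""
    d = dist_to_nearest_integer(alpha)
    return N if d == 0 else min(N, c / d)

for alpha in (0.1, 0.25, 1 / 3, math.sqrt(2), 0.001):
    for N in (1, 10, 1000):
        print(alpha, N, abs(exp_sum(alpha, N)), geometric_bound(alpha, N))
```

For α very close to an integer the bound degenerates to the trivial bound N, which is why the terms with αm close to 0 need separate treatment.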

There are several ways to improve this. One is simply to estimate the inner sum more precisely; this was already done in [DR01]. One can also define a smoothing function η, as in (1.19); it is easy to get
\[\left|\sum_{n \le N} e(\alpha n) \eta\left(\frac{n}{x}\right)\right| \le \min\left(x |\eta|_1 + \frac{|\eta'|_1}{2}, \; \frac{|\eta'|_1}{2 |\sin(\pi\alpha)|}, \; \frac{|\widehat{\eta''}|_\infty}{4x (\sin \pi\alpha)^2}\right).\]
Except for the third term, this is as in [Tao14]. We could also choose carefully which bound to use for each m; surprisingly, this gives an improvement, in fact, an important one, for m large. However, even with these improvements, we still have a term proportional to N/m, as in (1.21), and this contributes about (x log x)/q to the sum (1.19), thus giving us an estimate that is not log-free.

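What we have to do, naturally, is to take out the terms with q|m for m small. (If m is large, then those may not be the terms for which mα is close to 0; we will later see what to do.) For $y + q \le Q/2$, $|\alpha - a/q| \le 1/qQ$, we get that
\[\sum_{\substack{y < m \le y+q \\ q \nmid m}} \min\left(A, \frac{B}{|\sin \pi\alpha m|}, \frac{C}{|\sin \pi\alpha m|^2}\right) \tag{1.22}\]
is at most
\[\min\left(\frac{20}{3\pi^2} C q^2, \; 2A + \frac{4q}{\pi} \sqrt{AC}, \; \frac{2Bq}{\pi} \max\left(2, \log \frac{C e^3 q}{B \pi}\right)\right). \tag{1.23}\]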

This is satisfactory. We are left with all the terms $m \le M = \min(D, Q/2)$ with q|m, and also with all the terms $Q/2 < m \le D$. For $m \le M$ divisible by q, we can estimate (as opposed to just bound from above) the inner sum in (1.19) by the Poisson summation formula, and then sum over m, but without taking absolute values; writing m = aq, we get a main term
\[\frac{x \mu(q)}{q} \cdot \widehat{\eta}(-\delta) \cdot \sum_{\substack{a \le M/q \\ (a,q)=1}} \frac{\mu(a)}{a}, \tag{1.24}\]
where (a, q) stands for the greatest common divisor of a and q.

It is clear that we have to get cancellation over μ here. There is an elegant elementary argument [GR96] showing that the absolute value of the sum in (1.24) is at most 1. We need to gain one more log, however. Ramaré [Ramb] helpfully furnished the following bound:
\[\left|\sum_{\substack{a \le x \\ (a,q)=1}} \frac{\mu(a)}{a}\right| \le \frac{4}{5} \frac{q}{\phi(q)} \frac{1}{\log x/q} \tag{1.25}\]
for q ≤ x. (Cf. [EM95], [EM96].) This is neither trivial nor elementary.⁵ We are, so to speak, allowed to use non-elementary means (that is, methods based on L-functions) because the only L-function we need to use here is the Riemann zeta function.
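To get a feeling for the sizes involved in (1.24) and (1.25), one can tabulate the coprime Möbius sum for small q and x. The sketch below is purely illustrative: it compares the sum against the bound 1 attributed to [GR96] and against the right side of (1.25) as read from the text, for integers q < x only; the function names and ranges are assumptions, and a numerical check of this kind proves nothing beyond the range examined.

```python
from math import gcd, log

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def euler_phi(n):
    """Euler's totient function by trial division."""
    result, p, m = n, 2, n
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def worst_cases(max_q, max_x):
    """Largest |sum| and largest |sum| / ((4/5)(q/phi(q))/log(x/q)) over integers q < x <= max_x."""
    worst_abs, worst_ratio = 0.0, 0.0
    for q in range(1, max_q + 1):
        q_over_phi = q / euler_phi(q)
        partial = 0.0
        for x in range(1, max_x + 1):
            if gcd(x, q) == 1:
                partial += mobius(x) / x
            if x > q:
                bound = 0.8 * q_over_phi / log(x / q)
                worst_abs = max(worst_abs, abs(partial))
                worst_ratio = max(worst_ratio, abs(partial) / bound)
    return worst_abs, worst_ratio

print(worst_cases(12, 500))
```

A ratio below 1 in the second entry means that the bound held at every sample point in the range.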

What shall we do for m > Q/2? We can always give a bound
\[\sum_{y < m \le y+q} \min\left(A, \frac{C}{|\sin \pi\alpha m|^2}\right) \le 3A + \frac{4q}{\pi} \sqrt{AC} \tag{1.26}\]
for y arbitrary; since AC will be of constant size, $(4q/\pi)\sqrt{AC}$ is pleasant enough, but the contribution of $3A \sim 3|\eta|_1 x/y$ is nasty (it adds a multiple of (x log x)/q to the total) and seems unavoidable: the values of m for which αm is close to 0 no longer correspond to the congruence class m ≡ 0 mod q, and thus cannot be taken out.

The solution is to switch approximations. (The idea of using different approximations to the same α is neither new nor recent in the general context of the circle method: see [Vau97, §2.8, Ex. 2]. What may be new is its use to clear a hurdle in type I sums.) What does this mean? If α were exactly, or almost exactly, a/q, then there would be no other very good approximations in a reasonable range. However, note that we can define $Q = \lfloor x/|\delta q| \rfloor$ for $\alpha = a/q + \delta/x$, and still have $|\alpha - a/q| \le 1/qQ$. If δ is very small, Q will be larger than 2D, and there will be no terms with $Q/2 < m \le D$ to worry about.

⁵ The current state of knowledge may seem surprising: after all, we expect nearly square-root cancellation (for instance, $|\sum_{n \le x} \mu(n)/n| \le \sqrt{2/x}$ holds for all real $0 < x \le 10^{12}$; see also the stronger bound [Dre93]). The classical zero-free region of the Riemann zeta function ought to give a factor of $\exp(-\sqrt{(\log x)/c})$, which looks much better than 1/log x. What happens is that (a) such a factor is not actually much better than 1/log x for $x \sim 10^{30}$, say; (b) estimating sums involving the Möbius function by means of an explicit formula is harder than estimating sums involving Λ(n): the residues of 1/ζ(s) at the non-trivial zeros of ζ(s) come into play. As a result, getting non-trivial explicit results on sums of μ(n) is harder than one would naively expect from the quality of classical effective (but non-explicit) results. See [Rama] for a survey of explicit bounds.


What happens if δ is not very small? We know that, for any Q′, there is an approximation a′/q′ to α with $|\alpha - a'/q'| \le 1/q'Q'$ and q′ ≤ Q′. However, for Q′ > Q, we know that a′/q′ cannot equal a/q: by the definition of Q, the approximation a/q is not good enough, i.e., $|\alpha - a/q| \le 1/qQ'$ does not hold. Since a/q ≠ a′/q′, we see that $|a/q - a'/q'| \ge 1/qq'$, and this implies that $q' \ge (\epsilon/(1+\epsilon)) Q$ once $Q' \ge (1+\epsilon) Q$.

Thus, for m > Q/2, the solution is to apply (1.26) with a′/q′ instead of a/q. The contribution of A fades into insignificance: for the first sum over a range $y < m \le y + q'$, $y \ge Q/2$, it contributes at most x/(Q/2), and all the other contributions of A sum up to at most a constant times (x log x)/q′.

Proceeding in this way, we obtain a total bound for (1.19) whose main terms are proportional to
\[\frac{1}{\phi(q)} \frac{x}{\log x/q} \min\left(1, \frac{1}{\delta^2}\right), \quad \frac{2}{\pi} \sqrt{|\widehat{\eta''}|_\infty} \cdot D \quad \text{and} \quad q \log \max\left(\frac{D}{q}, q\right), \tag{1.27}\]
with good, explicit constants. The first term, usually the largest one, is precisely what we needed: it is proportional to (1/φ(q)) x/log x for q small, and decreases rapidly as |δ| increases.

1.4.4 Type II, or bilinear, sums

We must now bound
\[S = \sum_m (1 * \mu_{>U})(m) \sum_{n > V} \Lambda(n) e(\alpha m n) \eta(mn/x).\]
At this point it is convenient to assume that η is the Mellin convolution of two functions. The multiplicative or Mellin convolution on $\mathbb{R}^+$ is defined by
\[(\eta_0 *_M \eta_1)(t) = \int_0^\infty \eta_0(r) \eta_1\left(\frac{t}{r}\right) \frac{dr}{r}.\]
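As an illustration of the definition, the following Python sketch evaluates a Mellin convolution by a midpoint rule in the variable u = log r and compares it with a closed form. It assumes the truncation $\eta_1 = 2 \cdot 1_{[1/2,1]}$ that the text adopts right after this point; the closed form $4 \max(\log 2 - |\log 2t|, 0)$ for $\eta_1 *_M \eta_1$ is worked out here by direct integration and is not quoted from the source. All function names are hypothetical.

```python
import math

def eta1(t):
    """Brutal truncation: the value 2 on [1/2, 1] and 0 elsewhere."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def mellin_convolve(f, g, t, r_min, r_max, steps=20000):
    """Approximate int f(r) g(t/r) dr/r over [r_min, r_max] (midpoint rule in u = log r)."""
    a, b = math.log(r_min), math.log(r_max)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        r = math.exp(a + (i + 0.5) * h)
        total += f(r) * g(t / r)
    return total * h

def eta2_closed_form(t):
    """(eta1 *_M eta1)(t) = 4 max(log 2 - |log 2t|, 0), supported on [1/4, 1]."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

for t in (0.2, 0.3, 0.5, 0.7, 0.9, 1.2):
    numeric = mellin_convolve(eta1, eta1, t, 0.5, 1.0)
    print(t, numeric, eta2_closed_form(t))
```

The integration range [1/2, 1] is the support of the first factor, so nothing is lost by restricting to it. The resulting function is continuous and piecewise smooth, which is the kind of weight the estimates require.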

Tao [Tao14] takes $\eta = \eta_2 = \eta_1 *_M \eta_1$, where η_1 is a brutal truncation, viz., the function taking the value 2 on [1/2, 1] and 0 elsewhere. We take the same η_2, in part for comparison purposes, and in part because this will allow us to use off-the-shelf estimates on the large sieve. (Brutal truncations are rarely optimal in principle, but, as they are very common, results for them have been carefully optimized in the literature.) Clearly
\[S = \int_V^{x/U} \sum_m \Bigg(\sum_{\substack{d > U \\ d | m}} \mu(d)\Bigg) \eta_1\left(\frac{m}{x/W}\right) \cdot \sum_{n \ge V} \Lambda(n) e(\alpha m n) \eta_1\left(\frac{n}{W}\right) \frac{dW}{W}. \tag{1.28}\]
By Cauchy-Schwarz, the integrand is at most $\sqrt{S_1(U,W) S_2(V,W)}$, where
\[S_1(U,W) = \sum_{\frac{x}{2W} < m \le \frac{x}{W}} \Bigg|\sum_{\substack{d > U \\ d | m}} \mu(d)\Bigg|^2, \qquad S_2(V,W) = \sum_{\frac{x}{2W} \le m \le \frac{x}{W}} \Bigg|\sum_{\max(V, \frac{W}{2}) \le n \le W} \Lambda(n) e(\alpha m n)\Bigg|^2. \tag{1.29}\]

We must bound $S_1(U,W)$ by a constant times x/W. We are able to do this, with a good constant. (A careless bound would have given a multiple of $(x/U) \log^3(x/U)$, which is much too large.) First, we reduce $S_1(U,W)$ to an expression involving an integral of
\[\sum_{r_1 \le x} \sum_{\substack{r_2 \le x \\ (r_1,r_2)=1}} \frac{\mu(r_1) \mu(r_2)}{\sigma(r_1) \sigma(r_2)}. \tag{1.30}\]
We can bound (1.30) by the use of bounds on $\sum_{n \le t} \mu(n)/n$, combined with the estimation of infinite products by means of approximations to ζ(s) for $s \to 1^+$. After some additional manipulations, we obtain a bound for $S_1(U,W)$ whose main term is at most $(3/\pi^2)(x/W)$ for each W, and closer to 0.22482x/W on average over W.

(This is as good a point as any to say that, throughout, we can use a trick in [Tao14] that allows us to work with odd values of integer variables throughout, instead of letting m or n range over all integers. Here, for instance, if m and n are restricted to be odd, we obtain a bound of $(2/\pi^2)(x/W)$ for individual W, and 0.15107x/W on average over W. This is so even though we are losing some cancellation in μ by the restriction.)

Let us now bound $S_2(V,W)$. This is traditionally done by Linnik's dispersion method. However, it should be clear that the thing to do nowadays is to use a large sieve, and, more specifically, a large sieve for primes; that kind of large sieve is nothing other than a tool for estimating expressions such as $S_2(V,W)$. (Incidentally, even though we are trying to save every factor of log we can, we choose not to use small sieves at all, either here or elsewhere.) In order to take advantage of prime support, we use Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]) combined with Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], following the general procedure in [MV73, (1.6)]. We obtain a bound of the form
\[\frac{\log W}{\log \frac{W}{2q}} \left(\frac{x}{4\phi(q)} + \frac{qW}{\phi(q)}\right) \frac{W}{2} \tag{1.31}\]
on $S_2(V,W)$, where, of course, we can also choose not to gain a factor of log W/2q if q is close to or greater than W.

It remains to see how to gain a factor of |δ| in the major arcs, and more specifically in $S_2(V,W)$. To explain this, let us step back and take a look at what the large sieve is.

Given a civilized function $f : \mathbb{Z} \to \mathbb{C}$, Plancherel's identity tells us that
\[\int_{\mathbb{R}/\mathbb{Z}} \left|\widehat{f}(\alpha)\right|^2 d\alpha = \sum_n |f(n)|^2.\]
The large sieve can be seen as an approximate, or statistical, version of this: for a "sample" of points $\alpha_1, \alpha_2, \dots, \alpha_k$ satisfying $|\alpha_i - \alpha_j| \ge \beta$ for i ≠ j, it tells us that
\[\sum_{1 \le j \le k} \left|\widehat{f}(\alpha_j)\right|^2 \le (X + \beta^{-1}) \sum_n |f(n)|^2, \tag{1.32}\]
assuming that f is supported on an interval of length X.

Now consider $\alpha_1 = \alpha$, $\alpha_2 = 2\alpha$, $\alpha_3 = 3\alpha$, .... If α = a/q, then the angles $\alpha_1, \dots, \alpha_q$ are well-separated, i.e., they satisfy $|\alpha_i - \alpha_j| \ge 1/q$, and so we can apply (1.32) with β = 1/q. However, $\alpha_{q+1} = \alpha_1$. Thus, if we have an outer sum of length L > q (in (1.29), we have an outer sum of length L = x/2W), we need to split it into $\lceil L/q \rceil$ blocks of length q, and so the total bound given by (1.32) is $\lceil L/q \rceil (X + q) \sum_n |f(n)|^2$. Indeed, this is what gives us (1.31), which is fine, but we want to do better for |δ| larger than a constant.

Suppose, then, that $\alpha = a/q + \delta/x$, where |δ| > 8, say. Then the angles $\alpha_1$ and $\alpha_{q+1}$ are not identical: $|\alpha_1 - \alpha_{q+1}| \le q|\delta|/x$. We also see that $\alpha_{q+1}$ is at a distance at least q|δ|/x from $\alpha_2, \alpha_3, \dots, \alpha_q$, provided that q|δ|/x < 1/q. We can go on with $\alpha_{q+2}, \alpha_{q+3}, \dots$, and stop only once there is overlap, i.e., only once we reach $\alpha_m$ such that m|δ|/x ≥ 1/q. We then give all the angles $\alpha_1, \dots, \alpha_m$, which are separated by at least q|δ|/x from each other, to the large sieve at the same time. We do this $\lceil L/m \rceil \le \lceil L/(x/|\delta|q) \rceil$ times, and obtain a total bound of $\lceil L/(x/|\delta|q) \rceil (X + x/|\delta|q) \sum_n |f(n)|^2$, which, for L = x/2W, X = W/2, gives us about
\[\left(\frac{x}{4Q} \frac{W}{2} + \frac{x}{4}\right) \log W,\]
provided that $L \ge x/|\delta|q$ and, as usual, $|\alpha - a/q| \le 1/qQ$. This is very small compared to the trivial bound xW/8.
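The large sieve inequality (1.32) discussed above can be tried out numerically on small examples. The sketch below is an illustration under stated assumptions: X is taken to be the number of integers in the supporting interval, the separation β is computed directly as the minimal distance on the circle between sample points, and the function names are hypothetical. It first uses the rationals j/q, then the multiples of an irrational angle.

```python
import cmath
import math
import random

def fourier(f, alpha):
    """hat f(alpha) = sum_n f(n) e(-alpha n), for f given as a dict {n: value}."""
    return sum(v * cmath.exp(-2j * math.pi * alpha * n) for n, v in f.items())

def circle_distance(a, b):
    """Distance between a and b on the circle R/Z."""
    d = (a - b) % 1.0
    return min(d, 1.0 - d)

def min_separation(alphas):
    """Smallest pairwise distance on R/Z among the sample points."""
    return min(circle_distance(a, b)
               for i, a in enumerate(alphas) for b in alphas[i + 1:])

def large_sieve_sides(f, alphas):
    """Return the left and right sides of (1.32) for the given data."""
    X = max(f) - min(f) + 1   # number of integers in the supporting interval
    beta = min_separation(alphas)
    lhs = sum(abs(fourier(f, a)) ** 2 for a in alphas)
    rhs = (X + 1.0 / beta) * sum(abs(v) ** 2 for v in f.values())
    return lhs, rhs

random.seed(1)
f = {n: complex(random.gauss(0, 1), random.gauss(0, 1)) for n in range(100, 160)}
print(large_sieve_sides(f, [j / 7 for j in range(7)]))
print(large_sieve_sides(f, [(j * math.sqrt(2)) % 1.0 for j in range(1, 12)]))
```

In the second example the separation β is whatever the multiples of the angle happen to give; the point of the argument in the text is to choose the block of angles so that this separation stays under control.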

What happens if L < x/|δq|? Then there is never any overlap: we consider all angles $\alpha_i$, and give them all together to the large sieve. The total bound is $(W^2/4 + xW/2|\delta|q) \log W$. If L = x/2W is smaller than, say, x/3|δq|, then we see clearly that there are non-intersecting swarms of angles $\alpha_i$ around the rationals a/q. We can thus save a factor of log (or rather (φ(q)/q) log(W/|δq|)) by applying Montgomery's inequality, which operates by strewing displacements of given angles (or here, swarms around angles) around the circle to the extent possible while keeping everything well-separated. In this way, we obtain a bound of the form
\[\frac{\log W}{\log \frac{W}{|\delta| q}} \left(\frac{x}{|\delta| \phi(q)} + \frac{q}{\phi(q)} \frac{W}{2}\right) \frac{W}{2}.\]
Compare this to (1.31); we have gained a factor of |δ|/4, and so we use this estimate when |δ| > 4. (We will actually use the criterion |δ| > 8, but, since we will be working with approximations of the form $2\alpha = a/q + \delta/x$, the value of δ in our actual work is twice of what it is in this introduction. This is a consequence of working with sums over the odd integers, as in [Tao14].)

We have succeeded in eliminating all factors of log we came across. The only factor of log that remains is log x/UV, coming from the integral $\int_V^{x/U} dW/W$. Thus, we want UV to be close to x, but we cannot let it be too close, since we also have a term proportional to D = UV in (1.27), and we need to keep it substantially smaller than x. We set U and V so that UV is $x/\sqrt{q \max(4, |\delta|)}$ or thereabouts.

In the end, after some work, we obtain our main minor-arcs bound (Theorem 3.1.1). It states the following. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Recall that $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x)$ and $\eta_2 = \eta_1 *_M \eta_1 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. Let $2\alpha = a/q + \delta/x$, q ≤ Q, gcd(a, q) = 1, $|\delta/x| \le 1/qQ$, where $Q = (3/4) x^{2/3}$. If $q \le x^{1/3}/6$, then
\[|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36 x^{5/6}, \tag{1.33}\]
where
\[\delta_0 = \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004 t}}\right) + 0.41415,\]
\[L_{x,t,q} = \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \tag{1.34}\]
The factor $R_{x,t}$ is small in practice; for typical "difficult" values of x and $\delta_0 x$, it is less than 1. The crucial things to notice in (1.33) are that there is no factor of log x, and that, in the main term, there is only one factor of $\log \delta_0 q$. The fact that $\delta_0$ helps us as it grows is precisely what enables us to take major arcs that get narrower and narrower as q grows.

1.5 Integrals over the major and minor arcs

So far, we have sketched (§1.3) how to estimate $S_\eta(\alpha, x)$ for α in the major arcs and η based on the Gaussian $e^{-t^2/2}$, and also (§1.4) how to bound $|S_\eta(\alpha, x)|$ for α in the minor arcs and $\eta = \eta_2$, where $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. We now must show how to use such information to estimate integrals such as the ones in (1.4).

We will use two smoothing functions $\eta_+$, $\eta_*$; in the notation of (1.3), we set $f_1 = f_2 = \Lambda(n) \eta_+(n/x)$, $f_3 = \Lambda(n) \eta_*(n/x)$, and so we must give a lower bound for
\[\int_{\mathfrak{M}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha \tag{1.35}\]
and an upper bound for
\[\int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha \tag{1.36}\]
so that we can verify (1.4).

The traditional approach to (1.36) is to bound
\[\int_{\mathfrak{m}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha \le \int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha \cdot \max_{\alpha \in \mathfrak{m}} |S_{\eta_*}(\alpha, x)| \le \sum_n \Lambda(n)^2 \eta_+^2\left(\frac{n}{x}\right) \cdot \max_{\alpha \in \mathfrak{m}} |S_{\eta_*}(\alpha, x)|. \tag{1.37}\]

Since the sum over n is of the order of x log x, this is not log-free, and so cannot be good enough; we will later see how to do better. Still, this gets the main shape right: our bound on (1.36) will be proportional to $|\eta_+|_2^2 |\eta_*|_1$. Moreover, we see that $\eta_*$ has to be such that we know how to bound $|S_{\eta_*}(\alpha, x)|$ for $\alpha \in \mathfrak{m}$, while our choice of $\eta_+$ is more or less free, at least as far as the minor arcs are concerned.

What about the major arcs? In order to do anything on them, we will have to be able to estimate both $S_{\eta_+}(\alpha, x)$ and $S_{\eta_*}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. If that is the case, then, as we shall see, we will be able to obtain that the main term of (1.35) is an infinite product (independent of the smoothing functions), times $x^2$, times
\[\int_{-\infty}^{\infty} (\widehat{\eta_+}(-\alpha))^2 \widehat{\eta_*}(-\alpha) e(-\alpha n/x) d\alpha = \int_0^\infty \int_0^\infty \eta_+(t_1) \eta_+(t_2) \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1 dt_2. \tag{1.38}\]
In other words, we want to maximize (or nearly maximize) the expression on the right of (1.38) divided by $|\eta_+|_2^2 |\eta_*|_1$.

One way to do this is to let $\eta_*$ be concentrated on a small interval [0, ε). Then the right side of (1.38) is approximately
\[|\eta_*|_1 \cdot \int_0^\infty \eta_+(t) \eta_+\left(\frac{n}{x} - t\right) dt. \tag{1.39}\]
To maximize (1.39), we should make sure that $\eta_+(t) \sim \eta_+(n/x - t)$. We set x ∼ n/2, and see that we should define $\eta_+$ so that it is supported on [0, 2] and symmetric around t = 1, or nearly so; this will maximize the ratio of (1.39) to $|\eta_+|_2^2 |\eta_*|_1$.

We should do this while making sure that we will know how to estimate $S_{\eta_+}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. We know how to estimate $S_\eta(\alpha, x)$ very precisely for functions of the form $\eta(t) = g(t) e^{-t^2/2}$, $\eta(t) = g(t) t e^{-t^2/2}$, etc., where g(t) is band-limited. We will work with a function $\eta_+$ of that form, chosen so as to be very close (in $\ell_2$ norm) to a function η that is in fact supported on [0, 2] and symmetric around t = 1.

We choose
\[\eta(t) = \begin{cases} t^3 (2-t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}\]
This function is obviously symmetric (η(t) = η(2 − t)) and vanishes to high order at t = 0, besides being supported on [0, 2].

We set $\eta_+(t) = h_R(t) t e^{-t^2/2}$, where $h_R(t)$ is an approximation to the function
\[h(t) = \begin{cases} t^2 (2-t)^3 e^{t - \frac{1}{2}} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}\]
We just let $h_R(t)$ be the inverse Mellin transform of the truncation of Mh to an interval [−iR, iR]. (Explicitly,
\[h_R(t) = \int_0^\infty h(t y^{-1}) F_R(y) \frac{dy}{y},\]
where $F_R(y) = \sin(R \log y)/(\pi \log y)$, that is, $F_R$ is the Dirichlet kernel with a change of variables.)

Since the Mellin transform of $t e^{-t^2/2}$ is regular at s = 0, the Mellin transform $M\eta_+$ will be holomorphic in a neighborhood of $\{s : 0 \le \Re(s) \le 1\}$, even though the truncation of Mh to [−iR, iR] is brutal. Set R = 200, say. By the fast decay of Mh(it) and the fact that the Mellin transform M is an isometry, $|(h_R(t) - h(t)) t|_2$ is very small, and hence so is $|\eta_+ - \eta|_2$, as we desired.

But what about the requirement that we be able to estimate $S_{\eta_*}(\alpha, x)$ for both $\alpha \in \mathfrak{m}$ and $\alpha \in \mathfrak{M}$?

Generally speaking, if we know how to estimate $S_{\eta_1}(\alpha, x)$ for some $\alpha \in \mathbb{R}/\mathbb{Z}$ and we also know how to estimate $S_{\eta_2}(\alpha, x)$ for all other $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_1$ and $\eta_2$ are two smoothing functions, then we know how to estimate $S_{\eta_3}(\alpha, x)$ for all $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_3 = \eta_1 *_M \eta_2$, or, more generally, $\eta_*(t) = (\eta_1 *_M \eta_2)(\kappa t)$, κ > 0 a constant. This is an easy exercise on exchanging the order of integration and summation:
\[S_{\eta_*}(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) (\eta_1 *_M \eta_2)\left(\kappa \frac{n}{x}\right) = \int_0^\infty \sum_n \Lambda(n) e(\alpha n) \eta_1(\kappa r) \eta_2\left(\frac{n}{rx}\right) \frac{dr}{r} = \int_0^\infty \eta_1(\kappa r) S_{\eta_2}(rx) \frac{dr}{r}, \tag{1.40}\]
and similarly with $\eta_1$ and $\eta_2$ switched. Of course, this trick is valid for all exponential sums: any function f(n) would do in place of Λ(n). The only caveat is that $\eta_1$ (and $\eta_2$) should be small very near 0, since, for r small, we may not be able to estimate $S_{\eta_2}(rx)$ (or $S_{\eta_1}(rx)$) with any precision. This is not a problem; one of our functions will be $t^2 e^{-t^2/2}$, which vanishes to second order at 0, and the other one will be $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$, which has support bounded away from 0. We will set κ large (say κ = 49) so that the support of $\eta_*$ is indeed concentrated on a small interval [0, ε), as we wanted.

Now that we have chosen our smoothing weights $\eta_+$ and $\eta_*$, we have to estimate the major-arc integral (1.35) and the minor-arc integral (1.36). What follows can actually be done for general $\eta_+$ and $\eta_*$; we could have left our particular choice of $\eta_+$ and $\eta_*$ for the end.

Estimating the major-arc integral (1.35) may sound like an easy task, since we have rather precise estimates for $S_\eta(\alpha, x)$ ($\eta = \eta_+, \eta_*$) when α is on the major arcs; we could just replace $S_\eta(\alpha, x)$ in (1.35) by the approximation given by (1.7) and (1.11). It is, however, more efficient to express (1.35) as the sum of the contribution of the trivial character (a sum of integrals of $(\widehat{\eta}(-\delta) x)^3$, where $\widehat{\eta}(-\delta) x$ comes from (1.11)) plus a term of the form
\[(\text{maximum of } \sqrt{q} \cdot E(q) \text{ for } q \le r) \cdot \int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha,\]
where E(q) = E is as in (1.12), plus two other terms of essentially the same form. As usual, the major arcs $\mathfrak{M}$ are the arcs around rationals a/q with q ≤ r. We will soon discuss how to bound the integral of $|S_{\eta_+}(\alpha, x)|^2$ over arcs around rationals a/q with q ≤ s, s arbitrary. Here, however, it is best to estimate the integral over $\mathfrak{M}$ using the estimate on $S_{\eta_+}(\alpha, x)$ from (1.7) and (1.11); we obtain a great deal of cancellation, with the effect that, for χ non-trivial, the error term in (1.12) appears only when it gets squared, and thus becomes negligible.

The contribution of the trivial character has an easy approximation, thanks to the fast decay of η. We obtain that the major-arc integral (1.35) equals a main term $C_0 C_{\eta,\eta_*} x^2$, where
\[C_0 = \prod_{p | n} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p \nmid n} \left(1 + \frac{1}{(p-1)^3}\right),\]
\[C_{\eta,\eta_*} = \int_0^\infty \int_0^\infty \eta(t_1) \eta(t_2) \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1 dt_2,\]
plus several small error terms. We have already chosen η, $\eta_*$ and x so as to (nearly) maximize $C_{\eta,\eta_*}$.

It is time to bound the minor-arc integral (1.36). As we said in §1.5, we must do better than the usual bound (1.37). Since our minor-arc bound (3.2) on $|S_\eta(\alpha, x)|$, α ∼ a/q, decreases as q increases, it makes sense to use partial summation together with bounds on
\[\int_{\mathfrak{m}_s} |S_{\eta_+}(\alpha, x)|^2 = \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2 d\alpha - \int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2 d\alpha,\]
where $\mathfrak{m}_s$ denotes the arcs around a/q, r < q ≤ s, and $\mathfrak{M}_s$ denotes the arcs around all a/q, q ≤ s. We already know how to estimate the integral on $\mathfrak{M}$. How do we bound the integral on $\mathfrak{M}_s$?

In order to do better than the trivial bound $\int_{\mathfrak{M}_s} \le \int_{\mathbb{R}/\mathbb{Z}}$, we will need to use the fact that the series (1.6) defining $S_{\eta_+}(\alpha, x)$ is essentially supported on prime numbers. Bounding the integral on $\mathfrak{M}_s$ is closely related to the problem of bounding
\[\sum_{q \le s} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|\sum_{n \le x} a_n e(an/q)\right|^2 \tag{1.41}\]
efficiently for s considerably smaller than $\sqrt{x}$ and $a_n$ supported on the primes $\sqrt{x} < p \le x$. This is a classical problem in the study of the large sieve. The usual bound on (1.41) (by, for instance, Montgomery's inequality) has a gain of a factor of
\[2 e^\gamma (\log s)/(\log x/s^2)\]
relative to the bound of $(x + s^2) \sum_n |a_n|^2$ that one would get from the large sieve without using prime support. Heath-Brown proceeded similarly to bound
\[\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2 d\alpha \lesssim \frac{2 e^\gamma \log s}{\log x/s^2} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2 d\alpha. \tag{1.42}\]
This already gives us the gain of C(log s)/log x that we absolutely need, but the constant C is suboptimal; the factor in the right side of (1.42) should really be (log s)/log x, i.e., C should be 1. We cannot reasonably hope to obtain a factor better than 2(log s)/log x in the minor arcs due to what is known as the parity problem in sieve theory. As it turns out, Ramaré [Ram09] had given general bounds on the large sieve that were clearly conducive to better bounds on (1.41), though they involved a ratio that was not easy to bound in general.

I used several careful estimations (including [Ram95, Lem. 3.4]) to reduce the problem of bounding this ratio to a finite number of cases, which I then checked by a rigorous computation. This approach gave a bound on (1.41) with a factor of size close to 2(log s)/log x. (This solves the large-sieve problem for $s \le x^{0.3}$; it would still be worthwhile to give a computation-free proof for all $s \le x^{1/2-\epsilon}$, ε > 0.) It was then easy to give an analogous bound for the integral over $\mathfrak{M}_s$, namely,
\[\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2 d\alpha \lesssim \frac{2 \log s}{\log x} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2 d\alpha,\]
where ≲ can easily be made precise by replacing log s by log s + 1.36 and log x by log x + c, where c is a small constant. Without this improvement, the main theorem would still have been proved, but the required computation time would have been multiplied by a factor of considerably more than $e^{3\gamma} = 5.6499\ldots$.

What remained then was just to compare the estimates on (1.35) and (1.36) and check that (1.36) is smaller for $n \ge 10^{27}$. This final step was just bookkeeping. As we already discussed, a check for $n < 10^{27}$ is easy. Thus ends the proof of the main theorem.

1.6 Some remarks on computations

There were two main computational tasks: verifying the ternary conjecture for all n ≤ C, and checking the Generalized Riemann Hypothesis for modulus q ≤ r up to a certain height.

The first task was not very demanding. Platt and I verified in [HP13] that every odd integer $5 < n \le 8.8 \cdot 10^{30}$ can be written as the sum of three primes. (In the end, only a check for $5 < n \le 10^{27}$ was needed.) We proceeded as follows. In a major computational effort, Oliveira e Silva, Herzog and Pardi [OeSHP14] had already checked that the binary Goldbach conjecture is true up to $4 \cdot 10^{18}$, that is, every even number up to $4 \cdot 10^{18}$ is the sum of two primes. Given that, all we had to do was to construct a "prime ladder", that is, a list of primes from 3 up to $8.8 \cdot 10^{30}$ such that the difference between any two consecutive primes in the list is at least 4 and at most $4 \cdot 10^{18}$. (This is a known strategy: see [Sao98].) Then, for any odd integer $5 < n \le 8.8 \cdot 10^{30}$, there is a prime p in the list such that $4 \le n - p \le 4 \cdot 10^{18} + 2$. (Choose the largest p < n in the ladder, or, if n minus that prime is 2, choose the prime immediately under that.) By [OeSHP14] (and the fact that $4 \cdot 10^{18} + 2$ equals p + q, where p = 2000000000000001301 and q = 1999999999999998701 are both prime), we can write $n - p = p_1 + p_2$ for some primes $p_1$, $p_2$, and so $n = p + p_1 + p_2$.
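The prime-ladder strategy just described can be imitated at toy scale. The Python sketch below is an illustration only, not the computation in [HP13]: the bounds (binary Goldbach checked up to 202, ladder built up to 20000 with steps of at most 200) and all function names are assumptions chosen so that the script runs in a moment.

```python
from bisect import bisect_left, bisect_right

def primes_up_to(n):
    """Sieve of Eratosthenes: sorted list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def two_prime_table(limit, primes):
    """For each even 4 <= m <= limit, record one pair of primes with sum m."""
    prime_set = set(primes)
    return {m: next((p, m - p) for p in primes if m - p in prime_set)
            for m in range(4, limit + 1, 2)}

def build_ladder(primes, top, max_gap):
    """Primes starting at 3 with consecutive differences in [4, max_gap], reaching top."""
    ladder = [3]
    while ladder[-1] + max_gap < top:
        lo, hi = ladder[-1] + 4, ladder[-1] + max_gap
        i = bisect_right(primes, hi) - 1   # index of the largest prime <= hi
        if primes[i] < lo:
            raise ValueError("no admissible prime: gap too large for this ladder")
        ladder.append(primes[i])
    return ladder

def three_primes(n, ladder, table):
    """Write an odd n > 5 as p + p1 + p2 with p taken from the ladder."""
    i = bisect_left(ladder, n) - 1         # largest ladder prime < n
    if n - ladder[i] == 2:
        i -= 1                             # step one rung down, as in the text
    p = ladder[i]
    p1, p2 = table[n - p]
    return p, p1, p2

TOP, MAX_GAP = 20000, 200
primes = primes_up_to(TOP)
table = two_prime_table(MAX_GAP + 2, primes)
ladder = build_ladder(primes, TOP, MAX_GAP)
print(len(ladder), three_primes(19999, ladder, table))
```

Only about a hundred ladder primes are needed here, which mirrors the point of the method: the number of primes one must certify is governed by the ratio of the two bounds, not by the size of the range.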

Building a prime ladder involves only integer arithmetic, that is, computer manipulation of integers, rather than of real numbers. Integers are something that computers can handle rapidly and reliably. We look for primes for our ladder only among a special set of integers whose primality can be tested deterministically quite quickly (Proth numbers: $k \cdot 2^m + 1$, $k < 2^m$). Thus, we can build a prime ladder by a rigorous, deterministic algorithm that can be (and was) parallelized trivially.
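One standard way to test Proth numbers deterministically is Proth's theorem: if $N = k \cdot 2^m + 1$ with k odd and $k < 2^m$, and a is an integer with Jacobi symbol (a/N) = −1, then N is prime exactly when $a^{(N-1)/2} \equiv -1 \bmod N$. The sketch below implements this criterion; it is an assumption that this matches what was done in [HP13], and the function names are hypothetical.

```python
import math

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the usual reciprocity algorithm."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def proth_decompose(N):
    """Return (k, m) with N = k * 2**m + 1 and k odd."""
    k, m = N - 1, 0
    while k % 2 == 0:
        k //= 2
        m += 1
    return k, m

def is_proth_prime(N):
    """Decide primality of a Proth number N = k * 2**m + 1, k odd, k < 2**m."""
    k, m = proth_decompose(N)
    if N < 3 or k >= 2 ** m:
        raise ValueError("not a Proth number")
    if math.isqrt(N) ** 2 == N:
        return False                  # squares have no a with (a/N) = -1
    a = 2
    while True:
        j = jacobi(a, N)
        if j == 0:
            return False              # a shares a factor with N
        if j == -1:
            return pow(a, (N - 1) // 2, N) == N - 1
        a += 1

print([N for N in range(3, 200, 2)
       if proth_decompose(N)[0] < 2 ** proth_decompose(N)[1] and is_proth_prime(N)])
```

The test needs a single modular exponentiation once a suitable a is found, and the search for a uses only gcd-like operations, so everything stays within exact integer arithmetic.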

The second computation is more demanding. It consists in verifying that, for every L-function L(s, χ) with χ of conductor q ≤ r = 300000 (for q even) or q ≤ r/2 (for q odd), all zeroes of L(s, χ) such that $|\Im(s)| \le H_q = 10^8/q$ (for q odd) and $|\Im(s)| \le H_q = \max(10^8/q, 200 + 7.5 \cdot 10^7/q)$ (for q even) lie on the critical line. As a matter of fact, Platt went up to conductor q ≤ 200000 (or twice that for q even) [Plab]; he had already gone up to conductor 100000 in his PhD thesis [Pla11]. The verification took, in total, about 400000 core-hours (i.e., the total number of processor cores used times the number of hours they ran equals 400000; nowadays, a top-of-the-line processor typically has eight cores). In the end, since I used only q ≤ 150000 (or twice that for q even), the number of hours actually needed was closer to 160000; since I could have made do with q ≤ 120000 (at the cost of increasing C to $10^{29}$ or $10^{30}$), it is likely, in retrospect, that only about 80000 core-hours were needed.

Checking zeros of L-functions computationally goes back to Riemann (who did it by hand for the special case of the Riemann zeta function). It is also one of the things that were tried on digital computers in their early days (by Turing [Tur53], for instance; see the exposition in [Boo06b]). One of the main issues to be careful about arises whenever one manipulates real numbers via a computer: generally speaking, a computer cannot store an irrational number; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, one cannot really say: "computer, give me the sine of that number" and expect a precise result. What one should do, if one really wants to prove something (as is the case here), is to say: "computer, I am giving you an interval $I = [a/2^k, b/2^k]$; give me an interval $I' = [c/2^\ell, d/2^\ell]$, preferably very short, such that $\sin(I) \subset I'$". This is called interval arithmetic; it is arguably the easiest way to do floating-point computations rigorously.

Processors do not do this natively, and if interval arithmetic is implemented purely in software, computations can be slowed down by a factor of about 100. Fortunately, there are ways of running interval-arithmetic computations partly in hardware, partly in software.

Incidentally, there are some basic functions (such as sin) that should always be done in software, not just if one wants to use interval arithmetic, but even if one just wants reasonably precise results: the implementation of transcendental functions in some of the most popular processors does not always round correctly, and errors can accumulate quickly. Fortunately, this problem is already well-known, and there is software that takes care of this. (Platt and I used the crlibm library [DLDDD+10].)


Lastly, there were several relatively minor computations strewn here and there in the proof. There is some numerical integration, done rigorously; once or twice, this was done using a standard package based on interval arithmetic [Ned06], but most of the time I wrote my own routines in C (using Platt's interval arithmetic package) for the sake of speed. Another kind of computation (employed much more in [Hela] than in the somewhat more polished version of the proof given here) was a rigorous version of a "proof by graph" ("the maximum of a function $f$ is clearly less than 4 because I can see it on the screen"). There is a standard way to do this (see, e.g., [Tuc11, §5.2]); essentially, the bisection method combines naturally with interval arithmetic, as we shall describe in §2.6. Yet another computation (and not a very small one) was that involved in verifying a large-sieve inequality in an intermediate range (as we discussed in §1.5).

It may be interesting to note that one of the inequalities used to estimate (1.30) was proven with the help of automatic quantifier elimination [HB11]. Proving this inequality was a very minor task, both computationally and mathematically; in all likelihood, it is feasible to give a human-generated proof. Still, it is nice to know from first-hand experience that computers can nowadays (pretend to) do something other than just perform numerical computations – and that this is already applicable in current mathematical practice.

      Chapter 2

      Notation and preliminaries

2.1 General notation

Given positive integers $m$, $n$, we say $m | n^\infty$ if every prime dividing $m$ also divides $n$. We say a positive integer $n$ is square-full if, for every prime $p$ dividing $n$, the square $p^2$ also divides $n$. (In particular, 1 is square-full.) We say $n$ is square-free if $p^2 \nmid n$ for every prime $p$. For $p$ prime, $n$ a non-zero integer, we define $v_p(n)$ to be the largest non-negative integer $\alpha$ such that $p^\alpha | n$.

When we write $\sum_n$, we mean $\sum_{n=1}^\infty$, unless the contrary is stated. As always, $\Lambda(n)$ denotes the von Mangoldt function:
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^\alpha \text{ for some prime } p \text{ and some integer } \alpha \ge 1,\\ 0 & \text{otherwise,} \end{cases} \]
and $\mu$ denotes the Möbius function:
\[ \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct,}\\ 0 & \text{if } p^2 | n \text{ for some prime } p. \end{cases} \]

We let $\tau(n)$ be the number of divisors of an integer $n$, $\omega(n)$ the number of prime divisors of $n$, and $\sigma(n)$ the sum of the divisors of $n$.

We write $(a,b)$ for the greatest common divisor of $a$ and $b$. If there is any risk of confusion with the pair $(a,b)$, we write $\gcd(a,b)$. Denote by $(a, b^\infty)$ the divisor $\prod_{p|b} p^{v_p(a)}$ of $a$. (Thus, $a/(a,b^\infty)$ is coprime to $b$, and is in fact the maximal divisor of $a$ with this property.)

As is customary, we write $e(x)$ for $e^{2\pi i x}$. We denote the $L_r$ norm of a function $f$ by $|f|_r$. We write $O^*(R)$ to mean a quantity at most $R$ in absolute value. Given a set $S$, we write $1_S$ for its characteristic function:
\[ 1_S(x) = \begin{cases} 1 & \text{if } x \in S,\\ 0 & \text{otherwise.} \end{cases} \]
Write $\log^+ x$ for $\max(\log x, 0)$.


2.2 Dirichlet characters and L functions

Let us go over some basic terms. A Dirichlet character $\chi: \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$ with the convention that $\chi(n) = 0$ when $(n,q) \ne 1$. (In other words: $\chi$ is completely multiplicative and periodic modulo $q$, and vanishes on integers not coprime to $q$.) Again by convention, there is a Dirichlet character of modulus $q = 1$, namely, the trivial character $\chi_T: \mathbb{Z} \to \mathbb{C}$ defined by $\chi_T(n) = 1$ for every $n \in \mathbb{Z}$.

If $\chi$ is a character modulo $q$ and $\chi'$ is a character modulo $q' | q$ such that $\chi(n) = \chi'(n)$ for all $n$ coprime to $q$, we say that $\chi'$ induces $\chi$. A character is primitive if it is not induced by any character of smaller modulus. Given a character $\chi$, we write $\chi^*$ for the (uniquely defined) primitive character inducing $\chi$. If a character $\chi \bmod q$ is induced by the trivial character $\chi_T$, we say that $\chi$ is principal and write $\chi_0$ for $\chi$ (provided the modulus $q$ is clear from the context). In other words, $\chi_0(n) = 1$ when $(n,q) = 1$ and $\chi_0(n) = 0$ when $(n,q) \ne 1$.

A Dirichlet L-function $L(s,\chi)$ ($\chi$ a Dirichlet character) is defined as the analytic continuation of $\sum_n \chi(n) n^{-s}$ to the entire complex plane; there is a pole at $s = 1$ if $\chi$ is principal.

A non-trivial zero of $L(s,\chi)$ is any $s \in \mathbb{C}$ such that $L(s,\chi) = 0$ and $0 < \Re(s) < 1$. (In particular, a zero at $s = 0$ is called "trivial", even though its contribution can be a little tricky to work out. The same would go for the other zeros with $\Re(s) = 0$ occurring for $\chi$ non-primitive, though we will avoid this issue by working mainly with $\chi$ primitive.) The zeros that occur at (some) negative integers are called trivial zeros.

The critical line is the line $\Re(s) = 1/2$ in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet L-functions reads: for every Dirichlet character $\chi$, all non-trivial zeros of $L(s,\chi)$ lie on the critical line. Verifiable finite versions of the generalized Riemann hypothesis generally read: for every Dirichlet character $\chi$ of modulus $q \le Q$, all non-trivial zeros of $L(s,\chi)$ with $|\Im(s)| \le f(q)$ lie on the critical line (where $f: \mathbb{Z} \to \mathbb{R}^+$ is some given function).

2.3 Fourier transforms and exponential sums

The Fourier transform on $\mathbb{R}$ is normalized here as follows:
\[ \widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\,dx. \]
The trivial bound is $|\widehat{f}|_\infty \le |f|_1$. If $f$ is compactly supported (or of fast enough decay as $t \to \pm\infty$) and piecewise continuous, $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$ by integration by parts. Iterating, we obtain that, if $f$ is of fast decay and differentiable $k$ times outside finitely many points, then
\[ \widehat{f}(t) = O^*\left(\frac{|\widehat{f^{(k)}}|_\infty}{(2\pi t)^k}\right) = O^*\left(\frac{|f^{(k)}|_1}{(2\pi t)^k}\right). \tag{2.1} \]

Thus, for instance, if $f$ is compactly supported, continuous and piecewise $C^1$, then $\widehat{f}$ decays at least quadratically.

It could happen that $|f^{(k)}|_1 = \infty$, in which case (2.1) is trivial (but not false). In practice, we require $f^{(k)} \in L_1$. In a typical situation, $f$ is differentiable $k$ times except at $x_1, x_2, \ldots, x_k$, where it is differentiable only $(k-2)$ times; the contribution of $x_i$ (say) to $|f^{(k)}|_1$ is then $|\lim_{x\to x_i^+} f^{(k-1)}(x) - \lim_{x\to x_i^-} f^{(k-1)}(x)|$.

The following bound is standard (see, e.g., [Tao14, Lemma 3.1]): for $\alpha \in \mathbb{R}/\mathbb{Z}$ and $f: \mathbb{R} \to \mathbb{C}$ compactly supported and piecewise continuous,
\[ \left|\sum_{n\in\mathbb{Z}} f(n) e(\alpha n)\right| \le \min\left(|f|_1 + \frac{1}{2}|f'|_1,\ \frac{\frac{1}{2}|f'|_1}{|\sin(\pi\alpha)|}\right). \tag{2.2} \]
(The first bound follows from $\sum_{n\in\mathbb{Z}} |f(n)| \le |f|_1 + (1/2)|f'|_1$, which, in turn, is a quick consequence of the fundamental theorem of calculus; the second bound is proven by summation by parts.) The alternative bound $(1/4)|f''|_1/|\sin(\pi\alpha)|^2$ given in [Tao14, Lemma 3.1] (for $f$ continuous and piecewise $C^1$) can usually be improved by the following estimate.

Lemma 2.3.1. Let $f: \mathbb{R} \to \mathbb{C}$ be compactly supported, continuous and piecewise $C^1$. Then
\[ \left|\sum_{n\in\mathbb{Z}} f(n) e(\alpha n)\right| \le \frac{\frac{1}{4}|\widehat{f''}|_\infty}{(\sin \pi\alpha)^2} \tag{2.3} \]
for every $\alpha \in \mathbb{R}$.

As usual, the assumption of compact support could easily be relaxed to an assumption of fast decay.

Proof. By the Poisson summation formula,
\[ \sum_{n=-\infty}^{\infty} f(n) e(\alpha n) = \sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha). \]
Since $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$,
\[ \sum_{n=-\infty}^{\infty} \widehat{f}(n-\alpha) = \sum_{n=-\infty}^{\infty} \frac{\widehat{f'}(n-\alpha)}{2\pi i (n-\alpha)} = \sum_{n=-\infty}^{\infty} \frac{\widehat{f''}(n-\alpha)}{(2\pi i (n-\alpha))^2}. \]
By Euler's formula $\pi \cot s\pi = 1/s + \sum_{n=1}^{\infty} (1/(n+s) - 1/(n-s))$,
\[ \sum_{n=-\infty}^{\infty} \frac{1}{(n+s)^2} = -(\pi \cot s\pi)' = \frac{\pi^2}{(\sin s\pi)^2}. \tag{2.4} \]
Hence
\[ \left|\sum_{n=-\infty}^{\infty} \widehat{f}(n-\alpha)\right| \le |\widehat{f''}|_\infty \sum_{n=-\infty}^{\infty} \frac{1}{(2\pi(n-\alpha))^2} = |\widehat{f''}|_\infty \cdot \frac{1}{(2\pi)^2} \cdot \frac{\pi^2}{(\sin \alpha\pi)^2}. \]
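As a numerical illustration of Lemma 2.3.1 (a sketch, not part of the proof), take the triangle function $f(x) = \max(0, 1 - |x|/N)$, an example chosen here for convenience. Its second derivative, understood as a distribution, is $(1/N)(\delta_{-N} - 2\delta_0 + \delta_N)$, so $\widehat{f''}(t) = (2\cos(2\pi N t) - 2)/N$ and $|\widehat{f''}|_\infty = 4/N$. The lemma then gives the bound $1/(N \sin^2 \pi\alpha)$, while the sum itself is the Fejér kernel $\sin^2(\pi N\alpha)/(N \sin^2 \pi\alpha)$.

```python
import cmath
import math


def e(x: float) -> complex:
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def triangle_sum(N: int, alpha: float) -> complex:
    """Sum over n of f(n) e(alpha*n) for f(x) = max(0, 1 - |x|/N)."""
    return sum((1 - abs(n) / N) * e(alpha * n) for n in range(-N + 1, N))


def lemma_bound(N: int, alpha: float) -> float:
    """Right side of (2.3), using |f''^|_inf = 4/N for the triangle function."""
    return 0.25 * (4 / N) / math.sin(math.pi * alpha) ** 2


def fejer_closed_form(N: int, alpha: float) -> float:
    """Classical closed form of the same sum (Fejér kernel)."""
    return math.sin(math.pi * N * alpha) ** 2 / (N * math.sin(math.pi * alpha) ** 2)
```

The closed form shows that (2.3) cannot be improved in general: whenever $\sin^2(\pi N \alpha) = 1$, the sum equals the bound.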

The trivial bound $|\widehat{f''}|_\infty \le |f''|_1$, applied to (2.3), recovers the bound in [Tao14, Lemma 3.1]. In order to do better, we will give a tighter bound for $|\widehat{f''}|_\infty$ in Appendix B when $f$ is equal to one of our main smoothing functions ($f = \eta_2$).

Integrals of multiples of $f''$ (in particular, $|f''|_1$ and $\widehat{f''}$) can still be made sense of when $f''$ is undefined at a finite number of points, provided $f$ is understood as a distribution (and $f'$ has finite total variation). This is the case, in particular, for $f = \eta_2$.

When we need to estimate $\sum_n f(n)$ precisely, we will use the Poisson summation formula:
\[ \sum_n f(n) = \sum_n \widehat{f}(n). \]
We will not have to worry about convergence here, since we will apply the Poisson summation formula only to compactly supported functions $f$ whose Fourier transforms decay at least quadratically.

2.4 Mellin transforms

The Mellin transform of a function $\phi: (0,\infty) \to \mathbb{C}$ is
\[ M\phi(s) = \int_0^\infty \phi(x) x^{s-1}\,dx. \tag{2.5} \]
If $\phi(x) x^{\sigma-1}$ is in $\ell_1$ with respect to $dx$ (i.e., $\int_0^\infty |\phi(x)| x^{\sigma-1}\,dx < \infty$), then the Mellin transform is defined on the line $\sigma + i\mathbb{R}$. Moreover, if $\phi(x) x^{\sigma-1}$ is in $\ell_1$ for $\sigma = \sigma_1$ and for $\sigma = \sigma_2$, where $\sigma_2 > \sigma_1$, then it is easy to see that it is also in $\ell_1$ for all $\sigma \in (\sigma_1, \sigma_2)$, and that, moreover, the Mellin transform is holomorphic on $\{s : \sigma_1 < \Re(s) < \sigma_2\}$. We then say that $\{s : \sigma_1 < \Re(s) < \sigma_2\}$ is a strip of holomorphy for the Mellin transform.

The Mellin transform becomes a Fourier transform (of $\phi(e^{-2\pi v}) e^{-2\pi v\sigma}$) by means of the change of variables $x = e^{-2\pi v}$. We thus obtain, for example, that the Mellin transform is an isometry, in the sense that
\[ \int_0^\infty |f(x)|^2 x^{2\sigma}\,\frac{dx}{x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mf(\sigma + it)|^2\,dt. \tag{2.6} \]
Recall that, in the case of the Fourier transform, for $|\widehat{f}|_2 = |f|_2$ to hold, it is enough that $f$ be in $\ell_1 \cap \ell_2$. This gives us that, for (2.6) to hold, it is enough that $f(x) x^{\sigma-1}$ be in $\ell_1$ and $f(x) x^{\sigma-1/2}$ be in $\ell_2$ (again, with respect to $dx$ in both cases).

We write $f *_M g$ for the multiplicative, or Mellin, convolution of $f$ and $g$:
\[ (f *_M g)(x) = \int_0^\infty f(w) g\left(\frac{x}{w}\right) \frac{dw}{w}. \tag{2.7} \]
In general,
\[ M(f *_M g) = Mf \cdot Mg \tag{2.8} \]

and
\[ M(f \cdot g)(s) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} Mf(z)\, Mg(s - z)\,dz \qquad \text{[GR94, §17.32]} \tag{2.9} \]
provided that $z$ and $s - z$ are within the strips on which $Mf$ and $Mg$ (respectively) are well-defined.

We also have several useful transformation rules, just as for the Fourier transform. For example,
\[ \begin{aligned} M(f'(t))(s) &= -(s-1) \cdot Mf(s-1),\\ M(t f'(t))(s) &= -s \cdot Mf(s),\\ M((\log t) f(t))(s) &= (Mf)'(s) \end{aligned} \tag{2.10} \]
(as in, e.g., [BBO10, Table 11.1]).

Let
\[ \eta_2 = (2 \cdot 1_{[1/2,1]}) *_M (2 \cdot 1_{[1/2,1]}). \]
Since (see, e.g., [BBO10, Table 11.3] or [GR94, §16.43])
\[ (M 1_{[a,b]})(s) = \frac{b^s - a^s}{s}, \]
we see that
\[ M\eta_2(s) = \left(\frac{1 - 2^{-s}}{s}\right)^2, \qquad M\eta_4(s) = \left(\frac{1 - 2^{-s}}{s}\right)^4. \tag{2.11} \]

Let $f_z = e^{-zt}$, where $\Re(z) > 0$. Then
\[ \begin{aligned} (Mf_z)(s) &= \int_0^\infty e^{-zt} t^{s-1}\,dt = \frac{1}{z^s} \int_0^\infty e^{-zt} (zt)^{s}\,\frac{dt}{t}\\ &= \frac{1}{z^s} \int_0^{z\infty} e^{-u} u^{s-1}\,du = \frac{1}{z^s} \int_0^\infty e^{-t} t^{s-1}\,dt = \frac{\Gamma(s)}{z^s}, \end{aligned} \]
where the next-to-last step holds by contour integration, and the last step holds by the definition of the Gamma function $\Gamma(s)$.
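The identity $M(e^{-zt})(s) = \Gamma(s)/z^s$ can be sanity-checked numerically for complex $z$. The sketch below uses a plain midpoint rule in floating point, so it is an illustration rather than a rigorous computation; the cutoff `T`, the step count, and the restriction to real $s > 1$ and $\Re(z) \ge 1$ are assumptions made here to keep the quadrature well behaved.

```python
import cmath
from math import gamma


def mellin_exp(z: complex, s: float, T: float = 60.0, steps: int = 200_000) -> complex:
    """Midpoint-rule approximation to the integral of exp(-z t) t^(s-1) over [0, T].

    Assumes Re(z) >= 1 (so that the tail beyond T is negligible) and real s > 1
    (so that the integrand is bounded near t = 0).
    """
    h = T / steps
    total = 0j
    for i in range(steps):
        t = (i + 0.5) * h
        total += cmath.exp(-z * t) * t ** (s - 1)
    return total * h


def mellin_exp_exact(z: complex, s: float) -> complex:
    """Gamma(s) / z^s, with the principal branch of z^s (valid for Re(z) > 0)."""
    return gamma(s) / z ** s
```

For example, `mellin_exp(1 + 2j, 2.5)` and `mellin_exp_exact(1 + 2j, 2.5)` should agree to several decimal places.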

2.5 Bounds on sums of μ and Λ

We will need some simple explicit bounds on sums involving the von Mangoldt function $\Lambda$ and the Möbius function $\mu$. In non-explicit work, such sums are usually bounded using the prime number theorem, or rather using the properties of the zeta function $\zeta(s)$ underlying the prime number theorem. Here, however, we need robust, fully explicit bounds valid over just about any range.

For the most part, we will just be quoting the literature, supplemented with some computations when needed. The proofs in the literature are sometimes based on properties of $\zeta(s)$, and sometimes on more elementary facts.

First, let us see some bounds involving $\Lambda$. The following bound can be easily derived from [RS62, (3.23)], supplemented by a quick calculation of the contribution of powers of primes $p < 32$:
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} \le \log x. \tag{2.12} \]
We can derive a bound in the other direction from [RS62, (3.21)] (for $x > 1000$, adding the contribution of all prime powers $\le 1000$) and a numerical verification for $x \le 1000$:
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} \ge \log x - \log\frac{3}{\sqrt{2}}. \tag{2.13} \]
We also use the following older bounds:

1. By the second table in [RR96, p. 423], supplemented by a computation for $2\cdot 10^6 \le y \le 4\cdot 10^6$,
\[ \sum_{n\le y} \Lambda(n) \le 1.0004 y \tag{2.14} \]
for $y \ge 2\cdot 10^6$.

2. \[ \sum_{n\le y} \Lambda(n) < 1.03883 y \tag{2.15} \]
for every $y > 0$ [RS62, Thm. 12].

For all $y > 663$,
\[ \sum_{n\le y} \Lambda(n) n < 1.03884 \frac{y^2}{2}, \tag{2.16} \]
where we use (2.15) and partial summation for $y > 200000$, and a computation for $663 < y \le 200000$. Using instead the second table in [RR96, p. 423], together with computations for small $y < 10^7$ and partial summation, we get that
\[ \sum_{n\le y} \Lambda(n) n < 1.0008 \frac{y^2}{2} \tag{2.17} \]
for $y > 1.6\cdot 10^6$. Similarly,
\[ \sum_{n\le y} \frac{\Lambda(n)}{\sqrt{n}} < 2 \cdot 1.0004 \sqrt{y} \tag{2.18} \]
for all $y \ge 1$. It is also true that
\[ \sum_{y/2 < p \le y} (\log p)^2 \le \frac{1}{2} y (\log y) \tag{2.19} \]
for $y \ge 117$: this holds for $y \ge 2\cdot 758699$ by [RS75, Cor. 2] (applied to $x = y$, $x = y/2$ and $x = 2y/3$) and for $117 \le y < 2\cdot 758699$ by direct computation.
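Bounds such as (2.12), (2.13) and (2.15) are easy to spot-check for small arguments. The sketch below does so in ordinary floating point with a small tolerance, so it is a sanity check and not a rigorous verification of the kind described in the text; the function names and the tolerance are choices made here. Since the sums are step functions, (2.12) and (2.15) only need to be tested at integers $x = n$, while (2.13) is tested for $x$ just below $n + 1$.

```python
from math import log, sqrt


def von_mangoldt_table(N: int):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    composite = bytearray(N + 1)
    for p in range(2, N + 1):
        if not composite[p]:
            for m in range(p * p, N + 1, p):
                composite[m] = 1
            pk = p
            while pk <= N:          # Lambda(p^k) = log p
                lam[pk] = log(p)
                pk *= p
    return lam


def check_lambda_bounds(N: int, tol: float = 1e-12) -> bool:
    """Spot-check (2.12), (2.13) and (2.15) for 1 <= x <= N."""
    lam = von_mangoldt_table(N)
    c = log(3 / sqrt(2))
    s = 0.0      # running sum of Lambda(n)/n
    psi = 0.0    # running sum of Lambda(n)
    for n in range(1, N + 1):
        s += lam[n] / n
        psi += lam[n]
        if s > log(n) + tol:               # (2.12) at x = n
            return False
        if s < log(n + 1) - c - tol:       # (2.13) for n <= x < n + 1
            return False
        if psi >= 1.03883 * n:             # (2.15) at x = n
            return False
    return True
```

Note that (2.13) is an equality in the limit $x \to 3^-$, which is why a tolerance is needed in floating point.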

Now let us see some estimates on sums involving $\mu$. The situation here is less satisfactory than for sums involving $\Lambda$. The main reason is that the complex-analytic approach to estimating $\sum_{n\le N} \mu(n)$ would involve $1/\zeta(s)$ rather than $\zeta'(s)/\zeta(s)$, and thus strong explicit bounds on the residues of $1/\zeta(s)$ would be needed. Thus, explicit estimates on sums involving $\mu$ are harder to obtain than estimates on sums involving $\Lambda$. This is so even though analytic number theorists are generally used (from the habit of non-explicit work) to see the estimation of one kind of sum or the other as essentially the same task.

Fortunately, in the case of sums of the type $\sum_{n\le x} \mu(n)/n$ for $x$ arbitrary (a type of sum that will be rather important for us), all we need is a saving of $(\log n)$ or $(\log n)^2$ on the trivial bound. This is provided by the following.

1. (Granville-Ramaré [GR96], Lemma 10.2)
\[ \left|\sum_{\substack{n\le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n}\right| \le 1 \tag{2.20} \]
for all $x$, $q \ge 1$.

2. (Ramaré [Ram13]; cf. El Marraki [EM95], [EM96])
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \frac{0.03}{\log x} \tag{2.21} \]
for $x \ge 11815$.

3. (Ramaré [Ramb])
\[ \sum_{\substack{n\le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n} = O^*\left(\frac{1}{\log x/q} \cdot \frac{4}{5} \frac{q}{\phi(q)}\right) \tag{2.22} \]
for all $x$ and all $q \le x$;
\[ \sum_{\substack{n\le x\\ \gcd(n,q)=1}} \frac{\mu(n)}{n} \log\frac{x}{n} = O^*\left(1.00303 \frac{q}{\phi(q)}\right) \tag{2.23} \]
for all $x$ and all $q$.

Improvements on these bounds would lead to improvements on type I estimates, but not in what are the worst terms overall at this point.

A computation carried out by the author has proven the following inequality for all real $x \le 10^{12}$:
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \sqrt{\frac{2}{x}}. \tag{2.24} \]
The computation was conducted rigorously by means of interval arithmetic. For the sake of verification, we record that
\[ 5.42625 \cdot 10^{-8} \le \sum_{n\le 10^{12}} \frac{\mu(n)}{n} \le 5.42898 \cdot 10^{-8}. \]
Computations also show that the stronger bound
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \frac{1}{2\sqrt{x}} \]
holds for all $3 \le x \le 7727068587$, but not for $x = 7727068588 - \epsilon$.

Earlier, numerical work carried out by Olivier Ramaré [Ram14] had shown that (2.24) holds for all $x \le 10^{10}$.
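At a much smaller scale, (2.24) and the stronger bound can be checked exactly, with rational arithmetic taking the place of the interval arithmetic used in the actual computation. This is a sketch under that assumption; the function names are illustrative. The partial sum is constant on $[n, n+1)$ while both bounds decrease in $x$, so it suffices to compare the square of the sum at $n$ with the square of the bound at $n + 1$.

```python
from fractions import Fraction


def mobius_table(N: int):
    """Return a list mu with mu[n] = Moebius function of n, for 0 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0               # not square-free
    return mu


def check_mu_bounds(N: int) -> bool:
    """Check (2.24) for 1 <= x < N + 1 and the bound 1/(2 sqrt(x)) for 3 <= x < N + 1."""
    mu = mobius_table(N)
    s = Fraction(0)
    for n in range(1, N + 1):
        s += Fraction(mu[n], n)
        if s * s > Fraction(2, n + 1):                   # |s| <= sqrt(2/x), x < n + 1
            return False
        if n >= 3 and s * s > Fraction(1, 4 * (n + 1)):  # |s| <= 1/(2 sqrt(x))
            return False
    return True
```

Exact rationals become expensive quickly, since the denominators grow like the least common multiple of $1, \ldots, N$; this is one reason a computation up to $10^{12}$ calls for interval arithmetic instead.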

2.6 Interval arithmetic and the bisection method

Interval arithmetic has as its basic data type intervals of the form $I = [a/2^\ell, b/2^\ell]$, where $a, b, \ell \in \mathbb{Z}$ and $a \le b$. Say we have a real number $x$, and we want to know $\sin(x)$. In general, we cannot represent $x$ in a computer, in part because it may have no finite description. The best we can do is to construct an interval of the form $I = [a/2^\ell, b/2^\ell]$ in which $x$ is contained.

What we ask of a routine in an interval-arithmetic package is to construct an interval $I' = [a'/2^{\ell'}, b'/2^{\ell'}]$ in which $\sin(I)$ is contained. (In practice, this is done partly in software, by means of polynomial approximations to sin with precise error terms, and partly in hardware, by means of an efficient usage of rounding conventions.) This gives us, in effect, a value for $\sin(x)$ (namely, $(a' + b')/2^{\ell'+1}$) and a bound on the error term (namely, $(b' - a')/2^{\ell'+1}$).
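To make the contract of such a routine concrete, here is a toy version in exact rational arithmetic. It is a sketch only: real packages work with hardware rounding modes rather than rationals, and the restriction to intervals inside $[-1, 1]$ (where sin is increasing), the number of Taylor terms, and the name `sin_enclosure` are assumptions made here. The enclosure comes from a Taylor polynomial with the remainder bound $|x|^{2K+1}/(2K+1)!$, followed by outward rounding to multiples of $2^{-\ell}$.

```python
import math
from fractions import Fraction


def sin_enclosure(a: int, b: int, l: int, terms: int = 8):
    """Given I = [a/2^l, b/2^l] inside [-1, 1] (with l >= 0), return integers
    (c, d) such that sin(I) is contained in [c/2^l, d/2^l]."""
    lo, hi = Fraction(a, 2 ** l), Fraction(b, 2 ** l)
    if not (-1 <= lo <= hi <= 1):
        raise ValueError("this toy routine only handles I inside [-1, 1]")

    def taylor_bounds(x: Fraction):
        # Partial sum of the sine series and a bound on what was left out.
        partial = sum(
            Fraction((-1) ** j, math.factorial(2 * j + 1)) * x ** (2 * j + 1)
            for j in range(terms)
        )
        remainder = abs(x) ** (2 * terms + 1) / math.factorial(2 * terms + 1)
        return partial - remainder, partial + remainder

    # sin is increasing on [-1, 1], so sin(I) lies in [sin(lo), sin(hi)].
    lower = taylor_bounds(lo)[0]
    upper = taylor_bounds(hi)[1]
    # Round outwards to the grid of multiples of 2^(-l).
    return math.floor(lower * 2 ** l), math.ceil(upper * 2 ** l)
```

With `c, d = sin_enclosure(2**19, 2**19, 20)`, the midpoint $(c + d)/2^{21}$ approximates $\sin(1/2)$ with error at most $(d - c)/2^{21}$.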

There are several implementations of interval arithmetic available. We will almost always use D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas. (At one point, we will use the PROFIL/BIAS interval arithmetic package [Knu99], since it underlies the VNODE-LP [Ned06] package, which we use to bound an integral.)

The bisection method is a particularly simple method for finding maxima and minima of functions, as well as roots. It combines rather nicely with interval arithmetic, which makes the method rigorous. We follow an implementation based on [Tuc11, §5.2]. Let us go over the basic ideas.

Let us use the bisection method to find the minima (say) of a function $f$ on a compact interval $I_0$. (If the interval is non-compact, we generally apply the bisection method to a compact sub-interval and use other tools, e.g., power-series expansions, in the complement.) The method proceeds by splitting an interval into two repeatedly, discarding the halves where the minimum cannot be found. More precisely, if we implement it by interval arithmetic, it proceeds as follows. First, in an optional initial step, we subdivide (if necessary) the interval $I_0$ into smaller intervals $I_k$ to which the algorithm will actually be applied. For each $k$, interval arithmetic gives us a lower bound $r_k^-$ and an upper bound $r_k^+$ on $f(x)$, $x \in I_k$; here $r_k^-$ and $r_k^+$ are both of the form $a/2^\ell$, $a, \ell \in \mathbb{Z}$. Let $m_0$ be the minimum of $r_k^+$ over all $k$. We can discard all the intervals $I_k$ for which $r_k^- > m_0$. Then we apply the main procedure: starting with $i = 1$, split each surviving interval into two equal halves, recompute the lower and upper bound on each half, define $m_i$, as before, to be the minimum of all upper bounds, and discard, again, the intervals on which the lower bound is larger than $m_i$; increase $i$ by 1. We repeat the main procedure as often as needed. In the end, we obtain that the minimum is no smaller than the minimum of the lower bounds (call them $(r^{(i)})_k^-$) on all surviving intervals $I_k^{(i)}$. Of course, we also obtain that the minimum (or minima, if there is more than one) must lie in one of the surviving intervals.

It is easy to see how the same method can be applied (with a trivial modification) to find maxima, or (with very slight changes) to find the roots of a real-valued function on a compact interval.
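The main procedure can be sketched in a few lines. The version below is a toy under stated assumptions: exact rationals stand in for dyadic floating-point intervals, the optional initial subdivision is skipped, and the test function $f(x) = x^2 - x$ with its naive interval extension `f_range` is an example chosen here, not one from the text.

```python
from fractions import Fraction


def f_range(lo, hi):
    """Enclosure (r_minus, r_plus) of f(x) = x^2 - x for x in [lo, hi],
    obtained by naive interval arithmetic."""
    # Range of x^2 on [lo, hi].
    sq_lo = 0 if lo <= 0 <= hi else min(lo * lo, hi * hi)
    sq_hi = max(lo * lo, hi * hi)
    # Subtract the interval [lo, hi].
    return sq_lo - hi, sq_hi - lo


def bisect_min(f_range, lo, hi, rounds: int):
    """Bisection method for the minimum of f on [lo, hi] (rounds >= 1).

    Returns (r_minus, m, intervals): the minimum of f lies in [r_minus, m]
    and is attained inside one of the surviving intervals."""
    intervals = [(Fraction(lo), Fraction(hi))]
    for _ in range(rounds):
        halves = []
        for a, b in intervals:
            mid = (a + b) / 2
            halves.extend([(a, mid), (mid, b)])
        bounds = [f_range(a, b) for a, b in halves]
        m = min(r_plus for _, r_plus in bounds)   # the minimum of f is at most m
        kept = [(iv, bd) for iv, bd in zip(halves, bounds) if bd[0] <= m]
        intervals = [iv for iv, _ in kept]
    r_minus = min(bd[0] for _, bd in kept)
    return r_minus, m, intervals
```

For example, `bisect_min(f_range, 0, 2, 20)` encloses the true minimum $-1/4$ (attained at $x = 1/2$) in an interval of length below $10^{-4}$. The number of surviving intervals grows with each round because the naive enclosure overestimates the range; a tighter interval extension would prune more aggressively.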


      Part I

      Minor arcs


      Chapter 3

      Introduction

The circle method expresses the number of solutions to a given problem in terms of exponential sums. Let $\eta: \mathbb{R}^+ \to \mathbb{C}$ be a smooth function, $\Lambda$ the von Mangoldt function (defined as in (1.5)) and $e(t) = e^{2\pi i t}$. The estimation of exponential sums of the type
\[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{3.1} \]
where $\alpha \in \mathbb{R}/\mathbb{Z}$, already lies at the basis of Hardy and Littlewood's approach to the ternary Goldbach problem by means of the circle method [HL22]. The division of the circle $\mathbb{R}/\mathbb{Z}$ into "major arcs" and "minor arcs" goes back to Hardy and Littlewood's development of the circle method for other problems. As they themselves noted, assuming GRH means that, for the ternary Goldbach problem, all of the circle can be, in effect, subdivided into major arcs – that is, under GRH, (3.1) can be estimated with major-arc techniques for $\alpha$ arbitrary. They needed to make such an assumption precisely because they did not yet know how to estimate $S_\eta(\alpha, x)$ on the minor arcs.

Minor-arc techniques for Goldbach's problem were first developed by Vinogradov [Vin37]. These techniques make it possible to work without GRH. The main obstacle to a full proof of the ternary Goldbach conjecture since then has been that, in spite of gradual improvements, minor-arc bounds have simply not been strong enough.

As in all work to date, our aim will be to give useful upper bounds on (3.1) for $\alpha$ in the minor arcs, rather than the precise estimates that are typical of the major-arc case. We will have to give upper bounds that are qualitatively stronger than those known before. (In Part III, we will also show how to use them more efficiently.)

Our main challenge will be to give a sufficiently good upper bound whenever $q$ is larger than a constant $r$. Here "sufficiently good" means "smaller than the trivial bound divided by a large constant, and getting even smaller quickly as $q$ grows". Our bound must also be good for $\alpha = a/q + \delta/x$, where $q < r$ but $\delta$ is large. (Such an $\alpha$ may be said to lie on the tail ($\delta$ large) of a major arc ($q$ small).)

Of course, all expressions must be explicit and all constants in the leading terms of the bound must be small. Still, the main requirement is a qualitative one. For instance, we know in advance that a single factor of $\log x$ would be the end of us. That is, we know that, if there is a single term of the form, say, $(x \log x)/q$, and the trivial bound is about $x$, we are lost: $(x \log x)/q$ is greater than $x$ for $x$ large and $q$ constant.

The quality of the results here is due to several new ideas of general applicability. In particular, §5.1 introduces a way to obtain cancellation from Vaughan's identity. Vaughan's identity is a two-log gambit, in that it introduces two convolutions (each of them at a cost of log) and offers a great deal of flexibility in compensation. One of the ideas presented here is that at least one of two logs can be successfully recovered after having been given away in the first stage of the proof. This reduces the cost of the use of this basic identity in this and, presumably, many other problems.

There are several other improvements that make a qualitative difference; see the discussions at the beginning of §4 and §5. Considering smoothed sums – now a common idea – also helps. (Smooth sums here go back to Hardy-Littlewood [HL22], both in the general context of the circle method and in the context of Goldbach's ternary problem. In recent work on the problem, they reappear in [Tao14].)

3.1 Results

The main bound we are about to see is essentially proportional to $((\log q)/\sqrt{\phi(q)}) \cdot x$. The term $\delta_0$ serves to improve the bound when we are on the tail of an arc.

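Theorem 3.1.1. Let $x \ge x_0$, $x_0 = 2.16\cdot 10^{20}$. Let $S_\eta(\alpha, x)$ be as in (3.1), with $\eta$ defined in (3.4). Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le x^{1/3}/6$, then
\[ |S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36 x^{5/6}, \tag{3.2} \]
where
\[ \begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9x^{1/3}}{2.004t}}\right) + 0.41415,\\ L_{x,t,q} &= \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \end{aligned} \tag{3.3} \]
If $q > x^{1/3}/6$, then
\[ |S_\eta(\alpha, x)| \le 0.276 x^{5/6} (\log x)^{3/2} + 1234 x^{2/3} \log x. \]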

The factor $R_{x,t}$ is small in practice; for instance, for $x = 10^{25}$ and $\delta_0 q = 5\cdot 10^5$ (typical "difficult" values), $R_{x,\delta_0 q}$ equals $0.59648\ldots$.

The classical choice¹ for $\eta$ in (3.1) is $\eta(t) = 1$ for $t \le 1$, $\eta(t) = 0$ for $t > 1$, which, of course, is not smooth, or even continuous. We use
\[ \eta(t) = \eta_2(t) = 4 \max(\log 2 - |\log 2t|, 0), \tag{3.4} \]
as in Tao [Tao14], in part for purposes of comparison. (This is the multiplicative convolution of the characteristic function of an interval with itself.) Nearly all work should be applicable to any other sufficiently smooth function $\eta$ of fast decay. It is important that $\widehat{\eta}$ decay at least quadratically.

¹ Or, more precisely, the choice made by Vinogradov and followed by most of the literature since him. Hardy and Littlewood [HL22] worked with $\eta(t) = e^{-t}$.

We are not forced to use the same smoothing function as in Part II, and we do not. As was explained in the introduction, the simple technique (1.40) allows us to work with one smoothing function on the major arcs and with another one on the minor arcs.
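The quantities in Theorem 3.1.1 are straightforward to evaluate. The sketch below transcribes (3.2) and (3.3) into floating point, with decimal points placed as the formulas are read here; it is meant for getting a feel for the sizes involved, not as a rigorous evaluation. The function names are illustrative, and the caller is assumed to supply $\phi(q)$ and to respect the hypotheses of the theorem (in particular $q \le x^{1/3}/6$).

```python
from math import log, sqrt


def R(x: float, t: float) -> float:
    """R_{x,t} as in (3.3)."""
    return 0.27125 * log(1 + log(4 * t) / (2 * log(9 * x ** (1 / 3) / (2.004 * t)))) + 0.41415


def L(x: float, t: float, q: int, phi_q: int) -> float:
    """L_{x,t,q} as in (3.3); x is kept as an argument only to mirror the notation."""
    return (q / phi_q) * (13 / 4 * log(t) + 7.82) + 13.66 * log(t) + 37.55


def minor_arc_bound(x: float, q: int, phi_q: int, delta: float) -> float:
    """Right side of (3.2), for q <= x^(1/3)/6."""
    d0 = max(2.0, abs(delta) / 4)
    t = d0 * q
    return ((R(x, t) * log(t) + 0.5) / sqrt(d0 * phi_q) * x
            + 2.5 * x / sqrt(t)
            + 2 * x / t * L(x, t, q, phi_q)
            + 3.36 * x ** (5 / 6))
```

Here `R(1e25, 5e5)` comes out close to the value 0.59648 quoted after the theorem. As a further illustration, for $x = 10^{27}$, $q = 510510 = 2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13\cdot 17$ (so $\phi(q) = 92160$) and $\delta = 0$, the ratio `minor_arc_bound(x, q, phi_q, 0) / x` is a little over 0.02.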

3.2 Comparison to earlier work

Table 3.1 compares the bounds for the ratio $|S_\eta(a/q, x)|/x$ given by this paper and by [Tao14, Thm. 1.3] for $x = 10^{27}$ and different values of $q$. We are comparing worst cases: $\phi(q)$ as small as possible ($q$ divisible by $2\cdot 3\cdot 5 \cdots$) in the result here, and $q$ divisible by 4 (implying $4\alpha \sim a/(q/4)$) in Tao's result. The main term in the result in this paper improves slowly with increasing $x$; the results in [Tao14] worsen slowly with increasing $x$. The qualitative gain with respect to the main term in [Tao14, (1.10)] is in the order of $\log(q) \sqrt{\phi(q)/q}$. Notice also that the bounds in [Tao14] are not log-free; in [Tao14, (1.10)], there is a term proportional to $x (\log x)^2/q$. This becomes larger than the trivial bound $x$ for $x$ very large.

The results in [DR01] are unfortunately worse than the trivial bound in the range covered by Table 3.1. Ramaré's results ([Ram10, Thm. 3], [Ramc, Thm. 6]) are not applicable within the range, since neither of the conditions $\log q \le (1/50)(\log x)^{1/3}$, $q \le x^{1/48}$ is satisfied. Ramaré's bound in [Ramc, Thm. 6] is
\[ \left|\sum_{x < n \le 2x} \Lambda(n) e(an/q)\right| \le 13000 \frac{\sqrt{q}}{\phi(q)} x \tag{3.5} \]
for $20 \le q \le x^{1/48}$. We should underline that, while both the constant 13000 and the condition $q \le x^{1/48}$ keep (3.5) from being immediately useful in the present context, (3.5) is asymptotically better than the results here as $q \to \infty$. (Indeed, qualitatively speaking, the form of (3.5) is the best one can expect from results derived by the family of methods stemming from Vinogradov's work.) There is also unpublished work by Ramaré (ca. 1993) with different constants for $q \ll (\log x/\log\log x)^4$.

3.3 Basic setup

In the minor-arc regime, the first step in estimating an exponential sum on the primes generally consists in the application of an identity expressing the von Mangoldt function $\Lambda(n)$ in terms of a sum of convolutions of other functions.

3.3.1 Vaughan's identity

We recall Vaughan's identity [Vau77b]:
\[ \Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}, \tag{3.6} \]

q_0             |S_η(a/q,x)|/x, HH     |S_η(a/q,x)|/x, Tao
10^5            0.04661                0.34475
1.5 · 10^5      0.03883                0.28836
2.5 · 10^5      0.03098                0.23194
5 · 10^5        0.02297                0.17416
7.5 · 10^5      0.01934                0.14775
10^6            0.01756                0.13159
10^7            0.00690                0.05251

Table 3.1: Worst-case upper bounds on $x^{-1}|S_\eta(a/2q, x)|$ for $q \ge q_0$, $|\delta| \le 8$, $x = 10^{27}$. The trivial bound is 1.

where 1 is the constant function 1 and where we write
\[ f_{\le z}(n) = \begin{cases} f(n) & \text{if } n \le z,\\ 0 & \text{if } n > z, \end{cases} \qquad f_{>z}(n) = \begin{cases} 0 & \text{if } n \le z,\\ f(n) & \text{if } n > z. \end{cases} \]
Here $f * g$ denotes the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$. We can set the values of $U$ and $V$ however we wish.

Vaughan's identity is essentially a consequence of the Möbius inversion formula
\[ (1 * \mu)(n) = \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{otherwise.} \end{cases} \tag{3.7} \]
Indeed, by (3.7),
\[ \Lambda_{>V}(n) = \sum_{dm|n} \mu(d) \Lambda_{>V}(m) = \sum_{dm|n} \mu_{\le U}(d) \Lambda_{>V}(m) + \sum_{dm|n} \mu_{>U}(d) \Lambda_{>V}(m). \]
Applying to this the trivial equality $\Lambda_{>V} = \Lambda - \Lambda_{\le V}$, as well as the simple fact that $1 * \Lambda = \log$, we obtain that
\[ \Lambda_{>V}(n) = \sum_{d|n} \mu_{\le U}(d) \log(n/d) - \sum_{dm|n} \mu_{\le U}(d) \Lambda_{\le V}(m) + \sum_{dm|n} \mu_{>U}(d) \Lambda_{>V}(m). \]
By $\Lambda = \Lambda_{>V} + \Lambda_{\le V}$, we conclude that Vaughan's identity (3.6) holds.

Applying Vaughan's identity, we easily get that, for any function $\eta: \mathbb{R} \to \mathbb{R}$, any completely multiplicative function $f: \mathbb{Z}^+ \to \mathbb{C}$ and any $x > 0$, $U, V \ge 0$,
\[ \sum_n \Lambda(n) f(n) e(\alpha n) \eta(n/x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty}, \tag{3.8} \]

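where
\[ \begin{aligned} S_{I,1} &= \sum_{m\le U} \mu(m) f(m) \sum_n (\log n) e(\alpha mn) f(n) \eta(mn/x),\\ S_{I,2} &= \sum_{d\le V} \Lambda(d) f(d) \sum_{m\le U} \mu(m) f(m) \sum_n e(\alpha dmn) f(n) \eta(dmn/x),\\ S_{II} &= \sum_{m>U} f(m) \Bigg(\sum_{\substack{d>U\\ d|m}} \mu(d)\Bigg) \sum_{n>V} \Lambda(n) e(\alpha mn) f(n) \eta(mn/x),\\ S_{0,\infty} &= \sum_{n\le V} \Lambda(n) e(\alpha n) f(n) \eta(n/x). \end{aligned} \tag{3.9} \]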

We will use the function
\[ f(n) = \begin{cases} 1 & \text{if } \gcd(n, v) = 1,\\ 0 & \text{otherwise,} \end{cases} \tag{3.10} \]
where $v$ is a small, positive, square-free integer. (Our final choice will be $v = 2$.) Then
\[ S_\eta(\alpha, x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty} + S_{0,v}, \tag{3.11} \]
where $S_\eta(\alpha, x)$ is as in (3.1) and
\[ S_{0,v} = \sum_{n|v} \Lambda(n) e(\alpha n) \eta(n/x). \]
The sums $S_{I,1}$, $S_{I,2}$ are called "of type I", the sum $S_{II}$ is called "of type II" (or bilinear). (The not-all-too colorful nomenclature goes back to Vinogradov.) The sum $S_{0,\infty}$ is in general negligible; for our later choice of $V$ and $\eta$, it will be in fact 0. The sum $S_{0,v}$ will be negligible as well.

As we already discussed in the introduction, Vaughan's identity is highly flexible (in that we can choose $U$ and $V$ at will) but somewhat inefficient in practice (in that a trivial estimate for the right side of (3.11) is actually larger than a trivial estimate for the left side of (3.11)). Some of our work will consist in regaining part of what is given up when we apply Vaughan's identity.
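Vaughan's identity, in the form $\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}$, holds pointwise for every choice of $U$ and $V$, which makes it easy to test numerically. The following floating-point sketch does this by brute force over divisors; the function names and the ranges tried are illustrative choices made here.

```python
from math import isqrt, log


def arith_tables(N: int):
    """Return (mu, lam): Moebius and von Mangoldt values for 0 <= n <= N (N >= 1)."""
    spf = list(range(N + 1))               # smallest prime factor
    for p in range(2, isqrt(N) + 1):
        if spf[p] == p:
            for m in range(p * p, N + 1, p):
                if spf[m] == m:
                    spf[m] = p
    mu = [0] * (N + 1)
    lam = [0.0] * (N + 1)
    mu[1] = 1
    for n in range(2, N + 1):
        p = spf[n]
        m = n // p
        mu[n] = 0 if m % p == 0 else -mu[m]
        while m % p == 0:
            m //= p
        lam[n] = log(p) if m == 1 else 0.0  # n is a power of p iff m == 1
    return mu, lam


def vaughan_rhs(n: int, U: float, V: float, mu, lam) -> float:
    """Right side of Vaughan's identity evaluated at n."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    type_one = sum(mu[d] * log(n // d) for d in divs if d <= U)
    low = high = 0.0
    for d in divs:
        for m in divs:
            if (n // d) % m == 0:           # d * m divides n
                if d <= U and m <= V:
                    low += mu[d] * lam[m]
                elif d > U and m > V:
                    high += mu[d] * lam[m]
    tail = lam[n] if n <= V else 0.0
    return type_one - low + high + tail
```

Changing $U$ and $V$ moves mass between the four terms without changing their total, which is the flexibility referred to above.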

3.3.2 An alternative route

There is an alternative route: namely, to use a less sacrificial, though also more inflexible, identity. While this was not, in the end, the route that was followed, let us nevertheless discuss it in some detail, in part so that we can understand to what extent it was, in retrospect, viable, and in part so as to see how much of the work we will undertake is really more or less independent of the particular identity we choose.


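Since $\zeta'(s)/\zeta(s) = -\sum_n \Lambda(n) n^{-s}$ and

$$\begin{aligned}
\left(\frac{\zeta'(s)}{\zeta(s)}\right)^{(2)} &= \left(\frac{\zeta''(s)}{\zeta(s)} - \frac{(\zeta'(s))^2}{\zeta(s)^2}\right)'
= \frac{\zeta^{(3)}(s)}{\zeta(s)} - \frac{3\zeta''(s)\zeta'(s)}{\zeta(s)^2} + 2\left(\frac{\zeta'(s)}{\zeta(s)}\right)^3\\
&= \frac{\zeta^{(3)}(s)}{\zeta(s)} - 3\left(\frac{\zeta'(s)}{\zeta(s)}\right)' \cdot \frac{\zeta'(s)}{\zeta(s)} - \left(\frac{\zeta'(s)}{\zeta(s)}\right)^3,
\end{aligned} \tag{3.12}$$

we can see, comparing coefficients, that

$$\Lambda \cdot \log^2 = \mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda, \tag{3.13}$$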

as was stated by Bombieri in [Bom76].

Here the term $\mu * \log^3$ is of the same kind as the term $\mu_{\le U} * \log$ we have to estimate if we use Vaughan's identity, though the fact that there is no truncation at $U$ means that one of the error terms will get larger; it will be proportional to $x$, in fact, if we sum from 1 to $x$. The trivial upper bound on the sum of $\Lambda \cdot \log^2$ from 1 to $x$ is $x(\log x)^2$; thus, an error term of size $x$ is barely acceptable.

In general, when we have a double or triple sum, we are not very good at getting better than trivial bounds in ranges in which all but one of the variables are very small. This is the source of the large error term that appears in the sum involving $\mu * \log^3$, because we are no longer truncating as for $\mu_{\le U} * \log$. It will also be the source of other large error terms, including one that would be too large: namely, the one coming from the term $(\Lambda \cdot \log) * \Lambda$ when the variable of $\Lambda \cdot \log$ is large and that of $\Lambda$ is small. (The trivial bound on that range is $x \log x$.)

We avoid this problem by substituting the identity $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$ inside (3.13):

$$\Lambda \cdot \log^2 = \mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda. \tag{3.14}$$

(We could also have got this directly from the next-to-last line in (3.12).) When the variable of $\Lambda$ in $(\mu * \log^2) * \Lambda$ is small, the variable of $\mu * \log^2$ is large, and we can estimate the resulting term using the same techniques as for $\mu * \log^3$.
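The substitution leading to (3.14) can be written out step by step. The sketch below uses only the standard facts $-\zeta'/\zeta = \sum_n \Lambda(n) n^{-s}$ and $1/\zeta = \sum_n \mu(n) n^{-s}$, together with (3.13); it is meant as a worked check rather than as a new argument.

```latex
% (\zeta'/\zeta)' = \zeta''/\zeta - (\zeta'/\zeta)^2, read off coefficient by coefficient:
\begin{align*}
\Lambda\cdot\log &= \mu * \log^2 - \Lambda * \Lambda,\\
-3(\Lambda\cdot\log) * \Lambda &= -3(\mu * \log^2) * \Lambda + 3\,\Lambda * \Lambda * \Lambda,\\
\Lambda\cdot\log^2 &= \mu * \log^3 - 3(\mu * \log^2) * \Lambda
   + (3 - 1)\,\Lambda * \Lambda * \Lambda .
\end{align*}
```

The last line is (3.13) with the middle term replaced by the second line; the coefficient $3 - 1 = 2$ is the one appearing in (3.14).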

It is easy to see that we can, in fact, mix (3.13) and (3.14):

$$\Lambda \cdot \log^2 = \mu * \log^3 - 3\bigl((\Lambda \cdot \log) * \Lambda_{>V} + (\mu * \log^2) * \Lambda_{\le V}\bigr) + \bigl(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\bigr) \tag{3.15}$$

for $V$ arbitrary. Note here that there is some cancellation in the last term: writing

$$F_{3,V}(n) = \bigl(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\bigr)(n), \tag{3.16}$$

we can check easily that, for $n = p_1 p_2 p_3$ square-free with $V^3 < n$, we have

$$F_{3,V}(n) = \begin{cases}
-6 \log p_1 \log p_2 \log p_3 & \text{if all } p_i > V,\\
0 & \text{if } p_1 \le V < p_2 < p_3,\\
6 \log p_1 \log p_2 \log p_3 & \text{if } p_1 < p_2 \le V < p_3,\\
12 \log p_1 \log p_2 \log p_3 & \text{if all } p_i \le V.
\end{cases}$$

In contrast, for $n$ square-free, $-(\Lambda * \Lambda * \Lambda)(n)$ is $-6 \log p_1 \log p_2 \log p_3$ if $n$ is of the form $p_1 p_2 p_3$, and 0 otherwise.
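The values of $F_{3,V}$ on products of three distinct primes can be checked by brute force directly from (3.16). The sketch below is illustrative only: the function names, the choice $V = 10$ and the sample primes are assumptions, and the side condition $V^3 < n$ is not imposed, since it does not enter the evaluation of (3.16). For square-free $n = p_1 p_2 p_3$, the ordered factorisations $n = abc$ with $\Lambda(a)\Lambda(b)\Lambda(c) \ne 0$ are exactly the orderings of the three primes, so the coefficient of $\log p_1 \log p_2 \log p_3$ is $12 - 6k$, where $k$ is the number of $p_i$ exceeding $V$.

```python
from itertools import permutations
from math import log


def F3V_squarefree(primes, V):
    """F_{3,V}(p1*p2*p3) from (3.16), summed over ordered factorisations a*b*c."""
    total = 0.0
    for a, b, c in permutations(primes):
        weight = log(a) * log(b) * log(c)
        # -Lambda_{>V}(a) Lambda(b) Lambda(c) + 2 Lambda_{<=V}(a) Lambda(b) Lambda(c)
        total += -weight if a > V else 2 * weight
    return total


def coefficient(primes, V):
    """F_{3,V}(n) divided by log p1 * log p2 * log p3."""
    p1, p2, p3 = primes
    return F3V_squarefree(primes, V) / (log(p1) * log(p2) * log(p3))


V = 10
samples = {
    "all > V": (11, 13, 17),
    "two > V": (3, 11, 13),
    "one > V": (3, 5, 11),
    "all <= V": (3, 5, 7),
}
coeffs = {name: round(coefficient(p, V)) for name, p in samples.items()}
print(coeffs)
```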

We may find it useful to take aside two large terms that may need to be bounded trivially, namely, $\mu * \log^3_{\le u}$ and $(\Lambda \cdot \log)_{\le u} * \Lambda_{>V}$, where $u$ will be a small parameter. (We can let, for instance, $u = 3$.) We conclude that

$$\Lambda \cdot \log^2 = F_{I,1,u}(n) - 3F_{I,2,V,u}(n) - 3F_{II,V,u}(n) + F_{3,V}(n) + F_{0,V,u}(n), \tag{3.17}$$

where

$$\begin{aligned}
F_{I,1,u} &= \mu * \log^3_{>u},\\
F_{I,2,V,u} &= (\mu * \log^2) * \Lambda_{\le V},\\
F_{II,V,u}(n) &= (\Lambda \cdot \log)_{>u} * \Lambda_{>V},\\
F_{0,V,u}(n) &= \mu * \log^3_{\le u} - 3(\Lambda \cdot \log)_{\le u} * \Lambda_{>V},
\end{aligned}$$

and $F_{3,V}$ is as in (3.16).

In the bulk of the present work (in particular, in all steps that are part of the proof of Theorem 3.1.1 or the Main Theorem) we will use Vaughan's identity, rather than (3.17). This choice was made while the proof was still underway; it was due mainly to back-of-the-envelope estimates that showed that the error terms could be too large if (3.14) was used. Of course, this might have been the case with Vaughan's identity as well, but the fact that the parameters $U$, $V$ there have a large effect on the outcome meant that one could hope to improve on insufficient estimates in part by adjusting $U$ and $V$, without losing all previous work. (This is what was meant by the “flexibility” of Vaughan's identity.)

The question remains: can one prove ternary Goldbach using (3.17) rather than Vaughan's identity? This seems likely. If so, which proof would be more complicated? This is not clear.

There are large parts of the work that are essentially the same in both cases:

• estimates for sums involving $\mu_{\le U} * \log^k$ (“type I”),

• estimates for sums involving $\Lambda_{>u} * \Lambda_{>V}$, and the like (“type II”).

Trilinear sums, i.e., sums involving $\Lambda * \Lambda * \Lambda$, can be estimated much like bilinear sums, i.e., sums involving $\Lambda * \Lambda$.

There are also challenges that appear only for Vaughan's identity and others that appear only for (3.17). An example of a challenge that is successfully faced in the main proof, but does not appear if (3.17) is used, consists in bounding sums of type

$$\sum_{U < m \le x/W} \Bigl(\sum_{d > U,\ d \mid m} \mu(d)\Bigr)^2.$$

(In §5.1, we will be able to bound sums of this type by a constant times $x/W$.) Likewise, large tail terms that have to be estimated trivially seem unavoidable in (3.17). (The choice of a parameter $u > 1$ as above is meant to alleviate the problem.)


In the end, losing a factor of about $\log x/UV$ seems inevitable when one uses Vaughan's identity, but not when one uses (3.17). Another reason why a full treatment based on (3.17) would also be worthwhile is that it is a somewhat less familiar and arguably under-used identity, and deserves more exploration. With these comments, we close the discussion of (3.17); we will henceforth use Vaughan's identity.

      Chapter 4

      Type I sums

Here we must bound sums of the basic type

$$\sum_{m \le D} \mu(m) \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right)$$

and variations thereof. There are three main improvements in comparison to standard treatments:

1. The terms with $m$ divisible by $q$ get taken out and treated separately by analytic means. This all but eliminates what would otherwise be the main term.

2. The other terms get handled by improved estimates on trigonometric sums. For large $m$, the improvements have a substantial total effect: more than a constant factor is gained.

3. The “error” term $\delta/x = \alpha - a/q$ is used to our advantage. This happens both through the Poisson summation formula and through the use of two alternative approximations to the same number $\alpha$.

The fact that a continuous weight $\eta$ is used (“smoothing”) is a difference with respect to the classical literature ([Vin37] and what followed) but not with respect to more recent work (including [Tao14]); using smooth or continuous weights is an idea that has become commonplace in analytic number theory, even though it is not consistently applied. The improvements due to smoothing in type I are both relatively minor and essentially independent of the improvements due to (1) and (3). The use of a continuous weight combines nicely with (2), but the ideas given here would give qualitative improvements in the treatment of trigonometric sums even in the absence of smoothing.

4.1 Trigonometric sums

The following lemmas on trigonometric sums improve on the best Vinogradov-type lemmas in the literature. (By this, we mean results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]. See, in particular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is to switch between different types of approximation within the sum, rather than just choosing between bounding all terms either trivially (by $A$) or non-trivially (by $C/|\sin(\pi\alpha n)|^2$). There will also¹ be improvements in our applications stemming from the fact that Lemma 4.1.1 and Lemma 4.1.2 take quadratic ($|\sin(\pi\alpha n)|^2$) rather than linear ($|\sin(\pi\alpha n)|$) inputs. (These improved inputs come from the use of smoothing elsewhere.)

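Lemma 4.1.1. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Then, for any $A, C \ge 0$,

$$\sum_{y < n \le y + q} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(2A + \frac{6q^2}{\pi^2} C,\ 3A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.1}$$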

Proof. We start by letting $m_0 = \lfloor y \rfloor + \lfloor (q+1)/2 \rfloor$, $j = n - m_0$, so that $j$ ranges in the interval $(-q/2, q/2]$. We write

$$\alpha n = \frac{aj + c}{q} + \delta_1(j) + \delta_2 \mod 1,$$

where $|\delta_1(j)|$ and $|\delta_2|$ are both $\le 1/2q$; we can assume $\delta_2 \ge 0$. The variable $r = aj + c \bmod q$ occupies each residue class mod $q$ exactly once.

One option is to bound the terms corresponding to $r = 0, -1$ by $A$ each and all the other terms by $C/|\sin(\pi\alpha n)|^2$. (This can be seen as the simple case; it will take us about a page just because we should estimate all sums and all terms here with great care, as in [DR01], only more so.)

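The terms corresponding to $r = -k$ and $r = k - 1$ ($2 \le k \le q/2$) contribute at most

$$\frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac12 - q\delta_2\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac32 + q\delta_2\right)} \le \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac12\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac32\right)},$$

since $x \mapsto 1/(\sin x)^2$ is convex-up on $(0, \infty)$. Hence the terms with $r \ne 0, -1$ contribute at most

$$\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\sum_{2 \le r \le q/2} \frac{1}{\left(\sin\frac{\pi}{q}(r - 1/2)\right)^2} \le \frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx,$$

where we use again the convexity of $x \mapsto 1/(\sin x)^2$. (We can assume $q > 2$, as otherwise we have no terms other than $r = 0, -1$.) Now

$$\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx = \frac{q}{\pi}\int_{\pi/q}^{\pi/2} \frac{1}{(\sin u)^2}\,du = \frac{q}{\pi}\cot\frac{\pi}{q}.$$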

¹ This is a change with respect to the first version of the preprint [Helb]. The version of Lemma 4.1.1 there has, however, the advantage of being immediately comparable to results in the literature.


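Hence

$$\sum_{y < n \le y + q} \min\left(A, \frac{C}{(\sin \pi\alpha n)^2}\right) \le 2A + \frac{C}{\left(\sin\frac{\pi}{2q}\right)^2} + C \cdot \frac{2q}{\pi}\cot\frac{\pi}{q}.$$

Now, by [AS64, (4.3.68)] and [AS64, (4.3.70)], for $t \in (-\pi, \pi)$,

$$\begin{aligned}
\frac{t}{\sin t} &= 1 + \sum_{k \ge 0} a_{2k+1} t^{2k+2} = 1 + \frac{t^2}{6} + \dots,\\
t \cot t &= 1 - \sum_{k \ge 0} b_{2k+1} t^{2k+2} = 1 - \frac{t^2}{3} - \frac{t^4}{45} - \dots,
\end{aligned} \tag{4.2}$$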

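where $a_{2k+1} \ge 0$, $b_{2k+1} \ge 0$. Thus, for $t \in [0, t_0]$, $t_0 < \pi$,

$$\left(\frac{t}{\sin t}\right)^2 = 1 + \frac{t^2}{3} + c_0(t) t^4 \le 1 + \frac{t^2}{3} + c_0(t_0) t^4, \tag{4.3}$$

where

$$c_0(t) = \frac{1}{t^4}\left(\left(\frac{t}{\sin t}\right)^2 - \left(1 + \frac{t^2}{3}\right)\right),$$

which is an increasing function because $a_{2k+1} \ge 0$. For $t_0 = \pi/4$, $c_0(t_0) \le 0.074807$. Hence,

$$\begin{aligned}
\frac{t^2}{\sin^2 t} + t \cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right) t^4\right) + \left(\frac12 - \frac{2t^2}{3} - \frac{8t^4}{45}\right)\\
&= \frac32 - \frac{t^2}{3} + \left(c_0\left(\frac{\pi}{4}\right) - \frac{8}{45}\right) t^4 \le \frac32 - \frac{t^2}{3} \le \frac32
\end{aligned}$$

for $t \in [0, \pi/4]$. Therefore, the left side of (4.1) is at most

$$2A + C \cdot \left(\frac{2q}{\pi}\right)^2 \cdot \frac32 = 2A + \frac{6}{\pi^2} C q^2.$$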
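The two numerical facts used in this step, namely the bound on $c_0(\pi/4)$ and the inequality $t^2/\sin^2 t + t\cot 2t \le 3/2$ on $(0, \pi/4]$, are easy to test in floating point. The following is a rough sanity check on a grid, not a rigorous verification; the grid size and the function names are arbitrary choices made for illustration.

```python
from math import pi, sin, tan


def c0(t):
    """c0(t) = ((t / sin t)^2 - (1 + t^2/3)) / t^4, as defined after (4.3)."""
    return ((t / sin(t)) ** 2 - (1 + t * t / 3)) / t ** 4


def combined(t):
    """t^2 / sin^2 t + t * cot(2t), the quantity bounded by 3/2 on (0, pi/4]."""
    return (t / sin(t)) ** 2 + t / tan(2 * t)


grid = [pi / 4 * k / 1000 for k in range(1, 1001)]
c0_quarter = c0(pi / 4)
largest = max(combined(t) for t in grid)
# Monotonicity of c0 is tested away from t = 0, where the formula for c0
# suffers from cancellation in floating point.
increasing = all(c0(s) <= c0(t) for s, t in zip(grid[100:], grid[101:]))
print(c0_quarter, largest, increasing)
```

With $t = \pi/2q$, the quantity `combined(t)` multiplied by $(2q/\pi)^2$ is exactly $1/\sin^2(\pi/2q) + (2q/\pi)\cot(\pi/q)$, which is how the bound $3/2$ enters the estimate for (4.1).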

The following is an alternative approach; it yields the other estimate in (4.1). We bound the terms corresponding to $r = 0$, $r = -1$, $r = 1$ by $A$ each. We let $r = \pm r'$ for $r'$ ranging from 2 to $q/2$. We obtain that the sum is at most

$$\begin{aligned}
3A &+ \sum_{2 \le r' \le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 - q\delta_2\right)\right)^2}\right)\\
&+ \sum_{2 \le r' \le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac12 + q\delta_2\right)\right)^2}\right).
\end{aligned} \tag{4.4}$$


We bound a term $\min(A, C/\sin((\pi/q)(r' - 1/2 \pm q\delta_2))^2)$ by $A$ if and only if $C/\sin((\pi/q)(r' - 1 \pm q\delta_2))^2 \ge A$. (In other words, we are choosing which of the two bounds $A$, $C/|\sin(\pi\alpha n)|^2$ to use on a case-by-case basis, i.e., for each $n$, instead of making a single choice for all $n$ in one go. This is hardly anything deep, but it does result in a marked improvement with respect to the literature, and would give an improvement even if we were given a bound $B/|\sin(\pi\alpha n)|$ instead of a bound $C/|\sin(\pi\alpha n)|^2$ as input.) The number of such terms is

$$\le \max\left(0, \left\lfloor (q/\pi)\arcsin\left(\sqrt{C/A}\right) \mp q\delta_2 \right\rfloor\right),$$

and thus at most $(2q/\pi)\arcsin(\sqrt{C/A})$ in total. (Recall that $q\delta_2 \le 1/2$.) Each other term gets bounded by the integral of $C/\sin^2(\pi\alpha/q)$ from $r' - 1 \pm q\delta_2$ ($\ge (q/\pi)\arcsin(\sqrt{C/A})$) to $r' \pm q\delta_2$, by convexity. Thus (4.4) is at most

$$\begin{aligned}
3A &+ \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + 2\int_{\frac{q}{\pi}\arcsin\sqrt{C/A}}^{q/2} \frac{C}{\sin^2\frac{\pi t}{q}}\,dt\\
&\le 3A + \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1}.
\end{aligned}$$

We can easily show (taking derivatives) that $\arcsin x + x\sqrt{1 - x^2} \le 2x$ for $0 \le x \le 1$. Setting $x = \sqrt{C/A}$, we see that this implies that

$$3A + \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1} \le 3A + \frac{4q}{\pi}\sqrt{AC}.$$

(If $C/A > 1$, then $3A + (4q/\pi)\sqrt{AC}$ is greater than $Aq$, which is an obvious upper bound for the left side of (4.1).)
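As a sanity check, the inequality (4.1) can be tested numerically on a small grid of parameters. The sketch below is only an illustration: the parameter grid and the helper names `lhs_41` and `rhs_41` are arbitrary assumptions, and a finite test of this kind is of course no substitute for the proof. It reports the largest observed ratio of the left side to the right side, which should not exceed 1.

```python
from math import gcd, pi, sin, sqrt, floor


def lhs_41(alpha, y, q, A, C):
    """Sum of min(A, C / sin^2(pi*alpha*n)) over y < n <= y + q."""
    total = 0.0
    for n in range(floor(y) + 1, floor(y + q) + 1):
        s2 = sin(pi * alpha * n) ** 2
        total += A if C >= A * s2 else C / s2  # never divides by zero
    return total


def rhs_41(q, A, C):
    """Right side of (4.1): the smaller of the two bounds."""
    return min(2 * A + 6 * q * q * C / pi ** 2, 3 * A + 4 * q * sqrt(A * C) / pi)


worst = 0.0
for q in (1, 2, 3, 7, 10, 12):
    for a in range(q):
        if gcd(a, q) != 1:
            continue
        for Q in (q, 3 * q, 50 * q):
            for beta in (-1.0, -0.3, 0.0, 0.5, 1.0):
                alpha = a / q + beta / (q * Q)
                for y in (0, 5, 17.5, 100):
                    for A, C in ((1.0, 0.01), (1.0, 1.0), (5.0, 0.2), (0.3, 2.0)):
                        ratio = lhs_41(alpha, y, q, A, C) / rhs_41(q, A, C)
                        worst = max(worst, ratio)
print(worst)
```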

Now we will see that, if we take out terms with $n$ divisible by $q$ and $n$ is not too large, then we can give a bound that does not involve a constant term $A$ at all. (We are referring to the bound $(20/3\pi^2)Cq^2$ below, of course; $2A + (4q/\pi)\sqrt{AC}$ does have a constant term $2A$, which is just smaller than the constant term $3A$ in the corresponding bound in (4.1).)

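Lemma 4.1.2. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $A, C \ge 0$,

$$\sum_{\substack{y_1 < n \le y_2\\ q \nmid n}} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(\frac{20}{3\pi^2} C q^2,\ 2A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.5}$$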

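Proof. Clearly, $\alpha n$ equals $an/q + (n/Q)\beta/q$; since $y_2 \le Q/2$, this means that $|\alpha n - an/q| \le 1/2q$ for $n \le y_2$; moreover, again for $n \le y_2$, the sign of $\alpha n - an/q$ remains constant. Hence the left side of (4.5) is at most

$$\sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}(r - 1/2)\right)^2}\right) + \sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q} r\right)^2}\right).$$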


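Proceeding as in the proof of Lemma 4.1.1, we obtain a bound of at most

$$C\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right)$$

for $q \ge 2$. (If $q = 1$, then the left side of (4.5) is trivially zero.) Now, by (4.2),

$$\begin{aligned}
\frac{t^2}{(\sin t)^2} + \frac{t}{2}\cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right) t^4\right) + \frac14\left(1 - \frac{4t^2}{3} - \frac{16t^4}{45}\right)\\
&\le \frac54 + \left(c_0\left(\frac{\pi}{4}\right) - \frac{4}{45}\right) t^4 \le \frac54
\end{aligned}$$

for $t \in [0, \pi/4]$, and

$$\begin{aligned}
\frac{t^2}{(\sin t)^2} + t\cot\frac{3t}{2} &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{2}\right) t^4\right) + \frac23\left(1 - \frac{3t^2}{4} - \frac{81t^4}{2^4 \cdot 45}\right)\\
&\le \frac53 + \left(-\frac16 + \left(c_0\left(\frac{\pi}{2}\right) - \frac{27}{360}\right)\left(\frac{\pi}{2}\right)^2\right) t^2 \le \frac53
\end{aligned}$$

for $t \in [0, \pi/2]$. Hence,

$$\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right) \le \left(\frac{2q}{\pi}\right)^2 \cdot \frac54 + \left(\frac{q}{\pi}\right)^2 \cdot \frac53 \le \frac{20}{3\pi^2} q^2.$$

Alternatively, we can follow the second approach in the proof of Lemma 4.1.1, and obtain an upper bound of $2A + (4q/\pi)\sqrt{AC}$.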

The following bound will be useful when the constant $A$ in an application of Lemma 4.1.2 would be too large. (This tends to happen for $n$ small.)

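Lemma 4.1.3. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $B, C \ge 0$,

$$\sum_{\substack{y_1 < n \le y_2\\ q \nmid n}} \min\left(\frac{B}{|\sin(\pi\alpha n)|}, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le 2B\frac{q}{\pi}\max\left(2, \log\frac{C e^3 q}{B\pi}\right). \tag{4.6}$$

The upper bound $\le (2Bq/\pi)\log(2e^2q/\pi)$ is also valid.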

Proof. As in the proof of Lemma 4.1.2, we can bound the left side of (4.6) by

$$2\sum_{r=1}^{q/2} \min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right).$$


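Assume $B\sin(\pi/q) \le C \le B$. By the convexity of $1/\sin(t)$ and $1/\sin(t)^2$ for $t \in (0, \pi/2]$,

$$\begin{aligned}
\sum_{r=1}^{q/2} \min&\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac12\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac12\right)}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \int_1^{\frac{q}{\pi}\arcsin\frac{C}{B}} \frac{B}{\sin\frac{\pi}{q}t}\,dt + \int_{\frac{q}{\pi}\arcsin\frac{C}{B}}^{q/2} \frac{C}{\sin^2\frac{\pi}{q}t}\,dt\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\tan\left(\frac12\arcsin\frac{C}{B}\right) - \log\tan\frac{\pi}{2q}\right) + C\cot\arcsin\frac{C}{B}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right).
\end{aligned}$$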

Now, for all $t \in (0, \pi/2)$,

$$\frac{2}{\sin t} + \frac{1}{t}\log\cot t < \frac{1}{t}\log\left(\frac{e^2}{t}\right);$$

we can verify this by comparing series. Thus

$$\frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}B\log\cot\frac{\pi}{2q} \le B\frac{q}{\pi}\log\frac{2e^2q}{\pi}$$

for $q \ge 2$. (If $q = 1$, the sum on the left of (4.6) is empty, and so the bound we are trying to prove is trivial.) We also have

$$t\log\left(t - \sqrt{t^2 - 1}\right) + \sqrt{t^2 - 1} < -t\log 2t + t \tag{4.7}$$

for $t \ge 1$ (as this is equivalent to $\log(2t^2(1 - \sqrt{1 - t^{-2}})) < 1 - \sqrt{1 - t^{-2}}$, which we check easily after changing variables to $\delta = 1 - \sqrt{1 - t^{-2}}$). Hence

$$\begin{aligned}
\frac{B}{\sin\frac{\pi}{2q}} &+ \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right)\\
&\le B\frac{q}{\pi}\log\frac{2e^2q}{\pi} + \frac{q}{\pi}\left(B - B\log\frac{2B}{C}\right) \le B\frac{q}{\pi}\log\frac{Ce^3q}{B\pi}
\end{aligned}$$

for $q \ge 2$.

Given any $C$, we can apply the above with $C = B$ instead, as, for any $t > 0$, $\min(B/t, C/t^2) \le B/t \le \min(B/t, B/t^2)$. (We refrain from applying (4.7) so as to avoid worsening a constant.) If $C < B\sin\pi/q$ (or even if $C < (\pi/q)B$), we relax the input to $C = B\sin\pi/q$ and go through the above.
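The last step of the chain of inequalities before (4.6) is pure bookkeeping with logarithms. Written out (a worked check, with no assumptions beyond the quantities $B$, $C$, $q$ of the lemma):

```latex
\begin{align*}
B\frac{q}{\pi}\log\frac{2e^2 q}{\pi} + \frac{q}{\pi}\Bigl(B - B\log\frac{2B}{C}\Bigr)
 &= B\frac{q}{\pi}\Bigl(\log\frac{2e^2 q}{\pi} + \log e - \log\frac{2B}{C}\Bigr)\\
 &= B\frac{q}{\pi}\log\Bigl(\frac{2e^2 q}{\pi}\cdot e\cdot\frac{C}{2B}\Bigr)
  = B\frac{q}{\pi}\log\frac{C e^3 q}{B\pi}.
\end{align*}
```

The factor 2 in front of the sum over $r$ in the proof then accounts for the factor $2B(q/\pi)$ in (4.6).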

4.2 Type I estimates

Let us give our first main type I estimate.² One of the main innovations is the manner in which the “main term” ($m$ divisible by $q$) is separated; we are able to keep error terms small thanks to the particular way in which we switch between two different approximations.

² The current version of Lemma 4.2.1 is an improvement over that included in the first version of the preprint [Helb].

(These are not necessarily successive approximations in the sense of continued fractions; we do not want to assume that the approximation $a/q$ we are given arises from a continued fraction, and at any rate we need more control on the denominator $q'$ of the new approximation $a'/q'$ than continued fractions would furnish.)

The following lemma is a theme, so to speak, to which several variations will be given. Later, in practice, we will always use one of the variations, rather than the original lemma itself. This is so just because, even though (4.8) is the basic type of sum we treat in type I, the sums that we will have to estimate in practice will always present some minor additional complication. Proving the lemma we are about to give in full will give us a chance to see all the main ideas at work, leaving complications for later.

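Lemma 4.2.1. Let $\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$, the absolute value of

$$\sum_{m \le D} \mu(m) \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right) \tag{4.8}$$

is at most

$$\frac{x}{q}\min\left(1, \frac{c_0}{(2\pi\delta)^2}\right)\left|\sum_{\substack{m \le M/q\\ (m,q) = 1}} \frac{\mu(m)}{m}\right| + O^*\left(c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right) \tag{4.9}$$

plus

$$\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi} D + 3c_1\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}\, q\log^+\frac{D}{q/2}\\
&\quad + \frac{|\eta'|_1}{\pi}\, q \cdot \max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{c_2} + \frac{55c_0c_2}{12\pi^2}\right) q,
\end{aligned} \tag{4.10}$$

where $c_1 = 1 + |\eta'|_1/(2x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.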

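In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.9) plus

$$\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1 + \epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1,\ 2D\right)\left(\varpi_\epsilon + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right)\\
&\quad + 3c_1\left(2 + \frac{(1 + \epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{6\pi^2}\, q,
\end{aligned} \tag{4.11}$$

for $\epsilon \in (0, 1]$ arbitrary, where $\varpi_\epsilon = \sqrt{3 + 2\epsilon} + ((1 + \sqrt{13/3})/4 - 1)/(2(1 + \epsilon))$.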


In (4.9), $\min(1, c_0/(2\pi\delta)^2)$ always equals 1 when $|\delta| \le 1/2c_2$ (since $(3/5)(1 + \sqrt{13/3}) > 1$).

Proof. Let $Q = \lfloor x/|\delta q| \rfloor$. Then $\alpha = a/q + O^*(1/qQ)$ and $q \le Q$. (If $\delta = 0$, we let $Q = \infty$ and ignore the rest of the paragraph, since then we will never need $Q'$ or the alternative approximation $a'/q'$.) Let $Q' = \lceil (1 + \epsilon)Q \rceil \ge Q + 1$. Then $\alpha$ is not $a/q + O^*(1/qQ')$, and so there must be a different approximation $a'/q'$, $(a', q') = 1$, $q' \le Q'$, such that $\alpha = a'/q' + O^*(1/q'Q')$ (since such an approximation always exists). Obviously, $|a/q - a'/q'| \ge 1/qq'$, yet, at the same time, $|a/q - a'/q'| \le 1/qQ + 1/q'Q' \le 1/qQ + 1/((1 + \epsilon)q'Q)$. Hence $q'/Q + q/((1 + \epsilon)Q) \ge 1$, and so $q' \ge Q - q/(1 + \epsilon) \ge (\epsilon/(1 + \epsilon))Q$. (Note also that $(\epsilon/(1 + \epsilon))Q \ge (2|\delta q|/x) \cdot \lfloor x/\delta q \rfloor > 1$, and so $q' \ge 2$.)

Lemma 4.1.2 will enable us to treat separately the contribution from terms with $m$ divisible by $q$ and $m$ not divisible by $q$, provided that $m \le Q/2$. Let $M = \min(Q/2, D)$. We start by considering all terms with $m \le M$ divisible by $q$. Then $e(\alpha m n)$ equals $e((\delta m/x)n)$. By Poisson summation,

$$\sum_n e(\alpha m n)\eta(mn/x) = \sum_n \hat{f}(n),$$

where $f(u) = e((\delta m/x)u)\,\eta((m/x)u)$. Now

$$\hat{f}(n) = \int e(-un) f(u)\,du = \frac{x}{m}\int e\left(\left(\delta - \frac{xn}{m}\right)u\right)\eta(u)\,du = \frac{x}{m}\,\hat{\eta}\left(\frac{x}{m}n - \delta\right).$$

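By assumption, $m \le M \le Q/2 \le x/2|\delta q|$, and so $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus, by (2.1) (with $k = 2$),

$$\begin{aligned}
\sum_n \hat{f}(n) &= \frac{x}{m}\left(\hat{\eta}(-\delta) + \sum_{n \ne 0}\hat{\eta}(nx/m - \delta)\right)\\
&= \frac{x}{m}\left(\hat{\eta}(-\delta) + O^*\left(\sum_{n \ne 0}\frac{1}{\left(2\pi\left(\frac{nx}{m} - \delta\right)\right)^2} \cdot \left|\widehat{\eta''}\right|_\infty\right)\right)\\
&= \frac{x}{m}\hat{\eta}(-\delta) + \frac{m}{x}\frac{c_0}{(2\pi)^2}\, O^*\left(\max_{|r| \le \frac12}\sum_{n \ne 0}\frac{1}{(n - r)^2}\right).
\end{aligned} \tag{4.12}$$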

Since $x \mapsto 1/x^2$ is convex on $\mathbb{R}^+$,

$$\max_{|r| \le \frac12}\sum_{n \ne 0}\frac{1}{(n - r)^2} = \sum_{n \ne 0}\frac{1}{\left(n - \frac12\right)^2} = \pi^2 - 4.$$
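The value $\pi^2 - 4$ follows from the classical partial-fraction expansion of $\pi^2/\sin^2(\pi r)$; the short derivation below is a worked check using that standard formula.

```latex
\begin{align*}
\sum_{n\in\mathbb{Z}}\frac{1}{(n-r)^2} &= \frac{\pi^2}{\sin^2(\pi r)}
   \qquad (r\notin\mathbb{Z}),\\
\sum_{n\ne 0}\frac{1}{\bigl(n-\tfrac12\bigr)^2}
  &= \frac{\pi^2}{\sin^2(\pi/2)} - \frac{1}{\bigl(0-\tfrac12\bigr)^2} = \pi^2 - 4,\\
\frac{c_0}{(2\pi)^2}\,(\pi^2-4) &= c_0\Bigl(\frac14 - \frac{1}{\pi^2}\Bigr).
\end{align*}
```

The last line is the constant that appears in the error term of (4.9).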


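Therefore, the sum of all terms with $m \le M$ and $q \mid m$ is

$$\begin{aligned}
\sum_{\substack{m \le M\\ q \mid m}} &\mu(m)\frac{x}{m}\hat{\eta}(-\delta) + O^*\left(\sum_{\substack{m \le M\\ q \mid m}} \frac{m}{x}\frac{c_0}{(2\pi)^2}(\pi^2 - 4)\right)\\
&= \frac{x\mu(q)}{q} \cdot \hat{\eta}(-\delta) \cdot \sum_{\substack{m \le M/q\\ (m,q) = 1}} \frac{\mu(m)}{m} + O^*\left(\mu(q)^2 c_0\left(\frac14 - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right).
\end{aligned}$$

We will bound $|\hat{\eta}(-\delta)|$ by (2.1).

As we have just seen, estimating the contribution of the terms with $m$ divisible by $q$ and not too large ($m \le M$) involves isolating a main term, estimating it carefully (with cancellation) and then bounding the remaining error terms.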

We will now bound the contribution of all other $m$, that is, $m$ not divisible by $q$ and $m$ larger than $M$. Cancellation will now be used only within the inner sum; that is, we will bound each inner sum

$$T_m(\alpha) = \sum_n e(\alpha m n)\, \eta\left(\frac{mn}{x}\right),$$

and then we will carefully consider how to bound sums of $|T_m(\alpha)|$ over $m$ efficiently. By (2.2) and Lemma 2.3.1,

$$|T_m(\alpha)| \le \min\left(\frac{x}{m} + \frac12|\eta'|_1,\ \frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|},\ \frac{m}{x}\frac{c_0}{4}\frac{1}{(\sin\pi m\alpha)^2}\right). \tag{4.13}$$

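For any $y_2 > y_1 > 0$ with $y_2 - y_1 \le q$ and $y_2 \le Q/2$, (4.13) gives us that

$$\sum_{\substack{y_1 < m \le y_2\\ q \nmid m}} |T_m(\alpha)| \le \sum_{\substack{y_1 < m \le y_2\\ q \nmid m}} \min\left(A, \frac{C}{(\sin\pi m\alpha)^2}\right) \tag{4.14}$$

for $A = (x/y_1)(1 + |\eta'|_1/(2(x/y_1)))$ and $C = (c_0/4)(y_2/x)$. We must now estimate the sum

$$\sum_{\substack{m \le M\\ q \nmid m}} |T_m(\alpha)| + \sum_{\frac{Q}{2} < m \le D} |T_m(\alpha)|. \tag{4.15}$$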

To bound the terms with $m \le M$, we can use Lemma 4.1.2. The question is then which one is smaller: the first or the second bound given by Lemma 4.1.2? A brief calculation gives that the second bound is smaller (and hence preferable) exactly when $\sqrt{C/A} > (3\pi/10q)(1 + \sqrt{13/3})$. Since $\sqrt{C/A} \sim (\sqrt{c_0}/2)\, m/x$, this means that it is sensible to prefer the second bound in Lemma 4.1.2 when $m > c_2x/q$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$.
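The “brief calculation” amounts to solving a quadratic: writing $u = (q/\pi)\sqrt{C/A}$, the two bounds of Lemma 4.1.2 coincide when $(20/3)u^2 = 2 + 4u$, whose positive root is $u = (3/10)(1 + \sqrt{13/3})$. The sketch below (function names and test values are illustrative assumptions) confirms numerically that the second bound is the smaller one just above this threshold and the larger one just below it.

```python
from math import pi, sqrt


def first_bound(q, A, C):
    """(20 / (3 pi^2)) * C * q^2, the A-free bound in Lemma 4.1.2."""
    return 20 * C * q * q / (3 * pi ** 2)


def second_bound(q, A, C):
    """2A + (4q / pi) * sqrt(A*C), the other bound in Lemma 4.1.2."""
    return 2 * A + 4 * q * sqrt(A * C) / pi


def threshold(q):
    """Value of sqrt(C/A) at which the two bounds coincide."""
    return (3 * pi / (10 * q)) * (1 + sqrt(13 / 3))


checks = []
for q in (2, 5, 30, 1000):
    A = 1.7  # arbitrary positive test value
    s0 = threshold(q)
    below = A * (0.99 * s0) ** 2  # C with sqrt(C/A) just below the threshold
    above = A * (1.01 * s0) ** 2  # C with sqrt(C/A) just above the threshold
    checks.append((
        second_bound(q, A, below) > first_bound(q, A, below),
        second_bound(q, A, above) < first_bound(q, A, above),
    ))
print(checks)
```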

It thus makes sense to ask: does $Q/2 \le c_2x/q$ (so that $m \le M$ implies $m \le c_2x/q$)? This question divides our work into two basic cases.


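Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$. Then $Q/2 \le c_2x/q$; this will induce us to bound the first sum in (4.15) by the first bound in Lemma 4.1.2.

Recall that $M = \min(Q/2, D)$, and so $M \le c_2x/q$. By (4.14) and Lemma 4.1.2,

$$\begin{aligned}
\sum_{\substack{1 \le m \le M\\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\ \sum_{\substack{jq < m \le \min((j+1)q, M)\\ q \nmid m}} \min\left(\frac{x}{jq + 1} + \frac{|\eta'|_1}{2},\ \frac{\frac{c_0}{4}\frac{(j+1)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0 \le j \le \frac{M}{q}} (j + 1) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x} \cdot \left(\frac12\frac{M^2}{q^2} + \frac32\frac{c_2x}{q^2} + 1\right)\\
&\le \frac{5c_0c_2}{6\pi^2} M + \frac{5c_0q}{3\pi^2}\left(\frac32 c_2 + \frac{q^2}{x}\right) \le \frac{5c_0c_2}{6\pi^2} M + \frac{35c_0c_2}{6\pi^2}\, q,
\end{aligned} \tag{4.16}$$

where, to bound the smaller terms, we are using the inequality $Q/2 \le c_2x/q$, and where we are also using the observation that, since $|\delta/x| \le 1/qQ_0$, the assumption $|\delta| \ge 1/2c_2$ implies that $q \le 2c_2x/Q_0$; moreover, since $q \le Q_0$, this gives us that $q^2 \le 2c_2x$. In the main term, we are bounding $qM^2/x$ from above by $M \cdot qQ/2x \le M/2\delta \le c_2M$.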

If $D \le (Q + 1)/2$, then $M \ge \lfloor D \rfloor$, and so (4.16) is all we need: the second sum in (4.15) is empty. Assume from now on that $D > (Q + 1)/2$. The first sum in (4.15) is then bounded by (4.16) (with $M = Q/2$). To bound the second sum in (4.15), we will use the approximation $a'/q'$ instead of $a/q$. The motivation is the following: if we used the approximation $a/q$ even for $m > Q/2$, the contribution of the terms with $q \mid m$ would be too large. When we use $a'/q'$, the contribution of the terms with $q' \mid m$ (or $m \equiv \pm 1 \bmod q'$) is very small: only a fraction $1/q'$ (tiny, since $q'$ is large) of all terms are like that, and their individual contribution is always small, precisely because $m > Q/2$.

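By (4.14) (without the restriction $q \nmid m$ on either side) and Lemma 4.1.1,

$$\begin{aligned}
\sum_{Q/2 < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\ \sum_{jq' + \frac{Q}{2} < m \le \min((j+1)q' + Q/2,\ D)} |T_m(\alpha)|\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\cdot\frac{x}{jq' + (Q+1)/2}\cdot\frac{(j+1)q' + Q/2}{x}}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q'}{jq' + (Q+1)/2}\right)}\right),
\end{aligned}$$

where we recall that $c_1 = 1 + |\eta'|_1/(2x/D)$. Since $q' \ge (\epsilon/(1 + \epsilon))Q$,

$$\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\frac{x}{jq' + \frac{Q+1}{2}} \le \frac{x}{Q/2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{1}{t}\,dt \le \frac{2x}{Q} + \frac{(1 + \epsilon)x}{\epsilon Q}\log^+\frac{D}{\frac{Q+1}{2}}. \tag{4.17}$$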


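Recall now that $q' \le (1 + \epsilon)Q + 1 \le (1 + \epsilon)(Q + 1)$. Therefore,

$$\begin{aligned}
q'\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} &\le q'\sqrt{1 + \frac{(1 + \epsilon)Q + 1}{(Q+1)/2}} + \int_{\frac{Q+1}{2}}^{D}\sqrt{1 + \frac{q'}{t}}\,dt\\
&\le q'\sqrt{3 + 2\epsilon} + \left(D - \frac{Q+1}{2}\right) + \frac{q'}{2}\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned} \tag{4.18}$$

We conclude that $\sum_{Q/2 < m \le D} |T_m(\alpha)|$ is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + \left((1 + \epsilon)\sqrt{3 + 2\epsilon} - \frac12\right)(Q + 1) + \frac{(1 + \epsilon)Q + 1}{2}\log^+\frac{D}{\frac{Q+1}{2}}\right)\\
&\quad + 3c_1\left(2 + \frac{(1 + \epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q}.
\end{aligned} \tag{4.19}$$

We sum this to (4.16) (with $M = Q/2$), and obtain that (4.15) is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1 + \epsilon)(Q + 1)\left(\varpi_\epsilon + \frac12\log^+\frac{D}{\frac{Q+1}{2}}\right)\right)\\
&\quad + 3c_1\left(2 + \frac{(1 + \epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q} + \frac{35c_0c_2}{6\pi^2}\, q,
\end{aligned} \tag{4.20}$$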

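where we are bounding

$$\frac{5c_0c_2}{6\pi^2} = \frac{5c_0}{6\pi^2}\cdot\frac{3\pi}{5\sqrt{c_0}}\left(1 + \sqrt{\frac{13}{3}}\right) = \frac{\sqrt{c_0}}{2\pi}\left(1 + \sqrt{\frac{13}{3}}\right) \le \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac14\left(1 + \sqrt{\frac{13}{3}}\right) \tag{4.21}$$

and defining

$$\varpi_\epsilon = \sqrt{3 + 2\epsilon} + \left(\frac14\left(1 + \sqrt{\frac{13}{3}}\right) - 1\right)\frac{1}{2(1 + \epsilon)}. \tag{4.22}$$

(Note that $\varpi_\epsilon < \sqrt{3}$ for $\epsilon < 0.1741$.) A quick check against (4.16) shows that (4.20) is valid also when $D \le Q/2$, even when $Q + 1$ is replaced by $\min(Q + 1, 2D)$. We bound $Q$ from above by $x/|\delta|q$ and $\log^+ D/((Q + 1)/2)$ by $\log^+ 2D/(x/|\delta|q + 1)$, and obtain the result.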
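The remark about $\varpi_\epsilon$ is easy to confirm numerically from the definition (4.22). The sketch below is a floating-point check on a grid, not a proof; the function name `varpi` and the sample values of $\epsilon$ are assumptions made for illustration.

```python
from math import sqrt


def varpi(eps):
    """varpi_eps as defined in (4.22)."""
    return sqrt(3 + 2 * eps) + ((1 + sqrt(13 / 3)) / 4 - 1) / (2 * (1 + eps))


values = {eps: varpi(eps) for eps in (0.01, 0.1, 0.1741, 0.18, 0.5, 1.0)}
# varpi is increasing in eps, so checking a fine grid up to 0.1741 is enough
# to see that the stated threshold is consistent.
below_sqrt3 = all(varpi(k / 10000) < sqrt(3) for k in range(1, 1742))
print(values, below_sqrt3)
```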

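Case (b). $|\delta|$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. Then $\min(c_2x/q, D) \le Q/2$. We start by bounding the first $q/2$ terms in (4.15) by (4.13) and Lemma 4.1.3:

$$\begin{aligned}
\sum_{m \le q/2} |T_m(\alpha)| &\le \sum_{m \le q/2} \min\left(\frac{\frac12|\eta'|_1}{|\sin(\pi m\alpha)|},\ \frac{c_0q/8x}{|\sin(\pi m\alpha)|^2}\right)\\
&\le \frac{|\eta'|_1}{\pi}\, q\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right).
\end{aligned} \tag{4.23}$$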


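If $q^2 < 2c_2x$, we estimate the terms with $q/2 < m \le c_2x/q$ by Lemma 4.1.2, which is applicable because $\min(c_2x/q, D) < Q/2$:

$$\begin{aligned}
\sum_{\substack{\frac{q}{2} < m \le D'\\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=1}^{\infty}\ \sum_{\substack{(j - \frac12)q < m \le (j + \frac12)q\\ m \le \min(\frac{c_2x}{q}, D),\ q \nmid m}} \min\left(\frac{x}{\left(j - \frac12\right)q} + \frac{|\eta'|_1}{2},\ \frac{\frac{c_0}{4}\frac{(j + 1/2)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1 \le j \le \frac{D'}{q} + \frac12}\left(j + \frac12\right) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\left(\frac{c_2x}{2q^2}\frac{D'}{q} + \frac32\left(\frac{c_2x}{q^2}\right) + \frac58\right)\\
&\le \frac{5c_0}{6\pi^2}\left(c_2D' + 3c_2q + \frac54\frac{q^3}{x}\right) \le \frac{5c_0c_2}{6\pi^2}\left(D' + \frac{11}{2}q\right),
\end{aligned} \tag{4.24}$$

where we write $D' = \min(c_2x/q, D)$. If $c_2x/q \ge D$, we stop here. Assume that $c_2x/q < D$. Let $R = \max(c_2x/q, q/2)$. The terms we have already estimated are precisely those with $m \le R$. We bound the terms $R < m \le D$ by the second bound in Lemma 4.1.1:

$$\begin{aligned}
\sum_{R < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\ \sum_{\substack{m > jq + R\\ m \le \min((j+1)q + R,\ D)}} \min\left(\frac{c_1x}{jq + R},\ \frac{\frac{c_0}{4}\frac{(j+1)q + R}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac1q(D - R)\right\rfloor}\left(\frac{3c_1x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right).
\end{aligned} \tag{4.25}$$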

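(Note there is no need to use two successive approximations $a/q$, $a'/q'$ as in case (a). We are also including all terms with $m$ divisible by $q$, as we may, since $|T_m(\alpha)|$ is non-negative.) Now, much as before,
\[
\sum_{j=0}^{\lfloor \frac1q(D-R)\rfloor} \frac{x}{jq+R} \le \frac{x}{R} + \frac{x}{q}\int_R^D \frac{1}{t}\,dt \le \min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{x}{q}\log^+\frac{D}{c_2x/q},
\tag{4.26}
\]
and
\[
\begin{aligned}
\sum_{j=0}^{\lfloor \frac1q(D-R)\rfloor} \sqrt{1+\frac{q}{jq+R}} &\le \sqrt{1+\frac{q}{R}} + \frac{1}{q}\int_R^D \sqrt{1+\frac{q}{t}}\,dt \\
&\le \sqrt{3} + \frac{D-R}{q} + \frac12\log^+\frac{D}{q/2}.
\end{aligned}
\tag{4.27}
\]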
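The comparison of a sum with an integral in (4.27) is easy to test numerically. The sketch below is only a sanity check, assuming the bound reads $\sum_{j=0}^{\lfloor (D-R)/q\rfloor}\sqrt{1+q/(jq+R)} \le \sqrt{3} + (D-R)/q + \tfrac12\log^+(D/(q/2))$ for $R \ge q/2$ and $D \ge R$; the function names are illustrative.

```python
import math


def lhs_427(q: float, R: float, D: float) -> float:
    """Left side: sum of sqrt(1 + q/(jq + R)) over 0 <= j <= floor((D - R)/q)."""
    J = math.floor((D - R) / q)
    return sum(math.sqrt(1 + q / (j * q + R)) for j in range(J + 1))


def rhs_427(q: float, R: float, D: float) -> float:
    """Right side: sqrt(3) + (D - R)/q + (1/2) log^+(D/(q/2))."""
    log_plus = max(math.log(D / (q / 2)), 0.0)
    return math.sqrt(3) + (D - R) / q + 0.5 * log_plus


def check_427(samples) -> bool:
    """True if the inequality holds for every (q, R, D) in samples."""
    return all(lhs_427(q, R, D) <= rhs_427(q, R, D) + 1e-12 for q, R, D in samples)


if __name__ == "__main__":
    grid = [(q, r * q, d * r * q)
            for q in (1, 3, 10, 97)
            for r in (0.5, 1.0, 7.3)       # R = r*q, so R >= q/2
            for d in (1.0, 1.5, 4.0, 250.0)]  # D = d*R, so D >= R
    print(check_427(grid))
```

The term $j = 0$ is at most $\sqrt{3}$ because $R \ge q/2$, and the remaining terms are bounded by the integral since the summand is decreasing; the grid above merely confirms this on a few sample points.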

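We sum with (4.23) and (4.24), and we obtain that (4.15) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt{3}\,q + D + \frac{q}{2}\log^+\frac{D}{q/2}\right) + \left(3c_1\log^+\frac{D}{c_2x/q}\right)\frac{x}{q} \\
&\quad + 3c_1\min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{55c_0c_2}{12\pi^2}\,q + \frac{|\eta'|_1}{\pi}\, q\cdot\max\left(2,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right),
\end{aligned}
\tag{4.28}
\]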


where we are using the fact that $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to make sure that the term $(5c_0c_2/6\pi^2)D'$ from (4.24) is more than compensated by the term $-2\sqrt{c_0c_1}R/\pi$ coming from $-R/q$ in (4.27) (by the definition of $D'$ and $R$, we have $R \ge D'$). We can also use $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to bound the term $(5c_0c_2/6\pi^2)D'$ from (4.24) by the term $2\sqrt{c_0c_1}D/\pi$ in (4.28), in case $c_2x/q \ge D$. (Again by definition, $D' \le D$.) Thus, (4.28) is valid both when $c_2x/q < D$ and when $c_2x/q \ge D$.

4.2.1 Type I variations

We will need a version of Lemma 4.2.1 with $m$ and $n$ restricted to the odd numbers. (We will barely be using the restriction of $m$, whereas the restriction on $n$ is both (a) slightly harder to deal with, (b) something that can be turned to our advantage.)

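Lemma 4.2.2. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of
\[
\sum_{\substack{m\le D \\ m \text{ odd}}} \mu(m) \sum_{n \text{ odd}} e(\alpha mn)\,\eta\left(\frac{mn}{x}\right)
\tag{4.29}
\]
is at most
\[
\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\left|\sum_{\substack{m\le M/q \\ (m,2q)=1}} \frac{\mu(m)}{m}\right| + O^*\left(\frac{c_0q}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{D}{q}+1\right)^2\right)
\tag{4.30}
\]
plus
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi}\,q\log^+\frac{D}{q/2} \\
&\quad + \frac{2|\eta'|_1}{\pi}\,q\cdot\max\left(1,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{2c_2} + \frac{55c_0c_2}{6\pi^2}\right)q,
\end{aligned}
\tag{4.31}
\]
where $c_1 = 1 + |\eta'|_1/(x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.30) plus
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1,\ 2D\right)\left(\sqrt{3+2\epsilon} + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right) \\
&\quad + \frac32c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{3\pi^2}\,q,
\end{aligned}
\tag{4.32}
\]
for $\epsilon \in (0,1]$ arbitrary.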


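If $q$ is even, the sum (4.30) can be replaced by 0.

Proof. The proof is almost exactly that of Lemma 4.2.1; we go over the differences. The parameters $Q$, $Q'$, $a'$, $q'$ and $M$ are defined just as before (with $2\alpha$ wherever we had $\alpha$).

Let us first consider $m \le M$ odd and divisible by $q$. (Of course, this case arises only if $q$ is odd.) For $n = 2r+1$,
\[
\begin{aligned}
e(\alpha mn) &= e(\alpha m(2r+1)) = e(2\alpha rm)\,e(\alpha m) = e\left(\frac{\delta}{x}rm\right) e\left(\left(\frac{a}{2q} + \frac{\delta}{2x} + \frac{\kappa}{2}\right)m\right) \\
&= e\left(\frac{\delta(2r+1)}{2x}m\right) e\left(\frac{a+\kappa q}{2}\cdot\frac{m}{q}\right) = \kappa'\, e\left(\frac{\delta(2r+1)}{2x}m\right),
\end{aligned}
\]
where $\kappa \in \{0,1\}$ and $\kappa' = e((a+\kappa q)/2) \in \{-1,1\}$ are independent of $m$ and $n$. Hence, by Poisson summation,
\[
\begin{aligned}
\sum_{n \text{ odd}} e(\alpha mn)\,\eta(mn/x) &= \kappa' \sum_{n \text{ odd}} e((\delta m/2x)n)\,\eta(mn/x) \\
&= \frac{\kappa'}{2}\left(\sum_n \hat{f}(n) - \sum_n \hat{f}(n+1/2)\right),
\end{aligned}
\tag{4.33}
\]
where $f(u) = e((\delta m/2x)u)\,\eta((m/x)u)$. Now
\[
\hat{f}(t) = \frac{x}{m}\,\hat{\eta}\left(\frac{x}{m}t - \frac{\delta}{2}\right).
\]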
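The form of Poisson summation used in (4.33), namely $\sum_{n \text{ odd}} f(n) = \tfrac12\bigl(\sum_n \hat f(n) - \sum_n \hat f(n+1/2)\bigr)$, can be illustrated on a function whose Fourier transform is known in closed form. The sketch below uses a Gaussian rather than the compactly supported $\eta$ of the text (an assumption made purely so that both sides are explicit), with the convention $\hat f(t) = \int f(u)e(-ut)\,du$; all function names are illustrative.

```python
import math


def f(u: float, s: float) -> float:
    """Gaussian test function f(u) = exp(-pi (u/s)^2)."""
    return math.exp(-math.pi * (u / s) ** 2)


def f_hat(t: float, s: float) -> float:
    """Fourier transform of f: s * exp(-pi (s t)^2)."""
    return s * math.exp(-math.pi * (s * t) ** 2)


def odd_sum_direct(s: float, N: int = 200) -> float:
    """Sum of f(n) over odd n with |n| <= 2N + 1."""
    return sum(f(n, s) for n in range(-2 * N - 1, 2 * N + 2, 2))


def odd_sum_poisson(s: float, N: int = 200) -> float:
    """(1/2) * (sum of f_hat(n) - sum of f_hat(n + 1/2)), truncated at |n| <= N."""
    a = sum(f_hat(n, s) for n in range(-N, N + 1))
    b = sum(f_hat(n + 0.5, s) for n in range(-N, N + 1))
    return 0.5 * (a - b)


if __name__ == "__main__":
    for s in (0.7, 3.0, 10.0):
        print(s, odd_sum_direct(s), odd_sum_poisson(s))
```

The identity follows from writing the sum over odd $n$ as the sum over all $n$ minus the sum over even $n$ and applying Poisson summation to each; the truncation at `N` is harmless here because of the Gaussian decay.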

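Just as before, $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus
\[
\begin{aligned}
\frac12\left|\sum_n \hat{f}(n) - \sum_n \hat{f}(n+1/2)\right| &\le \frac{x}{m}\left(\frac12\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne 0}\left|\hat{\eta}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right) \\
&= \frac{x}{m}\left(\frac12\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac12\cdot O^*\left(\sum_{n\ne 0}\frac{1}{\left(\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right) \\
&= \frac{x}{2m}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2-4).
\end{aligned}
\tag{4.34}
\]
The contribution of the second term in the last line of (4.34) is
\[
\begin{aligned}
\sum_{\substack{m\le M \\ m \text{ odd} \\ q|m}} \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2-4) &= \frac{q}{x}\frac{c_0}{2\pi^2}(\pi^2-4)\cdot\sum_{\substack{m\le M/q \\ m \text{ odd}}} m \\
&\le \frac{qc_0}{x}\left(\frac18 - \frac{1}{2\pi^2}\right)\left(\frac{M}{q}+1\right)^2.
\end{aligned}
\]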


Hence, the absolute value of the sum of all terms with $m \le M$ and $q|m$ is given by (4.30).
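Two elementary facts sit behind the constants in (4.34) and (4.30): the sum $\sum_{n\ne 0}(n-\theta)^{-2}$ is at most $\pi^2-4$ for $|\theta| \le 1/2$, and the sum of the odd integers up to $N$ is at most $(N+1)^2/4$, which turns $(\pi^2-4)/(2\pi^2)$ into $\tfrac18 - \tfrac{1}{2\pi^2}$ after division by 4. The following sketch checks both numerically; it is a sanity check under this reading of the argument, and the helper names are illustrative.

```python
import math


def tail_sum(theta: float, N: int = 20000) -> float:
    """Sum over 0 < |n| <= N of 1/(n - theta)^2 (a truncation of the full sum)."""
    return sum(1 / (n - theta) ** 2 + 1 / (-n - theta) ** 2 for n in range(1, N + 1))


def odd_sum(N: float) -> int:
    """Sum of the odd positive integers m <= N."""
    return sum(range(1, int(math.floor(N)) + 1, 2))


if __name__ == "__main__":
    print(tail_sum(0.5), math.pi ** 2 - 4)          # the extreme case theta = 1/2
    print(odd_sum(101), (101 + 1) ** 2 / 4)          # equality when N is odd
```

The worst case $\theta = 1/2$ corresponds to $|\delta| m/x = 1/2$, which is exactly the boundary allowed by $x/m \ge 2|\delta|$.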

We define $T_{m,\circ}(\alpha)$ by
\[
T_{m,\circ}(\alpha) = \sum_{n \text{ odd}} e(\alpha mn)\,\eta\left(\frac{mn}{x}\right).
\tag{4.35}
\]
Changing variables by $n = 2r+1$, we see that
\[
|T_{m,\circ}(\alpha)| = \left|\sum_r e(2\alpha\cdot mr)\,\eta(m(2r+1)/x)\right|.
\]
Hence, instead of (4.13), we get that
\[
|T_{m,\circ}(\alpha)| \le \min\left(\frac{x}{2m} + \frac12|\eta'|_1,\ \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\ \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right).
\tag{4.36}
\]
We obtain (4.14), but with $T_{m,\circ}$ instead of $T_m$, $A = (x/2y_1)(1 + |\eta'|_1/(x/y_1))$ and $C = (c_0/2)(y_2/x)$, and so $c_1 = 1 + |\eta'|_1/(x/D)$.

The rest of the proof of Lemma 4.2.1 carries over almost word-by-word. (For the sake of simplicity, we do not really try to take advantage of the odd support of $m$ here.) Since $C$ has doubled, it would seem to make sense to reset the value of $c_2$ to be $c_2 = (3\pi/5\sqrt{2c_0})(1+\sqrt{13/3})$; this would cause complications related to the fact that $5c_0c_2/3\pi^2$ would become larger than $2\sqrt{c_0}/\pi$, and so we set $c_2$ to the slightly smaller value $c_2 = 6\pi/5\sqrt{c_0}$ instead. This implies
\[
\frac{5c_0c_2}{3\pi^2} = \frac{2\sqrt{c_0}}{\pi}.
\tag{4.37}
\]
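The choice of $c_2$ leading to (4.37) is simple arithmetic, and can be confirmed for a few sample values of $c_0$. This is a minimal sketch, assuming the two candidate values are $c_2 = (3\pi/5\sqrt{2c_0})(1+\sqrt{13/3})$ and $c_2 = 6\pi/5\sqrt{c_0}$; the function names and the sample values of $c_0$ are illustrative only.

```python
import math


def c2_natural(c0: float) -> float:
    """The value of c2 suggested by the doubling of C."""
    return 3 * math.pi / (5 * math.sqrt(2 * c0)) * (1 + math.sqrt(13 / 3))


def c2_chosen(c0: float) -> float:
    """The slightly smaller value actually adopted: 6*pi / (5*sqrt(c0))."""
    return 6 * math.pi / (5 * math.sqrt(c0))


def lhs_437(c0: float, c2: float) -> float:
    """5 c0 c2 / (3 pi^2)."""
    return 5 * c0 * c2 / (3 * math.pi ** 2)


def rhs_437(c0: float) -> float:
    """2 sqrt(c0) / pi."""
    return 2 * math.sqrt(c0) / math.pi


if __name__ == "__main__":
    for c0 in (0.5, 1.0, 31.5):
        print(c0, lhs_437(c0, c2_chosen(c0)), rhs_437(c0), lhs_437(c0, c2_natural(c0)))
```

With the chosen value, the two sides of (4.37) coincide; with the other candidate the left side exceeds the right side by a factor of about 1.09, which is the complication mentioned in the text.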

The bound from (4.16) gets multiplied by 2 (but the value of $c_2$ has changed), the second line in (4.19) gets halved, (4.21) gets replaced by (4.37), the second term in the maximum in the second line of (4.23) gets doubled, the bound from (4.24) gets doubled, and the bound from (4.26) gets halved.

We will also need a version of Lemma 4.2.1 (or rather Lemma 4.2.2; we will decide to work with the restriction that $n$ and $m$ be odd) with a factor of $(\log n)$ within the inner sum. This is the sum $S_{I,1}$ in (3.9).

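Lemma 4.2.3. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge \max(16, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$. Assume that, for any $\rho \ge \rho_0$, $\rho_0$ a constant, the function $\eta_{(\rho)}(t) = \log(\rho t)\eta(t)$ satisfies
\[
|\eta_{(\rho)}|_1 \le \log(\rho)|\eta|_1, \qquad |\eta'_{(\rho)}|_1 \le \log(\rho)|\eta'|_1, \qquad |\widehat{\eta''_{(\rho)}}|_\infty \le c_0\log(\rho).
\tag{4.38}
\]
Let $\sqrt{3} \le D \le \min(x/\rho_0, x/e)$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of
\[
\sum_{\substack{m\le D \\ m \text{ odd}}} \mu(m) \sum_{\substack{n \\ n \text{ odd}}} (\log n)\, e(\alpha mn)\,\eta\left(\frac{mn}{x}\right)
\tag{4.39}
\]
is at most
\[
\begin{aligned}
&\frac{x}{q}\min\left(1, \frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m\le M/q \\ (m,q)=1}} \frac{\mu(m)}{m}\log\frac{x}{mq}\right| + \frac{x}{q}\left|\widehat{\log\cdot\eta}(-\delta)\right|\left|\sum_{\substack{m\le M/q \\ (m,q)=1}} \frac{\mu(m)}{m}\right| \\
&\quad + O^*\left(c_0\left(\frac12 - \frac{2}{\pi^2}\right)\left(\frac{D^2}{4qx}\log\frac{e^{1/2}x}{D} + \frac1e\right)\right)
\end{aligned}
\tag{4.40}
\]
plus
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q}\log\frac{q}{c_2} \\
&\quad + \left(\frac{2|\eta'|_1}{\pi}\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x + \frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt{3} + \frac12\log^+\frac{D}{q/2}\right)\log\frac{q}{c_2}\right)q \\
&\quad + \frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2} + \frac{20c_0c_2^{3/2}}{3\pi^2}\sqrt{2x}\log\frac{2\sqrt{e}x}{c_2}
\end{aligned}
\tag{4.41}
\]
for $c_1 = 1 + |\eta'|_1/(x/D)$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.39) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q}+1\right)\left(\sqrt{3+2\epsilon}\cdot\log^+ 2\sqrt{e}|\delta|q + \frac12\log^+\frac{2D}{\frac{x}{|\delta|q}}\log^+ 2|\delta|q\right) \\
&\quad + \left(\frac{3c_1}{4}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{2\epsilon}\log x\right) + \frac{40}{3}\sqrt{2}\,c_0c_2^{3/2}\right)\sqrt{x}\log x
\end{aligned}
\tag{4.42}
\]
for $\epsilon \in (0,1]$.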

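Proof. Define $Q$, $Q'$, $M$, $a'$ and $q'$ as in the proof of Lemma 4.2.1. The same method of proof works as for Lemma 4.2.1; we go over the differences. When applying Poisson summation or (2.2), use $\eta_{(x/m)}(t) = (\log xt/m)\eta(t)$ instead of $\eta(t)$. Then use the bounds in (4.38) with $\rho = x/m$; in particular,
\[
|\widehat{\eta''_{(x/m)}}|_\infty \le c_0\log\frac{x}{m}.
\]
For $f(u) = e((\delta m/2x)u)(\log u)\eta((m/x)u)$,
\[
\hat{f}(t) = \frac{x}{m}\,\widehat{\eta_{(x/m)}}\left(\frac{x}{m}t - \frac{\delta}{2}\right),
\]
and so
\[
\begin{aligned}
\frac12\sum_n\left|\hat{f}(n/2)\right| &\le \frac{x}{m}\left(\frac12\left|\widehat{\eta_{(x/m)}}\left(-\frac{\delta}{2}\right)\right| + \frac12\sum_{n\ne 0}\left|\widehat{\eta_{(x/m)}}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right) \\
&= \frac12\frac{x}{m}\left(\widehat{\log\cdot\eta}\left(-\frac{\delta}{2}\right) + \log\left(\frac{x}{m}\right)\hat{\eta}\left(-\frac{\delta}{2}\right)\right) + \frac{m}{x}\left(\log\frac{x}{m}\right)\frac{c_0}{2\pi^2}(\pi^2-4).
\end{aligned}
\]
The part of the main term involving $\log(x/m)$ becomes
\[
\frac{x\hat{\eta}(-\delta)}{2}\sum_{\substack{m\le M \\ m \text{ odd} \\ q|m}} \frac{\mu(m)}{m}\log\left(\frac{x}{m}\right) = \frac{x\mu(q)}{q}\hat{\eta}(-\delta)\cdot\sum_{\substack{m\le M/q \\ (m,2q)=1}} \frac{\mu(m)}{m}\log\left(\frac{x}{mq}\right)
\]
for $q$ odd. (We can see that this, like the rest of the main term, vanishes for $m$ even.) In the term in front of $\pi^2-4$, we find the sum
\[
\begin{aligned}
\sum_{\substack{m\le M \\ m \text{ odd} \\ q|m}} \frac{m}{x}\log\left(\frac{x}{m}\right) &\le \frac{M}{x}\log\frac{x}{M} + \frac{q}{2x}\int_0^{M/q} t\log\frac{x/q}{t}\,dt \\
&= \frac{M}{x}\log\frac{x}{M} + \frac{M^2}{4qx}\log\frac{e^{1/2}x}{M},
\end{aligned}
\]
where we use the fact that $t \mapsto t\log(x/t)$ is increasing for $t \le x/e$. By the same fact (and by $M \le D$), $(M^2/q)\log(e^{1/2}x/M) \le (D^2/q)\log(e^{1/2}x/D)$. It is also easy to see that $(M/x)\log(x/M) \le 1/e$ (since $M \le D \le x$).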
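The closed form of the integral and the bound by $1/e$ used here can both be checked numerically. The sketch below assumes that the integral carries the normalisation $\frac{q}{2x}\int_0^{M/q} t\log\frac{x/q}{t}\,dt$, which is what makes it equal to $\frac{M^2}{4qx}\log\frac{e^{1/2}x}{M}$; the function names and the sample parameters are illustrative.

```python
import math


def integral_numeric(x: float, q: float, M: float, steps: int = 20000) -> float:
    """Midpoint rule for (q/(2x)) * integral over [0, M/q] of t*log((x/q)/t) dt."""
    a = M / q
    h = a / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t * math.log((x / q) / t)
    return q / (2 * x) * total * h


def integral_closed(x: float, q: float, M: float) -> float:
    """The claimed closed form (M^2 / (4 q x)) * log(sqrt(e) x / M)."""
    return M ** 2 / (4 * q * x) * math.log(math.sqrt(math.e) * x / M)


def boundary_term(x: float, M: float) -> float:
    """The term (M/x) * log(x/M), which is at most 1/e for 0 < M <= x."""
    return M / x * math.log(x / M)


if __name__ == "__main__":
    print(integral_numeric(1e6, 7, 5000.0), integral_closed(1e6, 7, 5000.0))
    print(max(boundary_term(1.0, s) for s in (0.1, 1 / math.e, 0.5, 0.9)), 1 / math.e)
```

The closed form follows from the antiderivative $\tfrac{t^2}{2}\log\tfrac{c}{t} + \tfrac{t^2}{4}$ of $t\log(c/t)$, and the bound by $1/e$ is the maximum of $s\log(1/s)$ on $(0,1]$, attained at $s = 1/e$.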

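The basic estimate for the rest of the proof (replacing (4.13)) is
\[
\begin{aligned}
T_{m,\circ}(\alpha) &= \sum_{n \text{ odd}} e(\alpha mn)(\log n)\,\eta\left(\frac{mn}{x}\right) = \sum_{n \text{ odd}} e(\alpha mn)\,\eta_{(x/m)}\left(\frac{mn}{x}\right) \\
&= O^*\left(\min\left(\frac{x}{2m}|\eta_{(x/m)}|_1 + \frac{|\eta'_{(x/m)}|_1}{2},\ \frac{\frac12|\eta'_{(x/m)}|_1}{|\sin(2\pi m\alpha)|},\ \frac{m}{x}\frac{\frac12|\widehat{\eta''_{(x/m)}}|_\infty}{(\sin 2\pi m\alpha)^2}\right)\right) \\
&= O^*\left(\log\frac{x}{m}\cdot\min\left(\frac{x}{2m} + \frac{|\eta'|_1}{2},\ \frac{\frac12|\eta'|_1}{|\sin(2\pi m\alpha)|},\ \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right)\right).
\end{aligned}
\]
We wish to bound
\[
\sum_{\substack{m\le M \\ q\nmid m \\ m \text{ odd}}} |T_{m,\circ}(\alpha)| + \sum_{\frac{Q}{2} < m\le D} |T_{m,\circ}(\alpha)|.
\tag{4.43}
\]
Just as in the proofs of Lemmas 4.2.1 and 4.2.2, we give two bounds, one valid for $|\delta|$ large ($|\delta| \ge 1/2c_2$) and the other for $\delta$ small ($|\delta| \le 1/2c_2$). Again as in the proof of Lemma 4.2.2, we ignore the condition that $m$ is odd in (4.15).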


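Consider the case of $|\delta|$ large first. Instead of (4.16), we have
\[
\sum_{\substack{1\le m\le M \\ q\nmid m}} |T_{m,\circ}(\alpha)| \le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0\le j\le \frac{M}{q}} (j+1)\log\frac{x}{jq+1}.
\tag{4.44}
\]
Since
\[
\begin{aligned}
\sum_{0\le j\le \frac{M}{q}} (j+1)\log\frac{x}{jq+1} &\le \log x + \frac{M}{q}\log\frac{x}{M} + \sum_{1\le j\le \frac{M}{q}} \log\frac{x}{jq} + \sum_{1\le j\le \frac{M}{q}-1} j\log\frac{x}{jq} \\
&\le \log x + \frac{M}{q}\log\frac{x}{M} + \int_0^{M/q}\log\frac{x}{tq}\,dt + \int_1^{M/q} t\log\frac{x}{tq}\,dt \\
&\le \log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M},
\end{aligned}
\]
this means that
\[
\begin{aligned}
\sum_{\substack{1\le m\le M \\ q\nmid m}} |T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\left(\log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right)\log\frac{e^{1/2}x}{M}\right) \\
&\le \frac{5c_0c_2}{3\pi^2}M\log\frac{\sqrt{e}x}{M} + \frac{40}{3}\sqrt{2}\,c_0c_2^{3/2}\sqrt{x}\log x,
\end{aligned}
\tag{4.45}
\]
where we are using the bounds $M \le Q/2 \le c_2x/q$ and $q^2 \le 2c_2x$ (just as in (4.16)). Instead of (4.17), we have
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\left(\log\frac{x}{jq'+\frac{Q+1}{2}}\right)\frac{x}{jq'+\frac{Q+1}{2}} &\le \frac{x}{Q/2}\log\frac{2x}{Q} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\log\frac{x}{t}\,\frac{dt}{t} \\
&\le \frac{2x}{Q}\log\frac{2x}{Q} + \frac{x}{q'}\log\frac{2x}{Q}\log^+\frac{2D}{Q};
\end{aligned}
\]
recall that the coefficient in front of this sum will be halved by the condition that $n$ is odd. Instead of (4.18), we obtain
\[
\begin{aligned}
q'\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}&\sqrt{1+\frac{q'}{jq'+(Q+1)/2}}\left(\log\frac{x}{jq'+\frac{Q+1}{2}}\right) \\
&\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1} + \int_{\frac{Q+1}{2}}^{D}\left(1+\frac{q'}{2t}\right)\left(\log\frac{x}{t}\right)dt \\
&\le q'\sqrt{3+2\epsilon}\cdot\log\frac{2x}{Q+1} + D\log\frac{ex}{D} - \frac{Q+1}{2}\log\frac{2ex}{Q+1} + \frac{q'}{2}\log\frac{2x}{Q+1}\log\frac{2D}{Q+1}.
\end{aligned}
\]
(The bound $\int_a^b \log(x/t)\,dt/t \le \log(x/a)\log(b/a)$ will be more practical than the exact expression for the integral.) Hence $\sum_{Q/2<m\le D} |T_{m,\circ}(\alpha)|$ is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}\left((1+\epsilon)\sqrt{3+2\epsilon} + \frac{(1+\epsilon)}{2}\log\frac{2D}{Q+1}\right)(Q+1)\log\frac{2x}{Q+1} \\
&\quad - \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac{Q+1}{2}\log\frac{2ex}{Q+1} + \frac{3c_1}{2}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x}.
\end{aligned}
\]
Summing this to (4.45) (with $M = Q/2$), and using (4.21) and (4.22) as before, we obtain that (4.43) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{ex}{D} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left(\sqrt{3+2\epsilon}\,\log^+\frac{2\sqrt{e}x}{Q+1} + \frac12\log^+\frac{2D}{Q+1}\log^+\frac{2x}{Q+1}\right) \\
&\quad + \frac{3c_1}{2}\left(\frac{2}{\sqrt5} + \frac{1+\epsilon}{\epsilon}\log^+\frac{D}{Q/2}\right)\sqrt{x}\log\sqrt{x} + \frac{40}{3}\sqrt{2}\,c_0c_2^{3/2}\sqrt{x}\log x.
\end{aligned}
\]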

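Now we go over the case of $|\delta|$ small (or $D \le Q_0/2$). Instead of (4.23), we have
\[
\sum_{m\le q/2} |T_{m,\circ}(\alpha)| \le \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x.
\tag{4.46}
\]
Suppose $q^2 < 2c_2x$. (Otherwise, the sum we are about to estimate is empty.) Instead of (4.24), we have
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D' \\ q\nmid m}} |T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1\le j\le \frac{D'}{q}+\frac12}\left(j+\frac12\right)\log\frac{x}{\left(j-\frac12\right)q} \\
&\le \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q} + \frac1q\int_0^{D'}\log\frac{x}{t}\,dt + \frac{1}{q^2}\int_0^{D'} t\log\frac{x}{t}\,dt + \frac{D'}{q}\log\frac{x}{D'}\right) \\
&= \frac{10c_0q^3}{3\pi^2x}\left(\log\frac{2x}{q} + \left(\frac{2D'}{q} + \frac{(D')^2}{2q^2}\right)\log\frac{\sqrt{e}x}{D'}\right) \\
&\le \frac{5c_0c_2}{3\pi^2}\left(4\sqrt{2c_2x}\log\frac{2x}{q} + 4\sqrt{2c_2x}\log\frac{\sqrt{e}x}{D'} + D'\log\frac{\sqrt{e}x}{D'}\right) \\
&\le \frac{5c_0c_2}{3\pi^2}\left(D'\log\frac{\sqrt{e}x}{D'} + 4\sqrt{2c_2x}\log\frac{2\sqrt{e}x}{c_2}\right),
\end{aligned}
\tag{4.47}
\]
where $D' = \min(c_2x/q, D)$. (We are using the bounds $q^3/x \le (2c_2)^{3/2}\sqrt{x}$, $D'q^2/x \le c_2q < c_2^{3/2}\sqrt{2x}$ and $D'q/x \le c_2$.) Instead of (4.25), we have
\[
\sum_{R<m\le D} |T_{m,\circ}(\alpha)| \le \sum_{j=0}^{\left\lfloor\frac{D-R}{q}\right\rfloor}\left(\frac{\frac{3c_1}{2}x}{jq+R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1+\frac{q}{jq+R}\right)}\right)\log\frac{x}{jq+R},
\]
where $R = \max(c_2x/q, q/2)$. We can simply reuse (4.26), multiplying it by $\log x/R$; the only difference is that now we take care to bound $\min(q/c_2, 2x/q)$ by the geometric mean $\sqrt{(q/c_2)(2x/q)} = \sqrt{2x/c_2}$. We replace (4.27) by
\[
\begin{aligned}
\sum_{j=0}^{\lfloor\frac1q(D-R)\rfloor}\sqrt{1+\frac{q}{jq+R}}\log\frac{x}{jq+R} &\le \sqrt{1+\frac{q}{R}}\log\frac{x}{R} + \frac1q\int_R^D\sqrt{1+\frac{q}{t}}\log\frac{x}{t}\,dt \\
&\le \sqrt{3}\log\frac{q}{c_2} + \left(\frac{D}{q}\log\frac{ex}{D} - \frac{R}{q}\log\frac{ex}{R}\right) + \frac12\log\frac{q}{c_2}\log^+\frac{D}{R}.
\end{aligned}
\tag{4.48}
\]
We sum with (4.46) and (4.47), and obtain (4.41) as an upper bound for (4.43). (Just as in the proof of Lemma 4.2.1, the term $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ is smaller than the term $(2\sqrt{c_1c_0}/\pi)R\log ex/R$ in (4.48), and thus gets absorbed by it when $D > R$. If $D \le R$, then, again as in Lemma 4.2.1, the sum $\sum_{R<m\le D} |T_{m,\circ}(\alpha)|$ is empty, and we bound $(5c_0c_2/(3\pi^2))D'\log(\sqrt{e}x/D')$ by the term $(2\sqrt{c_1c_0}/\pi)D\log ex/D$, which would not appear otherwise.)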

Now comes the time to focus on our second type I sum, namely,
\[
\sum_{\substack{v\le V \\ v \text{ odd}}}\Lambda(v)\sum_{\substack{u\le U \\ u \text{ odd}}}\mu(u)\sum_{\substack{n \\ n \text{ odd}}} e(\alpha vun)\,\eta(vun/x),
\]
which corresponds to the term $S_{I,2}$ in (3.9). The innermost two sums, on their own, are a sum of type I we have already seen. Accordingly, for $q$ small, we will be able to bound them using Lemma 4.2.2. If $q$ is large, then that approach does not quite work, since then the approximation $av/q$ to $v\alpha$ is not always good enough. (As we shall later see, we need $q \le Q/v$ for the approximation to be sufficiently close for our purposes.)

Fortunately, when $q$ is large, we can also afford to lose a factor of $\log$, since the gains from $q$ will be large. Here is the estimate we will use for $q$ large.

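Lemma 4.2.4. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge \max(2e, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$. Let $c_2 = 6\pi/5\sqrt{c_0}$. Assume that $x \ge e^2c_2/2$.

Let $U, V \ge 1$ satisfy $UV + (19/18)Q_0 \le x/5.6$. Then, if $|\delta| \le 1/2c_2$, the absolute value of
\[
\left|\sum_{\substack{v\le V \\ v \text{ odd}}}\Lambda(v)\sum_{\substack{u\le U \\ u \text{ odd}}}\mu(u)\sum_{\substack{n \\ n \text{ odd}}} e(\alpha vun)\,\eta(vun/x)\right|
\tag{4.49}
\]
is at most
\[
\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\log Vq + O^*\left(\left(\frac14 - \frac{1}{\pi^2}\right)\cdot c_0\left(\frac{D^2\log V}{2qx} + \frac{3c_4}{2}\frac{UV^2}{x} + \frac{(U+1)^2V}{2x}\log q\right)\right)
\tag{4.50}
\]
plus
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}} + q\left(\sqrt{3}\log\frac{c_2x}{q} + \frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right) + \frac{3c_1}{2}\frac{x}{q}\log D\log^+\frac{D}{c_2x/q} \\
&\quad + \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2} + \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}
\tag{4.51}
\]
where $D = UV$ and $c_1 = 1 + |\eta'|_1/(2x/D)$ and $c_4 = 1.03884$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.49) is at most (4.50) plus
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)\left(\frac{x}{|\delta|q}+1\right)\left(\left(\sqrt{3+2\epsilon}-1\right)\log\frac{\frac{x}{|\delta|q}+1}{\sqrt2} + \frac12\log D\log^+\frac{e^2D}{\frac{x}{|\delta|q}}\right) \\
&\quad + \left(\frac{3c_1}{2}\left(\frac12 + \frac{3(1+\epsilon)}{16\epsilon}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x
\end{aligned}
\tag{4.52}
\]
for $\epsilon \in (0,1]$.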

Proof. We proceed essentially as in Lemma 4.2.1 and Lemma 4.2.2. Let $Q$, $q'$ and $Q'$ be as in the proof of Lemma 4.2.2, that is, with $2\alpha$ where Lemma 4.2.1 uses $\alpha$.

Let $M = \min(UV, Q/2)$. We first consider the terms with $uv \le M$, $u$ and $v$ odd, $uv$ divisible by $q$. If $q$ is even, there are no such terms. Assume $q$ is odd. Then, by (4.33) and (4.34), the absolute value of the contribution of these terms is at most
\[
\sum_{\substack{a\le M \\ a \text{ odd} \\ q|a}}\ \sum_{\substack{v|a \\ a/U\le v\le V}}\Lambda(v)\mu(a/v)\left(\frac{x\hat{\eta}(-\delta/2)}{2a} + O\left(\frac{a}{x}\frac{|\widehat{\eta''}|_\infty}{2\pi^2}\cdot(\pi^2-4)\right)\right).
\tag{4.53}
\]

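Now
\[
\sum_{\substack{a\le M \\ a \text{ odd} \\ q|a}}\ \sum_{\substack{v|a \\ a/U\le v\le V}}\frac{\Lambda(v)\mu(a/v)}{a} = \sum_{\substack{v\le V \\ v \text{ odd} \\ (v,q)=1}}\frac{\Lambda(v)}{v}\sum_{\substack{u\le\min(U, M/V) \\ u \text{ odd} \\ q|u}}\frac{\mu(u)}{u} + \sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}\frac{\Lambda(p^\alpha)}{p^\alpha}\sum_{\substack{u\le\min(U, M/V) \\ u \text{ odd} \\ \frac{q}{(q,p^\alpha)}|u}}\frac{\mu(u)}{u},
\]
which equals
\[
\begin{aligned}
&\frac{\mu(q)}{q}\sum_{\substack{v\le V \\ v \text{ odd} \\ (v,q)=1}}\frac{\Lambda(v)}{v}\sum_{\substack{u\le\min(U/q,\ M/Vq) \\ (u,2q)=1}}\frac{\mu(u)}{u} + \sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}\frac{\mu\left(\frac{q}{(q,p^\alpha)}\right)}{q}\,\frac{\Lambda(p^\alpha)}{\frac{p^\alpha}{(q,p^\alpha)}}\sum_{\substack{u\le\min\left(\frac{U}{q/(q,p^\alpha)},\ \frac{M/V}{q/(q,p^\alpha)}\right) \\ u \text{ odd} \\ \left(u,\frac{q}{(q,p^\alpha)}\right)=1}}\frac{\mu(u)}{u} \\
&\quad = \frac1q\cdot O^*\left(\sum_{\substack{v\le V \\ (v,2q)=1}}\frac{\Lambda(v)}{v} + \sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}\frac{\log p}{p^\alpha/(q,p^\alpha)}\right),
\end{aligned}
\]
where we are using (2.20) to bound the sums on $u$ by 1. We notice that
\[
\begin{aligned}
\sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}\frac{\log p}{p^\alpha/(q,p^\alpha)} &\le \sum_{\substack{p \text{ odd} \\ p|q}}(\log p)\left(v_p(q) + \sum_{\substack{\alpha>v_p(q) \\ p^\alpha\le V}}\frac{1}{p^{\alpha-v_p(q)}}\right) \\
&\le \log q + \sum_{\substack{p \text{ odd} \\ p|q}}\ \sum_{\substack{\beta>0 \\ p^\beta\le\frac{V}{p^{v_p(q)}}}}\frac{\log p}{p^\beta} \le \log q + \sum_{\substack{v\le V \\ v \text{ odd} \\ (v,q)\ne 1}}\frac{\Lambda(v)}{v},
\end{aligned}
\]
and so
\[
\sum_{\substack{a\le M \\ a \text{ odd} \\ q|a}}\ \sum_{\substack{v|a \\ a/U\le v\le V}}\frac{\Lambda(v)\mu(a/v)}{a} = \frac1q\cdot O^*\left(\log q + \sum_{\substack{v\le V \\ (v,2)=1}}\frac{\Lambda(v)}{v}\right) = \frac1q\cdot O^*(\log q + \log V)
\]
by (2.12). The absolute value of the sum of the terms with $\hat{\eta}(-\delta/2)$ in (4.53) is thus at most
\[
\frac{x}{q}\frac{\hat{\eta}(-\delta/2)}{2}(\log q + \log V) \le \frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\log Vq,
\]
where we are bounding $\hat{\eta}(-\delta/2)$ by (2.1) (with $k = 2$).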


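The other terms in (4.53) contribute at most
\[
(\pi^2-4)\frac{|\widehat{\eta''}|_\infty}{2\pi^2}\frac1x\sum_{u\le U}\ \sum_{\substack{v\le V \\ uv \text{ odd} \\ uv\le M,\ q|uv \\ u \text{ sq-free}}}\Lambda(v)uv.
\tag{4.54}
\]
For any $R$, $\sum_{u\le R,\ u \text{ odd},\ q|u} u \le R^2/4q + 3R/4$. Using the estimates (2.12), (2.15) and (2.16), we obtain that the double sum in (4.54) is at most
\[
\begin{aligned}
&\sum_{\substack{v\le V \\ (v,2q)=1}}\Lambda(v)v\sum_{\substack{u\le\min(U, M/v) \\ u \text{ odd} \\ q|u}} u + \sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}(\log p)p^\alpha\sum_{\substack{u\le U \\ u \text{ odd} \\ \frac{q}{(q,p^\alpha)}|u}} u \\
&\quad \le \sum_{\substack{v\le V \\ (v,2q)=1}}\Lambda(v)v\cdot\left(\frac{(M/v)^2}{4q} + \frac{3M}{4v}\right) + \sum_{\substack{p^\alpha\le V \\ p \text{ odd} \\ p|q}}(\log p)p^\alpha\cdot\frac{(U+1)^2}{4} \\
&\quad \le \frac{M^2\log V}{4q} + \frac{3c_4}{4}MV + \frac{(U+1)^2}{4}V\log q,
\end{aligned}
\tag{4.55}
\]
where $c_4 = 1.03884$.

From this point onwards, we use the easy bound
\[
\left|\sum_{\substack{v|a \\ a/U\le v\le V}}\Lambda(v)\mu(a/v)\right| \le \log a.
\]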
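Both elementary bounds used in this step can be verified by brute force on small ranges: the bound $\sum_{u \le R,\ u \text{ odd},\ q|u} u \le R^2/4q + 3R/4$ (for $q$ odd) and the bound $\bigl|\sum_{v|a,\ a/U \le v \le V}\Lambda(v)\mu(a/v)\bigr| \le \log a$, which follows from $\sum_{v|a}\Lambda(v) = \log a$. The sketch below is only such a check; the helper names `mangoldt`, `mobius`, `restricted_sum` and `odd_multiples_sum` are illustrative and the ranges tested are arbitrary.

```python
import math


def mangoldt(n: int) -> float:
    """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n): n itself is prime


def mobius(n: int) -> int:
    """mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def restricted_sum(a: int, U: int, V: int) -> float:
    """Sum of Lambda(v) * mu(a/v) over divisors v of a with a/U <= v <= V."""
    return sum(mangoldt(v) * mobius(a // v)
               for v in range(1, a + 1)
               if a % v == 0 and a <= U * v and v <= V)


def odd_multiples_sum(R: int, q: int) -> int:
    """Sum of the odd u <= R that are divisible by q."""
    return sum(u for u in range(1, R + 1) if u % 2 == 1 and u % q == 0)


if __name__ == "__main__":
    print(restricted_sum(30, 5, 10), math.log(30))
    print(odd_multiples_sum(100, 3), 100 ** 2 / (4 * 3) + 3 * 100 / 4)
```

For the first bound, writing $u = qk$ with $k$ odd gives $q\bigl((R/q+1)/2\bigr)^2 = R^2/4q + R/2 + q/4$, and $q \le R$ whenever the sum is non-empty.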

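What we must bound now is
\[
\sum_{\substack{m\le UV \\ m \text{ odd} \\ q\nmid m \text{ or } m>M}}(\log m)\sum_{n \text{ odd}} e(\alpha mn)\,\eta(mn/x).
\tag{4.56}
\]
The inner sum is the same as the sum $T_{m,\circ}(\alpha)$ in (4.35); we will be using the bound (4.36). Much as before, we will be able to ignore the condition that $m$ is odd.

Let $D = UV$. What remains to do is similar to what we did in the proof of Lemma 4.2.1 (or Lemma 4.2.2).

Case (a): $\delta$ large: $|\delta| \ge 1/2c_2$. Instead of (4.16), we have
\[
\sum_{\substack{1\le m\le M \\ q\nmid m}}(\log m)|T_{m,\circ}(\alpha)| \le \frac{40}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0\le j\le\frac{M}{q}}(j+1)\log((j+1)q),
\]
and since $M \le \min(c_2x/q, D)$, $q \le \sqrt{2c_2x}$ (just as in the proof of Lemma 4.2.1) and
\[
\begin{aligned}
\sum_{0\le j\le\frac{M}{q}}(j+1)\log((j+1)q) &\le \frac{M}{q}\log M + \left(\frac{M}{q}+1\right)\log(M+1) + \frac{1}{q^2}\int_0^M t\log t\,dt \\
&\le \left(\frac{2M}{q}+1\right)\log x + \frac{M^2}{2q^2}\log\frac{M}{\sqrt{e}},
\end{aligned}
\]
we conclude that
\[
\sum_{\substack{1\le m\le M \\ q\nmid m}}(\log m)|T_{m,\circ}(\alpha)| \le \frac{5c_0c_2}{3\pi^2}M\log\frac{M}{\sqrt{e}} + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\tag{4.57}
\]
Instead of (4.17), we have
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\frac{x}{jq'+\frac{Q+1}{2}}\log\left(jq'+\frac{Q+1}{2}\right) &\le \frac{x}{\frac{Q+1}{2}}\log\frac{Q+1}{2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{\log t}{t}\,dt \\
&\le \frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2 - \left(\log\frac{Q}{2}\right)^2\right).
\end{aligned}
\]
Instead of (4.18), we estimate
\[
\begin{aligned}
q'\sum_{j=0}^{\left\lfloor\frac{D-\frac{Q+1}{2}}{q'}\right\rfloor}&\left(\log\left(\frac{Q+1}{2}+jq'\right)\right)\sqrt{1+\frac{q'}{jq'+\frac{Q+1}{2}}} \\
&\le q'\left(\log D + \left(\sqrt{3+2\epsilon}-1\right)\log\frac{Q+1}{2}\right) + \int_{\frac{Q+1}{2}}^{D}\log t\,dt + \int_{\frac{Q+1}{2}}^{D}\frac{q'\log t}{2t}\,dt \\
&\le q'\left(\log D + \left(\sqrt{3+2\epsilon}-1\right)\log\frac{Q+1}{2}\right) + \left(D\log\frac{D}{e} - \frac{Q+1}{2}\log\frac{Q+1}{2e}\right) + \frac{q'}{2}\log D\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned}
\]
We conclude that, when $D \ge Q/2$, the sum $\sum_{Q/2<m\le D}(\log m)|T_{m,\circ}(\alpha)|$ is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{e} + (Q+1)\left((1+\epsilon)\left(\sqrt{3+2\epsilon}-1\right)\log\frac{Q+1}{2} - \frac12\log\frac{Q+1}{2e}\right)\right) \\
&\quad + \frac{\sqrt{c_0c_1}}{\pi}(Q+1)(1+\epsilon)\log D\log^+\frac{e^2D}{\frac{Q+1}{2}} + \frac{3c_1}{2}\left(\frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2 - \left(\log\frac{Q}{2}\right)^2\right)\right).
\end{aligned}
\]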


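We must now add this to (4.57). Since
\[
(1+\epsilon)\left(\sqrt{3+2\epsilon}-1\right)\log\sqrt2 - \frac12\log 2e + \frac{1+\sqrt{13/3}}{2}\log 2\sqrt{e} > 0
\]
and $Q \ge 2\sqrt{x}$, we conclude that (4.56) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}D\log\frac{D}{e} + \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left(\left(\sqrt{3+2\epsilon}-1\right)\log\frac{Q+1}{\sqrt2} + \frac12\log D\log^+\frac{e^2D}{\frac{Q+1}{2}}\right) \\
&\quad + \left(\frac{3c_1}{2}\left(\frac12 + \frac{3(1+\epsilon)}{16\epsilon}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x.
\end{aligned}
\tag{4.58}
\]
Case (b): $\delta$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. The analogue of (4.23) is a bound of
\[
\le \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
\]
for the terms with $m \le q/2$. If $q^2 < 2c_2x$, then, much as in (4.24), we have
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D' \\ q\nmid m}} |T_{m,\circ}(\alpha)|(\log m) &\le \frac{10}{\pi^2}\frac{c_0q^3}{3x}\sum_{1\le j\le\frac{D'}{q}+\frac12}\left(j+\frac12\right)\log((j+1/2)q) \\
&\le \frac{10}{\pi^2}\frac{c_0q}{3x}\int_q^{D'+\frac32q} t\log t\,dt.
\end{aligned}
\tag{4.59}
\]
Since
\[
\begin{aligned}
\int_q^{D'+\frac32q} t\log t\,dt &= \frac12\left(D'+\frac32q\right)^2\log\frac{D'+\frac32q}{\sqrt{e}} - \frac12q^2\log\frac{q}{\sqrt{e}} \\
&\le \left(\frac12D'^2 + \frac32D'q\right)\left(\log\frac{D'}{\sqrt{e}} + \frac32\frac{q}{D'}\right) + \frac98q^2\log\frac{D'+\frac32q}{\sqrt{e}} - \frac12q^2\log\frac{q}{\sqrt{e}} \\
&\le \frac12D'^2\log\frac{D'}{\sqrt{e}} + \frac32D'q\log D' + \frac98q^2\left(\frac29 + \frac32 + \log\left(D'+\frac{19}{18}q\right)\right),
\end{aligned}
\]
where $D' = \min(c_2x/q, D)$, and since the assumption $(UV + (19/18)Q_0) \le x/5.6$ implies that $(2/9 + 3/2 + \log(D' + (19/18)q)) \le \log x$, we conclude that
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D' \\ q\nmid m}} |T_{m,\circ}(\alpha)|(\log m) &\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{10c_0}{3\pi^2}\left(\frac34(2c_2)^{3/2}\sqrt{x}\log x + \frac98(2c_2)^{3/2}\sqrt{x}\log x\right) \\
&\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\end{aligned}
\tag{4.60}
\]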
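The role of the constant 5.6 in the hypothesis on $UV$ becomes visible once one notes that $e^{2/9+3/2}$ is just below 5.6. The sketch below checks this, together with the arithmetic $\tfrac{10}{3}\bigl(\tfrac34+\tfrac98\bigr) = \tfrac{25}{4}$ behind the last line of (4.60). It assumes the hypothesis reads $UV + (19/18)Q_0 \le x/5.6$ and that the conclusion drawn from it is $\tfrac29 + \tfrac32 + \log(D' + \tfrac{19}{18}q) \le \log x$; the function names are illustrative.

```python
import math
from fractions import Fraction


def log_bracket(s: float) -> float:
    """2/9 + 3/2 + log(s), where s stands for D' + (19/18) q."""
    return 2 / 9 + 3 / 2 + math.log(s)


def combined_coefficient() -> Fraction:
    """(10/3) * (3/4 + 9/8), the coefficient of c0 (2 c2)^{3/2} sqrt(x) log(x) / pi^2."""
    return Fraction(10, 3) * (Fraction(3, 4) + Fraction(9, 8))


if __name__ == "__main__":
    print(math.exp(2 / 9 + 3 / 2))        # just below 5.6
    print(combined_coefficient())          # 25/4
    x = 1e12
    print(log_bracket(x / 5.6), math.log(x))
```

Since $D' \le UV$ and $q \le Q_0$, the quantity $D' + \tfrac{19}{18}q$ is at most $x/5.6$, and the first check shows that this is enough for the bracket to be at most $\log x$.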


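Let $R = \max(c_2x/q, q/2)$. We bound the terms $R < m \le D$ as in (4.25), with a factor of $\log(jq+R)$ inside the sum. The analogues of (4.26) and (4.27) are
\[
\begin{aligned}
\sum_{j=0}^{\lfloor\frac1q(D-R)\rfloor}\frac{x}{jq+R}\log(jq+R) &\le \frac{x}{R}\log R + \frac{x}{q}\int_R^D\frac{\log t}{t}\,dt \\
&\le \sqrt{\frac{2x}{c_2}}\log\sqrt{\frac{c_2x}{2}} + \frac{x}{q}\log D\log^+\frac{D}{R},
\end{aligned}
\tag{4.61}
\]
where we use the assumption that $x \ge e^2c_2/2$, and
\[
\sum_{j=0}^{\lfloor\frac1q(D-R)\rfloor}\log(jq+R)\sqrt{1+\frac{q}{jq+R}} \le \sqrt{3}\log R + \frac1q\left(D\log\frac{D}{e} - R\log\frac{R}{e}\right) + \frac12\log D\log\frac{D}{R}
\tag{4.62}
\]
(or 0 if $D < R$). We sum with (4.60) and the terms with $m \le q/2$, and obtain, for $D' = c_2x/q = R$,
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}} + q\left(\sqrt{3}\log\frac{c_2x}{q} + \frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right) + \frac{3c_1}{2}\frac{x}{q}\log D\log^+\frac{D}{c_2x/q} \\
&\quad + \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\ \log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2} + \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}
\]
which, it is easy to check, is also valid even if $D' = D$ (in which case (4.61) and (4.62) do not appear) or $R = q/2$ (in which case (4.60) does not appear).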

      Chapter 5

      Type II sums

We must now consider the sum
\[
S_{II} = \sum_{\substack{m>U \\ (m,v)=1}}\left(\sum_{\substack{d>U \\ d|m}}\mu(d)\right)\sum_{\substack{n>V \\ (n,v)=1}}\Lambda(n)\,e(\alpha mn)\,\eta(mn/x).
\tag{5.1}
\]
Here the main improvements over classical treatments of type II sums are as follows:

1. obtaining cancellation in the term $\sum_{d>U,\ d|m}\mu(d)$, leading to a gain of a factor of $\log$;

2. using a large sieve for primes, getting rid of a further $\log$;

3. exploiting, via a non-conventional application of the principle of the large sieve (Lemma 5.2.1), the fact that $\alpha$ is in the tail of an interval (when that is the case).

It should be clear that these techniques are of general applicability. (It is also clear that (2) is not new, though, strangely enough, it seems not to have been applied to Goldbach's problem. Perhaps this oversight is due to the fact that proofs of Vinogradov's result given in textbooks often follow Linnik's dispersion method, rather than the large sieve. Our treatment of the large sieve for primes will follow the lines set by Montgomery and Montgomery-Vaughan [MV73, (1.6)]. The fact that the large sieve for primes can be combined with the new technique (3) is, of course, a novelty.)

While (1) is particularly useful for the treatment of a term that generally arises in applications of Vaughan's identity, all of the points above address issues that can arise in more general situations in number theory.


It is technically helpful to express $\eta$ as the (multiplicative) convolution of two functions of compact support, preferably the same function:
\[
\eta(x) = \eta_1 \ast_M \eta_1 = \int_0^\infty \eta_1(t)\,\eta_1(x/t)\,\frac{dt}{t}. \tag{5.2}
\]
For the smoothing function $\eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, equation (5.2) holds with $\eta_1 = 2\cdot 1_{[1/2,1]}$, where $1_{[1/2,1]}$ is the characteristic function of the interval $[1/2,1]$. We will work with $\eta = \eta_2$, yet most of our work will be valid for any $\eta$ of the form $\eta = \eta_1 \ast_M \eta_1$.
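The identity $\eta_2 = \eta_1 \ast_M \eta_1$ can be sanity-checked numerically. The following is only a sketch: the function names and the midpoint-rule quadrature are our own choices for illustration and are not part of the argument.

```python
import math

LOG2 = math.log(2.0)

def eta1(t):
    """eta_1 = 2 * (characteristic function of [1/2, 1])."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def eta2(x):
    """eta_2(x) = 4 * max(log 2 - |log 2x|, 0), for x > 0."""
    return 4.0 * max(LOG2 - abs(math.log(2.0 * x)), 0.0)

def mult_convolution(x, steps=20000):
    """Midpoint-rule approximation of int_0^inf eta1(t) eta1(x/t) dt/t.

    Only t in [1/2, 1] contributes; with t = exp(w), dt/t becomes dw.
    """
    a, b = -LOG2, 0.0
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(a + (k + 0.5) * h)
        total += eta1(t) * eta1(x / t)
    return total * h

if __name__ == "__main__":
    for x in (0.2, 0.3, 0.5, 0.8, 1.0, 1.3):
        print(x, mult_convolution(x), eta2(x))
```

The two printed columns should agree up to the quadrature error, which is of the order of the step size because the integrand is discontinuous.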

By (5.2), the sum (5.1) equals
\[
\begin{aligned}
&4\int_0^\infty \sum_{\substack{m>U\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}} \mu(d)\Biggr) \sum_{\substack{n>V\\ (n,v)=1}} \Lambda(n)\, e(\alpha mn)\, \eta_1(t)\, \eta_1\Bigl(\frac{mn/x}{t}\Bigr) \frac{dt}{t}\\
&\quad = 4\int_V^{x/U} \sum_{\substack{\max(\frac{x}{2W},U)<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}} \mu(d)\Biggr) \sum_{\substack{\max(V,\frac{W}{2})<n\le W\\ (n,v)=1}} \Lambda(n)\, e(\alpha mn)\, \frac{dW}{W}
\end{aligned}
\tag{5.3}
\]
by the substitution $t = (m/x)W$. (We can assume $V \le W \le x/U$ because otherwise one of the sums in (5.4) is empty.) As we can see, the sums within the integral are now unsmoothed. This will not be truly harmful, and to some extent it will be convenient, in that ready-to-use large-sieve estimates in the literature have been optimized more carefully for unsmoothed sums than for smooth sums. The fact that the sums start at $x/2W$ and $W/2$, rather than at 1, will also be slightly helpful.

(This is presumably why the weight $\eta_2$ was introduced in [Tao14], which also uses the large sieve. As we will later see, the weight $\eta_2$, or anything like it, will simply not do on the major arcs, which are much more sensitive to the choice of weights. On the minor arcs, however, $\eta_2$ is convenient, and this is why we use it here. For type I sums, as should be clear from our work so far, which was stated for general weights, any function whose second derivative exists almost everywhere and lies in $\ell_1$ would do just as well. The option of having no smoothing whatsoever, as in Vinogradov's work or as in most textbook accounts, would not be quite as good for type I sums, and would lead to a routine but inconvenient splitting of sums into short intervals in place of (5.3).)

We now do what is generally the first thing in type II treatments: we use Cauchy-Schwarz. A minor note, however, that may help avoid confusion: the treatments familiar to some readers (e.g., the dispersion method, not followed here) start with the special case of Cauchy-Schwarz that is most common in number theory,
\[
\Biggl|\sum_{n\le N} a_n\Biggr|^2 \le N \sum_{n\le N} |a_n|^2,
\]
whereas here we apply the general rule
\[
\sum_m a_m b_m \le \sqrt{\sum_m |a_m|^2}\, \sqrt{\sum_m |b_m|^2}
\]
to the integrand in (5.3). At any rate, we will have reduced the estimation of a sum to the estimation of two simpler sums $\sum_m |a_m|^2$, $\sum_m |b_m|^2$, but each of these two simpler sums will be of a kind that will lead to a loss of a factor of $\log x$ (or $(\log x)^3$) if not estimated carefully. Since we cannot afford to lose a single factor of $\log x$, we will have to deploy and develop techniques to eliminate these factors of $\log x$. The procedure followed will be quite different for the two sums; a variety of techniques will be needed.

We separate $n$ prime and $n$ non-prime in the integrand of (5.3), and, as we were saying, we apply Cauchy-Schwarz. We obtain that the expression within the integral in (5.3) is at most $\sqrt{S_1(U,W)\cdot S_2(U,V,W)} + \sqrt{S_1(U,W)\cdot S_3(W)}$, where
\[
\begin{aligned}
S_1(U,W) &= \sum_{\substack{\max(\frac{x}{2W},U)<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}} \mu(d)\Biggr)^2,\\
S_2(U,V,W) &= \sum_{\substack{\max(\frac{x}{2W},U)<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl|\sum_{\substack{\max(V,\frac{W}{2})<p\le W\\ (p,v)=1}} (\log p)\, e(\alpha mp)\Biggr|^2
\end{aligned}
\tag{5.4}
\]
and
\[
S_3(W) = \sum_{\substack{\frac{x}{2W}<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl|\sum_{\substack{n\le W\\ n \text{ non-prime}}} \Lambda(n)\Biggr|^2 \le \sum_{\substack{\frac{x}{2W}<m\le \frac{x}{W}\\ (m,v)=1}} \bigl(1.42620\, W^{1/2}\bigr)^2 \le 1.0171x + 2.0341W \tag{5.5}
\]
(by [RS62, Thm. 13]). We will assume $V \ge v$; thus the condition $(p,v)=1$ will be fulfilled automatically and can be removed.

The contribution of $S_3(W)$ will be negligible. We must bound $S_1(U,W)$ and $S_2(U,V,W)$ from above.


5.1 The sum $S_1$: cancellation

We shall bound
\[
S_1(U,W) = \sum_{\substack{\max(U,\frac{x}{2W})<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}} \mu(d)\Biggr)^2. \tag{5.6}
\]

There will be a surprising amount of cancellation: the expression within the sum will be bounded by a constant on average (a constant less than 1, and usually less than 1/2, in fact). In other words, the inner sum in (5.6) is exactly 0 most of the time.
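To get a feeling for this cancellation, one can tabulate the inner sum $\sum_{d>U,\ d|m}\mu(d)$ directly. The sketch below takes $v=1$ and uses a stand-in parameter `M` for $x/W$; the function names and the sample values of `M` and `U` are illustrative assumptions only, and the printed averages are experimental, not the explicit bounds proved in this section.

```python
def mobius_sieve(n):
    """Return a list mu[0..n] of Moebius values, via a linear sieve."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_composite = [False] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if not is_composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > n:
                break
            is_composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0      # p^2 divides i*p
                break
            mu[i * p] = -mu[i]
    return mu

def tail_divisor_sums(M, U, mu):
    """t[m] = sum of mu(d) over divisors d of m with d > U, for m <= M."""
    t = [0] * (M + 1)
    for d in range(U + 1, M + 1):
        if mu[d]:
            for m in range(d, M + 1, d):
                t[m] += mu[d]
    return t

def mean_square(M, U):
    """Average of t[m]^2 over max(U, M/2) < m <= M, and the share of zeros."""
    mu = mobius_sieve(M)
    t = tail_divisor_sums(M, U, mu)
    lo = max(U, M // 2)
    vals = [t[m] ** 2 for m in range(lo + 1, M + 1)]
    return sum(vals) / len(vals), sum(1 for w in vals if w == 0) / len(vals)

if __name__ == "__main__":
    for M, U in ((20000, 2000), (200000, 20000)):
        print(M, U, mean_square(M, U))
```

Note that, for $m>1$, the tail sum equals $-\sum_{d\le U,\ d|m}\mu(d)$, which gives an independent way to check the table.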

Recall that we need explicit constants throughout, and that this essentially constrains us to elementary means. (We will at one point use Dirichlet series and $\zeta(s)$ for $s$ real and greater than 1.)

5.1.1 Reduction to a sum with $\mu$

It is tempting to start by applying Möbius inversion to change $d>U$ to $d\le U$ in (5.6), but this just makes matters worse. We could also try changing variables so that $m/d$ (which is smaller than $x/UW$) becomes the variable instead of $d$, but this leads to complications for $m$ non-square-free. Instead, we write
\[
\begin{aligned}
\sum_{\substack{\max(U,\frac{x}{2W})<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr)^2
&= \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}} \sum_{d_1,d_2|m} \mu(d_1>U)\,\mu(d_2>U)\\
&= \sum_{r_1<\frac{x}{WU}} \sum_{\substack{r_2<\frac{x}{WU}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \sum_{\substack{l\\ (l,r_1r_2)=1\\ r_1l,\, r_2l>U\\ (l,v)=1}} \mu(r_1l)\,\mu(r_2l) \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ r_1r_2l|m\\ (m,v)=1}} 1,
\end{aligned}
\tag{5.7}
\]
where $d_1 = r_1l$, $d_2 = r_2l$, $l = (d_1,d_2)$. (The inequality $r_1 < x/WU$ comes from $r_1r_2l\,|\,m$, $m\le x/W$, $r_2l>U$; $r_2 < x/WU$ is proven in the same way.) Now (5.7) equals
\[
\sum_{\substack{s<\frac{x}{WU}\\ (s,v)=1}} \sum_{r_1<\frac{x}{WUs}} \sum_{\substack{r_2<\frac{x}{WUs}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\,\mu(r_2) \sum_{\substack{\max\left(\frac{U}{\min(r_1,r_2)},\ \frac{x/W}{2r_1r_2s}\right)<l\le \frac{x/W}{r_1r_2s}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (l,v)=1}} 1, \tag{5.8}
\]
where we have set $s = m/(r_1r_2l)$. We begin by simplifying the innermost triple sum. This we do in the following Lemma; it is not a trivial task, and carrying it out efficiently actually takes an idea.


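Lemma 5.1.1. Let $z, y > 0$. Then
\[
\sum_{r_1<y} \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\,\mu(r_2) \sum_{\substack{\max\left(\frac{z/y}{\min(r_1,r_2)},\ \frac{z}{2r_1r_2}\right)<l\le \frac{z}{r_1r_2}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (l,v)=1}} 1 \tag{5.9}
\]
equals
\[
\begin{aligned}
&\frac{6z}{\pi^2}\frac{v}{\sigma(v)} \sum_{r_1<y} \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \Bigl(1-\max\Bigl(\frac12, \frac{r_1}{y}, \frac{r_2}{y}\Bigr)\Bigr)\\
&\quad + O^*\Biggl(5.08\,\zeta\Bigl(\frac32\Bigr)^2 y\sqrt{z}\cdot \prod_{p|v}\Bigl(1+\frac{1}{\sqrt p}\Bigr)\Bigl(1-\frac{1}{p^{3/2}}\Bigr)^2\Biggr).
\end{aligned}
\tag{5.10}
\]
If $v = 2$, the error term in (5.10) can be replaced by
\[
O^*\Biggl(1.27\,\zeta\Bigl(\frac32\Bigr)^2 y\sqrt{z}\cdot\Bigl(1+\frac{1}{\sqrt2}\Bigr)\Bigl(1-\frac{1}{2^{3/2}}\Bigr)^2\Biggr). \tag{5.11}
\]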

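Proof. By Möbius inversion, (5.9) equals
\[
\sum_{r_1<y} \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\,\mu(r_2) \sum_{\substack{l\le \frac{z}{r_1r_2}\\ l>\max\left(\frac{z/y}{\min(r_1,r_2)},\ \frac{z}{2r_1r_2}\right)\\ (l,v)=1}} \ \sum_{\substack{d_1|r_1,\ d_2|r_2\\ d_1d_2|l}} \mu(d_1)\mu(d_2) \sum_{\substack{d_3|v\\ d_3|l}} \mu(d_3) \sum_{\substack{m^2|l\\ (m,r_1r_2v)=1}} \mu(m). \tag{5.12}
\]
We can change the order of summation of $r_i$ and $d_i$ by defining $s_i = r_i/d_i$, and we can also use the obvious fact that the number of integers in an interval $(a,b]$ divisible by $d$ is $(b-a)/d + O^*(1)$. Thus (5.12) equals
\[
\begin{aligned}
&\sum_{\substack{d_1,d_2<y\\ (d_1,d_2)=1\\ (d_1d_2,v)=1}} \mu(d_1)\mu(d_2) \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (d_1s_1,d_2s_2)=1\\ (s_1s_2,v)=1}} \mu(d_1s_1)\,\mu(d_2s_2)\\
&\qquad\cdot \sum_{d_3|v} \mu(d_3) \sum_{\substack{m\le \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ (m,d_1s_1d_2s_2v)=1}} \frac{\mu(m)}{d_1d_2d_3m^2}\, \frac{z}{s_1d_1s_2d_2} \Bigl(1-\max\Bigl(\frac12, \frac{s_1d_1}{y}, \frac{s_2d_2}{y}\Bigr)\Bigr)
\end{aligned}
\tag{5.13}
\]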


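plus
\[
O^*\Biggl(\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}} \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}} \sum_{d_3|v} \sum_{\substack{m\le \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m \text{ sq-free}}} 1\Biggr). \tag{5.14}
\]
If we complete the innermost sum in (5.13) by removing the condition
\[
m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)},
\]
we obtain (reintroducing the variables $r_i = d_is_i$)
\[
z\cdot \sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{r_1r_2} \Bigl(1-\max\Bigl(\frac12,\frac{r_1}{y},\frac{r_2}{y}\Bigr)\Bigr) \sum_{\substack{d_1|r_1\\ d_2|r_2}} \sum_{d_3|v} \sum_{\substack{m\\ (m,r_1r_2v)=1}} \frac{\mu(d_1)\mu(d_2)\mu(m)\mu(d_3)}{d_1d_2d_3m^2}. \tag{5.15}
\]
Now (5.15) equals
\[
\begin{aligned}
&\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)\,z}{r_1r_2} \Bigl(1-\max\Bigl(\frac12,\frac{r_1}{y},\frac{r_2}{y}\Bigr)\Bigr) \prod_{p|r_1r_2 \text{ or } p|v}\Bigl(1-\frac1p\Bigr) \prod_{\substack{p\nmid r_1r_2\\ p\nmid v}}\Bigl(1-\frac{1}{p^2}\Bigr)\\
&\quad= \frac{6z}{\pi^2}\frac{v}{\sigma(v)} \sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \Bigl(1-\max\Bigl(\frac12,\frac{r_1}{y},\frac{r_2}{y}\Bigr)\Bigr),
\end{aligned}
\]
i.e., the main term in (5.10). It remains to estimate the terms used to complete the sum; their total is, by definition, given exactly by (5.13) with the inequality $m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$ changed to $m > \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$. This is a total of size at most
\[
\frac12 \sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}} \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}} \sum_{d_3|v} \sum_{\substack{m> \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m \text{ sq-free}}} \frac{1}{d_1d_2d_3m^2}\, \frac{z}{s_1d_1s_2d_2}. \tag{5.16}
\]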

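Adding this to (5.14), we obtain, as our total error term,
\[
\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}} \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}} \sum_{d_3|v} f\Biggl(\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\Biggr), \tag{5.17}
\]
where
\[
f(x) := \sum_{\substack{m\le x\\ m \text{ sq-free}}} 1 + \frac12 \sum_{\substack{m>x\\ m \text{ sq-free}}} \frac{x^2}{m^2}.
\]
It is easy to see that $f(x)/x$ has a local maximum exactly when $x$ is a square-free (positive) integer. We can hence check that
\[
f(x) \le \frac12\Bigl(2 + 2\Bigl(\frac{\zeta(2)}{\zeta(4)} - 1.25\Bigr)\Bigr) x = 1.26981\ldots x
\]
for all $x\ge 0$ by checking all integers smaller than a constant, using $\{m : m \text{ sq-free}\} \subset \{m : 4\nmid m\}$ and $1.5\cdot(3/4) < 1.26981$ to bound $f$ from above for $x$ larger than a constant. Therefore, (5.17) is at most
\[
\begin{aligned}
&1.27 \sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}} \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}} \sum_{d_3|v} \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\
&\quad = 1.27\sqrt{z} \prod_{p|v}\Bigl(1+\frac{1}{\sqrt p}\Bigr)\cdot \Biggl(\sum_{\substack{d<y\\ (d,v)=1}} \sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{d\sqrt{s}}\Biggr)^2.
\end{aligned}
\]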
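The constant $1.26981\ldots$ in the bound on $f$ can be reproduced numerically. The sketch below evaluates $f(n)/n$ at integers $n$, using $\sum_{m \text{ sq-free}} m^{-2} = \zeta(2)/\zeta(4) = 15/\pi^2$ for the tail; the helper names and the cutoff are our own choices, and a finite scan is of course only a sanity check, not a proof.

```python
import math

# sum over square-free m of 1/m^2 equals zeta(2)/zeta(4) = 15/pi^2
SQUAREFREE_ZETA = 15.0 / math.pi ** 2

def squarefree_flags(n):
    """flags[m] is True exactly when 1 <= m <= n is square-free."""
    flags = [True] * (n + 1)
    flags[0] = False
    k = 2
    while k * k <= n:
        for m in range(k * k, n + 1, k * k):
            flags[m] = False
        k += 1
    return flags

def f_over_x_at_integers(n_max):
    """Return {n: f(n)/n} for 1 <= n <= n_max, with f as defined in the text."""
    flags = squarefree_flags(n_max)
    ratios = {}
    count = 0      # number of square-free m <= n
    head = 0.0     # sum of 1/m^2 over square-free m <= n
    for n in range(1, n_max + 1):
        if flags[n]:
            count += 1
            head += 1.0 / (n * n)
        tail = SQUAREFREE_ZETA - head
        ratios[n] = (count + 0.5 * n * n * tail) / n
    return ratios

if __name__ == "__main__":
    r = f_over_x_at_integers(2000)
    best = max(r, key=r.get)
    print(best, r[best])
```

In this scan the maximum is attained at $n=2$, where $f(2)/2 = \frac12\bigl(2 + 2(\zeta(2)/\zeta(4) - 1.25)\bigr)$.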

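We can bound the double sum simply by
\[
\sum_{\substack{d<y\\ (d,v)=1}} \sum_{s<y/d} \frac{1}{\sqrt{s}\,d} \le 2 \sum_{\substack{d<y\\ (d,v)=1}} \frac{\sqrt{y/d}}{d} \le 2\sqrt{y}\cdot \zeta\Bigl(\frac32\Bigr) \prod_{p|v}\Bigl(1-\frac{1}{p^{3/2}}\Bigr).
\]
Alternatively, if $v=2$, we bound
\[
\sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{\sqrt s} = \sum_{\substack{s<y/d\\ s \text{ odd}}} \frac{1}{\sqrt s} \le 1 + \frac12\int_1^{y/d} \frac{1}{\sqrt s}\,ds = \sqrt{y/d}
\]
and thus
\[
\sum_{\substack{d<y\\ (d,v)=1}} \sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{\sqrt{s}\,d} \le \sum_{\substack{d<y\\ (d,2)=1}} \frac{\sqrt{y/d}}{d} \le \sqrt{y}\,\Bigl(1-\frac{1}{2^{3/2}}\Bigr)\zeta\Bigl(\frac32\Bigr).
\]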

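Applying Lemma 5.1.1 with $y = S/s$ and $z = x/Ws$, where $S = x/WU$, we obtain that (5.8) equals
\[
\begin{aligned}
&\frac{6x}{\pi^2W}\frac{v}{\sigma(v)} \sum_{\substack{s<S\\ (s,v)=1}} \frac1s \sum_{r_1<S/s} \sum_{\substack{r_2<S/s\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \Bigl(1-\max\Bigl(\frac12, \frac{r_1}{S/s}, \frac{r_2}{S/s}\Bigr)\Bigr)\\
&\quad + O^*\Biggl(5.08\,\zeta\Bigl(\frac32\Bigr)^3 S\sqrt{\frac{x}{W}} \prod_{p|v}\Bigl(1+\frac{1}{\sqrt p}\Bigr)\Bigl(1-\frac{1}{p^{3/2}}\Bigr)^3\Biggr),
\end{aligned}
\tag{5.18}
\]
with 5.08 replaced by 1.27 if $v=2$. The main term in (5.18) can be written as
\[
\frac{6x}{\pi^2W}\frac{v}{\sigma(v)} \sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 \sum_{r_1\le uS/s} \sum_{\substack{r_2\le uS/s\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\,du. \tag{5.19}
\]
As we can see, the use of an integral eliminates the unpleasant factor
\[
\Bigl(1-\max\Bigl(\frac12, \frac{r_1}{S/s}, \frac{r_2}{S/s}\Bigr)\Bigr).
\]
From now on, we will focus on the cases $v=1$ and $v=2$ for simplicity. (Higher values of $v$ do not seem to be really profitable in the last analysis.)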

5.1.2 Explicit bounds for a sum with $\mu$

We must estimate the expression within parentheses in (5.19). It is not too hard to show that it tends to 0; the first part of the proof of Lemma 5.1.2 will reduce this to the fact that $\sum_n \mu(n)/n = 0$. Obtaining good bounds is a more delicate matter. For our purposes, we will need the expression to converge to 0 at least as fast as $1/(\log)^2$, with a good constant in front. For this task, the bound (2.21) on $\sum_{n\le x}\mu(n)/n$ is enough.

Lemma 5.1.2. Let
\[
g_v(x) := \sum_{r_1\le x} \sum_{\substack{r_2\le x\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)},
\]
where $v=1$ or $v=2$. Then
\[
|g_1(x)| \le \begin{cases} 1/x & \text{if } 33\le x\le 10^6,\\ \frac1x(111.536 + 55.768\log x) & \text{if } 10^6\le x<10^{10},\\ \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt x} & \text{if } x\ge 10^{10}, \end{cases}
\]
\[
|g_2(x)| \le \begin{cases} 2.1/x & \text{if } 33\le x\le 10^6,\\ \frac1x(1634.34 + 817.168\log x) & \text{if } 10^6\le x<10^{10},\\ \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt x} & \text{if } x\ge 10^{10}. \end{cases}
\]


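The proof involves what may be called a version of Rankin's trick, using Dirichlet series and the behavior of $\zeta(s)$ near $s=1$.

Proof. We prove the statements for $x\le 10^6$ by a direct computation, using interval arithmetic. (In fact, in that range, one gets $2.0895071/x$ instead of $2.1/x$.) Assume from now on that $x>10^6$.

Clearly
\[
\begin{aligned}
g(x) &= \sum_{r_1\le x} \sum_{\substack{r_2\le x\\ (r_1r_2,v)=1}} \Biggl(\sum_{d|(r_1,r_2)} \mu(d)\Biggr) \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}} \mu(d) \sum_{r_1\le x} \sum_{\substack{r_2\le x\\ d|(r_1,r_2)\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \sum_{\substack{u_1\le x/d\\ (u_1,dv)=1}} \sum_{\substack{u_2\le x/d\\ (u_2,dv)=1}} \frac{\mu(u_1)\mu(u_2)}{\sigma(u_1)\sigma(u_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}} \frac{\mu(d)}{(\sigma(d))^2} \Biggl(\sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)}\Biggr)^2.
\end{aligned}
\tag{5.20}
\]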
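The definition of $g_v$ and the factorization (5.20) can be compared in exact arithmetic for small $x$. This is only an illustrative sketch (the function names are ours, and it is in no way the interval-arithmetic computation referred to in the proof); it uses `fractions.Fraction` so that the two sides can be tested for exact equality.

```python
from fractions import Fraction
from math import gcd

def mobius_and_sigma(n):
    """Return lists mu[0..n] and sigma[0..n] (sigma = sum of divisors)."""
    mu = [0] * (n + 1)
    sigma = [0] * (n + 1)
    mu[1] = 1
    for d in range(1, n + 1):
        for m in range(d, n + 1, d):
            sigma[m] += d
    # mu(m) = -sum of mu(d) over proper divisors d of m, for m > 1
    for d in range(1, n + 1):
        for m in range(2 * d, n + 1, d):
            mu[m] -= mu[d]
    return mu, sigma

def g_direct(x, v, mu, sigma):
    """g_v(x) straight from its definition as a double sum."""
    total = Fraction(0)
    for r1 in range(1, x + 1):
        if mu[r1] == 0 or gcd(r1, v) != 1:
            continue
        for r2 in range(1, x + 1):
            if mu[r2] == 0 or gcd(r2, v) != 1 or gcd(r1, r2) != 1:
                continue
            total += Fraction(mu[r1] * mu[r2], sigma[r1] * sigma[r2])
    return total

def g_factored(x, v, mu, sigma):
    """g_v(x) via the last line of (5.20)."""
    total = Fraction(0)
    for d in range(1, x + 1):
        if mu[d] == 0 or gcd(d, v) != 1:
            continue
        inner = sum((Fraction(mu[r], sigma[r])
                     for r in range(1, x // d + 1)
                     if mu[r] != 0 and gcd(r, d * v) == 1), Fraction(0))
        total += Fraction(mu[d], sigma[d] ** 2) * inner ** 2
    return total

if __name__ == "__main__":
    mu, sigma = mobius_and_sigma(60)
    for x in (1, 2, 3, 10, 33, 60):
        print(x, g_direct(x, 1, mu, sigma), g_direct(x, 2, mu, sigma))
```

For instance, $g_1(2) = 1 - \frac13 - \frac13 = \frac13$, since the pair $(2,2)$ is excluded by the coprimality condition.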

Moreover,
\[
\begin{aligned}
\sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{\sigma(r)} &= \sum_{\substack{r\le x/d\\ (r,dv)=1}} \frac{\mu(r)}{r} \sum_{d'|r} \prod_{p|d'} \Bigl(\frac{p}{p+1} - 1\Bigr)\\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \Biggl(\prod_{p|d'} \frac{-1}{p+1}\Biggr) \sum_{\substack{r\le x/d\\ (r,dv)=1\\ d'|r}} \frac{\mu(r)}{r}\\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}} \frac{\mu(r)}{r}
\end{aligned}
\]
and
\[
\sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}} \frac{\mu(r)}{r} = \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}} \frac{1}{d''} \sum_{r\le x/dd'd''} \frac{\mu(r)}{r}.
\]


Hence
\[
|g(x)| \le \sum_{\substack{d\le x\\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2} \Biggl(\sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}} \frac{1}{d'\sigma(d')} \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}} \frac{1}{d''}\, f(x/dd'd'')\Biggr)^2, \tag{5.21}
\]
where $f(t) = \bigl|\sum_{r\le t} \mu(r)/r\bigr|$.

We intend to bound the function $f(t)$ by a linear combination of terms of the form $t^{-\delta}$, $\delta\in[0,1/2)$. Thus it makes sense now to estimate $F_v(s_1,s_2,x)$, defined to be the quantity
\[
\begin{aligned}
\sum_{\substack{d\\ (d,v)=1}} \frac{(\mu(d))^2}{(\sigma(d))^2} &\Biggl(\sum_{\substack{d_1'\\ (d_1',dv)=1}} \frac{\mu(d_1')^2}{d_1'\sigma(d_1')} \sum_{d_1''|(dd_1'v)^\infty} \frac{1}{d_1''}\cdot (dd_1'd_1'')^{1-s_1}\Biggr)\\
\cdot &\Biggl(\sum_{\substack{d_2'\\ (d_2',dv)=1}} \frac{\mu(d_2')^2}{d_2'\sigma(d_2')} \sum_{d_2''|(dd_2'v)^\infty} \frac{1}{d_2''}\cdot (dd_2'd_2'')^{1-s_2}\Biggr)
\end{aligned}
\]
for $s_1,s_2\in[1/2,1]$. This is equal to
\[
\begin{aligned}
\sum_{\substack{d\\ (d,v)=1}} \frac{\mu(d)^2}{d^{s_1+s_2}} &\prod_{p|d} \frac{1}{(1+p^{-1})^2(1-p^{-s_1})(1-p^{-s_2})} \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
\cdot &\Biggl(\sum_{\substack{d'\\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_1+1}} \prod_{p'|d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_1})}\Biggr)\\
\cdot &\Biggl(\sum_{\substack{d'\\ (d',dv)=1}} \frac{\mu(d')^2}{(d')^{s_2+1}} \prod_{p'|d'} \frac{1}{(1+p'^{-1})(1-p'^{-s_2})}\Biggr),
\end{aligned}
\]
which in turn can easily be seen to equal
\[
\begin{aligned}
&\prod_{p\nmid v}\Bigl(1 + \frac{p^{-s_1}p^{-s_2}}{(1-p^{-s_1}+p^{-1})(1-p^{-s_2}+p^{-1})}\Bigr) \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
&\quad\cdot \prod_{p\nmid v}\Bigl(1+\frac{p^{-1}p^{-s_1}}{(1+p^{-1})(1-p^{-s_1})}\Bigr) \cdot \prod_{p\nmid v}\Bigl(1+\frac{p^{-1}p^{-s_2}}{(1+p^{-1})(1-p^{-s_2})}\Bigr).
\end{aligned}
\tag{5.22}
\]


Now, for any $0<x\le y\le x^{1/2}<1$,
\[
(1+x-y)(1-xy)(1-xy^2) - (1+x)(1-y)(1-x^3) = (x-y)(y^2-x)(xy-x-1)x \le 0,
\]
and so
\[
1 + \frac{xy}{(1+x)(1-y)} = \frac{(1+x-y)(1-xy)(1-xy^2)}{(1+x)(1-y)(1-xy)(1-xy^2)} \le \frac{(1-x^3)}{(1-xy)(1-xy^2)}. \tag{5.23}
\]
For any $x\le y_1,y_2<1$ with $y_1^2\le x$, $y_2^2\le x$,
\[
1 + \frac{y_1y_2}{(1-y_1+x)(1-y_2+x)} \le \frac{(1-x^3)^2(1-x^4)}{(1-y_1y_2)(1-y_1y_2^2)(1-y_1^2y_2)}. \tag{5.24}
\]
This can be checked as follows: multiplying by the denominators and changing variables to $x$, $s=y_1+y_2$ and $r=y_1y_2$, we obtain an inequality where the left side, quadratic on $s$ with positive leading coefficient, must be less than or equal to the right side, which is linear on $s$. The left side minus the right side can be maximal for given $x$, $r$ only when $s$ is maximal or minimal. This happens when $y_1=y_2$ or when either $y_i=\sqrt{x}$ or $y_i=x$ for at least one of $i=1,2$. In each of these cases, we have reduced (5.24) to an inequality in two variables that can be proven automatically¹ by a quantifier-elimination program; the author has used QEPCAD [HB11] to do this.

Hence $F_v(s_1,s_2,x)$ is at most
\[
\begin{aligned}
&\prod_{p\nmid v} \frac{(1-p^{-3})^2(1-p^{-4})}{(1-p^{-s_1-s_2})(1-p^{-2s_1-s_2})(1-p^{-s_1-2s_2})} \cdot \prod_{p|v} \frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
&\quad\cdot \prod_{p\nmid v} \frac{1-p^{-3}}{(1+p^{-s_1-1})(1+p^{-2s_1-1})} \prod_{p\nmid v} \frac{1-p^{-3}}{(1+p^{-s_2-1})(1+p^{-2s_2-1})}\\
&= C_{v,s_1,s_2}\cdot \frac{\zeta(s_1+1)\zeta(s_2+1)\zeta(2s_1+1)\zeta(2s_2+1)}{\zeta(3)^4\zeta(4)\bigl(\zeta(s_1+s_2)\zeta(2s_1+s_2)\zeta(s_1+2s_2)\bigr)^{-1}},
\end{aligned}
\tag{5.25}
\]
where $C_{v,s_1,s_2}$ equals 1 if $v=1$ and
\[
\frac{(1-2^{-s_1-2s_2})(1+2^{-s_1-1})(1+2^{-2s_1-1})(1+2^{-s_2-1})(1+2^{-2s_2-1})}{(1-2^{-s_1-s_2})^{-1}(1-2^{-2s_1-s_2})^{-1}(1-2^{-s_1})(1-2^{-s_2})(1-2^{-3})^4(1-2^{-4})}
\]
if $v=2$.

For $1\le t\le x$, (2.21) and (2.24) imply
\[
f(t) \le \begin{cases} \sqrt{\dfrac2t} & \text{if } x\le 10^{10},\\[2ex] \sqrt{\dfrac2t} + \dfrac{0.03}{\log x}\Bigl(\dfrac{x}{t}\Bigr)^{\frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}}} & \text{if } x>10^{10}, \end{cases}
\tag{5.26}
\]

¹ In practice, the case $y_i=\sqrt{x}$ leads to a polynomial of high degree, and quantifier elimination increases sharply in complexity as the degree increases; a stronger inequality of lower degree (with $(1-3x^3)$ instead of $(1-x^3)^2(1-x^4)$) was given to QEPCAD to prove in this case.


where we are using the fact that $\log x$ is convex-down. Note that, again by convexity,
\[
\frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}} < (\log t)'\big|_{t=\log 10^{10}} = \frac{1}{\log 10^{10}} = 0.0434294\ldots
\]
Obviously, $\sqrt{2/t}$ in (5.26) can be replaced by $(2/t)^{1/2-\epsilon}$ for any $\epsilon\ge 0$.

By (5.21) and (5.26),
\[
|g_v(x)| \le \Bigl(\frac2x\Bigr)^{1-2\epsilon} F_v(1/2+\epsilon, 1/2+\epsilon, x)
\]
for $x\le 10^{10}$. We set $\epsilon = 1/\log x$ and obtain from (5.25) that
\[
\begin{aligned}
F_v(1/2+\epsilon,1/2+\epsilon,x) &\le C_{v,\frac12+\epsilon,\frac12+\epsilon}\, \frac{\zeta(1+2\epsilon)\zeta(3/2)^4\zeta(2)^2}{\zeta(3)^4\zeta(4)}\\
&\le 55.768\cdot C_{v,\frac12+\epsilon,\frac12+\epsilon}\cdot\Bigl(1+\frac{\log x}{2}\Bigr),
\end{aligned}
\tag{5.27}
\]
where we use the easy bound $\zeta(s) < 1 + 1/(s-1)$ obtained by $\sum n^{-s} < 1 + \int_1^\infty t^{-s}\,dt$. (For sharper bounds, see [BR02].) Now
\[
C_{2,\frac12+\epsilon,\frac12+\epsilon} \le \frac{(1-2^{-3/2-\epsilon})^2(1+2^{-3/2})^2(1+2^{-2})^2(1-2^{-1-2\epsilon})}{(1-2^{-1/2})^2(1-2^{-3})^4(1-2^{-4})} \le 14.652983,
\]
whereas $C_{1,\frac12+\epsilon,\frac12+\epsilon} = 1$. (We are assuming $x\ge 10^6$, and so $\epsilon\le 1/(\log 10^6)$.) Hence
\[
|g_v(x)| \le \begin{cases} \frac1x(111.536 + 55.768\log x) & \text{if } v=1,\\ \frac1x(1634.34 + 817.168\log x) & \text{if } v=2 \end{cases}
\]
for $10^6\le x<10^{10}$.

For general $x$, we must use the second bound in (5.26). Define $c = 1/(\log 10^{10})$.
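The two numerical constants appearing in (5.27) and in the bound on $C_{2,\frac12+\epsilon,\frac12+\epsilon}$ can be re-derived in floating point. The sketch below is not rigorous (the text's own computations use interval arithmetic); the `zeta` helper, based on a partial sum plus a short Euler-Maclaurin tail, and all names are our own assumptions for illustration.

```python
import math

def zeta(s, n_terms=1000):
    """Riemann zeta for real s > 1: partial sum plus an Euler-Maclaurin tail."""
    N = n_terms
    head = sum(n ** -s for n in range(1, N))
    tail = (N ** (1 - s) / (s - 1) + 0.5 * N ** -s
            + s * N ** (-s - 1) / 12.0
            - s * (s + 1) * (s + 2) * N ** (-s - 3) / 720.0)
    return head + tail

def zeta_constant():
    """zeta(3/2)^4 zeta(2)^2 / (zeta(3)^4 zeta(4)), the factor in (5.27)."""
    return zeta(1.5) ** 4 * zeta(2.0) ** 2 / (zeta(3.0) ** 4 * zeta(4.0))

def c2_upper(eps):
    """The displayed upper bound for C_{2, 1/2+eps, 1/2+eps}."""
    num = ((1 - 2 ** (-1.5 - eps)) ** 2 * (1 + 2 ** -1.5) ** 2
           * (1 + 2 ** -2.0) ** 2 * (1 - 2 ** (-1 - 2 * eps)))
    den = (1 - 2 ** -0.5) ** 2 * (1 - 2 ** -3.0) ** 4 * (1 - 2 ** -4.0)
    return num / den

if __name__ == "__main__":
    print(zeta_constant())                 # compare with 55.768
    print(c2_upper(1 / math.log(1e6)))     # compare with 14.652983
```

The value of `c2_upper` increases with `eps`, so evaluating it at the largest admissible $\epsilon = 1/\log 10^6$ covers the whole range $x\ge 10^6$.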

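We see that, if $x>10^{10}$,
\[
\begin{aligned}
|g_v(x)| \le{}& \frac{0.03^2}{(\log x)^2} F_1(1-c,1-c)\cdot C_{v,1-c,1-c}\\
&+ 2\cdot\frac{\sqrt2}{\sqrt x}\,\frac{0.03}{\log x} F_1(1-c,1/2)\cdot C_{v,1-c,1/2}\\
&+ \frac1x(111.536 + 55.768\log x)\cdot C_{v,\frac12+\epsilon,\frac12+\epsilon}.
\end{aligned}
\]
For $v=1$, this gives
\[
\begin{aligned}
|g_1(x)| &\le \frac{0.0044325}{(\log x)^2} + \frac{2.1626}{\sqrt x\,\log x} + \frac1x(111.536 + 55.768\log x)\\
&\le \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt x};
\end{aligned}
\]
for $v=2$, we obtain
\[
\begin{aligned}
|g_2(x)| &\le \frac{0.038128}{(\log x)^2} + \frac{25.607}{\sqrt x\,\log x} + \frac1x(1634.34 + 817.168\log x)\\
&\le \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt x}.
\end{aligned}
\]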

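5.1.3 Estimating the triple sum

We will now be able to bound the triple sum in (5.19), viz.,
\[
\sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 g_v(uS/s)\,du, \tag{5.28}
\]
where $g_v$ is as in Lemma 5.1.2.

As we will soon see, Lemma 5.1.2 implies that (5.28) is bounded by a constant (essentially because the integral $\int_0^{1/2} 1/(t(\log t)^2)\,dt$ converges). We must give as good a constant as we can, since it will affect the largest term in the final result.

Clearly $g_v(R) = g_v(\lfloor R\rfloor)$. The contribution of each $g_v(m)$, $1\le m\le S$, to (5.28) is exactly $g_v(m)$ times
\[
\begin{aligned}
&\sum_{\substack{\frac{S}{m+1}<s\le\frac{S}{m}\\ (s,v)=1}} \frac1s \int_{ms/S}^1 1\,du + \sum_{\substack{\frac{S}{2m}<s\le\frac{S}{m+1}\\ (s,v)=1}} \frac1s \int_{ms/S}^{(m+1)s/S} 1\,du + \sum_{\substack{\frac{S}{2(m+1)}<s\le\frac{S}{2m}\\ (s,v)=1}} \frac1s \int_{1/2}^{(m+1)s/S} du\\
&\quad= \sum_{\substack{\frac{S}{m+1}<s\le\frac{S}{m}\\ (s,v)=1}} \Bigl(\frac1s - \frac{m}{S}\Bigr) + \sum_{\substack{\frac{S}{2m}<s\le\frac{S}{m+1}\\ (s,v)=1}} \frac1S + \sum_{\substack{\frac{S}{2(m+1)}<s\le\frac{S}{2m}\\ (s,v)=1}} \Bigl(\frac{m+1}{S} - \frac{1}{2s}\Bigr).
\end{aligned}
\tag{5.29}
\]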

Write $f(t) = 1/S$ for $S/2m < t\le S/(m+1)$, $f(t)=0$ for $t>S/m$ or $t<S/2(m+1)$, $f(t) = 1/t - m/S$ for $S/(m+1)<t\le S/m$ and $f(t) = (m+1)/S - 1/2t$ for $S/2(m+1)<t\le S/2m$; then (5.29) equals $\sum_{n:(n,v)=1} f(n)$. By Euler-Maclaurin (second order),
\[
\begin{aligned}
\sum_n f(n) &= \int_{-\infty}^\infty \Bigl(f(x) - \frac12 B_2(\{x\}) f''(x)\Bigr)dx = \int_{-\infty}^\infty \Bigl(f(x) + O^*\Bigl(\frac{1}{12}|f''(x)|\Bigr)\Bigr)dx\\
&= \int_{-\infty}^\infty f(x)\,dx + \frac16\cdot O^*\Bigl(\Bigl|f'\Bigl(\frac{S}{2m}\Bigr)\Bigr| + \Bigl|f'\Bigl(\frac{S}{m+1}\Bigr)\Bigr|\Bigr)\\
&= \frac12\log\Bigl(1+\frac1m\Bigr) + \frac16\cdot O^*\Bigl(\Bigl(\frac{2m}{S}\Bigr)^2 + \Bigl(\frac{m+1}{S}\Bigr)^2\Bigr).
\end{aligned}
\tag{5.30}
\]
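The main term $\frac12\log(1+1/m)$ in (5.30) is simply $\int f$, which can be confirmed by integrating the three pieces of $f$ numerically. The following sketch does this with a composite Simpson rule and also reports the difference between $\sum_n f(n)$ and the integral; the function names, the quadrature and the sample values of $m$ and $S$ are illustrative assumptions, not part of the proof.

```python
import math

def f(t, m, S):
    """The piecewise function f from the text (for given m >= 1 and S > 0)."""
    if S / (m + 1) < t <= S / m:
        return 1.0 / t - m / S
    if S / (2 * m) < t <= S / (m + 1):
        return 1.0 / S
    if S / (2 * (m + 1)) < t <= S / (2 * m):
        return (m + 1) / S - 1.0 / (2 * t)
    return 0.0

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def integral_of_f(m, S):
    """Integrate f piece by piece, so that each piece is smooth."""
    cuts = [S / (2 * (m + 1)), S / (2 * m), S / (m + 1), S / m]
    g = lambda t: f(t, m, S)
    return sum(simpson(g, cuts[i], cuts[i + 1]) for i in range(3))

def sum_minus_integral(m, S):
    """sum over integers n of f(n), minus the exact integral (1/2) log(1 + 1/m)."""
    lo = int(math.floor(S / (2 * (m + 1))))
    hi = int(math.floor(S / m))
    total = sum(f(n, m, S) for n in range(max(lo, 1), hi + 1))
    return total - 0.5 * math.log(1 + 1.0 / m)

if __name__ == "__main__":
    for m in (1, 2, 5):
        print(m, integral_of_f(m, 1000.0), 0.5 * math.log(1 + 1 / m),
              sum_minus_integral(m, 1000.0))
```

Since $f$ is continuous, evaluating it at the break points from either side gives the same value, so the piecewise quadrature is unaffected by which branch the endpoints fall into.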


Similarly,
\[
\begin{aligned}
\sum_{n \text{ odd}} f(n) &= \int_{-\infty}^\infty \Bigl(f(2x+1) - \frac12 B_2(\{x\}) \frac{d^2 f(2x+1)}{dx^2}\Bigr)dx\\
&= \frac12\int_{-\infty}^\infty f(x)\,dx - 2\int_{-\infty}^\infty \frac12 B_2\Bigl(\Bigl\{\frac{x-1}{2}\Bigr\}\Bigr) f''(x)\,dx\\
&= \frac12\int_{-\infty}^\infty f(x)\,dx + \frac16 \int_{-\infty}^\infty O^*\bigl(|f''(x)|\bigr)\,dx\\
&= \frac14\log\Bigl(1+\frac1m\Bigr) + \frac13\cdot O^*\Bigl(\Bigl(\frac{2m}{S}\Bigr)^2 + \Bigl(\frac{m+1}{S}\Bigr)^2\Bigr).
\end{aligned}
\]
We use these expressions for $m\le C_0$, where $C_0\ge 33$ is a constant to be computed later; they will give us the main term. For $m>C_0$, we use the bounds on $|g(m)|$ that Lemma 5.1.2 gives us.

(Starting now and for the rest of the paper, we will focus on the cases $v=1$, $v=2$ when giving explicit computational estimates. All of our procedures would allow higher values of $v$ as well, but, as will become clear much later, the gains from higher values of $v$ are offset by losses and complications elsewhere.)

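Let us estimate (5.28). Let
\[
\begin{aligned}
c_{v,0} &= \begin{cases} 1/6 & \text{if } v=1,\\ 1/3 & \text{if } v=2, \end{cases} &
c_{v,1} &= \begin{cases} 1 & \text{if } v=1,\\ 2.5 & \text{if } v=2, \end{cases}\\
c_{v,2} &= \begin{cases} 55.768 & \text{if } v=1,\\ 817.168 & \text{if } v=2, \end{cases} &
c_{v,3} &= \begin{cases} 111.536 & \text{if } v=1,\\ 1634.34 & \text{if } v=2, \end{cases}\\
c_{v,4} &= \begin{cases} 0.0044325 & \text{if } v=1,\\ 0.038128 & \text{if } v=2, \end{cases} &
c_{v,5} &= \begin{cases} 0.1079 & \text{if } v=1,\\ 0.2046 & \text{if } v=2. \end{cases}
\end{aligned}
\]
Then (5.28) equals
\[
\begin{aligned}
&\sum_{m\le C_0} g_v(m)\cdot\Bigl(\frac{\phi(v)}{2v}\log\Bigl(1+\frac1m\Bigr) + O^*\Bigl(c_{v,0}\frac{5m^2+2m+1}{S^2}\Bigr)\Bigr)\\
&\quad+ \sum_{S/10^6\le s<S/C_0} \frac1s \int_{1/2}^1 O^*\Bigl(\frac{c_{v,1}}{uS/s}\Bigr)du\\
&\quad+ \sum_{S/10^{10}\le s<S/10^6} \frac1s \int_{1/2}^1 O^*\Bigl(\frac{c_{v,2}\log(uS/s) + c_{v,3}}{uS/s}\Bigr)du\\
&\quad+ \sum_{s<S/10^{10}} \frac1s \int_{1/2}^1 O^*\Bigl(\frac{c_{v,4}}{(\log uS/s)^2} + \frac{c_{v,5}}{\sqrt{uS/s}}\Bigr)du,
\end{aligned}
\]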


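which is
\[
\begin{aligned}
&\sum_{m\le C_0} g_v(m)\cdot\frac{\phi(v)}{2v}\log\Bigl(1+\frac1m\Bigr) + \sum_{m\le C_0} |g(m)|\cdot O^*\Bigl(c_{v,0}\frac{5m^2+2m+1}{S^2}\Bigr)\\
&\quad+ O^*\Bigl(c_{v,1}\frac{\log 2}{C_0} + \frac{\log 2}{10^6}\bigl(c_{v,3} + c_{v,2}(1+\log 10^6)\bigr) + \frac{2-\sqrt2}{10^{10/2}}\,c_{v,5}\Bigr)\\
&\quad+ O^*\Biggl(\sum_{s<S/10^{10}} \frac{c_{v,4}/2}{s(\log S/2s)^2}\Biggr)
\end{aligned}
\]
for $S\ge (C_0+1)$. Note that $\sum_{s<S/10^{10}} \frac{1}{s(\log S/2s)^2} \le \int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt$. Now
\[
\frac{c_{v,4}}{2}\int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt = \frac{c_{v,4}/2}{\log(10^{10}/2)} = \begin{cases} 0.00009923 & \text{if } v=1,\\ 0.000853636 & \text{if } v=2 \end{cases}
\]
and
\[
\frac{\log 2}{10^6}\bigl(c_{v,3} + c_{v,2}(1+\log 10^6)\bigr) + \frac{2-\sqrt2}{10^5}\,c_{v,5} = \begin{cases} 0.0006506 & \text{if } v=1,\\ 0.009525 & \text{if } v=2. \end{cases}
\]
For $C_0 = 10000$,
\[
\frac{\phi(v)}{v}\,\frac12 \sum_{m\le C_0} g_v(m)\cdot\log\Bigl(1+\frac1m\Bigr) = \begin{cases} 0.362482\ldots & \text{if } v=1,\\ 0.360576\ldots & \text{if } v=2, \end{cases}
\]
\[
c_{v,0}\sum_{m\le C_0} |g_v(m)|(5m^2+2m+1) \le \begin{cases} 6204066.5\ldots & \text{if } v=1,\\ 15911340.1\ldots & \text{if } v=2, \end{cases}
\]
and
\[
c_{v,1}\cdot(\log 2)/C_0 = \begin{cases} 0.00006931\ldots & \text{if } v=1,\\ 0.00017328\ldots & \text{if } v=2. \end{cases}
\]
Thus, for $S\ge 100000$,
\[
\sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 g_v(uS/s)\,du \le \begin{cases} 0.36393 & \text{if } v=1,\\ 0.37273 & \text{if } v=2. \end{cases}
\tag{5.31}
\]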
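The arithmetic behind (5.31) amounts to adding up the five contributions just listed, with the term in $1/S^2$ taken at its worst case $S = 100000$. The sketch below redoes that addition in floating point, taking the two sums over $m\le C_0$ as given from the text (they are not recomputed here); the dictionary layout and function name are our own.

```python
import math

# Constants c_{v,i} and the two tabulated sums over m <= C_0 = 10000, as quoted above.
CONSTS = {
    1: dict(c1=1.0, c2=55.768, c3=111.536, c4=0.0044325, c5=0.1079,
            main=0.362482, err_sum=6204066.5),
    2: dict(c1=2.5, c2=817.168, c3=1634.34, c4=0.038128, c5=0.2046,
            main=0.360576, err_sum=15911340.1),
}

def total_bound(v, S=1e5, C0=1e4):
    """Sum of the five contributions to (5.28), for S at its smallest value."""
    c = CONSTS[v]
    main_term = c["main"]
    euler_maclaurin_error = c["err_sum"] / S ** 2
    small_range = c["c1"] * math.log(2) / C0
    middle_range = (math.log(2) / 1e6 * (c["c3"] + c["c2"] * (1 + math.log(1e6)))
                    + (2 - math.sqrt(2)) / 1e5 * c["c5"])
    log_tail = (c["c4"] / 2) / math.log(1e10 / 2)
    return (main_term + euler_maclaurin_error + small_range
            + middle_range + log_tail)

if __name__ == "__main__":
    print(total_bound(1), total_bound(2))   # compare with 0.36393 and 0.37273
```

Because the tabulated inputs are truncated decimals, the totals agree with (5.31) only up to the last digit or so before rounding up.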

For $S<100000$, we proceed as above, but using the exact expression (5.29) instead of (5.30). Note (5.29) is of the form $f_{s,m,1}(S) + f_{s,m,2}(S)/S$, where both $f_{s,m,1}(S)$ and $f_{s,m,2}(S)$ depend only on $\lfloor S\rfloor$ (and on $s$ and $m$). Summing over $m\le S$, we obtain a bound of the form
\[
\sum_{\substack{s\le S\\ (s,v)=1}} \frac1s \int_{1/2}^1 g_v(uS/s)\,du \le G_v(S)
\]
with
\[
G_v(S) = K_{v,1}(\lfloor S\rfloor) + K_{v,2}(\lfloor S\rfloor)/S,
\]
where $K_{v,1}(n)$ and $K_{v,2}(n)$ can be computed explicitly for each integer $n$. (For example, $G_v(S) = 1-1/S$ for $1\le S<2$ and $G_v(S)=0$ for $S<1$.)

It is easy to check numerically that this implies that (5.31) holds not just for $S\ge 100000$ but also for $40\le S<100000$ (if $v=1$) or $16\le S<100000$ (if $v=2$). Using the fact that $G_v(S)$ is non-negative, we can compare $\int_1^T G_v(S)\,dS/S$ with $\log(T+1/N)$ for each $T\in[2,40]\cap\frac1N\mathbb{Z}$ ($N$ a large integer) to show, again numerically, that
\[
\int_1^T G_v(S)\frac{dS}{S} \le \begin{cases} 0.3698\log T & \text{if } v=1,\\ 0.37273\log T & \text{if } v=2. \end{cases}
\tag{5.32}
\]
(We use $N=100000$ for $v=1$; already $N=10$ gives us the answer above for $v=2$. Indeed, computations suggest the better bound 0.358 instead of 0.37273; we are committed to using 0.37273 because of (5.31).)

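Multiplying by $6v/\pi^2\sigma(v)$, we conclude that
\[
S_1(U,W) = \frac{x}{W}\cdot H_1\Bigl(\frac{x}{WU}\Bigr) + O^*\Bigl(5.08\,\zeta(3/2)^3 \frac{x^{3/2}}{W^{3/2}U}\Bigr) \tag{5.33}
\]
if $v=1$,
\[
S_1(U,W) = \frac{x}{W}\cdot H_2\Bigl(\frac{x}{WU}\Bigr) + O^*\Bigl(1.27\,\zeta(3/2)^3 \frac{x^{3/2}}{W^{3/2}U}\Bigr) \tag{5.34}
\]
if $v=2$, where
\[
H_1(S) = \begin{cases} \frac{6}{\pi^2}G_1(S) & \text{if } 1\le S<40,\\ 0.22125 & \text{if } S\ge 40, \end{cases} \qquad
H_2(S) = \begin{cases} \frac{4}{\pi^2}G_2(S) & \text{if } 1\le S<16,\\ 0.15107 & \text{if } S\ge 16. \end{cases}
\tag{5.35}
\]
Hence (by (5.32))
\[
\int_1^T H_v(S)\frac{dS}{S} \le \begin{cases} 0.22482\log T & \text{if } v=1,\\ 0.15107\log T & \text{if } v=2; \end{cases}
\tag{5.36}
\]
moreover,
\[
H_1(S)\le \frac{3}{\pi^2}, \qquad H_2(S)\le \frac{2}{\pi^2} \tag{5.37}
\]
for all $S$.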

Note. There is another way to obtain cancellation on $\mu$, applicable when $(x/W) > Uq$ (as is unfortunately never the case in our main application). For this alternative to be taken, one must either apply Cauchy-Schwarz on $n$ rather than $m$ (resulting in exponential sums over $m$) or lump together all $m$ near each other and in the same congruence class modulo $q$ before applying Cauchy-Schwarz on $m$ (one can indeed do this if $\delta$ is small). We could then write
\[
\sum_{\substack{m\sim W\\ m\equiv r \bmod q}} \sum_{\substack{d|m\\ d>U}} \mu(d) = -\sum_{\substack{m\sim W\\ m\equiv r \bmod q}} \sum_{\substack{d|m\\ d\le U}} \mu(d) = -\sum_{d\le U} \mu(d)\,(W/qd + O(1))
\]
and obtain cancellation on $d$. If $Uq\ge (x/W)$, however, the error term dominates.

5.2 The sum $S_2$: the large sieve, primes and tails

We must now bound
\[
S_2(U',W',W) = \sum_{\substack{U'<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl|\sum_{W'<p\le W} (\log p)\, e(\alpha mp)\Biggr|^2 \tag{5.38}
\]
for $U' = \max(U, x/2W)$, $W' = \max(V, W/2)$. (The condition $(p,v)=1$ will be fulfilled automatically by the assumption $V>v$.)

From a modern perspective, this is clearly a case for a large sieve. It is also clear that we ought to try to apply a large sieve for sequences of prime support. What is subtler here is how to do things well for very large $q$ (i.e., $x/q$ small). This is in some sense a dual problem to that of $q$ small, but it poses additional complications; for example, it is not obvious how to take advantage of prime support for very large $q$.

As in type I, we avoid this entire issue by forbidding $q$ large and then taking advantage of the error term $\delta/x$ in the approximation $\alpha = \frac{a}{q} + \frac{\delta}{x}$. This is one of the main innovations here. Note this alternative method will allow us to take advantage of prime support.

A key situation to study is that of frequencies $\alpha_i$ clustering around given rationals $a/q$ while nevertheless keeping at a certain small distance from each other.

Lemma 5.2.1. Let $q\ge 1$. Let $\alpha_1,\alpha_2,\dots,\alpha_k\in\mathbb{R}/\mathbb{Z}$ be of the form $\alpha_i = a_i/q + \upsilon_i$, $0\le a_i<q$, where the elements $\upsilon_i\in\mathbb{R}$ all lie in an interval of length $\upsilon>0$, and where $a_i=a_j$ implies $|\upsilon_i-\upsilon_j|>\nu>0$. Assume $\nu+\upsilon\le 1/q$. Then, for any $W,W'\ge 1$, $W'\ge W/2$,
\[
\sum_{i=1}^k \Biggl|\sum_{W'<p\le W} (\log p)\, e(\alpha_ip)\Biggr|^2 \le \min\Bigl(1, \frac{2q}{\phi(q)}\frac{1}{\log\bigl((q(\nu+\upsilon))^{-1}\bigr)}\Bigr) \cdot\bigl(W-W'+\nu^{-1}\bigr) \sum_{W'<p\le W} (\log p)^2. \tag{5.39}
\]

      Proof. For any distinct $i$, $j$, the angles $\alpha_i$, $\alpha_j$ are separated by at least $\nu$ (if $a_i=a_j$) or at least $1/q-|\upsilon_i-\upsilon_j|\ge 1/q-\upsilon\ge\nu$ (if $a_i\ne a_j$). Hence we can apply the large sieve (in the optimal $N+\delta^{-1}-1$ form due to Selberg [Sel91] and Montgomery-Vaughan [MV74]) and obtain the bound in (5.39) with $1$ instead of $\min(1,\ldots)$ immediately.

      We can also apply Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]). This gives us that the left side of (5.39) is at most
      \[\left(\sum_{\substack{r\le R\\ (r,q)=1}}\frac{(\mu(r))^2}{\phi(r)}\right)^{-1}\sum_{\substack{r\le R\\ (r,q)=1}}\ \sum_{\substack{a' \bmod r\\ (a',r)=1}}\ \sum_{i=1}^k\left|\sum_{W'<p\le W}(\log p)\,e((\alpha_i+a'/r)p)\right|^2. \tag{5.40}\]

      If we add all possible fractions of the form $a'/r$, $r\le R$, $(r,q)=1$, to the fractions $a_i/q$, we obtain fractions that are separated by at least $1/qR^2$. If $\nu+\upsilon\ge 1/qR^2$, then the resulting angles $\alpha_i+a'/r$ are still separated by at least $\nu$. Thus we can apply the large sieve to (5.40); setting $R = 1/\sqrt{(\nu+\upsilon)q}$, we see that we gain a factor of
      \[\sum_{\substack{r\le R\\ (r,q)=1}}\frac{(\mu(r))^2}{\phi(r)}\ \ge\ \frac{\phi(q)}{q}\sum_{r\le R}\frac{(\mu(r))^2}{\phi(r)}\ \ge\ \frac{\phi(q)}{q}\sum_{d\le R}\frac1d\ \ge\ \frac{\phi(q)}{2q}\log\left((q(\nu+\upsilon))^{-1}\right), \tag{5.41}\]
      since $\sum_{d\le R} 1/d\ge\log(R)$ for all $R\ge 1$ (integer or not).

      Let us first give a bound on sums of the type of $S_2(U,V,W)$ using prime support but not the error terms (or Lemma 5.2.1). This is something that can be done very well using tools available in the literature. (Not all of these tools seem to be known as widely as they should be.) Bounds (5.42) and (5.44) are completely standard large-sieve bounds. To obtain the gain of a factor of $\log$ in (5.43), we use a lemma of Montgomery's, for whose modern proof (containing an improvement by Huxley) we refer to the standard source [IK04, Lemma 7.15]. The purpose of Montgomery's lemma is precisely to gain a factor of $\log$ in applications of the large sieve to sequences supported on the primes. To use the lemma efficiently, we apply Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], rather than more common forms of the large sieve. (The idea, used in [MV73] to prove an improved version of the Brun-Titchmarsh inequality, is that Farey fractions (rationals with bounded denominator) are not equidistributed; this fact can be exploited if a large sieve with weights is used.)

      Lemma 5.2.2. Let $W\ge 1$, $W'\ge W/2$. Let $\alpha = a/q+O^*(1/qQ)$, $q\le Q$. Then
      \[\sum_{A_0<m\le A_1}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2 \le \left\lceil\frac{A_1-A_0}{\min(q,\lceil Q/2\rceil)}\right\rceil\cdot(W-W'+2q)\sum_{W'<p\le W}(\log p)^2. \tag{5.42}\]
      If $q<W/2$ and $Q\ge 3.5W$, the following bound also holds:
      \[\sum_{A_0<m\le A_1}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2 \le \left\lceil\frac{A_1-A_0}{q}\right\rceil\cdot\frac{q}{\phi(q)}\frac{W}{\log(W/2q)}\cdot\sum_{W'<p\le W}(\log p)^2. \tag{5.43}\]
      If $A_1-A_0\le\varrho q$ and $q\le\rho Q$, $\varrho,\rho\in[0,1]$, the following bound also holds:
      \[\sum_{A_0<m\le A_1}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2 \le \left(W-W'+\frac{q}{1-\varrho\rho}\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.44}\]

      Proof. Let $k=\min(q,\lceil Q/2\rceil)\ge\lceil q/2\rceil$. We split $(A_0,A_1]$ into $\lceil (A_1-A_0)/k\rceil$ blocks of at most $k$ consecutive integers $m_0+1, m_0+2, \ldots$. For $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of at least
      \[|(a/q)(m-m')|-O^*(k/qQ) = 1/q-O^*(1/2q)\ge 1/2q.\]
      By the large sieve,
      \[\sum_{a=1}^{q}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha(m_0+a)p)\right|^2\le((W-W')+2q)\sum_{W'<p\le W}(\log p)^2. \tag{5.45}\]
      We obtain (5.42) by summing over all $\lceil (A_1-A_0)/k\rceil$ blocks.

      If $A_1-A_0\le|\varrho q|$ and $q\le\rho Q$, $\varrho,\rho\in[0,1]$, we obtain (5.44) simply by applying the large sieve without splitting the interval $A_0<m\le A_1$.

      Let us now prove (5.43). We will use Montgomery's inequality, followed by Montgomery and Vaughan's large sieve with weights. An angle $a/q+a_1'/r_1$ is separated from other angles $a'/q+a_2'/r_2$ ($r_1,r_2\le R$, $(a_i,r_i)=1$) by at least $1/qr_1R$, rather than just $1/qR^2$. We will choose $R$ so that $qR^2<Q$; this implies $1/Q<1/qR^2\le 1/qr_1R$.

      By a lemma of Montgomery's [IK04, Lemma 7.15], applied (for each $1\le a\le q$) to $S(\alpha)=\sum_n a_n e(\alpha n)$ with $a_n=\log(n)\,e(\alpha(m_0+a)n)$ if $n$ is prime and $a_n=0$ otherwise,
      \[\frac{1}{\phi(r)}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha(m_0+a)p)\right|^2\le\sum_{\substack{a' \bmod r\\ (a',r)=1}}\left|\sum_{W'<p\le W}(\log p)\,e\left(\left(\alpha(m_0+a)+\frac{a'}{r}\right)p\right)\right|^2 \tag{5.46}\]

      for each square-free $r\le W'$. We multiply both sides of (5.46) by
      \[\left(\frac W2+\frac32\left(\frac{1}{qrR}-\frac1Q\right)^{-1}\right)^{-1}\]
      and sum over all $a=0,1,\ldots,q-1$ and all square-free $r\le R$ coprime to $q$; we will later make sure that $R\le W'$. We obtain that
      \[\sum_{\substack{r\le R\\ (r,q)=1}}\left(\frac W2+\frac32\left(\frac{1}{qrR}-\frac1Q\right)^{-1}\right)^{-1}\frac{\mu(r)^2}{\phi(r)}\cdot\sum_{a=1}^{q}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha(m_0+a)p)\right|^2 \tag{5.47}\]
      is at most
      \[\sum_{\substack{r\le R\\ (r,q)=1\\ r\ \text{sq-free}}}\left(\frac W2+\frac32\left(\frac{1}{qrR}-\frac1Q\right)^{-1}\right)^{-1}\sum_{a=1}^{q}\ \sum_{\substack{a' \bmod r\\ (a',r)=1}}\left|\sum_{W'<p\le W}(\log p)\,e\left(\left(\alpha(m_0+a)+\frac{a'}{r}\right)p\right)\right|^2. \tag{5.48}\]
      We now apply the large sieve with weights [MV73, (1.6)], recalling that each angle $\alpha(m_0+a)+a'/r$ is separated from the others by at least $1/qrR-1/Q$; we obtain that (5.48) is at most $\sum_{W'<p\le W}(\log p)^2$. It remains to estimate the sum in the first line of (5.47). (We are following here a procedure analogous to that used in [MV73] to prove the Brun-Titchmarsh theorem.)

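      Assume first that $q\le W/13.5$. Set
      \[R=\left(\sigma\frac Wq\right)^{1/2}, \tag{5.49}\]
      where $\sigma = \frac{1}{2e^{2\cdot 0.25068}} = 0.30285\ldots$. It is clear that $qR^2<Q$, $q<W'$ and $R\ge 2$. Moreover, for $r\le R$,
      \[\frac1Q\le\frac{1}{3.5W}\le\frac{\sigma}{3.5}\frac{1}{\sigma W}=\frac{\sigma}{3.5}\frac{1}{qR^2}\le\frac{\sigma/3.5}{qrR}.\]
      Hence
      \[\frac W2+\frac32\left(\frac{1}{qrR}-\frac1Q\right)^{-1}\le\frac W2+\frac32\frac{qrR}{1-\sigma/3.5}=\frac W2+\frac{3r}{2\left(1-\frac{\sigma}{3.5}\right)R}\cdot 2\sigma\frac W2=\frac W2\left(1+\frac{3\sigma}{1-\sigma/3.5}\frac rR\right)<\frac W2\left(1+\frac rR\right)\]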

      and so
      \[\sum_{\substack{r\le R\\ (r,q)=1}}\left(\frac W2+\frac32\left(\frac{1}{qrR}-\frac1Q\right)^{-1}\right)^{-1}\frac{\mu(r)^2}{\phi(r)}\ \ge\ \frac2W\sum_{\substack{r\le R\\ (r,q)=1}}(1+rR^{-1})^{-1}\frac{\mu(r)^2}{\phi(r)}\ \ge\ \frac2W\frac{\phi(q)}{q}\sum_{r\le R}(1+rR^{-1})^{-1}\frac{\mu(r)^2}{\phi(r)}.\]
      For $R\ge 2$,
      \[\sum_{r\le R}(1+rR^{-1})^{-1}\frac{\mu(r)^2}{\phi(r)}>\log R+0.25068;\]
      this is true for $R\ge 100$ by [MV73, Lemma 8] and easily verifiable numerically for $2\le R<100$. (It suffices to verify this for $R$ integer with $r<R$ instead of $r\le R$, as that is the worst case.)
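      The numerical verification for $2\le R<100$ can be sketched as follows. This is only an illustrative check in floating-point Python, not the computation used by the author; the helper names `phi`, `is_squarefree` and `weighted_sum` are ours. Following the remark on the worst case, we take $R$ integer, sum over $r<R$, and let $R$ run over $3,\ldots,100$, so that every real $R\in[2,100)$ is covered by the integer just above it.

      ```python
      import math

      def phi(n):
          """Euler's totient function, by trial division."""
          result, m, p = n, n, 2
          while p * p <= m:
              if m % p == 0:
                  while m % p == 0:
                      m //= p
                  result -= result // p
              p += 1
          if m > 1:
              result -= result // m
          return result

      def is_squarefree(n):
          """True if no square of a prime divides n, i.e. mu(n)^2 = 1."""
          p = 2
          while p * p <= n:
              if n % (p * p) == 0:
                  return False
              p += 1
          return True

      def weighted_sum(R):
          """Sum over square-free r < R of (1 + r/R)^(-1) / phi(r)."""
          return sum(1.0 / ((1.0 + r / R) * phi(r))
                     for r in range(1, R) if is_squarefree(r))

      # Margin of the inequality: sum - (log R + 0.25068), for R = 3, ..., 100.
      margins = {R: weighted_sum(R) - (math.log(R) + 0.25068)
                 for R in range(3, 101)}
      worst = min(margins, key=margins.get)
      print(worst, margins[worst])
      ```

      The smallest margin occurs at $R=5$, where the sum over $r\in\{1,2,3\}$ is $5/6+5/7+5/16$; this is what pins down the constant $0.25068$.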

      Now
      \[\log R=\frac12\left(\log\frac{W}{2q}+\log 2\sigma\right)=\frac12\log\frac{W}{2q}-0.25068.\]
      Hence
      \[\sum_{r\le R}(1+rR^{-1})^{-1}\frac{\mu(r)^2}{\phi(r)}>\frac12\log\frac{W}{2q}\]
      and the statement follows.

      Now consider the case $q>W/13.5$. If $q$ is even, then, in this range, inequality (5.42) is always better than (5.43), and so we are done. Assume, then, that $W/13.5<q\le W/2$ and $q$ is odd. We set $R=2$; clearly $qR^2<W\le Q$ and $q<W/2\le W'$, and so this choice of $R$ is valid. It remains to check that
      \[\frac{1}{\frac W2+\frac32\left(\frac{1}{2q}-\frac1Q\right)^{-1}}+\frac{1}{\frac W2+\frac32\left(\frac{1}{4q}-\frac1Q\right)^{-1}}\ \ge\ \frac1W\log\frac{W}{2q}.\]
      This follows because
      \[\frac{1}{\frac12+\frac32\left(\frac t2-\frac{1}{3.5}\right)^{-1}}+\frac{1}{\frac12+\frac32\left(\frac t4-\frac{1}{3.5}\right)^{-1}}\ \ge\ \log\frac t2\]
      for all $2\le t\le 13.5$.
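      The last inequality is elementary but not obvious at a glance. A minimal numerical sketch (ours, in Python, evaluating both sides on a fine grid of $t$; a grid check is of course not a proof) is the following. Here $t$ plays the role of $W/q$, and $1/3.5$ is the bound for $W/Q$.

      ```python
      import math

      def lhs(t):
          """Left side: the r = 1 and r = 2 terms, normalized by W."""
          def term(c):
              return 1.0 / (0.5 + 1.5 / (t / c - 1.0 / 3.5))
          return term(2.0) + term(4.0)

      def rhs(t):
          return math.log(t / 2.0)

      # Grid on [2, 13.5] with step 0.001.
      grid = [2.0 + 0.001 * i for i in range(11501)]
      min_margin = min(lhs(t) - rhs(t) for t in grid)
      print(min_margin)
      ```

      On this grid the margin stays well away from zero, so the choice $R=2$ loses nothing in the range $W/13.5<q\le W/2$.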

      We need a version of Lemma 5.2.2 with $m$ restricted to the odd numbers, since we plan to set the parameter $v$ equal to $2$.

      Lemma 5.2.3. Let $W\ge 1$, $W'\ge W/2$. Let $2\alpha=a/q+O^*(1/qQ)$, $q\le Q$. Then
      \[\sum_{\substack{A_0<m\le A_1\\ m\ \text{odd}}}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2\le\left\lceil\frac{A_1-A_0}{\min(2q,Q)}\right\rceil\cdot(W-W'+2q)\sum_{W'<p\le W}(\log p)^2. \tag{5.50}\]
      If $q<W/2$ and $Q\ge 3.5W$, the following bound also holds:
      \[\sum_{\substack{A_0<m\le A_1\\ m\ \text{odd}}}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2\le\left\lceil\frac{A_1-A_0}{2q}\right\rceil\cdot\frac{q}{\phi(q)}\frac{W}{\log(W/2q)}\cdot\sum_{W'<p\le W}(\log p)^2. \tag{5.51}\]
      If $A_1-A_0\le 2\varrho q$ and $q\le\rho Q$, $\varrho,\rho\in[0,1]$, the following bound also holds:
      \[\sum_{A_0<m\le A_1}\left|\sum_{W'<p\le W}(\log p)\,e(\alpha m p)\right|^2\le\left(W-W'+\frac{q}{1-\varrho\rho}\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.52}\]

      Proof. We follow the proof of Lemma 5.2.2, noting the differences. Let
      \[k=\min(q,\lceil Q/2\rceil)\ge\lceil q/2\rceil,\]
      just as before. We split $(A_0,A_1]$ into $\lceil (A_1-A_0)/k\rceil$ blocks of at most $2k$ consecutive integers; any such block contains at most $k$ odd numbers. For odd $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of
      \[|\alpha(m-m')|=\left|2\alpha\frac{m-m'}{2}\right|=|(a/q)k|-O^*(k/qQ)\ge 1/2q.\]
      We obtain (5.50) and (5.52) just as we obtained (5.42) and (5.44) before. To obtain (5.51), proceed again as before, noting that the angles we are working with can be labelled as $\alpha(m_0+2a)$, $0\le a<q$.

      The idea now (for large $\delta$) is that, if $\delta$ is not negligible, then, as $m$ increases and $\alpha m$ loops around the circle $\mathbb{R}/\mathbb{Z}$, $\alpha m$ roughly repeats itself every $q$ steps, but with a slight displacement. This displacement gives rise to a configuration to which Lemma 5.2.1 is applicable. The effect is that we can apply the large sieve once instead of many times, thus leading to a gain of a large factor (essentially, the number of times the large sieve would have been used). This is how we obtain the factor of $|\delta|$ in the denominator of the main term $x/|\delta|q$ in (5.56) and (5.57).

      Proposition 5.2.4. Let $x\ge W\ge 1$, $W'\ge W/2$, $U'\ge x/2W$. Let $Q\ge 3.5W$. Let $2\alpha=a/q+\delta/x$, $(a,q)=1$, $|\delta/x|\le 1/qQ$, $q\le Q$. Let $S_2(U',W',W)$ be as in (5.38) with $v=2$.

      For $q\le\rho Q$, where $\rho\in[0,1]$,
      \[S_2(U',W',W)\le\left(\max(1,2\rho)\left(\frac{x}{8q}+\frac{x}{2W}\right)+\frac W2+2q\right)\cdot\sum_{W'<p\le W}(\log p)^2. \tag{5.53}\]
      If $q<W/2$,
      \[S_2(U',W',W)\le\left(\frac{x}{4\phi(q)}\frac{1}{\log(W/2q)}+\frac{q}{\phi(q)}\frac{W}{\log(W/2q)}\right)\cdot\sum_{W'<p\le W}(\log p)^2. \tag{5.54}\]
      If $W>x/4q$, the following bound also holds:
      \[S_2(U',W',W)\le\left(\frac W2+\frac{q}{1-x/4Wq}\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.55}\]
      If $\delta\ne 0$ and $x/4W+q\le x/|\delta|q$,
      \[S_2(U',W',W)\le\min\left(1,\frac{2q/\phi(q)}{\log\left(\frac{x}{|\delta q|}\left(q+\frac{x}{4W}\right)^{-1}\right)}\right)\cdot\left(\frac{x}{|\delta q|}+\frac W2\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.56}\]
      Lastly, if $\delta\ne 0$ and $q\le\rho Q$, where $\rho\in[0,1)$,
      \[S_2(U',W',W)\le\left(\frac{x}{|\delta q|}+\frac W2+\frac{x}{8(1-\rho)Q}+\frac{x}{4(1-\rho)W}\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.57}\]

      The trivial bound would be in the order of
      \[S_2(U',W',W)=(x/2\log x)\sum_{W'<p\le W}(\log p)^2.\]
      In practice, (5.55) gets applied when $W\ge x/q$.

      Proof. Let us first prove statements (5.54) and (5.53), which do not involve $\delta$. Assume first $q\le W/2$. Then, by (5.51) with $A_0=U'$, $A_1=x/W$,
      \[S_2(U',W',W)\le\left(\frac{x/W-U'}{2q}+1\right)\frac{q}{\phi(q)}\frac{W}{\log(W/2q)}\sum_{W'<p\le W}(\log p)^2.\]
      Clearly $(x/W-U')W\le(x/2W)\cdot W=x/2$. Thus (5.54) holds.

      Assume now that $q\le\rho Q$. Apply (5.50) with $A_0=U'$, $A_1=x/W$. Then
      \[S_2(U',W',W)\le\left(\frac{x/W-U'}{q\cdot\min(2,\rho^{-1})}+1\right)(W-W'+2q)\sum_{W'<p\le W}(\log p)^2.\]
      Now
      \[\begin{aligned}\left(\frac{x/W-U'}{q\cdot\min(2,\rho^{-1})}+1\right)\cdot(W-W'+2q)&\le\left(\frac xW-U'\right)\frac{W-W'}{q\min(2,\rho^{-1})}+\max(1,2\rho)\left(\frac xW-U'\right)+W/2+2q\\&\le\frac{x/4}{q\min(2,\rho^{-1})}+\max(1,2\rho)\frac{x}{2W}+W/2+2q.\end{aligned}\]
      This implies (5.53).

      If $W>x/4q$, apply (5.44) with $\varrho=x/4Wq$, $\rho=1$. This yields (5.55).

      Assume now that $\delta\ne 0$ and $x/4W+q\le x/|\delta q|$. Let $Q'=x/|\delta q|$. For any $m_1$, $m_2$ with $x/2W<m_1,m_2\le x/W$, we have $|m_1-m_2|\le x/2W\le 2(Q'-q)$, and so
      \[\left|\frac{m_1-m_2}{2}\cdot\delta/x+q\delta/x\right|\le Q'|\delta|/x=\frac1q. \tag{5.58}\]

      The conditions of Lemma 5.2.1 are thus fulfilled with $\upsilon=(x/4W)\cdot|\delta|/x$ and $\nu=|\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
      \[\min\left(1,\frac{2q}{\phi(q)}\frac{1}{\log\left((q(\nu+\upsilon))^{-1}\right)}\right)\left(W-W'+\nu^{-1}\right)\sum_{W'<p\le W}(\log p)^2.\]
      Here $W-W'+\nu^{-1}=W-W'+x/|q\delta|\le W/2+x/|q\delta|$ and
      \[(q(\nu+\upsilon))^{-1}=\left(q\frac{|\delta|}{x}\right)^{-1}\left(q+\frac{x}{4W}\right)^{-1}.\]

      Lastly, assume $\delta\ne 0$ and $q\le\rho Q$. We let $Q'=x/|\delta q|\ge Q$ again, and we split the range $U'<m\le x/W$ into intervals of length $2(Q'-q)$, so that (5.58) still holds within each interval. We apply Lemma 5.2.1 with $\upsilon=(Q'-q)\cdot|\delta|/x$ and $\nu=|\delta q|/x$. We obtain that $S_2(U',W',W)$ is at most
      \[\left(1+\frac{x/W-U}{2(Q'-q)}\right)\left(W-W'+\nu^{-1}\right)\sum_{W'<p\le W}(\log p)^2.\]
      Here $W-W'+\nu^{-1}\le W/2+x/q|\delta|$ as before. Moreover,
      \[\begin{aligned}\left(\frac W2+\frac{x}{q|\delta|}\right)\left(1+\frac{x/W-U}{2(Q'-q)}\right)&\le\left(\frac W2+Q'\right)\left(1+\frac{x/2W}{2(1-\rho)Q'}\right)\\&\le\frac W2+Q'+\frac{x}{8(1-\rho)Q'}+\frac{x}{4W(1-\rho)}\\&\le\frac{x}{|\delta q|}+\frac W2+\frac{x}{8(1-\rho)Q}+\frac{x}{4(1-\rho)W}.\end{aligned}\]
      Hence (5.57) holds.

      Chapter 6

      Minor-arc totals

      It is now time to make all of our estimates fully explicit, choose our parameters, put our type I and type II estimates together and give our final minor-arc estimates.

      Let $x>0$ be given. Starting in Section 6.3.1, we will assume that $x\ge x_0=2.16\cdot 10^{20}$. We will choose our main parameters $U$ and $V$ gradually, as the need arises; we assume from the start that $2\cdot 10^6\le V<x/4$ and $UV\le x$.

      We are also given an angle $\alpha\in\mathbb{R}/\mathbb{Z}$. We choose an approximation $2\alpha=a/q+\delta/x$, $(a,q)=1$, $q\le Q$, $|\delta/x|\le 1/qQ$. The parameter $Q$ will be chosen later; we assume from the start that $Q\ge\max(16,2\sqrt x)$ and $Q\ge\max(2U,x/U)$.

      (Actually, $U$ and $V$ will be chosen in different ways depending on the size of $q$. Actually, even $Q$ will depend on the size of $q$; this may seem circular, but what actually happens is the following: we will first set a value for $Q$ depending only on $x$, and if the corresponding value of $q\le Q$ is larger than a certain parameter $y$ depending on $x$, then we reset $U$, $V$ and $Q$, and obtain a new value of $q$.)

      Let $S_{I,1}$, $S_{I,2}$, $S_{II}$, $S_0$ be as in (3.9), with the smoothing function $\eta=\eta_2$ as in (3.4). (We bounded the type I sums $S_{I,1}$, $S_{I,2}$ for a general smoothing function $\eta$; it is only here that we are specifying $\eta$.)

      The term $S_0$ is $0$ because $V<x/4$ and $\eta_2$ is supported on $[-1/4,1]$. We set $v=2$.

      6.1 The smoothing function

      For the smoothing function $\eta_2$ in (3.4),
      \[|\eta_2|_1=1,\qquad|\eta_2'|_1=8\log 2,\qquad|\eta_2''|_1=48, \tag{6.1}\]
      as per [Tao14, (5.9)–(5.13)]. Similarly, for $\eta_{2,\rho}(t)=\log(\rho t)\eta_2(t)$, where $\rho\ge 4$,
      \[\begin{aligned}|\eta_{2,\rho}|_1&<\log(\rho)|\eta_2|_1=\log(\rho),\\|\eta_{2,\rho}'|_1&=2\eta_{2,\rho}(1/2)=2\log(\rho/2)\eta_2(1/2)<(8\log 2)\log\rho,\\|\eta_{2,\rho}''|_1&=4\log(\rho/4)+|2\log\rho-4\log(\rho/4)|+|4\log 2-4\log\rho|+|\log\rho-4\log 2|+|\log\rho|<48\log\rho.\end{aligned}\tag{6.2}\]


      In the first inequality, we are using the fact that $\log(\rho t)$ is always positive (and less than $\log(\rho)$) when $t$ is in the support of $\eta_2$.
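      The first two norms in (6.1) are easy to confirm numerically. The sketch below is ours and rests on an assumption not restated in this section: that the function in (3.4) is $\eta_2(t)=4\max(\log 2-|\log 2t|,0)$, supported on $[1/4,1]$. It approximates $|\eta_2|_1$ by the midpoint rule and $|\eta_2'|_1$ by the total variation of $\eta_2$ on a grid that contains the peak $t=1/2$.

      ```python
      import math

      def eta2(t):
          """Assumed form of the smoothing function in (3.4)."""
          return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

      N = 300_000                 # divisible by 3, so t = 1/2 is a grid point
      a, b = 0.25, 1.0
      h = (b - a) / N

      # |eta2|_1 by the midpoint rule on [1/4, 1].
      l1_norm = h * sum(eta2(a + (i + 0.5) * h) for i in range(N))

      # |eta2'|_1 as the total variation of eta2 on the grid.
      values = [eta2(a + i * h) for i in range(N + 1)]
      total_variation = sum(abs(values[i + 1] - values[i]) for i in range(N))

      print(l1_norm, total_variation, 8 * math.log(2))
      ```

      Since $\eta_2$ rises monotonically to $\eta_2(1/2)=4\log 2$ and then falls back to $0$, the total variation is $2\eta_2(1/2)=8\log 2$, which is also the identity used for $|\eta_{2,\rho}'|_1$ in (6.2).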

      Write $\log^+x$ for $\max(\log x,0)$.

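      6.2 Contributions of different types

      6.2.1 Type I terms: $S_{I,1}$

      The term $S_{I,1}$ can be handled directly by Lemma 4.2.3, with $\rho_0=4$ and $D=U$. (Condition (4.38) is valid thanks to (6.2).) Since $U\le Q/2$, the contribution of $S_{I,1}$ gets bounded by (4.40) and (4.41): the absolute value of $S_{I,1}$ is at most
      \[\begin{aligned}&\frac xq\min\left(1,\frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m\le U/q\\ (m,q)=1}}\frac{\mu(m)}{m}\log\frac{x}{mq}\right|+\frac xq\left|\widehat{\log\cdot\eta}(-\delta)\right|\left|\sum_{\substack{m\le U/q\\ (m,q)=1}}\frac{\mu(m)}{m}\right|\\&+\frac{2\sqrt{c_0c_1}}{\pi}\left(U\log\frac{ex}{U}+\sqrt3\,q\log\frac{q}{c_2}+\frac q2\log\frac{q}{c_2}\log^+\frac{2U}{q}\right)+\frac{3c_1}{2}\frac xq\log\frac{q}{c_2}\log^+\frac{U}{c_2x/q}\\&+\frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2}+\left(\frac{c_0}{2}-\frac{2c_0}{\pi^2}\right)\left(\frac{U^2}{4qx}\log\frac{e^{1/2}x}{U}+\frac1e\right)\\&+\frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log x+\frac{20c_0c_2^{3/2}}{3\pi^2}\sqrt{2x}\log\frac{2\sqrt e\,x}{c_2},\end{aligned}\tag{6.3}\]
      where $c_0=31.521$ (by Lemma B.2.3), $c_1=1.0000028>1+(8\log 2)/V\ge 1+(8\log 2)/(x/U)$, and $c_2=6\pi/5\sqrt{c_0}=0.67147\ldots$. By (2.1) (with $k=2$), (B.17) and Lemma B.2.4,
      \[\left|\widehat{\log\cdot\eta}(-\delta)\right|\le\min\left(2-\log 4,\frac{24\log 2}{\pi^2\delta^2}\right).\]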

      By (2.20), (2.22) and (2.23), the first line of (6.3) is at most
      \[\begin{aligned}&\frac xq\min\left(1,\frac{c_0'}{\delta^2}\right)\left(\min\left(\frac45\frac{q/\phi(q)}{\log^+\frac{U}{q^2}},1\right)\log\frac xU+1.00303\frac{q}{\phi(q)}\right)\\&+\frac xq\min\left(2-\log 4,\frac{c_0''}{\delta^2}\right)\min\left(\frac45\frac{q/\phi(q)}{\log^+\frac{U}{q^2}},1\right),\end{aligned}\]
      where $c_0'=0.798437>c_0/(2\pi)^2$, $c_0''=1.685532$. Clearly $c_0''/c_0'>1>2-\log 4$.

      Taking derivatives, we see that $t\mapsto(t/2)\log(t/c_2)\log^+2U/t$ takes its maximum (for $t\in[1,2U]$) when $\log(t/c_2)\log^+2U/t=\log t/c_2-\log^+2U/t$; since $t\to\log t/c_2-\log^+2U/t$ is increasing on $[1,2U]$, we conclude that
      \[\frac q2\log\frac{q}{c_2}\log^+\frac{2U}{q}\le U\log\frac{2U}{c_2}.\]

      Similarly, $t\mapsto t\log(x/t)\log^+(U/t)$ takes its maximum at a point $t\in[0,U]$ for which $\log(x/t)\log^+(U/t)=\log(x/t)+\log^+(U/t)$, and so
      \[\frac xq\log\frac{q}{c_2}\log^+\frac{U}{c_2x/q}\le\frac{U}{c_2}(\log x+\log U).\]

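      We conclude that
      \[\begin{aligned}|S_{I,1}|\le{}&\frac xq\min\left(1,\frac{c_0'}{\delta^2}\right)\left(\min\left(\frac{4q/\phi(q)}{5\log^+\frac{U}{q^2}},1\right)\left(\log\frac xU+c_{3,I}\right)+c_{4,I}\frac{q}{\phi(q)}\right)\\&+\left(c_{7,I}\log\frac{q}{c_2}+c_{8,I}\log x\max\left(1,\log\frac{c_{11,I}q^2}{x}\right)\right)q+c_{10,I}\frac{U^2}{4qx}\log\frac{e^{1/2}x}{U}\\&+\left(c_{5,I}\log\frac{2U}{c_2}+c_{6,I}\log xU\right)U+c_{9,I}\sqrt x\log\frac{2\sqrt e\,x}{c_2}+\frac{c_{10,I}}{e},\end{aligned}\tag{6.4}\]
      where $c_2$ and $c_0'$ are as above, $c_{3,I}=2.11104>c_0''/c_0'$, $c_{4,I}=1.00303$, $c_{5,I}=3.57422>2\sqrt{c_0c_1}/\pi$, $c_{6,I}=2.23389>3c_1/2c_2$, $c_{7,I}=6.19072>2\sqrt{3c_0c_1}/\pi$, $c_{8,I}=3.53017>2(8\log 2)/\pi$,
      \[c_{9,I}=19.1568>\frac{3\sqrt2c_1}{2\sqrt{c_2}}+\frac{20\sqrt2c_0c_2^{3/2}}{3\pi^2},\]
      $c_{10,I}=9.37301>c_0(1/2-2/\pi^2)$ and $c_{11,I}=9.0857>c_0e^3/(4\pi\cdot 8\log 2)$.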
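      As a bookkeeping aid, the closed-form expressions behind these constants can be evaluated directly. The following Python sketch is ours; it takes $c_0=31.521$ and $c_1=1.0000028$ as given, computes $c_2=6\pi/5\sqrt{c_0}$ exactly rather than from its rounded value, reads $c_0''$ off the bound $24\log 2/(\pi^2\delta^2)$ above, and compares each printed decimal with its closed form. It checks agreement up to rounding in the last printed digits only; it makes no claim about the direction of each inequality.

      ```python
      import math

      c0 = 31.521
      c1 = 1.0000028
      c2 = 6 * math.pi / (5 * math.sqrt(c0))
      log2 = math.log(2)

      # name: (printed decimal value, closed-form expression)
      constants = {
          "c2":    (0.67147,  c2),
          "c0'":   (0.798437, c0 / (2 * math.pi) ** 2),
          "c0''":  (1.685532, 24 * log2 / math.pi ** 2),
          "c3I":   (2.11104,  1.685532 / 0.798437),
          "c5I":   (3.57422,  2 * math.sqrt(c0 * c1) / math.pi),
          "c6I":   (2.23389,  3 * c1 / (2 * c2)),
          "c7I":   (6.19072,  2 * math.sqrt(3 * c0 * c1) / math.pi),
          "c8I":   (3.53017,  2 * (8 * log2) / math.pi),
          "c9I":   (19.1568,  3 * math.sqrt(2) * c1 / (2 * math.sqrt(c2))
                              + 20 * math.sqrt(2) * c0 * c2 ** 1.5
                              / (3 * math.pi ** 2)),
          "c10I":  (9.37301,  c0 * (0.5 - 2 / math.pi ** 2)),
          "c11I":  (9.0857,   c0 * math.exp(3) / (4 * math.pi * 8 * log2)),
      }

      for name, (printed, value) in constants.items():
          print(f"{name:5s} printed {printed:<10} closed form {value:.7f}")
      ```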

      6.2.2 Type I terms: $S_{I,2}$

      The case $q\le Q/V$. If $q\le Q/V$, then, for $v\le V$,
      \[2v\alpha=\frac{va}{q}+O^*\left(\frac{v}{Qq}\right)=\frac{va}{q}+O^*\left(\frac{1}{q^2}\right),\]
      and so $va/q$ is a valid approximation to $2v\alpha$. (Here we are using $v$ to label an integer variable bounded above by $v\le V$; we no longer need $v$ to label the quantity in (3.10), since that has been set equal to the constant $2$.) Moreover, for $Q_v=Q/v$, we see that $2v\alpha=(va/q)+O^*(1/qQ_v)$. If $\alpha=a/q+\delta/x$, then $v\alpha=va/q+\delta/(x/v)$. Now
      \[S_{I,2}=\sum_{\substack{v\le V\\ v\ \text{odd}}}\Lambda(v)\sum_{\substack{m\le U\\ m\ \text{odd}}}\mu(m)\sum_{\substack{n\\ n\ \text{odd}}}e((v\alpha)\cdot mn)\,\eta(mn/(x/v)). \tag{6.5}\]

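      We can thus estimate $S_{I,2}$ by applying Lemma 4.2.2 to each inner double sum in (6.5). We obtain that, if $|\delta|\le 1/2c_2$, where $c_2=6\pi/5\sqrt{c_0}$ and $c_0=31.521$, then $|S_{I,2}|$ is at most
      \[\sum_{v\le V}\Lambda(v)\left(\frac{x/v}{2q_v}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\left|\sum_{\substack{m\le M_v/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right|+\frac{c_{10,I}\,q}{4x/v}\left(\frac{U}{q_v}+1\right)^2\right) \tag{6.6}\]
      plus
      \[\begin{aligned}&\sum_{v\le V}\Lambda(v)\left(\frac{2\sqrt{c_0c_+}}{\pi}U+\frac{3c_+}{2}\frac{x}{vq_v}\log^+\frac{U}{\frac{c_2x}{vq_v}}+\frac{\sqrt{c_0c_+}}{\pi}q_v\log^+\frac{U}{q_v/2}\right)\\&+\sum_{v\le V}\Lambda(v)\left(c_{8,I}\max\left(\log\frac{c_{11,I}q_v^2}{x/v},1\right)q_v+\left(\frac{2\sqrt{3c_0c_+}}{\pi}+\frac{3c_+}{2c_2}+\frac{55c_0c_2}{6\pi^2}\right)q_v\right),\end{aligned}\tag{6.7}\]
      where $q_v=q/(q,v)$, $M_v\in[\min(Q/2v,U),U]$ and $c_+=1+(8\log 2)/(x/UV)$; if $|\delta|\ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.6) plus
      \[\begin{aligned}&\sum_{v\le V}\Lambda(v)\left(\frac{\sqrt{c_0c_1}}{\pi/2}U+\frac{3c_1}{2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2U}{\frac{x/v}{|\delta|q_v}}\right)\frac xQ+\frac{35c_0c_2}{3\pi^2}q_v\right)\\&+\sum_{v\le V}\Lambda(v)\frac{\sqrt{c_0c_1}}{\pi/2}(1+\epsilon)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,2U\right)\left(\sqrt{3+2\epsilon}+\frac{\log^+\frac{2U}{\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1}}{2}\right).\end{aligned}\tag{6.8}\]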

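      Write $S_V=\sum_{v\le V}\Lambda(v)/(vq_v)$. By (2.12),
      \[\begin{aligned}S_V&\le\sum_{v\le V}\frac{\Lambda(v)}{vq}+\sum_{\substack{v\le V\\ (v,q)>1}}\frac{\Lambda(v)}{v}\left(\frac{(q,v)}{q}-\frac1q\right)\\&\le\frac{\log V}{q}+\frac1q\sum_{p|q}(\log p)\left(v_p(q)+\sum_{\substack{\alpha\ge 1\\ p^{\alpha+v_p(q)}\le V}}\frac{1}{p^\alpha}-\sum_{\substack{\alpha\ge 1\\ p^\alpha\le V}}\frac{1}{p^\alpha}\right)\\&\le\frac{\log V}{q}+\frac1q\sum_{p|q}(\log p)v_p(q)=\frac{\log Vq}{q}.\end{aligned}\tag{6.9}\]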
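      A brute-force check of (6.9) for small parameters may help in following the case analysis over prime powers $v=p^\beta$ with $p|q$. The sketch below is ours (the function names `von_mangoldt` and `S_V` are not from the text); it computes $S_V=\sum_{v\le V}\Lambda(v)(q,v)/(vq)$ directly, using $q_v=q/(q,v)$, and compares it with $\log(Vq)/q$.

      ```python
      import math
      from math import gcd

      def von_mangoldt(n):
          """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
          if n < 2:
              return 0.0
          p = 2
          while p * p <= n:
              if n % p == 0:
                  while n % p == 0:
                      n //= p
                  return math.log(p) if n == 1 else 0.0
              p += 1
          return math.log(n)  # n itself is prime

      def S_V(V, q):
          """Sum over v <= V of Lambda(v) / (v * q_v), with q_v = q / gcd(q, v)."""
          return sum(von_mangoldt(v) * gcd(q, v) / (v * q)
                     for v in range(1, V + 1))

      # Largest value of S_V - log(Vq)/q over a small box of parameters.
      worst_gap = max(S_V(V, q) - math.log(V * q) / q
                      for V in range(2, 81) for q in range(1, 41))
      print(worst_gap)
      ```

      For $q=1$ the comparison reduces to the bound $\sum_{v\le V}\Lambda(v)/v\le\log V$ that enters through (2.12).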

      This helps us to estimate (6.6). We could also use this to estimate the second term in the first line of (6.7), but, for that purpose, it will actually be wiser to use the simpler bound
      \[\sum_{v\le V}\Lambda(v)\frac{x}{vq_v}\log^+\frac{U}{\frac{c_2x}{vq_v}}\le\sum_{v\le V}\Lambda(v)\frac{U/c_2}{e}\le\frac{1.0004}{ec_2}UV \tag{6.10}\]
      (by (2.14) and the fact that $t\log^+A/t$ takes its maximum at $t=A/e$).

      We bound the sum over $m$ in (6.6) by (2.20) and (2.22):
      \[\left|\sum_{\substack{m\le M_v/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right|\le\min\left(\frac45\frac{q/\phi(q)}{\log^+\frac{M_v}{2q^2}},1\right).\]

      To bound the terms involving $(U/q_v+1)^2$, we use
      \[\sum_{v\le V}\Lambda(v)v\le 0.5004V^2\quad\text{(by (2.17))},\]
      \[\sum_{v\le V}\Lambda(v)v(v,q)^j\le\sum_{v\le V}\Lambda(v)v+V\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q)^j,\]
      \[\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q)\le\sum_{p|q}(\log p)\sum_{1\le\alpha\le\log_pV}p^{v_p(q)}\le\sum_{p|q}(\log p)\frac{\log V}{\log p}p^{v_p(q)}\le(\log V)\sum_{p|q}p^{v_p(q)}\le q\log V\]
      and
      \[\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)(v,q)^2\le\sum_{p|q}(\log p)\sum_{1\le\alpha\le\log_pV}p^{v_p(q)+\alpha}\le\sum_{p|q}(\log p)\cdot 2p^{v_p(q)}\cdot p^{\log_pV}\le 2qV\log q.\]

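      Using (2.14) and (6.9) as well, we conclude that (6.6) is at most
      \[\begin{aligned}&\frac{x}{2q}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\min\left(\frac45\frac{q/\phi(q)}{\log^+\frac{\min(Q/2V,U)}{2q^2}},1\right)\log Vq\\&+\frac{c_{10,I}}{4x}\left(0.5004V^2q\left(\frac Uq+1\right)^2+2UVq\log V+2U^2V\log V\right).\end{aligned}\]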

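      Assume $Q\le 2UV/e$. Using (2.14), (6.10), (2.18) and the inequality $vq\le Vq\le Q$ (which implies $q/2\le U/e$), we see that (6.7) is at most
      \[1.0004\left(\left(\frac{2\sqrt{c_0c_+}}{\pi}+\frac{3c_+}{2ec_2}\right)UV+\frac{\sqrt{c_0c_+}}{\pi}Q\log\frac{U}{q/2}\right)+\left(c_{5,I_2}\max\left(\log\frac{c_{11,I}q^2V}{x},2\right)+c_{6,I_2}\right)Q,\]
      where $c_{5,I_2}=3.53312>1.0004\cdot c_{8,I}$ and
      \[c_{6,I_2}=1.0004\left(\frac{2\sqrt{3c_0c_+}}{\pi}+\frac{3c_+}{2c_2}+\frac{55c_0c_2}{6\pi^2}\right). \tag{6.11}\]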

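      The expressions in (6.8) get estimated similarly. The first line of (6.8) is at most
      \[1.0004\left(\frac{2\sqrt{c_0c_+}}{\pi}UV+\frac{3c_+}{2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{xV}{Q}+\frac{35c_0c_2}{3\pi^2}qV\right)\]
      by (2.14). Since $q\le Q/V$, we can obviously bound $qV$ by $Q$. As for the second line of (6.8),
      \[\sum_{v\le V}\Lambda(v)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,2U\right)\cdot\frac12\log^+\frac{2U}{\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1}\le\sum_{v\le V}\Lambda(v)\max_{t>0}t\log^+\frac Ut\le\sum_{v\le V}\Lambda(v)\frac Ue=\frac{1.0004}{e}UV,\]
      but
      \[\begin{aligned}\sum_{v\le V}\Lambda(v)\min\left(\left\lfloor\frac{x/v}{|\delta|q_v}\right\rfloor+1,2U\right)&\le\sum_{v\le\frac{x}{2U|\delta|q}}\Lambda(v)\cdot 2U+\sum_{\substack{\frac{x}{2U|\delta|q}<v\le V\\ (v,q)=1}}\Lambda(v)\frac{x/|\delta|}{vq}+\sum_{v\le V}\Lambda(v)+\sum_{\substack{v\le V\\ (v,q)\ne 1}}\Lambda(v)\frac{x/|\delta|}{v}\left(\frac{1}{q_v}-\frac1q\right)\\&\le 1.03883\frac{x}{|\delta|q}+\frac{x}{|\delta|q}\max\left(\log V-\log\frac{x}{2U|\delta|q}+\log\frac{3}{\sqrt2},0\right)+1.0004V+\frac{x}{|\delta|}\frac1q\sum_{p|q}(\log p)v_p(q)\\&\le\frac{x}{|\delta|q}\left(1.03883+\log q+\log^+\frac{6UV|\delta|q}{\sqrt2x}\right)+1.0004V\end{aligned}\]
      by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (6.9).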

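      Let us collect our bounds. If $|\delta|\le 1/2c_2$, then, assuming $Q\le 2UV/e$, we conclude that $|S_{I,2}|$ is at most
      \[\begin{aligned}&\frac{x}{2\phi(q)}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}},1\right)\log Vq\\&+c_{8,I_2}\frac xq\left(\frac{UV}{x}\right)^2\left(1+\frac qU\right)^2+\frac{c_{10,I}}{2}\left(\frac{UV}{x}q\log V+\frac{U^2V}{x}\log V\right)\end{aligned}\tag{6.12}\]
      plus
      \[(c_{4,I_2}+c_{9,I_2})UV+\left(c_{10,I_2}\log\frac Uq+c_{5,I_2}\max\left(\log\frac{c_{11,I}q^2V}{x},2\right)+c_{12,I_2}\right)\cdot Q, \tag{6.13}\]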

      where
      \[\begin{aligned}c_{4,I_2}&=3.57565(1+\epsilon_0)>1.0004\cdot 2\sqrt{c_0c_+}/\pi,\\c_{5,I_2}&=3.53312>1.0004\cdot c_{8,I},\\c_{8,I_2}&=1.17257>\frac{c_{10,I}}{4}\cdot 0.5004,\\c_{9,I_2}&=0.82214(1+2\epsilon_0)>3c_+\cdot 1.0004/2ec_2,\\c_{10,I_2}&=1.78783\sqrt{1+2\epsilon_0}>1.0004\sqrt{c_0c_+}/\pi,\\c_{12,I_2}&=29.3333+11.902\epsilon_0\\&>1.0004\left(\frac{3}{2c_2}c_++\frac{2\sqrt{3c_0}}{\pi}\sqrt{c_+}+\frac{55c_0c_2}{6\pi^2}\right)+1.78783(1+\epsilon_0)\log 2\\&=c_{6,I_2}+c_{10,I_2}\log 2\end{aligned}\]

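      and $c_{10,I}=9.37301$ as before. Here $\epsilon_0=(4\log 2)/(x/UV)$, and $c_{6,I_2}$ is as in (6.11). If $|\delta|\ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.12) plus
      \[\begin{aligned}&(c_{4,I_2}+(1+\epsilon)c_{13,I_2})UV+c_\epsilon\left(c_{14,I_2}\left(\log q+\log^+\frac{6UV|\delta|q}{\sqrt2x}\right)+c_{15,I_2}\right)\frac{x}{|\delta|q}\\&+c_{16,I_2}\left(2+\frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{x}{Q/V}+c_{17,I_2}Q+c_\epsilon\cdot c_{4,I_2}V,\end{aligned}\tag{6.14}\]
      where
      \[\begin{aligned}c_{13,I_2}&=1.31541(1+\epsilon_0)>\frac{2\sqrt{c_0c_+}}{\pi}\cdot\frac{1.0004}{e},\\c_{14,I_2}&=3.57422\sqrt{1+2\epsilon_0}>\frac{2\sqrt{c_0c_+}}{\pi},\\c_{15,I_2}&=3.71301\sqrt{1+2\epsilon_0}>\frac{2\sqrt{c_0c_+}}{\pi}\cdot 1.03883,\\c_{16,I_2}&=1.5006(1+2\epsilon_0)>1.0004\cdot 3c_+/2,\\c_{17,I_2}&=25.0295>1.0004\cdot 35c_0c_2/3\pi^2,\end{aligned}\]
      and $c_\epsilon=(1+\epsilon)\sqrt{3+2\epsilon}$. We recall that $c_2=6\pi/5\sqrt{c_0}=0.67147\ldots$. We will choose $\epsilon\in(0,1)$ later; we also leave the task of bounding $\epsilon_0$ for later.

      The case $q>Q/V$. We use Lemma 4.2.4 in this case.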

      6.2.3 Type II terms

      As we showed in (5.1)–(5.5), $S_{II}$ (given in (5.1)) is at most
      \[4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_2(U,V,W)}\,\frac{dW}{W}+4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_3(W)}\,\frac{dW}{W}, \tag{6.15}\]
      where $S_1$, $S_2$ and $S_3$ are as in (5.4) and (5.5). We bounded $S_1$ in (5.33) and (5.34), $S_2$ in Prop. 5.2.4 and $S_3$ in (5.5).

      Let us try to give some structure to the bookkeeping we must now inevitably do. The second integral in (6.15) will be negligible (because $S_3$ is); let us focus on the first integral.

      Thanks to our work in §5.1, the term $S_1(U,W)$ is bounded by a (small) constant times $x/W$. (This represents a gain of several factors of $\log$ with respect to the trivial bound.) We bounded $S_2(U,V,W)$ using the large sieve; we expected, and got, a bound that is better than trivial by a factor of size roughly $\sqrt q\log x$; the exact factor in the bound depends on the value of $W$. In particular, it is only in the central part of the range for $W$ that we will really be able to save a factor of $\sqrt q\log x$, as opposed to just $\sqrt q$. We will have to be slightly clever in order to get a good total bound in the end.

      We first recall our estimate for $S_1$. In the whole range $[V,x/U]$ for $W$, we know from (5.33), (5.34) and (5.37) that $S_1(U,W)$ is at most
      \[\frac{2}{\pi^2}\frac xW+\kappa_0\zeta(3/2)^3\frac xW\sqrt{\frac{x/WU}{U}}, \tag{6.16}\]
      where
      \[\kappa_0=1.27.\]
      (We recall we are working with $v=2$.)

      We have better estimates for the constant in front in some parts of the range; in what is usually the main part, (5.34) and (5.36) give us a constant of $0.15107$ instead of $2/\pi^2$. Note that $1.27\zeta(3/2)^3=22.6417\ldots$. We should choose $U$, $V$ so that the first term in (6.16) dominates. For the time being, assume only
      \[U\ge 5\cdot 10^5\frac{x}{VU}; \tag{6.17}\]
      then (6.16) gives
      \[S_1(U,W)\le\kappa_1\frac xW, \tag{6.18}\]
      where
      \[\kappa_1=\frac{2}{\pi^2}+\frac{22.6418}{\sqrt{10^6/2}}\le 0.2347.\]

      This will suffice for our cruder estimates.

      The second integral in (6.15) is now easy to bound. By (5.5),
      \[S_3(W)\le 1.0171x+2.0341W\le 1.0172x,\]
      since $W\le x/U\le x/5\cdot 10^5$. Hence
      \[4\int_V^{x/U}\sqrt{S_1(U,W)\cdot S_3(W)}\,\frac{dW}{W}\le 4\int_V^{x/U}\sqrt{\kappa_1\frac xW\cdot 1.0172x}\,\frac{dW}{W}\le\kappa_9\frac{x}{\sqrt V},\]
      where
      \[\kappa_9=8\cdot\sqrt{1.0172\cdot\kappa_1}\le 3.9086.\]

      Let us now examine $S_2$, which was bounded in Prop. 5.2.4. We set the parameters $W'$, $U'$ as follows, in accordance with (5.4):
      \[W'=\max(V,W/2),\qquad U'=\max(U,x/2W).\]
      Since $W'\ge W/2$ and $W\ge V>117$, we can always bound
      \[\sum_{W'<p\le W}(\log p)^2\le\frac12W(\log W) \tag{6.19}\]
      by (2.19).

      Bounding $S_2$ for $\delta$ arbitrary. We set
      \[W_0=\min(\max(2\theta q,V),x/U),\]
      where $\theta\ge e$ is a parameter that will be set later.

      For $V\le W<W_0$, we use the bound (5.53):
      \[\begin{aligned}S_2(U',W',W)&\le\left(\max(1,2\rho)\left(\frac{x}{8q}+\frac{x}{2W}\right)+\frac W2+2q\right)\cdot\frac12W(\log W)\\&\le\max\left(\frac12,\rho\right)\left(\frac{W}{8q}+\frac12\right)x\log W+\frac{W^2\log W}{4}+qW\log W,\end{aligned}\]
      where $\rho=q/Q$.

      If $W_0>V$, the contribution of the terms with $V\le W<W_0$ to (6.15) is (by (6.18)) bounded by
      \[\begin{aligned}&4\int_V^{W_0}\sqrt{\kappa_1\frac xW\left(\frac{\rho_0}{4}\left(\frac{W}{4q}+1\right)x\log W+\frac{W^2\log W}{4}+qW\log W\right)}\,\frac{dW}{W}\\&\le\frac{\kappa_2}{2}\sqrt{\rho_0}\,x\int_V^{W_0}\frac{\sqrt{\log W}}{W^{3/2}}\,dW+\frac{\kappa_2}{2}\sqrt x\int_V^{W_0}\frac{\sqrt{\log W}}{W^{1/2}}\,dW+\kappa_2\sqrt{\frac{\rho_0x^2}{16q}+qx}\int_V^{W_0}\frac{\sqrt{\log W}}{W}\,dW\\&\le\left(\kappa_2\sqrt{\rho_0}\frac{x}{\sqrt V}+\kappa_2\sqrt{xW_0}\right)\sqrt{\log W_0}+\frac{2\kappa_2}{3}\sqrt{\frac{\rho_0x^2}{16q}+qx}\left((\log W_0)^{3/2}-(\log V)^{3/2}\right),\end{aligned}\tag{6.20}\]
      where $\rho_0=\max(1,2\rho)$ and
      \[\kappa_2=4\sqrt{\kappa_1}\le 1.93768.\]
      (We are using the easy bound $\sqrt{a+b+c}\le\sqrt a+\sqrt b+\sqrt c$.)

We now examine the terms with $W \geq W_0$. If $2\theta q > x/U$, then $W_0 = x/U$, the contribution of the case is nil, and the computations below can be ignored. Thus, we can assume that $2\theta q \leq x/U$.

We use (5.54):

\[ S_2(U',W',W) \leq \left( \frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \right) \cdot \frac{1}{2} W \log W. \]

By $\sqrt{a+b} \leq \sqrt{a} + \sqrt{b}$, we can take out the $\frac{q}{\phi(q)} \cdot \frac{W}{\log(W/2q)}$ term and estimate its contribution on its own; it is at most

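\[ \begin{aligned} & 4 \int_{W_0}^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot \frac{q}{\phi(q)} \cdot \frac{1}{2} W^2 \frac{\log W}{\log W/2q}}\, \frac{dW}{W} \\ &= \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{x \log W}{W \log W/2q}}\, dW \\ &\leq \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{q x}{\phi(q)}} \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW. \end{aligned} \tag{6.21} \]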

Now

\[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \sqrt{\frac{\log 2q}{\log W/2q}}\, dW = \sqrt{2q \log 2q} \int_{\max(\theta, V/2q)}^{x/2Uq} \frac{1}{\sqrt{t \log t}}\, dt. \]

We bound this last integral somewhat crudely: for $T \geq e$,

\[ \int_e^T \frac{1}{\sqrt{t \log t}}\, dt \leq 2.3 \sqrt{\frac{T}{\log T}}. \tag{6.22} \]

(This is shown as follows: since

\[ \frac{1}{\sqrt{T \log T}} < \left( 2.3 \sqrt{\frac{T}{\log T}} \right)' \]

if and only if $T > T_0$, where $T_0 = e^{(1 - 2/2.3)^{-1}} = 2135.94\dots$, it is enough to check (numerically) that (6.22) holds for $T = T_0$.) Since $\theta \geq e$, this gives us that

\[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW \leq 2 \sqrt{\frac{x}{U}} + 2.3 \sqrt{2q \log 2q} \cdot \sqrt{\frac{x/2Uq}{\log x/2Uq}}, \]

and so (6.21) is at most

\[ \sqrt{2} \kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}}. \]
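The numerical check of (6.22) at $T = T_0$ mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the computation used in the text: the function names are hypothetical, and plain floating-point Simpson quadrature stands in for a rigorous (interval-arithmetic) evaluation. The substitution $t = e^{s^2}$ turns the integrand into the smooth function $2e^{s^2/2}$.

```python
import math

# T_0 = exp(1/(1 - 2/2.3)), the point where the two derivatives agree.
T0 = math.exp(1.0 / (1.0 - 2.0 / 2.3))

def lhs_6_22(T, steps=20_000):
    """Simpson approximation of the integral of 1/sqrt(t log t) over [e, T].

    With t = exp(s^2), dt/sqrt(t log t) = 2*exp(s^2/2) ds, s in [1, sqrt(log T)].
    `steps` must be even.
    """
    a, b = 1.0, math.sqrt(math.log(T))
    h = (b - a) / steps
    f = lambda s: 2.0 * math.exp(s * s / 2.0)
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def rhs_6_22(T):
    """Right-hand side of (6.22): 2.3 * sqrt(T / log T)."""
    return 2.3 * math.sqrt(T / math.log(T))

if __name__ == "__main__":
    print("T0 =", T0)
    print("lhs =", lhs_6_22(T0), " rhs =", rhs_6_22(T0))
```

Since the difference between the two sides of (6.22) decreases up to $T_0$ and increases afterwards, the inequality at $T_0$ is the only case that needs checking.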

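We are left with what will usually be the main term, viz.,

\[ 4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \left( \frac{x}{8\phi(q)} \frac{\log W}{\log W/2q} \right) W}\, \frac{dW}{W}, \tag{6.23} \]

which, by (5.34), is at most $x/\sqrt{\phi(q)}$ times the integral of

\[ \frac{1}{W} \sqrt{\left( 2 H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log W/2q}} \]

for $W$ going from $W_0$ to $x/U$, where $H_2$ is as in (5.35) and

\[ \kappa_4 = 4\kappa_0 \zeta(3/2)^3 \leq 90.5671. \]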

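By the arithmetic/geometric mean inequality, the integrand is at most $1/W$ times

\[ \frac{\beta + \beta^{-1} \cdot 2 H_2(x/WU)}{2} + \frac{\beta^{-1}}{2} \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} + \frac{\beta}{2} \frac{\log 2q}{\log W/2q} \tag{6.24} \]

for any $\beta > 0$. We will choose $\beta$ later.

The first summand in (6.24) gives what we can think of as the main or worst term in the whole paper; let us compute it first. The integral is

\[ \int_{W_0}^{x/U} \frac{\beta + \beta^{-1} \cdot 2 H_2(x/WU)}{2} \frac{dW}{W} = \int_1^{x/UW_0} \frac{\beta + \beta^{-1} \cdot 2 H_2(s)}{2} \frac{ds}{s} \leq \left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \log \frac{x}{U W_0} \tag{6.25} \]

by (5.36), where $\kappa_6 = 0.60428$. Thus the main term is simply

\[ \left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \frac{x}{\sqrt{\phi(q)}} \log \frac{x}{U W_0}. \tag{6.26} \]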

The integral of the second summand is at most

\[ \beta^{-1} \cdot \frac{\kappa_4}{4} \frac{\sqrt{x}}{U} \int_V^{x/U} \frac{dW}{W^{3/2}} \leq \beta^{-1} \cdot \frac{\kappa_4}{2} \sqrt{\frac{x/UV}{U}}. \]

By (6.17), this is at most

\[ \frac{\beta^{-1}}{\sqrt{2}} \cdot 10^{-3} \cdot \kappa_4 \leq \beta^{-1} \kappa_7/2, \]

where

\[ \kappa_7 = \frac{\sqrt{2} \kappa_4}{1000} \leq 0.1281. \]

Thus the contribution of the second summand is at most

\[ \frac{\beta^{-1} \kappa_7}{2} \cdot \frac{x}{\sqrt{\phi(q)}}. \]

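The integral of the third summand in (6.24) is

\[ \frac{\beta}{2} \int_{W_0}^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W}. \tag{6.27} \]

If $V < 2\theta q \leq x/U$, this is

\[ \frac{\beta}{2} \int_{2\theta q}^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W} = \frac{\beta}{2} \log 2q \cdot \int_\theta^{x/2Uq} \frac{1}{\log t} \frac{dt}{t} = \frac{\beta}{2} \log 2q \cdot \left( \log\log \frac{x}{2Uq} - \log\log\theta \right). \]

If $2\theta q > x/U$, the integral is over an empty range and its contribution is hence $0$.

If $2\theta q \leq V$, (6.27) is

\[ \begin{aligned} \frac{\beta}{2} \int_V^{x/U} \frac{\log 2q}{\log W/2q} \frac{dW}{W} &= \frac{\beta \log 2q}{2} \int_{V/2q}^{x/2Uq} \frac{1}{\log t} \frac{dt}{t} \\ &= \frac{\beta \log 2q}{2} \cdot \left( \log\log \frac{x}{2Uq} - \log\log \frac{V}{2q} \right) \\ &= \frac{\beta \log 2q}{2} \cdot \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right). \end{aligned} \tag{6.28} \]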

(Let us stop for a moment and ask ourselves when this will be smaller than what we can see as the main term, namely, the term $(\beta/2) \log x/UW_0$ in (6.25). Clearly, $\log(1 + (\log x/UV)/(\log V/2q)) \leq (\log x/UV)/(\log V/2q)$, and that is smaller than $(\log x/UV)/\log 2q$ when $V/2q > 2q$. Of course, it does not actually matter if (6.28) is smaller than the term from (6.25) or not, since we are looking for upper bounds here, not for asymptotics.)

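The total bound for (6.23) is thus

\[ \frac{x}{\sqrt{\phi(q)}} \cdot \left( \beta \cdot \left( \frac{1}{2} \log \frac{x}{U W_0} + \frac{\Phi}{2} \right) + \beta^{-1} \left( \frac{1}{4} \kappa_6 \log \frac{x}{U W_0} + \frac{\kappa_7}{2} \right) \right), \tag{6.29} \]

where

\[ \Phi = \begin{cases} \log 2q \left( \log\log \frac{x}{2Uq} - \log\log\theta \right) & \text{if } V/2\theta < q < x/(2\theta U), \\ \log 2q \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) & \text{if } q \leq V/2\theta. \end{cases} \tag{6.30} \]

Choosing $\beta$ optimally, we obtain that (6.23) is at most

\[ \frac{x}{\sqrt{2\phi(q)}} \sqrt{\left( \log \frac{x}{U W_0} + \Phi \right) \left( \kappa_6 \log \frac{x}{U W_0} + 2\kappa_7 \right)}, \tag{6.31} \]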
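The passage from (6.29) to (6.31) is the elementary fact that $\beta A + \beta^{-1} B$ is minimized at $\beta = \sqrt{B/A}$, with minimum $2\sqrt{AB}$. A minimal sketch, with hypothetical function names and with the factor $x/\sqrt{\phi(q)}$ divided out; the inputs `L` (standing for $\log x/UW_0$) and `Phi` are arbitrary sample values, not values from the text:

```python
import math

KAPPA6 = 0.60428
KAPPA7 = 0.1281

def bound_6_29(beta, L, Phi):
    """(6.29) divided by x/sqrt(phi(q)): beta*A + B/beta."""
    A = 0.5 * L + 0.5 * Phi
    B = 0.25 * KAPPA6 * L + 0.5 * KAPPA7
    return beta * A + B / beta

def optimal_beta(L, Phi):
    """Minimizer sqrt(B/A) of beta*A + B/beta over beta > 0."""
    A = 0.5 * L + 0.5 * Phi
    B = 0.25 * KAPPA6 * L + 0.5 * KAPPA7
    return math.sqrt(B / A)

def bound_6_31(L, Phi):
    """(6.31) divided by x/sqrt(phi(q))."""
    return math.sqrt((L + Phi) * (KAPPA6 * L + 2 * KAPPA7) / 2)

if __name__ == "__main__":
    L, Phi = 3.0, 1.5  # sample values only
    b = optimal_beta(L, Phi)
    print(b, bound_6_29(b, L, Phi), bound_6_31(L, Phi))
```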

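where $\Phi$ is as in (6.30).

Bounding $S_2$ for $|\delta| \geq 8$. Let us see how much a non-zero $\delta$ can help us. It makes sense to apply (5.56) only when $|\delta| \geq 8$; otherwise (5.54) is almost certainly better. Now, by definition, $|\delta|/x \leq 1/qQ$, and so $|\delta| \geq 8$ can happen only when $q \leq x/8Q$.

With this in mind, let us apply (5.56), assuming $|\delta| > 8$. Note first that

\[ \begin{aligned} \frac{x}{|\delta q|} \left( q + \frac{x}{4W} \right)^{-1} &\geq \frac{1/|\delta q|}{\frac{q}{x} + \frac{1}{4W}} \geq \frac{4/|\delta q|}{\frac{1}{2Q} + \frac{1}{W}} \\ &\geq \frac{4W}{|\delta| q} \cdot \frac{1}{1 + \frac{W}{2Q}} \geq \frac{4W}{|\delta| q} \cdot \frac{1}{1 + \frac{x/U}{2Q}}. \end{aligned} \]

This is at least $2\min(2Q, W)/|\delta q|$. Thus we are allowed to apply (5.56) when $|\delta q| \leq 2\min(2Q, W)$. Since $Q \geq x/U$, we know that $\min(2Q, W) = W$ for all $W \leq x/U$, and so it is enough to assume that $|\delta q| \leq 2W$. We will soon be making a stronger assumption.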

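Recalling also (6.19), we see that (5.56) gives us

\[ S_2(U',W',W) \leq \min\left( 1, \frac{2q/\phi(q)}{\log\left( \frac{4W}{|\delta| q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \right) \left( \frac{x}{|\delta q|} + \frac{W}{2} \right) \cdot \frac{1}{2} W (\log W). \tag{6.32} \]

Similarly to before, we define $W_0 = \max(V, \theta|\delta q|)$, where $\theta \geq 3e^2/8$ will be set later. (Here $\theta \geq 3e^2/8$ is an assumption we do not yet need, but we will be using it soon to simplify matters slightly.) For $W \geq W_0$, we certainly have $|\delta q| \leq 2W$. Hence the part of the first term of (6.15) coming from the range $W_0 \leq W < x/U$ is

\[ \begin{aligned} & 4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} \\ &\leq 4 \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \frac{\log W}{\log\left( \frac{4W}{|\delta| q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \left( \frac{Wx}{|\delta q|} + \frac{W^2}{2} \right)}\, \frac{dW}{W}. \end{aligned} \tag{6.33} \]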

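By (5.34), the contribution of the term $Wx/|\delta q|$ to (6.33) is at most

\[ \frac{4x}{\sqrt{|\delta| \phi(q)}} \int_{W_0}^{x/U} \sqrt{\left( H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{4} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log\left( \frac{4W}{|\delta| q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)}}\, \frac{dW}{W}. \]

Note that $1 + (x/U)/2Q \leq 3/2$. Proceeding as in (6.23)-(6.31), we obtain that this is at most

\[ \frac{2x}{\sqrt{|\delta| \phi(q)}} \sqrt{\left( \log \frac{x}{U W_0} + \Phi \right) \left( \kappa_6 \log \frac{x}{U W_0} + 2\kappa_7 \right)}, \]

where

\[ \Phi = \begin{cases} \log \frac{(1+\epsilon_1)|\delta q|}{4} \log\left( 1 + \frac{\log x/UV}{\log \frac{4V}{|\delta|(1+\epsilon_1) q}} \right) & \text{if } |\delta q| \leq V/\theta, \\ \log \frac{3|\delta q|}{8} \left( \log\log \frac{8x}{3U|\delta q|} - \log\log \frac{8\theta}{3} \right) & \text{if } V/\theta < |\delta q| \leq x/\theta U, \end{cases} \tag{6.34} \]

where $\epsilon_1 = x/2UQ$. This is what we think of as the main term.

By (6.18), the contribution of the term $W^2/2$ to (6.33) is at most

\[ 4 \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{\kappa_1}{2} x}\, \frac{dW}{\sqrt{W}} \cdot \max_{W_0 \leq W \leq \frac{x}{U}} \sqrt{\frac{\log W}{\log 8W/3|\delta q|}}. \tag{6.35} \]

Since $t \mapsto (\log t)/(\log t/c)$ is decreasing for $t > c$, (6.35) is at most

\[ 4\sqrt{2\kappa_1} \sqrt{\frac{q}{\phi(q)}} \left( \frac{x}{\sqrt{U}} - \sqrt{x W_0} \right) \sqrt{\frac{\log W_0}{\log \frac{8 W_0}{3|\delta q|}}}. \tag{6.36} \]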

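If $W_0 > V$, we also have to consider the range $V \leq W < W_0$. By Prop. 5.2.4 and (6.19), the part of (6.15) coming from this is

\[ 4 \int_V^{\theta|\delta q|} \sqrt{S_1(U,W) \cdot (\log W) \left( \frac{Wx}{2|\delta q|} + \frac{W^2}{4} + \frac{Wx}{16(1-\rho)Q} + \frac{x}{8(1-\rho)} \right)}\, \frac{dW}{W}. \]

The contribution of $W^2/4$ is at most

\[ 4 \int_V^{W_0} \sqrt{\kappa_1 \frac{x}{W} \log W \cdot \frac{W^2}{4}}\, \frac{dW}{W} \leq 4\sqrt{\kappa_1} \cdot \sqrt{x W_0} \cdot \sqrt{\log W_0}; \]

the sum of this and (6.36) is at most

\[ \begin{aligned} & 4\sqrt{\kappa_1} \left( \sqrt{\frac{2q}{\phi(q)}} \left( \frac{x}{\sqrt{U}} - \sqrt{x W_0} \right) \sqrt{\frac{\log W_0}{\log \frac{8\theta}{3}}} + \sqrt{x W_0} \sqrt{\log W_0} \right) \\ &\leq \kappa_2 \cdot \sqrt{\frac{q}{\phi(q)}} \frac{x}{\sqrt{U}} \sqrt{\log W_0}, \end{aligned} \]

where we use the facts that $W_0 = \theta|\delta q|$ (by $W_0 > V$) and $\theta \geq 3e^2/8$, and where we recall that $\kappa_2 = 4\sqrt{\kappa_1}$.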

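The terms $Wx/2|\delta| q$ and $Wx/(16(1-\rho)Q)$ contribute at most

\[ \begin{aligned} & 4\sqrt{\kappa_1} \int_V^{\theta|\delta q|} \sqrt{\frac{x}{W} \cdot (\log W) W \left( \frac{x}{2|\delta q|} + \frac{x}{16(1-\rho)Q} \right)}\, \frac{dW}{W} \\ &\leq \kappa_2 x \left( \frac{1}{\sqrt{2|\delta| q}} + \frac{1}{4\sqrt{(1-\rho)Q}} \right) \int_V^{\theta|\delta q|} \sqrt{\log W}\, \frac{dW}{W} \\ &= \frac{2\kappa_2}{3} x \left( \frac{1}{\sqrt{2|\delta| q}} + \frac{1}{4\sqrt{(1-\rho)Q}} \right) \left( (\log \theta|\delta| q)^{3/2} - (\log V)^{3/2} \right). \end{aligned} \]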

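The term $x/8(1-\rho)$ contributes

\[ \begin{aligned} \sqrt{2\kappa_1}\, x \int_V^{\theta|\delta q|} \sqrt{\frac{\log W}{W(1-\rho)}}\, \frac{dW}{W} &\leq \frac{\sqrt{2\kappa_1}\, x}{\sqrt{1-\rho}} \int_V^\infty \frac{\sqrt{\log W}}{W^{3/2}}\, dW \\ &\leq \frac{\kappa_2 x}{\sqrt{2(1-\rho)V}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right), \end{aligned} \]

where we use the estimate

\[ \begin{aligned} \int_V^\infty \frac{\sqrt{\log W}}{W^{3/2}}\, dW &= \frac{1}{\sqrt{V}} \int_1^\infty \frac{\sqrt{\log u + \log V}}{u^{3/2}}\, du \\ &\leq \frac{1}{\sqrt{V}} \int_1^\infty \frac{\sqrt{\log V}}{u^{3/2}}\, du + \frac{1}{\sqrt{V}} \int_1^\infty \frac{1}{2\sqrt{\log V}} \frac{\log u}{u^{3/2}}\, du \\ &= 2 \frac{\sqrt{\log V}}{\sqrt{V}} + \frac{1}{2\sqrt{V \log V}} \cdot 4 \leq \frac{2}{\sqrt{V}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right). \end{aligned} \]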

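It is time to collect all type II terms. Let us start with the case of general $\delta$. We will set $\theta \geq e$ later. If $q \leq V/2\theta$, then $|S_{II}|$ is at most

\[ \begin{aligned} & \frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{\left( \log \frac{x}{UV} + \log 2q \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) \right) \left( \kappa_6 \log \frac{x}{UV} + 2\kappa_7 \right)} \\ & + \sqrt{2} \kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} + \kappa_9 \frac{x}{\sqrt{V}}. \end{aligned} \tag{6.37} \]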

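If $V/2\theta < q \leq x/2\theta U$, then $|S_{II}|$ is at most

\[ \begin{aligned} & \frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{\left( \log \frac{x}{U \cdot 2\theta q} + \log 2q \log \frac{\log x/2Uq}{\log \theta} \right) \left( \kappa_6 \log \frac{x}{U \cdot 2\theta q} + 2\kappa_7 \right)} \\ & + \sqrt{2} \kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} + \left( \kappa_2 \sqrt{\log 2\theta q} + \kappa_9 \right) \frac{x}{\sqrt{V}} \\ & + \frac{\kappa_2}{6} \left( (\log 2\theta q)^{3/2} - (\log V)^{3/2} \right) \frac{x}{\sqrt{q}} \\ & + \kappa_2 \left( \sqrt{2\theta \cdot \log 2\theta q} + \frac{2}{3} \left( (\log 2\theta q)^{3/2} - (\log V)^{3/2} \right) \right) \sqrt{qx}, \end{aligned} \tag{6.38} \]

where we use the fact that $Q \geq x/U$ (implying that $\rho_0 = \max(1, 2q/Q)$ equals $1$ for $q \leq x/2U$). Finally, if $q > x/2\theta U$,

\[ \begin{aligned} |S_{II}| \leq{} & \left( \kappa_2 \sqrt{2 \log x/U} + \kappa_9 \right) \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{\log x/U}\, \frac{x}{\sqrt{U}} \\ & + \frac{2\kappa_2}{3} \left( (\log x/U)^{3/2} - (\log V)^{3/2} \right) \left( \frac{x}{2\sqrt{2q}} + \sqrt{qx} \right). \end{aligned} \tag{6.39} \]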

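Now let us examine the alternative bounds for $|\delta| \geq 8$. Here we assume $\theta \geq 3e^2/8$. If $|\delta q| \leq V/\theta$, then $|S_{II}|$ is at most

\[ \begin{aligned} & \frac{2x}{\sqrt{|\delta| \phi(q)}} \sqrt{\log \frac{x}{UV} + \log \frac{|\delta q|(1+\epsilon_1)}{4} \log\left( 1 + \frac{\log x/UV}{\log \frac{4V}{|\delta|(1+\epsilon_1) q}} \right)} \cdot \sqrt{\kappa_6 \log \frac{x}{UV} + 2\kappa_7} \\ & + \kappa_2 \sqrt{\frac{2q}{\phi(q)}} \cdot \sqrt{\frac{\log V}{\log 2V/|\delta q|}} \cdot \frac{x}{\sqrt{U}} + \kappa_9 \frac{x}{\sqrt{V}}, \end{aligned} \tag{6.40} \]

where $\epsilon_1 = x/2UQ$. If $V/\theta < |\delta| q \leq x/\theta U$, then $|S_{II}|$ is at most

\[ \begin{aligned} & \frac{2x}{\sqrt{|\delta| \phi(q)}} \sqrt{\left( \log \frac{x}{U \cdot \theta|\delta| q} + \log \frac{3|\delta q|}{8} \log \frac{\log \frac{8x}{3U|\delta q|}}{\log 8\theta/3} \right) \left( \kappa_6 \log \frac{x}{U \cdot \theta|\delta q|} + 2\kappa_7 \right)} \\ & + \frac{2\kappa_2}{3} \left( \frac{x}{\sqrt{2|\delta q|}} + \frac{x}{4\sqrt{Q-q}} \right) \left( (\log \theta|\delta q|)^{3/2} - (\log V)^{3/2} \right) \\ & + \left( \frac{\kappa_2}{\sqrt{2(1-\rho)}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right) + \kappa_9 \right) \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{\frac{q}{\phi(q)}} \cdot \sqrt{\log \theta|\delta q|} \cdot \frac{x}{\sqrt{U}}, \end{aligned} \tag{6.41} \]

where $\rho = q/Q$. (Note that $|\delta| \leq x/Qq$ implies $\rho \leq x/Q^2$, and so $\rho$ will be very small and $Q - q$ will be very close to $Q$.)

The case $|\delta q| > x/\theta U$ will not arise in practice, essentially because of $|\delta| q \leq x/Q$.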

6.3 Adjusting parameters. Calculations

We must bound the exponential sum $\sum_n \Lambda(n) e(\alpha n) \eta(n/x)$. By (3.8), it is enough to sum the bounds we obtained in §6.2. We will now see how it will be best to set $U$, $V$ and other parameters.

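Usually, the largest terms will be

\[ C_0 UV, \tag{6.42} \]

where $C_0$ equals

\[ \begin{cases} c_{4,I,2} + c_{9,I,2} = 4.39779 + 5.21993\epsilon_0 & \text{if } |\delta| \leq 1/2c_2 \sim 0.74463, \\ c_{4,I,2} + (1+\epsilon) c_{13,I,2} = (4.89106 + 1.31541\epsilon)(1+\epsilon_0) & \text{if } |\delta| > 1/2c_2 \end{cases} \tag{6.43} \]

(from (6.13) and (6.14), type I; we will specify $\epsilon$ and $\epsilon_0 = (4\log 2)/(x/UV)$ later) and

\[ \frac{x}{\sqrt{\delta_0 \phi(q)}} \sqrt{\log \frac{x}{UV} + (\log \delta_0(1+\epsilon_1) q) \log\left( 1 + \frac{\log x/UV}{\log \frac{V}{\delta_0(1+\epsilon_1) q}} \right)} \cdot \sqrt{\kappa_6 \log \frac{x}{UV} + 2\kappa_7} \tag{6.44} \]

(from (6.37) and (6.40), type II; here $\delta_0 = \max(2, |\delta|/4)$, while $\epsilon_1 = x/2UQ$ for $|\delta| > 8$ and $\epsilon_1 = 0$ for $|\delta| < 8$).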

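We set $UV = \kappa x/\sqrt{q\delta_0}$; we must choose $\kappa > 0$.

Let us first optimize (or, rather, almost optimize) $\kappa$ in the case $|\delta| \leq 4$, so that $\delta_0 = 2$ and $\epsilon_1 = 0$. For the purpose of choosing $\kappa$, we replace $\sqrt{\phi(q)}$ by $\sqrt{q}/C_1$, where $C_1 = 2.3536 \sim \sqrt{510510/\phi(510510)}$, and also replace $V$ by $q^2/c$, $c$ a constant. We use the approximation

\[ \begin{aligned} \log\left( 1 + \frac{\log x/UV}{\log V/|2q|} \right) &= \log\left( 1 + \frac{\log(\sqrt{2q}/\kappa)}{\log(q/2c)} \right) \\ &= \log\left( \frac{3}{2} + \frac{\log 2\sqrt{c}/\kappa}{\log q/2c} \right) \sim \log \frac{3}{2} + \frac{2 \log 2\sqrt{c}/\kappa}{3 \log q/2c}. \end{aligned} \]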

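What we must minimize, then, is

\[ \begin{aligned} & \frac{C_0 \kappa}{\sqrt{2q}} + \frac{C_1}{\sqrt{2q}} \sqrt{\left( \log \frac{\sqrt{2q}}{\kappa} + \log 2q \left( \log \frac{3}{2} + \frac{2 \log \frac{2\sqrt{c}}{\kappa}}{3 \log \frac{q}{2c}} \right) \right) \left( \kappa_6 \log \frac{\sqrt{2q}}{\kappa} + 2\kappa_7 \right)} \\ &\leq \frac{C_0 \kappa}{\sqrt{2q}} + \frac{C_1}{2\sqrt{q}} \frac{\sqrt{\kappa_6}}{\sqrt{\kappa_1'}} \sqrt{\kappa_1' \log q - \left( \frac{5}{3} + \frac{2}{3} \frac{\log 4c}{\log \frac{q}{2c}} \right) \log \kappa + \kappa_2'} \cdot \sqrt{\kappa_1' \log q - 2\kappa_1' \log \kappa + \frac{4\kappa_1' \kappa_7}{\kappa_6} + \kappa_1' \log 2} \\ &\leq \frac{C_0}{\sqrt{2q}} \left( \kappa + \kappa_4' \left( \kappa_1' \log q - \left( \left( \frac{5}{6} + \kappa_1' \right) + \frac{1}{3} \frac{\log 4c}{\log \frac{q}{2c}} \right) \log \kappa + \kappa_3' \right) \right), \end{aligned} \tag{6.45} \]

where

\[ \begin{aligned} \kappa_1' &= \frac{1}{2} + \log \frac{3}{2}, \qquad \kappa_2' = \log \sqrt{2} + \log 2 \log \frac{3}{2} + \frac{\log 4c \log 2q}{3 \log \frac{q}{2c}}, \\ \kappa_3' &= \frac{1}{2} \left( \kappa_2' + \frac{4\kappa_1' \kappa_7}{\kappa_6} + \kappa_1' \log 2 \right) = \frac{\log 4c}{6} + \frac{(\log 4c)^2}{6 \log \frac{q}{2c}} + \kappa_5', \\ \kappa_4' &= \frac{C_1}{C_0} \sqrt{\frac{\kappa_6}{2\kappa_1'}} \sim \begin{cases} \frac{0.30915}{1 + 1.18694\epsilon_0} & \text{if } |\delta| \leq 4, \\ \frac{0.27797}{(1 + 0.26894\epsilon)(1+\epsilon_0)} & \text{if } |\delta| > 4, \end{cases} \\ \kappa_5' &= \frac{1}{2} \left( \log \sqrt{2} + \log 2 \log \frac{3}{2} + \frac{4\kappa_1' \kappa_7}{\kappa_6} + \kappa_1' \log 2 \right) \sim 1.01152. \end{aligned} \]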

Taking derivatives, we see that the minimum is attained when

\[ \kappa = \left( \frac{5}{6} + \kappa_1' + \frac{1}{3} \frac{\log 4c}{\log \frac{q}{2c}} \right) \kappa_4' \sim \left( 1.7388 + \frac{\log 4c}{3 \log \frac{q}{2c}} \right) \cdot \frac{0.30915}{1 + 1.19\epsilon_0} \tag{6.46} \]

provided that $|\delta| \leq 4$. (What we obtain for $|\delta| > 4$ is essentially the same, only with $\delta_0 q = \delta q/4$ instead of $2q$ and $0.27797/((1 + 0.27\epsilon)(1 + \epsilon_0))$ in place of $0.30915$.) For $q = 5\cdot 10^5$, $c = 2.5$ and $|\delta| \leq 4$ (typical values in the most delicate range), we get that $\kappa$ should be about $0.5582/(1 + 1.19\epsilon_0)$. Values of $q$, $c$ nearby give similar values for $\kappa$, whether $|\delta| \leq 4$ or for $|\delta| > 4$.

(Incidentally, at this point, we could already give a back-of-the-envelope estimate for the last line of (6.45), i.e., our main term. It suggests that choosing $w = 1$ instead of $w = 2$ would have given bounds worse by about 15 percent.)

We make the choices

\[ \kappa = 1/2, \quad \text{and so} \quad UV = \frac{x}{2\sqrt{q\delta_0}}, \]

for the sake of simplicity. (Unsurprisingly, (6.45) changes very slowly around its minimum.) Note, by the way, that this means that $\epsilon_0 = (2\log 2)/\sqrt{q\delta_0}$.

Now we must decide how to choose $U$, $V$ and $Q$, given our choice of $UV$. We will actually make two sets of choices.

First, we will use the $S_{I,2}$ estimates for $q \leq Q/V$ to treat all $\alpha$ of the form $\alpha = a/q + O^*(1/qQ)$, $q \leq y$. (Here $y$ is a parameter satisfying $y \leq Q/V$.)

Then, the remaining $\alpha$ will get treated with the (coarser) $S_{I,2}$ estimate for $q > Q/V$, with $Q$ reset to a lower value (call it $Q'$). If $\alpha$ was not treated in the first go (so that it must be dealt with the coarser estimate) then $\alpha = a'/q' + \delta'/x$, where either $q' > y$ or $\delta' q' > x/Q$. (Otherwise, $\alpha = a'/q' + O^*(1/q'Q)$ would be a valid estimate with $q' \leq y$.) The value of $Q'$ is set to be smaller than $Q$ both because this is helpful (it diminishes error terms that would be large for large $q$) and because this is harmless (since we are no longer assuming that $q \leq Q/V$).

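6.3.1 First choice of parameters: $q \leq y$

The largest items affected strongly by our choices at this point are

\[ \begin{aligned} & c_{16,I,2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+ \frac{2UV|\delta| q}{x} \right) \frac{x}{Q/V} + c_{17,I,2}\, Q && \text{(from $S_{I,2}$, $|\delta| > 1/2c_2$)}, \\ & \left( c_{10,I,2} \log \frac{U}{q} + 2c_{5,I,2} + c_{12,I,2} \right) Q && \text{(from $S_{I,2}$, $|\delta| \leq 1/2c_2$)}, \end{aligned} \tag{6.47} \]

and

\[ \kappa_2 \sqrt{\frac{2q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} + \kappa_9 \frac{x}{\sqrt{V}} \qquad \text{(from $S_{II}$, any $\delta$)}, \tag{6.48} \]

with

\[ \kappa_2 \sqrt{\frac{2q}{\phi(q)}} \cdot \sqrt{\frac{\log V}{\log 2V/|\delta q|}} \cdot \frac{x}{\sqrt{U}} \qquad \text{(from $S_{II}$)} \]

as an alternative to (6.48) for $|\delta| \geq 8$. (In several of these expressions, we are applying some minor simplifications that our later choices will justify. Of course, even if these simplifications were not justified, we would not be getting incorrect results, only potentially suboptimal ones; we are trying to decide how to choose certain parameters.)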

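In addition, we have a relatively mild but important dependence on $V$ in the main term (6.44), even when we hold $UV$ constant (as we do, in so far as we have already chosen $UV$). We must also respect the condition $q \leq Q/V$, the lower bound on $U$ given by (6.17) and the assumptions made at the beginning of the chapter (e.g. $Q \geq x/U$, $V \geq 2\cdot 10^6$). Recall that $UV = x/2\sqrt{q\delta_0}$.

We set

\[ Q = \frac{x}{8y}, \]

since we will then have not just $q \leq y$ but also $q|\delta| \leq x/Q = 8y$, and so $q\delta_0 \leq 2y$. We want $q \leq Q/V$ to be true whenever $q \leq y$; this means that

\[ q \leq \frac{Q}{V} = \frac{QU}{UV} = \frac{QU}{x/2\sqrt{q\delta_0}} = \frac{U\sqrt{q\delta_0}}{4y} \]

must be true when $q \leq y$, and so it is enough to set $U = 4y^2/\sqrt{q\delta_0}$. The following choices make sense: we will work with the parameters

\[ \begin{aligned} y &= \frac{x^{1/3}}{6}, \qquad Q = \frac{x}{8y} = \frac{3}{4} x^{2/3}, \qquad x/UV = 2\sqrt{q\delta_0} \leq 2\sqrt{2y}, \\ U &= \frac{4y^2}{\sqrt{q\delta_0}} = \frac{x^{2/3}}{9\sqrt{q\delta_0}}, \qquad V = \frac{x}{(x/UV) \cdot U} = \frac{x}{8y^2} = \frac{9x^{1/3}}{2}, \end{aligned} \tag{6.49} \]

where, as before, $\delta_0 = \max(2, |\delta|/4)$. So, for instance, we obtain $\epsilon_1 \leq x/2UQ = 6\sqrt{q\delta_0}/x^{1/3} \leq 2\sqrt{3}/x^{1/6}$. Assuming

\[ x \geq 2.16\cdot 10^{20}, \tag{6.50} \]

we obtain that $U/(x/UV) \geq (x^{2/3}/9\sqrt{q\delta_0})/(2\sqrt{q\delta_0}) = x^{2/3}/18q\delta_0 \geq x^{1/3}/6 \geq 10^6$, and so (6.17) holds. We also get that $\epsilon_1 \leq 0.002$.

Since $V = x/8y^2 = (9/2)x^{1/3}$, (6.50) also implies that $V \geq 2\cdot 10^6$ (in fact, $V \geq 27\cdot 10^6$). It is easy to check that

\[ V < x/4, \qquad UV \leq x, \qquad Q \geq \max(16, 2\sqrt{x}), \qquad Q \geq \max(2U, x/U), \tag{6.51} \]

as stated at the beginning of the chapter. Let $\theta = (3/2)^3 = 27/8$. Then

\[ \begin{aligned} \frac{V}{2\theta q} &= \frac{x/8y^2}{2\theta q} \geq \frac{x}{16\theta y^3} = \frac{x}{54y^3} = 4 > 1, \\ \frac{V}{\theta|\delta q|} &\geq \frac{x/8y^2}{8\theta y} \geq \frac{x}{64\theta y^3} = \frac{x}{216y^3} = 1. \end{aligned} \tag{6.52} \]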
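The relations (6.49)-(6.52) are easy to confirm at the threshold $x = 2.16\cdot 10^{20}$, where $x^{1/3} = 6\cdot 10^6$ exactly. The following sketch uses exact rational arithmetic where possible; the function names are hypothetical, and the checks on $U$ are done in floating point at the two extreme values $q\delta_0 = 2$ and $q\delta_0 = 2y$.

```python
from fractions import Fraction
import math

X13 = 6 * 10**6      # x^(1/3) for x = 2.16e20
X = X13**3           # exactly 216 * 10^18
THETA = Fraction(27, 8)

def first_choice(x13):
    """Return (y, Q, V) of (6.49) as exact fractions, given x^(1/3)."""
    x = Fraction(x13) ** 3
    y = Fraction(x13, 6)
    Q = x / (8 * y)
    V = x / (8 * y**2)
    return y, Q, V

def u_of(x13, q_delta0):
    """U = x^(2/3) / (9 sqrt(q*delta_0)) as a float."""
    return float(x13) ** 2 / (9 * math.sqrt(q_delta0))

if __name__ == "__main__":
    y, Q, V = first_choice(X13)
    print("y =", y, " Q =", Q, " V =", V)
    for qd in (2.0, float(2 * y)):
        U = u_of(X13, qd)
        # U/(x/UV) should be at least 5*10^5, cf. (6.17); eps_1 <= 0.002
        print(qd, U / (2 * math.sqrt(qd)), 6 * math.sqrt(qd) / X13)
```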

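The first type I bound is

\[ \begin{aligned} |S_{I,1}| \leq{} & \frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \left( \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+ \frac{x^{2/3}/9}{q^{5/2} \delta_0^{1/2}}}, 1 \right) \left( \log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I} \right) + \frac{c_{4,I}\, q}{\phi(q)} \right) \\ & + \left( c_{7,I} \log \frac{y}{c_2} + c_{8,I} \log x \right) y + \frac{c_{10,I}\, x^{1/3}}{3^4 2^2 q^{3/2} \delta_0^{1/2}} \left( \log 9x^{1/3}\sqrt{e q\delta_0} \right) \\ & + \left( c_{5,I} \log \frac{2x^{2/3}}{9c_2\sqrt{q\delta_0}} + c_{6,I} \log \frac{x^{5/3}}{9\sqrt{q\delta_0}} \right) \frac{x^{2/3}}{9\sqrt{q\delta_0}} + c_{9,I} \sqrt{x} \log \frac{2\sqrt{e} x}{c_2} + \frac{c_{10,I}}{e}, \end{aligned} \tag{6.53} \]

where the constants are as in §6.2.1. For any $c, R \geq 1$, the function

\[ x \mapsto (\log cx)/(\log x/R) \]

attains its maximum on $[R', \infty]$, $R' > R$, at $x = R'$. Hence, for $q\delta_0$ fixed,

\[ \min\left( \frac{4/5}{\log^+ \frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1 \right) \left( \log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I} \right) \tag{6.54} \]

attains its maximum for $x \in [(9e^{4/5}(\delta_0 q)^{5/2}/4)^{3/2}, \infty)$ at

\[ x = \left( 9e^{4/5}(\delta_0 q)^{5/2}/4 \right)^{3/2} = (27/8) e^{6/5} (q\delta_0)^{15/4}. \tag{6.55} \]

Now, notice that, for smaller values of $x$, (6.54) increases as $x$ increases, since the term $\min(\dots, 1)$ equals the constant $1$. Hence, (6.54) attains its maximum for $x \in (0, \infty)$ at (6.55), and so

\[ \begin{aligned} & \min\left( \frac{4/5}{\log^+ \frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1 \right) \left( \log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I} \right) + c_{4,I} \\ &\leq \log \frac{27}{2} e^{2/5} (\delta_0 q)^{7/4} + c_{3,I} + c_{4,I} \leq \frac{7}{4} \log \delta_0 q + 6.11676. \end{aligned} \]

Examining the other terms in (6.53) and using (6.50), we conclude that

\[ \begin{aligned} |S_{I,1}| \leq{} & \frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \cdot \frac{q}{\phi(q)} \left( \frac{7}{4} \log \delta_0 q + 6.11676 \right) \\ & + \frac{x^{2/3}}{\sqrt{q\delta_0}} (0.67845 \log x - 1.20818) + 0.37864 x^{2/3}, \end{aligned} \tag{6.56} \]

where we are using (6.50) (and, of course, the trivial bound $\delta_0 q \geq 2$) to simplify the smaller error terms. We recall that $c_0' = 0.798437 > c_0/(2\pi)^2$.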

      Let us now consider SI2 The terms that appear both for |δ| small and |δ| large aregiven in (612) The second line in (612) equals

      c8I2

      (x

      4q2δ0+

      2UV 2

      x+qV 2

      x

      )+c10I

      2

      (q

      2radicqδ0

      +x23

      18qδ0

      )log

      9x13

      2

      le c8I2(

      x

      4q2δ0+

      9x13

      2radic

      2+

      27

      8

      )+c10I

      2

      (y16

      232+

      x23

      18qδ0

      )(1

      3log x+ log

      9

      2

      )le 029315

      x

      q2δ0+ (008679 log x+ 039161)

      x23

      qδ0+ 000153

      radicx

where we are using (6.50) to simplify. Now

\[ \min\left( \frac{4/5}{\log^+ \frac{Q}{4Vq^2}}, 1 \right) \log Vq = \min\left( \frac{4/5}{\log^+ \frac{y}{4q^2}}, 1 \right) \log \frac{9x^{1/3} q}{2} \tag{6.57} \]

can be bounded trivially by $\log(9x^{1/3}q/2) \leq (2/3)\log x + \log 3/4$. We can also bound (6.57) as we bounded (6.54) before, namely, by fixing $q$ and finding the maximum for $x$ variable. In this way, we obtain that (6.57) is maximal for $y = 4e^{4/5}q^2$; since, by definition, $x^{1/3}/6 = y$, (6.57) then equals

\[ \log \frac{9(6\cdot 4e^{4/5}q^2) q}{2} = 3\log q + \log 108 + \frac{4}{5} \leq 3\log q + 5.48214. \]

We conclude that (6.12) is at most

\[ \min\left( 1, \frac{4c_0'}{\delta^2} \right) \cdot \left( \frac{3}{2} \log q + 2.74107 \right) \frac{x}{\phi(q)} + 0.29315 \frac{x}{q^2\delta_0} + (0.0434 \log x + 0.1959) x^{2/3}. \tag{6.58} \]

      If |δ| le 12c2 we must consider (613) This is at most

      (c4I2 + c9I2)x

      2radicqδ0

      + (c10I2 logx23

      9q32radicδ0

      + 2c5I2 + c12I2) middot 3

      4x23

      le 21989xradicqδ0

      +361818x

      qδ0+ (177019 log x+ 292955)x23

      where we recall that ε0 = (4 log 2)(xUV ) = (2 log 2)radicqδ0 which can be bounded

      crudely byradic

      2 log 2 (Thus c10I2 leradic

      1 +radic

      8 log 2middot178783 lt 354037 and c12I2 le293333 + 11902

      radic2 log 2 le 410004)

      If |δ| gt 12c2 we must consider (614) instead For ε = 007 that is at most

      (c4I2 + (1 + ε)c13I2)x

      2radicqδ0

      (1 +

      2 log 2radicqδ0

      )+ (338845

      (1 +

      2 log 2radicqδ0

      )log δq3 + 208823)

      x

      |δ|q

      +

      (688133

      (1 +

      4 log 2radicqδ0

      )log |δ|q + 720828

      )x23 + 604141x13

      = 249157xradicqδ0

      (1 +

      2 log 2radicqδ0

      )+ (338845 log δq3 + 326771)

      x

      |δ|q

      +

      (229378 log x+ 190791

      log |δ|qradicqδ0

      + 130691

      )x

      23

      le 249157xradicqδ0

      + (359676 log δ0 + 273032 log q + 912218)x

      qδ0

      + (229378 log x+ 411228)x23

      where besides the crude bound ε0 leradic

      2 log 2 we use the inequalities

      log |δ|qradicqδ0

      le log 4qδ0radicqδ0

      le log 8radic2

      log qradicqδ0le 1radic

      2

      log qradicqle 1radic

      2

      log e2

      e=

      radic2

      e

      1

      |δ|le 4c2

      δ0

      log |δ||δ|

      le 2

      e log 2middot log δ0

      δ0

      (Obviously 1|δ| le 4c2δ0 is based on the assumption |δ| gt 12c2 and on the inequal-ity 16c2 ge 1 The bound on (log |δ|)|δ| is based on the fact that (log t)t reaches itsmaximum at t = e and (log δ0)δ0 = (log 2)2 for |δ| le 8)

We sum (6.58) and whichever one of our bounds for (6.13) and (6.14) is greater (namely, the latter). We obtain that, for any $\delta$,

\[ \begin{aligned} |S_{I,2}| \leq{} & 2.49157 \frac{x}{\sqrt{q\delta_0}} + \min\left( 1, \frac{4c_0'}{\delta^2} \right) \cdot \left( \frac{3}{2} \log q + 2.74107 \right) \frac{x}{\phi(q)} \\ & + (3.59676 \log \delta_0 + 27.3032 \log q + 91.515) \frac{x}{q\delta_0} + (22.9812 \log x + 411.424) x^{2/3}, \end{aligned} \tag{6.59} \]

where we bound one of the lower-order terms in (6.58) by $x/q^2\delta_0 \leq x/q\delta_0$.

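For type II, we have to consider two cases: (a) $|\delta| < 8$, and (b) $|\delta| \geq 8$. Consider first $|\delta| < 8$. Then $\delta_0 = 2$. Recall that $\theta = 27/8$. We have $q \leq V/2\theta$ and $|\delta q| \leq V/\theta$ thanks to (6.52). We apply (6.37), and obtain that, for $|\delta| < 8$,

\[ \begin{aligned} |S_{II}| \leq{} & \frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{\frac{1}{2} \log 4q\delta_0 + \log 2q \log\left( 1 + \frac{\frac{1}{2} \log 4q\delta_0}{\log \frac{V}{2q}} \right)} \cdot \sqrt{0.30214 \log 4q\delta_0 + 0.2562} \\ & + 8.22088 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log \frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}}} \right) (q\delta_0)^{1/4} x^{2/3} + 1.84251 x^{5/6} \\ \leq{} & \frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{C_{x,2q} \log 2q + \frac{\log 8q}{2}} \cdot \sqrt{0.30214 \log 2q + 0.67506} \\ & + 16.406 \sqrt{\frac{q}{\phi(q)}} x^{3/4} + 1.84251 x^{5/6}, \end{aligned} \tag{6.60} \]

where we bound

\[ \frac{\log 2q}{\log \frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}} \leq \frac{\log \frac{x^{1/3}}{3}}{\log \frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} < \lim_{x\to\infty} \frac{\log \frac{x^{1/3}}{3}}{\log \frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} = 2, \]

and where we define

\[ C_{x,t} = \log\left( 1 + \frac{\log 4t}{2 \log \frac{9x^{1/3}}{2.004t}} \right) \]

for $0 < t < 9x^{1/3}/2$. (We have $2.004$ here instead of $2$ because we want a constant $\geq 2(1+\epsilon_1)$ in later occurrences of $C_{x,t}$, for reasons that will soon become clear.)

For purposes of later comparison, we remark that $16.404 \leq 1.57863 x^{4/5-3/4}$ for $x \geq 2.16\cdot 10^{20}$.

Consider now case (b), namely, $|\delta| \geq 8$. Then $\delta_0 = |\delta|/4$. By (6.52), $|\delta q| \leq V/\theta$.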


      Hence (640) gives us that

      |SII | le2xradic|δ|φ(q)

      middot

      radicradicradicradic1

      2log |δq|+ log

      |δq|(1 + ε1)

      4log

      (1 +

      log |δ|q2 log 18x13

      |δ|(1+ε1)q

      )middotradic

      030214 log |δ|q + 02562

      + 822088

      radicq

      φ(q)

      radicradicradicradic log 9x13

      2

      log 9x13

      |δq|

      middot (qδ0)14x23 + 184251x56

      le xradicδ0φ(q)

      radicCxδ0q log δ0(1 + ε1)q +

      log 4δ0q

      2

      radic030214 log δ0q + 067506

      + 179926

      radicq

      φ(q)x45 + 184251x56

      (661)since

      822088

      radicradicradicradic log 9x13

      2

      log 9x13

      |δq|

      middot (qδ0)14 le 822088

      radiclog 9x13

      2

      log 274

      middot (x133)14

      le 179926x45minus23

      for x ge 216 middot 1020 Clearly

      log δ0(1 + ε1)q = log δ0q + log(1 + ε1) le log δ0q + ε1

      By Lemma C22 qφ(q) le z(y) = z(x136) (since x ge 183) It is easy tocheck that x rarr

      radicz(x136)x45minus56 is decreasing for x ge 216 middot 1020 (in fact for

      183) Using (650) we conclude that 167718radicqφ(q)x45 le 089657x56 and by

      the way 16406radicqφ(q)x34 le 078663x56 This allows us to simplify the last lines

      of (660) and (661) We obtain that for δ arbitrary

      |SII | lexradicδ0φ(q)

      radicCxδ0q(log δ0q + ε1) +

      log 4δ0q

      2

      radic030214 log δ0q + 067506

      + 273908x56(662)

      It is time to sum up SI1 SI2 and SII The main terms come from the first lineof (662) and the first term of (659) Lesser-order terms can be dealt with roughlywe bound min(1 cprime0δ

      2) and min(1 4cprime0δ2) from above by 2δ0 (using the fact that

      cprime0 = 0798437 lt 16 which implies that 8δ gt 4cprime0δ2 for δ gt 8 of course for δ le 8

      we have min(1 4cprime0δ2) le 1 = 22 = 2δ0)

      63 ADJUSTING PARAMETERS CALCULATIONS 125

      The terms inversely proportional to q φ(q) or q2 thus add up to at most

      2x

      δ0qmiddot q

      φ(q)

      (7

      4log δ0q + 611676

      )+

      2x

      δ0φ(q)

      (3

      2log q + 274107

      )+ (359676 log δ0 + 273032 log q + 91515)

      x

      qδ0

      le 2x

      δ0φ(q)

      (13

      4log δ0q + 781811

      )+

      2x

      δ0q(136516 log δ0q + 375415)

      where for instance we bound (32) log q + 274107 by (32) log δ0q + 274107 minus(32) log 2

As for the other terms: we use the assumption $x \geq 2.16\cdot 10^{20}$ to bound $x^{2/3}$ and $x^{2/3}\log x$ by a small constant times $x^{5/6}$. We bound $x^{2/3}/\sqrt{q\delta_0}$ by $x^{2/3}/\sqrt{2}$ (in (6.56)). We obtain

\[\begin{aligned}
&\frac{x^{2/3}}{\sqrt{2}}\,(0.67845\log x - 1.20818) + 0.37864\,x^{2/3}\\
&\quad + (22.9812\log x + 411.424)\,x^{2/3} + 2.73908\,x^{5/6} \leq 3.35531\,x^{5/6}.
\end{aligned}\]
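The last inequality can be checked numerically at the endpoint. The sketch below divides the left-hand side by $x^{5/6}$ and evaluates the result at $x = 2.16\cdot 10^{20}$; since $x^{-1/6}$ and $x^{-1/6}\log x$ are decreasing in this range, the endpoint is the worst case. The function name is hypothetical.

```python
import math

def lower_order_coefficient(x):
    """Left-hand side of the inequality above, divided by x^{5/6}."""
    log_x = math.log(x)
    x23_terms = ((0.67845 * log_x - 1.20818) / math.sqrt(2.0)
                 + 0.37864
                 + (22.9812 * log_x + 411.424))
    return x23_terms * x ** (2.0 / 3.0 - 5.0 / 6.0) + 2.73908

x0 = 2.16e20
print(lower_order_coefficient(x0))     # to be compared with 3.35531
print(lower_order_coefficient(1e25))   # smaller, as expected for larger x
```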

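The sums $S_{0,\infty}$ and $S_{0,w}$ in (3.11) are $0$ (by (6.50) and the fact that $\eta_2(t) = 0$ for $t \leq 1/4$). We conclude that, for $q \leq y = x^{1/3}/6$, $x \geq 2.16\cdot 10^{20}$ and $\eta = \eta_2$ as in (3.4),

\[\begin{aligned}
|S_\eta(x,\alpha)| &\leq |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
&\leq \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\;\sqrt{0.30214\log\delta_0 q + 0.67506}\\
&\quad + \frac{2.49157\,x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}\,(13.6516\log\delta_0 q + 37.5415)\\
&\quad + 3.35531\,x^{5/6},
\end{aligned}\tag{6.63}\]

where

\[\delta_0 = \max(2, |\delta|/4),\qquad C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004\,t}}\right).\tag{6.64}\]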

Since $C_{x,t}$ is an increasing function as a function of $t$ (for $x$ fixed and $t \leq 9x^{1/3}/2.004$) and $\delta_0 q \leq 2y$, we see that $C_{x,t} \leq C_{x,2y}$. It is clear that $x \mapsto C_{x,t}$ (fixed $t$) is a decreasing function of $x$. For $x = 2.16\cdot 10^{20}$, $C_{x,2y} = 1.39942$.

6.3.2 Second choice of parameters

If, with the original choice of parameters, we obtained $q > y = x^{1/3}/6$, we now reset our parameters ($Q$, $U$ and $V$). Recall that, while the value of $q$ may now change (due to the change in $Q$), we will be able to assume that either $q > y$ or $|\delta q| > x/(x/8y) = 8y$.

      126 CHAPTER 6 MINOR-ARC TOTALS

We want $U/(x/UV) \geq 5\cdot 10^5$ (this is (6.17)). We also want $UV$ small. With this in mind, we let

\[V = \frac{x^{1/3}}{3},\qquad U = 500\sqrt{6}\,x^{1/3},\qquad Q = \frac{x}{U} = \frac{x^{2/3}}{500\sqrt{6}}.\tag{6.65}\]

Then (6.17) holds (as an equality). Since we are assuming (6.50), we have $V \geq 2\cdot 10^6$. It is easy to check that (6.50) also implies that $U \leq \sqrt{x}/2$ and $Q \geq 2\sqrt{x}$, and so the inequalities in (6.51) all hold.

Write $2\alpha = a/q + \delta/x$ for the new approximation; we must have either $q > y$ or $|\delta| > 8y/q$, since otherwise $a/q$ would already be a valid approximation under the first choice of parameters. Thus, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta| q > 8y$. Since now $V = 2y$, we have $q > V/2\theta$ in case (a) and $|\delta q| > V/\theta$ in case (b) for any $\theta \geq 1$. We set $\theta = 4$.

(Thanks to this choice of $\theta$, we have $|\delta q| \leq x/Q \leq x/\theta U$, as we commented at the end of §6.2.3; this will help us avoid some case-work later.)
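The claims about the choice (6.65) are easy to test numerically. The following sketch uses a hypothetical helper `second_choice` and evaluates the parameters at the endpoint $x = 2.16\cdot 10^{20}$ and at a larger $x$. Note that $U \leq \sqrt{x}/2$ is equivalent to $x^{1/6} \geq 1000\sqrt{6}$, that is, to $x \geq 216\cdot 10^{18}$, so this inequality is tight at the endpoint and is compared there only up to rounding.

```python
import math

def second_choice(x):
    """Parameters V, U, Q as in (6.65), together with y = x^{1/3}/6."""
    cube_root = x ** (1.0 / 3.0)
    V = cube_root / 3.0
    U = 500.0 * math.sqrt(6.0) * cube_root
    Q = x / U
    y = cube_root / 6.0
    return V, U, Q, y

x0 = 2.16e20
V, U, Q, y = second_choice(x0)
print(U * U * V / x0)             # U/(x/UV); should be 5e5 up to rounding
print(V, 2 * y)                   # V = 2y, about 2e6 at the endpoint
print(U / (math.sqrt(x0) / 2))    # ratio close to 1 at the endpoint

V1, U1, Q1, y1 = second_choice(1e22)
print(U1 <= math.sqrt(1e22) / 2, Q1 >= 2 * math.sqrt(1e22))
```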

      By (64)

      |SI1| lex

      qmin

      (1cprime0δ2

      )(log x23 minus log 500

      radic6 + c3I + c4I

      q

      φ(q)

      )+

      (c7I log

      Q

      c2+ c8I log x log c11I

      Q2

      x

      )Q+ c10I

      U2

      4xlog

      e12x23

      500radic

      6+c10I

      e

      +

      (c5I log

      1000radic

      6x13

      c2+ c6I log 500

      radic6x43

      )middot 500radic

      6x13 + c9Iradicx log

      2radicex

      c2

      le x

      qmin

      (1cprime0δ2

      )(2

      3log xminus 499944 + 100303

      q

      φ(q)

      )+

      289

      1000x23(log x)2

where we are bounding

\[\begin{aligned}
c_{7,I}\log\frac{Q}{c_2} + c_{8,I}\log x\,\log c_{11,I}\frac{Q^2}{x}
&= c_{8,I}(\log x)^2 - \left(c_{8,I}(\log 1500000 - \log c_{11,I}) - \frac{2}{3}c_{7,I}\right)\log x + c_{7,I}\log\frac{1}{500\sqrt{6}\,c_2}\\
&\leq c_{8,I}(\log x)^2 - 38\log x.
\end{aligned}\]

We are also using the assumption (6.50) repeatedly, in order to show that the sum of all lower-order terms is less than $(38\,c_{8,I}\log x)/(500\sqrt{6})$. Note that $c_{8,I}(\log x)^2 Q \leq 0.00289\,x^{2/3}(\log x)^2$.

We have $q/\phi(q) \leq z(Q)$ (where $z$ is as in (C.19)) and, since $Q > \sqrt{6}\cdot 12\cdot 10^9$ for $x \geq 2.16\cdot 10^{20}$,

\[1.00303\,z(Q) \leq 1.00303\left(e^\gamma\log\log Q + \frac{2.50637}{\log\log\sqrt{6}\cdot 12\cdot 10^9}\right) \leq 0.2359\log Q + 0.79 < 0.1573\log x.\]

(It is possible to give a much better estimation, but it is not worthwhile, since this will be a very minor term.) We have either $q > y$ or $q|\delta| > 8y$; if $q|\delta| > 8y$ but $q \leq y$, then $|\delta| \geq 8$, and so $c_0'/\delta^2 q < 1/8|\delta| q < 1/64y < 1/y$. Hence

\[\begin{aligned}
|S_{I,1}| &\leq \frac{x}{y}\left(\left(\frac{2}{3} + 0.1573\right)\log x\right) + 0.00289\,x^{2/3}(\log x)^2\\
&\leq 2.4719\,x^{2/3}\log x + 0.00289\,x^{2/3}(\log x)^2.
\end{aligned}\]

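We bound $|S_{I,2}|$ using Lemma 4.2.4. First we bound (4.50): this is at most

\[\begin{aligned}
&\frac{x}{2q}\min\left(1, \frac{4c_0'}{\delta^2}\right)\log\frac{x^{1/3}q}{3}\\
&\quad + c_0\left(\frac{1}{4} - \frac{1}{\pi^2}\right)\left(\frac{(UV)^2\log\frac{x^{1/3}}{3}}{2x} + \frac{3c_4}{2}\cdot\frac{500\sqrt{6}}{9} + \frac{(500\sqrt{6}\,x^{1/3} + 1)^2\,x^{1/3}\log x^{2/3}}{6x}\right),
\end{aligned}\]

where $c_4 = 1.03884$. We bound the second line of this using (6.50). As for the first line, we have either $q \geq y$ (and so the first line is at most $(x/2y)(\log x^{1/3}y/3)$) or $q < y$ and $4c_0'/\delta^2 q < 1/16y < 1/y$ (and so the same bound applies). Hence (4.50) is at most

\[3x^{2/3}\left(\frac{2}{3}\log x - \log 18\right) + 0.02017\,x^{2/3}\log x = 2.02017\,x^{2/3}\log x - 3(\log 18)\,x^{2/3}.\]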

Now we bound (4.51), which comes up when $|\delta| \leq 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, $c_0 = 31.521$ (and so $c_2 = 0.6714769\ldots$). Since $1/2c_2 < 8$, it follows that $q > y$ (the alternative $q \leq y$, $q|\delta| > 8y$ is impossible, since it implies $|\delta| > 8$). Then (4.51) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(UV\log\frac{UV}{\sqrt{e}} + Q\left(\sqrt{3}\log\frac{c_2x}{Q} + \frac{\log UV}{2}\log\frac{UV}{Q/2}\right)\right)\\
&+ \frac{3c_1}{2}\,\frac{x}{y}\log UV\log\frac{UV}{c_2x/y} + \frac{16\log 2}{\pi}\,Q\log\frac{c_0e^3Q^2}{4\pi\cdot 8\log 2\cdot x}\log\frac{Q}{2}\\
&+ \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(3c_2)^{1/2}\sqrt{x}\log x,
\end{aligned}\tag{6.66}\]

where $c_1 = 1.000189 > 1 + (8\log 2)/(2x/UV)$.

The first line of (6.66) is a linear combination of terms of the form $x^{2/3}\log Cx$, $C > 1$; using (6.50), we obtain that it is at most $1144.693\,x^{2/3}\log x$. (The main contribution comes from the first term.) Similarly, we can bound the first term in the second line by $33.0536\,x^{2/3}\log x$. Since $\log(c_0e^3Q^2/(4\pi\cdot 8\log 2\cdot x))\log Q/2$ is at most $\log x^{1/3}\log x^{2/3}$, the second term in the second line is at most $0.0006406\,x^{2/3}(\log x)^2$. The third line of (6.66) can be bounded easily by $0.0122\,x^{2/3}\log x$.

Hence (6.66) is at most

\[1177.76\,x^{2/3}\log x + 0.0006406\,x^{2/3}(\log x)^2.\]

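If $|\delta| > 1/2c_2$, then we know that $|\delta q| > \min(y/2c_2, 8y) = y/2c_2$. Thus (4.52) (with $\epsilon = 0.01$) is at most

\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\,UV\log\frac{UV}{\sqrt{e}}\\
&+ \frac{2.02\sqrt{c_0c_1}}{\pi}\left(\frac{x}{y/2c_2} + 1\right)\left((\sqrt{3.02} - 1)\log\frac{\frac{x}{y/2c_2} + 1}{\sqrt{2}} + \frac{1}{2}\log UV\log\frac{e^2UV}{\frac{x}{y/2c_2}}\right)\\
&+ \left(\frac{3c_1}{2}\left(\frac{1}{2} + \frac{3.03}{0.16}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x.
\end{aligned}\]

Again by (6.50), and in much the same way as before, this simplifies to

\[\leq (1144.66 + 15.107 + 68.523)\,x^{2/3}\log x + 29.136\,x^{1/2}(\log x)^2 \leq 1228.85\,x^{2/3}(\log x).\]

Hence, in total and for any $|\delta|$,

\[\begin{aligned}
|S_{I,2}| &\leq 2.02017\,x^{2/3}\log x + 1228.85\,x^{2/3}(\log x) + 0.0006406\,x^{2/3}(\log x)^2\\
&\leq 1230.9\,x^{2/3}(\log x) + 0.0006406\,x^{2/3}(\log x)^2.
\end{aligned}\]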

Now we must estimate $S_{II}$. As we said before, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta| q > 8y$. Recall that $\theta = 4$. In case (a), we have $q > x^{1/3}/6 = V/2 > V/2\theta$; thus, we can use (6.38), and obtain that, if $q \leq x/8U$, $|S_{II}|$ is at most

\[\begin{aligned}
&\frac{x\sqrt{z(q)}}{\sqrt{2q}}\sqrt{\left(\log\frac{x}{U\cdot 8q} + \log 2q\,\log\frac{\log x/(2Uq)}{\log 4}\right)\left(\kappa_6\log\frac{x}{U\cdot 8q} + 2\kappa_7\right)}\\
&+ \sqrt{2}\,\kappa_2\sqrt{z\left(\frac{x}{8U}\right)}\left(1 + 1.15\sqrt{\frac{\log x/4U}{\log 4}}\right)\frac{x}{\sqrt{U}} + \left(\kappa_2\sqrt{\log x/U} + \kappa_9\right)\frac{x}{\sqrt{V}}\\
&+ \frac{\kappa_2}{6}\left((\log 8y)^{3/2} - (\log 2y)^{3/2}\right)\frac{x}{\sqrt{y}} + \kappa_2\left(\sqrt{8\log x/U} + \frac{2}{3}\left((\log x/U)^{3/2} - (\log V)^{3/2}\right)\right)\frac{x}{\sqrt{8U}},
\end{aligned}\tag{6.67}\]

where $z$ is as in (C.19). (We are already simplifying the third line; the bound given is justified by a derivative test.) It is easy to check that $q \to (\log 2q)(\log\log q)/q$ is decreasing for $q \geq y$ (indeed for $q \geq 9$), and so the first line of (6.67) is maximal for $q = y$.


      We can thus bound (667) by x56 timesradic3z(et36)

      (t

      3minus log 8c+

      (t

      3minus log 3

      )log

      t3 minus log 2c

      log 4

      )(κ6

      3tminus 4214

      )+

      radic2κ2radic6c

      radicz(e2t3

      48c

      )1 + 115

      radic23 tminus log 24c

      log 4

      +

      (κ2

      radic2t

      3minus log 6c+ κ9

      )radic

      3

      +κ2radic

      6

      ((t

      3+ log

      8

      6

      ) 32

      minus(t

      3+ log

      2

      6

      ) 32

      )

      +κ2radic48c

      (radic8

      (2t

      3minus log 6c

      )+

      2

      3

      ((2t

      3minus log 6c

      ) 32

      minus(t

      3minus log 3

      ) 32

      ))(668)

where $t = \log x$ and $c = 500/\sqrt{6}$. Asymptotically, the largest term in (6.67) comes from the last line (of order $t^{3/2}$), even if the first line is larger in practice (while being of order at most $t\log t$). Let us bound (6.68) by a multiple of $t^{3/2}$.

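First of all, notice that

\[\begin{aligned}
\frac{d}{dt}\,\frac{z(e^{t/3}/6)}{\log t}
&= \frac{\left(e^\gamma\log\left(\frac{t}{3} - \log 6\right) + \frac{2.50637}{\log(\frac{t}{3} - \log 6)}\right)'}{\log t} - \frac{z(e^{t/3}/6)}{t(\log t)^2}\\
&= \frac{e^\gamma - \frac{2.50637}{\log^2(\frac{t}{3} - \log 6)}}{(t - 3\log 6)\log t} - \frac{e^\gamma + \frac{2.50637}{\log^2(\frac{t}{3} - \log 6)}}{t\log t}\cdot\frac{\log\left(\frac{t}{3} - \log 6\right)}{\log t},
\end{aligned}\tag{6.69}\]

which, for $t \geq 100$, is

\[> \frac{e^\gamma\log 3 - \frac{2\cdot 2.50637\,\log t}{\log^2(\frac{t}{3} - \log 6)}}{t(\log t)^2} \geq \frac{1.95671 - \frac{8.92482}{\log t}}{t(\log t)^2} > 0.\]

Similarly, for $t \geq 2000$,

\[\frac{d}{dt}\,\frac{z(e^{2t/3}/48c)}{\log t} > \frac{e^\gamma\log\frac{3}{2} - \frac{2.50637\,\log t}{\log^2(\frac{2t}{3} - \log 48c)} - \frac{2.50637}{\log(\frac{2t}{3} - \log 48c)}}{t(\log t)^2} \geq \frac{0.72216 - \frac{5.45234}{\log t}}{t(\log t)^2} > 0.\]

Thus,

\[\begin{aligned}
z\left(\frac{e^{t/3}}{6}\right) &\leq (\log t)\cdot\lim_{s\to\infty}\frac{z(e^{s/3}/6)}{\log s} = e^\gamma\log t &&\text{for } t \geq 100,\\
z\left(\frac{e^{2t/3}}{48c}\right) &\leq (\log t)\cdot\lim_{s\to\infty}\frac{z(e^{2s/3}/48c)}{\log s} = e^\gamma\log t &&\text{for } t \geq 2000.
\end{aligned}\tag{6.70}\]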

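Also note that, since $(x^{3/2})' = (3/2)\sqrt{x}$,

\[\left(\frac{t}{3} + \log\frac{8}{6}\right)^{3/2} - \left(\frac{t}{3} + \log\frac{2}{6}\right)^{3/2} \leq \frac{3}{2}\sqrt{\frac{t}{3} + \log\frac{8}{6}}\cdot\log 4 \leq 1.20083\sqrt{t}\]

for $t \geq 2000$. We also have

\[\begin{aligned}
\left(\frac{2t}{3} - \log 6c\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}
&< \left(\frac{2t}{3} - \log 9\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}\\
&= (2^{3/2} - 1)\left(\frac{t}{3} - \log 3\right)^{3/2} < (2^{3/2} - 1)\,\frac{t^{3/2}}{3^{3/2}} \leq 0.35189\,t^{3/2}.
\end{aligned}\]

Of course,

\[\frac{t}{3} - \log 8c + \left(\frac{t}{3} - \log 3\right)\log\frac{\frac{t}{3} - \log 2c}{\log 4} < \left(\frac{t}{3} + \frac{t}{3}\log\frac{t}{3}\right) < \frac{t}{3}\log t.\]

We conclude that, for $t \geq 2000$, (6.68) is at most

\[\begin{aligned}
&\sqrt{3\cdot e^\gamma\log t\cdot\frac{t}{3}\log t\cdot\frac{\kappa_6}{3}\,t} + \frac{\sqrt{2}\,\kappa_2}{\sqrt{6c}}\sqrt{e^\gamma\log t}\left(1 + 0.79749\sqrt{t}\right)\\
&\quad + \left(\kappa_2\sqrt{\frac{2}{3}}\,t^{1/2} + \kappa_9\right)\sqrt{3} + \frac{\kappa_2}{\sqrt{6}}\cdot 1.2009\sqrt{t} + \frac{\kappa_2}{\sqrt{48c}}\left(\sqrt{\frac{16t}{3}} + \frac{2}{3}\cdot 0.35189\,t^{3/2}\right)\\
&\leq (0.10181 + 0.00012 + 0.00145 + 0.000048 + 0.00462)\,t^{3/2} \leq 0.10848\,t^{3/2}.
\end{aligned}\]

On the remaining interval $\log(2.16\cdot 10^{20}) \leq t \leq 2000$, we use interval arithmetic (as in §2.6, with 30 iterations) to bound the ratio of (6.68) to $t^{3/2}$. We obtain that (6.68) is at most

\[0.275964\,t^{3/2}.\]

Hence, for all $x \geq 2.16\cdot 10^{20}$,

\[|S_{II}| \leq 0.275964\,x^{5/6}(\log x)^{3/2}\tag{6.71}\]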

in the case $y < q \leq x/8U$.

If $x/8U < q \leq Q$, we use (6.39). In this range, $x/2\sqrt{2q} + \sqrt{qx}$ adopts its maximum at $q = Q$ (because $x/2\sqrt{2q}$ for $q = x/8U$ is smaller than $\sqrt{qx}$ for $q = Q$, by (6.65) and (6.50)). Hence, (6.39) is at most $x^{5/6}$ times

\[\begin{aligned}
&\left(\kappa_2\sqrt{2\left(\frac{2}{3}t - \log c'\right)} + \kappa_9\right)\sqrt{3} + \kappa_2\sqrt{\frac{2}{3}t - \log c'}\cdot\frac{1}{\sqrt{c'}}\\
&\quad + \frac{2\kappa_2}{3}\left(\left(\frac{2}{3}t - \log c'\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}\right)\left(\frac{\sqrt{c'}}{2\sqrt{2}}\,e^{-t/6} + \frac{1}{\sqrt{c'}}\right),
\end{aligned}\]

where $t = \log x$ (as before) and $c' = 500\sqrt{6}$. This is at most

\[(2\kappa_2 + \sqrt{3}\,\kappa_9)\sqrt{t} + \frac{\kappa_2}{\sqrt{c'}}\sqrt{\frac{2}{3}}\sqrt{t} + \frac{2\kappa_2}{3}\cdot\frac{2^{3/2} - 1}{3^{3/2}}\,t^{3/2}\left(\frac{\sqrt{c'}}{2\sqrt{2}}\,e^{-t/6} + \frac{1}{\sqrt{c'}}\right) \leq 0.10327\,t^{3/2}\]

for $t \geq \log(2.16\cdot 10^{20})$, and so

\[|S_{II}| \leq 0.10327\,x^{5/6}(\log x)^{3/2}\]

for $x/8U < q \leq Q$, using the assumption $x \geq 2.16\cdot 10^{20}$.

Finally, let us treat case (b), that is, $|\delta| > 8$ and $|\delta| q > 8y$; we can also assume $q \leq y$, as otherwise we are in case (a), which has already been treated. Since $|\delta/x| \leq 1/qQ$, we know that

\[|\delta q| \leq \frac{x}{Q} = U = 500\sqrt{6}\,x^{1/3} \leq \frac{x^{2/3}}{2000\sqrt{6}} = \frac{x}{4U} = \frac{x}{\theta U},\]

again under assumption (6.50). We apply (6.41), and obtain that $|S_{II}|$ is at most

      2xradicz(y)radic8y

      radic(log

      x

      U middot 4 middot 8y+ log 3y log

      log x3Uy

      log 323

      )(κ6 log

      x

      U middot 4 middot 8y+ 2κ7

      )+

      2κ2

      3

      (xradic16y

      ((log 32y)32 minus (log 2y)

      32 ) +

      x4radicQminus y

      ((log 4U)32 minus (log 2y)

      32 )

      )+

      (κ2radic

      2(1minus yQ)

      (radiclog V +

      radic1 log V

      )+ κ9

      )xradicV

      + κ2

      radicz(y) middot

      radiclog 4U middot xradic

      U

      (672)where we are using the facts that (log 3t8)t is increasing for t ge 8y gt 8e3 and that

      d

      dt

      (log t)32 minus (log V )32

      radict

      =3(log t)12 minus ((log t)32 minus (log V )32)

      2t32

      = minuslog t

      e3 middotradic

      log tminus (log V )32

      2t32lt 0

      for t ge θ middot 8y = 16V thanks to(log

      16V

      e3

      )2

      log 16V gt (log V )3 +

      (log 16minus 2 log

      e3

      16

      )(log V )2

      +

      ((log

      16

      e3

      )2

      minus 2 loge3

      16log 16

      )log V gt (log V )3


      (valid for log V ge 1) Much as before we can rewrite (672) as x56 times

      2radicz(et36)radic

      86

      radict

      3minus log 32c+

      (t

      3minus log 2

      )log

      t3 minus log 3c

      log 323

      middot

      radicκ6

      (t

      3minus log 32c

      )+ 2κ7 +

      2κ2

      3

      radic3

      8

      ((t

      3+ log

      32

      6

      ) 32

      minus(t

      3minus log 3

      ) 32

      )

      +2κ2

      3

      14radicet3

      6c minus16

      ((t

      3+ log 24c

      )32

      minus(t

      3minus log 3

      )32)

      +κ2

      radic3radic

      2(1minus c

      et3

      )(radic

      t3minus log 3 +1radic

      t3minus log 3

      )+ κ9

      radic3

      + κ2

      radicz(et36)

      radict3 + log 24c

      6c

      (673)where t = log x and c = 500

      radic6 For t ge 100 we use (670) to bound z(et36)

      and we obtain that (673) is at most

      2radiceγradic

      86

      radic1

      3middot κ6

      3middot (log t)t+

      2κ2

      3

      radic3

      8middot 1

      2

      (t

      3+ log

      32

      6

      )12

      middot log 16

      +2κ2

      3

      14radice1003

      6c minus 16

      middot 1

      2

      (t

      3+ log 24c

      )12

      middot log 72c

      +κ2

      radic3radic

      2(1minus c

      e1003

      )(radic

      t3 +1radict3

      )+ κ9

      radic3 + κ2

      radiceγ log t

      radict3 + log 24c

      6c

(6.74)

where we have bounded expressions of the form $a^{3/2} - b^{3/2}$ ($a > b$) by $(a^{1/2}/2)\cdot(a - b)$. The ratio of (6.74) to $t^{3/2}$ is clearly a decreasing function of $t$. For $t = 200$, this ratio is $0.23747\ldots$; hence, (6.74) (and thus (6.73)) is at most $0.23748\,t^{3/2}$ for $t \geq 200$.

On the range $\log(2.16\cdot 10^{20}) \leq t \leq 200$, the bisection method (with 25 iterations) gives that the ratio of (6.73) to $t^{3/2}$ is at most $0.23511$.

We conclude that, when $|\delta| > 8$ and $|\delta| q > 8y$,

\[|S_{II}| \leq 0.23511\,x^{5/6}(\log x)^{3/2}.\]

Thus (6.71) gives the worst case.

We now take totals, and obtain

\[\begin{aligned}
S_\eta(x,\alpha) &\leq |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
&\leq (2.4719 + 1230.9)\,x^{2/3}\log x + (0.00289 + 0.0006406)\,x^{2/3}(\log x)^2 + 0.275964\,x^{5/6}(\log x)^{3/2}\\
&\leq 0.27598\,x^{5/6}(\log x)^{3/2} + 1233.38\,x^{2/3}\log x,
\end{aligned}\tag{6.75}\]

where we use (6.50) yet again.
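The last step of (6.75) absorbs the $x^{2/3}(\log x)^2$ terms into the $x^{5/6}(\log x)^{3/2}$ term, using $x^{2/3}(\log x)^2 = x^{5/6}(\log x)^{3/2}\cdot(\log x)^{1/2}x^{-1/6}$. A minimal numerical sketch of that step, evaluated at the endpoint $x = 2.16\cdot 10^{20}$ (where $(\log x)^{1/2}x^{-1/6}$ is largest on the range considered), is given below; the function name is hypothetical.

```python
import math

def absorbed_coefficient(x):
    """Coefficient of x^{5/6} (log x)^{3/2} after absorbing the (log x)^2 terms."""
    t = math.log(x)
    return 0.275964 + (0.00289 + 0.0006406) * math.sqrt(t) / x ** (1.0 / 6.0)

x0 = 2.16e20
print(absorbed_coefficient(x0))   # to be compared with 0.27598
print(2.4719 + 1230.9)            # to be compared with 1233.38
```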

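6.4 Conclusion

Proof of Theorem 3.1.1. We have shown that $|S_\eta(\alpha, x)|$ is at most (6.63) for $q \leq x^{1/3}/6$ and at most (6.75) for $q > x^{1/3}/6$. It remains to simplify (6.63) slightly. By the geometric mean/arithmetic mean inequality,

\[\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\;\sqrt{0.30214\log\delta_0 q + 0.67506}\tag{6.76}\]

is at most

\[\frac{1}{2\sqrt{\rho}}\left(C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}\right) + \frac{\sqrt{\rho}}{2}\,(0.30214\log\delta_0 q + 0.67506)\]

for any $\rho > 0$. We recall that

\[C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004\,t}}\right).\]

Let

\[\rho = \frac{C_{x_1,2q_0}(\log 2q_0 + 0.002) + \frac{\log 8q_0}{2}}{0.30214\log 2q_0 + 0.67506} = 3.397962\ldots,\]

where $x_1 = 10^{25}$, $q_0 = 2\cdot 10^5$. (In other words, we are optimizing matters for $x = x_1$, $\delta_0 q = 2q_0$; the losses in nearby ranges will be very slight.) We obtain that (6.76) is at most

\[\begin{aligned}
&\frac{C_{x,\delta_0 q}}{2\sqrt{\rho}}(\log\delta_0 q + 0.002) + \left(\frac{1}{4\sqrt{\rho}} + \frac{\sqrt{\rho}\cdot 0.30214}{2}\right)\log\delta_0 q + \frac{1}{2}\left(\frac{\log 2}{\sqrt{\rho}} + \frac{\sqrt{\rho}}{2}\cdot 0.67506\right)\\
&\quad\leq 0.27125\,C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + 0.4141\log\delta_0 q + 0.49911.
\end{aligned}\tag{6.77}\]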
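The value of $\rho$ and the three constants on the right of (6.77) can be recomputed directly from the formulas above. The following is a minimal sketch; the function name `C` and the variable names are illustrative.

```python
import math

def C(x, t):
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^{1/3} / (2.004 t))))."""
    return math.log(1.0 + math.log(4.0 * t)
                    / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t))))

x1, q0 = 1e25, 2e5
t = 2.0 * q0   # delta_0 * q = 2 * q0, so that log(4t) = log(8 q0)

rho = ((C(x1, t) * (math.log(t) + 0.002) + math.log(4.0 * t) / 2.0)
       / (0.30214 * math.log(t) + 0.67506))
root = math.sqrt(rho)

coef_C = 1.0 / (2.0 * root)                                # compare with 0.27125
coef_log = 1.0 / (4.0 * root) + root * 0.30214 / 2.0       # compare with 0.4141
const = 0.5 * (math.log(2.0) / root + root / 2.0 * 0.67506)  # compare with 0.49911
print(rho, coef_C, coef_log, const)
```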

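Now, for $x \geq x_0 = 2.16\cdot 10^{20}$,

\[\frac{C_{x,t}}{\log t} \leq \frac{C_{x_0,t}}{\log t} = \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{54\cdot 10^6}{2.004\,t}}\right) \leq 0.08659\]

for $8 \leq t \leq 10^6$ (by the bisection method, with 20 iterations), and

\[\frac{C_{x,t}}{\log t} \leq \frac{C_{(6t)^3,t}}{\log t} \leq \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{9\cdot 6}{2.004}}\right) \leq 0.08659\]

if $10^6 < t \leq x^{1/3}/6$. Hence

\[0.27125\cdot C_{x,\delta_0 q}\cdot 0.002 \leq 0.000047\log\delta_0 q.\]

We conclude that, for $q \leq x^{1/3}/6$,

\[\begin{aligned}
|S_\eta(\alpha, x)| &\leq \frac{R_{x,\delta_0 q}\log\delta_0 q + 0.49911}{\sqrt{\phi(q)\delta_0}}\cdot x + \frac{2.492\,x}{\sqrt{q\delta_0}}\\
&\quad + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.82\right) + \frac{2x}{\delta_0 q}\,(13.66\log\delta_0 q + 37.55) + 3.36\,x^{5/6},
\end{aligned}\]

where

\[R_{x,t} = 0.27125\log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004\,t}}\right) + 0.41415.\]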

Part II

Major arcs

Chapter 7

Major arcs: overview and results

Our task, as in Part I, will be to estimate

\[S_\eta(\alpha, x) = \sum_n \Lambda(n)e(\alpha n)\eta(n/x),\tag{7.1}\]

where $\eta: \mathbb{R}^+ \to \mathbb{C}$ is a smooth function, $\Lambda$ is the von Mangoldt function and $e(t) = e^{2\pi it}$. Here, we will treat the case of $\alpha$ lying on the major arcs.

We will see how we can obtain good estimates by using smooth functions $\eta$ based on the Gaussian $e^{-t^2/2}$. This will involve proving new, fully explicit bounds for the Mellin transform of the twisted Gaussian, or, what is the same, bounds on parabolic cylinder functions in certain ranges. It will also require explicit formulae that are general and strong enough, even for moderate values of $x$.

Let $\alpha = a/q + \delta/x$. For us, saying that $\alpha$ lies on a major arc will be the same as saying that $q$ and $\delta$ are bounded; more precisely, $q$ will be bounded by a constant $r$ and $|\delta|$ will be bounded by a constant times $r/q$. As is customary on the major arcs, we will express our exponential sum (3.1) as a linear combination of twisted sums

\[S_{\eta,\chi}(\delta/x, x) = \sum_{n=1}^\infty \Lambda(n)\chi(n)e(\delta n/x)\eta(n/x),\tag{7.2}\]

for $\chi: \mathbb{Z} \to \mathbb{C}$ a Dirichlet character mod $q$, i.e., a multiplicative character on $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (The advantage here is that the phase term is now $e(\delta n/x)$ rather than $e(\alpha n)$, and $e(\delta n/x)$ varies very slowly as $n$ grows.) Our task, then, is to estimate $S_{\eta,\chi}(\delta/x, x)$ for $\delta$ small.

Estimates on $S_{\eta,\chi}(\delta/x, x)$ rely on the properties of Dirichlet $L$-functions $L(s,\chi) = \sum_n \chi(n)n^{-s}$. What is crucial is the location of the zeroes of $L(s,\chi)$ in the critical strip $0 \leq \Re(s) \leq 1$ (a region in which $L(s,\chi)$ can be defined by analytic continuation). In contrast to most previous work, we will not use zero-free regions, which are too narrow for our purposes. Rather, we use a verification of the Generalized Riemann Hypothesis up to bounded height for all conductors $q \leq 300000$ (due to D. Platt [Plab]).


      138 CHAPTER 7 MAJOR ARCS OVERVIEW AND RESULTS

A key feature of the present work is that it allows one to mimic a wide variety of smoothing functions by means of estimates on the Mellin transform of a single smoothing function: here, the Gaussian $e^{-t^2/2}$.

7.1 Results

Write $\eta_\heartsuit(t) = e^{-t^2/2}$. Let us first give a bound for exponential sums on the primes using $\eta_\heartsuit$ as the smooth weight. Without loss of generality, we may assume that our character $\chi$ mod $q$ is primitive, i.e., that it is not really a character to a smaller modulus $q'|q$.

Theorem 7.1.1. Let $x$ be a real number $\geq 10^8$. Let $\chi$ be a primitive Dirichlet character mod $q$, $1 \leq q \leq r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \leq 4r/q$,

\[\sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)e^{-\frac{(n/x)^2}{2}} = I_{q=1}\cdot\widehat{\eta_\heartsuit}(-\delta)\cdot x + E\cdot x,\]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \neq 1$, and

\[|E| \leq 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right).\]
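The main term in Theorem 7.1.1 involves the Fourier transform of the Gaussian, which under the convention $\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt)f(x)\,dx$ used in this chapter equals $\sqrt{2\pi}\,e^{-2\pi^2\delta^2}$ at $-\delta$. The sketch below checks this numerically with a plain trapezoid rule, treating the Gaussian as a function on all of $\mathbb{R}$ (an assumption made here for the purpose of the check); the helper names and the truncation parameters are illustrative.

```python
import cmath
import math

def eta_heart(t):
    """Gaussian weight e^{-t^2/2}, here taken on the whole real line."""
    return math.exp(-t * t / 2.0)

def fourier_transform(f, xi, half_width=12.0, steps=24000):
    """Trapezoid approximation of the integral of e(-x*xi) f(x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-2j * math.pi * x * xi) * f(x)
    return total * h

def closed_form(delta):
    """sqrt(2 pi) * exp(-2 pi^2 delta^2)."""
    return math.sqrt(2.0 * math.pi) * math.exp(-2.0 * math.pi ** 2 * delta ** 2)

delta = 0.3
print(fourier_transform(eta_heart, -delta), closed_form(delta))
```

The rapid decay of the closed form in $\delta$ is what makes the main term negligible unless $\delta$ is small.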

We normalize the Fourier transform $\widehat{f}$ as follows: $\widehat{f}(t) = \int_{-\infty}^\infty e(-xt)f(x)\,dx$. Of course, $\widehat{\eta_\heartsuit}(-\delta)$ is just $\sqrt{2\pi}\,e^{-2\pi^2\delta^2}$.

As it turns out, smooth weights based on the Gaussian are often better in applications than the Gaussian $\eta_\heartsuit$ itself. Let us give a bound based on $\eta(t) = t^2\eta_\heartsuit(t)$.

Theorem 7.1.2. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x$ be a real number $\geq 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \leq q \leq r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \leq 4r/q$,

\[\sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x,\]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \neq 1$, and

\[|E| \leq 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{281200}{\sqrt{q}} + 56\right).\]

The advantage of $\eta(t) = t^2\eta_\heartsuit(t)$ over $\eta_\heartsuit$ is that it vanishes at the origin (to second order); as we shall see, this makes it easier to estimate exponential sums with the smoothing $\eta *_M g$, where $*_M$ is a Mellin convolution and $g$ is nearly arbitrary. Here is a good example that is used crucially in Part III.


Corollary 7.1.3. Let $\eta(t) = t^2e^{-t^2/2} *_M \eta_2(t)$, where $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2\cdot I_{[1/2,1]}$. Let $x$ be a real number $\geq 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \leq q \leq r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \leq 4r/q$,

\[\sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x,\]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \neq 1$, and

\[|E| \leq 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right).\]

Let us now look at a different kind of modification of the Gaussian smoothing. Say we would like a weight of a specific shape; for example, what we will need to do in Part III, we would like an approximation to the function

\[\eta: t \mapsto \begin{cases} t^3(2-t)^3e^{-(t-1)^2/2} & \text{for } t \in [0,2],\\ 0 & \text{otherwise.}\end{cases}\tag{7.3}\]

At the same time, what we have is an estimate for the Mellin transform of the Gaussian $e^{-t^2/2}$, centered at $t = 0$.

The route taken here is to work with an approximation $\eta_+$ to $\eta$. We let

\[\eta_+(t) = h_H(t)\cdot te^{-t^2/2},\tag{7.4}\]

where $h_H$ is a band-limited approximation to

\[h(t) = \begin{cases} t^2(2-t)^3e^{t-1/2} & \text{if } t \in [0,2],\\ 0 & \text{otherwise.}\end{cases}\tag{7.5}\]
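The function $h$ in (7.5) is exactly what is needed for $h(t)\cdot te^{-t^2/2}$ to coincide with the target weight (7.3), since $-t^2/2 + t - 1/2 = -(t-1)^2/2$. A small numerical confirmation of this identity follows; the function names are hypothetical.

```python
import math

def h(t):
    """h(t) from (7.5)."""
    return t ** 2 * (2.0 - t) ** 3 * math.exp(t - 0.5) if 0.0 <= t <= 2.0 else 0.0

def eta_target(t):
    """Target weight from (7.3)."""
    if 0.0 <= t <= 2.0:
        return t ** 3 * (2.0 - t) ** 3 * math.exp(-(t - 1.0) ** 2 / 2.0)
    return 0.0

def eta_from_h(t):
    """h(t) * t * exp(-t^2/2), i.e. (7.4) with h in place of h_H."""
    return h(t) * t * math.exp(-t * t / 2.0)

samples = [0.0, 0.25, 0.5, 1.0, 1.5, 1.99, 2.0, 2.5]
print(all(math.isclose(eta_from_h(t), eta_target(t), rel_tol=1e-12, abs_tol=1e-15)
          for t in samples))
```

Thus $\eta_+$ differs from the target only through the replacement of $h$ by its band-limited approximation $h_H$.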

By band-limited we mean that the restriction of the Mellin transform of $h_H$ to the imaginary axis is of compact support. (We could, alternatively, let $h_H$ be a function whose Fourier transform is of compact support; this would be technically easier in some ways, but it would also lead to using GRH verifications less efficiently.)

To be precise: we define

\[\begin{aligned}
F_H(t) &= \frac{\sin(H\log y)}{\pi\log y},\\
h_H(t) &= (h *_M F_H)(y) = \int_0^\infty h(ty^{-1})F_H(y)\frac{dy}{y},
\end{aligned}\tag{7.6}\]

and $H$ is a positive constant. It is easy to check that $MF_H(i\tau) = 1$ for $-H < \tau < H$ and $MF_H(i\tau) = 0$ for $\tau > H$ or $\tau < -H$ (unsurprisingly, since $F_H$ is a Dirichlet kernel under a change of variables). Since, in general, the Mellin transform of a multiplicative convolution $f *_M g$ equals $Mf\cdot Mg$, we see that the Mellin transform of $h_H$, on the imaginary axis, equals the truncation of the Mellin transform of $h$ to $[-iH, iH]$. Thus, $h_H$ is a band-limited approximation to $h$, as we desired.

The distinction between the odd and the even case in the statement that follows simply reflects the two different points up to which computations were carried out in [Plab]; these computations were, in turn, to some extent tailored to the needs of the present work (as was the shape of $\eta_+$ itself).

Theorem 7.1.4. Let $\eta(t) = \eta_+(t) = h_H(t)te^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\geq 10^{12}$. Let $\chi$ be a primitive character mod $q$, where $1 \leq q \leq 150000$ if $q$ is odd, and $1 \leq q \leq 300000$ if $q$ is even.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \leq 600000\cdot\gcd(q,2)/q$,

\[\sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x,\]

where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \neq 1$, and

\[|E| \leq 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \frac{1}{\sqrt{x}}\left(\frac{499900}{\sqrt{q}} + 52\right).\]

If $q = 1$, we have the sharper bound

\[|E| \leq 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}.\]

This is a paradigmatic example, in that, following the proof given in §9.4, we can bound exponential sums with weights of the form $h_H(t)e^{-t^2/2}$, where $h_H$ is a band-limited approximation to just about any continuous function of our choosing.

Lastly, we will need an explicit estimate of the $\ell_2$ norm corresponding to the sum in Thm. 7.1.4, for the trivial character.

Proposition 7.1.5. Let $\eta(t) = \eta_+(t) = h_H(t)te^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\geq 10^{12}$. Then

\[\begin{aligned}
\sum_{n=1}^\infty \Lambda(n)(\log n)\eta^2(n/x) &= x\cdot\int_0^\infty \eta_+^2(t)\log xt\,dt + E_1\cdot x\log x\\
&= 0.640206\,x\log x - 0.021095\,x + E_2\cdot x\log x,
\end{aligned}\]

where

\[|E_1| \leq 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}},\qquad |E_2| \leq 2\cdot 10^{-6} + \frac{366.91}{\sqrt{x}}.\]

7.2 Main ideas

An explicit formula gives an expression

\[S_{\eta,\chi}(\delta/x, x) = I_{q=1}\,\widehat{\eta}(-\delta)\,x - \sum_\rho F_\delta(\rho)x^\rho + \text{small error},\tag{7.7}\]

where $I_{q=1} = 1$ if $q = 1$ and $I_{q=1} = 0$ otherwise. Here $\rho$ runs over the complex numbers $\rho$ with $L(\rho,\chi) = 0$ and $0 < \Re(\rho) < 1$ ("non-trivial zeros"). The function $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$ (see §2.4).

The questions are then: where are the non-trivial zeros $\rho$ of $L(s,\chi)$? How fast does $F_\delta(\rho)$ decay as $\Im(\rho) \to \pm\infty$?

Write $\sigma = \Re(s)$, $\tau = \Im(s)$. The belief is, of course, that $\sigma = 1/2$ for every non-trivial zero (Generalized Riemann Hypothesis), but this is far from proven. Most work to date has used zero-free regions of the form $\sigma \leq 1 - 1/C\log q|\tau|$, $C$ a constant. This is a classical zero-free region, going back, qualitatively, to de la Vallée-Poussin (1899). The best values of $C$ known are due to McCurley [McC84a] and Kadiri [Kad05].

These regions seem too narrow to yield a proof of the three-primes theorem. What we will use instead is a finite verification of GRH "up to $T_q$", i.e., a computation showing that, for every Dirichlet character of conductor $q \leq r_0$ ($r_0$ a constant, as above), every non-trivial zero $\rho = \sigma + i\tau$ with $|\tau| \leq T_q$ satisfies $\Re(\rho) = 1/2$. Such verifications go back to Riemann; modern computer-based methods are descended in part from a paper by Turing [Tur53]. (See the historical article [Boo06b].) In his thesis [Pla11], D. Platt gave a rigorous verification for $r_0 = 10^5$, $T_q = 10^8/q$. In coordination with the present work, he has extended this to

• all odd $q \leq 3\cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \leq 4\cdot 10^5$, with $T_q = \max(10^8/q, 200 + 7.5\cdot 10^7/q)$.
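The two ranges in the list above can be encoded as a small lookup of the verification height. The sketch below is only an illustration of the stated ranges; the function name `verification_height` is hypothetical and not part of any of the cited computations.

```python
def verification_height(q):
    """Height T_q up to which GRH is stated to be verified for conductor q."""
    if q < 1:
        raise ValueError("conductor must be a positive integer")
    if q % 2 == 1:
        if q > 3 * 10 ** 5:
            raise ValueError("odd conductors are covered only up to 3*10^5")
        return 1e8 / q
    if q > 4 * 10 ** 5:
        raise ValueError("even conductors are covered only up to 4*10^5")
    return max(1e8 / q, 200 + 7.5e7 / q)

for q in (1, 2, 299999, 300000, 400000):
    print(q, verification_height(q))
```

For small even $q$ the first term in the maximum dominates; the second term takes over only for large even conductors, where it keeps $T_q$ above 200.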

This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).

What remains to discuss, then, is how to choose $\eta$ in such a way that $F_\delta(\rho)$ decreases fast enough as $|\tau|$ increases, so that (7.7) gives a good estimate. We cannot hope for $F_\delta(\rho)$ to start decreasing consistently before $|\tau|$ is at least as large as a constant times $|\delta|$. Since $\delta$ varies within $(-cr_0/q, cr_0/q)$, this explains why $T_q$ is taken inversely proportional to $q$ in the above. As we will work with $r_0 \geq 150000$, we also see that we have little margin for maneuver: we want $F_\delta(\rho)$ to be extremely small already for, say, $|\tau| \geq 80|\delta|$. We also have a Scylla-and-Charybdis situation, courtesy of the uncertainty principle: roughly speaking, $F_\delta(\rho)$ cannot decrease faster than exponentially on $|\tau|/|\delta|$ both for $|\delta| \leq 1$ and for $\delta$ large.

The most delicate case is that of $\delta$ large, since then $|\tau|/|\delta|$ is small. It turns out we can manage to get decay that is much faster than exponential for $\delta$ large, while no slower than exponential for $\delta$ small. This we will achieve by working with smoothing functions based on the (one-sided) Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

The Mellin transform of the twisted Gaussian $e(\delta t)e^{-t^2/2}$ is a parabolic cylinder function $U(a, z)$ with $z$ purely imaginary. Since fully explicit estimates for $U(a, z)$, $z$ imaginary, have not been worked out in the literature, we will have to derive them ourselves.

Once we have fully explicit estimates for the Mellin transform of the twisted Gaussian, we are able to use essentially any smoothing function based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$. As we already saw, we can and will consider smoothing functions obtained by convolving the twisted Gaussian with another function and also functions obtained by multiplying the twisted Gaussian with another function. All we need to do is use an explicit formula of the right kind, that is, a formula that does not assume too much about the smoothing function or the region of holomorphy of its Mellin transform, but still gives very good error terms, with simple expressions.

All results here will be based on a single, general explicit formula (Lem. 9.1.1) valid for all our purposes. The contribution of the zeros in the critical strip can be handled in a unified way (Lemmas 9.1.3 and 9.1.4). All that has to be done for each smoothing function is to bound a simple integral (in (9.24)). We then apply a finite verification of GRH and are done.

      Chapter 8

      The Mellin transform of thetwisted Gaussian

      Our aim in this chapter is to give fully explicit yet relatively simple bounds for theMellin transform Fδ(ρ) of e(δt)ηhearts(t) where ηhearts(t) = eminust

      22 and δ is arbitrary Therapid decay that results will establish that the Gaussian ηhearts is a very good choice for asmoothing particularly when the smoothing has to be twisted by an additive charactere(δt)

      The Gaussian smoothing has been used before in number theory see notablyHeath-Brownrsquos well-known paper on the fourth power moment of the Riemann zetafunction [HB79] What is new here is that we will derive fully explicit bounds on theMellin transform of the twisted Gaussian This means that the Gaussian smoothing willbe a real option in explicit work on exponential sums in number theory and elsewherefrom now on1

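Theorem 8.0.1. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, $\sigma \ge 0$, $\tau \ne 0$. Let $\ell = -2\pi\delta$. Then, if $\operatorname{sgn}(\delta) \ne \operatorname{sgn}(\tau)$ and $\delta \ne 0$,

\[
|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot
\begin{cases}
c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for $\rho$ arbitrary,}\\
c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for $\rho \le 3/2$,}
\end{cases}
\tag{8.1}
\]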

¹ There has also been work using the Gaussian after a logarithmic change of variables; see, in particular, [Leh66]. In that case, the Mellin transform is simply a Gaussian (as in, e.g., [MV07, Ex. XII.2.9]). However, for $\delta$ non-zero, the Mellin transform of a twist $e(\delta t) e^{-(\log t)^2/2}$ decays very slowly, and thus would not be useful for our purposes, or, in general, for most applications in which GRH is not assumed.


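where $\rho = 4\tau/\ell^2$,

\[
E(\rho) = \frac{1}{2}\left(\arccos\frac{1}{\upsilon(\rho)} - \frac{2(\upsilon(\rho)-1)}{\rho}\right),
\]

\[
\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1+\sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2}-1}{2}\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right),\\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma+\frac{1}{2}}, \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{\left(1/\sqrt{3}\right)^{\sigma}}\right),
\end{aligned}
\tag{8.2}
\]

and

\[
\upsilon(\rho) = \sqrt{\frac{1+\sqrt{\rho^2+1}}{2}}.
\]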

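If $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$ or $\delta = 0$,

\[
|F_\delta(s)| \le |x_0|^{-\sigma} \cdot e^{-\frac{1}{2}\ell^2}\, |\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|} \cdot \left(\left(1 + \frac{\pi}{2^{3/2}}\right) e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2} e^{-\pi|\tau|}\right),
\tag{8.3}
\]

where

\[
|x_0| \ge
\begin{cases}
0.51729\sqrt{\tau} & \text{for $\rho$ arbitrary,}\\
0.84473\,\dfrac{|\tau|}{|\ell|} & \text{for $\rho \le 3/2$.}
\end{cases}
\tag{8.4}
\]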

As we shall see, the choice of smoothing function $\eta(t) = e^{-t^2/2}$ can be easily motivated by the method of stationary phase, but the problem is actually solved by the saddle-point method. One of the challenges here is to keep all expressions explicit and practical.

(In particular, the more critical estimate, (8.1), is optimal up to a constant depending on $\sigma$; the constants we give will be good rather than optimal.)
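The quantities entering (8.1) and (8.2) are elementary, so they can be evaluated directly. The following is a minimal Python sketch, not part of the original text: the function names (`upsilon`, `E`, `c1`, `c2`, `normalized_bound`) are ours, and `normalized_bound` returns the right side of (8.1) with the factor $|\Gamma(s)| e^{\frac{\pi}{2}\tau}$ left out, under the assumption $\tau > 0 > \delta$ (so that $\ell > 0$).

```python
import math

def upsilon(rho):
    # upsilon(rho) = sqrt((1 + sqrt(rho^2 + 1)) / 2), as in Theorem 8.0.1
    return math.sqrt((1.0 + math.sqrt(rho * rho + 1.0)) / 2.0)

def E(rho):
    # E(rho) = (1/2) * (arccos(1/upsilon) - 2 (upsilon - 1) / rho), for rho > 0
    ups = upsilon(rho)
    return 0.5 * (math.acos(1.0 / ups) - 2.0 * (ups - 1.0) / rho)

def c1(sigma, tau):
    # first constant in (8.2)
    middle = 2.0 ** 0.25 * (2.0 / (1.0 + math.sin(math.pi / 8) ** 2)) ** (sigma / 2.0)
    tail = math.exp(-(math.sqrt(2.0) - 1.0) / 2.0 * tau) / math.tan(math.pi / 8) ** sigma
    return 0.5 * (1.0 + middle + tail)

def c2(sigma, tau):
    # second constant in (8.2)
    middle = min(2.0 ** (sigma + 0.5),
                 math.sqrt(1.0 / math.cos(2 * math.pi / 5)) / math.sin(math.pi / 5) ** sigma)
    tail = math.exp(-tau / 6.0) / (1.0 / math.sqrt(3.0)) ** sigma
    return 0.5 * (1.0 + middle + tail)

def normalized_bound(sigma, tau, delta):
    # Right side of (8.1) divided by |Gamma(s)| e^{(pi/2) tau}; assumes tau > 0 > delta.
    ell = -2.0 * math.pi * delta
    rho = 4.0 * tau / ell ** 2
    bound = c1(sigma, tau) / tau ** (sigma / 2.0)
    if rho <= 1.5:
        # both cases of (8.1) apply when rho <= 3/2; keep the smaller one
        bound = min(bound, c2(sigma, tau) / ell ** sigma)
    return math.exp(-E(rho) * tau) * bound

print(normalized_bound(0.5, 40.0, -1.0))
```

For small $\rho$ one has $E(\rho) \approx \rho/8$, so that $E(\rho)\tau \approx \frac{1}{2}(\tau/2\pi\delta)^2$, while $E(\rho) \to \pi/4$ as $\rho \to \infty$; the sketch can be used to observe both regimes numerically.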

The expressions in Thm. 8.0.1 can be easily simplified further, especially if one is ready to introduce some mild constraints and make some sacrifices in the main term.

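Corollary 8.0.2. Let $f_\delta(t) = e^{-t^2/2} e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, where $\sigma \in [0, 1]$ and $|\tau| \ge 20$. Then, for $0 \le k \le 2$,

\[
|F_\delta(s+k)| + |F_\delta((1-s)+k)| \le
\begin{cases}
\kappa_{k,0} \left(\dfrac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if $4|\tau|/\ell^2 < 3/2$,}\\[2ex]
\kappa_{k,1} |\tau|^{k/2} e^{-0.1598|\tau|} & \text{if $4|\tau|/\ell^2 \ge 3/2$,}
\end{cases}
\]

where

\[
\begin{aligned}
\kappa_{0,0} &\le 3.001, & \kappa_{1,0} &\le 4.903, & \kappa_{2,0} &\le 7.96,\\
\kappa_{0,1} &\le 3.286, & \kappa_{1,1} &\le 4.017, & \kappa_{2,1} &\le 5.13.
\end{aligned}
\]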

We are considering $F_\delta(s+k)$, and not just $F_\delta(s)$, because bounding $F_\delta(s+k)$ enables us to work with smoothing functions equal to or based on $t^k e^{-t^2/2}$. Clearly, we can easily derive bounds with $k$ arbitrary from Thm. 8.0.1. It is just that we will use $k = 0, 1, 2$ in practice. Corollary 8.0.2 is meant to be applied to cases where $\tau$ is larger than a constant (10, say) times $|\ell|$, and $\sigma$ cannot be bounded away from 1; if either condition fails to hold, it is better to apply Theorem 8.0.1 directly.

Let us end with a remark that may be relevant to applications outside number theory. By (8.9), Thm. 8.0.1 gives us bounds on the parabolic cylinder function $U(a, z)$ for $z$ purely imaginary. (Surprisingly, there seem to have been no fully explicit bounds for this case in the literature.) The bounds are useful when $|\Im(a)|$ is at least somewhat larger than $|\Im(z)|$ (i.e., when $|\tau|$ is large compared to $\ell$). While Thm. 8.0.1 is stated for $\sigma \ge 0$ (i.e., for $\Re(a) \ge -1/2$), extending the result to larger half-planes for $a$ is not hard.

8.1 How to choose a smoothing function

Let us motivate our choice of smoothing function $\eta$. The method of stationary phase ([Olv74, §4.11], [Won01, §II.3]) suggests that the main contribution to the integral

\[
F_\delta(s) = \int_0^\infty e(\delta t)\, \eta(t)\, t^{s}\, \frac{dt}{t}
\tag{8.5}
\]

should come when the phase has derivative 0. The phase part of (8.5) is

\[
e(\delta t)\, t^{\Im(s) i} = e^{(2\pi\delta t + \tau \log t) i}
\]

(where we write $s = \sigma + i\tau$); clearly,

\[
(2\pi\delta t + \tau \log t)' = 2\pi\delta + \frac{\tau}{t} = 0
\]

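when $t = -\tau/2\pi\delta$. This is meaningful when $t \ge 0$, i.e., $\operatorname{sgn}(\tau) \ne \operatorname{sgn}(\delta)$. The contribution of $t = -\tau/2\pi\delta$ to (8.5) is then

\[
\eta(t)\, e(\delta t)\, t^{s-1} = \eta\left(\frac{-\tau}{2\pi\delta}\right) e^{-i\tau} \left(\frac{-\tau}{2\pi\delta}\right)^{\sigma + i\tau - 1}
\tag{8.6}
\]

multiplied by a "width" approximately equal to a constant divided by

\[
\sqrt{|(2\pi\delta t + \tau \log t)''|} = \sqrt{|-\tau/t^2|} = \frac{2\pi|\delta|}{\sqrt{|\tau|}}.
\]

The absolute value of (8.6) is

\[
\eta\left(-\frac{\tau}{2\pi\delta}\right) \cdot \left|\frac{-\tau}{2\pi\delta}\right|^{\sigma-1}.
\tag{8.7}
\]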

In other words, if $\operatorname{sgn}(\tau) \ne \operatorname{sgn}(\delta)$ and $\delta$ is not too small, asking that $F_\delta(\sigma + i\tau)$ decay rapidly as $|\tau| \to \infty$ amounts to asking that $\eta(t)$ decay rapidly as $t \to \infty$ (the stationary point $t = -\tau/2\pi\delta$ goes to infinity with $|\tau|$). Thus, if we ask for $F_\delta(\sigma + i\tau)$ to decay rapidly as $|\tau| \to \infty$ for all moderate $\delta$, we are requesting that

1. $\eta(t)$ decay rapidly as $t \to \infty$,

2. the Mellin transform $F_0(\sigma + i\tau)$ decay rapidly as $\tau \to \pm\infty$.

Requirement (2) is there because we also need to consider $F_\delta(\sigma + it)$ for $\delta$ very small, and, in particular, for $\delta = 0$.

There is clearly an uncertainty-principle issue here: one cannot do arbitrarily well in both aspects at the same time. Once we are conscious of this, the choice $\eta(t) = e^{-t}$ in Hardy-Littlewood actually looks fairly good: obviously, $\eta(t) = e^{-t}$ decays exponentially, and its Mellin transform $\Gamma(\sigma + i\tau)$ also decays exponentially as $\tau \to \pm\infty$. Moreover, for this choice of $\eta$, the Mellin transform $F_\delta(s)$ can be written explicitly: $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$.

It is not hard to work out an explicit formula² for $\eta(t) = e^{-t}$. However, it is not hard to see that, for $F_\delta(s)$ as above, $F_\delta(1/2 + it)$ decays like $e^{-t/2\pi|\delta|}$, just as we expected from (8.7). This is a little too slow for our purposes: we will often have to work with relatively large $\delta$, and we would like to have to check the zeroes of $L$ functions only up to relatively low heights $t$, say, up to $50|\delta|$. Then $e^{-t/2\pi|\delta|} > e^{-8} = 0.00033\ldots$, which is not very small. We will settle for a different choice of $\eta$: the Gaussian.

The decay of the Gaussian smoothing function $\eta(t) = e^{-t^2/2}$ is much faster than exponential. Its Mellin transform is $\Gamma(s/2)$, which decays exponentially as $\Im(s) \to \pm\infty$. Moreover, the Mellin transform $F_\delta(s)$ ($\delta \ne 0$), while not an elementary or very commonly occurring function, equals (after a change of variables) a relatively well-studied special function, namely, a parabolic cylinder function $U(a, z)$ (or, in Whittaker's [Whi03] notation, $D_{-a-1/2}(z)$).

For $\delta$ not too small, the main term will indeed work out to be proportional to $e^{-(\tau/2\pi\delta)^2/2}$, as the method of stationary phase indicated. This is, of course, much better than $e^{-\tau/2\pi|\delta|}$. The "cost" is that the Mellin transform $\Gamma(s/2)$ for $\delta = 0$ now decays like $e^{-(\pi/4)|\tau|}$ rather than $e^{-(\pi/2)|\tau|}$. This we can certainly afford.
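To make the comparison concrete, here is a small Python sketch (an illustration under our own naming, not part of the original argument) that evaluates the two predicted decay rates at the height $50|\delta|$ mentioned above. The value of `delta` is hypothetical; only the ratio of the height to $|\delta|$ matters.

```python
import math

def exponential_kernel_decay(t, delta):
    # size e^{-t / (2 pi |delta|)} predicted by (8.7) for eta(t) = e^{-t}
    return math.exp(-t / (2.0 * math.pi * abs(delta)))

def gaussian_kernel_decay(tau, delta):
    # main-term size e^{-(tau / (2 pi delta))^2 / 2} for eta(t) = e^{-t^2/2}
    return math.exp(-0.5 * (tau / (2.0 * math.pi * delta)) ** 2)

delta = 3.0                   # hypothetical value of delta
height = 50.0 * abs(delta)    # the height 50|delta| considered in the text

print(exponential_kernel_decay(height, delta))  # slightly larger than e^{-8}
print(gaussian_kernel_decay(height, delta))     # many orders of magnitude smaller
```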

8.2 The twisted Gaussian: overview and setup

8.2.1 Relation to the existing literature

We wish to approximate the Mellin transform

\[
F_\delta(s) = \int_0^\infty e^{-t^2/2}\, e(\delta t)\, t^{s}\, \frac{dt}{t},
\tag{8.8}
\]

where $\delta \in \mathbb{R}$. The parabolic cylinder function $U : \mathbb{C}^2 \to \mathbb{C}$ is given by

\[
U(a, z) = \frac{e^{-z^2/4}}{\Gamma\left(\frac{1}{2} + a\right)} \int_0^\infty t^{a - \frac{1}{2}} e^{-\frac{1}{2}t^2 - zt}\, dt
\]

² There may be a minor gap in the literature in this respect. The explicit formula given in [HL22, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms were fully worked out for $q = 1$ by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of $\mathrm{hyp}_{\kappa,q}(z)$ there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL22] seems to neglect the effect of the zeros $\rho$ with $\Re(\rho) = 0$, $\Im(\rho) \ne 0$ for $\chi$ non-primitive. (The author thanks R. C. Vaughan for this information and the references.)

for $\Re(a) > -1/2$; the function can be extended to all $a, z \in \mathbb{C}$ either by analytic continuation or by other integral representations ([AS64, §19.5], [Tem10, §12.5(i)]). Hence

\[
F_\delta(s) = e^{(\pi i\delta)^2}\, \Gamma(s)\, U\left(s - \frac{1}{2}, -2\pi i\delta\right).
\tag{8.9}
\]

The second argument of $U$ is purely imaginary; it would be otherwise if a Gaussian of non-zero mean were chosen.

Let us briefly discuss the state of knowledge up to date on Mellin transforms of "twisted" Gaussian smoothings, that is, $e^{-t^2/2}$ multiplied by an additive character $e(\delta t)$. As we have just seen, these Mellin transforms are precisely the parabolic cylinder functions $U(a, z)$.

The function $U(a, z)$ has been well-studied for $a$ and $z$ real; see, e.g., [Tem10]. Less attention has been paid to the more general case of $a$ and $z$ complex. The most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61], [Olv65]; he gave asymptotic series for $U(a, z)$, $a, z \in \mathbb{C}$. These were asymptotic series in the sense of Poincaré, and thus not in general convergent; they would solve our problem if and only if they came with error term bounds. Unfortunately, it would seem that all fully explicit error terms in the literature are either for $a$ and $z$ real, or for $a$ and $z$ outside our range of interest (see both Olver's work and [TV03]). The bounds in [Olv61] involve non-explicit constants. Thus, we will have to find expressions with explicit error bounds ourselves. Our case is that of $a$ in the critical strip, $z$ purely imaginary.

8.2.2 General approach

We will use the saddle-point method (see, e.g., [dB81, §5], [Olv74, §4.7], [Won01, §II.4]) to obtain bounds with an optimal leading-order term and small error terms. (We used the stationary-phase method solely as an exploratory tool.)

What do we expect to obtain? Both the asymptotic expressions in [Olv59] and the bounds in [Olv61] make clear that, if the sign of $\tau = \Im(s)$ is different from that of $\delta$, there will be a change in behavior when $\tau$ gets to be of size about $(2\pi\delta)^2$. This is unsurprising, given our discussion using stationary phase: for $|\Im(a)|$ smaller than a constant times $|\Im(z)|^2$, the term proportional to $e^{-(\pi/4)|\tau|} = e^{-|\Im(a)|/2}$ should be dominant, whereas for $|\Im(a)|$ much larger than a constant times $|\Im(z)|^2$, the term proportional to $e^{-\frac{1}{2}\left(\frac{\tau}{2\pi\delta}\right)^2}$ should be dominant.

There is one important difference between the approach we will follow here and that in [Hela]. In [Hela], the integral (8.8) was estimated by a direct application of the saddle-point method. Here, following a suggestion of N. Temme, we will use the identity

\[
U(a, z) = \frac{e^{\frac{1}{4}z^2}}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{-zu + \frac{u^2}{2}}\, u^{-a - \frac{1}{2}}\, du
\tag{8.10}
\]

(see, e.g., [OLBC10, (12.5.6)]; $c > 0$ is arbitrary). Together, (8.9) and (8.10) give us that

\[
F_\delta(s) = \frac{e^{-2\pi^2\delta^2}\, \Gamma(s)}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{2\pi i\delta u + \frac{u^2}{2}}\, u^{-s}\, du.
\tag{8.11}
\]

Estimating the integral in (8.11) turns out to be a somewhat cleaner task than estimating (8.8). The overall procedure, however, is in essence the same in both cases.

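We write

\[
\phi(u) = -\frac{u^2}{2} - (2\pi i\delta) u + i\tau \log u
\tag{8.12}
\]

for $u$ real or complex, so that the integral in (8.11) equals

\[
I(s) = \int_{c - i\infty}^{c + i\infty} e^{-\phi(u)}\, u^{-\sigma}\, du.
\tag{8.13}
\]

We wish to find a saddle point. A saddle point is a point $u$ at which $\phi'(u) = 0$. This means that

\[
-u - 2\pi i\delta + \frac{i\tau}{u} = 0, \quad \text{i.e.,} \quad u^2 - i\ell u - i\tau = 0,
\tag{8.14}
\]

where $\ell = -2\pi\delta$. The solutions to $\phi'(u) = 0$ are thus

\[
u_0 = \frac{i\ell \pm \sqrt{-\ell^2 + 4i\tau}}{2}.
\tag{8.15}
\]

The value of $\phi(u)$ at $u_0$ is

\[
\begin{aligned}
\phi(u_0) &= -\frac{i\ell u_0 + i\tau}{2} + i\ell u_0 + i\tau \log u_0\\
&= \frac{i\ell}{2} u_0 + i\tau \log\frac{u_0}{\sqrt{e}}.
\end{aligned}
\tag{8.16}
\]

The second derivative at $u_0$ is

\[
\phi''(u_0) = -\frac{1}{u_0^2}\left(u_0^2 + i\tau\right) = -\frac{1}{u_0^2}\left(i\ell u_0 + 2i\tau\right).
\tag{8.17}
\]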

Assign the names $u_{0,+}$, $u_{0,-}$ to the roots in (8.15) according to the sign in front of the square-root (where the square-root is defined so as to have its argument in the interval $(-\pi/2, \pi/2]$). We will actually have to pay attention just to $u_{0,+}$, since, unlike $u_{0,-}$, it lies on the right half of the plane, where our contour of integration also lies. We remark that

\[
u_{0,+} = \frac{i\ell + |\ell|\sqrt{-1 + \frac{4i\tau}{\ell^2}}}{2} = \frac{\ell}{2}\left(i \pm \sqrt{-1 + \frac{4\tau}{\ell^2} i}\right),
\tag{8.18}
\]

where the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. If $\ell = 0$, then $u_{0,+} = (1/\sqrt{2} + i/\sqrt{2})\sqrt{\tau}$.

We can assume without loss of generality that $\tau \ge 0$. We will find it convenient to assume $\tau > 0$, since we can deal with $\tau = 0$ simply by letting $\tau \to 0^+$.

8.3 The saddle point

8.3.1 The coordinates of the saddle point

We should start by determining $u_{0,+}$ explicitly, both in rectangular and polar coordinates. For one thing, we will need to estimate the integrand in (8.13) for $u = u_{0,+}$. The absolute value of the integrand is then $\left|e^{-\phi(u_{0,+})} u_{0,+}^{-\sigma}\right| = |u_{0,+}|^{-\sigma} e^{-\Re\phi(u_{0,+})}$, and, by (8.16),

\[
\Re\phi(u_{0,+}) = -\frac{\ell}{2}\Im(u_{0,+}) - \arg(u_{0,+})\tau.
\tag{8.19}
\]

If $\ell = 0$, we already know that $\Re(u_{0,+}) = \Im(u_{0,+}) = \sqrt{\tau/2}$, $|u_{0,+}| = \sqrt{\tau}$ and $\arg u_{0,+} = \pi/4$. Assume from now on that $\ell \ne 0$.

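We will use the expression for $u_{0,+}$ in (8.18). Solving a quadratic equation, we see that

\[
\sqrt{-1 + \frac{4\tau}{\ell^2} i} = \sqrt{\frac{j(\rho) - 1}{2}} + i\sqrt{\frac{j(\rho) + 1}{2}},
\tag{8.20}
\]

where $j(\rho) = (1 + \rho^2)^{1/2}$ and $\rho = 4\tau/\ell^2$. Hence

\[
\Re(u_{0,+}) = \pm\frac{\ell}{2}\sqrt{\frac{j(\rho) - 1}{2}}, \qquad \Im(u_{0,+}) = \frac{\ell}{2}\left(1 \pm \sqrt{\frac{j(\rho) + 1}{2}}\right).
\tag{8.21}
\]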

Here and in what follows, the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. (Notice that $\Re(u_{0,+})$ and $\Im(u_{0,+})$ are always positive, except for $\tau = \ell = 0$, in which case $\Re(u_{0,+}) = \Im(u_{0,+}) = 0$.) By (8.21),

\[
\begin{aligned}
|u_{0,+}| &= \frac{|\ell|}{2} \cdot \left|\sqrt{\frac{-1 + j(\rho)}{2}} + \left(1 \pm \sqrt{\frac{1 + j(\rho)}{2}}\right) i\right|\\
&= \frac{|\ell|}{2}\sqrt{\frac{-1 + j(\rho)}{2} + \frac{1 + j(\rho)}{2} + 1 \pm 2\sqrt{\frac{1 + j(\rho)}{2}}}\\
&= \frac{|\ell|}{2}\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} = \frac{|\ell|}{\sqrt{2}}\sqrt{\upsilon(\rho)^2 \pm \upsilon(\rho)},
\end{aligned}
\tag{8.22}
\]

where $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$. We now compute the argument of $u_{0,+}$:

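\[
\begin{aligned}
\arg(u_{0,+}) &= \arg\left(\ell\left(i \pm \sqrt{-1 + i\rho}\right)\right) = \arg\left(\sqrt{\frac{-1 + j(\rho)}{2}} + i\left(\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}\right)\right)\\
&= \arcsin\frac{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}{\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}}} = \arcsin\frac{\sqrt{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}}{\sqrt{2\sqrt{\frac{1 + j(\rho)}{2}}}}\\
&= \arcsin\sqrt{\frac{1}{2}\left(1 \pm \sqrt{\frac{2}{1 + j(\rho)}}\right)} = \frac{\pi}{2} - \frac{1}{2}\arccos\left(\pm\sqrt{\frac{2}{1 + j(\rho)}}\right)
\end{aligned}
\tag{8.23}
\]

(by $\cos(\pi - 2\theta) = -\cos 2\theta = 2\sin^2\theta - 1$). Thus

\[
\arg(u_{0,+}) =
\begin{cases}
\dfrac{\pi}{2} - \dfrac{1}{2}\arccos\dfrac{1}{\upsilon(\rho)} = \dfrac{1}{2}\arccos\dfrac{-1}{\upsilon(\rho)} & \text{if $\ell > 0$,}\\[2ex]
\dfrac{1}{2}\arccos\dfrac{1}{\upsilon(\rho)} & \text{if $\ell < 0$.}
\end{cases}
\tag{8.24}
\]

In particular, $\arg(u_{0,+})$ lies in $[0, \pi/2]$, and is close to $\pi/2$ only when $\ell > 0$ and $\rho \to 0^+$. Here and elsewhere, we follow the convention that arcsin and arctan have image in $[-\pi/2, \pi/2]$, whereas arccos has image in $[0, \pi]$.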
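The closed forms (8.21), (8.22) and (8.24), as well as (8.19), can be compared numerically with the root of the quadratic (8.14) computed directly. The sketch below is only a sanity check under our own naming (`saddle_point`, `closed_form`, `phi`, `max_discrepancy` are not from the text); it assumes $\ell \ne 0$ and $\tau > 0$, and writes $\phi$ in terms of $\ell = -2\pi\delta$.

```python
import cmath
import math

def saddle_point(ell, tau):
    # Root u_{0,+} of u^2 - i*ell*u - i*tau = 0, as in (8.15); cmath.sqrt is the
    # principal square root, whose argument lies in (-pi/2, pi/2].
    return (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2

def closed_form(ell, tau):
    # Rectangular form (8.21), modulus (8.22) and argument (8.24); ell != 0, tau > 0.
    rho = 4.0 * tau / ell ** 2
    j = math.sqrt(1.0 + rho ** 2)
    ups = math.sqrt((1.0 + j) / 2.0)
    sign = 1.0 if ell > 0 else -1.0
    point = complex(sign * ell / 2 * math.sqrt((j - 1) / 2),
                    ell / 2 * (1 + sign * ups))
    modulus = abs(ell) / math.sqrt(2) * math.sqrt(ups ** 2 + sign * ups)
    argument = 0.5 * math.acos(-sign / ups)
    return point, modulus, argument

def phi(u, ell, tau):
    # phi(u) from (8.12), using -2*pi*i*delta = i*ell
    return -u * u / 2 + 1j * ell * u + 1j * tau * cmath.log(u)

def max_discrepancy(ell, tau):
    # Largest absolute difference between direct computation and the closed forms.
    u = saddle_point(ell, tau)
    point, modulus, argument = closed_form(ell, tau)
    re_phi_819 = -ell / 2 * u.imag - cmath.phase(u) * tau   # right side of (8.19)
    return max(abs(u * u - 1j * ell * u - 1j * tau),        # residual of (8.14)
               abs(u - point),
               abs(abs(u) - modulus),
               abs(cmath.phase(u) - argument),
               abs(phi(u, ell, tau).real - re_phi_819))

for ell, tau in [(2.0, 3 ** 0.5), (-2.0, 3 ** 0.5), (0.5, 10.0), (-3.0, 0.2)]:
    print(ell, tau, max_discrepancy(ell, tau))
```

All discrepancies should be at the level of floating-point rounding.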

8.3.2 The direction of steepest descent

As is customary in the saddle-point method, it is now time to determine the direction of steepest descent at the saddle-point $u_{0,+}$. Even if we decide to use a contour that goes through the saddle-point in a direction that is not quite optimal, it will be useful to know what the direction $w$ of steepest descent actually is. A contour that passes through the saddle-point making an angle between $-\pi/4 + \epsilon$ and $\pi/4 - \epsilon$ with $w$ may be acceptable, in that the contribution of the saddle point is then suboptimal by at most a bounded factor depending on $\epsilon$; an angle approaching $-\pi/4$ or $\pi/4$ leads to a contribution suboptimal by an unbounded factor.

Let $w \in \mathbb{C}$ be the unit vector pointing in the direction of steepest descent. Then, by definition, $w^2\phi''(u_{0,+})$ is real and positive, where $\phi$ is as in (8.12). Thus $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$. (The direction of steepest descent is defined only modulo $\pi$.) By (8.17),

\[
\begin{aligned}
\arg(\phi''(u_{0,+})) &= -\pi + \arg(i\ell u_{0,+} + 2i\tau) - 2\arg(u_{0,+}) \mod 2\pi\\
&= -\frac{\pi}{2} + \arg(\ell u_{0,+} + 2\tau) - 2\arg(u_{0,+}) \mod 2\pi.
\end{aligned}
\]

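By (8.21),

\[
\begin{aligned}
\Re(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(\pm\sqrt{\frac{j(\rho) - 1}{2}} + \frac{4\tau}{\ell^2}\right) = \frac{\ell^2}{2}\left(\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}\right),\\
\Im(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(1 \pm \sqrt{\frac{j(\rho) + 1}{2}}\right).
\end{aligned}
\]

Therefore, $\arg(\ell u_{0,+} + 2\tau) = \arctan\varpi$, where

\[
\varpi = \frac{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}.
\]

It is easy to check that $\operatorname{sgn}\varpi = \operatorname{sgn}\ell$. Hence,

\[
\arctan\varpi = \pm\frac{\pi}{2} - \arctan\frac{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}.
\]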

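At the same time,

\[
\begin{aligned}
\frac{\rho \pm \sqrt{\frac{j - 1}{2}}}{1 \pm \sqrt{\frac{j + 1}{2}}} &= \frac{\left(\rho \pm \sqrt{\frac{j - 1}{2}}\right)\left(1 \mp \sqrt{\frac{j + 1}{2}}\right)}{1 - \frac{j + 1}{2}} = \frac{\rho \pm \sqrt{2(j - 1)} \mp \rho\sqrt{2(j + 1)}}{1 - j}\\
&= \frac{\rho \pm \sqrt{\frac{2}{j + 1}}\left(\sqrt{j^2 - 1} - \rho \cdot (j + 1)\right)}{1 - j} = \frac{\rho \pm \frac{1}{\upsilon}\left(\rho - \rho \cdot (j + 1)\right)}{1 - j}\\
&= \frac{\rho(1 \mp j/\upsilon)}{1 - j} = \frac{(-1 \pm j/\upsilon)(j + 1)}{\rho} = \frac{2\upsilon(-\upsilon \pm j)}{\rho}.
\end{aligned}
\tag{8.25}
\]

Hence, modulo $2\pi$,

\[
\arg(\phi''(u_{0,+})) = -\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} - 2\arg(u_{0,+}) -
\begin{cases}
0 & \text{if $\ell \ge 0$,}\\
\pi & \text{if $\ell < 0$.}
\end{cases}
\]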

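Therefore, the direction of steepest descent is

\[
\arg(w) = -\frac{\arg(\phi''(u_{0,+}))}{2} = \arg(u_{0,+}) + \frac{1}{2}\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} +
\begin{cases}
0 & \text{if $\ell \ge 0$,}\\
\pi/2 & \text{if $\ell < 0$.}
\end{cases}
\tag{8.26}
\]

By (8.24) and $\arccos 1/\upsilon = \arctan\sqrt{\upsilon^2 - 1} = \arctan\sqrt{(j - 1)/2}$, we conclude that

\[
\arg(w) =
\begin{cases}
\dfrac{\pi}{2} + \dfrac{1}{2}\left(-\arctan\dfrac{2\upsilon(j + \upsilon)}{\rho} + \arctan\sqrt{\dfrac{j - 1}{2}}\right) & \text{if $\ell < 0$,}\\[2ex]
\dfrac{\pi}{2} + \dfrac{1}{2}\left(\arctan\dfrac{2\upsilon(j - \upsilon)}{\rho} - \arctan\sqrt{\dfrac{j - 1}{2}}\right) & \text{if $\ell \ge 0$.}
\end{cases}
\tag{8.27}
\]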

Figure 8.1: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell < 0$

Figure 8.2: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell \ge 0$

There is nothing wrong in using plots here to get an idea of the behavior of $\arg(w)$, since, at any rate, the direction of steepest descent will play only an advisory role in our choices. See Figures 8.1 and 8.2.
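In the same exploratory spirit, the direction of steepest descent can be computed directly from $\phi''(u_{0,+}) = -1 - i\tau/u_{0,+}^2$ and compared with the closed form (8.27). The following Python sketch does this; the function names are ours, and $\ell \ne 0$, $\tau > 0$ are assumed.

```python
import cmath
import math

def direction_from_second_derivative(ell, tau):
    # arg(w) in [0, pi), from arg(w) = -arg(phi''(u_{0,+})) / 2 mod pi,
    # where phi''(u) = -1 - i*tau/u^2 by (8.12).
    u = (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2
    second = -1 - 1j * tau / (u * u)
    return (-cmath.phase(second) / 2) % math.pi

def direction_from_formula(ell, tau):
    # Closed form (8.27), for ell != 0 and tau > 0.
    rho = 4.0 * tau / ell ** 2
    j = math.sqrt(1.0 + rho ** 2)
    ups = math.sqrt((1.0 + j) / 2.0)
    root = math.atan(math.sqrt((j - 1.0) / 2.0))
    if ell < 0:
        return math.pi / 2 + 0.5 * (-math.atan(2 * ups * (j + ups) / rho) + root)
    return math.pi / 2 + 0.5 * (math.atan(2 * ups * (j - ups) / rho) - root)

for ell, tau in [(2.0, 3 ** 0.5), (-2.0, 3 ** 0.5), (0.5, 10.0), (-3.0, 0.2), (5.0, 1.0)]:
    direct = direction_from_second_derivative(ell, tau)
    formula = direction_from_formula(ell, tau)
    # third column: deviation from the vertical direction
    print(ell, tau, direct - math.pi / 2, abs(direct - formula))
```

The third column printed is $\arg(w) - \pi/2$, the quantity plotted in Figures 8.1 and 8.2.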

8.4 The integral over the contour

We must now choose the contour of integration. The optimal contour should be one on which the phase of the integrand in (8.13) is constant, i.e., $\Im(\phi(u))$ is constant. This is so because, throughout the contour, we want to keep descending from the saddle as rapidly as possible, and so we want to maximize the absolute value of the derivative of the real part of the exponent $-\phi(u)$. At any point $u$, if we are to maximize $|\Re(d\phi(u)/dt)|$, we want our contour to be such that $\Im(d\phi(u)/dt) = 0$. (We can also see this as follows: if $\Im(\phi(u))$ is constant, there is no cancellation in (8.13) for us to miss.)

Writing $u = x + iy$, we obtain from (8.12) that

\[
\Im(\phi(u)) = -xy + \ell x + \tau \log\sqrt{x^2 + y^2}.
\tag{8.28}
\]

We would thus be considering the curve $\Im(\phi(u)) = c$, where $c$ is a constant. Since we need the contour to pass through the saddle point $u_{0,+}$, we set $c = \Im(\phi(u_{0,+}))$. The only problem is that the curve $\Im(\phi(u)) = c$ given by (8.28) is rather uncomfortable to work with.

Instead, we shall use several rather simple contours, each appropriate for different values of $\ell$ and $\tau$.

8.4.1 A simple contour

Assume first that $\ell > 0$. We could just let our contour $L$ be the vertical line going through $u_{0,+}$. Since the direction of steepest descent is never far from vertical (see (8.2)), this would be a good choice. However, the vertical line has the defect of going too close to the origin when $\rho \to 0$.

Instead, we will let $L$ consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge y_0\}$, where $x_0 = \Re u_{0,+} \ge 0$, $y_0 = \Im u_{0,+} > 0$; (b) the straight segment going downwards and to the right from $u_{0,+}$ to the $x$-axis, forming an angle of $\pi/2 - \beta$ (where $\beta > 0$ will be determined later) with the $x$-axis at a point $(x_1, 0)$; (c) the straight vertical ray $\{(x_1, y) : y \le 0\}$. Let us call these three segments $L_1$, $L_2$, $L_3$. Shifting the contour in (8.13), we obtain

\[
I = \int_L e^{-\phi(u)}\, u^{-\sigma}\, du,
\]

and so $|I| \le I_1 + I_2 + I_3$, where

\[
I_j = \int_{L_j} \left|e^{-\phi(u)}\, u^{-\sigma}\right| |du|.
\tag{8.29}
\]

As we shall see, we have chosen the segments $L_j$ so that each of the three integrals $I_j$ will be easy to bound.

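Let us start with $I_1$. Since $\sigma \ge 0$,

\[
I_1 \le |u_{0,+}|^{-\sigma} \int_{y_0}^\infty e^{-\Re\phi(x_0 + iy)}\, dy,
\]

where, by (8.12),

\[
\Re\phi(x + iy) = \frac{y^2 - x^2}{2} - \ell y - \tau\arg(x + iy).
\tag{8.30}
\]

Let us expand the expression on the right of (8.30) for $x = x_0$ and $y$ around $y_0 = \Im u_{0,+} > 0$. The constant term is

\[
\begin{aligned}
\Re\phi(u_{0,+}) &= -\frac{\ell}{2} y_0 - \tau\arg(u_{0,+}) = -\frac{\ell^2}{4}(1 + \upsilon(\rho)) - \frac{\tau}{2}\arccos\frac{-1}{\upsilon(\rho)}\\
&= -\left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau,
\end{aligned}
\tag{8.31}
\]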

where we are using (8.19), (8.21) and (8.24).

The linear term vanishes because $u_{0,+}$ is a saddle-point (and thus a local extremum on $L$). It remains to estimate the quadratic term. Now, in (8.30), the term $\arg(x + iy)$ equals $\arctan(y/x)$, whose quadratic term we should now examine; instead, we are about to see that we can bound it trivially. In general, for $t_0, t \in \mathbb{R}$ and $f \in C^2$,

\[
f(t) = f(t_0) + f'(t_0) \cdot (t - t_0) + \int_{t_0}^{t} \int_{t_0}^{r} f''(s)\, ds\, dr.
\tag{8.32}
\]

Now, $\arctan''(s) = -2s/(s^2 + 1)^2$, and this is negative for $s > 0$ and obeys $\arctan''(-s) = -\arctan''(s)$ for all $s$. Hence, for $t_0 \ge 0$ and $t \ge -t_0$,

\[
\arctan t \le \arctan t_0 + (\arctan' t_0) \cdot (t - t_0).
\tag{8.33}
\]

Therefore, in (8.30), we can consider only the quadratic term coming from $(y^2 - x^2)/2$, namely $(y - y_0)^2/2$, and ignore the quadratic term coming from $\arg(x + iy)$. Thus

\[
\Re\phi(x_0 + iy) \ge \frac{(y - y_0)^2}{2} + \Re\phi(u_{0,+})
\tag{8.34}
\]

for $y \ge -y_0$, and, in particular, for $y \ge y_0$. Hence

\[
\int_{y_0}^\infty e^{-\Re\phi(x_0 + iy)}\, dy \le e^{-\Re\phi(u_{0,+})} \int_{y_0}^\infty e^{-\frac{1}{2}(y - y_0)^2}\, dy = \sqrt{\pi/2} \cdot e^{-\Re\phi(u_{0,+})}.
\tag{8.35}
\]

Notice that, once we choose to use the approximation (8.33), the vertical direction is actually optimal. (In turn, the fact that the direction of steepest descent is close to vertical shows us that we are not losing much by using the approximation (8.33).)
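Inequality (8.34) is easy to test numerically along the vertical line through the saddle point. The Python sketch below (our own illustration; the names `re_phi` and `worst_gap` and the sampling parameters are arbitrary choices) samples $y \ge -y_0$ and records the smallest value of the left side of (8.34) minus the right side, which should never be negative beyond rounding error. It assumes $\ell > 0$ and $\tau > 0$, as in this subsection.

```python
import cmath
import math

def saddle_point(ell, tau):
    # u_{0,+} as in (8.15), principal square root
    return (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2

def re_phi(x, y, ell, tau):
    # Re phi(x + iy) as in (8.30); for x > 0, atan2(y, x) equals arctan(y/x)
    return (y * y - x * x) / 2 - ell * y - tau * math.atan2(y, x)

def worst_gap(ell, tau, samples=4000, span=30.0):
    # Minimum over sampled y in [-y0, -y0 + span] of
    #   Re phi(x0 + iy) - ((y - y0)^2 / 2 + Re phi(u_{0,+})).
    u = saddle_point(ell, tau)
    x0, y0 = u.real, u.imag
    base = re_phi(x0, y0, ell, tau)
    worst = float("inf")
    for k in range(samples + 1):
        y = -y0 + span * k / samples
        gap = re_phi(x0, y, ell, tau) - ((y - y0) ** 2 / 2 + base)
        worst = min(worst, gap)
    return worst

for ell, tau in [(2.0, 3 ** 0.5), (0.5, 10.0), (6.0, 1.0)]:
    print(ell, tau, worst_gap(ell, tau))
```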

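As for $|u_{0,+}|^{-\sigma}$, we will estimate it by the easy bound

\[
|u_{0,+}| = \frac{\ell}{\sqrt{2}}\sqrt{\upsilon^2 + \upsilon} \ge \frac{\ell}{\sqrt{2}}\max\left(\sqrt{\frac{\rho}{2}}, \sqrt{2}\right) = \max\left(\sqrt{\tau}, \ell\right),
\tag{8.36}
\]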

where we use (8.22).

Let us now bound $I_2$. As we already said, the linear term at $u_{0,+}$ vanishes. Let $u_\circ$ be the point at which $L_2$ meets the line normal to it through the origin. We must take care that the angle formed by the origin, $u_{0,+}$ and $u_\circ$ be no larger than the angle formed by the origin, $(x_1, 0)$ and $u_\circ$; this will ensure that we are in the range in which the approximation (8.33) is valid (namely, $t \ge -t_0$, where $t_0 = \tan\alpha_0$). The first angle is $\pi/2 + \beta - \arg u_{0,+}$, whereas the second angle is $\pi/2 - \beta$. Hence, it is enough to set $\beta \le (\arg u_{0,+})/2$. Then we obtain from (8.12) and (8.33) that

\[
\Re\phi(u) \ge \Re\phi(u_{0,+}) - \Re\frac{(u - u_{0,+})^2}{2}.
\tag{8.37}
\]

If we let $s = |u - u_{0,+}|$, we see that

\[
\Re\frac{(u - u_{0,+})^2}{2} = \frac{s^2}{2}\cos\left(2 \cdot \left(\frac{\pi}{2} - \beta\right)\right) = -\frac{s^2}{2}\cos 2\beta.
\]

Hence

\[
\begin{aligned}
I_2 &\le |u_\circ|^{-\sigma} \int_{L_2} e^{-\Re\phi(u)}\, |du|\\
&< |u_\circ|^{-\sigma} \int_0^\infty e^{-\Re\phi(u_{0,+}) - \frac{s^2}{2}\cos 2\beta}\, ds = |u_\circ|^{-\sigma} e^{-\Re\phi(u_{0,+})} \sqrt{\frac{\pi}{2\cos 2\beta}}.
\end{aligned}
\tag{8.38}
\]

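Since the angle between $u_\circ$ and $u_{0,+}$ (as seen from the origin) is $\arg u_{0,+} - \beta$, we see that, by (8.21),

\[
\begin{aligned}
|u_\circ| &= \Re\left((x_0 + iy_0)(\cos\beta - i\sin\beta)\right)\\
&= \frac{\ell}{2}\left(\sqrt{\frac{j - 1}{2}}\cos\beta + \left(1 + \sqrt{\frac{j + 1}{2}}\right)\sin\beta\right).
\end{aligned}
\tag{8.39}
\]

The square of the expression within the outer parentheses is at least

\[
\begin{aligned}
&\frac{j - 1}{2}\cos^2\beta + \left(1 + \frac{j + 1}{2} + \sqrt{2(j + 1)}\right)\sin^2\beta + \left(\sqrt{\frac{j^2 - 1}{4}} + \sqrt{\frac{j - 1}{2}}\right)\sin 2\beta\\
&\qquad \ge \frac{j}{2} + \frac{7}{2}\sin^2\beta - \frac{1}{2}\cos^2\beta + \frac{j}{2}\sin^2\beta.
\end{aligned}
\]

If $\beta \ge \pi/8$, then $\tan\beta > 1/\sqrt{7}$, and so, since $j > \rho$, we obtain

\[
|u_\circ| \ge \frac{\ell}{2}\sqrt{\frac{j}{2}\left(1 + \sin^2\beta\right)} > \frac{\ell\sqrt{\rho}}{2^{3/2}}\sqrt{1 + \sin^2\beta}.
\]

We can also apply the trivial bound $j \ge 1$ directly to (8.39). Thus

\[
|u_\circ| \ge \max\left(\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\beta},\ \ell\sin\beta\right).
\]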

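Let us choose $\beta$ as follows. We could always set $\beta = \pi/8$; since $\arg u_{0,+} \ge \pi/4$, we then have $\beta \le (\arg u_{0,+})/2$, as required. However, if $\rho \le 3/2$, then $\upsilon(\rho) \le 1.18381$, and so, by (8.24), $\arg u_{0,+} \ge 1.28842$. We can thus set either $\beta = \pi/6 = 0.523598\ldots$ or $\beta = \pi/5 = 0.628318\ldots$, say, either of which is smaller than $(\arg u_{0,+})/2$. Going back to (8.38), we conclude that

\[
I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \frac{\sqrt{\pi}}{2^{1/4}} \left|\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\frac{\pi}{8}}\right|^{-\sigma}
\]

for $\rho$ arbitrary, and

\[
I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \min\left(\sqrt{\frac{\pi/2}{\cos\frac{2\pi}{5}}} \cdot \left|\ell\sin\frac{\pi}{5}\right|^{-\sigma},\ \sqrt{\pi}\left|\frac{\ell}{2}\right|^{-\sigma}\right)
\]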

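when $\rho \le 3/2$.

It remains to estimate $I_3$. For $u = x_1$,

\[
\begin{aligned}
-\Re\frac{(u - u_{0,+})^2}{2} &= -\Re\frac{y_0^2(\tan\beta - i)^2}{2} = \frac{1}{2}\left(1 - \tan^2\beta\right) y_0^2\\
&\ge \left(1 - \tan^2\beta\right) \cdot \frac{\ell^2}{8}\left(1 + \frac{j + 1}{2}\right) \ge \frac{\ell^2}{8}\left(1 - \tan^2\beta\right) \cdot \frac{\rho}{2} \ge \frac{1}{4}\left(1 - \tan^2\beta\right)\tau,
\end{aligned}
\]

where we are using (8.21). Thus (8.37) tells us that

\[
\Re\phi(x_1) \ge \Re\phi(u_{0,+}) + \frac{1 - \tan^2\beta}{4}\tau.
\]

At the same time, by (8.30) and $\tau, \ell \ge 0$,

\[
\Re\phi(x_1 + iy) \ge \Re\phi(x_1) + \frac{y^2}{2}
\]

for $y \le 0$. Hence

\[
\begin{aligned}
I_3 &\le |x_1|^{-\sigma} \int_{L_3} e^{-\Re\phi(u)}\, |du| \le |x_1|^{-\sigma} e^{-\Re\phi(x_1)} \int_{-\infty}^{0} e^{-y^2/2}\, dy\\
&\le |x_1|^{-\sigma} \cdot \sqrt{\frac{\pi}{2}}\, e^{-\frac{1 - \tan^2\beta}{4}\tau} e^{-\Re\phi(u_{0,+})}.
\end{aligned}
\]

Here note that $x_1 \ge (\tan\beta)|u_{0,+}|$, and so, by (8.36),

\[
x_1 \ge \tan\beta \cdot \max\left(\sqrt{\tau}, \ell\right).
\]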

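We conclude that, for $\ell > 0$,

\[
|I| \le \left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2} - 1}{2}\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\tau^{\sigma/2}}\, e^{-\Re\phi(u_{0,+})}
\]

(since $(1 - \tan^2\frac{\pi}{8})/4 = (\sqrt{2} - 1)/2$), and, when $\rho \le 3/2$,

\[
|I| \le \left(1 + \min\left(2^{\sigma + \frac{1}{2}}, \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{\left(1/\sqrt{3}\right)^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\ell^{\sigma}}\, e^{-\Re\phi(u_{0,+})}.
\]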
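The numerical constants behind the choices $\beta = \pi/8$, $\pi/6$, $\pi/5$ can be spot-checked in a few lines. The sketch below is only an illustration (the function names are ours): it evaluates $\upsilon(3/2)$ and the lower bound on $\arg u_{0,+}$ from (8.24), the factor $\sqrt{\pi/(2\cos 2\beta)}$ from (8.38), and the rate $(1 - \tan^2\beta)/4$ from the bound on $I_3$.

```python
import math

def upsilon(rho):
    # upsilon(rho) = sqrt((1 + sqrt(1 + rho^2)) / 2)
    return math.sqrt((1.0 + math.sqrt(1.0 + rho * rho)) / 2.0)

def arg_saddle(rho):
    # arg(u_{0,+}) for ell > 0, from (8.24)
    return 0.5 * math.acos(-1.0 / upsilon(rho))

def width_factor(beta):
    # factor sqrt(pi / (2 cos 2 beta)) appearing in (8.38)
    return math.sqrt(math.pi / (2.0 * math.cos(2.0 * beta)))

def tail_rate(beta):
    # rate (1 - tan^2 beta) / 4 in the bound on I_3
    return (1.0 - math.tan(beta) ** 2) / 4.0

print(upsilon(1.5), arg_saddle(1.5), arg_saddle(1.5) / 2)
for beta in (math.pi / 8, math.pi / 6, math.pi / 5):
    print(beta, width_factor(beta), tail_rate(beta))
```

Since $\arg u_{0,+}$ decreases as $\rho$ increases (for $\ell > 0$), its value at $\rho = 3/2$ is the relevant lower bound on the range $\rho \le 3/2$.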

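We know $\Re\phi(u_{0,+})$ from (8.31). Write

\[
E(\rho) = \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} - \frac{\upsilon(\rho) - 1}{\rho},
\tag{8.40}
\]

so that

\[
-\Re\phi(u_{0,+}) = \left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau = \left(\frac{\pi}{2} - E(\rho) + \frac{2}{\rho}\right)\tau.
\]

To finish, we just need to apply (8.11). It makes sense to group together $\Gamma(s) e^{\frac{\pi}{2}\tau}$, since it is bounded on the critical line (by the classical formula $|\Gamma(1/2 + i\tau)| = \sqrt{\pi/\cosh\pi\tau}$, as in [MV07, Exer. C.1(b)]), and in general of slow growth on bounded strips. Using (8.11), and noting that $2\pi^2\delta^2 = \ell^2/2 = (2/\rho) \cdot \tau$, we obtain

\[
|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot
\begin{cases}
c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for $\rho$ arbitrary,}\\
c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for $\rho \le 3/2$,}
\end{cases}
\tag{8.41}
\]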

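where

\[
\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2} - 1}{2}\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right),\\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma + \frac{1}{2}}, \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{\left(1/\sqrt{3}\right)^{\sigma}}\right).
\end{aligned}
\tag{8.42}
\]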

We have assumed throughout that $\ell \ge 0$ and $\tau \ge 0$. We can immediately obtain a bound valid for $\ell \le 0$, $\tau \le 0$ by reflection on the $x$-axis: we simply put absolute values around $\tau$ and $\ell$ in (8.41).

We see that we have obtained a bound in a neat, closed form without too much effort. Of course, this effortlessness is usually in part illusory: the contour we have used here is actually the product of some trial and error, in that some other contours give results that are comparable in quality but harder to simplify. We will have to choose a different contour when $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$.

8.4.2 Another simple contour

We now wish to give a bound for the case of $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$, i.e., $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$. We expect a much smaller upper bound than for $\operatorname{sgn}(\ell) = \operatorname{sgn}(\tau)$, given what we already know from the method of stationary phase. This also means that we will not need to be as careful in order to get a bound that is good enough for all practical purposes.

Our contour $L$ will consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge 0\}$, (b) the quarter-circle from $(x_0, 0)$ to $(0, -x_0)$ (that is, an arc where the argument runs from $0$ to $-\pi/2$) and (c) the straight vertical ray $\{(0, y) : y \le -x_0\}$. We call these segments $L_1$, $L_2$, $L_3$, and define the integrals $I_1$, $I_2$ and $I_3$ just as in (8.29).

Much as before, we have

\[
I_1 \le x_0^{-\sigma} \int_0^\infty e^{-\Re\phi(x_0 + iy)}\, dy.
\]

Since (8.33) is valid for $t \ge 0$, (8.34) holds, and so

\[
I_1 \le x_0^{-\sigma} e^{-\Re\phi(u_{0,+})} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(y - y_0)^2}\, dy = x_0^{-\sigma} \sqrt{2\pi} \cdot e^{-\Re\phi(u_{0,+})}.
\]

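By (8.12) and (8.30),

\[
I_2 \le x_0^{-\sigma} \int_{L_2} e^{-\Re\phi(u)}\, |du| = x_0^{1 - \sigma} \int_0^{\pi/2} e^{-\left(-\frac{x_0^2\cos 2\alpha}{2} + \ell x_0\sin\alpha + \tau\alpha\right)}\, d\alpha.
\tag{8.43}
\]

Now, for $\alpha \ge 0$ and $\ell \le 0$,

\[
(\ell x_0\sin\alpha + \tau\alpha)' = \ell x_0\cos\alpha + \tau \ge \ell x_0 + \tau.
\]

Since $j = \sqrt{1 + \rho^2} \le 1 + \rho^2/2$, we have $\sqrt{(j - 1)/2} \le \rho/2$, and so, by (8.21), $|\ell x_0| \le \ell^2\rho/4 = \tau$, and thus $\ell x_0 + \tau \ge 0$. In other words, the exponent in (8.43) equals $(x_0^2\cos 2\alpha)/2$ minus an increasing function, and so, since $\Re\phi(x_0) = -x_0^2/2$,

\[
I_2 \le x_0^{-\sigma} \cdot x_0 \int_0^{\pi/2} e^{\frac{x_0^2\cos 2\alpha}{2}}\, d\alpha = x_0^{-\sigma} \cdot \frac{\pi}{2} x_0 \cdot I_0(x_0^2/2),
\]

where $I_0(t) = \frac{1}{\pi}\int_0^\pi e^{t\cos\theta}\, d\theta$ is the modified Bessel function of the first kind (and order 0).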

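Since $\cos\theta = \sqrt{1 - \sin^2\theta} < 1 - (\sin^2\theta)/2 \le 1 - 2\theta^2/\pi^2$, we have³

\[
I_0(t) \le \frac{1}{\pi}\int_0^\pi e^{t\left(1 - \frac{2\theta^2}{\pi^2}\right)}\, d\theta < e^t \cdot \frac{1}{\pi}\int_0^\infty e^{-\frac{2t}{\pi^2}\theta^2}\, d\theta = e^t\, \frac{\pi/\sqrt{2t}}{\pi} \cdot \frac{\sqrt{\pi}}{2} = \frac{\sqrt{\pi}}{2^{3/2}}\, \frac{e^t}{\sqrt{t}}
\]

for $t \ge 0$. Using the fact that $\Re\phi(x_0) = -x_0^2/2$, we conclude that

\[
I_2 \le x_0^{-\sigma} \cdot \frac{\pi}{2} x_0 \cdot \frac{\sqrt{\pi}}{2^{3/2}}\, \frac{e^{x_0^2/2}}{x_0/\sqrt{2}} = \frac{\pi^{3/2}}{4}\, x_0^{-\sigma} e^{-\Re\phi(x_0)}.
\]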

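By (8.34), which is valid for all $\ell$, we know that $\Re\phi(x_0) \ge \Re\phi(u_{0,+})$.

Let us now estimate the integral on $L_3$. Again by (8.30), for $y < 0$,

\[
\Re\phi(iy) = \frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}.
\]

Hence

\[
\begin{aligned}
\left|\int_{L_3} e^{-\phi(u)}\, u^{-\sigma}\, du\right| &\le x_0^{-\sigma} \int_{-\infty}^{-x_0} e^{-\left(\frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}\right)}\, dy\\
&= x_0^{-\sigma} e^{\frac{1}{2}\ell^2} e^{-\tau\frac{\pi}{2}} \int_{-\infty}^{-x_0} e^{-\frac{1}{2}(y - \ell)^2}\, dy \le x_0^{-\sigma} e^{-\tau\frac{\pi}{2}} \sqrt{\frac{\pi}{2}},
\end{aligned}
\]

since $y - \ell \le -\ell$ for $y \le -x_0$ and $\int_{-\infty}^{-\ell} e^{-t^2/2}\, dt \le \sqrt{\pi/2} \cdot e^{-\ell^2/2}$ (by [AS64, 7.1.13]).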

Now that we have bounded the integrals over $L_1$, $L_2$ and $L_3$, it remains to bound $x_0$ from below, starting from (8.21). We will bound it differently for $\rho < 3/2$ and for $\rho \ge 3/2$. (The choice of $3/2$ is fairly arbitrary.)

Expanding $(\sqrt{1+t}-1)^2 > 0$, we obtain that $2(1+t)-2\sqrt{1+t} \ge t$ for all $t \ge -1$, and so

\[ \left(\frac{\sqrt{1+t}-1}{t}\right)' = \frac{1}{t^2}\left(\frac{t}{2\sqrt{1+t}}-(\sqrt{1+t}-1)\right) < 0, \]

i.e., $(\sqrt{1+t}-1)/t$ decreases as $t$ increases. Hence, for $\rho \le \rho_0$, where $\rho_0 \ge 0$,

\[ j(\rho) = \sqrt{1+\rho^2} \ge 1+\frac{\sqrt{1+\rho_0^2}-1}{\rho_0^2}\rho^2, \tag{8.44} \]

which equals $1+(2/9)(\sqrt{13}-2)\rho^2$ for $\rho_0 = 3/2$. Thus, for $\rho \le 3/2$,

\[ x_0 \ge \frac{|\ell|}{2}\sqrt{\frac{\frac{2}{9}(\sqrt{13}-2)\rho^2}{2}} = \frac{\sqrt{\sqrt{13}-2}}{6}|\ell|\rho = \frac{2\sqrt{\sqrt{13}-2}}{3}\frac{\tau}{|\ell|} \ge 0.84473\frac{|\tau|}{|\ell|}. \tag{8.45} \]

Footnote 3: It is actually not hard to prove rigorously the better bound $I_0(t) \le 0.468823\,e^t/\sqrt{t}$. For $t \ge 8$, this can be done directly by the change of variables $\cos\theta = 1-2s^2$, $d\theta = 2\,ds/\sqrt{1-s^2}$, followed by the usage of different upper bounds on the integrand $\exp(-2ts^2)/\sqrt{1-s^2}$ for $0 \le s \le 1/2$ and $1/2 \le s \le 1$. (Thanks are due to G. Kuperberg for this argument.) For $t < 8$, use the Taylor expansion of $I_0(t)$ around $t = 0$ [AS64, (9.6.12)], truncate it after 16 terms, and then bound the maximum of the truncated series by the bisection method, implemented via interval arithmetic (as described in §2.6).


On the other hand,

\[ \left(\frac{j(\rho)-1}{\rho}\right)' = \frac{1}{\rho^2}\left(j'(\rho)\rho-(j(\rho)-1)\right) = \frac{\rho^2-(1+\rho^2)+\sqrt{1+\rho^2}}{\rho^2\sqrt{1+\rho^2}} \ge 0, \]

and so, for $\rho \ge 3/2$, $(j(\rho)-1)/\rho$ is minimal at $\rho = 3/2$, where it takes the value $(\sqrt{13}-2)/3$. Hence

\[ x_0 = \frac{|\ell|}{2}\sqrt{\frac{j(\rho)-1}{2}} \ge \frac{|\ell|\sqrt{\rho}}{2}\cdot\frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}} = \frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}}\sqrt{\tau} \ge 0.51729\sqrt{\tau}. \tag{8.46} \]

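We now sum $I_1$, $I_2$ and $I_3$, and then use (8.11); we obtain that, when $\ell < 0$ and $\tau \ge 0$,

\[ |F_\delta(s)| \le \frac{e^{-2\pi^2\delta^2}|\Gamma(s)|}{\sqrt{2\pi}}\left|\int_L e^{-\phi(u)}u^{-\sigma}\,du\right| \le |x_0|^{-\sigma}\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\Re\phi(u_{0,+})}+\frac{1}{2}e^{-\frac{\tau\pi}{2}}\right)e^{-\frac{1}{2}\ell^2}|\Gamma(s)|. \tag{8.47} \]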

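By (8.19), (8.21) and (8.24),

\[ -\Re(\phi(u_{0,+})) = \frac{\ell^2}{4}(1-\upsilon(\rho))+\frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} < \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} \le \frac{\pi}{4}\tau. \]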

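We conclude that, when $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$ (i.e., $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$),

\[ |F_\delta(s)| \le |x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}|\Gamma(s)|e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|}+\frac{1}{2}e^{-\pi|\tau|}\right), \]

where $x_0$ can be bounded as in (8.45) and (8.46). Here, as before, we reduce the case $\tau < 0$ to the case $\tau > 0$ by reflection. This concludes the proof of Theorem 8.0.1.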

8.5 Conclusions

We have obtained bounds on $|F_\delta(s)|$ for $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ (8.41) and for $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ (8.47). Our task is now to simplify them.

First, let us look at the exponent $E(\rho)$ defined as in (8.2). Its plot can be seen in Figure 8.3. We claim that

\[ E(\rho) \ge \begin{cases} 0.1598 & \text{if } \rho \ge 1.5,\\ 0.1065\rho & \text{if } \rho < 1.5. \end{cases} \tag{8.48} \]

This is so for $\rho \ge 1.5$ because $E(\rho)$ is increasing in $\rho$, and $E(1.5) = 0.15982\ldots$. The case $\rho < 1.5$ is a little more delicate. We can easily see that $\arccos(1-t^2/2) \ge t$ for $0 \le t \le 2$ (since the derivative of the left side is $1/\sqrt{1-t^2/4}$, which is always $\ge 1$). We also have

\[ 1+\frac{\rho^2}{2}-\frac{\rho^4}{8} \le j(\rho) \le 1+\frac{\rho^2}{2} \]


[Figure 8.3: The function $E(\rho)$]

for $0 \le \rho \le \sqrt{8}$, and so

\[ 1+\frac{\rho^2}{8}-\frac{5\rho^4}{128} \le \upsilon(\rho) \le 1+\frac{\rho^2}{8} \]

for $0 \le \rho \le \sqrt{32/5}$; this, in turn, gives us that $1/\upsilon(\rho) \le 1-\rho^2/8+7\rho^4/128$ (again for $0 \le \rho \le \sqrt{32/5}$), and so $1/\upsilon(\rho) \le 1-(1-7/64)\rho^2/8$ for $0 \le \rho \le 1/2$. We conclude that

\[ \arccos\frac{1}{\upsilon(\rho)} \ge \frac{1}{2}\sqrt{\frac{57}{64}}\,\rho; \]

therefore,

\[ E(\rho) \ge \frac{1}{4}\sqrt{\frac{57}{64}}\,\rho-\frac{\rho}{8} > 0.11093\rho > 0.1065\rho. \]

In the remaining range $1/2 \le \rho \le 3/2$, we prove that $E(\rho)/\rho > 0.106551$ using the bisection method (with 20 iterations), implemented by means of interval arithmetic. This concludes the proof of (8.48).
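The claim (8.48) can be spot-checked in floating point. The sketch below assumes that the exponent defined in (8.2), which lies outside this section, has the form $E(\rho) = \frac{1}{2}\left(\arccos\frac{1}{\upsilon(\rho)}-\frac{2(\upsilon(\rho)-1)}{\rho}\right)$ with $\upsilon(\rho) = \sqrt{(1+j(\rho))/2}$ and $j(\rho) = \sqrt{1+\rho^2}$; this assumption is consistent with the inequalities used above and with the quoted value $E(1.5) = 0.15982\ldots$. A grid evaluation is only an illustration and does not replace the interval-arithmetic bisection of the text.

```python
import math

def j(rho):
    return math.sqrt(1.0 + rho * rho)

def upsilon(rho):
    return math.sqrt((1.0 + j(rho)) / 2.0)

def E(rho):
    # Assumed form of the exponent from (8.2); see the lead-in above.
    u = upsilon(rho)
    return 0.5 * (math.acos(1.0 / u) - 2.0 * (u - 1.0) / rho)

small = [0.001 * k for k in range(10, 1501)]     # rho in [0.01, 1.5]
large = [1.5 + 0.01 * k for k in range(2000)]     # rho in [1.5, 21.49]

min_ratio = min(E(r) / r for r in small)          # should exceed 0.1065
min_large = min(E(r) for r in large)              # should be at least 0.1598

print(f"E(1.5)                     = {E(1.5):.6f}")
print(f"min E(rho)/rho, rho <= 1.5 = {min_ratio:.6f}")
print(f"min E(rho),     rho >= 1.5 = {min_large:.6f}")
```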

Assume from this point onwards that $|\tau| \ge 20$. Let us show that the contribution of (8.3) is negligible relative to that of (8.1). Indeed,

\[ \left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|}+\frac{1}{2}e^{-\pi|\tau|} \le \frac{7.8}{10^6}e^{-0.1598\tau}. \]
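The inequality above is tightest at $|\tau| = 20$, since $\pi/4 > 0.1598$ makes the left side decay faster than the right. A short numerical check (the sampling range and the helper name are illustrative assumptions):

```python
import math

def gap_ratio(tau):
    """Left side of the displayed inequality divided by its right side."""
    lhs = (1 + math.pi / 2 ** 1.5) * math.exp(-math.pi * tau / 4) \
        + 0.5 * math.exp(-math.pi * tau)
    rhs = 7.8e-6 * math.exp(-0.1598 * tau)
    return lhs / rhs

# Sample tau from 20 to 219.5 in steps of 0.5.
ratios = [gap_ratio(20 + 0.5 * k) for k in range(400)]

print(f"ratio at tau = 20: {ratios[0]:.5f}")
print(f"largest ratio on the sample: {max(ratios):.5f}")
```

The ratio at $\tau = 20$ is just below 1, which explains the choice of the constant $7.8\cdot 10^{-6}$.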

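It is useful to note that $e^{-\ell^2/2} = e^{-2\tau/\rho}$, and so, for $\sigma \le k+1$ and $\rho \le 3/2$,

\[ \frac{e^{-2\tau/\rho}}{(0.84473|\tau|/|\ell|)^\sigma} \le \frac{e^{-40/\rho}}{\left(\frac{0.84473}{4}\rho\right)^\sigma|\ell|^\sigma} \le \frac{1}{|\ell|^\sigma}\left(\frac{4}{0.84473\cdot 1.5}\right)^\sigma\frac{e^{-80/(3t)}}{t^\sigma} \le \frac{1}{|\ell|^\sigma}\cdot 3.15683^{k+1}\,\frac{e^{-80/(3t)}}{t^{k+1}}, \tag{8.49} \]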


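where $t = 2\rho/3 \le 1$. Since $e^{-c/t}/t^{k+1}$ attains its maximum at $t = c/(k+1)$,

\[ \frac{e^{-80/(3t)}}{t^{k+1}} \le e^{-(k+1)}\left(\frac{3(k+1)}{80}\right)^{k+1}, \]

and so, for $\rho \le 3/2$,

\[ |x_0|^{-\sigma}e^{-\frac{1}{2}\ell^2} \le \frac{1}{|\ell|^\sigma}\cdot\begin{cases} 0.04355 & \text{if } 0 \le \sigma \le 1,\\ 0.00759 & \text{if } 1 \le \sigma \le 2,\\ 0.00224 & \text{if } 2 \le \sigma \le 3, \end{cases} \]

whereas $|x_0|^{-\sigma}e^{-\ell^2/2} \le |x_0|^{-\sigma} \le (0.51729\sqrt{\tau})^{-\sigma}$ for $\rho \ge 3/2$.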
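The three constants in the case $\rho \le 3/2$ follow from (8.49) and the maximum of $e^{-c/t}/t^{k+1}$ with $c = 80/3$. A minimal sketch reproducing them (variable names are illustrative):

```python
import math

base = 4 / (0.84473 * 1.5)   # the constant 3.15683 appearing in (8.49)
c = 80 / 3                   # e^{-80/(3t)} = e^{-c/t}

def bound(k):
    """base^{k+1} times the global maximum of e^{-c/t} / t^{k+1}, attained at t = c/(k+1)."""
    return base ** (k + 1) * math.exp(-(k + 1)) * ((k + 1) / c) ** (k + 1)

stated = {0: 0.04355, 1: 0.00759, 2: 0.00224}
for k in range(3):
    print(f"k = {k}: computed {bound(k):.7f}, stated {stated[k]}")
```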

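We conclude that, for $|\tau| \ge 20$ and $\sigma \le 3$,

\[ |F_\delta(s)| \le |\Gamma(s)|e^{\frac{\pi}{2}\tau}\cdot e^{-0.1598\tau}\cdot\begin{cases} \dfrac{4}{10^7}\cdot\dfrac{1}{|\ell|^\sigma} & \text{if } \rho \le 3/2,\\[2mm] \dfrac{6}{10^5}\cdot\dfrac{1}{\tau^{\sigma/2}} & \text{if } \rho \ge 3/2, \end{cases} \tag{8.50} \]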

provided that $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$. This will indeed be negligible compared to our bound for the case $\mathrm{sgn}(\delta) = -\mathrm{sgn}(\tau)$.

Let us now deal with the factor $|\Gamma(s)|e^{\frac{\pi}{2}\tau}$. By Stirling's formula with remainder term [GR94, (8.344)],

\[ \log\Gamma(s) = \frac{1}{2}\log(2\pi)+\left(s-\frac{1}{2}\right)\log s-s+\frac{1}{12s}+R_2(s), \]

where

\[ |R_2(s)| < \frac{1/30}{12|s|^3\cos^3\left(\frac{\arg s}{2}\right)} \le \frac{\sqrt{2}}{180|s|^3} \]

for $\Re(s) \ge 0$. The real part of $(s-1/2)\log s-s$ is

\[ (\sigma-1/2)\log|s|-\tau\arg(s)-\sigma = (\sigma-1/2)\log|s|-\frac{\pi}{2}\tau+\tau\left(\arctan\frac{\sigma}{|\tau|}-\frac{\sigma}{|\tau|}\right) \]

for $s = \sigma+i\tau$, $\sigma \ge 0$. Since $\arctan(r) \le r$ for $r \ge 0$, we conclude that

\[ |\Gamma(s)|e^{\frac{\pi}{2}\tau} \le \sqrt{2\pi}\,|s|^{\sigma-\frac{1}{2}}\,e^{\frac{1}{12|s|}+\frac{\sqrt{2}}{180|s|^3}}. \tag{8.51} \]

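Lastly, $|s|^{\sigma-1/2} = |\tau|^{\sigma-1/2}|1+i\sigma/\tau|^{\sigma-1/2}$. For $|\tau| \ge 20$,

\[ |1+i\sigma/\tau|^{\sigma-1/2} \le \begin{cases} 1.000625 & \text{if } 0 \le \sigma \le 1,\\ 1.007491 & \text{if } 1 \le \sigma \le 2,\\ 1.028204 & \text{if } 2 \le \sigma \le 3, \end{cases} \]

and

\[ e^{\frac{1}{12|\tau|}+\frac{\sqrt{2}}{180|\tau|^3}} \le 1.004177. \]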


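Thus,

\[ |\Gamma(s)|e^{\frac{\pi}{2}\tau} \le |\tau|^{\sigma-\frac{1}{2}}\cdot\begin{cases} 2.51868 & \text{if } 0 \le \sigma \le 1,\\ 2.53596 & \text{if } 1 \le \sigma \le 2,\\ 2.5881 & \text{if } 2 \le \sigma \le 3. \end{cases} \tag{8.52} \]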
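The constants in (8.52) are the product of $\sqrt{2\pi}$, the factor $|1+i\sigma/\tau|^{\sigma-1/2}$ and the exponential in (8.51), all taken at the worst case $|\tau| = 20$ and at the right endpoint $\sigma = k+1$ of each range. A sketch of that arithmetic, with illustrative names:

```python
import math

tau = 20.0   # worst case: both factors below decrease as |tau| grows

# Exponential factor from (8.51), using |s| >= |tau|.
stirling_tail = math.exp(1 / (12 * tau) + math.sqrt(2) / (180 * tau ** 3))

def modulus_factor(sigma):
    """|1 + i*sigma/tau|^(sigma - 1/2) at |tau| = 20."""
    return (1 + (sigma / tau) ** 2) ** ((sigma - 0.5) / 2)

# For sigma in [k, k+1], the modulus factor is largest at sigma = k + 1.
constants = {k: math.sqrt(2 * math.pi) * modulus_factor(k + 1) * stirling_tail
             for k in range(3)}
stated = {0: 2.51868, 1: 2.53596, 2: 2.5881}

for k in range(3):
    print(f"{k} <= sigma <= {k + 1}: computed {constants[k]:.6f}, stated {stated[k]}")
```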

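Let us now estimate the constants $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ in (8.2). By $|\tau| \ge 20$,

\[ e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau} \le 0.015889, \qquad e^{-\frac{\tau}{6}} \le 0.035674. \tag{8.53} \]

Since $8\sin(\pi/8) = 3.061467\ldots > 1$, we obtain that

\[ c_{1,\sigma,\tau} \le \begin{cases} 1.30454 & \text{if } 0 \le \sigma \le 1,\\ 1.58361 & \text{if } 1 \le \sigma \le 2,\\ 1.98186 & \text{if } 2 \le \sigma \le 3, \end{cases} \qquad c_{2,\sigma,\tau} \le \begin{cases} 1.94511 & \text{if } 0 \le \sigma \le 1,\\ 3.15692 & \text{if } 1 \le \sigma \le 2,\\ 5.02186 & \text{if } 2 \le \sigma \le 3. \end{cases} \]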

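Lastly, note that, for $k \le \sigma \le k+1$, we have

\[ \frac{1}{\tau^{\sigma/2}}\cdot|\tau|^{\sigma-\frac{1}{2}} = |\tau|^{\frac{\sigma-1}{2}} \le \tau^{k/2}, \]

whereas, for $\rho \le 3/2$ and $0 \le \gamma \le 1$,

\[ \frac{|\tau|^{\gamma-\frac{1}{2}}}{|\ell|^\gamma} \le |\tau|^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{\tau}{\ell^2}\right)^{\gamma/2} \le 20^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{3/2}{4}\right)^{\gamma/2} \le \left(\frac{3}{8}\right)^{1/2}, \]

and so, taking $\gamma = \sigma-k$,

\[ \frac{1}{|\ell|^\sigma}\cdot|\tau|^{\sigma-\frac{1}{2}} = \left(\frac{|\tau|}{|\ell|}\right)^k\frac{|\tau|^{\sigma-k-\frac{1}{2}}}{|\ell|^{\sigma-k}} \le \sqrt{\frac{3}{8}}\cdot\left(\frac{|\tau|}{|\ell|}\right)^k. \]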

Multiplying, and remembering to add (8.50), we obtain that, for $k = 0, 1, 2$, $\sigma \in [0,1]$ and $|\tau| \ge 20$,

\[ |F_\delta(s+k)|+|F_\delta((1-s)+k)| \le \begin{cases} \kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^k e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } \rho < 3/2,\\[2mm] \kappa_{k,1}|\tau|^k e^{-0.1598|\tau|} & \text{if } \rho \ge 3/2, \end{cases} \]

where

\[ \begin{aligned} \kappa_{0,0} &\le (4\cdot 10^{-7}+1.94511)\cdot 2.51868\cdot\sqrt{3/8} \le 3.001,\\ \kappa_{1,0} &\le (4\cdot 10^{-7}+3.15692)\cdot 2.53596\cdot\sqrt{3/8} \le 4.903,\\ \kappa_{2,0} &\le (4\cdot 10^{-7}+5.02186)\cdot 2.5881\cdot\sqrt{3/8} \le 7.96, \end{aligned} \]

and, similarly,

\[ \begin{aligned} \kappa_{0,1} &\le (6\cdot 10^{-5}+1.30454)\cdot 2.51868 \le 3.286,\\ \kappa_{1,1} &\le (6\cdot 10^{-5}+1.58361)\cdot 2.53596 \le 4.017,\\ \kappa_{2,1} &\le (6\cdot 10^{-5}+1.98186)\cdot 2.5881 \le 5.13. \end{aligned} \]

This concludes the proof of Corollary 8.0.2.
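The six products defining the bounds on $\kappa_{k,0}$ and $\kappa_{k,1}$ are plain arithmetic on constants obtained earlier, so they are easy to re-derive. The following sketch takes the constants from (8.50), (8.52) and the bounds on $c_{1,\sigma,\tau}$, $c_{2,\sigma,\tau}$ as given; the list names are illustrative.

```python
import math

gamma_factor = [2.51868, 2.53596, 2.5881]   # constants from (8.52)
c1 = [1.30454, 1.58361, 1.98186]            # bounds on c_{1,sigma,tau}
c2 = [1.94511, 3.15692, 5.02186]            # bounds on c_{2,sigma,tau}

# Case rho < 3/2: add the 4e-7 from (8.50) and multiply by sqrt(3/8).
kappa_0 = [(4e-7 + c2[k]) * gamma_factor[k] * math.sqrt(3 / 8) for k in range(3)]
# Case rho >= 3/2: add the 6e-5 from (8.50).
kappa_1 = [(6e-5 + c1[k]) * gamma_factor[k] for k in range(3)]

stated_0 = [3.001, 4.903, 7.96]
stated_1 = [3.286, 4.017, 5.13]
for k in range(3):
    print(f"k = {k}: kappa_k0 <= {kappa_0[k]:.5f} (stated {stated_0[k]}), "
          f"kappa_k1 <= {kappa_1[k]:.5f} (stated {stated_1[k]})")
```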

      Chapter 9

      Explicit formulas

An explicit formula is an expression restating a sum such as $S_{\eta,\chi}(\delta/x, x)$ as a sum of the Mellin transform $G_\delta(s)$ over the zeros of the $L$-function $L(s,\chi)$. More specifically, for us, $G_\delta(s)$ is the Mellin transform of $\eta(t)e(\delta t)$ for some smoothing function $\eta$ and some $\delta \in \mathbb{R}$. We want a formula whose error terms are good both for $\delta$ very close or equal to 0 and for $\delta$ farther away from 0. (Indeed, our choice(s) of $\eta$ will be made so that $F_\delta(s)$ decays rapidly in both cases.)

We will be able to base all of our work on a single, general explicit formula, namely, Lemma 9.1.1. This explicit formula has simple error terms given purely in terms of a few norms of the given smoothing function $\eta$. We also give a common framework for estimating the contribution of zeros on the critical strip (Lemmas 9.1.3 and 9.1.4).

The first example we work out is that of the Gaussian smoothing $\eta(t) = e^{-t^2/2}$. We actually do this in part for didactic purposes and in part because of its likely applicability elsewhere; for our applications, we will always use smoothing functions based on $te^{-t^2/2}$ and $t^2e^{-t^2/2}$, generally in combination with something else. Since $\eta(t) = e^{-t^2/2}$ does not vanish at $t = 0$, its Mellin transform has a pole at $s = 0$, something that requires some additional work (Lemma 9.1.2; see also the proof of Lemma 9.1.1).

Other than that, for each function $\eta(t)$, all that has to be done is to bound an integral (from Lemma 9.1.3) and bound a few norms. Still, both for $\eta_*$ and for $\eta_+$, we find a few interesting complications. Since $\eta_+$ is defined in terms of a truncation of a Mellin transform (or, alternatively, in terms of a multiplicative convolution with a Dirichlet kernel, as in (7.4) and (7.6)), bounding the norms of $\eta_+$ and $\eta_+'$ takes a little work. We leave this to Appendix A. The effect of the convolution is then just to delay the decay by a shift, in that a rapidly decaying function $f(\tau)$ will get replaced by $f(\tau-H)$, $H$ a constant.

The smoothing function $\eta_*$ is defined as a multiplicative convolution of $t^2e^{-t^2/2}$ with something else. Given that we have an explicit formula for $t^2e^{-t^2/2}$, we obtain an explicit formula for $\eta_*$ by what amounts to just exchanging the order of a sum and an integral. (We already went over this in the introduction, in (1.40).)


9.1 A general explicit formula

We will prove an explicit formula valid whenever the smoothing $\eta$ and its derivative $\eta'$ satisfy rather mild assumptions: they will be assumed to be $L^2$-integrable and to have strips of definition containing $\{s : 1/2 \le \Re(s) \le 3/2\}$, though any strip of the form $\{s : \epsilon \le \Re(s) \le 1+\epsilon\}$ would do just as well.

(For explicit formulas with different sets of assumptions, see, e.g., [IK04, §5.5] and [MV07, Ch. 12].)

The main idea in deriving any explicit formula is to start with an expression giving a sum as an integral over a vertical line, with an integrand involving a Mellin transform (here, $G_\delta(s)$) and an $L$-function (here, $L(s,\chi)$). We then shift the line of integration to the left. If stronger assumptions were made (as in Exercise 5 in [IK04, §5.5]), we could shift the integral all the way to $\Re(s) = -\infty$; the integral would then disappear, replaced entirely by a sum over zeros (or even, as in the same Exercise 5, by a particularly simple integral). Another possibility is to shift the line only to $\Re(s) = 1/2+\epsilon$ for some $\epsilon > 0$, but this gives a weaker result, and, at any rate, the factor $L'(s,\chi)/L(s,\chi)$ can be large and messy to estimate within the critical strip $0 < \Re(s) < 1$.

Instead, we will shift the line to $\Re s = -1/2$. We can do this because the assumptions on $\eta$ and $\eta'$ are enough to continue $G_\delta(s)$ analytically up to there (with a possible pole at $s = 0$). The factor $L'(s,\chi)/L(s,\chi)$ is easy to estimate for $\Re s < 0$ and $s = 0$ (by the functional equation), and the part of the integral on $\Re s = -1/2$ coming from $G_\delta(s)$ can be estimated easily using the fact that the Mellin transform is an isometry.

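Lemma 9.1.1. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be in $C^1$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Assume that $\eta(t)$ and $\eta'(t)$ are in $\ell_2$ (with respect to the measure $dt$) and that $\eta(t)t^{\sigma-1}$ and $\eta'(t)t^{\sigma-1}$ are in $\ell_1$ (again with respect to $dt$) for all $\sigma$ in an open interval containing $[1/2, 3/2]$.

Then

\[ \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)x-\sum_\rho G_\delta(\rho)x^\rho-R+O^*\left((\log q+6.01)\cdot(|\eta'|_2+2\pi|\delta||\eta|_2)\right)x^{-1/2}, \tag{9.1} \]

where

\[ I_{q=1} = \begin{cases} 1 & \text{if } q = 1,\\ 0 & \text{if } q \ne 1, \end{cases} \qquad R = \eta(0)\left(\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right)+O^*(c_0) \tag{9.2} \]

for $q > 1$, $R = \eta(0)\log 2\pi$ for $q = 1$, and

\[ c_0 = \frac{2}{3}O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1+\left|\eta'(t)\sqrt{t}\right|_1+2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1+|\eta(t)\sqrt{t}|_1\right)\right). \tag{9.3} \]

The norms $|\eta|_2$, $|\eta'|_2$, $|\eta'(t)/\sqrt{t}|_1$, etc., are taken with respect to the usual measure $dt$. The sum $\sum_\rho$ is a sum over all non-trivial zeros $\rho$ of $L(s,\chi)$.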


Proof. Since (a) $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval containing $3/2$ and (b) $\eta(t)e(\delta t)$ has bounded variation (since $\eta, \eta' \in \ell_1$, implying that the derivative of $\eta(t)e(\delta t)$ is also in $\ell_1$), the Mellin inversion formula (as in, e.g., [IK04, 4.106]) holds:

\[ \eta(n/x)e(\delta n/x) = \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} G_\delta(s)x^sn^{-s}\,ds. \]

Since $G_\delta(s)$ is bounded for $\Re(s) = 3/2$ (by $\eta(t)t^{3/2-1} \in \ell_1$) and $\sum_n \Lambda(n)n^{-3/2}$ is bounded as well, we can change the order of summation and integration as follows:

\[ \begin{aligned} \sum_{n=1}^\infty \Lambda(n)\chi(n)e(\delta n/x)\eta(n/x) &= \sum_{n=1}^\infty \Lambda(n)\chi(n)\cdot\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} G_\delta(s)x^sn^{-s}\,ds\\ &= \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}\sum_{n=1}^\infty \Lambda(n)\chi(n)G_\delta(s)x^sn^{-s}\,ds\\ &= \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} -\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds. \end{aligned} \tag{9.4} \]

(This is the way the procedure always starts: see, for instance, [HL22, Lemma 1] or, to look at a recent standard reference, [MV07, p. 144]. We are being very scrupulous about integration because we are working with general $\eta$.)

The first question we should ask ourselves is: up to where can we extend $G_\delta(s)$? Since $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval $I$ containing $[1/2, 3/2]$, the transform $G_\delta(s)$ is defined for $\Re(s)$ in the same interval $I$. However, we also know that the transformation rule $M(tf'(t))(s) = -s\cdot Mf(s)$ (see (2.10); by integration by parts) is valid when $s$ is in the holomorphy strip for both $M(tf'(t))$ and $Mf$. In our case ($f(t) = \eta(t)e(\delta t)$), this happens when $\Re(s) \in (I-1)\cap I$ (so that both sides of the equation in the rule are defined). Hence $s\cdot G_\delta(s)$ (which equals $s\cdot Mf(s)$) can be analytically continued to $\Re(s)$ in $(I-1)\cup I$, which is an open interval containing $[-1/2, 3/2]$. This implies immediately that $G_\delta(s)$ can be analytically continued to the same region, with a possible pole at $s = 0$.

When does $G_\delta(s)$ have a pole at $s = 0$? This happens when $sG_\delta(s)$ is non-zero at $s = 0$, i.e., when $M(tf'(t))(0) \ne 0$ for $f(t) = \eta(t)e(\delta t)$. Now

\[ M(tf'(t))(0) = \int_0^\infty f'(t)\,dt = \lim_{t\to\infty} f(t)-f(0). \]

We already know that $f'(t) = (d/dt)(\eta(t)e(\delta t))$ is in $\ell_1$. Hence $\lim_{t\to\infty} f(t)$ exists, and must be 0 because $f$ is in $\ell_1$. Hence $-M(tf'(t))(0) = f(0) = \eta(0)$.

Let us look at the next term in the Laurent expansion of $G_\delta(s)$ at $s = 0$. It is

\[ \begin{aligned} \lim_{s\to 0}\frac{sG_\delta(s)-\eta(0)}{s} &= \lim_{s\to 0}\frac{-M(tf'(t))(s)-f(0)}{s} = -\lim_{s\to 0}\frac{1}{s}\int_0^\infty f'(t)(t^s-1)\,dt\\ &= -\int_0^\infty f'(t)\lim_{s\to 0}\frac{t^s-1}{s}\,dt = -\int_0^\infty f'(t)\log t\,dt. \end{aligned} \]


Here we were able to exchange the limit and the integral because $f'(t)t^\sigma$ is in $\ell_1$ for $\sigma$ in a neighborhood of 0; in turn, this is true because $f'(t) = (\eta'(t)+2\pi i\delta\eta(t))e(\delta t)$, and $\eta'(t)t^\sigma$ and $\eta(t)t^\sigma$ are both in $\ell_1$ for $\sigma$ in a neighborhood of 0. In fact, we will use the easy bounds $|\eta(t)\log t|_1 \le (2/3)(|\eta(t)t^{-1/2}|_1+|\eta(t)t^{1/2}|_1)$, $|\eta'(t)\log t|_1 \le (2/3)(|\eta'(t)t^{-1/2}|_1+|\eta'(t)t^{1/2}|_1)$, resulting from the inequality

\[ |\log t| \le \frac{2}{3}\left(t^{-\frac{1}{2}}+t^{\frac{1}{2}}\right), \tag{9.5} \]

valid for all $t > 0$.

We conclude that the Laurent expansion of $G_\delta(s)$ at $s = 0$ is

\[ G_\delta(s) = \frac{\eta(0)}{s}+c_0+c_1s+\ldots, \tag{9.6} \]

where

\[ c_0 = O^*(|f'(t)\log t|_1) = \frac{2}{3}O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1+\left|\eta'(t)\sqrt{t}\right|_1+2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1+|\eta(t)\sqrt{t}|_1\right)\right). \]
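The elementary inequality (9.5), $|\log t| \le \frac{2}{3}(t^{-1/2}+t^{1/2})$, is what converts the norm $|f'(t)\log t|_1$ into the four norms appearing in $c_0$. It is fairly tight: writing $t = e^{2u}$ reduces it to $\frac{3}{2}|u| \le \cosh u$, whose slack is smallest near $u = \operatorname{arcsinh}(3/2)$. A quick grid check (the grid is an arbitrary illustrative choice):

```python
import math

def slack(t):
    """(2/3)(t^{-1/2} + t^{1/2}) - |log t|; non-negative exactly when (9.5) holds at t."""
    return (2 / 3) * (t ** -0.5 + t ** 0.5) - abs(math.log(t))

# Logarithmically spaced grid covering t from e^{-20} to e^{20}.
grid = [math.exp(-20 + 0.001 * k) for k in range(40001)]
min_slack = min(slack(t) for t in grid)

print(f"smallest slack on the grid: {min_slack:.6f}")
```

The smallest slack is about 0.014, so the constant $2/3$ cannot be lowered by much.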

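We shift the line of integration in (9.4) to $\Re(s) = -1/2$. We obtain

\[ \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} -\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds = I_{q=1}G_\delta(1)x-\sum_\rho G_\delta(\rho)x^\rho-R-\frac{1}{2\pi i}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds, \tag{9.7} \]

where

\[ R = \mathrm{Res}_{s=0}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s). \]

Of course,

\[ G_\delta(1) = M(\eta(t)e(\delta t))(1) = \int_0^\infty \eta(t)e(\delta t)\,dt = \widehat{\eta}(-\delta). \]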

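Let us work out the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s = 0$. By the functional equation (as in, e.g., [IK04, Thm. 4.15]),

\[ \frac{L'(s,\chi)}{L(s,\chi)} = \log\frac{\pi}{q}-\frac{1}{2}\psi\left(\frac{s+\kappa}{2}\right)-\frac{1}{2}\psi\left(\frac{1-s+\kappa}{2}\right)-\frac{L'(1-s,\chi)}{L(1-s,\chi)}, \tag{9.8} \]

where $\psi(s) = \Gamma'(s)/\Gamma(s)$ and

\[ \kappa = \begin{cases} 0 & \text{if } \chi(-1) = 1,\\ 1 & \text{if } \chi(-1) = -1. \end{cases} \]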


By $\psi(1-x)-\psi(x) = \pi\cot\pi x$ (immediate from $\Gamma(s)\Gamma(1-s) = \pi/\sin\pi s$) and $\psi(s)+\psi(s+1/2) = 2(\psi(2s)-\log 2)$ (Legendre; [AS64, (6.3.8)]),

\[ -\frac{1}{2}\left(\psi\left(\frac{s+\kappa}{2}\right)+\psi\left(\frac{1-s+\kappa}{2}\right)\right) = -\psi(1-s)+\log 2+\frac{\pi}{2}\cot\frac{\pi(s+\kappa)}{2}. \tag{9.9} \]

Hence, unless $q = 1$, the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s = 0$ is

\[ \frac{1-\kappa}{s}+\left(\log\frac{2\pi}{q}-\psi(1)-\frac{L'(1,\chi)}{L(1,\chi)}\right)+a_1s+a_2s^2+\ldots \]

Here $\psi(1) = -\gamma$, the Euler gamma constant [AS64, (6.3.2)].

There is a special case for $q = 1$ due to the pole of $\zeta(s)$ at $s = 1$. We know that $\zeta'(0)/\zeta(0) = \log 2\pi$ (see, e.g., [MV07, p. 331]).

From this and (9.6), we conclude that, if $\eta(0) = 0$, then

\[ R = \begin{cases} c_0 & \text{if } q > 1 \text{ and } \chi(-1) = 1,\\ 0 & \text{otherwise,} \end{cases} \]

where $c_0 = O^*(|\eta'(t)\log t|_1+2\pi|\delta||\eta(t)\log t|_1)$. If $\eta(0) \ne 0$, then

\[ R = \eta(0)\left(\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right)+\begin{cases} c_0 & \text{if } \chi(-1) = 1,\\ 0 & \text{otherwise} \end{cases} \]

for $q > 1$, and $R = \eta(0)\log 2\pi$ for $q = 1$.

It is time to estimate the integral on the right side of (9.7). For that, we will need to estimate $L'(s,\chi)/L(s,\chi)$ for $\Re(s) = -1/2$ using (9.8) and (9.9).

If $\Re(z) = 3/2$, then $|t^2+z^2| \ge 9/4$ for all real $t$. Hence, by [OLBC10, (5.9.15)] and [GR94, (3.411.1)],

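\[ \begin{aligned} \psi(z) &= \log z-\frac{1}{2z}-2\int_0^\infty \frac{t\,dt}{(t^2+z^2)(e^{2\pi t}-1)}\\ &= \log z-\frac{1}{2z}+2\cdot O^*\left(\int_0^\infty \frac{t\,dt}{\frac{9}{4}(e^{2\pi t}-1)}\right) = \log z-\frac{1}{2z}+\frac{8}{9}O^*\left(\int_0^\infty \frac{t\,dt}{e^{2\pi t}-1}\right)\\ &= \log z-\frac{1}{2z}+\frac{8}{9}\cdot O^*\left(\frac{1}{(2\pi)^2}\Gamma(2)\zeta(2)\right)\\ &= \log z-\frac{1}{2z}+O^*\left(\frac{1}{27}\right) = \log z+O^*\left(\frac{10}{27}\right). \end{aligned} \tag{9.10} \]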

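Thus, in particular, $\psi(1-s) = \log(3/2-i\tau)+O^*(10/27)$, where we write $s = -1/2+i\tau$. Now

\[ \left|\cot\frac{\pi(s+\kappa)}{2}\right| = \left|\frac{e^{\mp\frac{\pi}{4}i-\frac{\pi}{2}\tau}+e^{\pm\frac{\pi}{4}i+\frac{\pi}{2}\tau}}{e^{\mp\frac{\pi}{4}i-\frac{\pi}{2}\tau}-e^{\pm\frac{\pi}{4}i+\frac{\pi}{2}\tau}}\right| = 1. \]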


Since $\Re(s) = -1/2$, a comparison of Dirichlet series gives

\[ \left|\frac{L'(1-s,\chi)}{L(1-s,\chi)}\right| \le \frac{|\zeta'(3/2)|}{|\zeta(3/2)|} \le 1.50524, \tag{9.11} \]

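where $\zeta'(3/2)$ and $\zeta(3/2)$ can be evaluated by Euler-Maclaurin. Therefore, (9.8) and (9.9) give us that, for $s = -1/2+i\tau$,

\[ \begin{aligned} \left|\frac{L'(s,\chi)}{L(s,\chi)}\right| &\le \left|\log\frac{q}{\pi}\right|+\log\left|\frac{3}{2}+i\tau\right|+\frac{10}{27}+\log 2+\frac{\pi}{2}+1.50524\\ &\le \left|\log\frac{q}{\pi}\right|+\frac{1}{2}\log\left(\tau^2+\frac{9}{4}\right)+4.1396. \end{aligned} \tag{9.12} \]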
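The constant in (9.11) and the total 4.1396 in (9.12) can be reproduced with a few lines of floating-point arithmetic. The sketch below uses the Euler-Maclaurin summation formula, truncated after the first Bernoulli correction term, for $\zeta(s)$ and $\zeta'(s)$ at $s = 3/2$; the cutoff `n_terms` and the function name are illustrative assumptions, and no rigorous error control is attempted.

```python
import math

def zeta_and_derivative(s, n_terms=1000):
    """Euler-Maclaurin approximations to zeta(s) and zeta'(s) for real s > 1."""
    N = n_terms
    log_n = math.log(N)
    z = sum(n ** -s for n in range(1, N))
    dz = -sum(math.log(n) * n ** -s for n in range(2, N))
    # Tail from N onwards: integral term, half of the N-th term, first Bernoulli correction.
    z += N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12
    dz += (-N ** (1 - s) * (log_n / (s - 1) + 1 / (s - 1) ** 2)
           - 0.5 * log_n * N ** -s
           + N ** (-s - 1) * (1 - s * log_n) / 12)
    return z, dz

z, dz = zeta_and_derivative(1.5)
log_derivative = abs(dz) / z                       # |zeta'(3/2)| / zeta(3/2)
total = 10 / 27 + math.log(2) + math.pi / 2 + 1.50524

print(f"zeta(3/2)                = {z:.7f}")
print(f"|zeta'(3/2)| / zeta(3/2) = {log_derivative:.6f}")
print(f"sum of constants in (9.12) = {total:.6f}")
```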

Recall that we must bound the integral on the right side of (9.7). The absolute value of the integral is at most $x^{-1/2}$ times

\[ \frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)\right|\,|ds|. \tag{9.13} \]

By Cauchy-Schwarz, this is at most

\[ \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2|ds|}\cdot\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2\,|ds|}. \]

By (9.12),

\[ \begin{aligned} \sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2|ds|} &\le \sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{\log q}{s}\right|^2|ds|}+\sqrt{\int_{-\infty}^{\infty}\frac{\left|\frac{1}{2}\log\left(\tau^2+\frac{9}{4}\right)+4.1396+\log\pi\right|^2}{\frac{1}{4}+\tau^2}\,d\tau}\\ &\le \sqrt{2\pi}\log q+\sqrt{226.844}, \end{aligned} \]

where we compute the last integral numerically (see footnote 1).

Again, we use the fact that, by (2.10), $sG_\delta(s)$ is the Mellin transform of

\[ -t\frac{d(e(\delta t)\eta(t))}{dt} = -2\pi i\delta te(\delta t)\eta(t)-te(\delta t)\eta'(t). \tag{9.14} \]

Hence, by Plancherel (as in (2.6)),

\[ \begin{aligned} \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2\,|ds|} &= \sqrt{\int_0^\infty |-2\pi i\delta te(\delta t)\eta(t)-te(\delta t)\eta'(t)|^2t^{-2}\,dt}\\ &\le 2\pi|\delta|\sqrt{\int_0^\infty |\eta(t)|^2dt}+\sqrt{\int_0^\infty |\eta'(t)|^2dt}. \end{aligned} \tag{9.15} \]

Footnote 1: By a rigorous integration from $\tau = -100000$ to $\tau = 100000$ using VNODE-LP [Ned06], which runs on the PROFIL/BIAS interval arithmetic package [Knu99].


Thus, (9.13) is at most

\[ \left(\log q+\sqrt{\frac{226.844}{2\pi}}\right)\cdot(|\eta'|_2+2\pi|\delta||\eta|_2). \]
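The value 226.844 comes from a rigorous interval-arithmetic integration in the text. As a non-rigorous cross-check, the same integral can be approximated in ordinary floating point: the substitution $\tau = \frac{1}{2}\tan\theta$ turns $d\tau/(\frac{1}{4}+\tau^2)$ into $2\,d\theta$ and maps the real line to $(-\pi/2, \pi/2)$, where a midpoint rule copes with the mild logarithmic endpoint singularity. The step count and names below are illustrative assumptions.

```python
import math

def numerator(tau):
    """(1/2) log(tau^2 + 9/4) + 4.1396 + log(pi), the quantity that gets squared."""
    return 0.5 * math.log(tau * tau + 2.25) + 4.1396 + math.log(math.pi)

def integral(n=200000):
    """Midpoint rule in theta after tau = tan(theta)/2, so dtau/(1/4 + tau^2) = 2 dtheta."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += numerator(0.5 * math.tan(theta)) ** 2
    return 2.0 * total * h

value = integral()
constant = math.sqrt(value / (2 * math.pi))   # compare with 6.01 in (9.1)

print(f"integral  ~ {value:.3f}")
print(f"sqrt(integral / (2 pi)) ~ {constant:.4f}")
```

Dividing by $\sqrt{2\pi}$, as in the last display, is what produces the constant 6.01 in the statement of Lemma 9.1.1.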

Lemma 9.1.1 leaves us with three tasks: bounding the sum of $G_\delta(\rho)x^\rho$ over all non-trivial zeroes $\rho$ with small imaginary part, bounding the sum of $G_\delta(\rho)x^\rho$ over all non-trivial zeroes $\rho$ with large imaginary part, and bounding $L'(1,\chi)/L(1,\chi)$. Let us start with the last task: while, in a narrow sense, it is optional (in that, in the applications we actually need (Thm. 7.1.2, Cor. 7.1.3 and Thm. 7.1.4), we will have $\eta(0) = 0$, thus making the term $L'(1,\chi)/L(1,\chi)$ disappear), it is also very easy and can be dealt with quickly.

Since we will be using a finite GRH check in all later applications, we might as well use it here.

Lemma 9.1.2. Let $\chi$ be a primitive character mod $q$, $q > 1$. Assume that all non-trivial zeroes $\rho = \sigma+it$ of $L(s,\chi)$ with $|t| \le 5/8$ satisfy $\Re(\rho) = 1/2$. Then

\[ \left|\frac{L'(1,\chi)}{L(1,\chi)}\right| \le \frac{5}{2}\log M(q)+c, \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$ and

\[ c = 5\log\frac{2\sqrt{3}}{\zeta(9/4)/\zeta(9/8)} = 15.07016\ldots \]

Proof. By a lemma of Landau's (see, e.g., [MV07, Lemma 6.3], where the constants are easily made explicit) based on the Borel-Carathéodory Lemma (as in [MV07, Lemma 6.2]), any function $f$ analytic and zero-free on a disc $C_{s_0,R} = \{s : |s-s_0| \le R\}$ of radius $R > 0$ around $s_0$ satisfies

\[ \frac{f'(s)}{f(s)} = O^*\left(\frac{2R\log(M/|f(s_0)|)}{(R-r)^2}\right) \tag{9.16} \]

for all $s$ with $|s-s_0| \le r$, where $0 < r < R$ and $M$ is the maximum of $|f(z)|$ on $C_{s_0,R}$. Assuming $L(s,\chi)$ has no non-trivial zeros off the critical line with $|\Im(s)| \le H$, where $H > 1/2$, we set $s_0 = 1/2+H$, $r = H-1/2$, and let $R \to H^-$. We obtain

\[ \frac{L'(1,\chi)}{L(1,\chi)} = O^*\left(8H\log\frac{\max_{s\in C_{s_0,H}}|L(s,\chi)|}{|L(s_0,\chi)|}\right). \tag{9.17} \]

Now

\[ |L(s_0,\chi)| \ge \prod_p (1+p^{-s_0})^{-1} = \prod_p \frac{(1-p^{-2s_0})^{-1}}{(1-p^{-s_0})^{-1}} = \frac{\zeta(2s_0)}{\zeta(s_0)}. \]

Since $s_0 = 1/2+H$, $C_{s_0,H}$ is contained in $\{s \in \mathbb{C} : \Re(s) > 1/2\}$ for any value of $H$. We choose (somewhat arbitrarily) $H = 5/8$.


By partial summation, for $s = \sigma+it$ with $1/2 \le \sigma < 1$ and any $N \in \mathbb{Z}^+$,

\[ \begin{aligned} L(s,\chi) &= \sum_{n\le N}\chi(n)n^{-s}-\left(\sum_{m\le N}\chi(m)\right)(N+1)^{-s}+\sum_{n\ge N+1}\left(\sum_{m\le n}\chi(m)\right)\left(n^{-s}-(n+1)^{-s}\right)\\ &= O^*\left(\frac{N^{1-1/2}}{1-1/2}+N^{1-\sigma}+M(q)N^{-\sigma}\right), \end{aligned} \tag{9.18} \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$. We set $N = M(q)/3$, and obtain

\[ |L(s,\chi)| \le 2M(q)N^{-1/2} = 2\sqrt{3}\sqrt{M(q)}. \tag{9.19} \]

We put this into (9.17) and are done.
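With $H = 5/8$ we have $s_0 = 9/8$, $8H = 5$ and $|L(s_0,\chi)| \ge \zeta(9/4)/\zeta(9/8)$, so (9.17) and (9.19) give $\frac{5}{2}\log M(q)+5\log\frac{2\sqrt{3}\,\zeta(9/8)}{\zeta(9/4)}$. The constant can be re-derived numerically; the sketch below evaluates $\zeta$ by a truncated Euler-Maclaurin formula (an illustrative choice of method, with no rigorous error bound).

```python
import math

def zeta(s, n_terms=1000):
    """Euler-Maclaurin approximation to zeta(s) for real s > 1."""
    N = n_terms
    head = sum(n ** -s for n in range(1, N))
    # Tail: integral term, half of the N-th term, first Bernoulli correction.
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12

H = 5 / 8
s0 = 0.5 + H                                   # = 9/8
c = 8 * H * math.log(2 * math.sqrt(3) * zeta(s0) / zeta(2 * s0))

print(f"zeta(9/8) = {zeta(s0):.6f}, zeta(9/4) = {zeta(2 * s0):.6f}")
print(f"c = {c:.5f}")
```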

Let $M(q)$ be as in the statement of Lem. 9.1.2. Since the sum of $\chi(n)$ ($\chi$ mod $q$, $q > 1$) over any interval of length $q$ is 0, it is easy to see that $M(q) \le q/2$. We also have the following explicit version of the Pólya-Vinogradov inequality:

\[ M(q) \le \begin{cases} \frac{2}{\pi^2}\sqrt{q}\log q+\frac{4}{\pi^2}\sqrt{q}\log\log q+\frac{3}{2}\sqrt{q} & \text{if } \chi(-1) = 1,\\[1mm] \frac{1}{2\pi}\sqrt{q}\log q+\frac{1}{\pi}\sqrt{q}\log\log q+\sqrt{q} & \text{if } \chi(-1) = -1. \end{cases} \tag{9.20} \]

Taken together with $M(q) \le q/2$, this implies that

\[ M(q) \le q^{4/5} \tag{9.21} \]

for all $q \ge 1$, and also that

\[ M(q) \le 2q^{3/5} \tag{9.22} \]

for all $q \ge 1$.

Notice, lastly, that

\[ \left|\log\frac{2\pi}{q}+\gamma\right| \le \log q+\log\frac{e^\gamma\cdot 2\pi}{3^2} \]

for all $q \ge 3$. (There are no primitive characters modulo 2, so we can omit $q = 2$.)

We conclude that, for $\chi$ primitive and non-trivial,

\[ \begin{aligned} \left|\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right| &\le \log\frac{e^\gamma\cdot 2\pi}{3^2}+\log q+\frac{5}{2}\log q^{4/5}+15.07017\\ &\le 3\log q+15.289. \end{aligned} \]
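Both (9.21), (9.22) and the final constant 15.289 are straightforward to sanity-check. The sketch below takes the Pólya-Vinogradov bound in the form (9.20) as read from the text, combines it with $M(q) \le q/2$, and tests the two power bounds for $3 \le q \le 10^5$; it then recomputes the constant term of the last display. The range of $q$ and all names are illustrative assumptions.

```python
import math

def polya_vinogradov(q, even=True):
    """Right side of (9.20) for chi(-1) = 1 (even) or chi(-1) = -1 (odd)."""
    root, lg = math.sqrt(q), math.log(q)
    llg = math.log(lg)
    if even:
        return root * (2 / math.pi ** 2 * lg + 4 / math.pi ** 2 * llg + 1.5)
    return root * (lg / (2 * math.pi) + llg / math.pi + 1.0)

def best_bound(q):
    """min(q/2, Polya-Vinogradov); for q >= 3 the even case is the larger of the two."""
    return min(q / 2, polya_vinogradov(q, even=True))

qs = range(3, 100001)
ok_921 = all(best_bound(q) <= q ** 0.8 for q in qs)
ok_922 = all(best_bound(q) <= 2 * q ** 0.6 for q in qs)

euler_gamma = 0.5772156649015329
# Constant term: log(e^gamma * 2 pi / 9) + 15.07017; the log q terms add up to 3 log q.
constant_term = math.log(math.exp(euler_gamma) * 2 * math.pi / 9) + 15.07017

print(f"(9.21) holds on the range: {ok_921}")
print(f"(9.22) holds on the range: {ok_922}")
print(f"constant term = {constant_term:.5f}")
```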

Obviously, 15.289 is more than $\log 2\pi$, the bound for $\chi$ trivial. Hence, the absolute value of the quantity $R$ in the statement of Lemma 9.1.1 is at most

\[ |\eta(0)|(3\log q+15.289)+|c_0| \tag{9.23} \]


for all primitive $\chi$.

It now remains to bound the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1). Clearly,

\[ \left|\sum_\rho G_\delta(\rho)x^\rho\right| \le \sum_\rho |G_\delta(\rho)|\cdot x^{\Re(\rho)}. \]

Recall that these are sums over the non-trivial zeros $\rho$ of $L(s,\chi)$.

We first prove a general lemma on sums of values of functions on the non-trivial zeros of $L(s,\chi)$. This is little more than partial summation, given a (classical) bound for the number of zeroes $N(T,\chi)$ of $L(s,\chi)$ with $|\Im(s)| \le T$. The error term becomes particularly simple if $f$ is real-valued and decreasing; the statement is then practically identical to that of [Leh66, Lemma 1] (for $\chi$ principal), except for the fact that the error term is improved here.

Lemma 9.1.3. Let $f : \mathbb{R}^+ \to \mathbb{C}$ be piecewise $C^1$. Assume $\lim_{t\to\infty} f(t)t\log t = 0$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$; let $\rho$ denote the non-trivial zeros of $L(s,\chi)$. Then, for any $y \ge 1$,

\[ \sum_{\substack{\rho \text{ non-trivial}\\ \Im(\rho) > y}} f(\Im(\rho)) = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT+\frac{1}{2}O^*\left(|f(y)|g_\chi(y)+\int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.24} \]

where

\[ g_\chi(T) = 0.5\log qT+17.7. \tag{9.25} \]

If $f$ is real-valued and decreasing on $[y,\infty)$, the second term on the right side of (9.24) equals

\[ O^*\left(\frac{1}{4}\int_y^\infty \frac{f(T)}{T}\,dT\right). \]

Proof. Write $N(T,\chi)$ for the number of non-trivial zeros of $L(s,\chi)$ satisfying $|\Im(s)| \le T$. Write $N^+(T,\chi)$ for the number of (necessarily non-trivial) zeros of $L(s,\chi)$ with $0 < \Im(s) \le T$. Then, for any $f : \mathbb{R}^+ \to \mathbb{C}$ with $f$ piecewise differentiable and $\lim_{t\to\infty} f(t)N(t,\chi) = 0$,

\[ \begin{aligned} \sum_{\rho : \Im(\rho) > y} f(\Im(\rho)) &= \int_y^\infty f(T)\,dN^+(T,\chi)\\ &= -\int_y^\infty f'(T)(N^+(T,\chi)-N^+(y,\chi))\,dT\\ &= -\frac{1}{2}\int_y^\infty f'(T)(N(T,\chi)-N(y,\chi))\,dT. \end{aligned} \]

Now, by [Ros41, Thms. 17–19] and [McC84a, Thm. 2.1] (see also [Tru, Thm. 1]),

\[ N(T,\chi) = \frac{T}{\pi}\log\frac{qT}{2\pi e}+O^*(g_\chi(T)) \tag{9.26} \]


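for $T \ge 1$, where $g_\chi(T)$ is as in (9.25). (This is a classical formula; the references serve to prove the explicit form (9.25) for the error term $g_\chi(T)$.)

Thus, for $y \ge 1$,

\[ \begin{aligned} \sum_{\rho : \Im(\rho) > y} f(\Im(\rho)) = &-\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e}-\frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT\\ &+\frac{1}{2}O^*\left(|f(y)|g_\chi(y)+\int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right). \end{aligned} \tag{9.27} \]

Here

\[ -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e}-\frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT. \tag{9.28} \]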

If $f$ is real-valued and decreasing (and so, by $\lim_{t\to\infty} f(t) = 0$, non-negative),

\[ \begin{aligned} |f(y)|g_\chi(y)+\int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT &= f(y)g_\chi(y)-\int_y^\infty f'(T)g_\chi(T)\,dT\\ &= 0.5\int_y^\infty \frac{f(T)}{T}\,dT, \end{aligned} \]

by integration by parts, since $g_\chi'(T) = 0.5/T$ for all $T \ge y$.

Let us bound the part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to $\rho$ with bounded $|\Im(\rho)|$. The bound we will give is proportional to $\sqrt{T_0}\log qT_0$, whereas a very naive approach (based on the trivial bound $|G_\delta(\sigma+i\tau)| \le |G_0(\sigma)|$) would give a bound proportional to $T_0\log qT_0$.

We could obtain a bound proportional to $\sqrt{T_0}\log qT_0$ for $\eta(t) = t^ke^{-t^2/2}$ by using Theorem 8.0.1. Instead, we will give a bound of that same quality valid for $\eta$ essentially arbitrary, simply by using the fact that the Mellin transform is an isometry (preceded by an application of Cauchy-Schwarz).

Lemma 9.1.4. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be such that both $\eta(t)$ and $(\log t)\eta(t)$ lie in $L_1\cap L_2$ and $\eta(t)/\sqrt{t}$ lies in $L_1$ (with respect to $dt$). Let $\delta \in \mathbb{R}$. Let $G_\delta(s)$ be the Mellin transform of $\eta(t)e(\delta t)$.

Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Let $T_0 \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Then

\[ \sum_{\substack{\rho \text{ non-trivial}\\ |\Im(\rho)| \le T_0}} |G_\delta(\rho)| \]

is at most

\[ (|\eta|_2+|\eta\cdot\log|_2)\sqrt{T_0}\log qT_0+\left(17.21|\eta\cdot\log|_2-(\log 2\pi\sqrt{e})|\eta|_2\right)\sqrt{T_0}+\left|\eta(t)/\sqrt{t}\right|_1\cdot(1.32\log q+34.5). \tag{9.29} \]


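Proof. For $s = 1/2+i\tau$, we have the trivial bound

\[ |G_\delta(s)| \le \int_0^\infty |\eta(t)|t^{1/2}\frac{dt}{t} = \left|\eta(t)/\sqrt{t}\right|_1, \tag{9.30} \]

where $F_\delta$ is as in (9.47). We also have the trivial bound

\[ |G_\delta'(s)| = \left|\int_0^\infty (\log t)\eta(t)e(\delta t)t^s\frac{dt}{t}\right| \le \int_0^\infty |(\log t)\eta(t)|t^\sigma\frac{dt}{t} = \left|(\log t)\eta(t)t^{\sigma-1}\right|_1 \tag{9.31} \]

for $s = \sigma+i\tau$.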

Let us start by bounding the contribution of very low-lying zeros ($|\Im(\rho)| \le 1$). By (9.26) and (9.25),
\[ N(1,\chi) = \frac{1}{\pi}\log\frac{q}{2\pi e} + O^*(0.5\log q + 17.7) = O^*(0.819\log q + 16.8). \]
Therefore,
\[ \sum_{\substack{\rho \text{ non-trivial} \\ |\Im(\rho)| \le 1}} |G_\delta(\rho)| \le \left|\eta(t)t^{-1/2}\right|_1 \cdot (0.819\log q + 16.8). \]

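Let us now consider zeros $\rho$ with $|\Im(\rho)| > 1$. Apply Lemma 9.1.3 with $y = 1$ and
\[ f(t) = \begin{cases} |G_\delta(1/2 + it)| & \text{if } t \le T_0, \\ 0 & \text{if } t > T_0. \end{cases} \]
This gives us that
\[ \sum_{\rho : 1 < |\Im(\rho)| \le T_0} f(\Im(\rho)) = \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + O^*\left(|f(1)|\, g_\chi(1) + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.32} \]
where we are using the fact that $f(\sigma + i\tau) = f(\sigma - i\tau)$ (because $\eta$ is real-valued). By Cauchy-Schwarz,
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\frac{1}{\pi}\int_1^{T_0} |f(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT}. \]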

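Now
\[ \frac{1}{\pi}\int_1^{T_0} |f(T)|^2\,dT \le \frac{1}{2\pi}\int_{-\infty}^\infty \left|G_\delta\left(\frac{1}{2} + iT\right)\right|^2 dT \le \int_0^\infty |e(\delta t)\eta(t)|^2\,dt = |\eta|_2^2 \]
by Plancherel (as in (2.6)). We also have
\[ \int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT \le \frac{2\pi}{q}\int_0^{qT_0/(2\pi)} (\log t)^2\,dt \le \left(\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1\right)\cdot T_0. \]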


Hence
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1}\cdot |\eta|_2\sqrt{T_0}. \]
Again by Cauchy-Schwarz,
\[ \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^\infty |f'(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0} |g_\chi(T)|^2\,dT}. \]
Since $|f'(T)| = |G_\delta'(1/2 + iT)|$ and $(M\eta)'(s)$ is the Mellin transform of $\log(t)\cdot e(\delta t)\eta(t)$ (by (2.10)),
\[ \frac{1}{2\pi}\int_{-\infty}^\infty |f'(T)|^2\,dT = |\eta(t)\log(t)|_2^2. \]

Much as before,
\[ \int_1^{T_0} |g_\chi(T)|^2\,dT \le \int_0^{T_0} (0.5\log qT + 17.7)^2\,dT = \left(0.25(\log qT_0)^2 + 17.2(\log qT_0) + 296.09\right)T_0. \]
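The closed form for the integral of $g_\chi^2$ can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it takes $g_\chi(T) = 0.5\log qT + 17.7$ as in the text, and the helper names (`g_chi`, `closed_form`, `midpoint_integral`) as well as the sample values of `q` and `T0` are illustrative assumptions.

```python
import math

def g_chi(T, q):
    """g_chi(T) = 0.5*log(q*T) + 17.7, as used in the text."""
    return 0.5 * math.log(q * T) + 17.7

def closed_form(T0, q):
    """Claimed value of the integral of g_chi(T)^2 over [0, T0]."""
    L = math.log(q * T0)
    return (0.25 * L * L + 17.2 * L + 296.09) * T0

def midpoint_integral(f, a, b, n=200_000):
    """Plain midpoint rule; it never evaluates f at the endpoint a = 0,
    where the integrand has an (integrable) logarithmic singularity."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

q, T0 = 3.0, 50.0  # hypothetical sample values
numeric = midpoint_integral(lambda T: g_chi(T, q) ** 2, 0.0, T0)
exact = closed_form(T0, q)
print(numeric, exact)
```

The two printed numbers should agree to several significant digits; the midpoint rule is only a rough check and is no substitute for the exact antiderivative.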

Summing, we obtain
\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \left(\left(\log\frac{qT_0}{2\pi e} + \frac{1}{2}\right)|\eta|_2 + \left(\frac{\log qT_0}{2} + 17.21\right)|\eta(t)(\log t)|_2\right)\sqrt{T_0}. \]
Finally, by (9.30) and (9.25),
\[ |f(1)|\, g_\chi(1) \le \left|\eta(t)/\sqrt{t}\right|_1 \cdot (0.5\log q + 17.7). \]
By (9.32) and the assumption that all non-trivial zeros with $|\Im(\rho)| \le T_0$ lie on the line $\Re(s) = 1/2$, we conclude that
\[ \sum_{\substack{\rho \text{ non-trivial} \\ 1 < |\Im(\rho)| \le T_0}} |G_\delta(\rho)| \le (|\eta|_2 + |\eta\cdot\log|_2)\sqrt{T_0}\log qT_0 + \left(17.21\,|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})\,|\eta|_2\right)\sqrt{T_0} + \left|\eta(t)/\sqrt{t}\right|_1 \cdot (0.5\log q + 17.7). \]

All that remains is to bound the contribution to $\sum_\rho G_\delta(\rho)x^\rho$ corresponding to all zeros $\rho$ with $|\Im(\rho)| > T_0$. This we will do by another application of Lemma 9.1.3, combined with bounds on $G_\delta(\rho)$ for $\Im(\rho)$ large. This is the only part that will require us to take a look at the actual smoothing function $\eta$ we are working with; it is at this point, not before, that we actually have to look at each of our options for $\eta$ one by one.


9.2 Sums and decay for the Gaussian

It is now time to derive our bounds for the Gaussian smoothing. As we were saying, there is really only one thing left to do, namely, an estimate for the sum $\sum_\rho |F_\delta(\rho)|$ over all zeros $\rho$ with $|\Im(\rho)| > T_0$.

Lemma 9.2.1. Let $\eta_\heartsuit(t) = e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(s) = 1/2$. Assume that $T_0 \ge 50$.

Write $F_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| \le \log\frac{qT_0}{2\pi}\cdot\left(3.53\, e^{-0.1598\, T_0} + 22.5\,\frac{\delta^2}{T_0}\, e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right). \]

Here we have preferred to give a bound with a simple form. It is probably feasible to derive from Theorem 8.0.1 a bound essentially proportional to $e^{-E(\rho)T_0}$, where $\rho = T_0/(\pi\delta)^2$ and $E(\rho)$ is as in (8.2). (As we discussed in §8.5, the factor $e^{-E(\rho)T_0}$ behaves as $e^{-(\pi/4)T_0}$ for $\rho$ large and as $e^{-0.125\,(T_0/(\pi\delta))^2}$ for $\rho$ small.)

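Proof. First of all,
\[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| = \sum_{\rho : \Im(\rho) > T_0} \left(|F_\delta(\rho)| + |F_\delta(1-\rho)|\right), \]
by the functional equation (which implies that non-trivial zeros come in pairs $\rho$, $1-\rho$). Hence, by a somewhat brutish application of Cor. 8.0.2,
\[ \sum_{\rho : |\Im(\rho)| > T_0} |F_\delta(\rho)| \le \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)), \tag{9.33} \]
where
\[ f(\tau) = 3.001\, e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2} + 3.286\, e^{-0.1598|\tau|}. \tag{9.34} \]
Obviously, $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

We now apply Lemma 9.1.3. We obtain that
\[ \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.35} \]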

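We just need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$,
\[ \int_y^\infty \left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \int_y^\infty \left(\log t - \frac{1}{ct}\right)e^{-ct}\,dt + \left(\frac{1}{c} + c_1\right)\int_y^\infty \frac{e^{-ct}}{t}\,dt = \frac{(\log y)\,e^{-cy}}{c} + \left(\frac{1}{c} + c_1\right)E_1(cy), \]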


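where $E_1(x) = \int_x^\infty e^{-t}\,dt/t$. Clearly, $E_1(x) \le \int_x^\infty e^{-t}\,dt/x = e^{-x}/x$. Hence
\[ \int_y^\infty \left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \left(\log y + \left(\frac{1}{c} + c_1\right)\frac{1}{y}\right)\frac{e^{-cy}}{c}. \]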

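We conclude that
\[ \int_{T_0}^\infty e^{-0.1598t}\left(\frac{1}{2\pi}\log\frac{qt}{2\pi} + \frac{1}{4t}\right)dt \le \frac{1}{2\pi}\int_{T_0}^\infty \left(\log t + \frac{\pi/2}{t}\right)e^{-ct}\,dt + \frac{\log\frac{q}{2\pi}}{2\pi}\int_{T_0}^\infty e^{-ct}\,dt = \frac{1}{2\pi c}\left(\log T_0 + \log\frac{q}{2\pi} + \left(\frac{1}{c} + \frac{\pi}{2}\right)\frac{1}{T_0}\right)e^{-cT_0} \tag{9.36} \]
with $c = 0.1598$. Since $T_0 \ge 50$ and $q \ge 1$, this is at most
\[ 1.072\,\log\frac{qT_0}{2\pi}\, e^{-cT_0}. \tag{9.37} \]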
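The passage from (9.36) to (9.37) is a short numerical claim that is easy to re-check. The sketch below is only an illustration under stated assumptions: the function names `bound_936` and `ratio` and the sample grid of $(T_0, q)$ values are hypothetical, and the code checks the claim on that grid rather than proving it for all $T_0 \ge 50$, $q \ge 1$.

```python
import math

C = 0.1598  # decay rate of the exponential term in (9.34)

def bound_936(T0, q):
    """Right side of (9.36), divided by exp(-C*T0)."""
    return (math.log(T0) + math.log(q / (2 * math.pi))
            + (1 / C + math.pi / 2) / T0) / (2 * math.pi * C)

def ratio(T0, q):
    """Ratio of (9.36) to log(q*T0/(2*pi)) * exp(-C*T0)."""
    return bound_936(T0, q) / math.log(q * T0 / (2 * math.pi))

# Hypothetical sample grid; the ratio should be largest at T0 = 50, q = 1.
grid = [(T0, q) for T0 in (50, 60, 100, 1000) for q in (1, 2, 10, 10**6)]
worst = max(ratio(T0, q) for T0, q in grid)
print(worst)
```

On this grid the largest ratio occurs at the corner $T_0 = 50$, $q = 1$, which is consistent with the constant 1.072 being determined by that extreme case.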

Now let us deal with the Gaussian term. (It appears only if $T_0 < (3/2)(\pi\delta)^2$, as otherwise $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$.) For any $y \ge e$, $c \ge 0$,
\[ \int_y^\infty e^{-ct^2}\,dt = \frac{1}{\sqrt{c}}\int_{\sqrt{c}\,y}^\infty e^{-t^2}\,dt \le \frac{1}{cy}\int_{\sqrt{c}\,y}^\infty t\,e^{-t^2}\,dt \le \frac{e^{-cy^2}}{2cy}, \tag{9.38} \]
\[ \int_y^\infty \frac{e^{-ct^2}}{t}\,dt = \int_{cy^2}^\infty \frac{e^{-t}}{2t}\,dt = \frac{E_1(cy^2)}{2} \le \frac{e^{-cy^2}}{2cy^2}, \tag{9.39} \]
\[ \int_y^\infty (\log t)\,e^{-ct^2}\,dt \le \int_y^\infty \left(\log t + \frac{\log t - 1}{2ct^2}\right)e^{-ct^2}\,dt = \frac{\log y}{2cy}\,e^{-cy^2}. \tag{9.40} \]

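Hence
\[ \int_{T_0}^\infty e^{-0.1065\left(\frac{T}{\pi\delta}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT = \int_{T_0/(\pi|\delta|)}^\infty e^{-0.1065t^2}\left(\frac{|\delta|}{2}\log\frac{q|\delta|t}{2} + \frac{1}{4t}\right)dt \]
\[ \le \left(\frac{\frac{|\delta|}{2}\log\frac{T_0}{\pi|\delta|}}{2c'\,\frac{T_0}{\pi|\delta|}} + \frac{\frac{|\delta|}{2}\log\frac{q|\delta|}{2}}{2c'\,\frac{T_0}{\pi|\delta|}} + \frac{1}{8c'\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} \tag{9.41} \]
with $c' = 0.1065$. Since $T_0 \ge 50$ and $q \ge 1$,
\[ \frac{2\pi}{8T_0} \le \frac{\pi}{200} \le 0.0152\cdot\frac{1}{2}\log\frac{qT_0}{2\pi}. \]
Thus, the last line of (9.41) is less than
\[ 1.0152\,\frac{\frac{|\delta|}{2}\log\frac{qT_0}{2\pi}}{2c'\,\frac{T_0}{\pi|\delta|}}\, e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} = 7.487\,\frac{\delta^2}{T_0}\cdot\log\frac{qT_0}{2\pi}\cdot e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}. \tag{9.42} \]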


Again by $T_0 \ge 4\pi^2|\delta|$, we see that $1.0057\,\pi|\delta|/(4cT_0) \le 1.0057/(16c\pi) \le 0.18787$.

To obtain our final bound, we simply sum (9.37) and (9.42), after multiplying them by the constants 3.286 and 3.001 in (9.34). We conclude that the integral in (9.35) is at most
\[ \left(3.53\, e^{-0.1598\,T_0} + 22.5\,\frac{\delta^2}{T_0}\, e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\log\frac{qT_0}{2\pi}. \]

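We need to record a few norms related to the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$ before we proceed. Recall we are working with the one-sided Gaussian, i.e., we set $\eta_\heartsuit(t) = 0$ for $t < 0$. Symbolic integration then gives
\[ |\eta_\heartsuit|_2^2 = \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}, \qquad |\eta_\heartsuit'|_2^2 = \int_0^\infty \left(t e^{-t^2/2}\right)^2 dt = \frac{\sqrt{\pi}}{4}, \]
\[ |\eta_\heartsuit\cdot\log|_2^2 = \int_0^\infty e^{-t^2}(\log t)^2\,dt = \frac{\sqrt{\pi}}{16}\left(\pi^2 + 2\gamma^2 + 8\gamma\log 2 + 8(\log 2)^2\right) \le 1.94753, \tag{9.43} \]
\[ \left|\eta_\heartsuit(t)/\sqrt{t}\right|_1 = \int_0^\infty \frac{e^{-t^2/2}}{\sqrt{t}}\,dt = \frac{\Gamma(1/4)}{2^{3/4}} \le 2.15581, \]
\[ \left|\eta_\heartsuit'(t)/\sqrt{t}\right|_1 = \left|\eta_\heartsuit(t)\sqrt{t}\right|_1 = \int_0^\infty e^{-t^2/2}\sqrt{t}\,dt = \frac{\Gamma(3/4)}{2^{1/4}} \le 1.03045, \]
\[ \left|\eta_\heartsuit'(t)\,t^{1/2}\right|_1 = \left|\eta_\heartsuit(t)\,t^{3/2}\right|_1 = \int_0^\infty e^{-t^2/2}\, t^{3/2}\,dt = 1.07791. \tag{9.44} \]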
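The norms in (9.43) and (9.44), and the way they feed into the coefficients of (9.29) for the Gaussian, can be re-checked in floating point. This is a rough sketch rather than a rigorous computation: the variable names are illustrative, and the closed form $2^{1/4}\Gamma(5/4)$ used for the last integral is an assumption obtained from the substitution $u = t^2/2$, not a formula stated in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
SQRT_PI = math.sqrt(math.pi)
LOG2 = math.log(2)

# Squared L2 norms from (9.43).
norm2_sq = SQRT_PI / 2
log_norm2_sq = SQRT_PI / 16 * (math.pi**2 + 2 * EULER_GAMMA**2
                               + 8 * EULER_GAMMA * LOG2 + 8 * LOG2**2)

# L1 norms from (9.44).
l1_inv_sqrt = math.gamma(0.25) / 2**0.75   # |eta(t)/sqrt(t)|_1
l1_sqrt = math.gamma(0.75) / 2**0.25       # |eta(t)*sqrt(t)|_1
l1_t32 = 2**0.25 * math.gamma(1.25)        # |eta(t)*t^(3/2)|_1 (assumed closed form)

# Coefficients of (9.29) specialised to the Gaussian.
n2, nlog = math.sqrt(norm2_sq), math.sqrt(log_norm2_sq)
coef_sqrt_log = n2 + nlog
coef_sqrt = 17.21 * nlog - math.log(2 * math.pi * math.sqrt(math.e)) * n2
coef_log_q = 1.32 * l1_inv_sqrt
coef_const = 34.5 * l1_inv_sqrt
print(coef_sqrt_log, coef_sqrt, coef_log_q, coef_const)
```

The four printed values are the quantities that the text rounds up to 2.337, 21.817, 2.85 and 74.38.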

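We can now state what is really our main result for the Gaussian smoothing. (The version in §7.1 will, as we shall later see, follow from this, given numerical inputs.)

Proposition 9.2.2. Let $\eta(t) = e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge 50$.

Then
\[ \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta\left(\frac{n}{x}\right) = \begin{cases} \widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1, \\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1, \end{cases} \tag{9.45} \]
where
\[ \mathrm{err}_{\eta,\chi}(\delta,x) = \log\frac{qT_0}{2\pi}\cdot\left(3.53\, e^{-0.1598\,T_0} + 22.5\,\frac{\delta^2}{T_0}\, e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right) + \left(2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38\right)x^{-1/2} + (3\log q + 14|\delta| + 17)\,x^{-1} + (\log q + 6)\cdot(1 + 5|\delta|)\cdot x^{-3/2}. \]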


Proof. Let $F_\delta(s)$ be the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. By Lemma 9.1.4 (with $G_\delta = F_\delta$) and Lemma 9.2.1,
\[ \left|\sum_{\rho \text{ non-trivial}} F_\delta(\rho)x^\rho\right| \]
is at most (9.29) (with $\eta = \eta_\heartsuit$) times $\sqrt{x}$, plus
\[ \log\frac{qT_0}{2\pi}\cdot\left(3.53\, e^{-0.1598\,T_0} + 22.5\,\frac{|\delta|^2}{T_0}\, e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\cdot x. \]
By the norm computations in (9.43) and (9.44), we see that (9.29) is at most
\[ 2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38. \]
Let us now apply Lemma 9.1.1. We saw that the value of $R$ in Lemma 9.1.1 is bounded by (9.23). We know that $\eta_\heartsuit(0) = 1$. Again by (9.43) and (9.44), the quantity $c_0$ defined in (9.3) is at most $1.4056 + 13.3466|\delta|$. Hence
\[ |R| \le 3\log q + 13.347|\delta| + 16.695. \]
Lastly,
\[ |\eta_\heartsuit'|_2 + 2\pi|\delta||\eta_\heartsuit|_2 \le 0.942 + 4.183|\delta| \le 1 + 5|\delta|. \]
Clearly,
\[ (6.01 - 6)\cdot(1 + 5|\delta|) + 13.347|\delta| + 16.695 < 14|\delta| + 17, \]
and so we are done.

9.3 The case of $\eta_*(t)$

We will now work with a weight based on the Gaussian:
\[ \eta(t) = \begin{cases} t^2 e^{-t^2/2} & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases} \tag{9.46} \]
The fact that this vanishes at $t = 0$ actually makes it easier to work with at several levels.

Its Mellin transform is just a shift of that of the Gaussian. Write
\[ F_\delta(s) = \left(M\left(e^{-t^2/2}e(\delta t)\right)\right)(s), \qquad G_\delta(s) = \left(M(\eta(t)e(\delta t))\right)(s). \tag{9.47} \]
Then, by the definition of the Mellin transform,
\[ G_\delta(s) = F_\delta(s + 2). \]
We start by bounding the contribution of zeros with large imaginary part, just as before.


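Lemma 9.3.1. Let $\eta(t) = t^2 e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(s) = 1/2$. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\, e^{-0.1598\,T_0} + 1.578\, e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right). \]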

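Proof. We start by writing
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| = \sum_{\rho : \Im(\rho) > T_0} \left(|F_\delta(\rho + 2)| + |F_\delta((1-\rho) + 2)|\right), \]
where we are using $G_\delta(\rho) = F_\delta(\rho + 2)$ and the fact that non-trivial zeros come in pairs $\rho$, $1-\rho$.

By Cor. 8.0.2 with $k = 2$,
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)), \]
where
\[ f(\tau) = \begin{cases} \kappa_{2,1}|\tau|\,e^{-0.1598|\tau|} + \dfrac{\kappa_{2,0}}{4}\left(\dfrac{|\tau|}{\pi\delta}\right)^2 e^{-0.1065\left(\frac{|\tau|}{\pi\delta}\right)^2} & \text{if } |\tau| < \frac{3}{2}(\pi\delta)^2, \\ \kappa_{2,1}|\tau|\,e^{-0.1598|\tau|} & \text{if } |\tau| \ge \frac{3}{2}(\pi\delta)^2, \end{cases} \tag{9.48} \]
where $\kappa_{2,0} = 7.96$ and $\kappa_{2,1} = 5.13$. We are including the term $|\tau|e^{-0.1598|\tau|}$ in both cases in part because we cannot be bothered to take it out (just as we could not be bothered in the proof of Lem. 9.2.1) and in part to ensure that $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

We can now apply Lemma 9.1.3. We obtain, again,
\[ \sum_{\rho : \Im(\rho) > T_0} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.49} \]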

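Just as before, we will need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$ such that $\log y > 1/(cy)$,
\[ \int_y^\infty t\,e^{-ct}\,dt = \left(\frac{y}{c} + \frac{1}{c^2}\right)e^{-cy}, \]
\[ \int_y^\infty \left(t\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \int_y^\infty \left(\left(t + \frac{a-1}{c}\right)\log t - \frac{1}{c} - \frac{a}{c^2 t}\right)e^{-ct}\,dt = \left(\frac{y}{c} + \frac{a}{c^2}\right)e^{-cy}\log y, \tag{9.50} \]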


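where
\[ a = \frac{\frac{\log y}{c} + \frac{1}{c} + \frac{c_1}{y}}{\frac{\log y}{c} - \frac{1}{c^2 y}}. \]
Setting $c = 0.1598$, $c_1 = \pi/2$, $y = T_0 \ge 50$, we obtain that
\[ \int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)T e^{-0.1598T}\,dT \le \frac{1}{2\pi}\left(\log\frac{q}{2\pi}\cdot\left(\frac{T_0}{c} + \frac{1}{c^2}\right) + \left(\frac{T_0}{c} + \frac{a}{c^2}\right)\log T_0\right)e^{-0.1598T_0} \tag{9.51} \]
and
\[ a = \frac{\frac{\log T_0}{0.1598} + \frac{1}{0.1598} + \frac{\pi/2}{T_0}}{\frac{\log T_0}{0.1598} - \frac{1}{0.1598^2\, T_0}} \le 1.299. \]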

It is easy to see that the ratio of the expression within parentheses on the right side of (9.51) to $T_0\log(qT_0/2\pi)$ increases as $q$ decreases and, if we hold $q$ fixed, decreases as $T_0 \ge 2\pi$ increases; thus, it is maximal for $q = 1$ and $T_0 = 50$. Multiplying (9.51) by $\kappa_{2,1} = 5.13$ and simplifying by the assumption $T_0 \ge 50$, we obtain that
\[ \int_{T_0}^\infty 5.13\,T e^{-0.1598T}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \le 6.11\,T_0\log\frac{qT_0}{2\pi}\cdot e^{-0.1598T_0}. \tag{9.52} \]
Now let us examine the Gaussian term. First of all: when does it arise? If $T_0 \ge (3/2)(\pi\delta)^2$, then $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$, and so (9.48) does not give us a Gaussian term. Recall that $T_0 \ge 10\pi|\delta|$, which means that $|\delta| \le 20/(3\pi)$ implies that $T_0 \ge (3/2)(\pi\delta)^2$. We can thus assume from now on that $|\delta| > 20/(3\pi)$, since otherwise there is no Gaussian term to treat.
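The constants 1.299 and 6.11 appearing around (9.51) and (9.52) come from evaluating explicit expressions at the extreme case $T_0 = 50$, $q = 1$. The following sketch recomputes them in floating point; the function names `a_param` and `ratio_951` are hypothetical, and the code only evaluates the extreme case instead of proving monotonicity.

```python
import math

C = 0.1598          # decay rate c in (9.50)-(9.51)
C1 = math.pi / 2    # c_1 as chosen in the text
KAPPA_21 = 5.13     # kappa_{2,1} from (9.48)

def a_param(y, c=C, c1=C1):
    """The quantity a defined after (9.50)."""
    return (math.log(y) / c + 1 / c + c1 / y) / (math.log(y) / c - 1 / (c * c * y))

def ratio_951(T0, q, a):
    """Right side of (9.51) without the exponential factor,
    divided by T0 * log(q*T0/(2*pi))."""
    inner = (math.log(q / (2 * math.pi)) * (T0 / C + 1 / C**2)
             + (T0 / C + a / C**2) * math.log(T0))
    return inner / (2 * math.pi) / (T0 * math.log(q * T0 / (2 * math.pi)))

a50 = a_param(50.0)
const_952 = KAPPA_21 * ratio_951(50.0, 1.0, 1.299)
print(a50, const_952)
```

Both printed values should sit just below the rounded constants 1.299 and 6.11 quoted in the text.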

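For any $y \ge 1$, $c, c_1 > 0$,
\[ \int_y^\infty t^2 e^{-ct^2}\,dt < \int_y^\infty \left(t^2 + \frac{1}{4c^2t^2}\right)e^{-ct^2}\,dt = \left(\frac{y}{2c} + \frac{1}{4c^2y}\right)\cdot e^{-cy^2}, \]
\[ \int_y^\infty (t^2\log t + c_1 t)\cdot e^{-ct^2}\,dt \le \int_y^\infty \left(t^2\log t + \frac{at\log et}{2c} - \frac{\log et}{2c} - \frac{a}{4c^2t}\right)e^{-ct^2}\,dt = \frac{(2cy + a)\log y + a}{4c^2}\cdot e^{-cy^2}, \]
where
\[ a = \frac{c_1 y + \frac{\log ey}{2c}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{c_1 y + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{2c_1c}{\log ey} + \frac{\frac{c_1}{2cy\log ey} + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}. \]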

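(Note that $a$ decreases as $y \ge y_0$ increases, provided that $\log ey_0 > 1/(2cy_0^2)$.) Setting $c = 0.1065$, $c_1 = 1/(2|\delta|) \le 3/16$ and $y = T_0/(\pi|\delta|) \ge 4\pi$, we obtain
\[ \int_{T_0/(\pi|\delta|)}^\infty \left(\frac{1}{2\pi}\log\frac{q|\delta|t}{2} + \frac{1}{4\pi|\delta|t}\right)t^2 e^{-0.1065t^2}\,dt \le \left(\frac{1}{2\pi}\log\frac{q|\delta|}{2}\right)\cdot\left(\frac{T_0}{2\pi c|\delta|} + \frac{1}{4c^2\cdot 10}\right)\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2} + \frac{1}{2\pi}\cdot\frac{\left(2c\frac{T_0}{\pi|\delta|} + a\right)\log\frac{T_0}{\pi|\delta|} + a}{4c^2}\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2} \]
and
\[ a \le \frac{1}{10} + \frac{\left(2\cdot\frac{20}{3\pi}\right)^{-1}\cdot 10 + \frac{1}{4\cdot 0.1065^2\cdot 10^2}}{\frac{10\log 10e}{2\cdot 0.1065} - \frac{1}{4\cdot 0.1065^2\cdot 10}} \le 0.117. \]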

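Multiplying by $(\kappa_{2,0}/4)\pi|\delta|$, we get that
\[ \int_{T_0}^\infty \frac{\kappa_{2,0}}{4}\left(\frac{T}{\pi|\delta|}\right)^2 e^{-0.1065\left(\frac{T}{\pi|\delta|}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \tag{9.53} \]
is at most $e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}$ times
\[ \left((1.487T_0 + 2.194|\delta|)\cdot\log\frac{q|\delta|}{2} + 1.487T_0\log\frac{T_0}{\pi|\delta|} + 2.566|\delta|\log\frac{eT_0}{\pi|\delta|}\right) \le \left(1.487 + 2.566\cdot\frac{1 + \frac{1}{\log(T_0/(\pi|\delta|))}}{T_0/|\delta|}\right)T_0\log\frac{qT_0}{2\pi} \le 1.578\cdot T_0\log\frac{qT_0}{2\pi}, \tag{9.54} \]
where we are using several times the assumption that $T_0 \ge 4\pi^2|\delta|$ (and, on one occasion, the fact that $|\delta| > 20/(3\pi) > 2$).

We sum (9.52) and the estimate for (9.53) we have just obtained to reach our conclusion.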

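Again, we record some norms obtained by symbolic integration: for $\eta$ as in (9.46),
\[ |\eta|_2^2 = \frac{3}{8}\sqrt{\pi}, \qquad |\eta'|_2^2 = \frac{7}{16}\sqrt{\pi}, \]
\[ |\eta\cdot\log|_2^2 = \frac{\sqrt{\pi}}{64}\left(8(3\gamma - 8)\log 2 + 3\pi^2 + 6\gamma^2 + 24(\log 2)^2 + 16 - 32\gamma\right) \le 0.16364, \]
\[ \left|\eta(t)/\sqrt{t}\right|_1 = \frac{2^{1/4}\,\Gamma(1/4)}{4} \le 1.07791, \qquad \left|\eta(t)\sqrt{t}\right|_1 = \frac{3}{4}\,2^{3/4}\,\Gamma(3/4) \le 1.54568, \]
\[ \left|\eta'(t)/\sqrt{t}\right|_1 = \int_0^{\sqrt{2}} t^{3/2}e^{-t^2/2}\,dt - \int_{\sqrt{2}}^\infty t^{3/2}e^{-t^2/2}\,dt \le 1.48469, \qquad \left|\eta'(t)\sqrt{t}\right|_1 \le 1.72169. \tag{9.55} \]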
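Most of the norms in (9.55) reduce to Gamma-function values, so they can be re-checked with a few lines of floating-point arithmetic. The sketch below is illustrative only: the helper `gauss_moment` and the variable names are assumptions, the two norms of $\eta'$ weighted by $t^{\mp 1/2}$ are deliberately left out, and the last four quantities are the coefficients that (9.29) would produce for this $\eta$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
SQRT_PI = math.sqrt(math.pi)
LOG2 = math.log(2)

def gauss_moment(p):
    """Integral of t^p * exp(-t^2) over [0, inf), via the Gamma function."""
    return math.gamma((p + 1) / 2) / 2

# eta(t) = t^2 exp(-t^2/2), so eta'(t) = (2t - t^3) exp(-t^2/2).
norm2_sq = gauss_moment(4)
dnorm2_sq = 4 * gauss_moment(2) - 4 * gauss_moment(4) + gauss_moment(6)
log_norm2_sq = SQRT_PI / 64 * (8 * (3 * EULER_GAMMA - 8) * LOG2 + 3 * math.pi**2
                               + 6 * EULER_GAMMA**2 + 24 * LOG2**2
                               + 16 - 32 * EULER_GAMMA)
l1_inv_sqrt = 2**0.25 * math.gamma(0.25) / 4   # |eta(t)/sqrt(t)|_1
l1_sqrt = 0.75 * 2**0.75 * math.gamma(0.75)    # |eta(t)*sqrt(t)|_1

# Coefficients of (9.29) specialised to this eta.
n2, nlog = math.sqrt(norm2_sq), math.sqrt(log_norm2_sq)
coef_sqrt_log = n2 + nlog
coef_sqrt = 17.21 * nlog - math.log(2 * math.pi * math.sqrt(math.e)) * n2
coef_log_q = 1.32 * l1_inv_sqrt
coef_const = 34.5 * l1_inv_sqrt
print(coef_sqrt_log, coef_sqrt, coef_log_q, coef_const)
```

The printed coefficients can be compared with the constants 1.22, 5.056, 1.423 and 37.19 that appear in the statement of Proposition 9.3.2.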


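Proposition 9.3.2. Let $\eta(t) = t^2 e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta(n/x) = \begin{cases} \widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1, \\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1, \end{cases} \tag{9.56} \]
where
\[ \mathrm{err}_{\eta,\chi}(\delta,x) = T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598\,T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right) + \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.19\right)\cdot x^{-1/2} + (3 + 11|\delta|)\,x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2}. \tag{9.57} \]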

Proof. We proceed as in the proof of Prop. 9.2.2. The contribution of Lemma 9.3.1 is
\[ T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598\,T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\cdot x, \]
whereas the contribution of Lemma 9.1.4 is at most
\[ \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.188\right)\sqrt{x}. \]
Let us now apply Lemma 9.1.1. Since $\eta(0) = 0$, we have
\[ R = O^*(c_0) = O^*(2.138 + 10.99|\delta|). \]
Lastly,
\[ |\eta'|_2 + 2\pi|\delta||\eta|_2 \le 0.881 + 5.123|\delta|. \]

Now that we have Prop. 9.3.2, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of $\eta$ with something else. In general, for $\varphi_1, \varphi_2 : [0,\infty) \to \mathbb{C}$, if we know how to bound sums of the form
\[ S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x), \tag{9.58} \]
we can bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$, simply by changing the order of summation and integration:
\[ S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n)\cdot(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right) = \int_0^\infty \sum_n f(n)\,\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\,\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\,\varphi_2(w)\,\frac{dw}{w}. \tag{9.59} \]
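The interchange of summation and integration in (9.59) can be illustrated with a toy computation. The sketch below is only an illustration: it takes $\varphi_1(t) = t^2e^{-t^2/2}$ and $\varphi_2 = \eta_1 *_M \eta_1$ with $\eta_1 = 2\cdot 1_{[1/2,1]}$, replaces the arithmetic weight by a hypothetical toy weight `f(n) = log n`, truncates the sum at a hypothetical cutoff `N`, and approximates the integral by Simpson's rule. Since the quadrature is a finite linear combination, the two sides agree up to rounding.

```python
import math

def phi1(t):
    """phi_1(t) = t^2 exp(-t^2/2) for t >= 0."""
    return t * t * math.exp(-t * t / 2) if t >= 0 else 0.0

def eta2(w):
    """eta_2 = eta_1 *_M eta_1, with eta_1 = 2 * 1_[1/2, 1]."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1.0:
        return 4 * math.log(1 / w)
    return 0.0

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def integrate_dw_over_w(F):
    """Approximate the integral of F(w) * eta2(w) dw/w over [1/4, 1],
    splitting at the kink of eta2 at w = 1/2."""
    g = lambda w: F(w) * eta2(w) / w
    return simpson(g, 0.25, 0.5) + simpson(g, 0.5, 1.0)

def S(phi, f, x, N):
    """Truncated version of S_{f,phi}(x) = sum_n f(n) phi(n/x)."""
    return sum(f(n) * phi(n / x) for n in range(1, N + 1))

f = lambda n: math.log(n)   # toy weight, not the von Mangoldt function
x, N = 10.0, 120            # hypothetical scale and truncation point

# Left side of (9.59): sum over n of f(n) * (phi1 *_M eta2)(n/x).
lhs = sum(f(n) * integrate_dw_over_w(lambda w, n=n: phi1(n / (w * x)))
          for n in range(1, N + 1))
# Right side of (9.59): integral of S_{f,phi1}(w x) eta2(w) dw/w.
rhs = integrate_dw_over_w(lambda w: S(phi1, f, w * x, N))
print(lhs, rhs)
```

The point of the design is that any bound available for `S(phi1, f, w * x, N)` uniformly in `w` on the support of `eta2` transfers directly to the convolved sum.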


This is particularly nice if $\varphi_2(t)$ vanishes in a neighbourhood of the origin, since then the argument $wx$ of $S_{f,\varphi_1}(wx)$ is always large.

We will use $\varphi_1(t) = t^2e^{-t^2/2}$, $\varphi_2(t) = \eta_1 *_M \eta_1$, where $\eta_1$ is 2 times the characteristic function of the interval $[1/2, 1]$. The motivation for the choice of $\varphi_1$ and $\varphi_2$ is clear: we have just got bounds based on $\varphi_1(t)$ in the major arcs, and we obtained minor-arc bounds for the weight $\varphi_2(t)$ in Part I.

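Corollary 9.3.3. Let $\eta(t) = t^2e^{-t^2/2}$, $\eta_1 = 2\cdot I_{[1/2,1]}$, $\eta_2 = \eta_1 *_M \eta_1$. Let $\eta_* = \eta_2 *_M \eta$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta_*(n/x) = \begin{cases} \widehat{\eta_*}(-\delta)x + O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1, \\ O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1, \end{cases} \tag{9.60} \]
where
\[ \mathrm{err}_{\eta_*,\chi}(\delta,x) = T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598\,T_0} + 0.0102\cdot e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right) + \left(1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17\right)\cdot x^{-1/2} + (6 + 22|\delta|)\,x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2}. \tag{9.61} \]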

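Proof. The left side of (9.60) equals
\[ \int_0^\infty \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta n}{x}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w} = \int_{1/4}^1 \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta w n}{wx}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\,\frac{dw}{w}, \]
since $\eta_2$ is supported on $[1/4, 1]$. By Prop. 9.3.2, the main term (if $q = 1$) contributes
\[ \int_{1/4}^1 \widehat{\eta}(-\delta w)\,xw\cdot\eta_2(w)\,\frac{dw}{w} = x\int_0^\infty \widehat{\eta}(-\delta w)\,\eta_2(w)\,dw = x\int_0^\infty\int_{-\infty}^\infty \eta(t)\,e(\delta wt)\,dt\cdot\eta_2(w)\,dw \]
\[ = x\int_0^\infty\int_{-\infty}^\infty \eta\left(\frac{r}{w}\right)e(\delta r)\,\frac{dr}{w}\,\eta_2(w)\,dw = x\int_{-\infty}^\infty \left(\int_0^\infty \eta\left(\frac{r}{w}\right)\eta_2(w)\,\frac{dw}{w}\right)e(\delta r)\,dr = \widehat{\eta_*}(-\delta)\cdot x. \]
The error term is
\[ \int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\cdot wx\cdot\eta_2(w)\,\frac{dw}{w} = x\cdot\int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\,\eta_2(w)\,dw. \tag{9.62} \]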


Using the fact that
\[ \eta_2(w) = \begin{cases} 4\log 4w & \text{if } w \in [1/4, 1/2], \\ 4\log w^{-1} & \text{if } w \in [1/2, 1], \\ 0 & \text{otherwise,} \end{cases} \]
we can easily check that
\[ \int_0^\infty \eta_2(w)\,dw = 1, \qquad \int_0^\infty w^{-1/2}\eta_2(w)\,dw \le 1.37259, \]
\[ \int_0^\infty w^{-1}\eta_2(w)\,dw = 4(\log 2)^2 \le 1.92182, \qquad \int_0^\infty w^{-3/2}\eta_2(w)\,dw \le 2.74517, \]
and, by rigorous numerical integration from 1/4 to 1/2 and from 1/2 to 1 (using, e.g., VNODE-LP [Ned06]),
\[ \int_0^\infty e^{-0.1065\cdot 10^2\left(\frac{1}{w^2} - 1\right)}\eta_2(w)\,dw \le 0.006446. \]
We then see that (9.57) and (9.62) imply (9.61).
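The moments of $\eta_2$ listed above can be cross-checked in two ways: by direct quadrature, and through the multiplicativity of the Mellin transform, which gives $\int_0^\infty w^{p}\eta_2(w)\,dw = (M\eta_1(p+1))^2$ with $M\eta_1(s) = 2(1 - 2^{-s})/s$. The sketch below uses non-rigorous floating-point Simpson integration (unlike the interval arithmetic of VNODE-LP mentioned in the text); the helper names `moment`, `mellin_eta1` and `tail` are illustrative assumptions.

```python
import math

def eta2(w):
    """eta_2 = eta_1 *_M eta_1, with eta_1 = 2 * 1_[1/2, 1]."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1.0:
        return 4 * math.log(1 / w)
    return 0.0

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def integrate(g):
    """Integrate over [1/4, 1], splitting at the kink of eta2 at w = 1/2."""
    return simpson(g, 0.25, 0.5) + simpson(g, 0.5, 1.0)

def moment(p):
    """Integral of w^p * eta2(w) dw."""
    return integrate(lambda w: w**p * eta2(w))

def mellin_eta1(s):
    """Mellin transform of eta_1 at s: 2*(1 - 2^(-s))/s, with limit 2*log 2 at s = 0."""
    return 2 * math.log(2) if s == 0 else 2 * (1 - 2.0**(-s)) / s

# Gaussian-weighted integral from the text, with 0.1065 * 10^2 = 10.65.
tail = integrate(lambda w: math.exp(-10.65 * (1 / w**2 - 1)) * eta2(w))

for p in (0, -0.5, -1, -1.5):
    print(p, moment(p), mellin_eta1(p + 1) ** 2)
print(tail)
```

Each printed pair should agree closely, and the final value can be compared with the rigorous bound 0.006446 quoted in the text.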

9.4 The case of $\eta_+(t)$

We will work with
\[ \eta(t) = \eta_+(t) = h_H(t)\cdot t\,\eta_\heartsuit(t) = h_H(t)\cdot t e^{-t^2/2}, \tag{9.63} \]
where $h_H$ is as in (7.6). We recall that $h_H$ is a band-limited approximation to the function $h$ defined in (7.5); to be more precise, $Mh_H(it)$ is the truncation of $Mh(it)$ to the interval $[-H, H]$.

We are actually defining $h$, $h_H$ and $\eta$ in a slightly different way from what was done in the first version of [Hela]. The difference is instructive. There, $\eta(t)$ was defined as $h_H(t)e^{-t^2/2}$, and $h_H$ was a band-limited approximation to a function $h$ defined as in (7.5), but with $t^3(2-t)^3$ instead of $t^2(2-t)^3$. The reason for our new definitions is that now the truncation of $Mh(it)$ will not break the holomorphy of $M\eta$, and so we will be able to use the general results we proved in §9.1.

In essence, $Mh$ will still be holomorphic because the Mellin transform of $t\eta_\heartsuit(t)$ is holomorphic in the domain we care about, unlike the Mellin transform of $\eta_\heartsuit(t)$, which does have a pole at $s = 0$.

As usual, we start by bounding the contribution of zeros with large imaginary part. The procedure is much as before: since $\eta_+(t) = \eta_H(t)\eta_\heartsuit(t)$, the Mellin transform $M\eta_+$ is a convolution of $M(te^{-t^2/2})$ and something of support in $[-H, H]i$, namely, $M\eta_H$ restricted to the imaginary axis. This means that the decay of $M\eta_+$ is (at worst) like the decay of $M(te^{-t^2/2})$, delayed by $H$.


Lemma 9.4.1. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(s) = 1/2$, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le \left(11.308\sqrt{T_0'}\,e^{-0.1598\,T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi}, \]
where $T_0' = T_0 - H$.

Proof. As usual,
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| = \sum_{\rho : \Im(\rho) > T_0} \left(|G_\delta(\rho)| + |G_\delta(1-\rho)|\right). \]
Let $F_\delta$ be as in (9.47). Then, since $\eta_+(t)e(\delta t) = h_H(t)\,te^{-t^2/2}e(\delta t)$, where $h_H$ is as in (7.6), we see by (2.9) that
\[ G_\delta(s) = \frac{1}{2\pi}\int_{-H}^H Mh(ir)\,F_\delta(s + 1 - ir)\,dr, \]
and so, since $|Mh(ir)| = |Mh(-ir)|$,
\[ |G_\delta(\rho)| + |G_\delta(1-\rho)| \le \frac{1}{2\pi}\int_{-H}^H |Mh(ir)|\left(|F_\delta(1 + \rho - ir)| + |F_\delta(2 - (\rho - ir))|\right)dr. \tag{9.64} \]
We apply Cor. 8.0.2 with $k = 1$ and $T_0 - H$ instead of $T_0$, and obtain that $|F_\delta(\rho)| + |F_\delta(1-\rho)| \le g(\tau)$, where
\[ g(\tau) = \kappa_{1,1}\sqrt{|\tau|}\,e^{-0.1598|\tau|} + \kappa_{1,0}\,\frac{|\tau|}{2\pi|\delta|}\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2}, \tag{9.65} \]
where $\kappa_{1,0} = 4.903$ and $\kappa_{1,1} = 4.017$. (As in the proof of Lemmas 9.2.1 and 9.3.1, we are putting in extra terms so as to simplify our integrals.)

From (9.64), we conclude that
\[ |G_\delta(\rho)| + |G_\delta(1-\rho)| \le f(\tau), \]
for $\rho = \sigma + i\tau$, $\tau > 0$, where
\[ f(\tau) = \frac{|Mh(ir)|_1}{2\pi}\cdot g(\tau - H) \]
is decreasing for $\tau \ge T_0$ (because $g(\tau)$ is decreasing for $\tau \ge T_0 - H$). By (A.17), $|Mh(ir)|_1 \le 16.193918$.


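We apply Lemma 9.1.3, and get that
\[ \sum_{\rho : |\Im(\rho)| > T_0} |G_\delta(\rho)| \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT = \frac{|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty g(T - H)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.66} \]
Now we just need to estimate some integrals. For any $y \ge e^2$, $c > 0$ and $\kappa, \kappa_1 \ge 0$,
\[ \int_y^\infty \sqrt{t}\,e^{-ct}\,dt \le \left(\frac{\sqrt{y}}{c} + \frac{1}{2c^2\sqrt{y}}\right)e^{-cy}, \]
\[ \int_y^\infty \left(\sqrt{t}\log(t + \kappa) + \frac{\kappa_1}{\sqrt{t}}\right)e^{-ct}\,dt \le \left(\frac{\sqrt{y}}{c} + \frac{a}{c^2\sqrt{y}}\right)\log(y + \kappa)\,e^{-cy}, \]
where
\[ a = \frac{1}{2} + \frac{1 + c\kappa_1}{\log(y + \kappa)}. \]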

The contribution of the exponential term in (9.65) to (9.66) thus equals
\[ \frac{\kappa_{1,1}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\sqrt{T - H}\cdot e^{-0.1598(T - H)}\,dT \]
\[ \le 10.3532\int_{T_0 - H}^\infty \left(\frac{1}{2\pi}\log(T + H) + \frac{\log\frac{q}{2\pi}}{2\pi} + \frac{1}{4T}\right)\sqrt{T}\,e^{-0.1598T}\,dT \]
\[ \le \frac{10.3532}{2\pi}\left(\frac{\sqrt{T_0 - H}}{0.1598} + \frac{a}{0.1598^2\sqrt{T_0 - H}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0 - H)}, \tag{9.67} \]
where $a = 1/2 + (1 + 0.1598\,\pi/2)/\log T_0$. Since $T_0 - H \ge 50$ and $T_0 \ge 50 + 25 = 75$, this is at most
\[ 11.308\sqrt{T_0 - H}\,\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0 - H)}. \]

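We now estimate a few more integrals so that we can handle the Gaussian term in (9.65). For any $y > 1$, $c > 0$, $\kappa, \kappa_1 \ge 0$,
\[ \int_y^\infty t\,e^{-ct^2}\,dt = \frac{e^{-cy^2}}{2c}, \]
\[ \int_y^\infty (t\log(t + \kappa) + \kappa_1)\,e^{-ct^2}\,dt \le \left(1 + \frac{\kappa_1 + \frac{1}{2cy}}{y\log(y + \kappa)}\right)\frac{\log(y + \kappa)\cdot e^{-cy^2}}{2c}. \]
Proceeding just as before, we see that the contribution of the Gaussian term in (9.65) to (9.66) is at most
\[ \frac{\kappa_{1,0}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\frac{T - H}{2\pi|\delta|}\cdot e^{-0.1065\left(\frac{T - H}{\pi\delta}\right)^2}\,dT \]
\[ \le 12.6368\cdot\frac{|\delta|}{4}\int_{\frac{T_0 - H}{\pi|\delta|}}^\infty \left(\log\left(T + \frac{H}{\pi|\delta|}\right) + \log\frac{q|\delta|}{2} + \frac{\pi/2}{T}\right)T e^{-0.1065T^2}\,dT \]
\[ \le 12.6368\cdot\frac{|\delta|}{8\cdot 0.1065}\left(1 + \frac{\frac{\pi}{2} + \frac{\pi|\delta|}{2\cdot 0.1065\cdot(T_0 - H)}}{\frac{T_0 - H}{\pi|\delta|}\log\frac{T_0}{\pi|\delta|}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0 - H}{\pi\delta}\right)^2}. \tag{9.68} \]
Since $(T_0 - H)/(\pi|\delta|) \ge 10$, this is at most
\[ 16.147|\delta|\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0 - H}{\pi\delta}\right)^2}. \]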

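Proposition 9.4.2. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \ge 10^3$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Then
\[ \sum_{n=1}^\infty \Lambda(n)\chi(n)\,e\left(\frac{\delta}{x}n\right)\eta_+(n/x) = \begin{cases} \widehat{\eta_+}(-\delta)x + O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1, \\ O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1, \end{cases} \tag{9.69} \]
where
\[ \mathrm{err}_{\eta_+,\chi}(\delta,x) = \left(11.308\sqrt{T_0'}\cdot e^{-0.1598\,T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi} + \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{-1/2} + (9 + 11|\delta|)\,x^{-1} + (\log q)(11 + 6|\delta|)\,x^{-3/2}, \tag{9.70} \]
where $T_0' = T_0 - H$.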

Proof. We can apply Lemmas 9.1.1 and 9.1.4 because $\eta_+(t)$, $(\log t)\eta_+(t)$ and $\eta_+'(t)$ are in $\ell_2$ (by (A.25), (A.28) and (A.32)) and $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ are in $\ell_1$ for $\sigma$ in an open interval containing $[1/2, 3/2]$ (by (A.30) and (A.33)). (Because of (9.5), the fact that $\eta_+(t)t^{-1/2}$ and $\eta_+(t)t^{1/2}$ are in $\ell_1$ implies that $\eta_+(t)\log t$ is also in $\ell_1$, as is required by Lemma 9.1.4.)

We apply Lemmas 9.1.1, 9.1.4 and 9.4.1. We bound the norms involving $\eta_+$ using the estimates in §A.3 and §A.4. Since $\eta_+(0) = 0$ (by the definition (A.3) of $\eta_+$), the term $R$ in (9.2) is at most $c_0$, where $c_0$ is as in (9.3). We bound
\[ c_0 \le \frac{2}{3}\left(2.922875\left(\sqrt{\Gamma(1/2)} + \sqrt{\Gamma(3/2)}\right) + 1.062319\left(\sqrt{\Gamma(5/2)} + \sqrt{\Gamma(7/2)}\right)\right) + \frac{4\pi}{3}|\delta|\cdot 1.062319\left(\sqrt{\Gamma(3/2)} + \sqrt{\Gamma(5/2)}\right) \le 6.536232 + 9.319578|\delta| \]
using (A.30) and (A.33). By (A.25), (A.32) and the assumption $H \ge 25$,
\[ |\eta_+|_2 \le 0.80365, \qquad |\eta_+'|_2 \le 10.845789. \]
Thus, the error terms in (9.1) total at most
\[ 6.536232 + 9.319578|\delta| + (\log q + 6.01)(10.845789 + 2\pi\cdot 0.80365|\delta|)\,x^{-1/2} \le 9 + 11|\delta| + (\log q)(11 + 6|\delta|)\,x^{-1/2}. \tag{9.71} \]
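The arithmetic behind the bound on $c_0$ and inequality (9.71) can be reproduced directly from Gamma-function values. The sketch below is a non-rigorous floating-point check; it takes the numerical inputs 2.922875, 1.062319, 0.80365 and 10.845789 from the text as given, reads the coefficient of $|\delta|$ in $c_0$ as $4\pi/3$, and evaluates (9.71) at the smallest admissible value $x = 10^3$. The helper name `sqrt_gamma` is an assumption.

```python
import math

def sqrt_gamma(z):
    """Square root of Gamma(z)."""
    return math.sqrt(math.gamma(z))

# Constant and |delta|-coefficient in the bound for c_0.
c0_const = (2 / 3) * (2.922875 * (sqrt_gamma(0.5) + sqrt_gamma(1.5))
                      + 1.062319 * (sqrt_gamma(2.5) + sqrt_gamma(3.5)))
c0_delta = (4 * math.pi / 3) * 1.062319 * (sqrt_gamma(1.5) + sqrt_gamma(2.5))

# Pieces of (9.71) at the smallest admissible x.
x = 10.0**3
norm_eta, norm_deta = 0.80365, 10.845789
const_part = 6.536232 + 6.01 * norm_deta / math.sqrt(x)
delta_part = 9.319578 + 6.01 * 2 * math.pi * norm_eta / math.sqrt(x)
print(c0_const, c0_delta, const_part, delta_part)
```

The first two values should match 6.536232 and 9.319578 to about six decimal places, and the last two should stay below 9 and 11 respectively; the $\log q$ terms need only $10.845789 \le 11$ and $2\pi\cdot 0.80365 \le 6$.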

The part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1) corresponding to zeros $\rho$ with $|\Im(\rho)| > T_0$ gets estimated by Lem. 9.4.1. By Lemma 9.1.4, the part of the sum corresponding to zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most
\[ \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{1/2}, \]
where we estimate the norms $|\eta_+|_2$, $|\eta\cdot\log|_2$ and $|\eta(t)/\sqrt{t}|_1$ by (A.25), (A.28) and (A.30).

9.5 A sum for $\eta_+(t)^2$

Using a smoothing function sometimes leads to considering sums involving the square of the smoothing function. In particular, in Part III, we will need a result involving $\eta_+^2$, something that could be slightly challenging to prove, given the way in which $\eta_+$ is defined. Fortunately, we have bounds on $|\eta_+|_\infty$ and other $\ell_\infty$-norms (see Appendix A.5). Our task will also be made easier by the fact that we do not have a phase $e(\delta n/x)$ this time. All in all, this will be yet another demonstration of the generality of the framework developed in §9.1.

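Proposition 9.5.1. Let $\eta = \eta_+$ be as in (9.63), $H \ge 25$. Let $x \ge 10^8$. Assume that all non-trivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge \max(2H + 25, 200)$. Then

$$\sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) = x\cdot\int_0^\infty \eta_+^2(t)\log xt\,dt + O^*\left(\mathrm{err}_{\ell_2,\eta_+}\right)\cdot x\log x, \tag{9.72}$$

where

$$\begin{aligned}\mathrm{err}_{\ell_2,\eta_+} &= \left(\left(0.462\frac{(\log T_1)^2}{\log x} + 0.909\log T_1\right)T_1 + 1.71\left(1 + \frac{\log T_1}{\log x}\right)H\right)e^{-\frac{\pi}{4}T_1}\\ &\quad + \left(2.445\sqrt{T_0}\log T_0 + 50.04\right)\cdot x^{-1/2}\end{aligned} \tag{9.73}$$

and $T_1 = T_0 - 2H$.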

The assumption $T_0 \ge 200$ is stronger than what we strictly need, but, as it happens, we could make much stronger assumptions still. Proposition 9.5.1 relies on a verification of zeros of the Riemann zeta function; such verifications have gone up to values of $T_0$ much higher than 200.

Proof. We will need to consider two smoothing functions, namely, $\eta_{+,0}(t) = \eta_+(t)^2$ and $\eta_{+,1}(t) = \eta_+(t)^2\log t$. Clearly,

$$\sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) = (\log x)\sum_{n=1}^\infty \Lambda(n)\eta_{+,0}(n/x) + \sum_{n=1}^\infty \Lambda(n)\eta_{+,1}(n/x).$$

Since $\eta_+(t) = h_H(t)\,t e^{-t^2/2}$,

$$\eta_{+,0}(t) = h_H^2(t)\,t^2e^{-t^2}, \qquad \eta_{+,1}(t) = h_H^2(t)(\log t)\,t^2e^{-t^2}.$$

Let $\eta_{+,2} = (\log x)\eta_{+,0} + \eta_{+,1} = \eta_+^2(t)\log xt$.

We wish to apply Lemma 9.1.1. For this, we must first check that some norms are finite. Clearly,

$$\begin{aligned}\eta_{+,2}(t) &= \eta_+^2(t)\log x + \eta_+^2(t)\log t,\\ \eta_{+,2}'(t) &= 2\eta_+(t)\eta_+'(t)\log x + 2\eta_+(t)\eta_+'(t)\log t + \eta_+^2(t)/t.\end{aligned} \tag{9.74}$$

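Thus, we see that $\eta_{+,2}(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$ and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$ (see (A.25), (A.38), (A.40)):

$$\begin{aligned}|\eta_{+,2}(t)|_2 &\le \left|\eta_+^2(t)\right|_2\log x + \left|\eta_+^2(t)\log t\right|_2\\ &\le |\eta_+|_\infty|\eta_+|_2\log x + |\eta_+(t)\log t|_\infty|\eta_+|_2.\end{aligned} \tag{9.75}$$

Similarly, $\eta_{+,2}'(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$, $\eta_+'(t)$ is in $\ell_2$ (A.32), and $\eta_+(t)$, $\eta_+(t)\log t$ and $\eta_+(t)/t$ (see (A.41)) are all in $\ell_\infty$:

$$\begin{aligned}\left|\eta_{+,2}'(t)\right|_2 &\le \left|2\eta_+(t)\eta_+'(t)\right|_2\log x + \left|2\eta_+(t)\eta_+'(t)\log t\right|_2 + \left|\eta_+^2(t)/t\right|_2\\ &\le 2|\eta_+|_\infty\left|\eta_+'\right|_2\log x + 2|\eta_+(t)\log t|_\infty\left|\eta_+'\right|_2 + |\eta_+(t)/t|_\infty|\eta_+|_2.\end{aligned} \tag{9.76}$$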

In the same way, we see that $\eta_{+,2}(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(-1,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ (A.30), and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$) and $\eta_{+,2}'(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(0,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ (A.33), and $\eta_+(t)$, $\eta_+(t)\log t$, $\eta_+(t)/t$ are all in $\ell_\infty$).

We now apply Lemma 9.1.1 with $q = 1$, $\delta = 0$. Since $\eta_{+,2}(0) = 0$, the residue term $R$ equals $c_0$, which, by (9.74), is at most $2/3$ times

$$2\left(|\eta_+|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left(\left|\eta_+'(t)/\sqrt{t}\right|_1 + \left|\eta_+'(t)\sqrt{t}\right|_1\right) + |\eta_+(t)/t|_\infty\left(\left|\eta_+(t)/\sqrt{t}\right|_1 + \left|\eta_+(t)\sqrt{t}\right|_1\right).$$

Using the bounds (A.38), (A.40), (A.41) (with the assumption $H \ge 25$), (A.30) and (A.33), we get that this means that

$$c_0 \le 18.57606\log x + 8.63264.$$

Since $q = 1$ and $\delta = 0$, we get from (9.76) (and (A.38), (A.40), (A.41), with the assumption $H \ge 25$, and also (A.25) and (A.32)) that

$$(\log q + 6.01)\cdot\left(\left|\eta_{+,2}'\right|_2 + 2\pi|\delta||\eta_{+,2}|_2\right)x^{-1/2} = 6.01\left|\eta_{+,2}'\right|_2x^{-1/2} \le (16.256\log x + 5.9325)x^{-1/2}.$$

Using the assumption $x \ge 10^8$, we obtain

$$c_0 + (18.526\log x + 7.1799)x^{-1/2} \le 19.064\log x. \tag{9.77}$$

We will now apply Lemma 9.1.4, as we may, because of the finiteness of the norms we have already checked, together with

$$\begin{aligned}|\eta_{+,2}(t)\log t|_2 &\le \left|\eta_+^2(t)\log t\right|_2\log x + \left|\eta_+^2(t)(\log t)^2\right|_2\\ &\le |\eta_+(t)\log t|_\infty\left(|\eta_+(t)|_2\log x + |\eta_+(t)\log t|_2\right)\\ &\le 0.4976\cdot(0.80365\log x + 0.82999) \le 0.3999\log x + 0.41301\end{aligned} \tag{9.78}$$

(by (A.40), (A.25) and (A.28); use the assumption $H \ge 25$). We also need the bounds

$$|\eta_{+,2}(t)|_2 \le 1.14199\log x + 0.39989 \tag{9.79}$$

(from (9.75), by the norm bounds (A.38), (A.40) and (A.25), all with $H \ge 25$) and

$$\left|\eta_{+,2}(t)/\sqrt{t}\right|_1 \le \left(|\eta_+(t)|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left|\eta_+(t)/\sqrt{t}\right|_1 \le 1.4211\log x + 0.49763 \tag{9.80}$$

(by (A.38), (A.40) (again with $H \ge 25$) and (A.30)).

Applying Lemma 9.1.4, we obtain that the sum $\sum_\rho |G_0(\rho)|x^\rho$ (where $G_0(\rho) = M\eta_{+,2}(\rho)$) over all non-trivial zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most $x^{1/2}$ times

$$(1.54189\log x + 0.8129)\sqrt{T_0}\log T_0 + (4.21245\log x + 6.17301)\sqrt{T_0} + 49.1\log x + 17.2, \tag{9.81}$$

where we are bounding norms by (9.79), (9.78) and (9.80). (We are using the fact that $T_0 \ge 2\pi\sqrt{e}$ to ensure that the quantity $\sqrt{T_0}\log T_0 - (\log 2\pi\sqrt{e})\sqrt{T_0}$ being multiplied by $|\eta_{+,2}|_2$ is positive; thus, an upper bound for $|\eta_{+,2}|_2$ suffices.) By the assumptions $x \ge 10^8$, $T_0 \ge 200$, (9.81) is at most

$$\left(2.445\sqrt{T_0}\log T_0 + 50.034\right)\log x.$$

In comparison, $19.064x^{-1/2}\log x \le 0.002\log x$, since $x \ge 10^8$.

It remains to bound the sum of $M\eta_{+,2}(\rho)$ over zeros with $|\Im(\rho)| > T_0$. This we will do, as usual, by Lemma 9.1.3. For that, we will need to bound $M\eta_{+,2}(\rho)$ for $\rho$ in the critical strip.

The Mellin transform of $e^{-t^2}$ is $\Gamma(s/2)/2$, and so the Mellin transform of $t^2e^{-t^2}$ is $\Gamma(s/2 + 1)/2$. By (2.10), this implies that the Mellin transform of $(\log t)t^2e^{-t^2}$ is $\Gamma'(s/2 + 1)/4$. Hence, by (2.9),

$$M\eta_{+,2}(s) = \frac{1}{4\pi}\int_{-\infty}^\infty M(h_H^2)(ir)\cdot F_x(s - ir)\,dr, \tag{9.82}$$

where

$$F_x(s) = (\log x)\,\Gamma\left(\frac{s}{2} + 1\right) + \frac{1}{2}\Gamma'\left(\frac{s}{2} + 1\right). \tag{9.83}$$

Moreover,

$$M(h_H^2)(ir) = \frac{1}{2\pi}\int_{-\infty}^\infty Mh_H(iu)\,Mh_H(i(r-u))\,du, \tag{9.84}$$

and so $M(h_H^2)(ir)$ is supported on $[-2H, 2H]$. We also see that $|Mh_H^2(ir)|_1 \le |Mh_H(ir)|_1^2/2\pi$. We know that $|Mh_H(ir)|_1^2/2\pi \le 41.73727$ by (A.17).

Hence

$$\begin{aligned}|M\eta_{+,2}(s)| &\le \frac{1}{4\pi}\int_{-\infty}^\infty |M(h_H^2)(ir)|\,dr\cdot\max_{|r|\le 2H}|F_x(s-ir)|\\ &\le \frac{41.73727}{4\pi}\cdot\max_{|r|\le 2H}|F_x(s-ir)| \le 3.32135\cdot\max_{|r|\le 2H}|F_x(s-ir)|.\end{aligned} \tag{9.85}$$

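By (8.51) (Stirling with explicit constants),

$$|\Gamma(s)| \le \sqrt{2\pi}\,|s|^{\sigma - \frac12}e^{\frac{1}{12|s|} + \frac{\sqrt2}{180|s|^3}}e^{-\pi|\Im(s)|/2} \tag{9.86}$$

when $\Re(s) \ge 0$, and so

$$\begin{aligned}|\Gamma(s)| &\le \sqrt{2\pi}\left(\frac{\sqrt{12.5^2 + 1.5^2}}{12.5}\right)e^{\frac{1}{12\cdot 12.5} + \frac{\sqrt2}{180\cdot 12.5^3}}\cdot|\Im(s)|e^{-\pi|\Im(s)|/2}\\ &\le 2.542\,|\Im(s)|e^{-\pi|\Im(s)|/2}\end{aligned} \tag{9.87}$$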

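for $s \in \mathbb{C}$ with $0 < \Re(s) \le 3/2$ and $|\Im(s)| \ge 25/2$. Moreover, by [OLBC10, 5.11.2] and the remarks at the beginning of [OLBC10, 5.11(ii)],

$$\frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O^*\left(\frac{1}{12|s|^2}\cdot\frac{1}{\cos^3(\theta/2)}\right)$$

for $|\arg(s)| < \theta$ ($\theta \in (-\pi, \pi)$). Again, for $s = \sigma + i\tau$ with $0 < \sigma \le 3/2$ and $|\tau| \ge 25/2$, this gives us

$$\begin{aligned}\frac{\Gamma'(s)}{\Gamma(s)} &= \log|\tau| + \log\frac{\sqrt{|\tau|^2 + 1.5^2}}{|\tau|} + O^*\left(\frac{1}{2|\tau|}\right) + O^*\left(\frac{1}{12|\tau|^2}\cdot\frac{1}{(1/\sqrt2)^3}\right)\\ &= \log|\tau| + O^*\left(\frac{9}{8|\tau|^2} + \frac{1}{2|\tau|}\right) + \frac{O^*(0.236)}{|\tau|^2}\\ &= \log|\tau| + O^*\left(\frac{0.609}{|\tau|}\right).\end{aligned}$$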
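The constants 2.542, 0.236 and 0.609 appearing above can be recomputed directly from the worst case $|\tau| = 12.5$, $\sigma = 3/2$. The following is a plain floating-point sketch (variable names are ours), not a rigorous verification.

```python
import math

tau_min = 12.5    # smallest |Im(s)| allowed in (9.87)
sigma_max = 1.5   # largest Re(s) allowed in (9.87)

# Constant multiplying |Im(s)| e^{-pi |Im(s)|/2} in (9.87).
stirling_const = (
    math.sqrt(2 * math.pi)
    * math.sqrt(tau_min**2 + sigma_max**2) / tau_min
    * math.exp(1 / (12 * tau_min) + math.sqrt(2) / (180 * tau_min**3))
)

# Error term of the digamma expansion with cos(theta/2) = 1/sqrt(2).
theta_const = (1 / 12) / (1 / math.sqrt(2)) ** 3

# Total error (9/(8 tau^2) + 1/(2 tau) + 0.236/tau^2), rewritten as C/|tau| at tau = 12.5.
digamma_const = 0.5 + (9 / 8 + 0.236) / tau_min

print(stirling_const, theta_const, digamma_const)
```

Since each of the three quantities decreases as $|\tau|$ grows, evaluating at $|\tau| = 12.5$ gives the worst case.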

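Hence, for $0 \le \Re(s) \le 1$ (or, in fact, $-2 \le \Re(s) \le 1$) and $\tau = \Im(s)$ with $|\tau| \ge 25$,

$$\begin{aligned}|F_x(s)| &\le \left((\log x) + \frac12\log\left|\frac{\tau}{2}\right| + \frac12 O^*\left(\frac{0.609}{|\tau/2|}\right)\right)\left|\Gamma\left(\frac{s}{2} + 1\right)\right|\\ &\le 2.542\left((\log x) + \frac12\log|\tau| - 0.297\right)\frac{|\tau|}{2}e^{-\pi|\tau|/4},\end{aligned} \tag{9.88}$$

where we apply (9.87) to $s/2 + 1$, whose imaginary part is $\tau/2$.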

Thus, by (9.85), for $\rho = \sigma + i\tau$ with $|\tau| \ge T_0 \ge 2H + 25$ and $0 \le \sigma \le 1$,

$$|M\eta_{+,2}(\rho)| \le f(|\tau|),$$

where

$$f(T) = 8.45\left(\log x + \frac12\log T\right)\left(\frac{T}{2} - H\right)\cdot e^{-\frac{\pi(T - 2H)}{4}}. \tag{9.89}$$

The functions $t \mapsto te^{-\pi t/2}$ and $t \mapsto (\log t)te^{-\pi t/2}$ are decreasing for $t \ge e$ (or, in fact, for $t \ge 1.762$); setting $t = T/2 - H$, we see that the right side of (9.89) is a decreasing function of $T$ for $T \ge T_0$, since $T_0/2 - H \ge 25/2 > e$.

We can now apply Lemma 9.1.3, and get that

$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |M\eta_{+,2}(\rho)| \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{T}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.90}$$

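Since $T \ge T_0 \ge 75 > 2$, we know that $\left((1/2\pi)\log(T/2\pi) + 1/4T\right) \le (1/2\pi)\log T$. Hence, the right side of (9.90) is at most

$$\begin{aligned}&\frac{8.39}{4\pi}\int_{T_0}^\infty\left((\log x)(\log T) + \frac{(\log T)^2}{2}\right)(T - 2H)e^{-\frac{\pi(T-2H)}{4}}\,dT\\ &\quad\le 0.668\int_{T_1}^\infty\left((\log x)\left(\log t + \frac{2H}{t}\right) + \left(\frac{(\log t)^2}{2} + 2H\frac{\log t}{t}\right)\right)te^{-\frac{\pi t}{4}}\,dt,\end{aligned} \tag{9.91}$$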

where $T_1 = T_0 - 2H$ and $t = T - 2H$; we are using the facts that $(\log t)'' < 0$ for $t > 0$ and $((\log t)^2)'' < 0$ for $t > e$. (Of course, $T_1 \ge 25 > e$.)

Of course, $\int_{T_1}^\infty e^{-(\pi/4)t}\,dt = (4/\pi)e^{-(\pi/4)T_1}$. We recall (9.36) and (9.50):

$$\int_{T_1}^\infty \log t\cdot e^{-\frac{\pi}{4}t}\,dt \le \left(\log T_1 + \frac{4/\pi}{T_1}\right)\frac{e^{-\frac{\pi}{4}T_1}}{\pi/4},$$

$$\int_{T_1}^\infty (\log t)\,te^{-\frac{\pi}{4}t}\,dt \le \left(T_1 + \frac{4a}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}\log T_1}{\pi/4}$$

for $T_1 \ge 1$ satisfying $\log T_1 > 4/(\pi T_1)$, where $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$. It is easy to check that $\log T_1 > 4/(\pi T_1)$ and $4a/\pi \le 1.6957$ for $T_1 \ge 25$; of course, we also have $(4/\pi)/25 \le 0.051$. Lastly,

$$\int_{T_1}^\infty (\log t)^2\,te^{-\frac{\pi}{4}t}\,dt \le \left(T_1 + \frac{4b}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}(\log T_1)^2}{\pi/4}$$

for $T_1 \ge e$, where $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$, and we check that $4b/\pi \le 2.1319$ for $T_1 \ge 25$. We conclude that the integral on the second line of (9.91) is at most

$$\frac{4}{\pi}\left(\frac{(\log T_1)^2}{2}(T_1 + 2.132) + (\log x)(\log T_1)(T_1 + 1.696)\right)e^{-\frac{\pi}{4}T_1} + \frac{4}{\pi}\cdot 2H(\log T_1 + 0.051 + \log x)e^{-\frac{\pi}{4}T_1}.$$

Multiplying this by 0.668 and simplifying further (using $T_1 \ge 25$), we conclude that $\sum_{\rho: |\Im(\rho)| > T_0}|M\eta_{+,2}(\rho)|$ is at most

$$\left((0.462\log T_1 + 0.909\log x)(\log T_1)T_1 + 1.71(\log T_1 + \log x)H\right)e^{-\frac{\pi}{4}T_1}.$$
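The constants in this last simplification can be recomputed as follows. This is a minimal floating-point sketch under the stated assumption $T_1 \ge 25$ (the function names are ours); it checks the values of $4a/\pi$ and $4b/\pi$ at $T_1 = 25$, that $a$ and $b$ do not increase on a sample grid, and the coefficients 0.462 and 0.909 obtained after multiplying by 0.668.

```python
import math

T1_MIN = 25.0  # smallest admissible T_1 = T_0 - 2H

def a_const(T):
    c = 4 / (math.pi * T)
    return 1 + (1 + c) / (math.log(T) - c)

def b_const(T):
    c = 8 / (math.pi * T)
    return 1 + (2 + c) / (math.log(T) - c)

four_a_over_pi = 4 * a_const(T1_MIN) / math.pi
four_b_over_pi = 4 * b_const(T1_MIN) / math.pi

# a and b decrease in T_1, so T_1 = 25 is the worst case (checked on a grid only).
a_monotone = all(a_const(T) <= a_const(T1_MIN) for T in range(25, 2000))
b_monotone = all(b_const(T) <= b_const(T1_MIN) for T in range(25, 2000))

k = 0.668 * 4 / math.pi
coef_log_sq = k * 0.5 * (1 + 2.132 / T1_MIN)  # multiplies (log T1)^2 * T1
coef_log_x = k * (1 + 1.696 / T1_MIN)         # multiplies (log x)(log T1) * T1

print(four_a_over_pi, four_b_over_pi, coef_log_sq, coef_log_x)
```

The coefficient 1.71 of the $H$ term additionally uses $x \ge 10^8$ to absorb the summand 0.051, so it is not reproduced by these two lines alone.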

9.6 A verification of zeros and its consequences

David Platt verified in his doctoral thesis [Pla11] that, for every primitive character $\chi$ of conductor $q \le 10^5$, all the non-trivial zeroes of $L(s,\chi)$ with imaginary part $\le 10^8/q$ lie on the critical line, i.e., have real part exactly $1/2$. (We call this a GRH verification up to $10^8/q$.)

In work undertaken in coordination with the present work [Plab], Platt has extended these computations to

• all odd $q \le 3\cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \le 4\cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5\cdot 10^7/q)$.

The method used was rigorous; its implementation uses interval arithmetic.

Let us see what this verification gives us when used as an input to Prop. 9.2.2. We are interested in bounds on $|\mathrm{err}_{\eta,\chi^*}(\delta, x)|$ for $q \le r$ and $|\delta| \le 4r/q$. We set $r = 3\cdot 10^5$. (We will not be using the verification for $q$ even with $3\cdot 10^5 < q \le 4\cdot 10^5$, though we certainly could.)

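We let $T_0 = 10^8/q$. Thus,

$$T_0 \ge \frac{10^8}{3\cdot 10^5} = \frac{1000}{3}, \qquad \frac{T_0}{\pi|\delta|} \ge \frac{10^8/q}{\pi\cdot 4r/q} = \frac{1000}{12\pi}, \tag{9.92}$$

and so, by $|\delta| \le 4r/q \le 1.2\cdot 10^6/q \le 1.2\cdot 10^6$,

$$3.53e^{-0.1598T_0} \le 2.597\cdot 10^{-23}, \qquad \frac{22.5\delta^2}{T_0}e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}} \le |\delta|\cdot 7.715\cdot 10^{-34} \le 9.258\cdot 10^{-28}.$$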
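These two exponentially small bounds are easy to recompute from (9.92). The sketch below (floating point only, variable names ours) evaluates both terms at the extreme values $T_0 = 1000/3$ and $T_0/(\pi|\delta|) = 1000/(12\pi)$, reading the constants as 3.53, 22.5, 0.1598 and 0.1065.

```python
import math

T0_min = 1000 / 3                    # T_0 >= 10^8 / (3 * 10^5)
ratio_min = 1000 / (12 * math.pi)    # T_0 / (pi |delta|) >= 1000 / (12 pi)
delta_max = 1.2e6                    # |delta| <= 4r/q <= 1.2 * 10^6

first = 3.53 * math.exp(-0.1598 * T0_min)

# delta^2 / T_0 = |delta| * (|delta| / T_0), and |delta| / T_0 <= 1 / (pi * ratio_min).
per_delta = 22.5 / (math.pi * ratio_min) * math.exp(-0.1065 * ratio_min**2)
second = per_delta * delta_max

print(f"first term      ~ {first:.4e}   (text: 2.597e-23)")
print(f"per unit |delta| ~ {per_delta:.4e}   (text: 7.715e-34)")
print(f"second term     ~ {second:.4e}   (text: 9.258e-28)")
```

The margins are thin (a few parts in ten thousand), which is consistent with the constants in the text being rounded upwards in the last digit.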

Since $qT_0 \le 10^8$, this gives us that

$$\log\frac{qT_0}{2\pi}\cdot\left(3.53e^{-0.1598T_0} + \frac{22.5\delta^2}{T_0}e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}}\right) \le 4.3054\cdot 10^{-22} + \frac{1.54\cdot 10^{-26}}{q} \le 4.306\cdot 10^{-22}.$$

Again by $T_0 = 10^8/q$,

$$2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38$$

is at most $\frac{648662}{\sqrt q} + 111$, and

$$3\log q + 14|\delta| + 17 \le 55 + \frac{1.7\cdot 10^7}{q}, \qquad (\log q + 6)\cdot(1 + 5|\delta|) \le 19 + \frac{1.2\cdot 10^8}{q}.$$

Hence, assuming $x \ge 10^8$ to simplify, we see that Prop. 9.2.2 gives us that

$$\begin{aligned}\mathrm{err}_{\eta,\chi}(\delta, x) &\le 4.306\cdot 10^{-22} + \frac{\frac{648662}{\sqrt q} + 111}{\sqrt x} + \frac{55 + \frac{1.7\cdot 10^7}{q}}{x} + \frac{19 + \frac{1.2\cdot 10^8}{q}}{x^{3/2}}\\ &\le 4.306\cdot 10^{-22} + \frac{1}{\sqrt x}\left(\frac{650400}{\sqrt q} + 112\right)\end{aligned}$$

for $\eta(t) = e^{-t^2/2}$. This proves Theorem 7.1.1.

Let us now see what Platt's calculations give us when used as an input to Prop. 9.3.2 and Cor. 9.3.3. Again, we set $r = 3\cdot 10^5$, $\delta_0 = 8$, $|\delta| \le 4r/q$ and $T_0 = 10^8/q$, so (9.92) is still valid. We obtain

$$\begin{aligned}&T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 1.578e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\ &\quad\le \log\frac{10^8}{2\pi}\left(6.11\cdot\frac{1000}{3}e^{-0.1598\cdot\frac{1000}{3}} + 10^8\cdot 1.578e^{-0.1065\left(\frac{1000}{12\pi}\right)^2}\right) \le 2.485\cdot 10^{-19},\end{aligned}$$

since $t\exp(-0.1598t)$ is decreasing on $t$ for $t \ge 1/0.1598$. We use the same bound when we have 0.0102 instead of 1.578 on the left side, as in (9.61). (The coefficient affects what is by far the smaller term, so we are wasting nothing.) Again by $T_0 = 10^8/q$ and $q \le r$,

$$\begin{aligned}1.22\sqrt{T_0}\log qT_0 + 5.053\sqrt{T_0} + 1.423\log q + 37.19 &\le \frac{279793}{\sqrt q} + 55.2,\\ 1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17 &\le \frac{378854}{\sqrt q} + 75.9.\end{aligned}$$

For $x \ge 10^8$, we use $|\delta| \le 4r/q \le 1.2\cdot 10^6/q$ to bound

$$\begin{aligned}(3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2} &\le \left(0.0004 + \frac{1322}{q}\right)x^{-1/2},\\ (6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2} &\le \left(0.0007 + \frac{2644}{q}\right)x^{-1/2}.\end{aligned}$$

Summing, we obtain

$$\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt x}\left(\frac{281200}{\sqrt q} + 56\right)$$

for $\eta(t) = t^2e^{-t^2/2}$ and

$$\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt x}\left(\frac{381500}{\sqrt q} + 76\right)$$

for $\eta(t) = t^2e^{-t^2/2} *_M \eta_2(t)$. This proves Theorem 7.1.2 and Corollary 7.1.3.

Now let us work with the smoothing weight $\eta_+$. This time around, set $r = 150000$ if $q$ is odd, and $r = 300000$ if $q$ is even. As before, we assume

$$q \le r, \qquad |\delta| \le 4r/q.$$

We can see that Platt's verification [Plab], mentioned before, allows us to take

$$T_0 = H + \frac{250r}{q}, \qquad H = 200,$$

since $T_q$ is always at least this ($T_q = 10^8/q \ge 200 + 7\cdot 10^7/q > 200 + 3.75\cdot 10^7/q$ for $q \le 150000$ odd, $T_q \ge 200 + 7.5\cdot 10^7/q$ for $q \le 300000$ even).

Thus,

$$T_0 - H = \frac{250r}{q} \ge \frac{250r}{r} = 250, \qquad \frac{T_0 - H}{\pi\delta} \ge \frac{250r}{\pi\delta q} \ge \frac{250}{4\pi} = 19.89436$$

and also

$$T_0 \le 200 + 250\cdot 150000 \le 3.751\cdot 10^7, \qquad qT_0 \le rH + 250r \le 1.35\cdot 10^8.$$

Hence, since $\sqrt{t}e^{-0.1598t}$ is decreasing on $t$ for $t \ge 1/(2\cdot 0.1598)$,

$$\begin{aligned}&11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147|\delta|e^{-0.1065\frac{(T_0-H)^2}{(\pi\delta)^2}}\\ &\quad\le 7.9854\cdot 10^{-16} + \frac{4r}{q}\cdot 7.9814\cdot 10^{-18} \le 7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}.\end{aligned}$$

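Examining (9.70), we get

$$\begin{aligned}\mathrm{err}_{\eta_+,\chi}(\delta, x) &\le \log\frac{1.35\cdot 10^8}{2\pi}\cdot\left(7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}\right)\\ &\quad + \left(\left(1.634\log(1.35\cdot 10^8) + 12.43\right)\frac{\sqrt{1.35\cdot 10^8}}{\sqrt q} + 1.321\log 300000 + 34.51\right)\frac{1}{\sqrt x}\\ &\quad + \left(9 + 11\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-1} + (\log 300000)\left(1.1 + 6\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-3/2}\\ &\le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q}\\ &\quad + \left(\frac{499845}{\sqrt q} + 51.17 + \frac{13.2\cdot 10^6}{q\sqrt x} + \frac{9}{\sqrt x} + \frac{9.1\cdot 10^7}{qx} + \frac{13.9}{x}\right)\frac{1}{\sqrt x}.\end{aligned}$$

Making the assumption $x \ge 10^{12}$, we obtain

$$\mathrm{err}_{\eta_+,\chi}(\delta, x) \le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \left(\frac{499900}{\sqrt q} + 52\right)\frac{1}{\sqrt x}.$$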
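The leading constants in this bound for $\eta_+$ can be recomputed from the extreme values $T_0 - H = 250$, $(T_0 - H)/(\pi|\delta|) = 250/(4\pi)$ and $qT_0 \le 1.35\cdot 10^8$. The following sketch uses ordinary floating point (variable names are ours) and leaves aside the lower-order terms in $x^{-1}$ and $x^{-3/2}$.

```python
import math

r_even, H = 300000, 200
gap = 250.0                          # T_0 - H >= 250
ratio = 250 / (4 * math.pi)          # (T_0 - H) / (pi |delta|) >= 250 / (4 pi)
qT0_max = r_even * H + 250 * r_even  # r H + 250 r = 1.35e8

exp_term = 11.308 * math.sqrt(gap) * math.exp(-0.1598 * gap)
per_delta = 16.147 * math.exp(-0.1065 * ratio**2)

log_factor = math.log(qT0_max / (2 * math.pi))
main_const = log_factor * 7.9854e-16     # text: 1.3482e-14
main_over_q = log_factor * 9.5777e-12    # text: 1.617e-10 (coefficient of 1/q)

# Coefficient of 1/sqrt(q x) and the constant coefficient of 1/sqrt(x).
sqrt_coeff = (1.634 * math.log(qT0_max) + 12.43) * math.sqrt(qT0_max)  # text: 499845
const_coeff = 1.321 * math.log(r_even) + 34.51                          # text: 51.17

print(exp_term, per_delta, main_const, main_over_q, sqrt_coeff, const_coeff)
```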

This proves Theorem 7.1.4 for general $q$.

Let us optimize things a little more carefully for the trivial character $\chi_T$. Again, we will make the assumption $x \ge 10^{12}$. We will also assume, as we did before, that $|\delta| \le 4r/q$; this now gives us $|\delta| \le 600000$, since $q = 1$ and $r = 150000$ for $q$ odd. We will go up to a height $T_0 = H + 600000\pi\cdot t$, where $H = 200$ and $t \ge 10$. Then

$$\frac{T_0 - H}{\pi\delta} = \frac{600000\pi t}{4\pi r} \ge t.$$

Hence

$$11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147|\delta|e^{-0.1065\frac{(T_0 - H)^2}{(\pi\delta)^2}} \le 10^{-1300000} + 9689000e^{-0.1065t^2}.$$

Looking at (9.70), we get

$$\begin{aligned}\mathrm{err}_{\eta_+,\chi_T}(\delta, x) &\le \log\frac{T_0}{2\pi}\cdot\left(10^{-1300000} + 9689000e^{-0.1065t^2}\right)\\ &\quad + \left((1.634\log T_0 + 12.43)\sqrt{T_0} + 34.51\right)x^{-1/2} + 6600009x^{-1}.\end{aligned}$$

The value $t = 20$ seems good enough; we choose it because it is not far from optimal for $x \sim 10^{27}$. We get that $T_0 = 12000000\pi + 200$; since $T_0 < 10^8$, we are within the range of the computations in [Plab] (or, for that matter, [Wed03] or [Plaa]). We obtain

$$\mathrm{err}_{\eta_+,\chi_T}(\delta, x) \le 4.772\cdot 10^{-11} + \frac{251400}{\sqrt x}.$$
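As a quick check of the choice $t = 20$, the two constants in the final bound for the trivial character can be recomputed. This is a floating-point sketch (names are ours); the term $10^{-1300000}$ underflows to zero in double precision and is simply omitted, and the $x^{-1}$ term is folded into the coefficient of $x^{-1/2}$ at the smallest admissible $x = 10^{12}$.

```python
import math

H, t = 200, 20
delta_max = 600000                 # |delta| <= 4r/q with q = 1, r = 150000
T0 = H + 600000 * math.pi * t      # = 12000000*pi + 200
x_min = 1e12

# Leading constant: log(T0 / 2 pi) * 9689000 * exp(-0.1065 t^2).
leading = math.log(T0 / (2 * math.pi)) * 9689000 * math.exp(-0.1065 * t**2)

# Coefficient of x^{-1/2}; (9 + 11|delta|) x^{-1} <= ((9 + 11|delta|)/sqrt(x_min)) x^{-1/2}.
sqrt_coeff = (
    (1.634 * math.log(T0) + 12.43) * math.sqrt(T0)
    + 34.51
    + (9 + 11 * delta_max) / math.sqrt(x_min)
)

print(f"T0         ~ {T0:.1f}")
print(f"leading    ~ {leading:.4e}   (text: 4.772e-11)")
print(f"sqrt coeff ~ {sqrt_coeff:.1f}   (text: 251400)")
```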

Lastly, let us look at the sum estimated in (9.72). Here it will be enough to go up to just $T_0 = 2H + \max(50, H/4) = 450$, where, as before, $H = 200$. Of course, the verification of the zeros of the Riemann zeta function does go that far; as we already said, it goes until $10^8$ (or rather more: see [Wed03] and [Plaa]). We make, again, the assumption $x \ge 10^{12}$. We look at (9.73) and obtain that $\mathrm{err}_{\ell_2,\eta_+}$ is at most

$$\begin{aligned}&\left(\left(0.462\frac{(\log 50)^2}{\log 10^{12}} + 0.909\log 50\right)\cdot 50 + 1.71\left(1 + \frac{\log 50}{\log 10^{12}}\right)\cdot 200\right)e^{-\frac{\pi}{4}50}\\ &\quad + \left(2.445\sqrt{450}\log 450 + 50.04\right)\cdot x^{-1/2} \le 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt x}.\end{aligned} \tag{9.93}$$

It remains only to estimate the integral in (9.72). First of all,

$$\begin{aligned}\int_0^\infty \eta_+^2(t)\log xt\,dt &= \int_0^\infty \eta_\circ^2(t)\log xt\,dt\\ &\quad + 2\int_0^\infty(\eta_+(t) - \eta_\circ(t))\eta_\circ(t)\log xt\,dt + \int_0^\infty(\eta_+(t) - \eta_\circ(t))^2\log xt\,dt.\end{aligned}$$

The main term will be given by

$$\int_0^\infty \eta_\circ^2(t)\log xt\,dt = \left(0.64020599736635 + O(10^{-14})\right)\log x - 0.021094778698867 + O(10^{-15}),$$

where the integrals were computed rigorously using VNODE-LP [Ned06]. (The integral $\int_0^\infty \eta_\circ^2(t)\,dt$ can also be computed symbolically.) By Cauchy-Schwarz and the triangle inequality,

$$\begin{aligned}\int_0^\infty(\eta_+(t) - \eta_\circ(t))\eta_\circ(t)\log xt\,dt &\le |\eta_+ - \eta_\circ|_2\,|\eta_\circ(t)\log xt|_2\\ &\le |\eta_+ - \eta_\circ|_2\left(|\eta_\circ|_2\log x + |\eta_\circ\cdot\log|_2\right)\\ &\le \frac{274.86}{H^{7/2}}(0.80013\log x + 0.214)\\ &\le 1.944\cdot 10^{-6}\cdot\log x + 5.2\cdot 10^{-7},\end{aligned}$$

where we are using (A.23) and evaluate $|\eta_\circ\cdot\log|_2$ rigorously as above. By (A.23) and (A.24),

$$\int_0^\infty(\eta_+(t) - \eta_\circ(t))^2\log xt\,dt \le \left(\frac{274.86}{H^{7/2}}\right)^2\log x + \frac{27428}{H^7} \le 5.903\cdot 10^{-12}\cdot\log x + 2.143\cdot 10^{-12}.$$

We conclude that

$$\int_0^\infty \eta_+^2(t)\log xt\,dt = \left(0.640206 + O^*(1.95\cdot 10^{-6})\right)\log x - 0.021095 + O^*(5.3\cdot 10^{-7}). \tag{9.94}$$
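The numbers entering (9.93) and (9.94) can be reproduced approximately with a few lines of code. The sketch below is not rigorous (plain floating point and Simpson's rule instead of VNODE-LP), and it rests on one assumption that is not stated in this section: that the comparison function is $\eta_\circ(t) = t^3(2-t)^3e^{-(t-1)^2/2}$ on $[0,2]$ and zero elsewhere. All function and variable names are ours.

```python
import math

def eta_circ(t):
    # ASSUMED form of the comparison function (see lead-in); zero outside (0, 2).
    if t <= 0.0 or t >= 2.0:
        return 0.0
    return t**3 * (2 - t) ** 3 * math.exp(-((t - 1) ** 2) / 2)

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def log_weighted(t):
    # eta_circ(t)^2 * log t, with the value 0 at the endpoints.
    return eta_circ(t) ** 2 * math.log(t) if 0.0 < t < 2.0 else 0.0

norm_sq = simpson(lambda t: eta_circ(t) ** 2, 0.0, 2.0)   # text: 0.640205997...
log_moment = simpson(log_weighted, 0.0, 2.0)              # text: -0.021094778...

# Error terms for H = 200, using the constants 274.86 and 27428 quoted in the text.
H = 200
d = 274.86 / H**3.5
cross_log_x, cross_const = d * 0.80013, d * 0.214         # text: 1.944e-6, 5.2e-7
square_log_x, square_const = d**2, 27428 / H**7           # text: 5.903e-12, 2.143e-12

# The two pieces of (9.93) at T_1 = 50, T_0 = 450, x = 10^12.
log_x = math.log(1e12)
tail = (
    (0.462 * math.log(50) ** 2 / log_x + 0.909 * math.log(50)) * 50
    + 1.71 * (1 + math.log(50) / log_x) * H
) * math.exp(-math.pi / 4 * 50)                           # text: 5.123e-15
sqrt_coeff = 2.445 * math.sqrt(450) * math.log(450) + 50.04   # text: 366.91

print(norm_sq, log_moment, tail, sqrt_coeff)
```

If the assumed form of $\eta_\circ$ is right, the first two printed values should match the rigorously computed constants to about six decimal places.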

We add to this the error term $5.123\cdot 10^{-15} + 366.91/\sqrt x$ from (9.93), and simplify using the assumption $x \ge 10^{12}$. We obtain

$$\sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) = 0.640206\,x\log x - 0.021095\,x + O^*\left(2\cdot 10^{-6}x\log x + 366.91\sqrt x\log x\right), \tag{9.95}$$

and so Prop. 9.5.1 gives us Proposition 7.1.5.

As we can see, the relatively large error term $2\cdot 10^{-6}$ comes from the fact that we have wanted to give the main term in (9.72) as an explicit constant, rather than as an integral. This is satisfactory; Prop. 7.1.5 is an auxiliary result that will be needed for one specific purpose in Part III, as opposed to Thms. 7.1.1 to 7.1.4, which, while crucial for Part III, are also of general applicability and interest.

      Part III

      The integral over the circle


      Chapter 10

      The integral over the major arcs

Let

$$S_\eta(\alpha, x) = \sum_n \Lambda(n)e(\alpha n)\eta(n/x), \tag{10.1}$$

where $\alpha \in \mathbb{R}/\mathbb{Z}$, $\Lambda$ is the von Mangoldt function and $\eta: \mathbb{R} \to \mathbb{C}$ is of fast enough decay for the sum to converge.

Our ultimate goal is to bound from below

$$\sum_{n_1 + n_2 + n_3 = N}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\eta_1(n_1/x)\eta_2(n_2/x)\eta_3(n_3/x), \tag{10.2}$$

where $\eta_1, \eta_2, \eta_3: \mathbb{R} \to \mathbb{C}$. Once we know that this is neither zero nor very close to zero, we will know that it is possible to write $N$ as the sum of three primes $n_1$, $n_2$, $n_3$ in at least one way; that is, we will have proven the ternary Goldbach conjecture.

As can be readily seen, (10.2) equals

$$\int_{\mathbb{R}/\mathbb{Z}} S_{\eta_1}(\alpha, x)S_{\eta_2}(\alpha, x)S_{\eta_3}(\alpha, x)e(-N\alpha)\,d\alpha. \tag{10.3}$$

In the circle method, the set $\mathbb{R}/\mathbb{Z}$ gets partitioned into the set of major arcs $\mathfrak{M}$ and the set of minor arcs $\mathfrak{m}$; the contribution of each of the two sets to the integral (10.3) is evaluated separately.
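The identity between the weighted sum (10.2) and the integral (10.3) is just orthogonality of the characters $e(\alpha n)$, and it can be observed numerically on a toy example. The sketch below takes $\eta_1 = \eta_2 = \eta_3$ equal to a hypothetical tent function supported on $[0,2]$ (not one of the smoothing functions used in this work) and small values $N = 21$, $x = 10$. Because the integrand is then a trigonometric polynomial, averaging over the $M$ points $\alpha = k/M$ computes the integral exactly once $M$ exceeds every frequency that occurs.

```python
import cmath
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if p * p > n:
            return math.log(n)          # no divisor up to sqrt(n): n is prime
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def eta(t):
    # Hypothetical compactly supported weight (tent on [0, 2]); illustration only.
    return max(0.0, 1.0 - abs(t - 1.0))

def e(t):
    return cmath.exp(2j * math.pi * t)

def S(alpha, x):
    # S_eta(alpha, x) as in (10.1); eta(n/x) vanishes for n >= 2x.
    return sum(von_mangoldt(n) * e(alpha * n) * eta(n / x) for n in range(1, int(2 * x)))

def direct_sum(N, x):
    # The weighted count (10.2), summed directly.
    w = [von_mangoldt(n) * eta(n / x) for n in range(int(2 * x))]
    total = 0.0
    for n1 in range(1, len(w)):
        for n2 in range(1, len(w)):
            n3 = N - n1 - n2
            if 1 <= n3 < len(w):
                total += w[n1] * w[n2] * w[n3]
    return total

def circle_integral(N, x, M):
    # Integral (10.3) as an exact average over alpha = k/M (all frequencies are < M).
    return sum(S(k / M, x) ** 3 * e(-N * k / M) for k in range(M)) / M

N, x = 21, 10
direct = direct_sum(N, x)
circle = circle_integral(N, x, M=64)
print(direct, circle)
```

The two printed values should agree up to floating-point error, with the second having a negligible imaginary part.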

Our objective here is to treat the major arcs: we wish to estimate

$$\int_{\mathfrak{M}} S_{\eta_1}(\alpha, x)S_{\eta_2}(\alpha, x)S_{\eta_3}(\alpha, x)e(-N\alpha)\,d\alpha \tag{10.4}$$

for $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$, where

$$\mathfrak{M}_{\delta_0,r} = \bigcup_{\substack{q \le r\\ q\ \mathrm{odd}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}}\left(\frac{a}{q} - \frac{\delta_0 r}{2qx}, \frac{a}{q} + \frac{\delta_0 r}{2qx}\right) \cup \bigcup_{\substack{q \le 2r\\ q\ \mathrm{even}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}}\left(\frac{a}{q} - \frac{\delta_0 r}{qx}, \frac{a}{q} + \frac{\delta_0 r}{qx}\right) \tag{10.5}$$

and $\delta_0 > 0$, $r \ge 1$ are given.

In other words, our major arcs will be few (that is, a constant number) and narrow. While [LW02] used relatively narrow major arcs as well, their number, as in all previous proofs of Vinogradov's result, was not bounded by a constant. (In his proof of the five-primes theorem, [Tao14] is able to take a single major arc around 0; this is not possible here.)

What we are about to see is the general major-arc setup. This is naturally the place where the overlap with the existing literature is largest. Two important differences can nevertheless be singled out.

• The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need estimates for exponential sums on major arcs, and not just at the middle of each major arc. (If there is smoothing, we cannot use summation by parts to reduce the problem of estimating sums to a problem of counting primes in arithmetic progressions, or weighted by characters.)

• Since our L-function estimates for exponential sums will give bounds that are better than the trivial one by only a constant (even if it is a rather large constant), we need to be especially careful when estimating error terms, finding cancellation when possible.

10.1 Decomposition of $S_\eta$ by characters

What follows is largely classical; cf. [HL22] or, say, [Dav67, §26]. The only difference from the literature lies in the treatment of $n$ non-coprime to $q$, and the way in which we show that our exponential sum (10.8) is equal to a linear combination of twisted sums $S_{\eta,\chi^*}$ over primitive characters $\chi^*$. (Non-primitive characters would give us L-functions with some zeroes inconveniently placed on the line $\Re(s) = 0$.)

Write $\tau(\chi, b)$ for the Gauss sum

$$\tau(\chi, b) = \sum_{a \bmod q}\chi(a)e(ab/q) \tag{10.6}$$

associated to a $b \in \mathbb{Z}/q\mathbb{Z}$ and a Dirichlet character $\chi$ with modulus $q$. We let $\tau(\chi) = \tau(\chi, 1)$. If $(b, q) = 1$, then $\tau(\chi, b) = \chi(b^{-1})\tau(\chi)$.

Recall that $\chi^*$ denotes the primitive character inducing a given Dirichlet character $\chi$. Writing $\sum_{\chi \bmod q}$ for a sum over all characters $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$, we see that, for any $a_0 \in \mathbb{Z}/q\mathbb{Z}$,

$$\begin{aligned}\frac{1}{\phi(q)}\sum_{\chi \bmod q}\tau(\overline{\chi}, b)\chi^*(a_0) &= \frac{1}{\phi(q)}\sum_{\chi \bmod q}\ \sum_{\substack{a \bmod q\\ (a,q)=1}}\overline{\chi}(a)e(ab/q)\chi^*(a_0)\\ &= \sum_{\substack{a \bmod q\\ (a,q)=1}}\frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q}\chi^*(a^{-1}a_0) = \sum_{\substack{a \bmod q\\ (a,q)=1}}\frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q'}\chi(a^{-1}a_0),\end{aligned} \tag{10.7}$$

where $q' = q/\gcd(q, a_0^\infty)$. Now, $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = 0$ unless $a = a_0$ (in which case $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = \phi(q')$). Thus, (10.7) equals

$$\begin{aligned}\frac{\phi(q')}{\phi(q)}\sum_{\substack{a \bmod q\\ (a,q)=1\\ a \equiv a_0 \bmod q'}}e(ab/q) &= \frac{\phi(q')}{\phi(q)}\sum_{\substack{k \bmod q/q'\\ (k, q/q')=1}}e\left(\frac{(a_0 + kq')b}{q}\right)\\ &= \frac{\phi(q')}{\phi(q)}e\left(\frac{a_0b}{q}\right)\sum_{\substack{k \bmod q/q'\\ (k, q/q')=1}}e\left(\frac{kb}{q/q'}\right) = \frac{\phi(q')}{\phi(q)}e\left(\frac{a_0b}{q}\right)\mu(q/q'),\end{aligned}$$

provided that $(b, q) = 1$. (We are evaluating a Ramanujan sum in the last step.) Hence, for $\alpha = a/q + \delta/x$, $q \le x$, $(a, q) = 1$,

$$\frac{1}{\phi(q)}\sum_\chi \tau(\overline{\chi}, a)\sum_n \chi^*(n)\Lambda(n)e(\delta n/x)\eta(n/x)$$

equals

$$\sum_n \frac{\mu((q, n^\infty))}{\phi((q, n^\infty))}\Lambda(n)e(\alpha n)\eta(n/x).$$

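Since $(a, q) = 1$, $\tau(\overline{\chi}, a) = \chi(a)\tau(\overline{\chi})$. The factor $\mu((q, n^\infty))/\phi((q, n^\infty))$ equals 1 when $(n, q) = 1$; the absolute value of the factor is at most 1 for every $n$. Clearly

$$\sum_{\substack{n\\ (n,q)\ne 1}}\Lambda(n)\eta\left(\frac{n}{x}\right) = \sum_{p|q}\log p\sum_{\alpha \ge 1}\eta\left(\frac{p^\alpha}{x}\right).$$

Recalling the definition (10.1) of $S_\eta(\alpha, x)$, we conclude that

$$S_\eta(\alpha, x) = \frac{1}{\phi(q)}\sum_{\chi \bmod q}\chi(a)\tau(\overline{\chi})S_{\eta,\chi^*}\left(\frac{\delta}{x}, x\right) + O^*\left(2\sum_{p|q}\log p\sum_{\alpha \ge 1}\eta\left(\frac{p^\alpha}{x}\right)\right), \tag{10.8}$$

where

$$S_{\eta,\chi}(\beta, x) = \sum_n \Lambda(n)\chi(n)e(\beta n)\eta(n/x). \tag{10.9}$$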

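Hence $S_{\eta_1}(\alpha, x)S_{\eta_2}(\alpha, x)S_{\eta_3}(\alpha, x)e(-N\alpha)$ equals

$$\begin{aligned}\frac{1}{\phi(q)^3}\sum_{\chi_1}\sum_{\chi_2}\sum_{\chi_3}&\tau(\overline{\chi_1})\tau(\overline{\chi_2})\tau(\overline{\chi_3})\chi_1(a)\chi_2(a)\chi_3(a)e(-Na/q)\\ &\cdot S_{\eta_1,\chi_1^*}(\delta/x, x)S_{\eta_2,\chi_2^*}(\delta/x, x)S_{\eta_3,\chi_3^*}(\delta/x, x)e(-\delta N/x)\end{aligned} \tag{10.10}$$

plus an error term of absolute value at most

$$2\sum_{j=1}^3\ \prod_{j' \ne j}|S_{\eta_{j'}}(\alpha, x)|\sum_{p|q}\log p\sum_{\alpha \ge 1}\eta_j\left(\frac{p^\alpha}{x}\right). \tag{10.11}$$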

We will later see that the integral of (10.11) over $S^1$ is negligible: for our choices of $\eta_j$, it will, in fact, be of size $O(x(\log x)^A)$, $A$ a constant. The error term $O(x(\log x)^A)$ should be compared to the main term, which will be of size about a constant times $x^2$.

In (10.10), we have reduced our problems to estimating $S_{\eta,\chi}(\delta/x, x)$ for $\chi$ primitive; a more obvious way of reaching the same goal would have made (10.11) worse by a factor of about $\sqrt q$.

10.2 The integral over the major arcs: the main term

We are to estimate the integral (10.4), where the major arcs $\mathfrak{M}_{\delta_0,r}$ are defined as in (10.5). We will use $\eta_1 = \eta_2 = \eta_+$, $\eta_3(t) = \eta_*(\kappa t)$, where $\eta_+$ and $\eta_*$ will be set later.

We can write

$$\begin{aligned}S_{\eta,\chi}(\delta/x, x) = S_\eta(\delta/x, x) &= \int_0^\infty \eta(t/x)e(\delta t/x)\,dt + O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x\\ &= \widehat{\eta}(-\delta)\cdot x + O^*(\mathrm{err}_{\eta,\chi_T}(\delta, x))\cdot x\end{aligned} \tag{10.12}$$

for $\chi = \chi_T$ the trivial character, and

$$S_{\eta,\chi}(\delta/x, x) = O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x \tag{10.13}$$

for $\chi$ primitive and non-trivial. The estimation of the error terms err will come later; let us focus on (a) obtaining the contribution of the main term, (b) using estimates on the error terms efficiently.

The main term: three principal characters. The main contribution will be given by the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$, where $\chi_0$ is the principal character mod $q$.

The sum $\tau(\chi_0, n)$ is a Ramanujan sum; as is well-known (see, e.g., [IK04, (3.2)]),

$$\tau(\chi_0, n) = \sum_{d|(q,n)}\mu(q/d)d. \tag{10.14}$$

This simplifies to $\mu(q/(q,n))\phi((q,n))$ for $q$ square-free. The special case $n = 1$ gives us that $\tau(\chi_0) = \mu(q)$.
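The evaluation (10.14) of the Ramanujan sum, its square-free simplification and the special case $\tau(\chi_0) = \mu(q)$ can all be confirmed by brute force for small moduli. The following is a small self-contained check; the helper functions are ours and are written for clarity rather than speed.

```python
import cmath
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0        # a square divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def euler_phi(n):
    return sum(1 for a in range(1, n + 1) if math.gcd(a, n) == 1)

def ramanujan_sum(q, n):
    # tau(chi_0, n): sum of e(a n / q) over a mod q coprime to q.
    return sum(
        cmath.exp(2j * math.pi * a * n / q)
        for a in range(1, q + 1)
        if math.gcd(a, q) == 1
    )

def divisor_formula(q, n):
    # Right-hand side of (10.14).
    g = math.gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

max_error = max(
    abs(ramanujan_sum(q, n) - divisor_formula(q, n))
    for q in range(1, 31)
    for n in range(0, 31)
)
print("largest discrepancy:", max_error)
```

The discrepancy printed should be of the size of floating-point rounding error only.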

Thus, the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$ equals

$$\frac{e(-Na/q)}{\phi(q)^3}\mu(q)^3S_{\eta_+,\chi_0^*}(\delta/x, x)^2S_{\eta_*,\chi_0^*}(\delta/x, x)e(-\delta N/x), \tag{10.15}$$

where, of course, $S_{\eta,\chi_0^*}(\alpha, x) = S_\eta(\alpha, x)$ (since $\chi_0^*$ is the trivial character). Summing (10.15) for $\alpha = a/q + \delta/x$ and $a$ going over all residues mod $q$ coprime to $q$, we obtain

$$\frac{\mu\left(\frac{q}{(q,N)}\right)\phi((q,N))}{\phi(q)^3}\mu(q)^3S_{\eta_+,\chi_0^*}(\delta/x, x)^2S_{\eta_*,\chi_0^*}(\delta/x, x)e(-\delta N/x).$$

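The integral of (10.15) over all of $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ (see (10.5)) thus equals

$$\begin{aligned}&\sum_{\substack{q \le r\\ q\ \mathrm{odd}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}}S^2_{\eta_+,\chi_0^*}(\alpha, x)S_{\eta_*,\chi_0^*}(\alpha, x)e(-\alpha N)\,d\alpha\\ &+ \sum_{\substack{q \le 2r\\ q\ \mathrm{even}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}}S^2_{\eta_+,\chi_0^*}(\alpha, x)S_{\eta_*,\chi_0^*}(\alpha, x)e(-\alpha N)\,d\alpha.\end{aligned} \tag{10.16}$$

The main term in (10.16) is

$$\begin{aligned}&x^3\cdot\sum_{\substack{q \le r\\ q\ \mathrm{odd}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}}(\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\,d\alpha\\ &+ x^3\cdot\sum_{\substack{q \le 2r\\ q\ \mathrm{even}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}}(\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\,d\alpha.\end{aligned} \tag{10.17}$$

We would like to complete both the sum and the integral. Before, we should say that we will want to be able to use smoothing functions $\eta_+$ whose Fourier transforms are not easy to deal with directly. All we want to require is that there be a smoothing function $\eta_\circ$, easier to deal with, such that $\eta_\circ$ be close to $\eta_+$ in $\ell_2$ norm.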

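Assume, then, that

$$|\eta_+ - \eta_\circ|_2 \le \epsilon_0|\eta_\circ|_2,$$

where $\eta_\circ$ is thrice differentiable outside finitely many points and satisfies $\eta_\circ^{(3)} \in L_1$. Then (10.17) equals

$$\begin{aligned}&x^3\cdot\sum_{\substack{q \le r\\ q\ \mathrm{odd}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}}(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\,d\alpha\\ &+ x^3\cdot\sum_{\substack{q \le 2r\\ q\ \mathrm{even}}}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}}(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\,d\alpha\end{aligned} \tag{10.18}$$

plus

$$O^*\left(x^2\cdot\sum_q\frac{\mu(q)^2}{\phi(q)^2}\int_{-\infty}^\infty\left|(\widehat{\eta_+}(-\alpha))^2 - (\widehat{\eta_\circ}(-\alpha))^2\right||\widehat{\eta_*}(-\alpha)|\,d\alpha\right). \tag{10.19}$$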

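Here (10.19) is bounded by $2.82643x^2$ (by (C.9)) times

$$\begin{aligned}&|\widehat{\eta_*}(-\alpha)|_\infty\cdot\sqrt{\int_{-\infty}^\infty|\widehat{\eta_+}(-\alpha) - \widehat{\eta_\circ}(-\alpha)|^2\,d\alpha\cdot\int_{-\infty}^\infty|\widehat{\eta_+}(-\alpha) + \widehat{\eta_\circ}(-\alpha)|^2\,d\alpha}\\ &\quad\le |\eta_*|_1\cdot|\widehat{\eta_+} - \widehat{\eta_\circ}|_2\,|\widehat{\eta_+} + \widehat{\eta_\circ}|_2 = |\eta_*|_1\cdot|\eta_+ - \eta_\circ|_2\,|\eta_+ + \eta_\circ|_2\\ &\quad\le |\eta_*|_1\cdot|\eta_+ - \eta_\circ|_2\left(2|\eta_\circ|_2 + |\eta_+ - \eta_\circ|_2\right) = |\eta_*|_1|\eta_\circ|_2^2\cdot(2 + \epsilon_0)\epsilon_0.\end{aligned}$$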
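The constant 2.82643 bounds the full sum $\sum_q \mu(q)^2/\phi(q)^2$, which has the Euler product $\prod_p\left(1 + 1/(p-1)^2\right)$. A truncated product gives a quick plausibility check. The sketch below (names ours) is a floating-point computation and only a lower approximation, since the factors for primes above the cutoff are dropped; it is not a proof of the bound (C.9).

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

# Truncated Euler product for sum_q mu(q)^2 / phi(q)^2.
product = 1.0
for p in primes_up_to(100000):
    product *= 1 + 1 / (p - 1) ** 2

print(f"truncated product ~ {product:.6f}   (bound used in the text: 2.82643)")
```

The omitted factors change the product only in the sixth decimal place, so the printed value should sit just below 2.82643.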

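Now, (10.18) equals

$$\begin{aligned}&x^3\int_{-\infty}^\infty(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\sum_{\substack{\frac{q}{(q,2)} \le \min\left(\frac{\delta_0 r}{2|\alpha|x}, r\right)\\ \mu(q)^2 = 1}}\frac{\phi((q,N))}{\phi(q)^3}\mu((q,N))\,d\alpha\\ &= x^3\int_{-\infty}^\infty(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\,d\alpha\cdot\left(\sum_{q \ge 1}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\right)\\ &\quad - x^3\int_{-\infty}^\infty(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)e(-\alpha N)\sum_{\substack{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha|x}, r\right)\\ \mu(q)^2 = 1}}\frac{\phi((q,N))}{\phi(q)^3}\mu((q,N))\,d\alpha.\end{aligned} \tag{10.20}$$

The last line in (10.20) is bounded¹ by

$$x^2|\widehat{\eta_*}|_\infty\int_{-\infty}^\infty|\widehat{\eta_\circ}(-\alpha)|^2\sum_{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha|}, r\right)}\frac{\mu(q)^2}{\phi(q)^2}\,d\alpha. \tag{10.21}$$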

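By (2.1) (with $k = 3$), (C.16) and (C.17), this is at most

$$\begin{aligned}&x^2|\eta_*|_1\int_{-\delta_0/2}^{\delta_0/2}|\widehat{\eta_\circ}(-\alpha)|^2\frac{4.31004}{r}\,d\alpha + 2x^2|\eta_*|_1\int_{\delta_0/2}^\infty\left(\frac{|\eta_\circ^{(3)}|_1}{(2\pi\alpha)^3}\right)^2\frac{8.62008|\alpha|}{\delta_0 r}\,d\alpha\\ &\quad\le |\eta_*|_1\left(4.31004|\eta_\circ|_2^2 + 0.00113\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}\right)\frac{x^2}{r}.\end{aligned}$$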

It is easy to see that

$$\sum_{q \ge 1}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N)) = \prod_{p|N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1 + \frac{1}{(p-1)^3}\right).$$
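This Euler product identity follows from multiplicativity: the summand vanishes unless $q$ is square-free, and at a prime $p$ it equals $-1/(p-1)^2$ if $p | N$ and $1/(p-1)^3$ otherwise. A numerical comparison of a partial sum with a truncated product illustrates it. The sketch below uses the sample value $N = 105$ and cutoffs of 5000, both chosen arbitrarily by us.

```python
import math

def mobius_and_phi(limit):
    """Sieve the Moebius function and Euler's phi for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    phi = list(range(limit + 1))
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_composite[m] = True
            mu[m] = -mu[m]
            phi[m] -= phi[m] // p
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu, phi

def singular_series_partial(N, Q, mu, phi):
    # Partial sum over square-free q <= Q of phi((q,N)) mu((q,N)) / phi(q)^3.
    total = 0.0
    for q in range(1, Q + 1):
        if mu[q] == 0:
            continue
        g = math.gcd(q, N)
        total += phi[g] * mu[g] / phi[q] ** 3
    return total

def singular_series_product(N, P, phi):
    # Euler product truncated at primes p <= P.
    product = 1.0
    for p in range(2, P + 1):
        if phi[p] != p - 1:
            continue  # p is not prime
        if N % p == 0:
            product *= 1 - 1 / (p - 1) ** 2
        else:
            product *= 1 + 1 / (p - 1) ** 3
    return product

LIMIT = 5000
mu, phi = mobius_and_phi(LIMIT)
N = 105  # sample odd N = 3 * 5 * 7
partial = singular_series_partial(N, LIMIT, mu, phi)
product = singular_series_product(N, LIMIT, phi)
print(partial, product)
```

For even $N$ the factor at $p = 2$ is $1 - 1 = 0$, so the product vanishes; this is why an odd sample value is used here.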

¹ This is obviously crude, in that we are bounding $\phi((q,N))/\phi(q)$ by 1. We are doing so in order to avoid a potentially harmful dependence on $N$.

Expanding the integral implicit in the definition of $\widehat{f}$,
\[
\int_{-\infty}^{\infty}(\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha=\frac{1}{x}\int_0^{\infty}\int_0^{\infty}\eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\!\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2.
\tag{10.22}
\]

(This is standard. One rigorous way to obtain (10.22) is to approximate the integral over $\alpha\in(-\infty,\infty)$ by an integral with a smooth weight, at different scales; as the scale becomes broader, the Fourier transform of the weight approximates (as a distribution) the $\delta$ function. Apply Plancherel.)

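Hence, (10.17) equals
\[
x^2\cdot\int_0^{\infty}\int_0^{\infty}\eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\!\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2\cdot\prod_{p\mid N}\left(1-\frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right)
\tag{10.23}
\]
(the main term) plus
\[
\left(2.82643|\eta_\circ|_2^2(2+\epsilon_0)\cdot\epsilon_0+\frac{4.31004|\eta_\circ|_2^2+0.00113\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1x^2.
\tag{10.24}
\]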

Here (10.23) is just as in the classical case [IK04, (19.10)], except for the fact that a factor of $1/2$ has been replaced by a double integral. Later, in chapter 11, we will see how to choose our smoothing functions (and $x$, in terms of $N$) so as to make the double integral as large as possible in comparison with the error terms. This is an important optimization. (We already had a first discussion of this in the introduction; see (1.39) and what follows.)

What remains to estimate is the contribution of all the terms of the form $\operatorname{err}_{\eta,\chi}(\delta,x)$ in (10.12) and (10.13). Let us first deal with another matter: bounding the $\ell_2$ norm of $|S_\eta(\alpha,x)|^2$ over the major arcs.

10.3 The $\ell_2$ norm over the major arcs

We can always bound the integral of $|S_\eta(\alpha,x)|^2$ on the whole circle by Plancherel. If we only want the integral on certain arcs, we use the bound in Prop. 12.1.2 (based on work by Ramaré). If these arcs are really the major arcs (that is, the arcs on which we have useful analytic estimates), then we can hope to get better bounds using $L$-functions. This will be useful both to estimate the error terms in this section and to make the use of Ramaré's bounds more efficient later.

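By (10.8),
\[
\begin{aligned}
\sum_{\substack{a \bmod q\\ \gcd(a,q)=1}}\left|S_\eta\!\left(\frac{a}{q}+\frac{\delta}{x},x\right)\right|^2
&=\frac{1}{\phi(q)^2}\sum_{\chi}\sum_{\chi'}\tau(\overline{\chi})\,\overline{\tau(\overline{\chi'})}\Biggl(\sum_{\substack{a \bmod q\\ \gcd(a,q)=1}}\chi(a)\overline{\chi'(a)}\Biggr)\cdot S_{\eta,\chi^*}(\delta/x,x)\,\overline{S_{\eta,\chi'^*}(\delta/x,x)}\\
&\quad+O^*\!\left(2(1+\sqrt{q})(\log x)^2|\eta|_\infty\max_{\alpha}|S_\eta(\alpha,x)|+\left((1+\sqrt{q})(\log x)^2|\eta|_\infty\right)^2\right)\\
&=\frac{1}{\phi(q)}\sum_{\chi}|\tau(\overline{\chi})|^2\,|S_{\eta,\chi^*}(\delta/x,x)|^2+K_{q,1}\bigl(2|S_\eta(0,x)|+K_{q,1}\bigr),
\end{aligned}
\]
where
\[
K_{q,1}=(1+\sqrt{q})(\log x)^2|\eta|_\infty.
\]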

As is well-known (see, e.g., [IK04, Lem. 3.1]),
\[
\tau(\chi)=\mu\!\left(\frac{q}{q^*}\right)\chi^*\!\left(\frac{q}{q^*}\right)\tau(\chi^*),
\]
where $q^*$ is the modulus of $\chi^*$ (i.e., the conductor of $\chi$), and
\[
|\tau(\chi^*)|=\sqrt{q^*}.
\]

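Using the expressions (10.12) and (10.13), we obtain
\[
\begin{aligned}
\sum_{\substack{a \bmod q\\ (a,q)=1}}\left|S_\eta\!\left(\frac{a}{q}+\frac{\delta}{x},x\right)\right|^2
&=\frac{\mu^2(q)}{\phi(q)}\left|\widehat{\eta}(-\delta)x+O^*\!\left(\operatorname{err}_{\eta,\chi_T}(\delta,x)\cdot x\right)\right|^2\\
&\quad+\frac{1}{\phi(q)}\sum_{\chi\ne\chi_T}\mu^2\!\left(\frac{q}{q^*}\right)q^*\cdot O^*\!\left(|\operatorname{err}_{\eta,\chi}(\delta,x)|^2x^2\right)+K_{q,1}\bigl(2|S_\eta(0,x)|+K_{q,1}\bigr)\\
&=\frac{\mu^2(q)x^2}{\phi(q)}\left(|\widehat{\eta}(-\delta)|^2+O^*\!\left(\left|\operatorname{err}_{\eta,\chi_T}(\delta,x)\bigl(2|\eta|_1+\operatorname{err}_{\eta,\chi_T}(\delta,x)\bigr)\right|\right)\right)\\
&\quad+O^*\!\left(\max_{\chi\ne\chi_T}q^*|\operatorname{err}_{\eta,\chi^*}(\delta,x)|^2x^2+K_{q,2}\,x\right),
\end{aligned}
\]
where $K_{q,2}=K_{q,1}\bigl(2|S_\eta(0,x)|/x+K_{q,1}/x\bigr)$.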

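Thus, the integral of $|S_\eta(\alpha,x)|^2$ over $\mathfrak{M}$ (see (10.5)) is
\[
\begin{aligned}
&\sum_{\substack{q\le r\\ q\text{ odd}}}\sum_{\substack{a \bmod q\\ (a,q)=1}}\int_{\frac{a}{q}-\frac{\delta_0r}{2qx}}^{\frac{a}{q}+\frac{\delta_0r}{2qx}}|S_\eta(\alpha,x)|^2\,d\alpha+\sum_{\substack{q\le 2r\\ q\text{ even}}}\sum_{\substack{a \bmod q\\ (a,q)=1}}\int_{\frac{a}{q}-\frac{\delta_0r}{qx}}^{\frac{a}{q}+\frac{\delta_0r}{qx}}|S_\eta(\alpha,x)|^2\,d\alpha\\
&\quad=\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)x^2}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{\mu^2(q)x^2}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha\\
&\qquad+O^*\!\left(\sum_{q}\frac{\mu^2(q)x^2}{\phi(q)}\cdot\frac{\gcd(q,2)\,\delta_0r}{qx}\left(ET_{\eta,\frac{\delta_0r}{2}}\left(2|\eta|_1+ET_{\eta,\frac{\delta_0r}{2}}\right)\right)\right)\\
&\qquad+\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\delta_0rx}{q}\cdot O^*\!\left(\max_{\substack{\chi \bmod q\\ \chi\ne\chi_T\\ |\delta|\le\delta_0r/2q}}q^*|\operatorname{err}_{\eta,\chi^*}(\delta,x)|^2+\frac{K_{q,2}}{x}\right)\\
&\qquad+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{2\delta_0rx}{q}\cdot O^*\!\left(\max_{\substack{\chi \bmod q\\ \chi\ne\chi_T\\ |\delta|\le\delta_0r/q}}q^*|\operatorname{err}_{\eta,\chi^*}(\delta,x)|^2+\frac{K_{q,2}}{x}\right),
\end{aligned}
\tag{10.25}
\]
where
\[
ET_{\eta,s}=\max_{|\delta|\le s}|\operatorname{err}_{\eta,\chi_T}(\delta,x)|
\]
and $\chi_T$ is the trivial character. If all we want is an upper bound, we can simply remark that
\[
\begin{aligned}
&x\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha+x\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha\\
&\quad\le\Biggl(\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{\mu^2(q)}{\phi(q)}\Biggr)|\eta|_2^2=2|\eta|_2^2\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}.
\end{aligned}
\]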

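If we also need a lower bound, we proceed as follows.

Again, we will work with an approximation $\eta_\circ$ such that (a) $|\eta-\eta_\circ|_2$ is small, (b) $\eta_\circ$ is thrice differentiable outside finitely many points, (c) $\eta_\circ^{(3)}\in L_1$. Clearly,
\[
\begin{aligned}
&x\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha\\
&\quad\le\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\left(\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}}|\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha+2\langle|\widehat{\eta_\circ}|,|\widehat{\eta}-\widehat{\eta_\circ}|\rangle+|\widehat{\eta}-\widehat{\eta_\circ}|_2^2\right)\\
&\quad=\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}}|\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha+O^*\!\left(\frac{1}{2}\log r+0.85\right)\left(2|\eta_\circ|_2|\eta-\eta_\circ|_2+|\eta_\circ-\eta|_2^2\right),
\end{aligned}
\]
where we are using (C.11) and isometry. Also,
\[
\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha=\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}|\widehat{\eta}(-\alpha x)|^2\,d\alpha.
\]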

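By (2.1) and Plancherel,
\[
\begin{aligned}
\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}}|\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha
&=\int_{-\infty}^{\infty}|\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha-O^*\!\left(2\int_{\frac{\delta_0r}{2q}}^{\infty}\frac{|\eta_\circ^{(3)}|_1^2}{(2\pi\alpha)^6}\,d\alpha\right)\\
&=|\eta_\circ|_2^2+O^*\!\left(\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5}\right).
\end{aligned}
\]
Hence
\[
\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}}|\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha=|\eta_\circ|_2^2\cdot\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}+O^*\!\Biggl(\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\,\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5}\Biggr).
\]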

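Using (C.18), we get that
\[
\begin{aligned}
\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}\,\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5}
&\le\frac{1}{r}\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)\,q}{\phi(q)}\cdot\frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\\
&\le\frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\cdot\left(0.64787+\frac{\log r}{4r}+\frac{0.425}{r}\right).
\end{aligned}
\]
Going back to (10.25), we use (C.7) to bound
\[
\sum_{q}\frac{\mu^2(q)x^2}{\phi(q)}\,\frac{\gcd(q,2)\,\delta_0r}{qx}\le 2.59147\cdot\delta_0rx.
\]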
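The two numerical constants quoted from the appendix, 2.59147 (from (C.7)) and 2.82643 (from (C.9)), are values of absolutely convergent Euler products, so they can be approximated by truncating the products. The sketch below is a plausibility check in floating point, not a rigorous computation; the truncation point and function names are our own assumptions. Since every local factor exceeds 1, the truncated products approach the limits from below.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def partial_constants(limit=10 ** 5):
    """Truncated Euler products for the two sums over squarefree q.

    weighted ~ sum_q mu^2(q) gcd(q,2) / (q phi(q))
    phi_sq   ~ sum_q mu^2(q) / phi(q)^2
    """
    weighted = 2.0   # local factor at p = 2: 1 + gcd(2,2) / (2 * phi(2)) = 2
    phi_sq = 1.0
    for p in primes_up_to(limit):
        phi_sq *= 1.0 + 1.0 / (p - 1) ** 2
        if p > 2:
            weighted *= 1.0 + 1.0 / (p * (p - 1))
    return weighted, phi_sq


weighted, phi_sq = partial_constants()
print(weighted, phi_sq)
```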

We also note that
\[
\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{1}{q}+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{2}{q}=\sum_{q\le r}\frac{1}{q}-\sum_{q\le\frac{r}{2}}\frac{1}{2q}+\sum_{q\le r}\frac{1}{q}\le 2\log er-\log\frac{r}{2}\le\log 2e^2r.
\]

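We have proven the following result.

Lemma 10.3.1. Let $\eta:[0,\infty)\to\mathbb{R}$ be in $L_1\cap L_\infty$. Let $S_\eta(\alpha,x)$ be as in (10.1) and let $\mathfrak{M}=\mathfrak{M}_{\delta_0,r}$ be as in (10.5). Let $\eta_\circ:[0,\infty)\to\mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)}\in L_1$. Assume $r\ge182$. Then
\[
\int_{\mathfrak{M}}|S_\eta(\alpha,x)|^2\,d\alpha=L_{r,\delta_0}x+O^*\!\left(5.19\,\delta_0xr\left(ET_{\eta,\frac{\delta_0r}{2}}\cdot\left(|\eta|_1+\frac{ET_{\eta,\frac{\delta_0r}{2}}}{2}\right)\right)\right)+O^*\!\left(\delta_0r(\log 2e^2r)\left(x\cdot E^2_{\eta,r,\delta_0}+K_{r,2}\right)\right),
\tag{10.26}
\]
where
\[
\begin{aligned}
E_{\eta,r,\delta_0}&=\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/2q}}\sqrt{q^*}\,|\operatorname{err}_{\eta,\chi^*}(\delta,x)|,\qquad ET_{\eta,s}=\max_{|\delta|\le s}|\operatorname{err}_{\eta,\chi_T}(\delta,x)|,\\
K_{r,2}&=(1+\sqrt{2r})(\log x)^2|\eta|_\infty\left(2|S_\eta(0,x)|/x+(1+\sqrt{2r})(\log x)^2|\eta|_\infty/x\right),
\end{aligned}
\tag{10.27}
\]
and $L_{r,\delta_0}$ satisfies both
\[
L_{r,\delta_0}\le2|\eta|_2^2\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}
\tag{10.28}
\]
and
\[
\begin{aligned}
L_{r,\delta_0}&=2|\eta_\circ|_2^2\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)}+O^*(\log r+1.7)\cdot\left(2|\eta_\circ|_2|\eta_\circ-\eta|_2+|\eta_\circ-\eta|_2^2\right)\\
&\quad+O^*\!\left(\frac{2|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\right)\cdot\left(0.64787+\frac{\log r}{4r}+\frac{0.425}{r}\right).
\end{aligned}
\tag{10.29}
\]
Here, as elsewhere, $\chi^*$ denotes the primitive character inducing $\chi$, whereas $q^*$ denotes the modulus of $\chi^*$.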

The error term $xr\,ET_{\eta,\delta_0r}$ will be very small, since it will be estimated using the Riemann zeta function; the error term involving $K_{r,2}$ will be completely negligible. As for the term involving $xr(r+1)E^2_{\eta,r,\delta_0}$, we see that it constrains us to have $|\operatorname{err}_{\eta,\chi}(x,N)|$ less than a constant times $1/r$ if we do not want the main term in the bound (10.26) to be overwhelmed.

10.4 The integral over the major arcs: conclusion

There are at least two ways we can evaluate (10.4). One is to substitute (10.10) into (10.4). The disadvantages here are that (a) this can give rise to pages-long formulae, (b) this gives error terms proportional to $xr|\operatorname{err}_{\eta,\chi}(x,N)|$, meaning that, to win, we would have to show that $|\operatorname{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/r$. What we will do instead is to use our $\ell_2$ estimate (10.26) in order to bound the contribution of non-principal terms. This will give us a gain of almost $\sqrt{r}$ on the error terms; in other words, to win, it will be enough to show later that $|\operatorname{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/\sqrt{r}$.

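The contribution of the error terms in $S_{\eta_3}(\alpha,x)$ (that is, all terms involving the quantities $\operatorname{err}_{\eta,\chi}$ in expressions (10.12) and (10.13)) to (10.4) is
\[
\begin{aligned}
&\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{1}{\phi(q)}\sum_{\chi_3 \bmod q}\tau(\overline{\chi_3})\sum_{\substack{a \bmod q\\ (a,q)=1}}\chi_3(a)\,e(-Na/q)\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}S_{\eta_+}(\alpha+a/q,x)^2\operatorname{err}_{\eta_*,\chi_3^*}(\alpha x,x)\,e(-N\alpha)\,d\alpha\\
&+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{1}{\phi(q)}\sum_{\chi_3 \bmod q}\tau(\overline{\chi_3})\sum_{\substack{a \bmod q\\ (a,q)=1}}\chi_3(a)\,e(-Na/q)\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}S_{\eta_+}(\alpha+a/q,x)^2\operatorname{err}_{\eta_*,\chi_3^*}(\alpha x,x)\,e(-N\alpha)\,d\alpha.
\end{aligned}
\tag{10.30}
\]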

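We should also remember the terms in (10.11); we can integrate them over all of $\mathbb{R}/\mathbb{Z}$, and obtain that they contribute at most
\[
\begin{aligned}
&\int_{\mathbb{R}/\mathbb{Z}}2\sum_{j=1}^{3}\prod_{j'\ne j}|S_{\eta_{j'}}(\alpha,x)|\cdot\max_{q\le r}\sum_{p\mid q}\log p\sum_{\alpha\ge1}\eta_j\!\left(\frac{p^\alpha}{x}\right)d\alpha\\
&\quad\le2\sum_{j=1}^{3}\prod_{j'\ne j}|S_{\eta_{j'}}(\alpha,x)|_2\cdot\max_{q\le r}\sum_{p\mid q}\log p\sum_{\alpha\ge1}\eta_j\!\left(\frac{p^\alpha}{x}\right)\\
&\quad=2\sum_{n}\Lambda^2(n)\eta_+^2(n/x)\cdot\log r\cdot\max_{p\le r}\sum_{\alpha\ge1}\eta_*\!\left(\frac{p^\alpha}{x}\right)\\
&\qquad+4\sqrt{\sum_{n}\Lambda^2(n)\eta_+^2(n/x)\cdot\sum_{n}\Lambda^2(n)\eta_*^2(n/x)}\cdot\log r\cdot\max_{p\le r}\sum_{\alpha\ge1}\eta_+\!\left(\frac{p^\alpha}{x}\right)
\end{aligned}
\]
by Cauchy-Schwarz and Plancherel. (Here $\eta_1=\eta_2=\eta_+$ and $\eta_3=\eta_*$, so the term $j=3$ carries the weight $\eta_*$ and the terms $j=1,2$ carry the weight $\eta_+$, in agreement with the last line of (10.36).)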

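The absolute value of (10.30) is at most
\[
\begin{aligned}
&\sum_{\substack{q\le r\\ q\text{ odd}}}\sum_{\substack{a \bmod q\\ (a,q)=1}}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}\left|S_{\eta_+}(\alpha+a/q,x)\right|^2d\alpha\cdot\max_{\substack{\chi \bmod q\\ |\delta|\le\delta_0r/2q}}\sqrt{q^*}\,|\operatorname{err}_{\eta_*,\chi^*}(\delta,x)|\\
&+\sum_{\substack{q\le 2r\\ q\text{ even}}}\sum_{\substack{a \bmod q\\ (a,q)=1}}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}\left|S_{\eta_+}(\alpha+a/q,x)\right|^2d\alpha\cdot\max_{\substack{\chi \bmod q\\ |\delta|\le\delta_0r/q}}\sqrt{q^*}\,|\operatorname{err}_{\eta_*,\chi^*}(\delta,x)|\\
&\quad\le\int_{\mathfrak{M}_{\delta_0,r}}\left|S_{\eta_+}(\alpha)\right|^2d\alpha\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/q}}\sqrt{q^*}\,|\operatorname{err}_{\eta_*,\chi^*}(\delta,x)|.
\end{aligned}
\tag{10.31}
\]
We can bound the integral of $|S_{\eta_+}(\alpha)|^2$ by (10.26).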

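What about the contribution of the error part of $S_{\eta_2}(\alpha,x)$? We can obviously proceed in the same way, except that, to avoid double-counting, $S_{\eta_3}(\alpha,x)$ needs to be replaced by
\[
\frac{1}{\phi(q)}\,\tau(\chi_0)\,\widehat{\eta_3}(-\delta)\cdot x=\frac{\mu(q)}{\phi(q)}\,\widehat{\eta_3}(-\delta)\cdot x,
\tag{10.32}
\]
which is its main term (coming from (10.12)). Instead of having an $\ell_2$ norm as in (10.31), we have the square-root of a product of two squares of $\ell_2$ norms (by Cauchy-Schwarz), namely, $\int_{\mathfrak{M}}|S^*_{\eta_+}(\alpha)|^2\,d\alpha$ and
\[
\begin{aligned}
&\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)^2}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}}|\widehat{\eta_*}(-\alpha x)x|^2\,d\alpha+\sum_{\substack{q\le 2r\\ q\text{ even}}}\frac{\mu^2(q)}{\phi(q)^2}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}}|\widehat{\eta_*}(-\alpha x)x|^2\,d\alpha\\
&\quad\le x|\widehat{\eta_*}|_2^2\cdot\sum_{q}\frac{\mu^2(q)}{\phi(q)^2}.
\end{aligned}
\tag{10.33}
\]
By (C.9), the sum over $q$ is at most 2.82643.

As for the contribution of the error part of $S_{\eta_1}(\alpha,x)$, we bound it in the same way, using solely the $\ell_2$ norm in (10.33) (and replacing both $S_{\eta_2}(\alpha,x)$ and $S_{\eta_3}(\alpha,x)$ by expressions as in (10.32)).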

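The total of the error terms is thus
\[
\begin{aligned}
&x\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/q}}\sqrt{q^*}\cdot|\operatorname{err}_{\eta_*,\chi^*}(\delta,x)|\cdot A\\
&+x\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/q}}\sqrt{q^*}\cdot|\operatorname{err}_{\eta_+,\chi^*}(\delta,x)|\left(\sqrt{A}+\sqrt{B_+}\right)\sqrt{B_*},
\end{aligned}
\tag{10.34}
\]
where $A=(1/x)\int_{\mathfrak{M}}|S_{\eta_+}(\alpha,x)|^2\,d\alpha$ (bounded as in (10.26)) and
\[
B_*=2.82643|\eta_*|_2^2,\qquad B_+=2.82643|\eta_+|_2^2.
\tag{10.35}
\]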

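In conclusion, we have proven:

Proposition 10.4.1. Let $x\ge1$. Let $\eta_+,\eta_*:[0,\infty)\to\mathbb{R}$. Assume $\eta_+\in C^2$, $\eta_+''\in L_2$ and $\eta_+,\eta_*\in L_1\cap L_2$. Let $\eta_\circ:[0,\infty)\to\mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)}\in L_1$ and $|\eta_+-\eta_\circ|_2\le\epsilon_0|\eta_\circ|_2$, where $\epsilon_0\ge0$.

Let $S_\eta(\alpha,x)=\sum_n\Lambda(n)e(\alpha n)\eta(n/x)$. Let $\operatorname{err}_{\eta,\chi}$, $\chi$ primitive, be given as in (10.12) and (10.13). Let $\delta_0>0$, $r\ge1$. Let $\mathfrak{M}=\mathfrak{M}_{\delta_0,r}$ be as in (10.5).

Then, for any $N\ge0$,
\[
\int_{\mathfrak{M}}S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha
\]
equals
\[
\begin{aligned}
&C_0C_{\eta_\circ,\eta_*}x^2+\left(2.82643|\eta_\circ|_2^2(2+\epsilon_0)\cdot\epsilon_0+\frac{4.31004|\eta_\circ|_2^2+0.0012\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1x^2\\
&+O^*\!\left(E_{\eta_*,r,\delta_0}A_{\eta_+}+E_{\eta_+,r,\delta_0}\cdot1.6812\left(\sqrt{A_{\eta_+}}+1.6812|\eta_+|_2\right)|\eta_*|_2\right)\cdot x^2\\
&+O^*\!\left(2Z_{\eta_+^2,2}(x)\,LS_{\eta_*}(x,r)\cdot x+4\sqrt{Z_{\eta_+^2,2}(x)\,Z_{\eta_*^2,2}(x)}\,LS_{\eta_+}(x,r)\cdot x\right),
\end{aligned}
\tag{10.36}
\]
where
\[
\begin{aligned}
C_0&=\prod_{p\mid N}\left(1-\frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right),\\
C_{\eta_\circ,\eta_*}&=\int_0^{\infty}\int_0^{\infty}\eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\!\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2,
\end{aligned}
\tag{10.37}
\]
\[
\begin{aligned}
E_{\eta,r,\delta_0}&=\max_{\substack{\chi \bmod q\\ q\le\gcd(q,2)\cdot r\\ |\delta|\le\gcd(q,2)\delta_0r/2q}}\sqrt{q^*}\cdot|\operatorname{err}_{\eta,\chi^*}(\delta,x)|,\qquad ET_{\eta,s}=\max_{|\delta|\le s/q}|\operatorname{err}_{\eta,\chi_T}(\delta,x)|,\\
A_\eta&=\frac{1}{x}\int_{\mathfrak{M}}\left|S_{\eta_+}(\alpha,x)\right|^2d\alpha,\qquad L_{\eta,r,\delta_0}\le2|\eta|_2^2\sum_{\substack{q\le r\\ q\text{ odd}}}\frac{\mu^2(q)}{\phi(q)},\\
K_{r,2}&=(1+\sqrt{2r})(\log x)^2|\eta|_\infty\left(2Z_{\eta,1}(x)/x+(1+\sqrt{2r})(\log x)^2|\eta|_\infty/x\right),\\
Z_{\eta,k}(x)&=\frac{1}{x}\sum_n\Lambda^k(n)\eta(n/x),\qquad LS_\eta(x,r)=\log r\cdot\max_{p\le r}\sum_{\alpha\ge1}\eta\!\left(\frac{p^\alpha}{x}\right),
\end{aligned}
\tag{10.38}
\]
and $\operatorname{err}_{\eta,\chi}$ is as in (10.12) and (10.13).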

Here is how to read these expressions. The error term in the first line of (10.36) will be small provided that $\epsilon_0$ is small and $r$ is large. The third line of (10.36) will be negligible, as will be the term $2\delta_0r(\log er)K_{r,2}$ in the definition of $A_\eta$. (Clearly, $Z_{\eta,k}(x)\ll_\eta(\log x)^{k-1}$ and $LS_\eta(x,q)\ll_\eta\tau(q)\log x$ for any $\eta$ of rapid decay.)

It remains to estimate the second line of (10.36). This includes estimating $A_\eta$, a task that was already accomplished in Lemma 10.3.1. We see that we will have to give very good bounds for $E_{\eta,r,\delta_0}$ when $\eta=\eta_+$ or $\eta=\eta_*$. We also see that we want to make $C_0C_{\eta_+,\eta_*}x^2$ as large as possible; it will be competing not just with the error terms here, but, more importantly, with the bounds from the minor arcs, which will be proportional to $|\eta_+|_2^2|\eta_*|_1$.

Chapter 11

Optimizing and adapting smoothing functions

One of our goals is to maximize the quantity $C_{\eta_\circ,\eta_*}$ in (10.37) relative to $|\eta_\circ|_2^2|\eta_*|_1$. One way to do this is to ensure that (a) $\eta_*$ is concentrated on a very short¹ interval $[0,\epsilon)$, (b) $\eta_\circ$ is supported on the interval $[0,2]$, and is symmetric around $t=1$, meaning that $\eta_\circ(t)\sim\eta_\circ(2-t)$. Then, for $x\sim N/2$, the integral
\[
\int_0^{\infty}\int_0^{\infty}\eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\!\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2
\]
in (10.37) should be approximately equal to
\[
|\eta_*|_1\cdot\int_0^{\infty}\eta_\circ(t)\,\eta_\circ\!\left(\frac{N}{x}-t\right)dt=|\eta_*|_1\cdot\int_0^{\infty}\eta_\circ(t)^2\,dt=|\eta_*|_1\cdot|\eta_\circ|_2^2,
\tag{11.1}
\]
provided that $\eta_\circ(t)\ge0$ for all $t$. It is easy to check (using Cauchy-Schwarz in the second step) that this is essentially optimal. (We will redo this rigorously in a little while.)

At the same time, the fact is that major-arc estimates are best for smoothing functions $\eta$ of a particular form, and we have minor-arc estimates from Part I for a different specific smoothing $\eta_2$. The issue, then, is how do we choose $\eta_\circ$ and $\eta_*$ as above so that

• $\eta_*$ is concentrated on $[0,\epsilon)$,

• $\eta_\circ$ is supported on $[0,2]$ and symmetric around $t=1$,

• we can give minor-arc and major-arc estimates for $\eta_*$,

• we can give major-arc estimates for a function $\eta_+$ close to $\eta_\circ$ in $\ell_2$ norm?

¹ This is an idea appearing in work by Bourgain in a related context [Bou99].

11.1 The symmetric smoothing function $\eta_\circ$

We will later work with a smoothing function $\eta_\heartsuit$ whose Mellin transform decreases very rapidly. Because of this rapid decay, we will be able to give strong results based on an explicit formula for $\eta_\heartsuit$. The issue is how to define $\eta_\circ$, given $\eta_\heartsuit$, so that $\eta_\circ$ is symmetric around $t=1$ (i.e., $\eta_\circ(2-x)\sim\eta_\circ(x)$) and is very small for $x>2$.

We will later set $\eta_\heartsuit(t)=e^{-t^2/2}$. Let
\[
h:t\mapsto\begin{cases}t^3(2-t)^3e^{t-1/2}&\text{if }t\in[0,2],\\ 0&\text{otherwise.}\end{cases}
\tag{11.2}
\]
We define $\eta_\circ:\mathbb{R}\to\mathbb{R}$ by
\[
\eta_\circ(t)=h(t)\eta_\heartsuit(t)=\begin{cases}t^3(2-t)^3e^{-(t-1)^2/2}&\text{if }t\in[0,2],\\ 0&\text{otherwise.}\end{cases}
\tag{11.3}
\]
It is clear that $\eta_\circ$ is symmetric around $t=1$ for $t\in[0,2]$.

11.1.1 The product $\eta_\circ(t)\eta_\circ(\rho-t)$

We now should go back and redo rigorously what we discussed informally around (11.1). More precisely, we wish to estimate
\[
(\eta_\circ*\eta_\circ)(\rho)=\int_{-\infty}^{\infty}\eta_\circ(t)\eta_\circ(\rho-t)\,dt=\int_{-\infty}^{\infty}\eta_\circ(t)\eta_\circ(2-\rho+t)\,dt
\tag{11.4}
\]
for $\rho\le2$ close to 2. In this, it will be useful that the Cauchy-Schwarz inequality degrades slowly, in the following sense.

Lemma 11.1.1. Let $V$ be a real vector space with an inner product $\langle\cdot,\cdot\rangle$. Then, for any $v,w\in V$ with $|w-v|_2\le|v|_2/2$,
\[
\langle v,w\rangle=|v|_2|w|_2+O^*(2.71|v-w|_2^2).
\]

Proof. By a truncated Taylor expansion,
\[
\sqrt{1+x}=1+\frac{x}{2}+\frac{x^2}{2}\max_{0\le t\le1}\frac{1}{4(1-(tx)^2)^{3/2}}=1+\frac{x}{2}+O^*\!\left(\frac{x^2}{2^{3/2}}\right)
\]
for $|x|\le1/2$. Hence, for $\delta=|w-v|_2/|v|_2$,
\[
\begin{aligned}
\frac{|w|_2}{|v|_2}&=\sqrt{1+\frac{2\langle w-v,v\rangle+|w-v|_2^2}{|v|_2^2}}=1+\frac{2\frac{\langle w-v,v\rangle}{|v|_2^2}+\delta^2}{2}+O^*\!\left(\frac{(2\delta+\delta^2)^2}{2^{3/2}}\right)\\
&=1+\frac{\langle w-v,v\rangle}{|v|_2^2}+O^*\!\left(\left(\frac{1}{2}+\frac{(5/2)^2}{2^{3/2}}\right)\delta^2\right)=1+\frac{\langle w-v,v\rangle}{|v|_2^2}+O^*\!\left(2.71\,\frac{|w-v|_2^2}{|v|_2^2}\right).
\end{aligned}
\]
Multiplying by $|v|_2^2$, we obtain that
\[
|v|_2|w|_2=|v|_2^2+\langle w-v,v\rangle+O^*\!\left(2.71|w-v|_2^2\right)=\langle v,w\rangle+O^*\!\left(2.71|w-v|_2^2\right).
\]
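The statement of Lemma 11.1.1 can be probed numerically in $\mathbb{R}^n$ with the standard inner product. The sketch below is only an illustration: the dimension, the number of trials, the seed and the function names are our own choices. It samples pairs $(v,w)$ with $|w-v|_2\le|v|_2/2$ and records the largest observed ratio $\bigl||v|_2|w|_2-\langle v,w\rangle\bigr|/|v-w|_2^2$, which the lemma bounds by 2.71.

```python
import math
import random


def cauchy_schwarz_defect(v, w):
    """Return (|v||w| - <v,w>, |v - w|^2) for two real vectors given as lists."""
    inner = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    dist_sq = sum((a - b) ** 2 for a, b in zip(v, w))
    return norm_v * norm_w - inner, dist_sq


def worst_sampled_ratio(trials=1000, dim=5, seed=0):
    """Largest sampled value of |defect| / |v - w|^2 under the lemma's hypothesis."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        step = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_v = math.sqrt(sum(a * a for a in v))
        norm_step = math.sqrt(sum(s * s for s in step))
        # Perturbation of relative size between 5% and 50%, so |w - v| <= |v| / 2.
        scale = (0.05 + 0.45 * rng.random()) * norm_v / norm_step
        w = [a + scale * s for a, s in zip(v, step)]
        defect, dist_sq = cauchy_schwarz_defect(v, w)
        worst = max(worst, abs(defect) / dist_sq)
    return worst


print(worst_sampled_ratio())
```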

Applying Lemma 11.1.1 to (11.4), we obtain that
\[
\begin{aligned}
(\eta_\circ*\eta_\circ)(\rho)&=\int_{-\infty}^{\infty}\eta_\circ(t)\eta_\circ((2-\rho)+t)\,dt\\
&=\sqrt{\int_{-\infty}^{\infty}|\eta_\circ(t)|^2\,dt}\,\sqrt{\int_{-\infty}^{\infty}|\eta_\circ((2-\rho)+t)|^2\,dt}+O^*\!\left(2.71\int_{-\infty}^{\infty}\left|\eta_\circ(t)-\eta_\circ((2-\rho)+t)\right|^2dt\right)\\
&=|\eta_\circ|_2^2+O^*\!\left(2.71\int_{-\infty}^{\infty}\left(\int_0^{2-\rho}|\eta_\circ'(r+t)|\,dr\right)^2dt\right)\\
&=|\eta_\circ|_2^2+O^*\!\left(2.71(2-\rho)\int_0^{2-\rho}\int_{-\infty}^{\infty}|\eta_\circ'(r+t)|^2\,dt\,dr\right)=|\eta_\circ|_2^2+O^*\!\left(2.71(2-\rho)^2|\eta_\circ'|_2^2\right).
\end{aligned}
\tag{11.5}
\]
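Estimate (11.5) can be illustrated numerically for the concrete function of (11.3). The following is a sketch under stated assumptions, not a rigorous verification: it uses a composite Simpson rule in floating point, a hand-computed derivative of $\eta_\circ$, and function names of our own choosing. It compares $|\eta_\circ|_2^2-(\eta_\circ*\eta_\circ)(\rho)$ with the bound $2.71(2-\rho)^2|\eta_\circ'|_2^2$ for a given $\rho\in(0,2]$.

```python
import math


def eta_circ(t):
    """The function of (11.3): t^3 (2 - t)^3 exp(-(t - 1)^2 / 2) on [0, 2]."""
    if t < 0.0 or t > 2.0:
        return 0.0
    return t ** 3 * (2.0 - t) ** 3 * math.exp(-((t - 1.0) ** 2) / 2.0)


def eta_circ_prime(t):
    """Derivative of eta_circ on [0, 2], obtained by the product rule."""
    if t < 0.0 or t > 2.0:
        return 0.0
    poly = t ** 3 * (2.0 - t) ** 3
    dpoly = 3.0 * t ** 2 * (2.0 - t) ** 3 - 3.0 * t ** 3 * (2.0 - t) ** 2
    return (dpoly - (t - 1.0) * poly) * math.exp(-((t - 1.0) ** 2) / 2.0)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def check_11_5(rho):
    """Return (convolution at rho, |eta|_2^2, error bound of (11.5)) for 0 < rho <= 2."""
    norm_sq = simpson(lambda t: eta_circ(t) ** 2, 0.0, 2.0)
    deriv_norm_sq = simpson(lambda t: eta_circ_prime(t) ** 2, 0.0, 2.0)
    # For rho <= 2 the integrand eta(t) eta(rho - t) is supported on [0, rho].
    conv = simpson(lambda t: eta_circ(t) * eta_circ(rho - t), 0.0, rho)
    return conv, norm_sq, 2.71 * (2.0 - rho) ** 2 * deriv_norm_sq


conv, norm_sq, bound = check_11_5(1.9)
print(norm_sq - conv, bound)
```

At $\rho=2$ the symmetry $\eta_\circ(2-t)=\eta_\circ(t)$ makes the convolution equal to $|\eta_\circ|_2^2$, which is the informal claim (11.1).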

We will be working with $\eta_*$ supported on the non-negative reals; we recall that $\eta_\circ$ is supported on $[0,2]$. Hence
\[
\begin{aligned}
\int_0^{\infty}\int_0^{\infty}\eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\!\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2
&=\int_0^{N/x}(\eta_\circ*\eta_\circ)(\rho)\,\eta_*\!\left(\frac{N}{x}-\rho\right)d\rho\\
&=\int_0^{N/x}\left(|\eta_\circ|_2^2+O^*\!\left(2.71(2-\rho)^2|\eta_\circ'|_2^2\right)\right)\cdot\eta_*\!\left(\frac{N}{x}-\rho\right)d\rho\\
&=|\eta_\circ|_2^2\int_0^{N/x}\eta_*(\rho)\,d\rho+2.71|\eta_\circ'|_2^2\cdot O^*\!\left(\int_0^{N/x}\left((2-N/x)+\rho\right)^2\eta_*(\rho)\,d\rho\right),
\end{aligned}
\tag{11.6}
\]
provided that $N/x\ge2$. We see that it will be wise to set $N/x$ very slightly larger than 2. As we said before, $\eta_*$ will be scaled so that it is concentrated on a small interval $[0,\epsilon)$.

11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

Here the challenge is to define a smoothing function $\eta_*$ that is good both for minor-arc estimates and for major-arc estimates. The two regimes tend to favor different kinds of smoothing function. For minor-arc estimates, we use, as [Tao14] did,
\[
\eta_2(t)=4\max(\log2-|\log2t|,0)=\left((2I_{[1/2,1]})*_M(2I_{[1/2,1]})\right)(t),
\tag{11.7}
\]
where $I_{[1/2,1]}(t)$ is 1 if $t\in[1/2,1]$ and 0 otherwise. For major-arc estimates, we will use a function based on
\[
\eta_\heartsuit=e^{-t^2/2}.
\]
(We will actually use here the function $t^2e^{-t^2/2}$, whose Mellin transform is $M\eta_\heartsuit(s+2)$ (by, e.g., [BBO10, Table 11.1]).)

We will follow the simple expedient of convolving the two smoothing functions, one good for minor arcs, the other one for major arcs. In general, let $\varphi_1,\varphi_2:[0,\infty)\to\mathbb{C}$. It is easy to use bounds on sums of the form
\[
S_{f,\varphi_1}(x)=\sum_nf(n)\varphi_1(n/x)
\tag{11.8}
\]
to bound sums of the form $S_{f,\varphi_1*_M\varphi_2}$:
\[
\begin{aligned}
S_{f,\varphi_1*_M\varphi_2}&=\sum_nf(n)(\varphi_1*_M\varphi_2)\!\left(\frac{n}{x}\right)\\
&=\int_0^{\infty}\sum_nf(n)\varphi_1\!\left(\frac{n}{wx}\right)\varphi_2(w)\,\frac{dw}{w}=\int_0^{\infty}S_{f,\varphi_1}(wx)\varphi_2(w)\,\frac{dw}{w}.
\end{aligned}
\tag{11.9}
\]
The same holds, of course, if $\varphi_1$ and $\varphi_2$ are switched, since $\varphi_1*_M\varphi_2=\varphi_2*_M\varphi_1$. The only objection is that the bounds on (11.8) that we input might not be valid, or non-trivial, when the argument $wx$ of $S_{f,\varphi_1}(wx)$ is very small. Because of this, it is important that the functions $\varphi_1,\varphi_2$ vanish at 0, and desirable that their first derivatives do so as well.
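The closed form in (11.7) can be compared against a direct numerical evaluation of the multiplicative convolution $(f*_Mg)(t)=\int_0^\infty f(t/w)g(w)\,dw/w$, the same convention as in (11.9). The sketch below is illustrative only: it uses a plain midpoint rule (so agreement is only up to the discretization error caused by the jumps of the indicator function), and the function names and grid size are our own assumptions.

```python
import math


def eta1(t):
    """eta_1 = 2 * indicator function of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2_closed_form(t):
    """Closed form of (11.7): 4 max(log 2 - |log 2t|, 0)."""
    if t <= 0.0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def mellin_convolution(f, g, t, lo=0.5, hi=1.0, n=20000):
    """Midpoint approximation of int f(t/w) g(w) dw/w over w in [lo, hi].

    The default range [1/2, 1] is the support of eta1, which is all we need here.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        w = lo + (i + 0.5) * h
        total += f(t / w) * g(w) / w
    return total * h


for t in (0.3, 0.5, 0.8, 1.2):
    print(t, mellin_convolution(eta1, eta1, t), eta2_closed_form(t))
```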

Let us see how this works out in practice for $\varphi_1=\eta_2$. Here $\eta_2:[0,\infty)\to\mathbb{R}$ is given by
\[
\eta_2=\eta_1*_M\eta_1=4\max(\log2-|\log2t|,0),
\tag{11.10}
\]
where $\eta_1=2\cdot I_{[1/2,1]}$.

Let us restate the bounds from Theorem 3.1.1, the main result of Part I. We will use Lemma C.2.2 to bound terms of the form $q/\phi(q)$.

Let $x\ge x_0$, $x_0=2.16\cdot10^{20}$. Let $2\alpha=a/q+\delta/x$, $q\le Q$, $\gcd(a,q)=1$, $|\delta/x|\le1/qQ$, where $Q=(3/4)x^{2/3}$. Then, if $3\le q\le x^{1/3}/6$, Theorem 3.1.1 gives us that
\[
|S_{\eta_2}(\alpha,x)|\le g_x\!\left(\max\left(1,\frac{|\delta|}{8}\right)\cdot q\right)x,
\tag{11.11}
\]
where
\[
g_x(r)=\frac{(R_{x,2r}\log2r+0.5)\sqrt{z(r)}+2.5}{\sqrt{2r}}+\frac{L_{2r}}{r}+3.36x^{-1/6},
\tag{11.12}
\]
with
\[
\begin{aligned}
R_{x,t}&=0.27125\log\left(1+\frac{\log4t}{2\log\frac{9x^{1/3}}{2.004t}}\right)+0.41415,\\
L_t&=z(t/2)\left(\frac{13}{4}\log t+7.82\right)+13.66\log t+37.55.
\end{aligned}
\tag{11.13}
\]
If $q>x^{1/3}/6$, then, again by Theorem 3.1.1,
\[
|S_{\eta_2}(\alpha,x)|\le h(x)x,
\tag{11.14}
\]
where
\[
h(x)=0.276x^{-1/6}(\log x)^{3/2}+1234x^{-1/3}\log x.
\tag{11.15}
\]

We will work with $x$ varying within a range, and so we must pay some attention to the dependence of (11.11) and (11.14) on $x$. Let us prove two auxiliary lemmas on this.

Lemma 11.2.1. Let $g_x(r)$ be as in (11.12) and $h(x)$ as in (11.15). Then
\[
x\mapsto\begin{cases}h(x)&\text{if }x<(6r)^3,\\ g_x(r)&\text{if }x\ge(6r)^3\end{cases}
\]
is a decreasing function of $x$ for $r\ge11$ fixed and $x\ge21$.

Proof. It is clear from the definitions that $x\mapsto h(x)$ (for $x\ge21$) and $x\mapsto g_x(r)$ are both decreasing. Thus, we simply have to show that $h(x_r)\ge g_{x_r}(r)$ for $x_r=(6r)^3$. Since $x_r\ge(6\cdot11)^3>e^{12.5}$,
\[
\begin{aligned}
R_{x_r,2r}&\le0.27125\log(0.065\log x_r+1.056)+0.41415\\
&\le0.27125\log((0.065+0.0845)\log x_r)+0.41415\le0.27215\log\log x_r.
\end{aligned}
\]
Hence
\[
\begin{aligned}
R_{x_r,2r}\log2r+0.5&\le0.27215\log\log x_r\log x_r^{1/3}-0.27215\log12.5\log3+0.5\\
&\le0.09072\log\log x_r\log x_r-0.255.
\end{aligned}
\]
At the same time,
\[
z(r)=e^\gamma\log\log\frac{x_r^{1/3}}{6}+\frac{2.50637}{\log\log r}\le e^\gamma\log\log x_r-e^\gamma\log3+1.9521\le e^\gamma\log\log x_r
\tag{11.16}
\]
for $r\ge37$, and we also get $z(r)\le e^\gamma\log\log x_r$ for $r\in[11,37]$ by the bisection method with 10 iterations. Hence
\[
\begin{aligned}
(R_{x_r,2r}\log2r+0.5)\sqrt{z(r)}+2.5&\le(0.09072\log\log x_r\log x_r-0.255)\sqrt{e^\gamma\log\log x_r}+2.5\\
&\le0.1211\log x_r(\log\log x_r)^{3/2}+2,
\end{aligned}
\]
and so
\[
\frac{(R_{x_r,2r}\log2r+0.5)\sqrt{z(r)}+2.5}{\sqrt{2r}}\le\left(0.21\log x_r(\log\log x_r)^{3/2}+3.47\right)x_r^{-1/6}.
\]
Now, by (11.16),
\[
\begin{aligned}
L_{2r}&\le e^\gamma\log\log x_r\cdot\left(\frac{13}{4}\log(x_r^{1/3}/3)+7.82\right)+13.66\log(x_r^{1/3}/3)+37.55\\
&\le e^\gamma\log\log x_r\cdot\left(\frac{13}{12}\log x_r+4.25\right)+4.56\log x_r+22.55.
\end{aligned}
\]
It is clear that
\[
\frac{4.25e^\gamma\log\log x_r+4.56\log x_r+22.55}{x_r^{1/3}/6}<1234x_r^{-1/3}\log x_r
\]
for $x_r\ge e$: we make the comparison for $x_r=e$ and take the derivative of the ratio of the left side by the right side.

It remains to show that
\[
0.21\log x_r(\log\log x_r)^{3/2}+3.47+3.36+\frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r
\tag{11.17}
\]
is less than $0.276(\log x_r)^{3/2}$ for $x_r$ large enough. Since $t\mapsto(\log t)^{3/2}/t^{1/2}$ is decreasing for $t>e^3$, we see that
\[
\frac{0.21\log x_r(\log\log x_r)^{3/2}+6.83+\frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r}{0.276(\log x_r)^{3/2}}<1
\]
for all $x_r\ge e^{33}$, simply because it is true for $x=e^{33}$, which is greater than $e^{e^3}$.

We conclude that $h(x_r)\ge g_{x_r}(r)=g_{x_r}(x_r^{1/3}/6)$ for $x_r\ge e^{33}$. We check that $h(x_r)\ge g_{x_r}(x_r^{1/3}/6)$ for $\log x_r\in[\log66^3,33]$ as well by the bisection method (applied with 30 iterations, with $\log x_r$ as the variable, on the intervals $[\log66^3,20]$, $[20,25]$, $[25,30]$ and $[30,33]$). Since $r\ge11$ implies $x_r\ge66^3$, we are done.
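The key inequality of the proof, $h(x_r)\ge g_{x_r}(r)$ at the switching point $x_r=(6r)^3$, can be spot-checked by evaluating (11.12), (11.13) and (11.15) directly. The sketch below samples a few values of $r$ in ordinary floating point; it is a plausibility check, not a substitute for the interval-arithmetic bisection used in the text. The function names are ours, and $z$ is implemented from the expression appearing in (11.16).

```python
import math

EULER_GAMMA = 0.5772156649015329


def z(r):
    """e^gamma log log r + 2.50637 / log log r, as in (11.16); needs r > e."""
    ll = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll


def R(x, t):
    """R_{x,t} from (11.13)."""
    inner = math.log(4.0 * t) / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t)))
    return 0.27125 * math.log(1.0 + inner) + 0.41415


def L(t):
    """L_t from (11.13)."""
    return z(t / 2.0) * (13.0 / 4.0 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55


def g(x, r):
    """g_x(r) from (11.12)."""
    main = ((R(x, 2.0 * r) * math.log(2.0 * r) + 0.5) * math.sqrt(z(r)) + 2.5) / math.sqrt(2.0 * r)
    return main + L(2.0 * r) / r + 3.36 * x ** (-1.0 / 6.0)


def h(x):
    """h(x) from (11.15)."""
    return 0.276 * x ** (-1.0 / 6.0) * math.log(x) ** 1.5 + 1234.0 * x ** (-1.0 / 3.0) * math.log(x)


def crossover_margin(r):
    """h(x_r) - g_{x_r}(r) at the switching point x_r = (6r)^3."""
    x_r = (6.0 * r) ** 3
    return h(x_r) - g(x_r, r)


for k in range(0, 17, 4):
    r = 11 * 2 ** k
    print(r, crossover_margin(r))
```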

Lemma 11.2.2. Let $R_{x,r}$ be as in (11.12). Then $t\to R_{e^t,r}(r)$ is convex-up for $t\ge3\log6r$.

Proof. Since $t\to e^{-t/6}$ and $t\to t$ are clearly convex-up, all we have to do is to show that $t\to R_{e^t,r}$ is convex-up. In general, since
\[
(\log f)''=\left(\frac{f'}{f}\right)'=\frac{f''f-(f')^2}{f^2},
\]
a function of the form $(\log f)$ is convex-up exactly when $f''f-(f')^2\ge0$. If $f(t)=1+a/(t-b)$, we have $f''f-(f')^2\ge0$ whenever
\[
(t+a-b)\cdot(2a)\ge a^2,
\]
i.e., $a^2+2at\ge2ab$, and that certainly happens when $t\ge b$. In our case, $b=3\log(2.004r/9)$, and so $t\ge3\log6r$ implies $t\ge b$.

Now we come to the point where we prove bounds on exponential sums of the form $S_{\eta_*}(\alpha,x)$ (that is, sums based on the smoothing $\eta_*$) based on our bounds (11.11) and (11.14) on the exponential sums $S_{\eta_2}(\alpha,x)$. This is straightforward, as promised.

Proposition 11.2.3. Let $x\ge Kx_0$, $x_0=2.16\cdot10^{20}$, $K\ge1$. Let $S_\eta(\alpha,x)$ be as in (10.1). Let $\eta_*=\eta_2*_M\varphi$, where $\eta_2$ is as in (11.10) and $\varphi:[0,\infty)\to[0,\infty)$ is continuous and in $L_1$.

Let $2\alpha=a/q+\delta/x$, $q\le Q$, $\gcd(a,q)=1$, $|\delta/x|\le1/qQ$, where $Q=(3/4)x^{2/3}$. If $q\le(x/K)^{1/3}/6$, then
\[
S_{\eta_*}(\alpha,x)\le g_{x,\varphi}\!\left(\max\left(1,\frac{|\delta|}{8}\right)q\right)\cdot|\varphi|_1x,
\tag{11.18}
\]
where
\[
\begin{aligned}
g_{x,\varphi}(r)&=\frac{(R_{x,K,\varphi,2r}\log2r+0.5)\sqrt{z(r)}+2.5}{\sqrt{2r}}+\frac{L_{2r}}{r}+3.36K^{1/6}x^{-1/6},\\
R_{x,K,\varphi,t}&=R_{x,t}+(R_{x/K,t}-R_{x,t})\,\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K},
\end{aligned}
\tag{11.19}
\]
with $R_{x,t}$ and $L_t$ as in (11.13), and
\[
C_{\varphi,2,K}=-\int_{1/K}^{1}\varphi(w)\log w\,dw.
\tag{11.20}
\]
If $q>(x/K)^{1/3}/6$, then
\[
|S_{\eta_*}(\alpha,x)|\le h_\varphi(x/K)\cdot|\varphi|_1x,
\]
where
\[
h_\varphi(x)=h(x)+C_{\varphi,0,K}/|\varphi|_1,\qquad C_{\varphi,0,K}=1.04488\int_0^{1/K}|\varphi(w)|\,dw
\tag{11.21}
\]
and $h(x)$ is as in (11.15).

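Proof. By (11.9),
\[
S_{\eta_*}(\alpha,x)=\int_0^{1/K}S_{\eta_2}(\alpha,wx)\varphi(w)\,\frac{dw}{w}+\int_{1/K}^{\infty}S_{\eta_2}(\alpha,wx)\varphi(w)\,\frac{dw}{w}.
\]
We bound the first integral by the trivial estimate $|S_{\eta_2}(\alpha,wx)|\le|S_{\eta_2}(0,wx)|$ and Cor. C.1.3:
\[
\int_0^{1/K}|S_{\eta_2}(0,wx)|\varphi(w)\,\frac{dw}{w}\le1.04488\int_0^{1/K}wx\,\varphi(w)\,\frac{dw}{w}=1.04488x\cdot\int_0^{1/K}\varphi(w)\,dw.
\]
If $w\ge1/K$, then $wx\ge x_0$, and we can use (11.11) or (11.14). If $q>(x/K)^{1/3}/6$, then $|S_{\eta_2}(\alpha,wx)|\le h(x/K)wx$ by (11.14); moreover, $|S_{\eta_2}(\alpha,y)|\le h(y)y$ for $x/K\le y<(6q)^3$ (by (11.14)) and $|S_{\eta_2}(\alpha,y)|\le g_{y,1}(r)$ for $y\ge(6q)^3$ (by (11.11)). Thus, Lemma 11.2.1 gives us that
\[
\int_{1/K}^{\infty}|S_{\eta_2}(\alpha,wx)|\varphi(w)\,\frac{dw}{w}\le\int_{1/K}^{\infty}h(x/K)wx\cdot\varphi(w)\,\frac{dw}{w}=h(x/K)x\int_{1/K}^{\infty}\varphi(w)\,dw\le h(x/K)|\varphi|_1\cdot x.
\]
If $q\le(x/K)^{1/3}/6$, we always use (11.11). We can use the coarse bound
\[
\int_{1/K}^{\infty}3.36x^{-1/6}\cdot wx\cdot\varphi(w)\,\frac{dw}{w}\le3.36K^{1/6}|\varphi|_1x^{5/6}.
\]
Since $L_r$ does not depend on $x$,
\[
\int_{1/K}^{\infty}\frac{L_r}{r}\cdot wx\cdot\varphi(w)\,\frac{dw}{w}\le\frac{L_r}{r}|\varphi|_1x.
\]
By Lemma 11.2.2 and $q\le(x/K)^{1/3}/6$, $y\mapsto R_{e^y,t}$ is convex-up and decreasing for $y\in[\log(x/K),\infty)$. Hence
\[
R_{wx,t}\le\begin{cases}\dfrac{\log w}{\log\frac{1}{K}}R_{x/K,t}+\left(1-\dfrac{\log w}{\log\frac{1}{K}}\right)R_{x,t}&\text{if }w<1,\\[2mm] R_{x,t}&\text{if }w\ge1.\end{cases}
\]
Therefore
\[
\begin{aligned}
\int_{1/K}^{\infty}R_{wx,t}\cdot wx\cdot\varphi(w)\,\frac{dw}{w}
&\le\int_{1/K}^{1}\left(\frac{\log w}{\log\frac{1}{K}}R_{x/K,t}+\left(1-\frac{\log w}{\log\frac{1}{K}}\right)R_{x,t}\right)x\varphi(w)\,dw+\int_1^{\infty}R_{x,t}\varphi(w)x\,dw\\
&\le R_{x,t}x\cdot\int_{1/K}^{\infty}\varphi(w)\,dw+(R_{x/K,t}-R_{x,t})\frac{x}{\log K}\int_{1/K}^{1}\varphi(w)\log w\,dw\\
&\le\left(R_{x,t}|\varphi|_1+(R_{x/K,t}-R_{x,t})\frac{C_{\varphi,2}}{\log K}\right)\cdot x,
\end{aligned}
\]
where
\[
C_{\varphi,2,K}=-\int_{1/K}^{1}\varphi(w)\log w\,dw.
\]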

We finish by proving a couple more lemmas.

Lemma 11.2.4. Let $x>K>1$. Let $\eta_*=\eta_2*_M\varphi$, where $\eta_2$ is as in (11.10) and $\varphi:[0,\infty)\to[0,\infty)$ is continuous and in $L_1$. Let $g_{x,\varphi}$ be as in (11.19). Then $g_{x,\varphi}(r)$ is a decreasing function of $r$ for $670\le r\le(x/K)^{1/3}/6$.

      112 THE SMOOTHING FUNCTION ηlowast ADAPTING MINOR-ARC BOUNDS225

Proof. Taking derivatives, we can easily see that

\[ r \mapsto \frac{\log\log r}{r}, \quad r \mapsto \frac{\log r}{r}, \quad r \mapsto \frac{\log r \log\log r}{r}, \quad r \mapsto \frac{(\log r)^2 \log\log r}{r} \tag{11.22} \]

are decreasing for $r \ge 20$. The same is true if $\log\log r$ is replaced by $z(r)$, since $z(r)/\log\log r$ is a decreasing function for $r \ge e$. Since $(C_{\varphi,2,K}/|\varphi|_1)/\log K \le 1$ (by (11.20)), we see that it is enough to prove that $r \mapsto R_{y,2r} \log 2r \sqrt{\log\log r}/\sqrt{2r}$ is decreasing on $r$ for $y = x$ and $y = x/K$ (under the assumption that $r \ge 670$).

Looking at (11.13) and at (11.22), we see that it remains only to check that

\[ r \mapsto \log\left(1 + \frac{\log 8r}{2 \log \frac{9 y^{1/3}}{4.008 r}}\right) \log 2r \cdot \sqrt{\frac{\log\log r}{r}} \tag{11.23} \]

is decreasing on $r$ for $r \ge 670$. Taking logarithms, and then derivatives, we see that we have to show that

\[ \frac{\frac{1}{r}\ell + \frac{\log 8r}{r}}{2\ell^2 \left(1 + \frac{\log 8r}{2\ell}\right) \log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{1}{r \log 2r} + \frac{1}{2 r \log r \log\log r} < \frac{1}{2r}, \]

where $\ell = \log \frac{9 y^{1/3}}{4.008 r}$. We multiply by $2r$, and see that this is equivalent to

\[ \frac{\frac{1}{\ell}\left(2 - \frac{1}{1 + \frac{\log 8r}{2\ell}}\right)}{\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r \log\log r} < 1. \tag{11.24} \]

A derivative test is enough to show that $s/\log(1+s)$ is an increasing function of $s$ for $s > 0$; hence, so is $s \cdot (2 - 1/(1+s))/\log(1+s)$. Setting $s = (\log 8r)/(2\ell)$, we obtain that the left side of (11.24) is a decreasing function of $\ell$ for $r \ge 1$ fixed.

Since $r \le y^{1/3}/6$, $\ell \ge \log(54/4.008) > 2.6$. Thus, for (11.24) to hold, it is enough to ensure that

\[ \frac{\frac{1}{2.6}\left(2 - \frac{1}{1 + \frac{\log 8r}{5.2}}\right)}{\log\left(1 + \frac{\log 8r}{5.2}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r \log\log r} < 1. \tag{11.25} \]

A derivative test shows that $(2 - 1/s)/\log(1+s)$ is a decreasing function of $s$ for $s \ge 1.23$; since $\log(8 \cdot 75)/5.2 > 1.23$, this implies that the left side of (11.25) is a decreasing function of $r$ for $r \ge 75$.

We check that the left side of (11.25) is indeed less than 1 for $r = 670$; we conclude that it is less than 1 for all $r \ge 670$.
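The final numerical step of this proof can be reproduced with a few lines of Python. The following is only an illustrative sketch in ordinary floating point (not the interval arithmetic used elsewhere in this work), and the function name `lhs_11_25` is ours:

```python
import math


def lhs_11_25(r: float) -> float:
    """Left side of (11.25), i.e. (11.24) with the bound l >= 2.6 substituted."""
    s = math.log(8 * r) / 5.2  # log(8r) / (2 * 2.6)
    first = (2 - 1 / (1 + s)) / (2.6 * math.log(1 + s))
    second = 2 / math.log(2 * r)
    third = 1 / (math.log(r) * math.log(math.log(r)))
    return first + second + third


if __name__ == "__main__":
    # The value at r = 670 should come out slightly below 1.
    print(lhs_11_25(670))
```

The margin at $r = 670$ is small, which is consistent with 670 being chosen as the threshold in the statement of the lemma.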

Lemma 11.2.5. Let $x \ge 10^{25}$. Let $\phi : [0,\infty) \to [0,\infty)$ be continuous and in $L^1$. Let $g_{x,\phi}(r)$ and $h(x)$ be as in (11.19) and (11.15), respectively. Then

\[ g_{x,\phi}\left(\frac{3}{8} x^{4/15}\right) \ge h(2x/\log x). \]

Proof. We can bound $g_{x,\phi}(r)$ from below by

\[ gm_x(r) = \frac{(R_{x,r} \log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}}. \]

Let $r = (3/8) x^{4/15}$. Using the assumption that $x \ge 10^{25}$, we see that

\[ R_{x,r} = 0.27125 \log\left(1 + \frac{\log\left(\frac{3 x^{4/15}}{2}\right)}{2 \log\left(\frac{9}{2.004 \cdot \frac{3}{8}} \cdot x^{\frac{1}{3} - \frac{4}{15}}\right)}\right) + 0.41415 \ge 0.63368. \tag{11.26} \]

(It is easy to see that the left side of (11.26) is increasing on $x$.) Using $x \ge 10^{25}$ again, we get that

\[ z(r) = e^{\gamma} \log\log r + \frac{2.50637}{\log\log r} \ge 5.68721. \]

Since $\log 2r = (4/15)\log x + \log(3/4)$, we conclude that

\[ gm_x(r) \ge \frac{0.40298 \log x + 3.25765}{\sqrt{3/4} \cdot x^{2/15}}. \]

Recall that

\[ h(x) = \frac{0.276 (\log x)^{3/2}}{x^{1/6}} + \frac{1234 \log x}{x^{1/3}}. \]

We can see that

\[ x \mapsto \frac{(\log x + 3.3)/x^{2/15}}{(\log(2x/\log x))^{3/2}/(2x/\log x)^{1/6}} \tag{11.27} \]

is increasing for $x \ge 10^{25}$ (and indeed for $x \ge e^{27}$) by taking the logarithm of the right side of (11.27) and then taking its derivative with respect to $t = \log x$. We can see in the same way that $(1/x^{2/15})/(\log(2x/\log x)/(2x/\log x)^{1/3})$ is increasing for $x \ge e^{22}$. Since

\[ \frac{0.40298(\log x + 3.3)}{\sqrt{3/4} \cdot x^{2/15}} \ge \frac{0.276 (\log(2x/\log x))^{3/2}}{(2x/\log x)^{1/6}}, \qquad \frac{3.25765 - 3.3 \cdot 0.40298}{\sqrt{3/4} \cdot x^{2/15}} \ge \frac{1234 \log(2x/\log x)}{(2x/\log x)^{1/3}} \]

for $x = 10^{25}$, we are done.

      Chapter 12

The $\ell_2$ norm and the large sieve

Our aim here is to give a bound on the $\ell_2$ norm of an exponential sum over the minor arcs. While we care about an exponential sum in particular, we will prove a result valid for all exponential sums $S(\alpha, x) = \sum_n a_n e(\alpha n)$ with $a_n$ of prime support.

We start by adapting ideas from Ramaré's version of the large sieve for primes to estimate $\ell_2$ norms over parts of the circle (§12.1). We are left with the task of giving an explicit bound on the factor in Ramaré's work; this we do in §12.2. As a side effect, this finally gives a fully explicit large sieve for primes that is asymptotically optimal, meaning a sieve that does not have a spurious factor of $e^{\gamma}$ in front; this was an arguably important gap in the literature.

12.1 Variations on the large sieve for primes

We are trying to estimate an integral $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^3 d\alpha$. Instead of bounding it trivially by $|S|_\infty |S|_2^2$, we can use the fact that large (“major”) values of $S(\alpha)$ have to be multiplied only by $\int_{\mathfrak{M}} |S(\alpha)|^2 d\alpha$, where $\mathfrak{M}$ is a union (small in measure) of major arcs. Now, can we give an upper bound for $\int_{\mathfrak{M}} |S(\alpha)|^2 d\alpha$ better than $|S|_2^2 = \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2 d\alpha$?

The first version of [Helb] gave an estimate on that integral using a technique due to Heath-Brown, which in turn rests on an inequality of Montgomery's ([Mon71, (3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by Heath-Brown to the present author, who communicated it to Tao, who used it in his own notable work on sums of five primes (see [Tao14, Lem. 4.6] and adjoining comments). We will be able to do better than that estimate here.

The role played by Montgomery's inequality in Heath-Brown's method is played here by a result of Ramaré's ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]). The following proposition is based on Ramaré's result, or rather on one possible proof of it. Instead of using the result as stated in [Ram09], we will actually be using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply integrating Ramaré's inequality would give a non-trivial if slightly worse bound.

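Proposition 12.1.1. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0 Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let

\[ \mathfrak{M} = \bigcup_{q \le Q_0} \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 Q_0}{qx}, \frac{a}{q} + \frac{\delta_0 Q_0}{qx}\right). \tag{12.1} \]

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \left(\max_{q \le Q_0} \max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}\right) \sum_n |a_n|^2, \]

where

\[ G_q(R) = \sum_{\substack{r \le R\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.2} \]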

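Proof. By (12.1),

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha = \sum_{q \le Q_0} \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \tag{12.3} \]

Thanks to the last equations of [Bom74, p. 24] and [Bom74, p. 25],

\[ \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|S\left(\frac{a}{q}\right)\right|^2 = \frac{1}{\phi(q)} \sum_{\substack{q^* | q\\ (q^*, q/q^*)=1\\ \mu^2(q/q^*)=1}} q^* \cdot \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n \chi(n)\right|^2 \]

for every $q \le \sqrt{x}$, where we use the assumption that $n$ is prime and $> \sqrt{x}$ (and thus coprime to $q$) when $a_n \ne 0$. Hence

\[
\begin{aligned}
\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha
&= \sum_{q \le Q_0} \sum_{\substack{q^* | q\\ (q^*, q/q^*)=1\\ \mu^2(q/q^*)=1}} q^* \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \frac{1}{\phi(q)} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha\\
&= \sum_{q^* \le Q_0} \frac{q^*}{\phi(q^*)} \sum_{\substack{r \le Q_0/q^*\\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q_0}{q^* r x}}^{\frac{\delta_0 Q_0}{q^* r x}} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha\\
&= \sum_{q^* \le Q_0} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \le \frac{Q_0}{q^*} \min\left(1, \frac{\delta_0}{|\alpha| x}\right)\\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha.
\end{aligned}
\]

Here $|\alpha| \le \delta_0 Q_0/q^* x$ implies $(Q_0/q^*)\,\delta_0/|\alpha| x \ge 1$. Therefore,

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \left(\max_{q^* \le Q_0} \max_{s \le Q_0/q^*} \frac{G_{q^*}(Q_0/sq^*)}{G_{q^*}(Q/sq^*)}\right) \cdot \Sigma, \tag{12.4} \]

where

\[
\begin{aligned}
\Sigma &= \sum_{q^* \le Q_0} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \le \frac{Q}{q^*} \min\left(1, \frac{\delta_0}{|\alpha| x}\right)\\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha\\
&\le \sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\substack{r \le Q/q\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha.
\end{aligned}
\]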

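As stated in the proof of [Bom74, Thm. 7A],

\[ \overline{\chi}(r)\chi(n)\tau(\overline{\chi}) c_r(n) = \sum_{\substack{b=1\\ (b,qr)=1}}^{qr} \overline{\chi}(b) e^{2\pi i n \frac{b}{qr}} \]

for $\chi$ primitive of modulus $q$. Here $c_r(n)$ stands for the Ramanujan sum

\[ c_r(n) = \sum_{\substack{u \bmod r\\ (u,r)=1}} e^{2\pi i n u/r}. \]

For $n$ coprime to $r$, $c_r(n) = \mu(r)$. Since $\chi$ is primitive, $|\tau(\overline{\chi})| = \sqrt{q}$. Hence, for $r \le \sqrt{x}$ squarefree and coprime to $q$,

\[ q \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 = \left|\sum_{\substack{b=1\\ (b,qr)=1}}^{qr} \overline{\chi}(b) S\left(\frac{b}{qr} + \alpha\right)\right|^2. \]

Thus,

\[
\begin{aligned}
\Sigma &\le \sum_{q \le Q} \sum_{\substack{r \le Q/q\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(rq)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_{\substack{b=1\\ (b,qr)=1}}^{qr} \overline{\chi}(b) S\left(\frac{b}{qr} + \alpha\right)\right|^2 d\alpha\\
&\le \sum_{q \le Q} \frac{1}{\phi(q)} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\chi \bmod q} \left|\sum_{\substack{b=1\\ (b,q)=1}}^{q} \overline{\chi}(b) S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha\\
&= \sum_{q \le Q} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\substack{b=1\\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha.
\end{aligned}
\]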
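The identity $c_r(n) = \mu(r)$ for $n$ coprime to $r$, used above, is easy to test numerically. The following sketch is only an illustration; the helper names `mobius` and `ramanujan_sum` are ours, and the sum is evaluated in floating point and then rounded:

```python
import cmath
from math import gcd


def mobius(n: int) -> int:
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def ramanujan_sum(r: int, n: int) -> int:
    """c_r(n): sum of e^{2 pi i n u / r} over residues u coprime to r."""
    total = sum(
        cmath.exp(2j * cmath.pi * n * u / r)
        for u in range(1, r + 1)
        if gcd(u, r) == 1
    )
    return round(total.real)


if __name__ == "__main__":
    for r in (1, 6, 12, 35):
        print(r, ramanujan_sum(r, 1), mobius(r))
```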


Let us now check that the intervals $(b/q - \delta_0 Q/qx,\ b/q + \delta_0 Q/qx)$ do not overlap. Since $Q = \sqrt{x/2\delta_0}$, we see that $\delta_0 Q/qx = 1/2qQ$. The difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \le Q$, $1/qq' \ge 1/2qQ + 1/2Qq'$. Hence the intervals around $b/q$ and $b'/q'$ do not overlap. We conclude that

\[ \Sigma \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2, \]

and so, by (12.4), we are done.
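The non-overlap argument at the end of the proof can be illustrated with exact rational arithmetic. The sketch below assumes, for simplicity, that $Q$ is a positive integer (in the proposition it is a real number), and the function names are ours:

```python
from fractions import Fraction
from math import gcd


def arcs(Q: int):
    """Arcs (b/q - 1/(2qQ), b/q + 1/(2qQ)) around reduced fractions b/q in [0, 1], q <= Q."""
    out = []
    for q in range(1, Q + 1):
        for b in range(0, q + 1):
            if gcd(b, q) == 1:
                centre = Fraction(b, q)
                half_width = Fraction(1, 2 * q * Q)
                out.append((centre - half_width, centre + half_width))
    return sorted(out)


def arcs_are_disjoint(Q: int) -> bool:
    """True if no two of the open arcs overlap (touching endpoints are allowed)."""
    a = arcs(Q)
    return all(a[i][1] <= a[i + 1][0] for i in range(len(a) - 1))


if __name__ == "__main__":
    print(all(arcs_are_disjoint(Q) for Q in range(1, 13)))
```

Since the arcs are sorted by left endpoint, comparing consecutive arcs is enough to detect any overlap.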

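We will actually use Prop. 12.1.1 in the slightly modified form given by the following statement.

Proposition 12.1.2. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0 Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0, Q_0}$ be as in (10.5).

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then

\[ \int_{\mathfrak{M}_{\delta_0, Q_0}} |S(\alpha)|^2\, d\alpha \le \left(\max_{\substack{q \le 2Q_0\\ q \text{ even}}} \max_{s \le 2Q_0/q} \frac{G_q(2Q_0/sq)}{G_q(2Q/sq)}\right) \sum_n |a_n|^2, \]

where

\[ G_q(R) = \sum_{\substack{r \le R\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.5} \]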

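Proof. By (10.5),

\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha = \sum_{\substack{q \le Q_0\\ q \text{ odd}}} \int_{-\frac{\delta_0 Q_0}{2qx}}^{\frac{\delta_0 Q_0}{2qx}} \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha + \sum_{\substack{q \le 2Q_0\\ q \text{ even}}} \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \]

We proceed as in the proof of Prop. 12.1.1. We still have (12.3). Hence $\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$ equals

\[
\begin{aligned}
&2 \sum_{\substack{q^* \le Q_0\\ q^* \text{ odd}}} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{2q^* x}}^{\frac{\delta_0 Q_0}{2q^* x}} \sum_{\substack{r \le \frac{Q_0}{q^*} \min\left(1, \frac{\delta_0}{2|\alpha| x}\right)\\ (r,2q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha\\
&+ \sum_{\substack{q^* \le 2Q_0\\ q^* \text{ even}}} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \le \frac{2Q_0}{q^*} \min\left(1, \frac{\delta_0}{2|\alpha| x}\right)\\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha.
\end{aligned}
\]

(The sum with $q^*$ odd and $r$ even is equal to the first sum; hence the factor of 2 in front.) Therefore,

\[
\begin{aligned}
\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le{} & \left(\max_{\substack{q^* \le Q_0\\ q^* \text{ odd}}} \max_{s \le Q_0/q^*} \frac{G_{2q^*}(Q_0/sq^*)}{G_{2q^*}(Q/sq^*)}\right) \cdot 2\Sigma_1\\
&+ \left(\max_{\substack{q^* \le 2Q_0\\ q^* \text{ even}}} \max_{s \le 2Q_0/q^*} \frac{G_{q^*}(2Q_0/sq^*)}{G_{q^*}(2Q/sq^*)}\right) \cdot \Sigma_2,
\end{aligned}
\tag{12.6}
\]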

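where

\[
\begin{aligned}
\Sigma_1 &= \sum_{\substack{q \le Q\\ q \text{ odd}}} \frac{q}{\phi(q)} \sum_{\substack{r \le Q/q\\ (r,2q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{2qrx}}^{\frac{\delta_0 Q}{2qrx}} \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha\\
&= \sum_{\substack{q \le Q\\ q \text{ odd}}} \frac{q}{\phi(q)} \sum_{\substack{r \le 2Q/q\\ (r,q)=1\\ r \text{ even}}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha,\\
\Sigma_2 &= \sum_{\substack{q \le 2Q\\ q \text{ even}}} \frac{q}{\phi(q)} \sum_{\substack{r \le 2Q/q\\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_n a_n e(\alpha n) \chi(n)\right|^2 d\alpha.
\end{aligned}
\]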

The two expressions within parentheses in (12.6) are actually equal.

Much as before, using [Bom74, Thm. 7A], we obtain that

\[
\begin{aligned}
\Sigma_1 &\le \sum_{\substack{q \le Q\\ q \text{ odd}}} \int_{-\frac{\delta_0 Q}{2qx}}^{\frac{\delta_0 Q}{2qx}} \sum_{\substack{b=1\\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha,\\
\Sigma_1 + \Sigma_2 &\le \sum_{\substack{q \le 2Q\\ q \text{ even}}} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\substack{b=1\\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha.
\end{aligned}
\]

Let us now check that the intervals of integration $(b/q - \delta_0 Q/2qx,\ b/q + \delta_0 Q/2qx)$ (for $q$ odd), $(b/q - \delta_0 Q/qx,\ b/q + \delta_0 Q/qx)$ (for $q$ even) do not overlap. Recall that $\delta_0 Q/qx = 1/2qQ$. The absolute value of the difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \le Q$ odd, this is larger than $1/4qQ + 1/4Qq'$, and so the intervals do not overlap. For $q \le Q$ odd and $q' \le 2Q$ even (or vice versa), $1/qq' \ge 1/4qQ + 1/2Qq'$, and so, again, the intervals do not overlap. If $q \le 2Q$ and $q' \le 2Q$ are both even, then $|b/q - b'/q'|$ is actually $\ge 2/qq'$. Clearly, $2/qq' \ge 1/2qQ + 1/2Qq'$, and so, again, there is no overlap. We conclude that

\[ 2\Sigma_1 + \Sigma_2 \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \]

12.2 Bounding the quotient in the large sieve for primes

The estimate given by Proposition 12.1.1 involves the quotient

\[ \max_{q \le Q_0} \max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}, \tag{12.7} \]

where $G_q$ is as in (12.2). The appearance of such a quotient (at least for $s = 1$) is typical of Ramaré's version of the large sieve for primes; see, e.g., [Ram09]. We will see how to bound such a quotient in a way that is essentially optimal, not just asymptotically, but also in the ranges that are most relevant to us. (This includes, for example, $Q_0 \sim 10^6$, $Q \sim 10^{15}$.)

As the present work shows, an approach based on Ramaré's work gives bounds that are, in some contexts, better than those of other large sieves for primes by a constant factor (approaching $e^{\gamma} = 1.78107\ldots$). Thus, giving a fully explicit and nearly optimal bound for (12.7) is a task of clear general relevance, besides being needed for our main goal.

We will obtain bounds for $G_q(Q_0/sq)/G_q(Q/sq)$ when $Q_0 \le 2 \cdot 10^{10}$, $Q \ge Q_0^2$. As we shall see, our bounds will be best when $s = q = 1$, or sometimes when $s = 1$ and $q = 2$ instead.

Write $G(R)$ for $G_1(R) = \sum_{r \le R} \mu^2(r)/\phi(r)$. We will need several estimates for $G_q(R)$ and $G(R)$. As stated in [Ram95, Lemma 3.4],

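\[ G(R) \le \log R + 1.4709 \tag{12.8} \]

for $R \ge 1$. By [MV73, Lem. 7],

\[ G(R) \ge \log R + 1.07 \tag{12.9} \]

for $R \ge 6$. There is also the trivial bound

\[
G(R) = \sum_{r \le R} \frac{\mu^2(r)}{\phi(r)} = \sum_{r \le R} \frac{\mu^2(r)}{r} \prod_{p | r} \left(1 - \frac{1}{p}\right)^{-1}
= \sum_{r \le R} \frac{\mu^2(r)}{r} \prod_{p | r} \sum_{j \ge 0} \frac{1}{p^j} \ge \sum_{r \le R} \frac{1}{r} > \log R.
\tag{12.10}
\]

The following bound, also well-known and easy,

\[ G(R) \le \frac{q}{\phi(q)} G_q(R) \le G(Rq), \tag{12.11} \]

can be obtained by multiplying $G_q(R) = \sum_{r \le R,\ (r,q)=1} \mu^2(r)/\phi(r)$ term-by-term by $q/\phi(q) = \prod_{p | q} (1 + 1/\phi(p))$.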
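The bounds (12.8), (12.9) and (12.11) are easy to spot-check for small arguments. The following is a rough sketch in plain floating point, with no attempt at rigorous error control; the names `phi_and_mu2`, `G` and the table size `LIMIT` are ours:

```python
import math
from math import gcd


def phi_and_mu2(limit: int):
    """Sieve Euler's phi and the squarefree indicator mu^2 for 0..limit."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so it is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, limit + 1, p * p):
                squarefree[m] = False
    return phi, squarefree


LIMIT = 20000  # arbitrary table size for this illustration
PHI, SQUAREFREE = phi_and_mu2(LIMIT)


def G(R: float, q: int = 1) -> float:
    """G_q(R): sum of mu^2(r)/phi(r) over r <= R coprime to q (requires R <= LIMIT)."""
    top = math.floor(R)
    return math.fsum(
        1.0 / PHI[r]
        for r in range(1, top + 1)
        if SQUAREFREE[r] and gcd(r, q) == 1
    )


if __name__ == "__main__":
    for R in (7, 100, 1000):
        print(R, G(R) - math.log(R))  # should lie between 1.07 and 1.4709
```

The value $R = 7$ is worth looking at: there $G(7) - \log 7$ comes close to the constant $1.4709$ in (12.8).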

We will also use Ramaré's estimate from [Ram95, Lem. 3.4]:

\[ G_d(R) = \frac{\phi(d)}{d}\left(\log R + c_E + \sum_{p | d} \frac{\log p}{p}\right) + O^*\left(7.284 R^{-1/3} f_1(d)\right) \tag{12.12} \]

for all $d \in \mathbb{Z}^+$ and all $R \ge 1$, where

\[ f_1(d) = \prod_{p | d} (1 + p^{-2/3}) \left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)^{-1} \tag{12.13} \]

and

\[ c_E = \gamma + \sum_{p \ge 2} \frac{\log p}{p(p-1)} = 1.3325822\ldots \tag{12.14} \]

by [RS62, (2.11)].

If $R \ge 182$, then

\[ \log R + 1.312 \le G(R) \le \log R + 1.354, \tag{12.15} \]

where the upper bound is valid for $R \ge 120$. This is true by (12.12) for $R \ge 4 \cdot 10^7$; we check (12.15) for $120 \le R \le 4 \cdot 10^7$ by a numerical computation.¹ Similarly, for $R \ge 200$,

\[ \frac{\log R + 1.661}{2} \le G_2(R) \le \frac{\log R + 1.698}{2} \tag{12.16} \]

by (12.12) for $R \ge 1.6 \cdot 10^8$, and by a numerical computation for $200 \le R \le 1.6 \cdot 10^8$.

Write $\rho = (\log Q_0)/(\log Q) \le 1$. We obtain immediately from (12.15) and (12.16) that

\[ \frac{G(Q_0)}{G(Q)} \le \frac{\log Q_0 + 1.354}{\log Q + 1.312}, \qquad \frac{G_2(Q_0)}{G_2(Q)} \le \frac{\log Q_0 + 1.698}{\log Q + 1.661} \tag{12.17} \]

for $Q, Q_0 \ge 200$. What is hard is to approximate $G_q(Q_0)/G_q(Q)$ for $q$ large and $Q_0$ small.

Let us start by giving an easy bound, off from the truth by a factor of about $e^{\gamma}$. (Specialists will recognize this as a factor that appears often in first attempts at estimates based on either large or small sieves.) First, we need a simple explicit lemma.

Lemma 12.2.1. Let $m \ge 1$, $q \ge 1$. Then

\[ \prod_{p | q \,\vee\, p \le m} \frac{p}{p-1} \le e^{\gamma}\left(\log(m + \log q) + 0.65771\right). \tag{12.18} \]

Proof. Let $P = \prod_{p \le m \,\vee\, p | q} p$. Then, by [RS75, (5.1)],

\[ P \le q \prod_{p \le m} p = q e^{\sum_{p \le m} \log p} \le q e^{(1+\epsilon_0) m}, \]

where $\epsilon_0 = 0.001102$. Now, by [RS62, (3.42)],

\[ \frac{n}{\phi(n)} \le e^{\gamma} \log\log n + \frac{2.50637}{\log\log n} \le e^{\gamma} \log\log x + \frac{2.50637}{\log\log x} \]

for all $x \ge n \ge 27$ (since, given $a, b > 0$, the function $t \mapsto at + b/t$ is increasing on $t$ for $t \ge \sqrt{b/a}$). Hence, if $q e^m \ge 27$,

\[
\begin{aligned}
\frac{P}{\phi(P)} &\le e^{\gamma} \log((1+\epsilon_0) m + \log q) + \frac{2.50637}{\log(m + \log q)}\\
&\le e^{\gamma}\left(\log(m + \log q) + \epsilon_0 + \frac{2.50637/e^{\gamma}}{\log(m + \log q)}\right).
\end{aligned}
\]

Thus (12.18) holds when $m + \log q \ge 8.53$, since then $\epsilon_0 + (2.50637/e^{\gamma})/\log(m + \log q) \le 0.65771$. We verify all choices of $m, q \ge 1$ with $m + \log q \le 8.53$ computationally; the worst case is that of $m = 1$, $q = 6$, which gives the value $0.65771$ in (12.18).

¹ Using D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas.

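Here is the promised easy bound.

Lemma 12.2.2. Let $Q_0 \ge 1$, $Q \ge 182 Q_0$. Let $q \le Q_0$, $s \le Q_0/q$, $q$ an integer. Then

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log \frac{Q}{Q_0} + 1.312} \le \frac{e^{\gamma} \log Q_0 + 1.172}{\log \frac{Q}{Q_0} + 1.312}. \]

Proof. Let $P = \prod_{p \le Q_0/sq \,\vee\, p | q} p$. Then

\[ G_q(Q_0/sq)\, G_P(Q/Q_0) \le G_q(Q/sq), \]

and so

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{1}{G_P(Q/Q_0)}. \tag{12.19} \]

Now the lower bound in (12.11) gives us that, for $d = P$, $R = Q/Q_0$,

\[ G_P(Q/Q_0) \ge \frac{\phi(P)}{P} G(Q/Q_0). \]

By Lem. 12.2.1,

\[ \frac{P}{\phi(P)} \le e^{\gamma}\left(\log\left(\frac{Q_0}{sq} + \log q\right) + 0.658\right). \]

Hence, using (12.15), we get that

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{P/\phi(P)}{G(Q/Q_0)} \le \frac{e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log \frac{Q}{Q_0} + 1.312}, \tag{12.20} \]

since $Q/Q_0 \ge 182$. Since

\[ \left(\frac{Q_0}{sq} + \log q\right)' = -\frac{Q_0}{sq^2} + \frac{1}{q} = \frac{1}{q}\left(1 - \frac{Q_0}{sq}\right) \le 0, \]

the rightmost expression of (12.20) is maximal for $q = 1$.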

Lemma 12.2.2 will play a crucial role in reducing to a finite computation the problem of bounding $G_q(Q_0/sq)/G_q(Q/sq)$. As we will now see, we can use Lemma 12.2.2 to obtain a bound that is useful when $sq$ is large compared to $Q_0$, which is precisely the case in which asymptotic estimates such as (12.12) are relatively weak.

Lemma 12.2.3. Let $Q_0 \ge 1$, $Q \ge 200 Q_0$. Let $q \le Q_0$, $s \le Q_0/q$. Let $\rho = (\log Q_0)/\log Q \le 2/3$. Then, for any $\sigma \ge 1.312\rho$,

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + \sigma}{\log Q + 1.312} \tag{12.21} \]

holds provided that

\[ \frac{Q_0}{sq} \le c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}} - \log q, \]

where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.

Proof. By Lemma 12.2.2, we see that (12.21) will hold provided that

\[ e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le \frac{\log \frac{Q}{Q_0} + 1.312}{\log Q + 1.312} \cdot (\log Q_0 + \sigma). \tag{12.22} \]

The expression on the right of (12.22) equals

\[
\begin{aligned}
\log Q_0 + \sigma - \frac{(\log Q_0 + \sigma) \log Q_0}{\log Q + 1.312}
&= (1-\rho)(\log Q_0 + \sigma) + \frac{1.312\rho(\log Q_0 + \sigma)}{\log Q + 1.312}\\
&\ge (1-\rho)(\log Q_0 + \sigma) + 1.312\rho^2,
\end{aligned}
\]

and so (12.22) will hold provided that

\[ e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le (1-\rho)(\log Q_0) + (1-\rho)\sigma + 1.312\rho^2. \]

Taking derivatives, we see that

\[
\begin{aligned}
(1-\rho)\sigma + 1.312\rho^2 - 1.172 &\ge \left(1 - \frac{\sigma}{2.624}\right)\sigma + 1.312\left(\frac{\sigma}{2.624}\right)^2 - 1.172\\
&= \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172.
\end{aligned}
\]

Hence it is enough that

\[ \frac{Q_0}{sq} + \log q \le e^{e^{-\gamma}\left((1-\rho)\log Q_0 + \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172\right)} = c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}}, \]

where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.
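To get a feeling for the size of the quantities in Lemma 12.2.3, one can evaluate $c(\sigma)$ and the exponent $(1-\rho)e^{-\gamma}$ numerically. The sketch below is only illustrative; the function names are ours, and the sample values $\sigma = 1.36$, $\rho = 0.6$ are the ones that appear later in this section (they give the rounded constants $0.911$, $0.224$ and $1.289$ used in (12.39)):

```python
import math

EULER_GAMMA = 0.5772156649015329


def c(sigma: float) -> float:
    """c(sigma) = exp(exp(-gamma) * (sigma - sigma^2/5.248 - 1.172))."""
    return math.exp(math.exp(-EULER_GAMMA) * (sigma - sigma ** 2 / 5.248 - 1.172))


def tau(rho: float) -> float:
    """Exponent (1 - rho) * exp(-gamma) of Q_0 in the condition of the lemma."""
    return (1 - rho) * math.exp(-EULER_GAMMA)


def threshold(Q0: float, rho: float, q: int, sigma: float) -> float:
    """Right side c(sigma) * Q0^tau - log q of the condition on Q0/(sq)."""
    return c(sigma) * Q0 ** tau(rho) - math.log(q)


if __name__ == "__main__":
    print(c(1.36), tau(0.6), 1 / (1 - tau(0.6)))
    print(threshold(1e5, 0.6, 1, 1.36))
```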

We now pass to the main result of the section.

Proposition 12.2.4. Let $Q \ge 20000 Q_0$, $Q_0 \ge Q_{0,\min}$, where $Q_{0,\min} = 10^5$. Let $\rho = (\log Q_0)/\log Q$. Assume $\rho \le 0.6$. Then, for every $1 \le q \le Q_0$ and every $s \in [1, Q_0/q]$,

\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + c_+}{\log Q + c_E}, \tag{12.23} \]

where $c_E$ is as in (12.14) and $c_+ = 1.36$.

An ideal result would have $c_E$ instead of $c_+$, but this is not actually possible: error terms do exist, even if they are in reality smaller than the bound given in (12.12); this means that a bound such as (12.23) with $c_E$ instead of $c_+$ would be false for $q = 1$, $s = 1$.

There is nothing special about the assumptions

\[ Q \ge 20000 Q_0, \qquad Q_0 \ge 10^5, \qquad (\log Q_0)/(\log Q) \le 0.6. \]

They can all be relaxed at the cost of an increase in $c_+$.

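Proof. Define $\mathrm{err}_{q,R}$ so that

\[ G_q(R) = \frac{\phi(q)}{q}\left(\log R + c_E + \sum_{p | q} \frac{\log p}{p}\right) + \mathrm{err}_{q,R}. \tag{12.24} \]

Then (12.23) will hold if

\[
\begin{aligned}
&\log \frac{Q_0}{sq} + c_E + \sum_{p | q} \frac{\log p}{p} + \frac{q}{\phi(q)} \mathrm{err}_{q, Q_0/sq}\\
&\qquad \le \left(\log \frac{Q}{sq} + c_E + \sum_{p | q} \frac{\log p}{p} + \frac{q}{\phi(q)} \mathrm{err}_{q, Q/sq}\right) \frac{\log Q_0 + c_+}{\log Q + c_E}.
\end{aligned}
\tag{12.25}
\]

This, in turn, happens if

\[
\begin{aligned}
&\left(\log sq - \sum_{p | q} \frac{\log p}{p}\right)\left(1 - \frac{\log Q_0 + c_+}{\log Q + c_E}\right) + c_+ - c_E\\
&\qquad \ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q, Q_0/sq} - \frac{\log Q_0 + c_+}{\log Q + c_E}\, \mathrm{err}_{q, Q/sq}\right).
\end{aligned}
\]

Define

\[ \omega(\rho) = \frac{\log Q_{0,\min} + c_+}{\frac{1}{\rho} \log Q_{0,\min} + c_E} = \rho + \frac{c_+ - \rho c_E}{\frac{1}{\rho} \log Q_{0,\min} + c_E}. \]

Then $\rho \le (\log Q_0 + c_+)/(\log Q + c_E) \le \omega(\rho)$ (because $c_+ \ge \rho c_E$). We conclude that (12.25) (and hence (12.23)) holds provided that

\[
(1 - \omega(\rho))\left(\log sq - \sum_{p | q} \frac{\log p}{p}\right) + c_\Delta
\ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q, Q_0/sq} + \omega(\rho) \max\left(0, -\mathrm{err}_{q, Q/sq}\right)\right),
\tag{12.26}
\]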


where $c_\Delta = c_+ - c_E$. Note that $1 - \omega(\rho) > 0$.

First, let us give some easy bounds on the error terms; these bounds will yield upper bounds for $s$. By (12.8) and (12.11),

\[ \mathrm{err}_{q,R} \le \frac{\phi(q)}{q}\left(\log q - \sum_{p | q} \frac{\log p}{p} + (1.4709 - c_E)\right) \]

for $R \ge 1$; by (12.15) and (12.11),

\[ \mathrm{err}_{q,R} \ge -\frac{\phi(q)}{q}\left(\sum_{p | q} \frac{\log p}{p} + (c_E - 1.312)\right) \]

for $R \ge 182$. Therefore, the right side of (12.26) is at most

\[ \log q - (1 - \omega(\rho)) \sum_{p | q} \frac{\log p}{p} + \left((1.4709 - c_E) + \omega(\rho)(c_E - 1.312)\right), \]

and so (12.26) holds provided that

\[ (1 - \omega(\rho)) \log sq \ge \log q + (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta. \tag{12.27} \]

We will thus be able to assume from now on that (12.27) does not hold, or, what is the same, that

\[ sq < (c_{\rho,2}\, q)^{\frac{1}{1 - \omega(\rho)}} \tag{12.28} \]

holds, where $c_{\rho,2} = \exp((1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta)$.

What values of $R = Q_0/sq$ must we consider for $q$ given? First, by (12.28), we can assume $R > Q_{0,\min}/(c_{\rho,2}\, q)^{1/(1-\omega(\rho))}$. We can also assume

\[ R > c(c_+) \cdot \max(Rq, Q_{0,\min})^{(1-\rho) e^{-\gamma}} - \log q \tag{12.29} \]

for $c(c_+)$ as in Lemma 12.2.3, since all smaller $R$ are covered by that Lemma. Clearly, (12.29) implies that

\[ R^{1-\tau} > c(c_+) \cdot q^{\tau} - \frac{\log q}{R^{\tau}} > c(c_+) q^{\tau} - \log q, \]

where $\tau = (1-\rho) e^{-\gamma}$, and also that $R > c(c_+) Q_{0,\min}^{(1-\rho) e^{-\gamma}} - \log q$. Iterating, we obtain that we can assume that $R > \varpi(q)$, where

\[ \varpi(q) = \max\left(\varpi_0(q),\ c(c_+) Q_{0,\min}^{\tau} - \log q,\ \frac{Q_{0,\min}}{(c_{\rho,2}\, q)^{\frac{1}{1-\omega(\rho)}}}\right) \tag{12.30} \]

and

\[
\varpi_0(q) =
\begin{cases}
\left(c(c_+) q^{\tau} - \frac{\log q}{(c(c_+) q^{\tau} - \log q)^{\frac{\tau}{1-\tau}}}\right)^{\frac{1}{1-\tau}} & \text{if } c(c_+) q^{\tau} > \log q + 1,\\
0 & \text{otherwise.}
\end{cases}
\]


Looking at (12.26), we see that it will be enough to show that, for all $R$ satisfying $R > \varpi(q)$, we have

\[ \mathrm{err}_{q,R} + \omega(\rho) \max(0, -\mathrm{err}_{q,tR}) \le \frac{\phi(q)}{q} \kappa(q) \tag{12.31} \]

for all $t \ge 20000$, where

\[ \kappa(q) = (1 - \omega(\rho))\left(\log q - \sum_{p | q} \frac{\log p}{p}\right) + c_\Delta. \]

Ramaré's bound (12.12) implies that

\[ |\mathrm{err}_{q,R}| \le 7.284 R^{-1/3} f_1(q), \tag{12.32} \]

with $f_1(q)$ as in (12.13), and so

\[ \mathrm{err}_{q,R} + \omega(\rho) \max(0, -\mathrm{err}_{q,tR}) \le (1 + \beta_\rho) \cdot 7.284 R^{-1/3} f_1(q), \]

where $\beta_\rho = \omega(\rho)/20000^{1/3}$. This is enough when

\[ R \ge \lambda(q) = \left(\frac{q}{\phi(q)} \cdot \frac{7.284(1 + \beta_\rho) f_1(q)}{\kappa(q)}\right)^3. \tag{12.33} \]

It remains to do two things. First, we have to compute how large $q$ has to be for $\varpi(q)$ to be guaranteed to be greater than $\lambda(q)$. (For such $q$, there is no checking to be done.) Then we check the inequality (12.31) for all smaller $q$, letting $R$ range through the integers in $[\varpi(q), \lambda(q)]$. We bound $\mathrm{err}_{q,tR}$ using (12.32), but we compute $\mathrm{err}_{q,R}$ directly.

How large must $q$ be for $\varpi(q) > \lambda(q)$ to hold? We claim that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$. Let us show this.

It is easy to see that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions of $p$ for $p \ge 3$; moreover, for both functions, the value at $p \ge 7$ is smaller than for $p = 2$. Hence, we have that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime,

\[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{p < p_0} \frac{\log p}{p}\right) + c_\Delta \tag{12.34} \]

and

\[ \lambda(q) \le \left(\prod_{p < p_0} \frac{p}{p-1} \cdot \frac{7.284(1 + \beta_\rho) \prod_{p < p_0} f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p < p_0} \frac{\log p}{p}\right) + c_\Delta}\right)^3. \tag{12.35} \]

If we also assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$, we obtain

\[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{\substack{p < p_0\\ p \ne 7}} \frac{\log p}{p}\right) + c_\Delta \tag{12.36} \]

and

\[ \lambda(q) \le \left(\prod_{\substack{p < p_0\\ p \ne 7}} \frac{p}{p-1} \cdot \frac{7.284(1 + \beta_\rho) \prod_{p < p_0,\, p \ne 7} f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p < p_0,\, p \ne 7} \frac{\log p}{p}\right) + c_\Delta}\right)^3 \tag{12.37} \]

for $q < \prod_{p \le p_0} p$. (We are taking out 7 because it is the “least helpful” prime to omit among all primes from 2 to 7, again by the fact that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions for $p \ge 3$.)

We know how to give upper bounds for the expression on the right of (12.35). The task is in essence simple: we can base our bounds on the classic explicit work in [RS62], except that we also have to optimize matters so that they are close to tight for $p_1 = 29$, $p_1 = 31$ and other low $p_1$.

By [RS62, (3.30)] and a numerical computation for $29 \le p_1 \le 43$,

\[ \prod_{p \le p_1} \frac{p}{p-1} < 1.90516 \log p_1 \]

for $p_1 \ge 29$. Since $\omega(\rho)$ is increasing on $\rho$ and we are assuming $\rho \le 0.6$, $Q_{0,\min} = 100000$,

\[ \omega(\rho) \le 0.627312, \qquad \beta_\rho \le 0.023111. \]
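The two numerical bounds just stated follow directly from the definitions of $\omega(\rho)$ and $\beta_\rho$ given earlier in the proof. The following sketch recomputes them in floating point; it uses the truncated value $c_E \approx 1.3325822$ from (12.14), and the Python names are ours:

```python
import math

C_E = 1.3325822       # truncated value of c_E from (12.14)
C_PLUS = 1.36         # c_+ from Proposition 12.2.4
Q0_MIN = 1e5          # Q_{0,min}


def omega(rho: float) -> float:
    """omega(rho) = (log Q_{0,min} + c_+) / ((1/rho) log Q_{0,min} + c_E)."""
    return (math.log(Q0_MIN) + C_PLUS) / (math.log(Q0_MIN) / rho + C_E)


def beta(rho: float) -> float:
    """beta_rho = omega(rho) / 20000^(1/3)."""
    return omega(rho) / 20000 ** (1 / 3)


if __name__ == "__main__":
    rho = 0.6
    print(omega(rho), beta(rho))
    # Derived constants that enter (12.38):
    print(7.284 * (1 + beta(rho)), 1 - omega(rho), C_PLUS - C_E)
```

The derived quantities printed at the end correspond to the constants $7.45235$, $0.37268$ and $0.02741$ appearing in (12.38).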

For $x > a$, where $a > 1$ is any constant, we obviously have

\[ \sum_{a < p \le x} \log\left(1 + p^{-2/3}\right) \le \sum_{a < p \le x} \frac{(\log p)\, p^{-2/3}}{\log a}; \]

by Abel summation (13.3) and the estimate [RS62, (3.32)] for $\theta(x) = \sum_{p \le x} \log p$,

\[
\begin{aligned}
\sum_{a < p \le x} (\log p)\, p^{-2/3} &= (\theta(x) - \theta(a)) x^{-2/3} - \int_a^x (\theta(u) - \theta(a)) \left(-\frac{2}{3} u^{-5/3}\right) du\\
&\le (1.01624x - \theta(a)) x^{-2/3} + \frac{2}{3} \int_a^x (1.01624u - \theta(a)) u^{-5/3}\, du\\
&= (1.01624x - \theta(a)) x^{-2/3} + 2 \cdot 1.01624 (x^{1/3} - a^{1/3}) + \theta(a)(x^{-2/3} - a^{-2/3})\\
&= 3 \cdot 1.01624 \cdot x^{1/3} - (2.03248 a^{1/3} + \theta(a) a^{-2/3}).
\end{aligned}
\]

We conclude that $\sum_{10^4 < p \le x} \log(1 + p^{-2/3}) \le 0.33102 x^{1/3} - 7.06909$ for $x > 10^4$. Since $\sum_{p \le 10^4} \log(1 + p^{-2/3}) \le 10.09062$, this means that

\[ \sum_{p \le x} \log(1 + p^{-2/3}) \le \left(0.33102 + \frac{10.09062 - 7.06909}{10^{4/3}}\right) x^{1/3} \le 0.47126 x^{1/3} \]

for $x > 10^4$; a direct computation for all $x$ prime between 29 and $10^4$ then confirms that

\[ \sum_{p \le x} \log(1 + p^{-2/3}) \le 0.74914 x^{1/3} \]

for all $x \ge 29$. Thus,

\[ \prod_{p \le x} f_1(p) \le \frac{e^{\sum_{p \le x} \log(1 + p^{-2/3})}}{\prod_{p \le 29} \left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)} \le \frac{e^{0.74914 x^{1/3}}}{6.62365} \]

for $x \ge 29$. Finally, by [RS62, (3.24)], $\sum_{p \le p_1} (\log p)/p < \log p_1$.

We conclude that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime, and $p_1$ the prime immediately preceding $p_0$,

\[
\begin{aligned}
\lambda(q) &\le \left(1.90516 \log p_1 \cdot \frac{7.45235 \cdot \left(\frac{e^{0.74914 p_1^{1/3}}}{6.62365}\right)}{0.37268(\log q - \log p_1) + 0.02741}\right)^3\\
&\le \frac{190.272 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.07354)^3}.
\end{aligned}
\tag{12.38}
\]

It is clear from (12.30) that $\varpi(q)$ is increasing as soon as

\[ q \ge \max\left(Q_{0,\min},\ Q_{0,\min}^{1-\omega(\rho)}/c_{\rho,2}\right) \]

and $c(c_+) q^{\tau} > \log q + 1$, since then $\varpi_0(q)$ is increasing and $\varpi(q) = \varpi_0(q)$. Here it is useful to recall that $c_{\rho,2} \ge \exp(1.4709 - c_+)$, and to note that $c(c_+) q^{\tau} - (\log q + 1)$ is increasing for $q \ge 1/(\tau \cdot c(c_+))^{1/\tau}$; we see also that $1/(\tau \cdot c(c_+))^{1/\tau} \le 1/((1 - 0.6) e^{-\gamma} c(c_+))^{1/((1-0.6) e^{-\gamma})}$ for $\rho \le 0.6$. A quick computation for our value of $c_+$ makes us conclude that $q > 1.12\, Q_{0,\min} = 112000$ is a sufficient condition for $\varpi(q)$ to be equal to $\varpi_0(q)$ and for $\varpi_0(q)$ to be increasing.

Since (12.38) is decreasing on $q$ for $p_1$ fixed, and $\varpi_0(q)$ is decreasing on $\rho$ and increasing on $q$, we set $\rho = 0.6$ and check that then

\[ \varpi_0\left(2.2 \cdot 10^{10}\right) \ge 846.765, \]

whereas, by (12.38),

\[ \lambda(2.2 \cdot 10^{10}) \le 838.227 < 846.765; \]

this is enough to ensure that $\lambda(q) < \varpi_0(q)$ for $2.2 \cdot 10^{10} \le q < \prod_{p \le 31} p$.

Let us now give some rough bounds that will be enough to cover the case $q \ge \prod_{p \le 31} p$. First, as we already discussed, $\varpi(q) = \varpi_0(q)$ and, since $c(c_+) q^{\tau} > \log q + 1$,

\[ \varpi_0(q) \ge (c(c_+) q^{\tau} - \log q)^{\frac{1}{1-\tau}} \ge (0.911 q^{0.224} - \log q)^{1.289} \ge q^{0.2797} \tag{12.39} \]

by $q \ge \prod_{p \le 31} p$. We are in the range $\prod_{p \le p_1} p \le q \le \prod_{p \le p_0} p$, where $p_1 < p_0$ are two consecutive primes with $p_1 \ge 31$. By [RS62, (3.16)] and a computation for $31 \le p_1 < 200$, we know that $\log q \ge \sum_{p \le p_1} \log p \ge 0.8009 p_1$. By (12.38) and (12.39), it follows that we just have to show that

\[ e^{0.224 t} > \frac{190.272 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3} \]

for $t \ge 31$. Now, $t \ge 31$ implies $0.8009t - \log t + 0.07354 \ge 0.6924t$, and so, taking logarithms, we see that we just have to verify

\[ 0.224t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.3513 \tag{12.40} \]

for $t \ge 31$, and, since the left side is increasing and the right side is decreasing for $t \ge 31$, this is trivial to check.
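The check of (12.40) amounts to one evaluation at $t = 31$ together with the two monotonicity claims. A minimal floating-point sketch of that check, with function names of our own choosing, could read:

```python
import math


def lhs_12_40(t: float) -> float:
    """Left side of (12.40): 0.224 t - 2.24742 t^(1/3)."""
    return 0.224 * t - 2.24742 * t ** (1 / 3)


def rhs_12_40(t: float) -> float:
    """Right side of (12.40): 3 log log t - 3 log t + 6.3513."""
    return 3 * math.log(math.log(t)) - 3 * math.log(t) + 6.3513


if __name__ == "__main__":
    print(lhs_12_40(31), rhs_12_40(31))  # the first value should exceed the second
```

Since the left side increases and the right side decreases from $t = 31$ on, the inequality at $t = 31$ implies it for all larger $t$.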

We conclude that $\varpi(q) > \lambda(q)$ whenever $q \geq 2.2\cdot 10^{10}$.

It remains to see how we can relax this assumption if we assume that $2\cdot 3\cdot 5\cdot 7 \nmid q$. We repeat the same analysis as before, using (12.36) and (12.37) instead of (12.34) and (12.35). For $p_1 \geq 29$,
\[\prod_{p\leq p_1,\, p\ne 7} \frac{p}{p-1} < 1.633 \log p_1, \qquad \prod_{p\leq p_1,\, p\ne 7} f_1(p) \leq \frac{e^{0.74914 x^{1/3} - \log(1+7^{-2/3})}}{5.8478} \leq \frac{e^{0.74914 x^{1/3}}}{7.44586},\]
and $\sum_{p\leq p_1,\, p\ne 7} (\log p)/p < \log p_1 - (\log 7)/7$. So, for $q < \prod_{p\leq p_0,\, p\ne 7} p$, and $p_1 \geq 29$ the prime immediately preceding $p_0$,
\[\lambda(q) \leq \left(\frac{1.633 \log p_1 \cdot 7.45235 \cdot \frac{e^{0.74914 p_1^{1/3}}}{7.44586}}{0.37268\left(\log q - \log p_1 + \frac{\log 7}{7}\right) + 0.02741}\right)^3 \leq \frac{84.351 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.35152)^3}.\]
Thus we obtain, just like before, that
\[\varpi_0(3.3\cdot 10^9) \geq 477.465, \qquad \lambda(3.3\cdot 10^9) \leq 475.513 < 477.465.\]
We also check that $\varpi_0(q_0) \geq 916.322$ is greater than $\lambda(q_0) \leq 429.731$ for $q_0 = \prod_{p\leq 31,\, p\ne 7} p$. The analysis for $q \geq \prod_{p\leq 37,\, p\ne 7} p$ is also just like before: since $\log q \geq 0.8009 p_1 - \log 7$, we have to show that
\[\frac{e^{0.224 t}}{7} > \frac{84.351 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3}\]
for $t \geq 37$, and that, in turn, follows from
\[0.224 t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.74849,\]

which we check for $t \geq 37$ just as we checked (12.40).

We conclude that $\varpi(q) > \lambda(q)$ if $q \geq 3.3\cdot 10^9$ and $210 \nmid q$.

Computation. Now, for $q < 3.3\cdot 10^9$ (and also for $3.3\cdot 10^9 \leq q < 2.2\cdot 10^{10}$, $210 | q$) we need to check that the maximum $m_{q,R,1}$ of $\mathrm{err}_{q,R}$ over all $\varpi(q) \leq R < \lambda(q)$ satisfies (12.31). Note that there is a term $\mathrm{err}_{q,tR}$ in (12.31); we bound it using (12.32).

Since $\log R$ is increasing on $R$ and $G_q(R)$ depends only on $\lfloor R\rfloor$, we can tell from (12.24) that, since we are taking the maximum of $\mathrm{err}_{q,R}$, it is enough to check integer values of $R$. We check all integers $R$ in $[\varpi(q), \lambda(q))$ for all $q < 3.3\cdot 10^9$ (and all $3.3\cdot 10^9 \leq q < 2.2\cdot 10^{10}$, $210 | q$) by an explicit computation.$^2$

Finally, we have the trivial bound
\[\frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq 1, \tag{12.41}\]
which we shall use for $Q_0$ close to $Q$.

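Corollary 12.2.5. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \leq \sqrt{x}$. Let $Q_0 \geq 10^5$, $\delta_0 \geq 1$ be such that $(20000 Q_0)^2 \leq x/2\delta_0$; set $Q = \sqrt{x/2\delta_0}$.

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Let $\mathfrak{M}$ as in (12.1). Then, if $Q_0 \leq Q^{0.6}$,
\[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \leq \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p\geq 2} (\log p)/(p(p-1)) = 1.3325822\dotsc$.

Let $\mathfrak{M}_{\delta_0,Q_0}$ as in (10.5). Then, if $(2Q_0) \leq (2Q)^{0.6}$,
\[\int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \leq \frac{\log 2Q_0 + c_+}{\log 2Q + c_E} \sum_n |a_n|^2. \tag{12.42}\]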

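Here, of course, $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2$ (Plancherel). If $Q_0 > Q^{0.6}$, we will use the trivial bound
\[\int_{\mathfrak{M}_{\delta_0,r}} |S(\alpha)|^2\, d\alpha \leq \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \tag{12.43}\]

Proof. Immediate from Prop. 12.1.1, Prop. 12.1.2 and Prop. 12.2.4.

Obviously, one can also give a statement derived from Prop. 12.1.1; the resulting bound is
\[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \leq \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
where $\mathfrak{M}$ is as in (12.1).

We also record the large-sieve form of the result.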

$^2$ This is by far the heaviest computation in the present work, though it is still rather minor (about two weeks of computing on a single core of a fairly new (2010) desktop computer carrying out other tasks as well; this is next to nothing compared to the computations in [Plab], or even those in [HP13]). For the applications here, we could have assumed $\rho \leq 8/15$, and that would have reduced computation time drastically; the lighter assumption $\rho \leq 0.6$ was made with views to general applicability in the future. As elsewhere in this section, numerical computations were carried out by the author in C; all floating-point operations used D. Platt's interval arithmetic package.


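Corollary 12.2.6. Let $N \geq 1$. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the integers $n \leq N$. Let $Q_0 \geq 10^5$, $Q \geq 20000 Q_0$. Assume that $a_n = 0$ for every $n$ for which there is a $p \leq Q$ dividing $n$.

Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then, if $Q_0 \leq Q^{0.6}$,
\[\sum_{q\leq Q_0} \sum_{\substack{a \bmod q\\ (a,q)=1}} |S(a/q)|^2 \leq \frac{\log Q_0 + c_+}{\log Q + c_E} \cdot (N + Q^2) \sum_n |a_n|^2,\]
where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p\geq 2} (\log p)/(p(p-1)) = 1.3325822\dotsc$.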

Proof. Proceed as Ramaré does in the proof of [Ram09, Thm. 5.2] (with $K_q = \{a \in \mathbb{Z}/q\mathbb{Z} : (a,q) = 1\}$ and $u_n = a_n$); in particular, apply [Ram09, Thm. 2.1]. The proof of [Ram09, Thm. 5.2] shows that
\[\sum_{q\leq Q_0} \sum_{\substack{a \bmod q\\ (a,q)=1}} |S(a/q)|^2 \leq \max_{q\leq Q_0} \frac{G_q(Q_0)}{G_q(Q)} \cdot \sum_{q\leq Q_0} \sum_{\substack{a \bmod q\\ (a,q)=1}} |S(a/q)|^2.\]
Now, instead of using the easy inequality $G_q(Q_0)/G_q(Q) \leq G_1(Q_0)/G_1(Q/Q_0)$, use Prop. 12.2.4.

It would seem desirable to prove a result such as Prop. 12.2.4 (or Cor. 12.2.5, or Cor. 12.2.6) without computations and with conditions that are as weak as possible. Since, as we said, we cannot make $c_+$ equal to $c_E$, and since $c_+$ does have to increase when the conditions are weakened (as is shown by computations; this is not an artifact of our method of proof), the right goal might be to show that the maximum of $G_q(Q_0/sq)/G_q(Q/sq)$ is reached when $s = q = 1$.

However, this is also untrue without conditions. For instance, for $Q_0 = 2$ and $Q$ large, the value of $G_q(Q_0/q)/G_q(Q/q)$ at $q = 2$ is larger than at $q = 1$: by (12.12),
\[\frac{G_2\left(\frac{Q_0}{2}\right)}{G_2\left(\frac{Q}{2}\right)} \sim \frac{1}{\frac{1}{2}\left(\log \frac{Q}{2} + c_E + \frac{\log 2}{2}\right)} = \frac{2}{\log Q + c_E - \frac{\log 2}{2}} > \frac{2}{\log Q + c_E} \sim \frac{G(Q_0)}{G(Q)}.\]

Thus, at the very least, a lower bound on $Q_0$ is needed as a condition. This also dims the hopes somewhat for a combinatorial proof of $G_q(Q_0/q) G(Q) \leq G_q(Q/q) G(Q_0)$; at any rate, while such a proof would be welcome, it could not be extremely straightforward, since there are terms in $G_q(Q_0/q) G(Q)$ that do not appear in $G_q(Q/q) G(Q_0)$.
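The counterexample with $Q_0 = 2$ can be observed directly. The sketch below is in Python and assumes the usual definition $G_q(R) = \sum_{r\leq R,\ (r,q)=1} \mu^2(r)/\phi(r)$, with $G = G_1$ (this is our reading of the notation; the helper names are ours). It uses plain floating point and the illustrative choice $Q = 10^4$.

```python
from math import gcd

def mu2_over_phi(limit):
    """Return a list w with w[r] = mu(r)^2 / phi(r) for 1 <= r <= limit (w[0] unused)."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:  # phi[p] untouched so far, hence p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, limit + 1, p * p):
                squarefree[m] = False
    return [0.0] + [1.0 / phi[r] if squarefree[r] else 0.0
                    for r in range(1, limit + 1)]

def G(q, R, weights):
    """G_q(R): sum of mu(r)^2 / phi(r) over r <= R with gcd(r, q) = 1."""
    return sum(weights[r] for r in range(1, int(R) + 1) if gcd(r, q) == 1)

Q0, Q = 2, 10 ** 4          # Q is an arbitrary illustrative value
w = mu2_over_phi(Q)
ratio_q1 = G(1, Q0, w) / G(1, Q, w)          # G(Q0) / G(Q)
ratio_q2 = G(2, Q0 / 2, w) / G(2, Q / 2, w)  # G_2(Q0/2) / G_2(Q/2)
print(ratio_q1, ratio_q2)
```

In fact the inequality here is exact rather than merely asymptotic: every even squarefree $r$ is $2m$ with $m$ odd and $\phi(2m) = \phi(m)$, so $G(Q) = G_2(Q) + G_2(Q/2) > 2 G_2(Q/2)$, while $G(2) = 2$ and $G_2(1) = 1$.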


      Chapter 13

      The integral over the minor arcs

The time has come to bound the part of our triple-product integral (10.3) that comes from the minor arcs $\mathfrak{m} \subset \mathbb{R}/\mathbb{Z}$. We have an $\ell_\infty$ estimate (from Prop. 11.2.3, based on Theorem 3.1.1) and an $\ell_2$ estimate (from §12.2). Now we must put them together.

There are two ways in which we must be careful. A trivial bound of the form $\ell_3^3 = \int |S(\alpha)|^3\, d\alpha \leq \ell_2^2 \cdot \ell_\infty$ would introduce a fatal factor of $\log x$ coming from $\ell_2$. We avoid this by using the fact that we have $\ell_2$ estimates over $\mathfrak{M}_{\delta_0,Q_0}$ for varying $Q_0$.

We must also remember to subtract the major-arc contribution from our estimate for $\mathfrak{M}_{\delta_0,Q_0}$; this is why we were careful to give a lower bound in Lem. 10.3.1, as opposed to just the upper bound (10.28).

13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds

Let us start with a simple lemma, essentially a way to obtain upper bounds by means of summation by parts.

Lemma 13.1.1. Let $f, g : \{a, a+1, \dotsc, b\} \to \mathbb{R}_0^+$, where $a, b \in \mathbb{Z}^+$. Assume that, for all $x \in [a,b]$,
\[\sum_{a\leq n\leq x} f(n) \leq F(x), \tag{13.1}\]
where $F : [a,b] \to \mathbb{R}$ is continuous, piecewise differentiable and non-decreasing. Then
\[\sum_{n=a}^{b} f(n)\cdot g(n) \leq \left(\max_{n\geq a} g(n)\right)\cdot F(a) + \int_a^b \left(\max_{n\geq u} g(n)\right)\cdot F'(u)\, du.\]

Proof. Let $S(n) = \sum_{m=a}^{n} f(m)$. Then, by partial summation,
\[\sum_{n=a}^{b} f(n)\cdot g(n) \leq S(b) g(b) + \sum_{n=a}^{b-1} S(n)(g(n) - g(n+1)). \tag{13.2}\]
Let $h(x) = \max_{x\leq n\leq b} g(n)$. Then $h$ is non-increasing. Hence (13.1) and (13.2) imply that
\[\begin{aligned}
\sum_{n=a}^{b} f(n) g(n) &\leq \sum_{n=a}^{b} f(n) h(n)\\
&\leq S(b) h(b) + \sum_{n=a}^{b-1} S(n)(h(n) - h(n+1))\\
&\leq F(b) h(b) + \sum_{n=a}^{b-1} F(n)(h(n) - h(n+1)).
\end{aligned}\]
In general, for $\alpha_n \in \mathbb{C}$, $A(x) = \sum_{a\leq n\leq x} \alpha_n$, and $F$ continuous and piecewise differentiable on $[a,x]$,
\[\sum_{a\leq n\leq x} \alpha_n F(n) = A(x) F(x) - \int_a^x A(u) F'(u)\, du \quad \text{(Abel summation)}. \tag{13.3}\]
Applying this with $\alpha_n = h(n) - h(n+1)$ and $A(x) = \sum_{a\leq n\leq x} \alpha_n = h(a) - h(\lfloor x\rfloor + 1)$, we obtain
\[\begin{aligned}
\sum_{n=a}^{b-1} F(n)(h(n) - h(n+1)) &= (h(a) - h(b)) F(b-1) - \int_a^{b-1} (h(a) - h(\lfloor u\rfloor + 1)) F'(u)\, du\\
&= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(\lfloor u\rfloor + 1) F'(u)\, du\\
&= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(u) F'(u)\, du\\
&= h(a) F(a) - h(b) F(b) + \int_a^{b} h(u) F'(u)\, du,
\end{aligned}\]
since $h(\lfloor u\rfloor + 1) = h(u)$ for $u \notin \mathbb{Z}$. Hence
\[\sum_{n=a}^{b} f(n) g(n) \leq h(a) F(a) + \int_a^b h(u) F'(u)\, du.\]
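Lemma 13.1.1 is easy to test numerically. The following Python sketch uses illustrative choices that are ours, not the text's: $f(n) = 1/n$, $F(x) = 1/a + \log(x/a)$ (which dominates the partial sums of $f$), and a non-monotonic $g$. It relies on the fact, used in the proof, that $h(u) = \max_{u\leq n\leq b} g(n)$ is constant on each interval $(n-1, n]$, so the integral reduces to a finite sum.

```python
import math

def suffix_max(g, a, b):
    """h(n) = max of g(m) over n <= m <= b, for each integer a <= n <= b."""
    h, running = {}, float("-inf")
    for n in range(b, a - 1, -1):
        running = max(running, g(n))
        h[n] = running
    return h

def lemma_rhs(g, F, a, b):
    """h(a) F(a) + integral_a^b h(u) F'(u) du; h is constant on each (n-1, n]."""
    h = suffix_max(g, a, b)
    return h[a] * F(a) + sum(h[n] * (F(n) - F(n - 1)) for n in range(a + 1, b + 1))

def lemma_lhs(f, g, a, b):
    """The weighted sum of f(n) g(n) over a <= n <= b."""
    return sum(f(n) * g(n) for n in range(a, b + 1))

# Illustrative data (assumptions of this sketch, not taken from the text).
a, b = 3, 200
f = lambda n: 1.0 / n
F = lambda x: 1.0 / a + math.log(x / a)      # upper bound for sum_{a<=n<=x} 1/n
g = lambda n: (2.0 + math.sin(n)) / math.sqrt(n)

print(lemma_lhs(f, g, a, b), lemma_rhs(g, F, a, b))
```

The design mirrors the proof: replacing $g$ by its running maximum $h$ costs nothing in the hypothesis and makes the summation by parts go through with non-negative differences.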

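We will now see our main application of Lemma 13.1.1. We have to bound an integral of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha$, where $\mathfrak{M}_{\delta_0,r}$ is a union of arcs defined as in (10.5). Our inputs are (a) a bound on integrals of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha$, (b) a bound on $|S_2(\alpha)|$ for $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}$. The input of type (a) is what we derived in §12.1 and §12.2; the input of type (b) is a minor-arcs bound, and as such was the main subject of Part I.

Proposition 13.1.2. Let $S_1(\alpha) = \sum_n a_n e(\alpha n)$, $a_n \in \mathbb{C}$, $\{a_n\}$ in $L^1$. Let $S_2 : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ be continuous. Define $\mathfrak{M}_{\delta_0,r}$ as in (10.5).

Let $r_0$ be a positive integer not greater than $r_1$. Let $H : [r_0, r_1] \to \mathbb{R}^+$ be a continuous, piecewise differentiable, non-decreasing function such that
\[\frac{1}{\sum |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1}} |S_1(\alpha)|^2\, d\alpha \leq H(r) \tag{13.4}\]
for some $\delta_0 \leq x/2r_1^2$ and all $r \in [r_0, r_1]$. Assume, moreover, that $H(r_1) = 1$. Let $g : [r_0, r_1] \to \mathbb{R}^+$ be a non-increasing function such that
\[\max_{\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}} |S_2(\alpha)| \leq g(r) \tag{13.5}\]
for all $r \in [r_0, r_1]$ and $\delta_0$ as above. Then
\[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \leq g(r_0)\cdot (H(r_0) - I_0) + \int_{r_0}^{r_1} g(r) H'(r)\, dr, \tag{13.6}\]
where
\[I_0 = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha. \tag{13.7}\]

The condition $\delta_0 \leq x/2r_1^2$ is there just to ensure that the arcs in the definition of $\mathfrak{M}_{\delta_0,r}$ do not overlap for $r \leq r_1$.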

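Proof. For $r_0 \leq r < r_1$, let
\[f(r) = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1} \setminus \mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha.\]
Let
\[f(r_1) = \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_1}} |S_1(\alpha)|^2\, d\alpha.\]
Then, by (13.5),
\[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \leq \sum_{r=r_0}^{r_1} f(r) g(r).\]
By (13.4),
\[\begin{aligned}
\sum_{r_0\leq r\leq x} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1} \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha\\
&= \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1}} |S_1(\alpha)|^2\, d\alpha\right) - I_0 \leq H(x) - I_0
\end{aligned} \tag{13.8}\]
for $x \in [r_0, r_1)$. Moreover,
\[\begin{aligned}
\sum_{r_0\leq r\leq r_1} f(r) &= \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\\
&= \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathbb{R}/\mathbb{Z}} |S_1(\alpha)|^2\right) - I_0 = 1 - I_0 = H(r_1) - I_0.
\end{aligned}\]
We let $F(x) = H(x) - I_0$ and apply Lemma 13.1.1 with $a = r_0$, $b = r_1$. We obtain that
\[\begin{aligned}
\sum_{r=r_0}^{r_1} f(r) g(r) &\leq \left(\max_{r\geq r_0} g(r)\right) F(r_0) + \int_{r_0}^{r_1} \left(\max_{r\geq u} g(r)\right) F'(u)\, du\\
&\leq g(r_0)(H(r_0) - I_0) + \int_{r_0}^{r_1} g(u) H'(u)\, du.
\end{aligned}\]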

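13.2 The minor-arc total

We now apply Prop. 13.1.2. Inevitably, the main statement involves some integrals that will have to be evaluated at the end of the section.

Theorem 13.2.1. Let $x \geq 10^{25}\cdot \kappa$, where $\kappa \geq 1$. Let
\[S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x). \tag{13.9}\]
Let $\eta_*(t) = (\eta_2 \ast_M \phi)(\kappa t)$, where $\eta_2$ is as in (11.10) and $\phi : [0,\infty) \to [0,\infty)$ is continuous and in $\ell_1$. Let $\eta_+ : [0,\infty) \to [0,\infty)$ be a bounded, piecewise differentiable function with $\lim_{t\to\infty} \eta_+(t) = 0$. Let $\mathfrak{M}_{\delta_0,r}$ be as in (10.5) with $\delta_0 = 8$. Let $10^5 \leq r_0 < r_1$, where $r_1 = (3/8)(x/\kappa)^{4/15}$. Let $g(r) = g_{x/\kappa,\phi}(r)$, where
\[g_{y,\phi}(r) = \frac{(R_{y,K,\phi,2r} \log 2r + 0.5)\sqrt{\digamma(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36 K^{1/6} y^{-1/6}, \tag{13.10}\]
just as in (11.19), and $K = \log(x/\kappa)/2$. Here $R_{y,K,\phi,t}$ is as in (11.19), and $L_t$ is as in (11.13).

Denote
\[Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha.\]
Then
\[Z_{r_0} \leq \left(\sqrt{\frac{|\phi|_1 x}{\kappa} (M + T)} + \sqrt{S_{\eta_*}(0, x)\cdot E}\right)^2,\]
where
\[\begin{aligned}
S &= \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x),\\
T &= C_{\phi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right)\cdot (S - (\sqrt{J} - \sqrt{E})^2),\\
J &= \int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha,\\
E &= \left((C_{\eta_+,0} + C_{\eta_+,2}) \log x + (2 C_{\eta_+,0} + C_{\eta_+,1})\right)\cdot x^{1/2},
\end{aligned} \tag{13.11}\]
\[\begin{aligned}
C_{\eta_+,0} &= 0.7131 \int_0^\infty \frac{1}{\sqrt{t}} \left(\sup_{r\geq t} \eta_+(r)\right)^2 dt,\\
C_{\eta_+,1} &= 0.7131 \int_1^\infty \frac{\log t}{\sqrt{t}} \left(\sup_{r\geq t} \eta_+(r)\right)^2 dt,\\
C_{\eta_+,2} &= 0.51942 |\eta_+|_\infty^2,\\
C_{\phi,3}(K) &= \frac{1.04488}{|\phi|_1} \int_0^{1/K} |\phi(w)|\, dw
\end{aligned} \tag{13.12}\]
and
\[\begin{aligned}
M = {} & g(r_0)\cdot \left(\frac{\log(r_0+1) + c_+}{\log\sqrt{x} + c_-}\cdot S - (\sqrt{J} - \sqrt{E})^2\right)\\
& + \left(\frac{2}{\log x + 2c_-} \int_{r_0}^{r_1} \frac{g(r)}{r}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}\right) g(r_1)\right)\cdot S,
\end{aligned} \tag{13.13}\]
where $c_+ = 2.0532$ and $c_- = 0.6394$.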

Proof. Let $y = x/\kappa$. Let $Q = (3/4) y^{2/3}$, as in Thm. 3.1.1 (applied with $y$ instead of $x$). Let $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$, where $r_0 \leq r \leq y^{1/3}/6$ and $y$ is used instead of $x$ to define $\mathfrak{M}_{8,r}$ (see (10.5)). There exists an approximation $2\alpha = a/q + \delta/y$ with $q \leq Q$, $|\delta|/y \leq 1/qQ$. Thus, $\alpha = a'/q' + \delta/2y$, where either $a'/q' = a/2q$ or $a'/q' = (a+q)/2q$ holds. (In particular, if $q'$ is odd, then $q' = q$; if $q'$ is even, then $q' = 2q$.)

There are three cases.

1. $q \leq r$. Then either (a) $q'$ is odd and $q' \leq r$ or (b) $q'$ is even and $q' \leq 2r$. Since $\alpha$ is not in $\mathfrak{M}_{8,r}$, then, by definition (10.5), $|\delta|/2y \geq \delta_0 r/2qy$, and so $|\delta| \geq \delta_0 r/q = 8r/q$. In particular, $|\delta| \geq 8$.

Thus, by Prop. 11.2.3,
\[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 \ast_M \phi}(\alpha, y)| \leq g_{y,\phi}\left(\frac{|\delta|}{8} q\right)\cdot |\phi|_1 y \leq g_{y,\phi}(r)\cdot |\phi|_1 y, \tag{13.14}\]
where we use the fact that $g(r)$ is a non-increasing function (Lemma 11.2.4).

2. $r < q \leq y^{1/3}/6$. Then, by Prop. 11.2.3 and Lemma 11.2.4,
\[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 \ast_M \phi}(\alpha, y)| \leq g_{y,\phi}\left(\max\left(\frac{|\delta|}{8}, 1\right) q\right)\cdot |\phi|_1 y \leq g_{y,\phi}(r)\cdot |\phi|_1 y. \tag{13.15}\]

3. $q > y^{1/3}/6$. Again by Prop. 11.2.3,
\[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 \ast_M \phi}(\alpha, y)| \leq \left(h\left(\frac{y}{K}\right) + C_{\phi,3}(K)\right) |\phi|_1 y, \tag{13.16}\]
where $h(x)$ is as in (11.15). (Of course, $C_{\phi,3}(K)$, as in (13.12), is equal to $C_{\phi,0,K}/|\phi|_1$, where $C_{\phi,0,K}$ is as in (11.21).) We set $K = (\log y)/2$. Since $y = x/\kappa \geq 10^{25}$, it follows that $y/K = 2y/\log y > 3.47\cdot 10^{23} > 2.16\cdot 10^{20}$.

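Let
\[r_1 = \frac{3}{8} y^{4/15}, \qquad g(r) = \begin{cases} g_{y,\phi}(r) & \text{if } r \leq r_1,\\ g_{y,\phi}(r_1) & \text{if } r > r_1. \end{cases}\]
By Lemma 11.2.4, for $r \geq 670$, $g(r)$ is a non-increasing function and $g(r) \geq g_{y,\phi}(r)$. Moreover, by Lemma 11.2.5, $g_{y,\phi}(r_1) \geq h(2y/\log y)$, where $h$ is as in (11.15), and so $g(r) \geq h(2y/\log y)$ for all $r \geq r_0 \geq 670$. Thus, we have shown that
\[|S_{\eta_*}(\alpha, x)| \leq \left(g(r) + C_{\phi,3}\left(\frac{\log y}{2}\right)\right)\cdot |\phi|_1 y \tag{13.17}\]
for all $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$.

We first need to undertake the fairly dull task of getting non-prime or small $n$ out of the sum defining $S_{\eta_+}(\alpha, x)$. Write
\[\begin{aligned}
S_{1,\eta_+}(\alpha, x) &= \sum_{p > \sqrt{x}} (\log p) e(\alpha p) \eta_+(p/x),\\
S_{2,\eta_+}(\alpha, x) &= \sum_{\substack{n \text{ non-prime}\\ n > \sqrt{x}}} \Lambda(n) e(\alpha n) \eta_+(n/x) + \sum_{n \leq \sqrt{x}} \Lambda(n) e(\alpha n) \eta_+(n/x).
\end{aligned}\]
By the triangle inequality (with weights $|S_{\eta_*}(\alpha, x)|$),
\[\sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{\eta_+}(\alpha, x)|^2\, d\alpha} \leq \sum_{j=1}^{2} \sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{j,\eta_+}(\alpha, x)|^2\, d\alpha}.\]
Clearly,
\[\begin{aligned}
\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha &\leq \max_{\alpha \in \mathbb{R}/\mathbb{Z}} |S_{\eta_*}(\alpha, x)| \cdot \int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha\\
&\leq \sum_{n=1}^{\infty} \Lambda(n) \eta_*(n/x) \cdot \left(\sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 + \sum_{n \leq \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2\right).
\end{aligned}\]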

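Let $\overline{\eta_+}(z) = \sup_{t\geq z} \eta_+(t)$. Since $\eta_+(t)$ tends to $0$ as $t \to \infty$, so does $\overline{\eta_+}$. By [RS62, Thm. 13], partial summation and integration by parts,
\[\begin{aligned}
\sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 &\leq \sum_{n \text{ non-prime}} \Lambda(n)^2 \overline{\eta_+}(n/x)^2\\
&\leq -\int_1^\infty \Biggl(\sum_{\substack{n \leq t\\ n \text{ non-prime}}} \Lambda(n)^2\Biggr) \left(\overline{\eta_+}^2(t/x)\right)' dt\\
&\leq -\int_1^\infty (\log t)\cdot 1.4262 \sqrt{t} \left(\overline{\eta_+}^2(t/x)\right)' dt\\
&\leq 0.7131 \int_1^\infty \frac{\log e^2 t}{\sqrt{t}} \cdot \overline{\eta_+}^2\left(\frac{t}{x}\right) dt\\
&= \left(0.7131 \int_{1/x}^\infty \frac{2 + \log tx}{\sqrt{t}}\, \overline{\eta_+}^2(t)\, dt\right) \sqrt{x},
\end{aligned}\]
while, by [RS62, Thm. 12],
\[\sum_{n \leq \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2 \leq \frac{1}{2} |\eta_+|_\infty^2 (\log x) \sum_{n \leq \sqrt{x}} \Lambda(n) \leq 0.51942 |\eta_+|_\infty^2 \cdot \sqrt{x} \log x.\]
This shows that
\[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \leq \sum_{n=1}^{\infty} \Lambda(n) \eta_*(n/x) \cdot E = S_{\eta_*}(0, x)\cdot E,\]
where $E$ is as in (13.11).

It remains to bound
\[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)| |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.18}\]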

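We wish to apply Prop. 13.1.2. Corollary 12.2.5 gives us an input of type (13.4); we have just derived a bound (13.17) that provides an input of type (13.5). More precisely, by (12.42), (13.4) holds with
\[H(r) = \begin{cases} \dfrac{\log(r+1) + c_+}{\log\sqrt{x} + c_-} & \text{if } r < r_1,\\ 1 & \text{if } r \geq r_1, \end{cases}\]
where $c_+ = 2.0532 > \log 2 + 1.36$ and $c_- = 0.6394 < \log(1/\sqrt{2\cdot 8}) + \log 2 + 1.3325822$. (We can apply Corollary 12.2.5 because $2(r_1 + 1) = (3/4) x^{4/15} + 2 \leq (2\sqrt{x/16})^{0.6}$ for $x \geq 10^{25}$ (or even for $x \geq 100000$).) Since $r_1 = (3/8) y^{4/15}$ and $x \geq 10^{25}\cdot \kappa$,
\[\begin{aligned}
\lim_{r\to r_1^+} H(r) - \lim_{r\to r_1^-} H(r) &= 1 - \frac{\log((3/8)(x/\kappa)^{4/15} + 1) + c_+}{\log\sqrt{x} + c_-}\\
&\leq 1 - \left(\frac{4/15}{1/2} + \frac{\log\frac{3}{8} + c_+ - \frac{4}{15}\log\kappa - \frac{8}{15} c_-}{\log\sqrt{x} + c_-}\right)\\
&\leq \frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}.
\end{aligned}\]
We also have (13.5) with
\[\left(g(r) + C_{\phi,3}\left(\frac{\log y}{2}\right)\right)\cdot |\phi|_1 y \tag{13.19}\]
instead of $g(r)$ (by (13.17)). Here (13.19) is a non-increasing function of $r$ because $g(r)$ is, as we already checked. Hence, Prop. 13.1.2 gives us that (13.18) is at most
\[\begin{aligned}
& g(r_0)\cdot (H(r_0) - I_0) + (1 - I_0)\cdot C_{\phi,3}\left(\frac{\log y}{2}\right)\\
& + \frac{1}{\log\sqrt{x} + c_-} \int_{r_0}^{r_1} \frac{g(r)}{r+1}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}\right) g(r_1)
\end{aligned} \tag{13.20}\]
times $|\phi|_1 y \cdot \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)$, where
\[I_0 = \frac{1}{\sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)} \int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.21}\]
By the triangle inequality,
\[\begin{aligned}
\sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha} &= \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x) - S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}\\
&\geq \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}\\
&\geq \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}.
\end{aligned}\]
As we already showed,
\[\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha = \sum_{\substack{n \text{ non-prime}\\ \text{or } n \leq \sqrt{x}}} \Lambda(n)^2 \eta_+(n/x)^2 \leq E.\]
Thus,
\[I_0 \cdot S \geq (\sqrt{J} - \sqrt{E})^2,\]
and so we are done.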

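We now should estimate the integral $\int_{r_0}^{r_1} (g(r)/r)\, dr$ in (13.13). It is easy to see that
\[\begin{aligned}
\int_{r_0}^\infty \frac{1}{r^{3/2}}\, dr &= \frac{2}{r_0^{1/2}}, & \int_{r_0}^\infty \frac{\log r}{r^2}\, dr &= \frac{\log e r_0}{r_0}, & \int_{r_0}^\infty \frac{1}{r^2}\, dr &= \frac{1}{r_0},\\
\int_{r_0}^{r_1} \frac{1}{r}\, dr &= \log\frac{r_1}{r_0}, & \int_{r_0}^\infty \frac{\log r}{r^{3/2}}\, dr &= \frac{2\log e^2 r_0}{\sqrt{r_0}}, & \int_{r_0}^\infty \frac{\log 2r}{r^{3/2}}\, dr &= \frac{2\log 2e^2 r_0}{\sqrt{r_0}},\\
\int_{r_0}^\infty \frac{(\log 2r)^2}{r^{3/2}}\, dr &= \frac{2 P_2(\log 2r_0)}{\sqrt{r_0}}, & \int_{r_0}^\infty \frac{(\log 2r)^3}{r^{3/2}}\, dr &= \frac{2 P_3(\log 2r_0)}{r_0^{1/2}},
\end{aligned} \tag{13.22}\]
where
\[P_2(t) = t^2 + 4t + 8, \qquad P_3(t) = t^3 + 6t^2 + 24t + 48. \tag{13.23}\]
We also have
\[\int_{r_0}^\infty \frac{dr}{r^2 \log r} = E_1(\log r_0), \tag{13.24}\]
where $E_1$ is the exponential integral
\[E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt.\]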
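The closed forms in (13.22) involving $\log 2r$ can be sanity-checked by quadrature. The sketch below (Python, plain floating point; the helper names and the Simpson-rule parameters are our choices) substitutes $r = r_0 e^s$, which turns $\int_{r_0}^\infty (\log 2r)^k r^{-3/2}\, dr$ into $r_0^{-1/2} \int_0^\infty (\log 2r_0 + s)^k e^{-s/2}\, ds$, and compares the result with $2/\sqrt{r_0}$, $2\log(2e^2 r_0)/\sqrt{r_0}$, $2P_2(\log 2r_0)/\sqrt{r_0}$ and $2P_3(\log 2r_0)/\sqrt{r_0}$.

```python
import math

def simpson(fn, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * fn(lo + i * h)
    return total * h / 3

def tail_integral(k, r0, s_max=200.0, n=20000):
    """Approximate the integral of (log 2r)^k / r^(3/2) over [r0, infinity).

    After r = r0 * exp(s) the integrand is (log(2 r0) + s)^k exp(-s/2) / sqrt(r0);
    the range is truncated at s_max, where the integrand is negligible.
    """
    c = math.log(2 * r0)
    return simpson(lambda s: (c + s) ** k * math.exp(-s / 2) / math.sqrt(r0),
                   0.0, s_max, n)

def P2(t):
    return t * t + 4 * t + 8

def P3(t):
    return t ** 3 + 6 * t * t + 24 * t + 48

def closed_form(k, r0):
    """Right-hand sides of (13.22) for the integrands (log 2r)^k / r^(3/2), k = 0..3."""
    t = math.log(2 * r0)
    poly = {0: 1.0, 1: t + 2.0, 2: P2(t), 3: P3(t)}[k]
    return 2 * poly / math.sqrt(r0)

r0 = 150000
for k in range(4):
    print(k, tail_integral(k, r0), closed_form(k, r0))
```

The case $k = 1$ uses $\log 2e^2 r_0 = \log 2r_0 + 2$; the polynomials $P_2$, $P_3$ arise from repeated integration by parts against $e^{-s/2}$.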

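We must also estimate the integrals
\[\int_{r_0}^{r_1} \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr, \quad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^2}\, dr, \quad \int_{r_0}^{r_1} \frac{\digamma(r) \log r}{r^2}\, dr, \quad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr. \tag{13.25}\]
Clearly, $\digamma(r) - e^\gamma \log\log r = 2.50637/\log\log r$ is decreasing on $r$. Hence, for $r \geq 10^5$,
\[\digamma(r) \leq e^\gamma \log\log r + c_\gamma,\]
where $c_\gamma = 1.025742$. Let $F(t) = e^\gamma \log t + c_\gamma$. Then $F''(t) = -e^\gamma/t^2 < 0$. Hence
\[\frac{d^2 \sqrt{F(t)}}{dt^2} = \frac{F''(t)}{2\sqrt{F(t)}} - \frac{(F'(t))^2}{4 (F(t))^{3/2}} < 0\]
for all $t > 0$. In other words, $\sqrt{F(t)}$ is convex-down, and so we can bound $\sqrt{F(t)}$ from above by $\sqrt{F(t_0)} + (\sqrt{F})'(t_0)\cdot (t - t_0)$, for any $t \geq t_0 > 0$. Hence, for $r \geq r_0 \geq 10^5$,
\[\begin{aligned}
\sqrt{\digamma(r)} \leq \sqrt{F(\log r)} &\leq \sqrt{F(\log r_0)} + \frac{d\sqrt{F(t)}}{dt}\Big|_{t=\log r_0} \cdot \log\frac{r}{r_0}\\
&= \sqrt{F(\log r_0)} + \frac{e^\gamma}{\sqrt{F(\log r_0)}} \cdot \frac{\log\frac{r}{r_0}}{2\log r_0}.
\end{aligned}\]
Thus, by (13.22),
\[\begin{aligned}
\int_{r_0}^\infty \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr &\leq \sqrt{F(\log r_0)} \left(2 - \frac{e^\gamma}{F(\log r_0)}\right) \frac{1}{\sqrt{r_0}} + \frac{e^\gamma}{\sqrt{F(\log r_0)} \log r_0} \cdot \frac{\log e^2 r_0}{\sqrt{r_0}}\\
&= \frac{2\sqrt{F(\log r_0)}}{\sqrt{r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right).
\end{aligned} \tag{13.26}\]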

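The other integrals in (13.25) are easier. Just as in (13.26), we extend the range of integration to $[r_0, \infty]$. Using (13.22) and (13.24), we obtain
\[\begin{aligned}
\int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\leq \int_{r_0}^\infty \frac{F(\log r)}{r^2}\, dr = e^\gamma \left(\frac{\log\log r_0}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma}{r_0},\\
\int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\leq e^\gamma \left(\frac{(1 + \log r_0) \log\log r_0 + 1}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma \log e r_0}{r_0}.
\end{aligned}\]
By [OLBC10, (6.8.2)],
\[\frac{1}{r(\log r + 1)} \leq E_1(\log r) \leq \frac{1}{r \log r}.\]
(The second inequality is obvious.) Hence
\[\begin{aligned}
\int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\leq \frac{e^\gamma (\log\log r_0 + 1/\log r_0) + c_\gamma}{r_0},\\
\int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\leq \frac{e^\gamma \left(\log\log r_0 + \frac{1}{\log r_0}\right) + c_\gamma}{r_0} \cdot \log e r_0.
\end{aligned}\]
Finally,
\[\begin{aligned}
\int_{r_0}^\infty \frac{\digamma(r)}{r^{3/2}} &\leq e^\gamma \left(\frac{2\log\log r_0}{\sqrt{r_0}} + 2 E_1\left(\frac{\log r_0}{2}\right)\right) + \frac{2 c_\gamma}{\sqrt{r_0}}\\
&\leq \frac{2}{\sqrt{r_0}} \left(F(\log r_0) + \frac{2 e^\gamma}{\log r_0}\right).
\end{aligned} \tag{13.27}\]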

It is time to estimate
\[\int_{r_0}^{r_1} \frac{R_{z,2r} \log 2r \sqrt{\digamma(r)}}{r^{3/2}}\, dr, \tag{13.28}\]
where $z = y$ or $z = y/((\log y)/2)$ (and $y = x/\kappa$, as before), and where $R_{z,t}$ is as defined in (11.13). By Cauchy-Schwarz, (13.28) is at most
\[\sqrt{\int_{r_0}^{r_1} \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr} \cdot \sqrt{\int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr}. \tag{13.29}\]
We have already bounded the second integral. Let us look at the first one. We can write $R_{z,t} = 0.27125 R^{\circ}_{z,t} + 0.41415$, where
\[R^{\circ}_{z,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9 z^{1/3}}{2.004 t}}\right). \tag{13.30}\]
Clearly,
\[R^{\circ}_{z,e^t/4} = \log\left(1 + \frac{t/2}{\log\frac{36 z^{1/3}}{2.004} - t}\right).\]
Now, for $f(t) = \log(c + at/(b-t))$ and $t \in [0, b)$,
\[f'(t) = \frac{ab}{\left(c + \frac{at}{b-t}\right)(b-t)^2}, \qquad f''(t) = \frac{-ab((a-2c)(b-2t) - 2ct)}{\left(c + \frac{at}{b-t}\right)^2 (b-t)^4}.\]
In our case, $a = 1/2$, $c = 1$ and $b = \log 36 z^{1/3} - \log(2.004) > 0$. Hence, for $t < b$,
\[-ab((a-2c)(b-2t) - 2ct) = \frac{b}{2}\left(2t + \frac{3}{2}(b - 2t)\right) = \frac{b}{2}\left(\frac{3}{2} b - t\right) > 0,\]
and so $f''(t) > 0$. In other words, $t \to R^{\circ}_{z,e^t/4}$ is convex-up for $t < b$, i.e., for $e^t/4 < 9 z^{1/3}/2.004$. It is easy to check that, since we are assuming $y \geq 10^{25}$,
\[2r_1 = \frac{3}{4} y^{4/15} < \frac{9}{2.004}\left(\frac{2y}{\log y}\right)^{1/3} \leq \frac{9 z^{1/3}}{2.004}.\]
We conclude that $r \to R^{\circ}_{z,2r}$ is convex-up on $\log 8r$ for $r \leq r_1$, and hence so is $r \to R_{z,2r}$, and so, in turn, is $r \to R^2_{z,2r}$. Thus, for $r \in [r_0, r_1]$,
\[R^2_{z,2r} \leq R^2_{z,2r_0} \cdot \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \cdot \frac{\log r/r_0}{\log r_1/r_0}. \tag{13.31}\]

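Therefore, by (13.22),
\[\begin{aligned}
\int_{r_0}^{r_1} & \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr\\
\leq {} & \int_{r_0}^{r_1} \left(R^2_{z,2r_0} \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \frac{\log r/r_0}{\log r_1/r_0}\right) (\log 2r)^2 \frac{dr}{r^{3/2}}\\
= {} & \frac{2 R^2_{z,2r_0}}{\log r_1/r_0} \left(\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_1 - \frac{P_3(\log 2r_0)}{\sqrt{r_0}} + \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right)\\
& + \frac{2 R^2_{z,2r_1}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}} - \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_0\right)\\
= {} & 2\left(R^2_{z,2r_0} - \frac{\log 2r_0}{\log r_1/r_0} (R^2_{z,2r_1} - R^2_{z,2r_0})\right) \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\\
& + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right)\\
= {} & 2 R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\\
& + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right),
\end{aligned} \tag{13.32}\]
where $P_2(t)$ and $P_3(t)$ are as in (13.23), and $P_2^-(t) = P_3(t) - t P_2(t) = 2t^2 + 16t + 48$.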

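Putting all terms together, we conclude that
\[\int_{r_0}^{r_1} \frac{g(r)}{r}\, dr \leq f_0(r_0, y) + f_1(r_0) + f_2(r_0, y), \tag{13.33}\]
where
\[\begin{aligned}
f_0(r_0, y) &= \left((1 - c_\phi) \sqrt{I_{0,r_0,r_1,y}} + c_\phi \sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}\right) \sqrt{\frac{2}{\sqrt{r_0}} I_{1,r_0}},\\
f_1(r_0) &= \frac{\sqrt{F(\log r_0)}}{\sqrt{2 r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right) + \frac{5}{\sqrt{2 r_0}}\\
&\quad + \frac{1}{r_0} \left(\left(\frac{13}{4} \log e r_0 + 11.07\right) J_{r_0} + 13.66 \log e r_0 + 37.55\right),\\
f_2(r_0, y) &= 3.36\, \frac{((\log y)/2)^{1/6}}{y^{1/6}} \log\frac{r_1}{r_0},
\end{aligned} \tag{13.34}\]
where $F(t) = e^\gamma \log t + c_\gamma$, $c_\gamma = 1.025742$, $y = x/\kappa$ (as usual),
\[\begin{aligned}
I_{0,r_0,r_1,z} &= R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right)\\
&\quad + \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right),\\
J_r &= F(\log r) + \frac{e^\gamma}{\log r}, \qquad I_{1,r} = F(\log r) + \frac{2 e^\gamma}{\log r}, \qquad c_\phi = \frac{C_{\phi,2,\frac{\log y}{2}}/|\phi|_1}{\log\frac{\log y}{2}},
\end{aligned} \tag{13.35}\]
and $C_{\phi,2,K}$ is as in (11.20).

Let us recapitulate briefly. The term $f_2(r_0, y)$ in (13.34) comes from the term $3.36 x^{-1/6}$ in (11.12). The term $f_1(r_0)$ includes all other terms in (11.12), except for $R_{x,2r} \log 2r \sqrt{\digamma(r)}/(\sqrt{2r})$. The contribution of that last term is (13.28), divided by $\sqrt{2}$. That, in turn, is at most (13.29), divided by $\sqrt{2}$. The first integral in (13.29) was bounded in (13.32); the second integral was bounded in (13.27).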

      Chapter 14

      Conclusion

We now need to gather all results, using the smoothing functions
\[\eta_* = (\eta_2 \ast_M \phi)(\kappa t),\]
where $\phi(t) = t^2 e^{-t^2/2}$, $\eta_2 = \eta_1 \ast_M \eta_1$ and $\eta_1 = 2\cdot I_{[1/2,1]}$, and
\[\eta_+ = h_{200}(t)\, t e^{-t^2/2},\]
where
\[h_H(t) = \int_0^\infty h(t y^{-1}) F_H(y) \frac{dy}{y}, \qquad h(t) = \begin{cases} t^2 (2-t)^3 e^{t - 1/2} & \text{if } t \in [0,2],\\ 0 & \text{otherwise,} \end{cases} \qquad F_H(y) = \frac{\sin(H \log y)}{\pi \log y}.\]
We studied $\eta_*$ and $\eta_+$ in Part II. We saw $\eta_*$ in Thm. 13.2.1 (which actually works for general $\phi : [0,\infty) \to [0,\infty)$, as its statement says). We will set $\kappa$ soon.

We fix a value for $r$, namely, $r = 150000$. Our results will have to be valid for any $x \geq x_+$, where $x_+$ is fixed. We set $x_+ = 4.9\cdot 10^{26}$, since we want a result valid for $N \geq 10^{27}$, and, as was discussed in (11.1), we will work with $x_+$ slightly smaller than $N/2$.

14.1 The $\ell_2$ norm over the major arcs: explicit version

We apply Lemma 10.3.1 with $\eta = \eta_+$ and $\eta_\circ$ as in (11.3). Let us first work out the error terms defined in (10.27). Recall that $\delta_0 = 8$. By Thm. 7.1.4,
\[ET_{\eta_+,\delta_0 r/2} = \max_{|\delta| \leq \delta_0 r/2} |\mathrm{err}_{\eta,\chi_T}(\delta, x)| = 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x_+}} \leq 1.1405\cdot 10^{-8}, \tag{14.1}\]
\[\begin{aligned}
E_{\eta_+,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q \leq r\cdot \gcd(q,2)\\ |\delta| \leq \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\, |\mathrm{err}_{\eta_+,\chi^*}(\delta, x)|\\
&\leq 1.3482\cdot 10^{-14} \sqrt{300000} + \frac{1.617\cdot 10^{-10}}{\sqrt{2}} + \frac{1}{\sqrt{x_+}} \left(499900 + 52\sqrt{300000}\right) \leq 2.3992\cdot 10^{-8},
\end{aligned} \tag{14.2}\]
where, in the latter case, we are using the fact that a stronger bound for $q = 1$ (namely, (14.1)) allows us to assume $q \geq 2$.
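The arithmetic behind (14.1) and (14.2) is easy to reproduce. The following Python sketch evaluates the two right-hand sides at $x = x_+ = 4.9\cdot 10^{26}$ and $r = 150000$ in ordinary floating point (the text uses interval arithmetic); it reads the quantity $300000$ as $2r$, which is our assumption, and the function names are ours.

```python
import math

X_PLUS = 4.9e26   # x_+ as fixed in the text
R = 150000        # the value of r fixed in the text

def et_bound(x):
    """Right side of (14.1) evaluated at x."""
    return 4.772e-11 + 251400 / math.sqrt(x)

def e_bound(x, r):
    """Right side of (14.2) evaluated at x, with 300000 read as 2r (our assumption)."""
    root = math.sqrt(2 * r)
    return (1.3482e-14 * root
            + 1.617e-10 / math.sqrt(2)
            + (499900 + 52 * root) / math.sqrt(x))

print(et_bound(X_PLUS))    # compare with 1.1405e-8
print(e_bound(X_PLUS, R))  # compare with 2.3992e-8
```

Both quantities decrease as $x$ grows, so the values at $x_+$ serve as bounds for every $x \geq x_+$.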

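We also need to bound a few norms: by the estimates in §A.3 and §A.5 (applied with $H = 200$),
\[\begin{aligned}
|\eta_+|_1 &\leq 1.062319, \qquad |\eta_+|_2 \leq 0.800129 + \frac{274.8569}{200^{7/2}} \leq 0.800132,\\
|\eta_+|_\infty &\leq 1 + 2.06440727 \cdot \frac{1 + \frac{4}{\pi}\log H}{H} \leq 1.079955.
\end{aligned} \tag{14.3}\]
By (10.12) and (14.1),
\[|S_{\eta_+}(0, x)| = \left|\widehat{\eta_+}(0)\cdot x + O^*\left(\mathrm{err}_{\eta_+,\chi_T}(0, x)\right)\cdot x\right| \leq (|\eta_+|_1 + ET_{\eta_+,\delta_0 r/2})\, x \leq 1.063 x.\]
This is far from optimal, but it will do, since all we wish to do with this is to bound the tiny error term $K_{r,2}$ in (10.27):
\[\begin{aligned}
K_{r,2} &= (1 + \sqrt{300000})(\log x)^2 \cdot 1.079955 \cdot \left(2\cdot 1.06232 + (1 + \sqrt{300000})(\log x)^2\, 1.079955/x\right)\\
&\leq 1259.06 (\log x)^2 \leq 9.71\cdot 10^{-21} x
\end{aligned}\]
for $x \geq x_+$. By (14.1), we also have
\[5.19\, \delta_0 r \left(ET_{\eta_+,\delta_0 r/2} \cdot \left(|\eta_+|_1 + \frac{ET_{\eta_+,\delta_0 r/2}}{2}\right)\right) \leq 0.075272\]
and
\[\delta_0 r (\log 2e^2 r) \left(E^2_{\eta_+,r,\delta_0} + K_{r,2}/x\right) \leq 1.00393\cdot 10^{-8}.\]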

By (A.23) and (A.26),
\[0.8001287 \leq |\eta_\circ|_2 \leq 0.8001288 \tag{14.4}\]
and
\[|\eta_+ - \eta_\circ|_2 \leq \frac{274.856893}{H^{7/2}} \leq 2.42942\cdot 10^{-6}. \tag{14.5}\]
We bound $|\eta_\circ^{(3)}|_1$ using the fact that (as we can tell by taking derivatives) $\eta_\circ^{(2)}(t)$ increases from $0$ at $t = 0$ to a maximum within $[0, 1/2]$, and then decreases to $\eta_\circ^{(2)}(1) = -7$, only to increase to a maximum within $[3/2, 2]$ (equal to the maximum attained within $[0, 1/2]$) and then decrease to $0$ at $t = 2$:
\[\begin{aligned}
|\eta_\circ^{(3)}|_1 &= 2\max_{t\in[0,1/2]} \eta_\circ^{(2)}(t) - 2\eta_\circ^{(2)}(1) + 2\max_{t\in[3/2,2]} \eta_\circ^{(2)}(t)\\
&= 4\max_{t\in[0,1/2]} \eta_\circ^{(2)}(t) + 14 \leq 4\cdot 4.6255653 + 14 \leq 32.5023,
\end{aligned} \tag{14.6}\]
where we compute the maximum by the bisection method with 30 iterations (using interval arithmetic, as always).

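We evaluate explicitly
\[\sum_{\substack{q \leq r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} = 6.798779\dotsc,\]
using, yet again, interval arithmetic.

Looking at (10.29) and (10.28), we conclude that
\[\begin{aligned}
L_{r,\delta_0} &\leq 2\cdot 6.798779\cdot 0.800132^2 \leq 8.70531,\\
L_{r,\delta_0} &\geq 2\cdot 6.798779\cdot 0.8001287^2 - \left((\log r + 1.7)\cdot (3.888\cdot 10^{-6} + 5.91\cdot 10^{-12})\right)\\
&\quad - \left(1.342\cdot 10^{-5}\right)\cdot \left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right) \geq 8.70517.
\end{aligned}\]
Lemma 10.3.1 thus gives us that
\[\begin{aligned}
\int_{\mathfrak{M}_{8,r_0}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha &= (8.70524 + O^*(0.00007))\, x + O^*(0.075273)\, x\\
&= (8.7052 + O^*(0.0754))\, x \leq 8.7806 x.
\end{aligned} \tag{14.7}\]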

14.2 The total major-arc contribution

First of all, we must bound from below
\[C_0 = \prod_{p | N} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p \nmid N} \left(1 + \frac{1}{(p-1)^3}\right). \tag{14.8}\]
The only prime that we know does not divide $N$ is $2$. Thus, we use the bound
\[C_0 \geq 2 \prod_{p > 2} \left(1 - \frac{1}{(p-1)^2}\right) \geq 1.3203236. \tag{14.9}\]
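The constant in (14.9) can be approximated by truncating the product. The Python sketch below multiplies over the odd primes up to $10^6$ (a cutoff chosen by us); since every factor is below $1$, each partial product is an upper bound for the infinite product, and the omitted tail changes the value by less than $10^{-7}$ in relative terms. This is a floating-point illustration, not a rigorous computation.

```python
def odd_primes_up_to(limit):
    """Odd primes <= limit, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Zero out the multiples p*p, p*p + p, ... <= limit.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(3, limit + 1, 2) if sieve[p]]

def c0_lower_bound_partial(limit):
    """2 * product over odd primes p <= limit of (1 - 1/(p-1)^2)."""
    product = 2.0
    for p in odd_primes_up_to(limit):
        product *= 1.0 - 1.0 / (p - 1) ** 2
    return product

approx = c0_lower_bound_partial(10 ** 6)
print(approx)  # slightly above the limiting value 1.3203236...
```

The product $\prod_{p>2} (1 - 1/(p-1)^2)$ is the twin prime constant, so the bound in (14.9) is twice that constant, rounded down.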

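The other main constant is $C_{\eta_\circ,\eta_*}$, which we defined in (10.37) and already started to estimate in (11.6):
\[C_{\eta_\circ,\eta_*} = |\eta_\circ|_2^2 \int_0^{N/x} \eta_*(\rho)\, d\rho + 2.71 |\eta_\circ'|_2^2 \cdot O^*\left(\int_0^{N/x} ((2 - N/x) + \rho)^2 \eta_*(\rho)\, d\rho\right), \tag{14.10}\]
provided that $N \geq 2x$. Recall that $\eta_* = (\eta_2 \ast_M \phi)(\kappa t)$, where $\phi(t) = t^2 e^{-t^2/2}$. Therefore,
\[\begin{aligned}
\int_0^{N/x} \eta_*(\rho)\, d\rho &= \int_0^{N/x} (\eta_2 \ast \phi)(\kappa\rho)\, d\rho = \int_{1/4}^1 \eta_2(w) \int_0^{N/x} \phi\left(\frac{\kappa\rho}{w}\right) d\rho\, \frac{dw}{w}\\
&= \frac{|\eta_2|_1 |\phi|_1}{\kappa} - \frac{1}{\kappa} \int_{1/4}^1 \eta_2(w) \int_{\kappa N/xw}^\infty \phi(\rho)\, d\rho\, dw.
\end{aligned}\]
By integration by parts and [AS64, (7.1.13)],
\[\int_y^\infty \phi(\rho)\, d\rho = y e^{-y^2/2} + \sqrt{2} \int_{y/\sqrt{2}}^\infty e^{-t^2}\, dt < \left(y + \frac{1}{y}\right) e^{-y^2/2}.\]
Hence
\[\int_{\kappa N/xw}^\infty \phi(\rho)\, d\rho \leq \int_{2\kappa}^\infty \phi(\rho)\, d\rho < \left(2\kappa + \frac{1}{2\kappa}\right) e^{-2\kappa^2}\]
and so, since $|\eta_2|_1 = 1$,
\[\begin{aligned}
\int_0^{N/x} \eta_*(\rho)\, d\rho &\geq \frac{|\phi|_1}{\kappa} - \int_{1/4}^1 \eta_2(w)\, dw \cdot \left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2}\\
&\geq \frac{|\phi|_1}{\kappa} - \left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2}.
\end{aligned} \tag{14.11}\]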

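      Let us now focus on the second integral in (14.10). Write $N/x = 2 + c_1/\kappa$. Then the integral equals

      \[ \begin{aligned}
      \int_0^{2 + c_1/\kappa} (-c_1/\kappa + \rho)^2\,\eta_*(\rho)\,d\rho &\le \frac{1}{\kappa^3}\int_0^\infty (u - c_1)^2\,(\eta_2 *_M \varphi)(u)\,du\\
      &= \frac{1}{\kappa^3}\int_{1/4}^1 \eta_2(w) \int_0^\infty (vw - c_1)^2 \varphi(v)\,dv\,dw\\
      &= \frac{1}{\kappa^3}\int_{1/4}^1 \eta_2(w)\left(3\sqrt{\frac{\pi}{2}}\,w^2 - 2\cdot 2c_1 w + c_1^2\sqrt{\frac{\pi}{2}}\right) dw\\
      &= \frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{9}{4}c_1 + \sqrt{\frac{\pi}{2}}\,c_1^2\right).
      \end{aligned} \]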

      It is thus best to choose $c_1 = (9/4)/\sqrt{2\pi} = 0.89762\dotsc$.

      We must now estimate $|\eta_\circ'|_2^2$. We could do this directly by rigorous numerical integration, but we might as well do it the hard way (which is actually rather easy). By the definition (11.3) of $\eta_\circ$,

      \[ |\eta_\circ'(x+1)|^2 = \left(x^{14} - 18x^{12} + 111x^{10} - 284x^8 + 351x^6 - 210x^4 + 49x^2\right) e^{-x^2} \tag{14.12} \]

      for $x\in[-1,1]$, and $\eta_\circ'(x+1) = 0$ for $x\notin[-1,1]$. Now, for any even integer $k>0$,

      \[ \int_{-1}^1 x^k e^{-x^2}\,dx = 2\int_0^1 x^k e^{-x^2}\,dx = \gamma\left(\frac{k+1}{2}, 1\right), \]

      where $\gamma(a,r) = \int_0^r e^{-t} t^{a-1}\,dt$ is the incomplete gamma function. (We substitute $t = x^2$ in the integral.) By [AS64, (6.5.16), (6.5.22)], $\gamma(a+1,1) = a\gamma(a,1) - 1/e$ for all $a>0$, and $\gamma(1/2,1) = \sqrt{\pi}\,\operatorname{erf}(1)$, where

      \[ \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt. \]

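      Thus, starting from (14.12), we see that

      \[ \begin{aligned}
      |\eta_\circ'|_2^2 &= \gamma\left(\tfrac{15}{2},1\right) - 18\,\gamma\left(\tfrac{13}{2},1\right) + 111\,\gamma\left(\tfrac{11}{2},1\right) - 284\,\gamma\left(\tfrac{9}{2},1\right)\\
      &\quad + 351\,\gamma\left(\tfrac{7}{2},1\right) - 210\,\gamma\left(\tfrac{5}{2},1\right) + 49\,\gamma\left(\tfrac{3}{2},1\right)\\
      &= \frac{9151}{128}\sqrt{\pi}\,\operatorname{erf}(1) - \frac{18101}{64e} = 2.7375292\dotsc
      \end{aligned} \tag{14.13} \]

      We thus obtain

      \[ \begin{aligned}
      2.71|\eta_\circ'|_2^2\cdot\int_0^{N/x} ((2 - N/x) + \rho)^2\,\eta_*(\rho)\,d\rho
      &\le 7.4188\cdot\frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{(9/4)^2}{2\sqrt{2\pi}}\right) \le \frac{2.0002}{\kappa^3}.
      \end{aligned} \]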
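The recursion for the incomplete gamma function and the closed form in (14.13) are easy to cross-check numerically. The following is a floating-point sketch only (the text's own computation is exact); the variable names are ours.

```python
import math

# gamma(a, 1) for a = 1/2, 3/2, ..., 15/2 via gamma(a+1, 1) = a*gamma(a, 1) - 1/e
g = {0.5: math.sqrt(math.pi) * math.erf(1.0)}
a = 0.5
while a < 7.5:
    g[a + 1.0] = a * g[a] - math.exp(-1.0)
    a += 1.0

# Coefficients of x^14, x^12, ..., x^2 in (14.12); x^k contributes gamma((k+1)/2, 1)
coeffs = {7.5: 1, 6.5: -18, 5.5: 111, 4.5: -284, 3.5: 351, 2.5: -210, 1.5: 49}
norm_sq = sum(c * g[a] for a, c in coeffs.items())
closed_form = 9151 / 128 * math.sqrt(math.pi) * math.erf(1.0) - 18101 / (64 * math.e)

# Minimizer c_1 of the quadratic in c_1, and the value of the bracket at that c_1
c1 = (9 / 4) / math.sqrt(2 * math.pi)
bracket = 49 / 48 * math.sqrt(math.pi / 2) - (9 / 4) ** 2 / (2 * math.sqrt(2 * math.pi))
print(norm_sq, closed_form, c1, 7.4188 * bracket)
```

The choice of $c_1$ is just the vertex of the parabola $\sqrt{\pi/2}\,c_1^2 - (9/4)c_1$, which is why the bracket collapses to two terms.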

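      We conclude that

      \[ C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}|\varphi|_1|\eta_\circ|_2^2 - |\eta_\circ|_2^2\left(2 + \frac{1}{2\kappa^2}\right) e^{-2\kappa^2} - \frac{2.0002}{\kappa^3}. \]

      Setting $\kappa = 49$ and using (14.4), we obtain

      \[ C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}\left(|\varphi|_1|\eta_\circ|_2^2 - 0.000834\right). \tag{14.14} \]

      Here it is useful to note that $|\varphi|_1 = \sqrt{\pi/2}$, and so, by (14.4), $|\varphi|_1|\eta_\circ|_2^2 = 0.80237\dotsc$.

      We have finally chosen $x$ in terms of $N$:

      \[ x = \frac{N}{2 + \frac{c_1}{\kappa}} = \frac{N}{2 + \frac{9/4}{\sqrt{2\pi}}\cdot\frac{1}{49}} = 0.495461\dotsc\cdot N. \tag{14.15} \]

      Thus, we see that, since we are assuming $N \ge 10^{27}$, we in fact have $x \ge 4.95461\dotsc\cdot 10^{26}$, and so, in particular,

      \[ x \ge 4.9\cdot 10^{26}, \qquad \frac{x}{\kappa} \ge 10^{25}. \tag{14.16} \]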
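The constants in (14.14) to (14.16) follow from a few lines of arithmetic, sketched here in floating point. We assume only $|\eta_\circ|_2^2 \le 1$, which is consistent with the value $|\varphi|_1|\eta_\circ|_2^2 = 0.80237\dotsc$ quoted above; the name `loss` is ours.

```python
import math

kappa = 49
c1 = (9 / 4) / math.sqrt(2 * math.pi)
ratio = 1 / (2 + c1 / kappa)              # x / N, as in (14.15)

# Terms subtracted inside the parentheses of (14.14), after factoring out 1/kappa.
# The Gaussian tail exp(-2*kappa^2) underflows to 0.0 in double precision.
loss = 2.0002 / kappa**2 + kappa * (2 + 1 / (2 * kappa**2)) * math.exp(-2 * kappa**2)

N = 1e27
x = ratio * N
print(ratio, loss, x, x / kappa)
```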

      Let us continue with our determination of the major-arcs total. We should compute the quantities in (10.38). We already have bounds for $E_{\eta_+,r,\delta_0}$, $A_{\eta_+}$ (see (14.7)), $L_{\eta,r,\delta_0}$ and $K_{r,2}$. By Corollary 7.1.3, we have

      \[ \begin{aligned}
      E_{\eta_*,r,8} &\le \max_{\substack{\chi \bmod q\\ q \le r\cdot\gcd(q,2)\\ |\delta| \le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\,\left|\operatorname{err}_{\eta_*,\chi^*}(\delta,x)\right|\\
      &\le \frac{1}{\kappa}\left(2.485\cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}\left(381500 + 76\sqrt{300000}\right)\right) \le \frac{1.33805\cdot 10^{-8}}{\kappa},
      \end{aligned} \tag{14.17} \]

      where the factor of $\kappa$ comes from the scaling in $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$ (which in effect divides $x$ by $\kappa$). It remains only to bound the more harmless terms of type $Z_{\eta,2}$ and $LS_\eta$.

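      Clearly, $Z_{\eta_+^2,2} \le (1/x)\sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x)$. Now, by Prop. 7.1.5,

      \[ \begin{aligned}
      \sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) &= \left(0.640206 + O^*\left(2\cdot 10^{-6} + \frac{366.91}{\sqrt{x}}\right)\right) x\log x - 0.021095x\\
      &\le \left(0.640206 + O^*(3\cdot 10^{-6})\right) x\log x - 0.021095x.
      \end{aligned} \tag{14.18} \]

      Thus,

      \[ Z_{\eta_+^2,2} \le 0.640209\log x. \tag{14.19} \]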

      We will proceed a little more crudely for $Z_{\eta_*^2,2}$:

      \[ \begin{aligned}
      Z_{\eta_*^2,2} &= \frac{1}{x}\sum_n \Lambda^2(n)\,\eta_*^2(n/x) \le \frac{1}{x}\sum_n \Lambda(n)\,\eta_*(n/x)\cdot(\eta_*(n/x)\log n)\\
      &\le \left(|\eta_*|_1 + |\operatorname{err}_{\eta_*,\chi_T}(0,x)|\right)\cdot\left(|\eta_*(t)\cdot\log^+(\kappa t)|_\infty + |\eta_*|_\infty\log(x/\kappa)\right),
      \end{aligned} \tag{14.20} \]

      where $\log^+(t) = \max(0,\log t)$. It is easy to see that

      \[ |\eta_*|_\infty = |\eta_2 *_M \varphi|_\infty \le \left|\frac{\eta_2(t)}{t}\right|_1 |\varphi|_\infty \le 4(\log 2)^2\cdot\frac{2}{e} \le 1.414, \tag{14.21} \]

      and, since $\log^+$ is non-decreasing and $\eta_2$ is supported on a subset of $[0,1]$,

      \[ \begin{aligned}
      |\eta_*(t)\cdot\log^+(\kappa t)|_\infty &= |(\eta_2 *_M \varphi)\cdot\log^+|_\infty \le |\eta_2 *_M (\varphi\cdot\log^+)|_\infty\\
      &\le \left|\frac{\eta_2(t)}{t}\right|_1\cdot|\varphi\cdot\log^+|_\infty \le 1.921813\cdot 0.381157 \le 0.732513,
      \end{aligned} \]

      where we bound $|\varphi\cdot\log^+|_\infty$ by the bisection method with 25 iterations. We already know that

      \[ |\eta_*|_1 = \frac{|\eta_2|_1|\varphi|_1}{\kappa} = \frac{|\varphi|_1}{\kappa} = \frac{\sqrt{\pi/2}}{\kappa}. \tag{14.22} \]

      By Cor. 7.1.3,

      \[ |\operatorname{err}_{\eta_*,\chi_T}(0,x)| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}(381500 + 76) \le 1.20665\cdot 10^{-7}. \]

      We conclude that

      \[ Z_{\eta_*^2,2} \le \left(\frac{\sqrt{\pi/2}}{49} + 1.20665\cdot 10^{-7}\right)\left(0.732513 + 1.414\log(x/49)\right) \le 0.0362\log x. \tag{14.23} \]

      We have bounds for $|\eta_*|_\infty$ and $|\eta_+|_\infty$. We can also bound

      \[ |\eta_*\cdot t|_\infty = \frac{|(\eta_2 *_M \varphi)\cdot t|_\infty}{\kappa} \le \frac{|\eta_2|_1\cdot|\varphi\cdot t|_\infty}{\kappa} \le \frac{3^{3/2} e^{-3/2}}{\kappa}. \]
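The sup-norms entering (14.21) to (14.23) can be approximated with a plain grid search. This is a non-rigorous stand-in for the bisection method with interval arithmetic used in the text; the grid (step $10^{-4}$ on $(0,10]$) and the sample value $x = 4.9\cdot 10^{26}$ are our choices.

```python
import math

def phi(t):
    """The weight phi(t) = t^2 exp(-t^2/2) used to define eta_*."""
    return t * t * math.exp(-t * t / 2)

grid = [k / 10000 for k in range(1, 100001)]
sup_phi = max(phi(t) for t in grid)                                   # attained near t = sqrt(2)
sup_phi_logplus = max(phi(t) * max(0.0, math.log(t)) for t in grid)   # |phi * log^+|_inf
sup_phi_t = max(phi(t) * t for t in grid)                             # attained near t = sqrt(3)

eta2_over_t_norm = 4 * math.log(2) ** 2    # |eta_2(t)/t|_1, as quoted in (14.21)

x = 4.9e26
z_bound = (math.sqrt(math.pi / 2) / 49 + 1.20665e-7) * (0.732513 + 1.414 * math.log(x / 49))
print(sup_phi, sup_phi_logplus, sup_phi_t, z_bound, 0.0362 * math.log(x))
```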

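      We quote the estimate

      \[ |\eta_+\cdot t|_\infty = 1.064735 + 3.25312\cdot\frac{1 + (4/\pi)\log 200}{200} \le 1.19073 \tag{14.24} \]

      from (A.42).

      We can now bound $LS_\eta(x,r)$ for $\eta = \eta_*, \eta_+$:

      \[ \begin{aligned}
      LS_\eta(x,r) &= \log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right)
      \le (\log r)\cdot\max_{p\le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \sum_{\substack{\alpha\ge 1\\ p^\alpha\ge x}}\frac{|\eta\cdot t|_\infty}{p^\alpha/x}\right)\\
      &\le (\log r)\cdot\max_{p\le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \frac{|\eta\cdot t|_\infty}{1 - 1/p}\right)
      \le \frac{(\log r)(\log x)}{\log 2}|\eta|_\infty + 2(\log r)|\eta\cdot t|_\infty,
      \end{aligned} \]

      and so

      \[ \begin{aligned}
      LS_{\eta_*} &\le \left(\frac{1.414}{\log 2}\log x + 2\cdot\frac{(3/e)^{3/2}}{49}\right)\log r \le 24.32\log x + 0.57,\\
      LS_{\eta_+} &\le \left(\frac{1.07996}{\log 2}\log x + 2\cdot 1.19073\right)\log r \le 18.57\log x + 28.39,
      \end{aligned} \tag{14.25} \]

      where we are using the bound on $|\eta_+|_\infty$ in (14.3).

      We can now start to put together all terms in (10.36). Let $\epsilon_0 = |\eta_+ - \eta_\circ|_2/|\eta_\circ|_2$. Then, by (14.5),

      \[ \epsilon_0|\eta_\circ|_2 = |\eta_+ - \eta_\circ|_2 \le 2.42942\cdot 10^{-6}. \]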

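      Thus,

      \[ 2.82643|\eta_\circ|_2^2 (2 + \epsilon_0)\cdot\epsilon_0 + \frac{4.31004|\eta_\circ|_2^2 + 0.0012\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r} \]

      is at most

      \[ 2.82643\cdot 2.42942\cdot 10^{-6}\cdot\left(2\cdot 0.80013 + 2.42942\cdot 10^{-6}\right) + \frac{4.3101\cdot 0.80013^2 + 0.0012\cdot\frac{32.503^2}{8^5}}{150000} \le 2.9387\cdot 10^{-5} \]

      by (14.4), (14.6), and (14.22).

      Since $\eta_* = (\eta_2 *_M \varphi)(\kappa x)$ and $\eta_2$ is supported on $[1/4,1]$,

      \[ \begin{aligned}
      |\eta_*|_2^2 &= \frac{|\eta_2 *_M \varphi|_2^2}{\kappa} = \frac{1}{\kappa}\int_0^\infty\left(\int_0^\infty \eta_2(t)\,\varphi\left(\frac{w}{t}\right)\frac{dt}{t}\right)^2 dw\\
      &\le \frac{1}{\kappa}\int_0^\infty\left(1 - \frac{1}{4}\right)\int_0^\infty \eta_2^2(t)\,\varphi^2\left(\frac{w}{t}\right)\frac{dt}{t^2}\,dw\\
      &= \frac{3}{4\kappa}\int_0^\infty \frac{\eta_2^2(t)}{t}\left(\int_0^\infty \varphi^2\left(\frac{w}{t}\right)\frac{dw}{t}\right) dt\\
      &= \frac{3}{4\kappa}\,|\eta_2(t)/\sqrt{t}|_2^2\cdot|\varphi|_2^2 = \frac{3}{4\kappa}\cdot\frac{32}{3}(\log 2)^3\cdot\frac{3}{8}\sqrt{\pi} \le \frac{1.77082}{\kappa},
      \end{aligned} \]

      where we go from the first to the second line by Cauchy-Schwarz.

      Recalling the bounds on $E_{\eta_*,r,\delta_0}$ and $E_{\eta_+,r,\delta_0}$ we obtained in (14.2) and (14.17), we conclude that the second line of (10.36) is at most $x^2$ times

      \[ \frac{1.33805\cdot 10^{-8}}{\kappa}\cdot 8.7806 + 2.3922\cdot 10^{-8}\cdot 1.6812\cdot\left(\sqrt{8.7806} + 1.6812\cdot 0.80014\right)\sqrt{\frac{1.77082}{\kappa}} \le \frac{1.7316\cdot 10^{-6}}{\kappa}, \]

      where we are using the bound $A_{\eta_+} \le 8.7806$ we obtained in (14.7). (We are also using the bounds on norms in (14.3) and the value $\kappa = 49$.)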

      By the bounds (14.19), (14.23) and (14.25), we see that the third line of (10.36) is at most

      \[ \begin{aligned}
      &2\cdot(0.640209\log x)\cdot(24.32\log x + 0.57)\cdot x\\
      &\quad + 4\sqrt{0.640209\log x\cdot 0.0362\log x}\,(18.57\log x + 28.39)\,x \le 43(\log x)^2 x,
      \end{aligned} \]

      where we use the assumption $x \ge x_+ = 4.9\cdot 10^{26}$ (though a much weaker assumption would suffice).

      Using the assumption $x \ge x_+$ again, together with (14.22) and the bounds we have just proven, we conclude that, for $r = 150000$, the integral over the major arcs

      \[ \int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha \]

      is

      \[ \begin{aligned}
      &C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\left(2.9387\cdot 10^{-5}\cdot\frac{\sqrt{\pi/2}}{\kappa}x^2 + \frac{1.7316\cdot 10^{-6}}{\kappa}x^2 + 43(\log x)^2 x\right)\\
      &\quad = C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\left(\frac{3.85628\cdot 10^{-5}\cdot x^2}{\kappa}\right) = C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\left(7.86996\cdot 10^{-7}x^2\right),
      \end{aligned} \tag{14.26} \]

      where $C_0$ and $C_{\eta_\circ,\eta_*}$ are as in (10.37). Notice that $C_0 C_{\eta_\circ,\eta_*}x^2$ is the expected asymptotic for the integral over all of $\mathbb{R}/\mathbb{Z}$.

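      Moreover, by (14.9), (14.14) and (14.4), as well as $|\varphi|_1 = \sqrt{\pi/2}$,

      \[ C_0\cdot C_{\eta_\circ,\eta_*} \ge 1.3203236\left(\frac{|\varphi|_1|\eta_\circ|_2^2}{\kappa} - \frac{0.000834}{\kappa}\right) \ge \frac{1.0594003}{\kappa} - \frac{0.001102}{\kappa} \ge \frac{1.058298}{49}. \]

      Hence

      \[ \int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha \ge \frac{1.058259}{\kappa}x^2, \tag{14.27} \]

      where, as usual, $\kappa = 49$. This is our total major-arc bound.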
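The chain of constants leading from (14.17) to (14.27) can be replayed in floating point. The sketch below only checks that the printed numbers are mutually consistent; it takes every input constant from the text as given, and the variable names are ours.

```python
import math

kappa, r, delta0 = 49, 150000, 8
x = 4.9e26
eps_norm = 2.42942e-6                       # bound for |eta_+ - eta_o|_2

# First error term, bounded by 2.9387e-5 in the text
first = (2.82643 * eps_norm * (2 * 0.80013 + eps_norm)
         + (4.3101 * 0.80013**2 + 0.0012 * 32.503**2 / delta0**5) / r)

# kappa * |eta_*|_2^2 via Cauchy-Schwarz, bounded by 1.77082
eta_star_sq = 3 / 4 * (32 / 3) * math.log(2) ** 3 * (3 / 8) * math.sqrt(math.pi)

# kappa times the second-line bound; note kappa * sqrt(1.77082/kappa) = sqrt(1.77082*kappa)
second = (1.33805e-8 * 8.7806
          + 2.3922e-8 * 1.6812 * (math.sqrt(8.7806) + 1.6812 * 0.80014)
          * math.sqrt(1.77082 * kappa))

# Total error in (14.26), in units of x^2 / kappa
total = 2.9387e-5 * math.sqrt(math.pi / 2) + 1.7316e-6 + kappa * 43 * math.log(x) ** 2 / x

main = 1.0594003 - 0.001102                 # lower bound for kappa * C_0 * C
print(first, eta_star_sq, second, total, main - 3.85628e-5)
```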

      14.3 The minor-arc total: explicit version

      We need to estimate the quantities $E$, $S$, $T$, $J$, $M$ in Theorem 13.2.1. Let us start by bounding the constants in (13.12). The constants $C_{\eta_+,j}$, $j = 0, 1, 2$, will appear only in the minor term $E$, and so crude bounds on them will do.

      By (14.3) and (14.24),

      \[ \sup_{r\ge t}\eta_+(r) \le \min\left(1.07996, \frac{1.19073}{t}\right) \]

      for all $t \ge 0$. Thus,

      \[ \begin{aligned}
      C_{\eta_+,0} &= 0.7131\int_0^\infty \frac{1}{\sqrt{t}}\left(\sup_{r\ge t}\eta_+(r)\right)^2 dt\\
      &\le 0.7131\left(\int_0^1 \frac{1.07996^2}{\sqrt{t}}\,dt + \int_1^\infty \frac{1.19073^2}{t^{5/2}}\,dt\right) \le 2.33744.
      \end{aligned} \]

      Similarly,

      \[ \begin{aligned}
      C_{\eta_+,1} &= 0.7131\int_1^\infty \frac{\log t}{\sqrt{t}}\left(\sup_{r\ge t}\eta_+(r)\right)^2 dt\\
      &\le 0.7131\int_1^\infty \frac{1.19073^2\log t}{t^{5/2}}\,dt \le 0.44937.
      \end{aligned} \]

      Immediately from (14.3),

      \[ C_{\eta_+,2} = 0.51942\,|\eta_+|_\infty^2 \le 0.60581. \]

      We get

      \[ \begin{aligned}
      E &\le \left((2.33744 + 0.60581)\log x + (2\cdot 2.33744 + 0.44937)\right)\cdot x^{1/2}\\
      &\le (2.94325\log x + 5.12426)\cdot x^{1/2} \le 8.4029\cdot 10^{-12}\cdot x,
      \end{aligned} \tag{14.28} \]

      where $E$ is defined as in (13.11), and where we are using the assumption $x \ge x_+ = 4.9\cdot 10^{26}$. Using (14.17) and (14.22), we see that

      \[ S_{\eta_*}(0,x) = \left(|\eta_*|_1 + O^*(ET_{\eta_*,0})\right)x = \left(\sqrt{\pi/2} + O^*(1.33805\cdot 10^{-8})\right)\frac{x}{\kappa}. \]

      Hence

      \[ S_{\eta_*}(0,x)\cdot E \le 1.05315\cdot 10^{-11}\cdot\frac{x^2}{\kappa}. \tag{14.29} \]
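The three constants $C_{\eta_+,j}$ and the bound (14.28) reduce to elementary integrals, so they are easy to re-derive. A minimal floating-point sketch, with our own variable names, evaluated at the worst case $x = 4.9\cdot 10^{26}$:

```python
import math

sup0, sup1 = 1.07996, 1.19073   # bounds for |eta_+|_inf and |eta_+ * t|_inf, from (14.3) and (14.24)

# int_0^1 t^(-1/2) dt = 2, int_1^inf t^(-5/2) dt = 2/3, int_1^inf t^(-5/2) log t dt = 4/9
C0 = 0.7131 * (2 * sup0**2 + (2 / 3) * sup1**2)
C1 = 0.7131 * (4 / 9) * sup1**2
C2 = 0.51942 * sup0**2

x = 4.9e26
E_over_x = (2.94325 * math.log(x) + 5.12426) / math.sqrt(x)

# Coefficient of x^2 / kappa in (14.29)
S_star_E = (math.sqrt(math.pi / 2) + 1.33805e-8) * 8.4029e-12
print(C0, C1, C2, E_over_x, S_star_E)
```

Since $(a\log x + b)/\sqrt{x}$ is decreasing for large $x$, the value at $x = 4.9\cdot 10^{26}$ is the one that matters.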

      We can bound

      \[ S \le \sum_n \Lambda(n)(\log n)\,\eta_+^2(n/x) \le 0.640209\,x\log x - 0.021095x \tag{14.30} \]

      by (14.18). Let us now estimate $T$. Recall that $\varphi(t) = t^2 e^{-t^2/2}$. Since

      \[ \int_0^u \varphi(t)\,dt = \int_0^u t^2 e^{-t^2/2}\,dt \le \int_0^u t^2\,dt = \frac{u^3}{3}, \]

      we can bound

      \[ C_{\varphi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right) = \frac{1.04488}{\sqrt{\pi/2}}\int_0^{\frac{2}{\log x/\kappa}} t^2 e^{-t^2/2}\,dt \le \frac{0.2779}{((\log x/\kappa)/2)^3}. \]

      By (14.7), we already know that $J = (8.7052 + O^*(0.0754))x$. Hence

      \[ \left(\sqrt{J} - \sqrt{E}\right)^2 = \left(\sqrt{(8.7052 + O^*(0.0754))x} - \sqrt{8.4029\cdot 10^{-12}\cdot x}\right)^2 \ge 8.6297x, \tag{14.31} \]

      and so

      \[ \begin{aligned}
      T &= C_{\varphi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right)\cdot\left(S - (\sqrt{J} - \sqrt{E})^2\right)\\
      &\le \frac{8\cdot 0.2779}{(\log x/\kappa)^3}\cdot(0.640209\,x\log x - 0.021095x - 8.6297x)\\
      &\le \frac{0.17792\cdot 8x\log x}{(\log x/\kappa)^3} - \frac{2.40405\cdot 8x}{(\log x/\kappa)^3}\\
      &\le \frac{1.42336x}{(\log x/\kappa)^2} - \frac{13.69293x}{(\log x/\kappa)^3}
      \end{aligned} \]

      for $\kappa = 49$. Since $x/\kappa \ge 10^{25}$, this implies that

      \[ T \le 3.5776\cdot 10^{-4}\cdot x. \tag{14.32} \]
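Each step in the bound for $T$ is a one-line computation; the last step uses $x\log x = x\log(x/\kappa) + x\log\kappa$. The sketch below replays the arithmetic in floating point, with our own variable names.

```python
import math

kappa = 49
c_phi3 = 1.04488 / math.sqrt(math.pi / 2) / 3     # coefficient of u^3 in the bound for C_{phi,3}
J_lower = 8.7052 - 0.0754
root_diff_sq = (math.sqrt(J_lower) - math.sqrt(8.4029e-12)) ** 2

# Constant in the 1/L^3 term after rewriting log x as log(x/kappa) + log(kappa)
lin = 2.40405 * 8 - 1.42336 * math.log(kappa)

def T_over_x(L):
    """Upper bound for T/x as a function of L = log(x/kappa)."""
    return 1.42336 / L**2 - 13.69293 / L**3

L0 = 25 * math.log(10)                            # x/kappa >= 10^25
print(c_phi3, root_diff_sq, lin, T_over_x(L0))
```

The function `T_over_x` is decreasing once $L > 3\cdot 13.69293/(2\cdot 1.42336) \approx 14.4$, so the value at $L_0$ bounds it for all larger $x$.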

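      It remains to estimate $M$. Let us first look at $g(r_0)$; here $g = g_{x/\kappa,\varphi}$, where $g_{y,\varphi}$ is defined as in (11.19) and $\varphi(t) = t^2 e^{-t^2/2}$, as usual. Write $y = x/\kappa$. We must estimate the constant $C_{\varphi,2,K}$ defined in (11.21):

      \[ \begin{aligned}
      C_{\varphi,2,K} &= -\int_{1/K}^1 \varphi(w)\log w\,dw \le -\int_0^1 \varphi(w)\log w\,dw\\
      &\le -\int_0^1 w^2 e^{-w^2/2}\log w\,dw \le 0.093426,
      \end{aligned} \]

      where again we use VNODE-LP for rigorous numerical integration. Since $|\varphi|_1 = \sqrt{\pi/2}$ and $K = (\log y)/2$, this implies that

      \[ \frac{C_{\varphi,2,K}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}}, \tag{14.33} \]

      and so

      \[ R_{y,K,\varphi,t} = \frac{0.07455}{\log\frac{\log y}{2}}\,R_{y/K,t} + \left(1 - \frac{0.07455}{\log\frac{\log y}{2}}\right) R_{y,t}. \tag{14.34} \]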

      Let $t = 2r_0 = 300000$; we recall that $K = (\log y)/2$. Recall from (14.16) that $y = x/\kappa \ge 10^{25}$; thus, $y/K \ge 3.47435\cdot 10^{23}$ and $\log((\log y)/2) \ge 3.35976$. Going back to the definition of $R_{x,t}$ in (11.13), we see that

      \[ R_{y,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(10^{25})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.58341, \tag{14.35} \]

      \[ R_{y/K,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(3.47435\cdot 10^{23})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.60295, \tag{14.36} \]

      and so

      \[ R_{y,K,\varphi,2r_0} \le \frac{0.07455}{3.35976}\cdot 0.60295 + \left(1 - \frac{0.07455}{3.35976}\right) 0.58341 \le 0.58385. \]

      Using

      \[ z(r) = e^\gamma\log\log r + \frac{2.50637}{\log\log r} \le 5.42506, \]

      we see from (11.13) that

      \[ L_{2r_0} = 5.42506\cdot\left(\frac{13}{4}\log 300000 + 7.82\right) + 13.66\log 300000 + 37.55 \le 474.608. \]

      Going back to (11.19), we sum up and obtain that

      \[ g(r_0) = \frac{(0.58385\cdot\log 300000 + 0.5)\sqrt{5.42506} + 2.5}{\sqrt{2\cdot 150000}} + \frac{474.608}{150000} + 3.36\left(\frac{\log y}{2y}\right)^{1/6} \le 0.041568. \]

      Using again the bound $x \ge 4.9\cdot 10^{26}$, we obtain

      \[ \begin{aligned}
      \frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-}\cdot S - (\sqrt{J} - \sqrt{E})^2
      &\le \frac{13.9716}{\frac{1}{2}\log x + 0.6394}\cdot(0.640209\,x\log x - 0.021095x) - 8.6297x\\
      &\le 17.8895x - \frac{11.7332x}{\frac{1}{2}\log x + 0.6394} - 8.6297x\\
      &\le (17.8895 - 8.6297)x \le 9.2598x,
      \end{aligned} \]

      where $c^+ = 2.0532$ and $c^- = 0.6394$. Therefore,

      \[ g(r_0)\cdot\left(\frac{\log(150000 + 1) + c^+}{\log\sqrt{x} + c^-}\cdot S - (\sqrt{J} - \sqrt{E})^2\right) \le 0.041568\cdot 9.2598x \le 0.38492x. \tag{14.37} \]

      This is one of the main terms.
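The evaluation of $g(r_0)$ can be reproduced from the formulas as they are written out in (14.35), (14.36) and the displays that follow. The sketch below is a floating-point re-computation at the worst case $y = 10^{25}$; the function `R` encodes $R_{z,t}$ only in the form in which it is evaluated above, and all names are ours.

```python
import math

r0 = 150000
y = 1e25
K = math.log(y) / 2
EULER_GAMMA = 0.5772156649015329

def R(z, t):
    """R_{z,t} in the form evaluated in (14.35)-(14.36)."""
    inner = math.log(4 * t) / (2 * math.log(9 * z ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

c_phi = 0.07455 / math.log(K)
R_y, R_yK = R(y, 2 * r0), R(y / K, 2 * r0)
R_phi = c_phi * R_yK + (1 - c_phi) * R_y

loglog = math.log(math.log(r0))
z0 = math.exp(EULER_GAMMA) * loglog + 2.50637 / loglog
L = z0 * (13 / 4 * math.log(2 * r0) + 7.82) + 13.66 * math.log(2 * r0) + 37.55

g0 = (((R_phi * math.log(2 * r0) + 0.5) * math.sqrt(z0) + 2.5) / math.sqrt(2 * r0)
      + L / r0 + 3.36 * (math.log(y) / (2 * y)) ** (1 / 6))
print(R_y, R_yK, R_phi, z0, L, g0)
```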

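      Let $r_1 = (3/8)y^{4/15}$, where, as usual, $y = x/\kappa$ and $\kappa = 49$. Then

      \[ \begin{aligned}
      R_{y,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9y^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415\\
      &= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{2\left(\frac{1}{3} - \frac{4}{15}\right)\log y + 2\log\frac{9}{2.004\cdot\frac{3}{4}}}\right) + 0.41415\\
      &\le 0.27125\log\left(1 + \frac{\frac{4}{15}}{2\left(\frac{1}{3} - \frac{4}{15}\right)}\right) + 0.41415 \le 0.71215.
      \end{aligned} \tag{14.38} \]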

      Similarly, for $K = (\log y)/2$ (as usual),

      \[ \begin{aligned}
      R_{y/K,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9(y/K)^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415\\
      &= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{9\cdot 2^{1/3}}{2.004\cdot\frac{3}{4}}}\right) + 0.41415\\
      &= 0.27125\log\left(3 + \frac{\frac{4}{3}\log\log y - c}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{12\cdot 2^{1/3}}{2.004}}\right) + 0.41415,
      \end{aligned} \tag{14.39} \]

      where $c = 4\log(12\cdot 2^{1/3}/2.004) - \log 3$. Let

      \[ f(t) = \frac{\frac{4}{3}\log t - c}{\frac{2}{15}t - \frac{2}{3}\log t + 2\log\frac{12\cdot 2^{1/3}}{2.004}}. \]

      The bisection method with 32 iterations shows that

      \[ f(t) \le 0.019562618 \tag{14.40} \]

      for $180 \le t \le 30000$; since $f(t) < 0$ for $0 < t < 180$ (by $(4/3)\log t - c < 0$) and since, by $c > 20/3$, we have $f(t) < (5/2)(\log t)/t$ as soon as $t > (\log t)^2$ (and so, in particular, for $t > 30000$), we see that (14.40) is valid for all $t > 0$. Therefore,

      \[ R_{y/K,2r_1} \le 0.71392, \tag{14.41} \]
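The maximum in (14.40) can be located approximately with a plain grid search. This is not the rigorous bisection with interval arithmetic used in the text: the grid step of 0.05 on $[180, 30000]$ is our choice, and the result is only a floating-point estimate.

```python
import math

A = 2 * math.log(12 * 2 ** (1 / 3) / 2.004)   # constant term in the denominator of f
c = 2 * A - math.log(3)                       # c = 4 log(12 * 2^(1/3) / 2.004) - log 3

def f(t):
    return (4 / 3 * math.log(t) - c) / (2 / 15 * t - 2 / 3 * math.log(t) + A)

# Grid search on [180, 30000] with step 0.05
f_max = max(f(180 + k / 20) for k in range(0, 596401))

# Resulting bound for R_{y/K, 2 r_1}, using the constant from (14.40)
R_bound = 0.27125 * math.log(3 + 0.019562618) + 0.41415
print(c, f_max, R_bound)
```

The maximum sits near $t \approx 516$, where $f$ is very flat, which is why a coarse grid already reproduces the constant to many digits.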

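      and so, by (14.34), we conclude that

      \[ R_{y,K,\varphi,2r_1} \le \frac{0.07455}{3.35976}\cdot 0.71392 + \left(1 - \frac{0.07455}{3.35976}\right)\cdot 0.71215 \le 0.71219. \]

      Since $r_1 = (3/8)y^{4/15}$ and $z(r)$ is increasing for $r \ge 27$, we know that

      \[ \begin{aligned}
      z(r_1) \le z(y^{4/15}) &= e^\gamma\log\log y^{4/15} + \frac{2.50637}{\log\log y^{4/15}}\\
      &= e^\gamma\log\log y + \frac{2.50637}{\log\log y - \log\frac{15}{4}} - e^\gamma\log\frac{15}{4} \le e^\gamma\log\log y - 1.43644
      \end{aligned} \tag{14.42} \]

      for $y \ge 10^{25}$. Hence, (11.13) gives us that

      \[ \begin{aligned}
      L_{2r_1} &\le (e^\gamma\log\log y - 1.43644)\left(\frac{13}{4}\log\frac{3}{4}y^{4/15} + 7.82\right) + 13.66\log\frac{3}{4}y^{4/15} + 37.55\\
      &\le \frac{13}{15}e^\gamma\log y\log\log y + 2.39776\log y + 12.2628\log\log y + 23.7304\\
      &\le (2.13522\log y + 18.118)\log\log y.
      \end{aligned} \]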

      Moreover, again by (14.42),

      \[ \sqrt{z(r_1)} \le \sqrt{e^\gamma\log\log y} - \frac{1.43644}{2\sqrt{e^\gamma\log\log y}}, \]

      and so, by $y \ge 10^{25}$,

      \[ \begin{aligned}
      \left(0.71219\log\frac{3}{4}y^{4/15} + 0.5\right)\sqrt{z(r_1)}
      &\le (0.18992\log y + 0.29512)\left(\sqrt{e^\gamma\log\log y} - \frac{1.43644}{2\sqrt{e^\gamma\log\log y}}\right)\\
      &\le 0.19505\log y\,\sqrt{e^\gamma\log\log y} - \frac{0.19505\cdot 1.43644\log y}{2\sqrt{e^\gamma\log\log y}}\\
      &\le 0.26031\log y\,\sqrt{\log\log y} - 3.00147.
      \end{aligned} \]

      Therefore, by (11.19),

      \[ \begin{aligned}
      g_{y,\varphi}(r_1) &\le \frac{0.26031\log y\sqrt{\log\log y} + 2.5 - 3.00147}{\sqrt{\frac{3}{4}y^{4/15}}} + \frac{(2.13522\log y + 18.118)\log\log y}{\frac{3}{8}y^{4/15}} + \frac{3.36((\log y)/2)^{1/6}}{y^{1/6}}\\
      &\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} - \frac{0.57904}{y^{2/15}} + \frac{48.3147\log\log y}{y^{4/15}} + \frac{2.994(\log y)^{1/6}}{y^{1/6}}\\
      &\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} + \frac{1.30151(\log y)^{1/6}}{y^{1/6}}\\
      &\le \frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}},
      \end{aligned} \]

      where we use $y \ge 10^{25}$ and verify that the functions $t\mapsto(\log t)^{1/6}/t^{1/6-2/15}$, $t\mapsto\sqrt{\log\log t}/t^{4/15-2/15}$ and $t\mapsto(\log\log t)/t^{4/15-2/15}$ are decreasing for $t \ge y$ (just by taking derivatives).

      Since $\kappa = 49$, one of the terms in (13.13) simplifies easily:

      \[ \frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c^-} \le \frac{7}{15}. \]

      By (14.30) and $y = x/\kappa = x/49$, we conclude that

      \[ \begin{aligned}
      \frac{7}{15}\,g(r_1)\,S &\le \frac{7}{15}\cdot\frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}}\cdot(0.640209\log x - 0.021095)\,x\\
      &\le \frac{0.14427\log y\sqrt{\log\log y}}{y^{2/15}}\,(0.640209\log y + 2.4705)\,x \le 0.30517x,
      \end{aligned} \tag{14.43} \]

      where we are using the fact that $y\mapsto(\log y)^2\sqrt{\log\log y}/y^{2/15}$ is decreasing for $y \ge 10^{25}$ (because $y\mapsto(\log y)^{5/2}/y^{2/15}$ is decreasing for $y \ge e^{75/4}$, and $10^{25} > e^{75/4}$).

      It remains only to bound

      \[ \frac{2S}{\log x + 2c^-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \]

      in the expression (13.13) for $M$. We will use the bound on the integral given in (13.33). The easiest term to bound there is $f_1(r_0)$, defined in (13.34), since it depends only on $r_0$: for $r_0 = 150000$,

      \[ f_1(r_0) = 0.0169073\dotsc \]

      It is also not hard to bound $f_2(r_0,x)$, also defined in (13.34):

      \[ f_2(r_0,y) = 3.36\,\frac{((\log y)/2)^{1/6}}{x^{1/6}}\log\frac{\frac{3}{8}y^{4/15}}{r_0} \le 3.36\,\frac{(\log y)^{1/6}}{(2y)^{1/6}}\left(\frac{4}{15}\log y + 0.05699 - \log r_0\right), \]

      where we recall again that $x = \kappa y = 49y$. Thus, since $r_0 = 150000$ and $y \ge 10^{25}$,

      \[ f_2(r_0,y) \le 0.001399. \]

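      Let us now look at the terms $I_{1,r}$, $c_\varphi$ in (13.35). We already saw in (14.33) that

      \[ c_\varphi = \frac{C_{\varphi,2}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}} \le 0.02219. \]

      Since $F(t) = e^\gamma\log t + c_\gamma$ with $c_\gamma = 1.025742$,

      \[ I_{1,r_0} = F(\log r_0) + \frac{2e^\gamma}{\log r_0} = 5.73826\dotsc \tag{14.44} \]

      It thus remains only to estimate $I_{0,r_0,r_1,z}$ for $z = y$ and $z = y/K$, where $K = (\log y)/2$.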

      We will first give estimates for $y$ large. Omitting negative terms from (13.35), we easily get the following general bound, crude but useful enough:

      \[ I_{0,r_0,r_1,z} \le R_{z,2r_0}^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} + \frac{R_{z,2r_1}^2 - 0.41415^2}{\log\frac{r_1}{r_0}}\cdot\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}}, \]

      where $P_2(t) = t^2 + 4t + 8$ and $P_2^-(t) = 2t^2 + 16t + 48$. By (14.38) and (14.41),

      \[ R_{y,2r_1} \le 0.71215, \qquad R_{y/K,2r_1} \le 0.71392 \]

      for $y \ge 10^{25}$. Assume now that $y \ge 10^{150}$. Then, since $r_0 = 150000$,

      \[ R_{y,r_0} \le 0.27125\log\left(1 + \frac{\log 4r_0}{2\log\frac{9\cdot(10^{150})^{1/3}}{2.004\,r_0}}\right) + 0.41415 \le 0.43086, \]

      and, similarly, $R_{y/K,r_0} \le 0.43113$. Since

      \[ 0.43086^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10426, \qquad 0.43113^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10439, \]

      we obtain that

      \[ \begin{aligned}
      &(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}\\
      &\quad \le 0.97781\cdot\sqrt{0.10426 + \frac{0.49214}{\frac{4}{15}\log y - \log 400000}} + 0.02219\sqrt{0.10439 + \frac{0.49584}{\frac{4}{15}\log y - \log 400000}} \le 0.33239
      \end{aligned} \tag{14.45} \]

      for $y \ge 10^{150}$.

      For $y$ between $10^{25}$ and $10^{150}$, we evaluate the left side of (14.45) directly, using the definition (13.35) of $I_{0,r_0,r_1,z}$ instead, as well as the bound

      \[ c_\varphi \le \frac{0.07455}{\log\frac{\log y}{2}} \]

      from (14.33). (It is clear from the second and third lines of (13.32) that $I_{0,r_0,r_1,z}$ is decreasing on $z$ for $r_0$, $r_1$ fixed, and so the upper bound for $c_\varphi$ does give the worst case.) The bisection method (applied to the interval $[25,150]$ with 30 iterations, including 30 initial iterations) gives us that

      \[ (1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \le 0.4153461 \tag{14.46} \]

      for $10^{25} \le y \le 10^{150}$. By (14.45), (14.46) is also true for $y > 10^{150}$. Hence

      \[ f_0(r_0,y) \le 0.4153461\cdot\sqrt{\frac{2}{\sqrt{r_0}}\,5.73827} \le 0.071498. \]

      By (13.33), we conclude that

      \[ \int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.071498 + 0.016908 + 0.001399 \le 0.089805. \]

      By (14.30),

      \[ \frac{2S}{\log x + 2c^-} \le \frac{2(0.640209\,x\log x - 0.021095x)}{\log x + 2c^-} \le 2\cdot 0.640209x = 1.280418x, \]

      where we recall that $c^- = 0.6394 > 0$. Hence

      \[ \frac{2S}{\log x + 2c^-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.114988x. \tag{14.47} \]

      Putting (14.37), (14.43) and (14.47) together, we conclude that the quantity $M$ defined in (13.13) is bounded by

      \[ M \le 0.38492x + 0.30517x + 0.114988x \le 0.80508x. \tag{14.48} \]

      Gathering the terms from (14.29), (14.32) and (14.48), we see that Theorem 13.2.1 states that the minor-arc total

      \[ Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha,x)|\,|S_{\eta_+}(\alpha,x)|^2\,d\alpha \]

      is bounded by

      \[ \begin{aligned}
      Z_{r_0} &\le \left(\sqrt{|\varphi|_1\frac{x}{\kappa}(M + T)} + \sqrt{S_{\eta_*}(0,x)\cdot E}\right)^2\\
      &\le \left(\sqrt{|\varphi|_1(0.80508 + 3.5776\cdot 10^{-4})}\,\frac{x}{\sqrt{\kappa}} + \sqrt{1.0532\cdot 10^{-11}}\,\frac{x}{\sqrt{\kappa}}\right)^2 \le 1.00948\,\frac{x^2}{\kappa}
      \end{aligned} \tag{14.49} \]

      for $r_0 = 150000$, $x \ge 4.9\cdot 10^{26}$, where we use yet again the fact that $|\varphi|_1 = \sqrt{\pi/2}$. This is our total minor-arc bound.
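The last few steps of the minor-arc estimate are pure arithmetic on constants already obtained, so they can be replayed directly. A floating-point sketch, with our own variable names; the tolerance `1e-12` only absorbs rounding in sums that are exact on paper.

```python
import math

r0 = 150000
EULER_GAMMA = 0.5772156649015329

# (14.44): I_{1,r_0} = F(log r_0) + 2 e^gamma / log r_0 with F(t) = e^gamma log t + 1.025742
eg = math.exp(EULER_GAMMA)
I1 = eg * math.log(math.log(r0)) + 1.025742 + 2 * eg / math.log(r0)

f0 = 0.4153461 * math.sqrt(2 / math.sqrt(r0) * 5.73827)
integral = 0.071498 + 0.016908 + 0.001399        # bound for the integral of g(r)/r
tail = 1.280418 * 0.089805                       # (14.47), in units of x
M = 0.38492 + 0.30517 + 0.114988                 # (14.48), in units of x

# (14.49), in units of x^2 / kappa
Z = (math.sqrt(math.sqrt(math.pi / 2) * (0.80508 + 3.5776e-4)) + math.sqrt(1.0532e-11)) ** 2
print(I1, f0, integral, tail, M, Z, 1.058259 - 1.00948)
```

The final printed difference is the margin between the major-arc lower bound (14.27) and the minor-arc upper bound (14.49), both in units of $x^2/\kappa$.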

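      14.4 Conclusion: proof of main theorem

      As we have known from the start,

      \[ \sum_{n_1+n_2+n_3=N}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) = \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha. \tag{14.50} \]

      We have just shown that, assuming $N \ge 10^{27}$, $N$ odd,

      \[ \begin{aligned}
      \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha
      &= \int_{\mathfrak{M}_{8,r_0}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha\\
      &\quad + O^*\left(\int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha,x)|^2\,|S_{\eta_*}(\alpha,x)|\,d\alpha\right)\\
      &\ge 1.058259\,\frac{x^2}{\kappa} + O^*\left(1.00948\,\frac{x^2}{\kappa}\right) \ge 0.04877\,\frac{x^2}{\kappa}
      \end{aligned} \]

      for $r_0 = 150000$, where $x = N/(2 + 9/(196\sqrt{2\pi}))$, as in (14.15). (We are using (14.27) and (14.49).) Recall that $\kappa = 49$ and $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$, where $\varphi(t) = t^2 e^{-t^2/2}$.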

      It only remains to show that the contribution of terms with $n_1$, $n_2$ or $n_3$ non-prime to the sum in (14.50) is negligible. (Let us take out $n_1$, $n_2$, $n_3$ equal to $2$ as well, since some prefer to state the ternary Goldbach conjecture as follows: every odd number $\ge 9$ is the sum of three odd primes.) Clearly,

      \[ \begin{aligned}
      &\sum_{\substack{n_1+n_2+n_3=N\\ n_1,\,n_2\text{ or }n_3\text{ even or non-prime}}}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3)\\
      &\quad\le 3|\eta_+|_\infty^2|\eta_*|_\infty\sum_{\substack{n_1+n_2+n_3=N\\ n_1\text{ even or non-prime}}}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\\
      &\quad\le 3|\eta_+|_\infty^2|\eta_*|_\infty\cdot(\log N)\sum_{\substack{n_1\le N\text{ non-prime}\\ \text{or }n_1=2}}\Lambda(n_1)\sum_{n_2\le N}\Lambda(n_2).
      \end{aligned} \tag{14.51} \]

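      By (14.3) and (14.21), $|\eta_+|_\infty \le 1.079955$ and $|\eta_*|_\infty \le 1.414$. By [RS62, Thms. 12 and 13],

      \[ \begin{aligned}
      \sum_{\substack{n_1\le N\text{ non-prime}\\ \text{or }n_1=2}}\Lambda(n_1) &< 1.4262\sqrt{N} + \log 2 < 1.4263\sqrt{N},\\
      \sum_{\substack{n_1\le N\text{ non-prime}\\ \text{or }n_1=2}}\Lambda(n_1)\sum_{n_2\le N}\Lambda(n_2) &= 1.4263\sqrt{N}\cdot 1.03883N \le 1.48169N^{3/2}.
      \end{aligned} \]

      Hence, the sum on the first line of (14.51) is at most

      \[ 7.3306\,N^{3/2}\log N. \]

      Thus, for $N \ge 10^{27}$ odd,

      \[ \begin{aligned}
      \sum_{\substack{n_1+n_2+n_3=N\\ n_1,\,n_2,\,n_3\text{ odd primes}}}\Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_+(n_1)\eta_+(n_2)\eta_*(n_3)
      &\ge 0.04877\,\frac{x^2}{\kappa} - 7.3306\,N^{3/2}\log N\\
      &\ge 0.00024433N^2 - 1.4412\cdot 10^{-11}\cdot N^2 \ge 0.0002443N^2
      \end{aligned} \]

      by $\kappa = 49$ and (14.15). Since $0.0002443N^2 > 0$, this shows that every odd number $N \ge 10^{27}$ can be written as the sum of three odd primes.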
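The closing inequality is a comparison of two explicit coefficients of $N^2$, which the following floating-point sketch reproduces at the threshold $N = 10^{27}$ (variable names are ours).

```python
import math

kappa = 49
ratio = 1 / (2 + 9 / (196 * math.sqrt(2 * math.pi)))    # x / N, as in (14.15)
main = 0.04877 * ratio**2 / kappa                        # coefficient of N^2 in the main term

# Weight in front of N^(3/2) log N in the non-prime contribution
weights = 3 * 1.079955**2 * 1.414 * (1.4263 * 1.03883)

N = 1e27
err = 7.3306 * math.log(N) / math.sqrt(N)                # N^(3/2) log N, relative to N^2
print(main, weights, err, main - err)
```

Since $(\log N)/\sqrt{N}$ is decreasing for $N \ge e^2$, the error coefficient at $N = 10^{27}$ is the largest one in the range considered.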

      Since the ternary Goldbach conjecture has already been checked for all $N \le 8.875\cdot 10^{30}$ [HP13], we conclude that every odd number $N > 7$ can be written as the sum of three odd primes, and every odd number $N > 5$ can be written as the sum of three primes. The main result is hereby proven: the ternary Goldbach conjecture is true.

      Part IV

      Appendices


      Appendix A

      Norms of smoothing functions

      Our aim here is to give bounds on the norms of some smoothing functions, and, in particular, on several norms of a smoothing function $\eta_+ : [0,\infty)\to\mathbb{R}$ based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

      As before, we write

      \[ h : t\mapsto\begin{cases} t^2(2-t)^3 e^{t-1/2} & \text{if } t\in[0,2],\\ 0 & \text{otherwise.}\end{cases} \tag{A.1} \]

      We recall that we will work with an approximation $\eta_+$ to the function $\eta_\circ : [0,\infty)\to\mathbb{R}$ defined by

      \[ \eta_\circ(t) = h(t)\,\eta_\heartsuit(t) = \begin{cases} t^3(2-t)^3 e^{-(t-1)^2/2} & \text{for } t\in[0,2],\\ 0 & \text{otherwise.}\end{cases} \tag{A.2} \]

      The approximation $\eta_+$ is defined by

      \[ \eta_+(t) = h_H(t)\,t e^{-t^2/2}, \tag{A.3} \]

      where

      \[ \begin{aligned}
      F_H(y) &= \frac{\sin(H\log y)}{\pi\log y},\\
      h_H(t) &= (h *_M F_H)(t) = \int_0^\infty h(ty^{-1})\,F_H(y)\,\frac{dy}{y},
      \end{aligned} \tag{A.4} \]

      and $H$ is a positive constant to be set later. By (2.8), $Mh_H = Mh\cdot MF_H$. Now $F_H$ is just a Dirichlet kernel under a change of variables; using this, we get that, for $\tau$ real,

      \[ MF_H(i\tau) = \begin{cases} 1 & \text{if } |\tau| < H,\\ 1/2 & \text{if } |\tau| = H,\\ 0 & \text{if } |\tau| > H.\end{cases} \tag{A.5} \]

      Thus,

      \[ Mh_H(i\tau) = \begin{cases} Mh(i\tau) & \text{if } |\tau| < H,\\ \frac{1}{2}Mh(i\tau) & \text{if } |\tau| = H,\\ 0 & \text{if } |\tau| > H.\end{cases} \tag{A.6} \]

      As it turns out, $h$, $\eta_\circ$ and $Mh$ (and hence $Mh_H$) are relatively easy to work with, whereas we can already see that $h_H$ and $\eta_+$ have more complicated definitions. Part of our work will consist in expressing norms of $h_H$ and $\eta_+$ in terms of norms of $h$, $\eta_\circ$ and $Mh$.

      A.1 The decay of a Mellin transform

      Now, consider any $\phi : [0,\infty)\to\mathbb{C}$ that (a) has compact support (or fast decay), (b) satisfies $\phi^{(k)}(t)\,t^{k-1} = O(1)$ for $t\to 0^+$ and $0\le k\le 3$, and (c) is $C^2$ everywhere and quadruply differentiable outside a finite set of points.

      By definition,

      \[ M\phi(s) = \int_0^\infty \phi(x)\,x^s\,\frac{dx}{x}. \]

      Thus, by integration by parts, for $\Re(s) > -1$ and $s \ne 0$,

      \[ \begin{aligned}
      M\phi(s) &= \int_0^\infty \phi(x)\,x^s\,\frac{dx}{x} = \lim_{t\to 0^+}\int_t^\infty \phi(x)\,x^s\,\frac{dx}{x} = -\lim_{t\to 0^+}\int_t^\infty \phi'(x)\,\frac{x^s}{s}\,dx\\
      &= \lim_{t\to 0^+}\int_t^\infty \phi''(x)\,\frac{x^{s+1}}{s(s+1)}\,dx = \lim_{t\to 0^+} -\int_t^\infty \phi^{(3)}(x)\,\frac{x^{s+2}}{s(s+1)(s+2)}\,dx\\
      &= \lim_{t\to 0^+}\int_t^\infty \phi^{(4)}(x)\,\frac{x^{s+3}}{s(s+1)(s+2)(s+3)}\,dx,
      \end{aligned} \tag{A.7} \]

      where $\phi^{(4)}(x)$ is understood in the sense of distributions at the finitely many points where it is not well-defined as a function.

      Let $s = it$, $\phi = h$. Let $C_k = \lim_{t\to 0^+}\int_t^\infty |h^{(k)}(x)|\,x^{k-1}\,dx$ for $0\le k\le 4$. Then (A.7) gives us that

      \[ |Mh(it)| \le \min\left(C_0,\ \frac{C_1}{|t|},\ \frac{C_2}{|t||t+i|},\ \frac{C_3}{|t||t+i||t+2i|},\ \frac{C_4}{|t||t+i||t+2i||t+3i|}\right). \tag{A.8} \]

      We must estimate the constants $C_j$, $0\le j\le 4$.

      Clearly $h(t)\,t^{-1} = O(1)$ as $t\to 0^+$, $h^{(k)}(t) = O(1)$ as $t\to 0^+$ for all $k\ge 1$, $h(2) = h'(2) = h''(2) = 0$, and $h(x)$, $h'(x)$ and $h''(x)$ are all continuous. The function $h'''$ has a discontinuity at $t = 2$. As we said, we understand $h^{(4)}$ in the sense of distributions at $t = 2$; for example, $\lim_{\epsilon\to 0}\int_{2-\epsilon}^{2+\epsilon} h^{(4)}(t)\,dt = \lim_{\epsilon\to 0}\left(h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon)\right)$.

      Symbolic integration easily gives that

      \[ C_0 = \int_0^2 t(2-t)^3 e^{t-1/2}\,dt = 92e^{-1/2} - 12e^{3/2} = 2.02055184\dotsc \tag{A.9} \]

      We will have to compute $C_k$, $1\le k\le 4$, with some care, due to the absolute value involved in the definition.

      The function $(x^2(2-x)^3 e^{x-1/2})' = ((x^2(2-x)^3)' + x^2(2-x)^3)\,e^{x-1/2}$ has the same zeros as $H_1(x) = (x^2(2-x)^3)' + x^2(2-x)^3$, namely, $-4$, $0$, $1$ and $2$. The sign of $H_1(x)$ (and hence of $h'(x)$) is $+$ within $(0,1)$ and $-$ within $(1,2)$. Hence

      \[ C_1 = \int_0^\infty |h'(x)|\,dx = |h(1) - h(0)| + |h(2) - h(1)| = 2h(1) = 2\sqrt{e}. \tag{A.10} \]

      The situation with $(x^2(2-x)^3 e^{x-1/2})''$ is similar: it has zeros at the roots of $H_2(x) = 0$, where $H_2(x) = H_1(x) + H_1'(x)$ (and, in general, $H_{k+1}(x) = H_k(x) + H_k'(x)$). This time, we will prefer to find the roots numerically. It is enough to find (candidates for) the roots using any available tool$^1$ and then check rigorously that the sign does change around the purported roots. In this way, we check that $H_2(x) = 0$ has two roots $\alpha_{2,1}$, $\alpha_{2,2}$ in the interval $(0,2)$, another root at $2$, and two more roots outside $[0,2]$; moreover,

      \[ \begin{aligned}
      \alpha_{2,1} &= 0.48756597185712\dotsc,\\
      \alpha_{2,2} &= 1.48777169309489\dotsc,
      \end{aligned} \tag{A.11} \]

      where we verify the root using interval arithmetic. The sign of $H_2(x)$ (and hence of $h''(x)$) is first $+$, then $-$, then $+$. Write $\alpha_{2,0} = 0$, $\alpha_{2,3} = 2$. By integration by parts,

      \[ \begin{aligned}
      C_2 &= \int_0^\infty |h''(x)|\,x\,dx = \int_0^{\alpha_{2,1}} h''(x)\,x\,dx - \int_{\alpha_{2,1}}^{\alpha_{2,2}} h''(x)\,x\,dx + \int_{\alpha_{2,2}}^2 h''(x)\,x\,dx\\
      &= \sum_{j=1}^3 (-1)^{j+1}\left(h'(x)\,x\Big|_{\alpha_{2,j-1}}^{\alpha_{2,j}} - \int_{\alpha_{2,j-1}}^{\alpha_{2,j}} h'(x)\,dx\right)\\
      &= 2\sum_{j=1}^2 (-1)^{j+1}\left(h'(\alpha_{2,j})\,\alpha_{2,j} - h(\alpha_{2,j})\right) = 10.79195821037\dotsc
      \end{aligned} \tag{A.12} \]
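The constants $C_0$, $C_1$, $C_2$ can be reproduced with elementary tools. The sketch below expands $H_1(x) = x(2-x)^2(4-3x-x^2)$ into coefficients, builds $H_2 = H_1 + H_1'$, and locates the two roots in $(0,2)$ by plain floating-point bisection. This is a swapped-in technique: the text uses a root finder followed by interval arithmetic, and the helper names and bracketing intervals here are ours.

```python
import math

# H_1(x) = -x^5 + x^4 + 12x^3 - 28x^2 + 16x, coefficients in increasing degree
H1 = [0.0, 16.0, -28.0, 12.0, 1.0, -1.0]

def poly_eval(p, x):
    """Horner evaluation of a polynomial given in increasing degree."""
    acc = 0.0
    for coef in reversed(p):
        acc = acc * x + coef
    return acc

def poly_next(p):
    """Return H_{k+1} = H_k + H_k' as a coefficient list of the same length."""
    deriv = [k * coef for k, coef in enumerate(p)][1:] + [0.0]
    return [a + b for a, b in zip(p, deriv)]

def bisect(f, lo, hi, steps=200):
    """Bisection on an interval over which f changes sign."""
    assert f(lo) * f(hi) < 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def h(x):
    return x * x * (2 - x) ** 3 * math.exp(x - 0.5)

def h_prime(x):
    return poly_eval(H1, x) * math.exp(x - 0.5)

H2 = poly_next(H1)
a21 = bisect(lambda x: poly_eval(H2, x), 0.3, 0.7)
a22 = bisect(lambda x: poly_eval(H2, x), 1.3, 1.7)

C0 = 92 * math.exp(-0.5) - 12 * math.exp(1.5)
C1 = 2 * h(1.0)
C2 = 2 * sum((-1) ** (j + 1) * (h_prime(a) * a - h(a)) for j, a in ((1, a21), (2, a22)))
print(a21, a22, C0, C1, C2)
```

Applying `poly_next` again gives $H_3$ and $H_4$, so the same pattern extends to $C_3$ and $C_4$ once the boundary terms at $t = 2$ are handled as described in the text.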

To compute $C_3$, we proceed in the same way, finding two roots of $H_3(x) = 0$ (numerically) within the interval $(0,2)$, viz.,

\[ \alpha_{3,1} = 1.04294565694978, \qquad \alpha_{3,2} = 1.80999654602916. \]

The sign of $H_3(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Write $\alpha_{3,0} = 0$, $\alpha_{3,3} = 2$. Proceeding as before (with the only difference that the integration by parts is iterated once now) we obtain that

¹ Routine find_root in SAGE was used here.

\[
\begin{aligned}
C_3 &= \int_0^\infty |h'''(x)|\,x^2\,dx = \sum_{j=1}^{3} (-1)^{j} \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h'''(x)x^2\,dx \\
&= \sum_{j=1}^{3} (-1)^{j}\left( h''(x)x^2\Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} - \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h''(x)\cdot 2x\,dx \right) \\
&= \sum_{j=1}^{3} (-1)^{j}\left( h''(x)x^2 - h'(x)\cdot 2x + 2h(x) \right)\Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} \\
&= 2\sum_{j=1}^{2} (-1)^{j}\left( h''(\alpha_{3,j})\alpha_{3,j}^2 - 2h'(\alpha_{3,j})\alpha_{3,j} + 2h(\alpha_{3,j}) \right),
\end{aligned}
\tag{A.13}
\]

and so interval arithmetic gives us

\[ C_3 = 75.1295251672. \tag{A.14} \]
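The recursion $H_{k+1} = H_k + H_k'$ and the closed forms in (A.10), (A.12) and (A.13) lend themselves to a quick floating-point sanity check. The following is only a minimal sketch, not the rigorous interval-arithmetic computation described in the text: it represents each $H_k$ by its coefficient list, locates sign changes on a grid and refines them by plain bisection. The helper names (`poly_eval`, `plus_derivative`, `roots_in_0_2`, `h_deriv`) are hypothetical.

```python
import math

def poly_eval(coeffs, x):
    """Evaluate c[0] + c[1]*x + ... by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def plus_derivative(coeffs):
    """Coefficients of H + H', implementing H_{k+1} = H_k + H_k'."""
    deriv = [k * coeffs[k] for k in range(1, len(coeffs))] + [0.0]
    return [a + b for a, b in zip(coeffs, deriv)]

# H_0(x) = x^2 (2 - x)^3 = 8x^2 - 12x^3 + 6x^4 - x^5, lowest degree first.
H = [[0.0, 0.0, 8.0, -12.0, 6.0, -1.0]]
for _ in range(3):
    H.append(plus_derivative(H[-1]))

def h_deriv(k, x):
    """k-th derivative of h on (0, 2): h^(k)(x) = H_k(x) * e^(x - 1/2)."""
    return poly_eval(H[k], x) * math.exp(x - 0.5)

def roots_in_0_2(coeffs, steps=200, iterations=60):
    """Roots in the open interval (0, 2): scan for sign changes, then bisect."""
    xs = [2.0 * i / steps for i in range(1, steps)]
    found = []
    for lo, hi in zip(xs, xs[1:]):
        flo, fhi = poly_eval(coeffs, lo), poly_eval(coeffs, hi)
        if flo * fhi < 0:
            for _ in range(iterations):
                mid = 0.5 * (lo + hi)
                fmid = poly_eval(coeffs, mid)
                if flo * fmid <= 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            found.append(0.5 * (lo + hi))
    return found

alpha2 = roots_in_0_2(H[2])   # candidates for alpha_{2,1}, alpha_{2,2}
alpha3 = roots_in_0_2(H[3])   # candidates for alpha_{3,1}, alpha_{3,2}

C1 = 2 * h_deriv(0, 1.0)                                    # (A.10)
C2 = 2 * sum((-1) ** (j + 1) * (h_deriv(1, a) * a - h_deriv(0, a))
             for j, a in enumerate(alpha2, start=1))        # (A.12)
C3 = 2 * sum((-1) ** j * (h_deriv(2, a) * a * a - 2 * h_deriv(1, a) * a
                          + 2 * h_deriv(0, a))
             for j, a in enumerate(alpha3, start=1))        # (A.13)
print(alpha2, alpha3, C1, C2, C3)
```

Floating-point bisection of this kind only produces candidates; a rigorous proof still needs the sign check around each root in interval arithmetic, as the text says.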

The treatment of the integral in $C_4$ is very similar, at least at first. There are two roots of $H_4(x) = 0$ in the interval $(0,2)$, namely,

\[ \alpha_{4,1} = 0.45839599852663, \qquad \alpha_{4,2} = 1.54626346975533. \]

The sign of $H_4(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Using integration by parts as before, we obtain

\[
\begin{aligned}
\int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx
&= -\int_{0^+}^{\alpha_{4,1}} h^{(4)}(x)x^3\,dx + \int_{\alpha_{4,1}}^{\alpha_{4,2}} h^{(4)}(x)x^3\,dx - \int_{\alpha_{4,2}}^{2^-} h^{(4)}(x)x^3\,dx \\
&= 2\sum_{j=1}^{2} (-1)^{j}\left( h^{(3)}(\alpha_{4,j})\alpha_{4,j}^3 - 3h^{(2)}(\alpha_{4,j})\alpha_{4,j}^2 + 6h'(\alpha_{4,j})\alpha_{4,j} - 6h(\alpha_{4,j}) \right) - \lim_{t\to 2^-} h^{(3)}(t)t^3 \\
&= 1152.69754862,
\end{aligned}
\]

since $\lim_{t\to 0^+} h^{(k)}(t)t^k = 0$ for $0\le k\le 3$, $\lim_{t\to 2^-} h^{(k)}(t) = 0$ for $0\le k\le 2$ and $\lim_{t\to 2^-} h^{(3)}(t) = -24e^{3/2}$. Now

\[ \int_{2^-}^{\infty} |h^{(4)}(x)x^3|\,dx = \lim_{\epsilon\to 0^+} |h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon)|\cdot 2^3 = 2^3\cdot 24e^{3/2}. \]

Hence

\[ C_4 = \int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\,dx + 24e^{3/2}\cdot 2^3 = 2013.18185012. \tag{A.15} \]


We finish by remarking that we can write down $Mh$ explicitly:

\[ Mh = -e^{-1/2}(-1)^{-s}\left(8\gamma(s+2,-2) + 12\gamma(s+3,-2) + 6\gamma(s+4,-2) + \gamma(s+5,-2)\right), \tag{A.16} \]

where $\gamma(s,x)$ is the (lower) incomplete Gamma function

\[ \gamma(s,x) = \int_0^x e^{-t}t^{s-1}\,dt. \]

We will, however, find it easier to deal with $Mh$ by means of the bound (A.8), in part because (A.16) amounts to an invitation to numerical instability.

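For instance, it is easy to use (A.8) to give a bound for the $\ell_1$-norm of $Mh(it)$. Since $C_4/C_3 > C_3/C_2 > C_2/C_1 > C_1/C_0$,

\[
\begin{aligned}
|Mh(it)|_1 &= 2\int_0^\infty |Mh(it)|\,dt \\
&\le 2\left( C_0\cdot\frac{C_1}{C_0} + C_1\int_{C_1/C_0}^{C_2/C_1}\frac{dt}{t} + C_2\int_{C_2/C_1}^{C_3/C_2}\frac{dt}{t^2} + C_3\int_{C_3/C_2}^{C_4/C_3}\frac{dt}{t^3} + C_4\int_{C_4/C_3}^{\infty}\frac{dt}{t^4} \right) \\
&= 2\left( C_1 + C_1\log\frac{C_2C_0}{C_1^2} + C_2\left(\frac{C_1}{C_2} - \frac{C_2}{C_3}\right) + \frac{C_3}{2}\left(\frac{C_2^2}{C_3^2} - \frac{C_3^2}{C_4^2}\right) + \frac{C_4}{3}\cdot\frac{C_3^3}{C_4^3} \right),
\end{aligned}
\]

and so

\[ |Mh(it)|_1 \le 16.1939176. \tag{A.17} \]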
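The piecewise bound behind (A.17) can be reproduced in a few lines. The sketch below makes one assumption that is not stated in this section: that $C_0 = \int_0^2 |h(x)|\,dx/x$ with $h(x) = x^2(2-x)^3e^{x-1/2}$, i.e. the $k=0$ case of the pattern followed by $C_1,\dots,C_4$. It computes that integral by Simpson's rule in plain floating point and takes $C_1,\dots,C_4$ from (A.10), (A.12), (A.14) and (A.15). The function names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

# Assumption: C0 = int_0^2 h(x) dx/x, with h(x)/x = x (2 - x)^3 e^(x - 1/2).
C0 = simpson(lambda x: x * (2 - x) ** 3 * math.exp(x - 0.5), 0.0, 2.0)
# C1..C4 as stated in (A.10), (A.12), (A.14), (A.15).
C = [C0, 2 * math.sqrt(math.e), 10.79195821037, 75.1295251672, 2013.18185012]

def l1_bound(C):
    """2 * integral over t >= 0 of min(C0, C1/t, C2/t^2, C3/t^3, C4/t^4)."""
    cuts = [C[k + 1] / C[k] for k in range(4)]   # C1/C0 < C2/C1 < C3/C2 < C4/C3
    assert cuts == sorted(cuts)
    total = C[0] * cuts[0]                        # constant bound C0 on [0, C1/C0]
    total += C[1] * math.log(cuts[1] / cuts[0])   # bound C1/t
    for k in (2, 3):                              # bound C_k/t^k on [cuts[k-1], cuts[k]]
        total += C[k] / (k - 1) * (cuts[k - 1] ** (1 - k) - cuts[k] ** (1 - k))
    total += C[4] / 3 * cuts[3] ** (-3)           # tail, bound C4/t^4
    return 2 * total

bound = l1_bound(C)
print(C0, bound)
```

The ordering of the breakpoints $C_{k+1}/C_k$ is exactly the condition quoted before the display; the `assert` inside `l1_bound` checks it before integrating.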

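This bound is far from tight, but it will certainly be useful.

Similarly, $|(t+i)Mh(it)|_1$ is at most two times

\[
\begin{aligned}
& C_0\int_0^{C_1/C_0} |t+i|\,dt + C_1\int_{C_1/C_0}^{C_2/C_1}\left|1+\frac{i}{t}\right| dt + C_2\int_{C_2/C_1}^{C_3/C_2}\frac{dt}{t} + C_3\int_{C_3/C_2}^{C_4/C_3}\frac{dt}{t^2} + C_4\int_{C_4/C_3}^{\infty}\frac{dt}{t^3} \\
&= \frac{C_0}{2}\left(\sqrt{\frac{C_1^4}{C_0^4} + \frac{C_1^2}{C_0^2}} + \sinh^{-1}\frac{C_1}{C_0}\right) + C_1\left(\sqrt{t^2+1} + \log\left(\frac{\sqrt{t^2+1}-1}{t}\right)\right)\Bigg|_{C_1/C_0}^{C_2/C_1} \\
&\quad + C_2\log\frac{C_3C_1}{C_2^2} + C_3\left(\frac{C_2}{C_3} - \frac{C_3}{C_4}\right) + \frac{C_4}{2}\cdot\frac{C_3^2}{C_4^2},
\end{aligned}
\]

and so

\[ |(t+i)Mh(it)|_1 \le 27.8622803. \tag{A.18} \]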

A.2 The difference $\eta_+ - \eta$ in $\ell_2$ norm

We wish to estimate the distance in $\ell_2$ norm between $\eta$ and its approximation $\eta_+$. This will be an easy affair, since, on the imaginary axis, the Mellin transform of $\eta_+$ is just a truncation of the Mellin transform of $\eta$.


By (A.2) and (A.3),

\[
\begin{aligned}
|\eta_+ - \eta|_2^2 &= \int_0^\infty \left| h_H(t)te^{-t^2/2} - h(t)te^{-t^2/2} \right|^2 dt \\
&\le \left(\max_{t\ge 0} e^{-t^2}t^3\right)\cdot \int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t}.
\end{aligned}
\tag{A.19}
\]

The maximum $\max_{t\ge 0} t^3e^{-t^2}$ is $(3/2)^{3/2}e^{-3/2}$. Since the Mellin transform is an isometry (i.e., (2.6) holds),

\[ \int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t} = \frac{1}{2\pi}\int_{-\infty}^{\infty} |Mh_H(it) - Mh(it)|^2\,dt = \frac{1}{\pi}\int_H^\infty |Mh(it)|^2\,dt. \tag{A.20} \]

By (A.8),

\[ \int_H^\infty |Mh(it)|^2\,dt \le \int_H^\infty \frac{C_4^2}{t^8}\,dt \le \frac{C_4^2}{7H^7}. \tag{A.21} \]

Hence

\[ \int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t} \le \frac{C_4^2}{7\pi H^7}. \tag{A.22} \]

Using the bound (A.15) for $C_4$, we conclude that

\[ |\eta_+ - \eta|_2 \le \frac{C_4}{\sqrt{7\pi}}\left(\frac{3}{2e}\right)^{3/4}\cdot\frac{1}{H^{7/2}} \le \frac{274.856893}{H^{7/2}}. \tag{A.23} \]

It will also be useful to bound

\[ \left| \int_0^\infty (\eta_+(t) - \eta(t))^2\log t\,dt \right|. \]

This is at most

\[ \left(\max_{t\ge 0} e^{-t^2}t^3|\log t|\right)\cdot\int_0^\infty |h_H(t) - h(t)|^2\,\frac{dt}{t}. \]

Now

\[ \max_{t\ge 0} e^{-t^2}t^3|\log t| = \max\left( \max_{t\in[0,1]} e^{-t^2}t^3(-\log t),\ \max_{t\in[1,5]} e^{-t^2}t^3\log t \right) = 0.14882234545, \]

where we find the maximum by the bisection method with 40 iterations (see §2.6). Hence, by (A.22),

\[ \int_0^\infty (\eta_+(t) - \eta(t))^2|\log t|\,dt \le 0.148822346\cdot\frac{C_4^2}{7\pi H^7} \le \frac{27427.502}{H^7} \le \left(\frac{165.61251}{H^{7/2}}\right)^2. \tag{A.24} \]
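The constants in (A.23) and (A.24) follow from $C_4$ and two one-variable maxima, so they are easy to recompute. The following is a non-rigorous floating-point sketch: it uses a ternary search (a stand-in for the interval-arithmetic bisection of §2.6, not the author's code) on each of the two intervals named in the text, where the function is unimodal. The name `ternary_max` is hypothetical.

```python
import math

C4 = 2013.18185012  # value stated in (A.15)

def ternary_max(f, lo, hi, iterations=200):
    """Maximum of a unimodal function on [lo, hi] by ternary search."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return f(0.5 * (lo + hi))

def weight_log(t):
    """e^{-t^2} t^3 |log t|, extended by 0 at t = 0."""
    return math.exp(-t * t) * t ** 3 * abs(math.log(t)) if t > 0 else 0.0

# (A.23): max t^3 e^{-t^2} = (3/2)^{3/2} e^{-3/2}, attained at t^2 = 3/2.
max_plain = 1.5 ** 1.5 * math.exp(-1.5)
const_A23 = C4 / math.sqrt(7 * math.pi) * math.sqrt(max_plain)

# (A.24): maximum over [0, 1] and [1, 5], as in the text.
max_log = max(ternary_max(weight_log, 0.0, 1.0), ternary_max(weight_log, 1.0, 5.0))
const_A24 = max_log * C4 ** 2 / (7 * math.pi)

print(const_A23, max_log, const_A24, math.sqrt(const_A24))
```

Note that $\sqrt{(3/2)^{3/2}e^{-3/2}} = (3/(2e))^{3/4}$, which is the factor appearing in (A.23).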


A.3 Norms involving $\eta_+$

Let us now bound some $\ell_1$- and $\ell_2$-norms involving $\eta_+$. Relatively crude bounds will suffice in most cases.

First, by (A.23),

\[
\begin{aligned}
|\eta_+|_2 &\le |\eta|_2 + |\eta_+ - \eta|_2 \le 0.800129 + \frac{274.8569}{H^{7/2}}, \\
|\eta_+|_2 &\ge |\eta|_2 - |\eta_+ - \eta|_2 \ge 0.800128 - \frac{274.8569}{H^{7/2}},
\end{aligned}
\tag{A.25}
\]

where we obtain

\[ |\eta|_2 = \sqrt{0.640205997} = 0.8001287 \tag{A.26} \]

by symbolic integration.

Let us now bound $|\eta_+\cdot\log|_2^2$. By isometry and (2.10),

\[ |\eta_+\cdot\log|_2^2 = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} |M(\eta_+\cdot\log)(s)|^2\,ds = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M\eta_+)'(s)|^2\,ds. \]

Now, $(M\eta_+)'(1/2+it)$ equals $1/2\pi$ times the additive convolution of $Mh_H(it)$ and $(M\eta_\diamondsuit)'(1/2+it)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Hence, by Young's inequality,

\[ |(M\eta_+)'(1/2+it)|_2 \le \frac{1}{2\pi}\,|Mh_H(it)|_1\,|(M\eta_\diamondsuit)'(1/2+it)|_2. \]

Again by isometry and (2.10),

\[ |(M\eta_\diamondsuit)'(1/2+it)|_2 = \sqrt{2\pi}\,|\eta_\diamondsuit\cdot\log|_2. \]

Hence, by (A.17),

\[ |\eta_+\cdot\log|_2 \le \frac{1}{2\pi}\,|Mh_H(it)|_1\,|\eta_\diamondsuit\cdot\log|_2 \le 2.5773421\cdot|\eta_\diamondsuit\cdot\log|_2. \]

Since, by symbolic integration,

\[ |\eta_\diamondsuit\cdot\log|_2 \le \sqrt{\frac{\sqrt{\pi}}{32}\left(8(\log 2)^2 + 2\gamma^2 + \pi^2 + 8(\gamma-2)\log 2 - 8\gamma\right)} \le 0.3220301, \tag{A.27} \]

we get that

\[ |\eta_+\cdot\log|_2 \le 0.8299818. \tag{A.28} \]

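Let us bound $|\eta_+(t)t^\sigma|_1$ for $\sigma\in(-2,\infty)$. By Cauchy-Schwarz and Plancherel,

\[
\begin{aligned}
|\eta_+(t)t^\sigma|_1 &= \left| h_H(t)t^{1+\sigma}e^{-t^2/2} \right|_1 \le \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 \left| h_H(t)/\sqrt{t} \right|_2 \\
&= \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 \sqrt{\int_0^\infty |h_H(t)|^2\,\frac{dt}{t}} = \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\sqrt{\frac{1}{2\pi}\int_{-H}^{H} |Mh(ir)|^2\,dr} \\
&\le \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |Mh(ir)|^2\,dr} = \left| t^{\sigma+3/2}e^{-t^2/2} \right|_2\cdot\left| h(t)/\sqrt{t} \right|_2.
\end{aligned}
\tag{A.29}
\]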


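Since

\[
\begin{aligned}
\left| t^{\sigma+3/2}e^{-t^2/2} \right|_2 &= \sqrt{\int_0^\infty e^{-t^2}t^{2\sigma+3}\,dt} = \sqrt{\frac{\Gamma(\sigma+2)}{2}}, \\
\left| h(t)/\sqrt{t} \right|_2 &= \sqrt{\frac{31989}{8e} - \frac{585e^3}{8}} \le 1.5023459,
\end{aligned}
\]

we conclude that

\[ |\eta_+(t)t^\sigma|_1 \le 1.062319\cdot\sqrt{\Gamma(\sigma+2)} \tag{A.30} \]

for $\sigma > -2$.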

A.4 Norms involving $\eta_+'$

By one of the standard transformation rules (see (2.10)), the Mellin transform of $\eta_+'$ equals $-(s-1)\cdot M\eta_+(s-1)$. Since the Mellin transform is an isometry in the sense of (2.6),

\[ |\eta_+'|_2^2 = \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} \left| M(\eta_+')(s) \right|^2 ds = \frac{1}{2\pi i}\int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s\cdot M\eta_+(s)|^2\,ds. \]

Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Thus, by (2.9), the function $M\eta_+(-1/2+it)$ equals $1/2\pi$ times the (additive) convolution of $Mh_H(it)$ and $M\eta_\diamondsuit(-1/2+it)$. Therefore, for $s = -1/2+it$,

\[
\begin{aligned}
|s|\,|M\eta_+(s)| &= \frac{|s|}{2\pi}\left| \int_{-H}^{H} Mh(ir)\,M\eta_\diamondsuit(s-ir)\,dr \right| \\
&\le \frac{3}{2\pi}\int_{-H}^{H} |ir-1|\,|Mh(ir)|\cdot|s-ir|\,|M\eta_\diamondsuit(s-ir)|\,dr \\
&= \frac{3}{2\pi}(f*g)(t),
\end{aligned}
\tag{A.31}
\]

where $f(t) = |it-1|\,|Mh(it)|$ and $g(t) = |-1/2+it|\,|M\eta_\diamondsuit(-1/2+it)|$. (Since $|(-1/2+i(t-r)) + (1+ir)| = |1/2+it| = |s|$, either $|-1/2+i(t-r)| \ge |s|/3$ or $|1+ir| \ge 2|s|/3$; hence $|s-ir|\,|ir-1| = |-1/2+i(t-r)|\,|1+ir| \ge |s|/3$.) By Young's inequality (in a special case that follows from Cauchy-Schwarz), $|f*g|_2 \le |f|_1|g|_2$. By (A.18),

\[ |f|_1 = |(r+i)Mh(ir)|_1 \le 27.8622803. \]

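Yet again by Plancherel,

\[
\begin{aligned}
|g|_2^2 &= \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s|^2\,|M\eta_\diamondsuit(s)|^2\,ds \\
&= \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M(\eta_\diamondsuit'))(s)|^2\,ds = 2\pi|\eta_\diamondsuit'|_2^2 = \frac{3\pi^{3/2}}{4}.
\end{aligned}
\]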


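Hence

\[
\begin{aligned}
|\eta_+'|_2 &\le \frac{1}{\sqrt{2\pi}}\cdot\frac{3}{2\pi}\,|f*g|_2 \\
&\le \frac{1}{\sqrt{2\pi}}\cdot\frac{3}{2\pi}\cdot 27.8622803\,\sqrt{\frac{3\pi^{3/2}}{4}} \le 10.845789.
\end{aligned}
\tag{A.32}
\]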

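Let us now bound $|\eta_+'(t)t^\sigma|_1$ for $\sigma\in(-1,\infty)$. First of all,

\[
\begin{aligned}
|\eta_+'(t)t^\sigma|_1 &= \left| \left(h_H(t)te^{-t^2/2}\right)' t^\sigma \right|_1 \\
&\le \left| \left(h_H'(t)te^{-t^2/2} + h_H(t)(1-t^2)e^{-t^2/2}\right)\cdot t^\sigma \right|_1 \\
&\le \left| h_H'(t)t^{\sigma+1}e^{-t^2/2} \right|_1 + |\eta_+(t)t^{\sigma-1}|_1 + |\eta_+(t)t^{\sigma+1}|_1.
\end{aligned}
\]

We can bound the last two terms by (A.30). Much as in (A.29), we note that

\[ \left| h_H'(t)t^{\sigma+1}e^{-t^2/2} \right|_1 \le \left| t^{\sigma+1/2}e^{-t^2/2} \right|_2 \left| h_H'(t)\sqrt{t} \right|_2, \]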

and then see that

\[
\begin{aligned}
|h_H'(t)\sqrt{t}|_2 &= \sqrt{\int_0^\infty |h_H'(t)|^2\,t\,dt} = \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |M(h_H')(1+ir)|^2\,dr} \\
&= \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |(-ir)Mh_H(ir)|^2\,dr} = \sqrt{\frac{1}{2\pi}\int_{-H}^{H} |(-ir)Mh(ir)|^2\,dr} \\
&= \sqrt{\frac{1}{2\pi}\int_{-H}^{H} |M(h')(1+ir)|^2\,dr} \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty} |M(h')(1+ir)|^2\,dr} = |h'(t)\sqrt{t}|_2,
\end{aligned}
\]

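where we use the first rule in (2.10) twice. Since

\[ \left| t^{\sigma+1/2}e^{-t^2/2} \right|_2 = \sqrt{\frac{\Gamma(\sigma+1)}{2}}, \qquad |h'(t)\sqrt{t}|_2 = \sqrt{\frac{103983}{16e} - \frac{1899e^3}{16}} = 2.6312226, \]

we conclude that

\[
\begin{aligned}
|\eta_+'(t)t^\sigma|_1 &\le 1.062319\cdot\left(\sqrt{\Gamma(\sigma+1)} + \sqrt{\Gamma(\sigma+3)}\right) + \sqrt{\frac{\Gamma(\sigma+1)}{2}}\cdot 2.6312226 \\
&\le 2.922875\sqrt{\Gamma(\sigma+1)} + 1.062319\sqrt{\Gamma(\sigma+3)}
\end{aligned}
\tag{A.33}
\]

for $\sigma > -1$.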


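A.5 The $\ell_\infty$-norm of $\eta_+$

Let us now get a bound for $|\eta_+|_\infty$. Recall that $\eta_+(t) = h_H(t)\eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = te^{-t^2/2}$. Clearly

\[
\begin{aligned}
|\eta_+|_\infty = |h_H(t)\eta_\diamondsuit(t)|_\infty &\le |\eta|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t)|_\infty \\
&\le |\eta|_\infty + \left| \frac{h(t) - h_H(t)}{t} \right|_\infty |\eta_\diamondsuit(t)t|_\infty.
\end{aligned}
\tag{A.34}
\]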

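Taking derivatives, we easily see that

\[ |\eta|_\infty = \eta(1) = 1, \qquad |\eta_\diamondsuit(t)t|_\infty = 2/e. \]

It remains to bound $|(h(t) - h_H(t))/t|_\infty$. By (7.6),

\[ h_H(t) = \int_{t/2}^{\infty} h(ty^{-1})\,\frac{\sin(H\log y)}{\pi\log y}\,\frac{dy}{y} = \int_{-H\log\frac{2}{t}}^{\infty} h\left(\frac{t}{e^{w/H}}\right)\frac{\sin w}{\pi w}\,dw. \tag{A.35} \]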

The sine integral

\[ \mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt \]

is defined for all $x$; it tends to $\pi/2$ as $x\to+\infty$ and to $-\pi/2$ as $x\to-\infty$ (see [AS64, (5.2.25)]). We apply integration by parts to the second integral in (A.35), and obtain

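\[
\begin{aligned}
h_H(t) - h(t) &= -\frac{1}{\pi}\int_{-H\log\frac{2}{t}}^{\infty} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\mathrm{Si}(w)\,dw - h(t) \\
&= -\frac{1}{\pi}\int_0^{\infty} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\left(\mathrm{Si}(w) - \frac{\pi}{2}\right)dw \\
&\quad - \frac{1}{\pi}\int_{-H\log\frac{2}{t}}^{0} \left(\frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right)\right)\left(\mathrm{Si}(w) + \frac{\pi}{2}\right)dw.
\end{aligned}
\]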

Now

\[ \left| \frac{d}{dw}h\left(\frac{t}{e^{w/H}}\right) \right| = \frac{te^{-w/H}}{H}\left| h'\left(\frac{t}{e^{w/H}}\right) \right| \le \frac{t|h'|_\infty}{He^{w/H}}. \]

Integration by parts easily yields the bounds $|\mathrm{Si}(x) - \pi/2| < 2/x$ for $x > 0$ and $|\mathrm{Si}(x) + \pi/2| < 2/|x|$ for $x < 0$; we also know that $0 \le \mathrm{Si}(x) \le x < \pi/2$ for $x\in[0,1]$ and $-\pi/2 < x \le \mathrm{Si}(x) \le 0$ for $x\in[-1,0]$. Hence

\[ |h_H(t) - h(t)| \le \frac{2t|h'|_\infty}{\pi H}\left( \int_0^1 \frac{\pi}{2}e^{-w/H}\,dw + \int_1^\infty \frac{2e^{-w/H}}{w}\,dw \right) = t|h'|_\infty\cdot\left( (1 - e^{-1/H}) + \frac{4}{\pi}\frac{E_1(1/H)}{H} \right), \]

where $E_1$ is the exponential integral

\[ E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\,dt. \]


By [AS64, (5.1.20)],

\[ 0 < E_1(1/H) < \frac{\log(H+1)}{e^{1/H}}, \]

and, since $\log(H+1) = \log H + \log(1+1/H) < \log H + 1/H < (\log H)(1+1/H) < (\log H)e^{1/H}$ for $H\ge e$, we see that this gives us that $E_1(1/H) < \log H$ (again for $H\ge e$, as is the case). Hence

\[ \frac{|h_H(t) - h(t)|}{t} < |h'|_\infty\cdot\left( 1 - e^{-\frac{1}{H}} + \frac{4}{\pi}\frac{\log H}{H} \right) < |h'|_\infty\cdot\frac{1 + \frac{4}{\pi}\log H}{H}, \tag{A.36} \]

and so, by (A.34),

\[ |\eta_+|_\infty \le 1 + \frac{2}{e}\left| \frac{h(t) - h_H(t)}{t} \right|_\infty < 1 + \frac{2}{e}\,|h'|_\infty\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \]

By (A.11) and interval arithmetic, we determine that

\[ |h'|_\infty = |h'(\alpha_{2,2})| \le 2.805820379671, \tag{A.37} \]

where $\alpha_{2,2}$ is a root of $h''(x) = 0$ as in (A.11). We have proven

\[ |\eta_+|_\infty < 1 + \frac{2}{e}\cdot 2.80582038\cdot\frac{1 + \frac{4}{\pi}\log H}{H} < 1 + 2.06440727\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.38} \]

We will need three other bounds of this kind, namely, for $\eta_+(t)\log t$, $\eta_+(t)/t$ and $\eta_+(t)t$. We start as in (A.34):

\[
\begin{aligned}
|\eta_+\log t|_\infty &\le |\eta\log t|_\infty + |(h(t) - h_H(t))\eta_\diamondsuit(t)\log t|_\infty \\
&\le |\eta\log t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)t\log t|_\infty, \\
|\eta_+(t)/t|_\infty &\le |\eta(t)/t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)|_\infty, \\
|\eta_+(t)t|_\infty &\le |\eta(t)t|_\infty + |(h - h_H(t))/t|_\infty\,|\eta_\diamondsuit(t)t^2|_\infty.
\end{aligned}
\tag{A.39}
\]

By the bisection method with 30 iterations, implemented with interval arithmetic,

\[ |\eta(t)\log t|_\infty \le 0.279491, \qquad |\eta_\diamondsuit(t)t\log t|_\infty \le 0.3811561. \]

Hence, by (A.36) and (A.37),

\[ |\eta_+\log t|_\infty \le 0.279491 + 1.069456\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.40} \]

By the bisection method with 32 iterations,

\[ |\eta(t)/t|_\infty \le 1.08754396. \]

(We can also obtain this by solving $(\eta(t)/t)' = 0$ symbolically.) It is easy to show that $|\eta_\diamondsuit|_\infty = 1/\sqrt{e}$. Hence, again by (A.36) and (A.37),

\[ |\eta_+(t)/t|_\infty \le 1.08754396 + 1.70181609\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.41} \]


By the bisection method with 32 iterations,

\[ |\eta(t)t|_\infty \le 1.06473476. \]

Taking derivatives, we see that $|\eta_\diamondsuit(t)t^2|_\infty = 3^{3/2}e^{-3/2}$. Hence, yet again by (A.36) and (A.37),

\[ |\eta_+(t)t|_\infty \le 1.06473476 + 3.25312\cdot\frac{1 + \frac{4}{\pi}\log H}{H}. \tag{A.42} \]
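The numerical constants in (A.38) and (A.40) to (A.42) are products of $|h'|_\infty$ from (A.37) with suprema of explicit functions, so they can be cross-checked quickly. The sketch below is a non-rigorous illustration: it replaces the interval-arithmetic bisection of the text with a dense grid search in floating point, and it assumes $\eta(t) = h(t)\eta_\diamondsuit(t)$ with $h(t) = t^2(2-t)^3e^{t-1/2}$ on $[0,2]$. All function names are hypothetical.

```python
import math

H_PRIME_SUP = 2.805820379671  # |h'|_inf as stated in (A.37)

def h(t):
    return t * t * (2 - t) ** 3 * math.exp(t - 0.5) if 0 <= t <= 2 else 0.0

def eta_diamond(t):
    return t * math.exp(-t * t / 2)

def eta(t):
    # Assumption: eta(t) = h(t) * eta_diamond(t), supported on [0, 2].
    return h(t) * eta_diamond(t)

def sup_on_grid(f, lo, hi, n=200000):
    """Largest |f| over the interior grid points of [lo, hi]."""
    return max(abs(f(lo + (hi - lo) * i / n)) for i in range(1, n))

sup_eta_log = sup_on_grid(lambda t: eta(t) * math.log(t), 0.0, 2.0)
sup_eta_over_t = sup_on_grid(lambda t: eta(t) / t, 0.0, 2.0)
sup_eta_times_t = sup_on_grid(lambda t: eta(t) * t, 0.0, 2.0)
sup_diamond_tlog = sup_on_grid(lambda t: eta_diamond(t) * t * math.log(t), 0.0, 10.0)

coeff_A38 = 2 / math.e * H_PRIME_SUP                    # |eta_diamond(t) t|_inf = 2/e
coeff_A40 = sup_diamond_tlog * H_PRIME_SUP
coeff_A41 = H_PRIME_SUP / math.sqrt(math.e)             # |eta_diamond|_inf = 1/sqrt(e)
coeff_A42 = 3 ** 1.5 * math.exp(-1.5) * H_PRIME_SUP     # |eta_diamond(t) t^2|_inf

def eta_plus_sup_bound(H):
    """Right-hand side of (A.38) for a given H >= e."""
    return 1 + coeff_A38 * (1 + 4 / math.pi * math.log(H)) / H

print(sup_eta_log, sup_eta_over_t, sup_eta_times_t)
print(coeff_A38, coeff_A40, coeff_A41, coeff_A42)
```

A grid search only gives a lower estimate of each supremum; the agreement with the stated bounds to several digits is a consistency check, not a proof.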

Appendix B

Norms of Fourier transforms

B.1 The Fourier transform of $\eta_2''$

Our aim here is to give upper bounds on $|\widehat{\eta_2''}|_\infty$, where $\eta_2$ is as in (3.4). We will do considerably better than the trivial bound $|\widehat{\eta''}|_\infty \le |\eta''|_1$.

Lemma B.1.1. For every $t\in\mathbb{R}$,

\[ |4e(-t/4) - 4e(-t/2) + e(-t)| \le 7.87052. \tag{B.1} \]

We will describe an extremely simple, but rigorous, procedure to find the maximum. Since $|g(t)|^2$ is $C^2$ (in fact smooth), there are several more efficient and equally rigorous algorithms; for starters, the bisection method with error bounded in terms of $|(|g|^2)''|_\infty$.

Proof. Let

\[ g(t) = 4e(-t/4) - 4e(-t/2) + e(-t). \tag{B.2} \]

For $a\le t\le b$,

\[ g(t) = g(a) + \frac{t-a}{b-a}(g(b) - g(a)) + \frac{1}{8}(b-a)^2\cdot O^*\left(\max_{v\in[a,b]} |g''(v)|\right). \tag{B.3} \]

(This formula, in all likelihood well-known, is easy to derive. First, we can assume without loss of generality that $a = 0$, $b = 1$ and $g(a) = g(b) = 0$. Dividing $g$ by $g(t)$, we see that we can also assume that $g(t)$ is real (and in fact $1$). We can also assume that $g$ is real-valued, in that it will be enough to prove (B.3) for the real-valued function $\Re g$, as this will give us the bound $g(t) = \Re g(t) \le (1/8)\max_v |(\Re g)''(v)| \le (1/8)\max_v |g''(v)|$ that we wish for. Lastly, we can assume (by symmetry) that $0\le t\le 1/2$, and that $g$ has a local maximum or minimum at $t$. Writing $M = \max_{u\in[0,1]} |g''(u)|$, we then have

\[
\begin{aligned}
g(t) &= \int_0^t g'(v)\,dv = \int_0^t\int_t^v g''(u)\,du\,dv = O^*\left( \int_0^t \left| \int_t^v M\,du \right| dv \right) \\
&= O^*\left( \int_0^t (t-v)M\,dv \right) = O^*\left( \frac{1}{2}t^2M \right) = O^*\left( \frac{1}{8}M \right),
\end{aligned}
\]

as desired.)

We obtain immediately from (B.3) that

\[ \max_{t\in[a,b]} |g(t)| \le \max(|g(a)|, |g(b)|) + \frac{1}{8}(b-a)^2\cdot\max_{v\in[a,b]} |g''(v)|. \tag{B.4} \]

For any $v\in\mathbb{R}$,

\[ |g''(v)| \le \left(\frac{\pi}{2}\right)^2\cdot 4 + \pi^2\cdot 4 + (2\pi)^2 = 9\pi^2. \tag{B.5} \]

Clearly, $g(t)$ depends only on $t \bmod 4\pi$. Hence, by (B.4) and (B.5), to estimate

\[ \max_{t\in\mathbb{R}} |g(t)| \]

with an error of at most $\epsilon$, it is enough to subdivide $[0, 4\pi]$ into intervals of length $\le\sqrt{8\epsilon/9\pi^2}$ each. We set $\epsilon = 10^{-6}$ and compute.
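The sampling procedure just described is short enough to sketch. The code below assumes the convention $e(x) = e^{2\pi ix}$ (consistent with the constants in (B.5)); under that convention $g$ has period $4$, so sampling one full period $[0,4]$ suffices, and the interval in the text contains it. This is plain floating point, not the interval arithmetic a rigorous proof would use.

```python
import cmath
import math

def e(x):
    """Assumed convention: e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def g(t):
    return 4 * e(-t / 4) - 4 * e(-t / 2) + e(-t)

eps = 1e-6
second_derivative_bound = 9 * math.pi ** 2            # (B.5)
max_step = math.sqrt(8 * eps / second_derivative_bound)

period = 4.0
n = math.ceil(period / max_step)                      # number of subintervals
sample_max = max(abs(g(period * i / n)) for i in range(n + 1))

# By (B.4), the true maximum exceeds the sampled one by at most eps.
upper_bound = sample_max + eps
print(n, sample_max, upper_bound)
```

With $\epsilon = 10^{-6}$ this needs on the order of $10^4$ evaluations of $g$, which is why the text calls the procedure extremely simple.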

Lemma B.1.2. Let $\eta_2:\mathbb{R}^+\to\mathbb{R}$ be as in (3.4). Then

\[ |\widehat{\eta_2''}|_\infty \le 31.521. \tag{B.6} \]

This should be compared with $|\eta_2''|_1 = 48$.

Proof. We can write

\[ \eta_2''(x) = 4(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + f(x), \tag{B.7} \]

where $\delta_{x_0}$ is the point measure at $x_0$ of mass $1$ (Dirac delta function) and

\[
f(x) = \begin{cases}
0 & \text{if } x < 1/4 \text{ or } x \ge 1, \\
-4x^{-2} & \text{if } 1/4 \le x < 1/2, \\
4x^{-2} & \text{if } 1/2 \le x < 1.
\end{cases}
\]

Thus $\widehat{\eta_2''}(t) = 4g(t) + \widehat{f}(t)$, where $g$ is as in (B.2). It is easy to see that $|f'|_1 = 2\max_x f(x) - 2\min_x f(x) = 160$. Therefore,

\[ \left| \widehat{f}(t) \right| = \left| \widehat{f'}(t)/(2\pi it) \right| \le \frac{|f'|_1}{2\pi|t|} = \frac{80}{\pi|t|}. \tag{B.8} \]

Since $31.521 - 4\cdot 7.87052 = 0.03892$, we conclude that (B.6) follows from Lemma B.1.1 and (B.8) for $|t| \ge 655 > 80/(\pi\cdot 0.03892)$.

It remains to check the range $t\in(-655, 655)$; since $4g(-t) + \widehat{f}(-t)$ is the complex conjugate of $4g(t) + \widehat{f}(t)$, it suffices to consider $t$ non-negative. We use (B.4) (with $4g + \widehat{f}$ instead of $g$) and obtain that, to estimate $\max_{t\in\mathbb{R}} |4g(t) + \widehat{f}(t)|$ with an error of at most $\epsilon$, it is enough to subdivide $[0, 655)$ into intervals of length $\le\sqrt{2\epsilon/|(4g+\widehat{f})''|_\infty}$ each, and check $|4g(t) + \widehat{f}(t)|$ at the endpoints. Now, for every $t\in\mathbb{R}$,

\[ \left| (\widehat{f})''(t) \right| = \left| (-2\pi i)^2\,\widehat{x^2f}(t) \right| = (2\pi)^2\cdot O^*\left( |x^2f|_1 \right) = 12\pi^2. \]

By this and (B.5), $|(4g + \widehat{f})''|_\infty \le 48\pi^2$. Thus, intervals of length $\delta_1$ give an error term of size at most $24\pi^2\delta_1^2$. We choose $\delta_1 = 0.001$ and obtain an error term less than $0.000237$ for this stage.

To evaluate $\widehat{f}(t)$ (and hence $4g(t) + \widehat{f}(t)$) at a point, we integrate using Simpson's rule on subdivisions of the intervals $[1/4, 1/2]$, $[1/2, 1]$ into $200\cdot\max(1, \lfloor\sqrt{|t|}\rfloor)$ sub-intervals each.¹ The largest value of $|4g(t) + \widehat{f}(t)|$ we find is $31.52065$, with an error term of at most $4.5\cdot 10^{-5}$.

B.2 Bounds involving a logarithmic factor

Our aim now is to give upper bounds on $|\widehat{\eta_{(y)}''}|_\infty$, where $\eta_{(y)}(t) = \log(yt)\eta_2(t)$ and $y\ge 4$.

Lemma B.2.1. Let $\eta_2:\mathbb{R}^+\to\mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = \log(yt)\eta_2(t)$, where $y\ge 4$. Then

\[ |\eta_{(y)}'|_1 < (\log y)|\eta_2'|_1. \tag{B.9} \]

Proof. Recall that $\mathrm{supp}(\eta_2) = (1/4, 1)$. For $t\in(1/4, 1/2)$,

\[ \eta_{(y)}'(t) = (4\log(yt)\log 4t)' = \frac{4\log 4t}{t} + \frac{4\log yt}{t} \ge \frac{8\log 4t}{t} > 0, \]

whereas, for $t\in(1/2, 1)$,

\[ \eta_{(y)}'(t) = (-4\log(yt)\log t)' = -\frac{4\log yt}{t} - \frac{4\log t}{t} = -\frac{4\log yt^2}{t} < 0, \]

where we are using the fact that $y\ge 4$. Hence $\eta_{(y)}(t)$ is increasing on $(1/4, 1/2)$ and decreasing on $(1/2, 1)$; it is also continuous at $t = 1/2$. Hence $|\eta_{(y)}'|_1 = 2|\eta_{(y)}(1/2)|$. We are done by

\[ 2|\eta_{(y)}(1/2)| = 2\log\frac{y}{2}\cdot\eta_2(1/2) = \log\frac{y}{2}\cdot 8\log 2 < \log y\cdot 8\log 2 = (\log y)|\eta_2'|_1. \]

Lemma B.2.2. Let $y\ge 4$. Let $g(t) = 4e(-t/4) - 4e(-t/2) + e(-t)$ and $k(t) = 2e(-t/4) - e(-t/2)$. Then, for every $t\in\mathbb{R}$,

\[ |g(t)\cdot\log y - k(t)\cdot 4\log 2| \le 7.87052\log y. \tag{B.10} \]

Proof. By Lemma B.1.1, $|g(t)| \le 7.87052$. Since $y\ge 4$, $|k(t)|\cdot(4\log 2)/\log y \le 6$. For any complex numbers $z_1$, $z_2$ with $|z_1|, |z_2| \le \ell$, we can have $|z_1 - z_2| > \ell$ only if $|\arg(z_1/z_2)| > \pi/3$. It is easy to check that, for all $t\in[-2, 2]$,

\[ \left| \arg\left( \frac{g(t)\cdot\log y}{4\log 2\cdot k(t)} \right) \right| = \left| \arg\left( \frac{g(t)}{k(t)} \right) \right| < 0.7 < \frac{\pi}{3}. \]

(It is possible to bound maxima rigorously as in (B.4).) Hence (B.10) holds.

¹ As usual, the code uses interval arithmetic (§2.6).


Lemma B.2.3. Let $\eta_2:\mathbb{R}^+\to\mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = (\log yt)\eta_2(t)$, where $y\ge 4$. Then

\[ |\widehat{\eta_{(y)}''}|_\infty < 31.521\cdot\log y. \tag{B.11} \]

Proof. Clearly

\[
\begin{aligned}
\eta_{(y)}''(x) &= \eta_2''(x)(\log y) + \left( (\log x)\eta_2''(x) + \frac{2}{x}\eta_2'(x) - \frac{1}{x^2}\eta_2(x) \right) \\
&= \eta_2''(x)(\log y) + 4(\log x)(4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x)) + h(x),
\end{aligned}
\]

where

\[
h(x) = \begin{cases}
0 & \text{if } x < 1/4 \text{ or } x > 1, \\
\frac{4}{x^2}(2 - 2\log 2x) & \text{if } 1/4 \le x < 1/2, \\
\frac{4}{x^2}(-2 + 2\log x) & \text{if } 1/2 \le x < 1.
\end{cases}
\]

(Here we are using the expression (B.7) for $\eta_2''(x)$.) Hence

\[ \widehat{\eta_{(y)}''}(t) = (4g(t) + \widehat{f}(t))(\log y) + (-16\log 2\cdot k(t) + \widehat{h}(t)), \tag{B.12} \]

where $k(t) = 2e(-t/4) - e(-t/2)$. Just as in the proof of Lemma B.1.2,

\[ |\widehat{f}(t)| \le \frac{|f'|_1}{2\pi|t|} \le \frac{80}{\pi|t|}, \qquad |\widehat{h}(t)| \le \frac{160(1+\log 2)}{\pi|t|}. \tag{B.13} \]

Again as before, this implies that (B.11) holds for

\[ |t| \ge \frac{1}{\pi\cdot 0.03892}\left( 80 + \frac{160(1+\log 2)}{(\log 4)} \right) = 2252.51. \]

Note also that it is enough to check (B.11) for $t\ge 0$, by symmetry. Our remaining task is to prove (B.11) for $0\le t\le 2252.21$.

Let $I = [0.3, 2252.21]\setminus[3.25, 3.65]$. For $t\in I$, we will have

\[ \arg\left( \frac{4g(t) + \widehat{f}(t)}{-16\log 2\cdot k(t) + \widehat{h}(t)} \right) \subset \left( -\frac{\pi}{3}, \frac{\pi}{3} \right). \tag{B.14} \]

(This is actually true for $0\le t\le 0.3$ as well, but we will use a different strategy in that range in order to better control error terms.) Consequently, by Lemma B.1.2 and $\log y\ge\log 4$,

\[
\begin{aligned}
|\widehat{\eta_{(y)}''}(t)| &< \max(|4g(t) + \widehat{f}(t)|\cdot(\log y),\ |16\log 2\cdot k(t) - \widehat{h}(t)|) \\
&< \max(31.521(\log y),\ |48\log 2 + 25|) = 31.521\log y,
\end{aligned}
\]

where we bound $\widehat{h}(t)$ by (B.13) and by a numerical computation of the maximum of $|\widehat{h}(t)|$ for $0\le t\le 4$ as in the proof of Lemma B.1.2.

It remains to check (B.14). Here, as in the proof of Lemma B.2.2, the allowable error is relatively large (the expression on the left of (B.14) is actually contained in $(-1, 1)$ for $t\in I$). We decide to evaluate the argument in (B.14) at all $t\in 0.005\mathbb{Z}\cap I$, computing $\widehat{f}(t)$ and $\widehat{h}(t)$ by numerical integration (Simpson's rule) with a subdivision of $[-1/4, 1]$ into $5000$ intervals. Proceeding as in the proof of Lemma B.1.1, we see that the sampling induces an error of at most

\[ \frac{1}{2}\,0.005^2\max_{v\in I}\left( 4|g''(v)| + |(\widehat{f})''(t)| \right) \le \frac{0.0001}{8}\,48\pi^2 < 0.00593 \tag{B.15} \]

in the evaluation of $4g(t) + \widehat{f}(t)$, and an error of at most

\[
\begin{aligned}
\frac{1}{2}\,0.005^2\max_{v\in I}\left( 16\log 2\cdot|k''(v)| + |(\widehat{h})''(t)| \right)
&\le \frac{0.0001}{8}\left( 16\log 2\cdot 6\pi^2 + 24\pi^2\cdot(2 - \log 2) \right) < 0.0121
\end{aligned}
\tag{B.16}
\]

in the evaluation of $-16\log 2\cdot k(t) + \widehat{h}(t)$.

Running the numerical evaluation just described for $t\in I$, the estimates for the left side of (B.14) at the sample points are at most $0.99134$ in absolute value; the absolute values of the estimates for $4g(t) + \widehat{f}(t)$ are all at least $2.7783$, and the absolute values of the estimates for $-16\log 2\cdot k(t) + \widehat{h}(t)$ are all at least $2.1166$. Numerical integration by Simpson's rule gives errors bounded by $0.17575$ percent. Hence the absolute value of the left side of (B.14) is at most

\[ 0.99134 + \arcsin\left( \frac{0.00593}{2.7783} + 0.0017575 \right) + \arcsin\left( \frac{0.0121}{2.1166} + 0.0017575 \right) \le 1.00271 < \frac{\pi}{3} \]

for $t\in I$.

Lastly, for $t\in[0, 0.3]\cup[3.25, 3.65]$, a numerical computation (samples at $0.001\mathbb{Z}$; interpolation as in Lemma B.1.2; integrals computed by Simpson's rule with a subdivision into $1000$ intervals) gives

\[ \max_{t\in[0,0.3]\cup[3.25,3.65]}\left( |4g(t) + \widehat{f}(t)| + \frac{|-16\log 2\cdot k(t) + \widehat{h}(t)|}{\log 4} \right) < 29.08, \]

and so $\max_{t\in[0,0.3]\cup[3.25,3.65]} |\widehat{\eta_{(y)}''}|_\infty < 29.1\log y < 31.521\log y$.

An easy integral gives us that the function $\log\cdot\eta_2$ satisfies

\[ |\log\cdot\eta_2|_1 = 2 - \log 4. \tag{B.17} \]

The following function will appear only in a lower-order term; thus, an $\ell_1$ estimate will do.

Lemma B.2.4. Let $\eta_2:\mathbb{R}^+\to\mathbb{R}$ be as in (3.4). Then

\[ |(\log\cdot\eta_2)''|_1 = 96\log 2. \tag{B.18} \]


Proof. The function $\log\cdot\eta(t)$ is $0$ for $t\notin[1/4, 1]$, is increasing and negative for $t\in(1/4, 1/2)$ and is decreasing and positive for $t\in(1/2, 1)$. Hence

\[ |(\log\cdot\eta_2)''|_1 = 2\left( (\log\cdot\eta_2)'\left(\frac{1}{2}\right) - (\log\cdot\eta_2)'\left(\frac{1}{4}\right) \right) = 2(16\log 2 - (-32\log 2)) = 96\log 2. \]

Appendix C

Sums involving $\Lambda$ and $\phi$

C.1 Sums over primes

Here we treat some sums of the type $\sum_n \Lambda(n)\varphi(n)$, where $\varphi$ has compact support. Since the sums are over all integers (not just an arithmetic progression) and there is no phase $e(\alpha n)$ involved, the treatment is relatively straightforward.

The following is standard.

Lemma C.1.1 (Explicit formula). Let $\varphi:[1,\infty)\to\mathbb{C}$ be continuous and piecewise $C^1$ with $\varphi''\in\ell_1$; let it also be of compact support contained in $[1,\infty)$. Then

\[ \sum_n \Lambda(n)\varphi(n) = \int_1^\infty \left( 1 - \frac{1}{x(x^2-1)} \right)\varphi(x)\,dx - \sum_\rho (M\varphi)(\rho), \tag{C.1} \]

where $\rho$ runs over the non-trivial zeros of $\zeta(s)$.

The non-trivial zeros of $\zeta(s)$ are, of course, those in the critical strip $0 < \Re(s) < 1$.

Remark. Lemma C.1.1 appears as exercise 5 in [IK04, §5.5]; the condition there that $\varphi$ be smooth can be relaxed, since already the weaker assumption that $\varphi''$ be in $L^1$ implies that the Mellin transform $(M\varphi)(\sigma+it)$ decays quadratically on $t$ as $t\to\infty$, thereby guaranteeing that the sum $\sum_\rho (M\varphi)(\rho)$ converges absolutely.

Lemma C.1.2. Let $x\ge 10$. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)|\le T_0$ lie on the critical line. Then

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = x + O^*\left( 0.135x^{1/2} + \frac{9.7}{x^2} \right) + \frac{\log\frac{eT_0}{2\pi}}{T_0}\left( \frac{9/4}{2\pi} + \frac{6.03}{T_0} \right)x. \tag{C.2} \]

In particular, with $T_0 = 3.061\cdot 10^{10}$ in the assumption, we have, for $x\ge 2000$,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}), \]

where $\epsilon = 2.73\cdot 10^{-10}$.

The assumption that all non-trivial zeros up to $T_0 = 3.061\cdot 10^{10}$ lie on the critical line was proven rigorously in [Plaa]; higher values of $T_0$ have been reached elsewhere ([Wed03], [GD04]).

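Proof. By Lemma C.1.1,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = \int_1^\infty \eta_2\left(\frac{t}{x}\right)dt - \int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt - \sum_\rho (M\varphi)(\rho), \]

where $\varphi(u) = \eta_2(u/x)$ and $\rho$ runs over all non-trivial zeros of $\zeta(s)$. Since $\eta_2$ is non-negative, $\int_1^\infty \eta_2(t/x)\,dt = x|\eta_2|_1 = x$, while

\[ \int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt = O^*\left( \int_{1/4}^{1} \frac{\eta_2(t)}{tx^2(t^2 - 1/100)}\,dt \right) = O^*\left( \frac{9.61114}{x^2} \right). \]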

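By (2.11),

\[ \sum_\rho (M\varphi)(\rho) = \sum_\rho M\eta_2(\rho)\cdot x^\rho = \sum_\rho \left( \frac{1 - 2^{-\rho}}{\rho} \right)^2 x^\rho = S_1(x) - 2S_1(x/2) + S_1(x/4), \]

where

\[ S_m(x) = \sum_\rho \frac{x^\rho}{\rho^{m+1}}. \tag{C.3} \]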

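Setting aside the contribution of all $\rho$ with $|\Im(\rho)|\le T_0$ and all $\rho$ with $|\Im(\rho)| > T_0$ and $\Re(s)\le 1/2$, and using the symmetry provided by the functional equation, we obtain

\[
\begin{aligned}
|S_m(x)| &\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + x\cdot\sum_{\substack{\rho \\ |\Im(\rho)| > T_0 \\ |\Re(\rho)| > 1/2}} \frac{1}{|\rho|^{m+1}} \\
&\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + \frac{x}{2}\cdot\sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} \frac{1}{|\rho|^{m+1}}.
\end{aligned}
\]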

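We bound the first sum by [Ros41, Lemma 17] and the second sum by [RS03, Lemma 2]. We obtain

\[ |S_m(x)| \le \left( \frac{1}{2m\pi T_0^m} + \frac{2.68}{T_0^{m+1}} \right)x\log\frac{eT_0}{2\pi} + \kappa_m x^{1/2}, \tag{C.4} \]

where $\kappa_1 = 0.0463$, $\kappa_2 = 0.00167$ and $\kappa_3 = 0.0000744$.

Hence

\[ \left| \sum_\rho (M\eta_2)(\rho)\cdot x^\rho \right| \le \left( \frac{1}{2\pi T_0} + \frac{2.68}{T_0^2} \right)\frac{9x}{4}\log\frac{eT_0}{2\pi} + \left( \frac{3}{2} + \sqrt{2} \right)\kappa_1 x^{1/2}. \]

For $T_0 = 3.061\cdot 10^{10}$ and $x\ge 2000$, we obtain

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\epsilon))x + O^*(0.135x^{1/2}), \]

where $\epsilon = 2.73\cdot 10^{-10}$.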
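The last step of the proof is pure arithmetic, and it may help to see the numbers. The short sketch below evaluates the coefficient of $x$ and the coefficient of $x^{1/2}$ from the displayed bound, using $T_0$, $\kappa_1$ and the constants $2.68$ and $9.7$ exactly as they appear above; the variable names are hypothetical.

```python
import math

T0 = 3.061e10
kappa1 = 0.0463

# Coefficient of x: (1/(2 pi T0) + 2.68/T0^2) * (9/4) * log(e T0 / (2 pi)).
log_term = math.log(math.e * T0 / (2 * math.pi))
eps = (1 / (2 * math.pi * T0) + 2.68 / T0 ** 2) * (9 / 4) * log_term

# Coefficient of x^{1/2}: (3/2 + sqrt(2)) * kappa_1.
sqrt_coeff = (1.5 + math.sqrt(2)) * kappa1

# For x >= 2000 the term 9.7/x^2 is absorbed by the slack 0.135 - sqrt_coeff:
# the slack term grows with x while 9.7/x^2 decreases, so x = 2000 is the worst case.
slack_at_2000 = (0.135 - sqrt_coeff) * math.sqrt(2000) - 9.7 / 2000 ** 2

print(eps, sqrt_coeff, slack_at_2000)
```

The factors $9/4 = 1 + 2\cdot\tfrac12 + \tfrac14$ and $3/2 + \sqrt{2} = 1 + 2\cdot 2^{-1/2} + \tfrac12$ come from applying (C.4) to $S_1(x) - 2S_1(x/2) + S_1(x/4)$.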

Corollary C.1.3. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)|\le T_0$, $T_0 = 3.061\cdot 10^{10}$, lie on the critical line. Then, for all $x\ge 1$,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) \le \min\left( (1+\epsilon)x + 0.2x^{1/2},\ 1.04488x \right), \tag{C.5} \]

where $\epsilon = 2.73\cdot 10^{-10}$.

Proof. Immediate from Lemma C.1.2 for $x\ge 2000$. For $x < 2000$, we use computation as follows. Since $|\eta_2'|_\infty = 16$ and $\sum_{x/4\le n\le x} \Lambda(n) \le x$ for all $x\ge 0$, computing $\sum_{n\le x} \Lambda(n)\eta_2(n/x)$ only for $x\in(1/1000)\mathbb{Z}\cap[0, 2000]$ results in an inaccuracy of at most $(16\cdot 0.0005/0.9995)x \le 0.00801x$. This resolves the matter at all points outside $(205, 207)$ (for the first estimate) or outside $(9.5, 10.5)$ and $(13.5, 14.5)$ (for the second estimate). In those intervals, the prime powers $n$ involved do not change (since whether $x/4 < n\le x$ depends only on $n$ and $[x]$), and thus we can find the maximum of the sum in (C.5) just by taking derivatives.
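The kind of computation used for $x < 2000$ can be sketched as follows. The code assumes the explicit form $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, which matches the piecewise expressions used in Appendix B but is not restated in this section, and it evaluates the smoothed sum in plain floating point rather than with rigorous error control. The function names are hypothetical.

```python
import math

def eta2(t):
    """Assumed form: eta_2(t) = 4 * max(log 2 - |log 2t|, 0), supported on [1/4, 1]."""
    if t <= 0.25 or t >= 1.0:
        return 0.0
    return 4.0 * (math.log(2.0) - abs(math.log(2.0 * t)))

def von_mangoldt_table(n):
    """lam[m] = Lambda(m) for 0 <= m <= n, via a sieve over primes and their powers."""
    lam = [0.0] * (n + 1)
    composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if not composite[p]:
            for m in range(p * p, n + 1, p):
                composite[m] = 1
            pk = p
            while pk <= n:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def smoothed_sum(x, lam):
    """sum_n Lambda(n) eta_2(n/x); lam must cover all n <= x."""
    lo, hi = int(x // 4), int(x)
    return sum(lam[n] * eta2(n / x) for n in range(lo, hi + 1) if lam[n])

lam = von_mangoldt_table(3000)
for x in (14, 206, 2000, 3000):
    s = smoothed_sum(x, lam)
    print(x, s, s / x, (s - x) / math.sqrt(x))
```

At $x = 14$ the ratio printed in the third column is about $1.0449$, which illustrates why the interval $(13.5, 14.5)$ needs the separate treatment described in the proof.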

      C2 Sums involving φWe need estimates for several sums involving φ(q) in the denominator

      The easiest are convergent sums such assumq micro

      2(q)(φ(q)q) We can express thisasprodp(1 + 1(p(pminus 1))) This is a convergent product and the main task is to bound

      a tail for r an integer

      logprodpgtr

      (1 +

      1

      p(pminus 1)

      )lesumpgtr

      1

      p(pminus 1)lesumngtr

      1

      n(nminus 1)=

      1

      r (C6)

      A quick computation1 now suffices to give

      2591461 lesumq

      gcd(q 2)micro2(q)

      φ(q)qlt 2591463 (C7)

      and so

      1295730 lesumq odd

      micro2(q)

      φ(q)qlt 1295732 (C8)

      since the expression bounded in (C8) is exactly half of that bounded in (C7)

$^1$ Using D. Platt's integer arithmetic package.


Again using (C.6), we get that
\[2.826419 \le \sum_q \frac{\mu^2(q)}{\phi(q)^2} < 2.826421. \tag{C.9}\]

In what follows, we will use values for convergent sums obtained in much the same way: an easy tail bound followed by a computation.

By [Ram95, Lemma 3.4],
\[\begin{aligned}
\sum_{q \le r} \frac{\mu^2(q)}{\phi(q)} &= \log r + c_E + O^*(7.284 r^{-1/3}),\\
\sum_{\substack{q \le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} &= \frac{1}{2}\left(\log r + c_E + \frac{\log 2}{2}\right) + O^*(4.899 r^{-1/3}),
\end{aligned} \tag{C.10}\]
where
\[c_E = \gamma + \sum_p \frac{\log p}{p(p-1)} = 1.332582275 + O^*(10^{-9}/3)\]
by [RS62, (2.11)]. As we already said in (1215), this, supplemented by a computation for $r \le 4 \cdot 10^7$, gives
\[\log r + 1.312 \le \sum_{q \le r} \frac{\mu^2(q)}{\phi(q)} \le \log r + 1.354\]
for $r \ge 182$. In the same way, we get that
\[\frac{1}{2} \log r + 0.83 \le \sum_{\substack{q \le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \le \frac{1}{2} \log r + 0.85 \tag{C.11}\]
for $r \ge 195$. (The numerical verification here goes up to $1.38 \cdot 10^8$; for $r > 3.18 \cdot 10^8$, use (C.11).)
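A small-scale version of the computation behind these two inequalities might look as follows. It is a sketch under assumptions made here: it checks only integer values of $r$ up to 20000, in plain floating point, whereas the text describes a verification over a far larger range; the helper names are hypothetical.

```python
from math import log

def mu2_over_phi(limit):
    """Return a list t with t[q] = mu(q)^2 / phi(q) for 1 <= q <= limit (t[0] is unused)."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has modified phi[p]
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
            for k in range(p * p, limit + 1, p * p):
                squarefree[k] = False
    return [0.0] + [1.0 / phi[q] if squarefree[q] else 0.0 for q in range(1, limit + 1)]

def bounds_hold(limit, start, lower_c, upper_c, odd_only=False):
    """Check scale*log r + lower_c <= partial sum <= scale*log r + upper_c for start <= r <= limit."""
    terms = mu2_over_phi(limit)
    scale = 0.5 if odd_only else 1.0
    total = 0.0
    for r in range(1, limit + 1):
        if not odd_only or r % 2 == 1:
            total += terms[r]
        if r >= start:
            main = scale * log(r)
            if not (main + lower_c <= total <= main + upper_c):
                return False
    return True

all_q_ok = bounds_hold(20000, 182, 1.312, 1.354)
odd_q_ok = bounds_hold(20000, 195, 0.83, 0.85, odd_only=True)
print(all_q_ok, odd_q_ok)
```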

Clearly,
\[\sum_{\substack{q \le 2r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} = \sum_{\substack{q \le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}. \tag{C.12}\]

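We wish to obtain bounds for the sums
\[\sum_{q \ge r} \frac{\mu^2(q)}{\phi(q)^2}, \qquad \sum_{\substack{q \ge r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2}, \qquad \sum_{\substack{q \ge r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2},\]
where $N \in \mathbb{Z}^+$ and $r \ge 1$. To do this, it will be helpful to express some of the quantities within these sums as convolutions.$^2$ For $q$ squarefree and $j \ge 1$,
\[\frac{\mu^2(q) q^{j-1}}{\phi(q)^j} = \sum_{ab = q} \frac{f_j(b)}{a}, \tag{C.13}\]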

$^2$ The author would like to thank O. Ramaré for teaching him this technique.


where $f_j$ is the multiplicative function defined by
\[f_j(p) = \frac{p^j - (p-1)^j}{(p-1)^j p}, \qquad f_j(p^k) = 0 \text{ for } k \ge 2.\]
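The identity (C.13) is easy to test in exact rational arithmetic. The sketch below does so for a few squarefree $q$ and $j = 1, 2, 3$; the function names and the sample values of $q$ are choices made here for illustration.

```python
from fractions import Fraction

def prime_factors(n):
    """Prime factorization of n as a list of primes with multiplicity."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def f(j, b):
    """Multiplicative f_j with f_j(p) = (p^j - (p-1)^j) / ((p-1)^j p), zero on non-squarefree b."""
    ps = prime_factors(b)
    if len(set(ps)) != len(ps):
        return Fraction(0)
    value = Fraction(1)
    for p in ps:
        value *= Fraction(p**j - (p - 1)**j, (p - 1)**j * p)
    return value

def identity_holds(q, j):
    """Compare both sides of (C.13) exactly, for squarefree q."""
    ps = prime_factors(q)
    assert len(set(ps)) == len(ps), "q must be squarefree"
    phi = 1
    for p in ps:
        phi *= p - 1
    lhs = Fraction(q**(j - 1), phi**j)
    rhs = sum(f(j, b) / (q // b) for b in range(1, q + 1) if q % b == 0)
    return lhs == rhs

checks = [identity_holds(q, j) for q in (1, 2, 3, 5, 6, 10, 15, 30, 210) for j in (1, 2, 3)]
print(all(checks))
```

Since both sides are multiplicative in $q$, the identity reduces to the case $q = p$, where it reads $p^{j-1}/(p-1)^j = 1/p + f_j(p)$.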

We will also find the following estimate useful.

Lemma C.2.1. Let $j \ge 2$ be an integer and $A$ a positive real. Let $m \ge 1$ be an integer. Then
\[\sum_{\substack{a \ge A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} \le \frac{\zeta(j)/\zeta(2j)}{A^{j-1}} \cdot \prod_{p | m} \left(1 + \frac{1}{p^j}\right)^{-1}. \tag{C.14}\]

It is useful to note that $\zeta(2)/\zeta(4) = 15/\pi^2 = 1.519817\dots$ and $\zeta(3)/\zeta(6) = 1.181564\dots$.

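Proof. The right side of (C.14) decreases as $A$ increases, while the left side depends only on $\lceil A \rceil$. Hence, it is enough to prove (C.14) when $A$ is an integer.

For $A = 1$, (C.14) is an equality. Let
\[C = \frac{\zeta(j)}{\zeta(2j)} \cdot \prod_{p | m} \left(1 + \frac{1}{p^j}\right)^{-1}.\]
Let $A \ge 2$. Since
\[\sum_{\substack{a \ge A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} = C - \sum_{\substack{a < A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}\]
and
\[\begin{aligned}
C = \sum_{\substack{a\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} &< \sum_{\substack{a < A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \int_A^\infty \frac{1}{t^j}\, dt\\
&= \sum_{\substack{a < A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \frac{1}{(j-1) A^{j-1}},
\end{aligned}\]
we obtain
\[\begin{aligned}
\sum_{\substack{a \ge A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j} &= \frac{1}{A^{j-1}} \cdot C + \frac{A^{j-1} - 1}{A^{j-1}} \cdot C - \sum_{\substack{a < A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}\\
&< \frac{C}{A^{j-1}} + \frac{A^{j-1} - 1}{A^{j-1}} \cdot \left(\frac{1}{A^j} + \frac{1}{(j-1) A^{j-1}}\right) - \frac{1}{A^{j-1}} \sum_{\substack{a < A\\ (a,m)=1}} \frac{\mu^2(a)}{a^j}\\
&\le \frac{C}{A^{j-1}} + \frac{1}{A^{j-1}} \left(\left(1 - \frac{1}{A^{j-1}}\right)\left(\frac{1}{A} + \frac{1}{j-1}\right) - 1\right).
\end{aligned}\]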


Since $(1 - 1/A)(1/A + 1) < 1$ and $1/A + 1/(j-1) \le 1$ for $j \ge 3$, we obtain that
\[\left(1 - \frac{1}{A^{j-1}}\right)\left(\frac{1}{A} + \frac{1}{j-1}\right) < 1\]
for all integers $j \ge 2$, and so the statement follows.
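As a sanity check on (C.14), the following sketch compares both sides numerically for $j = 2$, where $\zeta(2)/\zeta(4) = 15/\pi^2$ is available in closed form. The choice of $j$, the sample moduli $m$, the range of $A$ and the floating-point tolerance are all assumptions made here; this does not replace the proof.

```python
from math import gcd, pi

def is_squarefree(n):
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

def prime_divisors(m):
    out, p = [], 2
    while p * p <= m:
        if m % p == 0:
            out.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        out.append(m)
    return out

def lemma_sides(A, m):
    """Return (left, right) sides of (C.14) for j = 2 and an integer A >= 1."""
    # C = sum over all a coprime to m of mu^2(a)/a^2 = (zeta(2)/zeta(4)) / prod_{p|m} (1 + 1/p^2)
    C = 15 / pi**2
    for p in prime_divisors(m):
        C /= 1 + 1 / p**2
    head = sum(1 / a**2 for a in range(1, A) if gcd(a, m) == 1 and is_squarefree(a))
    return C - head, C / A

results = [lemma_sides(A, m) for m in (1, 2, 6, 30) for A in range(1, 301)]
print(all(left <= right + 1e-12 for left, right in results))
```

For $A = 1$ the two sides agree up to rounding, which is why a small tolerance is used in the comparison.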

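We now obtain easily the estimates we want: by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 1$),
\[\begin{aligned}
\sum_{q \ge r} \frac{\mu^2(q)}{\phi(q)^2} &= \sum_{q \ge r} \sum_{ab = q} \frac{f_2(b)}{a} \frac{\mu^2(q)}{q} \le \sum_{b \ge 1} \frac{f_2(b)}{b} \sum_{a \ge r/b} \frac{\mu^2(a)}{a^2}\\
&\le \frac{\zeta(2)/\zeta(4)}{r} \sum_{b \ge 1} f_2(b) = \frac{15/\pi^2}{r} \prod_p \left(1 + \frac{2p - 1}{(p-1)^2 p}\right) \le \frac{6.7345}{r}.
\end{aligned} \tag{C.15}\]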

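Similarly, by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 2$),
\[\begin{aligned}
\sum_{\substack{q \ge r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} &\le \sum_{\substack{b \ge 1\\ b \text{ odd}}} \frac{f_2(b)}{b} \sum_{\substack{a \ge r/b\\ a \text{ odd}}} \frac{\mu^2(a)}{a^2} \le \frac{\zeta(2)/\zeta(4)}{1 + 1/2^2} \cdot \frac{1}{r} \sum_{b \text{ odd}} f_2(b)\\
&= \frac{12}{\pi^2} \cdot \frac{1}{r} \prod_{p > 2} \left(1 + \frac{2p - 1}{(p-1)^2 p}\right) \le \frac{2.15502}{r},
\end{aligned} \tag{C.16}\]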

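\[\sum_{\substack{q \ge r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2} = \sum_{\substack{q \ge r/2\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} \le \frac{4.31004}{r}. \tag{C.17}\]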

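Lastly,
\[\begin{aligned}
\sum_{\substack{q \le r\\ q \text{ odd}}} \frac{\mu^2(q) q}{\phi(q)} &= \sum_{\substack{q \le r\\ q \text{ odd}}} \mu^2(q) \sum_{d | q} \frac{1}{\phi(d)} = \sum_{\substack{d \le r\\ d \text{ odd}}} \frac{1}{\phi(d)} \sum_{\substack{q \le r\\ d | q\\ q \text{ odd}}} \mu^2(q)\\
&\le \sum_{\substack{d \le r\\ d \text{ odd}}} \frac{1}{2 \phi(d)} \left(\frac{r}{d} + 1\right) \le \frac{r}{2} \sum_{d \text{ odd}} \frac{1}{\phi(d) d} + \frac{1}{2} \sum_{\substack{d \le r\\ d \text{ odd}}} \frac{1}{\phi(d)}\\
&\le 0.64787 r + \frac{\log r}{4} + 0.425,
\end{aligned} \tag{C.18}\]
where we are using (C.8) and (C.11). (Only squarefree $d$ contribute to the sums over $d$, since $d | q$ and $q$ is squarefree; this is what allows (C.8) and (C.11), which carry the factor $\mu^2$, to be applied.)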

Since we are on the subject of $\phi(q)$, let us also prove a simple lemma that we use at various points in the text to bound $q/\phi(q)$.

Lemma C.2.2. For any $q \ge 1$ and any $r \ge \max(3, q)$,
\[\frac{q}{\phi(q)} < z(r),\]


where
\[z(r) = e^\gamma \log \log r + \frac{2.50637}{\log \log r}. \tag{C.19}\]

Proof. Since $z(r)$ is increasing for $r \ge 27$, the statement follows immediately for $q \ge 27$ by [RS62, Thm. 15]:
\[\frac{q}{\phi(q)} < z(q) \le z(r).\]
For $q < 27$, it is clear that $q/\phi(q) \le 2 \cdot 3/(1 \cdot 2) = 3$. By the arithmetic/geometric mean inequality, $z(t) \ge 2\sqrt{e^\gamma \cdot 2.50637} > 3$ for all $t > e$, and so the lemma holds for $q < 27$.
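The lemma can be spot-checked numerically. The sketch below tests $q/\phi(q) < z(\max(3, q))$ for $q \le 10^5$ and the arithmetic/geometric mean lower bound used for small $q$; the cutoff, the function names and the hard-coded value of Euler's constant are choices made here.

```python
from math import exp, log, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def z(r):
    """z(r) as in (C.19); defined for r > e."""
    ll = log(log(r))
    return exp(EULER_GAMMA) * ll + 2.50637 / ll

def totients(limit):
    """phi[q] for 0 <= q <= limit, by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi

def lemma_holds_up_to(limit):
    phi = totients(limit)
    return all(q / phi[q] < z(max(3, q)) for q in range(1, limit + 1))

holds = lemma_holds_up_to(10**5)
am_gm_floor = 2 * sqrt(exp(EULER_GAMMA) * 2.50637)  # lower bound for z(t), t > e
print(holds, am_gm_floor)
```

The margin is smallest at primorials such as $q = 2310$ and $q = 30030$, which is where $q/\phi(q)$ is largest relative to $\log \log q$.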


      Appendix D

Checking small n by checking zeros of ζ(s)

In order to show that every odd number $n \le N$ is the sum of three primes, it is enough to show for some $M \le N$ that

1. every even integer $4 \le m \le M$ can be written as the sum of two primes,

2. the difference between any two consecutive primes $\le N$ is at most $M - 4$.

(If we want to show that every odd number $n \le N$ is the sum of three odd primes, we just replace $M - 4$ by $M - 6$ in (2).) The best known result of type (1) is that of Oliveira e Silva, Herzog and Pardi ([OeSHP14], $M = 4 \cdot 10^{18}$). As for (2), it was proven in [HP13] for $M = 4 \cdot 10^{18}$ and $N = 8.875694 \cdot 10^{30}$ by a direct computation (valid even if we replace $M - 4$ by $M - 6$ in the statement of (2)).
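The reduction can be seen at work on a toy scale. In the sketch below, $N = 10^4$ is a choice made here, $M$ is computed from the prime gaps rather than taken from the literature, and the function names are hypothetical; each odd $n$ is written as $p + p' + p''$ by first subtracting the largest prime $p \le n - 4$ and then splitting the even remainder.

```python
from bisect import bisect_right

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def three_primes(n, primes, prime_set, M):
    """Write an odd n >= 7 as p + p' + p'' following the two-step reduction."""
    p = primes[bisect_right(primes, n - 4) - 1]  # largest prime <= n - 4 (odd, since n - 4 >= 3)
    m = n - p                                    # even, and 4 <= m <= M by condition (2)
    assert m % 2 == 0 and 4 <= m <= M
    for a in primes:                             # condition (1): m is a sum of two primes
        if a > m // 2:
            break
        if (m - a) in prime_set:
            return p, a, m - a
    raise ValueError(f"{m} is not a sum of two primes")

N = 10_000
primes = primes_up_to(N + 1000)  # a little past N, so the gap after the last prime <= N is seen
prime_set = set(primes)
# M - 4 is the largest gap between consecutive primes whose smaller member is <= N.
M = 4 + max(b - a for a, b in zip(primes, primes[1:]) if a <= N)
triples = {n: three_primes(n, primes, prime_set, M) for n in range(7, N + 1, 2)}
print(M, triples[9999])
```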

Alternatively, one can establish results of type (2) by means of numerical verifications of the Riemann hypothesis up to a certain height. This is a classical approach, followed in [RS75] and [Sch76], and later in [RS03]; we will use the version of (1) kindly provided by Ramaré in [Ramd]. We carry out this approach in full here, not because it is preferable to [HP13] (it is still based on computations, and it is slightly more indirect than [HP13]) but simply to show that one can establish what we need by a different route.

A numerical verification of the Riemann hypothesis up to a certain height consists simply in checking that all (non-trivial) zeroes $z$ of the Riemann zeta function up to a height $H$ (meaning: $\Im(z) \le H$) lie on the critical line $\Re(z) = 1/2$.

The height up to which the Riemann hypothesis has actually been fully verified is not a matter on which there is unanimity. The strongest claim in the literature is in [GD04], which states that the first $10^{13}$ zeroes of the Riemann zeta function lie on the critical line $\Re(z) = 1/2$. This corresponds to checking the Riemann hypothesis up to height $H = 2.44599 \cdot 10^{12}$. It is unclear whether this computation was or could be easily made rigorous; as pointed out in [SD10, p. 2398], it has not been replicated yet.

Before [GD04], the strongest results were those of the ZetaGrid distributed computing project led by S. Wedeniwski [Wed03]; the method followed in it was more traditional, and should allow rigorous verification involving interval arithmetic. Unfortunately, the results were never formally published. The statement that the ZetaGrid project verified the first $9 \cdot 10^{11}$ zeroes (corresponding to $H = 2.419 \cdot 10^{11}$) is often quoted (e.g., [Bom10, p. 29]); this is the point to which the project had got by the time of Gourdon and Demichel's announcement. Wedeniwski asserts in private communication that the project verified the first $10^{12}$ zeroes, and that the computation was double-checked (by the same method).

The strongest claim prior to ZetaGrid was that of van de Lune ($H = 3.293 \cdot 10^9$, first $10^{10}$ zeroes; unpublished). Recently, Platt [Plaa] checked the first $1.1 \cdot 10^{11}$ zeroes ($H = 3.061 \cdot 10^{10}$) rigorously, following a method essentially based on that in [Boo06a]. Note that [Plaa] uses interval arithmetic, which is highly desirable for floating-point computations.

Proposition D.0.3. Every odd integer $5 < n \le n_0$ is the sum of three primes, where
\[n_0 = \begin{cases}
5.90698 \cdot 10^{29} & \text{if [GD04] is used ($H = 2.44 \cdot 10^{12}$),}\\
6.15697 \cdot 10^{28} & \text{if ZetaGrid results are used ($H = 2.419 \cdot 10^{11}$),}\\
1.23163 \cdot 10^{27} & \text{if [Plaa] is used ($H = 3.061 \cdot 10^{10}$).}
\end{cases}\]

Proof. For $n \le 4 \cdot 10^{18} + 3$, this is immediate from [OeSHP14]. Let $4 \cdot 10^{18} + 3 < n \le n_0$. We need to show that there is a prime $p$ in $[n - 4 - (n-4)/\Delta, n - 4]$, where $\Delta$ is large enough for $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$ to hold. We will then have that $4 \le n - p \le 4 + (n-4)/\Delta \le 4 \cdot 10^{18}$. Since $n - p$ is even, [OeSHP14] will then imply that $n - p$ is the sum of two primes $p'$, $p''$, and so
\[n = p + p' + p''.\]

Since $n - 4 > 10^{11}$, the interval $[n - 4 - (n-4)/\Delta, n - 4]$ with $\Delta = 28314000$ must contain a prime [RS03]. This gives the solution for $(n - 4) \le 1.1325 \cdot 10^{26}$, since then $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$. Note $1.1325 \cdot 10^{26} > e^{59}$.

From here onwards, we use the tables in [Ramd] to find acceptable values of $\Delta$. Since $n - 4 \ge e^{59}$, we can choose
\[\Delta = \begin{cases}
52211882224 & \text{if [GD04] is used (case (a)),}\\
13861486834 & \text{if ZetaGrid is used (case (b)),}\\
307779681 & \text{if [Plaa] is used (case (c)).}
\end{cases}\]
This gives us $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$ for $n - 4 < e^{r_0}$, where $r_0 = 67$ in case (a), $r_0 = 66$ in case (b) and $r_0 = 62$ in case (c).

If $n - 4 \ge e^{r_0}$, we can choose (again by [Ramd])
\[\Delta = \begin{cases}
146869130682 & \text{in case (a),}\\
15392435100 & \text{in case (b),}\\
307908668 & \text{in case (c).}
\end{cases}\]
This is enough for $n - 4 < e^{68}$ in case (a), and without further conditions for (b) or (c).


Finally, if $n - 4 \ge e^{68}$ and we are in case (a), [Ramd] assures us that the choice
\[\Delta = 147674531294\]
is valid; we verify as well that $(n_0 - 4)/\Delta \le 4 \cdot 10^{18} - 4$.

In other words, the rigorous results in [Plaa] are enough to show the result for all odd $n \le 10^{27}$. Of course, [HP13] is also more than enough, and gives stronger results than Prop. D.0.3.
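The arithmetic in this proof is easy to recheck. The sketch below takes the values of $\Delta$, $r_0$ and $n_0$ from the text and confirms, stage by stage, that $(n-4)/\Delta \le 4 \cdot 10^{18} - 4$ holds at the top of each range; the data layout and names are choices made here, and that each $\Delta$ is admissible in its range is taken on trust from [RS03] and [Ramd].

```python
from math import exp

B = 4 * 10**18 - 4  # largest admissible value of (n - 4)/Delta

# Each stage is (r, Delta): Delta is used while n - 4 < e^r; r = None means "up to n0".
CASES = {
    "GD04": {"n0": 590698 * 10**24,
             "stages": [(67, 52211882224), (68, 146869130682), (None, 147674531294)]},
    "ZetaGrid": {"n0": 615697 * 10**23,
                 "stages": [(66, 13861486834), (None, 15392435100)]},
    "Plaa": {"n0": 123163 * 10**22,
             "stages": [(62, 307779681), (None, 307908668)]},
}

def stage_ok(n0, r, delta):
    """Is (n - 4)/Delta <= B at the top of the range covered by this stage?"""
    upper = n0 - 4 if r is None else exp(r)
    return upper <= delta * B

# First stage, common to all cases: Delta = 28314000 from [RS03], used up to 1.1325 * 10^26.
first_stage_ok = exp(59) < 11325 * 10**22 <= 28314000 * B

report = {name: all(stage_ok(case["n0"], r, d) for r, d in case["stages"])
          for name, case in CASES.items()}
print(first_stage_ok, report)
```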


      Bibliography

[AS64] M. Abramowitz and I. A. Stegun. Handbook of mathematical functions with formulas, graphs, and mathematical tables, volume 55 of National Bureau of Standards Applied Mathematics Series. For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 1964.

[BBO10] J. Bertrand, P. Bertrand, and J.-P. Ovarlez. Mellin transform. In A. D. Poularikas, editor, Transforms and applications handbook. CRC Press, Boca Raton, FL, 2010.

[Bom74] E. Bombieri. Le grand crible dans la théorie analytique des nombres. Société Mathématique de France, Paris, 1974. Avec une sommaire en anglais, Astérisque, No. 18.

[Bom10] E. Bombieri. The classical theory of zeta and L-functions. Milan J. Math., 78(1):11–59, 2010.

[Bom76] E. Bombieri. On twin almost primes. Acta Arith., 28(2):177–193, 1975/76.

[Boo06a] A. R. Booker. Artin's conjecture, Turing's method, and the Riemann hypothesis. Experiment. Math., 15(4):385–407, 2006.

[Boo06b] A. R. Booker. Turing and the Riemann hypothesis. Notices Amer. Math. Soc., 53(10):1208–1211, 2006.

[Bor56] K. G. Borodzkin. On the problem of I. M. Vinogradov's constant (in Russian). In Proc. Third All-Union Math. Conf., volume 1, page 3. Izdat. Akad. Nauk SSSR, Moscow, 1956.

[Bou99] J. Bourgain. On triples in arithmetic progression. Geom. Funct. Anal., 9(5):968–984, 1999.

[BR02] G. Bastien and M. Rogalski. Convexité, complète monotonie et inégalités sur les fonctions zêta et gamma, sur les fonctions des opérateurs de Baskakov et sur des fonctions arithmétiques. Canad. J. Math., 54(5):916–944, 2002.


[But11] Y. Buttkewitz. Exponential sums over primes and the prime twin problem. Acta Math. Hungar., 131(1-2):46–58, 2011.

[Che73] J. R. Chen. On the representation of a larger even integer as the sum of a prime and the product of at most two primes. Sci. Sinica, 16:157–176, 1973.

[Che85] J. R. Chen. On the estimation of some trigonometrical sums and their application. Sci. Sinica Ser. A, 28(5):449–458, 1985.

[Chu37] N. G. Chudakov. On the Goldbach problem. C. R. (Dokl.) Acad. Sci. URSS, n. Ser., 17:335–338, 1937.

[Chu38] N. G. Chudakov. On the density of the set of even numbers which are not representable as the sum of two odd primes. Izv. Akad. Nauk SSSR Ser. Mat. 2, pages 25–40, 1938.

[Chu47] N. G. Chudakov. Introduction to the theory of Dirichlet L-functions. OGIZ, Moscow-Leningrad, 1947. In Russian.

[CW89] J. R. Chen and T. Z. Wang. On the Goldbach problem. Acta Math. Sinica, 32(5):702–718, 1989.

[CW96] J. R. Chen and T. Z. Wang. The Goldbach problem for odd numbers. Acta Math. Sinica (Chin. Ser.), 39(2):169–174, 1996.

[Dab96] H. Daboussi. Effective estimates of exponential sums over primes. In Analytic number theory, Vol. 1 (Allerton Park, IL, 1995), volume 138 of Progr. Math., pages 231–244. Birkhäuser Boston, Boston, MA, 1996.

[Dav67] H. Davenport. Multiplicative number theory. Markham Publishing Co., Chicago, Ill., 1967. Lectures given at the University of Michigan, Winter Term.

[dB81] N. G. de Bruijn. Asymptotic methods in analysis. Dover Publications Inc., New York, third edition, 1981.

[Des08] R. Descartes. Œuvres de Descartes publiées par Charles Adam et Paul Tannery sous les auspices du Ministère de l'Instruction publique. Physico-mathematica. Compendium musicae. Regulae ad directionem ingenii. Recherche de la vérité. Supplément à la correspondance. X. Paris: Léopold Cerf. IV u. 691 S. 4, 1908.

[Des77] J.-M. Deshouillers. Sur la constante de Šnirel'man. In Séminaire Delange-Pisot-Poitou, 17e année (1975/76), Théorie des nombres: Fac. 2, Exp. No. G16, page 6. Secrétariat Math., Paris, 1977.

[DEtRZ97] J.-M. Deshouillers, G. Effinger, H. te Riele, and D. Zinoviev. A complete Vinogradov 3-primes theorem under the Riemann hypothesis. Electron. Res. Announc. Amer. Math. Soc., 3:99–104, 1997.


[Dic66] L. E. Dickson. History of the theory of numbers. Vol. I: Divisibility and primality. Chelsea Publishing Co., New York, 1966.

[DLDDD+10] C. Daramy-Loirat, F. De Dinechin, D. Defour, M. Gallet, N. Gast, and Ch. Lauter. Crlibm, March 2010. Version 1.0beta4.

[DR01] H. Daboussi and J. Rivat. Explicit upper bounds for exponential sums over primes. Math. Comp., 70(233):431–447 (electronic), 2001.

[Dre93] F. Dress. Fonction sommatoire de la fonction de Möbius. I. Majorations expérimentales. Experiment. Math., 2(2):89–98, 1993.

[DS70] H. G. Diamond and J. Steinig. An elementary proof of the prime number theorem with a remainder term. Invent. Math., 11:199–258, 1970.

[Eff99] G. Effinger. Some numerical implications of the Hardy and Littlewood analysis of the 3-primes problem. Ramanujan J., 3(3):239–280, 1999.

[EM95] M. El Marraki. Fonction sommatoire de la fonction de Möbius. III. Majorations asymptotiques effectives fortes. J. Théor. Nombres Bordeaux, 7(2):407–433, 1995.

[EM96] M. El Marraki. Majorations de la fonction sommatoire de la fonction $\mu(n)/n$. Univ. Bordeaux 1, preprint (96-8), 1996.

[Est37] T. Estermann. On Goldbach's Problem: Proof that Almost all Even Positive Integers are Sums of Two Primes. Proc. London Math. Soc., S2-44(4):307–314, 1937.

[FI98] J. Friedlander and H. Iwaniec. Asymptotic sieve for primes. Ann. of Math. (2), 148(3):1041–1065, 1998.

[FI10] J. Friedlander and H. Iwaniec. Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2010.

[For02] K. Ford. Vinogradov's integral and bounds for the Riemann zeta function. Proc. London Math. Soc. (3), 85(3):565–633, 2002.

[GD04] X. Gourdon and P. Demichel. The first $10^{13}$ zeros of the Riemann zeta function, and zeros computation at very large height. http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf, 2004.

[GR94] I. S. Gradshteyn and I. M. Ryzhik. Table of integrals, series, and products. Academic Press Inc., Boston, MA, fifth edition, 1994. Translation edited and with a preface by Alan Jeffrey.

[GR96] A. Granville and O. Ramaré. Explicit bounds on exponential sums and the scarcity of squarefree binomial coefficients. Mathematika, 43(1):73–107, 1996.


[Har66] G. H. Hardy. Collected papers of G. H. Hardy (Including Joint papers with J. E. Littlewood and others). Vol. I. Edited by a committee appointed by the London Mathematical Society. Clarendon Press, Oxford, 1966.

[HB79] D. R. Heath-Brown. The fourth power moment of the Riemann zeta function. Proc. London Math. Soc. (3), 38(3):385–422, 1979.

[HB85] D. R. Heath-Brown. The ternary Goldbach problem. Rev. Mat. Iberoamericana, 1(1):45–59, 1985.

[HB11] H. Hong and Ch. W. Brown. QEPCAD B – Quantifier elimination by partial cylindrical algebraic decomposition, May 2011. Version 1.62.

[Hela] H. A. Helfgott. Major arcs for Goldbach's problem. Preprint. Available at arXiv:1203.5712.

[Helb] H. A. Helfgott. Minor arcs for Goldbach's problem. Preprint. Available as arXiv:1205.5252.

[Helc] H. A. Helfgott. The Ternary Goldbach Conjecture is true. Preprint. Available as arXiv:1312.7748.

[Hel13a] H. Helfgott. La conjetura débil de Goldbach. Gac. R. Soc. Mat. Esp., 16(4), 2013.

[Hel13b] H. A. Helfgott. The ternary Goldbach conjecture, 2013. Available at http://valuevar.wordpress.com/2013/07/02/the-ternary-goldbach-conjecture/.

[Hel14a] H. A. Helfgott. La conjecture de Goldbach ternaire. Gaz. Math., (140):5–18, 2014. Translated by Margaret Bilu, revised by the author.

[Hel14b] H. A. Helfgott. The ternary Goldbach problem. To appear in Proceedings of the International Congress of Mathematicians (Seoul, Korea, 2014), 2014.

[HL22] G. H. Hardy and J. E. Littlewood. Some problems of 'Partitio numerorum'; III: On the expression of a number as a sum of primes. Acta Math., 44(1):1–70, 1922.

[HP13] H. A. Helfgott and David J. Platt. Numerical verification of the ternary Goldbach conjecture up to $8.875 \cdot 10^{30}$. Exp. Math., 22(4):406–409, 2013.

[HR00] G. H. Hardy and S. Ramanujan. Asymptotic formulæ in combinatory analysis [Proc. London Math. Soc. (2) 17 (1918), 75–115]. In Collected papers of Srinivasa Ramanujan, pages 276–309. AMS Chelsea Publ., Providence, RI, 2000.


[Hux72] M. N. Huxley. Irregularity in sifted sequences. J. Number Theory, 4:437–454, 1972.

[IK04] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.

[Kad] H. Kadiri. An explicit zero-free region for the Dirichlet L-functions. Preprint. Available as arXiv:0510570.

[Kad05] H. Kadiri. Une région explicite sans zéros pour la fonction ζ de Riemann. Acta Arith., 117(4):303–339, 2005.

[Kar93] A. A. Karatsuba. Basic analytic number theory. Springer-Verlag, Berlin, 1993. Translated from the second (1983) Russian edition and with a preface by Melvyn B. Nathanson.

[Knu99] O. Knüppel. PROFIL/BIAS, February 1999. Version 2.

[Kor58] N. M. Korobov. Estimates of trigonometric sums and their applications. Uspehi Mat. Nauk, 13(4 (82)):185–192, 1958.

[Lam08] B. Lambov. Interval arithmetic using SSE-2. In Reliable Implementation of Real Number Algorithms: Theory and Practice. International Seminar Dagstuhl Castle, Germany, January 8-13, 2006, volume 5045 of Lecture Notes in Computer Science, pages 102–113. Springer, Berlin, 2008.

[Leh66] R. Sherman Lehman. On the difference $\pi(x) - \mathrm{li}(x)$. Acta Arith., 11:397–410, 1966.

[LW02] M.-Ch. Liu and T. Wang. On the Vinogradov bound in the three primes Goldbach conjecture. Acta Arith., 105(2):133–175, 2002.

[Mar41] K. K. Mardzhanishvili. On the proof of the Goldbach-Vinogradov theorem (in Russian). C. R. (Doklady) Acad. Sci. URSS (N.S.), 30(8):681–684, 1941.

[McC84a] K. S. McCurley. Explicit estimates for the error term in the prime number theorem for arithmetic progressions. Math. Comp., 42(165):265–285, 1984.

[McC84b] K. S. McCurley. Explicit zero-free regions for Dirichlet L-functions. J. Number Theory, 19(1):7–32, 1984.

[Mon68] H. L. Montgomery. A note on the large sieve. J. London Math. Soc., 43:93–98, 1968.

[Mon71] H. L. Montgomery. Topics in multiplicative number theory. Lecture Notes in Mathematics, Vol. 227. Springer-Verlag, Berlin, 1971.


[MV73] H. L. Montgomery and R. C. Vaughan. The large sieve. Mathematika, 20:119–134, 1973.

[MV74] H. L. Montgomery and R. C. Vaughan. Hilbert's inequality. J. London Math. Soc. (2), 8:73–82, 1974.

[MV07] H. L. Montgomery and R. C. Vaughan. Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2007.

[Ned06] N. S. Nedialkov. VNODE-LP: a validated solver for initial value problems in ordinary differential equations, July 2006. Version 0.3.

[OeSHP14] T. Oliveira e Silva, S. Herzog, and S. Pardi. Empirical verification of the even Goldbach conjecture and computation of prime gaps up to $4 \cdot 10^{18}$. Math. Comp., 83:2033–2060, 2014.

[OLBC10] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and Ch. W. Clark, editors. NIST handbook of mathematical functions. U.S. Department of Commerce, National Institute of Standards and Technology, Washington, DC, 2010. With 1 CD-ROM (Windows, Macintosh and UNIX).

[Olv58] F. W. J. Olver. Uniform asymptotic expansions of solutions of linear second-order differential equations for large values of a parameter. Philos. Trans. Roy. Soc. London. Ser. A, 250:479–517, 1958.

[Olv59] F. W. J. Olver. Uniform asymptotic expansions for Weber parabolic cylinder functions of large orders. J. Res. Nat. Bur. Standards Sect. B, 63B:131–169, 1959.

[Olv61] F. W. J. Olver. Two inequalities for parabolic cylinder functions. Proc. Cambridge Philos. Soc., 57:811–822, 1961.

[Olv65] F. W. J. Olver. On the asymptotic solution of second-order differential equations having an irregular singularity of rank one, with an application to Whittaker functions. J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2:225–243, 1965.

[Olv74] F. W. J. Olver. Asymptotics and special functions. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London, 1974. Computer Science and Applied Mathematics.

[Plaa] D. Platt. Computing $\pi(x)$ analytically. To appear in Math. Comp. Available as arXiv:1203.5712.

[Plab] D. Platt. Numerical computations concerning GRH. Preprint. Available at arXiv:1305.3087.

[Pla11] D. Platt. Computing degree 1 L-functions rigorously. PhD thesis, Bristol University, 2011.


[Rama] O. Ramaré. État des lieux. Preprint. Available as http://math.univ-lille1.fr/~ramare/Maths/ExplicitJNTB.pdf.

[Ramb] O. Ramaré. Explicit estimates on several summatory functions involving the Moebius function. To appear in Math. Comp.

[Ramc] O. Ramaré. A sharp bilinear form decomposition for primes and Moebius function. Preprint. To appear in Acta Math. Sinica.

[Ramd] O. Ramaré. Short effective intervals containing primes. Preprint.

[Ram95] O. Ramaré. On Šnirel'man's constant. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 22(4):645–706, 1995.

[Ram09] O. Ramaré. Arithmetical aspects of the large sieve inequality, volume 1 of Harish-Chandra Research Institute Lecture Notes. Hindustan Book Agency, New Delhi, 2009. With the collaboration of D. S. Ramana.

[Ram10] O. Ramaré. On Bombieri's asymptotic sieve. J. Number Theory, 130(5):1155–1189, 2010.

[Ram13] O. Ramaré. From explicit estimates for primes to explicit estimates for the Möbius function. Acta Arith., 157(4):365–379, 2013.

[Ram14] O. Ramaré. Explicit estimates on the summatory functions of the Möbius function with coprimality restrictions. Acta Arith., 165(1):1–10, 2014.

[Ros41] B. Rosser. Explicit bounds for some functions of prime numbers. Amer. J. Math., 63:211–232, 1941.

[RR96] O. Ramaré and R. Rumely. Primes in arithmetic progressions. Math. Comp., 65(213):397–425, 1996.

[RS62] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6:64–94, 1962.

[RS75] J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. Math. Comp., 29:243–269, 1975. Collection of articles dedicated to Derrick Henry Lehmer on the occasion of his seventieth birthday.

[RS03] O. Ramaré and Y. Saouter. Short effective intervals containing primes. J. Number Theory, 98(1):10–33, 2003.

[RV83] H. Riesel and R. C. Vaughan. On sums of primes. Ark. Mat., 21(1):46–74, 1983.

[Sao98] Y. Saouter. Checking the odd Goldbach conjecture up to $10^{20}$. Math. Comp., 67(222):863–866, 1998.


[Sch33] L. Schnirelmann. Über additive Eigenschaften von Zahlen. Math. Ann., 107(1):649–690, 1933.

[Sch76] L. Schoenfeld. Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. II. Math. Comp., 30(134):337–360, 1976.

[SD10] Y. Saouter and P. Demichel. A sharp region where $\pi(x) - \mathrm{li}(x)$ is positive. Math. Comp., 79(272):2395–2405, 2010.

[Sel91] A. Selberg. Lectures on sieves. In Collected papers, vol. II, pages 66–247. Springer, Berlin, 1991.

[Sha14] X. Shao. A density version of the Vinogradov three primes theorem. Duke Math. J., 163(3):489–512, 2014.

[Shu92] F. H. Shu. The Cosmos. In Encyclopaedia Britannica, Macropaedia, volume 16, pages 762–795. Encyclopaedia Britannica, Inc., 15 edition, 1992.

[Tao14] T. Tao. Every odd number greater than 1 is the sum of at most five primes. Math. Comp., 83(286):997–1038, 2014.

[Tem10] N. M. Temme. Parabolic cylinder functions. In NIST Handbook of mathematical functions, pages 303–319. U.S. Dept. Commerce, Washington, DC, 2010.

[Tru] T. S. Trudgian. An improved upper bound for the error in the zero-counting formulae for Dirichlet L-functions and Dedekind zeta-functions. Preprint.

[Tuc11] W. Tucker. Validated numerics: A short introduction to rigorous computations. Princeton University Press, Princeton, NJ, 2011.

[Tur53] A. M. Turing. Some calculations of the Riemann zeta-function. Proc. London Math. Soc. (3), 3:99–117, 1953.

[TV03] N. M. Temme and R. Vidunas. Parabolic cylinder functions: examples of error bounds for asymptotic expansions. Anal. Appl. (Singap.), 1(3):265–288, 2003.

[van37] J. G. van der Corput. Sur l'hypothèse de Goldbach pour presque tous les nombres pairs. Acta Arith., 2:266–290, 1937.

[Vau77a] R. C. Vaughan. On the estimation of Schnirelman's constant. J. Reine Angew. Math., 290:93–108, 1977.

[Vau77b] R.-C. Vaughan. Sommes trigonométriques sur les nombres premiers. C. R. Acad. Sci. Paris Sér. A-B, 285(16):A981–A983, 1977.

[Vau80] R. C. Vaughan. Recent work in additive prime number theory. In Proceedings of the International Congress of Mathematicians (Helsinki, 1978), pages 389–394. Acad. Sci. Fennica, Helsinki, 1980.


[Vau97] R. C. Vaughan. The Hardy-Littlewood method, volume 125 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, second edition, 1997.

[Vin37] I. M. Vinogradov. A new method in analytic number theory (Russian). Tr. Mat. Inst. Steklova, 10:5–122, 1937.

[Vin47] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers (Russian). Tr. Mat. Inst. Steklova, 23:3–109, 1947.

[Vin54] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Interscience Publishers, London and New York, 1954. Translated, revised and annotated by K. F. Roth and Anne Davenport.

[Vin58] I. M. Vinogradov. A new estimate of the function $\zeta(1 + it)$. Izv. Akad. Nauk SSSR. Ser. Mat., 22:161–164, 1958.

[Vin04] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport. Reprint of the 1954 translation.

[Wed03] S. Wedeniwski. ZetaGrid - Computational verification of the Riemann hypothesis. Conference in Number Theory in honour of Professor H. C. Williams, Banff, Alberta, Canada, May 2003.

[Wei84] A. Weil. Number theory: An approach through history. From Hammurapi to Legendre. Birkhäuser Boston Inc., Boston, MA, 1984.

[Whi03] E. T. Whittaker. On the functions associated with the parabolic cylinder in harmonic analysis. Proc. London Math. Soc., 35:417–427, 1903.

[Wig20] S. Wigert. Sur la théorie de la fonction $\zeta(s)$ de Riemann. Ark. Mat., 14:1–17, 1920.

[Won01] R. Wong. Asymptotic approximations of integrals, volume 34 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001. Corrected reprint of the 1989 original.

[Zin97] D. Zinoviev. On Vinogradov's constant in Goldbach's ternary problem. J. Number Theory, 65(2):334–358, 1997.

                                                                                • 842 Another simple contour
                                                                                  • 85 Conclusions
                                                                                    • 9 Explicit formulas
                                                                                      • 91 A general explicit formula
                                                                                      • 92 Sums and decay for the Gaussian
                                                                                      • 93 The case of (t)
                                                                                      • 94 The case of +(t)
                                                                                      • 95 A sum for +(t)2
                                                                                      • 96 A verification of zeros and its consequences
                                                                                          • III The integral over the circle
                                                                                            • 10 The integral over the major arcs
                                                                                              • 101 Decomposition of S by characters
                                                                                              • 102 The integral over the major arcs the main term
                                                                                              • 103 The 2 norm over the major arcs
                                                                                              • 104 The integral over the major arcs conclusion
                                                                                                • 11 Optimizing and adapting smoothing functions
                                                                                                  • 111 The symmetric smoothing function
                                                                                                    • 1111 The product (t) (-t)
                                                                                                      • 112 The smoothing function adapting minor-arc bounds
                                                                                                        • 12 The 2 norm and the large sieve
                                                                                                          • 121 Variations on the large sieve for primes
                                                                                                          • 122 Bounding the quotient in the large sieve for primes
                                                                                                            • 13 The integral over the minor arcs
                                                                                                              • 131 Putting together 2 bounds over arcs and bounds
                                                                                                              • 132 The minor-arc total
                                                                                                                • 14 Conclusion
                                                                                                                  • 141 The 2 norm over the major arcs explicit version
                                                                                                                  • 142 The total major-arc contribution
                                                                                                                  • 143 The minor-arc total explicit version
                                                                                                                  • 144 Conclusion proof of main theorem
                                                                                                                      • IV Appendices
                                                                                                                        • A Norms of smoothing functions
                                                                                                                          • A1 The decay of a Mellin transform
                                                                                                                          • A2 The difference +- in 2 norm
                                                                                                                          • A3 Norms involving +
                                                                                                                          • A4 Norms involving +
                                                                                                                          • A5 The -norm of +
                                                                                                                            • B Norms of Fourier transforms
                                                                                                                              • B1 The Fourier transform of 2
                                                                                                                              • B2 Bounds involving a logarithmic factor
                                                                                                                                • C Sums involving and
                                                                                                                                  • C1 Sums over primes
                                                                                                                                  • C2 Sums involving
                                                                                                                                    • D Checking small n by checking zeros of (s)

        3.3.2 An alternative route
        4 Type I sums
        4.1 Trigonometric sums
        4.2 Type I estimates
        4.2.1 Type I variations
        5 Type II sums
        5.1 The sum S_1: cancellation
        5.1.1 Reduction to a sum with μ
        5.1.2 Explicit bounds for a sum with μ
        5.1.3 Estimating the triple sum
        5.2 The sum S_2: the large sieve, primes and tails
        6 Minor-arc totals
        6.1 The smoothing function
        6.2 Contributions of different types
        6.2.1 Type I terms: S_{I,1}
        6.2.2 Type I terms: S_{I,2}
        6.2.3 Type II terms
        6.3 Adjusting parameters. Calculations.
        6.3.1 First choice of parameters: q ≤ y
        6.3.2 Second choice of parameters
        6.4 Conclusion

        II Major arcs
        7 Major arcs: overview and results
        7.1 Results
        7.2 Main ideas
        8 The Mellin transform of the twisted Gaussian
        8.1 How to choose a smoothing function?
        8.2 The twisted Gaussian: overview and setup
        8.2.1 Relation to the existing literature
        8.2.2 General approach
        8.3 The saddle point
        8.3.1 The coordinates of the saddle point
        8.3.2 The direction of steepest descent
        8.4 The integral over the contour
        8.4.1 A simple contour
        8.4.2 Another simple contour
        8.5 Conclusions
        9 Explicit formulas
        9.1 A general explicit formula
        9.2 Sums and decay for the Gaussian
        9.3 The case of η_∗(t)
        9.4 The case of η_+(t)
        9.5 A sum for η_+(t)^2
        9.6 A verification of zeros and its consequences

        III The integral over the circle
        10 The integral over the major arcs
        10.1 Decomposition of S_η by characters
        10.2 The integral over the major arcs: the main term
        10.3 The ℓ_2 norm over the major arcs
        10.4 The integral over the major arcs: conclusion
        11 Optimizing and adapting smoothing functions
        11.1 The symmetric smoothing function η
        11.1.1 The product η(t)η(ρ − t)
        11.2 The smoothing function η_∗: adapting minor-arc bounds
        12 The ℓ_2 norm and the large sieve
        12.1 Variations on the large sieve for primes
        12.2 Bounding the quotient in the large sieve for primes
        13 The integral over the minor arcs
        13.1 Putting together ℓ_2 bounds over arcs and ℓ_∞ bounds
        13.2 The minor-arc total
        14 Conclusion
        14.1 The ℓ_2 norm over the major arcs: explicit version
        14.2 The total major-arc contribution
        14.3 The minor-arc total: explicit version
        14.4 Conclusion: proof of main theorem

        IV Appendices
        A Norms of smoothing functions
        A.1 The decay of a Mellin transform
        A.2 The difference η_+ − η in ℓ_2 norm
        A.3 Norms involving η_+
        A.4 Norms involving η′_+
        A.5 The ℓ_∞-norm of η_+
        B Norms of Fourier transforms
        B.1 The Fourier transform of η″_2
        B.2 Bounds involving a logarithmic factor
        C Sums involving Λ and φ
        C.1 Sums over primes
        C.2 Sums involving φ
        D Checking small n by checking zeros of ζ(s)

        Preface

        ἐγγὺς δ’ ἦν τέλεος ὃ δὲ τὸ τρίτον ἧκε χ[αμᾶζε

        σὺν τῶι δ’ ἐξέφυγεν θάνατον καὶ κῆ[ρα μέλαιναν

        Hesiod (?), Ehoiai, fr. 76.21–2, Merkelbach and West

        The ternary Goldbach conjecture (or three-prime conjecture) states that every odd number n greater than 5 can be written as the sum of three primes. The purpose of this book is to give the first full proof of this conjecture.

        The proof builds on the great advances made in the early 20th century by Hardy and Littlewood (1922) and Vinogradov (1937). Progress since then has been more gradual. In some ways, it was necessary to clear the board, and start work using only the main existing ideas towards the problem, together with techniques developed elsewhere.

        Part of the aim has been to keep the exposition as accessible as possible, with an emphasis on qualitative improvements and new technical ideas that should be of use elsewhere. The main strategy was to give an analytic approach that is efficient, relatively clean, and, as it must be for this problem, explicit; the focus does not lie in optimizing explicit constants, or in performing calculations, necessary as these tasks are.

        Organization. In the introduction, after a summary of the history of the problem, we will go over a detailed outline of the proof. The rest of the book is divided into three parts, structured so that they can be read independently: the first two parts do not refer to each other, and the third part uses only the main results (clearly marked) of the first two parts.

        As is the case in most proofs involving the circle method, the problem is reduced to showing that a certain integral over the “circle” R/Z is non-zero. The circle is divided into major arcs and minor arcs. In Part I – in some ways the technical heart of the proof – we will see how to give upper bounds on the integrand when α is in the minor arcs. Part II will provide rather precise estimates for the integrand when the variable α is in the major arcs. Lastly, Part III shows how to use these inputs as well as possible to estimate the integral.

        Each part and each chapter starts with a general discussion of the strategy and the main ideas involved. Some of the more technical bounds and computations are relegated to the appendices.


        Dependencies between the chapters [diagram not reproduced; its nodes are listed below]

        1 Introduction; 2 Notation and preliminaries; 3 Minor arcs: introduction; 4 Type I sums; 5 Type II sums; 6 Minor-arc totals; 7 Major arcs: overview; 8 Mellin transform of twisted Gaussian; 9 Explicit formulas; 10 The integral over the major arcs; 11 Smoothing functions and their use; 12 The ℓ_2 norm and the large sieve; 13 The integral over the minor arcs; 14 Conclusion

        Acknowledgements

        The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary ranges, and also helped him with the usage of interval arithmetic. He is also deeply grateful to O. Ramaré, who, in reply to his requests, prepared and sent for publication several auxiliary results, and who otherwise provided much-needed feedback.

        The author is also much indebted to A. Booker, B. Green, R. Heath-Brown, H. Kadiri, D. Platt, T. Tao and M. Watkins for many discussions on Goldbach’s problem and related issues. Several historical questions became clearer due to the help of J. Brandes, K. Gong, R. Heath-Brown, Z. Silagadze, R. Vaughan and T. Wooley. Additional references were graciously provided by R. Bryant, S. Huntsman and I. Rezvyakova. Thanks are also due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.

        The introduction is largely based on the author’s article for the Proceedings of the 2014 ICM [Hel14b]. That article, in turn, is based in part on the informal note [Hel13b], which was published in Spanish translation ([Hel13a], translated by M. A. Morales and the author, and revised with the help of J. Cilleruelo and M. Helfgott) and in a French version ([Hel14a], translated by M. Bilu and revised by the author). The proof first appeared as a series of preprints: [Helb], [Hela], [Helc].

        Travel and other expenses were funded in part by the Adams Prize and the Philip Leverhulme Prize. The author’s work on the problem started at the Université de Montréal (CRM) in 2006; he is grateful to both the Université de Montréal and the École Normale Supérieure for providing pleasant working environments. During the last stages of the work, travel was partly covered by ANR Project Caesar No. ANR-12-BS01-0011.

        The present work would most likely not have been possible without free and publicly available software: SAGE, PARI, Maxima, gnuplot, VNODE-LP, PROFIL/BIAS, and, of course, LaTeX, Emacs, the gcc compiler and GNU/Linux in general. Some exploratory work was done in SAGE and Mathematica. Rigorous calculations used either D. Platt’s interval-arithmetic package (based in part on Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODE-LP.

        The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the author’s desktop computers at home and work. However, D. Platt’s computations [Plab] used a significant amount of resources, kindly donated to D. Platt and the author by several institutions. This crucial help was provided by MesoPSL (affiliated with the Observatoire de Paris and Paris Sciences et Lettres), Université de Paris VI/VII (UPMC - DSI - Pôle Calcul), University of Warwick (thanks to Bill Hart), University of Bristol, France Grilles (French National Grid Infrastructure, DIRAC national instance), Université de Lyon 1 and Université de Bordeaux 1. Both D. Platt and the author would like to thank the donating organizations, their technical staff, and all those who helped to make these resources available to them.

        Chapter 1

        Introduction

        The question we will discuss, or one similar to it, seems to have been first posed by Descartes, in a manuscript published only centuries after his death [Des08, p. 298]. Descartes states: “Sed & omnis numerus par fit ex uno vel duobus vel tribus primis” (“But also every even number is made out of one, two or three prime numbers”¹). This statement comes in the middle of a discussion of sums of polygonal numbers, such as the squares.

        Statements on sums of primes and sums of values of polynomials (polygonal numbers, powers n^k, etc.) have since shown themselves to be much more than mere curiosities – and not just because they are often very difficult to prove. Whereas the study of sums of powers can rely on their algebraic structure, the study of sums of primes leads to the realization that, from several perspectives, the set of primes behaves much like the set of integers, or like a random set of integers. (It also leads to the realization that this is very hard to prove.)

        If, instead of the primes, we had a random set of odd integers S whose density – an intuitive concept that can be made precise – equaled that of the primes, then we would expect to be able to write every odd number as a sum of three elements of S, and every even number as the sum of two elements of S. We would have to check by hand whether this is true for small odd and even numbers, but it is relatively easy to show that, after a long enough check, it would be very unlikely that there would be any exceptions left among the infinitely many cases left to check.

        The question, then, is in what sense we need the primes to be like a random set of integers; in other words, we need to know what we can prove about the regularities of the distribution of the primes. This is one of the main questions of analytic number theory; progress on it has been very slow and difficult.

        Fourier analysis expresses information on the distribution of a sequence in terms of frequencies. In the case of the primes, what may be called the main frequencies – those in the major arcs – correspond to the same kind of large-scale distribution that is encoded by L-functions, the family of functions to which the Riemann zeta function belongs. On some of the crucial questions on L-functions, the limits of our knowledge have barely budged in the last century. There is something relatively new now, namely, rigorous numerical data of non-negligible scope; still, such data is, by definition, finite, and, as a consequence, its range of applicability is very narrow. Thus, the real question in the major-arc regime is how to use well the limited information we do have on the large-scale distribution of the primes. As we will see, this requires delicate work on explicit asymptotic analysis and smoothing functions.

        ¹ Thanks are due to J. Brandes and R. Vaughan for a discussion on a possible ambiguity in the Latin wording. Descartes’ statement is mentioned (with a translation much like the one given here) in Dickson’s History [Dic66, Ch. XVIII].

        Outside the main frequencies – that is, in what are called the minor arcs – estimates based on L-functions no longer apply, and what is remarkable is that one can say anything meaningful on the distribution of the primes. Vinogradov was the first to give unconditional, non-trivial bounds, showing that there are no great irregularities in the minor arcs; this is what makes them “minor”. Here the task is to give sharper bounds than Vinogradov. It is in this regime that we can genuinely say that we learn a little more about the distribution of the primes, based on what is essentially an elementary and highly optimized analytic-combinatorial analysis of exponential sums, i.e., Fourier coefficients given by series (supported on the primes, in our case).

        The circle method reduces an additive problem – that is, a problem on sums, such as sums of primes, powers, etc. – to the estimation of an integral on the space of frequencies (the “circle” R/Z). In the case of the primes, as we have just discussed, we have precise estimates on the integrand on part of the circle (the major arcs) and upper bounds on the rest of the circle (the minor arcs). Putting them together efficiently to give an estimate on the integral is a delicate matter; we leave it for the last part, as it is really what is particular to our problem, as opposed to being of immediate general relevance to the study of the primes. As we shall see, estimating the integral well does involve using – and improving – general estimates on the variance of irregularities in the distribution of the primes, as given by the large sieve.

        In fact, one of the main general lessons of the proof is that there is a very close relationship between the circle method and the large sieve; we will use the large sieve not just as a tool – which we shall, incidentally, sharpen in certain contexts – but as a source for ideas on how to apply the circle method more effectively.

        This has been an attempt at a first look from above. Let us now undertake a more leisurely and detailed overview of the problem and its solution.

        1.1 History and new developments

        The history of the conjecture starts properly with Euler and his close friend, Christian Goldbach, both of whom lived and worked in Russia at the time of their correspondence – about a century after Descartes’ isolated statement. Goldbach, a man of many interests, is usually classed as a serious amateur; he seems to have awakened Euler’s passion for number theory, which would lead to the beginning of the modern era of the subject [Wei84, Ch. 3, §IV]. In a letter dated June 7, 1742, Goldbach made a conjectural statement on prime numbers, and Euler rapidly reduced it to the following conjecture, which, he said, Goldbach had already posed to him: every positive integer can be written as the sum of at most three prime numbers.

        We would now say “every integer greater than 1”, since we no longer consider 1 to be a prime number. Moreover, the conjecture is nowadays split into two:

        • the weak, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes;

        • the strong, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes.

        As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number n, then express n − 3 as the sum of two primes).
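
        The reduction from the binary to the ternary statement can be tried out on small cases. The following is only an illustrative sketch, not part of the proof: the function names are hypothetical, primality is tested by naive trial division, and the binary decomposition is found by brute-force search.

```python
def is_prime(m: int) -> bool:
    """Trial division; adequate only for the small numbers used here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def two_primes(m: int):
    """Return primes (p, q) with p + q == m, or None if the search fails."""
    for p in range(2, m // 2 + 1):
        if is_prime(p) and is_prime(m - p):
            return p, m - p
    return None


def three_primes_via_binary(n: int):
    """Write an odd n > 5 as 3 + p + q, using a binary decomposition of n - 3."""
    if n <= 5 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 5")
    pair = two_primes(n - 3)  # n - 3 is even and greater than 2
    return None if pair is None else (3, *pair)


if __name__ == "__main__":
    for n in range(7, 30, 2):
        print(n, three_primes_via_binary(n))
```

        Note that the implication goes one way only: a ternary decomposition of n need not contain the prime 3, so the weak conjecture says nothing directly about n − 3.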

        The strong conjecture remains out of reach. A short while ago – the first complete version appeared on May 13, 2013 – the author proved the weak Goldbach conjecture.

        Theorem 1.1.1. Every odd integer greater than 5 can be written as the sum of three primes.

        In 1937, I. M. Vinogradov proved [Vin37] that the conjecture is true for all odd numbers n larger than some constant C. (Hardy and Littlewood had proved the same statement under the assumption of the Generalized Riemann Hypothesis, which we shall have the chance to discuss later.)

        It is clear that a computation can verify the conjecture only for n ≤ c, c a constant: computations have to be finite. What can make a result coming from analytic number theory be valid only for n ≥ C?

        An analytic proof, generally speaking, gives us more than just existence. In this kind of problem, it gives us more than the possibility of doing something (here, writing an integer n as the sum of three primes). It gives us a rigorous estimate for the number of ways in which this something is possible; that is, it shows us that this number of ways equals

        main term + error term,        (1.1)

        where the main term is a precise quantity f(n), and the error term is something whose absolute value is at most another precise quantity g(n). If f(n) > g(n), then (1.1) is non-zero, i.e., we will have shown the existence of a way to write our number as the sum of three primes.

        (Since what we truly care about is existence, we are free to weigh different ways of writing n as the sum of three primes however we wish – that is, we can decide that some primes “count” twice or thrice as much as others, and that some do not count at all.)

        Typically, after much work, we succeed in obtaining (1.1) with f(n) and g(n) such that f(n) > g(n) asymptotically, that is, for n large enough. To give a highly simplified example: if, say, f(n) = n^2 and g(n) = 100n^(3/2), then f(n) > g(n) for n > C, where C = 10^4, and so the number of ways (1.1) is positive for n > C.
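
        The threshold in this toy example can be confirmed numerically. The sketch below is only a check of the simplified example, with hypothetical function names; it compares squares so that all arithmetic stays in exact integers (f(n) > g(n) holds exactly when f(n)^2 > g(n)^2, since both quantities are positive).

```python
def main_term(n: int) -> int:
    """f(n) = n^2 from the simplified example."""
    return n ** 2


def error_bound_squared(n: int) -> int:
    """g(n)^2 = (100 n^(3/2))^2 = 10^4 n^3, kept as an exact integer."""
    return 100 ** 2 * n ** 3


def main_term_dominates(n: int) -> bool:
    """True exactly when f(n) > g(n)."""
    return main_term(n) ** 2 > error_bound_squared(n)


def last_failure(limit: int) -> int:
    """Largest n <= limit for which f(n) <= g(n); 0 if there is none."""
    worst = 0
    for n in range(1, limit + 1):
        if not main_term_dominates(n):
            worst = n
    return worst


if __name__ == "__main__":
    # The inequality fails up to and including n = 10^4 and holds afterwards.
    print(last_failure(20000))
```

        At n = 10^4 the two sides are equal, which is why the inequality is stated for n > C rather than n ≥ C.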

        We want a moderate value of C, that is, a C small enough that all cases n ≤ C can be checked computationally. To ensure this, we must make the error term bound g(n) as small as possible. This is our main task. A secondary (and sometimes neglected) possibility is to rig the weights so as to make the main term f(n) larger in comparison to g(n); this can generally be done only up to a certain point, but is nonetheless very helpful.

        As we said, the first unconditional proof that odd numbers n ≥ C can be written as the sum of three primes is due to Vinogradov. Analytic bounds fall into several categories, or stages; quite often, successive versions of the same theorem will go through successive stages.

        1. An ineffective result shows that a statement is true for some constant C, but gives no way to determine what the constant C might be. Vinogradov’s first proof of his theorem (in [Vin37]) is like this: it shows that there exists a constant C such that every odd number n > C is the sum of three primes, yet gives us no hope of finding out what the constant C might be.² Many proofs of Vinogradov’s result in textbooks are also of this type.

        2. An effective, but not explicit, result shows that a statement is true for some unspecified constant C in a way that makes it clear that a constant C could in principle be determined following and reworking the proof with great care. Vinogradov’s later proof ([Vin47], translated in [Vin54]) is of this nature. As Chudakov [Chu47, §IV.2] pointed out, the improvement on [Vin37] given by Mardzhanishvili [Mar41] already had the effect of making the result effective.³

        3. An explicit result gives a value of C. According to [Chu47, p. 201], the first explicit version of Vinogradov’s result was given by Borozdkin in his unpublished doctoral dissertation, written under the direction of Vinogradov (1939): C = exp(exp(exp(41.96))). Such a result is, by definition, also effective. Borozdkin later [Bor56] gave the value C = e^(e^16.038), though he does not seem to have published the proof. The best – that is, smallest – value of C known before the present work was that of Liu and Wang [LW02]: C = 2 · 10^1346.

        4. What we may call an efficient proof gives a reasonable value for C – in our case, a value small enough that checking all cases up to C is feasible.

        How far were we from an efficient proof? That is, what sort of computation could ever be feasible? The situation was paradoxical: the conjecture was known above an explicit C, but C = 2 · 10^1346 is so large that it could not be said that the problem could be attacked by any foreseeable computational means within our physical universe. (A truly brute-force verification up to C takes at least C steps; a cleverer verification takes well over √C steps. The number of picoseconds since the beginning of the universe is less than 10^30, whereas the number of protons in the observable universe is currently estimated at ∼ 10^80 [Shu92]; this limits the number of steps that can be taken in any currently imaginable computer, even if it were to do parallel processing on an astronomical scale.) Thus, the only way forward was a series of drastic improvements in the mathematical, rather than computational, side.
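
        The size of the gap can be made concrete with exact integer arithmetic. The sketch below is a rough illustration only: it assumes, as a deliberately generous reading of the figures quoted above, that every proton performs one step per picosecond, and compares the resulting budget with √C for C = 2 · 10^1346.

```python
from math import isqrt

C = 2 * 10 ** 1346          # threshold of Liu and Wang, as quoted above
PICOSECONDS = 10 ** 30      # upper bound quoted in the text
PROTONS = 10 ** 80          # estimate quoted in the text

# Assumption for illustration: one step per proton per picosecond.
budget = PICOSECONDS * PROTONS

sqrt_C = isqrt(C)           # integer part of the square root of C


def order_of_magnitude(m: int) -> int:
    """Exponent k such that 10^k <= m < 10^(k+1), for a positive integer m."""
    return len(str(m)) - 1


if __name__ == "__main__":
    print(order_of_magnitude(budget))   # 110
    print(order_of_magnitude(sqrt_C))   # 673
    print(sqrt_C > budget)              # True
```

        Even under this assumption, the budget of about 10^110 steps falls short of √C by more than 560 orders of magnitude.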

        I gave a proof with C = 10^29 in May 2013. Since D. Platt and I had verified the conjecture for all odd numbers up to n ≤ 8.8 · 10^30 by computer [HP13], this established the conjecture for all odd numbers n.

        ² Here, as is often the case in ineffective results in analytic number theory, the underlying issue is that of Siegel zeros, which are believed not to exist, but have not been shown not to; the strongest bounds on (i.e., against) such zeros are ineffective, and so are all of the many results using such estimates.

        ³ The proof in [Mar41] combined the bounds in [Vin37] with a more careful accounting of the effect of the single possible Siegel zero within range.

        (In December 2013, I reduced C to 10^27. The verification of the ternary Goldbach conjecture up to n ≤ 10^27 can be done on a home computer over a weekend, as of the time of writing (2014). It must be said that this uses the verification of the binary Goldbach conjecture for n ≤ 4 · 10^18 [OeSHP14], which itself required computational resources far outside the home-computing range. Checking the conjecture up to n ≤ 10^27 was not even the main computational task that needed to be accomplished to establish the Main Theorem – that task was the finite verification of zeros of L-functions in [Plab], a general-purpose computation that should be useful elsewhere.)
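
        One way in which a binary verification up to a bound B can be turned into a ternary verification over a much longer range is a “ladder” of primes: if consecutive primes in the ladder are at most B − 2 apart, then every odd n in range is a ladder prime plus an even number between 4 and B. The passage above does not spell out the method, so the following is only a toy sketch under that assumption, with hypothetical names and tiny stand-in values (B = 50, range up to 2000) in place of 4 · 10^18 and 10^27.

```python
from bisect import bisect_right


def is_prime(m: int) -> bool:
    """Trial division; adequate only for this toy range."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def two_primes(m: int):
    """Brute-force stand-in for a binary Goldbach verification."""
    for p in range(2, m // 2 + 1):
        if is_prime(p) and is_prime(m - p):
            return p, m - p
    return None


def prime_ladder(limit: int, max_gap: int) -> list:
    """Primes starting at 3, consecutive gaps <= max_gap, ending at >= limit - 4."""
    ladder = [3]
    while ladder[-1] < limit - 4:
        q = ladder[-1]
        # Greedy step: jump to the largest prime within max_gap of q.
        nxt = next((c for c in range(q + max_gap, q, -1) if is_prime(c)), None)
        if nxt is None:
            raise RuntimeError("no prime within max_gap; ladder cannot be extended")
        ladder.append(nxt)
    return ladder


def three_primes_from_ladder(n: int, ladder: list, binary_bound: int):
    """Write odd n as q + p1 + p2 with q the largest ladder prime <= n - 4."""
    q = ladder[bisect_right(ladder, n - 4) - 1]
    m = n - q  # even, since n and q are both odd
    if not 4 <= m <= binary_bound:
        raise ValueError("n is outside the range covered by this ladder")
    p1, p2 = two_primes(m)
    return q, p1, p2


if __name__ == "__main__":
    B, LIMIT = 50, 2000  # toy values, not those of the actual verification
    ladder = prime_ladder(LIMIT, B - 2)
    print(len(ladder), three_primes_from_ladder(1999, ladder, B))
```

        The point of the design is that the work scales with the number of ladder primes, roughly the length of the range divided by B, rather than with the length of the range itself.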

        What was the strategy of the proof The basic framework is the one pioneered byHardy and Littlewood for a variety of problems ndash namely the circle method which aswe shall see is an application of Fourier analysis over Z (There are other later routesto Vinogradovrsquos result see [HB85] [FI98] and especially the recent work [Sha14]which avoids using anything about zeros of L-functions inside the critical strip) Vino-gradovrsquos proof like much of the later work on the subject was based on a detailedanalysis of exponential sums ie Fourier transforms over Z So is the proof that wewill sketch

        At the same time the distance between 2 middot 101346 and 1027 is such that we cannothope to get to 1027 (or any other reasonable constant) by fine-tuning previous workRather we must work from scratch using the basic outline in Vinogradovrsquos originalproof and other initially unrelated developments in analysis and number theory (no-tably the large sieve) Merely improving constants will not do rather we must doqualitatively better than previous work (by non-constant factors) if we are to have anychance to succeed It is on these qualitative improvements that we will focus

        It is only fair to review some of the progress made between Vinogradovrsquos time andours Here we will focus on results later we will discuss some of the progress madein the techniques of proof See [Dic66 Ch XVIII] for the early history of the problem(before Hardy and Littlewood) see R Vaughanrsquos ICM lecture notes on the ternaryGoldbach problem [Vau80] for some further details on the history up to 1978

In 1933, Schnirelmann proved [Sch33] that every integer $n > 1$ can be written as the sum of at most $K$ primes for some unspecified constant $K$. (This pioneering work is now considered to be part of the early history of additive combinatorics.) In 1969, Klimov gave an explicit value for $K$ (namely, $K = 6 \cdot 10^9$); he later improved the constant to $K = 115$ (with G. Z. Piltay and T. A. Sheptickaja) and $K = 55$. Later, there were results by Vaughan [Vau77a] ($K = 27$), Deshouillers [Des77] ($K = 26$) and Riesel-Vaughan [RV83] ($K = 19$).

Ramaré showed in 1995 that every even number $n > 1$ can be written as the sum of at most 6 primes [Ram95]. In 2012, Tao proved [Tao14] that every odd number $n > 1$ is the sum of at most 5 primes.

There have been other avenues of attack towards the strong conjecture. Using ideas close to those of Vinogradov's, Chudakov [Chu37], [Chu38], Estermann [Est37] and van der Corput [van37] proved (independently from each other) that almost every even number (meaning: all elements of a subset of density 1 in the even numbers) can be written as the sum of two primes. In 1973, J.-R. Chen showed [Che73] that every even number $n$ larger than a constant $C$ can be written as the sum of a prime number and the product of at most two primes ($n = p_1 + p_2$ or $n = p_1 + p_2 p_3$). Incidentally, J.-R. Chen himself, together with T.-Z. Wang, was responsible for the best bounds on $C$ (for ternary Goldbach) before Liu and Wang: $C = \exp(\exp(11.503)) < 4 \cdot 10^{43000}$ [CW89] and $C = \exp(\exp(9.715)) < 6 \cdot 10^{7193}$ [CW96].

Matters are different if one assumes the Generalized Riemann Hypothesis (GRH). A careful analysis [Eff99] of Hardy and Littlewood's work [HL22] gives that every odd number $n \ge 1.24 \cdot 10^{50}$ is the sum of three primes if GRH is true (see footnote 4). According to [Eff99], the same statement with $n \ge 10^{32}$ was proven in the unpublished doctoral dissertation of B. Lucke, a student of E. Landau's, in 1926. Zinoviev [Zin97] improved this to $n \ge 10^{20}$. A computer check ([DEtRZ97]; see also [Sao98]) showed that the conjecture is true for $n < 10^{20}$, thus completing the proof of the ternary Goldbach conjecture under the assumption of GRH. What was open until now was, of course, the problem of giving an unconditional proof.

1.2 The circle method: Fourier analysis on $\mathbb{Z}$

It is common for a first course on Fourier analysis to focus on functions over the reals satisfying $f(x) = f(x+1)$, or, what is the same, functions $f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$. Such a function (unless it is fairly pathological) has a Fourier series converging to it; this is just the same as saying that $f$ has a Fourier transform $\widehat{f} : \mathbb{Z} \to \mathbb{C}$ defined by $\widehat{f}(n) = \int_{\mathbb{R}/\mathbb{Z}} f(\alpha) e(-\alpha n)\, d\alpha$ and satisfying $f(\alpha) = \sum_{n \in \mathbb{Z}} \widehat{f}(n) e(\alpha n)$ (Fourier inversion theorem), where $e(t) = e^{2\pi i t}$.

In number theory, we are especially interested in functions $f : \mathbb{Z} \to \mathbb{C}$. Then things are exactly the other way around: provided that $f$ decays reasonably fast as $n \to \pm\infty$ (or becomes 0 for $n$ large enough), $f$ has a Fourier transform $\widehat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ defined by $\widehat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ and satisfying $f(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f}(\alpha) e(\alpha n)\, d\alpha$. (Highbrow talk: we already knew that $\mathbb{Z}$ is the Fourier dual of $\mathbb{R}/\mathbb{Z}$, and so, of course, $\mathbb{R}/\mathbb{Z}$ is the Fourier dual of $\mathbb{Z}$.) "Exponential sums" (or "trigonometrical sums", as in the title of [Vin54]) are sums of the form $\sum_n f(n) e(-\alpha n)$; of course, the "circle" in "circle method" is just a name for $\mathbb{R}/\mathbb{Z}$. (To see an actual circle in the complex plane, look at the image of $\mathbb{R}/\mathbb{Z}$ under the map $\alpha \mapsto e(\alpha)$.)
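The transform pair for functions on the integers can be checked numerically. The following is a minimal sketch, assuming a finitely supported test function; the helper names and the sample function are hypothetical and chosen only for illustration. Because the integrand is then a trigonometric polynomial, a uniform Riemann sum over the circle with enough sample points reproduces the integral up to rounding.

```python
import cmath

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)

def fourier_transform(f, alpha):
    """hat f(alpha) = sum_n f(n) e(-alpha n), for f given as a dict {n: f(n)}."""
    return sum(value * e(-alpha * n) for n, value in f.items())

def invert(f, n, samples=64):
    """Recover f(n) as the integral over R/Z of hat f(alpha) e(alpha n).

    The uniform sum is exact (up to rounding) as long as `samples` exceeds
    every difference |n - m| with m in the support of f.
    """
    total = sum(fourier_transform(f, k / samples) * e(k * n / samples)
                for k in range(samples))
    return total / samples

# Hypothetical test function on Z (not from the text).
f = {-2: 1.0, 0: 3.0, 5: -2.0}
recovered = {n: invert(f, n) for n in range(-3, 7)}
```

Here `recovered[n]` agrees with `f(n)` on the support and vanishes (up to rounding) elsewhere in the sampled range.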

The study of the Fourier transform $\widehat{f}$ is relevant to additive problems in number theory, i.e., questions on the number of ways of writing $n$ as a sum of $k$ integers of a particular form. Why? One answer could be that $\widehat{f}$ gives us information about the "randomness" of $f$; if $f$ were the characteristic function of a random set, then $\widehat{f}(\alpha)$ would be very small outside a sharp peak at $\alpha = 0$.

Footnote 4: In fact, Hardy, Littlewood and Effinger use an assumption somewhat weaker than GRH: they assume that Dirichlet L-functions have no zeroes satisfying $\Re(s) \ge \theta$, where $\theta < 3/4$ is arbitrary. (We will review Dirichlet L-functions in a minute.)

We can also give a more concrete and immediate answer. Recall that, in general, the Fourier transform of a convolution equals the product of the transforms; over $\mathbb{Z}$, this means that for the additive convolution
\[
(f \ast g)(n) = \sum_{\substack{m_1, m_2 \in \mathbb{Z} \\ m_1 + m_2 = n}} f(m_1) g(m_2),
\]
the Fourier transform satisfies the simple rule
\[
\widehat{f \ast g}(\alpha) = \widehat{f}(\alpha) \cdot \widehat{g}(\alpha).
\]

We can see right away from this that $(f \ast g)(n)$ can be non-zero only if $n$ can be written as $n = m_1 + m_2$ for some $m_1$, $m_2$ such that $f(m_1)$ and $g(m_2)$ are non-zero. Similarly, $(f \ast g \ast h)(n)$ can be non-zero only if $n$ can be written as $n = m_1 + m_2 + m_3$ for some $m_1$, $m_2$, $m_3$ such that $f(m_1)$, $g(m_2)$ and $h(m_3)$ are all non-zero. This suggests that, to study the ternary Goldbach problem, we define $f_1, f_2, f_3 : \mathbb{Z} \to \mathbb{C}$ so that they take non-zero values only at the primes.

Hardy and Littlewood defined $f_1(n) = f_2(n) = f_3(n) = 0$ for $n$ non-prime (and also for $n \le 0$), and $f_1(n) = f_2(n) = f_3(n) = (\log n) e^{-n/x}$ for $n$ prime (where $x$ is a parameter to be fixed later). Here the factor $e^{-n/x}$ is there to provide "fast decay", so that everything converges; as we will see later, Hardy and Littlewood's choice of $e^{-n/x}$ (rather than some other function of fast decay) comes across in hindsight as being very clever, though not quite best-possible. (Their "choice" was, to some extent, not a choice, but an artifact of their version of the circle method, which was framed in terms of power series, not in terms of exponential sums with arbitrary smoothing functions.) The term $\log n$ is there for technical reasons; in essence, it makes sense to put it there because a random integer around $n$ has a chance of about $1/(\log n)$ of being prime.

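We can see that $(f_1 \ast f_2 \ast f_3)(n) \ne 0$ if and only if $n$ can be written as the sum of three primes. Our task is then to show that $(f_1 \ast f_2 \ast f_3)(n)$ (i.e., $(f \ast f \ast f)(n)$) is non-zero for every odd $n$ larger than a constant $C \sim 10^{27}$. Since the transform of a convolution equals a product of transforms,
\[
(f_1 \ast f_2 \ast f_3)(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f_1 \ast f_2 \ast f_3}(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.2}
\]
Our task is thus to show that the integral $\int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha$ is non-zero.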
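To make the identity between the triple convolution and the integral over the circle concrete, here is a small sketch that counts ordered representations of $n$ as a sum of three primes in two ways: directly, and by integrating the cube of the exponential sum. As a simplification that is not in the text, the weights $(\log n)e^{-n/x}$ are replaced by the plain indicator function of the primes up to $n$; all function names are hypothetical.

```python
import cmath

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns the primes <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def three_prime_count_direct(n):
    """Number of ordered triples (p1, p2, p3) of primes with p1 + p2 + p3 = n."""
    primes = primes_up_to(n)
    prime_set = set(primes)
    return sum(1 for p1 in primes for p2 in primes if n - p1 - p2 in prime_set)

def three_prime_count_circle(n):
    """The same count, as the integral over R/Z of hat f(alpha)^3 e(alpha n).

    f is the indicator of the primes <= n. With 4n sample points the uniform
    Riemann sum has no aliasing, so it equals the integral up to rounding.
    """
    primes = primes_up_to(n)
    samples = 4 * n
    total = 0j
    for k in range(samples):
        alpha = k / samples
        f_hat = sum(cmath.exp(-2j * cmath.pi * alpha * p) for p in primes)
        total += f_hat ** 3 * cmath.exp(2j * cmath.pi * alpha * n)
    return round((total / samples).real)
```

For instance, $7 = 2 + 2 + 3$ in three orderings, and both functions return 3 for `n = 7`.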

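As it happens, $\widehat{f}(\alpha)$ is particularly large when $\alpha$ is close to a rational with small denominator. Moreover, for such $\alpha$, it turns out we can actually give rather precise estimates for $\widehat{f}(\alpha)$. Define $\mathfrak{M}$ (called the set of major arcs) to be a union of narrow arcs around the rationals with small denominator:
\[
\mathfrak{M} = \bigcup_{q \le r} \; \bigcup_{\substack{a \bmod q \\ (a,q) = 1}} \left( \frac{a}{q} - \frac{1}{qQ},\; \frac{a}{q} + \frac{1}{qQ} \right),
\]
where $Q$ is a constant times $x/r$, and $r$ will be set later. (This is a slight simplification: the major-arc set we will actually use in the course of the proof will be a little different, due to a distinction between odd and even $q$.) We can write
\[
\int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathfrak{M}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha + \int_{\mathfrak{m}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha, \tag{1.3}
\]
where $\mathfrak{m}$ is the complement $(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}$ (called minor arcs).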
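The simplified definition of the major arcs translates directly into a membership test. The sketch below assumes the simplified set defined above (not the slightly different set used later in the proof) and assumes $Q \ge 2$, so that for each $q$ only the numerator nearest to $\alpha q$ needs to be examined; the function name and the sample parameters are hypothetical.

```python
from math import gcd

def in_major_arcs(alpha, r, Q):
    """Return True if alpha (mod 1) lies within 1/(qQ) of some a/q with
    q <= r and gcd(a, q) = 1.

    Assumes Q >= 2: any other numerator is then at distance at least
    1/(2q) >= 1/(qQ) from alpha, so only round(alpha * q) can qualify.
    """
    alpha = alpha % 1.0
    for q in range(1, r + 1):
        a = round(alpha * q)
        if gcd(a, q) == 1 and abs(alpha - a / q) < 1.0 / (q * Q):
            return True
    return False

def in_minor_arcs(alpha, r, Q):
    """The minor arcs are simply the complement of the major arcs."""
    return not in_major_arcs(alpha, r, Q)
```

With the toy values `r = 10` and `Q = 1000`, a point very close to 1/3 is on a major arc, while $\sqrt{2} - 1$ is on the minor arcs.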

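Now, we simply do not know how to give precise estimates for $\widehat{f}(\alpha)$ when $\alpha$ is in $\mathfrak{m}$. However, as Vinogradov realized, one can give reasonable upper bounds on $|\widehat{f}(\alpha)|$ for $\alpha \in \mathfrak{m}$. This suggests the following strategy: show that
\[
\int_{\mathfrak{m}} |\widehat{f_1}(\alpha)|\, |\widehat{f_2}(\alpha)|\, |\widehat{f_3}(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \widehat{f_1}(\alpha) \widehat{f_2}(\alpha) \widehat{f_3}(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.4}
\]
By (1.2) and (1.3), this will imply immediately that $(f_1 \ast f_2 \ast f_3)(n) > 0$, and so we will be done.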

The name of circle method is given to the study of additive problems by means of Fourier analysis over $\mathbb{Z}$, and, in particular, to the use of a subdivision of the circle $\mathbb{R}/\mathbb{Z}$ into major and minor arcs to estimate the integral of a Fourier transform. There was a "circle" already in Hardy and Ramanujan's work [HR00], but the subdivision into major and minor arcs is due to Hardy and Littlewood, who also applied their method to a wide variety of additive problems. (Hence "the Hardy-Littlewood method" as an alternative name for the circle method.) For instance, before working on the ternary Goldbach conjecture, they studied the question of whether every $n > C$ can be written as the sum of $k$th powers (Waring's problem). In fact, they used a subdivision into major and minor arcs to study Waring's problem, and not for the ternary Goldbach problem: they had no minor-arc bounds for ternary Goldbach, and their use of GRH had the effect of making every $\alpha \in \mathbb{R}/\mathbb{Z}$ yield to a major-arc treatment.

Vinogradov worked with finite exponential sums, i.e., $f_i$ compactly supported. From today's perspective, it is clear that there are applications (such as ours) in which it can be more important for $f_i$ to be smooth than compactly supported; still, Vinogradov's simplifications were an incentive to further developments. In the case of the ternary Goldbach problem, his key contribution consisted in the fact that he could give bounds on $\widehat{f}(\alpha)$ for $\alpha$ in the minor arcs without using GRH.

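An important note: in the case of the binary Goldbach conjecture, the method fails at (1.4), and not before; if our understanding of the actual value of $\widehat{f_i}(\alpha)$ is at all correct, it is simply not true in general that
\[
\int_{\mathfrak{m}} |\widehat{f_1}(\alpha)|\, |\widehat{f_2}(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \widehat{f_1}(\alpha) \widehat{f_2}(\alpha)\, e(\alpha n)\, d\alpha.
\]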

Let us see why this is not surprising. Set $f_1 = f_2 = f_3 = f$ for simplicity, so that we have the integral of the square $(\widehat{f}(\alpha))^2$ for the binary problem, and the integral of the cube $(\widehat{f}(\alpha))^3$ for the ternary problem. Squaring, like cubing, amplifies the peaks of $\widehat{f}(\alpha)$, which are at the rationals of small denominator and their immediate neighborhoods (the major arcs); however, cubing amplifies the peaks much more than squaring. This is why, even though the arcs making up $\mathfrak{M}$ are very narrow, $\int_{\mathfrak{M}} (\widehat{f}(\alpha))^3 e(\alpha n)\, d\alpha$ is larger than $\int_{\mathfrak{m}} |\widehat{f}(\alpha)|^3\, d\alpha$; that explains the name major arcs: they are not large, but they give the major part of the contribution. In contrast, squaring amplifies the peaks less, and this is why the absolute value of $\int_{\mathfrak{M}} \widehat{f}(\alpha)^2 e(\alpha n)\, d\alpha$ is in general smaller than $\int_{\mathfrak{m}} |\widehat{f}(\alpha)|^2\, d\alpha$. As nobody knows how to prove a precise estimate (and, in particular, lower bounds) on $\widehat{f}(\alpha)$ for $\alpha \in \mathfrak{m}$, the binary Goldbach conjecture is still very much out of reach.

To prove the ternary Goldbach conjecture, it is enough to estimate both sides of (1.4) for carefully chosen $f_1$, $f_2$, $f_3$, and compare them. This is our task from now on.

1.3 The major arcs $\mathfrak{M}$

1.3.1 What do we really know about L-functions and their zeros?

Before we start, let us give a very brief review of basic analytic number theory (in the sense of, say, [Dav67]). A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (In other words, $\chi(n) = \chi(n+q)$ for all $n$, $\chi(ab) = \chi(a)\chi(b)$ for all $a$, $b$ and $\chi(n) = 0$ for $(n,q) \ne 1$.) A Dirichlet L-series is defined by
\[
L(s,\chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}
\]
for $\Re(s) > 1$, and by analytic continuation for $\Re(s) \le 1$. (The Riemann zeta function $\zeta(s)$ is the L-function for the trivial character, i.e., the character $\chi$ such that $\chi(n) = 1$ for all $n$.) Taking logarithms and then derivatives, we see that
\[
-\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^{\infty} \chi(n) \Lambda(n) n^{-s} \tag{1.5}
\]
for $\Re(s) > 1$, where $\Lambda$ is the von Mangoldt function ($\Lambda(n) = \log p$ if $n$ is some prime power $p^\alpha$, $\alpha \ge 1$, and $\Lambda(n) = 0$ otherwise).
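The von Mangoldt function and the identity (1.5) are easy to experiment with. The sketch below, with hypothetical function names, computes $\Lambda(n)$ by trial division, checks the classical identity $\sum_{d \mid n} \Lambda(d) = \log n$, and evaluates a partial sum of the right-hand side of (1.5) for the trivial character at a real point $s > 1$.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; n is a prime power iff
            # nothing is left after dividing out every factor of p.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def divisor_sum_of_lambda(n):
    """sum of Lambda(d) over the divisors d of n; equals log n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

def log_derivative_partial_sum(s, terms):
    """Partial sum of sum_n Lambda(n) n^(-s), i.e. of -zeta'(s)/zeta(s), real s > 1."""
    return sum(von_mangoldt(n) * n ** (-s) for n in range(2, terms + 1))
```

At $s = 2$ the partial sums approach $-\zeta'(2)/\zeta(2) \approx 0.57$.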

Dirichlet introduced his characters and L-series so as to study primes in arithmetic progressions. In general, and after some work, (1.5) allows us to restate many sums over the primes (such as our Fourier transforms $\widehat{f}(\alpha)$) as sums over the zeros of $L(s,\chi)$.

A non-trivial zero of $L(s,\chi)$ is a zero of $L(s,\chi)$ such that $0 < \Re(s) < 1$. (The other zeros are called trivial because we know where they are, namely, at negative integers and, in some cases, also on the line $\Re(s) = 0$. In order to eliminate all zeros on $\Re(s) = 0$ outside $s = 0$, it suffices to assume that $\chi$ is primitive; a primitive character modulo $q$ is one that is not induced by (i.e., not the restriction of) any character modulo $d \mid q$, $d < q$.)

The Generalized Riemann Hypothesis for Dirichlet L-functions is the statement that, for every Dirichlet character $\chi$, every non-trivial zero of $L(s,\chi)$ satisfies $\Re(s) = 1/2$. Of course, the Generalized Riemann Hypothesis (GRH), like the Riemann Hypothesis, which is the special case of $\chi$ trivial, remains unproven. Thus, if we want to prove unconditional statements, we need to make do with partial results towards GRH. Two kinds of such results have been proven:


• Zero-free regions. Ever since the late nineteenth century (Hadamard, de la Vallée-Poussin) we have known that there are hourglass-shaped regions (more precisely, of the shape $\frac{c}{\log t} \le \sigma \le 1 - \frac{c}{\log t}$, where $c$ is a constant and where we write $s = \sigma + it$) outside which non-trivial zeros cannot lie. Explicit values for $c$ are known [McC84b], [Kad05], [Kad]. There is also the Vinogradov-Korobov region [Kor58], [Vin58], which is broader asymptotically but narrower in most of the practical range (see [For02], however).

• Finite verifications of GRH. It is possible to (ask a computer to) prove small, finite fragments of GRH, in the sense of verifying that all non-trivial zeros of a given finite set of L-functions with imaginary part less than some constant $H$ lie on the critical line $\Re(s) = 1/2$. Such verifications go back to Riemann, who checked the first few zeros of $\zeta(s)$. Large-scale, rigorous computer-based verifications are now a possibility.

Most work in the literature follows the first alternative, though [Tao14] did use a finite verification of RH (i.e., GRH for the trivial character). Unfortunately, zero-free regions seem too narrow to be useful for the ternary Goldbach problem. Thus, we are left with the second alternative.

In coordination with the present work, Platt [Plab] verified that all zeros $s$ of L-functions for characters $\chi$ with modulus $q \le 300000$ satisfying $\Im(s) \le H_q$ lie on the line $\Re(s) = 1/2$, where

• $H_q = 10^8/q$ for $q$ odd, and

• $H_q = \max(10^8/q,\; 200 + 7.5 \cdot 10^7/q)$ for $q$ even.

This was a medium-large computation, taking a few hundreds of thousands of core-hours on a parallel computer. It used interval arithmetic for the sake of rigor; we will later discuss what this means.
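The verification heights $H_q$ quoted above can be tabulated with a few lines of code. This is only a restatement of the two formulas for $H_q$; the function name is hypothetical.

```python
def platt_height(q):
    """Height H_q up to which GRH was verified for characters of modulus q,
    following the two cases stated in the text (q odd / q even)."""
    if not 1 <= q <= 300000:
        raise ValueError("the verification covers moduli 1 <= q <= 300000 only")
    if q % 2 == 1:
        return 1e8 / q
    return max(1e8 / q, 200 + 7.5e7 / q)
```

For even $q$ the second term in the maximum takes over once $q > 125000$; at the top of the range, `platt_height(300000)` is 450 rather than roughly 333.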

The choice to use a finite verification of GRH, rather than zero-free regions, had consequences on the manner in which the major and minor arcs had to be chosen. As we shall see, such a verification can be used to give very precise bounds on the major arcs, but also forces us to define them so that they are narrow and their number is constant. To be precise: the major arcs were defined around rationals $a/q$ with $q \le r$, $r = 300000$; moreover, as will become clear, the fact that $H_q$ is finite will force their width to be bounded by $c_0 r/(qx)$, where $c_0$ is a constant (say $c_0 = 8$).

1.3.2 Estimates of $\widehat{f}(\alpha)$ for $\alpha$ in the major arcs

Recall that we want to estimate sums of the type $\widehat{f}(\alpha) = \sum f(n) e(-\alpha n)$, where $f(n)$ is something like $(\log n)\, \eta(n/x)$ for $n$ equal to a prime, and 0 otherwise; here $\eta : \mathbb{R} \to \mathbb{C}$ is some function of fast decay, such as Hardy and Littlewood's choice
\[
\eta(t) = \begin{cases} e^{-t} & \text{for } t \ge 0, \\ 0 & \text{for } t < 0. \end{cases}
\]


Let us modify this just a little: we will actually estimate
\[
S_\eta(\alpha, x) = \sum_n \Lambda(n)\, e(\alpha n)\, \eta(n/x), \tag{1.6}
\]
where $\Lambda$ is the von Mangoldt function (as in (1.5)). The use of $\alpha$ rather than $-\alpha$ is just a bow to tradition, as is the use of the letter $S$ (for "sum"); however, the use of $\Lambda(n)$ rather than just plain $\log p$ does actually simplify matters.
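The sum (1.6) can be evaluated numerically for small $x$, which illustrates the earlier remark that such sums are large near rationals with small denominator. The sketch below is an illustration only: it uses the sharp truncation $\eta = 1_{[0,1]}$ and $x = 10^4$, both chosen here for convenience rather than taken from the proof, and all names are hypothetical.

```python
import cmath
import math

def von_mangoldt_table(limit):
    """table[n] = Lambda(n) for 0 <= n <= limit, via a sieve over prime powers."""
    table = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = True
            power = p
            while power <= limit:
                table[power] = math.log(p)
                power *= p
    return table

def S(alpha, x, eta, lam):
    """S_eta(alpha, x) = sum_n Lambda(n) e(alpha n) eta(n / x), over n < len(lam)."""
    return sum(lam[n] * cmath.exp(2j * cmath.pi * alpha * n) * eta(n / x)
               for n in range(1, len(lam)) if lam[n])

def sharp(t):
    """The 'brutal truncation' 1_[0,1]."""
    return 1.0 if 0 <= t <= 1 else 0.0

x = 10000
lam = von_mangoldt_table(x)
at_zero = abs(S(0.0, x, sharp, lam))          # close to x
at_one_third = abs(S(1 / 3, x, sharp, lam))   # close to x / 2
golden = (math.sqrt(5) - 1) / 2               # a badly approximable angle
at_golden = abs(S(golden, x, sharp, lam))     # much smaller
```

The comparison between `at_one_third` and `at_golden` is a small-scale picture of the contrast between major and minor arcs.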

The function $\eta$ here is sometimes called a smoothing function or simply a smoothing. It will indeed be helpful for it to be smooth on $(0, \infty)$, but, in principle, it need not even be continuous. (Vinogradov's work implicitly uses, in effect, the "brutal truncation" $1_{[0,1]}(t)$, defined to be 1 when $t \in [0,1]$ and 0 otherwise; that would be fine for the minor arcs, but, as it will become clear, it is a bad idea as far as the major arcs are concerned.)

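Assume $\alpha$ is on a major arc, meaning that we can write $\alpha = a/q + \delta/x$ for some $a/q$ ($q$ small) and some $\delta$ (with $|\delta|$ small). We can write $S_\eta(\alpha, x)$ as a linear combination
\[
S_\eta(\alpha, x) = \sum_\chi c_\chi S_{\eta,\chi}\!\left(\frac{\delta}{x}, x\right) + \text{tiny error term}, \tag{1.7}
\]
where
\[
S_{\eta,\chi}\!\left(\frac{\delta}{x}, x\right) = \sum_n \Lambda(n) \chi(n)\, e(\delta n/x)\, \eta(n/x). \tag{1.8}
\]
In (1.7), $\chi$ runs over primitive Dirichlet characters of moduli $d \mid q$, and $c_\chi$ is small ($|c_\chi| \le \sqrt{d}/\varphi(q)$).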

Why are we expressing the sums $S_\eta(\alpha, x)$ in terms of the sums $S_{\eta,\chi}(\delta/x, x)$, which look more complicated? The argument has become $\delta/x$, whereas before it was $\alpha$. Here $\delta$ is relatively small: smaller than the constant $c_0 r$, in our setup. In other words, $e(\delta n/x)$ will go around the circle a bounded number of times as $n$ goes from 1 up to a constant times $x$ (by which time $\eta(n/x)$ has become small, because $\eta$ is of fast decay). This makes the sums much easier to estimate.

To estimate the sums $S_{\eta,\chi}$, we will use L-functions, together with one of the most common tools of analytic number theory, the Mellin transform. This transform is essentially a Laplace transform with a change of variables, and a Laplace transform, in turn, is a Fourier transform taken on a vertical line in the complex plane. For $f$ of fast enough decay, the Mellin transform $F = Mf$ of $f$ is given by
\[
F(s) = \int_0^\infty f(t)\, t^s\, \frac{dt}{t};
\]
we can express $f$ in terms of $F$ by the Mellin inversion formula
\[
f(t) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} F(s)\, t^{-s}\, ds
\]
for any $\sigma$ within an interval. We can thus express $e(\delta t)\eta(t)$ in terms of its Mellin transform $F_\delta$ and then use (1.5) to express $S_{\eta,\chi}$ in terms of $F_\delta$ and $L'(s,\chi)/L(s,\chi)$; shifting the integral in the Mellin inversion formula to the left, we obtain what is known in analytic number theory as an explicit formula:
\[
S_{\eta,\chi}(\delta/x, x) = [\widehat{\eta}(-\delta)\, x] - \sum_\rho F_\delta(\rho)\, x^\rho + \text{tiny error term}.
\]
Here the term between brackets appears only for $\chi$ trivial. In the sum, $\rho$ goes over all non-trivial zeros of $L(s,\chi)$, and $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$. (The tiny error term comes from a sum over the trivial zeros of $L(s,\chi)$.) We will obtain the estimate we desire if we manage to show that the sum over $\rho$ is small.
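The definition of the Mellin transform can be sanity-checked numerically for real $s$. The sketch below is a plain trapezoid-rule approximation after the substitution $t = e^u$, not the rigorous interval-arithmetic approach used in the actual proof; the function names, cutoffs and step count are assumptions made for illustration. It can be compared with two standard closed forms: the Mellin transform of $e^{-t}$ is $\Gamma(s)$, and that of $e^{-t^2/2}$ is $2^{s/2-1}\Gamma(s/2)$.

```python
import math

def mellin(f, s, lo=-40.0, hi=5.0, steps=20000):
    """Approximate F(s) = int_0^inf f(t) t^s dt/t for real s > 0.

    Substituting t = exp(u) turns dt/t into du; the trapezoid rule is then
    applied on [lo, hi], outside of which the integrand is negligible for
    the rapidly decaying f used here.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(math.exp(u)) * math.exp(s * u)
    return total * h

def decaying_exponential(t):
    """Hardy and Littlewood's weight, for t >= 0."""
    return math.exp(-t)

def gaussian(t):
    """The Gaussian weight exp(-t^2 / 2)."""
    return math.exp(-t * t / 2)
```

For example, `mellin(gaussian, 1.0)` approximates $\sqrt{\pi/2}$, half of the total mass $\sqrt{2\pi}$ of the Gaussian.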

The point is this: if we verify GRH for $L(s,\chi)$ up to imaginary part $H$, i.e., if we check that all zeroes $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le H$ satisfy $\Re(\rho) = 1/2$, we have $|x^\rho| = \sqrt{x}$. In other words, $x^\rho$ is very small (compared to $x$). However, for any $\rho$ whose imaginary part has absolute value greater than $H$, we know next to nothing about its real part, other than $0 \le \Re(\rho) \le 1$. (Zero-free regions are notoriously weak for $\Im(\rho)$ large; we will not use them.) Hence, our only chance is to make sure that $F_\delta(\rho)$ is very small when $|\Im(\rho)| \ge H$.

This has to be true for both $\delta$ very small (including the case $\delta = 0$) and for $\delta$ not so small ($|\delta|$ up to $c_0 r/q$, which can be large because $r$ is a large constant). How can we choose $\eta$ so that $F_\delta(\rho)$ is very small in both cases for $\tau = \Im(\rho)$ large?

The method of stationary phase is useful as an exploratory tool here. In brief, it suggests (and can sometimes prove) that the main contribution to the integral
\[
F_\delta(s) = \int_0^\infty e(\delta t)\, \eta(t)\, t^s\, \frac{dt}{t} \tag{1.9}
\]
can be found where the phase of the integrand has derivative 0. This happens when $t = -\tau/2\pi\delta$ (for $\operatorname{sgn}(\tau) \ne \operatorname{sgn}(\delta)$); the contribution is then a moderate factor times $\eta(-\tau/2\pi\delta)$. In other words, if $\operatorname{sgn}(\tau) \ne \operatorname{sgn}(\delta)$ and $\delta$ is not too small ($|\delta| \ge 8$, say), $F_\delta(\sigma + i\tau)$ behaves like $\eta(-\tau/2\pi\delta)$; if $\delta$ is small ($|\delta| < 8$), then $F_\delta$ behaves like $F_0$, which is the Mellin transform $M\eta$ of $\eta$. Here is our goal, then: the decay of $\eta(t)$ as $|t| \to \infty$ should be as fast as possible, and the decay of the transform $M\eta(\sigma + i\tau)$ should also be as fast as possible.
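The location of the stationary point can be recovered in one line. The following short derivation assumes, for illustration only, that $\eta$ is real-valued, so that it does not contribute to the phase; write $s = \sigma + i\tau$.

```latex
\[
e(\delta t)\, t^{s} = t^{\sigma} \exp\bigl( i\,( 2\pi\delta t + \tau \log t ) \bigr),
\qquad
\frac{d}{dt}\bigl( 2\pi\delta t + \tau \log t \bigr) = 2\pi\delta + \frac{\tau}{t} = 0
\iff t = -\frac{\tau}{2\pi\delta}.
\]
```

Since the integral runs over $t > 0$, a stationary point exists only when $\tau$ and $\delta$ have opposite signs, which is the condition $\operatorname{sgn}(\tau) \ne \operatorname{sgn}(\delta)$.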

This is a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function $\eta$ that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.

What does "extremely rapidly" mean here? It means (as Hardy himself proved) "faster than any exponential $e^{-Ct}$". Thus, Hardy and Littlewood's choice $\eta(t) = e^{-t}$ seems essentially optimal at first sight.

However, it is not optimal. We can choose $\eta$ so that $M\eta$ decreases exponentially (with a constant $C$ somewhat worse than for $\eta(t) = e^{-t}$), but $\eta$ decreases faster than exponentially. This is a particularly appealing possibility because it is $t/|\delta|$, and not so much $t$, that risks being fairly small. (To be explicit: say we check GRH for characters of modulus $q$ up to $H_q \sim 50 \cdot c_0 r/q \ge 50|\delta|$. Then we only know that $|\tau/2\pi\delta| \gtrsim 8$. So, for $\eta(t) = e^{-t}$, $\eta(-\tau/2\pi\delta)$ may be as large as $e^{-8}$, which is not negligible. Indeed, since this term will be multiplied later by other terms, $e^{-8}$ is simply not small enough. On the other hand, we can assume that $H_q \ge 200$ (say), and so $M\eta(s) \sim e^{-(\pi/2)|\tau|}$ is completely negligible, and will remain negligible even if we replace $\pi/2$ by a somewhat smaller constant.)

We shall take $\eta(t) = e^{-t^2/2}$ (that is, the Gaussian). This is not the only possible choice, but it is in some sense natural. It is easy to show that the Mellin transform $F_\delta$ for $\eta(t) = e^{-t^2/2}$ is a multiple of what is called a parabolic cylinder function $U(a,z)$ with imaginary values for $z$. There are plenty of estimates on parabolic cylinder functions in the literature, but mostly for $a$ and $z$ real, in part because that is one of the cases occurring most often in applications. There are some asymptotic expansions and estimates for $U(a,z)$, $a$, $z$ general, due to Olver [Olv58], [Olv59], [Olv61], [Olv65], but unfortunately they come without fully explicit error terms for $a$ and $z$ within our range of interest. (The same holds for [TV03].)

In the end, I derived bounds for $F_\delta$ using the saddle-point method. (The method of stationary phase, which we used to choose $\eta$, seems to lead to error terms that are too large.) The saddle-point method consists, in brief, in changing the contour of an integral to be bounded (in this case, (1.9)) so as to minimize the maximum of the integrand. (To use a metaphor in [dB81]: find the lowest mountain pass.)

Here we strive to get clean bounds, rather than the best possible constants. Consider the case $k = 0$ of Corollary 8.0.2. It states the following. For $s = \sigma + i\tau$ with $\sigma \in [0,1]$ and $|\tau| \ge \max(100, 4\pi^2|\delta|)$, we obtain that the Mellin transform $F_\delta$ of $\eta(t) e(\delta t)$ with $\eta(t) = e^{-t^2/2}$ satisfies
\[
|F_\delta(s+k)| + |F_\delta((1-s)+k)| \le
\begin{cases}
3.001\, e^{-0.1065 \left( \frac{2|\tau|}{|\ell|} \right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\
3.286\, e^{-0.1598 |\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2,
\end{cases}
\tag{1.10}
\]
where $\ell = -2\pi\delta$.

Similar bounds hold for $\sigma$ in other ranges, thus giving us estimates on the Mellin transform $F_\delta$ for $\eta(t) = t^k e^{-t^2/2}$ and $\sigma$ in the critical range $[0,1]$. (We could do a little better if we knew the value of $\sigma$, but, in our applications, we do not, once we leave the range in which GRH has been checked. We will give a bound (Theorem 8.0.1) that does take $\sigma$ into account, and also reflects and takes advantage of the fact that there is a transitional region around $|\tau| \sim (3/2)(\pi\delta)^2$; in practice, however, we will use Cor. 8.0.2.)

A moment's thought shows that we can also use (1.10) to deal with the Mellin transform of $\eta(t) e(\delta t)$ for any function of the form $\eta(t) = e^{-t^2/2} g(t)$ (or, more generally, $\eta(t) = t^k e^{-t^2/2} g(t)$), where $g(t)$ is any band-limited function. By a band-limited function, we could mean a function whose Fourier transform is compactly supported; while that is a plausible choice, it turns out to be better to work with functions that are band-limited with respect to the Mellin transform, in the sense of being of the form
\[
g(t) = \int_{-R}^{R} h(r)\, t^{-ir}\, dr,
\]
where $h : \mathbb{R} \to \mathbb{C}$ is supported on a compact interval $[-R, R]$, with $R$ not too large (say $R = 200$). What happens is that the Mellin transform of the product $e^{-t^2/2} g(t) e(\delta t)$ is a convolution of the Mellin transform $F_\delta(s)$ of $e^{-t^2/2} e(\delta t)$ (estimated in (1.10)) and that of $g(t)$ (supported in $[-R, R]$); the effect of the convolution is just to delay decay of $F_\delta(s)$ by, at most, a shift by $y \mapsto y - R$.

We wish to estimate $S_{\eta,\chi}(\delta/x, x)$ for several functions $\eta$. This motivates us to derive an explicit formula (§) general enough to work with all the weights $\eta(t)$ we will work with, while being also completely explicit, and free of any integrals that may be tedious to evaluate.

Once that is done, and once we consider the input provided by Platt's finite verification of GRH up to $H_q$, we obtain simple bounds for different weights.

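For $\eta(t) = e^{-t^2/2}$, $x \ge 10^8$, $\chi$ a primitive character of modulus $q \le r = 300000$, and any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$, we obtain
\[
S_{\eta,\chi}\!\left(\frac{\delta}{x}, x\right) = I_{q=1} \cdot \widehat{\eta}(-\delta)\, x + E \cdot x, \tag{1.11}
\]
where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
\[
|E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left( \frac{650400}{\sqrt{q}} + 112 \right). \tag{1.12}
\]
Here $\widehat{\eta}$ stands for the Fourier transform from $\mathbb{R}$ to $\mathbb{R}$ normalized as follows: $\widehat{\eta}(t) = \int_{-\infty}^{\infty} e(-xt)\, \eta(x)\, dx$. Thus, $\widehat{\eta}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ (self-duality of the Gaussian).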
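Both the closed form for $\widehat{\eta}(-\delta)$ and the size of the error bound (1.12) are easy to check numerically. The sketch below uses a plain trapezoid rule on a truncated interval (an illustrative choice, not the method of the proof), and the function names are hypothetical.

```python
import cmath
import math

def fourier_transform_on_R(eta, t, half_width=12.0, steps=4800):
    """hat eta(t) = int e(-x t) eta(x) dx, by the trapezoid rule on
    [-half_width, half_width]; adequate for the Gaussian, whose tails
    beyond |x| = 12 are negligible."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-2j * cmath.pi * x * t) * eta(x)
    return total * h

def gaussian(x):
    """eta(x) = exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)

def gaussian_transform_closed_form(delta):
    """sqrt(2 pi) exp(-2 pi^2 delta^2), the value stated for hat eta(-delta)."""
    return math.sqrt(2 * math.pi) * math.exp(-2 * math.pi ** 2 * delta ** 2)

def error_bound(x, q):
    """Right-hand side of (1.12)."""
    return 4.306e-22 + (650400 / math.sqrt(q) + 112) / math.sqrt(x)
```

Evaluating `error_bound` for large $x$ shows why the term $E \cdot x$ in (1.11) is negligible next to the main term when $q = 1$.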

This is one of the main results of Part II; see §7.1. Similar bounds are also proven there for $\eta(t) = t^2 e^{-t^2/2}$, as well as for a weight of type $\eta(t) = t e^{-t^2/2} g(t)$, where $g(t)$ is a band-limited function, and also for a weight $\eta$ defined by a multiplicative convolution. The conditions on $q$ (namely, $q \le r = 300000$) and $\delta$ are what we expected from the outset.

Thus concludes our treatment of the major arcs. This is arguably the easiest part of the proof; it was actually what I left for the end, as I was fairly confident it would work out. Minor-arc estimates are more delicate; let us now examine them.

1.4 The minor arcs $\mathfrak{m}$

1.4.1 Qualitative goals and main ideas

What kind of bounds do we need? What is there in the literature?

We wish to obtain upper bounds on $|S_\eta(\alpha, x)|$ for some weight $\eta$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ not very close to a rational with small denominator. Every $\alpha$ is close to some rational $a/q$; what we are looking for is a bound on $|S_\eta(\alpha, x)|$ that decreases rapidly when $q$ increases.

Moreover, we want our bound to decrease rapidly when $\delta$ increases, where $\alpha = a/q + \delta/x$. In fact, the main terms in our bound will be decreasing functions of $\max(1, |\delta|/8) \cdot q$. (Let us write $\delta_0 = \max(2, |\delta|/4)$ from now on.) This will allow our bound to be good enough outside narrow major arcs, which will get narrower and narrower as $q$ increases; that is precisely the kind of major arcs we were presupposing in our major-arc bounds.


It would be possible to work with narrow major arcs that become narrower as $q$ increases simply by allowing $q$ to be very large (close to $x$), and assigning each angle to the fraction closest to it. This is, in fact, the common procedure. However, this makes matters more difficult, in that we would have to minimize at the same time the factors in front of terms $x/q$, $x/\sqrt{q}$, etc., and those in front of terms $q$, $\sqrt{qx}$, and so on. (These terms are being compared to the trivial bound $x$.) Instead, we choose to strive for a direct dependence on $\delta$ throughout; this will allow us to cap $q$ at a much lower level, thus making terms such as $q$ and $\sqrt{qx}$ negligible. (This choice has been taken elsewhere in applications of the circle method, but, strangely, seems absent from previous work on the ternary Goldbach conjecture.)

How good must our bounds be? Since the major-arc bounds are valid only for $q \le r = 300000$ and $|\delta| \le 4r/q$, we cannot afford even a single factor of $\log x$ (or any other function tending to $\infty$ as $x \to \infty$) in front of terms such as $x/\sqrt{q|\delta_0|}$: a factor like that would make the term larger than the trivial bound $x$ if $q|\delta_0|$ is equal to a constant ($r$, say) and $x$ is very large. Apparently, there was no such "log-free bound" with explicit constants in the literature, even though such bounds were considered to be in principle feasible, and even though previous work ([Che85], [Dab96], [DR01], [Tao14]) had gradually decreased the number of factors of $\log x$. (In limited ranges for $q$, there were log-free bounds without explicit constants; see [Dab96], [Ram10]. The estimate in [Vin54, Thm. 2a, 2b] was almost log-free, but not quite. There were also bounds [Kar93], [But11] that used L-functions, and thus were not really useful in a truly minor-arc regime.)

It also seemed clear that a main bound proportional to $(\log q)^2 x/\sqrt{q}$ (as in [Tao14]) was too large. At the same time, it was not really necessary to reach a bound of the best possible form that could be found through Vinogradov's basic approach, namely
\[
|S_\eta(\alpha, x)| \le C\, \frac{x \sqrt{q}}{\varphi(q)}. \tag{1.13}
\]
Such a bound had been proven by Ramaré [Ram10] for $q$ in a limited range and $C$ non-explicit; later, in [Ramc] (which postdates the first version of [Helb]), Ramaré broadened the range to $q \le x^{1/48}$ and gave an explicit value for $C$, namely, $C = 13000$. Such a bound is a notable achievement, but, unfortunately, it is not useful for our purposes. Rather, we will aim at a bound whose main term is bounded by a constant around 1 times $x (\log \delta_0 q)/\sqrt{\delta_0 \varphi(q)}$; this is slightly worse asymptotically than (1.13), but it is much better in the delicate range of $\delta_0 q \sim 300000$, and in fact for a much wider range as well.

We see that we have several tasks. One of them is the removal of logarithms: we cannot afford a single factor of $\log x$, and, in practice, we can afford at most one factor of $\log q$. Removing logarithms will be possible in part because of the use of previously existing efficient techniques (the large sieve for sequences with prime support) but also because we will be able to find cancellation at several places in sums coming from a combinatorial identity (namely, Vaughan's identity). The task of finding cancellation is particularly delicate because we cannot afford large constants or, for that matter, statements valid only for large $x$. (Bounding a sum such as $\sum_n \mu(n)$ efficiently, where $\mu$ is the Möbius function
\[
\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct}, \\ 0 & \text{if } p^2 \mid n \text{ for some prime } p, \end{cases}
\]
is harder than estimating a sum such as $\sum_n \Lambda(n)$ equally efficiently, even though we are used to thinking of the two problems as equivalent.)

We have said that our bounds will improve as $|\delta|$ increases. This dependence on $\delta$ will be secured in different ways at different places. Sometimes $\delta$ will appear as an argument, as in $\widehat{\eta}(-\delta)$; for $\eta$ piecewise continuous with $\eta' \in L_1$, we know that $|\widehat{\eta}(t)| \to 0$ as $|t| \to \infty$. Sometimes we will obtain a dependence on $\delta$ by using several different rational approximations to the same $\alpha \in \mathbb{R}$. Lastly, we will obtain a good dependence on $\delta$ in bilinear sums by supplying a scattered input to a large sieve.
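For readers who want to experiment with the sums of $\mu(n)$ mentioned above, here is a minimal sketch of the Möbius function and of its summatory function (the Mertens function), computed naively by trial division. The function names are hypothetical, and nothing here reflects how the explicit bounds in the text are actually obtained.

```python
def mobius(n):
    """mu(n): (-1)^k if n is a product of k distinct primes, 0 if a square divides n."""
    if n < 1:
        raise ValueError("mu is defined for positive integers")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def mertens(x):
    """M(x) = sum of mu(n) over 1 <= n <= x."""
    return sum(mobius(n) for n in range(1, x + 1))
```

The cancellation in `mertens(x)` is what makes the sum small compared with the trivial bound $x$.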

If there is a main moral to the argument, it lies in the close relation between the circle method and the large sieve. The circle method rests on the estimation of an integral involving a Fourier transform $\widehat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$; as we will later see, this leads naturally to estimating the $\ell_2$-norm of $\widehat{f}$ on subsets (namely, unions of arcs) of the circle $\mathbb{R}/\mathbb{Z}$. The large sieve can be seen as an approximate discrete version of Plancherel's identity, which states that $|\widehat{f}|_2 = |f|_2$.
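Plancherel's identity over $\mathbb{Z}$ can be verified numerically for a finitely supported function. In the sketch below, the test function (the indicator of the first few primes) and the helper names are assumptions chosen for illustration; since $|\widehat{f}|^2$ is a trigonometric polynomial, a uniform sum over enough sample points equals the integral over the circle up to rounding.

```python
import cmath

def f_hat(f, alpha):
    """hat f(alpha) = sum_n f(n) e(-alpha n), for f given as a dict {n: f(n)}."""
    return sum(v * cmath.exp(-2j * cmath.pi * alpha * n) for n, v in f.items())

def l2_norm_squared_on_Z(f):
    """sum_n |f(n)|^2."""
    return sum(abs(v) ** 2 for v in f.values())

def l2_norm_squared_on_circle(f, samples=256):
    """Integral over R/Z of |hat f|^2, as a uniform Riemann sum; exact up to
    rounding when `samples` exceeds the diameter of the support of f."""
    return sum(abs(f_hat(f, k / samples)) ** 2 for k in range(samples)) / samples

# Hypothetical test function: the indicator of the primes up to 11.
f = {2: 1.0, 3: 1.0, 5: 1.0, 7: 1.0, 11: 1.0}
```

Both norms equal 5 for this choice of `f`.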

        Both in this section and in sect15 we shall use the large sieve in part so as to usethe fact that some of the functions we work with have prime support ie are non-zeroonly on prime numbers There are ways to use prime support to improve the outputof the large sieve In sect15 these techniques will be refined and then translated to thecontext of the circle method where f has (essentially) prime support and |f |2 must beintegrated over unions of arcs (This allows us to remove a logarithm) The main pointis that the large sieve is not being used as a black box rather we can adapt ideas from(say) the large-sieve context and apply them to the circle method

        Lastly there are the benefits of a continuous η Hardy and Littlewood alreadyused a continuous η this was abandoned by Vinogradov presumably for the sake ofsimplicity The idea that smooth weights η can be superior to sharp truncations isnow commonplace As we shall see using a continuous η is helpful in the minor-arcsregime but not as crucial there as for the major arcs We will not use a smooth η wewill prove our estimates for any continuous η that is piecewise C1 and then towardsthe end we will choose to use the same weight η = η2 as in [Tao14] in part because ithas compact support and in part for the sake of comparison The moral here is not quitethe common dictum ldquoalways smoothrdquo but rather that different kinds of smoothing canbe appropriate for different tasks in the end we will show how to coordinate differentsmoothing functions η

        There are other ideas involved for instance some of Vinogradovrsquos lemmas areimproved Let us now go into some of the details

1.4.2 Combinatorial identities

Generally, since Vinogradov, a treatment of the minor arcs starts with a combinatorial identity expressing $\Lambda(n)$ (or the characteristic function of the primes) as a sum of two or more convolutions. (In this section, by a convolution $f * g$, we will mean the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$, i.e., the multiplicative convolution on the semigroup of positive integers.)

In some sense, the archetypical identity is
\[\Lambda = \mu * \log,\]
but it will not usually do: the contribution of $\mu(d) \log(n/d)$ with $d$ close to $n$ is too difficult to estimate precisely. There are alternatives: for example, there is the identity
\[\Lambda(n) \log n = \mu * \log^2 - \Lambda * \Lambda, \tag{1.14}\]
which underlies an estimate of Selberg's that, in turn, is the basis for the Erdős-Selberg proof of the prime number theorem; see, e.g., [MV07, §8.2]. More generally, one can decompose $\Lambda(n) (\log n)^k$ as $\mu * \log^{k+1}$ minus a linear combination of convolutions; this kind of decomposition (really just a direct consequence of the development of $(\zeta'(s)/\zeta(s))^{(k)}$) will be familiar to some from the exposition of Bombieri's work [Bom76] in [FI10, §3] (for instance). Another useful identity was that used by Daboussi [Dab96]; witness its application in [DR01], which gives explicit estimates on exponential sums over primes.
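The two identities $\Lambda = \mu * \log$ and (1.14) can be checked numerically for small $n$. The following is only an illustrative sketch: the helper names (`mobius_table`, `von_mangoldt_table`, `dirichlet`) and the cutoff `N = 500` are assumptions made for this example and do not come from the text.

```python
import math

def mobius_table(N):
    """mu[n] for 0 <= n <= N (mu[0] is a placeholder), by a simple sieve."""
    mu = [1] * (N + 1)
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not composite[p]:                      # p is prime
            for k in range(2 * p, N + 1, p):
                composite[k] = True
            for k in range(p, N + 1, p):
                mu[k] = -mu[k]                    # one more prime factor
            for k in range(p * p, N + 1, p * p):
                mu[k] = 0                         # n is not squarefree
    return mu

def von_mangoldt_table(N):
    """lam[n] = log p if n is a power of a prime p, and 0 otherwise."""
    lam = [0.0] * (N + 1)
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not composite[p]:
            for k in range(2 * p, N + 1, p):
                composite[k] = True
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def dirichlet(f, g, N):
    """Dirichlet convolution (f * g)(n) for 1 <= n <= N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        for k in range(1, N // d + 1):
            h[d * k] += f[d] * g[k]
    return h

N = 500  # hypothetical cutoff for the check
mu, lam = mobius_table(N), von_mangoldt_table(N)
log1 = [0.0] + [math.log(n) for n in range(1, N + 1)]
log2 = [v * v for v in log1]

mu_log = dirichlet(mu, log1, N)      # mu * log
mu_log2 = dirichlet(mu, log2, N)     # mu * log^2
lam_lam = dirichlet(lam, lam, N)     # Lambda * Lambda

# Largest deviation from Lambda = mu * log
err_basic = max(abs(lam[n] - mu_log[n]) for n in range(1, N + 1))
# Largest deviation from (1.14): Lambda(n) log n = mu * log^2 - Lambda * Lambda
err_selberg = max(abs(lam[n] * log1[n] - (mu_log2[n] - lam_lam[n]))
                  for n in range(1, N + 1))
```

Both errors should be of the size of floating-point rounding only; the check says nothing, of course, about how hard the individual convolutions are to estimate.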

The proof of Vinogradov's three-prime result was simplified substantially [Vau77b] by the introduction of Vaughan's identity:
\[\Lambda(n) = \mu_{\le U} * \log - \Lambda_{\le V} * \mu_{\le U} * 1 + 1 * \mu_{> U} * \Lambda_{> V} + \Lambda_{\le V}, \tag{1.15}\]
where we are using the notation
\[f_{\le W}(n) = \begin{cases} f(n) & \text{if } n \le W,\\ 0 & \text{if } n > W,\end{cases} \qquad f_{> W}(n) = \begin{cases} 0 & \text{if } n \le W,\\ f(n) & \text{if } n > W.\end{cases}\]
Of the resulting sums ($\sum_n (\mu_{\le U} * \log)(n) e(\alpha n) \eta(n/x)$, etc.), the first three are said to be of type I, type I (again) and type II; the last sum, $\sum_{n \le V} \Lambda(n)$, is negligible.
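Vaughan's identity (1.15) holds exactly for every $n \ge 1$ and every choice of $U$, $V$, so it can be verified term by term on a computer. The sketch below does this for small, arbitrarily chosen parameters; the values `N, U, V = 300, 6, 10` and all helper names are assumptions for illustration, not the parameters used in the proof.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    sign, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # square factor
            sign = -sign
        p += 1
    return -sign if n > 1 else sign

def von_mangoldt(n):
    """log p if n is a power of the prime p, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:            # p is the least prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def conv(f, g):
    """Dirichlet convolution of two tables indexed by 0..N (index 0 unused)."""
    N = len(f) - 1
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d]:
            for k in range(1, N // d + 1):
                h[d * k] += f[d] * g[k]
    return h

N, U, V = 300, 6, 10              # illustrative values only
idx = range(N + 1)
mu = [0] + [mobius(n) for n in range(1, N + 1)]
lam = [0.0] + [von_mangoldt(n) for n in range(1, N + 1)]
log = [0.0] + [math.log(n) for n in range(1, N + 1)]
one = [0] + [1] * N

mu_le = [mu[n] if n <= U else 0 for n in idx]
mu_gt = [mu[n] if n > U else 0 for n in idx]
lam_le = [lam[n] if n <= V else 0.0 for n in idx]
lam_gt = [lam[n] if n > V else 0.0 for n in idx]

type_I_a = conv(mu_le, log)                   # mu_{<=U} * log
type_I_b = conv(conv(lam_le, mu_le), one)     # Lambda_{<=V} * mu_{<=U} * 1
type_II = conv(conv(one, mu_gt), lam_gt)      # 1 * mu_{>U} * Lambda_{>V}

rhs = [type_I_a[n] - type_I_b[n] + type_II[n] + lam_le[n] for n in idx]
vaughan_error = max(abs(lam[n] - rhs[n]) for n in range(1, N + 1))
```

Changing `U` and `V` leaves `vaughan_error` at rounding level, which reflects the flexibility of the identity; what changes is the relative size of the type I and type II pieces.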

One of the advantages of Vaughan's identity is its flexibility: we can set $U$ and $V$ to whatever values we wish. Its main disadvantage is that it is not "log-free", in that it seems to impose the loss of two factors of $\log x$: if we sum each side of (1.15) from 1 to $x$, we obtain $\sum_{n \le x} \Lambda(n) \sim x$ on the left side, whereas, if we bound the sum on the right side without the use of cancellation, we obtain a bound of $x (\log x)^2$. Of course, we will obtain some cancellation from the phase $e(\alpha n)$; still, even if this gives us a factor of, say, $1/\sqrt{q}$, we will get a bound of $x (\log x)^2/\sqrt{q}$, which is worse than the trivial bound $x$ for $q$ bounded and $x$ large. Since we want a bound that is useful for all $q$ larger than the constant $r$ and all $x$ larger than a constant, this will not do.

As was pointed out in [Tao14], it is possible to get a factor of $(\log q)^2$ instead of a factor of $(\log x)^2$ in the type II sums by setting $U$ and $V$ appropriately. Unfortunately, a factor of $(\log q)^2$ is still too large in practice, and there is also the issue of factors of $\log x$ in type I sums.

Vinogradov had already managed to get an essentially log-free result (by a rather difficult procedure) in [Vin54, Ch. IX]. The result in [Dab96] is log-free. Unfortunately, the explicit result in [DR01] (the study of which encouraged me at the beginning of the project) is not. For a while, I worked with the case $k = 2$ of the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, which gives
\[\Lambda \cdot \log^2 = \mu * \log^3 - 3 \cdot (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda. \tag{1.16}\]
This identity is essentially log-free: while a trivial bound on the sum of the right side for $n$ from 1 to $N$ does seem to have two extra factors of log, they are present only in the term $\mu * \log^3$, which is not the hardest one to estimate. Ramaré obtained a log-free bound in [Ram10] using an identity introduced by Diamond and Steinig in the course of their own work on elementary proofs of the prime number theorem [DS70]; that identity gives a decomposition for $\Lambda \cdot \log^k$ that can also be derived from the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, by a clever grouping of terms.

In the end, I decided to use Vaughan's identity, motivated in part by [Tao14], and in part by the lack of free parameters in (1.16); as can be seen in (1.15), Vaughan's identity has two parameters $U$, $V$ that we can set to whatever values we think best. The form of the identity allowed me to reuse much of my work up to that point, but it also posed a challenge: since Vaughan's identity is by no means log-free, one has to obtain cancellation in Vaughan's identity at every possible step, beyond the cancellation given by the phase $e(\alpha n)$. (The presence of a phase, in fact, makes the task of getting cancellation from the identity more complicated.) The removal of logarithms will be one of our main tasks in what follows. It is clear that the presence of the Möbius function $\mu$ should give, in principle, some cancellation; we will show how to use it to obtain as much cancellation as we need, with good constants and not just asymptotically.

1.4.3 Type I sums

There are two type I sums, namely,
\[\sum_{m \le U} \mu(m) \sum_n (\log n) e(\alpha m n) \eta\left(\frac{mn}{x}\right) \tag{1.17}\]
and
\[\sum_{v \le V} \Lambda(v) \sum_{u \le U} \mu(u) \sum_n e(\alpha v u n) \eta\left(\frac{vun}{x}\right). \tag{1.18}\]
In either case, $\alpha = a/q + \delta/x$, where $q$ is larger than a constant $r$ and $|\delta/x| \le 1/qQ_0$ for some $Q_0 > \max(q, \sqrt{x})$. For the purposes of this exposition, we will set it as our task to estimate the slightly simpler sum
\[\sum_{m \le D} \mu(m) \sum_n e(\alpha m n) \eta\left(\frac{mn}{x}\right), \tag{1.19}\]
where $D$ can be $U$ or $UV$ or something else less than $x$.

Why can we consider this simpler sum without omitting anything essential? It is clear that (1.17) is of the same kind as (1.19). The inner double sum in (1.18) is just (1.19) with $\alpha v$ instead of $\alpha$; this enables us to estimate (1.18) by means of (1.19) for $q$ small, i.e., the more delicate case. If $q$ is not small, then the approximation $\alpha v \sim av/q$ may not be accurate enough. In that case, we collapse the two outer sums in (1.18) into a sum $\sum_n (\Lambda_{\le V} * \mu_{\le U})(n)$, and treat all of (1.18) much as we will treat (1.19); since $q$ is not small, we can afford to bound $(\Lambda_{\le V} * \mu_{\le U})(n)$ trivially (by $\log n$) in the less sensitive terms.

Let us first outline Vinogradov's procedure for bounding type I sums. Just by summing a geometric series, we get
\[\left|\sum_{n \le N} e(\alpha n)\right| \le \min\left(N, \frac{c}{\|\alpha\|}\right), \tag{1.20}\]
where $c$ is a constant and $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. Vinogradov splits the outer sum in (1.19) into sums of length $q$. When $m$ runs on an interval of length $q$, the angle $am/q$ runs through all fractions of the form $b/q$; due to the error $\delta/x$, $\|\alpha m\|$ could be close to 0 for two values of $m$, but otherwise $\|\alpha m\|$ takes values bounded below by $1/q$ (twice), $2/q$ (twice), $3/q$ (twice), etc. Thus
\[\left|\sum_{y < m \le y + q} \mu(m) \sum_{n \le N} e(\alpha m n)\right| \le \sum_{y < m \le y + q} \left|\sum_{n \le N} e(\alpha m n)\right| \le \frac{2N}{m} + 2cq \log eq \tag{1.21}\]
for any $y \ge 0$.
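The geometric-series bound (1.20) is easy to test numerically. In the sketch below, the constant is taken to be $c = 1/2$; this value is an assumption made for the example (it follows from $|\sin \pi\alpha| \ge 2\|\alpha\|$), since the text only says that $c$ is a constant. The function names are likewise hypothetical.

```python
import cmath
import math

def e(t):
    """The standard additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def dist_to_nearest_integer(alpha):
    """||alpha||, the distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))

def exp_sum(alpha, N):
    """The sum of e(alpha*n) over 1 <= n <= N, computed directly."""
    return sum(e(alpha * n) for n in range(1, N + 1))

def geometric_bound(alpha, N, c=0.5):
    """min(N, c/||alpha||), with c = 1/2 assumed for this sketch."""
    d = dist_to_nearest_integer(alpha)
    return N if d == 0 else min(N, c / d)

N = 400
# Angles of the form a/q + (small perturbation), as in alpha = a/q + delta/x
test_angles = [b / 7 + 1e-4 for b in range(0, 14)] + [0.0, 0.5, 0.123456, 2.999]
worst_ratio = max(abs(exp_sum(al, N)) / geometric_bound(al, N) for al in test_angles)
```

A value of `worst_ratio` at most 1 (up to rounding) is what (1.20) predicts; the ratio approaches 1 when $\alpha$ is very close to an integer, which is exactly the situation that the terms with $q | m$ create.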

There are several ways to improve this. One is simply to estimate the inner sum more precisely; this was already done in [DR01]. One can also define a smoothing function $\eta$, as in (1.19); it is easy to get
\[\left|\sum_{n \le N} e(\alpha n) \eta\left(\frac{n}{x}\right)\right| \le \min\left(x |\eta|_1 + \frac{|\eta'|_1}{2}, \frac{|\eta'|_1}{2 |\sin(\pi\alpha)|}, \frac{|\widehat{\eta''}|_\infty}{4x (\sin \pi\alpha)^2}\right).\]
Except for the third term, this is as in [Tao14]. We could also choose carefully which bound to use for each $m$; surprisingly, this gives an improvement, in fact an important one, for $m$ large. However, even with these improvements, we still have a term proportional to $N/m$, as in (1.21), and this contributes about $(x \log x)/q$ to the sum (1.19), thus giving us an estimate that is not log-free.

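What we have to do, naturally, is to take out the terms with $q | m$ for $m$ small. (If $m$ is large, then those may not be the terms for which $m\alpha$ is close to 0; we will later see what to do.) For $y + q \le Q/2$, $|\alpha - a/q| \le 1/qQ$, we get that
\[\sum_{\substack{y < m \le y + q\\ q \nmid m}} \min\left(A, \frac{B}{|\sin \pi\alpha m|}, \frac{C}{|\sin \pi\alpha m|^2}\right) \tag{1.22}\]
is at most
\[\min\left(\frac{20}{3\pi^2} C q^2,\; 2A + \frac{4q}{\pi}\sqrt{AC},\; \frac{2Bq}{\pi} \max\left(2, \log \frac{C e^3 q}{B\pi}\right)\right). \tag{1.23}\]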

This is satisfactory. We are left with all the terms $m \le M = \min(D, Q/2)$ with $q | m$, and also with all the terms $Q/2 < m \le D$. For $m \le M$ divisible by $q$, we can estimate (as opposed to just bound from above) the inner sum in (1.19) by the Poisson summation formula, and then sum over $m$, but without taking absolute values; writing $m = aq$, we get a main term
\[\frac{x \mu(q)}{q} \cdot \widehat{\eta}(-\delta) \cdot \sum_{\substack{a \le M/q\\ (a,q)=1}} \frac{\mu(a)}{a}, \tag{1.24}\]
where $(a, q)$ stands for the greatest common divisor of $a$ and $q$.

It is clear that we have to get cancellation over $\mu$ here. There is an elegant elementary argument [GR96] showing that the absolute value of the sum in (1.24) is at most 1. We need to gain one more log, however. Ramaré [Ramb] helpfully furnished the following bound:
\[\left|\sum_{\substack{a \le x\\ (a,q)=1}} \frac{\mu(a)}{a}\right| \le \frac{4}{5} \frac{q}{\phi(q)} \frac{1}{\log x/q} \tag{1.25}\]
for $q \le x$. (Cf. [EM95], [EM96].) This is neither trivial nor elementary.⁵ We are, so to speak, allowed to use non-elementary means (that is, methods based on $L$-functions) because the only $L$-function we need to use here is the Riemann zeta function.
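The elementary bound quoted from [GR96], namely that the sum in (1.24) has absolute value at most 1, can be observed directly with exact rational arithmetic. The sketch below checks it for a few moduli $q$ and all cutoffs up to a small limit; the function names and the ranges tested are assumptions chosen for illustration. The sharper bound (1.25) could be compared against the same partial sums, but it is not asserted here.

```python
import math
from fractions import Fraction

def mobius(n):
    """Moebius function by trial division."""
    sign, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if n > 1 else sign

def max_abs_partial_sum(X, q):
    """Largest |sum of mu(a)/a over a <= x with (a, q) = 1| for integer x <= X."""
    total, worst = Fraction(0), Fraction(0)
    for a in range(1, X + 1):
        if math.gcd(a, q) == 1:
            total += Fraction(mobius(a), a)   # exact arithmetic, no rounding
        worst = max(worst, abs(total))
    return worst

# Hypothetical test range: a few small moduli and cutoffs up to 400
worst_by_q = {q: max_abs_partial_sum(400, q) for q in (1, 2, 6, 30, 210)}
```

The maximum is attained at $x = 1$, where the sum equals 1; for larger $x$ the partial sums are much smaller, which is the cancellation that (1.25) quantifies.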

What shall we do for $m > Q/2$? We can always give a bound
\[\sum_{y < m \le y + q} \min\left(A, \frac{C}{|\sin \pi\alpha m|^2}\right) \le 3A + \frac{4q}{\pi}\sqrt{AC} \tag{1.26}\]
for $y$ arbitrary; since $AC$ will be of constant size, $(4q/\pi)\sqrt{AC}$ is pleasant enough, but the contribution of $3A \sim 3 |\eta|_1 x/y$ is nasty (it adds a multiple of $(x \log x)/q$ to the total) and seems unavoidable: the values of $m$ for which $\alpha m$ is close to 0 no longer correspond to the congruence class $m \equiv 0 \bmod q$, and thus cannot be taken out.

The solution is to switch approximations. (The idea of using different approximations to the same $\alpha$ is neither new nor recent in the general context of the circle method: see [Vau97, §2.8, Ex. 2]. What may be new is its use to clear a hurdle in type I sums.) What does this mean? If $\alpha$ were exactly, or almost exactly, $a/q$, then there would be no other very good approximations in a reasonable range. However, note that we can define $Q = \lfloor x/|\delta q| \rfloor$ for $\alpha = a/q + \delta/x$, and still have $|\alpha - a/q| \le 1/qQ$. If $\delta$ is very small, $Q$ will be larger than $2D$, and there will be no terms with $Q/2 < m \le D$ to worry about.

⁵ The current state of knowledge may seem surprising: after all, we expect nearly square-root cancellation (for instance, $|\sum_{n \le x} \mu(n)/n| \le \sqrt{2/x}$ holds for all real $0 < x \le 10^{12}$; see also the stronger bound [Dre93]). The classical zero-free region of the Riemann zeta function ought to give a factor of $\exp(-\sqrt{(\log x)/c})$, which looks much better than $1/\log x$. What happens is that (a) such a factor is not actually much better than $1/\log x$ for $x \sim 10^{30}$, say; (b) estimating sums involving the Möbius function by means of an explicit formula is harder than estimating sums involving $\Lambda(n)$: the residues of $1/\zeta(s)$ at the non-trivial zeros of $\zeta(s)$ come into play. As a result, getting non-trivial explicit results on sums of $\mu(n)$ is harder than one would naively expect from the quality of classical effective (but non-explicit) results. See [Rama] for a survey of explicit bounds.

What happens if $\delta$ is not very small? We know that, for any $Q'$, there is an approximation $a'/q'$ to $\alpha$ with $|\alpha - a'/q'| \le 1/q'Q'$ and $q' \le Q'$. However, for $Q' > Q$, we know that $a'/q'$ cannot equal $a/q$: by the definition of $Q$, the approximation $a/q$ is not good enough, i.e., $|\alpha - a/q| \le 1/qQ'$ does not hold. Since $a/q \ne a'/q'$, we see that $|a/q - a'/q'| \ge 1/qq'$, and this implies that $q' \ge (\epsilon/(1 + \epsilon)) Q$.
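The last implication can be spelled out as follows. This is only a sketch under two assumptions that are not stated explicitly at this point of the text: that $Q' = (1 + \epsilon) Q$ for some $\epsilon > 0$, and that $q \le Q$. By the triangle inequality,
\[\frac{1}{q q'} \le \left|\frac{a}{q} - \frac{a'}{q'}\right| \le \left|\alpha - \frac{a}{q}\right| + \left|\alpha - \frac{a'}{q'}\right| \le \frac{1}{qQ} + \frac{1}{q'Q'}.\]
Multiplying by $q q'$ and using $q/Q' \le Q/((1 + \epsilon) Q) = 1/(1 + \epsilon)$, we obtain
\[1 \le \frac{q'}{Q} + \frac{q}{Q'} \le \frac{q'}{Q} + \frac{1}{1 + \epsilon}, \qquad \text{and hence} \qquad q' \ge \left(1 - \frac{1}{1 + \epsilon}\right) Q = \frac{\epsilon}{1 + \epsilon} Q.\]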

Thus, for $m > Q/2$, the solution is to apply (1.26) with $a'/q'$ instead of $a/q$. The contribution of $A$ fades into insignificance: for the first sum over a range $y < m \le y + q'$, $y \ge Q/2$, it contributes at most $x/(Q/2)$, and all the other contributions of $A$ sum up to at most a constant times $(x \log x)/q'$.

Proceeding in this way, we obtain a total bound for (1.19) whose main terms are proportional to
\[\frac{1}{\phi(q)} \frac{x}{\log \frac{x}{q}} \min\left(1, \frac{1}{\delta^2}\right), \qquad \frac{2}{\pi} \sqrt{|\widehat{\eta''}|_\infty} \cdot D \qquad \text{and} \qquad q \log \max\left(\frac{D}{q}, q\right), \tag{1.27}\]
with good, explicit constants. The first term (usually the largest one) is precisely what we needed: it is proportional to $(1/\phi(q)) x/\log x$ for $q$ small, and decreases rapidly as $|\delta|$ increases.

1.4.4 Type II, or bilinear, sums

We must now bound
\[S = \sum_m (1 * \mu_{> U})(m) \sum_{n > V} \Lambda(n) e(\alpha m n) \eta(mn/x).\]
At this point it is convenient to assume that $\eta$ is the Mellin convolution of two functions. The multiplicative or Mellin convolution on $\mathbb{R}^+$ is defined by
\[(\eta_0 *_M \eta_1)(t) = \int_0^\infty \eta_0(r) \eta_1\left(\frac{t}{r}\right) \frac{dr}{r}.\]
Tao [Tao14] takes $\eta = \eta_2 = \eta_1 *_M \eta_1$, where $\eta_1$ is a brutal truncation, viz., the function taking the value 2 on $[1/2, 1]$ and 0 elsewhere. We take the same $\eta_2$, in part for comparison purposes, and in part because this will allow us to use off-the-shelf estimates on the large sieve. (Brutal truncations are rarely optimal in principle, but, as they are very common, results for them have been carefully optimized in the literature.) Clearly
\[S = \int_V^{x/U} \sum_m \left(\sum_{\substack{d > U\\ d | m}} \mu(d)\right) \eta_1\left(\frac{m}{x/W}\right) \cdot \sum_{n \ge V} \Lambda(n) e(\alpha m n) \eta_1\left(\frac{n}{W}\right) \frac{dW}{W}. \tag{1.28}\]
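The weight $\eta_2 = \eta_1 *_M \eta_1$ has a simple closed form, which can be compared with a direct numerical evaluation of the Mellin convolution. The closed form used below ($4 \log 4t$ on $[1/4, 1/2]$ and $-4 \log t$ on $[1/2, 1]$) is derived here from the definition rather than quoted from the text, and the quadrature routine `mellin_conv` is a hypothetical helper that integrates only over the support $[1/2, 1]$ of the first factor.

```python
import math

def eta1(t):
    """Brutal truncation: the value 2 on [1/2, 1], and 0 elsewhere."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def mellin_conv(f, g, t, lo=0.5, hi=1.0, steps=20000):
    """Midpoint rule, in the variable u = log r, for the integral of
    f(r) g(t/r) dr/r over lo <= r <= hi (assumed to contain the support of f)."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        r = math.exp(a + (i + 0.5) * h)
        total += f(r) * g(t / r)
    return total * h

def eta2_closed(t):
    """Closed form of eta1 *_M eta1 (derived for this sketch)."""
    if 0.25 <= t <= 0.5:
        return 4 * math.log(4 * t)
    if 0.5 < t <= 1.0:
        return -4 * math.log(t)
    return 0.0

sample_points = (0.2, 0.3, 0.4, 0.6, 0.9, 1.2)
max_gap = max(abs(mellin_conv(eta1, eta1, t) - eta2_closed(t)) for t in sample_points)
```

In particular $\eta_2$ is continuous, piecewise $C^1$ and supported on $[1/4, 1]$, which is consistent with the properties of the weight required earlier in the section.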

By Cauchy-Schwarz, the integrand is at most $\sqrt{S_1(U, W) S_2(V, W)}$, where
\[\begin{aligned} S_1(U, W) &= \sum_{\frac{x}{2W} < m \le \frac{x}{W}} \left|\sum_{\substack{d > U\\ d | m}} \mu(d)\right|^2,\\ S_2(V, W) &= \sum_{\frac{x}{2W} \le m \le \frac{x}{W}} \left|\sum_{\max\left(V, \frac{W}{2}\right) \le n \le W} \Lambda(n) e(\alpha m n)\right|^2. \end{aligned} \tag{1.29}\]
We must bound $S_1(U, W)$ by a constant times $x/W$. We are able to do this, with a good constant. (A careless bound would have given a multiple of $(x/U) \log^3(x/U)$, which is much too large.) First, we reduce $S_1(U, W)$ to an expression involving an integral of
\[\sum_{r_1 \le x} \sum_{\substack{r_2 \le x\\ (r_1, r_2) = 1}} \frac{\mu(r_1) \mu(r_2)}{\sigma(r_1) \sigma(r_2)}. \tag{1.30}\]
We can bound (1.30) by the use of bounds on $\sum_{n \le t} \mu(n)/n$, combined with the estimation of infinite products by means of approximations to $\zeta(s)$ for $s \to 1^+$. After some additional manipulations, we obtain a bound for $S_1(U, W)$ whose main term is at most $(3/\pi^2)(x/W)$ for each $W$, and closer to $0.22482 x/W$ on average over $W$.

(This is as good a point as any to say that, throughout, we can use a trick in [Tao14] that allows us to work with odd values of integer variables throughout, instead of letting $m$ or $n$ range over all integers. Here, for instance, if $m$ and $n$ are restricted to be odd, we obtain a bound of $(2/\pi^2)(x/W)$ for individual $W$, and $0.15107 x/W$ on average over $W$. This is so even though we are losing some cancellation in $\mu$ by the restriction.)

Let us now bound $S_2(V, W)$. This is traditionally done by Linnik's dispersion method. However, it should be clear that the thing to do nowadays is to use a large sieve, and, more specifically, a large sieve for primes; that kind of large sieve is nothing other than a tool for estimating expressions such as $S_2(V, W)$. (Incidentally, even though we are trying to save every factor of log we can, we choose not to use small sieves at all, either here or elsewhere.) In order to take advantage of prime support, we use Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27-29] and [IK04, §7.4]) combined with Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], following the general procedure in [MV73, (1.6)]. We obtain a bound of the form
\[\frac{\log W}{\log \frac{W}{2q}} \left(\frac{x}{4\phi(q)} + \frac{qW}{\phi(q)}\right) \frac{W}{2} \tag{1.31}\]
on $S_2(V, W)$, where, of course, we can also choose not to gain a factor of $\log W/2q$ if $q$ is close to or greater than $W$.

It remains to see how to gain a factor of $|\delta|$ in the major arcs, and more specifically in $S_2(V, W)$. To explain this, let us step back and take a look at what the large sieve is.

Given a civilized function $f: \mathbb{Z} \to \mathbb{C}$, Plancherel's identity tells us that
\[\int_{\mathbb{R}/\mathbb{Z}} \left|\widehat{f}(\alpha)\right|^2 d\alpha = \sum_n |f(n)|^2.\]
The large sieve can be seen as an approximate, or statistical, version of this: for a "sample" of points $\alpha_1, \alpha_2, \dots, \alpha_k$ satisfying $|\alpha_i - \alpha_j| \ge \beta$ for $i \ne j$, it tells us that
\[\sum_{1 \le j \le k} \left|\widehat{f}(\alpha_j)\right|^2 \le (X + \beta^{-1}) \sum_n |f(n)|^2, \tag{1.32}\]
assuming that $f$ is supported on an interval of length $X$.

Now consider $\alpha_1 = \alpha$, $\alpha_2 = 2\alpha$, $\alpha_3 = 3\alpha, \dots$. If $\alpha = a/q$, then the angles $\alpha_1, \dots, \alpha_q$ are well-separated, i.e., they satisfy $|\alpha_i - \alpha_j| \ge 1/q$, and so we can apply (1.32) with $\beta = 1/q$. However, $\alpha_{q+1} = \alpha_1$. Thus, if we have an outer sum of length $L > q$ (in (1.29), we have an outer sum of length $L = x/2W$), we need to split it into $\lceil L/q \rceil$ blocks of length $q$, and so the total bound given by (1.32) is $\lceil L/q \rceil (X + q) \sum_n |f(n)|^2$. Indeed, this is what gives us (1.31), which is fine, but we want to do better for $|\delta|$ larger than a constant.

Suppose, then, that $\alpha = a/q + \delta/x$, where $|\delta| > 8$, say. Then the angles $\alpha_1$ and $\alpha_{q+1}$ are not identical: $|\alpha_1 - \alpha_{q+1}| \le q|\delta|/x$. We also see that $\alpha_{q+1}$ is at a distance at least $q|\delta|/x$ from $\alpha_2, \alpha_3, \dots, \alpha_q$, provided that $q|\delta|/x < 1/q$. We can go on with $\alpha_{q+2}, \alpha_{q+3}, \dots$, and stop only once there is overlap, i.e., only once we reach $\alpha_m$ such that $m|\delta|/x \ge 1/q$. We then give all the angles $\alpha_1, \dots, \alpha_m$ (which are separated by at least $q|\delta|/x$ from each other) to the large sieve at the same time. We do this $\lceil L/m \rceil \le \lceil L/(x/|\delta|q) \rceil$ times, and obtain a total bound of $\lceil L/(x/|\delta|q) \rceil (X + x/|\delta|q) \sum_n |f(n)|^2$, which, for $L = x/2W$, $X = W/2$, gives us about
\[\left(\frac{x}{4Q} \frac{W}{2} + \frac{x}{4}\right) \log W,\]
provided that $L \ge x/|\delta|q$ and, as usual, $|\alpha - a/q| \le 1/qQ$. This is very small compared to the trivial bound $xW/8$.
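The large sieve inequality (1.32) can be tried out numerically in the simplest case discussed above, $\alpha = a/q$ exactly, where the angles $\alpha_1, \dots, \alpha_q$ are spaced by $1/q$. The sketch below uses a randomly chosen $f$ supported on $\{1, \dots, X\}$; the parameters `q, a, X`, the random seed and the convention $\widehat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ are all assumptions made for this example.

```python
import cmath
import math
import random

def e(t):
    return cmath.exp(2j * math.pi * t)

def fourier(f, alpha):
    """hat f(alpha) = sum of f(n) e(-alpha n), for f given as a dict n -> f(n)."""
    return sum(value * e(-alpha * n) for n, value in f.items())

def circle_distance(s, t):
    """Distance between s and t on the circle R/Z."""
    d = abs(s - t) % 1.0
    return min(d, 1.0 - d)

random.seed(0)                      # fixed seed, so the run is reproducible
q, a, X = 7, 3, 40                  # illustrative values with gcd(a, q) = 1
f = {n: complex(random.uniform(-1, 1), random.uniform(-1, 1))
     for n in range(1, X + 1)}

angles = [(j * a / q) % 1.0 for j in range(1, q + 1)]   # alpha_j = j * alpha
beta = min(circle_distance(s, t)
           for i, s in enumerate(angles) for t in angles[i + 1:])

lhs = sum(abs(fourier(f, alpha)) ** 2 for alpha in angles)
rhs = (X + 1 / beta) * sum(abs(value) ** 2 for value in f.values())
```

Here `beta` comes out as $1/q$ (up to rounding), so `rhs` is essentially $(X + q) \sum_n |f(n)|^2$, the bound for a single block of length $q$ described above.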

What happens if $L < x/|\delta q|$? Then there is never any overlap: we consider all angles $\alpha_i$, and give them all together to the large sieve. The total bound is $(W^2/4 + xW/2|\delta|q) \log W$. If $L = x/2W$ is smaller than, say, $x/3|\delta q|$, then we see clearly that there are non-intersecting swarms of angles $\alpha_i$ around the rationals $a/q$. We can thus save a factor of log (or rather $(\phi(q)/q) \log(W/|\delta q|)$) by applying Montgomery's inequality, which operates by strewing displacements of given angles (or here, swarms around angles) around the circle to the extent possible while keeping everything well-separated. In this way, we obtain a bound of the form
\[\frac{\log W}{\log \frac{W}{|\delta| q}} \left(\frac{x}{|\delta| \phi(q)} + \frac{q}{\phi(q)} \frac{W}{2}\right) \frac{W}{2}.\]
Compare this to (1.31); we have gained a factor of $|\delta|/4$, and so we use this estimate when $|\delta| > 4$. (We will actually use the criterion $|\delta| > 8$, but, since we will be working with approximations of the form $2\alpha = a/q + \delta/x$, the value of $\delta$ in our actual work is twice of what it is in this introduction. This is a consequence of working with sums over the odd integers, as in [Tao14].)

We have succeeded in eliminating all factors of log we came across. The only factor of log that remains is $\log x/UV$, coming from the integral $\int_V^{x/U} dW/W$. Thus, we want $UV$ to be close to $x$, but we cannot let it be too close, since we also have a term proportional to $D = UV$ in (1.27), and we need to keep it substantially smaller than $x$. We set $U$ and $V$ so that $UV$ is $x/\sqrt{q \max(4, |\delta|)}$ or thereabouts.

In the end, after some work, we obtain our main minor-arcs bound (Theorem 3.1.1). It states the following. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Recall that $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x)$ and $\eta_2 = \eta_1 *_M \eta_1 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4) x^{2/3}$. If $q \le x^{1/3}/6$, then
\[|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36 x^{5/6}, \tag{1.33}\]
where
\[\begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004 t}}\right) + 0.41415,\\ L_{x,t,q} &= \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \end{aligned} \tag{1.34}\]
The factor $R_{x,t}$ is small in practice; for typical "difficult" values of $x$ and $\delta_0 x$, it is less than 1. The crucial things to notice in (1.33) are that there is no factor of $\log x$, and that, in the main term, there is only one factor of $\log \delta_0 q$. The fact that $\delta_0$ helps us as it grows is precisely what enables us to take major arcs that get narrower and narrower as $q$ grows.
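To get a feeling for the size of the right side of (1.33), one can simply evaluate it. The sketch below transcribes (1.33) and (1.34) as they are stated here; the function names, the sample values $x = 10^{25}$, $q = 1000$ and the decision to check only the hypotheses $x \ge x_0$ and $q \le x^{1/3}/6$ (and not $|\delta/x| \le 1/qQ$) are assumptions of this example, not part of the theorem.

```python
import math

def euler_phi(q):
    """Euler's totient function by trial division."""
    result, n, p = q, q, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

def R(x, t):
    """The factor R_{x,t} of (1.34)."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def L(t, q):
    """The factor L_{x,t,q} of (1.34); as written there, it does not involve x."""
    return (q / euler_phi(q)) * (13 / 4 * math.log(t) + 7.82) \
        + 13.66 * math.log(t) + 37.55

def minor_arc_bound(x, q, delta):
    """Right side of (1.33); only two of the hypotheses are checked."""
    if x < 2.16e20 or q > x ** (1 / 3) / 6:
        raise ValueError("outside the range of x, q covered by the statement")
    d0 = max(2.0, abs(delta) / 4)
    t = d0 * q
    return ((R(x, t) * math.log(t) + 0.5) / math.sqrt(d0 * euler_phi(q)) * x
            + 2.5 * x / math.sqrt(t)
            + 2 * x / t * L(t, q)
            + 3.36 * x ** (5 / 6))

x, q = 1e25, 1000                                    # hypothetical sample values
ratio_small_delta = minor_arc_bound(x, q, 0.0) / x   # bound relative to x
ratio_large_delta = minor_arc_bound(x, q, 400.0) / x
```

For these sample values the bound is a moderate fraction of $x$ when $\delta = 0$ and a much smaller one when $|\delta| = 400$, illustrating how the estimate improves as $\delta_0$ grows.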

1.5 Integrals over the major and minor arcs

So far, we have sketched (§1.3) how to estimate $S_\eta(\alpha, x)$ for $\alpha$ in the major arcs and $\eta$ based on the Gaussian $e^{-t^2/2}$, and also (§1.4) how to bound $|S_\eta(\alpha, x)|$ for $\alpha$ in the minor arcs and $\eta = \eta_2$, where $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$. We now must show how to use such information to estimate integrals such as the ones in (1.4).

We will use two smoothing functions $\eta_+$, $\eta_*$; in the notation of (1.3), we set $f_1 = f_2 = \Lambda(n) \eta_+(n/x)$, $f_3 = \Lambda(n) \eta_*(n/x)$, and so we must give a lower bound for
\[\int_{\mathfrak{M}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha \tag{1.35}\]
and an upper bound for
\[\int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha \tag{1.36}\]
so that we can verify (1.4).

The traditional approach to (1.36) is to bound
\[\begin{aligned} \int_{\mathfrak{m}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_*}(\alpha, x) e(-\alpha n) d\alpha &\le \int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha \cdot \max_{\alpha \in \mathfrak{m}} |S_{\eta_*}(\alpha, x)|\\ &\le \sum_n \Lambda(n)^2 \eta_+^2\left(\frac{n}{x}\right) \cdot \max_{\alpha \in \mathfrak{m}} |S_{\eta_*}(\alpha, x)|. \end{aligned} \tag{1.37}\]

Since the sum over $n$ is of the order of $x \log x$, this is not log-free, and so cannot be good enough; we will later see how to do better. Still, this gets the main shape right: our bound on (1.36) will be proportional to $|\eta_+|_2^2 |\eta_*|_1$. Moreover, we see that $\eta_*$ has to be such that we know how to bound $|S_{\eta_*}(\alpha, x)|$ for $\alpha \in \mathfrak{m}$, while our choice of $\eta_+$ is more or less free, at least as far as the minor arcs are concerned.

What about the major arcs? In order to do anything on them, we will have to be able to estimate both $S_{\eta_+}(\alpha, x)$ and $S_{\eta_*}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. If that is the case, then, as we shall see, we will be able to obtain that the main term of (1.35) is an infinite product (independent of the smoothing functions), times $x^2$, times
\[\int_{-\infty}^{\infty} (\widehat{\eta_+}(-\alpha))^2 \widehat{\eta_*}(-\alpha) e(-\alpha n/x) d\alpha = \int_0^\infty \int_0^\infty \eta_+(t_1) \eta_+(t_2) \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1 dt_2. \tag{1.38}\]
In other words, we want to maximize (or nearly maximize) the expression on the right of (1.38) divided by $|\eta_+|_2^2 |\eta_*|_1$.

One way to do this is to let $\eta_*$ be concentrated on a small interval $[0, \epsilon)$. Then the right side of (1.38) is approximately
\[|\eta_*|_1 \cdot \int_0^\infty \eta_+(t) \eta_+\left(\frac{n}{x} - t\right) dt. \tag{1.39}\]
To maximize (1.39), we should make sure that $\eta_+(t) \sim \eta_+(n/x - t)$. We set $x \sim n/2$, and see that we should define $\eta_+$ so that it is supported on $[0, 2]$ and symmetric around $t = 1$, or nearly so; this will maximize the ratio of (1.39) to $|\eta_+|_2^2 |\eta_*|_1$.

We should do this while making sure that we will know how to estimate $S_{\eta_+}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. We know how to estimate $S_\eta(\alpha, x)$ very precisely for functions of the form $\eta(t) = g(t) e^{-t^2/2}$, $\eta(t) = g(t) t e^{-t^2/2}$, etc., where $g(t)$ is band-limited. We will work with a function $\eta_+$ of that form, chosen so as to be very close (in $\ell_2$ norm) to a function $\eta$ that is in fact supported on $[0, 2]$ and symmetric around $t = 1$.

We choose
\[\eta(t) = \begin{cases} t^3 (2 - t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2],\\ 0 & \text{if } t \notin [0, 2].\end{cases}\]
This function is obviously symmetric ($\eta(t) = \eta(2 - t)$) and vanishes to high order at $t = 0$, besides being supported on $[0, 2]$.

We set $\eta_+(t) = h_R(t) t e^{-t^2/2}$, where $h_R(t)$ is an approximation to the function
\[h(t) = \begin{cases} t^2 (2 - t)^3 e^{t - \frac{1}{2}} & \text{if } t \in [0, 2],\\ 0 & \text{if } t \notin [0, 2].\end{cases}\]
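The two formulas above fit together: since $t - \frac{1}{2} - \frac{t^2}{2} = -\frac{(t-1)^2}{2}$, the product $h(t) \cdot t e^{-t^2/2}$ is exactly $\eta(t)$, so $\eta_+$ approximates $\eta$ precisely to the extent that $h_R$ approximates $h$. The short sketch below checks this identity and the symmetry $\eta(t) = \eta(2 - t)$ on a grid; the function names and the grid are choices made for this illustration only.

```python
import math

def eta(t):
    """The target weight: t^3 (2-t)^3 exp(-(t-1)^2/2) on [0, 2], else 0."""
    if 0 <= t <= 2:
        return t ** 3 * (2 - t) ** 3 * math.exp(-((t - 1) ** 2) / 2)
    return 0.0

def h(t):
    """The function h: t^2 (2-t)^3 exp(t - 1/2) on [0, 2], else 0."""
    if 0 <= t <= 2:
        return t ** 2 * (2 - t) ** 3 * math.exp(t - 0.5)
    return 0.0

def eta_plus_with_exact_h(t):
    """h(t) * t * exp(-t^2/2): what eta_+ would be if h_R were replaced by h."""
    return h(t) * t * math.exp(-t * t / 2)

grid = [k / 50 for k in range(0, 101)]      # 0, 0.02, ..., 2
symmetric = all(math.isclose(eta(t), eta(2 - t), rel_tol=1e-9, abs_tol=1e-12)
                for t in grid)
factorizes = all(math.isclose(eta(t), eta_plus_with_exact_h(t),
                              rel_tol=1e-9, abs_tol=1e-12)
                 for t in grid)
```

The maximum of $\eta$ is at $t = 1$, where it equals 1, and the triple zeros at $t = 0$ and $t = 2$ are what "vanishes to high order" refers to.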

We just let $h_R(t)$ be the inverse Mellin transform of the truncation of $Mh$ to an interval $[-iR, iR]$. (Explicitly,
\[h_R(t) = \int_0^\infty h(t y^{-1}) F_R(y) \frac{dy}{y},\]
where $F_R(y) = \sin(R \log y)/(\pi \log y)$, that is, $F_R$ is the Dirichlet kernel with a change of variables.)

Since the Mellin transform of $t e^{-t^2/2}$ is regular at $s = 0$, the Mellin transform $M\eta_+$ will be holomorphic in a neighborhood of $\{s : 0 \le \Re(s) \le 1\}$, even though the truncation of $Mh$ to $[-iR, iR]$ is brutal. Set $R = 200$, say. By the fast decay of $Mh(it)$ and the fact that the Mellin transform $M$ is an isometry, $|(h_R(t) - h(t))/t|_2$ is very small, and hence so is $|\eta_+ - \eta|_2$, as we desired.

But what about the requirement that we be able to estimate $S_{\eta_*}(\alpha, x)$ for both $\alpha \in \mathfrak{m}$ and $\alpha \in \mathfrak{M}$?

Generally speaking, if we know how to estimate $S_{\eta_1}(\alpha, x)$ for some $\alpha \in \mathbb{R}/\mathbb{Z}$ and we also know how to estimate $S_{\eta_2}(\alpha, x)$ for all other $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_1$ and $\eta_2$ are two smoothing functions, then we know how to estimate $S_{\eta_3}(\alpha, x)$ for all $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_3 = \eta_1 *_M \eta_2$, or, more generally, $\eta_*(t) = (\eta_1 *_M \eta_2)(\kappa t)$, $\kappa > 0$ a constant. This is an easy exercise on exchanging the order of integration and summation:
\[\begin{aligned} S_{\eta_*}(\alpha, x) &= \sum_n \Lambda(n) e(\alpha n) (\eta_1 *_M \eta_2)\left(\kappa \frac{n}{x}\right)\\ &= \int_0^\infty \sum_n \Lambda(n) e(\alpha n) \eta_1(\kappa r) \eta_2\left(\frac{n}{rx}\right) \frac{dr}{r} = \int_0^\infty \eta_1(\kappa r) S_{\eta_2}(\alpha, rx) \frac{dr}{r}, \end{aligned} \tag{1.40}\]
and similarly with $\eta_1$ and $\eta_2$ switched. Of course, this trick is valid for all exponential sums: any function $f(n)$ would do in place of $\Lambda(n)$. The only caveat is that $\eta_1$ (and $\eta_2$) should be small very near 0, since, for $r$ small, we may not be able to estimate $S_{\eta_2}(\alpha, rx)$ (or $S_{\eta_1}(\alpha, rx)$) with any precision. This is not a problem; one of our functions will be $t^2 e^{-t^2/2}$, which vanishes to second order at 0, and the other one will be $\eta_2 = 4 \cdot 1_{[1/2,1]} *_M 1_{[1/2,1]}$, which has support bounded away from 0. We will set $\kappa$ large (say $\kappa = 49$) so that the support of $\eta_*$ is indeed concentrated on a small interval $[0, \epsilon)$, as we wanted.

Now that we have chosen our smoothing weights $\eta_+$ and $\eta_*$, we have to estimate the major-arc integral (1.35) and the minor-arc integral (1.36). What follows can actually be done for general $\eta_+$ and $\eta_*$; we could have left our particular choice of $\eta_+$ and $\eta_*$ for the end.

Estimating the major-arc integral (1.35) may sound like an easy task, since we have rather precise estimates for $S_\eta(\alpha, x)$ ($\eta = \eta_+, \eta_*$) when $\alpha$ is on the major arcs; we could just replace $S_\eta(\alpha, x)$ in (1.35) by the approximation given by (1.7) and (1.11). It is, however, more efficient to express (1.35) as the sum of the contribution of the trivial character (a sum of integrals of $(\widehat{\eta}(-\delta) x)^3$, where $\widehat{\eta}(-\delta) x$ comes from (1.11)), plus a term of the form
\[(\text{maximum of } \sqrt{q} \cdot E(q) \text{ for } q \le r) \cdot \int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha,\]
where $E(q) = E$ is as in (1.12), plus two other terms of essentially the same form. As usual, the major arcs $\mathfrak{M}$ are the arcs around rationals $a/q$ with $q \le r$. We will soon discuss how to bound the integral of $\left|S_{\eta_+}(\alpha, x)\right|^2$ over arcs around rationals $a/q$ with $q \le s$, $s$ arbitrary. Here, however, it is best to estimate the integral over $\mathfrak{M}$ using the estimate on $S_{\eta_+}(\alpha, x)$ from (1.7) and (1.11); we obtain a great deal of cancellation, with the effect that, for $\chi$ non-trivial, the error term in (1.12) appears only when it gets squared, and thus becomes negligible.

The contribution of the trivial character has an easy approximation, thanks to the fast decay of $\eta$. We obtain that the major-arc integral (1.35) equals a main term $C_0 C_{\eta,\eta_*} x^2$, where
\[\begin{aligned} C_0 &= \prod_{p | n} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p \nmid n} \left(1 + \frac{1}{(p-1)^3}\right),\\ C_{\eta,\eta_*} &= \int_0^\infty \int_0^\infty \eta(t_1) \eta(t_2) \eta_*\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1 dt_2, \end{aligned}\]
plus several small error terms. We have already chosen $\eta$, $\eta_*$ and $x$ so as to (nearly) maximize $C_{\eta,\eta_*}$.
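The constant $C_0$ is an Euler product that converges quickly, so a truncated product already gives a good approximation. The sketch below truncates at primes up to `prime_bound`, a hypothetical cutoff chosen for this example; prime factors of $n$ above the cutoff are ignored, which is harmless only for small test values of $n$.

```python
def primes_upto(N):
    """All primes p <= N, by the sieve of Eratosthenes."""
    is_prime = [True] * (N + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(N ** 0.5) + 1):
        if is_prime[p]:
            for k in range(p * p, N + 1, p):
                is_prime[k] = False
    return [p for p in range(2, N + 1) if is_prime[p]]

def c0(n, prime_bound=10 ** 4):
    """Truncation of the product defining C_0 to primes p <= prime_bound."""
    product = 1.0
    for p in primes_upto(prime_bound):
        if n % p == 0:
            product *= 1 - 1 / (p - 1) ** 2     # factor for p | n
        else:
            product *= 1 + 1 / (p - 1) ** 3     # factor for p not dividing n
    return product

c0_even = c0(10)     # the factor at p = 2 vanishes when n is even
c0_odd = c0(15)
c0_prime = c0(101)
```

For even $n$ the factor at $p = 2$ is $1 - 1/(2-1)^2 = 0$, as it should be for a count of representations as a sum of three primes with the weights used here; for odd $n$ the factor at $p = 2$ is 2 and the product stays bounded away from 0.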

        It is time to bound the minor-arc integral (136) As we said in sect15 we must dobetter than the usual bound (137) Since our minor-arc bound (32) on |Sη(α x)|α sim aq decreases as q increases it makes sense to use partial summation togetherwith bounds onint

        ms

        |Sη+(α x)|2 =

        intMs

        |Sη+(α x)|2dαminusintM

        |Sη+(α x)|2dα

        where ms denotes the arcs around aq r lt q le s and Ms denotes the arcs around allaq q le s We already know how to estimate the integral on M How do we boundthe integral on Ms

In order to do better than the trivial bound $\int_{\mathfrak{M}_s} \le \int_{\mathbb{R}/\mathbb{Z}}$, we will need to use the fact that the series (1.6) defining $S_{\eta_+}(\alpha, x)$ is essentially supported on prime numbers. Bounding the integral on $\mathfrak{M}_s$ is closely related to the problem of bounding

\[ \sum_{q \le s} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left| \sum_{n \le x} a_n\, e(an/q) \right|^2 \tag{1.41} \]

efficiently for $s$ considerably smaller than $\sqrt{x}$ and $a_n$ supported on the primes $\sqrt{x} < p \le x$. This is a classical problem in the study of the large sieve. The usual bound on (1.41) (by, for instance, Montgomery's inequality) has a gain of a factor of

\[ \frac{2 e^{\gamma} (\log s)}{\log (x/s^2)} \]


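relative to the bound of $(x + s^2) \sum_n |a_n|^2$ that one would get from the large sieve without using prime support. Heath-Brown proceeded similarly to bound

\[ \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2 e^{\gamma} \log s}{\log (x/s^2)} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha. \tag{1.42} \]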

This already gives us the gain of $C(\log s)/\log x$ that we absolutely need, but the constant $C$ is suboptimal; the factor in the right side of (1.42) should really be $(\log s)/\log x$, i.e., $C$ should be 1. We cannot reasonably hope to obtain a factor better than $2(\log s)/\log x$ in the minor arcs due to what is known as the parity problem in sieve theory. As it turns out, Ramaré [Ram09] had given general bounds on the large sieve that were clearly conducive to better bounds on (1.41), though they involved a ratio that was not easy to bound in general.

I used several careful estimations (including [Ram95, Lem. 3.4]) to reduce the problem of bounding this ratio to a finite number of cases, which I then checked by a rigorous computation. This approach gave a bound on (1.41) with a factor of size close to $2(\log s)/\log x$. (This solves the large-sieve problem for $s \le x^{0.3}$; it would still be worthwhile to give a computation-free proof for all $s \le x^{1/2-\epsilon}$, $\epsilon > 0$.) It was then easy to give an analogous bound for the integral over $\mathfrak{M}_s$, namely,

\[ \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2 \log s}{\log x} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \]

where $\lesssim$ can easily be made precise by replacing $\log s$ by $\log s + 1.36$ and $\log x$ by $\log x + c$, where $c$ is a small constant. Without this improvement, the main theorem would still have been proved, but the required computation time would have been multiplied by a factor of considerably more than $e^{3\gamma} = 5.6499\ldots$.

What remained, then, was just to compare the estimates on (1.35) and (1.36) and check that (1.36) is smaller for $n \ge 10^{27}$. This final step was just bookkeeping. As we already discussed, a check for $n < 10^{27}$ is easy. Thus ends the proof of the main theorem.

1.6 Some remarks on computations

There were two main computational tasks: verifying the ternary conjecture for all $n \le C$, and checking the Generalized Riemann Hypothesis for modulus $q \le r$ up to a certain height.

The first task was not very demanding. Platt and I verified in [HP13] that every odd integer $5 < n \le 8.8 \cdot 10^{30}$ can be written as the sum of three primes. (In the end, only a check for $5 < n \le 10^{27}$ was needed.) We proceeded as follows. In a major computational effort, Oliveira e Silva, Herzog and Pardi [OeSHP14] had already checked that the binary Goldbach conjecture is true up to $4 \cdot 10^{18}$, that is, every even number up to $4 \cdot 10^{18}$ is the sum of two primes. Given that, all we had to do was to construct a "prime ladder", that is, a list of primes from 3 up to $8.8 \cdot 10^{30}$ such that the difference between any two consecutive primes in the list is at least 4 and at most $4 \cdot 10^{18}$. (This is a known strategy: see [Sao98].) Then, for any odd integer


$5 < n \le 8.8 \cdot 10^{30}$, there is a prime $p$ in the list such that $4 \le n - p \le 4 \cdot 10^{18} + 2$. (Choose the largest $p < n$ in the ladder, or, if $n$ minus that prime is 2, choose the prime immediately under that.) By [OeSHP14] (and the fact that $4 \cdot 10^{18} + 2$ equals $p + q$, where $p = 2000000000000001301$ and $q = 1999999999999998701$ are both prime), we can write $n - p = p_1 + p_2$ for some primes $p_1$, $p_2$, and so $n = p + p_1 + p_2$.
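The ladder argument can be tried out at toy scale. The sketch below is not the computation of [HP13]: it replaces $8.8 \cdot 10^{30}$ by a small bound, replaces the gap $4 \cdot 10^{18}$ by a parameter `max_gap`, and uses brute force in place of the verified binary Goldbach range. All function names are ours.

```python
from bisect import bisect_left, bisect_right
from math import isqrt


def primes_up_to(limit):
    """Return all primes <= limit (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def build_ladder(primes, max_gap):
    """Primes starting at 3 with consecutive differences in [4, max_gap]."""
    ladder = [3]
    while primes[-1] - ladder[-1] >= 4:
        lo, hi = ladder[-1] + 4, ladder[-1] + max_gap
        candidate = primes[bisect_right(primes, hi) - 1]  # largest prime <= hi
        if candidate < lo:
            raise ValueError("no prime in the required window")
        ladder.append(candidate)
    return ladder


def three_primes(n, ladder, prime_set, max_gap):
    """Write odd n > 5 as p + p1 + p2 with p taken from the ladder."""
    i = bisect_left(ladder, n) - 1  # index of the largest ladder prime < n
    if n - ladder[i] == 2:          # difference 2: step one rung down
        i -= 1
    p = ladder[i]
    m = n - p                       # even, and should satisfy 4 <= m <= max_gap + 2
    if not 4 <= m <= max_gap + 2:
        raise ValueError("n lies outside the range covered by the ladder")
    for a in range(2, m // 2 + 1):  # brute-force binary Goldbach for m
        if a in prime_set and (m - a) in prime_set:
            return p, a, m - a
    raise ValueError("binary Goldbach failed for %d" % m)
```

A possible usage: with `primes = primes_up_to(10_000)` and `ladder = build_ladder(primes, 100)`, calling `three_primes(n, ladder, set(primes), 100)` for each odd `n` between 7 and 10000 returns a triple of primes summing to `n`.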

Building a prime ladder involves only integer arithmetic, that is, computer manipulation of integers, rather than of real numbers. Integers are something that computers can handle rapidly and reliably. We look for primes for our ladder only among a special set of integers whose primality can be tested deterministically quite quickly (Proth numbers: $k \cdot 2^m + 1$, $k < 2^m$). Thus, we can build a prime ladder by a rigorous, deterministic algorithm that can be (and was) parallelized trivially.
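The text does not spell out which deterministic test was used for Proth numbers. One standard possibility, sketched here as an assumption, rests on Proth's theorem: for $N = k \cdot 2^m + 1$ with $k$ odd and $k < 2^m$, if $a^{(N-1)/2} \equiv -1 \pmod N$ for some $a$, then $N$ is prime. Choosing $a$ with Jacobi symbol $-1$ makes the converse hold as well, by Euler's criterion.

```python
from math import isqrt


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                      # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def proth_is_prime(k, m):
    """Decide primality of N = k*2^m + 1 (k odd, 0 < k < 2^m)."""
    if k % 2 == 0 or not 0 < k < 2 ** m:
        raise ValueError("not a Proth number in the required form")
    n = k * 2 ** m + 1
    if isqrt(n) ** 2 == n:               # squares have no a with (a/n) = -1
        return False
    a = 2
    while True:
        j = jacobi(a, n)
        if j == 0:                       # a shares a factor with n
            return False
        if j == -1:                      # Proth's theorem plus Euler's criterion
            return pow(a, (n - 1) // 2, n) == n - 1
        a += 1
```

The only arithmetic involved is modular exponentiation on integers, which is the point made above: no real-number computation enters.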

The second computation is more demanding. It consists in verifying that, for every L-function $L(s, \chi)$ with $\chi$ of conductor $q \le r = 300000$ (for $q$ even) or $q \le r/2$ (for $q$ odd), all zeroes of $L(s, \chi)$ such that $|\Im(s)| \le H_q = 10^8/q$ (for $q$ odd) and $|\Im(s)| \le H_q = \max(10^8/q,\, 200 + 7.5 \cdot 10^7/q)$ (for $q$ even) lie on the critical line. As a matter of fact, Platt went up to conductor $q \le 200000$ (or twice that for $q$ even) [Plab]; he had already gone up to conductor 100000 in his PhD thesis [Pla11]. The verification took, in total, about 400000 core-hours (i.e., the total number of processor cores used times the number of hours they ran equals 400000; nowadays, a top-of-the-line processor typically has eight cores). In the end, since I used only $q \le 150000$ (or twice that for $q$ even), the number of hours actually needed was closer to 160000; since I could have made do with $q \le 120000$ (at the cost of increasing $C$ to $10^{29}$ or $10^{30}$), it is likely, in retrospect, that only about 80000 core-hours were needed.

Checking zeros of L-functions computationally goes back to Riemann (who did it by hand for the special case of the Riemann zeta function). It is also one of the things that were tried on digital computers in their early days (by Turing [Tur53], for instance; see the exposition in [Boo06b]). One of the main issues to be careful about arises whenever one manipulates real numbers via a computer: generally speaking, a computer cannot store an irrational number; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, one cannot really say: "computer, give me the sine of that number" and expect a precise result. What one should do, if one really wants to prove something (as is the case here), is to say: "computer, I am giving you an interval $I = [a/2^k, b/2^k]$; give me an interval $I' = [c/2^{\ell}, d/2^{\ell}]$, preferably very short, such that $\sin(I) \subset I'$". This is called interval arithmetic; it is arguably the easiest way to do floating-point computations rigorously.

Processors do not do this natively, and if interval arithmetic is implemented purely on software, computations can be slowed down by a factor of about 100. Fortunately, there are ways of running interval-arithmetic computations partly on hardware, partly on software.

Incidentally, there are some basic functions (such as sin) that should always be done on software, not just if one wants to use interval arithmetic, but even if one just wants reasonably precise results: the implementation of transcendental functions in some of the most popular processors does not always round correctly, and errors can accumulate quickly. Fortunately, this problem is already well-known, and there is software that takes care of this. (Platt and I used the crlibm library [DLDDD+10].)


Lastly, there were several relatively minor computations strewn here and there in the proof. There is some numerical integration, done rigorously; once or twice, this was done using a standard package based on interval arithmetic [Ned06], but most of the time I wrote my own routines in C (using Platt's interval arithmetic package) for the sake of speed. Another kind of computation (employed much more in [Hela] than in the somewhat more polished version of the proof given here) was a rigorous version of a "proof by graph" ("the maximum of a function $f$ is clearly less than 4 because I can see it on the screen"). There is a standard way to do this (see, e.g., [Tuc11, §5.2]); essentially, the bisection method combines naturally with interval arithmetic, as we shall describe in §2.6. Yet another computation (and not a very small one) was that involved in verifying a large-sieve inequality in an intermediate range (as we discussed in §1.5).

It may be interesting to note that one of the inequalities used to estimate (1.30) was proven with the help of automatic quantifier elimination [HB11]. Proving this inequality was a very minor task, both computationally and mathematically; in all likelihood, it is feasible to give a human-generated proof. Still, it is nice to know from first-hand experience that computers can nowadays (pretend to) do something other than just perform numerical computations, and that this is already applicable in current mathematical practice.

        Chapter 2

        Notation and preliminaries

2.1 General notation

Given positive integers $m$, $n$, we say $m \mid n^\infty$ if every prime dividing $m$ also divides $n$. We say a positive integer $n$ is square-full if, for every prime $p$ dividing $n$, the square $p^2$ also divides $n$. (In particular, 1 is square-full.) We say $n$ is square-free if $p^2 \nmid n$ for every prime $p$. For $p$ prime, $n$ a non-zero integer, we define $v_p(n)$ to be the largest non-negative integer $\alpha$ such that $p^\alpha \mid n$.

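When we write $\sum_n$, we mean $\sum_{n=1}^\infty$, unless the contrary is stated. As always, $\Lambda(n)$ denotes the von Mangoldt function:

\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^\alpha \text{ for some prime } p \text{ and some integer } \alpha \ge 1, \\ 0 & \text{otherwise,} \end{cases} \]

and $\mu$ denotes the Möbius function:

\[ \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct,} \\ 0 & \text{if } p^2 \mid n \text{ for some prime } p. \end{cases} \]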

We let $\tau(n)$ be the number of divisors of an integer $n$, $\omega(n)$ the number of prime divisors of $n$, and $\sigma(n)$ the sum of the divisors of $n$.

We write $(a, b)$ for the greatest common divisor of $a$ and $b$. If there is any risk of confusion with the pair $(a, b)$, we write $\gcd(a, b)$. Denote by $(a, b^\infty)$ the divisor $\prod_{p \mid b} p^{v_p(a)}$ of $a$. (Thus, $a/(a, b^\infty)$ is coprime to $b$, and is in fact the maximal divisor of $a$ with this property.)
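The arithmetic functions just defined are easy to write down directly. The sketch below uses trial division, which is adequate only for small arguments; the function names (`factorize`, `gcd_inf`, and so on) are ours, chosen for illustration.

```python
from math import gcd, log


def factorize(n):
    """Return {p: v_p(n)} for a positive integer n, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    factors = factorize(n)
    return log(next(iter(factors))) if len(factors) == 1 else 0.0


def mobius(n):
    """mu(n) = (-1)^k for n a product of k distinct primes, 0 if p^2 | n."""
    factors = factorize(n)
    if any(e > 1 for e in factors.values()):
        return 0
    return (-1) ** len(factors)


def gcd_inf(a, b):
    """(a, b^infinity): the product of p^{v_p(a)} over primes p dividing b."""
    a_factors = factorize(a)
    result = 1
    for p in factorize(b):
        result *= p ** a_factors.get(p, 0)
    return result
```

For instance, `gcd_inf(360, 6)` is $2^3 \cdot 3^2 = 72$, and the quotient $360/72 = 5$ is coprime to 6, as the parenthetical remark above asserts.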

As is customary, we write $e(x)$ for $e^{2\pi i x}$. We denote the $L_r$ norm of a function $f$ by $|f|_r$. We write $O^*(R)$ to mean a quantity at most $R$ in absolute value. Given a set $S$, we write $1_S$ for its characteristic function:

\[ 1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{otherwise.} \end{cases} \]

Write $\log^+ x$ for $\max(\log x, 0)$.


2.2 Dirichlet characters and L functions

Let us go over some basic terms. A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$ with the convention that $\chi(n) = 0$ when $(n, q) \ne 1$. (In other words: $\chi$ is completely multiplicative and periodic modulo $q$, and vanishes on integers not coprime to $q$.) Again by convention, there is a Dirichlet character of modulus $q = 1$, namely, the trivial character $\chi_T : \mathbb{Z} \to \mathbb{C}$ defined by $\chi_T(n) = 1$ for every $n \in \mathbb{Z}$.

If $\chi$ is a character modulo $q$ and $\chi'$ is a character modulo $q' \mid q$ such that $\chi(n) = \chi'(n)$ for all $n$ coprime to $q$, we say that $\chi'$ induces $\chi$. A character is primitive if it is not induced by any character of smaller modulus. Given a character $\chi$, we write $\chi^*$ for the (uniquely defined) primitive character inducing $\chi$. If a character $\chi \bmod q$ is induced by the trivial character $\chi_T$, we say that $\chi$ is principal and write $\chi_0$ for $\chi$ (provided the modulus $q$ is clear from the context). In other words, $\chi_0(n) = 1$ when $(n, q) = 1$ and $\chi_0(n) = 0$ when $(n, q) \ne 1$.

A Dirichlet L-function $L(s, \chi)$ ($\chi$ a Dirichlet character) is defined as the analytic continuation of $\sum_n \chi(n) n^{-s}$ to the entire complex plane; there is a pole at $s = 1$ if $\chi$ is principal.

A non-trivial zero of $L(s, \chi)$ is any $s \in \mathbb{C}$ such that $L(s, \chi) = 0$ and $0 < \Re(s) < 1$. (In particular, a zero at $s = 0$ is called "trivial", even though its contribution can be a little tricky to work out. The same would go for the other zeros with $\Re(s) = 0$ occurring for $\chi$ non-primitive, though we will avoid this issue by working mainly with $\chi$ primitive.) The zeros that occur at (some) negative integers are called trivial zeros.

The critical line is the line $\Re(s) = 1/2$ in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet L-functions reads: for every Dirichlet character $\chi$, all non-trivial zeros of $L(s, \chi)$ lie on the critical line. Verifiable finite versions of the generalized Riemann hypothesis generally read: for every Dirichlet character $\chi$ of modulus $q \le Q$, all non-trivial zeros of $L(s, \chi)$ with $|\Im(s)| \le f(q)$ lie on the critical line (where $f : \mathbb{Z} \to \mathbb{R}^+$ is some given function).

2.3 Fourier transforms and exponential sums

The Fourier transform on $\mathbb{R}$ is normalized here as follows:

\[ \widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\, dx. \]

The trivial bound is $|\widehat{f}|_\infty \le |f|_1$. If $f$ is compactly supported (or of fast enough decay as $t \mapsto \pm\infty$) and piecewise continuous, $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$ by integration by parts. Iterating, we obtain that, if $f$ is of fast decay and differentiable $k$ times outside finitely many points, then

\[ \widehat{f}(t) = O^*\!\left(\frac{|\widehat{f^{(k)}}|_\infty}{(2\pi t)^k}\right) = O^*\!\left(\frac{|f^{(k)}|_1}{(2\pi t)^k}\right). \tag{2.1} \]


Thus, for instance, if $f$ is compactly supported, continuous and piecewise $C^1$, then $\widehat{f}$ decays at least quadratically.

It could happen that $|f^{(k)}|_1 = \infty$, in which case (2.1) is trivial (but not false). In practice, we require $f^{(k)} \in L_1$. In a typical situation, $f$ is differentiable $k$ times except at $x_1, x_2, \ldots, x_k$, where it is differentiable only $(k-2)$ times; the contribution of $x_i$ (say) to $|f^{(k)}|_1$ is then $|\lim_{x \to x_i^+} f^{(k-1)}(x) - \lim_{x \to x_i^-} f^{(k-1)}(x)|$.

The following bound is standard (see, e.g., [Tao14, Lemma 3.1]): for $\alpha \in \mathbb{R}/\mathbb{Z}$ and $f : \mathbb{R} \to \mathbb{C}$ compactly supported and piecewise continuous,

\[ \left| \sum_{n \in \mathbb{Z}} f(n) e(\alpha n) \right| \le \min\left( |f|_1 + \frac{1}{2} |f'|_1,\ \frac{\frac{1}{2} |f'|_1}{|\sin(\pi\alpha)|} \right). \tag{2.2} \]

(The first bound follows from $\sum_{n \in \mathbb{Z}} |f(n)| \le |f|_1 + (1/2)|f'|_1$, which, in turn, is a quick consequence of the fundamental theorem of calculus; the second bound is proven by summation by parts.) The alternative bound $(1/4)|f''|_1/|\sin(\pi\alpha)|^2$ given in [Tao14, Lemma 3.1] (for $f$ continuous and piecewise $C^1$) can usually be improved by the following estimate.

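Lemma 2.3.1. Let $f : \mathbb{R} \to \mathbb{C}$ be compactly supported, continuous and piecewise $C^1$. Then

\[ \left| \sum_{n \in \mathbb{Z}} f(n) e(\alpha n) \right| \le \frac{\frac{1}{4} |\widehat{f''}|_\infty}{(\sin \pi\alpha)^2} \tag{2.3} \]

for every $\alpha \in \mathbb{R}$.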

As usual, the assumption of compact support could easily be relaxed to an assumption of fast decay.

Proof. By the Poisson summation formula,

\[ \sum_{n=-\infty}^{\infty} f(n) e(\alpha n) = \sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha). \]

Since $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$,

\[ \sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha) = \sum_{n=-\infty}^{\infty} \frac{\widehat{f'}(n - \alpha)}{2\pi i (n - \alpha)} = \sum_{n=-\infty}^{\infty} \frac{\widehat{f''}(n - \alpha)}{(2\pi i (n - \alpha))^2}. \]

By Euler's formula $\pi \cot s\pi = 1/s + \sum_{n=1}^{\infty} (1/(n+s) - 1/(n-s))$,

\[ \sum_{n=-\infty}^{\infty} \frac{1}{(n+s)^2} = -(\pi \cot s\pi)' = \frac{\pi^2}{(\sin s\pi)^2}. \tag{2.4} \]

Hence

\[ \left| \sum_{n=-\infty}^{\infty} \widehat{f}(n - \alpha) \right| \le |\widehat{f''}|_\infty \sum_{n=-\infty}^{\infty} \frac{1}{(2\pi (n - \alpha))^2} = |\widehat{f''}|_\infty \cdot \frac{1}{(2\pi)^2} \cdot \frac{\pi^2}{(\sin \alpha\pi)^2}. \]
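As a sanity check on (2.2) and (2.3), one can take the triangle function $f(x) = \max(1 - |x|/N, 0)$, which is our choice of example and not one of the smoothing functions used in the text. Here $|f|_1 = N$, $|f'|_1 = 2$, and $f''$, read as a distribution, is $(\delta_{-N} - 2\delta_0 + \delta_N)/N$, so that $|\widehat{f''}|_\infty = 4/N$. The exponential sum is the Fejér kernel $\sin^2(\pi N\alpha)/(N \sin^2 \pi\alpha)$, and (2.3) gives the bound $1/(N \sin^2 \pi\alpha)$.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def triangle_sum(alpha, N):
    """sum_n f(n) e(alpha n) for f(x) = max(1 - |x|/N, 0)."""
    return sum((1 - abs(n) / N) * e(alpha * n) for n in range(-N, N + 1))


def bound_2_2(alpha, N):
    """Right side of (2.2) for the triangle: |f|_1 = N, |f'|_1 = 2."""
    s = abs(math.sin(math.pi * alpha))
    return min(N + 1.0, 1.0 / s if s else math.inf)


def bound_2_3(alpha, N):
    """Right side of (2.3) for the triangle: sup of |hat(f'')| is 4/N."""
    s = math.sin(math.pi * alpha) ** 2
    return 0.25 * (4.0 / N) / s if s else math.inf


def fejer(alpha, N):
    """Closed form of the sum, valid when alpha is not an integer."""
    return math.sin(math.pi * N * alpha) ** 2 / (N * math.sin(math.pi * alpha) ** 2)
```

In this example (2.3) is sharp whenever $\sin^2(\pi N \alpha) = 1$, while the second bound in (2.2) is weaker by a factor of $N |\sin \pi\alpha|$.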


The trivial bound $|\widehat{f''}|_\infty \le |f''|_1$, applied to (2.3), recovers the bound in [Tao14, Lemma 3.1]. In order to do better, we will give a tighter bound for $|\widehat{f''}|_\infty$ in Appendix B when $f$ is equal to one of our main smoothing functions ($f = \eta_2$).

Integrals of multiples of $f''$ (in particular, $|f''|_1$ and $\widehat{f''}$) can still be made sense of when $f''$ is undefined at a finite number of points, provided $f$ is understood as a distribution (and $f'$ has finite total variation). This is the case, in particular, for $f = \eta_2$.

When we need to estimate $\sum_n f(n)$ precisely, we will use the Poisson summation formula:

\[ \sum_n f(n) = \sum_n \widehat{f}(n). \]

We will not have to worry about convergence here, since we will apply the Poisson summation formula only to compactly supported functions $f$ whose Fourier transforms decay at least quadratically.

2.4 Mellin transforms

The Mellin transform of a function $\phi : (0, \infty) \to \mathbb{C}$ is

\[ M\phi(s) = \int_0^\infty \phi(x) x^{s-1}\, dx. \tag{2.5} \]

If $\phi(x) x^{\sigma-1}$ is in $\ell_1$ with respect to $dx$ (i.e., $\int_0^\infty |\phi(x)| x^{\sigma-1}\, dx < \infty$), then the Mellin transform is defined on the line $\sigma + i\mathbb{R}$. Moreover, if $\phi(x) x^{\sigma-1}$ is in $\ell_1$ for $\sigma = \sigma_1$ and for $\sigma = \sigma_2$, where $\sigma_2 > \sigma_1$, then it is easy to see that it is also in $\ell_1$ for all $\sigma \in (\sigma_1, \sigma_2)$, and that, moreover, the Mellin transform is holomorphic on $\{s : \sigma_1 < \Re(s) < \sigma_2\}$. We then say that $\{s : \sigma_1 < \Re(s) < \sigma_2\}$ is a strip of holomorphy for the Mellin transform.

The Mellin transform becomes a Fourier transform (of $\eta(e^{-2\pi v}) e^{-2\pi v \sigma}$) by means of the change of variables $x = e^{-2\pi v}$. We thus obtain, for example, that the Mellin transform is an isometry, in the sense that

\[ \int_0^\infty |f(x)|^2 x^{2\sigma}\, \frac{dx}{x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mf(\sigma + it)|^2\, dt. \tag{2.6} \]

Recall that, in the case of the Fourier transform, for $|\widehat{f}|_2 = |f|_2$ to hold, it is enough that $f$ be in $\ell_1 \cap \ell_2$. This gives us that, for (2.6) to hold, it is enough that $f(x) x^{\sigma-1}$ be in $\ell_1$ and $f(x) x^{\sigma-1/2}$ be in $\ell_2$ (again, with respect to $dx$ in both cases).

We write $f *_M g$ for the multiplicative, or Mellin, convolution of $f$ and $g$:

\[ (f *_M g)(x) = \int_0^\infty f(w)\, g\!\left(\frac{x}{w}\right) \frac{dw}{w}. \tag{2.7} \]

In general,

\[ M(f *_M g) = Mf \cdot Mg \tag{2.8} \]

and

\[ M(f \cdot g)(s) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} Mf(z)\, Mg(s - z)\, dz \quad \text{[GR94, §17.32]} \tag{2.9} \]

provided that $z$ and $s - z$ are within the strips on which $Mf$ and $Mg$ (respectively) are well-defined.

We also have several useful transformation rules, just as for the Fourier transform. For example,

\[ \begin{aligned} M(f'(t))(s) &= -(s-1) \cdot Mf(s-1), \\ M(t f'(t))(s) &= -s \cdot Mf(s), \\ M((\log t) f(t))(s) &= (Mf)'(s) \end{aligned} \tag{2.10} \]

(as in, e.g., [BBO10, Table 1.11]).

Let

\[ \eta_2 = (2 \cdot 1_{[1/2,1]}) *_M (2 \cdot 1_{[1/2,1]}). \]

Since (see, e.g., [BBO10, Table 11.3] or [GR94, §16.43])

\[ (M I_{[a,b]})(s) = \frac{b^s - a^s}{s}, \]

we see that

\[ M\eta_2(s) = \left(\frac{1 - 2^{-s}}{s}\right)^2, \qquad M\eta_4(s) = \left(\frac{1 - 2^{-s}}{s}\right)^4. \tag{2.11} \]
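The definition of $\eta_2$ as a Mellin convolution can be compared numerically with the closed form $4\max(\log 2 - |\log 2t|, 0)$ stated in (3.4). The sketch below evaluates the integral (2.7) by a midpoint rule in the variable $u = \log w$; the quadrature scheme and the step count are our choices, made only for illustration.

```python
import math


def two_indicator(x):
    """The function 2 * 1_{[1/2, 1]}."""
    return 2.0 if 0.5 <= x <= 1.0 else 0.0


def mellin_convolution(f, g, x, lo, hi, steps=20000):
    """Approximate (f *_M g)(x), integrating over w in [lo, hi].

    With u = log w, the measure dw/w becomes du; we use the midpoint rule.
    """
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    total = 0.0
    for j in range(steps):
        w = math.exp(a + (j + 0.5) * h)
        total += f(w) * g(x / w)
    return total * h


def eta2_closed_form(t):
    """eta_2(t) = 4 max(log 2 - |log 2t|, 0), as in (3.4)."""
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)
```

Since the first factor vanishes outside $[1/2, 1]$, it is enough to integrate over that interval, e.g. `mellin_convolution(two_indicator, two_indicator, 0.5, 0.5, 1.0)`, which should be close to $4 \log 2$.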

Let $f_z = e^{-zt}$, where $\Re(z) > 0$. Then

\[ (Mf_z)(s) = \int_0^\infty e^{-zt} t^{s-1}\, dt = \frac{1}{z^s} \int_0^\infty e^{-zt} (zt)^{s-1} z\, dt = \frac{1}{z^s} \int_0^{z\infty} e^{-u} u^{s-1}\, du = \frac{1}{z^s} \int_0^\infty e^{-t} t^{s-1}\, dt = \frac{\Gamma(s)}{z^s}, \]

where the next-to-last step holds by contour integration, and the last step holds by the definition of the Gamma function $\Gamma(s)$.

2.5 Bounds on sums of μ and Λ

We will need some simple explicit bounds on sums involving the von Mangoldt function $\Lambda$ and the Möbius function $\mu$. In non-explicit work, such sums are usually bounded using the prime number theorem, or rather using the properties of the zeta function $\zeta(s)$ underlying the prime number theorem. Here, however, we need robust, fully explicit bounds valid over just about any range.

For the most part, we will just be quoting the literature, supplemented with some computations when needed. The proofs in the literature are sometimes based on properties of $\zeta(s)$, and sometimes on more elementary facts.


First, let us see some bounds involving $\Lambda$. The following bound can be easily derived from [RS62, (3.23)], supplemented by a quick calculation of the contribution of powers of primes $p < 32$:

\[ \sum_{n \le x} \frac{\Lambda(n)}{n} \le \log x. \tag{2.12} \]

We can derive a bound in the other direction from [RS62, (3.21)] (for $x > 1000$, adding the contribution of all prime powers $\le 1000$) and a numerical verification for $x \le 1000$:

\[ \sum_{n \le x} \frac{\Lambda(n)}{n} \ge \log x - \log \frac{3}{\sqrt{2}}. \tag{2.13} \]

We also use the following older bounds:

1. By the second table in [RR96, p. 423], supplemented by a computation for $2 \cdot 10^6 \le V \le 4 \cdot 10^6$,

\[ \sum_{n \le y} \Lambda(n) \le 1.0004\, y \tag{2.14} \]

for $y \ge 2 \cdot 10^6$.

2. \[ \sum_{n \le y} \Lambda(n) < 1.03883\, y \tag{2.15} \]

for every $y > 0$ [RS62, Thm. 12].

For all $y > 663$,

\[ \sum_{n \le y} \Lambda(n)\, n < 1.03884\, \frac{y^2}{2}, \tag{2.16} \]

where we use (2.15) and partial summation for $y > 200000$, and a computation for $663 < y \le 200000$. Using instead the second table in [RR96, p. 423], together with computations for small $y < 10^7$ and partial summation, we get that

\[ \sum_{n \le y} \Lambda(n)\, n < 1.0008\, \frac{y^2}{2} \tag{2.17} \]

for $y > 1.6 \cdot 10^6$. Similarly,

\[ \sum_{n \le y} \frac{\Lambda(n)}{\sqrt{n}} < 2 \cdot 1.0004 \sqrt{y} \tag{2.18} \]

for all $y \ge 1$. It is also true that

\[ \sum_{y/2 < p \le y} (\log p)^2 \le \frac{1}{2}\, y (\log y) \tag{2.19} \]

for $y \ge 117$: this holds for $y \ge 2 \cdot 758699$ by [RS75, Cor. 2] (applied to $x = y$, $x = y/2$ and $x = 2y/3$) and for $117 \le y < 2 \cdot 758699$ by direct computation.

Now let us see some estimates on sums involving $\mu$. The situation here is less satisfactory than for sums involving $\Lambda$. The main reason is that the complex-analytic approach to estimating $\sum_{n \le N} \mu(n)$ would involve $1/\zeta(s)$ rather than $\zeta'(s)/\zeta(s)$, and thus strong explicit bounds on the residues of $1/\zeta(s)$ would be needed. Thus, explicit estimates on sums involving $\mu$ are harder to obtain than estimates on sums involving $\Lambda$. This is so even though analytic number theorists are generally used (from the habit of non-explicit work) to see the estimation of one kind of sum or the other as essentially the same task.

Fortunately, in the case of sums of the type $\sum_{n \le x} \mu(n)/n$ for $x$ arbitrary (a type of sum that will be rather important for us), all we need is a saving of $(\log n)$ or $(\log n)^2$ on the trivial bound. This is provided by the following.

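1. (Granville-Ramaré [GR96], Lemma 10.2)

\[ \left| \sum_{\substack{n \le x \\ \gcd(n,q)=1}} \frac{\mu(n)}{n} \right| \le 1 \tag{2.20} \]

for all $x$, $q \ge 1$.

2. (Ramaré [Ram13]; cf. El Marraki [EM95], [EM96])

\[ \left| \sum_{n \le x} \frac{\mu(n)}{n} \right| \le \frac{0.03}{\log x} \tag{2.21} \]

for $x \ge 11815$.

3. (Ramaré [Ramb])

\[ \sum_{\substack{n \le x \\ \gcd(n,q)=1}} \frac{\mu(n)}{n} = O^*\!\left( \frac{1}{\log (x/q)} \cdot \frac{4}{5} \frac{q}{\phi(q)} \right) \tag{2.22} \]

for all $x$ and all $q \le x$;

\[ \sum_{\substack{n \le x \\ \gcd(n,q)=1}} \frac{\mu(n)}{n} \log \frac{x}{n} = O^*\!\left( 1.00303\, \frac{q}{\phi(q)} \right) \tag{2.23} \]

for all $x$ and all $q$.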

Improvements on these bounds would lead to improvements on type I estimates, but not in what are the worst terms overall at this point.

A computation carried out by the author has proven the following inequality for all real $x \le 10^{12}$:

\[ \left| \sum_{n \le x} \frac{\mu(n)}{n} \right| \le \sqrt{\frac{2}{x}}. \tag{2.24} \]


The computation was conducted rigorously by means of interval arithmetic. For the sake of verification, we record that

\[ 5.42625 \cdot 10^{-8} \le \sum_{n \le 10^{12}} \frac{\mu(n)}{n} \le 5.42898 \cdot 10^{-8}. \]

Computations also show that the stronger bound

\[ \left| \sum_{n \le x} \frac{\mu(n)}{n} \right| \le \frac{1}{2\sqrt{x}} \]

holds for all $3 \le x \le 7727068587$, but not for $x = 7727068588 - \epsilon$.

Earlier, numerical work carried out by Olivier Ramaré [Ram14] had shown that (2.24) holds for all $x \le 10^{10}$.
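Some of the bounds in this section can be spot-checked at small scale. The sketch below is only an illustration and in no way replaces the rigorous computations described above: it uses exact rational arithmetic for the sums of $\mu(n)/n$ and ordinary floating point for the sums of $\Lambda(n)/n$. Because the partial sums are constant on $[N, N+1)$ while $\sqrt{2/x}$ decreases, (2.24) on that interval amounts to $m(N)^2 \le 2/(N+1)$. The function names are ours.

```python
from fractions import Fraction
from math import gcd, log


def mobius_and_mangoldt(limit):
    """Tables of mu(n) and Lambda(n) for 0 <= n <= limit, by sieving."""
    mu = [1] * (limit + 1)
    lam = [0.0] * (limit + 1)
    mu[0] = 0
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                composite[m] = 1
            mu[m] = -mu[m]           # one more prime factor
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0                # not square-free
        pk = p
        while pk <= limit:
            lam[pk] = log(p)         # Lambda(p^k) = log p
            pk *= p
    return mu, lam


def partial_sums(limit, q=1):
    """Rows (N, sum of mu(n)/n over n <= N coprime to q, sum of Lambda(n)/n)."""
    mu, lam = mobius_and_mangoldt(limit)
    m_sum, l_sum = Fraction(0), 0.0
    rows = []
    for n in range(1, limit + 1):
        if gcd(n, q) == 1:
            m_sum += Fraction(mu[n], n)
        l_sum += lam[n] / n
        rows.append((n, m_sum, l_sum))
    return rows
```

Note that (2.20) and (2.24) are both attained with equality at the very start of the range (at $x = 1$ and as $x \to 2^-$, respectively), which is why exact fractions are convenient here.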

2.6 Interval arithmetic and the bisection method

Interval arithmetic has as its basic data type intervals of the form $I = [a/2^{\ell}, b/2^{\ell}]$, where $a, b, \ell \in \mathbb{Z}$ and $a \le b$. Say we have a real number $x$, and we want to know $\sin(x)$. In general, we cannot represent $x$ in a computer, in part because it may have no finite description. The best we can do is to construct an interval of the form $I = [a/2^{\ell}, b/2^{\ell}]$ in which $x$ is contained.

What we ask of a routine in an interval-arithmetic package is to construct an interval $I' = [a'/2^{\ell'}, b'/2^{\ell'}]$ in which $\sin(I)$ is contained. (In practice, this is done partly in software, by means of polynomial approximations to sin with precise error terms, and partly in hardware, by means of an efficient usage of rounding conventions.) This gives us, in effect, a value for $\sin(x)$ (namely, $(a' + b')/2^{\ell'+1}$) and a bound on the error term (namely, $(b' - a')/2^{\ell'+1}$).
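To make the idea concrete, here is a minimal sketch of interval addition and multiplication on top of ordinary floating-point numbers (which are themselves rationals of the form $a/2^{\ell}$). It is not the package used in the proof: it obtains enclosures by widening each computed endpoint to the neighbouring float with `math.nextafter` (Python 3.9 or later), which is cruder than directed rounding in hardware, and it does not attempt transcendental functions such as sin.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi] with floating-point endpoints."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi
        if self.lo > self.hi:
            raise ValueError("empty interval")

    @staticmethod
    def _outward(lo, hi):
        # Round-to-nearest is off by at most half a unit in the last place,
        # so moving one float outward on each side encloses the exact result.
        return Interval(math.nextafter(lo, -math.inf),
                        math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(products), max(products))

    def midpoint(self):
        """The 'value' (lo + hi) / 2 attached to the interval."""
        return (self.lo + self.hi) / 2

    def radius(self):
        """The error bound (hi - lo) / 2 attached to the interval."""
        return (self.hi - self.lo) / 2

    def contains(self, exact):
        """Exact containment test against a Fraction."""
        return Fraction(self.lo) <= exact <= Fraction(self.hi)
```

The methods `midpoint` and `radius` correspond to the value $(a'+b')/2^{\ell'+1}$ and the error bound $(b'-a')/2^{\ell'+1}$ mentioned above.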

There are several implementations of interval arithmetic available. We will almost always use D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas. (At one point, we will use the PROFIL/BIAS interval arithmetic package [Knu99], since it underlies the VNODE-LP [Ned06] package, which we use to bound an integral.)

The bisection method is a particularly simple method for finding maxima and minima of functions, as well as roots. It combines rather nicely with interval arithmetic, which makes the method rigorous. We follow an implementation based on [Tuc11, §5.2]. Let us go over the basic ideas.

Let us use the bisection method to find the minima (say) of a function $f$ on a compact interval $I_0$. (If the interval is non-compact, we generally apply the bisection method to a compact sub-interval and use other tools, e.g., power-series expansions, in the complement.) The method proceeds by splitting an interval into two repeatedly, discarding the halves where the minimum cannot be found. More precisely, if we implement it by interval arithmetic, it proceeds as follows. First, in an optional initial step, we subdivide (if necessary) the interval $I_0$ into smaller intervals $I_k$ to which the algorithm will actually be applied. For each $k$, interval arithmetic gives us a lower bound $r_k^-$ and an upper bound $r_k^+$ on $f(x)$, $x \in I_k$; here $r_k^-$ and $r_k^+$ are both of the form $a/2^{\ell}$, $a, \ell \in \mathbb{Z}$. Let $m_0$ be the minimum of $r_k^+$ over all $k$. We can discard all the intervals $I_k$ for which $r_k^- > m_0$. Then we apply the main procedure: starting with $i = 1$, split each surviving interval into two equal halves, recompute the lower and upper bound on each half, define $m_i$, as before, to be the minimum of all upper bounds, and discard, again, the intervals on which the lower bound is larger than $m_i$; increase $i$ by 1. We repeat the main procedure as often as needed. In the end, we obtain that the minimum is no smaller than the minimum of the lower bounds (call them $(r^{(i)})_k^-$) on all surviving intervals $I_k^{(i)}$. Of course, we also obtain that the minimum (or minima, if there is more than one) must lie in one of the surviving intervals.
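The procedure just described can be sketched as follows. To keep the example rigorous without a floating-point interval package, we use exact rational arithmetic (`Fraction`) and a hand-written enclosure for one sample polynomial, $f(x) = x^4 - 3x^2 + x$ on $[-2, 2]$; both the sample function and the parameters `rounds` and `initial_pieces` are our choices for illustration.

```python
from fractions import Fraction


def imul(a, b):
    """Product of the intervals a = (a0, a1) and b = (b0, b1)."""
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(p), max(p)


def sample_range(interval):
    """Lower and upper bounds (r-, r+) on x^4 - 3x^2 + x over the interval."""
    x2 = imul(interval, interval)   # an enclosure of x^2 (not the tightest)
    x4 = imul(x2, x2)               # an enclosure of x^4
    lower = x4[0] - 3 * x2[1] + interval[0]
    upper = x4[1] - 3 * x2[0] + interval[1]
    return lower, upper


def bisect_minimum(f_range, lo, hi, rounds=16, initial_pieces=8):
    """Return (lower bound on min f, upper bound on min f, surviving intervals)."""
    lo, hi = Fraction(lo), Fraction(hi)
    step = (hi - lo) / initial_pieces
    intervals = [(lo + k * step, lo + (k + 1) * step)
                 for k in range(initial_pieces)]
    for i in range(rounds + 1):
        if i > 0:                   # main procedure: split into equal halves
            intervals = [half for a, b in intervals
                         for half in ((a, (a + b) / 2), ((a + b) / 2, b))]
        bounds = [f_range(piece) for piece in intervals]
        m_i = min(upper for _, upper in bounds)
        kept = [(piece, r) for piece, r in zip(intervals, bounds)
                if r[0] <= m_i]     # discard pieces whose lower bound exceeds m_i
        intervals = [piece for piece, _ in kept]
        lower = min(r[0] for _, r in kept)
    return lower, m_i, intervals
```

A call such as `bisect_minimum(sample_range, -2, 2)` brackets the minimum of the sample polynomial between the two returned bounds and confines its location to the surviving intervals, which cluster near $x \approx -1.3$.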

It is easy to see how the same method can be applied (with a trivial modification) to find maxima or (with very slight changes) to find the roots of a real-valued function on a compact interval.


        Part I

        Minor arcs


        Chapter 3

        Introduction

The circle method expresses the number of solutions to a given problem in terms of exponential sums. Let $\eta : \mathbb{R}^+ \to \mathbb{C}$ be a smooth function, $\Lambda$ the von Mangoldt function (defined as in (1.5)) and $e(t) = e^{2\pi i t}$. The estimation of exponential sums of the type

\[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{3.1} \]

where $\alpha \in \mathbb{R}/\mathbb{Z}$, already lies at the basis of Hardy and Littlewood's approach to the ternary Goldbach problem by means of the circle method [HL22]. The division of the circle $\mathbb{R}/\mathbb{Z}$ into "major arcs" and "minor arcs" goes back to Hardy and Littlewood's development of the circle method for other problems. As they themselves noted, assuming GRH means that, for the ternary Goldbach problem, all of the circle can be, in effect, subdivided into major arcs; that is, under GRH, (3.1) can be estimated with major-arc techniques for $\alpha$ arbitrary. They needed to make such an assumption precisely because they did not yet know how to estimate $S_\eta(\alpha, x)$ on the minor arcs.

Minor-arc techniques for Goldbach's problem were first developed by Vinogradov [Vin37]. These techniques make it possible to work without GRH. The main obstacle to a full proof of the ternary Goldbach conjecture since then has been that, in spite of gradual improvements, minor-arc bounds have simply not been strong enough.

As in all work to date, our aim will be to give useful upper bounds on (3.1) for $\alpha$ in the minor arcs, rather than the precise estimates that are typical of the major-arc case. We will have to give upper bounds that are qualitatively stronger than those known before. (In Part III, we will also show how to use them more efficiently.)

Our main challenge will be to give a sufficiently good upper bound whenever $q$ is larger than a constant $r$. Here "sufficiently good" means "smaller than the trivial bound divided by a large constant, and getting even smaller quickly as $q$ grows". Our bound must also be good for $\alpha = a/q + \delta/x$, where $q < r$ but $\delta$ is large. (Such an $\alpha$ may be said to lie on the tail ($\delta$ large) of a major arc ($q$ small).)

Of course, all expressions must be explicit, and all constants in the leading terms of the bound must be small. Still, the main requirement is a qualitative one. For instance, we know in advance that a single factor of $\log x$ would be the end of us. That is, we know that, if there is a single term of the form, say, $(x \log x)/q$, and the trivial bound is about $x$, we are lost: $(x \log x)/q$ is greater than $x$ for $x$ large and $q$ constant.

The quality of the results here is due to several new ideas of general applicability. In particular, §5.1 introduces a way to obtain cancellation from Vaughan's identity. Vaughan's identity is a two-log gambit, in that it introduces two convolutions (each of them at a cost of log) and offers a great deal of flexibility in compensation. One of the ideas presented here is that at least one of two logs can be successfully recovered after having been given away in the first stage of the proof. This reduces the cost of the use of this basic identity in this and, presumably, many other problems.

There are several other improvements that make a qualitative difference; see the discussions at the beginning of §4 and §5. Considering smoothed sums (now a common idea) also helps. (Smooth sums here go back to Hardy-Littlewood [HL22], both in the general context of the circle method and in the context of Goldbach's ternary problem. In recent work on the problem, they reappear in [Tao14].)

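3.1 Results

The main bound we are about to see is essentially proportional to $((\log q)/\sqrt{\phi(q)}) \cdot x$. The term $\delta_0$ serves to improve the bound when we are on the tail of an arc.

Theorem 3.1.1. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Let $S_\eta(\alpha, x)$ be as in (3.1), with $\eta$ defined in (3.4). Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4) x^{2/3}$. If $q \le x^{1/3}/6$, then

\[ |S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x + \frac{2.5 x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q} + 3.36 x^{5/6}, \tag{3.2} \]

where

\[ \begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2 \log \frac{9 x^{1/3}}{2.004 t}}\right) + 0.41415, \\ L_{x,t,q} &= \frac{q}{\phi(q)} \left(\frac{13}{4} \log t + 7.82\right) + 13.66 \log t + 37.55. \end{aligned} \tag{3.3} \]

If $q > x^{1/3}/6$, then

\[ |S_\eta(\alpha, x)| \le 0.276\, x^{5/6} (\log x)^{3/2} + 1234\, x^{2/3} \log x. \]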

The factor $R_{x,t}$ is small in practice; for instance, for $x = 10^{25}$ and $\delta_0 q = 5 \cdot 10^5$ (typical "difficult" values), $R_{x,\delta_0 q}$ equals $0.59648\dots$.

The classical choice¹ for $\eta$ in (3.1) is $\eta(t) = 1$ for $t \le 1$, $\eta(t) = 0$ for $t > 1$, which, of course, is not smooth, or even continuous. We use

\[
\eta(t) = \eta_2(t) = 4 \max(\log 2 - |\log 2t|, 0), \tag{3.4}
\]

as in Tao [Tao14], in part for purposes of comparison. (This is the multiplicative convolution of the characteristic function of an interval with itself.) Nearly all work should be applicable to any other sufficiently smooth function $\eta$ of fast decay. It is important that $\widehat{\eta}$ decay at least quadratically.

¹ Or, more precisely, the choice made by Vinogradov and followed by most of the literature since him. Hardy and Littlewood [HL22] worked with $\eta(t) = e^{-t}$.
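The remark that $\eta_2$ is a multiplicative convolution can be checked numerically. The sketch below assumes the normalisation $\eta_1 = 2 \cdot 1_{[1/2,1]}$ and the convention $(f *_M g)(t) = \int_0^\infty f(w) g(t/w)\, dw/w$; both are assumptions made here for illustration, as are the function names.

```python
import math

def eta2(t):
    """eta_2(t) = 4 max(log 2 - |log 2t|, 0), as in (3.4)."""
    if t <= 0:
        return 0.0
    return 4 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

def eta1(t):
    """Assumed normalisation: twice the indicator function of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def mult_conv_eta1(t, steps=20000):
    """Midpoint-rule approximation to (eta_1 *_M eta_1)(t).

    Since eta_1 is supported on [1/2, 1], it is enough to integrate
    eta_1(w) eta_1(t/w) / w over w in [1/2, 1].
    """
    h = 0.5 / steps
    total = 0.0
    for i in range(steps):
        w = 0.5 + (i + 0.5) * h
        total += eta1(w) * eta1(t / w) / w
    return total * h

for t in (0.2, 0.3, 0.5, 0.8, 1.2):
    print(t, eta2(t), mult_conv_eta1(t))
```

The two columns printed should agree up to the discretisation error of the midpoint rule.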

We are not forced to use the same smoothing function as in Part II, and we do not. As was explained in the introduction, the simple technique (1.40) allows us to work with one smoothing function on the major arcs and with another one on the minor arcs.

3.2 Comparison to earlier work

Table 3.1 compares the bounds for the ratio $|S_\eta(a/q, x)|/x$ given by this paper and by [Tao14, Thm. 1.3] for $x = 10^{27}$ and different values of $q$. We are comparing worst cases: $\phi(q)$ as small as possible ($q$ divisible by $2 \cdot 3 \cdot 5 \cdots$) in the result here, and $q$ divisible by $4$ (implying $4\alpha \sim a/(q/4)$) in Tao's result. The main term in the result in this paper improves slowly with increasing $x$; the results in [Tao14] worsen slowly with increasing $x$. The qualitative gain with respect to the main term in [Tao14, (1.10)] is in the order of $\log(q) \sqrt{\phi(q)/q}$. Notice also that the bounds in [Tao14] are not log-free; in [Tao14, (1.10)], there is a term proportional to $x(\log x)^2/q$. This becomes larger than the trivial bound $x$ for $x$ very large.

The results in [DR01] are unfortunately worse than the trivial bound in the range covered by Table 3.1. Ramaré's results ([Ram10, Thm. 3], [Ramc, Thm. 6]) are not applicable within the range, since neither of the conditions $\log q \le (1/50)(\log x)^{1/3}$, $q \le x^{1/48}$ is satisfied. Ramaré's bound in [Ramc, Thm. 6] is

\[
\left| \sum_{x < n \le 2x} \Lambda(n) e(an/q) \right| \le 13000 \frac{\sqrt{q}}{\phi(q)} x \tag{3.5}
\]

for $20 \le q \le x^{1/48}$. We should underline that, while both the constant $13000$ and the condition $q \le x^{1/48}$ keep (3.5) from being immediately useful in the present context, (3.5) is asymptotically better than the results here as $q \to \infty$. (Indeed, qualitatively speaking, the form of (3.5) is the best one can expect from results derived by the family of methods stemming from Vinogradov's work.) There is also unpublished work by Ramaré (ca. 1993) with different constants for $q \ll (\log x/\log\log x)^4$.

3.3 Basic setup

In the minor-arc regime, the first step in estimating an exponential sum on the primes generally consists in the application of an identity expressing the von Mangoldt function $\Lambda(n)$ in terms of a sum of convolutions of other functions.

3.3.1 Vaughan's identity

We recall Vaughan's identity [Vau77b]:

\[
\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}, \tag{3.6}
\]


q_0           |S_η(a/q,x)|/x, HH    |S_η(a/q,x)|/x, Tao
10^5          0.04661               0.34475
1.5 · 10^5    0.03883               0.28836
2.5 · 10^5    0.03098               0.23194
5 · 10^5      0.02297               0.17416
7.5 · 10^5    0.01934               0.14775
10^6          0.01756               0.13159
10^7          0.00690               0.05251

Table 3.1: Worst-case upper bounds on $x^{-1}|S_\eta(a/2q, x)|$ for $q \ge q_0$, $|\delta| \le 8$, $x = 10^{27}$. The trivial bound is $1$.

where $1$ is the constant function $1$ and where we write

\[
f_{\le z}(n) = \begin{cases} f(n) & \text{if } n \le z, \\ 0 & \text{if } n > z, \end{cases}
\qquad
f_{>z}(n) = \begin{cases} 0 & \text{if } n \le z, \\ f(n) & \text{if } n > z. \end{cases}
\]

Here $f * g$ denotes the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$. We can set the values of $U$ and $V$ however we wish.

Vaughan's identity is essentially a consequence of the Möbius inversion formula

\[
(1 * \mu)(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.7}
\]

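Indeed, by (3.7),

\[
\Lambda_{>V}(n) = \sum_{dm|n} \mu(d) \Lambda_{>V}(m)
= \sum_{dm|n} \mu_{\le U}(d) \Lambda_{>V}(m) + \sum_{dm|n} \mu_{>U}(d) \Lambda_{>V}(m).
\]

Applying to this the trivial equality $\Lambda_{>V} = \Lambda - \Lambda_{\le V}$, as well as the simple fact that $1 * \Lambda = \log$, we obtain that

\[
\Lambda_{>V}(n) = \sum_{d|n} \mu_{\le U}(d) \log(n/d) - \sum_{dm|n} \mu_{\le U}(d) \Lambda_{\le V}(m) + \sum_{dm|n} \mu_{>U}(d) \Lambda_{>V}(m).
\]

By $\Lambda = \Lambda_{>V} + \Lambda_{\le V}$, we conclude that Vaughan's identity (3.6) holds.

Applying Vaughan's identity, we easily get that, for any function $\eta : \mathbb{R} \to \mathbb{R}$, any completely multiplicative function $f : \mathbb{Z}^+ \to \mathbb{C}$ and any $x > 0$, $U, V \ge 0$,

\[
\sum_n \Lambda(n) f(n) e(\alpha n) \eta(n/x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty}, \tag{3.8}
\]

where

\[
\begin{aligned}
S_{I,1} &= \sum_{m \le U} \mu(m) f(m) \sum_n (\log n) e(\alpha mn) f(n) \eta(mn/x), \\
S_{I,2} &= \sum_{d \le V} \Lambda(d) f(d) \sum_{m \le U} \mu(m) f(m) \sum_n e(\alpha dmn) f(n) \eta(dmn/x), \\
S_{II} &= \sum_{m > U} f(m) \Biggl( \sum_{\substack{d > U \\ d|m}} \mu(d) \Biggr) \sum_{n > V} \Lambda(n) e(\alpha mn) f(n) \eta(mn/x), \\
S_{0,\infty} &= \sum_{n \le V} \Lambda(n) e(\alpha n) f(n) \eta(n/x).
\end{aligned} \tag{3.9}
\]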
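Vaughan's identity, in the form $\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}$, can be checked numerically for small $n$. The following is a brute-force sketch meant only as a sanity check; the function names are hypothetical and the choices of $U$, $V$ are arbitrary.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # square factor
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)            # n itself is prime

def vaughan_rhs(n, U, V):
    """Right side of Vaughan's identity evaluated at n."""
    total = von_mangoldt(n) if n <= V else 0.0        # Lambda_{<=V}(n)
    for d in range(1, n + 1):
        if n % d:
            continue
        rest = n // d
        mu_d = mobius(d)
        if d <= U:
            total += mu_d * math.log(rest)            # mu_{<=U} * log
        for m in range(1, rest + 1):
            if rest % m:
                continue
            lam = von_mangoldt(m)
            if d <= U and m <= V:
                total -= mu_d * lam                   # - mu_{<=U} * Lambda_{<=V} * 1
            if d > U and m > V:
                total += mu_d * lam                   # + mu_{>U} * Lambda_{>V} * 1
    return total

worst = max(abs(vaughan_rhs(n, 3, 5) - von_mangoldt(n)) for n in range(1, 121))
print("largest deviation for U = 3, V = 5:", worst)
```

The deviation printed should be at the level of floating-point rounding error, whatever values of $U$ and $V$ are used.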

We will use the function

\[
f(n) = \begin{cases} 1 & \text{if } \gcd(n, v) = 1, \\ 0 & \text{otherwise,} \end{cases} \tag{3.10}
\]

where $v$ is a small, positive, square-free integer. (Our final choice will be $v = 2$.) Then

\[
S_\eta(\alpha, x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty} + S_{0,v}, \tag{3.11}
\]

where $S_\eta(\alpha, x)$ is as in (3.1) and

\[
S_{0,v} = \sum_{n|v} \Lambda(n) e(\alpha n) \eta(n/x).
\]

The sums $S_{I,1}$, $S_{I,2}$ are called "of type I", the sum $S_{II}$ is called "of type II" (or bilinear). (The not-all-too colorful nomenclature goes back to Vinogradov.) The sum $S_{0,\infty}$ is in general negligible; for our later choice of $V$ and $\eta$, it will be in fact $0$. The sum $S_{0,v}$ will be negligible as well.

As we already discussed in the introduction, Vaughan's identity is highly flexible (in that we can choose $U$ and $V$ at will) but somewhat inefficient in practice (in that a trivial estimate for the right side of (3.11) is actually larger than a trivial estimate for the left side of (3.11)). Some of our work will consist in regaining part of what is given up when we apply Vaughan's identity.

3.3.2 An alternative route

There is an alternative route – namely, to use a less sacrificial, though also more inflexible, identity. While this was not, in the end, the route that was followed, let us nevertheless discuss it in some detail, in part so that we can understand to what extent it was, in retrospect, viable, and in part so as to see how much of the work we will undertake is really more or less independent of the particular identity we choose.

Since $\zeta'(s)/\zeta(s) = -\sum_n \Lambda(n) n^{-s}$ and

\[
\begin{aligned}
\left( \frac{\zeta'(s)}{\zeta(s)} \right)^{(2)} &= \left( \frac{\zeta''(s)}{\zeta(s)} - \frac{(\zeta'(s))^2}{\zeta(s)^2} \right)' \\
&= \frac{\zeta^{(3)}(s)}{\zeta(s)} - \frac{3\zeta''(s)\zeta'(s)}{\zeta(s)^2} + 2 \left( \frac{\zeta'(s)}{\zeta(s)} \right)^3 \\
&= \frac{\zeta^{(3)}(s)}{\zeta(s)} - 3 \left( \frac{\zeta'(s)}{\zeta(s)} \right)' \cdot \frac{\zeta'(s)}{\zeta(s)} - \left( \frac{\zeta'(s)}{\zeta(s)} \right)^3,
\end{aligned} \tag{3.12}
\]

we can see, comparing coefficients, that

\[
\Lambda \cdot \log^2 = \mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda, \tag{3.13}
\]

as was stated by Bombieri in [Bom76].

Here the term $\mu * \log^3$ is of the same kind as the term $\mu_{\le U} * \log$ we have to estimate if we use Vaughan's identity, though the fact that there is no truncation at $U$ means that one of the error terms will get larger – it will be proportional to $x$, in fact, if we sum from $1$ to $x$. The trivial upper bound on the sum of $\Lambda \cdot \log^2$ from $1$ to $x$ is $x(\log x)^2$; thus, an error term of size $x$ is barely acceptable.

In general, when we have a double or triple sum, we are not very good at getting better than trivial bounds in ranges in which all but one of the variables are very small. This is the source of the large error term that appears in the sum involving $\mu * \log^3$, because we are no longer truncating as for $\mu_{\le U} * \log$. It will also be the source of other large error terms, including one that would be too large – namely, the one coming from the term $(\Lambda \cdot \log) * \Lambda$ when the variable of $\Lambda \cdot \log$ is large and that of $\Lambda$ is small. (The trivial bound on that range is $x \log x$.)

We avoid this problem by substituting the identity $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$ inside (3.13):

\[
\Lambda \cdot \log^2 = \mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda. \tag{3.14}
\]

(We could also have got this directly from the next-to-last line in (3.12).) When the variable of $\Lambda$ in $(\mu * \log^2) * \Lambda$ is small, the variable of $\mu * \log^2$ is large, and we can estimate the resulting term using the same techniques as for $\mu * \log^3$.
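Both expressions for $\Lambda \cdot \log^2$, namely $\mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda$ and $\mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda$, can be tested numerically with truncated Dirichlet convolutions. The code below is a small sanity check under that setup; the helper names and the cutoff `N` are choices made here, not taken from the text.

```python
import math

N = 300  # all arithmetic functions are tabulated on 1..N

def tables(n_max):
    """Sieve the Moebius function and the von Mangoldt function up to n_max."""
    mu = [1.0] * (n_max + 1)
    lam = [0.0] * (n_max + 1)
    mu[0] = 0.0
    composite = [False] * (n_max + 1)
    for p in range(2, n_max + 1):
        if composite[p]:
            continue
        for k in range(2 * p, n_max + 1, p):
            composite[k] = True
        for k in range(p, n_max + 1, p):
            mu[k] = -mu[k]            # one sign change per prime factor
        for k in range(p * p, n_max + 1, p * p):
            mu[k] = 0.0               # not square-free
        pk = p
        while pk <= n_max:
            lam[pk] = math.log(p)
            pk *= p
    return mu, lam

def conv(f, g):
    """Dirichlet convolution of two tables indexed 0..N (index 0 unused)."""
    h = [0.0] * len(f)
    for d in range(1, len(f)):
        if f[d] == 0:
            continue
        for k in range(1, (len(f) - 1) // d + 1):
            h[d * k] += f[d] * g[k]
    return h

mu, lam = tables(N)
log1 = [0.0] + [math.log(n) for n in range(1, N + 1)]
log2 = [v ** 2 for v in log1]
log3 = [v ** 3 for v in log1]
lam_log = [a * b for a, b in zip(lam, log1)]

lhs = [a * b for a, b in zip(lam, log2)]               # Lambda * log^2 (pointwise)
triple = conv(conv(lam, lam), lam)                     # Lambda * Lambda * Lambda
mu_log3 = conv(mu, log3)
first = [a - 3 * b - c for a, b, c in zip(mu_log3, conv(lam_log, lam), triple)]
second = [a - 3 * b + 2 * c
          for a, b, c in zip(mu_log3, conv(conv(mu, log2), lam), triple)]

err_first = max(abs(a - b) for a, b in zip(lhs, first))
err_second = max(abs(a - b) for a, b in zip(lhs, second))
print(err_first, err_second)
```

Both errors should be of the size of rounding errors only.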

It is easy to see that we can, in fact, mix (3.13) and (3.14):

\[
\Lambda \cdot \log^2 = \mu * \log^3 - 3\left( (\Lambda \cdot \log) * \Lambda_{>V} + (\mu * \log^2) * \Lambda_{\le V} \right) + \left( -\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda \right), \tag{3.15}
\]

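for $V$ arbitrary. Note here that there is some cancellation in the last term: writing

\[
F_{3,V}(n) = \left( -\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda \right)(n), \tag{3.16}
\]

we can check easily that, for $n = p_1 p_2 p_3$ square-free with $V^3 < n$, we have

\[
F_{3,V}(n) = \begin{cases}
-6 \log p_1 \log p_2 \log p_3 & \text{if all } p_i > V, \\
0 & \text{if } p_1 \le V < p_2 < p_3, \\
6 \log p_1 \log p_2 \log p_3 & \text{if } p_1 < p_2 \le V < p_3, \\
12 \log p_1 \log p_2 \log p_3 & \text{if all } p_i \le V.
\end{cases}
\]

(Each prime $p_i > V$ contributes $-2$ to the coefficient and each prime $p_i \le V$ contributes $+4$.) In contrast, for $n$ square-free, $-\Lambda * \Lambda * \Lambda(n)$ is $-6 \log p_1 \log p_2 \log p_3$ if $n$ is of the form $p_1 p_2 p_3$, and $0$ otherwise.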
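The values of $F_{3,V}(n) = (-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda)(n)$ at products of three distinct primes can be computed directly. In the sketch below (helper names and the choice $V = 5$ are illustrative), the coefficient of $\log p_1 \log p_2 \log p_3$ comes out as $-6$, $0$, $6$ or $12$ according to whether $0$, $1$, $2$ or $3$ of the primes are $\le V$; the computation imposes no size condition on $n$.

```python
import math

N = 1001   # large enough to contain 7 * 11 * 13
V = 5

def von_mangoldt_table(n_max):
    """Lambda(n) for 0 <= n <= n_max, by a simple sieve."""
    lam = [0.0] * (n_max + 1)
    composite = [False] * (n_max + 1)
    for p in range(2, n_max + 1):
        if composite[p]:
            continue
        for k in range(2 * p, n_max + 1, p):
            composite[k] = True
        pk = p
        while pk <= n_max:
            lam[pk] = math.log(p)
            pk *= p
    return lam

def conv(f, g):
    """Dirichlet convolution of two tables indexed 0..N (index 0 unused)."""
    h = [0.0] * len(f)
    for d in range(1, len(f)):
        if f[d] == 0:
            continue
        for k in range(1, (len(f) - 1) // d + 1):
            h[d * k] += f[d] * g[k]
    return h

lam = von_mangoldt_table(N)
lam_le = [v if n <= V else 0.0 for n, v in enumerate(lam)]   # Lambda_{<=V}
lam_gt = [v if n > V else 0.0 for n, v in enumerate(lam)]    # Lambda_{>V}
lam_lam = conv(lam, lam)
F3 = [2 * a - b for a, b in zip(conv(lam_le, lam_lam), conv(lam_gt, lam_lam))]

def coefficient(p1, p2, p3):
    """F_{3,V}(p1 p2 p3) divided by log p1 log p2 log p3."""
    return F3[p1 * p2 * p3] / (math.log(p1) * math.log(p2) * math.log(p3))

for primes in ((7, 11, 13), (2, 7, 11), (2, 3, 7), (2, 3, 5)):
    print(primes, round(coefficient(*primes), 6))
```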

We may find it useful to take aside two large terms that may need to be bounded trivially, namely, $\mu * \log^3_{\le u}$ and $(\Lambda \cdot \log)_{\le u} * \Lambda_{>V}$, where $u$ will be a small parameter. (We can let, for instance, $u = 3$.) We conclude that

\[
\Lambda \cdot \log^2 = F_{I,1,u}(n) - 3F_{I,2,V,u}(n) - 3F_{II,V,u}(n) + F_{3,V}(n) + F_{0,V,u}(n), \tag{3.17}
\]

where

\[
\begin{aligned}
F_{I,1,u} &= \mu * \log^3_{>u}, \\
F_{I,2,V,u} &= (\mu * \log^2) * \Lambda_{\le V}, \\
F_{II,V,u}(n) &= (\Lambda \cdot \log)_{>u} * \Lambda_{>V}, \\
F_{0,V,u}(n) &= \mu * \log^3_{\le u} - 3(\Lambda \cdot \log)_{\le u} * \Lambda_{>V},
\end{aligned}
\]

and $F_{3,V}$ is as in (3.16).

In the bulk of the present work – in particular, in all steps that are part of the proof of Theorem 3.1.1 or the Main Theorem – we will use Vaughan's identity, rather than (3.17). This choice was made while the proof was still underway; it was due mainly to back-of-the-envelope estimates that showed that the error terms could be too large if (3.14) was used. Of course, this might have been the case with Vaughan's identity as well, but the fact that the parameters $U$, $V$ there have a large effect on the outcome meant that one could hope to improve on insufficient estimates in part by adjusting $U$ and $V$, without losing all previous work. (This is what was meant by the "flexibility" of Vaughan's identity.)

The question remains: can one prove ternary Goldbach using (3.17) rather than Vaughan's identity? This seems likely. If so, which proof would be more complicated? This is not clear.

There are large parts of the work that are essentially the same in both cases:

• estimates for sums involving $\mu_{\le U} * \log^k$ ("type I"),

• estimates for sums involving $\Lambda_{>u} * \Lambda_{>V}$ and the like ("type II").

Trilinear sums, i.e., sums involving $\Lambda * \Lambda * \Lambda$, can be estimated much like bilinear sums, i.e., sums involving $\Lambda * \Lambda$.

There are also challenges that appear only for Vaughan's identity and others that appear only for (3.17). An example of a challenge that is successfully faced in the main proof, but does not appear if (3.17) is used, consists in bounding sums of type

\[
\sum_{U < m \le x/W} \Biggl( \sum_{\substack{d > U \\ d|m}} \mu(d) \Biggr)^2.
\]

(In §5.1, we will be able to bound sums of this type by a constant times $x/W$.) Likewise, large tail terms that have to be estimated trivially seem unavoidable in (3.17). (The choice of a parameter $u > 1$ as above is meant to alleviate the problem.)

In the end, losing a factor of about $\log(x/UV)$ seems inevitable when one uses Vaughan's identity, but not when one uses (3.17). Another reason why a full treatment based on (3.17) would also be worthwhile is that it is a somewhat less familiar and arguably under-used identity, and deserves more exploration. With these comments, we close the discussion of (3.17); we will henceforth use Vaughan's identity.

Chapter 4

Type I sums

Here we must bound sums of the basic type

\[
\sum_{m \le D} \mu(m) \sum_n e(\alpha mn) \eta\left( \frac{mn}{x} \right)
\]

and variations thereof. There are three main improvements in comparison to standard treatments:

1. The terms with $m$ divisible by $q$ get taken out and treated separately by analytic means. This all but eliminates what would otherwise be the main term.

2. The other terms get handled by improved estimates on trigonometric sums. For large $m$, the improvements have a substantial total effect – more than a constant factor is gained.

3. The "error" term $\delta/x = \alpha - a/q$ is used to our advantage. This happens both through the Poisson summation formula and through the use of two alternative approximations to the same number $\alpha$.

The fact that a continuous weight $\eta$ is used ("smoothing") is a difference with respect to the classical literature ([Vin37] and what followed) but not with respect to more recent work (including [Tao14]); using smooth or continuous weights is an idea that has become commonplace in analytic number theory, even though it is not consistently applied. The improvements due to smoothing in type I are both relatively minor and essentially independent of the improvements due to (1) and (3). The use of a continuous weight combines nicely with (2), but the ideas given here would give qualitative improvements in the treatment of trigonometric sums even in the absence of smoothing.

4.1 Trigonometric sums

The following lemmas on trigonometric sums improve on the best Vinogradov-type lemmas in the literature. (By this, we mean results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]. See, in particular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is to switch between different types of approximation within the sum, rather than just choosing between bounding all terms either trivially (by $A$) or non-trivially (by $C/|\sin(\pi\alpha n)|^2$). There will also¹ be improvements in our applications stemming from the fact that Lemmas 4.1.1 and 4.1.2 take quadratic ($|\sin(\pi\alpha n)|^2$) rather than linear ($|\sin(\pi\alpha n)|$) inputs. (These improved inputs come from the use of smoothing elsewhere.)

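Lemma 4.1.1. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Then, for any $A, C \ge 0$,

\[
\sum_{y < n \le y+q} \min\left( A, \frac{C}{|\sin(\pi\alpha n)|^2} \right) \le \min\left( 2A + \frac{6q^2}{\pi^2} C, \; 3A + \frac{4q}{\pi} \sqrt{AC} \right). \tag{4.1}
\]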

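Proof. We start by letting $m_0 = \lfloor y \rfloor + \lfloor (q+1)/2 \rfloor$, $j = n - m_0$, so that $j$ ranges in the interval $(-q/2, q/2]$. We write

\[
\alpha n = \frac{aj + c}{q} + \delta_1(j) + \delta_2 \mod 1,
\]

where $|\delta_1(j)|$ and $|\delta_2|$ are both $\le 1/2q$; we can assume $\delta_2 \ge 0$. The variable $r = aj + c \mod q$ occupies each residue class mod $q$ exactly once.

One option is to bound the terms corresponding to $r = 0, -1$ by $A$ each and all the other terms by $C/|\sin(\pi\alpha n)|^2$. (This can be seen as the simple case; it will take us about a page just because we should estimate all sums and all terms here with great care – as in [DR01], only more so.)

The terms corresponding to $r = -k$ and $r = k - 1$ ($2 \le k \le q/2$) contribute at most $C$ times

\[
\frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{1}{2} - q\delta_2\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{3}{2} + q\delta_2\right)} \le \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{1}{2}\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{3}{2}\right)},
\]

since $x \mapsto 1/(\sin x)^2$ is convex-up on $(0, \pi)$. Hence the terms with $r \ne 0, -1$ contribute at most $C$ times

\[
\frac{1}{\left(\sin \frac{\pi}{2q}\right)^2} + 2 \sum_{2 \le r \le q/2} \frac{1}{\left(\sin \frac{\pi}{q}\left(r - \frac{1}{2}\right)\right)^2} \le \frac{1}{\left(\sin \frac{\pi}{2q}\right)^2} + 2 \int_1^{q/2} \frac{dx}{\left(\sin \frac{\pi}{q} x\right)^2},
\]

where we use again the convexity of $x \mapsto 1/(\sin x)^2$. (We can assume $q > 2$, as otherwise we have no terms other than $r = 0, -1$.) Now

\[
\int_1^{q/2} \frac{dx}{\left(\sin \frac{\pi}{q} x\right)^2} = \frac{q}{\pi} \int_{\pi/q}^{\pi/2} \frac{du}{(\sin u)^2} = \frac{q}{\pi} \cot \frac{\pi}{q}.
\]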

¹ This is a change with respect to the first version of the preprint [Helb]. The version of Lemma 4.1.1 there has, however, the advantage of being immediately comparable to results in the literature.

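Hence

\[
\sum_{y < n \le y+q} \min\left( A, \frac{C}{(\sin \pi\alpha n)^2} \right) \le 2A + \frac{C}{\left(\sin \frac{\pi}{2q}\right)^2} + C \cdot \frac{2q}{\pi} \cot \frac{\pi}{q}.
\]

Now, by [AS64, (4.3.68)] and [AS64, (4.3.70)], for $t \in (-\pi, \pi)$,

\[
\begin{aligned}
\frac{t}{\sin t} &= 1 + \sum_{k \ge 0} a_{2k+1} t^{2k+2} = 1 + \frac{t^2}{6} + \dots, \\
t \cot t &= 1 - \sum_{k \ge 0} b_{2k+1} t^{2k+2} = 1 - \frac{t^2}{3} - \frac{t^4}{45} - \dots,
\end{aligned} \tag{4.2}
\]

where $a_{2k+1} \ge 0$, $b_{2k+1} \ge 0$. Thus, for $t \in [0, t_0]$, $t_0 < \pi$,

\[
\left( \frac{t}{\sin t} \right)^2 = 1 + \frac{t^2}{3} + c_0(t) t^4 \le 1 + \frac{t^2}{3} + c_0(t_0) t^4, \tag{4.3}
\]

where

\[
c_0(t) = \frac{1}{t^4} \left( \left( \frac{t}{\sin t} \right)^2 - \left( 1 + \frac{t^2}{3} \right) \right),
\]

which is an increasing function because $a_{2k+1} \ge 0$. For $t_0 = \pi/4$, $c_0(t_0) \le 0.074807$. Hence

\[
\begin{aligned}
\frac{t^2}{\sin^2 t} + t \cot 2t &\le \left( 1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right) t^4 \right) + \left( \frac{1}{2} - \frac{2t^2}{3} - \frac{8t^4}{45} \right) \\
&= \frac{3}{2} - \frac{t^2}{3} + \left( c_0\left(\frac{\pi}{4}\right) - \frac{8}{45} \right) t^4 \le \frac{3}{2} - \frac{t^2}{3} \le \frac{3}{2}
\end{aligned}
\]

for $t \in [0, \pi/4]$. Therefore, the left side of (4.1) is at most

\[
2A + C \cdot \left( \frac{2q}{\pi} \right)^2 \cdot \frac{3}{2} = 2A + \frac{6}{\pi^2} C q^2.
\]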

The following is an alternative approach; it yields the other estimate in (4.1). We bound the terms corresponding to $r = 0$, $r = -1$, $r = 1$ by $A$ each. We let $r = \pm r'$ for $r'$ ranging from $2$ to $q/2$. We obtain that the sum is at most

\[
3A + \sum_{2 \le r' \le q/2} \min\left( A, \frac{C}{\left(\sin \frac{\pi}{q}\left(r' - \frac{1}{2} - q\delta_2\right)\right)^2} \right) + \sum_{2 \le r' \le q/2} \min\left( A, \frac{C}{\left(\sin \frac{\pi}{q}\left(r' - \frac{1}{2} + q\delta_2\right)\right)^2} \right). \tag{4.4}
\]

We bound a term $\min(A, C/\sin((\pi/q)(r' - 1/2 \pm q\delta_2))^2)$ by $A$ if and only if $C/\sin((\pi/q)(r' - 1 \pm q\delta_2))^2 \ge A$. (In other words, we are choosing which of the two bounds $A$, $C/|\sin(\pi\alpha n)|^2$ to use on a case-by-case basis, i.e., for each $n$, instead of making a single choice for all $n$ in one go. This is hardly anything deep, but it does result in a marked improvement with respect to the literature, and would give an improvement even if we were given a bound $B/|\sin(\pi\alpha n)|$ instead of a bound $C/|\sin(\pi\alpha n)|^2$ as input.) The number of such terms is

\[
\le \max\left(0, \left\lfloor (q/\pi) \arcsin\left(\sqrt{C/A}\right) \mp q\delta_2 \right\rfloor\right),
\]

and thus at most $(2q/\pi) \arcsin(\sqrt{C/A})$ in total. (Recall that $q\delta_2 \le 1/2$.) Each other term gets bounded by the integral of $C/\sin^2(\pi t/q)$ from $r' - 1 \pm q\delta_2$ ($\ge (q/\pi) \arcsin(\sqrt{C/A})$) to $r' \pm q\delta_2$, by convexity. Thus (4.4) is at most

\[
\begin{aligned}
3A &+ \frac{2q}{\pi} A \arcsin \sqrt{\frac{C}{A}} + 2 \int_{\frac{q}{\pi} \arcsin \sqrt{C/A}}^{q/2} \frac{C}{\sin^2 \frac{\pi t}{q}} \, dt \\
&\le 3A + \frac{2q}{\pi} A \arcsin \sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1}.
\end{aligned}
\]

We can easily show (taking derivatives) that $\arcsin x + x\sqrt{1 - x^2} \le 2x$ for $0 \le x \le 1$. Setting $x = \sqrt{C/A}$, we see that this implies that

\[
3A + \frac{2q}{\pi} A \arcsin \sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1} \le 3A + \frac{4q}{\pi} \sqrt{AC}.
\]

(If $C/A > 1$, then $3A + (4q/\pi)\sqrt{AC}$ is greater than $Aq$, which is an obvious upper bound for the left side of (4.1).)
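The inequality (4.1) of Lemma 4.1.1, namely $\sum_{y<n\le y+q} \min(A, C/|\sin(\pi\alpha n)|^2) \le \min(2A + (6q^2/\pi^2)C,\ 3A + (4q/\pi)\sqrt{AC})$ for $\alpha = a/q + \beta/qQ$, can be tested on a grid of parameters. This is only a floating-point spot check with hypothetical function names, not a substitute for the proof.

```python
import math

def lhs_411(a, q, Q, beta, y, A, C):
    """Sum over y < n <= y + q of min(A, C / sin^2(pi alpha n))."""
    alpha = a / q + beta / (q * Q)
    total = 0.0
    for n in range(y + 1, y + q + 1):
        s2 = math.sin(math.pi * alpha * n) ** 2
        # min(A, C / s2), written so that s2 = 0 causes no division by zero
        total += A if s2 * A <= C else C / s2
    return total

def rhs_411(q, A, C):
    """Right side of (4.1)."""
    return min(2 * A + 6 * q * q * C / math.pi ** 2,
               3 * A + 4 * q / math.pi * math.sqrt(A * C))

def worst_ratio_411(max_q=12):
    """Largest value of lhs / rhs found on a small grid of parameters."""
    worst = 0.0
    for q in range(1, max_q + 1):
        for a in range(q):
            if math.gcd(a, q) != 1:
                continue
            for Q in (q, 2 * q + 1):
                for beta in (-1.0, -0.3, 0.0, 0.5, 1.0):
                    for y in (0, 7):
                        for A, C in ((1.0, 1.0), (10.0, 0.1), (100.0, 0.001), (1.0, 5.0)):
                            ratio = lhs_411(a, q, Q, beta, y, A, C) / rhs_411(q, A, C)
                            worst = max(worst, ratio)
    return worst

print(worst_ratio_411())
```

A printed value below 1 means that no counterexample was found on the grid.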

Now we will see that, if we take out terms with $n$ divisible by $q$ and $n$ is not too large, then we can give a bound that does not involve a constant term $A$ at all. (We are referring to the bound $(20/3\pi^2)Cq^2$ below, of course; $2A + (4q/\pi)\sqrt{AC}$ does have a constant term $2A$ – it is just smaller than the constant term $3A$ in the corresponding bound in (4.1).)

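Lemma 4.1.2. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $A, C \ge 0$,

\[
\sum_{\substack{y_1 < n \le y_2 \\ q \nmid n}} \min\left( A, \frac{C}{|\sin(\pi\alpha n)|^2} \right) \le \min\left( \frac{20}{3\pi^2} C q^2, \; 2A + \frac{4q}{\pi} \sqrt{AC} \right). \tag{4.5}
\]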

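Proof. Clearly, $\alpha n$ equals $an/q + (n/Q)\beta/q$; since $y_2 \le Q/2$, this means that $|\alpha n - an/q| \le 1/2q$ for $n \le y_2$; moreover, again for $n \le y_2$, the sign of $\alpha n - an/q$ remains constant. Hence the left side of (4.5) is at most

\[
\sum_{r=1}^{q/2} \min\left( A, \frac{C}{\left(\sin \frac{\pi}{q}\left(r - \frac{1}{2}\right)\right)^2} \right) + \sum_{r=1}^{q/2} \min\left( A, \frac{C}{\left(\sin \frac{\pi}{q} r\right)^2} \right).
\]

Proceeding as in the proof of Lemma 4.1.1, we obtain a bound of at most

\[
C \left( \frac{1}{\left(\sin \frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin \frac{\pi}{q}\right)^2} + \frac{q}{\pi} \cot \frac{\pi}{q} + \frac{q}{\pi} \cot \frac{3\pi}{2q} \right)
\]

for $q \ge 2$. (If $q = 1$, then the left side of (4.5) is trivially zero.) Now, by (4.2),

\[
\begin{aligned}
\frac{t^2}{(\sin t)^2} + \frac{t}{2} \cot 2t &\le \left( 1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right) t^4 \right) + \frac{1}{4} \left( 1 - \frac{4t^2}{3} - \frac{16t^4}{45} \right) \\
&\le \frac{5}{4} + \left( c_0\left(\frac{\pi}{4}\right) - \frac{4}{45} \right) t^4 \le \frac{5}{4}
\end{aligned}
\]

for $t \in [0, \pi/4]$, and

\[
\begin{aligned}
\frac{t^2}{(\sin t)^2} + t \cot \frac{3t}{2} &\le \left( 1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{2}\right) t^4 \right) + \frac{2}{3} \left( 1 - \frac{3t^2}{4} - \frac{81t^4}{2^4 \cdot 45} \right) \\
&\le \frac{5}{3} + \left( -\frac{1}{6} + \left( c_0\left(\frac{\pi}{2}\right) - \frac{27}{360} \right) \left(\frac{\pi}{2}\right)^2 \right) t^2 \le \frac{5}{3}
\end{aligned}
\]

for $t \in [0, \pi/2]$. Hence

\[
\frac{1}{\left(\sin \frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin \frac{\pi}{q}\right)^2} + \frac{q}{\pi} \cot \frac{\pi}{q} + \frac{q}{\pi} \cot \frac{3\pi}{2q} \le \left( \frac{2q}{\pi} \right)^2 \cdot \frac{5}{4} + \left( \frac{q}{\pi} \right)^2 \cdot \frac{5}{3} \le \frac{20}{3\pi^2} q^2.
\]

Alternatively, we can follow the second approach in the proof of Lemma 4.1.1, and obtain an upper bound of $2A + (4q/\pi)\sqrt{AC}$.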
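As with Lemma 4.1.1, the bound (4.5) of Lemma 4.1.2, $\sum_{y_1<n\le y_2,\ q\nmid n} \min(A, C/|\sin(\pi\alpha n)|^2) \le \min((20/3\pi^2)Cq^2,\ 2A + (4q/\pi)\sqrt{AC})$, can be spot-checked numerically. The sketch below respects the hypotheses $y_2 - y_1 \le q$ and $y_2 \le Q/2$; the grid and the function names are arbitrary choices made here.

```python
import math

def lhs_412(a, q, Q, beta, y1, y2, A, C):
    """Sum over y1 < n <= y2 with q not dividing n of min(A, C / sin^2(pi alpha n))."""
    alpha = a / q + beta / (q * Q)
    total = 0.0
    for n in range(y1 + 1, y2 + 1):
        if n % q == 0:
            continue
        s2 = math.sin(math.pi * alpha * n) ** 2
        total += A if s2 * A <= C else C / s2     # min(A, C / s2)
    return total

def rhs_412(q, A, C):
    """Right side of (4.5)."""
    return min(20 / (3 * math.pi ** 2) * C * q * q,
               2 * A + 4 * q / math.pi * math.sqrt(A * C))

def worst_gap_412(max_q=12):
    """Largest value of lhs - rhs on the grid (should not be positive)."""
    worst = -math.inf
    for q in range(2, max_q + 1):
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            for Q, starts in ((2 * q, (0,)), (4 * q, (0, q))):
                for y1 in starts:
                    y2 = y1 + q                   # y2 - y1 <= q and y2 <= Q/2
                    for beta in (-1.0, -0.4, 0.0, 0.7, 1.0):
                        for A, C in ((1.0, 1.0), (10.0, 0.1), (100.0, 0.001), (1.0, 5.0)):
                            gap = lhs_412(a, q, Q, beta, y1, y2, A, C) - rhs_412(q, A, C)
                            worst = max(worst, gap)
    return worst

print(worst_gap_412())
```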

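The following bound will be useful when the constant $A$ in an application of Lemma 4.1.2 would be too large. (This tends to happen for $n$ small.)

Lemma 4.1.3. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $B, C \ge 0$,

\[
\sum_{\substack{y_1 < n \le y_2 \\ q \nmid n}} \min\left( \frac{B}{|\sin(\pi\alpha n)|}, \frac{C}{|\sin(\pi\alpha n)|^2} \right) \le 2B \frac{q}{\pi} \max\left( 2, \log \frac{C e^3 q}{B\pi} \right). \tag{4.6}
\]

The upper bound $\le (2Bq/\pi) \log(2e^2 q/\pi)$ is also valid.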

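Proof. As in the proof of Lemma 4.1.2, we can bound the left side of (4.6) by

\[
2 \sum_{r=1}^{q/2} \min\left( \frac{B}{\sin \frac{\pi}{q}\left(r - \frac{1}{2}\right)}, \frac{C}{\sin^2 \frac{\pi}{q}\left(r - \frac{1}{2}\right)} \right).
\]

Assume $B \sin(\pi/q) \le C \le B$. By the convexity of $1/\sin(t)$ and $1/\sin(t)^2$ for $t \in (0, \pi/2]$,

\[
\begin{aligned}
\sum_{r=1}^{q/2} &\min\left( \frac{B}{\sin \frac{\pi}{q}\left(r - \frac{1}{2}\right)}, \frac{C}{\sin^2 \frac{\pi}{q}\left(r - \frac{1}{2}\right)} \right) \\
&\le \frac{B}{\sin \frac{\pi}{2q}} + \int_1^{\frac{q}{\pi} \arcsin \frac{C}{B}} \frac{B}{\sin \frac{\pi}{q} t} \, dt + \int_{\frac{q}{\pi} \arcsin \frac{C}{B}}^{q/2} \frac{C}{\sin^2 \frac{\pi}{q} t} \, dt \\
&\le \frac{B}{\sin \frac{\pi}{2q}} + \frac{q}{\pi} \left( B \left( \log \tan\left( \frac{1}{2} \arcsin \frac{C}{B} \right) - \log \tan \frac{\pi}{2q} \right) + C \cot \arcsin \frac{C}{B} \right) \\
&\le \frac{B}{\sin \frac{\pi}{2q}} + \frac{q}{\pi} \left( B \left( \log \cot \frac{\pi}{2q} - \log \frac{C}{B - \sqrt{B^2 - C^2}} \right) + \sqrt{B^2 - C^2} \right).
\end{aligned}
\]

Now, for all $t \in (0, \pi/2)$,

\[
\frac{2}{\sin t} + \frac{1}{t} \log \cot t < \frac{1}{t} \log\left( \frac{e^2}{t} \right);
\]

we can verify this by comparing series. Thus

\[
\frac{B}{\sin \frac{\pi}{2q}} + \frac{q}{\pi} B \log \cot \frac{\pi}{2q} \le B \frac{q}{\pi} \log \frac{2e^2 q}{\pi}
\]

for $q \ge 2$. (If $q = 1$, the sum on the left of (4.6) is empty, and so the bound we are trying to prove is trivial.) We also have

\[
t \log\left(t - \sqrt{t^2 - 1}\right) + \sqrt{t^2 - 1} < -t \log 2t + t \tag{4.7}
\]

for $t \ge 1$ (as this is equivalent to $\log(2t^2(1 - \sqrt{1 - t^{-2}})) < 1 - \sqrt{1 - t^{-2}}$, which we check easily after changing variables to $\delta = 1 - \sqrt{1 - t^{-2}}$). Hence

\[
\begin{aligned}
\frac{B}{\sin \frac{\pi}{2q}} &+ \frac{q}{\pi} \left( B \left( \log \cot \frac{\pi}{2q} - \log \frac{C}{B - \sqrt{B^2 - C^2}} \right) + \sqrt{B^2 - C^2} \right) \\
&\le B \frac{q}{\pi} \log \frac{2e^2 q}{\pi} + \frac{q}{\pi} \left( B - B \log \frac{2B}{C} \right) \le B \frac{q}{\pi} \log \frac{C e^3 q}{B\pi}
\end{aligned}
\]

for $q \ge 2$.

Given any $C$, we can apply the above with $C = B$ instead, as, for any $t > 0$, $\min(B/t, C/t^2) \le B/t \le \min(B/t, B/t^2)$. (We refrain from applying (4.7) so as to avoid worsening a constant.) If $C < B \sin \pi/q$ (or even if $C < (\pi/q)B$), we relax the input to $C = B \sin \pi/q$ and go through the above.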
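The two bounds of Lemma 4.1.3 can also be tested on a few parameter choices. The sketch below reads the right side of (4.6) as $(2Bq/\pi)\max(2, \log(Ce^3q/(B\pi)))$ and the alternative bound as $(2Bq/\pi)\log(2e^2q/\pi)$; it assumes $B, C > 0$ so that the logarithm is defined, and all names are hypothetical.

```python
import math

def lhs_413(a, q, Q, beta, y1, y2, B, C):
    """Sum over y1 < n <= y2, q not dividing n, of min(B/|sin|, C/sin^2) at pi alpha n."""
    alpha = a / q + beta / (q * Q)
    total = 0.0
    for n in range(y1 + 1, y2 + 1):
        if n % q == 0:
            continue
        s = abs(math.sin(math.pi * alpha * n))   # positive, since q does not divide n
        total += min(B / s, C / (s * s))
    return total

def rhs_413(q, B, C):
    """Right side of (4.6), for B, C > 0."""
    return 2 * B * q / math.pi * max(2.0, math.log(C * math.e ** 3 * q / (B * math.pi)))

def rhs_413_alt(q, B):
    """The alternative bound (2 B q / pi) log(2 e^2 q / pi)."""
    return 2 * B * q / math.pi * math.log(2 * math.e ** 2 * q / math.pi)

def holds_on_grid(max_q=12):
    """True if both bounds hold for every parameter choice in a small grid."""
    for q in range(2, max_q + 1):
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            Q, y1, y2 = 2 * q, 0, q               # y2 - y1 <= q and y2 <= Q/2
            for beta in (-1.0, -0.4, 0.0, 0.7, 1.0):
                for B, C in ((1.0, 1.0), (1.0, 0.01), (1.0, 50.0), (3.0, 0.5)):
                    value = lhs_413(a, q, Q, beta, y1, y2, B, C)
                    if value > rhs_413(q, B, C) + 1e-9:
                        return False
                    if value > rhs_413_alt(q, B) + 1e-9:
                        return False
    return True

print(holds_on_grid())
```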

4.2 Type I estimates

Let us give our first main type I estimate.² One of the main innovations is the manner in which the "main term" ($m$ divisible by $q$) is separated; we are able to keep error terms small thanks to the particular way in which we switch between two different approximations.

(These are not necessarily successive approximations in the sense of continued fractions; we do not want to assume that the approximation $a/q$ we are given arises from a continued fraction, and at any rate we need more control on the denominator $q'$ of the new approximation $a'/q'$ than continued fractions would furnish.)

The following lemma is a theme, so to speak, to which several variations will be given. Later, in practice, we will always use one of the variations, rather than the original lemma itself. This is so just because, even though (4.8) is the basic type of sum we treat in type I, the sums that we will have to estimate in practice will always present some minor additional complication. Proving the lemma we are about to give in full will give us a chance to see all the main ideas at work, leaving complications for later.

² The current version of Lemma 4.2.1 is an improvement over that included in the first version of the preprint [Helb].

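Lemma 4.2.1. Let $\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$, the absolute value of

\[
\sum_{m \le D} \mu(m) \sum_n e(\alpha mn) \eta\left( \frac{mn}{x} \right) \tag{4.8}
\]

is at most

\[
\frac{x}{q} \min\left( 1, \frac{c_0}{(2\pi\delta)^2} \right) \Biggl| \sum_{\substack{m \le M/q \\ (m,q)=1}} \frac{\mu(m)}{m} \Biggr| + O^*\left( c_0 \left( \frac{1}{4} - \frac{1}{\pi^2} \right) \left( \frac{D^2}{2xq} + \frac{D}{2x} \right) \right) \tag{4.9}
\]

plus

\[
\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D + 3c_1 \frac{x}{q} \log^+ \frac{D}{c_2 x/q} + \frac{\sqrt{c_0 c_1}}{\pi} q \log^+ \frac{D}{q/2} \\
&+ \frac{|\eta'|_1}{\pi} q \cdot \max\left( 2, \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x} \right) + \left( \frac{2\sqrt{3 c_0 c_1}}{\pi} + \frac{3c_1}{c_2} + \frac{55 c_0 c_2}{12\pi^2} \right) q,
\end{aligned} \tag{4.10}
\]

where $c_1 = 1 + |\eta'|_1/(2x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.9) plus

\[
\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} \left( D + (1 + \epsilon) \min\left( \left\lfloor \frac{x}{|\delta| q} \right\rfloor + 1, 2D \right) \left( \varpi_\epsilon + \frac{1}{2} \log^+ \frac{2D}{\frac{x}{|\delta| q}} \right) \right) \\
&+ 3c_1 \left( 2 + \frac{(1 + \epsilon)}{\epsilon} \log^+ \frac{2D}{\frac{x}{|\delta| q}} \right) \frac{x}{Q_0} + \frac{35 c_0 c_2}{6\pi^2} q,
\end{aligned} \tag{4.11}
\]

for $\epsilon \in (0, 1]$ arbitrary, where $\varpi_\epsilon = \sqrt{3 + 2\epsilon} + ((1 + \sqrt{13/3})/4 - 1)/(2(1 + \epsilon))$.

In (4.9), $\min(1, c_0/(2\pi\delta)^2)$ always equals $1$ when $|\delta| \le 1/2c_2$ (since $(3/5)(1 + \sqrt{13/3}) > 1$).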

Proof. Let $Q = \lfloor x/|\delta q| \rfloor$. Then $\alpha = a/q + O^*(1/qQ)$ and $q \le Q$. (If $\delta = 0$, we let $Q = \infty$ and ignore the rest of the paragraph, since then we will never need $Q'$ or the alternative approximation $a'/q'$.) Let $Q' = \lceil (1 + \epsilon) Q \rceil \ge Q + 1$. Then $\alpha$ is not $a/q + O^*(1/qQ')$, and so there must be a different approximation $a'/q'$, $(a', q') = 1$, $q' \le Q'$, such that $\alpha = a'/q' + O^*(1/q'Q')$ (since such an approximation always exists). Obviously, $|a/q - a'/q'| \ge 1/qq'$, yet, at the same time, $|a/q - a'/q'| \le 1/qQ + 1/q'Q' \le 1/qQ + 1/((1 + \epsilon) q' Q)$. Hence $q'/Q + q/((1 + \epsilon) Q) \ge 1$, and so $q' \ge Q - q/(1 + \epsilon) \ge (\epsilon/(1 + \epsilon)) Q$. (Note also that $(\epsilon/(1 + \epsilon)) Q \ge (2|\delta q|/x) \cdot \lfloor x/\delta q \rfloor > 1$, and so $q' \ge 2$.)

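Lemma 4.1.2 will enable us to treat separately the contribution from terms with $m$ divisible by $q$ and $m$ not divisible by $q$, provided that $m \le Q/2$. Let $M = \min(Q/2, D)$. We start by considering all terms with $m \le M$ divisible by $q$. Then $e(\alpha mn)$ equals $e((\delta m/x) n)$. By Poisson summation,

\[
\sum_n e(\alpha mn) \eta(mn/x) = \sum_n \widehat{f}(n),
\]

where $f(u) = e((\delta m/x) u) \eta((m/x) u)$. Now

\[
\widehat{f}(n) = \int e(-un) f(u) \, du = \frac{x}{m} \int e\left( \left( \delta - \frac{xn}{m} \right) u \right) \eta(u) \, du = \frac{x}{m} \widehat{\eta}\left( \frac{x}{m} n - \delta \right).
\]

By assumption, $m \le M \le Q/2 \le x/2|\delta q|$, and so $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus, by (2.1) (with $k = 2$),

\[
\begin{aligned}
\sum_n \widehat{f}(n) &= \frac{x}{m} \left( \widehat{\eta}(-\delta) + \sum_{n \ne 0} \widehat{\eta}(nx/m - \delta) \right) \\
&= \frac{x}{m} \left( \widehat{\eta}(-\delta) + O^*\left( \sum_{n \ne 0} \frac{1}{\left( 2\pi \left( \frac{nx}{m} - \delta \right) \right)^2} \right) \cdot \left| \widehat{\eta''} \right|_\infty \right) \\
&= \frac{x}{m} \widehat{\eta}(-\delta) + \frac{m}{x} \frac{c_0}{(2\pi)^2} O^*\left( \max_{|r| \le \frac{1}{2}} \sum_{n \ne 0} \frac{1}{(n - r)^2} \right).
\end{aligned} \tag{4.12}
\]

Since $x \mapsto 1/x^2$ is convex on $\mathbb{R}^+$,

\[
\max_{|r| \le \frac{1}{2}} \sum_{n \ne 0} \frac{1}{(n - r)^2} = \sum_{n \ne 0} \frac{1}{\left( n - \frac{1}{2} \right)^2} = \pi^2 - 4.
\]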

        42 TYPE I ESTIMATES 59

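Therefore, the sum of all terms with $m \le M$ and $q \mid m$ is

$$\begin{aligned}
&\sum_{\substack{m \le M \\ q \mid m}} \frac{x}{m}\,\hat{\eta}(-\delta) + \sum_{\substack{m \le M \\ q \mid m}} \frac{m}{x}\,\frac{c_0}{(2\pi)^2}\,(\pi^2 - 4) \\
&\quad = \frac{x\mu(q)}{q} \cdot \hat{\eta}(-\delta) \cdot \sum_{\substack{m \le M/q \\ (m,q)=1}} \frac{\mu(m)}{m} + O^*\left(\mu(q)^2 c_0 \left(\frac{1}{4} - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right).
\end{aligned}$$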

We will bound $|\hat{\eta}(-\delta)|$ by (2.1).

As we have just seen, estimating the contribution of the terms with $m$ divisible by $q$ and not too large ($m \le M$) involves isolating a main term, estimating it carefully (with cancellation) and then bounding the remaining error terms.

We will now bound the contribution of all other $m$, that is, $m$ not divisible by $q$ and $m$ larger than $M$. Cancellation will now be used only within the inner sum; that is, we will bound each inner sum

$$T_m(\alpha) = \sum_n e(\alpha m n)\,\eta\left(\frac{mn}{x}\right),$$

and then we will carefully consider how to bound sums of $|T_m(\alpha)|$ over $m$ efficiently. By (2.2) and Lemma 2.3.1,

$$|T_m(\alpha)| \le \min\left(\frac{x}{m} + \frac{1}{2}|\eta'|_1,\ \frac{\frac{1}{2}|\eta'|_1}{|\sin(\pi m \alpha)|},\ \frac{m}{x}\,\frac{c_0}{4}\,\frac{1}{(\sin \pi m \alpha)^2}\right). \tag{4.13}$$

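For any $y_2 > y_1 > 0$ with $y_2 - y_1 \le q$ and $y_2 \le Q/2$, (4.13) gives us that

$$\sum_{\substack{y_1 < m \le y_2 \\ q \nmid m}} |T_m(\alpha)| \le \sum_{\substack{y_1 < m \le y_2 \\ q \nmid m}} \min\left(A,\ \frac{C}{(\sin \pi m \alpha)^2}\right) \tag{4.14}$$

for $A = (x/y_1)(1 + |\eta'|_1/(2(x/y_1)))$ and $C = (c_0/4)(y_2/x)$. We must now estimate the sum

$$\sum_{\substack{m \le M \\ q \nmid m}} |T_m(\alpha)| + \sum_{\frac{Q}{2} < m \le D} |T_m(\alpha)|. \tag{4.15}$$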

To bound the terms with $m \le M$, we can use Lemma 4.1.2. The question is then which one is smaller: the first or the second bound given by Lemma 4.1.2? A brief calculation gives that the second bound is smaller (and hence preferable) exactly when $\sqrt{C/A} > (3\pi/10q)(1 + \sqrt{13/3})$. Since $\sqrt{C/A} \sim (\sqrt{c_0}/2)\,m/x$, this means that it is sensible to prefer the second bound in Lemma 4.1.2 when $m > c_2 x/q$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$.

It thus makes sense to ask: does $Q/2 \le c_2 x/q$ (so that $m \le M$ implies $m \le c_2 x/q$)? This question divides our work into two basic cases.
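To make the threshold concrete, here is a small sketch that evaluates $c_2$ for a given $c_0$ and applies the rule of thumb $m > c_2 x/q$ stated above. The function names and the sample values are hypothetical and serve only as illustration; the rule itself is the heuristic from the text, based on $\sqrt{C/A} \sim (\sqrt{c_0}/2)\,m/x$.

```python
import math

def c2_threshold(c0):
    """c2 = (3*pi / (5*sqrt(c0))) * (1 + sqrt(13/3)), as defined in the text."""
    return (3 * math.pi / (5 * math.sqrt(c0))) * (1 + math.sqrt(13 / 3))

def prefers_second_bound(m, x, q, c0):
    """Rule of thumb from the text: prefer the second bound when m > c2*x/q."""
    return m > c2_threshold(c0) * x / q

# Sample value c0 = 1 (an arbitrary choice, not taken from the text).
print(c2_threshold(1.0))
print(prefers_second_bound(100, 100.0, 10, 1.0))
```

Since $c_2$ scales like $1/\sqrt{c_0}$, a larger bound $c_0$ on $|\widehat{\eta''}|_\infty$ lowers the point at which the second bound takes over.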


Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$. Then $Q/2 \le c_2 x/q$; this will induce us to bound the first sum in (4.15) by the first bound in Lemma 4.1.2.

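Recall that $M = \min(Q/2, D)$, and so $M \le c_2 x/q$. By (4.14) and Lemma 4.1.2,

$$\begin{aligned}
\sum_{\substack{1 \le m \le M \\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=0}^{\infty} \sum_{\substack{jq < m \le \min((j+1)q, M) \\ q \nmid m}} \min\left(\frac{x}{jq+1} + \frac{|\eta'|_1}{2},\ \frac{\frac{c_0}{4}\frac{(j+1)q}{x}}{(\sin \pi m \alpha)^2}\right) \\
&\le \frac{20}{3\pi^2}\,\frac{c_0 q^3}{4x} \sum_{0 \le j \le \frac{M}{q}} (j+1) \le \frac{20}{3\pi^2}\,\frac{c_0 q^3}{4x} \cdot \left(\frac{1}{2}\frac{M^2}{q^2} + \frac{3}{2}\frac{c_2 x}{q^2} + 1\right) \\
&\le \frac{5 c_0 c_2}{6\pi^2} M + \frac{5 c_0 q}{3\pi^2}\left(\frac{3}{2} c_2 + \frac{q^2}{x}\right) \le \frac{5 c_0 c_2}{6\pi^2} M + \frac{35 c_0 c_2}{6\pi^2} q,
\end{aligned} \tag{4.16}$$

where, to bound the smaller terms, we are using the inequality $Q/2 \le c_2 x/q$, and where we are also using the observation that, since $|\delta/x| \le 1/qQ_0$, the assumption $|\delta| \ge 1/2c_2$ implies that $q \le 2c_2 x/Q_0$; moreover, since $q \le Q_0$, this gives us that $q^2 \le 2c_2 x$. In the main term, we are bounding $qM^2/x$ from above by $M \cdot qQ/2x \le M/2\delta \le c_2 M$.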
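The last two steps of (4.16) are pure arithmetic once $M \le c_2 x/q$ and $q^2 \le 2c_2 x$ are granted. The sketch below checks that step on random parameters satisfying exactly those two constraints. It is an illustration under those assumptions only: the sampling ranges, helper names and tolerance are arbitrary choices, and nothing here verifies the appeal to Lemma 4.1.2.

```python
import math
import random

def middle_416(c0, c2, x, q, M):
    """(20/(3 pi^2)) (c0 q^3/(4x)) (M^2/(2 q^2) + (3/2) c2 x/q^2 + 1)."""
    k = 20 / (3 * math.pi ** 2) * c0 * q ** 3 / (4 * x)
    return k * (M ** 2 / (2 * q ** 2) + 1.5 * c2 * x / q ** 2 + 1)

def final_416(c0, c2, q, M):
    """Last expression in (4.16)."""
    return (5 * c0 * c2 / (6 * math.pi ** 2)) * M + (35 * c0 * c2 / (6 * math.pi ** 2)) * q

def random_case(rng):
    """Draw parameters with q^2 <= 2 c2 x and M <= c2 x / q."""
    c0 = rng.uniform(0.1, 50.0)
    c2 = rng.uniform(0.1, 10.0)
    x = rng.uniform(1e3, 1e9)
    q = rng.uniform(1.0, math.sqrt(2 * c2 * x))
    M = rng.uniform(0.0, c2 * x / q)
    return c0, c2, x, q, M

rng = random.Random(0)
ok = all(
    middle_416(c0, c2, x, q, M) <= final_416(c0, c2, q, M) * (1 + 1e-12)
    for c0, c2, x, q, M in (random_case(rng) for _ in range(10_000))
)
print(ok)
```

The constant $35/6$ arises as $15/6 + 20/6$: the first part comes from the term $\frac{3}{2} c_2 x/q^2$ and the second from bounding $q^2/x$ by $2c_2$.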

If $D \le (Q+1)/2$, then $M \ge \lfloor D \rfloor$, and so (4.16) is all we need: the second sum in (4.15) is empty. Assume from now on that $D > (Q+1)/2$. The first sum in (4.15) is then bounded by (4.16) (with $M = Q/2$). To bound the second sum in (4.15), we will use the approximation $a'/q'$ instead of $a/q$. The motivation is the following: if we used the approximation $a/q$ even for $m > Q/2$, the contribution of the terms with $q \mid m$ would be too large. When we use $a'/q'$, the contribution of the terms with $q' \mid m$ (or $m \equiv \pm 1 \bmod q'$) is very small: only a fraction $1/q'$ (tiny, since $q'$ is large) of all terms are like that, and their individual contribution is always small, precisely because $m > Q/2$.

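By (4.14) (without the restriction $q \nmid m$ on either side) and Lemma 4.1.1,

$$\begin{aligned}
\sum_{Q/2 < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty} \sum_{jq' + \frac{Q}{2} < m \le \min((j+1)q' + Q/2,\, D)} |T_m(\alpha)| \\
&\le \sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \left(3 c_1 \frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1 c_0}{4}\,\frac{x}{jq' + (Q+1)/2}\,\frac{(j+1)q' + Q/2}{x}}\right) \\
&\le \sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \left(3 c_1 \frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1 c_0}{4}\left(1 + \frac{q'}{jq' + (Q+1)/2}\right)}\right),
\end{aligned}$$

where we recall that $c_1 = 1 + |\eta'|_1/(2x/D)$. Since $q' \ge (\epsilon/(1+\epsilon))Q$,

$$\sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \frac{x}{jq' + \frac{Q+1}{2}} \le \frac{x}{Q/2} + \frac{x}{q'} \int_{\frac{Q+1}{2}}^{D} \frac{1}{t}\,dt \le \frac{2x}{Q} + \frac{(1+\epsilon)x}{\epsilon Q} \log^+ \frac{D}{\frac{Q+1}{2}}. \tag{4.17}$$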


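Recall now that $q' \le (1+\epsilon)Q + 1 \le (1+\epsilon)(Q+1)$. Therefore,

$$\begin{aligned}
q' \sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} &\le q' \sqrt{1 + \frac{(1+\epsilon)Q + 1}{(Q+1)/2}} + \int_{\frac{Q+1}{2}}^{D} \sqrt{1 + \frac{q'}{t}}\,dt \\
&\le q' \sqrt{3 + 2\epsilon} + \left(D - \frac{Q+1}{2}\right) + \frac{q'}{2} \log^+ \frac{D}{\frac{Q+1}{2}}.
\end{aligned} \tag{4.18}$$

We conclude that $\sum_{Q/2 < m \le D} |T_m(\alpha)|$ is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi}\left(D + \left((1+\epsilon)\sqrt{3+2\epsilon} - \frac{1}{2}\right)(Q+1) + \frac{(1+\epsilon)Q + 1}{2} \log^+ \frac{D}{\frac{Q+1}{2}}\right) \\
&\quad + 3 c_1 \left(2 + \frac{(1+\epsilon)}{\epsilon} \log^+ \frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q}.
\end{aligned} \tag{4.19}$$

We add this to (4.16) (with $M = Q/2$), and obtain that (4.15) is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi}\left(D + (1+\epsilon)(Q+1)\left(\varpi_\epsilon + \frac{1}{2} \log^+ \frac{D}{\frac{Q+1}{2}}\right)\right) \\
&\quad + 3 c_1 \left(2 + \frac{(1+\epsilon)}{\epsilon} \log^+ \frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q} + \frac{35 c_0 c_2}{6\pi^2}\, q,
\end{aligned} \tag{4.20}$$

where we are bounding

$$\frac{5 c_0 c_2}{6\pi^2} = \frac{5 c_0}{6\pi^2}\,\frac{3\pi}{5\sqrt{c_0}}\left(1 + \sqrt{\frac{13}{3}}\right) = \frac{\sqrt{c_0}}{2\pi}\left(1 + \sqrt{\frac{13}{3}}\right) \le \frac{2\sqrt{c_0 c_1}}{\pi} \cdot \frac{1}{4}\left(1 + \sqrt{\frac{13}{3}}\right) \tag{4.21}$$

and defining

$$\varpi_\epsilon = \sqrt{3 + 2\epsilon} + \left(\frac{1}{4}\left(1 + \sqrt{\frac{13}{3}}\right) - 1\right)\frac{1}{2(1+\epsilon)}. \tag{4.22}$$

(Note that $\varpi_\epsilon < \sqrt{3}$ for $\epsilon < 0.1741$.) A quick check against (4.16) shows that (4.20) is valid also when $D \le Q/2$, even when $Q + 1$ is replaced by $\min(Q+1, 2D)$. We bound $Q$ from above by $x/|\delta|q$ and $\log^+ D/((Q+1)/2)$ by $\log^+ 2D/(x/|\delta|q + 1)$, and obtain the result.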
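The remark that $\varpi_\epsilon < \sqrt{3}$ for $\epsilon < 0.1741$ is easy to check by direct evaluation of (4.22). The sketch below does so in floating point; the function name `varpi` and the sample values of $\epsilon$ are illustrative choices.

```python
import math

def varpi(eps):
    """The quantity defined in (4.22)."""
    a = 0.25 * (1 + math.sqrt(13 / 3)) - 1  # negative constant
    return math.sqrt(3 + 2 * eps) + a / (2 * (1 + eps))

# Print the value and whether it lies below sqrt(3) for a few sample eps.
for eps in (0.05, 0.1, 0.1741, 0.18):
    print(eps, varpi(eps), varpi(eps) < math.sqrt(3))
```

Both summands of `varpi` are increasing in `eps`, so checking the endpoint $\epsilon = 0.1741$ covers all smaller values.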

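Case (b). $|\delta|$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. Then $\min(c_2 x/q, D) \le Q/2$. We start by bounding the first $q/2$ terms in (4.15) by (4.13) and Lemma 4.1.3:

$$\sum_{m \le q/2} |T_m(\alpha)| \le \sum_{m \le q/2} \min\left(\frac{\frac{1}{2}|\eta'|_1}{|\sin(\pi m \alpha)|},\ \frac{c_0 q/8x}{|\sin(\pi m \alpha)|^2}\right) \le \frac{|\eta'|_1}{\pi}\, q \max\left(2,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right). \tag{4.23}$$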


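If $q^2 < 2c_2 x$, we estimate the terms with $q/2 < m \le c_2 x/q$ by Lemma 4.1.2, which is applicable because $\min(c_2 x/q, D) < Q/2$:

$$\begin{aligned}
\sum_{\substack{q/2 < m \le D' \\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=1}^{\infty} \sum_{\substack{\left(j - \frac{1}{2}\right)q < m \le \left(j + \frac{1}{2}\right)q \\ m \le \min\left(\frac{c_2 x}{q}, D\right),\ q \nmid m}} \min\left(\frac{x}{\left(j - \frac{1}{2}\right)q} + \frac{|\eta'|_1}{2},\ \frac{\frac{c_0}{4}\frac{(j+1/2)q}{x}}{(\sin \pi m \alpha)^2}\right) \\
&\le \frac{20}{3\pi^2}\,\frac{c_0 q^3}{4x} \sum_{1 \le j \le \frac{D'}{q} + \frac{1}{2}} \left(j + \frac{1}{2}\right) \le \frac{20}{3\pi^2}\,\frac{c_0 q^3}{4x}\left(\frac{c_2 x}{2q^2}\frac{D'}{q} + \frac{3}{2}\left(\frac{c_2 x}{q^2}\right) + \frac{5}{8}\right) \\
&\le \frac{5 c_0}{6\pi^2}\left(c_2 D' + 3 c_2 q + \frac{5}{4}\frac{q^3}{x}\right) \le \frac{5 c_0 c_2}{6\pi^2}\left(D' + \frac{11}{2} q\right),
\end{aligned} \tag{4.24}$$

where we write $D' = \min(c_2 x/q, D)$. If $c_2 x/q \ge D$, we stop here. Assume that $c_2 x/q < D$. Let $R = \max(c_2 x/q, q/2)$. The terms we have already estimated are precisely those with $m \le R$. We bound the terms $R < m \le D$ by the second bound in Lemma 4.1.1:

$$\begin{aligned}
\sum_{R < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty} \sum_{\substack{m > jq + R \\ m \le \min((j+1)q + R,\, D)}} \min\left(\frac{c_1 x}{jq + R},\ \frac{\frac{c_0}{4}\frac{(j+1)q + R}{x}}{(\sin \pi m \alpha)^2}\right) \\
&\le \sum_{j=0}^{\left\lfloor \frac{1}{q}(D - R) \right\rfloor} \frac{3 c_1 x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1 c_0}{4}\left(1 + \frac{q}{jq + R}\right)}.
\end{aligned} \tag{4.25}$$

(Note there is no need to use two successive approximations $a/q$, $a'/q'$ as in case (a). We are also including all terms with $m$ divisible by $q$, as we may, since $|T_m(\alpha)|$ is non-negative.) Now, much as before,

$$\sum_{j=0}^{\left\lfloor \frac{1}{q}(D - R) \right\rfloor} \frac{x}{jq + R} \le \frac{x}{R} + \frac{x}{q} \int_R^D \frac{1}{t}\,dt \le \min\left(\frac{q}{c_2},\ \frac{2x}{q}\right) + \frac{x}{q} \log^+ \frac{D}{c_2 x/q}, \tag{4.26}$$

and

$$\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D - R) \right\rfloor} \sqrt{1 + \frac{q}{jq + R}} &\le \sqrt{1 + \frac{q}{R}} + \frac{1}{q} \int_R^D \sqrt{1 + \frac{q}{t}}\,dt \\
&\le \sqrt{3} + \frac{D - R}{q} + \frac{1}{2} \log^+ \frac{D}{q/2}.
\end{aligned} \tag{4.27}$$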

We sum with (4.23) and (4.24), and we obtain that (4.15) is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi}\left(\sqrt{3}\,q + D + \frac{q}{2} \log^+ \frac{D}{q/2}\right) + \left(3 c_1 \log^+ \frac{D}{c_2 x/q}\right)\frac{x}{q} \\
&\quad + 3 c_1 \min\left(\frac{q}{c_2},\ \frac{2x}{q}\right) + \frac{55 c_0 c_2}{12\pi^2}\, q + \frac{|\eta'|_1}{\pi}\, q \cdot \max\left(2,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right),
\end{aligned} \tag{4.28}$$

where we are using the fact that $5 c_0 c_2/6\pi^2 < 2\sqrt{c_0 c_1}/\pi$ to make sure that the term $(5 c_0 c_2/6\pi^2) D'$ from (4.24) is more than compensated by the term $-2\sqrt{c_0 c_1} R/\pi$ coming from $-R/q$ in (4.27) (by the definition of $D'$ and $R$, we have $R \ge D'$). We can also use $5 c_0 c_2/6\pi^2 < 2\sqrt{c_0 c_1}/\pi$ to bound the term $(5 c_0 c_2/6\pi^2) D'$ from (4.24) by the term $2\sqrt{c_0 c_1} D/\pi$ in (4.28), in case $c_2 x/q \ge D$. (Again by definition, $D' \le D$.) Thus, (4.28) is valid both when $c_2 x/q < D$ and when $c_2 x/q \ge D$.

4.2.1 Type I variations

We will need a version of Lemma 4.2.1 with $m$ and $n$ restricted to the odd numbers. (We will barely be using the restriction of $m$, whereas the restriction on $n$ is both (a) slightly harder to deal with, (b) something that can be turned to our advantage.)

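Lemma 4.2.2. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of

$$\sum_{\substack{m \le D \\ m \text{ odd}}} \mu(m) \sum_{n \text{ odd}} e(\alpha m n)\,\eta\left(\frac{mn}{x}\right) \tag{4.29}$$

is at most

$$\frac{x}{2q} \min\left(1,\ \frac{c_0}{(\pi\delta)^2}\right) \left|\sum_{\substack{m \le \frac{M}{q} \\ (m, 2q) = 1}} \frac{\mu(m)}{m}\right| + O^*\left(\frac{c_0 q}{x}\left(\frac{1}{8} - \frac{1}{2\pi^2}\right)\left(\frac{D}{q} + 1\right)^2\right) \tag{4.30}$$

plus

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D + \frac{3 c_1}{2}\,\frac{x}{q} \log^+ \frac{D}{c_2 x/q} + \frac{\sqrt{c_0 c_1}}{\pi}\, q \log^+ \frac{D}{q/2} \\
&\quad + \frac{2|\eta'|_1}{\pi}\, q \cdot \max\left(1,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right) + \left(\frac{2\sqrt{3 c_0 c_1}}{\pi} + \frac{3 c_1}{2 c_2} + \frac{55 c_0 c_2}{6\pi^2}\right) q,
\end{aligned} \tag{4.31}$$

where $c_1 = 1 + |\eta'|_1/(x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.30) plus

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi}\left(D + (1+\epsilon) \min\left(\left\lfloor \frac{x}{|\delta| q} \right\rfloor + 1,\ 2D\right)\left(\sqrt{3 + 2\epsilon} + \frac{1}{2} \log^+ \frac{2D}{\frac{x}{|\delta| q}}\right)\right) \\
&\quad + \frac{3}{2} c_1 \left(2 + \frac{(1+\epsilon)}{\epsilon} \log^+ \frac{2D}{\frac{x}{|\delta| q}}\right)\frac{x}{Q_0} + \frac{35 c_0 c_2}{3\pi^2}\, q,
\end{aligned} \tag{4.32}$$

for $\epsilon \in (0, 1]$ arbitrary.

If $q$ is even, the sum (4.30) can be replaced by $0$.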

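Proof. The proof is almost exactly that of Lemma 4.2.1; we go over the differences. The parameters $Q$, $Q'$, $a'$, $q'$ and $M$ are defined just as before (with $2\alpha$ wherever we had $\alpha$).

Let us first consider $m \le M$ odd and divisible by $q$. (Of course, this case arises only if $q$ is odd.) For $n = 2r + 1$,

$$\begin{aligned}
e(\alpha m n) &= e(\alpha m (2r+1)) = e(2\alpha r m)\, e(\alpha m) \\
&= e\left(\frac{\delta}{x} r m\right) e\left(\left(\frac{a}{2q} + \frac{\delta}{2x} + \frac{\kappa}{2}\right) m\right) \\
&= e\left(\frac{\delta(2r+1)}{2x} m\right) e\left(\frac{a + \kappa q}{2}\,\frac{m}{q}\right) = \kappa'\, e\left(\frac{\delta(2r+1)}{2x} m\right),
\end{aligned}$$

where $\kappa \in \{0, 1\}$ and $\kappa' = e((a + \kappa q)/2) \in \{-1, 1\}$ are independent of $m$ and $n$. Hence, by Poisson summation,

$$\begin{aligned}
\sum_{n \text{ odd}} e(\alpha m n)\,\eta(mn/x) &= \kappa' \sum_{n \text{ odd}} e((\delta m/2x) n)\,\eta(mn/x) \\
&= \frac{\kappa'}{2}\left(\sum_n \hat{f}(n) - \sum_n \hat{f}(n + 1/2)\right),
\end{aligned} \tag{4.33}$$

where $f(u) = e((\delta m/2x)u)\,\eta((m/x)u)$. Now

$$\hat{f}(t) = \frac{x}{m}\,\hat{\eta}\left(\frac{x}{m} t - \frac{\delta}{2}\right).$$

Just as before, $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus

$$\begin{aligned}
\frac{1}{2}\left|\sum_n \hat{f}(n) - \sum_n \hat{f}(n + 1/2)\right| &\le \frac{x}{m}\left(\frac{1}{2}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{1}{2} \sum_{n \ne 0} \left|\hat{\eta}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right) \\
&= \frac{x}{m}\left(\frac{1}{2}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{1}{2} \cdot O^*\left(\sum_{n \ne 0} \frac{1}{\left(\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right) \cdot \left|\widehat{\eta''}\right|_\infty\right) \\
&= \frac{x}{2m}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{m}{x}\,\frac{c_0}{2\pi^2}\,(\pi^2 - 4).
\end{aligned} \tag{4.34}$$

The contribution of the second term in the last line of (4.34) is

$$\begin{aligned}
\sum_{\substack{m \le M \\ m \text{ odd} \\ q \mid m}} \frac{m}{x}\,\frac{c_0}{2\pi^2}\,(\pi^2 - 4) &= \frac{q}{x}\,\frac{c_0}{2\pi^2}\,(\pi^2 - 4) \cdot \sum_{\substack{m \le M/q \\ m \text{ odd}}} m \\
&= \frac{q c_0}{x}\left(\frac{1}{8} - \frac{1}{2\pi^2}\right)\left(\frac{M}{q} + 1\right)^2.
\end{aligned}$$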


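Hence, the absolute value of the sum of all terms with $m \le M$ and $q \mid m$ is given by (4.30).

We define $T_{m,\circ}(\alpha)$ by

$$T_{m,\circ}(\alpha) = \sum_{n \text{ odd}} e(\alpha m n)\,\eta\left(\frac{mn}{x}\right). \tag{4.35}$$

Changing variables by $n = 2r + 1$, we see that

$$|T_{m,\circ}(\alpha)| = \left|\sum_r e(2\alpha \cdot m r)\,\eta(m(2r+1)/x)\right|.$$

Hence, instead of (4.13), we get that

$$|T_{m,\circ}(\alpha)| \le \min\left(\frac{x}{2m} + \frac{1}{2}|\eta'|_1,\ \frac{\frac{1}{2}|\eta'|_1}{|\sin(2\pi m \alpha)|},\ \frac{m}{x}\,\frac{c_0}{2}\,\frac{1}{(\sin 2\pi m \alpha)^2}\right). \tag{4.36}$$

We obtain (4.14), but with $T_{m,\circ}$ instead of $T_m$, $A = (x/2y_1)(1 + |\eta'|_1/(x/y_1))$ and $C = (c_0/2)(y_2/x)$, and so $c_1 = 1 + |\eta'|_1/(x/D)$.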

The rest of the proof of Lemma 4.2.1 carries over almost word-by-word. (For the sake of simplicity, we do not really try to take advantage of the odd support of $m$ here.) Since $C$ has doubled, it would seem to make sense to reset the value of $c_2$ to be $c_2 = (3\pi/5\sqrt{2c_0})(1 + \sqrt{13/3})$; this would cause complications related to the fact that $5 c_0 c_2/3\pi^2$ would become larger than $2\sqrt{c_0}/\pi$, and so we set $c_2$ to the slightly smaller value $c_2 = 6\pi/5\sqrt{c_0}$ instead. This implies

$$\frac{5 c_0 c_2}{3\pi^2} = \frac{2\sqrt{c_0}}{\pi}. \tag{4.37}$$
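The two candidate values of $c_2$ discussed above can be compared directly. The sketch below evaluates both for a sample $c_0$ and checks (4.37); the function names and the value of `c0` are arbitrary illustrative choices.

```python
import math

def c2_rescaled(c0):
    """Candidate value (3 pi / (5 sqrt(2 c0))) (1 + sqrt(13/3)) mentioned in the text."""
    return 3 * math.pi / (5 * math.sqrt(2 * c0)) * (1 + math.sqrt(13 / 3))

def c2_chosen(c0):
    """Value adopted in the text: c2 = 6 pi / (5 sqrt(c0))."""
    return 6 * math.pi / (5 * math.sqrt(c0))

def lhs_437(c0, c2):
    """Left side of (4.37): 5 c0 c2 / (3 pi^2)."""
    return 5 * c0 * c2 / (3 * math.pi ** 2)

c0 = 2.5  # arbitrary sample value, not from the text
print(lhs_437(c0, c2_chosen(c0)), 2 * math.sqrt(c0) / math.pi)
print(c2_chosen(c0) < c2_rescaled(c0))
```

The ratio of the two candidates is $(1 + \sqrt{13/3})/(2\sqrt{2}) \approx 1.09$, independent of $c_0$, which is why the adopted value is described as only slightly smaller.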

The bound from (4.16) gets multiplied by 2 (but the value of $c_2$ has changed), the second line in (4.19) gets halved, (4.21) gets replaced by (4.37), the second term in the maximum in the second line of (4.23) gets doubled, the bound from (4.24) gets doubled, and the bound from (4.26) gets halved.

We will also need a version of Lemma 4.2.1 (or rather Lemma 4.2.2; we will decide to work with the restriction that $n$ and $m$ be odd) with a factor of $(\log n)$ within the inner sum. This is the sum $S_{I,1}$ in (3.9).

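Lemma 4.2.3. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge \max(16, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$. Assume that, for any $\rho \ge \rho_0$, $\rho_0$ a constant, the function $\eta_{(\rho)}(t) = \log(\rho t)\,\eta(t)$ satisfies

$$|\eta_{(\rho)}|_1 \le \log(\rho)\,|\eta|_1, \qquad |\eta'_{(\rho)}|_1 \le \log(\rho)\,|\eta'|_1, \qquad |\widehat{\eta''_{(\rho)}}|_\infty \le c_0 \log(\rho). \tag{4.38}$$

Let $\sqrt{3} \le D \le \min(x/\rho_0, x/e)$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of

$$\sum_{\substack{m \le D \\ m \text{ odd}}} \mu(m) \sum_{\substack{n \\ n \text{ odd}}} (\log n)\, e(\alpha m n)\,\eta\left(\frac{mn}{x}\right) \tag{4.39}$$

is at most

$$\begin{aligned}
&\frac{x}{q} \min\left(1,\ \frac{c_0/\delta^2}{(2\pi)^2}\right) \left|\sum_{\substack{m \le \frac{M}{q} \\ (m,q)=1}} \frac{\mu(m)}{m} \log \frac{x}{mq}\right| + \frac{x}{q}\,\left|\widehat{\log \cdot\, \eta}(-\delta)\right| \left|\sum_{\substack{m \le \frac{M}{q} \\ (m,q)=1}} \frac{\mu(m)}{m}\right| \\
&\quad + O^*\left(c_0 \left(\frac{1}{2} - \frac{2}{\pi^2}\right)\left(\frac{D^2}{4qx} \log \frac{e^{1/2} x}{D} + \frac{1}{e}\right)\right)
\end{aligned} \tag{4.40}$$

plus

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D \log \frac{ex}{D} + \frac{3 c_1}{2}\,\frac{x}{q} \log^+ \frac{D}{c_2 x/q} \log \frac{q}{c_2} \\
&\quad + \left(\frac{2|\eta'|_1}{\pi} \max\left(1,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right) \log x + \frac{2\sqrt{c_0 c_1}}{\pi}\left(\sqrt{3} + \frac{1}{2} \log^+ \frac{D}{q/2}\right) \log \frac{q}{c_2}\right) q \\
&\quad + \frac{3 c_1}{2} \sqrt{\frac{2x}{c_2}} \log \frac{2x}{c_2} + \frac{20 c_0 c_2^{3/2}}{3\pi^2} \sqrt{2x} \log \frac{2\sqrt{e}\,x}{c_2}
\end{aligned} \tag{4.41}$$

for $c_1 = 1 + |\eta'|_1/(x/D)$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.39) is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D \log \frac{ex}{D} \\
&\quad + \frac{2\sqrt{c_0 c_1}}{\pi} (1+\epsilon)\left(\frac{x}{|\delta| q} + 1\right)\left(\sqrt{3 + 2\epsilon} \cdot \log^+ 2\sqrt{e}\,|\delta| q + \frac{1}{2} \log^+ \frac{2D}{\frac{x}{|\delta| q}} \log^+ 2|\delta| q\right) \\
&\quad + \left(\frac{3 c_1}{4}\left(\frac{2}{\sqrt{5}} + \frac{1+\epsilon}{2\epsilon} \log x\right) + \frac{40}{3}\sqrt{2}\, c_0 c_2^{3/2}\right) \sqrt{x} \log x
\end{aligned} \tag{4.42}$$

for $\epsilon \in (0, 1]$.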

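Proof. Define $Q$, $Q'$, $M$, $a'$ and $q'$ as in the proof of Lemma 4.2.1. The same method of proof works as for Lemma 4.2.1; we go over the differences. When applying Poisson summation or (2.2), use $\eta_{(x/m)}(t) = (\log xt/m)\,\eta(t)$ instead of $\eta(t)$. Then use the bounds in (4.38) with $\rho = x/m$; in particular,

$$|\widehat{\eta''_{(x/m)}}|_\infty \le c_0 \log \frac{x}{m}.$$

For $f(u) = e((\delta m/2x)u)\,(\log u)\,\eta((m/x)u)$,

$$\hat{f}(t) = \frac{x}{m}\,\widehat{\eta_{(x/m)}}\left(\frac{x}{m} t - \frac{\delta}{2}\right),$$

and so

$$\begin{aligned}
\frac{1}{2} \sum_n \left|\hat{f}(n/2)\right| &\le \frac{x}{m}\left(\frac{1}{2}\left|\widehat{\eta_{(x/m)}}\left(-\frac{\delta}{2}\right)\right| + \frac{1}{2} \sum_{n \ne 0} \left|\hat{\eta}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right) \\
&= \frac{1}{2}\,\frac{x}{m}\left(\widehat{\log \cdot\, \eta}\left(-\frac{\delta}{2}\right) + \log\left(\frac{x}{m}\right) \hat{\eta}\left(-\frac{\delta}{2}\right)\right) + \frac{m}{x}\left(\log \frac{x}{m}\right) \frac{c_0}{2\pi^2}\,(\pi^2 - 4).
\end{aligned}$$

The part of the main term involving $\log(x/m)$ becomes

$$\frac{x \hat{\eta}(-\delta)}{2} \sum_{\substack{m \le M \\ m \text{ odd} \\ q \mid m}} \frac{\mu(m)}{m} \log\left(\frac{x}{m}\right) = \frac{x\mu(q)}{q}\,\hat{\eta}(-\delta) \cdot \sum_{\substack{m \le M/q \\ (m, 2q) = 1}} \frac{\mu(m)}{m} \log\left(\frac{x}{mq}\right)$$

for $q$ odd. (We can see that this, like the rest of the main term, vanishes for $m$ even.) In the term in front of $\pi^2 - 4$, we find the sum

$$\begin{aligned}
\sum_{\substack{m \le M \\ m \text{ odd} \\ q \mid m}} \frac{m}{x} \log\left(\frac{x}{m}\right) &\le \frac{M}{x} \log \frac{x}{M} + \frac{q}{2} \int_0^{M/q} t \log \frac{x/q}{t}\,dt \\
&= \frac{M}{x} \log \frac{x}{M} + \frac{M^2}{4qx} \log \frac{e^{1/2} x}{M},
\end{aligned}$$

where we use the fact that $t \mapsto t \log(x/t)$ is increasing for $t \le x/e$. By the same fact (and by $M \le D$), $(M^2/q) \log(e^{1/2} x/M) \le (D^2/q) \log(e^{1/2} x/D)$. It is also easy to see that $(M/x) \log(x/M) \le 1/e$ (since $M \le D \le x$).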
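The closed form behind the last display is the elementary identity $\int_0^y t \log(c/t)\,dt = \frac{y^2}{2} \log(\sqrt{e}\,c/y)$, applied with $y = M/q$ and $c = x/q$. The sketch below compares it against a midpoint-rule approximation. It is an illustrative check only: the helper names, the step count and the sample values of `y` and `c` are arbitrary choices.

```python
import math

def midpoint_integral(f, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def closed_form(y, c):
    """(y^2 / 2) * log(sqrt(e) * c / y), the value of int_0^y t log(c/t) dt."""
    return 0.5 * y ** 2 * math.log(math.sqrt(math.e) * c / y)

y, c = 3.0, 40.0  # sample values with y <= c/e
numeric = midpoint_integral(lambda t: t * math.log(c / t), 0.0, y)
print(numeric, closed_form(y, c))
```

The midpoint rule never evaluates the integrand at $t = 0$, so the logarithm causes no difficulty at the left endpoint.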

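The basic estimate for the rest of the proof (replacing (4.13)) is

$$\begin{aligned}
T_{m,\circ}(\alpha) &= \sum_{n \text{ odd}} e(\alpha m n)\,(\log n)\,\eta\left(\frac{mn}{x}\right) = \sum_{n \text{ odd}} e(\alpha m n)\,\eta_{(x/m)}\left(\frac{mn}{x}\right) \\
&= O^*\left(\min\left(\frac{x}{2m}|\eta_{(x/m)}|_1 + \frac{|\eta'_{(x/m)}|_1}{2},\ \frac{\frac{1}{2}|\eta'_{(x/m)}|_1}{|\sin(2\pi m \alpha)|},\ \frac{m}{x}\,\frac{\frac{1}{2}|\widehat{\eta''_{(x/m)}}|_\infty}{(\sin 2\pi m \alpha)^2}\right)\right) \\
&= O^*\left(\log \frac{x}{m} \cdot \min\left(\frac{x}{2m} + \frac{|\eta'|_1}{2},\ \frac{\frac{1}{2}|\eta'|_1}{|\sin(2\pi m \alpha)|},\ \frac{m}{x}\,\frac{c_0}{2}\,\frac{1}{(\sin 2\pi m \alpha)^2}\right)\right).
\end{aligned}$$

We wish to bound

$$\sum_{\substack{m \le M \\ q \nmid m \\ m \text{ odd}}} |T_{m,\circ}(\alpha)| + \sum_{\frac{Q}{2} < m \le D} |T_{m,\circ}(\alpha)|. \tag{4.43}$$

Just as in the proofs of Lemmas 4.2.1 and 4.2.2, we give two bounds, one valid for $|\delta|$ large ($|\delta| \ge 1/2c_2$) and the other for $\delta$ small ($|\delta| \le 1/2c_2$). Again as in the proof of Lemma 4.2.2, we ignore the condition that $m$ is odd in (4.15).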


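Consider the case of $|\delta|$ large first. Instead of (4.16), we have

$$\sum_{\substack{1 \le m \le M \\ q \nmid m}} |T_{m,\circ}(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0 q^3}{2x} \sum_{0 \le j \le \frac{M}{q}} (j+1) \log \frac{x}{jq + 1}. \tag{4.44}$$

Since

$$\begin{aligned}
\sum_{0 \le j \le \frac{M}{q}} (j+1) \log \frac{x}{jq+1} &\le \log x + \frac{M}{q} \log \frac{x}{M} + \sum_{1 \le j \le \frac{M}{q}} \log \frac{x}{jq} + \sum_{1 \le j \le \frac{M}{q} - 1} j \log \frac{x}{jq} \\
&\le \log x + \frac{M}{q} \log \frac{x}{M} + \int_0^{M/q} \log \frac{x}{tq}\,dt + \int_1^{M/q} t \log \frac{x}{tq}\,dt \\
&\le \log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right) \log \frac{e^{1/2} x}{M},
\end{aligned}$$

this means that

$$\begin{aligned}
\sum_{\substack{1 \le m \le M \\ q \nmid m}} |T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\,\frac{c_0 q^3}{4x}\left(\log x + \left(\frac{2M}{q} + \frac{M^2}{2q^2}\right) \log \frac{e^{1/2} x}{M}\right) \\
&\le \frac{5 c_0 c_2}{3\pi^2} M \log \frac{\sqrt{e}\,x}{M} + \frac{40}{3}\sqrt{2}\, c_0 c_2^{3/2} \sqrt{x} \log x,
\end{aligned} \tag{4.45}$$

where we are using the bounds $M \le Q/2 \le c_2 x/q$ and $q^2 \le 2c_2 x$ (just as in (4.16)). Instead of (4.17), we have

$$\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \left(\log \frac{x}{jq' + \frac{Q+1}{2}}\right) \frac{x}{jq' + \frac{Q+1}{2}} &\le \frac{x}{Q/2} \log \frac{2x}{Q} + \frac{x}{q'} \int_{\frac{Q+1}{2}}^{D} \log \frac{x}{t}\,\frac{dt}{t} \\
&\le \frac{2x}{Q} \log \frac{2x}{Q} + \frac{x}{q'} \log \frac{2x}{Q} \log^+ \frac{2D}{Q};
\end{aligned}$$

recall that the coefficient in front of this sum will be halved by the condition that $n$ is odd. Instead of (4.18), we obtain

$$\begin{aligned}
&q' \sum_{j=0}^{\left\lfloor \frac{D - (Q+1)/2}{q'} \right\rfloor} \sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} \left(\log \frac{x}{jq' + \frac{Q+1}{2}}\right) \\
&\quad \le q' \sqrt{3 + 2\epsilon} \cdot \log \frac{2x}{Q+1} + \int_{\frac{Q+1}{2}}^{D} \left(1 + \frac{q'}{2t}\right)\left(\log \frac{x}{t}\right) dt \\
&\quad \le q' \sqrt{3 + 2\epsilon} \cdot \log \frac{2x}{Q+1} + D \log \frac{ex}{D} - \frac{Q+1}{2} \log \frac{2ex}{Q+1} + \frac{q'}{2} \log \frac{2x}{Q+1} \log \frac{2D}{Q+1}.
\end{aligned}$$

(The bound $\int_a^b \log(x/t)\,dt/t \le \log(x/a) \log(b/a)$ will be more practical than the exact expression for the integral.) Hence $\sum_{Q/2 < m \le D} |T_{m,\circ}(\alpha)|$ is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D \log \frac{ex}{D} + \frac{2\sqrt{c_0 c_1}}{\pi}\left((1+\epsilon)\sqrt{3 + 2\epsilon} + \frac{(1+\epsilon)}{2} \log \frac{2D}{Q+1}\right)(Q+1) \log \frac{2x}{Q+1} \\
&\quad - \frac{2\sqrt{c_0 c_1}}{\pi} \cdot \frac{Q+1}{2} \log \frac{2ex}{Q+1} + \frac{3 c_1}{2}\left(\frac{2}{\sqrt{5}} + \frac{1+\epsilon}{\epsilon} \log^+ \frac{D}{Q/2}\right) \sqrt{x} \log \sqrt{x}.
\end{aligned}$$

Adding this to (4.45) (with $M = Q/2$), and using (4.21) and (4.22) as before, we obtain that (4.43) is at most

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D \log \frac{ex}{D} \\
&\quad + \frac{2\sqrt{c_0 c_1}}{\pi} (1+\epsilon)(Q+1)\left(\sqrt{3 + 2\epsilon}\, \log^+ \frac{2\sqrt{e}\,x}{Q+1} + \frac{1}{2} \log^+ \frac{2D}{Q+1} \log^+ \frac{2x}{Q+1}\right) \\
&\quad + \frac{3 c_1}{2}\left(\frac{2}{\sqrt{5}} + \frac{1+\epsilon}{\epsilon} \log^+ \frac{D}{Q/2}\right) \sqrt{x} \log \sqrt{x} + \frac{40}{3}\sqrt{2}\, c_0 c_2^{3/2} \sqrt{x} \log x.
\end{aligned}$$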
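The practical bound $\int_a^b \log(x/t)\,dt/t \le \log(x/a)\log(b/a)$ used in the case of $|\delta|$ large can be compared with the exact value of the integral, which is $\frac{1}{2}\left(\log^2(x/a) - \log^2(x/b)\right)$ by the substitution $u = \log(x/t)$. The sketch below tests the inequality on random triples with $a \le b \le x$; the sampling ranges and names are illustrative assumptions.

```python
import math
import random

def exact_integral(x, a, b):
    """Exact value of int_a^b log(x/t) dt/t = (log(x/a)^2 - log(x/b)^2) / 2."""
    return 0.5 * (math.log(x / a) ** 2 - math.log(x / b) ** 2)

def simple_bound(x, a, b):
    """The more practical bound log(x/a) * log(b/a)."""
    return math.log(x / a) * math.log(b / a)

rng = random.Random(1)
worst_gap = float("inf")
for _ in range(10_000):
    x = rng.uniform(10.0, 1e6)
    a = rng.uniform(1.0, x)
    b = rng.uniform(a, x)  # so that a <= b <= x
    worst_gap = min(worst_gap, simple_bound(x, a, b) - exact_integral(x, a, b))
print(worst_gap)
```

The difference between the bound and the exact value is $\frac{1}{2}\log^2(b/a)$, so the bound loses little when $b/a$ is close to 1 and is always valid for $b \ge a$.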

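Now we go over the case of $|\delta|$ small (or $D \le Q_0/2$). Instead of (4.23), we have

$$\sum_{m \le q/2} |T_{m,\circ}(\alpha)| \le \frac{2|\eta'|_1}{\pi}\, q \max\left(1,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right) \log x. \tag{4.46}$$

Suppose $q^2 < 2c_2 x$. (Otherwise, the sum we are about to estimate is empty.) Instead of (4.24), we have

$$\begin{aligned}
\sum_{\substack{q/2 < m \le D' \\ q \nmid m}} |T_{m,\circ}(\alpha)| &\le \frac{40}{3\pi^2}\,\frac{c_0 q^3}{6x} \sum_{1 \le j \le \frac{D'}{q} + \frac{1}{2}} \left(j + \frac{1}{2}\right) \log \frac{x}{\left(j - \frac{1}{2}\right)q} \\
&\le \frac{10 c_0 q^3}{3\pi^2 x}\left(\log \frac{2x}{q} + \frac{1}{q} \int_0^{D'} \log \frac{x}{t}\,dt + \frac{1}{q} \int_0^{D'} t \log \frac{x}{t}\,dt + \frac{D'}{q} \log \frac{x}{D'}\right) \\
&= \frac{10 c_0 q^3}{3\pi^2 x}\left(\log \frac{2x}{q} + \left(\frac{2D'}{q} + \frac{(D')^2}{2q^2}\right) \log \frac{\sqrt{e}\,x}{D'}\right) \\
&\le \frac{5 c_0 c_2}{3\pi^2}\left(4\sqrt{2 c_2 x} \log \frac{2x}{q} + 4\sqrt{2 c_2 x} \log \frac{\sqrt{e}\,x}{D'} + D' \log \frac{\sqrt{e}\,x}{D'}\right) \\
&\le \frac{5 c_0 c_2}{3\pi^2}\left(D' \log \frac{\sqrt{e}\,x}{D'} + 4\sqrt{2 c_2 x} \log \frac{2\sqrt{e}\,x}{c_2}\right),
\end{aligned} \tag{4.47}$$

where $D' = \min(c_2 x/q, D)$. (We are using the bounds $q^3/x \le (2c_2)^{3/2}$, $D' q^2/x \le c_2 q < c_2^{3/2}\sqrt{2x}$ and $D' q/x \le c_2$.) Instead of (4.25), we have

$$\sum_{R < m \le D} |T_{m,\circ}(\alpha)| \le \sum_{j=0}^{\left\lfloor \frac{D - R}{q} \right\rfloor} \left(\frac{\frac{3 c_1}{2}\, x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1 c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right) \log \frac{x}{jq + R},$$

where $R = \max(c_2 x/q, q/2)$. We can simply reuse (4.26), multiplying it by $\log x/R$; the only difference is that now we take care to bound $\min(q/c_2, 2x/q)$ by the geometric mean $\sqrt{(q/c_2)(2x/q)} = \sqrt{2x/c_2}$. We replace (4.27) by

$$\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D - R) \right\rfloor} \sqrt{1 + \frac{q}{jq + R}}\, \log \frac{x}{jq + R} &\le \sqrt{1 + \frac{q}{R}}\, \log \frac{x}{R} + \frac{1}{q} \int_R^D \sqrt{1 + \frac{q}{t}}\, \log \frac{x}{t}\,dt \\
&\le \sqrt{3} \log \frac{q}{c_2} + \left(\frac{D}{q} \log \frac{ex}{D} - \frac{R}{q} \log \frac{ex}{R}\right) + \frac{1}{2} \log \frac{q}{c_2} \log^+ \frac{D}{R}.
\end{aligned} \tag{4.48}$$

We sum with (4.46) and (4.47), and obtain (4.41) as an upper bound for (4.43). (Just as in the proof of Lemma 4.2.1, the term $(5 c_0 c_2/(3\pi^2))\, D' \log(\sqrt{e}\,x/D')$ is smaller than the term $(2\sqrt{c_1 c_0}/\pi)\, R \log ex/R$ in (4.48), and thus gets absorbed by it when $D > R$. If $D \le R$, then, again as in Lemma 4.2.1, the sum $\sum_{R < m \le D} |T_{m,\circ}(\alpha)|$ is empty, and we bound $(5 c_0 c_2/(3\pi^2))\, D' \log(\sqrt{e}\,x/D')$ by the term $(2\sqrt{c_1 c_0}/\pi)\, D \log ex/D$, which would not appear otherwise.)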

Now comes the time to focus on our second type I sum, namely,

$$\sum_{\substack{v \le V \\ v \text{ odd}}} \Lambda(v) \sum_{\substack{u \le U \\ u \text{ odd}}} \mu(u) \sum_{\substack{n \\ n \text{ odd}}} e(\alpha v u n)\,\eta(vun/x),$$

which corresponds to the term $S_{I,2}$ in (3.9). The innermost two sums, on their own, are a sum of type I we have already seen. Accordingly, for $q$ small, we will be able to bound them using Lemma 4.2.2. If $q$ is large, then that approach does not quite work, since then the approximation $av/q$ to $v\alpha$ is not always good enough. (As we shall later see, we need $q \le Q/v$ for the approximation to be sufficiently close for our purposes.)

Fortunately, when $q$ is large, we can also afford to lose a factor of $\log$, since the gains from $q$ will be large. Here is the estimate we will use for $q$ large.

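Lemma 4.2.4. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge \max(2e, 2\sqrt{x})$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L_1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$. Let $c_2 = 6\pi/5\sqrt{c_0}$. Assume that $x \ge e^2 c_2/2$.

Let $U, V \ge 1$ satisfy $UV + (19/18)Q_0 \le x/5.6$. Then, if $|\delta| \le 1/2c_2$, the absolute value of

$$\left|\sum_{\substack{v \le V \\ v \text{ odd}}} \Lambda(v) \sum_{\substack{u \le U \\ u \text{ odd}}} \mu(u) \sum_{\substack{n \\ n \text{ odd}}} e(\alpha v u n)\,\eta(vun/x)\right| \tag{4.49}$$

is at most

$$\frac{x}{2q} \min\left(1,\ \frac{c_0}{(\pi\delta)^2}\right) \log Vq + O^*\left(\frac{1}{4} - \frac{1}{\pi^2}\right) \cdot c_0 \left(\frac{D^2 \log V}{2qx} + \frac{3 c_4}{2}\,\frac{U V^2}{x} + \frac{(U+1)^2 V}{2x} \log q\right) \tag{4.50}$$

plus

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi}\left(D \log \frac{D}{\sqrt{e}} + q\left(\sqrt{3} \log \frac{c_2 x}{q} + \frac{\log D}{2} \log^+ \frac{D}{q/2}\right)\right) \\
&\quad + \frac{3 c_1}{2}\,\frac{x}{q} \log D \log^+ \frac{D}{c_2 x/q} + \frac{2|\eta'|_1}{\pi}\, q \max\left(1,\ \log \frac{c_0 e^3 q^2}{4\pi |\eta'|_1 x}\right) \log \frac{q}{2} \\
&\quad + \frac{3 c_1}{2\sqrt{2 c_2}} \sqrt{x} \log \frac{c_2 x}{2} + \frac{25 c_0}{4\pi^2} (2c_2)^{3/2} \sqrt{x} \log x,
\end{aligned} \tag{4.51}$$

where $D = UV$ and $c_1 = 1 + |\eta'|_1/(2x/D)$ and $c_4 = 1.03884$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.49) is at most (4.50) plus

$$\begin{aligned}
&\frac{2\sqrt{c_0 c_1}}{\pi} D \log \frac{D}{e} \\
&\quad + \frac{2\sqrt{c_0 c_1}}{\pi} (1+\epsilon)\left(\frac{x}{|\delta| q} + 1\right)\left(\left(\sqrt{3 + 2\epsilon} - 1\right) \log \frac{\frac{x}{|\delta| q} + 1}{\sqrt{2}} + \frac{1}{2} \log D \log^+ \frac{e^2 D}{\frac{x}{|\delta| q}}\right) \\
&\quad + \left(\frac{3 c_1}{2}\left(\frac{1}{2} + \frac{3(1+\epsilon)}{16\epsilon} \log x\right) + \frac{20 c_0}{3\pi^2} (2c_2)^{3/2}\right) \sqrt{x} \log x
\end{aligned} \tag{4.52}$$

for $\epsilon \in (0, 1]$.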

        Proof We proceed essentially as in Lemma 421 and Lemma 422 Let Q qprime and Qprime

        be as in the proof of Lemma 422 that is with 2α where Lemma 421 uses αLet M = min(UVQ2) We first consider the terms with uv le M u and v odd

        uv divisible by q If q is even there are no such terms Assume q is odd Then by(433) and (434) the absolute value of the contribution of these terms is at most

        sumaleMa oddq|a

        sumv|a

        aUlevleV

        Λ(v)micro(av)

        (xη(minusδ2)

        2a+O

        (a

        x

        |ηprimeprime|infin2π2

        middot (π2 minus 4)

        )) (453)

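Now
\[
\sum_{\substack{a\le M\\ a\ \mathrm{odd}\\ q|a}} \frac{\sum_{v|a,\ a/U\le v\le V}\Lambda(v)\mu(a/v)}{a}
= \sum_{\substack{v\le V\\ v\ \mathrm{odd}\\ (v,q)=1}} \frac{\Lambda(v)}{v} \sum_{\substack{u\le \min(U,M/V)\\ u\ \mathrm{odd}\\ q|u}} \frac{\mu(u)}{u}
+ \sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}} \frac{\Lambda(p^\alpha)}{p^\alpha} \sum_{\substack{u\le \min(U,M/V)\\ u\ \mathrm{odd}\\ \frac{q}{(q,p^\alpha)}|u}} \frac{\mu(u)}{u},
\]
which equals
\[
\begin{aligned}
&\frac{\mu(q)}{q}\sum_{\substack{v\le V\\ v\ \mathrm{odd}\\ (v,q)=1}}\frac{\Lambda(v)}{v}\sum_{\substack{u\le\min(U/q,\,M/Vq)\\ (u,2q)=1}}\frac{\mu(u)}{u}
+ \sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}} \frac{\mu\left(\frac{q}{(q,p^\alpha)}\right)}{q}\cdot\frac{\Lambda(p^\alpha)}{p^\alpha/(q,p^\alpha)} \sum_{\substack{u\le \min\left(\frac{U}{q/(q,p^\alpha)},\,\frac{M/V}{q/(q,p^\alpha)}\right)\\ u\ \mathrm{odd}\\ \left(u,\frac{q}{(q,p^\alpha)}\right)=1}}\frac{\mu(u)}{u}\\
&= \frac{1}{q}\cdot O^*\left(\sum_{\substack{v\le V\\ (v,2q)=1}}\frac{\Lambda(v)}{v} + \sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}}\frac{\log p}{p^\alpha/(q,p^\alpha)}\right),
\end{aligned}
\]
where we are using (2.20) to bound the sums on $u$ by 1. We notice that
\[
\sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}}\frac{\log p}{p^\alpha/(q,p^\alpha)}
\le \sum_{\substack{p\ \mathrm{odd}\\ p|q}}(\log p)\left(v_p(q) + \sum_{\substack{\alpha>v_p(q)\\ p^\alpha\le V}}\frac{1}{p^{\alpha-v_p(q)}}\right)
\le \log q + \sum_{\substack{p\ \mathrm{odd}\\ p|q}}\ \sum_{\substack{\beta>0\\ p^\beta\le V/p^{v_p(q)}}}\frac{\log p}{p^\beta}
\le \log q + \sum_{\substack{v\le V\\ v\ \mathrm{odd}\\ (v,q)\ne 1}}\frac{\Lambda(v)}{v},
\]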

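and so
\[
\sum_{\substack{a\le M\\ a\ \mathrm{odd}\\ q|a}} \frac{\sum_{v|a,\ a/U\le v\le V}\Lambda(v)\mu(a/v)}{a}
= \frac{1}{q}\cdot O^*\left(\log q + \sum_{\substack{v\le V\\ (v,2)=1}}\frac{\Lambda(v)}{v}\right)
= \frac{1}{q}\cdot O^*(\log q + \log V)
\]
by (2.12). The absolute value of the sum of the terms with $\widehat{\eta}(-\delta/2)$ in (4.53) is thus at most
\[
\frac{x}{q}\,\frac{\widehat{\eta}(-\delta/2)}{2}(\log q + \log V) \le \frac{x}{2q}\min\left(1,\frac{c_0}{(\pi\delta)^2}\right)\log Vq,
\]
where we are bounding $\widehat{\eta}(-\delta/2)$ by (2.1) (with $k = 2$).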


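The other terms in (4.53) contribute at most
\[
(\pi^2-4)\frac{|\widehat{\eta''}|_\infty}{2\pi^2}\cdot\frac{1}{x}\sum_{u\le U}\ \sum_{\substack{v\le V\\ uv\ \mathrm{odd},\ uv\le M,\ q|uv\\ u\ \text{sq-free}}}\Lambda(v)\,uv.
\tag{4.54}
\]
For any $R$, $\sum_{u\le R,\ u\ \mathrm{odd},\ q|u} u \le R^2/4q + 3R/4$. Using the estimates (2.12), (2.15) and (2.16), we obtain that the double sum in (4.54) is at most
\[
\begin{aligned}
&\sum_{\substack{v\le V\\ (v,2q)=1}}\Lambda(v)v\sum_{\substack{u\le\min(U,M/v)\\ u\ \mathrm{odd}\\ q|u}} u
+ \sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}}(\log p)p^\alpha\sum_{\substack{u\le U\\ u\ \mathrm{odd}\\ \frac{q}{(q,p^\alpha)}|u}} u\\
&\le \sum_{\substack{v\le V\\ (v,2q)=1}}\Lambda(v)v\cdot\left(\frac{(M/v)^2}{4q}+\frac{3M}{4v}\right)
+ \sum_{\substack{p^\alpha\le V\\ p\ \mathrm{odd}\\ p|q}}(\log p)p^\alpha\cdot\frac{(U+1)^2}{4}\\
&\le \frac{M^2\log V}{4q} + \frac{3c_4}{4}MV + \frac{(U+1)^2}{4}V\log q,
\end{aligned}
\tag{4.55}
\]
where $c_4 = 1.03884$.

From this point onwards, we use the easy bound
\[
\left|\sum_{\substack{v|a\\ a/U\le v\le V}}\Lambda(v)\mu(a/v)\right| \le \log a.
\]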
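The elementary inequality $\sum_{u\le R,\ u\ \mathrm{odd},\ q|u} u \le R^2/4q + 3R/4$ (for odd $q$) used in the estimate of (4.54) can be spot-checked by brute force. The following is a minimal Python sketch, not part of the original argument; the function names and the tested ranges are our own choices, and the comparison is done in exact integer arithmetic after multiplying through by $4q$.

```python
def odd_multiple_sum(R, q):
    """Sum of all odd u <= R that are divisible by q (q is assumed odd)."""
    # For odd q, the odd multiples of q are q, 3q, 5q, ...
    return sum(range(q, R + 1, 2 * q))


def bound_holds(R, q):
    """Check sum <= R^2/(4q) + 3R/4, multiplied through by 4q to stay in integers."""
    return 4 * q * odd_multiple_sum(R, q) <= R * R + 3 * R * q


# Hypothetical test ranges: odd q below 60, integer R below 400.
violations = [(R, q) for q in range(1, 60, 2) for R in range(1, 400)
              if not bound_holds(R, q)]
print("violations found:", violations)
```

Equality occurs for instance at $R = q$, so the constant $3/4$ cannot be lowered without a further hypothesis on $R$.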

What we must bound now is
\[
\sum_{\substack{m\le UV\\ m\ \mathrm{odd}\\ q\nmid m\ \text{or}\ m>M}}(\log m)\sum_{n\ \mathrm{odd}} e(\alpha mn)\eta(mn/x).
\tag{4.56}
\]
The inner sum is the same as the sum $T_m(\alpha)$ in (4.35); we will be using the bound (4.36). Much as before, we will be able to ignore the condition that $m$ is odd.

Let $D = UV$. What remains to do is similar to what we did in the proof of Lemma 4.2.1 (or Lemma 4.2.2).

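Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$. Instead of (4.16), we have
\[
\sum_{\substack{1\le m\le M\\ q\nmid m}}(\log m)|T_m(\alpha)| \le \frac{40}{3\pi^2}\,\frac{c_0q^3}{4x}\sum_{0\le j\le M/q}(j+1)\log((j+1)q),
\]
and, since $M \le \min(c_2x/q, D)$, $q \le \sqrt{2c_2x}$ (just as in the proof of Lemma 4.2.1) and
\[
\begin{aligned}
\sum_{0\le j\le M/q}(j+1)\log((j+1)q)
&\le \frac{M}{q}\log M + \left(\frac{M}{q}+1\right)\log(M+1) + \frac{1}{q^2}\int_0^M t\log t\,dt\\
&\le \left(\frac{2M}{q}+1\right)\log x + \frac{M^2}{2q^2}\log\frac{M}{\sqrt{e}},
\end{aligned}
\]
we conclude that
\[
\sum_{\substack{1\le m\le M\\ q\nmid m}}|T_m(\alpha)| \le \frac{5c_0c_2}{3\pi^2}M\log\frac{M}{\sqrt{e}} + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\tag{4.57}
\]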

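Instead of (4.17), we have
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{D-(Q+1)/2}{q'}\right\rfloor}\frac{x}{jq'+\frac{Q+1}{2}}\log\left(jq'+\frac{Q+1}{2}\right)
&\le \frac{x}{\frac{Q+1}{2}}\log\frac{Q+1}{2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{\log t}{t}\,dt\\
&\le \frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2-\left(\log\frac{Q}{2}\right)^2\right).
\end{aligned}
\]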

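Instead of (4.18), we estimate
\[
\begin{aligned}
&q'\sum_{j=0}^{\left\lfloor\frac{D-\frac{Q+1}{2}}{q'}\right\rfloor}\left(\log\left(\frac{Q+1}{2}+jq'\right)\right)\sqrt{1+\frac{q'}{jq'+\frac{Q+1}{2}}}\\
&\le q'\left(\log D + (\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2}\right) + \int_{\frac{Q+1}{2}}^{D}\log t\,dt + \int_{\frac{Q+1}{2}}^{D}\frac{q'\log t}{2t}\,dt\\
&\le q'\left(\log D + (\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2}\right) + \left(D\log\frac{D}{e}-\frac{Q+1}{2}\log\frac{Q+1}{2e}\right) + \frac{q'}{2}\log D\,\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned}
\]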

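We conclude that, when $D \ge Q/2$, the sum $\sum_{Q/2<m\le D}(\log m)|T_m(\alpha)|$ is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{e} + (Q+1)\left((1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{2}-\frac{1}{2}\log\frac{Q+1}{2e}\right)\right)\\
&+ \frac{\sqrt{c_0c_1}}{\pi}(Q+1)(1+\epsilon)\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}}
+ \frac{3c_1}{2}\left(\frac{2x}{Q}\log\frac{Q}{2}+\frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2-\left(\log\frac{Q}{2}\right)^2\right)\right).
\end{aligned}
\]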


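We must now add this to (4.57). Since
\[
(1+\epsilon)(\sqrt{3+2\epsilon}-1)\log\sqrt{2} - \frac{1}{2}\log 2e + \frac{1+\sqrt{13/3}}{2}\log 2\sqrt{e} > 0
\]
and $Q \ge 2\sqrt{x}$, we conclude that (4.56) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\,D\log\frac{D}{e}
+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left((\sqrt{3+2\epsilon}-1)\log\frac{Q+1}{\sqrt{2}} + \frac{1}{2}\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}}\right)\\
&+ \left(\frac{3c_1}{2}\left(\frac{1}{2}+\frac{3(1+\epsilon)}{16\epsilon}\log x\right)+\frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x.
\end{aligned}
\tag{4.58}
\]

Case (b). $\delta$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. The analogue of (4.23) is a bound of
\[
\le \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
\]
for the terms with $m \le q/2$. If $q^2 < 2c_2x$, then, much as in (4.24), we have
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D'\\ q\nmid m}}|T_m(\alpha)|(\log m)
&\le \frac{10}{\pi^2}\,\frac{c_0q^3}{3x}\sum_{1\le j\le\frac{D'}{q}+\frac{1}{2}}\left(j+\frac{1}{2}\right)\log((j+1/2)q)\\
&\le \frac{10}{\pi^2}\,\frac{c_0q}{3x}\int_q^{D'+\frac{3}{2}q} x\log x\,dx.
\end{aligned}
\tag{4.59}
\]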

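Since
\[
\begin{aligned}
\int_q^{D'+\frac{3}{2}q} x\log x\,dx
&= \frac{1}{2}\left(D'+\frac{3}{2}q\right)^2\log\frac{D'+\frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}}\\
&= \left(\frac{1}{2}D'^2+\frac{3}{2}D'q\right)\left(\log\frac{D'}{\sqrt{e}}+\frac{3}{2}\frac{q}{D'}\right) + \frac{9}{8}q^2\log\frac{D'+\frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}}\\
&= \frac{1}{2}D'^2\log\frac{D'}{\sqrt{e}} + \frac{3}{2}D'q\log D' + \frac{9}{8}q^2\left(\frac{2}{9}+\frac{3}{2}+\log\left(D'+\frac{19}{18}q\right)\right),
\end{aligned}
\]
where $D' = \min(c_2x/q, D)$, and since the assumption $(UV + (19/18)Q_0) \le x/5.6$ implies that $(2/9 + 3/2 + \log(D' + (19/18)q)) \le \log x$, we conclude that
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D'\\ q\nmid m}}|T_m(\alpha)|(\log m)
&\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{10c_0}{3\pi^2}\left(\frac{3}{4}(2c_2)^{3/2}\sqrt{x}\log x + \frac{9}{8}(2c_2)^{3/2}\sqrt{x}\log x\right)\\
&\le \frac{5c_0c_2}{3\pi^2}D'\log\frac{D'}{\sqrt{e}} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\end{aligned}
\tag{4.60}
\]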


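Let $R = \max(c_2x/q, q/2)$. We bound the terms $R < m \le D$ as in (4.25), with a factor of $\log(jq+R)$ inside the sum. The analogues of (4.26) and (4.27) are
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{1}{q}(D-R)\right\rfloor}\frac{x}{jq+R}\log(jq+R)
&\le \frac{x}{R}\log R + \frac{x}{q}\int_R^D\frac{\log t}{t}\,dt\\
&\le \sqrt{\frac{2x}{c_2}}\log\sqrt{\frac{c_2x}{2}} + \frac{x}{q}\log D\,\log^+\frac{D}{R},
\end{aligned}
\tag{4.61}
\]
where we use the assumption that $x \ge e^2c_2$, and
\[
\sum_{j=0}^{\left\lfloor\frac{1}{q}(D-R)\right\rfloor}\log(jq+R)\sqrt{1+\frac{q}{jq+R}}
\le \sqrt{3}\log R + \frac{1}{q}\left(D\log\frac{D}{e}-R\log\frac{R}{e}\right) + \frac{1}{2}\log D\,\log\frac{D}{R}
\tag{4.62}
\]
(or 0 if $D < R$). We sum with (4.60) and the terms with $m \le q/2$, and obtain, for $D' = c_2x/q = R$,
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}} + q\left(\sqrt{3}\log\frac{c_2x}{q}+\frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right)
+ \frac{3c_1}{2}\,\frac{x}{q}\log D\,\log^+\frac{D}{c_2x/q}\\
&+ \frac{2|\eta'|_1}{\pi}\,q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
+ \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}
\]
which, it is easy to check, is also valid even if $D' = D$ (in which case (4.61) and (4.62) do not appear) or $R = q/2$ (in which case (4.60) does not appear).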

        Chapter 5

        Type II sums

We must now consider the sum
\[
S_{II} = \sum_{\substack{m>U\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)\sum_{\substack{n>V\\ (n,v)=1}}\Lambda(n)e(\alpha mn)\eta(mn/x).
\tag{5.1}
\]
Here the main improvements over classical treatments of type II sums are as follows:

1. obtaining cancellation in the term $\sum_{d>U,\ d|m}\mu(d)$, leading to a gain of a factor of log;

2. using a large sieve for primes, getting rid of a further log;

3. exploiting, via a non-conventional application of the principle of the large sieve (Lemma 5.2.1), the fact that $\alpha$ is in the tail of an interval (when that is the case).

It should be clear that these techniques are of general applicability. (It is also clear that (2) is not new, though, strangely enough, it seems not to have been applied to Goldbach's problem. Perhaps this oversight is due to the fact that proofs of Vinogradov's result given in textbooks often follow Linnik's dispersion method, rather than the large sieve. Our treatment of the large sieve for primes will follow the lines set by Montgomery and Montgomery-Vaughan [MV73, (1.6)]. The fact that the large sieve for primes can be combined with the new technique (3) is, of course, a novelty.)

While (1) is particularly useful for the treatment of a term that generally arises in applications of Vaughan's identity, all of the points above address issues that can arise in more general situations in number theory.


It is technically helpful to express $\eta$ as the (multiplicative) convolution of two functions of compact support, preferably the same function:
\[
\eta(x) = \eta_1 *_M \eta_1 = \int_0^\infty \eta_1(t)\eta_1(x/t)\frac{dt}{t}.
\tag{5.2}
\]
For the smoothing function $\eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, equation (5.2) holds with $\eta_1 = 2\cdot 1_{[1/2,1]}$, where $1_{[1/2,1]}$ is the characteristic function of the interval $[1/2,1]$. We will work with $\eta = \eta_2$, yet most of our work will be valid for any $\eta$ of the form $\eta = \eta_1 * \eta_1$.
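The identity $\eta_2 = \eta_1 *_M \eta_1$ with $\eta_1 = 2\cdot 1_{[1/2,1]}$ is easy to confirm numerically. The sketch below is only an illustration under our own choices (function names, a midpoint rule in the variable $u = \log t$, the number of steps and the sample points); it is not taken from the text.

```python
import math


def eta1(t):
    """eta_1 = 2 * (indicator function of [1/2, 1])."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2(t):
    """eta_2(t) = 4 * max(log 2 - |log 2t|, 0), extended by 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def mult_convolution(x, steps=20000):
    """Midpoint rule for int_0^infty eta1(t) * eta1(x/t) dt/t.

    eta1(t) vanishes outside [1/2, 1], so only that range is integrated;
    the substitution t = exp(u) turns dt/t into du.
    """
    lo, hi = math.log(0.5), 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(lo + (k + 0.5) * h)
        total += eta1(t) * eta1(x / t)
    return total * h


sample_points = (0.2, 0.3, 0.5, 0.75, 0.9, 1.2)
for x in sample_points:
    print(x, mult_convolution(x), eta2(x))
```

The two printed columns should agree up to the discretization error of the midpoint rule; both vanish outside $[1/4, 1]$.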

By (5.2), the sum (5.1) equals
\[
\begin{aligned}
&4\int_0^\infty \sum_{\substack{m>U\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)\sum_{\substack{n>V\\ (n,v)=1}}\Lambda(n)e(\alpha mn)\,\eta_1(t)\,\eta_1\left(\frac{mn/x}{t}\right)\frac{dt}{t}\\
&= 4\int_V^{x/U}\sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)\sum_{\substack{\max\left(V,\frac{W}{2}\right)<n\le W\\ (n,v)=1}}\Lambda(n)e(\alpha mn)\frac{dW}{W}
\end{aligned}
\tag{5.3}
\]
by the substitution $t = (m/x)W$. (We can assume $V \le W \le x/U$ because otherwise one of the sums in (5.4) is empty.) As we can see, the sums within the integral are now unsmoothed. This will not be truly harmful, and to some extent it will be convenient, in that ready-to-use large-sieve estimates in the literature have been optimized more carefully for unsmoothed sums than for smooth sums. The fact that the sums start at $x/2W$ and $W/2$ rather than at 1 will also be slightly helpful.

(This is presumably why the weight $\eta_2$ was introduced in [Tao14], which also uses the large sieve. As we will later see, the weight $\eta_2$ (or anything like it) will simply not do on the major arcs, which are much more sensitive to the choice of weights. On the minor arcs, however, $\eta_2$ is convenient, and this is why we use it here. For type I sums, as should be clear from our work so far, which was stated for general weights, any function whose second derivative exists almost everywhere and lies in $\ell_1$ would do just as well. The option of having no smoothing whatsoever, as in Vinogradov's work or as in most textbook accounts, would not be quite as good for type I sums, and would lead to a routine but inconvenient splitting of sums into short intervals in place of (5.3).)

We now do what is generally the first thing in type II treatments: we use Cauchy-Schwarz. A minor note, however, that may help avoid confusion: the treatments familiar to some readers (e.g., the dispersion method, not followed here) start with the special case of Cauchy-Schwarz that is most common in number theory,
\[
\left|\sum_{n\le N}a_n\right|^2 \le N\sum_{n\le N}|a_n|^2,
\]
whereas here we apply the general rule
\[
\sum_m a_mb_m \le \sqrt{\sum_m|a_m|^2}\sqrt{\sum_m|b_m|^2}
\]
to the integrand in (5.3). At any rate, we will have reduced the estimation of a sum to the estimation of two simpler sums $\sum_m|a_m|^2$, $\sum_m|b_m|^2$, but each of these two simpler sums will be of a kind that will lead to a loss of a factor of $\log x$ (or $(\log x)^3$) if not estimated carefully. Since we cannot afford to lose a single factor of $\log x$, we will have to deploy and develop techniques to eliminate these factors of $\log x$. The procedure followed will be quite different for the two sums; a variety of techniques will be needed.

We separate $n$ prime and $n$ non-prime in the integrand of (5.3), and, as we were saying, we apply Cauchy-Schwarz. We obtain that the expression within the integral in (5.3) is at most $\sqrt{S_1(U,W)\cdot S_2(U,V,W)} + \sqrt{S_1(U,W)\cdot S_3(W)}$, where
\[
\begin{aligned}
S_1(U,W) &= \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)^2,\\
S_2(U,V,W) &= \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}}\left|\sum_{\substack{\max\left(V,\frac{W}{2}\right)<p\le W\\ (p,v)=1}}(\log p)e(\alpha mp)\right|^2
\end{aligned}
\tag{5.4}
\]
and
\[
S_3(W) = \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\left|\sum_{\substack{n\le W\\ n\ \text{non-prime}}}\Lambda(n)\right|^2
\le \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\left(1.42620\,W^{1/2}\right)^2
\le 1.0171x + 2.0341W
\tag{5.5}
\]
(by [RS62, Thm. 13]). We will assume $V \le w$; thus the condition $(p,v) = 1$ will be fulfilled automatically and can be removed.

The contribution of $S_3(W)$ will be negligible. We must bound $S_1(U,W)$ and $S_2(U,V,W)$ from above.
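The inner sum in (5.5) is $\psi(W) - \theta(W)$, the contribution of proper prime powers to Chebyshev's function, and the constant $1.42620$ comes from [RS62, Thm. 13]. As an illustration only, the following Python sketch tabulates $(\psi(W)-\theta(W))/\sqrt{W}$ over a finite range; the function name and the cut-off $10^5$ are our own choices, and a finite check is of course no substitute for the cited theorem.

```python
import math


def psi_minus_theta_ratios(limit):
    """Yield (W, (psi(W) - theta(W)) / sqrt(W)) for integers 2 <= W <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    contrib = [0.0] * (limit + 1)      # Lambda(n) for non-prime n, 0 otherwise
    for p in range(2, limit + 1):
        if is_prime[p]:
            for k in range(2 * p, limit + 1, p):
                is_prime[k] = False
            q = p * p
            while q <= limit:          # proper prime powers p^2, p^3, ...
                contrib[q] = math.log(p)
                q *= p
    running = 0.0
    for W in range(2, limit + 1):
        running += contrib[W]
        yield W, running / math.sqrt(W)


worst_W, worst_ratio = max(psi_minus_theta_ratios(100000), key=lambda pair: pair[1])
print("largest ratio in range:", worst_ratio, "attained at W =", worst_W)
```

Since $\psi - \theta$ is a step function, the ratio can only increase at integers, so scanning integer $W$ is enough within the tested range. The second inequality in (5.5) then follows from $1.42620^2 \le 2.0341$ and the fact that there are at most $x/2W + 1$ values of $m$.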


5.1 The sum $S_1$: cancellation

We shall bound
\[
S_1(U,W) = \sum_{\substack{\max(U,\,x/2W)<m\le x/W\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)^2.
\tag{5.6}
\]
There will be a surprising amount of cancellation: the expression within the sum will be bounded by a constant on average (a constant less than 1, and usually less than 1/2, in fact). In other words, the inner sum in (5.6) is exactly 0 most of the time.

Recall that we need explicit constants throughout, and that this essentially constrains us to elementary means. (We will at one point use Dirichlet series and $\zeta(s)$ for $s$ real and greater than 1.)
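To get a feeling for the cancellation in the inner sum of (5.6), one can tabulate $\sum_{d>U,\ d|m}\mu(d)$ in a toy case. The sketch below is an illustration with hypothetical parameters ($U = 10$ and $m$ running over one full period modulo $210$), not the ranges used in the text; the helper names are ours. For $m > 1$ the inner sum equals $-\sum_{d\le U,\ d|m}\mu(d)$, so for $U = 10$ it depends only on which of the primes $2, 3, 5, 7$ divide $m$.

```python
def mobius_table(n):
    """Return a list mu with mu[k] = Moebius function of k for 0 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(2 * p, n + 1, p):
                is_prime[k] = False
            for k in range(p, n + 1, p):
                mu[k] = -mu[k]         # one more prime factor
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0              # not square-free
    return mu


def tail_divisor_sum(m, U, mu):
    """Sum of mu(d) over the divisors d of m with d > U."""
    return sum(mu[d] for d in range(U + 1, m + 1) if m % d == 0)


U = 10                                  # hypothetical toy value
mu = mobius_table(420)
values = [tail_divisor_sum(m, U, mu) for m in range(211, 421)]  # 210 consecutive m
zero_count = values.count(0)
square_total = sum(v * v for v in values)
print(zero_count, "of", len(values), "inner sums vanish;",
      "mean square =", square_total / len(values))
```

In this toy case the inner sum vanishes for 134 of the 210 residues, and its mean square is $79/210 < 1/2$, in line with the qualitative statement above.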

5.1.1 Reduction to a sum with $\mu$

It is tempting to start by applying Möbius inversion to change $d > U$ to $d \le U$ in (5.6), but this just makes matters worse. We could also try changing variables so that $m/d$ (which is smaller than $x/UW$) becomes the variable instead of $d$, but this leads to complications for $m$ non-square-free. Instead, we write
\[
\begin{aligned}
\sum_{\substack{\max(U,\,x/2W)<m\le x/W\\ (m,v)=1}}\left(\sum_{\substack{d>U\\ d|m}}\mu(d)\right)^2
&= \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}}\ \sum_{\substack{d_1,d_2|m\\ d_1,d_2>U}}\mu(d_1)\mu(d_2)\\
&= \sum_{r_1<x/WU}\ \sum_{\substack{r_2<x/WU\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\ \sum_{\substack{l\\ (l,r_1r_2)=1\\ r_1l,\,r_2l>U\\ (\ell,v)=1}}\mu(r_1l)\mu(r_2l)\sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ r_1r_2l|m\\ (m,v)=1}}1,
\end{aligned}
\tag{5.7}
\]
where $d_1 = r_1l$, $d_2 = r_2l$, $l = (d_1,d_2)$. (The inequality $r_1 < x/WU$ comes from $r_1r_2l|m$, $m \le x/W$, $r_2l > U$; $r_2 < x/WU$ is proven in the same way.) Now (5.7) equals
\[
\sum_{\substack{s<\frac{x}{WU}\\ (s,v)=1}}\ \sum_{r_1<\frac{x}{WUs}}\ \sum_{\substack{r_2<\frac{x}{WUs}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\mu(r_2)\sum_{\substack{\max\left(\frac{U}{\min(r_1,r_2)},\,\frac{x/W}{2r_1r_2s}\right)<l\le\frac{x/W}{r_1r_2s}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (\ell,v)=1}}1,
\tag{5.8}
\]
where we have set $s = m/(r_1r_2l)$. We begin by simplifying the innermost triple sum. This we do in the following Lemma; it is not a trivial task, and carrying it out efficiently actually takes an idea.
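The change of variables $d_1 = r_1l$, $d_2 = r_2l$, $s = m/(r_1r_2l)$ behind (5.7) and (5.8) can be tested by brute force in the case $v = 1$. The following Python sketch is such a test under our own conventions: `z` stands for $x/W$ (taken to be an integer), the function names are hypothetical, and the constraints $r_i < x/WUs$ are not imposed explicitly because they follow from the other conditions.

```python
from math import gcd


def mobius_table(n):
    """Return a list mu with mu[k] = Moebius function of k for 0 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(2 * p, n + 1, p):
                is_prime[k] = False
            for k in range(p, n + 1, p):
                mu[k] = -mu[k]
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0
    return mu


def s1_direct(z, U, mu):
    """Left side of (5.7) with v = 1: sum over z/2 < m <= z of (sum_{d|m, d>U} mu(d))^2."""
    total = 0
    for m in range(z // 2 + 1, z + 1):
        inner = sum(mu[d] for d in range(U + 1, m + 1) if m % d == 0)
        total += inner * inner
    return total


def s1_rearranged(z, U, mu):
    """The sum (5.8) with v = 1, enumerating (s, r1, r2, l) with m = r1*r2*l*s."""
    total = 0
    for s in range(1, z + 1):
        for r1 in range(1, z // s + 1):
            if mu[r1] == 0:
                continue
            for r2 in range(1, z // (s * r1) + 1):
                if mu[r2] == 0 or gcd(r1, r2) != 1:
                    continue
                for l in range(1, z // (s * r1 * r2) + 1):
                    if mu[l] == 0 or gcd(l, r1 * r2) != 1:
                        continue
                    m = r1 * r2 * l * s
                    # z/2 < m and r1*l, r2*l > U, written without fractions
                    if 2 * m > z and l * min(r1, r2) > U:
                        total += mu[r1] * mu[r2]
    return total


cases = [(120, 3), (200, 7), (300, 10)]     # hypothetical (z, U) pairs
mu = mobius_table(300)
results = [(s1_direct(z, U, mu), s1_rearranged(z, U, mu)) for z, U in cases]
print(results)
```

Each pair printed should consist of two equal integers; the test is exact, since only integer arithmetic is involved.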


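Lemma 5.1.1. Let $z, y > 0$. Then
\[
\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\mu(r_2)\sum_{\substack{\max\left(\frac{z/y}{\min(r_1,r_2)},\,\frac{z}{2r_1r_2}\right)<l\le\frac{z}{r_1r_2}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (\ell,v)=1}}1
\tag{5.9}
\]
equals
\[
\begin{aligned}
&\frac{6z}{\pi^2}\,\frac{v}{\sigma(v)}\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right)\\
&+ O^*\left(5.08\,\zeta\left(\frac{3}{2}\right)^2 y\sqrt{z}\cdot\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\left(1-\frac{1}{p^{3/2}}\right)^2\right).
\end{aligned}
\tag{5.10}
\]
If $v = 2$, the error term in (5.10) can be replaced by
\[
O^*\left(1.27\,\zeta\left(\frac{3}{2}\right)^2 y\sqrt{z}\cdot\left(1+\frac{1}{\sqrt{2}}\right)\left(1-\frac{1}{2^{3/2}}\right)^2\right).
\tag{5.11}
\]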

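Proof. By Möbius inversion, (5.9) equals
\[
\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\mu(r_1)\mu(r_2)\sum_{\substack{l\le\frac{z}{r_1r_2}\\ l>\max\left(\frac{z/y}{\min(r_1,r_2)},\,\frac{z}{2r_1r_2}\right)\\ (\ell,v)=1}}\ \sum_{\substack{d_1|r_1,\ d_2|r_2\\ d_1d_2|l}}\mu(d_1)\mu(d_2)\sum_{\substack{d_3|v\\ d_3|l}}\mu(d_3)\sum_{\substack{m^2|l\\ (m,r_1r_2v)=1}}\mu(m).
\tag{5.12}
\]
We can change the order of summation of $r_i$ and $d_i$ by defining $s_i = r_i/d_i$, and we can also use the obvious fact that the number of integers in an interval $(a,b]$ divisible by $d$ is $(b-a)/d + O^*(1)$. Thus (5.12) equals
\[
\begin{aligned}
&\sum_{\substack{d_1,d_2<y\\ (d_1,d_2)=1\\ (d_1d_2,v)=1}}\mu(d_1)\mu(d_2)\sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (d_1s_1,d_2s_2)=1\\ (s_1s_2,v)=1}}\mu(d_1s_1)\mu(d_2s_2)\\
&\qquad\cdot\sum_{d_3|v}\mu(d_3)\sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ (m,d_1s_1d_2s_2v)=1}}\frac{\mu(m)}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}\left(1-\max\left(\frac{1}{2},\frac{s_1d_1}{y},\frac{s_2d_2}{y}\right)\right)
\end{aligned}
\tag{5.13}
\]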


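plus
\[
O^*\left(\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v}\ \sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\ \text{sq-free}}}1\right).
\tag{5.14}
\]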

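If we complete the innermost sum in (5.13) by removing the condition
\[
m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)},
\]
we obtain (reintroducing the variables $r_i = d_is_i$)
\[
z\cdot\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{r_1r_2}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right)\sum_{\substack{d_1|r_1\\ d_2|r_2}}\ \sum_{d_3|v}\ \sum_{\substack{m\\ (m,r_1r_2v)=1}}\frac{\mu(d_1)\mu(d_2)\mu(m)\mu(d_3)}{d_1d_2d_3m^2}.
\tag{5.15}
\]
Now (5.15) equals
\[
\begin{aligned}
&\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)z}{r_1r_2}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right)\prod_{p|r_1r_2\ \text{or}\ p|v}\left(1-\frac{1}{p}\right)\prod_{\substack{p\nmid r_1r_2\\ p\nmid v}}\left(1-\frac{1}{p^2}\right)\\
&= \frac{6z}{\pi^2}\,\frac{v}{\sigma(v)}\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right),
\end{aligned}
\]
i.e., the main term in (5.10). It remains to estimate the terms used to complete the sum; their total is, by definition, given exactly by (5.13) with the inequality $m \le \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$ changed to $m > \sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$. This is a total of size at most
\[
\frac{1}{2}\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v}\ \sum_{\substack{m>\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\ \text{sq-free}}}\frac{1}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}.
\tag{5.16}
\]
Adding this to (5.14), we obtain, as our total error term,
\[
\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v} f\left(\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\right),
\tag{5.17}
\]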


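where
\[
f(x) = \sum_{\substack{m\le x\\ m\ \text{sq-free}}}1 + \frac{1}{2}\sum_{\substack{m>x\\ m\ \text{sq-free}}}\frac{x^2}{m^2}.
\]
It is easy to see that $f(x)/x$ has a local maximum exactly when $x$ is a square-free (positive) integer. We can hence check that
\[
f(x) \le \frac{1}{2}\left(2+2\left(\frac{\zeta(2)}{\zeta(4)}-1.25\right)\right)x = 1.26981\ldots x
\]
for all $x \ge 0$ by checking all integers smaller than a constant, using $\{m : m\ \text{sq-free}\} \subset \{m : 4\nmid m\}$ and $1.5\cdot(3/4) < 1.26981$ to bound $f$ from above for $x$ larger than a constant. Therefore, (5.17) is at most
\[
\begin{aligned}
&1.27\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v}\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\
&= 1.27\sqrt{z}\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\cdot\left(\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{\substack{s<y/d\\ (s,v)=1}}\frac{1}{d\sqrt{s}}\right)^2.
\end{aligned}
\]
We can bound the double sum simply by
\[
\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{s<y/d}\frac{1}{\sqrt{s}\,d} \le 2\sum_{\substack{d<y\\ (d,v)=1}}\frac{\sqrt{y/d}}{d} \le 2\sqrt{y}\cdot\zeta\left(\frac{3}{2}\right)\prod_{p|v}\left(1-\frac{1}{p^{3/2}}\right).
\]
Alternatively, if $v = 2$, we bound
\[
\sum_{\substack{s<y/d\\ (s,v)=1}}\frac{1}{\sqrt{s}} = \sum_{\substack{s<y/d\\ s\ \mathrm{odd}}}\frac{1}{\sqrt{s}} \le 1 + \frac{1}{2}\int_1^{y/d}\frac{1}{\sqrt{s}}\,ds = \sqrt{y/d}
\]
and thus
\[
\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{\substack{s<y/d\\ (s,v)=1}}\frac{1}{\sqrt{s}\,d} \le \sum_{\substack{d<y\\ (d,2)=1}}\frac{\sqrt{y/d}}{d} \le \sqrt{y}\left(1-\frac{1}{2^{3/2}}\right)\zeta\left(\frac{3}{2}\right).
\]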
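The bound $f(x) \le 1.27x$ used in the proof above rests on evaluating $f(n)/n$ at square-free integers $n$. The following Python sketch repeats that evaluation in floating point for a finite range; it is only an illustration, with our own function names and a hypothetical cut-off of 5000, and it uses the classical value $\sum_{m\ \text{sq-free}} m^{-2} = \zeta(2)/\zeta(4) = 15/\pi^2$ to evaluate the tail.

```python
import math


def is_squarefree(n):
    """True if no square d*d with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def f_over_x_at_squarefree(limit):
    """Return [(n, f(n)/n)] for square-free n <= limit, where
    f(x) = #{m <= x square-free} + (1/2) * sum_{m > x square-free} x^2 / m^2."""
    full = 15.0 / math.pi ** 2          # zeta(2)/zeta(4)
    count, partial, out = 0, 0.0, []
    for n in range(1, limit + 1):
        if is_squarefree(n):
            count += 1
            partial += 1.0 / (n * n)
            tail = full - partial       # sum of 1/m^2 over square-free m > n
            out.append((n, (count + 0.5 * n * n * tail) / n))
    return out


ratios = f_over_x_at_squarefree(5000)
best_n, best_ratio = max(ratios, key=lambda pair: pair[1])
print("maximum of f(n)/n in range:", best_ratio, "at n =", best_n)
```

In this range the maximum is attained at $n = 2$, where $f(2)/2 = 1 + \zeta(2)/\zeta(4) - 1.25$, which is the constant appearing in the text.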

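Applying Lemma 5.1.1 with $y = S/s$ and $z = x/Ws$, where $S = x/WU$, we obtain that (5.8) equals
\[
\begin{aligned}
&\frac{6x}{\pi^2W}\,\frac{v}{\sigma(v)}\sum_{\substack{s<S\\ (s,v)=1}}\frac{1}{s}\sum_{r_1<S/s}\ \sum_{\substack{r_2<S/s\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right)\\
&+ O^*\left(5.04\,\zeta\left(\frac{3}{2}\right)^3 S\sqrt{\frac{x}{W}}\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\left(1-\frac{1}{p^{3/2}}\right)^3\right),
\end{aligned}
\tag{5.18}
\]
with 5.04 replaced by 1.27 if $v = 2$. The main term in (5.18) can be written as
\[
\frac{6x}{\pi^2W}\,\frac{v}{\sigma(v)}\sum_{\substack{s\le S\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{1}\sum_{r_1\le\frac{uS}{s}}\ \sum_{\substack{r_2\le\frac{uS}{s}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\,du.
\tag{5.19}
\]
As we can see, the use of an integral eliminates the unpleasant factor
\[
\left(1-\max\left(\frac{1}{2},\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right).
\]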
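The passage from (5.18) to (5.19) uses the fact that, for $r_1, r_2 \le y$, the weight $1 - \max(1/2, r_1/y, r_2/y)$ is the measure of the set of $u \in [1/2, 1]$ with $r_1 \le uy$ and $r_2 \le uy$. A short numerical illustration in Python follows; the function names, the midpoint rule and the sample values are our own and are not taken from the text.

```python
def weight(r1, r2, y):
    """The factor 1 - max(1/2, r1/y, r2/y) from (5.18)."""
    return 1.0 - max(0.5, r1 / y, r2 / y)


def weight_as_integral(r1, r2, y, steps=10000):
    """Midpoint rule for the integral over u in [1/2, 1] of the
    indicator of (r1 <= u*y and r2 <= u*y)."""
    h = 0.5 / steps
    hits = 0
    for k in range(steps):
        u = 0.5 + (k + 0.5) * h
        if r1 <= u * y and r2 <= u * y:
            hits += 1
    return hits * h


samples = [(3, 7, 10.0), (1, 2, 10.0), (9, 4, 9.5), (5, 5, 20.0)]  # hypothetical values
for r1, r2, y in samples:
    print(r1, r2, y, weight(r1, r2, y), weight_as_integral(r1, r2, y))
```

Summing the indicator over $r_1, r_2$ before integrating in $u$ is what turns the weighted double sum into the plain double sum inside the integral in (5.19).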

From now on, we will focus on the cases $v = 1$ and $v = 2$ for simplicity. (Higher values of $v$ do not seem to be really profitable in the last analysis.)

5.1.2 Explicit bounds for a sum with $\mu$

We must estimate the expression within parentheses in (5.19). It is not too hard to show that it tends to 0; the first part of the proof of Lemma 5.1.2 will reduce this to the fact that $\sum_n \mu(n)/n = 0$. Obtaining good bounds is a more delicate matter. For our purposes, we will need the expression to converge to 0 at least as fast as $1/(\log)^2$, with a good constant in front. For this task, the bound (2.21) on $\sum_{n\le x}\mu(n)/n$ is enough.

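Lemma 5.1.2. Let
\[
g_v(x) = \sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)},
\]
where $v = 1$ or $v = 2$. Then
\[
|g_1(x)| \le
\begin{cases}
1/x & \text{if } 33 \le x \le 10^6,\\
\frac{1}{x}(111.536 + 55.768\log x) & \text{if } 10^6 \le x < 10^{10},\\
\frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt{x}} & \text{if } x \ge 10^{10},
\end{cases}
\]
\[
|g_2(x)| \le
\begin{cases}
2.1/x & \text{if } 33 \le x \le 10^6,\\
\frac{1}{x}(1634.34 + 817.168\log x) & \text{if } 10^6 \le x < 10^{10},\\
\frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt{x}} & \text{if } x \ge 10^{10}.
\end{cases}
\]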

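The proof involves what may be called a version of Rankin's trick, using Dirichlet series and the behavior of $\zeta(s)$ near $s = 1$.

Proof. We prove the statements for $x \le 10^6$ by a direct computation, using interval arithmetic. (In fact, in that range, one gets $2.0895071/x$ instead of $2.1/x$.) Assume from now on that $x > 10^6$.

Clearly
\[
\begin{aligned}
g(x) &= \sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ (r_1r_2,v)=1}}\left(\sum_{d|(r_1,r_2)}\mu(d)\right)\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\mu(d)\sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ d|(r_1,r_2)\\ (r_1r_2,v)=1}}\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\frac{\mu(d)}{(\sigma(d))^2}\sum_{\substack{u_1\le x/d\\ (u_1,dv)=1}}\ \sum_{\substack{u_2\le x/d\\ (u_2,dv)=1}}\frac{\mu(u_1)\mu(u_2)}{\sigma(u_1)\sigma(u_2)}\\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\frac{\mu(d)}{(\sigma(d))^2}\left(\sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{\sigma(r)}\right)^2.
\end{aligned}
\tag{5.20}
\]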
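The identity (5.20) can be confirmed exactly for small $x$ with rational arithmetic. The sketch below is an illustration in Python under our own conventions (function names, integer values of $x$, the tested values of $x$); it compares the defining double sum for $g_v(x)$ with the right-hand side of (5.20). It does not attempt to reproduce the interval-arithmetic computation mentioned in the proof.

```python
from fractions import Fraction
from math import gcd


def mobius_and_sigma(n):
    """Tables mu[k] (Moebius function) and sigma[k] (sum of divisors) for 0 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(2 * p, n + 1, p):
                is_prime[k] = False
            for k in range(p, n + 1, p):
                mu[k] = -mu[k]
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0
    sigma = [0] * (n + 1)
    for d in range(1, n + 1):
        for k in range(d, n + 1, d):
            sigma[k] += d
    return mu, sigma


def g_direct(x, v, mu, sigma):
    """g_v(x) from its definition as a double sum over coprime r1, r2 <= x."""
    total = Fraction(0)
    for r1 in range(1, x + 1):
        if mu[r1] == 0 or gcd(r1, v) != 1:
            continue
        for r2 in range(1, x + 1):
            if mu[r2] == 0 or gcd(r2, v) != 1 or gcd(r1, r2) != 1:
                continue
            total += Fraction(mu[r1] * mu[r2], sigma[r1] * sigma[r2])
    return total


def g_via_520(x, v, mu, sigma):
    """Right-hand side of (5.20)."""
    total = Fraction(0)
    for d in range(1, x + 1):
        if mu[d] == 0 or gcd(d, v) != 1:
            continue
        inner = Fraction(0)
        for r in range(1, x // d + 1):
            if mu[r] != 0 and gcd(r, d * v) == 1:
                inner += Fraction(mu[r], sigma[r])
        total += Fraction(mu[d], sigma[d] ** 2) * inner * inner
    return total


mu, sigma = mobius_and_sigma(60)
checks = [(v, x, g_direct(x, v, mu, sigma), g_via_520(x, v, mu, sigma))
          for v in (1, 2) for x in (1, 10, 33, 60)]
for v, x, a, b in checks:
    print(v, x, float(x * a), a == b)
```

The third printed column is $x\,g_v(x)$, which gives a rough idea of the size of $g_v$ for small $x$; the last column reports whether the two expressions agree exactly.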

Moreover,
\[
\begin{aligned}
\sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{\sigma(r)}
&= \sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{r}\sum_{d'|r}\ \prod_{p|d'}\left(\frac{p}{p+1}-1\right)\\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\left(\prod_{p|d'}\frac{-1}{p+1}\right)\sum_{\substack{r\le x/d\\ (r,dv)=1\\ d'|r}}\frac{\mu(r)}{r}\\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\frac{1}{d'\sigma(d')}\sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}}\frac{\mu(r)}{r}
\end{aligned}
\]
and
\[
\sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}}\frac{\mu(r)}{r} = \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}}\frac{1}{d''}\sum_{r\le x/dd'd''}\frac{\mu(r)}{r}.
\]

Hence
\[
|g(x)| \le \sum_{\substack{d\le x\\ (d,v)=1}}\frac{(\mu(d))^2}{(\sigma(d))^2}\left(\sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\frac{1}{d'\sigma(d')}\sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}}\frac{1}{d''}\,f(x/dd'd'')\right)^2,
\tag{5.21}
\]
where $f(t) = \left|\sum_{r\le t}\mu(r)/r\right|$.

We intend to bound the function $f(t)$ by a linear combination of terms of the form $t^{-\delta}$, $\delta \in [0, 1/2)$. Thus it makes sense now to estimate $F_v(s_1,s_2,x)$, defined to be the quantity
\[
\begin{aligned}
\sum_{\substack{d\\ (d,v)=1}}\frac{(\mu(d))^2}{(\sigma(d))^2}
&\left(\sum_{\substack{d_1'\\ (d_1',dv)=1}}\frac{\mu(d_1')^2}{d_1'\sigma(d_1')}\sum_{d_1''|(dd_1'v)^\infty}\frac{1}{d_1''}\cdot(dd_1'd_1'')^{1-s_1}\right)\\
\cdot&\left(\sum_{\substack{d_2'\\ (d_2',dv)=1}}\frac{\mu(d_2')^2}{d_2'\sigma(d_2')}\sum_{d_2''|(dd_2'v)^\infty}\frac{1}{d_2''}\cdot(dd_2'd_2'')^{1-s_2}\right)
\end{aligned}
\]
for $s_1, s_2 \in [1/2, 1]$. This is equal to

\[
\begin{aligned}
\sum_{\substack{d\\ (d,v)=1}}\frac{\mu(d)^2}{d^{s_1+s_2}}
&\prod_{p|d}\frac{1}{(1+p^{-1})^2(1-p^{-s_1})(1-p^{-s_2})}\ \prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
\cdot&\left(\sum_{\substack{d'\\ (d',dv)=1}}\frac{\mu(d')^2}{(d')^{s_1+1}}\prod_{p'|d'}\frac{1}{(1+p'^{-1})(1-p'^{-s_1})}\right)
\cdot\left(\sum_{\substack{d'\\ (d',dv)=1}}\frac{\mu(d')^2}{(d')^{s_2+1}}\prod_{p'|d'}\frac{1}{(1+p'^{-1})(1-p'^{-s_2})}\right),
\end{aligned}
\]
which in turn can easily be seen to equal
\[
\begin{aligned}
&\prod_{p\nmid v}\left(1+\frac{p^{-s_1}p^{-s_2}}{(1-p^{-s_1}+p^{-1})(1-p^{-s_2}+p^{-1})}\right)\prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
&\cdot\prod_{p\nmid v}\left(1+\frac{p^{-1}p^{-s_1}}{(1+p^{-1})(1-p^{-s_1})}\right)\cdot\prod_{p\nmid v}\left(1+\frac{p^{-1}p^{-s_2}}{(1+p^{-1})(1-p^{-s_2})}\right).
\end{aligned}
\tag{5.22}
\]

Now, for any $0 < x \le y \le x^{1/2} < 1$,
\[
(1+x-y)(1-xy)(1-xy^2) - (1+x)(1-y)(1-x^3) = (x-y)(y^2-x)(xy-x-1)x \le 0,
\]
and so
\[
1+\frac{xy}{(1+x)(1-y)} = \frac{(1+x-y)(1-xy)(1-xy^2)}{(1+x)(1-y)(1-xy)(1-xy^2)} \le \frac{(1-x^3)}{(1-xy)(1-xy^2)}.
\tag{5.23}
\]
For any $x \le y_1, y_2 < 1$ with $y_1^2 \le x$, $y_2^2 \le x$,
\[
1+\frac{y_1y_2}{(1-y_1+x)(1-y_2+x)} \le \frac{(1-x^3)^2(1-x^4)}{(1-y_1y_2)(1-y_1y_2^2)(1-y_1^2y_2)}.
\tag{5.24}
\]
This can be checked as follows: multiplying by the denominators and changing variables to $x$, $s = y_1 + y_2$ and $r = y_1y_2$, we obtain an inequality where the left side, quadratic on $s$ with positive leading coefficient, must be less than or equal to the right side, which is linear on $s$. The left side minus the right side can be maximal for given $x$, $r$ only when $s$ is maximal or minimal. This happens when $y_1 = y_2$ or when either $y_i = \sqrt{x}$ or $y_i = x$ for at least one of $i = 1, 2$. In each of these cases, we have reduced (5.24) to an inequality in two variables that can be proven automatically¹ by a quantifier-elimination program; the author has used QEPCAD [HB11] to do this.

Hence $F_v(s_1,s_2,x)$ is at most
\[
\begin{aligned}
&\prod_{p\nmid v}\frac{(1-p^{-3})^2(1-p^{-4})}{(1-p^{-s_1-s_2})(1-p^{-2s_1-s_2})(1-p^{-s_1-2s_2})}\cdot\prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})}\\
&\cdot\prod_{p\nmid v}\frac{1-p^{-3}}{(1+p^{-s_1-1})(1+p^{-2s_1-1})}\prod_{p\nmid v}\frac{1-p^{-3}}{(1+p^{-s_2-1})(1+p^{-2s_2-1})}\\
&= C_{v,s_1,s_2}\cdot\frac{\zeta(s_1+1)\zeta(s_2+1)\zeta(2s_1+1)\zeta(2s_2+1)}{\zeta(3)^4\zeta(4)\left(\zeta(s_1+s_2)\zeta(2s_1+s_2)\zeta(s_1+2s_2)\right)^{-1}},
\end{aligned}
\tag{5.25}
\]
where $C_{v,s_1,s_2}$ equals 1 if $v = 1$ and
\[
\frac{(1-2^{-s_1-2s_2})(1+2^{-s_1-1})(1+2^{-2s_1-1})(1+2^{-s_2-1})(1+2^{-2s_2-1})}{(1-2^{-s_1-s_2})^{-1}(1-2^{-2s_1-s_2})^{-1}(1-2^{-s_1})(1-2^{-s_2})(1-2^{-3})^4(1-2^{-4})}
\]
if $v = 2$.

For $1 \le t \le x$, (2.21) and (2.24) imply
\[
f(t) \le
\begin{cases}
\sqrt{\frac{2}{t}} & \text{if } x \le 10^{10},\\
\sqrt{\frac{2}{t}} + \frac{0.03}{\log x}\left(\frac{x}{t}\right)^{\frac{\log\log x-\log\log 10^{10}}{\log x-\log 10^{10}}} & \text{if } x > 10^{10},
\end{cases}
\tag{5.26}
\]

¹ In practice, the case $y_i = \sqrt{x}$ leads to a polynomial of high degree, and quantifier elimination increases sharply in complexity as the degree increases; a stronger inequality of lower degree (with $(1 - 3x^3)$ instead of $(1-x^3)^2(1-x^4)$) was given to QEPCAD to prove in this case.


where we are using the fact that $\log x$ is convex-down. Note that, again by convexity,

\[\frac{\log\log x - \log\log 10^{10}}{\log x - \log 10^{10}} < (\log t)'\big|_{t=\log 10^{10}} = \frac{1}{\log 10^{10}} = 0.0434294\ldots\]

Obviously, $\sqrt{2/t}$ in (5.26) can be replaced by $(2/t)^{1/2-\epsilon}$ for any $\epsilon \ge 0$.

By (5.21) and (5.26),

\[|g_v(x)| \le \left(\frac{2}{x}\right)^{1-2\epsilon} F_v(1/2+\epsilon, 1/2+\epsilon, x)\]

for $x \le 10^{10}$. We set $\epsilon = 1/\log x$ and obtain from (5.25) that

\[F_v(1/2+\epsilon, 1/2+\epsilon, x) \le C_{v,\frac12+\epsilon,\frac12+\epsilon} \frac{\zeta(1+2\epsilon)\zeta(3/2)^4\zeta(2)^2}{\zeta(3)^4\zeta(4)} \le 55.768 \cdot C_{v,\frac12+\epsilon,\frac12+\epsilon} \cdot \left(1 + \frac{\log x}{2}\right), \tag{5.27}\]

where we use the easy bound $\zeta(s) < 1 + 1/(s-1)$ obtained by $\sum n^{-s} < 1 + \int_1^\infty t^{-s}\,dt$. (For sharper bounds, see [BR02].) Now

\[C_{2,\frac12+\epsilon,\frac12+\epsilon} \le \frac{(1-2^{-3/2-\epsilon})^2(1+2^{-3/2})^2(1+2^{-2})^2(1-2^{-1-2\epsilon})}{(1-2^{-1/2})^2(1-2^{-3})^4(1-2^{-4})} \le 14.652983,\]

whereas $C_{1,\frac12+\epsilon,\frac12+\epsilon} = 1$. (We are assuming $x \ge 10^6$, and so $\epsilon \le 1/(\log 10^6)$.) Hence

\[|g_v(x)| \le \begin{cases} \frac{1}{x}(111.536 + 55.768 \log x) & \text{if } v = 1, \\ \frac{1}{x}(1634.34 + 817.168 \log x) & \text{if } v = 2 \end{cases}\]

for $10^6 \le x < 10^{10}$.

For general $x$, we must use the second bound in (5.26). Define $c = 1/(\log 10^{10})$.

We see that, if $x > 10^{10}$,

\[\begin{aligned}
|g_v(x)| \le{}& \frac{0.03^2}{(\log x)^2} F_1(1-c, 1-c) \cdot C_{v,1-c,1-c} + 2 \cdot \frac{\sqrt{2}}{\sqrt{x}} \frac{0.03}{\log x} F(1-c, 1/2) \cdot C_{v,1-c,1/2} \\
&+ \frac{1}{x}(111.536 + 55.768 \log x) \cdot C_{v,\frac12+\epsilon,\frac12+\epsilon}.
\end{aligned}\]

For $v = 1$, this gives

\[|g_1(x)| \le \frac{0.0044325}{(\log x)^2} + \frac{2.1626}{\sqrt{x}\log x} + \frac{1}{x}(111.536 + 55.768 \log x) \le \frac{0.0044325}{(\log x)^2} + \frac{0.1079}{\sqrt{x}};\]

for $v = 2$, we obtain

\[|g_2(x)| \le \frac{0.038128}{(\log x)^2} + \frac{25.607}{\sqrt{x}\log x} + \frac{1}{x}(1634.34 + 817.168 \log x) \le \frac{0.038128}{(\log x)^2} + \frac{0.2046}{\sqrt{x}}.\]
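The constants in (5.27) and in the bound on $C_{2,\frac12+\epsilon,\frac12+\epsilon}$ are easy to recompute. The following is a minimal sketch in floating point, not rigorous interval arithmetic; the helper `zeta` is an ad hoc approximation written for this illustration (a truncated sum with an Euler-Maclaurin tail), and all function names are assumptions.

```python
import math

def zeta(s, n_terms=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus Euler-Maclaurin tail."""
    n = n_terms
    head = sum(k ** -s for k in range(1, n))
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail

# Constant in (5.27): zeta(3/2)^4 zeta(2)^2 / (zeta(3)^4 zeta(4)).
const_527 = zeta(1.5) ** 4 * zeta(2) ** 2 / (zeta(3) ** 4 * zeta(4))

def c2_bound(eps):
    """Upper bound on C_{2,1/2+eps,1/2+eps} as displayed after (5.27)."""
    num = ((1 - 2 ** (-1.5 - eps)) ** 2 * (1 + 2 ** -1.5) ** 2
           * (1 + 2 ** -2) ** 2 * (1 - 2 ** (-1 - 2 * eps)))
    den = (1 - 2 ** -0.5) ** 2 * (1 - 2 ** -3) ** 4 * (1 - 2 ** -4)
    return num / den

# The text assumes x >= 10^6, i.e. eps <= 1/log(10^6); the bound increases with eps.
c2_max = c2_bound(1 / math.log(1e6))

print(const_527, c2_max, 2 * const_527, const_527 * c2_max)
```

The last two printed values correspond to the coefficients 111.536 and 817.168 in the bound on $|g_v(x)|$ for $10^6 \le x < 10^{10}$, up to the rounding used in the text.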

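5.1.3 Estimating the triple sum

We will now be able to bound the triple sum in (5.19), viz.,

\[\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s} \int_{1/2}^{1} g_v(uS/s)\,du, \tag{5.28}\]

where $g_v$ is as in Lemma 5.1.2.

As we will soon see, Lemma 5.1.2 implies that (5.28) is bounded by a constant (essentially because the integral $\int_0^{1/2} 1/(t(\log t)^2)\,dt$ converges). We must give as good a constant as we can, since it will affect the largest term in the final result.

Clearly, $g_v(R) = g_v(\lfloor R \rfloor)$. The contribution of each $g_v(m)$, $1 \le m \le S$, to (5.28) is exactly $g_v(m)$ times

\[\begin{aligned}
&\sum_{\substack{\frac{S}{m+1} < s \le \frac{S}{m} \\ (s,v)=1}} \frac{1}{s} \int_{ms/S}^{1} 1\,du + \sum_{\substack{\frac{S}{2m} < s \le \frac{S}{m+1} \\ (s,v)=1}} \frac{1}{s} \int_{ms/S}^{(m+1)s/S} 1\,du + \sum_{\substack{\frac{S}{2(m+1)} < s \le \frac{S}{2m} \\ (s,v)=1}} \frac{1}{s} \int_{1/2}^{(m+1)s/S} du \\
&= \sum_{\substack{\frac{S}{m+1} < s \le \frac{S}{m} \\ (s,v)=1}} \left(\frac{1}{s} - \frac{m}{S}\right) + \sum_{\substack{\frac{S}{2m} < s \le \frac{S}{m+1} \\ (s,v)=1}} \frac{1}{S} + \sum_{\substack{\frac{S}{2(m+1)} < s \le \frac{S}{2m} \\ (s,v)=1}} \left(\frac{m+1}{S} - \frac{1}{2s}\right).
\end{aligned} \tag{5.29}\]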

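Write $f(t) = 1/S$ for $S/2m < t \le S/(m+1)$, $f(t) = 0$ for $t > S/m$ or $t < S/2(m+1)$, $f(t) = 1/t - m/S$ for $S/(m+1) < t \le S/m$ and $f(t) = (m+1)/S - 1/2t$ for $S/2(m+1) < t \le S/2m$; then (5.29) equals $\sum_{n: (n,v)=1} f(n)$. By Euler-Maclaurin (second order),

\[\begin{aligned}
\sum_n f(n) &= \int_{-\infty}^{\infty} \left(f(x) - \frac{1}{2} B_2(\{x\}) f''(x)\right) dx = \int_{-\infty}^{\infty} \left(f(x) + O^*\left(\frac{1}{12}|f''(x)|\right)\right) dx \\
&= \int_{-\infty}^{\infty} f(x)\,dx + \frac{1}{6} \cdot O^*\left(\left|f'\left(\frac{S}{2m}\right)\right| + \left|f'\left(\frac{S}{m+1}\right)\right|\right) \\
&= \frac{1}{2}\log\left(1 + \frac{1}{m}\right) + \frac{1}{6} \cdot O^*\left(\left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2\right).
\end{aligned} \tag{5.30}\]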
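A quick numerical illustration of (5.30) in the case $v = 1$: the sketch below sums the piecewise function $f$ defined above over the integers and compares the result with the main term $\frac12 \log(1 + 1/m)$ and with the error term $(5m^2 + 2m + 1)/(6S^2)$ that follows from (5.30). The function names and the sample values of $S$ and $m$ are assumptions chosen for illustration, and floating-point sums are not a rigorous check.

```python
import math

def f(t, S, m):
    """Piecewise weight from the text; its sum over integers n equals (5.29) for v = 1."""
    if t <= S / (2 * (m + 1)) or t > S / m:
        return 0.0
    if t <= S / (2 * m):
        return (m + 1) / S - 1 / (2 * t)
    if t <= S / (m + 1):
        return 1 / S
    return 1 / t - m / S

def exact_sum(S, m):
    """Sum of f(n) over all integers n in the support of f."""
    lo = max(math.floor(S / (2 * (m + 1))), 1)
    hi = math.floor(S / m)
    return sum(f(n, S, m) for n in range(lo, hi + 1))

def main_term(m):
    return 0.5 * math.log(1 + 1 / m)

def error_bound(S, m):
    """(1/6) * ((2m/S)^2 + ((m+1)/S)^2), as in (5.30)."""
    return (5 * m * m + 2 * m + 1) / (6 * S * S)

for S, m in [(1000.5, 3), (12345.678, 5), (50000.25, 10)]:
    print(S, m, abs(exact_sum(S, m) - main_term(m)), error_bound(S, m))
```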


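Similarly,

\[\begin{aligned}
\sum_{n \text{ odd}} f(n) &= \int_{-\infty}^{\infty} \left(f(2x+1) - \frac{1}{2} B_2(\{x\}) \frac{d^2 f(2x+1)}{dx^2}\right) dx \\
&= \frac{1}{2} \int_{-\infty}^{\infty} f(x)\,dx - 2 \int_{-\infty}^{\infty} \frac{1}{2} B_2\left(\left\{\frac{x-1}{2}\right\}\right) f''(x)\,dx \\
&= \frac{1}{2} \int_{-\infty}^{\infty} f(x)\,dx + \frac{1}{6} \int_{-\infty}^{\infty} O^*(|f''(x)|)\,dx \\
&= \frac{1}{4}\log\left(1 + \frac{1}{m}\right) + \frac{1}{3} \cdot O^*\left(\left(\frac{2m}{S}\right)^2 + \left(\frac{m+1}{S}\right)^2\right).
\end{aligned}\]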

We use these expressions for $m \le C_0$, where $C_0 \ge 33$ is a constant to be computed later; they will give us the main term. For $m > C_0$, we use the bounds on $|g(m)|$ that Lemma 5.1.2 gives us.

(Starting now and for the rest of the paper, we will focus on the cases $v = 1$, $v = 2$ when giving explicit computational estimates. All of our procedures would allow higher values of $v$ as well, but, as will become clear much later, the gains from higher values of $v$ are offset by losses and complications elsewhere.)

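Let us estimate (5.28). Let

\[\begin{aligned}
c_{v,0} &= \begin{cases} 1/6 & \text{if } v = 1, \\ 1/3 & \text{if } v = 2, \end{cases} &
c_{v,1} &= \begin{cases} 1 & \text{if } v = 1, \\ 2.5 & \text{if } v = 2, \end{cases} \\
c_{v,2} &= \begin{cases} 55.768 & \text{if } v = 1, \\ 817.168 & \text{if } v = 2, \end{cases} &
c_{v,3} &= \begin{cases} 111.536 & \text{if } v = 1, \\ 1634.34 & \text{if } v = 2, \end{cases} \\
c_{v,4} &= \begin{cases} 0.0044325 & \text{if } v = 1, \\ 0.038128 & \text{if } v = 2, \end{cases} &
c_{v,5} &= \begin{cases} 0.1079 & \text{if } v = 1, \\ 0.2046 & \text{if } v = 2. \end{cases}
\end{aligned}\]

Then (5.28) equals

\[\begin{aligned}
&\sum_{m \le C_0} g_v(m) \cdot \left(\frac{\phi(v)}{2v} \log\left(1 + \frac{1}{m}\right) + O^*\left(c_{v,0} \frac{5m^2 + 2m + 1}{S^2}\right)\right) \\
&+ \sum_{S/10^6 \le s < S/C_0} \frac{1}{s} \int_{1/2}^{1} O^*\left(\frac{c_{v,1}}{uS/s}\right) du \\
&+ \sum_{S/10^{10} \le s < S/10^6} \frac{1}{s} \int_{1/2}^{1} O^*\left(\frac{c_{v,2} \log(uS/s) + c_{v,3}}{uS/s}\right) du \\
&+ \sum_{s < S/10^{10}} \frac{1}{s} \int_{1/2}^{1} O^*\left(\frac{c_{v,4}}{(\log uS/s)^2} + \frac{c_{v,5}}{\sqrt{uS/s}}\right) du,
\end{aligned}\]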


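which is

\[\begin{aligned}
&\sum_{m \le C_0} g_v(m) \cdot \frac{\phi(v)}{2v} \log\left(1 + \frac{1}{m}\right) + \sum_{m \le C_0} |g(m)| \cdot O^*\left(c_{v,0} \frac{5m^2 + 2m + 1}{S^2}\right) \\
&+ O^*\left(c_{v,1} \frac{\log 2}{C_0} + \frac{\log 2}{10^6}\left(c_{v,3} + c_{v,2}(1 + \log 10^6)\right) + \frac{2 - \sqrt{2}}{10^{10/2}} c_{v,5}\right) \\
&+ O^*\left(\sum_{s < S/10^{10}} \frac{c_{v,4}/2}{s (\log S/2s)^2}\right)
\end{aligned}\]

for $S \ge (C_0 + 1)$. Note that $\sum_{s < S/10^{10}} \frac{1}{s(\log S/2s)^2} = \int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt$. Now

\[\frac{c_{v,4}}{2} \int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt = \frac{c_{v,4}/2}{\log(10^{10}/2)} = \begin{cases} 0.00009923 & \text{if } v = 1, \\ 0.000853636 & \text{if } v = 2, \end{cases}\]

and

\[\frac{\log 2}{10^6}\left(c_{v,3} + c_{v,2}(1 + \log 10^6)\right) + \frac{2 - \sqrt{2}}{10^5} c_{v,5} = \begin{cases} 0.0006506 & \text{if } v = 1, \\ 0.009525 & \text{if } v = 2. \end{cases}\]

For $C_0 = 10000$,

\[\frac{\phi(v)}{v} \frac{1}{2} \sum_{m \le C_0} g_v(m) \cdot \log\left(1 + \frac{1}{m}\right) = \begin{cases} 0.362482 & \text{if } v = 1, \\ 0.360576 & \text{if } v = 2, \end{cases}\]

\[c_{v,0} \sum_{m \le C_0} |g_v(m)|(5m^2 + 2m + 1) \le \begin{cases} 6204066.5 & \text{if } v = 1, \\ 15911340.1 & \text{if } v = 2, \end{cases}\]

and

\[c_{v,1} \cdot (\log 2)/C_0 = \begin{cases} 0.00006931 & \text{if } v = 1, \\ 0.00017328 & \text{if } v = 2. \end{cases}\]

Thus, for $S \ge 100000$,

\[\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s} \int_{1/2}^{1} g_v(uS/s)\,du \le \begin{cases} 0.36393 & \text{if } v = 1, \\ 0.37273 & \text{if } v = 2. \end{cases} \tag{5.31}\]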
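The arithmetic leading to (5.31) can be re-assembled from the constants listed above. In the sketch below, the two inputs that depend on the values $g_v(m)$ (the main term and the sum $c_{v,0}\sum |g_v(m)|(5m^2+2m+1)$) are copied from the text rather than recomputed; the dictionary layout and function name are assumptions made for illustration.

```python
import math

C0 = 10_000
consts = {
    1: dict(c1=1.0, c2=55.768, c3=111.536, c4=0.0044325, c5=0.1079,
            main=0.362482, em_sum=6204066.5),
    2: dict(c1=2.5, c2=817.168, c3=1634.34, c4=0.038128, c5=0.2046,
            main=0.360576, em_sum=15911340.1),
}

def pieces(v, S=100_000):
    """Return (log_tail, mid, trivial, total) for the bound (5.31) at the given S."""
    c = consts[v]
    # (c_{v,4}/2) / log(10^10 / 2): contribution of s < S/10^10.
    log_tail = (c["c4"] / 2) / math.log(1e10 / 2)
    # Contribution of S/10^10 <= s < S/10^6 plus the c_{v,5} term.
    mid = ((math.log(2) / 1e6) * (c["c3"] + c["c2"] * (1 + math.log(1e6)))
           + (2 - math.sqrt(2)) / 1e5 * c["c5"])
    # Contribution of S/10^6 <= s < S/C0.
    trivial = c["c1"] * math.log(2) / C0
    # Euler-Maclaurin error for m <= C0, divided by S^2.
    em_error = c["em_sum"] / S ** 2
    total = c["main"] + em_error + trivial + mid + log_tail
    return log_tail, mid, trivial, total

for v in (1, 2):
    print(v, pieces(v))
```

Since the Euler-Maclaurin error decreases like $1/S^2$, evaluating at $S = 100000$ gives the worst case over the range $S \ge 100000$.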

For $S < 100000$, we proceed as above, but using the exact expression (5.29) instead of (5.30). Note (5.29) is of the form $f_{s,m,1}(S) + f_{s,m,2}(S)/S$, where both $f_{s,m,1}(S)$ and $f_{s,m,2}(S)$ depend only on $\lfloor S \rfloor$ (and on $s$ and $m$). Summing over $m \le S$, we obtain a bound of the form

\[\sum_{\substack{s \le S \\ (s,v)=1}} \frac{1}{s} \int_{1/2}^{1} g_v(uS/s)\,du \le G_v(S)\]

with

\[G_v(S) = K_{v,1}(\lfloor S \rfloor) + K_{v,2}(\lfloor S \rfloor)/S,\]

where $K_{v,1}(n)$ and $K_{v,2}(n)$ can be computed explicitly for each integer $n$. (For example, $G_v(S) = 1 - 1/S$ for $1 \le S < 2$ and $G_v(S) = 0$ for $S < 1$.)

It is easy to check numerically that this implies that (5.31) holds not just for $S \ge 100000$ but also for $40 \le S < 100000$ (if $v = 1$) or $16 \le S < 100000$ (if $v = 2$). Using the fact that $G_v(S)$ is non-negative, we can compare $\int_1^T G_v(S)\,dS/S$ with $\log(T + 1/N)$ for each $T \in [2, 40] \cap \frac{1}{N}\mathbb{Z}$ ($N$ a large integer) to show, again numerically, that

\[\int_1^T G_v(S) \frac{dS}{S} \le \begin{cases} 0.3698 \log T & \text{if } v = 1, \\ 0.37273 \log T & \text{if } v = 2. \end{cases} \tag{5.32}\]

(We use $N = 100000$ for $v = 1$; already $N = 10$ gives us the answer above for $v = 2$. Indeed, computations suggest the better bound 0.358 instead of 0.37273; we are committed to using 0.37273 because of (5.31).)

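Multiplying by $6v/\pi^2\sigma(v)$, we conclude that

\[S_1(U, W) = \frac{x}{W} \cdot H_1\left(\frac{x}{WU}\right) + O^*\left(5.08\,\zeta(3/2)^3 \frac{x^{3/2}}{W^{3/2} U}\right) \tag{5.33}\]

if $v = 1$,

\[S_1(U, W) = \frac{x}{W} \cdot H_2\left(\frac{x}{WU}\right) + O^*\left(1.27\,\zeta(3/2)^3 \frac{x^{3/2}}{W^{3/2} U}\right) \tag{5.34}\]

if $v = 2$, where

\[H_1(S) = \begin{cases} \frac{6}{\pi^2} G_1(S) & \text{if } 1 \le S < 40, \\ 0.22125 & \text{if } S \ge 40, \end{cases} \qquad H_2(S) = \begin{cases} \frac{4}{\pi^2} G_2(S) & \text{if } 1 \le S < 16, \\ 0.15107 & \text{if } S \ge 16. \end{cases} \tag{5.35}\]

Hence (by (5.32))

\[\int_1^T H_v(S) \frac{dS}{S} \le \begin{cases} 0.22482 \log T & \text{if } v = 1, \\ 0.15107 \log T & \text{if } v = 2; \end{cases} \tag{5.36}\]

moreover,

\[H_1(S) \le \frac{3}{\pi^2}, \qquad H_2(S) \le \frac{2}{\pi^2} \tag{5.37}\]

for all $S$.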

Note. There is another way to obtain cancellation on $\mu$, applicable when $(x/W) > Uq$ (as is unfortunately never the case in our main application). For this alternative to be taken, one must either apply Cauchy-Schwarz on $n$ rather than $m$ (resulting in exponential sums over $m$) or lump together all $m$ near each other and in the same congruence class modulo $q$ before applying Cauchy-Schwarz on $m$ (one can indeed do this if $\delta$ is small). We could then write

\[\sum_{\substack{m \sim W \\ m \equiv r \bmod q}} \sum_{\substack{d \mid m \\ d > U}} \mu(d) = -\sum_{\substack{m \sim W \\ m \equiv r \bmod q}} \sum_{\substack{d \mid m \\ d \le U}} \mu(d) = -\sum_{d \le U} \mu(d)(W/qd + O(1))\]

and obtain cancellation on $d$. If $Uq \ge (x/W)$, however, the error term dominates.

5.2 The sum S2: the large sieve, primes and tails

We must now bound

\[S_2(U', W', W) = \sum_{\substack{U' < m \le x/W \\ (m,v)=1}} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \tag{5.38}\]

for $U' = \max(U, x/2W)$, $W' = \max(V, W/2)$. (The condition $(p, v) = 1$ will be fulfilled automatically by the assumption $V > v$.)

From a modern perspective, this is clearly a case for a large sieve. It is also clear that we ought to try to apply a large sieve for sequences of prime support. What is subtler here is how to do things well for very large $q$ (i.e., $x/q$ small). This is in some sense a dual problem to that of $q$ small, but it poses additional complications; for example, it is not obvious how to take advantage of prime support for very large $q$.

As in type I, we avoid this entire issue by forbidding $q$ large and then taking advantage of the error term $\delta/x$ in the approximation $\alpha = a/q + \delta/x$. This is one of the main innovations here. Note this alternative method will allow us to take advantage of prime support.

A key situation to study is that of frequencies $\alpha_i$ clustering around given rationals $a/q$ while nevertheless keeping at a certain small distance from each other.

Lemma 5.2.1. Let $q \ge 1$. Let $\alpha_1, \alpha_2, \ldots, \alpha_k \in \mathbb{R}/\mathbb{Z}$ be of the form $\alpha_i = a_i/q + \upsilon_i$, $0 \le a_i < q$, where the elements $\upsilon_i \in \mathbb{R}$ all lie in an interval of length $\upsilon > 0$, and where $a_i = a_j$ implies $|\upsilon_i - \upsilon_j| > \nu > 0$. Assume $\nu + \upsilon \le 1/q$. Then, for any $W, W' \ge 1$, $W' \ge W/2$,

\[\sum_{i=1}^{k} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha_i p)\right|^2 \le \min\left(1, \frac{2q}{\phi(q)} \frac{1}{\log\left((q(\nu + \upsilon))^{-1}\right)}\right) \cdot \left(W - W' + \nu^{-1}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.39}\]

Proof. For any distinct $i$, $j$, the angles $\alpha_i$, $\alpha_j$ are separated by at least $\nu$ (if $a_i = a_j$) or at least $1/q - |\upsilon_i - \upsilon_j| \ge 1/q - \upsilon \ge \nu$ (if $a_i \ne a_j$). Hence we can apply the large sieve (in the optimal $N + \delta^{-1} - 1$ form due to Selberg [Sel91] and Montgomery-Vaughan [MV74]) and obtain the bound in (5.39) with 1 instead of $\min(1, \ldots)$ immediately.

We can also apply Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]). This gives us that the left side of (5.39) is at most

\[\left(\sum_{\substack{r \le R \\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)}\right)^{-1} \sum_{\substack{r \le R \\ (r,q)=1}} \sum_{\substack{a' \bmod r \\ (a',r)=1}} \sum_{i=1}^{k} \left|\sum_{W' < p \le W} (\log p)\, e((\alpha_i + a'/r)p)\right|^2. \tag{5.40}\]

If we add all possible fractions of the form $a'/r$, $r \le R$, $(r, q) = 1$, to the fractions $a_i/q$, we obtain fractions that are separated by at least $1/qR^2$. If $\nu + \upsilon \le 1/qR^2$, then the resulting angles $\alpha_i + a'/r$ are still separated by at least $\nu$. Thus we can apply the large sieve to (5.40); setting $R = 1/\sqrt{(\nu + \upsilon)q}$, we see that we gain a factor of

\[\sum_{\substack{r \le R \\ (r,q)=1}} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q} \sum_{r \le R} \frac{(\mu(r))^2}{\phi(r)} \ge \frac{\phi(q)}{q} \sum_{d \le R} \frac{1}{d} \ge \frac{\phi(q)}{2q} \log\left((q(\nu + \upsilon))^{-1}\right), \tag{5.41}\]

since $\sum_{d \le R} 1/d \ge \log(R)$ for all $R \ge 1$ (integer or not).

Let us first give a bound on sums of the type of $S_2(U, V, W)$ using prime support but not the error terms (or Lemma 5.2.1). This is something that can be done very well using tools available in the literature. (Not all of these tools seem to be known as widely as they should be.) Bounds (5.42) and (5.44) are completely standard large-sieve bounds. To obtain the gain of a factor of log in (5.43), we use a lemma of Montgomery's, for whose modern proof (containing an improvement by Huxley) we refer to the standard source [IK04, Lemma 7.15]. The purpose of Montgomery's lemma is precisely to gain a factor of log in applications of the large sieve to sequences supported on the primes. To use the lemma efficiently, we apply Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], rather than more common forms of the large sieve. (The idea, used in [MV73] to prove an improved version of the Brun-Titchmarsh inequality, is that Farey fractions (rationals with bounded denominator) are not equidistributed; this fact can be exploited if a large sieve with weights is used.)

Lemma 5.2.2. Let $W \ge 1$, $W' \ge W/2$. Let $\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then

\[\sum_{A_0 < m \le A_1} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(q, \lceil Q/2 \rceil)} \right\rceil \cdot (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.42}\]

If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:

\[\sum_{A_0 < m \le A_1} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{q} \right\rceil \cdot \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.43}\]

If $A_1 - A_0 \le \varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0, 1]$, the following bound also holds:

\[\sum_{A_0 < m \le A_1} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left(W - W' + \frac{q}{1 - \varrho\rho}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.44}\]

Proof. Let $k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil$. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $k$ consecutive integers $m_0 + 1, m_0 + 2, \ldots$. For $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of at least

\[|(a/q)(m - m')| - O^*(k/qQ) = 1/q - O^*(1/2q) \ge 1/2q.\]

By the large sieve,

\[\sum_{a=1}^{q} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha(m_0 + a)p)\right|^2 \le ((W - W') + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.45}\]

We obtain (5.42) by summing over all $\lceil (A_1 - A_0)/k \rceil$ blocks.

If $A_1 - A_0 \le \varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0, 1]$, we obtain (5.44) simply by applying the large sieve without splitting the interval $A_0 < m \le A_1$.

Let us now prove (5.43). We will use Montgomery's inequality, followed by Montgomery and Vaughan's large sieve with weights. An angle $a/q + a_1'/r_1$ is separated from other angles $a'/q + a_2'/r_2$ ($r_1, r_2 \le R$, $(a_i, r_i) = 1$) by at least $1/qr_1R$, rather than just $1/qR^2$. We will choose $R$ so that $qR^2 < Q$; this implies $1/Q < 1/qR^2 \le 1/qr_1R$.

By a lemma of Montgomery's [IK04, Lemma 7.15], applied (for each $1 \le a \le q$) to $S(\alpha) = \sum_n a_n e(\alpha n)$ with $a_n = \log(n)\, e(\alpha(m_0 + a)n)$ if $n$ is prime and $a_n = 0$ otherwise,

\[\frac{1}{\phi(r)} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha(m_0 + a)p)\right|^2 \le \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left|\sum_{W' < p \le W} (\log p)\, e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right)p\right)\right|^2 \tag{5.46}\]

for each square-free $r \le W'$. We multiply both sides of (5.46) by

\[\left(\frac{W}{2} + \frac{3}{2}\left(\frac{1}{qrR} - \frac{1}{Q}\right)^{-1}\right)^{-1}\]

and sum over all $a = 0, 1, \ldots, q - 1$ and all square-free $r \le R$ coprime to $q$; we will later make sure that $R \le W'$. We obtain that

\[\sum_{\substack{r \le R \\ (r,q)=1}} \left(\frac{W}{2} + \frac{3}{2}\left(\frac{1}{qrR} - \frac{1}{Q}\right)^{-1}\right)^{-1} \frac{\mu(r)^2}{\phi(r)} \cdot \sum_{a=1}^{q} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha(m_0 + a)p)\right|^2 \tag{5.47}\]

is at most

\[\sum_{\substack{r \le R \\ (r,q)=1 \\ r \text{ sq-free}}} \left(\frac{W}{2} + \frac{3}{2}\left(\frac{1}{qrR} - \frac{1}{Q}\right)^{-1}\right)^{-1} \sum_{a=1}^{q} \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left|\sum_{W' < p \le W} (\log p)\, e\left(\left(\alpha(m_0 + a) + \frac{a'}{r}\right)p\right)\right|^2. \tag{5.48}\]

We now apply the large sieve with weights [MV73, (1.6)], recalling that each angle $\alpha(m_0 + a) + a'/r$ is separated from the others by at least $1/qrR - 1/Q$; we obtain that (5.48) is at most $\sum_{W' < p \le W} (\log p)^2$. It remains to estimate the sum in the first line of (5.47). (We are following here a procedure analogous to that used in [MV73] to prove the Brun-Titchmarsh theorem.)

Assume first that $q \le W/13.5$. Set

\[R = \left(\sigma \frac{W}{q}\right)^{1/2}, \tag{5.49}\]

where $\sigma = 1/2e^{2 \cdot 0.25068} = 0.30285\ldots$. It is clear that $qR^2 < Q$, $q < W'$ and $R \ge 2$. Moreover, for $r \le R$,

\[\frac{1}{Q} \le \frac{1}{3.5W} \le \frac{\sigma}{3.5} \frac{1}{\sigma W} = \frac{\sigma}{3.5} \frac{1}{qR^2} \le \frac{\sigma/3.5}{qrR}.\]

Hence

\[\frac{W}{2} + \frac{3}{2}\left(\frac{1}{qrR} - \frac{1}{Q}\right)^{-1} \le \frac{W}{2} + \frac{3}{2} \frac{qrR}{1 - \sigma/3.5} = \frac{W}{2} + \frac{3r}{2\left(1 - \frac{\sigma}{3.5}\right)R} \cdot \frac{2\sigma W}{2} = \frac{W}{2}\left(1 + \frac{3\sigma}{1 - \sigma/3.5} \frac{r}{R}\right) < \frac{W}{2}\left(1 + \frac{r}{R}\right)\]

and so

\[\sum_{\substack{r \le R \\ (r,q)=1}} \left(\frac{W}{2} + \frac{3}{2}\left(\frac{1}{qrR} - \frac{1}{Q}\right)^{-1}\right)^{-1} \frac{\mu(r)^2}{\phi(r)} \ge \frac{2}{W} \sum_{\substack{r \le R \\ (r,q)=1}} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} \ge \frac{2}{W} \frac{\phi(q)}{q} \sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)}.\]

For $R \ge 2$,

\[\sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \log R + 0.25068;\]

this is true for $R \ge 100$ by [MV73, Lemma 8] and easily verifiable numerically for $2 \le R < 100$. (It suffices to verify this for $R$ integer with $r < R$ instead of $r \le R$, as that is the worst case.)
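The numerical verification mentioned here is short enough to sketch. The code below follows the parenthetical remark: for each integer $R$ it sums over square-free $r < R$, which corresponds to real $R$ approaching that integer from below, so integers $3 \le R \le 100$ cover the range $2 \le R < 100$. The helper names are assumptions for illustration, and plain floating point is used rather than interval arithmetic.

```python
import math

def is_squarefree(n):
    """True if no square d^2 > 1 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

def phi(n):
    """Euler's totient function by trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def weighted_sum(R):
    """Sum over square-free r < R of (1 + r/R)^(-1) / phi(r)."""
    return sum(1 / ((1 + r / R) * phi(r))
               for r in range(1, R) if is_squarefree(r))

# Margin of the inequality: sum - (log R + 0.25068), worst case r < R.
margins = {R: weighted_sum(R) - math.log(R) - 0.25068 for R in range(3, 101)}
tightest = min(margins, key=margins.get)
print(tightest, margins[tightest])
```

In this sketch the margin is smallest at $R = 5$, where it is of the order of $10^{-6}$; this is consistent with the constant 0.25068 being essentially the best possible for this range.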

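Now

\[\log R = \frac{1}{2}\left(\log\frac{W}{2q} + \log 2\sigma\right) = \frac{1}{2}\log\frac{W}{2q} - 0.25068.\]

Hence

\[\sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \frac{1}{2}\log\frac{W}{2q}\]

and the statement follows.

Now consider the case $q > W/13.5$. If $q$ is even, then, in this range, inequality (5.42) is always better than (5.43), and so we are done. Assume, then, that $W/13.5 < q \le W/2$ and $q$ is odd. We set $R = 2$; clearly $qR^2 < W \le Q$ and $q < W/2 \le W'$, and so this choice of $R$ is valid. It remains to check that

\[\frac{1}{\frac{W}{2} + \frac{3}{2}\left(\frac{1}{2q} - \frac{1}{Q}\right)^{-1}} + \frac{1}{\frac{W}{2} + \frac{3}{2}\left(\frac{1}{4q} - \frac{1}{Q}\right)^{-1}} \ge \frac{1}{W}\log\frac{W}{2q}.\]

This follows because

\[\frac{1}{\frac{1}{2} + \frac{3}{2}\left(\frac{t}{2} - \frac{1}{3.5}\right)^{-1}} + \frac{1}{\frac{1}{2} + \frac{3}{2}\left(\frac{t}{4} - \frac{1}{3.5}\right)^{-1}} \ge \log\frac{t}{2}\]

for all $2 \le t \le 13.5$.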
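The block-splitting step at the start of the proof of Lemma 5.2.2 can be illustrated in exact arithmetic: for $\alpha = a/q + \theta/(qQ)$ with $|\theta| \le 1$ and $(a, q) = 1$, any two angles $\alpha m$, $\alpha m'$ with $m \ne m'$ in a block of $k = \min(q, \lceil Q/2 \rceil)$ consecutive integers are at distance at least $1/2q$ in $\mathbb{R}/\mathbb{Z}$. The sketch below checks this for sample parameters; the function names, the parameter `theta` and the chosen values are assumptions made for illustration.

```python
from fractions import Fraction as F
from math import ceil, gcd

def circle_dist(t):
    """Distance from a rational number t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)

def min_separation(a, q, Q, theta, m0):
    """Smallest distance in R/Z between alpha*m and alpha*m' within one block.

    alpha = a/q + theta/(q*Q), with |theta| <= 1 and gcd(a, q) = 1 assumed.
    The block is m0+1, ..., m0+k, where k = min(q, ceil(Q/2)).
    Returns None if the block has a single element.
    """
    alpha = F(a, q) + theta / (q * Q)
    k = min(q, ceil(F(Q, 2)))
    block = range(m0 + 1, m0 + k + 1)
    dists = [circle_dist(alpha * (m - mp))
             for m in block for mp in block if mp < m]
    return min(dists) if dists else None

examples = [(3, 7, 20, F(1), 0), (5, 12, 13, F(-1), 100), (1, 9, 9, F(1, 2), 17)]
for a, q, Q, theta, m0 in examples:
    assert gcd(a, q) == 1
    print(q, Q, min_separation(a, q, Q, theta, m0), F(1, 2 * q))
```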

We need a version of Lemma 5.2.2 with $m$ restricted to the odd numbers, since we plan to set the parameter $v$ equal to 2.


Lemma 5.2.3. Let $W \ge 1$, $W' \ge W/2$. Let $2\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then

\[\sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{\min(2q, Q)} \right\rceil \cdot (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2. \tag{5.50}\]

If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:

\[\sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left\lceil \frac{A_1 - A_0}{2q} \right\rceil \cdot \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.51}\]

If $A_1 - A_0 \le 2\varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0, 1]$, the following bound also holds:

\[\sum_{A_0 < m \le A_1} \left|\sum_{W' < p \le W} (\log p)\, e(\alpha m p)\right|^2 \le \left(W - W' + \frac{q}{1 - \varrho\rho}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.52}\]

Proof. We follow the proof of Lemma 5.2.2, noting the differences. Let

\[k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil,\]

just as before. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $2k$ consecutive integers; any such block contains at most $k$ odd numbers. For odd $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of

\[|\alpha(m - m')| = \left|2\alpha \frac{m - m'}{2}\right| = |(a/q)k| - O^*(k/qQ) \ge 1/2q.\]

We obtain (5.50) and (5.52) just as we obtained (5.42) and (5.44) before. To obtain (5.51), proceed again as before, noting that the angles we are working with can be labelled as $\alpha(m_0 + 2a)$, $0 \le a < q$.

The idea now (for large $\delta$) is that, if $\delta$ is not negligible, then, as $m$ increases and $\alpha m$ loops around the circle $\mathbb{R}/\mathbb{Z}$, $\alpha m$ roughly repeats itself every $q$ steps, but with a slight displacement. This displacement gives rise to a configuration to which Lemma 5.2.1 is applicable. The effect is that we can apply the large sieve once instead of many times, thus leading to a gain of a large factor (essentially, the number of times the large sieve would have been used). This is how we obtain the factor of $|\delta|$ in the denominator of the main term $x/|\delta|q$ in (5.56) and (5.57).


Proposition 5.2.4. Let $x \ge W \ge 1$, $W' \ge W/2$, $U' \ge x/2W$. Let $Q \ge 3.5W$. Let $2\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ$, $q \le Q$. Let $S_2(U', W', W)$ be as in (5.38) with $v = 2$.

For $q \le \rho Q$, where $\rho \in [0, 1]$,

\[S_2(U', W', W) \le \left(\max(1, 2\rho)\left(\frac{x}{8q} + \frac{x}{2W}\right) + \frac{W}{2} + 2q\right) \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.53}\]

If $q < W/2$,

\[S_2(U', W', W) \le \left(\frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)}\right) \cdot \sum_{W' < p \le W} (\log p)^2. \tag{5.54}\]

If $W > x/4q$, the following bound also holds:

\[S_2(U', W', W) \le \left(\frac{W}{2} + \frac{q}{1 - x/4Wq}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.55}\]

If $\delta \ne 0$ and $x/4W + q \le x/|\delta|q$,

\[S_2(U', W', W) \le \min\left(1, \frac{2q/\phi(q)}{\log\left(\frac{x}{|\delta q|}\left(q + \frac{x}{4W}\right)^{-1}\right)}\right) \cdot \left(\frac{x}{|\delta q|} + \frac{W}{2}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.56}\]

Lastly, if $\delta \ne 0$ and $q \le \rho Q$, where $\rho \in [0, 1)$,

\[S_2(U', W', W) \le \left(\frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1 - \rho)Q} + \frac{x}{4(1 - \rho)W}\right) \sum_{W' < p \le W} (\log p)^2. \tag{5.57}\]

The trivial bound would be in the order of

\[S_2(U', W', W) = (x/2\log x) \sum_{W' < p \le W} (\log p)^2.\]

In practice, (5.55) gets applied when $W \ge x/q$.

Proof. Let us first prove statements (5.54) and (5.53), which do not involve $\delta$. Assume first $q \le W/2$. Then, by (5.51) with $A_0 = U'$, $A_1 = x/W$,

\[S_2(U', W', W) \le \left(\frac{x/W - U'}{2q} + 1\right) \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \sum_{W' < p \le W} (\log p)^2.\]

Clearly $(x/W - U')W \le (x/2W) \cdot W = x/2$. Thus (5.54) holds.

Assume now that $q \le \rho Q$. Apply (5.50) with $A_0 = U'$, $A_1 = x/W$. Then

\[S_2(U', W', W) \le \left(\frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1\right)(W - W' + 2q) \sum_{W' < p \le W} (\log p)^2.\]

Now

\[\begin{aligned}
\left(\frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1\right) \cdot (W - W' + 2q)
&\le \left(\frac{x}{W} - U'\right) \frac{W - W'}{q \min(2, \rho^{-1})} + \max(1, 2\rho)\left(\frac{x}{W} - U'\right) + W/2 + 2q \\
&\le \frac{x/4}{q \min(2, \rho^{-1})} + \max(1, 2\rho) \frac{x}{2W} + W/2 + 2q.
\end{aligned}\]

This implies (5.53).

If $W > x/4q$, apply (5.44) with $\varrho = x/4Wq$, $\rho = 1$. This yields (5.55).

Assume now that $\delta \ne 0$ and $x/4W + q \le x/|\delta q|$. Let $Q' = x/|\delta q|$. For any $m_1$, $m_2$ with $x/2W < m_1, m_2 \le x/W$, we have $|m_1 - m_2| \le x/2W \le 2(Q' - q)$, and so

\[\left|\frac{m_1 - m_2}{2} \cdot \delta/x + q\delta/x\right| \le Q'|\delta|/x = \frac{1}{q}. \tag{5.58}\]

The conditions of Lemma 5.2.1 are thus fulfilled with $\upsilon = (x/4W) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U', W', W)$ is at most

\[\min\left(1, \frac{2q}{\phi(q)} \frac{1}{\log\left((q(\nu + \upsilon))^{-1}\right)}\right)\left(W - W' + \nu^{-1}\right) \sum_{W' < p \le W} (\log p)^2.\]

Here $W - W' + \nu^{-1} = W - W' + x/|q\delta| \le W/2 + x/|q\delta|$ and

\[(q(\nu + \upsilon))^{-1} = \left(q \frac{|\delta|}{x}\right)^{-1}\left(q + \frac{x}{4W}\right)^{-1}.\]

Lastly, assume $\delta \ne 0$ and $q \le \rho Q$. We let $Q' = x/|\delta q| \ge Q$ again, and we split the range $U' < m \le x/W$ into intervals of length $2(Q' - q)$, so that (5.58) still holds within each interval. We apply Lemma 5.2.1 with $\upsilon = (Q' - q) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U', W', W)$ is at most

\[\left(1 + \frac{x/W - U}{2(Q' - q)}\right)\left(W - W' + \nu^{-1}\right) \sum_{W' < p \le W} (\log p)^2.\]

Here $W - W' + \nu^{-1} \le W/2 + x/q|\delta|$ as before. Moreover,

\[\begin{aligned}
\left(\frac{W}{2} + \frac{x}{q|\delta|}\right)\left(1 + \frac{x/W - U}{2(Q' - q)}\right) &\le \left(\frac{W}{2} + Q'\right)\left(1 + \frac{x/2W}{2(1 - \rho)Q'}\right) \\
&\le \frac{W}{2} + Q' + \frac{x}{8(1 - \rho)Q'} + \frac{x}{4W(1 - \rho)} \\
&\le \frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1 - \rho)Q} + \frac{x}{4(1 - \rho)W}.
\end{aligned}\]

Hence (5.57) holds.

        Chapter 6

        Minor-arc totals

It is now time to make all of our estimates fully explicit, choose our parameters, put our type I and type II estimates together and give our final minor-arc estimates.

Let $x > 0$ be given. Starting in section 6.3.1, we will assume that $x \ge x_0 = 2.16 \cdot 10^{20}$. We will choose our main parameters $U$ and $V$ gradually, as the need arises; we assume from the start that $2 \cdot 10^6 \le V < x/4$ and $UV \le x$.

We are also given an angle $\alpha \in \mathbb{R}/\mathbb{Z}$. We choose an approximation $2\alpha = a/q + \delta/x$, $(a, q) = 1$, $q \le Q$, $|\delta/x| \le 1/qQ$. The parameter $Q$ will be chosen later; we assume from the start that $Q \ge \max(16, 2\sqrt{x})$ and $Q \ge \max(2U, x/U)$.

(Actually, $U$ and $V$ will be chosen in different ways depending on the size of $q$. Actually, even $Q$ will depend on the size of $q$; this may seem circular, but what actually happens is the following: we will first set a value for $Q$ depending only on $x$, and, if the corresponding value of $q \le Q$ is larger than a certain parameter $y$ depending on $x$, then we reset $U$, $V$ and $Q$, and obtain a new value of $q$.)

Let $S_{I,1}$, $S_{I,2}$, $S_{II}$, $S_0$ be as in (3.9), with the smoothing function $\eta = \eta_2$ as in (3.4). (We bounded the type I sums $S_{I,1}$, $S_{I,2}$ for a general smoothing function $\eta$; it is only here that we are specifying $\eta$.)

The term $S_0$ is 0 because $V < x/4$ and $\eta_2$ is supported on $[1/4, 1]$. We set $v = 2$.

6.1 The smoothing function

For the smoothing function $\eta_2$ in (3.4),

\[|\eta_2|_1 = 1, \quad |\eta_2'|_1 = 8\log 2, \quad |\eta_2''|_1 = 48, \tag{6.1}\]

as per [Tao14, (5.9)–(5.13)]. Similarly, for $\eta_{2,\rho}(t) = \log(\rho t)\eta_2(t)$, where $\rho \ge 4$,

\[\begin{aligned}
|\eta_{2,\rho}|_1 &< \log(\rho)|\eta_2|_1 = \log(\rho), \\
|\eta_{2,\rho}'|_1 &= 2\eta_{2,\rho}(1/2) = 2\log(\rho/2)\eta_2(1/2) < (8\log 2)\log\rho, \\
|\eta_{2,\rho}''|_1 &= 4\log(\rho/4) + |2\log\rho - 4\log(\rho/4)| + |4\log 2 - 4\log\rho| + |\log\rho - 4\log 2| + |\log\rho| < 48\log\rho.
\end{aligned} \tag{6.2}\]

In the first inequality, we are using the fact that $\log(\rho t)$ is always positive (and less than $\log(\rho)$) when $t$ is in the support of $\eta_2$.

Write $\log^+ x$ for $\max(\log x, 0)$.
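The norms in (6.1) can be reproduced numerically. The definition (3.4) is not restated in this section, so the sketch below assumes $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, supported on $[1/4, 1]$; this assumed form, the function names, and the step counts are not taken from the text above. The norms $|\eta_2'|_1$ and $|\eta_2''|_1$ are computed as the total variation of $\eta_2$ and of $\eta_2'$ respectively, which accounts for the jumps of $\eta_2'$.

```python
import math

def eta2(t):
    """Assumed form of eta_2 (see (3.4) in the source; not restated in this section)."""
    if t <= 0:
        return 0.0
    return 4 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

def eta2_prime(t):
    """Derivative of the assumed eta_2 away from the breakpoints 1/4, 1/2, 1."""
    if 0.25 < t < 0.5:
        return 4 / t
    if 0.5 < t < 1:
        return -4 / t
    return 0.0

def l1_norm(n=200_000):
    """Midpoint rule for the integral of eta_2 over its support [1/4, 1]."""
    h = 0.75 / n
    return h * sum(eta2(0.25 + (i + 0.5) * h) for i in range(n))

def total_variation(g, a, b, n=200_000):
    """Total variation of g on [a, b], approximated on a uniform grid."""
    h = (b - a) / n
    vals = [g(a + i * h) for i in range(n + 1)]
    return sum(abs(vals[i + 1] - vals[i]) for i in range(n))

norm0 = l1_norm()                              # expected near 1
norm1 = total_variation(eta2, 0.2, 1.1)        # expected near 8 log 2
norm2 = total_variation(eta2_prime, 0.2, 1.1)  # expected near 48
print(norm0, norm1, norm2)
```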

6.2 Contributions of different types

6.2.1 Type I terms: $S_{I,1}$

The term $S_{I,1}$ can be handled directly by Lemma 4.2.3, with $\rho_0 = 4$ and $D = U$. (Condition (4.38) is valid thanks to (6.2).) Since $U \le Q/2$, the contribution of $S_{I,1}$ gets bounded by (4.40) and (4.41): the absolute value of $S_{I,1}$ is at most

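\[\begin{aligned}
&\frac{x}{q}\min\left(1, \frac{c_0/\delta^2}{(2\pi)^2}\right)\left|\sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m}\log\frac{x}{mq}\right| + \frac{x}{q}\left|\widehat{\log\cdot\eta}(-\delta)\right|\left|\sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m}\right| \\
&+ \frac{2\sqrt{c_0 c_1}}{\pi}\left(U\log\frac{ex}{U} + \sqrt{3}\,q\log\frac{q}{c_2} + \frac{q}{2}\log\frac{q}{c_2}\log^+\frac{2U}{q}\right) + \frac{3c_1}{2}\frac{x}{q}\log\frac{q}{c_2}\log^+\frac{U}{c_2 x/q} \\
&+ \frac{3c_1}{2}\sqrt{\frac{2x}{c_2}}\log\frac{2x}{c_2} + \left(\frac{c_0}{2} - \frac{2c_0}{\pi^2}\right)\left(\frac{U^2}{4qx}\log\frac{e^{1/2}x}{U} + \frac{1}{e}\right) \\
&+ \frac{2|\eta'|_1}{\pi}\,q\max\left(1, \log\frac{c_0 e^3 q^2}{4\pi|\eta'|_1 x}\right)\log x + \frac{20 c_0 c_2^{3/2}}{3\pi^2}\sqrt{2x}\log\frac{2\sqrt{e}\,x}{c_2},
\end{aligned} \tag{6.3}\]

where $c_0 = 31.521$ (by Lemma B.2.3), $c_1 = 1.0000028 > 1 + (8\log 2)/V \ge 1 + (8\log 2)/(x/U)$, and $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\ldots$. By (2.1) (with $k = 2$), (B.17) and Lemma B.2.4,

\[\left|\widehat{\log\cdot\eta}(-\delta)\right| \le \min\left(2 - \log 4, \frac{24\log 2}{\pi^2\delta^2}\right).\]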

        By (220) (222) and (223) the first line of (63) is at most

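        \[ \begin{aligned}
        &\frac{x}{q} \min\left(1, \frac{c_0'}{\delta^2}\right) \left(\min\left(\frac{4}{5} \frac{q/\phi(q)}{\log^+ \frac{U}{q^2}}, 1\right) \log \frac{x}{U} + 1.00303\, \frac{q}{\phi(q)}\right) \\
        &+ \frac{x}{q} \min\left(2 - \log 4, \frac{c_0''}{\delta^2}\right) \min\left(\frac{4}{5} \frac{q/\phi(q)}{\log^+ \frac{U}{q^2}}, 1\right),
        \end{aligned} \]
        where $c_0' = 0.798437 > c_0/(2\pi)^2$ and $c_0'' = 1.685532$. Clearly, $c_0''/c_0' > 1 > 2 - \log 4$.

        Taking derivatives, we see that $t \mapsto (t/2) \log(t/c_2) \log^+ (2U/t)$ takes its maximum (for $t \in [1, 2U]$) when $\log(t/c_2) \log^+ (2U/t) = \log (t/c_2) - \log^+ (2U/t)$; since $t \mapsto \log (t/c_2) - \log^+ (2U/t)$ is increasing on $[1, 2U]$, we conclude that
        \[ \frac{q}{2} \log \frac{q}{c_2} \log^+ \frac{2U}{q} \leq U \log \frac{2U}{c_2}. \]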


        Similarly, $t \mapsto t \log(x/t) \log^+(U/t)$ takes its maximum at a point $t \in [0, U]$ for which $\log(x/t) \log^+(U/t) = \log(x/t) + \log^+(U/t)$, and so
        \[ \frac{x}{q} \log \frac{q}{c_2} \log^+ \frac{U}{c_2 x/q} \leq \frac{U}{c_2} (\log x + \log U). \]
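The two elementary inequalities just derived can be spot-checked on a grid of values of $q$. This is only a numerical sketch: the sample values of $x$ and $U$ in the usage note are arbitrary choices for illustration, and the function names are not from the text.

```python
import math

C2 = 0.67147  # value of c_2 quoted in the text

def log_plus(y):
    """log^+ y = max(log y, 0)."""
    return max(math.log(y), 0.0)

def lhs_first(q, U):
    """(q/2) log(q/c_2) log^+(2U/q)."""
    return 0.5 * q * math.log(q / C2) * log_plus(2.0 * U / q)

def rhs_first(U):
    """U log(2U/c_2)."""
    return U * math.log(2.0 * U / C2)

def lhs_second(q, x, U):
    """(x/q) log(q/c_2) log^+(U / (c_2 x / q))."""
    return (x / q) * math.log(q / C2) * log_plus(U / (C2 * x / q))

def rhs_second(x, U):
    """(U/c_2) (log x + log U)."""
    return (U / C2) * (math.log(x) + math.log(U))

def geometric_grid(lo, hi, steps):
    """Points spaced geometrically between lo and hi, endpoints included."""
    ratio = (hi / lo) ** (1.0 / steps)
    return [lo * ratio ** k for k in range(steps + 1)]
```

For instance, with the hypothetical values `x = 1e24` and `U = 1e10`, one can compare `lhs_first(q, U)` with `rhs_first(U)` for `q` in `geometric_grid(1.0, 2 * U, 400)`, and `lhs_second(q, x, U)` with `rhs_second(x, U)` over a wider range of `q`.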

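        We conclude that
        \[ \begin{aligned}
        |S_{I,1}| \leq\; &\frac{x}{q} \min\left(1, \frac{c_0'}{\delta^2}\right) \left(\min\left(\frac{4 q/\phi(q)}{5 \log^+ \frac{U}{q^2}}, 1\right) \left(\log \frac{x}{U} + c_{3,I}\right) + c_{4,I} \frac{q}{\phi(q)}\right) \\
        &+ \left(c_{7,I} \log \frac{q}{c_2} + c_{8,I} \log x \max\left(1, \log \frac{c_{11,I} q^2}{x}\right)\right) q + c_{10,I} \frac{U^2}{4qx} \log \frac{e^{1/2} x}{U} \\
        &+ \left(c_{5,I} \log \frac{2U}{c_2} + c_{6,I} \log xU\right) U + c_{9,I} \sqrt{x} \log \frac{2\sqrt{e}\, x}{c_2} + \frac{c_{10,I}}{e},
        \end{aligned} \tag{6.4} \]
        where $c_2$ and $c_0'$ are as above, $c_{3,I} = 2.11104 > c_0''/c_0'$, $c_{4,I} = 1.00303$, $c_{5,I} = 3.57422 > 2\sqrt{c_0 c_1}/\pi$, $c_{6,I} = 2.23389 > 3c_1/2c_2$, $c_{7,I} = 6.19072 > 2\sqrt{3 c_0 c_1}/\pi$, $c_{8,I} = 3.53017 > 2(8 \log 2)/\pi$,
        \[ c_{9,I} = 19.1568 > \frac{3\sqrt{2}\, c_1}{2\sqrt{c_2}} + \frac{20\sqrt{2}\, c_0 c_2^{3/2}}{3\pi^2}, \]
        $c_{10,I} = 9.37301 > c_0 (1/2 - 2/\pi^2)$ and $c_{11,I} = 9.0857 > c_0 e^3/(4\pi \cdot 8 \log 2)$.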
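Since the constants in (6.4) are all explicit expressions in $c_0$ and $c_1$, they are easy to recompute. The sketch below evaluates each right-hand side in floating point and pairs it with the decimal value stated in the text; the dictionary layout and names are illustrative. Plain floating point only shows agreement to about five significant digits and does not replace the rigorous interval-arithmetic computations the text relies on.

```python
import math

c0 = 31.521       # stated in the text (Lemma B.2.3)
c1 = 1.0000028    # stated in the text
pi, log2, sqrt2 = math.pi, math.log(2.0), math.sqrt(2.0)

c2 = 6 * pi / (5 * math.sqrt(c0))

# name -> (value stated in the text, expression it is compared with)
constants = {
    "c2":    (0.67147,  c2),
    "c0'":   (0.798437, c0 / (2 * pi) ** 2),
    "c3I":   (2.11104,  1.685532 / 0.798437),
    "c5I":   (3.57422,  2 * math.sqrt(c0 * c1) / pi),
    "c6I":   (2.23389,  3 * c1 / (2 * c2)),
    "c7I":   (6.19072,  2 * math.sqrt(3 * c0 * c1) / pi),
    "c8I":   (3.53017,  2 * (8 * log2) / pi),
    "c9I":   (19.1568,  3 * sqrt2 * c1 / (2 * math.sqrt(c2))
                        + 20 * sqrt2 * c0 * c2 ** 1.5 / (3 * pi ** 2)),
    "c10I":  (9.37301,  c0 * (0.5 - 2 / pi ** 2)),
    "c11I":  (9.0857,   c0 * math.e ** 3 / (4 * pi * 8 * log2)),
}

for name, (stated, computed) in constants.items():
    print(f"{name:5s} stated={stated:<10} computed={computed:.7f}")

# c1 must dominate 1 + (8 log 2)/V for every V >= 2 * 10^6.
c1_lower = 1 + 8 * log2 / 2e6
```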

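        6.2.2 Type I terms: $S_{I,2}$

        The case $q \leq Q/V$. If $q \leq Q/V$, then, for $v \leq V$,
        \[ 2v\alpha = \frac{va}{q} + O^*\left(\frac{v}{Qq}\right) = \frac{va}{q} + O^*\left(\frac{1}{q^2}\right), \]
        and so $va/q$ is a valid approximation to $2v\alpha$. (Here we are using $v$ to label an integer variable bounded above by $v \leq V$; we no longer need $v$ to label the quantity in (3.10), since that has been set equal to the constant $2$.) Moreover, for $Q_v = Q/v$, we see that $2v\alpha = (va/q) + O^*(1/qQ_v)$. If $\alpha = a/q + \delta/x$, then $v\alpha = va/q + \delta/(x/v)$. Now
        \[ S_{I,2} = \sum_{\substack{v \leq V \\ v \text{ odd}}} \Lambda(v) \sum_{\substack{m \leq U \\ m \text{ odd}}} \mu(m) \sum_{\substack{n \\ n \text{ odd}}} e((v\alpha) \cdot mn)\, \eta(mn/(x/v)). \tag{6.5} \]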

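        We can thus estimate $S_{I,2}$ by applying Lemma 4.2.2 to each inner double sum in (6.5). We obtain that, if $|\delta| \leq 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$ and $c_0 = 31.521$, then $|S_{I,2}|$ is at most
        \[ \sum_{v \leq V} \Lambda(v) \left[\frac{x/v}{2 q_v} \min\left(1, \frac{c_0}{(\pi\delta)^2}\right) \left|\sum_{\substack{m \leq M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m}\right| + \frac{c_{10,I}\, q}{4 x/v} \left(\frac{U}{q_v} + 1\right)^2\right] \tag{6.6} \]
        plus
        \[ \begin{aligned}
        &\sum_{v \leq V} \Lambda(v) \left(\frac{2\sqrt{c_0 c_+}}{\pi} U + \frac{3 c_+}{2} \frac{x}{v q_v} \log^+ \frac{U}{c_2 x/(v q_v)} + \frac{\sqrt{c_0 c_+}}{\pi} q_v \log^+ \frac{U}{q_v/2}\right) \\
        &+ \sum_{v \leq V} \Lambda(v) \left(c_{8,I} \max\left(\log \frac{c_{11,I}\, q_v^2}{x/v}, 1\right) q_v + \left(\frac{2\sqrt{3 c_0 c_+}}{\pi} + \frac{3 c_+}{2 c_2} + \frac{55 c_0 c_2}{6\pi^2}\right) q_v\right),
        \end{aligned} \tag{6.7} \]
        where $q_v = q/(q,v)$, $M_v \in [\min(Q/2v, U), U]$ and $c_+ = 1 + (8 \log 2)/(x/UV)$; if $|\delta| \geq 1/2c_2$, then $|S_{I,2}|$ is at most (6.6) plus
        \[ \begin{aligned}
        &\sum_{v \leq V} \Lambda(v) \left[\frac{\sqrt{c_0 c_1}}{\pi/2} U + \frac{3 c_1}{2} \left(2 + \frac{1+\epsilon}{\epsilon} \log^+ \frac{2U}{\frac{x/v}{|\delta| q_v}}\right) \frac{x}{Q} + \frac{35 c_0 c_2}{3\pi^2} q_v\right] \\
        &+ \sum_{v \leq V} \Lambda(v) \frac{\sqrt{c_0 c_1}}{\pi/2} (1+\epsilon) \min\left(\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U\right) \left(\sqrt{3 + 2\epsilon} + \frac{1}{2} \log^+ \frac{2U}{\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1}\right).
        \end{aligned} \tag{6.8} \]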

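        Write $S_V = \sum_{v \leq V} \Lambda(v)/(v q_v)$. By (2.12),
        \[ \begin{aligned}
        S_V &\leq \sum_{v \leq V} \frac{\Lambda(v)}{vq} + \sum_{\substack{v \leq V \\ (v,q) > 1}} \frac{\Lambda(v)}{v} \left(\frac{(q,v)}{q} - \frac{1}{q}\right) \\
        &\leq \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p) \left(v_p(q) + \sum_{\substack{\alpha \geq 1 \\ p^{\alpha + v_p(q)} \leq V}} \frac{1}{p^\alpha} - \sum_{\substack{\alpha \geq 1 \\ p^\alpha \leq V}} \frac{1}{p^\alpha}\right) \\
        &\leq \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p)\, v_p(q) = \frac{\log Vq}{q}.
        \end{aligned} \tag{6.9} \]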
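The bound (6.9), $S_V \leq (\log Vq)/q$, can be tested by brute force for small $V$ and $q$. The following is a minimal sketch with illustrative function names; it uses trial division, so it is only meant for small inputs.

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no divisor up to sqrt(n): n is prime

def S_V(V, q):
    """Sum of Lambda(v) / (v * q_v) over v <= V, where q_v = q / gcd(q, v)."""
    total = 0.0
    for v in range(1, V + 1):
        lam = von_mangoldt(v)
        if lam:
            q_v = q // math.gcd(q, v)
            total += lam / (v * q_v)
    return total

def bound_69(V, q):
    """Right-hand side of (6.9)."""
    return math.log(V * q) / q

if __name__ == "__main__":
    for V, q in [(30, 30), (100, 12), (300, 7)]:
        print(V, q, S_V(V, q), bound_69(V, q))
```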

        This helps us to estimate (6.6). We could also use this to estimate the second term in the first line of (6.7), but, for that purpose, it will actually be wiser to use the simpler bound
        \[ \sum_{v \leq V} \Lambda(v) \frac{x}{v q_v} \log^+ \frac{U}{c_2 x/(v q_v)} \leq \sum_{v \leq V} \Lambda(v) \frac{U}{c_2 e} \leq \frac{1.0004}{e c_2} UV \tag{6.10} \]
        (by (2.14) and the fact that $t \log^+ A/t$ takes its maximum at $t = A/e$).

        We bound the sum over $m$ in (6.6) by (2.20) and (2.22):
        \[ \left|\sum_{\substack{m \leq M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m}\right| \leq \min\left(\frac{4}{5} \frac{q/\phi(q)}{\log^+ \frac{M_v}{2q^2}}, 1\right). \]

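        To bound the terms involving $(U/q_v + 1)^2$, we use
        \[ \begin{aligned}
        \sum_{v \leq V} \Lambda(v)\, v &\leq 0.5004 V^2 \quad \text{(by (2.17))}, \\
        \sum_{v \leq V} \Lambda(v)\, v\, (v,q)^j &\leq \sum_{v \leq V} \Lambda(v)\, v + V \sum_{\substack{v \leq V \\ (v,q) \neq 1}} \Lambda(v) (v,q)^j,
        \end{aligned} \]
        \[ \sum_{\substack{v \leq V \\ (v,q) \neq 1}} \Lambda(v) (v,q) \leq \sum_{p | q} (\log p) \sum_{1 \leq \alpha \leq \log_p V} p^{v_p(q)} \leq \sum_{p | q} (\log p) \frac{\log V}{\log p} p^{v_p(q)} \leq (\log V) \sum_{p | q} p^{v_p(q)} \leq q \log V \]
        and
        \[ \sum_{\substack{v \leq V \\ (v,q) \neq 1}} \Lambda(v) (v,q)^2 \leq \sum_{p | q} (\log p) \sum_{1 \leq \alpha \leq \log_p V} p^{v_p(q) + \alpha} \leq \sum_{p | q} (\log p) \cdot 2 p^{v_p(q)} \cdot p^{\log_p V} \leq 2 q V \log q. \]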

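        Using (2.14) and (6.9) as well, we conclude that (6.6) is at most
        \[ \begin{aligned}
        &\frac{x}{2q} \min\left(1, \frac{c_0}{(\pi\delta)^2}\right) \min\left(\frac{4}{5} \frac{q/\phi(q)}{\log^+ \frac{\min(Q/2V,\, U)}{2q^2}}, 1\right) \log Vq \\
        &+ \frac{c_{10,I}}{4x} \left(0.5004 V^2 q \left(\frac{U}{q} + 1\right)^2 + 2UVq \log V + 2U^2 V \log V\right).
        \end{aligned} \]
        Assume $Q \leq 2UV/e$. Using (2.14), (6.10), (2.18) and the inequality $vq \leq Vq \leq Q$ (which implies $q/2 \leq U/e$), we see that (6.7) is at most
        \[ 1.0004 \left(\left(\frac{2\sqrt{c_0 c_+}}{\pi} + \frac{3 c_+}{2 e c_2}\right) UV + \frac{\sqrt{c_0 c_+}}{\pi} Q \log \frac{U}{q/2}\right) + \left(c_{5,I_2} \max\left(\log \frac{c_{11,I}\, q^2 V}{x}, 2\right) + c_{6,I_2}\right) Q, \]
        where $c_{5,I_2} = 3.53312 > 1.0004 \cdot c_{8,I}$ and
        \[ c_{6,I_2} = 1.0004 \left(\frac{2\sqrt{3 c_0 c_+}}{\pi} + \frac{3 c_+}{2 c_2} + \frac{55 c_0 c_2}{6\pi^2}\right). \tag{6.11} \]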

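        The expressions in (6.8) get estimated similarly. The first line of (6.8) is at most
        \[ 1.0004 \left(\frac{2\sqrt{c_0 c_+}}{\pi} UV + \frac{3 c_+}{2} \left(2 + \frac{1+\epsilon}{\epsilon} \log^+ \frac{2UV|\delta| q}{x}\right) \frac{xV}{Q} + \frac{35 c_0 c_2}{3\pi^2} qV\right) \]
        by (2.14). Since $q \leq Q/V$, we can obviously bound $qV$ by $Q$. As for the second line of (6.8),
        \[ \begin{aligned}
        \sum_{v \leq V} \Lambda(v) \min\left(\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U\right) \cdot \frac{1}{2} \log^+ \frac{2U}{\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1}
        &\leq \sum_{v \leq V} \Lambda(v) \max_{t > 0}\, t \log^+ \frac{U}{t} \\
        &\leq \sum_{v \leq V} \Lambda(v) \frac{U}{e} \leq \frac{1.0004}{e} UV,
        \end{aligned} \]
        but
        \[ \begin{aligned}
        \sum_{v \leq V} \Lambda(v) \min\left(\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U\right)
        &\leq \sum_{v \leq \frac{x}{2U|\delta| q}} \Lambda(v) \cdot 2U + \sum_{\substack{\frac{x}{2U|\delta| q} < v \leq V \\ (v,q) = 1}} \Lambda(v) \frac{x/|\delta|}{vq} \\
        &\quad + \sum_{v \leq V} \Lambda(v) + \sum_{\substack{v \leq V \\ (v,q) \neq 1}} \Lambda(v) \frac{x/|\delta|}{v} \left(\frac{1}{q_v} - \frac{1}{q}\right) \\
        &\leq 1.03883 \frac{x}{|\delta| q} + \frac{x}{|\delta| q} \max\left(\log V - \log \frac{x}{2U|\delta| q} + \log \frac{3}{\sqrt{2}}, 0\right) \\
        &\quad + 1.0004 V + \frac{x}{|\delta|} \frac{1}{q} \sum_{p | q} (\log p)\, v_p(q) \\
        &\leq \frac{x}{|\delta| q} \left(1.03883 + \log q + \log^+ \frac{6UV|\delta| q}{\sqrt{2}\, x}\right) + 1.0004 V
        \end{aligned} \]
        by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (6.9).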

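        Let us collect our bounds. If $|\delta| \leq 1/2c_2$, then, assuming $Q \leq 2UV/e$, we conclude that $|S_{I,2}|$ is at most
        \[ \begin{aligned}
        &\frac{x}{2\phi(q)} \min\left(1, \frac{c_0}{(\pi\delta)^2}\right) \min\left(\frac{4/5}{\log^+ \frac{Q}{4Vq^2}}, 1\right) \log Vq \\
        &+ c_{8,I_2} \frac{x}{q} \left(\frac{UV}{x}\right)^2 \left(1 + \frac{q}{U}\right)^2 + \frac{c_{10,I}}{2} \left(\frac{UV}{x} q \log V + \frac{U^2 V}{x} \log V\right)
        \end{aligned} \tag{6.12} \]
        plus
        \[ (c_{4,I_2} + c_{9,I_2}) UV + \left(c_{10,I_2} \log \frac{U}{q} + c_{5,I_2} \max\left(\log \frac{c_{11,I}\, q^2 V}{x}, 2\right) + c_{12,I_2}\right) \cdot Q, \tag{6.13} \]
        where
        \[ \begin{aligned}
        c_{4,I_2} &= 3.57565 (1 + \epsilon_0) > 1.0004 \cdot 2\sqrt{c_0 c_+}/\pi, \\
        c_{5,I_2} &= 3.53312 > 1.0004 \cdot c_{8,I}, \\
        c_{8,I_2} &= 1.17257 > \frac{c_{10,I}}{4} \cdot 0.5004, \\
        c_{9,I_2} &= 0.82214 (1 + 2\epsilon_0) > 3 c_+ \cdot 1.0004/2ec_2, \\
        c_{10,I_2} &= 1.78783 \sqrt{1 + 2\epsilon_0} > 1.0004 \sqrt{c_0 c_+}/\pi, \\
        c_{12,I_2} &= 29.3333 + 11.902 \epsilon_0 \\
        &> 1.0004 \left(\frac{3}{2c_2} c_+ + \frac{2\sqrt{3 c_0}}{\pi} \sqrt{c_+} + \frac{55 c_0 c_2}{6\pi^2}\right) + 1.78783 (1 + \epsilon_0) \log 2 \\
        &= c_{6,I_2} + c_{10,I_2} \log 2,
        \end{aligned} \]
        and $c_{10,I} = 9.37301$ as before. Here $\epsilon_0 = (4 \log 2)/(x/UV)$, and $c_{6,I_2}$ is as in (6.11). If $|\delta| \geq 1/2c_2$, then $|S_{I,2}|$ is at most (6.12) plus
        \[ \begin{aligned}
        &(c_{4,I_2} + (1+\epsilon) c_{13,I_2}) UV + c_\epsilon \left(c_{14,I_2} \left(\log q + \log^+ \frac{6UV|\delta| q}{\sqrt{2}\, x}\right) + c_{15,I_2}\right) \frac{x}{|\delta| q} \\
        &+ c_{16,I_2} \left(2 + \frac{1+\epsilon}{\epsilon} \log^+ \frac{2UV|\delta| q}{x}\right) \frac{x}{Q/V} + c_{17,I_2} Q + c_\epsilon \cdot c_{4,I_2} V,
        \end{aligned} \tag{6.14} \]
        where
        \[ \begin{aligned}
        c_{13,I_2} &= 1.31541 (1 + \epsilon_0) > \frac{2\sqrt{c_0 c_+}}{\pi} \cdot \frac{1.0004}{e}, \\
        c_{14,I_2} &= 3.57422 \sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0 c_+}}{\pi}, \\
        c_{15,I_2} &= 3.71301 \sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0 c_+}}{\pi} \cdot 1.03883, \\
        c_{16,I_2} &= 1.5006 (1 + 2\epsilon_0) > 1.0004 \cdot 3c_+/2, \\
        c_{17,I_2} &= 25.0295 > 1.0004 \cdot 35 c_0 c_2/3\pi^2,
        \end{aligned} \]
        and $c_\epsilon = (1+\epsilon) \sqrt{3 + 2\epsilon}$. We recall that $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\ldots$. We will choose $\epsilon \in (0,1)$ later; we also leave the task of bounding $\epsilon_0$ for later.

        The case $q > Q/V$. We use Lemma 4.2.4 in this case.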

        6.2.3 Type II terms

        As we showed in (5.1)–(5.5), $S_{II}$ (given in (5.1)) is at most
        \[ 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} + 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W}, \tag{6.15} \]
        where $S_1$, $S_2$ and $S_3$ are as in (5.4) and (5.5). We bounded $S_1$ in (5.33) and (5.34), $S_2$ in Prop. 5.2.4 and $S_3$ in (5.5).

        Let us try to give some structure to the bookkeeping we must now inevitably do. The second integral in (6.15) will be negligible (because $S_3$ is); let us focus on the first integral.

        Thanks to our work in §5.1, the term $S_1(U,W)$ is bounded by a (small) constant times $x/W$. (This represents a gain of several factors of $\log$ with respect to the trivial bound.) We bounded $S_2(U,V,W)$ using the large sieve; we expected, and got, a bound that is better than trivial by a factor of size roughly $\sqrt{q \log x}$; the exact factor in the bound depends on the value of $W$. In particular, it is only in the central part of the range for $W$ that we will really be able to save a factor of $\sqrt{q \log x}$, as opposed to just $\sqrt{q}$. We will have to be slightly clever in order to get a good total bound in the end.

        We first recall our estimate for $S_1$. In the whole range $[V, x/U]$ for $W$, we know from (5.33), (5.34) and (5.37) that $S_1(U,W)$ is at most
        \[ \frac{2}{\pi^2} \frac{x}{W} + \kappa_0\, \zeta(3/2)^3\, \frac{x}{W} \sqrt{\frac{x/WU}{U}}, \tag{6.16} \]
        where $\kappa_0 = 1.27$. (We recall we are working with $v = 2$.)

        We have better estimates for the constant in front in some parts of the range; in what is usually the main part, (5.34) and (5.36) give us a constant of $0.15107$ instead of $2/\pi^2$. Note that $1.27\, \zeta(3/2)^3 = 22.6417\ldots$. We should choose $U$, $V$ so that the first term in (6.16) dominates. For the while being, assume only
        \[ U \geq 5 \cdot 10^5 \cdot \frac{x}{VU}; \tag{6.17} \]
        then (6.16) gives
        \[ S_1(U,W) \leq \kappa_1 \frac{x}{W}, \tag{6.18} \]
        where
        \[ \kappa_1 = \frac{2}{\pi^2} + \frac{22.6418}{\sqrt{10^6/2}} \leq 0.2347. \]
        This will suffice for our cruder estimates.

        The second integral in (6.15) is now easy to bound. By (5.5),
        \[ S_3(W) \leq 1.0171 x + 2.0341 W \leq 1.0172 x, \]
        since $W \leq x/U \leq x/(5 \cdot 10^5)$. Hence
        \[ 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W} \leq 4 \int_V^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot 1.0172 x}\, \frac{dW}{W} \leq \kappa_9 \frac{x}{\sqrt{V}}, \]
        where
        \[ \kappa_9 = 8 \cdot \sqrt{1.0172 \cdot \kappa_1} \leq 3.9086. \]

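        Let us now examine $S_2$, which was bounded in Prop. 5.2.4. We set the parameters $W'$, $U'$ as follows, in accordance with (5.4):
        \[ W' = \max(V, W/2), \quad U' = \max(U, x/2W). \]
        Since $W' \geq W/2$ and $W \geq V > 117$, we can always bound
        \[ \sum_{W' < p \leq W} (\log p)^2 \leq \frac{1}{2} W (\log W) \tag{6.19} \]
        by (2.19).

        Bounding $S_2$ for $\delta$ arbitrary. We set
        \[ W_0 = \min(\max(2\theta q, V), x/U), \]
        where $\theta \geq e$ is a parameter that will be set later.

        For $V \leq W < W_0$, we use the bound (5.53):
        \[ \begin{aligned}
        S_2(U',W',W) &\leq \left(\max(1, 2\rho) \left(\frac{x}{8q} + \frac{x}{2W}\right) + \frac{W}{2} + 2q\right) \cdot \frac{1}{2} W (\log W) \\
        &\leq \max\left(\frac{1}{2}, \rho\right) \left(\frac{W}{8q} + \frac{1}{2}\right) x \log W + \frac{W^2 \log W}{4} + qW \log W,
        \end{aligned} \]
        where $\rho = q/Q$.

        If $W_0 > V$, the contribution of the terms with $V \leq W < W_0$ to (6.15) is (by (6.18)) bounded by
        \[ \begin{aligned}
        &4 \int_V^{W_0} \sqrt{\kappa_1 \frac{x}{W} \left(\frac{\rho_0}{4} \left(\frac{W}{4q} + 1\right) x \log W + \frac{W^2 \log W}{4} + qW \log W\right)}\, \frac{dW}{W} \\
        &\leq \frac{\kappa_2}{2} \sqrt{\rho_0}\, x \int_V^{W_0} \frac{\sqrt{\log W}}{W^{3/2}}\, dW + \frac{\kappa_2}{2} \sqrt{x} \int_V^{W_0} \frac{\sqrt{\log W}}{W^{1/2}}\, dW + \kappa_2 \sqrt{\frac{\rho_0 x^2}{16 q} + qx} \int_V^{W_0} \frac{\sqrt{\log W}}{W}\, dW \\
        &\leq \left(\kappa_2 \sqrt{\rho_0} \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{x W_0}\right) \sqrt{\log W_0} + \frac{2\kappa_2}{3} \sqrt{\frac{\rho_0 x^2}{16 q} + qx} \left((\log W_0)^{3/2} - (\log V)^{3/2}\right),
        \end{aligned} \tag{6.20} \]
        where $\rho_0 = \max(1, 2\rho)$ and
        \[ \kappa_2 = 4\sqrt{\kappa_1} \leq 1.93768. \]
        (We are using the easy bound $\sqrt{a + b + c} \leq \sqrt{a} + \sqrt{b} + \sqrt{c}$.)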

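        We now examine the terms with $W \geq W_0$. If $2\theta q > x/U$, then $W_0 = x/U$, the contribution of the case is nil, and the computations below can be ignored. Thus, we can assume that $2\theta q \leq x/U$.

        We use (5.54):
        \[ S_2(U',W',W) \leq \left(\frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)}\right) \cdot \frac{1}{2} W \log W. \]
        By $\sqrt{a+b} \leq \sqrt{a} + \sqrt{b}$, we can take out the $q/\phi(q) \cdot W/\log(W/2q)$ term and estimate its contribution on its own; it is at most
        \[ \begin{aligned}
        &4 \int_{W_0}^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot \frac{q}{\phi(q)} \cdot \frac{1}{2} W^2 \frac{\log W}{\log W/2q}}\, \frac{dW}{W} \\
        &= \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{x \log W}{W \log W/2q}}\, dW \\
        &\leq \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{qx}{\phi(q)}} \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left(1 + \sqrt{\frac{\log 2q}{\log W/2q}}\right) dW.
        \end{aligned} \tag{6.21} \]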

        Now
        \[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \sqrt{\frac{\log 2q}{\log W/2q}}\, dW = \sqrt{2q \log 2q} \int_{\max(\theta, V/2q)}^{x/2Uq} \frac{1}{\sqrt{t \log t}}\, dt. \]
        We bound this last integral somewhat crudely: for $T \geq e$,
        \[ \int_e^T \frac{1}{\sqrt{t \log t}}\, dt \leq 2.3 \sqrt{\frac{T}{\log T}}. \tag{6.22} \]
        (This is shown as follows: since
        \[ \frac{1}{\sqrt{T \log T}} < \left(2.3 \sqrt{\frac{T}{\log T}}\right)' \]
        if and only if $T > T_0$, where $T_0 = e^{(1 - 2/2.3)^{-1}} = 2135.94\ldots$, it is enough to check (numerically) that (6.22) holds for $T = T_0$.) Since $\theta \geq e$, this gives us that
        \[ \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left(1 + \sqrt{\frac{\log 2q}{\log W/2q}}\right) dW \leq 2 \sqrt{\frac{x}{U}} + 2.3 \sqrt{2q \log 2q} \cdot \sqrt{\frac{x/2Uq}{\log x/2Uq}}, \]
        and so (6.21) is at most
        \[ \sqrt{2}\, \kappa_2 \sqrt{\frac{q}{\phi(q)}} \left(1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}}\right) \frac{x}{\sqrt{U}}. \]
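The numerical check behind (6.22) is simple to reproduce. The sketch below is one way to do it, not necessarily how it was carried out for the text: it substitutes $t = e^s$, so that the integral becomes $\int_1^{\log T} e^{s/2} s^{-1/2}\, ds$, and applies the midpoint rule in plain floating point rather than interval arithmetic. Function names are illustrative.

```python
import math

def integral_622(T, n=20000):
    """Midpoint rule for the integral of dt / sqrt(t log t) over [e, T].

    After t = exp(s) the integrand is exp(s/2) / sqrt(s) on [1, log T].
    """
    a, b = 1.0, math.log(T)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * h
        total += math.exp(0.5 * s) / math.sqrt(s)
    return h * total

def bound_622(T):
    """Right-hand side of (6.22): 2.3 * sqrt(T / log T)."""
    return 2.3 * math.sqrt(T / math.log(T))

# The critical point where the two sides have equal derivatives.
T0 = math.exp(1.0 / (1.0 - 2.0 / 2.3))

if __name__ == "__main__":
    print("T0 =", T0)
    for T in (10.0, 100.0, T0, 1e4, 1e6):
        print(T, integral_622(T), bound_622(T))
```

The gap between the two sides shrinks on $(e, T_0)$ and grows afterwards, so the comparison at $T = T_0$ is the tightest one.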

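        We are left with what will usually be the main term, viz.,
        \[ 4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \left(\frac{x}{8\phi(q)} \frac{\log W}{\log W/2q}\right) W}\, \frac{dW}{W}, \tag{6.23} \]
        which, by (5.34), is at most $x/\sqrt{\phi(q)}$ times the integral of
        \[ \frac{1}{W} \sqrt{\left(2 H_2\left(\frac{x}{WU}\right) + \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}}\right) \frac{\log W}{\log W/2q}} \]
        for $W$ going from $W_0$ to $x/U$, where $H_2$ is as in (5.35) and
        \[ \kappa_4 = 4 \kappa_0\, \zeta(3/2)^3 \leq 90.5671. \]
        By the arithmetic/geometric mean inequality, the integrand is at most $1/W$ times
        \[ \frac{\beta + \beta^{-1} \cdot 2 H_2(x/WU)}{2} + \frac{\beta^{-1}}{2} \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} + \frac{\beta}{2} \frac{\log 2q}{\log W/2q} \tag{6.24} \]
        for any $\beta > 0$. We will choose $\beta$ later.

        The first summand in (6.24) gives what we can think of as the main or worst term in the whole paper; let us compute it first. The integral is
        \[ \int_{W_0}^{x/U} \frac{\beta + \beta^{-1} \cdot 2 H_2(x/WU)}{2}\, \frac{dW}{W} = \int_1^{x/UW_0} \frac{\beta + \beta^{-1} \cdot 2 H_2(s)}{2}\, \frac{ds}{s} \leq \left(\frac{\beta}{2} + \frac{\kappa_6}{4\beta}\right) \log \frac{x}{UW_0} \tag{6.25} \]
        by (5.36), where
        \[ \kappa_6 = 0.60428. \]
        Thus the main term is simply
        \[ \left(\frac{\beta}{2} + \frac{\kappa_6}{4\beta}\right) \frac{x}{\sqrt{\phi(q)}} \log \frac{x}{UW_0}. \tag{6.26} \]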

        The integral of the second summand is at most
        \[ \beta^{-1} \cdot \frac{\kappa_4}{4} \frac{\sqrt{x}}{U} \int_V^{x/U} \frac{dW}{W^{3/2}} \leq \beta^{-1} \cdot \frac{\kappa_4}{2} \sqrt{\frac{x/UV}{U}}. \]
        By (6.17), this is at most
        \[ \frac{\beta^{-1}}{\sqrt{2}} \cdot 10^{-3} \cdot \kappa_4 \leq \beta^{-1} \kappa_7/2, \]
        where
        \[ \kappa_7 = \frac{\sqrt{2}\, \kappa_4}{1000} \leq 0.1281. \]
        Thus the contribution of the second summand is at most
        \[ \frac{\beta^{-1} \kappa_7}{2} \cdot \frac{x}{\sqrt{\phi(q)}}. \]
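The constants $\kappa_1$, $\kappa_2$, $\kappa_4$, $\kappa_7$ and $\kappa_9$ introduced so far follow from $\kappa_0 = 1.27$ and $\zeta(3/2)$ by direct arithmetic. The sketch below recomputes them in floating point; the Euler–Maclaurin evaluation of $\zeta(3/2)$ and all names are choices made for this illustration, and the output is only a consistency check against the stated decimal values.

```python
import math

def zeta_three_halves(N=10_000):
    """zeta(3/2): partial sum up to N - 1 plus Euler-Maclaurin tail terms."""
    partial = sum(n ** -1.5 for n in range(1, N))
    # integral of t^(-3/2) from N, then f(N)/2, then -f'(N)/12
    return partial + 2.0 * N ** -0.5 + 0.5 * N ** -1.5 + 0.125 * N ** -2.5

kappa0 = 1.27
kappa0_zeta3 = kappa0 * zeta_three_halves() ** 3            # text: 22.6417...
kappa1 = 2 / math.pi ** 2 + 22.6418 / math.sqrt(1e6 / 2)    # text: <= 0.2347
kappa2 = 4 * math.sqrt(kappa1)                              # text: <= 1.93768
kappa4 = 4 * kappa0_zeta3                                   # text: <= 90.5671
kappa7 = math.sqrt(2) * kappa4 / 1000                       # text: <= 0.1281
kappa9 = 8 * math.sqrt(1.0172 * kappa1)                     # text: <= 3.9086

print(kappa0_zeta3, kappa1, kappa2, kappa4, kappa7, kappa9)
```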

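        The integral of the third summand in (6.24) is
        \[ \frac{\beta}{2} \int_{W_0}^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W}. \tag{6.27} \]
        If $V < 2\theta q \leq x/U$, this is
        \[ \begin{aligned}
        \frac{\beta}{2} \int_{2\theta q}^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W} &= \frac{\beta}{2} \log 2q \cdot \int_\theta^{x/2Uq} \frac{1}{\log t}\, \frac{dt}{t} \\
        &= \frac{\beta}{2} \log 2q \cdot \left(\log \log \frac{x}{2Uq} - \log \log \theta\right).
        \end{aligned} \]
        If $2\theta q > x/U$, the integral is over an empty range and its contribution is hence $0$. If $2\theta q \leq V$, (6.27) is
        \[ \begin{aligned}
        \frac{\beta}{2} \int_V^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W} &= \frac{\beta \log 2q}{2} \int_{V/2q}^{x/2Uq} \frac{1}{\log t}\, \frac{dt}{t} \\
        &= \frac{\beta \log 2q}{2} \cdot \left(\log \log \frac{x}{2Uq} - \log \log V/2q\right) \\
        &= \frac{\beta \log 2q}{2} \cdot \log \left(1 + \frac{\log x/UV}{\log V/2q}\right).
        \end{aligned} \tag{6.28} \]
        (Let us stop for a moment and ask ourselves when this will be smaller than what we can see as the main term, namely, the term $(\beta/2) \log x/UW_0$ in (6.25). Clearly, $\log(1 + (\log x/UV)/(\log V/2q)) \leq (\log x/UV)/(\log V/2q)$, and that is smaller than $(\log x/UV)/\log 2q$ when $V/2q > 2q$. Of course, it does not actually matter if (6.28) is smaller than the term from (6.25) or not, since we are looking for upper bounds here, not for asymptotics.)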

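        The total bound for (6.23) is thus
        \[ \frac{x}{\sqrt{\phi(q)}} \cdot \left(\beta \cdot \left(\frac{1}{2} \log \frac{x}{UW_0} + \frac{\Phi}{2}\right) + \beta^{-1} \left(\frac{1}{4} \kappa_6 \log \frac{x}{UW_0} + \frac{\kappa_7}{2}\right)\right), \tag{6.29} \]
        where
        \[ \Phi = \begin{cases}
        \log 2q \left(\log \log \frac{x}{2Uq} - \log \log \theta\right) & \text{if } V/2\theta < q < x/(2\theta U), \\
        \log 2q\, \log \left(1 + \frac{\log x/UV}{\log V/2q}\right) & \text{if } q \leq V/2\theta.
        \end{cases} \tag{6.30} \]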
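The bracket in (6.29) has the shape $\beta A + \beta^{-1} B$ with $A, B > 0$, which is minimized at $\beta = \sqrt{B/A}$ with minimum $2\sqrt{AB}$. The sketch below illustrates this optimization numerically; the sample values of $L = \log(x/UW_0)$ and $\Phi$ in the usage note are hypothetical, $\kappa_7$ is replaced by its stated upper bound $0.1281$, and the function names are not from the text.

```python
import math

KAPPA6 = 0.60428
KAPPA7 = 0.1281  # upper bound for kappa_7 stated in the text

def bracket_629(beta, L, Phi):
    """Bracket in (6.29), with L standing for log(x / (U W_0))."""
    A = 0.5 * L + 0.5 * Phi
    B = 0.25 * KAPPA6 * L + 0.5 * KAPPA7
    return beta * A + B / beta

def optimal_beta(L, Phi):
    """Minimizer sqrt(B / A) of beta * A + B / beta over beta > 0."""
    A = 0.5 * L + 0.5 * Phi
    B = 0.25 * KAPPA6 * L + 0.5 * KAPPA7
    return math.sqrt(B / A)

def optimized_bracket(L, Phi):
    """Closed form of the minimum: sqrt((L + Phi)(kappa6 L + 2 kappa7) / 2)."""
    return math.sqrt((L + Phi) * (KAPPA6 * L + 2.0 * KAPPA7) / 2.0)
```

For example, with `L = 10.0` and `Phi = 3.0`, `bracket_629(optimal_beta(L, Phi), L, Phi)` agrees with `optimized_bracket(L, Phi)`, and any other `beta` gives a larger value; multiplying by $x/\sqrt{\phi(q)}$ turns the closed form into a bound with prefactor $x/\sqrt{2\phi(q)}$.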

        Choosing $\beta$ optimally, we obtain that (6.23) is at most
        \[ \frac{x}{\sqrt{2\phi(q)}} \sqrt{\left(\log \frac{x}{UW_0} + \Phi\right) \left(\kappa_6 \log \frac{x}{UW_0} + 2\kappa_7\right)}, \tag{6.31} \]
        where $\Phi$ is as in (6.30).

        Bounding $S_2$ for $|\delta| \geq 8$. Let us see how much a non-zero $\delta$ can help us. It makes sense to apply (5.56) only when $|\delta| \geq 8$; otherwise (5.54) is almost certainly better. Now, by definition, $|\delta|/x \leq 1/qQ$, and so $|\delta| \geq 8$ can happen only when $q \leq x/8Q$.

        With this in mind let us apply (556) assuming |δ| gt 8 Note first that

        x

        |δq|

        (q +

        x

        4W

        )minus1

        ge 1|δq|qx + 1

        4W

        ge 4|δq|1

        2Q + 1W

        ge 4W

        |δ|qmiddot 1

        1 + W2Q

        ge 4W

        |δ|qmiddot 1

        1 + xU2Q

        This is at least 2 min(2QW )|δq| Thus we are allowed to apply (556) when |δq| le2 min(2QW ) Since Q ge xU we know that min(2QW ) = W for all W le xU and so it is enough to assume that |δq| le 2W We will soon be making a strongerassumption

        Recalling also (619) we see that (556) gives us

        S2(U primeW primeW ) le min

        12qφ(q)

        log

        (4W|δ|q middot

        1

        1+xU2Q

        )( x

        |δq|+W

        2

        )middot 1

        2W (logW )

        (632)Similarly to before we define W0 = max(V θ|δq|) where θ ge 3e28 will be set

        later (Here θ ge 3e28 is an assumption we do not yet need but we will be using itsoon to simplify matters slightly) For W geW0 we certainly have |δq| le 2W Hencethe part of the first term of (615) coming from the range W0 leW lt xU is

        4

        int xU

        W0

        radicS1(UW ) middot S2(U VW )

        dW

        W

        le 4

        radicq

        φ(q)

        int xU

        W0

        radicradicradicradicradicS1(UW ) middot logW

        log

        (4W|δ|q middot

        1

        1+xU2Q

        ) (Wx

        |δq|+W 2

        2

        )dW

        W

        (633)


        By (534) the contribution of the term Wx|δq| to (633) is at most

        4xradic|δ|φ(q)

        int xU

        W0

        radicradicradicradicradicradic(H2

        ( x

        WU

        )+κ4

        4

        radicxWU

        U

        )logW

        log

        (4W|δ|q middot

        1

        1+xU2Q

        ) dWW

        Note that 1 + (xU)2Q le 32 Proceeding as in (623)ndash(631) we obtain that this isat most

        2xradic|δ|φ(q)

        radic(log

        x

        UW0+ Φ

        )(κ6 log

        x

        UW0+ 2κ7

        )

        where

        Φ =

        log (1+ε1)|δq|4 log

        (1 + log xUV

        log 4V|δ|(1+ε1)q

        )if |δq| le Vθ

        log 3|δq|8

        (log log 8x

        3U |δq| minus log log 8θ3

        )if Vθ lt |δq| le xθU

        (634)

        where ε1 = x2UQ This is what we think of as the main termBy (618) the contribution of the term W 22 to (633) is at most

        4

        radicq

        φ(q)

        int xU

        W0

        radicκ1

        2xdWradicWmiddot maxW0leWle x

        U

        radiclogW

        log 8W3|δq|

        (635)

        Since trarr (log t)(log tc) is decreasing for t gt c (635) is at most

        4radic

        2κ1

        radicq

        φ(q)

        (xradicUminusradicxW0

        )radiclogW0

        log 8W0

        3|δq| (636)

        If W0 gt V we also have to consider the range V leW lt W0 By Prop 524 and(619) the part of (615) coming from this is

        4

        int θ|δq|

        V

        radicS1(UW ) middot (logW )

        (Wx

        2|δq|+W 2

        4+

        Wx

        16(1minus ρ)Q+

        x

        8(1minus ρ)

        )dW

        W

        The contribution of W 24 is at most

        4

        int W0

        V

        radicκ1

        x

        WlogW middot W

        2

        4

        dW

        Wle 4radicκ1 middot

        radicxW0 middot

        radiclogW

        the sum of this and (636) is at most

        4radicκ1

        (radic2q

        φ(q)

        (xradicUminusradicxW0

        )radiclogW0

        log 8θ3

        +radicxW0

        radiclogW0

        )

        le κ2 middotradic

        q

        φ(q)

        xradicU

        radiclogW0


        where we use the facts that W0 = θ|δq| (by W0 gt V ) and θ ge 3e28 and where werecall that κ2 = 4

        radicκ1

        The terms Wx2|δ|q and Wx(16(1minus ρ)Q) contribute at most

        4radicκ1

        int θ|δq|

        V

        radicx

        Wmiddot (logW )W

        (x

        2|δq|+

        x

        16(1minus ρ)Q

        )dW

        W

        = κ2x

        (1radic2|δ|q

        +1

        4radic

        (1minus ρ)Q

        )int θ|δq|

        V

        radiclogW

        dW

        W

        =2κ2

        3x

        (1radic2|δ|q

        +1

        4radic

        (1minus ρ)Q

        )((log θ|δ|q)32 minus (log V )32

        )

        The term x8(1minus ρ) contributes

        radic2κ1x

        int θ|δq|

        V

        radiclogW

        W (1minus ρ)

        dW

        Wleradic

        2κ1xradic1minus ρ

        int infinV

        radiclogW

        W 32dW

        le κ2xradic2(1minus ρ)V

        (radic

        log V +radic

        1 log V )

        where we use the estimate

        int infinV

        radiclogW

        W 32dW =

        1radicV

        int infin1

        radiclog u+ log V

        u32du

        le 1radicV

        int infin1

        radiclog V

        u32du+

        1radicV

        int infin1

        1

        2radic

        log V

        log u

        u32du

        = 2

        radiclog VradicV

        +1

        2radicV log V

        middot 4 le 2radicV

        (radiclog V +

        radic1 log V

        )

        It is time to collect all type II terms Let us start with the case of general δ We willset θ ge e later If q le V2θ then |SII | is at most

        xradic2φ(q)

        middot

        radic(log

        x

        UV+ log 2q log

        (1 +

        log xUV

        log V2q

        ))(κ6 log

        x

        UV+ 2κ7

        )+radic

        2κ2

        radicq

        φ(q)

        (1 + 115

        radiclog 2q

        log x2Uq

        )xradicU

        + κ9xradicV

        (637)


        If V2θ lt q le x2θU then |SII | is at most

        xradic2φ(q)

        middot

        radic(log

        x

        U middot 2θq+ log 2q log

        log x2Uq

        log θ

        )(κ6 log

        x

        U middot 2θq+ 2κ7

        )

        +radic

        2κ2

        radicq

        φ(q)

        (1 + 115

        radiclog 2q

        log x2Uq

        )xradicU

        + (κ2

        radiclog 2θq + κ9)

        xradicV

        +κ2

        6

        ((log 2θq)32 minus (log V )32

        ) xradicq

        + κ2

        (radic2θ middot log 2θq +

        2

        3((log 2θq)32 minus (log V )32)

        )radicqx

        (638)where we use the fact that Q ge xU (implying that ρ0 = max(1 2qQ) equals 1 forq le x2U ) Finally if q gt x2θU

        |SII | le (κ2

        radic2 log xU + κ9)

        xradicV

        + κ2

        radiclog xU

        xradicU

        +2κ2

        3((log xU)32 minus (log V )32)

        (x

        2radic

        2q+radicqx

        )

        (639)

        Now let us examine the alternative bounds for |δ| ge 8 Here we assume θ ge 3e28If |δq| le Vθ then |SII | is at most

        2xradic|δ|φ(q)

        radicradicradicradiclogx

        UV+ log

        |δq|(1 + ε1)

        4log

        (1 +

        log xUV

        log 4V|δ|(1+ε1)q

        )

        middotradicκ6 log

        x

        UV+ 2κ7

        + κ2

        radic2q

        φ(q)middot

        radiclog V

        log 2V|δq|middot xradic

        U+ κ9

        xradicV

        (640)

        where ε1 = x2UQ If Vθ lt |δ|q le xθU then |SII | is at most

        2xradic|δ|φ(q)

        radicradicradicradic(logx

        U middot θ|δ|q+ log

        3|δq|8

        loglog 8x

        3U |δq|

        log 8θ3

        )(κ6 log

        x

        U middot θ|δq|+ 2κ7

        )

        +2κ2

        3

        (xradic2|δq|

        +x

        4radicQminus q

        )((log θ|δq|)32 minus (log V )32

        )+

        (κ2radic

        2(1minus ρ)

        (radiclog V +

        radic1 log V

        )+ κ9

        )xradicV

        + κ2

        radicq

        φ(q)middotradic

        log θ|δq| middot xradicU

        (641)

        63 ADJUSTING PARAMETERS CALCULATIONS 117

        where ρ = qQ Note that |δ| le xQq implies ρ le xQ2 and so ρ will be very smalland Qminus q will be very close to Q

        The case |δq| gt xθU will not arise in practice essentially because of |δ|q le xQ

        6.3 Adjusting parameters. Calculations.

        We must bound the exponential sum $\sum_n \Lambda(n) e(\alpha n) \eta(n/x)$. By (3.8), it is enough to sum the bounds we obtained in §6.2. We will now see how it will be best to set $U$, $V$ and other parameters.

        Usually, the largest terms will be
        \[ C_0\, UV, \tag{6.42} \]
        where $C_0$ equals
        \[ \begin{cases}
        c_{4,I_2} + c_{9,I_2} = 4.39779 + 5.21993 \epsilon_0 & \text{if } |\delta| \leq 1/2c_2 \sim 0.74463, \\
        c_{4,I_2} + (1+\epsilon) c_{13,I_2} = (4.89106 + 1.31541 \epsilon)(1 + \epsilon_0) & \text{if } |\delta| > 1/2c_2
        \end{cases} \tag{6.43} \]
        (from (6.13) and (6.14), type I; we will specify $\epsilon$ and $\epsilon_0 = (4 \log 2)/(x/UV)$ later) and

        xradicδ0φ(q)

        radicradicradicradiclogx

        UV+ (log δ0(1 + ε1)q) log

        (1 +

        log xUV

        log Vδ0(1+ε1)q

        )

        middotradicκ6 log

        x

        UV+ 2κ7

        (644)

        (from (637) and (640) type II here δ0 = max(2 |δ|4) while ε1 = x2UQ for|δ| gt 8 and ε1 = 0 for |δ| lt 8

        We set UV = κxradicqδ0 we must choose κ gt 0

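        Let us first optimize (or, rather, almost optimize) $\kappa$ in the case $|\delta| \le 4$, so that $\delta_0 = 2$ and $\epsilon_1 = 0$. For the purpose of choosing $\kappa$, we replace $\sqrt{\phi(q)}$ by $\sqrt{q}/C_1$, where $C_1 = 2.3536 \sim \sqrt{510510/\phi(510510)}$, and also replace $V$ by $q^2/c$, $c$ a constant. We use the approximation

        \[ \log\left(1 + \frac{\log x/UV}{\log V/|2q|}\right) = \log\left(1 + \frac{\log(\sqrt{2q}/\kappa)}{\log(q/2c)}\right) = \log\left(\frac{3}{2} + \frac{\log 2\sqrt{c}/\kappa}{\log q/2c}\right) \sim \log\frac{3}{2} + \frac{2\log 2\sqrt{c}/\kappa}{3\log q/2c}. \]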
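The two numerical ingredients above are easy to sanity-check. The sketch below is illustrative only: `euler_phi`, `exact_log_term` and `linearized_log_term` are hypothetical helper names, and the sample values $q = 5\cdot 10^5$, $c = 2.5$, $\kappa = 1/2$ are assumptions borrowed from the typical range discussed in this section. It checks that 2.3536 matches the square root of $510510/\phi(510510)$ (the quantity that stands in for $\sqrt{q/\phi(q)}$) and that the linearized logarithm sits slightly above the exact one, as expected from concavity.

```python
import math

def euler_phi(n: int) -> int:
    """Euler's totient by trial division (adequate for small n)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

q0 = 510510  # 2*3*5*7*11*13*17
C1 = math.sqrt(q0 / euler_phi(q0))

def exact_log_term(q, c, kappa):
    # log(1 + log(sqrt(2q)/kappa) / log(q/(2c)))
    return math.log(1 + math.log(math.sqrt(2 * q) / kappa) / math.log(q / (2 * c)))

def linearized_log_term(q, c, kappa):
    # log(3/2) + 2 log(2 sqrt(c)/kappa) / (3 log(q/(2c)))
    return math.log(1.5) + 2 * math.log(2 * math.sqrt(c) / kappa) / (3 * math.log(q / (2 * c)))

print(euler_phi(q0), round(C1, 4))
print(exact_log_term(5e5, 2.5, 0.5), linearized_log_term(5e5, 2.5, 0.5))
```

The gap between the two logarithms is small for these sample values, which is all the approximation is used for: choosing $\kappa$, not proving a bound.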

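        What we must minimize, then, is

        \[ \begin{aligned}
        &\frac{C_0\kappa}{\sqrt{2q}} + \frac{C_1}{\sqrt{2q}}\sqrt{\left(\log\frac{\sqrt{2q}}{\kappa} + \log 2q\left(\log\frac{3}{2} + \frac{2\log\frac{2\sqrt{c}}{\kappa}}{3\log\frac{q}{2c}}\right)\right)\left(\kappa_6\log\frac{\sqrt{2q}}{\kappa} + 2\kappa_7\right)} \\
        &\le \frac{C_0\kappa}{\sqrt{2q}} + \frac{C_1}{2\sqrt{q}}\frac{\sqrt{\kappa_6}}{\sqrt{\kappa_1'}}\sqrt{\kappa_1'\log q - \left(\frac{5}{3} + \frac{2}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa + \kappa_2'}\cdot\sqrt{\kappa_1'\log q - 2\kappa_1'\log\kappa + \frac{4\kappa_1'\kappa_7}{\kappa_6} + \kappa_1'\log 2} \\
        &\le \frac{C_0}{\sqrt{2q}}\left(\kappa + \kappa_4'\left(\kappa_1'\log q - \left(\left(\frac{5}{6} + \kappa_1'\right) + \frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\log\kappa + \kappa_3'\right)\right),
        \end{aligned} \tag{6.45} \]

        where

        \[ \begin{aligned}
        \kappa_1' &= \frac{1}{2} + \log\frac{3}{2}, \qquad \kappa_2' = \log\sqrt{2} + \log 2\,\log\frac{3}{2} + \frac{\log 4c\,\log 2q}{3\log\frac{q}{2c}}, \\
        \kappa_3' &= \frac{1}{2}\left(\kappa_2' + \frac{4\kappa_1'\kappa_7}{\kappa_6} + \kappa_1'\log 2\right) = \frac{\log 4c}{6} + \frac{(\log 4c)^2}{6\log\frac{q}{2c}} + \kappa_5', \\
        \kappa_4' &= \frac{C_1}{C_0}\sqrt{\frac{\kappa_6}{2\kappa_1'}} \sim \begin{cases} \frac{0.30915}{1+1.18694\epsilon_0} & \text{if } |\delta| \le 4, \\ \frac{0.27797}{(1+0.26894\epsilon)(1+\epsilon_0)} & \text{if } |\delta| > 4, \end{cases} \\
        \kappa_5' &= \frac{1}{2}\left(\log\sqrt{2} + \log 2\,\log\frac{3}{2} + \frac{4\kappa_1'\kappa_7}{\kappa_6} + \kappa_1'\log 2\right) \sim 1.01152.
        \end{aligned} \]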

        Taking derivatives, we see that the minimum is attained when

        \[ \kappa = \left(\frac{5}{6} + \kappa_1' + \frac{1}{3}\frac{\log 4c}{\log\frac{q}{2c}}\right)\kappa_4' \sim \left(1.7388 + \frac{\log 4c}{3\log\frac{q}{2c}}\right)\cdot\frac{0.30915}{1 + 1.19\epsilon_0} \tag{6.46} \]

        provided that $|\delta| \le 4$. (What we obtain for $|\delta| > 4$ is essentially the same, only with $\delta_0 q = \delta q/4$ instead of $2q$ and $0.27797/((1+0.27\epsilon)(1+\epsilon_0))$ in place of $0.30915$.) For $q = 5\cdot 10^5$, $c = 2.5$ and $|\delta| \le 4$ (typical values in the most delicate range), we get that $\kappa$ should be about $0.5582/(1+1.19\epsilon_0)$. Values of $q$, $c$ nearby give similar values for $\kappa$, whether $|\delta| \le 4$ or $|\delta| > 4$.

        (Incidentally, at this point, we could already give a back-of-the-envelope estimate for the last line of (6.45), i.e., our main term. It suggests that choosing $w = 1$ instead of $w = 2$ would have given bounds worse by about 15 percent.)
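The optimization of $\kappa$ can be replayed numerically. This is a minimal sketch under stated assumptions: the values $\kappa_6 = 0.60428$ and $\kappa_7 = 0.1281$ are not given in this passage but are inferred from the coefficients $0.30214 = \kappa_6/2$ and $0.2562 = 2\kappa_7$ that appear in the type II bounds of this section, $\epsilon_0$ is neglected, and the function names are hypothetical.

```python
import math

# Assumed constants (inferred, see the lead-in); epsilon_0 is neglected throughout.
KAPPA6 = 0.60428
KAPPA7 = 0.1281
C0 = 4.39779      # (6.43), small |delta|
C1 = 2.3536

k1 = 0.5 + math.log(1.5)                              # kappa'_1
k4 = (C1 / C0) * math.sqrt(KAPPA6 / (2 * k1))         # kappa'_4
k5 = 0.5 * (math.log(math.sqrt(2)) + math.log(2) * math.log(1.5)
            + 4 * k1 * KAPPA7 / KAPPA6 + k1 * math.log(2))   # kappa'_5

def slope(q, c):
    """Coefficient of -log(kappa) in the last line of (6.45)."""
    return 5 / 6 + k1 + math.log(4 * c) / (3 * math.log(q / (2 * c)))

def kappa_opt(q, c):
    """Minimizer (6.46)."""
    return slope(q, c) * k4

def objective(kappa, q, c):
    """Bracket in the last line of (6.45): the bound divided by C0/sqrt(2q)."""
    k3 = math.log(4 * c) / 6 + math.log(4 * c) ** 2 / (6 * math.log(q / (2 * c))) + k5
    return kappa + k4 * (k1 * math.log(q) - slope(q, c) * math.log(kappa) + k3)

q, c = 5e5, 2.5
best = kappa_opt(q, c)
print(round(k4, 5), round(k5, 5), round(best, 4))
print(objective(0.5, q, c) / objective(best, q, c))
```

Under these assumptions the printed values should agree, up to rounding, with $\kappa_4' \sim 0.30915$, $\kappa_5' \sim 1.01152$ and $\kappa \approx 0.5582$, and the last ratio shows how little is lost by rounding $\kappa$ to $1/2$.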

        We make the choices

        \[ \kappa = 1/2, \quad\text{and so}\quad UV = \frac{x}{2\sqrt{q\delta_0}}, \]

        for the sake of simplicity. (Unsurprisingly, (6.45) changes very slowly around its minimum.) Note, by the way, that this means that $\epsilon_0 = (2\log 2)/\sqrt{q\delta_0}$.

        Now we must decide how to choose $U$, $V$ and $Q$, given our choice of $UV$. We will actually make two sets of choices.

        First, we will use the $S_{I,2}$ estimates for $q \le Q/V$ to treat all $\alpha$ of the form $\alpha = a/q + O^*(1/qQ)$, $q \le y$. (Here $y$ is a parameter satisfying $y \le Q/V$.)

        Then, the remaining $\alpha$ will get treated with the (coarser) $S_{I,2}$ estimate for $q > Q/V$, with $Q$ reset to a lower value (call it $Q'$). If $\alpha$ was not treated in the first go (so that it must be dealt with by the coarser estimate), then $\alpha = a'/q' + \delta'/x$, where either $q' > y$ or $\delta' q' > x/Q$. (Otherwise, $\alpha = a'/q' + O^*(1/q'Q)$ would be a valid estimate with $q' \le y$.) The value of $Q'$ is set to be smaller than $Q$ both because this is helpful (it diminishes error terms that would be large for large $q$) and because this is harmless (since we are no longer assuming that $q \le Q/V$).

        6.3.1 First choice of parameters: $q \le y$

        The largest items affected strongly by our choices at this point are

        \[ \begin{aligned}
        &c_{16,I_2}\left(2 + \frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta| q}{x}\right)\frac{x}{Q/V} + c_{17,I_2}Q && \text{(from } S_{I,2},\ |\delta| > 1/2c_2\text{)}, \\
        &\left(c_{10,I_2}\log\frac{U}{q} + 2c_{5,I_2} + c_{12,I_2}\right)Q && \text{(from } S_{I,2},\ |\delta| \le 1/2c_2\text{)},
        \end{aligned} \tag{6.47} \]

        and

        \[ \kappa_2\sqrt{\frac{2q}{\phi(q)}}\left(1 + 1.15\sqrt{\frac{\log 2q}{\log x/2Uq}}\right)\frac{x}{\sqrt{U}} + \kappa_9\frac{x}{\sqrt{V}} \qquad \text{(from } S_{II}\text{, any } |\delta|\text{)}, \tag{6.48} \]

        with

        \[ \kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log 2V/|\delta q|}}\cdot\frac{x}{\sqrt{U}} \qquad \text{(from } S_{II}\text{)} \]

        as an alternative to (6.48) for $|\delta| \ge 8$. (In several of these expressions, we are applying some minor simplifications that our later choices will justify. Of course, even if these simplifications were not justified, we would not be getting incorrect results, only potentially suboptimal ones; we are trying to decide how to choose certain parameters.)

        In addition, we have a relatively mild but important dependence on $V$ in the main term (6.44), even when we hold $UV$ constant (as we do, in so far as we have already chosen $UV$). We must also respect the condition $q \le Q/V$, the lower bound on $U$ given by (6.17) and the assumptions made at the beginning of the chapter (e.g. $Q \ge x/U$, $V \ge 2\cdot 10^6$). Recall that $UV = x/2\sqrt{q\delta_0}$.

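        We set

        \[ Q = \frac{x}{8y}, \]

        since we will then have not just $q \le y$ but also $q|\delta| \le x/Q = 8y$, and so $q\delta_0 \le 2y$. We want $q \le Q/V$ to be true whenever $q \le y$; this means that

        \[ q \le \frac{Q}{V} = \frac{QU}{UV} = \frac{QU}{x/2\sqrt{q\delta_0}} = \frac{U\sqrt{q\delta_0}}{4y} \]

        must be true when $q \le y$, and so it is enough to set $U = 4y^2/\sqrt{q\delta_0}$. The following choices make sense: we will work with the parameters

        \[ \begin{aligned}
        y &= \frac{x^{1/3}}{6}, \qquad Q = \frac{x}{8y} = \frac{3}{4}x^{2/3}, \qquad x/UV = 2\sqrt{q\delta_0} \le 2\sqrt{2y}, \\
        U &= \frac{4y^2}{\sqrt{q\delta_0}} = \frac{x^{2/3}}{9\sqrt{q\delta_0}}, \qquad V = \frac{x}{(x/UV)\cdot U} = \frac{x}{8y^2} = \frac{9x^{1/3}}{2},
        \end{aligned} \tag{6.49} \]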

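        where, as before, $\delta_0 = \max(2, |\delta|/4)$. So, for instance, we obtain $\epsilon_1 \le x/2UQ = 6\sqrt{q\delta_0}/x^{1/3} \le 2\sqrt{3}/x^{1/6}$. Assuming

        \[ x \ge 2.16\cdot 10^{20}, \tag{6.50} \]

        we obtain that $U/(x/UV) \ge (x^{2/3}/9\sqrt{q\delta_0})/(2\sqrt{q\delta_0}) = x^{2/3}/18q\delta_0 \ge x^{1/3}/6 \ge 10^6$, and so (6.17) holds. We also get that $\epsilon_1 \le 0.002$.

        Since $V = x/8y^2 = (9/2)x^{1/3}$, (6.50) also implies that $V \ge 2\cdot 10^6$ (in fact, $V \ge 27\cdot 10^6$). It is easy to check that

        \[ V < x/4, \quad UV \le x, \quad Q \ge \max(16, 2\sqrt{x}), \quad Q \ge \max(2U, x/U), \tag{6.51} \]

        as stated at the beginning of the chapter. Let $\theta = (3/2)^3 = 27/8$. Then

        \[ \begin{aligned}
        \frac{V}{2\theta q} &= \frac{x/8y^2}{2\theta q} \ge \frac{x}{16\theta y^3} = \frac{x}{54y^3} = 4 > 1, \\
        \frac{V}{\theta|\delta q|} &\ge \frac{x/8y^2}{8\theta y} \ge \frac{x}{64\theta y^3} = \frac{x}{216y^3} = 1.
        \end{aligned} \tag{6.52} \]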
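The parameter choices (6.49) and the side conditions (6.50) to (6.52) can be checked in floating point at the smallest admissible $x$. The sketch below is only illustrative: `first_choice` is a hypothetical helper, and the worst case $q\delta_0 = 2y$ is an assumption chosen because it makes $U$ smallest and $\epsilon_1$ largest.

```python
import math

def first_choice(x, q_delta0):
    """Parameters of (6.49); q_delta0 stands for the product q*delta_0."""
    y = x ** (1 / 3) / 6
    Q = x / (8 * y)
    U = 4 * y ** 2 / math.sqrt(q_delta0)
    V = x / (8 * y ** 2)
    eps1 = x / (2 * U * Q)
    return y, Q, U, V, eps1

x = 2.16e20
theta = 27 / 8
y, Q, U, V, eps1 = first_choice(x, q_delta0=2 * (x ** (1 / 3) / 6))

print("y =", y, " V =", V, " eps1 =", eps1)
print("U/(x/UV) =", U * U * V / x)
print("V/(2 theta y) =", V / (2 * theta * y), " V/(8 theta y) =", V / (8 * theta * y))
```

At $x = 2.16\cdot 10^{20}$ the quantities $y$, $U/(x/UV)$ and the two ratios in (6.52) land on round numbers, which suggests that the threshold in (6.50) was picked to make $x^{1/3}/6 = 10^6$.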

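        The first type I bound is

        \[ \begin{aligned}
        |S_{I,1}| \le{} &\frac{x}{q}\min\left(1, \frac{c_0'}{\delta^2}\right)\left(\min\left(\frac{4}{5}\frac{q/\phi(q)}{\log^+\frac{x^{2/3}/9}{q^{5/2}\delta_0^{1/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) + \frac{c_{4,I}\,q}{\phi(q)}\right) \\
        &+ \left(c_{7,I}\log\frac{y}{c_2} + c_{8,I}\log x\right)y + \frac{c_{10,I}\,x^{1/3}}{3^4 2^2 q^{3/2}\delta_0^{1/2}}\left(\log 9x^{1/3}\sqrt{eq\delta_0}\right) \\
        &+ \left(c_{5,I}\log\frac{2x^{2/3}}{9c_2\sqrt{q\delta_0}} + c_{6,I}\log\frac{x^{5/3}}{9\sqrt{q\delta_0}}\right)\frac{x^{2/3}}{9\sqrt{q\delta_0}} + c_{9,I}\sqrt{x}\log\frac{2\sqrt{e}x}{c_2} + \frac{c_{10,I}}{e},
        \end{aligned} \tag{6.53} \]

        where the constants are as in §6.2.1. For any $c, R \ge 1$, the function

        \[ x \to (\log cx)/(\log x/R) \]

        attains its maximum on $[R', \infty)$, $R' > R$, at $x = R'$. Hence, for $q\delta_0$ fixed,

        \[ \min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) \tag{6.54} \]

        attains its maximum for $x \in [(9e^{4/5}(\delta_0 q)^{5/2}/4)^{3/2}, \infty)$ at

        \[ x = \left(9e^{4/5}(\delta_0 q)^{5/2}/4\right)^{3/2} = (27/8)e^{6/5}(q\delta_0)^{15/4}. \tag{6.55} \]

        Now notice that, for smaller values of $x$, (6.54) increases as $x$ increases, since the term $\min(\dots, 1)$ equals the constant $1$. Hence, (6.54) attains its maximum for $x \in (0, \infty)$ at (6.55), and so

        \[ \min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) + c_{4,I} \le \log\frac{27}{2}e^{2/5}(\delta_0 q)^{7/4} + c_{3,I} + c_{4,I} \le \frac{7}{4}\log\delta_0 q + 6.11676. \]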
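The location of the maximum of (6.54) can be probed numerically. In this sketch, `f654` and `x_star` are hypothetical names, `s` stands for the product $q\delta_0$, and the value $c_{3,I} = 2.11104$ is an assumption inferred from the identity $-4.99944 = c_{3,I} - \log 500\sqrt{6}$ used later in the section; the location of the maximum does not depend on it.

```python
import math

C3I = 2.11104  # assumed value of c_{3,I} (inferred, see the lead-in)

def f654(x, s):
    """Expression (6.54), with s = q*delta_0."""
    arg = 4 * x ** (2 / 3) / (9 * s ** 2.5)
    log_plus = max(math.log(arg), 0.0)
    # min((4/5)/log^+, 1), reading the quotient as +infinity when log^+ vanishes
    factor = 1.0 if log_plus == 0 else min(0.8 / log_plus, 1.0)
    return factor * (math.log(9 * x ** (1 / 3) * math.sqrt(s)) + C3I)

def x_star(s):
    """Claimed maximizer (6.55)."""
    return 27 / 8 * math.exp(6 / 5) * s ** (15 / 4)

s = 10.0
xs = x_star(s)
peak = f654(xs, s)
closed_form = math.log(13.5) + 0.4 + 1.75 * math.log(s) + C3I
samples = [f654(xs * 1.5 ** k, s) for k in range(-40, 41)]
print(peak, closed_form, max(samples))
```

On a geometric grid around `x_star(s)`, no sample exceeds the value at `x_star(s)`, and that value matches $\log\frac{27}{2}e^{2/5}s^{7/4} + c_{3,I}$.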

        Examining the other terms in (6.53) and using (6.50), we conclude that

        \[ |S_{I,1}| \le \frac{x}{q}\min\left(1, \frac{c_0'}{\delta^2}\right)\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right) + \frac{x^{2/3}}{\sqrt{q\delta_0}}(0.67845\log x - 1.20818) + 0.37864x^{2/3}, \tag{6.56} \]

        where we are using (6.50) (and, of course, the trivial bound $\delta_0 q \ge 2$) to simplify the smaller error terms. We recall that $c_0' = 0.798437 > c_0/(2\pi)^2$.

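        Let us now consider $S_{I,2}$. The terms that appear both for $|\delta|$ small and $|\delta|$ large are given in (6.12). The second line in (6.12) equals

        \[ \begin{aligned}
        &c_{8,I_2}\left(\frac{x}{4q^2\delta_0} + \frac{2UV^2}{x} + \frac{qV^2}{x}\right) + \frac{c_{10,I}}{2}\left(\frac{q}{2\sqrt{q\delta_0}} + \frac{x^{2/3}}{18q\delta_0}\right)\log\frac{9x^{1/3}}{2} \\
        &\le c_{8,I_2}\left(\frac{x}{4q^2\delta_0} + \frac{9x^{1/3}}{2\sqrt{2}} + \frac{27}{8}\right) + \frac{c_{10,I}}{2}\left(\frac{y^{1/6}}{2^{3/2}} + \frac{x^{2/3}}{18q\delta_0}\right)\left(\frac{1}{3}\log x + \log\frac{9}{2}\right) \\
        &\le 0.29315\frac{x}{q^2\delta_0} + (0.08679\log x + 0.39161)\frac{x^{2/3}}{q\delta_0} + 0.00153\sqrt{x},
        \end{aligned} \]

        where we are using (6.50) to simplify. Now

        \[ \min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}}, 1\right)\log Vq = \min\left(\frac{4/5}{\log^+\frac{y}{4q^2}}, 1\right)\log\frac{9x^{1/3}q}{2} \tag{6.57} \]

        can be bounded trivially by $\log(9x^{1/3}q/2) \le (2/3)\log x + \log 3/4$. We can also bound (6.57) as we bounded (6.54) before, namely, by fixing $q$ and finding the maximum for $x$ variable. In this way, we obtain that (6.57) is maximal for $y = 4e^{4/5}q^2$; since, by definition, $x^{1/3}/6 = y$, (6.57) then equals

        \[ \log\frac{9(6\cdot 4e^{4/5}q^2)q}{2} = 3\log q + \log 108 + \frac{4}{5} \le 3\log q + 5.48214. \]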

        We conclude that (6.12) is at most

        \[ \min\left(1, \frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)} + 0.29315\frac{x}{q^2\delta_0} + (0.0434\log x + 0.1959)x^{2/3}. \tag{6.58} \]

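        If $|\delta| \le 1/2c_2$, we must consider (6.13). This is at most

        \[ \begin{aligned}
        &(c_{4,I_2} + c_{9,I_2})\frac{x}{2\sqrt{q\delta_0}} + \left(c_{10,I_2}\log\frac{x^{2/3}}{9q^{3/2}\sqrt{\delta_0}} + 2c_{5,I_2} + c_{12,I_2}\right)\cdot\frac{3}{4}x^{2/3} \\
        &\le \frac{2.1989x}{\sqrt{q\delta_0}} + \frac{3.61818x}{q\delta_0} + (1.77019\log x + 29.2955)x^{2/3},
        \end{aligned} \]

        where we recall that $\epsilon_0 = (4\log 2)/(x/UV) = (2\log 2)/\sqrt{q\delta_0}$, which can be bounded crudely by $\sqrt{2}\log 2$. (Thus, $c_{10,I_2} \le \sqrt{1 + \sqrt{8}\log 2}\cdot 1.78783 < 3.54037$ and $c_{12,I_2} \le 29.3333 + 11.902\sqrt{2}\log 2 \le 41.0004$.)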

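        If $|\delta| > 1/2c_2$, we must consider (6.14) instead. For $\epsilon = 0.07$, that is at most

        \[ \begin{aligned}
        &(c_{4,I_2} + (1+\epsilon)c_{13,I_2})\frac{x}{2\sqrt{q\delta_0}}\left(1 + \frac{2\log 2}{\sqrt{q\delta_0}}\right) + \left(3.38845\left(1 + \frac{2\log 2}{\sqrt{q\delta_0}}\right)\log\delta q^3 + 20.8823\right)\frac{x}{|\delta| q} \\
        &\quad + \left(68.8133\left(1 + \frac{4\log 2}{\sqrt{q\delta_0}}\right)\log|\delta| q + 72.0828\right)x^{2/3} + 60.4141x^{1/3} \\
        &= 2.49157\frac{x}{\sqrt{q\delta_0}}\left(1 + \frac{2\log 2}{\sqrt{q\delta_0}}\right) + (3.38845\log\delta q^3 + 32.6771)\frac{x}{|\delta| q} \\
        &\quad + \left(22.9378\log x + 190.791\frac{\log|\delta| q}{\sqrt{q\delta_0}} + 130.691\right)x^{2/3} \\
        &\le 2.49157\frac{x}{\sqrt{q\delta_0}} + (3.59676\log\delta_0 + 27.3032\log q + 91.2218)\frac{x}{q\delta_0} + (22.9378\log x + 411.228)x^{2/3},
        \end{aligned} \]

        where, besides the crude bound $\epsilon_0 \le \sqrt{2}\log 2$, we use the inequalities

        \[ \begin{aligned}
        \frac{\log|\delta| q}{\sqrt{q\delta_0}} &\le \frac{\log 4q\delta_0}{\sqrt{q\delta_0}} \le \frac{\log 8}{\sqrt{2}}, \qquad \frac{\log q}{\sqrt{q\delta_0}} \le \frac{1}{\sqrt{2}}\frac{\log q}{\sqrt{q}} \le \frac{1}{\sqrt{2}}\frac{\log e^2}{e} = \frac{\sqrt{2}}{e}, \\
        \frac{1}{|\delta|} &\le \frac{4c_2}{\delta_0}, \qquad \frac{\log|\delta|}{|\delta|} \le \frac{2}{e\log 2}\cdot\frac{\log\delta_0}{\delta_0}.
        \end{aligned} \]

        (Obviously, $1/|\delta| \le 4c_2/\delta_0$ is based on the assumption $|\delta| > 1/2c_2$ and on the inequality $16c_2 \ge 1$. The bound on $(\log|\delta|)/|\delta|$ is based on the fact that $(\log t)/t$ reaches its maximum at $t = e$, and $(\log\delta_0)/\delta_0 = (\log 2)/2$ for $|\delta| \le 8$.)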
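The constants in the last line of this bound follow from the four inequalities by direct arithmetic. The sketch below replays that arithmetic; the variable names are hypothetical, and the reading of the garbled constants ($3.38845$, $32.6771$, $190.791$, $130.691$, $c_0 = 31.521$, $c_0' = 0.798437$) is an assumption that the computation itself supports.

```python
import math

c0 = 31.521
c2 = 6 * math.pi / (5 * math.sqrt(c0))   # 0.6714769...
threshold = 1 / (2 * c2)                 # boundary between the two cases for |delta|
c0_prime = 0.798437

a = 3.38845  # coefficient of log(delta q^3) in the middle expression

# 1/|delta| <= 4 c2/delta_0 turns 3a*log(q) * x/(|delta| q) into coef_log_q * log(q) * x/(q delta_0)
coef_log_q = 3 * a * 4 * c2
# (log|delta|)/|delta| <= (2/(e log 2)) (log delta_0)/delta_0
coef_log_delta0 = a * 2 / (math.e * math.log(2))
# constant term: 32.6771 * 4 c2, plus 2.49157 * 2 log 2 from the factor (1 + 2 log 2/sqrt(q delta_0))
const = 32.6771 * 4 * c2 + 2.49157 * 2 * math.log(2)
# x^{2/3} term: log(|delta| q)/sqrt(q delta_0) <= log(8)/sqrt(2)
tail = 190.791 * math.log(8) / math.sqrt(2) + 130.691

print(round(threshold, 5), round(coef_log_delta0, 5), round(coef_log_q, 4))
print(round(const, 4), round(tail, 3))
```

These should reproduce $0.74463$, $3.59676$, $27.3032$, $91.2218$ and $411.228$ up to rounding in the last digit.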

        We sum (6.58) and whichever one of our bounds for (6.13) and (6.14) is greater (namely, the latter). We obtain that, for any $\delta$,

        \[ \begin{aligned}
        |S_{I,2}| \le{} &2.49157\frac{x}{\sqrt{q\delta_0}} + \min\left(1, \frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)} \\
        &+ (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0} + (22.9812\log x + 411.424)x^{2/3},
        \end{aligned} \tag{6.59} \]

        where we bound one of the lower-order terms in (6.58) by $x/q^2\delta_0 \le x/q\delta_0$.

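        For type II, we have to consider two cases: (a) $|\delta| < 8$, and (b) $|\delta| \ge 8$. Consider first $|\delta| < 8$. Then $\delta_0 = 2$. Recall that $\theta = 27/8$. We have $q \le V/2\theta$ and $|\delta q| \le V/\theta$ thanks to (6.52). We apply (6.37), and obtain that, for $|\delta| < 8$,

        \[ \begin{aligned}
        |S_{II}| \le{} &\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\frac{1}{2}\log 4q\delta_0 + \log 2q\,\log\left(1 + \frac{\frac{1}{2}\log 4q\delta_0}{\log\frac{V}{2q}}\right)}\cdot\sqrt{0.30214\log 4q\delta_0 + 0.2562} \\
        &+ 8.22088\sqrt{\frac{q}{\phi(q)}}\left(1 + 1.15\sqrt{\frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}}}\right)(q\delta_0)^{1/4}x^{2/3} + 1.84251x^{5/6} \\
        \le{} &\frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{C_{x,2q}\log 2q + \frac{\log 8q}{2}}\cdot\sqrt{0.30214\log 2q + 0.67506} \\
        &+ 16.406\sqrt{\frac{q}{\phi(q)}}x^{3/4} + 1.84251x^{5/6},
        \end{aligned} \tag{6.60} \]

        where we bound

        \[ \frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}} \le \frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} < \lim_{x\to\infty}\frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} = 2, \]

        and where we define

        \[ C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) \]

        for $0 < t < 9x^{1/3}/2$. (We have $2.004$ here instead of $2$ because we want a constant $\ge 2(1+\epsilon_1)$ in later occurrences of $C_{x,t}$, for reasons that will soon become clear.)

        For purposes of later comparison, we remark that $16.404 \le 1.57863x^{4/5-3/4}$ for $x \ge 2.16\cdot 10^{20}$.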

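        Consider now case (b), namely, $|\delta| \ge 8$. Then $\delta_0 = |\delta|/4$. By (6.52), $|\delta q| \le V/\theta$. Hence, (6.40) gives us that

        \[ \begin{aligned}
        |S_{II}| \le{} &\frac{2x}{\sqrt{|\delta|\phi(q)}}\cdot\sqrt{\frac{1}{2}\log|\delta q| + \log\frac{|\delta q|(1+\epsilon_1)}{4}\,\log\left(1 + \frac{\log|\delta| q}{2\log\frac{18x^{1/3}}{|\delta|(1+\epsilon_1)q}}\right)}\cdot\sqrt{0.30214\log|\delta| q + 0.2562} \\
        &+ 8.22088\sqrt{\frac{q}{\phi(q)}}\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4}x^{2/3} + 1.84251x^{5/6} \\
        \le{} &\frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}\log\delta_0(1+\epsilon_1)q + \frac{\log 4\delta_0 q}{2}}\,\sqrt{0.30214\log\delta_0 q + 0.67506} \\
        &+ 1.79926\sqrt{\frac{q}{\phi(q)}}x^{4/5} + 1.84251x^{5/6},
        \end{aligned} \tag{6.61} \]

        since

        \[ 8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4} \le 8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{27}{4}}}\cdot(x^{1/3}/3)^{1/4} \le 1.79926x^{4/5-2/3} \]

        for $x \ge 2.16\cdot 10^{20}$. Clearly

        \[ \log\delta_0(1+\epsilon_1)q = \log\delta_0 q + \log(1+\epsilon_1) \le \log\delta_0 q + \epsilon_1. \]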
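The numerical coefficients in the second lines of (6.60) and (6.61) can be recomputed at $x = 2.16\cdot 10^{20}$, using $q\delta_0 \le 2y = x^{1/3}/3$. This is a rough floating-point sketch with hypothetical variable names, not a rigorous verification.

```python
import math

x = 2.16e20

# Case |delta| < 8: 8.22088 (1 + 1.15 sqrt(2)) (x^{1/3}/3)^{1/4} x^{2/3} = coef_a * x^{3/4}
coef_a = 8.22088 * (1 + 1.15 * math.sqrt(2)) / 3 ** 0.25
# Right-hand side of the comparison 16.404 <= 1.57863 x^{4/5 - 3/4}
rhs_a = 1.57863 * x ** (4 / 5 - 3 / 4)

# Case |delta| >= 8: the bracket in front of x^{2/3}, divided by x^{4/5 - 2/3}
coef_b = (8.22088 * math.sqrt(math.log(9 * x ** (1 / 3) / 2) / math.log(27 / 4))
          * (x ** (1 / 3) / 3) ** 0.25 / x ** (4 / 5 - 2 / 3))

print(round(coef_a, 3), round(rhs_a, 3), round(coef_b, 5))
```

The first value should come out near $16.406$, the second shows that the comparison with $1.57863x^{4/5-3/4}$ holds at the threshold, and the third should be close to $1.79926$.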

        By Lemma C.2.2, $q/\phi(q) \le z(y) = z(x^{1/3}/6)$ (since $x \ge 18^3$). It is easy to check that $x \to \sqrt{z(x^{1/3}/6)}\,x^{4/5-5/6}$ is decreasing for $x \ge 2.16\cdot 10^{20}$ (in fact, for $x \ge 18^3$). Using (6.50), we conclude that $1.79926\sqrt{q/\phi(q)}\,x^{4/5} \le 0.89657x^{5/6}$ and, by the way, $16.406\sqrt{q/\phi(q)}\,x^{3/4} \le 0.78663x^{5/6}$. This allows us to simplify the last lines of (6.60) and (6.61). We obtain that, for $\delta$ arbitrary,

        \[ |S_{II}| \le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + \epsilon_1) + \frac{\log 4\delta_0 q}{2}}\,\sqrt{0.30214\log\delta_0 q + 0.67506} + 2.73908x^{5/6}. \tag{6.62} \]

        It is time to sum up $S_{I,1}$, $S_{I,2}$ and $S_{II}$. The main terms come from the first line of (6.62) and the first term of (6.59). Lesser-order terms can be dealt with roughly: we bound $\min(1, c_0'/\delta^2)$ and $\min(1, 4c_0'/\delta^2)$ from above by $2/\delta_0$ (using the fact that $c_0' = 0.798437 < 16$, which implies that $8/\delta > 4c_0'/\delta^2$ for $\delta > 8$; of course, for $\delta \le 8$, we have $\min(1, 4c_0'/\delta^2) \le 1 = 2/2 = 2/\delta_0$).

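        The terms inversely proportional to $q$, $\phi(q)$ or $q^2$ thus add up to at most

        \[ \begin{aligned}
        &\frac{2x}{\delta_0 q}\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right) + \frac{2x}{\delta_0\phi(q)}\left(\frac{3}{2}\log q + 2.74107\right) + (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0} \\
        &\le \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415),
        \end{aligned} \]

        where, for instance, we bound $(3/2)\log q + 2.74107$ by $(3/2)\log\delta_0 q + 2.74107 - (3/2)\log 2$.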

        As for the other terms: we use the assumption $x \ge 2.16\cdot 10^{20}$ to bound $x^{2/3}$ and $x^{2/3}\log x$ by a small constant times $x^{5/6}$. We bound $x^{2/3}/\sqrt{q\delta_0}$ by $x^{2/3}/\sqrt{2}$ (in (6.56)). We obtain

        \[ \frac{x^{2/3}}{\sqrt{2}}(0.67845\log x - 1.20818) + 0.37864x^{2/3} + (22.9812\log x + 411.424)x^{2/3} + 2.73908x^{5/6} \le 3.35531x^{5/6}. \]
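The bookkeeping in the last two displays is plain arithmetic and can be replayed directly. The following sketch uses hypothetical variable names; it evaluates the lower-order terms at $x = 2.16\cdot 10^{20}$, where the ratio $x^{2/3}\log x/x^{5/6}$ is largest on the admissible range.

```python
import math

x = 2.16e20
L = math.log(x)

# Terms proportional to 1/phi(q): constants 6.11676 and 2.74107, with log q <= log(delta_0 q) - log 2
const_phi = 6.11676 + 2.74107 - 1.5 * math.log(2)
# Terms proportional to 1/q: 27.3032 log q + 3.59676 log delta_0 <= 27.3032 log(delta_0 q) - (27.3032 - 3.59676) log 2
coef_q = 27.3032 / 2
const_q = (91.515 - (27.3032 - 3.59676) * math.log(2)) / 2

# Remaining terms, expressed as a multiple of x^{5/6}
lower_order = ((0.67845 * L - 1.20818) / math.sqrt(2) + 0.37864
               + 22.9812 * L + 411.424) / x ** (1 / 6)
total = lower_order + 2.73908

print(round(const_phi, 5), round(coef_q, 4), round(const_q, 4), round(total, 5))
```

The output should be close to $7.81811$, $13.6516$, $37.5415$ and $3.3553$, in line with the constants stated above.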

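        The sums $S_{0,\infty}$ and $S_{0,w}$ in (3.11) are $0$ (by (6.50) and the fact that $\eta_2(t) = 0$ for $t \le 1/4$). We conclude that, for $q \le y = x^{1/3}/6$, $x \ge 2.16\cdot 10^{20}$ and $\eta = \eta_2$ as in (3.4),

        \[ \begin{aligned}
        |S_\eta(x, \alpha)| \le{} &|S_{I,1}| + |S_{I,2}| + |S_{II}| \\
        \le{} &\frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\,\sqrt{0.30214\log\delta_0 q + 0.67506} \\
        &+ \frac{2.49157x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415) \\
        &+ 3.35531x^{5/6},
        \end{aligned} \tag{6.63} \]

        where

        \[ \delta_0 = \max(2, |\delta|/4), \qquad C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right). \tag{6.64} \]

        Since $C_{x,t}$ is an increasing function as a function of $t$ (for $x$ fixed and $t \le 9x^{1/3}/2.004$) and $\delta_0 q \le 2y$, we see that $C_{x,t} \le C_{x,2y}$. It is clear that $x \mapsto C_{x,t}$ (fixed $t$) is a decreasing function of $x$. For $x = 2.16\cdot 10^{20}$, $C_{x,2y} = 1.39942$.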

        6.3.2 Second choice of parameters

        If, with the original choice of parameters, we obtained $q > y = x^{1/3}/6$, we now reset our parameters ($Q$, $U$ and $V$). Recall that, while the value of $q$ may now change (due to the change in $Q$), we will be able to assume that either $q > y$ or $|\delta q| > x/(x/8y) = 8y$.

        We want $U/(x/UV) \ge 5\cdot 10^5$ (this is (6.17)). We also want $UV$ small. With this in mind, we let

        \[ V = \frac{x^{1/3}}{3}, \qquad U = 500\sqrt{6}\,x^{1/3}, \qquad Q = \frac{x}{U} = \frac{x^{2/3}}{500\sqrt{6}}. \tag{6.65} \]

        Then (6.17) holds (as an equality). Since we are assuming (6.50), we have $V \ge 2\cdot 10^6$. It is easy to check that (6.50) also implies that $U \le \sqrt{x}/2$ and $Q \ge 2\sqrt{x}$, and so the inequalities in (6.51) all hold.

        Write $2\alpha = a/q + \delta/x$ for the new approximation; we must have either $q > y$ or $|\delta| > 8y/q$, since otherwise $a/q$ would already be a valid approximation under the first choice of parameters. Thus, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta| q > 8y$. Since now $V = 2y$, we have $q > V/2\theta$ in case (a) and $|\delta q| > V/\theta$ in case (b) for any $\theta \ge 1$. We set $\theta = 4$.

        (Thanks to this choice of $\theta$, we have $|\delta q| \le x/Q \le x/\theta U$, as we commented at the end of §6.2.3; this will help us avoid some case-work later.)
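The second parameter choice (6.65) can be checked the same way as the first. In this sketch `second_choice` is a hypothetical helper; the comparison is done at $x = 2.16\cdot 10^{20}$, where a short computation suggests that $U \le \sqrt{x}/2$, $Q \ge 2\sqrt{x}$ and $U \le x/4U$ all hold with equality, since each is equivalent to $4U^2 \le x$.

```python
import math

def second_choice(x):
    """Parameters of (6.65)."""
    V = x ** (1 / 3) / 3
    U = 500 * math.sqrt(6) * x ** (1 / 3)
    Q = x / U
    return U, V, Q

x = 2.16e20
U, V, Q = second_choice(x)
y = x ** (1 / 3) / 6
theta = 4

print("U/(x/UV) =", U * U * V / x)            # condition (6.17)
print("V/(2y) =", V / (2 * y))
print("U/(sqrt(x)/2) =", U / (math.sqrt(x) / 2))
print("Q/(2 sqrt(x)) =", Q / (2 * math.sqrt(x)))
print("U/(x/(theta U)) =", U / (x / (theta * U)))

# For larger x the three inequalities become strict.
U2, V2, Q2 = second_choice(1e21)
print(U2 < math.sqrt(1e21) / 2, Q2 > 2 * math.sqrt(1e21))
```

All five printed ratios should be $5\cdot 10^5$ or $1$ up to floating-point error, which is consistent with the threshold (6.50) being exactly the point at which this choice of $U$ and $Q$ becomes admissible.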

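        By (6.4),

        \[ \begin{aligned}
        |S_{I,1}| \le{} &\frac{x}{q}\min\left(1, \frac{c_0'}{\delta^2}\right)\left(\log x^{2/3} - \log 500\sqrt{6} + c_{3,I} + c_{4,I}\frac{q}{\phi(q)}\right) \\
        &+ \left(c_{7,I}\log\frac{Q}{c_2} + c_{8,I}\log x\,\log c_{11,I}\frac{Q^2}{x}\right)Q + c_{10,I}\frac{U^2}{4x}\log\frac{e^{1/2}x^{2/3}}{500\sqrt{6}} + \frac{c_{10,I}}{e} \\
        &+ \left(c_{5,I}\log\frac{1000\sqrt{6}x^{1/3}}{c_2} + c_{6,I}\log 500\sqrt{6}x^{4/3}\right)\cdot 500\sqrt{6}x^{1/3} + c_{9,I}\sqrt{x}\log\frac{2\sqrt{e}x}{c_2} \\
        \le{} &\frac{x}{q}\min\left(1, \frac{c_0'}{\delta^2}\right)\left(\frac{2}{3}\log x - 4.99944 + 1.00303\frac{q}{\phi(q)}\right) + \frac{2.89}{1000}x^{2/3}(\log x)^2,
        \end{aligned} \]

        where we are bounding

        \[ \begin{aligned}
        c_{7,I}\log\frac{Q}{c_2} + c_{8,I}\log x\,\log c_{11,I}\frac{Q^2}{x} &= c_{8,I}(\log x)^2 - \left(c_{8,I}(\log 1500000 - \log c_{11,I}) - \frac{2}{3}c_{7,I}\right)\log x + c_{7,I}\log\frac{1}{500\sqrt{6}c_2} \\
        &\le c_{8,I}(\log x)^2 - 38\log x.
        \end{aligned} \]

        We are also using the assumption (6.50) repeatedly in order to show that the sum of all lower-order terms is less than $(38c_{8,I}\log x)/(500\sqrt{6})$. Note that $c_{8,I}(\log x)^2 Q \le 0.00289x^{2/3}(\log x)^2$.

        We have $q/\phi(q) \le z(Q)$ (where $z$ is as in (C.19)) and, since $Q > \sqrt{6}\cdot 12\cdot 10^9$ for $x \ge 2.16\cdot 10^{20}$,

        \[ 1.00303z(Q) \le 1.00303\left(e^\gamma\log\log Q + \frac{2.50637}{\log\log\sqrt{6}\cdot 12\cdot 10^9}\right) \le 0.2359\log Q + 0.79 < 0.1573\log x. \]

        (It is possible to give a much better estimation, but it is not worthwhile, since this will be a very minor term.) We have either $q > y$ or $q|\delta| > 8y$; if $q|\delta| > 8y$ but $q \le y$, then $|\delta| \ge 8$, and so $c_0'/\delta^2 q < 1/8|\delta| q < 1/64y < 1/y$. Hence

        \[ \begin{aligned}
        |S_{I,1}| &\le \frac{x}{y}\left(\left(\frac{2}{3} + 0.1573\right)\log x\right) + 0.00289x^{2/3}(\log x)^2 \\
        &\le 2.4719x^{2/3}\log x + 0.00289x^{2/3}(\log x)^2.
        \end{aligned} \]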
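The chain of inequalities for $1.00303z(Q)$ used above is tight at the threshold and can be checked numerically. The sketch assumes that $z(t) = e^\gamma\log\log t + 2.50637/\log\log t$, which is the form suggested by the display (the definition itself is in (C.19), outside this passage); `z` and `check` are hypothetical names.

```python
import math

EULER_GAMMA = 0.5772156649015329

def z(t):
    """Assumed form of the function z of (C.19)."""
    ll = math.log(math.log(t))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll

def check(x):
    """Return the three quantities compared in the text for Q = x^{2/3}/(500 sqrt 6)."""
    Q = x ** (2 / 3) / (500 * math.sqrt(6))
    lhs = 1.00303 * z(Q)
    mid = 0.2359 * math.log(Q) + 0.79
    rhs = 0.1573 * math.log(x)
    return lhs, mid, rhs

for x in (2.16e20, 1e25, 1e40, 1e100):
    print(x, check(x))
```

At $x = 2.16\cdot 10^{20}$ the first two quantities differ only in the third decimal place, so the constants $0.2359$ and $0.79$ appear to have been chosen to make the inequality just hold there.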

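        We bound $|S_{I,2}|$ using Lemma 4.2.4. First we bound (4.50): this is at most

        \[ \begin{aligned}
        &\frac{x}{2q}\min\left(1, \frac{4c_0'}{\delta^2}\right)\log\frac{x^{1/3}q}{3} \\
        &+ c_0\left(\frac{1}{4} - \frac{1}{\pi^2}\right)\left(\frac{(UV)^2\log\frac{x^{1/3}}{3}}{2x} + \frac{3c_4}{2}\frac{500\sqrt{6}}{9} + \frac{(500\sqrt{6}x^{1/3} + 1)^2 x^{1/3}\log x^{2/3}}{6x}\right),
        \end{aligned} \]

        where $c_4 = 1.03884$. We bound the second line of this using (6.50). As for the first line, we have either $q \ge y$ (and so the first line is at most $(x/2y)(\log x^{1/3}y/3)$) or $q < y$ and $4c_0'/\delta^2 q < 1/16y < 1/y$ (and so the same bound applies). Hence (4.50) is at most

        \[ 3x^{2/3}\left(\frac{2}{3}\log x - \log 18\right) + 0.02017x^{2/3}\log x = 2.02017x^{2/3}\log x - 3(\log 18)x^{2/3}. \]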

        Now we bound (4.51), which comes up when $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/(5\sqrt{c_0})$, $c_0 = 31.521$ (and so $c_2 = 0.6714769\dots$). Since $1/2c_2 < 8$, it follows that $q > y$ (the alternative $q \le y$, $q|\delta| > 8y$ is impossible, since it implies $|\delta| > 8$). Then (4.51) is at most

        \[ \begin{aligned}
        &\frac{2\sqrt{c_0c_1}}{\pi}\left(UV\log\frac{UV}{\sqrt{e}} + Q\left(\sqrt{3}\log\frac{c_2x}{Q} + \frac{\log UV}{2}\log\frac{UV}{Q/2}\right)\right) \\
        &+ \frac{3c_1}{2}\frac{x}{y}\log UV\log\frac{UV}{c_2x/y} + \frac{16\log 2}{\pi}Q\log\frac{c_0e^3Q^2}{4\pi\cdot 8\log 2\cdot x}\log\frac{Q}{2} \\
        &+ \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(3c_2)^{1/2}\sqrt{x}\log x,
        \end{aligned} \tag{6.66} \]

        where $c_1 = 1.000189 > 1 + (8\log 2)/(2x/UV)$.

        The first line of (6.66) is a linear combination of terms of the form $x^{2/3}\log Cx$, $C > 1$; using (6.50), we obtain that it is at most $1144.693x^{2/3}\log x$. (The main contribution comes from the first term.) Similarly, we can bound the first term in the second line by $33.0536x^{2/3}\log x$. Since $\log(c_0e^3Q^2/(4\pi\cdot 8\log 2\cdot x))\log Q/2$ is at most $\log x^{1/3}\log x^{2/3}$, the second term in the second line is at most $0.0006406x^{2/3}(\log x)^2$. The third line of (6.66) can be bounded easily by $0.0122x^{2/3}\log x$.

        Hence (6.66) is at most

        \[ 1177.76x^{2/3}\log x + 0.0006406x^{2/3}(\log x)^2. \]
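A few of the constants entering this bound can be recomputed at $x = 2.16\cdot 10^{20}$. The sketch below uses hypothetical variable names and takes $UV = (500\sqrt{6}/3)x^{2/3}$ from the second choice of parameters; it checks the value of $c_2$, the inequality defining $c_1$, and the size of the leading term $\frac{2\sqrt{c_0c_1}}{\pi}UV\log\frac{UV}{\sqrt{e}}$ relative to $x^{2/3}\log x$.

```python
import math

x = 2.16e20
c0 = 31.521
c1 = 1.000189
c2 = 6 * math.pi / (5 * math.sqrt(c0))

UV = (500 * math.sqrt(6) / 3) * x ** (2 / 3)   # U*V under (6.65)
c1_lower = 1 + 8 * math.log(2) / (2 * x / UV)

# Leading term of (6.66), as a multiple of x^{2/3} log x
lead = (2 * math.sqrt(c0 * c1) / math.pi * UV * math.log(UV / math.sqrt(math.e))
        / (x ** (2 / 3) * math.log(x)))

# Second term of the second line: (16 log 2/pi) Q log(x^{1/3}) log(x^{2/3}) / (x^{2/3} (log x)^2)
second = 16 * math.log(2) / math.pi / (500 * math.sqrt(6)) * (1 / 3) * (2 / 3)

print(round(c2, 7), c1_lower, round(lead, 2), second)
```

The leading term alone accounts for about $1144.66$ of the coefficient of $x^{2/3}\log x$, and the last value should be close to $0.0006406$.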

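        If $|\delta| > 1/2c_2$, then we know that $|\delta q| > \min(y/2c_2, 8y) = y/2c_2$. Thus (4.52) (with $\epsilon = 0.01$) is at most

        \[ \begin{aligned}
        &\frac{2\sqrt{c_0c_1}}{\pi}UV\log\frac{UV}{\sqrt{e}} \\
        &+ \frac{2.02\sqrt{c_0c_1}}{\pi}\left(\frac{x}{y/2c_2} + 1\right)\left((\sqrt{3.02} - 1)\log\frac{\frac{x}{y/2c_2} + 1}{\sqrt{2}} + \frac{1}{2}\log UV\log\frac{e^2UV}{\frac{x}{y/2c_2}}\right) \\
        &+ \left(\frac{3c_1}{2}\left(\frac{1}{2} + \frac{3.03}{0.16}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x.
        \end{aligned} \]

        Again by (6.50), and in much the same way as before, this simplifies to

        \[ \le (1144.66 + 15.107 + 68.523)x^{2/3}\log x + 29.136x^{1/2}(\log x)^2 \le 1228.85x^{2/3}(\log x). \]

        Hence, in total and for any $|\delta|$,

        \[ \begin{aligned}
        |S_{I,2}| &\le 2.02017x^{2/3}\log x + 1228.85x^{2/3}(\log x) + 0.0006406x^{2/3}(\log x)^2 \\
        &\le 1230.9x^{2/3}(\log x) + 0.0006406x^{2/3}(\log x)^2.
        \end{aligned} \]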

        Now we must estimate $S_{II}$. As we said before, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta| q > 8y$. Recall that $\theta = 4$. In case (a), we have $q > x^{1/3}/6 = V/2 > V/2\theta$; thus, we can use (6.38), and obtain that, if $q \le x/8U$, $|S_{II}|$ is at most

        \[ \begin{aligned}
        &\frac{x\sqrt{z(q)}}{\sqrt{2q}}\sqrt{\left(\log\frac{x}{U\cdot 8q} + \log 2q\,\log\frac{\log x/(2Uq)}{\log 4}\right)\left(\kappa_6\log\frac{x}{U\cdot 8q} + 2\kappa_7\right)} \\
        &+ \sqrt{2}\kappa_2\sqrt{z\left(\frac{x}{8U}\right)}\left(1 + 1.15\sqrt{\frac{\log x/4U}{\log 4}}\right)\frac{x}{\sqrt{U}} + \left(\kappa_2\sqrt{\log x/U} + \kappa_9\right)\frac{x}{\sqrt{V}} \\
        &+ \frac{\kappa_2}{6}\left((\log 8y)^{3/2} - (\log 2y)^{3/2}\right)\frac{x}{\sqrt{y}} + \kappa_2\left(\sqrt{8\log x/U} + \frac{2}{3}\left((\log x/U)^{3/2} - (\log V)^{3/2}\right)\right)\frac{x}{\sqrt{8U}},
        \end{aligned} \tag{6.67} \]

        where $z$ is as in (C.19). (We are already simplifying the third line; the bound given is justified by a derivative test.) It is easy to check that $q \to (\log 2q)(\log\log q)/q$ is decreasing for $q \ge y$ (indeed for $q \ge 9$), and so the first line of (6.67) is maximal for $q = y$.

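        We can thus bound (6.67) by $x^{5/6}$ times

        \[ \begin{aligned}
        &\sqrt{3z(e^{t/3}/6)\left(\frac{t}{3} - \log 8c + \left(\frac{t}{3} - \log 3\right)\log\frac{\frac{t}{3} - \log 2c}{\log 4}\right)\left(\frac{\kappa_6}{3}t - 4.214\right)} \\
        &+ \frac{\sqrt{2}\kappa_2}{\sqrt{6c}}\sqrt{z\left(\frac{e^{2t/3}}{48c}\right)}\left(1 + 1.15\sqrt{\frac{\frac{2}{3}t - \log 24c}{\log 4}}\right) + \left(\kappa_2\sqrt{\frac{2t}{3} - \log 6c} + \kappa_9\right)\sqrt{3} \\
        &+ \frac{\kappa_2}{\sqrt{6}}\left(\left(\frac{t}{3} + \log\frac{8}{6}\right)^{3/2} - \left(\frac{t}{3} + \log\frac{2}{6}\right)^{3/2}\right) \\
        &+ \frac{\kappa_2}{\sqrt{48c}}\left(\sqrt{8\left(\frac{2t}{3} - \log 6c\right)} + \frac{2}{3}\left(\left(\frac{2t}{3} - \log 6c\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}\right)\right),
        \end{aligned} \tag{6.68} \]

        where $t = \log x$ and $c = 500/\sqrt{6}$. Asymptotically, the largest term in (6.67) comes from the last line (of order $t^{3/2}$), even if the first line is larger in practice (while being of order at most $t\log t$). Let us bound (6.68) by a multiple of $t^{3/2}$.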

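        First of all, notice that

        \[ \begin{aligned}
        \frac{d}{dt}\frac{z(e^{t/3}/6)}{\log t} &= \frac{\left(e^\gamma\log\left(\frac{t}{3} - \log 6\right) + \frac{2.50637}{\log(\frac{t}{3} - \log 6)}\right)'}{\log t} - \frac{z(e^{t/3}/6)}{t(\log t)^2} \\
        &= \frac{e^\gamma - \frac{2.50637}{\log^2(\frac{t}{3} - \log 6)}}{(t - 3\log 6)\log t} - \frac{e^\gamma + \frac{2.50637}{\log^2(\frac{t}{3} - \log 6)}}{t\log t}\cdot\frac{\log\left(\frac{t}{3} - \log 6\right)}{\log t},
        \end{aligned} \tag{6.69} \]

        which, for $t \ge 100$, is

        \[ > \frac{e^\gamma\log 3 - \frac{2\cdot 2.50637\log t}{\log^2(\frac{t}{3} - \log 6)}}{t(\log t)^2} \ge \frac{1.95671 - \frac{8.92482}{\log t}}{t(\log t)^2} > 0. \]

        Similarly, for $t \ge 2000$,

        \[ \frac{d}{dt}\frac{z\left(\frac{e^{2t/3}}{48c}\right)}{\log t} > \frac{e^\gamma\log\frac{3}{2} - \frac{2.50637\log t}{\log^2(\frac{2t}{3} - \log 48c)} - \frac{2.50637}{\log(\frac{2t}{3} - \log 48c)}}{t(\log t)^2} \ge \frac{0.72216 - \frac{5.45234}{\log t}}{t(\log t)^2} > 0. \]

        Thus,

        \[ \begin{aligned}
        z(e^{t/3}/6) &\le (\log t)\cdot\lim_{s\to\infty}\frac{z(e^{s/3}/6)}{\log s} = e^\gamma\log t && \text{for } t \ge 100, \\
        z\left(\frac{e^{2t/3}}{48c}\right) &\le (\log t)\cdot\lim_{s\to\infty}\frac{z\left(\frac{e^{2s/3}}{48c}\right)}{\log s} = e^\gamma\log t && \text{for } t \ge 2000.
        \end{aligned} \tag{6.70} \]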
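The monotonicity behind (6.70) can be observed numerically. The sketch assumes $z(u) = e^\gamma\log\log u + 2.50637/\log\log u$ (the form visible in (6.69)) and $c = 500/\sqrt{6}$; to avoid overflow for large $t$, the hypothetical helper `z_from_log` takes $\log u$ rather than $u$ as its argument. This is a grid check, not a proof.

```python
import math

E_GAMMA = math.exp(0.5772156649015329)
c = 500 / math.sqrt(6)

def z_from_log(log_u):
    """z(u) expressed through log(u), under the assumed form of z."""
    ll = math.log(log_u)
    return E_GAMMA * ll + 2.50637 / ll

def ratio1(t):
    """z(e^{t/3}/6) / log t."""
    return z_from_log(t / 3 - math.log(6)) / math.log(t)

def ratio2(t):
    """z(e^{2t/3}/(48c)) / log t."""
    return z_from_log(2 * t / 3 - math.log(48 * c)) / math.log(t)

vals1 = [ratio1(100 * 1.1 ** k) for k in range(200)]
vals2 = [ratio2(2000 * 1.1 ** k) for k in range(150)]
print(vals1[0], vals1[-1], vals2[0], vals2[-1], E_GAMMA)
```

On these grids both ratios increase and stay below $e^\gamma \approx 1.78107$, as (6.70) requires.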

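        Also note that, since $(x^{3/2})' = (3/2)\sqrt{x}$,

        \[ \left(\frac{t}{3} + \log\frac{8}{6}\right)^{3/2} - \left(\frac{t}{3} + \log\frac{2}{6}\right)^{3/2} \le \frac{3}{2}\sqrt{\frac{t}{3} + \log\frac{8}{6}}\cdot\log 4 \le 1.20083\sqrt{t} \]

        for $t \ge 2000$. We also have

        \[ \begin{aligned}
        \left(\frac{2t}{3} - \log 6c\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2} &< \left(\frac{2t}{3} - \log 9\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2} \\
        &= (2^{3/2} - 1)\left(\frac{t}{3} - \log 3\right)^{3/2} < (2^{3/2} - 1)\frac{t^{3/2}}{3^{3/2}} \le 0.35189t^{3/2}.
        \end{aligned} \]

        Of course,

        \[ \frac{t}{3} - \log 8c + \left(\frac{t}{3} - \log 3\right)\log\frac{\frac{t}{3} - \log 2c}{\log 4} < \left(\frac{t}{3} + \frac{t}{3}\log\frac{t}{3}\right) < \frac{t}{3}\log t. \]

        We conclude that, for $t \ge 2000$, (6.68) is at most

        \[ \begin{aligned}
        &\sqrt{3\cdot e^\gamma\log t\cdot\frac{t}{3}\log t\cdot\frac{\kappa_6}{3}t} + \frac{\sqrt{2}\kappa_2}{\sqrt{6c}}\sqrt{e^\gamma\log t}\left(1 + 0.79749\sqrt{t}\right) \\
        &+ \left(\kappa_2\sqrt{\frac{2}{3}}t^{1/2} + \kappa_9\right)\sqrt{3} + \frac{\kappa_2}{\sqrt{6}}\cdot 1.2009\sqrt{t} + \frac{\kappa_2}{\sqrt{48c}}\left(\sqrt{\frac{16t}{3}} + \frac{2}{3}\cdot 0.35189t^{3/2}\right) \\
        &\le (0.10181 + 0.00012 + 0.00145 + 0.00048 + 0.00462)t^{3/2} \le 0.10848t^{3/2}.
        \end{aligned} \]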
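The constants in this part of the argument can be approximated with ordinary floating point. The sketch below is not interval arithmetic and proves nothing; it only evaluates the expressions. It relies on assumptions that are not stated in this passage: $\kappa_2 = 8.22088/(3\sqrt{2})$, $\kappa_6 = 0.60428$ and $\kappa_9 = 1.84251\cdot 3/\sqrt{2}$ are inferred from numerical coefficients elsewhere in the section, $z(u) = e^\gamma\log\log u + 2.50637/\log\log u$, $c = 500/\sqrt{6}$ and $c' = 500\sqrt{6}$. The function names are hypothetical. It evaluates the five terms of the bound for $t \ge 2000$ at $t = 2000$, scans (6.68) divided by $t^{3/2}$ on a grid over $\log(2.16\cdot 10^{20}) \le t \le 2000$, and evaluates at $t = \log(2.16\cdot 10^{20})$ the expression that the text goes on to use for the range $x/8U < q \le Q$.

```python
import math

# Assumed constants (inferred, see the lead-in)
KAPPA2 = 8.22088 / (3 * math.sqrt(2))
KAPPA6 = 0.60428
KAPPA9 = 1.84251 * 3 / math.sqrt(2)
E_GAMMA = math.exp(0.5772156649015329)
c = 500 / math.sqrt(6)
cp = 500 * math.sqrt(6)

def z_from_log(log_u):
    ll = math.log(log_u)
    return E_GAMMA * ll + 2.50637 / ll

def pieces(t):
    """The five terms of the bound for t >= 2000, each divided by t^{3/2}."""
    lt = math.log(t)
    p1 = math.sqrt(3 * E_GAMMA * lt * (t / 3) * lt * (KAPPA6 / 3) * t)
    p2 = math.sqrt(2) * KAPPA2 / math.sqrt(6 * c) * math.sqrt(E_GAMMA * lt) * (1 + 0.79749 * math.sqrt(t))
    p3 = (KAPPA2 * math.sqrt(2 / 3) * math.sqrt(t) + KAPPA9) * math.sqrt(3)
    p4 = KAPPA2 / math.sqrt(6) * 1.2009 * math.sqrt(t)
    p5 = KAPPA2 / math.sqrt(48 * c) * (math.sqrt(16 * t / 3) + (2 / 3) * 0.35189 * t ** 1.5)
    return [p / t ** 1.5 for p in (p1, p2, p3, p4, p5)]

def expr_668(t):
    """Expression (6.68)."""
    u = 2 * t / 3 - math.log(6 * c)
    v = t / 3 - math.log(3)
    a = t / 3 - math.log(8 * c) + v * math.log((t / 3 - math.log(2 * c)) / math.log(4))
    b = KAPPA6 * t / 3 - 4.214
    line1 = math.sqrt(3 * z_from_log(t / 3 - math.log(6)) * a * b)
    line2 = (math.sqrt(2) * KAPPA2 / math.sqrt(6 * c)
             * math.sqrt(z_from_log(2 * t / 3 - math.log(48 * c)))
             * (1 + 1.15 * math.sqrt((2 * t / 3 - math.log(24 * c)) / math.log(4))))
    line3 = (KAPPA2 * math.sqrt(u) + KAPPA9) * math.sqrt(3)
    line4 = KAPPA2 / math.sqrt(6) * ((t / 3 + math.log(8 / 6)) ** 1.5 - (t / 3 + math.log(2 / 6)) ** 1.5)
    line5 = KAPPA2 / math.sqrt(48 * c) * (math.sqrt(8 * u) + (2 / 3) * (u ** 1.5 - v ** 1.5))
    return line1 + line2 + line3 + line4 + line5

def expr_639(t):
    """Bound used for x/8U < q <= Q, as a multiple of x^{5/6}."""
    u = 2 * t / 3 - math.log(cp)
    v = t / 3 - math.log(3)
    return ((KAPPA2 * math.sqrt(2 * u) + KAPPA9) * math.sqrt(3)
            + KAPPA2 * math.sqrt(u) / math.sqrt(cp)
            + (2 * KAPPA2 / 3) * (u ** 1.5 - v ** 1.5)
            * (math.sqrt(cp) / (2 * math.sqrt(2)) * math.exp(-t / 6) + 1 / math.sqrt(cp)))

coeffs = pieces(2000.0)
t0 = math.log(2.16e20)
grid = [t0 + k * (2000 - t0) / 20000 for k in range(20001)]
worst_668 = max(expr_668(t) / t ** 1.5 for t in grid)
ratio_639 = expr_639(t0) / t0 ** 1.5
print([round(v, 6) for v in coeffs], round(sum(coeffs), 5))
print(round(worst_668, 6), round(ratio_639, 6))
```

Under these assumptions the five terms come out near $0.10180$, $0.00012$, $0.00145$, $0.00048$ and $0.00462$, with sum below $0.10848$; the grid maximum is near $0.27596$ and the last ratio near $0.10326$, in line with the constants $0.275964$ and $0.10327$ that the text obtains next.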

        On the remaining interval log(216 middot 1020) le t le log 2000 we use interval arith-metic (as in sect26 with 30 iterations) to bound the ratio of (668) to t32 We obtain thatit is at most

        0275964t32

        Hence for all x ge 216 middot 1020

        |SII | le 0275964x56(log x)32 (671)

        in the case $y < q \le x/8U$. If $x/8U < q \le Q$, we use (6.39). In this range, $\frac{x}{2\sqrt{2q}} + \sqrt{qx}$ attains its maximum at $q = Q$ (because $\frac{x}{2\sqrt{2q}}$ for $q = x/8U$ is smaller than $\sqrt{qx}$ for $q = Q$, by (6.65) and (6.50)). Hence (6.39) is at most $x^{5/6}$ times
        \[
        \begin{aligned}
        &\left(\kappa_2 \sqrt{2\left(\tfrac{2}{3}t - \log c'\right)} + \kappa_9\right)\sqrt{3} + \kappa_2 \sqrt{\tfrac{2}{3}t - \log c'} \cdot \frac{1}{\sqrt{c'}} \\
        &\quad + \frac{2\kappa_2}{3}\left(\left(\tfrac{2}{3}t - \log c'\right)^{3/2} - \left(\tfrac{t}{3} - \log 3\right)^{3/2}\right)\left(\frac{\sqrt{c'}}{2\sqrt{2}}\, e^{-t/6} + \frac{1}{\sqrt{c'}}\right),
        \end{aligned}
        \]
        where $t = \log x$ (as before) and $c' = 500\sqrt{6}$. This is at most
        \[
        (2\kappa_2 + \sqrt{3}\kappa_9)\sqrt{t} + \frac{\kappa_2}{\sqrt{c'}}\sqrt{\tfrac{2}{3}}\sqrt{t} + \frac{2\kappa_2}{3}\cdot\frac{2^{3/2} - 1}{3^{3/2}}\, t^{3/2}\left(\frac{\sqrt{c'}}{2\sqrt{2}}\, e^{-t/6} + \frac{1}{\sqrt{c'}}\right) \le 0.10327\, t^{3/2}
        \]
        for $t \ge \log(2.16 \cdot 10^{20})$, and so
        \[ |S_{II}| \le 0.10327\, x^{5/6} (\log x)^{3/2} \]

        for $x/8U < q \le Q$, using the assumption $x \ge 2.16 \cdot 10^{20}$.

        Finally, let us treat case (b), that is, $|\delta| > 8$ and $|\delta| q > 8y$; we can also assume $q \le y$, as otherwise we are in case (a), which has already been treated. Since $|\delta/x| \le 1/qQ$, we know that
        \[ |\delta q| \le \frac{x}{Q} = U = 500\sqrt{6}\, x^{1/3} \le \frac{x^{2/3}}{2000\sqrt{6}} = \frac{x}{4U} = \frac{x}{\theta U}, \]
        again under assumption (6.50). We apply (6.41), and obtain that $|S_{II}|$ is at most

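        \[
        \begin{aligned}
        &\frac{2x\sqrt{z(y)}}{\sqrt{8y}} \sqrt{\left(\log\frac{x}{U\cdot 4\cdot 8y} + \log 3y \,\log\frac{\log (x/3Uy)}{\log (32/3)}\right)\left(\kappa_6 \log\frac{x}{U\cdot 4\cdot 8y} + 2\kappa_7\right)} \\
        &\quad + \frac{2\kappa_2}{3}\left(\frac{x}{\sqrt{16y}}\left((\log 32y)^{3/2} - (\log 2y)^{3/2}\right) + \frac{x}{4\sqrt{Q-y}}\left((\log 4U)^{3/2} - (\log 2y)^{3/2}\right)\right) \\
        &\quad + \left(\frac{\kappa_2}{\sqrt{2(1-y/Q)}}\left(\sqrt{\log V} + \sqrt{1/\log V}\right) + \kappa_9\right)\frac{x}{\sqrt{V}} + \kappa_2\sqrt{z(y)}\cdot\sqrt{\log 4U}\cdot\frac{x}{\sqrt{U}},
        \end{aligned} \tag{6.72}
        \]
        where we are using the facts that $(\log 3t/8)/t$ is decreasing for $t \ge 8y > 8e/3$, and that
        \[
        \frac{d}{dt}\,\frac{(\log t)^{3/2} - (\log V)^{3/2}}{\sqrt{t}} = \frac{3(\log t)^{1/2} - \left((\log t)^{3/2} - (\log V)^{3/2}\right)}{2t^{3/2}} = -\frac{\log\frac{t}{e^3}\cdot\sqrt{\log t} - (\log V)^{3/2}}{2t^{3/2}} < 0
        \]
        for $t \ge \theta \cdot 8y = 16V$, thanks to
        \[
        \begin{aligned}
        \left(\log\frac{16V}{e^3}\right)^2 \log 16V &> (\log V)^3 + \left(\log 16 - 2\log\frac{e^3}{16}\right)(\log V)^2 + \left(\left(\log\frac{16}{e^3}\right)^2 - 2\log\frac{e^3}{16}\log 16\right)\log V \\
        &> (\log V)^3
        \end{aligned}
        \]
        (valid for $\log V \ge 1$). Much as before, we can rewrite (6.72) as $x^{5/6}$ times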

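        \[
        \begin{aligned}
        &\frac{2\sqrt{z(e^{t/3}/6)}}{\sqrt{8/6}} \sqrt{\frac{t}{3} - \log 32c + \left(\frac{t}{3} - \log 2\right)\log\frac{t/3 - \log 3c}{\log (32/3)}} \cdot \sqrt{\kappa_6\left(\frac{t}{3} - \log 32c\right) + 2\kappa_7} \\
        &\quad + \frac{2\kappa_2}{3}\sqrt{\frac{3}{8}}\left(\left(\frac{t}{3} + \log\frac{32}{6}\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}\right) \\
        &\quad + \frac{2\kappa_2}{3}\cdot\frac{1}{4\sqrt{\frac{e^{t/3}}{6c} - \frac{1}{6}}}\left(\left(\frac{t}{3} + \log 24c\right)^{3/2} - \left(\frac{t}{3} - \log 3\right)^{3/2}\right) \\
        &\quad + \frac{\kappa_2\sqrt{3}}{\sqrt{2\left(1 - c/e^{t/3}\right)}}\left(\sqrt{t/3 - \log 3} + \frac{1}{\sqrt{t/3 - \log 3}}\right) + \kappa_9\sqrt{3} \\
        &\quad + \kappa_2\sqrt{z(e^{t/3}/6)}\,\sqrt{\frac{t/3 + \log 24c}{6c}},
        \end{aligned} \tag{6.73}
        \]
        where $t = \log x$ and $c = 500/\sqrt{6}$. For $t \ge 100$, we use (6.70) to bound $z(e^{t/3}/6)$, and we obtain that (6.73) is at most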

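        \[
        \begin{aligned}
        &\frac{2\sqrt{e^\gamma}}{\sqrt{8/6}}\sqrt{\frac{1}{3}\cdot\frac{\kappa_6}{3}}\cdot(\log t)\,t + \frac{2\kappa_2}{3}\sqrt{\frac{3}{8}}\cdot\frac{1}{2}\left(\frac{t}{3} + \log\frac{32}{6}\right)^{1/2}\cdot\log 16 \\
        &\quad + \frac{2\kappa_2}{3}\cdot\frac{1}{4\sqrt{\frac{e^{100/3}}{6c} - \frac{1}{6}}}\cdot\frac{1}{2}\left(\frac{t}{3} + \log 24c\right)^{1/2}\cdot\log 72c \\
        &\quad + \frac{\kappa_2\sqrt{3}}{\sqrt{2\left(1 - c/e^{100/3}\right)}}\left(\sqrt{t/3} + \frac{1}{\sqrt{t/3}}\right) + \kappa_9\sqrt{3} + \kappa_2\sqrt{e^\gamma\log t}\,\sqrt{\frac{t/3 + \log 24c}{6c}},
        \end{aligned} \tag{6.74}
        \]
        where we have bounded expressions of the form $a^{3/2} - b^{3/2}$ ($a > b$) by $(a^{1/2}/2)\cdot(a - b)$. The ratio of (6.74) to $t^{3/2}$ is clearly a decreasing function of $t$. For $t = 200$, this ratio is $0.23747\ldots$; hence, (6.74) (and thus (6.73)) is at most $0.23748\, t^{3/2}$ for $t \ge 200$.

        On the range $\log(2.16 \cdot 10^{20}) \le t \le 200$, the bisection method (with 25 iterations) gives that the ratio of (6.73) to $t^{3/2}$ is at most $0.23511$.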
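The bisection idea used here can be illustrated on a toy function. The sketch below is not the computation carried out in the text: it bounds the maximum of the hypothetical stand-in f(t) = log(t)/sqrt(t) on [2, 100], uses a uniform subdivision, and works in ordinary floating point without directed rounding, so it is illustrative rather than rigorous.

```python
import math

def upper_on_interval(lo, hi):
    """Upper bound for f(t) = log(t)/sqrt(t) on [lo, hi], assuming 1 <= lo <= hi.

    log is increasing and 1/sqrt is decreasing, so on [lo, hi] the
    numerator is at most log(hi) and the denominator at least sqrt(lo).
    """
    return math.log(hi) / math.sqrt(lo)

def bisection_upper_bound(lo, hi, iterations):
    """Bound max f on [lo, hi] by splitting it into 2**iterations pieces."""
    if iterations == 0:
        return upper_on_interval(lo, hi)
    mid = 0.5 * (lo + hi)
    return max(bisection_upper_bound(lo, mid, iterations - 1),
               bisection_upper_bound(mid, hi, iterations - 1))

bound = bisection_upper_bound(2.0, 100.0, 12)
true_max = 2.0 / math.e  # exact maximum of f, attained at t = e^2
print(bound, true_max)
```

Each extra iteration halves the width of the pieces, so the computed bound approaches the true maximum from above. A rigorous version would replace the floating-point evaluations by interval arithmetic with outward rounding.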

        We conclude that, when $|\delta| > 8$ and $|\delta| q > 8y$,
        \[ |S_{II}| \le 0.23511\, x^{5/6} (\log x)^{3/2}. \]
        Thus (6.71) gives the worst case.

        We now take totals, and obtain
        \[
        \begin{aligned}
        |S_\eta(\alpha, x)| &\le |S_{I,1}| + |S_{I,2}| + |S_{II}| \\
        &\le (2.4719 + 1230.9)\, x^{2/3}\log x + (0.00289 + 0.0006406)\, x^{2/3}(\log x)^2 + 0.275964\, x^{5/6}(\log x)^{3/2} \\
        &\le 0.27598\, x^{5/6}(\log x)^{3/2} + 1233.38\, x^{2/3}\log x,
        \end{aligned} \tag{6.75}
        \]
        where we use (6.50) yet again.
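The last step of (6.75) absorbs the $x^{2/3}(\log x)^2$ term into the main term. A quick floating-point check of that arithmetic is sketched below. It assumes the decimal placement 2.4719, 1230.9, 0.00289, 0.0006406, 0.275964, 0.27598 and 1233.38 for the constants, and it takes $x \ge 2.16 \cdot 10^{20}$ as the relevant range; it is a sanity check, not a proof.

```python
import math

X0 = 2.16e20  # lower bound on x assumed in this part of the text

def excess_ratio(x):
    """Ratio of x^(2/3) (log x)^2 to x^(5/6) (log x)^(3/2)."""
    return math.sqrt(math.log(x)) / x ** (1.0 / 6.0)

c_log_sq = 0.00289 + 0.0006406   # coefficient of x^(2/3) (log x)^2
slack = 0.27598 - 0.275964       # room left in the main-term coefficient
c_log = 2.4719 + 1230.9          # coefficient of x^(2/3) log x

# The ratio decreases for large x, so checking it at x = X0 is enough.
absorbed = c_log_sq * excess_ratio(X0)
print(absorbed, slack, c_log)
```

The printed `absorbed` value stays below `slack`, and `c_log` stays below 1233.38, which is what the final line of (6.75) needs.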

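        6.4 Conclusion

        Proof of Theorem 3.1.1. We have shown that $|S_\eta(\alpha, x)|$ is at most (6.63) for $q \le x^{1/3}/6$ and at most (6.75) for $q > x^{1/3}/6$. It remains to simplify (6.63) slightly. By the geometric mean/arithmetic mean inequality,
        \[ \sqrt{C_{x,\delta_0 q}(\log \delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\;\sqrt{0.30214 \log \delta_0 q + 0.67506} \tag{6.76} \]
        is at most
        \[ \frac{1}{2\sqrt{\rho}}\left(C_{x,\delta_0 q}(\log \delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}\right) + \frac{\sqrt{\rho}}{2}\,(0.30214 \log \delta_0 q + 0.67506) \]
        for any $\rho > 0$. We recall that
        \[ C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right). \]
        Let
        \[ \rho = \frac{C_{x_1,2q_0}(\log 2q_0 + 0.002) + \frac{\log 8q_0}{2}}{0.30214 \log 2q_0 + 0.67506} = 3.397962\ldots, \]
        where $x_1 = 10^{25}$, $q_0 = 2 \cdot 10^5$. (In other words, we are optimizing matters for $x = x_1$, $\delta_0 q = 2q_0$; the losses in nearby ranges will be very slight.) We obtain that (6.76) is at most
        \[
        \begin{aligned}
        &\frac{C_{x,\delta_0 q}}{2\sqrt{\rho}}(\log \delta_0 q + 0.002) + \left(\frac{1}{4\sqrt{\rho}} + \frac{\sqrt{\rho}\cdot 0.30214}{2}\right)\log \delta_0 q + \frac{1}{2}\left(\frac{\log 2}{\sqrt{\rho}} + \frac{\sqrt{\rho}}{2}\cdot 0.67506\right) \\
        &\quad \le 0.27125\, C_{x,\delta_0 q}(\log \delta_0 q + 0.002) + 0.4141 \log \delta_0 q + 0.49911.
        \end{aligned} \tag{6.77}
        \]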
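The numerical constants in (6.77) can be recomputed directly from the definitions. The sketch below assumes the formula for $C_{x,t}$ recalled above, the values $x_1 = 10^{25}$, $q_0 = 2 \cdot 10^5$, and the three coefficients exactly as they are displayed on the left-hand side of (6.77); it uses plain floating point.

```python
import math

def C(x, t):
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^(1/3) / (2.004 t))))."""
    return math.log(1.0 + math.log(4.0 * t)
                    / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t))))

x1, q0 = 1e25, 2e5
t = 2.0 * q0  # we optimize at delta_0 q = 2 q_0, so 4t = 8 q_0

rho = ((C(x1, t) * (math.log(t) + 0.002) + math.log(4.0 * t) / 2.0)
       / (0.30214 * math.log(t) + 0.67506))

sqrt_rho = math.sqrt(rho)
coef_C = 1.0 / (2.0 * sqrt_rho)                                # multiplies C (log + 0.002)
coef_log = 1.0 / (4.0 * sqrt_rho) + sqrt_rho * 0.30214 / 2.0   # multiplies log(delta_0 q)
const = 0.5 * (math.log(2.0) / sqrt_rho + sqrt_rho / 2.0 * 0.67506)

print(rho, coef_C, coef_log, const)
```

The printed values agree with 3.397962, 0.27125, 0.4141 and 0.49911 to the precision quoted in the text.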

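        Now, for $x \ge x_0 = 2.16 \cdot 10^{20}$,
        \[ \frac{C_{x,t}}{\log t} \le \frac{C_{x_0,t}}{\log t} = \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{54\cdot 10^6}{2.004t}}\right) \le 0.08659 \]
        for $8 \le t \le 10^6$ (by the bisection method, with 20 iterations), and
        \[ \frac{C_{x,t}}{\log t} \le \frac{C_{(6t)^3,t}}{\log t} \le \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{9\cdot 6}{2.004}}\right) \le 0.08659 \]
        if $10^6 < t \le x^{1/3}/6$. Hence
        \[ 0.27125 \cdot C_{x,\delta_0 q} \cdot 0.002 \le 0.000047 \log \delta_0 q. \]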
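The bound $C_{x_0,t}/\log t \le 0.08659$ on $8 \le t \le 10^6$ can be sampled numerically. The sketch below evaluates the ratio on a logarithmic grid; the grid size `N` is an arbitrary choice, and a finite sample in floating point is only a plausibility check, not a substitute for the bisection method with interval arithmetic.

```python
import math

X0 = 2.16e20

def ratio(t, x=X0):
    """C_{x,t} / log t, with C_{x,t} as recalled in the proof above."""
    c = math.log(1.0 + math.log(4.0 * t)
                 / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t))))
    return c / math.log(t)

N = 20000  # number of grid steps (illustrative choice)
grid = [8.0 * (1e6 / 8.0) ** (k / N) for k in range(N + 1)]
worst = max(ratio(t) for t in grid)

print(worst)                      # largest sampled value of the ratio
print(0.27125 * 0.08659 * 0.002)  # constant multiplying log(delta_0 q)
```

On this grid the largest value occurs at the right endpoint $t = 10^6$, and the product in the last line is just below 0.000047.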

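        We conclude that, for $q \le x^{1/3}/6$,
        \[
        \begin{aligned}
        |S_\eta(\alpha, x)| &\le \frac{R_{x,\delta_0 q}\log \delta_0 q + 0.49911}{\sqrt{\phi(q)\delta_0}}\cdot x + \frac{2.492\,x}{\sqrt{q\delta_0}} \\
        &\quad + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log \delta_0 q + 7.82\right) + \frac{2x}{\delta_0 q}\,(13.66 \log \delta_0 q + 37.55) + 3.36\, x^{5/6},
        \end{aligned}
        \]
        where
        \[ R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415. \]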

        Part II

        Major arcs


        Chapter 7

        Major arcs: overview and results

        Our task, as in Part I, will be to estimate
        \[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{7.1} \]
        where $\eta : \mathbb{R}^+ \to \mathbb{C}$ is a smooth function, $\Lambda$ is the von Mangoldt function and $e(t) = e^{2\pi i t}$. Here we will treat the case of $\alpha$ lying on the major arcs.

        We will see how we can obtain good estimates by using smooth functions $\eta$ based on the Gaussian $e^{-t^2/2}$. This will involve proving new, fully explicit bounds for the Mellin transform of the twisted Gaussian, or, what is the same, bounds on parabolic cylinder functions in certain ranges. It will also require explicit formulae that are general and strong enough, even for moderate values of $x$.

        Let $\alpha = a/q + \delta/x$. For us, saying that $\alpha$ lies on a major arc will be the same as saying that $q$ and $\delta$ are bounded; more precisely, $q$ will be bounded by a constant $r$ and $|\delta|$ will be bounded by a constant times $r/q$. As is customary on the major arcs, we will express our exponential sum (3.1) as a linear combination of twisted sums
        \[ S_{\eta,\chi}(\delta/x, x) = \sum_{n=1}^{\infty} \Lambda(n)\chi(n) e(\delta n/x) \eta(n/x), \tag{7.2} \]
        for $\chi : \mathbb{Z} \to \mathbb{C}$ a Dirichlet character mod $q$, i.e., a multiplicative character on $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (The advantage here is that the phase term is now $e(\delta n/x)$ rather than $e(\alpha n)$, and $e(\delta n/x)$ varies very slowly as $n$ grows.) Our task, then, is to estimate $S_{\eta,\chi}(\delta/x, x)$ for $\delta$ small.

        Estimates on $S_{\eta,\chi}(\delta/x, x)$ rely on the properties of Dirichlet $L$-functions $L(s, \chi) = \sum_n \chi(n) n^{-s}$. What is crucial is the location of the zeroes of $L(s, \chi)$ in the critical strip $0 \le \Re(s) \le 1$ (a region in which $L(s, \chi)$ can be defined by analytic continuation). In contrast to most previous work, we will not use zero-free regions, which are too narrow for our purposes. Rather, we use a verification of the Generalized Riemann Hypothesis up to bounded height for all conductors $q \le 300000$ (due to D. Platt [Plab]).


        A key feature of the present work is that it allows one to mimic a wide variety of smoothing functions by means of estimates on the Mellin transform of a single smoothing function: here, the Gaussian $e^{-t^2/2}$.

        7.1 Results

        Write $\eta_\heartsuit(t) = e^{-t^2/2}$. Let us first give a bound for exponential sums on the primes using $\eta_\heartsuit$ as the smooth weight. Without loss of generality, we may assume that our character $\chi$ mod $q$ is primitive, i.e., that it is not really a character to a smaller modulus $q' | q$.

        Theorem 7.1.1. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive Dirichlet character mod $q$, $1 \le q \le r$, where $r = 300000$.

        Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
        \[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\, e\!\left(\frac{\delta}{x} n\right) e^{-\frac{(n/x)^2}{2}} = I_{q=1}\cdot\widehat{\eta_\heartsuit}(-\delta)\cdot x + E\cdot x, \]
        where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
        \[ |E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right). \]

        We normalize the Fourier transform $\widehat{f}$ as follows: $\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\,dx$. Of course, $\widehat{\eta_\heartsuit}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$.

        As it turns out, smooth weights based on the Gaussian are often better in applications than the Gaussian $\eta_\heartsuit$ itself. Let us give a bound based on $\eta(t) = t^2\eta_\heartsuit(t)$.

        Theorem 7.1.2. Let $\eta(t) = t^2 e^{-t^2/2}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

        Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
        \[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\, e\!\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x, \]
        where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
        \[ |E| \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{281200}{\sqrt{q}} + 56\right). \]

        The advantage of $\eta(t) = t^2\eta_\heartsuit(t)$ over $\eta_\heartsuit$ is that it vanishes at the origin (to second order); as we shall see, this makes it easier to estimate exponential sums with the smoothing $\eta *_M g$, where $*_M$ is a Mellin convolution and $g$ is nearly arbitrary. Here is a good example that is used crucially in Part III.

        Corollary 7.1.3. Let $\eta(t) = t^2 e^{-t^2/2} *_M \eta_2(t)$, where $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2 \cdot I_{[1/2,1]}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

        Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
        \[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\, e\!\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x, \]
        where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
        \[ |E| \le 2.485 \cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right). \]

        Let us now look at a different kind of modification of the Gaussian smoothing. Say we would like a weight of a specific shape; for example, for what we will need to do in Part III, we would like an approximation to the function
        \[ \eta : t \mapsto \begin{cases} t^3(2-t)^3 e^{-(t-1)^2/2} & \text{for } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{7.3} \]
        At the same time, what we have is an estimate for the Mellin transform of the Gaussian $e^{-t^2/2}$, centered at $t = 0$.

        The route taken here is to work with an approximation $\eta_+$ to $\eta$. We let
        \[ \eta_+(t) = h_H(t)\cdot t e^{-t^2/2}, \tag{7.4} \]
        where $h_H$ is a band-limited approximation to
        \[ h(t) = \begin{cases} t^2(2-t)^3 e^{t-1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{7.5} \]
        By band-limited we mean that the restriction of the Mellin transform of $h_H$ to the imaginary axis is of compact support. (We could, alternatively, let $h_H$ be a function whose Fourier transform is of compact support; this would be technically easier in some ways, but it would also lead to using GRH verifications less efficiently.)

        To be precise: we define
        \[
        \begin{aligned}
        F_H(y) &= \frac{\sin(H\log y)}{\pi\log y}, \\
        h_H(t) &= (h *_M F_H)(t) = \int_0^\infty h(ty^{-1}) F_H(y)\,\frac{dy}{y},
        \end{aligned} \tag{7.6}
        \]

        and $H$ is a positive constant. It is easy to check that $MF_H(i\tau) = 1$ for $-H < \tau < H$ and $MF_H(i\tau) = 0$ for $\tau > H$ or $\tau < -H$ (unsurprisingly, since $F_H$ is a Dirichlet kernel under a change of variables). Since, in general, the Mellin transform of a multiplicative convolution $f *_M g$ equals $Mf \cdot Mg$, we see that the Mellin transform of $h_H$ on the imaginary axis equals the truncation of the Mellin transform of $h$ to $[-iH, iH]$. Thus, $h_H$ is a band-limited approximation to $h$, as we desired.

        The distinction between the odd and the even case in the statement that follows simply reflects the two different points up to which computations were carried out in [Plab]; these computations were, in turn, to some extent tailored to the needs of the present work (as was the shape of $\eta_+$ itself).

        Theorem 7.1.4. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Let $\chi$ be a primitive character mod $q$, where $1 \le q \le 150000$ if $q$ is odd, and $1 \le q \le 300000$ if $q$ is even.

        Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 600000 \cdot \gcd(q, 2)/q$,
        \[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\, e\!\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x, \]
        where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
        \[ |E| \le 1.3482 \cdot 10^{-14} + \frac{1.617 \cdot 10^{-10}}{q} + \frac{1}{\sqrt{x}}\left(\frac{499900}{\sqrt{q}} + 52\right). \]
        If $q = 1$, we have the sharper bound
        \[ |E| \le 4.772 \cdot 10^{-11} + \frac{251400}{\sqrt{x}}. \]

        This is a paradigmatic example, in that, following the proof given in §9.4, we can bound exponential sums with weights of the form $h_H(t) e^{-t^2/2}$, where $h_H$ is a band-limited approximation to just about any continuous function of our choosing.
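To get a feeling for the size of the error terms, one can simply evaluate the stated bounds. The helper names below are hypothetical, and the functions do nothing beyond transcribing the bounds on $|E|$ from Theorems 7.1.1, 7.1.2 and 7.1.4 as read above; they check the hypotheses on $x$ only.

```python
import math

def error_bound_711(x, q):
    """Bound on |E| in Theorem 7.1.1 (weight exp(-t^2/2)); needs x >= 1e8."""
    assert x >= 1e8 and q >= 1
    return 4.306e-22 + (650400.0 / math.sqrt(q) + 112.0) / math.sqrt(x)

def error_bound_712(x, q):
    """Bound on |E| in Theorem 7.1.2 (weight t^2 exp(-t^2/2)); needs x >= 1e8."""
    assert x >= 1e8 and q >= 1
    return 2.485e-19 + (281200.0 / math.sqrt(q) + 56.0) / math.sqrt(x)

def error_bound_714(x, q):
    """Bound on |E| in Theorem 7.1.4 (weight eta_+); needs x >= 1e12."""
    assert x >= 1e12 and q >= 1
    if q == 1:
        # sharper bound stated for the trivial character
        return 4.772e-11 + 251400.0 / math.sqrt(x)
    return (1.3482e-14 + 1.617e-10 / q
            + (499900.0 / math.sqrt(q) + 52.0) / math.sqrt(x))

for x in (1e20, 1e30):
    print(x, error_bound_711(x, 1), error_bound_712(x, 1), error_bound_714(x, 1))
```

For moderate $x$ the term of size $1/\sqrt{x}$ dominates; the constant terms only take over once $x$ is very large.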

        Lastly, we will need an explicit estimate of the $\ell_2$ norm corresponding to the sum in Thm. 7.1.4, for the trivial character.

        Proposition 7.1.5. Let $\eta(t) = \eta_+(t) = h_H(t)\, t e^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Then
        \[
        \begin{aligned}
        \sum_{n=1}^{\infty} \Lambda(n)(\log n)\,\eta^2(n/x) &= x\cdot\int_0^\infty \eta_+^2(t)\log xt\;dt + E_1\cdot x\log x \\
        &= 0.640206\, x\log x - 0.021095\, x + E_2\cdot x\log x,
        \end{aligned}
        \]
        where
        \[ |E_1| \le 5.123 \cdot 10^{-15} + \frac{366.91}{\sqrt{x}}, \qquad |E_2| \le 2 \cdot 10^{-6} + \frac{366.91}{\sqrt{x}}. \]

        7.2 Main ideas

        An explicit formula gives an expression
        \[ S_{\eta,\chi}(\delta/x, x) = I_{q=1}\,\widehat{\eta}(-\delta)\,x - \sum_\rho F_\delta(\rho)\, x^\rho + \text{small error}, \tag{7.7} \]
        where $I_{q=1} = 1$ if $q = 1$ and $I_{q=1} = 0$ otherwise. Here $\rho$ runs over the complex numbers $\rho$ with $L(\rho, \chi) = 0$ and $0 < \Re(\rho) < 1$ ("non-trivial zeros"). The function $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$ (see §2.4).

        The questions are then: where are the non-trivial zeros $\rho$ of $L(s, \chi)$? How fast does $F_\delta(\rho)$ decay as $\Im(\rho) \to \pm\infty$?

        Write $\sigma = \Re(s)$, $\tau = \Im(s)$. The belief is, of course, that $\sigma = 1/2$ for every non-trivial zero (Generalized Riemann Hypothesis), but this is far from proven. Most work to date has used zero-free regions of the form $\sigma \le 1 - 1/(C\log q|\tau|)$, $C$ a constant. This is a classical zero-free region, going back, qualitatively, to de la Vallée-Poussin (1899). The best values of $C$ known are due to McCurley [McC84a] and Kadiri [Kad05].

        These regions seem too narrow to yield a proof of the three-primes theorem. What we will use instead is a finite verification of GRH "up to $T_q$", i.e., a computation showing that, for every Dirichlet character of conductor $q \le r_0$ ($r_0$ a constant, as above), every non-trivial zero $\rho = \sigma + i\tau$ with $|\tau| \le T_q$ satisfies $\Re(\rho) = 1/2$. Such verifications go back to Riemann; modern computer-based methods are descended in part from a paper by Turing [Tur53]. (See the historical article [Boo06b].) In his thesis [Pla11], D. Platt gave a rigorous verification for $r_0 = 10^5$, $T_q = 10^8/q$. In coordination with the present work, he has extended this to

        • all odd $q \le 3 \cdot 10^5$, with $T_q = 10^8/q$,

        • all even $q \le 4 \cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5 \cdot 10^7/q)$.

        This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).

        What remains to discuss, then, is how to choose $\eta$ in such a way that $F_\delta(\rho)$ decreases fast enough as $|\tau|$ increases, so that (7.7) gives a good estimate. We cannot hope for $F_\delta(\rho)$ to start decreasing consistently before $|\tau|$ is at least as large as a constant times $|\delta|$. Since $\delta$ varies within $(-cr_0/q, cr_0/q)$, this explains why $T_q$ is taken inversely proportional to $q$ in the above. As we will work with $r_0 \ge 150000$, we also see that we have little margin for maneuver: we want $F_\delta(\rho)$ to be extremely small already for, say, $|\tau| \ge 80|\delta|$. We also have a Scylla-and-Charybdis situation, courtesy of the uncertainty principle: roughly speaking, $F_\delta(\rho)$ cannot decrease faster than exponentially on $|\tau|/|\delta|$ both for $|\delta| \le 1$ and for $\delta$ large.

        The most delicate case is that of $\delta$ large, since then $|\tau|/|\delta|$ is small. It turns out we can manage to get decay that is much faster than exponential for $\delta$ large, while no slower than exponential for $\delta$ small. This we will achieve by working with smoothing functions based on the (one-sided) Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

        The Mellin transform of the twisted Gaussian $e(\delta t)e^{-t^2/2}$ is a parabolic cylinder function $U(a, z)$ with $z$ purely imaginary. Since fully explicit estimates for $U(a, z)$, $z$ imaginary, have not been worked out in the literature, we will have to derive them ourselves.

        Once we have fully explicit estimates for the Mellin transform of the twisted Gaussian, we are able to use essentially any smoothing function based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$. As we already saw, we can and will consider smoothing functions obtained by convolving the twisted Gaussian with another function and also functions obtained by multiplying the twisted Gaussian with another function. All we need to do is use an explicit formula of the right kind, that is, a formula that does not assume too much about the smoothing function or the region of holomorphy of its Mellin transform, but still gives very good error terms, with simple expressions.

        All results here will be based on a single, general explicit formula (Lem. 9.1.1) valid for all our purposes. The contribution of the zeros in the critical strip can be handled in a unified way (Lemmas 9.1.3 and 9.1.4). All that has to be done for each smoothing function is to bound a simple integral (in (9.24)). We then apply a finite verification of GRH and are done.

        Chapter 8

        The Mellin transform of the twisted Gaussian

        Our aim in this chapter is to give fully explicit, yet relatively simple, bounds for the Mellin transform $F_\delta(\rho)$ of $e(\delta t)\eta_\heartsuit(t)$, where $\eta_\heartsuit(t) = e^{-t^2/2}$ and $\delta$ is arbitrary. The rapid decay that results will establish that the Gaussian $\eta_\heartsuit$ is a very good choice for a smoothing, particularly when the smoothing has to be twisted by an additive character $e(\delta t)$.

        The Gaussian smoothing has been used before in number theory; see, notably, Heath-Brown's well-known paper on the fourth power moment of the Riemann zeta function [HB79]. What is new here is that we will derive fully explicit bounds on the Mellin transform of the twisted Gaussian. This means that the Gaussian smoothing will be a real option in explicit work on exponential sums in number theory and elsewhere from now on.¹

        ¹ There has also been work using the Gaussian after a logarithmic change of variables; see, in particular, [Leh66]. In that case, the Mellin transform is simply a Gaussian (as in, e.g., [MV07, Ex. XII.2.9]). However, for $\delta$ non-zero, the Mellin transform of a twist $e(\delta t)e^{-(\log t)^2/2}$ decays very slowly, and thus would not be useful for our purposes, or, in general, for most applications in which GRH is not assumed.

        Theorem 8.0.1. Let $f_\delta(t) = e^{-t^2/2}e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, $\sigma \ge 0$, $\tau \ne 0$. Let $\ell = -2\pi\delta$. Then, if $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ and $\delta \ne 0$,
        \[ |F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau}\cdot\begin{cases} c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for $\rho$ arbitrary,} \\ c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for $\rho \le 3/2$,} \end{cases} \tag{8.1} \]
        where $\rho = 4\tau/\ell^2$,
        \[ E(\rho) = \frac{1}{2}\left(\arccos\frac{1}{\upsilon(\rho)} - \frac{2(\upsilon(\rho) - 1)}{\rho}\right), \]

        c1στ =1

        2

        1 + 214

        (2

        1 + sin2 π8

        )σ2+eminus(radic

        2minus12

        )τ(

        tan π8

        c2στ =1

        2

        1 + min

        2σ+ 12

        radicsec 2π

        5(sin π

        5

        )σ+

        eminusτ6

        (1radic

        3)σ

        (82)

        and
        \[ \upsilon(\rho) = \sqrt{\frac{1 + \sqrt{\rho^2 + 1}}{2}}. \]
        If $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$,
        \[ |F_\delta(s)| \le |x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}\,|\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1 + \frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right), \tag{8.3} \]
        where
        \[ |x_0| \ge \begin{cases} 0.51729\sqrt{\tau} & \text{for $\rho$ arbitrary,} \\ 0.84473\,\dfrac{|\tau|}{|\ell|} & \text{for $\rho \le 3/2$.} \end{cases} \tag{8.4} \]

        As we shall see, the choice of smoothing function $\eta(t) = e^{-t^2/2}$ can be easily motivated by the method of stationary phase, but the problem is actually solved by the saddle-point method. One of the challenges here is to keep all expressions explicit and practical.

        (In particular, the more critical estimate, (8.1), is optimal up to a constant depending on $\sigma$; the constants we give will be good rather than optimal.)

        The expressions in Thm. 8.0.1 can be easily simplified further, especially if one is ready to introduce some mild constraints and make some sacrifices in the main term.

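        Corollary 8.0.2. Let $f_\delta(t) = e^{-t^2/2}e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, where $\sigma \in [0, 1]$ and $|\tau| \ge 20$. Then, for $0 \le k \le 2$,
        \[ |F_\delta(s + k)| + |F_\delta((1 - s) + k)| \le \begin{cases} \kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\[2ex] \kappa_{k,1}\,|\tau|^{k/2}\, e^{-0.1598|\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2, \end{cases} \]
        where
        \[ \kappa_{0,0} \le 3.001, \quad \kappa_{1,0} \le 4.903, \quad \kappa_{2,0} \le 7.96, \]
        \[ \kappa_{0,1} \le 3.286, \quad \kappa_{1,1} \le 4.017, \quad \kappa_{2,1} \le 5.13. \]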
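The exponents in Corollary 8.0.2 can be compared numerically with the function $E(\rho)$ of Theorem 8.0.1 at the threshold $\rho = 3/2$. The sketch below assumes $E(\rho) = \frac{1}{2}(\arccos(1/\upsilon(\rho)) - 2(\upsilon(\rho) - 1)/\rho)$ and $\upsilon(\rho) = \sqrt{(1 + \sqrt{\rho^2 + 1})/2}$; the link it draws between these values and the constants 0.1598, 0.1065 and 0.84473 is a numerical observation, not a statement taken from the text.

```python
import math

def upsilon(rho):
    """upsilon(rho) = sqrt((1 + sqrt(rho^2 + 1)) / 2)."""
    return math.sqrt((1.0 + math.sqrt(rho * rho + 1.0)) / 2.0)

def E(rho):
    """E(rho) = (arccos(1/upsilon) - 2 (upsilon - 1) / rho) / 2."""
    u = upsilon(rho)
    return 0.5 * (math.acos(1.0 / u) - 2.0 * (u - 1.0) / rho)

rho_star = 1.5  # threshold separating the two cases of the corollary

print(E(rho_star))             # compare with the exponent 0.1598
print(E(rho_star) / rho_star)  # compare with 0.1065, since rho * tau = (2 tau / l)^2
print(1.0 / upsilon(rho_star)) # compare with the constant 0.84473 in (8.4)
```

Since $\rho\tau = (2\tau/\ell)^2$, a lower bound $E(\rho) \ge 0.1065\,\rho$ turns the factor $e^{-E(\rho)\tau}$ into the Gaussian-type factor in the first case of the corollary.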

        We are considering $F_\delta(s + k)$, and not just $F_\delta(s)$, because bounding $F_\delta(s + k)$ enables us to work with smoothing functions equal to or based on $t^k e^{-t^2/2}$. Clearly, we can easily derive bounds with $k$ arbitrary from Thm. 8.0.1. It is just that we will use $k = 0, 1, 2$ in practice. Corollary 8.0.2 is meant to be applied to cases where $\tau$ is larger than a constant (10, say) times $|\ell|$, and $\sigma$ cannot be bounded away from 1; if either condition fails to hold, it is better to apply Theorem 8.0.1 directly.

        Let us end by a remark that may be relevant to applications outside number theory. By (8.9), Thm. 8.0.1 gives us bounds on the parabolic cylinder function $U(a, z)$ for $z$ purely imaginary. (Surprisingly, there seem to have been no fully explicit bounds for this case in the literature.) The bounds are useful when $|\Im(a)|$ is at least somewhat larger than $|\Im(z)|$ (i.e., when $|\tau|$ is large compared to $\ell$). While Thm. 8.0.1 is stated for $\sigma \ge 0$ (i.e., for $\Re(a) \ge -1/2$), extending the result to larger half-planes for $a$ is not hard.

        8.1 How to choose a smoothing function

        Let us motivate our choice of smoothing function $\eta$. The method of stationary phase ([Olv74, §4.11], [Won01, §II.3]) suggests that the main contribution to the integral
        \[ F_\delta(s) = \int_0^\infty e(\delta t)\eta(t)\, t^s\,\frac{dt}{t} \tag{8.5} \]
        should come when the phase has derivative 0. The phase part of (8.5) is
        \[ e(\delta t)\, t^{i\Im(s)} = e^{(2\pi\delta t + \tau\log t)i} \]
        (where we write $s = \sigma + i\tau$); clearly,
        \[ (2\pi\delta t + \tau\log t)' = 2\pi\delta + \frac{\tau}{t} = 0 \]
        when $t = -\tau/2\pi\delta$. This is meaningful when $t \ge 0$, i.e., $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$. The contribution of $t = -\tau/2\pi\delta$ to (8.5) is then
        \[ \eta(t)e(\delta t)t^{s-1} = \eta\left(\frac{-\tau}{2\pi\delta}\right) e^{-i\tau}\left(\frac{-\tau}{2\pi\delta}\right)^{\sigma + i\tau - 1} \tag{8.6} \]
        multiplied by a "width" approximately equal to a constant divided by
        \[ \sqrt{|(2\pi\delta t + \tau\log t)''|} = \sqrt{|-\tau/t^2|} = \frac{2\pi|\delta|}{\sqrt{|\tau|}}. \]
        The absolute value of (8.6) is
        \[ \eta\left(-\frac{\tau}{2\pi\delta}\right)\cdot\left|\frac{-\tau}{2\pi\delta}\right|^{\sigma - 1}. \tag{8.7} \]

        In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small, asking that $F_\delta(\sigma + i\tau)$ decay rapidly as $|\tau| \to \infty$ amounts to asking that $\eta(t)$ decay rapidly as $t \to \infty$. Thus, if we ask for $F_\delta(\sigma + i\tau)$ to decay rapidly as $|\tau| \to \infty$ for all moderate $\delta$, we are requesting that

        1. $\eta(t)$ decay rapidly as $t \to \infty$,

        2. the Mellin transform $F_0(\sigma + i\tau)$ decay rapidly as $\tau \to \pm\infty$.

        Requirement (2) is there because we also need to consider $F_\delta(\sigma + i\tau)$ for $\delta$ very small, and, in particular, for $\delta = 0$.

        There is clearly an uncertainty-principle issue here; one cannot do arbitrarily well in both aspects at the same time. Once we are conscious of this, the choice $\eta(t) = e^{-t}$ in Hardy-Littlewood actually looks fairly good: obviously, $\eta(t) = e^{-t}$ decays exponentially, and its Mellin transform $\Gamma(\sigma + i\tau)$ also decays exponentially as $\tau \to \pm\infty$. Moreover, for this choice of $\eta$, the Mellin transform $F_\delta(s)$ can be written explicitly: $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$.

        It is not hard to work out an explicit formula² for $\eta(t) = e^{-t}$. However, it is not hard to see that, for $F_\delta(s)$ as above, $F_\delta(1/2 + it)$ decays like $e^{-t/2\pi|\delta|}$, just as we expected from (8.7). This is a little too slow for our purposes: we will often have to work with relatively large $\delta$, and we would like to have to check the zeroes of $L$-functions only up to relatively low heights $t$, say, up to $50|\delta|$. Then $e^{-t/2\pi|\delta|} > e^{-8} = 0.00033\ldots$, which is not very small. We will settle for a different choice of $\eta$: the Gaussian.
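The decay rate claimed for $\eta(t) = e^{-t}$ can be checked from the closed form $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$. The sketch below works on a logarithmic scale to avoid overflow and uses the classical identity $|\Gamma(1/2 + i\tau)|^2 = \pi/\cosh(\pi\tau)$; the sample value $\delta = -5$ is an arbitrary choice with $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$.

```python
import math

def log_abs_F(delta, tau):
    """log |Gamma(s) (1 - 2 pi i delta)^(-s)| at s = 1/2 + i tau.

    Uses |Gamma(1/2 + i tau)|^2 = pi / cosh(pi tau), written in
    logarithmic form so that large tau does not overflow.
    """
    x = math.pi * abs(tau)
    log_cosh = x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)
    log_gamma = 0.5 * (math.log(math.pi) - log_cosh)
    r = math.hypot(1.0, 2.0 * math.pi * delta)  # |1 - 2 pi i delta|
    theta = -math.atan(2.0 * math.pi * delta)   # arg(1 - 2 pi i delta)
    return log_gamma - 0.5 * math.log(r) + tau * theta

delta = -5.0             # sample value; sign opposite to that of tau
tau = 50.0 * abs(delta)  # height 50 |delta|, as in the text

# Average slope of log|F| in tau, compared with -1 / (2 pi |delta|).
slope = (log_abs_F(delta, 2.0 * tau) - log_abs_F(delta, tau)) / tau
print(slope, -1.0 / (2.0 * math.pi * abs(delta)))

# Size of exp(-t / (2 pi |delta|)) at t = 50 |delta|, against exp(-8).
print(math.exp(-tau / (2.0 * math.pi * abs(delta))), math.exp(-8.0))
```

The slope equals $-\arctan(1/(2\pi|\delta|))$, which is very close to $-1/(2\pi|\delta|)$ once $|\delta|$ is moderately large; this is the exponential rate referred to above.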

        The decay of the Gaussian smoothing function $\eta(t) = e^{-t^2/2}$ is much faster than exponential. Its Mellin transform is $\Gamma(s/2)$ (up to the factor $2^{s/2 - 1}$), which decays exponentially as $\Im(s) \to \pm\infty$. Moreover, the Mellin transform $F_\delta(s)$ ($\delta \ne 0$), while not an elementary or very commonly occurring function, equals (after a change of variables) a relatively well-studied special function, namely, a parabolic cylinder function $U(a, z)$ (or, in Whittaker's [Whi03] notation, $D_{-a-1/2}(z)$).

        For $\delta$ not too small, the main term will indeed work out to be proportional to $e^{-(\tau/2\pi\delta)^2/2}$, as the method of stationary phase indicated. This is, of course, much better than $e^{-\tau/2\pi|\delta|}$. The "cost" is that the Mellin transform $\Gamma(s/2)$ for $\delta = 0$ now decays like $e^{-(\pi/4)|\tau|}$ rather than $e^{-(\pi/2)|\tau|}$. This we can certainly afford.

        8.2 The twisted Gaussian: overview and setup

        8.2.1 Relation to the existing literature

        We wish to approximate the Mellin transform
        \[ F_\delta(s) = \int_0^\infty e^{-t^2/2} e(\delta t)\, t^s\,\frac{dt}{t}, \tag{8.8} \]
        where $\delta \in \mathbb{R}$. The parabolic cylinder function $U : \mathbb{C}^2 \to \mathbb{C}$ is given by
        \[ U(a, z) = \frac{e^{-z^2/4}}{\Gamma\left(\frac{1}{2} + a\right)}\int_0^\infty t^{a - \frac{1}{2}} e^{-\frac{1}{2}t^2 - zt}\,dt \]
        for $\Re(a) > -1/2$; this function can be extended to all $a, z \in \mathbb{C}$ either by analytic continuation or by other integral representations ([AS64, §19.5], [Tem10, §12.5(i)]). Hence
        \[ F_\delta(s) = e^{(\pi i\delta)^2}\,\Gamma(s)\,U\left(s - \frac{1}{2}, -2\pi i\delta\right). \tag{8.9} \]

        ² There may be a minor gap in the literature in this respect. The explicit formula given in [HL22, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms were fully worked out for $q = 1$ by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of $\mathrm{hyp}_{\kappa,q}(z)$ there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL22] seems to neglect the effect of the zeros $\rho$ with $\Re(\rho) = 0$, $\Im(\rho) \ne 0$ for $\chi$ non-primitive. (The author thanks R. C. Vaughan for this information and the references.)

        The second argument of $U$ is purely imaginary; it would be otherwise if a Gaussian of non-zero mean were chosen.

        Let us briefly discuss the state of knowledge up to date on Mellin transforms of "twisted" Gaussian smoothings, that is, $e^{-t^2/2}$ multiplied by an additive character $e(\delta t)$. As we have just seen, these Mellin transforms are precisely the parabolic cylinder functions $U(a, z)$.

        The function $U(a, z)$ has been well-studied for $a$ and $z$ real; see, e.g., [Tem10]. Less attention has been paid to the more general case of $a$ and $z$ complex. The most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61], [Olv65]; he gave asymptotic series for $U(a, z)$, $a, z \in \mathbb{C}$. These were asymptotic series in the sense of Poincaré, and thus not in general convergent; they would solve our problem if and only if they came with error term bounds. Unfortunately, it would seem that all fully explicit error terms in the literature are either for $a$ and $z$ real, or for $a$ and $z$ outside our range of interest (see both Olver's work and [TV03]). The bounds in [Olv61] involve non-explicit constants. Thus, we will have to find expressions with explicit error bounds ourselves. Our case is that of $a$ in the critical strip, $z$ purely imaginary.

        8.2.2 General approach

        We will use the saddle-point method (see, e.g., [dB81, §5], [Olv74, §4.7], [Won01, §II.4]) to obtain bounds with an optimal leading-order term and small error terms. (We used the stationary-phase method solely as an exploratory tool.)

        What do we expect to obtain? Both the asymptotic expressions in [Olv59] and the bounds in [Olv61] make clear that, if the sign of $\tau = \Im(s)$ is different from that of $\delta$, there will be a change in behavior when $\tau$ gets to be of size about $(2\pi\delta)^2$. This is unsurprising, given our discussion using stationary phase: for $|\Im(a)|$ smaller than a constant times $|\Im(z)|^2$, the term proportional to $e^{-(\pi/4)|\tau|} = e^{-(\pi/4)|\Im(a)|}$ should be dominant, whereas for $|\Im(a)|$ much larger than a constant times $|\Im(z)|^2$, the term proportional to $e^{-\frac{1}{2}\left(\frac{\tau}{2\pi\delta}\right)^2}$ should be dominant.

        There is one important difference between the approach we will follow here and that in [Hela]. In [Hela], the integral (8.8) was estimated by a direct application of the saddle-point method. Here, following a suggestion of N. Temme, we will use the identity
        \[ U(a, z) = \frac{e^{\frac{1}{4}z^2}}{\sqrt{2\pi}\, i}\int_{c - i\infty}^{c + i\infty} e^{-zu + \frac{u^2}{2}}\, u^{-a - \frac{1}{2}}\,du \tag{8.10} \]
        (see, e.g., [OLBC10, (12.5.6)]; $c > 0$ is arbitrary). Together, (8.9) and (8.10) give us that
        \[ F_\delta(s) = \frac{e^{-2\pi^2\delta^2}\,\Gamma(s)}{\sqrt{2\pi}\, i}\int_{c - i\infty}^{c + i\infty} e^{2\pi i\delta u + \frac{u^2}{2}}\, u^{-s}\,du. \tag{8.11} \]
        Estimating the integral in (8.11) turns out to be a somewhat cleaner task than estimating (8.8). The overall procedure, however, is in essence the same in both cases.

        We write

        φ(u) = minusu2

        2minus (2πiδ)u+ iτ log u (812)

        for u real or complex so that the integral in (811) equals

        I(s) =

        int c+iinfin

        cminusiinfineminusφ(u)uminusσdu (813)

We wish to find a saddle point. A saddle point is a point $u$ at which $\phi'(u) = 0$. This means that

\[ -u - 2\pi i\delta + \frac{i\tau}{u} = 0, \quad\text{i.e.,}\quad u^2 - i\ell u - i\tau = 0, \tag{8.14} \]

where $\ell = -2\pi\delta$. The solutions to $\phi'(u) = 0$ are thus

\[ u_0 = \frac{i\ell \pm \sqrt{-\ell^2 + 4i\tau}}{2}. \tag{8.15} \]

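The value of $\phi(u)$ at $u_0$ is

\[ \phi(u_0) = -\frac{i\ell u_0 + i\tau}{2} + i\ell u_0 + i\tau\log u_0 = \frac{i\ell}{2}u_0 + i\tau\log\frac{u_0}{\sqrt{e}}. \tag{8.16} \]

The second derivative at $u_0$ is

\[ \phi''(u_0) = -\frac{1}{u_0^2}\left(u_0^2 + i\tau\right) = -\frac{1}{u_0^2}\left(i\ell u_0 + 2i\tau\right). \tag{8.17} \]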

Assign the names $u_{0,+}$, $u_{0,-}$ to the roots in (8.15) according to the sign in front of the square-root (where the square-root is defined so as to have its argument in the interval $(-\pi/2, \pi/2]$). We will actually have to pay attention just to $u_{0,+}$, since, unlike $u_{0,-}$, it lies on the right half of the plane, where our contour of integration also lies. We remark that

\[ u_{0,+} = \frac{i\ell + |\ell|\sqrt{-1 + \frac{4i\tau}{\ell^2}}}{2} = \frac{\ell}{2}\left(i \pm \sqrt{-1 + \frac{4\tau}{\ell^2}\, i}\right), \tag{8.18} \]

where the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. If $\ell = 0$, then $u_{0,+} = (1/\sqrt{2} + i/\sqrt{2})\sqrt{\tau}$.

We can assume without loss of generality that $\tau \ge 0$. We will find it convenient to assume $\tau > 0$, since we can deal with $\tau = 0$ simply by letting $\tau \to 0^+$.


8.3 The saddle point

8.3.1 The coordinates of the saddle point

We should start by determining $u_{0,+}$ explicitly, both in rectangular and polar coordinates. For one thing, we will need to estimate the integrand in (8.13) for $u = u_{0,+}$. The absolute value of the integrand is then $\left|e^{-\phi(u_{0,+})} u_{0,+}^{-\sigma}\right| = |u_{0,+}|^{-\sigma} e^{-\Re\phi(u_{0,+})}$, and, by (8.16),

\[ \Re\phi(u_{0,+}) = -\frac{\ell}{2}\Im(u_{0,+}) - \arg(u_{0,+})\,\tau. \tag{8.19} \]

If $\ell = 0$, we already know that $\Re(u_{0,+}) = \Im(u_{0,+}) = \sqrt{\tau/2}$, $|u_{0,+}| = \sqrt{\tau}$ and $\arg u_{0,+} = \pi/4$. Assume from now on that $\ell \ne 0$.

We will use the expression for $u_{0,+}$ in (8.18). Solving a quadratic equation, we see that

\[ \sqrt{-1 + \frac{4\tau}{\ell^2}\, i} = \sqrt{\frac{j(\rho)-1}{2}} + i\sqrt{\frac{j(\rho)+1}{2}}, \tag{8.20} \]

where $j(\rho) = (1+\rho^2)^{1/2}$ and $\rho = 4\tau/\ell^2$. Hence

\[ \Re(u_{0,+}) = \pm\frac{\ell}{2}\sqrt{\frac{j(\rho)-1}{2}}, \qquad \Im(u_{0,+}) = \frac{\ell}{2}\left(1 \pm \sqrt{\frac{j(\rho)+1}{2}}\right). \tag{8.21} \]

Here and in what follows, the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. (Notice that $\Re(u_{0,+})$ and $\Im(u_{0,+})$ are always positive, except for $\tau = \ell = 0$, in which case $\Re(u_{0,+}) = \Im(u_{0,+}) = 0$.) By (8.21),

\[
\begin{aligned}
|u_{0,+}| &= \frac{|\ell|}{2}\cdot\left|\sqrt{\frac{-1+j(\rho)}{2}} + \left(1 \pm \sqrt{\frac{1+j(\rho)}{2}}\right) i\right| \\
&= \frac{|\ell|}{2}\sqrt{\frac{-1+j(\rho)}{2} + \frac{1+j(\rho)}{2} + 1 \pm 2\sqrt{\frac{1+j(\rho)}{2}}} \\
&= \frac{|\ell|}{2}\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1+j(\rho)}{2}}} = \frac{|\ell|}{\sqrt{2}}\sqrt{\upsilon(\rho)^2 \pm \upsilon(\rho)},
\end{aligned}
\tag{8.22}
\]

where $\upsilon(\rho) = \sqrt{(1+j(\rho))/2}$. We now compute the argument of $u_{0,+}$:

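\[
\begin{aligned}
\arg(u_{0,+}) &= \arg\left(\ell\left(i \pm \sqrt{-1+i\rho}\right)\right) = \arg\left(\sqrt{\frac{-1+j(\rho)}{2}} + i\left(\pm 1 + \sqrt{\frac{1+j(\rho)}{2}}\right)\right) \\
&= \arcsin\frac{\pm 1 + \sqrt{\frac{1+j(\rho)}{2}}}{\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1+j(\rho)}{2}}}} = \arcsin\frac{\sqrt{\pm 1 + \sqrt{\frac{1+j(\rho)}{2}}}}{\sqrt{2\sqrt{\frac{1+j(\rho)}{2}}}} \\
&= \arcsin\sqrt{\frac{1}{2}\left(1 \pm \sqrt{\frac{2}{1+j(\rho)}}\right)} = \frac{\pi}{2} - \frac{1}{2}\arccos\left(\pm\sqrt{\frac{2}{1+j(\rho)}}\right)
\end{aligned}
\tag{8.23}
\]

(by $\cos(\pi - 2\theta) = -\cos 2\theta = 2\sin^2\theta - 1$). Thus

\[
\arg(u_{0,+}) =
\begin{cases}
\frac{\pi}{2} - \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} = \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)} & \text{if } \ell > 0, \\
\frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} & \text{if } \ell < 0.
\end{cases}
\tag{8.24}
\]

In particular, $\arg(u_{0,+})$ lies in $[0, \pi/2]$, and is close to $\pi/2$ only when $\ell > 0$ and $\rho \to 0^+$. Here and elsewhere, we follow the convention that $\arcsin$ and $\arctan$ have image in $[-\pi/2, \pi/2]$, whereas $\arccos$ has image in $[0, \pi]$.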
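As a numerical sanity check (not part of the argument), one can compare the closed forms (8.22) and (8.24) with the root of the quadratic in (8.14) computed directly in floating point. The sketch below assumes $\tau > 0$ and $\ell \ne 0$; the function names `saddle_point` and `closed_forms` are illustrative choices, not notation from the text.

```python
import cmath
import math


def saddle_point(ell, tau):
    """Root u_{0,+} of u^2 - i*ell*u - i*tau = 0, following (8.15)."""
    # cmath.sqrt returns the root with argument in (-pi/2, pi/2],
    # which is the branch convention used in the text.
    return (1j * ell + cmath.sqrt(-ell ** 2 + 4j * tau)) / 2


def closed_forms(ell, tau):
    """Modulus and argument of u_{0,+} from (8.22) and (8.24); needs ell != 0."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho ** 2)
    ups = math.sqrt((1 + j) / 2)
    sign = 1 if ell > 0 else -1
    modulus = abs(ell) / math.sqrt(2) * math.sqrt(ups ** 2 + sign * ups)
    if ell > 0:
        argument = 0.5 * math.acos(-1 / ups)
    else:
        argument = 0.5 * math.acos(1 / ups)
    return modulus, argument


if __name__ == "__main__":
    for ell, tau in [(2.0, 3.0), (-2.0, 3.0), (0.5, 40.0), (-7.0, 1.0)]:
        u = saddle_point(ell, tau)
        modulus, argument = closed_forms(ell, tau)
        # Both differences should be at the level of rounding error.
        print(ell, tau, abs(u) - modulus, cmath.phase(u) - argument)
```

The direct computation and the closed forms should agree up to rounding; a visible discrepancy would point to a sign or branch mistake.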

8.3.2 The direction of steepest descent

As is customary in the saddle-point method, it is now time to determine the direction of steepest descent at the saddle-point $u_{0,+}$. Even if we decide to use a contour that goes through the saddle-point in a direction that is not quite optimal, it will be useful to know what the direction $w$ of steepest descent actually is. A contour that passes through the saddle-point making an angle between $-\pi/4 + \epsilon$ and $\pi/4 - \epsilon$ with $w$ may be acceptable, in that the contribution of the saddle point is then suboptimal by at most a bounded factor depending on $\epsilon$; an angle approaching $-\pi/4$ or $\pi/4$ leads to a contribution suboptimal by an unbounded factor.

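Let $w \in \mathbb{C}$ be the unit vector pointing in the direction of steepest descent. Then, by definition, $w^2\phi''(u_{0,+})$ is real and positive, where $\phi$ is as in (8.12). Thus $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$. (The direction of steepest descent is defined only modulo $\pi$.) By (8.17),

\[
\begin{aligned}
\arg(\phi''(u_{0,+})) &= -\pi + \arg(i\ell u_{0,+} + 2i\tau) - 2\arg(u_{0,+}) \bmod 2\pi \\
&= -\frac{\pi}{2} + \arg(\ell u_{0,+} + 2\tau) - 2\arg(u_{0,+}) \bmod 2\pi.
\end{aligned}
\]

By (8.21),

\[
\begin{aligned}
\Re(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(\pm\sqrt{\frac{j(\rho)-1}{2}} + \frac{4\tau}{\ell^2}\right) = \frac{\ell^2}{2}\left(\rho \pm \sqrt{\frac{j(\rho)-1}{2}}\right), \\
\Im(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(1 \pm \sqrt{\frac{j(\rho)+1}{2}}\right).
\end{aligned}
\]

Therefore, $\arg(\ell u_{0,+} + 2\tau) = \arctan\varpi$, where

\[ \varpi = \frac{1 \pm \sqrt{\frac{j(\rho)+1}{2}}}{\rho \pm \sqrt{\frac{j(\rho)-1}{2}}}. \]

It is easy to check that $\operatorname{sgn}\varpi = \operatorname{sgn}\ell$. Hence

\[ \arctan\varpi = \pm\frac{\pi}{2} - \arctan\frac{\rho \pm \sqrt{\frac{j(\rho)-1}{2}}}{1 \pm \sqrt{\frac{j(\rho)+1}{2}}}. \]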

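At the same time (writing $j = j(\rho)$, $\upsilon = \upsilon(\rho)$),

\[
\begin{aligned}
\frac{\rho \pm \sqrt{\frac{j-1}{2}}}{1 \pm \sqrt{\frac{j+1}{2}}}
&= \frac{\left(\rho \pm \sqrt{\frac{j-1}{2}}\right)\left(1 \mp \sqrt{\frac{j+1}{2}}\right)}{1 - \frac{j+1}{2}}
= \frac{\rho \pm \sqrt{2(j-1)} \mp \rho\sqrt{2(j+1)}}{1-j} \\
&= \frac{\rho \pm \sqrt{\frac{2}{j+1}}\left(\sqrt{j^2-1} - \rho\cdot(j+1)\right)}{1-j}
= \frac{\rho \pm \frac{1}{\upsilon}\left(\rho - \rho\cdot(j+1)\right)}{1-j} \\
&= \frac{\rho\,(1 \mp j/\upsilon)}{1-j} = \frac{(-1 \pm j/\upsilon)(j+1)}{\rho} = \frac{2\upsilon(-\upsilon \pm j)}{\rho}.
\end{aligned}
\tag{8.25}
\]

Hence, modulo $2\pi$,

\[
\arg(\phi''(u_{0,+})) = -\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} - 2\arg(u_{0,+}) -
\begin{cases}
0 & \text{if } \ell \ge 0, \\
\pi & \text{if } \ell < 0.
\end{cases}
\]

Therefore, the direction of steepest descent is

\[
\arg(w) = -\frac{\arg(\phi''(u_{0,+}))}{2} = \arg(u_{0,+}) + \frac{1}{2}\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} +
\begin{cases}
0 & \text{if } \ell \ge 0, \\
\pi/2 & \text{if } \ell < 0.
\end{cases}
\tag{8.26}
\]

By (8.24) and $\arccos(1/\upsilon) = \arctan\sqrt{\upsilon^2 - 1} = \arctan\sqrt{(j-1)/2}$, we conclude that

\[
\arg(w) =
\begin{cases}
\frac{\pi}{2} + \frac{1}{2}\left(-\arctan\frac{2\upsilon(j+\upsilon)}{\rho} + \arctan\sqrt{\frac{j-1}{2}}\right) & \text{if } \ell < 0, \\
\frac{\pi}{2} + \frac{1}{2}\left(\arctan\frac{2\upsilon(j-\upsilon)}{\rho} - \arctan\sqrt{\frac{j-1}{2}}\right) & \text{if } \ell \ge 0.
\end{cases}
\tag{8.27}
\]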
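The closed form (8.27) can be cross-checked numerically against the definition $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$, with $\phi''(u) = -1 - i\tau/u^2$ obtained by differentiating (8.12) twice. The following is a minimal sketch under the assumptions $\tau > 0$, $\ell \ne 0$; the names `arg_w_direct` and `arg_w_closed` are hypothetical helpers introduced only for this check.

```python
import cmath
import math


def arg_w_direct(ell, tau):
    """-arg(phi''(u_{0,+}))/2 mod pi, computed from the definition."""
    u = (1j * ell + cmath.sqrt(-ell ** 2 + 4j * tau)) / 2
    second_derivative = -1 - 1j * tau / u ** 2  # phi''(u) for phi as in (8.12)
    return (-cmath.phase(second_derivative) / 2) % math.pi


def arg_w_closed(ell, tau):
    """Direction of steepest descent from the closed form (8.27)."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho ** 2)
    ups = math.sqrt((1 + j) / 2)
    root = math.sqrt((j - 1) / 2)
    if ell < 0:
        inner = -math.atan(2 * ups * (j + ups) / rho) + math.atan(root)
    else:
        inner = math.atan(2 * ups * (j - ups) / rho) - math.atan(root)
    return (math.pi / 2 + inner / 2) % math.pi


if __name__ == "__main__":
    for ell, tau in [(2.0, 3.0), (-2.0, 3.0), (0.5, 40.0), (-7.0, 1.0)]:
        print(ell, tau, arg_w_direct(ell, tau), arg_w_closed(ell, tau))
```

Since the direction is only defined modulo $\pi$, both functions reduce their result to $[0, \pi)$ before comparison.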


Figure 8.1: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell < 0$

Figure 8.2: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell \ge 0$

There is nothing wrong in using plots here to get an idea of the behavior of $\arg(w)$, since, at any rate, the direction of steepest descent will play only an advisory role in our choices. See Figures 8.1 and 8.2.

8.4 The integral over the contour

We must now choose the contour of integration. The optimal contour should be one on which the phase of the integrand in (8.13) is constant, i.e., $\Im(\phi(u))$ is constant. This is so because, throughout the contour, we want to keep descending from the saddle as rapidly as possible, and so we want to maximize the absolute value of the derivative of the real part of the exponent $-\phi(u)$. At any point $u$, if we are to maximize $|\Re(d\phi(u)/dt)|$, we want our contour to be such that $\Im(d\phi(u)/dt) = 0$. (We can also see this as follows: if $\Im(\phi(u))$ is constant, there is no cancellation in (8.13) for us to miss.)

Writing $u = x + iy$, we obtain from (8.12) that

\[ \Im(\phi(u)) = -xy + \ell x + \tau\log\sqrt{x^2+y^2}. \tag{8.28} \]

We would thus be considering the curve $\Im(\phi(u)) = c$, where $c$ is a constant. Since we need the contour to pass through the saddle point $u_{0,+}$, we set $c = \Im(\phi(u_{0,+}))$. The only problem is that the curve $\Im(\phi(u)) = c$ given by (8.28) is rather uncomfortable to work with.

Instead, we shall use several rather simple contours, each appropriate for different values of $\ell$ and $\tau$.

8.4.1 A simple contour

Assume first that $\ell > 0$. We could just let our contour $L$ be the vertical line going through $u_{0,+}$. Since the direction of steepest descent is never far from vertical (see Figure 8.2), this would be a good choice. However, the vertical line has the defect of going too close to the origin when $\rho \to 0$.

Instead, we will let $L$ consist of three segments: (a) the straight vertical ray

\[ \{(x_0, y) : y \ge y_0\}, \]

where $x_0 = \Re u_{0,+} \ge 0$, $y_0 = \Im u_{0,+} > 0$; (b) the straight segment going downwards and to the right from $u_{0,+}$ to the $x$-axis, forming an angle of $\pi/2 - \beta$ (where $\beta > 0$ will be determined later) with the $x$-axis at a point $(x_1, 0)$; (c) the straight vertical ray $\{(x_1, y) : y \le 0\}$. Let us call these three segments $L_1$, $L_2$, $L_3$. Shifting the contour in (8.13), we obtain

\[ I = \int_L e^{-\phi(u)}\, u^{-\sigma}\, du, \]

and so $|I| \le I_1 + I_2 + I_3$, where

\[ I_j = \int_{L_j} \left|e^{-\phi(u)}\, u^{-\sigma}\right|\, |du|. \tag{8.29} \]

As we shall see, we have chosen the segments $L_j$ so that each of the three integrals $I_j$ will be easy to bound.

Let us start with $I_1$. Since $\sigma \ge 0$,

\[ I_1 \le |u_{0,+}|^{-\sigma}\int_{y_0}^{\infty} e^{-\Re\phi(x_0+iy)}\, dy, \]

where, by (8.12),

\[ \Re\phi(x+iy) = \frac{y^2-x^2}{2} - \ell y - \tau\arg(x+iy). \tag{8.30} \]

Let us expand the expression on the right of (8.30) for $x = x_0$ and $y$ around $y_0 = \Im u_{0,+} > 0$. The constant term is

\[
\begin{aligned}
\Re\phi(u_{0,+}) &= -\frac{\ell}{2}y_0 - \tau\arg(u_{0,+}) = -\frac{\ell^2}{4}(1+\upsilon(\rho)) - \frac{\tau}{2}\arccos\frac{-1}{\upsilon(\rho)} \\
&= -\left(\frac{1+\upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau,
\end{aligned}
\tag{8.31}
\]

where we are using (8.19), (8.21) and (8.24).

The linear term vanishes because $u_{0,+}$ is a saddle-point (and thus a local extremum on $L$). It remains to estimate the quadratic term. Now, in (8.30), the term $\arg(x+iy)$ equals $\arctan(y/x)$, whose quadratic term we should now examine; instead, we are about to see that we can bound it trivially. In general, for $t_0, t \in \mathbb{R}$ and $f \in C^2$,

\[ f(t) = f(t_0) + f'(t_0)\cdot(t-t_0) + \int_{t_0}^{t}\int_{t_0}^{r} f''(s)\, ds\, dr. \tag{8.32} \]

Now, $\arctan''(s) = -2s/(s^2+1)^2$, and this is negative for $s > 0$ and obeys

\[ \arctan''(-s) = -\arctan''(s) \]

for all $s$. Hence, for $t_0 \ge 0$ and $t \ge -t_0$,

\[ \arctan t \le \arctan t_0 + (\arctan' t_0)\cdot(t - t_0). \tag{8.33} \]

Therefore, in (8.30), we can consider only the quadratic term coming from $(y^2-x^2)/2$, namely $(y-y_0)^2/2$, and ignore the quadratic term coming from $\arg(x+iy)$. Thus,

\[ \Re\phi(x_0+iy) \ge \frac{(y-y_0)^2}{2} + \Re\phi(u_{0,+}) \tag{8.34} \]

for $y \ge -y_0$, and in particular for $y \ge y_0$. Hence

\[ \int_{y_0}^{\infty} e^{-\Re\phi(x_0+iy)}\, dy \le e^{-\Re\phi(u_{0,+})}\int_{y_0}^{\infty} e^{-\frac{1}{2}(y-y_0)^2}\, dy = \sqrt{\pi/2}\cdot e^{-\Re\phi(u_{0,+})}. \tag{8.35} \]

Notice that, once we choose to use the approximation (8.33), the vertical direction is actually optimal. (In turn, the fact that the direction of steepest descent is close to vertical shows us that we are not losing much by using the approximation (8.33).)
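To see how much room (8.34) and (8.35) leave, one can evaluate $\Re\phi$ along the vertical ray numerically. The sketch below is only an illustration for $\ell > 0$, $\tau > 0$: it uses a plain midpoint rule on a truncated ray (an assumption made for simplicity, with no rigorous error control), and the function names are hypothetical.

```python
import cmath
import math


def re_phi(x, y, ell, tau):
    """Real part of phi(x + iy), as in (8.30); atan2 gives arg(x + iy)."""
    return (y * y - x * x) / 2 - ell * y - tau * math.atan2(y, x)


def check_vertical_ray(ell, tau, length=12.0, steps=24000):
    """Check (8.34) pointwise and approximate the normalized integral in (8.35).

    Returns (pointwise_ok, total), where total approximates the integral of
    exp(-(Re phi(x0 + iy) - Re phi(u_{0,+}))) over y0 <= y <= y0 + length.
    """
    u0 = (1j * ell + cmath.sqrt(-ell ** 2 + 4j * tau)) / 2
    x0, y0 = u0.real, u0.imag
    base = re_phi(x0, y0, ell, tau)
    h = length / steps
    total = 0.0
    pointwise_ok = True
    for i in range(steps):
        y = y0 + (i + 0.5) * h  # midpoint of the i-th subinterval
        gap = re_phi(x0, y, ell, tau) - base
        if gap < (y - y0) ** 2 / 2 - 1e-12:
            pointwise_ok = False
        total += math.exp(-gap) * h
    return pointwise_ok, total


if __name__ == "__main__":
    ok, total = check_vertical_ray(2.0, 3.0)
    print(ok, total, math.sqrt(math.pi / 2))
```

The returned total should stay below $\sqrt{\pi/2}$; the gap between the two numbers reflects the quadratic term of $\arg(x+iy)$ that (8.33) discards.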

As for $|u_{0,+}|^{-\sigma}$, we will estimate it by the easy bound

\[ |u_{0,+}| = \frac{\ell}{\sqrt{2}}\sqrt{\upsilon^2+\upsilon} \ge \frac{\ell}{\sqrt{2}}\max\left(\sqrt{\frac{\rho}{2}}, \sqrt{2}\right) = \max\left(\sqrt{\tau}, \ell\right), \tag{8.36} \]

where we use (8.22).

Let us now bound $I_2$. As we already said, the linear term at $u_{0,+}$ vanishes. Let $u_\circ$ be the point at which $L_2$ meets the line normal to it through the origin. We must take care that the angle formed by the origin, $u_{0,+}$ and $u_\circ$ be no larger than the angle formed by the origin, $(x_1, 0)$ and $u_\circ$; this will ensure that we are in the range in which the approximation (8.33) is valid (namely, $t \ge -t_0$, where $t_0 = \tan\alpha_0$). The first angle is $\pi/2 + \beta - \arg u_{0,+}$, whereas the second angle is $\pi/2 - \beta$. Hence, it is enough to set $\beta \le (\arg u_{0,+})/2$. Then we obtain from (8.12) and (8.33) that

\[ \Re\phi(u) \ge \Re\phi(u_{0,+}) - \Re\frac{(u-u_{0,+})^2}{2}. \tag{8.37} \]

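If we let $s = |u - u_{0,+}|$, we see that

\[ \Re\frac{(u-u_{0,+})^2}{2} = \frac{s^2}{2}\cos\left(2\cdot\left(\frac{\pi}{2}-\beta\right)\right) = -\frac{s^2}{2}\cos 2\beta. \]

Hence

\[
\begin{aligned}
I_2 &\le |u_\circ|^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\, |du| \\
&< |u_\circ|^{-\sigma}\int_0^{\infty} e^{-\Re\phi(u_{0,+}) - \frac{s^2}{2}\cos 2\beta}\, ds = |u_\circ|^{-\sigma} e^{-\Re\phi(u_{0,+})}\sqrt{\frac{\pi}{2\cos 2\beta}}.
\end{aligned}
\tag{8.38}
\]

Since $\arg u_\circ = \arg u_{0,+} - \beta$, we see that, by (8.21),

\[
\begin{aligned}
|u_\circ| &= \Re\left((x_0+iy_0)(\cos\beta - i\sin\beta)\right) \\
&= \frac{\ell}{2}\left(\sqrt{\frac{j-1}{2}}\cos\beta + \left(1+\sqrt{\frac{j+1}{2}}\right)\sin\beta\right).
\end{aligned}
\tag{8.39}
\]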


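The square of the expression within the outer parentheses is at least

\[
\begin{aligned}
&\frac{j-1}{2}\cos^2\beta + \left(1 + \frac{j+1}{2} + \sqrt{2(j+1)}\right)\sin^2\beta + \left(\sqrt{\frac{j^2-1}{4}} + \sqrt{\frac{j-1}{2}}\right)\sin 2\beta \\
&\qquad \ge \frac{j}{2} + \frac{7}{2}\sin^2\beta - \frac{1}{2}\cos^2\beta + \frac{j}{2}\sin^2\beta.
\end{aligned}
\]

If $\beta \ge \pi/8$, then $\tan\beta > 1/\sqrt{7}$, and so, since $j > \rho$, we obtain

\[ |u_\circ| \ge \frac{\ell}{2}\sqrt{\frac{j}{2}(1+\sin^2\beta)} > \frac{\ell\sqrt{\rho}}{2^{3/2}}\sqrt{1+\sin^2\beta}. \]

We can also apply the trivial bound $j \ge 1$ directly to (8.39). Thus

\[ |u_\circ| \ge \max\left(\sqrt{\frac{\tau}{2}}\sqrt{1+\sin^2\beta},\ \ell\sin\beta\right). \]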

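Let us choose $\beta$ as follows. We could always set $\beta = \pi/8$; since $\arg u_{0,+} \ge \pi/4$, we then have $\beta \le (\arg u_{0,+})/2$, as required. However, if $\rho \le 3/2$, then $\upsilon(\rho) \le 1.18381$, and so, by (8.24), $\arg u_{0,+} \ge 1.28842$. We can thus set either $\beta = \pi/6 = 0.523598\dots$ or $\beta = \pi/5 = 0.628318\dots$, say, either of which is smaller than $(\arg u_{0,+})/2$. Going back to (8.38), we conclude that

\[ I_2 \le e^{-\Re\phi(u_{0,+})}\cdot\frac{\sqrt{\pi}}{2^{1/4}}\left|\sqrt{\frac{\tau}{2}}\sqrt{1+\sin^2\frac{\pi}{8}}\right|^{-\sigma} \]

for $\rho$ arbitrary, and

\[ I_2 \le e^{-\Re\phi(u_{0,+})}\cdot\min\left(\sqrt{\frac{\pi/2}{\cos(2\pi/5)}}\cdot\left|\ell\sin\frac{\pi}{5}\right|^{-\sigma},\ \sqrt{\pi}\left|\frac{\ell}{2}\right|^{-\sigma}\right) \]

when $\rho \le 3/2$.

It remains to estimate $I_3$. For $u = x_1$,

\[
\begin{aligned}
-\Re\frac{(u-u_{0,+})^2}{2} &= -\Re\frac{y_0^2(\tan\beta - i)^2}{2} = \frac{1}{2}\left(1-\tan^2\beta\right)y_0^2 \\
&\ge \left(1-\tan^2\beta\right)\cdot\frac{\ell^2}{8}\left(1+\frac{j+1}{2}\right) \ge \frac{\ell^2}{8}\left(1-\tan^2\beta\right)\cdot\frac{\rho}{2} \ge \frac{1}{4}\left(1-\tan^2\beta\right)\tau,
\end{aligned}
\]

where we are using (8.21). Thus (8.37) tells us that

\[ \Re\phi(x_1) \ge \Re\phi(u_{0,+}) + \frac{1-\tan^2\beta}{4}\,\tau. \]

At the same time, by (8.30) and $\tau, \ell \ge 0$,

\[ \Re\phi(x_1+iy) \ge \Re\phi(x_1) + \frac{y^2}{2} \]

for $y \le 0$. Hence

\[
\begin{aligned}
I_3 &\le |x_1|^{-\sigma}\int_{L_3} e^{-\Re\phi(u)}\, |du| \le |x_1|^{-\sigma} e^{-\Re\phi(x_1)}\int_{-\infty}^{0} e^{-y^2/2}\, dy \\
&\le |x_1|^{-\sigma}\cdot\sqrt{\frac{\pi}{2}}\, e^{-\frac{1-\tan^2\beta}{4}\tau}\, e^{-\Re\phi(u_{0,+})}.
\end{aligned}
\]

Here note that $x_1 \ge (\tan\beta)|u_{0,+}|$, and so, by (8.36),

\[ x_1 \ge \tan\beta\cdot\max(\sqrt{\tau}, \ell). \]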

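We conclude that, for $\ell > 0$,

\[ |I| \le \left(1 + 2^{1/4}\left(\frac{2}{1+\sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right)\cdot\frac{\sqrt{\pi/2}}{\tau^{\sigma/2}}\, e^{-\Re\phi(u_{0,+})} \]

(since $(1-\tan^2(\pi/8))/4 = (\sqrt{2}-1)/2$), and, when $\rho \le 3/2$,

\[ |I| \le \left(1 + \min\left(2^{\sigma+\frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right)\cdot\frac{\sqrt{\pi/2}}{\ell^{\sigma}}\, e^{-\Re\phi(u_{0,+})}. \]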

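We know $\Re\phi(u_{0,+})$ from (8.31). Write

\[ E(\rho) = \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} - \frac{\upsilon(\rho)-1}{\rho}, \tag{8.40} \]

so that

\[ -\Re\phi(u_{0,+}) = \left(\frac{1+\upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau = \left(\frac{\pi}{2} - E(\rho) + \frac{2}{\rho}\right)\tau. \]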

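To finish, we just need to apply (8.11). It makes sense to group together $\Gamma(s)e^{\frac{\pi}{2}\tau}$, since it is bounded on the critical line (by the classical formula $|\Gamma(1/2+i\tau)| = \sqrt{\pi/\cosh\pi\tau}$, as in [MV07, Exer. C.1(b)]), and in general of slow growth on bounded strips. Using (8.11), and noting that $2\pi^2\delta^2 = \ell^2/2 = (2/\rho)\cdot\tau$, we obtain

\[
|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau}\, e^{-E(\rho)\tau}\cdot
\begin{cases}
c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for } \rho \text{ arbitrary}, \\
c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for } \rho \le 3/2,
\end{cases}
\tag{8.41}
\]

where

\[
\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1+\sin^2\frac{\pi}{8}}\right)^{\sigma/2} + \frac{e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}}{\left(\tan\frac{\pi}{8}\right)^{\sigma}}\right), \\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma+\frac{1}{2}},\ \frac{\sqrt{\sec\frac{2\pi}{5}}}{\left(\sin\frac{\pi}{5}\right)^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right).
\end{aligned}
\tag{8.42}
\]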


We have assumed throughout that $\ell \ge 0$ and $\tau \ge 0$. We can immediately obtain a bound valid for $\ell \le 0$, $\tau \le 0$ by reflection on the $x$-axis: we simply put absolute values around $\tau$ and $\ell$ in (8.41).

We see that we have obtained a bound in a neat, closed form without too much effort. Of course, this effortlessness is usually in part illusory: the contour we have used here is actually the product of some trial and error, in that some other contours give results that are comparable in quality but harder to simplify. We will have to choose a different contour when $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$.

8.4.2 Another simple contour

We now wish to give a bound for the case of $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$, i.e., $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$. We expect a much smaller upper bound than for $\operatorname{sgn}(\ell) = \operatorname{sgn}(\tau)$, given what we already know from the method of stationary phase. This also means that we will not need to be as careful in order to get a bound that is good enough for all practical purposes.

Our contour $L$ will consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge 0\}$; (b) the quarter-circle from $(x_0, 0)$ to $(0, -x_0)$ (that is, an arc where the argument runs from $0$ to $-\pi/2$); and (c) the straight vertical ray $\{(0, y) : y \le -x_0\}$. We call these segments $L_1$, $L_2$, $L_3$, and define the integrals $I_1$, $I_2$ and $I_3$ just as in (8.29).

Much as before, we have

\[ I_1 \le x_0^{-\sigma}\int_0^{\infty} e^{-\Re\phi(x_0+iy)}\, dy. \]

Since (8.33) is valid for $t \ge 0$, (8.34) holds, and so

\[ I_1 \le x_0^{-\sigma}\, e^{-\Re\phi(u_{0,+})}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y-y_0)^2}\, dy = x_0^{-\sigma}\sqrt{2\pi}\cdot e^{-\Re\phi(u_{0,+})}. \]

By (8.12) and (8.30),

\[ I_2 \le x_0^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\, |du| = x_0^{1-\sigma}\int_0^{\pi/2} e^{-\left(-\frac{x_0^2\cos 2\alpha}{2} + \ell x_0\sin\alpha + \tau\alpha\right)}\, d\alpha. \tag{8.43} \]

Now, for $\alpha \ge 0$ and $\ell \le 0$,

\[ (\ell x_0\sin\alpha + \tau\alpha)' = \ell x_0\cos\alpha + \tau \ge \ell x_0 + \tau. \]

Since $j = \sqrt{1+\rho^2} \le 1 + \rho^2/2$, we have $\sqrt{(j-1)/2} \le \rho/2$, and so, by (8.21), $|\ell x_0| \le \ell^2\rho/4 = \tau$, and thus $\ell x_0 + \tau \ge 0$. In other words, the exponent in (8.43) equals $(x_0^2\cos 2\alpha)/2$ minus an increasing function, and so, since $\Re\phi(x_0) = -x_0^2/2$,

\[ I_2 \le x_0^{-\sigma}\cdot x_0\int_0^{\pi/2} e^{\frac{x_0^2\cos 2\alpha}{2}}\, d\alpha = x_0^{-\sigma}\cdot\frac{\pi}{2}x_0\cdot I_0(x_0^2/2), \]

where $I_0(t) = \frac{1}{\pi}\int_0^{\pi} e^{t\cos\theta}\, d\theta$ is the modified Bessel function of the first kind (and order 0).


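Since $\cos\theta = \sqrt{1-\sin^2\theta} < 1 - (\sin^2\theta)/2 \le 1 - 2\theta^2/\pi^2$, we have³

\[ I_0(t) \le \frac{1}{\pi}\int_0^{\pi} e^{t\left(1-\frac{2\theta^2}{\pi^2}\right)}\, d\theta < e^t\cdot\frac{1}{\pi}\int_0^{\infty} e^{-\frac{2t}{\pi^2}\theta^2}\, d\theta = e^t\cdot\frac{1}{\pi}\cdot\frac{\pi}{\sqrt{2t}}\cdot\frac{\sqrt{\pi}}{2} = \frac{\sqrt{\pi}}{2^{3/2}}\frac{e^t}{\sqrt{t}} \]

for $t \ge 0$. Using the fact that $\Re\phi(x_0) = -x_0^2/2$, we conclude that

\[ I_2 \le x_0^{-\sigma}\cdot\frac{\pi}{2}x_0\cdot\frac{\sqrt{\pi}}{2^{3/2}}\frac{e^{x_0^2/2}}{x_0/\sqrt{2}} = \frac{\pi^{3/2}}{4}\, x_0^{-\sigma}\, e^{-\Re\phi(x_0)}. \]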
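The elementary bound on $I_0(t)$ used above is easy to probe numerically. The sketch below evaluates $I_0$ through its standard power series $\sum_{k \ge 0} (t/2)^{2k}/(k!)^2$ in ordinary floating point; this is a non-rigorous illustration (no interval arithmetic), and the helper names are assumptions made for this example.

```python
import math


def bessel_i0(t, tol=1e-17):
    """Modified Bessel function I_0(t) via the series sum_k (t/2)^(2k) / (k!)^2."""
    term = 1.0
    total = 1.0
    k = 0
    while term > tol * total:
        k += 1
        term *= (t / (2 * k)) ** 2  # ratio of consecutive series terms
        total += term
    return total


def i0_upper_bound(t):
    """The bound sqrt(pi) / 2^(3/2) * e^t / sqrt(t) derived in the text (t > 0)."""
    return math.sqrt(math.pi) / 2 ** 1.5 * math.exp(t) / math.sqrt(t)


if __name__ == "__main__":
    for t in [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0, 60.0]:
        ratio = bessel_i0(t) * math.sqrt(t) * math.exp(-t)
        # ratio should stay below sqrt(pi) / 2^(3/2) for every t > 0.
        print(t, ratio)
```

Printing $I_0(t)\sqrt{t}\,e^{-t}$ rather than $I_0(t)$ itself makes it easy to compare against the constant in front of $e^t/\sqrt{t}$.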

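By (8.34), which is valid for all $\ell$, we know that $\Re\phi(x_0) \ge \Re\phi(u_{0,+})$.

Let us now estimate the integral on $L_3$. Again by (8.30), for $y < 0$,

\[ \Re\phi(iy) = \frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}. \]

Hence

\[
\begin{aligned}
\left|\int_{L_3} e^{-\phi(u)}\, u^{-\sigma}\, du\right| &\le x_0^{-\sigma}\int_{-\infty}^{-x_0} e^{-\left(\frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}\right)}\, dy \\
&= x_0^{-\sigma}\, e^{\frac{1}{2}\ell^2}\, e^{-\frac{\tau\pi}{2}}\int_{-\infty}^{-x_0} e^{-\frac{1}{2}(y-\ell)^2}\, dy \le x_0^{-\sigma}\, e^{-\frac{\tau\pi}{2}}\sqrt{\frac{\pi}{2}},
\end{aligned}
\]

since $y - \ell \le -\ell$ for $y \le -x_0$ and $\int_{-\infty}^{-\ell} e^{-t^2/2}\, dt \le \sqrt{\pi/2}\cdot e^{-\ell^2/2}$ (by [AS64, 7.1.13]).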

Now that we have bounded the integrals over $L_1$, $L_2$ and $L_3$, it remains to bound $x_0$ from below, starting from (8.21). We will bound it differently for $\rho < 3/2$ and for $\rho \ge 3/2$. (The choice of $3/2$ is fairly arbitrary.)

Expanding $(\sqrt{1+t}-1)^2 > 0$, we obtain that $2(1+t) - 2\sqrt{1+t} \ge t$ for all $t \ge -1$, and so

\[ \left(\frac{\sqrt{1+t}-1}{t}\right)' = \frac{1}{t^2}\left(\frac{t}{2\sqrt{1+t}} - (\sqrt{1+t}-1)\right) < 0, \]

i.e., $(\sqrt{1+t}-1)/t$ decreases as $t$ increases. Hence, for $\rho \le \rho_0$, where $\rho_0 \ge 0$,

\[ j(\rho) = \sqrt{1+\rho^2} \ge 1 + \frac{\sqrt{1+\rho_0^2}-1}{\rho_0^2}\,\rho^2, \tag{8.44} \]

which equals $1 + (2/9)(\sqrt{13}-2)\rho^2$ for $\rho_0 = 3/2$. Thus, for $\rho \le 3/2$,

\[ x_0 \ge \frac{|\ell|}{2}\sqrt{\frac{\frac{2}{9}(\sqrt{13}-2)\rho^2}{2}} = \frac{\sqrt{\sqrt{13}-2}}{6}\,|\ell|\rho = \frac{2\sqrt{\sqrt{13}-2}}{3}\frac{\tau}{|\ell|} \ge 0.84473\,\frac{|\tau|}{|\ell|}. \tag{8.45} \]

³ It is actually not hard to prove rigorously the better bound $I_0(t) \le 0.468823\, e^t/\sqrt{t}$. For $t \ge 8$, this can be done directly by the change of variables $\cos\theta = 1 - 2s^2$, $d\theta = 2\,ds/\sqrt{1-s^2}$, followed by the usage of different upper bounds on the integrand $\exp(-2ts^2)/\sqrt{1-s^2}$ for $0 \le s \le 1/2$ and $1/2 \le s \le 1$. (Thanks are due to G. Kuperberg for this argument.) For $t < 8$, use the Taylor expansion of $I_0(t)$ around $t = 0$ [AS64, (9.6.12)], truncate it after 16 terms, and then bound the maximum of the truncated series by the bisection method, implemented via interval arithmetic (as described in §2.6).


On the other hand,

\[ \left(\frac{j(\rho)-1}{\rho}\right)' = \frac{1}{\rho^2}\left(j'(\rho)\rho - (j(\rho)-1)\right) = \frac{\rho^2 - (1+\rho^2) + \sqrt{1+\rho^2}}{\rho^2\sqrt{1+\rho^2}} \ge 0, \]

and so, for $\rho \ge 3/2$, $(j(\rho)-1)/\rho$ is minimal at $\rho = 3/2$, where it takes the value $(\sqrt{13}-2)/3$. Hence

\[ x_0 = \frac{|\ell|}{2}\sqrt{\frac{j(\rho)-1}{2}} \ge \frac{|\ell|\sqrt{\rho}}{2}\cdot\frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}} = \frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}}\sqrt{\tau} \ge 0.51729\sqrt{\tau}. \tag{8.46} \]
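The two lower bounds (8.45) and (8.46) meet at $\rho = 3/2$, where both hold with equality. A short floating-point check along the following lines makes this visible; it assumes $\tau > 0$ and $\ell \ne 0$, and the function names are illustrative only.

```python
import math


def x0(ell, tau):
    """|Re(u_{0,+})| from (8.21): (|ell|/2) * sqrt((j(rho) - 1) / 2)."""
    rho = 4 * tau / ell ** 2
    j = math.sqrt(1 + rho ** 2)
    return abs(ell) / 2 * math.sqrt((j - 1) / 2)


def x0_lower_bound(ell, tau):
    """Exact-constant forms of the lower bounds (8.45) and (8.46)."""
    rho = 4 * tau / ell ** 2
    c = math.sqrt(math.sqrt(13) - 2)
    if rho <= 1.5:
        return 2 * c / 3 * tau / abs(ell)  # (8.45)
    return c / math.sqrt(6) * math.sqrt(tau)  # (8.46)


if __name__ == "__main__":
    for ell, tau in [(-1.0, 0.1), (-2.0, 1.5), (-3.0, 20.0), (-0.5, 40.0)]:
        print(ell, tau, x0(ell, tau), x0_lower_bound(ell, tau))
```

The pair $(\ell, \tau) = (-2, 1.5)$ has $\rho = 3/2$ exactly, so the two printed values should coincide there up to rounding.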

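We now sum $I_1$, $I_2$ and $I_3$, and then use (8.11); we obtain that, when $\ell < 0$ and $\tau \ge 0$,

\[
\begin{aligned}
|F_\delta(s)| &\le \frac{e^{-2\pi^2\delta^2}|\Gamma(s)|}{\sqrt{2\pi}}\left|\int_L e^{-\phi(u)}\, u^{-\sigma}\, du\right| \\
&\le |x_0|^{-\sigma}\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\Re\phi(u_{0,+})} + \frac{1}{2}e^{-\frac{\tau\pi}{2}}\right)e^{-\frac{1}{2}\ell^2}|\Gamma(s)|.
\end{aligned}
\tag{8.47}
\]

By (8.19), (8.21) and (8.24),

\[ -\Re(\phi(u_{0,+})) = \frac{\ell^2}{4}(1-\upsilon(\rho)) + \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} < \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} \le \frac{\pi}{4}\tau. \]

We conclude that, when $\operatorname{sgn}(\ell) \ne \operatorname{sgn}(\tau)$ (i.e., $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$),

\[ |F_\delta(s)| \le |x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}\,|\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right), \]

where $x_0$ can be bounded as in (8.45) and (8.46). Here, as before, we are reducing the case $\tau < 0$ to the case $\tau > 0$ by reflection. This concludes the proof of Theorem 8.0.1.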

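8.5 Conclusions

We have obtained bounds on $|F_\delta(s)|$ for $\operatorname{sgn}(\delta) \ne \operatorname{sgn}(\tau)$ (8.41) and for $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$ (8.47). Our task is now to simplify them.

First, let us look at the exponent $E(\rho)$ defined as in (8.2). Its plot can be seen in Figure 8.3. We claim that

\[
E(\rho) \ge
\begin{cases}
0.1598 & \text{if } \rho \ge 1.5, \\
0.1065\rho & \text{if } \rho < 1.5.
\end{cases}
\tag{8.48}
\]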

This is so for $\rho \ge 1.5$ because $E(\rho)$ is increasing on $\rho$, and $E(1.5) = 0.15982\dots$ The case $\rho < 1.5$ is a little more delicate. We can easily see that $\arccos(1-t^2/2) \ge t$ for $0 \le t \le 2$ (since the derivative of the left side is $1/\sqrt{1-t^2/4}$, which is always $\ge 1$). We also have

\[ 1 + \frac{\rho^2}{2} - \frac{\rho^4}{8} \le j(\rho) \le 1 + \frac{\rho^2}{2} \]

for $0 \le \rho \le \sqrt{8}$, and so

\[ 1 + \frac{\rho^2}{8} - \frac{5\rho^4}{128} \le \upsilon(\rho) \le 1 + \frac{\rho^2}{8} \]

for $0 \le \rho \le \sqrt{32/5}$; this, in turn, gives us that $1/\upsilon(\rho) \le 1 - \rho^2/8 + 7\rho^4/128$ (again for $0 \le \rho \le \sqrt{32/5}$), and so $1/\upsilon(\rho) \le 1 - (1-7/64)\rho^2/8$ for $0 \le \rho \le 1/2$. We conclude that

\[ \arccos\frac{1}{\upsilon(\rho)} \ge \frac{1}{2}\sqrt{\frac{57}{64}}\,\rho; \]

therefore,

\[ E(\rho) \ge \frac{1}{4}\sqrt{\frac{57}{64}}\,\rho - \frac{\rho}{8} > 0.11093\rho > 0.1065\rho. \]

Figure 8.3: The function $E(\rho)$

In the remaining range $1/2 \le \rho \le 3/2$, we prove that $E(\rho)/\rho > 0.106551$ using the bisection method (with 20 iterations) implemented by means of interval arithmetic. This concludes the proof of (8.48).
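A quick way to get a feel for (8.48) is to sample $E(\rho)$ from (8.40) on a grid. The sketch below does this in ordinary floating point; it is only a plausibility check and is not a substitute for the interval-arithmetic bisection the proof relies on. The function names and the choice of grid are assumptions made for this illustration.

```python
import math


def upsilon(rho):
    """upsilon(rho) = sqrt((1 + j(rho)) / 2), with j(rho) = sqrt(1 + rho^2)."""
    return math.sqrt((1 + math.sqrt(1 + rho * rho)) / 2)


def E(rho):
    """The exponent E(rho) of (8.40), for rho > 0."""
    ups = upsilon(rho)
    return 0.5 * math.acos(1 / ups) - (ups - 1) / rho


def claimed_lower_bound(rho):
    """Right-hand side of (8.48)."""
    return 0.1598 if rho >= 1.5 else 0.1065 * rho


if __name__ == "__main__":
    rho = 0.01
    worst_margin = float("inf")
    while rho < 60:
        worst_margin = min(worst_margin, E(rho) - claimed_lower_bound(rho))
        rho *= 1.01  # geometric grid, finer near small rho
    # A positive value means no sampled point violates (8.48).
    print(worst_margin)
```

The margin is smallest near $\rho = 3/2$, which matches the fact that the constants $0.1598$ and $0.1065$ are both calibrated at that point.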

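Assume from this point onwards that $|\tau| \ge 20$. Let us show that the contribution of (8.3) is negligible relative to that of (8.1). Indeed,

\[ \left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right) \le \frac{7.8}{10^6}\, e^{-0.1598\tau}. \]

It is useful to note that $e^{-\ell^2/2} = e^{-2\tau/\rho}$, and so, for $\sigma \le k+1$ and $\rho \le 3/2$,

\[
\begin{aligned}
\frac{e^{-2\tau/\rho}}{(0.84473|\tau|/\ell)^{\sigma}} &\le \frac{e^{-40/\rho}}{\left(\frac{0.84473}{4}\rho\right)^{\sigma}\ell^{\sigma}} \le \frac{1}{\ell^{\sigma}}\left(\frac{4}{0.84473\cdot 1.5}\right)^{\sigma}\frac{e^{-80/(3t)}}{t^{\sigma}} \\
&\le \frac{1}{\ell^{\sigma}}\cdot 3.15683^{k+1}\,\frac{e^{-80/(3t)}}{t^{k+1}},
\end{aligned}
\tag{8.49}
\]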


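where $t = 2\rho/3 \le 1$. Since $e^{-c/t}/t^{k+1}$ attains its maximum at $t = c/(k+1)$,

\[ \frac{e^{-80/(3t)}}{t^{k+1}} \le e^{-(k+1)}\left(\frac{3(k+1)}{80}\right)^{k+1}, \]

and so, for $\rho \le 3/2$,

\[
|x_0|^{-\sigma} e^{-\frac{1}{2}\ell^2} \le \frac{1}{\ell^{\sigma}}\cdot
\begin{cases}
0.04355 & \text{if } 0 \le \sigma \le 1, \\
0.00759 & \text{if } 1 \le \sigma \le 2, \\
0.00224 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\]

whereas $|x_0|^{-\sigma} e^{-\ell^2/2} \le |x_0|^{-\sigma} \le (0.51729\sqrt{\tau})^{-\sigma}$ for $\rho \ge 3/2$.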
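The three numerical constants in the last display come from evaluating the right side of (8.49) at the maximizing value $t = c/(k+1)$ with $c = 80/3$. A minimal sketch of that arithmetic follows; the names `LOWER_CONSTANT` and `decay_constant` are introduced here purely for illustration.

```python
import math

LOWER_CONSTANT = 0.84473  # numerical constant from (8.45)


def decay_constant(k):
    """3.15683^(k+1) * max_t e^(-80/(3t)) / t^(k+1), the maximum being at t = c/(k+1)."""
    c = 80.0 / 3.0
    prefactor = (4.0 / (LOWER_CONSTANT * 1.5)) ** (k + 1)
    peak = math.exp(-(k + 1)) * ((k + 1) / c) ** (k + 1)
    return prefactor * peak


if __name__ == "__main__":
    for k in range(3):
        print(k, decay_constant(k))
```

The printed values should agree with 0.04355, 0.00759 and 0.00224 after rounding up in the last digit shown.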

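We conclude that, for $|\tau| \ge 20$ and $\sigma \le 3$,

\[
|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau}\cdot e^{-0.1598\tau}\cdot
\begin{cases}
\frac{4}{10^7}\,\frac{1}{\ell^{\sigma}} & \text{if } \rho \le 3/2, \\
\frac{6}{10^5}\,\frac{1}{\tau^{\sigma/2}} & \text{if } \rho \ge 3/2,
\end{cases}
\tag{8.50}
\]

provided that $\operatorname{sgn}(\delta) = \operatorname{sgn}(\tau)$ or $\delta = 0$. This will indeed be negligible compared to our bound for the case $\operatorname{sgn}(\delta) = -\operatorname{sgn}(\tau)$.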

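Let us now deal with the factor $|\Gamma(s)|e^{\frac{\pi}{2}\tau}$. By Stirling's formula with remainder term [GR94, (8.344)],

\[ \log\Gamma(s) = \frac{1}{2}\log(2\pi) + \left(s-\frac{1}{2}\right)\log s - s + \frac{1}{12s} + R_2(s), \]

where

\[ |R_2(s)| < \frac{1/30}{12|s|^3\cos^3\left(\frac{\arg s}{2}\right)} \le \frac{\sqrt{2}}{180|s|^3} \]

for $\Re(s) \ge 0$. The real part of $(s-1/2)\log s - s$ is

\[ (\sigma-1/2)\log|s| - \tau\arg(s) - \sigma = (\sigma-1/2)\log|s| - \frac{\pi}{2}\tau + \tau\left(\arctan\frac{\sigma}{|\tau|} - \frac{\sigma}{|\tau|}\right) \]

for $s = \sigma + i\tau$, $\sigma \ge 0$. Since $\arctan(r) \le r$ for $r \ge 0$, we conclude that

\[ |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} \le \sqrt{2\pi}\,|s|^{\sigma-\frac{1}{2}}\, e^{\frac{1}{12|s|}+\frac{\sqrt{2}}{180|s|^3}}. \tag{8.51} \]

Lastly, $|s|^{\sigma-1/2} = |\tau|^{\sigma-1/2}\,|1+i\sigma/\tau|^{\sigma-1/2}$. For $|\tau| \ge 20$,

\[
|1+i\sigma/\tau|^{\sigma-1/2} \le
\begin{cases}
1.000625 & \text{if } 0 \le \sigma \le 1, \\
1.007491 & \text{if } 1 \le \sigma \le 2, \\
1.028204 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\]

and

\[ e^{\frac{1}{12|\tau|}+\frac{\sqrt{2}}{180|\tau|^3}} \le 1.004177. \]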


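Thus

\[
|\Gamma(s)|\, e^{\frac{\pi}{2}\tau} \le |\tau|^{\sigma-1/2}\cdot
\begin{cases}
2.51868 & \text{if } 0 \le \sigma \le 1, \\
2.53596 & \text{if } 1 \le \sigma \le 2, \\
2.5881 & \text{if } 2 \le \sigma \le 3.
\end{cases}
\tag{8.52}
\]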
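The constants in (8.52) are products of $\sqrt{2\pi}$ with the two correction factors bounded just before it. As an arithmetic cross-check, the sketch below evaluates those factors at the worst case $\sigma = k+1$, $|\tau| = 20$ (the assumption being that the factors are largest there, which holds since each is nondecreasing in $\sigma$ on the given ranges and nonincreasing in $|\tau|$). The function name is hypothetical.

```python
import math


def gamma_factor_constant(sigma_max, tau_min=20.0):
    """sqrt(2 pi) * |1 + i sigma/tau|^(sigma - 1/2) * exp(1/(12 tau) + sqrt(2)/(180 tau^3))."""
    modulus_factor = (1 + (sigma_max / tau_min) ** 2) ** ((sigma_max - 0.5) / 2)
    stirling_factor = math.exp(
        1 / (12 * tau_min) + math.sqrt(2) / (180 * tau_min ** 3)
    )
    return math.sqrt(2 * math.pi) * modulus_factor * stirling_factor


if __name__ == "__main__":
    for sigma_max in (1, 2, 3):
        print(sigma_max, gamma_factor_constant(sigma_max))
```

The results should sit just below 2.51868, 2.53596 and 2.5881, the small slack coming from the intermediate roundings in the text.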

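Let us now estimate the constants $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ in (8.2). By $|\tau| \ge 20$,

\[ e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau} \le 0.015889, \qquad e^{-\frac{\tau}{6}} \le 0.035674. \tag{8.53} \]

Since $8\sin(\pi/8) = 3.061467\dots > 1$, we obtain that

\[
c_{1,\sigma,\tau} \le
\begin{cases}
1.30454 & \text{if } 0 \le \sigma \le 1, \\
1.58361 & \text{if } 1 \le \sigma \le 2, \\
1.98186 & \text{if } 2 \le \sigma \le 3,
\end{cases}
\qquad
c_{2,\sigma,\tau} \le
\begin{cases}
1.94511 & \text{if } 0 \le \sigma \le 1, \\
3.15692 & \text{if } 1 \le \sigma \le 2, \\
5.02186 & \text{if } 2 \le \sigma \le 3.
\end{cases}
\]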
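These six values can be reproduced by evaluating the expressions for $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ from the contour estimate (one half of the bracketed factors in the bounds on $|I|$) at $\sigma = k+1$ and $\tau = 20$. The sketch below assumes, as seems clear from the formulas, that both expressions are nondecreasing in $\sigma$ and nonincreasing in $\tau$, so that this corner is the worst case; `c1` and `c2` are illustrative names.

```python
import math


def c1(sigma, tau):
    """c_{1,sigma,tau}: the constant attached to the contour with beta = pi/8."""
    a = 2 ** 0.25 * (2 / (1 + math.sin(math.pi / 8) ** 2)) ** (sigma / 2)
    b = math.exp(-(math.sqrt(2) - 1) / 2 * tau) / math.tan(math.pi / 8) ** sigma
    return (1 + a + b) / 2


def c2(sigma, tau):
    """c_{2,sigma,tau}: the constant for rho <= 3/2 (beta = pi/5 or pi/6)."""
    a = min(
        2 ** (sigma + 0.5),
        math.sqrt(1 / math.cos(2 * math.pi / 5)) / math.sin(math.pi / 5) ** sigma,
    )
    b = math.exp(-tau / 6) / (1 / math.sqrt(3)) ** sigma
    return (1 + a + b) / 2


if __name__ == "__main__":
    for sigma in (1, 2, 3):
        print(sigma, c1(sigma, 20.0), c2(sigma, 20.0))
```

Each printed number should match the corresponding table entry once rounded up to the digits shown.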

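Lastly, note that, for $k \le \sigma \le k+1$, we have

\[ \frac{1}{\tau^{\sigma/2}}\cdot|\tau|^{\sigma-1/2} = |\tau|^{(\sigma-1)/2} \le \tau^{k/2}, \]

whereas, for $\rho \le 3/2$ and $0 \le \gamma \le 1$,

\[ \frac{|\tau|^{\gamma-1/2}}{|\ell|^{\gamma}} \le |\tau|^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{\tau}{\ell^2}\right)^{\gamma/2} \le 20^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{3/2}{4}\right)^{\gamma/2} \le \left(\frac{3}{8}\right)^{1/2}, \]

and so

\[ \frac{1}{\ell^{\sigma}}\cdot|\tau|^{\sigma-1/2} = \left(\frac{|\tau|}{\ell}\right)^{k}\frac{|\tau|^{\sigma-k-1/2}}{|\ell|^{\sigma-k}} \le \sqrt{\frac{3}{8}}\cdot\left(\frac{|\tau|}{\ell}\right)^{k}. \]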

Multiplying, and remembering to add (8.50), we obtain that, for $k = 0, 1, 2$, $\sigma \in [0,1]$ and $|\tau| \ge 20$,

\[
|F_\delta(s+k)| + |F_\delta((1-s)+k)| \le
\begin{cases}
\kappa_{k,0}\left(\frac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } \rho < 3/2, \\
\kappa_{k,1}\,|\tau|^{k}\, e^{-0.1598|\tau|} & \text{if } \rho \ge 3/2,
\end{cases}
\]

where

\[
\begin{aligned}
\kappa_{0,0} &\le (4\cdot 10^{-7} + 1.94511)\cdot 2.51868\cdot\sqrt{3/8} \le 3.001, \\
\kappa_{1,0} &\le (4\cdot 10^{-7} + 3.15692)\cdot 2.53596\cdot\sqrt{3/8} \le 4.903, \\
\kappa_{2,0} &\le (4\cdot 10^{-7} + 5.02186)\cdot 2.5881\cdot\sqrt{3/8} \le 7.96,
\end{aligned}
\]

and, similarly,

\[
\begin{aligned}
\kappa_{0,1} &\le (6\cdot 10^{-5} + 1.30454)\cdot 2.51868 \le 3.286, \\
\kappa_{1,1} &\le (6\cdot 10^{-5} + 1.58361)\cdot 2.53596 \le 4.017, \\
\kappa_{2,1} &\le (6\cdot 10^{-5} + 1.98186)\cdot 2.5881 \le 5.13.
\end{aligned}
\]

This concludes the proof of Corollary 8.0.2.
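The final constants $\kappa_{k,0}$, $\kappa_{k,1}$ are plain products of numbers already obtained, so the last step can be re-done mechanically. The sketch below simply multiplies the tabulated values; the dictionary layout and the function name `kappa` are choices made for this illustration.

```python
import math

# Tabulated constants from the text, indexed by k = 0, 1, 2.
C1 = [1.30454, 1.58361, 1.98186]     # bounds on c_{1,sigma,tau}
C2 = [1.94511, 3.15692, 5.02186]     # bounds on c_{2,sigma,tau}
GAMMA = [2.51868, 2.53596, 2.5881]   # constants from (8.52)
STATED = {
    (0, 0): 3.001, (1, 0): 4.903, (2, 0): 7.96,
    (0, 1): 3.286, (1, 1): 4.017, (2, 1): 5.13,
}


def kappa(k, regime):
    """Product defining kappa_{k,0} (rho < 3/2) or kappa_{k,1} (rho >= 3/2)."""
    if regime == 0:
        return (4e-7 + C2[k]) * GAMMA[k] * math.sqrt(3 / 8)
    return (6e-5 + C1[k]) * GAMMA[k]


if __name__ == "__main__":
    for (k, regime), stated in sorted(STATED.items()):
        print(k, regime, kappa(k, regime), stated)
```

Each computed product should come out no larger than the stated rounded constant.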

        Chapter 9

        Explicit formulas

An explicit formula is an expression restating a sum such as $S_{\eta,\chi}(\delta/x, x)$ as a sum of the Mellin transform $G_\delta(s)$ over the zeros of the $L$ function $L(s,\chi)$. More specifically, for us, $G_\delta(s)$ is the Mellin transform of $\eta(t)e(\delta t)$ for some smoothing function $\eta$ and some $\delta \in \mathbb{R}$. We want a formula whose error terms are good both for $\delta$ very close or equal to $0$ and for $\delta$ farther away from $0$. (Indeed, our choice(s) of $\eta$ will be made so that $F_\delta(s)$ decays rapidly in both cases.)

We will be able to base all of our work on a single, general explicit formula, namely, Lemma 9.1.1. This explicit formula has simple error terms given purely in terms of a few norms of the given smoothing function $\eta$. We also give a common framework for estimating the contribution of zeros on the critical strip (Lemmas 9.1.3 and 9.1.4).

The first example we work out is that of the Gaussian smoothing $\eta(t) = e^{-t^2/2}$. We actually do this in part for didactic purposes and in part because of its likely applicability elsewhere; for our applications, we will always use smoothing functions based on $te^{-t^2/2}$ and $t^2e^{-t^2/2}$, generally in combination with something else. Since $\eta(t) = e^{-t^2/2}$ does not vanish at $t = 0$, its Mellin transform has a pole at $s = 0$, something that requires some additional work (Lemma 9.1.2; see also the proof of Lemma 9.1.1).

Other than that, for each function $\eta(t)$, all that has to be done is to bound an integral (from Lemma 9.1.3) and bound a few norms. Still, both for $\eta_*$ and for $\eta_+$, we find a few interesting complications. Since $\eta_+$ is defined in terms of a truncation of a Mellin transform (or, alternatively, in terms of a multiplicative convolution with a Dirichlet kernel, as in (7.4) and (7.6)), bounding the norms of $\eta_+$ and $\eta_+'$ takes a little work. We leave this to Appendix A. The effect of the convolution is then just to delay the decay by a shift, in that a rapidly decaying function $f(\tau)$ will get replaced by $f(\tau - H)$, $H$ a constant.

The smoothing function $\eta_*$ is defined as a multiplicative convolution of $t^2e^{-t^2/2}$ with something else. Given that we have an explicit formula for $t^2e^{-t^2/2}$, we obtain an explicit formula for $\eta_*$ by what amounts to just exchanging the order of a sum and an integral. (We already went over this in the introduction, in (1.40).)


9.1 A general explicit formula

We will prove an explicit formula valid whenever the smoothing $\eta$ and its derivative $\eta'$ satisfy rather mild assumptions: they will be assumed to be $L_2$-integrable and to have strips of definition containing $\{s : 1/2 \le \Re(s) \le 3/2\}$, though any strip of the form $\{s : \epsilon \le \Re(s) \le 1 + \epsilon\}$ would do just as well.

(For explicit formulas with different sets of assumptions, see, e.g., [IK04, §5.5] and [MV07, Ch. 12].)

The main idea in deriving any explicit formula is to start with an expression giving a sum as an integral over a vertical line with an integrand involving a Mellin transform (here, $G_\delta(s)$) and an $L$-function (here, $L(s,\chi)$). We then shift the line of integration to the left. If stronger assumptions were made (as in Exercise 5 in [IK04, §5.5]), we could shift the integral all the way to $\Re(s) = -\infty$; the integral would then disappear, replaced entirely by a sum over zeros (or even, as in the same Exercise 5, by a particularly simple integral). Another possibility is to shift the line only to $\Re(s) = 1/2 + \epsilon$ for some $\epsilon > 0$, but this gives a weaker result, and at any rate the factor $L'(s,\chi)/L(s,\chi)$ can be large and messy to estimate within the critical strip $0 < \Re(s) < 1$.

Instead, we will shift the line to $\Re s = -1/2$. We can do this because the assumptions on $\eta$ and $\eta'$ are enough to continue $G_\delta(s)$ analytically up to there (with a possible pole at $s = 0$). The factor $L'(s,\chi)/L(s,\chi)$ is easy to estimate for $\Re s < 0$ and $s = 0$ (by the functional equation), and the part of the integral on $\Re s = -1/2$ coming from $G_\delta(s)$ can be estimated easily using the fact that the Mellin transform is an isometry.

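Lemma 9.1.1. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be in $C^1$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Assume that $\eta(t)$ and $\eta'(t)$ are in $\ell_2$ (with respect to the measure $dt$) and that $\eta(t)t^{\sigma-1}$ and $\eta'(t)t^{\sigma-1}$ are in $\ell_1$ (again with respect to $dt$) for all $\sigma$ in an open interval containing $[1/2, 3/2]$.

Then

\[ \sum_{n=1}^{\infty} \Lambda(n)\chi(n) e\!\left(\frac{\delta}{x} n\right) \eta(n/x) = I_{q=1}\cdot \widehat{\eta}(-\delta)x - \sum_{\rho} G_\delta(\rho)x^{\rho} - R + O^*\!\left((\log q + 6.01)\cdot(|\eta'|_2 + 2\pi|\delta||\eta|_2)\right) x^{-1/2}, \tag{9.1} \]

where

\[ I_{q=1} = \begin{cases} 1 & \text{if } q = 1, \\ 0 & \text{if } q \ne 1, \end{cases} \qquad R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1,\chi)}{L(1,\chi)}\right) + O^*(c_0) \tag{9.2} \]

for $q > 1$, $R = \eta(0)\log 2\pi$ for $q = 1$, and

\[ c_0 = \frac{2}{3}\, O^*\!\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + \left|\eta(t)\sqrt{t}\right|_1\right)\right). \tag{9.3} \]

The norms $|\eta|_2$, $|\eta'|_2$, $|\eta'(t)/\sqrt{t}|_1$, etc., are taken with respect to the usual measure $dt$. The sum $\sum_\rho$ is a sum over all non-trivial zeros $\rho$ of $L(s,\chi)$.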

Proof. Since (a) $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval containing $3/2$ and (b) $\eta(t)e(\delta t)$ has bounded variation (since $\eta, \eta' \in \ell_1$, implying that the derivative of $\eta(t)e(\delta t)$ is also in $\ell_1$), the Mellin inversion formula (as in, e.g., [IK04, 4.106]) holds:

\[ \eta(n/x)e(\delta n/x) = \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} G_\delta(s)x^{s}n^{-s}\,ds. \]

Since $G_\delta(s)$ is bounded for $\Re(s) = 3/2$ (by $\eta(t)t^{3/2-1} \in \ell_1$) and $\sum_n \Lambda(n)n^{-3/2}$ is bounded as well, we can change the order of summation and integration as follows:

\[ \begin{aligned} \sum_{n=1}^{\infty}\Lambda(n)\chi(n)e(\delta n/x)\eta(n/x) &= \sum_{n=1}^{\infty}\Lambda(n)\chi(n)\cdot\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} G_\delta(s)x^{s}n^{-s}\,ds \\ &= \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}\sum_{n=1}^{\infty}\Lambda(n)\chi(n)G_\delta(s)x^{s}n^{-s}\,ds \\ &= \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} -\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^{s}\,ds. \end{aligned} \tag{9.4} \]

(This is the way the procedure always starts: see, for instance, [HL22, Lemma 1] or, to look at a recent standard reference, [MV07, p. 144]. We are being very scrupulous about integration because we are working with general $\eta$.)

The first question we should ask ourselves is: up to where can we extend $G_\delta(s)$? Since $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval $I$ containing $[1/2, 3/2]$, the transform $G_\delta(s)$ is defined for $\Re(s)$ in the same interval $I$. However, we also know that the transformation rule $M(tf'(t))(s) = -s\cdot Mf(s)$ (see (2.10); by integration by parts) is valid when $s$ is in the holomorphy strip for both $M(tf'(t))$ and $Mf$. In our case ($f(t) = \eta(t)e(\delta t)$), this happens when $\Re(s) \in (I - 1)\cap I$ (so that both sides of the equation in the rule are defined). Hence $s\cdot G_\delta(s)$ (which equals $s\cdot Mf(s)$) can be analytically continued to $\Re(s)$ in $(I - 1)\cup I$, which is an open interval containing $[-1/2, 3/2]$. This implies immediately that $G_\delta(s)$ can be analytically continued to the same region, with a possible pole at $s = 0$.

When does $G_\delta(s)$ have a pole at $s = 0$? This happens when $sG_\delta(s)$ is non-zero at $s = 0$, i.e., when $M(tf'(t))(0) \ne 0$ for $f(t) = \eta(t)e(\delta t)$. Now

\[ M(tf'(t))(0) = \int_0^\infty f'(t)\,dt = \lim_{t\to\infty} f(t) - f(0). \]

We already know that $f'(t) = (d/dt)(\eta(t)e(\delta t))$ is in $\ell_1$. Hence $\lim_{t\to\infty} f(t)$ exists, and must be 0 because $f$ is in $\ell_1$. Hence $-M(tf'(t))(0) = f(0) = \eta(0)$.

Let us look at the next term in the Laurent expansion of $G_\delta(s)$ at $s = 0$. It is

\[ \begin{aligned} \lim_{s\to 0}\frac{sG_\delta(s) - \eta(0)}{s} &= \lim_{s\to 0}\frac{-M(tf'(t))(s) - f(0)}{s} = -\lim_{s\to 0}\frac{1}{s}\int_0^\infty f'(t)(t^{s} - 1)\,dt \\ &= -\int_0^\infty f'(t)\lim_{s\to 0}\frac{t^{s}-1}{s}\,dt = -\int_0^\infty f'(t)\log t\,dt. \end{aligned} \]

Here we were able to exchange the limit and the integral because $f'(t)t^{\sigma}$ is in $\ell_1$ for $\sigma$ in a neighborhood of 0; in turn, this is true because $f'(t) = (\eta'(t) + 2\pi i\delta\eta(t))e(\delta t)$ and $\eta'(t)t^{\sigma}$ and $\eta(t)t^{\sigma}$ are both in $\ell_1$ for $\sigma$ in a neighborhood of 0. In fact, we will use the easy bounds $|\eta(t)\log t|_1 \le (2/3)(|\eta(t)t^{-1/2}|_1 + |\eta(t)t^{1/2}|_1)$, $|\eta'(t)\log t|_1 \le (2/3)(|\eta'(t)t^{-1/2}|_1 + |\eta'(t)t^{1/2}|_1)$, resulting from the inequality

\[ |\log t| \le \frac{2}{3}\left(t^{-\frac{1}{2}} + t^{\frac{1}{2}}\right), \tag{9.5} \]

valid for all $t > 0$.

We conclude that the Laurent expansion of $G_\delta(s)$ at $s = 0$ is

\[ G_\delta(s) = \frac{\eta(0)}{s} + c_0 + c_1 s + \dots, \tag{9.6} \]

where

\[ c_0 = O^*(|f'(t)\log t|_1) = \frac{2}{3}\,O^*\!\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1 + \left|\eta'(t)\sqrt{t}\right|_1 + 2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1 + \left|\eta(t)\sqrt{t}\right|_1\right)\right). \]
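Inequality (9.5) is tight: writing $t = e^{2u}$, it reads $|u| \le (2/3)\cosh u$, and the two sides come within about $0.015$ of each other near $u \approx 1.2$. As a sanity check (a sketch only, on a finite grid chosen by us, and no substitute for the one-line calculus argument), one can sample the gap numerically:

```python
import math

def log_bound_gap(t):
    """Return (2/3)(t^{-1/2} + t^{1/2}) - |log t|; inequality (9.5) says this is >= 0."""
    return (2.0 / 3.0) * (t ** -0.5 + t ** 0.5) - abs(math.log(t))

# Sample t on a logarithmic grid from 1e-6 to 1e6 (grid choice is ours).
grid = [10 ** (k / 1000.0) for k in range(-6000, 6001)]
min_gap, argmin = min((log_bound_gap(t), t) for t in grid)
print(min_gap, argmin)
```

The minimum over the grid is small but positive, which is why the constant $2/3$ cannot be lowered by much.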

We shift the line of integration in (9.4) to $\Re(s) = -1/2$. We obtain

\[ \frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty} -\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^{s}\,ds = I_{q=1}G_\delta(1)x - \sum_\rho G_\delta(\rho)x^{\rho} - R - \frac{1}{2\pi i}\int_{-1/2-i\infty}^{-1/2+i\infty}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^{s}\,ds, \tag{9.7} \]

where

\[ R = \operatorname{Res}_{s=0}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s). \]

Of course,

\[ G_\delta(1) = M(\eta(t)e(\delta t))(1) = \int_0^\infty \eta(t)e(\delta t)\,dt = \widehat{\eta}(-\delta). \]

Let us work out the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s = 0$. By the functional equation (as in, e.g., [IK04, Thm. 4.15]),

\[ \frac{L'(s,\chi)}{L(s,\chi)} = \log\frac{\pi}{q} - \frac{1}{2}\psi\!\left(\frac{s+\kappa}{2}\right) - \frac{1}{2}\psi\!\left(\frac{1-s+\kappa}{2}\right) - \frac{L'(1-s,\chi)}{L(1-s,\chi)}, \tag{9.8} \]

where $\psi(s) = \Gamma'(s)/\Gamma(s)$ and

\[ \kappa = \begin{cases} 0 & \text{if } \chi(-1) = 1, \\ 1 & \text{if } \chi(-1) = -1. \end{cases} \]

By $\psi(1-x) - \psi(x) = \pi\cot\pi x$ (immediate from $\Gamma(s)\Gamma(1-s) = \pi/\sin\pi s$) and $\psi(s) + \psi(s + 1/2) = 2(\psi(2s) - \log 2)$ (Legendre; [AS64, (6.3.8)]),

\[ -\frac{1}{2}\left(\psi\!\left(\frac{s+\kappa}{2}\right) + \psi\!\left(\frac{1-s+\kappa}{2}\right)\right) = -\psi(1-s) + \log 2 + \frac{\pi}{2}\cot\frac{\pi(s+\kappa)}{2}. \tag{9.9} \]

Hence, unless $q = 1$, the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s = 0$ is

\[ \frac{1-\kappa}{s} + \left(\log\frac{2\pi}{q} - \psi(1) - \frac{L'(1,\chi)}{L(1,\chi)}\right) + a_1 s + a_2 s^2 + \dots \]

Here $\psi(1) = -\gamma$, the Euler gamma constant [AS64, (6.3.2)].

There is a special case for $q = 1$ due to the pole of $\zeta(s)$ at $s = 1$. We know that $\zeta'(0)/\zeta(0) = \log 2\pi$ (see, e.g., [MV07, p. 331]).

From this and (9.6), we conclude that, if $\eta(0) = 0$, then

\[ R = \begin{cases} c_0 & \text{if } q > 1 \text{ and } \chi(-1) = 1, \\ 0 & \text{otherwise,} \end{cases} \]

where $c_0 = O^*(|\eta'(t)\log t|_1 + 2\pi|\delta||\eta(t)\log t|_1)$. If $\eta(0) \ne 0$, then

\[ R = \eta(0)\left(\log\frac{2\pi}{q} + \gamma - \frac{L'(1,\chi)}{L(1,\chi)}\right) + \begin{cases} c_0 & \text{if } \chi(-1) = 1, \\ 0 & \text{otherwise} \end{cases} \]

for $q > 1$, and $R = \eta(0)\log 2\pi$ for $q = 1$.

It is time to estimate the integral on the right side of (9.7). For that, we will need to estimate $L'(s,\chi)/L(s,\chi)$ for $\Re(s) = -1/2$ using (9.8) and (9.9).

If $\Re(z) = 3/2$, then $|t^2 + z^2| \ge 9/4$ for all real $t$. Hence, by [OLBC10, (5.9.15)] and [GR94, (3.411.1)],

\[ \begin{aligned} \psi(z) &= \log z - \frac{1}{2z} - 2\int_0^\infty \frac{t\,dt}{(t^2+z^2)(e^{2\pi t}-1)} \\ &= \log z - \frac{1}{2z} + 2\cdot O^*\!\left(\int_0^\infty \frac{t\,dt}{\frac{9}{4}(e^{2\pi t}-1)}\right) \\ &= \log z - \frac{1}{2z} + \frac{8}{9}\,O^*\!\left(\int_0^\infty \frac{t\,dt}{e^{2\pi t}-1}\right) \\ &= \log z - \frac{1}{2z} + \frac{8}{9}\cdot O^*\!\left(\frac{1}{(2\pi)^2}\Gamma(2)\zeta(2)\right) \\ &= \log z - \frac{1}{2z} + O^*\!\left(\frac{1}{27}\right) = \log z + O^*\!\left(\frac{10}{27}\right). \end{aligned} \tag{9.10} \]

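Thus, in particular, $\psi(1-s) = \log(3/2 - i\tau) + O^*(10/27)$, where we write $s = -1/2 + i\tau$. Now

\[ \left|\cot\frac{\pi(s+\kappa)}{2}\right| = \left|\frac{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} + e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}{e^{\mp\frac{\pi}{4}i - \frac{\pi}{2}\tau} - e^{\pm\frac{\pi}{4}i + \frac{\pi}{2}\tau}}\right| = 1. \]

Since $\Re(s) = -1/2$, a comparison of Dirichlet series gives

\[ \left|\frac{L'(1-s,\chi)}{L(1-s,\chi)}\right| \le \frac{|\zeta'(3/2)|}{|\zeta(3/2)|} \le 1.50524, \tag{9.11} \]

where $\zeta'(3/2)$ and $\zeta(3/2)$ can be evaluated by Euler-Maclaurin. Therefore, (9.8) and (9.9) give us that, for $s = -1/2 + i\tau$,

\[ \begin{aligned} \left|\frac{L'(s,\chi)}{L(s,\chi)}\right| &\le \left|\log\frac{q}{\pi}\right| + \log\left|\frac{3}{2} + i\tau\right| + \frac{10}{27} + \log 2 + \frac{\pi}{2} + 1.50524 \\ &\le \left|\log\frac{q}{\pi}\right| + \frac{1}{2}\log\left(\tau^2 + \frac{9}{4}\right) + 4.1396. \end{aligned} \tag{9.12} \]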
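The absolute constants in (9.10), (9.11) and (9.12) can be reproduced in a few lines. The sketch below is only illustrative: the helper `zeta_and_derivative` is a hypothetical name for a textbook Euler-Maclaurin evaluation (partial sum up to $n$ plus the first tail corrections), run in ordinary floating point rather than with rigorous error control.

```python
import math

def zeta_and_derivative(s, n=1000):
    """Approximate (zeta(s), zeta'(s)) for real s > 1: partial sums plus Euler-Maclaurin tails."""
    ln = math.log(n)
    partial = sum(k ** -s for k in range(1, n))
    partial_d = sum(math.log(k) * k ** -s for k in range(2, n))
    # Tail of sum_{k >= n} k^{-s}: integral + f(n)/2 - f'(n)/12.
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    # Same recipe for f(t) = log(t) t^{-s}.
    tail_d = (n ** (1 - s) * (ln / (s - 1) + 1 / (s - 1) ** 2)
              + 0.5 * ln * n ** -s
              - n ** (-s - 1) * (1 - s * ln) / 12)
    return partial + tail, -(partial_d + tail_d)

# (9.10): (8/9) Gamma(2) zeta(2) / (2 pi)^2 = 1/27, and |1/(2z)| <= 1/3 on Re(z) = 3/2.
tail_910 = (8 / 9) * math.gamma(2) * (math.pi ** 2 / 6) / (2 * math.pi) ** 2
total_910 = tail_910 + 1 / 3

# (9.11): ratio |zeta'(3/2)| / zeta(3/2).
z, dz = zeta_and_derivative(1.5)
ratio_911 = abs(dz) / z

# (9.12): sum of the absolute constants.
const_912 = 10 / 27 + math.log(2) + math.pi / 2 + 1.50524
print(tail_910, total_910, ratio_911, const_912)
```

The values come out as $1/27$, $10/27$, a ratio just under $1.50524$, and a total just under $4.1396$, in agreement with the text.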

Recall that we must bound the integral on the right side of (9.7). The absolute value of the integral is at most $x^{-1/2}$ times

\[ \frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)\right| |ds|. \tag{9.13} \]

By Cauchy-Schwarz, this is at most

\[ \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2 |ds|}\cdot\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2\,|ds|}. \]

By (9.12),

\[ \begin{aligned} \sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2 |ds|} &\le \sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{\log q}{s}\right|^2 |ds|} + \sqrt{\int_{-\infty}^{\infty}\frac{\left|\frac{1}{2}\log\left(\tau^2+\frac{9}{4}\right) + 4.1396 + \log\pi\right|^2}{\frac{1}{4}+\tau^2}\,d\tau} \\ &\le \sqrt{2\pi}\log q + \sqrt{226.844}, \end{aligned} \]

where we compute the last integral numerically.¹

Again, we use the fact that, by (2.10), $sG_\delta(s)$ is the Mellin transform of

\[ -t\,\frac{d(e(\delta t)\eta(t))}{dt} = -2\pi i\delta t e(\delta t)\eta(t) - t e(\delta t)\eta'(t). \tag{9.14} \]

Hence, by Plancherel (as in (2.6)),

\[ \begin{aligned} \sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2\,|ds|} &= \sqrt{\int_0^\infty\left|-2\pi i\delta t e(\delta t)\eta(t) - t e(\delta t)\eta'(t)\right|^2 t^{-2}\,dt} \\ &\le 2\pi|\delta|\sqrt{\int_0^\infty|\eta(t)|^2\,dt} + \sqrt{\int_0^\infty|\eta'(t)|^2\,dt}. \end{aligned} \tag{9.15} \]

Thus, (9.13) is at most

\[ \left(\log q + \sqrt{\frac{226.844}{2\pi}}\right)\cdot\left(|\eta'|_2 + 2\pi|\delta||\eta|_2\right). \]

¹ By a rigorous integration from $\tau = -100000$ to $\tau = 100000$ using VNODE-LP [Ned06], which runs on the PROFIL/BIAS interval arithmetic package [Knu99].
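The constant $226.844$ comes from a rigorous interval-arithmetic computation. For a quick, non-rigorous cross-check, the substitution $\tau = \frac{1}{2}\tan\theta$ turns $d\tau/(\frac14+\tau^2)$ into $2\,d\theta$ on $(-\pi/2,\pi/2)$, after which a plain midpoint rule suffices. The sketch below is our own illustration in ordinary floating point (the function name and the number of nodes are arbitrary choices); it integrates over the whole real line, so its value should sit marginally above the truncated one quoted in the text.

```python
import math

def l2_weight_integral(n=200_000):
    """Midpoint-rule value of
        int_{-inf}^{inf} ((1/2) log(tau^2 + 9/4) + 4.1396 + log(pi))^2 / (1/4 + tau^2) d tau
    using tau = tan(theta)/2, under which d tau / (1/4 + tau^2) = 2 d theta."""
    shift = 4.1396 + math.log(math.pi)
    step = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * step   # midpoints avoid the endpoints
        tau = 0.5 * math.tan(theta)
        h = 0.5 * math.log(tau * tau + 2.25) + shift
        total += h * h
    return 2.0 * total * step

value = l2_weight_integral()
print(value, math.sqrt(value / (2 * math.pi)))
```

The second printed number is the constant added to $\log q$ in the bound for (9.13); it should come out slightly below the value $6.01$ used in (9.1).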

Lemma 9.1.1 leaves us with three tasks: bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with small imaginary part, bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with large imaginary part, and bounding $L'(1,\chi)/L(1,\chi)$. Let us start with the last task: while, in a narrow sense, it is optional (in that, in the applications we actually need (Thm. 7.1.2, Cor. 7.1.3 and Thm. 7.1.4), we will have $\eta(0) = 0$, thus making the term $L'(1,\chi)/L(1,\chi)$ disappear), it is also very easy and can be dealt with quickly.

Since we will be using a finite GRH check in all later applications, we might as well use it here.

Lemma 9.1.2. Let $\chi$ be a primitive character mod $q$, $q > 1$. Assume that all non-trivial zeroes $\rho = \sigma + it$ of $L(s,\chi)$ with $|t| \le 5/8$ satisfy $\Re(\rho) = 1/2$. Then

\[ \left|\frac{L'(1,\chi)}{L(1,\chi)}\right| \le \frac{5}{2}\log M(q) + c, \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$ and

\[ c = 5\log\frac{2\sqrt{3}}{\zeta(9/4)/\zeta(9/8)} = 15.07016. \]

Proof. By a lemma of Landau's (see, e.g., [MV07, Lemma 6.3], where the constants are easily made explicit) based on the Borel-Carathéodory Lemma (as in [MV07, Lemma 6.2]), any function $f$ analytic and zero-free on a disc $C_{s_0,R} = \{s : |s - s_0| \le R\}$ of radius $R > 0$ around $s_0$ satisfies

\[ \frac{f'(s)}{f(s)} = O^*\!\left(\frac{2R\log\left(M/|f(s_0)|\right)}{(R-r)^2}\right) \tag{9.16} \]

for all $s$ with $|s - s_0| \le r$, where $0 < r < R$ and $M$ is the maximum of $|f(z)|$ on $C_{s_0,R}$. Assuming $L(s,\chi)$ has no non-trivial zeros off the critical line with $|\Im(s)| \le H$, where $H > 1/2$, we set $s_0 = 1/2 + H$, $r = H - 1/2$, and let $R \to H^-$. We obtain

\[ \frac{L'(1,\chi)}{L(1,\chi)} = O^*\!\left(8H\log\frac{\max_{s\in C_{s_0,H}}|L(s,\chi)|}{|L(s_0,\chi)|}\right). \tag{9.17} \]

Now

\[ |L(s_0,\chi)| \ge \prod_p (1 + p^{-s_0})^{-1} = \prod_p \frac{(1 - p^{-2s_0})^{-1}}{(1 - p^{-s_0})^{-1}} = \frac{\zeta(2s_0)}{\zeta(s_0)}. \]

Since $s_0 = 1/2 + H$, $C_{s_0,H}$ is contained in $\{s \in \mathbb{C} : \Re(s) > 1/2\}$ for any value of $H$. We choose (somewhat arbitrarily) $H = 5/8$.

By partial summation, for $s = \sigma + it$ with $1/2 \le \sigma < 1$ and any $N \in \mathbb{Z}^+$,

\[ \begin{aligned} L(s,\chi) &= \sum_{n\le N}\chi(n)n^{-s} - \left(\sum_{m\le N}\chi(m)\right)(N+1)^{-s} + \sum_{n\ge N+1}\left(\sum_{m\le n}\chi(m)\right)\left(n^{-s} - (n+1)^{-s}\right) \\ &= O^*\!\left(\frac{N^{1-1/2}}{1 - 1/2} + N^{1-\sigma} + M(q)N^{-\sigma}\right), \end{aligned} \tag{9.18} \]

where $M(q) = \max_n\left|\sum_{m\le n}\chi(m)\right|$. We set $N = M(q)/3$, and obtain

\[ |L(s,\chi)| \le 2M(q)N^{-1/2} = 2\sqrt{3}\sqrt{M(q)}. \tag{9.19} \]

We put this into (9.17) and are done.
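With $H = 5/8$ we have $s_0 = 9/8$, so (9.17), (9.19) and the lower bound $\zeta(9/4)/\zeta(9/8)$ combine into $\frac{5}{2}\log M(q) + 5\log\bigl(2\sqrt3\,\zeta(9/8)/\zeta(9/4)\bigr)$. The constant can be recomputed as follows; this is a sketch under our own choices (the helper name `zeta_em`, a partial sum up to $1000$, and floating point instead of interval arithmetic).

```python
import math

def zeta_em(s, n=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus Euler-Maclaurin tail corrections."""
    partial = sum(k ** -s for k in range(1, n))
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return partial + tail

H = 5 / 8
s0 = 0.5 + H                               # s0 = 9/8
lower = zeta_em(2 * s0) / zeta_em(s0)      # lower bound zeta(2 s0)/zeta(s0) for |L(s0, chi)|
c = 8 * H * math.log(2 * math.sqrt(3) / lower)
print(zeta_em(s0), zeta_em(2 * s0), c)
```

The printed value of `c` agrees with the constant $15.07016$ in the statement of Lemma 9.1.2 to the digits shown there.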

Let $M(q)$ be as in the statement of Lem. 9.1.2. Since the sum of $\chi(n)$ ($\chi$ mod $q$, $q > 1$) over any interval of length $q$ is 0, it is easy to see that $M(q) \le q/2$. We also have the following explicit version of the Pólya-Vinogradov inequality:

\[ M(q) \le \begin{cases} \dfrac{2}{\pi^2}\sqrt{q}\log q + \dfrac{4}{\pi^2}\sqrt{q}\log\log q + \dfrac{3}{2}\sqrt{q} & \text{if } \chi(-1) = 1, \\[1.5ex] \dfrac{1}{2\pi}\sqrt{q}\log q + \dfrac{1}{\pi}\sqrt{q}\log\log q + \sqrt{q} & \text{if } \chi(-1) = -1. \end{cases} \tag{9.20} \]

Taken together with $M(q) \le q/2$, this implies that

\[ M(q) \le q^{4/5} \tag{9.21} \]

for all $q \ge 1$, and also that

\[ M(q) \le 2q^{3/5} \tag{9.22} \]

for all $q \ge 1$.

Notice, lastly, that

\[ \left|\log\frac{2\pi}{q} + \gamma\right| \le \log q + \log\frac{e^{\gamma}\cdot 2\pi}{3^2} \]

for all $q \ge 3$. (There are no primitive characters modulo 2, so we can omit $q = 2$.)

We conclude that, for $\chi$ primitive and non-trivial,

\[ \begin{aligned} \left|\log\frac{2\pi}{q} + \gamma - \frac{L'(1,\chi)}{L(1,\chi)}\right| &\le \log\frac{e^{\gamma}\cdot 2\pi}{3^2} + \log q + \frac{5}{2}\log q^{4/5} + 15.07017 \\ &\le 3\log q + 15.289. \end{aligned} \]
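The passage from (9.20) and $M(q) \le q/2$ to (9.21) and (9.22) is an elementary comparison: the trivial bound wins for small $q$ and Pólya-Vinogradov for large $q$, with the crossover near $q \approx 30$. The sketch below checks this on a finite range only (moduli $3 \le q \le 10^5$, a cutoff chosen by us; beyond it the comparison only improves, since the bound (9.20) grows like $\sqrt{q}\log q$), and also recomputes the constant $15.289$. The function names are illustrative.

```python
import math

def polya_vinogradov(q, even=True):
    """Right-hand side of (9.20) for a primitive character mod q (q >= 3)."""
    rq, lq = math.sqrt(q), math.log(q)
    llq = math.log(lq)
    if even:   # chi(-1) = 1
        return (2 / math.pi ** 2) * rq * lq + (4 / math.pi ** 2) * rq * llq + 1.5 * rq
    return rq * lq / (2 * math.pi) + rq * llq / math.pi + rq   # chi(-1) = -1

def m_bound(q):
    """Best bound on M(q) from the text: trivial q/2, or the larger of the two cases of (9.20)."""
    return min(q / 2, max(polya_vinogradov(q, True), polya_vinogradov(q, False)))

worst_45 = max(m_bound(q) / q ** 0.8 for q in range(3, 100_001))        # (9.21)
worst_35 = max(m_bound(q) / (2 * q ** 0.6) for q in range(3, 100_001))  # (9.22)
euler_gamma = 0.5772156649015329
const = math.log(math.exp(euler_gamma) * 2 * math.pi / 9) + 15.07017
print(worst_45, worst_35, const)
```

Both ratios stay below $1$ on the tested range, and `const` is just under $15.289$.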

Obviously, 15.289 is more than $\log 2\pi$, the bound for $\chi$ trivial. Hence, the absolute value of the quantity $R$ in the statement of Lemma 9.1.1 is at most

\[ |\eta(0)|(3\log q + 15.289) + |c_0| \tag{9.23} \]

for all primitive $\chi$.

It now remains to bound the sum $\sum_\rho G_\delta(\rho)x^{\rho}$ in (9.1). Clearly

\[ \left|\sum_\rho G_\delta(\rho)x^{\rho}\right| \le \sum_\rho |G_\delta(\rho)|\cdot x^{\Re(\rho)}. \]

Recall that these are sums over the non-trivial zeros $\rho$ of $L(s,\chi)$.

We first prove a general lemma on sums of values of functions on the non-trivial zeros of $L(s,\chi)$. This is little more than partial summation, given a (classical) bound for the number of zeroes $N(T,\chi)$ of $L(s,\chi)$ with $|\Im(s)| \le T$. The error term becomes particularly simple if $f$ is real-valued and decreasing; the statement is then practically identical to that of [Leh66, Lemma 1] (for $\chi$ principal), except for the fact that the error term is improved here.

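Lemma 9.1.3. Let $f : \mathbb{R}^+ \to \mathbb{C}$ be piecewise $C^1$. Assume $\lim_{t\to\infty} f(t)\,t\log t = 0$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$; let $\rho$ denote the non-trivial zeros of $L(s,\chi)$. Then, for any $y \ge 1$,

\[ \begin{aligned} \sum_{\substack{\rho \text{ non-trivial} \\ \Im(\rho) > y}} f(\Im(\rho)) &= \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT \\ &\quad + \frac{1}{2}\,O^*\!\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \end{aligned} \tag{9.24} \]

where

\[ g_\chi(T) = 0.5\log qT + 17.7. \tag{9.25} \]

If $f$ is real-valued and decreasing on $[y,\infty)$, the second line of (9.24) equals

\[ O^*\!\left(\frac{1}{4}\int_y^\infty \frac{f(T)}{T}\,dT\right). \]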

Proof. Write $N(T,\chi)$ for the number of non-trivial zeros of $L(s,\chi)$ satisfying $|\Im(s)| \le T$. Write $N^+(T,\chi)$ for the number of (necessarily non-trivial) zeros of $L(s,\chi)$ with $0 < \Im(s) \le T$. Then, for any $f : \mathbb{R}^+ \to \mathbb{C}$ with $f$ piecewise differentiable and $\lim_{t\to\infty} f(t)N(t,\chi) = 0$,

\[ \begin{aligned} \sum_{\rho:\,\Im(\rho) > y} f(\Im(\rho)) &= \int_y^\infty f(T)\,dN^+(T,\chi) \\ &= -\int_y^\infty f'(T)\left(N^+(T,\chi) - N^+(y,\chi)\right)dT \\ &= -\frac{1}{2}\int_y^\infty f'(T)\left(N(T,\chi) - N(y,\chi)\right)dT. \end{aligned} \]

Now, by [Ros41, Thms. 17–19] and [McC84a, Thm. 2.1] (see also [Tru, Thm. 1]),

\[ N(T,\chi) = \frac{T}{\pi}\log\frac{qT}{2\pi e} + O^*\left(g_\chi(T)\right) \tag{9.26} \]

for $T \ge 1$, where $g_\chi(T)$ is as in (9.25). (This is a classical formula; the references serve to prove the explicit form (9.25) for the error term $g_\chi(T)$.)

Thus, for $y \ge 1$,

\[ \begin{aligned} \sum_{\rho:\,\Im(\rho) > y} f(\Im(\rho)) &= -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT \\ &\quad + \frac{1}{2}\,O^*\!\left(|f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT\right). \end{aligned} \tag{9.27} \]

Here

\[ -\frac{1}{2}\int_y^\infty f'(T)\left(\frac{T}{\pi}\log\frac{qT}{2\pi e} - \frac{y}{\pi}\log\frac{qy}{2\pi e}\right)dT = \frac{1}{2\pi}\int_y^\infty f(T)\log\frac{qT}{2\pi}\,dT, \tag{9.28} \]

by integration by parts, since $\frac{d}{dT}\left(T\log\frac{qT}{2\pi e}\right) = \log\frac{qT}{2\pi}$.

If $f$ is real-valued and decreasing (and so, by $\lim_{t\to\infty} f(t) = 0$, non-negative),

\[ \begin{aligned} |f(y)|g_\chi(y) + \int_y^\infty |f'(T)|\cdot g_\chi(T)\,dT &= f(y)g_\chi(y) - \int_y^\infty f'(T)g_\chi(T)\,dT \\ &= 0.5\int_y^\infty \frac{f(T)}{T}\,dT, \end{aligned} \]

since $g_\chi'(T) \le 0.5/T$ for all $T \ge y$.
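To see how generous the error term (9.25) is in practice, one can test (9.26) for $q = 1$, i.e., for $\zeta(s)$, at a small height. The sketch below uses the well-known imaginary parts of the first ten non-trivial zeros of $\zeta(s)$ (rounded to six decimals; the eleventh lies above $52$), so that $N(50, \chi) = 20$ when zeros are counted with both signs. The function names are our own.

```python
import math

# Imaginary parts of the first ten non-trivial zeros of zeta(s), rounded (standard tabulated values).
ZETA_ZEROS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
              37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def main_term(T, q=1):
    """Main term (T/pi) log(qT/(2 pi e)) of (9.26)."""
    return (T / math.pi) * math.log(q * T / (2 * math.pi * math.e))

def g_chi(T, q=1):
    """Explicit error term (9.25)."""
    return 0.5 * math.log(q * T) + 17.7

T = 50.0
count = 2 * sum(1 for gamma in ZETA_ZEROS if gamma <= T)   # zeros with |Im| <= T, both signs
print(count, main_term(T), g_chi(T))
```

At this height the true discrepancy is about $3$, far inside the allowed $g_\chi(50) \approx 19.7$; the constant $17.7$ is there to cover all $q$ and $T$ uniformly.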

Let us bound the part of the sum $\sum_\rho G_\delta(\rho)x^{\rho}$ corresponding to $\rho$ with bounded $|\Im(\rho)|$. The bound we will give is proportional to $\sqrt{T_0}\log qT_0$, whereas a very naive approach (based on the trivial bound $|G_\delta(\sigma + i\tau)| \le |G_0(\sigma)|$) would give a bound proportional to $T_0\log qT_0$.

We could obtain a bound proportional to $\sqrt{T_0}\log qT_0$ for $\eta(t) = t^{k}e^{-t^2/2}$ by using Theorem 8.0.1. Instead, we will give a bound of that same quality valid for $\eta$ essentially arbitrary simply by using the fact that the Mellin transform is an isometry (preceded by an application of Cauchy-Schwarz).

Lemma 9.1.4. Let $\eta : \mathbb{R}_0^+ \to \mathbb{R}$ be such that both $\eta(t)$ and $(\log t)\eta(t)$ lie in $L_1 \cap L_2$ and $\eta(t)/\sqrt{t}$ lies in $L_1$ (with respect to $dt$). Let $\delta \in \mathbb{R}$. Let $G_\delta(s)$ be the Mellin transform of $\eta(t)e(\delta t)$.

Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Let $T_0 \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Then

\[ \sum_{\substack{\rho \text{ non-trivial} \\ |\Im(\rho)| \le T_0}} |G_\delta(\rho)| \]

is at most

\[ (|\eta|_2 + |\eta\cdot\log|_2)\sqrt{T_0}\log qT_0 + \left(17.21|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})|\eta|_2\right)\sqrt{T_0} + \left|\eta(t)/\sqrt{t}\right|_1\cdot(1.32\log q + 34.5). \tag{9.29} \]

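Proof. For $s = 1/2 + i\tau$, we have the trivial bound

\[ |G_\delta(s)| \le \int_0^\infty |\eta(t)|t^{1/2}\,\frac{dt}{t} = \left|\eta(t)/\sqrt{t}\right|_1, \tag{9.30} \]

where $F_\delta$ is as in (9.47). We also have the trivial bound

\[ |G_\delta'(s)| = \left|\int_0^\infty (\log t)\eta(t)t^{s}\,\frac{dt}{t}\right| \le \int_0^\infty |(\log t)\eta(t)|t^{\sigma}\,\frac{dt}{t} = \left|(\log t)\eta(t)t^{\sigma-1}\right|_1 \tag{9.31} \]

for $s = \sigma + i\tau$.

Let us start by bounding the contribution of very low-lying zeros ($|\Im(\rho)| \le 1$). By (9.26) and (9.25),

\[ N(1,\chi) = \frac{1}{\pi}\log\frac{q}{2\pi e} + O^*(0.5\log q + 17.7) = O^*(0.819\log q + 16.8). \]

Therefore,

\[ \sum_{\substack{\rho \text{ non-trivial} \\ |\Im(\rho)| \le 1}} |G_\delta(\rho)| \le \left|\eta(t)t^{-1/2}\right|_1\cdot(0.819\log q + 16.8). \]

Let us now consider zeros $\rho$ with $|\Im(\rho)| > 1$. Apply Lemma 9.1.3 with $y = 1$ and

\[ f(t) = \begin{cases} |G_\delta(1/2 + it)| & \text{if } t \le T_0, \\ 0 & \text{if } t > T_0. \end{cases} \]

This gives us that

\[ \begin{aligned} \sum_{\rho:\,1 < |\Im(\rho)| \le T_0} f(\Im(\rho)) &= \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \\ &\quad + O^*\!\left(|f(1)|g_\chi(1) + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT\right), \end{aligned} \tag{9.32} \]

where we are using the fact that $f(\sigma + i\tau) = f(\sigma - i\tau)$ (because $\eta$ is real-valued). By Cauchy-Schwarz,

\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\frac{1}{\pi}\int_1^{T_0}|f(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT}. \]

Now

\[ \frac{1}{\pi}\int_1^{T_0}|f(T)|^2\,dT \le \frac{1}{2\pi}\int_{-\infty}^{\infty}\left|G_\delta\!\left(\frac{1}{2} + iT\right)\right|^2 dT \le \int_0^\infty |e(\delta t)\eta(t)|^2\,dt = |\eta|_2^2 \]

by Plancherel (as in (2.6)). We also have

\[ \int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2 dT \le \frac{2\pi}{q}\int_0^{\frac{qT_0}{2\pi}}(\log t)^2\,dt \le \left(\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1\right)\cdot T_0. \]

Hence

\[ \frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT \le \sqrt{\left(\log\frac{qT_0}{2\pi e}\right)^2 + 1}\cdot|\eta|_2\sqrt{T_0}. \]

Again by Cauchy-Schwarz,

\[ \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \le \sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2\,dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}|g_\chi(T)|^2\,dT}. \]

Since $|f'(T)| = |G_\delta'(1/2 + iT)|$ and $(M\eta)'(s)$ is the Mellin transform of $\log(t)\cdot e(\delta t)\eta(t)$ (by (2.10)),

\[ \frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2\,dT = |\eta(t)\log(t)|_2^2. \]

Much as before,

\[ \begin{aligned} \int_1^{T_0}|g_\chi(T)|^2\,dT &\le \int_0^{T_0}(0.5\log qT + 17.7)^2\,dT \\ &= \left(0.25(\log qT_0)^2 + 17.2(\log qT_0) + 296.09\right)T_0. \end{aligned} \]

Summing, we obtain

\[ \begin{aligned} &\frac{1}{\pi}\int_1^{T_0} f(T)\log\frac{qT}{2\pi}\,dT + \int_1^\infty |f'(T)|\cdot g_\chi(T)\,dT \\ &\qquad \le \left(\left(\log\frac{qT_0}{2\pi e} + \frac{1}{2}\right)|\eta|_2 + \left(\frac{\log qT_0}{2} + 17.21\right)|\eta(t)(\log t)|_2\right)\sqrt{T_0}. \end{aligned} \]

Finally, by (9.30) and (9.25),

\[ |f(1)|g_\chi(1) \le \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \]

By (9.32) and the assumption that all non-trivial zeros with $|\Im(\rho)| \le T_0$ lie on the line $\Re(s) = 1/2$, we conclude that

\[ \begin{aligned} \sum_{\substack{\rho \text{ non-trivial} \\ 1 < |\Im(\rho)| \le T_0}} |G_\delta(\rho)| &\le (|\eta|_2 + |\eta\cdot\log|_2)\sqrt{T_0}\log qT_0 \\ &\quad + \left(17.21|\eta\cdot\log|_2 - (\log 2\pi\sqrt{e})|\eta|_2\right)\sqrt{T_0} \\ &\quad + \left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q + 17.7). \end{aligned} \]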
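The numerical constants in this proof ($0.819$, $16.8$, $17.2$, $296.09$, $17.21$, and the totals $1.32$ and $34.5$ in (9.29)) all follow from (9.25) and the elementary integrals $\int_0^a \log u\,du = a(\log a - 1)$ and $\int_0^a (\log u)^2\,du = a((\log a)^2 - 2\log a + 2)$. The following sketch recomputes them; the variable names are ours.

```python
import math

# N(1, chi) = (1/pi) log(q/(2 pi e)) + O*(0.5 log q + 17.7) = O*(coef * log q + const).
coef_log_q = 1 / math.pi + 0.5
const_term = 17.7 - math.log(2 * math.pi * math.e) / math.pi

# int_0^{T0} (0.5 log(qT) + 17.7)^2 dT = (a L^2 + b L + c) T0, where L = log(q T0).
a = 0.25
b = 2 * 0.5 * 17.7 - 2 * 0.25          # cross term minus the -2L from int (log u)^2
c = 17.7 ** 2 - 17.7 + 2 * 0.25

def sqrt_gap(L):
    """(L/2 + 17.21)^2 - (a L^2 + b L + c); non-negative means sqrt(...) <= L/2 + 17.21."""
    return (L / 2 + 17.21) ** 2 - (a * L * L + b * L + c)

print(coef_log_q, const_term, b, c, sqrt_gap(0.0))
```

Adding the contributions of $|\Im(\rho)| \le 1$ and of $|f(1)|g_\chi(1)$ gives $0.819 + 0.5 \le 1.32$ and $16.8 + 17.7 = 34.5$, the coefficients of $|\eta(t)/\sqrt{t}|_1$ in (9.29).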

All that remains is to bound the contribution to $\sum_\rho G_\delta(\rho)x^{\rho}$ corresponding to all zeroes $\rho$ with $|\Im(\rho)| > T_0$. This we will do by another application of Lemma 9.1.3, combined with bounds on $G_\delta(\rho)$ for $\Im(\rho)$ large. This is the only part that will require us to take a look at the actual smoothing function $\eta$ we are working with; it is at this point, not before, that we actually have to look at each of our options for $\eta$ one by one.

9.2 Sums and decay for the Gaussian

It is now time to derive our bounds for the Gaussian smoothing. As we were saying, there is really only one thing left to do, namely, an estimate for the sum $\sum_\rho |F_\delta(\rho)|$ over all zeros $\rho$ with $|\Im(\rho)| > T_0$.

Lemma 9.2.1. Let $\eta_\heartsuit(t) = e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$. Assume that $T_0 \ge 50$.

Write $F_\delta(s)$ for the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. Then

\[ \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |F_\delta(\rho)| \le \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right). \]

Here we have preferred to give a bound with a simple form. It is probably feasible to derive from Theorem 8.0.1 a bound essentially proportional to $e^{-E(\rho)T_0}$, where $\rho = T_0/(\pi\delta)^2$ and $E(\rho)$ is as in (8.2). (As we discussed in §8.5, $e^{-E(\rho)T_0}$ behaves as $e^{-(\pi/4)T_0}$ for $\rho$ large and as $e^{-0.125(T_0/(\pi\delta))^2}$ for $\rho$ small.)

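Proof. First of all,

\[ \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |F_\delta(\rho)| = \sum_{\substack{\rho \\ \Im(\rho) > T_0}} \left(|F_\delta(\rho)| + |F_\delta(1-\rho)|\right) \]

by the functional equation (which implies that non-trivial zeros come in pairs $\rho$, $1-\rho$). Hence, by a somewhat brutish application of Cor. 8.0.2,

\[ \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |F_\delta(\rho)| \le \sum_{\substack{\rho \\ \Im(\rho) > T_0}} f(\Im(\rho)), \tag{9.33} \]

where

\[ f(\tau) = 3.001\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2} + 3.286\,e^{-0.1598|\tau|}. \tag{9.34} \]

Obviously, $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

We now apply Lemma 9.1.3. We obtain that

\[ \sum_{\substack{\rho \\ \Im(\rho) > T_0}} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.35} \]

We just need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$,

\[ \begin{aligned} \int_y^\infty\left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty\left(\log t - \frac{1}{ct}\right)e^{-ct}\,dt + \left(\frac{1}{c} + c_1\right)\int_y^\infty\frac{e^{-ct}}{t}\,dt \\ &= \frac{(\log y)e^{-cy}}{c} + \left(\frac{1}{c} + c_1\right)E_1(cy), \end{aligned} \]

where $E_1(x) = \int_x^\infty e^{-t}\,dt/t$. Clearly, $E_1(x) \le \int_x^\infty e^{-t}\,dt/x = e^{-x}/x$. Hence

\[ \int_y^\infty\left(\log t + \frac{c_1}{t}\right)e^{-ct}\,dt \le \left(\log y + \left(\frac{1}{c} + c_1\right)\frac{1}{y}\right)\frac{e^{-cy}}{c}. \]

We conclude that

\[ \begin{aligned} \int_{T_0}^\infty e^{-0.1598t}\left(\frac{1}{2\pi}\log\frac{qt}{2\pi} + \frac{1}{4t}\right)dt &\le \frac{1}{2\pi}\int_{T_0}^\infty\left(\log t + \frac{\pi/2}{t}\right)e^{-ct}\,dt + \frac{\log\frac{q}{2\pi}}{2\pi}\int_{T_0}^\infty e^{-ct}\,dt \\ &= \frac{1}{2\pi c}\left(\log T_0 + \log\frac{q}{2\pi} + \left(\frac{1}{c} + \frac{\pi}{2}\right)\frac{1}{T_0}\right)e^{-cT_0} \end{aligned} \tag{9.36} \]

with $c = 0.1598$. Since $T_0 \ge 50$ and $q \ge 1$, this is at most

\[ 1.072\,\log\frac{qT_0}{2\pi}\,e^{-cT_0}. \tag{9.37} \]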
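Two things can be checked numerically here: the closed-form bound for $\int_y^\infty(\log t + c_1/t)e^{-ct}\,dt$, and the constant $1.072$ in (9.37), whose worst case over $q \ge 1$, $T_0 \ge 50$ is $q = 1$, $T_0 = 50$. The sketch below does both in floating point; the midpoint rule, the truncation point, and the function names are our own choices and carry no rigor.

```python
import math

def lhs_integral(y, c, c1, upper_factor=60.0, n=200_000):
    """Midpoint-rule approximation of int_y^inf (log t + c1/t) e^{-ct} dt,
    truncated at y + upper_factor/c, beyond which the integrand is negligible."""
    upper = y + upper_factor / c
    step = (upper - y) / n
    return step * sum(
        (math.log(t) + c1 / t) * math.exp(-c * t)
        for t in (y + (j + 0.5) * step for j in range(n))
    )

def rhs_bound(y, c, c1):
    """Closed-form bound (log y + (1/c + c1)/y) e^{-cy} / c from the text."""
    return (math.log(y) + (1 / c + c1) / y) * math.exp(-c * y) / c

c, T0 = 0.1598, 50.0
lhs, rhs = lhs_integral(T0, c, math.pi / 2), rhs_bound(T0, c, math.pi / 2)

# Ratio of the last line of (9.36) to log(q T0/(2 pi)) e^{-c T0}, at q = 1, T0 = 50.
ratio = (1 + (1 / c + math.pi / 2) / (T0 * math.log(T0 / (2 * math.pi)))) / (2 * math.pi * c)
print(lhs, rhs, ratio, 3.286 * ratio)
```

The last printed number is the coefficient of $e^{-0.1598T_0}\log\frac{qT_0}{2\pi}$ after multiplying by the weight $3.286$ from (9.34); it stays below the $3.53$ appearing in Lemma 9.2.1.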

Now let us deal with the Gaussian term. (It appears only if $T_0 < (3/2)(\pi\delta)^2$, as otherwise $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$.) For any $y \ge e$, $c > 0$,

\[ \int_y^\infty e^{-ct^2}\,dt = \frac{1}{\sqrt{c}}\int_{\sqrt{c}y}^\infty e^{-t^2}\,dt \le \frac{1}{cy}\int_{\sqrt{c}y}^\infty te^{-t^2}\,dt \le \frac{e^{-cy^2}}{2cy}, \tag{9.38} \]

\[ \int_y^\infty \frac{e^{-ct^2}}{t}\,dt = \int_{cy^2}^\infty \frac{e^{-t}}{2t}\,dt = \frac{E_1(cy^2)}{2} \le \frac{e^{-cy^2}}{2cy^2}, \tag{9.39} \]

\[ \int_y^\infty (\log t)e^{-ct^2}\,dt \le \int_y^\infty\left(\log t + \frac{\log t - 1}{2ct^2}\right)e^{-ct^2}\,dt = \frac{\log y}{2cy}\,e^{-cy^2}. \tag{9.40} \]

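Hence

\[ \begin{aligned} &\int_{T_0}^\infty e^{-0.1065\left(\frac{T}{\pi\delta}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \\ &\qquad = \int_{\frac{T_0}{\pi|\delta|}}^\infty e^{-0.1065t^2}\left(\frac{|\delta|}{2}\log\frac{q|\delta|t}{2} + \frac{1}{4t}\right)dt \\ &\qquad \le \left(\frac{\frac{|\delta|}{2}\log\frac{T_0}{\pi|\delta|}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{\frac{|\delta|}{2}\log\frac{q|\delta|}{2}}{2c'\frac{T_0}{\pi|\delta|}} + \frac{1}{8c'\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2} \end{aligned} \tag{9.41} \]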

        with cprime = 01065 Since T0 ge 50 and q ge 1

        8T0le π

        200le 00152 middot 1

        2log

        qT0

        Thus the last line of (941) is less than

        10152|δ|2 log qT0

        2π2cprimeT0

        π|δ|eminusc

        prime( T0π|δ| )

        2

        = 7487δ2

        T0middot log

        qT0

        2πmiddot eminusc

        prime( T0π|δ| )

        2

        (942)

        92 SUMS AND DECAY FOR THE GAUSSIAN 177

Again by $T_0 \ge 4\pi^2|\delta|$, we see that $1.0057\pi|\delta|/(4cT_0) \le 1.0057/(16c\pi) \le 0.18787$.

To obtain our final bound, we simply sum (9.37) and (9.42), after multiplying them by the constants $3.286$ and $3.001$ in (9.34). We conclude that the integral in (9.35) is at most
\[\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\log\frac{qT_0}{2\pi}.\]
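The arithmetic behind the constants $7.487$, $0.0152$, $3.53$ and $22.5$ is easy to replay. The sketch below assumes the decimal values as read above ($c' = 0.1065$, the factors $3.286$ and $3.001$ from (9.34), the constant $1.072$ from (9.37)); the variable names are ours.

```python
import math

c_prime = 0.1065

# Constant in (9.42): 1.0152 * (|delta|/2) / (2 c' T0 / (pi |delta|)) = k * delta^2 / T0
k_gauss = 1.0152 * math.pi / (4 * c_prime)

# Ratio of the third term in (9.41) to the sum of the first two,
# namely pi / (2 T0 log(q T0 / 2 pi)), at the worst case T0 = 50, q = 1.
ratio_third_term = math.pi / (2 * 50 * math.log(50 / (2 * math.pi)))

# Final constants: (9.37) times 3.286 and (9.42) times 3.001.
final_exp = 3.286 * 1.072
final_gauss = 3.001 * 7.487

print(k_gauss, ratio_third_term, final_exp, final_gauss)
```

Each value lands just below the rounded constant used in the text, so the roundings all go in the safe direction.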

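We need to record a few norms related to the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$ before we proceed. Recall we are working with the one-sided Gaussian, i.e., we set $\eta_\heartsuit(t) = 0$ for $t < 0$. Symbolic integration then gives
\[\begin{aligned}
|\eta_\heartsuit|_2^2 &= \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2},\\
|\eta_\heartsuit'|_2^2 &= \int_0^\infty \left(te^{-t^2/2}\right)^2 dt = \frac{\sqrt{\pi}}{4},\\
|\eta_\heartsuit\cdot\log|_2^2 &= \int_0^\infty e^{-t^2}(\log t)^2\,dt
= \frac{\sqrt{\pi}}{16}\left(\pi^2 + 2\gamma^2 + 8\gamma\log 2 + 8(\log 2)^2\right) \le 1.94753,
\end{aligned} \tag{9.43}\]
\[\begin{aligned}
|\eta_\heartsuit(t)/\sqrt{t}|_1 &= \int_0^\infty \frac{e^{-t^2/2}}{\sqrt{t}}\,dt = \frac{\Gamma(1/4)}{2^{3/4}} \le 2.15581,\\
|\eta_\heartsuit'(t)/\sqrt{t}|_1 = |\eta_\heartsuit(t)\sqrt{t}|_1 &= \int_0^\infty e^{-t^2/2}\sqrt{t}\,dt = \frac{\Gamma(3/4)}{2^{1/4}} \le 1.03045,\\
\left|\eta_\heartsuit'(t)t^{1/2}\right|_1 = \left|\eta_\heartsuit(t)t^{3/2}\right|_1 &= \int_0^\infty e^{-t^2/2}\,t^{3/2}\,dt = 1.07791.
\end{aligned} \tag{9.44}\]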
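The closed forms in (9.43) and (9.44) can be evaluated directly as a check on the decimals. This is a sketch using only the standard identity $\int_0^\infty e^{-t^2/2}t^{s-1}\,dt = 2^{s/2-1}\Gamma(s/2)$; the helper name `mellin_gauss` is ours.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
sqrt_pi = math.sqrt(math.pi)

def mellin_gauss(s):
    """Integral of exp(-t^2/2) * t^(s-1) over (0, inf), for s > 0."""
    return 2 ** (s / 2 - 1) * math.gamma(s / 2)

eta_l2_sq = sqrt_pi / 2
deta_l2_sq = sqrt_pi / 4
eta_log_l2_sq = sqrt_pi / 16 * (math.pi ** 2 + 2 * EULER_GAMMA ** 2
                                + 8 * EULER_GAMMA * math.log(2)
                                + 8 * math.log(2) ** 2)

norm_over_sqrt = mellin_gauss(0.5)   # |eta(t)/sqrt(t)|_1 = Gamma(1/4)/2^(3/4)
norm_times_sqrt = mellin_gauss(1.5)  # |eta(t) sqrt(t)|_1 = Gamma(3/4)/2^(1/4)
norm_t_three_halves = mellin_gauss(2.5)  # |eta(t) t^(3/2)|_1

print(eta_log_l2_sq, norm_over_sqrt, norm_times_sqrt, norm_t_three_halves)
```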

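We can now state what is really our main result for the Gaussian smoothing. (The version in §7.1 will, as we shall later see, follow from this, given numerical inputs.)

Proposition 9.2.2. Let $\eta(t) = e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge 50$.

Then
\[\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) =
\begin{cases}
\widehat{\eta}(-\delta)x + O^*\left(\operatorname{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\
O^*\left(\operatorname{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,
\end{cases} \tag{9.45}\]
where
\[\begin{aligned}
\operatorname{err}_{\eta,\chi}(\delta,x) &= \log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\\
&+ \left(2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38\right)x^{-1/2}\\
&+ (3\log q + 14|\delta| + 17)x^{-1} + (\log q + 6)\cdot(1 + 5|\delta|)\cdot x^{-3/2}.
\end{aligned}\]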
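To get a feeling for the size of the error term in Proposition 9.2.2, one can simply evaluate it. The function below is a sketch: the name `err_gauss` and its signature are ours, the constants are the ones in the statement as read above, and the Gaussian term is taken to be $0$ when $\delta = 0$ (its limiting value).

```python
import math

def err_gauss(q, T0, delta, x):
    """Evaluate err_{eta,chi}(delta, x) from Prop. 9.2.2 (requires T0 >= 50, x >= 1)."""
    if T0 < 50 or x < 1 or q < 1:
        raise ValueError("need T0 >= 50, x >= 1, q >= 1")
    tail = 3.53 * math.exp(-0.1598 * T0)
    if delta != 0:
        # Gaussian term; it tends to 0 as delta -> 0, so we skip it at delta = 0.
        tail += (22.5 * delta ** 2 / T0
                 * math.exp(-0.1065 * (T0 / (math.pi * abs(delta))) ** 2))
    sqrt_T0 = math.sqrt(T0)
    low_zeros = (2.337 * sqrt_T0 * math.log(q * T0) + 21.817 * sqrt_T0
                 + 2.85 * math.log(q) + 74.38)
    return (math.log(q * T0 / (2 * math.pi)) * tail
            + low_zeros * x ** -0.5
            + (3 * math.log(q) + 14 * abs(delta) + 17) / x
            + (math.log(q) + 6) * (1 + 5 * abs(delta)) * x ** -1.5)

# With T0 at its minimum the first term dominates for large x:
print(err_gauss(q=1, T0=50.0, delta=0.0, x=1e20))
# A larger verification height T0 shrinks it, at the cost of a larger x^(-1/2) term:
print(err_gauss(q=1, T0=200.0, delta=0.0, x=1e20))
```

The trade-off visible here (the first term decays exponentially in $T_0$, the second grows like $\sqrt{T_0}\log T_0$ but is damped by $x^{-1/2}$) is why large $x$ and a high zero-verification height work well together.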


Proof. Let $F_\delta(s)$ be the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. By Lemma 9.1.4 (with $G_\delta = F_\delta$) and Lemma 9.2.1,
\[\left|\sum_{\rho \text{ non-trivial}} F_\delta(\rho)x^\rho\right|\]
is at most (9.29) (with $\eta = \eta_\heartsuit$) times $\sqrt{x}$, plus
\[\log\frac{qT_0}{2\pi}\cdot\left(3.53\,e^{-0.1598T_0} + 22.5\,\frac{|\delta|^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\cdot x.\]
By the norm computations in (9.43) and (9.44), we see that (9.29) is at most
\[2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38.\]

Let us now apply Lemma 9.1.1. We saw that the value of $R$ in Lemma 9.1.1 is bounded by (9.23). We know that $\eta_\heartsuit(0) = 1$. Again by (9.43) and (9.44), the quantity $c_0$ defined in (9.3) is at most $1.4056 + 13.3466|\delta|$. Hence
\[|R| \le 3\log q + 13.347|\delta| + 16.695.\]
Lastly,
\[|\eta_\heartsuit'|_2 + 2\pi|\delta||\eta_\heartsuit|_2 \le 0.942 + 4.183|\delta| \le 1 + 5|\delta|.\]
Clearly
\[(6.01 - 6)\cdot(1 + 5|\delta|) + 13.347|\delta| + 16.695 < 14|\delta| + 17,\]
and so we are done.

9.3 The case of $\eta_*(t)$

We will now work with a weight based on the Gaussian:
\[\eta(t) = \begin{cases} t^2e^{-t^2/2} & \text{if } t \ge 0,\\ 0 & \text{if } t < 0.\end{cases} \tag{9.46}\]
The fact that this vanishes at $t = 0$ actually makes it easier to work with at several levels.

Its Mellin transform is just a shift of that of the Gaussian. Write
\[\begin{aligned}
F_\delta(s) &= \left(M\left(e^{-t^2/2}e(\delta t)\right)\right)(s),\\
G_\delta(s) &= \left(M(\eta(t)e(\delta t))\right)(s).
\end{aligned} \tag{9.47}\]
Then, by the definition of the Mellin transform,
\[G_\delta(s) = F_\delta(s+2).\]

We start by bounding the contribution of zeros with large imaginary part, just as before.


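Lemma 9.3.1. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right).\]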

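Proof. We start by writing
\[\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| = \sum_{\substack{\rho\\ \Im(\rho) > T_0}} \left(|F_\delta(\rho+2)| + |F_\delta((1-\rho)+2)|\right),\]
where we are using $G_\delta(\rho) = F_\delta(\rho+2)$ and the fact that non-trivial zeros come in pairs $\rho$, $1-\rho$.

By Cor. 8.0.2 with $k = 2$,
\[\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le \sum_{\substack{\rho\\ \Im(\rho) > T_0}} f(\Im(\rho)),\]
where
\[f(\tau) = \begin{cases}
\kappa_{2,1}|\tau|e^{-0.1598|\tau|} + \dfrac{\kappa_{2,0}}{4}\left(\dfrac{|\tau|}{\pi\delta}\right)^2 e^{-0.1065\left(\frac{|\tau|}{\pi\delta}\right)^2} & \text{if } |\tau| < \frac{3}{2}(\pi\delta)^2,\\[2mm]
\kappa_{2,1}|\tau|e^{-0.1598|\tau|} & \text{if } |\tau| \ge \frac{3}{2}(\pi\delta)^2,
\end{cases} \tag{9.48}\]
where $\kappa_{2,0} = 7.96$ and $\kappa_{2,1} = 5.13$. We are including the term $|\tau|e^{-0.1598|\tau|}$ in both cases in part because we cannot be bothered to take it out (just as we could not be bothered in the proof of Lem. 9.2.1) and in part to ensure that $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.

We can now apply Lemma 9.1.3. We obtain, again,
\[\sum_{\substack{\rho\\ \Im(\rho) > T_0}} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT. \tag{9.49}\]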

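Just as before, we will need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$ such that $\log y > 1/(cy)$,
\[\begin{aligned}
\int_y^\infty te^{-ct}\,dt &= \left(\frac{y}{c} + \frac{1}{c^2}\right)e^{-cy},\\
\int_y^\infty \left(t\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty \left(\left(t + \frac{a-1}{c}\right)\log t - \frac{1}{c} - \frac{a}{c^2t}\right)e^{-ct}\,dt\\
&= \left(\frac{y}{c} + \frac{a}{c^2}\right)e^{-cy}\log y,
\end{aligned} \tag{9.50}\]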


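where
\[a = \frac{\frac{\log y}{c} + \frac{1}{c} + \frac{c_1}{y}}{\frac{\log y}{c} - \frac{1}{c^2y}}.\]
Setting $c = 0.1598$, $c_1 = \pi/2$, $y = T_0 \ge 50$, we obtain that
\[\begin{aligned}
\int_{T_0}^\infty &\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)Te^{-0.1598T}\,dT\\
&\le \frac{1}{2\pi}\left(\log\frac{q}{2\pi}\cdot\left(\frac{T_0}{c} + \frac{1}{c^2}\right) + \left(\frac{T_0}{c} + \frac{a}{c^2}\right)\log T_0\right)e^{-0.1598T_0}
\end{aligned} \tag{9.51}\]
and
\[a = \frac{\frac{\log T_0}{0.1598} + \frac{1}{0.1598} + \frac{\pi/2}{T_0}}{\frac{\log T_0}{0.1598} - \frac{1}{0.1598^2\,T_0}} \le 1.299.\]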

It is easy to see that the ratio of the expression within parentheses on the right side of (9.51) to $T_0\log(qT_0/2\pi)$ increases as $q$ decreases and, if we hold $q$ fixed, decreases as $T_0 \ge 2\pi$ increases; thus, it is maximal for $q = 1$ and $T_0 = 50$. Multiplying (9.51) by $\kappa_{2,1} = 5.13$ and simplifying by the assumption $T_0 \ge 50$, we obtain that
\[\int_{T_0}^\infty 5.13\,Te^{-0.1598T}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \le 6.11\,T_0\log\frac{qT_0}{2\pi}\cdot e^{-0.1598T_0}. \tag{9.52}\]

Now let us examine the Gaussian term. First of all, when does it arise? If $T_0 \ge (3/2)(\pi\delta)^2$, then $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$, and so (9.48) does not give us a Gaussian term. Recall that $T_0 \ge 10\pi|\delta|$, which means that $|\delta| \le 20/(3\pi)$ implies that $T_0 \ge (3/2)(\pi\delta)^2$. We can thus assume from now on that $|\delta| > 20/(3\pi)$, since otherwise there is no Gaussian term to treat.
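The constants $1.299$ and $6.11$ attached to the exponential term can be reproduced in a few lines. This sketch evaluates the quantity $a$ from (9.50) at $y = T_0 = 50$ and the ratio of $5.13$ times the right side of (9.51) to $T_0\log(qT_0/2\pi)$ at the extremal point $q = 1$, $T_0 = 50$; the function name `rhs_951` is ours.

```python
import math

c, c1, T0, q = 0.1598, math.pi / 2, 50.0, 1.0

# a as defined after (9.50), evaluated at y = T0
a = ((math.log(T0) / c + 1 / c + c1 / T0)
     / (math.log(T0) / c - 1 / (c * c * T0)))

def rhs_951(a_val):
    """Right side of (9.51) without the common factor exp(-c * T0)."""
    return (math.log(q / (2 * math.pi)) * (T0 / c + 1 / c ** 2)
            + (T0 / c + a_val / c ** 2) * math.log(T0)) / (2 * math.pi)

# Use the rounded bound a <= 1.299, as the text does.
ratio = 5.13 * rhs_951(1.299) / (T0 * math.log(q * T0 / (2 * math.pi)))

print(a, ratio)
```

Both numbers fall just under the constants quoted in the text, which is consistent with $q = 1$, $T_0 = 50$ being the worst case.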

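For any $y \ge 1$, $c, c_1 > 0$,
\[\begin{aligned}
\int_y^\infty t^2e^{-ct^2}\,dt &< \int_y^\infty \left(t^2 + \frac{1}{4c^2t^2}\right)e^{-ct^2}\,dt = \left(\frac{y}{2c} + \frac{1}{4c^2y}\right)\cdot e^{-cy^2},\\
\int_y^\infty (t^2\log t + c_1t)\cdot e^{-ct^2}\,dt &\le \int_y^\infty \left(t^2\log t + \frac{at\log et}{2c} - \frac{\log et}{2c} - \frac{a}{4c^2t}\right)e^{-ct^2}\,dt\\
&= \frac{(2cy + a)\log y + a}{4c^2}\cdot e^{-cy^2},
\end{aligned}\]
where
\[a = \frac{c_1y + \frac{\log ey}{2c}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}
= \frac{1}{y} + \frac{c_1y + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}
= \frac{1}{y} + \frac{2c_1c}{\log ey} + \frac{\frac{c_1}{2cy\log ey} + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}.\]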

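(Note that $a$ decreases as $y \ge y_0$ increases, provided that $\log ey_0 > 1/(2cy_0^2)$.) Setting $c = 0.1065$, $c_1 = 1/(2|\delta|) \le 3/16$ and $y = T_0/(\pi|\delta|) \ge 4\pi$, we obtain
\[\begin{aligned}
\int_{\frac{T_0}{\pi|\delta|}}^\infty &\left(\frac{1}{2\pi}\log\frac{q|\delta|t}{2} + \frac{1}{4\pi|\delta|t}\right)t^2e^{-0.1065t^2}\,dt\\
&\le \left(\frac{1}{2\pi}\log\frac{q|\delta|}{2}\right)\cdot\left(\frac{T_0}{2\pi c|\delta|} + \frac{1}{4c^2\cdot 10}\right)\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\\
&\quad+ \frac{1}{2\pi}\cdot\frac{\left(2c\frac{T_0}{\pi|\delta|} + a\right)\log\frac{T_0}{\pi|\delta|} + a}{4c^2}\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}
\end{aligned}\]
and
\[a \le \frac{1}{10} + \frac{\left(2\cdot\frac{20}{3\pi}\right)^{-1}\cdot 10 + \frac{1}{4\cdot 0.1065^2\cdot 10^2}}{\frac{10\log 10e}{2\cdot 0.1065} - \frac{1}{4\cdot 0.1065^2\cdot 10}} \le 0.117.\]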

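Multiplying by $(\kappa_{2,0}/4)\pi|\delta|$, we get that
\[\int_{T_0}^\infty \frac{\kappa_{2,0}}{4}\left(\frac{T}{\pi|\delta|}\right)^2 e^{-0.1065\left(\frac{T}{\pi|\delta|}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT_0}{2\pi} + \frac{1}{4T}\right)dT \tag{9.53}\]
is at most $e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}$ times
\[\begin{aligned}
&\left((1.487T_0 + 2.194|\delta|)\cdot\log\frac{q|\delta|}{2} + 1.487T_0\log\frac{T_0}{\pi|\delta|} + 2.566|\delta|\log\frac{eT_0}{\pi|\delta|}\right)\\
&\le \left(1.487 + 2.566\cdot\frac{1 + \frac{1}{\log T_0/(\pi|\delta|)}}{T_0/|\delta|}\right)T_0\log\frac{qT_0}{2\pi} \le 1.578\cdot T_0\log\frac{qT_0}{2\pi},
\end{aligned} \tag{9.54}\]
where we are using several times the assumption that $T_0 \ge 4\pi^2|\delta|$ (and, on one occasion, the fact that $|\delta| > 20/(3\pi) > 2$).

We sum (9.52) and the estimate for (9.53) we have just got to reach our conclusion.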
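The numerical constants in the Gaussian part of this proof ($0.117$, $1.487$, $2.194$, $2.566$) follow from $c = 0.1065$ and $\kappa_{2,0} = 7.96$ by elementary arithmetic. The sketch below replays it; the three coefficient formulas are our own unwinding of "multiplying by $(\kappa_{2,0}/4)\pi|\delta|$" and should be read as an assumption about how the constants arise, not as a quotation.

```python
import math

c = 0.1065
kappa_20 = 7.96
y = 10.0
c1 = 3 * math.pi / 40   # (2 * 20/(3 pi))^(-1), the value used in the bound on a

# a at y = 10, second form of the expression for a
a = 1 / y + ((c1 * y + 1 / (4 * c ** 2 * y ** 2))
             / (y * math.log(math.e * y) / (2 * c) - 1 / (4 * c ** 2 * y)))

# (kappa_20/4) * pi|delta| * (1/2pi) * T0/(2 pi c |delta|)   -> coefficient of T0
coef_T0 = kappa_20 / (16 * math.pi * c)
# (kappa_20/4) * pi|delta| * (1/2pi) * 1/(4 c^2 * 10)        -> coefficient of |delta|
coef_delta = kappa_20 / (8 * 4 * c ** 2 * 10)
# (kappa_20/4) * pi|delta| * (1/2pi) * a/(4 c^2), a <= 0.117 -> coefficient of |delta| log(e T0/(pi|delta|))
coef_log = kappa_20 * 0.117 / (32 * c ** 2)

print(a, coef_T0, coef_delta, coef_log)
```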

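Again, we record some norms obtained by symbolic integration: for $\eta$ as in (9.46),
\[\begin{aligned}
|\eta|_2^2 &= \frac{3}{8}\sqrt{\pi}, \qquad |\eta'|_2^2 = \frac{7}{16}\sqrt{\pi},\\
|\eta\cdot\log|_2^2 &= \frac{\sqrt{\pi}}{64}\left(8(3\gamma - 8)\log 2 + 3\pi^2 + 6\gamma^2 + 24(\log 2)^2 + 16 - 32\gamma\right) \le 0.16364,\\
|\eta(t)/\sqrt{t}|_1 &= \frac{2^{1/4}\Gamma(1/4)}{4} \le 1.07791, \qquad |\eta(t)\sqrt{t}|_1 = \frac{3}{4}2^{3/4}\Gamma(3/4) \le 1.54568,\\
|\eta'(t)/\sqrt{t}|_1 &= \int_0^{\sqrt{2}} t^{3/2}e^{-t^2/2}\,dt - \int_{\sqrt{2}}^\infty t^{3/2}e^{-t^2/2}\,dt \le 1.48469,\\
|\eta'(t)\sqrt{t}|_1 &\le 1.72169.
\end{aligned} \tag{9.55}\]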


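Proposition 9.3.2. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
\[\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) =
\begin{cases}
\widehat{\eta}(-\delta)x + O^*\left(\operatorname{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\
O^*\left(\operatorname{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,
\end{cases} \tag{9.56}\]
where
\[\begin{aligned}
\operatorname{err}_{\eta,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\
&+ \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.19\right)\cdot x^{-1/2}\\
&+ (3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2}.
\end{aligned} \tag{9.57}\]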

Proof. We proceed as in the proof of Prop. 9.2.2. The contribution of Lemma 9.3.1 is
\[T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 1.578\,e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\cdot x,\]
whereas the contribution of Lemma 9.1.4 is at most
\[\left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.188\right)\sqrt{x}.\]
Let us now apply Lemma 9.1.1. Since $\eta(0) = 0$, we have
\[R = O^*(c_0) = O^*(2.138 + 10.99|\delta|).\]
Lastly,
\[|\eta'|_2 + 2\pi|\delta||\eta|_2 \le 0.881 + 5.123|\delta|.\]
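The constants $2.138$, $10.99$, $0.881$ and $5.123$ in this proof can be recovered from the norms in (9.55). In the sketch below, the form of $c_0$, namely $\frac{2}{3}\left(|\eta'/\sqrt{t}|_1 + |\eta'\sqrt{t}|_1 + 2\pi|\delta|(|\eta/\sqrt{t}|_1 + |\eta\sqrt{t}|_1)\right)$, is an assumption on our part: (9.3) is not restated in this section, and this form is inferred from how $c_0$ is bounded later in the chapter. The two bounds on $\eta'$ are taken from (9.55) as given, not recomputed.

```python
import math

sqrt_pi = math.sqrt(math.pi)

# Closed forms from (9.55)
eta_l2 = math.sqrt(3 * sqrt_pi / 8)
deta_l2 = math.sqrt(7 * sqrt_pi / 16)
eta_over_sqrt = 2 ** 0.25 * math.gamma(0.25) / 4
eta_times_sqrt = 0.75 * 2 ** 0.75 * math.gamma(0.75)

# Bounds quoted from (9.55), not recomputed here
deta_over_sqrt, deta_times_sqrt = 1.48469, 1.72169

# Assumed shape of c0 (see the lead-in): constant part and coefficient of |delta|
c0_const = (2 / 3) * (deta_over_sqrt + deta_times_sqrt)
c0_delta = (2 / 3) * 2 * math.pi * (eta_over_sqrt + eta_times_sqrt)

print(c0_const, c0_delta, deta_l2, 2 * math.pi * eta_l2)
```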

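Now that we have Prop. 9.3.2, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of $\eta$ with something else. In general, for $\varphi_1, \varphi_2 : [0,\infty) \to \mathbb{C}$, if we know how to bound sums of the form
\[S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x), \tag{9.58}\]
we can bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$, simply by changing the order of summation and integration:
\[\begin{aligned}
S_{f,\varphi_1 *_M \varphi_2} &= \sum_n f(n)\cdot(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right)\\
&= \int_0^\infty \sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\frac{dw}{w}.
\end{aligned} \tag{9.59}\]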


This is particularly nice if $\varphi_2(t)$ vanishes in a neighbourhood of the origin, since then the argument $wx$ of $S_{f,\varphi_1}(wx)$ is always large.

We will use $\varphi_1(t) = t^2e^{-t^2/2}$, $\varphi_2(t) = \eta_1 *_M \eta_1$, where $\eta_1$ is $2$ times the characteristic function of the interval $[1/2, 1]$. The motivation for the choice of $\varphi_1$ and $\varphi_2$ is clear: we have just got bounds based on $\varphi_1(t)$ in the major arcs, and we obtained minor-arc bounds for the weight $\varphi_2(t)$ in Part I.

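Corollary 9.3.3. Let $\eta(t) = t^2e^{-t^2/2}$, $\eta_1 = 2\cdot I_{[1/2,1]}$, $\eta_2 = \eta_1 *_M \eta_1$. Let $\eta_* = \eta_2 *_M \eta$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
\[\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta_*(n/x) =
\begin{cases}
\widehat{\eta_*}(-\delta)x + O^*\left(\operatorname{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\
O^*\left(\operatorname{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,
\end{cases} \tag{9.60}\]
where
\[\begin{aligned}
\operatorname{err}_{\eta_*,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\,e^{-0.1598T_0} + 0.0102\cdot e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\
&+ \left(1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17\right)\cdot x^{-1/2}\\
&+ (6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2}.
\end{aligned} \tag{9.61}\]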

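Proof. The left side of (9.60) equals
\[\begin{aligned}
\int_0^\infty &\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta n}{x}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\frac{dw}{w}\\
&= \int_{1/4}^1 \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta wn}{wx}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\frac{dw}{w},
\end{aligned}\]
since $\eta_2$ is supported on $[1/4, 1]$. By Prop. 9.3.2, the main term (if $q = 1$) contributes
\[\begin{aligned}
\int_{1/4}^1 \widehat{\eta}(-\delta w)\,xw\cdot\eta_2(w)\frac{dw}{w} &= x\int_0^\infty \widehat{\eta}(-\delta w)\eta_2(w)\,dw\\
&= x\int_0^\infty\int_{-\infty}^\infty \eta(t)e(\delta wt)\,dt\cdot\eta_2(w)\,dw = x\int_0^\infty\int_{-\infty}^\infty \eta\left(\frac{r}{w}\right)e(\delta r)\frac{dr}{w}\,\eta_2(w)\,dw\\
&= x\int_{-\infty}^\infty\left(\int_0^\infty \eta\left(\frac{r}{w}\right)\eta_2(w)\frac{dw}{w}\right)e(\delta r)\,dr = \widehat{\eta_*}(-\delta)\cdot x.
\end{aligned}\]
The error term is
\[\int_{1/4}^1 \operatorname{err}_{\eta,\chi}(\delta w, wx)\cdot wx\cdot\eta_2(w)\frac{dw}{w} = x\cdot\int_{1/4}^1 \operatorname{err}_{\eta,\chi}(\delta w, wx)\,\eta_2(w)\,dw. \tag{9.62}\]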


Using the fact that
\[\eta_2(w) = \begin{cases} 4\log 4w & \text{if } w \in [1/4, 1/2],\\ 4\log w^{-1} & \text{if } w \in [1/2, 1],\\ 0 & \text{otherwise,}\end{cases}\]
we can easily check that
\[\begin{aligned}
\int_0^\infty \eta_2(w)\,dw &= 1, & \int_0^\infty w^{-1/2}\eta_2(w)\,dw &\le 1.37259,\\
\int_0^\infty w^{-1}\eta_2(w)\,dw &= 4(\log 2)^2 \le 1.92182, & \int_0^\infty w^{-3/2}\eta_2(w)\,dw &\le 2.74517
\end{aligned}\]
and, by rigorous numerical integration from $1/4$ to $1/2$ and from $1/2$ to $1$ (using, e.g., VNODE-LP [Ned06]),
\[\int_0^\infty e^{-0.1065\cdot 10^2\left(\frac{1}{w^2} - 1\right)}\eta_2(w)\,dw \le 0.006446.\]
We then see that (9.57) and (9.62) imply (9.61).
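The moments of $\eta_2$ used above are quick to reproduce. The sketch below uses ordinary floating-point Simpson quadrature, so it is a plausibility check only and not a replacement for the rigorous interval computation cited in the text; the helper names are ours. Since $\eta_2 = \eta_1 *_M \eta_1$, each moment is also the square of the corresponding moment of $\eta_1$, which gives an independent closed form.

```python
import math

def eta2(w):
    """eta_2 = eta_1 *_M eta_1, with eta_1 = 2 * indicator of [1/2, 1]."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1.0:
        return 4 * math.log(1 / w)
    return 0.0

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def integrate_against_eta2(weight):
    """Integral of weight(w) * eta2(w), split at the kink w = 1/2."""
    g = lambda w: weight(w) * eta2(w)
    return simpson(g, 0.25, 0.5) + simpson(g, 0.5, 1.0)

def eta1_moment(p):
    """Closed form for the integral of 2 * w^p over [1/2, 1] against dw/w-shifted measure."""
    s = p + 1  # integral of eta_1(w) w^(s-1) dw
    return 2 * math.log(2) if s == 0 else 2 * (1 - 2 ** (-s)) / s

m0 = integrate_against_eta2(lambda w: 1.0)
m_half = integrate_against_eta2(lambda w: w ** -0.5)
m_one = integrate_against_eta2(lambda w: 1 / w)
m_three_halves = integrate_against_eta2(lambda w: w ** -1.5)
gauss = integrate_against_eta2(lambda w: math.exp(-10.65 * (1 / w ** 2 - 1)))

print(m0, m_half, m_one, m_three_halves, gauss)
```

The last integral is dominated by a small neighbourhood of $w = 1$, where the exponent vanishes; that is why it is so much smaller than the plain moments.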

9.4 The case of $\eta_+(t)$

We will work with
\[\eta(t) = \eta_+(t) = h_H(t)\cdot t\eta_\heartsuit(t) = h_H(t)\cdot te^{-t^2/2}, \tag{9.63}\]
where $h_H$ is as in (7.6). We recall that $h_H$ is a band-limited approximation to the function $h$ defined in (7.5); to be more precise, $Mh_H(it)$ is the truncation of $Mh(it)$ to the interval $[-H, H]$.

We are actually defining $h$, $h_H$ and $\eta$ in a slightly different way from what was done in the first version of [Hela]. The difference is instructive. There, $\eta(t)$ was defined as $h_H(t)e^{-t^2/2}$, and $h_H$ was a band-limited approximation to a function $h$ defined as in (7.5), but with $t^3(2-t)^3$ instead of $t^2(2-t)^3$. The reason for our new definitions is that now the truncation of $Mh(it)$ will not break the holomorphy of $M\eta$, and so we will be able to use the general results we proved in §9.1.

In essence, $Mh$ will still be holomorphic because the Mellin transform of $t\eta_\heartsuit(t)$ is holomorphic in the domain we care about, unlike the Mellin transform of $\eta_\heartsuit(t)$, which does have a pole at $s = 0$.

As usual, we start by bounding the contribution of zeros with large imaginary part. The procedure is much as before: since $\eta_+(t) = \eta_H(t)\eta_\heartsuit(t)$, the Mellin transform $M\eta_+$ is a convolution of $M(te^{-t^2/2})$ and something of support in $[-H, H]i$, namely, $M\eta_H$ restricted to the imaginary axis. This means that the decay of $M\eta_+$ is (at worst) like the decay of $M(te^{-t^2/2})$, delayed by $H$.


Lemma 9.4.1. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
\[\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le \left(11.308\sqrt{T_0'}\,e^{-0.1598T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi},\]
where $T_0' = T_0 - H$.

Proof. As usual,
\[\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| = \sum_{\substack{\rho\\ \Im(\rho) > T_0}} \left(|G_\delta(\rho)| + |G_\delta(1-\rho)|\right).\]
Let $F_\delta$ be as in (9.47). Then, since $\eta_+(t)e(\delta t) = h_H(t)te^{-t^2/2}e(\delta t)$, where $h_H$ is as in (7.6), we see by (2.9) that
\[G_\delta(s) = \frac{1}{2\pi}\int_{-H}^H Mh(ir)F_\delta(s+1-ir)\,dr,\]
and so, since $|Mh(ir)| = |Mh(-ir)|$,
\[|G_\delta(\rho)| + |G_\delta(1-\rho)| \le \frac{1}{2\pi}\int_{-H}^H |Mh(ir)|\left(|F_\delta(1+\rho-ir)| + |F_\delta(2-(\rho-ir))|\right)dr. \tag{9.64}\]

We apply Cor. 8.0.2 with $k = 1$ and $T_0 - H$ instead of $T_0$, and obtain that $|F_\delta(\rho)| + |F_\delta(1-\rho)| \le g(\tau)$, where
\[g(\tau) = \kappa_{1,1}\sqrt{|\tau|}\,e^{-0.1598|\tau|} + \kappa_{1,0}\frac{|\tau|}{2\pi|\delta|}\,e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2}, \tag{9.65}\]
where $\kappa_{1,0} = 4.903$ and $\kappa_{1,1} = 4.017$. (As in the proof of Lemmas 9.2.1 and 9.3.1, we are putting in extra terms so as to simplify our integrals.)

From (9.64), we conclude that
\[|G_\delta(\rho)| + |G_\delta(1-\rho)| \le f(\tau),\]
for $\rho = \sigma + i\tau$, $\tau > 0$, where
\[f(\tau) = \frac{|Mh(ir)|_1}{2\pi}\cdot g(\tau - H)\]
is decreasing for $\tau \ge T_0$ (because $g(\tau)$ is decreasing for $\tau \ge T_0 - H$). By (A.17), $|Mh(ir)|_1 \le 16.193918$.


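We apply Lemma 9.1.3, and get that
\[\begin{aligned}
\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| &\le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT\\
&= \frac{|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty g(T-H)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT.
\end{aligned} \tag{9.66}\]

Now we just need to estimate some integrals. For any $y \ge e^2$, $c > 0$ and $\kappa, \kappa_1 \ge 0$,
\[\begin{aligned}
\int_y^\infty \sqrt{t}\,e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{1}{2c^2\sqrt{y}}\right)e^{-cy},\\
\int_y^\infty \left(\sqrt{t}\log(t+\kappa) + \frac{\kappa_1}{\sqrt{t}}\right)e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{a}{c^2\sqrt{y}}\right)\log(y+\kappa)\,e^{-cy},
\end{aligned}\]
where
\[a = \frac{1}{2} + \frac{1 + c\kappa_1}{\log(y+\kappa)}.\]
The contribution of the exponential term in (9.65) to (9.66) thus equals
\[\begin{aligned}
&\frac{\kappa_{1,1}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\sqrt{T-H}\cdot e^{-0.1598(T-H)}\,dT\\
&\le 10.3532\int_{T_0-H}^\infty \left(\frac{1}{2\pi}\log(T+H) + \frac{\log\frac{q}{2\pi}}{2\pi} + \frac{1}{4T}\right)\sqrt{T}\,e^{-0.1598T}\,dT\\
&\le \frac{10.3532}{2\pi}\left(\frac{\sqrt{T_0-H}}{0.1598} + \frac{a}{0.1598^2\sqrt{T_0-H}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0-H)},
\end{aligned} \tag{9.67}\]
where $a = 1/2 + (1 + 0.1598\pi/2)/\log T_0$. Since $T_0 - H \ge 50$ and $T_0 \ge 50 + 25 = 75$, this is at most
\[11.308\sqrt{T_0-H}\,\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0-H)}.\]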

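We now estimate a few more integrals so that we can handle the Gaussian term in (9.65). For any $y > 1$, $c > 0$, $\kappa, \kappa_1 \ge 0$,
\[\begin{aligned}
\int_y^\infty te^{-ct^2}\,dt &= \frac{e^{-cy^2}}{2c},\\
\int_y^\infty (t\log(t+\kappa) + \kappa_1)e^{-ct^2}\,dt &\le \left(1 + \frac{\kappa_1 + \frac{1}{2cy}}{y\log(y+\kappa)}\right)\frac{\log(y+\kappa)\cdot e^{-cy^2}}{2c}.
\end{aligned}\]
Proceeding just as before, we see that the contribution of the Gaussian term in (9.65) to (9.66) is at most
\[\begin{aligned}
&\frac{\kappa_{1,0}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\frac{T-H}{2\pi|\delta|}\cdot e^{-0.1065\left(\frac{T-H}{\pi\delta}\right)^2}\,dT\\
&\le 12.6368\cdot\frac{|\delta|}{4}\int_{\frac{T_0-H}{\pi|\delta|}}^\infty \left(\log\left(T + \frac{H}{\pi|\delta|}\right) + \log\frac{q|\delta|}{2} + \frac{\pi/2}{T}\right)Te^{-0.1065T^2}\,dT\\
&\le 12.6368\cdot\frac{|\delta|}{8\cdot 0.1065}\left(1 + \frac{\frac{\pi}{2} + \frac{\pi|\delta|}{2\cdot 0.1065\cdot(T_0-H)}}{\frac{T_0-H}{\pi|\delta|}\log\frac{T_0}{\pi|\delta|}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0-H}{\pi\delta}\right)^2}.
\end{aligned} \tag{9.68}\]
Since $(T_0-H)/(\pi|\delta|) \ge 10$, this is at most
\[16.147|\delta|\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0-H}{\pi\delta}\right)^2}.\]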
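The prefactors $10.3532$ and $12.6368$ and the final Gaussian constant $16.147$ in this proof come from $\kappa_{1,0}$, $\kappa_{1,1}$ and the bound on $|Mh(ir)|_1$. The following sketch replays that arithmetic; the worst case used for the parenthesised factor in (9.68), namely $(T_0-H)/(\pi|\delta|) = 10$ with $\log(T_0/(\pi|\delta|)) \ge \log 10$, is our reading of the hypothesis $T_0 - H \ge 10\pi|\delta|$.

```python
import math

kappa_10, kappa_11 = 4.903, 4.017
Mh_l1 = 16.193918          # bound on |Mh(ir)|_1 quoted from (A.17)
c = 0.1065

pref_exp = kappa_11 * Mh_l1 / (2 * math.pi)     # prefactor in (9.67)
pref_gauss = kappa_10 * Mh_l1 / (2 * math.pi)   # prefactor in (9.68)

# Parenthesised factor in the last line of (9.68) at y = (T0 - H)/(pi|delta|) = 10
y = 10.0
factor = 1 + (math.pi / 2 + 1 / (2 * c * y)) / (y * math.log(y))
gauss_const = 12.6368 / (8 * c) * factor

print(pref_exp, pref_gauss, gauss_const)
```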

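Proposition 9.4.2. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \ge 10^3$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Then
\[\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta_+(n/x) =
\begin{cases}
\widehat{\eta_+}(-\delta)x + O^*\left(\operatorname{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\
O^*\left(\operatorname{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,
\end{cases} \tag{9.69}\]
where
\[\begin{aligned}
\operatorname{err}_{\eta_+,\chi}(\delta,x) &= \left(11.308\sqrt{T_0'}\cdot e^{-0.1598T_0'} + 16.147|\delta|\,e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi}\\
&+ \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{-1/2}\\
&+ (9 + 11|\delta|)x^{-1} + (\log q)(11 + 6|\delta|)x^{-3/2},
\end{aligned} \tag{9.70}\]
where $T_0' = T_0 - H$.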

Proof. We can apply Lemma 9.1.1 and Lemma 9.1.4 because $\eta_+(t)$, $(\log t)\eta_+(t)$ and $\eta_+'(t)$ are in $\ell_2$ (by (A.25), (A.28) and (A.32)) and $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ are in $\ell_1$ for $\sigma$ in an open interval containing $[1/2, 3/2]$ (by (A.30) and (A.33)). (Because of (9.5), the fact that $\eta_+(t)t^{-1/2}$ and $\eta_+(t)t^{1/2}$ are in $\ell_1$ implies that $\eta_+(t)\log t$ is also in $\ell_1$, as is required by Lemma 9.1.4.)

We apply Lemmas 9.1.1, 9.1.4 and 9.4.1. We bound the norms involving $\eta_+$ using the estimates in §A.3 and §A.4. Since $\eta_+(0) = 0$ (by the definition (A.3) of $\eta_+$), the term $R$ in (9.2) is at most $c_0$, where $c_0$ is as in (9.3). We bound
\[\begin{aligned}
c_0 &\le \frac{2}{3}\left(2.922875\left(\sqrt{\Gamma(1/2)} + \sqrt{\Gamma(3/2)}\right) + 1.062319\left(\sqrt{\Gamma(5/2)} + \sqrt{\Gamma(7/2)}\right)\right)\\
&\quad+ \frac{4\pi}{3}|\delta|\cdot 1.062319\left(\sqrt{\Gamma(3/2)} + \sqrt{\Gamma(5/2)}\right) \le 6.536232 + 9.319578|\delta|
\end{aligned}\]
using (A.30) and (A.33). By (A.25), (A.32) and the assumption $H \ge 25$,
\[|\eta_+|_2 \le 0.80365, \qquad |\eta_+'|_2 \le 1.0845789.\]
Thus, the error terms in (9.1) total at most
\[\begin{aligned}
6.536232 + 9.319578|\delta| &+ (\log q + 6.01)(1.0845789 + 2\pi\cdot 0.80365|\delta|)x^{-1/2}\\
&\le 9 + 11|\delta| + (\log q)(11 + 6|\delta|)x^{-1/2}.
\end{aligned} \tag{9.71}\]

The part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1) corresponding to zeros $\rho$ with $|\Im(\rho)| > T_0$ gets estimated by Lem. 9.4.1. By Lemma 9.1.4, the part of the sum corresponding to zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most
\[\left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{1/2},\]
where we estimate the norms $|\eta_+|_2$, $|\eta\cdot\log|_2$ and $|\eta(t)/\sqrt{t}|_1$ by (A.25), (A.28) and (A.30).
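The bound on $c_0$ in this proof is a direct evaluation of square roots of Gamma values. The sketch below recomputes the two coefficients; the inputs $2.922875$ and $1.062319$ are the constants appearing in the displayed bound (coming from (A.30) and (A.33)), and the factor $4\pi/3 = \frac{2}{3}\cdot 2\pi$ in front of $|\delta|$ is our reading of the display.

```python
import math

def sqrt_gamma(z):
    """Square root of Gamma(z) for z > 0."""
    return math.sqrt(math.gamma(z))

A, B = 2.922875, 1.062319   # constants from the bound on c0 in the text

c0_const = (2 / 3) * (A * (sqrt_gamma(0.5) + sqrt_gamma(1.5))
                      + B * (sqrt_gamma(2.5) + sqrt_gamma(3.5)))
c0_delta = (4 * math.pi / 3) * B * (sqrt_gamma(1.5) + sqrt_gamma(2.5))

# Worst-case check of (9.71) at x = 10^3 with q = 1 (log q = 0):
x = 1e3
extra_const = 6.01 * 1.0845789 / math.sqrt(x)
extra_delta = 6.01 * 2 * math.pi * 0.80365 / math.sqrt(x)

print(c0_const, c0_delta, c0_const + extra_const, c0_delta + extra_delta)
```

The last two printed values stay below $9$ and $11$ respectively, in line with the right side of (9.71).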

9.5 A sum for $\eta_+(t)^2$

Using a smoothing function sometimes leads to considering sums involving the square of the smoothing function. In particular, in Part III, we will need a result involving $\eta_+^2$, something that could be slightly challenging to prove, given the way in which $\eta_+$ is defined. Fortunately, we have bounds on $|\eta_+|_\infty$ and other $\ell_\infty$-norms (see Appendix A.5). Our task will also be made easier by the fact that we do not have a phase $e(\delta n/x)$ this time. All in all, this will be yet another demonstration of the generality of the framework developed in §9.1.

Proposition 9.5.1. Let $\eta = \eta_+$ be as in (9.63), $H \ge 25$. Let $x \ge 10^8$. Assume that all non-trivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge \max(2H + 25, 200)$.

Then
\[\sum_{n=1}^\infty \Lambda(n)(\log n)\eta_+^2(n/x) = x\cdot\int_0^\infty \eta_+^2(t)\log xt\,dt + O^*\left(\operatorname{err}_{\ell_2,\eta_+}\right)\cdot x\log x, \tag{9.72}\]
where
\[\begin{aligned}
\operatorname{err}_{\ell_2,\eta_+} &= \left(\left(0.462\frac{(\log T_1)^2}{\log x} + 0.909\log T_1\right)T_1 + 1.71\left(1 + \frac{\log T_1}{\log x}\right)H\right)e^{-\frac{\pi}{4}T_1}\\
&+ \left(2.445\sqrt{T_0}\log T_0 + 50.04\right)\cdot x^{-1/2}
\end{aligned} \tag{9.73}\]
and $T_1 = T_0 - 2H$.

The assumption $T_0 \ge 200$ is stronger than what we strictly need, but, as it happens, we could make much stronger assumptions still. Proposition 9.5.1 relies on a verification of zeros of the Riemann zeta function; such verifications have gone up to values of $T_0$ much higher than $200$.


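Proof. We will need to consider two smoothing functions, namely, $\eta_{+,0}(t) = \eta_+(t)^2$ and $\eta_{+,1}(t) = \eta_+(t)^2\log t$. Clearly,
\[\sum_{n=1}^\infty \Lambda(n)(\log n)\eta_+^2(n/x) = (\log x)\sum_{n=1}^\infty \Lambda(n)\eta_{+,0}(n/x) + \sum_{n=1}^\infty \Lambda(n)\eta_{+,1}(n/x).\]
Since $\eta_+(t) = h_H(t)te^{-t^2/2}$,
\[\eta_{+,0}(t) = h_H^2(t)t^2e^{-t^2}, \qquad \eta_{+,1}(t) = h_H^2(t)(\log t)t^2e^{-t^2}.\]
Let $\eta_{+,2} = (\log x)\eta_{+,0} + \eta_{+,1} = \eta_+^2(t)\log xt$.

We wish to apply Lemma 9.1.1. For this, we must first check that some norms are finite. Clearly,
\[\begin{aligned}
\eta_{+,2}(t) &= \eta_+^2(t)\log x + \eta_+^2(t)\log t,\\
\eta_{+,2}'(t) &= 2\eta_+(t)\eta_+'(t)\log x + 2\eta_+(t)\eta_+'(t)\log t + \eta_+^2(t)/t.
\end{aligned} \tag{9.74}\]
Thus, we see that $\eta_{+,2}(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$ and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$ (see (A.25), (A.38), (A.40)):
\[\begin{aligned}
|\eta_{+,2}(t)|_2 &\le \left|\eta_+^2(t)\right|_2\log x + \left|\eta_+^2(t)\log t\right|_2\\
&\le |\eta_+|_\infty|\eta_+|_2\log x + |\eta_+(t)\log t|_\infty|\eta_+|_2.
\end{aligned} \tag{9.75}\]
Similarly, $\eta_{+,2}'(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$, $\eta_+'(t)$ is in $\ell_2$ (A.32), and $\eta_+(t)$, $\eta_+(t)\log t$ and $\eta_+(t)/t$ (see (A.41)) are all in $\ell_\infty$:
\[\begin{aligned}
\left|\eta_{+,2}'(t)\right|_2 &\le \left|2\eta_+(t)\eta_+'(t)\right|_2\log x + \left|2\eta_+(t)\eta_+'(t)\log t\right|_2 + \left|\eta_+^2(t)/t\right|_2\\
&\le 2|\eta_+|_\infty\left|\eta_+'\right|_2\log x + 2|\eta_+(t)\log t|_\infty\left|\eta_+'\right|_2 + |\eta_+(t)/t|_\infty|\eta_+|_2.
\end{aligned} \tag{9.76}\]
In the same way, we see that $\eta_{+,2}(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(-1,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ (A.30), and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$) and $\eta_{+,2}'(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(0,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ (A.33), and $\eta_+(t)$, $\eta_+(t)\log t$, $\eta_+(t)/t$ are all in $\ell_\infty$).

We now apply Lemma 9.1.1 with $q = 1$, $\delta = 0$. Since $\eta_{+,2}(0) = 0$, the residue term $R$ equals $c_0$, which, by (9.74), is at most $2/3$ times
\[2\left(|\eta_+|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left(\left|\eta_+'(t)/\sqrt{t}\right|_1 + \left|\eta_+'(t)\sqrt{t}\right|_1\right) + |\eta_+(t)/t|_\infty\left(\left|\eta_+(t)/\sqrt{t}\right|_1 + \left|\eta_+(t)\sqrt{t}\right|_1\right).\]
Using the bounds (A.38), (A.40), (A.41) (with the assumption $H \ge 25$), (A.30) and (A.33), we get that this means that
\[c_0 \le 18.57606\log x + 8.63264.\]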


Since $q = 1$ and $\delta = 0$, we get from (9.76) (and (A.38), (A.40), (A.41), with the assumption $H \ge 25$, and also (A.25) and (A.32)) that
\[(\log q + 6.01)\cdot\left(\left|\eta_{+,2}'\right|_2 + 2\pi|\delta||\eta_{+,2}|_2\right)x^{-1/2} = 6.01\left|\eta_{+,2}'\right|_2x^{-1/2} \le (16.256\log x + 5.9325)x^{-1/2}.\]
Using the assumption $x \ge 10^8$, we obtain
\[c_0 + (18.526\log x + 7.1799)x^{-1/2} \le 19.064\log x. \tag{9.77}\]

We will now apply Lemma 9.1.4, as we may because of the finiteness of the norms we have already checked, together with
\[\begin{aligned}
|\eta_{+,2}(t)\log t|_2 &\le \left|\eta_+^2(t)\log t\right|_2\log x + \left|\eta_+^2(t)(\log t)^2\right|_2\\
&\le |\eta_+(t)\log t|_\infty\left(|\eta_+(t)|_2\log x + |\eta_+(t)\log t|_2\right)\\
&\le 0.4976\cdot(0.80365\log x + 0.82999) \le 0.3999\log x + 0.41301
\end{aligned} \tag{9.78}\]
(by (A.40), (A.25) and (A.28); use the assumption $H \ge 25$). We also need the bounds
\[|\eta_{+,2}(t)|_2 \le 1.14199\log x + 0.39989 \tag{9.79}\]
(from (9.75), by the norm bounds (A.38), (A.40) and (A.25), all with $H \ge 25$) and
\[\left|\eta_{+,2}(t)/\sqrt{t}\right|_1 \le \left(|\eta_+(t)|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left|\eta_+(t)/\sqrt{t}\right|_1 \le 1.4211\log x + 0.49763 \tag{9.80}\]
(by (A.38), (A.40) (again with $H \ge 25$) and (A.30)).

Applying Lemma 9.1.4, we obtain that the sum $\sum_\rho |G_0(\rho)|x^\rho$ (where $G_0(\rho) = M\eta_{+,2}(\rho)$) over all non-trivial zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most $x^{1/2}$ times
\[(1.54189\log x + 0.8129)\sqrt{T_0}\log T_0 + (4.21245\log x + 6.17301)\sqrt{T_0} + 49.1\log x + 17.2, \tag{9.81}\]
where we are bounding norms by (9.79), (9.78) and (9.80). (We are using the fact that $T_0 \ge 2\pi\sqrt{e}$ to ensure that the quantity $\sqrt{T_0}\log T_0 - (\log 2\pi\sqrt{e})\sqrt{T_0}$ being multiplied by $|\eta_{+,2}|_2$ is positive; thus, an upper bound for $|\eta_{+,2}|_2$ suffices.) By the assumptions $x \ge 10^8$, $T_0 \ge 200$, (9.81) is at most
\[\left(2.445\sqrt{T_0}\log T_0 + 50.034\right)\log x.\]
In comparison, $19.064x^{-1/2}\log x \le 0.002\log x$, since $x \ge 10^8$.

It remains to bound the sum of $M\eta_{+,2}(\rho)$ over zeros with $|\Im(\rho)| > T_0$. This we will do, as usual, by Lemma 9.1.3. For that, we will need to bound $M\eta_{+,2}(\rho)$ for $\rho$ in the critical strip.


        The Mellin transform of eminust2

        is Γ(s2)2 and so the Mellin transform of t2eminust2

        is Γ(s2 + 1)2 By (210) this implies that the Mellin transform of (log t)t2eminust2

        isΓprime(s2 + 1)4 Hence by (29)

        Mη+2(s) =1

        int infinminusinfin

        M(h2H)(ir) middot Fx (sminus ir) dr (982)

        whereFx(s) = (log x)Γ

        (s2

        + 1)

        +1

        2Γprime(s

        2+ 1) (983)

        Moreover

        M(h2H)(ir) =

        1

        int infinminusinfin

        MhH(iu)MhH(i(r minus u)) du (984)

        and so M(h2H)(ir) is supported on [minus2H 2H] We also see that |Mh2

        H(ir)|1 le|MhH(ir)|212π We know that |MhH(ir)|212π le 4173727 by (A17)

        Hence

        |Mη+2(s)| le 1

        int infinminusinfin|M(h2

        H)(ir)|dr middot max|r|le2H

        |Fx(sminus ir)|

        le 4173727

        4πmiddot max|r|le2H

        |Fx(sminus ir)| le 332135 middot max|r|le2H

        |Fx(sminus ir)|(985)

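        By (8.51) (Stirling with explicit constants),

        \[ |\Gamma(s)| \leq \sqrt{2\pi}\,|s|^{\sigma - \frac{1}{2}}\, e^{\frac{1}{12|s|} + \frac{\sqrt{2}}{180|s|^3}}\, e^{-\pi|\Im(s)|/2} \tag{9.86} \]

        when $\Re(s) \geq 0$, and so

        \[ |\Gamma(s)| \leq \sqrt{2\pi}\left(\frac{\sqrt{12.5^2 + 1.5^2}}{12.5}\right) e^{\frac{1}{12\cdot 12.5} + \frac{\sqrt{2}}{180\cdot 12.5^3}}\cdot |\Im(s)|\, e^{-\pi|\Im(s)|/2} \leq 2.542\,|\Im(s)|\, e^{-\pi|\Im(s)|/2} \tag{9.87} \]

        for $s \in \mathbb{C}$ with $0 < \Re(s) \leq 3/2$ and $|\Im(s)| \geq 25/2$. Moreover, by [OLBC10, 5.11.2] and the remarks at the beginning of [OLBC10, 5.11(ii)],

        \[ \frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O^*\left(\frac{1}{12|s|^2}\cdot\frac{1}{\cos^3(\theta/2)}\right) \]

        for $|\arg(s)| < \theta$ ($\theta \in (-\pi, \pi)$). Again, for $s = \sigma + i\tau$ with $0 < \sigma \leq 3/2$ and $|\tau| \geq 25/2$, this gives us

        \[ \begin{aligned} \frac{\Gamma'(s)}{\Gamma(s)} &= \log|\tau| + \log\frac{\sqrt{|\tau|^2 + 1.5^2}}{|\tau|} + O^*\left(\frac{1}{2|\tau|}\right) + O^*\left(\frac{1}{12|\tau|^2}\cdot\frac{1}{(1/\sqrt{2})^3}\right) \\ &= \log|\tau| + O^*\left(\frac{9}{8|\tau|^2} + \frac{1}{2|\tau|}\right) + \frac{O^*(0.236)}{|\tau|^2} = \log|\tau| + O^*\left(\frac{0.609}{|\tau|}\right). \end{aligned} \]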
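The numerical constants 2.542, 0.236, 0.609 and 3.32135 used above come from evaluating elementary expressions at the edge of the allowed range ($\Re(s) = 3/2$, $|\Im(s)| = 25/2$). A minimal sketch of that evaluation follows; the variable names are chosen here for illustration, and the value 41.73727 is simply taken from the text (it is the bound quoted from (A.17)).

```python
import math

tau, sigma = 12.5, 1.5          # worst case: |Im s| = 25/2, Re s = 3/2

# (9.87): sqrt(2 pi) * (|s| / |Im s|) * exp(Stirling correction at |s| >= 12.5)
ratio = math.hypot(tau, sigma) / tau
gamma_const = math.sqrt(2 * math.pi) * ratio * math.exp(
    1 / (12 * tau) + math.sqrt(2) / (180 * tau**3))

# Error term of the digamma expansion with theta = pi/2, i.e. cos(theta/2) = 1/sqrt(2)
digamma_theta = 1 / (12 * math.cos(math.pi / 4) ** 3)

# 9/(8 tau^2) + 1/(2 tau) + 0.236/tau^2 <= c / tau for tau >= 12.5, with c as follows
digamma_total = 9 / (8 * tau) + 0.5 + 0.236 / tau

# (9.85): 41.73727 / (4 pi), and its product with the constant of (9.87)
prefactor = 41.73727 / (4 * math.pi)
f_const = 3.32135 * 2.542

print(gamma_const, digamma_theta, digamma_total, prefactor, f_const)
```

The last product is the source of the constant 8.45 that appears in the definition of $f(T)$ further on.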


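        Hence, for $0 \leq \Re(s) \leq 1$ (or, in fact, $-2 \leq \Re(s) \leq 1$) and $|\Im(s)| \geq 25$, writing $\tau = \Im(s)$,

        \[ \begin{aligned} |F_x(s)| &\leq \left((\log x) + \frac{1}{2}\log\left|\frac{\tau}{2}\right| + \frac{1}{2}O^*\left(\frac{0.609}{|\tau/2|}\right)\right)\left|\Gamma\left(\frac{s}{2}+1\right)\right| \\ &\leq 2.542\left((\log x) + \frac{1}{2}\log|\tau| - 0.297\right)\frac{|\tau|}{2}\, e^{-\pi|\tau|/4}. \end{aligned} \tag{9.88} \]

        Thus, by (9.85), for $\rho = \sigma + i\tau$ with $|\tau| \geq T_0 \geq 2H + 25$ and $0 \leq \sigma \leq 1$,

        \[ |M\eta_{+,2}(\rho)| \leq f(\tau), \]

        where

        \[ f(T) = 8.45\left(\log x + \frac{1}{2}\log T\right)\left(\frac{T}{2} - H\right)\cdot e^{-\frac{\pi(T - 2H)}{4}}. \tag{9.89} \]

        The functions $t \mapsto t e^{-\pi t/2}$ and $t \mapsto (\log t)\, t e^{-\pi t/2}$ are decreasing for $t \geq e$ (or, in fact, for $t \geq 1.762$); setting $t = T/2 - H$, we see that the right side of (9.89) is a decreasing function of $T$ for $T \geq T_0$, since $T_0/2 - H \geq 25/2 > e$.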

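        We can now apply Lemma 9.1.3, and get that

        \[ \sum_{\substack{\rho \\ |\Im(\rho)| > T_0}} |M\eta_{+,2}(\rho)| \leq \int_{T_0}^{\infty} f(T)\left(\frac{1}{2\pi}\log\frac{T}{2\pi} + \frac{1}{4T}\right) dT. \tag{9.90} \]

        Since $T \geq T_0 \geq 75 > 2$, we know that $((1/2\pi)\log(T/2\pi) + 1/4T) \leq (1/2\pi)\log T$. Hence, the right side of (9.90) is at most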

        839

        int infinT0

        ((log x)(log T ) +

        (log T )2

        2

        )(T minus 2H)eminus

        π(Tminus2H)4 dT

        le 0668

        int infinT1

        ((log x)

        (log t+

        2H

        t

        )+

        ((log t)2

        2+ 2H

        log t

        t

        ))teminus

        πt4 dt

        (991)

        where $T_1 = T_0 - 2H$ and $t = T - 2H$; we are using the facts that $(\log t)'' < 0$ for $t > 0$ and $((\log t)^2)'' < 0$ for $t > e$. (Of course, $T_1 \geq 25 > e$.)

        Of course, $\int_{T_1}^{\infty} e^{-(\pi/4)t}\, dt = (4/\pi)\, e^{-(\pi/4)T_1}$. We recall (9.36) and (9.50):

        \[ \int_{T_1}^{\infty} \log t\cdot e^{-\frac{\pi}{4}t}\, dt \leq \left(\log T_1 + \frac{4/\pi}{T_1}\right)\frac{e^{-\frac{\pi}{4}T_1}}{\pi/4}, \]

        \[ \int_{T_1}^{\infty} (\log t)\, t\, e^{-\frac{\pi}{4}t}\, dt \leq \left(T_1 + \frac{4a}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}\log T_1}{\pi/4} \]

        for $T_1 \geq 1$ satisfying $\log T_1 > 4/(\pi T_1)$, where $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$. It is easy to check that $\log T_1 > 4/(\pi T_1)$ and $4a/\pi \leq 1.6957$ for $T_1 \geq 25$; of course, we also have $(4/\pi)/25 \leq 0.051$. Lastly,

        \[ \int_{T_1}^{\infty} (\log t)^2\, t\, e^{-\frac{\pi}{4}t}\, dt \leq \left(T_1 + \frac{4b}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}(\log T_1)^2}{\pi/4} \]

        for $T_1 \geq e$, where $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$, and we check that $4b/\pi \leq 2.1319$ for $T_1 \geq 25$. We conclude that the integral on the second line of (9.91) is at most

        \[ \frac{4}{\pi}\left(\frac{(\log T_1)^2}{2}(T_1 + 2.132) + (\log x)(\log T_1)(T_1 + 1.696)\right) e^{-\frac{\pi}{4}T_1} + \frac{4}{\pi}\cdot 2H(\log T_1 + 0.051 + \log x)\, e^{-\frac{\pi}{4}T_1}. \]

        Multiplying this by 0.668 and simplifying further (using $T_1 \geq 25$), we conclude that $\sum_{\rho: |\Im(\rho)| > T_0} |M\eta_{+,2}(\rho)|$ is at most

        \[ \left((0.462\log T_1 + 0.909\log x)(\log T_1)\, T_1 + 1.71(\log T_1 + \log x)H\right) e^{-\frac{\pi}{4}T_1}. \]
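The constants $4a/\pi \leq 1.6957$, $4b/\pi \leq 2.1319$, 0.462 and 0.909 can be recomputed directly from the definitions of $a$ and $b$. The sketch below does so at $T_1 = 25$, where $a$ and $b$ are largest in the range $T_1 \geq 25$ (both are decreasing in $T_1$); the helper names are illustrative and not part of the text.

```python
import math

def a_const(T1):
    """a = 1 + (1 + 4/(pi T1)) / (log T1 - 4/(pi T1)), as in the text."""
    c = 4 / (math.pi * T1)
    return 1 + (1 + c) / (math.log(T1) - c)

def b_const(T1):
    """b = 1 + (2 + 8/(pi T1)) / (log T1 - 8/(pi T1)), as in the text."""
    c = 8 / (math.pi * T1)
    return 1 + (2 + c) / (math.log(T1) - c)

T1 = 25.0
four_a_over_pi = 4 * a_const(T1) / math.pi
four_b_over_pi = 4 * b_const(T1) / math.pi

# Coefficients after multiplying by 0.668 and dividing the bracket by T1:
coef_log_sq = 0.668 * (4 / math.pi) * 0.5 * (1 + 2.132 / T1)   # compare with 0.462
coef_log_x = 0.668 * (4 / math.pi) * (1 + 1.696 / T1)          # compare with 0.909

print(four_a_over_pi, four_b_over_pi, coef_log_sq, coef_log_x)
```

The third constant, 1.71, also absorbs the term $0.051$ using $\log T_1 + \log x$ being large; that step depends on the lower bound for $x$ and is not reproduced here.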

        9.6 A verification of zeros and its consequences

        David Platt verified in his doctoral thesis [Pla11] that, for every primitive character $\chi$ of conductor $q \leq 10^5$, all the non-trivial zeroes of $L(s, \chi)$ with imaginary part $\leq 10^8/q$ lie on the critical line, i.e., have real part exactly $1/2$. (We call this a GRH verification up to $10^8/q$.)

        In work undertaken in coordination with the present work [Plab], Platt has extended these computations to

        • all odd $q \leq 3\cdot 10^5$, with $T_q = 10^8/q$,

        • all even $q \leq 4\cdot 10^5$, with $T_q = \max(10^8/q,\ 200 + 7.5\cdot 10^7/q)$.

        The method used was rigorous; its implementation uses interval arithmetic.

        Let us see what this verification gives us when used as an input to Prop. 9.2.2. We are interested in bounds on $|\mathrm{err}_{\eta,\chi^*}(\delta, x)|$ for $q \leq r$ and $|\delta| \leq 4r/q$. We set $r = 3\cdot 10^5$. (We will not be using the verification for $q$ even with $3\cdot 10^5 < q \leq 4\cdot 10^5$, though we certainly could.)

        We let $T_0 = 10^8/q$. Thus,

        \[ T_0 \geq \frac{10^8}{3\cdot 10^5} = \frac{1000}{3}, \qquad \frac{T_0}{\pi|\delta|} \geq \frac{10^8/q}{\pi\cdot 4r/q} = \frac{1000}{12\pi}, \tag{9.92} \]

        and so, by $|\delta| \leq 4r/q \leq 1.2\cdot 10^6/q \leq 1.2\cdot 10^6$,

        \[ 3.53\, e^{-0.1598 T_0} \leq 2.597\cdot 10^{-23}, \qquad \frac{22.5\,\delta^2}{T_0}\, e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}} \leq |\delta|\cdot 7.715\cdot 10^{-34} \leq 9.258\cdot 10^{-28}. \]

        Since $qT_0 \leq 10^8$, this gives us that

        \[ \log\frac{qT_0}{2\pi}\cdot\left(3.53\, e^{-0.1598 T_0} + \frac{22.5\,\delta^2}{T_0}\, e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}}\right) \leq 4.3054\cdot 10^{-22} + \frac{1.54\cdot 10^{-26}}{q} \leq 4.306\cdot 10^{-22}. \]
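These bounds only involve exponentials evaluated at the extreme values allowed by (9.92), so they are easy to re-derive in floating point. The following sketch (variable names are illustrative) evaluates them; since the text's figures are rounded upper bounds, the script compares rather than reproduces them. Floating-point arithmetic is used here, not the interval arithmetic that a fully rigorous check would call for.

```python
import math

r = 3e5
T0_min = 1e8 / r                      # T0 >= 1000/3
ratio_min = 1000 / (12 * math.pi)     # T0 / (pi |delta|) >= 1000/(12 pi)

# 3.53 exp(-0.1598 T0) is decreasing in T0
term1 = 3.53 * math.exp(-0.1598 * T0_min)

# 22.5 delta^2 / T0 = 22.5 |delta| (|delta| / T0), and |delta| / T0 <= 4r / 10^8 = 0.012
term2_per_delta = 22.5 * 0.012 * math.exp(-0.1065 * ratio_min**2)
term2_max = term2_per_delta * 1.2e6   # |delta| <= 1.2 * 10^6

log_factor = math.log(1e8 / (2 * math.pi))   # q T0 <= 10^8
total = log_factor * (term1 + term2_max)

print(term1, term2_per_delta, term2_max, total)
```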

        Again by $T_0 = 10^8/q$,

        \[ 2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38 \]

        is at most $\frac{648662}{\sqrt{q}} + 111$, and

        \[ 3\log q + 14|\delta| + 17 \leq 55 + \frac{1.7\cdot 10^7}{q}, \qquad (\log q + 6)\cdot(1 + 5|\delta|) \leq 19 + \frac{1.2\cdot 10^8}{q}. \]

        Hence, assuming $x \geq 10^8$ to simplify, we see that Prop. 9.2.2 gives us that

        \[ \begin{aligned} \mathrm{err}_{\eta,\chi}(\delta, x) &\leq 4.306\cdot 10^{-22} + \frac{\frac{648662}{\sqrt{q}} + 111}{\sqrt{x}} + \frac{55 + \frac{1.7\cdot 10^7}{q}}{x} + \frac{19 + \frac{1.2\cdot 10^8}{q}}{x^{3/2}} \\ &\leq 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right) \end{aligned} \]

        for $\eta(t) = e^{-t^2/2}$. This proves Theorem 7.1.1.

        Let us now see what Platt's calculations give us when used as an input to Prop. 9.3.2 and Cor. 9.3.3. Again, we set $r = 3\cdot 10^5$, $\delta_0 = 8$, $|\delta| \leq 4r/q$ and $T_0 = 10^8/q$, so (9.92) is still valid. We obtain

        \[ \begin{aligned} T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11\, e^{-0.1598 T_0} + 1.578\, e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right) &\leq \log\frac{10^8}{2\pi}\left(6.11\cdot\frac{1000}{3}\, e^{-0.1598\cdot\frac{1000}{3}} + 10^8\cdot 1.578\, e^{-0.1065\left(\frac{1000}{12\pi}\right)^2}\right) \\ &\leq 2.485\cdot 10^{-19}, \end{aligned} \]

        since $t\exp(-0.1598t)$ is decreasing on $t$ for $t \geq 1/0.1598$. We use the same bound when we have 0.0102 instead of 1.578 on the left side, as in (9.61). (The coefficient affects what is by far the smaller term, so we are wasting nothing.) Again by $T_0 = 10^8/q$ and $q \leq r$,

        \[ 1.22\sqrt{T_0}\log qT_0 + 5.053\sqrt{T_0} + 1.423\log q + 37.19 \leq \frac{279793}{\sqrt{q}} + 55.2, \]

        \[ 1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17 \leq \frac{378854}{\sqrt{q}} + 75.9. \]


        For $x \geq 10^8$, we use $|\delta| \leq 4r/q \leq 1.2\cdot 10^6/q$ to bound

        \[ (3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2} \leq \left(0.0004 + \frac{1322}{q}\right)x^{-1/2}, \]

        \[ (6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2} \leq \left(0.0007 + \frac{2644}{q}\right)x^{-1/2}. \]

        Summing, we obtain

        \[ \mathrm{err}_{\eta,\chi} \leq 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{281200}{\sqrt{q}} + 56\right) \]

        for $\eta(t) = t^2 e^{-t^2/2}$ and

        \[ \mathrm{err}_{\eta,\chi} \leq 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right) \]

        for $\eta(t) = t^2 e^{-t^2/2} *_M \eta_2(t)$. This proves Theorem 7.1.2 and Corollary 7.1.3.

        Now let us work with the smoothing weight $\eta_+$. This time around, set $r = 150000$ if $q$ is odd, and $r = 300000$ if $q$ is even. As before, we assume

        \[ q \leq r, \qquad |\delta| \leq 4r/q. \]

        We can see that Platt's verification [Plab], mentioned before, allows us to take

        \[ T_0 = H + \frac{250r}{q}, \qquad H = 200, \]

        since $T_q$ is always at least this ($T_q = 10^8/q \geq 200 + 7\cdot 10^7/q > 200 + 3.75\cdot 10^7/q$ for $q \leq 150000$ odd, $T_q \geq 200 + 7.5\cdot 10^7/q$ for $q \leq 300000$ even).

        Thus,

        \[ T_0 - H = \frac{250r}{q} \geq \frac{250r}{r} = 250, \qquad \frac{T_0 - H}{\pi\delta} \geq \frac{250r}{\pi\delta q} \geq \frac{250}{4\pi} = 19.89436\dotsc \]

        and also

        \[ T_0 \leq 200 + 250\cdot 150000 \leq 3.751\cdot 10^7, \qquad qT_0 \leq rH + 250r \leq 1.35\cdot 10^8. \]

        Hence, since $\sqrt{t}\, e^{-0.1598t}$ is decreasing on $t$ for $t \geq 1/(2\cdot 0.1598)$,

        \[ \begin{aligned} 11.308\sqrt{T_0 - H}\, e^{-0.1598(T_0 - H)} + 16.147|\delta|\, e^{-0.1065\frac{(T_0 - H)^2}{(\pi\delta)^2}} &\leq 7.9854\cdot 10^{-16} + \frac{4r}{q}\cdot 7.9814\cdot 10^{-18} \\ &\leq 7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}. \end{aligned} \]
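As with the Gaussian weight, the two exponential terms for $\eta_+$ are evaluated at the smallest admissible values $T_0 - H = 250$ and $(T_0 - H)/(\pi|\delta|) = 250/(4\pi)$. The sketch below recomputes them in floating point; the variable names are illustrative, and a rigorous verification would use interval arithmetic instead.

```python
import math

H = 200
r_odd, r_even = 150000, 300000

gap_min = 250.0                        # T0 - H >= 250
ratio_min = 250 / (4 * math.pi)        # (T0 - H) / (pi |delta|) >= 250 / (4 pi)

term1 = 11.308 * math.sqrt(gap_min) * math.exp(-0.1598 * gap_min)
term2_per_delta = 16.147 * math.exp(-0.1065 * ratio_min**2)
term2_times_4r = term2_per_delta * 4 * r_even      # |delta| <= 4r/q, r <= 300000

qT0_max = r_even * H + 250 * r_even                # q T0 <= rH + 250 r

print(ratio_min, term1, term2_per_delta, term2_times_4r, qT0_max)
```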


        Examining (9.70), we get

        \[ \begin{aligned} \mathrm{err}_{\eta_+,\chi}(\delta, x) &\leq \log\frac{1.35\cdot 10^8}{2\pi}\cdot\left(7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}\right) \\ &\quad + \left(\left(1.634\log(1.35\cdot 10^8) + 12.43\right)\frac{\sqrt{1.35\cdot 10^8}}{\sqrt{q}} + 1.321\log 300000 + 34.51\right)\frac{1}{\sqrt{x}} \\ &\quad + \left(9 + 11\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-1} + (\log 300000)\left(11 + 6\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-3/2} \\ &\leq 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \left(\frac{499845}{\sqrt{q}} + 51.17 + \frac{13.2\cdot 10^6}{q\sqrt{x}} + \frac{9}{\sqrt{x}} + \frac{9.1\cdot 10^7}{qx} + \frac{139}{x}\right)\frac{1}{\sqrt{x}}. \end{aligned} \]

        Making the assumption $x \geq 10^{12}$, we obtain

        \[ \mathrm{err}_{\eta_+,\chi}(\delta, x) \leq 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \left(\frac{499900}{\sqrt{q}} + 52\right)\frac{1}{\sqrt{x}}. \]

        This proves Theorem 7.1.4 for general $q$.

        Let us optimize things a little more carefully for the trivial character $\chi_T$. Again, we will make the assumption $x \geq 10^{12}$. We will also assume, as we did before, that $|\delta| \leq 4r/q$; this now gives us $|\delta| \leq 600000$, since $q = 1$ and $r = 150000$ for $q$ odd. We will go up to a height $T_0 = H + 600000\pi\cdot t$, where $H = 200$ and $t \geq 10$. Then

        \[ \frac{T_0 - H}{\pi\delta} = \frac{600000\pi t}{4\pi r} \geq t. \]

        Hence

        \[ 11.308\sqrt{T_0 - H}\, e^{-0.1598(T_0 - H)} + 16.147|\delta|\, e^{-0.1065\frac{(T_0 - H)^2}{(\pi\delta)^2}} \leq 10^{-1300000} + 9689000\, e^{-0.1065t^2}. \]

        Looking at (9.70), we get

        \[ \mathrm{err}_{\eta_+,\chi_T}(\delta, x) \leq \log\frac{T_0}{2\pi}\cdot\left(10^{-1300000} + 9689000\, e^{-0.1065t^2}\right) + \left((1.634\log T_0 + 12.43)\sqrt{T_0} + 34.51\right)x^{-1/2} + 6600009\, x^{-1}. \]

        The value $t = 20$ seems good enough; we choose it because it is not far from optimal for $x \sim 10^{27}$. We get that $T_0 = 12000000\pi + 200$; since $T_0 < 10^8$, we are within the range of the computations in [Plab] (or, for that matter, [Wed03] or [Plaa]). We obtain

        \[ \mathrm{err}_{\eta_+,\chi_T}(\delta, x) \leq 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}. \]
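The choice $t = 20$ can be checked numerically in the same way. In the sketch below (names are illustrative), the term bounded by $10^{-1300000}$ underflows any floating-point type, so it is handled through its base-10 logarithm at $t = 10$, the smallest value allowed; the remaining terms are evaluated at $t = 20$ and $x = 10^{12}$.

```python
import math

H = 200
delta_max = 600000

# Tiny first term, via logarithms, at t = 10: log10(11.308 sqrt(T) exp(-0.1598 T))
T_gap = delta_max * math.pi * 10
log10_first = (math.log10(11.308) + 0.5 * math.log10(T_gap)
               - 0.1598 * T_gap / math.log(10))

# 16.147 |delta| <= 9689000
second_coeff = 16.147 * delta_max

# Choice t = 20
t = 20
T0 = H + delta_max * math.pi * t
main_term = math.log(T0 / (2 * math.pi)) * 9689000 * math.exp(-0.1065 * t**2)

# Coefficient of x^(-1/2), absorbing 6600009 / x using x >= 10^12
sqrt_coeff = ((1.634 * math.log(T0) + 12.43) * math.sqrt(T0) + 34.51
              + 6600009 / math.sqrt(1e12))

print(log10_first, second_coeff, T0, main_term, sqrt_coeff)
```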

        Lastly, let us look at the sum estimated in (9.72). Here it will be enough to go up to just $T_0 = 2H + \max(50, H/4) = 450$, where, as before, $H = 200$. Of course, the verification of the zeros of the Riemann zeta function does go that far; as we already said, it goes until $10^8$ (or rather more: see [Wed03] and [Plaa]). We make, again, the assumption $x \geq 10^{12}$. We look at (9.73) and obtain that $\mathrm{err}_{\ell_2,\eta_+}$ is at most

        \[ \begin{aligned} &\left(\left(0.462\frac{(\log 50)^2}{\log 10^{12}} + 0.909\log 50\right)\cdot 50 + 1.71\left(1 + \frac{\log 50}{\log 10^{12}}\right)\cdot 200\right) e^{-\frac{\pi}{4}50} + \left(2.445\sqrt{450}\log 450 + 50.04\right)\cdot x^{-1/2} \\ &\qquad \leq 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}}. \end{aligned} \tag{9.93} \]

        It remains only to estimate the integral in (9.72). First of all,

        \[ \int_0^{\infty} \eta_+^2(t)\log xt\, dt = \int_0^{\infty} \eta_\circ^2(t)\log xt\, dt + 2\int_0^{\infty} (\eta_+(t) - \eta_\circ(t))\,\eta_\circ(t)\log xt\, dt + \int_0^{\infty} (\eta_+(t) - \eta_\circ(t))^2\log xt\, dt. \]

        The main term will be given by

        \[ \int_0^{\infty} \eta_\circ^2(t)\log xt\, dt = \left(0.64020599736635 + O\left(10^{-14}\right)\right)\log x - 0.021094778698867 + O\left(10^{-15}\right), \]

        where the integrals were computed rigorously using VNODE-LP [Ned06]. (The integral $\int_0^{\infty}\eta_\circ^2(t)\, dt$ can also be computed symbolically.) By Cauchy-Schwarz and the triangle inequality,

        \[ \begin{aligned} \int_0^{\infty} (\eta_+(t) - \eta_\circ(t))\,\eta_\circ(t)\log xt\, dt &\leq |\eta_+ - \eta_\circ|_2\, |\eta_\circ(t)\log xt|_2 \\ &\leq |\eta_+ - \eta_\circ|_2\left(|\eta_\circ|_2\log x + |\eta_\circ\cdot\log|_2\right) \\ &\leq \frac{274.86}{H^{7/2}}(0.80013\log x + 0.214) \\ &\leq 1.944\cdot 10^{-6}\cdot\log x + 5.2\cdot 10^{-7}, \end{aligned} \]

        where we are using (A.23) and evaluate $|\eta_\circ\cdot\log|_2$ rigorously as above. By (A.23) and (A.24),

        \[ \int_0^{\infty} (\eta_+(t) - \eta_\circ(t))^2\log xt\, dt \leq \left(\frac{274.86}{H^{7/2}}\right)^2\log x + \frac{27428}{H^7} \leq 5.903\cdot 10^{-12}\cdot\log x + 2.143\cdot 10^{-12}. \]
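With $H = 200$, the four numerical bounds just stated follow from the quoted norm estimates by direct substitution. A brief sketch, with illustrative variable names and with the constants 274.86, 27428, 0.80013 and 0.214 taken as given from the text:

```python
import math

H = 200
diff_norm = 274.86 / H**3.5            # bound on |eta_+ - eta_o|_2 quoted from (A.23)

cross_log = diff_norm * 0.80013        # coefficient of log x in the cross term
cross_const = diff_norm * 0.214        # constant in the cross term
square_log = diff_norm**2              # coefficient of log x in the squared term
square_const = 27428 / H**7            # constant in the squared term

# 0.80013 should dominate the square root of the main-term coefficient
norm_eta = math.sqrt(0.64020599736635)

print(cross_log, cross_const, square_log, square_const, norm_eta)
```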

        We conclude that

        \[ \int_0^{\infty} \eta_+^2(t)\log xt\, dt = \left(0.640206 + O^*(1.95\cdot 10^{-6})\right)\log x - 0.021095 + O^*(5.3\cdot 10^{-7}). \tag{9.94} \]

        We add to this the error term $5.123\cdot 10^{-15} + 366.91/\sqrt{x}$ from (9.93), and simplify using the assumption $x \geq 10^{12}$. We obtain

        \[ \sum_{n=1}^{\infty} \Lambda(n)(\log n)\,\eta_+^2(n/x) = 0.640206\, x\log x - 0.021095\, x + O^*\left(2\cdot 10^{-6}\, x\log x + 366.91\sqrt{x}\log x\right), \tag{9.95} \]

        and so Prop. 9.5.1 gives us Proposition 7.1.5.

        As we can see, the relatively large error term $2\cdot 10^{-6}$ comes from the fact that we have wanted to give the main term in (9.72) as an explicit constant, rather than as an integral. This is satisfactory: Prop. 7.1.5 is an auxiliary result that will be needed for one specific purpose in Part III, as opposed to Thms. 7.1.1–7.1.4, which, while crucial for Part III, are also of general applicability and interest.
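The constants in (9.93), and the way the smaller errors are absorbed into $2\cdot 10^{-6}$ in (9.95), can be re-derived as follows. This is a sketch only: the bookkeeping of rounding errors in the last step (how the truncation of the main-term constants is accounted for) is an assumption made here, not a step spelled out in the text.

```python
import math

H, T1, T0 = 200, 50, 450
L50, Lx = math.log(50), math.log(1e12)

# Contribution of the zeros in (9.93), already divided by log x
zero_tail = ((0.462 * L50**2 / Lx + 0.909 * L50) * T1
             + 1.71 * (1 + L50 / Lx) * H) * math.exp(-math.pi / 4 * T1)

# Coefficient of x^(-1/2) in (9.93)
sqrt_coeff = 2.445 * math.sqrt(T0) * math.log(T0) + 50.04

# Coefficient of x log x in the error of (9.95): the O* terms of (9.94), the
# truncation of 0.64020599736635 and 0.021094778698867 to six decimals
# (assumed bookkeeping), and the zero contribution, all with log x >= log 10^12.
log_coeff = (1.95e-6 + 5.3e-7 / Lx
             + abs(0.640206 - 0.64020599736635)
             + abs(0.021095 - 0.021094778698867) / Lx
             + zero_tail)

print(zero_tail, sqrt_coeff, log_coeff)
```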

        Part III

        The integral over the circle


        Chapter 10

        The integral over the major arcs

        Let

        \[ S_\eta(\alpha, x) = \sum_n \Lambda(n)\, e(\alpha n)\, \eta(n/x), \tag{10.1} \]

        where $\alpha \in \mathbb{R}/\mathbb{Z}$, $\Lambda$ is the von Mangoldt function and $\eta: \mathbb{R} \to \mathbb{C}$ is of fast enough decay for the sum to converge.

        Our ultimate goal is to bound from below

        \[ \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_1(n_1/x)\,\eta_2(n_2/x)\,\eta_3(n_3/x), \tag{10.2} \]

        where $\eta_1, \eta_2, \eta_3: \mathbb{R} \to \mathbb{C}$. Once we know that this is neither zero nor very close to zero, we will know that it is possible to write $N$ as the sum of three primes $n_1$, $n_2$, $n_3$ in at least one way; that is, we will have proven the ternary Goldbach conjecture.

        As can be readily seen, (10.2) equals

        \[ \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_1}(\alpha, x)\, S_{\eta_2}(\alpha, x)\, S_{\eta_3}(\alpha, x)\, e(-N\alpha)\, d\alpha. \tag{10.3} \]

        In the circle method, the set $\mathbb{R}/\mathbb{Z}$ gets partitioned into the set of major arcs $\mathfrak{M}$ and the set of minor arcs $\mathfrak{m}$; the contribution of each of the two sets to the integral (10.3) is evaluated separately.
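The identity between (10.2) and (10.3) is orthogonality of the characters $e(\alpha n)$ on $\mathbb{R}/\mathbb{Z}$. For a weight of compact support it can be tested exactly on a small example, since the integrand is then a trigonometric polynomial and the integral equals a finite Riemann sum over $\alpha = k/M$ once $M$ exceeds three times the largest $n$ involved. The sketch below takes $\eta_1 = \eta_2 = \eta_3$; the weight `eta`, the values of `x` and `N`, and all function names are illustrative assumptions, not the choices made in the text.

```python
import cmath
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:                 # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def weights(eta, x, n_max):
    """w[n] = Lambda(n) * eta(n / x) for 0 <= n <= n_max."""
    return [von_mangoldt(n) * eta(n / x) for n in range(n_max + 1)]

def direct_sum(N, w):
    """The sum (10.2): over n1 + n2 + n3 = N with all n_i >= 1."""
    total = 0.0
    for n1 in range(1, N):
        for n2 in range(1, N - n1):
            total += w[n1] * w[n2] * w[N - n1 - n2]
    return total

def circle_sum(N, w, M):
    """The integral (10.3), computed as an exact Riemann sum over alpha = k/M."""
    total = 0j
    for k in range(M):
        alpha = k / M
        S = sum(w[n] * cmath.exp(2j * math.pi * alpha * n) for n in range(len(w)))
        total += S**3 * cmath.exp(-2j * math.pi * alpha * N)
    return (total / M).real

eta = lambda t: t * (1 - t) if 0 < t < 1 else 0.0   # toy compactly supported weight
x, N = 20, 21
w = weights(eta, x, N)
M = 3 * N + 1                                       # more sample points than the degree
direct = direct_sum(N, w)
circle = circle_sum(N, w, M)
print(direct, circle)
```

The two printed values should agree up to rounding error; the actual proof, of course, never evaluates the integral this way, but splits it into major and minor arcs.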

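        Our objective here is to treat the major arcs: we wish to estimate

        \[ \int_{\mathfrak{M}} S_{\eta_1}(\alpha, x)\, S_{\eta_2}(\alpha, x)\, S_{\eta_3}(\alpha, x)\, e(-N\alpha)\, d\alpha \tag{10.4} \]

        for $\mathfrak{M} = \mathfrak{M}_{\delta_0, r}$, where

        \[ \mathfrak{M}_{\delta_0, r} = \bigcup_{\substack{q \leq r \\ q \text{ odd}}}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{2qx},\ \frac{a}{q} + \frac{\delta_0 r}{2qx}\right) \cup \bigcup_{\substack{q \leq 2r \\ q \text{ even}}}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{qx},\ \frac{a}{q} + \frac{\delta_0 r}{qx}\right) \tag{10.5} \]

        and $\delta_0 > 0$, $r \geq 1$ are given.

        In other words, our major arcs will be few (that is, a constant number) and narrow. While [LW02] used relatively narrow major arcs as well, their number, as in all previous proofs of Vinogradov's result, was not bounded by a constant. (In his proof of the five-primes theorem, [Tao14] is able to take a single major arc around 0; this is not possible here.)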
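To make the definition (10.5) concrete, the arcs can be enumerated for small parameters. The sketch below is illustrative only: the function name is made up here, the parameters are taken to be integers so that exact rational arithmetic can be used, and the tiny value of `r` is chosen purely so that the list stays short.

```python
from fractions import Fraction
from math import gcd

def major_arcs(delta0, r, x):
    """Return the arcs of (10.5) as (center, half_width) pairs of exact rationals.

    Odd moduli q run up to r with half-width delta0*r/(2qx);
    even moduli q run up to 2r with half-width delta0*r/(qx).
    """
    arcs = []
    for q in range(1, 2 * r + 1):
        if q % 2 == 1 and q > r:
            continue                          # odd moduli stop at r
        half = Fraction(delta0 * r, (2 if q % 2 == 1 else 1) * q * x)
        for a in range(q):
            if gcd(a, q) == 1:                # gcd(0, 1) == 1, so q = 1 gives a = 0
                arcs.append((Fraction(a, q), half))
    return arcs

arcs = major_arcs(delta0=8, r=5, x=10**6)
print(len(arcs), arcs[:3])
```

The number of arcs depends only on $r$, not on $x$, which is the sense in which the major arcs are "a constant number"; their widths shrink like $1/x$.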

        What we are about to see is the general major-arc setup. This is naturally the place where the overlap with the existing literature is largest. Two important differences can nevertheless be singled out.

        • The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need estimates for exponential sums on major arcs, and not just at the middle of each major arc. (If there is smoothing, we cannot use summation by parts to reduce the problem of estimating sums to a problem of counting primes in arithmetic progressions, or weighted by characters.)

        • Since our $L$-function estimates for exponential sums will give bounds that are better than the trivial one by only a constant (even if it is a rather large constant), we need to be especially careful when estimating error terms, finding cancellation when possible.

        10.1 Decomposition of $S_\eta$ by characters

        What follows is largely classical; cf. [HL22] or, say, [Dav67, §26]. The only difference from the literature lies in the treatment of $n$ non-coprime to $q$, and the way in which we show that our exponential sum (10.8) is equal to a linear combination of twisted sums $S_{\eta,\chi^*}$ over primitive characters $\chi^*$. (Non-primitive characters would give us $L$-functions with some zeroes inconveniently placed on the line $\Re(s) = 0$.)

        Write $\tau(\chi, b)$ for the Gauss sum

        \[ \tau(\chi, b) = \sum_{a \bmod q} \chi(a)\, e(ab/q) \tag{10.6} \]

        associated to a $b \in \mathbb{Z}/q\mathbb{Z}$ and a Dirichlet character $\chi$ with modulus $q$. We let $\tau(\chi) = \tau(\chi, 1)$. If $(b, q) = 1$, then $\tau(\chi, b) = \chi(b^{-1})\tau(\chi)$.

        Recall that $\chi^*$ denotes the primitive character inducing a given Dirichlet character $\chi$. Writing $\sum_{\chi \bmod q}$ for a sum over all characters $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$, we see that, for any $a_0 \in \mathbb{Z}/q\mathbb{Z}$,

        \[ \begin{aligned} \frac{1}{\phi(q)}\sum_{\chi \bmod q} \tau(\overline{\chi}, b)\,\chi^*(a_0) &= \frac{1}{\phi(q)}\sum_{\chi \bmod q}\ \sum_{\substack{a \bmod q \\ (a,q)=1}} \overline{\chi}(a)\, e(ab/q)\,\chi^*(a_0) \\ &= \sum_{\substack{a \bmod q \\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q}\chi^*(a^{-1}a_0) = \sum_{\substack{a \bmod q \\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q'}\chi(a^{-1}a_0), \end{aligned} \tag{10.7} \]

        where $q' = q/\gcd(q, a_0^\infty)$. Now, $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = 0$ unless $a = a_0$ (in which case $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = \phi(q')$). Thus, (10.7) equals

        \[ \begin{aligned} \frac{\phi(q')}{\phi(q)}\sum_{\substack{a \bmod q \\ (a,q)=1 \\ a \equiv a_0 \bmod q'}} e(ab/q) &= \frac{\phi(q')}{\phi(q)}\sum_{\substack{k \bmod q/q' \\ (k, q/q')=1}} e\left(\frac{(a_0 + kq')b}{q}\right) \\ &= \frac{\phi(q')}{\phi(q)}\, e\left(\frac{a_0 b}{q}\right)\sum_{\substack{k \bmod q/q' \\ (k, q/q')=1}} e\left(\frac{kb}{q/q'}\right) = \frac{\phi(q')}{\phi(q)}\, e\left(\frac{a_0 b}{q}\right)\mu(q/q'), \end{aligned} \]

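        provided that $(b, q) = 1$. (We are evaluating a Ramanujan sum in the last step.) Hence, for $\alpha = a/q + \delta/x$, $q \leq x$, $(a, q) = 1$,

        \[ \frac{1}{\phi(q)}\sum_\chi \tau(\overline{\chi}, a)\sum_n \chi^*(n)\Lambda(n)\, e(\delta n/x)\,\eta(n/x) \]

        equals

        \[ \sum_n \frac{\mu((q, n^\infty))}{\phi((q, n^\infty))}\,\Lambda(n)\, e(\alpha n)\,\eta(n/x). \]

        Since $(a, q) = 1$, $\tau(\overline{\chi}, a) = \chi(a)\tau(\overline{\chi})$. The factor $\mu((q, n^\infty))/\phi((q, n^\infty))$ equals 1 when $(n, q) = 1$; the absolute value of the factor is at most 1 for every $n$. Clearly,

        \[ \sum_{\substack{n \\ (n,q) \neq 1}} \Lambda(n)\,\eta\left(\frac{n}{x}\right) = \sum_{p | q} \log p \sum_{\alpha \geq 1} \eta\left(\frac{p^\alpha}{x}\right). \]

        Recalling the definition (10.1) of $S_\eta(\alpha, x)$, we conclude that

        \[ S_\eta(\alpha, x) = \frac{1}{\phi(q)}\sum_{\chi \bmod q} \chi(a)\,\tau(\overline{\chi})\, S_{\eta,\chi^*}\left(\frac{\delta}{x}, x\right) + O^*\left(2\sum_{p | q}\log p\sum_{\alpha \geq 1}\eta\left(\frac{p^\alpha}{x}\right)\right), \tag{10.8} \]

        where

        \[ S_{\eta,\chi}(\beta, x) = \sum_n \Lambda(n)\,\chi(n)\, e(\beta n)\,\eta(n/x). \tag{10.9} \]

        Hence $S_{\eta_1}(\alpha, x)\, S_{\eta_2}(\alpha, x)\, S_{\eta_3}(\alpha, x)\, e(-N\alpha)$ equals

        \[ \begin{aligned} \frac{1}{\phi(q)^3}\sum_{\chi_1}\sum_{\chi_2}\sum_{\chi_3} &\tau(\overline{\chi_1})\,\tau(\overline{\chi_2})\,\tau(\overline{\chi_3})\,\chi_1(a)\chi_2(a)\chi_3(a)\, e(-Na/q) \\ &\cdot S_{\eta_1,\chi_1^*}(\delta/x, x)\, S_{\eta_2,\chi_2^*}(\delta/x, x)\, S_{\eta_3,\chi_3^*}(\delta/x, x)\, e(-\delta N/x) \end{aligned} \tag{10.10} \]

        plus an error term of absolute value at most

        \[ 2\sum_{j=1}^{3}\ \prod_{j' \neq j} |S_{\eta_{j'}}(\alpha, x)| \sum_{p | q}\log p\sum_{\alpha \geq 1}\eta_j\left(\frac{p^\alpha}{x}\right). \tag{10.11} \]

        We will later see that the integral of (10.11) over $S^1$ is negligible: for our choices of $\eta_j$, it will, in fact, be of size $O(x(\log x)^A)$, $A$ a constant. The error term $O(x(\log x)^A)$ should be compared to the main term, which will be of size about a constant times $x^2$.

        In (10.10), we have reduced our problems to estimating $S_{\eta,\chi}(\delta/x, x)$ for $\chi$ primitive; a more obvious way of reaching the same goal would have made (10.11) worse by a factor of about $\sqrt{q}$.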

        10.2 The integral over the major arcs: the main term

        We are to estimate the integral (10.4), where the major arcs $\mathfrak{M}_{\delta_0, r}$ are defined as in (10.5). We will use $\eta_1 = \eta_2 = \eta_+$, $\eta_3(t) = \eta_*(\kappa t)$, where $\eta_+$ and $\eta_*$ will be set later.

        We can write

        \[ \begin{aligned} S_{\eta,\chi}(\delta/x, x) = S_\eta(\delta/x, x) &= \int_0^{\infty} \eta(t/x)\, e(\delta t/x)\, dt + O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x \\ &= \widehat{\eta}(-\delta)\cdot x + O^*(\mathrm{err}_{\eta,\chi_T}(\delta, x))\cdot x \end{aligned} \tag{10.12} \]

        for $\chi = \chi_T$ the trivial character, and

        \[ S_{\eta,\chi}(\delta/x, x) = O^*(\mathrm{err}_{\eta,\chi}(\delta, x))\cdot x \tag{10.13} \]

        for $\chi$ primitive and non-trivial. The estimation of the error terms err will come later; let us focus on (a) obtaining the contribution of the main term, (b) using estimates on the error terms efficiently.

        The main term: three principal characters. The main contribution will be given by the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$, where $\chi_0$ is the principal character mod $q$.

        The sum $\tau(\chi_0, n)$ is a Ramanujan sum; as is well-known (see, e.g., [IK04, (3.2)]),

        \[ \tau(\chi_0, n) = \sum_{d | (q, n)} \mu(q/d)\, d. \tag{10.14} \]

        This simplifies to $\mu(q/(q, n))\,\phi((q, n))$ for $q$ square-free. The special case $n = 1$ gives us that $\tau(\chi_0) = \mu(q)$.
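The evaluation (10.14) of the Ramanujan sum, and its special case $\tau(\chi_0) = \mu(q)$, can be confirmed numerically for small moduli by summing the exponentials directly. The following sketch uses helper names chosen here for illustration.

```python
import cmath
import math

def mobius(n):
    """Moebius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0               # a square divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def gauss_sum_principal(q, n):
    """tau(chi_0, n): sum of e(a n / q) over a mod q with (a, q) = 1."""
    return sum(cmath.exp(2j * math.pi * a * n / q)
               for a in range(q) if math.gcd(a, q) == 1)

def divisor_formula(q, n):
    """Right side of (10.14): sum over d | (q, n) of mu(q/d) * d."""
    g = math.gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

for q in (6, 12, 30):
    print(q, [divisor_formula(q, n) for n in range(1, 7)])
```

For instance, for $q = 6$ the values are periodic in $n$ with period 6, and the value at $n = 1$ is $\mu(6) = 1$.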

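        Thus, the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$ equals

        \[ \frac{e(-Na/q)}{\phi(q)^3}\,\mu(q)^3\, S_{\eta_+,\chi_0^*}(\delta/x, x)^2\, S_{\eta_*,\chi_0^*}(\delta/x, x)\, e(-\delta N/x), \tag{10.15} \]

        where, of course, $S_{\eta,\chi_0^*}(\alpha, x) = S_\eta(\alpha, x)$ (since $\chi_0^*$ is the trivial character). Summing (10.15) for $\alpha = a/q + \delta/x$ and $a$ going over all residues mod $q$ coprime to $q$, we obtain

        \[ \frac{\mu\left(\frac{q}{(q,N)}\right)\phi((q,N))}{\phi(q)^3}\,\mu(q)^3\, S_{\eta_+,\chi_0^*}(\delta/x, x)^2\, S_{\eta_*,\chi_0^*}(\delta/x, x)\, e(-\delta N/x). \]

        The integral of (10.15) over all of $\mathfrak{M} = \mathfrak{M}_{\delta_0, r}$ (see (10.5)) thus equals

        \[ \begin{aligned} &\sum_{\substack{q \leq r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} S^2_{\eta_+,\chi_0^*}(\alpha, x)\, S_{\eta_*,\chi_0^*}(\alpha, x)\, e(-\alpha N)\, d\alpha \\ &+ \sum_{\substack{q \leq 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} S^2_{\eta_+,\chi_0^*}(\alpha, x)\, S_{\eta_*,\chi_0^*}(\alpha, x)\, e(-\alpha N)\, d\alpha. \end{aligned} \tag{10.16} \]

        The main term in (10.16) is

        \[ \begin{aligned} &x^3\cdot\sum_{\substack{q \leq r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} (\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha \\ &+ x^3\cdot\sum_{\substack{q \leq 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} (\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha. \end{aligned} \tag{10.17} \]

        We would like to complete both the sum and the integral. Before, we should say that we will want to be able to use smoothing functions $\eta_+$ whose Fourier transforms are not easy to deal with directly. All we want to require is that there be a smoothing function $\eta_\circ$, easier to deal with, such that $\eta_\circ$ be close to $\eta_+$ in $\ell_2$ norm.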

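        Assume, then, that

        \[ |\eta_+ - \eta_\circ|_2 \leq \epsilon_0 |\eta_\circ|_2, \]

        where $\eta_\circ$ is thrice differentiable outside finitely many points and satisfies $\eta_\circ^{(3)} \in L_1$. Then (10.17) equals

        \[ \begin{aligned} &x^3\cdot\sum_{\substack{q \leq r \\ q \text{ odd}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha \\ &+ x^3\cdot\sum_{\substack{q \leq 2r \\ q \text{ even}}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha \end{aligned} \tag{10.18} \]

        plus

        \[ O^*\left(x^2\cdot\sum_q \frac{\mu(q)^2}{\phi(q)^2}\int_{-\infty}^{\infty} \left|(\widehat{\eta_+}(-\alpha))^2 - (\widehat{\eta_\circ}(-\alpha))^2\right| |\widehat{\eta_*}(-\alpha)|\, d\alpha\right). \tag{10.19} \]

        Here (10.19) is bounded by $2.82643\, x^2$ (by (C.9)) times

        \[ \begin{aligned} &|\widehat{\eta_*}(-\alpha)|_\infty\cdot\sqrt{\int_{-\infty}^{\infty} |\widehat{\eta_+}(-\alpha) - \widehat{\eta_\circ}(-\alpha)|^2\, d\alpha\cdot\int_{-\infty}^{\infty} |\widehat{\eta_+}(-\alpha) + \widehat{\eta_\circ}(-\alpha)|^2\, d\alpha} \\ &\quad\leq |\eta_*|_1\cdot|\widehat{\eta_+} - \widehat{\eta_\circ}|_2\, |\widehat{\eta_+} + \widehat{\eta_\circ}|_2 = |\eta_*|_1\cdot|\eta_+ - \eta_\circ|_2\, |\eta_+ + \eta_\circ|_2 \\ &\quad\leq |\eta_*|_1\cdot|\eta_+ - \eta_\circ|_2\left(2|\eta_\circ|_2 + |\eta_+ - \eta_\circ|_2\right) \leq |\eta_*|_1\, |\eta_\circ|_2^2\cdot(2 + \epsilon_0)\,\epsilon_0. \end{aligned} \]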

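        Now, (10.18) equals

        \[ \begin{aligned} &x^3\int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N) \sum_{\substack{\frac{q}{(q,2)} \leq \min\left(\frac{\delta_0 r}{2|\alpha|x},\, r\right) \\ \mu(q)^2 = 1}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu((q,N))\, d\alpha \\ &= x^3\int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha\cdot\sum_{q \geq 1} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) \\ &\quad - x^3\int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N) \sum_{\substack{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha|x},\, r\right) \\ \mu(q)^2 = 1}} \frac{\phi((q,N))}{\phi(q)^3}\,\mu((q,N))\, d\alpha. \end{aligned} \tag{10.20} \]

        The last line in (10.20) is bounded¹ by

        \[ x^2\, |\widehat{\eta_*}|_\infty \int_{-\infty}^{\infty} |\widehat{\eta_\circ}(-\alpha)|^2 \sum_{\frac{q}{(q,2)} > \min\left(\frac{\delta_0 r}{2|\alpha|},\, r\right)} \frac{\mu(q)^2}{\phi(q)^2}\, d\alpha. \tag{10.21} \]

        By (2.1) (with $k = 3$), (C.16) and (C.17), this is at most

        \[ \begin{aligned} &x^2\, |\eta_*|_1 \int_{-\delta_0/2}^{\delta_0/2} |\widehat{\eta_\circ}(-\alpha)|^2\,\frac{4.31004}{r}\, d\alpha + 2x^2\, |\eta_*|_1 \int_{\delta_0/2}^{\infty} \left(\frac{|\eta_\circ^{(3)}|_1}{(2\pi\alpha)^3}\right)^2 \frac{8.62008\,|\alpha|}{\delta_0 r}\, d\alpha \\ &\quad\leq |\eta_*|_1\left(4.31004\,|\eta_\circ|_2^2 + 0.00113\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}\right)\frac{x^2}{r}. \end{aligned} \]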
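The constant 0.00113 in the last bound comes from integrating $\alpha^{-5}$ from $\delta_0/2$ to infinity: the second integral equals $2\cdot 8.62008\,(2\pi)^{-6}\,(\delta_0 r)^{-1}\cdot\frac{1}{4}(\delta_0/2)^{-4}$ times $|\eta_\circ^{(3)}|_1^2$. The sketch below (function name chosen here for illustration) evaluates the resulting coefficient of $|\eta_\circ^{(3)}|_1^2/(\delta_0^5 r)$ and confirms that it does not depend on $\delta_0$.

```python
import math

def tail_coefficient(delta0):
    """Coefficient c such that the second integral is c * |eta^(3)|_1^2 / (delta0^5 r)."""
    # integral of alpha^(-5) from delta0/2 to infinity, in closed form
    integral = (delta0 / 2) ** (-4) / 4
    per_r = 2 * 8.62008 / ((2 * math.pi) ** 6 * delta0) * integral
    return per_r * delta0**5

c8 = tail_coefficient(8.0)
c1 = tail_coefficient(1.0)
print(c8, c1)   # to be compared with 0.00113
```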

        It is easy to see that

        \[ \sum_{q \geq 1} \frac{\phi((q,N))}{\phi(q)^3}\,\mu(q)^2\mu((q,N)) = \prod_{p | N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p \nmid N}\left(1 + \frac{1}{(p-1)^3}\right). \]

        ¹ This is obviously crude, in that we are bounding $\phi((q,N))/\phi(q)$ by 1. We are doing so in order to avoid a potentially harmful dependence on $N$.

        Expanding the integral implicit in the definition of $\widehat{f}$,

        \[ \int_{-\infty}^{\infty} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\, e(-\alpha N)\, d\alpha = \frac{1}{x}\int_0^{\infty}\int_0^{\infty} \eta_\circ(t_1)\,\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x} - (t_1 + t_2)\right) dt_1\, dt_2. \tag{10.22} \]

        (This is standard. One rigorous way to obtain (10.22) is to approximate the integral over $\alpha \in (-\infty, \infty)$ by an integral with a smooth weight, at different scales; as the scale becomes broader, the Fourier transform of the weight approximates (as a distribution) the $\delta$ function. Apply Plancherel.)

        Hence, (10.17) equals

        \[ x^2\cdot\int_0^{\infty}\int_0^{\infty} \eta_\circ(t_1)\,\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x} - (t_1 + t_2)\right) dt_1\, dt_2\cdot\prod_{p | N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p \nmid N}\left(1 + \frac{1}{(p-1)^3}\right) \tag{10.23} \]

        (the main term) plus

        \[ \left(2.82643\,|\eta_\circ|_2^2\,(2 + \epsilon_0)\cdot\epsilon_0 + \frac{4.31004\,|\eta_\circ|_2^2 + 0.00113\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1\, x^2. \tag{10.24} \]

        Here (10.23) is just as in the classical case [IK04, (19.10)], except for the fact that a factor of $1/2$ has been replaced by a double integral. Later, in chapter 11, we will see how to choose our smoothing functions (and $x$, in terms of $N$) so as to make the double integral as large as possible in comparison with the error terms. This is an important optimization. (We already had a first discussion of this in the introduction; see (1.39) and what follows.)
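The Euler product in (10.23) arises because the summand $\phi((q,N))\mu(q)^2\mu((q,N))/\phi(q)^3$ is multiplicative in $q$ and supported on square-free $q$: its value at a prime $p$ is $-1/(p-1)^2$ if $p | N$ and $1/(p-1)^3$ otherwise. Restricting to square-free $q$ built from a finite set of primes, the sum and the product agree exactly, which the following sketch checks with rational arithmetic. The function names and the sample values of $N$ are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def local_factor(p, N):
    """Euler factor at p: 1 - 1/(p-1)^2 if p | N, else 1 + 1/(p-1)^3."""
    if N % p == 0:
        return 1 - Fraction(1, (p - 1) ** 2)
    return 1 + Fraction(1, (p - 1) ** 3)

def truncated_sum(primes, N):
    """Sum of phi((q,N)) mu((q,N)) / phi(q)^3 over square-free q built from `primes`."""
    total = Fraction(0)
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):      # q = product of the subset
            phi_q = prod(p - 1 for p in subset)
            common = [p for p in subset if N % p == 0]   # primes dividing (q, N)
            phi_g = prod(p - 1 for p in common)
            mu_g = (-1) ** len(common)
            total += Fraction(phi_g * mu_g, phi_q ** 3)
    return total

primes = [2, 3, 5, 7, 11, 13]
for N in (21, 105, 1001, 30):
    print(N, truncated_sum(primes, N), prod(local_factor(p, N) for p in primes))
```

For even $N$ the factor at $p = 2$ vanishes, so the whole product is zero, in line with the fact that the main term only makes sense for odd $N$.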

        What remains to estimate is the contribution of all the terms of the form $\mathrm{err}_{\eta,\chi}(\delta, x)$ in (10.12) and (10.13). Let us first deal with another matter: bounding the $\ell_2$ norm of $|S_\eta(\alpha, x)|^2$ over the major arcs.

        10.3 The $\ell_2$ norm over the major arcs

        We can always bound the integral of $|S_\eta(\alpha, x)|^2$ on the whole circle by Plancherel. If we only want the integral on certain arcs, we use the bound in Prop. 12.1.2 (based on work by Ramaré). If these arcs are really the major arcs (that is, the arcs on which we have useful analytic estimates), then we can hope to get better bounds using $L$-functions. This will be useful both to estimate the error terms in this section and to make the use of Ramaré's bounds more efficient later.


        By (108)

        suma mod q

        gcd(aq)=1

        ∣∣∣∣Sη (aq +δ

        x χ

        )∣∣∣∣2

        =1

        φ(q)2

        sumχ

        sumχprime

        τ(χ)τ(χprime)

        suma mod q

        gcd(aq)=1

        χ(a)χprime(a)

        middot Sηχlowast(δx x)Sηχprimelowast(δx x)

        +Olowast(

        2(1 +radicq)(log x)2|η|infinmax

        α|Sη(α x)|+

        ((1 +

        radicq)(log x)2|η|infin

        )2)=

        1

        φ(q)

        sumχ

        |τ(χ)|2|Sηχlowast(δx x)|2 +Kq1(2|Sη(0 x)|+Kq1)

        where

        Kq1 = (1 +radicq)(log x)2|η|infin

        As is well-known (see, e.g., [IK04, Lem. 3.1]),

        \[ \tau(\chi) = \mu\left(\frac{q}{q^*}\right)\chi^*\left(\frac{q}{q^*}\right)\tau(\chi^*), \]

        where $q^*$ is the modulus of $\chi^*$ (i.e., the conductor of $\chi$), and

        \[ |\tau(\chi^*)| = \sqrt{q^*}. \]

        Using the expressions (1012) and (1013) we obtain

        suma mod q

        (aq)=1

        ∣∣∣∣Sη (aq +δ

        x x

        )∣∣∣∣2 =micro2(q)

        φ(q)|η(minusδ)x+Olowast (errηχT (δ x) middot x)|2

        +1

        φ(q)

        sumχ 6=χT

        micro2

        (q

        qlowast

        )qlowast middotOlowast

        (| errηχ(δ x)|2x2

        )+Kq1(2|Sη(0 x)|+Kq1)

        =micro2(q)x2

        φ(q)

        (|η(minusδ)|2 +Olowast (|errηχT (δ x)(2|η|1 + errηχT (δ x))|)

        )+Olowast

        (maxχ6=χT

        qlowast| errηχlowast(δ x)|2x2 +Kq2x

        )

        where Kq2 = Kq1(2|Sη(0 x)|x+Kq1x)


        Thus the integral of |Sη(α x)|2 over M (see (105)) is

        sumqlerq odd

        suma mod q

        (aq)=1

        int aq+

        δ0r2qx

        aqminus

        δ0r2qx

        |Sη(α x)|2 dα+sumqle2rq even

        suma mod q

        (aq)=1

        int aq+

        δ0rqx

        aqminus

        δ0rqx

        |Sη(α x)|2 dα

        =sumqlerq odd

        micro2(q)x2

        φ(q)

        int δ0r2qx

        minus δ0r2qx

        |η(minusαx)|2 dα+sumqle2rq even

        micro2(q)x2

        φ(q)

        int δ0rqx

        minus δ0rqx|η(minusαx)|2 dα

        +Olowast

        (sumq

        micro2(q)x2

        φ(q)middot gcd(q 2)δ0r

        qx

        (ET

        ηδ0r2

        (2|η|1 + ETηδ0r2

        )))

        +sumqlerq odd

        δ0rx

        qmiddotOlowast

        maxχ mod q

        χ 6=χT|δ|leδ0r2q

        qlowast| errηχlowast(δ x)|2 +Kq2

        x

        +sumqle2rq even

        2δ0rx

        qmiddotOlowast

        maxχ mod q

        χ 6=χT|δ|leδ0rq

        qlowast| errηχlowast(δ x)|2 +Kq2

        x

        (1025)where

        ETηs = max|δ|les

        | errηχT (δ x)|

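        and $\chi_T$ is the trivial character. If all we want is an upper bound, we can simply remark that
        \[
        \begin{aligned}
        &x \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\hat\eta(-\alpha x)|^2\, d\alpha
        + x \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\hat\eta(-\alpha x)|^2\, d\alpha \\
        &\le \left(\sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} + \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)}\right) |\hat\eta|_2^2
        = 2|\eta|_2^2 \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}.
        \end{aligned}
        \]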

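        If we also need a lower bound, we proceed as follows.

        Again, we will work with an approximation $\eta_\circ$ such that (a) $|\eta - \eta_\circ|_2$ is small, (b) $\eta_\circ$ is thrice differentiable outside finitely many points, (c) $\eta_\circ^{(3)} \in L_1$. Clearly,
        \[
        \begin{aligned}
        x \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\hat\eta(-\alpha x)|^2\, d\alpha
        &\le \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \left(\int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\hat\eta_\circ(-\alpha)|^2\, d\alpha + 2\langle |\hat\eta_\circ|, |\hat\eta - \hat\eta_\circ|\rangle + |\hat\eta - \hat\eta_\circ|_2^2\right) \\
        &= \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\hat\eta_\circ(-\alpha)|^2\, d\alpha
        + O^*\left(\frac{1}{2}\log r + 0.85\right)\left(2|\eta_\circ|_2 |\eta - \eta_\circ|_2 + |\eta_\circ - \eta|_2^2\right),
        \end{aligned}
        \]
        where we are using (C.11) and isometry. Also,
        \[
        \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\hat\eta(-\alpha x)|^2\, d\alpha
        = \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\hat\eta(-\alpha x)|^2\, d\alpha.
        \]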

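        By (2.1) and Plancherel,
        \[
        \begin{aligned}
        \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\hat\eta_\circ(-\alpha)|^2\, d\alpha
        &= \int_{-\infty}^{\infty} |\hat\eta_\circ(-\alpha)|^2\, d\alpha - O^*\left(2\int_{\frac{\delta_0 r}{2q}}^{\infty} \frac{|\eta_\circ^{(3)}|_1^2}{(2\pi\alpha)^6}\, d\alpha\right) \\
        &= |\eta_\circ|_2^2 + O^*\left(\frac{|\eta_\circ^{(3)}|_1^2 q^5}{5\pi^6(\delta_0 r)^5}\right).
        \end{aligned}
        \]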
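The constant in the last error term comes from an elementary tail integral. As a worked step, write $A = \delta_0 r/(2q)$ (a shorthand introduced only for this illustration), and assume that (2.1) is used in the form of the decay bound $|\hat\eta_\circ(\alpha)| \le |\eta_\circ^{(3)}|_1/(2\pi|\alpha|)^3$:

```latex
\[
2\int_A^\infty \frac{d\alpha}{(2\pi\alpha)^6}
= \frac{2}{(2\pi)^6}\cdot\frac{1}{5A^5}
= \frac{1}{160\,\pi^6 A^5},
\qquad
A = \frac{\delta_0 r}{2q}
\;\Longrightarrow\;
\frac{1}{160\,\pi^6 A^5} = \frac{32\,q^5}{160\,\pi^6(\delta_0 r)^5} = \frac{q^5}{5\pi^6(\delta_0 r)^5}.
\]
```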

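        Hence
        \[
        \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \int_{-\frac{\delta_0 r}{2q}}^{\frac{\delta_0 r}{2q}} |\hat\eta_\circ(-\alpha)|^2\, d\alpha
        = |\eta_\circ|_2^2 \cdot \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}
        + O^*\left(\sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \cdot \frac{|\eta_\circ^{(3)}|_1^2 q^5}{5\pi^6(\delta_0 r)^5}\right).
        \]
        Using (C.18), we get that
        \[
        \begin{aligned}
        \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)} \cdot \frac{|\eta_\circ^{(3)}|_1^2 q^5}{5\pi^6(\delta_0 r)^5}
        &\le \frac{1}{r} \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q) q}{\phi(q)} \cdot \frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5} \\
        &\le \frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5} \cdot \left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right).
        \end{aligned}
        \]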

        Going back to (10.25), we use (C.7) to bound
        \[
        \sum_q \frac{\mu^2(q)x^2}{\phi(q)} \cdot \frac{\gcd(q,2)\delta_0 r}{qx} \le 2.59147 \cdot \delta_0 r x.
        \]
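As a sanity check on the constant 2.59147, the sum of $\mu^2(q)\gcd(q,2)/(q\phi(q))$ over all $q$ factors as an Euler product with local factor $1 + 1$ at $p = 2$ and $1 + 1/(p(p-1))$ at odd $p$. The following is a minimal numeric sketch, not the argument of (C.7) itself; the helper names and the cutoff $10^5$ are choices made only for this illustration.

```python
from math import isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            # Cross out multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def gcd_weighted_product(limit=10**5):
    """Partial Euler product of sum_q mu^2(q) gcd(q,2) / (q phi(q))."""
    prod = 1.0
    for p in primes_up_to(limit):
        weight = 2.0 if p == 2 else 1.0  # gcd(p, 2)
        prod *= 1.0 + weight / (p * (p - 1))
    return prod


gcd_product_value = gcd_weighted_product()
print(gcd_product_value)
```

Every local factor exceeds 1, so the partial product is a lower bound for the full sum; it stays just below 2.59147.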

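        We also note that
        \[
        \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{1}{q} + \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{2}{q}
        = \sum_{q\le r} \frac{1}{q} - \sum_{q\le \frac{r}{2}} \frac{1}{2q} + \sum_{q\le r} \frac{1}{q}
        \le 2\log er - \log\frac{r}{2} \le \log 2e^2 r.
        \]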

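        We have proven the following result.

        Lemma 10.3.1. Let $\eta : [0,\infty) \to \mathbb{R}$ be in $L_1 \cap L_\infty$. Let $S_\eta(\alpha,x)$ be as in (10.1) and let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5). Let $\eta_\circ : [0,\infty) \to \mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$. Assume $r \ge 182$. Then
        \[
        \int_{\mathfrak{M}} |S_\eta(\alpha,x)|^2\, d\alpha = L_{r,\delta_0} x
        + O^*\left(5.19\,\delta_0 x r \left(ET_{\eta,\frac{\delta_0 r}{2}} \cdot \left(|\eta|_1 + \frac{ET_{\eta,\frac{\delta_0 r}{2}}}{2}\right)\right)\right)
        + O^*\left(\delta_0 r (\log 2e^2 r)\left(x \cdot E_{\eta,r,\delta_0}^2 + K_{r,2}\right)\right),
        \tag{10.26}
        \]
        where
        \[
        E_{\eta,r,\delta_0} = \max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta,\chi^*}(\delta,x)|,
        \qquad
        ET_{\eta,s} = \max_{|\delta|\le s} |\mathrm{err}_{\eta,\chi_T}(\delta,x)|,
        \]
        \[
        K_{r,2} = (1+\sqrt{2r})(\log x)^2 |\eta|_\infty \left(2|S_\eta(0,x)|/x + (1+\sqrt{2r})(\log x)^2 |\eta|_\infty/x\right),
        \tag{10.27}
        \]
        and $L_{r,\delta_0}$ satisfies both
        \[
        L_{r,\delta_0} \le 2|\eta|_2^2 \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}
        \tag{10.28}
        \]
        and
        \[
        \begin{aligned}
        L_{r,\delta_0} &= 2|\eta_\circ|_2^2 \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}
        + O^*(\log r + 1.7)\cdot\left(2|\eta_\circ|_2 |\eta - \eta_\circ|_2 + |\eta_\circ - \eta|_2^2\right) \\
        &\quad + O^*\left(\frac{2|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\right)\cdot\left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right).
        \end{aligned}
        \tag{10.29}
        \]
        Here, as elsewhere, $\chi^*$ denotes the primitive character inducing $\chi$, whereas $q^*$ denotes the modulus of $\chi^*$.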

        The error term $xr\,ET_{\eta,\delta_0 r}$ will be very small, since it will be estimated using the Riemann zeta function; the error term involving $K_{r,2}$ will be completely negligible. As for the term involving $xr(r+1)E_{\eta,r,\delta_0}^2$, we see that it constrains us to have $|\mathrm{err}_{\eta,\chi}(x,N)|$ less than a constant times $1/r$ if we do not want the main term in the bound (10.26) to be overwhelmed.

        10.4 The integral over the major arcs: conclusion

        There are at least two ways we can evaluate (10.4). One is to substitute (10.10) into (10.4). The disadvantages here are that (a) this can give rise to pages-long formulae, (b) this gives error terms proportional to $xr|\mathrm{err}_{\eta,\chi}(x,N)|$, meaning that, to win, we would have to show that $|\mathrm{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/r$. What we will do instead is to use our $\ell_2$ estimate (10.26) in order to bound the contribution of non-principal terms. This will give us a gain of almost $\sqrt{r}$ on the error terms; in other words, to win, it will be enough to show later that $|\mathrm{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/\sqrt{r}$.

        The contribution of the error terms in $S_{\eta_3}(\alpha,x)$ (that is, all terms involving the quantities $\mathrm{err}_{\eta,\chi}$ in expressions (10.12) and (10.13)) to (10.4) is

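        \[
        \begin{aligned}
        &\sum_{\substack{q\le r\\ q \text{ odd}}} \frac{1}{\phi(q)} \sum_{\chi_3 \bmod q} \tau(\chi_3) \sum_{\substack{a \bmod q\\ (a,q)=1}} \chi_3(a) e(-Na/q) \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} S_{\eta_+}(\alpha + a/q, x)^2\, \mathrm{err}_{\eta_*,\chi_3^*}(\alpha x, x)\, e(-N\alpha)\, d\alpha \\
        &+ \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{1}{\phi(q)} \sum_{\chi_3 \bmod q} \tau(\chi_3) \sum_{\substack{a \bmod q\\ (a,q)=1}} \chi_3(a) e(-Na/q) \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} S_{\eta_+}(\alpha + a/q, x)^2\, \mathrm{err}_{\eta_*,\chi_3^*}(\alpha x, x)\, e(-N\alpha)\, d\alpha.
        \end{aligned}
        \tag{10.30}
        \]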

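        We should also remember the terms in (10.11); we can integrate them over all of $\mathbb{R}/\mathbb{Z}$, and obtain that they contribute at most
        \[
        \begin{aligned}
        &\int_{\mathbb{R}/\mathbb{Z}} 2\sum_{j=1}^3 \prod_{j'\ne j} |S_{\eta_{j'}}(\alpha,x)| \cdot \max_{q\le r} \sum_{p|q} \log p \sum_{\alpha\ge 1} \eta_j\left(\frac{p^\alpha}{x}\right) d\alpha \\
        &\le 2\sum_{j=1}^3 \prod_{j'\ne j} |S_{\eta_{j'}}(\alpha,x)|_2 \cdot \max_{q\le r} \sum_{p|q} \log p \sum_{\alpha\ge 1} \eta_j\left(\frac{p^\alpha}{x}\right) \\
        &= 2\sum_n \Lambda^2(n)\eta_+^2(n/x) \cdot \log r \cdot \max_{p\le r} \sum_{\alpha\ge 1} \eta_*\left(\frac{p^\alpha}{x}\right) \\
        &\quad + 4\sqrt{\sum_n \Lambda^2(n)\eta_+^2(n/x) \cdot \sum_n \Lambda^2(n)\eta_*^2(n/x)} \cdot \log r \cdot \max_{p\le r} \sum_{\alpha\ge 1} \eta_+\left(\frac{p^\alpha}{x}\right)
        \end{aligned}
        \]
        by Cauchy-Schwarz and Plancherel. (In the last line, the smoothing inside the sum over prime powers is $\eta_+$, since that term collects the cases $\eta_j = \eta_+$; this matches the factor $LS_{\eta_+}(x,r)$ in (10.36).)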

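        The absolute value of (10.30) is at most
        \[
        \begin{aligned}
        &\sum_{\substack{q\le r\\ q \text{ odd}}} \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} \left|S_{\eta_+}(\alpha + a/q, x)\right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q\\ |\delta|\le \delta_0 r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)| \\
        &+ \sum_{\substack{q\le 2r\\ q \text{ even}}} \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} \left|S_{\eta_+}(\alpha + a/q, x)\right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q\\ |\delta|\le \delta_0 r/q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)| \\
        &\le \int_{\mathfrak{M}_{\delta_0,r}} \left|S_{\eta_+}(\alpha)\right|^2 d\alpha \cdot \max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le \gcd(q,2)\delta_0 r/q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)|.
        \end{aligned}
        \tag{10.31}
        \]
        We can bound the integral of $|S_{\eta_+}(\alpha)|^2$ by (10.26).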

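        What about the contribution of the error part of $S_{\eta_2}(\alpha,x)$? We can obviously proceed in the same way, except that, to avoid double-counting, $S_{\eta_3}(\alpha,x)$ needs to be replaced by
        \[
        \frac{1}{\phi(q)}\tau(\chi_0)\hat\eta_3(-\delta)\cdot x = \frac{\mu(q)}{\phi(q)}\hat\eta_3(-\delta)\cdot x,
        \tag{10.32}
        \]
        which is its main term (coming from (10.12)). Instead of having an $\ell_2$ norm as in (10.31), we have the square-root of a product of two squares of $\ell_2$ norms (by Cauchy-Schwarz), namely, $\int_{\mathfrak{M}} |S^*_{\eta_+}(\alpha)|^2\, d\alpha$ and
        \[
        \begin{aligned}
        &\sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{2qx}}^{\frac{\delta_0 r}{2qx}} |\hat\eta_*(-\alpha x)x|^2\, d\alpha
        + \sum_{\substack{q\le 2r\\ q \text{ even}}} \frac{\mu^2(q)}{\phi(q)^2} \int_{-\frac{\delta_0 r}{qx}}^{\frac{\delta_0 r}{qx}} |\hat\eta_*(-\alpha x)x|^2\, d\alpha \\
        &\le x|\hat\eta_*|_2^2 \cdot \sum_q \frac{\mu^2(q)}{\phi(q)^2}.
        \end{aligned}
        \tag{10.33}
        \]
        By (C.9), the sum over $q$ is at most 2.82643.

        As for the contribution of the error part of $S_{\eta_1}(\alpha,x)$, we bound it in the same way, using solely the $\ell_2$ norm in (10.33) (and replacing both $S_{\eta_2}(\alpha,x)$ and $S_{\eta_3}(\alpha,x)$ by expressions as in (10.32)).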
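The constant 2.82643 can be compared with the Euler product $\prod_p (1 + 1/(p-1)^2)$, which equals the sum of $\mu^2(q)/\phi(q)^2$ over all $q$. The sketch below is only a numeric illustration under that identification (the function names and the cutoff are assumptions of this example); it also checks that $1.6812$, the constant appearing later in (10.36), is the square root of $2.82643$ to the displayed precision.

```python
from math import isqrt, sqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def phi_square_product(limit=10**5):
    """Partial Euler product of sum_q mu^2(q) / phi(q)^2."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 + 1.0 / (p - 1) ** 2
    return prod


phi_square_value = phi_square_product()
root_of_bound = sqrt(2.82643)
print(phi_square_value, root_of_bound)
```

Since each factor is larger than 1, the truncated product is a lower bound for the full sum, so it should sit slightly below 2.82643.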

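        The total of the error terms is thus
        \[
        \begin{aligned}
        &x \cdot \max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le \gcd(q,2)\delta_0 r/q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)| \cdot A \\
        &+ x \cdot \max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le \gcd(q,2)\delta_0 r/q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta_+,\chi^*}(\delta,x)| \left(\sqrt{A} + \sqrt{B_+}\right)\sqrt{B_*},
        \end{aligned}
        \tag{10.34}
        \]
        where $A = (1/x)\int_{\mathfrak{M}} |S_{\eta_+}(\alpha,x)|^2\, d\alpha$ (bounded as in (10.26)) and
        \[
        B_* = 2.82643|\eta_*|_2^2, \qquad B_+ = 2.82643|\eta_+|_2^2.
        \tag{10.35}
        \]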

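        In conclusion, we have proven:

        Proposition 10.4.1. Let $x \ge 1$. Let $\eta_+, \eta_* : [0,\infty) \to \mathbb{R}$. Assume $\eta_+ \in C^2$, $\eta_+'' \in L_2$ and $\eta_+, \eta_* \in L_1 \cap L_2$. Let $\eta_\circ : [0,\infty) \to \mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$ and $|\eta_+ - \eta_\circ|_2 \le \epsilon_0|\eta_\circ|_2$, where $\epsilon_0 \ge 0$.

        Let $S_\eta(\alpha,x) = \sum_n \Lambda(n) e(\alpha n)\eta(n/x)$. Let $\mathrm{err}_{\eta,\chi}$, $\chi$ primitive, be given as in (10.12) and (10.13). Let $\delta_0 > 0$, $r \ge 1$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5).

        Then, for any $N \ge 0$,
        \[
        \int_{\mathfrak{M}} S_{\eta_+}(\alpha,x)^2 S_{\eta_*}(\alpha,x) e(-N\alpha)\, d\alpha
        \]
        equals
        \[
        \begin{aligned}
        &C_0 C_{\eta_\circ,\eta_*} x^2 + \left(2.82643|\eta_\circ|_2^2 (2+\epsilon_0)\cdot\epsilon_0 + \frac{4.31004|\eta_\circ|_2^2 + 0.0012\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1 x^2 \\
        &+ O^*\left(E_{\eta_*,r,\delta_0} A_{\eta_+} + E_{\eta_+,r,\delta_0}\cdot 1.6812\left(\sqrt{A_{\eta_+}} + 1.6812|\eta_+|_2\right)|\eta_*|_2\right)\cdot x^2 \\
        &+ O^*\left(2Z_{\eta_+^2,2}(x) LS_{\eta_*}(x,r)\cdot x + 4\sqrt{Z_{\eta_+^2,2}(x) Z_{\eta_*^2,2}(x)}\, LS_{\eta_+}(x,r)\cdot x\right),
        \end{aligned}
        \tag{10.36}
        \]
        where
        \[
        \begin{aligned}
        C_0 &= \prod_{p|N}\left(1 - \frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1 + \frac{1}{(p-1)^3}\right), \\
        C_{\eta_\circ,\eta_*} &= \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\eta_*\left(\frac{N}{x} - (t_1+t_2)\right) dt_1\, dt_2,
        \end{aligned}
        \tag{10.37}
        \]
        \[
        \begin{aligned}
        E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q\le \gcd(q,2)\cdot r\\ |\delta|\le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta,\chi^*}(\delta,x)|,
        \qquad ET_{\eta,s} = \max_{|\delta|\le s/q} |\mathrm{err}_{\eta,\chi_T}(\delta,x)|, \\
        A_\eta &= \frac{1}{x}\int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha,x)\right|^2 d\alpha,
        \qquad L_{\eta,r,\delta_0} \le 2|\eta|_2^2 \sum_{\substack{q\le r\\ q \text{ odd}}} \frac{\mu^2(q)}{\phi(q)}, \\
        K_{r,2} &= (1+\sqrt{2r})(\log x)^2|\eta|_\infty\left(2Z_{\eta,1}(x)/x + (1+\sqrt{2r})(\log x)^2|\eta|_\infty/x\right), \\
        Z_{\eta,k}(x) &= \frac{1}{x}\sum_n \Lambda^k(n)\eta(n/x),
        \qquad LS_\eta(x,r) = \log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right),
        \end{aligned}
        \tag{10.38}
        \]
        and $\mathrm{err}_{\eta,\chi}$ is as in (10.12) and (10.13).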

        Here is how to read these expressions. The error term in the first line of (10.36) will be small provided that $\epsilon_0$ is small and $r$ is large. The third line of (10.36) will be negligible, as will be the term $2\delta_0 r(\log er)K_{r,2}$ in the definition of $A_\eta$. (Clearly, $Z_{\eta,k}(x) \ll_\eta (\log x)^{k-1}$ and $LS_\eta(x,q) \ll_\eta \tau(q)\log x$ for any $\eta$ of rapid decay.)

        It remains to estimate the second line of (10.36). This includes estimating $A_\eta$, a task that was already accomplished in Lemma 10.3.1. We see that we will have to give very good bounds for $E_{\eta,r,\delta_0}$ when $\eta = \eta_+$ or $\eta = \eta_*$. We also see that we want to make $C_0 C_{\eta_+,\eta_*} x^2$ as large as possible; it will be competing not just with the error terms here, but, more importantly, with the bounds from the minor arcs, which will be proportional to $|\eta_+|_2^2|\eta_*|_1$.

        Chapter 11

        Optimizing and adapting smoothing functions

        One of our goals is to maximize the quantity $C_{\eta_\circ,\eta_*}$ in (10.37) relative to $|\eta_\circ|_2^2|\eta_*|_1$. One way to do this is to ensure that (a) $\eta_*$ is concentrated on a very short$^1$ interval $[0,\epsilon)$, (b) $\eta_\circ$ is supported on the interval $[0,2]$, and is symmetric around $t = 1$, meaning that $\eta_\circ(t) \sim \eta_\circ(2-t)$. Then, for $x \sim N/2$, the integral
        \[
        \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\eta_*\left(\frac{N}{x} - (t_1+t_2)\right) dt_1\, dt_2
        \]
        in (10.37) should be approximately equal to
        \[
        |\eta_*|_1\cdot\int_0^\infty \eta_\circ(t)\eta_\circ\left(\frac{N}{x} - t\right) dt = |\eta_*|_1\cdot\int_0^\infty \eta_\circ(t)^2\, dt = |\eta_*|_1\cdot|\eta_\circ|_2^2,
        \tag{11.1}
        \]
        provided that $\eta_\circ(t) \ge 0$ for all $t$. It is easy to check (using Cauchy-Schwarz in the second step) that this is essentially optimal. (We will redo this rigorously in a little while.)

        At the same time, the fact is that major-arc estimates are best for smoothing functions $\eta$ of a particular form, and we have minor-arc estimates from Part I for a different specific smoothing $\eta_2$. The issue, then, is how do we choose $\eta_\circ$ and $\eta_*$ as above so that

        • $\eta_*$ is concentrated on $[0,\epsilon)$,

        • $\eta_\circ$ is supported on $[0,2]$ and symmetric around $t = 1$,

        • we can give minor-arc and major-arc estimates for $\eta_*$,

        • we can give major-arc estimates for a function $\eta_+$ close to $\eta_\circ$ in $\ell_2$ norm?

        $^1$ This is an idea appearing in work by Bourgain in a related context [Bou99].

        11.1 The symmetric smoothing function $\eta_\circ$

        We will later work with a smoothing function $\eta_\heartsuit$ whose Mellin transform decreases very rapidly. Because of this rapid decay, we will be able to give strong results based on an explicit formula for $\eta_\heartsuit$. The issue is how to define $\eta_\circ$, given $\eta_\heartsuit$, so that $\eta_\circ$ is symmetric around $t = 1$ (i.e., $\eta_\circ(2-x) \sim \eta_\circ(x)$) and is very small for $x > 2$.

        We will later set $\eta_\heartsuit(t) = e^{-t^2/2}$. Let
        \[
        h : t \mapsto \begin{cases} t^3(2-t)^3 e^{t-1/2} & \text{if } t\in[0,2], \\ 0 & \text{otherwise.} \end{cases}
        \tag{11.2}
        \]
        We define $\eta_\circ : \mathbb{R} \to \mathbb{R}$ by
        \[
        \eta_\circ(t) = h(t)\eta_\heartsuit(t) = \begin{cases} t^3(2-t)^3 e^{-(t-1)^2/2} & \text{if } t\in[0,2], \\ 0 & \text{otherwise.} \end{cases}
        \tag{11.3}
        \]
        It is clear that $\eta_\circ$ is symmetric around $t = 1$ for $t \in [0,2]$.
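The two claims in this definition, namely that $h\cdot\eta_\heartsuit$ collapses to the closed form with exponent $-(t-1)^2/2$ and that the result is symmetric under $t \mapsto 2 - t$, can be checked numerically. A minimal sketch follows; the function names are chosen here for illustration only.

```python
import math


def eta_heart(t):
    """The Gaussian smoothing exp(-t^2/2)."""
    return math.exp(-t * t / 2.0)


def h(t):
    """The weight t^3 (2-t)^3 exp(t - 1/2) on [0, 2], zero elsewhere."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t) ** 3 * math.exp(t - 0.5)
    return 0.0


def eta_circ(t):
    """The product h(t) * eta_heart(t)."""
    return h(t) * eta_heart(t)


def eta_circ_closed(t):
    """Closed form t^3 (2-t)^3 exp(-(t-1)^2/2) on [0, 2], zero elsewhere."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t) ** 3 * math.exp(-((t - 1.0) ** 2) / 2.0)
    return 0.0


grid = [k / 50 for k in range(0, 101)]  # sample points in [0, 2]
for t in grid:
    # exp(t - 1/2) * exp(-t^2/2) = exp(-(t-1)^2/2)
    assert math.isclose(eta_circ(t), eta_circ_closed(t), rel_tol=1e-9, abs_tol=1e-12)
    # symmetry around t = 1
    assert math.isclose(eta_circ(t), eta_circ(2.0 - t), rel_tol=1e-9, abs_tol=1e-12)
```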

        11.1.1 The product $\eta_\circ(t)\eta_\circ(\rho - t)$

        We now should go back and redo rigorously what we discussed informally around (11.1). More precisely, we wish to estimate
        \[
        (\eta_\circ * \eta_\circ)(\rho) = \int_{-\infty}^{\infty} \eta_\circ(t)\eta_\circ(\rho - t)\, dt = \int_{-\infty}^{\infty} \eta_\circ(t)\eta_\circ(2 - \rho + t)\, dt
        \tag{11.4}
        \]
        for $\rho \le 2$ close to 2. In this, it will be useful that the Cauchy-Schwarz inequality degrades slowly, in the following sense.

        Lemma 11.1.1. Let $V$ be a real vector space with an inner product $\langle\cdot,\cdot\rangle$. Then, for any $v, w \in V$ with $|w - v|_2 \le |v|_2/2$,
        \[
        \langle v, w\rangle = |v|_2|w|_2 + O^*(2.71|v - w|_2^2).
        \]

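        Proof. By a truncated Taylor expansion,
        \[
        \sqrt{1+x} = 1 + \frac{x}{2} + \frac{x^2}{2}\max_{0\le t\le 1}\frac{1}{4(1-(tx)^2)^{3/2}} = 1 + \frac{x}{2} + O^*\left(\frac{x^2}{2^{3/2}}\right)
        \]
        for $|x| \le 1/2$. Hence, for $\delta = |w - v|_2/|v|_2$,
        \[
        \begin{aligned}
        \frac{|w|_2}{|v|_2} &= \sqrt{1 + \frac{2\langle w - v, v\rangle + |w - v|_2^2}{|v|_2^2}}
        = 1 + \frac{2\frac{\langle w - v, v\rangle}{|v|_2^2} + \delta^2}{2} + O^*\left(\frac{(2\delta + \delta^2)^2}{2^{3/2}}\right) \\
        &= 1 + \frac{\langle w - v, v\rangle}{|v|_2^2} + O^*\left(\left(\frac{1}{2} + \frac{(5/2)^2}{2^{3/2}}\right)\delta^2\right)
        = 1 + \frac{\langle w - v, v\rangle}{|v|_2^2} + O^*\left(2.71\frac{|w - v|_2^2}{|v|_2^2}\right).
        \end{aligned}
        \]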

        Multiplying by $|v|_2^2$, we obtain that
        \[
        |v|_2|w|_2 = |v|_2^2 + \langle w - v, v\rangle + O^*\left(2.71|w - v|_2^2\right) = \langle v, w\rangle + O^*\left(2.71|w - v|_2^2\right).
        \]
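The inequality in Lemma 11.1.1 is easy to test on random data. The sketch below samples pairs $v, w \in \mathbb{R}^5$ with $|w - v|_2 \le |v|_2/2$ and compares $|\langle v, w\rangle - |v|_2|w|_2|$ with $2.71|v - w|_2^2$. The dimension, the sample size and the helper names are assumptions of this example, and a passing run is of course no substitute for the proof.

```python
import math
import random


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def inner(v, w):
    """Standard inner product."""
    return sum(a * b for a, b in zip(v, w))


def lemma_sides(v, w):
    """Return (|<v,w> - |v||w||, 2.71 * |v - w|^2)."""
    diff = [a - b for a, b in zip(v, w)]
    return abs(inner(v, w) - norm(v) * norm(w)), 2.71 * norm(diff) ** 2


def random_pair(dim, rng):
    """Random v and a perturbation w with 0.01|v| <= |w - v| <= |v|/2."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    scale = rng.uniform(0.01, 0.5) * norm(v) / norm(d)
    w = [a + scale * b for a, b in zip(v, d)]
    return v, w


rng = random.Random(0)
results = [lemma_sides(*random_pair(5, rng)) for _ in range(1000)]
all_ok = all(lhs <= rhs for lhs, rhs in results)
print(all_ok)
```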

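        Applying Lemma 11.1.1 to (11.4), we obtain that
        \[
        \begin{aligned}
        (\eta_\circ * \eta_\circ)(\rho) &= \int_{-\infty}^{\infty} \eta_\circ(t)\eta_\circ((2-\rho) + t)\, dt \\
        &= \sqrt{\int_{-\infty}^{\infty} |\eta_\circ(t)|^2\, dt}\,\sqrt{\int_{-\infty}^{\infty} |\eta_\circ((2-\rho) + t)|^2\, dt}
        + O^*\left(2.71\int_{-\infty}^{\infty} |\eta_\circ(t) - \eta_\circ((2-\rho) + t)|^2\, dt\right) \\
        &= |\eta_\circ|_2^2 + O^*\left(2.71\int_{-\infty}^{\infty}\left(\int_0^{2-\rho} |\eta_\circ'(r + t)|\, dr\right)^2 dt\right) \\
        &= |\eta_\circ|_2^2 + O^*\left(2.71(2-\rho)\int_0^{2-\rho}\int_{-\infty}^{\infty} |\eta_\circ'(r + t)|^2\, dt\, dr\right)
        = |\eta_\circ|_2^2 + O^*(2.71(2-\rho)^2|\eta_\circ'|_2^2).
        \end{aligned}
        \tag{11.5}
        \]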

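        We will be working with $\eta_*$ supported on the non-negative reals; we recall that $\eta_\circ$ is supported on $[0,2]$. Hence
        \[
        \begin{aligned}
        \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\eta_*\left(\frac{N}{x} - (t_1+t_2)\right) dt_1\, dt_2
        &= \int_0^{\frac{N}{x}} (\eta_\circ * \eta_\circ)(\rho)\,\eta_*\left(\frac{N}{x} - \rho\right) d\rho \\
        &= \int_0^{\frac{N}{x}} \left(|\eta_\circ|_2^2 + O^*(2.71(2-\rho)^2|\eta_\circ'|_2^2)\right)\cdot\eta_*\left(\frac{N}{x} - \rho\right) d\rho \\
        &= |\eta_\circ|_2^2\int_0^{\frac{N}{x}} \eta_*(\rho)\, d\rho + 2.71|\eta_\circ'|_2^2\cdot O^*\left(\int_0^{\frac{N}{x}} ((2 - N/x) + \rho)^2\eta_*(\rho)\, d\rho\right),
        \end{aligned}
        \tag{11.6}
        \]
        provided that $N/x \ge 2$. We see that it will be wise to set $N/x$ very slightly larger than 2. As we said before, $\eta_*$ will be scaled so that it is concentrated on a small interval $[0,\epsilon)$.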

        11.2 The smoothing function $\eta_*$: adapting minor-arc bounds

        Here the challenge is to define a smoothing function $\eta_*$ that is good both for minor-arc estimates and for major-arc estimates. The two regimes tend to favor different kinds of smoothing function. For minor-arc estimates, we use, as [Tao14] did,
        \[
        \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0) = ((2I_{[1/2,1]}) *_M (2I_{[1/2,1]}))(t),
        \tag{11.7}
        \]
        where $I_{[1/2,1]}(t)$ is 1 if $t \in [1/2,1]$ and 0 otherwise. For major-arc estimates, we will use a function based on
        \[
        \eta_\heartsuit = e^{-t^2/2}.
        \]
        We will actually use here the function $t^2 e^{-t^2/2}$, whose Mellin transform is $M\eta_\heartsuit(s+2)$ (by, e.g., [BBO10, Table 11.1]).

        We will follow the simple expedient of convolving the two smoothing functions, one good for minor arcs, the other one for major arcs. In general, let $\varphi_1, \varphi_2 : [0,\infty) \to \mathbb{C}$. It is easy to use bounds on sums of the form

        \[
        S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x)
        \tag{11.8}
        \]
        to bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$:
        \[
        S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n)(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right)
        = \int_0^\infty \sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\frac{dw}{w}
        = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\frac{dw}{w}.
        \tag{11.9}
        \]
        The same holds, of course, if $\varphi_1$ and $\varphi_2$ are switched, since $\varphi_1 *_M \varphi_2 = \varphi_2 *_M \varphi_1$. The only objection is that the bounds on (11.8) that we input might not be valid, or non-trivial, when the argument $wx$ of $S_{f,\varphi_1}(wx)$ is very small. Because of this, it is important that the functions $\varphi_1, \varphi_2$ vanish at 0, and desirable that their first derivatives do so as well.
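The closed form for $\eta_2$ as a multiplicative convolution of $2I_{[1/2,1]}$ with itself can be confirmed numerically. The sketch below assumes the usual convention $(\varphi_1 *_M \varphi_2)(t) = \int_0^\infty \varphi_1(w)\varphi_2(t/w)\,dw/w$ and uses a plain midpoint rule; the function names, the step count and the tolerance are choices made for this example only.

```python
import math


def eta1(t):
    """eta_1 = 2 * indicator of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2_closed(t):
    """Closed form 4 * max(log 2 - |log 2t|, 0), for t > 0."""
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)


def mellin_conv_eta1(t, steps=20000):
    """Midpoint rule for the integral of eta1(w) eta1(t/w) dw/w.

    eta1 vanishes outside [1/2, 1], so w only needs to range over that interval.
    """
    width = 0.5 / steps
    total = 0.0
    for k in range(steps):
        w = 0.5 + (k + 0.5) * width
        total += eta1(w) * eta1(t / w) / w
    return total * width


samples = [0.2, 0.3, 0.4, 0.5, 0.7, 0.9, 1.1]
max_error = max(abs(mellin_conv_eta1(t) - eta2_closed(t)) for t in samples)
print(max_error)
```

The integrand has jump discontinuities, so the midpoint rule only converges at rate $O(1/\text{steps})$; that is enough here, since the point is to compare shapes rather than to compute to high precision.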

        Let us see how this works out in practice for $\varphi_1 = \eta_2$. Here $\eta_2 : [0,\infty) \to \mathbb{R}$ is given by
        \[
        \eta_2 = \eta_1 *_M \eta_1 = 4\max(\log 2 - |\log 2t|, 0),
        \tag{11.10}
        \]
        where $\eta_1 = 2\cdot I_{[1/2,1]}$.

        Let us restate the bounds from Theorem 3.1.1, the main result of Part I. We will use Lemma C.2.2 to bound terms of the form $q/\phi(q)$.

        Let $x \ge x_0$, $x_0 = 2.16\cdot 10^{20}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. Then, if $3 \le q \le x^{1/3}/6$, Theorem 3.1.1 gives us that
        \[
        |S_{\eta_2}(\alpha,x)| \le g_x\left(\max\left(1, \frac{|\delta|}{8}\right)\cdot q\right) x,
        \tag{11.11}
        \]
        where
        \[
        g_x(r) = \frac{(R_{x,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36x^{-1/6},
        \tag{11.12}
        \]

        with
        \[
        \begin{aligned}
        R_{x,t} &= 0.27125\log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415, \\
        L_t &= z(t/2)\left(\frac{13}{4}\log t + 7.82\right) + 13.66\log t + 37.55.
        \end{aligned}
        \tag{11.13}
        \]
        If $q > x^{1/3}/6$, then, again by Theorem 3.1.1,
        \[
        |S_{\eta_2}(\alpha,x)| \le h(x)x,
        \tag{11.14}
        \]
        where
        \[
        h(x) = 0.276x^{-1/6}(\log x)^{3/2} + 1234x^{-1/3}\log x.
        \tag{11.15}
        \]
        We will work with $x$ varying within a range, and so we must pay some attention to the dependence of (11.11) and (11.14) on $x$. Let us prove two auxiliary lemmas on this.

        Lemma 11.2.1. Let $g_x(r)$ be as in (11.12) and $h(x)$ as in (11.15). Then
        \[
        x \mapsto \begin{cases} h(x) & \text{if } x < (6r)^3, \\ g_x(r) & \text{if } x \ge (6r)^3 \end{cases}
        \]
        is a decreasing function of $x$ for $r \ge 11$ fixed and $x \ge 21$.

        Proof. It is clear from the definitions that $x \mapsto h(x)$ (for $x \ge 21$) and $x \mapsto g_x(r)$ are both decreasing. Thus, we simply have to show that $h(x_r) \ge g_{x_r}(r)$ for $x_r = (6r)^3$. Since $x_r \ge (6\cdot 11)^3 > e^{12.5}$,
        \[
        \begin{aligned}
        R_{x_r,2r} &\le 0.27125\log(0.065\log x_r + 1.056) + 0.41415 \\
        &\le 0.27125\log((0.065 + 0.0845)\log x_r) + 0.41415 \le 0.27215\log\log x_r.
        \end{aligned}
        \]
        Hence
        \[
        \begin{aligned}
        R_{x_r,2r}\log 2r + 0.5 &\le 0.27215\log\log x_r\log x_r^{1/3} - 0.27215\log 12.5\log 3 + 0.5 \\
        &\le 0.09072\log\log x_r\log x_r - 0.255.
        \end{aligned}
        \]
        At the same time,
        \[
        z(r) = e^\gamma\log\log\frac{x_r^{1/3}}{6} + \frac{2.50637}{\log\log r} \le e^\gamma\log\log x_r - e^\gamma\log 3 + 1.9521 \le e^\gamma\log\log x_r
        \tag{11.16}
        \]
        for $r \ge 37$, and we also get $z(r) \le e^\gamma\log\log x_r$ for $r \in [11,37]$ by the bisection method (with 10 iterations). Hence
        \[
        \begin{aligned}
        (R_{x_r,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5
        &\le (0.09072\log\log x_r\log x_r - 0.255)\sqrt{e^\gamma\log\log x_r} + 2.5 \\
        &\le 0.1211\log x_r(\log\log x_r)^{3/2} + 2,
        \end{aligned}
        \]

        and so
        \[
        \frac{(R_{x_r,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} \le (0.21\log x_r(\log\log x_r)^{3/2} + 3.47)x_r^{-1/6}.
        \]
        Now, by (11.16),
        \[
        \begin{aligned}
        L_{2r} &\le e^\gamma\log\log x_r\cdot\left(\frac{13}{4}\log(x_r^{1/3}/3) + 7.82\right) + 13.66\log(x_r^{1/3}/3) + 37.55 \\
        &\le e^\gamma\log\log x_r\cdot\left(\frac{13}{12}\log x_r + 4.25\right) + 4.56\log x_r + 22.55.
        \end{aligned}
        \]
        It is clear that
        \[
        \frac{4.25e^\gamma\log\log x_r + 4.56\log x_r + 22.55}{x_r^{1/3}/6} < 1234x_r^{-1/3}\log x_r
        \]
        for $x_r \ge e$: we make the comparison for $x_r = e$ and take the derivative of the ratio of the left side by the right side.

        It remains to show that
        \[
        0.21\log x_r(\log\log x_r)^{3/2} + 3.47 + 3.36 + \frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r
        \tag{11.17}
        \]
        is less than $0.276(\log x_r)^{3/2}$ for $x_r$ large enough. Since $t \mapsto (\log t)^{3/2}/t^{1/2}$ is decreasing for $t > e^3$, we see that
        \[
        \frac{0.21\log x_r(\log\log x_r)^{3/2} + 6.83 + \frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r}{0.276(\log x_r)^{3/2}} < 1
        \]
        for all $x_r \ge e^{33}$, simply because it is true for $x_r = e^{33}$, which is greater than $e^{e^3}$.

        We conclude that $h(x_r) \ge g_{x_r}(r) = g_{x_r}(x_r^{1/3}/6)$ for $x_r \ge e^{33}$. We check that $h(x_r) \ge g_{x_r}(x_r^{1/3}/6)$ for $\log x_r \in [\log 66^3, 33]$ as well by the bisection method (applied with 30 iterations, with $\log x_r$ as the variable, on the intervals $[\log 66^3, 20]$, $[20,25]$, $[25,30]$ and $[30,33]$). Since $r \ge 11$ implies $x_r \ge 66^3$, we are done.
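The key inequality $h(x_r) \ge g_{x_r}(r)$ at the crossover point $x_r = (6r)^3$ can be spot-checked in floating point. The sketch below transcribes $h$, $g_x$, $R_{x,t}$, $L_t$ and $z$ from (11.12), (11.13), (11.15) and (11.16) as they are read here; in particular, the formula used for $z$ is the one appearing in the proof, and the decimal placement of the constants is an assumption of this reconstruction. It is not the interval-arithmetic bisection the proof relies on.

```python
import math

EULER_GAMMA = 0.5772156649015329


def z(r):
    """z(r) = e^gamma log log r + 2.50637 / log log r, for r > e."""
    ll = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * ll + 2.50637 / ll


def R(x, t):
    """R_{x,t} as in (11.13)."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415


def L(t):
    """L_t as in (11.13)."""
    return z(t / 2) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55


def g(x, r):
    """g_x(r) as in (11.12)."""
    main = ((R(x, 2 * r) * math.log(2 * r) + 0.5) * math.sqrt(z(r)) + 2.5) / math.sqrt(2 * r)
    return main + L(2 * r) / r + 3.36 * x ** (-1 / 6)


def h(x):
    """h(x) as in (11.15)."""
    return 0.276 * x ** (-1 / 6) * math.log(x) ** 1.5 + 1234 * x ** (-1 / 3) * math.log(x)


# Compare h and g at the crossover x_r = (6r)^3 for a few sample values of r.
crossover = {r: (h(float(6 * r) ** 3), g(float(6 * r) ** 3, r)) for r in (11, 100, 10**4, 10**10)}
for r, (h_val, g_val) in crossover.items():
    print(r, h_val, g_val)
```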

        Lemma 11.2.2. Let $R_{x,r}$ be as in (11.12). Then $t \to R_{e^t,r}(r)$ is convex-up for $t \ge 3\log 6r$.

        Proof. Since $t \to e^{-t/6}$ and $t \to \sqrt{t}$ are clearly convex-up, all we have to do is to show that $t \to R_{e^t,r}$ is convex-up. In general, since
        \[
        (\log f)'' = \left(\frac{f'}{f}\right)' = \frac{f''f - (f')^2}{f^2},
        \]
        a function of the form $(\log f)$ is convex-up exactly when $f''f - (f')^2 \ge 0$. If $f(t) = 1 + a/(t-b)$, we have $f''f - (f')^2 \ge 0$ whenever
        \[
        (t + a - b)\cdot(2a) \ge a^2,
        \]
        i.e., $a^2 + 2at \ge 2ab$, and that certainly happens when $t \ge b$. In our case, $b = 3\log(2.004r/9)$, and so $t \ge 3\log 6r$ implies $t \ge b$.

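        Now we come to the point where we prove bounds on exponential sums of the form $S_{\eta_*}(\alpha,x)$ (that is, sums based on the smoothing $\eta_*$) based on our bounds (11.11) and (11.14) on the exponential sums $S_{\eta_2}(\alpha,x)$. This is straightforward, as promised.

        Proposition 11.2.3. Let $x \ge Kx_0$, $x_0 = 2.16\cdot 10^{20}$, $K \ge 1$. Let $S_\eta(\alpha,x)$ be as in (10.1). Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi : [0,\infty) \to [0,\infty)$ is continuous and in $L_1$.

        Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le (x/K)^{1/3}/6$, then
        \[
        S_{\eta_*}(\alpha,x) \le g_{x,\varphi}\left(\max\left(1, \frac{|\delta|}{8}\right) q\right)\cdot|\varphi|_1 x,
        \tag{11.18}
        \]
        where
        \[
        \begin{aligned}
        g_{x,\varphi}(r) &= \frac{(R_{x,K,\varphi,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36K^{1/6}x^{-1/6}, \\
        R_{x,K,\varphi,t} &= R_{x,t} + (R_{x/K,t} - R_{x,t})\frac{C_{\varphi,2,K}/|\varphi|_1}{\log K},
        \end{aligned}
        \tag{11.19}
        \]
        with $R_{x,t}$ and $L_t$ as in (11.13), and
        \[
        C_{\varphi,2,K} = -\int_{1/K}^1 \varphi(w)\log w\, dw.
        \tag{11.20}
        \]
        If $q > (x/K)^{1/3}/6$, then
        \[
        |S_{\eta_*}(\alpha,x)| \le h_\varphi(x/K)\cdot|\varphi|_1 x,
        \]
        where
        \[
        h_\varphi(x) = h(x) + C_{\varphi,0,K}/|\varphi|_1, \qquad C_{\varphi,0,K} = 1.04488\int_0^{1/K} |\varphi(w)|\, dw
        \tag{11.21}
        \]
        and $h(x)$ is as in (11.15).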

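        Proof. By (11.9),
        \[
        S_{\eta_*}(\alpha,x) = \int_0^{1/K} S_{\eta_2}(\alpha,wx)\varphi(w)\frac{dw}{w} + \int_{1/K}^\infty S_{\eta_2}(\alpha,wx)\varphi(w)\frac{dw}{w}.
        \]
        We bound the first integral by the trivial estimate $|S_{\eta_2}(\alpha,wx)| \le |S_{\eta_2}(0,wx)|$ and Cor. C.1.3:
        \[
        \int_0^{1/K} |S_{\eta_2}(0,wx)|\varphi(w)\frac{dw}{w} \le 1.04488\int_0^{1/K} wx\varphi(w)\frac{dw}{w} = 1.04488x\cdot\int_0^{1/K} \varphi(w)\, dw.
        \]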

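        If $w \ge 1/K$, then $wx \ge x_0$, and we can use (11.11) or (11.14). If $q > (x/K)^{1/3}/6$, then $|S_{\eta_2}(\alpha,wx)| \le h(x/K)wx$ by (11.14); moreover, $|S_{\eta_2}(\alpha,y)| \le h(y)y$ for $x/K \le y < (6q)^3$ (by (11.14)) and $|S_{\eta_2}(\alpha,y)| \le g_{y,1}(r)$ for $y \ge (6q)^3$ (by (11.11)). Thus, Lemma 11.2.1 gives us that
        \[
        \int_{1/K}^\infty |S_{\eta_2}(\alpha,wx)|\varphi(w)\frac{dw}{w} \le \int_{1/K}^\infty h(x/K)wx\cdot\varphi(w)\frac{dw}{w}
        = h(x/K)x\int_{1/K}^\infty \varphi(w)\, dw \le h(x/K)|\varphi|_1\cdot x.
        \]
        If $q \le (x/K)^{1/3}/6$, we always use (11.11). We can use the coarse bound
        \[
        \int_{1/K}^\infty 3.36x^{-1/6}\cdot wx\cdot\varphi(w)\frac{dw}{w} \le 3.36K^{1/6}|\varphi|_1 x^{5/6}.
        \]
        Since $L_r$ does not depend on $x$,
        \[
        \int_{1/K}^\infty \frac{L_r}{r}\cdot wx\cdot\varphi(w)\frac{dw}{w} \le \frac{L_r}{r}|\varphi|_1 x.
        \]
        By Lemma 11.2.2 and $q \le (x/K)^{1/3}/6$, $y \mapsto R_{e^y,t}$ is convex-up and decreasing for $y \in [\log(x/K),\infty)$. Hence
        \[
        R_{wx,t} \le \begin{cases} \frac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \frac{\log w}{\log\frac{1}{K}}\right)R_{x,t} & \text{if } w < 1, \\ R_{x,t} & \text{if } w \ge 1. \end{cases}
        \]
        Therefore
        \[
        \begin{aligned}
        \int_{1/K}^\infty R_{wx,t}\cdot wx\cdot\varphi(w)\frac{dw}{w}
        &\le \int_{1/K}^1 \left(\frac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \frac{\log w}{\log\frac{1}{K}}\right)R_{x,t}\right)x\varphi(w)\, dw + \int_1^\infty R_{x,t}\varphi(w)x\, dw \\
        &\le R_{x,t}x\cdot\int_{1/K}^\infty \varphi(w)\, dw + (R_{x/K,t} - R_{x,t})\frac{x}{\log\frac{1}{K}}\int_{1/K}^1 \varphi(w)\log w\, dw \\
        &\le \left(R_{x,t}|\varphi|_1 + (R_{x/K,t} - R_{x,t})\frac{C_{\varphi,2,K}}{\log K}\right)\cdot x,
        \end{aligned}
        \]
        where
        \[
        C_{\varphi,2,K} = -\int_{1/K}^1 \varphi(w)\log w\, dw.
        \]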

        We finish by proving a couple more lemmas.

        Lemma 11.2.4. Let $x > K > 1$. Let $\eta_* = \eta_2 *_M \varphi$, where $\eta_2$ is as in (11.10) and $\varphi : [0,\infty) \to [0,\infty)$ is continuous and in $L_1$. Let $g_{x,\varphi}$ be as in (11.19).

        Then $g_{x,\varphi}(r)$ is a decreasing function of $r$ for $670 \le r \le (x/K)^{1/3}/6$.

        Proof. Taking derivatives, we can easily see that
        \[
        r \mapsto \frac{\log\log r}{r}, \quad r \mapsto \frac{\log r}{r}, \quad r \mapsto \frac{\log r\log\log r}{r}, \quad r \mapsto \frac{(\log r)^2\log\log r}{r}
        \tag{11.22}
        \]
        are decreasing for $r \ge 20$. The same is true if $\log\log r$ is replaced by $z(r)$, since $z(r)/\log\log r$ is a decreasing function for $r \ge e$. Since $(C_{\varphi,2,K}/|\varphi|_1)/\log K \le 1$ (by (11.20)), we see that it is enough to prove that $r \mapsto R_{y,2r}\log 2r\sqrt{\log\log r}/\sqrt{2r}$ is decreasing on $r$ for $y = x$ and $y = x/K$ (under the assumption that $r \ge 670$).

        Looking at (11.13) and at (11.22), we see that it remains only to check that
        \[
        r \mapsto \log\left(1 + \frac{\log 8r}{2\log\frac{9y^{1/3}}{4.008r}}\right)\log 2r\cdot\sqrt{\frac{\log\log r}{r}}
        \tag{11.23}
        \]
        is decreasing on $r$ for $r \ge 670$. Taking logarithms, and then derivatives, we see that we have to show that
        \[
        \frac{\frac{1}{r}\ell + \frac{\log 8r}{r}}{2\ell^2\left(1 + \frac{\log 8r}{2\ell}\right)\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{1}{r\log 2r} + \frac{1}{2r\log r\log\log r} < \frac{1}{2r},
        \]
        where $\ell = \log\frac{9y^{1/3}}{4.008r}$. We multiply by $2r$, and see that this is equivalent to
        \[
        \frac{\frac{1}{\ell}\left(2 - \frac{1}{1 + \frac{\log 8r}{2\ell}}\right)}{\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1.
        \tag{11.24}
        \]
        A derivative test is enough to show that $s/\log(1+s)$ is an increasing function of $s$ for $s > 0$; hence, so is $s\cdot(2 - 1/(1+s))/\log(1+s)$. Setting $s = (\log 8r)/2\ell$, we obtain that the left side of (11.24) is a decreasing function of $\ell$ for $r \ge 1$ fixed.

        Since $r \le y^{1/3}/6$, $\ell \ge \log(54/4.008) > 2.6$. Thus, for (11.24) to hold, it is enough to ensure that
        \[
        \frac{\frac{1}{2.6}\left(2 - \frac{1}{1 + \frac{\log 8r}{5.2}}\right)}{\log\left(1 + \frac{\log 8r}{5.2}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1.
        \tag{11.25}
        \]
        A derivative test shows that $(2 - 1/s)/\log(1+s)$ is a decreasing function of $s$ for $s \ge 1.23$; since $\log(8\cdot 75)/5.2 > 1.23$, this implies that the left side of (11.25) is a decreasing function of $r$ for $r \ge 75$.

        We check that the left side of (11.25) is indeed less than 1 for $r = 670$; we conclude that it is less than 1 for all $r \ge 670$.
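The final numerical step of this proof, that the left side of the last displayed inequality (with $\ell$ replaced by $2.6$) is below 1 at $r = 670$ and decreases from there, can be reproduced in a few lines. This is a floating-point sketch under the reading of the formula given here, with a function name chosen for this example; the margin at $r = 670$ is small, so it illustrates the claim rather than replacing a rigorous evaluation.

```python
import math


def lhs_at_ell_2_6(r):
    """Left side of the inequality with ell = 2.6, i.e. 2*ell = 5.2."""
    s = math.log(8 * r) / 5.2
    first = (2.0 - 1.0 / (1.0 + s)) / math.log(1.0 + s) / 2.6
    second = 2.0 / math.log(2 * r)
    third = 1.0 / (math.log(r) * math.log(math.log(r)))
    return first + second + third


value_at_670 = lhs_at_ell_2_6(670)
sample_rs = [75, 100, 300, 670, 10**4, 10**6, 10**8]
sample_values = [lhs_at_ell_2_6(r) for r in sample_rs]
print(value_at_670)
```

The value at $r = 670$ is only slightly below 1, which explains why the threshold $670$ appears in the statement of the lemma.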

        Lemma 11.2.5. Let $x \ge 10^{25}$. Let $\phi : [0,\infty) \to [0,\infty)$ be continuous and in $L_1$. Let $g_{x,\phi}(r)$ and $h(x)$ be as in (11.19) and (11.15), respectively. Then
        \[
        g_{x,\phi}\left(\frac{3}{8}x^{4/15}\right) \ge h(2x/\log x).
        \]

        Proof. We can bound $g_{x,\phi}(r)$ from below by
        \[
        g_{m,x}(r) = \frac{(R_{x,r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}}.
        \]
        Let $r = (3/8)x^{4/15}$. Using the assumption that $x \ge 10^{25}$, we see that
        \[
        R_{x,r} = 0.27125\log\left(1 + \frac{\log\left(\frac{3x^{4/15}}{2}\right)}{2\log\left(\frac{9}{2.004\cdot\frac{3}{8}}\cdot x^{\frac{1}{3}-\frac{4}{15}}\right)}\right) + 0.41415 \ge 0.63368.
        \tag{11.26}
        \]
        (It is easy to see that the left side of (11.26) is increasing on $x$.) Using $x \ge 10^{25}$ again, we get that
        \[
        z(r) = e^\gamma\log\log r + \frac{2.50637}{\log\log r} \ge 5.68721.
        \]
        Since $\log 2r = (4/15)\log x + \log(3/4)$, we conclude that
        \[
        g_{m,x}(r) \ge \frac{0.40298\log x + 3.25765}{\sqrt{3/4}\cdot x^{2/15}}.
        \]
        Recall that
        \[
        h(x) = \frac{0.276(\log x)^{3/2}}{x^{1/6}} + \frac{1234\log x}{x^{1/3}}.
        \]
        We can see that
        \[
        x \mapsto \frac{(\log x + 3.3)/x^{2/15}}{(\log(2x/\log x))^{3/2}/(2x/\log x)^{1/6}}
        \tag{11.27}
        \]
        is increasing for $x \ge 10^{25}$ (and indeed for $x \ge e^{27}$) by taking the logarithm of the right side of (11.27) and then taking its derivative with respect to $t = \log x$. We can see in the same way that $(1/x^{2/15})/(\log(2x/\log x)/(2x/\log x)^{1/3})$ is increasing for $x \ge e^{22}$. Since
        \[
        \begin{aligned}
        \frac{0.40298(\log x + 3.3)}{\sqrt{3/4}\cdot x^{2/15}} &\ge \frac{0.276(\log(2x/\log x))^{3/2}}{(2x/\log x)^{1/6}}, \\
        \frac{3.25765 - 3.3\cdot 0.40298}{\sqrt{3/4}\cdot x^{2/15}} &\ge \frac{1234\log(2x/\log x)}{(2x/\log x)^{1/3}}
        \end{aligned}
        \]
        for $x = 10^{25}$, we are done.
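The two inequalities that close this proof are stated at the single point $x = 10^{25}$, so they can be evaluated directly. The sketch below does so in floating point, reading the argument of $h$ as $2x/\log x$ and the constants as $0.40298$, $3.25765$, $3.3$, $0.276$ and $1234$; these decimal placements are part of the reconstruction and should be treated as assumptions. The variable names are illustrative.

```python
import math

x = 1e25
log_x = math.log(x)
y = 2 * x / log_x                        # argument of h in the lemma
denom = math.sqrt(3 / 4) * x ** (2 / 15)  # sqrt(2r) for r = (3/8) x^(4/15)

# First inequality: the (log x)^(3/2) x^(-1/6) part of h.
lhs_first = 0.40298 * (log_x + 3.3) / denom
rhs_first = 0.276 * math.log(y) ** 1.5 / y ** (1 / 6)

# Second inequality: the x^(-1/3) log x part of h.
lhs_second = (3.25765 - 3.3 * 0.40298) / denom
rhs_second = 1234 * math.log(y) / y ** (1 / 3)

# Adding the two recovers the lower bound for g_{m,x}(r) against h(y).
lower_bound_g = (0.40298 * log_x + 3.25765) / denom
h_of_y = rhs_first + rhs_second

print(lhs_first, rhs_first)
print(lhs_second, rhs_second)
print(lower_bound_g, h_of_y)
```

The first inequality holds with a relative margin well under one percent at $x = 10^{25}$, which suggests that the constant $3.3$ was chosen to balance the two parts of $h$.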

        Chapter 12

        The `2 norm and the large sieve

Our aim here is to give a bound on the $\ell_2$ norm of an exponential sum over the minor arcs. While we care about an exponential sum in particular, we will prove a result valid for all exponential sums $S(\alpha,x) = \sum_n a_n e(\alpha n)$ with $a_n$ of prime support.

We start by adapting ideas from Ramaré's version of the large sieve for primes to estimate $\ell_2$ norms over parts of the circle (§12.1). We are left with the task of giving an explicit bound on the factor in Ramaré's work; this we do in §12.2. As a side effect, this finally gives a fully explicit large sieve for primes that is asymptotically optimal, meaning a sieve that does not have a spurious factor of $e^{\gamma}$ in front; this was an arguably important gap in the literature.

12.1 Variations on the large sieve for primes

We are trying to estimate an integral $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^3\, d\alpha$. Instead of bounding it trivially by $|S|_\infty |S|_2^2$, we can use the fact that large (“major”) values of $S(\alpha)$ have to be multiplied only by $\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$, where $\mathfrak{M}$ is a union (small in measure) of major arcs. Now, can we give an upper bound for $\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$ better than $|S|_2^2 = \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha$?

The first version of [Helb] gave an estimate on that integral using a technique due to Heath-Brown, which in turn rests on an inequality of Montgomery's ([Mon71, (3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by Heath-Brown to the present author, who communicated it to Tao, who used it in his own notable work on sums of five primes (see [Tao14, Lem. 4.6] and adjoining comments). We will be able to do better than that estimate here.

The role played by Montgomery's inequality in Heath-Brown's method is played here by a result of Ramaré's ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]). The following proposition is based on Ramaré's result, or rather on one possible proof of it. Instead of using the result as stated in [Ram09], we will actually be using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply integrating Ramaré's inequality would give a non-trivial if slightly worse bound.


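Proposition 12.1.1. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \leq \sqrt{x}$. Let $Q_0 \geq 1$, $\delta_0 \geq 1$ be such that $\delta_0 Q_0^2 \leq x/2$; set $Q = \sqrt{x/2\delta_0} \geq Q_0$. Let
\[ \mathfrak{M} = \bigcup_{q \leq Q_0} \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{qx}, \frac{a}{q} + \frac{\delta_0 r}{qx}\right). \tag{12.1} \]
Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \leq \left(\max_{q \leq Q_0} \max_{s \leq Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}\right) \sum_n |a_n|^2, \]
where
\[ G_q(R) = \sum_{\substack{r \leq R \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.2} \]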

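Proof. By (12.1),
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha = \sum_{q \leq Q_0} \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \tag{12.3} \]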

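Thanks to the last equations of [Bom74, p. 24] and [Bom74, p. 25],
\[ \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q}\right)\right|^2 = \frac{1}{\phi(q)} \sum_{\substack{q^*|q \\ (q^*,q/q^*)=1 \\ \mu^2(q/q^*)=1}} q^* \cdot \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n \chi(n)\right|^2 \]
for every $q \leq \sqrt{x}$, where we use the assumption that $n$ is prime and $> \sqrt{x}$ (and thus coprime to $q$) when $a_n \neq 0$. Hence
\[ \begin{aligned}
\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha &= \sum_{q \leq Q_0} \sum_{\substack{q^*|q \\ (q^*,q/q^*)=1 \\ \mu^2(q/q^*)=1}} q^* \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \frac{1}{\phi(q)} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\
&= \sum_{q^* \leq Q_0} \frac{q^*}{\phi(q^*)} \sum_{\substack{r \leq Q_0/q^* \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q_0}{q^* r x}}^{\frac{\delta_0 Q_0}{q^* r x}} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\
&= \sum_{q^* \leq Q_0} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \leq \frac{Q_0}{q^*} \min\left(1, \frac{\delta_0}{|\alpha| x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha.
\end{aligned} \]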


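Here $|\alpha| \leq \delta_0 Q_0/q^* x$ implies $(Q_0/q^*)\,\delta_0/|\alpha| x \geq 1$. Therefore,
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \leq \left(\max_{q^* \leq Q_0} \max_{s \leq Q_0/q^*} \frac{G_{q^*}(Q_0/sq^*)}{G_{q^*}(Q/sq^*)}\right) \cdot \Sigma, \tag{12.4} \]
where
\[ \begin{aligned}
\Sigma &= \sum_{q^* \leq Q_0} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \leq \frac{Q}{q^*} \min\left(1, \frac{\delta_0}{|\alpha| x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\
&\leq \sum_{q \leq Q} \frac{q}{\phi(q)} \sum_{\substack{r \leq Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sum^{*}_{\chi \bmod q} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha.
\end{aligned} \]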

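As stated in the proof of [Bom74, Thm. 7A],
\[ \overline{\chi}(r)\chi(n)\tau(\overline{\chi}) c_r(n) = \sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi}(b) e^{2\pi i n \frac{b}{qr}} \]
for $\chi$ primitive of modulus $q$. Here $c_r(n)$ stands for the Ramanujan sum
\[ c_r(n) = \sum_{\substack{u \bmod r \\ (u,r)=1}} e^{2\pi i n u/r}. \]
For $n$ coprime to $r$, $c_r(n) = \mu(r)$. Since $\chi$ is primitive, $|\tau(\chi)| = \sqrt{q}$. Hence, for squarefree $r \leq \sqrt{x}$ coprime to $q$,
\[ q \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 = \left|\sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi}(b) S\left(\frac{b}{qr} + \alpha\right)\right|^2. \]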

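Thus,
\[ \begin{aligned}
\Sigma &\leq \sum_{q \leq Q} \sum_{\substack{r \leq Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(rq)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sum^{*}_{\chi \bmod q} \left|\sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi}(b) S\left(\frac{b}{qr} + \alpha\right)\right|^2 d\alpha \\
&\leq \sum_{q \leq Q} \frac{1}{\phi(q)} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\chi \bmod q} \left|\sum_{\substack{b=1 \\ (b,q)=1}}^{q} \overline{\chi}(b) S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha \\
&= \sum_{q \leq Q} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha.
\end{aligned} \]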


Let us now check that the intervals $(b/q - \delta_0 Q/qx, b/q + \delta_0 Q/qx)$ do not overlap. Since $Q = \sqrt{x/2\delta_0}$, we see that $\delta_0 Q/qx = 1/2qQ$. The difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \leq Q$, $1/qq' \geq 1/2qQ + 1/2Qq'$. Hence the intervals around $b/q$ and $b'/q'$ do not overlap. We conclude that
\[ \Sigma \leq \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2, \]
and so, by (12.4), we are done.
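The non-overlap step can be illustrated for small $Q$ with exact rational arithmetic. The Python sketch below is only an illustration (the function names `arcs` and `disjoint` are ours, not the author's): it lists the arcs of half-width $1/2qQ$ around the reduced fractions $b/q$, $q \leq Q$, and checks that consecutive arcs, including the pair that wraps around modulo 1, do not overlap.

```python
from fractions import Fraction
from math import gcd

def arcs(Q):
    """Sorted (left, right) endpoints of the arcs around b/q, q <= Q, of half-width 1/(2qQ)."""
    out = []
    for q in range(1, Q + 1):
        half = Fraction(1, 2 * q * Q)
        for b in range(1, q + 1):
            if gcd(b, q) == 1:
                center = Fraction(b, q)
                out.append((center - half, center + half))
    return sorted(out)

def disjoint(Q):
    """True if the open arcs are pairwise disjoint on the circle R/Z."""
    a = arcs(Q)
    # After sorting by left endpoint it is enough to compare neighbours.
    inside = all(a[i][1] <= a[i + 1][0] for i in range(len(a) - 1))
    # The last arc (around 1 = 0 mod 1) must not reach the first arc shifted by 1.
    wrap = a[-1][1] <= a[0][0] + 1
    return inside and wrap

if __name__ == "__main__":
    print(all(disjoint(Q) for Q in range(1, 31)))
```

Open arcs that merely share an endpoint count as disjoint here, which matches the non-strict inequality $1/qq' \geq 1/2qQ + 1/2Qq'$ used in the proof.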

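We will actually use Prop. 12.1.1 in the slightly modified form given by the following statement.

Proposition 12.1.2. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \leq \sqrt{x}$. Let $Q_0 \geq 1$, $\delta_0 \geq 1$ be such that $\delta_0 Q_0^2 \leq x/2$; set $Q = \sqrt{x/2\delta_0} \geq Q_0$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then
\[ \int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \leq \left(\max_{\substack{q \leq 2Q_0 \\ q \text{ even}}} \max_{s \leq 2Q_0/q} \frac{G_q(2Q_0/sq)}{G_q(2Q/sq)}\right) \sum_n |a_n|^2, \]
where
\[ G_q(R) = \sum_{\substack{r \leq R \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.5} \]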

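Proof. By (10.5),
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha = \sum_{\substack{q \leq Q_0 \\ q \text{ odd}}} \int_{-\frac{\delta_0 Q_0}{2qx}}^{\frac{\delta_0 Q_0}{2qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha + \sum_{\substack{q \leq Q_0 \\ q \text{ even}}} \int_{-\frac{\delta_0 Q_0}{qx}}^{\frac{\delta_0 Q_0}{qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \]
We proceed as in the proof of Prop. 12.1.1. We still have (12.3). Hence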

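$\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha$ equals
\[ \begin{aligned}
&\sum_{\substack{q^* \leq Q_0 \\ q^* \text{ odd}}} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{2q^* x}}^{\frac{\delta_0 Q_0}{2q^* x}} \sum_{\substack{r \leq \frac{Q_0}{q^*} \min\left(1, \frac{\delta_0}{2|\alpha| x}\right) \\ (r,2q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\
&+ \sum_{\substack{q^* \leq 2Q_0 \\ q^* \text{ even}}} \frac{q^*}{\phi(q^*)} \int_{-\frac{\delta_0 Q_0}{q^* x}}^{\frac{\delta_0 Q_0}{q^* x}} \sum_{\substack{r \leq \frac{2Q_0}{q^*} \min\left(1, \frac{\delta_0}{2|\alpha| x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)} \sum^{*}_{\chi \bmod q^*} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha.
\end{aligned} \]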


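(The sum with $q$ odd and $r$ even is equal to the first sum; hence the factor of 2 in front.) Therefore,
\[ \begin{aligned}
\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \leq{} & \left(\max_{\substack{q^* \leq Q_0 \\ q^* \text{ odd}}} \max_{s \leq Q_0/q^*} \frac{G_{2q^*}(Q_0/sq^*)}{G_{2q^*}(Q/sq^*)}\right) \cdot 2\Sigma_1 \\
& + \left(\max_{\substack{q^* \leq 2Q_0 \\ q^* \text{ even}}} \max_{s \leq 2Q_0/q^*} \frac{G_{q^*}(2Q_0/sq^*)}{G_{q^*}(2Q/sq^*)}\right) \cdot \Sigma_2,
\end{aligned} \tag{12.6} \]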

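where
\[ \begin{aligned}
\Sigma_1 &= \sum_{\substack{q \leq Q \\ q \text{ odd}}} \frac{q}{\phi(q)} \sum_{\substack{r \leq Q/q \\ (r,2q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{2qrx}}^{\frac{\delta_0 Q}{2qrx}} \sum^{*}_{\chi \bmod q} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\
&= \sum_{\substack{q \leq Q \\ q \text{ odd}}} \frac{q}{\phi(q)} \sum_{\substack{r \leq 2Q/q \\ (r,q)=1 \\ r \text{ even}}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sum^{*}_{\chi \bmod q} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha, \\
\Sigma_2 &= \sum_{\substack{q \leq 2Q \\ q \text{ even}}} \frac{q}{\phi(q)} \sum_{\substack{r \leq 2Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)} \int_{-\frac{\delta_0 Q}{qrx}}^{\frac{\delta_0 Q}{qrx}} \sum^{*}_{\chi \bmod q} \left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha.
\end{aligned} \]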

The two expressions within parentheses in (12.6) are actually equal.

Much as before, using [Bom74, Thm. 7A], we obtain that
\[ \begin{aligned}
\Sigma_1 &\leq \sum_{\substack{q \leq Q \\ q \text{ odd}}} \frac{1}{\phi(q)} \int_{-\frac{\delta_0 Q}{2qx}}^{\frac{\delta_0 Q}{2qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha, \\
\Sigma_1 + \Sigma_2 &\leq \sum_{\substack{q \leq 2Q \\ q \text{ even}}} \frac{1}{\phi(q)} \int_{-\frac{\delta_0 Q}{qx}}^{\frac{\delta_0 Q}{qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha.
\end{aligned} \]
Let us now check that the intervals of integration $(b/q - \delta_0 Q/2qx, b/q + \delta_0 Q/2qx)$ (for $q$ odd), $(b/q - \delta_0 Q/qx, b/q + \delta_0 Q/qx)$ (for $q$ even) do not overlap. Recall that $\delta_0 Q/qx = 1/2qQ$. The absolute value of the difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \leq Q$ odd, this is larger than $1/4qQ + 1/4Qq'$, and so the intervals do not overlap. For $q \leq Q$ odd and $q' \leq 2Q$ even (or vice versa), $1/qq' \geq 1/4qQ + 1/2Qq'$, and so, again, the intervals do not overlap. If $q \leq Q$ and $q' \leq Q$ are both even, then $|b/q - b'/q'|$ is actually $\geq 2/qq'$. Clearly, $2/qq' \geq 1/2qQ + 1/2Qq'$, and so again there is no overlap. We conclude that
\[ 2\Sigma_1 + \Sigma_2 \leq \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \]


12.2 Bounding the quotient in the large sieve for primes

The estimate given by Proposition 12.1.1 involves the quotient
\[ \max_{q \leq Q_0} \max_{s \leq Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}, \tag{12.7} \]
where $G_q$ is as in (12.2). The appearance of such a quotient (at least for $s = 1$) is typical of Ramaré's version of the large sieve for primes; see, e.g., [Ram09]. We will see how to bound such a quotient in a way that is essentially optimal, not just asymptotically, but also in the ranges that are most relevant to us. (This includes, for example, $Q_0 \sim 10^6$, $Q \sim 10^{15}$.)

As the present work shows, an approach based on Ramaré's work gives bounds that are, in some contexts, better than those of other large sieves for primes by a constant factor (approaching $e^{\gamma} = 1.78107\ldots$). Thus, giving a fully explicit and nearly optimal bound for (12.7) is a task of clear general relevance, besides being needed for our main goal.

We will obtain bounds for $G_q(Q_0/sq)/G_q(Q/sq)$ when $Q_0 \leq 2 \cdot 10^{10}$, $Q \geq Q_0^2$. As we shall see, our bounds will be best when $s = q = 1$ (or sometimes when $s = 1$ and $q = 2$ instead).

Write $G(R)$ for $G_1(R) = \sum_{r \leq R} \mu^2(r)/\phi(r)$. We will need several estimates for $G_q(R)$ and $G(R)$. As stated in [Ram95, Lemma 3.4],
\[ G(R) \leq \log R + 1.4709 \tag{12.8} \]
for $R \geq 1$. By [MV73, Lem. 7],
\[ G(R) \geq \log R + 1.07 \tag{12.9} \]
for $R \geq 6$. There is also the trivial bound
\[ G(R) = \sum_{r \leq R} \frac{\mu^2(r)}{\phi(r)} = \sum_{r \leq R} \frac{\mu^2(r)}{r} \prod_{p|r} \left(1 - \frac{1}{p}\right)^{-1} = \sum_{r \leq R} \frac{\mu^2(r)}{r} \prod_{p|r} \sum_{j \geq 0} \frac{1}{p^j} \geq \sum_{r \leq R} \frac{1}{r} > \log R. \tag{12.10} \]
The following bound, also well-known and easy,
\[ G(R) \leq \frac{q}{\phi(q)} G_q(R) \leq G(Rq), \tag{12.11} \]
can be obtained by multiplying $G_q(R) = \sum_{r \leq R, (r,q)=1} \mu^2(r)/\phi(r)$ term-by-term by $q/\phi(q) = \prod_{p|q} (1 + 1/\phi(p))$.
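For small $R$, the bounds (12.8) to (12.11) are easy to observe numerically. The following Python sketch is an illustration only (plain floating point, helper names `sieve` and `G_table` are ours, and only integer values of $R$ are examined); it tabulates $G_q(R)$ directly from the definition (12.2).

```python
import math

def sieve(N):
    """Euler's phi and a squarefree flag for all integers 0..N."""
    phi = list(range(N + 1))
    squarefree = [True] * (N + 1)
    for p in range(2, N + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so p is prime
            for m in range(p, N + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, N + 1, p * p):
                squarefree[m] = False
    return phi, squarefree

def G_table(q, N, phi, squarefree):
    """List T with T[R] = G_q(R) for integers 0 <= R <= N."""
    T = [0.0] * (N + 1)
    for r in range(1, N + 1):
        if squarefree[r] and math.gcd(r, q) == 1:
            T[r] = T[r - 1] + 1.0 / phi[r]
        else:
            T[r] = T[r - 1]
    return T

if __name__ == "__main__":
    phi, sf = sieve(3000)
    G1 = G_table(1, 3000, phi, sf)
    for R in (7, 100, 1000, 3000):
        print(R, G1[R] - math.log(R))  # stays between 1.07 and 1.4709
```

At $R = 7$ the difference $G(R) - \log R$ is about $1.47076$, so the constant in (12.8) leaves almost no room there.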

We will also use Ramaré's estimate from [Ram95, Lem. 3.4]:
\[ G_d(R) = \frac{\phi(d)}{d} \left(\log R + c_E + \sum_{p|d} \frac{\log p}{p}\right) + O^*\left(7.284 R^{-1/3} f_1(d)\right) \tag{12.12} \]
for all $d \in \mathbb{Z}^+$ and all $R \geq 1$, where
\[ f_1(d) = \prod_{p|d} (1 + p^{-2/3}) \left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)^{-1} \tag{12.13} \]
and
\[ c_E = \gamma + \sum_{p \geq 2} \frac{\log p}{p(p-1)} = 1.3325822\ldots \tag{12.14} \]
by [RS62, (2.11)].

If $R \geq 182$, then
\[ \log R + 1.312 \leq G(R) \leq \log R + 1.354, \tag{12.15} \]
where the upper bound is valid for $R \geq 120$. This is true by (12.12) for $R \geq 4 \cdot 10^7$; we check (12.15) for $120 \leq R \leq 4 \cdot 10^7$ by a numerical computation.$^1$ Similarly, for $R \geq 200$,

\[ \frac{\log R + 1.661}{2} \leq G_2(R) \leq \frac{\log R + 1.698}{2} \tag{12.16} \]
by (12.12) for $R \geq 1.6 \cdot 10^8$, and by a numerical computation for $200 \leq R \leq 1.6 \cdot 10^8$.

Write $\rho = (\log Q_0)/(\log Q) \leq 1$. We obtain immediately from (12.15) and (12.16) that
\[ \frac{G(Q_0)}{G(Q)} \leq \frac{\log Q_0 + 1.354}{\log Q + 1.312}, \qquad \frac{G_2(Q_0)}{G_2(Q)} \leq \frac{\log Q_0 + 1.698}{\log Q + 1.661} \tag{12.17} \]
for $Q, Q_0 \geq 200$. What is hard is to approximate $G_q(Q_0)/G_q(Q)$ for $q$ large and $Q_0$ small.

Let us start by giving an easy bound, off from the truth by a factor of about $e^{\gamma}$. (Specialists will recognize this as a factor that appears often in first attempts at estimates based on either large or small sieves.) First, we need a simple explicit lemma.

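Lemma 12.2.1. Let $m \geq 1$, $q \geq 1$. Then
\[ \prod_{p|q \,\vee\, p \leq m} \frac{p}{p-1} \leq e^{\gamma} (\log(m + \log q) + 0.65771). \tag{12.18} \]

Proof. Let $P = \prod_{p \leq m \,\vee\, p|q} p$. Then, by [RS75, (5.1)],
\[ P \leq q \prod_{p \leq m} p = q e^{\sum_{p \leq m} \log p} \leq q e^{(1+\epsilon_0) m}, \]
where $\epsilon_0 = 0.001102$. Now, by [RS62, (3.42)],
\[ \frac{n}{\phi(n)} \leq e^{\gamma} \log\log n + \frac{2.50637}{\log\log n} \leq e^{\gamma} \log\log x + \frac{2.50637}{\log\log x} \]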

$^1$ Using D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas.


for all $x \geq n \geq 27$ (since, given $a, b > 0$, the function $t \mapsto at + b/t$ is increasing on $t$ for $t \geq \sqrt{b/a}$). Hence, if $q e^m \geq 27$,
\[ \begin{aligned}
\frac{P}{\phi(P)} &\leq e^{\gamma} \log((1+\epsilon_0) m + \log q) + \frac{2.50637}{\log(m + \log q)} \\
&\leq e^{\gamma} \left(\log(m + \log q) + \epsilon_0 + \frac{2.50637/e^{\gamma}}{\log(m + \log q)}\right).
\end{aligned} \]
Thus (12.18) holds when $m + \log q \geq 8.53$, since then $\epsilon_0 + (2.50637/e^{\gamma})/\log(m + \log q) \leq 0.65771$. We verify all choices of $m, q \geq 1$ with $m + \log q \leq 8.53$ computationally; the worst case is that of $m = 1$, $q = 6$, which gives the value $0.65771$ in (12.18).
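The finite check at the end of the proof can be sketched as follows. This Python fragment is an illustration under simplifying assumptions (integer $m$ only, ordinary floating point rather than interval arithmetic, and helper names of our own choosing); for each pair $(m, q)$ with $m + \log q \leq 8.53$ it computes the smallest constant that would make (12.18) true and reports the largest such constant.

```python
import math

EULER_GAMMA = 0.5772156649015329

def prime_divisors(n):
    """Set of primes dividing n (trial division)."""
    ps, p = set(), 2
    while p * p <= n:
        if n % p == 0:
            ps.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.add(n)
    return ps

def primes_upto(m):
    """Set of primes p <= m."""
    return {p for p in range(2, m + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))}

def needed_constant(m, q):
    """Smallest c with prod_{p|q or p<=m} p/(p-1) <= e^gamma (log(m + log q) + c)."""
    prod = 1.0
    for p in primes_upto(m) | prime_divisors(q):
        prod *= p / (p - 1)
    return prod / math.exp(EULER_GAMMA) - math.log(m + math.log(q))

def worst_case(limit=8.53):
    """Largest needed constant over integers m, q >= 1 with m + log q <= limit."""
    best = (-math.inf, None)
    m = 1
    while m <= limit:
        q = 1
        while m + math.log(q) <= limit:
            best = max(best, (needed_constant(m, q), (m, q)))
            q += 1
        m += 1
    return best

if __name__ == "__main__":
    print(worst_case())  # the maximum is attained at (m, q) = (1, 6)
```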

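Here is the promised easy bound.

Lemma 12.2.2. Let $Q_0 \geq 1$, $Q \geq 182 Q_0$. Let $q \leq Q_0$, $s \leq Q_0/q$, $q$ an integer. Then
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq \frac{e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312} \leq \frac{e^{\gamma} \log Q_0 + 1.172}{\log\frac{Q}{Q_0} + 1.312}. \]

Proof. Let $P = \prod_{p \leq Q_0/sq \,\vee\, p|q} p$. Then
\[ G_q(Q_0/sq)\, G_P(Q/Q_0) \leq G_q(Q/sq), \]
and so
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq \frac{1}{G_P(Q/Q_0)}. \tag{12.19} \]
Now the lower bound in (12.11) gives us that, for $d = P$, $R = Q/Q_0$,
\[ G_P(Q/Q_0) \geq \frac{\phi(P)}{P} G(Q/Q_0). \]
By Lem. 12.2.1,
\[ \frac{P}{\phi(P)} \leq e^{\gamma} \left(\log\left(\frac{Q_0}{sq} + \log q\right) + 0.658\right). \]
Hence, using (12.15), we get that
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq \frac{P/\phi(P)}{G(Q/Q_0)} \leq \frac{e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312}, \tag{12.20} \]
since $Q/Q_0 \geq 184$. Since
\[ \left(\frac{Q_0}{sq} + \log q\right)' = -\frac{Q_0}{sq^2} + \frac{1}{q} = \frac{1}{q}\left(1 - \frac{Q_0}{sq}\right) \leq 0, \]
the rightmost expression of (12.20) is maximal for $q = 1$.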


Lemma 12.2.2 will play a crucial role in reducing to a finite computation the problem of bounding $G_q(Q_0/sq)/G_q(Q/sq)$. As we will now see, we can use Lemma 12.2.2 to obtain a bound that is useful when $sq$ is large compared to $Q_0$, which is precisely the case in which asymptotic estimates such as (12.12) are relatively weak.

Lemma 12.2.3. Let $Q_0 \geq 1$, $Q \geq 200 Q_0$. Let $q \leq Q_0$, $s \leq Q_0/q$. Let $\rho = (\log Q_0)/\log Q \leq 2/3$. Then, for any $\sigma \geq 1.312\rho$,
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq \frac{\log Q_0 + \sigma}{\log Q + 1.312} \tag{12.21} \]
holds provided that
\[ \frac{Q_0}{sq} \leq c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}} - \log q, \]
where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.

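Proof. By Lemma 12.2.2, we see that (12.21) will hold provided that
\[ e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \leq \frac{\log\frac{Q}{Q_0} + 1.312}{\log Q + 1.312} \cdot (\log Q_0 + \sigma). \tag{12.22} \]
The expression on the right of (12.22) equals
\[ \begin{aligned}
\log Q_0 + \sigma - \frac{(\log Q_0 + \sigma) \log Q_0}{\log Q + 1.312} &= (1-\rho)(\log Q_0 + \sigma) + \frac{1.312 \rho (\log Q_0 + \sigma)}{\log Q + 1.312} \\
&\geq (1-\rho)(\log Q_0 + \sigma) + 1.312 \rho^2,
\end{aligned} \]
and so (12.22) will hold provided that
\[ e^{\gamma} \log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \leq (1-\rho)(\log Q_0) + (1-\rho)\sigma + 1.312\rho^2. \]
Taking derivatives, we see that
\[ (1-\rho)\sigma + 1.312\rho^2 - 1.172 \geq \left(1 - \frac{\sigma}{2.624}\right)\sigma + 1.312 \left(\frac{\sigma}{2.624}\right)^2 - 1.172 = \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172. \]
Hence it is enough that
\[ \frac{Q_0}{sq} + \log q \leq e^{e^{-\gamma}\left((1-\rho)\log Q_0 + \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172\right)} = c(\sigma) \cdot Q_0^{(1-\rho) e^{-\gamma}}, \]
where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.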
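To get a feeling for the sizes involved, one can evaluate $c(\sigma)$ and the threshold of Lemma 12.2.3 numerically. The Python sketch below is an illustration only (floating point, function names ours); it also confirms on a grid the minimization over $\rho$ behind the step "taking derivatives".

```python
import math

EULER_GAMMA = 0.5772156649015329

def c(sigma):
    """c(sigma) = exp(e^{-gamma} (sigma - sigma^2/5.248 - 1.172))."""
    return math.exp(math.exp(-EULER_GAMMA) * (sigma - sigma ** 2 / 5.248 - 1.172))

def threshold(Q0, Q, q, sigma):
    """Right side of the condition Q0/(sq) <= c(sigma) Q0^{(1-rho) e^{-gamma}} - log q."""
    rho = math.log(Q0) / math.log(Q)
    return c(sigma) * Q0 ** ((1 - rho) * math.exp(-EULER_GAMMA)) - math.log(q)

def min_over_rho(sigma, steps=100000):
    """Grid minimum over rho in [0, 1] of (1 - rho) sigma + 1.312 rho^2."""
    return min((1 - k / steps) * sigma + 1.312 * (k / steps) ** 2
               for k in range(steps + 1))

if __name__ == "__main__":
    print(c(1.36))                        # roughly 0.9118
    print(threshold(1e5, 1e10, 1, 1.36))  # roughly 23.1
    print(min_over_rho(1.36), 1.36 - 1.36 ** 2 / 5.248)
```

For instance, with $Q_0 = 10^5$, $Q = 10^{10}$, $q = 1$ and $\sigma = 1.36$, the lemma covers $Q_0/s$ up to roughly 23.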

We now pass to the main result of the section.


Proposition 12.2.4. Let $Q \geq 20000 Q_0$, $Q_0 \geq Q_{0,\min}$, where $Q_{0,\min} = 10^5$. Let $\rho = (\log Q_0)/\log Q$. Assume $\rho \leq 0.6$. Then, for every $1 \leq q \leq Q_0$ and every $s \in [1, Q_0/q]$,
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \leq \frac{\log Q_0 + c_+}{\log Q + c_E}, \tag{12.23} \]
where $c_E$ is as in (12.14) and $c_+ = 1.36$.

An ideal result would have $c_+$ instead of $c_E$, but this is not actually possible: error terms do exist, even if they are in reality smaller than the bound given in (12.12); this means that a bound such as (12.23) with $c_+$ instead of $c_E$ would be false for $q = 1$, $s = 1$.

There is nothing special about the assumptions
\[ Q \geq 20000 Q_0, \qquad Q_0 \geq 10^5, \qquad (\log Q_0)/(\log Q) \leq 0.6. \]
They can all be relaxed at the cost of an increase in $c_+$.

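Proof. Define $\mathrm{err}_{q,R}$ so that
\[ G_q(R) = \frac{\phi(q)}{q} \left(\log R + c_E + \sum_{p|q} \frac{\log p}{p}\right) + \mathrm{err}_{q,R}. \tag{12.24} \]
Then (12.23) will hold if
\[ \begin{aligned}
\log\frac{Q_0}{sq} + c_E + \sum_{p|q} \frac{\log p}{p} &+ \frac{q}{\phi(q)} \mathrm{err}_{q,\frac{Q_0}{sq}} \\
&\leq \left(\log\frac{Q}{sq} + c_E + \sum_{p|q} \frac{\log p}{p} + \frac{q}{\phi(q)} \mathrm{err}_{q,\frac{Q}{sq}}\right) \frac{\log Q_0 + c_+}{\log Q + c_E}.
\end{aligned} \tag{12.25} \]
This, in turn, happens if
\[ \begin{aligned}
\left(\log sq - \sum_{p|q} \frac{\log p}{p}\right) &\left(1 - \frac{\log Q_0 + c_+}{\log Q + c_E}\right) + c_+ - c_E \\
&\geq \frac{q}{\phi(q)} \left(\mathrm{err}_{q,\frac{Q_0}{sq}} - \frac{\log Q_0 + c_+}{\log Q + c_E} \mathrm{err}_{q,\frac{Q}{sq}}\right).
\end{aligned} \]
Define
\[ \omega(\rho) = \frac{\log Q_{0,\min} + c_+}{\frac{1}{\rho} \log Q_{0,\min} + c_E} = \rho + \frac{c_+ - \rho c_E}{\frac{1}{\rho} \log Q_{0,\min} + c_E}. \]
Then $\rho \leq (\log Q_0 + c_+)/(\log Q + c_E) \leq \omega(\rho)$ (because $c_+ \geq \rho c_E$). We conclude that (12.25) (and hence (12.23)) holds provided that
\[ (1 - \omega(\rho)) \left(\log sq - \sum_{p|q} \frac{\log p}{p}\right) + c_\Delta \geq \frac{q}{\phi(q)} \left(\mathrm{err}_{q,\frac{Q_0}{sq}} + \omega(\rho) \max\left(0, -\mathrm{err}_{q,\frac{Q}{sq}}\right)\right), \tag{12.26} \]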


where $c_\Delta = c_+ - c_E$. Note that $1 - \omega(\rho) > 0$.

First, let us give some easy bounds on the error terms; these bounds will yield upper bounds for $s$. By (12.8) and (12.11),
\[ \mathrm{err}_{q,R} \leq \frac{\phi(q)}{q} \left(\log q - \sum_{p|q} \frac{\log p}{p} + (1.4709 - c_E)\right) \]
for $R \geq 1$; by (12.15) and (12.11),
\[ \mathrm{err}_{q,R} \geq -\frac{\phi(q)}{q} \left(\sum_{p|q} \frac{\log p}{p} + (c_E - 1.312)\right) \]
for $R \geq 182$. Therefore, the right side of (12.26) is at most
\[ \log q - (1 - \omega(\rho)) \sum_{p|q} \frac{\log p}{p} + ((1.4709 - c_E) + \omega(\rho)(c_E - 1.312)), \]
and so (12.26) holds provided that
\[ (1 - \omega(\rho)) \log sq \geq \log q + (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta. \tag{12.27} \]
We will thus be able to assume from now on that (12.27) does not hold, or, what is the same, that
\[ sq < (c_{\rho,2}\, q)^{\frac{1}{1-\omega(\rho)}} \tag{12.28} \]
holds, where $c_{\rho,2} = \exp((1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta)$.

What values of $R = Q_0/sq$ must we consider for $q$ given? First, by (12.28), we can assume $R > Q_{0,\min}/(c_{\rho,2}\, q)^{1/(1-\omega(\rho))}$. We can also assume
\[ R > c(c_+) \cdot \max(Rq, Q_{0,\min})^{(1-\rho) e^{-\gamma}} - \log q \tag{12.29} \]
for $c(c_+)$ as in Lemma 12.2.3, since all smaller $R$ are covered by that Lemma. Clearly, (12.29) implies that
\[ R^{1-\tau} > c(c_+) \cdot q^{\tau} - \frac{\log q}{R^{\tau}} > c(c_+) q^{\tau} - \log q, \]
where $\tau = (1-\rho) e^{-\gamma}$, and also that $R > c(c_+) Q_{0,\min}^{(1-\rho) e^{-\gamma}} - \log q$. Iterating, we obtain that we can assume that $R > \varpi(q)$, where
\[ \varpi(q) = \max\left(\varpi_0(q),\; c(c_+) Q_{0,\min}^{\tau} - \log q,\; \frac{Q_{0,\min}}{(c_{\rho,2}\, q)^{\frac{1}{1-\omega(\rho)}}}\right) \tag{12.30} \]
and
\[ \varpi_0(q) = \begin{cases} \left(c(c_+) q^{\tau} - \dfrac{\log q}{(c(c_+) q^{\tau} - \log q)^{\frac{\tau}{1-\tau}}}\right)^{\frac{1}{1-\tau}} & \text{if } c(c_+) q^{\tau} > \log q + 1, \\ 0 & \text{otherwise.} \end{cases} \]


Looking at (12.26), we see that it will be enough to show that, for all $R$ satisfying $R > \varpi(q)$, we have
\[ \mathrm{err}_{q,R} + \omega(\rho) \max(0, -\mathrm{err}_{q,tR}) \leq \frac{\phi(q)}{q} \kappa(q) \tag{12.31} \]
for all $t \geq 20000$, where
\[ \kappa(q) = (1 - \omega(\rho)) \left(\log q - \sum_{p|q} \frac{\log p}{p}\right) + c_\Delta. \]
Ramaré's bound (12.12) implies that
\[ |\mathrm{err}_{q,R}| \leq 7.284 R^{-1/3} f_1(q), \tag{12.32} \]
with $f_1(q)$ as in (12.13), and so
\[ \mathrm{err}_{q,R} + \omega(\rho) \max(0, -\mathrm{err}_{q,tR}) \leq (1 + \beta_\rho) \cdot 7.284 R^{-1/3} f_1(q), \]
where $\beta_\rho = \omega(\rho)/20000^{1/3}$. This is enough when
\[ R \geq \lambda(q) = \left(\frac{q}{\phi(q)} \cdot \frac{7.284 (1 + \beta_\rho) f_1(q)}{\kappa(q)}\right)^3. \tag{12.33} \]

It remains to do two things. First, we have to compute how large $q$ has to be for $\varpi(q)$ to be guaranteed to be greater than $\lambda(q)$. (For such $q$, there is no checking to be done.) Then we check the inequality (12.31) for all smaller $q$, letting $R$ range through the integers in $[\varpi(q), \lambda(q)]$. We bound $\mathrm{err}_{q,tR}$ using (12.32), but we compute $\mathrm{err}_{q,R}$ directly.

How large must $q$ be for $\varpi(q) > \lambda(q)$ to hold? We claim that $\varpi(q) > \lambda(q)$ whenever $q \geq 2.2 \cdot 10^{10}$. Let us show this.

It is easy to see that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions of $p$ for $p \geq 3$; moreover, for both functions, the value at $p \geq 7$ is smaller than for $p = 2$. Hence, we have that, for $q < \prod_{p \leq p_0} p$, $p_0$ a prime,
\[ \kappa(q) \geq (1 - \omega(\rho)) \left(\log q - \sum_{p < p_0} \frac{\log p}{p}\right) + c_\Delta \tag{12.34} \]
and
\[ \lambda(q) \leq \left(\prod_{p < p_0} \frac{p}{p-1} \cdot \frac{7.284 (1 + \beta_\rho) \prod_{p < p_0} f_1(p)}{(1 - \omega(\rho)) \left(\log q - \sum_{p < p_0} \frac{\log p}{p}\right) + c_\Delta}\right)^3. \tag{12.35} \]
If we also assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$, we obtain
\[ \kappa(q) \geq (1 - \omega(\rho)) \left(\log q - \sum_{\substack{p < p_0 \\ p \neq 7}} \frac{\log p}{p}\right) + c_\Delta \tag{12.36} \]


and
\[ \lambda(q) \leq \left(\prod_{\substack{p < p_0 \\ p \neq 7}} \frac{p}{p-1} \cdot \frac{7.284 (1 + \beta_\rho) \prod_{p < p_0, p \neq 7} f_1(p)}{(1 - \omega(\rho)) \left(\log q - \sum_{p < p_0, p \neq 7} \frac{\log p}{p}\right) + c_\Delta}\right)^3 \tag{12.37} \]
for $q < \prod_{p \leq p_0} p$. (We are taking out 7 because it is the “least helpful” prime to omit among all primes from 2 to 7, again by the fact that $(p/(p-1)) \cdot f_1(p)$ and $p \to (\log p)/p$ are decreasing functions for $p \geq 3$.)

We know how to give upper bounds for the expression on the right of (12.35). The task is in essence simple: we can base our bounds on the classic explicit work in [RS62], except that we also have to optimize matters so that they are close to tight for $p_1 = 29$, $p_1 = 31$ and other low $p_1$.

By [RS62, (3.30)] and a numerical computation for $29 \leq p_1 \leq 43$,
\[ \prod_{p \leq p_1} \frac{p}{p-1} < 1.90516 \log p_1 \]
for $p_1 \geq 29$. Since $\omega(\rho)$ is increasing on $\rho$ and we are assuming $\rho \leq 0.6$, $Q_{0,\min} = 100000$,
\[ \omega(\rho) \leq 0.627312, \qquad \beta_\rho \leq 0.023111. \]
For $x > a$, where $a > 1$ is any constant, we obviously have
\[ \sum_{a < p \leq x} \log\left(1 + p^{-2/3}\right) \leq \sum_{a < p \leq x} (\log p)\, p^{-2/3} / \log a; \]
by Abel summation (133) and the estimate [RS62, (3.32)] for $\theta(x) = \sum_{p \leq x} \log p$,
\[ \begin{aligned}
\sum_{a < p \leq x} (\log p)\, p^{-2/3} &= (\theta(x) - \theta(a)) x^{-2/3} - \int_a^x (\theta(u) - \theta(a)) \left(-\frac{2}{3} u^{-5/3}\right) du \\
&\leq (1.01624 x - \theta(a)) x^{-2/3} + \frac{2}{3} \int_a^x (1.01624 u - \theta(a)) u^{-5/3}\, du \\
&= (1.01624 x - \theta(a)) x^{-2/3} + 2 \cdot 1.01624 (x^{1/3} - a^{1/3}) + \theta(a)(x^{-2/3} - a^{-2/3}) \\
&= 3 \cdot 1.01624 \cdot x^{1/3} - (2.03248 a^{1/3} + \theta(a) a^{-2/3}).
\end{aligned} \]
We conclude that $\sum_{10^4 < p \leq x} \log(1 + p^{-2/3}) \leq 0.33102 x^{1/3} - 7.06909$ for $x > 10^4$. Since $\sum_{p \leq 10^4} \log(1 + p^{-2/3}) \leq 10.09062$, this means that
\[ \sum_{p \leq x} \log(1 + p^{-2/3}) \leq \left(0.33102 + \frac{10.09062 - 7.06909}{10^{4/3}}\right) x^{1/3} \leq 0.47126 x^{1/3} \]
for $x > 10^4$; a direct computation for all $x$ prime between 29 and $10^4$ then confirms that
\[ \sum_{p \leq x} \log(1 + p^{-2/3}) \leq 0.74914 x^{1/3} \]


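for all $x \geq 29$. Thus,
\[ \prod_{p \leq x} f_1(p) \leq \frac{e^{\sum_{p \leq x} \log(1 + p^{-2/3})}}{\prod_{p \leq 29} \left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)} \leq \frac{e^{0.74914 x^{1/3}}}{6.62365} \]
for $x \geq 29$. Finally, by [RS62, (3.24)], $\sum_{p \leq p_1} (\log p)/p < \log p_1$.

We conclude that, for $q < \prod_{p \leq p_0} p$, $p_0$ a prime, and $p_1$ the prime immediately preceding $p_0$,
\[ \begin{aligned}
\lambda(q) &\leq \left(\frac{1.90516 \log p_1 \cdot 7.45235 \cdot \left(\frac{e^{0.74914 p_1^{1/3}}}{6.62365}\right)}{0.37268 (\log q - \log p_1) + 0.02741}\right)^3 \\
&\leq \frac{190.272 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.07354)^3}.
\end{aligned} \tag{12.38} \]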

It is clear from (12.30) that $\varpi(q)$ is increasing as soon as
\[ q \geq \max(Q_{0,\min},\; Q_{0,\min}^{1-\omega(\rho)}/c_{\rho,2}) \]
and $c(c_+) q^{\tau} > \log q + 1$, since then $\varpi_0(q)$ is increasing and $\varpi(q) = \varpi_0(q)$. Here it is useful to recall that $c_{\rho,2} \geq \exp(1.4709 - c_+)$, and to note that $c(c_+) q^{\tau} - (\log q + 1)$ is increasing for $q \geq 1/(\tau \cdot c(c_+))^{1/\tau}$; we see also that $1/(\tau \cdot c(c_+))^{1/\tau} \leq 1/((1 - 0.6) e^{-\gamma} c(c_+))^{1/((1-0.6) e^{-\gamma})}$ for $\rho \leq 0.6$. A quick computation for our value of $c_+$ makes us conclude that $q > 1.12\, Q_{0,\min} = 112000$ is a sufficient condition for $\varpi(q)$ to be equal to $\varpi_0(q)$ and for $\varpi_0(q)$ to be increasing.

Since (12.38) is decreasing on $q$ for $p_1$ fixed, and $\varpi_0(q)$ is decreasing on $\rho$ and increasing on $q$, we set $\rho = 0.6$ and check that then
\[ \varpi_0(2.2 \cdot 10^{10}) \geq 846.765, \]
whereas, by (12.38),
\[ \lambda(2.2 \cdot 10^{10}) \leq 838.227 < 846.765; \]
this is enough to ensure that $\lambda(q) < \varpi_0(q)$ for $2.2 \cdot 10^{10} \leq q < \prod_{p \leq 31} p$.
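The numerical values quoted in this step can be reproduced with a few lines of code. The following Python sketch is an illustration only (ordinary floating point, function names ours, $c_E$ truncated to ten decimals); it evaluates $\omega(\rho)$, $\beta_\rho$, $\varpi_0(q)$ and the right side of (12.38) for $\rho = 0.6$, $q = 2.2 \cdot 10^{10}$, $p_1 = 29$.

```python
import math

EULER_GAMMA = 0.5772156649015329
C_E = 1.3325822757   # c_E from (12.14), truncated
C_PLUS = 1.36        # c_+ from Proposition 12.2.4
Q0_MIN = 1e5         # Q_{0,min}

def omega(rho):
    return (math.log(Q0_MIN) + C_PLUS) / (math.log(Q0_MIN) / rho + C_E)

def beta(rho):
    return omega(rho) / 20000 ** (1 / 3)

def c(sigma):
    return math.exp(math.exp(-EULER_GAMMA) * (sigma - sigma ** 2 / 5.248 - 1.172))

def varpi0(q, rho):
    """varpi_0(q) as defined right after (12.30)."""
    tau = (1 - rho) * math.exp(-EULER_GAMMA)
    a = c(C_PLUS) * q ** tau
    if a <= math.log(q) + 1:
        return 0.0
    inner = a - math.log(q) / (a - math.log(q)) ** (tau / (1 - tau))
    return inner ** (1 / (1 - tau))

def lambda_bound(q, p1):
    """Right side of (12.38); p1 is the prime preceding p0, with q below prod_{p <= p0} p."""
    num = 190.272 * math.log(p1) ** 3 * math.exp(2.24742 * p1 ** (1 / 3))
    return num / (math.log(q) - math.log(p1) + 0.07354) ** 3

if __name__ == "__main__":
    print(omega(0.6), beta(0.6))
    print(varpi0(2.2e10, 0.6), lambda_bound(2.2e10, 29))
```

The two printed values in the last line agree with $846.765$ and $838.227$ up to the rounding of the constants in (12.38).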

Let us now give some rough bounds that will be enough to cover the case $q \geq \prod_{p \leq 31} p$. First, as we already discussed, $\varpi(q) = \varpi_0(q)$ and, since $c(c_+) q^{\tau} > \log q + 1$,
\[ \varpi_0(q) \geq (c(c_+) q^{\tau} - \log q)^{\frac{1}{1-\tau}} \geq (0.911 q^{0.224} - \log q)^{1.289} \geq q^{0.2797} \tag{12.39} \]
by $q \geq \prod_{p \leq 31} p$. We are in the range $\prod_{p \leq p_1} p \leq q \leq \prod_{p \leq p_0} p$, where $p_1 < p_0$ are two consecutive primes with $p_1 \geq 31$. By [RS62, (3.16)] and a computation for $31 \leq q < 200$, we know that $\log q \geq \sum_{p \leq p_1} \log p \geq 0.8009 p_1$. By (12.38) and (12.39), it follows that we just have to show that
\[ e^{0.224 t} > \frac{190.272 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3} \]


for $t \geq 31$. Now, $t \geq 31$ implies $0.8009 t - \log t + 0.07354 \geq 0.6924 t$, and so, taking logarithms, we see that we just have to verify
\[ 0.224 t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.3513 \tag{12.40} \]
for $t \geq 31$, and, since the left side is increasing and the right side is decreasing for $t \geq 31$, this is trivial to check.
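The check of (12.40) amounts to evaluating both sides at $t = 31$ and observing the monotonicity. A small Python sketch (an illustration only, with names of our own choosing) makes this concrete on a grid of integers.

```python
import math

def lhs(t):
    """Left side of (12.40)."""
    return 0.224 * t - 2.24742 * t ** (1 / 3)

def rhs(t):
    """Right side of (12.40)."""
    return 3 * math.log(math.log(t)) - 3 * math.log(t) + 6.3513

def holds_on_grid(t_min=31, t_max=10000):
    """True if (12.40) holds at every integer t in [t_min, t_max]."""
    return all(lhs(t) > rhs(t) for t in range(t_min, t_max + 1))

if __name__ == "__main__":
    print(lhs(31), rhs(31), holds_on_grid())
```

At $t = 31$ the left side is about $-0.116$ and the right side about $-0.249$; from there on the gap only widens.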

We conclude that $\varpi(q) > \lambda(q)$ whenever $q \geq 2.2 \cdot 10^{10}$.

It remains to see how we can relax this assumption if we assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$. We repeat the same analysis as before, using (12.36) and (12.37) instead of (12.34) and (12.35). For $p_1 \geq 29$,
\[ \prod_{\substack{p \leq p_1 \\ p \neq 7}} \frac{p}{p-1} < 1.633 \log p_1, \qquad \prod_{\substack{p \leq p_1 \\ p \neq 7}} f_1(p) \leq \frac{e^{0.74914 x^{1/3} - \log(1 + 7^{-2/3})}}{5.8478} \leq \frac{e^{0.74914 x^{1/3}}}{7.44586} \]
and $\sum_{p \leq p_1, p \neq 7} (\log p)/p < \log p_1 - (\log 7)/7$. So, for $q < \prod_{p \leq p_0, p \neq 7} p$, and $p_1 \geq 29$ the prime immediately preceding $p_0$,
\[ \begin{aligned}
\lambda(q) &\leq \left(\frac{1.633 \log p_1 \cdot 7.45235 \cdot \left(\frac{e^{0.74914 p_1^{1/3}}}{7.44586}\right)}{0.37268 \left(\log q - \log p_1 + \frac{\log 7}{7}\right) + 0.02741}\right)^3 \\
&\leq \frac{84.351 (\log p_1)^3 e^{2.24742 p_1^{1/3}}}{(\log q - \log p_1 + 0.35152)^3}.
\end{aligned} \]
Thus we obtain, just like before, that
\[ \varpi_0(3.3 \cdot 10^9) \geq 477.465, \qquad \lambda(3.3 \cdot 10^9) \leq 475.513 < 477.465. \]
We also check that $\varpi_0(q_0) \geq 916.322$ is greater than $\lambda(q_0) \leq 429.731$ for $q_0 = \prod_{p \leq 31, p \neq 7} p$. The analysis for $q \geq \prod_{p \leq 37, p \neq 7} p$ is also just like before: since $\log q \geq 0.8009 p_1 - \log 7$, we have to show that
\[ \frac{e^{0.224 t}}{7} > \frac{84.351 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009 t - \log t + 0.07354)^3} \]
for $t \geq 37$, and that, in turn, follows from
\[ 0.224 t - 2.24742 t^{1/3} > 3 \log\log t - 3 \log t + 6.74849, \]
which we check for $t \geq 37$ just as we checked (12.40).

We conclude that $\varpi(q) > \lambda(q)$ if $q \geq 3.3 \cdot 10^9$ and $210 \nmid q$.

Computation. Now, for $q < 3.3 \cdot 10^9$ (and also for $3.3 \cdot 10^9 \leq q < 2.2 \cdot 10^{10}$, $210 | q$) we need to check that the maximum $m_{q,R,1}$ of $\mathrm{err}_{q,R}$ over all $\varpi(q) \leq R < \lambda(q)$ satisfies (12.31). Note that there is a term $\mathrm{err}_{q,tR}$ in (12.31); we bound it using (12.32).

Since $\log R$ is increasing on $R$ and $G_q(R)$ depends only on $\lfloor R \rfloor$, we can tell from (12.24) that, since we are taking the maximum of $\mathrm{err}_{q,R}$, it is enough to check integer


        values of R We check all integers R in [$(q) λ(q)) for all q lt 33 middot 109 (and all33 middot 109 le q lt 22 middot 1010 210|q) by an explicit computation2

        Finally we have the trivial bound

        Gq(Q0sq)

        Gq(Qsq)le 1 (1241)

        which we shall use for Q0 close to Q

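        Corollary 12.2.5. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 10^5$, $\delta_0 \ge 1$ be such that $(20000 Q_0)^2 \le x/(2\delta_0)$; set $Q = \sqrt{x/(2\delta_0)}$.

        Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Let $\mathfrak{M}$ be as in (12.1). Then, if $Q_0 \le Q^{0.6}$,
        \[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
        where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p\ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$

        Let $\mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Then, if $2Q_0 \le (2Q)^{0.6}$,
        \[\int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \le \frac{\log 2Q_0 + c_+}{\log 2Q + c_E} \sum_n |a_n|^2. \tag{12.42}\]

        Here, of course, $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2$ (Plancherel). If $Q_0 > Q^{0.6}$, we will use the trivial bound
        \[\int_{\mathfrak{M}_{\delta_0,r}} |S(\alpha)|^2\, d\alpha \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \tag{12.43}\]

        Proof. Immediate from Prop. 12.1.1, Prop. 12.1.2 and Prop. 12.2.4.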

        Obviously, one can also give a statement derived from Prop. 12.1.1; the resulting bound is
        \[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
        where $\mathfrak{M}$ is as in (12.1).

        We also record the large-sieve form of the result.

        ² This is by far the heaviest computation in the present work, though it is still rather minor (about two weeks of computing on a single core of a fairly new (2010) desktop computer carrying out other tasks as well; this is next to nothing compared to the computations in [Plab], or even those in [HP13]). For the applications here, we could have assumed $\rho \le 8/15$, and that would have reduced computation time drastically; the lighter assumption $\rho \le 0.6$ was made with a view to general applicability in the future. As elsewhere in this section, numerical computations were carried out by the author in C; all floating-point operations used D. Platt's interval arithmetic package.

        Corollary 12.2.6. Let $N \ge 1$. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the integers $n \le N$. Let $Q_0 \ge 10^5$, $Q \ge 20000 Q_0$. Assume that $a_n = 0$ for every $n$ for which there is a $p \le Q$ dividing $n$.

        Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then, if $Q_0 \le Q^{0.6}$,
        \[\sum_{q\le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2 \le \frac{\log Q_0 + c_+}{\log Q + c_E} \cdot (N + Q^2) \sum_n |a_n|^2,\]
        where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p\ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$

        Proof. Proceed as Ramaré does in the proof of [Ram09, Thm. 5.2] (with $K_q = \{a \in \mathbb{Z}/q\mathbb{Z} : (a,q) = 1\}$ and $u_n = a_n$); in particular, apply [Ram09, Thm. 2.1]. The proof of [Ram09, Thm. 5.2] shows that
        \[\sum_{q\le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2 \le \max_{q\le Q_0} \frac{G_q(Q_0)}{G_q(Q)} \cdot \sum_{q\le Q_0} \sum_{\substack{a \bmod q \\ (a,q)=1}} |S(a/q)|^2.\]
        Now, instead of using the easy inequality $G_q(Q_0)/G_q(Q) \le G_1(Q_0)/G_1(Q/Q_0)$, use Prop. 12.2.4.

        It would seem desirable to prove a result such as Prop. 12.2.4 (or Cor. 12.2.5, or Cor. 12.2.6) without computations and with conditions that are as weak as possible. Since, as we said, we cannot make $c_+$ equal to $c_E$, and since $c_+$ does have to increase when the conditions are weakened (as is shown by computations; this is not an artifact of our method of proof), the right goal might be to show that the maximum of $G_q(Q_0/sq)/G_q(Q/sq)$ is reached when $s = q = 1$.

        However, this is also untrue without conditions. For instance, for $Q_0 = 2$ and $Q$ large, the value of $G_q(Q_0/q)/G_q(Q/q)$ at $q = 2$ is larger than at $q = 1$: by (12.12),
        \[\frac{G_2(Q_0/2)}{G_2(Q/2)} \sim \frac{1}{\frac{1}{2}\left(\log\frac{Q}{2} + c_E + \frac{\log 2}{2}\right)} = \frac{2}{\log Q + c_E - \frac{\log 2}{2}} > \frac{2}{\log Q + c_E} \sim \frac{G(Q_0)}{G(Q)}.\]

        Thus, at the very least, a lower bound on $Q_0$ is needed as a condition. This also dims the hopes somewhat for a combinatorial proof of $G_q(Q_0/q) G(Q) \le G_q(Q/q) G(Q_0)$; at any rate, while such a proof would be welcome, it could not be extremely straightforward, since there are terms in $G_q(Q_0/q) G(Q)$ that do not appear in $G_q(Q/q) G(Q_0)$.

        Chapter 13

        The integral over the minor arcs

        The time has come to bound the part of our triple-product integral (10.3) that comes from the minor arcs $\mathfrak{m} \subset \mathbb{R}/\mathbb{Z}$. We have an $\ell_\infty$ estimate (from Prop. 11.2.3, based on Theorem 3.1.1) and an $\ell_2$ estimate (from §12.2). Now we must put them together.

        There are two ways in which we must be careful. A trivial bound of the form $\ell_3^3 = \int |S(\alpha)|^3\, d\alpha \le \ell_2^2 \cdot \ell_\infty$ would introduce a fatal factor of $\log x$ coming from $\ell_2$. We avoid this by using the fact that we have $\ell_2$ estimates over $\mathfrak{M}_{\delta_0,Q_0}$ for varying $Q_0$.

        We must also remember to subtract the major-arc contribution from our estimate for $\mathfrak{M}_{\delta_0,Q_0}$; this is why we were careful to give a lower bound in Lem. 10.3.1, as opposed to just the upper bound (10.28).

        13.1 Putting together $\ell_2$ bounds over arcs and $\ell_\infty$ bounds

        Let us start with a simple lemma, essentially a way to obtain upper bounds by means of summation by parts.

        Lemma 13.1.1. Let $f, g : \{a, a+1, \ldots, b\} \to \mathbb{R}_0^+$, where $a, b \in \mathbb{Z}^+$. Assume that, for all $x \in [a,b]$,
        \[\sum_{a\le n\le x} f(n) \le F(x), \tag{13.1}\]
        where $F : [a,b] \to \mathbb{R}$ is continuous, piecewise differentiable and non-decreasing. Then
        \[\sum_{n=a}^b f(n)\cdot g(n) \le \left(\max_{n\ge a} g(n)\right)\cdot F(a) + \int_a^b \left(\max_{n\ge u} g(n)\right)\cdot F'(u)\, du.\]

        Proof. Let $S(n) = \sum_{m=a}^n f(m)$. Then, by partial summation,
        \[\sum_{n=a}^b f(n)\cdot g(n) \le S(b) g(b) + \sum_{n=a}^{b-1} S(n)(g(n) - g(n+1)). \tag{13.2}\]
        Let $h(x) = \max_{x\le n\le b} g(n)$. Then $h$ is non-increasing. Hence (13.1) and (13.2) imply that
        \[\begin{aligned}
        \sum_{n=a}^b f(n) g(n) &\le \sum_{n=a}^b f(n) h(n) \\
        &\le S(b) h(b) + \sum_{n=a}^{b-1} S(n)(h(n) - h(n+1)) \\
        &\le F(b) h(b) + \sum_{n=a}^{b-1} F(n)(h(n) - h(n+1)).
        \end{aligned}\]
        In general, for $\alpha_n \in \mathbb{C}$, $A(x) = \sum_{a\le n\le x} \alpha_n$ and $F$ continuous and piecewise differentiable on $[a,x]$,
        \[\sum_{a\le n\le x} \alpha_n F(n) = A(x) F(x) - \int_a^x A(u) F'(u)\, du \qquad \text{(Abel summation).} \tag{13.3}\]
        Applying this with $\alpha_n = h(n) - h(n+1)$ and $A(x) = \sum_{a\le n\le x} \alpha_n = h(a) - h(\lfloor x\rfloor + 1)$, we obtain
        \[\begin{aligned}
        \sum_{n=a}^{b-1} F(n)(h(n) - h(n+1)) &= (h(a) - h(b)) F(b-1) - \int_a^{b-1} (h(a) - h(\lfloor u\rfloor + 1)) F'(u)\, du \\
        &= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(\lfloor u\rfloor + 1) F'(u)\, du \\
        &= h(a) F(a) - h(b) F(b-1) + \int_a^{b-1} h(u) F'(u)\, du \\
        &= h(a) F(a) - h(b) F(b) + \int_a^{b} h(u) F'(u)\, du,
        \end{aligned}\]
        since $h(\lfloor u\rfloor + 1) = h(u)$ for $u \notin \mathbb{Z}$ (and, for the last step, $h(u) = h(b)$ on $(b-1, b]$). Hence
        \[\sum_{n=a}^b f(n) g(n) \le h(a) F(a) + \int_a^b h(u) F'(u)\, du.\]
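As a numerical sanity check of Lemma 13.1.1 (an illustration only, not part of the proof), the following Python sketch evaluates both sides of the inequality. The data are hypothetical choices made for the example: $f(n) = 1/n$, $F(x) = 1/a + \log(x/a)$ (which dominates the partial sums of $f$), and a non-monotone $g$; the helper name `lemma_bound` is likewise an assumption. Since $u \mapsto \max_{n\ge u} g(n)$ is a step function, the integral on the right reduces to a finite sum.

```python
import math

def lemma_bound(f, g, F, a, b):
    """Return (lhs, rhs) of the inequality in Lemma 13.1.1 for integers a <= b."""
    # h[n] = max of g(m) over n <= m <= b, built from the right end.
    h = {}
    running = 0.0  # g takes non-negative values, so 0 is a safe starting value
    for n in range(b, a - 1, -1):
        running = max(running, g(n))
        h[n] = running
    lhs = math.fsum(f(n) * g(n) for n in range(a, b + 1))
    # For u in (n-1, n], max_{m >= u} g(m) = h[n]; hence the integral of
    # h(u) F'(u) over [a, b] equals the sum of h[n] * (F(n) - F(n-1)).
    integral = math.fsum(h[n] * (F(n) - F(n - 1)) for n in range(a + 1, b + 1))
    return lhs, h[a] * F(a) + integral

a, b = 10, 5000
f = lambda n: 1.0 / n                              # hypothetical f
F = lambda x: 1.0 / a + math.log(x / a)            # dominates the partial sums of f
g = lambda n: (2.0 + math.sin(n)) / math.sqrt(n)   # hypothetical, non-monotone g

lhs, rhs = lemma_bound(f, g, F, a, b)
print(lhs, rhs)
```

The gap between the two printed values reflects both the slack in (13.1) and the replacement of $g$ by its running maximum $h$.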

        We will now see our main application of Lemma 13.1.1. We have to bound an integral of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha$, where $\mathfrak{M}_{\delta_0,r}$ is a union of arcs defined as in (10.5). Our inputs are (a) a bound on integrals of the form $\int_{\mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha$, (b) a bound on $|S_2(\alpha)|$ for $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}$. The input of type (a) is what we derived in §12.1 and §12.2; the input of type (b) is a minor-arcs bound, and as such was the main subject of Part I.

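        Proposition 13.1.2. Let $S_1(\alpha) = \sum_n a_n e(\alpha n)$, $a_n \in \mathbb{C}$, $\{a_n\}$ in $L_1$. Let $S_2 : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ be continuous. Define $\mathfrak{M}_{\delta_0,r}$ as in (10.5).

        Let $r_0$ be a positive integer not greater than $r_1$. Let $H : [r_0, r_1] \to \mathbb{R}^+$ be a continuous, piecewise differentiable, non-decreasing function such that
        \[\frac{1}{\sum |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1}} |S_1(\alpha)|^2\, d\alpha \le H(r) \tag{13.4}\]
        for some $\delta_0 \le x/(2 r_1^2)$ and all $r \in [r_0, r_1]$. Assume, moreover, that $H(r_1) = 1$. Let $g : [r_0, r_1] \to \mathbb{R}^+$ be a non-increasing function such that
        \[\max_{\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r}} |S_2(\alpha)| \le g(r) \tag{13.5}\]
        for all $r \in [r_0, r_1]$ and $\delta_0$ as above. Then
        \[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \le g(r_0)\cdot(H(r_0) - I_0) + \int_{r_0}^{r_1} g(r) H'(r)\, dr, \tag{13.6}\]
        where
        \[I_0 = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha. \tag{13.7}\]

        The condition $\delta_0 \le x/(2 r_1^2)$ is there just to ensure that the arcs in the definition of $\mathfrak{M}_{\delta_0,r}$ do not overlap for $r \le r_1$.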

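        Proof. For $r_0 \le r < r_1$, let
        \[f(r) = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,r+1} \setminus \mathfrak{M}_{\delta_0,r}} |S_1(\alpha)|^2\, d\alpha.\]
        Let
        \[f(r_1) = \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_1}} |S_1(\alpha)|^2\, d\alpha.\]
        Then, by (13.5),
        \[\frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2 |S_2(\alpha)|\, d\alpha \le \sum_{r=r_0}^{r_1} f(r) g(r).\]
        By (13.4),
        \[\sum_{r_0\le r\le x} f(r) = \frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1} \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha = \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathfrak{M}_{\delta_0,x+1}} |S_1(\alpha)|^2\, d\alpha\right) - I_0 \le H(x) - I_0 \tag{13.8}\]
        for $x \in [r_0, r_1)$. Moreover,
        \[\sum_{r_0\le r\le r_1} f(r) = \frac{1}{\sum_n |a_n|^2} \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{\delta_0,r_0}} |S_1(\alpha)|^2\, d\alpha = \left(\frac{1}{\sum_n |a_n|^2} \int_{\mathbb{R}/\mathbb{Z}} |S_1(\alpha)|^2\, d\alpha\right) - I_0 = 1 - I_0 = H(r_1) - I_0.\]
        We let $F(x) = H(x) - I_0$ and apply Lemma 13.1.1 with $a = r_0$, $b = r_1$. We obtain that
        \[\begin{aligned}
        \sum_{r=r_0}^{r_1} f(r) g(r) &\le \left(\max_{r\ge r_0} g(r)\right) F(r_0) + \int_{r_0}^{r_1} \left(\max_{r\ge u} g(r)\right) F'(u)\, du \\
        &\le g(r_0)(H(r_0) - I_0) + \int_{r_0}^{r_1} g(u) H'(u)\, du.
        \end{aligned}\]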

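        13.2 The minor-arc total

        We now apply Prop. 13.1.2. Inevitably, the main statement involves some integrals that will have to be evaluated at the end of the section.

        Theorem 13.2.1. Let $x \ge 10^{25}\cdot\kappa$, where $\kappa \ge 1$. Let
        \[S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x). \tag{13.9}\]
        Let $\eta_*(t) = (\eta_2 *_M \phi)(\kappa t)$, where $\eta_2$ is as in (11.10) and $\phi : [0,\infty) \to [0,\infty)$ is continuous and in $\ell_1$. Let $\eta_+ : [0,\infty) \to [0,\infty)$ be a bounded, piecewise differentiable function with $\lim_{t\to\infty} \eta_+(t) = 0$. Let $\mathfrak{M}_{\delta_0,r}$ be as in (10.5) with $\delta_0 = 8$. Let $10^5 \le r_0 < r_1$, where $r_1 = (3/8)(x/\kappa)^{4/15}$. Let $g(r) = g_{x/\kappa,\phi}(r)$, where
        \[g_{y,\phi}(r) = \frac{(R_{y,K,\phi,2r} \log 2r + 0.5)\sqrt{\digamma(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36 K^{1/6} y^{-1/6}, \tag{13.10}\]
        just as in (11.19), and $K = \log(x/\kappa)/2$. Here $R_{y,K,\phi,t}$ is as in (11.19), and $L_t$ is as in (11.13).

        Denote
        \[Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{\eta_+}(\alpha, x)|^2\, d\alpha.\]
        Then
        \[Z_{r_0} \le \left(\sqrt{\frac{|\phi|_1 x}{\kappa} (M + T)} + \sqrt{S_{\eta_*}(0, x)\cdot E}\right)^2,\]
        where
        \[\begin{aligned}
        S &= \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x), \\
        T &= C_{\phi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right) \cdot (S - (\sqrt{J} - \sqrt{E})^2), \\
        J &= \int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha, \\
        E &= \left((C_{\eta_+,0} + C_{\eta_+,2}) \log x + (2 C_{\eta_+,0} + C_{\eta_+,1})\right) \cdot x^{1/2},
        \end{aligned} \tag{13.11}\]
        \[\begin{aligned}
        C_{\eta_+,0} &= 0.7131 \int_0^\infty \frac{1}{\sqrt{t}} \left(\sup_{r\ge t} \eta_+(r)\right)^2 dt, \\
        C_{\eta_+,1} &= 0.7131 \int_1^\infty \frac{\log t}{\sqrt{t}} \left(\sup_{r\ge t} \eta_+(r)\right)^2 dt, \\
        C_{\eta_+,2} &= 0.51942\, |\eta_+|_\infty^2, \\
        C_{\phi,3}(K) &= \frac{1.04488}{|\phi|_1} \int_0^{1/K} |\phi(w)|\, dw
        \end{aligned} \tag{13.12}\]
        and
        \[\begin{aligned}
        M = {}& g(r_0) \cdot \left(\frac{\log(r_0+1) + c_+}{\log\sqrt{x} + c_-} \cdot S - (\sqrt{J} - \sqrt{E})^2\right) \\
        &+ \left(\frac{2}{\log x + 2c_-} \int_{r_0}^{r_1} \frac{g(r)}{r}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}\right) g(r_1)\right) \cdot S,
        \end{aligned} \tag{13.13}\]
        where $c_+ = 2.0532$ and $c_- = 0.6394$.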

        Proof. Let $y = x/\kappa$. Let $Q = (3/4) y^{2/3}$, as in Thm. 3.1.1 (applied with $y$ instead of $x$). Let $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$, where $r_0 \le r \le y^{1/3}/6$ and $y$ is used instead of $x$ to define $\mathfrak{M}_{8,r}$ (see (10.5)). There exists an approximation $2\alpha = a/q + \delta/y$ with $q \le Q$, $|\delta|/y \le 1/qQ$. Thus, $\alpha = a'/q' + \delta/2y$, where either $a'/q' = a/2q$ or $a'/q' = (a+q)/2q$ holds. (In particular, if $q'$ is odd, then $q' = q$; if $q'$ is even, then $q' = 2q$.)

        There are three cases.

        1. $q \le r$. Then either (a) $q'$ is odd and $q' \le r$ or (b) $q'$ is even and $q' \le 2r$. Since $\alpha$ is not in $\mathfrak{M}_{8,r}$, then, by definition (10.5), $|\delta|/(2y) \ge \delta_0 r/(2qy)$, and so $|\delta| \ge \delta_0 r/q = 8r/q$. In particular, $|\delta| \ge 8$. Thus, by Prop. 11.2.3,
        \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \phi}(\alpha, y)| \le g_{y,\phi}\left(\frac{|\delta|}{8} q\right) \cdot |\phi|_1 y \le g_{y,\phi}(r) \cdot |\phi|_1 y, \tag{13.14}\]
        where we use the fact that $g(r)$ is a non-increasing function (Lemma 11.2.4).

        2. $r < q \le y^{1/3}/6$. Then, by Prop. 11.2.3 and Lemma 11.2.4,
        \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \phi}(\alpha, y)| \le g_{y,\phi}\left(\max\left(\frac{|\delta|}{8}, 1\right) q\right) \cdot |\phi|_1 y \le g_{y,\phi}(r) \cdot |\phi|_1 y. \tag{13.15}\]

        3. $q > y^{1/3}/6$. Again by Prop. 11.2.3,
        \[|S_{\eta_*}(\alpha, x)| = |S_{\eta_2 *_M \phi}(\alpha, y)| \le \left(h\left(\frac{y}{K}\right) + C_{\phi,3}(K)\right) |\phi|_1 y, \tag{13.16}\]
        where $h(x)$ is as in (11.15). (Of course, $C_{\phi,3}(K)$, as in (13.12), is equal to $C_{\phi,0,K}/|\phi|_1$, where $C_{\phi,0,K}$ is as in (11.21).) We set $K = (\log y)/2$. Since $y = x/\kappa \ge 10^{25}$, it follows that $y/K = 2y/\log y > 3.47\cdot 10^{23} > 2.16\cdot 10^{20}$.

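        Let
        \[r_1 = \frac{3}{8} y^{4/15}, \qquad g(r) = \begin{cases} g_{y,\phi}(r) & \text{if } r \le r_1, \\ g_{y,\phi}(r_1) & \text{if } r > r_1. \end{cases}\]
        By Lemma 11.2.4, for $r \ge 670$, $g(r)$ is a non-increasing function and $g(r) \ge g_{y,\phi}(r)$. Moreover, by Lemma 11.2.5, $g_{y,\phi}(r_1) \ge h(2y/\log y)$, where $h$ is as in (11.15), and so $g(r) \ge h(2y/\log y)$ for all $r \ge r_0 \ge 670$. Thus, we have shown that
        \[|S_{\eta_*}(\alpha, x)| \le \left(g(r) + C_{\phi,3}\left(\frac{\log y}{2}\right)\right) \cdot |\phi|_1 y \tag{13.17}\]
        for all $\alpha \in (\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r}$.

        We first need to undertake the fairly dull task of getting non-prime or small $n$ out of the sum defining $S_{\eta_+}(\alpha, x)$. Write
        \[\begin{aligned}
        S_{1,\eta_+}(\alpha, x) &= \sum_{p > \sqrt{x}} (\log p) e(\alpha p) \eta_+(p/x), \\
        S_{2,\eta_+}(\alpha, x) &= \sum_{\substack{n \text{ non-prime} \\ n > \sqrt{x}}} \Lambda(n) e(\alpha n) \eta_+(n/x) + \sum_{n \le \sqrt{x}} \Lambda(n) e(\alpha n) \eta_+(n/x).
        \end{aligned}\]
        By the triangle inequality (with weights $|S_{\eta_+}(\alpha, x)|$),
        \[\sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{\eta_+}(\alpha, x)|^2\, d\alpha} \le \sum_{j=1}^2 \sqrt{\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{j,\eta_+}(\alpha, x)|^2\, d\alpha}.\]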

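        Clearly,
        \[\begin{aligned}
        \int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha &\le \max_{\alpha \in \mathbb{R}/\mathbb{Z}} |S_{\eta_*}(\alpha, x)| \cdot \int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \\
        &\le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot \left(\sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 + \sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2\right).
        \end{aligned}\]
        Let $\overline{\eta_+}(z) = \sup_{t\ge z} \eta_+(t)$. Since $\eta_+(t)$ tends to $0$ as $t \to \infty$, so does $\overline{\eta_+}$. By [RS62, Thm. 13], partial summation and integration by parts,
        \[\begin{aligned}
        \sum_{n \text{ non-prime}} \Lambda(n)^2 \eta_+(n/x)^2 &\le \sum_{n \text{ non-prime}} \Lambda(n)^2\, \overline{\eta_+}(n/x)^2 \\
        &\le -\int_1^\infty \left(\sum_{\substack{n \le t \\ n \text{ non-prime}}} \Lambda(n)^2\right) \left(\overline{\eta_+}^2(t/x)\right)' dt \\
        &\le -\int_1^\infty (\log t) \cdot 1.4262 \sqrt{t} \left(\overline{\eta_+}^2(t/x)\right)' dt \\
        &\le 0.7131 \int_1^\infty \frac{\log e^2 t}{\sqrt{t}} \cdot \overline{\eta_+}^2\left(\frac{t}{x}\right) dt \\
        &= \left(0.7131 \int_{1/x}^\infty \frac{2 + \log tx}{\sqrt{t}}\, \overline{\eta_+}^2(t)\, dt\right) \sqrt{x},
        \end{aligned}\]
        while, by [RS62, Thm. 12],
        \[\sum_{n \le \sqrt{x}} \Lambda(n)^2 \eta_+(n/x)^2 \le \frac{1}{2} |\eta_+|_\infty^2 (\log x) \sum_{n \le \sqrt{x}} \Lambda(n) \le 0.51942\, |\eta_+|_\infty^2 \cdot \sqrt{x} \log x.\]
        This shows that
        \[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha \le \sum_{n=1}^\infty \Lambda(n) \eta_*(n/x) \cdot E = S_{\eta_*}(0, x) \cdot E,\]
        where $E$ is as in (13.11).

        It remains to bound
        \[\int_{(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)|\, |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.18}\]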

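        We wish to apply Prop. 13.1.2. Corollary 12.2.5 gives us an input of type (13.4); we have just derived a bound (13.17) that provides an input of type (13.5). More precisely, by (12.42), (13.4) holds with
        \[H(r) = \begin{cases} \dfrac{\log(r+1) + c_+}{\log\sqrt{x} + c_-} & \text{if } r < r_1, \\ 1 & \text{if } r \ge r_1, \end{cases}\]
        where $c_+ = 2.0532 > \log 2 + 1.36$ and $c_- = 0.6394 < \log(1/\sqrt{2\cdot 8}) + \log 2 + 1.3325822$. (We can apply Corollary 12.2.5 because $2(r_1 + 1) = (3/4) x^{4/15} + 2 \le (2\sqrt{x/16})^{0.6}$ for $x \ge 10^{25}$ (or even for $x \ge 100000$).) Since $r_1 = (3/8) y^{4/15}$ and $x \ge 10^{25}\cdot\kappa$,
        \[\begin{aligned}
        \lim_{r\to r_1^+} H(r) - \lim_{r\to r_1^-} H(r) &= 1 - \frac{\log((3/8)(x/\kappa)^{4/15} + 1) + c_+}{\log\sqrt{x} + c_-} \\
        &\le 1 - \left(\frac{4/15}{1/2} + \frac{\log\frac{3}{8} + c_+ - \frac{4}{15}\log\kappa - \frac{8}{15} c_-}{\log\sqrt{x} + c_-}\right) \le \frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}.
        \end{aligned}\]
        We also have (13.5) with
        \[\left(g(r) + C_{\phi,3}\left(\frac{\log y}{2}\right)\right) \cdot |\phi|_1 y \tag{13.19}\]
        instead of $g(r)$ (by (13.17)). Here (13.19) is a non-increasing function of $r$ because $g(r)$ is, as we already checked. Hence, Prop. 13.1.2 gives us that (13.18) is at most
        \[\begin{aligned}
        & g(r_0) \cdot (H(r_0) - I_0) + (1 - I_0) \cdot C_{\phi,3}\left(\frac{\log y}{2}\right) \\
        &\quad + \frac{1}{\log\sqrt{x} + c_-} \int_{r_0}^{r_1} \frac{g(r)}{r+1}\, dr + \left(\frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-}\right) g(r_1)
        \end{aligned} \tag{13.20}\]
        times $|\phi|_1 y \cdot \sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)$, where
        \[I_0 = \frac{1}{\sum_{p > \sqrt{x}} (\log p)^2 \eta_+^2(p/x)} \int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha. \tag{13.21}\]
        By the triangle inequality,
        \[\begin{aligned}
        \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{1,\eta_+}(\alpha, x)|^2\, d\alpha} &= \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x) - S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
        &\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha} \\
        &\ge \sqrt{\int_{\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha} - \sqrt{\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha}.
        \end{aligned}\]
        As we already showed,
        \[\int_{\mathbb{R}/\mathbb{Z}} |S_{2,\eta_+}(\alpha, x)|^2\, d\alpha = \sum_{\substack{n \text{ non-prime} \\ \text{or } n \le \sqrt{x}}} \Lambda(n)^2 \eta_+(n/x)^2 \le E.\]
        Thus,
        \[I_0 \cdot S \ge (\sqrt{J} - \sqrt{E})^2,\]
        and so we are done.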

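        We now should estimate the integral $\int_{r_0}^{r_1} (g(r)/r)\, dr$ in (13.13). It is easy to see that
        \[\begin{aligned}
        \int_{r_0}^\infty \frac{1}{r^{3/2}}\, dr &= \frac{2}{r_0^{1/2}}, &
        \int_{r_0}^\infty \frac{\log r}{r^2}\, dr &= \frac{\log e r_0}{r_0}, &
        \int_{r_0}^\infty \frac{1}{r^2}\, dr &= \frac{1}{r_0}, \\
        \int_{r_0}^{r_1} \frac{1}{r}\, dr &= \log\frac{r_1}{r_0}, &
        \int_{r_0}^\infty \frac{\log r}{r^{3/2}}\, dr &= \frac{2 \log e^2 r_0}{\sqrt{r_0}}, &
        \int_{r_0}^\infty \frac{\log 2r}{r^{3/2}}\, dr &= \frac{2 \log 2e^2 r_0}{\sqrt{r_0}}, \\
        \int_{r_0}^\infty \frac{(\log 2r)^2}{r^{3/2}}\, dr &= \frac{2 P_2(\log 2r_0)}{\sqrt{r_0}}, &
        \int_{r_0}^\infty \frac{(\log 2r)^3}{r^{3/2}}\, dr &= \frac{2 P_3(\log 2r_0)}{r_0^{1/2}}, &&
        \end{aligned} \tag{13.22}\]
        where
        \[P_2(t) = t^2 + 4t + 8, \qquad P_3(t) = t^3 + 6t^2 + 24t + 48. \tag{13.23}\]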
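The last two identities in (13.22) can be checked numerically. The sketch below is an illustration only: it uses the substitution $t = \log 2r$, which turns $\int_{r_0}^\infty (\log 2r)^k r^{-3/2}\, dr$ into $\sqrt{2}\int_{\log 2r_0}^\infty t^k e^{-t/2}\, dt$, and approximates the latter by Simpson's rule on a truncated range. The function name `tail_integral`, the truncation width and the step count are assumptions made for the example; $r_0 = 150000$ is the value of $r$ fixed later in the text.

```python
import math

def P2(t):
    return t * t + 4 * t + 8

def P3(t):
    return t ** 3 + 6 * t * t + 24 * t + 48

def tail_integral(k, r0, width=200.0, steps=200_000):
    """Simpson approximation of the integral of (log 2r)^k / r^(3/2) over [r0, infinity).

    With t = log 2r this equals sqrt(2) * int_{log 2r0}^infty t^k e^{-t/2} dt;
    the part beyond log(2 r0) + width is negligible for the values used here.
    """
    t0 = math.log(2 * r0)
    h = width / steps
    integrand = lambda t: t ** k * math.exp(-t / 2)
    total = integrand(t0) + integrand(t0 + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(t0 + i * h)
    return math.sqrt(2) * total * h / 3

r0 = 150000.0
numeric2, closed2 = tail_integral(2, r0), 2 * P2(math.log(2 * r0)) / math.sqrt(r0)
numeric3, closed3 = tail_integral(3, r0), 2 * P3(math.log(2 * r0)) / math.sqrt(r0)
print(numeric2, closed2)
print(numeric3, closed3)
```

Each printed pair should agree to many digits; a floating-point check of this kind is of course no substitute for the exact integration by parts behind (13.22).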

        We also have
        \[\int_{r_0}^\infty \frac{dr}{r^2 \log r} = E_1(\log r_0), \tag{13.24}\]
        where $E_1$ is the exponential integral
        \[E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt.\]

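        We must also estimate the integrals
        \[\int_{r_0}^{r_1} \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r) \log r}{r^2}\, dr, \qquad \int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr. \tag{13.25}\]
        Clearly, $\digamma(r) - e^\gamma \log\log r = 2.50637/\log\log r$ is decreasing in $r$. Hence, for $r \ge 10^5$,
        \[\digamma(r) \le e^\gamma \log\log r + c_\gamma,\]
        where $c_\gamma = 1.025742$. Let $F(t) = e^\gamma \log t + c_\gamma$. Then $F''(t) = -e^\gamma/t^2 < 0$. Hence
        \[\frac{d^2 \sqrt{F(t)}}{dt^2} = \frac{F''(t)}{2\sqrt{F(t)}} - \frac{(F'(t))^2}{4 (F(t))^{3/2}} < 0\]
        for all $t > 0$. In other words, $\sqrt{F(t)}$ is convex-down, and so we can bound $\sqrt{F(t)}$ from above by $\sqrt{F(t_0)} + (\sqrt{F})'(t_0) \cdot (t - t_0)$, for any $t \ge t_0 > 0$. Hence, for $r \ge r_0 \ge 10^5$,
        \[\sqrt{\digamma(r)} \le \sqrt{F(\log r)} \le \sqrt{F(\log r_0)} + \left.\frac{d\sqrt{F(t)}}{dt}\right|_{t = \log r_0} \cdot \log\frac{r}{r_0} = \sqrt{F(\log r_0)} + \frac{e^\gamma}{\sqrt{F(\log r_0)}} \cdot \frac{\log\frac{r}{r_0}}{2 \log r_0}.\]
        Thus, by (13.22),
        \[\begin{aligned}
        \int_{r_0}^\infty \frac{\sqrt{\digamma(r)}}{r^{3/2}}\, dr &\le \sqrt{F(\log r_0)} \left(2 - \frac{e^\gamma}{F(\log r_0)}\right) \frac{1}{\sqrt{r_0}} + \frac{e^\gamma}{\sqrt{F(\log r_0)} \log r_0} \cdot \frac{\log e^2 r_0}{\sqrt{r_0}} \\
        &= \frac{2\sqrt{F(\log r_0)}}{\sqrt{r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right).
        \end{aligned} \tag{13.26}\]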

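        The other integrals in (13.25) are easier. Just as in (13.26), we extend the range of integration to $[r_0, \infty]$. Using (13.22) and (13.24), we obtain
        \[\begin{aligned}
        \int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\le \int_{r_0}^\infty \frac{F(\log r)}{r^2}\, dr = e^\gamma \left(\frac{\log\log r_0}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma}{r_0}, \\
        \int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\le e^\gamma \left(\frac{(1 + \log r_0) \log\log r_0 + 1}{r_0} + E_1(\log r_0)\right) + \frac{c_\gamma \log e r_0}{r_0}.
        \end{aligned}\]
        By [OLBC10, (6.8.2)],
        \[\frac{1}{r (\log r + 1)} \le E_1(\log r) \le \frac{1}{r \log r}.\]
        (The second inequality is obvious.) Hence
        \[\begin{aligned}
        \int_{r_0}^\infty \frac{\digamma(r)}{r^2}\, dr &\le \frac{e^\gamma (\log\log r_0 + 1/\log r_0) + c_\gamma}{r_0}, \\
        \int_{r_0}^\infty \frac{\digamma(r) \log r}{r^2}\, dr &\le \frac{e^\gamma \left(\log\log r_0 + \frac{1}{\log r_0}\right) + c_\gamma}{r_0} \cdot \log e r_0.
        \end{aligned}\]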
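The two-sided bound on $E_1(\log r)$ quoted from [OLBC10, (6.8.2)] can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: it uses the identity $r E_1(\log r) = \int_0^\infty e^{-s}/(\log r + s)\, ds$ (obtained from the definition of $E_1$ by the shift $t = \log r + s$) and Simpson's rule on a truncated range; the function name, the truncation at $s = 60$ and the sample values of $r$ are choices made for the example.

```python
import math

def r_times_E1_of_log(r, width=60.0, steps=60_000):
    """Approximate r * E1(log r) = int_0^infty e^{-s} / (log r + s) ds by Simpson's rule."""
    x = math.log(r)
    h = width / steps
    integrand = lambda s: math.exp(-s) / (x + s)
    total = integrand(0.0) + integrand(width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

checks = []
for r in (1e5, 1.5e5, 1e7):
    value = r_times_E1_of_log(r)
    lower, upper = 1 / (math.log(r) + 1), 1 / math.log(r)
    checks.append((lower, value, upper))
    print(r, lower, value, upper)
```

In each row the middle value should lie strictly between the outer two, which is the inequality above multiplied through by $r$.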

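        Finally,
        \[\int_{r_0}^\infty \frac{\digamma(r)}{r^{3/2}}\, dr \le e^\gamma \left(\frac{2 \log\log r_0}{\sqrt{r_0}} + 2 E_1\left(\frac{\log r_0}{2}\right)\right) + \frac{2 c_\gamma}{\sqrt{r_0}} \le \frac{2}{\sqrt{r_0}} \left(F(\log r_0) + \frac{2 e^\gamma}{\log r_0}\right). \tag{13.27}\]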

        It is time to estimate
        \[\int_{r_0}^{r_1} \frac{R_{z,2r} \log 2r\, \sqrt{\digamma(r)}}{r^{3/2}}\, dr, \tag{13.28}\]
        where $z = y$ or $z = y/((\log y)/2)$ (and $y = x/\kappa$, as before), and where $R_{z,t}$ is as defined in (11.13). By Cauchy-Schwarz, (13.28) is at most
        \[\sqrt{\int_{r_0}^{r_1} \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr} \cdot \sqrt{\int_{r_0}^{r_1} \frac{\digamma(r)}{r^{3/2}}\, dr}. \tag{13.29}\]
        We have already bounded the second integral. Let us look at the first one. We can write $R_{z,t} = 0.27125\, R^\circ_{z,t} + 0.41415$, where
        \[R^\circ_{z,t} = \log\left(1 + \frac{\log 4t}{2 \log\frac{9 z^{1/3}}{2.004 t}}\right). \tag{13.30}\]
        Clearly,
        \[R^\circ_{z,e^t/4} = \log\left(1 + \frac{t/2}{\log\frac{36 z^{1/3}}{2.004} - t}\right).\]
        Now, for $f(t) = \log(c + at/(b - t))$ and $t \in [0, b)$,
        \[f'(t) = \frac{ab}{\left(c + \frac{at}{b-t}\right)(b-t)^2}, \qquad f''(t) = \frac{-ab((a - 2c)(b - 2t) - 2ct)}{\left(c + \frac{at}{b-t}\right)^2 (b-t)^4}.\]
        In our case, $a = 1/2$, $c = 1$ and $b = \log 36 z^{1/3} - \log(2.004) > 0$. Hence, for $t < b$,
        \[-ab((a - 2c)(b - 2t) - 2ct) = \frac{b}{2}\left(2t + \frac{3}{2}(b - 2t)\right) = \frac{b}{2}\left(\frac{3}{2} b - t\right) > 0,\]
        and so $f''(t) > 0$. In other words, $t \to R^\circ_{z,e^t/4}$ is convex-up for $t < b$, i.e., for $e^t/4 < 9 z^{1/3}/2.004$. It is easy to check that, since we are assuming $y \ge 10^{25}$,
        \[2 r_1 = \frac{3}{4} y^{4/15} < \frac{9}{2.004} \left(\frac{2y}{\log y}\right)^{1/3} \le \frac{9 z^{1/3}}{2.004}.\]
        We conclude that $r \to R^\circ_{z,2r}$ is convex-up on $\log 8r$ for $r \le r_1$, and hence so is $r \to R_{z,r}$, and so, in turn, is $r \to R^2_{z,r}$. Thus, for $r \in [r_0, r_1]$,
        \[R^2_{z,2r} \le R^2_{z,2r_0} \cdot \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \cdot \frac{\log r/r_0}{\log r_1/r_0}. \tag{13.31}\]

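        Therefore, by (13.22),
        \[\begin{aligned}
        \int_{r_0}^{r_1} & \frac{(R_{z,2r} \log 2r)^2}{r^{3/2}}\, dr \le \int_{r_0}^{r_1} \left(R^2_{z,2r_0} \frac{\log r_1/r}{\log r_1/r_0} + R^2_{z,2r_1} \frac{\log r/r_0}{\log r_1/r_0}\right) (\log 2r)^2\, \frac{dr}{r^{3/2}} \\
        &= \frac{2 R^2_{z,2r_0}}{\log r_1/r_0} \left(\left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_1 - \frac{P_3(\log 2r_0)}{\sqrt{r_0}} + \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right) \\
        &\quad + \frac{2 R^2_{z,2r_1}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}} - \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \log 2r_0\right) \\
        &= 2 \left(R^2_{z,2r_0} - \frac{\log 2r_0}{\log r_1/r_0} (R^2_{z,2r_1} - R^2_{z,2r_0})\right) \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \\
        &\quad + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_3(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1)}{\sqrt{r_1}}\right) \\
        &= 2 R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) \\
        &\quad + 2\, \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right),
        \end{aligned} \tag{13.32}\]
        where $P_2(t)$ and $P_3(t)$ are as in (13.23), and $P_2^-(t) = P_3(t) - t P_2(t) = 2t^2 + 16t + 48$.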

        Putting all terms together, we conclude that
        \[\int_{r_0}^{r_1} \frac{g(r)}{r}\, dr \le f_0(r_0, y) + f_1(r_0) + f_2(r_0, y), \tag{13.33}\]
        where
        \[\begin{aligned}
        f_0(r_0, y) &= \left((1 - c_\phi) \sqrt{I_{0,r_0,r_1,y}} + c_\phi \sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}}\right) \sqrt{\frac{2}{\sqrt{r_0}} I_{1,r_0}}, \\
        f_1(r_0) &= \frac{\sqrt{F(\log r_0)}}{\sqrt{2 r_0}} \left(1 + \frac{e^\gamma}{F(\log r_0) \log r_0}\right) + \frac{5}{\sqrt{2 r_0}} + \frac{1}{r_0} \left(\left(\frac{13}{4} \log e r_0 + 11.07\right) J_{r_0} + 13.66 \log e r_0 + 37.55\right), \\
        f_2(r_0, y) &= 3.36\, \frac{((\log y)/2)^{1/6}}{y^{1/6}} \log\frac{r_1}{r_0},
        \end{aligned} \tag{13.34}\]
        where $F(t) = e^\gamma \log t + c_\gamma$, $c_\gamma = 1.025742$, $y = x/\kappa$ (as usual),
        \[\begin{aligned}
        I_{0,r_0,r_1,z} &= R^2_{z,2r_0} \cdot \left(\frac{P_2(\log 2r_0)}{\sqrt{r_0}} - \frac{P_2(\log 2r_1)}{\sqrt{r_1}}\right) + \frac{R^2_{z,2r_1} - R^2_{z,2r_0}}{\log r_1/r_0} \left(\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}} - \frac{P_3(\log 2r_1) - (\log 2r_0) P_2(\log 2r_1)}{\sqrt{r_1}}\right), \\
        J_r &= F(\log r) + \frac{e^\gamma}{\log r}, \qquad I_{1,r} = F(\log r) + \frac{2 e^\gamma}{\log r}, \qquad c_\phi = \frac{C_{\phi,2,\frac{\log y}{2}}/|\phi|_1}{\log\frac{\log y}{2}},
        \end{aligned} \tag{13.35}\]
        and $C_{\phi,2,K}$ is as in (11.20).

        Let us recapitulate briefly. The term $f_2(r_0, y)$ in (13.34) comes from the term $3.36 x^{-1/6}$ in (11.12). The term $f_1(r_0)$ includes all other terms in (11.12), except for $R_{x,2r} \log 2r \sqrt{\digamma(r)}/\sqrt{2r}$. The contribution of that last term is (13.28), divided by $\sqrt{2}$. That, in turn, is at most (13.29), divided by $\sqrt{2}$. The first integral in (13.29) was bounded in (13.32); the second integral was bounded in (13.27).

        Chapter 14

        Conclusion

        We now need to gather all results, using the smoothing functions
        \[\eta_* = (\eta_2 *_M \phi)(\kappa t),\]
        where $\phi(t) = t^2 e^{-t^2/2}$, $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2\cdot I_{[1/2,1]}$, and
        \[\eta_+ = h_{200}(t)\, t e^{-t^2/2},\]
        where
        \[h_H(t) = \int_0^\infty h(t y^{-1}) F_H(y)\, \frac{dy}{y}, \qquad h(t) = \begin{cases} t^2 (2 - t)^3 e^{t - 1/2} & \text{if } t \in [0,2], \\ 0 & \text{otherwise,} \end{cases} \qquad F_H(t) = \frac{\sin(H \log y)}{\pi \log y}.\]
        We studied $\eta_*$ and $\eta_+$ in Part II. We saw $\eta_*$ in Thm. 13.2.1 (which actually works for general $\phi : [0,\infty) \to [0,\infty)$, as its statement says). We will set $\kappa$ soon.

        We fix a value for $r$, namely, $r = 150000$. Our results will have to be valid for any $x \ge x_+$, where $x_+$ is fixed. We set $x_+ = 4.9\cdot 10^{26}$, since we want a result valid for $N \ge 10^{27}$, and, as was discussed in (11.1), we will work with $x_+$ slightly smaller than $N/2$.

        14.1 The $\ell_2$ norm over the major arcs: explicit version

        We apply Lemma 10.3.1 with $\eta = \eta_+$ and $\eta_\circ$ as in (11.3). Let us first work out the error terms defined in (10.27). Recall that $\delta_0 = 8$. By Thm. 7.1.4,
        \[ET_{\eta_+,\delta_0 r/2} = \max_{|\delta| \le \delta_0 r/2} |\mathrm{err}_{\eta_+,\chi_T}(\delta, x)| = 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x_+}} \le 1.1405\cdot 10^{-8}, \tag{14.1}\]
        \[\begin{aligned}
        E_{\eta_+,r,\delta_0} &= \max_{\substack{\chi \bmod q \\ q \le r\cdot\gcd(q,2) \\ |\delta| \le \gcd(q,2)\delta_0 r/2q}} \sqrt{q^*}\, |\mathrm{err}_{\eta_+,\chi^*}(\delta, x)| \\
        &\le 1.3482\cdot 10^{-14} \sqrt{300000} + \frac{1.617\cdot 10^{-10}}{\sqrt{2}} + \frac{1}{\sqrt{x_+}} \left(499900 + 52\sqrt{300000}\right) \le 2.3992\cdot 10^{-8},
        \end{aligned} \tag{14.2}\]
        where, in the latter case, we are using the fact that a stronger bound for $q = 1$ (namely, (14.1)) allows us to assume $q \ge 2$.

        We also need to bound a few norms: by the estimates in §A.3 and §A.5 (applied with $H = 200$),
        \[\begin{aligned}
        |\eta_+|_1 &\le 1.062319, \qquad |\eta_+|_2 \le 0.800129 + \frac{274.8569}{200^{7/2}} \le 0.800132, \\
        |\eta_+|_\infty &\le 1 + 2.06440727 \cdot \frac{1 + \frac{4}{\pi} \log H}{H} \le 1.079955.
        \end{aligned} \tag{14.3}\]
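The numerical inequalities in (14.1), (14.2) and (14.3) are plain arithmetic and can be re-checked in floating point. The sketch below is an illustration only: it recomputes the middle expressions with the constants quoted in the text and prints them next to the stated upper bounds. The variable names are assumptions made for the example, and ordinary double precision is used rather than the interval arithmetic of the original computations.

```python
import math

x_plus = 4.9e26   # x_+ as fixed in the text
r = 150000        # the chosen value of r, so that sqrt(2r) = sqrt(300000)
H = 200

# (14.1): 4.772e-11 + 251400 / sqrt(x_+)
et = 4.772e-11 + 251400 / math.sqrt(x_plus)

# (14.2): the three terms of the bound on E_{eta_+, r, delta_0}
e_r = (1.3482e-14 * math.sqrt(2 * r)
       + 1.617e-10 / math.sqrt(2)
       + (499900 + 52 * math.sqrt(2 * r)) / math.sqrt(x_plus))

# (14.3): the bounds on the L2 norm and the sup norm of eta_+
l2 = 0.800129 + 274.8569 / H ** 3.5
sup = 1 + 2.06440727 * (1 + (4 / math.pi) * math.log(H)) / H

print(et, "<=", 1.1405e-8)
print(e_r, "<=", 2.3992e-8)
print(l2, "<=", 0.800132)
print(sup, "<=", 1.079955)
```

The margins are small (the stated bounds are the computed values rounded up in the last digit), which is consistent with the constants having been produced by a rigorous computation and then rounded.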

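        By (10.12) and (14.1),
        \[|S_{\eta_+}(0, x)| = \left|\widehat{\eta_+}(0) \cdot x + O^*\left(\mathrm{err}_{\eta_+,\chi_T}(0, x)\right) \cdot x\right| \le (|\eta_+|_1 + ET_{\eta_+,\delta_0 r/2})\, x \le 1.063 x.\]
        This is far from optimal, but it will do, since all we wish to do with this is to bound the tiny error term $K_{r,2}$ in (10.27):
        \[\begin{aligned}
        K_{r,2} &= (1 + \sqrt{300000})(\log x)^2 \cdot 1.079955 \cdot \left(2\cdot 1.06232 + (1 + \sqrt{300000})(\log x)^2\, 1.079955/x\right) \\
        &\le 1259.06 (\log x)^2 \le 9.71\cdot 10^{-21} x
        \end{aligned}\]
        for $x \ge x_+$. By (14.1), we also have
        \[5.19\, \delta_0 r \left(ET_{\eta_+,\delta_0 r/2} \cdot \left(|\eta_+|_1 + \frac{ET_{\eta_+,\delta_0 r/2}}{2}\right)\right) \le 0.075272\]
        and
        \[\delta_0 r (\log 2e^2 r) \left(E^2_{\eta_+,r,\delta_0} + K_{r,2}/x\right) \le 1.00393\cdot 10^{-8}.\]
        By (A.23) and (A.26),
        \[0.8001287 \le |\eta_\circ|_2 \le 0.8001288 \tag{14.4}\]
        and
        \[|\eta_+ - \eta_\circ|_2 \le \frac{274.856893}{H^{7/2}} \le 2.42942\cdot 10^{-6}. \tag{14.5}\]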

        We bound $|\eta_\circ^{(3)}|_1$ using the fact that (as we can tell by taking derivatives) $\eta_\circ^{(2)}(t)$ increases from $0$ at $t = 0$ to a maximum within $[0, 1/2]$, and then decreases to $\eta_\circ^{(2)}(1) = -7$, only to increase to a maximum within $[3/2, 2]$ (equal to the maximum attained within $[0, 1/2]$) and then decrease to $0$ at $t = 2$:
        \[\begin{aligned}
        |\eta_\circ^{(3)}|_1 &= 2 \max_{t \in [0,1/2]} \eta_\circ^{(2)}(t) - 2 \eta_\circ^{(2)}(1) + 2 \max_{t \in [3/2,2]} \eta_\circ^{(2)}(t) \\
        &= 4 \max_{t \in [0,1/2]} \eta_\circ^{(2)}(t) + 14 \le 4\cdot 4.6255653 + 14 \le 32.5023,
        \end{aligned} \tag{14.6}\]
        where we compute the maximum by the bisection method with 30 iterations (using interval arithmetic, as always).
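The bisection step can be imitated in ordinary floating point. The sketch below rests on an assumption not restated in this section: that $\eta_\circ(t) = t^3 (2-t)^3 e^{-(t-1)^2/2}$ on $[0,2]$, i.e. $h(t)\, t e^{-t^2/2}$ with $h$ as above. Under that assumption a direct differentiation gives $\eta_\circ^{(2)}(t) = e^{-w/2}(1-w)(w^3 - 15 w^2 + 45 w - 7)$ with $w = (t-1)^2$, which indeed equals $-7$ at $t = 1$. The bisection is run on the sign of a finite-difference slope; this is a plain-float illustration, not the interval-arithmetic computation of the text.

```python
import math

def eta_circ_d2(t):
    """Second derivative of t^3 (2-t)^3 exp(-(t-1)^2/2), written in w = (t-1)^2."""
    w = (t - 1.0) ** 2
    return math.exp(-w / 2) * (1 - w) * (w ** 3 - 15 * w ** 2 + 45 * w - 7)

def slope(t, step=1e-6):
    # Central difference; only its sign is used by the bisection below.
    return (eta_circ_d2(t + step) - eta_circ_d2(t - step)) / (2 * step)

lo, hi = 0.01, 0.5            # the slope is positive at lo and negative at hi
for _ in range(30):           # 30 iterations, as in the text
    mid = (lo + hi) / 2
    if slope(mid) > 0:
        lo = mid
    else:
        hi = mid

t_max = (lo + hi) / 2
m = eta_circ_d2(t_max)
print(t_max, m, 4 * m + 14)
```

The printed maximum can be compared with the constant $4.6255653$ in (14.6), and the last value with $32.5023$.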

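        We evaluate explicitly
        \[\sum_{\substack{q \le r \\ q \text{ odd}}} \frac{\mu^2(q)}{\varphi(q)} = 6.798779\ldots,\]
        using, yet again, interval arithmetic.

        Looking at (10.29) and (10.28), we conclude that
        \[\begin{aligned}
        L_{r,\delta_0} &\le 2\cdot 6.798779 \cdot 0.800132^2 \le 8.70531, \\
        L_{r,\delta_0} &\ge 2\cdot 6.798779 \cdot 0.8001287^2 - \left((\log r + 1.7) \cdot (3.888\cdot 10^{-6} + 5.91\cdot 10^{-12})\right) \\
        &\quad - \left(1.342\cdot 10^{-5}\right) \cdot \left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right) \ge 8.70517.
        \end{aligned}\]
        Lemma 10.3.1 thus gives us that
        \[\begin{aligned}
        \int_{\mathfrak{M}_{8,r_0}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha &= (8.70524 + O^*(0.00007))\, x + O^*(0.075273)\, x \\
        &= (8.7052 + O^*(0.0754))\, x \le 8.7806 x.
        \end{aligned} \tag{14.7}\]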
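The sum over odd squarefree $q \le r$ and the two bounds on $L_{r,\delta_0}$ can be recomputed directly. The sketch below is an illustration in double precision (the text uses interval arithmetic): it sieves Euler's function and squarefreeness up to $r = 150000$, sums $\mu^2(q)/\varphi(q)$ over odd $q$, and then evaluates the two expressions for $L_{r,\delta_0}$ with the constants quoted above. The function and variable names are assumptions made for the example.

```python
import math

def odd_squarefree_totient_sum(r):
    """Sum of mu(q)^2 / phi(q) over odd q <= r, via a sieve for phi and squarefreeness."""
    phi = list(range(r + 1))
    squarefree = [True] * (r + 1)
    for p in range(2, r + 1):
        if phi[p] == p:                      # untouched so far, hence p is prime
            for m in range(p, r + 1, p):
                phi[m] -= phi[m] // p        # multiply phi[m] by (1 - 1/p)
            for m in range(p * p, r + 1, p * p):
                squarefree[m] = False
    return math.fsum(1.0 / phi[q] for q in range(1, r + 1, 2) if squarefree[q])

r = 150000
s = odd_squarefree_totient_sum(r)

# Upper and lower bounds for L_{r, delta_0}, with the constants quoted in the text.
upper = 2 * 6.798779 * 0.800132 ** 2
lower = (2 * 6.798779 * 0.8001287 ** 2
         - (math.log(r) + 1.7) * (3.888e-6 + 5.91e-12)
         - 1.342e-5 * (0.64787 + math.log(r) / (4 * r) + 0.425 / r))
print(s, upper, lower)
```

The first printed value can be compared with $6.798779\ldots$, and the other two with $8.70531$ and $8.70517$.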

        14.2 The total major-arc contribution

        First of all, we must bound from below
        \[C_0 = \prod_{p \mid N} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p \nmid N} \left(1 + \frac{1}{(p-1)^3}\right). \tag{14.8}\]
        The only prime that we know does not divide $N$ is $2$. Thus, we use the bound
        \[C_0 \ge 2 \prod_{p > 2} \left(1 - \frac{1}{(p-1)^2}\right) \ge 1.3203236. \tag{14.9}\]
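The constant in (14.9) can be approximated by truncating the product. The sketch below is an illustration only: the cutoff `P` is a hypothetical choice, and, since every omitted factor is less than $1$, the truncated product slightly overestimates the full one, so this shows where the digits come from rather than proving the lower bound.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

P = 10 ** 6   # hypothetical cutoff for the partial product
log_product = math.fsum(
    math.log1p(-1.0 / (p - 1) ** 2) for p in primes_up_to(P) if p > 2
)
partial = 2 * math.exp(log_product)

# The omitted factors multiply to at least 1 - sum_{n > P} 1/(n-1)^2 >= 1 - 1/(P-1),
# so the full product lies between `crude_lower` and `partial`.
crude_lower = partial * (1 - 1.0 / (P - 1))
print(crude_lower, partial)
```

A rigorous version would need a sharper tail estimate (using the density of primes) together with directed rounding, as in the interval arithmetic used elsewhere in the text.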

The other main constant is $C_{\eta_\circ,\eta_*}$, which we defined in (10.37) and already started to estimate in (11.6):

\[ C_{\eta_\circ,\eta_*} = |\eta_\circ|_2^2 \int_0^{N/x} \eta_*(\rho)\,d\rho + 2.71|\eta_\circ'|_2^2\cdot O^*\left(\int_0^{N/x} ((2 - N/x) + \rho)^2\eta_*(\rho)\,d\rho\right) \tag{14.10} \]

        262 CHAPTER 14 CONCLUSION

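provided that $N \ge 2x$. Recall that $\eta_* = (\eta_2 *_M \varphi)(\kappa t)$, where $\varphi(t) = t^2e^{-t^2/2}$. Therefore,

\[
\begin{aligned}
\int_0^{N/x} \eta_*(\rho)\,d\rho &= \int_0^{N/x} (\eta_2 *_M \varphi)(\kappa\rho)\,d\rho = \int_{1/4}^1 \eta_2(w)\int_0^{N/x} \varphi\left(\frac{\kappa\rho}{w}\right) d\rho\,\frac{dw}{w} \\
&= \frac{|\eta_2|_1|\varphi|_1}{\kappa} - \frac{1}{\kappa}\int_{1/4}^1 \eta_2(w)\int_{\kappa N/xw}^\infty \varphi(\rho)\,d\rho\,dw.
\end{aligned}
\]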

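By integration by parts and [AS64, (7.1.13)],

\[ \int_y^\infty \varphi(\rho)\,d\rho = ye^{-y^2/2} + \sqrt{2}\int_{y/\sqrt{2}}^\infty e^{-t^2}\,dt < \left(y + \frac{1}{y}\right)e^{-y^2/2}. \]

Hence

\[ \int_{\kappa N/xw}^\infty \varphi(\rho)\,d\rho \le \int_{2\kappa}^\infty \varphi(\rho)\,d\rho < \left(2\kappa + \frac{1}{2\kappa}\right)e^{-2\kappa^2} \]

and so, since $|\eta_2|_1 = 1$,

\[
\begin{aligned}
\int_0^{N/x} \eta_*(\rho)\,d\rho &\ge \frac{|\varphi|_1}{\kappa} - \int_{1/4}^1 \eta_2(w)\,dw\cdot\left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2} \\
&\ge \frac{|\varphi|_1}{\kappa} - \left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2}.
\end{aligned}
\tag{14.11}
\]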

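Let us now focus on the second integral in (14.10). Write $N/x = 2 + c_1/\kappa$. Then the integral equals

\[
\begin{aligned}
\int_0^{2+c_1/\kappa} (-c_1/\kappa + \rho)^2\eta_*(\rho)\,d\rho &\le \frac{1}{\kappa^3}\int_0^\infty (u - c_1)^2 (\eta_2 *_M \varphi)(u)\,du \\
&= \frac{1}{\kappa^3}\int_{1/4}^1 \eta_2(w)\int_0^\infty (vw - c_1)^2\varphi(v)\,dv\,dw \\
&= \frac{1}{\kappa^3}\int_{1/4}^1 \eta_2(w)\left(3\sqrt{\frac{\pi}{2}}\,w^2 - 2\cdot 2c_1w + c_1^2\sqrt{\frac{\pi}{2}}\right)dw \\
&= \frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{9}{4}c_1 + \sqrt{\frac{\pi}{2}}\,c_1^2\right).
\end{aligned}
\]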

It is thus best to choose $c_1 = (9/4)/\sqrt{2\pi} = 0.89762\ldots$.

We must now estimate $|\eta_\circ'|_2^2$. We could do this directly by rigorous numerical integration, but we might as well do it the hard way (which is actually rather easy). By the definition (11.3) of $\eta_\circ$,

\[ |\eta_\circ'(x+1)|^2 = \left(x^{14} - 18x^{12} + 111x^{10} - 284x^8 + 351x^6 - 210x^4 + 49x^2\right)e^{-x^2} \tag{14.12} \]

for $x \in [-1, 1]$, and $\eta_\circ'(x+1) = 0$ for $x \notin [-1, 1]$. Now, for any even integer $k > 0$,

\[ \int_{-1}^1 x^ke^{-x^2}\,dx = 2\int_0^1 x^ke^{-x^2}\,dx = \gamma\left(\frac{k+1}{2}, 1\right), \]

where $\gamma(a, r) = \int_0^r e^{-t}t^{a-1}\,dt$ is the incomplete gamma function. (We substitute $t = x^2$ in the integral.) By [AS64, (6.5.16), (6.5.22)], $\gamma(a+1, 1) = a\gamma(a, 1) - 1/e$ for all $a > 0$, and $\gamma(1/2, 1) = \sqrt{\pi}\,\mathrm{erf}(1)$, where

\[ \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt. \]

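Thus, starting from (14.12), we see that

\[
\begin{aligned}
|\eta_\circ'|_2^2 &= \gamma\left(\tfrac{15}{2}, 1\right) - 18\cdot\gamma\left(\tfrac{13}{2}, 1\right) + 111\cdot\gamma\left(\tfrac{11}{2}, 1\right) - 284\cdot\gamma\left(\tfrac{9}{2}, 1\right) \\
&\quad + 351\cdot\gamma\left(\tfrac{7}{2}, 1\right) - 210\cdot\gamma\left(\tfrac{5}{2}, 1\right) + 49\cdot\gamma\left(\tfrac{3}{2}, 1\right) \\
&= \frac{9151}{128}\sqrt{\pi}\,\mathrm{erf}(1) - \frac{18101}{64e} = 2.7375292\ldots
\end{aligned}
\tag{14.13}
\]

We thus obtain

\[
\begin{aligned}
2.71|\eta_\circ'|_2^2\cdot\int_0^{N/x} ((2 - N/x) + \rho)^2\eta_*(\rho)\,d\rho &\le 7.4188\cdot\frac{1}{\kappa^3}\left(\frac{49}{48}\sqrt{\frac{\pi}{2}} - \frac{(9/4)^2}{2\sqrt{2\pi}}\right) \\
&\le \frac{2.0002}{\kappa^3}.
\end{aligned}
\]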
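The closed form in (14.13) can be checked exactly with rational arithmetic, by writing each $\gamma(a, 1)$ as $A\sqrt{\pi}\,\mathrm{erf}(1) - B/e$ and applying the recurrence $\gamma(a+1, 1) = a\gamma(a, 1) - 1/e$. The following sketch does this; the helper names are illustrative, and only the final decimal value relies on floating point.

```python
from fractions import Fraction
import math

# Coefficients of gamma(a, 1) in (14.13), keyed by 2a.
COEFFS = {15: 1, 13: -18, 11: 111, 9: -284, 7: 351, 5: -210, 3: 49}

def gamma_coeff_table(max_twice_a):
    """Map 2a -> (A, B) with gamma(a, 1) = A*sqrt(pi)*erf(1) - B/e, for half-integers a."""
    table = {1: (Fraction(1), Fraction(0))}      # gamma(1/2, 1) = sqrt(pi)*erf(1)
    twice_a = 1
    while twice_a < max_twice_a:
        a = Fraction(twice_a, 2)
        big_a, big_b = table[twice_a]
        # gamma(a+1, 1) = a*gamma(a, 1) - 1/e
        table[twice_a + 2] = (a * big_a, a * big_b + 1)
        twice_a += 2
    return table

def eta_prime_norm_squared():
    """Return (A, B, value) with |eta_circ'|_2^2 = A*sqrt(pi)*erf(1) - B/e."""
    table = gamma_coeff_table(15)
    big_a = sum(c * table[k][0] for k, c in COEFFS.items())
    big_b = sum(c * table[k][1] for k, c in COEFFS.items())
    value = float(big_a) * math.sqrt(math.pi) * math.erf(1.0) - float(big_b) / math.e
    return big_a, big_b, value

if __name__ == "__main__":
    print(eta_prime_norm_squared())
```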

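We conclude that

\[ C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}|\varphi|_1|\eta_\circ|_2^2 - |\eta_\circ|_2^2\left(2 + \frac{1}{2\kappa^2}\right)e^{-2\kappa^2} - \frac{2.0002}{\kappa^3}. \]

Setting $\kappa = 49$ and using (14.4), we obtain

\[ C_{\eta_\circ,\eta_*} \ge \frac{1}{\kappa}\left(|\varphi|_1|\eta_\circ|_2^2 - 0.000834\right). \tag{14.14} \]

Here it is useful to note that $|\varphi|_1 = \sqrt{\pi/2}$, and so, by (14.4), $|\varphi|_1|\eta_\circ|_2^2 = 0.80237\ldots$.

We have finally chosen $x$ in terms of $N$:

\[ x = \frac{N}{2 + \frac{c_1}{\kappa}} = \frac{N}{2 + \frac{9/4}{\sqrt{2\pi}}\cdot\frac{1}{49}} = 0.495461\ldots\cdot N. \tag{14.15} \]

Thus, we see that, since we are assuming $N \ge 10^{27}$, we in fact have $x \ge 4.95461\ldots\cdot 10^{26}$, and so, in particular,

\[ x \ge 4.9\cdot 10^{26}, \qquad \frac{x}{\kappa} \ge 10^{25}. \tag{14.16} \]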
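The choice of $c_1$ and the constants that follow from it are plain arithmetic, and can be rechecked in floating point. This is a sketch with illustrative names; the constants $7.4188$ and $2.0002$ are taken from the text as given.

```python
import math

KAPPA = 49
SQRT_HALF_PI = math.sqrt(math.pi / 2)

def second_integral_quadratic(c1):
    """The quadratic in c1 bounding kappa^3 times the second integral in (14.10)."""
    return (49 / 48) * SQRT_HALF_PI - (9 / 4) * c1 + SQRT_HALF_PI * c1 * c1

def optimal_c1():
    """Minimizer of the quadratic: (9/4) / (2*sqrt(pi/2)) = (9/4)/sqrt(2*pi)."""
    return (9 / 4) / math.sqrt(2 * math.pi)

def x_over_n(c1, kappa=KAPPA):
    """The ratio x/N fixed by N/x = 2 + c1/kappa, as in (14.15)."""
    return 1.0 / (2.0 + c1 / kappa)

if __name__ == "__main__":
    c1 = optimal_c1()
    print(c1, 7.4188 * second_integral_quadratic(c1), x_over_n(c1))
```

With $N \ge 10^{27}$, multiplying the printed ratio by $N$ and dividing by $\kappa = 49$ gives the two inequalities in (14.16).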


Let us continue with our determination of the major-arcs total. We should compute the quantities in (10.38). We already have bounds for $E_{\eta_+,r,\delta_0}$, $A_{\eta_+}$ (see (14.7)), $L_{\eta,r,\delta_0}$ and $K_{r,2}$. By Corollary 7.1.3, we have

        Eηlowastr8 le maxχ mod q

        qlermiddotgcd(q2)

        |δ|legcd(q2)δ0r2q

        radicqlowast| errηlowastχlowast(δ x)|

        le 1

        κ

        (2485 middot 10minus19 +

        1radic1025

        (381500 + 76

        radic300000

        ))le 133805 middot 10minus8

        κ

        (1417)

where the factor of $\kappa$ comes from the scaling in $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$ (which, in effect, divides $x$ by $\kappa$). It remains only to bound the more harmless terms of type $Z_{\eta,2}$ and $LS_\eta$.

Clearly, $Z_{\eta_+^2,2} \le (1/x)\sum_n \Lambda(n)(\log n)\eta_+^2(n/x)$. Now, by Prop. 7.1.5,

        infinsumn=1

        Λ(n)(log n)η2(nx)

        =

        (0640206 +Olowast

        (2 middot 10minus6 +

        36691radicx

        ))x log xminus 0021095x

        le (0640206 +Olowast(3 middot 10minus6))x log xminus 0021095x

        (1418)

Thus,
\[ Z_{\eta_+^2,2} \le 0.640209\log x. \tag{14.19} \]

We will proceed a little more crudely for $Z_{\eta_*^2,2}$:

\[
\begin{aligned}
Z_{\eta_*^2,2} &= \frac{1}{x}\sum_n \Lambda^2(n)\eta_*^2(n/x) \le \frac{1}{x}\sum_n \Lambda(n)\eta_*(n/x)\cdot(\eta_*(n/x)\log n) \\
&\le \left(|\eta_*|_1 + |\mathrm{err}_{\eta_*,\chi_T}(0, x)|\right)\cdot\left(|\eta_*(t)\cdot\log^+(\kappa t)|_\infty + |\eta_*|_\infty\log(x/\kappa)\right),
\end{aligned}
\tag{14.20}
\]

where $\log^+(t) = \max(0, \log t)$. It is easy to see that

\[ |\eta_*|_\infty = |\eta_2 *_M \varphi|_\infty \le \left|\frac{\eta_2(t)}{t}\right|_1 |\varphi|_\infty \le 4(\log 2)^2\cdot\frac{2}{e} \le 1.414, \tag{14.21} \]

and, since $\log^+$ is non-decreasing and $\eta_2$ is supported on a subset of $[0, 1]$,

\[
\begin{aligned}
|\eta_*(t)\cdot\log^+(\kappa t)|_\infty &= |(\eta_2 *_M \varphi)\cdot\log^+|_\infty \le |\eta_2 *_M (\varphi\cdot\log^+)|_\infty \\
&\le \left|\frac{\eta_2(t)}{t}\right|_1\cdot|\varphi\cdot\log^+|_\infty \le 1.921813\cdot 0.381157 \le 0.732513,
\end{aligned}
\]

where we bound $|\varphi\cdot\log^+|_\infty$ by the bisection method with 25 iterations. We already know that

\[ |\eta_*|_1 = \frac{|\eta_2|_1|\varphi|_1}{\kappa} = \frac{|\varphi|_1}{\kappa} = \frac{\sqrt{\pi/2}}{\kappa}. \tag{14.22} \]
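The two sup norms of $\varphi(t) = t^2e^{-t^2/2}$ used above can be approximated by sampling. The sketch below is a floating-point check only (the text uses bisection with interval arithmetic); the grid range $[1, 10]$ is an assumption, justified informally by the rapid decay of $\varphi$.

```python
import math

def phi(t):
    """phi(t) = t^2 exp(-t^2/2)."""
    return t * t * math.exp(-t * t / 2)

def phi_log_plus(t):
    """phi(t) * log^+(t), where log^+(t) = max(0, log t)."""
    return phi(t) * max(0.0, math.log(t)) if t > 0 else 0.0

def grid_max(f, a, b, steps):
    """Largest sampled value of f on [a, b]; not a rigorous bound."""
    return max(f(a + (b - a) * i / steps) for i in range(steps + 1))

def sup_phi_log_plus():
    # log^+ vanishes on (0, 1], and phi is negligible beyond t = 10.
    return grid_max(phi_log_plus, 1.0, 10.0, 900_000)

if __name__ == "__main__":
    print(phi(math.sqrt(2)), 2 / math.e, sup_phi_log_plus())
```

The maximum of $\varphi$ itself is attained at $t = \sqrt{2}$, which gives the factor $2/e$ in (14.21).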

By Cor. 7.1.3,

\[ |\mathrm{err}_{\eta_*,\chi_T}(0, x)| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{10^{25}}}(381500 + 76) \le 1.20665\cdot 10^{-7}. \]

We conclude that

\[ Z_{\eta_*^2,2} \le \left(\sqrt{\pi/2}/49 + 1.20665\cdot 10^{-7}\right)\left(0.732513 + 1.414\log(x/49)\right) \le 0.0362\log x. \tag{14.23} \]

We have bounds for $|\eta_*|_\infty$ and $|\eta_+|_\infty$. We can also bound

\[ |\eta_*\cdot t|_\infty = \frac{|(\eta_2 *_M \varphi)\cdot t|_\infty}{\kappa} \le \frac{|\eta_2|_1\cdot|\varphi\cdot t|_\infty}{\kappa} \le \frac{3^{3/2}e^{-3/2}}{\kappa}. \]

We quote the estimate

\[ |\eta_+\cdot t|_\infty = 1.064735 + 3.25312\cdot(1 + (4/\pi)\log 200)/200 \le 1.19073 \tag{14.24} \]

from (A.42). We can now bound $LS_\eta(x, r)$ for $\eta = \eta_*, \eta_+$:

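\[
\begin{aligned}
LS_\eta(x, r) &= \log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right) \\
&\le (\log r)\cdot\max_{p\le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \sum_{\alpha\ge 1,\ p^\alpha\ge x}\frac{|\eta\cdot t|_\infty}{p^\alpha/x}\right) \\
&\le (\log r)\cdot\max_{p\le r}\left(\frac{\log x}{\log p}|\eta|_\infty + \frac{|\eta\cdot t|_\infty}{1 - 1/p}\right) \\
&\le \frac{(\log r)(\log x)}{\log 2}|\eta|_\infty + 2(\log r)|\eta\cdot t|_\infty,
\end{aligned}
\]

and so

\[
\begin{aligned}
LS_{\eta_*} &\le \left(\frac{1.414}{\log 2}\log x + 2\cdot\frac{(3/e)^{3/2}}{49}\right)\log r \le 24.32\log x + 0.57, \\
LS_{\eta_+} &\le \left(\frac{1.07996}{\log 2}\log x + 2\cdot 1.19073\right)\log r \le 18.57\log x + 28.39,
\end{aligned}
\tag{14.25}
\]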

where we are using the bound on $|\eta_+|_\infty$ in (14.3). We can now start to put together all terms in (10.36). Let $\epsilon_0 = |\eta_+ - \eta_\circ|_2/|\eta_\circ|_2$. Then, by (14.5),

\[ \epsilon_0|\eta_\circ|_2 = |\eta_+ - \eta_\circ|_2 \le 2.42942\cdot 10^{-6}. \]

Thus,

\[ 2.82643|\eta_\circ|_2^2(2 + \epsilon_0)\cdot\epsilon_0 + \frac{4.31004|\eta_\circ|_2^2 + 0.0012\,\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r} \]

is at most

\[ 2.82643\cdot 2.42942\cdot 10^{-6}\cdot(2\cdot 0.80013 + 2.42942\cdot 10^{-6}) + \frac{4.3101\cdot 0.80013^2 + 0.0012\cdot\frac{32.503^2}{8^5}}{150000} \le 2.9387\cdot 10^{-5} \]

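by (14.4), (14.6) and (14.22). Since $\eta_* = (\eta_2 *_M \varphi)(\kappa x)$ and $\eta_2$ is supported on $[1/4, 1]$,

\[
\begin{aligned}
|\eta_*|_2^2 &= \frac{|\eta_2 *_M \varphi|_2^2}{\kappa} = \frac{1}{\kappa}\int_0^\infty\left(\int_0^\infty \eta_2(t)\varphi\left(\frac{w}{t}\right)\frac{dt}{t}\right)^2 dw \\
&\le \frac{1}{\kappa}\int_0^\infty\left(1 - \frac{1}{4}\right)\int_0^\infty \eta_2^2(t)\varphi^2\left(\frac{w}{t}\right)\frac{dt}{t^2}\,dw \\
&= \frac{3}{4\kappa}\int_0^\infty \frac{\eta_2^2(t)}{t}\left(\int_0^\infty \varphi^2\left(\frac{w}{t}\right)\frac{dw}{t}\right)dt \\
&= \frac{3}{4\kappa}\left|\frac{\eta_2(t)}{\sqrt{t}}\right|_2^2\cdot|\varphi|_2^2 = \frac{3}{4\kappa}\cdot\frac{32}{3}(\log 2)^3\cdot\frac{3}{8}\sqrt{\pi} \le \frac{1.77082}{\kappa},
\end{aligned}
\]

where we go from the first to the second line by Cauchy-Schwarz. Recalling the bounds on $E_{\eta_*,r,\delta_0}$ and $E_{\eta_+,r,\delta_0}$ we obtained in (14.2) and (14.17), we conclude that the second line of (10.36) is at most $x^2$ times

\[ \frac{1.33805\cdot 10^{-8}}{\kappa}\cdot 8.7806 + 2.3922\cdot 10^{-8}\cdot 1.6812\cdot\left(\sqrt{8.7806} + 1.6812\cdot 0.80014\right)\sqrt{\frac{1.77082}{\kappa}} \le \frac{1.7316\cdot 10^{-6}}{\kappa}, \]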

where we are using the bound $A_{\eta_+} \le 8.7806$ we obtained in (14.7). (We are also using the bounds on norms in (14.3) and the value $\kappa = 49$.)

By the bounds (14.19), (14.23) and (14.25), we see that the third line of (10.36) is at most

\[
\begin{aligned}
&2\cdot(0.640209\log x)\cdot(24.32\log x + 0.57)\cdot x \\
&\quad + 4\sqrt{0.640209\log x\cdot 0.0362\log x}\,(18.57\log x + 28.39)\,x \le 43(\log x)^2x,
\end{aligned}
\]

where we use the assumption $x \ge x_+ = 4.9\cdot 10^{26}$ (though a much weaker assumption would suffice).

Using the assumption $x \ge x_+$ again, together with (14.22) and the bounds we have just proven, we conclude that, for $r = 150000$, the integral over the major arcs

\[ \int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha, x)^2S_{\eta_*}(\alpha, x)e(-N\alpha)\,d\alpha \]

is

\[
\begin{aligned}
&C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\left(2.9387\cdot 10^{-5}\cdot\frac{\sqrt{\pi/2}}{\kappa}x^2 + \frac{1.7316\cdot 10^{-6}}{\kappa}x^2 + 43(\log x)^2x\right) \\
&\quad = C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*\left(\frac{3.85628\cdot 10^{-5}\cdot x^2}{\kappa}\right) = C_0\cdot C_{\eta_\circ,\eta_*}x^2 + O^*(7.86996\cdot 10^{-7}x^2),
\end{aligned}
\tag{14.26}
\]

where $C_0$ and $C_{\eta_\circ,\eta_*}$ are as in (10.37). Notice that $C_0C_{\eta_\circ,\eta_*}x^2$ is the expected asymptotic for the integral over all of $\mathbb{R}/\mathbb{Z}$.

Moreover, by (14.9), (14.14) and (14.4), as well as $|\varphi|_1 = \sqrt{\pi/2}$,

\[ C_0\cdot C_{\eta_\circ,\eta_*} \ge 1.3203236\left(\frac{|\varphi|_1|\eta_\circ|_2^2}{\kappa} - \frac{0.000834}{\kappa}\right) \ge \frac{1.0594003}{\kappa} - \frac{0.001102}{\kappa} \ge \frac{1.058298}{49}. \]

Hence

\[ \int_{\mathfrak{M}_{8,r}} S_{\eta_+}(\alpha, x)^2S_{\eta_*}(\alpha, x)e(-N\alpha)\,d\alpha \ge \frac{1.058259}{\kappa}x^2, \tag{14.27} \]

where, as usual, $\kappa = 49$. This is our total major-arc bound.
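The passage from the individual error terms to (14.26) and (14.27) is bookkeeping with explicit constants. The sketch below redoes that arithmetic in floating point, taking every constant (including the intermediate value $1.0594003$) from the text as given; the function names are illustrative.

```python
import math

KAPPA = 49
X_PLUS = 4.9e26   # lower bound on x from (14.16)

def major_arc_error_times_kappa(x=X_PLUS):
    """kappa/x^2 times the O^* term in the first line of (14.26), evaluated at x."""
    first = 2.9387e-5 * math.sqrt(math.pi / 2)
    second = 1.7316e-6
    third = 43 * math.log(x) ** 2 * KAPPA / x   # 43 (log x)^2 x, rescaled by kappa/x^2
    return first + second + third

def major_arc_lower_constant():
    """kappa/x^2 times the lower bound leading to (14.27)."""
    main = 1.0594003 - 0.001102     # lower bound for kappa * C_0 * C_{eta_circ, eta_*}
    return main - 3.85628e-5

if __name__ == "__main__":
    print(major_arc_error_times_kappa(), major_arc_lower_constant())
```

Since $(\log x)^2/x$ is decreasing for $x \ge e^2$, evaluating the third term at $x = x_+$ covers all larger $x$.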

14.3 The minor-arc total: explicit version

We need to estimate the quantities $E$, $S$, $T$, $J$, $M$ in Theorem 13.2.1. Let us start by bounding the constants in (13.12). The constants $C_{\eta_+,j}$, $j = 0, 1, 2$, will appear only in the minor term $E$, and so crude bounds on them will do.

By (14.3) and (14.24),

\[ \sup_{r\ge t}\eta_+(r) \le \min\left(1.07996, \frac{1.19073}{t}\right) \]

for all $t \ge 0$. Thus,

\[
\begin{aligned}
C_{\eta_+,0} &= 0.7131\int_0^\infty \frac{1}{\sqrt{t}}\left(\sup_{r\ge t}\eta_+(r)\right)^2 dt \\
&\le 0.7131\left(\int_0^1 \frac{1.07996^2}{\sqrt{t}}\,dt + \int_1^\infty \frac{1.19073^2}{t^{5/2}}\,dt\right) \le 2.33744.
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
C_{\eta_+,1} &= 0.7131\int_1^\infty \frac{\log t}{\sqrt{t}}\left(\sup_{r\ge t}\eta_+(r)\right)^2 dt \\
&\le 0.7131\int_1^\infty \frac{1.19073^2\log t}{t^{5/2}}\,dt \le 0.44937.
\end{aligned}
\]

Immediately from (14.3),

\[ C_{\eta_+,2} = 0.51942|\eta_+|_\infty^2 \le 0.60581. \]

We get

\[
\begin{aligned}
E &\le \left((2.33744 + 0.60581)\log x + (2\cdot 2.33744 + 0.44937)\right)\cdot x^{1/2} \\
&\le (2.94325\log x + 5.12426)\cdot x^{1/2} \le 8.4029\cdot 10^{-12}\cdot x,
\end{aligned}
\tag{14.28}
\]

where $E$ is defined as in (13.11), and where we are using the assumption $x \ge x_+ = 4.9\cdot 10^{26}$. Using (14.17) and (14.22), we see that

\[ S_{\eta_*}(0, x) = (|\eta_*|_1 + O^*(ET_{\eta_*,0}))x = \left(\sqrt{\pi/2} + O^*(1.33805\cdot 10^{-8})\right)\frac{x}{\kappa}. \]

Hence

\[ S_{\eta_*}(0, x)\cdot E \le 1.05315\cdot 10^{-11}\cdot\frac{x^2}{\kappa}. \tag{14.29} \]
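The constants $C_{\eta_+,j}$ and the bound (14.28) follow from elementary integrals: $\int_0^1 t^{-1/2}\,dt = 2$, $\int_1^\infty t^{-5/2}\,dt = 2/3$ and $\int_1^\infty t^{-5/2}\log t\,dt = 4/9$. The sketch below plugs these in, with all numerical inputs taken from the text; the function names are illustrative.

```python
import math

SUP_ETA_PLUS = 1.07996      # bound on |eta_+|_infty from (14.3)
SUP_ETA_PLUS_T = 1.19073    # bound on |eta_+ * t|_infty from (14.24)
X_PLUS = 4.9e26

def c_eta_plus_bounds():
    """Upper bounds for C_{eta_+,0}, C_{eta_+,1}, C_{eta_+,2} as derived in the text."""
    c0 = 0.7131 * (2 * SUP_ETA_PLUS**2 + (2 / 3) * SUP_ETA_PLUS_T**2)
    c1 = 0.7131 * SUP_ETA_PLUS_T**2 * (4 / 9)
    c2 = 0.51942 * SUP_ETA_PLUS**2
    return c0, c1, c2

def e_over_x(x=X_PLUS):
    """The bound (2.94325 log x + 5.12426) / sqrt(x) on E/x from (14.28)."""
    return (2.94325 * math.log(x) + 5.12426) / math.sqrt(x)

if __name__ == "__main__":
    print(c_eta_plus_bounds(), e_over_x())
```

The quantity `e_over_x(x)` is decreasing in $x$ on the relevant range, so evaluating it at $x_+$ suffices.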

We can bound

\[ S \le \sum_n \Lambda(n)(\log n)\eta_+^2(n/x) \le 0.640209x\log x - 0.021095x \tag{14.30} \]

by (14.18). Let us now estimate $T$. Recall that $\varphi(t) = t^2e^{-t^2/2}$. Since

\[ \int_0^u \varphi(t)\,dt = \int_0^u t^2e^{-t^2/2}\,dt \le \int_0^u t^2\,dt = \frac{u^3}{3}, \]

we can bound

\[ C_{\varphi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right) = \frac{1.04488}{\sqrt{\pi/2}}\int_0^{\frac{2}{\log x/\kappa}} t^2e^{-t^2/2}\,dt \le \frac{0.2779}{((\log x/\kappa)/2)^3}. \]

By (14.7), we already know that $J = (8.7052 + O^*(0.0754))x$. Hence

\[ (\sqrt{J} - \sqrt{E})^2 = \left(\sqrt{(8.7052 + O^*(0.0754))x} - \sqrt{8.4029\cdot 10^{-12}\cdot x}\right)^2 \ge 8.6297x, \tag{14.31} \]

and so

\[
\begin{aligned}
T &= C_{\varphi,3}\left(\frac{1}{2}\log\frac{x}{\kappa}\right)\cdot\left(S - (\sqrt{J} - \sqrt{E})^2\right) \\
&\le \frac{8\cdot 0.2779}{(\log x/\kappa)^3}\cdot(0.640209x\log x - 0.021095x - 8.6297x) \\
&\le \frac{0.17792\cdot 8x\log x}{(\log x/\kappa)^3} - \frac{2.40405\cdot 8x}{(\log x/\kappa)^3} \\
&\le \frac{1.42336x}{(\log x/\kappa)^2} - \frac{13.69293x}{(\log x/\kappa)^3}
\end{aligned}
\]

for $\kappa = 49$. Since $x/\kappa \ge 10^{25}$, this implies that

\[ T \le 3.5776\cdot 10^{-4}\cdot x. \tag{14.32} \]
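The chain of inequalities for $T$ can be rechecked numerically. The sketch below is floating-point arithmetic only, with all constants taken from the text; it writes $L = \log(x/\kappa)$, so that $\log x = L + \log 49$, and the function names are illustrative.

```python
import math

KAPPA = 49

def c_phi3_constant():
    """(1.04488 / sqrt(pi/2)) / 3, the constant bounded by 0.2779 in the text."""
    return 1.04488 / math.sqrt(math.pi / 2) / 3

def j_minus_e_lower():
    """Lower bound for (sqrt(J) - sqrt(E))^2 / x using J >= 8.6298 x, E <= 8.4029e-12 x."""
    return (math.sqrt(8.7052 - 0.0754) - math.sqrt(8.4029e-12)) ** 2

def t_over_x(log_x_over_kappa):
    """The final bound 1.42336 / L^2 - 13.69293 / L^3 on T/x."""
    big_l = log_x_over_kappa
    return 1.42336 / big_l**2 - 13.69293 / big_l**3

if __name__ == "__main__":
    print(c_phi3_constant(), j_minus_e_lower(), t_over_x(math.log(1e25)))
```

The bound `t_over_x` is decreasing once $L > 3\cdot 13.69293/(2\cdot 1.42336) \approx 14.4$, so its value at $L = \log 10^{25}$ is the worst case.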

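It remains to estimate $M$. Let us first look at $g(r_0)$; here $g = g_{x/\kappa,\varphi}$, where $g_{y,\varphi}$ is defined as in (11.19) and $\varphi(t) = t^2e^{-t^2/2}$, as usual. Write $y = x/\kappa$. We must estimate the constant $C_{\varphi,2,K}$ defined in (11.21):

\[
\begin{aligned}
C_{\varphi,2,K} &= -\int_{1/K}^1 \varphi(w)\log w\,dw \le -\int_0^1 \varphi(w)\log w\,dw \\
&\le -\int_0^1 w^2e^{-w^2/2}\log w\,dw \le 0.093426,
\end{aligned}
\]

where again we use VNODE-LP for rigorous numerical integration. Since $|\varphi|_1 = \sqrt{\pi/2}$ and $K = (\log y)/2$, this implies that

\[ \frac{C_{\varphi,2,K}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}} \tag{14.33} \]

and so

\[ R_{y,K,\varphi,t} = \frac{0.07455}{\log\frac{\log y}{2}}R_{y/K,t} + \left(1 - \frac{0.07455}{\log\frac{\log y}{2}}\right)R_{y,t}. \tag{14.34} \]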

Let $t = 2r_0 = 300000$; we recall that $K = (\log y)/2$. Recall from (14.16) that $y = x/\kappa \ge 10^{25}$; thus, $y/K \ge 3.47435\cdot 10^{23}$ and $\log((\log y)/2) \ge 3.35976$. Going back to the definition of $R_{x,t}$ in (11.13), we see that

\[ R_{y,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(10^{25})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.58341, \tag{14.35} \]

\[ R_{y/K,2r_0} \le 0.27125\log\left(1 + \frac{\log(8\cdot 150000)}{2\log\frac{9\cdot(3.47435\cdot 10^{23})^{1/3}}{2.004\cdot 2\cdot 150000}}\right) + 0.41415 \le 0.60295, \tag{14.36} \]

and so

\[ R_{y,K,\varphi,2r_0} \le \frac{0.07455}{3.35976}\,0.60295 + \left(1 - \frac{0.07455}{3.35976}\right)0.58341 \le 0.58385. \]

Using

\[ z(r) = e^\gamma\log\log r + \frac{2.50637}{\log\log r} \le 5.42506, \]

we see from (11.13) that

\[ L_{2r_0} = 5.42506\cdot\left(\frac{13}{4}\log 300000 + 7.82\right) + 13.66\log 300000 + 37.55 \le 474.608. \]

Going back to (11.19), we sum up and obtain that

\[
\begin{aligned}
g(r_0) &= \frac{(0.58385\cdot\log 300000 + 0.5)\sqrt{5.42506} + 2.5}{\sqrt{2\cdot 150000}} + \frac{474.608}{150000} + 3.36\left(\frac{\log y}{2y}\right)^{1/6} \\
&\le 0.041568.
\end{aligned}
\]
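The evaluation of $g(r_0)$ chains together several explicit expressions. The sketch below recomputes them in floating point at the worst case $y = 10^{25}$; the formulas for $R_{z,t}$, $z(r)$, $L_{2r_0}$ and $g$ are read off from the displayed evaluations above, not from the definitions (11.13) and (11.19), which lie outside this section, and the function names are illustrative.

```python
import math

R0 = 150000
Y = 1e25
EXP_EULER_GAMMA = math.exp(0.5772156649015329)   # e^gamma, gamma = Euler's constant

def r_bound(z, t):
    """0.27125 log(1 + log(4t) / (2 log(9 z^(1/3) / (2.004 t)))) + 0.41415."""
    inner = math.log(4 * t) / (2 * math.log(9 * z ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def z_of(r):
    """z(r) = e^gamma log log r + 2.50637 / log log r."""
    ll = math.log(math.log(r))
    return EXP_EULER_GAMMA * ll + 2.50637 / ll

def g_at_r0(y=Y, r0=R0):
    """Floating-point value of the displayed expression for g(r0)."""
    big_k = math.log(y) / 2
    weight = 0.07455 / math.log(big_k)
    r_mix = weight * r_bound(y / big_k, 2 * r0) + (1 - weight) * r_bound(y, 2 * r0)
    z = z_of(r0)
    log_t = math.log(2 * r0)
    big_l = z * ((13 / 4) * log_t + 7.82) + 13.66 * log_t + 37.55
    return (((r_mix * log_t + 0.5) * math.sqrt(z) + 2.5) / math.sqrt(2 * r0)
            + big_l / r0 + 3.36 * (math.log(y) / (2 * y)) ** (1 / 6))

if __name__ == "__main__":
    print(r_bound(Y, 2 * R0), z_of(R0), g_at_r0())
```

Because the script keeps unrounded intermediate values, its output sits slightly below the bound in the text, which rounds each intermediate quantity upwards.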

Using again the bound $x \ge 4.9\cdot 10^{26}$, we obtain

\[
\begin{aligned}
\frac{\log(150000 + 1) + c_+}{\log\sqrt{x} + c_-}\cdot S - (\sqrt{J} - \sqrt{E})^2 &\le \frac{13.9716}{\frac{1}{2}\log x + 0.6394}\cdot(0.640209x\log x - 0.021095x) - 8.6297x \\
&\le 17.8895x - \frac{11.7332x}{\frac{1}{2}\log x + 0.6394} - 8.6297x \\
&\le (17.8895 - 8.6297)x \le 9.2598x,
\end{aligned}
\]

where $c_+ = 2.0532$ and $c_- = 0.6394$. Therefore,

\[ g(r_0)\cdot\left(\frac{\log(150000 + 1) + c_+}{\log\sqrt{x} + c_-}\cdot S - (\sqrt{J} - \sqrt{E})^2\right) \le 0.041568\cdot 9.2598x \le 0.38492x. \tag{14.37} \]

This is one of the main terms.

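Let $r_1 = (3/8)y^{4/15}$, where, as usual, $y = x/\kappa$ and $\kappa = 49$. Then

\[
\begin{aligned}
R_{y,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9y^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415 \\
&= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{2\left(\frac{1}{3} - \frac{4}{15}\right)\log y + 2\log\frac{9}{2.004\cdot\frac{3}{4}}}\right) + 0.41415 \\
&\le 0.27125\log\left(1 + \frac{\frac{4}{15}}{2\left(\frac{1}{3} - \frac{4}{15}\right)}\right) + 0.41415 \le 0.71215.
\end{aligned}
\tag{14.38}
\]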

Similarly, for $K = (\log y)/2$ (as usual),

\[
\begin{aligned}
R_{y/K,2r_1} &= 0.27125\log\left(1 + \frac{\log\left(8\cdot\frac{3}{8}y^{4/15}\right)}{2\log\frac{9(y/K)^{1/3}}{2.004\cdot\frac{3}{4}y^{4/15}}}\right) + 0.41415 \\
&= 0.27125\log\left(1 + \frac{\frac{4}{15}\log y + \log 3}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{9\cdot 2^{1/3}}{2.004\cdot\frac{3}{4}}}\right) + 0.41415 \\
&= 0.27125\log\left(3 + \frac{\frac{4}{3}\log\log y - c}{\frac{2}{15}\log y - \frac{2}{3}\log\log y + 2\log\frac{12\cdot 2^{1/3}}{2.004}}\right) + 0.41415,
\end{aligned}
\tag{14.39}
\]

where $c = 4\log(12\cdot 2^{1/3}/2.004) - \log 3$. Let

\[ f(t) = \frac{\frac{4}{3}\log t - c}{\frac{2}{15}t - \frac{2}{3}\log t + 2\log\frac{12\cdot 2^{1/3}}{2.004}}. \]

The bisection method with 32 iterations shows that

\[ f(t) \le 0.019562618 \tag{14.40} \]

for $180 \le t \le 30000$; since $f(t) < 0$ for $0 < t < 180$ (by $(4/3)\log t - c < 0$) and since, by $c > 20/3$, we have $f(t) < (5/2)(\log t)/t$ as soon as $t > (\log t)^2$ (and so, in particular, for $t > 30000$), we see that (14.40) is valid for all $t > 0$. Therefore,

\[ R_{y/K,2r_1} \le 0.71392, \tag{14.41} \]
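The maximum of $f$ on $[180, 30000]$ can be approximated by sampling, which gives a floating-point cross-check of (14.40) and (14.41). The grid step below is an arbitrary choice and the function names are illustrative; the text's bound comes from bisection with interval arithmetic.

```python
import math

LOG_TERM = math.log(12 * 2 ** (1 / 3) / 2.004)
C = 4 * LOG_TERM - math.log(3)      # the constant c in (14.39)

def f(t):
    """f(t) = ((4/3) log t - c) / ((2/15) t - (2/3) log t + 2 log(12 * 2^(1/3) / 2.004))."""
    log_t = math.log(t)
    return ((4 / 3) * log_t - C) / ((2 / 15) * t - (2 / 3) * log_t + 2 * LOG_TERM)

def sampled_max_f(lo=180.0, hi=30000.0, step=0.05):
    """Largest sampled value of f on [lo, hi]; not a rigorous bound."""
    n = int(round((hi - lo) / step))
    return max(f(lo + i * step) for i in range(n + 1))

def r_bound_from_f(max_f):
    """0.27125 log(3 + max f) + 0.41415, the last line of (14.39)."""
    return 0.27125 * math.log(3 + max_f) + 0.41415

if __name__ == "__main__":
    m = sampled_max_f()
    print(C, m, r_bound_from_f(m))
```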

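and so, by (14.34), we conclude that

\[ R_{y,K,\varphi,2r_1} \le \frac{0.07455}{3.35976}\cdot 0.71392 + \left(1 - \frac{0.07455}{3.35976}\right)\cdot 0.71215 \le 0.71219. \]

Since $r_1 = (3/8)y^{4/15}$ and $z(r)$ is increasing for $r \ge 27$, we know that

\[
\begin{aligned}
z(r_1) \le z(y^{4/15}) &= e^\gamma\log\log y^{4/15} + \frac{2.50637}{\log\log y^{4/15}} \\
&= e^\gamma\log\log y + \frac{2.50637}{\log\log y - \log\frac{15}{4}} - e^\gamma\log\frac{15}{4} \le e^\gamma\log\log y - 1.43644
\end{aligned}
\tag{14.42}
\]

for $y \ge 10^{25}$. Hence, (11.13) gives us that

\[
\begin{aligned}
L_{2r_1} &\le (e^\gamma\log\log y - 1.43644)\left(\frac{13}{4}\log\frac{3}{4}y^{4/15} + 7.82\right) + 13.66\log\frac{3}{4}y^{4/15} + 37.55 \\
&\le \frac{13}{15}e^\gamma\log y\log\log y + 2.39776\log y + 12.2628\log\log y + 23.7304 \\
&\le (2.13522\log y + 18.118)\log\log y.
\end{aligned}
\]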

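Moreover, again by (14.42),

\[ \sqrt{z(r_1)} \le \sqrt{e^\gamma\log\log y} - \frac{1.43644}{2\sqrt{e^\gamma\log\log y}}, \]

and so, by $y \ge 10^{25}$,

\[
\begin{aligned}
\left(0.71219\log\frac{3}{4}y^{4/15} + 0.5\right)\sqrt{z(r_1)} &\le (0.18992\log y + 0.29512)\left(\sqrt{e^\gamma\log\log y} - \frac{1.43644}{2\sqrt{e^\gamma\log\log y}}\right) \\
&\le 0.19505\log y\sqrt{e^\gamma\log\log y} - \frac{0.19505\cdot 1.43644\log y}{2\sqrt{e^\gamma\log\log y}} \\
&\le 0.26031\log y\sqrt{\log\log y} - 3.00147.
\end{aligned}
\]

Therefore, by (11.19),

\[
\begin{aligned}
g_{y,\varphi}(r_1) &\le \frac{0.26031\log y\sqrt{\log\log y} + 2.5 - 3.00147}{\sqrt{\frac{3}{4}y^{4/15}}} + \frac{(2.13522\log y + 18.118)\log\log y}{\frac{3}{8}y^{4/15}} + \frac{3.36((\log y)/2)^{1/6}}{y^{1/6}} \\
&\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} - \frac{0.57904}{y^{2/15}} + \frac{48.3147\log\log y}{y^{4/15}} + \frac{2.994(\log y)^{1/6}}{y^{1/6}} \\
&\le \frac{0.30059\log y\sqrt{\log\log y}}{y^{2/15}} + \frac{5.69392\log y\log\log y}{y^{4/15}} + \frac{1.30151(\log y)^{1/6}}{y^{1/6}} \\
&\le \frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}},
\end{aligned}
\]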

where we use $y \ge 10^{25}$ and verify that the functions $t \mapsto (\log t)^{1/6}/t^{1/6-2/15}$, $t \mapsto \sqrt{\log\log t}/t^{4/15-2/15}$ and $t \mapsto (\log\log t)/t^{4/15-2/15}$ are decreasing for $t \ge y$ (just by taking derivatives).

Since $\kappa = 49$, one of the terms in (13.13) simplifies easily:

\[ \frac{7}{15} + \frac{-2.14938 + \frac{8}{15}\log\kappa}{\log x + 2c_-} \le \frac{7}{15}. \]

By (14.30) and $y = x/\kappa = x/49$, we conclude that

\[
\begin{aligned}
\frac{7}{15}g(r_1)S &\le \frac{7}{15}\cdot\frac{0.30915\log y\sqrt{\log\log y}}{y^{2/15}}\cdot(0.640209\log x - 0.021095)x \\
&\le \frac{0.14427\log y\sqrt{\log\log y}}{y^{2/15}}(0.640209\log y + 2.4705)x \le 0.30517x,
\end{aligned}
\tag{14.43}
\]

where we are using the fact that $y \mapsto (\log y)^2\sqrt{\log\log y}/y^{2/15}$ is decreasing for $y \ge 10^{25}$ (because $y \mapsto (\log y)^{5/2}/y^{2/15}$ is decreasing for $y \ge e^{75/4}$, and $10^{25} > e^{75/4}$).

It remains only to bound

\[ \frac{2S}{\log x + 2c_-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \]

in the expression (13.13) for $M$. We will use the bound on the integral given in (13.33). The easiest term to bound there is $f_1(r_0)$, defined in (13.34), since it depends only on $r_0$: for $r_0 = 150000$,

\[ f_1(r_0) = 0.0169073\ldots \]

It is also not hard to bound $f_2(r_0, x)$, also defined in (13.34):

\[
\begin{aligned}
f_2(r_0, y) &= 3.36\,\frac{((\log y)/2)^{1/6}}{x^{1/6}}\log\frac{\frac{3}{8}y^{4/15}}{r_0} \\
&\le 3.36\,\frac{(\log y)^{1/6}}{(2y)^{1/6}}\left(\frac{4}{15}\log y + 0.05699 - \log r_0\right),
\end{aligned}
\]

where we recall again that $x = \kappa y = 49y$. Thus, since $r_0 = 150000$ and $y \ge 10^{25}$,

\[ f_2(r_0, y) \le 0.001399. \]

Let us now look at the terms $I_{1,r}$, $c_\varphi$ in (13.35). We already saw in (14.33) that

\[ c_\varphi = \frac{C_{\varphi,2}/|\varphi|_1}{\log K} \le \frac{0.07455}{\log\frac{\log y}{2}} \le 0.02219. \]

Since $F(t) = e^\gamma\log t + c_\gamma$ with $c_\gamma = 1.025742$,

\[ I_{1,r_0} = F(\log r_0) + \frac{2e^\gamma}{\log r_0} = 5.73826\ldots \tag{14.44} \]

It thus remains only to estimate $I_{0,r_0,r_1,z}$ for $z = y$ and $z = y/K$, where $K = (\log y)/2$.

We will first give estimates for $y$ large. Omitting negative terms from (13.35), we easily get the following general bound, crude but useful enough:

\[ I_{0,r_0,r_1,z} \le R_{z,2r_0}^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} + \frac{R_{z,2r_1}^2 - 0.41415^2}{\log\frac{r_1}{r_0}}\,\frac{P_2^-(\log 2r_0)}{\sqrt{r_0}}, \]

where $P_2(t) = t^2 + 4t + 8$ and $P_2^-(t) = 2t^2 + 16t + 48$. By (14.38) and (14.41),

\[ R_{y,2r_1} \le 0.71215, \qquad R_{y/K,2r_1} \le 0.71392 \]

for $y \ge 10^{25}$. Assume now that $y \ge 10^{150}$. Then, since $r_0 = 150000$,

\[ R_{y,r_0} \le 0.27125\log\left(1 + \frac{\log 4r_0}{2\log\frac{9\cdot(10^{150})^{1/3}}{2.004r_0}}\right) + 0.41415 \le 0.43086, \]

and, similarly, $R_{y/K,r_0} \le 0.43113$. Since

\[ 0.43086^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10426, \qquad 0.43113^2\cdot\frac{P_2(\log 2r_0)}{\sqrt{r_0}} \le 0.10439, \]

we obtain that

\[
\begin{aligned}
&(1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \\
&\quad \le 0.97781\cdot\sqrt{0.10426 + \frac{0.49214}{\frac{4}{15}\log y - \log 400000}} + 0.02219\sqrt{0.10439 + \frac{0.49584}{\frac{4}{15}\log y - \log 400000}} \\
&\quad \le 0.33239
\end{aligned}
\tag{14.45}
\]

for $y \ge 10^{150}$.

For $y$ between $10^{25}$ and $10^{150}$, we evaluate the left side of (14.45) directly, using the definition (13.35) of $I_{0,r_0,r_1,z}$ instead, as well as the bound

\[ c_\varphi \le \frac{0.07455}{\log\frac{\log y}{2}} \]

from (14.33). (It is clear from the second and third lines of (13.32) that $I_{0,r_0,r_1,z}$ is decreasing on $z$ for $r_0$, $r_1$ fixed, and so the upper bound for $c_\varphi$ does give the worst case.) The bisection method (applied to the interval $[25, 150]$ with 30 iterations, including 30 initial iterations) gives us that

\[ (1 - c_\varphi)\sqrt{I_{0,r_0,r_1,y}} + c_\varphi\sqrt{I_{0,r_0,r_1,\frac{2y}{\log y}}} \le 0.4153461 \tag{14.46} \]

for $10^{25} \le y \le 10^{150}$. By (14.45), (14.46) is also true for $y > 10^{150}$. Hence

\[ f_0(r_0, y) \le 0.4153461\cdot\sqrt{\frac{2}{\sqrt{r_0}}\,5.73827} \le 0.071498. \]

By (13.33), we conclude that

\[ \int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.071498 + 0.016908 + 0.001399 \le 0.089805. \]

By (14.30),

\[ \frac{2S}{\log x + 2c_-} \le \frac{2(0.640209x\log x - 0.021095x)}{\log x + 2c_-} \le 2\cdot 0.640209x = 1.280418x, \]

where we recall that $c_- = 0.6394 > 0$. Hence

\[ \frac{2S}{\log x + 2c_-}\int_{r_0}^{r_1}\frac{g(r)}{r}\,dr \le 0.114988x. \tag{14.47} \]

Putting (14.37), (14.43) and (14.47) together, we conclude that the quantity $M$ defined in (13.13) is bounded by

\[ M \le 0.38492x + 0.30517x + 0.114988x \le 0.80508x. \tag{14.48} \]

Gathering the terms from (14.29), (14.32) and (14.48), we see that Theorem 13.2.1 states that the minor-arc total

\[ Z_{r_0} = \int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}} |S_{\eta_*}(\alpha, x)||S_{\eta_+}(\alpha, x)|^2\,d\alpha \]

is bounded by

\[
\begin{aligned}
Z_{r_0} &\le \left(\sqrt{|\varphi|_1\frac{x}{\kappa}(M + T)} + \sqrt{S_{\eta_*}(0, x)\cdot E}\right)^2 \\
&\le \left(\sqrt{|\varphi|_1(0.80508 + 3.5776\cdot 10^{-4})}\,\frac{x}{\sqrt{\kappa}} + \sqrt{1.0532\cdot 10^{-11}}\,\frac{x}{\sqrt{\kappa}}\right)^2 \\
&\le 1.00948\,\frac{x^2}{\kappa}
\end{aligned}
\tag{14.49}
\]

for $r_0 = 150000$, $x \ge 4.9\cdot 10^{26}$, where we use yet again the fact that $|\varphi|_1 = \sqrt{\pi/2}$. This is our total minor-arc bound.
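The last steps of the minor-arc estimate are additions and one square, and can be rechecked directly. The sketch below uses the constants from (14.29), (14.32), (14.37), (14.43), (14.47) and (14.27) as given in the text; the function names are illustrative.

```python
import math

def m_over_x():
    """Sum of the three contributions to M/x in (14.48)."""
    return 0.38492 + 0.30517 + 0.114988

def minor_arc_constant(m_bound=0.80508, t_bound=3.5776e-4, se_bound=1.0532e-11):
    """kappa/x^2 times the bound (14.49) on the minor-arc total."""
    main = math.sqrt(math.sqrt(math.pi / 2) * (m_bound + t_bound))
    return (main + math.sqrt(se_bound)) ** 2

def major_minus_minor(major=1.058259, minor=1.00948):
    """kappa/x^2 times the gap between the major-arc and minor-arc bounds."""
    return major - minor

if __name__ == "__main__":
    print(m_over_x(), minor_arc_constant(), major_minus_minor())
```

The positivity of the last quantity is what the proof of the main theorem rests on.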

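14.4 Conclusion: proof of main theorem

As we have known from the start,

\[ \sum_{n_1+n_2+n_3=N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) = \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha, x)^2S_{\eta_*}(\alpha, x)e(-N\alpha)\,d\alpha. \tag{14.50} \]

We have just shown that, assuming $N \ge 10^{27}$, $N$ odd,

\[
\begin{aligned}
\int_{\mathbb{R}/\mathbb{Z}} S_{\eta_+}(\alpha, x)^2S_{\eta_*}(\alpha, x)e(-N\alpha)\,d\alpha &= \int_{\mathfrak{M}_{8,r_0}} S_{\eta_+}(\alpha, x)^2S_{\eta_*}(\alpha, x)e(-N\alpha)\,d\alpha \\
&\quad + O^*\left(\int_{(\mathbb{R}/\mathbb{Z})\setminus\mathfrak{M}_{8,r_0}} |S_{\eta_+}(\alpha, x)|^2|S_{\eta_*}(\alpha, x)|\,d\alpha\right) \\
&\ge 1.058259\,\frac{x^2}{\kappa} + O^*\left(1.00948\,\frac{x^2}{\kappa}\right) \ge 0.04877\,\frac{x^2}{\kappa}
\end{aligned}
\]

for $r_0 = 150000$, where $x = N/(2 + 9/(196\sqrt{2\pi}))$, as in (14.15). (We are using (14.27) and (14.49).) Recall that $\kappa = 49$ and $\eta_*(t) = (\eta_2 *_M \varphi)(\kappa t)$, where $\varphi(t) = t^2e^{-t^2/2}$.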

It only remains to show that the contribution of terms with $n_1$, $n_2$ or $n_3$ non-prime to the sum in (14.50) is negligible. (Let us take out $n_1$, $n_2$, $n_3$ equal to $2$ as well, since some prefer to state the ternary Goldbach conjecture as follows: every odd number $\ge 9$ is the sum of three odd primes.) Clearly

\[
\begin{aligned}
&\sum_{\substack{n_1+n_2+n_3=N\\ n_1, n_2 \text{ or } n_3 \text{ even or non-prime}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) \\
&\quad \le 3|\eta_+|_\infty^2|\eta_*|_\infty\cdot\sum_{\substack{n_1+n_2+n_3=N\\ n_1 \text{ even or non-prime}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3) \\
&\quad \le 3|\eta_+|_\infty^2|\eta_*|_\infty\cdot(\log N)\sum_{\substack{n_1\le N \text{ non-prime}\\ \text{or } n_1 = 2}} \Lambda(n_1)\sum_{n_2\le N}\Lambda(n_2).
\end{aligned}
\tag{14.51}
\]

By (14.3) and (14.21), $|\eta_+|_\infty \le 1.079955$ and $|\eta_*|_\infty \le 1.414$. By [RS62, Thms. 12 and 13],

\[ \sum_{\substack{n_1\le N \text{ non-prime}\\ \text{or } n_1 = 2}} \Lambda(n_1) < 1.4262\sqrt{N} + \log 2 < 1.4263\sqrt{N}, \]

\[ \sum_{\substack{n_1\le N \text{ non-prime}\\ \text{or } n_1 = 2}} \Lambda(n_1)\sum_{n_2\le N}\Lambda(n_2) = 1.4263\sqrt{N}\cdot 1.03883N \le 1.48169N^{3/2}. \]

Hence, the sum on the first line of (14.51) is at most

\[ 7.3306N^{3/2}\log N. \]

Thus, for $N \ge 10^{27}$ odd,

\[
\begin{aligned}
\sum_{\substack{n_1+n_2+n_3=N\\ n_1, n_2, n_3 \text{ odd primes}}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\eta_+(n_1)\eta_+(n_2)\eta_*(n_3) &\ge 0.04877\,\frac{x^2}{\kappa} - 7.3306N^{3/2}\log N \\
&\ge 0.00024433N^2 - 1.4412\cdot 10^{-11}\cdot N^2 \ge 0.0002443N^2
\end{aligned}
\]

by $\kappa = 49$ and (14.15). Since $0.0002443N^2 > 0$, this shows that every odd number $N \ge 10^{27}$ can be written as the sum of three odd primes.

Since the ternary Goldbach conjecture has already been checked for all $N \le 8.875\cdot 10^{30}$ [HP13], we conclude that every odd number $N > 7$ can be written as the sum of three odd primes, and every odd number $N > 5$ can be written as the sum of three primes. The main result is hereby proven: the ternary Goldbach conjecture is true.
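The closing arithmetic of the proof can be rechecked in a few lines. The sketch below uses the constants from (14.3), (14.15), (14.21) and (14.51) as given, works in units of $N^2$, and evaluates the error term at the worst case $N = 10^{27}$; the function names are illustrative.

```python
import math

KAPPA = 49

def x_over_n():
    """x/N = 1 / (2 + 9 / (196 sqrt(2 pi))), as in (14.15)."""
    return 1.0 / (2.0 + 9.0 / (196.0 * math.sqrt(2 * math.pi)))

def main_term_over_n2():
    """0.04877 x^2 / kappa, in units of N^2."""
    return 0.04877 * x_over_n() ** 2 / KAPPA

def non_prime_term_over_n2(n=1e27):
    """7.3306 N^(3/2) log N, in units of N^2."""
    return 7.3306 * math.log(n) / math.sqrt(n)

def final_lower_bound_over_n2(n=1e27):
    return main_term_over_n2() - non_prime_term_over_n2(n)

if __name__ == "__main__":
    print(main_term_over_n2(), non_prime_term_over_n2(), final_lower_bound_over_n2())
```

Since $(\log N)/\sqrt{N}$ is decreasing for $N \ge e^2$, the error term only shrinks for larger $N$.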

        Part IV

        Appendices


        Appendix A

        Norms of smoothing functions

Our aim here is to give bounds on the norms of some smoothing functions, and, in particular, on several norms of a smoothing function $\eta_+ : [0, \infty) \to \mathbb{R}$ based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.

As before, we write

\[ h : t \mapsto \begin{cases} t^2(2 - t)^3e^{t - 1/2} & \text{if } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{A.1} \]

We recall that we will work with an approximation $\eta_+$ to the function $\eta_\circ : [0, \infty) \to \mathbb{R}$ defined by

\[ \eta_\circ(t) = h(t)\,t\,\eta_\heartsuit(t) = \begin{cases} t^3(2 - t)^3e^{-(t-1)^2/2} & \text{for } t \in [0, 2], \\ 0 & \text{otherwise.} \end{cases} \tag{A.2} \]

The approximation $\eta_+$ is defined by

\[ \eta_+(t) = h_H(t)\,te^{-t^2/2}, \tag{A.3} \]

where

\[ F_H(y) = \frac{\sin(H\log y)}{\pi\log y}, \qquad h_H(t) = (h *_M F_H)(t) = \int_0^\infty h(ty^{-1})F_H(y)\,\frac{dy}{y}, \tag{A.4} \]

and $H$ is a positive constant to be set later. By (2.8), $Mh_H = Mh\cdot MF_H$. Now $F_H$ is just a Dirichlet kernel under a change of variables; using this, we get that, for $\tau$ real,

\[ MF_H(i\tau) = \begin{cases} 1 & \text{if } |\tau| < H, \\ 1/2 & \text{if } |\tau| = H, \\ 0 & \text{if } |\tau| > H. \end{cases} \tag{A.5} \]


Thus,

\[ Mh_H(i\tau) = \begin{cases} Mh(i\tau) & \text{if } |\tau| < H, \\ \frac{1}{2}Mh(i\tau) & \text{if } |\tau| = H, \\ 0 & \text{if } |\tau| > H. \end{cases} \tag{A.6} \]

As it turns out, $h$, $\eta_\circ$ and $Mh$ (and hence $Mh_H$) are relatively easy to work with, whereas we can already see that $h_H$ and $\eta_+$ have more complicated definitions. Part of our work will consist in expressing norms of $h_H$ and $\eta_+$ in terms of norms of $h$, $\eta_\circ$ and $Mh$.

A.1 The decay of a Mellin transform

Now, consider any $\phi : [0, \infty) \to \mathbb{C}$ that (a) has compact support (or fast decay), (b) satisfies $\phi^{(k)}(t)t^{k-1} = O(1)$ for $t \to 0^+$ and $0 \le k \le 3$, and (c) is $C^2$ everywhere and quadruply differentiable outside a finite set of points.

By definition,

\[ M\phi(s) = \int_0^\infty \phi(x)x^s\,\frac{dx}{x}. \]

Thus, by integration by parts, for $\Re(s) > -1$ and $s \ne 0$,

\[
\begin{aligned}
M\phi(s) &= \int_0^\infty \phi(x)x^s\,\frac{dx}{x} = \lim_{t\to 0^+}\int_t^\infty \phi(x)x^s\,\frac{dx}{x} = -\lim_{t\to 0^+}\int_t^\infty \phi'(x)\frac{x^s}{s}\,dx \\
&= \lim_{t\to 0^+}\int_t^\infty \phi''(x)\frac{x^{s+1}}{s(s+1)}\,dx = \lim_{t\to 0^+} -\int_t^\infty \phi^{(3)}(x)\frac{x^{s+2}}{s(s+1)(s+2)}\,dx \\
&= \lim_{t\to 0^+}\int_t^\infty \phi^{(4)}(x)\frac{x^{s+3}}{s(s+1)(s+2)(s+3)}\,dx,
\end{aligned}
\tag{A.7}
\]

where $\phi^{(4)}(x)$ is understood in the sense of distributions at the finitely many points where it is not well-defined as a function.

Let $s = it$, $\phi = h$. Let $C_k = \lim_{t\to 0^+}\int_t^\infty |h^{(k)}(x)|x^{k-1}\,dx$ for $0 \le k \le 4$. Then (A.7) gives us that

\[ |Mh(it)| \le \min\left(C_0, \frac{C_1}{|t|}, \frac{C_2}{|t||t+i|}, \frac{C_3}{|t||t+i||t+2i|}, \frac{C_4}{|t||t+i||t+2i||t+3i|}\right). \tag{A.8} \]

We must estimate the constants $C_j$, $0 \le j \le 4$.

Clearly $h(t)t^{-1} = O(1)$ as $t \to 0^+$, $h^{(k)}(t) = O(1)$ as $t \to 0^+$ for all $k \ge 1$, $h(2) = h'(2) = h''(2) = 0$, and $h(x)$, $h'(x)$ and $h''(x)$ are all continuous. The function $h'''$ has a discontinuity at $t = 2$. As we said, we understand $h^{(4)}$ in the sense of distributions at $t = 2$; for example, $\lim_{\epsilon\to 0}\int_{2-\epsilon}^{2+\epsilon} h^{(4)}(t)\,dt = \lim_{\epsilon\to 0}(h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon))$.

Symbolic integration easily gives that

\[ C_0 = \int_0^2 t(2 - t)^3e^{t-1/2}\,dt = 92e^{-1/2} - 12e^{3/2} = 2.02055184\ldots \tag{A.9} \]

We will have to compute $C_k$, $1 \le k \le 4$, with some care, due to the absolute value involved in the definition.

The function $(x^2(2-x)^3 e^{x-1/2})' = \left((x^2(2-x)^3)' + x^2(2-x)^3\right) e^{x-1/2}$ has the same zeros as $H_1(x) = (x^2(2-x)^3)' + x^2(2-x)^3$, namely $-4$, $0$, $1$ and $2$. The sign of $H_1(x)$ (and hence of $h'(x)$) is $+$ within $(0,1)$ and $-$ within $(1,2)$. Hence

\[
C_1 = \int_0^\infty |h'(x)|\, dx = |h(1) - h(0)| + |h(2) - h(1)| = 2h(1) = 2\sqrt{e}.
\tag{A.10}
\]

The situation with $(x^2(2-x)^3 e^{x-1/2})''$ is similar: it has zeros at the roots of $H_2(x) = 0$, where $H_2(x) = H_1(x) + H_1'(x)$ (and, in general, $H_{k+1}(x) = H_k(x) + H_k'(x)$). This time, we will prefer to find the roots numerically. It is enough to find (candidates for) the roots using any available tool¹ and then check rigorously that the sign does change around the purported roots. In this way, we check that $H_2(x) = 0$ has two roots $\alpha_{2,1}$, $\alpha_{2,2}$ in the interval $(0,2)$, another root at $2$, and two more roots outside $[0,2]$; moreover,

\[
\alpha_{2,1} = 0.48756597185712\dots, \qquad \alpha_{2,2} = 1.48777169309489\dots,
\tag{A.11}
\]

where we verify the root using interval arithmetic. The sign of $H_2(x)$ (and hence of $h''(x)$) is first $+$, then $-$, then $+$. Write $\alpha_{2,0} = 0$, $\alpha_{2,3} = 2$. By integration by parts,

\[
\begin{aligned}
C_2 &= \int_0^\infty |h''(x)|\, x\, dx
 = \int_0^{\alpha_{2,1}} h''(x)\, x\, dx - \int_{\alpha_{2,1}}^{\alpha_{2,2}} h''(x)\, x\, dx + \int_{\alpha_{2,2}}^{2} h''(x)\, x\, dx \\
&= \sum_{j=1}^{3} (-1)^{j+1} \left( h'(x)\, x \Big|_{\alpha_{2,j-1}}^{\alpha_{2,j}} - \int_{\alpha_{2,j-1}}^{\alpha_{2,j}} h'(x)\, dx \right) \\
&= 2 \sum_{j=1}^{2} (-1)^{j+1} \left( h'(\alpha_{2,j})\, \alpha_{2,j} - h(\alpha_{2,j}) \right) = 10.79195821037\dots
\end{aligned}
\tag{A.12}
\]

¹ Routine find_root in SAGE was used here.

To compute $C_3$, we proceed in the same way, finding two roots of $H_3(x) = 0$ (numerically) within the interval $(0,2)$, viz.,

\[
\alpha_{3,1} = 1.04294565694978\dots, \qquad \alpha_{3,2} = 1.80999654602916\dots
\]

The sign of $H_3(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Write $\alpha_{3,0} = 0$, $\alpha_{3,3} = 2$. Proceeding as before (with the only difference that the integration by parts is iterated once now), we obtain that

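\[
\begin{aligned}
C_3 &= \int_0^\infty |h'''(x)|\, x^2\, dx = \sum_{j=1}^{3} (-1)^j \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h'''(x)\, x^2\, dx \\
&= \sum_{j=1}^{3} (-1)^j \left( h''(x)\, x^2 \Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} - \int_{\alpha_{3,j-1}}^{\alpha_{3,j}} h''(x) \cdot 2x\, dx \right) \\
&= \sum_{j=1}^{3} (-1)^j \left( h''(x)\, x^2 - h'(x) \cdot 2x + 2h(x) \right) \Big|_{\alpha_{3,j-1}}^{\alpha_{3,j}} \\
&= 2 \sum_{j=1}^{2} (-1)^j \left( h''(\alpha_{3,j})\, \alpha_{3,j}^2 - 2h'(\alpha_{3,j})\, \alpha_{3,j} + 2h(\alpha_{3,j}) \right),
\end{aligned}
\tag{A.13}
\]

and so interval arithmetic gives us

\[
C_3 = 75.1295251672\dots
\tag{A.14}
\]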
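The root-finding and integration steps for $C_0,\dots,C_3$ can be sketched in a few lines of Python. This is only a floating-point illustration, not the interval-arithmetic verification the text relies on: it assumes $h(x) = x^2(2-x)^3 e^{x-1/2}$ on $[0,2]$ (as read off from the formulas above), uses plain bisection with hand-picked brackets, and approximates the integrals with a midpoint rule. The helper names (`deriv`, `ev`, `bisect_root`, `C`) are hypothetical.

```python
import math

# p(x) = x^2 (2 - x)^3 = 8x^2 - 12x^3 + 6x^4 - x^5, lowest degree first,
# so that h(x) = p(x) * exp(x - 1/2) on [0, 2] (assumed form of h).
P = [0.0, 0.0, 8.0, -12.0, 6.0, -1.0]


def deriv(c):
    """Derivative of a polynomial, padded to the same length."""
    return [k * c[k] for k in range(1, len(c))] + [0.0]


def ev(c, x):
    """Horner evaluation of the polynomial c at x."""
    r = 0.0
    for a in reversed(c):
        r = r * x + a
    return r


# H[0] = p and H[k+1] = H[k] + H[k]', hence h^(k)(x) = H[k](x) * exp(x - 1/2).
H = [P]
for _ in range(3):
    H.append([a + b for a, b in zip(H[-1], deriv(H[-1]))])


def bisect_root(c, lo, hi, iters=60):
    """Plain floating-point bisection; assumes c changes sign on [lo, hi]."""
    f_lo = ev(c, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = ev(c, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def C(k, n=100_000):
    """Midpoint rule for the integral of |h^(k)(x)| x^(k-1) over (0, 2)."""
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += abs(ev(H[k], x)) * math.exp(x - 0.5) * x ** (k - 1)
    return total * step


# Brackets chosen by hand around the roots quoted in the text (an assumption).
roots = {
    2: [bisect_root(H[2], 0.4, 0.6), bisect_root(H[2], 1.4, 1.6)],
    3: [bisect_root(H[3], 1.0, 1.1), bisect_root(H[3], 1.7, 1.9)],
}
```

Calling `C(2)` or `C(3)` integrates the absolute value directly, so it does not need the sign pattern of $H_k$; the closed-form sums in (A.12) and (A.13) serve as an independent cross-check.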

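The treatment of the integral in $C_4$ is very similar, at least at first. There are two roots of $H_4(x) = 0$ in the interval $(0,2)$, namely,

\[
\alpha_{4,1} = 0.45839599852663\dots, \qquad \alpha_{4,2} = 1.54626346975533\dots
\]

The sign of $H_4(x)$ on the interval $[0,2]$ is first $-$, then $+$, then $-$. Using integration by parts as before, we obtain

\[
\begin{aligned}
\int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\, dx
&= -\int_{0^+}^{\alpha_{4,1}} h^{(4)}(x)\, x^3\, dx + \int_{\alpha_{4,1}}^{\alpha_{4,2}} h^{(4)}(x)\, x^3\, dx - \int_{\alpha_{4,2}}^{2^-} h^{(4)}(x)\, x^3\, dx \\
&= 2 \sum_{j=1}^{2} (-1)^j \left( h^{(3)}(\alpha_{4,j})\, \alpha_{4,j}^3 - 3h^{(2)}(\alpha_{4,j})\, \alpha_{4,j}^2 + 6h'(\alpha_{4,j})\, \alpha_{4,j} - 6h(\alpha_{4,j}) \right) - \lim_{t\to 2^-} h^{(3)}(t)\, t^3 \\
&= 1152.69754862\dots,
\end{aligned}
\]

since $\lim_{t\to 0^+} h^{(k)}(t)\, t^k = 0$ for $0 \le k \le 3$, $\lim_{t\to 2^-} h^{(k)}(t) = 0$ for $0 \le k \le 2$ and $\lim_{t\to 2^-} h^{(3)}(t) = -24e^{3/2}$. Now

\[
\int_{2^-}^{\infty} \left|h^{(4)}(x)\, x^3\right| dx = \lim_{\epsilon\to 0^+} \left|h^{(3)}(2+\epsilon) - h^{(3)}(2-\epsilon)\right| \cdot 2^3 = 2^3 \cdot 24 e^{3/2}.
\]

Hence

\[
C_4 = \int_{0^+}^{2^-} \left|h^{(4)}(x)\right| x^3\, dx + 24 e^{3/2} \cdot 2^3 = 2013.18185012\dots
\tag{A.15}
\]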

We finish by remarking that we can write down $Mh$ explicitly:

\[
Mh = -e^{-1/2}(-1)^{-s}\left( 8\gamma(s+2,-2) + 12\gamma(s+3,-2) + 6\gamma(s+4,-2) + \gamma(s+5,-2) \right),
\tag{A.16}
\]

where $\gamma(s,x)$ is the (lower) incomplete Gamma function

\[
\gamma(s,x) = \int_0^x e^{-t}\, t^{s-1}\, dt.
\]

We will, however, find it easier to deal with $Mh$ by means of the bound (A.8), in part because (A.16) amounts to an invitation to numerical instability.

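For instance, it is easy to use (A.8) to give a bound for the $\ell_1$-norm of $Mh(it)$. Since $C_4/C_3 > C_3/C_2 > C_2/C_1 > C_1/C_0$,

\[
\begin{aligned}
|Mh(it)|_1 &= 2\int_0^\infty |Mh(it)|\, dt \\
&\le 2\left( C_0 \cdot \frac{C_1}{C_0} + C_1 \int_{C_1/C_0}^{C_2/C_1} \frac{dt}{t} + C_2 \int_{C_2/C_1}^{C_3/C_2} \frac{dt}{t^2} + C_3 \int_{C_3/C_2}^{C_4/C_3} \frac{dt}{t^3} + C_4 \int_{C_4/C_3}^{\infty} \frac{dt}{t^4} \right) \\
&= 2\left( C_1 + C_1 \log\frac{C_2 C_0}{C_1^2} + C_2\left(\frac{C_1}{C_2} - \frac{C_2}{C_3}\right) + \frac{C_3}{2}\left(\frac{C_2^2}{C_3^2} - \frac{C_3^2}{C_4^2}\right) + \frac{C_4}{3} \cdot \frac{C_3^3}{C_4^3} \right),
\end{aligned}
\]

and so

\[
|Mh(it)|_1 \le 16.1939176.
\tag{A.17}
\]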
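As a sanity check on the arithmetic behind (A.17), the closed-form expression can be evaluated directly. The sketch below simply plugs in the values of $C_0,\dots,C_4$ quoted in (A.9), (A.10), (A.12), (A.14) and (A.15); the function name `l1_bound` is hypothetical, and ordinary floating point stands in for interval arithmetic.

```python
import math

# Constants as quoted in the text; C0 and C1 use their closed forms.
C0 = 92 * math.exp(-0.5) - 12 * math.exp(1.5)
C1 = 2 * math.sqrt(math.e)
C2 = 10.79195821037
C3 = 75.1295251672
C4 = 2013.18185012


def l1_bound(c0, c1, c2, c3, c4):
    """Closed-form value of the piecewise integral bounding |Mh(it)|_1."""
    return 2 * (
        c1
        + c1 * math.log(c2 * c0 / c1**2)
        + c2 * (c1 / c2 - c2 / c3)
        + (c3 / 2) * (c2**2 / c3**2 - c3**2 / c4**2)
        + (c4 / 3) * c3**3 / c4**3
    )


# The breakpoints C_{k+1}/C_k must be increasing for the piecewise
# splitting of the minimum in (A.8) to be the one used in the text.
breakpoints = [C1 / C0, C2 / C1, C3 / C2, C4 / C3]
value = l1_bound(C0, C1, C2, C3, C4)
```

Here each piece of the integral uses the simplification $|t + ki| \ge |t|$, which is what turns the bound (A.8) into pure powers of $t$.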

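This bound is far from tight, but it will certainly be useful. Similarly, $|(t+i)Mh(it)|_1$ is at most two times

\[
\begin{aligned}
& C_0 \int_0^{C_1/C_0} |t+i|\, dt + C_1 \int_{C_1/C_0}^{C_2/C_1} \left|1 + \frac{i}{t}\right| dt + C_2 \int_{C_2/C_1}^{C_3/C_2} \frac{dt}{t} + C_3 \int_{C_3/C_2}^{C_4/C_3} \frac{dt}{t^2} + C_4 \int_{C_4/C_3}^{\infty} \frac{dt}{t^3} \\
&= \frac{C_0}{2}\left( \sqrt{\frac{C_1^4}{C_0^4} + \frac{C_1^2}{C_0^2}} + \sinh^{-1}\frac{C_1}{C_0} \right) + C_1 \left( \sqrt{t^2+1} + \log\left(\frac{\sqrt{t^2+1} - 1}{t}\right) \right) \Bigg|_{C_1/C_0}^{C_2/C_1} \\
&\quad + C_2 \log\frac{C_3 C_1}{C_2^2} + C_3\left(\frac{C_2}{C_3} - \frac{C_3}{C_4}\right) + \frac{C_4}{2} \cdot \frac{C_3^2}{C_4^2},
\end{aligned}
\]

and so

\[
|(t+i)Mh(it)|_1 \le 27.8622803.
\tag{A.18}
\]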

A.2 The difference $\eta_+ - \eta$ in $\ell_2$ norm

We wish to estimate the distance in $\ell_2$ norm between $\eta$ and its approximation $\eta_+$. This will be an easy affair, since, on the imaginary axis, the Mellin transform of $\eta_+$ is just a truncation of the Mellin transform of $\eta$.

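By (A.2) and (A.3),

\[
\begin{aligned}
|\eta_+ - \eta|_2^2 &= \int_0^\infty \left| h_H(t)\, t e^{-t^2/2} - h(t)\, t e^{-t^2/2} \right|^2 dt \\
&\le \left( \max_{t\ge 0} e^{-t^2} t^3 \right) \cdot \int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t}.
\end{aligned}
\tag{A.19}
\]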

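The maximum $\max_{t\ge 0} t^3 e^{-t^2}$ is $(3/2)^{3/2} e^{-3/2}$. Since the Mellin transform is an isometry (i.e., (2.6) holds),

\[
\int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mh_H(it) - Mh(it)|^2\, dt = \frac{1}{\pi} \int_H^\infty |Mh(it)|^2\, dt.
\tag{A.20}
\]

By (A.8),

\[
\int_H^\infty |Mh(it)|^2\, dt \le \int_H^\infty \frac{C_4^2}{t^8}\, dt \le \frac{C_4^2}{7H^7}.
\tag{A.21}
\]

Hence

\[
\int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t} \le \frac{C_4^2}{7\pi H^7}.
\tag{A.22}
\]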

Using the bound (A.15) for $C_4$, we conclude that

\[
|\eta_+ - \eta|_2 \le \frac{C_4}{\sqrt{7\pi}} \left(\frac{3}{2e}\right)^{3/4} \cdot \frac{1}{H^{7/2}} \le \frac{274.856893}{H^{7/2}}.
\tag{A.23}
\]
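The constant in (A.23) is just the product of $C_4/\sqrt{7\pi}$ with the square root of $\max_{t\ge 0} t^3 e^{-t^2}$. A minimal floating-point check, taking the value of $C_4$ quoted in (A.15) as given (the variable names are hypothetical):

```python
import math

C4 = 2013.18185012  # value quoted in (A.15)

# max_{t >= 0} t^3 e^{-t^2} is attained at t = sqrt(3/2).
t_star = math.sqrt(1.5)
max_weight = t_star**3 * math.exp(-t_star**2)

# Square root of the maximum, in the form used in (A.23).
weight_factor = (3 / (2 * math.e)) ** 0.75

# Constant multiplying H^{-7/2} in (A.23).
constant = C4 / math.sqrt(7 * math.pi) * weight_factor
```

The identity $\sqrt{(3/2)^{3/2} e^{-3/2}} = (3/(2e))^{3/4}$ is what lets (A.19) and (A.22) combine into the single factor above.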

It will also be useful to bound

\[
\left| \int_0^\infty (\eta_+(t) - \eta(t))^2 \log t\, dt \right|.
\]

This is at most

\[
\left( \max_{t\ge 0} e^{-t^2} t^3 |\log t| \right) \cdot \int_0^\infty |h_H(t) - h(t)|^2\, \frac{dt}{t}.
\]

Now

\[
\max_{t\ge 0} e^{-t^2} t^3 |\log t| = \max\left( \max_{t\in[0,1]} e^{-t^2} t^3 (-\log t),\ \max_{t\in[1,5]} e^{-t^2} t^3 \log t \right) = 0.14882234545\dots,
\]

where we find the maximum by the bisection method with 40 iterations (see §2.6). Hence, by (A.22),

\[
\int_0^\infty (\eta_+(t) - \eta(t))^2 |\log t|\, dt \le 0.148822346 \cdot \frac{C_4^2}{7\pi H^7} \le \frac{27427.502}{H^7} \le \left( \frac{165.61251}{H^{7/2}} \right)^2.
\tag{A.24}
\]
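The maximum in (A.24) can be approximated with a plain grid search. The sketch below is a non-rigorous stand-in for the interval-arithmetic bisection of §2.6: it samples $f(t) = e^{-t^2} t^3 |\log t|$ on $[0,1]$ and $[1,5]$, then forms the constant in (A.24) from the quoted value of $C_4$. The names `f` and `grid_max` are hypothetical.

```python
import math

C4 = 2013.18185012  # value quoted in (A.15)


def f(t):
    """e^{-t^2} t^3 |log t|, extended by 0 at t = 0."""
    if t <= 0:
        return 0.0
    return math.exp(-t * t) * t**3 * abs(math.log(t))


def grid_max(a, b, n=200_000):
    """Largest sampled value of f on a uniform grid over [a, b]."""
    step = (b - a) / n
    return max(f(a + i * step) for i in range(n + 1))


# The text splits the search at t = 1, where |log t| changes branch.
f_max = max(grid_max(0.0, 1.0), grid_max(1.0, 5.0))

# Constant multiplying H^{-7} in (A.24), using the rounded-up maximum.
constant = 0.148822346 * C4**2 / (7 * math.pi)
```

A grid search only gives a lower estimate of the true maximum; the text's rounded-up value 0.148822346 is what provides the safety margin.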

A.3 Norms involving $\eta_+$

Let us now bound some $\ell_1$- and $\ell_2$-norms involving $\eta_+$. Relatively crude bounds will suffice in most cases.

First, by (A.23),

\[
\begin{aligned}
|\eta_+|_2 &\le |\eta|_2 + |\eta_+ - \eta|_2 \le 0.800129 + \frac{274.8569}{H^{7/2}}, \\
|\eta_+|_2 &\ge |\eta|_2 - |\eta_+ - \eta|_2 \ge 0.800128 - \frac{274.8569}{H^{7/2}},
\end{aligned}
\tag{A.25}
\]

where we obtain

\[
|\eta|_2 = \sqrt{0.640205997\dots} = 0.8001287\dots
\tag{A.26}
\]

by symbolic integration.

Let us now bound $|\eta_+ \cdot \log|_2^2$. By isometry and (2.10),

\[
|\eta_+ \cdot \log|_2^2 = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} |M(\eta_+ \cdot \log)(s)|^2\, ds = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M\eta_+)'(s)|^2\, ds.
\]

Now, $(M\eta_+)'(1/2 + it)$ equals $1/2\pi$ times the additive convolution of $Mh_H(it)$ and $(M\eta_\diamondsuit)'(1/2 + it)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Hence, by Young's inequality,

\[
|(M\eta_+)'(1/2 + it)|_2 \le \frac{1}{2\pi} |Mh_H(it)|_1\, |(M\eta_\diamondsuit)'(1/2 + it)|_2.
\]

Again by isometry and (2.10),

\[
|(M\eta_\diamondsuit)'(1/2 + it)|_2 = \sqrt{2\pi}\, |\eta_\diamondsuit \cdot \log|_2.
\]

Hence, by (A.17),

\[
|\eta_+ \cdot \log|_2 \le \frac{1}{2\pi} |Mh_H(it)|_1\, |\eta_\diamondsuit \cdot \log|_2 \le 2.5773421 \cdot |\eta_\diamondsuit \cdot \log|_2.
\]

Since, by symbolic integration,

\[
|\eta_\diamondsuit \cdot \log|_2 \le \sqrt{ \frac{\sqrt{\pi}}{32} \left( 8(\log 2)^2 + 2\gamma^2 + \pi^2 + 8(\gamma - 2)\log 2 - 8\gamma \right) } \le 0.3220301,
\tag{A.27}
\]

we get that

\[
|\eta_+ \cdot \log|_2 \le 0.8299818.
\tag{A.28}
\]
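The numbers in (A.27) and (A.28) follow from direct evaluation. The sketch below assumes that $\gamma$ in (A.27) is Euler's constant (hard-coded, since Python's `math` module does not provide it) and takes the bound 16.1939176 from (A.17) as given; the variable names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded
LOG2 = math.log(2)

# Bracket inside the square root of (A.27).
inner = (
    8 * LOG2**2
    + 2 * EULER_GAMMA**2
    + math.pi**2
    + 8 * (EULER_GAMMA - 2) * LOG2
    - 8 * EULER_GAMMA
)
norm_diamond_log = math.sqrt(math.sqrt(math.pi) / 32 * inner)

# Factor (1/2pi) |Mh_H(it)|_1, bounded through (A.17).
young_factor = 16.1939176 / (2 * math.pi)

# Right-hand side of (A.28), using the rounded-up constants of the text.
bound = 2.5773421 * 0.3220301
```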

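Let us bound $|\eta_+(t)\, t^\sigma|_1$ for $\sigma \in (-2, \infty)$. By Cauchy-Schwarz and Plancherel,

\[
\begin{aligned}
|\eta_+(t)\, t^\sigma|_1 &= \left| h_H(t)\, t^{1+\sigma} e^{-t^2/2} \right|_1 \le \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \left| h_H(t)/\sqrt{t} \right|_2 \\
&= \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \sqrt{ \int_0^\infty |h_H(t)|^2\, \frac{dt}{t} }
 = \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |Mh(ir)|^2\, dr } \\
&\le \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |Mh(ir)|^2\, dr }
 = \left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 \cdot \left| h(t)/\sqrt{t} \right|_2.
\end{aligned}
\tag{A.29}
\]

Since

\[
\left| t^{\sigma+3/2} e^{-t^2/2} \right|_2 = \sqrt{ \int_0^\infty e^{-t^2} t^{2\sigma+3}\, dt } = \sqrt{ \frac{\Gamma(\sigma+2)}{2} },
\qquad
\left| h(t)/\sqrt{t} \right|_2 = \sqrt{ \frac{31989}{8e} - \frac{585 e^3}{8} } \le 1.5023459,
\]

we conclude that

\[
|\eta_+(t)\, t^\sigma|_1 \le 1.062319 \cdot \sqrt{\Gamma(\sigma+2)}
\tag{A.30}
\]

for $\sigma > -2$.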
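The closed form for $|h(t)/\sqrt{t}|_2$ involves heavy cancellation between two terms of size about 1470, so a numerical cross-check is reassuring. The sketch below assumes $h(t) = t^2(2-t)^3 e^{t-1/2}$ on $[0,2]$, compares the closed form with a midpoint-rule integral of $|h(t)|^2/t$, and evaluates the bound (A.30) for a sample $\sigma$. The names `h_sq_over_t`, `midpoint` and `bound_A30` are hypothetical.

```python
import math


def h_sq_over_t(t):
    """|h(t)|^2 / t for the assumed h(t) = t^2 (2 - t)^3 e^{t - 1/2}."""
    return t**3 * (2 - t) ** 6 * math.exp(2 * t - 1)


def midpoint(func, a, b, n=100_000):
    """Composite midpoint rule on [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))


closed_form_sq = 31989 / (8 * math.e) - 585 * math.e**3 / 8
numeric_sq = midpoint(h_sq_over_t, 0.0, 2.0)

h_norm = math.sqrt(closed_form_sq)   # |h(t)/sqrt(t)|_2
constant = h_norm / math.sqrt(2)     # constant in (A.30)


def bound_A30(sigma):
    """Right-hand side of (A.30); requires sigma > -2."""
    return 1.062319 * math.sqrt(math.gamma(sigma + 2))
```

For example, `bound_A30(0.0)` returns 1.062319, since $\Gamma(2) = 1$.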

A.4 Norms involving $\eta_+'$

By one of the standard transformation rules (see (2.10)), the Mellin transform of $\eta_+'$ equals $-(s-1) \cdot M\eta_+(s-1)$. Since the Mellin transform is an isometry in the sense of (2.6),

\[
|\eta_+'|_2^2 = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \left| M(\eta_+')(s) \right|^2 ds = \frac{1}{2\pi i} \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s \cdot M\eta_+(s)|^2\, ds.
\]

Recall that $\eta_+(t) = h_H(t)\, \eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Thus, by (2.9), the function $M\eta_+(-1/2 + it)$ equals $1/2\pi$ times the (additive) convolution of $Mh_H(it)$ and $M\eta_\diamondsuit(-1/2 + it)$. Therefore, for $s = -1/2 + it$,

\[
\begin{aligned}
|s|\, |M\eta_+(s)| &= \frac{|s|}{2\pi} \left| \int_{-H}^{H} Mh(ir)\, M\eta_\diamondsuit(s - ir)\, dr \right| \\
&\le \frac{3}{2\pi} \int_{-H}^{H} |ir - 1|\, |Mh(ir)| \cdot |s - ir|\, |M\eta_\diamondsuit(s - ir)|\, dr \\
&= \frac{3}{2\pi} (f * g)(t),
\end{aligned}
\tag{A.31}
\]

where $f(t) = |it - 1|\, |Mh(it)|$ and $g(t) = |-1/2 + it|\, |M\eta_\diamondsuit(-1/2 + it)|$. (Since $|(-1/2 + i(t-r)) + (1 + ir)| = |1/2 + it| = |s|$, either $|-1/2 + i(t-r)| \ge |s|/3$ or $|1 + ir| \ge 2|s|/3$; hence $|s - ir|\, |ir - 1| = |-1/2 + i(t-r)|\, |1 + ir| \ge |s|/3$.) By Young's inequality (in a special case that follows from Cauchy-Schwarz), $|f * g|_2 \le |f|_1 |g|_2$. By (A.18),

\[
|f|_1 = |(r + i)Mh(ir)|_1 \le 27.8622803.
\]

Yet again by Plancherel,

\[
|g|_2^2 = \int_{-\frac12 - i\infty}^{-\frac12 + i\infty} |s|^2\, |M\eta_\diamondsuit(s)|^2\, ds = \int_{\frac12 - i\infty}^{\frac12 + i\infty} |(M(\eta_\diamondsuit'))(s)|^2\, ds = 2\pi |\eta_\diamondsuit'|_2^2 = \frac{3\pi^{3/2}}{4}.
\]

Hence

\[
|\eta_+'|_2 \le \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{2\pi} |f * g|_2 \le \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{2\pi} \cdot 27.8622803\, \sqrt{\frac{3\pi^{3/2}}{4}} \le 10.845789.
\tag{A.32}
\]

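Let us now bound $|\eta_+'(t)\, t^\sigma|_1$ for $\sigma \in (-1, \infty)$. First of all,

\[
\begin{aligned}
|\eta_+'(t)\, t^\sigma|_1 &= \left| \left( h_H(t)\, t e^{-t^2/2} \right)' t^\sigma \right|_1 \\
&\le \left| \left( h_H'(t)\, t e^{-t^2/2} + h_H(t)(1 - t^2) e^{-t^2/2} \right) \cdot t^\sigma \right|_1 \\
&\le \left| h_H'(t)\, t^{\sigma+1} e^{-t^2/2} \right|_1 + |\eta_+(t)\, t^{\sigma-1}|_1 + |\eta_+(t)\, t^{\sigma+1}|_1.
\end{aligned}
\]

We can bound the last two terms by (A.30). Much as in (A.29), we note that

\[
\left| h_H'(t)\, t^{\sigma+1} e^{-t^2/2} \right|_1 \le \left| t^{\sigma+1/2} e^{-t^2/2} \right|_2 \left| h_H'(t)\sqrt{t} \right|_2,
\]

and then see that

\[
\begin{aligned}
\left| h_H'(t)\sqrt{t} \right|_2 &= \sqrt{ \int_0^\infty |h_H'(t)|^2\, t\, dt } = \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |M(h_H')(1 + ir)|^2\, dr } \\
&= \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |(-ir)Mh_H(ir)|^2\, dr } = \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |(-ir)Mh(ir)|^2\, dr } \\
&= \sqrt{ \frac{1}{2\pi} \int_{-H}^{H} |M(h')(1 + ir)|^2\, dr } \le \sqrt{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |M(h')(1 + ir)|^2\, dr } = \left| h'(t)\sqrt{t} \right|_2,
\end{aligned}
\]

where we use the first rule in (2.10) twice. Since

\[
\left| t^{\sigma+1/2} e^{-t^2/2} \right|_2 = \sqrt{ \frac{\Gamma(\sigma+1)}{2} },
\qquad
\left| h'(t)\sqrt{t} \right|_2 = \sqrt{ \frac{103983}{16e} - \frac{1899 e^3}{16} } = 2.6312226\dots,
\]

we conclude that

\[
\begin{aligned}
|\eta_+'(t)\, t^\sigma|_1 &\le 1.062319 \cdot \left( \sqrt{\Gamma(\sigma+1)} + \sqrt{\Gamma(\sigma+3)} \right) + \sqrt{ \frac{\Gamma(\sigma+1)}{2} } \cdot 2.6312226 \\
&\le 2.922875 \sqrt{\Gamma(\sigma+1)} + 1.062319 \sqrt{\Gamma(\sigma+3)}
\end{aligned}
\tag{A.33}
\]

for $\sigma > -1$.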

A.5 The $\ell_\infty$-norm of $\eta_+$

Let us now get a bound for $|\eta_+|_\infty$. Recall that $\eta_+(t) = h_H(t)\, \eta_\diamondsuit(t)$, where $\eta_\diamondsuit(t) = t e^{-t^2/2}$. Clearly

\[
\begin{aligned}
|\eta_+|_\infty = |h_H(t)\, \eta_\diamondsuit(t)|_\infty &\le |\eta|_\infty + |(h(t) - h_H(t))\, \eta_\diamondsuit(t)|_\infty \\
&\le |\eta|_\infty + \left| \frac{h(t) - h_H(t)}{t} \right|_\infty |\eta_\diamondsuit(t)\, t|_\infty.
\end{aligned}
\tag{A.34}
\]

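Taking derivatives, we easily see that

\[
|\eta|_\infty = \eta(1) = 1, \qquad |\eta_\diamondsuit(t)\, t|_\infty = 2/e.
\]

It remains to bound $|(h(t) - h_H(t))/t|_\infty$. By (7.6),

\[
h_H(t) = \int_{t/2}^{\infty} h(t y^{-1})\, \frac{\sin(H \log y)}{\pi \log y}\, \frac{dy}{y} = \int_{-H \log\frac{2}{t}}^{\infty} h\!\left( \frac{t}{e^{w/H}} \right) \frac{\sin w}{\pi w}\, dw.
\tag{A.35}
\]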

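The sine integral

\[
\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt
\]

is defined for all $x$; it tends to $\pi/2$ as $x \to +\infty$ and to $-\pi/2$ as $x \to -\infty$ (see [AS64, (5.2.25)]). We apply integration by parts to the second integral in (A.35), and obtain

\[
\begin{aligned}
h_H(t) - h(t) &= -\frac{1}{\pi} \int_{-H \log\frac{2}{t}}^{\infty} \left( \frac{d}{dw} h\!\left( \frac{t}{e^{w/H}} \right) \right) \mathrm{Si}(w)\, dw - h(t) \\
&= -\frac{1}{\pi} \int_0^\infty \left( \frac{d}{dw} h\!\left( \frac{t}{e^{w/H}} \right) \right) \left( \mathrm{Si}(w) - \frac{\pi}{2} \right) dw \\
&\quad - \frac{1}{\pi} \int_{-H \log\frac{2}{t}}^{0} \left( \frac{d}{dw} h\!\left( \frac{t}{e^{w/H}} \right) \right) \left( \mathrm{Si}(w) + \frac{\pi}{2} \right) dw.
\end{aligned}
\]

Now

\[
\left| \frac{d}{dw} h\!\left( \frac{t}{e^{w/H}} \right) \right| = \frac{t e^{-w/H}}{H} \left| h'\!\left( \frac{t}{e^{w/H}} \right) \right| \le \frac{t\, |h'|_\infty}{H e^{w/H}}.
\]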

Integration by parts easily yields the bounds $|\mathrm{Si}(x) - \pi/2| < 2/x$ for $x > 0$ and $|\mathrm{Si}(x) + \pi/2| < 2/|x|$ for $x < 0$; we also know that $0 \le \mathrm{Si}(x) \le x < \pi/2$ for $x \in [0,1]$ and $-\pi/2 < x \le \mathrm{Si}(x) \le 0$ for $x \in [-1,0]$. Hence

\[
|h_H(t) - h(t)| \le \frac{2t\, |h'|_\infty}{\pi H} \left( \int_0^1 \frac{\pi}{2} e^{-w/H}\, dw + \int_1^\infty \frac{2 e^{-w/H}}{w}\, dw \right) = t\, |h'|_\infty \cdot \left( (1 - e^{-1/H}) + \frac{4}{\pi} \frac{E_1(1/H)}{H} \right),
\]

where $E_1$ is the exponential integral

\[
E_1(z) = \int_z^\infty \frac{e^{-t}}{t}\, dt.
\]

By [AS64, (5.1.20)],

\[
0 < E_1(1/H) < \frac{\log(H+1)}{e^{1/H}},
\]

and, since $\log(H+1) = \log H + \log(1 + 1/H) < \log H + 1/H < (\log H)(1 + 1/H) < (\log H)\, e^{1/H}$ for $H \ge e$, we see that this gives us that $E_1(1/H) < \log H$ (again for $H \ge e$, as is the case). Hence

\[
\frac{|h_H(t) - h(t)|}{t} < |h'|_\infty \cdot \left( 1 - e^{-1/H} + \frac{4}{\pi} \frac{\log H}{H} \right) < |h'|_\infty \cdot \frac{1 + \frac{4}{\pi} \log H}{H},
\tag{A.36}
\]

and so, by (A.34),

\[
|\eta_+|_\infty \le 1 + \frac{2}{e} \left| \frac{h(t) - h_H(t)}{t} \right|_\infty < 1 + \frac{2}{e}\, |h'|_\infty \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.
\]

By (A.11) and interval arithmetic, we determine that

\[
|h'|_\infty = |h'(\alpha_{2,2})| \le 2.805820379671,
\tag{A.37}
\]

where $\alpha_{2,2}$ is a root of $h''(x) = 0$ as in (A.11). We have proven

\[
|\eta_+|_\infty < 1 + \frac{2}{e} \cdot 2.80582038 \cdot \frac{1 + \frac{4}{\pi} \log H}{H} < 1 + 2.06440727 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.
\tag{A.38}
\]

We will need three other bounds of this kind, namely, for $\eta_+(t) \log t$, $\eta_+(t)/t$ and $\eta_+(t)\, t$. We start as in (A.34):

\[
\begin{aligned}
|\eta_+ \log t|_\infty &\le |\eta \log t|_\infty + |(h(t) - h_H(t))\, \eta_\diamondsuit(t) \log t|_\infty \\
&\le |\eta \log t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)\, t \log t|_\infty, \\
|\eta_+(t)/t|_\infty &\le |\eta(t)/t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)|_\infty, \\
|\eta_+(t)\, t|_\infty &\le |\eta(t)\, t|_\infty + |(h - h_H(t))/t|_\infty\, |\eta_\diamondsuit(t)\, t^2|_\infty.
\end{aligned}
\tag{A.39}
\]

By the bisection method with 30 iterations, implemented with interval arithmetic,

\[
|\eta(t) \log t|_\infty \le 0.279491, \qquad |\eta_\diamondsuit(t)\, t \log t|_\infty \le 0.3811561.
\]

Hence, by (A.36) and (A.37),

\[
|\eta_+ \log t|_\infty \le 0.279491 + 1.069456 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.
\tag{A.40}
\]

By the bisection method with 32 iterations,

\[
|\eta(t)/t|_\infty \le 1.08754396.
\]

(We can also obtain this by solving $(\eta(t)/t)' = 0$ symbolically.) It is easy to show that $|\eta_\diamondsuit|_\infty = 1/\sqrt{e}$. Hence, again by (A.36) and (A.37),

\[
|\eta_+(t)/t|_\infty \le 1.08754396 + 1.70181609 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.
\tag{A.41}
\]

By the bisection method with 32 iterations,

\[
|\eta(t)\, t|_\infty \le 1.06473476.
\]

Taking derivatives, we see that $|\eta_\diamondsuit(t)\, t^2|_\infty = 3^{3/2} e^{-3/2}$. Hence, yet again by (A.36) and (A.37),

\[
|\eta_+(t)\, t|_\infty \le 1.06473476 + 3.25312 \cdot \frac{1 + \frac{4}{\pi} \log H}{H}.
\tag{A.42}
\]
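The constants in (A.37) through (A.42) are products of $|h'|_\infty$ with elementary suprema, so they are easy to recompute. The sketch below is a floating-point illustration under the assumptions $h(t) = t^2(2-t)^3 e^{t-1/2}$ on $[0,2]$ and $\eta(t) = h(t)\, t e^{-t^2/2}$; the helper names are hypothetical, and a grid search replaces the rigorous bisection of the text.

```python
import math


def h_prime(x):
    """h'(x) = H_1(x) e^{x - 1/2}, with H_1 in the factored form given by its zeros."""
    return -x * (2 - x) ** 2 * (x + 4) * (x - 1) * math.exp(x - 0.5)


ALPHA_22 = 1.48777169309489  # root of h'' quoted in (A.11)
h_prime_sup = abs(h_prime(ALPHA_22))  # |h'|_infty, cf. (A.37)

# Multipliers of (1 + (4/pi) log H) / H in (A.38), (A.40), (A.41), (A.42),
# computed from the rounded-up value 2.80582038 used in the text.
HP = 2.80582038
c_A38 = (2 / math.e) * HP
c_A40 = 0.3811561 * HP
c_A41 = HP / math.sqrt(math.e)
c_A42 = 3**1.5 * math.exp(-1.5) * HP


def eta_over_t(t):
    """eta(t)/t = h(t) e^{-t^2/2} on [0, 2] (assumed form of eta)."""
    return t**2 * (2 - t) ** 3 * math.exp(t - 0.5 - t * t / 2)


# Non-rigorous grid estimate of |eta(t)/t|_infty.
sup_eta_over_t = max(eta_over_t(2 * i / 100_000) for i in range(100_001))
```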

        Appendix B

        Norms of Fourier transforms

B.1 The Fourier transform of $\eta_2''$

Our aim here is to give upper bounds on $|\widehat{\eta_2''}|_\infty$, where $\eta_2$ is as in (3.4). We will do considerably better than the trivial bound $|\widehat{\eta''}|_\infty \le |\eta''|_1$.

Lemma B.1.1. For every $t \in \mathbb{R}$,

\[
|4e(-t/4) - 4e(-t/2) + e(-t)| \le 7.87052.
\tag{B.1}
\]

We will describe an extremely simple, but rigorous, procedure to find the maximum. Since $|g(t)|^2$ is $C^2$ (in fact smooth), there are several more efficient and equally rigorous algorithms; for starters, the bisection method with error bounded in terms of $|(|g|^2)''|_\infty$.

Proof. Let

\[
g(t) = 4e(-t/4) - 4e(-t/2) + e(-t).
\tag{B.2}
\]

For $a \le t \le b$,

\[
g(t) = g(a) + \frac{t - a}{b - a}\, (g(b) - g(a)) + \frac{1}{8} (b - a)^2 \cdot O^*\!\left( \max_{v \in [a,b]} |g''(v)| \right).
\tag{B.3}
\]

(This formula, in all likelihood well-known, is easy to derive. First, we can assume without loss of generality that $a = 0$, $b = 1$ and $g(a) = g(b) = 0$. Dividing $g$ by $g(t)$, we see that we can also assume that $g(t)$ is real (and in fact $1$). We can also assume that $g$ is real-valued, in that it will be enough to prove (B.3) for the real-valued function $\Re g$, as this will give us the bound $g(t) = \Re g(t) \le (1/8) \max_v |(\Re g)''(v)| \le \max_v |g''(v)|$ that we wish for. Lastly, we can assume (by symmetry) that $0 \le t \le 1/2$, and that $g$ has a local maximum or minimum at $t$. Writing $M = \max_{u \in [0,1]} |g''(u)|$, we then have

\[
\begin{aligned}
g(t) &= \int_0^t g'(v)\, dv = \int_0^t \int_t^v g''(u)\, du\, dv = O^*\!\left( \int_0^t \left| \int_t^v M\, du \right| dv \right) \\
&= O^*\!\left( \int_0^t (v - t) M\, dv \right) = O^*\!\left( \frac{1}{2} t^2 M \right) = O^*\!\left( \frac{1}{8} M \right),
\end{aligned}
\]

as desired.)

We obtain immediately from (B.3) that

\[
\max_{t \in [a,b]} |g(t)| \le \max(|g(a)|, |g(b)|) + \frac{1}{8} (b - a)^2 \cdot \max_{v \in [a,b]} |g''(v)|.
\tag{B.4}
\]

For any $v \in \mathbb{R}$,

\[
|g''(v)| \le \left( \frac{\pi}{2} \right)^2 \cdot 4 + \pi^2 \cdot 4 + (2\pi)^2 = 9\pi^2.
\tag{B.5}
\]

Clearly, $g(t)$ depends only on $t \bmod 4\pi$. Hence, by (B.4) and (B.5), to estimate

\[
\max_{t \in \mathbb{R}} |g(t)|
\]

with an error of at most $\epsilon$, it is enough to subdivide $[0, 4\pi]$ into intervals of length $\le \sqrt{8\epsilon / 9\pi^2}$ each. We set $\epsilon = 10^{-6}$ and compute.
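The sampling procedure just described can be sketched as follows. The sketch assumes the convention $e(x) = e^{2\pi i x}$ (consistent with the constants in (B.5)), under which $g$ has period 4, so it samples one period $[0,4]$ with the mesh $\sqrt{8\epsilon/9\pi^2}$ and adds $\epsilon$ as in (B.4). Ordinary floating point is used instead of interval arithmetic, and the variable names are hypothetical.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2 pi i x); the normalisation is an assumption."""
    return cmath.exp(2j * math.pi * x)


def g(t):
    """g as in (B.2)."""
    return 4 * e(-t / 4) - 4 * e(-t / 2) + e(-t)


EPS = 1e-6
mesh = math.sqrt(8 * EPS / (9 * math.pi**2))  # admissible interval length
n = math.ceil(4 / mesh)                        # subintervals over one period

# Largest sampled value; by (B.4) and (B.5) the true maximum exceeds it
# by at most EPS.
sample_max = max(abs(g(4 * i / n)) for i in range(n + 1))
upper_bound = sample_max + EPS
```

With this mesh the scan needs roughly thirteen thousand evaluations of $g$, which is why the text calls the procedure extremely simple.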

Lemma B.1.2. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

\[
|\widehat{\eta_2''}|_\infty \le 31.521.
\tag{B.6}
\]

This should be compared with $|\eta_2''|_1 = 48$.

Proof. We can write

\[
\eta_2''(x) = 4\left( 4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x) \right) + f(x),
\tag{B.7}
\]

where $\delta_{x_0}$ is the point measure at $x_0$ of mass 1 (Dirac delta function) and

\[
f(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x \ge 1, \\ -4x^{-2} & \text{if } 1/4 \le x < 1/2, \\ 4x^{-2} & \text{if } 1/2 \le x < 1. \end{cases}
\]

Thus $\widehat{\eta_2''}(t) = 4g(t) + \widehat{f}(t)$, where $g$ is as in (B.2). It is easy to see that $|f'|_1 = 2\max_x f(x) - 2\min_x f(x) = 160$. Therefore,

\[
\left| \widehat{f}(t) \right| = \left| \widehat{f'}(t)/(2\pi i t) \right| \le \frac{|f'|_1}{2\pi |t|} = \frac{80}{\pi |t|}.
\tag{B.8}
\]

Since $31.521 - 4 \cdot 7.87052 = 0.03892$, we conclude that (B.6) follows from Lemma B.1.1 and (B.8) for $|t| \ge 655 > 80/(\pi \cdot 0.03892)$.

It remains to check the range $t \in (-655, 655)$; since $4g(-t) + \widehat{f}(-t)$ is the complex conjugate of $4g(t) + \widehat{f}(t)$, it suffices to consider $t$ non-negative. We use (B.4) (with $4g + \widehat{f}$ instead of $g$) and obtain that, to estimate $\max_{t \in \mathbb{R}} |4g + \widehat{f}(t)|$ with an error of at most $\epsilon$, it is enough to subdivide $[0, 655)$ into intervals of length $\le \sqrt{2\epsilon / |(4g + \widehat{f})''|_\infty}$ each, and check $|4g + \widehat{f}(t)|$ at the endpoints. Now, for every $t \in \mathbb{R}$,

\[
\left| (\widehat{f})''(t) \right| = \left| (-2\pi i)^2\, \widehat{x^2 f}(t) \right| = (2\pi)^2 \cdot O^*\!\left( |x^2 f|_1 \right) = 12\pi^2.
\]

By this and (B.5), $|(4g + \widehat{f})''|_\infty \le 48\pi^2$. Thus, intervals of length $\delta_1$ give an error term of size at most $24\pi^2 \delta_1^2$. We choose $\delta_1 = 0.001$ and obtain an error term less than $0.000237$ for this stage.

To evaluate $\widehat{f}(t)$ (and hence $4g(t) + \widehat{f}(t)$) at a point, we integrate using Simpson's rule on subdivisions of the intervals $[1/4, 1/2]$, $[1/2, 1]$ into $200 \cdot \max(1, \lfloor \sqrt{|t|} \rfloor)$ sub-intervals each.¹ The largest value of $|4g(t) + \widehat{f}(t)|$ we find is $31.52065\dots$, with an error term of at most $4.5 \cdot 10^{-5}$.

B.2 Bounds involving a logarithmic factor

Our aim now is to give upper bounds on $|\widehat{\eta_{(y)}''}|_\infty$, where $\eta_{(y)}(t) = \log(yt)\, \eta_2(t)$ and $y \ge 4$.

Lemma B.2.1. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = \log(yt)\, \eta_2(t)$, where $y \ge 4$. Then

\[
|\eta_{(y)}'|_1 < (\log y)\, |\eta_2'|_1.
\tag{B.9}
\]

Proof. Recall that $\mathrm{supp}(\eta_2) = (1/4, 1)$. For $t \in (1/4, 1/2)$,

\[
\eta_{(y)}'(t) = (4 \log(yt) \log 4t)' = \frac{4 \log 4t}{t} + \frac{4 \log yt}{t} \ge \frac{8 \log 4t}{t} > 0,
\]

whereas, for $t \in (1/2, 1)$,

\[
\eta_{(y)}'(t) = (-4 \log(yt) \log t)' = -\frac{4 \log yt}{t} - \frac{4 \log t}{t} = -\frac{4 \log yt^2}{t} < 0,
\]

where we are using the fact that $y \ge 4$. Hence $\eta_{(y)}(t)$ is increasing on $(1/4, 1/2)$ and decreasing on $(1/2, 1)$; it is also continuous at $t = 1/2$. Hence $|\eta_{(y)}'|_1 = 2|\eta_{(y)}(1/2)|$. We are done by

\[
2|\eta_{(y)}(1/2)| = 2 \log\frac{y}{2} \cdot \eta_2(1/2) = \log\frac{y}{2} \cdot 8 \log 2 < \log y \cdot 8 \log 2 = (\log y)\, |\eta_2'|_1.
\]

Lemma B.2.2. Let $y \ge 4$. Let $g(t) = 4e(-t/4) - 4e(-t/2) + e(-t)$ and $k(t) = 2e(-t/4) - e(-t/2)$. Then, for every $t \in \mathbb{R}$,

\[
|g(t) \cdot \log y - k(t) \cdot 4 \log 2| \le 7.87052 \log y.
\tag{B.10}
\]

Proof. By Lemma B.1.1, $|g(t)| \le 7.87052$. Since $y \ge 4$, $|k(t)| \cdot (4 \log 2)/\log y \le 6$. For any complex numbers $z_1$, $z_2$ with $|z_1|, |z_2| \le \ell$, we can have $|z_1 - z_2| > \ell$ only if $|\arg(z_1/z_2)| > \pi/3$. It is easy to check that, for all $t \in [-2, 2]$,

\[
\left| \arg\left( \frac{g(t) \cdot \log y}{4 \log 2 \cdot k(t)} \right) \right| = \left| \arg\left( \frac{g(t)}{k(t)} \right) \right| < 0.7 < \frac{\pi}{3}.
\]

(It is possible to bound maxima rigorously as in (B.4).) Hence, (B.10) holds.

¹ As usual, the code uses interval arithmetic (§2.6).

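Lemma B.2.3. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Let $\eta_{(y)}(t) = (\log yt)\, \eta_2(t)$, where $y \ge 4$. Then

\[
|\widehat{\eta_{(y)}''}|_\infty < 31.521 \cdot \log y.
\tag{B.11}
\]

Proof. Clearly

\[
\begin{aligned}
\eta_{(y)}''(x) &= \eta_2''(x)(\log y) + \left( (\log x)\, \eta_2''(x) + \frac{2}{x} \eta_2'(x) - \frac{1}{x^2} \eta_2(x) \right) \\
&= \eta_2''(x)(\log y) + 4(\log x)\left( 4\delta_{1/4}(x) - 4\delta_{1/2}(x) + \delta_1(x) \right) + h(x),
\end{aligned}
\]

where

\[
h(x) = \begin{cases} 0 & \text{if } x < 1/4 \text{ or } x > 1, \\ \frac{4}{x^2}(2 - 2\log 2x) & \text{if } 1/4 \le x < 1/2, \\ \frac{4}{x^2}(-2 + 2\log x) & \text{if } 1/2 \le x < 1. \end{cases}
\]

(Here we are using the expression (B.7) for $\eta_2''(x)$.) Hence

\[
\widehat{\eta_{(y)}''}(t) = (4g(t) + \widehat{f}(t))(\log y) + (-16 \log 2 \cdot k(t) + \widehat{h}(t)),
\tag{B.12}
\]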

where $k(t) = 2e(-t/4) - e(-t/2)$. Just as in the proof of Lemma B.1.2,

\[
|\widehat{f}(t)| \le \frac{|f'|_1}{2\pi |t|} \le \frac{80}{\pi |t|}, \qquad |\widehat{h}(t)| \le \frac{160(1 + \log 2)}{\pi |t|}.
\tag{B.13}
\]

Again as before, this implies that (B.11) holds for

\[
|t| \ge \frac{1}{\pi \cdot 0.03892} \left( 80 + \frac{160(1 + \log 2)}{\log 4} \right) = 2252.51.
\]

Note also that it is enough to check (B.11) for $t \ge 0$, by symmetry. Our remaining task is to prove (B.11) for $0 \le t \le 2252.21$.

Let $I = [0.3, 2252.21] \setminus [3.25, 3.65]$. For $t \in I$, we will have

\[
\arg\left( \frac{4g(t) + \widehat{f}(t)}{-16 \log 2 \cdot k(t) + \widehat{h}(t)} \right) \subset \left( -\frac{\pi}{3}, \frac{\pi}{3} \right).
\tag{B.14}
\]

(This is actually true for $0 \le t \le 0.3$ as well, but we will use a different strategy in that range in order to better control error terms.) Consequently, by Lemma B.1.2 and $\log y \ge \log 4$,

\[
\begin{aligned}
|\widehat{\eta_{(y)}''}(t)| &< \max\left( |4g(t) + \widehat{f}(t)| \cdot (\log y),\ |16 \log 2 \cdot k(t) - \widehat{h}(t)| \right) \\
&< \max\left( 31.521 (\log y),\ |48 \log 2 + 25| \right) = 31.521 \log y,
\end{aligned}
\]

where we bound $\widehat{h}(t)$ by (B.13) and by a numerical computation of the maximum of $|\widehat{h}(t)|$ for $0 \le t \le 4$ as in the proof of Lemma B.1.2.

It remains to check (B.14). Here, as in the proof of Lemma B.2.2, the allowable error is relatively large (the expression on the left of (B.14) is actually contained in $(-1, 1)$ for $t \in I$). We decide to evaluate the argument in (B.14) at all $t \in 0.005\mathbb{Z} \cap I$, computing $\widehat{f}(t)$ and $\widehat{h}(t)$ by numerical integration (Simpson's rule) with a subdivision of $[-1/4, 1]$ into 5000 intervals. Proceeding as in the proof of Lemma B.1.1, we see that the sampling induces an error of at most

\[
\frac{1}{2}\, 0.005^2 \max_{v \in I} \left( 4|g''(v)| + |(\widehat{f})''(t)| \right) \le \frac{0.0001}{8}\, 48\pi^2 < 0.00593
\tag{B.15}
\]

in the evaluation of $4g(t) + \widehat{f}(t)$, and an error of at most

\[
\begin{aligned}
\frac{1}{2}\, 0.005^2 \max_{v \in I} \left( 16 \log 2 \cdot |k''(v)| + |(\widehat{h})''(t)| \right)
&\le \frac{0.0001}{8} \left( 16 \log 2 \cdot 6\pi^2 + 24\pi^2 \cdot (2 - \log 2) \right) < 0.0121
\end{aligned}
\tag{B.16}
\]

in the evaluation of $-16 \log 2 \cdot k(t) + \widehat{h}(t)$.

Running the numerical evaluation just described for $t \in I$, the estimates for the left side of (B.14) at the sample points are at most $0.99134$ in absolute value; the absolute values of the estimates for $4g(t) + \widehat{f}(t)$ are all at least $2.7783$, and the absolute values of the estimates for $-16 \log 2 \cdot k(t) + \widehat{h}(t)$ are all at least $2.1166$. Numerical integration by Simpson's rule gives errors bounded by $0.17575$ percent. Hence the absolute value of the left side of (B.14) is at most

\[
0.99134 + \arcsin\left( \frac{0.00593}{2.7783} + 0.0017575 \right) + \arcsin\left( \frac{0.0121}{2.1166} + 0.0017575 \right) \le 1.00271 < \frac{\pi}{3}
\]

for $t \in I$.

Lastly, for $t \in [0, 0.3] \cup [3.25, 3.65]$, a numerical computation (samples at $0.001\mathbb{Z}$; interpolation as in Lemma B.1.2; integrals computed by Simpson's rule with a subdivision into 1000 intervals) gives

\[
\max_{t \in [0, 0.3] \cup [3.25, 3.65]} \left( |4g(t) + \widehat{f}(t)| + \frac{|-16 \log 2 \cdot k(t) + \widehat{h}(t)|}{\log 4} \right) < 29.08,
\]

and so $\max_{t \in [0, 0.3] \cup [3.25, 3.65]} |\widehat{\eta_{(y)}''}(t)| < 29.1 \log y < 31.521 \log y$.

An easy integral gives us that the function $\log \cdot\, \eta_2$ satisfies

\[
|\log \cdot\, \eta_2|_1 = 2 - \log 4.
\tag{B.17}
\]
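The value in (B.17) can be confirmed numerically. The sketch below takes $\eta_2(t) = 4\log 4t$ on $[1/4, 1/2]$ and $\eta_2(t) = -4\log t$ on $[1/2, 1]$, which is the form used in the proof of Lemma B.2.1, and integrates $|\log t|\, \eta_2(t)$ with a midpoint rule. The function names are hypothetical.

```python
import math


def eta2(t):
    """eta_2 in the piecewise form used in the proof of Lemma B.2.1."""
    if 0.25 <= t < 0.5:
        return 4 * math.log(4 * t)
    if 0.5 <= t <= 1.0:
        return -4 * math.log(t)
    return 0.0


def l1_norm_log_eta2(n=200_000):
    """Midpoint rule for the integral of |log t| * eta_2(t) over [1/4, 1]."""
    a, b = 0.25, 1.0
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        total += abs(math.log(t)) * eta2(t)
    return total * step


numeric = l1_norm_log_eta2()
exact = 2 - math.log(4)
```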

The following function will appear only in a lower-order term; thus, an $\ell_1$ estimate will do.

Lemma B.2.4. Let $\eta_2 : \mathbb{R}^+ \to \mathbb{R}$ be as in (3.4). Then

\[
|(\log \cdot\, \eta_2)''|_1 = 96 \log 2.
\tag{B.18}
\]

Proof. The function $\log \cdot\, \eta(t)$ is $0$ for $t \notin [1/4, 1]$, is increasing and negative for $t \in (1/4, 1/2)$ and is decreasing and positive for $t \in (1/2, 1)$. Hence

\[
|(\log \cdot\, \eta_2)''|_1 = 2\left( (\log \cdot\, \eta_2)'\!\left( \frac{1}{2} \right) - (\log \cdot\, \eta_2)'\!\left( \frac{1}{4} \right) \right) = 2(16 \log 2 - (-32 \log 2)) = 96 \log 2.
\]

        Appendix C

        Sums involving Λ and φ

C.1 Sums over primes

Here we treat some sums of the type $\sum_n \Lambda(n)\varphi(n)$, where $\varphi$ has compact support. Since the sums are over all integers (not just an arithmetic progression) and there is no phase $e(\alpha n)$ involved, the treatment is relatively straightforward.

The following is standard.

Lemma C.1.1 (Explicit formula). Let $\varphi : [1, \infty) \to \mathbb{C}$ be continuous and piecewise $C^1$ with $\varphi'' \in \ell_1$; let it also be of compact support contained in $[1, \infty)$. Then

\[
\sum_n \Lambda(n)\varphi(n) = \int_1^\infty \left( 1 - \frac{1}{x(x^2 - 1)} \right) \varphi(x)\, dx - \sum_\rho (M\varphi)(\rho),
\tag{C.1}
\]

where $\rho$ runs over the non-trivial zeros of $\zeta(s)$.

The non-trivial zeros of $\zeta(s)$ are, of course, those in the critical strip $0 < \Re(s) < 1$.

Remark. Lemma C.1.1 appears as exercise 5 in [IK04, §5.5]; the condition there that $\varphi$ be smooth can be relaxed, since already the weaker assumption that $\varphi''$ be in $L^1$ implies that the Mellin transform $(M\varphi)(\sigma + it)$ decays quadratically on $t$ as $t \to \infty$, thereby guaranteeing that the sum $\sum_\rho (M\varphi)(\rho)$ converges absolutely.

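Lemma C.1.2. Let $x \ge 10$. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$ lie on the critical line.

Then

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = x + O^*\left(0.135 x^{1/2} + \frac{9.7}{x^2}\right) + \frac{\log eT_0}{T_0}\left(\frac{9/4}{2\pi} + \frac{6.03}{T_0}\right)x. \tag{C.2} \]

In particular, with $T_0 = 3.061\cdot 10^{10}$ in the assumption, we have, for $x \ge 2000$,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\varepsilon))x + O^*(0.135 x^{1/2}), \]

where $\varepsilon = 2.73\cdot 10^{-10}$.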

The assumption that all non-trivial zeros up to $T_0 = 3.061\cdot 10^{10}$ lie on the critical line was proven rigorously in [Plaa]; higher values of $T_0$ have been reached elsewhere ([Wed03], [GD04]).

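Proof. By Lemma C.1.1,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = \int_1^\infty \eta_2\left(\frac{t}{x}\right)dt - \int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt - \sum_\rho (M\varphi)(\rho), \]

where $\varphi(u) = \eta_2(u/x)$ and $\rho$ runs over all non-trivial zeros of $\zeta(s)$. Since $\eta_2$ is non-negative, $\int_1^\infty \eta_2(t/x)\,dt = x|\eta_2|_1 = x$, while

\[ \int_1^\infty \frac{\eta_2(t/x)}{t(t^2-1)}\,dt = O^*\left(\int_{1/4}^{1} \frac{\eta_2(t)}{t x^2 (t^2 - 1/100)}\,dt\right) = O^*\left(\frac{9.61114}{x^2}\right). \]

By (2.11),

\[ \sum_\rho (M\varphi)(\rho) = \sum_\rho M\eta_2(\rho)\cdot x^\rho = \sum_\rho \left(\frac{1-2^{-\rho}}{\rho}\right)^2 x^\rho = S_1(x) - 2S_1(x/2) + S_1(x/4), \]

where

\[ S_m(x) = \sum_\rho \frac{x^\rho}{\rho^{m+1}}. \tag{C.3} \]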

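Setting aside the contribution of all $\rho$ with $|\Im(\rho)| \le T_0$ and all $\rho$ with $|\Im(\rho)| > T_0$ and $\Re(\rho) \le 1/2$, and using the symmetry provided by the functional equation, we obtain

\[ \begin{aligned} |S_m(x)| &\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + x\cdot\sum_{\substack{\rho\\ |\Im(\rho)|>T_0\\ |\Re(\rho)|>1/2}} \frac{1}{|\rho|^{m+1}}\\ &\le x^{1/2}\cdot\sum_\rho \frac{1}{|\rho|^{m+1}} + \frac{x}{2}\cdot\sum_{\substack{\rho\\ |\Im(\rho)|>T_0}} \frac{1}{|\rho|^{m+1}}. \end{aligned} \]

We bound the first sum by [Ros41, Lemma 17] and the second sum by [RS03, Lemma 2]. We obtain

\[ |S_m(x)| \le \left(\frac{1}{2m\pi T_0^m} + \frac{2.68}{T_0^{m+1}}\right) x \log\frac{eT_0}{2\pi} + \kappa_m x^{1/2}, \tag{C.4} \]

where $\kappa_1 = 0.0463$, $\kappa_2 = 0.00167$ and $\kappa_3 = 0.0000744$. Hence

\[ \left|\sum_\rho (M\eta_2)(\rho)\cdot x^\rho\right| \le \left(\frac{1}{2\pi T_0} + \frac{2.68}{T_0^2}\right)\frac{9x}{4}\log\frac{eT_0}{2\pi} + \left(\frac{3}{2} + \sqrt{2}\right)\kappa_1 x^{1/2}. \]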

For $T_0 = 3.061\cdot 10^{10}$ and $x \ge 2000$, we obtain

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) = (1 + O^*(\varepsilon))x + O^*(0.135 x^{1/2}), \]

where $\varepsilon = 2.73\cdot 10^{-10}$.
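The two constants in this conclusion follow from (C.4) with $m = 1$ by plain arithmetic. The sketch below recomputes them in floating point; the function names are illustrative, and the weights $9/4 = 1 + 2\cdot\tfrac{1}{2} + \tfrac{1}{4}$ and $3/2 + \sqrt{2} = 1 + 2/\sqrt{2} + 1/2$ come from applying (C.4) to $S_1(x)$, $2S_1(x/2)$ and $S_1(x/4)$.

```python
import math

T0 = 3.061e10       # height up to which zeros are assumed to lie on the critical line
KAPPA_1 = 0.0463    # constant kappa_1 from (C.4)

def linear_coefficient(t0):
    """Coefficient of x: (C.4) with m = 1, weighted by 1 + 2*(1/2) + 1/4 = 9/4."""
    return (1 / (2 * math.pi * t0) + 2.68 / t0**2) * (9 / 4) * math.log(math.e * t0 / (2 * math.pi))

def sqrt_coefficient(kappa_1):
    """Coefficient of x^(1/2): 1 + 2/sqrt(2) + 1/2 = 3/2 + sqrt(2) copies of kappa_1."""
    return (1.5 + math.sqrt(2)) * kappa_1

eps = linear_coefficient(T0)
c_half = sqrt_coefficient(KAPPA_1)
print(f"{eps:.4e}", f"{c_half:.5f}")
```

Both values round up to the constants $2.73\cdot 10^{-10}$ and $0.135$ quoted in the text; the small slack in the second constant is what absorbs the term $9.61114/x^2$ for $x \ge 2000$.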

Corollary C.1.3. Let $\eta_2$ be as in (1.17). Assume that all non-trivial zeros of $\zeta(s)$ with $|\Im(s)| \le T_0$, $T_0 = 3.061\cdot 10^{10}$, lie on the critical line. Then, for all $x \ge 1$,

\[ \sum_n \Lambda(n)\eta_2\left(\frac{n}{x}\right) \le \min\left((1+\varepsilon)x + 0.2x^{1/2},\ 1.04488x\right), \tag{C.5} \]

where $\varepsilon = 2.73\cdot 10^{-10}$.

Proof. Immediate from Lemma C.1.2 for $x \ge 2000$. For $x < 2000$, we use computation as follows. Since $|\eta_2'|_\infty = 16$ and $\sum_{x/4\le n\le x}\Lambda(n) \le x$ for all $x \ge 0$, computing $\sum_{n\le x}\Lambda(n)\eta_2(n/x)$ only for $x \in (1/1000)\mathbb{Z}\cap[0, 2000]$ results in an inaccuracy of at most $(16\cdot 0.0005/0.9995)x \le 0.00801x$. This resolves the matter at all points outside $(205, 207)$ (for the first estimate) or outside $(9.5, 10.5)$ and $(13.5, 14.5)$ (for the second estimate). In those intervals, the prime powers $n$ involved do not change (since whether $x/4 < n \le x$ depends only on $n$ and $[x]$), and thus we can find the maximum of the sum in (C.5) just by taking derivatives.
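The kind of computation described in this proof can be imitated on a small scale. The following sketch evaluates $\sum_n \Lambda(n)\eta_2(n/x)$ at a few sample points and prints its ratio to $x$. It assumes that $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, which is consistent with $|\eta_2'|_\infty = 16$ and with the support $[1/4, 1]$ used above; the helper names are hypothetical, and plain floating point is used instead of interval arithmetic, so this is an illustration rather than a verification.

```python
import math

def eta2(t):
    """Assumed form of the weight from (1.17): 4 * max(log 2 - |log 2t|, 0)."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = 1
        power = p
        while power <= limit:          # Lambda(p^k) = log p
            lam[power] = math.log(p)
            power *= p
    return lam

def smoothed_sum(x, lam):
    """Sum of Lambda(n) * eta2(n / x) over 1 <= n <= x."""
    return sum(lam[n] * eta2(n / x) for n in range(1, int(x) + 1))

LAM = von_mangoldt_table(5000)
for x in (10, 100, 1000, 5000):
    print(x, smoothed_sum(x, LAM) / x)
```

The ratio at $x = 10$ is already close to the constant $1.04488$ in (C.5), which matches the remark that the second estimate has to be examined separately near $x = 10$ and $x = 14$.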

C.2 Sums involving φ

We need estimates for several sums involving $\phi(q)$ in the denominator.

The easiest are convergent sums, such as $\sum_q \mu^2(q)/(\phi(q)q)$. We can express this as $\prod_p (1 + 1/(p(p-1)))$. This is a convergent product, and the main task is to bound a tail: for $r$ an integer,

\[ \log\prod_{p>r}\left(1 + \frac{1}{p(p-1)}\right) \le \sum_{p>r}\frac{1}{p(p-1)} \le \sum_{n>r}\frac{1}{n(n-1)} = \frac{1}{r}. \tag{C.6} \]

A quick computation¹ now suffices to give

\[ 2.591461 \le \sum_q \frac{\gcd(q,2)\,\mu^2(q)}{\phi(q)q} < 2.591463 \tag{C.7} \]

and so

\[ 1.295730 \le \sum_{q\ \text{odd}} \frac{\mu^2(q)}{\phi(q)q} < 1.295732, \tag{C.8} \]

since the expression bounded in (C.8) is exactly half of that bounded in (C.7).

¹ Using D. Platt's integer arithmetic package.

Again using (C.6), we get that

\[ 2.826419 \le \sum_q \frac{\mu^2(q)}{\phi(q)^2} < 2.826421. \tag{C.9} \]

In what follows, we will use values for convergent sums obtained in much the same way: an easy tail bound followed by a computation.
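The pattern "tail bound plus computation" can be sketched as follows for (C.7), (C.8) and (C.9). This is only an illustration: it uses ordinary floating point rather than a rigorous arithmetic package, the cutoff $R = 10^6$ and all names are arbitrary choices, and the crude tail bound $e^{1/R}$ from (C.6) is not sharp enough at this cutoff to reproduce the last digit of the enclosures.

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes; returns the list of primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def euler_product(local_factor, primes):
    """Partial Euler product of local_factor(p) over the given primes."""
    product = 1.0
    for p in primes:
        product *= local_factor(p)
    return product

R = 10**6
ODD_PRIMES = [p for p in primes_upto(R) if p > 2]

# Sum over odd squarefree q of 1/(phi(q) q), as in (C.8).
odd_sum = euler_product(lambda p: 1 + 1 / (p * (p - 1)), ODD_PRIMES)
# The factor at p = 2 is 1 + gcd(2, 2)/(phi(2) * 2) = 2, which gives (C.7).
weighted_sum = 2 * odd_sum
# Sum over squarefree q of 1/phi(q)^2, as in (C.9); the factor at p = 2 is 1 + 1 = 2.
square_sum = 2 * euler_product(lambda p: 1 + 1 / (p - 1) ** 2, ODD_PRIMES)

# By (C.6), the omitted primes p > R multiply odd_sum by at most exp(1/R).
print(odd_sum, odd_sum * math.exp(1 / R))
print(weighted_sum, square_sum)
```

The partial products are lower bounds for the full sums, so the printed values should sit just above the lower endpoints in (C.7), (C.8) and (C.9).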

By [Ram95, Lemma 3.4],

\[ \begin{aligned} \sum_{q\le r}\frac{\mu^2(q)}{\phi(q)} &= \log r + c_E + O^*(7.284 r^{-1/3}),\\ \sum_{\substack{q\le r\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)} &= \frac{1}{2}\left(\log r + c_E + \frac{\log 2}{2}\right) + O^*(4.899 r^{-1/3}), \end{aligned} \tag{C.10} \]

where

\[ c_E = \gamma + \sum_p \frac{\log p}{p(p-1)} = 1.332582275 + O^*(10^{-9}/3) \]

by [RS62, (2.11)]. As we already said in (12.15), this, supplemented by a computation for $r \le 4\cdot 10^7$, gives

\[ \log r + 1.312 \le \sum_{q\le r}\frac{\mu^2(q)}{\phi(q)} \le \log r + 1.354 \]

for $r \ge 182$. In the same way, we get that

\[ \frac{1}{2}\log r + 0.83 \le \sum_{\substack{q\le r\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)} \le \frac{1}{2}\log r + 0.85 \tag{C.11} \]

for $r \ge 195$. (The numerical verification here goes up to $1.38\cdot 10^8$; for $r > 3.18\cdot 10^8$, use (C.10).)
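A small-scale version of the supplementary computation might look as follows. The sketch tabulates $\sum_{q\le r}\mu^2(q)/\phi(q) - \log r$ and its odd analogue at integer $r$, so that the extremes can be compared with the constants $1.312$, $1.354$ and $0.83$, $0.85$. The range (integers $1000 \le r \le 20000$), the function names and the use of floating point are assumptions made for illustration; the text's computation covers a far larger range and all real $r$.

```python
import math

def totients_and_squarefree(limit):
    """Return (phi, squarefree) tables for 0..limit via a prime sieve."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:                      # phi[p] untouched so far: p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
            for multiple in range(p * p, limit + 1, p * p):
                squarefree[multiple] = False
    return phi, squarefree

def deviations(limit, odd_only=False):
    """Yield (r, S(r) - c log r) for r = 1..limit, where S is the partial sum of
    mu^2(q)/phi(q) (over odd q if odd_only) and c = 1/2 for odd q, 1 otherwise."""
    phi, squarefree = totients_and_squarefree(limit)
    scale = 0.5 if odd_only else 1.0
    total = 0.0
    for q in range(1, limit + 1):
        if squarefree[q] and not (odd_only and q % 2 == 0):
            total += 1.0 / phi[q]
        yield q, total - scale * math.log(q)

LIMIT = 20_000
all_q = [d for r, d in deviations(LIMIT) if r >= 1000]
odd_q = [d for r, d in deviations(LIMIT, odd_only=True) if r >= 1000]
print(min(all_q), max(all_q))   # compare with 1.312 and 1.354
print(min(odd_q), max(odd_q))   # compare with 0.83 and 0.85
```

The deviations cluster around $c_E \approx 1.3326$ and $\tfrac{1}{2}(c_E + \tfrac{1}{2}\log 2) \approx 0.8396$ respectively, as (C.10) predicts.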

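Clearly,

\[ \sum_{\substack{q\le 2r\\ q\ \text{even}}}\frac{\mu^2(q)}{\phi(q)} = \sum_{\substack{q\le r\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)}. \tag{C.12} \]

We wish to obtain bounds for the sums

\[ \sum_{q\ge r}\frac{\mu^2(q)}{\phi(q)^2},\qquad \sum_{\substack{q\ge r\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)^2},\qquad \sum_{\substack{q\ge r\\ q\ \text{even}}}\frac{\mu^2(q)}{\phi(q)^2}, \]

where $N \in \mathbb{Z}^+$ and $r \ge 1$. To do this, it will be helpful to express some of the quantities within these sums as convolutions.² For $q$ squarefree and $j \ge 1$,

\[ \frac{\mu^2(q)\,q^{j-1}}{\phi(q)^j} = \sum_{ab=q}\frac{f_j(b)}{a}, \tag{C.13} \]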

² The author would like to thank O. Ramaré for teaching him this technique.

where $f_j$ is the multiplicative function defined by

\[ f_j(p) = \frac{p^j - (p-1)^j}{(p-1)^j p},\qquad f_j(p^k) = 0 \ \text{ for } k \ge 2. \]
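The identity (C.13) is easy to test with exact rational arithmetic. The sketch below checks it for $j = 1, 2, 3$ and all squarefree $q < 300$; the helper names and the chosen ranges are illustrative, not taken from the text.

```python
from fractions import Fraction
from math import prod

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def is_squarefree(n):
    return all(n % (p * p) != 0 for p in prime_factors(n))

def f(j, b):
    """f_j(b) for squarefree b, built multiplicatively from its values at primes."""
    return prod((Fraction(p**j - (p - 1)**j, (p - 1)**j * p) for p in prime_factors(b)),
                start=Fraction(1))

def lhs(j, q):
    """mu^2(q) q^(j-1) / phi(q)^j for squarefree q."""
    phi = prod(p - 1 for p in prime_factors(q))
    return Fraction(q ** (j - 1), phi ** j)

def rhs(j, q):
    """Sum of f_j(b)/a over factorizations ab = q."""
    return sum(f(j, q // a) / a for a in range(1, q + 1) if q % a == 0)

mismatches = [(j, q) for j in (1, 2, 3) for q in range(1, 300)
              if is_squarefree(q) and lhs(j, q) != rhs(j, q)]
print(mismatches)   # prints []
```

For a prime $q = p$ the check reduces to $1/p + f_j(p) = p^{j-1}/(p-1)^j$, and the general case follows because both sides of (C.13) are multiplicative in $q$.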

We will also find the following estimate useful.

Lemma C.2.1. Let $j \ge 2$ be an integer and $A$ a positive real. Let $m \ge 1$ be an integer. Then

\[ \sum_{\substack{a\ge A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} \le \frac{\zeta(j)/\zeta(2j)}{A^{j-1}}\cdot\prod_{p|m}\left(1 + \frac{1}{p^j}\right)^{-1}. \tag{C.14} \]

It is useful to note that $\zeta(2)/\zeta(4) = 15/\pi^2 = 1.519817\ldots$ and $\zeta(3)/\zeta(6) = 1.181564\ldots$

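Proof. The right side of (C.14) decreases as $A$ increases, while the left side depends only on $\lceil A\rceil$. Hence, it is enough to prove (C.14) when $A$ is an integer.

For $A = 1$, (C.14) is an equality. Let

\[ C = \frac{\zeta(j)}{\zeta(2j)}\cdot\prod_{p|m}\left(1 + \frac{1}{p^j}\right)^{-1}. \]

Let $A \ge 2$. Since

\[ \sum_{\substack{a\ge A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} = C - \sum_{\substack{a<A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} \]

and

\[ \begin{aligned} C = \sum_{\substack{a\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} &< \sum_{\substack{a<A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \int_A^\infty \frac{1}{t^j}\,dt\\ &= \sum_{\substack{a<A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} + \frac{1}{A^j} + \frac{1}{(j-1)A^{j-1}}, \end{aligned} \]

we obtain

\[ \begin{aligned} \sum_{\substack{a\ge A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j} &= \frac{1}{A^{j-1}}\cdot C + \frac{A^{j-1}-1}{A^{j-1}}\cdot C - \sum_{\substack{a<A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j}\\ &< \frac{C}{A^{j-1}} + \frac{A^{j-1}-1}{A^{j-1}}\cdot\left(\frac{1}{A^j} + \frac{1}{(j-1)A^{j-1}}\right) - \frac{1}{A^{j-1}}\sum_{\substack{a<A\\ (a,m)=1}}\frac{\mu^2(a)}{a^j}\\ &\le \frac{C}{A^{j-1}} + \frac{1}{A^{j-1}}\left(\left(1 - \frac{1}{A^{j-1}}\right)\left(\frac{1}{A} + \frac{1}{j-1}\right) - 1\right). \end{aligned} \]

Since $(1 - 1/A)(1/A + 1) < 1$ and $1/A + 1/(j-1) \le 1$ for $j \ge 3$, we obtain that

\[ \left(1 - \frac{1}{A^{j-1}}\right)\left(\frac{1}{A} + \frac{1}{j-1}\right) < 1 \]

for all integers $j \ge 2$, and so the statement follows.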
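As a sanity check on Lemma C.2.1, one can compare both sides of (C.14) numerically in the case $j = 2$, where $\zeta(2)/\zeta(4) = 15/\pi^2$ is available in closed form. The sketch below does this for integer $A$; the function names, the choice of moduli $m \in \{1, 2, 6\}$ and the range of $A$ are illustrative assumptions.

```python
import math

def squarefree_flags(limit):
    """flags[a] is True exactly when a >= 1 is squarefree."""
    flags = [True] * (limit + 1)
    for d in range(2, math.isqrt(limit) + 1):
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
    return flags

def prime_divisors(m):
    """Distinct prime divisors of m, by trial division."""
    divisors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            divisors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        divisors.append(m)
    return divisors

def worst_ratio(m, max_A):
    """Largest value of (left side)/(right side) of (C.14), for j = 2 and 1 <= A <= max_A."""
    C = 15 / math.pi**2                      # zeta(2)/zeta(4)
    for p in prime_divisors(m):
        C /= 1 + 1 / p**2
    flags = squarefree_flags(max_A)
    head = 0.0                               # sum of mu^2(a)/a^2 over a < A, gcd(a, m) = 1
    worst = 0.0
    for A in range(1, max_A + 1):
        tail = C - head                      # left side of (C.14)
        worst = max(worst, tail / (C / A))   # right side is C / A^(j-1) with j = 2
        if flags[A] and math.gcd(A, m) == 1:
            head += 1 / A**2
    return worst

for m in (1, 2, 6):
    print(m, worst_ratio(m, 2000))
```

The ratio equals $1$ at $A = 1$, where (C.14) is an equality, and stays below $1$ for $A \ge 2$.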

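We now obtain easily the estimates we want: by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 1$),

\[ \begin{aligned} \sum_{q\ge r}\frac{\mu^2(q)}{\phi(q)^2} &= \sum_{q\ge r}\sum_{ab=q}\frac{f_2(b)}{a}\frac{\mu^2(q)}{q} \le \sum_{b\ge 1}\frac{f_2(b)}{b}\sum_{a\ge r/b}\frac{\mu^2(a)}{a^2}\\ &\le \frac{\zeta(2)/\zeta(4)}{r}\sum_{b\ge 1}f_2(b) = \frac{15/\pi^2}{r}\prod_p\left(1 + \frac{2p-1}{(p-1)^2 p}\right) \le \frac{6.7345}{r}. \end{aligned} \tag{C.15} \]

Similarly, by (C.13) and Lemma C.2.1 (with $j = 2$ and $m = 2$),

\[ \begin{aligned} \sum_{\substack{q\ge r\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)^2} &\le \sum_{\substack{b\ge 1\\ b\ \text{odd}}}\frac{f_2(b)}{b}\sum_{\substack{a\ge r/b\\ a\ \text{odd}}}\frac{\mu^2(a)}{a^2} \le \frac{\zeta(2)/\zeta(4)}{1 + 1/2^2}\,\frac{1}{r}\sum_{b\ \text{odd}}f_2(b)\\ &= \frac{12}{\pi^2}\,\frac{1}{r}\prod_{p>2}\left(1 + \frac{2p-1}{(p-1)^2 p}\right) \le \frac{2.15502}{r}, \end{aligned} \tag{C.16} \]

\[ \sum_{\substack{q\ge r\\ q\ \text{even}}}\frac{\mu^2(q)}{\phi(q)^2} = \sum_{\substack{q\ge r/2\\ q\ \text{odd}}}\frac{\mu^2(q)}{\phi(q)^2} \le \frac{4.31004}{r}. \tag{C.17} \]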
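The numerators in (C.15), (C.16) and (C.17) are Euler products that converge quickly, so they can be approximated directly. The sketch below uses primes up to $10^6$ (an arbitrary cutoff) and floating point; it produces lower approximations to the true constants and is meant as a plausibility check, not as a replacement for a rigorous computation. All names are illustrative.

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes; returns the list of primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def sum_f2(primes):
    """Partial Euler product for sum_b f_2(b) = prod_p (1 + (2p - 1)/((p - 1)^2 p))."""
    product = 1.0
    for p in primes:
        product *= 1 + (2 * p - 1) / ((p - 1) ** 2 * p)
    return product

PRIMES = primes_upto(10**6)
all_q = 15 / math.pi**2 * sum_f2(PRIMES)                      # numerator in (C.15)
odd_q = 12 / math.pi**2 * sum_f2(p for p in PRIMES if p > 2)  # numerator in (C.16)
even_q = 2 * odd_q                                            # numerator in (C.17)
print(all_q, odd_q, even_q)
```

The factor $12/\pi^2$ is $\zeta(2)/\zeta(4)$ divided by $1 + 1/2^2$, and the doubling in the last line reflects the substitution $q \mapsto q/2$ behind (C.17).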

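Lastly, since only squarefree $d$ can divide a squarefree $q$,

\[ \begin{aligned} \sum_{\substack{q\le r\\ q\ \text{odd}}}\frac{\mu^2(q)\,q}{\phi(q)} &= \sum_{\substack{q\le r\\ q\ \text{odd}}}\mu^2(q)\sum_{d|q}\frac{1}{\phi(d)} = \sum_{\substack{d\le r\\ d\ \text{odd}}}\frac{\mu^2(d)}{\phi(d)}\sum_{\substack{q\le r\\ d|q\\ q\ \text{odd}}}\mu^2(q) \le \sum_{\substack{d\le r\\ d\ \text{odd}}}\frac{\mu^2(d)}{2\phi(d)}\left(\frac{r}{d} + 1\right)\\ &\le \frac{r}{2}\sum_{d\ \text{odd}}\frac{\mu^2(d)}{\phi(d)d} + \frac{1}{2}\sum_{\substack{d\le r\\ d\ \text{odd}}}\frac{\mu^2(d)}{\phi(d)} \le 0.64787r + \frac{\log r}{4} + 0.425, \end{aligned} \tag{C.18} \]

where we are using (C.8) and (C.11).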
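The bound (C.18) has a fair amount of slack, which a direct computation makes visible. The sketch below compares the left side with $0.64787r + (\log r)/4 + 0.425$ for integers $r \le 5000$; the range and the function names are illustrative assumptions.

```python
import math

def totients_and_squarefree(limit):
    """Return (phi, squarefree) tables for 0..limit via a prime sieve."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:                      # phi[p] untouched so far: p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
            for multiple in range(p * p, limit + 1, p * p):
                squarefree[multiple] = False
    return phi, squarefree

def worst_ratio(limit):
    """Largest value of (left side of (C.18)) / (right side) over integers 1 <= r <= limit."""
    phi, squarefree = totients_and_squarefree(limit)
    total, worst = 0.0, 0.0
    for r in range(1, limit + 1):
        if r % 2 == 1 and squarefree[r]:
            total += r / phi[r]              # term mu^2(q) q / phi(q) for odd squarefree q = r
        bound = 0.64787 * r + math.log(r) / 4 + 0.425
        worst = max(worst, total / bound)
    return worst

print(worst_ratio(5000))
```

The ratio stays below $1$ and decreases for larger $r$: the average of $\mu^2(q)q/\phi(q)$ over odd $q$ is $1/2$ per integer, while the bound grows like $0.64787r$.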

Since we are on the subject of $\phi(q)$, let us also prove a simple lemma that we use at various points in the text to bound $q/\phi(q)$.

Lemma C.2.2. For any $q \ge 1$ and any $r \ge \max(3, q)$,

\[ \frac{q}{\phi(q)} < z(r), \]

where

\[ z(r) = e^\gamma \log\log r + \frac{2.50637}{\log\log r}. \tag{C.19} \]

Proof. Since $z(r)$ is increasing for $r \ge 27$, the statement follows immediately for $q \ge 27$ by [RS62, Thm. 15]:

\[ \frac{q}{\phi(q)} < z(q) \le z(r). \]

For $q < 27$, it is clear that $q/\phi(q) \le 2\cdot 3/(1\cdot 2) = 3$. By the arithmetic/geometric mean inequality, $z(t) \ge 2\sqrt{e^\gamma\cdot 2.50637} > 3$ for all $t > e$, and so the lemma holds for $q < 27$.
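Lemma C.2.2 can be spot-checked by brute force. The sketch below tests the inequality with the smallest admissible value $r = \max(3, q)$ for all $q \le 10^5$ and also evaluates the arithmetic/geometric mean lower bound for $z$; the range and the names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def z(r):
    """z(r) = e^gamma log log r + 2.50637 / log log r, as in (C.19)."""
    loglog = math.log(math.log(r))
    return math.exp(EULER_GAMMA) * loglog + 2.50637 / loglog

def totients(limit):
    """Return a list phi with phi[n] = Euler's totient of n for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:                      # p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi

def violations(limit):
    """All q <= limit for which q/phi(q) < z(max(3, q)) fails."""
    phi = totients(limit)
    return [q for q in range(1, limit + 1) if q / phi[q] >= z(max(3, q))]

am_gm_floor = 2 * math.sqrt(math.exp(EULER_GAMMA) * 2.50637)
print(violations(100_000), am_gm_floor)
```

The closest calls occur at products of the first few primes such as $210$, $2310$ and $30030$, where $q/\phi(q)$ is largest relative to $\log\log q$.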

Appendix D

Checking small n by checking zeros of ζ(s)

In order to show that every odd number $n \le N$ is the sum of three primes, it is enough to show for some $M \le N$ that

1. every even integer $4 \le m \le M$ can be written as the sum of two primes,

2. the difference between any two consecutive primes $\le N$ is at most $M - 4$.

(If we want to show that every odd number $n \le N$ is the sum of three odd primes, we just replace $M - 4$ by $M - 6$ in (2).) The best known result of type (1) is that of Oliveira e Silva, Herzog and Pardi ([OeSHP14], $M = 4\cdot 10^{18}$). As for (2), it was proven in [HP13] for $M = 4\cdot 10^{18}$ and $N = 8.875694\cdot 10^{30}$ by a direct computation (valid even if we replace $M - 4$ by $M - 6$ in the statement of (2)).
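The reduction from (1) and (2) to the ternary statement can be illustrated at toy scale. In the sketch below, $M = 100$ and $N = 10^5$ are hypothetical stand-ins for the values quoted above, and the function names are illustrative. For each odd $n \ge 7$ the code picks the largest prime $p \le n - 4$, so that $n - p$ is even, at least $4$, and at most $M$ whenever the gap condition holds, and then writes $n - p$ as a sum of two primes.

```python
import math
from bisect import bisect_right

def primes_upto(n):
    """Sieve of Eratosthenes; returns the list of primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def three_prime_decomposition(n, primes, prime_set, M):
    """Write odd n >= 7 as p + p' + p'' using the largest prime p <= n - 4."""
    p = primes[bisect_right(primes, n - 4) - 1]
    m = n - p                                  # even, and at least 4
    if m > M:
        return None                            # the gap condition (2) failed
    for q in primes:
        if q > m // 2:
            break
        if (m - q) in prime_set:
            return p, q, m - q
    return None                                # the two-prime condition (1) failed

M, N = 100, 100_000                            # toy values, far smaller than in the text
PRIMES = primes_upto(N)
PRIME_SET = set(PRIMES)
max_gap = max(b - a for a, b in zip(PRIMES, PRIMES[1:]))
failures = [n for n in range(7, N + 1, 2)
            if three_prime_decomposition(n, PRIMES, PRIME_SET, M) is None]
print(max_gap, M - 4, failures[:5])
```

With these toy values the largest gap between consecutive primes below $N$ is well under $M - 4$, so the list of failures is empty.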

Alternatively, one can establish results of type (2) by means of numerical verifications of the Riemann hypothesis up to a certain height. This is a classical approach, followed in [RS75] and [Sch76], and later in [RS03]; we will use the version of (1) kindly provided by Ramaré in [Ramd]. We carry out this approach in full here, not because it is preferable to [HP13] (it is still based on computations, and it is slightly more indirect than [HP13]) but simply to show that one can establish what we need by a different route.

A numerical verification of the Riemann hypothesis up to a certain height consists simply in checking that all (non-trivial) zeroes $z$ of the Riemann zeta function up to a height $H$ (meaning $\Im(z) \le H$) lie on the critical line $\Re(z) = 1/2$.

The height up to which the Riemann hypothesis has actually been fully verified is not a matter on which there is unanimity. The strongest claim in the literature is in [GD04], which states that the first $10^{13}$ zeroes of the Riemann zeta function lie on the critical line $\Re(z) = 1/2$. This corresponds to checking the Riemann hypothesis up to height $H = 2.44599\cdot 10^{12}$. It is unclear whether this computation was or could be easily made rigorous; as pointed out in [SD10, p. 2398], it has not been replicated yet.

Before [GD04], the strongest results were those of the ZetaGrid distributed computing project led by S. Wedeniwski [Wed03]; the method followed in it was more traditional, and should allow rigorous verification involving interval arithmetic. Unfortunately, the results were never formally published. The statement that the ZetaGrid project verified the first $9\cdot 10^{11}$ zeroes (corresponding to $H = 2.419\cdot 10^{11}$) is often quoted (e.g., [Bom10, p. 29]); this is the point to which the project had got by the time of Gourdon and Demichel's announcement. Wedeniwski asserts in private communication that the project verified the first $10^{12}$ zeroes, and that the computation was double-checked (by the same method).

The strongest claim prior to ZetaGrid was that of van de Lune ($H = 3.293\cdot 10^9$, first $10^{10}$ zeroes; unpublished). Recently, Platt [Plaa] checked the first $1.1\cdot 10^{11}$ zeroes ($H = 3.061\cdot 10^{10}$) rigorously, following a method essentially based on that in [Boo06a]. Note that [Plaa] uses interval arithmetic, which is highly desirable for floating-point computations.

Proposition D.0.3. Every odd integer $5 < n \le n_0$ is the sum of three primes, where

\[ n_0 = \begin{cases} 5.90698\cdot 10^{29} & \text{if [GD04] is used } (H = 2.44\cdot 10^{12}),\\ 6.15697\cdot 10^{28} & \text{if ZetaGrid results are used } (H = 2.419\cdot 10^{11}),\\ 1.23163\cdot 10^{27} & \text{if [Plaa] is used } (H = 3.061\cdot 10^{10}). \end{cases} \]

Proof. For $n \le 4\cdot 10^{18} + 3$, this is immediate from [OeSHP14]. Let $4\cdot 10^{18} + 3 < n \le n_0$. We need to show that there is a prime $p$ in $[n - 4 - (n-4)/\Delta,\ n - 4]$, where $\Delta$ is large enough for $(n-4)/\Delta \le 4\cdot 10^{18} - 4$ to hold. We will then have that $4 \le n - p \le 4 + (n-4)/\Delta \le 4\cdot 10^{18}$. Since $n - p$ is even, [OeSHP14] will then imply that $n - p$ is the sum of two primes $p'$, $p''$, and so

\[ n = p + p' + p''. \]

Since $n - 4 > 10^{11}$, the interval $[n - 4 - (n-4)/\Delta,\ n - 4]$ with $\Delta = 28314000$ must contain a prime [RS03]. This gives the solution for $(n - 4) \le 1.1325\cdot 10^{26}$, since then $(n-4)/\Delta \le 4\cdot 10^{18} - 4$. Note $1.1325\cdot 10^{26} > e^{59}$.

From here onwards, we use the tables in [Ramd] to find acceptable values of $\Delta$. Since $n - 4 \ge e^{59}$, we can choose

\[ \Delta = \begin{cases} 52211882224 & \text{if [GD04] is used (case (a)),}\\ 13861486834 & \text{if ZetaGrid is used (case (b)),}\\ 307779681 & \text{if [Plaa] is used (case (c)).} \end{cases} \]

This gives us $(n-4)/\Delta \le 4\cdot 10^{18} - 4$ for $n - 4 < e^{r_0}$, where $r_0 = 67$ in case (a), $r_0 = 66$ in case (b) and $r_0 = 62$ in case (c).

If $n - 4 \ge e^{r_0}$, we can choose (again by [Ramd])

\[ \Delta = \begin{cases} 146869130682 & \text{in case (a),}\\ 15392435100 & \text{in case (b),}\\ 307908668 & \text{in case (c).} \end{cases} \]

This is enough for $n - 4 < e^{68}$ in case (a), and without further conditions for (b) or (c).

Finally, if $n - 4 \ge e^{68}$ and we are in case (a), [Ramd] assures us that the choice

\[ \Delta = 147674531294 \]

is valid; we verify as well that $(n_0 - 4)/\Delta \le 4\cdot 10^{18} - 4$.
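The arithmetic in this proof can be rechecked with exact integers. The sketch below verifies that each value of $\Delta$ reaches the threshold at which the next one takes over, and that the stated $n_0$ satisfies $(n_0 - 4)/\Delta \le 4\cdot 10^{18} - 4$ in each case. The table layout and names are illustrative; the values of $\Delta$, $r_0$ and $n_0$ are those quoted in the text, and the validity of each $\Delta$ (from [RS03] and [Ramd]) is taken for granted rather than checked.

```python
import math

EVEN_LIMIT = 4 * 10**18          # even Goldbach verified up to here [OeSHP14]

# (case, r0, Delta for n - 4 < e^r0, Delta for n - 4 >= e^r0, claimed n0)
CASES = [
    ("a", 67, 52211882224, 146869130682, 590698 * 10**24),
    ("b", 66, 13861486834, 15392435100, 615697 * 10**23),
    ("c", 62, 307779681, 307908668, 123163 * 10**22),
]
FINAL_DELTA_A = 147674531294     # case (a), n - 4 >= e^68

def reach(delta):
    """Largest value of n - 4 for which (n - 4)/delta <= 4*10^18 - 4."""
    return delta * (EVEN_LIMIT - 4)

results = {}
for case, r0, delta_low, delta_high, n0 in CASES:
    last_delta = FINAL_DELTA_A if case == "a" else delta_high
    results[case] = (reach(delta_low) >= math.exp(r0), n0 - 4 <= reach(last_delta))

first_step_ok = reach(28314000) >= 11325 * 10**22 > math.exp(59)
case_a_middle_ok = reach(146869130682) >= math.exp(68)
print(results, first_step_ok, case_a_middle_ok)
```

Each $n_0$ is essentially $\Delta\cdot 4\cdot 10^{18}$ for the last $\Delta$ of its case, rounded down to six significant digits.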

In other words, the rigorous results in [Plaa] are enough to show the result for all odd $n \le 10^{27}$. Of course, [HP13] is also more than enough, and gives stronger results than Prop. D.0.3.

        Bibliography

        [AS64] M Abramowitz and I A Stegun Handbook of mathematical func-tions with formulas graphs and mathematical tables volume 55 ofNational Bureau of Standards Applied Mathematics Series For sale bythe Superintendent of Documents US Government Printing OfficeWashington DC 1964

        [BBO10] J Bertrand P Bertrand and J-P Ovarlez Mellin transform In A DPoularikas editor Transforms and applications handbook CRC PressBoca Raton FL 2010

        [Bom74] E Bombieri Le grand crible dans la theorie analytique des nombresSociete Mathematique de France Paris 1974 Avec une sommaire enanglais Asterisque No 18

[Bom10] E. Bombieri. The classical theory of zeta and L-functions. Milan J. Math., 78(1):11–59, 2010.

        [Bom76] E Bombieri On twin almost primes Acta Arith 28(2)177ndash193197576

[Boo06a] A. R. Booker. Artin's conjecture, Turing's method, and the Riemann hypothesis. Experiment. Math., 15(4):385–407, 2006.

        [Boo06b] A R Booker Turing and the Riemann hypothesis Notices AmerMath Soc 53(10)1208ndash1211 2006

        [Bor56] K G Borodzkin On the problem of I M Vinogradovrsquos constant (inRussian) In Proc Third All-Union Math Conf volume 1 page 3Izdat Akad Nauk SSSR Moscow 1956

        [Bou99] J Bourgain On triples in arithmetic progression Geom Funct Anal9(5)968ndash984 1999

        [BR02] G Bastien and M Rogalski Convexite complete monotonie etinegalites sur les fonctions zeta et gamma sur les fonctions desoperateurs de Baskakov et sur des fonctions arithmetiques CanadJ Math 54(5)916ndash944 2002

        [But11] Y Buttkewitz Exponential sums over primes and the prime twin prob-lem Acta Math Hungar 131(1-2)46ndash58 2011

        [Che73] J R Chen On the representation of a larger even integer as the sum ofa prime and the product of at most two primes Sci Sinica 16157ndash1761973

        [Che85] J R Chen On the estimation of some trigonometrical sums and theirapplication Sci Sinica Ser A 28(5)449ndash458 1985

        [Chu37] NG Chudakov On the Goldbach problem C R (Dokl) Acad SciURSS n Ser 17335ndash338 1937

        [Chu38] NG Chudakov On the density of the set of even numbers which arenot representable as the sum of two odd primes Izv Akad Nauk SSSRSer Mat 2 pages 25ndash40 1938

        [Chu47] N G Chudakov Introduction to the theory of Dirichlet L-functionsOGIZ Moscow-Leningrad 1947 In Russian

        [CW89] J R Chen and T Z Wang On the Goldbach problem Acta MathSinica 32(5)702ndash718 1989

        [CW96] J R Chen and T Z Wang The Goldbach problem for odd numbersActa Math Sinica (Chin Ser) 39(2)169ndash174 1996

        [Dab96] H Daboussi Effective estimates of exponential sums over primesIn Analytic number theory Vol 1 (Allerton Park IL 1995) volume138 of Progr Math pages 231ndash244 Birkhauser Boston Boston MA1996

        [Dav67] H Davenport Multiplicative number theory Markham PublishingCo Chicago Ill 1967 Lectures given at the University of MichiganWinter Term

        [dB81] N G de Bruijn Asymptotic methods in analysis Dover PublicationsInc New York third edition 1981

        [Des08] R Descartes Œuvres de Descartes publiees par Charles Adam etPaul Tannery sous les auspices du Ministere de lrsquoInstruction publiquePhysico-mathematica Compendium musicae Regulae ad directionemingenii Recherche de la verite Supplement a la correspondance XParis Leopold Cerf IV u 691 S 4 1908

        [Des77] J-M Deshouillers Sur la constante de Snirelprimeman In SeminaireDelange-Pisot-Poitou 17e annee (197576) Theorie des nombresFac 2 Exp No G16 page 6 Secretariat Math Paris 1977

        [DEtRZ97] J-M Deshouillers G Effinger H te Riele and D Zinoviev A com-plete Vinogradov 3-primes theorem under the Riemann hypothesisElectron Res Announc Amer Math Soc 399ndash104 1997

        [Dic66] L E Dickson History of the theory of numbers Vol I Divisibilityand primality Chelsea Publishing Co New York 1966

        [DLDDD+10] C Daramy-Loirat F De Dinechin D Defour M Gallet N Gast andCh Lauter Crlibm March 2010 version 10beta4

        [DR01] H Daboussi and J Rivat Explicit upper bounds for exponential sumsover primes Math Comp 70(233)431ndash447 (electronic) 2001

        [Dre93] F Dress Fonction sommatoire de la fonction de Mobius I Majorationsexperimentales Experiment Math 2(2)89ndash98 1993

        [DS70] H G Diamond and J Steinig An elementary proof of the prime num-ber theorem with a remainder term Invent Math 11199ndash258 1970

        [Eff99] G Effinger Some numerical implications of the Hardy and Littlewoodanalysis of the 3-primes problem Ramanujan J 3(3)239ndash280 1999

        [EM95] M El Marraki Fonction sommatoire de la fonction de Mobius III Ma-jorations asymptotiques effectives fortes J Theor Nombres Bordeaux7(2)407ndash433 1995

        [EM96] M El Marraki Majorations de la fonction sommatoire de la fonctionmicro(n)n Univ Bordeaux 1 preprint (96-8) 1996

        [Est37] T Estermann On Goldbachrsquos Problem Proof that Almost all EvenPositive Integers are Sums of Two Primes Proc London Math SocS2-44(4)307ndash314 1937

        [FI98] J Friedlander and H Iwaniec Asymptotic sieve for primes Ann ofMath (2) 148(3)1041ndash1065 1998

        [FI10] J Friedlander and H Iwaniec Opera de cribro volume 57 of AmericanMathematical Society Colloquium Publications American Mathemat-ical Society Providence RI 2010

        [For02] K Ford Vinogradovrsquos integral and bounds for the Riemann zeta func-tion Proc London Math Soc (3) 85(3)565ndash633 2002

[GD04] X. Gourdon and P. Demichel. The first $10^{13}$ zeros of the Riemann zeta function, and zeros computation at very large height. http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf, 2004.

        [GR94] I S Gradshteyn and I M Ryzhik Table of integrals series and prod-ucts Academic Press Inc Boston MA fifth edition 1994 Transla-tion edited and with a preface by Alan Jeffrey

        [GR96] A Granville and O Ramare Explicit bounds on exponential sumsand the scarcity of squarefree binomial coefficients Mathematika43(1)73ndash107 1996

        [Har66] G H Hardy Collected papers of G H Hardy (Including Joint pa-pers with J E Littlewood and others) Vol I Edited by a committeeappointed by the London Mathematical Society Clarendon Press Ox-ford 1966

        [HB79] D R Heath-Brown The fourth power moment of the Riemann zetafunction Proc London Math Soc (3) 38(3)385ndash422 1979

        [HB85] D R Heath-Brown The ternary Goldbach problem Rev MatIberoamericana 1(1)45ndash59 1985

        [HB11] H Hong and Ch W Brown QEPCAD B ndash Quantifier elimination bypartial cylindrical algebraic decomposition May 2011 version 162

        [Hela] H A Helfgott Major arcs for Goldbachrsquos problem Preprint Availableat arXiv12035712

        [Helb] H A Helfgott Minor arcs for Goldbachrsquos problem Preprint Availableas arXiv12055252

        [Helc] H A Helfgott The Ternary Goldbach Conjecture is true PreprintAvailable as arXiv13127748

        [Hel13a] H Helfgott La conjetura debil de Goldbach Gac R Soc Mat Esp16(4) 2013

        [Hel13b] H A Helfgott The ternary Goldbach conjecture 2013 Avail-able at httpvaluevarwordpresscom20130702the-ternary-goldbach-conjecture

        [Hel14a] H A Helfgott La conjecture de Goldbach ternaire Gaz Math(140)5ndash18 2014 Translated by Margaret Bilu revised by the author

        [Hel14b] H A Helfgott The ternary Goldbach problem To appear in Proceed-ings of the International Congress of Mathematicians (Seoul Korea2014) 2014

        [HL22] G H Hardy and J E Littlewood Some problems of lsquoPartitio numero-rumrsquo III On the expression of a number as a sum of primes ActaMath 44(1)1ndash70 1922

[HP13] H. A. Helfgott and David J. Platt. Numerical verification of the ternary Goldbach conjecture up to $8.875\cdot 10^{30}$. Exp. Math., 22(4):406–409, 2013.

        [HR00] G H Hardy and S Ramanujan Asymptotic formulaelig in combinatoryanalysis [Proc London Math Soc (2) 17 (1918) 75ndash115] In Collectedpapers of Srinivasa Ramanujan pages 276ndash309 AMS Chelsea PublProvidence RI 2000

        [Hux72] M N Huxley Irregularity in sifted sequences J Number Theory4437ndash454 1972

[IK04] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.

        [Kad] H Kadiri An explicit zero-free region for the Dirichlet L-functionsPreprint Available as arXiv0510570

        [Kad05] H Kadiri Une region explicite sans zeros pour la fonction ζ de Rie-mann Acta Arith 117(4)303ndash339 2005

        [Kar93] A A Karatsuba Basic analytic number theory Springer-VerlagBerlin 1993 Translated from the second (1983) Russian edition andwith a preface by Melvyn B Nathanson

        [Knu99] O Knuppel PROFILBIAS February 1999 version 2

        [Kor58] N M Korobov Estimates of trigonometric sums and their applicationsUspehi Mat Nauk 13(4 (82))185ndash192 1958

        [Lam08] B Lambov Interval arithmetic using SSE-2 In Reliable Implemen-tation of Real Number Algorithms Theory and Practice Interna-tional Seminar Dagstuhl Castle Germany January 8-13 2006 volume5045 of Lecture Notes in Computer Science pages 102ndash113 SpringerBerlin 2008

        [Leh66] R Sherman Lehman On the difference π(x) minus li(x) Acta Arith11397ndash410 1966

        [LW02] M-Ch Liu and T Wang On the Vinogradov bound in the three primesGoldbach conjecture Acta Arith 105(2)133ndash175 2002

        [Mar41] K K Mardzhanishvili On the proof of the Goldbach-Vinogradov the-orem (in Russian) C R (Doklady) Acad Sci URSS (NS) 30(8)681ndash684 1941

        [McC84a] K S McCurley Explicit estimates for the error term in the prime num-ber theorem for arithmetic progressions Math Comp 42(165)265ndash285 1984

        [McC84b] K S McCurley Explicit zero-free regions for Dirichlet L-functionsJ Number Theory 19(1)7ndash32 1984

        [Mon68] H L Montgomery A note on the large sieve J London Math Soc4393ndash98 1968

        [Mon71] H L Montgomery Topics in multiplicative number theory LectureNotes in Mathematics Vol 227 Springer-Verlag Berlin 1971

        [MV73] H L Montgomery and R C Vaughan The large sieve Mathematika20119ndash134 1973

        [MV74] H L Montgomery and R C Vaughan Hilbertrsquos inequality J LondonMath Soc (2) 873ndash82 1974

        [MV07] H L Montgomery and R C Vaughan Multiplicative number the-ory I Classical theory volume 97 of Cambridge Studies in AdvancedMathematics Cambridge University Press Cambridge 2007

        [Ned06] N S Nedialkov VNODE-LP a validated solver for initial value prob-lems in ordinary differential equations July 2006 version 03

[OeSHP14] T. Oliveira e Silva, S. Herzog, and S. Pardi. Empirical verification of the even Goldbach conjecture and computation of prime gaps up to $4\cdot 10^{18}$. Math. Comp., 83:2033–2060, 2014.

        [OLBC10] F W J Olver D W Lozier R F Boisvert and Ch W Clark edi-tors NIST handbook of mathematical functions US Department ofCommerce National Institute of Standards and Technology Washing-ton DC 2010 With 1 CD-ROM (Windows Macintosh and UNIX)

        [Olv58] F W J Olver Uniform asymptotic expansions of solutions of lin-ear second-order differential equations for large values of a parameterPhilos Trans Roy Soc London Ser A 250479ndash517 1958

        [Olv59] F W J Olver Uniform asymptotic expansions for Weber paraboliccylinder functions of large orders J Res Nat Bur Standards Sect B63B131ndash169 1959

        [Olv61] F W J Olver Two inequalities for parabolic cylinder functions ProcCambridge Philos Soc 57811ndash822 1961

        [Olv65] F W J Olver On the asymptotic solution of second-order differentialequations having an irregular singularity of rank one with an applica-tion to Whittaker functions J Soc Indust Appl Math Ser B NumerAnal 2225ndash243 1965

        [Olv74] F W J Olver Asymptotics and special functions Academic Press[A subsidiary of Harcourt Brace Jovanovich Publishers] New York-London 1974 Computer Science and Applied Mathematics

[Plaa] D. Platt. Computing π(x) analytically. To appear in Math. Comp. Available as arXiv:1203.5712.

        [Plab] D Platt Numerical computations concerning GRH Preprint Availableat arXiv13053087

        [Pla11] D Platt Computing degree 1 L-functions rigorously PhD thesis Bris-tol University 2011

        [Rama] O Ramare Etat des lieux Preprint Available as httpmathuniv-lille1fr˜ramareMathsExplicitJNTBpdf

        [Ramb] O Ramare Explicit estimates on several summatory functions involv-ing the Moebius function To appear in Math Comp

        [Ramc] O Ramare A sharp bilinear form decomposition for primes and Moe-bius function Preprint To appear in Acta Math Sinica

[Ramd] O. Ramaré. Short effective intervals containing primes. Preprint.

[Ram95] O. Ramaré. On Šnirel'man's constant. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 22(4):645–706, 1995.

        [Ram09] O Ramare Arithmetical aspects of the large sieve inequality volume 1of Harish-Chandra Research Institute Lecture Notes Hindustan BookAgency New Delhi 2009 With the collaboration of D S Ramana

        [Ram10] O Ramare On Bombierirsquos asymptotic sieve J Number Theory130(5)1155ndash1189 2010

        [Ram13] O Ramare From explicit estimates for primes to explicit estimates forthe Mobius function Acta Arith 157(4)365ndash379 2013

        [Ram14] O Ramare Explicit estimates on the summatory functions of theMobius function with coprimality restrictions Acta Arith 165(1)1ndash10 2014

[Ros41] B. Rosser. Explicit bounds for some functions of prime numbers. Amer. J. Math., 63:211–232, 1941.

        [RR96] O Ramare and R Rumely Primes in arithmetic progressions MathComp 65(213)397ndash425 1996

[RS62] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6:64–94, 1962.

[RS75] J. B. Rosser and L. Schoenfeld. Sharper bounds for the Chebyshev functions θ(x) and ψ(x). Math. Comp., 29:243–269, 1975. Collection of articles dedicated to Derrick Henry Lehmer on the occasion of his seventieth birthday.

[RS03] O. Ramaré and Y. Saouter. Short effective intervals containing primes. J. Number Theory, 98(1):10–33, 2003.

        [RV83] H Riesel and R C Vaughan On sums of primes Ark Mat 21(1)46ndash74 1983

        [Sao98] Y Saouter Checking the odd Goldbach conjecture up to 1020 MathComp 67(222)863ndash866 1998

        [Sch33] L Schnirelmann Uber additive Eigenschaften von Zahlen Math Ann107(1)649ndash690 1933

[Sch76] L. Schoenfeld. Sharper bounds for the Chebyshev functions θ(x) and ψ(x). II. Math. Comp., 30(134):337–360, 1976.

[SD10] Y. Saouter and P. Demichel. A sharp region where π(x) − li(x) is positive. Math. Comp., 79(272):2395–2405, 2010.

        [Sel91] A Selberg Lectures on sieves In Collected papers vol II pages66ndash247 Springer Berlin 1991

        [Sha14] X Shao A density version of the Vinogradov three primes theoremDuke Math J 163(3)489ndash512 2014

        [Shu92] F H Shu The Cosmos In Encyclopaedia Britannica Macropaediavolume 16 pages 762ndash795 Encyclopaedia Britannica Inc 15 edition1992

        [Tao14] T Tao Every odd number greater than 1 is the sum of at most fiveprimes Math Comp 83(286)997ndash1038 2014

        [Tem10] N M Temme Parabolic cylinder functions In NIST Handbook ofmathematical functions pages 303ndash319 US Dept Commerce Wash-ington DC 2010

        [Tru] T S Trudgian An improved upper bound for the error in thezero-counting formulae for Dirichlet L-functions and Dedekind zeta-functions Preprint

        [Tuc11] W Tucker Validated numerics A short introduction to rigorous com-putations Princeton University Press Princeton NJ 2011

        [Tur53] A. M. Turing. Some calculations of the Riemann zeta-function. Proc. London Math. Soc. (3), 3:99–117, 1953.

        [TV03] N. M. Temme and R. Vidunas. Parabolic cylinder functions: examples of error bounds for asymptotic expansions. Anal. Appl. (Singap.), 1(3):265–288, 2003.

        [van37] J. G. van der Corput. Sur l'hypothèse de Goldbach pour presque tous les nombres pairs. Acta Arith., 2:266–290, 1937.

        [Vau77a] R. C. Vaughan. On the estimation of Schnirelman's constant. J. Reine Angew. Math., 290:93–108, 1977.

        [Vau77b] R.-C. Vaughan. Sommes trigonométriques sur les nombres premiers. C. R. Acad. Sci. Paris Sér. A-B, 285(16):A981–A983, 1977.

        [Vau80] R. C. Vaughan. Recent work in additive prime number theory. In Proceedings of the International Congress of Mathematicians (Helsinki, 1978), pages 389–394. Acad. Sci. Fennica, Helsinki, 1980.

        [Vau97] R. C. Vaughan. The Hardy-Littlewood method, volume 125 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, second edition, 1997.

        [Vin37] I. M. Vinogradov. A new method in analytic number theory (Russian). Tr. Mat. Inst. Steklova, 10:5–122, 1937.

        [Vin47] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers (Russian). Tr. Mat. Inst. Steklova, 23:3–109, 1947.

        [Vin54] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Interscience Publishers, London and New York, 1954. Translated, revised and annotated by K. F. Roth and Anne Davenport.

        [Vin58] I. M. Vinogradov. A new estimate of the function ζ(1 + it). Izv. Akad. Nauk SSSR. Ser. Mat., 22:161–164, 1958.

        [Vin04] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport. Reprint of the 1954 translation.

        [Wed03] S. Wedeniwski. ZetaGrid - Computational verification of the Riemann hypothesis. Conference in Number Theory in honour of Professor H. C. Williams, Banff, Alberta, Canada, May 2003.

        [Wei84] A. Weil. Number theory: An approach through history. From Hammurapi to Legendre. Birkhäuser Boston, Inc., Boston, MA, 1984.

        [Whi03] E. T. Whittaker. On the functions associated with the parabolic cylinder in harmonic analysis. Proc. London Math. Soc., 35:417–427, 1903.

        [Wig20] S. Wigert. Sur la théorie de la fonction ζ(s) de Riemann. Ark. Mat., 14:1–17, 1920.

        [Won01] R. Wong. Asymptotic approximations of integrals, volume 34 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2001. Corrected reprint of the 1989 original.

        [Zin97] D. Zinoviev. On Vinogradov's constant in Goldbach's ternary problem. J. Number Theory, 65(2):334–358, 1997.
