The ternary Goldbach problem
Harald Andres Helfgott
arXiv:1501.05438v2 [math.NT] 27 Jan 2015
Contents

Preface
Acknowledgements

1 Introduction
  1.1 History and new developments
  1.2 The circle method: Fourier analysis on Z
  1.3 The major arcs M
    1.3.1 What do we really know about L-functions and their zeros?
    1.3.2 Estimates of f(α) for α in the major arcs
  1.4 The minor arcs m
    1.4.1 Qualitative goals and main ideas
    1.4.2 Combinatorial identities
    1.4.3 Type I sums
    1.4.4 Type II, or bilinear, sums
  1.5 Integrals over the major and minor arcs
  1.6 Some remarks on computations
2 Notation and preliminaries
  2.1 General notation
  2.2 Dirichlet characters and L functions
  2.3 Fourier transforms and exponential sums
  2.4 Mellin transforms
  2.5 Bounds on sums of μ and Λ
  2.6 Interval arithmetic and the bisection method

I Minor arcs

3 Introduction
  3.1 Results
  3.2 Comparison to earlier work
  3.3 Basic setup
    3.3.1 Vaughan’s identity
    3.3.2 An alternative route
4 Type I sums
  4.1 Trigonometric sums
  4.2 Type I estimates
    4.2.1 Type I: variations
5 Type II sums
  5.1 The sum S1: cancellation
    5.1.1 Reduction to a sum with μ
    5.1.2 Explicit bounds for a sum with μ
    5.1.3 Estimating the triple sum
  5.2 The sum S2: the large sieve, primes and tails
6 Minor-arc totals
  6.1 The smoothing function
  6.2 Contributions of different types
    6.2.1 Type I terms: S_{I,1}
    6.2.2 Type I terms: S_{I,2}
    6.2.3 Type II terms
  6.3 Adjusting parameters. Calculations.
    6.3.1 First choice of parameters: q ≤ y
    6.3.2 Second choice of parameters
  6.4 Conclusion

II Major arcs

7 Major arcs: overview and results
  7.1 Results
  7.2 Main ideas
8 The Mellin transform of the twisted Gaussian
  8.1 How to choose a smoothing function?
  8.2 The twisted Gaussian: overview and setup
    8.2.1 Relation to the existing literature
    8.2.2 General approach
  8.3 The saddle point
    8.3.1 The coordinates of the saddle point
    8.3.2 The direction of steepest descent
  8.4 The integral over the contour
    8.4.1 A simple contour
    8.4.2 Another simple contour
  8.5 Conclusions
9 Explicit formulas
  9.1 A general explicit formula
  9.2 Sums and decay for the Gaussian
  9.3 The case of η∗(t)
  9.4 The case of η+(t)
  9.5 A sum for η+(t)^2
  9.6 A verification of zeros and its consequences

III The integral over the circle

10 The integral over the major arcs
  10.1 Decomposition of S_η by characters
  10.2 The integral over the major arcs: the main term
  10.3 The ℓ2 norm over the major arcs
  10.4 The integral over the major arcs: conclusion
11 Optimizing and adapting smoothing functions
  11.1 The symmetric smoothing function η
    11.1.1 The product η(t)η(ρ − t)
  11.2 The smoothing function η∗: adapting minor-arc bounds
12 The ℓ2 norm and the large sieve
  12.1 Variations on the large sieve for primes
  12.2 Bounding the quotient in the large sieve for primes
13 The integral over the minor arcs
  13.1 Putting together ℓ2 bounds over arcs and ℓ∞ bounds
  13.2 The minor-arc total
14 Conclusion
  14.1 The ℓ2 norm over the major arcs: explicit version
  14.2 The total major-arc contribution
  14.3 The minor-arc total: explicit version
  14.4 Conclusion: proof of main theorem

IV Appendices

A Norms of smoothing functions
  A.1 The decay of a Mellin transform
  A.2 The difference η+ − η in ℓ2 norm
  A.3 Norms involving η+
  A.4 Norms involving η′+
  A.5 The ℓ∞-norm of η+
B Norms of Fourier transforms
  B.1 The Fourier transform of η′′2
  B.2 Bounds involving a logarithmic factor
C Sums involving Λ and φ
  C.1 Sums over primes
  C.2 Sums involving φ
D Checking small n by checking zeros of ζ(s)
Preface
ἐγγὺς δ’ ἦν τέλεος· ὃ δὲ τὸ τρίτον ἧκε χ[αμᾶζε·
σὺν τῶι δ’ ἐξέφυγεν θάνατον καὶ κῆ[ρα μέλαιναν.
Hesiod (?), Ehoiai, fr. 76.21–2 Merkelbach and West
The ternary Goldbach conjecture (or three-prime conjecture) states that every odd number n greater than 5 can be written as the sum of three primes. The purpose of this book is to give the first full proof of this conjecture.
The proof builds on the great advances made in the early 20th century by Hardy and Littlewood (1922) and Vinogradov (1937). Progress since then has been more gradual. In some ways, it was necessary to clear the board and start work using only the main existing ideas towards the problem, together with techniques developed elsewhere.
Part of the aim has been to keep the exposition as accessible as possible, with an emphasis on qualitative improvements and new technical ideas that should be of use elsewhere. The main strategy was to give an analytic approach that is efficient, relatively clean and, as it must be for this problem, explicit; the focus does not lie in optimizing explicit constants or in performing calculations, necessary as these tasks are.
Organization. In the introduction, after a summary of the history of the problem, we will go over a detailed outline of the proof. The rest of the book is divided in three parts, structured so that they can be read independently: the first two parts do not refer to each other, and the third part uses only the main results (clearly marked) of the first two parts.
As is the case in most proofs involving the circle method, the problem is reduced to showing that a certain integral over the “circle” $\mathbb{R}/\mathbb{Z}$ is non-zero. The circle is divided into major arcs and minor arcs. In Part I – in some ways the technical heart of the proof – we will see how to give upper bounds on the integrand when α is in the minor arcs. Part II will provide rather precise estimates for the integrand when the variable α is in the major arcs. Lastly, Part III shows how to use these inputs as well as possible to estimate the integral.
Each part and each chapter starts with a general discussion of the strategy and the main ideas involved. Some of the more technical bounds and computations are relegated to the appendices.
[Figure: Dependencies between the chapters. Nodes: 1 Introduction; 2 Notation and preliminaries; 3 Minor arcs: introduction; 4 Type I sums; 5 Type II sums; 6 Minor-arc totals; 7 Major arcs: overview; 8 Mellin transform of twisted Gaussian; 9 Explicit formulas; 10 The integral over the major arcs; 11 Smoothing functions and their use; 12 The ℓ2 norm and the large sieve; 13 The integral over the minor arcs; 14 Conclusion.]
Acknowledgements
The author is very thankful to D. Platt, who, working in close coordination with him, provided GRH verifications in the necessary ranges, and also helped him with the usage of interval arithmetic. He is also deeply grateful to O. Ramaré, who, in reply to his requests, prepared and sent for publication several auxiliary results, and who otherwise provided much-needed feedback.
The author is also much indebted to A. Booker, B. Green, R. Heath-Brown, H. Kadiri, D. Platt, T. Tao and M. Watkins for many discussions on Goldbach’s problem and related issues. Several historical questions became clearer due to the help of J. Brandes, K. Gong, R. Heath-Brown, Z. Silagadze, R. Vaughan and T. Wooley. Additional references were graciously provided by R. Bryant, S. Huntsman and I. Rezvyakova. Thanks are also due to B. Bukh, A. Granville and P. Sarnak for their valuable advice.
The introduction is largely based on the author’s article for the Proceedings of the 2014 ICM [Hel14b]. That article, in turn, is based in part on the informal note [Hel13b], which was published in Spanish translation ([Hel13a]; translated by M. A. Morales and the author, and revised with the help of J. Cilleruelo and M. Helfgott) and in a French version ([Hel14a]; translated by M. Bilu and revised by the author). The proof first appeared as a series of preprints: [Helb], [Hela], [Helc].
Travel and other expenses were funded in part by the Adams Prize and the Philip Leverhulme Prize. The author’s work on the problem started at the Université de Montréal (CRM) in 2006; he is grateful to both the Université de Montréal and the École Normale Supérieure for providing pleasant working environments. During the last stages of the work, travel was partly covered by ANR Project Caesar No. ANR-12-BS01-0011.
The present work would most likely not have been possible without free and publicly available software: SAGE, PARI, Maxima, gnuplot, VNODE-LP, PROFIL / BIAS, and, of course, LaTeX, Emacs, the gcc compiler and GNU/Linux in general. Some exploratory work was done in SAGE and Mathematica. Rigorous calculations used either D. Platt’s interval-arithmetic package (based in part on Crlibm) or the PROFIL/BIAS interval arithmetic package underlying VNODE-LP.
The calculations contained in this paper used a nearly trivial amount of resources; they were all carried out on the author’s desktop computers at home and work. However, D. Platt’s computations [Plab] used a significant amount of resources, kindly donated to D. Platt and the author by several institutions. This crucial help was provided by MesoPSL (affiliated with the Observatoire de Paris and Paris Sciences et Lettres), Université de Paris VI/VII (UPMC - DSI - Pôle Calcul), University of Warwick (thanks to Bill Hart), University of Bristol, France Grilles (French National Grid Infrastructure, DIRAC national instance), Université de Lyon 1 and Université de Bordeaux 1. Both D. Platt and the author would like to thank the donating organizations, their technical staff, and all those who helped to make these resources available to them.
Chapter 1
Introduction
The question we will discuss, or one similar to it, seems to have been first posed by Descartes, in a manuscript published only centuries after his death [Des08, p. 298]. Descartes states: “Sed & omnis numerus par fit ex uno vel duobus vel tribus primis” (“But also every even number is made out of one, two or three prime numbers.”¹) This statement comes in the middle of a discussion of sums of polygonal numbers, such as the squares.
Statements on sums of primes and sums of values of polynomials (polygonal numbers, powers $n^k$, etc.) have since shown themselves to be much more than mere curiosities – and not just because they are often very difficult to prove. Whereas the study of sums of powers can rely on their algebraic structure, the study of sums of primes leads to the realization that, from several perspectives, the set of primes behaves much like the set of integers, or like a random set of integers. (It also leads to the realization that this is very hard to prove.)
If, instead of the primes, we had a random set of odd integers S whose density – an intuitive concept that can be made precise – equaled that of the primes, then we would expect to be able to write every odd number as a sum of three elements of S, and every even number as the sum of two elements of S. We would have to check by hand whether this is true for small odd and even numbers, but it is relatively easy to show that, after a long enough check, it would be very unlikely that there would be any exceptions left among the infinitely many cases left to check.
The question, then, is in what sense we need the primes to be like a random set of integers; in other words, we need to know what we can prove about the regularities of the distribution of the primes. This is one of the main questions of analytic number theory; progress on it has been very slow and difficult.
Fourier analysis expresses information on the distribution of a sequence in terms of frequencies. In the case of the primes, what may be called the main frequencies – those in the major arcs – correspond to the same kind of large-scale distribution that is encoded by L-functions, the family of functions to which the Riemann zeta function
¹ Thanks are due to J. Brandes and R. Vaughan for a discussion on a possible ambiguity in the Latin wording. Descartes’ statement is mentioned (with a translation much like the one given here) in Dickson’s History [Dic66, Ch. XVIII].
belongs. On some of the crucial questions on L-functions, the limits of our knowledge have barely budged in the last century. There is something relatively new now, namely, rigorous numerical data of non-negligible scope; still, such data is, by definition, finite, and, as a consequence, its range of applicability is very narrow. Thus, the real question in the major-arc regime is how to use well the limited information we do have on the large-scale distribution of the primes. As we will see, this requires delicate work on explicit asymptotic analysis and smoothing functions.
Outside the main frequencies – that is, in what are called the minor arcs – estimates based on L-functions no longer apply, and what is remarkable is that one can say anything meaningful on the distribution of the primes. Vinogradov was the first to give unconditional, non-trivial bounds, showing that there are no great irregularities in the minor arcs; this is what makes them “minor”. Here the task is to give sharper bounds than Vinogradov. It is in this regime that we can genuinely say that we learn a little more about the distribution of the primes, based on what is essentially an elementary and highly optimized analytic-combinatorial analysis of exponential sums, i.e., Fourier coefficients given by series (supported on the primes, in our case).
The circle method reduces an additive problem – that is, a problem on sums, such as sums of primes, powers, etc. – to the estimation of an integral on the space of frequencies (the “circle” $\mathbb{R}/\mathbb{Z}$). In the case of the primes, as we have just discussed, we have precise estimates on the integrand on part of the circle (the major arcs), and upper bounds on the rest of the circle (the minor arcs). Putting them together efficiently to give an estimate on the integral is a delicate matter; we leave it for the last part, as it is really what is particular to our problem, as opposed to being of immediate general relevance to the study of the primes. As we shall see, estimating the integral well does involve using – and improving – general estimates on the variance of irregularities in the distribution of the primes, as given by the large sieve.
In fact, one of the main general lessons of the proof is that there is a very close relationship between the circle method and the large sieve; we will use the large sieve not just as a tool – which we shall, incidentally, sharpen in certain contexts – but as a source for ideas on how to apply the circle method more effectively.
This has been an attempt at a first look from above. Let us now undertake a more leisurely and detailed overview of the problem and its solution.
1.1 History and new developments
The history of the conjecture starts properly with Euler and his close friend, Christian Goldbach, both of whom lived and worked in Russia at the time of their correspondence – about a century after Descartes’ isolated statement. Goldbach, a man of many interests, is usually classed as a serious amateur; he seems to have awakened Euler’s passion for number theory, which would lead to the beginning of the modern era of the subject [Wei84, Ch. 3, §IV]. In a letter dated June 7, 1742, Goldbach made a conjectural statement on prime numbers, and Euler rapidly reduced it to the following conjecture, which, he said, Goldbach had already posed to him: every positive integer can be written as the sum of at most three prime numbers.
We would now say “every integer greater than 1”, since we no longer consider 1 to be a prime number. Moreover, the conjecture is nowadays split into two:
• the weak, or ternary, Goldbach conjecture states that every odd integer greater than 5 can be written as the sum of three primes;
• the strong, or binary, Goldbach conjecture states that every even integer greater than 2 can be written as the sum of two primes.
As their names indicate, the strong conjecture implies the weak one (easily: subtract 3 from your odd number n, then express n − 3 as the sum of two primes).
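The reduction in parentheses can be tried out directly for small n. The following is only an illustrative sketch: the helper names (`is_prime`, `two_primes`, `three_primes`) are hypothetical, the search is naive trial division, and it plays no role in the proof.

```python
def is_prime(n):
    """Trial division; adequate for the small n used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def two_primes(m):
    """Return primes (p, q) with p + q == m, or None if the search finds none."""
    for p in range(2, m // 2 + 1):
        if is_prime(p) and is_prime(m - p):
            return p, m - p
    return None


def three_primes(n):
    """Write an odd n > 5 as 3 + p + q, using a binary Goldbach pair for n - 3."""
    if n <= 5 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 5")
    pair = two_primes(n - 3)  # n - 3 is even and at least 4
    return None if pair is None else (3,) + pair
```

For instance, `three_primes(101)` subtracts 3 and then searches for a pair of primes adding up to 98.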
The strong conjecture remains out of reach. A short while ago – the first complete version appeared on May 13, 2013 – the author proved the weak Goldbach conjecture.
Theorem 1.1.1. Every odd integer greater than 5 can be written as the sum of three primes.
In 1937, I. M. Vinogradov proved [Vin37] that the conjecture is true for all odd numbers n larger than some constant C. (Hardy and Littlewood had proved the same statement under the assumption of the Generalized Riemann Hypothesis, which we shall have the chance to discuss later.)
It is clear that a computation can verify the conjecture only for n ≤ c, c a constant: computations have to be finite. What can make a result coming from analytic number theory be valid only for n ≥ C?
An analytic proof, generally speaking, gives us more than just existence. In this kind of problem, it gives us more than the possibility of doing something (here, writing an integer n as the sum of three primes). It gives us a rigorous estimate for the number of ways in which this something is possible; that is, it shows us that this number of ways equals
\[ \text{main term} + \text{error term}, \tag{1.1} \]
where the main term is a precise quantity f(n), and the error term is something whose absolute value is at most another precise quantity g(n). If f(n) > g(n), then (1.1) is non-zero, i.e., we will have shown the existence of a way to write our number as the sum of three primes.
(Since what we truly care about is existence, we are free to weigh different ways of writing n as the sum of three primes however we wish – that is, we can decide that some primes “count” twice or thrice as much as others, and that some do not count at all.)
Typically, after much work, we succeed in obtaining (1.1) with f(n) and g(n) such that f(n) > g(n) asymptotically, that is, for n large enough. To give a highly simplified example: if, say, $f(n) = n^2$ and $g(n) = 100 n^{3/2}$, then f(n) > g(n) for n > C, where $C = 10^4$, and so the number of ways (1.1) is positive for n > C.
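The threshold in this toy example can be checked mechanically. The sketch below is purely illustrative (the function names are hypothetical); it compares squares of both sides so that only exact integer arithmetic is used.

```python
def main_term_wins(n):
    """True when f(n) = n^2 exceeds g(n) = 100 * n^(3/2).

    Both sides are non-negative, so it is enough to compare their squares:
    n^4 > 10^4 * n^3.
    """
    return n ** 4 > 10 ** 4 * n ** 3


def last_failure(limit=20000):
    """Largest n <= limit at which the main term does not exceed the error bound."""
    return max(n for n in range(1, limit + 1) if not main_term_wins(n))
```

Here `last_failure()` returns the constant C of the example: the two sides are equal at n = 10^4, and the main term is strictly larger from n = 10^4 + 1 onwards.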
We want a moderate value of C, that is, a C small enough that all cases n ≤ C can be checked computationally. To ensure this, we must make the error term bound g(n) as small as possible. This is our main task. A secondary (and sometimes neglected) possibility is to rig the weights so as to make the main term f(n) larger in comparison to g(n); this can generally be done only up to a certain point, but is nonetheless very helpful.
As we said, the first unconditional proof that odd numbers n ≥ C can be written as the sum of three primes is due to Vinogradov. Analytic bounds fall into several categories, or stages; quite often, successive versions of the same theorem will go through successive stages.
1. An ineffective result shows that a statement is true for some constant C, but gives no way to determine what the constant C might be. Vinogradov’s first proof of his theorem (in [Vin37]) is like this: it shows that there exists a constant C such that every odd number n > C is the sum of three primes, yet gives us no hope of finding out what the constant C might be.² Many proofs of Vinogradov’s result in textbooks are also of this type.
2. An effective, but not explicit, result shows that a statement is true for some unspecified constant C in a way that makes it clear that a constant C could in principle be determined following and reworking the proof with great care. Vinogradov’s later proof ([Vin47], translated in [Vin54]) is of this nature. As Chudakov [Chu47, §IV.2] pointed out, the improvement on [Vin37] given by Mardzhanishvili [Mar41] already had the effect of making the result effective.³
3. An explicit result gives a value of C. According to [Chu47, p. 201], the first explicit version of Vinogradov’s result was given by Borozdkin in his unpublished doctoral dissertation, written under the direction of Vinogradov (1939): $C = \exp(\exp(\exp(41.96)))$. Such a result is, by definition, also effective. Borozdkin later [Bor56] gave the value $C = e^{e^{16.038}}$, though he does not seem to have published the proof. The best – that is, smallest – value of C known before the present work was that of Liu and Wang [LW02]: $C = 2 \cdot 10^{1346}$.
4. What we may call an efficient proof gives a reasonable value for C – in our case, a value small enough that checking all cases up to C is feasible.
How far were we from an efficient proof? That is, what sort of computation could ever be feasible? The situation was paradoxical: the conjecture was known above an explicit C, but $C = 2 \cdot 10^{1346}$ is so large that it could not be said that the problem could be attacked by any foreseeable computational means within our physical universe. (A truly brute-force verification up to C takes at least C steps; a cleverer verification takes well over $\sqrt{C}$ steps. The number of picoseconds since the beginning of the universe is less than $10^{30}$, whereas the number of protons in the observable universe is currently estimated at $\sim 10^{80}$ [Shu92]; this limits the number of steps that can be taken in any currently imaginable computer, even if it were to do parallel processing on an astronomical scale.) Thus, the only way forward was a series of drastic improvements in the mathematical, rather than computational, side.
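The orders of magnitude in this parenthesis can be compared on a logarithmic scale. The following back-of-the-envelope sketch rests on two assumptions that are not in the text: an age of the universe of about 13.8 billion years, and a deliberately generous budget of one computational step per proton per picosecond.

```python
import math

# log10 of C = 2 * 10^1346 and of its square root
LOG10_C = math.log10(2) + 1346
LOG10_SQRT_C = LOG10_C / 2

# Assumption: the universe is about 13.8 billion years old.
AGE_YEARS = 13.8e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600
log10_picoseconds = math.log10(AGE_YEARS * SECONDS_PER_YEAR * 1e12)

# Estimate quoted in the text: about 10^80 protons in the observable universe.
LOG10_PROTONS = 80

# Assumption: at most one step per proton per picosecond.
log10_budget = log10_picoseconds + LOG10_PROTONS

print(f"log10(sqrt(C))      = {LOG10_SQRT_C:.1f}")
print(f"log10(picoseconds)  = {log10_picoseconds:.1f}")
print(f"log10(step budget)  = {log10_budget:.1f}")
```

Even under these assumptions, the exponent of the step budget stays far below that of $\sqrt{C}$, which is the point the text makes.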
I gave a proof with $C = 10^{29}$ in May 2013. Since D. Platt and I had verified the conjecture for all odd numbers up to $n \le 8.8 \cdot 10^{30}$ by computer [HP13], this established the conjecture for all odd numbers n.
² Here, as is often the case in ineffective results in analytic number theory, the underlying issue is that of Siegel zeros, which are believed not to exist, but have not been shown not to; the strongest bounds on (i.e., against) such zeros are ineffective, and so are all of the many results using such estimates.
³ The proof in [Mar41] combined the bounds in [Vin37] with a more careful accounting of the effect of the single possible Siegel zero within range.
(In December 2013, I reduced C to $10^{27}$. The verification of the ternary Goldbach conjecture up to $n \le 10^{27}$ can be done on a home computer over a weekend, as of the time of writing (2014). It must be said that this uses the verification of the binary Goldbach conjecture for $n \le 4 \cdot 10^{18}$ [OeSHP14], which itself required computational resources far outside the home-computing range. Checking the conjecture up to $n \le 10^{27}$ was not even the main computational task that needed to be accomplished to establish the Main Theorem – that task was the finite verification of zeros of L-functions in [Plab], a general-purpose computation that should be useful elsewhere.)
What was the strategy of the proof? The basic framework is the one pioneered by Hardy and Littlewood for a variety of problems – namely, the circle method, which, as we shall see, is an application of Fourier analysis over $\mathbb{Z}$. (There are other, later routes to Vinogradov’s result; see [HB85], [FI98] and especially the recent work [Sha14], which avoids using anything about zeros of L-functions inside the critical strip.) Vinogradov’s proof, like much of the later work on the subject, was based on a detailed analysis of exponential sums, i.e., Fourier transforms over $\mathbb{Z}$. So is the proof that we will sketch.
At the same time, the distance between $2 \cdot 10^{1346}$ and $10^{27}$ is such that we cannot hope to get to $10^{27}$ (or any other reasonable constant) by fine-tuning previous work. Rather, we must work from scratch, using the basic outline in Vinogradov’s original proof and other, initially unrelated, developments in analysis and number theory (notably, the large sieve). Merely improving constants will not do; rather, we must do qualitatively better than previous work (by non-constant factors) if we are to have any chance to succeed. It is on these qualitative improvements that we will focus.
It is only fair to review some of the progress made between Vinogradov’s time and ours. Here we will focus on results; later, we will discuss some of the progress made in the techniques of proof. See [Dic66, Ch. XVIII] for the early history of the problem (before Hardy and Littlewood); see R. Vaughan’s ICM lecture notes on the ternary Goldbach problem [Vau80] for some further details on the history up to 1978.
In 1933, Schnirelmann proved [Sch33] that every integer n > 1 can be written as the sum of at most K primes for some unspecified constant K. (This pioneering work is now considered to be part of the early history of additive combinatorics.) In 1969, Klimov gave an explicit value for K (namely, $K = 6 \cdot 10^9$); he later improved the constant to K = 115 (with G. Z. Piltay and T. A. Sheptickaja) and K = 55. Later, there were results by Vaughan [Vau77a] (K = 27), Deshouillers [Des77] (K = 26) and Riesel-Vaughan [RV83] (K = 19).
Ramaré showed in 1995 that every even number n > 1 can be written as the sum of at most 6 primes [Ram95]. In 2012, Tao proved [Tao14] that every odd number n > 1 is the sum of at most 5 primes.
There have been other avenues of attack towards the strong conjecture. Using ideas close to those of Vinogradov’s, Chudakov [Chu37], [Chu38], Estermann [Est37] and van der Corput [van37] proved (independently from each other) that almost every even number (meaning: all elements of a subset of density 1 in the even numbers) can be written as the sum of two primes. In 1973, J.-R. Chen showed [Che73] that every even
number n larger than a constant C can be written as the sum of a prime number and the product of at most two primes ($n = p_1 + p_2$ or $n = p_1 + p_2 p_3$). Incidentally, J.-R. Chen himself, together with T.-Z. Wang, was responsible for the best bounds on C (for ternary Goldbach) before Liu and Wang: $C = \exp(\exp(11.503)) < 4 \cdot 10^{43000}$ [CW89] and $C = \exp(\exp(9.715)) < 6 \cdot 10^{7193}$ [CW96].
Matters are different if one assumes the Generalized Riemann Hypothesis (GRH). A careful analysis [Eff99] of Hardy and Littlewood’s work [HL22] gives that every odd number $n \ge 1.24 \cdot 10^{50}$ is the sum of three primes if GRH is true.⁴ According to [Eff99], the same statement with $n \ge 10^{32}$ was proven in the unpublished doctoral dissertation of B. Lucke, a student of E. Landau’s, in 1926. Zinoviev [Zin97] improved this to $n \ge 10^{20}$. A computer check ([DEtRZ97]; see also [Sao98]) showed that the conjecture is true for $n < 10^{20}$, thus completing the proof of the ternary Goldbach conjecture under the assumption of GRH. What was open until now was, of course, the problem of giving an unconditional proof.
1.2 The circle method: Fourier analysis on Z
It is common for a first course on Fourier analysis to focus on functions over the reals satisfying $f(x) = f(x + 1)$, or, what is the same, functions $f : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$. Such a function (unless it is fairly pathological) has a Fourier series converging to it; this is just the same as saying that f has a Fourier transform $\widehat{f} : \mathbb{Z} \to \mathbb{C}$ defined by $\widehat{f}(n) = \int_{\mathbb{R}/\mathbb{Z}} f(\alpha) e(-\alpha n)\, d\alpha$ and satisfying $f(\alpha) = \sum_{n \in \mathbb{Z}} \widehat{f}(n) e(\alpha n)$ (Fourier inversion theorem), where $e(t) = e^{2\pi i t}$.
In number theory, we are especially interested in functions $f : \mathbb{Z} \to \mathbb{C}$. Then things are exactly the other way around: provided that f decays reasonably fast as $n \to \pm\infty$ (or becomes 0 for n large enough), f has a Fourier transform $\widehat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ defined by $\widehat{f}(\alpha) = \sum_n f(n) e(-\alpha n)$ and satisfying $f(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f}(\alpha) e(\alpha n)\, d\alpha$. (Highbrow talk: we already knew that $\mathbb{Z}$ is the Fourier dual of $\mathbb{R}/\mathbb{Z}$, and so, of course, $\mathbb{R}/\mathbb{Z}$ is the Fourier dual of $\mathbb{Z}$.) “Exponential sums” (or “trigonometrical sums”, as in the title of [Vin54]) are sums of the form $\sum_n f(n) e(-\alpha n)$; of course, the “circle” in “circle method” is just a name for $\mathbb{R}/\mathbb{Z}$. (To see an actual circle in the complex plane, look at the image of $\mathbb{R}/\mathbb{Z}$ under the map $\alpha \mapsto e(\alpha)$.)
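For a finitely supported f, the transform and its inversion can be checked numerically. The sketch below is only an illustration, with hypothetical function names; it replaces the integral over $\mathbb{R}/\mathbb{Z}$ by a Riemann sum over equally spaced points, which recovers f(n) up to rounding provided the number of sample points exceeds every difference |n − m| with m in the support of f.

```python
import cmath


def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def transform(f, alpha):
    """f_hat(alpha) = sum over n of f(n) e(-alpha n), for f given as a dict n -> f(n)."""
    return sum(value * e(-alpha * n) for n, value in f.items())


def invert(f, n, samples):
    """Riemann sum for the integral over R/Z of f_hat(alpha) e(alpha n) d(alpha)."""
    total = sum(transform(f, k / samples) * e(k * n / samples)
                for k in range(samples))
    return total / samples


# Example: the indicator function of the primes below 10.
f = {2: 1.0, 3: 1.0, 5: 1.0, 7: 1.0}
recovered = [round(invert(f, n, samples=16).real, 9) for n in range(9)]
```

With these values, `recovered` agrees with f at every n from 0 to 8, being 1 at the primes 2, 3, 5, 7 and 0 elsewhere.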
The study of the Fourier transform $\widehat{f}$ is relevant to additive problems in number theory, i.e., questions on the number of ways of writing n as a sum of k integers of a particular form. Why? One answer could be that $\widehat{f}$ gives us information about the “randomness” of f; if f were the characteristic function of a random set, then $\widehat{f}(\alpha)$ would be very small outside a sharp peak at α = 0.
We can also give a more concrete and immediate answer. Recall that, in general, the Fourier transform of a convolution equals the product of the transforms; over $\mathbb{Z}$,
⁴ In fact, Hardy, Littlewood and Effinger use an assumption somewhat weaker than GRH: they assume that Dirichlet L-functions have no zeroes satisfying $\Re(s) \ge \theta$, where $\theta < 3/4$ is arbitrary. (We will review Dirichlet L-functions in a minute.)
this means that for the additive convolution
\[ (f * g)(n) = \sum_{\substack{m_1, m_2 \in \mathbb{Z} \\ m_1 + m_2 = n}} f(m_1) g(m_2), \]
the Fourier transform satisfies the simple rule
\[ \widehat{f * g}(\alpha) = \widehat{f}(\alpha) \cdot \widehat{g}(\alpha). \]
We can see right away from this that $(f * g)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2$ for some $m_1$, $m_2$ such that $f(m_1)$ and $g(m_2)$ are non-zero. Similarly, $(f * g * h)(n)$ can be non-zero only if n can be written as $n = m_1 + m_2 + m_3$ for some $m_1$, $m_2$, $m_3$ such that $f(m_1)$, $g(m_2)$ and $h(m_3)$ are all non-zero. This suggests that, to study the ternary Goldbach problem, we define $f_1, f_2, f_3 : \mathbb{Z} \to \mathbb{C}$ so that they take non-zero values only at the primes.
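To make the role of the convolution concrete, one can take f to be the plain indicator function of the primes (a simplifying assumption for this sketch, not the weighted choice used in the proof) and compute f ∗ f ∗ f directly; its value at n is then the number of ordered triples of primes with sum n. The function names are hypothetical.

```python
def prime_indicator(limit):
    """f(n) = 1 if n is prime, 0 otherwise, for 0 <= n <= limit (limit >= 1)."""
    f = [1] * (limit + 1)
    f[0] = f[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if f[p]:
            for multiple in range(p * p, limit + 1, p):
                f[multiple] = 0
    return f


def convolve(f, g, limit):
    """(f * g)(n) = sum over m1 + m2 = n of f(m1) g(m2), for 0 <= n <= limit."""
    h = [0] * (limit + 1)
    for m1, a in enumerate(f):
        if a == 0:
            continue
        for m2, b in enumerate(g):
            if m1 + m2 > limit:
                break
            h[m1 + m2] += a * b
    return h


LIMIT = 200
f = prime_indicator(LIMIT)
triple = convolve(convolve(f, f, LIMIT), f, LIMIT)
# triple[n] counts ordered triples of primes (p1, p2, p3) with p1 + p2 + p3 = n.
```

For example, `triple[9]` is 4: the triple (3, 3, 3) together with the three orderings of (2, 2, 5).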
Hardy and Littlewood defined $f_1(n) = f_2(n) = f_3(n) = 0$ for n non-prime (and also for n ≤ 0), and $f_1(n) = f_2(n) = f_3(n) = (\log n) e^{-n/x}$ for n prime (where x is a parameter to be fixed later). Here the factor $e^{-n/x}$ is there to provide “fast decay”, so that everything converges; as we will see later, Hardy and Littlewood’s choice of $e^{-n/x}$ (rather than some other function of fast decay) comes across in hindsight as being very clever, though not quite best-possible. (Their “choice” was, to some extent, not a choice, but an artifact of their version of the circle method, which was framed in terms of power series, not in terms of exponential sums with arbitrary smoothing functions.) The term $\log n$ is there for technical reasons – in essence, it makes sense to put it there because a random integer around n has a chance of about $1/(\log n)$ of being prime.
We can see that $(f_1 * f_2 * f_3)(n) \ne 0$ if and only if n can be written as the sum of three primes. Our task is then to show that $(f_1 * f_2 * f_3)(n)$ (i.e., $(f * f * f)(n)$) is non-zero for every n larger than a constant $C \sim 10^{27}$. Since the transform of a convolution equals a product of transforms,
\[ (f_1 * f_2 * f_3)(n) = \int_{\mathbb{R}/\mathbb{Z}} \widehat{f_1 * f_2 * f_3}(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.2} \]
Our task is thus to show that the integral $\int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha$ is non-zero.
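As it happens, $\widehat{f}(\alpha)$ is particularly large when α is close to a rational with small denominator. Moreover, for such α, it turns out we can actually give rather precise estimates for $\widehat{f}(\alpha)$. Define $\mathfrak{M}$ (called the set of major arcs) to be a union of narrow arcs around the rationals with small denominator:
\[ \mathfrak{M} = \bigcup_{q \le r} \; \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left( \frac{a}{q} - \frac{1}{qQ},\; \frac{a}{q} + \frac{1}{qQ} \right), \]
where Q is a constant times x/r, and r will be set later. (This is a slight simplification: the major-arc set we will actually use in the course of the proof will be a little different, due to a distinction between odd and even q.) We can write
\[ \int_{\mathbb{R}/\mathbb{Z}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha = \int_{\mathfrak{M}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha + \int_{\mathfrak{m}} (\widehat{f_1} \widehat{f_2} \widehat{f_3})(\alpha)\, e(\alpha n)\, d\alpha, \tag{1.3} \]
where $\mathfrak{m}$ is the complement $(\mathbb{R}/\mathbb{Z}) \setminus \mathfrak{M}$ (called minor arcs).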
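The simplified definition of the major arcs above can be enumerated with exact rational arithmetic. The sketch below is only illustrative: the function names are hypothetical, the values r = 5 and Q = 100 are arbitrary choices made for the example, and the code follows the simplified definition rather than the set actually used in the proof.

```python
from fractions import Fraction
from math import gcd


def major_arcs(r, Q):
    """Arcs (a/q - 1/(qQ), a/q + 1/(qQ)) for q <= r and a mod q with (a, q) = 1."""
    arcs = []
    for q in range(1, r + 1):
        for a in range(q):
            if gcd(a, q) == 1:
                centre = Fraction(a, q)
                half_width = 1 / (q * Fraction(Q))
                arcs.append((centre - half_width, centre + half_width))
    return arcs


def total_length(arcs):
    """Sum of the lengths of the arcs (their measure, when they do not overlap)."""
    return sum(hi - lo for lo, hi in arcs)


# Arbitrary illustrative parameters, not those of the proof.
arcs = major_arcs(r=5, Q=Fraction(100))
print(len(arcs), total_length(arcs))
```

With these parameters there are ten arcs, of total length 26/375, roughly 7% of the circle. Distinct fractions a/q and a'/q' with q, q' ≤ r differ by at least 1/(qq'), so the arcs are pairwise disjoint as soon as Q ≥ 2r.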
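Now, we simply do not know how to give precise estimates for $\widehat{f}(\alpha)$ when α is in $\mathfrak{m}$. However, as Vinogradov realized, one can give reasonable upper bounds on $|\widehat{f}(\alpha)|$ for $\alpha \in \mathfrak{m}$. This suggests the following strategy: show that
\[ \int_{\mathfrak{m}} |\widehat{f_1}(\alpha)| |\widehat{f_2}(\alpha)| |\widehat{f_3}(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \widehat{f_1}(\alpha) \widehat{f_2}(\alpha) \widehat{f_3}(\alpha)\, e(\alpha n)\, d\alpha. \tag{1.4} \]
By (1.2) and (1.3), this will imply immediately that $(f_1 * f_2 * f_3)(n) > 0$, and so we will be done.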
The name of circle method is given to the study of additive problems by means ofFourier analysis over Z and in particular to the use of a subdivision of the circle RZinto major and minor arcs to estimate the integral of a Fourier transform There wasa ldquocirclerdquo already in Hardy and Ramanujanrsquos work [HR00] but the subdivision intomajor and minor arcs is due to Hardy and Littlewood who also applied their methodto a wide variety of additive problems (Hence ldquothe Hardy-Littlewood methodrdquo as analternative name for the circle method) For instance before working on the ternaryGoldbach conjecture they studied the question of whether every n gt C can be writtenas the sum of kth powers (Waringrsquos problem) In fact they used a subdivision intomajor and minor arcs to study Waringrsquos problem and not for the ternary Goldbachproblem they had no minor-arc bounds for ternary Goldbach and their use of GRHhad the effect of making every α isin RZ yield to a major-arc treatment
Vinogradov worked with finite exponential sums, i.e., $f_i$ compactly supported. From today’s perspective, it is clear that there are applications (such as ours) in which it can be more important for $f_i$ to be smooth than compactly supported; still, Vinogradov’s simplifications were an incentive to further developments. In the case of the ternary Goldbach problem, his key contribution consisted in the fact that he could give bounds on $\widehat{f}(\alpha)$ for $\alpha$ in the minor arcs without using GRH.
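An important note: in the case of the binary Goldbach conjecture, the method fails at (1.4), and not before; if our understanding of the actual value of $\widehat{f}_i(\alpha)$ is at all correct, it is simply not true in general that
\[
\int_{\mathfrak{m}} |\widehat{f}_1(\alpha)| |\widehat{f}_2(\alpha)|\, d\alpha < \int_{\mathfrak{M}} \widehat{f}_1(\alpha) \widehat{f}_2(\alpha) e(\alpha n)\, d\alpha.
\]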
Let us see why this is not surprising. Set $f_1 = f_2 = f_3 = f$ for simplicity, so that we have the integral of the square $(\widehat{f}(\alpha))^2$ for the binary problem, and the integral of the cube $(\widehat{f}(\alpha))^3$ for the ternary problem. Squaring, like cubing, amplifies the peaks of $\widehat{f}(\alpha)$, which are at the rationals of small denominator and their immediate neighborhoods (the major arcs); however, cubing amplifies the peaks much more than squaring. This is why, even though the arcs making up $\mathfrak{M}$ are very narrow, $\int_{\mathfrak{M}} (\widehat{f}(\alpha))^3 e(\alpha n)\, d\alpha$ is larger than $\int_{\mathfrak{m}} |\widehat{f}(\alpha)|^3\, d\alpha$; that explains the name major arcs: they are not large, but they give the major part of the contribution. In contrast, squaring amplifies the peaks less, and this is why the absolute value of $\int_{\mathfrak{M}} \widehat{f}(\alpha)^2 e(\alpha n)\, d\alpha$ is in general smaller than $\int_{\mathfrak{m}} |\widehat{f}(\alpha)|^2\, d\alpha$. As nobody knows how to prove a precise estimate (and, in particular, lower bounds) on $\widehat{f}(\alpha)$ for $\alpha \in \mathfrak{m}$, the binary Goldbach conjecture is still very much out of reach.
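The amplification of peaks can be observed in a small numerical experiment. The following is only a toy sketch under stated assumptions: $f(n) = \log n$ on primes $n \le x$ with a sharp cutoff, a tiny $x$, arcs of constant width $c/x$ around fractions with denominator at most $r$, and a Riemann sum over a uniform grid. The names `primes_up_to` and `major_share` and all parameter values are hypothetical choices for illustration, far from the ranges used in the proof.

```python
import cmath
import math

def primes_up_to(x):
    """Sieve of Eratosthenes: all primes p <= x."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for m in range(p * p, x + 1, p):
                is_prime[m] = False
    return [n for n in range(x + 1) if is_prime[n]]

def major_share(k, x=500, r=6, c=3.0, grid=2000):
    """Share of the Riemann sum of |f^(alpha)|^k carried by the toy major
    arcs {alpha : |alpha - a/q| <= c/x for some q <= r}."""
    weights = [(p, math.log(p)) for p in primes_up_to(x)]
    major = total = 0.0
    for j in range(grid):
        alpha = j / grid
        f_hat = sum(w * cmath.exp(-2j * math.pi * alpha * p) for p, w in weights)
        value = abs(f_hat) ** k
        total += value
        # |alpha - a/q| <= c/x for some integer a  <=>  ||alpha q|| <= q c / x
        if any(abs(alpha * q - round(alpha * q)) <= q * c / x for q in range(1, r + 1)):
            major += value
    return major / total
```

Comparing `major_share(2)` with `major_share(3)` shows the share of the major arcs growing when the square is replaced by the cube, in line with the heuristic above; the experiment says nothing, of course, about signed integrals or about large $x$.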
To prove the ternary Goldbach conjecture, it is enough to estimate both sides of (1.4) for carefully chosen $f_1$, $f_2$, $f_3$, and compare them. This is our task from now on.
1.3 The major arcs $\mathfrak{M}$
1.3.1 What do we really know about L-functions and their zeros?

Before we start, let us give a very brief review of basic analytic number theory (in the sense of, say, [Dav67]). A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (In other words, $\chi(n) = \chi(n+q)$ for all $n$, $\chi(ab) = \chi(a)\chi(b)$ for all $a$, $b$, and $\chi(n) = 0$ for $(n,q) \ne 1$.) A Dirichlet L-series is defined by
\[
L(s,\chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}
\]
for $\Re(s) > 1$, and by analytic continuation for $\Re(s) \le 1$. (The Riemann zeta function $\zeta(s)$ is the L-function for the trivial character, i.e., the character $\chi$ such that $\chi(n) = 1$ for all $n$.) Taking logarithms and then derivatives, we see that
\[
-\frac{L'(s,\chi)}{L(s,\chi)} = \sum_{n=1}^{\infty} \chi(n) \Lambda(n) n^{-s} \tag{1.5}
\]
for $\Re(s) > 1$, where $\Lambda$ is the von Mangoldt function ($\Lambda(n) = \log p$ if $n$ is some prime power $p^{\alpha}$, $\alpha \ge 1$, and $\Lambda(n) = 0$ otherwise).
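For readers who like to experiment, here is a minimal sketch of the von Mangoldt function by trial division (the function name is our own choice). For the trivial character, comparing coefficients in (1.5) multiplied by $\zeta(s)$ amounts to the identity $\sum_{d|n} \Lambda(d) = \log n$, which is easy to check numerically.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:         # strip every factor of the smallest prime p
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)                # no divisor up to sqrt(n): n is prime

def divisor_sum_of_lambda(n):
    """Sum of Lambda(d) over the divisors d of n; should equal log n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
```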
Dirichlet introduced his characters and L-series so as to study primes in arithmetic progressions. In general, and after some work, (1.5) allows us to restate many sums over the primes (such as our Fourier transforms $\widehat{f}(\alpha)$) as sums over the zeros of $L(s,\chi)$.

A non-trivial zero of $L(s,\chi)$ is a zero of $L(s,\chi)$ such that $0 < \Re(s) < 1$. (The other zeros are called trivial because we know where they are, namely, at negative integers and, in some cases, also on the line $\Re(s) = 0$. In order to eliminate all zeros on $\Re(s) = 0$ outside $s = 0$, it suffices to assume that $\chi$ is primitive; a primitive character modulo $q$ is one that is not induced by (i.e., not the restriction of) any character modulo $d|q$, $d < q$.)
The Generalized Riemann Hypothesis for Dirichlet L-functions is the statement that, for every Dirichlet character $\chi$, every non-trivial zero of $L(s,\chi)$ satisfies $\Re(s) = 1/2$. Of course, the Generalized Riemann Hypothesis (GRH), like the Riemann Hypothesis, which is the special case of $\chi$ trivial, remains unproven. Thus, if we want to prove unconditional statements, we need to make do with partial results towards GRH. Two kinds of such results have been proven:
• Zero-free regions. Ever since the late nineteenth century (Hadamard, de la Vallée-Poussin), we have known that there are hourglass-shaped regions (more precisely, of the shape $\frac{c}{\log t} \le \sigma \le 1 - \frac{c}{\log t}$, where $c$ is a constant and where we write $s = \sigma + it$) outside which non-trivial zeros cannot lie. Explicit values for $c$ are known [McC84b], [Kad05], [Kad]. There is also the Vinogradov-Korobov region [Kor58], [Vin58], which is broader asymptotically but narrower in most of the practical range (see [For02], however).

• Finite verifications of GRH. It is possible to (ask a computer to) prove small, finite fragments of GRH, in the sense of verifying that all non-trivial zeros of a given finite set of L-functions with imaginary part less than some constant $H$ lie on the critical line $\Re(s) = 1/2$. Such verifications go back to Riemann, who checked the first few zeros of $\zeta(s)$. Large-scale, rigorous computer-based verifications are now a possibility.
Most work in the literature follows the first alternative, though [Tao14] did use a finite verification of RH (i.e., GRH for the trivial character). Unfortunately, zero-free regions seem too narrow to be useful for the ternary Goldbach problem. Thus, we are left with the second alternative.
In coordination with the present work, Platt [Plab] verified that all zeros $s$ of L-functions for characters $\chi$ with modulus $q \le 300000$ satisfying $\Im(s) \le H_q$ lie on the line $\Re(s) = 1/2$, where

• $H_q = 10^8/q$ for $q$ odd, and

• $H_q = \max(10^8/q,\ 200 + 7.5 \cdot 10^7/q)$ for $q$ even.
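The height $H_q$ is simple enough to tabulate directly. The following helper is a sketch for exploring the two cases; the function name `platt_height` is hypothetical, and the formula is just the one stated above.

```python
def platt_height(q):
    """Height H_q up to which GRH was checked for characters of modulus q,
    following the two cases stated in the text (valid for 1 <= q <= 300000)."""
    if not 1 <= q <= 300_000:
        raise ValueError("the verification covers moduli 1 <= q <= 300000 only")
    base = 1e8 / q
    if q % 2 == 1:
        return base
    return max(base, 200 + 7.5e7 / q)
```

For even $q$, the second expression takes over once $200 + 7.5 \cdot 10^7/q > 10^8/q$, that is, for $q > 125000$; in particular, $H_q \ge 200$ for all even $q$ in the range.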
This was a medium-large computation, taking a few hundreds of thousands of core-hours on a parallel computer. It used interval arithmetic for the sake of rigor; we will later discuss what this means.

The choice to use a finite verification of GRH, rather than zero-free regions, had consequences on the manner in which the major and minor arcs had to be chosen. As we shall see, such a verification can be used to give very precise bounds on the major arcs, but also forces us to define them so that they are narrow and their number is constant. To be precise: the major arcs were defined around rationals $a/q$ with $q \le r$, $r = 300000$; moreover, as will become clear, the fact that $H_q$ is finite will force their width to be bounded by $c_0 r/(qx)$, where $c_0$ is a constant (say, $c_0 = 8$).
1.3.2 Estimates of $\widehat{f}(\alpha)$ for $\alpha$ in the major arcs
Recall that we want to estimate sums of the type $\widehat{f}(\alpha) = \sum f(n) e(-\alpha n)$, where $f(n)$ is something like $(\log n)\eta(n/x)$ for $n$ equal to a prime, and $0$ otherwise; here $\eta : \mathbb{R} \to \mathbb{C}$ is some function of fast decay, such as Hardy and Littlewood’s choice
\[
\eta(t) = \begin{cases} e^{-t} & \text{for } t \ge 0, \\ 0 & \text{for } t < 0. \end{cases}
\]
Let us modify this just a little: we will actually estimate
\[
S_\eta(\alpha, x) = \sum \Lambda(n) e(\alpha n) \eta(n/x), \tag{1.6}
\]
where $\Lambda$ is the von Mangoldt function (as in (1.5)). The use of $\alpha$ rather than $-\alpha$ is just a bow to tradition, as is the use of the letter $S$ (for “sum”); however, the use of $\Lambda(n)$ rather than just plain $\log p$ does actually simplify matters.
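To get a feeling for (1.6), one can evaluate $S_\eta(\alpha, x)$ directly for a small $x$. The sketch below assumes Hardy and Littlewood’s weight $\eta(t) = e^{-t}$ and truncates the sum at $n \le 40x$, where the weight is negligible; the function names and the cutoff are our own choices for illustration.

```python
import cmath
import math

def von_mangoldt_table(N):
    """lam[n] = Lambda(n) for 0 <= n <= N, computed with a simple sieve."""
    lam = [0.0] * (N + 1)
    composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if composite[p]:
            continue
        for m in range(p * p, N + 1, p):
            composite[m] = True
        pk = p
        while pk <= N:                # Lambda(p^k) = log p for every k >= 1
            lam[pk] = math.log(p)
            pk *= p
    return lam

def S_eta(alpha, x, cutoff=40):
    """S_eta(alpha, x) for eta(t) = exp(-t), t >= 0, summed over n <= cutoff * x."""
    N = int(cutoff * x)
    lam = von_mangoldt_table(N)
    return sum(lam[n] * math.exp(-n / x) * cmath.exp(2j * math.pi * alpha * n)
               for n in range(2, N + 1) if lam[n])
```

With $x = 200$, the sum at $\alpha = 0$ is close to $x$, the sum at $\alpha = 1/2$ is nearly as large in absolute value, and the sum at a generic point such as $\alpha = \sqrt{2}$ is much smaller: the peaks sit at rationals with small denominator.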
The function $\eta$ here is sometimes called a smoothing function or simply a smoothing. It will indeed be helpful for it to be smooth on $(0, \infty)$, but, in principle, it need not even be continuous. (Vinogradov’s work implicitly uses, in effect, the “brutal truncation” $1_{[0,1]}(t)$, defined to be $1$ when $t \in [0,1]$ and $0$ otherwise; that would be fine for the minor arcs, but, as it will become clear, it is a bad idea as far as the major arcs are concerned.)
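Assume $\alpha$ is on a major arc, meaning that we can write $\alpha = a/q + \delta/x$ for some $a/q$ ($q$ small) and some $\delta$ (with $|\delta|$ small). We can write $S_\eta(\alpha, x)$ as a linear combination
\[
S_\eta(\alpha, x) = \sum_{\chi} c_\chi S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) + \text{tiny error term}, \tag{1.7}
\]
where
\[
S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = \sum \Lambda(n) \chi(n) e(\delta n/x) \eta(n/x). \tag{1.8}
\]
In (1.7), $\chi$ runs over primitive Dirichlet characters of moduli $d|q$, and $c_\chi$ is small ($|c_\chi| \le \sqrt{d}/\phi(q)$).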
Why are we expressing the sums $S_\eta(\alpha, x)$ in terms of the sums $S_{\eta,\chi}(\delta/x, x)$, which look more complicated? The argument has become $\delta/x$, whereas before it was $\alpha$. Here $\delta$ is relatively small: smaller than the constant $c_0 r$, in our setup. In other words, $e(\delta n/x)$ will go around the circle a bounded number of times as $n$ goes from $1$ up to a constant times $x$ (by which time $\eta(n/x)$ has become small, because $\eta$ is of fast decay). This makes the sums much easier to estimate.
To estimate the sums $S_{\eta,\chi}$, we will use L-functions, together with one of the most common tools of analytic number theory, the Mellin transform. This transform is essentially a Laplace transform with a change of variables, and a Laplace transform, in turn, is a Fourier transform taken on a vertical line in the complex plane. For $f$ of fast enough decay, the Mellin transform $F = Mf$ of $f$ is given by
\[
F(s) = \int_0^\infty f(t) t^s \frac{dt}{t};
\]
we can express $f$ in terms of $F$ by the Mellin inversion formula
\[
f(t) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} F(s) t^{-s}\, ds
\]
for any $\sigma$ within an interval. We can thus express $e(\delta t)\eta(t)$ in terms of its Mellin transform $F_\delta$ and then use (1.5) to express $S_{\eta,\chi}$ in terms of $F_\delta$ and $L'(s,\chi)/L(s,\chi)$; shifting the integral in the Mellin inversion formula to the left, we obtain what is known in analytic number theory as an explicit formula:
\[
S_{\eta,\chi}(\delta/x, x) = [\widehat{\eta}(-\delta) x] - \sum_{\rho} F_\delta(\rho) x^{\rho} + \text{tiny error term}.
\]
Here the term between brackets appears only for $\chi$ trivial. In the sum, $\rho$ goes over all non-trivial zeros of $L(s,\chi)$, and $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$. (The tiny error term comes from a sum over the trivial zeros of $L(s,\chi)$.) We will obtain the estimate we desire if we manage to show that the sum over $\rho$ is small.
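The Mellin transform defined above is easy to approximate numerically, which helps build intuition for how fast $F(\sigma + i\tau)$ decays in $\tau$. The following sketch substitutes $t = e^u$ and applies the trapezoidal rule; the function name `mellin`, the integration window and the step size are assumptions made for illustration, adequate for the two rapidly decaying weights tested here and for moderate $|\tau|$ only.

```python
import cmath
import math

def mellin(f, s, u_min=-80.0, u_max=6.0, h=0.01):
    """Approximate F(s) = int_0^infty f(t) t^s dt/t for Re(s) > 0.
    With t = exp(u), the integral becomes int f(exp(u)) exp(s u) du,
    which the trapezoidal rule handles well for smooth, decaying integrands."""
    n = int(round((u_max - u_min) / h))
    total = 0j
    for j in range(n + 1):
        u = u_min + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * f(math.exp(u)) * cmath.exp(s * u)
    return total * h

def exp_weight(t):
    return math.exp(-t)               # Hardy and Littlewood's choice, t >= 0

def gauss_weight(t):
    return math.exp(-t * t / 2)       # the Gaussian
```

For $f(t) = e^{-t}$, the transform is $\Gamma(s)$, and on the line $\Re(s) = 1/2$ one has the classical identity $|\Gamma(1/2 + i\tau)|^2 = \pi/\cosh(\pi\tau)$, so the exponential decay in $|\tau|$ can be seen directly.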
The point is this: if we verify GRH for $L(s,\chi)$ up to imaginary part $H$, i.e., if we check that all zeroes $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le H$ satisfy $\Re(\rho) = 1/2$, we have $|x^{\rho}| = \sqrt{x}$. In other words, $x^{\rho}$ is very small (compared to $x$). However, for any $\rho$ whose imaginary part has absolute value greater than $H$, we know next to nothing about its real part, other than $0 \le \Re(\rho) \le 1$. (Zero-free regions are notoriously weak for $\Im(\rho)$ large; we will not use them.) Hence, our only chance is to make sure that $F_\delta(\rho)$ is very small when $|\Im(\rho)| \ge H$.
This has to be true for both $\delta$ very small (including the case $\delta = 0$) and for $\delta$ not so small ($|\delta|$ up to $c_0 r/q$, which can be large because $r$ is a large constant). How can we choose $\eta$ so that $F_\delta(\rho)$ is very small in both cases for $\tau = \Im(\rho)$ large?
The method of stationary phase is useful as an exploratory tool here. In brief, it suggests (and can sometimes prove) that the main contribution to the integral
\[
F_\delta(s) = \int_0^\infty e(\delta t) \eta(t) t^s \frac{dt}{t} \tag{1.9}
\]
can be found where the phase of the integrand has derivative $0$. This happens when $t = -\tau/(2\pi\delta)$ (for $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$); the contribution is then a moderate factor times $\eta(-\tau/(2\pi\delta))$. In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small ($|\delta| \ge 8$, say), $F_\delta(\sigma + i\tau)$ behaves like $\eta(-\tau/(2\pi\delta))$; if $\delta$ is small ($|\delta| < 8$), then $F_\delta$ behaves like $F_0$, which is the Mellin transform $M\eta$ of $\eta$. Here is our goal, then: the decay of $\eta(t)$ as $|t| \to \infty$ should be as fast as possible, and the decay of the transform $M\eta(\sigma + i\tau)$ should also be as fast as possible.
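The location of the stationary point can be checked in one line. Assume, for this sketch, that $\eta$ is real-valued, so that all the oscillation in the integrand of (1.9) at $s = \sigma + i\tau$ comes from $e(\delta t)\, t^{i\tau} = e^{i\phi(t)}$:

```latex
\[
\phi(t) = 2\pi\delta t + \tau \log t,
\qquad
\phi'(t) = 2\pi\delta + \frac{\tau}{t} = 0
\iff
t = -\frac{\tau}{2\pi\delta}.
\]
```

This value of $t$ is positive, and hence lies inside the range of integration, exactly when $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$, which is the condition stated above.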
This is a classical dilemma, often called the uncertainty principle because it is the mathematical fact underlying the physical principle of the same name: you cannot have a function $\eta$ that decreases extremely rapidly and whose Fourier transform (or, in this case, its Mellin transform) also decays extremely rapidly.
What does “extremely rapidly” mean here? It means (as Hardy himself proved) “faster than any exponential $e^{-Ct}$”. Thus, Hardy and Littlewood’s choice $\eta(t) = e^{-t}$ seems essentially optimal at first sight.

However, it is not optimal. We can choose $\eta$ so that $M\eta$ decreases exponentially (with a constant $C$ somewhat worse than for $\eta(t) = e^{-t}$), but $\eta$ decreases faster than exponentially. This is a particularly appealing possibility because it is $t/|\delta|$, and not so much $t$, that risks being fairly small. (To be explicit: say we check GRH for characters of modulus $q$ up to $H_q \sim 50 \cdot c_0 r/q \ge 50|\delta|$. Then we only know that $|\tau/(2\pi\delta)| \gtrsim 8$. So, for $\eta(t) = e^{-t}$, $\eta(-\tau/(2\pi\delta))$ may be as large as $e^{-8}$, which is not negligible. Indeed, since this term will be multiplied later by other terms, $e^{-8}$ is simply not small enough. On the other hand, we can assume that $H_q \ge 200$ (say), and so $M\eta(s) \sim e^{-(\pi/2)|\tau|}$ is completely negligible, and will remain negligible even if we replace $\pi/2$ by a somewhat smaller constant.)
We shall take $\eta(t) = e^{-t^2/2}$ (that is, the Gaussian). This is not the only possible choice, but it is in some sense natural. It is easy to show that the Mellin transform $F_\delta$ for $\eta(t) = e^{-t^2/2}$ is a multiple of what is called a parabolic cylinder function $U(a,z)$ with imaginary values for $z$. There are plenty of estimates on parabolic cylinder functions in the literature, but mostly for $a$ and $z$ real, in part because that is one of the cases occurring most often in applications. There are some asymptotic expansions and estimates for $U(a,z)$, $a$, $z$ general, due to Olver [Olv58], [Olv59], [Olv61], [Olv65], but, unfortunately, they come without fully explicit error terms for $a$ and $z$ within our range of interest. (The same holds for [TV03].)
In the end, I derived bounds for $F_\delta$ using the saddle-point method. (The method of stationary phase, which we used to choose $\eta$, seems to lead to error terms that are too large.) The saddle-point method consists, in brief, in changing the contour of an integral to be bounded (in this case, (1.9)) so as to minimize the maximum of the integrand. (To use a metaphor in [dB81]: find the lowest mountain pass.)
Here we strive to get clean bounds, rather than the best possible constants. Consider the case $k = 0$ of Corollary 8.0.2; it states the following. For $s = \sigma + i\tau$ with $\sigma \in [0,1]$ and $|\tau| \ge \max(100, 4\pi^2|\delta|)$, we obtain that the Mellin transform $F_\delta$ of $\eta(t)e(\delta t)$ with $\eta(t) = e^{-t^2/2}$ satisfies
\[
|F_\delta(s+k)| + |F_\delta((1-s)+k)| \le
\begin{cases}
3.001\, e^{-0.1065 \left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\
3.286\, e^{-0.1598 |\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2.
\end{cases} \tag{1.10}
\]
Similar bounds hold for $\sigma$ in other ranges, thus giving us estimates on the Mellin transform $F_\delta$ for $\eta(t) = t^k e^{-t^2/2}$ and $\sigma$ in the critical range $[0,1]$. (We could do a little better if we knew the value of $\sigma$, but, in our applications, we do not, once we leave the range in which GRH has been checked. We will give a bound (Theorem 8.0.1) that does take $\sigma$ into account, and also reflects and takes advantage of the fact that there is a transitional region around $|\tau| \sim (3/2)(\pi\delta)^2$; in practice, however, we will use Cor. 8.0.2.)
A moment’s thought shows that we can also use (1.10) to deal with the Mellin transform of $\eta(t)e(\delta t)$ for any function of the form $\eta(t) = e^{-t^2/2} g(t)$ (or, more generally, $\eta(t) = t^k e^{-t^2/2} g(t)$), where $g(t)$ is any band-limited function. By a band-limited function, we could mean a function whose Fourier transform is compactly supported; while that is a plausible choice, it turns out to be better to work with functions that are band-limited with respect to the Mellin transform, in the sense of being of the form
\[
g(t) = \int_{-R}^{R} h(r) t^{-ir}\, dr,
\]
where $h : \mathbb{R} \to \mathbb{C}$ is supported on a compact interval $[-R, R]$, with $R$ not too large (say, $R = 200$). What happens is that the Mellin transform of the product $e^{-t^2/2} g(t) e(\delta t)$ is a convolution of the Mellin transform $F_\delta(s)$ of $e^{-t^2/2} e(\delta t)$ (estimated in (1.10)) and that of $g(t)$ (supported in $[-R, R]$); the effect of the convolution is just to delay decay of $F_\delta(s)$ by, at most, a shift by $y \mapsto y - R$.
We wish to estimate $S_{\eta,\chi}(\delta/x, x)$ for several functions $\eta$. This motivates us to derive an explicit formula (§) general enough to work with all the weights $\eta(t)$ we will work with, while being also completely explicit, and free of any integrals that may be tedious to evaluate.

Once that is done, and once we consider the input provided by Platt’s finite verification of GRH up to $H_q$, we obtain simple bounds for different weights.
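For $\eta(t) = e^{-t^2/2}$, $x \ge 10^8$, $\chi$ a primitive character of modulus $q \le r = 300000$, and any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$, we obtain
\[
S_{\eta,\chi}\left(\frac{\delta}{x}, x\right) = I_{q=1} \cdot \widehat{\eta}(-\delta) x + E \cdot x, \tag{1.11}
\]
where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
\[
|E| \le 4.306 \cdot 10^{-22} + \frac{1}{\sqrt{x}} \left(\frac{650400}{\sqrt{q}} + 112\right). \tag{1.12}
\]
Here $\widehat{\eta}$ stands for the Fourier transform from $\mathbb{R}$ to $\mathbb{R}$ normalized as follows: $\widehat{\eta}(t) = \int_{-\infty}^{\infty} e(-xt)\eta(x)\, dx$. Thus, $\widehat{\eta}(-\delta)$ is just $\sqrt{2\pi}\, e^{-2\pi^2\delta^2}$ (self-duality of the Gaussian).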
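Both ingredients of (1.11) and (1.12) are easy to evaluate. The sketch below checks the closed form of $\widehat{\eta}(-\delta)$ for the Gaussian against a direct numerical integration (a trapezoidal rule on $[-12, 12]$, our own choice) and evaluates the right-hand side of (1.12); the function names are hypothetical.

```python
import math

def eta_hat_numeric(t, half_width=12.0, h=0.01):
    """Trapezoidal approximation to int e(-x t) exp(-x^2/2) dx over the real line.
    Only the cosine part is summed: the sine part is odd and integrates to zero."""
    n = int(round(2 * half_width / h))
    total = 0.0
    for j in range(n + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * math.exp(-x * x / 2) * math.cos(2 * math.pi * x * t)
    return total * h

def eta_hat_closed(t):
    """Closed form sqrt(2 pi) exp(-2 pi^2 t^2) for the same normalization."""
    return math.sqrt(2 * math.pi) * math.exp(-2 * math.pi ** 2 * t * t)

def error_bound(x, q):
    """Right-hand side of (1.12)."""
    return 4.306e-22 + (650400 / math.sqrt(q) + 112) / math.sqrt(x)
```

Note that, while (1.12) is stated for $x \ge 10^8$, the term $650400/\sqrt{qx}$ is small only once $x$ is far larger: `error_bound(1e8, 1)` exceeds $1$, whereas `error_bound(1e27, 1)` is below $10^{-7}$.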
This is one of the main results of Part II; see §7.1. Similar bounds are also proven there for $\eta(t) = t^2 e^{-t^2/2}$, as well as for a weight of type $\eta(t) = t e^{-t^2/2} g(t)$, where $g(t)$ is a band-limited function, and also for a weight $\eta$ defined by a multiplicative convolution. The conditions on $q$ (namely, $q \le r = 300000$) and $\delta$ are what we expected from the outset.
Thus concludes our treatment of the major arcs. This is arguably the easiest part of the proof; it was actually what I left for the end, as I was fairly confident it would work out. Minor-arc estimates are more delicate; let us now examine them.
1.4 The minor arcs $\mathfrak{m}$

1.4.1 Qualitative goals and main ideas

What kind of bounds do we need? What is there in the literature?

We wish to obtain upper bounds on $|S_\eta(\alpha, x)|$ for some weight $\eta$ and any $\alpha \in \mathbb{R}/\mathbb{Z}$ not very close to a rational with small denominator. Every $\alpha$ is close to some rational $a/q$; what we are looking for is a bound on $|S_\eta(\alpha, x)|$ that decreases rapidly when $q$ increases.

Moreover, we want our bound to decrease rapidly when $\delta$ increases, where $\alpha = a/q + \delta/x$. In fact, the main terms in our bound will be decreasing functions of $\max(1, |\delta|/8) \cdot q$. (Let us write $\delta_0 = \max(2, |\delta|/4)$ from now on.) This will allow our bound to be good enough outside narrow major arcs, which will get narrower and narrower as $q$ increases; that is precisely the kind of major arcs we were presupposing in our major-arc bounds.
It would be possible to work with narrow major arcs that become narrower as $q$ increases simply by allowing $q$ to be very large (close to $x$), and assigning each angle to the fraction closest to it. This is, in fact, the common procedure. However, this makes matters more difficult, in that we would have to minimize at the same time the factors in front of terms $x/q$, $x/\sqrt{q}$, etc., and those in front of terms $q$, $\sqrt{qx}$, and so on. (These terms are being compared to the trivial bound $x$.) Instead, we choose to strive for a direct dependence on $\delta$ throughout; this will allow us to cap $q$ at a much lower level, thus making terms such as $q$ and $\sqrt{qx}$ negligible. (This choice has been taken elsewhere in applications of the circle method, but, strangely, seems absent from previous work on the ternary Goldbach conjecture.)
How good must our bounds be? Since the major-arc bounds are valid only for $q \le r = 300000$ and $|\delta| \le 4r/q$, we cannot afford even a single factor of $\log x$ (or any other function tending to $\infty$ as $x \to \infty$) in front of terms such as $x/\sqrt{q|\delta_0|}$: a factor like that would make the term larger than the trivial bound $x$ if $q|\delta_0|$ is equal to a constant ($r$, say) and $x$ is very large. Apparently, there was no such “log-free bound” with explicit constants in the literature, even though such bounds were considered to be in principle feasible, and even though previous work ([Che85], [Dab96], [DR01], [Tao14]) had gradually decreased the number of factors of $\log x$. (In limited ranges for $q$, there were log-free bounds without explicit constants; see [Dab96], [Ram10]. The estimate in [Vin54, Thm. 2a, 2b] was almost log-free, but not quite. There were also bounds [Kar93], [But11] that used L-functions, and thus were not really useful in a truly minor-arc regime.)
It also seemed clear that a main bound proportional to $(\log q)^2 x/\sqrt{q}$ (as in [Tao14]) was too large. At the same time, it was not really necessary to reach a bound of the best possible form that could be found through Vinogradov’s basic approach, namely
\[
|S_\eta(\alpha, x)| \le C \frac{x\sqrt{q}}{\phi(q)}. \tag{1.13}
\]
Such a bound had been proven by Ramaré [Ram10] for $q$ in a limited range and $C$ non-explicit; later, in [Ramc] (which postdates the first version of [Helb]), Ramaré broadened the range to $q \le x^{1/48}$ and gave an explicit value for $C$, namely, $C = 13000$. Such a bound is a notable achievement, but, unfortunately, it is not useful for our purposes. Rather, we will aim at a bound whose main term is bounded by a constant around $1$ times $x(\log \delta_0 q)/\sqrt{\delta_0 \phi(q)}$; this is slightly worse asymptotically than (1.13), but it is much better in the delicate range of $\delta_0 q \sim 300000$, and in fact for a much wider range as well.
We see that we have several tasks. One of them is the removal of logarithms: we cannot afford a single factor of $\log x$, and, in practice, we can afford at most one factor of $\log q$. Removing logarithms will be possible in part because of the use of previously existing efficient techniques (the large sieve for sequences with prime support) but also because we will be able to find cancellation at several places in sums coming from a combinatorial identity (namely, Vaughan’s identity). The task of finding cancellation is particularly delicate because we cannot afford large constants or, for that matter, statements valid only for large $x$. (Bounding a sum such as $\sum_n \mu(n)$ efficiently, where $\mu$ is the Möbius function
\[
\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct}, \\ 0 & \text{if } p^2 | n \text{ for some prime } p, \end{cases}
\]
is harder than estimating a sum such as $\sum_n \Lambda(n)$ equally efficiently, even though we are used to thinking of the two problems as equivalent.)

We have said that our bounds will improve as $|\delta|$ increases. This dependence on $\delta$ will be secured in different ways at different places. Sometimes $\delta$ will appear as an argument, as in $\widehat{\eta}(-\delta)$; for $\eta$ piecewise continuous with $\eta' \in L_1$, we know that $|\widehat{\eta}(t)| \to 0$ as $|t| \to \infty$. Sometimes we will obtain a dependence on $\delta$ by using several different rational approximations to the same $\alpha \in \mathbb{R}$. Lastly, we will obtain a good dependence on $\delta$ in bilinear sums by supplying a scattered input to a large sieve.
If there is a main moral to the argument, it lies in the close relation between the circle method and the large sieve. The circle method rests on the estimation of an integral involving a Fourier transform $\widehat{f} : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$; as we will later see, this leads naturally to estimating the $\ell_2$-norm of $\widehat{f}$ on subsets (namely, unions of arcs) of the circle $\mathbb{R}/\mathbb{Z}$. The large sieve can be seen as an approximate discrete version of Plancherel’s identity, which states that $|\widehat{f}|_2 = |f|_2$.
Both in this section and in §1.5, we shall use the large sieve in part so as to use the fact that some of the functions we work with have prime support, i.e., are non-zero only on prime numbers. There are ways to use prime support to improve the output of the large sieve. In §1.5, these techniques will be refined and then translated to the context of the circle method, where $f$ has (essentially) prime support and $|\widehat{f}|^2$ must be integrated over unions of arcs. (This allows us to remove a logarithm.) The main point is that the large sieve is not being used as a black box; rather, we can adapt ideas from (say) the large-sieve context and apply them to the circle method.
Lastly, there are the benefits of a continuous $\eta$. Hardy and Littlewood already used a continuous $\eta$; this was abandoned by Vinogradov, presumably for the sake of simplicity. The idea that smooth weights $\eta$ can be superior to sharp truncations is now commonplace. As we shall see, using a continuous $\eta$ is helpful in the minor-arcs regime, but not as crucial there as for the major arcs. We will not use a smooth $\eta$; we will prove our estimates for any continuous $\eta$ that is piecewise $C^1$, and then, towards the end, we will choose to use the same weight $\eta = \eta_2$ as in [Tao14], in part because it has compact support, and in part for the sake of comparison. The moral here is not quite the common dictum “always smooth”, but rather that different kinds of smoothing can be appropriate for different tasks; in the end, we will show how to coordinate different smoothing functions $\eta$.
There are other ideas involved; for instance, some of Vinogradov’s lemmas are improved. Let us now go into some of the details.
1.4.2 Combinatorial identities

Generally, since Vinogradov, a treatment of the minor arcs starts with a combinatorial identity expressing $\Lambda(n)$ (or the characteristic function of the primes) as a sum of two or more convolutions. (In this section, by a convolution $f * g$, we will mean the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$, i.e., the multiplicative convolution on the semigroup of positive integers.)

In some sense, the archetypical identity is
\[
\Lambda = \mu * \log,
\]
but it will not usually do: the contribution of $\mu(d)\log(n/d)$ with $d$ close to $n$ is too difficult to estimate precisely. There are alternatives: for example, there is the identity
\[
\Lambda(n) \log n = \mu * \log^2 - \Lambda * \Lambda, \tag{1.14}
\]
which underlies an estimate of Selberg’s that, in turn, is the basis for the Erdős-Selberg proof of the prime number theorem; see, e.g., [MV07, §8.2]. More generally, one can decompose $\Lambda(n)(\log n)^k$ as $\mu * \log^{k+1}$ minus a linear combination of convolutions; this kind of decomposition (really just a direct consequence of the development of $(\zeta'(s)/\zeta(s))^{(k)}$) will be familiar to some from the exposition of Bombieri’s work [Bom76] in [FI10, §3] (for instance). Another useful identity was that used by Daboussi [Dab96]; witness its application in [DR01], which gives explicit estimates on exponential sums over primes.
The proof of Vinogradov’s three-prime result was simplified substantially [Vau77b] by the introduction of Vaughan’s identity:
\[
\Lambda(n) = \mu_{\le U} * \log - \Lambda_{\le V} * \mu_{\le U} * 1 + 1 * \mu_{>U} * \Lambda_{>V} + \Lambda_{\le V}, \tag{1.15}
\]
where we are using the notation
\[
f_{\le W} = \begin{cases} f(n) & \text{if } n \le W, \\ 0 & \text{if } n > W, \end{cases}
\qquad
f_{>W} = \begin{cases} 0 & \text{if } n \le W, \\ f(n) & \text{if } n > W. \end{cases}
\]
Of the resulting sums ($\sum_n (\mu_{\le U} * \log)(n) e(\alpha n) \eta(n/x)$, etc.), the first three are said to be of type I, type I (again) and type II; the last sum, $\sum_{n \le V} \Lambda(n)$, is negligible.
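The identities $\Lambda = \mu * \log$, (1.14) and (1.15) can all be checked numerically for small $n$. The sketch below uses naive divisor sums; the helper names (`conv`, `below`, `above`, `vaughan_rhs`, `selberg_rhs`) and the sample values of $U$ and $V$ are our own, chosen only for illustration.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0              # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k, k >= 1, else 0."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n) if n > 1 else 0.0

def conv(f, g):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

def below(f, W):
    return lambda n: f(n) if n <= W else 0     # f_{<=W}

def above(f, W):
    return lambda n: f(n) if n > W else 0      # f_{>W}

def one(n):
    return 1

def vaughan_rhs(n, U, V):
    """Right-hand side of Vaughan's identity (1.15)."""
    mu_lo, mu_hi = below(mobius, U), above(mobius, U)
    lam_lo, lam_hi = below(von_mangoldt, V), above(von_mangoldt, V)
    return (conv(mu_lo, math.log)(n)
            - conv(conv(lam_lo, mu_lo), one)(n)
            + conv(conv(one, mu_hi), lam_hi)(n)
            + lam_lo(n))

def selberg_rhs(n):
    """Right-hand side of (1.14)."""
    log_squared = lambda m: math.log(m) ** 2
    return conv(mobius, log_squared)(n) - conv(von_mangoldt, von_mangoldt)(n)
```

The check of (1.15) succeeds for any choice of $U$, $V$, which reflects the flexibility of the identity: it follows from $\mu = \mu_{\le U} + \mu_{>U}$, $\Lambda = \Lambda_{\le V} + \Lambda_{>V}$, $1 * \mu = \delta_{n=1}$ and $1 * \Lambda = \log$.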
One of the advantages of Vaughan’s identity is its flexibility: we can set $U$ and $V$ to whatever values we wish. Its main disadvantage is that it is not “log-free”, in that it seems to impose the loss of two factors of $\log x$: if we sum each side of (1.15) from $1$ to $x$, we obtain $\sum_{n \le x} \Lambda(n) \sim x$ on the left side, whereas, if we bound the sum on the right side without the use of cancellation, we obtain a bound of $x(\log x)^2$. Of course, we will obtain some cancellation from the phase $e(\alpha n)$; still, even if this gives us a factor of, say, $1/\sqrt{q}$, we will get a bound of $x(\log x)^2/\sqrt{q}$, which is worse than the trivial bound $x$ for $q$ bounded and $x$ large. Since we want a bound that is useful for all $q$ larger than the constant $r$ and all $x$ larger than a constant, this will not do.
As was pointed out in [Tao14], it is possible to get a factor of $(\log q)^2$ instead of a factor of $(\log x)^2$ in the type II sums by setting $U$ and $V$ appropriately. Unfortunately, a factor of $(\log q)^2$ is still too large in practice, and there is also the issue of factors of $\log x$ in type I sums.
Vinogradov had already managed to get an essentially log-free result (by a rather difficult procedure) in [Vin54, Ch. IX]. The result in [Dab96] is log-free. Unfortunately, the explicit result in [DR01] (the study of which encouraged me at the beginning of the project) is not. For a while, I worked with the case $k = 2$ of the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, which gives
\[
\Lambda \cdot \log^2 = \mu * \log^3 - 3 \cdot (\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda. \tag{1.16}
\]
This identity is essentially log-free: while a trivial bound on the sum of the right side for $n$ from $1$ to $N$ does seem to have two extra factors of $\log$, they are present only in the term $\mu * \log^3$, which is not the hardest one to estimate. Ramaré obtained a log-free bound in [Ram10] using an identity introduced by Diamond and Steinig in the course of their own work on elementary proofs of the prime number theorem [DS70]; that identity gives a decomposition for $\Lambda \cdot \log^k$ that can also be derived from the expansion of $(\zeta'(s)/\zeta(s))^{(k)}$, by a clever grouping of terms.
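Identity (1.16) can be verified numerically in the same naive way as the simpler identities; the following self-contained sketch does so for small $n$. The helper names are our own, and the divisor sums are deliberately unoptimized.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k, k >= 1, else 0."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n) if n > 1 else 0.0

def conv(f, g):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

def lhs_116(n):
    return von_mangoldt(n) * math.log(n) ** 2

def rhs_116(n):
    log_cubed = lambda m: math.log(m) ** 3
    lam_log = lambda m: von_mangoldt(m) * math.log(m)
    lam_triple = conv(conv(von_mangoldt, von_mangoldt), von_mangoldt)
    return (conv(mobius, log_cubed)(n)
            - 3 * conv(lam_log, von_mangoldt)(n)
            - lam_triple(n))
```

One way to see why the identity holds: pointwise multiplication by $\log$ is a derivation with respect to Dirichlet convolution, so applying it twice to $1 * \Lambda = \log$ and then convolving with $\mu$ gives (1.16).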
In the end, I decided to use Vaughan’s identity, motivated in part by [Tao14], and in part by the lack of free parameters in (1.16); as can be seen in (1.15), Vaughan’s identity has two parameters $U$, $V$ that we can set to whatever values we think best. The form of the identity allowed me to reuse much of my work up to that point, but it also posed a challenge: since Vaughan’s identity is by no means log-free, one has to obtain cancellation in Vaughan’s identity at every possible step, beyond the cancellation given by the phase $e(\alpha n)$. (The presence of a phase, in fact, makes the task of getting cancellation from the identity more complicated.) The removal of logarithms will be one of our main tasks in what follows. It is clear that the presence of the Möbius function $\mu$ should give, in principle, some cancellation; we will show how to use it to obtain as much cancellation as we need, with good constants, and not just asymptotically.
1.4.3 Type I sums

There are two type I sums, namely,
\[
\sum_{m \le U} \mu(m) \sum_n (\log n) e(\alpha m n) \eta\left(\frac{mn}{x}\right) \tag{1.17}
\]
and
\[
\sum_{v \le V} \Lambda(v) \sum_{u \le U} \mu(u) \sum_n e(\alpha v u n) \eta\left(\frac{vun}{x}\right). \tag{1.18}
\]
In either case, $\alpha = a/q + \delta/x$, where $q$ is larger than a constant $r$ and $|\delta/x| \le 1/qQ_0$ for some $Q_0 > \max(q, \sqrt{x})$. For the purposes of this exposition, we will set it as our task to estimate the slightly simpler sum
\[
\sum_{m \le D} \mu(m) \sum_n e(\alpha m n) \eta\left(\frac{mn}{x}\right), \tag{1.19}
\]
where $D$ can be $U$ or $UV$ or something else less than $x$.

Why can we consider this simpler sum without omitting anything essential? It is clear that (1.17) is of the same kind as (1.19). The inner double sum in (1.18) is just (1.19) with $\alpha v$ instead of $\alpha$; this enables us to estimate (1.18) by means of (1.19) for $q$ small, i.e., the more delicate case. If $q$ is not small, then the approximation $\alpha v \sim av/q$ may not be accurate enough. In that case, we collapse the two outer sums in (1.18) into a sum $\sum_n (\Lambda_{\le V} * \mu_{\le U})(n)$, and treat all of (1.18) much as we will treat (1.19); since $q$ is not small, we can afford to bound $(\Lambda_{\le V} * \mu_{\le U})(n)$ trivially (by $\log n$) in the less sensitive terms.
Let us first outline Vinogradov’s procedure for bounding type I sums. Just by summing a geometric series, we get
\[
\left|\sum_{n \le N} e(\alpha n)\right| \le \min\left(N, \frac{c}{\|\alpha\|}\right), \tag{1.20}
\]
where $c$ is a constant and $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. Vinogradov splits the outer sum in (1.19) into sums of length $q$. When $m$ runs on an interval of length $q$, the angle $am/q$ runs through all fractions of the form $b/q$; due to the error $\delta/x$, $\|\alpha m\|$ could be close to $0$ for two values of $m$, but otherwise $\|\alpha m\|$ takes values bounded below by $1/q$ (twice), $2/q$ (twice), $3/q$ (twice), etc. Thus
\[
\left|\sum_{y < m \le y+q} \mu(m) \sum_{n \le N} e(\alpha m n)\right| \le \sum_{y < m \le y+q} \left|\sum_{n \le N} e(\alpha m n)\right| \le \frac{2N}{m} + 2cq \log eq \tag{1.21}
\]
for any $y \ge 0$.
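The geometric-series bound (1.20) can be checked numerically. In the sketch below we take $c = 1/2$, which is an admissible value: the sum has absolute value $|\sin(\pi N \alpha)/\sin(\pi\alpha)| \le 1/|\sin \pi\alpha| \le 1/(2\|\alpha\|)$. The function names are our own.

```python
import cmath
import math

def dist_to_nearest_integer(alpha):
    """||alpha||: the distance from alpha to the nearest integer."""
    return abs(alpha - round(alpha))

def exp_sum(alpha, N):
    """Sum of e(alpha n) = exp(2 pi i alpha n) over 1 <= n <= N."""
    return sum(cmath.exp(2j * math.pi * alpha * n) for n in range(1, N + 1))

def geometric_bound(alpha, N, c=0.5):
    """Right-hand side of (1.20) with c = 1/2."""
    d = dist_to_nearest_integer(alpha)
    return N if d == 0 else min(N, c / d)
```

The bound is useful precisely when $\|\alpha\|$ is not too small compared to $1/N$; this is why the values of $m$ for which $\|\alpha m\|$ is close to $0$ need separate treatment.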
There are several ways to improve this. One is simply to estimate the inner sum more precisely; this was already done in [DR01]. One can also define a smoothing function $\eta$, as in (1.19); it is easy to get
\[
\left|\sum_{n \le N} e(\alpha n) \eta\left(\frac{n}{x}\right)\right| \le \min\left(x|\eta|_1 + \frac{|\eta'|_1}{2},\ \frac{|\eta'|_1}{2|\sin(\pi\alpha)|},\ \frac{|\eta''|_\infty}{4x(\sin \pi\alpha)^2}\right).
\]
Except for the third term, this is as in [Tao14]. We could also choose carefully which bound to use for each $m$; surprisingly, this gives an improvement, in fact an important one, for $m$ large. However, even with these improvements, we still have a term proportional to $N/m$, as in (1.21), and this contributes about $(x \log x)/q$ to the sum (1.19), thus giving us an estimate that is not log-free.
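What we have to do, naturally, is to take out the terms with $q|m$ for $m$ small. (If $m$ is large, then those may not be the terms for which $m\alpha$ is close to $0$; we will later see what to do.) For $y + q \le Q/2$, $|\alpha - a/q| \le 1/qQ$, we get that
\[
\sum_{\substack{y < m \le y+q \\ q \nmid m}} \min\left(A, \frac{B}{|\sin \pi\alpha m|}, \frac{C}{|\sin \pi\alpha m|^2}\right) \tag{1.22}
\]
is at most
\[
\min\left(\frac{20}{3\pi^2} C q^2,\ 2A + \frac{4q}{\pi}\sqrt{AC},\ \frac{2Bq}{\pi} \max\left(2, \log \frac{C e^3 q}{B\pi}\right)\right). \tag{1.23}
\]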
This is satisfactory. We are left with all the terms $m \le M = \min(D, Q/2)$ with $q|m$, and also with all the terms $Q/2 < m \le D$. For $m \le M$ divisible by $q$, we can estimate (as opposed to just bound from above) the inner sum in (1.19) by the Poisson summation formula, and then sum over $m$, but without taking absolute values; writing $m = aq$, we get a main term
\[
\frac{x\mu(q)}{q} \cdot \widehat{\eta}(-\delta) \cdot
\sum_{\substack{a\le M/q \\ (a,q)=1}} \frac{\mu(a)}{a},
\tag{1.24}
\]
where $(a, q)$ stands for the greatest common divisor of $a$ and $q$.

It is clear that we have to get cancellation over $\mu$ here. There is an elegant elementary argument [GR96] showing that the absolute value of the sum in (1.24) is at most $1$. We need to gain one more log, however. Ramaré [Ramb] helpfully furnished the following bound:
\[
\left|\sum_{\substack{a\le x \\ (a,q)=1}} \frac{\mu(a)}{a}\right|
\le \frac{4}{5}\,\frac{q}{\phi(q)}\,\frac{1}{\log x/q}
\tag{1.25}
\]
for $q \le x$. (Cf. [EM95], [EM96].) This is neither trivial nor elementary.⁵ We are, so to speak, allowed to use non-elementary means (that is, methods based on $L$-functions) because the only $L$-function we need to use here is the Riemann zeta function.
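To get a feeling for the size of the sum in (1.24) and (1.25), one can simply compute it. The following is a minimal sketch, not part of the proof; the helper names are hypothetical. It sieves the Möbius function, evaluates the coprime sum, and prints it next to the elementary bound $1$ of [GR96] and the right side of (1.25) as reconstructed above.

```python
from math import gcd, log


def mobius_sieve(limit: int) -> list:
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign flip per prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not square-free
    return mu


def euler_phi(q: int) -> int:
    """Euler's totient, by direct count (fine for small q)."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def coprime_mobius_sum(x: int, q: int, mu: list) -> float:
    """Sum of mu(a)/a over a <= x with gcd(a, q) = 1."""
    return sum(mu[a] / a for a in range(1, x + 1) if gcd(a, q) == 1)


def ramare_rhs(x: int, q: int) -> float:
    """Right side of (1.25): (4/5) * q/phi(q) / log(x/q); requires q < x."""
    return 0.8 * q / euler_phi(q) / log(x / q)


if __name__ == "__main__":
    mu = mobius_sieve(10000)
    for q in (1, 6, 30):
        s = coprime_mobius_sum(10000, q, mu)
        print(q, s, "elementary bound: 1", "(1.25):", ramare_rhs(10000, q))
```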
What shall we do for $m > Q/2$? We can always give a bound
\[
\sum_{y<m\le y+q} \min\left(A, \frac{C}{|\sin \pi\alpha m|^2}\right)
\le 3A + \frac{4q}{\pi}\sqrt{AC}
\tag{1.26}
\]
for $y$ arbitrary; since $AC$ will be of constant size, $(4q/\pi)\sqrt{AC}$ is pleasant enough, but the contribution of $3A \sim 3|\eta|_1 x/y$ is nasty (it adds a multiple of $(x\log x)/q$ to the total) and seems unavoidable: the values of $m$ for which $\alpha m$ is close to $0$ no longer correspond to the congruence class $m \equiv 0 \bmod q$, and thus cannot be taken out.

The solution is to switch approximations. (The idea of using different approximations to the same $\alpha$ is neither new nor recent in the general context of the circle method: see [Vau97, §2.8, Ex. 2]. What may be new is its use to clear a hurdle in type I sums.) What does this mean? If $\alpha$ were exactly, or almost exactly, $a/q$, then there would be no other very good approximations in a reasonable range. However, note that we can define $Q = \lfloor x/|\delta q| \rfloor$ for $\alpha = a/q + \delta/x$, and still have $|\alpha - a/q| \le 1/qQ$. If $\delta$ is very small, $Q$ will be larger than $2D$, and there will be no terms with $Q/2 < m \le D$ to worry about.
⁵ The current state of knowledge may seem surprising: after all, we expect nearly square-root cancellation (for instance, $|\sum_{n\le x} \mu(n)/n| \le \sqrt{2/x}$ holds for all real $0 < x \le 10^{12}$; see also the stronger bound [Dre93]). The classical zero-free region of the Riemann zeta function ought to give a factor of $\exp(-\sqrt{(\log x)/c})$, which looks much better than $1/\log x$. What happens is that (a) such a factor is not actually much better than $1/\log x$ for $x \sim 10^{30}$, say; (b) estimating sums involving the Möbius function by means of an explicit formula is harder than estimating sums involving $\Lambda(n)$: the residues of $1/\zeta(s)$ at the non-trivial zeros of $\zeta(s)$ come into play. As a result, getting non-trivial explicit results on sums of $\mu(n)$ is harder than one would naively expect from the quality of classical effective (but non-explicit) results. See [Rama] for a survey of explicit bounds.
What happens if $\delta$ is not very small? We know that, for any $Q'$, there is an approximation $a'/q'$ to $\alpha$ with $|\alpha - a'/q'| \le 1/q'Q'$ and $q' \le Q'$. However, for $Q' > Q$, we know that $a'/q'$ cannot equal $a/q$: by the definition of $Q$, the approximation $a/q$ is not good enough, i.e., $|\alpha - a/q| \le 1/qQ'$ does not hold. Since $a/q \ne a'/q'$, we see that $|a/q - a'/q'| \ge 1/qq'$, and this implies that $q' \ge (\epsilon/(1 + \epsilon))Q$ when $Q' = (1+\epsilon)Q$.

Thus, for $m > Q/2$, the solution is to apply (1.26) with $a'/q'$ instead of $a/q$. The contribution of $A$ fades into insignificance: for the first sum over a range $y < m \le y + q'$, $y \ge Q/2$, it contributes at most $x/(Q/2)$, and all the other contributions of $A$ sum up to at most a constant times $(x\log x)/q'$.
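The existence of an approximation $a'/q'$ with $q' \le Q'$ and $|\alpha - a'/q'| \le 1/q'Q'$ is constructive: the last continued-fraction convergent of $\alpha$ with denominator at most $Q'$ has this property. The sketch below (the function name is hypothetical, and $\alpha$ is taken to be an exact rational for simplicity) computes that convergent; the toy values $a/q = 1/3$, $\delta = 5$, $x = 10^6$ in the usage lines are made up for illustration.

```python
from fractions import Fraction


def dirichlet_approximation(alpha: Fraction, Q: int) -> Fraction:
    """Last continued-fraction convergent a/q of alpha with q <= Q (Q >= 1).

    It satisfies |alpha - a/q| <= 1/(q*Q), because the error of a convergent
    is less than 1/(q * next_denominator) and next_denominator > Q.
    """
    p_prev, q_prev = 1, 0
    a0 = alpha.numerator // alpha.denominator     # floor(alpha)
    p, q = a0, 1
    rest = alpha - a0
    while rest != 0:
        rest = 1 / rest
        a = rest.numerator // rest.denominator    # next partial quotient
        rest -= a
        p_next, q_next = a * p + p_prev, a * q + q_prev
        if q_next > Q:
            break
        p_prev, q_prev, p, q = p, q, p_next, q_next
    return Fraction(p, q)


if __name__ == "__main__":
    # alpha = a/q + delta/x with a/q = 1/3, delta = 5, x = 10**6
    alpha = Fraction(1, 3) + Fraction(5, 10**6)
    Q = 10**6 // (5 * 3)                          # Q = floor(x / |delta q|)
    print(abs(alpha - Fraction(1, 3)) <= Fraction(1, 3 * Q))   # 1/3 is good enough for Q
    print(dirichlet_approximation(alpha, 2 * Q))  # a different fraction a'/q' for Q' = 2Q
```

With $Q' = 2Q$ (that is, $\epsilon = 1$) the new denominator $q'$ is at least $Q/2$, in line with the inequality $q' \ge (\epsilon/(1+\epsilon))Q$ above.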
Proceeding in this way, we obtain a total bound for (1.19) whose main terms are proportional to
\[
\frac{1}{\phi(q)}\,\frac{x}{\log \frac{x}{q}}\,\min\left(1, \frac{1}{\delta^2}\right),
\qquad \frac{2}{\pi}\sqrt{|\widehat{\eta''}|_\infty}\cdot D
\qquad \text{and} \qquad q \log\max\left(\frac{D}{q}, q\right),
\tag{1.27}
\]
with good, explicit constants. The first term, usually the largest one, is precisely what we needed: it is proportional to $(1/\phi(q))\,x/\log x$ for $q$ small, and decreases rapidly as $|\delta|$ increases.
1.4.4 Type II, or bilinear, sums

We must now bound
\[
S = \sum_m (1 \ast \mu_{>U})(m) \sum_{n>V} \Lambda(n)\, e(\alpha m n)\, \eta(mn/x).
\]
At this point it is convenient to assume that $\eta$ is the Mellin convolution of two functions. The multiplicative or Mellin convolution on $\mathbb{R}^+$ is defined by
\[
(\eta_0 \ast_M \eta_1)(t) = \int_0^\infty \eta_0(r)\,\eta_1\!\left(\frac{t}{r}\right) \frac{dr}{r}.
\]
Tao [Tao14] takes $\eta = \eta_2 = \eta_1 \ast_M \eta_1$, where $\eta_1$ is a brutal truncation, viz., the function taking the value $2$ on $[1/2, 1]$ and $0$ elsewhere. We take the same $\eta_2$, in part for comparison purposes, and in part because this will allow us to use off-the-shelf estimates on the large sieve. (Brutal truncations are rarely optimal in principle, but, as they are very common, results for them have been carefully optimized in the literature.) Clearly
\[
S = \int_V^{x/U} \sum_m \Biggl(\sum_{\substack{d>U \\ d|m}} \mu(d)\Biggr)
\eta_1\!\left(\frac{m}{x/W}\right) \cdot
\sum_{n\ge V} \Lambda(n)\, e(\alpha m n)\, \eta_1\!\left(\frac{n}{W}\right) \frac{dW}{W}.
\tag{1.28}
\]
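The weight $\eta_2 = \eta_1 \ast_M \eta_1$ used here can be written down in closed form, which makes the definition of the Mellin convolution concrete. A short computation from the definition (with $\eta_1 = 2\cdot 1_{[1/2,1]}$) gives $\eta_2(t) = 4\log 4t$ on $[1/4, 1/2]$, $\eta_2(t) = -4\log t$ on $[1/2, 1]$, and $0$ elsewhere. The following sketch, with hypothetical function names, compares this closed form with a direct midpoint-rule evaluation of the convolution integral.

```python
import math


def eta1(t: float) -> float:
    """Brutal truncation: the value 2 on [1/2, 1] and 0 elsewhere."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0


def eta2_closed_form(t: float) -> float:
    """eta1 *_M eta1, worked out by hand from the definition."""
    if 0.25 <= t <= 0.5:
        return 4.0 * math.log(4.0 * t)
    if 0.5 < t <= 1.0:
        return -4.0 * math.log(t)
    return 0.0


def mellin_convolution(f, g, t: float, lo: float, hi: float, steps: int = 20000) -> float:
    """Midpoint rule for the integral of f(r) g(t/r) dr/r over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        r = lo + (k + 0.5) * h
        total += f(r) * g(t / r) / r
    return total * h


if __name__ == "__main__":
    for t in (0.3, 0.5, 0.7, 0.9):
        numeric = mellin_convolution(eta1, eta1, t, 0.5, 1.0)
        print(t, numeric, eta2_closed_form(t))
```

Since $|\eta_0 \ast_M \eta_1|_1 = |\eta_0|_1 |\eta_1|_1$, the function $\eta_2$ has $L_1$ norm $1$, just as $\eta_1$ does.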
By Cauchy-Schwarz, the integrand is at most $\sqrt{S_1(U,W)\,S_2(V,W)}$, where
\[
\begin{aligned}
S_1(U,W) &= \sum_{\frac{x}{2W} < m \le \frac{x}{W}}
\Biggl|\sum_{\substack{d>U \\ d|m}} \mu(d)\Biggr|^2, \\
S_2(V,W) &= \sum_{\frac{x}{2W} \le m \le \frac{x}{W}}
\Biggl|\sum_{\max\left(V, \frac{W}{2}\right) \le n \le W} \Lambda(n)\, e(\alpha m n)\Biggr|^2.
\end{aligned}
\tag{1.29}
\]
We must bound $S_1(U,W)$ by a constant times $x/W$. We are able to do this, with a good constant. (A careless bound would have given a multiple of $(x/U)\log^3(x/U)$, which is much too large.) First, we reduce $S_1(U,W)$ to an expression involving an integral of
\[
\sum_{r_1\le x} \sum_{\substack{r_2\le x \\ (r_1,r_2)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}.
\tag{1.30}
\]
We can bound (1.30) by the use of bounds on $\sum_{n\le t} \mu(n)/n$, combined with the estimation of infinite products by means of approximations to $\zeta(s)$ for $s \to 1^+$. After some additional manipulations, we obtain a bound for $S_1(U,W)$ whose main term is at most $(3/\pi^2)(x/W)$ for each $W$, and closer to $0.22482\,x/W$ on average over $W$.

(This is as good a point as any to say that we can use a trick in [Tao14] that allows us to work with odd values of integer variables throughout, instead of letting $m$ or $n$ range over all integers. Here, for instance, if $m$ and $n$ are restricted to be odd, we obtain a bound of $(2/\pi^2)(x/W)$ for individual $W$, and $0.15107\,x/W$ on average over $W$. This is so even though we are losing some cancellation in $\mu$ by the restriction.)
Let us now bound $S_2(V,W)$. This is traditionally done by Linnik's dispersion method. However, it should be clear that the thing to do nowadays is to use a large sieve, and, more specifically, a large sieve for primes; that kind of large sieve is nothing other than a tool for estimating expressions such as $S_2(V,W)$. (Incidentally, even though we are trying to save every factor of log we can, we choose not to use small sieves at all, either here or elsewhere.) In order to take advantage of prime support, we use Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27–29] and [IK04, §7.4]) combined with Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], following the general procedure in [MV73, (1.6)]. We obtain a bound of the form
\[
\frac{\log W}{\log \frac{W}{2q}} \left(\frac{x}{4\phi(q)} + \frac{qW}{\phi(q)}\right) \frac{W}{2}
\tag{1.31}
\]
on $S_2(V,W)$, where, of course, we can also choose not to gain a factor of $\log W/2q$ if $q$ is close to or greater than $W$.

It remains to see how to gain a factor of $|\delta|$ in the major arcs, and more specifically in $S_2(V,W)$. To explain this, let us step back and take a look at what the large sieve is.
Given a civilized function $f : \mathbb{Z} \to \mathbb{C}$, Plancherel's identity tells us that
\[
\int_{\mathbb{R}/\mathbb{Z}} \left|\widehat{f}(\alpha)\right|^2 d\alpha = \sum_n |f(n)|^2.
\]
The large sieve can be seen as an approximate, or statistical, version of this: for a "sample" of points $\alpha_1, \alpha_2, \dots, \alpha_k$ satisfying $|\alpha_i - \alpha_j| \ge \beta$ for $i \ne j$, it tells us that
\[
\sum_{1\le j\le k} \left|\widehat{f}(\alpha_j)\right|^2 \le (X + \beta^{-1}) \sum_n |f(n)|^2,
\tag{1.32}
\]
assuming that $f$ is supported on an interval of length $X$.

Now consider $\alpha_1 = \alpha$, $\alpha_2 = 2\alpha$, $\alpha_3 = 3\alpha$, .... If $\alpha = a/q$, then the angles $\alpha_1, \dots, \alpha_q$ are well-separated, i.e., they satisfy $|\alpha_i - \alpha_j| \ge 1/q$, and so we can apply (1.32) with $\beta = 1/q$. However, $\alpha_{q+1} = \alpha_1$. Thus, if we have an outer sum of length $L > q$ (in (1.29), we have an outer sum of length $L = x/2W$), we need to split it into $\lceil L/q \rceil$ blocks of length $q$, and so the total bound given by (1.32) is $\lceil L/q \rceil (X + q) \sum_n |f(n)|^2$. Indeed, this is what gives us (1.31), which is fine, but we want to do better for $|\delta|$ larger than a constant.

Suppose, then, that $\alpha = a/q + \delta/x$, where $|\delta| > 8$, say. Then the angles $\alpha_1$ and $\alpha_{q+1}$ are not identical: $|\alpha_1 - \alpha_{q+1}| \le q|\delta|/x$. We also see that $\alpha_{q+1}$ is at a distance at least $q|\delta|/x$ from $\alpha_2, \alpha_3, \dots, \alpha_q$, provided that $q|\delta|/x < 1/q$. We can go on with $\alpha_{q+2}, \alpha_{q+3}, \dots$, and stop only once there is overlap, i.e., only once we reach $\alpha_m$ such that $m|\delta|/x \ge 1/q$. We then give all the angles $\alpha_1, \dots, \alpha_m$, which are separated by at least $q|\delta|/x$ from each other, to the large sieve at the same time. We do this $\lceil L/m \rceil \le \lceil L/(x/|\delta|q) \rceil$ times, and obtain a total bound of $\lceil L/(x/|\delta|q) \rceil (X + x/|\delta|q) \sum_n |f(n)|^2$, which, for $L = x/2W$, $X = W/2$, gives us about
\[
\left(\frac{x}{4Q}\,\frac{W}{2} + \frac{x}{4}\right) \log W
\]
provided that $L \ge x/|\delta|q$ and, as usual, $|\alpha - a/q| \le 1/qQ$. This is very small compared to the trivial bound $\lesssim xW/8$.
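The large sieve inequality (1.32) and the spacing of the angles $\alpha_m = m\alpha$ can both be explored numerically. The sketch below is a toy experiment with hypothetical helper names: it measures the minimal gap $\beta$ (on the circle $\mathbb{R}/\mathbb{Z}$) in a set of angles and compares the two sides of (1.32) for a random $f$ supported on $X$ consecutive integers. The values of $a$, $q$, $\delta$, $x$ in the usage lines are invented for illustration.

```python
import cmath
import math
import random


def e(t: float) -> complex:
    return cmath.exp(2j * math.pi * t)


def fourier_at(f: dict, alpha: float) -> complex:
    """hat f(alpha) = sum_n f(n) e(-alpha n), for f given as {n: value}."""
    return sum(v * e(-alpha * n) for n, v in f.items())


def min_separation(alphas: list) -> float:
    """Smallest distance between two distinct angles, measured on R/Z."""
    pts = sorted(a % 1.0 for a in alphas)
    gaps = [pts[i + 1] - pts[i] for i in range(len(pts) - 1)]
    gaps.append(pts[0] + 1.0 - pts[-1])           # wrap-around gap
    return min(gaps)


def large_sieve_sides(f: dict, alphas: list, X: int):
    """Return (left side, right side) of (1.32), with beta measured from the angles."""
    beta = min_separation(alphas)
    lhs = sum(abs(fourier_at(f, a)) ** 2 for a in alphas)
    rhs = (X + 1.0 / beta) * sum(abs(v) ** 2 for v in f.values())
    return lhs, rhs


if __name__ == "__main__":
    random.seed(0)
    X = 50
    f = {n: random.uniform(-1, 1) for n in range(100, 100 + X)}
    a, q, delta, x = 2, 7, 10.0, 10000.0          # toy parameters
    alpha = a / q + delta / x
    one_block = [m * alpha for m in range(1, q + 1)]        # beta close to 1/q
    many_blocks = [m * alpha for m in range(1, 10 * q + 1)] # swarms around each b/q
    for angles in (one_block, many_blocks):
        print(len(angles), min_separation(angles), large_sieve_sides(f, angles, X))
```

In the second call the minimal gap is about $q|\delta|/x$, as described above: the angles cluster around the rationals $b/q$ without coinciding, so they can all be given to the large sieve at once.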
What happens if $L < x/|\delta q|$? Then there is never any overlap: we consider all angles $\alpha_i$, and give them all together to the large sieve. The total bound is $(W^2/4 + xW/2|\delta|q)\log W$. If $L = x/2W$ is smaller than, say, $x/3|\delta q|$, then we see clearly that there are non-intersecting swarms of angles $\alpha_i$ around the rationals $a/q$. We can thus save a factor of log (or rather $(\phi(q)/q)\log(W/|\delta q|)$) by applying Montgomery's inequality, which operates by strewing displacements of given angles (or, here, swarms around angles) around the circle to the extent possible while keeping everything well-separated. In this way, we obtain a bound of the form
\[
\frac{\log W}{\log \frac{W}{|\delta| q}} \left(\frac{x}{|\delta|\phi(q)} + \frac{q}{\phi(q)}\,\frac{W}{2}\right) \frac{W}{2}.
\]
Compare this to (1.31); we have gained a factor of $|\delta|/4$, and so we use this estimate when $|\delta| > 4$. (We will actually use the criterion $|\delta| > 8$, but, since we will be working with approximations of the form $2\alpha = a/q + \delta/x$, the value of $\delta$ in our actual work is twice of what it is in this introduction. This is a consequence of working with sums over the odd integers, as in [Tao14].)
We have succeeded in eliminating all factors of log we came across. The only factor of log that remains is $\log x/UV$, coming from the integral $\int_V^{x/U} dW/W$. Thus, we want $UV$ to be close to $x$, but we cannot let it be too close, since we also have a term proportional to $D = UV$ in (1.27), and we need to keep it substantially smaller than $x$. We set $U$ and $V$ so that $UV$ is $x/\sqrt{q\max(4, |\delta|)}$ or thereabouts.

In the end, after some work, we obtain our main minor-arcs bound (Theorem 3.1.1). It states the following. Let $x \ge x_0$, $x_0 = 2.16 \cdot 10^{20}$. Recall that $S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x)$ and $\eta_2 = \eta_1 \ast_M \eta_1 = 4 \cdot 1_{[1/2,1]} \ast 1_{[1/2,1]}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a, q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le x^{1/3}/6$, then
\[
|S_\eta(\alpha, x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}} \cdot x
+ \frac{2.5x}{\sqrt{\delta_0 q}}
+ \frac{2x}{\delta_0 q} \cdot L_{x,\delta_0 q,q}
+ 3.36\,x^{5/6},
\tag{1.33}
\]
where
\[
\begin{aligned}
\delta_0 &= \max(2, |\delta|/4), \qquad
R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415, \\
L_{x,t,q} &= \frac{q}{\phi(q)}\left(\frac{13}{4}\log t + 7.82\right) + 13.66\log t + 37.55.
\end{aligned}
\tag{1.34}
\]
The factor $R_{x,t}$ is small in practice; for typical "difficult" values of $x$ and $\delta_0 x$, it is less than $1$. The crucial things to notice in (1.33) are that there is no factor of $\log x$, and that, in the main term, there is only one factor of $\log \delta_0 q$. The fact that $\delta_0$ helps us as it grows is precisely what enables us to take major arcs that get narrower and narrower as $q$ grows.
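To see what the bound (1.33) amounts to in practice, one can evaluate its right-hand side for sample parameters. The sketch below does only that. Two caveats: the decimal constants (0.27125, 0.41415, 2.004, 7.82, 13.66, 37.55, 0.5, 2.5, 3.36) are read off from the formulas (1.33) and (1.34) as reconstructed from this text and should be checked against Theorem 3.1.1 before being relied upon, and the function names and sample values of $x$, $q$, $\delta$ are illustrative assumptions.

```python
import math


def euler_phi(q: int) -> int:
    """Euler's totient via trial-division factorization."""
    result, n, p = q, q, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result


def R(x: float, t: float) -> float:
    """R_{x,t} as in (1.34); needs 9 x^(1/3) > 2.004 t."""
    return 0.27125 * math.log(
        1 + math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    ) + 0.41415


def L(t: float, q: int) -> float:
    """L_{x,t,q} as in (1.34) (it does not actually depend on x)."""
    return q / euler_phi(q) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55


def minor_arc_bound(x: float, q: int, delta: float) -> float:
    """Right-hand side of (1.33)."""
    d0 = max(2.0, abs(delta) / 4)
    t = d0 * q
    return ((R(x, t) * math.log(t) + 0.5) / math.sqrt(d0 * euler_phi(q)) * x
            + 2.5 * x / math.sqrt(t)
            + 2 * x / t * L(t, q)
            + 3.36 * x ** (5 / 6))


if __name__ == "__main__":
    x = 1e27
    for q, delta in ((150000, 8.0), (150000, 800.0), (10**6, 8.0)):
        print(q, delta, R(x, max(2.0, delta / 4) * q), minor_arc_bound(x, q, delta) / x)
```

The printed ratio (bound divided by $x$) shrinks as either $q$ or $|\delta|$ grows, which is the behaviour the text highlights.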
1.5 Integrals over the major and minor arcs

So far, we have sketched (§1.3) how to estimate $S_\eta(\alpha, x)$ for $\alpha$ in the major arcs and $\eta$ based on the Gaussian $e^{-t^2/2}$, and also (§1.4) how to bound $|S_\eta(\alpha, x)|$ for $\alpha$ in the minor arcs and $\eta = \eta_2$, where $\eta_2 = 4 \cdot 1_{[1/2,1]} \ast_M 1_{[1/2,1]}$. We now must show how to use such information to estimate integrals such as the ones in (1.4).

We will use two smoothing functions $\eta_+$, $\eta_\ast$; in the notation of (1.3), we set $f_1 = f_2 = \Lambda(n)\eta_+(n/x)$, $f_3 = \Lambda(n)\eta_\ast(n/x)$, and so we must give a lower bound for
\[
\int_{\mathfrak{M}} (S_{\eta_+}(\alpha, x))^2\, S_{\eta_\ast}(\alpha, x)\, e(-\alpha n)\, d\alpha
\tag{1.35}
\]
and an upper bound for
\[
\int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 S_{\eta_\ast}(\alpha, x)\, e(-\alpha n)\, d\alpha
\tag{1.36}
\]
so that we can verify (1.4).

The traditional approach to (1.36) is to bound
\[
\begin{aligned}
\int_{\mathfrak{m}} (S_{\eta_+}(\alpha, x))^2 S_{\eta_\ast}(\alpha, x) e(-\alpha n)\, d\alpha
&\le \int_{\mathfrak{m}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha \cdot \max_{\alpha\in\mathfrak{m}} |S_{\eta_\ast}(\alpha, x)| \\
&\le \sum_n \Lambda(n)^2 \eta_+^2\!\left(\frac{n}{x}\right) \cdot \max_{\alpha\in\mathfrak{m}} |S_{\eta_\ast}(\alpha, x)|.
\end{aligned}
\tag{1.37}
\]
Since the sum over $n$ is of the order of $x\log x$, this is not log-free, and so cannot be good enough; we will later see how to do better. Still, this gets the main shape right: our bound on (1.36) will be proportional to $|\eta_+|_2^2 |\eta_\ast|_1$. Moreover, we see that $\eta_\ast$ has to be such that we know how to bound $|S_{\eta_\ast}(\alpha, x)|$ for $\alpha \in \mathfrak{m}$, while our choice of $\eta_+$ is more or less free, at least as far as the minor arcs are concerned.

What about the major arcs? In order to do anything on them, we will have to be able to estimate both $S_{\eta_+}(\alpha, x)$ and $S_{\eta_\ast}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. If that is the case, then, as we shall see, we will be able to obtain that the main term of (1.35) is an infinite product (independent of the smoothing functions), times $x^2$, times
\[
\begin{aligned}
&\int_{-\infty}^{\infty} (\widehat{\eta_+}(-\alpha))^2\, \widehat{\eta_\ast}(-\alpha)\, e(-\alpha n/x)\, d\alpha \\
&\qquad = \int_0^\infty \int_0^\infty \eta_+(t_1)\eta_+(t_2)\,\eta_\ast\!\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1\, dt_2.
\end{aligned}
\tag{1.38}
\]
In other words, we want to maximize (or nearly maximize) the expression on the right of (1.38) divided by $|\eta_+|_2^2 |\eta_\ast|_1$.
One way to do this is to let $\eta_\ast$ be concentrated on a small interval $[0, \epsilon)$. Then the right side of (1.38) is approximately
\[
|\eta_\ast|_1 \cdot \int_0^\infty \eta_+(t)\,\eta_+\!\left(\frac{n}{x} - t\right) dt.
\tag{1.39}
\]
To maximize (1.39), we should make sure that $\eta_+(t) \sim \eta_+(n/x - t)$. We set $x \sim n/2$, and see that we should define $\eta_+$ so that it is supported on $[0, 2]$ and symmetric around $t = 1$, or nearly so; this will maximize the ratio of (1.39) to $|\eta_+|_2^2 |\eta_\ast|_1$.

We should do this while making sure that we will know how to estimate $S_{\eta_+}(\alpha, x)$ for $\alpha \in \mathfrak{M}$. We know how to estimate $S_\eta(\alpha, x)$ very precisely for functions of the form $\eta(t) = g(t)e^{-t^2/2}$, $\eta(t) = g(t)te^{-t^2/2}$, etc., where $g(t)$ is band-limited. We will work with a function $\eta_+$ of that form, chosen so as to be very close (in $\ell_2$ norm) to a function $\eta$ that is in fact supported on $[0, 2]$ and symmetric around $t = 1$.

We choose
\[
\eta(t) = \begin{cases} t^3(2 - t)^3 e^{-(t-1)^2/2} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}
\]
This function is obviously symmetric ($\eta(t) = \eta(2 - t)$) and vanishes to high order at $t = 0$, besides being supported on $[0, 2]$.

We set $\eta_+(t) = h_R(t)\,te^{-t^2/2}$, where $h_R(t)$ is an approximation to the function
\[
h(t) = \begin{cases} t^2(2 - t)^3 e^{t - \frac{1}{2}} & \text{if } t \in [0, 2], \\ 0 & \text{if } t \notin [0, 2]. \end{cases}
\]
We just let $h_R(t)$ be the inverse Mellin transform of the truncation of $Mh$ to an interval $[-iR, iR]$. (Explicitly,
\[
h_R(t) = \int_0^\infty h(ty^{-1})\,F_R(y)\,\frac{dy}{y},
\]
where $F_R(y) = \sin(R\log y)/(\pi\log y)$, that is, $F_R$ is the Dirichlet kernel with a change of variables.)

Since the Mellin transform of $te^{-t^2/2}$ is regular at $s = 0$, the Mellin transform $M\eta_+$ will be holomorphic in a neighborhood of $\{s : 0 \le \Re(s) \le 1\}$, even though the truncation of $Mh$ to $[-iR, iR]$ is brutal. Set $R = 200$, say. By the fast decay of $Mh(it)$ and the fact that the Mellin transform $M$ is an isometry, $|(h_R(t) - h(t))/t|_2$ is very small, and hence so is $|\eta_+ - \eta|_2$, as we desired.
But what about the requirement that we be able to estimate $S_{\eta_\ast}(\alpha, x)$ for both $\alpha \in \mathfrak{m}$ and $\alpha \in \mathfrak{M}$?

Generally speaking, if we know how to estimate $S_{\eta_1}(\alpha, x)$ for some $\alpha \in \mathbb{R}/\mathbb{Z}$ and we also know how to estimate $S_{\eta_2}(\alpha, x)$ for all other $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_1$ and $\eta_2$ are two smoothing functions, then we know how to estimate $S_{\eta_3}(\alpha, x)$ for all $\alpha \in \mathbb{R}/\mathbb{Z}$, where $\eta_3 = \eta_1 \ast_M \eta_2$, or, more generally, $\eta_\ast(t) = (\eta_1 \ast_M \eta_2)(\kappa t)$, $\kappa > 0$ a constant. This is an easy exercise on exchanging the order of integration and summation:
\[
\begin{aligned}
S_{\eta_\ast}(\alpha, x) &= \sum_n \Lambda(n) e(\alpha n) (\eta_1 \ast_M \eta_2)\!\left(\kappa\frac{n}{x}\right)
= \int_0^\infty \sum_n \Lambda(n) e(\alpha n)\,\eta_1(\kappa r)\,\eta_2\!\left(\frac{n}{rx}\right) \frac{dr}{r} \\
&= \int_0^\infty \eta_1(\kappa r)\,S_{\eta_2}(rx)\,\frac{dr}{r},
\end{aligned}
\tag{1.40}
\]
and similarly with $\eta_1$ and $\eta_2$ switched. Of course, this trick is valid for all exponential sums: any function $f(n)$ would do in place of $\Lambda(n)$. The only caveat is that $\eta_1$ (and $\eta_2$) should be small very near $0$, since, for $r$ small, we may not be able to estimate $S_{\eta_2}(rx)$ (or $S_{\eta_1}(rx)$) with any precision. This is not a problem; one of our functions will be $t^2e^{-t^2/2}$, which vanishes to second order at $0$, and the other one will be $\eta_2 = 4 \cdot 1_{[1/2,1]} \ast_M 1_{[1/2,1]}$, which has support bounded away from $0$. We will set $\kappa$ large (say $\kappa = 49$) so that the support of $\eta_\ast$ is indeed concentrated on a small interval $[0, \epsilon)$, as we wanted.
Now that we have chosen our smoothing weights $\eta_+$ and $\eta_\ast$, we have to estimate the major-arc integral (1.35) and the minor-arc integral (1.36). What follows can actually be done for general $\eta_+$ and $\eta_\ast$; we could have left our particular choice of $\eta_+$ and $\eta_\ast$ for the end.

Estimating the major-arc integral (1.35) may sound like an easy task, since we have rather precise estimates for $S_\eta(\alpha, x)$ ($\eta = \eta_+, \eta_\ast$) when $\alpha$ is on the major arcs; we could just replace $S_\eta(\alpha, x)$ in (1.35) by the approximation given by (1.7) and (1.11). It is, however, more efficient to express (1.35) as the sum of the contribution of the trivial character (a sum of integrals of $(\widehat{\eta}(-\delta)x)^3$, where $\widehat{\eta}(-\delta)x$ comes from (1.11)), plus a term of the form
\[
\left(\text{maximum of } \sqrt{q}\cdot E(q) \text{ for } q \le r\right) \cdot \int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha, x)\right|^2 d\alpha,
\]
where $E(q) = E$ is as in (1.12), plus two other terms of essentially the same form. As usual, the major arcs $\mathfrak{M}$ are the arcs around rationals $a/q$ with $q \le r$. We will soon discuss how to bound the integral of $|S_{\eta_+}(\alpha, x)|^2$ over arcs around rationals $a/q$ with $q \le s$, $s$ arbitrary. Here, however, it is best to estimate the integral over $\mathfrak{M}$ using the estimate on $S_{\eta_+}(\alpha, x)$ from (1.7) and (1.11); we obtain a great deal of cancellation, with the effect that, for $\chi$ non-trivial, the error term in (1.12) appears only when it gets squared, and thus becomes negligible.

The contribution of the trivial character has an easy approximation, thanks to the fast decay of $\eta$. We obtain that the major-arc integral (1.35) equals a main term $C_0 C_{\eta,\eta_\ast} x^2$, where
\[
\begin{aligned}
C_0 &= \prod_{p|n} \left(1 - \frac{1}{(p-1)^2}\right) \cdot \prod_{p\nmid n} \left(1 + \frac{1}{(p-1)^3}\right), \\
C_{\eta,\eta_\ast} &= \int_0^\infty \int_0^\infty \eta(t_1)\eta(t_2)\,\eta_\ast\!\left(\frac{n}{x} - (t_1 + t_2)\right) dt_1\, dt_2,
\end{aligned}
\]
plus several small error terms. We have already chosen $\eta$, $\eta_\ast$ and $x$ so as to (nearly) maximize $C_{\eta,\eta_\ast}$.
It is time to bound the minor-arc integral (1.36). As we said in §1.5, we must do better than the usual bound (1.37). Since our minor-arc bound (3.2) on $|S_\eta(\alpha, x)|$, $\alpha \sim a/q$, decreases as $q$ increases, it makes sense to use partial summation together with bounds on
\[
\int_{\mathfrak{m}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha = \int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha - \int_{\mathfrak{M}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha,
\]
where $\mathfrak{m}_s$ denotes the arcs around $a/q$, $r < q \le s$, and $\mathfrak{M}_s$ denotes the arcs around all $a/q$, $q \le s$. We already know how to estimate the integral on $\mathfrak{M}$. How do we bound the integral on $\mathfrak{M}_s$?

In order to do better than the trivial bound $\int_{\mathfrak{M}_s} \le \int_{\mathbb{R}/\mathbb{Z}}$, we will need to use the fact that the series (1.6) defining $S_{\eta_+}(\alpha, x)$ is essentially supported on prime numbers. Bounding the integral on $\mathfrak{M}_s$ is closely related to the problem of bounding
\[
\sum_{q\le s} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|\sum_{n\le x} a_n e(an/q)\right|^2
\tag{1.41}
\]
efficiently for $s$ considerably smaller than $\sqrt{x}$ and $a_n$ supported on the primes $\sqrt{x} < p \le x$. This is a classical problem in the study of the large sieve. The usual bound on (1.41) (by, for instance, Montgomery's inequality) has a gain of a factor of
\[
2e^\gamma (\log s)/(\log x/s^2)
\]
relative to the bound of $(x + s^2)\sum_n |a_n|^2$ that one would get from the large sieve without using prime support. Heath-Brown proceeded similarly to bound
\[
\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2e^\gamma \log s}{\log x/s^2} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha.
\tag{1.42}
\]
This already gives us the gain of $C(\log s)/\log x$ that we absolutely need, but the constant $C$ is suboptimal; the factor in the right side of (1.42) should really be $(\log s)/\log x$, i.e., $C$ should be $1$. We cannot reasonably hope to obtain a factor better than $2(\log s)/\log x$ in the minor arcs due to what is known as the parity problem in sieve theory. As it turns out, Ramaré [Ram09] had given general bounds on the large sieve that were clearly conducive to better bounds on (1.41), though they involved a ratio that was not easy to bound in general.

I used several careful estimations (including [Ram95, Lem. 3.4]) to reduce the problem of bounding this ratio to a finite number of cases, which I then checked by a rigorous computation. This approach gave a bound on (1.41) with a factor of size close to $2(\log s)/\log x$. (This solves the large-sieve problem for $s \le x^{0.3}$; it would still be worthwhile to give a computation-free proof for all $s \le x^{1/2-\epsilon}$, $\epsilon > 0$.) It was then easy to give an analogous bound for the integral over $\mathfrak{M}_s$, namely,
\[
\int_{\mathfrak{M}_s} |S_{\eta_+}(\alpha, x)|^2\, d\alpha \lesssim \frac{2\log s}{\log x} \int_{\mathbb{R}/\mathbb{Z}} |S_{\eta_+}(\alpha, x)|^2\, d\alpha,
\]
where $\lesssim$ can easily be made precise by replacing $\log s$ by $\log s + 1.36$ and $\log x$ by $\log x + c$, where $c$ is a small constant. Without this improvement, the main theorem would still have been proved, but the required computation time would have been multiplied by a factor of considerably more than $e^{3\gamma} = 5.6499\ldots$.

What remained then was just to compare the estimates on (1.35) and (1.36) and check that (1.36) is smaller for $n \ge 10^{27}$. This final step was just bookkeeping. As we already discussed, a check for $n < 10^{27}$ is easy. Thus ends the proof of the main theorem.
1.6 Some remarks on computations

There were two main computational tasks: verifying the ternary conjecture for all $n \le C$, and checking the Generalized Riemann Hypothesis for modulus $q \le r$ up to a certain height.

The first task was not very demanding. Platt and I verified in [HP13] that every odd integer $5 < n \le 8.8 \cdot 10^{30}$ can be written as the sum of three primes. (In the end, only a check for $5 < n \le 10^{27}$ was needed.) We proceeded as follows. In a major computational effort, Oliveira e Silva, Herzog and Pardi [OeSHP14] had already checked that the binary Goldbach conjecture is true up to $4 \cdot 10^{18}$, that is, every even number up to $4 \cdot 10^{18}$ is the sum of two primes. Given that, all we had to do was to construct a "prime ladder", that is, a list of primes from $3$ up to $8.8 \cdot 10^{30}$ such that the difference between any two consecutive primes in the list is at least $4$ and at most $4 \cdot 10^{18}$. (This is a known strategy: see [Sao98].) Then, for any odd integer $5 < n \le 8.8 \cdot 10^{30}$, there is a prime $p$ in the list such that $4 \le n - p \le 4 \cdot 10^{18} + 2$. (Choose the largest $p < n$ in the ladder, or, if $n$ minus that prime is $2$, choose the prime immediately under that.) By [OeSHP14] (and the fact that $4 \cdot 10^{18} + 2$ equals $p + q$, where $p = 2000000000000001301$ and $q = 1999999999999998701$ are both prime), we can write $n - p = p_1 + p_2$ for some primes $p_1$, $p_2$, and so $n = p + p_1 + p_2$.

Building a prime ladder involves only integer arithmetic, that is, computer manipulation of integers, rather than of real numbers. Integers are something that computers can handle rapidly and reliably. We look for primes for our ladder only among a special set of integers whose primality can be tested deterministically quite quickly (Proth numbers: $k \cdot 2^m + 1$, $k < 2^m$). Thus, we can build a prime ladder by a rigorous, deterministic algorithm that can be (and was) parallelized trivially.
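The prime-ladder argument can be played out at toy scale. The sketch below is an illustration under loudly stated assumptions, not the procedure of [HP13]: it uses trial division on all integers instead of a deterministic test on Proth numbers, the binary Goldbach "verification" is a brute-force search, and the bounds (`top = 2000`, `max_gap = 40`) stand in for $8.8 \cdot 10^{30}$ and $4 \cdot 10^{18}$. All function names are hypothetical.

```python
from bisect import bisect_left


def is_prime(n: int) -> bool:
    """Trial division (adequate at toy scale only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def binary_goldbach(m: int):
    """Write an even m >= 4 as p1 + p2 by brute force (stand-in for [OeSHP14])."""
    for p1 in range(2, m // 2 + 1):
        if is_prime(p1) and is_prime(m - p1):
            return p1, m - p1
    raise ValueError(f"no decomposition found for {m}")


def build_ladder(top: int, max_gap: int) -> list:
    """Primes 3 = p_0 < p_1 < ... with 4 <= p_{i+1} - p_i <= max_gap and last >= top."""
    ladder = [3]
    while ladder[-1] < top:
        p = ladder[-1]
        # greedy choice: the largest prime in [p + 4, p + max_gap]
        nxt = next((c for c in range(p + max_gap, p + 3, -1) if is_prime(c)), None)
        if nxt is None:
            raise ValueError("gap too large for this max_gap")
        ladder.append(nxt)
    return ladder


def three_primes(n: int, ladder: list):
    """Write an odd n, 5 < n <= ladder[-1], as p + p1 + p2."""
    i = bisect_left(ladder, n) - 1     # largest ladder prime strictly below n
    if n - ladder[i] == 2:             # n - p must be at least 4
        i -= 1
    p = ladder[i]
    p1, p2 = binary_goldbach(n - p)    # here 4 <= n - p <= max_gap + 2
    return p, p1, p2


if __name__ == "__main__":
    ladder = build_ladder(2000, 40)
    print(ladder[:6], three_primes(45, ladder), three_primes(1999, ladder))
```

The ladder itself is short: only about `top / max_gap` primes need to be found and certified, which is why this step of the computation was not demanding.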
The second computation is more demanding. It consists in verifying that, for every $L$-function $L(s, \chi)$ with $\chi$ of conductor $q \le r = 300000$ (for $q$ even) or $q \le r/2$ (for $q$ odd), all zeroes of $L(s, \chi)$ such that $|\Im(s)| \le H_q = 10^8/q$ (for $q$ odd) and $|\Im(s)| \le H_q = \max(10^8/q, 200 + 7.5 \cdot 10^7/q)$ (for $q$ even) lie on the critical line. As a matter of fact, Platt went up to conductor $q \le 200000$ (or twice that for $q$ even) [Plab]; he had already gone up to conductor $100000$ in his PhD thesis [Pla11]. The verification took, in total, about $400000$ core-hours (i.e., the total number of processor cores used times the number of hours they ran equals $400000$; nowadays, a top-of-the-line processor typically has eight cores). In the end, since I used only $q \le 150000$ (or twice that for $q$ even), the number of hours actually needed was closer to $160000$; since I could have made do with $q \le 120000$ (at the cost of increasing $C$ to $10^{29}$ or $10^{30}$), it is likely, in retrospect, that only about $80000$ core-hours were needed.
Checking zeros of $L$-functions computationally goes back to Riemann (who did it by hand for the special case of the Riemann zeta function). It is also one of the things that were tried on digital computers in their early days (by Turing [Tur53], for instance; see the exposition in [Boo06b]). One of the main issues to be careful about arises whenever one manipulates real numbers via a computer: generally speaking, a computer cannot store an irrational number; moreover, while a computer can handle rationals, it is really most comfortable handling just those rationals whose denominators are powers of two. Thus, one cannot really say: "computer, give me the sine of that number" and expect a precise result. What one should do, if one really wants to prove something (as is the case here), is to say: "computer, I am giving you an interval $I = [a/2^k, b/2^k]$; give me an interval $I' = [c/2^\ell, d/2^\ell]$, preferably very short, such that $\sin(I) \subset I'$". This is called interval arithmetic; it is arguably the easiest way to do floating-point computations rigorously.
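The request "give me an interval $I'$ with dyadic endpoints such that $\sin(I) \subset I'$" can be answered in a few lines if one is willing to use exact rational arithmetic. The sketch below is a slow but rigorous toy, not the method used in the actual computations (which relied on hardware-assisted interval arithmetic and crlibm). It assumes $0 \le a \le b \le 3/2$, so that $\sin$ is increasing on the interval and the Taylor series is alternating with decreasing terms; the function names and default precision are illustrative choices.

```python
from fractions import Fraction
import math


def sin_partial_sum(x: Fraction, terms: int) -> Fraction:
    """Exact sum of the first `terms` terms of x - x^3/3! + x^5/5! - ..."""
    total = Fraction(0)
    term = x
    for k in range(terms):
        total += term
        term = -term * x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def round_down(v: Fraction, bits: int) -> Fraction:
    """Largest multiple of 2^-bits that is <= v."""
    return Fraction(math.floor(v * 2**bits), 2**bits)


def round_up(v: Fraction, bits: int) -> Fraction:
    """Smallest multiple of 2^-bits that is >= v."""
    return Fraction(math.ceil(v * 2**bits), 2**bits)


def interval_sin(a: Fraction, b: Fraction, bits: int = 40, terms: int = 12):
    """Return dyadic (lo, hi) with sin([a, b]) contained in [lo, hi].

    Assumes 0 <= a <= b <= 3/2 and `terms` even. For 0 <= x <= 3/2 the Taylor
    terms decrease in absolute value, so a partial sum ending in a negative
    term is a lower bound and one ending in a positive term is an upper bound.
    """
    if not (0 <= a <= b <= Fraction(3, 2)) or terms % 2 != 0:
        raise ValueError("need 0 <= a <= b <= 3/2 and an even number of terms")
    lower = sin_partial_sum(a, terms)        # ends with a negative term
    upper = sin_partial_sum(b, terms + 1)    # ends with a positive term
    return round_down(lower, bits), round_up(upper, bits)


if __name__ == "__main__":
    lo, hi = interval_sin(Fraction(1, 2), Fraction(513, 1024))
    print(lo, hi, float(hi - lo))
```

All rounding is outward (down for the lower endpoint, up for the upper one), which is the defining discipline of interval arithmetic: the output may be slightly too wide, but never too narrow.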
Processors do not do this natively, and if interval arithmetic is implemented purely on software, computations can be slowed down by a factor of about $100$. Fortunately, there are ways of running interval-arithmetic computations partly on hardware, partly on software.

Incidentally, there are some basic functions (such as $\sin$) that should always be done on software, not just if one wants to use interval arithmetic, but even if one just wants reasonably precise results: the implementation of transcendental functions in some of the most popular processors does not always round correctly, and errors can accumulate quickly. Fortunately, this problem is already well-known, and there is software that takes care of this. (Platt and I used the crlibm library [DLDDD+10].)
Lastly, there were several relatively minor computations strewn here and there in the proof. There is some numerical integration, done rigorously; once or twice, this was done using a standard package based on interval arithmetic [Ned06], but most of the time I wrote my own routines in C (using Platt's interval arithmetic package) for the sake of speed. Another kind of computation (employed much more in [Hela] than in the somewhat more polished version of the proof given here) was a rigorous version of a "proof by graph" ("the maximum of a function $f$ is clearly less than $4$ because I can see it on the screen"). There is a standard way to do this (see, e.g., [Tuc11, §5.2]); essentially, the bisection method combines naturally with interval arithmetic, as we shall describe in §2.6. Yet another computation (and not a very small one) was that involved in verifying a large-sieve inequality in an intermediate range (as we discussed in §1.5).

It may be interesting to note that one of the inequalities used to estimate (1.30) was proven with the help of automatic quantifier elimination [HB11]. Proving this inequality was a very minor task, both computationally and mathematically; in all likelihood, it is feasible to give a human-generated proof. Still, it is nice to know from first-hand experience that computers can nowadays (pretend to) do something other than just perform numerical computations, and that this is already applicable in current mathematical practice.
Chapter 2

Notation and preliminaries

2.1 General notation

Given positive integers $m$, $n$, we say $m|n^\infty$ if every prime dividing $m$ also divides $n$. We say a positive integer $n$ is square-full if, for every prime $p$ dividing $n$, the square $p^2$ also divides $n$. (In particular, $1$ is square-full.) We say $n$ is square-free if $p^2 \nmid n$ for every prime $p$. For $p$ prime, $n$ a non-zero integer, we define $v_p(n)$ to be the largest non-negative integer $\alpha$ such that $p^\alpha | n$.

When we write $\sum_n$, we mean $\sum_{n=1}^\infty$, unless the contrary is stated. As always, $\Lambda(n)$ denotes the von Mangoldt function:
\[
\Lambda(n) = \begin{cases} \log p & \text{if } n = p^\alpha \text{ for some prime } p \text{ and some integer } \alpha \ge 1, \\ 0 & \text{otherwise,} \end{cases}
\]
and $\mu$ denotes the Möbius function:
\[
\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k, \text{ all } p_i \text{ distinct,} \\ 0 & \text{if } p^2 | n \text{ for some prime } p. \end{cases}
\]
We let $\tau(n)$ be the number of divisors of an integer $n$, $\omega(n)$ the number of prime divisors of $n$, and $\sigma(n)$ the sum of the divisors of $n$.

We write $(a, b)$ for the greatest common divisor of $a$ and $b$. If there is any risk of confusion with the pair $(a, b)$, we write $\gcd(a, b)$. Denote by $(a, b^\infty)$ the divisor $\prod_{p|b} p^{v_p(a)}$ of $a$. (Thus, $a/(a, b^\infty)$ is coprime to $b$, and is in fact the maximal divisor of $a$ with this property.)
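For readers who like to experiment, the arithmetic functions just defined are easy to realize for small arguments. The following is a plain reference sketch via trial-division factorization; the function names are hypothetical and no attempt is made at efficiency.

```python
import math


def factorize(n: int) -> dict:
    """Prime factorization of n >= 1 as {p: v_p(n)}, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n is a power of a prime p, else 0."""
    f = factorize(n)
    return math.log(next(iter(f))) if len(f) == 1 else 0.0


def mobius(n: int) -> int:
    """mu(n) = (-1)^k for a product of k distinct primes, 0 if a square divides n."""
    f = factorize(n)
    return 0 if any(v > 1 for v in f.values()) else (-1) ** len(f)


def tau(n: int) -> int:
    """Number of divisors of n."""
    return math.prod(v + 1 for v in factorize(n).values())


def omega(n: int) -> int:
    """Number of distinct prime divisors of n."""
    return len(factorize(n))


def sigma(n: int) -> int:
    """Sum of the divisors of n."""
    return math.prod((p ** (v + 1) - 1) // (p - 1) for p, v in factorize(n).items())


def gcd_infinity(a: int, b: int) -> int:
    """(a, b^infinity): the product of p^{v_p(a)} over primes p dividing b."""
    fa = factorize(a)
    return math.prod(p ** fa.get(p, 0) for p in factorize(b))


if __name__ == "__main__":
    print(mobius(30), von_mangoldt(8), tau(12), omega(12), sigma(12), gcd_infinity(360, 6))
```

For example, $(360, 6^\infty) = 2^3 \cdot 3^2 = 72$, and $360/72 = 5$ is coprime to $6$.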
As is customary, we write $e(x)$ for $e^{2\pi i x}$. We denote the $L_r$ norm of a function $f$ by $|f|_r$. We write $O^*(R)$ to mean a quantity at most $R$ in absolute value. Given a set $S$, we write $1_S$ for its characteristic function:
\[
1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{otherwise.} \end{cases}
\]
Write $\log^+ x$ for $\max(\log x, 0)$.
2.2 Dirichlet characters and L functions

Let us go over some basic terms. A Dirichlet character $\chi : \mathbb{Z} \to \mathbb{C}$ of modulus $q$ is a character $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$ with the convention that $\chi(n) = 0$ when $(n, q) \ne 1$. (In other words: $\chi$ is completely multiplicative and periodic modulo $q$, and vanishes on integers not coprime to $q$.) Again by convention, there is a Dirichlet character of modulus $q = 1$, namely, the trivial character $\chi_T : \mathbb{Z} \to \mathbb{C}$ defined by $\chi_T(n) = 1$ for every $n \in \mathbb{Z}$.

If $\chi$ is a character modulo $q$ and $\chi'$ is a character modulo $q'|q$ such that $\chi(n) = \chi'(n)$ for all $n$ coprime to $q$, we say that $\chi'$ induces $\chi$. A character is primitive if it is not induced by any character of smaller modulus. Given a character $\chi$, we write $\chi^*$ for the (uniquely defined) primitive character inducing $\chi$. If a character $\chi \bmod q$ is induced by the trivial character $\chi_T$, we say that $\chi$ is principal and write $\chi_0$ for $\chi$ (provided the modulus $q$ is clear from the context). In other words, $\chi_0(n) = 1$ when $(n, q) = 1$ and $\chi_0(n) = 0$ when $(n, q) \ne 1$.

A Dirichlet $L$-function $L(s, \chi)$ ($\chi$ a Dirichlet character) is defined as the analytic continuation of $\sum_n \chi(n) n^{-s}$ to the entire complex plane; there is a pole at $s = 1$ if $\chi$ is principal.

A non-trivial zero of $L(s, \chi)$ is any $s \in \mathbb{C}$ such that $L(s, \chi) = 0$ and $0 < \Re(s) < 1$. (In particular, a zero at $s = 0$ is called "trivial", even though its contribution can be a little tricky to work out. The same would go for the other zeros with $\Re(s) = 0$ occurring for $\chi$ non-primitive, though we will avoid this issue by working mainly with $\chi$ primitive.) The zeros that occur at (some) negative integers are called trivial zeros.

The critical line is the line $\Re(s) = 1/2$ in the complex plane. Thus, the generalized Riemann hypothesis for Dirichlet $L$-functions reads: for every Dirichlet character $\chi$, all non-trivial zeros of $L(s, \chi)$ lie on the critical line. Verifiable finite versions of the generalized Riemann hypothesis generally read: for every Dirichlet character $\chi$ of modulus $q \le Q$, all non-trivial zeros of $L(s, \chi)$ with $|\Im(s)| \le f(q)$ lie on the critical line (where $f : \mathbb{Z} \to \mathbb{R}^+$ is some given function).
2.3 Fourier transforms and exponential sums

The Fourier transform on $\mathbb{R}$ is normalized here as follows:
\[
\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt) f(x)\, dx.
\]
The trivial bound is $|\widehat{f}|_\infty \le |f|_1$. If $f$ is compactly supported (or of fast enough decay as $t \mapsto \pm\infty$) and piecewise continuous, $\widehat{f}(t) = \widehat{f'}(t)/(2\pi i t)$ by integration by parts. Iterating, we obtain that, if $f$ is of fast decay and differentiable $k$ times outside finitely many points, then
\[
\widehat{f}(t) = O^*\!\left(\frac{|\widehat{f^{(k)}}|_\infty}{(2\pi t)^k}\right) = O^*\!\left(\frac{|f^{(k)}|_1}{(2\pi t)^k}\right).
\tag{2.1}
\]
Thus, for instance, if $f$ is compactly supported, continuous and piecewise $C^1$, then $\widehat{f}$ decays at least quadratically.

It could happen that $|f^{(k)}|_1 = \infty$, in which case (2.1) is trivial (but not false). In practice, we require $f^{(k)} \in L_1$. In a typical situation, $f$ is differentiable $k$ times except at $x_1, x_2, \dots, x_k$, where it is differentiable only $(k - 2)$ times; the contribution of $x_i$ (say) to $|f^{(k)}|_1$ is then $|\lim_{x\to x_i^+} f^{(k-1)}(x) - \lim_{x\to x_i^-} f^{(k-1)}(x)|$.

The following bound is standard (see, e.g., [Tao14, Lemma 3.1]): for $\alpha \in \mathbb{R}/\mathbb{Z}$ and $f : \mathbb{R} \to \mathbb{C}$ compactly supported and piecewise continuous,
\[
\left|\sum_{n\in\mathbb{Z}} f(n) e(\alpha n)\right| \le \min\left(|f|_1 + \frac{1}{2}|f'|_1,\; \frac{\frac{1}{2}|f'|_1}{|\sin(\pi\alpha)|}\right).
\tag{2.2}
\]
(The first bound follows from $\sum_{n\in\mathbb{Z}} |f(n)| \le |f|_1 + (1/2)|f'|_1$, which, in turn, is a quick consequence of the fundamental theorem of calculus; the second bound is proven by summation by parts.) The alternative bound $(1/4)|f''|_1/|\sin(\pi\alpha)|^2$ given in [Tao14, Lemma 3.1] (for $f$ continuous and piecewise $C^1$) can usually be improved by the following estimate.
Lemma 231 Let f Rrarr C be compactly supported continuous and piecewise C1Then ∣∣∣∣∣sum
nisinZf(n)e(αn)
∣∣∣∣∣ le 14 |f primeprime|infin
(sinπα)2(23)
for every α isin R
As usual the assumption of compact support could easily be relaxed to an assump-tion of fast decay
Proof. By the Poisson summation formula,
\[ \sum_{n=-\infty}^{\infty} f(n) e(\alpha n) = \sum_{n=-\infty}^{\infty} \hat{f}(n - \alpha). \]
Since $\hat{f}(t) = \widehat{f'}(t)/(2\pi i t)$,
\[ \sum_{n=-\infty}^{\infty} \hat{f}(n-\alpha) = \sum_{n=-\infty}^{\infty} \frac{\widehat{f'}(n-\alpha)}{2\pi i (n-\alpha)} = \sum_{n=-\infty}^{\infty} \frac{\widehat{f''}(n-\alpha)}{(2\pi i (n-\alpha))^2}. \]
By Euler's formula $\pi \cot s\pi = 1/s + \sum_{n=1}^{\infty} (1/(n+s) - 1/(n-s))$,
\[ \sum_{n=-\infty}^{\infty} \frac{1}{(n+s)^2} = -(\pi \cot s\pi)' = \frac{\pi^2}{(\sin s\pi)^2}. \tag{2.4} \]
Hence
\[ \left|\sum_{n=-\infty}^{\infty} \hat{f}(n-\alpha)\right| \le |\widehat{f''}|_\infty \sum_{n=-\infty}^{\infty} \frac{1}{(2\pi(n-\alpha))^2} = |\widehat{f''}|_\infty \cdot \frac{1}{(2\pi)^2} \cdot \frac{\pi^2}{(\sin \alpha\pi)^2}. \]
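The identity (2.4), on which the proof rests, can be sanity-checked numerically. The following is a minimal sketch; the function names `partial_sum` and `closed_form` are illustrative and do not come from the text, and the comparison is in ordinary floating point, so it is a plausibility check rather than a proof.

```python
import math

def partial_sum(s, N):
    """Symmetric partial sum of 1/(n+s)^2 over -N <= n <= N (s must not be an integer)."""
    return sum(1.0 / (n + s) ** 2 for n in range(-N, N + 1))

def closed_form(s):
    """Right-hand side of (2.4): pi^2 / sin^2(pi s)."""
    return (math.pi / math.sin(math.pi * s)) ** 2

if __name__ == "__main__":
    for s in (0.1, 0.25, 0.5, 0.9):
        # The neglected tail is about 2/N, so agreement to ~1e-4 is expected here.
        print(s, partial_sum(s, 50_000), closed_form(s))
```

The partial sums converge slowly (the tail is of size about $2/N$), which is why the closed form is the useful one in the proof.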
The trivial bound $|\widehat{f''}|_\infty \le |f''|_1$, applied to (2.3), recovers the bound in [Tao14, Lemma 3.1]. In order to do better, we will give a tighter bound for $|\widehat{f''}|_\infty$ in Appendix B when $f$ is equal to one of our main smoothing functions ($f = \eta_2$).

Integrals of multiples of $f''$ (in particular, $|f''|_1$ and $\widehat{f''}$) can still be made sense of when $f''$ is undefined at a finite number of points, provided $f$ is understood as a distribution (and $f'$ has finite total variation). This is the case, in particular, for $f = \eta_2$.
When we need to estimate $\sum_n f(n)$ precisely, we will use the Poisson summation formula:
\[ \sum_n f(n) = \sum_n \hat{f}(n). \]
We will not have to worry about convergence here, since we will apply the Poisson summation formula only to compactly supported functions $f$ whose Fourier transforms decay at least quadratically.
2.4 Mellin transforms

The Mellin transform of a function $\phi : (0,\infty) \to \mathbb{C}$ is
\[ M\phi(s) = \int_0^\infty \phi(x) x^{s-1}\,dx. \tag{2.5} \]
If $\phi(x)x^{\sigma-1}$ is in $\ell_1$ with respect to $dx$ (i.e., $\int_0^\infty |\phi(x)| x^{\sigma-1}\,dx < \infty$), then the Mellin transform is defined on the line $\sigma + i\mathbb{R}$. Moreover, if $\phi(x)x^{\sigma-1}$ is in $\ell_1$ for $\sigma = \sigma_1$ and for $\sigma = \sigma_2$, where $\sigma_2 > \sigma_1$, then it is easy to see that it is also in $\ell_1$ for all $\sigma \in (\sigma_1,\sigma_2)$, and that, moreover, the Mellin transform is holomorphic on $\{s : \sigma_1 < \Re(s) < \sigma_2\}$. We then say that $\{s : \sigma_1 < \Re(s) < \sigma_2\}$ is a strip of holomorphy for the Mellin transform.
The Mellin transform becomes a Fourier transform (of $\eta(e^{-2\pi v})e^{-2\pi v\sigma}$) by means of the change of variables $x = e^{-2\pi v}$. We thus obtain, for example, that the Mellin transform is an isometry, in the sense that
\[ \int_0^\infty |f(x)|^2 x^{2\sigma}\,\frac{dx}{x} = \frac{1}{2\pi}\int_{-\infty}^{\infty} |Mf(\sigma + it)|^2\,dt. \tag{2.6} \]
Recall that, in the case of the Fourier transform, for $|\hat{f}|_2 = |f|_2$ to hold, it is enough that $f$ be in $\ell_1 \cap \ell_2$. This gives us that, for (2.6) to hold, it is enough that $f(x)x^{\sigma-1}$ be in $\ell_1$ and $f(x)x^{\sigma-1/2}$ be in $\ell_2$ (again, with respect to $dx$ in both cases).
We write $f *_M g$ for the multiplicative, or Mellin, convolution of $f$ and $g$:
\[ (f *_M g)(x) = \int_0^\infty f(w)\, g\!\left(\frac{x}{w}\right) \frac{dw}{w}. \tag{2.7} \]
In general,
\[ M(f *_M g) = Mf \cdot Mg \tag{2.8} \]
and
\[ M(f \cdot g)(s) = \frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty} Mf(z)\, Mg(s-z)\,dz \quad \text{[GR94, §17.32]}, \tag{2.9} \]
provided that $z$ and $s - z$ are within the strips on which $Mf$ and $Mg$ (respectively) are well-defined.
We also have several useful transformation rules, just as for the Fourier transform. For example,
\[ \begin{aligned} M(f'(t))(s) &= -(s-1)\cdot Mf(s-1),\\ M(t f'(t))(s) &= -s \cdot Mf(s),\\ M((\log t) f(t))(s) &= (Mf)'(s) \end{aligned} \tag{2.10} \]
(as in, e.g., [BBO10, Table 1.11]).

Let
\[ \eta_2 = (2 \cdot 1_{[1/2,1]}) *_M (2 \cdot 1_{[1/2,1]}). \]
Since (see, e.g., [BBO10, Table 11.3] or [GR94, §16.43])
\[ (M I_{[a,b]})(s) = \frac{b^s - a^s}{s}, \]
we see that
\[ M\eta_2(s) = \left(\frac{1 - 2^{-s}}{s}\right)^2, \qquad M\eta_4(s) = \left(\frac{1 - 2^{-s}}{s}\right)^4. \tag{2.11} \]
Let $f_z = e^{-zt}$, where $\Re(z) > 0$. Then, by the change of variables $u = zt$,
\[ (Mf_z)(s) = \int_0^\infty e^{-zt} t^{s-1}\,dt = \frac{1}{z^s}\int_0^{z\infty} e^{-u} u^{s-1}\,du = \frac{1}{z^s}\int_0^\infty e^{-t} t^{s-1}\,dt = \frac{\Gamma(s)}{z^s}, \]
where the next-to-last step holds by contour integration, and the last step holds by the definition of the Gamma function $\Gamma(s)$.
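The formula $(Mf_z)(s) = \Gamma(s)/z^s$ can be checked numerically for a sample complex $z$ with positive real part. The sketch below is only an illustration: the helper `mellin_exp`, the truncation point `T` and the step count are choices made here (not taken from the text), the quadrature is a plain composite Simpson rule, and real $s \ge 1$ is assumed so that the integrand is bounded at $0$.

```python
import cmath
import math

def mellin_exp(z, s, T=40.0, steps=40_000):
    """Approximate int_0^T e^{-z t} t^{s-1} dt by composite Simpson.

    Assumes Re(z) > 0, real s >= 1, `steps` even, and Re(z)*T large enough
    that the neglected tail beyond T is negligible.
    """
    h = T / steps

    def g(t):
        return cmath.exp(-z * t) * t ** (s - 1)

    total = g(0.0) + g(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

z, s = 1 + 2j, 2.5
numeric = mellin_exp(z, s)
exact = math.gamma(s) / z ** s   # principal branch of z^s, valid for Re(z) > 0
print(abs(numeric - exact))
```

The principal branch of $z^s$ is the right one here precisely because the contour rotation in the text stays inside the half-plane $\Re(z) > 0$.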
2.5 Bounds on sums of $\mu$ and $\Lambda$

We will need some simple explicit bounds on sums involving the von Mangoldt function $\Lambda$ and the Möbius function $\mu$. In non-explicit work, such sums are usually bounded using the prime number theorem, or rather using the properties of the zeta function $\zeta(s)$ underlying the prime number theorem. Here, however, we need robust, fully explicit bounds valid over just about any range.

For the most part, we will just be quoting the literature, supplemented with some computations when needed. The proofs in the literature are sometimes based on properties of $\zeta(s)$, and sometimes on more elementary facts.
First, let us see some bounds involving $\Lambda$. The following bound can be easily derived from [RS62, (3.23)], supplemented by a quick calculation of the contribution of powers of primes $p < 32$:
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} \le \log x. \tag{2.12} \]
We can derive a bound in the other direction from [RS62, (3.21)] (for $x > 1000$, adding the contribution of all prime powers $\le 1000$) and a numerical verification for $x \le 1000$:
\[ \sum_{n\le x} \frac{\Lambda(n)}{n} \ge \log x - \log\frac{3}{\sqrt{2}}. \tag{2.13} \]
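For small $x$, the bounds (2.12) and (2.13) can be checked directly. The following sketch does so in floating point (not in interval arithmetic, so it is an illustration rather than a rigorous verification); the function names are illustrative, and the range $1 \le x \le N$ is an assumption made here for the test.

```python
import math

def von_mangoldt(N):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:          # Lambda(p^k) = log p
                lam[pk] = math.log(p)
                pk *= p
    return lam

def check_bounds(N, eps=1e-9):
    """Check (2.12) and (2.13) for all real x with 1 <= x <= N."""
    lam = von_mangoldt(N)
    c = math.log(3 / math.sqrt(2))
    s = 0.0                          # sum of Lambda(n)/n over n <= k-1
    for k in range(1, N + 1):
        # Lower bound: the worst case on [k-1, k) is x just below k.
        if s < math.log(k) - c - eps:
            return False
        s += lam[k] / k
        # Upper bound: the worst case on [k, k+1) is x = k.
        if s > math.log(k) + eps:
            return False
    return True

print(check_bounds(10_000))
```

Since the sum is a step function while $\log x$ is increasing, it is enough to test each bound at the end of a step where it is tightest; the lower bound (2.13) is attained in the limit $x \to 3^-$, which is where the constant $\log(3/\sqrt{2})$ comes from.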
We also use the following older bounds:

1. By the second table in [RR96, p. 423], supplemented by a computation for $2\cdot 10^6 \le y \le 4\cdot 10^6$,
\[ \sum_{n\le y} \Lambda(n) \le 1.0004\,y \tag{2.14} \]
for $y \ge 2\cdot 10^6$.

2. \[ \sum_{n\le y} \Lambda(n) < 1.03883\,y \tag{2.15} \]
for every $y > 0$ [RS62, Thm. 12].
For all $y > 663$,
\[ \sum_{n\le y} \Lambda(n)\, n < 1.03884\,\frac{y^2}{2}, \tag{2.16} \]
where we use (2.15) and partial summation for $y > 200000$, and a computation for $663 < y \le 200000$. Using instead the second table in [RR96, p. 423], together with computations for small $y < 10^7$ and partial summation, we get that
\[ \sum_{n\le y} \Lambda(n)\, n < 1.0008\,\frac{y^2}{2} \tag{2.17} \]
for $y > 1.6\cdot 10^6$.

Similarly,
\[ \sum_{n\le y} \frac{\Lambda(n)}{\sqrt{n}} < 2\cdot 1.0004\sqrt{y} \tag{2.18} \]
for all $y \ge 1$.

It is also true that
\[ \sum_{y/2 < p \le y} (\log p)^2 \le \frac{1}{2}\, y (\log y) \tag{2.19} \]
for $y \ge 117$: this holds for $y \ge 2\cdot 758699$ by [RS75, Cor. 2] (applied to $x = y$, $x = y/2$ and $x = 2y/3$) and for $117 \le y < 2\cdot 758699$ by direct computation.
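The kind of "direct computation" invoked for (2.19), and the bound (2.15), can be illustrated on a small range. This is a floating-point sketch with illustrative names (`primes_up_to`, `check_psi_and_219`), not the computation actually carried out for the text, which would need rigorous arithmetic and a much larger range.

```python
import math

def primes_up_to(N):
    """Sieve of Eratosthenes; assumes N >= 2."""
    sieve = [True] * (N + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, N + 1, p):
                sieve[m] = False
    return [p for p in range(2, N + 1) if sieve[p]]

def check_psi_and_219(N):
    """Return (ok_215, ok_219): (2.15) for 1 <= y <= N and (2.19) for 117 <= y <= N."""
    lam = [0.0] * (N + 1)      # von Mangoldt function
    sq = [0.0] * (N + 1)       # (log p)^2 at primes, 0 elsewhere
    for p in primes_up_to(N):
        lp = math.log(p)
        sq[p] = lp * lp
        pk = p
        while pk <= N:
            lam[pk] = lp
            pk *= p
    psi = 0.0
    T = [0.0] * (N + 1)        # T[k] = sum of (log p)^2 over primes p <= k
    ok_215 = ok_219 = True
    for k in range(1, N + 1):
        psi += lam[k]
        T[k] = T[k - 1] + sq[k]
        # Both sides are monotone between integers, so integer y is the worst case.
        if psi >= 1.03883 * k:
            ok_215 = False
        if k >= 117 and T[k] - T[k // 2] > 0.5 * k * math.log(k) + 1e-9:
            ok_219 = False
    return ok_215, ok_219

print(check_psi_and_219(20_000))
```

For integer primes $p$, the condition $p > y/2$ is the same as $p > \lfloor y/2\rfloor$ when $y$ is an integer, which is why a prefix-sum difference suffices.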
Now let us see some estimates on sums involving $\mu$. The situation here is less satisfactory than for sums involving $\Lambda$. The main reason is that the complex-analytic approach to estimating $\sum_{n\le N} \mu(n)$ would involve $1/\zeta(s)$ rather than $\zeta'(s)/\zeta(s)$, and thus strong explicit bounds on the residues of $1/\zeta(s)$ would be needed. Thus, explicit estimates on sums involving $\mu$ are harder to obtain than estimates on sums involving $\Lambda$. This is so even though analytic number theorists are generally used (from the habit of non-explicit work) to seeing the estimation of one kind of sum or the other as essentially the same task.

Fortunately, in the case of sums of the type $\sum_{n\le x} \mu(n)/n$ for $x$ arbitrary (a type of sum that will be rather important for us), all we need is a saving of $(\log n)$ or $(\log n)^2$ on the trivial bound. This is provided by the following.
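1. (Granville-Ramaré [GR96], Lemma 10.2)
\[ \left|\sum_{n\le x:\ \gcd(n,q)=1} \frac{\mu(n)}{n}\right| \le 1 \tag{2.20} \]
for all $x$, $q \ge 1$.

2. (Ramaré [Ram13]; cf. El Marraki [EM95], [EM96])
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \frac{0.03}{\log x} \tag{2.21} \]
for $x \ge 11815$.

3. (Ramaré [Ramb])
\[ \sum_{n\le x:\ \gcd(n,q)=1} \frac{\mu(n)}{n} = O^*\left(\frac{1}{\log x/q}\cdot \frac{4}{5}\,\frac{q}{\phi(q)}\right) \tag{2.22} \]
for all $x$ and all $q \le x$;
\[ \sum_{n\le x:\ \gcd(n,q)=1} \frac{\mu(n)}{n} \log\frac{x}{n} = O^*\left(1.00303\,\frac{q}{\phi(q)}\right) \tag{2.23} \]
for all $x$ and all $q$.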
Improvements on these bounds would lead to improvements on type I estimates, but not in what are the worst terms overall at this point.

A computation carried out by the author has proven the following inequality for all real $x \le 10^{12}$:
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \sqrt{\frac{2}{x}}. \tag{2.24} \]
The computation was conducted rigorously by means of interval arithmetic. For the sake of verification, we record that
\[ 5.42625\cdot 10^{-8} \le \sum_{n\le 10^{12}} \frac{\mu(n)}{n} \le 5.42898\cdot 10^{-8}. \]
Computations also show that the stronger bound
\[ \left|\sum_{n\le x} \frac{\mu(n)}{n}\right| \le \frac{1}{2\sqrt{x}} \]
holds for all $3 \le x \le 7727068587$, but not for $x = 7727068588 - \epsilon$.

Earlier, numerical work carried out by Olivier Ramaré [Ram14] had shown that (2.24) holds for all $x \le 10^{10}$.
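On a much smaller range, both inequalities for $\sum_{n\le x}\mu(n)/n$ can be reproduced in a few lines. The sketch below uses ordinary floating point and a small tolerance `eps`, unlike the rigorous interval-arithmetic computation described in the text; the names `mobius` and `check_mu_bounds` are illustrative.

```python
import math

def mobius(N):
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]          # one sign flip per prime factor
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0               # not square-free
    return mu

def check_mu_bounds(N, eps=1e-12):
    """Check (2.24) for 1 <= x < N and the bound 1/(2 sqrt(x)) for 3 <= x < N."""
    mu = mobius(N)
    m = 0.0
    ok_224 = ok_strong = True
    for k in range(1, N):
        m += mu[k] / k                  # value of the sum for x in [k, k+1)
        # The bounds decrease in x, so the worst case is x -> (k+1) from below.
        if abs(m) > math.sqrt(2 / (k + 1)) + eps:
            ok_224 = False
        if k >= 3 and abs(m) > 1 / (2 * math.sqrt(k + 1)) + eps:
            ok_strong = False
    return ok_224, ok_strong

print(check_mu_bounds(50_000))
```

Note that (2.24) is attained in the limit $x \to 2^-$, where the sum equals $1$; this is why a tolerance is needed in floating point.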
2.6 Interval arithmetic and the bisection method

Interval arithmetic has as its basic data type intervals of the form $I = [a/2^\ell, b/2^\ell]$, where $a, b, \ell \in \mathbb{Z}$ and $a \le b$. Say we have a real number $x$, and we want to know $\sin(x)$. In general, we cannot represent $x$ in a computer, in part because it may have no finite description. The best we can do is to construct an interval of the form $I = [a/2^\ell, b/2^\ell]$ in which $x$ is contained.

What we ask of a routine in an interval-arithmetic package is to construct an interval $I' = [a'/2^{\ell'}, b'/2^{\ell'}]$ in which $\sin(I)$ is contained. (In practice, this is done partly in software, by means of polynomial approximations to $\sin$ with precise error terms, and partly in hardware, by means of an efficient usage of rounding conventions.) This gives us, in effect, a value for $\sin(x)$ (namely, $(a' + b')/2^{\ell'+1}$) and a bound on the error term (namely, $(b' - a')/2^{\ell'+1}$).
There are several implementations of interval arithmetic available. We will almost always use D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas. (At one point, we will use the PROFIL/BIAS interval arithmetic package [Knu99], since it underlies the VNODE-LP [Ned06] package, which we use to bound an integral.)

The bisection method is a particularly simple method for finding maxima and minima of functions, as well as roots. It combines rather nicely with interval arithmetic, which makes the method rigorous. We follow an implementation based on [Tuc11, §5.2]. Let us go over the basic ideas.
Let us use the bisection method to find the minima (say) of a function $f$ on a compact interval $I_0$. (If the interval is non-compact, we generally apply the bisection method to a compact sub-interval and use other tools, e.g., power-series expansions, in the complement.) The method proceeds by splitting an interval into two repeatedly, discarding the halves where the minimum cannot be found. More precisely, if we implement it by interval arithmetic, it proceeds as follows. First, in an optional initial step, we subdivide (if necessary) the interval $I_0$ into smaller intervals $I_k$ to which the algorithm will actually be applied. For each $k$, interval arithmetic gives us a lower bound $r_k^-$ and an upper bound $r_k^+$ on $f(x)$, $x \in I_k$; here $r_k^-$ and $r_k^+$ are both of the form $a/2^\ell$, $a, \ell \in \mathbb{Z}$. Let $m_0$ be the minimum of $r_k^+$ over all $k$. We can discard all the intervals $I_k$ for which $r_k^- > m_0$. Then we apply the main procedure: starting with $i = 1$, split each surviving interval into two equal halves, recompute the lower and upper bound on each half, define $m_i$, as before, to be the minimum of all upper bounds, and discard, again, the intervals on which the lower bound is larger than $m_i$; increase $i$ by 1. We repeat the main procedure as often as needed. In the end, we obtain that the minimum is no smaller than the minimum of the lower bounds (call them $(r^{(i)})_k^-$) on all surviving intervals $I_k^{(i)}$. Of course, we also obtain that the minimum (or minima, if there is more than one) must lie in one of the surviving intervals.
It is easy to see how the same method can be applied (with a trivial modification) to find maxima or (with very slight changes) to find the roots of a real-valued function on a compact interval.
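The bisection procedure just described can be sketched in a few lines. This toy version is not the implementation of [Pla11] or [Tuc11]: it is an illustration under stated assumptions, using exact rational endpoints (Python's `fractions.Fraction`, so that dyadic endpoints $a/2^\ell$ are represented exactly and no rounding control is needed) and a polynomial test function $f(x) = x^2 - x$ chosen here only as an example. The class `Interval` and the function `bisect_min` are hypothetical names.

```python
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with exact rational endpoints."""
    def __init__(self, lo, hi):
        self.lo, self.hi = Fraction(lo), Fraction(hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

def f(I):
    """Interval extension of the toy function f(x) = x^2 - x."""
    return I * I - I

def bisect_min(f, I0, rounds):
    """Return (lower, upper, survivors) with lower <= min f <= upper on I0."""
    survivors = [I0]
    for _ in range(rounds):
        halves = []
        for I in survivors:
            mid = (I.lo + I.hi) / 2
            halves += [Interval(I.lo, mid), Interval(mid, I.hi)]
        ranges = [f(I) for I in halves]
        m = min(r.hi for r in ranges)          # m_i: least upper bound found so far
        # Discard intervals whose lower bound exceeds m_i.
        survivors = [I for I, r in zip(halves, ranges) if r.lo <= m]
    lower = min(f(I).lo for I in survivors)
    upper = min(f(I).hi for I in survivors)
    return lower, upper, survivors

lower, upper, boxes = bisect_min(f, Interval(0, 2), 12)
print(float(lower), float(upper), len(boxes))
```

The true minimum of $x^2 - x$ on $[0,2]$ is $-1/4$, attained at $x = 1/2$; the returned pair encloses it, and every surviving interval lies close to $1/2$. A real implementation would replace exact rationals by floating-point endpoints with directed rounding.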
Part I
Minor arcs
Chapter 3
Introduction
The circle method expresses the number of solutions to a given problem in terms of exponential sums. Let $\eta : \mathbb{R}^+ \to \mathbb{C}$ be a smooth function, $\Lambda$ the von Mangoldt function (defined as in (1.5)) and $e(t) = e^{2\pi i t}$. The estimation of exponential sums of the type
\[ S_\eta(\alpha, x) = \sum_n \Lambda(n) e(\alpha n) \eta(n/x), \tag{3.1} \]
where $\alpha \in \mathbb{R}/\mathbb{Z}$, already lies at the basis of Hardy and Littlewood's approach to the ternary Goldbach problem by means of the circle method [HL22]. The division of the circle $\mathbb{R}/\mathbb{Z}$ into "major arcs" and "minor arcs" goes back to Hardy and Littlewood's development of the circle method for other problems. As they themselves noted, assuming GRH means that, for the ternary Goldbach problem, all of the circle can be in effect subdivided into major arcs; that is, under GRH, (3.1) can be estimated with major-arc techniques for $\alpha$ arbitrary. They needed to make such an assumption precisely because they did not yet know how to estimate $S_\eta(\alpha, x)$ on the minor arcs.

Minor-arc techniques for Goldbach's problem were first developed by Vinogradov [Vin37]. These techniques make it possible to work without GRH. The main obstacle to a full proof of the ternary Goldbach conjecture since then has been that, in spite of gradual improvements, minor-arc bounds have simply not been strong enough.

As in all work to date, our aim will be to give useful upper bounds on (3.1) for $\alpha$ in the minor arcs, rather than the precise estimates that are typical of the major-arc case. We will have to give upper bounds that are qualitatively stronger than those known before. (In Part III, we will also show how to use them more efficiently.)

Our main challenge will be to give a sufficiently good upper bound whenever $q$ is larger than a constant $r$. Here "sufficiently good" means "smaller than the trivial bound divided by a large constant, and getting even smaller quickly as $q$ grows". Our bound must also be good for $\alpha = a/q + \delta/x$, where $q < r$ but $\delta$ is large. (Such an $\alpha$ may be said to lie on the tail ($\delta$ large) of a major arc ($q$ small).)
Of course, all expressions must be explicit, and all constants in the leading terms of the bound must be small. Still, the main requirement is a qualitative one. For instance, we know in advance that a single factor of $\log x$ would be the end of us. That is, we know that, if there is a single term of the form, say, $(x \log x)/q$, and the trivial bound is about $x$, we are lost: $(x \log x)/q$ is greater than $x$ for $x$ large and $q$ constant.
The quality of the results here is due to several new ideas of general applicability. In particular, §5.1 introduces a way to obtain cancellation from Vaughan's identity. Vaughan's identity is a two-log gambit, in that it introduces two convolutions (each of them at a cost of log) and offers a great deal of flexibility in compensation. One of the ideas presented here is that at least one of two logs can be successfully recovered after having been given away in the first stage of the proof. This reduces the cost of the use of this basic identity in this and, presumably, many other problems.

There are several other improvements that make a qualitative difference; see the discussions at the beginning of §4 and §5. Considering smoothed sums (now a common idea) also helps. (Smooth sums here go back to Hardy-Littlewood [HL22], both in the general context of the circle method and in the context of Goldbach's ternary problem. In recent work on the problem, they reappear in [Tao14].)
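3.1 Results

The main bound we are about to see is essentially proportional to $((\log q)/\sqrt{\phi(q)}) \cdot x$. The term $\delta_0$ serves to improve the bound when we are on the tail of an arc.

Theorem 3.1.1. Let $x \ge x_0$, $x_0 = 2.16\cdot 10^{20}$. Let $S_\eta(\alpha,x)$ be as in (3.1), with $\eta$ defined in (3.4). Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le x^{1/3}/6$, then
\[ |S_\eta(\alpha,x)| \le \frac{R_{x,\delta_0 q} \log \delta_0 q + 0.5}{\sqrt{\delta_0 \phi(q)}}\cdot x + \frac{2.5x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0 q}\cdot L_{x,\delta_0 q,q} + 3.36 x^{5/6}, \tag{3.2} \]
where
\[ \begin{aligned} \delta_0 &= \max(2, |\delta|/4), \qquad R_{x,t} = 0.27125 \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415,\\ L_{x,t,q} &= \frac{q}{\phi(q)}\left(\frac{13}{4}\log t + 7.82\right) + 13.66 \log t + 37.55. \end{aligned} \tag{3.3} \]
If $q > x^{1/3}/6$, then
\[ |S_\eta(\alpha,x)| \le 0.276 x^{5/6} (\log x)^{3/2} + 1234 x^{2/3} \log x. \]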
The factor $R_{x,t}$ is small in practice; for instance, for $x = 10^{25}$ and $\delta_0 q = 5\cdot 10^5$ (typical "difficult" values), $R_{x,\delta_0 q}$ equals $0.59648\ldots$.

The classical choice$^1$ for $\eta$ in (3.1) is $\eta(t) = 1$ for $t \le 1$, $\eta(t) = 0$ for $t > 1$, which, of course, is not smooth, or even continuous. We use
\[ \eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0), \tag{3.4} \]
as in Tao [Tao14], in part for purposes of comparison. (This is the multiplicative convolution of the characteristic function of an interval with itself.) Nearly all work should be applicable to any other sufficiently smooth function $\eta$ of fast decay. It is important that $\hat{\eta}$ decay at least quadratically.

$^1$ Or, more precisely, the choice made by Vinogradov and followed by most of the literature since him. Hardy and Littlewood [HL22] worked with $\eta(t) = e^{-t}$.
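The quantities $R_{x,t}$ from (3.3) and $\eta_2$ from (3.4) are easy to evaluate. The following sketch (function names `R` and `eta2` are illustrative, and the formulas are transcribed from (3.3) and (3.4) as read here) recomputes the sample value of $R_{x,\delta_0 q}$ quoted above.

```python
import math

def R(x, t):
    """R_{x,t} as in (3.3); assumes 9 x^(1/3) > 2.004 t so that the inner log is positive."""
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def eta2(t):
    """eta_2(t) = 4 max(log 2 - |log 2t|, 0) for t > 0; supported on [1/4, 1]."""
    return 4 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

# Sample value quoted in the text: x = 10^25, delta_0 q = 5 * 10^5.
print(R(1e25, 5e5))
# eta_2 peaks at t = 1/2 and vanishes outside [1/4, 1].
print(eta2(0.5), eta2(0.25), eta2(2.0))
```

The first printed value should agree with the figure $0.59648\ldots$ given in the text to the digits shown there.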
We are not forced to use the same smoothing function as in Part II, and we do not. As was explained in the introduction, the simple technique (1.40) allows us to work with one smoothing function on the major arcs and with another one on the minor arcs.
3.2 Comparison to earlier work

Table 3.1 compares the bounds for the ratio $|S_\eta(a/q,x)|/x$ given by this paper and by [Tao14, Thm. 1.3] for $x = 10^{27}$ and different values of $q$. We are comparing worst cases: $\phi(q)$ as small as possible ($q$ divisible by $2\cdot 3\cdot 5\cdots$) in the result here, and $q$ divisible by $4$ (implying $4\alpha \sim a/(q/4)$) in Tao's result. The main term in the result in this paper improves slowly with increasing $x$; the results in [Tao14] worsen slowly with increasing $x$. The qualitative gain with respect to the main term in [Tao14, (1.10)] is in the order of $\log(q)\sqrt{\phi(q)/q}$. Notice also that the bounds in [Tao14] are not log-free: in [Tao14, (1.10)], there is a term proportional to $x(\log x)^2/q$. This becomes larger than the trivial bound $x$ for $x$ very large.

The results in [DR01] are unfortunately worse than the trivial bound in the range covered by Table 3.1. Ramaré's results ([Ram10, Thm. 3], [Ramc, Thm. 6]) are not applicable within the range, since neither of the conditions $\log q \le (1/50)(\log x)^{1/3}$, $q \le x^{1/48}$ is satisfied. Ramaré's bound in [Ramc, Thm. 6] is
\[ \left|\sum_{x < n \le 2x} \Lambda(n) e(an/q)\right| \le 13000\, \frac{\sqrt{q}}{\phi(q)}\, x \tag{3.5} \]
for $20 \le q \le x^{1/48}$. We should underline that, while both the constant $13000$ and the condition $q \le x^{1/48}$ keep (3.5) from being immediately useful in the present context, (3.5) is asymptotically better than the results here as $q \to \infty$. (Indeed, qualitatively speaking, the form of (3.5) is the best one can expect from results derived by the family of methods stemming from Vinogradov's work.) There is also unpublished work by Ramaré (ca. 1993) with different constants for $q \ll (\log x/\log\log x)^4$.
3.3 Basic setup

In the minor-arc regime, the first step in estimating an exponential sum on the primes generally consists in the application of an identity expressing the von Mangoldt function $\Lambda(n)$ in terms of a sum of convolutions of other functions.

3.3.1 Vaughan's identity

We recall Vaughan's identity [Vau77b]:
\[ \Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}, \tag{3.6} \]
  $q_0$              $|S_\eta(a/q,x)|/x$, HH     $|S_\eta(a/q,x)|/x$, Tao
  $10^5$             0.04661                     0.34475
  $1.5\cdot 10^5$    0.03883                     0.28836
  $2.5\cdot 10^5$    0.03098                     0.23194
  $5\cdot 10^5$      0.02297                     0.17416
  $7.5\cdot 10^5$    0.01934                     0.14775
  $10^6$             0.01756                     0.13159
  $10^7$             0.00690                     0.05251

Table 3.1: Worst-case upper bounds on $x^{-1}|S_\eta(a/2q, x)|$ for $q \ge q_0$, $|\delta| \le 8$, $x = 10^{27}$. The trivial bound is 1.
where $1$ is the constant function $1$ and where we write
\[ f_{\le z}(n) = \begin{cases} f(n) & \text{if } n \le z,\\ 0 & \text{if } n > z, \end{cases} \qquad f_{>z}(n) = \begin{cases} 0 & \text{if } n \le z,\\ f(n) & \text{if } n > z. \end{cases} \]
Here $f * g$ denotes the Dirichlet convolution $(f * g)(n) = \sum_{d|n} f(d) g(n/d)$. We can set the values of $U$ and $V$ however we wish.

Vaughan's identity is essentially a consequence of the Möbius inversion formula
\[ (1 * \mu)(n) = \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{otherwise.} \end{cases} \tag{3.7} \]
Indeed, by (3.7),
\[ \Lambda_{>V}(n) = \sum_{dm|n} \mu(d)\Lambda_{>V}(m) = \sum_{dm|n} \mu_{\le U}(d)\Lambda_{>V}(m) + \sum_{dm|n} \mu_{>U}(d)\Lambda_{>V}(m). \]
Applying to this the trivial equality $\Lambda_{>V} = \Lambda - \Lambda_{\le V}$, as well as the simple fact that $1 * \Lambda = \log$, we obtain that
\[ \Lambda_{>V}(n) = \sum_{d|n} \mu_{\le U}(d) \log(n/d) - \sum_{dm|n} \mu_{\le U}(d)\Lambda_{\le V}(m) + \sum_{dm|n} \mu_{>U}(d)\Lambda_{>V}(m). \]
By $\Lambda = \Lambda_{>V} + \Lambda_{\le V}$, we conclude that Vaughan's identity (3.6) holds.

Applying Vaughan's identity, we easily get that, for any function $\eta : \mathbb{R} \to \mathbb{R}$, any completely multiplicative function $f : \mathbb{Z}^+ \to \mathbb{C}$ and any $x > 0$, $U, V \ge 0$,
\[ \sum_n \Lambda(n) f(n) e(\alpha n) \eta(n/x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty}, \tag{3.8} \]
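where
\[ \begin{aligned} S_{I,1} &= \sum_{m\le U} \mu(m) f(m) \sum_n (\log n)\, e(\alpha mn) f(n) \eta(mn/x),\\ S_{I,2} &= \sum_{d\le V} \Lambda(d) f(d) \sum_{m\le U} \mu(m) f(m) \sum_n e(\alpha dmn) f(n) \eta(dmn/x),\\ S_{II} &= \sum_{m>U} f(m) \Bigl(\sum_{d>U,\ d|m} \mu(d)\Bigr) \sum_{n>V} \Lambda(n) e(\alpha mn) f(n) \eta(mn/x),\\ S_{0,\infty} &= \sum_{n\le V} \Lambda(n) e(\alpha n) f(n) \eta(n/x). \end{aligned} \tag{3.9} \]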
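We will use the function
\[ f(n) = \begin{cases} 1 & \text{if } \gcd(n,v) = 1,\\ 0 & \text{otherwise,} \end{cases} \tag{3.10} \]
where $v$ is a small, positive, square-free integer. (Our final choice will be $v = 2$.) Then
\[ S_\eta(\alpha, x) = S_{I,1} - S_{I,2} + S_{II} + S_{0,\infty} + S_{0,v}, \tag{3.11} \]
where $S_\eta(\alpha, x)$ is as in (3.1) and
\[ S_{0,v} = \sum_{n|v} \Lambda(n) e(\alpha n) \eta(n/x). \]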
The sums $S_{I,1}$, $S_{I,2}$ are called "of type I", the sum $S_{II}$ is called "of type II" (or bilinear). (The not-all-too colorful nomenclature goes back to Vinogradov.) The sum $S_{0,\infty}$ is in general negligible; for our later choice of $V$ and $\eta$, it will be in fact $0$. The sum $S_{0,v}$ will be negligible as well.

As we already discussed in the introduction, Vaughan's identity is highly flexible (in that we can choose $U$ and $V$ at will) but somewhat inefficient in practice (in that a trivial estimate for the right side of (3.11) is actually larger than a trivial estimate for the left side of (3.11)). Some of our work will consist in regaining part of what is given up when we apply Vaughan's identity.
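Vaughan's identity, in the form $\Lambda = \mu_{\le U} * \log - \mu_{\le U} * \Lambda_{\le V} * 1 + \mu_{>U} * \Lambda_{>V} * 1 + \Lambda_{\le V}$ that the derivation from (3.7) yields, can be verified by brute force for small $n$. The sketch below is only a floating-point sanity check; the helper names and the sample values of $U$ and $V$ are choices made here for illustration.

```python
import math

def mobius_and_mangoldt(N):
    """Return (mu, lam) with mu[n] = Moebius(n), lam[n] = Lambda(n) for n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return mu, lam

def vaughan_rhs(n, U, V, mu, lam):
    """Evaluate the right-hand side of Vaughan's identity at n."""
    total = lam[n] if n <= V else 0.0                 # Lambda_{<=V}(n)
    for d in range(1, n + 1):
        if n % d:
            continue
        if d <= U:
            total += mu[d] * math.log(n // d)         # mu_{<=U} * log
        q = n // d
        for m in range(1, q + 1):
            if q % m:
                continue
            if d <= U and m <= V:
                total -= mu[d] * lam[m]               # mu_{<=U} * Lambda_{<=V} * 1
            elif d > U and m > V:
                total += mu[d] * lam[m]               # mu_{>U} * Lambda_{>V} * 1
    return total

N, U, V = 200, 5, 7
mu, lam = mobius_and_mangoldt(N)
worst = max(abs(vaughan_rhs(n, U, V, mu, lam) - lam[n]) for n in range(1, N + 1))
print(worst)   # should be at rounding-error level
```

Changing `U` and `V` leaves the identity intact, which is the flexibility referred to above.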
3.3.2 An alternative route

There is an alternative route, namely, to use a less sacrificial, though also more inflexible, identity. While this was not, in the end, the route that was followed, let us nevertheless discuss it in some detail, in part so that we can understand to what extent it was, in retrospect, viable, and in part so as to see how much of the work we will undertake is really more or less independent of the particular identity we choose.
Since $-\zeta'(s)/\zeta(s) = \sum_n \Lambda(n) n^{-s}$ and
\[ \begin{aligned} \left(\frac{\zeta'(s)}{\zeta(s)}\right)^{(2)} &= \left(\frac{\zeta''(s)}{\zeta(s)} - \frac{(\zeta'(s))^2}{\zeta(s)^2}\right)'\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - \frac{3\zeta''(s)\zeta'(s)}{\zeta(s)^2} + 2\left(\frac{\zeta'(s)}{\zeta(s)}\right)^3\\ &= \frac{\zeta^{(3)}(s)}{\zeta(s)} - 3\left(\frac{\zeta'(s)}{\zeta(s)}\right)' \cdot \frac{\zeta'(s)}{\zeta(s)} - \left(\frac{\zeta'(s)}{\zeta(s)}\right)^3, \end{aligned} \tag{3.12} \]
we can see, comparing coefficients, that
\[ \Lambda \cdot \log^2 = \mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda, \tag{3.13} \]
as was stated by Bombieri in [Bom76].

Here the term $\mu * \log^3$ is of the same kind as the term $\mu_{\le U} * \log$ we have to estimate if we use Vaughan's identity, though the fact that there is no truncation at $U$ means that one of the error terms will get larger: it will be proportional to $x$, in fact, if we sum from $1$ to $x$. The trivial upper bound on the sum of $\Lambda \cdot \log^2$ from $1$ to $x$ is $x(\log x)^2$; thus, an error term of size $x$ is barely acceptable.
In general, when we have a double or triple sum, we are not very good at getting better than trivial bounds in ranges in which all but one of the variables are very small. This is the source of the large error term that appears in the sum involving $\mu * \log^3$, because we are no longer truncating as for $\mu_{\le U} * \log$. It will also be the source of other large error terms, including one that would be too large, namely, the one coming from the term $(\Lambda \cdot \log) * \Lambda$ when the variable of $\Lambda \cdot \log$ is large and that of $\Lambda$ is small. (The trivial bound on that range is $x \log x$.)

We avoid this problem by substituting the identity $\Lambda \cdot \log = \mu * \log^2 - \Lambda * \Lambda$ inside (3.13):
\[ \Lambda \cdot \log^2 = \mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda. \tag{3.14} \]
(We could also have got this directly from the next-to-last line in (3.12).) When the variable of $\Lambda$ in $(\mu * \log^2) * \Lambda$ is small, the variable of $\mu * \log^2$ is large, and we can estimate the resulting term using the same techniques as for $\mu * \log^3$.
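Both identities, $\Lambda \cdot \log^2 = \mu * \log^3 - 3(\Lambda \cdot \log) * \Lambda - \Lambda * \Lambda * \Lambda$ and $\Lambda \cdot \log^2 = \mu * \log^3 - 3(\mu * \log^2) * \Lambda + 2\Lambda * \Lambda * \Lambda$, can be checked coefficient by coefficient for small $n$. The following is a floating-point sketch with illustrative helper names (`sieve`, `dirichlet`, `check_identities`), not part of the original argument.

```python
import math

def sieve(N):
    """Return (mu, lam): Moebius and von Mangoldt functions on 0..N (as floats)."""
    mu = [1.0] * (N + 1)
    mu[0] = 0.0
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            for m in range(p, N + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0.0
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return mu, lam

def dirichlet(f, g, N):
    """Dirichlet convolution (f * g)(n) for 1 <= n <= N."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] == 0:
            continue
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h

def check_identities(N):
    """Return the largest deviation from the two identities over 1 <= n <= N."""
    mu, lam = sieve(N)
    log1 = [0.0] + [math.log(n) for n in range(1, N + 1)]
    log2 = [v ** 2 for v in log1]
    log3 = [v ** 3 for v in log1]
    lam_log = [lam[n] * log1[n] for n in range(N + 1)]
    lam3 = dirichlet(dirichlet(lam, lam, N), lam, N)
    mu_log3 = dirichlet(mu, log3, N)
    term_a = dirichlet(lam_log, lam, N)                    # (Lambda log) * Lambda
    term_b = dirichlet(dirichlet(mu, log2, N), lam, N)     # (mu * log^2) * Lambda
    worst = 0.0
    for n in range(1, N + 1):
        lhs = lam[n] * log2[n]
        first = mu_log3[n] - 3 * term_a[n] - lam3[n]
        second = mu_log3[n] - 3 * term_b[n] + 2 * lam3[n]
        worst = max(worst, abs(lhs - first), abs(lhs - second))
    return worst

print(check_identities(300))   # should be at rounding-error level
```

A check of this kind also confirms the sign conventions: the generating series of $\Lambda$ is $-\zeta'/\zeta$, and each derivative in $s$ brings down a factor $-\log n$.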
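It is easy to see that we can in fact mix (3.13) and (3.14):
\[ \Lambda \cdot \log^2 = \mu * \log^3 - 3\left((\Lambda \cdot \log) * \Lambda_{>V} + (\mu * \log^2) * \Lambda_{\le V}\right) + \left(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\right) \tag{3.15} \]
for $V$ arbitrary. Note here that there is some cancellation in the last term: writing
\[ F_{3,V}(n) = \left(-\Lambda_{>V} * \Lambda * \Lambda + 2\Lambda_{\le V} * \Lambda * \Lambda\right)(n), \tag{3.16} \]
we can check easily that, for $n = p_1 p_2 p_3$ square-free with $V^3 < n$, we have
\[ F_{3,V}(n) = \begin{cases} -6 \log p_1 \log p_2 \log p_3 & \text{if all } p_i > V,\\ 0 & \text{if } p_1 \le V < p_2 < p_3,\\ 6 \log p_1 \log p_2 \log p_3 & \text{if } p_1 < p_2 \le V < p_3,\\ 12 \log p_1 \log p_2 \log p_3 & \text{if all } p_i \le V. \end{cases} \]
(Each of the six ordered factorizations of $n$ into three primes contributes $-\log p_1 \log p_2 \log p_3$ if its first factor is $> V$ and $2\log p_1 \log p_2 \log p_3$ if its first factor is $\le V$.)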
In contrast, for $n$ square-free, $-\Lambda * \Lambda * \Lambda(n)$ is $-6\log p_1 \log p_2 \log p_3$ if $n$ is of the form $p_1 p_2 p_3$, and $0$ otherwise.
We may find it useful to take aside two large terms that may need to be bounded trivially, namely, $\mu * \log^3_{\le u}$ and $(\Lambda \cdot \log)_{\le u} * \Lambda_{>V}$, where $u$ will be a small parameter. (We can let, for instance, $u = 3$.) We conclude that
\[ \Lambda \cdot \log^2 = F_{I,1,u}(n) - 3F_{I,2,V,u}(n) - 3F_{II,V,u}(n) + F_{3,V}(n) + F_{0,V,u}(n), \tag{3.17} \]
where
\[ \begin{aligned} F_{I,1,u} &= \mu * \log^3_{>u},\\ F_{I,2,V,u} &= (\mu * \log^2) * \Lambda_{\le V},\\ F_{II,V,u}(n) &= (\Lambda \cdot \log)_{>u} * \Lambda_{>V},\\ F_{0,V,u}(n) &= \mu * \log^3_{\le u} - 3(\Lambda \cdot \log)_{\le u} * \Lambda_{>V}, \end{aligned} \]
and $F_{3,V}$ is as in (3.16).

In the bulk of the present work (in particular, in all steps that are part of the proof of Theorem 3.1.1 or the Main Theorem) we will use Vaughan's identity, rather than (3.17). This choice was made while the proof was still underway; it was due mainly to back-of-the-envelope estimates that showed that the error terms could be too large if (3.14) was used. Of course, this might have been the case with Vaughan's identity as well, but the fact that the parameters $U$, $V$ there have a large effect on the outcome meant that one could hope to improve on insufficient estimates in part by adjusting $U$ and $V$, without losing all previous work. (This is what was meant by the "flexibility" of Vaughan's identity.)
The question remains: can one prove ternary Goldbach using (3.17) rather than Vaughan's identity? This seems likely. If so, which proof would be more complicated? This is not clear.

There are large parts of the work that are essentially the same in both cases:

• estimates for sums involving $\mu_{\le U} * \log^k$ ("type I"),

• estimates for sums involving $\Lambda_{>u} * \Lambda_{>V}$ and the like ("type II").

Trilinear sums, i.e., sums involving $\Lambda * \Lambda * \Lambda$, can be estimated much like bilinear sums, i.e., sums involving $\Lambda * \Lambda$.
There are also challenges that appear only for Vaughan's identity and others that appear only for (3.17). An example of a challenge that is successfully faced in the main proof, but does not appear if (3.17) is used, consists in bounding sums of type
\[ \sum_{U < m \le x/W} \Bigl(\sum_{d>U,\ d|m} \mu(d)\Bigr)^2. \]
(In §5.1, we will be able to bound sums of this type by a constant times $x/W$.) Likewise, large tail terms that have to be estimated trivially seem unavoidable in (3.17). (The choice of a parameter $u > 1$ as above is meant to alleviate the problem.)
In the end, losing a factor of about $\log(x/UV)$ seems inevitable when one uses Vaughan's identity, but not when one uses (3.17). Another reason why a full treatment based on (3.17) would also be worthwhile is that it is a somewhat less familiar and arguably under-used identity, and deserves more exploration. With these comments, we close the discussion of (3.17); we will henceforth use Vaughan's identity.
Chapter 4
Type I sums
Here we must bound sums of the basic type
\[ \sum_{m\le D} \mu(m) \sum_n e(\alpha mn)\, \eta\!\left(\frac{mn}{x}\right) \]
and variations thereof. There are three main improvements in comparison to standard treatments:

1. The terms with $m$ divisible by $q$ get taken out and treated separately by analytic means. This all but eliminates what would otherwise be the main term.

2. The other terms get handled by improved estimates on trigonometric sums. For large $m$, the improvements have a substantial total effect: more than a constant factor is gained.

3. The "error" term $\delta/x = \alpha - a/q$ is used to our advantage. This happens both through the Poisson summation formula and through the use of two alternative approximations to the same number $\alpha$.

The fact that a continuous weight $\eta$ is used ("smoothing") is a difference with respect to the classical literature ([Vin37] and what followed), but not with respect to more recent work (including [Tao14]); using smooth or continuous weights is an idea that has become commonplace in analytic number theory, even though it is not consistently applied. The improvements due to smoothing in type I are both relatively minor and essentially independent of the improvements due to (1) and (3). The use of a continuous weight combines nicely with (2), but the ideas given here would give qualitative improvements in the treatment of trigonometric sums even in the absence of smoothing.
4.1 Trigonometric sums

The following lemmas on trigonometric sums improve on the best Vinogradov-type lemmas in the literature. (By this, we mean results of the type of Lemma 8a and Lemma 8b in [Vin04, Ch. I]. See, in particular, the work of Daboussi and Rivat [DR01, Lemma 1].) The main idea is to switch between different types of approximation within the sum, rather than just choosing between bounding all terms either trivially (by $A$) or non-trivially (by $C/|\sin(\pi\alpha n)|^2$). There will also$^1$ be improvements in our applications stemming from the fact that Lemma 4.1.1 and Lemma 4.1.2 take quadratic ($|\sin(\pi\alpha n)|^2$) rather than linear ($|\sin(\pi\alpha n)|$) inputs. (These improved inputs come from the use of smoothing elsewhere.)
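Lemma 4.1.1. Let $\alpha = a/q + \beta/qQ$, $(a,q) = 1$, $|\beta| \le 1$, $q \le Q$. Then, for any $A, C \ge 0$,
\[ \sum_{y < n \le y+q} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(2A + \frac{6q^2}{\pi^2} C,\ 3A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.1} \]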
Proof. We start by letting $m_0 = \lfloor y \rfloor + \lfloor (q+1)/2 \rfloor$, $j = n - m_0$, so that $j$ ranges in the interval $(-q/2, q/2]$. We write
\[ \alpha n = \frac{aj + c}{q} + \delta_1(j) + \delta_2 \mod 1, \]
where $|\delta_1(j)|$ and $|\delta_2|$ are both $\le 1/2q$; we can assume $\delta_2 \ge 0$. The variable $r = aj + c \bmod q$ occupies each residue class mod $q$ exactly once.

One option is to bound the terms corresponding to $r = 0, -1$ by $A$ each and all the other terms by $C/|\sin(\pi\alpha n)|^2$. (This can be seen as the simple case; it will take us about a page just because we should estimate all sums and all terms here with great care, as in [DR01], only more so.)

The terms corresponding to $r = -k$ and $r = k-1$ ($2 \le k \le q/2$) contribute at most
\[ \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{1}{2} - q\delta_2\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{3}{2} + q\delta_2\right)} \le \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{1}{2}\right)} + \frac{1}{\sin^2 \frac{\pi}{q}\left(k - \frac{3}{2}\right)}, \]
since $x \mapsto 1/(\sin x)^2$ is convex-up on $(0,\infty)$. Hence the terms with $r \ne 0, -1$ contribute at most
\[ \frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\sum_{2\le r\le q/2} \frac{1}{\left(\sin\frac{\pi}{q}(r - 1/2)\right)^2} \le \frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + 2\int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx, \]
where we use again the convexity of $x \mapsto 1/(\sin x)^2$. (We can assume $q > 2$, as otherwise we have no terms other than $r = 0, -1$.) Now
\[ \int_1^{q/2} \frac{1}{\left(\sin\frac{\pi}{q}x\right)^2}\,dx = \frac{q}{\pi}\int_{\pi/q}^{\pi/2} \frac{1}{(\sin u)^2}\,du = \frac{q}{\pi}\cot\frac{\pi}{q}. \]
$^1$ This is a change with respect to the first version of the preprint [Helb]. The version of Lemma 4.1.1 there has, however, the advantage of being immediately comparable to results in the literature.
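Hence
\[ \sum_{y<n\le y+q} \min\left(A, \frac{C}{(\sin \pi\alpha n)^2}\right) \le 2A + \frac{C}{\left(\sin\frac{\pi}{2q}\right)^2} + C\cdot\frac{2q}{\pi}\cot\frac{\pi}{q}. \]
Now, by [AS64, (4.3.68)] and [AS64, (4.3.70)], for $t \in (-\pi,\pi)$,
\[ \begin{aligned} \frac{t}{\sin t} &= 1 + \sum_{k\ge 0} a_{2k+1} t^{2k+2} = 1 + \frac{t^2}{6} + \dots,\\ t\cot t &= 1 - \sum_{k\ge 0} b_{2k+1} t^{2k+2} = 1 - \frac{t^2}{3} - \frac{t^4}{45} - \dots, \end{aligned} \tag{4.2} \]
where $a_{2k+1} \ge 0$, $b_{2k+1} \ge 0$. Thus, for $t \in [0,t_0]$, $t_0 < \pi$,
\[ \left(\frac{t}{\sin t}\right)^2 = 1 + \frac{t^2}{3} + c_0(t) t^4 \le 1 + \frac{t^2}{3} + c_0(t_0) t^4, \tag{4.3} \]
where
\[ c_0(t) = \frac{1}{t^4}\left(\left(\frac{t}{\sin t}\right)^2 - \left(1 + \frac{t^2}{3}\right)\right), \]
which is an increasing function because $a_{2k+1} \ge 0$. For $t_0 = \pi/4$, $c_0(t_0) \le 0.074807$. Hence,
\[ \begin{aligned} \frac{t^2}{\sin^2 t} + t\cot 2t &\le \left(1 + \frac{t^2}{3} + c_0\!\left(\frac{\pi}{4}\right) t^4\right) + \left(\frac{1}{2} - \frac{2t^2}{3} - \frac{8t^4}{45}\right)\\ &= \frac{3}{2} - \frac{t^2}{3} + \left(c_0\!\left(\frac{\pi}{4}\right) - \frac{8}{45}\right) t^4 \le \frac{3}{2} - \frac{t^2}{3} \le \frac{3}{2} \end{aligned} \]
for $t \in [0,\pi/4]$.

Therefore, the left side of (4.1) is at most
\[ 2A + C\cdot\left(\frac{2q}{\pi}\right)^2 \cdot \frac{3}{2} = 2A + \frac{6}{\pi^2} C q^2. \]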
The following is an alternative approach; it yields the other estimate in (4.1). We bound the terms corresponding to $r = 0$, $r = -1$, $r = 1$ by $A$ each. We let $r = \pm r'$ for $r'$ ranging from $2$ to $q/2$. We obtain that the sum is at most
\[ 3A + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac{1}{2} - q\delta_2\right)\right)^2}\right) + \sum_{2\le r'\le q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}\left(r' - \frac{1}{2} + q\delta_2\right)\right)^2}\right). \tag{4.4} \]
We bound a term $\min(A, C/\sin((\pi/q)(r' - 1/2 \pm q\delta_2))^2)$ by $A$ if and only if $C/\sin((\pi/q)(r' - 1 \pm q\delta_2))^2 \ge A$. (In other words, we are choosing which of the two bounds $A$, $C/|\sin(\pi\alpha n)|^2$ to use on a case-by-case basis, i.e., for each $n$, instead of making a single choice for all $n$ in one go. This is hardly anything deep, but it does result in a marked improvement with respect to the literature, and would give an improvement even if we were given a bound $B/|\sin(\pi\alpha n)|$ instead of a bound $C/|\sin(\pi\alpha n)|^2$ as input.) The number of such terms is
\[\le \max\left(0, \left\lfloor \frac{q}{\pi} \arcsin\left(\sqrt{C/A}\right) \mp q\delta_2 \right\rfloor\right),\]
and thus at most $(2q/\pi) \arcsin(\sqrt{C/A})$ in total. (Recall that $q\delta_2 \le 1/2$.) Each other term gets bounded by the integral of $C/\sin^2(\pi\alpha/q)$ from $r' - 1 \pm q\delta_2$ ($\ge (q/\pi)\arcsin(\sqrt{C/A})$) to $r' \pm q\delta_2$, by convexity. Thus (4.4) is at most
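\[\begin{aligned}
&3A + \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + 2\int_{\frac{q}{\pi}\arcsin\sqrt{C/A}}^{q/2} \frac{C}{\sin^2\frac{\pi t}{q}}\, dt\\
&\le 3A + \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1}.
\end{aligned}\]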
We can easily show (taking derivatives) that $\arcsin x + x\sqrt{1 - x^2} \le 2x$ for $0 \le x \le 1$. Setting $x = \sqrt{C/A}$, we see that this implies that
\[3A + \frac{2q}{\pi} A \arcsin\sqrt{\frac{C}{A}} + \frac{2q}{\pi} C \sqrt{\frac{A}{C} - 1} \le 3A + \frac{4q}{\pi}\sqrt{AC}.\]
(If $C/A > 1$, then $3A + (4q/\pi)\sqrt{AC}$ is greater than $Aq$, which is an obvious upper bound for the left side of (4.1).)
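As a quick plausibility check of this last step, the elementary inequality can be tested in the form needed for the substitution, namely $\arcsin x + x\sqrt{1 - x^2} \le 2x$ with $x = \sqrt{C/A}$. The sketch below is illustrative only; the function names and the sample values of A, C, q are our own choices.

```python
import math

def gap(x):
    """2x - (arcsin x + x*sqrt(1 - x^2)); expected to be >= 0 on [0, 1]."""
    return 2 * x - (math.asin(x) + x * math.sqrt(1 - x * x))

def second_bound_gap(A, C, q):
    """Right side minus left side of the displayed inequality, for 0 < C <= A."""
    left = (3 * A
            + (2 * q / math.pi) * A * math.asin(math.sqrt(C / A))
            + (2 * q / math.pi) * C * math.sqrt(A / C - 1))
    right = 3 * A + (4 * q / math.pi) * math.sqrt(A * C)
    return right - left

# The gap behaves like x^3/3 near 0 and equals 2 - pi/2 at x = 1.
print(min(gap(k / 1000) for k in range(1001)))
print(second_bound_gap(5.0, 2.0, 7))
```

Dividing the displayed inequality by $2qA/\pi$ reduces it to exactly `gap(x) >= 0` with $x = \sqrt{C/A}$, which is why both checks agree.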
Now we will see that, if we take out terms with $n$ divisible by $q$ and $n$ is not too large, then we can give a bound that does not involve a constant term $A$ at all. (We are referring to the bound $(20/3\pi^2) C q^2$ below; of course, $2A + (4q/\pi)\sqrt{AC}$ does have a constant term $2A$; it is just smaller than the constant term $3A$ in the corresponding bound in (4.1).)
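Lemma 4.1.2. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $A, C \ge 0$,
\[\sum_{\substack{y_1 < n \le y_2\\ q \nmid n}} \min\left(A, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le \min\left(\frac{20}{3\pi^2} C q^2,\; 2A + \frac{4q}{\pi}\sqrt{AC}\right). \tag{4.5}\]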
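Proof. Clearly, $\alpha n$ equals $an/q + (n/Q)\beta/q$; since $y_2 \le Q/2$, this means that $|\alpha n - an/q| \le 1/2q$ for $n \le y_2$; moreover, again for $n \le y_2$, the sign of $\alpha n - an/q$ remains constant. Hence the left side of (4.5) is at most
\[\sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q}(r - 1/2)\right)^2}\right) + \sum_{r=1}^{q/2} \min\left(A, \frac{C}{\left(\sin\frac{\pi}{q} r\right)^2}\right).\]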
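Proceeding as in the proof of Lemma 4.1.1, we obtain a bound of at most
\[C\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right)\]
for $q \ge 2$. (If $q = 1$, then the left side of (4.5) is trivially zero.) Now, by (4.2),
\[\frac{t^2}{(\sin t)^2} + \frac{t}{2}\cot 2t \le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{4}\right) t^4\right) + \frac{1}{4}\left(1 - \frac{4t^2}{3} - \frac{16t^4}{45}\right) \le \frac{5}{4} + \left(c_0\left(\frac{\pi}{4}\right) - \frac{4}{45}\right) t^4 \le \frac{5}{4}\]
for $t \in [0, \pi/4]$, and
\[\begin{aligned}
\frac{t^2}{(\sin t)^2} + t\cot\frac{3t}{2} &\le \left(1 + \frac{t^2}{3} + c_0\left(\frac{\pi}{2}\right) t^4\right) + \frac{2}{3}\left(1 - \frac{3t^2}{4} - \frac{81 t^4}{2^4 \cdot 45}\right)\\
&\le \frac{5}{3} + \left(-\frac{1}{6} + \left(c_0\left(\frac{\pi}{2}\right) - \frac{27}{360}\right)\left(\frac{\pi}{2}\right)^2\right) t^2 \le \frac{5}{3}
\end{aligned}\]
for $t \in [0, \pi/2]$. Hence,
\[\left(\frac{1}{\left(\sin\frac{\pi}{2q}\right)^2} + \frac{1}{\left(\sin\frac{\pi}{q}\right)^2} + \frac{q}{\pi}\cot\frac{\pi}{q} + \frac{q}{\pi}\cot\frac{3\pi}{2q}\right) \le \left(\frac{2q}{\pi}\right)^2 \cdot \frac{5}{4} + \left(\frac{q}{\pi}\right)^2 \cdot \frac{5}{3} \le \frac{20}{3\pi^2} q^2.\]
Alternatively, we can follow the second approach in the proof of Lemma 4.1.1, and obtain an upper bound of $2A + (4q/\pi)\sqrt{AC}$.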
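Lemma 4.1.2 lends itself to a brute-force numerical illustration for small parameters. The sketch below simply evaluates both sides of (4.5) in floating point; the function names and the parameter ranges (for instance the choice Q = 4q) are our own assumptions, and passing such a check is no substitute for the proof.

```python
import math

def lemma_412_sides(a, q, beta, Q, y1, y2, A, C):
    """Return (left, right) of (4.5) for alpha = a/q + beta/(q*Q)."""
    alpha = a / q + beta / (q * Q)
    left = 0.0
    for n in range(math.floor(y1) + 1, math.floor(y2) + 1):  # y1 < n <= y2
        if n % q == 0:
            continue  # terms with q | n are excluded from the sum
        left += min(A, C / math.sin(math.pi * alpha * n) ** 2)
    right = min(20 / (3 * math.pi ** 2) * C * q * q,
                2 * A + 4 * q / math.pi * math.sqrt(A * C))
    return left, right

def worst_ratio_412():
    """Largest left/right ratio over a small grid satisfying the hypotheses."""
    worst = 0.0
    for q in range(2, 10):
        Q = 4 * q                      # so that y2 = y1 + q <= Q/2 for y1 <= q
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            for beta in (-1.0, -0.5, 0.0, 0.5, 1.0):
                for y1 in range(0, q + 1):
                    for A, C in ((0.5, 1.0), (3.0, 1.0), (50.0, 1.0), (1.0, 0.01)):
                        left, right = lemma_412_sides(a, q, beta, Q, y1, y1 + q, A, C)
                        worst = max(worst, left / right)
    return worst

print(worst_ratio_412())
```

A ratio below 1 on every grid point is what the lemma predicts; how close it gets to 1 gives a rough feeling for the slack in the constants.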
The following bound will be useful when the constant $A$ in an application of Lemma 4.1.2 would be too large. (This tends to happen for $n$ small.)

Lemma 4.1.3. Let $\alpha = a/q + \beta/qQ$, $(a, q) = 1$, $|\beta| \le 1$, $q \le Q$. Let $y_2 > y_1 \ge 0$. If $y_2 - y_1 \le q$ and $y_2 \le Q/2$, then, for any $B, C \ge 0$,
\[\sum_{\substack{y_1 < n \le y_2\\ q \nmid n}} \min\left(\frac{B}{|\sin(\pi\alpha n)|}, \frac{C}{|\sin(\pi\alpha n)|^2}\right) \le 2B\frac{q}{\pi}\max\left(2, \log\frac{C e^3 q}{B\pi}\right). \tag{4.6}\]
The upper bound $\le (2Bq/\pi)\log(2e^2 q/\pi)$ is also valid.
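Both bounds in Lemma 4.1.3 can be compared against a direct evaluation of the sum for small q. This is a hedged floating-point sketch under our own choice of test parameters (B > 0, Q = 4q); the helper names are hypothetical.

```python
import math

def lemma_413_sides(a, q, beta, Q, y1, y2, B, C):
    """Return (left, main_bound, alt_bound) for (4.6), with B > 0 and C > 0."""
    alpha = a / q + beta / (q * Q)
    left = 0.0
    for n in range(math.floor(y1) + 1, math.floor(y2) + 1):  # y1 < n <= y2
        if n % q == 0:
            continue
        s = abs(math.sin(math.pi * alpha * n))
        left += min(B / s, C / (s * s))
    main = 2 * B * q / math.pi * max(2.0, math.log(C * math.e ** 3 * q / (B * math.pi)))
    alt = 2 * B * q / math.pi * math.log(2 * math.e ** 2 * q / math.pi)
    return left, main, alt

def worst_ratios_413():
    """Largest ratios left/main and left/alt over a small parameter grid."""
    worst_main, worst_alt = 0.0, 0.0
    for q in range(2, 10):
        Q = 4 * q
        for a in range(1, q):
            if math.gcd(a, q) != 1:
                continue
            for beta in (-1.0, -0.5, 0.0, 0.5, 1.0):
                for y1 in range(0, q + 1):
                    for B, C in ((1.0, 1.0), (1.0, 0.3), (1.0, 5.0), (2.0, 0.05)):
                        left, main, alt = lemma_413_sides(a, q, beta, Q, y1, y1 + q, B, C)
                        worst_main = max(worst_main, left / main)
                        worst_alt = max(worst_alt, left / alt)
    return worst_main, worst_alt

print(worst_ratios_413())
```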
Proof. As in the proof of Lemma 4.1.2, we can bound the left side of (4.6) by
\[2\sum_{r=1}^{q/2} \min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac{1}{2}\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac{1}{2}\right)}\right).\]
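Assume $B\sin(\pi/q) \le C \le B$. By the convexity of $1/\sin(t)$ and $1/\sin(t)^2$ for $t \in (0, \pi/2]$,
\[\begin{aligned}
\sum_{r=1}^{q/2} &\min\left(\frac{B}{\sin\frac{\pi}{q}\left(r - \frac{1}{2}\right)}, \frac{C}{\sin^2\frac{\pi}{q}\left(r - \frac{1}{2}\right)}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \int_1^{\frac{q}{\pi}\arcsin\frac{C}{B}} \frac{B}{\sin\frac{\pi}{q}t}\, dt + \int_{\frac{q}{\pi}\arcsin\frac{C}{B}}^{q/2} \frac{C}{\sin^2\frac{\pi}{q}t}\, dt\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\tan\left(\frac{1}{2}\arcsin\frac{C}{B}\right) - \log\tan\frac{\pi}{2q}\right) + C\cot\arcsin\frac{C}{B}\right)\\
&\le \frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right).
\end{aligned}\]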
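Now, for all $t \in (0, \pi/2)$,
\[\frac{2}{\sin t} + \frac{1}{t}\log\cot t < \frac{1}{t}\log\left(\frac{e^2}{t}\right);\]
we can verify this by comparing series. Thus
\[\frac{B}{\sin\frac{\pi}{2q}} + \frac{q}{\pi} B\log\cot\frac{\pi}{2q} \le B\frac{q}{\pi}\log\frac{2e^2 q}{\pi}\]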
for $q \ge 2$. (If $q = 1$, the sum on the left of (4.6) is empty, and so the bound we are trying to prove is trivial.) We also have
\[t\log\left(t - \sqrt{t^2 - 1}\right) + \sqrt{t^2 - 1} < -t\log 2t + t \tag{4.7}\]
for $t \ge 1$ (as this is equivalent to $\log(2t^2(1 - \sqrt{1 - t^{-2}})) < 1 - \sqrt{1 - t^{-2}}$, which we check easily after changing variables to $\delta = 1 - \sqrt{1 - t^{-2}}$). Hence
\[\begin{aligned}
\frac{B}{\sin\frac{\pi}{2q}} &+ \frac{q}{\pi}\left(B\left(\log\cot\frac{\pi}{2q} - \log\frac{C}{B - \sqrt{B^2 - C^2}}\right) + \sqrt{B^2 - C^2}\right)\\
&\le B\frac{q}{\pi}\log\frac{2e^2 q}{\pi} + \frac{q}{\pi}\left(B - B\log\frac{2B}{C}\right) \le B\frac{q}{\pi}\log\frac{C e^3 q}{B\pi}
\end{aligned}\]
for $q \ge 2$.

Given any $C$, we can apply the above with $C = B$ instead, as, for any $t > 0$, $\min(B/t, C/t^2) \le B/t \le \min(B/t, B/t^2)$. (We refrain from applying (4.7) so as to avoid worsening a constant.) If $C < B\sin\pi/q$ (or even if $C < (\pi/q)B$), we relax the input to $C = B\sin\pi/q$ and go through the above.
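The two auxiliary inequalities used in this proof, the one involving $\log\cot t$ and (4.7), are stated with only a brief indication of how to verify them. The following sketch checks both on a grid in floating point; the grid ranges and function names are our own choices, and the check stays away from $t \to 0$, where the first gap is of size about $7t^3/180$ and hence numerically delicate.

```python
import math

def log_cot_gap(t):
    """(1/t)*log(e^2/t) - (2/sin t + (1/t)*log(cot t)); expected > 0 on (0, pi/2)."""
    rhs = (2 - math.log(t)) / t
    lhs = 2 / math.sin(t) + math.log(math.cos(t) / math.sin(t)) / t
    return rhs - lhs

def ineq_47_gap(t):
    """Right side minus left side of (4.7); expected > 0 for t >= 1."""
    root = math.sqrt(t * t - 1)
    lhs = t * math.log(t - root) + root
    rhs = -t * math.log(2 * t) + t
    return rhs - lhs

t_small = [0.05 + 0.01 * k for k in range(152)]   # stays below pi/2
t_large = [1 + 0.05 * k for k in range(981)]      # from 1 to 50

print(min(log_cot_gap(t) for t in t_small))
print(min(ineq_47_gap(t) for t in t_large))
```

For large t the gap in (4.7) decays roughly like 1/(4t), so it stays positive but small; this is consistent with the remark that applying (4.7) costs something in the constant.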
4.2 Type I estimates

Let us give our first main type I estimate.² One of the main innovations is the manner in which the "main term" ($m$ divisible by $q$) is separated; we are able to keep error terms small thanks to the particular way in which we switch between two different approximations.

² The current version of Lemma 4.2.1 is an improvement over that included in the first version of the preprint [Helb].

(These are not necessarily successive approximations in the sense of continued fractions; we do not want to assume that the approximation $a/q$ we are given arises from a continued fraction, and at any rate we need more control on the denominator $q'$ of the new approximation $a'/q'$ than continued fractions would furnish.)

The following lemma is a theme, so to speak, to which several variations will be given. Later, in practice, we will always use one of the variations, rather than the original lemma itself. This is so just because, even though (4.8) is the basic type of sum we treat in type I, the sums that we will have to estimate in practice will always present some minor additional complication. Proving the lemma we are about to give in full will give us a chance to see all the main ideas at work, leaving complications for later.
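Lemma 4.2.1. Let $\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$, the absolute value of
\[\sum_{m \le D} \mu(m) \sum_n e(\alpha mn)\eta\left(\frac{mn}{x}\right) \tag{4.8}\]
is at most
\[\frac{x}{q}\min\left(1, \frac{c_0}{(2\pi\delta)^2}\right)\left|\sum_{\substack{m \le M/q\\ (m,q)=1}} \frac{\mu(m)}{m}\right| + O^*\left(c_0\left(\frac{1}{4} - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right) \tag{4.9}\]
plus
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi} D + 3c_1\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi} q\log^+\frac{D}{q/2}\\
&+ \frac{|\eta'|_1}{\pi} q \cdot \max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{c_2} + \frac{55c_0c_2}{12\pi^2}\right) q,
\end{aligned} \tag{4.10}\]
where $c_1 = 1 + |\eta'|_1/(2x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.8) is at most (4.9) plus
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\varpi_\epsilon + \frac{1}{2}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{6\pi^2} q,
\end{aligned} \tag{4.11}\]
for $\epsilon \in (0, 1]$ arbitrary, where $\varpi_\epsilon = \sqrt{3 + 2\epsilon} + ((1 + \sqrt{13/3})/4 - 1)/(2(1+\epsilon))$.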
In (4.9), $\min(1, c_0/(2\pi\delta)^2)$ always equals 1 when $|\delta| \le 1/2c_2$ (since $(3/5)(1 + \sqrt{13/3}) > 1$).
Proof. Let $Q = \lfloor x/|\delta q| \rfloor$. Then $\alpha = a/q + O^*(1/qQ)$ and $q \le Q$. (If $\delta = 0$, we let $Q = \infty$ and ignore the rest of the paragraph, since then we will never need $Q'$ or the alternative approximation $a'/q'$.) Let $Q' = \lceil (1+\epsilon)Q \rceil \ge Q + 1$. Then $\alpha$ is not $a/q + O^*(1/qQ')$, and so there must be a different approximation $a'/q'$, $(a', q') = 1$, $q' \le Q'$, such that $\alpha = a'/q' + O^*(1/q'Q')$ (since such an approximation always exists). Obviously, $|a/q - a'/q'| \ge 1/qq'$, yet, at the same time, $|a/q - a'/q'| \le 1/qQ + 1/q'Q' \le 1/qQ + 1/((1+\epsilon)q'Q)$. Hence $q'/Q + q/((1+\epsilon)Q) \ge 1$, and so $q' \ge Q - q/(1+\epsilon) \ge (\epsilon/(1+\epsilon))Q$. (Note also that $(\epsilon/(1+\epsilon))Q \ge (2|\delta q|/x) \cdot \lfloor x/\delta q \rfloor > 1$, and so $q' \ge 2$.)
Lemma 4.1.2 will enable us to treat separately the contribution from terms with $m$ divisible by $q$ and $m$ not divisible by $q$, provided that $m \le Q/2$. Let $M = \min(Q/2, D)$. We start by considering all terms with $m \le M$ divisible by $q$. Then $e(\alpha mn)$ equals $e((\delta m/x)n)$. By Poisson summation,
\[\sum_n e(\alpha mn)\eta(mn/x) = \sum_n \hat{f}(n),\]
where $f(u) = e((\delta m/x)u)\eta((m/x)u)$. Now
\[\hat{f}(n) = \int e(-un) f(u)\, du = \frac{x}{m}\int e\left(\left(\delta - \frac{xn}{m}\right)u\right)\eta(u)\, du = \frac{x}{m}\hat{\eta}\left(\frac{x}{m}n - \delta\right).\]
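By assumption, $m \le M \le Q/2 \le x/2|\delta q|$, and so $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus, by (2.1) (with $k = 2$),
\[\begin{aligned}
\sum_n \hat{f}(n) &= \frac{x}{m}\left(\hat{\eta}(-\delta) + \sum_{n \ne 0}\hat{\eta}(nx/m - \delta)\right)\\
&= \frac{x}{m}\left(\hat{\eta}(-\delta) + O^*\left(\sum_{n \ne 0}\frac{1}{\left(2\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\
&= \frac{x}{m}\hat{\eta}(-\delta) + \frac{m}{x}\frac{c_0}{(2\pi)^2} O^*\left(\max_{|r| \le \frac{1}{2}}\sum_{n \ne 0}\frac{1}{(n - r)^2}\right).
\end{aligned} \tag{4.12}\]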
Since $x \mapsto 1/x^2$ is convex on $\mathbb{R}^+$,
\[\max_{|r| \le \frac{1}{2}}\sum_{n \ne 0}\frac{1}{(n - r)^2} = \sum_{n \ne 0}\frac{1}{\left(n - \frac{1}{2}\right)^2} = \pi^2 - 4.\]
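The closed form $\pi^2 - 4$ follows from $\sum_{n \in \mathbb{Z}} (n - 1/2)^{-2} = \pi^2/\sin^2(\pi/2) = \pi^2$ after removing the term $n = 0$, which equals 4. A short numerical sketch (truncation level N and function name are our own choices) confirms both the value and the fact that the maximum over $|r| \le 1/2$ is attained at the endpoint.

```python
import math

def shifted_sum(r, N=20000):
    """Partial sum of 1/(n - r)^2 over 0 < |n| <= N."""
    return sum(1.0 / (n - r) ** 2 + 1.0 / (-n - r) ** 2 for n in range(1, N + 1))

at_half = shifted_sum(0.5)
print(at_half, math.pi ** 2 - 4)   # agree up to a truncation error of about 2/N

# The truncated sum is even and convex in r, so on [-1/2, 1/2] it peaks at r = +-1/2.
print(max(shifted_sum(k / 10) for k in range(-5, 6)) <= at_half + 1e-12)
```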
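Therefore, the sum of all terms with $m \le M$ and $q|m$ is
\[\begin{aligned}
&\sum_{\substack{m \le M\\ q|m}} \mu(m)\frac{x}{m}\hat{\eta}(-\delta) + O^*\left(\sum_{\substack{m \le M\\ q|m}} \frac{m}{x}\frac{c_0}{(2\pi)^2}(\pi^2 - 4)\right)\\
&= \frac{x\mu(q)}{q}\cdot\hat{\eta}(-\delta)\cdot\sum_{\substack{m \le M/q\\ (m,q)=1}}\frac{\mu(m)}{m} + O^*\left(\mu(q)^2 c_0\left(\frac{1}{4} - \frac{1}{\pi^2}\right)\left(\frac{D^2}{2xq} + \frac{D}{2x}\right)\right).
\end{aligned}\]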
We will bound $|\hat{\eta}(-\delta)|$ by (2.1).

As we have just seen, estimating the contribution of the terms with $m$ divisible by $q$ and not too large ($m \le M$) involves isolating a main term, estimating it carefully (with cancellation) and then bounding the remaining error terms.

We will now bound the contribution of all other $m$, that is, $m$ not divisible by $q$ and $m$ larger than $M$. Cancellation will now be used only within the inner sum; that is, we will bound each inner sum
\[T_m(\alpha) = \sum_n e(\alpha mn)\eta\left(\frac{mn}{x}\right),\]
and then we will carefully consider how to bound sums of $|T_m(\alpha)|$ over $m$ efficiently. By (2.2) and Lemma 2.3.1,
\[|T_m(\alpha)| \le \min\left(\frac{x}{m} + \frac{1}{2}|\eta'|_1,\; \frac{\frac{1}{2}|\eta'|_1}{|\sin(\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{4}\frac{1}{(\sin\pi m\alpha)^2}\right). \tag{4.13}\]
For any $y_2 > y_1 > 0$ with $y_2 - y_1 \le q$ and $y_2 \le Q/2$, (4.13) gives us that
\[\sum_{\substack{y_1 < m \le y_2\\ q \nmid m}} |T_m(\alpha)| \le \sum_{\substack{y_1 < m \le y_2\\ q \nmid m}} \min\left(A, \frac{C}{(\sin\pi m\alpha)^2}\right) \tag{4.14}\]
for $A = (x/y_1)(1 + |\eta'|_1/(2(x/y_1)))$ and $C = (c_0/4)(y_2/x)$. We must now estimate the sum
\[\sum_{\substack{m \le M\\ q \nmid m}} |T_m(\alpha)| + \sum_{\frac{Q}{2} < m \le D} |T_m(\alpha)|. \tag{4.15}\]
To bound the terms with $m \le M$, we can use Lemma 4.1.2. The question is then which one is smaller: the first or the second bound given by Lemma 4.1.2? A brief calculation gives that the second bound is smaller (and hence preferable) exactly when $\sqrt{C/A} > (3\pi/10q)(1 + \sqrt{13/3})$. Since $\sqrt{C/A} \sim (\sqrt{c_0}/2) m/x$, this means that it is sensible to prefer the second bound in Lemma 4.1.2 when $m > c_2 x/q$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$.

It thus makes sense to ask: does $Q/2 \le c_2 x/q$ (so that $m \le M$ implies $m \le c_2 x/q$)? This question divides our work into two basic cases.
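The "brief calculation" amounts to solving a quadratic: writing $s = q\sqrt{C/A}$, the inequality $(20/3\pi^2)s^2 > 2 + (4/\pi)s$ is equivalent to $10s^2 - 6\pi s - 3\pi^2 > 0$, whose positive root is $s = (3\pi/10)(1 + \sqrt{13/3})$. The sketch below checks this crossover numerically; the function names are ours.

```python
import math

def first_bound(A, C, q):
    """First bound in Lemma 4.1.2: (20/(3 pi^2)) * C * q^2."""
    return 20 / (3 * math.pi ** 2) * C * q * q

def second_bound(A, C, q):
    """Second bound in Lemma 4.1.2: 2A + (4q/pi) * sqrt(A*C)."""
    return 2 * A + 4 * q / math.pi * math.sqrt(A * C)

def threshold(q):
    """Value of sqrt(C/A) at which the two bounds coincide."""
    return 3 * math.pi / (10 * q) * (1 + math.sqrt(13 / 3))

q, A = 7, 1.0
s = threshold(q)
for factor in (0.99, 1.0, 1.01):
    C = A * (factor * s) ** 2
    print(factor, first_bound(A, C, q), second_bound(A, C, q))
```

Below the threshold the first bound is smaller, above it the second one is, which is the dichotomy behind cases (a) and (b).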
Case (a). $\delta$ large: $|\delta| \ge 1/2c_2$, where $c_2 = (3\pi/5\sqrt{c_0})(1 + \sqrt{13/3})$. Then $Q/2 \le c_2 x/q$; this will induce us to bound the first sum in (4.15) by the first bound in Lemma 4.1.2.

Recall that $M = \min(Q/2, D)$, and so $M \le c_2 x/q$. By (4.14) and Lemma 4.1.2,
\[\begin{aligned}
\sum_{\substack{1 \le m \le M\\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{\substack{jq < m \le \min((j+1)q, M)\\ q \nmid m}} \min\left(\frac{x}{jq+1} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j+1)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{0 \le j \le \frac{M}{q}} (j+1) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\cdot\left(\frac{1}{2}\frac{M^2}{q^2} + \frac{3}{2}\frac{c_2x}{q^2} + 1\right)\\
&\le \frac{5c_0c_2}{6\pi^2} M + \frac{5c_0q}{3\pi^2}\left(\frac{3}{2}c_2 + \frac{q^2}{x}\right) \le \frac{5c_0c_2}{6\pi^2} M + \frac{35c_0c_2}{6\pi^2} q,
\end{aligned} \tag{4.16}\]
where, to bound the smaller terms, we are using the inequality $Q/2 \le c_2x/q$, and where we are also using the observation that, since $|\delta/x| \le 1/qQ_0$, the assumption $|\delta| \ge 1/2c_2$ implies that $q \le 2c_2x/Q_0$; moreover, since $q \le Q_0$, this gives us that $q^2 \le 2c_2x$. In the main term, we are bounding $qM^2/x$ from above by $M \cdot qQ/2x \le M/2\delta \le c_2M$.

If $D \le (Q+1)/2$, then $M \ge \lfloor D \rfloor$, and so (4.16) is all we need: the second sum in (4.15) is empty. Assume from now on that $D > (Q+1)/2$. The first sum in (4.15) is then bounded by (4.16) (with $M = Q/2$). To bound the second sum in (4.15), we will use the approximation $a'/q'$ instead of $a/q$. The motivation is the following: if we used the approximation $a/q$ even for $m > Q/2$, the contribution of the terms with $q|m$ would be too large. When we use $a'/q'$, the contribution of the terms with $q'|m$ (or $m \equiv \pm 1 \bmod q'$) is very small: only a fraction $1/q'$ (tiny, since $q'$ is large) of all terms are like that, and their individual contribution is always small, precisely because $m > Q/2$.
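By (4.14) (without the restriction $q \nmid m$ on either side) and Lemma 4.1.1,
\[\begin{aligned}
\sum_{Q/2 < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{jq' + \frac{Q}{2} < m \le \min((j+1)q' + Q/2,\, D)} |T_m(\alpha)|\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\cdot\frac{x}{jq' + (Q+1)/2}\cdot\frac{(j+1)q' + Q/2}{x}}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\left(3c_1\frac{x}{jq' + \frac{Q+1}{2}} + \frac{4q'}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q'}{jq' + (Q+1)/2}\right)}\right),
\end{aligned}\]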
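where we recall that $c_1 = 1 + |\eta'|_1/(2x/D)$. Since $q' \ge (\epsilon/(1+\epsilon))Q$,
\[\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\frac{x}{jq' + \frac{Q+1}{2}} \le \frac{x}{Q/2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D}\frac{1}{t}\, dt \le \frac{2x}{Q} + \frac{(1+\epsilon)x}{\epsilon Q}\log^+\frac{D}{\frac{Q+1}{2}}. \tag{4.17}\]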
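Recall now that $q' \le (1+\epsilon)Q + 1 \le (1+\epsilon)(Q+1)$. Therefore,
\[\begin{aligned}
q'\sum_{j=0}^{\left\lfloor\frac{D - (Q+1)/2}{q'}\right\rfloor}\sqrt{1 + \frac{q'}{jq' + (Q+1)/2}} &\le q'\sqrt{1 + \frac{(1+\epsilon)Q + 1}{(Q+1)/2}} + \int_{\frac{Q+1}{2}}^{D}\sqrt{1 + \frac{q'}{t}}\, dt\\
&\le q'\sqrt{3 + 2\epsilon} + \left(D - \frac{Q+1}{2}\right) + \frac{q'}{2}\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned} \tag{4.18}\]
We conclude that $\sum_{Q/2 < m \le D}|T_m(\alpha)|$ is at most
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + \left((1+\epsilon)\sqrt{3 + 2\epsilon} - \frac{1}{2}\right)(Q+1) + \frac{(1+\epsilon)Q + 1}{2}\log^+\frac{D}{\frac{Q+1}{2}}\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q}.
\end{aligned} \tag{4.19}\]
We sum this to (4.16) (with $M = Q/2$), and obtain that (4.15) is at most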
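\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)(Q+1)\left(\varpi_\epsilon + \frac{1}{2}\log^+\frac{D}{\frac{Q+1}{2}}\right)\right)\\
&+ 3c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{D}{\frac{Q+1}{2}}\right)\frac{x}{Q} + \frac{35c_0c_2}{6\pi^2} q,
\end{aligned} \tag{4.20}\]
where we are bounding
\[\frac{5c_0c_2}{6\pi^2} = \frac{5c_0}{6\pi^2}\cdot\frac{3\pi}{5\sqrt{c_0}}\left(1 + \sqrt{\frac{13}{3}}\right) = \frac{\sqrt{c_0}}{2\pi}\left(1 + \sqrt{\frac{13}{3}}\right) \le \frac{2\sqrt{c_0c_1}}{\pi}\cdot\frac{1}{4}\left(1 + \sqrt{\frac{13}{3}}\right) \tag{4.21}\]
and defining
\[\varpi_\epsilon = \sqrt{3 + 2\epsilon} + \left(\frac{1}{4}\left(1 + \sqrt{\frac{13}{3}}\right) - 1\right)\frac{1}{2(1+\epsilon)}. \tag{4.22}\]
(Note that $\varpi_\epsilon < \sqrt{3}$ for $\epsilon < 0.1741$.) A quick check against (4.16) shows that (4.20) is valid also when $D \le Q/2$, even when $Q + 1$ is replaced by $\min(Q + 1, 2D)$. We bound $Q$ from above by $x/|\delta|q$ and $\log^+ D/((Q+1)/2)$ by $\log^+ 2D/(x/|\delta|q + 1)$, and obtain the result.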
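Case (b). $|\delta|$ small: $|\delta| \le 1/2c_2$ or $D \le Q_0/2$. Then $\min(c_2x/q, D) \le Q/2$. We start by bounding the first $q/2$ terms in (4.15) by (4.13) and Lemma 4.1.3:
\[\begin{aligned}
\sum_{m \le q/2} |T_m(\alpha)| &\le \sum_{m \le q/2}\min\left(\frac{\frac{1}{2}|\eta'|_1}{|\sin(\pi m\alpha)|},\; \frac{c_0q/8x}{|\sin(\pi m\alpha)|^2}\right)\\
&\le \frac{|\eta'|_1}{\pi} q\max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right).
\end{aligned} \tag{4.23}\]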
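If $q^2 < 2c_2x$, we estimate the terms with $q/2 < m \le c_2x/q$ by Lemma 4.1.2, which is applicable because $\min(c_2x/q, D) < Q/2$:
\[\begin{aligned}
\sum_{\substack{\frac{q}{2} < m \le D'\\ q \nmid m}} |T_m(\alpha)| &\le \sum_{j=1}^{\infty}\sum_{\substack{\left(j - \frac{1}{2}\right)q < m \le \left(j + \frac{1}{2}\right)q\\ m \le \min\left(\frac{c_2x}{q}, D\right)\\ q \nmid m}} \min\left(\frac{x}{\left(j - \frac{1}{2}\right)q} + \frac{|\eta'|_1}{2},\; \frac{\frac{c_0}{4}\frac{(j+1/2)q}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\sum_{1 \le j \le \frac{D'}{q} + \frac{1}{2}}\left(j + \frac{1}{2}\right) \le \frac{20}{3\pi^2}\frac{c_0q^3}{4x}\left(\frac{c_2x}{2q^2}\frac{D'}{q} + \frac{3}{2}\left(\frac{c_2x}{q^2}\right) + \frac{5}{8}\right)\\
&\le \frac{5c_0}{6\pi^2}\left(c_2D' + 3c_2q + \frac{5}{4}\frac{q^3}{x}\right) \le \frac{5c_0c_2}{6\pi^2}\left(D' + \frac{11}{2}q\right),
\end{aligned} \tag{4.24}\]
where we write $D' = \min(c_2x/q, D)$. If $c_2x/q \ge D$, we stop here. Assume that $c_2x/q < D$. Let $R = \max(c_2x/q, q/2)$. The terms we have already estimated are precisely those with $m \le R$. We bound the terms $R < m \le D$ by the second bound in Lemma 4.1.1:
\[\begin{aligned}
\sum_{R < m \le D} |T_m(\alpha)| &\le \sum_{j=0}^{\infty}\sum_{\substack{m > jq + R\\ m \le \min((j+1)q + R,\, D)}} \min\left(\frac{c_1x}{jq + R},\; \frac{\frac{c_0}{4}\frac{(j+1)q + R}{x}}{(\sin\pi m\alpha)^2}\right)\\
&\le \sum_{j=0}^{\left\lfloor\frac{1}{q}(D - R)\right\rfloor}\left(\frac{3c_1x}{jq + R} + \frac{4q}{\pi}\sqrt{\frac{c_1c_0}{4}\left(1 + \frac{q}{jq + R}\right)}\right).
\end{aligned} \tag{4.25}\]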
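(Note there is no need to use two successive approximations $a/q$, $a'/q'$ as in case (a). We are also including all terms with $m$ divisible by $q$, as we may, since $|T_m(\alpha)|$ is non-negative.) Now, much as before,
\[\sum_{j=0}^{\left\lfloor\frac{1}{q}(D - R)\right\rfloor}\frac{x}{jq + R} \le \frac{x}{R} + \frac{x}{q}\int_R^D\frac{1}{t}\, dt \le \min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{x}{q}\log^+\frac{D}{c_2x/q}, \tag{4.26}\]
and
\[\begin{aligned}
\sum_{j=0}^{\left\lfloor\frac{1}{q}(D - R)\right\rfloor}\sqrt{1 + \frac{q}{jq + R}} &\le \sqrt{1 + \frac{q}{R}} + \frac{1}{q}\int_R^D\sqrt{1 + \frac{q}{t}}\, dt\\
&\le \sqrt{3} + \frac{D - R}{q} + \frac{1}{2}\log^+\frac{D}{q/2}.
\end{aligned} \tag{4.27}\]
We sum with (4.23) and (4.24), and we obtain that (4.15) is at most
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(\sqrt{3}q + D + \frac{q}{2}\log^+\frac{D}{q/2}\right) + \left(3c_1\log^+\frac{D}{c_2x/q}\right)\frac{x}{q}\\
&+ 3c_1\min\left(\frac{q}{c_2}, \frac{2x}{q}\right) + \frac{55c_0c_2}{12\pi^2} q + \frac{|\eta'|_1}{\pi} q \cdot \max\left(2, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right),
\end{aligned} \tag{4.28}\]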
where we are using the fact that $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to make sure that the term $(5c_0c_2/6\pi^2)D'$ from (4.24) is more than compensated by the term $-2\sqrt{c_0c_1}R/\pi$ coming from $-R/q$ in (4.27) (by the definition of $D'$ and $R$, we have $R \ge D'$). We can also use $5c_0c_2/6\pi^2 < 2\sqrt{c_0c_1}/\pi$ to bound the term $(5c_0c_2/6\pi^2)D'$ from (4.24) by the term $2\sqrt{c_0c_1}D/\pi$ in (4.28), in case $c_2x/q \ge D$. (Again by definition, $D' \le D$.) Thus, (4.28) is valid both when $c_2x/q < D$ and when $c_2x/q \ge D$.
4.2.1 Type I variations

We will need a version of Lemma 4.2.1 with $m$ and $n$ restricted to the odd numbers. (We will barely be using the restriction of $m$, whereas the restriction on $n$ is both (a) slightly harder to deal with, (b) something that can be turned to our advantage.)
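Lemma 4.2.2. Let $\alpha \in \mathbb{R}/\mathbb{Z}$ with $2\alpha = a/q + \delta/x$, $(a, q) = 1$, $|\delta/x| \le 1/qQ_0$, $q \le Q_0$, $Q_0 \ge 16$. Let $\eta$ be continuous, piecewise $C^2$ and compactly supported, with $|\eta|_1 = 1$ and $\eta'' \in L^1$. Let $c_0 \ge |\widehat{\eta''}|_\infty$.

Let $1 \le D \le x$. Then, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$, the absolute value of
\[\sum_{\substack{m \le D\\ m \text{ odd}}} \mu(m)\sum_{n \text{ odd}} e(\alpha mn)\eta\left(\frac{mn}{x}\right) \tag{4.29}\]
is at most
\[\frac{x}{2q}\min\left(1, \frac{c_0}{(\pi\delta)^2}\right)\left|\sum_{\substack{m \le M/q\\ (m,2q)=1}}\frac{\mu(m)}{m}\right| + O^*\left(\frac{c_0q}{x}\left(\frac{1}{8} - \frac{1}{2\pi^2}\right)\left(\frac{D}{q} + 1\right)^2\right) \tag{4.30}\]
plus
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi} D + \frac{3c_1}{2}\frac{x}{q}\log^+\frac{D}{c_2x/q} + \frac{\sqrt{c_0c_1}}{\pi} q\log^+\frac{D}{q/2}\\
&+ \frac{2|\eta'|_1}{\pi} q \cdot \max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right) + \left(\frac{2\sqrt{3c_0c_1}}{\pi} + \frac{3c_1}{2c_2} + \frac{55c_0c_2}{6\pi^2}\right) q,
\end{aligned} \tag{4.31}\]
where $c_1 = 1 + |\eta'|_1/(x/D)$ and $M \in [\min(Q_0/2, D), D]$. The same bound holds if $|\delta| \ge 1/2c_2$ but $D \le Q_0/2$.

In general, if $|\delta| \ge 1/2c_2$, the absolute value of (4.29) is at most (4.30) plus
\[\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D + (1+\epsilon)\min\left(\left\lfloor\frac{x}{|\delta|q}\right\rfloor + 1, 2D\right)\left(\sqrt{3 + 2\epsilon} + \frac{1}{2}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\right)\\
&+ \frac{3}{2}c_1\left(2 + \frac{(1+\epsilon)}{\epsilon}\log^+\frac{2D}{\frac{x}{|\delta|q}}\right)\frac{x}{Q_0} + \frac{35c_0c_2}{3\pi^2} q,
\end{aligned} \tag{4.32}\]
for $\epsilon \in (0, 1]$ arbitrary.

If $q$ is even, the sum (4.30) can be replaced by 0.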
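Proof. The proof is almost exactly that of Lemma 4.2.1; we go over the differences. The parameters $Q$, $Q'$, $a'$, $q'$ and $M$ are defined just as before (with $2\alpha$ wherever we had $\alpha$).

Let us first consider $m \le M$ odd and divisible by $q$. (Of course, this case arises only if $q$ is odd.) For $n = 2r + 1$,
\[\begin{aligned}
e(\alpha mn) &= e(\alpha m(2r+1)) = e(2\alpha rm)e(\alpha m) = e\left(\frac{\delta}{x}rm\right)e\left(\left(\frac{a}{2q} + \frac{\delta}{2x} + \frac{\kappa}{2}\right)m\right)\\
&= e\left(\frac{\delta(2r+1)}{2x}m\right)e\left(\frac{a + \kappa q}{2}\frac{m}{q}\right) = \kappa' e\left(\frac{\delta(2r+1)}{2x}m\right),
\end{aligned}\]
where $\kappa \in \{0, 1\}$ and $\kappa' = e((a + \kappa q)/2) \in \{-1, 1\}$ are independent of $m$ and $n$. Hence, by Poisson summation,
\[\begin{aligned}
\sum_{n \text{ odd}} e(\alpha mn)\eta(mn/x) &= \kappa'\sum_{n \text{ odd}} e((\delta m/2x)n)\eta(mn/x)\\
&= \frac{\kappa'}{2}\left(\sum_n \hat{f}(n) - \sum_n \hat{f}(n + 1/2)\right),
\end{aligned} \tag{4.33}\]
where $f(u) = e((\delta m/2x)u)\eta((m/x)u)$. Now
\[\hat{f}(t) = \frac{x}{m}\hat{\eta}\left(\frac{x}{m}t - \frac{\delta}{2}\right).\]
Just as before, $|x/m| \ge 2|\delta q| \ge 2\delta$. Thus
\[\begin{aligned}
\frac{1}{2}\left|\sum_n \hat{f}(n) - \sum_n \hat{f}(n + 1/2)\right| &\le \frac{x}{m}\left(\frac{1}{2}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{1}{2}\sum_{n \ne 0}\left|\hat{\eta}\left(\frac{x}{m}\frac{n}{2} - \frac{\delta}{2}\right)\right|\right)\\
&= \frac{x}{m}\left(\frac{1}{2}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{1}{2}\cdot O^*\left(\sum_{n \ne 0}\frac{1}{\left(\pi\left(\frac{nx}{m} - \delta\right)\right)^2}\right)\cdot\left|\widehat{\eta''}\right|_\infty\right)\\
&= \frac{x}{2m}\left|\hat{\eta}\left(-\frac{\delta}{2}\right)\right| + \frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4).
\end{aligned} \tag{4.34}\]
The contribution of the second term in the last line of (4.34) is
\[\begin{aligned}
\sum_{\substack{m \le M\\ m \text{ odd}\\ q|m}}\frac{m}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4) &= \frac{q}{x}\frac{c_0}{2\pi^2}(\pi^2 - 4)\cdot\sum_{\substack{m \le M/q\\ m \text{ odd}}} m\\
&\le \frac{qc_0}{x}\left(\frac{1}{8} - \frac{1}{2\pi^2}\right)\left(\frac{M}{q} + 1\right)^2.
\end{aligned}\]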
Hence, the absolute value of the sum of all terms with $m \le M$ and $q|m$ is given by (4.30).

We define $T_{m,\circ}(\alpha)$ by
\[T_{m,\circ}(\alpha) = \sum_{n \text{ odd}} e(\alpha mn)\eta\left(\frac{mn}{x}\right). \tag{4.35}\]
Changing variables by $n = 2r + 1$, we see that
\[|T_{m,\circ}(\alpha)| = \left|\sum_r e(2\alpha \cdot mr)\eta(m(2r+1)/x)\right|.\]
Hence, instead of (4.13), we get that
\[|T_{m,\circ}(\alpha)| \le \min\left(\frac{x}{2m} + \frac{1}{2}|\eta'|_1,\; \frac{\frac{1}{2}|\eta'|_1}{|\sin(2\pi m\alpha)|},\; \frac{m}{x}\frac{c_0}{2}\frac{1}{(\sin 2\pi m\alpha)^2}\right). \tag{4.36}\]
We obtain (4.14), but with $T_{m,\circ}$ instead of $T_m$, $A = (x/2y_1)(1 + |\eta'|_1/(x/y_1))$ and $C = (c_0/2)(y_2/x)$, and so $c_1 = 1 + |\eta'|_1/(x/D)$.

The rest of the proof of Lemma 4.2.1 carries over almost word-by-word. (For the sake of simplicity, we do not really try to take advantage of the odd support of $m$ here.) Since $C$ has doubled, it would seem to make sense to reset the value of $c_2$ to be $c_2 = (3\pi/5\sqrt{2c_0})(1 + \sqrt{13/3})$; this would cause complications related to the fact that $5c_0c_2/3\pi^2$ would become larger than $2\sqrt{c_0}/\pi$, and so we set $c_2$ to the slightly smaller value $c_2 = 6\pi/5\sqrt{c_0}$ instead. This implies
\[\frac{5c_0c_2}{3\pi^2} = \frac{2\sqrt{c_0}}{\pi}. \tag{4.37}\]
The bound from (4.16) gets multiplied by 2 (but the value of $c_2$ has changed), the second line in (4.19) gets halved, (4.21) gets replaced by (4.37), the second term in the maximum in the second line of (4.23) gets doubled, the bound from (4.24) gets doubled, and the bound from (4.26) gets halved.

We will also need a version of Lemma 4.2.1 (or rather Lemma 4.2.2; we will decide to work with the restriction that $n$ and $m$ be odd) with a factor of $(\log n)$ within the inner sum. This is the sum $S_{I,1}$ in (3.9).
Lemma 423 Let α isin RZ with 2α = aq + δx (a q) = 1 |δx| le 1qQ0q le Q0 Q0 ge max(16 2
radicx) Let η be continuous piecewise C2 and compactly
supported with |η|1 = 1 and ηprimeprime isin L1 Let c0 ge |ηprimeprime|infin Assume that for any ρ ge ρ0ρ0 a constant the function η(ρ)(t) = log(ρt)η(t) satisfies
|η(ρ)|1 le log(ρ)|η|1 |ηprime(ρ)|1 le log(ρ)|ηprime|1 |ηprimeprime(ρ)|infin le c0 log(ρ) (438)
Letradic
3 le D le min(xρ0 xe) Then if |δ| le 12c2 where c2 = 6π5radicc0 the
absolute value of summleDm odd
micro(m)sumn
n odd
(log n)e(αmn)η(mnx
)(439)
is at most
x
qmin
(1c0δ
2
(2π)2
) ∣∣∣∣∣∣∣∣∣∣summleMq
(mq)=1
micro(m)
mlog
x
mq
∣∣∣∣∣∣∣∣∣∣+x
q|log middotη(minusδ)|
∣∣∣∣∣∣∣∣∣∣summleMq
(mq)=1
micro(m)
m
∣∣∣∣∣∣∣∣∣∣+Olowast
(c0
(1
2minus 2
π2
)(D2
4qxlog
e12x
D+
1
e
)) (440)
plus
2radicc0c1π
D logex
D+
3c12
x
qlog+ D
c2xqlog
q
c2
+
(2|ηprime|1π
max
(1 log
c0e3q2
4π|ηprime|1x
)log x+
2radicc0c1π
(radic3 +
1
2log+ D
q2
)log
q
c2
)q
+3c12
radic2x
c2log
2x
c2+
20c0c322
3π2
radic2x log
2radicex
c2(441)
for c1 = 1 + |ηprime|1(xD) The same bound holds if |δ| ge 12c2 but D le Q02In general if |δ| ge 12c2 the absolute value of (439) is at most
2radicc0c1π
D logex
D+
2radicc0c1π
(1 + ε)
(x
|δ|q+ 1
)(radic3 + 2ε middot log+ 2
radice|δ|q +
1
2log+ 2D
x|δ|q
log+ 2|δ|q
)
+
(3c14
(2radic5
+1 + ε
2εlog x
)+
40
3
radic2c0c
322
)radicx log x
(442)for ε isin (0 1]
Proof. Define $Q$, $Q'$, $M$, $a'$ and $q'$ as in the proof of Lemma 4.2.1. The same method of proof works as for Lemma 4.2.1; we go over the differences. When applying Poisson summation or (2.2), use $\eta_{(x/m)}(t) = (\log xt/m)\eta(t)$ instead of $\eta(t)$. Then use the bounds in (4.38) with $\rho = x/m$; in particular,
\[|\widehat{\eta''_{(x/m)}}|_\infty \le c_0\log\frac{x}{m}.\]
For $f(u) = e((\delta m/2x)u)(\log u)\eta((m/x)u)$,
\[\hat{f}(t) = \frac{x}{m}\widehat{\eta_{(x/m)}}\left(\frac{x}{m}t - \frac{\delta}{2}\right),\]
and so
1
2
sumn
∣∣∣f(n2)∣∣∣ le x
m
1
2
∣∣∣∣η(xm)
(minusδ
2
)∣∣∣∣+1
2
sumn 6=0
∣∣∣∣η( xm n
2minus δ
2
)∣∣∣∣
=1
2
x
m
(log middotη
(minusδ
2
)+ log
( xm
)η
(minusδ
2
))+m
x
(log
x
m
) c02π2
(π2 minus 4)
The part of the main term involving log(xm) becomes
xη(minusδ)2
summleMm oddq|m
micro(m)
mlog( xm
)=xmicro(q)
qη(minusδ) middot
summleMq
(m2q)=1
micro(m)
mlog
(x
mq
)
for q odd (We can see that this like the rest of the main term vanishes for m even)In the term in front of π2 minus 4 we find the sum
summleMm oddq|m
m
xlog( xm
)le M
xlog
x
M+q
2
int Mq
0
t logxq
tdt
=M
xlog
x
M+M2
4qxlog
e12x
M
where we use the fact that $t \mapsto t\log(x/t)$ is increasing for $t \le x/e$. By the same fact (and by $M \le D$), $(M^2/q)\log(e^{1/2}x/M) \le (D^2/q)\log(e^{1/2}x/D)$. It is also easy to see that $(M/x)\log(x/M) \le 1/e$ (since $M \le D \le x$).

The basic estimate for the rest of the proof (replacing (4.13)) is
Tm(α) =sumn odd
e(αmn)(log n)η(mnx
)=sumn odd
e(αmn)η(xm)
(mnx
)
= Olowast
min
x
2m|η(xm)|1 +
|ηprime(xm)|12
12 |ηprime(xm)|1
| sin(2πmα)|m
x
12 |ηprimeprime(xm)|infin
(sin 2πmα)2
= Olowast
(log
x
mmiddotmin
(x
2m+|ηprime|1
2
12 |ηprime|1
| sin(2πmα)|m
x
c02
1
(sin 2πmα)2
))
We wish to bound summleMq-mm odd
|Tm(α)|+sum
Q2 ltmleD
|Tm(α)| (443)
Just as in the proofs of Lemmas 4.2.1 and 4.2.2, we give two bounds, one valid for $|\delta|$ large ($|\delta| \ge 1/2c_2$) and the other for $\delta$ small ($|\delta| \le 1/2c_2$). Again as in the proof of Lemma 4.2.2, we ignore the condition that $m$ is odd in (4.15).
Consider the case of |δ| large first Instead of (416) we havesum1lemleMq-m
|Tm(α)| le 40
3π2
c0q3
2x
sum0lejleMq
(j + 1) logx
jq + 1 (444)
Since sum0lejleMq
(j + 1) logx
jq + 1
le log x+M
qlog
x
M+
sum1lejleMq
logx
jq+
sum1lejleMq minus1
j logx
jq
le log x+M
qlog
x
M+
int Mq
0
logx
tqdt+
int Mq
1
t logx
tqdt
le log x+
(2M
q+M2
2q2
)log
e12x
M
this means thatsum1lemleMq-m
|Tm(α)| le 40
3π2
c0q3
4x
(log x+
(2M
q+M2
2q2
)log
e12x
M
)
le 5c0c23π2
M log
radicex
M+
40
3
radic2c0c
322
radicx log x
(445)
where we are using the bounds M le Q2 le c2xq and q2 le 2c2x (just as in (416))Instead of (417) we havelfloor
Dminus(Q+1)2
qprime
rfloorsumj=0
(log
x
jqprime + Q+12
)x
jqprime + Q+12
le x
Q2log
2x
Q+x
qprime
int D
Q+12
logx
t
dt
t
le 2x
Qlog
2x
Q+x
qprimelog
2x
Qlog+ 2D
Q
recall that the coefficient in front of this sum will be halved by the condition that n isodd Instead of (418) we obtain
qprimebDminus(Q+1)2
qprime csumj=0
radic1 +
qprime
jqprime + (Q+ 1)2
(log
x
jqprime + Q+12
)
le qprimeradic
3 + 2ε middot log2x
Q+ 1+
int D
Q+12
(1 +
qprime
2t
)(log
x
t
)dt
le qprimeradic
3 + 2ε middot log2x
Q+ 1+D log
ex
D
minus Q+ 1
2log
2ex
Q+ 1+qprime
2log
2x
Q+ 1log
2D
Q+ 1
(The boundint ba
log(xt)dtt le log(xa) log(ba) will be more practical than the exactexpression for the integral) Hence
sumQ2ltmleD |Tm(α)| is at most
2radicc0c1π
D logex
D
+2radicc0c1π
((1 + ε)
radic3 + 2ε+
(1 + ε)
2log
2D
Q+ 1
)(Q+ 1) log
2x
Q+ 1
minus2radicc0c1π
middot Q+ 1
2log
2ex
Q+ 1+
3c12
(2radic5
+1 + ε
εlog+ D
Q2
)radicx log
radicx
Summing this to (445) (with M = Q2) and using (421) and (422) as before weobtain that (443) is at most
2radicc0c1π
D logex
D
+2radicc0c1π
(1 + ε)(Q+ 1)
(radic3 + 2ε log+ 2
radicex
Q+ 1+
1
2log+ 2D
Q+ 1log+ 2x
Q+ 1
)+
3c12
(2radic5
+1 + ε
εlog+ D
Q2
)radicx log
radicx+
40
3
radic2c0c
322
radicx log x
Now we go over the case of |δ| small (or D le Q02) Instead of (423) we havesummleq2
|Tm(α)| le 2|ηprime|1π
qmax
(1 log
c0e3q2
4π|ηprime|1x
)log x (446)
Suppose q2 lt 2c2x (Otherwise the sum we are about to estimate is empty) Insteadof (424) we havesumq2ltmleDprime
q-m
|Tm(α)| le 40
3π2
c0q3
6x
sum1lejleDprimeq + 1
2
(j +
1
2
)log
x(j minus 1
2
)q
le 10c0q3
3π2x
(log
2x
q+
1
q
int Dprime
0
logx
tdt+
1
q
int Dprime
0
t logx
tdt+
Dprime
qlog
x
Dprime
)
=10c0q
3
3π2x
(log
2x
q+
(2Dprime
q+
(Dprime)2
2q2
)log
radicex
Dprime
)le 5c0c2
3π2
(4radic
2c2x log2x
q+ 4radic
2c2x log
radicex
Dprime+Dprime log
radicex
Dprime
)le 5c0c2
3π2
(Dprime log
radicex
Dprime+ 4radic
2c2x log2radicex
c2
)(447)
where Dprime = min(c2xqD) (We are using the bounds q3x le (2c2)32 Dprimeq2x lec2q lt c
322
radic2x and Dprimeqx le c2) Instead of (425) we have
sumRltmleD
|Tm(α)| lebDminusRq csumj=0
(3c12 x
jq +R+
4q
π
radicc1c0
4
(1 +
q
jq +R
))log
x
jq +R
where R = max(c2xq q2) We can simply reuse (426) multiplying it by log xRthe only difference is that now we take care to bound min(qc2 2xq) by the geometricmean
radic(qc2)(2xq) =
radic2xc2 We replace (427) by
b 1q (DminusR)csumj=0
radic1 +
q
jq +Rlog
x
jq +Rleradic
1 +q
Rlog
x
R+
1
q
int D
R
radic1 +
q
tlog
x
tdt
leradic
3 logq
c2+
(D
qlog
ex
Dminus R
qlog
ex
R
)+
1
2log
q
c2log+ D
R
(448)We sum with (446) and (447) and obtain (441) as an upper bound for (443) (Just asin the proof of Lemma 421 the term (5c0c2(3π
2))Dprime log(radicexDprime) is smaller than
the term (2radicc1c0π)R log exR in (448) and thus gets absorbed by it when D gt R
If D le R then again as in Lemma 421 the sumsumRltmleD |Tm(α)| is empty and
we bound (5c0c2(3π2))Dprime log(
radicexDprime) by the term (2
radicc1c0π)D log exD which
would not appear otherwise)
Now comes the time to focus on our second type I sum, namely,
\[\sum_{\substack{v \le V\\ v \text{ odd}}}\Lambda(v)\sum_{\substack{u \le U\\ u \text{ odd}}}\mu(u)\sum_{n \text{ odd}} e(\alpha vun)\eta(vun/x),\]
which corresponds to the term $S_{I,2}$ in (3.9). The innermost two sums, on their own, are a sum of type I we have already seen. Accordingly, for $q$ small, we will be able to bound them using Lemma 4.2.2. If $q$ is large, then that approach does not quite work, since then the approximation $av/q$ to $v\alpha$ is not always good enough. (As we shall later see, we need $q \le Q/v$ for the approximation to be sufficiently close for our purposes.)

Fortunately, when $q$ is large, we can also afford to lose a factor of $\log$, since the gains from $q$ will be large. Here is the estimate we will use for $q$ large.
Lemma 424 Let α isin RZ with 2α = aq + δx (a q) = 1 |δx| le 1qQ0q le Q0 Q0 ge max(2e 2
radicx) Let η be continuous piecewise C2 and compactly
supported with |η|1 = 1 and ηprimeprime isin L1 Let c0 ge |ηprimeprime|infin Let c2 = 6π5radicc0 Assume
that x ge e2c22Let U V ge 1 satisfy UV +(1918)Q0 le x56 Then if |δ| le 12c2 the absolute
value of ∣∣∣∣∣∣∣∣sumvleVv odd
Λ(v)sumuleUu odd
micro(u)sumn
n odd
e(αvun)η(vunx)
∣∣∣∣∣∣∣∣ (449)
is at most
x
2qmin
(1
c0(πδ)2
)log V q
+Olowast(
1
4minus 1
π2
)middot c0(D2 log V
2qx+
3c42
UV 2
x+
(U + 1)2V
2xlog q
) (450)
plus
2radicc0c1π
(D log
Dradice
+ q
(radic3 log
c2x
q+
logD
2log+ D
q2
))+
3c12
x
qlogD log+ D
c2xq+
2|ηprime|1π
qmax
(1 log
c0e3q2
4π|ηprime|1x
)log
q
2
+3c1
2radic
2c2
radicx log
c2x
2+
25c04π2
(2c2)32radicx log x
(451)
whereD = UV and c1 = 1+ |ηprime|1(2xD) and c4 = 103884 The same bound holdsif |δ| ge 12c2 but D le Q02
In general if |δ| ge 12c2 the absolute value of (449) is at most (450) plus
2radicc0c1π
D logD
e
+2radicc0c1π
(1 + ε)
(x
|δ|q+ 1
)((radic
3 + 2εminus 1) log
x|δ|q + 1radic
2+
1
2logD log+ e2D
x|δ|q
)
+
(3c12
(1
2+
3(1 + ε)
16εlog x
)+
20c03π2
(2c2)32
)radicx log x
(452)for ε isin (0 1]
Proof We proceed essentially as in Lemma 421 and Lemma 422 Let Q qprime and Qprime
be as in the proof of Lemma 422 that is with 2α where Lemma 421 uses αLet M = min(UVQ2) We first consider the terms with uv le M u and v odd
uv divisible by q If q is even there are no such terms Assume q is odd Then by(433) and (434) the absolute value of the contribution of these terms is at most
sumaleMa oddq|a
sumv|a
aUlevleV
Λ(v)micro(av)
(xη(minusδ2)
2a+O
(a
x
|ηprimeprime|infin2π2
middot (π2 minus 4)
)) (453)
Now
sumaleMa oddq|a
sumv|a
aUlevleV
Λ(v)micro(av)
a
=sumvleVv odd
(vq)=1
Λ(v)
v
sumulemin(UMV )
u oddq|u
micro(u)
u+sumpαleVp oddp|q
Λ(pα)
pα
sumulemin(UMV )
u oddq
(qpα)|u
micro(u)
u
which equals
micro(q)
q
sumvleVv odd
(vq)=1
Λ(v)
v
sumulemin(UqMV q)
(u2q)=1
micro(u)
u
+micro(
q(qpα)
)q
sumpαleVp oddp|q
Λ(pα)
pα(q pα)
sumulemin( U
q(qpα)MV
q(qpα) )u odd
(u q(qpα) )=1
micro(u)
u
=1
qmiddotOlowast
sumvleV
(v2q)=1
Λ(v)
v+sumpαleVp oddp|q
log p
pα(q pα)
where we are using (220) to bound the sums on u by 1 We notice that
sumpαleVp oddp|q
log p
pα(q pα)lesump oddp|q
(log p)
vp(q) +sum
αgtvp(q)
pαleV
1
pαminusvp(q)
le log q +
sump oddp|q
(log p)sumβgt0
pβle V
pvp(q)
log p
pβle log q +
sumvleVv odd
(vq)=1
Λ(v)
v
and so
sumaleMa oddq|a
sumv|a
aUlevleV
Λ(v)micro(av)
a=
1
qmiddotOlowast
log q +sumvleV
(v2)=1
Λ(v)
v
=
1
qmiddotOlowast(log q + log V )
by (212) The absolute value of the sum of the terms with η(minusδ2) in (453) is thus atmost
x
q
η(minusδ2)
2(log q + log V ) le x
2qmin
(1
c0(πδ)2
)log V q
where we are bounding η(minusδ2) by (21) (with k = 2)
The other terms in (453) contribute at most
(π2 minus 4)|ηprimeprime|infin2π2
1
x
sumuleU
sumvleV
uv odduvleM q|uvu sq-free
Λ(v)uv (454)
For any RsumuleRu oddq|u le R24q + 3R4 Using the estimates (212) (215)
and (216) we obtain that the double sum in (454) is at mostsumvleV
(v2q)=1
Λ(v)vsum
ulemin(UMv)
u oddq|u
u+sumpαleVp oddp|q
(log p)pαsumuleUu oddq
(qpα)|u
u
lesumvleV
(v2q)=1
Λ(v)v middot(
(Mv)2
4q+
3M
4v
)+sumpαleVp oddp|q
(log p)pα middot (U + 1)2
4
le M2 log V
4q+
3c44MV +
(U + 1)2
4V log q
(455)
where c4 = 103884From this point onwards we use the easy bound∣∣∣∣∣∣∣∣∣
sumv|a
aUlevleV
Λ(v)micro(av)
∣∣∣∣∣∣∣∣∣ le log a
What we must bound now issummleUVm odd
q - m orm gt M
(logm)sumn odd
e(αmn)η(mnx) (456)
The inner sum is the same as the sum $T_{m,\circ}(\alpha)$ in (4.35); we will be using the bound (4.36). Much as before, we will be able to ignore the condition that $m$ is odd.

Let $D = UV$. What remains to be done is similar to what we did in the proof of Lemma 4.2.1 (or Lemma 4.2.2).
Case (a) δ large |δ| ge 12c2 Instead of (416) we have
sum1lemleMq-m
(logm)|Tm(α)| le 40
3π2
c0q3
4x
sum0lejleMq
(j + 1) log(j + 1)q
and since M le min(c2xqD) q leradic
2c2x (just as in the proof of Lemma 421) andsum0lejleMq
(j + 1) log(j + 1)q
le M
qlogM +
(M
q+ 1
)log(M + 1) +
1
q2
int M
0
t log t dt
le(
2M
q+ 1
)log x+
M2
2q2log
Mradice
we conclude thatsum1lemleMq-m
|Tm(α)| le 5c0c23π2
M logMradice
+20c03π2
(2c2)32radicx log x
(457)
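Instead of (4.17), we have
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{D-(Q+1)/2}{q'}\right\rfloor} \frac{x}{jq'+\frac{Q+1}{2}}\log\left(jq'+\frac{Q+1}{2}\right)
&\le \frac{x}{\frac{Q+1}{2}}\log\frac{Q+1}{2} + \frac{x}{q'}\int_{\frac{Q+1}{2}}^{D} \frac{\log t}{t}\,dt \\
&\le \frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2 - \left(\log\frac{Q}{2}\right)^2\right).
\end{aligned}
\]
Instead of (4.18), we estimate
\[
\begin{aligned}
&q'\sum_{j=0}^{\left\lfloor \frac{D-(Q+1)/2}{q'}\right\rfloor} \log\left(\frac{Q+1}{2}+jq'\right)\sqrt{1+\frac{q'}{jq'+\frac{Q+1}{2}}} \\
&\quad\le q'\left(\log D + \bigl(\sqrt{3+2\epsilon}-1\bigr)\log\frac{Q+1}{2}\right) + \int_{\frac{Q+1}{2}}^{D}\log t\,dt + \int_{\frac{Q+1}{2}}^{D}\frac{q'\log t}{2t}\,dt \\
&\quad\le q'\left(\log D + \bigl(\sqrt{3+2\epsilon}-1\bigr)\log\frac{Q+1}{2}\right) + \left(D\log\frac{D}{e} - \frac{Q+1}{2}\log\frac{Q+1}{2e}\right) + \frac{q'}{2}\log D\,\log^+\frac{D}{\frac{Q+1}{2}}.
\end{aligned}
\]
We conclude that, when $D\ge Q/2$, the sum $\sum_{Q/2<m\le D}(\log m)|T_m(\alpha)|$ is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{e} + (Q+1)\left((1+\epsilon)\bigl(\sqrt{3+2\epsilon}-1\bigr)\log\frac{Q+1}{2} - \frac{1}{2}\log\frac{Q+1}{2e}\right)\right) \\
&\quad+ \frac{\sqrt{c_0c_1}}{\pi}(Q+1)(1+\epsilon)\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}} \\
&\quad+ \frac{3c_1}{2}\left(\frac{2x}{Q}\log\frac{Q}{2} + \frac{(1+\epsilon)x}{2\epsilon Q}\left((\log D)^2 - \left(\log\frac{Q}{2}\right)^2\right)\right).
\end{aligned}
\]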
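We must now add this to (4.57). Since
\[
(1+\epsilon)\bigl(\sqrt{3+2\epsilon}-1\bigr)\log\sqrt{2} - \frac{1}{2}\log 2e + \frac{1+\sqrt{13/3}}{2}\log 2\sqrt{e} > 0
\]
and $Q\ge 2\sqrt{x}$, we conclude that (4.56) is at most
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\, D\log\frac{D}{e} \\
&\quad+ \frac{2\sqrt{c_0c_1}}{\pi}(1+\epsilon)(Q+1)\left(\bigl(\sqrt{3+2\epsilon}-1\bigr)\log\frac{Q+1}{\sqrt{2}} + \frac{1}{2}\log D\,\log^+\frac{e^2D}{\frac{Q+1}{2}}\right) \\
&\quad+ \left(\frac{3c_1}{2}\left(\frac{1}{2} + \frac{3(1+\epsilon)}{16\epsilon}\log x\right) + \frac{20c_0}{3\pi^2}(2c_2)^{3/2}\right)\sqrt{x}\log x.
\end{aligned}
\tag{4.58}
\]

Case (b). $\delta$ small: $|\delta|\le 1/2c_2$ or $D\le Q_0/2$. The analogue of (4.23) is a bound of
\[
\le \frac{2|\eta'|_1}{\pi}\, q\max\left(1, \log\frac{c_0e^3q^2}{4\pi|\eta'|_1 x}\right)\log\frac{q}{2}
\]
for the terms with $m\le q/2$. If $q^2 < 2c_2x$, then, much as in (4.24), we have
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D'\\ q\nmid m}} |T_m(\alpha)|(\log m)
&\le \frac{10}{\pi^2}\,\frac{c_0q^3}{3x}\sum_{1\le j\le \frac{D'}{q}+\frac{1}{2}} \left(j+\frac{1}{2}\right)\log\bigl((j+1/2)q\bigr) \\
&\le \frac{10}{\pi^2}\,\frac{c_0q}{3x}\int_q^{D'+\frac{3}{2}q} x\log x\,dx.
\end{aligned}
\tag{4.59}
\]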
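Since
\[
\begin{aligned}
\int_q^{D'+\frac{3}{2}q} x\log x\,dx
&= \frac{1}{2}\left(D'+\frac{3}{2}q\right)^2\log\frac{D'+\frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}} \\
&= \left(\frac{1}{2}D'^2+\frac{3}{2}D'q\right)\left(\log\frac{D'}{\sqrt{e}} + \frac{3}{2}\frac{q}{D'}\right) + \frac{9}{8}q^2\log\frac{D'+\frac{3}{2}q}{\sqrt{e}} - \frac{1}{2}q^2\log\frac{q}{\sqrt{e}} \\
&= \frac{1}{2}D'^2\log\frac{D'}{\sqrt{e}} + \frac{3}{2}D'q\log D' + \frac{9}{8}q^2\left(\frac{2}{9}+\frac{3}{2}+\log\left(D'+\frac{19}{18}q\right)\right),
\end{aligned}
\]
where $D' = \min(c_2x/q, D)$, and since the assumption $(UV + (19/18)Q_0)\le x/5.6$ implies that $(2/9 + 3/2 + \log(D'+(19/18)q)) \le \log x$, we conclude that
\[
\begin{aligned}
\sum_{\substack{q/2<m\le D'\\ q\nmid m}} |T_m(\alpha)|(\log m)
&\le \frac{5c_0c_2}{3\pi^2}\, D'\log\frac{D'}{\sqrt{e}} + \frac{10c_0}{3\pi^2}\left(\frac{3}{4}(2c_2)^{3/2}\sqrt{x}\log x + \frac{9}{8}(2c_2)^{3/2}\sqrt{x}\log x\right) \\
&\le \frac{5c_0c_2}{3\pi^2}\, D'\log\frac{D'}{\sqrt{e}} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x.
\end{aligned}
\tag{4.60}
\]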
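Let $R = \max(c_2x/q, q/2)$. We bound the terms $R < m\le D$ as in (4.25), with a factor of $\log(jq+R)$ inside the sum. The analogues of (4.26) and (4.27) are
\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D-R)\right\rfloor} \frac{x}{jq+R}\log(jq+R)
&\le \frac{x}{R}\log R + \frac{x}{q}\int_R^D \frac{\log t}{t}\,dt \\
&\le \sqrt{\frac{2x}{c_2}}\log\sqrt{\frac{c_2x}{2}} + \frac{x}{q}\log D\,\log^+\frac{D}{R},
\end{aligned}
\tag{4.61}
\]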
where we use the assumption that x ge e2c2 and
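\[
\begin{aligned}
\sum_{j=0}^{\left\lfloor \frac{1}{q}(D-R)\right\rfloor} \log(jq+R)\sqrt{1+\frac{q}{jq+R}}
&\le \sqrt{3}\log R + \frac{1}{q}\left(D\log\frac{D}{e} - R\log\frac{R}{e}\right) + \frac{1}{2}\log D\,\log\frac{D}{R}
\end{aligned}
\tag{4.62}
\]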
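(or 0 if $D < R$). We sum with (4.60) and the terms with $m\le q/2$, and obtain, for $D' = c_2x/q = R$,
\[
\begin{aligned}
&\frac{2\sqrt{c_0c_1}}{\pi}\left(D\log\frac{D}{\sqrt{e}} + q\left(\sqrt{3}\log\frac{c_2x}{q} + \frac{\log D}{2}\log^+\frac{D}{q/2}\right)\right)
 + \frac{3c_1}{2}\,\frac{x}{q}\log D\,\log^+\frac{D}{c_2x/q} \\
&\quad+ \frac{2|\eta'|_1}{\pi}\, q\max\left(1,\log\frac{c_0e^3q^2}{4\pi|\eta'|_1x}\right)\log\frac{q}{2}
 + \frac{3c_1}{2\sqrt{2c_2}}\sqrt{x}\log\frac{c_2x}{2} + \frac{25c_0}{4\pi^2}(2c_2)^{3/2}\sqrt{x}\log x,
\end{aligned}
\]
which, it is easy to check, is also valid even if $D' = D$ (in which case (4.61) and (4.62) do not appear) or $R = q/2$ (in which case (4.60) does not appear).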
Chapter 5
Type II sums
We must now consider the sum
\[
S_{II} = \sum_{\substack{m>U\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}} \mu(d)\Biggr) \sum_{\substack{n>V\\ (n,v)=1}} \Lambda(n)\, e(\alpha mn)\,\eta(mn/x). \tag{5.1}
\]
Here the main improvements over classical treatments of type II sums are as follows:

1. obtaining cancellation in the term $\sum_{d>U,\ d|m}\mu(d)$, leading to a gain of a factor of log;

2. using a large sieve for primes, getting rid of a further log;

3. exploiting, via a non-conventional application of the principle of the large sieve (Lemma 5.2.1), the fact that $\alpha$ is in the tail of an interval (when that is the case).

It should be clear that these techniques are of general applicability. (It is also clear that (2) is not new, though, strangely enough, it seems not to have been applied to Goldbach's problem. Perhaps this oversight is due to the fact that proofs of Vinogradov's result given in textbooks often follow Linnik's dispersion method, rather than the large sieve. Our treatment of the large sieve for primes will follow the lines set by Montgomery and Montgomery-Vaughan [MV73, (1.6)]. The fact that the large sieve for primes can be combined with the new technique (3) is, of course, a novelty.)

While (1) is particularly useful for the treatment of a term that generally arises in applications of Vaughan's identity, all of the points above address issues that can arise in more general situations in number theory.
It is technically helpful to express $\eta$ as the (multiplicative) convolution of two functions of compact support, preferably the same function:
\[
\eta(x) = (\eta_1 *_M \eta_1)(x) = \int_0^\infty \eta_1(t)\,\eta_1(x/t)\,\frac{dt}{t}. \tag{5.2}
\]
For the smoothing function $\eta(t) = \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$, equation (5.2) holds with $\eta_1 = 2\cdot 1_{[1/2,1]}$, where $1_{[1/2,1]}$ is the characteristic function of the interval $[1/2, 1]$. We will work with $\eta = \eta_2$, yet most of our work will be valid for any $\eta$ of the form $\eta = \eta_1 * \eta_1$.
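The claim that $\eta_2 = \eta_1 *_M \eta_1$ for $\eta_1 = 2\cdot 1_{[1/2,1]}$ can be checked numerically. The following is a minimal sketch, not part of the original argument; the function names `eta1`, `eta2` and `mult_conv` are hypothetical, and the integral in (5.2) is approximated by a midpoint rule after the substitution $t = e^u$ (so the agreement is only up to a small discretization error).

```python
import math

def eta1(t):
    """eta_1 = 2 * (indicator function of [1/2, 1])."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def eta2(t):
    """eta_2(t) = 4 * max(log 2 - |log 2t|, 0)."""
    if t <= 0:
        return 0.0
    return 4.0 * max(math.log(2) - abs(math.log(2 * t)), 0.0)

def mult_conv(x, steps=20000):
    """Midpoint-rule approximation of int_0^inf eta1(t) eta1(x/t) dt/t.

    With t = e^u we have dt/t = du; eta1(t) vanishes outside [1/2, 1],
    so u only needs to range over [-log 2, 0].
    """
    lo, hi = -math.log(2), 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(lo + (k + 0.5) * h)
        total += eta1(t) * eta1(x / t)
    return total * h

if __name__ == "__main__":
    for x in (0.2, 0.3, 0.5, 0.7, 0.9, 1.1):
        print(x, mult_conv(x), eta2(x))
```

The two printed columns should agree to about three decimal places; the exact computation is simply the length, in the variable $\log t$, of the interval $[\max(1/2, x), \min(1, 2x)]$, multiplied by 4.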
By (5.2), the sum (5.1) equals
\[
\begin{aligned}
&4\int_0^\infty \sum_{\substack{m>U\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr) \sum_{\substack{n>V\\ (n,v)=1}} \Lambda(n)\, e(\alpha mn)\,\eta_1(t)\,\eta_1\!\left(\frac{mn/x}{t}\right)\frac{dt}{t} \\
&\quad= 4\int_V^{x/U} \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le \frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr) \sum_{\substack{\max\left(V,\frac{W}{2}\right)<n\le W\\ (n,v)=1}} \Lambda(n)\, e(\alpha mn)\,\frac{dW}{W}
\end{aligned}
\tag{5.3}
\]
by the substitution $t = (m/x)W$. (We can assume $V\le W\le x/U$ because otherwise one of the sums in (5.4) is empty.) As we can see, the sums within the integral are now unsmoothed. This will not be truly harmful, and to some extent it will be convenient, in that ready-to-use large-sieve estimates in the literature have been optimized more carefully for unsmoothed sums than for smooth sums. The fact that the sums start at $x/2W$ and $W/2$ rather than at 1 will also be slightly helpful.

(This is presumably why the weight $\eta_2$ was introduced in [Tao14], which also uses the large sieve. As we will later see, the weight $\eta_2$ (or anything like it) will simply not do on the major arcs, which are much more sensitive to the choice of weights. On the minor arcs, however, $\eta_2$ is convenient, and this is why we use it here. For type I sums, as should be clear from our work so far, which was stated for general weights, any function whose second derivative exists almost everywhere and lies in $\ell_1$ would do just as well. The option of having no smoothing whatsoever, as in Vinogradov's work or as in most textbook accounts, would not be quite as good for type I sums, and would lead to a routine but inconvenient splitting of sums into short intervals in place of (5.3).)
We now do what is generally the first thing in type II treatments: we use Cauchy-Schwarz. A minor note, however, that may help avoid confusion: the treatments familiar to some readers (e.g., the dispersion method, not followed here) start with the special case of Cauchy-Schwarz that is most common in number theory,
\[
\Biggl|\sum_{n\le N} a_n\Biggr|^2 \le N\sum_{n\le N}|a_n|^2,
\]
whereas here we apply the general rule
\[
\sum_m a_mb_m \le \sqrt{\sum_m |a_m|^2}\,\sqrt{\sum_m |b_m|^2}
\]
to the integrand in (5.3). At any rate, we will have reduced the estimation of a sum to the estimation of two simpler sums $\sum_m |a_m|^2$, $\sum_m |b_m|^2$, but each of these two simpler sums will be of a kind that will lead to a loss of a factor of $\log x$ (or $(\log x)^3$) if not estimated carefully. Since we cannot afford to lose a single factor of $\log x$, we will have to deploy and develop techniques to eliminate these factors of $\log x$. The procedure followed will be quite different for the two sums; a variety of techniques will be needed.
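We separate $n$ prime and $n$ non-prime in the integrand of (5.3), and, as we were saying, we apply Cauchy-Schwarz. We obtain that the expression within the integral in (5.3) is at most $\sqrt{S_1(U,W)\cdot S_2(U,V,W)} + \sqrt{S_1(U,W)\cdot S_3(W)}$, where
\[
\begin{aligned}
S_1(U,W) &= \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr)^2, \\
S_2(U,V,W) &= \sum_{\substack{\max\left(\frac{x}{2W},U\right)<m\le\frac{x}{W}\\ (m,v)=1}} \Biggl|\sum_{\substack{\max\left(V,\frac{W}{2}\right)<p\le W\\ (p,v)=1}} (\log p)\, e(\alpha mp)\Biggr|^2
\end{aligned}
\tag{5.4}
\]
and
\[
\begin{aligned}
S_3(W) &= \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}} \Biggl|\sum_{\substack{n\le W\\ n\text{ non-prime}}}\Lambda(n)\Biggr|^2
 \le \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}} \left(1.42620\, W^{1/2}\right)^2 \\
&\le 1.0171x + 2.0341W
\end{aligned}
\tag{5.5}
\]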
(by [RS62 Thm 13]) We will assume V le w thus the condition (p v) = 1 will befulfilled automatically and can be removed
The contribution of S3(W ) will be negligible We must bound S1(UW ) andS2(U VW ) from above
5.1 The sum S1: cancellation

We shall bound
\[
S_1(U,W) = \sum_{\substack{\max(U,x/2W)<m\le x/W\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr)^2. \tag{5.6}
\]
There will be a surprising amount of cancellation: the expression within the sum will be bounded by a constant on average (a constant less than 1, and usually less than 1/2, in fact). In other words, the inner sum in (5.6) is exactly 0 most of the time.
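To get a feeling for this cancellation, one can tabulate the inner sum of (5.6) for small parameters. The sketch below is only an illustration under assumed toy values of $x$, $W$, $U$ (with $v = 1$); the helper names `mobius_sieve`, `inner_sum` and `average_square` are hypothetical, and nothing here replaces the explicit bounds proved in this section.

```python
def mobius_sieve(n):
    """Return a list mu[0..n] of Moebius values, computed by a linear sieve."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_composite = [False] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if not is_composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > n:
                break
            is_composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0      # p^2 divides i*p
                break
            mu[i * p] = -mu[i]
    return mu

def inner_sum(m, U, mu):
    """Sum of mu(d) over the divisors d of m with d > U."""
    return sum(mu[d] for d in range(U + 1, m + 1) if m % d == 0)

def average_square(x, W, U):
    """Average of inner_sum(m)^2 and fraction of zeros, for
    max(U, x/2W) < m <= x/W (the range of (5.6) with v = 1)."""
    hi = x // W
    lo = max(U, x // (2 * W))
    mu = mobius_sieve(hi)
    vals = [inner_sum(m, U, mu) for m in range(lo + 1, hi + 1)]
    zeros = sum(1 for t in vals if t == 0)
    return sum(t * t for t in vals) / len(vals), zeros / len(vals)

if __name__ == "__main__":
    avg, zero_fraction = average_square(x=100000, W=50, U=100)
    print("average of the squared inner sum:", avg)
    print("fraction of m with inner sum 0:  ", zero_fraction)
```

Since $\sum_{d|m}\mu(d) = 0$ for $m > 1$, the inner sum equals $-\sum_{d|m,\ d\le U}\mu(d)$; the brute-force loop above does not use this, which makes the identity a convenient consistency check.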
Recall that we need explicit constants throughout, and that this essentially constrains us to elementary means. (We will at one point use Dirichlet series and $\zeta(s)$ for $s$ real and greater than 1.)

5.1.1 Reduction to a sum with μ

It is tempting to start by applying Möbius inversion to change $d > U$ to $d\le U$ in (5.6), but this just makes matters worse. We could also try changing variables so that $m/d$ (which is smaller than $x/UW$) becomes the variable instead of $d$, but this leads to complications for $m$ non-square-free. Instead, we write
\[
\begin{aligned}
&\sum_{\substack{\max(U,x/2W)<m\le x/W\\ (m,v)=1}} \Biggl(\sum_{\substack{d>U\\ d|m}}\mu(d)\Biggr)^2
 = \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ (m,v)=1}} \sum_{d_1,d_2|m} \mu(d_1>U)\,\mu(d_2>U) \\
&\quad= \sum_{r_1<x/WU}\ \sum_{\substack{r_2<x/WU\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}}\ \sum_{\substack{l\\ (l,r_1r_2)=1\\ r_1l,\,r_2l>U\\ (\ell,v)=1}} \mu(r_1l)\,\mu(r_2l) \sum_{\substack{\frac{x}{2W}<m\le\frac{x}{W}\\ r_1r_2l|m\\ (m,v)=1}} 1,
\end{aligned}
\tag{5.7}
\]
where $d_1 = r_1l$, $d_2 = r_2l$, $l = (d_1,d_2)$. (The inequality $r_1 < x/WU$ comes from $r_1r_2l\,|\,m$, $m\le x/W$, $r_2l>U$; $r_2 < x/WU$ is proven in the same way.) Now (5.7) equals
\[
\sum_{\substack{s<\frac{x}{WU}\\ (s,v)=1}}\ \sum_{r_1<\frac{x}{WUs}}\ \sum_{\substack{r_2<\frac{x}{WUs}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\,\mu(r_2) \sum_{\substack{\max\left(\frac{U}{\min(r_1,r_2)},\frac{x/W}{2r_1r_2s}\right)<l\le\frac{x/W}{r_1r_2s}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (\ell,v)=1}} 1, \tag{5.8}
\]
where we have set $s = m/(r_1r_2l)$. We begin by simplifying the innermost triple sum. This we do in the following Lemma; it is not a trivial task, and carrying it out efficiently actually takes an idea.
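Lemma 5.1.1. Let $z, y > 0$. Then
\[
\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\,\mu(r_2) \sum_{\substack{\min\left(\frac{z/y}{\min(r_1,r_2)},\frac{z}{2r_1r_2}\right)<l\le\frac{z}{r_1r_2}\\ (l,r_1r_2)=1,\ (\mu(l))^2=1\\ (\ell,v)=1}} 1 \tag{5.9}
\]
equals
\[
\begin{aligned}
&\frac{6z}{\pi^2}\,\frac{v}{\sigma(v)} \sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right) \\
&\quad+ O^*\Biggl(5.08\,\zeta\!\left(\frac{3}{2}\right)^2 y\sqrt{z}\cdot\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\left(1-\frac{1}{p^{3/2}}\right)^2\Biggr).
\end{aligned}
\tag{5.10}
\]
If $v = 2$, the error term in (5.10) can be replaced by
\[
O^*\left(1.27\,\zeta\!\left(\frac{3}{2}\right)^2 y\sqrt{z}\cdot\left(1+\frac{1}{\sqrt{2}}\right)\left(1-\frac{1}{2^{3/2}}\right)^2\right). \tag{5.11}
\]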
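Proof. By Möbius inversion, (5.9) equals
\[
\sum_{r_1<y}\ \sum_{\substack{r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \mu(r_1)\mu(r_2) \sum_{\substack{l\le\frac{z}{r_1r_2}\\ l>\min\left(\frac{z/y}{\min(r_1,r_2)},\frac{z}{2r_1r_2}\right)\\ (\ell,v)=1}}\ \sum_{\substack{d_1|r_1,\ d_2|r_2\\ d_1d_2|l}} \mu(d_1)\mu(d_2) \sum_{\substack{d_3|v\\ d_3|l}}\mu(d_3) \sum_{\substack{m^2|l\\ (m,r_1r_2v)=1}} \mu(m). \tag{5.12}
\]
We can change the order of summation of $r_i$ and $d_i$ by defining $s_i = r_i/d_i$, and we can also use the obvious fact that the number of integers in an interval $(a,b]$ divisible by $d$ is $(b-a)/d + O^*(1)$. Thus (5.12) equals
\[
\begin{aligned}
&\sum_{\substack{d_1,d_2<y\\ (d_1,d_2)=1\\ (d_1d_2,v)=1}} \mu(d_1)\mu(d_2) \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (d_1s_1,d_2s_2)=1\\ (s_1s_2,v)=1}} \mu(d_1s_1)\mu(d_2s_2) \\
&\qquad\cdot \sum_{d_3|v}\mu(d_3) \sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ (m,d_1s_1d_2s_2v)=1}} \frac{\mu(m)}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}\left(1-\max\left(\frac{1}{2},\frac{s_1d_1}{y},\frac{s_2d_2}{y}\right)\right)
\end{aligned}
\tag{5.13}
\]
plus
\[
O^*\Biggl(\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v}\ \sum_{\substack{m\le\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\text{ sq-free}}} 1\Biggr). \tag{5.14}
\]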
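If we complete the innermost sum in (5.13) by removing the condition
\[
m\le\sqrt{z/(d_1^2s_1d_2^2s_2d_3)},
\]
we obtain (reintroducing the variables $r_i = d_is_i$)
\[
z\cdot\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{r_1r_2}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right) \sum_{\substack{d_1|r_1\\ d_2|r_2}}\ \sum_{d_3|v}\ \sum_{\substack{m\\ (m,r_1r_2v)=1}} \frac{\mu(d_1)\mu(d_2)\mu(m)\mu(d_3)}{d_1d_2d_3m^2}. \tag{5.15}
\]
Now (5.15) equals
\[
\begin{aligned}
&\sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)\,z}{r_1r_2}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right) \prod_{p|r_1r_2\text{ or }p|v}\left(1-\frac{1}{p}\right) \prod_{\substack{p\nmid r_1r_2\\ p\nmid v}}\left(1-\frac{1}{p^2}\right) \\
&\quad= \frac{6z}{\pi^2}\,\frac{v}{\sigma(v)} \sum_{\substack{r_1,r_2<y\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{y},\frac{r_2}{y}\right)\right),
\end{aligned}
\]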
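i.e., the main term in (5.10). It remains to estimate the terms used to complete the sum; their total is, by definition, given exactly by (5.13) with the inequality $m\le\sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$ changed to $m>\sqrt{z/(d_1^2s_1d_2^2s_2d_3)}$. This is a total of size at most
\[
\frac{1}{2}\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v}\ \sum_{\substack{m>\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\\ m\text{ sq-free}}} \frac{1}{d_1d_2d_3m^2}\,\frac{z}{s_1d_1s_2d_2}. \tag{5.16}
\]
Adding this to (5.14), we obtain, as our total error term,
\[
\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v} f\left(\sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}}\right), \tag{5.17}
\]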
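where
\[
f(x) = \sum_{\substack{m\le x\\ m\text{ sq-free}}} 1 + \frac{1}{2}\sum_{\substack{m>x\\ m\text{ sq-free}}} \frac{x^2}{m^2}.
\]
It is easy to see that $f(x)/x$ has a local maximum exactly when $x$ is a square-free (positive) integer. We can hence check that
\[
f(x)\le\frac{1}{2}\left(2+2\left(\frac{\zeta(2)}{\zeta(4)}-1.25\right)\right)x = 1.26981\ldots x
\]
for all $x\ge 0$ by checking all integers smaller than a constant, using $\{m : m\text{ sq-free}\}\subset\{m : 4\nmid m\}$ and $1.5\cdot(3/4) < 1.26981$ to bound $f$ from above for $x$ larger than a constant. Therefore, (5.17) is at most
\[
\begin{aligned}
&1.27\sum_{\substack{d_1,d_2<y\\ (d_1d_2,v)=1}}\ \sum_{\substack{s_1<y/d_1,\ s_2<y/d_2\\ (s_1s_2,v)=1}}\ \sum_{d_3|v} \sqrt{\frac{z}{d_1^2s_1d_2^2s_2d_3}} \\
&\quad= 1.27\sqrt{z}\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\cdot\Biggl(\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{\substack{s<y/d\\ (s,v)=1}} \frac{1}{d\sqrt{s}}\Biggr)^2.
\end{aligned}
\]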
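The bound $f(x)\le 1.27x$ used above can be spot-checked at integers. The following sketch is an illustration only, not the verification carried out in the text: the function names are hypothetical, the range of integers is an arbitrary choice, and ordinary floating point is used rather than interval arithmetic. It relies on $\sum_{m\text{ sq-free}} m^{-2} = \zeta(2)/\zeta(4) = 15/\pi^2$ to evaluate the tail.

```python
import math

def squarefree_flags(n):
    """flags[m] is True exactly when m is square-free, for 1 <= m <= n."""
    flags = [True] * (n + 1)
    flags[0] = False
    d = 2
    while d * d <= n:
        for k in range(d * d, n + 1, d * d):
            flags[k] = False
        d += 1
    return flags

def f_over_x_at_integers(n_max):
    """Return [f(n)/n for n = 1..n_max], where
    f(x) = #{m <= x sq-free} + (1/2) * sum_{m > x sq-free} x^2 / m^2.
    The tail is 15/pi^2 minus the partial sum over sq-free m <= n."""
    flags = squarefree_flags(n_max)
    total = 15.0 / math.pi ** 2
    count, head, ratios = 0, 0.0, []
    for n in range(1, n_max + 1):
        if flags[n]:
            count += 1
            head += 1.0 / (n * n)
        tail = total - head
        ratios.append((count + 0.5 * n * n * tail) / n)
    return ratios

if __name__ == "__main__":
    ratios = f_over_x_at_integers(2000)
    best = max(range(len(ratios)), key=lambda i: ratios[i])
    print("largest f(n)/n for n <= 2000 occurs at n =", best + 1)
    print("value:", ratios[best])
```

In this range the maximum is attained at $n = 2$, where $f(2)/2 = 1 + \zeta(2)/\zeta(4) - 1.25$, which is the constant displayed above.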
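We can bound the double sum simply by
\[
\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{s<y/d}\frac{1}{\sqrt{s}\,d} \le 2\sum_{d<y}\frac{\sqrt{y/d}}{d} \le 2\sqrt{y}\cdot\zeta\!\left(\frac{3}{2}\right)\prod_{p|v}\left(1-\frac{1}{p^{3/2}}\right).
\]
Alternatively, if $v = 2$, we bound
\[
\sum_{\substack{s<y/d\\ (s,v)=1}}\frac{1}{\sqrt{s}} = \sum_{\substack{s<y/d\\ s\text{ odd}}}\frac{1}{\sqrt{s}} \le 1+\frac{1}{2}\int_1^{y/d}\frac{1}{\sqrt{s}}\,ds = \sqrt{y/d}
\]
and thus
\[
\sum_{\substack{d<y\\ (d,v)=1}}\ \sum_{\substack{s<y/d\\ (s,v)=1}}\frac{1}{\sqrt{s}\,d} \le \sum_{\substack{d<y\\ (d,2)=1}}\frac{\sqrt{y/d}}{d} \le \sqrt{y}\left(1-\frac{1}{2^{3/2}}\right)\zeta\!\left(\frac{3}{2}\right).
\]

Applying Lemma 5.1.1 with $y = S/s$ and $z = x/Ws$, where $S = x/WU$, we obtain that (5.8) equals
\[
\begin{aligned}
&\frac{6x}{\pi^2W}\,\frac{v}{\sigma(v)}\sum_{\substack{s<S\\ (s,v)=1}}\frac{1}{s}\sum_{r_1<S/s}\ \sum_{\substack{r_2<S/s\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\left(1-\max\left(\frac{1}{2},\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right) \\
&\quad+ O^*\Biggl(5.04\,\zeta\!\left(\frac{3}{2}\right)^3 S\sqrt{\frac{x}{W}}\prod_{p|v}\left(1+\frac{1}{\sqrt{p}}\right)\left(1-\frac{1}{p^{3/2}}\right)^3\Biggr),
\end{aligned}
\tag{5.18}
\]
with 5.04 replaced by 1.27 if $v = 2$. The main term in (5.18) can be written as
\[
\frac{6x}{\pi^2W}\,\frac{v}{\sigma(v)}\sum_{\substack{s\le S\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{1}\ \sum_{r_1\le\frac{uS}{s}}\ \sum_{\substack{r_2\le\frac{uS}{s}\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)}\,du. \tag{5.19}
\]
As we can see, the use of an integral eliminates the unpleasant factor
\[
\left(1-\max\left(\frac{1}{2},\frac{r_1}{S/s},\frac{r_2}{S/s}\right)\right).
\]
From now on, we will focus on the cases $v = 1$ and $v = 2$ for simplicity. (Higher values of $v$ do not seem to be really profitable in the last analysis.)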
5.1.2 Explicit bounds for a sum with μ

We must estimate the expression within parentheses in (5.19). It is not too hard to show that it tends to 0; the first part of the proof of Lemma 5.1.2 will reduce this to the fact that $\sum_n \mu(n)/n = 0$. Obtaining good bounds is a more delicate matter. For our purposes, we will need the expression to converge to 0 at least as fast as $1/(\log)^2$, with a good constant in front. For this task, the bound (2.21) on $\sum_{n\le x}\mu(n)/n$ is enough.

Lemma 5.1.2. Let
\[
g_v(x) = \sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ (r_1,r_2)=1\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)},
\]
where $v = 1$ or $v = 2$. Then
\[
|g_1(x)|\le\begin{cases}
1/x & \text{if } 33\le x\le 10^6,\\
\frac{1}{x}(111.536+55.768\log x) & \text{if } 10^6\le x<10^{10},\\
\frac{0.0044325}{(\log x)^2}+\frac{0.1079}{\sqrt{x}} & \text{if } x\ge 10^{10},
\end{cases}
\]
\[
|g_2(x)|\le\begin{cases}
2.1/x & \text{if } 33\le x\le 10^6,\\
\frac{1}{x}(1634.34+817.168\log x) & \text{if } 10^6\le x<10^{10},\\
\frac{0.038128}{(\log x)^2}+\frac{0.2046}{\sqrt{x}} & \text{if } x\ge 10^{10}.
\end{cases}
\]
The proof involves what may be called a version of Rankin's trick, using Dirichlet series and the behavior of $\zeta(s)$ near $s = 1$.

Proof. We prove the statements for $x\le 10^6$ by a direct computation, using interval arithmetic. (In fact, in that range, one gets $2.0895071/x$ instead of $2.1/x$.) Assume from now on that $x > 10^6$.

Clearly,
\[
\begin{aligned}
g(x) &= \sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ (r_1r_2,v)=1}} \Biggl(\sum_{d|(r_1,r_2)}\mu(d)\Biggr)\frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\mu(d)\sum_{r_1\le x}\ \sum_{\substack{r_2\le x\\ d|(r_1,r_2)\\ (r_1r_2,v)=1}} \frac{\mu(r_1)\mu(r_2)}{\sigma(r_1)\sigma(r_2)} \\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\frac{\mu(d)}{(\sigma(d))^2}\sum_{\substack{u_1\le x/d\\ (u_1,dv)=1}}\ \sum_{\substack{u_2\le x/d\\ (u_2,dv)=1}} \frac{\mu(u_1)\mu(u_2)}{\sigma(u_1)\sigma(u_2)} \\
&= \sum_{\substack{d\le x\\ (d,v)=1}}\frac{\mu(d)}{(\sigma(d))^2}\Biggl(\sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{\sigma(r)}\Biggr)^2.
\end{aligned}
\tag{5.20}
\]
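The identity (5.20) lends itself to a direct check for small $x$. The sketch below is an illustration with hypothetical function names; it evaluates $g_v(x)$ both from the definition in Lemma 5.1.2 and from the last line of (5.20), using exact rational arithmetic so that the comparison is an equality test. It is not the interval-arithmetic computation referred to in the proof.

```python
from fractions import Fraction
from math import gcd

def tables(n):
    """Return (mu, sigma) as lists indexed 0..n (requires n >= 1):
    the Moebius function and the sum-of-divisors function."""
    sigma = [0] * (n + 1)
    for d in range(1, n + 1):
        for k in range(d, n + 1, d):
            sigma[k] += d
    mu = [0] * (n + 1)
    mu[1] = 1
    # Uses sum_{d | k} mu(d) = 0 for k > 1.
    for d in range(1, n + 1):
        for k in range(2 * d, n + 1, d):
            mu[k] -= mu[d]
    return mu, sigma

def g_direct(x, v, mu, sigma):
    """g_v(x) from the definition: coprime pairs r1, r2 <= x, both coprime to v."""
    total = Fraction(0)
    for r1 in range(1, x + 1):
        if mu[r1] == 0 or gcd(r1, v) != 1:
            continue
        for r2 in range(1, x + 1):
            if mu[r2] == 0 or gcd(r2, v) != 1 or gcd(r1, r2) != 1:
                continue
            total += Fraction(mu[r1] * mu[r2], sigma[r1] * sigma[r2])
    return total

def g_via_520(x, v, mu, sigma):
    """g_v(x) from the last line of (5.20)."""
    total = Fraction(0)
    for d in range(1, x + 1):
        if mu[d] == 0 or gcd(d, v) != 1:
            continue
        inner = Fraction(0)
        for r in range(1, x // d + 1):
            if mu[r] != 0 and gcd(r, d * v) == 1:
                inner += Fraction(mu[r], sigma[r])
        total += Fraction(mu[d], sigma[d] ** 2) * inner ** 2
    return total

if __name__ == "__main__":
    mu, sigma = tables(60)
    for x in (33, 45, 60):
        g1 = g_direct(x, 1, mu, sigma)
        print(x, float(x * abs(g1)), g1 == g_via_520(x, 1, mu, sigma))
```

The middle column prints $x\,|g_1(x)|$, which Lemma 5.1.2 asserts is at most 1 for $33\le x\le 10^6$.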
Moreover,
\[
\begin{aligned}
\sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{\sigma(r)}
&= \sum_{\substack{r\le x/d\\ (r,dv)=1}}\frac{\mu(r)}{r}\sum_{d'|r}\prod_{p|d'}\left(\frac{p}{p+1}-1\right) \\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\Biggl(\prod_{p|d'}\frac{-1}{p+1}\Biggr)\sum_{\substack{r\le x/d\\ (r,dv)=1\\ d'|r}}\frac{\mu(r)}{r} \\
&= \sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\frac{1}{d'\sigma(d')}\sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}}\frac{\mu(r)}{r}
\end{aligned}
\]
and
\[
\sum_{\substack{r\le x/dd'\\ (r,dd'v)=1}}\frac{\mu(r)}{r} = \sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}}\frac{1}{d''}\sum_{r\le x/dd'd''}\frac{\mu(r)}{r}.
\]
Hence
\[
|g(x)|\le\sum_{\substack{d\le x\\ (d,v)=1}}\frac{(\mu(d))^2}{(\sigma(d))^2}\Biggl(\sum_{\substack{d'\le x/d\\ \mu(d')^2=1\\ (d',dv)=1}}\frac{1}{d'\sigma(d')}\sum_{\substack{d''\le x/dd'\\ d''|(dd'v)^\infty}}\frac{1}{d''}\,f(x/dd'd'')\Biggr)^2, \tag{5.21}
\]
where $f(t) = \left|\sum_{r\le t}\mu(r)/r\right|$.

We intend to bound the function $f(t)$ by a linear combination of terms of the form $t^{-\delta}$, $\delta\in[0,1/2)$. Thus it makes sense now to estimate $F_v(s_1,s_2,x)$, defined to be the quantity
\[
\begin{aligned}
\sum_{\substack{d\\ (d,v)=1}}\frac{(\mu(d))^2}{(\sigma(d))^2}
&\Biggl(\sum_{\substack{d_1'\\ (d_1',dv)=1}}\frac{\mu(d_1')^2}{d_1'\sigma(d_1')}\sum_{d_1''|(dd_1'v)^\infty}\frac{1}{d_1''}\cdot(dd_1'd_1'')^{1-s_1}\Biggr) \\
\cdot&\Biggl(\sum_{\substack{d_2'\\ (d_2',dv)=1}}\frac{\mu(d_2')^2}{d_2'\sigma(d_2')}\sum_{d_2''|(dd_2'v)^\infty}\frac{1}{d_2''}\cdot(dd_2'd_2'')^{1-s_2}\Biggr)
\end{aligned}
\]
for $s_1,s_2\in[1/2,1]$. This is equal to
\[
\begin{aligned}
&\sum_{\substack{d\\ (d,v)=1}}\frac{\mu(d)^2}{d^{s_1+s_2}}\prod_{p|d}\frac{1}{(1+p^{-1})^2(1-p^{-s_1})(1-p^{-s_2})}\cdot\prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
&\quad\cdot\Biggl(\sum_{\substack{d'\\ (d',dv)=1}}\frac{\mu(d')^2}{(d')^{s_1+1}}\prod_{p'|d'}\frac{1}{(1+p'^{-1})(1-p'^{-s_1})}\Biggr)
 \cdot\Biggl(\sum_{\substack{d'\\ (d',dv)=1}}\frac{\mu(d')^2}{(d')^{s_2+1}}\prod_{p'|d'}\frac{1}{(1+p'^{-1})(1-p'^{-s_2})}\Biggr),
\end{aligned}
\]
which in turn can easily be seen to equal
\[
\begin{aligned}
&\prod_{p\nmid v}\left(1+\frac{p^{-s_1}p^{-s_2}}{(1-p^{-s_1}+p^{-1})(1-p^{-s_2}+p^{-1})}\right)\prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
&\quad\cdot\prod_{p\nmid v}\left(1+\frac{p^{-1}p^{-s_1}}{(1+p^{-1})(1-p^{-s_1})}\right)\cdot\prod_{p\nmid v}\left(1+\frac{p^{-1}p^{-s_2}}{(1+p^{-1})(1-p^{-s_2})}\right).
\end{aligned}
\tag{5.22}
\]
Now, for any $0 < x\le y\le x^{1/2} < 1$,
\[
(1+x-y)(1-xy)(1-xy^2)-(1+x)(1-y)(1-x^3) = (x-y)(y^2-x)(xy-x-1)x\le 0,
\]
and so
\[
1+\frac{xy}{(1+x)(1-y)} = \frac{(1+x-y)(1-xy)(1-xy^2)}{(1+x)(1-y)(1-xy)(1-xy^2)}\le\frac{(1-x^3)}{(1-xy)(1-xy^2)}. \tag{5.23}
\]
For any $x\le y_1,y_2<1$ with $y_1^2\le x$, $y_2^2\le x$,
\[
1+\frac{y_1y_2}{(1-y_1+x)(1-y_2+x)}\le\frac{(1-x^3)^2(1-x^4)}{(1-y_1y_2)(1-y_1y_2^2)(1-y_1^2y_2)}. \tag{5.24}
\]
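The polynomial identity behind (5.23), and the sign of its right side in the stated range, can be spot-checked with exact rational arithmetic. This is a small sketch with hypothetical function names (`lhs`, `rhs`, `sample_points`); it tests random rational points rather than proving anything, and it does not address (5.24), which the text handles by quantifier elimination.

```python
from fractions import Fraction
import random

def lhs(x, y):
    """(1+x-y)(1-xy)(1-xy^2) - (1+x)(1-y)(1-x^3)."""
    return (1 + x - y) * (1 - x * y) * (1 - x * y * y) - (1 + x) * (1 - y) * (1 - x ** 3)

def rhs(x, y):
    """(x-y)(y^2-x)(xy-x-1)x."""
    return (x - y) * (y * y - x) * (x * y - x - 1) * x

def sample_points(count, seed=0):
    """Random rational points (x, y) in (0, 1)^2 with denominator 1000."""
    rng = random.Random(seed)
    return [(Fraction(rng.randint(1, 999), 1000), Fraction(rng.randint(1, 999), 1000))
            for _ in range(count)]

if __name__ == "__main__":
    points = sample_points(500)
    identity_ok = all(lhs(x, y) == rhs(x, y) for x, y in points)
    in_range = [(x, y) for x, y in points if x <= y and y * y <= x]
    sign_ok = all(rhs(x, y) <= 0 for x, y in in_range)
    print("identity holds at all sampled points:", identity_ok)
    print("points with x <= y <= sqrt(x):", len(in_range), "sign condition:", sign_ok)
```

For $0 < x\le y\le x^{1/2} < 1$ the factors $x-y$, $y^2-x$ and $xy-x-1$ are all non-positive and $x$ is positive, which is why the product is $\le 0$.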
This can be checked as follows: multiplying by the denominators and changing variables to $x$, $s = y_1+y_2$ and $r = y_1y_2$, we obtain an inequality where the left side, quadratic on $s$ with positive leading coefficient, must be less than or equal to the right side, which is linear on $s$. The left side minus the right side can be maximal for given $x$, $r$ only when $s$ is maximal or minimal. This happens when $y_1 = y_2$ or when either $y_i = \sqrt{x}$ or $y_i = x$ for at least one of $i = 1, 2$. In each of these cases, we have reduced (5.24) to an inequality in two variables that can be proven automatically [1] by a quantifier-elimination program; the author has used QEPCAD [HB11] to do this.
Hence $F_v(s_1,s_2,x)$ is at most
\[
\begin{aligned}
&\prod_{p\nmid v}\frac{(1-p^{-3})^2(1-p^{-4})}{(1-p^{-s_1-s_2})(1-p^{-2s_1-s_2})(1-p^{-s_1-2s_2})}\cdot\prod_{p|v}\frac{1}{(1-p^{-s_1})(1-p^{-s_2})} \\
&\quad\cdot\prod_{p\nmid v}\frac{1-p^{-3}}{(1+p^{-s_1-1})(1+p^{-2s_1-1})}\prod_{p\nmid v}\frac{1-p^{-3}}{(1+p^{-s_2-1})(1+p^{-2s_2-1})} \\
&\quad= C_{v,s_1,s_2}\cdot\frac{\zeta(s_1+1)\zeta(s_2+1)\zeta(2s_1+1)\zeta(2s_2+1)}{\zeta(3)^4\zeta(4)\,(\zeta(s_1+s_2)\zeta(2s_1+s_2)\zeta(s_1+2s_2))^{-1}},
\end{aligned}
\tag{5.25}
\]
where $C_{v,s_1,s_2}$ equals 1 if $v = 1$ and
\[
\frac{(1-2^{-s_1-2s_2})(1+2^{-s_1-1})(1+2^{-2s_1-1})(1+2^{-s_2-1})(1+2^{-2s_2-1})}{(1-2^{-s_1-s_2})^{-1}(1-2^{-2s_1-s_2})^{-1}(1-2^{-s_1})(1-2^{-s_2})(1-2^{-3})^4(1-2^{-4})}
\]
if $v = 2$.

For $1\le t\le x$, (2.21) and (2.24) imply
\[
f(t)\le\begin{cases}
\sqrt{\frac{2}{t}} & \text{if } x\le 10^{10},\\[4pt]
\sqrt{\frac{2}{t}}+\frac{0.03}{\log x}\left(\frac{x}{t}\right)^{\frac{\log\log x-\log\log 10^{10}}{\log x-\log 10^{10}}} & \text{if } x>10^{10},
\end{cases}
\tag{5.26}
\]
[1] In practice, the case $y_i = \sqrt{x}$ leads to a polynomial of high degree, and quantifier elimination increases sharply in complexity as the degree increases; a stronger inequality of lower degree (with $(1-3x^3)$ instead of $(1-x^3)^2(1-x^4)$) was given to QEPCAD to prove in this case.
where we are using the fact that $\log x$ is convex-down. Note that, again by convexity,
\[
\frac{\log\log x-\log\log 10^{10}}{\log x-\log 10^{10}} < (\log t)'\big|_{t=\log 10^{10}} = \frac{1}{\log 10^{10}} = 0.0434294\ldots
\]
Obviously, $\sqrt{2/t}$ in (5.26) can be replaced by $(2/t)^{1/2-\epsilon}$ for any $\epsilon\ge 0$.

By (5.21) and (5.26),
\[
|g_v(x)|\le\left(\frac{2}{x}\right)^{1-2\epsilon}F_v(1/2+\epsilon,1/2+\epsilon,x)
\]
for $x\le 10^{10}$. We set $\epsilon = 1/\log x$ and obtain from (5.25) that
\[
\begin{aligned}
F_v(1/2+\epsilon,1/2+\epsilon,x) &\le C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon}\,\frac{\zeta(1+2\epsilon)\zeta(3/2)^4\zeta(2)^2}{\zeta(3)^4\zeta(4)} \\
&\le 55.768\cdot C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon}\cdot\left(1+\frac{\log x}{2}\right),
\end{aligned}
\tag{5.27}
\]
where we use the easy bound $\zeta(s) < 1+1/(s-1)$ obtained by $\sum n^{-s} < 1+\int_1^\infty t^{-s}\,dt$. (For sharper bounds, see [BR02].) Now
\[
C_{2,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon}\le\frac{(1-2^{-3/2-\epsilon})^2(1+2^{-3/2})^2(1+2^{-2})^2(1-2^{-1-2\epsilon})}{(1-2^{-1/2})^2(1-2^{-3})^4(1-2^{-4})}\le 14.652983,
\]
whereas $C_{1,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon} = 1$. (We are assuming $x\ge 10^6$, and so $\epsilon\le 1/(\log 10^6)$.) Hence
\[
|g_v(x)|\le\begin{cases}
\frac{1}{x}(111.536+55.768\log x) & \text{if } v=1,\\
\frac{1}{x}(1634.34+817.168\log x) & \text{if } v=2
\end{cases}
\]
for $10^6\le x<10^{10}$.

For general $x$, we must use the second bound in (5.26). Define $c = 1/(\log 10^{10})$. We see that, if $x > 10^{10}$,
\[
\begin{aligned}
|g_v(x)| &\le \frac{0.03^2}{(\log x)^2}F_1(1-c,1-c)\cdot C_{v,1-c,1-c} \\
&\quad+ 2\cdot\frac{\sqrt{2}}{\sqrt{x}}\,\frac{0.03}{\log x}F(1-c,1/2)\cdot C_{v,1-c,1/2} \\
&\quad+ \frac{1}{x}(111.536+55.768\log x)\cdot C_{v,\frac{1}{2}+\epsilon,\frac{1}{2}+\epsilon}.
\end{aligned}
\]
For $v = 1$, this gives
\[
\begin{aligned}
|g_1(x)| &\le \frac{0.0044325}{(\log x)^2}+\frac{2.1626}{\sqrt{x}\log x}+\frac{1}{x}(111.536+55.768\log x) \\
&\le \frac{0.0044325}{(\log x)^2}+\frac{0.1079}{\sqrt{x}};
\end{aligned}
\]
for $v = 2$, we obtain
\[
\begin{aligned}
|g_2(x)| &\le \frac{0.038128}{(\log x)^2}+\frac{25.607}{\sqrt{x}\log x}+\frac{1}{x}(1634.34+817.168\log x) \\
&\le \frac{0.038128}{(\log x)^2}+\frac{0.2046}{\sqrt{x}}.
\end{aligned}
\]
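5.1.3 Estimating the triple sum

We will now be able to bound the triple sum in (5.19), viz.,
\[
\sum_{\substack{s\le S\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{1} g_v(uS/s)\,du, \tag{5.28}
\]
where $g_v$ is as in Lemma 5.1.2.

As we will soon see, Lemma 5.1.2 implies that (5.28) is bounded by a constant (essentially because the integral $\int_0^{1/2} 1/(t(\log t)^2)$ converges). We must give as good a constant as we can, since it will affect the largest term in the final result.

Clearly, $g_v(R) = g_v(\lfloor R\rfloor)$. The contribution of each $g_v(m)$, $1\le m\le S$, to (5.28) is exactly $g_v(m)$ times
\[
\begin{aligned}
&\sum_{\substack{\frac{S}{m+1}<s\le\frac{S}{m}\\ (s,v)=1}}\frac{1}{s}\int_{ms/S}^{1}1\,du
 + \sum_{\substack{\frac{S}{2m}<s\le\frac{S}{m+1}\\ (s,v)=1}}\frac{1}{s}\int_{ms/S}^{(m+1)s/S}1\,du
 + \sum_{\substack{\frac{S}{2(m+1)}<s\le\frac{S}{2m}\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{(m+1)s/S}du \\
&\quad= \sum_{\substack{\frac{S}{m+1}<s\le\frac{S}{m}\\ (s,v)=1}}\left(\frac{1}{s}-\frac{m}{S}\right)
 + \sum_{\substack{\frac{S}{2m}<s\le\frac{S}{m+1}\\ (s,v)=1}}\frac{1}{S}
 + \sum_{\substack{\frac{S}{2(m+1)}<s\le\frac{S}{2m}\\ (s,v)=1}}\left(\frac{m+1}{S}-\frac{1}{2s}\right).
\end{aligned}
\tag{5.29}
\]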
Write $f(t) = 1/S$ for $S/2m < t\le S/(m+1)$, $f(t) = 0$ for $t > S/m$ or $t < S/2(m+1)$, $f(t) = 1/t - m/S$ for $S/(m+1) < t\le S/m$ and $f(t) = (m+1)/S - 1/2t$ for $S/2(m+1) < t\le S/2m$; then (5.29) equals $\sum_{n:(n,v)=1} f(n)$. By Euler-Maclaurin (second order),
\[
\begin{aligned}
\sum_n f(n) &= \int_{-\infty}^{\infty} f(x)-\frac{1}{2}B_2(x)f''(x)\,dx = \int_{-\infty}^{\infty} f(x)+O^*\left(\frac{1}{12}|f''(x)|\right)dx \\
&= \int_{-\infty}^{\infty} f(x)\,dx+\frac{1}{6}\cdot O^*\left(\left|f'\left(\frac{S}{2m}\right)\right|+\left|f'\left(\frac{S}{m+1}\right)\right|\right) \\
&= \frac{1}{2}\log\left(1+\frac{1}{m}\right)+\frac{1}{6}\cdot O^*\left(\left(\frac{2m}{S}\right)^2+\left(\frac{m+1}{S}\right)^2\right).
\end{aligned}
\tag{5.30}
\]
Similarly,
\[
\begin{aligned}
\sum_{n\text{ odd}} f(n) &= \int_{-\infty}^{\infty} f(2x+1)-\frac{1}{2}B_2(x)\frac{d^2f(2x+1)}{dx^2}\,dx \\
&= \frac{1}{2}\int_{-\infty}^{\infty} f(x)\,dx-2\int_{-\infty}^{\infty}\frac{1}{2}B_2\left(\frac{x-1}{2}\right)f''(x)\,dx \\
&= \frac{1}{2}\int_{-\infty}^{\infty} f(x)\,dx+\frac{1}{6}\int_{-\infty}^{\infty}O^*(|f''(x)|)\,dx \\
&= \frac{1}{4}\log\left(1+\frac{1}{m}\right)+\frac{1}{3}\cdot O^*\left(\left(\frac{2m}{S}\right)^2+\left(\frac{m+1}{S}\right)^2\right).
\end{aligned}
\]
We use these expressions for $m\le C_0$, where $C_0\ge 33$ is a constant to be computed later; they will give us the main term. For $m > C_0$, we use the bounds on $|g(m)|$ that Lemma 5.1.2 gives us.

(Starting now and for the rest of the paper, we will focus on the cases $v = 1$, $v = 2$ when giving explicit computational estimates. All of our procedures would allow higher values of $v$ as well, but, as will become clear much later, the gains from higher values of $v$ are offset by losses and complications elsewhere.)
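Let us estimate (5.28). Let
\[
c_{v,0} = \begin{cases}1/6 & \text{if } v=1,\\ 1/3 & \text{if } v=2,\end{cases}\qquad
c_{v,1} = \begin{cases}1 & \text{if } v=1,\\ 2.5 & \text{if } v=2,\end{cases}
\]
\[
c_{v,2} = \begin{cases}55.768 & \text{if } v=1,\\ 817.168 & \text{if } v=2,\end{cases}\qquad
c_{v,3} = \begin{cases}111.536 & \text{if } v=1,\\ 1634.34 & \text{if } v=2,\end{cases}
\]
\[
c_{v,4} = \begin{cases}0.0044325 & \text{if } v=1,\\ 0.038128 & \text{if } v=2,\end{cases}\qquad
c_{v,5} = \begin{cases}0.1079 & \text{if } v=1,\\ 0.2046 & \text{if } v=2.\end{cases}
\]
Then (5.28) equals
\[
\begin{aligned}
&\sum_{m\le C_0} g_v(m)\cdot\left(\frac{\phi(v)}{2v}\log\left(1+\frac{1}{m}\right)+O^*\left(c_{v,0}\,\frac{5m^2+2m+1}{S^2}\right)\right) \\
&\quad+ \sum_{S/10^6\le s<S/C_0}\frac{1}{s}\int_{1/2}^{1}O^*\left(\frac{c_{v,1}}{uS/s}\right)du
 + \sum_{S/10^{10}\le s<S/10^6}\frac{1}{s}\int_{1/2}^{1}O^*\left(\frac{c_{v,2}\log(uS/s)+c_{v,3}}{uS/s}\right)du \\
&\quad+ \sum_{s<S/10^{10}}\frac{1}{s}\int_{1/2}^{1}O^*\left(\frac{c_{v,4}}{(\log uS/s)^2}+\frac{c_{v,5}}{\sqrt{uS/s}}\right)du,
\end{aligned}
\]
which is
\[
\begin{aligned}
&\sum_{m\le C_0} g_v(m)\cdot\frac{\phi(v)}{2v}\log\left(1+\frac{1}{m}\right)+\sum_{m\le C_0}|g(m)|\cdot O^*\left(c_{v,0}\,\frac{5m^2+2m+1}{S^2}\right) \\
&\quad+ O^*\left(c_{v,1}\frac{\log 2}{C_0}+\frac{\log 2}{10^6}\bigl(c_{v,3}+c_{v,2}(1+\log 10^6)\bigr)+\frac{2-\sqrt{2}}{10^{10/2}}\,c_{v,5}\right) \\
&\quad+ O^*\Biggl(\sum_{s<S/10^{10}}\frac{c_{v,4}/2}{s(\log S/2s)^2}\Biggr)
\end{aligned}
\]
for $S\ge(C_0+1)$. Note that $\sum_{s<S/10^{10}} 1/(s(\log S/2s)^2)$ is at most $\int_0^{2/10^{10}} \frac{1}{t(\log t)^2}\,dt$. Now
\[
\frac{c_{v,4}}{2}\int_0^{2/10^{10}}\frac{1}{t(\log t)^2}\,dt = \frac{c_{v,4}/2}{\log(10^{10}/2)} = \begin{cases}0.00009923 & \text{if } v=1,\\ 0.000853636 & \text{if } v=2\end{cases}
\]
and
\[
\frac{\log 2}{10^6}\bigl(c_{v,3}+c_{v,2}(1+\log 10^6)\bigr)+\frac{2-\sqrt{2}}{10^5}\,c_{v,5} = \begin{cases}0.0006506 & \text{if } v=1,\\ 0.009525 & \text{if } v=2.\end{cases}
\]
For $C_0 = 10000$,
\[
\frac{\phi(v)}{v}\,\frac{1}{2}\sum_{m\le C_0} g_v(m)\cdot\log\left(1+\frac{1}{m}\right) = \begin{cases}0.362482 & \text{if } v=1,\\ 0.360576 & \text{if } v=2,\end{cases}
\]
\[
c_{v,0}\sum_{m\le C_0}|g_v(m)|(5m^2+2m+1)\le\begin{cases}6204066.5 & \text{if } v=1,\\ 15911340.1 & \text{if } v=2,\end{cases}
\]
and
\[
c_{v,1}\cdot(\log 2)/C_0 = \begin{cases}0.00006931 & \text{if } v=1,\\ 0.00017328 & \text{if } v=2.\end{cases}
\]
Thus, for $S\ge 100000$,
\[
\sum_{\substack{s\le S\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{1} g_v(uS/s)\,du\le\begin{cases}0.36393 & \text{if } v=1,\\ 0.37273 & \text{if } v=2.\end{cases}
\tag{5.31}
\]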
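The arithmetic leading to (5.31) can be re-added mechanically. The sketch below only sums the tabulated constants quoted above at $S = 100000$, $C_0 = 10000$; it does not recompute the sums over $g_v(m)$, whose values (`main`, `err`) are copied from the text. The dictionary layout and the function name `total_bound` are hypothetical.

```python
import math

# Constants quoted in the text; "main" is (phi(v)/2v) * sum g_v(m) log(1 + 1/m),
# "err" is the stated bound on c_{v,0} * sum |g_v(m)| (5 m^2 + 2 m + 1), for C_0 = 10000.
CONSTS = {
    1: dict(c1=1.0, c2=55.768, c3=111.536, c4=0.0044325, c5=0.1079,
            main=0.362482, err=6204066.5),
    2: dict(c1=2.5, c2=817.168, c3=1634.34, c4=0.038128, c5=0.2046,
            main=0.360576, err=15911340.1),
}

def total_bound(v, S=100000.0, C0=10000):
    """Sum of the main term and all error terms entering (5.31)."""
    c = CONSTS[v]
    t_err = c["err"] / S ** 2
    t1 = c["c1"] * math.log(2) / C0
    t2 = (math.log(2) / 1e6) * (c["c3"] + c["c2"] * (1 + math.log(1e6))) \
        + (2 - math.sqrt(2)) / 1e5 * c["c5"]
    t3 = (c["c4"] / 2) / math.log(1e10 / 2)
    return c["main"] + t_err + t1 + t2 + t3

if __name__ == "__main__":
    for v in (1, 2):
        print("v =", v, "total =", total_bound(v))
```

The error term in $1/S^2$ decreases as $S$ grows, so the value at $S = 100000$ is the largest one in the range $S\ge 100000$.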
For $S < 100000$, we proceed as above, but using the exact expression (5.29) instead of (5.30). Note (5.29) is of the form $f_{s,m,1}(S)+f_{s,m,2}(S)/S$, where both $f_{s,m,1}(S)$ and $f_{s,m,2}(S)$ depend only on $\lfloor S\rfloor$ (and on $s$ and $m$). Summing over $m\le S$, we obtain a bound of the form
\[
\sum_{\substack{s\le S\\ (s,v)=1}}\frac{1}{s}\int_{1/2}^{1} g_v(uS/s)\,du\le G_v(S)
\]
with
\[
G_v(S) = K_{v,1}(|S|)+K_{v,2}(|S|)/S,
\]
where $K_{v,1}(n)$ and $K_{v,2}(n)$ can be computed explicitly for each integer $n$. (For example, $G_v(S) = 1-1/S$ for $1\le S<2$ and $G_v(S) = 0$ for $S<1$.)

It is easy to check numerically that this implies that (5.31) holds not just for $S\ge 100000$ but also for $40\le S<100000$ (if $v = 1$) or $16\le S<100000$ (if $v = 2$). Using the fact that $G_v(S)$ is non-negative, we can compare $\int_1^T G_v(S)\,dS/S$ with $\log(T+1/N)$ for each $T\in[2,40]\cap\frac{1}{N}\mathbb{Z}$ ($N$ a large integer) to show, again numerically, that
\[
\int_1^T G_v(S)\frac{dS}{S}\le\begin{cases}0.3698\log T & \text{if } v=1,\\ 0.37273\log T & \text{if } v=2.\end{cases}
\tag{5.32}
\]
(We use $N = 100000$ for $v = 1$; already $N = 10$ gives us the answer above for $v = 2$. Indeed, computations suggest the better bound 0.358 instead of 0.37273; we are committed to using 0.37273 because of (5.31).)

Multiplying by $6v/\pi^2\sigma(v)$, we conclude that
\[
S_1(U,W) = \frac{x}{W}\cdot H_1\left(\frac{x}{WU}\right)+O^*\left(5.08\,\zeta(3/2)^3\frac{x^{3/2}}{W^{3/2}U}\right) \tag{5.33}
\]
if $v = 1$,
\[
S_1(U,W) = \frac{x}{W}\cdot H_2\left(\frac{x}{WU}\right)+O^*\left(1.27\,\zeta(3/2)^3\frac{x^{3/2}}{W^{3/2}U}\right) \tag{5.34}
\]
if $v = 2$, where
\[
H_1(S) = \begin{cases}\frac{6}{\pi^2}G_1(S) & \text{if } 1\le S<40,\\ 0.22125 & \text{if } S\ge 40,\end{cases}\qquad
H_2(S) = \begin{cases}\frac{4}{\pi^2}G_2(S) & \text{if } 1\le S<16,\\ 0.15107 & \text{if } S\ge 16.\end{cases}
\tag{5.35}
\]
Hence (by (5.32))
\[
\int_1^T H_v(S)\frac{dS}{S}\le\begin{cases}0.22482\log T & \text{if } v=1,\\ 0.15107\log T & \text{if } v=2;\end{cases}
\tag{5.36}
\]
moreover,
\[
H_1(S)\le\frac{3}{\pi^2},\qquad H_2(S)\le\frac{2}{\pi^2} \tag{5.37}
\]
for all $S$.
Note. There is another way to obtain cancellation on $\mu$, applicable when $(x/W) > Uq$ (as is unfortunately never the case in our main application). For this alternative to be taken, one must either apply Cauchy-Schwarz on $n$ rather than $m$ (resulting in exponential sums over $m$) or lump together all $m$ near each other and in the same congruence class modulo $q$ before applying Cauchy-Schwarz on $m$ (one can indeed do this if $\delta$ is small). We could then write
\[
\sum_{\substack{m\sim W\\ m\equiv r\bmod q}}\ \sum_{\substack{d|m\\ d>U}}\mu(d) = -\sum_{\substack{m\sim W\\ m\equiv r\bmod q}}\ \sum_{\substack{d|m\\ d\le U}}\mu(d) = -\sum_{d\le U}\mu(d)\,(W/qd+O(1))
\]
and obtain cancellation on $d$. If $Uq\ge(x/W)$, however, the error term dominates.
5.2 The sum S2: the large sieve, primes and tails

We must now bound
\[
S_2(U',W',W) = \sum_{\substack{U'<m\le\frac{x}{W}\\ (m,v)=1}}\Biggl|\sum_{W'<p\le W}(\log p)\,e(\alpha mp)\Biggr|^2 \tag{5.38}
\]
for $U' = \max(U,x/2W)$, $W' = \max(V,W/2)$. (The condition $(p,v)=1$ will be fulfilled automatically by the assumption $V > v$.)

From a modern perspective, this is clearly a case for a large sieve. It is also clear that we ought to try to apply a large sieve for sequences of prime support. What is subtler here is how to do things well for very large $q$ (i.e., $x/q$ small). This is in some sense a dual problem to that of $q$ small, but it poses additional complications; for example, it is not obvious how to take advantage of prime support for very large $q$.

As in type I, we avoid this entire issue by forbidding $q$ large and then taking advantage of the error term $\delta/x$ in the approximation $\alpha = \frac{a}{q}+\frac{\delta}{x}$. This is one of the main innovations here. Note this alternative method will allow us to take advantage of prime support.

A key situation to study is that of frequencies $\alpha_i$ clustering around given rationals $a/q$ while nevertheless keeping at a certain small distance from each other.
Lemma 5.2.1. Let $q\ge 1$. Let $\alpha_1,\alpha_2,\ldots,\alpha_k\in\mathbb{R}/\mathbb{Z}$ be of the form $\alpha_i = a_i/q+\upsilon_i$, $0\le a_i<q$, where the elements $\upsilon_i\in\mathbb{R}$ all lie in an interval of length $\upsilon>0$, and where $a_i = a_j$ implies $|\upsilon_i-\upsilon_j|>\nu>0$. Assume $\nu+\upsilon\le 1/q$. Then, for any $W,W'\ge 1$, $W'\ge W/2$,
\[
\sum_{i=1}^{k}\Biggl|\sum_{W'<p\le W}(\log p)\,e(\alpha_ip)\Biggr|^2\le\min\left(1,\frac{2q}{\phi(q)}\,\frac{1}{\log\left((q(\nu+\upsilon))^{-1}\right)}\right)\cdot\left(W-W'+\nu^{-1}\right)\sum_{W'<p\le W}(\log p)^2. \tag{5.39}
\]
Proof. For any distinct $i$, $j$, the angles $\alpha_i$, $\alpha_j$ are separated by at least $\nu$ (if $a_i = a_j$) or at least $1/q-|\upsilon_i-\upsilon_j|\ge 1/q-\upsilon\ge\nu$ (if $a_i\ne a_j$). Hence we can apply the large sieve (in the optimal $N+\delta^{-1}-1$ form due to Selberg [Sel91] and Montgomery-Vaughan [MV74]) and obtain the bound in (5.39) with 1 instead of $\min(1,\ldots)$ immediately.
We can also apply Montgomery's inequality ([Mon68], [Hux72]; see the expositions in [Mon71, pp. 27-29] and [IK04, §7.4]). This gives us that the left side of (5.39) is at most
\[
\Biggl(\sum_{\substack{r\le R\\ (r,q)=1}}\frac{(\mu(r))^2}{\phi(r)}\Biggr)^{-1}\sum_{\substack{r\le R\\ (r,q)=1}}\ \sum_{\substack{a'\bmod r\\ (a',r)=1}}\ \sum_{i=1}^{k}\Biggl|\sum_{W'<p\le W}(\log p)\,e\bigl((\alpha_i+a'/r)p\bigr)\Biggr|^2. \tag{5.40}
\]
If we add all possible fractions of the form $a'/r$, $r\le R$, $(r,q)=1$, to the fractions $a_i/q$, we obtain fractions that are separated by at least $1/qR^2$. If $\nu+\upsilon\ge 1/qR^2$, then the resulting angles $\alpha_i+a'/r$ are still separated by at least $\nu$. Thus we can apply the large sieve to (5.40); setting $R = 1/\sqrt{(\nu+\upsilon)q}$, we see that we gain a factor of
\[
\sum_{\substack{r\le R\\ (r,q)=1}}\frac{(\mu(r))^2}{\phi(r)}\ge\frac{\phi(q)}{q}\sum_{r\le R}\frac{(\mu(r))^2}{\phi(r)}\ge\frac{\phi(q)}{q}\sum_{d\le R}\frac{1}{d}\ge\frac{\phi(q)}{2q}\log\left((q(\nu+\upsilon))^{-1}\right), \tag{5.41}
\]
since $\sum_{d\le R}1/d\ge\log(R)$ for all $R\ge 1$ (integer or not).

Let us first give a bound on sums of the type of $S_2(U,V,W)$ using prime support but not the error terms (or Lemma 5.2.1). This is something that can be done very well using tools available in the literature. (Not all of these tools seem to be known as widely as they should be.) Bounds (5.42) and (5.44) are completely standard large-sieve bounds. To obtain the gain of a factor of log in (5.43), we use a lemma of Montgomery's, for whose modern proof (containing an improvement by Huxley) we refer to the standard source [IK04, Lemma 7.15]. The purpose of Montgomery's lemma is precisely to gain a factor of log in applications of the large sieve to sequences supported on the primes. To use the lemma efficiently, we apply Montgomery and Vaughan's large sieve with weights [MV73, (1.6)], rather than more common forms of the large sieve. (The idea, used in [MV73] to prove an improved version of the Brun-Titchmarsh inequality, is that Farey fractions (rationals with bounded denominator) are not equidistributed; this fact can be exploited if a large sieve with weights is used.)
Lemma 5.2.2. Let $W\ge 1$, $W'\ge W/2$. Let $\alpha = a/q+O^*(1/qQ)$, $q\le Q$. Then
\[
\sum_{A_0<m\le A_1}\Biggl|\sum_{W'<p\le W}(\log p)\,e(\alpha mp)\Biggr|^2\le\left\lceil\frac{A_1-A_0}{\min(q,\lceil Q/2\rceil)}\right\rceil\cdot(W-W'+2q)\sum_{W'<p\le W}(\log p)^2. \tag{5.42}
\]
If $q<W/2$ and $Q\ge 3.5W$, the following bound also holds:
\[
\sum_{A_0<m\le A_1}\Biggl|\sum_{W'<p\le W}(\log p)\,e(\alpha mp)\Biggr|^2\le\left\lceil\frac{A_1-A_0}{q}\right\rceil\cdot\frac{q}{\phi(q)}\,\frac{W}{\log(W/2q)}\cdot\sum_{W'<p\le W}(\log p)^2. \tag{5.43}
\]
If $A_1-A_0\le q$ and $q\le\rho Q$, $\rho\in[0,1]$, the following bound also holds:
\[
\sum_{A_0<m\le A_1}\Biggl|\sum_{W'<p\le W}(\log p)\,e(\alpha mp)\Biggr|^2\le\bigl(W-W'+q/(1-\rho)\bigr)\sum_{W'<p\le W}(\log p)^2. \tag{5.44}
\]
Proof. Let $k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil$. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $k$ consecutive integers $m_0 + 1, m_0 + 2, \dots$. For $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of at least
\[
|(a/q)(m - m')| - O^*(k/qQ) = 1/q - O^*(1/2q) \ge 1/2q .
\]
By the large sieve,
\[
\sum_{a=1}^{q} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha (m_0 + a) p) \right|^2
\le ((W - W') + 2q) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.45}
\]
We obtain (5.42) by summing over all $\lceil (A_1 - A_0)/k \rceil$ blocks.

If $A_1 - A_0 \le \varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0,1]$, we obtain (5.44) simply by applying the large sieve without splitting the interval $A_0 < m \le A_1$.

Let us now prove (5.43). We will use Montgomery's inequality, followed by Montgomery and Vaughan's large sieve with weights. An angle $a/q + a_1'/r_1$ is separated from other angles $a'/q + a_2'/r_2$ ($r_1, r_2 \le R$, $(a_i, r_i) = 1$) by at least $1/qr_1R$, rather than just $1/qR^2$. We will choose $R$ so that $qR^2 < Q$; this implies $1/Q < 1/qR^2 \le 1/qr_1R$.

By a lemma of Montgomery's [IK04, Lemma 7.15], applied (for each $1 \le a \le q$) to $S(\alpha) = \sum_n a_n e(\alpha n)$ with $a_n = \log(n)\, e(\alpha(m_0 + a)n)$ if $n$ is prime and $a_n = 0$ otherwise,
\[
\frac{1}{\phi(r)} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha(m_0 + a)p) \right|^2
\le \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left| \sum_{W' < p \le W} (\log p)\, e\!\left( \left( \alpha(m_0 + a) + \frac{a'}{r} \right) p \right) \right|^2
\tag{5.46}
\]
for each square-free $r \le W'$. We multiply both sides of (5.46) by
\[
\left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1}
\]
and sum over all $a = 0, 1, \dots, q-1$ and all square-free $r \le R$ coprime to $q$; we will later make sure that $R \le W'$. We obtain that
\[
\sum_{\substack{r \le R \\ (r,q)=1}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \frac{\mu(r)^2}{\phi(r)}
\cdot \sum_{a=1}^{q} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha(m_0 + a)p) \right|^2
\tag{5.47}
\]
is at most
\[
\sum_{\substack{r \le R \\ (r,q)=1 \\ r \text{ sq-free}}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1}
\sum_{a=1}^{q} \sum_{\substack{a' \bmod r \\ (a',r)=1}} \left| \sum_{W' < p \le W} (\log p)\, e\!\left( \left( \alpha(m_0 + a) + \frac{a'}{r} \right) p \right) \right|^2 .
\tag{5.48}
\]
We now apply the large sieve with weights [MV73, (1.6)], recalling that each angle $\alpha(m_0 + a) + a'/r$ is separated from the others by at least $1/qrR - 1/Q$; we obtain that (5.48) is at most $\sum_{W' < p \le W} (\log p)^2$. It remains to estimate the sum in the first line of (5.47). (We are following here a procedure analogous to that used in [MV73] to prove the Brun-Titchmarsh theorem.)
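Assume first that $q \le W/13.5$. Set
\[
R = \left( \sigma \frac{W}{q} \right)^{1/2},
\tag{5.49}
\]
where $\sigma = 1/2e^{2 \cdot 0.25068} = 0.30285\dots$. It is clear that $qR^2 < Q$, $q < W'$ and $R \ge 2$. Moreover, for $r \le R$,
\[
\frac{1}{Q} \le \frac{1}{3.5W} \le \frac{\sigma}{3.5} \frac{1}{\sigma W} = \frac{\sigma}{3.5} \frac{1}{qR^2} \le \frac{\sigma/3.5}{qrR} .
\]
Hence
\[
\frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1}
\le \frac{W}{2} + \frac{3}{2} \frac{qrR}{1 - \sigma/3.5}
= \frac{W}{2} + \frac{3r}{2\left(1 - \frac{\sigma}{3.5}\right) R} \cdot 2\sigma \frac{W}{2}
= \frac{W}{2} \left( 1 + \frac{3\sigma}{1 - \sigma/3.5} \frac{r}{R} \right)
< \frac{W}{2} \left( 1 + \frac{r}{R} \right)
\]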
and so
\[
\sum_{\substack{r \le R \\ (r,q)=1}} \left( \frac{W}{2} + \frac{3}{2} \left( \frac{1}{qrR} - \frac{1}{Q} \right)^{-1} \right)^{-1} \frac{\mu(r)^2}{\phi(r)}
\ge \frac{2}{W} \sum_{\substack{r \le R \\ (r,q)=1}} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)}
\ge \frac{2}{W} \frac{\phi(q)}{q} \sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} .
\]
For $R \ge 2$,
\[
\sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \log R + 0.25068 ;
\]
this is true for $R \ge 100$ by [MV73, Lemma 8] and easily verifiable numerically for $2 \le R < 100$. (It suffices to verify this for $R$ integer with $r < R$ instead of $r \le R$, as that is the worst case.)
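The numerical verification mentioned above can be sketched as follows. This is only an illustration, not the computation actually carried out for the text: the helper names `phi`, `is_squarefree` and `weighted_sum` are ours, and we follow the stated reduction to integer $R$ with the sum restricted to $r < R$, so that the integer $R = n+1$ covers the whole range $n \le R < n+1$.

```python
import math

def phi(n):
    """Euler's totient, by trial division (fine for n < 100)."""
    result, p, m = n, 2, n
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def is_squarefree(n):
    """True when mu(n)^2 = 1."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True

def weighted_sum(R):
    """Sum over square-free r < R of (1 + r/R)^(-1) / phi(r)."""
    return sum(1.0 / ((1.0 + r / R) * phi(r))
               for r in range(1, R) if is_squarefree(r))

# Integer R = 3, ..., 100 with r < R covers every real R in [2, 100).
margins = {R: weighted_sum(R) - (math.log(R) + 0.25068) for R in range(3, 101)}
all_positive = all(m > 0 for m in margins.values())
tightest_R = min(margins, key=margins.get)
print(all_positive, tightest_R, margins[tightest_R])
```

The margin is smallest at $R = 5$, which suggests that the constant $0.25068$ is dictated by that case.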
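Now
\[
\log R = \frac{1}{2} \left( \log \frac{W}{2q} + \log 2\sigma \right) = \frac{1}{2} \log \frac{W}{2q} - 0.25068 .
\]
Hence
\[
\sum_{r \le R} (1 + rR^{-1})^{-1} \frac{\mu(r)^2}{\phi(r)} > \frac{1}{2} \log \frac{W}{2q}
\]
and the statement follows.

Now consider the case $q > W/13.5$. If $q$ is even, then, in this range, inequality (5.42) is always better than (5.43), and so we are done. Assume, then, that $W/13.5 < q \le W/2$ and $q$ is odd. We set $R = 2$; clearly $qR^2 < W \le Q$ and $q < W/2 \le W'$, and so this choice of $R$ is valid. It remains to check that
\[
\frac{1}{\frac{W}{2} + \frac{3}{2} \left( \frac{1}{2q} - \frac{1}{Q} \right)^{-1}}
+ \frac{1}{\frac{W}{2} + \frac{3}{2} \left( \frac{1}{4q} - \frac{1}{Q} \right)^{-1}}
\ge \frac{1}{W} \log \frac{W}{2q} .
\]
This follows because
\[
\frac{1}{\frac{1}{2} + \frac{3}{2} \left( \frac{t}{2} - \frac{1}{3.5} \right)^{-1}}
+ \frac{1}{\frac{1}{2} + \frac{3}{2} \left( \frac{t}{4} - \frac{1}{3.5} \right)^{-1}}
\ge \log \frac{t}{2}
\]
for all $2 \le t \le 13.5$.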
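The last inequality is elementary, but it is easy to check on a grid as well. The sketch below does so; a grid check is of course not a proof, and the grid size and the name `lhs` are arbitrary choices of ours.

```python
import math

def lhs(t):
    """Left side of the final inequality, as a function of t = W/q."""
    first = 1.0 / (0.5 + 1.5 / (t / 2 - 1 / 3.5))
    second = 1.0 / (0.5 + 1.5 / (t / 4 - 1 / 3.5))
    return first + second

# 10001 equally spaced points in [2, 13.5].
grid = [2 + 11.5 * i / 10000 for i in range(10001)]
min_margin = min(lhs(t) - math.log(t / 2) for t in grid)
print(min_margin)
```

On this grid the difference between the two sides stays well away from zero, so the inequality is not delicate in this range.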
We need a version of Lemma 5.2.2 with $m$ restricted to the odd numbers, since we plan to set the parameter $v$ equal to 2.
Lemma 5.2.3. Let $W \ge 1$, $W' \ge W/2$. Let $2\alpha = a/q + O^*(1/qQ)$, $q \le Q$. Then
\[
\sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha m p) \right|^2
\le \left\lceil \frac{A_1 - A_0}{\min(2q, Q)} \right\rceil \cdot (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.50}
\]
If $q < W/2$ and $Q \ge 3.5W$, the following bound also holds:
\[
\sum_{\substack{A_0 < m \le A_1 \\ m \text{ odd}}} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha m p) \right|^2
\le \left\lceil \frac{A_1 - A_0}{2q} \right\rceil \cdot \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \cdot \sum_{W' < p \le W} (\log p)^2 .
\tag{5.51}
\]
If $A_1 - A_0 \le 2\varrho q$ and $q \le \rho Q$, $\varrho, \rho \in [0,1]$, the following bound also holds:
\[
\sum_{A_0 < m \le A_1} \left| \sum_{W' < p \le W} (\log p)\, e(\alpha m p) \right|^2
\le \left( W - W' + \frac{q}{1 - \varrho\rho} \right) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.52}
\]
Proof. We follow the proof of Lemma 5.2.2, noting the differences. Let
\[
k = \min(q, \lceil Q/2 \rceil) \ge \lceil q/2 \rceil ,
\]
just as before. We split $(A_0, A_1]$ into $\lceil (A_1 - A_0)/k \rceil$ blocks of at most $2k$ consecutive integers; any such block contains at most $k$ odd numbers. For odd $m$, $m'$ in such a block, $\alpha m$ and $\alpha m'$ are separated by a distance of
\[
|\alpha(m - m')| = \left| 2\alpha \frac{m - m'}{2} \right| = |(a/q)k| - O^*(k/qQ) \ge 1/2q .
\]
We obtain (5.50) and (5.52) just as we obtained (5.42) and (5.44) before. To obtain (5.51), proceed again as before, noting that the angles we are working with can be labelled as $\alpha(m_0 + 2a)$, $0 \le a < q$.

The idea now (for large $\delta$) is that, if $\delta$ is not negligible, then, as $m$ increases and $\alpha m$ loops around the circle $\mathbb{R}/\mathbb{Z}$, $\alpha m$ roughly repeats itself every $q$ steps, but with a slight displacement. This displacement gives rise to a configuration to which Lemma 5.2.1 is applicable. The effect is that we can apply the large sieve once instead of many times, thus leading to a gain of a large factor (essentially, the number of times the large sieve would have been used). This is how we obtain the factor of $|\delta|$ in the denominator of the main term $x/|\delta|q$ in (5.56) and (5.57).
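Proposition 5.2.4. Let $x \ge W \ge 1$, $W' \ge W/2$, $U' \ge x/2W$. Let $Q \ge 3.5W$. Let $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $|\delta/x| \le 1/qQ$, $q \le Q$. Let $S_2(U', W', W)$ be as in (5.38) with $v = 2$.

For $q \le \rho Q$, where $\rho \in [0,1]$,
\[
S_2(U', W', W) \le \left( \max(1, 2\rho) \left( \frac{x}{8q} + \frac{x}{2W} \right) + \frac{W}{2} + 2q \right) \cdot \sum_{W' < p \le W} (\log p)^2 .
\tag{5.53}
\]
If $q < W/2$,
\[
S_2(U', W', W) \le \left( \frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \right) \cdot \sum_{W' < p \le W} (\log p)^2 .
\tag{5.54}
\]
If $W > x/4q$, the following bound also holds:
\[
S_2(U', W', W) \le \left( \frac{W}{2} + \frac{q}{1 - x/4Wq} \right) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.55}
\]
If $\delta \ne 0$ and $x/4W + q \le x/|\delta|q$,
\[
S_2(U', W', W) \le \min\left( 1, \frac{2q/\phi(q)}{\log\left( \frac{x}{|\delta q|} \left( q + \frac{x}{4W} \right)^{-1} \right)} \right) \cdot \left( \frac{x}{|\delta q|} + \frac{W}{2} \right) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.56}
\]
Lastly, if $\delta \ne 0$ and $q \le \rho Q$, where $\rho \in [0,1)$,
\[
S_2(U', W', W) \le \left( \frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1-\rho)Q} + \frac{x}{4(1-\rho)W} \right) \sum_{W' < p \le W} (\log p)^2 .
\tag{5.57}
\]
The trivial bound would be in the order of
\[
S_2(U', W', W) = (x/2\log x) \sum_{W' < p \le W} (\log p)^2 .
\]
In practice, (5.55) gets applied when $W \ge x/q$.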
Proof. Let us first prove statements (5.54) and (5.53), which do not involve $\delta$. Assume first $q \le W/2$. Then, by (5.51) with $A_0 = U'$, $A_1 = x/W$,
\[
S_2(U', W', W) \le \left( \frac{x/W - U'}{2q} + 1 \right) \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \sum_{W' < p \le W} (\log p)^2 .
\]
Clearly $(x/W - U')W \le (x/2W) \cdot W = x/2$. Thus (5.54) holds.
Assume now that $q \le \rho Q$. Apply (5.50) with $A_0 = U'$, $A_1 = x/W$. Then
\[
S_2(U', W', W) \le \left( \frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1 \right) (W - W' + 2q) \sum_{W' < p \le W} (\log p)^2 .
\]
Now
\[
\begin{aligned}
\left( \frac{x/W - U'}{q \cdot \min(2, \rho^{-1})} + 1 \right) \cdot (W - W' + 2q)
&\le \left( \frac{x}{W} - U' \right) \frac{W - W'}{q \min(2, \rho^{-1})} + \max(1, 2\rho) \left( \frac{x}{W} - U' \right) + W/2 + 2q \\
&\le \frac{x/4}{q \min(2, \rho^{-1})} + \max(1, 2\rho) \frac{x}{2W} + W/2 + 2q .
\end{aligned}
\]
This implies (5.53).

If $W > x/4q$, apply (5.44) with $\varrho = x/4Wq$, $\rho = 1$. This yields (5.55).

Assume now that $\delta \ne 0$ and $x/4W + q \le x/|\delta q|$. Let $Q' = x/|\delta q|$. For any $m_1$, $m_2$ with $x/2W < m_1, m_2 \le x/W$, we have $|m_1 - m_2| \le x/2W \le 2(Q' - q)$, and so
\[
\left| \frac{m_1 - m_2}{2} \cdot \delta/x + q\delta/x \right| \le Q'|\delta|/x = \frac{1}{q} .
\tag{5.58}
\]
The conditions of Lemma 5.2.1 are thus fulfilled with $\upsilon = (x/4W) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U', W', W)$ is at most
\[
\min\left( 1, \frac{2q}{\phi(q)} \frac{1}{\log\left( (q(\nu + \upsilon))^{-1} \right)} \right) \left( W - W' + \nu^{-1} \right) \sum_{W' < p \le W} (\log p)^2 .
\]
Here $W - W' + \nu^{-1} = W - W' + x/|q\delta| \le W/2 + x/|q\delta|$ and
\[
(q(\nu + \upsilon))^{-1} = \left( q \frac{|\delta|}{x} \right)^{-1} \left( q + \frac{x}{4W} \right)^{-1} .
\]
Lastly, assume $\delta \ne 0$ and $q \le \rho Q$. We let $Q' = x/|\delta q| \ge Q$ again, and we split the range $U' < m \le x/W$ into intervals of length $2(Q' - q)$, so that (5.58) still holds within each interval. We apply Lemma 5.2.1 with $\upsilon = (Q' - q) \cdot |\delta|/x$ and $\nu = |\delta q|/x$. We obtain that $S_2(U', W', W)$ is at most
\[
\left( 1 + \frac{x/W - U}{2(Q' - q)} \right) \left( W - W' + \nu^{-1} \right) \sum_{W' < p \le W} (\log p)^2 .
\]
Here $W - W' + \nu^{-1} \le W/2 + x/q|\delta|$ as before. Moreover,
\[
\begin{aligned}
\left( \frac{W}{2} + \frac{x}{q|\delta|} \right) \left( 1 + \frac{x/W - U}{2(Q' - q)} \right)
&\le \left( \frac{W}{2} + Q' \right) \left( 1 + \frac{x/2W}{2(1-\rho)Q'} \right) \\
&\le \frac{W}{2} + Q' + \frac{x}{8(1-\rho)Q'} + \frac{x}{4W(1-\rho)} \\
&\le \frac{x}{|\delta q|} + \frac{W}{2} + \frac{x}{8(1-\rho)Q} + \frac{x}{4(1-\rho)W} .
\end{aligned}
\]
Hence (5.57) holds.
Chapter 6
Minor-arc totals
It is now time to make all of our estimates fully explicit, choose our parameters, put our type I and type II estimates together and give our final minor-arc estimates.

Let $x > 0$ be given. Starting in section 6.3.1, we will assume that $x \ge x_0 = 2.16 \cdot 10^{20}$. We will choose our main parameters $U$ and $V$ gradually, as the need arises; we assume from the start that $2 \cdot 10^6 \le V < x/4$ and $UV \le x$.

We are also given an angle $\alpha \in \mathbb{R}/\mathbb{Z}$. We choose an approximation $2\alpha = a/q + \delta/x$, $(a,q) = 1$, $q \le Q$, $|\delta/x| \le 1/qQ$. The parameter $Q$ will be chosen later; we assume from the start that $Q \ge \max(16, 2\sqrt{x})$ and $Q \ge \max(2U, x/U)$.

(Actually, $U$ and $V$ will be chosen in different ways depending on the size of $q$. Even $Q$ will depend on the size of $q$; this may seem circular, but what actually happens is the following: we will first set a value for $Q$ depending only on $x$, and, if the corresponding value of $q \le Q$ is larger than a certain parameter $y$ depending on $x$, then we reset $U$, $V$ and $Q$, and obtain a new value of $q$.)

Let $S_{I,1}$, $S_{I,2}$, $S_{II}$, $S_0$ be as in (3.9), with the smoothing function $\eta = \eta_2$ as in (3.4). (We bounded the type I sums $S_{I,1}$, $S_{I,2}$ for a general smoothing function $\eta$; it is only here that we are specifying $\eta$.)

The term $S_0$ is 0 because $V < x/4$ and $\eta_2$ is supported on $[-1/4, 1]$. We set $v = 2$.
6.1 The smoothing function

For the smoothing function $\eta_2$ in (3.4),
\[
|\eta_2|_1 = 1, \qquad |\eta_2'|_1 = 8 \log 2, \qquad |\eta_2''|_1 = 48,
\tag{6.1}
\]
as per [Tao14, (5.9)–(5.13)]. Similarly, for $\eta_{2,\rho}(t) = \log(\rho t)\eta_2(t)$, where $\rho \ge 4$,
\[
\begin{aligned}
|\eta_{2,\rho}|_1 &< \log(\rho)|\eta_2|_1 = \log(\rho), \\
|\eta_{2,\rho}'|_1 &= 2\eta_{2,\rho}(1/2) = 2\log(\rho/2)\eta_2(1/2) < (8 \log 2) \log \rho, \\
|\eta_{2,\rho}''|_1 &= 4\log(\rho/4) + |2\log\rho - 4\log(\rho/4)| + |4\log 2 - 4\log\rho| + |\log\rho - 4\log 2| + |\log\rho| < 48 \log\rho .
\end{aligned}
\tag{6.2}
\]
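The three norms in (6.1) can be recomputed directly. The sketch below assumes the explicit form $\eta_2(t) = 4\max(\log 2 - |\log 2t|, 0)$; the definition (3.4) is not reproduced in this chapter, so this formula is an assumption here, as are the helper names `eta2` and `midpoint`.

```python
import math

LOG2 = math.log(2)

def eta2(t):
    """Assumed explicit form of eta_2 from (3.4); supported on [1/4, 1]."""
    return 4 * max(LOG2 - abs(math.log(2 * t)), 0.0) if t > 0 else 0.0

def midpoint(f, a, b, n=200000):
    """Plain midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# |eta_2|_1: integral of eta_2 over its support.
l1_eta = midpoint(eta2, 0.25, 1.0)

# |eta_2'|_1: eta_2 rises on [1/4, 1/2] and falls on [1/2, 1],
# so the total variation is twice the peak value eta_2(1/2).
l1_eta_prime = 2 * eta2(0.5)

# |eta_2''|_1: eta_2' equals 4/t on (1/4, 1/2) and -4/t on (1/2, 1), so it
# jumps by 16, 16 and 4 at t = 1/4, 1/2, 1; in between, |eta_2''| = 4/t^2.
jumps = 16 + 16 + 4
smooth_part = midpoint(lambda t: 4 / t ** 2, 0.25, 1.0)
l1_eta_second = jumps + smooth_part

print(l1_eta, l1_eta_prime, l1_eta_second)
```

Under that assumption, the three values agree with $1$, $8\log 2$ and $48$ up to quadrature error.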
In the first inequality, we are using the fact that $\log(\rho t)$ is always positive (and less than $\log(\rho)$) when $t$ is in the support of $\eta_2$.

Write $\log^+ x$ for $\max(\log x, 0)$.
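6.2 Contributions of different types

6.2.1 Type I terms: $S_{I,1}$

The term $S_{I,1}$ can be handled directly by Lemma 4.2.3, with $\rho_0 = 4$ and $D = U$. (Condition (4.38) is valid thanks to (6.2).) Since $U \le Q/2$, the contribution of $S_{I,1}$ gets bounded by (4.40) and (4.41): the absolute value of $S_{I,1}$ is at most
\[
\begin{aligned}
&\frac{x}{q} \min\left( 1, \frac{c_0/\delta^2}{(2\pi)^2} \right) \left| \sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m} \log\frac{x}{mq} \right|
+ \frac{x}{q} \left| \widehat{\log\cdot\eta}(-\delta) \right| \left| \sum_{\substack{m \le U/q \\ (m,q)=1}} \frac{\mu(m)}{m} \right| \\
&+ \frac{2\sqrt{c_0 c_1}}{\pi} \left( U \log\frac{ex}{U} + \sqrt{3}\, q \log\frac{q}{c_2} + \frac{q}{2} \log\frac{q}{c_2} \log^+\frac{2U}{q} \right)
+ \frac{3c_1}{2} \frac{x}{q} \log\frac{q}{c_2} \log^+\frac{U}{c_2 x/q} \\
&+ \frac{3c_1}{2} \sqrt{\frac{2x}{c_2}} \log\frac{2x}{c_2}
+ \left( \frac{c_0}{2} - \frac{2c_0}{\pi^2} \right) \left( \frac{U^2}{4qx} \log\frac{e^{1/2}x}{U} + \frac{1}{e} \right) \\
&+ \frac{2|\eta'|_1}{\pi}\, q \max\left( 1, \log\frac{c_0 e^3 q^2}{4\pi|\eta'|_1 x} \right) \log x
+ \frac{20 c_0 c_2^{3/2}}{3\pi^2} \sqrt{2x} \log\frac{2\sqrt{e}\,x}{c_2},
\end{aligned}
\tag{6.3}
\]
where $c_0 = 31.521$ (by Lemma B.2.3), $c_1 = 1.0000028 > 1 + (8\log 2)/V \ge 1 + (8\log 2)/(x/U)$ and $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\dots$. By (2.1) (with $k = 2$), (B.17) and Lemma B.2.4,
\[
\left| \widehat{\log\cdot\eta}(-\delta) \right| \le \min\left( 2 - \log 4, \frac{24\log 2}{\pi^2\delta^2} \right) .
\]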
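By (2.20), (2.22) and (2.23), the first line of (6.3) is at most
\[
\begin{aligned}
&\frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \left( \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{U}{q^2}}, 1 \right) \log\frac{x}{U} + 1.00303 \frac{q}{\phi(q)} \right) \\
&+ \frac{x}{q} \min\left( 2 - \log 4, \frac{c_0''}{\delta^2} \right) \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{U}{q^2}}, 1 \right),
\end{aligned}
\]
where $c_0' = 0.798437 > c_0/(2\pi)^2$, $c_0'' = 1.685532$. Clearly $c_0''/c_0' > 1 > 2 - \log 4$.

Taking derivatives, we see that $t \mapsto (t/2)\log(t/c_2)\log^+(2U/t)$ takes its maximum (for $t \in [1, 2U]$) when $\log(t/c_2)\log^+(2U/t) = \log(t/c_2) - \log^+(2U/t)$; since $t \mapsto \log(t/c_2) - \log^+(2U/t)$ is increasing on $[1, 2U]$, we conclude that
\[
\frac{q}{2} \log\frac{q}{c_2} \log^+\frac{2U}{q} \le U \log\frac{2U}{c_2} .
\]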
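As a sanity check on the last inequality, one can compare both sides over a logarithmic grid of $q$. The sketch below does this for one sample value $U = 10^6$; that value, the grid, and the names `lhs`, `rhs` and `log_plus` are arbitrary illustrative choices, not parameters fixed by the text.

```python
import math

C0 = 31.521
C2 = 6 * math.pi / (5 * math.sqrt(C0))   # about 0.67147

def log_plus(y):
    """log^+ y = max(log y, 0)."""
    return max(math.log(y), 0.0)

def lhs(q, U):
    return (q / 2) * math.log(q / C2) * log_plus(2 * U / q)

def rhs(U):
    return U * math.log(2 * U / C2)

U = 1e6                                        # sample value only
qs = [10 ** (k / 100) for k in range(0, 701)]  # q from 1 to 10^7
worst_ratio = max(lhs(q, U) / rhs(U) for q in qs)
print(C2, worst_ratio)
```

The ratio stays well below 1 on this grid, which reflects the fact that the bound $U\log(2U/c_2)$ is fairly generous.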
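Similarly, $t \mapsto t\log(x/t)\log^+(U/t)$ takes its maximum at a point $t \in [0, U]$ for which $\log(x/t)\log^+(U/t) = \log(x/t) + \log^+(U/t)$, and so
\[
\frac{x}{q} \log\frac{q}{c_2} \log^+\frac{U}{c_2 x/q} \le \frac{U}{c_2} (\log x + \log U) .
\]
We conclude that
\[
\begin{aligned}
|S_{I,1}| \le{}& \frac{x}{q} \min\left( 1, \frac{c_0'}{\delta^2} \right) \left( \min\left( \frac{4q/\phi(q)}{5\log^+\frac{U}{q^2}}, 1 \right) \left( \log\frac{x}{U} + c_{3,I} \right) + c_{4,I} \frac{q}{\phi(q)} \right) \\
&+ \left( c_{7,I} \log\frac{q}{c_2} + c_{8,I} \log x \max\left( 1, \log\frac{c_{11,I}\, q^2}{x} \right) \right) q + c_{10,I} \frac{U^2}{4qx} \log\frac{e^{1/2}x}{U} \\
&+ \left( c_{5,I} \log\frac{2U}{c_2} + c_{6,I} \log xU \right) U + c_{9,I} \sqrt{x} \log\frac{2\sqrt{e}\,x}{c_2} + \frac{c_{10,I}}{e},
\end{aligned}
\tag{6.4}
\]
where $c_2$ and $c_0'$ are as above, $c_{3,I} = 2.11104 > c_0''/c_0'$, $c_{4,I} = 1.00303$, $c_{5,I} = 3.57422 > 2\sqrt{c_0 c_1}/\pi$, $c_{6,I} = 2.23389 > 3c_1/2c_2$, $c_{7,I} = 6.19072 > 2\sqrt{3c_0 c_1}/\pi$, $c_{8,I} = 3.53017 > 2(8\log 2)/\pi$,
\[
c_{9,I} = 19.1568 > \frac{3\sqrt{2}\,c_1}{2\sqrt{c_2}} + \frac{20\sqrt{2}\,c_0 c_2^{3/2}}{3\pi^2},
\]
$c_{10,I} = 9.37301 > c_0(1/2 - 2/\pi^2)$ and $c_{11,I} = 9.0857 > c_0 e^3/(4\pi \cdot 8\log 2)$.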
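Since the constants $c_{j,I}$ are all simple expressions in $c_0$, $c_1$ and $c_2$, they are easy to recompute. The sketch below does so in floating point; it is a consistency check of the decimal values only, not a rigorous (interval-arithmetic) verification of the strict inequalities. The formula $24\log 2/\pi^2$ for $c_0''$ is our reading of the bound on $|\widehat{\log\cdot\eta}(-\delta)|$ above, and the dictionary name `derived` is ours.

```python
import math

c0 = 31.521
c1 = 1.0000028
c2 = 6 * math.pi / (5 * math.sqrt(c0))
c0_prime = 0.798437
c0_dprime = 1.685532

# name: (value stated in the text, expression it is compared against)
derived = {
    "c0'":   (c0_prime, c0 / (2 * math.pi) ** 2),
    "c0''":  (c0_dprime, 24 * math.log(2) / math.pi ** 2),
    "c3,I":  (2.11104, c0_dprime / c0_prime),
    "c5,I":  (3.57422, 2 * math.sqrt(c0 * c1) / math.pi),
    "c6,I":  (2.23389, 3 * c1 / (2 * c2)),
    "c7,I":  (6.19072, 2 * math.sqrt(3 * c0 * c1) / math.pi),
    "c8,I":  (3.53017, 2 * (8 * math.log(2)) / math.pi),
    "c9,I":  (19.1568, 3 * math.sqrt(2) * c1 / (2 * math.sqrt(c2))
                       + 20 * math.sqrt(2) * c0 * c2 ** 1.5 / (3 * math.pi ** 2)),
    "c10,I": (9.37301, c0 * (0.5 - 2 / math.pi ** 2)),
    "c11,I": (9.0857, c0 * math.e ** 3 / (4 * math.pi * 8 * math.log(2))),
}

for name, (stated, computed) in derived.items():
    print(f"{name:6s} stated {stated:<10} computed {computed:.7f}")

max_gap = max(abs(stated - computed) for stated, computed in derived.values())
```

Each stated value is the corresponding expression rounded up in the last digit shown, so the gaps are all tiny.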
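6.2.2 Type I terms: $S_{I,2}$

The case $q \le Q/V$. If $q \le Q/V$, then, for $v \le V$,
\[
2v\alpha = \frac{va}{q} + O^*\left( \frac{v}{qQ} \right) = \frac{va}{q} + O^*\left( \frac{1}{q^2} \right),
\]
and so $va/q$ is a valid approximation to $2v\alpha$. (Here we are using $v$ to label an integer variable bounded above by $v \le V$; we no longer need $v$ to label the quantity in (3.10), since that has been set equal to the constant 2.) Moreover, for $Q_v = Q/v$, we see that $2v\alpha = (va/q) + O^*(1/qQ_v)$. If $\alpha = a/q + \delta/x$, then $v\alpha = va/q + \delta/(x/v)$. Now
\[
S_{I,2} = \sum_{\substack{v \le V \\ v \text{ odd}}} \Lambda(v) \sum_{\substack{m \le U \\ m \text{ odd}}} \mu(m) \sum_{n \text{ odd}} e((v\alpha) \cdot mn)\, \eta(mn/(x/v)) .
\tag{6.5}
\]
We can thus estimate $S_{I,2}$ by applying Lemma 4.2.2 to each inner double sum in (6.5). We obtain that, if $|\delta| \le 1/2c_2$, where $c_2 = 6\pi/5\sqrt{c_0}$ and $c_0 = 31.521$, then $|S_{I,2}|$ is at most
\[
\sum_{v \le V} \Lambda(v) \left( \frac{x/v}{2q_v} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \left| \sum_{\substack{m \le M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m} \right| + \frac{c_{10,I}\, q}{4x/v} \left( \frac{U}{q_v} + 1 \right)^2 \right)
\tag{6.6}
\]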
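plus
\[
\begin{aligned}
&\sum_{v \le V} \Lambda(v) \left( \frac{2\sqrt{c_0 c_+}}{\pi} U + \frac{3c_+}{2} \frac{x}{vq_v} \log^+\frac{U}{c_2 x/(vq_v)} + \frac{\sqrt{c_0 c_+}}{\pi} q_v \log^+\frac{U}{q_v/2} \right) \\
&+ \sum_{v \le V} \Lambda(v) \left( c_{8,I} \max\left( \log\frac{c_{11,I}\, q_v^2}{x/v}, 1 \right) q_v + \left( \frac{2\sqrt{3c_0 c_+}}{\pi} + \frac{3c_+}{2c_2} + \frac{55 c_0 c_2}{6\pi^2} \right) q_v \right),
\end{aligned}
\tag{6.7}
\]
where $q_v = q/(q,v)$, $M_v \in [\min(Q/2v, U), U]$ and $c_+ = 1 + (8\log 2)/(x/UV)$; if $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.6) plus
\[
\begin{aligned}
&\sum_{v \le V} \Lambda(v) \left( \frac{\sqrt{c_0 c_1}}{\pi/2} U + \frac{3c_1}{2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2U}{\frac{x/v}{|\delta| q_v}} \right) \frac{x}{Q} + \frac{35 c_0 c_2}{3\pi^2} q_v \right) \\
&+ \sum_{v \le V} \Lambda(v) \frac{\sqrt{c_0 c_1}}{\pi/2} (1+\epsilon) \min\left( \left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U \right) \left( \sqrt{3 + 2\epsilon} + \frac{\log^+\frac{2U}{\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1}}{2} \right) .
\end{aligned}
\tag{6.8}
\]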
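Write $S_V = \sum_{v \le V} \Lambda(v)/(vq_v)$. By (2.12),
\[
\begin{aligned}
S_V &\le \sum_{v \le V} \frac{\Lambda(v)}{vq} + \sum_{\substack{v \le V \\ (v,q) > 1}} \frac{\Lambda(v)}{v} \left( \frac{(q,v)}{q} - \frac{1}{q} \right) \\
&\le \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p) \left( v_p(q) + \sum_{\substack{\alpha \ge 1 \\ p^{\alpha + v_p(q)} \le V}} \frac{1}{p^\alpha} - \sum_{\substack{\alpha \ge 1 \\ p^\alpha \le V}} \frac{1}{p^\alpha} \right) \\
&\le \frac{\log V}{q} + \frac{1}{q} \sum_{p | q} (\log p)\, v_p(q) = \frac{\log Vq}{q} .
\end{aligned}
\tag{6.9}
\]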
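The bound (6.9) can be tested by brute force for small parameters. The sketch below computes $S_V$ directly from its definition, using $1/q_v = (q,v)/q$; the sample pairs $(q, V)$ and the helper names `mangoldt` and `S_V` are illustrative choices of ours, far smaller than the values of $q$ and $V$ used in the text.

```python
import math
from math import gcd

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)          # n itself is prime

def S_V(q, V):
    """Sum over v <= V of Lambda(v) / (v * q_v), where q_v = q / gcd(q, v)."""
    return sum(mangoldt(v) * gcd(q, v) / (v * q) for v in range(1, V + 1))

samples = [(q, V) for q in (6, 12, 30, 210) for V in (50, 1000)]
checks = {(q, V): (S_V(q, V), math.log(V * q) / q) for q, V in samples}
for (q, V), (value, bound) in checks.items():
    print(q, V, round(value, 5), round(bound, 5))
```

In each sample the computed sum is below $(\log Vq)/q$, as (6.9) predicts.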
This helps us to estimate (6.6). We could also use this to estimate the second term in the first line of (6.7), but, for that purpose, it will actually be wiser to use the simpler bound
\[
\sum_{v \le V} \Lambda(v) \frac{x}{vq_v} \log^+\frac{U}{c_2 x/(vq_v)} \le \sum_{v \le V} \Lambda(v) \frac{U/c_2}{e} \le \frac{1.0004}{ec_2} UV
\tag{6.10}
\]
(by (2.14) and the fact that $t\log^+(A/t)$ takes its maximum at $t = A/e$).

We bound the sum over $m$ in (6.6) by (2.20) and (2.22):
\[
\left| \sum_{\substack{m \le M_v/q \\ (m,2q)=1}} \frac{\mu(m)}{m} \right| \le \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{M_v}{2q^2}}, 1 \right) .
\]
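To bound the terms involving $(U/q_v + 1)^2$, we use
\[
\begin{aligned}
\sum_{v \le V} \Lambda(v)\, v &\le 0.5004 V^2 \quad \text{(by (2.17))}, \\
\sum_{v \le V} \Lambda(v)\, v\, (v,q)^j &\le \sum_{v \le V} \Lambda(v)\, v + V \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v) (v,q)^j, \\
\sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v) (v,q) &\le \sum_{p | q} (\log p) \sum_{1 \le \alpha \le \log_p V} p^{v_p(q)} \le \sum_{p | q} (\log p) \frac{\log V}{\log p} p^{v_p(q)} \\
&\le (\log V) \sum_{p | q} p^{v_p(q)} \le q \log V
\end{aligned}
\]
and
\[
\sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v) (v,q)^2 \le \sum_{p | q} (\log p) \sum_{1 \le \alpha \le \log_p V} p^{v_p(q) + \alpha}
\le \sum_{p | q} (\log p) \cdot 2p^{v_p(q)} \cdot p^{\log_p V} \le 2qV \log q .
\]
Using (2.14) and (6.9) as well, we conclude that (6.6) is at most
\[
\begin{aligned}
&\frac{x}{2q} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \min\left( \frac{4}{5} \frac{q/\phi(q)}{\log^+\frac{\min(Q/2V,\, U)}{2q^2}}, 1 \right) \log Vq \\
&+ \frac{c_{10,I}}{4x} \left( 0.5004 V^2 q \left( \frac{U}{q} + 1 \right)^2 + 2UVq \log V + 2U^2 V \log V \right) .
\end{aligned}
\]
Assume $Q \le 2UV/e$. Using (2.14), (6.10), (2.18) and the inequality $vq \le Vq \le Q$ (which implies $q/2 \le U/e$), we see that (6.7) is at most
\[
1.0004 \left( \left( \frac{2\sqrt{c_0 c_+}}{\pi} + \frac{3c_+}{2ec_2} \right) UV + \frac{\sqrt{c_0 c_+}}{\pi} Q \log\frac{U}{q/2} \right)
+ \left( c_{5,I_2} \max\left( \log\frac{c_{11,I}\, q^2 V}{x}, 2 \right) + c_{6,I_2} \right) Q,
\]
where $c_{5,I_2} = 3.53312 > 1.0004 \cdot c_{8,I}$ and
\[
c_{6,I_2} = 1.0004 \left( \frac{2\sqrt{3c_0 c_+}}{\pi} + \frac{3c_+}{2c_2} + \frac{55 c_0 c_2}{6\pi^2} \right) .
\tag{6.11}
\]
The expressions in (6.8) get estimated similarly. The first line of (6.8) is at most
\[
1.0004 \left( \frac{2\sqrt{c_0 c_+}}{\pi} UV + \frac{3c_+}{2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2UV|\delta|q}{x} \right) \frac{xV}{Q} + \frac{35 c_0 c_2}{3\pi^2} qV \right)
\]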
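by (2.14). Since $q \le Q/V$, we can obviously bound $qV$ by $Q$. As for the second line of (6.8):
\[
\sum_{v \le V} \Lambda(v) \min\left( \left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U \right) \cdot \frac{1}{2} \log^+\frac{2U}{\left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1}
\le \sum_{v \le V} \Lambda(v) \max_{t > 0} t \log^+\frac{U}{t} \le \sum_{v \le V} \Lambda(v) \frac{U}{e} \le \frac{1.0004}{e} UV,
\]
but
\[
\begin{aligned}
\sum_{v \le V} \Lambda(v) \min\left( \left\lfloor \frac{x/v}{|\delta| q_v} \right\rfloor + 1, 2U \right)
\le{}& \sum_{v \le \frac{x}{2U|\delta|q}} \Lambda(v) \cdot 2U + \sum_{\substack{\frac{x}{2U|\delta|q} < v \le V \\ (v,q)=1}} \Lambda(v) \frac{x/|\delta|}{vq} \\
&+ \sum_{v \le V} \Lambda(v) + \sum_{\substack{v \le V \\ (v,q) \ne 1}} \Lambda(v) \frac{x/|\delta|}{v} \left( \frac{1}{q_v} - \frac{1}{q} \right) \\
\le{}& 1.03883 \frac{x}{|\delta|q} + \frac{x}{|\delta|q} \max\left( \log V - \log\frac{x}{2U|\delta|q} + \log\frac{3}{\sqrt{2}}, 0 \right) \\
&+ 1.0004 V + \frac{x}{|\delta|} \frac{1}{q} \sum_{p | q} (\log p)\, v_p(q) \\
\le{}& \frac{x}{|\delta|q} \left( 1.03883 + \log q + \log^+\frac{6UV|\delta|q}{\sqrt{2}\,x} \right) + 1.0004 V
\end{aligned}
\]
by (2.12), (2.13), (2.14) and (2.15); we are proceeding much as in (6.9).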
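Let us collect our bounds. If $|\delta| \le 1/2c_2$, then, assuming $Q \le 2UV/e$, we conclude that $|S_{I,2}|$ is at most
\[
\begin{aligned}
&\frac{x}{2\phi(q)} \min\left( 1, \frac{c_0}{(\pi\delta)^2} \right) \min\left( \frac{4/5}{\log^+\frac{Q}{4Vq^2}}, 1 \right) \log Vq \\
&+ c_{8,I_2} \frac{x}{q} \left( \frac{UV}{x} \right)^2 \left( 1 + \frac{q}{U} \right)^2 + \frac{c_{10,I}}{2} \left( \frac{UV}{x} q \log V + \frac{U^2 V}{x} \log V \right)
\end{aligned}
\tag{6.12}
\]
plus
\[
(c_{4,I_2} + c_{9,I_2}) UV + \left( c_{10,I_2} \log\frac{U}{q} + c_{5,I_2} \max\left( \log\frac{c_{11,I}\, q^2 V}{x}, 2 \right) + c_{12,I_2} \right) \cdot Q,
\tag{6.13}
\]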
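where
\[
\begin{aligned}
c_{4,I_2} &= 3.57565(1 + \epsilon_0) > 1.0004 \cdot 2\sqrt{c_0 c_+}/\pi, \\
c_{5,I_2} &= 3.53312 > 1.0004 \cdot c_{8,I}, \\
c_{8,I_2} &= 1.17257 > \frac{c_{10,I}}{4} \cdot 0.5004, \\
c_{9,I_2} &= 0.82214(1 + 2\epsilon_0) > 3c_+ \cdot 1.0004/2ec_2, \\
c_{10,I_2} &= 1.78783\sqrt{1 + 2\epsilon_0} > 1.0004\sqrt{c_0 c_+}/\pi, \\
c_{12,I_2} &= 29.3333 + 11.902\epsilon_0 \\
&> 1.0004 \left( \frac{3}{2c_2} c_+ + \frac{2\sqrt{3c_0}}{\pi} \sqrt{c_+} + \frac{55 c_0 c_2}{6\pi^2} \right) + 1.78783(1 + \epsilon_0)\log 2 \\
&= c_{6,I_2} + c_{10,I_2} \log 2
\end{aligned}
\]
and $c_{10,I} = 9.37301$ as before. Here $\epsilon_0 = (4\log 2)/(x/UV)$, and $c_{6,I_2}$ is as in (6.11). If $|\delta| \ge 1/2c_2$, then $|S_{I,2}|$ is at most (6.12) plus
\[
\begin{aligned}
&(c_{4,I_2} + (1+\epsilon) c_{13,I_2}) UV + c_\epsilon \left( c_{14,I_2} \left( \log q + \log^+\frac{6UV|\delta|q}{\sqrt{2}\,x} \right) + c_{15,I_2} \right) \frac{x}{|\delta|q} \\
&+ c_{16,I_2} \left( 2 + \frac{1+\epsilon}{\epsilon} \log^+\frac{2UV|\delta|q}{x} \right) \frac{x}{Q/V} + c_{17,I_2} Q + c_\epsilon \cdot c_{4,I_2} V,
\end{aligned}
\tag{6.14}
\]
where
\[
\begin{aligned}
c_{13,I_2} &= 1.31541(1 + \epsilon_0) > \frac{2\sqrt{c_0 c_+}}{\pi} \cdot \frac{1.0004}{e}, \\
c_{14,I_2} &= 3.57422\sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0 c_+}}{\pi}, \\
c_{15,I_2} &= 3.71301\sqrt{1 + 2\epsilon_0} > \frac{2\sqrt{c_0 c_+}}{\pi} \cdot 1.03883, \\
c_{16,I_2} &= 1.5006(1 + 2\epsilon_0) > 1.0004 \cdot 3c_+/2, \\
c_{17,I_2} &= 25.0295 > 1.0004 \cdot 35 c_0 c_2/3\pi^2,
\end{aligned}
\]
and $c_\epsilon = (1+\epsilon)\sqrt{3 + 2\epsilon}$. We recall that $c_2 = 6\pi/5\sqrt{c_0} = 0.67147\dots$. We will choose $\epsilon \in (0,1)$ later; we also leave the task of bounding $\epsilon_0$ for later.

The case $q > Q/V$. We use Lemma 4.2.4 in this case.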
6.2.3 Type II terms

As we showed in (5.1)–(5.5), $S_{II}$ (given in (5.1)) is at most
\[
4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} + 4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W},
\tag{6.15}
\]
where $S_1$, $S_2$ and $S_3$ are as in (5.4) and (5.5). We bounded $S_1$ in (5.33) and (5.34), $S_2$ in Prop. 5.2.4 and $S_3$ in (5.5).
Let us try to give some structure to the bookkeeping we must now inevitably do. The second integral in (6.15) will be negligible (because $S_3$ is); let us focus on the first integral.

Thanks to our work in §5.1, the term $S_1(U,W)$ is bounded by a (small) constant times $x/W$. (This represents a gain of several factors of log with respect to the trivial bound.) We bounded $S_2(U,V,W)$ using the large sieve; we expected, and got, a bound that is better than trivial by a factor of size roughly $\sqrt{q}\log x$; the exact factor in the bound depends on the value of $W$. In particular, it is only in the central part of the range for $W$ that we will really be able to save a factor of $\sqrt{q}\log x$, as opposed to just $\sqrt{q}$. We will have to be slightly clever in order to get a good total bound in the end.

We first recall our estimate for $S_1$. In the whole range $[V, x/U]$ for $W$, we know from (5.33), (5.34) and (5.37) that $S_1(U,W)$ is at most
\[
\frac{2}{\pi^2} \frac{x}{W} + \kappa_0 \zeta(3/2)^3 \frac{x}{W} \sqrt{\frac{x/WU}{U}},
\tag{6.16}
\]
where $\kappa_0 = 1.27$. (We recall we are working with $v = 2$.)

We have better estimates for the constant in front in some parts of the range; in what is usually the main part, (5.34) and (5.36) give us a constant of $0.15107$ instead of $2/\pi^2$. Note that $1.27\zeta(3/2)^3 = 22.6417\dots$. We should choose $U$, $V$ so that the first term in (6.16) dominates. For the time being, assume only
\[
U \ge 5 \cdot 10^5 \frac{x}{VU} ;
\tag{6.17}
\]
then (6.16) gives
\[
S_1(U,W) \le \kappa_1 \frac{x}{W},
\tag{6.18}
\]
where
\[
\kappa_1 = \frac{2}{\pi^2} + \frac{22.6418}{\sqrt{10^6/2}} \le 0.2347 .
\]
This will suffice for our cruder estimates.

The second integral in (6.15) is now easy to bound. By (5.5),
\[
S_3(W) \le 1.0171x + 2.0341W \le 1.0172x,
\]
since $W \le x/U \le x/5 \cdot 10^5$. Hence
\[
4 \int_V^{x/U} \sqrt{S_1(U,W) \cdot S_3(W)}\, \frac{dW}{W} \le 4 \int_V^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot 1.0172x}\, \frac{dW}{W} \le \kappa_9 \frac{x}{\sqrt{V}},
\]
where $\kappa_9 = 8 \cdot \sqrt{1.0172 \cdot \kappa_1} \le 3.9086$.
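The numerical constants $\kappa_1$ and $\kappa_9$ follow from $\kappa_0$ and $\zeta(3/2)$ by direct arithmetic, which the sketch below reproduces in floating point. The decimal value used for $\zeta(3/2)$ is hard-coded here as an assumption (it is not quoted in the text), and the variable names are ours.

```python
import math

ZETA_3_2 = 2.6123753486854883   # zeta(3/2), hard-coded decimal value
kappa0 = 1.27

# kappa_0 * zeta(3/2)^3, quoted in the text as 22.6417...
tail_constant = kappa0 * ZETA_3_2 ** 3

# Under (6.17), sqrt((x/WU)/U) <= 1/sqrt(10^6/2) for all W >= V.
kappa1 = 2 / math.pi ** 2 + 22.6418 / math.sqrt(1e6 / 2)

# 4 * sqrt(1.0172 * kappa_1) times the integral of W^(-3/2) over [V, infinity),
# which is 2/sqrt(V), gives the factor 8 in kappa_9.
kappa9 = 8 * math.sqrt(1.0172 * kappa1)

print(tail_constant, kappa1, kappa9)
```

The printed values fall just below the rounded constants $22.6418$, $0.2347$ and $3.9086$ used above.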
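Let us now examine $S_2$, which was bounded in Prop. 5.2.4. We set the parameters $W'$, $U'$ as follows, in accordance with (5.4):
\[
W' = \max(V, W/2), \qquad U' = \max(U, x/2W) .
\]
Since $W' \ge W/2$ and $W \ge V > 117$, we can always bound
\[
\sum_{W' < p \le W} (\log p)^2 \le \frac{1}{2} W (\log W)
\tag{6.19}
\]
by (2.19).

Bounding $S_2$ for $\delta$ arbitrary. We set
\[
W_0 = \min(\max(2\theta q, V), x/U),
\]
where $\theta \ge e$ is a parameter that will be set later.

For $V \le W < W_0$, we use the bound (5.53):
\[
\begin{aligned}
S_2(U', W', W) &\le \left( \max(1, 2\rho) \left( \frac{x}{8q} + \frac{x}{2W} \right) + \frac{W}{2} + 2q \right) \cdot \frac{1}{2} W (\log W) \\
&\le \max\left( \frac{1}{2}, \rho \right) \left( \frac{W}{8q} + \frac{1}{2} \right) x \log W + \frac{W^2 \log W}{4} + qW \log W,
\end{aligned}
\]
where $\rho = q/Q$.

If $W_0 > V$, the contribution of the terms with $V \le W < W_0$ to (6.15) is (by (6.18)) bounded by
\[
\begin{aligned}
&4 \int_V^{W_0} \sqrt{\kappa_1 \frac{x}{W} \left( \frac{\rho_0}{4} \left( \frac{W}{4q} + 1 \right) x \log W + \frac{W^2 \log W}{4} + qW \log W \right)}\, \frac{dW}{W} \\
&\le \frac{\kappa_2}{2} \sqrt{\rho_0}\, x \int_V^{W_0} \frac{\sqrt{\log W}}{W^{3/2}}\, dW + \frac{\kappa_2}{2} \sqrt{x} \int_V^{W_0} \frac{\sqrt{\log W}}{W^{1/2}}\, dW
+ \kappa_2 \sqrt{\frac{\rho_0 x^2}{16q} + qx} \int_V^{W_0} \frac{\sqrt{\log W}}{W}\, dW \\
&\le \left( \kappa_2 \sqrt{\rho_0} \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{xW_0} \right) \sqrt{\log W_0}
+ \frac{2\kappa_2}{3} \sqrt{\frac{\rho_0 x^2}{16q} + qx} \left( (\log W_0)^{3/2} - (\log V)^{3/2} \right),
\end{aligned}
\tag{6.20}
\]
where $\rho_0 = \max(1, 2\rho)$ and
\[
\kappa_2 = 4\sqrt{\kappa_1} \le 1.93768 .
\]
(We are using the easy bound $\sqrt{a + b + c} \le \sqrt{a} + \sqrt{b} + \sqrt{c}$.)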
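We now examine the terms with $W \ge W_0$. If $2\theta q > x/U$, then $W_0 = x/U$, the contribution of the case is nil, and the computations below can be ignored. Thus, we can assume that $2\theta q \le x/U$.

We use (5.54):
\[
S_2(U', W', W) \le \left( \frac{x}{4\phi(q)} \frac{1}{\log(W/2q)} + \frac{q}{\phi(q)} \frac{W}{\log(W/2q)} \right) \cdot \frac{1}{2} W \log W .
\]
By $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$, we can take out the $q/\phi(q) \cdot W/\log(W/2q)$ term and estimate its contribution on its own; it is at most
\[
\begin{aligned}
4 \int_{W_0}^{x/U} \sqrt{\kappa_1 \frac{x}{W} \cdot \frac{q}{\phi(q)} \cdot \frac{1}{2} W^2 \frac{\log W}{\log W/2q}}\, \frac{dW}{W}
&= \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{x \log W}{W \log W/2q}}\, dW \\
&\le \frac{\kappa_2}{\sqrt{2}} \sqrt{\frac{qx}{\phi(q)}} \int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW .
\end{aligned}
\tag{6.21}
\]
Now
\[
\int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \sqrt{\frac{\log 2q}{\log W/2q}}\, dW = \sqrt{2q \log 2q} \int_{\max(\theta, V/2q)}^{x/2Uq} \frac{1}{\sqrt{t \log t}}\, dt .
\]
We bound this last integral somewhat crudely: for $T \ge e$,
\[
\int_e^T \frac{1}{\sqrt{t \log t}}\, dt \le 2.3 \sqrt{\frac{T}{\log T}} .
\tag{6.22}
\]
(This is shown as follows: since
\[
\frac{1}{\sqrt{T \log T}} < \left( 2.3 \sqrt{\frac{T}{\log T}} \right)'
\]
if and only if $T > T_0$, where $T_0 = e^{(1 - 2/2.3)^{-1}} = 2135.94\dots$, it is enough to check (numerically) that (6.22) holds for $T = T_0$.) Since $\theta \ge e$, this gives us that
\[
\int_{W_0}^{x/U} \frac{1}{\sqrt{W}} \left( 1 + \sqrt{\frac{\log 2q}{\log W/2q}} \right) dW
\le 2\sqrt{\frac{x}{U}} + 2.3 \sqrt{2q \log 2q} \cdot \sqrt{\frac{x/2Uq}{\log x/2Uq}}
\]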
and so (6.21) is at most
\[
\sqrt{2}\,\kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} .
\]
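The numerical check of (6.22) at $T = T_0$ can be sketched as follows. We substitute $t = e^s$, so that the integral becomes $\int_1^{\log T_0} e^{s/2} s^{-1/2}\, ds$, and apply Simpson's rule; the quadrature rule, the number of steps and the function names are arbitrary choices of ours, and a floating-point quadrature is of course not a rigorous bound.

```python
import math

def integrand(s):
    """After t = e^s, dt / sqrt(t log t) becomes e^(s/2) / sqrt(s) ds."""
    return math.exp(s / 2) / math.sqrt(s)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

log_T0 = 1 / (1 - 2 / 2.3)
T0 = math.exp(log_T0)

lhs = simpson(integrand, 1.0, log_T0)     # integral from e to T0
rhs = 2.3 * math.sqrt(T0 / log_T0)
print(T0, lhs, rhs)
```

At $T_0$ the two sides differ by well under one percent, so the constant $2.3$ in (6.22) leaves little room to spare.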
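We are left with what will usually be the main term, viz.,
\[
4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \left( \frac{x}{8\phi(q)} \frac{\log W}{\log W/2q} \right) W}\, \frac{dW}{W},
\tag{6.23}
\]
which, by (5.34), is at most $x/\sqrt{\phi(q)}$ times the integral of
\[
\frac{1}{W} \sqrt{\left( 2H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log W/2q}}
\]
for $W$ going from $W_0$ to $x/U$, where $H_2$ is as in (5.35) and
\[
\kappa_4 = 4\kappa_0 \zeta(3/2)^3 \le 90.5671 .
\]
By the arithmetic/geometric mean inequality, the integrand is at most $1/W$ times
\[
\frac{\beta + \beta^{-1} \cdot 2H_2(x/WU)}{2} + \frac{\beta^{-1}}{2} \frac{\kappa_4}{2} \sqrt{\frac{x/WU}{U}} + \frac{\beta}{2} \frac{\log 2q}{\log W/2q}
\tag{6.24}
\]
for any $\beta > 0$. We will choose $\beta$ later.

The first summand in (6.24) gives what we can think of as the main or worst term in the whole paper; let us compute it first. The integral is
\[
\int_{W_0}^{x/U} \frac{\beta + \beta^{-1} \cdot 2H_2(x/WU)}{2}\, \frac{dW}{W} = \int_1^{x/UW_0} \frac{\beta + \beta^{-1} \cdot 2H_2(s)}{2}\, \frac{ds}{s}
\le \left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \log\frac{x}{UW_0}
\tag{6.25}
\]
by (5.36), where $\kappa_6 = 0.60428$. Thus the main term is simply
\[
\left( \frac{\beta}{2} + \frac{\kappa_6}{4\beta} \right) \frac{x}{\sqrt{\phi(q)}} \log\frac{x}{UW_0} .
\tag{6.26}
\]
The integral of the second summand is at most
\[
\beta^{-1} \cdot \frac{\kappa_4}{4} \frac{\sqrt{x}}{U} \int_V^{x/U} \frac{dW}{W^{3/2}} \le \beta^{-1} \cdot \frac{\kappa_4}{2} \sqrt{\frac{x/UV}{U}} .
\]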
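By (6.17), this is at most
\[
\frac{\beta^{-1}}{\sqrt{2}} \cdot 10^{-3} \cdot \kappa_4 \le \beta^{-1} \kappa_7/2,
\]
where
\[
\kappa_7 = \frac{\sqrt{2}\,\kappa_4}{1000} \le 0.1281 .
\]
Thus the contribution of the second summand is at most
\[
\frac{\beta^{-1} \kappa_7}{2} \cdot \frac{x}{\sqrt{\phi(q)}} .
\]
The integral of the third summand in (6.24) is
\[
\frac{\beta}{2} \int_{W_0}^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W} .
\tag{6.27}
\]
If $V < 2\theta q \le x/U$, this is
\[
\frac{\beta}{2} \int_{2\theta q}^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W} = \frac{\beta}{2} \log 2q \cdot \int_\theta^{x/2Uq} \frac{1}{\log t}\, \frac{dt}{t}
= \frac{\beta}{2} \log 2q \cdot \left( \log\log\frac{x}{2Uq} - \log\log\theta \right) .
\]
If $2\theta q > x/U$, the integral is over an empty range and its contribution is hence 0.

If $2\theta q \le V$, (6.27) is
\[
\begin{aligned}
\frac{\beta}{2} \int_V^{x/U} \frac{\log 2q}{\log W/2q}\, \frac{dW}{W} &= \frac{\beta \log 2q}{2} \int_{V/2q}^{x/2Uq} \frac{1}{\log t}\, \frac{dt}{t} \\
&= \frac{\beta \log 2q}{2} \cdot \left( \log\log\frac{x}{2Uq} - \log\log V/2q \right) \\
&= \frac{\beta \log 2q}{2} \cdot \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) .
\end{aligned}
\tag{6.28}
\]
(Let us stop for a moment and ask ourselves when this will be smaller than what we can see as the main term, namely, the term $(\beta/2)\log x/UW_0$ in (6.25). Clearly, $\log(1 + (\log x/UV)/(\log V/2q)) \le (\log x/UV)/(\log V/2q)$, and that is smaller than $(\log x/UV)/\log 2q$ when $V/2q > 2q$. Of course, it does not actually matter if (6.28) is smaller than the term from (6.25) or not, since we are looking for upper bounds here, not for asymptotics.)

The total bound for (6.23) is thus
\[
\frac{x}{\sqrt{\phi(q)}} \cdot \left( \beta \cdot \left( \frac{1}{2} \log\frac{x}{UW_0} + \frac{\Phi}{2} \right) + \beta^{-1} \left( \frac{1}{4} \kappa_6 \log\frac{x}{UW_0} + \frac{\kappa_7}{2} \right) \right),
\tag{6.29}
\]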
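where
\[
\Phi = \begin{cases}
\log 2q \left( \log\log\frac{x}{2Uq} - \log\log\theta \right) & \text{if } V/2\theta < q < x/(2\theta U), \\
\log 2q \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) & \text{if } q \le V/2\theta .
\end{cases}
\tag{6.30}
\]
Choosing $\beta$ optimally, we obtain that (6.23) is at most
\[
\frac{x}{\sqrt{2\phi(q)}} \sqrt{\left( \log\frac{x}{UW_0} + \Phi \right) \left( \kappa_6 \log\frac{x}{UW_0} + 2\kappa_7 \right)},
\tag{6.31}
\]
where $\Phi$ is as in (6.30).

Bounding $S_2$ for $|\delta| \ge 8$. Let us see how much a non-zero $\delta$ can help us. It makes sense to apply (5.56) only when $|\delta| \ge 8$; otherwise (5.54) is almost certainly better. Now, by definition, $|\delta|/x \le 1/qQ$, and so $|\delta| \ge 8$ can happen only when $q \le x/8Q$.

With this in mind, let us apply (5.56), assuming $|\delta| > 8$. Note first that
\[
\frac{x}{|\delta q|} \left( q + \frac{x}{4W} \right)^{-1} \ge \frac{1/|\delta q|}{\frac{q}{x} + \frac{1}{4W}} \ge \frac{4/|\delta q|}{\frac{1}{2Q} + \frac{1}{W}}
\ge \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{W}{2Q}} \ge \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} .
\]
This is at least $2\min(2Q, W)/|\delta q|$. Thus we are allowed to apply (5.56) when $|\delta q| \le 2\min(2Q, W)$. Since $Q \ge x/U$, we know that $\min(2Q, W) = W$ for all $W \le x/U$, and so it is enough to assume that $|\delta q| \le 2W$. We will soon be making a stronger assumption.

Recalling also (6.19), we see that (5.56) gives us
\[
S_2(U', W', W) \le \min\left( 1, \frac{2q/\phi(q)}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \right) \left( \frac{x}{|\delta q|} + \frac{W}{2} \right) \cdot \frac{1}{2} W (\log W) .
\tag{6.32}
\]
Similarly to before, we define $W_0 = \max(V, \theta|\delta q|)$, where $\theta \ge 3e^2/8$ will be set later. (Here $\theta \ge 3e^2/8$ is an assumption we do not yet need, but we will be using it soon to simplify matters slightly.) For $W \ge W_0$, we certainly have $|\delta q| \le 2W$. Hence the part of the first term of (6.15) coming from the range $W_0 \le W < x/U$ is
\[
\begin{aligned}
&4 \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot S_2(U,V,W)}\, \frac{dW}{W} \\
&\le 4 \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{S_1(U,W) \cdot \frac{\log W}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)} \left( \frac{Wx}{|\delta q|} + \frac{W^2}{2} \right)}\, \frac{dW}{W} .
\end{aligned}
\tag{6.33}
\]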
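By (5.34), the contribution of the term $Wx/|\delta q|$ to (6.33) is at most
\[
\frac{4x}{\sqrt{|\delta|\phi(q)}} \int_{W_0}^{x/U} \sqrt{\left( H_2\left( \frac{x}{WU} \right) + \frac{\kappa_4}{4} \sqrt{\frac{x/WU}{U}} \right) \frac{\log W}{\log\left( \frac{4W}{|\delta|q} \cdot \frac{1}{1 + \frac{x/U}{2Q}} \right)}}\, \frac{dW}{W} .
\]
Note that $1 + (x/U)/2Q \le 3/2$. Proceeding as in (6.23)–(6.31), we obtain that this is at most
\[
\frac{2x}{\sqrt{|\delta|\phi(q)}} \sqrt{\left( \log\frac{x}{UW_0} + \Phi \right) \left( \kappa_6 \log\frac{x}{UW_0} + 2\kappa_7 \right)},
\]
where
\[
\Phi = \begin{cases}
\log\frac{(1+\epsilon_1)|\delta q|}{4} \log\left( 1 + \frac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}} \right) & \text{if } |\delta q| \le V/\theta, \\
\log\frac{3|\delta q|}{8} \left( \log\log\frac{8x}{3U|\delta q|} - \log\log\frac{8\theta}{3} \right) & \text{if } V/\theta < |\delta q| \le x/\theta U,
\end{cases}
\tag{6.34}
\]
where $\epsilon_1 = x/2UQ$. This is what we think of as the main term.

By (6.18), the contribution of the term $W^2/2$ to (6.33) is at most
\[
4 \sqrt{\frac{q}{\phi(q)}} \int_{W_0}^{x/U} \sqrt{\frac{\kappa_1}{2} x}\, \frac{dW}{\sqrt{W}} \cdot \max_{W_0 \le W \le \frac{x}{U}} \sqrt{\frac{\log W}{\log\frac{8W}{3|\delta q|}}} .
\tag{6.35}
\]
Since $t \mapsto (\log t)/(\log t/c)$ is decreasing for $t > c$, (6.35) is at most
\[
4\sqrt{2\kappa_1} \sqrt{\frac{q}{\phi(q)}} \left( \frac{x}{\sqrt{U}} - \sqrt{xW_0} \right) \sqrt{\frac{\log W_0}{\log\frac{8W_0}{3|\delta q|}}} .
\tag{6.36}
\]
If $W_0 > V$, we also have to consider the range $V \le W < W_0$. By Prop. 5.2.4 and (6.19), the part of (6.15) coming from this is
\[
4 \int_V^{\theta|\delta q|} \sqrt{S_1(U,W) \cdot (\log W) \left( \frac{Wx}{2|\delta q|} + \frac{W^2}{4} + \frac{Wx}{16(1-\rho)Q} + \frac{x}{8(1-\rho)} \right)}\, \frac{dW}{W} .
\]
The contribution of $W^2/4$ is at most
\[
4 \int_V^{W_0} \sqrt{\kappa_1 \frac{x}{W} \log W \cdot \frac{W^2}{4}}\, \frac{dW}{W} \le 4\sqrt{\kappa_1} \cdot \sqrt{xW_0} \cdot \sqrt{\log W_0} ;
\]
the sum of this and (6.36) is at most
\[
4\sqrt{\kappa_1} \left( \sqrt{\frac{2q}{\phi(q)}} \left( \frac{x}{\sqrt{U}} - \sqrt{xW_0} \right) \sqrt{\frac{\log W_0}{\log 8\theta/3}} + \sqrt{xW_0} \sqrt{\log W_0} \right)
\le \kappa_2 \cdot \sqrt{\frac{q}{\phi(q)}} \frac{x}{\sqrt{U}} \sqrt{\log W_0},
\]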
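where we use the facts that $W_0 = \theta|\delta q|$ (by $W_0 > V$) and $\theta \ge 3e^2/8$, and where we recall that $\kappa_2 = 4\sqrt{\kappa_1}$.

The terms $Wx/2|\delta|q$ and $Wx/(16(1-\rho)Q)$ contribute at most
\[
\begin{aligned}
&4\sqrt{\kappa_1} \int_V^{\theta|\delta q|} \sqrt{\frac{x}{W} \cdot (\log W)\, W \left( \frac{x}{2|\delta q|} + \frac{x}{16(1-\rho)Q} \right)}\, \frac{dW}{W} \\
&= \kappa_2 x \left( \frac{1}{\sqrt{2|\delta|q}} + \frac{1}{4\sqrt{(1-\rho)Q}} \right) \int_V^{\theta|\delta q|} \sqrt{\log W}\, \frac{dW}{W} \\
&= \frac{2\kappa_2}{3} x \left( \frac{1}{\sqrt{2|\delta|q}} + \frac{1}{4\sqrt{(1-\rho)Q}} \right) \left( (\log\theta|\delta|q)^{3/2} - (\log V)^{3/2} \right) .
\end{aligned}
\]
The term $x/8(1-\rho)$ contributes
\[
\sqrt{2\kappa_1}\, x \int_V^{\theta|\delta q|} \sqrt{\frac{\log W}{W(1-\rho)}}\, \frac{dW}{W} \le \frac{\sqrt{2\kappa_1}\, x}{\sqrt{1-\rho}} \int_V^\infty \frac{\sqrt{\log W}}{W^{3/2}}\, dW
\le \frac{\kappa_2 x}{\sqrt{2(1-\rho)V}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right),
\]
where we use the estimate
\[
\begin{aligned}
\int_V^\infty \frac{\sqrt{\log W}}{W^{3/2}}\, dW &= \frac{1}{\sqrt{V}} \int_1^\infty \frac{\sqrt{\log u + \log V}}{u^{3/2}}\, du \\
&\le \frac{1}{\sqrt{V}} \int_1^\infty \frac{\sqrt{\log V}}{u^{3/2}}\, du + \frac{1}{\sqrt{V}} \int_1^\infty \frac{1}{2\sqrt{\log V}} \frac{\log u}{u^{3/2}}\, du \\
&= 2 \frac{\sqrt{\log V}}{\sqrt{V}} + \frac{1}{2\sqrt{V \log V}} \cdot 4 \le \frac{2}{\sqrt{V}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right) .
\end{aligned}
\]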
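The last estimate can be compared with a direct numerical evaluation. In the sketch below we substitute $W = V/s^2$, which turns the integral into $(2/\sqrt{V}) \int_0^1 \sqrt{\log V + 2\log(1/s)}\, ds$, and use a midpoint rule; the substitution, the rule and the sample values of $V$ are our own choices for illustration.

```python
import math

def tail_integral(V, n=200000):
    """Integral of sqrt(log W) / W^(3/2) over [V, infinity), via W = V / s^2."""
    L = math.log(V)
    h = 1.0 / n
    inner = h * sum(math.sqrt(L - 2 * math.log((i + 0.5) * h)) for i in range(n))
    return 2 / math.sqrt(V) * inner

def stated_bound(V):
    """Right-hand side of the estimate in the text."""
    L = math.log(V)
    return 2 / math.sqrt(V) * (math.sqrt(L) + math.sqrt(1 / L))

ratios = {V: tail_integral(V) / stated_bound(V) for V in (2e6, 1e8)}
print(ratios)
```

For $V \ge 2 \cdot 10^6$ the bound overshoots the integral by less than one percent, so little is lost at this step.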
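It is time to collect all type II terms. Let us start with the case of general $\delta$. We will set $\theta \ge e$ later. If $q \le V/2\theta$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{\left( \log\frac{x}{UV} + \log 2q \log\left( 1 + \frac{\log x/UV}{\log V/2q} \right) \right) \left( \kappa_6 \log\frac{x}{UV} + 2\kappa_7 \right)} \\
&+ \sqrt{2}\,\kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} + \kappa_9 \frac{x}{\sqrt{V}} .
\end{aligned}
\tag{6.37}
\]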
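If $V/2\theta < q \le x/2\theta U$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{x}{\sqrt{2\phi(q)}} \cdot \sqrt{\left( \log\frac{x}{U \cdot 2\theta q} + \log 2q \log\frac{\log x/2Uq}{\log\theta} \right) \left( \kappa_6 \log\frac{x}{U \cdot 2\theta q} + 2\kappa_7 \right)} \\
&+ \sqrt{2}\,\kappa_2 \sqrt{\frac{q}{\phi(q)}} \left( 1 + 1.15 \sqrt{\frac{\log 2q}{\log x/2Uq}} \right) \frac{x}{\sqrt{U}} + \left( \kappa_2 \sqrt{\log 2\theta q} + \kappa_9 \right) \frac{x}{\sqrt{V}} \\
&+ \frac{\kappa_2}{6} \left( (\log 2\theta q)^{3/2} - (\log V)^{3/2} \right) \frac{x}{\sqrt{q}} \\
&+ \kappa_2 \left( \sqrt{2\theta \cdot \log 2\theta q} + \frac{2}{3} \left( (\log 2\theta q)^{3/2} - (\log V)^{3/2} \right) \right) \sqrt{qx},
\end{aligned}
\tag{6.38}
\]
where we use the fact that $Q \ge x/U$ (implying that $\rho_0 = \max(1, 2q/Q)$ equals 1 for $q \le x/2U$). Finally, if $q > x/2\theta U$,
\[
\begin{aligned}
|S_{II}| \le{}& \left( \kappa_2 \sqrt{2\log x/U} + \kappa_9 \right) \frac{x}{\sqrt{V}} + \kappa_2 \sqrt{\log x/U}\, \frac{x}{\sqrt{U}} \\
&+ \frac{2\kappa_2}{3} \left( (\log x/U)^{3/2} - (\log V)^{3/2} \right) \left( \frac{x}{2\sqrt{2q}} + \sqrt{qx} \right) .
\end{aligned}
\tag{6.39}
\]
Now let us examine the alternative bounds for $|\delta| \ge 8$. Here we assume $\theta \ge 3e^2/8$. If $|\delta q| \le V/\theta$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{2x}{\sqrt{|\delta|\phi(q)}} \sqrt{\log\frac{x}{UV} + \log\frac{|\delta q|(1+\epsilon_1)}{4} \log\left( 1 + \frac{\log x/UV}{\log\frac{4V}{|\delta|(1+\epsilon_1)q}} \right)}
\cdot \sqrt{\kappa_6 \log\frac{x}{UV} + 2\kappa_7} \\
&+ \kappa_2 \sqrt{\frac{2q}{\phi(q)}} \cdot \sqrt{\frac{\log V}{\log 2V/|\delta q|}} \cdot \frac{x}{\sqrt{U}} + \kappa_9 \frac{x}{\sqrt{V}},
\end{aligned}
\tag{6.40}
\]
where $\epsilon_1 = x/2UQ$. If $V/\theta < |\delta|q \le x/\theta U$, then $|S_{II}|$ is at most
\[
\begin{aligned}
&\frac{2x}{\sqrt{|\delta|\phi(q)}} \sqrt{\left( \log\frac{x}{U \cdot \theta|\delta|q} + \log\frac{3|\delta q|}{8} \log\frac{\log\frac{8x}{3U|\delta q|}}{\log 8\theta/3} \right) \left( \kappa_6 \log\frac{x}{U \cdot \theta|\delta q|} + 2\kappa_7 \right)} \\
&+ \frac{2\kappa_2}{3} \left( \frac{x}{\sqrt{2|\delta q|}} + \frac{x}{4\sqrt{Q - q}} \right) \left( (\log\theta|\delta q|)^{3/2} - (\log V)^{3/2} \right) \\
&+ \left( \frac{\kappa_2}{\sqrt{2(1-\rho)}} \left( \sqrt{\log V} + \sqrt{1/\log V} \right) + \kappa_9 \right) \frac{x}{\sqrt{V}}
+ \kappa_2 \sqrt{\frac{q}{\phi(q)}} \cdot \sqrt{\log\theta|\delta q|} \cdot \frac{x}{\sqrt{U}},
\end{aligned}
\tag{6.41}
\]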
where $\rho = q/Q$. Note that $|\delta| \le x/Qq$ implies $\rho \le x/Q^2$, and so $\rho$ will be very small and $Q - q$ will be very close to $Q$.

The case $|\delta q| > x/\theta U$ will not arise in practice, essentially because of $|\delta|q \le x/Q$.
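6.3 Adjusting parameters. Calculations.

We must bound the exponential sum $\sum_n \Lambda(n) e(\alpha n) \eta(n/x)$. By (3.8), it is enough to sum the bounds we obtained in §6.2. We will now see how it will be best to set $U$, $V$ and other parameters.

Usually, the largest terms will be
\[
C_0 UV,
\tag{6.42}
\]
where $C_0$ equals
\[
\begin{cases}
c_{4,I_2} + c_{9,I_2} = 4.39779 + 5.21993\epsilon_0 & \text{if } |\delta| \le 1/2c_2 \sim 0.74463, \\
c_{4,I_2} + (1+\epsilon) c_{13,I_2} = (4.89106 + 1.31541\epsilon)(1 + \epsilon_0) & \text{if } |\delta| > 1/2c_2
\end{cases}
\tag{6.43}
\]
(from (6.13) and (6.14), type I; we will specify $\epsilon$ and $\epsilon_0 = (4\log 2)/(x/UV)$ later) and
\[
\frac{x}{\sqrt{\delta_0 \phi(q)}} \sqrt{\log\frac{x}{UV} + (\log\delta_0(1+\epsilon_1)q) \log\left( 1 + \frac{\log x/UV}{\log\frac{V}{\delta_0(1+\epsilon_1)q}} \right)}
\cdot \sqrt{\kappa_6 \log\frac{x}{UV} + 2\kappa_7}
\tag{6.44}
\]
(from (6.37) and (6.40), type II; here $\delta_0 = \max(2, |\delta|/4)$, while $\epsilon_1 = x/2UQ$ for $|\delta| > 8$ and $\epsilon_1 = 0$ for $|\delta| < 8$).

We set $UV = \varkappa x/\sqrt{q\delta_0}$; we must choose $\varkappa > 0$.

Let us first optimize (or, rather, almost optimize) $\varkappa$ in the case $|\delta| \le 4$, so that $\delta_0 = 2$ and $\epsilon_1 = 0$. For the purpose of choosing $\varkappa$, we replace $\sqrt{\phi(q)}$ by $\sqrt{q}/C_1$, where $C_1 = 2.3536 \sim \sqrt{510510/\phi(510510)}$, and also replace $V$ by $q^2/c$, $c$ a constant. We use the approximation
\[
\log\left( 1 + \frac{\log x/UV}{\log V/|2q|} \right) = \log\left( 1 + \frac{\log(\sqrt{2q}/\varkappa)}{\log(q/2c)} \right)
= \log\left( \frac{3}{2} + \frac{\log 2\sqrt{c}/\varkappa}{\log q/2c} \right) \sim \log\frac{3}{2} + \frac{2\log 2\sqrt{c}/\varkappa}{3\log q/2c} .
\]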
What we must minimize then is
C0κradic2q
+C1radic2q
radicradicradicradic(log
radic2q
κ+ log 2q
(log
3
2+
2 log 2radicc
κ3 log q
2c
))(κ6 log
radic2q
κ+ 2κ7
)
le C0κradic2q
+C1
2radicq
radicκ6radicκprime1
radicκprime1 log q minus
(5
3+
2
3
log 4c
log q2c
)logκ + κprime2
middot
radicκprime1 log q minus 2κprime1 logκ +
4κprime1κ7
κ6+ κprime1 log 2
le C0radic2q
(κ + κprime4
(κprime1 log q minus
((5
6+ κprime1
)+
1
3
log 4c
log q2c
)logκ + κprime3
))
(645)where
κprime1 =1
2+ log
3
2 κprime2 = log
radic2 + log 2 log
3
2+
log 4c log 2q
3 log q2c
κprime3 =1
2
(κprime2 +
4κprime1κ7
κ6+ κprime1 log 2
)=
log 4c
6+
(log 4c)2
6 log q2c
+ κprime5
κprime4 =C1
C0
radicκ6
2κprime1sim
030915
1+118694ε0if |δ| le 4
027797(1+026894ε)(1+ε0) if |δ| gt 4
κprime5 =1
2(logradic
2 + log 2 log3
2+
4κprime1κ7
κ6+ κprime1 log 2) sim 101152
Taking derivatives we see that the minimum is attained when
κ =
(5
6+ κprime1 +
1
3
log 4c
log q2c
)κprime4 sim
(17388 +
log 4c
3 log q2c
)middot 030915
1 + 119ε0(646)
provided that |δ| le 4 (What we obtain for |δ| gt 4 is essentially the same only withδ0q = δq4 instead of 2q and 027797((1 + 027ε)(1 + ε0)) in place of 030915) Forq = 5 middot 105 c = 25 and |δ| le 4 (typical values in the most delicate range) we get thatκ should be about 05582(1 + 119ε0) Values of q c nearby give similar values forκ whether |δ| le 4 or for |δ| gt 4
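The numerical values just quoted are straightforward to reproduce. The sketch below recomputes $\kappa_1'$, $\kappa_4'$, $\kappa_5'$ and the near-optimal value from (6.46) at $\epsilon_0 = 0$, taking the rounded constants $\kappa_6$, $\kappa_7$, $C_1$ and $c_{4,I_2} + c_{9,I_2} = 4.39779$ as inputs from the text; the variable and function names are ours.

```python
import math

kappa6, kappa7 = 0.60428, 0.1281
C1 = 2.3536
C0_small_delta = 4.39779          # c_{4,I_2} + c_{9,I_2} at eps_0 = 0

# C_1 approximates sqrt(q / phi(q)) for q = 510510 = 2*3*5*7*11*13*17.
phi_510510 = 1 * 2 * 4 * 6 * 10 * 12 * 16
C1_check = math.sqrt(510510 / phi_510510)

kp1 = 0.5 + math.log(1.5)
kp4 = (C1 / C0_small_delta) * math.sqrt(kappa6 / (2 * kp1))
kp5 = 0.5 * (math.log(math.sqrt(2)) + math.log(2) * math.log(1.5)
             + 4 * kp1 * kappa7 / kappa6 + kp1 * math.log(2))

def near_optimal_parameter(q, c):
    """Right-hand side of (6.46) with eps_0 = 0 and |delta| <= 4."""
    return (5 / 6 + kp1 + math.log(4 * c) / (3 * math.log(q / (2 * c)))) * kp4

value = near_optimal_parameter(5e5, 2.5)
print(C1_check, kp4, kp5, value)
```

The output matches $0.30915$, $1.01152$ and $0.5582$ to the digits shown, which makes the subsequent rounding of the parameter to $1/2$ look harmless.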
(Incidentally at this point we could already give a back-of-the-envelope estimatefor the last line of (645) ie our main term It suggests that choosing w = 1 insteadof w = 2 would have given bounds worse by about 15 percent)
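The typical value of $\kappa$ quoted for (6.46) can be reproduced numerically. The following Python sketch evaluates the right-hand side of (6.46) for $|\delta| \le 4$; the helper name `kappa_opt` and the default $\epsilon_0 = 0$ are illustrative assumptions, and the constants 0.30915 and 1.18694 are simply the approximate values given for $\kappa_4'$.

```python
import math

def kappa_opt(q, c, eps0=0.0):
    """Approximate minimizer from (6.46), case |delta| <= 4 (illustrative helper)."""
    kappa1p = 0.5 + math.log(1.5)                 # kappa'_1 = 1/2 + log(3/2)
    kappa4p = 0.30915 / (1.0 + 1.18694 * eps0)    # approximate kappa'_4 for |delta| <= 4
    bracket = 5.0 / 6.0 + kappa1p + math.log(4 * c) / (3 * math.log(q / (2 * c)))
    return bracket * kappa4p

# Typical values in the most delicate range: q = 5*10^5, c = 2.5.
print(round(kappa_opt(5e5, 2.5), 4))  # about 0.5582 when eps0 = 0
```

Since the value is close to $1/2$ and the objective is flat near its minimum, a round choice of $\kappa$ in this neighbourhood loses little.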
We make the choices
\[
\kappa = 1/2, \qquad\text{and so}\qquad UV = \frac{x}{2\sqrt{q\delta_0}}
\]
for the sake of simplicity. (Unsurprisingly, (6.45) changes very slowly around its minimum.) Note, by the way, that this means that $\epsilon_0 = (2\log 2)/\sqrt{q\delta_0}$.

Now we must decide how to choose $U$, $V$ and $Q$, given our choice of $UV$. We will actually make two sets of choices.
First, we will use the $S_{I,2}$ estimates for $q \le Q/V$ to treat all $\alpha$ of the form $\alpha = a/q + O^*(1/qQ)$, $q \le y$. (Here $y$ is a parameter satisfying $y \le Q/V$.)

Then, the remaining $\alpha$ will get treated with the (coarser) $S_{I,2}$ estimate for $q > Q/V$, with $Q$ reset to a lower value (call it $Q'$). If $\alpha$ was not treated in the first go (so that it must be dealt with by the coarser estimate), then $\alpha = a'/q' + \delta'/x$, where either $q' > y$ or $\delta' q' > x/Q$. (Otherwise, $\alpha = a'/q' + O^*(1/q'Q)$ would be a valid estimate with $q' \le y$.) The value of $Q'$ is set to be smaller than $Q$ both because this is helpful (it diminishes error terms that would be large for large $q$) and because this is harmless (since we are no longer assuming that $q \le Q/V$).

6.3.1 First choice of parameters: $q \le y$
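The largest items affected strongly by our choices at this point are
\[
\begin{aligned}
&c_{16,I_2}\left(2 + \frac{1+\epsilon}{\epsilon}\log^+\frac{2UV|\delta|q}{x}\right)\frac{x}{Q/V} + c_{17,I_2}Q && \text{(from } S_{I,2},\ |\delta| > 1/2c_2\text{)},\\
&\left(c_{10,I_2}\log\frac{U}{q} + 2c_{5,I_2} + c_{12,I_2}\right)Q && \text{(from } S_{I,2},\ |\delta| \le 1/2c_2\text{)},
\end{aligned}
\tag{6.47}
\]
and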
\[
\kappa_2\sqrt{\frac{2q}{\phi(q)}}\left(1 + 1.15\sqrt{\frac{\log 2q}{\log\frac{x}{2Uq}}}\right)\frac{x}{\sqrt{U}} + \kappa_9\frac{x}{\sqrt{V}} \qquad \text{(from } S_{II}\text{, any } \delta\text{)},
\tag{6.48}
\]
with
\[
\kappa_2\sqrt{\frac{2q}{\phi(q)}}\cdot\sqrt{\frac{\log V}{\log\frac{2V}{|\delta q|}}}\cdot\frac{x}{\sqrt{U}} \qquad \text{(from } S_{II}\text{)}
\]
as an alternative to (6.48) for $|\delta| \ge 8$. (In several of these expressions, we are applying some minor simplifications that our later choices will justify. Of course, even if these simplifications were not justified, we would not be getting incorrect results, only potentially suboptimal ones; we are trying to decide how to choose certain parameters.)
In addition, we have a relatively mild but important dependence on $V$ in the main term (6.44), even when we hold $UV$ constant (as we do, in so far as we have already chosen $UV$). We must also respect the condition $q \le Q/V$, the lower bound on $U$ given by (6.17) and the assumptions made at the beginning of the chapter (e.g. $Q \ge x/U$, $V \ge 2\cdot 10^6$). Recall that $UV = x/2\sqrt{q\delta_0}$.

We set
\[
Q = \frac{x}{8y},
\]
since we will then have not just $q \le y$ but also $q|\delta| \le x/Q = 8y$, and so $q\delta_0 \le 2y$. We want $q \le Q/V$ to be true whenever $q \le y$; this means that
\[
q \le \frac{Q}{V} = \frac{QU}{UV} = \frac{QU}{x/2\sqrt{q\delta_0}} = \frac{U\sqrt{q\delta_0}}{4y}
\]
must be true when $q \le y$, and so it is enough to set $U = 4y^2/\sqrt{q\delta_0}$. The following choices make sense: we will work with the parameters
\[
\begin{aligned}
y &= \frac{x^{1/3}}{6}, \qquad Q = \frac{x}{8y} = \frac{3}{4}x^{2/3}, \qquad x/UV = 2\sqrt{q\delta_0} \le 2\sqrt{2y},\\
U &= \frac{4y^2}{\sqrt{q\delta_0}} = \frac{x^{2/3}}{9\sqrt{q\delta_0}}, \qquad V = \frac{x}{(x/UV)\cdot U} = \frac{x}{8y^2} = \frac{9x^{1/3}}{2},
\end{aligned}
\tag{6.49}
\]
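where, as before, $\delta_0 = \max(2, |\delta|/4)$. So, for instance, we obtain $\epsilon_1 \le x/2UQ = 6\sqrt{q\delta_0}/x^{1/3} \le 2\sqrt{3}/x^{1/6}$. Assuming
\[
x \ge 2.16\cdot 10^{20}, \tag{6.50}
\]
we obtain that $U/(x/UV) \ge (x^{2/3}/9\sqrt{q\delta_0})/(2\sqrt{q\delta_0}) = x^{2/3}/18q\delta_0 \ge x^{1/3}/6 \ge 10^6$, and so (6.17) holds. We also get that $\epsilon_1 \le 0.002$.

Since $V = x/8y^2 = (9/2)x^{1/3}$, (6.50) also implies that $V \ge 2\cdot 10^6$ (in fact, $V \ge 27\cdot 10^6$). It is easy to check that
\[
V < x/4, \qquad UV \le x, \qquad Q \ge \max(16, 2\sqrt{x}), \qquad Q \ge \max(2U, x/U), \tag{6.51}
\]
as stated at the beginning of the chapter. Let $\theta = (3/2)^3 = 27/8$. Then
\[
\begin{aligned}
\frac{V}{2\theta q} &= \frac{x/8y^2}{2\theta q} \ge \frac{x}{16\theta y^3} = \frac{x}{54y^3} = 4 > 1,\\
\frac{V}{\theta|\delta q|} &\ge \frac{x/8y^2}{8\theta y} \ge \frac{x}{64\theta y^3} = \frac{x}{216y^3} = 1.
\end{aligned}
\tag{6.52}
\]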
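The identities behind (6.49) to (6.52) are easy to confirm numerically at the threshold $x = 2.16\cdot 10^{20}$. The sketch below is only a floating-point sanity check, not part of the proof; the helper name `first_choice` and the sample values $q = 10^6$, $\delta = 1$ (the extreme case $q = y$, $\delta_0 = 2$) are assumptions made for illustration.

```python
import math

def first_choice(x, q, delta):
    """Parameters of (6.49) for given x, q, delta (illustrative helper)."""
    delta0 = max(2.0, abs(delta) / 4.0)
    y = x ** (1.0 / 3.0) / 6.0
    Q = x / (8.0 * y)
    U = 4.0 * y * y / math.sqrt(q * delta0)
    V = x / (8.0 * y * y)
    eps1_bound = x / (2.0 * U * Q)      # upper bound on epsilon_1
    ratio_617 = U / (x / (U * V))       # the quantity U/(x/UV) appearing in (6.17)
    return {"delta0": delta0, "y": y, "Q": Q, "U": U, "V": V,
            "eps1_bound": eps1_bound, "ratio_617": ratio_617}

x0 = 2.16e20
q = 1.0e6                                # extreme case q = y at x = x0
p = first_choice(x0, q, delta=1.0)
theta = 27.0 / 8.0

print(p["V"])                            # close to 27 * 10^6
print(p["eps1_bound"])                   # below 0.002
print(p["ratio_617"])                    # close to 10^6
print(p["V"] / (2 * theta * q))          # close to 4, as in (6.52)
```

At $x = 2.16\cdot 10^{20}$ one has $x^{1/3} = 6\cdot 10^6$, which is why the bounds in the text come out as round numbers.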
The first type I bound is
|SI1| lex
qmin
(1cprime0δ2
)min
45
qφ(q)
log+ x23 9
q52 δ
120
1
(log 9x13
radicqδ0 + c3I
)+c4Iq
φ(q)
+
(c7I log
y
c2+ c8I log x
)y +
c10Ix13
3422q32δ120
(log 9x13radiceqδ0)
+
(c5I log
2x23
9c2radicqδ0
+ c6I logx53
9radicqδ0
)x23
9radicqδ0
+ c9Iradicx log
2radicex
c2+c10I
e
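(6.53)
where the constants are as in §6.2.1. For any $c, R \ge 1$, the function
\[
x \mapsto (\log cx)/(\log x/R)
\]
attains its maximum on $[R', \infty]$, $R' > R$, at $x = R'$. Hence, for $q\delta_0$ fixed,
\[
\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right)
\tag{6.54}
\]
attains its maximum for $x \in [(9e^{4/5}(\delta_0 q)^{5/2}/4)^{3/2}, \infty)$ at
\[
x = \left(9e^{4/5}(\delta_0 q)^{5/2}/4\right)^{3/2} = (27/8)e^{6/5}(q\delta_0)^{15/4}.
\tag{6.55}
\]
Now, notice that, for smaller values of $x$, (6.54) increases as $x$ increases, since the term $\min(\dots, 1)$ equals the constant 1. Hence, (6.54) attains its maximum for $x \in (0, \infty)$ at (6.55), and so
\[
\begin{aligned}
\min\left(\frac{4/5}{\log^+\frac{4x^{2/3}}{9(\delta_0 q)^{5/2}}}, 1\right)\left(\log 9x^{1/3}\sqrt{q\delta_0} + c_{3,I}\right) + c_{4,I}
&\le \log\frac{27}{2}e^{2/5}(\delta_0 q)^{7/4} + c_{3,I} + c_{4,I}\\
&\le \frac{7}{4}\log\delta_0 q + 6.11676.
\end{aligned}
\]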
Examining the other terms in (6.53) and using (6.50), we conclude that
\[
\begin{aligned}
|S_{I,1}| \le{}& \frac{x}{q}\min\left(1, \frac{c_0'}{\delta^2}\right)\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right)\\
&+ \frac{x^{2/3}}{\sqrt{q\delta_0}}(0.67845\log x - 1.20818) + 0.37864x^{2/3},
\end{aligned}
\tag{6.56}
\]
where we are using (6.50) (and, of course, the trivial bound $\delta_0 q \ge 2$) to simplify the smaller error terms. We recall that $c_0' = 0.798437 > c_0/(2\pi)^2$.
Let us now consider $S_{I,2}$. The terms that appear both for $|\delta|$ small and $|\delta|$ large are given in (6.12). The second line in (6.12) equals
c8I2
(x
4q2δ0+
2UV 2
x+qV 2
x
)+c10I
2
(q
2radicqδ0
+x23
18qδ0
)log
9x13
2
le c8I2(
x
4q2δ0+
9x13
2radic
2+
27
8
)+c10I
2
(y16
232+
x23
18qδ0
)(1
3log x+ log
9
2
)le 029315
x
q2δ0+ (008679 log x+ 039161)
x23
qδ0+ 000153
radicx
where we are using (6.50) to simplify. Now
\[
\min\left(\frac{4/5}{\log^+\frac{Q}{4Vq^2}}, 1\right)\log Vq = \min\left(\frac{4/5}{\log^+\frac{y}{4q^2}}, 1\right)\log\frac{9x^{1/3}q}{2}
\tag{6.57}
\]
can be bounded trivially by $\log(9x^{1/3}q/2) \le (2/3)\log x + \log 3/4$. We can also bound (6.57) as we bounded (6.54) before, namely, by fixing $q$ and finding the maximum for $x$ variable. In this way, we obtain that (6.57) is maximal for $y = 4e^{4/5}q^2$; since, by definition, $x^{1/3}/6 = y$, (6.57) then equals
\[
\log\frac{9(6\cdot 4e^{4/5}q^2)q}{2} = 3\log q + \log 108 + \frac{4}{5} \le 3\log q + 5.48214.
\]
We conclude that (6.12) is at most
\[
\min\left(1, \frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)} + 0.29315\frac{x}{q^2\delta_0} + (0.0434\log x + 0.1959)x^{2/3}.
\tag{6.58}
\]
If |δ| le 12c2 we must consider (613) This is at most
(c4I2 + c9I2)x
2radicqδ0
+ (c10I2 logx23
9q32radicδ0
+ 2c5I2 + c12I2) middot 3
4x23
le 21989xradicqδ0
+361818x
qδ0+ (177019 log x+ 292955)x23
where we recall that ε0 = (4 log 2)(xUV ) = (2 log 2)radicqδ0 which can be bounded
crudely byradic
2 log 2 (Thus c10I2 leradic
1 +radic
8 log 2middot178783 lt 354037 and c12I2 le293333 + 11902
radic2 log 2 le 410004)
If |δ| gt 12c2 we must consider (614) instead For ε = 007 that is at most
(c4I2 + (1 + ε)c13I2)x
2radicqδ0
(1 +
2 log 2radicqδ0
)+ (338845
(1 +
2 log 2radicqδ0
)log δq3 + 208823)
x
|δ|q
+
(688133
(1 +
4 log 2radicqδ0
)log |δ|q + 720828
)x23 + 604141x13
= 249157xradicqδ0
(1 +
2 log 2radicqδ0
)+ (338845 log δq3 + 326771)
x
|δ|q
+
(229378 log x+ 190791
log |δ|qradicqδ0
+ 130691
)x
23
le 249157xradicqδ0
+ (359676 log δ0 + 273032 log q + 912218)x
qδ0
+ (229378 log x+ 411228)x23
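where, besides the crude bound $\epsilon_0 \le \sqrt{2}\log 2$, we use the inequalities
\[
\begin{aligned}
\frac{\log|\delta|q}{\sqrt{q\delta_0}} &\le \frac{\log 4q\delta_0}{\sqrt{q\delta_0}} \le \frac{\log 8}{\sqrt{2}},\\
\frac{\log q}{\sqrt{q\delta_0}} &\le \frac{1}{\sqrt{2}}\frac{\log q}{\sqrt{q}} \le \frac{1}{\sqrt{2}}\frac{\log e^2}{e} = \frac{\sqrt{2}}{e},\\
\frac{1}{|\delta|} &\le \frac{4c_2}{\delta_0}, \qquad \frac{\log|\delta|}{|\delta|} \le \frac{2}{e\log 2}\cdot\frac{\log\delta_0}{\delta_0}.
\end{aligned}
\]
(Obviously, $1/|\delta| \le 4c_2/\delta_0$ is based on the assumption $|\delta| > 1/2c_2$ and on the inequality $16c_2 \ge 1$. The bound on $(\log|\delta|)/|\delta|$ is based on the fact that $(\log t)/t$ reaches its maximum at $t = e$, and $(\log\delta_0)/\delta_0 = (\log 2)/2$ for $|\delta| \le 8$.)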
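The elementary maxima used in these inequalities can be spot-checked on a grid. The sketch below is only a numerical illustration (a grid search is not a proof); the helper `grid_max` and the grid sizes are assumptions chosen for convenience.

```python
import math

def grid_max(f, lo, hi, n=20000):
    """Largest value of f on a geometric grid of n+1 points in [lo, hi]."""
    return max(f(lo * (hi / lo) ** (i / n)) for i in range(n + 1))

# log(4t)/sqrt(t) on t >= 2 is largest at t = 2, where it equals log(8)/sqrt(2).
m1 = grid_max(lambda t: math.log(4 * t) / math.sqrt(t), 2.0, 1e12)
# log(q)/sqrt(q) is largest at q = e^2, where it equals 2/e.
m2 = grid_max(lambda q: math.log(q) / math.sqrt(q), 1.0, 1e12)
# log(t)/t is largest at t = e, where it equals 1/e.
m3 = grid_max(lambda t: math.log(t) / t, 1.0, 1e12)

print(m1, math.log(8) / math.sqrt(2))
print(m2, 2 / math.e)
print(m3, 1 / math.e)
```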
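We sum (6.58) and whichever one of our bounds for (6.13) and (6.14) is greater (namely, the latter). We obtain that, for any $\delta$,
\[
\begin{aligned}
|S_{I,2}| \le{}& 2.49157\frac{x}{\sqrt{q\delta_0}} + \min\left(1, \frac{4c_0'}{\delta^2}\right)\cdot\left(\frac{3}{2}\log q + 2.74107\right)\frac{x}{\phi(q)}\\
&+ (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0} + (22.9812\log x + 411.424)x^{2/3},
\end{aligned}
\tag{6.59}
\]
where we bound one of the lower-order terms in (6.58) by $x/q^2\delta_0 \le x/q\delta_0$.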
For type II, we have to consider two cases: (a) $|\delta| < 8$, and (b) $|\delta| \ge 8$. Consider first $|\delta| < 8$. Then $\delta_0 = 2$. Recall that $\theta = 27/8$. We have $q \le V/2\theta$ and $|\delta q| \le V/\theta$ thanks to (6.52). We apply (6.37), and obtain that, for $|\delta| < 8$,
\[
\begin{aligned}
|S_{II}| \le{}& \frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{\frac{1}{2}\log 4q\delta_0 + \log 2q\,\log\left(1 + \frac{\frac{1}{2}\log 4q\delta_0}{\log\frac{V}{2q}}\right)}\cdot\sqrt{0.30214\log 4q\delta_0 + 0.2562}\\
&+ 8.22088\sqrt{\frac{q}{\phi(q)}}\left(1 + 1.15\sqrt{\frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}}}\right)(q\delta_0)^{1/4}x^{2/3} + 1.84251x^{5/6}\\
\le{}& \frac{x}{\sqrt{2\phi(q)}}\cdot\sqrt{C_{x,2q}\log 2q + \frac{\log 8q}{2}}\cdot\sqrt{0.30214\log 2q + 0.67506}\\
&+ 16.406\sqrt{\frac{q}{\phi(q)}}x^{3/4} + 1.84251x^{5/6},
\end{aligned}
\tag{6.60}
\]
where we bound
\[
\frac{\log 2q}{\log\frac{9x^{1/3}\sqrt{\delta_0}}{2\sqrt{q}}} \le \frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} < \lim_{x\to\infty}\frac{\log\frac{x^{1/3}}{3}}{\log\frac{9x^{1/6}\sqrt{2}}{2\sqrt{1/6}}} = 2,
\]
and where we define
\[
C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right)
\]
for $0 < t < 9x^{1/3}/2$. (We have 2.004 here instead of 2 because we want a constant $\ge 2(1 + \epsilon_1)$ in later occurrences of $C_{x,t}$, for reasons that will soon become clear.)

For purposes of later comparison, we remark that $16.404 \le 1.57863x^{4/5 - 3/4}$ for $x \ge 2.16\cdot 10^{20}$.
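Consider now case (b), namely, $|\delta| \ge 8$. Then $\delta_0 = |\delta|/4$. By (6.52), $|\delta q| \le V/\theta$. Hence, (6.40) gives us that
\[
\begin{aligned}
|S_{II}| \le{}& \frac{2x}{\sqrt{|\delta|\phi(q)}}\cdot\sqrt{\frac{1}{2}\log|\delta q| + \log\frac{|\delta q|(1+\epsilon_1)}{4}\,\log\left(1 + \frac{\log|\delta|q}{2\log\frac{18x^{1/3}}{|\delta|(1+\epsilon_1)q}}\right)}\cdot\sqrt{0.30214\log|\delta|q + 0.2562}\\
&+ 8.22088\sqrt{\frac{q}{\phi(q)}}\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4}x^{2/3} + 1.84251x^{5/6}\\
\le{}& \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}\log\delta_0(1+\epsilon_1)q + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506}\\
&+ 1.79926\sqrt{\frac{q}{\phi(q)}}x^{4/5} + 1.84251x^{5/6},
\end{aligned}
\tag{6.61}
\]
since
\[
8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{9x^{1/3}}{|\delta q|}}}\cdot(q\delta_0)^{1/4} \le 8.22088\sqrt{\frac{\log\frac{9x^{1/3}}{2}}{\log\frac{27}{4}}}\cdot(x^{1/3}/3)^{1/4} \le 1.79926x^{4/5 - 2/3}
\]
for $x \ge 2.16\cdot 10^{20}$. Clearly
\[
\log\delta_0(1+\epsilon_1)q = \log\delta_0 q + \log(1+\epsilon_1) \le \log\delta_0 q + \epsilon_1.
\]
By Lemma C.2.2, $q/\phi(q) \le z(y) = z(x^{1/3}/6)$ (since $x \ge 18^3$). It is easy to check that $x \mapsto \sqrt{z(x^{1/3}/6)}\,x^{4/5 - 5/6}$ is decreasing for $x \ge 2.16\cdot 10^{20}$ (in fact, for $x \ge 18^3$). Using (6.50), we conclude that $1.79926\sqrt{q/\phi(q)}\,x^{4/5} \le 0.89657x^{5/6}$ and, by the way, $16.406\sqrt{q/\phi(q)}\,x^{3/4} \le 0.78663x^{5/6}$. This allows us to simplify the last lines of (6.60) and (6.61). We obtain that, for $\delta$ arbitrary,
\[
|S_{II}| \le \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + \epsilon_1) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506} + 2.73908x^{5/6}.
\tag{6.62}
\]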
It is time to sum up $S_{I,1}$, $S_{I,2}$ and $S_{II}$. The main terms come from the first line of (6.62) and the first term of (6.59). Lesser-order terms can be dealt with roughly: we bound $\min(1, c_0'/\delta^2)$ and $\min(1, 4c_0'/\delta^2)$ from above by $2/\delta_0$ (using the fact that $c_0' = 0.798437 < 16$, which implies that $8/\delta > 4c_0'/\delta^2$ for $\delta > 8$; of course, for $\delta \le 8$, we have $\min(1, 4c_0'/\delta^2) \le 1 = 2/2 = 2/\delta_0$).
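The terms inversely proportional to $q$, $\phi(q)$ or $q^2$ thus add up to at most
\[
\begin{aligned}
&\frac{2x}{\delta_0 q}\cdot\frac{q}{\phi(q)}\left(\frac{7}{4}\log\delta_0 q + 6.11676\right) + \frac{2x}{\delta_0\phi(q)}\left(\frac{3}{2}\log q + 2.74107\right)\\
&\quad + (3.59676\log\delta_0 + 27.3032\log q + 91.515)\frac{x}{q\delta_0}\\
&\le \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415),
\end{aligned}
\]
where, for instance, we bound $(3/2)\log q + 2.74107$ by $(3/2)\log\delta_0 q + 2.74107 - (3/2)\log 2$.

As for the other terms: we use the assumption $x \ge 2.16\cdot 10^{20}$ to bound $x^{2/3}$ and $x^{2/3}\log x$ by a small constant times $x^{5/6}$. We bound $x^{2/3}/\sqrt{q\delta_0}$ by $x^{2/3}/\sqrt{2}$ (in (6.56)). We obtain
\[
\begin{aligned}
&\frac{x^{2/3}}{\sqrt{2}}(0.67845\log x - 1.20818) + 0.37864x^{2/3}\\
&\quad + (22.9812\log x + 411.424)x^{2/3} + 2.73908x^{5/6} \le 3.35531x^{5/6}.
\end{aligned}
\]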
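The last inequality is tight at the threshold $x = 2.16\cdot 10^{20}$, which can be seen by dividing through by $x^{5/6}$. The following Python sketch does this in floating point; it assumes the constants are read as 0.67845, 1.20818, 0.37864, 22.9812, 411.424 and 2.73908, and the helper name `lower_order_coefficient` is hypothetical.

```python
import math

def lower_order_coefficient(x):
    """Left-hand side of the displayed inequality, divided by x^(5/6)."""
    L = math.log(x)
    x23_terms = ((0.67845 * L - 1.20818) / math.sqrt(2.0)
                 + 0.37864
                 + 22.9812 * L + 411.424)        # total coefficient of x^(2/3)
    return x23_terms * x ** (-1.0 / 6.0) + 2.73908

x0 = 2.16e20
print(lower_order_coefficient(x0))    # just below 3.35531
print(lower_order_coefficient(1e25))  # smaller, since (log x) / x^(1/6) decreases here
```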
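The sums $S_{0,\infty}$ and $S_{0,w}$ in (3.11) are 0 (by (6.50) and the fact that $\eta_2(t) = 0$ for $t \le 1/4$). We conclude that, for $q \le y = x^{1/3}/6$, $x \ge 2.16\cdot 10^{20}$ and $\eta = \eta_2$ as in (3.4),
\[
\begin{aligned}
|S_\eta(x, \alpha)| \le{}& |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
\le{}& \frac{x}{\sqrt{\delta_0\phi(q)}}\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506}\\
&+ \frac{2.49157x}{\sqrt{\delta_0 q}} + \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.81811\right) + \frac{2x}{\delta_0 q}(13.6516\log\delta_0 q + 37.5415)\\
&+ 3.35531x^{5/6},
\end{aligned}
\tag{6.63}
\]
where
\[
\delta_0 = \max(2, |\delta|/4), \qquad C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right).
\tag{6.64}
\]
Since $C_{x,t}$ is an increasing function as a function of $t$ (for $x$ fixed and $t \le 9x^{1/3}/2.004$) and $\delta_0 q \le 2y$, we see that $C_{x,t} \le C_{x,2y}$. It is clear that $x \mapsto C_{x,t}$ (fixed $t$) is a decreasing function of $x$. For $x = 2.16\cdot 10^{20}$, $C_{x,2y} = 1.39942\dots$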
6.3.2 Second choice of parameters

If, with the original choice of parameters, we obtained $q > y = x^{1/3}/6$, we now reset our parameters ($Q$, $U$ and $V$). Recall that, while the value of $q$ may now change (due to the change in $Q$), we will be able to assume that either $q > y$ or $|\delta q| > x/(x/8y) = 8y$.
We want $U/(x/UV) \ge 5\cdot 10^5$ (this is (6.17)). We also want $UV$ small. With this in mind, we let
\[
V = \frac{x^{1/3}}{3}, \qquad U = 500\sqrt{6}\,x^{1/3}, \qquad Q = \frac{x}{U} = \frac{x^{2/3}}{500\sqrt{6}}.
\tag{6.65}
\]
Then (6.17) holds (as an equality). Since we are assuming (6.50), we have $V \ge 2\cdot 10^6$. It is easy to check that (6.50) also implies that $U \le \sqrt{x}/2$ and $Q \ge 2\sqrt{x}$, and so the inequalities in (6.51) all hold.

Write $2\alpha = a/q + \delta/x$ for the new approximation; we must have either $q > y$ or $|\delta| > 8y/q$, since otherwise $a/q$ would already be a valid approximation under the first choice of parameters. Thus, either (a) $q > y$, or both (b1) $|\delta| > 8$ and (b2) $|\delta|q > 8y$. Since now $V = 2y$, we have $q > V/2\theta$ in case (a) and $|\delta q| > V/\theta$ in case (b) for any $\theta \ge 1$. We set $\theta = 4$.

(Thanks to this choice of $\theta$, we have $|\delta q| \le x/Q \le x/\theta U$, as we commented at the end of §6.2.3; this will help us avoid some case-work later.)
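The claims about the second choice of parameters (6.65) reduce to simple algebra, and a floating-point spot check makes them concrete. In the sketch below, the helper name `second_choice` and the sample value $x = 10^{25}$ are assumptions for illustration only.

```python
import math

def second_choice(x):
    """Parameters (U, V, Q) of (6.65) for a given x (illustrative helper)."""
    cube_root = x ** (1.0 / 3.0)
    V = cube_root / 3.0
    U = 500.0 * math.sqrt(6.0) * cube_root
    Q = x / U
    return U, V, Q

x = 1e25
U, V, Q = second_choice(x)
y = x ** (1.0 / 3.0) / 6.0

print(U / (x / (U * V)))          # (6.17) holds with equality: close to 5 * 10^5
print(V / y)                      # V = 2y
print(U <= math.sqrt(x) / 2, Q >= 2 * math.sqrt(x))

# At the threshold x = 2.16 * 10^20 the bound U <= sqrt(x)/2 is attained:
U0, V0, Q0 = second_choice(2.16e20)
print(U0, math.sqrt(2.16e20) / 2)
```

The last two printed numbers agree up to rounding, which is one way to see where the threshold $2.16\cdot 10^{20} = (1000\sqrt{6})^6$ comes from.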
By (64)
|SI1| lex
qmin
(1cprime0δ2
)(log x23 minus log 500
radic6 + c3I + c4I
q
φ(q)
)+
(c7I log
Q
c2+ c8I log x log c11I
Q2
x
)Q+ c10I
U2
4xlog
e12x23
500radic
6+c10I
e
+
(c5I log
1000radic
6x13
c2+ c6I log 500
radic6x43
)middot 500radic
6x13 + c9Iradicx log
2radicex
c2
le x
qmin
(1cprime0δ2
)(2
3log xminus 499944 + 100303
q
φ(q)
)+
289
1000x23(log x)2
where we are bounding
c7I logQ
c2+ c8I log x log c11I
Q2
x
=c8I(log x)2 minus(c8I(log 1500000minus log c11I)minus
2
3c7I
)log x+ c7I log
1
500radic
6c2
lec8I(log x)2 minus 38 log x
We are also using the assumption (650) repeatedly in order to show that the sum ofall lower-order terms is less than (38c8I log x)(500
radic6) Note that c8I(log x)2Q le
000289x23(log x)2We have qφ(q) le z(Q) (where z is as in (C19)) and since Q gt
radic6 middot 12 middot 109
for x ge 216 middot 1020
100303z(Q) le 100303
(eγ log logQ+
250637
log logradic
6 middot 12 middot 109
)le 02359 logQ+ 079 lt 01573 log x
(It is possible to give a much better estimation but it is not worthwhile since this willbe a very minor term) We have either q gt y or q|δ| gt 8y if q|δ| gt 8y but q le y then|δ| ge 8 and so cprime0δ
2q lt 18|δ|q lt 164y lt 1y Hence
|SI1| lex
y
((2
3+ 01573
)log x
)+ 000289x23(log x)2
le 24719x23 log x+ 000289x23(log x)2
We bound |SI2| using Lemma 424 First we bound (450) this is at most
x
2qmin
(1
4cprime0δ2
)log
x13q
3
+ c0
(1
4minus 1
π2
) (UV )2 log x13
3
2x+
3c42
500radic
6
9+
(500radic
6x13 + 1)2x13 log x
23
6x
where c4 = 103884 We bound the second line of this using (650) As for the firstline we have either q ge y (and so the first line is at most (x2y)(log x13y3)) orq lt y and 4cprime0δ
2q lt 116y lt 1y (and so the same bound applies) Hence (450) isat most
3x23
(2
3log xminus log 18
)+ 002017x23 log x = 202017x23 log xminus3(log 18)x23
Now we bound (451) which comes up when |δ| le 12c2 where c2 = 6π5radicc0
c0 = 31521 (and so c2 = 06714769 ) Since 12c2 lt 8 it follows that q gt y (thealternative q le y q|δ| gt 8y is impossible since it implies |δ| gt 8) Then (451) is atmost
2radicc0c1π
(UV log
UVradice
+Q
(radic3 log
c2x
Q+
logUV
2log
UV
Q2
))+
3c12
x
ylogUV log
UV
c2xy+
16 log 2
πQ log
c0e3Q2
4π middot 8 log 2 middot xlog
Q
2
+3c1
2radic
2c2
radicx log
c2x
2+
25c04π2
(3c2)12radicx log x
(666)
where $c_1 = 1.000189 > 1 + (8\log 2)/(2x/UV)$.

The first line of (6.66) is a linear combination of terms of the form $x^{2/3}\log Cx$, $C > 1$; using (6.50), we obtain that it is at most $1144.693x^{2/3}\log x$. (The main contribution comes from the first term.) Similarly, we can bound the first term in the second line by $33.0536x^{2/3}\log x$. Since $\log(c_0e^3Q^2/(4\pi\cdot 8\log 2\cdot x))\log Q/2$ is at most $\log x^{1/3}\log x^{2/3}$, the second term in the second line is at most $0.0006406x^{2/3}(\log x)^2$. The third line of (6.66) can be bounded easily by $0.0122x^{2/3}\log x$.

Hence, (6.66) is at most
\[
1177.76x^{2/3}\log x + 0.0006406x^{2/3}(\log x)^2.
\]
If |δ| gt 12c2 then we know that |δq| gt min(y2c2 8y) = y2c2 Thus (452)(with ε = 001) is at most
2radicc0c1π
UV logUVradice
+202radicc0c1
π
(x
y2c2+ 1
)((radic
302minus 1) log
xy2c2
+ 1radic
2+
1
2logUV log
e2UVx
y2c2
)
+
(3c12
(1
2+
303
016log x
)+
20c03π2
(2c2)32
)radicx log x
Again by (650) and in much the same way as before this simplifies to
le (114466 + 15107 + 68523)x23 log x+ 29136x12(log x)2
le 122885x23(log x)
Hence in total and for any |δ|
|SI2| le 202017x23 log x+ 122885x23(log x) + 00006406x23(log x)2
le 12309x23(log x) + 00006406x23(log x)2
Now we must estimate SII As we said before either (a) q gt y or both (b1)|δ| gt 8 and (b2) |δ|q gt 8y Recall that θ = 4 In case (a) we have q gt x136 =V2 gt V2θ thus we can use (638) and obtain that if q le x8U |SII | is at most
xradicz(q)radic2q
radic(log
x
U middot 8q+ log 2q log
log x(2Uq)
log 4
)(κ6 log
x
U middot 8q+ 2κ7
)
+radic
2κ2
radicz( x
8U
)(1 + 115
radiclog x4U
log 4
)xradicU
+ (κ2
radiclog xU + κ9)
xradicV
+κ2
6
((log 8y)32 minus (log 2y)32
) xradicy
+ κ2
(radic8 log xU +
2
3((log xU)32 minus (log V )32)
)xradic8U
(667)where z is as in (C19) (We are already simplifying the third line the bound givenis justified by a derivative test) It is easy to check that q rarr (log 2q)(log log q)q isdecreasing for q ge y (indeed for q ge 9) and so the first line of (667) is maximal forq = y
We can thus bound (667) by x56 timesradic3z(et36)
(t
3minus log 8c+
(t
3minus log 3
)log
t3 minus log 2c
log 4
)(κ6
3tminus 4214
)+
radic2κ2radic6c
radicz(e2t3
48c
)1 + 115
radic23 tminus log 24c
log 4
+
(κ2
radic2t
3minus log 6c+ κ9
)radic
3
+κ2radic
6
((t
3+ log
8
6
) 32
minus(t
3+ log
2
6
) 32
)
+κ2radic48c
(radic8
(2t
3minus log 6c
)+
2
3
((2t
3minus log 6c
) 32
minus(t
3minus log 3
) 32
))(668)
where t = log x and c = 500radic
6 Asymptotically the largest term in (667) comesfrom the last line (of order t32) even if the first line is larger in practice (while beingof order at most t log t) Let us bound (668) by a multiple of t32
First of all notice that
d
dt
z(et3
6
)log t
=
(eγ log
(t3 minus log 6
)+ 250637
log( t3minuslog 6)
)primelog t
minusz(et3
6
)t(log t)2
=eγ minus 250637
log2( t3minuslog 6)
(tminus 3 log 6) log tminuseγ + 250637
log2( t3minuslog 6)
t log tmiddot
log(t3 minus log 6
)log t
(669)
which for t ge 100 is
gteγ log 3minus 2middot250637 log t
log2( t3minuslog 6)
t(log t)2ge
195671minus 892482log t
t(log t)2gt 0
Similarly for t ge 2000
d
dt
z(e2t3
48c
)log t
gteγ log 3
2 minus250637 log t
log2( 2t3 minuslog 48c)
minus 250637
log( 2t3 minuslog 48c)
t(log t)2
ge072216minus 545234
log t
t(log t)2gt 0
Thus
z(et3
6
)le (log t) middot lim
srarrinfin
z(es3
6
)log s
= eγ log t for t ge 100
z(e2t3
48c
)le (log t) middot lim
srarrinfin
z(e2s3
48c
)log s
= eγ log t for t ge 2000
(670)
Also note that since (x32)prime = (32)radicx((
t
3+ log
8
6
) 32
minus(t
3+ log
2
6
) 32
)le 3
2
radict
3+ log
8
6middot log 4 le 120083
radict
for t ge 2000 We also have(2t
3minus log 6c
) 32
minus(t
3minus log 3
) 32
lt
(2t
3minus log 9
) 32
minus(t
3minus log 3
) 32
= (232 minus 1)
(t
3minus log 3
) 32
lt (232 minus 1)t32
332le 035189t32
Of course
t
3minus log 8c+
(t
3minus log 3
)log
t3 minus log 2c
log 4lt
(t
3+t
3log
t
3
)ltt
3log t
We conclude that for t ge 2000 (668) is at mostradic3 middot eγ log t middot t
3log t middot κ6
3t+
radic2κ2radic6c
radiceγ log t
(1 + 079749
radict)
+
(κ2
radic2
3t12 + κ9
)radic
3 +κ2radic
6middot 12009
radict+
κ2radic48c
(radic16t
3+
2
3middot 035189t32
)le (010181 + 000012 + 000145 + 0000048 + 000462)t32 le 010848t32
On the remaining interval $\log(2.16\cdot 10^{20}) \le t \le 2000$, we use interval arithmetic (as in §2.6, with 30 iterations) to bound the ratio of (6.68) to $t^{3/2}$. We obtain that (6.68) is at most
\[
0.275964t^{3/2}.
\]
Hence, for all $x \ge 2.16\cdot 10^{20}$,
\[
|S_{II}| \le 0.275964x^{5/6}(\log x)^{3/2} \tag{6.71}
\]
in the case y lt q le x8U If x8U lt q le Q we use (639) In this range x2
radic2q +
radicqx adopts its max-
imum at q = Q (because x2radic
2q for q = x8U is smaller thanradicqx for q = Q by
(665) and (650)) Hence (639) is at most x56 times(κ2
radic2
(2
3tminus log cprime
)+ κ9
)radic
3 + κ2
radic2
3tminus log cprime middot 1radic
cprime
+2κ2
3
((2
3tminus log cprime
) 32
minus(t
3minus log 3
) 32
)( radiccprime
2radic
2eminust6 +
1radiccprime
)
where t = log x (as before) and cprime = 500radic
6 This is at most
(2κ2 +radic
3κ9)radict+
κ2radiccprime
radic2
3
radict+
2κ2
3
232 minus 1
332t32
( radiccprime
2radic
2eminust6 +
1radiccprime
)le 010327
for t ge log(216 middot 1020
) and so
|SII | le 010327x56(log x)32
for x8U lt q le Q using the assumption x ge 216 middot 1020Finally let us treat case (b) that is |δ| gt 8 and |δ|q gt 8y we can also assume
q le y as otherwise we are in case (a) which has already been treated Since |δx| le1qQ we know that
|δq| le x
Q= U = 500
radic6x13 le x23
2000radic
6=
x
4U=
x
θU
again under assumption (650) We apply (641) and obtain that |SII | is at most
2xradicz(y)radic8y
radic(log
x
U middot 4 middot 8y+ log 3y log
log x3Uy
log 323
)(κ6 log
x
U middot 4 middot 8y+ 2κ7
)+
2κ2
3
(xradic16y
((log 32y)32 minus (log 2y)
32 ) +
x4radicQminus y
((log 4U)32 minus (log 2y)
32 )
)+
(κ2radic
2(1minus yQ)
(radiclog V +
radic1 log V
)+ κ9
)xradicV
+ κ2
radicz(y) middot
radiclog 4U middot xradic
U
(672)where we are using the facts that (log 3t8)t is increasing for t ge 8y gt 8e3 and that
d
dt
(log t)32 minus (log V )32
radict
=3(log t)12 minus ((log t)32 minus (log V )32)
2t32
= minuslog t
e3 middotradic
log tminus (log V )32
2t32lt 0
for t ge θ middot 8y = 16V thanks to(log
16V
e3
)2
log 16V gt (log V )3 +
(log 16minus 2 log
e3
16
)(log V )2
+
((log
16
e3
)2
minus 2 loge3
16log 16
)log V gt (log V )3
(valid for log V ge 1) Much as before we can rewrite (672) as x56 times
2radicz(et36)radic
86
radict
3minus log 32c+
(t
3minus log 2
)log
t3 minus log 3c
log 323
middot
radicκ6
(t
3minus log 32c
)+ 2κ7 +
2κ2
3
radic3
8
((t
3+ log
32
6
) 32
minus(t
3minus log 3
) 32
)
+2κ2
3
14radicet3
6c minus16
((t
3+ log 24c
)32
minus(t
3minus log 3
)32)
+κ2
radic3radic
2(1minus c
et3
)(radic
t3minus log 3 +1radic
t3minus log 3
)+ κ9
radic3
+ κ2
radicz(et36)
radict3 + log 24c
6c
(673)where t = log x and c = 500
radic6 For t ge 100 we use (670) to bound z(et36)
and we obtain that (673) is at most
2radiceγradic
86
radic1
3middot κ6
3middot (log t)t+
2κ2
3
radic3
8middot 1
2
(t
3+ log
32
6
)12
middot log 16
+2κ2
3
14radice1003
6c minus 16
middot 1
2
(t
3+ log 24c
)12
middot log 72c
+κ2
radic3radic
2(1minus c
e1003
)(radic
t3 +1radict3
)+ κ9
radic3 + κ2
radiceγ log t
radict3 + log 24c
6c
(674)where we have bounded expressions of the form a32minusb32 (a gt b) by (a122)middot(aminusb)The ratio of (674) to t32 is clearly a decreasing function of t For t = 200 this ratiois 023747 hence (674) (and thus (673)) is at most 023748t32 for t ge 200
On the range log(216 middot 1020) le t le 200 the bisection method (with 25 iterations)gives that the ratio of (673) to t32 is at most 023511
We conclude that, when $|\delta| > 8$ and $|\delta|q > 8y$,
\[
|S_{II}| \le 0.23511x^{5/6}(\log x)^{3/2}.
\]
Thus (6.71) gives the worst case.

We now take totals, and obtain
\[
\begin{aligned}
S_\eta(x, \alpha) &\le |S_{I,1}| + |S_{I,2}| + |S_{II}|\\
&\le (2.4719 + 1230.9)x^{2/3}\log x + (0.00289 + 0.0006406)x^{2/3}(\log x)^2 + 0.275964x^{5/6}(\log x)^{3/2}\\
&\le 0.27598x^{5/6}(\log x)^{3/2} + 1233.38x^{2/3}\log x,
\end{aligned}
\tag{6.75}
\]
where we use (6.50) yet again.
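6.4 Conclusion

Proof of Theorem 3.1.1. We have shown that $|S_\eta(\alpha, x)|$ is at most (6.63) for $q \le x^{1/3}/6$ and at most (6.75) for $q > x^{1/3}/6$. It remains to simplify (6.63) slightly. By the geometric mean/arithmetic mean inequality,
\[
\sqrt{C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}}\sqrt{0.30214\log\delta_0 q + 0.67506}
\tag{6.76}
\]
is at most
\[
\frac{1}{2\sqrt{\rho}}\left(C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + \frac{\log 4\delta_0 q}{2}\right) + \frac{\sqrt{\rho}}{2}(0.30214\log\delta_0 q + 0.67506)
\]
for any $\rho > 0$. We recall that
\[
C_{x,t} = \log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right).
\]
Let
\[
\rho = \frac{C_{x_1,2q_0}(\log 2q_0 + 0.002) + \frac{\log 8q_0}{2}}{0.30214\log 2q_0 + 0.67506} = 3.397962\dots,
\]
where $x_1 = 10^{25}$, $q_0 = 2\cdot 10^5$. (In other words, we are optimizing matters for $x = x_1$, $\delta_0 q = 2q_0$; the losses in nearby ranges will be very slight.) We obtain that (6.76) is at most
\[
\begin{aligned}
&\frac{C_{x,\delta_0 q}}{2\sqrt{\rho}}(\log\delta_0 q + 0.002) + \left(\frac{1}{4\sqrt{\rho}} + \frac{\sqrt{\rho}\cdot 0.30214}{2}\right)\log\delta_0 q + \frac{1}{2}\left(\frac{\log 2}{\sqrt{\rho}} + \frac{\sqrt{\rho}}{2}\cdot 0.67506\right)\\
&\le 0.27125\,C_{x,\delta_0 q}(\log\delta_0 q + 0.002) + 0.4141\log\delta_0 q + 0.49911.
\end{aligned}
\tag{6.77}
\]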
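The value of $\rho$ and the two rounded coefficients $0.27125$ and $0.4141$ in (6.77) can be recomputed directly from the definition of $C_{x,t}$. The sketch below is a plain floating-point evaluation; the function name `C` mirrors $C_{x,t}$, and nothing beyond the formulas displayed above is assumed.

```python
import math

def C(x, t):
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^(1/3) / (2.004 t))))."""
    return math.log(1.0 + math.log(4.0 * t)
                    / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t))))

x1, q0 = 1e25, 2e5
t = 2 * q0                                   # delta_0 * q = 2 * q0
rho = ((C(x1, t) * (math.log(t) + 0.002) + math.log(4 * t) / 2)
       / (0.30214 * math.log(t) + 0.67506))

coef_C = 1 / (2 * math.sqrt(rho))                                   # multiplies C_{x, delta_0 q}
coef_log = 1 / (4 * math.sqrt(rho)) + math.sqrt(rho) * 0.30214 / 2  # multiplies log(delta_0 q)

print(rho)       # about 3.39796
print(coef_C)    # slightly below 0.27125
print(coef_log)  # about 0.4141
```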
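Now, for $x \ge x_0 = 2.16\cdot 10^{20}$,
\[
\frac{C_{x,t}}{\log t} \le \frac{C_{x_0,t}}{\log t} = \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{54\cdot 10^6}{2.004t}}\right) \le 0.08659
\]
for $8 \le t \le 10^6$ (by the bisection method, with 20 iterations), and
\[
\frac{C_{x,t}}{\log t} \le \frac{C_{(6t)^3,t}}{\log t} \le \frac{1}{\log t}\log\left(1 + \frac{\log 4t}{2\log\frac{9\cdot 6}{2.004}}\right) \le 0.08659
\]
if $10^6 < t \le x^{1/3}/6$. Hence
\[
0.27125\cdot C_{x,\delta_0 q}\cdot 0.002 \le 0.000047\log\delta_0 q.
\]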
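The bound $C_{x_0,t}/\log t \le 0.08659$ on $8 \le t \le 10^6$ is obtained in the text by a rigorous bisection method. As a rough, non-rigorous illustration only, the Python sketch below samples the same ratio on a fine geometric grid in ordinary floating point; the grid size is an arbitrary assumption, and a sample maximum does not replace the interval-arithmetic argument.

```python
import math

def C(x, t):
    """C_{x,t} = log(1 + log(4t) / (2 log(9 x^(1/3) / (2.004 t))))."""
    return math.log(1.0 + math.log(4.0 * t)
                    / (2.0 * math.log(9.0 * x ** (1.0 / 3.0) / (2.004 * t))))

x0 = 2.16e20

def ratio(t):
    return C(x0, t) / math.log(t)

n = 100000
samples = [8.0 * (1e6 / 8.0) ** (i / n) for i in range(n + 1)]
worst = max(ratio(t) for t in samples)

print(worst)                             # sample maximum of C_{x0,t} / log t on [8, 10^6]
print(0.27125 * 0.002 * worst)           # compare with the constant 0.000047 in the text
```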
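We conclude that, for $q \le x^{1/3}/6$,
\[
\begin{aligned}
|S_\eta(\alpha, x)| \le{}& \frac{R_{x,\delta_0 q}\log\delta_0 q + 0.49911}{\sqrt{\phi(q)\delta_0}}\cdot x + \frac{2.492x}{\sqrt{q\delta_0}}\\
&+ \frac{2x}{\delta_0\phi(q)}\left(\frac{13}{4}\log\delta_0 q + 7.82\right) + \frac{2x}{\delta_0 q}(13.66\log\delta_0 q + 37.55) + 3.36x^{5/6},
\end{aligned}
\]
where
\[
R_{x,t} = 0.27125\log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415.
\]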
Part II
Major arcs
Chapter 7
Major arcs: overview and results
Our task, as in Part I, will be to estimate
\[
S_\eta(\alpha, x) = \sum_n \Lambda(n)e(\alpha n)\eta(n/x), \tag{7.1}
\]
where $\eta: \mathbb{R}^+ \to \mathbb{C}$ is a smooth function, $\Lambda$ is the von Mangoldt function and $e(t) = e^{2\pi it}$. Here, we will treat the case of $\alpha$ lying on the major arcs.

We will see how we can obtain good estimates by using smooth functions $\eta$ based on the Gaussian $e^{-t^2/2}$. This will involve proving new, fully explicit bounds for the Mellin transform of the twisted Gaussian, or, what is the same, bounds on parabolic cylindrical functions in certain ranges. It will also require explicit formulae that are general and strong enough, even for moderate values of $x$.

Let $\alpha = a/q + \delta/x$. For us, saying that $\alpha$ lies on a major arc will be the same as saying that $q$ and $\delta$ are bounded; more precisely, $q$ will be bounded by a constant $r$ and $|\delta|$ will be bounded by a constant times $r/q$. As is customary on the major arcs, we will express our exponential sum (3.1) as a linear combination of twisted sums
\[
S_{\eta,\chi}(\delta/x, x) = \sum_{n=1}^{\infty}\Lambda(n)\chi(n)e(\delta n/x)\eta(n/x), \tag{7.2}
\]
for $\chi: \mathbb{Z} \to \mathbb{C}$ a Dirichlet character mod $q$, i.e., a multiplicative character on $(\mathbb{Z}/q\mathbb{Z})^*$ lifted to $\mathbb{Z}$. (The advantage here is that the phase term is now $e(\delta n/x)$ rather than $e(\alpha n)$, and $e(\delta n/x)$ varies very slowly as $n$ grows.) Our task, then, is to estimate $S_{\eta,\chi}(\delta/x, x)$ for $\delta$ small.

Estimates on $S_{\eta,\chi}(\delta/x, x)$ rely on the properties of Dirichlet $L$-functions $L(s, \chi) = \sum_n \chi(n)n^{-s}$. What is crucial is the location of the zeroes of $L(s, \chi)$ in the critical strip $0 \le \Re(s) \le 1$ (a region in which $L(s, \chi)$ can be defined by analytic continuation). In contrast to most previous work, we will not use zero-free regions, which are too narrow for our purposes. Rather, we use a verification of the Generalized Riemann Hypothesis up to bounded height for all conductors $q \le 300000$ (due to D. Platt [Plab]).
A key feature of the present work is that it allows one to mimic a wide variety of smoothing functions by means of estimates on the Mellin transform of a single smoothing function: here, the Gaussian $e^{-t^2/2}$.

7.1 Results

Write $\eta_\heartsuit(t) = e^{-t^2/2}$. Let us first give a bound for exponential sums on the primes using $\eta_\heartsuit$ as the smooth weight. Without loss of generality, we may assume that our character $\chi$ mod $q$ is primitive, i.e., that it is not really a character to a smaller modulus $q' \mid q$.
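Theorem 7.1.1. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive Dirichlet character mod $q$, $1 \le q \le r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
\[
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)e^{-\frac{(n/x)^2}{2}} = I_{q=1}\cdot\widehat{\eta_\heartsuit}(-\delta)\cdot x + E\cdot x,
\]
where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
\[
|E| \le 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right).
\]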
We normalize the Fourier transform $\widehat{f}$ as follows: $\widehat{f}(t) = \int_{-\infty}^{\infty} e(-xt)f(x)\,dx$. Of course, $\widehat{\eta_\heartsuit}(-\delta)$ is just $\sqrt{2\pi}\,e^{-2\pi^2\delta^2}$.

As it turns out, smooth weights based on the Gaussian are often better in applications than the Gaussian $\eta_\heartsuit$ itself. Let us give a bound based on $\eta(t) = t^2\eta_\heartsuit(t)$.
primitive character mod q 1 le q le r where r = 300000Then for any δ isin R with |δ| le 4rq
infinsumn=1
Λ(n)χ(n)e
(δ
xn
)η(nx) = Iq=1 middot η(minusδ) middot x+ E middot x
where Iq=1 = 1 if q = 1 Iq=1 = 0 if q 6= 1 and
|E| le 2485 middot 10minus19 +1radicx
(281200radicq
+ 56
)
The advantage of $\eta(t) = t^2\eta_\heartsuit(t)$ over $\eta_\heartsuit$ is that it vanishes at the origin (to second order); as we shall see, this makes it easier to estimate exponential sums with the smoothing $\eta *_M g$, where $*_M$ is a Mellin convolution and $g$ is nearly arbitrary. Here is a good example that is used, crucially, in Part III.
Corollary 7.1.3. Let $\eta(t) = t^2e^{-t^2/2} *_M \eta_2(t)$, where $\eta_2 = \eta_1 *_M \eta_1$ and $\eta_1 = 2\cdot I_{[1/2,1]}$. Let $x$ be a real number $\ge 10^8$. Let $\chi$ be a primitive character mod $q$, $1 \le q \le r$, where $r = 300000$.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 4r/q$,
\[
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x,
\]
where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
\[
|E| \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right).
\]
Let us now look at a different kind of modification of the Gaussian smoothing. Say we would like a weight of a specific shape; for example, what we will need to do in Part III, we would like an approximation to the function
\[
\eta: t \mapsto
\begin{cases}
t^3(2-t)^3e^{-(t-1)^2/2} & \text{for } t \in [0, 2],\\
0 & \text{otherwise.}
\end{cases}
\tag{7.3}
\]
At the same time, what we have is an estimate for the Mellin transform of the Gaussian $e^{-t^2/2}$, centered at $t = 0$.

The route taken here is to work with an approximation $\eta_+$ to $\eta$. We let
\[
\eta_+(t) = h_H(t)\cdot te^{-t^2/2}, \tag{7.4}
\]
where $h_H$ is a band-limited approximation to
\[
h(t) =
\begin{cases}
t^2(2-t)^3e^{t-1/2} & \text{if } t \in [0, 2],\\
0 & \text{otherwise.}
\end{cases}
\tag{7.5}
\]
By band-limited we mean that the restriction of the Mellin transform of $h_H$ to the imaginary axis is of compact support. (We could, alternatively, let $h_H$ be a function whose Fourier transform is of compact support; this would be technically easier in some ways, but it would also lead to using GRH verifications less efficiently.)
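The reason for the particular shape of $h$ in (7.5) is that $h(t)\cdot te^{-t^2/2}$ reproduces the target weight (7.3) exactly, since $t - 1/2 - t^2/2 = -(t-1)^2/2$. A short Python check of this identity follows; the function names are illustrative, and the band-limited truncation $h_H$ itself is not modelled here.

```python
import math

def h(t):
    """h(t) from (7.5)."""
    return t**2 * (2 - t)**3 * math.exp(t - 0.5) if 0 <= t <= 2 else 0.0

def eta_target(t):
    """The target weight from (7.3)."""
    return t**3 * (2 - t)**3 * math.exp(-(t - 1)**2 / 2) if 0 <= t <= 2 else 0.0

def eta_from_h(t):
    """h(t) * t * exp(-t^2/2), i.e. (7.4) with h in place of h_H."""
    return h(t) * t * math.exp(-t**2 / 2)

grid = [i / 100 for i in range(-50, 301)]
max_gap = max(abs(eta_from_h(t) - eta_target(t)) for t in grid)
print(max_gap)  # tiny: the two expressions agree up to rounding error
```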
To be precise: we define
\[
\begin{aligned}
F_H(t) &= \frac{\sin(H\log t)}{\pi\log t},\\
h_H(t) &= (h *_M F_H)(t) = \int_0^\infty h(ty^{-1})F_H(y)\frac{dy}{y},
\end{aligned}
\tag{7.6}
\]
and $H$ is a positive constant. It is easy to check that $MF_H(i\tau) = 1$ for $-H < \tau < H$ and $MF_H(i\tau) = 0$ for $\tau > H$ or $\tau < -H$ (unsurprisingly, since $F_H$ is a Dirichlet kernel under a change of variables). Since, in general, the Mellin transform of a multiplicative convolution $f *_M g$ equals $Mf\cdot Mg$, we see that the Mellin transform of $h_H$, on the imaginary axis, equals the truncation of the Mellin transform of $h$ to $[-iH, iH]$. Thus, $h_H$ is a band-limited approximation to $h$, as we desired.
The distinction between the odd and the even case in the statement that follows simply reflects the two different points up to which computations were carried out in [Plab]; these computations were, in turn, to some extent tailored to the needs of the present work (as was the shape of $\eta_+$ itself).
Theorem 7.1.4. Let $\eta(t) = \eta_+(t) = h_H(t)te^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$. Let $\chi$ be a primitive character mod $q$, where $1 \le q \le 150000$ if $q$ is odd, and $1 \le q \le 300000$ if $q$ is even.

Then, for any $\delta \in \mathbb{R}$ with $|\delta| \le 600000\cdot\gcd(q, 2)/q$,
\[
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) = I_{q=1}\cdot\widehat{\eta}(-\delta)\cdot x + E\cdot x,
\]
where $I_{q=1} = 1$ if $q = 1$, $I_{q=1} = 0$ if $q \ne 1$, and
\[
|E| \le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \frac{1}{\sqrt{x}}\left(\frac{499900}{\sqrt{q}} + 52\right).
\]
If $q = 1$, we have the sharper bound
\[
|E| \le 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}.
\]
22 where hH is a band-limited approximation to just about any continuous function of our choosing
Lastly we will need an explicit estimate of the `2 norm corresponding to the sumin Thm 714 for the trivial character
Proposition 7.1.5. Let $\eta(t) = \eta_+(t) = h_H(t)te^{-t^2/2}$, where $h_H$ is as in (7.6) and $H = 200$. Let $x$ be a real number $\ge 10^{12}$.

Then
\[
\begin{aligned}
\sum_{n=1}^{\infty}\Lambda(n)(\log n)\eta^2(n/x) &= x\cdot\int_0^\infty \eta_+^2(t)\log xt\,dt + E_1\cdot x\log x\\
&= 0.640206x\log x - 0.021095x + E_2\cdot x\log x,
\end{aligned}
\]
where
\[
|E_1| \le 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}}, \qquad |E_2| \le 2\cdot 10^{-6} + \frac{366.91}{\sqrt{x}}.
\]
7.2 Main ideas

An explicit formula gives an expression
\[
S_{\eta,\chi}(\delta/x, x) = I_{q=1}\widehat{\eta}(-\delta)x - \sum_\rho F_\delta(\rho)x^\rho + \text{small error}, \tag{7.7}
\]
where $I_{q=1} = 1$ if $q = 1$ and $I_{q=1} = 0$ otherwise. Here $\rho$ runs over the complex numbers $\rho$ with $L(\rho, \chi) = 0$ and $0 < \Re(\rho) < 1$ ("non-trivial zeros"). The function $F_\delta$ is the Mellin transform of $e(\delta t)\eta(t)$ (see §2.4).
The questions are then where are the non-trivial zeros ρ of L(s χ) How fast doesFδ(ρ) decay as =(ρ)rarr plusmninfin
Write σ = lt(s) τ = =(s) The belief is of course that σ = 12 for every non-trivial zero (Generalized Riemann Hypothesis) but this is far from proven Most workto date has used zero-free regions of the form σ le 1minus1C log q|τ | C a constant Thisis a classical zero-free region going back qualitatively to de la Vallee-Poussin (1899)The best values of C known are due to McCurley [McC84a] and Kadiri [Kad05]
These regions seem too narrow to yield a proof of the three-primes theorem Whatwe will use instead is a finite verification of GRH ldquoup to Tqrdquo ie a computation show-ing that for every Dirichlet character of conductor q le r0 (r0 a constant as above)every non-trivial zero ρ = σ + iτ with |τ | le Tq satisfies lt(σ) = 12 Such verifica-tions go back to Riemann modern computer-based methods are descended in part froma paper by Turing [Tur53] (See the historical article [Boo06b]) In his thesis [Pla11]D Platt gave a rigorous verification for r0 = 105 Tq = 108q In coordination withthe present work he has extended this to
• all odd $q \le 3 \cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \le 4 \cdot 10^5$, with $T_q = \max(10^8/q,\; 200 + 7.5 \cdot 10^7/q)$.
This was a major computational effort, involving, in particular, a fast implementation of interval arithmetic (used for the sake of rigor).
What remains to discuss, then, is how to choose $\eta$ in such a way that $F_\delta(\rho)$ decreases fast enough as $|\tau|$ increases, so that (7.7) gives a good estimate. We cannot hope for $F_\delta(\rho)$ to start decreasing consistently before $|\tau|$ is at least as large as a constant times $|\delta|$. Since $\delta$ varies within $(-cr_0/q, cr_0/q)$, this explains why $T_q$ is taken inversely proportional to $q$ in the above. As we will work with $r_0 \ge 150000$, we also see that we have little margin for maneuver: we want $F_\delta(\rho)$ to be extremely small already for, say, $|\tau| \ge 80|\delta|$. We also have a Scylla-and-Charybdis situation, courtesy of the uncertainty principle: roughly speaking, $F_\delta(\rho)$ cannot decrease faster than exponentially on $|\tau|/|\delta|$ both for $|\delta| \le 1$ and for $\delta$ large.
The most delicate case is that of $\delta$ large, since then $|\tau|/|\delta|$ is small. It turns out we can manage to get decay that is much faster than exponential for $\delta$ large, while no slower than exponential for $\delta$ small. This we will achieve by working with smoothing functions based on the (one-sided) Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$.
The Mellin transform of the twisted Gaussian $e(\delta t)e^{-t^2/2}$ is a parabolic cylinder function $U(a, z)$ with $z$ purely imaginary. Since fully explicit estimates for $U(a, z)$, $z$ imaginary, have not been worked out in the literature, we will have to derive them ourselves.
Once we have fully explicit estimates for the Mellin transform of the twisted Gaussian, we are able to use essentially any smoothing function based on the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$. As we already saw, we can and will consider smoothing functions obtained by convolving the twisted Gaussian with another function and also functions obtained by multiplying the twisted Gaussian with another function. All we need to do is use an explicit formula of the right kind – that is, a formula that does not assume too much about the smoothing function or the region of holomorphy of its Mellin transform, but still gives very good error terms, with simple expressions.
All results here will be based on a single, general explicit formula (Lem. 9.1.1) valid for all our purposes. The contribution of the zeros in the critical strip can be handled in a unified way (Lemmas 9.1.3 and 9.1.4). All that has to be done for each smoothing function is to bound a simple integral (in (9.24)). We then apply a finite verification of GRH and are done.
Chapter 8

The Mellin transform of the twisted Gaussian

Our aim in this chapter is to give fully explicit, yet relatively simple, bounds for the Mellin transform $F_\delta(\rho)$ of $e(\delta t)\eta_\heartsuit(t)$, where $\eta_\heartsuit(t) = e^{-t^2/2}$ and $\delta$ is arbitrary. The rapid decay that results will establish that the Gaussian $\eta_\heartsuit$ is a very good choice for a smoothing, particularly when the smoothing has to be twisted by an additive character $e(\delta t)$.
The Gaussian smoothing has been used before in number theory; see, notably, Heath-Brown's well-known paper on the fourth power moment of the Riemann zeta function [HB79]. What is new here is that we will derive fully explicit bounds on the Mellin transform of the twisted Gaussian. This means that the Gaussian smoothing will be a real option in explicit work on exponential sums in number theory and elsewhere from now on.¹
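Theorem 8.0.1. Let $f_\delta(t) = e^{-t^2/2}e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, $\sigma \ge 0$, $\tau \ne 0$. Let $\ell = -2\pi\delta$. Then, if $\mathrm{sgn}(\delta) \ne \mathrm{sgn}(\tau)$ and $\delta \ne 0$,
$$|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot \begin{cases} c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for } \rho \text{ arbitrary,} \\ c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for } \rho \le 3/2, \end{cases} \tag{8.1}$$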
¹ There has also been work using the Gaussian after a logarithmic change of variables; see, in particular, [Leh66]. In that case, the Mellin transform is simply a Gaussian (as in, e.g., [MV07, Ex. XII.2.9]). However, for $\delta$ non-zero, the Mellin transform of a twist $e(\delta t)e^{-(\log t)^2/2}$ decays very slowly, and thus would not be useful for our purposes, or, in general, for most applications in which GRH is not assumed.
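where $\rho = 4\tau/\ell^2$,
$$E(\rho) = \frac{1}{2}\left(\arccos\frac{1}{\upsilon(\rho)} - \frac{2(\upsilon(\rho) - 1)}{\rho}\right),$$
$$\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2(\pi/8)}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2}-1}{2}\tau}}{(\tan(\pi/8))^{\sigma}}\right),\\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma + 1/2},\ \frac{\sqrt{\sec(2\pi/5)}}{(\sin(\pi/5))^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right),
\end{aligned} \tag{8.2}$$
and
$$\upsilon(\rho) = \sqrt{\frac{1 + \sqrt{\rho^2 + 1}}{2}}.$$
If $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$ or $\delta = 0$,
$$|F_\delta(s)| \le |x_0|^{-\sigma} \cdot e^{-\frac{1}{2}\ell^2}\, |\Gamma(s)|\, e^{\frac{\pi}{2}|\tau|} \cdot \left(\left(1 + \frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|} + \frac{1}{2}e^{-\pi|\tau|}\right), \tag{8.3}$$
where
$$|x_0| \ge \begin{cases} 0.51729\sqrt{\tau} & \text{for } \rho \text{ arbitrary,} \\ 0.84473\, |\tau|/|\ell| & \text{for } \rho \le 3/2. \end{cases} \tag{8.4}$$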
As we shall see, the choice of smoothing function $\eta(t) = e^{-t^2/2}$ can be easily motivated by the method of stationary phase, but the problem is actually solved by the saddle-point method. One of the challenges here is to keep all expressions explicit and practical.
(In particular, the more critical estimate, (8.1), is optimal up to a constant depending on $\sigma$; the constants we give will be good rather than optimal.)
The expressions in Thm. 8.0.1 can be easily simplified further, especially if one is ready to introduce some mild constraints and make some sacrifices in the main term.
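Corollary 8.0.2. Let $f_\delta(t) = e^{-t^2/2}e(\delta t)$, $\delta \in \mathbb{R}$. Let $F_\delta$ be the Mellin transform of $f_\delta$. Let $s = \sigma + i\tau$, where $\sigma \in [0, 1]$ and $|\tau| \ge 20$. Then, for $0 \le k \le 2$,
$$|F_\delta(s + k)| + |F_\delta((1 - s) + k)| \le \begin{cases} \kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^{k} e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } 4|\tau|/\ell^2 < 3/2, \\[2ex] \kappa_{k,1}\,|\tau|^{k/2}\, e^{-0.1598|\tau|} & \text{if } 4|\tau|/\ell^2 \ge 3/2, \end{cases}$$
where
$$\kappa_{0,0} \le 3.001, \quad \kappa_{1,0} \le 4.903, \quad \kappa_{2,0} \le 7.96,$$
$$\kappa_{0,1} \le 3.286, \quad \kappa_{1,1} \le 4.017, \quad \kappa_{2,1} \le 5.13.$$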
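The two exponents in the corollary can be compared numerically with the function $E(\rho)$ of Theorem 8.0.1 at the threshold $\rho = 3/2$. The sketch below is only an illustrative check, not part of the proof; it uses the identity $\rho\tau = (2\tau/\ell)^2$, so that $E(\rho)\tau = (E(\rho)/\rho)\,(2\tau/\ell)^2$. The function names are ours.

```python
import math

def upsilon(rho):
    """upsilon(rho) = sqrt((1 + sqrt(rho^2 + 1)) / 2), as in Theorem 8.0.1."""
    return math.sqrt((1.0 + math.sqrt(rho * rho + 1.0)) / 2.0)

def E(rho):
    """E(rho) = (1/2) arccos(1/upsilon(rho)) - (upsilon(rho) - 1)/rho, rho > 0."""
    u = upsilon(rho)
    return 0.5 * math.acos(1.0 / u) - (u - 1.0) / rho

rho0 = 1.5
e_at_threshold = E(rho0)            # to be compared with the exponent 0.1598
e_over_rho = e_at_threshold / rho0  # to be compared with the exponent 0.1065
print(e_at_threshold, e_over_rho)
```

Both values agree with the constants of the corollary to about four decimal places, which suggests (without proving) that the corollary's exponents are the values of $E$ and $E/\rho$ at $\rho = 3/2$.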
We are considering $F_\delta(s + k)$, and not just $F_\delta(s)$, because bounding $F_\delta(s + k)$ enables us to work with smoothing functions equal to or based on $t^k e^{-t^2/2}$. Clearly, we can easily derive bounds with $k$ arbitrary from Thm. 8.0.1. It is just that we will use $k = 0, 1, 2$ in practice. Corollary 8.0.2 is meant to be applied to cases where $\tau$ is larger than a constant (10, say) times $|\ell|$, and $\sigma$ cannot be bounded away from 1; if either condition fails to hold, it is better to apply Theorem 8.0.1 directly.
Let us end by a remark that may be relevant to applications outside number theory. By (8.9), Thm. 8.0.1 gives us bounds on the parabolic cylinder function $U(a, z)$ for $z$ purely imaginary. (Surprisingly, there seem to have been no fully explicit bounds for this case in the literature.) The bounds are useful when $|\Im(a)|$ is at least somewhat larger than $|\Im(z)|$ (i.e., when $|\tau|$ is large compared to $\ell$). While Thm. 8.0.1 is stated for $\sigma \ge 0$ (i.e., for $\Re(a) \ge -1/2$), extending the result to larger half-planes for $a$ is not hard.
8.1 How to choose a smoothing function
Let us motivate our choice of smoothing function $\eta$. The method of stationary phase ([Olv74, §4.11], [Won01, §II.3]) suggests that the main contribution to the integral
$$F_\delta(s) = \int_0^\infty e(\delta t)\eta(t)\, t^{s}\, \frac{dt}{t} \tag{8.5}$$
should come when the phase has derivative 0. The phase part of (8.5) is
$$e(\delta t)\, t^{\Im(s) i} = e^{(2\pi\delta t + \tau \log t) i}$$
(where we write $s = \sigma + i\tau$); clearly,
$$(2\pi\delta t + \tau\log t)' = 2\pi\delta + \frac{\tau}{t} = 0$$
when $t = -\tau/2\pi\delta$. This is meaningful when $t \ge 0$, i.e., $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$. The contribution of $t = -\tau/2\pi\delta$ to (8.5) is then
$$\eta(t)e(\delta t)t^{s-1} = \eta\left(\frac{-\tau}{2\pi\delta}\right) e^{-i\tau} \left(\frac{-\tau}{2\pi\delta}\right)^{\sigma + i\tau - 1} \tag{8.6}$$
multiplied by a "width" approximately equal to a constant divided by
$$\sqrt{|(2\pi i\delta t + \tau \log t)''|} = \sqrt{|-\tau/t^2|} = \frac{2\pi|\delta|}{\sqrt{|\tau|}}.$$
The absolute value of (8.6) is
$$\eta\left(-\frac{\tau}{2\pi\delta}\right) \cdot \left|\frac{-\tau}{2\pi\delta}\right|^{\sigma - 1}. \tag{8.7}$$
In other words, if $\mathrm{sgn}(\tau) \ne \mathrm{sgn}(\delta)$ and $\delta$ is not too small, asking that $F_\delta(\sigma + i\tau)$ decay rapidly as $|\tau| \to \infty$ amounts to asking that $\eta(t)$ decay rapidly as $t \to \infty$. Thus, if we ask for $F_\delta(\sigma + i\tau)$ to decay rapidly as $|\tau| \to \infty$ for all moderate $\delta$, we are requesting that
1. $\eta(t)$ decay rapidly as $t \to \infty$,

2. the Mellin transform $F_0(\sigma + i\tau)$ decay rapidly as $\tau \to \pm\infty$.

Requirement (2) is there because we also need to consider $F_\delta(\sigma + it)$ for $\delta$ very small, and, in particular, for $\delta = 0$.
There is clearly an uncertainty-principle issue here; one cannot do arbitrarily well in both aspects at the same time. Once we are conscious of this, the choice $\eta(t) = e^{-t}$ in Hardy-Littlewood actually looks fairly good: obviously, $\eta(t) = e^{-t}$ decays exponentially, and its Mellin transform $\Gamma(s + i\tau)$ also decays exponentially as $\tau \to \pm\infty$. Moreover, for this choice of $\eta$, the Mellin transform $F_\delta(s)$ can be written explicitly: $F_\delta(s) = \Gamma(s)/(1 - 2\pi i\delta)^s$.
It is not hard to work out an explicit formula² for $\eta(t) = e^{-t}$. However, it is not hard to see that, for $F_\delta(s)$ as above, $F_\delta(1/2 + it)$ decays like $e^{-t/2\pi|\delta|}$, just as we expected from (8.7). This is a little too slow for our purposes: we will often have to work with relatively large $\delta$, and we would like to have to check the zeroes of $L$ functions only up to relatively low heights $t$ – say, up to $50|\delta|$. Then $e^{-t/2\pi|\delta|} > e^{-8} = 0.00033\ldots$, which is not very small. We will settle for a different choice of $\eta$: the Gaussian.
The decay of the Gaussian smoothing function $\eta(t) = e^{-t^2/2}$ is much faster than exponential. Its Mellin transform is $\Gamma(s/2)$, which decays exponentially as $\Im(s) \to \pm\infty$. Moreover, the Mellin transform $F_\delta(s)$ ($\delta \ne 0$), while not an elementary or very commonly occurring function, equals (after a change of variables) a relatively well-studied special function, namely, a parabolic cylinder function $U(a, z)$ (or, in Whittaker's [Whi03] notation, $D_{-a-1/2}(z)$).
For $\delta$ not too small, the main term will indeed work out to be proportional to $e^{-(\tau/2\pi\delta)^2/2}$, as the method of stationary phase indicated. This is, of course, much better than $e^{-\tau/2\pi|\delta|}$. The "cost" is that the Mellin transform $\Gamma(s/2)$ for $\delta = 0$ now decays like $e^{-(\pi/4)|\tau|}$ rather than $e^{-(\pi/2)|\tau|}$. This we can certainly afford.
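To make the comparison between the two smoothing functions concrete, the following minimal sketch evaluates the two heuristic decay rates at the height $|\tau| = 50|\delta|$ mentioned above. It only evaluates the leading exponentials quoted in the text (constants and polynomial factors are ignored), and the function names are ours.

```python
import math

def exp_decay(ratio):
    """e^{-t/(2 pi |delta|)} with ratio = t/|delta| (the choice eta(t) = e^{-t})."""
    return math.exp(-ratio / (2.0 * math.pi))

def gaussian_decay(ratio):
    """e^{-(tau/(2 pi delta))^2 / 2} with ratio = |tau|/|delta| (Gaussian choice)."""
    return math.exp(-0.5 * (ratio / (2.0 * math.pi)) ** 2)

ratio = 50.0
print("exponential smoothing:", exp_decay(ratio))
print("e^{-8}               :", math.exp(-8.0))
print("Gaussian smoothing   :", gaussian_decay(ratio))
```

At this height the exponential choice gives a factor slightly larger than $e^{-8}$, whereas the Gaussian main term is smaller by roughly ten orders of magnitude.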
8.2 The twisted Gaussian: overview and setup

8.2.1 Relation to the existing literature
We wish to approximate the Mellin transform
$$F_\delta(s) = \int_0^\infty e^{-t^2/2} e(\delta t)\, t^{s}\, \frac{dt}{t}, \tag{8.8}$$
where $\delta \in \mathbb{R}$. The parabolic cylinder function $U : \mathbb{C}^2 \to \mathbb{C}$ is given by
$$U(a, z) = \frac{e^{-z^2/4}}{\Gamma\left(\frac{1}{2} + a\right)} \int_0^\infty t^{a - \frac{1}{2}}\, e^{-\frac{1}{2}t^2 - zt}\, dt$$
² There may be a minor gap in the literature in this respect. The explicit formula given in [HL22, Lemma 4] does not make all constants explicit. The constants and trivial-zero terms were fully worked out for $q = 1$ by [Wig20] (cited in [MV07, Exercise 12.1.1.8(c)]; the sign of $\mathrm{hyp}_{\kappa,q}(z)$ there seems to be off). As was pointed out by Landau (see [Har66, p. 628]), [HL22] seems to neglect the effect of the zeros $\rho$ with $\Re(\rho) = 0$, $\Im(\rho) \ne 0$ for $\chi$ non-primitive. (The author thanks R. C. Vaughan for this information and the references.)
for $\Re(a) > -1/2$; the function can be extended to all $a, z \in \mathbb{C}$ either by analytic continuation or by other integral representations ([AS64, §19.5], [Tem10, §12.5(i)]). Hence
$$F_\delta(s) = e^{(\pi i\delta)^2}\, \Gamma(s)\, U\left(s - \frac{1}{2},\, -2\pi i\delta\right). \tag{8.9}$$
The second argument of $U$ is purely imaginary; it would be otherwise if a Gaussian of non-zero mean were chosen.
Let us briefly discuss the state of knowledge up to date on Mellin transforms of "twisted" Gaussian smoothings, that is, $e^{-t^2/2}$ multiplied by an additive character $e(\delta t)$. As we have just seen, these Mellin transforms are precisely the parabolic cylinder functions $U(a, z)$.
The function $U(a, z)$ has been well-studied for $a$ and $z$ real; see, e.g., [Tem10]. Less attention has been paid to the more general case of $a$ and $z$ complex. The most notable exception is by far the work of Olver [Olv58], [Olv59], [Olv61], [Olv65]; he gave asymptotic series for $U(a, z)$, $a, z \in \mathbb{C}$. These were asymptotic series in the sense of Poincaré, and thus not in general convergent; they would solve our problem if and only if they came with error term bounds. Unfortunately, it would seem that all fully explicit error terms in the literature are either for $a$ and $z$ real, or for $a$ and $z$ outside our range of interest (see both Olver's work and [TV03]). The bounds in [Olv61] involve non-explicit constants. Thus, we will have to find expressions with explicit error bounds ourselves. Our case is that of $a$ in the critical strip, $z$ purely imaginary.
8.2.2 General approach

We will use the saddle-point method (see, e.g., [dB81, §5], [Olv74, §4.7], [Won01, §II.4]) to obtain bounds with an optimal leading-order term and small error terms. (We used the stationary-phase method solely as an exploratory tool.)
What do we expect to obtain? Both the asymptotic expressions in [Olv59] and the bounds in [Olv61] make clear that, if the sign of $\tau = \Im(s)$ is different from that of $\delta$, there will be a change in behavior when $\tau$ gets to be of size about $(2\pi\delta)^2$. This is unsurprising, given our discussion using stationary phase: for $|\Im(a)|$ smaller than a constant times $|\Im(z)|^2$, the term proportional to $e^{-(\pi/4)|\tau|} = e^{-(\pi/4)|\Im(a)|}$ should be dominant, whereas for $|\Im(a)|$ much larger than a constant times $|\Im(z)|^2$, the term proportional to $e^{-\frac{1}{2}\left(\frac{\tau}{2\pi\delta}\right)^2}$ should be dominant.
There is one important difference between the approach we will follow here and that in [Hela]. In [Hela], the integral (8.8) was estimated by a direct application of the saddle-point method. Here, following a suggestion of N. Temme, we will use the identity
$$U(a, z) = \frac{e^{\frac{1}{4}z^2}}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{-zu + \frac{u^2}{2}}\, u^{-a - \frac{1}{2}}\, du \tag{8.10}$$
(see, e.g., [OLBC10, (12.5.6)]; $c > 0$ is arbitrary). Together, (8.9) and (8.10) give us that
$$F_\delta(s) = \frac{e^{-2\pi^2\delta^2}\, \Gamma(s)}{\sqrt{2\pi}\, i} \int_{c - i\infty}^{c + i\infty} e^{2\pi i\delta u + \frac{u^2}{2}}\, u^{-s}\, du. \tag{8.11}$$
Estimating the integral in (8.11) turns out to be a somewhat cleaner task than estimating (8.8). The overall procedure, however, is in essence the same in both cases.
We write
$$\phi(u) = -\frac{u^2}{2} - (2\pi i\delta)u + i\tau\log u \tag{8.12}$$
for $u$ real or complex, so that the integral in (8.11) equals
$$I(s) = \int_{c - i\infty}^{c + i\infty} e^{-\phi(u)}\, u^{-\sigma}\, du. \tag{8.13}$$
We wish to find a saddle point. A saddle point is a point $u$ at which $\phi'(u) = 0$. This means that
$$-u - 2\pi i\delta + \frac{i\tau}{u} = 0, \quad \text{i.e.,} \quad u^2 - i\ell u - i\tau = 0, \tag{8.14}$$
where $\ell = -2\pi\delta$. The solutions to $\phi'(u) = 0$ are thus
$$u_0 = \frac{i\ell \pm \sqrt{-\ell^2 + 4i\tau}}{2}. \tag{8.15}$$
The value of $\phi(u)$ at $u_0$ is
$$\phi(u_0) = -\frac{i\ell u_0 + i\tau}{2} + i\ell u_0 + i\tau\log u_0 = \frac{i\ell}{2}u_0 + i\tau\log\frac{u_0}{\sqrt{e}}. \tag{8.16}$$
The second derivative at $u_0$ is
$$\phi''(u_0) = -\frac{1}{u_0^2}\left(u_0^2 + i\tau\right) = -\frac{1}{u_0^2}\left(i\ell u_0 + 2i\tau\right). \tag{8.17}$$
Assign the names $u_{0,+}$, $u_{0,-}$ to the roots in (8.15) according to the sign in front of the square-root (where the square-root is defined so as to have its argument in the interval $(-\pi/2, \pi/2]$). We will actually have to pay attention just to $u_{0,+}$, since, unlike $u_{0,-}$, it lies on the right half of the plane, where our contour of integration also lies. We remark that
$$u_{0,+} = \frac{i\ell + |\ell|\sqrt{-1 + \frac{4i\tau}{\ell^2}}}{2} = \frac{\ell}{2}\left(i \pm \sqrt{-1 + \frac{4\tau}{\ell^2}i}\right), \tag{8.18}$$
where the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. If $\ell = 0$, then $u_{0,+} = (1/\sqrt{2} + i/\sqrt{2})\sqrt{\tau}$.
We can assume without loss of generality that $\tau \ge 0$. We will find it convenient to assume $\tau > 0$, since we can deal with $\tau = 0$ simply by letting $\tau \to 0^+$.
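As a sanity check on (8.14) and (8.15), the saddle points can be computed numerically. The sketch below is illustrative only; it assumes $\tau > 0$ and uses Python's `cmath.sqrt`, whose principal branch returns a root with argument in $(-\pi/2, \pi/2]$, matching the convention above. The function names are ours.

```python
import cmath

def saddle_points(ell, tau):
    """Both roots of u^2 - i*ell*u - i*tau = 0, as in (8.15).

    Returns (u_plus, u_minus), where u_plus takes the '+' sign in front of
    the principal square root.
    """
    root = cmath.sqrt(-ell * ell + 4j * tau)
    return (1j * ell + root) / 2, (1j * ell - root) / 2

def phi_prime(u, ell, tau):
    """phi'(u) = -u + i*ell + i*tau/u, using ell = -2*pi*delta."""
    return -u + 1j * ell + 1j * tau / u

for ell, tau in [(3.0, 5.0), (-3.0, 5.0), (0.0, 5.0)]:
    u_plus, u_minus = saddle_points(ell, tau)
    # |phi'(u_plus)| should vanish up to rounding; Re(u_plus) should be positive.
    print(ell, tau, u_plus, abs(phi_prime(u_plus, ell, tau)))
```

In each sample case the root with the $+$ sign has positive real part, in line with the remark that $u_{0,+}$ lies on the right half of the plane.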
8.3 The saddle point

8.3.1 The coordinates of the saddle point

We should start by determining $u_{0,+}$ explicitly, both in rectangular and polar coordinates. For one thing, we will need to estimate the integrand in (8.13) for $u = u_{0,+}$. The absolute value of the integrand is then $\left|e^{-\phi(u_{0,+})}u_{0,+}^{-\sigma}\right| = |u_{0,+}|^{-\sigma}e^{-\Re\phi(u_{0,+})}$, and, by (8.16),
$$\Re\phi(u_{0,+}) = -\frac{\ell}{2}\Im(u_{0,+}) - \arg(u_{0,+})\tau. \tag{8.19}$$
If $\ell = 0$, we already know that $\Re(u_{0,+}) = \Im(u_{0,+}) = \sqrt{\tau/2}$, $|u_{0,+}| = \sqrt{\tau}$ and $\arg u_{0,+} = \pi/4$. Assume from now on that $\ell \ne 0$.
We will use the expression for $u_{0,+}$ in (8.18). Solving a quadratic equation, we see that
$$\sqrt{-1 + \frac{4\tau}{\ell^2}i} = \sqrt{\frac{j(\rho) - 1}{2}} + i\sqrt{\frac{j(\rho) + 1}{2}}, \tag{8.20}$$
where $j(\rho) = (1 + \rho^2)^{1/2}$ and $\rho = 4\tau/\ell^2$. Hence
$$\Re(u_{0,+}) = \pm\frac{\ell}{2}\sqrt{\frac{j(\rho) - 1}{2}}, \qquad \Im(u_{0,+}) = \frac{\ell}{2}\left(1 \pm \sqrt{\frac{j(\rho) + 1}{2}}\right). \tag{8.21}$$
Here and in what follows, the sign $\pm$ is $+$ if $\ell > 0$ and $-$ if $\ell < 0$. (Notice that $\Re(u_{0,+})$ and $\Im(u_{0,+})$ are always positive, except for $\tau = \ell = 0$, in which case $\Re(u_{0,+}) = \Im(u_{0,+}) = 0$.) By (8.21),
$$\begin{aligned}
|u_{0,+}| &= \frac{|\ell|}{2} \cdot \left|\sqrt{\frac{-1 + j(\rho)}{2}} + \left(1 \pm \sqrt{\frac{1 + j(\rho)}{2}}\right)i\right| \\
&= \frac{|\ell|}{2}\sqrt{\frac{-1 + j(\rho)}{2} + \frac{1 + j(\rho)}{2} + 1 \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} \\
&= \frac{|\ell|}{2}\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}} = \frac{|\ell|}{\sqrt{2}}\sqrt{\upsilon(\rho)^2 \pm \upsilon(\rho)},
\end{aligned} \tag{8.22}$$
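where $\upsilon(\rho) = \sqrt{(1 + j(\rho))/2}$. We now compute the argument of $u_{0,+}$:
$$\begin{aligned}
\arg(u_{0,+}) &= \arg\left(\ell\left(i \pm \sqrt{-1 + i\rho}\right)\right) = \arg\left(\sqrt{\frac{-1 + j(\rho)}{2}} + i\left(\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}\right)\right) \\
&= \arcsin\left(\frac{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}{\sqrt{1 + j(\rho) \pm 2\sqrt{\frac{1 + j(\rho)}{2}}}}\right) = \arcsin\left(\frac{\sqrt{\pm 1 + \sqrt{\frac{1 + j(\rho)}{2}}}}{\sqrt{2\sqrt{\frac{1 + j(\rho)}{2}}}}\right) \\
&= \arcsin\left(\sqrt{\frac{1}{2}\left(1 \pm \sqrt{\frac{2}{1 + j(\rho)}}\right)}\right) = \frac{\pi}{2} - \frac{1}{2}\arccos\left(\pm\sqrt{\frac{2}{1 + j(\rho)}}\right)
\end{aligned} \tag{8.23}$$
(by $\cos(\pi - 2\theta) = -\cos 2\theta = 2\sin^2\theta - 1$). Thus
$$\arg(u_{0,+}) = \begin{cases} \dfrac{\pi}{2} - \dfrac{1}{2}\arccos\dfrac{1}{\upsilon(\rho)} = \dfrac{1}{2}\arccos\dfrac{-1}{\upsilon(\rho)} & \text{if } \ell > 0, \\[2ex] \dfrac{1}{2}\arccos\dfrac{1}{\upsilon(\rho)} & \text{if } \ell < 0. \end{cases} \tag{8.24}$$
In particular, $\arg(u_{0,+})$ lies in $[0, \pi/2]$, and is close to $\pi/2$ only when $\ell > 0$ and $\rho \to 0^+$. Here and elsewhere, we follow the convention that arcsin and arctan have image in $[-\pi/2, \pi/2]$, whereas arccos has image in $[0, \pi]$.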
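The closed-form expressions (8.21), (8.22) and (8.24) can be compared against a direct evaluation of (8.15). The following sketch is an illustrative numerical check for $\tau > 0$ and $\ell \ne 0$; the helper names are ours and do not appear in the text.

```python
import cmath
import math

def j(rho):
    return math.sqrt(1.0 + rho * rho)

def upsilon(rho):
    return math.sqrt((1.0 + j(rho)) / 2.0)

def u0_plus_direct(ell, tau):
    """u_{0,+} from (8.15), taking the '+' sign of the principal square root."""
    return (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2

def u0_plus_closed_form(ell, tau):
    """(Re, Im, modulus, argument) of u_{0,+} from (8.21), (8.22) and (8.24)."""
    rho = 4.0 * tau / (ell * ell)
    sign = 1.0 if ell > 0 else -1.0  # the sign called +/- in the text
    ups = upsilon(rho)
    re = sign * (ell / 2.0) * math.sqrt((j(rho) - 1.0) / 2.0)
    im = (ell / 2.0) * (1.0 + sign * ups)
    modulus = abs(ell) / math.sqrt(2.0) * math.sqrt(ups * ups + sign * ups)
    if ell > 0:
        arg = 0.5 * math.acos(-1.0 / ups)
    else:
        arg = 0.5 * math.acos(1.0 / ups)
    return re, im, modulus, arg

for ell, tau in [(3.0, 5.0), (-3.0, 5.0), (10.0, 1.0)]:
    print(u0_plus_direct(ell, tau), u0_plus_closed_form(ell, tau))
```

For each sample pair, the rectangular and polar coordinates computed from the closed forms agree with those of the directly computed root up to rounding.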
8.3.2 The direction of steepest descent

As is customary in the saddle-point method, it is now time to determine the direction of steepest descent at the saddle-point $u_{0,+}$. Even if we decide to use a contour that goes through the saddle-point in a direction that is not quite optimal, it will be useful to know what the direction $w$ of steepest descent actually is. A contour that passes through the saddle-point making an angle between $-\pi/4 + \varepsilon$ and $\pi/4 - \varepsilon$ with $w$ may be acceptable, in that the contribution of the saddle point is then suboptimal by at most a bounded factor depending on $\varepsilon$; an angle approaching $-\pi/4$ or $\pi/4$ leads to a contribution suboptimal by an unbounded factor.
Let $w \in \mathbb{C}$ be the unit vector pointing in the direction of steepest descent. Then, by definition, $w^2\phi''(u_{0,+})$ is real and positive, where $\phi$ is as in (8.12). Thus $\arg(w) = -\arg(\phi''(u_{0,+}))/2 \bmod \pi$. (The direction of steepest descent is defined only modulo $\pi$.) By (8.17),
$$\begin{aligned}
\arg(\phi''(u_{0,+})) &= -\pi + \arg(i\ell u_{0,+} + 2i\tau) - 2\arg(u_{0,+}) \mod 2\pi \\
&= -\frac{\pi}{2} + \arg(\ell u_{0,+} + 2\tau) - 2\arg(u_{0,+}) \mod 2\pi.
\end{aligned}$$
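By (8.21),
$$\begin{aligned}
\Re(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(\pm\sqrt{\frac{j(\rho) - 1}{2}} + \frac{4\tau}{\ell^2}\right) = \frac{\ell^2}{2}\left(\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}\right), \\
\Im(\ell u_{0,+} + 2\tau) &= \frac{\ell^2}{2}\left(1 \pm \sqrt{\frac{j(\rho) + 1}{2}}\right).
\end{aligned}$$
Therefore, $\arg(\ell u_{0,+} + 2\tau) = \arctan\varpi$, where
$$\varpi = \frac{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}.$$
It is easy to check that $\mathrm{sgn}\,\varpi = \mathrm{sgn}\,\ell$. Hence,
$$\arctan\varpi = \pm\frac{\pi}{2} - \arctan\frac{\rho \pm \sqrt{\frac{j(\rho) - 1}{2}}}{1 \pm \sqrt{\frac{j(\rho) + 1}{2}}}.$$
At the same time, writing $j = j(\rho)$ and $\upsilon = \upsilon(\rho)$,
$$\begin{aligned}
\frac{\rho \pm \sqrt{\frac{j - 1}{2}}}{1 \pm \sqrt{\frac{j + 1}{2}}} &= \frac{\left(\rho \pm \sqrt{\frac{j - 1}{2}}\right)\left(1 \mp \sqrt{\frac{j + 1}{2}}\right)}{1 - \frac{j + 1}{2}} = \frac{\rho \pm \sqrt{2(j - 1)} \mp \rho\sqrt{2(j + 1)}}{1 - j} \\
&= \frac{\rho \pm \sqrt{\frac{2}{j + 1}}\left(\sqrt{j^2 - 1} - \rho \cdot (j + 1)\right)}{1 - j} = \frac{\rho \pm \frac{1}{\upsilon}\left(\rho - \rho \cdot (j + 1)\right)}{1 - j} \\
&= \frac{\rho(1 \mp j/\upsilon)}{1 - j} = \frac{(-1 \pm j/\upsilon)(j + 1)}{\rho} = \frac{2\upsilon(-\upsilon \pm j)}{\rho}.
\end{aligned} \tag{8.25}$$
Hence, modulo $2\pi$,
$$\arg(\phi''(u_{0,+})) = -\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} - 2\arg(u_{0,+}) - \begin{cases} 0 & \text{if } \ell \ge 0, \\ \pi & \text{if } \ell < 0. \end{cases}$$
Therefore, the direction of steepest descent is
$$\arg(w) = -\frac{\arg(\phi''(u_{0,+}))}{2} = \arg(u_{0,+}) + \frac{1}{2}\arctan\frac{2\upsilon(-\upsilon \pm j)}{\rho} + \begin{cases} 0 & \text{if } \ell \ge 0, \\ \pi/2 & \text{if } \ell < 0. \end{cases} \tag{8.26}$$
By (8.24) and $\arccos 1/\upsilon = \arctan\sqrt{\upsilon^2 - 1} = \arctan\sqrt{(j - 1)/2}$, we conclude that
$$\arg(w) = \begin{cases} \dfrac{\pi}{2} + \dfrac{1}{2}\left(-\arctan\dfrac{2\upsilon(j + \upsilon)}{\rho} + \arctan\sqrt{\dfrac{j - 1}{2}}\right) & \text{if } \ell < 0, \\[2ex] \dfrac{\pi}{2} + \dfrac{1}{2}\left(\arctan\dfrac{2\upsilon(j - \upsilon)}{\rho} - \arctan\sqrt{\dfrac{j - 1}{2}}\right) & \text{if } \ell \ge 0. \end{cases} \tag{8.27}$$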
Figure 8.1: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell < 0$

Figure 8.2: $\arg(w) - \pi/2$ as a function of $\rho$ for $\ell \ge 0$

There is nothing wrong in using plots here to get an idea of the behavior of $\arg(w)$, since, at any rate, the direction of steepest descent will play only an advisory role in our choices. See Figures 8.1 and 8.2.
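In the same advisory spirit, formula (8.27) can be checked numerically against the defining property of $w$, namely that $w^2\phi''(u_{0,+})$ is real and positive. The sketch below assumes $\tau > 0$ and $\ell \ne 0$, and uses $\phi''(u) = -1 - i\tau/u^2$, obtained by differentiating (8.12) twice; the function names are ours.

```python
import cmath
import math

def steepest_descent_angle(ell, tau):
    """arg(w) from (8.27), for ell != 0 and tau > 0."""
    rho = 4.0 * tau / (ell * ell)
    jj = math.sqrt(1.0 + rho * rho)
    ups = math.sqrt((1.0 + jj) / 2.0)
    base = math.atan(math.sqrt((jj - 1.0) / 2.0))
    if ell < 0:
        return math.pi / 2 + 0.5 * (-math.atan(2 * ups * (jj + ups) / rho) + base)
    return math.pi / 2 + 0.5 * (math.atan(2 * ups * (jj - ups) / rho) - base)

def phi_second(u, tau):
    """phi''(u) = -1 - i*tau/u^2."""
    return -1.0 - 1j * tau / (u * u)

for ell, tau in [(3.0, 5.0), (-3.0, 5.0), (10.0, 1.0)]:
    u = (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2
    w = cmath.exp(1j * steepest_descent_angle(ell, tau))
    # The product should be a positive real number up to rounding.
    print(ell, tau, w * w * phi_second(u, tau))
```

Evaluating `steepest_descent_angle` on a grid of $\rho$ values is also a simple way to reproduce the qualitative behaviour shown in the two figures.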
8.4 The integral over the contour
We must now choose the contour of integration. The optimal contour should be one on which the phase of the integrand in (8.13) is constant, i.e., $\Im(\phi(u))$ is constant. This is so because, throughout the contour, we want to keep descending from the saddle as rapidly as possible, and so we want to maximize the absolute value of the derivative of the real part of the exponent $-\phi(u)$. At any point $u$, if we are to maximize $|\Re(d\phi(u)/dt)|$, we want our contour to be such that $\Im(d\phi(u)/dt) = 0$. (We can also see this as follows: if $\Im(\phi(u))$ is constant, there is no cancellation in (8.13) for us to miss.)
Writing $u = x + iy$, we obtain from (8.12) that
$$\Im(\phi(u)) = -xy + \ell x + \tau\log\sqrt{x^2 + y^2}. \tag{8.28}$$
We would thus be considering the curve $\Im(\phi(u)) = c$, where $c$ is a constant. Since we need the contour to pass through the saddle point $u_{0,+}$, we set $c = \Im(\phi(u_{0,+}))$. The only problem is that the curve $\Im(\phi(u)) = c$ given by (8.28) is rather uncomfortable to work with.
Instead, we shall use several rather simple contours, each appropriate for different values of $\ell$ and $\tau$.
8.4.1 A simple contour
Assume first that $\ell > 0$. We could just let our contour $L$ be the vertical line going through $u_{0,+}$. Since the direction of steepest descent is never far from vertical (see (8.2)), this would be a good choice. However, the vertical line has the defect of going too close to the origin when $\rho \to 0$.
Instead, we will let $L$ consist of three segments: (a) the straight vertical ray
$$\{(x_0, y) : y \ge y_0\},$$
where $x_0 = \Re u_{0,+} \ge 0$, $y_0 = \Im u_{0,+} > 0$; (b) the straight segment going downwards and to the right from $u_{0,+}$ to the $x$-axis, forming an angle of $\pi/2 - \beta$ (where $\beta > 0$ will be determined later) with the $x$-axis at a point $(x_1, 0)$; (c) the straight vertical ray $\{(x_1, y) : y \le 0\}$. Let us call these three segments $L_1$, $L_2$, $L_3$. Shifting the contour in (8.13), we obtain
$$I = \int_L e^{-\phi(u)}\, u^{-\sigma}\, du,$$
and so $|I| \le I_1 + I_2 + I_3$, where
$$I_j = \int_{L_j} \left|e^{-\phi(u)}\, u^{-\sigma}\right| |du|. \tag{8.29}$$
As we shall see, we have chosen the segments $L_j$ so that each of the three integrals $I_j$ will be easy to bound.
Let us start with $I_1$. Since $\sigma \ge 0$,
$$I_1 \le |u_{0,+}|^{-\sigma}\int_{y_0}^\infty e^{-\Re\phi(x_0 + iy)}\, dy,$$
where, by (8.12),
$$\Re\phi(x + iy) = \frac{y^2 - x^2}{2} - \ell y - \tau\arg(x + iy). \tag{8.30}$$
Let us expand the expression on the right of (8.30) for $x = x_0$ and $y$ around $y_0 = \Im u_{0,+} > 0$. The constant term is
$$\begin{aligned}
\Re\phi(u_{0,+}) &= -\frac{\ell}{2}y_0 - \tau\arg(u_{0,+}) = -\frac{\ell^2}{4}(1 + \upsilon(\rho)) - \frac{\tau}{2}\arccos\frac{-1}{\upsilon(\rho)} \\
&= -\left(\frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)}\right)\tau,
\end{aligned} \tag{8.31}$$
where we are using (8.19), (8.21) and (8.24).
The linear term vanishes because $u_{0,+}$ is a saddle-point (and thus a local extremum on $L$). It remains to estimate the quadratic term. Now, in (8.30), the term $\arg(x + iy)$ equals $\arctan(y/x)$, whose quadratic term we should now examine – but instead we are about to see that we can bound it trivially. In general, for $t_0, t \in \mathbb{R}$ and $f \in C^2$,
$$f(t) = f(t_0) + f'(t_0) \cdot (t - t_0) + \int_{t_0}^{t}\int_{t_0}^{r} f''(s)\, ds\, dr. \tag{8.32}$$
Now, $\arctan''(s) = -2s/(s^2 + 1)^2$, and this is negative for $s > 0$ and obeys
$$\arctan''(-s) = -\arctan''(s)$$
for all $s$. Hence, for $t_0 \ge 0$ and $t \ge -t_0$,
$$\arctan t \le \arctan t_0 + (\arctan' t_0) \cdot (t - t_0). \tag{8.33}$$
Therefore, in (8.30), we can consider only the quadratic term coming from $(y^2 - x^2)/2$ – namely, $(y - y_0)^2/2$ – and ignore the quadratic term coming from $\arg(x + iy)$. Thus,
$$\Re\phi(x_0 + iy) \ge \frac{(y - y_0)^2}{2} + \Re\phi(u_{0,+}) \tag{8.34}$$
for $y \ge -y_0$, and, in particular, for $y \ge y_0$. Hence
$$\int_{y_0}^\infty e^{-\Re\phi(x_0 + iy)}\, dy \le e^{-\Re\phi(u_{0,+})}\int_{y_0}^\infty e^{-\frac{1}{2}(y - y_0)^2}\, dy = \sqrt{\pi/2} \cdot e^{-\Re\phi(u_{0,+})}. \tag{8.35}$$
Notice that, once we choose to use the approximation (8.33), the vertical direction is actually optimal. (In turn, the fact that the direction of steepest descent is close to vertical shows us that we are not losing much by using the approximation (8.33).)
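Inequality (8.34) lends itself to a quick numerical spot check on the vertical ray $L_1$. The sketch below, for the case $\ell > 0$, $\tau > 0$ treated here, evaluates the gap $\Re\phi(x_0 + iy) - \Re\phi(u_{0,+}) - (y - y_0)^2/2$ on a finite grid; it is an illustration on sample values rather than a proof, and the grid size and range are arbitrary choices of ours.

```python
import cmath

def re_phi(u, ell, tau):
    """Re(phi(u)) from (8.30): (y^2 - x^2)/2 - ell*y - tau*arg(u)."""
    return (u.imag ** 2 - u.real ** 2) / 2 - ell * u.imag - tau * cmath.phase(u)

def min_gap_on_vertical_ray(ell, tau, steps=2000, y_max_factor=20.0):
    """Smallest gap in (8.34) over a grid of points y >= y0 on the ray x = x0."""
    u0 = (1j * ell + cmath.sqrt(-ell * ell + 4j * tau)) / 2
    x0, y0 = u0.real, u0.imag
    base = re_phi(u0, ell, tau)
    worst = float("inf")
    for k in range(steps + 1):
        y = y0 + (y_max_factor * y0) * k / steps
        gap = re_phi(complex(x0, y), ell, tau) - base - (y - y0) ** 2 / 2
        worst = min(worst, gap)
    return worst

for ell, tau in [(3.0, 5.0), (10.0, 1.0), (0.5, 7.0)]:
    print(ell, tau, min_gap_on_vertical_ray(ell, tau))
```

The minimum is attained at $y = y_0$, where the gap vanishes; elsewhere on the grid it is non-negative up to rounding, as (8.34) predicts.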
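As for $|u_{0,+}|^{-\sigma}$, we will estimate it by the easy bound
$$|u_{0,+}| = \frac{\ell}{\sqrt{2}}\sqrt{\upsilon^2 + \upsilon} \ge \frac{\ell}{\sqrt{2}}\max\left(\sqrt{\frac{\rho}{2}},\ \sqrt{2}\right) = \max\left(\sqrt{\tau},\ \ell\right), \tag{8.36}$$
where we use (8.22).
Let us now bound $I_2$. As we already said, the linear term at $u_{0,+}$ vanishes. Let $u_\circ$ be the point at which $L_2$ meets the line normal to it through the origin. We must take care that the angle formed by the origin, $u_{0,+}$ and $u_\circ$ be no larger than the angle formed by the origin, $(x_1, 0)$ and $u_\circ$; this will ensure that we are in the range in which the approximation (8.33) is valid (namely, $t \ge -t_0$, where $t_0 = \tan\alpha_0$). The first angle is $\pi/2 + \beta - \arg u_{0,+}$, whereas the second angle is $\pi/2 - \beta$. Hence, it is enough to set $\beta \le (\arg u_{0,+})/2$. Then we obtain from (8.12) and (8.33) that
$$\Re\phi(u) \ge \Re\phi(u_{0,+}) - \Re\frac{(u - u_{0,+})^2}{2}. \tag{8.37}$$
If we let $s = |u - u_{0,+}|$, we see that
$$\Re\frac{(u - u_{0,+})^2}{2} = \frac{s^2}{2}\cos\left(2 \cdot \left(\frac{\pi}{2} - \beta\right)\right) = -\frac{s^2}{2}\cos 2\beta.$$
Hence
$$\begin{aligned}
I_2 &\le |u_\circ|^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\, |du| \\
&< |u_\circ|^{-\sigma}\int_0^\infty e^{-\Re\phi(u_{0,+}) - \frac{s^2}{2}\cos 2\beta}\, ds = |u_\circ|^{-\sigma}\, e^{-\Re\phi(u_{0,+})}\sqrt{\frac{\pi}{2\cos 2\beta}}.
\end{aligned} \tag{8.38}$$
Since $\arg u_\circ = \arg u_{0,+} - \beta$, we see that, by (8.21),
$$\begin{aligned}
|u_\circ| &= \Re\left((x_0 + iy_0)(\cos\beta - i\sin\beta)\right) \\
&= \frac{\ell}{2}\left(\sqrt{\frac{j - 1}{2}}\cos\beta + \left(1 + \sqrt{\frac{j + 1}{2}}\right)\sin\beta\right).
\end{aligned} \tag{8.39}$$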
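The square of the expression within the outer parentheses is at least
$$\begin{aligned}
&\frac{j - 1}{2}\cos^2\beta + \left(1 + \frac{j + 1}{2} + \sqrt{2(j + 1)}\right)\sin^2\beta + \left(\sqrt{\frac{j^2 - 1}{4}} + \sqrt{\frac{j - 1}{2}}\right)\sin 2\beta \\
&\qquad \ge \frac{j}{2} + \frac{7}{2}\sin^2\beta - \frac{1}{2}\cos^2\beta + \frac{j}{2}\sin^2\beta.
\end{aligned}$$
If $\beta \ge \pi/8$, then $\tan\beta > 1/\sqrt{7}$, and so, since $j > \rho$, we obtain
$$|u_\circ| \ge \frac{\ell}{2}\sqrt{\frac{j}{2}\left(1 + \sin^2\beta\right)} > \frac{\ell\sqrt{\rho}}{2^{3/2}}\sqrt{1 + \sin^2\beta}.$$
We can also apply the trivial bound $j \ge 1$ directly to (8.39). Thus,
$$|u_\circ| \ge \max\left(\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\beta},\ \ell\sin\beta\right).$$
Let us choose $\beta$ as follows. We could always set $\beta = \pi/8$; since $\arg u_{0,+} \ge \pi/4$, we then have $\beta \le (\arg u_{0,+})/2$, as required. However, if $\rho \le 3/2$, then $\upsilon(\rho) \le 1.18381$, and so, by (8.24), $\arg u_{0,+} \ge 1.28842$. We can thus set either $\beta = \pi/6 = 0.523598\ldots$ or $\beta = \pi/5 = 0.628318\ldots$, say, either of which is smaller than $(\arg u_{0,+})/2$. Going back to (8.38), we conclude that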
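$$I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \frac{\sqrt{\pi}}{2^{1/4}}\left|\sqrt{\frac{\tau}{2}}\sqrt{1 + \sin^2\frac{\pi}{8}}\right|^{-\sigma}$$
for $\rho$ arbitrary, and
$$I_2 \le e^{-\Re\phi(u_{0,+})} \cdot \min\left(\sqrt{\frac{\pi/2}{\cos(2\pi/5)}} \cdot \left|\ell\sin\frac{\pi}{5}\right|^{-\sigma},\ \sqrt{\pi}\left|\frac{\ell}{2}\right|^{-\sigma}\right)$$
when $\rho \le 3/2$.
It remains to estimate $I_3$. For $u = x_1$,
$$\begin{aligned}
-\Re\frac{(u - u_{0,+})^2}{2} &= -\Re\frac{y_0^2(\tan\beta - i)^2}{2} = \frac{1}{2}\left(1 - \tan^2\beta\right)y_0^2 \\
&\ge \left(1 - \tan^2\beta\right) \cdot \frac{\ell^2}{8}\left(1 + \frac{j + 1}{2}\right) \ge \frac{\ell^2}{8}\left(1 - \tan^2\beta\right) \cdot \frac{\rho}{2} \\
&\ge \frac{1}{4}\left(1 - \tan^2\beta\right)\tau,
\end{aligned}$$
where we are using (8.21). Thus, (8.37) tells us that
$$\Re\phi(x_1) \ge \Re\phi(u_{0,+}) + \frac{1 - \tan^2\beta}{4}\tau.$$
At the same time, by (8.30) and $\tau, \ell \ge 0$,
$$\Re\phi(x_1 + iy) \ge \Re\phi(x_1) + \frac{y^2}{2}$$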
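for $y \le 0$. Hence
$$\begin{aligned}
I_3 &\le |x_1|^{-\sigma}\int_{L_3} e^{-\Re\phi(u)}\, |du| \le |x_1|^{-\sigma}\, e^{-\Re\phi(x_1)}\int_{-\infty}^{0} e^{-y^2/2}\, dy \\
&\le |x_1|^{-\sigma} \cdot \sqrt{\frac{\pi}{2}}\, e^{-\frac{1 - \tan^2\beta}{4}\tau}\, e^{-\Re\phi(u_{0,+})}.
\end{aligned}$$
Here note that $x_1 \ge (\tan\beta)|u_{0,+}|$, and so, by (8.36),
$$x_1 \ge \tan\beta \cdot \max\left(\sqrt{\tau},\ \ell\right).$$
We conclude that, for $\ell > 0$,
$$|I| \le \left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2(\pi/8)}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2} - 1}{2}\tau}}{(\tan(\pi/8))^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\tau^{\sigma/2}}\, e^{-\Re\phi(u_{0,+})}$$
(since $(1 - \tan^2(\pi/8))/4 = (\sqrt{2} - 1)/2$), and, when $\rho \le 3/2$,
$$|I| \le \left(1 + \min\left(2^{\sigma + 1/2},\ \frac{\sqrt{\sec(2\pi/5)}}{(\sin(\pi/5))^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right) \cdot \frac{\sqrt{\pi/2}}{\ell^{\sigma}}\, e^{-\Re\phi(u_{0,+})}.$$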
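The numerical constants entering the choice of $\beta$ above can be reproduced in a few lines. The sketch below is an illustrative check, not part of the argument; it uses the fact that $\upsilon(\rho)$ is increasing in $\rho$, so that for $\rho \le 3/2$ the extreme values occur at $\rho = 3/2$. Variable names are ours.

```python
import math

def upsilon(rho):
    return math.sqrt((1.0 + math.sqrt(1.0 + rho * rho)) / 2.0)

# beta = pi/8: tan(pi/8) = sqrt(2) - 1, so (1 - tan^2(pi/8))/4 = (sqrt(2) - 1)/2.
tan_b = math.tan(math.pi / 8)
lhs = (1.0 - tan_b ** 2) / 4.0
rhs = (math.sqrt(2.0) - 1.0) / 2.0

# rho <= 3/2: the largest value of upsilon is attained at rho = 3/2.
ups_max = upsilon(1.5)
# By (8.24) with ell > 0, arg(u_{0,+}) = (1/2) arccos(-1/upsilon), which is
# smallest when upsilon is largest.
arg_min = 0.5 * math.acos(-1.0 / ups_max)

# beta = pi/6: (1 - tan^2(pi/6))/4 = 1/6, the decay rate in the I_3 term.
decay_pi6 = (1.0 - math.tan(math.pi / 6) ** 2) / 4.0

print(lhs, rhs)
print(ups_max, arg_min, arg_min / 2, math.pi / 5, math.pi / 6)
print(decay_pi6)
```

The output confirms that both $\pi/6$ and $\pi/5$ lie below $(\arg u_{0,+})/2$ throughout the range $\rho \le 3/2$, as required for (8.33) to apply.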
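We know $\Re\phi(u_{0,+})$ from (8.31). Write
$$E(\rho) = \frac{1}{2}\arccos\frac{1}{\upsilon(\rho)} - \frac{\upsilon(\rho) - 1}{\rho}, \tag{8.40}$$
so that
$$-\frac{\Re\phi(u_{0,+})}{\tau} = \frac{1 + \upsilon(\rho)}{\rho} + \frac{1}{2}\arccos\frac{-1}{\upsilon(\rho)} = \frac{\pi}{2} - E(\rho) + \frac{2}{\rho}.$$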
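The rewriting of (8.31) in terms of $E(\rho)$ is an elementary identity, and can be confirmed numerically. The sketch below also looks at small $\rho$, where $E(\rho) \approx \rho/8$, so that $E(\rho)\tau \approx \tfrac{1}{2}(\tau/\ell)^2$, in agreement with the Gaussian main term $e^{-(\tau/2\pi\delta)^2/2}$ anticipated in §8.1. The function names are ours.

```python
import math

def upsilon(rho):
    return math.sqrt((1.0 + math.sqrt(1.0 + rho * rho)) / 2.0)

def E(rho):
    """E(rho) as in (8.40)."""
    u = upsilon(rho)
    return 0.5 * math.acos(1.0 / u) - (u - 1.0) / rho

def minus_re_phi_over_tau(rho):
    """-Re(phi(u_{0,+}))/tau from (8.31), for ell > 0."""
    u = upsilon(rho)
    return (1.0 + u) / rho + 0.5 * math.acos(-1.0 / u)

for rho in (0.01, 0.5, 1.5, 10.0, 1000.0):
    lhs = minus_re_phi_over_tau(rho)
    rhs = math.pi / 2 - E(rho) + 2.0 / rho
    print(rho, lhs, rhs, E(rho) / rho)
```

The last column tends to $1/8$ as $\rho \to 0^+$, which is the small-$\rho$ behaviour described above.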
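To finish, we just need to apply (8.11). It makes sense to group together $\Gamma(s)e^{\frac{\pi}{2}\tau}$, since it is bounded on the critical line (by the classical formula $|\Gamma(1/2 + i\tau)| = \sqrt{\pi/\cosh\pi\tau}$, as in [MV07, Exer. C.1(b)]), and in general of slow growth on bounded strips. Using (8.11), and noting that $2\pi^2\delta^2 = \ell^2/2 = (2/\rho) \cdot \tau$, we obtain
$$|F_\delta(s)| \le |\Gamma(s)|\, e^{\frac{\pi}{2}\tau} e^{-E(\rho)\tau} \cdot \begin{cases} c_{1,\sigma,\tau}/\tau^{\sigma/2} & \text{for } \rho \text{ arbitrary,} \\ c_{2,\sigma,\tau}/\ell^{\sigma} & \text{for } \rho \le 3/2, \end{cases} \tag{8.41}$$
where
$$\begin{aligned}
c_{1,\sigma,\tau} &= \frac{1}{2}\left(1 + 2^{1/4}\left(\frac{2}{1 + \sin^2(\pi/8)}\right)^{\sigma/2} + \frac{e^{-\frac{\sqrt{2}-1}{2}\tau}}{(\tan(\pi/8))^{\sigma}}\right),\\
c_{2,\sigma,\tau} &= \frac{1}{2}\left(1 + \min\left(2^{\sigma + 1/2},\ \frac{\sqrt{\sec(2\pi/5)}}{(\sin(\pi/5))^{\sigma}}\right) + \frac{e^{-\tau/6}}{(1/\sqrt{3})^{\sigma}}\right).
\end{aligned} \tag{8.42}$$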
We have assumed throughout that $\ell \ge 0$ and $\tau \ge 0$. We can immediately obtain a bound valid for $\ell \le 0$, $\tau \le 0$ by reflection on the $x$-axis: we simply put absolute values around $\tau$ and $\ell$ in (8.41).
We see that we have obtained a bound in a neat, closed form without too much effort. Of course, this effortlessness is usually in part illusory; the contour we have used here is actually the product of some trial and error, in that some other contours give results that are comparable in quality but harder to simplify. We will have to choose a different contour when $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$.

8.4.2 Another simple contour
We now wish to give a bound for the case of $\mathrm{sgn}(\ell) \ne \mathrm{sgn}(\tau)$, i.e., $\mathrm{sgn}(\delta) = \mathrm{sgn}(\tau)$. We expect a much smaller upper bound than for $\mathrm{sgn}(\ell) = \mathrm{sgn}(\tau)$, given what we already know from the method of stationary phase. This also means that we will not need to be as careful in order to get a bound that is good enough for all practical purposes.
Our contour $L$ will consist of three segments: (a) the straight vertical ray $\{(x_0, y) : y \ge 0\}$, (b) the quarter-circle from $(x_0, 0)$ to $(0, -x_0)$ (that is, an arc where the argument runs from $0$ to $-\pi/2$) and (c) the straight vertical ray $\{(0, y) : y \le -x_0\}$. We call these segments $L_1$, $L_2$, $L_3$, and define the integrals $I_1$, $I_2$ and $I_3$ just as in (8.29).
Much as before, we have
$$I_1 \le x_0^{-\sigma}\int_0^\infty e^{-\Re\phi(x_0 + iy)}\, dy.$$
Since (8.33) is valid for $t \ge 0$, (8.34) holds, and so
$$I_1 \le x_0^{-\sigma}\, e^{-\Re\phi(u_{0,+})}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(y - y_0)^2}\, dy = x_0^{-\sigma}\sqrt{2\pi} \cdot e^{-\Re\phi(u_{0,+})}.$$
By (8.12) and (8.30),
$$I_2 \le x_0^{-\sigma}\int_{L_2} e^{-\Re\phi(u)}\, |du| = x_0^{1 - \sigma}\int_0^{\pi/2} e^{-\left(-\frac{x_0^2\cos 2\alpha}{2} + \ell x_0\sin\alpha + \tau\alpha\right)}\, d\alpha. \tag{8.43}$$
Now, for $\alpha \ge 0$ and $\ell \le 0$,
$$(\ell x_0\sin\alpha + \tau\alpha)' = \ell x_0\cos\alpha + \tau \ge \ell x_0 + \tau.$$
Since $j = \sqrt{1 + \rho^2} \le 1 + \rho^2/2$, we have $\sqrt{(j - 1)/2} \le \rho/2$, and so, by (8.21), $|\ell x_0| \le \ell^2\rho/4 = \tau$, and thus $\ell x_0 + \tau \ge 0$. In other words, the exponent in (8.43) equals $(x_0^2\cos 2\alpha)/2$ minus an increasing function, and so, since $\Re\phi(x_0) = -x_0^2/2$,
$$I_2 \le x_0^{-\sigma} \cdot x_0\int_0^{\pi/2} e^{\frac{x_0^2\cos 2\alpha}{2}}\, d\alpha = x_0^{-\sigma} \cdot \frac{\pi}{2}x_0 \cdot I_0(x_0^2/2),$$
where $I_0(t) = \frac{1}{\pi}\int_0^\pi e^{t\cos\theta}\, d\theta$ is the modified Bessel function of the first kind (and order 0).
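Since $\cos\theta = \sqrt{1 - \sin^2\theta} < 1 - (\sin^2\theta)/2 \le 1 - 2\theta^2/\pi^2$, we have³
$$I_0(t) \le \frac{1}{\pi}\int_0^\pi e^{t\left(1 - \frac{2\theta^2}{\pi^2}\right)}\, d\theta < e^t \cdot \frac{1}{\pi}\int_0^\infty e^{-\frac{2t}{\pi^2}\theta^2}\, d\theta = e^t\, \frac{\pi/\sqrt{2t}}{\pi}\, \frac{\sqrt{\pi}}{2} = \frac{\sqrt{\pi}}{2^{3/2}}\, \frac{e^t}{\sqrt{t}}$$
for $t \ge 0$.
Using the fact that $\Re\phi(x_0) = -x_0^2/2$, we conclude that
$$I_2 \le x_0^{-\sigma} \cdot \frac{\pi}{2}x_0 \cdot \frac{\sqrt{\pi}}{2^{3/2}}\, \frac{e^{x_0^2/2}}{x_0/\sqrt{2}} = \frac{\pi^{3/2}}{4}\, x_0^{-\sigma}\, e^{-\Re\phi(x_0)}.$$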
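The bound on $I_0(t)$ used here can be spot-checked numerically. The sketch below evaluates $I_0$ through its power series $\sum_k (t^2/4)^k/(k!)^2$ (a standard expansion, not taken from the text) and compares $I_0(t)\sqrt{t}\,e^{-t}$ with $\sqrt{\pi}/2^{3/2}$ on a grid in $(0, 50]$. It is a floating-point illustration, not a rigorous verification of the kind carried out with interval arithmetic in the text; the grid and the number of series terms are arbitrary choices of ours.

```python
import math

def bessel_i0(t, terms=200):
    """I_0(t) via the power series sum_k (t^2/4)^k / (k!)^2 (all terms positive)."""
    total, term = 1.0, 1.0
    q = t * t / 4.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def scaled(t):
    """I_0(t) * sqrt(t) * e^{-t}; the bound in the text says this is <= sqrt(pi)/2^{3/2}."""
    return bessel_i0(t) * math.sqrt(t) * math.exp(-t)

bound = math.sqrt(math.pi) / 2 ** 1.5
worst = max(scaled(0.01 * k) for k in range(1, 5001))  # grid over t in (0, 50]
print(bound, worst)
```

On this grid the largest observed value stays well below $\sqrt{\pi}/2^{3/2}$, consistent with the sharper constant quoted in the footnote.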
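By (8.34), which is valid for all $\ell$, we know that $\Re\phi(x_0) \ge \Re\phi(u_{0,+})$.
Let us now estimate the integral on $L_3$. Again by (8.30), for $y < 0$,
$$\Re\phi(iy) = \frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}.$$
Hence
$$\begin{aligned}
\left|\int_{L_3} e^{-\phi(u)}\, u^{-\sigma}\, du\right| &\le x_0^{-\sigma}\int_{-\infty}^{-x_0} e^{-\left(\frac{y^2}{2} - \ell y + \tau\frac{\pi}{2}\right)}\, dy \\
&= x_0^{-\sigma}\, e^{\frac{1}{2}\ell^2}\, e^{-\frac{\tau\pi}{2}}\int_{-\infty}^{-x_0} e^{-\frac{1}{2}(y - \ell)^2}\, dy \le x_0^{-\sigma}\, e^{-\frac{\tau\pi}{2}}\sqrt{\frac{\pi}{2}},
\end{aligned}$$
since $y - \ell \le -\ell$ for $y \le -x_0$ and $\int_{-\infty}^{-\ell} e^{-t^2/2}\, dt \le \sqrt{\pi/2} \cdot e^{-\ell^2/2}$ (by [AS64, 7.1.13]).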
Now that we have bounded the integrals over $L_1$, $L_2$ and $L_3$, it remains to bound $x_0$ from below, starting from (8.21). We will bound it differently for $\rho < 3/2$ and for $\rho \ge 3/2$. (The choice of $3/2$ is fairly arbitrary.)
Expanding $(\sqrt{1 + t} - 1)^2 > 0$, we obtain that $2(1 + t) - 2\sqrt{1 + t} \ge t$ for all $t \ge -1$, and so
$$\left(\frac{\sqrt{1 + t} - 1}{t}\right)' = \frac{1}{t^2}\left(\frac{t}{2\sqrt{1 + t}} - \left(\sqrt{1 + t} - 1\right)\right) < 0,$$
i.e., $(\sqrt{1 + t} - 1)/t$ decreases as $t$ increases. Hence, for $\rho \le \rho_0$, where $\rho_0 \ge 0$,
$$j(\rho) = \sqrt{1 + \rho^2} \ge 1 + \frac{\sqrt{1 + \rho_0^2} - 1}{\rho_0^2}\rho^2, \tag{8.44}$$
which equals $1 + (2/9)(\sqrt{13} - 2)\rho^2$ for $\rho_0 = 3/2$. Thus, for $\rho \le 3/2$,
$$x_0 \ge \frac{|\ell|}{2}\sqrt{\frac{\frac{2}{9}(\sqrt{13} - 2)\rho^2}{2}} = \frac{\sqrt{\sqrt{13} - 2}}{6}|\ell|\rho = \frac{2\sqrt{\sqrt{13} - 2}}{3}\, \frac{\tau}{|\ell|} \ge 0.84473\, \frac{|\tau|}{|\ell|}. \tag{8.45}$$
³ It is actually not hard to prove rigorously the better bound $I_0(t)\le 0.468823\,e^t/\sqrt{t}$. For $t\ge 8$, this can be done directly by the change of variables $\cos\theta=1-2s^2$, $d\theta=2\,ds/\sqrt{1-s^2}$, followed by the usage of different upper bounds on the integrand $\exp(-2ts^2)/\sqrt{1-s^2}$ for $0\le s\le 1/2$ and $1/2\le s\le 1$. (Thanks are due to G. Kuperberg for this argument.) For $t<8$, use the Taylor expansion of $I_0(t)$ around $t=0$ [AS64, (9.6.12)], truncate it after 16 terms, and then bound the maximum of the truncated series by the bisection method, implemented via interval arithmetic (as described in §2.6).
On the other hand,
\[
\left(\frac{j(\rho)-1}{\rho}\right)' = \frac{1}{\rho^2}\left(j'(\rho)\rho-(j(\rho)-1)\right) = \frac{\rho^2-(1+\rho^2)+\sqrt{1+\rho^2}}{\rho^2\sqrt{1+\rho^2}}\ge 0,
\]
and so, for $\rho\ge 3/2$, $(j(\rho)-1)/\rho$ is minimal at $\rho=3/2$, where it takes the value $(\sqrt{13}-2)/3$. Hence
\[
x_0=\frac{|\ell|}{2}\sqrt{\frac{j(\rho)-1}{2}}\ge\frac{|\ell|\sqrt{\rho}}{2}\cdot\frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}} = \frac{\sqrt{\sqrt{13}-2}}{\sqrt{6}}\sqrt{\tau}\ge 0.51729\sqrt{\tau}. \tag{8.46}
\]
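We now sum $I_1$, $I_2$ and $I_3$, and then use (8.11); we obtain that, when $\ell<0$ and $\tau\ge 0$,
\[
|F_\delta(s)|\le\frac{e^{-2\pi^2\delta^2}|\Gamma(s)|}{\sqrt{2\pi}}\left|\int_L e^{-\phi(u)}u^{-\sigma}\,du\right|
\le |x_0|^{-\sigma}\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\Re\phi(u_{0,+})}+\frac{1}{2}e^{-\frac{\tau\pi}{2}}\right)e^{-\frac{1}{2}\ell^2}|\Gamma(s)|. \tag{8.47}
\]
By (8.19), (8.21) and (8.24),
\[
-\Re(\phi(u_{0,+}))=\frac{\ell^2}{4}(1-\upsilon(\rho))+\frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)} < \frac{\tau}{2}\arccos\frac{1}{\upsilon(\rho)}\le\frac{\pi}{4}\tau.
\]
We conclude that, when $\mathrm{sgn}(\ell)\ne\mathrm{sgn}(\tau)$ (i.e., $\mathrm{sgn}(\delta)=\mathrm{sgn}(\tau)$),
\[
|F_\delta(s)|\le|x_0|^{-\sigma}\cdot e^{-\frac{1}{2}\ell^2}|\Gamma(s)|e^{\frac{\pi}{2}|\tau|}\cdot\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|}+\frac{1}{2}e^{-\pi|\tau|}\right),
\]
where $x_0$ can be bounded as in (8.45) and (8.46). Here, as before, we are reducing the case $\tau<0$ to the case $\tau>0$ by reflection. This concludes the proof of Theorem 8.0.1.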
8.5 Conclusions

We have obtained bounds on $|F_\delta(s)|$ for $\mathrm{sgn}(\delta)\ne\mathrm{sgn}(\tau)$ (8.41) and for $\mathrm{sgn}(\delta)=\mathrm{sgn}(\tau)$ (8.47). Our task is now to simplify them.

First, let us look at the exponent $E(\rho)$ defined as in (8.2). Its plot can be seen in Figure 8.3. We claim that
\[
E(\rho)\ge\begin{cases}0.1598 & \text{if } \rho\ge 1.5,\\ 0.1065\rho & \text{if } \rho<1.5.\end{cases} \tag{8.48}
\]
This is so for $\rho\ge 1.5$ because $E(\rho)$ is increasing on $\rho$, and $E(1.5)=0.15982\ldots$. The case $\rho<1.5$ is a little more delicate. We can easily see that $\arccos(1-t^2/2)\ge t$ for $0\le t\le 2$ (since the derivative of the left side is $1/\sqrt{1-t^2/4}$, which is always $\ge 1$). We also have
\[
1+\frac{\rho^2}{2}-\frac{\rho^4}{8}\le j(\rho)\le 1+\frac{\rho^2}{2}
\]
Figure 8.3: The function $E(\rho)$
for $0\le\rho\le\sqrt{8}$, and so
\[
1+\frac{\rho^2}{8}-\frac{5\rho^4}{128}\le\upsilon(\rho)\le 1+\frac{\rho^2}{8}
\]
for $0\le\rho\le\sqrt{32/5}$; this, in turn, gives us that $1/\upsilon(\rho)\le 1-\rho^2/8+7\rho^4/128$ (again for $0\le\rho\le\sqrt{32/5}$), and so $1/\upsilon(\rho)\le 1-(1-7/64)\rho^2/8$ for $0\le\rho\le 1/2$. We conclude that
\[
\arccos\frac{1}{\upsilon(\rho)}\ge\frac{1}{2}\sqrt{\frac{57}{64}}\,\rho;
\]
therefore,
\[
E(\rho)\ge\frac{1}{4}\sqrt{\frac{57}{64}}\,\rho-\frac{\rho}{8}>0.11093\rho>0.1065\rho.
\]
In the remaining range $1/2\le\rho\le 3/2$, we prove that $E(\rho)/\rho>0.106551$ using the bisection method (with 20 iterations), implemented by means of interval arithmetic. This concludes the proof of (8.48).
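The claim (8.48) can be explored numerically with ordinary floating-point arithmetic. The sketch below is only an illustration and not a substitute for the interval-arithmetic bisection used in the text. It assumes the definitions $\upsilon(\rho)=\sqrt{(1+j(\rho))/2}$ and $E(\rho)=\frac12\arccos(1/\upsilon(\rho))-(\upsilon(\rho)-1)/\rho$; these are not stated in this section, but they are consistent with the bounds on $j$ and $\upsilon$ above and with the expression for $-\Re\phi(u_{0,+})$ when $\ell^2=4\tau/\rho$.

```python
import math

def upsilon(rho: float) -> float:
    # Assumed definition: upsilon(rho) = sqrt((1 + j(rho)) / 2), j(rho) = sqrt(1 + rho^2).
    j = math.sqrt(1.0 + rho * rho)
    return math.sqrt((1.0 + j) / 2.0)

def E(rho: float) -> float:
    # Assumed definition of the exponent E(rho) from (8.2).
    u = upsilon(rho)
    return 0.5 * math.acos(1.0 / u) - (u - 1.0) / rho

# Value at rho = 1.5, to be compared with 0.15982... in the text.
print(f"E(1.5) = {E(1.5):.6f}")

# Floating-point scan of E(rho)/rho on [1/2, 3/2] (not rigorous).
rhos = [0.5 + k / 1000 for k in range(1001)]
min_ratio = min(E(r) / r for r in rhos)
print(f"min of E(rho)/rho on the grid = {min_ratio:.6f}")
```

A grid scan of this kind cannot certify the inequality between grid points; that is precisely what bisection with interval arithmetic provides.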
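Assume from this point onwards that $|\tau|\ge 20$. Let us show that the contribution of (8.3) is negligible relative to that of (8.1). Indeed,
\[
\left(\left(1+\frac{\pi}{2^{3/2}}\right)e^{-\frac{\pi}{4}|\tau|}+\frac{1}{2}e^{-\pi|\tau|}\right)\le\frac{7.8}{10^6}e^{-0.1598\tau}.
\]
It is useful to note that $e^{-\ell^2/2}=e^{-2\tau/\rho}$, and so, for $\sigma\le k+1$ and $\rho\le 3/2$,
\[
\frac{e^{-2\tau/\rho}}{(0.84473|\tau|/\ell)^{\sigma}}\le\frac{e^{-40/\rho}}{\left(\frac{0.84473}{4}\rho\right)^{\sigma}\ell^{\sigma}}\le\frac{1}{\ell^{\sigma}}\left(\frac{4}{0.84473\cdot 1.5}\right)^{\sigma}\frac{e^{-80/(3t)}}{t^{\sigma}}
\le\frac{1}{\ell^{\sigma}}\cdot 3.15683^{k+1}\,\frac{e^{-80/(3t)}}{t^{k+1}}, \tag{8.49}
\]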
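where $t=2\rho/3\le 1$. Since $e^{-c/t}/t^{k+1}$ attains its maximum at $t=c/(k+1)$,
\[
\frac{e^{-80/(3t)}}{t^{k+1}}\le e^{-(k+1)}\left(\frac{3(k+1)}{80}\right)^{k+1},
\]
and so, for $\rho\le 3/2$,
\[
|x_0|^{-\sigma}e^{-\frac{1}{2}\ell^2}\le\frac{1}{\ell^{\sigma}}\cdot\begin{cases}0.04355 & \text{if } 0\le\sigma\le 1,\\ 0.00759 & \text{if } 1\le\sigma\le 2,\\ 0.00224 & \text{if } 2\le\sigma\le 3,\end{cases}
\]
whereas $|x_0|^{-\sigma}e^{-\ell^2/2}\le|x_0|^{-\sigma}\le(0.51729\sqrt{\tau})^{-\sigma}$ for $\rho\ge 3/2$.

We conclude that, for $|\tau|\ge 20$ and $\sigma\le 3$,
\[
|F_\delta(s)|\le|\Gamma(s)|e^{\frac{\pi}{2}\tau}\cdot e^{-0.1598\tau}\cdot\begin{cases}\dfrac{4}{10^7}\cdot\dfrac{1}{\ell^{\sigma}} & \text{if } \rho\le 3/2,\\[2mm] \dfrac{6}{10^5}\cdot\dfrac{1}{\tau^{\sigma/2}} & \text{if } \rho\ge 3/2,\end{cases} \tag{8.50}
\]
provided that $\mathrm{sgn}(\delta)=\mathrm{sgn}(\tau)$ or $\delta=0$. This will indeed be negligible compared to our bound for the case $\mathrm{sgn}(\delta)=-\mathrm{sgn}(\tau)$.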
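The numerical constants leading to (8.50) are easy to recompute. The following sketch is an illustrative check in floating point; the function names are ad hoc. It evaluates the inequality with constant $7.8/10^6$ at a few values of $\tau\ge 20$, the three constants obtained from (8.49), and the two constants $4/10^7$ and $6/10^5$.

```python
import math

def lhs(tau: float) -> float:
    """(1 + pi/2^(3/2)) e^(-pi tau/4) + (1/2) e^(-pi tau)."""
    return (1 + math.pi / 2 ** 1.5) * math.exp(-math.pi * tau / 4) \
        + 0.5 * math.exp(-math.pi * tau)

def rhs(tau: float) -> float:
    """(7.8 / 10^6) e^(-0.1598 tau)."""
    return 7.8e-6 * math.exp(-0.1598 * tau)

def small_rho_constant(k: int) -> float:
    """(4/(0.84473*1.5))^(k+1) * e^(-(k+1)) * (3(k+1)/80)^(k+1), from (8.49)."""
    c = 4 / (0.84473 * 1.5)
    return c ** (k + 1) * math.exp(-(k + 1)) * (3 * (k + 1) / 80) ** (k + 1)

for tau in (20.0, 30.0, 100.0):
    print(tau, lhs(tau) <= rhs(tau))

consts = [small_rho_constant(k) for k in range(3)]
print("constants for k = 0, 1, 2:", consts)

# Constants in (8.50): worst cases sigma <= 1 (rho <= 3/2) and sigma = 3 (rho >= 3/2).
small_case = 7.8e-6 * 0.04355
large_case = 7.8e-6 * 0.51729 ** (-3)
print(small_case, large_case)
```

Since $\pi/4>0.1598$, the ratio of the two sides of the first inequality only improves as $\tau$ grows, so $\tau=20$ is the critical case.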
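Let us now deal with the factor $|\Gamma(s)|e^{\frac{\pi}{2}\tau}$. By Stirling's formula with remainder term [GR94, (8.344)],
\[
\log\Gamma(s)=\frac{1}{2}\log(2\pi)+\left(s-\frac{1}{2}\right)\log s-s+\frac{1}{12s}+R_2(s),
\]
where
\[
|R_2(s)|<\frac{1/30}{12|s|^3\cos^3\left(\frac{\arg s}{2}\right)}\le\frac{\sqrt{2}}{180|s|^3}
\]
for $\Re(s)\ge 0$. The real part of $(s-1/2)\log s-s$ is
\[
(\sigma-1/2)\log|s|-\tau\arg(s)-\sigma=(\sigma-1/2)\log|s|-\frac{\pi}{2}\tau+\tau\left(\arctan\frac{\sigma}{|\tau|}-\frac{\sigma}{|\tau|}\right)
\]
for $s=\sigma+i\tau$, $\sigma\ge 0$. Since $\arctan(r)\le r$ for $r\ge 0$, we conclude that
\[
|\Gamma(s)|e^{\frac{\pi}{2}\tau}\le\sqrt{2\pi}\,|s|^{\sigma-\frac{1}{2}}e^{\frac{1}{12|s|}+\frac{\sqrt{2}}{180|s|^3}}. \tag{8.51}
\]
Lastly, $|s|^{\sigma-1/2}=|\tau|^{\sigma-1/2}|1+i\sigma/\tau|^{\sigma-1/2}$. For $|\tau|\ge 20$,
\[
|1+i\sigma/\tau|^{\sigma-1/2}\le\begin{cases}1.000625 & \text{if } 0\le\sigma\le 1,\\ 1.007491 & \text{if } 1\le\sigma\le 2,\\ 1.028204 & \text{if } 2\le\sigma\le 3\end{cases}
\]
and
\[
e^{\frac{1}{12|\tau|}+\frac{\sqrt{2}}{180|\tau|^3}}\le 1.004177.
\]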
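Thus,
\[
|\Gamma(s)|e^{\frac{\pi}{2}\tau}\le|\tau|^{\sigma-\frac{1}{2}}\cdot\begin{cases}2.51868 & \text{if } 0\le\sigma\le 1,\\ 2.53596 & \text{if } 1\le\sigma\le 2,\\ 2.5881 & \text{if } 2\le\sigma\le 3.\end{cases} \tag{8.52}
\]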
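The constants in (8.52) are products of $\sqrt{2\pi}$, the factor $|1+i\sigma/\tau|^{\sigma-1/2}$ and the Stirling correction $e^{1/(12|\tau|)+\sqrt{2}/(180|\tau|^3)}$. The sketch below recomputes them under the assumption (stated here, not in the text) that the worst case on each range $k\le\sigma\le k+1$, $|\tau|\ge 20$ is $\sigma=k+1$, $|\tau|=20$.

```python
import math

TAU_MIN = 20.0

def shape_factor(sigma: float, tau: float = TAU_MIN) -> float:
    """|1 + i sigma/tau|^(sigma - 1/2)."""
    return abs(complex(1.0, sigma / tau)) ** (sigma - 0.5)

# Stirling correction factor at |tau| = 20 (it decreases as |tau| grows).
stirling = math.exp(1 / (12 * TAU_MIN) + math.sqrt(2) / (180 * TAU_MIN ** 3))

# Assumed worst case on k <= sigma <= k+1: sigma = k+1, |tau| = 20.
gamma_consts = [math.sqrt(2 * math.pi) * shape_factor(k + 1) * stirling
                for k in range(3)]
for k, value in enumerate(gamma_consts):
    print(f"{k} <= sigma <= {k + 1}: {value:.6f}")
```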
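Let us now estimate the constants $c_{1,\sigma,\tau}$ and $c_{2,\sigma,\tau}$ in (8.2). By $|\tau|\ge 20$,
\[
e^{-\left(\frac{\sqrt{2}-1}{2}\right)\tau}\le 0.015889,\qquad e^{-\frac{\tau}{6}}\le 0.035674. \tag{8.53}
\]
Since $8\sin(\pi/8)=3.061467\ldots>1$, we obtain that
\[
c_{1,\sigma,\tau}\le\begin{cases}1.30454 & \text{if } 0\le\sigma\le 1,\\ 1.58361 & \text{if } 1\le\sigma\le 2,\\ 1.98186 & \text{if } 2\le\sigma\le 3,\end{cases}
\qquad
c_{2,\sigma,\tau}\le\begin{cases}1.94511 & \text{if } 0\le\sigma\le 1,\\ 3.15692 & \text{if } 1\le\sigma\le 2,\\ 5.02186 & \text{if } 2\le\sigma\le 3.\end{cases}
\]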
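Lastly, note that, for $k\le\sigma\le k+1$, we have
\[
\frac{1}{\tau^{\sigma/2}}\cdot|\tau|^{\sigma-\frac{1}{2}}=|\tau|^{\frac{\sigma-1}{2}}\le\tau^{k/2},
\]
whereas, for $\rho\le 3/2$ and $0\le\gamma\le 1$,
\[
\frac{|\tau|^{\gamma-\frac{1}{2}}}{|\ell|^{\gamma}}\le|\tau|^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{\tau}{\ell^2}\right)^{\gamma/2}\le 20^{\frac{\gamma}{2}-\frac{1}{2}}\left(\frac{3/2}{4}\right)^{\gamma/2}\le\left(\frac{3}{8}\right)^{1/2},
\]
and so
\[
\frac{1}{\ell^{\sigma}}\cdot|\tau|^{\sigma-\frac{1}{2}}=\left(\frac{|\tau|}{\ell}\right)^{k}\frac{|\tau|^{(\sigma-k)-\frac{1}{2}}}{|\ell|^{\sigma-k}}\le\sqrt{\frac{3}{8}}\cdot\left(\frac{|\tau|}{\ell}\right)^{k}.
\]
Multiplying, and remembering to add (8.50), we obtain that, for $k=0,1,2$, $\sigma\in[0,1]$ and $|\tau|\ge 20$,
\[
|F_\delta(s+k)|+|F_\delta((1-s)+k)|\le\begin{cases}\kappa_{k,0}\left(\dfrac{|\tau|}{|\ell|}\right)^{k}e^{-0.1065\left(\frac{2|\tau|}{|\ell|}\right)^2} & \text{if } \rho<3/2,\\[2mm] \kappa_{k,1}|\tau|^{k}e^{-0.1598|\tau|} & \text{if } \rho\ge 3/2,\end{cases}
\]
where
\[
\begin{aligned}
\kappa_{0,0}&\le(4\cdot 10^{-7}+1.94511)\cdot 2.51868\cdot\sqrt{3/8}\le 3.001,\\
\kappa_{1,0}&\le(4\cdot 10^{-7}+3.15692)\cdot 2.53596\cdot\sqrt{3/8}\le 4.903,\\
\kappa_{2,0}&\le(4\cdot 10^{-7}+5.02186)\cdot 2.5881\cdot\sqrt{3/8}\le 7.96,
\end{aligned}
\]
and, similarly,
\[
\begin{aligned}
\kappa_{0,1}&\le(6\cdot 10^{-5}+1.30454)\cdot 2.51868\le 3.286,\\
\kappa_{1,1}&\le(6\cdot 10^{-5}+1.58361)\cdot 2.53596\le 4.017,\\
\kappa_{2,1}&\le(6\cdot 10^{-5}+1.98186)\cdot 2.5881\le 5.13.
\end{aligned}
\]
This concludes the proof of Corollary 8.0.2.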
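The six constants $\kappa_{k,0}$, $\kappa_{k,1}$ are plain products of quantities already bounded above, so the arithmetic can be replayed directly. The sketch below does this; the variable names are ad hoc, and the inputs are the numbers quoted in (8.50), (8.52) and the bounds on $c_{1,\sigma,\tau}$, $c_{2,\sigma,\tau}$.

```python
import math

gamma_consts = [2.51868, 2.53596, 2.5881]   # from (8.52)
c1 = [1.30454, 1.58361, 1.98186]            # bounds on c_{1,sigma,tau}
c2 = [1.94511, 3.15692, 5.02186]            # bounds on c_{2,sigma,tau}

# rho < 3/2: add 4e-7 from (8.50), multiply by (8.52) and by sqrt(3/8).
kappa_0 = [(4e-7 + c2[k]) * gamma_consts[k] * math.sqrt(3 / 8) for k in range(3)]
# rho >= 3/2: add 6e-5 from (8.50), multiply by (8.52).
kappa_1 = [(6e-5 + c1[k]) * gamma_consts[k] for k in range(3)]

print("kappa_{k,0}:", [round(v, 5) for v in kappa_0])
print("kappa_{k,1}:", [round(v, 5) for v in kappa_1])
```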
Chapter 9
Explicit formulas
An explicit formula is an expression restating a sum such as $S_{\eta,\chi}(\delta/x,x)$ as a sum of the Mellin transform $G_\delta(s)$ over the zeros of the $L$-function $L(s,\chi)$. More specifically, for us, $G_\delta(s)$ is the Mellin transform of $\eta(t)e(\delta t)$ for some smoothing function $\eta$ and some $\delta\in\mathbb{R}$. We want a formula whose error terms are good both for $\delta$ very close or equal to 0 and for $\delta$ farther away from 0. (Indeed, our choice(s) of $\eta$ will be made so that $F_\delta(s)$ decays rapidly in both cases.)

We will be able to base all of our work on a single, general explicit formula, namely, Lemma 9.1.1. This explicit formula has simple error terms given purely in terms of a few norms of the given smoothing function $\eta$. We also give a common framework for estimating the contribution of zeros on the critical strip (Lemmas 9.1.3 and 9.1.4).

The first example we work out is that of the Gaussian smoothing $\eta(t)=e^{-t^2/2}$. We actually do this in part for didactic purposes and in part because of its likely applicability elsewhere; for our applications, we will always use smoothing functions based on $te^{-t^2/2}$ and $t^2e^{-t^2/2}$, generally in combination with something else. Since $\eta(t)=e^{-t^2/2}$ does not vanish at $t=0$, its Mellin transform has a pole at $s=0$, something that requires some additional work (Lemma 9.1.2; see also the proof of Lemma 9.1.1).

Other than that, for each function $\eta(t)$, all that has to be done is to bound an integral (from Lemma 9.1.3) and bound a few norms. Still, both for $\eta_*$ and for $\eta_+$, we find a few interesting complications. Since $\eta_+$ is defined in terms of a truncation of a Mellin transform (or, alternatively, in terms of a multiplicative convolution with a Dirichlet kernel, as in (7.4) and (7.6)), bounding the norms of $\eta_+$ and $\eta_+'$ takes a little work. We leave this to Appendix A. The effect of the convolution is then just to delay the decay by a shift, in that a rapidly decaying function $f(\tau)$ will get replaced by $f(\tau-H)$, $H$ a constant.

The smoothing function $\eta_*$ is defined as a multiplicative convolution of $t^2e^{-t^2/2}$ with something else. Given that we have an explicit formula for $t^2e^{-t^2/2}$, we obtain an explicit formula for $\eta_*$ by what amounts to just exchanging the order of a sum and an integral. (We already went over this in the introduction, in (1.40).)
9.1 A general explicit formula

We will prove an explicit formula valid whenever the smoothing $\eta$ and its derivative $\eta'$ satisfy rather mild assumptions: they will be assumed to be $L_2$-integrable and to have strips of definition containing $\{s: 1/2\le\Re(s)\le 3/2\}$, though any strip of the form $\{s:\epsilon\le\Re(s)\le 1+\epsilon\}$ would do just as well.

(For explicit formulas with different sets of assumptions, see, e.g., [IK04, §5.5] and [MV07, Ch. 12].)

The main idea in deriving any explicit formula is to start with an expression giving a sum as an integral over a vertical line with an integrand involving a Mellin transform (here, $G_\delta(s)$) and an $L$-function (here, $L(s,\chi)$). We then shift the line of integration to the left. If stronger assumptions were made (as in Exercise 5 in [IK04, §5.5]), we could shift the integral all the way to $\Re(s)=-\infty$; the integral would then disappear, replaced entirely by a sum over zeros (or even, as in the same Exercise 5, by a particularly simple integral). Another possibility is to shift the line only to $\Re(s)=1/2+\epsilon$ for some $\epsilon>0$, but this gives a weaker result, and at any rate the factor $L'(s,\chi)/L(s,\chi)$ can be large and messy to estimate within the critical strip $0<\Re(s)<1$.

Instead, we will shift the line to $\Re s=-1/2$. We can do this because the assumptions on $\eta$ and $\eta'$ are enough to continue $G_\delta(s)$ analytically up to there (with a possible pole at $s=0$). The factor $L'(s,\chi)/L(s,\chi)$ is easy to estimate for $\Re s<0$ and $s=0$ (by the functional equation), and the part of the integral on $\Re s=-1/2$ coming from $G_\delta(s)$ can be estimated easily using the fact that the Mellin transform is an isometry.
Lemma 9.1.1. Let $\eta:\mathbb{R}_0^+\to\mathbb{R}$ be in $C^1$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Assume that $\eta(t)$ and $\eta'(t)$ are in $\ell_2$ (with respect to the measure $dt$) and that $\eta(t)t^{\sigma-1}$ and $\eta'(t)t^{\sigma-1}$ are in $\ell_1$ (again with respect to $dt$) for all $\sigma$ in an open interval containing $[1/2,3/2]$.

Then
\[
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x)=I_{q=1}\cdot\widehat{\eta}(-\delta)x-\sum_{\rho}G_\delta(\rho)x^{\rho}
-R+O^*\left((\log q+6.01)\cdot(|\eta'|_2+2\pi|\delta||\eta|_2)\right)x^{-1/2}, \tag{9.1}
\]
where
\[
I_{q=1}=\begin{cases}1 & \text{if } q=1,\\ 0 & \text{if } q\ne 1,\end{cases}
\qquad
R=\eta(0)\left(\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right)+O^*(c_0) \tag{9.2}
\]
for $q>1$, $R=\eta(0)\log 2\pi$ for $q=1$, and
\[
c_0=\frac{2}{3}O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1+\left|\eta'(t)\sqrt{t}\right|_1+2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1+|\eta(t)\sqrt{t}|_1\right)\right). \tag{9.3}
\]
The norms $|\eta|_2$, $|\eta'|_2$, $|\eta'(t)/\sqrt{t}|_1$, etc., are taken with respect to the usual measure $dt$. The sum $\sum_\rho$ is a sum over all non-trivial zeros $\rho$ of $L(s,\chi)$.
Proof. Since (a) $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval containing $3/2$ and (b) $\eta(t)e(\delta t)$ has bounded variation (since $\eta,\eta'\in\ell_1$, implying that the derivative of $\eta(t)e(\delta t)$ is also in $\ell_1$), the Mellin inversion formula (as in, e.g., [IK04, 4.106]) holds:
\[
\eta(n/x)e(\delta n/x)=\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}G_\delta(s)x^sn^{-s}\,ds.
\]
Since $G_\delta(s)$ is bounded for $\Re(s)=3/2$ (by $\eta(t)t^{3/2-1}\in\ell_1$) and $\sum_n\Lambda(n)n^{-3/2}$ is bounded as well, we can change the order of summation and integration as follows:
\[
\begin{aligned}
\sum_{n=1}^{\infty}\Lambda(n)\chi(n)e(\delta n/x)\eta(n/x)&=\sum_{n=1}^{\infty}\Lambda(n)\chi(n)\cdot\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}G_\delta(s)x^sn^{-s}\,ds\\
&=\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}\sum_{n=1}^{\infty}\Lambda(n)\chi(n)G_\delta(s)x^sn^{-s}\,ds\\
&=\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}-\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds.
\end{aligned} \tag{9.4}
\]
(This is the way the procedure always starts: see, for instance, [HL22, Lemma 1] or, to look at a recent standard reference, [MV07, p. 144]. We are being very scrupulous about integration because we are working with general $\eta$.)

The first question we should ask ourselves is: up to where can we extend $G_\delta(s)$? Since $\eta(t)t^{\sigma-1}$ is in $\ell_1$ for $\sigma$ in an open interval $I$ containing $[1/2,3/2]$, the transform $G_\delta(s)$ is defined for $\Re(s)$ in the same interval $I$. However, we also know that the transformation rule $M(tf'(t))(s)=-s\cdot Mf(s)$ (see (2.10); by integration by parts) is valid when $s$ is in the holomorphy strip for both $M(tf'(t))$ and $Mf$. In our case ($f(t)=\eta(t)e(\delta t)$), this happens when $\Re(s)\in(I-1)\cap I$ (so that both sides of the equation in the rule are defined). Hence $s\cdot G_\delta(s)$ (which equals $s\cdot Mf(s)$) can be analytically continued to $\Re(s)$ in $(I-1)\cup I$, which is an open interval containing $[-1/2,3/2]$. This implies immediately that $G_\delta(s)$ can be analytically continued to the same region, with a possible pole at $s=0$.
When does $G_\delta(s)$ have a pole at $s=0$? This happens when $sG_\delta(s)$ is non-zero at $s=0$, i.e., when $M(tf'(t))(0)\ne 0$ for $f(t)=\eta(t)e(\delta t)$. Now
\[
M(tf'(t))(0)=\int_0^{\infty}f'(t)\,dt=\lim_{t\to\infty}f(t)-f(0).
\]
We already know that $f'(t)=(d/dt)(\eta(t)e(\delta t))$ is in $\ell_1$. Hence $\lim_{t\to\infty}f(t)$ exists, and must be 0 because $f$ is in $\ell_1$. Hence $-M(tf'(t))(0)=f(0)=\eta(0)$.

Let us look at the next term in the Laurent expansion of $G_\delta(s)$ at $s=0$. It is
\[
\begin{aligned}
\lim_{s\to 0}\frac{sG_\delta(s)-\eta(0)}{s}&=\lim_{s\to 0}\frac{-M(tf'(t))(s)-f(0)}{s}=-\lim_{s\to 0}\frac{1}{s}\int_0^{\infty}f'(t)(t^s-1)\,dt\\
&=-\int_0^{\infty}f'(t)\lim_{s\to 0}\frac{t^s-1}{s}\,dt=-\int_0^{\infty}f'(t)\log t\,dt.
\end{aligned}
\]
Here we were able to exchange the limit and the integral because $f'(t)t^{\sigma}$ is in $\ell_1$ for $\sigma$ in a neighborhood of 0; in turn, this is true because $f'(t)=(\eta'(t)+2\pi i\delta\eta(t))e(\delta t)$ and $\eta'(t)t^{\sigma}$ and $\eta(t)t^{\sigma}$ are both in $\ell_1$ for $\sigma$ in a neighborhood of 0. In fact, we will use the easy bounds $|\eta(t)\log t|_1\le(2/3)(|\eta(t)t^{-1/2}|_1+|\eta(t)t^{1/2}|_1)$, $|\eta'(t)\log t|_1\le(2/3)(|\eta'(t)t^{-1/2}|_1+|\eta'(t)t^{1/2}|_1)$, resulting from the inequality
\[
|\log t|\le\frac{2}{3}\left(t^{-\frac{1}{2}}+t^{\frac{1}{2}}\right), \tag{9.5}
\]
valid for all $t>0$.

We conclude that the Laurent expansion of $G_\delta(s)$ at $s=0$ is
\[
G_\delta(s)=\frac{\eta(0)}{s}+c_0+c_1s+\ldots, \tag{9.6}
\]
where
\[
c_0=O^*(|f'(t)\log t|_1)=\frac{2}{3}O^*\left(\left|\frac{\eta'(t)}{\sqrt{t}}\right|_1+\left|\eta'(t)\sqrt{t}\right|_1+2\pi|\delta|\left(\left|\frac{\eta(t)}{\sqrt{t}}\right|_1+|\eta(t)\sqrt{t}|_1\right)\right).
\]
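Inequality (9.5), read as an upper bound $|\log t|\le\frac{2}{3}(t^{-1/2}+t^{1/2})$ on $|\log t|$, is tight but true: writing $t=e^{2u}$ it becomes $2|u|\le\frac{4}{3}\cosh u$. The following floating-point scan on a logarithmic grid is an illustrative check only (the grid and the name `gap` are arbitrary choices).

```python
import math

def gap(t: float) -> float:
    """(2/3)(t^(-1/2) + t^(1/2)) - |log t|; (9.5) says this is >= 0."""
    return (2.0 / 3.0) * (t ** -0.5 + t ** 0.5) - abs(math.log(t))

# Logarithmic grid for t between e^(-20) and e^(20).
ts = [math.exp(-20 + 40 * k / 200000) for k in range(200001)]
min_gap = min(gap(t) for t in ts)
t_star = min(ts, key=gap)
print(f"smallest gap on the grid: {min_gap:.5f} at t = {t_star:.4f} (or 1/t, by symmetry)")
```

The minimum gap is small but positive, which shows why the constant $2/3$ cannot be lowered by much.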
We shift the line of integration in (9.4) to $\Re(s)=-1/2$. We obtain
\[
\frac{1}{2\pi i}\int_{\frac{3}{2}-i\infty}^{\frac{3}{2}+i\infty}-\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds=I_{q=1}G_\delta(1)x-\sum_{\rho}G_\delta(\rho)x^{\rho}-R
-\frac{1}{2\pi i}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)x^s\,ds, \tag{9.7}
\]
where
\[
R=\mathrm{Res}_{s=0}\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s).
\]
Of course,
\[
G_\delta(1)=M(\eta(t)e(\delta t))(1)=\int_0^{\infty}\eta(t)e(\delta t)\,dt=\widehat{\eta}(-\delta).
\]
Let us work out the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s=0$. By the functional equation (as in, e.g., [IK04, Thm. 4.15]),
\[
\frac{L'(s,\chi)}{L(s,\chi)}=\log\frac{\pi}{q}-\frac{1}{2}\psi\left(\frac{s+\kappa}{2}\right)-\frac{1}{2}\psi\left(\frac{1-s+\kappa}{2}\right)-\frac{L'(1-s,\chi)}{L(1-s,\chi)}, \tag{9.8}
\]
where $\psi(s)=\Gamma'(s)/\Gamma(s)$ and
\[
\kappa=\begin{cases}0 & \text{if } \chi(-1)=1,\\ 1 & \text{if } \chi(-1)=-1.\end{cases}
\]
By $\psi(1-x)-\psi(x)=\pi\cot\pi x$ (immediate from $\Gamma(s)\Gamma(1-s)=\pi/\sin\pi s$) and $\psi(s)+\psi(s+1/2)=2(\psi(2s)-\log 2)$ (Legendre; [AS64, (6.3.8)]),
\[
-\frac{1}{2}\left(\psi\left(\frac{s+\kappa}{2}\right)+\psi\left(\frac{1-s+\kappa}{2}\right)\right)=-\psi(1-s)+\log 2+\frac{\pi}{2}\cot\frac{\pi(s+\kappa)}{2}. \tag{9.9}
\]
Hence, unless $q=1$, the Laurent expansion of $L'(s,\chi)/L(s,\chi)$ at $s=0$ is
\[
\frac{1-\kappa}{s}+\left(\log\frac{2\pi}{q}-\psi(1)-\frac{L'(1,\chi)}{L(1,\chi)}\right)+a_1s+a_2s^2+\ldots
\]
Here $\psi(1)=-\gamma$, the Euler gamma constant [AS64, (6.3.2)].

There is a special case for $q=1$ due to the pole of $\zeta(s)$ at $s=1$. We know that $\zeta'(0)/\zeta(0)=\log 2\pi$ (see, e.g., [MV07, p. 331]).

From this and (9.6), we conclude that, if $\eta(0)=0$, then
\[
R=\begin{cases}c_0 & \text{if } q>1 \text{ and } \chi(-1)=1,\\ 0 & \text{otherwise,}\end{cases}
\]
where $c_0=O^*(|\eta'(t)\log t|_1+2\pi|\delta||\eta(t)\log t|_1)$. If $\eta(0)\ne 0$, then
\[
R=\eta(0)\left(\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right)+\begin{cases}c_0 & \text{if } \chi(-1)=1,\\ 0 & \text{otherwise}\end{cases}
\]
for $q>1$, and
\[
R=\eta(0)\log 2\pi
\]
for $q=1$.
for q = 1It is time to estimate the integral on the right side of (97) For that we will need to
estimate Lprime(s χ)L(s χ) for lt(s) = minus12 using (98) and (99)If lt(z) = 32 then |t2 + z2| ge 94 for all real t Hence by [OLBC10 (5915)]
and [GR94 (34111)]
ψ(z) = log z minus 1
2zminus 2
int infin0
tdt
(t2 + z2)(e2πt minus 1)
= log z minus 1
2z+ 2 middotOlowast
(int infin0
tdt94 (e2πt minus 1)
)= log z minus 1
2z+
8
9Olowast(int infin
0
tdt
e2πt minus 1
)= log z minus 1
2z+
8
9middotOlowast
(1
(2π)2Γ(2)ζ(2)
)= log z minus 1
2z+Olowast
(1
27
)= log z +Olowast
(10
27
)
(910)
Thus in particular ψ(1 minus s) = log(32 minus iτ) + Olowast(1027) where we write s =12 + iτ Now ∣∣∣∣cot
π(s+ κ)
2
∣∣∣∣ =
∣∣∣∣e∓π4 iminusπ2 τ + eplusmnπ4 i+
π2 τ
e∓π4 iminus
π2 τ minus eplusmnπ4 i+π
2 τ
∣∣∣∣ = 1
Since $\Re(s)=-1/2$, a comparison of Dirichlet series gives
\[
\left|\frac{L'(1-s,\chi)}{L(1-s,\chi)}\right|\le\frac{|\zeta'(3/2)|}{|\zeta(3/2)|}\le 1.50524, \tag{9.11}
\]
where $\zeta'(3/2)$ and $\zeta(3/2)$ can be evaluated by Euler-Maclaurin. Therefore, (9.8) and (9.9) give us that, for $s=-1/2+i\tau$,
\[
\begin{aligned}
\left|\frac{L'(s,\chi)}{L(s,\chi)}\right|&\le\left|\log\frac{q}{\pi}\right|+\log\left|\frac{3}{2}+i\tau\right|+\frac{10}{27}+\log 2+\frac{\pi}{2}+1.50524\\
&\le\left|\log\frac{q}{\pi}\right|+\frac{1}{2}\log\left(\tau^2+\frac{9}{4}\right)+4.1396.
\end{aligned} \tag{9.12}
\]
Recall that we must bound the integral on the right side of (9.7). The absolute value of the integral is at most $x^{-1/2}$ times
\[
\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}G_\delta(s)\right||ds|. \tag{9.13}
\]
By Cauchy-Schwarz, this is at most
\[
\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2|ds|}\cdot\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2|ds|}.
\]
By (9.12),
\[
\begin{aligned}
\sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{L'(s,\chi)}{L(s,\chi)}\cdot\frac{1}{s}\right|^2|ds|}&\le\sqrt{\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}\left|\frac{\log q}{s}\right|^2|ds|}+\sqrt{\int_{-\infty}^{\infty}\frac{\left|\frac{1}{2}\log\left(\tau^2+\frac{9}{4}\right)+4.1396+\log\pi\right|^2}{\frac{1}{4}+\tau^2}\,d\tau}\\
&\le\sqrt{2\pi}\log q+\sqrt{226.844},
\end{aligned}
\]
where we compute the last integral numerically.¹

Again, we use the fact that, by (2.10), $sG_\delta(s)$ is the Mellin transform of
\[
-t\frac{d(e(\delta t)\eta(t))}{dt}=-2\pi i\delta te(\delta t)\eta(t)-te(\delta t)\eta'(t). \tag{9.14}
\]
Hence, by Plancherel (as in (2.6)),
\[
\begin{aligned}
\sqrt{\frac{1}{2\pi}\int_{-\frac{1}{2}-i\infty}^{-\frac{1}{2}+i\infty}|G_\delta(s)s|^2|ds|}&=\sqrt{\int_0^{\infty}|-2\pi i\delta te(\delta t)\eta(t)-te(\delta t)\eta'(t)|^2t^{-2}\,dt}\\
&\le 2\pi|\delta|\sqrt{\int_0^{\infty}|\eta(t)|^2dt}+\sqrt{\int_0^{\infty}|\eta'(t)|^2dt}.
\end{aligned} \tag{9.15}
\]
¹ By a rigorous integration from $\tau=-100000$ to $\tau=100000$ using VNODE-LP [Ned06], which runs on the PROFIL/BIAS interval arithmetic package [Knu99].
Thus, (9.13) is at most
\[
\left(\log q+\sqrt{\frac{226.844}{2\pi}}\right)\cdot(|\eta'|_2+2\pi|\delta||\eta|_2).
\]
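The constant $226.844$, and hence the $6.01$ in Lemma 9.1.1, comes from a single one-dimensional integral. The sketch below approximates it in floating point. It is not the rigorous VNODE-LP computation cited in the footnote; the substitution $\tau=\frac12\tan\theta$ and the midpoint rule are choices made here for illustration. Under that substitution, $d\tau/(\frac14+\tau^2)=2\,d\theta$, so the integral becomes $2\int_{-\pi/2}^{\pi/2}(\cdot)^2\,d\theta$ with only a mild logarithmic singularity at the endpoints.

```python
import math

# Constant term in the integrand: 4.1396 + log(pi), as in the bound derived from (9.12).
C = 4.1396 + math.log(math.pi)

def integrand_theta(theta: float) -> float:
    """Integrand after tau = tan(theta)/2, including the Jacobian factor 2."""
    tau = 0.5 * math.tan(theta)
    return 2.0 * (0.5 * math.log(tau * tau + 2.25) + C) ** 2

N = 200_000                      # number of midpoint-rule cells (arbitrary choice)
h = math.pi / N
integral = h * sum(integrand_theta(-math.pi / 2 + (i + 0.5) * h) for i in range(N))
constant = math.sqrt(integral / (2 * math.pi))

print(f"integral ~ {integral:.3f}")
print(f"sqrt(integral / (2 pi)) ~ {constant:.4f}")
```

The approximate value can be compared with $226.844$ and with the constant $6.01$ appearing in (9.1); a floating-point quadrature of this kind gives no guaranteed error bound.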
Lemma 9.1.1 leaves us with three tasks: bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with small imaginary part, bounding the sum of $G_\delta(\rho)x^{\rho}$ over all non-trivial zeroes $\rho$ with large imaginary part, and bounding $L'(1,\chi)/L(1,\chi)$. Let us start with the last task: while, in a narrow sense, it is optional (in that, in the applications we actually need (Thm. 7.1.2, Cor. 7.1.3 and Thm. 7.1.4), we will have $\eta(0)=0$, thus making the term $L'(1,\chi)/L(1,\chi)$ disappear), it is also very easy and can be dealt with quickly.

Since we will be using a finite GRH check in all later applications, we might as well use it here.

Lemma 9.1.2. Let $\chi$ be a primitive character mod $q$, $q>1$. Assume that all non-trivial zeroes $\rho=\sigma+it$ of $L(s,\chi)$ with $|t|\le 5/8$ satisfy $\Re(\rho)=1/2$. Then
\[
\left|\frac{L'(1,\chi)}{L(1,\chi)}\right|\le\frac{5}{2}\log M(q)+c,
\]
where $M(q)=\max_n\left|\sum_{m\le n}\chi(m)\right|$ and
\[
c=5\log\frac{2\sqrt{3}}{\zeta(9/4)/\zeta(9/8)}=15.07016\ldots
\]
Proof. By a lemma of Landau's (see, e.g., [MV07, Lemma 6.3], where the constants are easily made explicit) based on the Borel-Carathéodory Lemma (as in [MV07, Lemma 6.2]), any function $f$ analytic and zero-free on a disc $C_{s_0,R}=\{s:|s-s_0|\le R\}$ of radius $R>0$ around $s_0$ satisfies
\[
\frac{f'(s)}{f(s)}=O^*\left(\frac{2R\log(M/|f(s_0)|)}{(R-r)^2}\right) \tag{9.16}
\]
for all $s$ with $|s-s_0|\le r$, where $0<r<R$ and $M$ is the maximum of $|f(z)|$ on $C_{s_0,R}$. Assuming $L(s,\chi)$ has no non-trivial zeros off the critical line with $|\Im(s)|\le H$, where $H>1/2$, we set $s_0=1/2+H$, $r=H-1/2$, and let $R\to H^-$. We obtain
\[
\frac{L'(1,\chi)}{L(1,\chi)}=O^*\left(8H\log\frac{\max_{s\in C_{s_0,H}}|L(s,\chi)|}{|L(s_0,\chi)|}\right). \tag{9.17}
\]
Now
\[
|L(s_0,\chi)|\ge\prod_p(1+p^{-s_0})^{-1}=\prod_p\frac{(1-p^{-2s_0})^{-1}}{(1-p^{-s_0})^{-1}}=\frac{\zeta(2s_0)}{\zeta(s_0)}.
\]
Since $s_0=1/2+H$, $C_{s_0,H}$ is contained in $\{s\in\mathbb{C}:\Re(s)>1/2\}$ for any value of $H$. We choose (somewhat arbitrarily) $H=5/8$.
By partial summation, for $s=\sigma+it$ with $1/2\le\sigma<1$ and any $N\in\mathbb{Z}^+$,
\[
\begin{aligned}
L(s,\chi)&=\sum_{n\le N}\chi(n)n^{-s}-\left(\sum_{m\le N}\chi(m)\right)(N+1)^{-s}+\sum_{n\ge N+1}\left(\sum_{m\le n}\chi(m)\right)\left(n^{-s}-(n+1)^{-s}\right)\\
&=O^*\left(\frac{N^{1-1/2}}{1-1/2}+N^{1-\sigma}+M(q)N^{-\sigma}\right),
\end{aligned} \tag{9.18}
\]
where $M(q)=\max_n\left|\sum_{m\le n}\chi(m)\right|$. We set $N=M(q)/3$, and obtain
\[
|L(s,\chi)|\le 2M(q)N^{-1/2}=2\sqrt{3}\sqrt{M(q)}. \tag{9.19}
\]
We put this into (9.17) and are done.
Let $M(q)$ be as in the statement of Lem. 9.1.2. Since the sum of $\chi(n)$ ($\chi$ mod $q$, $q>1$) over any interval of length $q$ is 0, it is easy to see that $M(q)\le q/2$. We also have the following explicit version of the Pólya-Vinogradov inequality:
\[
M(q)\le\begin{cases}\dfrac{2}{\pi^2}\sqrt{q}\log q+\dfrac{4}{\pi^2}\sqrt{q}\log\log q+\dfrac{3}{2}\sqrt{q} & \text{if } \chi(-1)=1,\\[2mm] \dfrac{1}{2\pi}\sqrt{q}\log q+\dfrac{1}{\pi}\sqrt{q}\log\log q+\sqrt{q} & \text{if } \chi(-1)=-1.\end{cases} \tag{9.20}
\]
Taken together with $M(q)\le q/2$, this implies that
\[
M(q)\le q^{4/5} \tag{9.21}
\]
for all $q\ge 1$, and also that
\[
M(q)\le 2q^{3/5} \tag{9.22}
\]
for all $q\ge 1$.

Notice, lastly, that
\[
\left|\log\frac{2\pi}{q}+\gamma\right|\le\log q+\log\frac{e^{\gamma}\cdot 2\pi}{3^2}
\]
for all $q\ge 3$. (There are no primitive characters modulo 2, so we can omit $q=2$.)

We conclude that, for $\chi$ primitive and non-trivial,
\[
\left|\log\frac{2\pi}{q}+\gamma-\frac{L'(1,\chi)}{L(1,\chi)}\right|\le\log\frac{e^{\gamma}\cdot 2\pi}{3^2}+\log q+\frac{5}{2}\log q^{\frac{4}{5}}+15.07017\le 3\log q+15.289.
\]
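The passage from (9.20) and $M(q)\le q/2$ to (9.21) and (9.22), as well as the constant $15.289$, can be checked numerically over a finite range of moduli. The sketch below does so for $3\le q\le 10^5$ only; the range, the function names and the hard-coded value of Euler's constant are choices made here, and a finite scan does not replace the (easy) asymptotic argument for large $q$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded (not in the math module)

def polya_vinogradov(q: int, even: bool) -> float:
    """Right-hand side of (9.20); `even` means chi(-1) = 1."""
    s, lg = math.sqrt(q), math.log(q)
    lglg = math.log(lg)
    if even:
        return (2 / math.pi ** 2) * s * lg + (4 / math.pi ** 2) * s * lglg + 1.5 * s
    return s * lg / (2 * math.pi) + s * lglg / math.pi + s

def m_bound(q: int, even: bool) -> float:
    """Best available bound on M(q): the minimum of q/2 and (9.20)."""
    return min(q / 2, polya_vinogradov(q, even))

Q_MAX = 100_000
ok_921 = all(m_bound(q, e) <= q ** 0.8
             for q in range(3, Q_MAX + 1) for e in (True, False))
ok_922 = all(m_bound(q, e) <= 2 * q ** 0.6
             for q in range(3, Q_MAX + 1) for e in (True, False))
print("(9.21) holds on the range:", ok_921)
print("(9.22) holds on the range:", ok_922)

# Constant term in the final bound: log(e^gamma * 2 pi / 9) + 15.07017.
constant = math.log(math.exp(EULER_GAMMA) * 2 * math.pi / 9) + 15.07017
print(f"constant term ~ {constant:.5f}")
```

The coefficient $3$ of $\log q$ arises as $1+\frac52\cdot\frac45$, which is why (9.21), rather than (9.22), is the bound used at this step.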
Obviously, $15.289$ is more than $\log 2\pi$, the bound for $\chi$ trivial. Hence, the absolute value of the quantity $R$ in the statement of Lemma 9.1.1 is at most
\[
|\eta(0)|(3\log q+15.289)+|c_0| \tag{9.23}
\]
for all primitive $\chi$.

It now remains to bound the sum $\sum_\rho G_\delta(\rho)x^{\rho}$ in (9.1). Clearly
\[
\left|\sum_{\rho}G_\delta(\rho)x^{\rho}\right|\le\sum_{\rho}|G_\delta(\rho)|\cdot x^{\Re(\rho)}.
\]
Recall that these are sums over the non-trivial zeros $\rho$ of $L(s,\chi)$.

We first prove a general lemma on sums of values of functions on the non-trivial zeros of $L(s,\chi)$. This is little more than partial summation, given a (classical) bound for the number of zeroes $N(T,\chi)$ of $L(s,\chi)$ with $|\Im(s)|\le T$. The error term becomes particularly simple if $f$ is real-valued and decreasing; the statement is then practically identical to that of [Leh66, Lemma 1] (for $\chi$ principal), except for the fact that the error term is improved here.
Lemma 9.1.3. Let $f:\mathbb{R}^+\to\mathbb{C}$ be piecewise $C^1$. Assume $\lim_{t\to\infty}f(t)t\log t=0$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$; let $\rho$ denote the non-trivial zeros $\rho$ of $L(s,\chi)$. Then, for any $y\ge 1$,
\[
\sum_{\substack{\rho\text{ non-trivial}\\ \Im(\rho)>y}}f(\Im(\rho))=\frac{1}{2\pi}\int_y^{\infty}f(T)\log\frac{qT}{2\pi}\,dT
+\frac{1}{2}O^*\left(|f(y)|g_\chi(y)+\int_y^{\infty}|f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.24}
\]
where
\[
g_\chi(T)=0.5\log qT+17.7. \tag{9.25}
\]
If $f$ is real-valued and decreasing on $[y,\infty)$, the second line of (9.24) equals
\[
O^*\left(\frac{1}{4}\int_y^{\infty}\frac{f(T)}{T}\,dT\right).
\]
Proof. Write $N(T,\chi)$ for the number of non-trivial zeros of $L(s,\chi)$ satisfying $|\Im(s)|\le T$. Write $N^+(T,\chi)$ for the number of (necessarily non-trivial) zeros of $L(s,\chi)$ with $0<\Im(s)\le T$. Then, for any $f:\mathbb{R}^+\to\mathbb{C}$ with $f$ piecewise differentiable and $\lim_{t\to\infty}f(t)N(t,\chi)=0$,
\[
\begin{aligned}
\sum_{\rho:\Im(\rho)>y}f(\Im(\rho))&=\int_y^{\infty}f(T)\,dN^+(T,\chi)\\
&=-\int_y^{\infty}f'(T)(N^+(T,\chi)-N^+(y,\chi))\,dT\\
&=-\frac{1}{2}\int_y^{\infty}f'(T)(N(T,\chi)-N(y,\chi))\,dT.
\end{aligned}
\]
Now, by [Ros41, Thms. 17–19] and [McC84a, Thm. 2.1] (see also [Tru, Thm. 1]),
\[
N(T,\chi)=\frac{T}{\pi}\log\frac{qT}{2\pi e}+O^*(g_\chi(T)) \tag{9.26}
\]
for T ge 1 where gχ(T ) is as in (925) (This is a classical formula the referencesserve to prove the explicit form (925) for the error term gχ(T ))
Thus for y ge 1sumρ=(ρ)gty
f(=(ρ)) = minus1
2
int infiny
f prime(T )
(T
πlog
qT
2πeminus y
πlog
qy
2πe
)dT
+1
2Olowast(|f(y)|gχ(y) +
int infiny
|f prime(T )| middot gχ(T )dT
)
(927)
Here
minus 1
2
int infiny
f prime(T )
(T
πlog
qT
2πeminus y
πlog
qy
2πe
)dT =
1
2π
int infiny
f(T ) logqT
2πdT (928)
If f is real-valued and decreasing (and so by limtrarrinfin f(t) = 0 non-negative)
|f(y)|gχ(y) +
int infiny
|f prime(T )| middot gχ(T )dT = f(y)gχ(y)minusint infiny
f prime(T )gχ(T )dT
= 05
int infiny
f(T )
TdT
since gprimeχ(T ) le 05T for all T ge T0
Let us bound the part of the sum $\sum_\rho G_\delta(\rho)x^{\rho}$ corresponding to $\rho$ with bounded $|\Im(\rho)|$. The bound we will give is proportional to $\sqrt{T_0}\log qT_0$, whereas a very naive approach (based on the trivial bound $|G_\delta(\sigma+i\tau)|\le|G_0(\sigma)|$) would give a bound proportional to $T_0\log qT_0$.

We could obtain a bound proportional to $\sqrt{T_0}\log qT_0$ for $\eta(t)=t^ke^{-t^2/2}$ by using Theorem 8.0.1. Instead, we will give a bound of that same quality valid for $\eta$ essentially arbitrary simply by using the fact that the Mellin transform is an isometry (preceded by an application of Cauchy-Schwarz).

Lemma 9.1.4. Let $\eta:\mathbb{R}_0^+\to\mathbb{R}$ be such that both $\eta(t)$ and $(\log t)\eta(t)$ lie in $L_1\cap L_2$ and $\eta(t)/\sqrt{t}$ lies in $L_1$ (with respect to $dt$). Let $\delta\in\mathbb{R}$. Let $G_\delta(s)$ be the Mellin transform of $\eta(t)e(\delta t)$.

Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Let $T_0\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ lie on the critical line. Then
\[
\sum_{\substack{\rho\text{ non-trivial}\\ |\Im(\rho)|\le T_0}}|G_\delta(\rho)|
\]
is at most
\[
(|\eta|_2+|\eta\cdot\log|_2)\sqrt{T_0}\log qT_0+(17.21|\eta\cdot\log|_2-(\log 2\pi\sqrt{e})|\eta|_2)\sqrt{T_0}
+\left|\eta(t)/\sqrt{t}\right|_1\cdot(1.32\log q+34.5). \tag{9.29}
\]
Proof. For $s=1/2+i\tau$, we have the trivial bound
\[
|G_\delta(s)|\le\int_0^{\infty}|\eta(t)|t^{1/2}\frac{dt}{t}=\left|\eta(t)/\sqrt{t}\right|_1, \tag{9.30}
\]
where $F_\delta$ is as in (9.47). We also have the trivial bound
\[
|G_\delta'(s)|=\left|\int_0^{\infty}(\log t)\eta(t)t^s\frac{dt}{t}\right|\le\int_0^{\infty}|(\log t)\eta(t)|t^{\sigma}\frac{dt}{t}=\left|(\log t)\eta(t)t^{\sigma-1}\right|_1 \tag{9.31}
\]
for $s=\sigma+i\tau$.

Let us start by bounding the contribution of very low-lying zeros ($|\Im(\rho)|\le 1$). By (9.26) and (9.25),
\[
N(1,\chi)=\frac{1}{\pi}\log\frac{q}{2\pi e}+O^*(0.5\log q+17.7)=O^*(0.819\log q+16.8).
\]
Therefore,
\[
\sum_{\substack{\rho\text{ non-trivial}\\ |\Im(\rho)|\le 1}}|G_\delta(\rho)|\le\left|\eta(t)t^{-1/2}\right|_1\cdot(0.819\log q+16.8).
\]
Let us now consider zeros $\rho$ with $|\Im(\rho)|>1$. Apply Lemma 9.1.3 with $y=1$ and
\[
f(t)=\begin{cases}|G_\delta(1/2+it)| & \text{if } t\le T_0,\\ 0 & \text{if } t>T_0.\end{cases}
\]
This gives us that
\[
\sum_{\rho:1<|\Im(\rho)|\le T_0}f(\Im(\rho))=\frac{1}{\pi}\int_1^{T_0}f(T)\log\frac{qT}{2\pi}\,dT
+O^*\left(|f(1)|g_\chi(1)+\int_1^{\infty}|f'(T)|\cdot g_\chi(T)\,dT\right), \tag{9.32}
\]
where we are using the fact that $f(\sigma+i\tau)=f(\sigma-i\tau)$ (because $\eta$ is real-valued). By Cauchy-Schwarz,
\[
\frac{1}{\pi}\int_1^{T_0}f(T)\log\frac{qT}{2\pi}\,dT\le\sqrt{\frac{1}{\pi}\int_1^{T_0}|f(T)|^2dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2dT}.
\]
Now
\[
\frac{1}{\pi}\int_1^{T_0}|f(T)|^2dT\le\frac{1}{2\pi}\int_{-\infty}^{\infty}\left|G_\delta\left(\frac{1}{2}+iT\right)\right|^2dT\le\int_0^{\infty}|e(\delta t)\eta(t)|^2dt=|\eta|_2^2
\]
by Plancherel (as in (2.6)). We also have
\[
\int_1^{T_0}\left(\log\frac{qT}{2\pi}\right)^2dT\le\frac{2\pi}{q}\int_0^{\frac{qT_0}{2\pi}}(\log t)^2dt\le\left(\left(\log\frac{qT_0}{2\pi e}\right)^2+1\right)\cdot T_0.
\]
Hence
\[
\frac{1}{\pi}\int_1^{T_0}f(T)\log\frac{qT}{2\pi}\,dT\le\sqrt{\left(\log\frac{qT_0}{2\pi e}\right)^2+1}\cdot|\eta|_2\sqrt{T_0}.
\]
Again by Cauchy-Schwarz,
\[
\int_1^{\infty}|f'(T)|\cdot g_\chi(T)\,dT\le\sqrt{\frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2dT}\cdot\sqrt{\frac{1}{\pi}\int_1^{T_0}|g_\chi(T)|^2dT}.
\]
Since $|f'(T)|=|G_\delta'(1/2+iT)|$ and $(M\eta)'(s)$ is the Mellin transform of $\log(t)\cdot e(\delta t)\eta(t)$ (by (2.10)),
\[
\frac{1}{2\pi}\int_{-\infty}^{\infty}|f'(T)|^2dT=|\eta(t)\log(t)|_2^2.
\]
Much as before,
\[
\int_1^{T_0}|g_\chi(T)|^2dT\le\int_0^{T_0}(0.5\log qT+17.7)^2dT=(0.25(\log qT_0)^2+17.2(\log qT_0)+296.09)T_0.
\]
Summing, we obtain
\[
\frac{1}{\pi}\int_1^{T_0}f(T)\log\frac{qT}{2\pi}\,dT+\int_1^{\infty}|f'(T)|\cdot g_\chi(T)\,dT
\le\left(\left(\log\frac{qT_0}{2\pi e}+\frac{1}{2}\right)|\eta|_2+\left(\frac{\log qT_0}{2}+17.21\right)|\eta(t)(\log t)|_2\right)\sqrt{T_0}.
\]
Finally, by (9.30) and (9.25),
\[
|f(1)|g_\chi(1)\le\left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q+17.7).
\]
By (9.32) and the assumption that all non-trivial zeros with $|\Im(\rho)|\le T_0$ lie on the line $\Re(s)=1/2$, we conclude that
\[
\begin{aligned}
\sum_{\substack{\rho\text{ non-trivial}\\ 1<|\Im(\rho)|\le T_0}}|G_\delta(\rho)|\le{}&(|\eta|_2+|\eta\cdot\log|_2)\sqrt{T_0}\log qT_0\\
&+(17.21|\eta\cdot\log|_2-(\log 2\pi\sqrt{e})|\eta|_2)\sqrt{T_0}\\
&+\left|\eta(t)/\sqrt{t}\right|_1\cdot(0.5\log q+17.7).
\end{aligned}
\]
All that remains is to bound the contribution to $\sum_\rho G_\delta(\rho)x^{\rho}$ corresponding to all zeroes $\rho$ with $|\Im(\rho)|>T_0$. This we will do by another application of Lemma 9.1.3, combined with bounds on $G_\delta(\rho)$ for $\Im(\rho)$ large. This is the only part that will require us to take a look at the actual smoothing function $\eta$ we are working with; it is at this point, not before, that we actually have to look at each of our options for $\eta$ one by one.
9.2 Sums and decay for the Gaussian

It is now time to derive our bounds for the Gaussian smoothing. As we were saying, there is really only one thing left to do, namely, an estimate for the sum $\sum_\rho|F_\delta(\rho)|$ over all zeros $\rho$ with $|\Im(\rho)|>T_0$.

Lemma 9.2.1. Let $\eta_\heartsuit(t)=e^{-t^2/2}$. Let $x\in\mathbb{R}^+$, $\delta\in\mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q\ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)|\le T_0$ satisfy $\Re(\rho)=1/2$. Assume that $T_0\ge 50$.

Write $F_\delta(s)$ for the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. Then
\[
\sum_{\substack{\rho\\ |\Im(\rho)|>T_0}}|F_\delta(\rho)|\le\log\frac{qT_0}{2\pi}\cdot\left(3.53e^{-0.1598T_0}+22.5\frac{\delta^2}{T_0}e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right).
\]
Here we have preferred to give a bound with a simple form. It is probably feasible to derive from Theorem 8.0.1 a bound essentially proportional to $e^{-E(\rho)T_0}$, where $\rho=T_0/(\pi\delta)^2$ and $E(\rho)$ is as in (8.2). (As we discussed in §8.5, $e^{-E(\rho)T_0}$ behaves as $e^{-(\pi/4)T_0}$ for $\rho$ large and as $e^{-0.125(T_0/(\pi\delta))^2}$ for $\rho$ small.)
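Proof. First of all,
\[
\sum_{\substack{\rho\\ |\Im(\rho)|>T_0}}|F_\delta(\rho)|=\sum_{\substack{\rho\\ \Im(\rho)>T_0}}(|F_\delta(\rho)|+|F_\delta(1-\rho)|),
\]
by the functional equation (which implies that non-trivial zeros come in pairs $\rho$, $1-\rho$). Hence, by a somewhat brutish application of Cor. 8.0.2,
\[
\sum_{\substack{\rho\\ |\Im(\rho)|>T_0}}|F_\delta(\rho)|\le\sum_{\substack{\rho\\ \Im(\rho)>T_0}}f(\Im(\rho)), \tag{9.33}
\]
where
\[
f(\tau)=3.001e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2}+3.286e^{-0.1598|\tau|}. \tag{9.34}
\]
Obviously, $f(\tau)$ is a decreasing function of $\tau$ for $\tau\ge T_0$.

We now apply Lemma 9.1.3. We obtain that
\[
\sum_{\substack{\rho\\ \Im(\rho)>T_0}}f(\Im(\rho))\le\int_{T_0}^{\infty}f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi}+\frac{1}{4T}\right)dT. \tag{9.35}
\]
We just need to estimate some integrals. For any $y\ge 1$, $c,c_1>0$,
\[
\int_y^{\infty}\left(\log t+\frac{c_1}{t}\right)e^{-ct}dt\le\int_y^{\infty}\left(\log t-\frac{1}{ct}\right)e^{-ct}dt+\left(\frac{1}{c}+c_1\right)\int_y^{\infty}\frac{e^{-ct}}{t}dt
=\frac{(\log y)e^{-cy}}{c}+\left(\frac{1}{c}+c_1\right)E_1(cy),
\]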
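where $E_1(x)=\int_x^{\infty}e^{-t}\,dt/t$. Clearly, $E_1(x)\le\int_x^{\infty}e^{-t}\,dt/x=e^{-x}/x$. Hence
\[
\int_y^{\infty}\left(\log t+\frac{c_1}{t}\right)e^{-ct}dt\le\left(\log y+\left(\frac{1}{c}+c_1\right)\frac{1}{y}\right)\frac{e^{-cy}}{c}.
\]
We conclude that
\[
\begin{aligned}
\int_{T_0}^{\infty}e^{-0.1598t}\left(\frac{1}{2\pi}\log\frac{qt}{2\pi}+\frac{1}{4t}\right)dt
&\le\frac{1}{2\pi}\int_{T_0}^{\infty}\left(\log t+\frac{\pi/2}{t}\right)e^{-ct}dt+\frac{\log\frac{q}{2\pi}}{2\pi}\int_{T_0}^{\infty}e^{-ct}dt\\
&\le\frac{1}{2\pi c}\left(\log T_0+\log\frac{q}{2\pi}+\left(\frac{1}{c}+\frac{\pi}{2}\right)\frac{1}{T_0}\right)e^{-cT_0}
\end{aligned} \tag{9.36}
\]
with $c=0.1598$. Since $T_0\ge 50$ and $q\ge 1$, this is at most
\[
1.072\log\frac{qT_0}{2\pi}e^{-cT_0}. \tag{9.37}
\]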
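Now let us deal with the Gaussian term. (It appears only if $T_0<(3/2)(\pi\delta)^2$, as otherwise $|\tau|\ge(3/2)(\pi\delta)^2$ holds whenever $|\tau|\ge T_0$.) For any $y\ge e$, $c\ge 0$,
\[
\int_y^{\infty}e^{-ct^2}dt=\frac{1}{\sqrt{c}}\int_{\sqrt{c}y}^{\infty}e^{-t^2}dt\le\frac{1}{cy}\int_{\sqrt{c}y}^{\infty}te^{-t^2}dt\le\frac{e^{-cy^2}}{2cy}, \tag{9.38}
\]
\[
\int_y^{\infty}\frac{e^{-ct^2}}{t}dt=\int_{cy^2}^{\infty}\frac{e^{-t}}{2t}dt=\frac{E_1(cy^2)}{2}\le\frac{e^{-cy^2}}{2cy^2}, \tag{9.39}
\]
\[
\int_y^{\infty}(\log t)e^{-ct^2}dt\le\int_y^{\infty}\left(\log t+\frac{\log t-1}{2ct^2}\right)e^{-ct^2}dt=\frac{\log y}{2cy}e^{-cy^2}. \tag{9.40}
\]
Hence
\[
\begin{aligned}
\int_{T_0}^{\infty}e^{-0.1065\left(\frac{T}{\pi\delta}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi}+\frac{1}{4T}\right)dT
&=\int_{\frac{T_0}{\pi|\delta|}}^{\infty}e^{-0.1065t^2}\left(\frac{|\delta|}{2}\log\frac{q|\delta|t}{2}+\frac{1}{4t}\right)dt\\
&\le\left(\frac{\frac{|\delta|}{2}\log\frac{T_0}{\pi|\delta|}}{2c'\frac{T_0}{\pi|\delta|}}+\frac{\frac{|\delta|}{2}\log\frac{q|\delta|}{2}}{2c'\frac{T_0}{\pi|\delta|}}+\frac{1}{8c'\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}
\end{aligned} \tag{9.41}
\]
with $c'=0.1065$. Since $T_0\ge 50$ and $q\ge 1$,
\[
\frac{2\pi}{8T_0}\le\frac{\pi}{200}\le 0.0152\cdot\frac{1}{2}\log\frac{qT_0}{2\pi}.
\]
Thus, the last line of (9.41) is less than
\[
1.0152\,\frac{\frac{|\delta|}{2}\log\frac{qT_0}{2\pi}}{\frac{2c'T_0}{\pi|\delta|}}e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}=7.487\frac{\delta^2}{T_0}\cdot\log\frac{qT_0}{2\pi}\cdot e^{-c'\left(\frac{T_0}{\pi|\delta|}\right)^2}. \tag{9.42}
\]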
Again by $T_0\ge 4\pi^2|\delta|$, we see that $1.0057\pi|\delta|/(4cT_0)\le 1.0057/(16c\pi)\le 0.18787$.

To obtain our final bound, we simply sum (9.37) and (9.42), after multiplying them by the constants 3.286 and 3.001 in (9.34). We conclude that the integral in (9.35) is at most
\[
\left(3.53e^{-0.1598T_0}+22.5\frac{\delta^2}{T_0}e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\log\frac{qT_0}{2\pi}.
\]
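The constants $1.072$, $0.0152$, $7.487$, $3.53$ and $22.5$ in the proof of Lemma 9.2.1 follow from short computations at the extreme case $T_0=50$, $q=1$. The sketch below replays that arithmetic in floating point; treating $T_0=50$, $q=1$ as the worst case is an assumption made here, justified by the fact that the relevant ratios decrease as $\log(qT_0/2\pi)$ and $T_0$ grow.

```python
import math

c, c_prime, T0 = 0.1598, 0.1065, 50.0
L0 = math.log(T0 / (2 * math.pi))        # log(q T0 / (2 pi)) at q = 1, T0 = 50

# (9.36) -> (9.37): ratio of the bracket in (9.36), divided by 2 pi c, to log(qT0/2pi).
exp_const = (L0 + (1 / c + math.pi / 2) / T0) / (2 * math.pi * c) / L0

# Step before (9.42): pi/200 compared with (1/2) log(qT0/2pi).
slack = (math.pi / 200) / (0.5 * L0)

# (9.42): 1.0152 * pi / (4 c') is the coefficient of (delta^2 / T0) log(qT0/2pi).
gauss_const = 1.0152 * math.pi / (4 * c_prime)

# Final constants: multiply by the coefficients 3.286 and 3.001 of (9.34).
final_exp = 3.286 * 1.072
final_gauss = 3.001 * 7.487

print(f"exp_const   ~ {exp_const:.5f}")
print(f"slack       ~ {slack:.5f}")
print(f"gauss_const ~ {gauss_const:.5f}")
print(f"final_exp   ~ {final_exp:.5f}, final_gauss ~ {final_gauss:.5f}")
```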
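We need to record a few norms related to the Gaussian $\eta_\heartsuit(t) = e^{-t^2/2}$ before we proceed. Recall we are working with the one-sided Gaussian, i.e., we set $\eta_\heartsuit(t) = 0$ for $t < 0$. Symbolic integration then gives
$$\begin{aligned}
|\eta_\heartsuit|_2^2 &= \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2},\\
|\eta_\heartsuit'|_2^2 &= \int_0^\infty (te^{-t^2/2})^2\,dt = \frac{\sqrt{\pi}}{4},\\
|\eta_\heartsuit\cdot\log|_2^2 &= \int_0^\infty e^{-t^2}(\log t)^2\,dt = \frac{\sqrt{\pi}}{16}\left(\pi^2 + 2\gamma^2 + 8\gamma\log 2 + 8(\log 2)^2\right) \le 1.94753,
\end{aligned}\tag{9.43}$$
$$\begin{aligned}
|\eta_\heartsuit(t)/\sqrt{t}|_1 &= \int_0^\infty \frac{e^{-t^2/2}}{\sqrt{t}}\,dt = \frac{\Gamma(1/4)}{2^{3/4}} \le 2.15581,\\
|\eta_\heartsuit'(t)/\sqrt{t}|_1 = |\eta_\heartsuit(t)\sqrt{t}|_1 &= \int_0^\infty e^{-t^2/2}\sqrt{t}\,dt = \frac{\Gamma(3/4)}{2^{1/4}} \le 1.03045,\\
\left|\eta_\heartsuit'(t)t^{1/2}\right|_1 = \left|\eta_\heartsuit(t)t^{3/2}\right|_1 &= \int_0^\infty e^{-t^2/2}t^{3/2}\,dt = 1.07791.
\end{aligned}\tag{9.44}$$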
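The closed forms in (9.43) and (9.44) are easy to evaluate in floating point. The sketch below (variable names are hypothetical; it is a sanity check, not a rigorous computation) uses the standard identity $\int_0^\infty t^{s-1}e^{-t^2/2}\,dt = 2^{s/2-1}\Gamma(s/2)$ for the $\ell_1$-norms and the closed form given in (9.43) for the logarithmic norm.

```python
import math

euler_gamma = 0.5772156649015329   # Euler-Mascheroni constant
ln2 = math.log(2.0)

# |eta_heart * log|_2^2, closed form from (9.43).
log_norm_sq = math.sqrt(math.pi) / 16 * (
    math.pi**2 + 2 * euler_gamma**2 + 8 * euler_gamma * ln2 + 8 * ln2**2)

# Mellin transform of exp(-t^2/2) at s: 2^(s/2 - 1) * Gamma(s/2).
def gaussian_mellin(s: float) -> float:
    return 2 ** (s / 2 - 1) * math.gamma(s / 2)

inv_sqrt_norm = gaussian_mellin(0.5)   # |eta(t)/sqrt(t)|_1 = Gamma(1/4)/2^(3/4)
sqrt_norm = gaussian_mellin(1.5)       # |eta(t) sqrt(t)|_1 = Gamma(3/4)/2^(1/4)
t32_norm = gaussian_mellin(2.5)        # |eta(t) t^(3/2)|_1

for name, value in [("log norm squared", log_norm_sq),
                    ("1/sqrt(t) norm", inv_sqrt_norm),
                    ("sqrt(t) norm", sqrt_norm),
                    ("t^(3/2) norm", t32_norm)]:
    print(f"{name}: {value:.6f}")
```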
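We can now state what is really our main result for the Gaussian smoothing. (The version in §7.1 will, as we shall later see, follow from this, given numerical inputs.)

Proposition 9.2.2. Let $\eta(t) = e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge 50$.

Then
$$\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta\left(\frac{n}{x}\right) = \begin{cases}\widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases}\tag{9.45}$$
where
$$\begin{aligned}
\mathrm{err}_{\eta,\chi}(\delta,x) &= \log\frac{qT_0}{2\pi}\cdot\left(3.53e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\\
&\quad + \left(2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38\right)x^{-1/2}\\
&\quad + (3\log q + 14|\delta| + 17)x^{-1} + (\log q + 6)\cdot(1 + 5|\delta|)\cdot x^{-3/2}.
\end{aligned}$$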
Proof. Let $F_\delta(s)$ be the Mellin transform of $\eta_\heartsuit(t)e(\delta t)$. By Lemma 9.1.4 (with $G_\delta = F_\delta$) and Lemma 9.2.1,
$$\left|\sum_{\rho\ \text{non-trivial}} F_\delta(\rho)x^\rho\right|$$
is at most (9.29) (with $\eta = \eta_\heartsuit$) times $\sqrt{x}$, plus
$$\log\frac{qT_0}{2\pi}\cdot\left(3.53e^{-0.1598T_0} + 22.5\,\frac{|\delta|^2}{T_0}\,e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\right)\cdot x.$$
By the norm computations in (9.43) and (9.44), we see that (9.29) is at most
$$2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38.$$
Let us now apply Lemma 9.1.1. We saw that the value of $R$ in Lemma 9.1.1 is bounded by (9.23). We know that $\eta_\heartsuit(0) = 1$. Again by (9.43) and (9.44), the quantity $c_0$ defined in (9.3) is at most $1.4056 + 13.3466|\delta|$. Hence
$$|R| \le 3\log q + 13.347|\delta| + 16.695.$$
Lastly,
$$|\eta_\heartsuit'|_2 + 2\pi|\delta||\eta_\heartsuit|_2 \le 0.942 + 4.183|\delta| \le 1 + 5|\delta|.$$
Clearly,
$$(6.01 - 6)\cdot(1 + 5|\delta|) + 13.347|\delta| + 16.695 < 14|\delta| + 17,$$
and so we are done.
9.3 The case of $\eta_*(t)$

We will now work with a weight based on the Gaussian:
$$\eta(t) = \begin{cases} t^2e^{-t^2/2} & \text{if } t \ge 0,\\ 0 & \text{if } t < 0.\end{cases}\tag{9.46}$$
The fact that this vanishes at $t = 0$ actually makes it easier to work with at several levels.

Its Mellin transform is just a shift of that of the Gaussian. Write
$$F_\delta(s) = \left(M(e^{-t^2/2}e(\delta t))\right)(s),\qquad G_\delta(s) = \left(M(\eta(t)e(\delta t))\right)(s).\tag{9.47}$$
Then, by the definition of the Mellin transform,
$$G_\delta(s) = F_\delta(s+2).$$
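For completeness, the shift can be written out in one line. The computation below assumes the usual convention $(Mf)(s) = \int_0^\infty f(t)t^{s-1}\,dt$ for the Mellin transform, and uses only the definitions (9.46) and (9.47); the factor $t^2$ in $\eta$ is absorbed into the power of $t$.

```latex
\begin{aligned}
G_\delta(s) &= \int_0^\infty t^2 e^{-t^2/2}\, e(\delta t)\, t^{s-1}\,dt \\
            &= \int_0^\infty e^{-t^2/2}\, e(\delta t)\, t^{(s+2)-1}\,dt
             = F_\delta(s+2).
\end{aligned}
```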
We start by bounding the contribution of zeros with large imaginary part, just as before.
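Lemma 9.3.1. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 1.578e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right).$$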
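Proof. We start by writing
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| = \sum_{\substack{\rho\\ \Im(\rho) > T_0}} \left(|F_\delta(\rho+2)| + |F_\delta((1-\rho)+2)|\right),$$
where we are using $G_\delta(\rho) = F_\delta(\rho+2)$ and the fact that non-trivial zeros come in pairs $\rho$, $1-\rho$.

By Cor. 8.0.2 with $k = 2$,
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le \sum_{\substack{\rho\\ \Im(\rho) > T_0}} f(\Im(\rho)),$$
where
$$f(\tau) = \begin{cases}\kappa_{2,1}|\tau|e^{-0.1598|\tau|} + \dfrac{\kappa_{2,0}}{4}\left(\dfrac{|\tau|}{\pi\delta}\right)^2 e^{-0.1065\left(\frac{|\tau|}{\pi\delta}\right)^2} & \text{if } |\tau| < \frac{3}{2}(\pi\delta)^2,\\[2mm] \kappa_{2,1}|\tau|e^{-0.1598|\tau|} & \text{if } |\tau| \ge \frac{3}{2}(\pi\delta)^2,\end{cases}\tag{9.48}$$
where $\kappa_{2,0} = 7.96$ and $\kappa_{2,1} = 5.13$. We are including the term $|\tau|e^{-0.1598|\tau|}$ in both cases in part because we cannot be bothered to take it out (just as we could not be bothered in the proof of Lem. 9.2.1) and in part to ensure that $f(\tau)$ is a decreasing function of $\tau$ for $\tau \ge T_0$.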
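We can now apply Lemma 9.1.3. We obtain, again,
$$\sum_{\substack{\rho\\ \Im(\rho) > T_0}} f(\Im(\rho)) \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT.\tag{9.49}$$
Just as before, we will need to estimate some integrals. For any $y \ge 1$, $c, c_1 > 0$ such that $\log y > 1/(cy)$,
$$\begin{aligned}
\int_y^\infty te^{-ct}\,dt &= \left(\frac{y}{c} + \frac{1}{c^2}\right)e^{-cy},\\
\int_y^\infty \left(t\log t + \frac{c_1}{t}\right)e^{-ct}\,dt &\le \int_y^\infty \left(\left(t + \frac{a-1}{c}\right)\log t - \frac{1}{c} - \frac{a}{c^2t}\right)e^{-ct}\,dt\\
&= \left(\frac{y}{c} + \frac{a}{c^2}\right)e^{-cy}\log y,
\end{aligned}\tag{9.50}$$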
where
$$a = \frac{\frac{\log y}{c} + \frac{1}{c} + \frac{c_1}{y}}{\frac{\log y}{c} - \frac{1}{c^2y}}.$$
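Setting $c = 0.1598$, $c_1 = \pi/2$, $y = T_0 \ge 50$, we obtain that
$$\begin{aligned}
&\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)Te^{-0.1598T}\,dT\\
&\qquad\le \frac{1}{2\pi}\left(\log\frac{q}{2\pi}\cdot\left(\frac{T_0}{c} + \frac{1}{c^2}\right) + \left(\frac{T_0}{c} + \frac{a}{c^2}\right)\log T_0\right)e^{-0.1598T_0}
\end{aligned}\tag{9.51}$$
and
$$a = \frac{\frac{\log T_0}{0.1598} + \frac{1}{0.1598} + \frac{\pi/2}{T_0}}{\frac{\log T_0}{0.1598} - \frac{1}{0.1598^2T_0}} \le 1.299.$$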
It is easy to see that the ratio of the expression within parentheses on the right side of (9.51) to $T_0\log(qT_0/2\pi)$ increases as $q$ decreases and, if we hold $q$ fixed, decreases as $T_0 \ge 2\pi$ increases; thus, it is maximal for $q = 1$ and $T_0 = 50$. Multiplying (9.51) by $\kappa_{2,1} = 5.13$ and simplifying by the assumption $T_0 \ge 50$, we obtain that
$$\int_{T_0}^\infty 5.13\,Te^{-0.1598T}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT \le 6.11\,T_0\log\frac{qT_0}{2\pi}\cdot e^{-0.1598T_0}.\tag{9.52}$$
Now let us examine the Gaussian term. First of all: when does it arise? If $T_0 \ge (3/2)(\pi\delta)^2$, then $|\tau| \ge (3/2)(\pi\delta)^2$ holds whenever $|\tau| \ge T_0$, and so (9.48) does not give us a Gaussian term. Recall that $T_0 \ge 10\pi|\delta|$, which means that $|\delta| \le 20/(3\pi)$ implies that $T_0 \ge (3/2)(\pi\delta)^2$. We can thus assume from now on that $|\delta| > 20/(3\pi)$, since otherwise there is no Gaussian term to treat.
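The constants $1.299$ and $6.11$ attached to (9.51) and (9.52) can be reproduced at the extreme point $q = 1$, $T_0 = 50$. The sketch below is a floating-point illustration only; the helper names `a_param` and `rhs_951_over_main` are hypothetical, and the monotonicity claims of the text are taken for granted rather than verified.

```python
import math

c, c1, kappa21 = 0.1598, math.pi / 2, 5.13
T0 = 50.0

def a_param(y: float, c: float, c1: float) -> float:
    """The quantity a defined after (9.50)."""
    return ((math.log(y) / c + 1 / c + c1 / y)
            / (math.log(y) / c - 1 / (c * c * y)))

def rhs_951_over_main(T0: float, q: float, a: float) -> float:
    """Right side of (9.51), without exp(-c*T0), over T0*log(q*T0/(2*pi))."""
    paren = (math.log(q / (2 * math.pi)) * (T0 / c + 1 / c**2)
             + (T0 / c + a / c**2) * math.log(T0))
    return paren / (2 * math.pi) / (T0 * math.log(q * T0 / (2 * math.pi)))

a = a_param(T0, c, c1)
const_952 = kappa21 * rhs_951_over_main(T0, 1.0, a)

print(f"a         = {a:.5f}  (claimed <= 1.299)")
print(f"const_952 = {const_952:.4f}  (claimed <= 6.11)")
```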
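For any $y \ge 1$, $c, c_1 > 0$,
$$\int_y^\infty t^2e^{-ct^2}\,dt < \int_y^\infty \left(t^2 + \frac{1}{4c^2t^2}\right)e^{-ct^2}\,dt = \left(\frac{y}{2c} + \frac{1}{4c^2y}\right)\cdot e^{-cy^2},$$
$$\begin{aligned}
\int_y^\infty (t^2\log t + c_1t)\cdot e^{-ct^2}\,dt &\le \int_y^\infty \left(t^2\log t + \frac{at\log et}{2c} - \frac{\log et}{2c} - \frac{a}{4c^2t}\right)e^{-ct^2}\,dt\\
&= \frac{(2cy + a)\log y + a}{4c^2}\cdot e^{-cy^2},
\end{aligned}$$
where
$$a = \frac{c_1y + \frac{\log ey}{2c}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{c_1y + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}} = \frac{1}{y} + \frac{2c_1c}{\log ey} + \frac{\frac{c_1}{2cy\log ey} + \frac{1}{4c^2y^2}}{\frac{y\log ey}{2c} - \frac{1}{4c^2y}}.$$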
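(Note that $a$ decreases as $y \ge y_0$ increases, provided that $\log ey_0 > 1/(2cy_0^2)$.) Setting $c = 0.1065$, $c_1 = 1/(2|\delta|) \le 3/16$ and $y = T_0/(\pi|\delta|) \ge 4\pi$, we obtain
$$\begin{aligned}
&\int_{\frac{T_0}{\pi|\delta|}}^\infty \left(\frac{1}{2\pi}\log\frac{q|\delta|t}{2} + \frac{1}{4\pi|\delta|t}\right)t^2e^{-0.1065t^2}\,dt\\
&\qquad\le \left(\frac{1}{2\pi}\log\frac{q|\delta|}{2}\right)\cdot\left(\frac{T_0}{2\pi c|\delta|} + \frac{1}{4c^2\cdot 10}\right)\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}\\
&\qquad\quad + \frac{1}{2\pi}\cdot\frac{\left(2c\frac{T_0}{\pi|\delta|} + a\right)\log\frac{T_0}{\pi|\delta|} + a}{4c^2}\cdot e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}
\end{aligned}$$
and
$$a \le \frac{1}{10} + \frac{\left(2\cdot\frac{20}{3\pi}\right)^{-1}\cdot 10 + \frac{1}{4\cdot 0.1065^2\cdot 10^2}}{\frac{10\log 10e}{2\cdot 0.1065} - \frac{1}{4\cdot 0.1065^2\cdot 10}} \le 0.117.$$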
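Multiplying by $(\kappa_{2,0}/4)\pi|\delta|$, we get that
$$\int_{T_0}^\infty \frac{\kappa_{2,0}}{4}\left(\frac{T}{\pi|\delta|}\right)^2 e^{-0.1065\left(\frac{T}{\pi|\delta|}\right)^2}\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT\tag{9.53}$$
is at most $e^{-0.1065\left(\frac{T_0}{\pi|\delta|}\right)^2}$ times
$$\begin{aligned}
&(1.487T_0 + 2.194|\delta|)\cdot\log\frac{q|\delta|}{2} + 1.487T_0\log\frac{T_0}{\pi|\delta|} + 2.566|\delta|\log\frac{eT_0}{\pi|\delta|}\\
&\qquad\le \left(1.487 + 2.566\cdot\frac{1 + \frac{1}{\log(T_0/\pi|\delta|)}}{T_0/|\delta|}\right)T_0\log\frac{qT_0}{2\pi} \le 1.578\cdot T_0\log\frac{qT_0}{2\pi},
\end{aligned}\tag{9.54}$$
where we are using several times the assumption that $T_0 \ge 4\pi^2|\delta|$ (and, on one occasion, the fact that $|\delta| > 20/(3\pi) > 2$).

We sum (9.52) and the estimate for (9.53) we have just got to reach our conclusion.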
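Again, we record some norms obtained by symbolic integration: for $\eta$ as in (9.46),
$$\begin{aligned}
|\eta|_2^2 &= \frac{3}{8}\sqrt{\pi},\qquad |\eta'|_2^2 = \frac{7}{16}\sqrt{\pi},\\
|\eta\cdot\log|_2^2 &= \frac{\sqrt{\pi}}{64}\left(8(3\gamma - 8)\log 2 + 3\pi^2 + 6\gamma^2 + 24(\log 2)^2 + 16 - 32\gamma\right) \le 0.16364,\\
|\eta(t)/\sqrt{t}|_1 &= \frac{2^{1/4}\Gamma(1/4)}{4} \le 1.07791,\qquad |\eta(t)\sqrt{t}|_1 = \frac{3}{4}2^{3/4}\Gamma(3/4) \le 1.54568,\\
|\eta'(t)/\sqrt{t}|_1 &= \int_0^{\sqrt{2}} t^{3/2}e^{-t^2/2}\,dt - \int_{\sqrt{2}}^\infty t^{3/2}e^{-t^2/2}\,dt \le 1.48469,\\
|\eta'(t)\sqrt{t}|_1 &\le 1.72169.
\end{aligned}\tag{9.55}$$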
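Proposition 9.3.2. Let $\eta(t) = t^2e^{-t^2/2}$. Let $x \ge 1$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
$$\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta(n/x) = \begin{cases}\widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases}\tag{9.56}$$
where
$$\begin{aligned}
\mathrm{err}_{\eta,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 1.578e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\
&\quad + \left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.19\right)\cdot x^{-1/2}\\
&\quad + (3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2}.
\end{aligned}\tag{9.57}$$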
Proof. We proceed as in the proof of Prop. 9.2.2. The contribution of Lemma 9.3.1 is
$$T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 1.578e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\cdot x,$$
whereas the contribution of Lemma 9.1.4 is at most
$$\left(1.22\sqrt{T_0}\log qT_0 + 5.056\sqrt{T_0} + 1.423\log q + 37.188\right)\sqrt{x}.$$
Let us now apply Lemma 9.1.1. Since $\eta(0) = 0$, we have
$$R = O^*(c_0) = O^*(2.138 + 10.99|\delta|).$$
Lastly,
$$|\eta'|_2 + 2\pi|\delta||\eta|_2 \le 0.881 + 5.123|\delta|.$$
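Now that we have Prop. 9.3.2, we can derive from it similar bounds for a smoothing defined as the multiplicative convolution of $\eta$ with something else. In general, for $\varphi_1, \varphi_2 : [0,\infty) \to \mathbb{C}$, if we know how to bound sums of the form
$$S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x),\tag{9.58}$$
we can bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$, simply by changing the order of summation and integration:
$$\begin{aligned}
S_{f,\varphi_1 *_M \varphi_2} &= \sum_n f(n)\cdot(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right)\\
&= \int_0^\infty \sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\varphi_2(w)\frac{dw}{w}.
\end{aligned}\tag{9.59}$$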
This is particularly nice if $\varphi_2(t)$ vanishes in a neighbourhood of the origin, since then the argument $wx$ of $S_{f,\varphi_1}(wx)$ is always large.

We will use $\varphi_1(t) = t^2e^{-t^2/2}$, $\varphi_2(t) = \eta_1 *_M \eta_1$, where $\eta_1$ is 2 times the characteristic function of the interval $[1/2, 1]$. The motivation for the choice of $\varphi_1$ and $\varphi_2$ is clear: we have just got bounds based on $\varphi_1(t)$ in the major arcs, and we obtained minor-arc bounds for the weight $\varphi_2(t)$ in Part I.
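Corollary 9.3.3. Let $\eta(t) = t^2e^{-t^2/2}$, $\eta_1 = 2\cdot I_{[1/2,1]}$, $\eta_2 = \eta_1 *_M \eta_1$. Let $\eta_* = \eta_2 *_M \eta$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line. Assume that $T_0 \ge \max(10\pi|\delta|, 50)$.

Then
$$\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta_*(n/x) = \begin{cases}\widehat{\eta_*}(-\delta)x + O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta_*,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases}\tag{9.60}$$
where
$$\begin{aligned}
\mathrm{err}_{\eta_*,\chi}(\delta,x) &= T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 0.0102\cdot e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\
&\quad + \left(1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17\right)\cdot x^{-1/2}\\
&\quad + (6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2}.
\end{aligned}\tag{9.61}$$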
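Proof. The left side of (9.60) equals
$$\int_0^\infty \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta n}{x}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\frac{dw}{w} = \int_{1/4}^1 \sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta wn}{wx}\right)\eta\left(\frac{n}{wx}\right)\eta_2(w)\frac{dw}{w},$$
since $\eta_2$ is supported on $[1/4, 1]$. By Prop. 9.3.2, the main term (if $q = 1$) contributes
$$\begin{aligned}
\int_{1/4}^1 \widehat{\eta}(-\delta w)xw\cdot\eta_2(w)\frac{dw}{w} &= x\int_0^\infty \widehat{\eta}(-\delta w)\eta_2(w)\,dw\\
&= x\int_0^\infty\int_{-\infty}^\infty \eta(t)e(\delta wt)\,dt\cdot\eta_2(w)\,dw = x\int_0^\infty\int_{-\infty}^\infty \eta\left(\frac{r}{w}\right)e(\delta r)\frac{dr}{w}\,\eta_2(w)\,dw\\
&= x\int_{-\infty}^\infty\left(\int_0^\infty \eta\left(\frac{r}{w}\right)\eta_2(w)\frac{dw}{w}\right)e(\delta r)\,dr = \widehat{\eta_*}(-\delta)\cdot x.
\end{aligned}$$
The error term is
$$\int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\cdot wx\cdot\eta_2(w)\frac{dw}{w} = x\cdot\int_{1/4}^1 \mathrm{err}_{\eta,\chi}(\delta w, wx)\,\eta_2(w)\,dw.\tag{9.62}$$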
Using the fact that
$$\eta_2(w) = \begin{cases}4\log 4w & \text{if } w \in [1/4, 1/2],\\ 4\log w^{-1} & \text{if } w \in [1/2, 1],\\ 0 & \text{otherwise},\end{cases}$$
we can easily check that
$$\begin{aligned}
\int_0^\infty \eta_2(w)\,dw &= 1, & \int_0^\infty w^{-1/2}\eta_2(w)\,dw &\le 1.37259,\\
\int_0^\infty w^{-1}\eta_2(w)\,dw &= 4(\log 2)^2 \le 1.92182, & \int_0^\infty w^{-3/2}\eta_2(w)\,dw &\le 2.74517,
\end{aligned}$$
and, by rigorous numerical integration from $1/4$ to $1/2$ and from $1/2$ to $1$ (using, e.g., VNODE-LP [Ned06]),
$$\int_0^\infty e^{-0.1065\cdot 10^2\left(\frac{1}{w^2} - 1\right)}\eta_2(w)\,dw \le 0.006446.$$
We then see that (9.57) and (9.62) imply (9.61).
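The moments of $\eta_2$ quoted above can be cross-checked in ordinary floating point. The sketch below is not a substitute for the rigorous integration cited in the text: it uses composite Simpson quadrature (split at the kink $w = 1/2$) and compares with the closed form $\int_0^\infty w^{s-1}\eta_2(w)\,dw = \left(2(1 - 2^{-s})/s\right)^2$, which follows from $\eta_2 = \eta_1 *_M \eta_1$. All function and variable names are hypothetical.

```python
import math

def eta2(w: float) -> float:
    """Piecewise formula for eta_2 = eta_1 *_M eta_1."""
    if 0.25 <= w <= 0.5:
        return 4 * math.log(4 * w)
    if 0.5 < w <= 1.0:
        return -4 * math.log(w)
    return 0.0

def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def weighted_integral(weight) -> float:
    """Integral of weight(w) * eta2(w) over [1/4, 1], split at w = 1/2."""
    g = lambda w: weight(w) * eta2(w)
    return simpson(g, 0.25, 0.5) + simpson(g, 0.5, 1.0)

def closed_form(s: float) -> float:
    """Square of the Mellin transform of eta_1 at s (s != 0)."""
    return (2 * (1 - 2.0 ** (-s)) / s) ** 2

m0 = weighted_integral(lambda w: 1.0)
m_half = weighted_integral(lambda w: w ** -0.5)
m_one = weighted_integral(lambda w: 1.0 / w)
m_three_halves = weighted_integral(lambda w: w ** -1.5)
gauss_tail = weighted_integral(
    lambda w: math.exp(-0.1065 * 10**2 * (1 / w**2 - 1)))

print(m0, m_half, m_one, m_three_halves, gauss_tail)
```

The last quantity is the factor that turns the coefficient $1.578$ of (9.57) into the coefficient $0.0102$ of (9.61).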
9.4 The case of $\eta_+(t)$

We will work with
$$\eta(t) = \eta_+(t) = h_H(t)\cdot t\eta_\heartsuit(t) = h_H(t)\cdot te^{-t^2/2},\tag{9.63}$$
where $h_H$ is as in (7.6). We recall that $h_H$ is a band-limited approximation to the function $h$ defined in (7.5); to be more precise, $Mh_H(it)$ is the truncation of $Mh(it)$ to the interval $[-H, H]$.

We are actually defining $h$, $h_H$ and $\eta$ in a slightly different way from what was done in the first version of [Hela]. The difference is instructive. There, $\eta(t)$ was defined as $h_H(t)e^{-t^2/2}$, and $h_H$ was a band-limited approximation to a function $h$ defined as in (7.5), but with $t^3(2-t)^3$ instead of $t^2(2-t)^3$. The reason for our new definitions is that now the truncation of $Mh(it)$ will not break the holomorphy of $M\eta$, and so we will be able to use the general results we proved in §9.1.

In essence, $Mh$ will still be holomorphic because the Mellin transform of $t\eta_\heartsuit(t)$ is holomorphic in the domain we care about, unlike the Mellin transform of $\eta_\heartsuit(t)$, which does have a pole at $s = 0$.

As usual, we start by bounding the contribution of zeros with large imaginary part. The procedure is much as before: since $\eta_+(t) = \eta_H(t)\eta_\heartsuit(t)$, the Mellin transform $M\eta_+$ is a convolution of $M(te^{-t^2/2})$ and something of support in $[-H, H]i$, namely, $M\eta_H$ restricted to the imaginary axis. This means that the decay of $M\eta_+$ is (at worst) like the decay of $M(te^{-t^2/2})$, delayed by $H$.
Lemma 9.4.1. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \in \mathbb{R}^+$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ satisfy $\Re(\rho) = 1/2$, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Write $G_\delta(s)$ for the Mellin transform of $\eta(t)e(\delta t)$. Then
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| \le \left(11.308\sqrt{T_0'}\,e^{-0.1598T_0'} + 16.147|\delta|e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi},$$
where $T_0' = T_0 - H$.
Proof. As usual,
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| = \sum_{\substack{\rho\\ \Im(\rho) > T_0}} \left(|G_\delta(\rho)| + |G_\delta(1-\rho)|\right).$$
Let $F_\delta$ be as in (9.47). Then, since $\eta_+(t)e(\delta t) = h_H(t)te^{-t^2/2}e(\delta t)$, where $h_H$ is as in (7.6), we see by (2.9) that
$$G_\delta(s) = \frac{1}{2\pi}\int_{-H}^H Mh(ir)F_\delta(s + 1 - ir)\,dr,$$
and so, since $|Mh(ir)| = |Mh(-ir)|$,
$$|G_\delta(\rho)| + |G_\delta(1-\rho)| \le \frac{1}{2\pi}\int_{-H}^H |Mh(ir)|\left(|F_\delta(1 + \rho - ir)| + |F_\delta(2 - (\rho - ir))|\right)dr.\tag{9.64}$$
We apply Cor. 8.0.2 with $k = 1$ and $T_0 - H$ instead of $T_0$, and obtain that $|F_\delta(\rho)| + |F_\delta(1-\rho)| \le g(\tau)$, where
$$g(\tau) = \kappa_{1,1}\sqrt{|\tau|}\,e^{-0.1598|\tau|} + \kappa_{1,0}\frac{|\tau|}{2\pi|\delta|}e^{-0.1065\left(\frac{\tau}{\pi\delta}\right)^2},\tag{9.65}$$
where $\kappa_{1,0} = 4.903$ and $\kappa_{1,1} = 4.017$. (As in the proof of Lemmas 9.2.1 and 9.3.1, we are putting in extra terms so as to simplify our integrals.)

From (9.64), we conclude that
$$|G_\delta(\rho)| + |G_\delta(1-\rho)| \le f(\tau)$$
for $\rho = \sigma + i\tau$, $\tau > 0$, where
$$f(\tau) = \frac{|Mh(ir)|_1}{2\pi}\cdot g(\tau - H)$$
is decreasing for $\tau \ge T_0$ (because $g(\tau)$ is decreasing for $\tau \ge T_0 - H$). By (A.17), $|Mh(ir)|_1 \le 16.193918$.
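We apply Lemma 9.1.3, and get that
$$\begin{aligned}
\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |G_\delta(\rho)| &\le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT\\
&= \frac{|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty g(T - H)\left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)dT.
\end{aligned}\tag{9.66}$$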
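Now we just need to estimate some integrals. For any $y \ge e^2$, $c > 0$ and $\kappa, \kappa_1 \ge 0$,
$$\begin{aligned}
\int_y^\infty \sqrt{t}\,e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{1}{2c^2\sqrt{y}}\right)e^{-cy},\\
\int_y^\infty \left(\sqrt{t}\log(t + \kappa) + \frac{\kappa_1}{\sqrt{t}}\right)e^{-ct}\,dt &\le \left(\frac{\sqrt{y}}{c} + \frac{a}{c^2\sqrt{y}}\right)\log(y + \kappa)\,e^{-cy},
\end{aligned}$$
where
$$a = \frac{1}{2} + \frac{1 + c\kappa_1}{\log(y + \kappa)}.$$
The contribution of the exponential term in (9.65) to (9.66) thus equals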
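$$\begin{aligned}
&\frac{\kappa_{1,1}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\sqrt{T - H}\cdot e^{-0.1598(T - H)}\,dT\\
&\qquad\le 10.3532\int_{T_0 - H}^\infty \left(\frac{1}{2\pi}\log(T + H) + \frac{\log\frac{q}{2\pi}}{2\pi} + \frac{1}{4T}\right)\sqrt{T}\,e^{-0.1598T}\,dT\\
&\qquad\le \frac{10.3532}{2\pi}\left(\frac{\sqrt{T_0 - H}}{0.1598} + \frac{a}{0.1598^2\sqrt{T_0 - H}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0 - H)},
\end{aligned}\tag{9.67}$$
where $a = 1/2 + (1 + 0.1598\pi/2)/\log T_0$. Since $T_0 - H \ge 50$ and $T_0 \ge 50 + 25 = 75$, this is at most
$$11.308\sqrt{T_0 - H}\,\log\frac{qT_0}{2\pi}\cdot e^{-0.1598(T_0 - H)}.$$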
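We now estimate a few more integrals so that we can handle the Gaussian term in (9.65). For any $y > 1$, $c > 0$, $\kappa, \kappa_1 \ge 0$,
$$\begin{aligned}
\int_y^\infty te^{-ct^2}\,dt &= \frac{e^{-cy^2}}{2c},\\
\int_y^\infty (t\log(t + \kappa) + \kappa_1)e^{-ct^2}\,dt &\le \left(1 + \frac{\kappa_1 + \frac{1}{2cy}}{y\log(y + \kappa)}\right)\frac{\log(y + \kappa)\cdot e^{-cy^2}}{2c}.
\end{aligned}$$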
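Proceeding just as before, we see that the contribution of the Gaussian term in (9.65) to (9.66) is at most
$$\begin{aligned}
&\frac{\kappa_{1,0}|Mh(ir)|_1}{2\pi}\int_{T_0}^\infty \left(\frac{1}{2\pi}\log\frac{qT}{2\pi} + \frac{1}{4T}\right)\frac{T - H}{2\pi|\delta|}\cdot e^{-0.1065\left(\frac{T - H}{\pi\delta}\right)^2}\,dT\\
&\qquad\le 12.6368\cdot\frac{|\delta|}{4}\int_{\frac{T_0 - H}{\pi|\delta|}}^\infty \left(\log\left(T + \frac{H}{\pi|\delta|}\right) + \log\frac{q|\delta|}{2} + \frac{\pi/2}{T}\right)Te^{-0.1065T^2}\,dT\\
&\qquad\le 12.6368\cdot\frac{|\delta|}{8\cdot 0.1065}\left(1 + \frac{\frac{\pi}{2} + \frac{\pi|\delta|}{2\cdot 0.1065\cdot(T_0 - H)}}{\frac{T_0 - H}{\pi|\delta|}\log\frac{T_0}{\pi|\delta|}}\right)\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0 - H}{\pi\delta}\right)^2}.
\end{aligned}\tag{9.68}$$
Since $(T_0 - H)/(\pi|\delta|) \ge 10$, this is at most
$$16.147|\delta|\log\frac{qT_0}{2\pi}\cdot e^{-0.1065\left(\frac{T_0 - H}{\pi\delta}\right)^2}.$$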
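Proposition 9.4.2. Let $\eta = \eta_+$ be as in (9.63) for some $H \ge 25$. Let $x \ge 10^3$, $\delta \in \mathbb{R}$. Let $\chi$ be a primitive character mod $q$, $q \ge 1$. Assume that all non-trivial zeros $\rho$ of $L(s,\chi)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge H + \max(10\pi|\delta|, 50)$.

Then
$$\sum_{n=1}^\infty \Lambda(n)\chi(n)e\left(\frac{\delta}{x}n\right)\eta_+(n/x) = \begin{cases}\widehat{\eta_+}(-\delta)x + O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q = 1,\\ O^*\left(\mathrm{err}_{\eta_+,\chi}(\delta,x)\right)\cdot x & \text{if } q > 1,\end{cases}\tag{9.69}$$
where
$$\begin{aligned}
\mathrm{err}_{\eta_+,\chi}(\delta,x) &= \left(11.308\sqrt{T_0'}\cdot e^{-0.1598T_0'} + 16.147|\delta|e^{-0.1065\left(\frac{T_0'}{\pi\delta}\right)^2}\right)\log\frac{qT_0}{2\pi}\\
&\quad + \left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{-1/2}\\
&\quad + (9 + 11|\delta|)x^{-1} + (\log q)(11 + 6|\delta|)x^{-3/2},
\end{aligned}\tag{9.70}$$
where $T_0' = T_0 - H$.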
Proof. We can apply Lemma 9.1.1 and Lemma 9.1.4 because $\eta_+(t)$, $(\log t)\eta_+(t)$ and $\eta_+'(t)$ are in $\ell_2$ (by (A.25), (A.28) and (A.32)) and $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ are in $\ell_1$ for $\sigma$ in an open interval containing $[1/2, 3/2]$ (by (A.30) and (A.33)). (Because of (9.5), the fact that $\eta_+(t)t^{-1/2}$ and $\eta_+(t)t^{1/2}$ are in $\ell_1$ implies that $\eta_+(t)\log t$ is also in $\ell_1$, as is required by Lemma 9.1.4.)

We apply Lemmas 9.1.1, 9.1.4 and 9.4.1. We bound the norms involving $\eta_+$ using the estimates in §A.3 and §A.4. Since $\eta_+(0) = 0$ (by the definition (A.3) of $\eta_+$), the term $R$ in (9.2) is at most $c_0$, where $c_0$ is as in (9.3). We bound
$$\begin{aligned}
c_0 &\le \frac{2}{3}\left(2.922875\left(\sqrt{\Gamma(1/2)} + \sqrt{\Gamma(3/2)}\right) + 1.062319\left(\sqrt{\Gamma(5/2)} + \sqrt{\Gamma(7/2)}\right)\right)\\
&\quad + \frac{4\pi}{3}|\delta|\cdot 1.062319\left(\sqrt{\Gamma(3/2)} + \sqrt{\Gamma(5/2)}\right) \le 6.536232 + 9.319578|\delta|
\end{aligned}$$
using (A.30) and (A.33). By (A.25), (A.32) and the assumption $H \ge 25$,
$$|\eta_+|_2 \le 0.80365,\qquad |\eta_+'|_2 \le 10.845789.$$
Thus, the error terms in (9.1) total at most
$$\begin{aligned}
&6.536232 + 9.319578|\delta| + (\log q + 6.01)(10.845789 + 2\pi\cdot 0.80365|\delta|)x^{-1/2}\\
&\qquad\le 9 + 11|\delta| + (\log q)(11 + 6|\delta|)x^{-1/2}.
\end{aligned}\tag{9.71}$$
The part of the sum $\sum_\rho G_\delta(\rho)x^\rho$ in (9.1) corresponding to zeros $\rho$ with $|\Im(\rho)| > T_0$ gets estimated by Lem. 9.4.1. By Lemma 9.1.4, the part of the sum corresponding to zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most
$$\left(1.634\sqrt{T_0}\log qT_0 + 12.43\sqrt{T_0} + 1.321\log q + 34.51\right)x^{1/2},$$
where we estimate the norms $|\eta_+|_2$, $|\eta\cdot\log|_2$ and $|\eta(t)/\sqrt{t}|_1$ by (A.25), (A.28) and (A.30).
9.5 A sum for $\eta_+(t)^2$

Using a smoothing function sometimes leads to considering sums involving the square of the smoothing function. In particular, in Part III, we will need a result involving $\eta_+^2$, something that could be slightly challenging to prove, given the way in which $\eta_+$ is defined. Fortunately, we have bounds on $|\eta_+|_\infty$ and other $\ell_\infty$-norms (see Appendix A.5). Our task will also be made easier by the fact that we do not have a phase $e(\delta n/x)$ this time. All in all, this will be yet another demonstration of the generality of the framework developed in §9.1.
Proposition 9.5.1. Let $\eta = \eta_+$ be as in (9.63), $H \ge 25$. Let $x \ge 10^8$. Assume that all non-trivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$ with $|\Im(\rho)| \le T_0$ lie on the critical line, where $T_0 \ge \max(2H + 25, 200)$.

Then
$$\sum_{n=1}^\infty \Lambda(n)(\log n)\eta_+^2(n/x) = x\cdot\int_0^\infty \eta_+^2(t)\log xt\,dt + O^*(\mathrm{err}_{\ell_2,\eta_+})\cdot x\log x,\tag{9.72}$$
where
$$\begin{aligned}
\mathrm{err}_{\ell_2,\eta_+} &= \left(\left(0.462\frac{(\log T_1)^2}{\log x} + 0.909\log T_1\right)T_1 + 1.71\left(1 + \frac{\log T_1}{\log x}\right)H\right)e^{-\frac{\pi}{4}T_1}\\
&\quad + \left(2.445\sqrt{T_0}\log T_0 + 50.04\right)\cdot x^{-1/2}
\end{aligned}\tag{9.73}$$
and $T_1 = T_0 - 2H$.

The assumption $T_0 \ge 200$ is stronger than what we strictly need, but, as it happens, we could make much stronger assumptions still. Proposition 9.5.1 relies on a verification of zeros of the Riemann zeta function; such verifications have gone up to values of $T_0$ much higher than 200.
Proof. We will need to consider two smoothing functions, namely, $\eta_{+,0}(t) = \eta_+(t)^2$ and $\eta_{+,1}(t) = \eta_+(t)^2\log t$. Clearly,
$$\sum_{n=1}^\infty \Lambda(n)(\log n)\eta_+^2(n/x) = (\log x)\sum_{n=1}^\infty \Lambda(n)\eta_{+,0}(n/x) + \sum_{n=1}^\infty \Lambda(n)\eta_{+,1}(n/x).$$
Since $\eta_+(t) = h_H(t)te^{-t^2/2}$,
$$\eta_{+,0}(t) = h_H^2(t)t^2e^{-t^2},\qquad \eta_{+,1}(t) = h_H^2(t)(\log t)t^2e^{-t^2}.$$
Let $\eta_{+,2} = (\log x)\eta_{+,0} + \eta_{+,1} = \eta_+^2(t)\log xt$.

We wish to apply Lemma 9.1.1. For this, we must first check that some norms are finite. Clearly,
$$\begin{aligned}
\eta_{+,2}(t) &= \eta_+^2(t)\log x + \eta_+^2(t)\log t,\\
\eta_{+,2}'(t) &= 2\eta_+(t)\eta_+'(t)\log x + 2\eta_+(t)\eta_+'(t)\log t + \eta_+^2(t)/t.
\end{aligned}\tag{9.74}$$
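Thus, we see that $\eta_{+,2}(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$ and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$ (see (A.25), (A.38), (A.40)):
$$\begin{aligned}
|\eta_{+,2}(t)|_2 &\le \left|\eta_+^2(t)\right|_2\log x + \left|\eta_+^2(t)\log t\right|_2\\
&\le |\eta_+|_\infty|\eta_+|_2\log x + |\eta_+(t)\log t|_\infty|\eta_+|_2.
\end{aligned}\tag{9.75}$$
Similarly, $\eta_{+,2}'(t)$ is in $\ell_2$ because $\eta_+(t)$ is in $\ell_2$, $\eta_+'(t)$ is in $\ell_2$ (A.32), and $\eta_+(t)$, $\eta_+(t)\log t$ and $\eta_+(t)/t$ (see (A.41)) are all in $\ell_\infty$:
$$\begin{aligned}
\left|\eta_{+,2}'(t)\right|_2 &\le \left|2\eta_+(t)\eta_+'(t)\right|_2\log x + \left|2\eta_+(t)\eta_+'(t)\log t\right|_2 + \left|\eta_+^2(t)/t\right|_2\\
&\le 2|\eta_+|_\infty\left|\eta_+'\right|_2\log x + 2|\eta_+(t)\log t|_\infty\left|\eta_+'\right|_2 + |\eta_+(t)/t|_\infty|\eta_+|_2.
\end{aligned}\tag{9.76}$$
In the same way, we see that $\eta_{+,2}(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(-1,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ (A.30), and $\eta_+(t)$, $\eta_+(t)\log t$ are both in $\ell_\infty$) and $\eta_{+,2}'(t)t^{\sigma-1}$ is in $\ell_1$ for all $\sigma$ in $(0,\infty)$ (because the same is true of $\eta_+(t)t^{\sigma-1}$ and $\eta_+'(t)t^{\sigma-1}$ (A.33), and $\eta_+(t)$, $\eta_+(t)\log t$, $\eta_+(t)/t$ are all in $\ell_\infty$).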
We now apply Lemma 9.1.1 with $q = 1$, $\delta = 0$. Since $\eta_{+,2}(0) = 0$, the residue term $R$ equals $c_0$, which, by (9.74), is at most $2/3$ times
$$2\left(|\eta_+|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left(\left|\eta_+'(t)/\sqrt{t}\right|_1 + \left|\eta_+'(t)\sqrt{t}\right|_1\right) + |\eta_+(t)/t|_\infty\left(\left|\eta_+(t)/\sqrt{t}\right|_1 + \left|\eta_+(t)\sqrt{t}\right|_1\right).$$
Using the bounds (A.38), (A.40), (A.41) (with the assumption $H \ge 25$), (A.30) and (A.33), we get that this means that
$$c_0 \le 18.57606\log x + 8.63264.$$
Since $q = 1$ and $\delta = 0$, we get from (9.76) (and (A.38), (A.40), (A.41), with the assumption $H \ge 25$, and also (A.25) and (A.32)) that
$$(\log q + 6.01)\cdot\left(\left|\eta_{+,2}'\right|_2 + 2\pi|\delta||\eta_{+,2}|_2\right)x^{-1/2} = 6.01\left|\eta_{+,2}'\right|_2x^{-1/2} \le (162.56\log x + 59.325)x^{-1/2}.$$
Using the assumption $x \ge 10^8$, we obtain
$$c_0 + (185.26\log x + 71.799)x^{-1/2} \le 19.064\log x.\tag{9.77}$$
We will now apply Lemma 9.1.4, as we may, because of the finiteness of the norms we have already checked, together with
$$\begin{aligned}
|\eta_{+,2}(t)\log t|_2 &\le \left|\eta_+^2(t)\log t\right|_2\log x + \left|\eta_+^2(t)(\log t)^2\right|_2\\
&\le |\eta_+(t)\log t|_\infty\left(|\eta_+(t)|_2\log x + |\eta_+(t)\log t|_2\right)\\
&\le 0.4976\cdot(0.80365\log x + 0.82999) \le 0.3999\log x + 0.41301
\end{aligned}\tag{9.78}$$
(by (A.40), (A.25) and (A.28); use the assumption $H \ge 25$). We also need the bounds
$$|\eta_{+,2}(t)|_2 \le 1.14199\log x + 0.39989\tag{9.79}$$
(from (9.75), by the norm bounds (A.38), (A.40) and (A.25), all with $H \ge 25$) and
$$\left|\eta_{+,2}(t)/\sqrt{t}\right|_1 \le \left(|\eta_+(t)|_\infty\log x + |\eta_+(t)\log t|_\infty\right)\left|\eta_+(t)/\sqrt{t}\right|_1 \le 1.4211\log x + 0.49763\tag{9.80}$$
(by (A.38), (A.40) (again with $H \ge 25$) and (A.30)).

Applying Lemma 9.1.4, we obtain that the sum $\sum_\rho |G_0(\rho)|x^\rho$ (where $G_0(\rho) = M\eta_{+,2}(\rho)$) over all non-trivial zeros $\rho$ with $|\Im(\rho)| \le T_0$ is at most $x^{1/2}$ times
$$(1.54189\log x + 0.8129)\sqrt{T_0}\log T_0 + (4.21245\log x + 6.17301)\sqrt{T_0} + 49.1\log x + 17.2,\tag{9.81}$$
where we are bounding norms by (9.79), (9.78) and (9.80). (We are using the fact that $T_0 \ge 2\pi\sqrt{e}$ to ensure that the quantity $\sqrt{T_0}\log T_0 - (\log 2\pi\sqrt{e})\sqrt{T_0}$ being multiplied by $|\eta_{+,2}|_2$ is positive; thus, an upper bound for $|\eta_{+,2}|_2$ suffices.) By the assumptions $x \ge 10^8$, $T_0 \ge 200$, (9.81) is at most
$$\left(2.445\sqrt{T_0}\log T_0 + 50.034\right)\log x.$$
In comparison, $19.064x^{-1/2}\log x \le 0.002\log x$, since $x \ge 10^8$.

It remains to bound the sum of $M\eta_{+,2}(\rho)$ over zeros with $|\Im(\rho)| > T_0$. This we will do, as usual, by Lemma 9.1.3. For that, we will need to bound $M\eta_{+,2}(\rho)$ for $\rho$ in the critical strip.
The Mellin transform of $e^{-t^2}$ is $\Gamma(s/2)/2$, and so the Mellin transform of $t^2e^{-t^2}$ is $\Gamma(s/2 + 1)/2$. By (2.10), this implies that the Mellin transform of $(\log t)t^2e^{-t^2}$ is $\Gamma'(s/2 + 1)/4$. Hence, by (2.9),
$$M\eta_{+,2}(s) = \frac{1}{4\pi}\int_{-\infty}^\infty M(h_H^2)(ir)\cdot F_x(s - ir)\,dr,\tag{9.82}$$
where
$$F_x(s) = (\log x)\Gamma\left(\frac{s}{2} + 1\right) + \frac{1}{2}\Gamma'\left(\frac{s}{2} + 1\right).\tag{9.83}$$
Moreover,
$$M(h_H^2)(ir) = \frac{1}{2\pi}\int_{-\infty}^\infty Mh_H(iu)Mh_H(i(r - u))\,du,\tag{9.84}$$
and so $M(h_H^2)(ir)$ is supported on $[-2H, 2H]$. We also see that $|Mh_H^2(ir)|_1 \le |Mh_H(ir)|_1^2/2\pi$. We know that $|Mh_H(ir)|_1^2/2\pi \le 41.73727$ by (A.17).

Hence
$$\begin{aligned}
|M\eta_{+,2}(s)| &\le \frac{1}{4\pi}\int_{-\infty}^\infty |M(h_H^2)(ir)|\,dr\cdot\max_{|r|\le 2H}|F_x(s - ir)|\\
&\le \frac{41.73727}{4\pi}\cdot\max_{|r|\le 2H}|F_x(s - ir)| \le 3.32135\cdot\max_{|r|\le 2H}|F_x(s - ir)|.
\end{aligned}\tag{9.85}$$
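By (8.51) (Stirling with explicit constants),
$$|\Gamma(s)| \le \sqrt{2\pi}\,|s|^{\sigma - \frac{1}{2}}e^{\frac{1}{12|s|} + \frac{\sqrt{2}}{180|s|^3}}e^{-\pi|\Im(s)|/2}\tag{9.86}$$
when $\Re(s) \ge 0$, and so
$$\begin{aligned}
|\Gamma(s)| &\le \sqrt{2\pi}\left(\frac{\sqrt{12.5^2 + 1.5^2}}{12.5}\right)e^{\frac{1}{12\cdot 12.5} + \frac{\sqrt{2}}{180\cdot 12.5^3}}\cdot|\Im(s)|e^{-\pi|\Im(s)|/2}\\
&\le 2.542\,|\Im(s)|e^{-\pi|\Im(s)|/2}
\end{aligned}\tag{9.87}$$
for $s \in \mathbb{C}$ with $0 < \Re(s) \le 3/2$ and $|\Im(s)| \ge 25/2$. Moreover, by [OLBC10, 5.11.2] and the remarks at the beginning of [OLBC10, 5.11(ii)],
$$\frac{\Gamma'(s)}{\Gamma(s)} = \log s - \frac{1}{2s} + O^*\left(\frac{1}{12|s|^2}\cdot\frac{1}{\cos^3(\theta/2)}\right)$$
for $|\arg(s)| < \theta$ ($\theta \in (-\pi, \pi)$). Again, for $s = \sigma + i\tau$ with $0 < \sigma \le 3/2$ and $|\tau| \ge 25/2$, this gives us
$$\begin{aligned}
\frac{\Gamma'(s)}{\Gamma(s)} &= \log|\tau| + \log\frac{\sqrt{|\tau|^2 + 1.5^2}}{|\tau|} + O^*\left(\frac{1}{2|\tau|}\right) + O^*\left(\frac{1}{12|\tau|^2}\cdot\frac{1}{(1/\sqrt{2})^3}\right)\\
&= \log|\tau| + O^*\left(\frac{9}{8|\tau|^2} + \frac{1}{2|\tau|}\right) + \frac{O^*(0.236)}{|\tau|^2}\\
&= \log|\tau| + O^*\left(\frac{0.609}{|\tau|}\right).
\end{aligned}$$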
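The constants $2.542$ in (9.87) and $0.609$ in the digamma estimate are plain arithmetic and can be reproduced as follows. This is a minimal floating-point sketch with hypothetical variable names; it evaluates the expressions at the extreme values $|\Im(s)| = 12.5$, $\Re(s) = 3/2$ used in the text.

```python
import math

t_min = 12.5      # lower bound on |Im s|
sigma_max = 1.5   # upper bound on Re s

# Constant in (9.87): sqrt(2*pi) * (|s|/|Im s|) * exp(1/(12|s|) + sqrt(2)/(180|s|^3)),
# with |s|/|Im s| <= sqrt(12.5^2 + 1.5^2)/12.5 and |s| >= 12.5 in the exponent.
stirling_const = (math.sqrt(2 * math.pi)
                  * math.sqrt(t_min**2 + sigma_max**2) / t_min
                  * math.exp(1 / (12 * t_min)
                             + math.sqrt(2) / (180 * t_min**3)))

# Digamma error: 1/(2|tau|) + (9/8 + 0.236)/|tau|^2 <= digamma_const/|tau|.
cos_term = (1 / 12) / (1 / math.sqrt(2)) ** 3   # claimed <= 0.236
digamma_const = 0.5 + (9 / 8 + 0.236) / t_min

print(f"stirling_const = {stirling_const:.5f}  (claimed <= 2.542)")
print(f"cos_term       = {cos_term:.5f}  (claimed <= 0.236)")
print(f"digamma_const  = {digamma_const:.5f}  (claimed <= 0.609)")
```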
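Hence, for $0 \le \Re(s) \le 1$ (or, in fact, $-2 \le \Re(s) \le 1$) and $|\Im(s)| \ge 25$, writing $\tau = \Im(s)$,
$$\begin{aligned}
|F_x(s)| &\le \left((\log x) + \frac{1}{2}\log\left|\frac{\tau}{2}\right| + \frac{1}{2}O^*\left(\frac{0.609}{|\tau/2|}\right)\right)\left|\Gamma\left(\frac{s}{2} + 1\right)\right|\\
&\le 2.542\left((\log x) + \frac{1}{2}\log|\tau| - 0.297\right)\frac{|\tau|}{2}e^{-\pi|\tau|/4}.
\end{aligned}\tag{9.88}$$
Thus, by (9.85), for $\rho = \sigma + i\tau$ with $|\tau| \ge T_0 \ge 2H + 25$ and $0 \le \sigma \le 1$,
$$|M\eta_{+,2}(\rho)| \le f(|\tau|),$$
where
$$f(T) = 8.45\left(\log x + \frac{1}{2}\log T\right)\left(\frac{T}{2} - H\right)\cdot e^{-\frac{\pi(T - 2H)}{4}}.\tag{9.89}$$
The functions $t \mapsto te^{-\pi t/2}$ and $t \mapsto (\log t)te^{-\pi t/2}$ are decreasing for $t \ge e$ (or in fact for $t \ge 1.762$); setting $t = T/2 - H$, we see that the right side of (9.89) is a decreasing function of $T$ for $T \ge T_0$, since $T_0/2 - H \ge 25/2 > e$.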
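We can now apply Lemma 9.1.3, and get that
$$\sum_{\substack{\rho\\ |\Im(\rho)| > T_0}} |M\eta_{+,2}(\rho)| \le \int_{T_0}^\infty f(T)\left(\frac{1}{2\pi}\log\frac{T}{2\pi} + \frac{1}{4T}\right)dT.\tag{9.90}$$
Since $T \ge T_0 \ge 75 > 2$, we know that $\left((1/2\pi)\log(T/2\pi) + 1/4T\right) \le (1/2\pi)\log T$. Hence, the right side of (9.90) is at most
$$\begin{aligned}
&\frac{8.39}{4\pi}\int_{T_0}^\infty \left((\log x)(\log T) + \frac{(\log T)^2}{2}\right)(T - 2H)e^{-\frac{\pi(T - 2H)}{4}}\,dT\\
&\qquad\le 0.668\int_{T_1}^\infty \left((\log x)\left(\log t + \frac{2H}{t}\right) + \left(\frac{(\log t)^2}{2} + 2H\frac{\log t}{t}\right)\right)te^{-\frac{\pi t}{4}}\,dt,
\end{aligned}\tag{9.91}$$
where $T_1 = T_0 - 2H$ and $t = T - 2H$; we are using the facts that $(\log t)'' < 0$ for $t > 0$ and $((\log t)^2)'' < 0$ for $t > e$. (Of course, $T_1 \ge 25 > e$.)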
Of course, $\int_{T_1}^\infty e^{-(\pi/4)t}\,dt = (4/\pi)e^{-(\pi/4)T_1}$. We recall (9.36) and (9.50):
$$\begin{aligned}
\int_{T_1}^\infty \log t\cdot e^{-\frac{\pi}{4}t}\,dt &\le \left(\log T_1 + \frac{4/\pi}{T_1}\right)\frac{e^{-\frac{\pi}{4}T_1}}{\pi/4},\\
\int_{T_1}^\infty (\log t)te^{-\frac{\pi}{4}t}\,dt &\le \left(T_1 + \frac{4a}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}\log T_1}{\pi/4}
\end{aligned}$$
for $T_1 \ge 1$ satisfying $\log T_1 > 4/(\pi T_1)$, where $a = 1 + (1 + 4/(\pi T_1))/(\log T_1 - 4/(\pi T_1))$. It is easy to check that $\log T_1 > 4/(\pi T_1)$ and $4a/\pi \le 1.6957$ for $T_1 \ge 25$; of course, we also have $(4/\pi)/25 \le 0.051$. Lastly,
$$\int_{T_1}^\infty (\log t)^2te^{-\frac{\pi}{4}t}\,dt \le \left(T_1 + \frac{4b}{\pi}\right)\frac{e^{-\frac{\pi}{4}T_1}(\log T_1)^2}{\pi/4}$$
for $T_1 \ge e$, where $b = 1 + (2 + 8/(\pi T_1))/(\log T_1 - 8/(\pi T_1))$, and we check that $4b/\pi \le 2.1319$ for $T_1 \ge 25$. We conclude that the integral on the second line of (9.91) is at most
$$\frac{4}{\pi}\left(\frac{(\log T_1)^2}{2}(T_1 + 2.132) + (\log x)(\log T_1)(T_1 + 1.696)\right)e^{-\frac{\pi}{4}T_1} + \frac{4}{\pi}\cdot 2H(\log T_1 + 0.051 + \log x)e^{-\frac{\pi}{4}T_1}.$$
Multiplying this by 0.668 and simplifying further (using $T_1 \ge 25$), we conclude that $\sum_{\rho: |\Im(\rho)| > T_0} |M\eta_{+,2}(\rho)|$ is at most
$$\left((0.462\log T_1 + 0.909\log x)(\log T_1)T_1 + 1.71(\log T_1 + \log x)H\right)e^{-\frac{\pi}{4}T_1}.$$
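The auxiliary constants in this last step ($4a/\pi$, $4b/\pi$, and the final coefficients $0.462$, $0.909$, $1.71$) can be recomputed at the extreme values $T_1 = 25$, $x = 10^8$. The sketch below is a floating-point illustration under those assumptions, with hypothetical variable names; it does not verify the monotonicity in $T_1$ that the text uses implicitly.

```python
import math

T1 = 25.0        # T1 = T0 - 2H >= 25
x_min = 1e8      # assumption x >= 10^8 from Proposition 9.5.1

eps = 4 / (math.pi * T1)                        # 4/(pi*T1) = (4/pi)/25
a = 1 + (1 + eps) / (math.log(T1) - eps)
b = 1 + (2 + 2 * eps) / (math.log(T1) - 2 * eps)
four_a_over_pi = 4 * a / math.pi
four_b_over_pi = 4 * b / math.pi

k = 0.668 * 4 / math.pi                         # prefactor 0.668 * (4/pi)
coef_log_sq = k / 2 * (1 + 2.132 / T1)          # multiplies (log T1)^2 * T1
coef_log_x = k * (1 + 1.696 / T1)               # multiplies (log x)(log T1) * T1
coef_H = 2 * k * (1 + 0.051 / (math.log(T1) + math.log(x_min)))

print(four_a_over_pi, four_b_over_pi)
print(coef_log_sq, coef_log_x, coef_H)
```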
9.6 A verification of zeros and its consequences

David Platt verified in his doctoral thesis [Pla11] that, for every primitive character $\chi$ of conductor $q \le 10^5$, all the non-trivial zeroes of $L(s,\chi)$ with imaginary part $\le 10^8/q$ lie on the critical line, i.e., have real part exactly $1/2$. (We call this a GRH verification up to $10^8/q$.)

In work undertaken in coordination with the present work [Plab], Platt has extended these computations to

• all odd $q \le 3\cdot 10^5$, with $T_q = 10^8/q$,

• all even $q \le 4\cdot 10^5$, with $T_q = \max(10^8/q, 200 + 7.5\cdot 10^7/q)$.

The method used was rigorous; its implementation uses interval arithmetic.

Let us see what this verification gives us when used as an input to Prop. 9.2.2. We are interested in bounds on $|\mathrm{err}_{\eta,\chi^*}(\delta,x)|$ for $q \le r$ and $|\delta| \le 4r/q$. We set $r = 3\cdot 10^5$. (We will not be using the verification for $q$ even with $3\cdot 10^5 < q \le 4\cdot 10^5$, though we certainly could.)
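We let $T_0 = 10^8/q$. Thus,
$$T_0 \ge \frac{10^8}{3\cdot 10^5} = \frac{1000}{3},\qquad \frac{T_0}{\pi|\delta|} \ge \frac{10^8/q}{\pi\cdot 4r/q} = \frac{1000}{12\pi},\tag{9.92}$$
and so, by $|\delta| \le 4r/q \le 1.2\cdot 10^6/q \le 1.2\cdot 10^6$,
$$\begin{aligned}
3.53e^{-0.1598T_0} &\le 2.597\cdot 10^{-23},\\
22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}} &\le |\delta|\cdot 7.715\cdot 10^{-34} \le 9.258\cdot 10^{-28}.
\end{aligned}$$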
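These two bounds follow from (9.92) by monotonicity and can be reproduced numerically. The sketch below (hypothetical variable names, ordinary floating point) uses $\delta^2/T_0 = |\delta|\cdot(|\delta|/T_0)$ together with $|\delta|/T_0 \le 12/1000$, which is a restatement of the second inequality in (9.92).

```python
import math

r = 3e5
T0_min = 1e8 / r                   # T0 = 10^8/q >= 1000/3
y_min = 1000 / (12 * math.pi)      # lower bound for T0/(pi*|delta|), from (9.92)
delta_max = 4 * r                  # |delta| <= 4r/q <= 1.2e6

# Exponential term, largest at the smallest admissible T0.
exp_term = 3.53 * math.exp(-0.1598 * T0_min)

# Gaussian term per unit |delta|: 22.5 * (|delta|/T0) * exp(-0.1065 * y^2),
# with |delta|/T0 = 1/(pi*y) <= 1/(pi*y_min) and y >= y_min.
gauss_per_delta = 22.5 / (math.pi * y_min) * math.exp(-0.1065 * y_min**2)
gauss_term = gauss_per_delta * delta_max

print(f"exp_term        = {exp_term:.4e}  (claimed <= 2.597e-23)")
print(f"gauss_per_delta = {gauss_per_delta:.4e}  (claimed <= 7.715e-34)")
print(f"gauss_term      = {gauss_term:.4e}  (claimed <= 9.258e-28)")
```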
Since $qT_0 \le 10^8$, this gives us that
$$\log\frac{qT_0}{2\pi}\cdot\left(3.53e^{-0.1598T_0} + 22.5\,\frac{\delta^2}{T_0}\,e^{-0.1065\frac{T_0^2}{(\pi\delta)^2}}\right) \le 4.3054\cdot 10^{-22} + \frac{1.54\cdot 10^{-26}}{q} \le 4.306\cdot 10^{-22}.$$
Again by $T_0 = 10^8/q$,
$$2.337\sqrt{T_0}\log qT_0 + 21.817\sqrt{T_0} + 2.85\log q + 74.38$$
is at most
$$\frac{648662}{\sqrt{q}} + 111,$$
and
$$3\log q + 14|\delta| + 17 \le 55 + \frac{1.7\cdot 10^7}{q},\qquad (\log q + 6)\cdot(1 + 5|\delta|) \le 19 + \frac{1.2\cdot 10^8}{q}.$$
Hence, assuming $x \ge 10^8$ to simplify, we see that Prop. 9.2.2 gives us that
$$\begin{aligned}
\mathrm{err}_{\eta,\chi}(\delta,x) &\le 4.306\cdot 10^{-22} + \frac{\frac{648662}{\sqrt{q}} + 111}{\sqrt{x}} + \frac{55 + \frac{1.7\cdot 10^7}{q}}{x} + \frac{19 + \frac{1.2\cdot 10^8}{q}}{x^{3/2}}\\
&\le 4.306\cdot 10^{-22} + \frac{1}{\sqrt{x}}\left(\frac{650400}{\sqrt{q}} + 112\right)
\end{aligned}$$
for $\eta(t) = e^{-t^2/2}$. This proves Theorem 7.1.1.

Let us now see what Platt's calculations give us when used as an input to Prop. 9.3.2 and Cor. 9.3.3. Again, we set $r = 3\cdot 10^5$, $\delta_0 = 8$, $|\delta| \le 4r/q$ and $T_0 = 10^8/q$, so (9.92) is still valid. We obtain
$$\begin{aligned}
&T_0\log\frac{qT_0}{2\pi}\cdot\left(6.11e^{-0.1598T_0} + 1.578e^{-0.1065\cdot\frac{T_0^2}{(\pi\delta)^2}}\right)\\
&\qquad\le \log\frac{10^8}{2\pi}\left(6.11\cdot\frac{1000}{3}e^{-0.1598\cdot\frac{1000}{3}} + 10^8\cdot 1.578e^{-0.1065\left(\frac{1000}{12\pi}\right)^2}\right) \le 2.485\cdot 10^{-19},
\end{aligned}$$
since $t\exp(-0.1598t)$ is decreasing on $t$ for $t \ge 1/0.1598$. We use the same bound when we have 0.0102 instead of 1.578 on the left side, as in (9.61). (The coefficient affects what is by far the smaller term, so we are wasting nothing.) Again by $T_0 = 10^8/q$ and $q \le r$,
$$\begin{aligned}
1.22\sqrt{T_0}\log qT_0 + 5.053\sqrt{T_0} + 1.423\log q + 37.19 &\le \frac{279793}{\sqrt{q}} + 55.2,\\
1.679\sqrt{T_0}\log qT_0 + 6.957\sqrt{T_0} + 1.958\log q + 51.17 &\le \frac{378854}{\sqrt{q}} + 75.9.
\end{aligned}$$
For $x \ge 10^8$, we use $|\delta| \le 4r/q \le 1.2\cdot 10^6/q$ to bound
$$\begin{aligned}
(3 + 11|\delta|)x^{-1} + (\log q + 6)\cdot(1 + 6|\delta|)\cdot x^{-3/2} &\le \left(0.0004 + \frac{1322}{q}\right)x^{-1/2},\\
(6 + 22|\delta|)x^{-1} + (\log q + 6)\cdot(3 + 17|\delta|)\cdot x^{-3/2} &\le \left(0.0007 + \frac{2644}{q}\right)x^{-1/2}.
\end{aligned}$$
Summing, we obtain
$$\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{281200}{\sqrt{q}} + 56\right)$$
for $\eta(t) = t^2e^{-t^2/2}$ and
$$\mathrm{err}_{\eta,\chi} \le 2.485\cdot 10^{-19} + \frac{1}{\sqrt{x}}\left(\frac{381500}{\sqrt{q}} + 76\right)$$
for $\eta(t) = t^2e^{-t^2/2} *_M \eta_2(t)$. This proves Theorem 7.1.2 and Corollary 7.1.3.

Now let us work with the smoothing weight $\eta_+$. This time around, set $r = 150000$ if $q$ is odd, and $r = 300000$ if $q$ is even. As before, we assume
$$q \le r,\qquad |\delta| \le 4r/q.$$
We can see that Platt's verification [Plab], mentioned before, allows us to take
$$T_0 = H + \frac{250r}{q},\qquad H = 200,$$
since $T_q$ is always at least this ($T_q = 10^8/q \ge 200 + 7\cdot 10^7/q > 200 + 3.75\cdot 10^7/q$ for $q \le 150000$ odd, $T_q \ge 200 + 7.5\cdot 10^7/q$ for $q \le 300000$ even).

Thus,
$$T_0 - H = \frac{250r}{q} \ge \frac{250r}{r} = 250,\qquad \frac{T_0 - H}{\pi\delta} \ge \frac{250r}{\pi\delta q} \ge \frac{250}{4\pi} = 19.89436$$
and also
$$T_0 \le 200 + 250\cdot 150000 \le 3.751\cdot 10^7,\qquad qT_0 \le rH + 250r \le 1.35\cdot 10^8.$$
Hence, since $\sqrt{t}e^{-0.1598t}$ is decreasing on $t$ for $t \ge 1/(2\cdot 0.1598)$,
$$\begin{aligned}
11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147|\delta|e^{-0.1065\frac{(T_0 - H)^2}{(\pi\delta)^2}}
&\le 7.9854\cdot 10^{-16} + \frac{4r}{q}\cdot 7.9814\cdot 10^{-18}\\
&\le 7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}.
\end{aligned}$$
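The three numerical values in the last display are evaluations at the extreme points $T_0 - H = 250$ and $(T_0 - H)/(\pi|\delta|) = 250/(4\pi)$, with $r = 300000$ in the factor $4r/q$. A floating-point sketch with hypothetical variable names:

```python
import math

T0_minus_H = 250.0                 # T0 - H = 250r/q >= 250
y_min = 250 / (4 * math.pi)        # lower bound for (T0 - H)/(pi*|delta|)
r_max = 300000                     # larger of the two choices of r

exp_term = 11.308 * math.sqrt(T0_minus_H) * math.exp(-0.1598 * T0_minus_H)
gauss_coeff = 16.147 * math.exp(-0.1065 * y_min**2)   # multiplies |delta|
gauss_times_q = 4 * r_max * gauss_coeff               # uses |delta| <= 4r/q

print(f"y_min         = {y_min:.5f}")
print(f"exp_term      = {exp_term:.5e}  (claimed <= 7.9854e-16)")
print(f"gauss_coeff   = {gauss_coeff:.5e}  (claimed <= 7.9814e-18)")
print(f"gauss_times_q = {gauss_times_q:.5e}  (claimed <= 9.5777e-12)")
```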
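Examining (9.70), we get
$$\begin{aligned}
\mathrm{err}_{\eta_+,\chi}(\delta,x) &\le \log\frac{1.35\cdot 10^8}{2\pi}\cdot\left(7.9854\cdot 10^{-16} + \frac{9.5777\cdot 10^{-12}}{q}\right)\\
&\quad + \left(\left(1.634\log(1.35\cdot 10^8) + 12.43\right)\frac{\sqrt{1.35\cdot 10^8}}{\sqrt{q}} + 1.321\log 300000 + 34.51\right)\frac{1}{\sqrt{x}}\\
&\quad + \left(9 + 11\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-1} + (\log 300000)\left(11 + 6\cdot\frac{1.2\cdot 10^6}{q}\right)x^{-3/2}\\
&\le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q}\\
&\quad + \left(\frac{499845}{\sqrt{q}} + 51.17 + \frac{1.32\cdot 10^7}{q\sqrt{x}} + \frac{9}{\sqrt{x}} + \frac{9.1\cdot 10^7}{qx} + \frac{139}{x}\right)\frac{1}{\sqrt{x}}.
\end{aligned}$$
Making the assumption $x \ge 10^{12}$, we obtain
$$\mathrm{err}_{\eta_+,\chi}(\delta,x) \le 1.3482\cdot 10^{-14} + \frac{1.617\cdot 10^{-10}}{q} + \left(\frac{499900}{\sqrt{q}} + 52\right)\frac{1}{\sqrt{x}}.$$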
This proves Theorem 7.1.4 for general $q$.

Let us optimize things a little more carefully for the trivial character $\chi_T$. Again, we will make the assumption $x \ge 10^{12}$. We will also assume, as we did before, that $|\delta| \le 4r/q$; this now gives us $|\delta| \le 600000$, since $q = 1$ and $r = 150000$ for $q$ odd. We will go up to a height $T_0 = H + 600000\pi\cdot t$, where $H = 200$ and $t \ge 10$. Then
$$\frac{T_0 - H}{\pi\delta} = \frac{600000\pi t}{4\pi r} \ge t.$$
Hence
$$11.308\sqrt{T_0 - H}\,e^{-0.1598(T_0 - H)} + 16.147|\delta|e^{-0.1065\frac{(T_0 - H)^2}{(\pi\delta)^2}} \le 10^{-1300000} + 9689000e^{-0.1065t^2}.$$
Looking at (9.70), we get
$$\begin{aligned}
\mathrm{err}_{\eta_+,\chi_T}(\delta,x) &\le \log\frac{T_0}{2\pi}\cdot\left(10^{-1300000} + 9689000e^{-0.1065t^2}\right)\\
&\quad + \left((1.634\log T_0 + 12.43)\sqrt{T_0} + 34.51\right)x^{-1/2} + 6600009x^{-1}.
\end{aligned}$$
The value $t = 20$ seems good enough; we choose it because it is not far from optimal for $x \sim 10^{27}$. We get that $T_0 = 12000000\pi + 200$; since $T_0 < 10^8$, we are within the range of the computations in [Plab] (or, for that matter, [Wed03] or [Plaa]). We obtain
$$\mathrm{err}_{\eta_+,\chi_T}(\delta,x) \le 4.772\cdot 10^{-11} + \frac{251400}{\sqrt{x}}.$$
Lastly, let us look at the sum estimated in (9.72). Here it will be enough to go up to just $T_0 = 2H + \max(50, H/4) = 450$, where, as before, $H = 200$. Of course, the verification of the zeros of the Riemann zeta function does go that far; as we already said, it goes until $10^8$ (or rather more; see [Wed03] and [Plaa]). We make, again, the assumption $x \ge 10^{12}$. We look at (9.73) and obtain that $\mathrm{err}_{\ell_2,\eta_+}$ is at most
\[ \begin{aligned} &\left(\left(0.462\frac{(\log 50)^2}{\log 10^{12}} + 0.909\log 50\right)\cdot 50 + 1.71\left(1 + \frac{\log 50}{\log 10^{12}}\right)\cdot 200\right)e^{-\frac{\pi}{4}\cdot 50} + \left(2.445\sqrt{450}\log 450 + 50.04\right)\cdot x^{-1/2} \\ &\qquad \le 5.123\cdot 10^{-15} + \frac{366.91}{\sqrt{x}}. \end{aligned} \tag{9.93} \]
It remains only to estimate the integral in (9.72). First of all,
\[ \int_0^\infty \eta_+^2(t)\log xt\,dt = \int_0^\infty \eta_\circ^2(t)\log xt\,dt + 2\int_0^\infty (\eta_+(t)-\eta_\circ(t))\eta_\circ(t)\log xt\,dt + \int_0^\infty (\eta_+(t)-\eta_\circ(t))^2\log xt\,dt. \]
The main term will be given by
\[ \int_0^\infty \eta_\circ^2(t)\log xt\,dt = \left(0.64020599736635 + O\left(10^{-14}\right)\right)\log x - 0.021094778698867 + O\left(10^{-15}\right), \]
where the integrals were computed rigorously using VNODE-LP [Ned06]. (The integral $\int_0^\infty \eta_\circ^2(t)\,dt$ can also be computed symbolically.) By Cauchy-Schwarz and the triangle inequality,
\[ \begin{aligned} \int_0^\infty (\eta_+(t)-\eta_\circ(t))\eta_\circ(t)\log xt\,dt &\le |\eta_+-\eta_\circ|_2\,|\eta_\circ(t)\log xt|_2 \le |\eta_+-\eta_\circ|_2\left(|\eta_\circ|_2\log x + |\eta_\circ\cdot\log|_2\right) \\ &\le \frac{274.86}{H^{7/2}}(0.80013\log x + 0.214) \le 1.944\cdot 10^{-6}\cdot\log x + 5.2\cdot 10^{-7}, \end{aligned} \]
where we are using (A.23) and evaluate $|\eta_\circ\cdot\log|_2$ rigorously as above. By (A.23) and (A.24),
\[ \int_0^\infty (\eta_+(t)-\eta_\circ(t))^2\log xt\,dt \le \left(\frac{274.86}{H^{7/2}}\right)^2\log x + \frac{27428}{H^7} \le 5.903\cdot 10^{-12}\cdot\log x + 2.143\cdot 10^{-12}. \]
We conclude that
\[ \int_0^\infty \eta_+^2(t)\log xt\,dt = \left(0.640206 + O^*(1.95\cdot 10^{-6})\right)\log x - 0.021095 + O^*(5.3\cdot 10^{-7}). \tag{9.94} \]
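The two leading constants can be reproduced, non-rigorously, by ordinary quadrature. The sketch below assumes that $\eta_\circ(t) = t^3(2-t)^3e^{-(t-1)^2/2}$ on $[0,2]$ (the function defined in (11.3)) and uses a plain composite Simpson rule in place of VNODE-LP; the helper names are ours. It relies on the identity $\int \eta_\circ^2(t)\log xt\,dt = \log x\int\eta_\circ^2(t)\,dt + \int\eta_\circ^2(t)\log t\,dt$.

```python
import math

def eta_circ(t):
    """eta_circ(t) = t^3 (2-t)^3 exp(-(t-1)^2/2) on [0, 2], zero elsewhere."""
    if t < 0.0 or t > 2.0:
        return 0.0
    return t**3 * (2.0 - t)**3 * math.exp(-(t - 1.0)**2 / 2.0)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sq(t):
    return eta_circ(t) ** 2

def sq_log(t):
    # The integrand tends to 0 as t -> 0+, so we define it as 0 at t = 0.
    return eta_circ(t) ** 2 * math.log(t) if t > 0.0 else 0.0

norm_sq = simpson(sq, 0.0, 2.0)       # coefficient of log x
log_term = simpson(sq_log, 0.0, 2.0)  # constant term
print(norm_sq, log_term)
```

Floating-point quadrature of this kind gives no guaranteed error bound, which is why the text relies on a validated solver instead.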
We add to this the error term $5.123\cdot 10^{-15} + 366.91/\sqrt{x}$ from (9.93), and simplify using the assumption $x \ge 10^{12}$. We obtain
\[ \sum_{n=1}^\infty \Lambda(n)(\log n)\,\eta_+^2(n/x) = 0.640206\,x\log x - 0.021095\,x + O^*\left(2\cdot 10^{-6}\,x\log x + 366.91\sqrt{x}\log x\right), \tag{9.95} \]
and so Prop. 9.5.1 gives us Proposition 7.1.5.
As we can see, the relatively large error term $2\cdot 10^{-6}$ comes from the fact that we have wanted to give the main term in (9.72) as an explicit constant, rather than as an integral. This is satisfactory; Prop. 7.1.5 is an auxiliary result that will be needed for one specific purpose in Part III, as opposed to Thms. 7.1.1–7.1.4, which, while crucial for Part III, are also of general applicability and interest.
Part III
The integral over the circle
Chapter 10
The integral over the major arcs
Let
\[ S_\eta(\alpha,x) = \sum_n \Lambda(n)e(\alpha n)\eta(n/x), \tag{10.1} \]
where $\alpha \in \mathbb{R}/\mathbb{Z}$, $\Lambda$ is the von Mangoldt function and $\eta:\mathbb{R}\to\mathbb{C}$ is of fast enough decay for the sum to converge.
Our ultimate goal is to bound from below
\[ \sum_{n_1+n_2+n_3=N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\,\eta_1(n_1/x)\eta_2(n_2/x)\eta_3(n_3/x), \tag{10.2} \]
where $\eta_1,\eta_2,\eta_3:\mathbb{R}\to\mathbb{C}$. Once we know that this is neither zero nor very close to zero, we will know that it is possible to write $N$ as the sum of three primes $n_1, n_2, n_3$ in at least one way; that is, we will have proven the ternary Goldbach conjecture.
As can be readily seen, (10.2) equals
\[ \int_{\mathbb{R}/\mathbb{Z}} S_{\eta_1}(\alpha,x)S_{\eta_2}(\alpha,x)S_{\eta_3}(\alpha,x)e(-N\alpha)\,d\alpha. \tag{10.3} \]
In the circle method, the set $\mathbb{R}/\mathbb{Z}$ gets partitioned into the set of major arcs $\mathfrak{M}$ and the set of minor arcs $\mathfrak{m}$; the contribution of each of the two sets to the integral (10.3) is evaluated separately.
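The equality of (10.2) and (10.3) is just orthogonality of the exponentials $e(\alpha n)$, and it can be watched at work on a toy example. In the Python sketch below, the triangular weight `eta`, the scale `x = 15` and the target `N = 41` are illustrative assumptions of ours, not choices made in the text. Since the integrand is a trigonometric polynomial, the integral over $\mathbb{R}/\mathbb{Z}$ is computed exactly (up to rounding) by averaging over sufficiently many equally spaced points.

```python
import cmath
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:              # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def eta(t):
    """Toy smoothing weight (an assumption for this demo): a triangle on [0, 2]."""
    return max(0.0, 1.0 - abs(t - 1.0))

x = 15.0
n_max = 30                          # eta(n/x) vanishes for n >= 2x = 30
w = [von_mangoldt(n) * eta(n / x) for n in range(n_max + 1)]
N = 41

# The weighted count (10.2), summed directly.
direct = sum(w[n1] * w[n2] * w[N - n1 - n2]
             for n1 in range(n_max + 1)
             for n2 in range(n_max + 1)
             if 0 <= N - n1 - n2 <= n_max)

def S(alpha):
    """The exponential sum (10.1) for the toy weight."""
    return sum(w[n] * cmath.exp(2j * math.pi * alpha * n) for n in range(n_max + 1))

# The integral (10.3): all frequencies n1+n2+n3-N lie strictly between -M and M,
# so averaging over alpha = k/M picks out exactly the frequency 0.
M = 3 * n_max + N + 1
integral = sum(S(k / M) ** 3 * cmath.exp(-2j * math.pi * N * k / M)
               for k in range(M)) / M
print(direct, integral.real)
```

Both numbers agree and are positive, reflecting representations such as $41 = 11 + 13 + 17$.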
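Our objective here is to treat the major arcs: we wish to estimate
\[ \int_{\mathfrak{M}} S_{\eta_1}(\alpha,x)S_{\eta_2}(\alpha,x)S_{\eta_3}(\alpha,x)e(-N\alpha)\,d\alpha \tag{10.4} \]
for $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$, where
\[ \mathfrak{M}_{\delta_0,r} = \bigcup_{\substack{q\le r\\ q\ \mathrm{odd}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q}-\frac{\delta_0 r}{2qx},\ \frac{a}{q}+\frac{\delta_0 r}{2qx}\right)\ \cup \bigcup_{\substack{q\le 2r\\ q\ \mathrm{even}}}\ \bigcup_{\substack{a \bmod q\\ (a,q)=1}} \left(\frac{a}{q}-\frac{\delta_0 r}{qx},\ \frac{a}{q}+\frac{\delta_0 r}{qx}\right) \tag{10.5} \]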
and $\delta_0 > 0$, $r \ge 1$ are given.
In other words, our major arcs will be few (that is, a constant number) and narrow. While [LW02] used relatively narrow major arcs as well, their number, as in all previous proofs of Vinogradov's result, was not bounded by a constant. (In his proof of the five-primes theorem, [Tao14] is able to take a single major arc around 0; this is not possible here.)
What we are about to see is the general major-arc setup. This is naturally the place where the overlap with the existing literature is largest. Two important differences can nevertheless be singled out.
• The most obvious one is the presence of smoothing. At this point, it improves and simplifies error terms, but it also means that we will later need estimates for exponential sums on major arcs, and not just at the middle of each major arc. (If there is smoothing, we cannot use summation by parts to reduce the problem of estimating sums to a problem of counting primes in arithmetic progressions, or weighted by characters.)
• Since our L-function estimates for exponential sums will give bounds that are better than the trivial one by only a constant (even if it is a rather large constant), we need to be especially careful when estimating error terms, finding cancellation when possible.
10.1 Decomposition of $S_\eta$ by characters
What follows is largely classical; cf. [HL22] or, say, [Dav67, §26]. The only difference from the literature lies in the treatment of $n$ non-coprime to $q$, and the way in which we show that our exponential sum (10.8) is equal to a linear combination of twisted sums $S_{\eta,\chi^*}$ over primitive characters $\chi^*$. (Non-primitive characters would give us L-functions with some zeroes inconveniently placed on the line $\Re(s) = 0$.)
Write $\tau(\chi,b)$ for the Gauss sum
\[ \tau(\chi,b) = \sum_{a \bmod q} \chi(a)e(ab/q) \tag{10.6} \]
associated to a $b \in \mathbb{Z}/q\mathbb{Z}$ and a Dirichlet character $\chi$ with modulus $q$. We let $\tau(\chi) = \tau(\chi,1)$. If $(b,q) = 1$, then $\tau(\chi,b) = \chi(b^{-1})\tau(\chi)$.
Recall that $\chi^*$ denotes the primitive character inducing a given Dirichlet character $\chi$. Writing $\sum_{\chi \bmod q}$ for a sum over all characters $\chi$ of $(\mathbb{Z}/q\mathbb{Z})^*$, we see that, for any $a_0 \in \mathbb{Z}/q\mathbb{Z}$,
\[ \begin{aligned} \frac{1}{\phi(q)}\sum_{\chi \bmod q} \tau(\overline{\chi},b)\chi^*(a_0) &= \frac{1}{\phi(q)}\sum_{\chi \bmod q}\ \sum_{\substack{a \bmod q\\ (a,q)=1}} \overline{\chi}(a)e(ab/q)\chi^*(a_0) \\ &= \sum_{\substack{a \bmod q\\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q}\chi^*(a^{-1}a_0) = \sum_{\substack{a \bmod q\\ (a,q)=1}} \frac{e(ab/q)}{\phi(q)}\sum_{\chi \bmod q'}\chi(a^{-1}a_0), \end{aligned} \tag{10.7} \]
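where $q' = q/\gcd(q,a_0^\infty)$. Now, $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = 0$ unless $a = a_0$ (in which case $\sum_{\chi \bmod q'}\chi(a^{-1}a_0) = \phi(q')$). Thus, (10.7) equals
\[ \begin{aligned} \frac{\phi(q')}{\phi(q)}\sum_{\substack{a \bmod q\\ (a,q)=1\\ a\equiv a_0 \bmod q'}} e(ab/q) &= \frac{\phi(q')}{\phi(q)}\sum_{\substack{k \bmod q/q'\\ (k,q/q')=1}} e\left(\frac{(a_0+kq')b}{q}\right) \\ &= \frac{\phi(q')}{\phi(q)}\,e\left(\frac{a_0b}{q}\right)\sum_{\substack{k \bmod q/q'\\ (k,q/q')=1}} e\left(\frac{kb}{q/q'}\right) = \frac{\phi(q')}{\phi(q)}\,e\left(\frac{a_0b}{q}\right)\mu(q/q'), \end{aligned} \]
provided that $(b,q) = 1$. (We are evaluating a Ramanujan sum in the last step.) Hence, for $\alpha = a/q + \delta/x$, $q \le x$, $(a,q) = 1$,
\[ \frac{1}{\phi(q)}\sum_\chi \tau(\overline{\chi},a)\sum_n \chi^*(n)\Lambda(n)e(\delta n/x)\eta(n/x) \]
equals
\[ \sum_n \frac{\mu((q,n^\infty))}{\phi((q,n^\infty))}\Lambda(n)e(\alpha n)\eta(n/x). \]
Since $(a,q) = 1$, $\tau(\overline{\chi},a) = \chi(a)\tau(\overline{\chi})$. The factor $\mu((q,n^\infty))/\phi((q,n^\infty))$ equals 1 when $(n,q) = 1$; the absolute value of the factor is at most 1 for every $n$. Clearly
\[ \sum_{\substack{n\\ (n,q)\ne 1}} \Lambda(n)\eta\left(\frac{n}{x}\right) = \sum_{p|q}\log p\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right). \]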
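Recalling the definition (10.1) of $S_\eta(\alpha,x)$, we conclude that
\[ S_\eta(\alpha,x) = \frac{1}{\phi(q)}\sum_{\chi \bmod q}\chi(a)\tau(\overline{\chi})\,S_{\eta,\chi^*}\left(\frac{\delta}{x},x\right) + O^*\left(2\sum_{p|q}\log p\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right)\right), \tag{10.8} \]
where
\[ S_{\eta,\chi}(\beta,x) = \sum_n \Lambda(n)\chi(n)e(\beta n)\eta(n/x). \tag{10.9} \]
Hence $S_{\eta_1}(\alpha,x)S_{\eta_2}(\alpha,x)S_{\eta_3}(\alpha,x)e(-N\alpha)$ equals
\[ \begin{aligned} \frac{1}{\phi(q)^3}\sum_{\chi_1}\sum_{\chi_2}\sum_{\chi_3} &\tau(\overline{\chi_1})\tau(\overline{\chi_2})\tau(\overline{\chi_3})\,\chi_1(a)\chi_2(a)\chi_3(a)\,e(-Na/q) \\ &\cdot S_{\eta_1,\chi_1^*}(\delta/x,x)\,S_{\eta_2,\chi_2^*}(\delta/x,x)\,S_{\eta_3,\chi_3^*}(\delta/x,x)\,e(-\delta N/x) \end{aligned} \tag{10.10} \]
plus an error term of absolute value at most
\[ 2\sum_{j=1}^3\ \prod_{j'\ne j}|S_{\eta_{j'}}(\alpha,x)|\ \sum_{p|q}\log p\sum_{\alpha\ge 1}\eta_j\left(\frac{p^\alpha}{x}\right). \tag{10.11} \]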
We will later see that the integral of (10.11) over $S^1$ is negligible: for our choices of $\eta_j$, it will, in fact, be of size $O(x(\log x)^A)$, $A$ a constant. The error term $O(x(\log x)^A)$ should be compared to the main term, which will be of size about a constant times $x^2$.
In (10.10), we have reduced our problems to estimating $S_{\eta,\chi}(\delta/x,x)$ for $\chi$ primitive; a more obvious way of reaching the same goal would have made (10.11) worse by a factor of about $\sqrt{q}$.
10.2 The integral over the major arcs: the main term
We are to estimate the integral (10.4), where the major arcs $\mathfrak{M}_{\delta_0,r}$ are defined as in (10.5). We will use $\eta_1 = \eta_2 = \eta_+$, $\eta_3(t) = \eta_*(\kappa t)$, where $\eta_+$ and $\eta_*$ will be set later.
We can write
\[ \begin{aligned} S_{\eta,\chi}(\delta/x,x) = S_\eta(\delta/x,x) &= \int_0^\infty \eta(t/x)e(\delta t/x)\,dt + O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x \\ &= \widehat{\eta}(-\delta)\cdot x + O^*(\mathrm{err}_{\eta,\chi_T}(\delta,x))\cdot x \end{aligned} \tag{10.12} \]
for $\chi = \chi_T$ the trivial character, and
\[ S_{\eta,\chi}(\delta/x,x) = O^*(\mathrm{err}_{\eta,\chi}(\delta,x))\cdot x \tag{10.13} \]
for $\chi$ primitive and non-trivial. The estimation of the error terms err will come later; let us focus on (a) obtaining the contribution of the main term, (b) using estimates on the error terms efficiently.
The main term: three principal characters. The main contribution will be given by the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$, where $\chi_0$ is the principal character mod $q$.
The sum $\tau(\chi_0,n)$ is a Ramanujan sum; as is well-known (see, e.g., [IK04, (3.2)]),
\[ \tau(\chi_0,n) = \sum_{d|(q,n)} \mu(q/d)\,d. \tag{10.14} \]
This simplifies to $\mu(q/(q,n))\,\phi((q,n))$ for $q$ square-free. The special case $n = 1$ gives us that $\tau(\chi_0) = \mu(q)$.
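The formula (10.14) and its square-free simplification are easy to test numerically for small moduli. The following Python sketch does so by brute force; the function names are ours, and trial division is used only because the moduli are tiny.

```python
import cmath
import math

def mobius(n):
    """Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi(n):
    """Euler's totient, by direct count (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if math.gcd(a, n) == 1)

def tau_principal(q, n):
    """tau(chi_0, n): the sum of e(an/q) over a mod q coprime to q."""
    return sum(cmath.exp(2j * math.pi * a * n / q)
               for a in range(1, q + 1) if math.gcd(a, q) == 1)

def divisor_formula(q, n):
    """Right-hand side of (10.14)."""
    g = math.gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

def squarefree_formula(q, n):
    """mu(q/(q,n)) * phi((q,n)); claimed to match (10.14) for q square-free."""
    g = math.gcd(q, n)
    return mobius(q // g) * phi(g)

print(tau_principal(30, 6).real, divisor_formula(30, 6), squarefree_formula(30, 6))
```

For $q = 30$, $n = 6$ all three values equal $\mu(5)\phi(6) = -2$.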
Thus, the term in (10.10) with $\chi_1 = \chi_2 = \chi_3 = \chi_0$ equals
\[ \frac{e(-Na/q)}{\phi(q)^3}\,\mu(q)^3\,S_{\eta_+,\chi_0^*}(\delta/x,x)^2\,S_{\eta_*,\chi_0^*}(\delta/x,x)\,e(-\delta N/x), \tag{10.15} \]
where, of course, $S_{\eta,\chi_0^*}(\alpha,x) = S_\eta(\alpha,x)$ (since $\chi_0^*$ is the trivial character). Summing (10.15) for $\alpha = a/q + \delta/x$ and $a$ going over all residues mod $q$ coprime to $q$, we obtain
\[ \frac{\mu\left(\frac{q}{(q,N)}\right)\phi((q,N))}{\phi(q)^3}\,\mu(q)^3\,S_{\eta_+,\chi_0^*}(\delta/x,x)^2\,S_{\eta_*,\chi_0^*}(\delta/x,x)\,e(-\delta N/x). \]
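The integral of (10.15) over all of $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ (see (10.5)) thus equals
\[ \begin{aligned} &\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} S^2_{\eta_+,\chi_0^*}(\alpha,x)\,S_{\eta_*,\chi_0^*}(\alpha,x)\,e(-\alpha N)\,d\alpha \\ &+ \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} S^2_{\eta_+,\chi_0^*}(\alpha,x)\,S_{\eta_*,\chi_0^*}(\alpha,x)\,e(-\alpha N)\,d\alpha. \end{aligned} \tag{10.16} \]
The main term in (10.16) is
\[ \begin{aligned} &x^3\cdot\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} (\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha \\ &+ x^3\cdot\sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} (\widehat{\eta_+}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha. \end{aligned} \tag{10.17} \]
We would like to complete both the sum and the integral. Before, we should say that we will want to be able to use smoothing functions $\eta_+$ whose Fourier transforms are not easy to deal with directly. All we want to require is that there be a smoothing function $\eta_\circ$, easier to deal with, such that $\eta_\circ$ be close to $\eta_+$ in $\ell_2$ norm.
Assume, then, that
\[ |\eta_+ - \eta_\circ|_2 \le \varepsilon_0|\eta_\circ|_2, \]
where $\eta_\circ$ is thrice differentiable outside finitely many points and satisfies $\eta_\circ^{(3)} \in L_1$. Then (10.17) equals
\[ \begin{aligned} &x^3\cdot\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha \\ &+ x^3\cdot\sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N))\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha \end{aligned} \tag{10.18} \]
plus
\[ O^*\left(x^2\cdot\sum_q \frac{\mu(q)^2}{\phi(q)^2}\int_{-\infty}^\infty \left|(\widehat{\eta_+}(-\alpha))^2 - (\widehat{\eta_\circ}(-\alpha))^2\right|\,|\widehat{\eta_*}(-\alpha)|\,d\alpha\right). \tag{10.19} \]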
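Here (10.19) is bounded by $2.82643x^2$ (by (C.9)) times
\[ \begin{aligned} &|\widehat{\eta_*}(-\alpha)|_\infty\cdot\sqrt{\int_{-\infty}^\infty |\widehat{\eta_+}(-\alpha)-\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha\cdot\int_{-\infty}^\infty |\widehat{\eta_+}(-\alpha)+\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha} \\ &\le |\eta_*|_1\cdot|\widehat{\eta_+}-\widehat{\eta_\circ}|_2\,|\widehat{\eta_+}+\widehat{\eta_\circ}|_2 = |\eta_*|_1\cdot|\eta_+-\eta_\circ|_2\,|\eta_++\eta_\circ|_2 \\ &\le |\eta_*|_1\cdot|\eta_+-\eta_\circ|_2\left(2|\eta_\circ|_2 + |\eta_+-\eta_\circ|_2\right) \le |\eta_*|_1\,|\eta_\circ|_2^2\cdot(2+\varepsilon_0)\varepsilon_0. \end{aligned} \]
Now, (10.18) equals
\[ \begin{aligned} &x^3\int_{-\infty}^\infty (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\sum_{\substack{\frac{q}{(q,2)}\le\min\left(\frac{\delta_0r}{2|\alpha|x},\,r\right)\\ \mu(q)^2=1}} \frac{\phi((q,N))}{\phi(q)^3}\mu((q,N))\,d\alpha \\ &= x^3\int_{-\infty}^\infty (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha\cdot\sum_{q\ge 1}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N)) \\ &\quad - x^3\int_{-\infty}^\infty (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\sum_{\substack{\frac{q}{(q,2)}>\min\left(\frac{\delta_0r}{2|\alpha|x},\,r\right)\\ \mu(q)^2=1}} \frac{\phi((q,N))}{\phi(q)^3}\mu((q,N))\,d\alpha. \end{aligned} \tag{10.20} \]
The last line in (10.20) is bounded¹ by
\[ x^2|\widehat{\eta_*}|_\infty\int_{-\infty}^\infty |\widehat{\eta_\circ}(-\alpha)|^2\sum_{\frac{q}{(q,2)}>\min\left(\frac{\delta_0r}{2|\alpha|},\,r\right)} \frac{\mu(q)^2}{\phi(q)^2}\,d\alpha. \tag{10.21} \]
By (2.1) (with $k = 3$), (C.16) and (C.17), this is at most
\[ \begin{aligned} &x^2|\eta_*|_1\int_{-\delta_0/2}^{\delta_0/2} |\widehat{\eta_\circ}(-\alpha)|^2\,\frac{4.31004}{r}\,d\alpha + 2x^2|\eta_*|_1\int_{\delta_0/2}^\infty \left(\frac{|\eta_\circ^{(3)}|_1}{(2\pi\alpha)^3}\right)^2\frac{8.62008|\alpha|}{\delta_0r}\,d\alpha \\ &\le |\eta_*|_1\left(4.31004|\eta_\circ|_2^2 + 0.00113\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}\right)\frac{x^2}{r}. \end{aligned} \]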
It is easy to see that
\[ \sum_{q\ge 1}\frac{\phi((q,N))}{\phi(q)^3}\mu(q)^2\mu((q,N)) = \prod_{p|N}\left(1-\frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right). \]
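The identity holds because the summand is multiplicative on square-free $q$: its value at a prime $p$ is $-1/(p-1)^2$ if $p|N$ and $1/(p-1)^3$ otherwise. The finite version of the identity, in which $q$ runs only over square-free products of a fixed finite set of primes, is therefore exact, and can be checked directly. The Python sketch below does this; the function name and the sample values of $N$ are ours.

```python
import math
from itertools import combinations

def local_sum_and_product(N, primes):
    """Sum over square-free q built from `primes` versus the matching Euler product."""
    total = 0.0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            shared = [p for p in subset if N % p == 0]    # primes dividing (q, N)
            phi_q = math.prod(p - 1 for p in subset)       # phi(q) for q square-free
            phi_g = math.prod(p - 1 for p in shared)       # phi((q, N))
            mu_g = (-1) ** len(shared)                     # mu((q, N))
            total += phi_g * mu_g / phi_q ** 3             # mu(q)^2 = 1 here
    product = 1.0
    for p in primes:
        if N % p == 0:
            product *= 1.0 - 1.0 / (p - 1) ** 2
        else:
            product *= 1.0 + 1.0 / (p - 1) ** 3
    return total, product

print(local_sum_and_product(105, [2, 3, 5, 7, 11, 13]))
```

Note that the factor at $p = 2$ is $2$ when $N$ is odd and $0$ when $N$ is even, as one expects for sums of three primes.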
¹ This is obviously crude, in that we are bounding $\phi((q,N))/\phi(q)$ by 1. We are doing so in order to avoid a potentially harmful dependence on $N$.
Expanding the integral implicit in the definition of $\widehat{f}$,
\[ \int_{-\infty}^\infty (\widehat{\eta_\circ}(-\alpha x))^2\,\widehat{\eta_*}(-\alpha x)\,e(-\alpha N)\,d\alpha = \frac{1}{x}\int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2. \tag{10.22} \]
(This is standard. One rigorous way to obtain (10.22) is to approximate the integral over $\alpha \in (-\infty,\infty)$ by an integral with a smooth weight, at different scales; as the scale becomes broader, the Fourier transform of the weight approximates (as a distribution) the $\delta$ function. Apply Plancherel.)
Hence, (10.17) equals
\[ x^2\cdot\int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2\cdot\prod_{p|N}\left(1-\frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right) \tag{10.23} \]
(the main term) plus
\[ \left(2.82643|\eta_\circ|_2^2(2+\varepsilon_0)\cdot\varepsilon_0 + \frac{4.31004|\eta_\circ|_2^2 + 0.00113\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1\,x^2. \tag{10.24} \]
Here (10.23) is just as in the classical case [IK04, (19.10)], except for the fact that a factor of 1/2 has been replaced by a double integral. Later, in chapter 11, we will see how to choose our smoothing functions (and $x$, in terms of $N$) so as to make the double integral as large as possible in comparison with the error terms. This is an important optimization. (We already had a first discussion of this in the introduction; see (1.39) and what follows.)
What remains to estimate is the contribution of all the terms of the form $\mathrm{err}_{\eta,\chi}(\delta,x)$ in (10.12) and (10.13). Let us first deal with another matter: bounding the $\ell_2$ norm of $|S_\eta(\alpha,x)|^2$ over the major arcs.
10.3 The $\ell_2$ norm over the major arcs
We can always bound the integral of $|S_\eta(\alpha,x)|^2$ on the whole circle by Plancherel. If we only want the integral on certain arcs, we use the bound in Prop. 12.1.2 (based on work by Ramaré). If these arcs are really the major arcs (that is, the arcs on which we have useful analytic estimates), then we can hope to get better bounds using L-functions. This will be useful both to estimate the error terms in this section and to make the use of Ramaré's bounds more efficient later.
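By (10.8),
\[ \begin{aligned} \sum_{\substack{a \bmod q\\ \gcd(a,q)=1}} \left|S_\eta\left(\frac{a}{q}+\frac{\delta}{x},x\right)\right|^2 &= \frac{1}{\phi(q)^2}\sum_\chi\sum_{\chi'} \tau(\overline{\chi})\,\overline{\tau(\overline{\chi'})}\left(\sum_{\substack{a \bmod q\\ \gcd(a,q)=1}} \chi(a)\overline{\chi'(a)}\right)\cdot S_{\eta,\chi^*}(\delta/x,x)\,\overline{S_{\eta,\chi'^*}(\delta/x,x)} \\ &\quad + O^*\left(2(1+\sqrt{q})(\log x)^2|\eta|_\infty\max_\alpha|S_\eta(\alpha,x)| + \left((1+\sqrt{q})(\log x)^2|\eta|_\infty\right)^2\right) \\ &= \frac{1}{\phi(q)}\sum_\chi |\tau(\overline{\chi})|^2\,|S_{\eta,\chi^*}(\delta/x,x)|^2 + K_{q,1}\left(2|S_\eta(0,x)| + K_{q,1}\right), \end{aligned} \]
where
\[ K_{q,1} = (1+\sqrt{q})(\log x)^2|\eta|_\infty. \]
As is well-known (see, e.g., [IK04, Lem. 3.1]),
\[ \tau(\chi) = \mu\left(\frac{q}{q^*}\right)\chi^*\left(\frac{q}{q^*}\right)\tau(\chi^*), \]
where $q^*$ is the modulus of $\chi^*$ (i.e., the conductor of $\chi$), and
\[ |\tau(\chi^*)| = \sqrt{q^*}. \]
Using the expressions (10.12) and (10.13), we obtain
\[ \begin{aligned} \sum_{\substack{a \bmod q\\ (a,q)=1}} \left|S_\eta\left(\frac{a}{q}+\frac{\delta}{x},x\right)\right|^2 &= \frac{\mu^2(q)}{\phi(q)}\left|\widehat{\eta}(-\delta)x + O^*\left(\mathrm{err}_{\eta,\chi_T}(\delta,x)\cdot x\right)\right|^2 \\ &\quad + \frac{1}{\phi(q)}\sum_{\chi\ne\chi_T} \mu^2\left(\frac{q}{q^*}\right)q^*\cdot O^*\left(|\mathrm{err}_{\eta,\chi}(\delta,x)|^2x^2\right) + K_{q,1}\left(2|S_\eta(0,x)| + K_{q,1}\right) \\ &= \frac{\mu^2(q)x^2}{\phi(q)}\left(|\widehat{\eta}(-\delta)|^2 + O^*\left(\left|\mathrm{err}_{\eta,\chi_T}(\delta,x)\left(2|\eta|_1 + \mathrm{err}_{\eta,\chi_T}(\delta,x)\right)\right|\right)\right) \\ &\quad + O^*\left(\max_{\chi\ne\chi_T} q^*|\mathrm{err}_{\eta,\chi^*}(\delta,x)|^2x^2 + K_{q,2}\,x\right), \end{aligned} \]
where $K_{q,2} = K_{q,1}\left(2|S_\eta(0,x)|/x + K_{q,1}/x\right)$.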
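Thus, the integral of $|S_\eta(\alpha,x)|^2$ over $\mathfrak{M}$ (see (10.5)) is
\[ \begin{aligned} &\sum_{\substack{q\le r\\ q\ \mathrm{odd}}}\ \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{\frac{a}{q}-\frac{\delta_0r}{2qx}}^{\frac{a}{q}+\frac{\delta_0r}{2qx}} |S_\eta(\alpha,x)|^2\,d\alpha + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}}\ \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{\frac{a}{q}-\frac{\delta_0r}{qx}}^{\frac{a}{q}+\frac{\delta_0r}{qx}} |S_\eta(\alpha,x)|^2\,d\alpha \\ &= \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)x^2}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)x^2}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha \\ &\quad + O^*\left(\sum_q \frac{\mu^2(q)x^2}{\phi(q)}\cdot\frac{\gcd(q,2)\delta_0r}{qx}\left(ET_{\eta,\frac{\delta_0r}{2}}\left(2|\eta|_1 + ET_{\eta,\frac{\delta_0r}{2}}\right)\right)\right) \\ &\quad + \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\delta_0rx}{q}\cdot O^*\left(\max_{\substack{\chi \bmod q\\ \chi\ne\chi_T\\ |\delta|\le\delta_0r/2q}} q^*|\mathrm{err}_{\eta,\chi^*}(\delta,x)|^2 + \frac{K_{q,2}}{x}\right) \\ &\quad + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{2\delta_0rx}{q}\cdot O^*\left(\max_{\substack{\chi \bmod q\\ \chi\ne\chi_T\\ |\delta|\le\delta_0r/q}} q^*|\mathrm{err}_{\eta,\chi^*}(\delta,x)|^2 + \frac{K_{q,2}}{x}\right), \end{aligned} \tag{10.25} \]
where
\[ ET_{\eta,s} = \max_{|\delta|\le s} |\mathrm{err}_{\eta,\chi_T}(\delta,x)| \]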
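and $\chi_T$ is the trivial character. If all we want is an upper bound, we can simply remark that
\[ \begin{aligned} &x\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha + x\sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha \\ &\le \left(\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)} + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)}{\phi(q)}\right)|\eta|_2^2 = 2|\eta|_2^2\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}. \end{aligned} \]
If we also need a lower bound, we proceed as follows.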
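Again, we will work with an approximation $\eta_\circ$ such that (a) $|\eta - \eta_\circ|_2$ is small, (b) $\eta_\circ$ is thrice differentiable outside finitely many points, (c) $\eta_\circ^{(3)} \in L_1$. Clearly,
\[ \begin{aligned} x\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha &\le \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\left(\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha + 2\langle|\widehat{\eta_\circ}|,|\widehat{\eta}-\widehat{\eta_\circ}|\rangle + |\widehat{\eta}-\widehat{\eta_\circ}|_2^2\right) \\ &= \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha + O^*\left(\frac{1}{2}\log r + 0.85\right)\left(2|\eta_\circ|_2\,|\eta-\eta_\circ|_2 + |\eta_\circ-\eta|_2^2\right), \end{aligned} \]
where we are using (C.11) and isometry. Also,
\[ \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha = \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} |\widehat{\eta}(-\alpha x)|^2\,d\alpha. \]
By (2.1) and Plancherel,
\[ \begin{aligned} \int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha &= \int_{-\infty}^\infty |\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha - O^*\left(2\int_{\frac{\delta_0r}{2q}}^\infty \frac{|\eta_\circ^{(3)}|_1^2}{(2\pi\alpha)^6}\,d\alpha\right) \\ &= |\eta_\circ|_2^2 + O^*\left(\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5}\right). \end{aligned} \]
Hence
\[ \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\int_{-\frac{\delta_0r}{2q}}^{\frac{\delta_0r}{2q}} |\widehat{\eta_\circ}(-\alpha)|^2\,d\alpha = |\eta_\circ|_2^2\cdot\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)} + O^*\left(\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\,\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5}\right). \]
Using (C.18), we get that
\[ \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}\,\frac{|\eta_\circ^{(3)}|_1^2\,q^5}{5\pi^6(\delta_0r)^5} \le \frac{1}{r}\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)\,q}{\phi(q)}\cdot\frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5} \le \frac{|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\cdot\left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right). \]
Going back to (10.25), we use (C.7) to bound
\[ \sum_q \frac{\mu^2(q)x^2}{\phi(q)}\,\frac{\gcd(q,2)\delta_0r}{qx} \le 2.59147\cdot\delta_0rx. \]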
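We also note that
\[ \sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{1}{q} + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{2}{q} = \sum_{q\le r}\frac{1}{q} - \sum_{q\le \frac{r}{2}}\frac{1}{2q} + \sum_{q\le r}\frac{1}{q} \le 2\log er - \log\frac{r}{2} \le \log 2e^2r. \]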
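We have proven the following result.
Lemma 10.3.1. Let $\eta:[0,\infty)\to\mathbb{R}$ be in $L_1\cap L_\infty$. Let $S_\eta(\alpha,x)$ be as in (10.1) and let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5). Let $\eta_\circ:[0,\infty)\to\mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$. Assume $r \ge 182$. Then
\[ \int_{\mathfrak{M}} |S_\eta(\alpha,x)|^2\,d\alpha = L_{r,\delta_0}\,x + O^*\left(5.19\,\delta_0xr\left(ET_{\eta,\frac{\delta_0r}{2}}\cdot\left(|\eta|_1 + \frac{ET_{\eta,\frac{\delta_0r}{2}}}{2}\right)\right)\right) + O^*\left(\delta_0r(\log 2e^2r)\left(x\cdot E^2_{\eta,r,\delta_0} + K_{r,2}\right)\right), \tag{10.26} \]
where
\[ \begin{aligned} E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta,\chi^*}(\delta,x)|, \qquad ET_{\eta,s} = \max_{|\delta|\le s} |\mathrm{err}_{\eta,\chi_T}(\delta,x)|, \\ K_{r,2} &= (1+\sqrt{2r})(\log x)^2|\eta|_\infty\left(2|S_\eta(0,x)|/x + (1+\sqrt{2r})(\log x)^2|\eta|_\infty/x\right), \end{aligned} \tag{10.27} \]
and $L_{r,\delta_0}$ satisfies both
\[ L_{r,\delta_0} \le 2|\eta|_2^2\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)} \tag{10.28} \]
and
\[ \begin{aligned} L_{r,\delta_0} &= 2|\eta_\circ|_2^2\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)} + O^*(\log r + 1.7)\cdot\left(2|\eta_\circ|_2\,|\eta-\eta_\circ|_2 + |\eta_\circ-\eta|_2^2\right) \\ &\quad + O^*\left(\frac{2|\eta_\circ^{(3)}|_1^2}{5\pi^6\delta_0^5}\right)\cdot\left(0.64787 + \frac{\log r}{4r} + \frac{0.425}{r}\right). \end{aligned} \tag{10.29} \]
Here, as elsewhere, $\chi^*$ denotes the primitive character inducing $\chi$, whereas $q^*$ denotes the modulus of $\chi^*$.
The error term $xr\,ET_{\eta,\delta_0r}$ will be very small, since it will be estimated using the Riemann zeta function; the error term involving $K_{r,2}$ will be completely negligible. As for the term involving $xr(r+1)E^2_{\eta,r,\delta_0}$, we see that it constrains us to have $|\mathrm{err}_{\eta,\chi}(x,N)|$ less than a constant times $1/r$ if we do not want the main term in the bound (10.26) to be overwhelmed.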
10.4 The integral over the major arcs: conclusion
There are at least two ways we can evaluate (10.4). One is to substitute (10.10) into (10.4). The disadvantages here are that (a) this can give rise to pages-long formulae, (b) this gives error terms proportional to $xr|\mathrm{err}_{\eta,\chi}(x,N)|$, meaning that, to win, we would have to show that $|\mathrm{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/r$. What we will do instead is to use our $\ell_2$ estimate (10.26) in order to bound the contribution of non-principal terms. This will give us a gain of almost $\sqrt{r}$ on the error terms; in other words, to win, it will be enough to show later that $|\mathrm{err}_{\eta,\chi}(x,N)|$ is much smaller than $1/\sqrt{r}$.
The contribution of the error terms in $S_{\eta_3}(\alpha,x)$ (that is, all terms involving the quantities $\mathrm{err}_{\eta,\chi}$ in expressions (10.12) and (10.13)) to (10.4) is
\[ \begin{aligned} &\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{1}{\phi(q)}\sum_{\chi_3 \bmod q} \tau(\overline{\chi_3})\sum_{\substack{a \bmod q\\ (a,q)=1}} \chi_3(a)e(-Na/q)\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} S_{\eta_+}(\alpha+a/q,x)^2\,\mathrm{err}_{\eta_*,\chi_3^*}(\alpha x,x)\,e(-N\alpha)\,d\alpha \\ &+ \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{1}{\phi(q)}\sum_{\chi_3 \bmod q} \tau(\overline{\chi_3})\sum_{\substack{a \bmod q\\ (a,q)=1}} \chi_3(a)e(-Na/q)\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} S_{\eta_+}(\alpha+a/q,x)^2\,\mathrm{err}_{\eta_*,\chi_3^*}(\alpha x,x)\,e(-N\alpha)\,d\alpha. \end{aligned} \tag{10.30} \]
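We should also remember the terms in (10.11); we can integrate them over all of $\mathbb{R}/\mathbb{Z}$, and obtain that they contribute at most
\[ \begin{aligned} &\int_{\mathbb{R}/\mathbb{Z}} 2\sum_{j=1}^3\ \prod_{j'\ne j}|S_{\eta_{j'}}(\alpha,x)|\cdot\max_{q\le r}\sum_{p|q}\log p\sum_{\alpha\ge 1}\eta_j\left(\frac{p^\alpha}{x}\right)d\alpha \\ &\le 2\sum_{j=1}^3\ \prod_{j'\ne j}|S_{\eta_{j'}}(\alpha,x)|_2\cdot\max_{q\le r}\sum_{p|q}\log p\sum_{\alpha\ge 1}\eta_j\left(\frac{p^\alpha}{x}\right) \\ &= 2\sum_n \Lambda^2(n)\eta_+^2(n/x)\cdot\log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta_*\left(\frac{p^\alpha}{x}\right) \\ &\quad + 4\sqrt{\sum_n \Lambda^2(n)\eta_+^2(n/x)\cdot\sum_n \Lambda^2(n)\eta_*^2(n/x)}\cdot\log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta_+\left(\frac{p^\alpha}{x}\right) \end{aligned} \]
by Cauchy-Schwarz and Plancherel.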
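The absolute value of (10.30) is at most
\[ \begin{aligned} &\sum_{\substack{q\le r\\ q\ \mathrm{odd}}}\ \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} \left|S_{\eta_+}(\alpha+a/q,x)\right|^2 d\alpha\cdot\max_{\substack{\chi \bmod q\\ |\delta|\le\delta_0r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)| \\ &+ \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}}\ \sum_{\substack{a \bmod q\\ (a,q)=1}} \int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} \left|S_{\eta_+}(\alpha+a/q,x)\right|^2 d\alpha\cdot\max_{\substack{\chi \bmod q\\ |\delta|\le\delta_0r/q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)| \\ &\le \int_{\mathfrak{M}_{\delta_0,r}} \left|S_{\eta_+}(\alpha)\right|^2 d\alpha\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/2q}} \sqrt{q^*}\,|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)|. \end{aligned} \tag{10.31} \]
We can bound the integral of $|S_{\eta_+}(\alpha)|^2$ by (10.26).
What about the contribution of the error part of $S_{\eta_2}(\alpha,x)$? We can obviously proceed in the same way, except that, to avoid double-counting, $S_{\eta_3}(\alpha,x)$ needs to be replaced by
\[ \frac{1}{\phi(q)}\tau(\chi_0)\,\widehat{\eta_3}(-\delta)\cdot x = \frac{\mu(q)}{\phi(q)}\,\widehat{\eta_3}(-\delta)\cdot x, \tag{10.32} \]
which is its main term (coming from (10.12)). Instead of having an $\ell_2$ norm as in (10.31), we have the square-root of a product of two squares of $\ell_2$ norms (by Cauchy-Schwarz), namely, $\int_{\mathfrak{M}} |S^*_{\eta_+}(\alpha)|^2\,d\alpha$ and
\[ \begin{aligned} &\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)^2}\int_{-\frac{\delta_0r}{2qx}}^{\frac{\delta_0r}{2qx}} |\widehat{\eta_*}(-\alpha x)\,x|^2\,d\alpha + \sum_{\substack{q\le 2r\\ q\ \mathrm{even}}} \frac{\mu^2(q)}{\phi(q)^2}\int_{-\frac{\delta_0r}{qx}}^{\frac{\delta_0r}{qx}} |\widehat{\eta_*}(-\alpha x)\,x|^2\,d\alpha \\ &\le x|\eta_*|_2^2\cdot\sum_q \frac{\mu^2(q)}{\phi(q)^2}. \end{aligned} \tag{10.33} \]
By (C.9), the sum over $q$ is at most 2.82643.
As for the contribution of the error part of $S_{\eta_1}(\alpha,x)$, we bound it in the same way, using solely the $\ell_2$ norm in (10.33) (and replacing both $S_{\eta_2}(\alpha,x)$ and $S_{\eta_3}(\alpha,x)$ by expressions as in (10.32)).
The total of the error terms is thus
\[ \begin{aligned} &x\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/2q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta_*,\chi^*}(\delta,x)|\cdot A \\ &+ x\cdot\max_{\substack{\chi \bmod q\\ q\le r\cdot\gcd(q,2)\\ |\delta|\le\gcd(q,2)\delta_0r/2q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta_+,\chi^*}(\delta,x)|\left(\sqrt{A}+\sqrt{B_+}\right)\sqrt{B_*}, \end{aligned} \tag{10.34} \]
where $A = (1/x)\int_{\mathfrak{M}} |S_{\eta_+}(\alpha,x)|^2\,d\alpha$ (bounded as in (10.26)) and
\[ B_* = 2.82643|\eta_*|_2^2, \qquad B_+ = 2.82643|\eta_+|_2^2. \tag{10.35} \]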
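In conclusion, we have proven:
Proposition 10.4.1. Let $x \ge 1$. Let $\eta_+, \eta_*:[0,\infty)\to\mathbb{R}$. Assume $\eta_+ \in C^2$, $\eta_+'' \in L_2$ and $\eta_+, \eta_* \in L_1\cap L_2$. Let $\eta_\circ:[0,\infty)\to\mathbb{R}$ be thrice differentiable outside finitely many points. Assume $\eta_\circ^{(3)} \in L_1$ and $|\eta_+-\eta_\circ|_2 \le \varepsilon_0|\eta_\circ|_2$, where $\varepsilon_0 \ge 0$.
Let $S_\eta(\alpha,x) = \sum_n \Lambda(n)e(\alpha n)\eta(n/x)$. Let $\mathrm{err}_{\eta,\chi}$, $\chi$ primitive, be given as in (10.12) and (10.13). Let $\delta_0 > 0$, $r \ge 1$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,r}$ be as in (10.5).
Then, for any $N \ge 0$,
\[ \int_{\mathfrak{M}} S_{\eta_+}(\alpha,x)^2\,S_{\eta_*}(\alpha,x)\,e(-N\alpha)\,d\alpha \]
equals
\[ \begin{aligned} &C_0C_{\eta_\circ,\eta_*}x^2 + \left(2.82643|\eta_\circ|_2^2(2+\varepsilon_0)\cdot\varepsilon_0 + \frac{4.31004|\eta_\circ|_2^2 + 0.0012\frac{|\eta_\circ^{(3)}|_1^2}{\delta_0^5}}{r}\right)|\eta_*|_1\,x^2 \\ &+ O^*\left(E_{\eta_*,r,\delta_0}A_{\eta_+} + E_{\eta_+,r,\delta_0}\cdot 1.6812\left(\sqrt{A_{\eta_+}} + 1.6812|\eta_+|_2\right)|\eta_*|_2\right)\cdot x^2 \\ &+ O^*\left(2Z_{\eta_+^2,2}(x)\,LS_{\eta_*}(x,r)\cdot x + 4\sqrt{Z_{\eta_+^2,2}(x)\,Z_{\eta_*^2,2}(x)}\,LS_{\eta_+}(x,r)\cdot x\right), \end{aligned} \tag{10.36} \]
where
\[ \begin{aligned} C_0 &= \prod_{p|N}\left(1-\frac{1}{(p-1)^2}\right)\cdot\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right), \\ C_{\eta_\circ,\eta_*} &= \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2, \end{aligned} \tag{10.37} \]
\[ \begin{aligned} E_{\eta,r,\delta_0} &= \max_{\substack{\chi \bmod q\\ q\le\gcd(q,2)\cdot r\\ |\delta|\le\gcd(q,2)\delta_0r/2q}} \sqrt{q^*}\cdot|\mathrm{err}_{\eta,\chi^*}(\delta,x)|, \qquad ET_{\eta,s} = \max_{|\delta|\le s/q} |\mathrm{err}_{\eta,\chi_T}(\delta,x)|, \\ A_\eta &= \frac{1}{x}\int_{\mathfrak{M}} \left|S_{\eta_+}(\alpha,x)\right|^2 d\alpha, \qquad L_{\eta,r,\delta_0} \le 2|\eta|_2^2\sum_{\substack{q\le r\\ q\ \mathrm{odd}}} \frac{\mu^2(q)}{\phi(q)}, \\ K_{r,2} &= (1+\sqrt{2r})(\log x)^2|\eta|_\infty\left(2Z_{\eta,1}(x)/x + (1+\sqrt{2r})(\log x)^2|\eta|_\infty/x\right), \\ Z_{\eta,k}(x) &= \frac{1}{x}\sum_n \Lambda^k(n)\eta(n/x), \qquad LS_\eta(x,r) = \log r\cdot\max_{p\le r}\sum_{\alpha\ge 1}\eta\left(\frac{p^\alpha}{x}\right), \end{aligned} \tag{10.38} \]
and $\mathrm{err}_{\eta,\chi}$ is as in (10.12) and (10.13).
Here is how to read these expressions. The error term in the first line of (10.36) will be small provided that $\varepsilon_0$ is small and $r$ is large. The third line of (10.36) will be negligible, as will be the term $2\delta_0r(\log er)K_{r,2}$ in the definition of $A_\eta$. (Clearly, $Z_{\eta,k}(x) \ll_\eta (\log x)^{k-1}$ and $LS_\eta(x,q) \ll_\eta \tau(q)\log x$ for any $\eta$ of rapid decay.)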
It remains to estimate the second line of (10.36). This includes estimating $A_\eta$, a task that was already accomplished in Lemma 10.3.1. We see that we will have to give very good bounds for $E_{\eta,r,\delta_0}$ when $\eta = \eta_+$ or $\eta = \eta_*$. We also see that we want to make $C_0C_{\eta_+,\eta_*}x^2$ as large as possible; it will be competing not just with the error terms here, but, more importantly, with the bounds from the minor arcs, which will be proportional to $|\eta_+|_2^2|\eta_*|_1$.
Chapter 11
Optimizing and adapting smoothing functions
One of our goals is to maximize the quantity $C_{\eta_\circ,\eta_*}$ in (10.37) relative to $|\eta_\circ|_2^2|\eta_*|_1$. One way to do this is to ensure that (a) $\eta_*$ is concentrated on a very short¹ interval $[0,\epsilon)$, (b) $\eta_\circ$ is supported on the interval $[0,2]$, and is symmetric around $t = 1$, meaning that $\eta_\circ(t) \sim \eta_\circ(2-t)$. Then, for $x \sim N/2$, the integral
\[ \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2 \]
in (10.37) should be approximately equal to
\[ |\eta_*|_1\cdot\int_0^\infty \eta_\circ(t)\,\eta_\circ\left(\frac{N}{x}-t\right)dt = |\eta_*|_1\cdot\int_0^\infty \eta_\circ(t)^2\,dt = |\eta_*|_1\cdot|\eta_\circ|_2^2, \tag{11.1} \]
provided that $\eta_\circ(t) \ge 0$ for all $t$. It is easy to check (using Cauchy-Schwarz in the second step) that this is essentially optimal. (We will redo this rigorously in a little while.)
At the same time, the fact is that major-arc estimates are best for smoothing functions $\eta$ of a particular form, and we have minor-arc estimates from Part I for a different specific smoothing $\eta_2$. The issue, then, is how do we choose $\eta_\circ$ and $\eta_*$ as above so that
• $\eta_*$ is concentrated on $[0,\epsilon)$,
• $\eta_\circ$ is supported on $[0,2]$ and symmetric around $t = 1$,
• we can give minor-arc and major-arc estimates for $\eta_*$,
• we can give major-arc estimates for a function $\eta_+$ close to $\eta_\circ$ in $\ell_2$ norm?
¹ This is an idea appearing in work by Bourgain in a related context [Bou99].
11.1 The symmetric smoothing function $\eta_\circ$
We will later work with a smoothing function $\eta_\heartsuit$ whose Mellin transform decreases very rapidly. Because of this rapid decay, we will be able to give strong results based on an explicit formula for $\eta_\heartsuit$. The issue is how to define $\eta_\circ$, given $\eta_\heartsuit$, so that $\eta_\circ$ is symmetric around $t = 1$ (i.e., $\eta_\circ(2-x) \sim \eta_\circ(x)$) and is very small for $x > 2$.
We will later set $\eta_\heartsuit(t) = e^{-t^2/2}$. Let
\[ h: t \mapsto \begin{cases} t^3(2-t)^3e^{t-1/2} & \text{if } t\in[0,2],\\ 0 & \text{otherwise.} \end{cases} \tag{11.2} \]
We define $\eta_\circ:\mathbb{R}\to\mathbb{R}$ by
\[ \eta_\circ(t) = h(t)\eta_\heartsuit(t) = \begin{cases} t^3(2-t)^3e^{-(t-1)^2/2} & \text{if } t\in[0,2],\\ 0 & \text{otherwise.} \end{cases} \tag{11.3} \]
It is clear that $\eta_\circ$ is symmetric around $t = 1$ for $t \in [0,2]$.
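Both claims in (11.3), namely that the product $h(t)e^{-t^2/2}$ collapses to the stated closed form (because $t - 1/2 - t^2/2 = -(t-1)^2/2$) and that the result is symmetric about $t = 1$, can be spot-checked numerically. The Python sketch below samples a grid of points; the grid and the function names are our own choices.

```python
import math

def eta_heart(t):
    """The Gaussian exp(-t^2/2)."""
    return math.exp(-t * t / 2.0)

def h(t):
    """The cutoff factor of (11.2)."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t)**3 * math.exp(t - 0.5)
    return 0.0

def eta_circ(t):
    """The closed form on the right-hand side of (11.3)."""
    if 0.0 <= t <= 2.0:
        return t**3 * (2.0 - t)**3 * math.exp(-(t - 1.0)**2 / 2.0)
    return 0.0

samples = [i / 100.0 for i in range(-50, 251)]   # grid on [-0.5, 2.5]
max_gap = max(abs(h(t) * eta_heart(t) - eta_circ(t)) for t in samples)
max_asym = max(abs(eta_circ(t) - eta_circ(2.0 - t)) for t in samples)
print(max_gap, max_asym)
```

Both maxima are at the level of rounding error.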
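11.1.1 The product $\eta_\circ(t)\eta_\circ(\rho-t)$
We now should go back and redo rigorously what we discussed informally around (11.1). More precisely, we wish to estimate
\[ (\eta_\circ * \eta_\circ)(\rho) = \int_{-\infty}^\infty \eta_\circ(t)\eta_\circ(\rho-t)\,dt = \int_{-\infty}^\infty \eta_\circ(t)\eta_\circ(2-\rho+t)\,dt \tag{11.4} \]
for $\rho \le 2$ close to 2. In this, it will be useful that the Cauchy-Schwarz inequality degrades slowly, in the following sense.
Lemma 11.1.1. Let $V$ be a real vector space with an inner product $\langle\cdot,\cdot\rangle$. Then, for any $v, w \in V$ with $|w-v|_2 \le |v|_2/2$,
\[ \langle v,w\rangle = |v|_2|w|_2 + O^*\left(2.71|v-w|_2^2\right). \]
Proof. By a truncated Taylor expansion,
\[ \sqrt{1+x} = 1 + \frac{x}{2} + \frac{x^2}{2}\max_{0\le t\le 1}\frac{1}{4(1-(tx)^2)^{3/2}} = 1 + \frac{x}{2} + O^*\left(\frac{x^2}{2^{3/2}}\right) \]
for $|x| \le 1/2$. Hence, for $\delta = |w-v|_2/|v|_2$,
\[ \begin{aligned} \frac{|w|_2}{|v|_2} &= \sqrt{1 + \frac{2\langle w-v,v\rangle + |w-v|_2^2}{|v|_2^2}} = 1 + \frac{2\frac{\langle w-v,v\rangle}{|v|_2^2} + \delta^2}{2} + O^*\left(\frac{(2\delta+\delta^2)^2}{2^{3/2}}\right) \\ &= 1 + \frac{\langle w-v,v\rangle}{|v|_2^2} + O^*\left(\left(\frac{1}{2} + \frac{(5/2)^2}{2^{3/2}}\right)\delta^2\right) = 1 + \frac{\langle w-v,v\rangle}{|v|_2^2} + O^*\left(2.71\frac{|w-v|_2^2}{|v|_2^2}\right). \end{aligned} \]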
Multiplying by $|v|_2^2$, we obtain that
\[ |v|_2|w|_2 = |v|_2^2 + \langle w-v,v\rangle + O^*\left(2.71|w-v|_2^2\right) = \langle v,w\rangle + O^*\left(2.71|w-v|_2^2\right). \]
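The statement of Lemma 11.1.1 can be spot-checked on random vectors. The Python sketch below draws pairs $v, w$ in $\mathbb{R}^5$ satisfying the hypothesis $|w-v|_2 \le |v|_2/2$ and records the largest observed value of $\big||v|_2|w|_2 - \langle v,w\rangle\big|/|v-w|_2^2$; the dimension, the number of trials and the sampling scheme are arbitrary choices of ours, and a random test of this kind is of course no substitute for the proof.

```python
import math
import random

def check_lemma(dim=5, trials=2000, seed=0):
    """Largest observed | |v||w| - <v,w> | / |v-w|^2 over random admissible pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_v = math.sqrt(sum(a * a for a in v))
        # Perturb v by a vector of length between 0.01|v| and 0.5|v|.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_d = math.sqrt(sum(a * a for a in d))
        scale = rng.uniform(0.01, 0.5) * norm_v / norm_d
        w = [a + scale * b for a, b in zip(v, d)]
        inner = sum(a * b for a, b in zip(v, w))
        norm_w = math.sqrt(sum(a * a for a in w))
        diff_sq = sum((a - b) ** 2 for a, b in zip(v, w))
        worst = max(worst, abs(norm_v * norm_w - inner) / diff_sq)
    return worst

worst = check_lemma()
print(worst)
```

The observed ratio stays below the constant 2.71 of the lemma.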
Applying Lemma 11.1.1 to (11.4), we obtain that
\[ \begin{aligned} (\eta_\circ * \eta_\circ)(\rho) &= \int_{-\infty}^\infty \eta_\circ(t)\eta_\circ((2-\rho)+t)\,dt \\ &= \sqrt{\int_{-\infty}^\infty |\eta_\circ(t)|^2\,dt}\,\sqrt{\int_{-\infty}^\infty |\eta_\circ((2-\rho)+t)|^2\,dt} + O^*\left(2.71\int_{-\infty}^\infty |\eta_\circ(t)-\eta_\circ((2-\rho)+t)|^2\,dt\right) \\ &= |\eta_\circ|_2^2 + O^*\left(2.71\int_{-\infty}^\infty \left(\int_0^{2-\rho} |\eta_\circ'(r+t)|\,dr\right)^2 dt\right) \\ &= |\eta_\circ|_2^2 + O^*\left(2.71(2-\rho)\int_0^{2-\rho}\int_{-\infty}^\infty |\eta_\circ'(r+t)|^2\,dt\,dr\right) = |\eta_\circ|_2^2 + O^*\left(2.71(2-\rho)^2|\eta_\circ'|_2^2\right). \end{aligned} \tag{11.5} \]
We will be working with $\eta_*$ supported on the non-negative reals; we recall that $\eta_\circ$ is supported on $[0,2]$. Hence
\[ \begin{aligned} \int_0^\infty\int_0^\infty \eta_\circ(t_1)\eta_\circ(t_2)\,\eta_*\left(\frac{N}{x}-(t_1+t_2)\right)dt_1\,dt_2 &= \int_0^{N/x} (\eta_\circ * \eta_\circ)(\rho)\,\eta_*\left(\frac{N}{x}-\rho\right)d\rho \\ &= \int_0^{N/x} \left(|\eta_\circ|_2^2 + O^*\left(2.71(2-\rho)^2|\eta_\circ'|_2^2\right)\right)\cdot\eta_*\left(\frac{N}{x}-\rho\right)d\rho \\ &= |\eta_\circ|_2^2\int_0^{N/x} \eta_*(\rho)\,d\rho + 2.71|\eta_\circ'|_2^2\cdot O^*\left(\int_0^{N/x} ((2-N/x)+\rho)^2\,\eta_*(\rho)\,d\rho\right), \end{aligned} \tag{11.6} \]
provided that $N/x \ge 2$. We see that it will be wise to set $N/x$ very slightly larger than 2. As we said before, $\eta_*$ will be scaled so that it is concentrated on a small interval $[0,\epsilon)$.
11.2 The smoothing function $\eta_*$: adapting minor-arc bounds
Here the challenge is to define a smoothing function $\eta_*$ that is good both for minor-arc estimates and for major-arc estimates. The two regimes tend to favor different kinds of smoothing function. For minor-arc estimates, we use, as [Tao14] did,
\[ \eta_2(t) = 4\max(\log 2 - |\log 2t|, 0) = ((2I_{[1/2,1]}) *_M (2I_{[1/2,1]}))(t), \tag{11.7} \]
where $I_{[1/2,1]}(t)$ is 1 if $t \in [1/2,1]$ and 0 otherwise. For major-arc estimates, we will use a function based on
\[ \eta_\heartsuit = e^{-t^2/2}. \]
(We will actually use here the function $t^2e^{-t^2/2}$, whose Mellin transform is $M\eta_\heartsuit(s+2)$ (by, e.g., [BBO10, Table 11.1]).)
We will follow the simple expedient of convolving the two smoothing functions, one good for minor arcs, the other one for major arcs. In general, let $\varphi_1, \varphi_2:[0,\infty)\to\mathbb{C}$. It is easy to use bounds on sums of the form
\[ S_{f,\varphi_1}(x) = \sum_n f(n)\varphi_1(n/x) \tag{11.8} \]
to bound sums of the form $S_{f,\varphi_1 *_M \varphi_2}$:
\[ S_{f,\varphi_1 *_M \varphi_2} = \sum_n f(n)(\varphi_1 *_M \varphi_2)\left(\frac{n}{x}\right) = \int_0^\infty \sum_n f(n)\varphi_1\left(\frac{n}{wx}\right)\varphi_2(w)\frac{dw}{w} = \int_0^\infty S_{f,\varphi_1}(wx)\,\varphi_2(w)\frac{dw}{w}. \tag{11.9} \]
The same holds, of course, if $\varphi_1$ and $\varphi_2$ are switched, since $\varphi_1 *_M \varphi_2 = \varphi_2 *_M \varphi_1$. The only objection is that the bounds on (11.8) that we input might not be valid, or non-trivial, when the argument $wx$ of $S_{f,\varphi_1}(wx)$ is very small. Because of this, it is important that the functions $\varphi_1, \varphi_2$ vanish at 0, and desirable that their first derivatives do so as well.
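The closed form in (11.7) for the multiplicative convolution of $2I_{[1/2,1]}$ with itself can be confirmed numerically. The Python sketch below assumes the convention $(f *_M g)(t) = \int_0^\infty f(w)g(t/w)\,dw/w$ (the one used implicitly in (11.9)) and evaluates the integral with a midpoint rule in the variable $u = \log w$; the step count and tolerance are our choices, and the accuracy is limited by the jump discontinuities of the indicator.

```python
import math

def eta1(t):
    """eta_1 = 2 * indicator function of [1/2, 1]."""
    return 2.0 if 0.5 <= t <= 1.0 else 0.0

def eta2_closed_form(t):
    """4 * max(log 2 - |log 2t|, 0), extended by 0 to t <= 0."""
    if t <= 0.0:
        return 0.0
    return 4.0 * max(math.log(2.0) - abs(math.log(2.0 * t)), 0.0)

def mellin_convolution(f, g, t, lo=0.5, hi=1.0, n=20000):
    """(f *_M g)(t) = int f(w) g(t/w) dw/w, midpoint rule in u = log w.

    The limits lo, hi should bracket the support of f.
    """
    a, b = math.log(lo), math.log(hi)
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        w = math.exp(a + (i + 0.5) * step)
        total += f(w) * g(t / w)
    return total * step

for t in (0.3, 0.5, 0.75):
    print(t, mellin_convolution(eta1, eta1, t), eta2_closed_form(t))
```

At $t = 1/2$ both sides equal $4\log 2$, the maximum of $\eta_2$.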
Let us see how this works out in practice for $\varphi_1 = \eta_2$. Here $\eta_2:[0,\infty)\to\mathbb{R}$ is given by
\[ \eta_2 = \eta_1 *_M \eta_1 = 4\max(\log 2 - |\log 2t|, 0), \tag{11.10} \]
where $\eta_1 = 2\cdot I_{[1/2,1]}$.
Let us restate the bounds from Theorem 3.1.1, the main result of Part I. We will use Lemma C.2.2 to bound terms of the form $q/\phi(q)$.
Let $x \ge x_0$, $x_0 = 2.16\cdot 10^{20}$. Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. Then, if $3 \le q \le x^{1/3}/6$, Theorem 3.1.1 gives us that
\[ |S_{\eta_2}(\alpha,x)| \le g_x\left(\max\left(1,\frac{|\delta|}{8}\right)\cdot q\right)x, \tag{11.11} \]
where
\[ g_x(r) = \frac{(R_{x,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36x^{-1/6}, \tag{11.12} \]
with
\[ \begin{aligned} R_{x,t} &= 0.27125\log\left(1 + \frac{\log 4t}{2\log\frac{9x^{1/3}}{2.004t}}\right) + 0.41415, \\ L_t &= z(t/2)\left(\frac{13}{4}\log t + 7.82\right) + 13.66\log t + 37.55. \end{aligned} \tag{11.13} \]
If $q > x^{1/3}/6$, then, again by Theorem 3.1.1,
\[ |S_{\eta_2}(\alpha,x)| \le h(x)\,x, \tag{11.14} \]
where
\[ h(x) = 0.276x^{-1/6}(\log x)^{3/2} + 1234x^{-1/3}\log x. \tag{11.15} \]
We will work with $x$ varying within a range, and so we must pay some attention to the dependence of (11.11) and (11.14) on $x$. Let us prove two auxiliary lemmas on this.
Lemma 11.2.1. Let $g_x(r)$ be as in (11.12) and $h(x)$ as in (11.15). Then
\[ x \mapsto \begin{cases} h(x) & \text{if } x < (6r)^3,\\ g_x(r) & \text{if } x \ge (6r)^3 \end{cases} \]
is a decreasing function of $x$ for $r \ge 11$ fixed and $x \ge 21$.
Proof. It is clear from the definitions that $x \mapsto h(x)$ (for $x \ge 21$) and $x \mapsto g_x(r)$ are both decreasing. Thus, we simply have to show that $h(x_r) \ge g_{x_r}(r)$ for $x_r = (6r)^3$. Since $x_r \ge (6 \cdot 11)^3 > e^{12.5}$,
\[ R_{x_r,2r} \le 0.27125\log(0.065\log x_r + 1.056) + 0.41415 \le 0.27125\log((0.065 + 0.0845)\log x_r) + 0.41415 \le 0.27215\log\log x_r. \]
Hence
\[ R_{x_r,2r}\log 2r + 0.5 \le 0.27215\log\log x_r \cdot \log x_r^{1/3} - 0.27215\log 12.5 \cdot \log 3 + 0.5 \le 0.09072\log\log x_r \log x_r - 0.255. \]
At the same time,
\[ z(r) = e^\gamma\log\log\frac{x_r^{1/3}}{6} + \frac{2.50637}{\log\log r} \le e^\gamma\log\log x_r - e^\gamma\log 3 + 1.9521 \le e^\gamma\log\log x_r \tag{11.16} \]
for $r \ge 37$, and we also get $z(r) \le e^\gamma\log\log x_r$ for $r \in [11, 37]$ by the bisection method (with 10 iterations). Hence
\[ (R_{x_r,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5 \le (0.09072\log\log x_r\log x_r - 0.255)\sqrt{e^\gamma\log\log x_r} + 2.5 \le 0.1211\log x_r(\log\log x_r)^{3/2} + 2, \]
and so
\[ \frac{(R_{x_r,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} \le \left(0.21\log x_r(\log\log x_r)^{3/2} + 3.47\right)x_r^{-1/6}. \]
Now, by (11.16),
\[ L_{2r} \le e^\gamma\log\log x_r \cdot \left(\frac{13}{4}\log(x_r^{1/3}/3) + 7.82\right) + 13.66\log(x_r^{1/3}/3) + 37.55 \le e^\gamma\log\log x_r \cdot \left(\frac{13}{12}\log x_r + 4.25\right) + 4.56\log x_r + 22.55. \]
It is clear that
\[ \frac{4.25e^\gamma\log\log x_r + 4.56\log x_r + 22.55}{x_r^{1/3}/6} < 1234x_r^{-1/3}\log x_r \]
for $x_r \ge e$: we make the comparison for $x_r = e$ and take the derivative of the ratio of the left side by the right side.

It remains to show that
\[ 0.21\log x_r(\log\log x_r)^{3/2} + 3.47 + 3.36 + \frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r \tag{11.17} \]
is less than $0.276(\log x_r)^{3/2}$ for $x_r$ large enough. Since $t \mapsto (\log t)^{3/2}/t^{1/2}$ is decreasing for $t > e^3$, we see that
\[ \frac{0.21\log x_r(\log\log x_r)^{3/2} + 6.83 + \frac{13}{2}e^\gamma x_r^{-1/3}\log x_r\log\log x_r}{0.276(\log x_r)^{3/2}} < 1 \]
for all $x_r \ge e^{33}$, simply because it is true for $x = e^{33}$, which is greater than $e^{e^3}$.

We conclude that $h(x_r) \ge g_{x_r}(r) = g_{x_r}(x_r^{1/3}/6)$ for $x_r \ge e^{33}$. We check that $h(x_r) \ge g_{x_r}(x_r^{1/3}/6)$ for $\log x_r \in [\log 66^3, 33]$ as well by the bisection method (applied with 30 iterations, with $\log x_r$ as the variable, on the intervals $[\log 66^3, 20]$, $[20, 25]$, $[25, 30]$ and $[30, 33]$). Since $r \ge 11$ implies $x_r \ge 66^3$, we are done.
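The key comparison in this proof, $h(x_r) \ge g_{x_r}(r)$ at $x_r = (6r)^3$, can be spot-checked in ordinary floating point. The sketch below is only a sanity check at a few sample values of $r$, not a substitute for the bisection method with interval arithmetic; the function names are hypothetical and the constants are those of (11.12), (11.13) and (11.15).

```python
import math

E_GAMMA = math.exp(0.5772156649015329)  # e^gamma

def z(r):
    # z(r) = e^gamma log log r + 2.50637 / log log r  (needs r > e)
    ll = math.log(math.log(r))
    return E_GAMMA * ll + 2.50637 / ll

def R(x, t):
    # R_{x,t} as in (11.13)
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def L(t):
    # L_t as in (11.13)
    return z(t / 2) * (13 / 4 * math.log(t) + 7.82) + 13.66 * math.log(t) + 37.55

def g(x, r):
    # g_x(r) as in (11.12)
    main = ((R(x, 2 * r) * math.log(2 * r) + 0.5) * math.sqrt(z(r)) + 2.5) / math.sqrt(2 * r)
    return main + L(2 * r) / r + 3.36 * x ** (-1 / 6)

def h(x):
    # h(x) as in (11.15)
    return 0.276 * x ** (-1 / 6) * math.log(x) ** 1.5 + 1234 * x ** (-1 / 3) * math.log(x)

if __name__ == "__main__":
    for r in (11, 37, 1000, 10 ** 6):
        x_r = (6 * r) ** 3
        print(r, h(x_r), g(x_r, r))
```

At each sample point $h(x_r)$ comes out well above $g_{x_r}(r)$, consistent with the lemma.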
Lemma 11.2.2. Let $R_{x,r}$ be as in (11.12). Then $t \mapsto R_{e^t,r}(r)$ is convex-up for $t \ge 3\log 6r$.

Proof. Since $t \mapsto e^{-t/6}$ and $t \mapsto \sqrt{t}$ are clearly convex-up, all we have to do is to show that $t \mapsto R_{e^t,r}$ is convex-up. In general, since
\[ (\log f)'' = \left(\frac{f'}{f}\right)' = \frac{f''f - (f')^2}{f^2}, \]
a function of the form $(\log f)$ is convex-up exactly when $f''f - (f')^2 \ge 0$. If $f(t) = 1 + a/(t - b)$, we have $f''f - (f')^2 \ge 0$ whenever
\[ (t + a - b) \cdot (2a) \ge a^2, \]
i.e., $a^2 + 2at \ge 2ab$, and that certainly happens when $t \ge b$. In our case, $b = 3\log(2.004r/9)$, and so $t \ge 3\log 6r$ implies $t \ge b$.
Now we come to the point where we prove bounds on exponential sums of the form $S_{\eta_*}(\alpha,x)$ (that is, sums based on the smoothing $\eta_*$) based on our bounds (11.11) and (11.14) on the exponential sums $S_{\eta_2}(\alpha,x)$. This is straightforward, as promised.
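Proposition 11.2.3. Let $x \ge Kx_0$, $x_0 = 2.16 \cdot 10^{20}$, $K \ge 1$. Let $S_\eta(\alpha,x)$ be as in (10.1). Let $\eta_* = \eta_2 *_M \phi$, where $\eta_2$ is as in (11.10) and $\phi : [0,\infty) \to [0,\infty)$ is continuous and in $L^1$.

Let $2\alpha = a/q + \delta/x$, $q \le Q$, $\gcd(a,q) = 1$, $|\delta/x| \le 1/qQ$, where $Q = (3/4)x^{2/3}$. If $q \le (x/K)^{1/3}/6$, then
\[ S_{\eta_*}(\alpha,x) \le g_{x,\phi}\left(\max\left(1, \frac{|\delta|}{8}\right) q\right) \cdot |\phi|_1 x, \tag{11.18} \]
where
\[ g_{x,\phi}(r) = \frac{(R_{x,K,\phi,2r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}} + \frac{L_{2r}}{r} + 3.36K^{1/6}x^{-1/6}, \qquad R_{x,K,\phi,t} = R_{x,t} + (R_{x/K,t} - R_{x,t})\frac{C_{\phi,2,K}/|\phi|_1}{\log K}, \tag{11.19} \]
with $R_{x,t}$ and $L_t$ as in (11.13), and
\[ C_{\phi,2,K} = -\int_{1/K}^1 \phi(w)\log w\,dw. \tag{11.20} \]
If $q > (x/K)^{1/3}/6$, then
\[ |S_{\eta_*}(\alpha,x)| \le h_\phi(x/K) \cdot |\phi|_1 x, \]
where
\[ h_\phi(x) = h(x) + C_{\phi,0,K}/|\phi|_1, \qquad C_{\phi,0,K} = 1.04488\int_0^{1/K} |\phi(w)|\,dw, \tag{11.21} \]
and $h(x)$ is as in (11.15).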
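Proof. By (11.9),
\[ S_{\eta_*}(\alpha,x) = \int_0^{1/K} S_{\eta_2}(\alpha,wx)\,\phi(w)\,\frac{dw}{w} + \int_{1/K}^\infty S_{\eta_2}(\alpha,wx)\,\phi(w)\,\frac{dw}{w}. \]
We bound the first integral by the trivial estimate $|S_{\eta_2}(\alpha,wx)| \le |S_{\eta_2}(0,wx)|$ and Cor. C.1.3:
\[ \int_0^{1/K} |S_{\eta_2}(0,wx)|\,\phi(w)\,\frac{dw}{w} \le 1.04488\int_0^{1/K} wx\,\phi(w)\,\frac{dw}{w} = 1.04488x \cdot \int_0^{1/K}\phi(w)\,dw. \]
If $w \ge 1/K$, then $wx \ge x_0$, and we can use (11.11) or (11.14). If $q > (x/K)^{1/3}/6$, then $|S_{\eta_2}(\alpha,wx)| \le h(x/K)wx$ by (11.14); moreover, $|S_{\eta_2}(\alpha,y)| \le h(y)y$ for $x/K \le y < (6q)^3$ (by (11.14)) and $|S_{\eta_2}(\alpha,y)| \le g_{y,1}(r)$ for $y \ge (6q)^3$ (by (11.11)). Thus, Lemma 11.2.1 gives us that
\[ \int_{1/K}^\infty |S_{\eta_2}(\alpha,wx)|\,\phi(w)\,\frac{dw}{w} \le \int_{1/K}^\infty h(x/K)\,wx \cdot \phi(w)\,\frac{dw}{w} = h(x/K)\,x\int_{1/K}^\infty \phi(w)\,dw \le h(x/K)\,|\phi|_1 \cdot x. \]
If $q \le (x/K)^{1/3}/6$, we always use (11.11). We can use the coarse bound
\[ \int_{1/K}^\infty 3.36x^{-1/6} \cdot wx \cdot \phi(w)\,\frac{dw}{w} \le 3.36K^{1/6}|\phi|_1 x^{5/6}. \]
Since $L_r$ does not depend on $x$,
\[ \int_{1/K}^\infty \frac{L_r}{r} \cdot wx \cdot \phi(w)\,\frac{dw}{w} \le \frac{L_r}{r}|\phi|_1 x. \]
By Lemma 11.2.2 and $q \le (x/K)^{1/3}/6$, $y \mapsto R_{e^y,t}$ is convex-up and decreasing for $y \in [\log(x/K), \infty)$. Hence
\[ R_{wx,t} \le \begin{cases} \frac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \frac{\log w}{\log\frac{1}{K}}\right)R_{x,t} & \text{if } w < 1, \\ R_{x,t} & \text{if } w \ge 1. \end{cases} \]
Therefore
\[ \begin{aligned} \int_{1/K}^\infty R_{wx,t} \cdot wx \cdot \phi(w)\,\frac{dw}{w} &\le \int_{1/K}^1 \left(\frac{\log w}{\log\frac{1}{K}}R_{x/K,t} + \left(1 - \frac{\log w}{\log\frac{1}{K}}\right)R_{x,t}\right)x\,\phi(w)\,dw + \int_1^\infty R_{x,t}\,\phi(w)\,x\,dw \\ &\le R_{x,t}\,x \cdot \int_{1/K}^\infty \phi(w)\,dw + (R_{x/K,t} - R_{x,t})\frac{x}{\log K}\int_{1/K}^1 \phi(w)\,(-\log w)\,dw \\ &\le \left(R_{x,t}|\phi|_1 + (R_{x/K,t} - R_{x,t})\frac{C_{\phi,2,K}}{\log K}\right) \cdot x, \end{aligned} \]
where
\[ C_{\phi,2,K} = -\int_{1/K}^1 \phi(w)\log w\,dw. \]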
We finish by proving a couple more lemmas.

Lemma 11.2.4. Let $x > K > 1$. Let $\eta_* = \eta_2 *_M \phi$, where $\eta_2$ is as in (11.10) and $\phi : [0,\infty) \to [0,\infty)$ is continuous and in $L^1$. Let $g_{x,\phi}$ be as in (11.19). Then $g_{x,\phi}(r)$ is a decreasing function of $r$ for $670 \le r \le (x/K)^{1/3}/6$.
Proof. Taking derivatives, we can easily see that
\[ r \mapsto \frac{\log\log r}{r}, \quad r \mapsto \frac{\log r}{r}, \quad r \mapsto \frac{\log r\log\log r}{r}, \quad r \mapsto \frac{(\log r)^2\log\log r}{r} \tag{11.22} \]
are decreasing for $r \ge 20$. The same is true if $\log\log r$ is replaced by $z(r)$, since $z(r)/\log\log r$ is a decreasing function for $r \ge e$. Since $(C_{\phi,2,K}/|\phi|_1)/\log K \le 1$ (by (11.20)), we see that it is enough to prove that $r \mapsto R_{y,2r}\log 2r\sqrt{\log\log r}/\sqrt{2r}$ is decreasing on $r$ for $y = x$ and $y = x/K$ (under the assumption that $r \ge 670$).

Looking at (11.13) and at (11.22), we see that it remains only to check that
\[ r \mapsto \log\left(1 + \frac{\log 8r}{2\log\frac{9y^{1/3}}{4.008r}}\right)\log 2r \cdot \sqrt{\frac{\log\log r}{r}} \tag{11.23} \]
is decreasing on $r$ for $r \ge 670$. Taking logarithms, and then derivatives, we see that we have to show that
\[ \frac{\dfrac{\frac{1}{r}\ell + \frac{\log 8r}{r}}{2\ell^2}}{\left(1 + \frac{\log 8r}{2\ell}\right)\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{1}{r\log 2r} + \frac{1}{2r\log r\log\log r} < \frac{1}{2r}, \]
where $\ell = \log\frac{9y^{1/3}}{4.008r}$. We multiply by $2r$, and see that this is equivalent to
\[ \frac{\frac{1}{\ell}\left(2 - \frac{1}{1 + \frac{\log 8r}{2\ell}}\right)}{\log\left(1 + \frac{\log 8r}{2\ell}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1. \tag{11.24} \]
A derivative test is enough to show that $s/\log(1+s)$ is an increasing function of $s$ for $s > 0$; hence, so is $s \cdot (2 - 1/(1+s))/\log(1+s)$. Setting $s = (\log 8r)/\ell$, we obtain that the left side of (11.24) is a decreasing function of $\ell$ for $r \ge 1$ fixed.

Since $r \le y^{1/3}/6$, $\ell \ge \log(54/4.008) > 2.6$. Thus, for (11.24) to hold, it is enough to ensure that
\[ \frac{\frac{1}{2.6}\left(2 - \frac{1}{1 + \frac{\log 8r}{5.2}}\right)}{\log\left(1 + \frac{\log 8r}{5.2}\right)} + \frac{2}{\log 2r} + \frac{1}{\log r\log\log r} < 1. \tag{11.25} \]
A derivative test shows that $(2 - 1/s)/\log(1+s)$ is a decreasing function of $s$ for $s \ge 1.23$; since $\log(8 \cdot 75)/5.2 > 1.23$, this implies that the left side of (11.25) is a decreasing function of $r$ for $r \ge 75$.

We check that the left side of (11.25) is indeed less than 1 for $r = 670$; we conclude that it is less than 1 for all $r \ge 670$.
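The final check at $r = 670$ is a one-line evaluation. Here is a small sketch in plain floating point (the function name is hypothetical); it evaluates the left side of (11.25), which at $r = 670$ falls only slightly below 1.

```python
import math

def lhs_11_25(r):
    # Left side of (11.25), with ell replaced by its lower bound 2.6.
    s = math.log(8 * r) / 5.2
    first = (1 / 2.6) * (2 - 1 / (1 + s)) / math.log(1 + s)
    return first + 2 / math.log(2 * r) + 1 / (math.log(r) * math.log(math.log(r)))

if __name__ == "__main__":
    for r in (75, 670, 10 ** 4, 10 ** 6):
        print(r, lhs_11_25(r))
```

The margin at $r = 670$ is small, which suggests why 670 was taken as the threshold in Lemma 11.2.4.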
Lemma 11.2.5. Let $x \ge 10^{25}$. Let $\phi : [0,\infty) \to [0,\infty)$ be continuous and in $L^1$. Let $g_{x,\phi}(r)$ and $h(x)$ be as in (11.19) and (11.15), respectively. Then
\[ g_{x,\phi}\left(\frac{3}{8}x^{4/15}\right) \ge h(2x/\log x). \]

Proof. We can bound $g_{x,\phi}(r)$ from below by
\[ g^m_x(r) = \frac{(R_{x,r}\log 2r + 0.5)\sqrt{z(r)} + 2.5}{\sqrt{2r}}. \]
Let $r = (3/8)x^{4/15}$. Using the assumption that $x \ge 10^{25}$, we see that
\[ R_{x,r} = 0.27125\log\left(1 + \frac{\log\left(\frac{3x^{4/15}}{2}\right)}{2\log\left(\frac{9}{2.004 \cdot \frac{3}{8}} \cdot x^{\frac{1}{3} - \frac{4}{15}}\right)}\right) + 0.41415 \ge 0.63368. \tag{11.26} \]
(It is easy to see that the left side of (11.26) is increasing on $x$.) Using $x \ge 10^{25}$ again, we get that
\[ z(r) = e^\gamma\log\log r + \frac{2.50637}{\log\log r} \ge 5.68721. \]
Since $\log 2r = (4/15)\log x + \log(3/4)$, we conclude that
\[ g^m_x(r) \ge \frac{0.40298\log x + 3.25765}{\sqrt{3/4} \cdot x^{2/15}}. \]
Recall that
\[ h(x) = \frac{0.276(\log x)^{3/2}}{x^{1/6}} + \frac{1234\log x}{x^{1/3}}. \]
We can see that
\[ x \mapsto \frac{(\log x + 3.3)/x^{2/15}}{(\log(2x/\log x))^{3/2}/(2x/\log x)^{1/6}} \tag{11.27} \]
is increasing for $x \ge 10^{25}$ (and indeed for $x \ge e^{27}$) by taking the logarithm of the right side of (11.27) and then taking its derivative with respect to $t = \log x$. We can see in the same way that $(1/x^{2/15})/(\log(2x/\log x)/(2x/\log x)^{1/3})$ is increasing for $x \ge e^{22}$. Since
\[ \frac{0.40298(\log x + 3.3)}{\sqrt{3/4} \cdot x^{2/15}} \ge \frac{0.276(\log(2x/\log x))^{3/2}}{(2x/\log x)^{1/6}}, \qquad \frac{3.25765 - 3.3 \cdot 0.40298}{\sqrt{3/4} \cdot x^{2/15}} \ge \frac{1234\log(2x/\log x)}{(2x/\log x)^{1/3}} \]
for $x = 10^{25}$, we are done.
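The inequality of Lemma 11.2.5 can be spot-checked directly: evaluate the lower bound $g^m_x(r)$ at $r = (3/8)x^{4/15}$ and compare it with $h(2x/\log x)$. The sketch below does this in plain floating point at a few values of $x$; function names are hypothetical, and the agreement at $x = 10^{25}$ is close (well under one percent), so this is a sanity check rather than a proof.

```python
import math

E_GAMMA = math.exp(0.5772156649015329)  # e^gamma

def z(r):
    ll = math.log(math.log(r))
    return E_GAMMA * ll + 2.50637 / ll

def R(x, t):
    # R_{x,t} as in (11.13)
    inner = math.log(4 * t) / (2 * math.log(9 * x ** (1 / 3) / (2.004 * t)))
    return 0.27125 * math.log(1 + inner) + 0.41415

def h(x):
    # h(x) as in (11.15)
    return 0.276 * x ** (-1 / 6) * math.log(x) ** 1.5 + 1234 * x ** (-1 / 3) * math.log(x)

def g_lower(x, r):
    # The lower bound g^m_x(r) used in the proof of Lemma 11.2.5.
    return ((R(x, r) * math.log(2 * r) + 0.5) * math.sqrt(z(r)) + 2.5) / math.sqrt(2 * r)

def compare(x):
    r = 0.375 * x ** (4 / 15)
    return g_lower(x, r), h(2 * x / math.log(x))

if __name__ == "__main__":
    for x in (1e25, 1e30, 1e40):
        print(x, *compare(x))
```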
Chapter 12
The $\ell_2$ norm and the large sieve

Our aim here is to give a bound on the $\ell_2$ norm of an exponential sum over the minor arcs. While we care about an exponential sum in particular, we will prove a result valid for all exponential sums $S(\alpha,x) = \sum_n a_n e(\alpha n)$ with $a_n$ of prime support.

We start by adapting ideas from Ramaré's version of the large sieve for primes to estimate $\ell_2$ norms over parts of the circle (§12.1). We are left with the task of giving an explicit bound on the factor in Ramaré's work; this we do in §12.2. As a side effect, this finally gives a fully explicit large sieve for primes that is asymptotically optimal, meaning a sieve that does not have a spurious factor of $e^\gamma$ in front; this was an arguably important gap in the literature.
12.1 Variations on the large sieve for primes

We are trying to estimate an integral $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^3\,d\alpha$. Instead of bounding it trivially by $|S|_\infty|S|_2^2$, we can use the fact that large ("major") values of $S(\alpha)$ have to be multiplied only by $\int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha$, where $\mathfrak{M}$ is a union (small in measure) of major arcs. Now, can we give an upper bound for $\int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha$ better than $|S|_2^2 = \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\,d\alpha$?

The first version of [Helb] gave an estimate on that integral using a technique due to Heath-Brown, which in turn rests on an inequality of Montgomery's ([Mon71, (3.9)]; see also, e.g., [IK04, Lem. 7.15]). The technique was communicated by Heath-Brown to the present author, who communicated it to Tao, who used it in his own notable work on sums of five primes (see [Tao14, Lem. 4.6] and adjoining comments). We will be able to do better than that estimate here.

The role played by Montgomery's inequality in Heath-Brown's method is played here by a result of Ramaré's ([Ram09, Thm. 2.1]; see also [Ram09, Thm. 5.2]). The following proposition is based on Ramaré's result, or rather on one possible proof of it. Instead of using the result as stated in [Ram09], we will actually be using elements of the proof of [Bom74, Thm. 7A], credited to Selberg. Simply integrating Ramaré's inequality would give a non-trivial if slightly worse bound.
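Proposition 12.1.1. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let
\[ \mathfrak{M} = \bigcup_{q \le Q_0}\ \bigcup_{\substack{a \bmod q \\ (a,q)=1}} \left(\frac{a}{q} - \frac{\delta_0 r}{qx}, \frac{a}{q} + \frac{\delta_0 r}{qx}\right). \tag{12.1} \]
Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha \le \left(\max_{q \le Q_0}\max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}\right)\sum_n |a_n|^2, \]
where
\[ G_q(R) = \sum_{\substack{r \le R \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.2} \]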
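For small arguments, the sum $G_q(R)$ in (12.2) can be evaluated directly. The following sketch (hypothetical helper names, plain floating point) computes Euler's $\phi$ and squarefreeness by a sieve and then sums $\mu^2(r)/\phi(r)$ over $r \le R$ coprime to $q$; it is meant only to make the definition concrete, not to reproduce the large-scale computations described later in this chapter.

```python
from math import gcd

def totients_and_squarefree(n):
    # phi[m] = Euler's totient of m; squarefree[m] = True iff mu^2(m) = 1.
    phi = list(range(n + 1))
    squarefree = [True] * (n + 1)
    for p in range(2, n + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, n + 1, p * p):
                squarefree[m] = False
    return phi, squarefree

def G(q, R):
    # G_q(R) = sum over squarefree r <= R with gcd(r, q) = 1 of 1/phi(r).
    n = int(R)
    phi, squarefree = totients_and_squarefree(max(n, 1))
    return sum(1.0 / phi[r] for r in range(1, n + 1)
               if squarefree[r] and gcd(r, q) == 1)

def quotient(q, s, Q0, Q):
    # The quotient G_q(Q0/sq) / G_q(Q/sq) appearing in Proposition 12.1.1.
    return G(q, Q0 / (s * q)) / G(q, Q / (s * q))

if __name__ == "__main__":
    print(G(1, 6), G(2, 6), quotient(1, 1, 100, 10000))
```

For example, $G_1(6) = 1 + 1 + 1/2 + 1/4 + 1/2 = 3.25$, the terms coming from $r = 1, 2, 3, 5, 6$.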
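Proof. By (12.1),
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha = \sum_{q \le Q_0}\int_{-\frac{\delta_0Q_0}{qx}}^{\frac{\delta_0Q_0}{qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \tag{12.3} \]
Thanks to the last equations of [Bom74, p. 24] and [Bom74, p. 25],
\[ \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q}\right)\right|^2 = \frac{1}{\phi(q)}\sum_{\substack{q^*|q \\ (q^*,q/q^*)=1 \\ \mu^2(q/q^*)=1}} q^* \cdot \sum^*_{\chi \bmod q^*} \left|\sum_n a_n\chi(n)\right|^2 \]
for every $q \le \sqrt{x}$, where we use the assumption that $n$ is prime and $> \sqrt{x}$ (and thus coprime to $q$) when $a_n \ne 0$. Hence
\[ \begin{aligned} \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha &= \sum_{q \le Q_0}\sum_{\substack{q^*|q \\ (q^*,q/q^*)=1 \\ \mu^2(q/q^*)=1}} q^*\int_{-\frac{\delta_0Q_0}{qx}}^{\frac{\delta_0Q_0}{qx}} \frac{1}{\phi(q)}\sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\ &= \sum_{q^* \le Q_0}\frac{q^*}{\phi(q^*)}\sum_{\substack{r \le Q_0/q^* \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)}\int_{-\frac{\delta_0Q_0}{q^*rx}}^{\frac{\delta_0Q_0}{q^*rx}} \sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\ &= \sum_{q^* \le Q_0}\frac{q^*}{\phi(q^*)}\int_{-\frac{\delta_0Q_0}{q^*x}}^{\frac{\delta_0Q_0}{q^*x}} \sum_{\substack{r \le \frac{Q_0}{q^*}\min\left(1, \frac{\delta_0}{|\alpha|x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)}\sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha. \end{aligned} \]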
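Here $|\alpha| \le \delta_0Q_0/q^*x$ implies $(Q_0/q)\delta_0/|\alpha|x \ge 1$. Therefore,
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha \le \left(\max_{q^* \le Q_0}\max_{s \le Q_0/q^*} \frac{G_{q^*}(Q_0/sq^*)}{G_{q^*}(Q/sq^*)}\right) \cdot \Sigma, \tag{12.4} \]
where
\[ \begin{aligned} \Sigma &= \sum_{q^* \le Q_0}\frac{q^*}{\phi(q^*)}\int_{-\frac{\delta_0Q_0}{q^*x}}^{\frac{\delta_0Q_0}{q^*x}} \sum_{\substack{r \le \frac{Q}{q^*}\min\left(1, \frac{\delta_0}{|\alpha|x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)}\sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\ &\le \sum_{q \le Q}\frac{q}{\phi(q)}\sum_{\substack{r \le Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}\int_{-\frac{\delta_0Q}{qrx}}^{\frac{\delta_0Q}{qrx}} \sum^*_{\chi \bmod q}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha. \end{aligned} \]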
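As stated in the proof of [Bom74, Thm. 7A],
\[ \overline{\chi(r)}\,\chi(n)\,\tau(\overline{\chi})\,c_r(n) = \sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi(b)}\,e^{2\pi i n\frac{b}{qr}} \]
for $\chi$ primitive of modulus $q$. Here $c_r(n)$ stands for the Ramanujan sum
\[ c_r(n) = \sum_{\substack{u \bmod r \\ (u,r)=1}} e^{2\pi i nu/r}. \]
For $n$ coprime to $r$, $c_r(n) = \mu(r)$. Since $\chi$ is primitive, $|\tau(\overline{\chi})| = \sqrt{q}$. Hence, for $r \le \sqrt{x}$ coprime to $q$,
\[ q\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 = \left|\sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi(b)}\,S\left(\frac{b}{qr} + \alpha\right)\right|^2. \]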
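Thus,
\[ \begin{aligned} \Sigma &= \sum_{q \le Q}\sum_{\substack{r \le Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(rq)}\int_{-\frac{\delta_0Q}{qrx}}^{\frac{\delta_0Q}{qrx}} \sum^*_{\chi \bmod q}\left|\sum_{\substack{b=1 \\ (b,qr)=1}}^{qr} \overline{\chi(b)}\,S\left(\frac{b}{qr} + \alpha\right)\right|^2 d\alpha \\ &\le \sum_{q \le Q}\frac{1}{\phi(q)}\int_{-\frac{\delta_0Q}{qx}}^{\frac{\delta_0Q}{qx}} \sum_{\chi \bmod q}\left|\sum_{\substack{b=1 \\ (b,q)=1}}^{q} \overline{\chi(b)}\,S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha \\ &= \sum_{q \le Q}\int_{-\frac{\delta_0Q}{qx}}^{\frac{\delta_0Q}{qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha. \end{aligned} \]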
Let us now check that the intervals $(b/q - \delta_0Q/qx, b/q + \delta_0Q/qx)$ do not overlap. Since $Q = \sqrt{x/2\delta_0}$, we see that $\delta_0Q/qx = 1/2qQ$. The difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \le Q$, $1/qq' \ge 1/2qQ + 1/2Qq'$. Hence the intervals around $b/q$ and $b'/q'$ do not overlap. We conclude that
\[ \Sigma \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\,d\alpha = \sum_n |a_n|^2, \]
and so, by (12.4), we are done.
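The non-overlap argument at the end of the proof can be checked exhaustively for small $Q$ with exact rational arithmetic. The sketch below (hypothetical function names) builds the open intervals of half-width $1/2qQ$ around all reduced fractions $b/q$ with $q \le Q$, using representatives in $(0,1]$ on the real line, and verifies that consecutive intervals do not intersect.

```python
from fractions import Fraction
from math import gcd

def arcs(Q):
    # Open intervals (b/q - 1/(2qQ), b/q + 1/(2qQ)) for reduced b/q, q <= Q.
    out = []
    for q in range(1, Q + 1):
        for b in range(1, q + 1):
            if gcd(b, q) == 1:
                center = Fraction(b, q)
                half_width = Fraction(1, 2 * q * Q)
                out.append((center - half_width, center + half_width))
    return sorted(out)

def pairwise_disjoint(Q):
    # Sorted open intervals are pairwise disjoint iff each right endpoint
    # is at most the next left endpoint.
    a = arcs(Q)
    return all(a[i][1] <= a[i + 1][0] for i in range(len(a) - 1))

if __name__ == "__main__":
    print([pairwise_disjoint(Q) for Q in range(1, 13)])
```

Because `Fraction` is exact, a `True` result here is a genuine verification for that $Q$, though of course only the general inequality $1/qq' \ge 1/2qQ + 1/2Qq'$ covers all $Q$.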
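We will actually use Prop. 12.1.1 in the slightly modified form given by the following statement.

Proposition 12.1.2. Let $\{a_n\}_{n=1}^\infty$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 1$, $\delta_0 \ge 1$ be such that $\delta_0Q_0^2 \le x/2$; set $Q = \sqrt{x/2\delta_0} \ge Q_0$. Let $\mathfrak{M} = \mathfrak{M}_{\delta_0,Q_0}$ be as in (10.5). Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Then
\[ \int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\,d\alpha \le \left(\max_{\substack{q \le 2Q_0 \\ q \text{ even}}}\ \max_{s \le 2Q_0/q} \frac{G_q(2Q_0/sq)}{G_q(2Q/sq)}\right)\sum_n |a_n|^2, \]
where
\[ G_q(R) = \sum_{\substack{r \le R \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}. \tag{12.5} \]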
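Proof. By (10.5),
\[ \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha = \sum_{\substack{q \le Q_0 \\ q \text{ odd}}}\int_{-\frac{\delta_0Q_0}{2qx}}^{\frac{\delta_0Q_0}{2qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha + \sum_{\substack{q \le Q_0 \\ q \text{ even}}}\int_{-\frac{\delta_0Q_0}{qx}}^{\frac{\delta_0Q_0}{qx}} \sum_{\substack{a \bmod q \\ (a,q)=1}} \left|S\left(\frac{a}{q} + \alpha\right)\right|^2 d\alpha. \]
We proceed as in the proof of Prop. 12.1.1. We still have (12.3). Hence $\int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha$ equals
\[ \begin{aligned} &\sum_{\substack{q^* \le Q_0 \\ q^* \text{ odd}}}\frac{q^*}{\phi(q^*)}\int_{-\frac{\delta_0Q_0}{2q^*x}}^{\frac{\delta_0Q_0}{2q^*x}} \sum_{\substack{r \le \frac{Q_0}{q^*}\min\left(1, \frac{\delta_0}{2|\alpha|x}\right) \\ (r,2q^*)=1}} \frac{\mu^2(r)}{\phi(r)}\sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\ &+ \sum_{\substack{q^* \le 2Q_0 \\ q^* \text{ even}}}\frac{q^*}{\phi(q^*)}\int_{-\frac{\delta_0Q_0}{q^*x}}^{\frac{\delta_0Q_0}{q^*x}} \sum_{\substack{r \le \frac{2Q_0}{q^*}\min\left(1, \frac{\delta_0}{2|\alpha|x}\right) \\ (r,q^*)=1}} \frac{\mu^2(r)}{\phi(r)}\sum^*_{\chi \bmod q^*}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha. \end{aligned} \]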
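(The sum with $q$ odd and $r$ even is equal to the first sum; hence the factor of 2 in front.) Therefore,
\[ \begin{aligned} \int_{\mathfrak{M}} |S(\alpha)|^2\,d\alpha &\le \left(\max_{\substack{q^* \le Q_0 \\ q^* \text{ odd}}}\ \max_{s \le Q_0/q^*} \frac{G_{2q^*}(Q_0/sq^*)}{G_{2q^*}(Q/sq^*)}\right) \cdot 2\Sigma_1 \\ &+ \left(\max_{\substack{q^* \le 2Q_0 \\ q^* \text{ even}}}\ \max_{s \le 2Q_0/q^*} \frac{G_{q^*}(2Q_0/sq^*)}{G_{q^*}(2Q/sq^*)}\right) \cdot \Sigma_2, \end{aligned} \tag{12.6} \]
where
\[ \begin{aligned} \Sigma_1 &= \sum_{\substack{q \le Q \\ q \text{ odd}}}\frac{q}{\phi(q)}\sum_{\substack{r \le Q/q \\ (r,2q)=1}} \frac{\mu^2(r)}{\phi(r)}\int_{-\frac{\delta_0Q}{2qrx}}^{\frac{\delta_0Q}{2qrx}} \sum^*_{\chi \bmod q}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha \\ &= \sum_{\substack{q \le Q \\ q \text{ odd}}}\frac{q}{\phi(q)}\sum_{\substack{r \le 2Q/q \\ (r,q)=1 \\ r \text{ even}}} \frac{\mu^2(r)}{\phi(r)}\int_{-\frac{\delta_0Q}{qrx}}^{\frac{\delta_0Q}{qrx}} \sum^*_{\chi \bmod q}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha, \\ \Sigma_2 &= \sum_{\substack{q \le 2Q \\ q \text{ even}}}\frac{q}{\phi(q)}\sum_{\substack{r \le 2Q/q \\ (r,q)=1}} \frac{\mu^2(r)}{\phi(r)}\int_{-\frac{\delta_0Q}{qrx}}^{\frac{\delta_0Q}{qrx}} \sum^*_{\chi \bmod q}\left|\sum_n a_n e(\alpha n)\chi(n)\right|^2 d\alpha. \end{aligned} \]
The two expressions within parentheses in (12.6) are actually equal.

Much as before, using [Bom74, Thm. 7A], we obtain that
\[ \begin{aligned} \Sigma_1 &\le \sum_{\substack{q \le Q \\ q \text{ odd}}}\frac{1}{\phi(q)}\int_{-\frac{\delta_0Q}{2qx}}^{\frac{\delta_0Q}{2qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha, \\ \Sigma_1 + \Sigma_2 &\le \sum_{\substack{q \le 2Q \\ q \text{ even}}}\frac{1}{\phi(q)}\int_{-\frac{\delta_0Q}{qx}}^{\frac{\delta_0Q}{qx}} \sum_{\substack{b=1 \\ (b,q)=1}}^{q} \left|S\left(\frac{b}{q} + \alpha\right)\right|^2 d\alpha. \end{aligned} \]
Let us now check that the intervals of integration $(b/q - \delta_0Q/2qx, b/q + \delta_0Q/2qx)$ (for $q$ odd), $(b/q - \delta_0Q/qx, b/q + \delta_0Q/qx)$ (for $q$ even) do not overlap. Recall that $\delta_0Q/qx = 1/2qQ$. The absolute value of the difference between two distinct fractions $b/q$, $b'/q'$ is at least $1/qq'$. For $q, q' \le Q$ odd, this is larger than $1/4qQ + 1/4Qq'$, and so the intervals do not overlap. For $q \le Q$ odd and $q' \le 2Q$ even (or vice versa), $1/qq' \ge 1/4qQ + 1/2Qq'$, and so, again, the intervals do not overlap. If $q \le Q$ and $q' \le Q$ are both even, then $|b/q - b'/q'|$ is actually $\ge 2/qq'$. Clearly, $2/qq' \ge 1/2qQ + 1/2Qq'$, and so again there is no overlap. We conclude that
\[ 2\Sigma_1 + \Sigma_2 \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\,d\alpha = \sum_n |a_n|^2. \]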
12.2 Bounding the quotient in the large sieve for primes

The estimate given by Proposition 12.1.1 involves the quotient
\[ \max_{q \le Q_0}\max_{s \le Q_0/q} \frac{G_q(Q_0/sq)}{G_q(Q/sq)}, \tag{12.7} \]
where $G_q$ is as in (12.2). The appearance of such a quotient (at least for $s = 1$) is typical of Ramaré's version of the large sieve for primes; see, e.g., [Ram09]. We will see how to bound such a quotient in a way that is essentially optimal, not just asymptotically, but also in the ranges that are most relevant to us. (This includes, for example, $Q_0 \sim 10^6$, $Q \sim 10^{15}$.)

As the present work shows, an approach based on Ramaré's work gives bounds that are, in some contexts, better than those of other large sieves for primes by a constant factor (approaching $e^\gamma = 1.78107\ldots$). Thus, giving a fully explicit and nearly optimal bound for (12.7) is a task of clear general relevance, besides being needed for our main goal.

We will obtain bounds for $G_q(Q_0/sq)/G_q(Q/sq)$ when $Q_0 \le 2 \cdot 10^{10}$, $Q \ge Q_0^2$. As we shall see, our bounds will be best when $s = q = 1$, or sometimes when $s = 1$ and $q = 2$ instead.
Write $G(R)$ for $G_1(R) = \sum_{r \le R} \mu^2(r)/\phi(r)$. We will need several estimates for $G_q(R)$ and $G(R)$. As stated in [Ram95, Lemma 3.4],
\[ G(R) \le \log R + 1.4709 \tag{12.8} \]
for $R \ge 1$. By [MV73, Lem. 7],
\[ G(R) \ge \log R + 1.07 \tag{12.9} \]
for $R \ge 6$. There is also the trivial bound
\[ G(R) = \sum_{r \le R} \frac{\mu^2(r)}{\phi(r)} = \sum_{r \le R} \frac{\mu^2(r)}{r}\prod_{p|r}\left(1 - \frac{1}{p}\right)^{-1} = \sum_{r \le R} \frac{\mu^2(r)}{r}\prod_{p|r}\sum_{j \ge 0}\frac{1}{p^j} \ge \sum_{r \le R}\frac{1}{r} > \log R. \tag{12.10} \]
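The bounds (12.8), (12.9) and (12.10) are easy to test numerically on an initial range. The sketch below (hypothetical function name, ordinary floating point, so a sanity check only) tabulates $G(n)$ for integers $n \le N$. Since $G(R) = G(\lfloor R \rfloor)$, an upper bound need only be checked at integers, while a lower bound valid for real $R$ amounts to $G(n) \ge \log(n+1) + c$.

```python
import math

def G_table(N):
    # G[n] = sum_{r <= n} mu^2(r)/phi(r) for 0 <= n <= N.
    phi = list(range(N + 1))
    squarefree = [True] * (N + 1)
    for p in range(2, N + 1):
        if phi[p] == p:  # p is prime
            for m in range(p, N + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, N + 1, p * p):
                squarefree[m] = False
    G = [0.0] * (N + 1)
    for r in range(1, N + 1):
        G[r] = G[r - 1] + (1.0 / phi[r] if squarefree[r] else 0.0)
    return G

if __name__ == "__main__":
    N = 20000
    G = G_table(N)
    print(max(G[n] - math.log(n) for n in range(1, N + 1)))       # compare with 1.4709
    print(min(G[n] - math.log(n + 1) for n in range(6, N + 1)))   # compare with 1.07
```

In this range the upper bound (12.8) is nearly attained at $R = 7$, where $G(7) - \log 7$ is about $1.4708$.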
The following bound, also well-known and easy,
\[ G(R) \le \frac{q}{\phi(q)}G_q(R) \le G(Rq), \tag{12.11} \]
can be obtained by multiplying $G_q(R) = \sum_{r \le R, (r,q)=1} \mu^2(r)/\phi(r)$ term-by-term by $q/\phi(q) = \prod_{p|q}(1 + 1/\phi(p))$.

We will also use Ramaré's estimate from [Ram95, Lem. 3.4]:
\[ G_d(R) = \frac{\phi(d)}{d}\left(\log R + c_E + \sum_{p|d}\frac{\log p}{p}\right) + O^*\left(7.284R^{-1/3}f_1(d)\right) \tag{12.12} \]
for all $d \in \mathbb{Z}^+$ and all $R \ge 1$, where
\[ f_1(d) = \prod_{p|d}(1 + p^{-2/3})\left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)^{-1} \tag{12.13} \]
and
\[ c_E = \gamma + \sum_{p \ge 2}\frac{\log p}{p(p-1)} = 1.3325822\ldots \tag{12.14} \]
by [RS62, (2.11)].

If $R \ge 182$, then
\[ \log R + 1.312 \le G(R) \le \log R + 1.354, \tag{12.15} \]
where the upper bound is valid for $R \ge 120$. This is true by (12.12) for $R \ge 4 \cdot 10^7$; we check (12.15) for $120 \le R \le 4 \cdot 10^7$ by a numerical computation.¹ Similarly, for $R \ge 200$,
\[ \frac{\log R + 1.661}{2} \le G_2(R) \le \frac{\log R + 1.698}{2} \tag{12.16} \]
by (12.12) for $R \ge 1.6 \cdot 10^8$, and by a numerical computation for $200 \le R \le 1.6 \cdot 10^8$.

Write $\rho = (\log Q_0)/(\log Q) \le 1$. We obtain immediately from (12.15) and (12.16) that
\[ \frac{G(Q_0)}{G(Q)} \le \frac{\log Q_0 + 1.354}{\log Q + 1.312}, \qquad \frac{G_2(Q_0)}{G_2(Q)} \le \frac{\log Q_0 + 1.698}{\log Q + 1.661} \tag{12.17} \]
for $Q, Q_0 \ge 200$. What is hard is to approximate $G_q(Q_0)/G_q(Q)$ for $q$ large and $Q_0$ small.

Let us start by giving an easy bound, off from the truth by a factor of about $e^\gamma$. (Specialists will recognize this as a factor that appears often in first attempts at estimates based on either large or small sieves.) First, we need a simple explicit lemma.

Lemma 12.2.1. Let $m \ge 1$, $q \ge 1$. Then
\[ \prod_{p|q \,\vee\, p \le m}\frac{p}{p-1} \le e^\gamma(\log(m + \log q) + 0.65771). \tag{12.18} \]
Proof. Let $P = \prod_{p \le m \,\vee\, p|q} p$. Then, by [RS75, (5.1)],
\[ P \le q\prod_{p \le m} p = qe^{\sum_{p \le m}\log p} \le qe^{(1+\epsilon_0)m}, \]
where $\epsilon_0 = 0.001102$. Now, by [RS62, (3.42)],
\[ \frac{n}{\phi(n)} \le e^\gamma\log\log n + \frac{2.50637}{\log\log n} \le e^\gamma\log\log x + \frac{2.50637}{\log\log x} \]
for all $x \ge n \ge 27$ (since, given $a, b > 0$, the function $t \mapsto at + b/t$ is increasing on $t$ for $t \ge \sqrt{b/a}$). Hence, if $qe^m \ge 27$,
\[ \frac{P}{\phi(P)} \le e^\gamma\log((1+\epsilon_0)m + \log q) + \frac{2.50637}{\log(m + \log q)} \le e^\gamma\left(\log(m + \log q) + \epsilon_0 + \frac{2.50637/e^\gamma}{\log(m + \log q)}\right). \]
Thus (12.18) holds when $m + \log q \ge 8.53$, since then $\epsilon_0 + (2.50637/e^\gamma)/\log(m + \log q) \le 0.65771$. We verify all choices of $m, q \ge 1$ with $m + \log q \le 8.53$ computationally; the worst case is that of $m = 1$, $q = 6$, which gives the value 0.65771 in (12.18).

¹ Using D. Platt's implementation [Pla11] of double-precision interval arithmetic based on Lambov's [Lam08] ideas.
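The computational part of this proof is small enough to imitate. The sketch below (hypothetical names; integer $m$ only; plain floating point rather than interval arithmetic) computes, for each pair $(m, q)$ in a finite grid, the smallest constant $c$ for which $\prod p/(p-1) \le e^\gamma(\log(m + \log q) + c)$, and reports where the maximum occurs.

```python
import math

E_GAMMA = math.exp(0.5772156649015329)  # e^gamma

def prime_factors(n):
    # Distinct prime factors of n by trial division.
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def primes_upto(m):
    return [p for p in range(2, int(m) + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def product_side(m, q):
    # Product of p/(p-1) over primes p with p | q or p <= m.
    out = 1.0
    for p in set(prime_factors(q)) | set(primes_upto(m)):
        out *= p / (p - 1)
    return out

def needed_constant(m, q):
    # Smallest c with product_side(m, q) <= e^gamma (log(m + log q) + c).
    return product_side(m, q) / E_GAMMA - math.log(m + math.log(q))

if __name__ == "__main__":
    worst = max((needed_constant(m, q), m, q)
                for m in range(1, 9) for q in range(1, 2001))
    print(worst)
```

On this grid the maximum is attained at $m = 1$, $q = 6$, where the product equals 3 and the needed constant rounds up to 0.65771, in agreement with the statement above.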
Here is the promised easy bound
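Lemma 12.2.2. Let $Q_0 \ge 1$, $Q \ge 182Q_0$. Let $q \le Q_0$, $s \le Q_0/q$, $q$ an integer. Then
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{e^\gamma\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312} \le \frac{e^\gamma\log Q_0 + 1.172}{\log\frac{Q}{Q_0} + 1.312}. \]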
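Proof. Let $P = \prod_{p \le Q_0/sq \,\vee\, p|q} p$. Then
\[ G_q(Q_0/sq)\,G_P(Q/Q_0) \le G_q(Q/sq), \]
and so
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{1}{G_P(Q/Q_0)}. \tag{12.19} \]
Now the lower bound in (12.11) gives us that, for $d = P$, $R = Q/Q_0$,
\[ G_P(Q/Q_0) \ge \frac{\phi(P)}{P}G(Q/Q_0). \]
By Lem. 12.2.1,
\[ \frac{P}{\phi(P)} \le e^\gamma\left(\log\left(\frac{Q_0}{sq} + \log q\right) + 0.658\right). \]
Hence, using (12.15), we get that
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{P/\phi(P)}{G(Q/Q_0)} \le \frac{e^\gamma\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172}{\log\frac{Q}{Q_0} + 1.312}, \tag{12.20} \]
since $Q/Q_0 \ge 184$. Since
\[ \left(\frac{Q_0}{sq} + \log q\right)' = -\frac{Q_0}{sq^2} + \frac{1}{q} = \frac{1}{q}\left(1 - \frac{Q_0}{sq}\right) \le 0, \]
the rightmost expression of (12.20) is maximal for $q = 1$.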
Lemma 12.2.2 will play a crucial role in reducing to a finite computation the problem of bounding $G_q(Q_0/sq)/G_q(Q/sq)$. As we will now see, we can use Lemma 12.2.2 to obtain a bound that is useful when $sq$ is large compared to $Q_0$, which is precisely the case in which asymptotic estimates such as (12.12) are relatively weak.

Lemma 12.2.3. Let $Q_0 \ge 1$, $Q \ge 200Q_0$. Let $q \le Q_0$, $s \le Q_0/q$. Let $\rho = (\log Q_0)/\log Q \le 2/3$. Then, for any $\sigma \ge 1.312\rho$,
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + \sigma}{\log Q + 1.312} \tag{12.21} \]
holds provided that
\[ \frac{Q_0}{sq} \le c(\sigma) \cdot Q_0^{(1-\rho)e^{-\gamma}} - \log q, \]
where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.
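Proof. By Lemma 12.2.2, we see that (12.21) will hold provided that
\[ e^\gamma\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le \frac{\log\frac{Q}{Q_0} + 1.312}{\log Q + 1.312} \cdot (\log Q_0 + \sigma). \tag{12.22} \]
The expression on the right of (12.22) equals
\[ \begin{aligned} \log Q_0 + \sigma - \frac{(\log Q_0 + \sigma)\log Q_0}{\log Q + 1.312} &= (1-\rho)(\log Q_0 + \sigma) + \frac{1.312\rho(\log Q_0 + \sigma)}{\log Q + 1.312} \\ &\ge (1-\rho)(\log Q_0 + \sigma) + 1.312\rho^2, \end{aligned} \]
and so (12.22) will hold provided that
\[ e^\gamma\log\left(\frac{Q_0}{sq} + \log q\right) + 1.172 \le (1-\rho)(\log Q_0) + (1-\rho)\sigma + 1.312\rho^2. \]
Taking derivatives, we see that
\[ (1-\rho)\sigma + 1.312\rho^2 - 1.172 \ge \left(1 - \frac{\sigma}{2.624}\right)\sigma + 1.312\left(\frac{\sigma}{2.624}\right)^2 - 1.172 = \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172. \]
Hence it is enough that
\[ \frac{Q_0}{sq} + \log q \le e^{e^{-\gamma}\left((1-\rho)\log Q_0 + \sigma - \frac{\sigma^2}{4 \cdot 1.312} - 1.172\right)} = c(\sigma) \cdot Q_0^{(1-\rho)e^{-\gamma}}, \]
where $c(\sigma) = \exp(\exp(-\gamma) \cdot (\sigma - \sigma^2/5.248 - 1.172))$.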
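The quantities in Lemma 12.2.3 are simple to evaluate. The sketch below (hypothetical function names) computes $c(\sigma)$, the threshold $c(\sigma)Q_0^{(1-\rho)e^{-\gamma}} - \log q$ below which the lemma applies, and checks on a grid the elementary inequality obtained "by taking derivatives", namely that $(1-\rho)\sigma + 1.312\rho^2$ is minimized over $\rho$ at $\rho = \sigma/2.624$.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant

def c(sigma):
    # c(sigma) = exp(e^{-gamma} (sigma - sigma^2/5.248 - 1.172))
    return math.exp(math.exp(-GAMMA) * (sigma - sigma ** 2 / 5.248 - 1.172))

def threshold(Q0, Q, q, sigma):
    # Lemma 12.2.3 applies when Q0/(s q) <= threshold(Q0, Q, q, sigma).
    rho = math.log(Q0) / math.log(Q)
    return c(sigma) * Q0 ** ((1 - rho) * math.exp(-GAMMA)) - math.log(q)

def minimized_form(sigma):
    # Value of (1 - rho) sigma + 1.312 rho^2 at rho = sigma / 2.624.
    return sigma - sigma ** 2 / (4 * 1.312)

if __name__ == "__main__":
    print(c(1.36), threshold(1e5, 2e9, 1, 1.36))
```

For $\sigma = 1.36$ this gives $c(\sigma) \approx 0.9118$, the value that reappears as the constant 0.911 later in the section.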
We now pass to the main result of the section
Proposition 12.2.4. Let $Q \ge 20000Q_0$, $Q_0 \ge Q_{0,\min}$, where $Q_{0,\min} = 10^5$. Let $\rho = (\log Q_0)/\log Q$. Assume $\rho \le 0.6$. Then, for every $1 \le q \le Q_0$ and every $s \in [1, Q_0/q]$,
\[ \frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le \frac{\log Q_0 + c_+}{\log Q + c_E}, \tag{12.23} \]
where $c_E$ is as in (12.14) and $c_+ = 1.36$.

An ideal result would have $c_E$ in place of $c_+$, but this is not actually possible: error terms do exist, even if they are in reality smaller than the bound given in (12.12); this means that a bound such as (12.23) with $c_E$ in place of $c_+$ would be false for $q = 1$, $s = 1$.

There is nothing special about the assumptions
\[ Q \ge 20000Q_0, \quad Q_0 \ge 10^5, \quad (\log Q_0)/(\log Q) \le 0.6. \]
They can all be relaxed at the cost of an increase in $c_+$.
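Proof. Define $\mathrm{err}_{q,R}$ so that
\[ G_q(R) = \frac{\phi(q)}{q}\left(\log R + c_E + \sum_{p|q}\frac{\log p}{p}\right) + \mathrm{err}_{q,R}. \tag{12.24} \]
Then (12.23) will hold if
\[ \log\frac{Q_0}{sq} + c_E + \sum_{p|q}\frac{\log p}{p} + \frac{q}{\phi(q)}\mathrm{err}_{q,\frac{Q_0}{sq}} \le \left(\log\frac{Q}{sq} + c_E + \sum_{p|q}\frac{\log p}{p} + \frac{q}{\phi(q)}\mathrm{err}_{q,\frac{Q}{sq}}\right)\frac{\log Q_0 + c_+}{\log Q + c_E}. \tag{12.25} \]
This, in turn, happens if
\[ \left(\log sq - \sum_{p|q}\frac{\log p}{p}\right)\left(1 - \frac{\log Q_0 + c_+}{\log Q + c_E}\right) + c_+ - c_E \ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q,\frac{Q_0}{sq}} - \frac{\log Q_0 + c_+}{\log Q + c_E}\mathrm{err}_{q,\frac{Q}{sq}}\right). \]
Define
\[ \omega(\rho) = \frac{\log Q_{0,\min} + c_+}{\frac{1}{\rho}\log Q_{0,\min} + c_E} = \rho + \frac{c_+ - \rho c_E}{\frac{1}{\rho}\log Q_{0,\min} + c_E}. \]
Then $\rho \le (\log Q_0 + c_+)/(\log Q + c_E) \le \omega(\rho)$ (because $c_+ \ge \rho c_E$). We conclude that (12.25) (and hence (12.23)) holds provided that
\[ (1 - \omega(\rho))\left(\log sq - \sum_{p|q}\frac{\log p}{p}\right) + c_\Delta \ge \frac{q}{\phi(q)}\left(\mathrm{err}_{q,\frac{Q_0}{sq}} + \omega(\rho)\max\left(0, -\mathrm{err}_{q,\frac{Q}{sq}}\right)\right), \tag{12.26} \]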
where $c_\Delta = c_+ - c_E$. Note that $1 - \omega(\rho) > 0$.

First, let us give some easy bounds on the error terms; these bounds will yield upper bounds for $s$. By (12.8) and (12.11),
\[ \mathrm{err}_{q,R} \le \frac{\phi(q)}{q}\left(\log q - \sum_{p|q}\frac{\log p}{p} + (1.4709 - c_E)\right) \]
for $R \ge 1$; by (12.15) and (12.11),
\[ \mathrm{err}_{q,R} \ge -\frac{\phi(q)}{q}\left(\sum_{p|q}\frac{\log p}{p} + (c_E - 1.312)\right) \]
for $R \ge 182$. Therefore, the right side of (12.26) is at most
\[ \log q - (1 - \omega(\rho))\sum_{p|q}\frac{\log p}{p} + ((1.4709 - c_E) + \omega(\rho)(c_E - 1.312)), \]
and so (12.26) holds provided that
\[ (1 - \omega(\rho))\log sq \ge \log q + (1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta. \tag{12.27} \]
We will thus be able to assume from now on that (12.27) does not hold, or, what is the same, that
\[ sq < (c_{\rho,2}\,q)^{\frac{1}{1-\omega(\rho)}} \tag{12.28} \]
holds, where $c_{\rho,2} = \exp((1.4709 - c_E) + \omega(\rho)(c_E - 1.312) - c_\Delta)$.

What values of $R = Q_0/sq$ must we consider for $q$ given? First, by (12.28), we can assume $R > Q_{0,\min}/(c_{\rho,2}\,q)^{1/(1-\omega(\rho))}$. We can also assume
\[ R > c(c_+) \cdot \max(Rq, Q_{0,\min})^{(1-\rho)e^{-\gamma}} - \log q \tag{12.29} \]
for $c(c_+)$ as in Lemma 12.2.3, since all smaller $R$ are covered by that Lemma. Clearly, (12.29) implies that
\[ R^{1-\tau} > c(c_+) \cdot q^\tau - \frac{\log q}{R^\tau} > c(c_+)q^\tau - \log q, \]
where $\tau = (1-\rho)e^{-\gamma}$, and also that $R > c(c_+)Q_{0,\min}^{(1-\rho)e^{-\gamma}} - \log q$. Iterating, we obtain that we can assume that $R > \varpi(q)$, where
\[ \varpi(q) = \max\left(\varpi_0(q),\ c(c_+)Q_{0,\min}^\tau - \log q,\ \frac{Q_{0,\min}}{(c_{\rho,2}\,q)^{\frac{1}{1-\omega(\rho)}}}\right) \tag{12.30} \]
and
\[ \varpi_0(q) = \begin{cases} \left(c(c_+)q^\tau - \dfrac{\log q}{(c(c_+)q^\tau - \log q)^{\frac{\tau}{1-\tau}}}\right)^{\frac{1}{1-\tau}} & \text{if } c(c_+)q^\tau > \log q + 1, \\ 0 & \text{otherwise.} \end{cases} \]
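Looking at (12.26), we see that it will be enough to show that, for all $R$ satisfying $R > \varpi(q)$, we have
\[ \mathrm{err}_{q,R} + \omega(\rho)\max(0, -\mathrm{err}_{q,tR}) \le \frac{\phi(q)}{q}\kappa(q) \tag{12.31} \]
for all $t \ge 20000$, where
\[ \kappa(q) = (1 - \omega(\rho))\left(\log q - \sum_{p|q}\frac{\log p}{p}\right) + c_\Delta. \]
Ramaré's bound (12.12) implies that
\[ |\mathrm{err}_{q,R}| \le 7.284R^{-1/3}f_1(q), \tag{12.32} \]
with $f_1(q)$ as in (12.13), and so
\[ \mathrm{err}_{q,R} + \omega(\rho)\max(0, -\mathrm{err}_{q,tR}) \le (1 + \beta_\rho) \cdot 7.284R^{-1/3}f_1(q), \]
where $\beta_\rho = \omega(\rho)/20000^{1/3}$. This is enough when
\[ R \ge \lambda(q) = \left(\frac{q}{\phi(q)}\cdot\frac{7.284(1 + \beta_\rho)f_1(q)}{\kappa(q)}\right)^3. \tag{12.33} \]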
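To make the objects in (12.24), (12.32) and (12.33) concrete, here is a small sketch that computes $\mathrm{err}_{q,R}$ directly for a few small moduli and compares it with the bound $7.284R^{-1/3}f_1(q)$. All function names are hypothetical, the value of $c_E$ is the truncated one quoted in (12.14), and plain floating point is used, so this illustrates the definitions rather than reproducing the verification described in the text.

```python
import math
from math import gcd

C_E = 1.3325822  # truncated value of c_E from (12.14)

def prime_factors(n):
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def f1(d):
    # f_1(d) as in (12.13)
    out = 1.0
    for p in prime_factors(d):
        out *= (1 + p ** (-2 / 3)) / (1 + (p ** (1 / 3) + p ** (2 / 3)) / (p * (p - 1)))
    return out

def G_table(q, N):
    # G[n] = G_q(n): sum over squarefree r <= n coprime to q of 1/phi(r).
    phi = list(range(N + 1))
    squarefree = [True] * (N + 1)
    for p in range(2, N + 1):
        if phi[p] == p:
            for m in range(p, N + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, N + 1, p * p):
                squarefree[m] = False
    G = [0.0] * (N + 1)
    for r in range(1, N + 1):
        ok = squarefree[r] and gcd(r, q) == 1
        G[r] = G[r - 1] + (1.0 / phi[r] if ok else 0.0)
    return G

def err_table(q, N):
    # err[R] = G_q(R) - (phi(q)/q)(log R + c_E + sum_{p|q} log p / p), for 1 <= R <= N.
    ps = prime_factors(q)
    density = 1.0
    for p in ps:
        density *= 1 - 1 / p
    shift = C_E + sum(math.log(p) / p for p in ps)
    G = G_table(q, N)
    return [None] + [G[R] - density * (math.log(R) + shift) for R in range(1, N + 1)]

def ramare_bound(q, R):
    return 7.284 * R ** (-1 / 3) * f1(q)

if __name__ == "__main__":
    e = err_table(6, 3000)
    print(max(abs(e[R]) / ramare_bound(6, R) for R in range(1, 3001)))
```

The printed ratio stays below 1 for the moduli and range tried here, as (12.32) predicts.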
It remains to do two things. First, we have to compute how large $q$ has to be for $\varpi(q)$ to be guaranteed to be greater than $\lambda(q)$. (For such $q$, there is no checking to be done.) Then, we check the inequality (12.31) for all smaller $q$, letting $R$ range through the integers in $[\varpi(q), \lambda(q)]$. We bound $\mathrm{err}_{q,tR}$ using (12.32), but we compute $\mathrm{err}_{q,R}$ directly.

How large must $q$ be for $\varpi(q) > \lambda(q)$ to hold? We claim that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$. Let us show this.

It is easy to see that $(p/(p-1)) \cdot f_1(p)$ and $p \mapsto (\log p)/p$ are decreasing functions of $p$ for $p \ge 3$; moreover, for both functions, the value at $p \ge 7$ is smaller than for $p = 2$. Hence, we have that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime,
\[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{p < p_0}\frac{\log p}{p}\right) + c_\Delta \tag{12.34} \]
and
\[ \lambda(q) \le \left(\prod_{p < p_0}\frac{p}{p-1} \cdot \frac{7.284(1 + \beta_\rho)\prod_{p < p_0} f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p < p_0}\frac{\log p}{p}\right) + c_\Delta}\right)^3. \tag{12.35} \]
If we also assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$, we obtain
\[ \kappa(q) \ge (1 - \omega(\rho))\left(\log q - \sum_{\substack{p < p_0 \\ p \ne 7}}\frac{\log p}{p}\right) + c_\Delta \tag{12.36} \]
and
\[ \lambda(q) \le \left(\prod_{\substack{p < p_0 \\ p \ne 7}}\frac{p}{p-1} \cdot \frac{7.284(1 + \beta_\rho)\prod_{p < p_0,\, p \ne 7} f_1(p)}{(1 - \omega(\rho))\left(\log q - \sum_{p < p_0,\, p \ne 7}\frac{\log p}{p}\right) + c_\Delta}\right)^3 \tag{12.37} \]
for $q < \prod_{p \le p_0} p$. (We are taking out 7 because it is the "least helpful" prime to omit among all primes from 2 to 7, again by the fact that $(p/(p-1)) \cdot f_1(p)$ and $p \mapsto (\log p)/p$ are decreasing functions for $p \ge 3$.)
We know how to give upper bounds for the expression on the right of (12.35). The task is in essence simple: we can base our bounds on the classic explicit work in [RS62], except that we also have to optimize matters so that they are close to tight for $p_1 = 29$, $p_1 = 31$ and other low $p_1$.

By [RS62, (3.30)] and a numerical computation for $29 \le p_1 \le 43$,
\[ \prod_{p \le p_1}\frac{p}{p-1} < 1.90516\log p_1 \]
for $p_1 \ge 29$. Since $\omega(\rho)$ is increasing on $\rho$ and we are assuming $\rho \le 0.6$, $Q_{0,\min} = 100000$,
\[ \omega(\rho) \le 0.627312, \qquad \beta_\rho \le 0.023111. \]
For $x > a$, where $a > 1$ is any constant, we obviously have
\[ \sum_{a < p \le x}\log\left(1 + p^{-2/3}\right) \le \sum_{a < p \le x}\frac{(\log p)\,p^{-2/3}}{\log a}; \]
by Abel summation (13.3) and the estimate [RS62, (3.32)] for $\theta(x) = \sum_{p \le x}\log p$,
\[ \begin{aligned} \sum_{a < p \le x}(\log p)\,p^{-2/3} &= (\theta(x) - \theta(a))x^{-2/3} - \int_a^x (\theta(u) - \theta(a))\left(-\frac{2}{3}u^{-5/3}\right)du \\ &\le (1.01624x - \theta(a))x^{-2/3} + \frac{2}{3}\int_a^x (1.01624u - \theta(a))u^{-5/3}\,du \\ &= (1.01624x - \theta(a))x^{-2/3} + 2 \cdot 1.01624(x^{1/3} - a^{1/3}) + \theta(a)(x^{-2/3} - a^{-2/3}) \\ &= 3 \cdot 1.01624 \cdot x^{1/3} - (2.03248a^{1/3} + \theta(a)a^{-2/3}). \end{aligned} \]
We conclude that $\sum_{10^4 < p \le x}\log(1 + p^{-2/3}) \le 0.33102x^{1/3} - 7.06909$ for $x > 10^4$. Since $\sum_{p \le 10^4}\log(1 + p^{-2/3}) \le 10.09062$, this means that
\[ \sum_{p \le x}\log(1 + p^{-2/3}) \le \left(0.33102 + \frac{10.09062 - 7.06909}{10^{4/3}}\right)x^{1/3} \le 0.47126x^{1/3} \]
for $x > 10^4$; a direct computation for all $x$ prime between 29 and $10^4$ then confirms that
\[ \sum_{p \le x}\log(1 + p^{-2/3}) \le 0.74914x^{1/3} \]
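for all $x \ge 29$. Thus,
\[ \prod_{p \le x} f_1(p) \le \frac{e^{\sum_{p \le x}\log(1 + p^{-2/3})}}{\prod_{p \le 29}\left(1 + \frac{p^{1/3} + p^{2/3}}{p(p-1)}\right)} \le \frac{e^{0.74914x^{1/3}}}{6.62365} \]
for $x \ge 29$. Finally, by [RS62, (3.24)], $\sum_{p \le p_1}\frac{\log p}{p} < \log p_1$.

We conclude that, for $q < \prod_{p \le p_0} p$, $p_0$ a prime, and $p_1$ the prime immediately preceding $p_0$,
\[ \lambda(q) \le \left(\frac{1.90516\log p_1 \cdot 7.45235 \cdot \left(\frac{e^{0.74914p_1^{1/3}}}{6.62365}\right)}{0.37268(\log q - \log p_1) + 0.02741}\right)^3 \le \frac{190.272(\log p_1)^3e^{2.24742p_1^{1/3}}}{(\log q - \log p_1 + 0.07354)^3}. \tag{12.38} \]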
It is clear from (12.30) that $\varpi(q)$ is increasing as soon as
\[ q \ge \max\left(Q_{0,\min},\ Q_{0,\min}^{1-\omega(\rho)}/c_{\rho,2}\right) \]
and $c(c_+)q^\tau > \log q + 1$, since then $\varpi_0(q)$ is increasing and $\varpi(q) = \varpi_0(q)$. Here it is useful to recall that $c_{\rho,2} \ge \exp(1.4709 - c_+)$, and to note that $c(c_+)q^\tau - (\log q + 1)$ is increasing for $q \ge 1/(\tau \cdot c(c_+))^{1/\tau}$; we see also that $1/(\tau \cdot c(c_+))^{1/\tau} \le 1/((1 - 0.6)e^{-\gamma}c(c_+))^{1/((1-0.6)e^{-\gamma})}$ for $\rho \le 0.6$. A quick computation for our value of $c_+$ makes us conclude that $q > 1.12Q_{0,\min} = 112000$ is a sufficient condition for $\varpi(q)$ to be equal to $\varpi_0(q)$ and for $\varpi_0(q)$ to be increasing.

Since (12.38) is decreasing on $q$ for $p_1$ fixed, and $\varpi_0(q)$ is decreasing on $\rho$ and increasing on $q$, we set $\rho = 0.6$ and check that then
\[ \varpi_0\left(2.2 \cdot 10^{10}\right) \ge 846.765, \]
whereas, by (12.38),
\[ \lambda(2.2 \cdot 10^{10}) \le 838.227 < 846.765; \]
this is enough to ensure that $\lambda(q) < \varpi_0(q)$ for $2.2 \cdot 10^{10} \le q < \prod_{p \le 31} p$.
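The two numbers compared here can be reproduced in a few lines. The sketch below (hypothetical function names, plain floating point, truncated $c_E$ from (12.14)) evaluates $\omega(\rho)$, $\beta_\rho$, $\varpi_0(q)$ and the right side of (12.38) at $\rho = 0.6$, $q = 2.2 \cdot 10^{10}$, $p_1 = 29$.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant
C_PLUS = 1.36
C_E = 1.3325822             # truncated value from (12.14)
Q0_MIN = 1e5

def omega(rho):
    return (math.log(Q0_MIN) + C_PLUS) / (math.log(Q0_MIN) / rho + C_E)

def beta(rho):
    return omega(rho) / 20000 ** (1 / 3)

def c(sigma):
    return math.exp(math.exp(-GAMMA) * (sigma - sigma ** 2 / 5.248 - 1.172))

def varpi0(q, rho):
    # varpi_0(q) as defined after (12.30)
    tau = (1 - rho) * math.exp(-GAMMA)
    a = c(C_PLUS) * q ** tau
    if a <= math.log(q) + 1:
        return 0.0
    inner = a - math.log(q)
    return (a - math.log(q) / inner ** (tau / (1 - tau))) ** (1 / (1 - tau))

def lambda_bound(q, p1):
    # Right side of (12.38)
    num = 190.272 * math.log(p1) ** 3 * math.exp(2.24742 * p1 ** (1 / 3))
    return num / (math.log(q) - math.log(p1) + 0.07354) ** 3

if __name__ == "__main__":
    print(omega(0.6), beta(0.6))
    print(varpi0(2.2e10, 0.6), lambda_bound(2.2e10, 29))
```

Within floating-point accuracy, the output agrees with the values 0.627312, 0.023111, 846.765 and 838.227 quoted in the text.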
Let us now give some rough bounds that will be enough to cover the case $q \ge \prod_{p \le 31} p$. First, as we already discussed, $\varpi(q) = \varpi_0(q)$ and, since $c(c_+)q^\tau > \log q + 1$,
\[ \varpi_0(q) \ge (c(c_+)q^\tau - \log q)^{\frac{1}{1-\tau}} \ge (0.911q^{0.224} - \log q)^{1.289} \ge q^{0.2797} \tag{12.39} \]
by $q \ge \prod_{p \le 31} p$. We are in the range $\prod_{p \le p_1} p \le q \le \prod_{p \le p_0} p$, where $p_1 < p_0$ are two consecutive primes with $p_1 \ge 31$. By [RS62, (3.16)] and a computation for $31 \le q < 200$, we know that $\log q \ge \sum_{p \le p_1}\log p \ge 0.8009p_1$. By (12.38) and (12.39), it follows that we just have to show that
\[ e^{0.224t} > \frac{190.272(\log t)^3e^{2.24742t^{1/3}}}{(0.8009t - \log t + 0.07354)^3} \]
for $t \ge 31$. Now, $t \ge 31$ implies $0.8009t - \log t + 0.07354 \ge 0.6924t$, and so, taking logarithms, we see that we just have to verify
\[ 0.224t - 2.24742t^{1/3} > 3\log\log t - 3\log t + 6.3513 \tag{12.40} \]
for $t \ge 31$, and, since the left side is increasing and the right side is decreasing for $t \ge 31$, this is trivial to check.
We conclude that $\varpi(q) > \lambda(q)$ whenever $q \ge 2.2 \cdot 10^{10}$.

It remains to see how we can relax this assumption if we assume that $2 \cdot 3 \cdot 5 \cdot 7 \nmid q$. We repeat the same analysis as before, using (12.36) and (12.37) instead of (12.34) and (12.35). For $p_1 \ge 29$,
\[ \prod_{\substack{p \le p_1 \\ p \ne 7}}\frac{p}{p-1} < 1.633\log p_1, \qquad \prod_{\substack{p \le p_1 \\ p \ne 7}} f_1(p) \le \frac{e^{0.74914x^{1/3} - \log(1 + 7^{-2/3})}}{5.8478} \le \frac{e^{0.74914x^{1/3}}}{7.44586} \]
and $\sum_{p \le p_1,\, p \ne 7}(\log p)/p < \log p_1 - (\log 7)/7$. So, for $q < \prod_{p \le p_0,\, p \ne 7} p$, and $p_1 \ge 29$ the prime immediately preceding $p_0$,
\[ \lambda(q) \le \left(\frac{1.633\log p_1 \cdot 7.45235 \cdot \left(\frac{e^{0.74914p_1^{1/3}}}{7.44586}\right)}{0.37268\left(\log q - \log p_1 + \frac{\log 7}{7}\right) + 0.02741}\right)^3 \le \frac{84.351(\log p_1)^3e^{2.24742p_1^{1/3}}}{(\log q - \log p_1 + 0.35152)^3}. \]
$0(33 middot 109) ge 477465 λ(33 middot 109) le 475513 lt 477465
We also check that $\varpi_0(q_0) \ge 916.322$ is greater than $\lambda(q_0) \le 429.731$ for $q_0 = \prod_{p\le 31,\, p\ne 7} p$. The analysis for $q \ge \prod_{p\le 37,\, p\ne 7} p$ is also just like before: since $\log q \ge 0.8009 p_1 - \log 7$, we have to show that
\[\frac{e^{0.224t}}{7} > \frac{84.351 (\log t)^3 e^{2.24742 t^{1/3}}}{(0.8009t - \log t + 0.07354)^3}\]
for $t \ge 37$, and that in turn follows from
\[0.224t - 2.24742 t^{1/3} > 3\log\log t - 3\log t + 6.74849,\]
which we check for $t \ge 37$ just as we checked (12.40).

We conclude that $\varpi(q) > \lambda(q)$ if $q \ge 3.3\cdot 10^9$ and $210 \nmid q$.

Computation. Now, for $q < 3.3\cdot 10^9$ (and also for $3.3\cdot 10^9 \le q < 2.2\cdot 10^{10}$, $210 \mid q$), we need to check that the maximum $m_{q,R,1}$ of $\mathrm{err}_{q,R}$ over all $\varpi(q) \le R < \lambda(q)$ satisfies (12.31). Note that there is a term $\mathrm{err}_{q,tR}$ in (12.31); we bound it using (12.32).

Since $\log R$ is increasing on $R$ and $G_q(R)$ depends only on $\lfloor R\rfloor$, we can tell from (12.24) that, since we are taking the maximum of $\mathrm{err}_{q,R}$, it is enough to check integer
values of $R$. We check all integers $R$ in $[\varpi(q), \lambda(q))$ for all $q < 3.3\cdot 10^9$ (and all $3.3\cdot 10^9 \le q < 2.2\cdot 10^{10}$, $210 \mid q$) by an explicit computation.²
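The remark that $G_q(R)$ depends only on $\lfloor R\rfloor$ can be illustrated with a small sketch. It assumes the definition $G_q(R) = \sum_{r\le R,\ (r,q)=1} \mu^2(r)/\phi(r)$, which is standard in the large sieve for primes but is not restated in this passage; the helper names are hypothetical, and exact rational arithmetic is used in place of the C code with interval arithmetic employed for the actual computation.

```python
from fractions import Fraction
from math import floor, gcd

def phi_and_squarefree(r):
    """Return (phi(r), True if r is squarefree) by trial division."""
    phi, squarefree, n, p = r, True, r, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            phi -= phi // p          # multiply phi by (1 - 1/p)
            if n % p == 0:           # p^2 divides r
                squarefree = False
                while n % p == 0:
                    n //= p
        p += 1
    if n > 1:                        # one prime factor left over
        phi -= phi // n
    return phi, squarefree

def G(q, R):
    """Assumed definition: sum of mu(r)^2 / phi(r) over r <= R coprime to q."""
    total = Fraction(0)
    for r in range(1, floor(R) + 1):
        if gcd(r, q) != 1:
            continue
        phi, squarefree = phi_and_squarefree(r)
        if squarefree:
            total += Fraction(1, phi)
    return total

if __name__ == "__main__":
    # G_q is a step function: constant on each interval [n, n + 1).
    print(G(1, 6), G(1, 6.99))
    print(G(2, 3), G(2, 3.5))
```

Because $G_q$ is constant on $[n, n+1)$ while $\log R$ increases there, any expression of the form $G_q(R)$ minus a positive multiple of $\log R$ is largest at $R = n$ on that interval, which is why a scan over integers suffices.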
Finally, we have the trivial bound
\[\frac{G_q(Q_0/sq)}{G_q(Q/sq)} \le 1, \tag{12.41}\]
which we shall use for $Q_0$ close to $Q$.
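Corollary 12.2.5. Let $\{a_n\}_{n=1}^{\infty}$, $a_n \in \mathbb{C}$, be supported on the primes. Assume that $\{a_n\}$ is in $\ell_1 \cap \ell_2$ and that $a_n = 0$ for $n \le \sqrt{x}$. Let $Q_0 \ge 10^5$, $\delta_0 \ge 1$ be such that $(20000 Q_0)^2 \le x/2\delta_0$; set $Q = \sqrt{x/2\delta_0}$.
Let $S(\alpha) = \sum_n a_n e(\alpha n)$ for $\alpha \in \mathbb{R}/\mathbb{Z}$. Let $\mathfrak{M}$ as in (12.1). Then, if $Q_0 \le Q^{0.6}$,
\[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
where $c_+ = 1.36$ and $c_E = \gamma + \sum_{p\ge 2} (\log p)/(p(p-1)) = 1.3325822\ldots$
Let $\mathfrak{M}_{\delta_0,Q_0}$ as in (10.5). Then, if $(2Q_0) \le (2Q)^{0.6}$,
\[\int_{\mathfrak{M}_{\delta_0,Q_0}} |S(\alpha)|^2\, d\alpha \le \frac{\log 2Q_0 + c_+}{\log 2Q + c_E} \sum_n |a_n|^2. \tag{12.42}\]

Here, of course, $\int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2$ (Plancherel). If $Q_0 > Q^{0.6}$, we will use the trivial bound
\[\int_{\mathfrak{M}_{\delta_0,r}} |S(\alpha)|^2\, d\alpha \le \int_{\mathbb{R}/\mathbb{Z}} |S(\alpha)|^2\, d\alpha = \sum_n |a_n|^2. \tag{12.43}\]

Proof. Immediate from Prop. 12.1.1, Prop. 12.1.2 and Prop. 12.2.4.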
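To get a feeling for the size of the saving in (12.42), one can evaluate the factor $(\log 2Q_0 + c_+)/(\log 2Q + c_E)$ for sample parameters. The sketch below is illustrative only: the function names and the sample values of $x$, $\delta_0$, $Q_0$ are hypothetical, the arithmetic is plain floating point, and the fallback to the factor $1$ encodes the trivial bound (12.43) when the condition $2Q_0 \le (2Q)^{0.6}$ fails. It also recomputes $c_E$ from a truncated sum over primes as a check on the stated decimal value.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant gamma
C_PLUS = 1.36                      # c_+ as in the corollary
C_E = 1.3325822                    # c_E as in the corollary (truncated decimal)

def c_E_partial(limit):
    """gamma + sum over primes p <= limit of log p / (p (p - 1))."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p^2.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    total = sum(math.log(p) / (p * (p - 1))
                for p in range(2, limit + 1) if sieve[p])
    return EULER_GAMMA + total

def major_arc_l2_factor(x, delta0, Q0):
    """Factor multiplying sum |a_n|^2 in (12.42), or 1.0 (trivial bound (12.43))."""
    if Q0 < 1e5 or delta0 < 1:
        raise ValueError("need Q0 >= 10^5 and delta0 >= 1")
    if (20000.0 * Q0) ** 2 > x / (2.0 * delta0):
        raise ValueError("need (20000 Q0)^2 <= x / (2 delta0)")
    Q = math.sqrt(x / (2.0 * delta0))
    if 2.0 * Q0 <= (2.0 * Q) ** 0.6:
        return (math.log(2.0 * Q0) + C_PLUS) / (math.log(2.0 * Q) + C_E)
    return 1.0

if __name__ == "__main__":
    print("c_E from primes up to 10^6:", c_E_partial(10**6))
    print("factor for x = 1e27, delta0 = 8, Q0 = 1e6:",
          major_arc_l2_factor(1e27, 8.0, 1e6))
    print("factor for x = 1e25, delta0 = 1, Q0 = 1e8:",
          major_arc_l2_factor(1e25, 1.0, 1e8))
```

Note that the hypotheses $Q_0 \ge 10^5$ and $(20000Q_0)^2 \le x/2\delta_0$ do not by themselves force $2Q_0 \le (2Q)^{0.6}$, so both branches can occur, as the second sample call shows.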
Obviously, one can also give a statement derived from Prop. 12.1.1; the resulting bound is
\[\int_{\mathfrak{M}} |S(\alpha)|^2\, d\alpha \le \frac{\log Q_0 + c_+}{\log Q + c_E} \sum_n |a_n|^2,\]
where $\mathfrak{M}$ is as in (12.1).

We also record the large-sieve form of the result.
² This is by far the heaviest computation in the present work, though it is still rather minor (about two weeks of computing on a single core of a fairly new (2010) desktop computer carrying out other tasks as well; this is next to nothing compared to the computations in [Plab], or even those in [HP13]). For the applications here, we could have assumed $\rho \le 8/15$, and that would have reduced computation time drastically; the lighter assumption $\rho \le 0.6$ was made with views to general applicability in the future. As elsewhere in this section, numerical computations were carried out by the author in C; all floating-point operations used D. Platt's interval arithmetic package.